This document explains the DolphinScheduler application configuration files, based on the DolphinScheduler 1.3.x versions.

Directory Structure

Currently, all the configuration files are under the [conf] directory. The simplified DolphinScheduler installation tree below shows where the [conf] directory sits and which configuration files it contains. This document describes only the DolphinScheduler configurations and does not cover other topics.

[Note: DolphinScheduler is hereinafter referred to as ‘DS’.]

├─bin                               DS application commands directory
│  ├─         startup or shutdown DS application 
│  ├─                       startup all DS services with configurations
│  ├─                        shutdown all DS services with configurations
├─conf                              configurations directory
│  ├─         API-service config properties
│  ├─              datasource config properties
│  ├─               ZooKeeper config properties
│  ├─                  master-service config properties
│  ├─                  worker-service config properties
│  ├─                  quartz config properties
│  ├─                  common-service [storage] config properties
│  ├─                   alert-service config properties
│  ├─config                             environment variables config directory
│      ├─install_config.conf                DS environment variables configuration script [install or start DS]
│  ├─env                                load environment variables configs script directory
│      ├─            load environment variables configs [eg: JAVA_HOME,HADOOP_HOME, HIVE_HOME ...]
│  ├─org                                mybatis mapper files directory
│  ├─i18n                               i18n configs directory
│  ├─logback-api.xml                    API-service log config
│  ├─logback-master.xml                 master-service log config
│  ├─logback-worker.xml                 worker-service log config
│  ├─logback-alert.xml                  alert-service log config
├─sql                                   .sql files to create or upgrade DS metadata
│  ├─create                             create SQL scripts directory
│  ├─upgrade                            upgrade SQL scripts directory
│  ├─dolphinscheduler_postgre.sql       PostgreSQL database init script
│  ├─dolphinscheduler_mysql.sql         MySQL database init script
│  ├─soft_version                       current DS version-id file
├─script                            DS services deployment, database create or upgrade scripts directory
│  ├─         DS database init script
│  ├─        DS database upgrade script
│  ├─                  DS monitor-server start script       
│  ├─                       transfer installation files script                                     
│  ├─                  cleanup ZooKeeper caches script       
├─ui                                front-end web resources directory
├─lib                               DS .jar dependencies directory
├─                        auto-setup DS services script

Configurations in Detail

Serial number  Service classification                                            Config file
1              startup or shutdown DS application
2              datasource config properties
3              ZooKeeper config properties
4              common-service [storage] config properties
5              API-service config properties
6              master-service config properties
7              worker-service config properties
8              alert-service config properties
9              quartz config properties
10             DS environment variables configuration script [install/start DS]  install_config.conf
11             load environment variables configs
12             services log config files                                         API-service log config: logback-api.xml
                                                                                 master-service log config: logback-master.xml
                                                                                 worker-service log config: logback-worker.xml
                                                                                 alert-service log config: logback-alert.xml

[startup or shutdown DS application]

This script is responsible for DS startup and shutdown. Essentially, the scripts that start up or shut down all DS services operate the cluster through it. Currently, DS ships only a basic config; remember to configure further JVM options based on the resources you actually have.

Default simplified parameters are:


Setting "-XX:+DisableExplicitGC" is not recommended because it may lead to a memory leak (DS depends on Netty for communication).

[datasource config properties]

DS uses Druid to manage database connections. The default simplified configs are:

Parameters                                                   Default value  Description
spring.datasource.driver-class-name                                         datasource driver
spring.datasource.url                                                       datasource connection url
spring.datasource.username                                                  datasource username
spring.datasource.password                                                  datasource password
spring.datasource.initialSize                                5              initial connection pool size
spring.datasource.minIdle                                    5              minimum connection pool size
spring.datasource.maxActive                                  5              maximum connection pool size
spring.datasource.maxWait                                    60000          maximum wait time, in milliseconds
spring.datasource.timeBetweenEvictionRunsMillis              60000          idle connection check interval
spring.datasource.timeBetweenConnectErrorMillis              60000          retry interval
spring.datasource.minEvictableIdleTimeMillis                 300000         connections idle for longer than minEvictableIdleTimeMillis are collected during the idle check
spring.datasource.validationQuery                            SELECT 1       SQL run to validate a connection
spring.datasource.validationQueryTimeout                     3              connection validation timeout, in seconds
spring.datasource.testWhileIdle                              true           whether the pool validates the allocated connection when a new connection request comes
spring.datasource.testOnBorrow                               true           validity check when the program requests a new connection
spring.datasource.testOnReturn                               false          validity check when the program returns a connection
spring.datasource.defaultAutoCommit                          true           whether to auto-commit
spring.datasource.keepAlive                                  true           runs the validationQuery SQL so that the pool does not close a connection that idles longer than minEvictableIdleTimeMillis
spring.datasource.poolPreparedStatements                     true           enable PSCache
spring.datasource.maxPoolPreparedStatementPerConnectionSize  20             size of the PSCache on each connection

[zookeeper config properties]

Parameters                       Default value      Description
zookeeper.quorum                 localhost:2181     ZooKeeper cluster connection info
zookeeper.dolphinscheduler.root  /dolphinscheduler  ZooKeeper root directory under which DS is stored
zookeeper.session.timeout        60000              session timeout
zookeeper.connection.timeout     30000              connection timeout
zookeeper.retry.base.sleep       100                time to wait between subsequent retries
zookeeper.retry.max.sleep        30000              maximum time to wait between subsequent retries
zookeeper.retry.maxtime          10                 maximum number of retries

[hadoop, s3, yarn config properties]

Currently, this file mainly holds the Hadoop and s3a related configurations.

Parameters                       Default value                                       Description
data.basedir.path                /tmp/dolphinscheduler                               local directory used to store temp files
                                 NONE                                                type of resource files: HDFS, S3, NONE
resource.upload.path             /dolphinscheduler                                   storage path of resource files
                                 false                                               whether Kerberos authentication is enabled for Hadoop
                                 /opt/krb5.conf                                      kerberos config directory
login.user.keytab.username       hdfs-mycluster@ESZ.COM                              kerberos username
login.user.keytab.path           /opt/hdfs.headless.keytab                           kerberos user keytab
kerberos.expire.time             2                                                   kerberos expire time, an integer, in hours
resource.view.suffixs            txt,log,sh,conf,cfg,py,java,sql,hql,xml,properties  file types supported by the resource center
hdfs.root.user                   hdfs                                                user with the corresponding permissions when the storage type is HDFS
fs.defaultFS                     hdfs://mycluster:8020                               if the resource storage type is S3, the request url is similar to 's3a://dolphinscheduler'; otherwise, if the storage type is HDFS and Hadoop supports HA, copy core-site.xml and hdfs-site.xml into the 'conf' directory
fs.s3a.endpoint                                                                      s3 endpoint url
fs.s3a.access.key                                                                    s3 access key
fs.s3a.secret.key                                                                    s3 secret key
yarn.resourcemanager.ha.rm.ids                                                       yarn ResourceManager address: if ResourceManager supports HA, enter the HA IP addresses (separated by commas); for standalone mode, leave it empty
yarn.application.status.address  http://ds1:8088/ws/v1/cluster/apps/%s               keep the default if ResourceManager supports HA or is not used; if ResourceManager runs in standalone mode, replace ds1 with the corresponding hostname
dolphinscheduler.env.path        env/                                                load environment variables configs [eg: JAVA_HOME, HADOOP_HOME, HIVE_HOME ...]
development.state                false                                               whether DS runs in development state

[API-service config properties]

Parameters                                 Default value      Description
server.port                                12345              API service communication port
server.servlet.session.timeout             7200               session timeout
server.servlet.context-path                /dolphinscheduler  request path
spring.servlet.multipart.max-file-size     1024MB             maximum file size
spring.servlet.multipart.max-request-size  1024MB             maximum request size
server.jetty.max-http-post-size            5000000            jetty maximum post size
spring.messages.encoding                   UTF-8              message encoding
spring.jackson.time-zone                   GMT+8              time zone
spring.messages.basename                   i18n/messages      i18n config
security.authentication.type               PASSWORD           authentication type

[master-service config properties]

Parameters                     Default value  Description
master.listen.port             5678           master listen port
master.exec.threads            100            number of master-service execute threads; limits the number of process instances running in parallel
master.exec.task.num           20             number of parallel tasks for each process instance of the master-service
master.dispatch.task.num       3              number of tasks the master-service dispatches in each batch
                               LowerWeight    master host selector, used to select a suitable worker to run the task; optional values: random, round-robin, lower weight
master.heartbeat.interval      10             master heartbeat interval, in seconds
master.task.commit.retryTimes  5              number of retries when the master commits a task
master.task.commit.interval    1000           interval between master task commits, in milliseconds
master.max.cpuload.avg         -1             maximum CPU load average for the master; the master server schedules only while this value is higher than the system CPU load average. The default -1 means the number of CPU cores * 2
master.reserved.memory         0.3            memory reserved for the master, in GB; the master server schedules only while this value is lower than the system available memory

[worker-service config properties]

Parameters                 Default value  Description
worker.listen.port         1234           worker-service listen port
worker.exec.threads        100            number of worker-service execute threads; limits the number of task instances running in parallel
worker.heartbeat.interval  10             worker-service heartbeat interval, in seconds
worker.max.cpuload.avg     -1             maximum CPU load average for the worker; tasks are dispatched to the worker server only while this value is higher than the system CPU load average. The default -1 means the number of CPU cores * 2
worker.reserved.memory     0.3            memory reserved for the worker, in GB; tasks are dispatched to the worker server only while this value is lower than the system available memory
worker.groups              default        worker groups, separated by commas, e.g. 'worker.groups=default,test'; on startup the worker joins the groups listed here

[alert-service config properties]

Parameters                       Default value                      Description
alert.type                       EMAIL                              alert type
mail.protocol                    SMTP                               mail server protocol
                                                                    mail server host
mail.server.port                 25                                 mail server port
mail.sender                                                         mail sender email
mail.user                                                           mail sender email name
mail.passwd                      111111                             mail sender email password
mail.smtp.starttls.enable        true                               whether mail enables TLS
mail.smtp.ssl.enable             false                              whether mail enables SSL
                                                                    mail SSL trust list
xls.file.path                    /tmp/xls                           temporary storage directory for mail attachments
The following configure WeCom [optional]:
enterprise.wechat.enable         false                              whether to enable WeCom
                                 xxxxxxx                            WeCom corp id
enterprise.wechat.secret         xxxxxxx                            WeCom secret
                                 xxxxxxx                            WeCom agent id
enterprise.wechat.users          xxxxxxx                            WeCom users
                                                                    WeCom token url
                                                                    WeCom push url
enterprise.wechat.user.send.msg                                     send message format
                                                                    group message format
plugin.dir                       /Users/xx/your/path/to/plugin/dir  plugin directory

[quartz config properties]

This part describes the quartz configs. Configure them based on your practical situation and resources.

Parameters                                           Default value                                                       Description
org.quartz.jobStore.driverDelegateClass              org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.driverDelegateClass              org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
org.quartz.scheduler.instanceName                    DolphinScheduler
org.quartz.scheduler.instanceId                      AUTO
org.quartz.scheduler.makeSchedulerThreadDaemon       true
org.quartz.jobStore.useProperties                    false
org.quartz.threadPool.class                          org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.makeThreadsDaemons             true
org.quartz.threadPool.threadCount                    25
org.quartz.threadPool.threadPriority                 5
org.quartz.jobStore.class                            org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.tablePrefix                      QRTZ_
org.quartz.jobStore.isClustered                      true
org.quartz.jobStore.misfireThreshold                 60000
org.quartz.jobStore.clusterCheckinInterval           5000
org.quartz.jobStore.acquireTriggersWithinLock        true
org.quartz.jobStore.dataSource                       myDs
org.quartz.dataSource.myDs.connectionProvider.class  org.apache.dolphinscheduler.service.quartz.DruidConnectionProvider
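
The table lists org.quartz.jobStore.driverDelegateClass twice, presumably because the delegate depends on the metadata database: StdJDBCDelegate is Quartz's generic JDBC delegate (the usual choice for MySQL), while PostgreSQLDelegate targets PostgreSQL. The sketch below shows one way to pick the value. The helper name `quartz_delegate` and the type names `mysql` and `postgresql` are assumptions made for illustration; they are not part of DS.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DS): print the Quartz delegate class
# that matches a metadata database type.
quartz_delegate() {
  case "$1" in
    mysql)      echo "org.quartz.impl.jdbcjobstore.StdJDBCDelegate" ;;
    postgresql) echo "org.quartz.impl.jdbcjobstore.PostgreSQLDelegate" ;;
    *)          echo "unsupported database type: $1" >&2; return 1 ;;
  esac
}

# Example: build the properties line for a PostgreSQL metadata database.
echo "org.quartz.jobStore.driverDelegateClass=$(quartz_delegate postgresql)"
```

Only one of the two driverDelegateClass lines should be active in the properties file at a time.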

install_config.conf [DS environment variables configuration script[install or start DS]]

install_config.conf is a bit complicated and is mainly used in the following two places.

  • DS Cluster Auto Installation.

When the auto-setup script is executed, the system loads the configs in install_config.conf and, based on the file content, auto-configures the related DS configuration files.

  • Startup and Shutdown DS Cluster.

The system loads the masters, workers, alert-server, API-servers and other parameters in the file to start up or shut down the DS cluster.
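
The file content below carries two notes that are easy to get wrong by hand: special characters in values must be escaped with a backslash, and starttlsEnable and sslEnable cannot both be true. The following is a minimal sketch of both checks. The function names `escape_value` and `check_mail_tls` are hypothetical and are not part of DS.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with DS) for preparing install_config.conf values.

# Prefix every character from the note's list  .*[]^${}\+?|()@#&  with a backslash.
escape_value() {
  printf '%s\n' "$1" | sed 's/[][*.^${}\\+?|()@#&]/\\&/g'
}

# Fail when both mail switches are true, which the file forbids.
check_mail_tls() {
  starttls="$1"
  ssl="$2"
  if [ "$starttls" = "true" ] && [ "$ssl" = "true" ]; then
    echo "starttlsEnable and sslEnable cannot both be true" >&2
    return 1
  fi
}

escape_value 'p[ss'        # prints p\[ss
check_mail_tls true false  # succeeds silently
```

Running such checks before the installation keeps a malformed value from being written into the generated configuration files.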

File Content

# Note:  please escape the character if the file contains special characters such as `.*[]^${}\+?|()@#&`.
#   eg: `[` escape to `\[`

# Database type (DS currently only supports PostgreSQL and MySQL)

# Database url and port

# Database name

# Database username

# Database password

# ZooKeeper url

# DS installation path, such as '/data1_1T/dolphinscheduler'

# Deployment user
# Note: Deployment user needs 'sudo' privilege and has rights to operate HDFS.
#     Root directory must be created by the same user if using HDFS, otherwise permission related issues will be raised.

# The following are alert-service configs
# Mail server host

# Mail server port

# Mail sender

# Mail user

# Mail password

# Whether mail supports TLS

# Whether mail supports SSL. Note: starttlsEnable and sslEnable cannot both be set to true.

# Mail server host, same as mailServerHost

# Specify which storage to use for uploaded resources, such as sql files. Supported options are HDFS, S3 and NONE: HDFS uploads to HDFS, and NONE disables this function.

# if S3, write S3 address. HA, for example: s3a://dolphinscheduler,
# Note: for S3, make sure to create the root directory /dolphinscheduler

# If parameter 'resourceStorageType' is S3, following configs are needed:

# If ResourceManager supports HA, enter the master and standby node IPs or hostnames, eg: '192.168.xx.xx,192.168.xx.xx'. If ResourceManager runs in standalone mode, or yarn is not used, set yarnHaIps="".

# If ResourceManager runs in standalone mode, set the ResourceManager node IP or hostname; otherwise keep the default.

# Storage path when using HDFS/S3

# HDFS/S3 root user

# The following are Kerberos configs

# Whether Kerberos is enabled

# Kdc krb5 config file path

# Keytab username

# Username keytab path

# API-service port

# All hosts on which DS is deployed

# SSH port, default 22

# Master service hosts

# All hosts on which the worker service is deployed
# Note: Each worker needs a worker group name; the default name is "default"

# Host on which the alert-service is deployed

# Host on which the API-service is deployed
apiServers="ds1"

[load environment variables configs]

When using shell to commit tasks, DolphinScheduler exports environment variables from the script under bin/env/. The main configurations include JAVA_HOME, the metadata database, the registry center, and task configuration.

# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/opt/soft/java}

# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-postgresql}

# DolphinScheduler server related configuration

# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}

# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/opt/soft/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/opt/soft/hadoop/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/opt/soft/spark1}
export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
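
Each export above uses the shell's `${VAR:-default}` expansion, so a value already present in the environment takes precedence and the path after `:-` is only a fallback. A small sketch of that behaviour follows; the function name `resolve_java_home` and the path /usr/lib/jvm/custom are made up for illustration.

```shell
#!/bin/sh
# Demonstrates ${VAR:-default}: the default applies only when VAR is unset or empty.
resolve_java_home() {
  echo "${JAVA_HOME:-/opt/soft/java}"
}

unset JAVA_HOME
resolve_java_home              # prints /opt/soft/java

JAVA_HOME=/usr/lib/jvm/custom  # hypothetical path
resolve_java_home              # prints /usr/lib/jvm/custom
```

In practice this means a deployment can override any of these locations by exporting the variable before the DS services start, without editing the script.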


Services logback configs

Service name                   logback config name
API-service logback config     logback-api.xml
master-service logback config  logback-master.xml
worker-service logback config  logback-worker.xml
alert-service logback config   logback-alert.xml