Download the latest Oozie distribution from http://yahoo.github.com/oozie/downloads.
Expand the Oozie distribution tar.gz.
A Java JRE should be in the PATH.
Set the OOZIE_HOME environment variable to the directory where the expanded Oozie distribution is located.
NOTE: The OOZIE_HOME environment variable is only required and used by the Oozie server. It is not used by the Oozie client.
Add ${OOZIE_HOME}/bin to the PATH.
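The environment setup described above can be sketched in a few shell lines. This is a minimal sketch, assuming the distribution was expanded under /opt/oozie; that path is a hypothetical choice, not something the distribution requires.

```shell
# Hypothetical install location: adjust to wherever the tar.gz was expanded.
export OOZIE_HOME="${OOZIE_HOME:-/opt/oozie}"

# Put the Oozie scripts on the PATH.
export PATH="${OOZIE_HOME}/bin:${PATH}"

# Warn (without aborting) if no Java runtime is reachable through the PATH.
if ! command -v java >/dev/null 2>&1; then
  echo "warning: java not found in PATH" >&2
fi
```

Placing these lines in the profile of the user that runs Tomcat would presumably make them persistent across sessions.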
The Oozie WAR is bundled without the Hadoop JAR files and without the ExtJS library. The Hadoop JARs are required to run Oozie. The ExtJS library is optional; it is only required for the Oozie web-console to work.
The ExtJS library must be downloaded separately, and it must be version 2.2. It is not bundled with Oozie because it uses a different license.
Use the ${OOZIE_HOME}/bin/addtowar.sh script to add the Hadoop JARs and the ExtJS library to the Oozie WAR file.
Usage:
    Usage  : addtowar <OPTIONS>
    Options: -inputwar INPUT_OOZIE_WAR
             -outputwar OUTPUT_OOZIE_WAR
             [-hadoop HADOOP_VERSION HADOOP_PATH]
             [-extjs EXTJS_PATH]
             [-jars JARS_PATH] (multiple JAR path separated by ':')
The original Oozie WAR file is at ${OOZIE_HOME}/oozie.war.
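As an illustration of the options listed above, the following sketch wraps one possible addtowar.sh invocation in a helper function. The function name build_full_war, every path, and the 0.20.2 version string are assumptions made for the example; check which HADOOP_VERSION values the script in your distribution accepts. The helper only prints the command when the script is not installed, so it is safe to try anywhere.

```shell
# Run addtowar.sh if it is installed; otherwise print the command (dry run).
# build_full_war is a hypothetical helper, not part of Oozie.
build_full_war() {
  oozie_home="$1"
  hadoop_version="$2"
  hadoop_path="$3"
  extjs_path="$4"
  output_war="$5"
  script="${oozie_home}/bin/addtowar.sh"
  if [ -x "$script" ]; then
    "$script" -inputwar "${oozie_home}/oozie.war" -outputwar "$output_war" \
      -hadoop "$hadoop_version" "$hadoop_path" -extjs "$extjs_path"
  else
    echo "dry run: $script -inputwar ${oozie_home}/oozie.war -outputwar $output_war -hadoop $hadoop_version $hadoop_path -extjs $extjs_path"
  fi
}

# Example call with hypothetical locations.
build_full_war /opt/oozie 0.20.2 /opt/hadoop /tmp/ext-2.2 /tmp/oozie-full/oozie.war
```

Writing the output WAR to a separate directory keeps the original ${OOZIE_HOME}/oozie.war untouched, which makes it easy to rebuild with different JARs later.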
After the Hadoop JARs and the ExtJS library have been added to the oozie.war file, Oozie is ready for deployment.
If present, delete any previous oozie.war and oozie directory from Tomcat's webapps/ directory.
Copy the oozie.war file (the one that contains the Hadoop JARs and the ExtJS library) to Tomcat's webapps/ directory.
IMPORTANT: Only one Oozie instance can be deployed per Tomcat instance.
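The deployment steps above (remove any previous deployment, then copy the new WAR) could be scripted roughly as follows. This is a sketch under stated assumptions: deploy_oozie_war is a hypothetical helper, and the webapps/ location is passed in by the caller because it depends on the Tomcat installation.

```shell
# Replace any previous Oozie deployment in Tomcat's webapps/ directory.
# $1: WAR that already contains the Hadoop JARs (and optionally ExtJS)
# $2: Tomcat webapps/ directory
deploy_oozie_war() {
  war_file="$1"
  webapps_dir="$2"
  if [ ! -f "$war_file" ]; then
    echo "WAR not found: $war_file" >&2
    return 1
  fi
  if [ ! -d "$webapps_dir" ]; then
    echo "webapps directory not found: $webapps_dir" >&2
    return 1
  fi
  # Delete the previous oozie.war and the expanded oozie directory, if present.
  rm -rf "${webapps_dir}/oozie" "${webapps_dir}/oozie.war"
  # Install the new WAR under the name Tomcat will deploy as /oozie.
  cp "$war_file" "${webapps_dir}/oozie.war"
}
```

A call might look like `deploy_oozie_war /tmp/oozie-full/oozie.war /opt/tomcat/webapps`, where both paths are examples only.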
Oozie works with HSQL, MySQL and Oracle databases.
If using HSQL, no extra driver is needed because Oozie bundles the HSQL JDBC driver. HSQL is an embedded in-memory database, so all data is lost when Oozie stops running.
If using MySQL or Oracle, the corresponding JDBC driver JAR files must be in the Oozie classpath (added to the Oozie WAR or placed in Tomcat's common/lib directory). A database should be created for Oozie; Oozie creates its tables automatically.
The bin/addtowar.sh script has an option -jars that can be used to add the Oracle or MySQL JDBC driver JARs to the Oozie WAR file.
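Because the -jars option takes several JAR paths separated by ':', a small helper can build that value from a list of files. The sketch below is illustrative only: join_jar_paths is a hypothetical function and the driver file names are made-up examples.

```shell
# Join the given JAR paths with ':' as expected by the -jars option.
join_jar_paths() {
  old_ifs="$IFS"
  IFS=":"
  joined="$*"        # "$*" joins arguments with the first character of IFS
  IFS="$old_ifs"
  printf '%s\n' "$joined"
}

# Hypothetical driver locations.
JARS_PATH=$(join_jar_paths /opt/jdbc/mysql-connector-java.jar /opt/jdbc/ojdbc6.jar)
echo "-jars ${JARS_PATH}"
```

The resulting value can then be appended to an addtowar.sh command line alongside the -inputwar and -outputwar options.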
Oozie configuration is always read from the ${OOZIE_HOME}/conf directory.
The Oozie configuration is distributed in 3 different files:
All Oozie configuration properties and their default values are defined in the oozie-default.xml file.
Oozie resolves configuration property values in the following order:
The OOZIE_CONFIG_FILE environment variable can be set to make Oozie use an alternate configuration file instead of the oozie-site.xml file. The alternate file must be in the ${OOZIE_HOME}/conf/ directory.
NOTE: The oozie-default.xml file found in the ${OOZIE_HOME}/conf/ directory is not used by Oozie; it is there for reference purposes only.
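Since the alternate configuration file has to live in ${OOZIE_HOME}/conf/, it can help to verify that before exporting the variable. The following is a minimal sketch, assuming OOZIE_CONFIG_FILE holds a file name relative to that directory (the text above does not state whether a name or a full path is expected); use_oozie_config is a hypothetical helper.

```shell
# Select an alternate site file, but only if it exists in ${OOZIE_HOME}/conf/.
# Assumption: OOZIE_CONFIG_FILE takes a file name relative to that directory.
use_oozie_config() {
  config_name="$1"
  if [ -f "${OOZIE_HOME}/conf/${config_name}" ]; then
    export OOZIE_CONFIG_FILE="$config_name"
  else
    echo "not found: ${OOZIE_HOME}/conf/${config_name}" >&2
    return 1
  fi
}
```

For example, `use_oozie_config oozie-site-staging.xml` (a made-up file name) would be run in the environment that starts the Oozie server.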
By default Oozie logs to the ${OOZIE_HOME}/logs/ directory in 4 different files:
Oozie log configuration is defined in the oozie-log4j.properties file.
If the Oozie log configuration file changes, Oozie reloads the new settings dynamically.
The OOZIE_LOG4J_FILE environment variable can be set to indicate an alternate Oozie logging configuration file. The alternate file must be in the ${OOZIE_HOME}/conf/ directory.
The OOZIE_LOG4J_RELOAD environment variable can be set to specify the logging configuration reload interval in seconds. The default value is 10 seconds.
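The two logging variables above might be set as follows before starting the Oozie server. This is only a sketch: the alternate file name is hypothetical, and it is assumed here that OOZIE_LOG4J_FILE takes a file name relative to ${OOZIE_HOME}/conf/.

```shell
# Hypothetical alternate log4j file; it must live in ${OOZIE_HOME}/conf/.
export OOZIE_LOG4J_FILE="oozie-log4j-debug.properties"

# Check for logging configuration changes every 30 seconds
# instead of the default 10.
export OOZIE_LOG4J_RELOAD=30
```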
Oozie can work with the Hadoop 20 with Security distribution, which supports Kerberos authentication.
Oozie authentication is configured using the following configuration properties (default values shown):
    oozie.services.ext=org.apache.oozie.service.HadoopAccessorService
    oozie.service.HadoopAccessorService.kerberos.enabled=false
    local.realm=LOCALHOST
    oozie.service.HadoopAccessorService.keytab.file=${user.home}/oozie.keytab
    oozie.service.HadoopAccessorService.kerberos.principal=${user.name}/localhost@{local.realm}
The above default values are for a Hadoop 0.20.2 distribution without Kerberos authentication.
To use a Hadoop 20 with Security distribution (regardless of using Simple or Kerberos authentication) the following property must be set:
oozie.services.ext=org.apache.oozie.service.KerberosHadoopAccessorService
To enable Kerberos authentication, the following property must be set:
oozie.service.HadoopAccessorService.kerberos.enabled=true
When using Kerberos authentication, the following properties must be set to the correct values (default values shown):
    local.realm=LOCALHOST
    oozie.service.HadoopAccessorService.keytab.file=${user.home}/oozie.keytab
    oozie.service.HadoopAccessorService.kerberos.principal=${user.name}/localhost@{local.realm}
IMPORTANT: When using Oozie with a Hadoop 20 with Security distribution, the Oozie user in Hadoop must be configured as a proxy user.
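To make the Kerberos settings above concrete, the sketch below prints them as Hadoop-style `<property>` elements of the kind an oozie-site.xml file would contain. The helper names (oozie_property, kerberos_site_xml), the realm, the keytab path, and the principal are all assumptions for illustration; values are not XML-escaped, so this only suits simple strings.

```shell
# Emit one Hadoop-style <property> element (no XML escaping is performed).
oozie_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print a site configuration enabling Kerberos.
# $1: realm, $2: keytab file, $3: Kerberos principal
kerberos_site_xml() {
  echo '<configuration>'
  oozie_property oozie.services.ext org.apache.oozie.service.KerberosHadoopAccessorService
  oozie_property oozie.service.HadoopAccessorService.kerberos.enabled true
  oozie_property local.realm "$1"
  oozie_property oozie.service.HadoopAccessorService.keytab.file "$2"
  oozie_property oozie.service.HadoopAccessorService.kerberos.principal "$3"
  echo '</configuration>'
}

# Hypothetical realm, keytab location, and principal.
kerberos_site_xml EXAMPLE.COM /etc/oozie/oozie.keytab oozie/host.example.com@EXAMPLE.COM
```

Redirecting the output to a file in ${OOZIE_HOME}/conf/ would be one way to produce a starting point that is then reviewed by hand.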
Oozie has a basic authorization model:
If security is disabled all users are admin users.
Oozie security is set via the following configuration property (default value shown):
oozie.service.AuthorizationService.security.enabled=false
If security is enabled, the admin users are read from the ${OOZIE_HOME}/conf/adminusers.txt file:
The SQL database used by Oozie is configured using the following configuration properties (default values shown):
    oozie.db.schema.name=oozie
    oozie.db.schema.create=true
    oozie.service.StoreService.jdbc.driver=org.hsqldb.jdbcDriver
    oozie.service.StoreService.jdbc.url=jdbc:hsqldb:mem:${oozie.db.schema.name}
    oozie.service.StoreService.jdbc.username=sa
    oozie.service.StoreService.jdbc.password=
    oozie.service.StoreService.pool.max.active.conn=10
The default values are for HSQLDB, an embedded in-memory database bundled with Oozie.
To use MySQL or Oracle the corresponding JDBC driver JAR must be in the classpath (or added to the Oozie WAR file).
NOTE: If the oozie.db.schema.create property is set to true (default), the Oozie tables will be created automatically if they are not found in the database at Oozie startup. In a production system this option should be set to false once the database tables have been created.
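As an example of overriding the database defaults above, the following sketch prints the StoreService properties for a MySQL database as Hadoop-style `<property>` elements. The helper names, host, port 3306, database name, user, and password are hypothetical; the driver class com.mysql.jdbc.Driver and the jdbc:mysql:// URL form are those of MySQL Connector/J and are assumed to match the driver JAR you added.

```shell
# Emit one Hadoop-style <property> element (no XML escaping is performed).
oozie_property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Print the database properties for a MySQL-backed Oozie.
# $1: host, $2: database name, $3: user, $4: password, $5: true|false (create tables)
mysql_store_properties() {
  oozie_property oozie.db.schema.name "$2"
  oozie_property oozie.db.schema.create "$5"
  oozie_property oozie.service.StoreService.jdbc.driver com.mysql.jdbc.Driver
  oozie_property oozie.service.StoreService.jdbc.url "jdbc:mysql://$1:3306/$2"
  oozie_property oozie.service.StoreService.jdbc.username "$3"
  oozie_property oozie.service.StoreService.jdbc.password "$4"
}

# Hypothetical production-style settings: tables already exist, so creation is off.
mysql_store_properties localhost oozie oozie_user change-me false
```

Passing true as the last argument on the first start and false afterwards follows the recommendation in the note above.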
Oozie has a system ID that is used to generate the Oozie temporary runtime directory, the workflow job IDs, and the workflow action IDs.
Two Oozie systems running with the same ID will not conflict, but troubleshooting is easier when each system has a different ID, because the resources created or used by each one can then be told apart (default value shown):
oozie.system.id=oozie-${user.name}
Copy and expand the oozie-client TAR.GZ file bundled with the distribution. Add the bin/ directory to the PATH.
Refer to the Command Line Interface Utilities document for a full reference of the oozie command line tool.
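The client installation step can be sketched as a small helper that expands the tarball and reports where the bin/ directory ended up. The function install_oozie_client is hypothetical, and because the layout inside the client tarball is not described here, the sketch simply searches for the first directory named bin.

```shell
# Expand the client tarball into a destination directory and print the
# path of the first bin/ directory found inside it.
# $1: oozie-client tar.gz file, $2: destination directory
install_oozie_client() {
  tarball="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir" || return 1
  tar -xzf "$tarball" -C "$dest_dir" || return 1
  find "$dest_dir" -type d -name bin | head -n 1
}
```

A caller could then extend the PATH with something like `PATH="$(install_oozie_client ./oozie-client.tar.gz "$HOME/oozie-client"):$PATH"`, where both arguments are example values.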