Configuring HDFS High Availability

You can use Cloudera Manager to configure your CDH4 cluster for HDFS High Availability (HA). High Availability is not supported for CDH3 clusters.

An HDFS HA cluster is configured with two NameNodes: an Active NameNode and a Standby NameNode. Only one NameNode can be active at any point in time. HDFS High Availability depends on maintaining a log of all namespace modifications in a location available to both NameNodes, so that in the event of a failure the Standby NameNode has up-to-date information about the edits and location of blocks in the cluster.

There are two implementations available for maintaining the copies of the edit logs:

  • High Availability using Quorum-based Storage
  • High Availability using an NFS-mounted shared edits directory

Quorum-based Storage relies upon a set of JournalNodes, each of which maintains a local edits directory that logs the modifications to the namespace metadata.

The alternative is to use an NFS-mounted shared edits directory (typically on a remote filer) to which both the Active and Standby NameNodes have read/write access.
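For reference, the two approaches differ mainly in the value of the dfs.namenode.shared.edits.dir property that Cloudera Manager generates in hdfs-site.xml. The following is a minimal sketch, assuming a nameservice named nameservice1 and hypothetical JournalNode and mount-point names; the actual values are generated and managed by Cloudera Manager:

    <!-- Quorum-based Storage: a qjournal URI listing the JournalNodes -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/nameservice1</value>
    </property>

    <!-- NFS-mounted shared edits directory: a local path mounted from the filer -->
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>file:///dfs/shared</value>
    </property>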

Once you have enabled High Availability, you can enable Automatic Failover, which automatically fails over to the Standby NameNode if the Active NameNode fails. You can also initiate a manual failover from Cloudera Manager.
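Outside Cloudera Manager, you can also check which NameNode is currently active, or request a manual failover, with the standard hdfs haadmin tool. A minimal sketch, assuming the NameNodes are registered under the identifiers nn1 and nn2 (the identifiers assigned on your cluster may differ):

    # Show whether each NameNode is currently active or standby
    sudo -u hdfs hdfs haadmin -getServiceState nn1
    sudo -u hdfs hdfs haadmin -getServiceState nn2

    # Request a manual failover so that nn2 becomes the Active NameNode
    sudo -u hdfs hdfs haadmin -failover nn1 nn2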

See the CDH4 High Availability Guide for a more detailed introduction to High Availability with CDH4.

  Important:

Enabling or Disabling High Availability will shut down your HDFS service, and the services that depend on it – MapReduce, YARN, and HBase. Therefore, you should not do this while you have jobs running on your cluster. Further, once HDFS has been restored, the services that depend upon it must be restarted, and the client configurations for HDFS must be redeployed.

  Important:

Enabling or Disabling High Availability will cause the previous monitoring history to become unavailable.

Enabling High Availability with Quorum-based Storage

After you have installed HDFS on your CDH4 cluster, the Enable High Availability workflow leads you through adding a second (Standby) NameNode and configuring JournalNodes.

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Enable High Availability. (This button does not appear if this is a CDH3 version of the HDFS service.)
  4. The next screen shows the hosts that are eligible to run a Standby NameNode and the JournalNodes.
    1. Select Enable High Availability with Quorum-based Storage as the High Availability Type.
    2. Select the host where you want the Standby NameNode to be set up. The Standby NameNode cannot be on the same host as the Active NameNode, and the host that is chosen should have the same hardware configuration (RAM, Disk space, number of cores, etc.) as the Active NameNode.
    3. Select an odd number of hosts (a minimum of three) to act as JournalNodes. JournalNodes should be hosted on machines with hardware specifications similar to those of the NameNodes. It is recommended that you put one JournalNode on each of the hosts running the Active and Standby NameNodes, and the third JournalNode on a host with similar hardware, such as the JobTracker host.
    4. Click Continue.
  5. Enter a directory location for the JournalNode edits directory into the fields for each JournalNode host.
    • You may enter only one directory for each JournalNode. The names/paths do not need to be the same on every JournalNode.
    • The directories you specify must be empty and must have the appropriate ownership and permissions (a sample setup is sketched after this list).
    • If the directories are not empty, Cloudera Manager will not delete the contents; however, in that case the data should be in sync across the edits directories of the JournalNodes and should have the same version data as the NameNodes.
  6. You can choose whether the workflow will restart the dependent services and redeploy the client configuration for HDFS. To do this manually rather than have it done as part of the workflow, uncheck these extra options.
  7. Click Continue. Cloudera Manager proceeds to execute the set of commands that will stop the dependent services, delete, create, and configure roles and directories as appropriate, and will restart the dependent services and deploy the new client configuration if those options were selected.
  8. There are some additional steps you must perform if you want to use Hive, Impala, or Hue in a cluster with High Availability configured. Follow the Post Setup Steps described below.
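As noted in step 5, each JournalNode edits directory should already exist, be empty, and be writable by the hdfs user before you run the workflow. The following is a minimal sketch of preparing such a directory on a JournalNode host, assuming the hypothetical path /data/1/dfs/jn; the owning group may differ on your cluster:

    # Create the JournalNode edits directory and hand it to the hdfs user
    sudo mkdir -p /data/1/dfs/jn
    sudo chown -R hdfs:hdfs /data/1/dfs/jn
    sudo chmod 700 /data/1/dfs/jn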

Enabling High Availability using NFS Shared Edits Directory

After you have installed HDFS on your CDH4 cluster, the Enable High Availability workflow leads you through adding a second (Standby) NameNode and configuring the shared edits directory.

The shared edits directory is what the Standby NameNode uses to stay up-to-date with all the file system changes the Active NameNode makes. Note that you must have a shared directory already configured to which both NameNode machines have read/write access. Typically, this is a remote filer which supports NFS and is mounted on each of the NameNode machines. This directory must be writable by the hdfs user, and must be empty before you run the Enable HA workflow.
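As a rough illustration of this prerequisite, the filer export must be mounted locally on each NameNode host and be writable by the hdfs user before you run the workflow. A minimal sketch, assuming a hypothetical export filer.example.com:/exported/namenode and mount point /dfs/shared (the NFS mount options shown are illustrative only):

    # On each NameNode host: create the mount point and mount the filer export
    sudo mkdir -p /dfs/shared
    sudo mount -t nfs -o tcp,soft,intr,timeo=10,retrans=10 filer.example.com:/exported/namenode /dfs/shared

    # The mounted directory must be empty and writable by the hdfs user
    sudo chown hdfs:hdfs /dfs/shared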

You can enable High Availability from the Actions menu on the HDFS Service page in a CDH4 cluster, or from the HDFS Service Instances tab.

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Enable High Availability. (This button does not appear if this is a CDH3 version of the HDFS service.)
  4. The next screen shows the hosts that are eligible to run a Standby NameNode.
    1. Select Enable High Availability with NFS shared edits directory as the High Availability Type.
    2. Select the host where you want the Standby NameNode to be installed, and click Continue. The Standby NameNode cannot be on the same host as the Active NameNode, and the host that is chosen should have the same hardware configuration (RAM, disk space, number of cores, etc.) as the Active NameNode.
  5. Confirm or enter the directories to be used as the name directories for the NameNode.
  6. Enter the absolute path of the local directory, on each NameNode host, that is mounted to the remote shared edits directory. For example, if hostA has /dfs/sharedA mounted to nfs:///exported/namenode and hostB has /dfs/sharedB mounted to the same NFS location, enter /dfs/sharedA for hostA and /dfs/sharedB for hostB. (/dfs/sharedA and /dfs/sharedB can be the same path.) Configure only one shared edits directory. This directory must be mounted read/write on both NameNode machines, must be writable by the hdfs user, and must be empty when you run the enable HA command.
  7. You can choose whether the workflow will restart the dependent services and redeploy the client configuration for HDFS. To do this manually rather than have it done as part of the workflow, uncheck these extra options.
  8. Click Continue to proceed.
  9. Cloudera Manager will now perform the steps to set up the Active and Standby NameNodes.
  10. When all the steps have been completed, click Finish.
    • If the workflow fails, inspect the error message and logs for the cause of failure. After addressing the cause of failure, click Retry to re-execute all the steps, or perform the remaining steps using the commands available in the Actions menu. Note that Retry will not work for workflows that fail after the "Bootstrapping Standby NameNode" step. To revert changes made by the failed workflow, use the Disable High Availability action available in the Instances tab.
    • When HA is enabled, there will no longer be a Secondary NameNode role running on your cluster. However, the Secondary NameNode's checkpoint directories are not deleted from the host.
    • If you did not have the Enable High Availability workflow start your services and re-deploy your client configurations automatically, make sure you do so before you try to run jobs on your cluster.
  11. There are some additional steps you must perform if you want to use Hive, Impala, or Hue in a cluster with High Availability configured. Follow the Post Setup Steps described below.
  Note:

After you enable High Availability for the first time, there may be a time lag before the next Reports Manager re-indexing phase, which means certain reports may not be immediately available. Restarting the Reports Manager service will make those reports more quickly available.

Post Setup Steps for Hue and Hive

There are several configuration changes you must make for Hue, Hive, and Impala to work correctly once High Availability is enabled, whether you are using Quorum-based Storage or an NFS-mounted shared edits directory. After you enable HA, do the following:

Configuring Hue to work with High Availability

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click the Add button.
  4. Under the HttpFS column, select a host where you want to install the HttpFS role and click Continue.
  5. After you are returned to the Instances page, select the new HttpFS role.
  6. From the Actions for Selected menu, select Start (and confirm).
  7. After the command has completed, go to the Services tab and select your Hue service.
  8. From the Configuration menu, select View and Edit.
  9. The HDFS Web Interface Role property will now show the HttpFS role you just added. Select it instead of the NameNode role, and Save your changes. (The HDFS Web Interface Role property is under the Service-Wide Configuration category; the resulting Hue setting is sketched after these steps.)
  10. Restart the Hue service for the changes to take effect.
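The effect of step 9 is that Hue talks to HDFS through the HA-aware HttpFS role rather than through a single NameNode's WebHDFS endpoint. For reference, a minimal sketch of the resulting setting in hue.ini, assuming a hypothetical HttpFS host and the default HttpFS port of 14000 (Cloudera Manager generates this for you when you save and restart):

    [hadoop]
      [[hdfs_clusters]]
        [[[default]]]
          # Point Hue at HttpFS instead of a single NameNode
          webhdfs_url=http://httpfs-host.example.com:14000/webhdfs/v1/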

Upgrading the Hive Metastore for HDFS High Availability

To upgrade the Hive metastore to work with High Availability, do the following:

  1. Go to the Services tab and select the Hive service.
  2. From the Actions menu, select Stop....
      Note:

    You may want to stop the Hue and Impala services first, if present, as they depend on the Hive service.

    Confirm that you want to stop the service.

  3. When the service has stopped, back up the Hive metastore database to persistent storage.
  4. From the Actions menu, click Update Hive Metastore NameNodes... and confirm the command. (A sketch of the equivalent manual metastore update follows these steps.)
  5. From the Actions menu on the Hive service page, select Start... to start the Hive Metastore Service. Also restart the Hue and Impala services if you stopped them prior to updating the metastore.
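For reference, the Update Hive Metastore NameNodes command rewrites the HDFS locations recorded in the metastore so that they refer to the nameservice rather than to a specific NameNode host. A minimal sketch of the equivalent manual update using the Hive metatool, assuming the nameservice is named nameservice1 and a hypothetical old NameNode address of namenode-host.example.com:8020; run it on a host that has the Hive configuration and access to the metastore database, while the metastore is stopped:

    # List the filesystem roots currently recorded in the metastore
    hive --service metatool -listFSRoot

    # Rewrite locations to use the HA nameservice instead of the old NameNode address
    hive --service metatool -updateLocation hdfs://nameservice1 hdfs://namenode-host.example.com:8020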

Enabling Automatic Failover

You must have HDFS High Availability enabled in order to enable Automatic Failover.

  Important:

Enabling or Disabling Automatic Failover will shut down your HDFS service, and requires the services that depend on it to be shut down.

To enable Automatic Failover:

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Enable Automatic Failover...
  4. Confirm that you want to take this action. This will stop the NameNodes for the Nameservice, create and configure Failover Controllers for each NameNode, initialize the High Availability state in ZooKeeper, and start the NameNodes and Failover Controllers.
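The ZooKeeper initialization mentioned in step 4 corresponds to formatting the HA state znode with the zkfc tool. On a Cloudera Manager-managed cluster this is done for you; the following minimal sketch is for reference only, or for when you need to re-initialize the state manually:

    # Format (initialize) the HA state znode in ZooKeeper, run as the hdfs user
    sudo -u hdfs hdfs zkfc -formatZK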
  Note:

If you are using NFS-based High Availability, a fencing method must be configured in order for failover (either automatic or manual) to function — Cloudera Manager configures this automatically. This is not required with Quorum-based Storage. See Fencing Methods if you want more information.

  Note:

If you started your services and re-deployed your client configurations after you enabled HA, you should not need to do so again now. If you did not start them after enabling HA, you must do so now, before you attempt to run any jobs on your cluster.

  Important:
If you change the NameNode Service RPC Port (dfs.namenode.servicerpc-address) while automatic failover is enabled, this will cause a mismatch between the NameNode address saved in the ZooKeeper /hadoop-ha znode and the NameNode address that the FailoverController is configured with. This will prevent the FailoverControllers from restarting. If you need to change the NameNode Service RPC Port after AutoFailover has been enabled, you must do the following to re-initialize the znode:
  1. Stop HDFS.
  2. Configure the service rpc port in the Service-Wide HDFS configuration:
    1. From the HDFS service, Configuration tab, select View and Edit.
    2. Search for "dfs.namenode.servicerpc" which should display the NameNode Service RPC Port property. (It is found under the NameNode (Default) role group, Ports and Addresses category).
    3. Change the port value as needed.
  3. On a ZooKeeper server host, run zkcli.sh:

    If using parcels: /opt/cloudera/parcels/CDH/lib/zookeeper/bin/zkCli.sh

    If using packages: /usr/lib/zookeeper/bin/zkCli.sh

  4. Execute the following to remove the pre-configured nameservice. This example assumes the name of the nameservice is nameservice1. You can identify the nameservice from the "High Availability and Federation" region on the Instances tab of HDFS. (A full zkCli session is sketched after these steps.)
     rmr /hadoop-ha/nameservice1
  5. Navigate to the HDFS Instances tab. To the right of the nameservice in the "High Availability and Federation" region, there is an Actions menu. From the Actions menu, select Initialize High Availability State in Zookeeper.
  6. Start HDFS.
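A minimal sketch of the full zkCli session for steps 3 and 4, assuming ZooKeeper is listening on the default port 2181 on the local host and the nameservice is nameservice1 (the parcel path is shown; adjust for packages):

    /opt/cloudera/parcels/CDH/lib/zookeeper/bin/zkCli.sh -server localhost:2181

    # Inside the zkCli shell: confirm the znode exists, remove it, then verify
    ls /hadoop-ha
    rmr /hadoop-ha/nameservice1
    ls /hadoop-ha
    quit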

Disabling Automatic Failover

  Note:

You must disable Automatic Failover before you can disable High Availability.

To disable Automatic Failover

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Disable Automatic Failover...
  4. Confirm that you want to take this action. Cloudera Manager will stop the NameNodes, remove the Failover Controllers, and restart the NameNodes, transitioning one of them to be the Active NameNode.

Disabling High Availability

  Note:

If you have enabled Automatic Failover, you must disable it before you can disable High Availability.

To disable High Availability

  1. From the Services tab, select your HDFS service.
  2. Click the Instances tab.
  3. Click Disable High Availability...
  4. Confirm that you want to take this action.
    • If you are using Quorum-based Storage, you will have the option of disabling the Quorum-based Storage or leaving it enabled. If you are using NameNode Federation, you should consider leaving it enabled.
    • Cloudera Manager ensures that one NameNode is active and saves the namespace. It then stops the Standby NameNode, creates a Secondary NameNode, removes the Standby NameNode role, and restarts all the HDFS services.
    • Although the Standby NameNode role is removed, its name directories are not deleted. Empty these directories after making a backup of their contents.
    • As when you enabled High Availability, you can have your dependent services restarted and your client configuration redeployed as part of the Disable High Availability workflow. If you choose not to, you must do this manually.

Fencing Methods

In order to ensure that only one NameNode is active at a time, a fencing method is required for the shared directory. During a failover, the fencing method is responsible for ensuring that the previous Active NameNode no longer has access to the shared edits directory, so that the new Active NameNode can safely proceed writing to it.

For details of the fencing methods supplied with CDH4, and how fencing is configured, see the Fencing Configuration section in the CDH4 High Availability Guide.

By default, Cloudera Manager configures HDFS to use a shell fencing method (shell(./cloudera_manager_agent_fencer.py)) that takes advantage of the Cloudera Manager agent. However, you can configure HDFS to use the sshfence method, or you can add your own shell fencing scripts, instead of or in addition to the one Cloudera Manager provides.
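For reference, fencing is controlled by the dfs.ha.fencing.methods property (and, for sshfence, dfs.ha.fencing.ssh.private-key-files). The following is a minimal sketch of what configuring sshfence might look like; the private-key path is a hypothetical example, and on a managed cluster you would set these values through the fencing parameters described below rather than by editing hdfs-site.xml directly:

    <property>
      <name>dfs.ha.fencing.methods</name>
      <!-- List one method per line; methods are tried in order.
           Cloudera Manager's default is shell(./cloudera_manager_agent_fencer.py). -->
      <value>sshfence</value>
    </property>
    <property>
      <name>dfs.ha.fencing.ssh.private-key-files</name>
      <!-- Hypothetical path to a private key that can reach the other NameNode host -->
      <value>/var/lib/hadoop-hdfs/.ssh/id_rsa</value>
    </property>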

The fencing parameters are found in the Service-Wide section of the Configuration tab for your HDFS service.

Converting from NFS-mounted shared edits directory to Quorum-based Storage

Converting your High Availability configuration from an NFS-mounted shared edits directory to Quorum-based Storage involves disabling your current High Availability configuration, then enabling High Availability using Quorum-based Storage.

  1. Disable High Availability (see Disabling High Availability).
  2. Although the Standby NameNode role is removed, its name directories are not deleted. Back up and empty these directories (a sketch follows this list).
  3. Enable High Availability with Quorum-based Storage (see Enabling High Availability with Quorum-based Storage).
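A minimal sketch of backing up and emptying a former Standby NameNode name directory, assuming the hypothetical path /dfs/nn; check the configured name directories for the actual paths on your cluster:

    # Back up the old Standby NameNode name directory, then empty it
    sudo tar -czf /root/standby-nn-name-dir-backup.tar.gz -C /dfs/nn .
    sudo rm -rf /dfs/nn/*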

Converting from Quorum-based Storage to NFS-mounted shared edits directory

To convert your High Availability configuration from Quorum-based Storage to an NFS-mounted shared edits directory, you disable your current High Availability configuration, configure your NFS-mounted shared edits directory, then enable High Availability using your NFS-mounted directory.

  1. Disable High Availability (see Disabling High Availability).
  2. Although the Standby NameNode role is removed, its name directories are not deleted. Empty these directories.
  3. Enable High Availability using the NFS-mounted directory. Note that you must have a shared directory already configured to which both NameNode machines have read/write access. See Enabling High Availability using NFS Shared Edits Directory for detailed instructions.