Adding Flume

The Flume NG service is not added by the installation wizard. The wizard installs the Flume packages, but the agents are not configured or started as part of First Run, so you must add Flume as a separate service. When you add the service, configure your Flume agents before you start their role instances.

For details of how to modify configurations and use configuration overrides in Cloudera Manager, see Modifying Service Configurations.

For detailed information about Flume agent configuration, see the Flume User Guide.

To install Flume agents on your cluster:

  1. Follow the initial steps (above) to select Flume as the service to be added.
  2. Select the hosts on which you want Flume agents to be installed.
  3. Click Continue. The Flume agents are installed on the hosts you selected.

The Flume agents are not started automatically. You must first configure your agents appropriately before you start them, following the instructions below.

A default Flume flow configuration is provided as an example in the Configuration properties for the Flume agents; you should replace it with your own configuration. The default configuration defines a single agent.

A single configuration file can contain the configuration for multiple agents, since each configuration property is prefixed by the agent name. You can then set the agents' names using role instance configuration overrides to specify the configuration applicable to each agent. Agent names do not have to be unique: different agent role instances can have the same name, which you can use to further simplify the configuration file. This is the recommended method to configure Flume.
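As an illustration of the agent-name prefix described above, the following is a minimal sketch of one flume.conf that defines two agents. The agent names (web, collector), the component names, the host name collector.example.com, and the port numbers are all hypothetical and chosen only for this example; the source, channel, and sink types and their properties are standard Flume NG ones, but check the Flume User Guide for the options your version supports.

```properties
# Hypothetical agent "web": accepts lines on a netcat source and
# forwards them to the collector agent over Avro.
web.sources = netcatSrc
web.channels = memCh
web.sinks = avroOut

web.sources.netcatSrc.type = netcat
web.sources.netcatSrc.bind = 0.0.0.0
web.sources.netcatSrc.port = 44444
web.sources.netcatSrc.channels = memCh

web.channels.memCh.type = memory
web.channels.memCh.capacity = 1000

web.sinks.avroOut.type = avro
web.sinks.avroOut.hostname = collector.example.com
web.sinks.avroOut.port = 4545
web.sinks.avroOut.channel = memCh

# Hypothetical agent "collector": receives Avro events and logs them.
collector.sources = avroIn
collector.channels = memCh
collector.sinks = logOut

collector.sources.avroIn.type = avro
collector.sources.avroIn.bind = 0.0.0.0
collector.sources.avroIn.port = 4545
collector.sources.avroIn.channels = memCh

collector.channels.memCh.type = memory
collector.channels.memCh.capacity = 1000

collector.sinks.logOut.type = logger
collector.sinks.logOut.channel = memCh
```

With a file like this, every role instance whose Agent Name is web reads only the web.* properties, and every instance named collector reads only the collector.* properties, so several hosts can share one file and one name.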

Flume NG can be installed on a cluster running either CDH3 or CDH4. However, monitoring of Flume is only supported if your cluster is running CDH4.1 or later, or CDH3u5 (refresh 2) or later.

  Note:

If you are using Flume to write to HDFS or HBase sinks, you must have at least one HDFS or HBase role instance on the Flume agent's host. If you do not want to run a daemon on the Flume agent's host, you can just add a Flume Gateway role on the host.
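For the HDFS case mentioned in this note, an agent definition might look like the sketch below. The agent name hdfsAgent, the component names, the port, and the NameNode URI are hypothetical placeholders; only the property keys are standard Flume NG settings.

```properties
# Hypothetical agent "hdfsAgent": receives Avro events and writes them to HDFS.
hdfsAgent.sources = avroIn
hdfsAgent.channels = memCh
hdfsAgent.sinks = hdfsOut

hdfsAgent.sources.avroIn.type = avro
hdfsAgent.sources.avroIn.bind = 0.0.0.0
hdfsAgent.sources.avroIn.port = 4545
hdfsAgent.sources.avroIn.channels = memCh

hdfsAgent.channels.memCh.type = memory

# The path below is a placeholder; substitute your own NameNode and directory.
hdfsAgent.sinks.hdfsOut.type = hdfs
hdfsAgent.sinks.hdfsOut.hdfs.path = hdfs://namenode.example.com:8020/flume/events
hdfsAgent.sinks.hdfsOut.hdfs.fileType = DataStream
hdfsAgent.sinks.hdfsOut.channel = memCh
```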

To configure your Flume agents:

  1. Go to the Flume Service page (by selecting your Flume service from the Services menu or from the All Services page).
  2. Pull down the Configuration tab, and select View and Edit.
  3. Select the Agent (Default) role group in the left-hand column. The settings you make here apply to the default role group, and therefore to all agent instances unless those instances are associated with a different role group or the settings are overridden for specific agents.
  4. Set the Agent Name property to the name of the agent (or one of the agents) whose configuration is defined in your flume.conf. You can specify only one agent name here — the name you specify will be used as the default for all Flume agent instances, unless you override the name for specific agents. You can have multiple agents with the same name — they will share the same configuration based on your configuration file.
  5. Copy the contents of your flume.conf file, in its entirety, into the Configuration File field. Unless overridden for specific agent instances, this flume.conf file will apply to all your agents. You can provide multiple agent configurations in this file and use Agent Name overrides to determine which configurations to use for each agent. This is the recommended procedure.

If you have specified multiple agent configurations in your flume.conf file, you must override the default agent name for the agent instances that should use a configuration other than the default.

To override the agent name for one or more specific agents:

  1. Pull down the Flume service Configuration tab, select Edit, and then select the Agent (Default) role group in the left-hand column.
  2. To override the Agent Name for one or more instances, move your cursor over the value area of the Agent Name property, and click Override Instances.
  3. Select the agent (role) instances you want to override.
  4. In the field labeled Change value of selected instances to: select "Other". (You can use the "Inherited Value" setting to return to the service-level value.)
  5. In the field that appears, type the agent name you want to use for the selected agents.
  6. Click Apply to have your change take effect.

After you have completed your configuration changes, you can start the Flume service, which will start all your Flume agents.

  Note:

If you need to modify your Flume configuration file after you have started the Flume service, you can use the Update Config... command from the Actions menu on the Flume Service Status page to update the configuration across Flume agents without shutting down the Flume service.