Using the LZO Parcel

This section describes how to install and use the LZO parcel in Cloudera Manager 4.6 or later.

The Repository

Add the appropriate repository to Cloudera Manager’s list of parcel repositories. The HADOOP_LZO parcel then becomes available on the parcel management screen. If required, the repository can be mirrored in the same way as the CDH repository.

The public customer repository can be found at: http://archive.cloudera.com/gplextras/parcels/latest.

Each Impala version requires a specific version of the HADOOP_LZO parcel. Whenever you upgrade one, you must upgrade the other to the corresponding version, as follows:
Impala Version    LZO Parcel Version
1.1.1             HADOOP_LZO-0.4.15-1.gplextras.p0.24
1.1.0             HADOOP_LZO-0.4.15-1.gplextras.p0.22
1.0.1             HADOOP_LZO-0.4.15-1.gplextras.p0.15
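
The pairing above can be kept in a small lookup so that upgrade scripts fail early on an unknown combination. This is only a sketch: the function name lzo_parcel_for is hypothetical, and the mapping contains just the three versions listed above.

```python
# Version pairs copied from the table above; extend as new releases appear.
LZO_PARCEL_FOR_IMPALA = {
    "1.1.1": "HADOOP_LZO-0.4.15-1.gplextras.p0.24",
    "1.1.0": "HADOOP_LZO-0.4.15-1.gplextras.p0.22",
    "1.0.1": "HADOOP_LZO-0.4.15-1.gplextras.p0.15",
}


def lzo_parcel_for(impala_version):
    """Return the HADOOP_LZO parcel version paired with an Impala version.

    Hypothetical helper: raises ValueError for versions not in the table.
    """
    try:
        return LZO_PARCEL_FOR_IMPALA[impala_version]
    except KeyError:
        raise ValueError(
            "No known HADOOP_LZO parcel for Impala %r" % impala_version
        ) from None


if __name__ == "__main__":
    print(lzo_parcel_for("1.1.1"))
```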

Activation

The HADOOP_LZO parcel can be downloaded, distributed, and activated in the same way as the CDH parcel. Once it is activated, you must reconfigure and restart each service that will use LZO.

MapReduce

  1. Add the following entries to the MapReduce Client Environment Safety Valve:
    1. Under the Configuration > View and Edit tab, search for "MapReduce Client Safety".
    2. In the MapReduce Client Environment Safety Valve, enter the following two lines:
      • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
      • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  2. Add the LZO codecs to the io.compression.codecs property under the MapReduce Service:
    1. Under the Configuration > View and Edit tab, search for "io.compression".
    2. In the Compression Codecs property, click in the field, then click the + sign to open a new value field.
    3. Add the following two codecs:
      • com.hadoop.compression.lzo.LzoCodec
      • com.hadoop.compression.lzo.LzopCodec
  3. Save your configuration changes.
  4. Restart MapReduce.
  5. Redeploy MapReduce Client Configuration.
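
The codec step above adds two class names to io.compression.codecs, which Hadoop reads as a comma-separated list. If you prepare that value outside Cloudera Manager, the following sketch shows one way to append the LZO codecs without creating duplicates. The function name with_lzo_codecs is hypothetical, and the existing codec in the usage line is only an example of what a cluster might already list.

```python
# The two codec class names from the MapReduce steps above.
LZO_CODECS = [
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
]


def with_lzo_codecs(current_value):
    """Append the LZO codecs to a comma-separated codec list.

    Existing entries keep their order; codecs already present are not
    added a second time, so the function is safe to run repeatedly.
    """
    codecs = [c.strip() for c in current_value.split(",") if c.strip()]
    for codec in LZO_CODECS:
        if codec not in codecs:
            codecs.append(codec)
    return ",".join(codecs)


if __name__ == "__main__":
    # Example input only; use the value currently configured on your cluster.
    print(with_lzo_codecs("org.apache.hadoop.io.compress.DefaultCodec"))
```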

Oozie

  1. On each Oozie server, create a symlink in /var/lib/oozie that points to the Hadoop LZO JAR:
    • /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo-cdh4-0.4.15-gplextras.jar
  2. Restart Oozie.
  Note: The Oozie step is required whether or not you use parcels. The only difference is the location of the LZO JAR. An LZO JAR may already be present in /var/lib/oozie. Replacing any existing JAR with the parcel JAR (as described above) is strongly recommended.
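
The symlink step can be scripted for clusters with several Oozie servers. The sketch below is one possible approach, not a Cloudera-provided tool: the function name link_lzo_jar is hypothetical, it must run with permission to write to /var/lib/oozie, and it only replaces an existing entry that has the same file name as the parcel JAR. An older LZO JAR with a different name would still need to be removed by hand.

```python
import os

# JAR path and Oozie directory as given in the steps above.
PARCEL_JAR = (
    "/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/"
    "hadoop-lzo-cdh4-0.4.15-gplextras.jar"
)
OOZIE_DIR = "/var/lib/oozie"


def link_lzo_jar(jar_path=PARCEL_JAR, oozie_dir=OOZIE_DIR):
    """Symlink the LZO JAR into the Oozie directory and return the link path."""
    link_path = os.path.join(oozie_dir, os.path.basename(jar_path))
    # lexists() is also true for a dangling symlink left by an earlier install.
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(jar_path, link_path)
    return link_path


if __name__ == "__main__":
    print(link_lzo_jar())
```

Restart Oozie after the link is in place, as described in step 2.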

HBase

  • Restart HBase.

Impala (1.0 or later)

  • Restart Impala.

Hive

  • Restart the Hive server.

Sqoop

  1. Add the following entries to the Sqoop Service Environment Safety Valve:
    • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
    • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  2. Restart the Sqoop service.
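
The same two safety valve entries are used for both MapReduce and Sqoop, and both are built from the parcel directory. A minimal sketch of generating them follows. The function name lzo_env_lines and the parcels_root parameter are hypothetical; the default matches the /opt/cloudera/parcels path used throughout this section.

```python
import posixpath


def lzo_env_lines(parcels_root="/opt/cloudera/parcels"):
    """Return the two safety valve lines for a given parcel directory."""
    lib_dir = posixpath.join(parcels_root, "HADOOP_LZO", "lib", "hadoop", "lib")
    return [
        # The trailing /* puts every JAR in the directory on the classpath.
        "HADOOP_CLASSPATH=$HADOOP_CLASSPATH:" + lib_dir + "/*",
        # Native LZO libraries live in the "native" subdirectory.
        "JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:" + lib_dir + "/native",
    ]


if __name__ == "__main__":
    for line in lzo_env_lines():
        print(line)
```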

Note

Services that do not use LZO need no configuration. For example, if you are not using Sqoop, you do not need to change the Sqoop safety valve.