ApiHiveCloudReplicationArguments Data Model

Replication arguments for Hive services.

Properties
name data type description
sourceAccount string
destinationAccount string
cloudRootPath string
replicationOption ReplicationOption
Properties inherited from ApiHiveReplicationArguments
sourceService ApiServiceRef The service to replicate from.
tableFilters array of ApiHiveTable Filters for tables to include in the replication. Optional. If not provided, all tables in all databases are included.
exportDir string Directory where the export file will be saved. It is located in the HDFS service that stores the target Hive service's data. Optional. If not provided, Cloudera Manager will pick a directory for storing the data.
force boolean Whether to force overwriting of mismatched tables.
replicateData boolean Whether to replicate table data stored in HDFS.

If set, the "hdfsArguments" property must be set to configure the HDFS replication job.

hdfsArguments ApiHdfsReplicationArguments Arguments for the HDFS replication job.

This must be provided when choosing to replicate table data stored in HDFS. The "sourceService", "sourcePath" and "dryRun" properties of the HDFS arguments are ignored; their values are derived from the Hive replication's information.

The "destinationPath" property is used slightly differently than in regular HDFS replication jobs. It maps the root path of the source service into the target service. It may be omitted, in which case the source and target paths will match.

Example: if the destination path is set to "/new_root", a "/foo/bar" path in the source will be stored in "/new_root/foo/bar" in the target.

replicateImpalaMetadata boolean Whether to replicate the Impala metadata (i.e., the metadata for Impala UDFs and their corresponding binaries in HDFS).
runInvalidateMetadata boolean Whether to run the INVALIDATE METADATA query.
dryRun boolean Whether to perform a dry run. Defaults to false.
numThreads number Number of threads to use in the multi-threaded export/import phase.
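
The "destinationPath" mapping described under "hdfsArguments" can be sketched in a few lines of Python. This is only an illustration of the documented rule, not Cloudera Manager's implementation; the function name map_to_target is hypothetical, and paths are assumed to be absolute POSIX-style HDFS paths.

```python
import posixpath
from typing import Optional


def map_to_target(source_path: str, destination_path: Optional[str] = None) -> str:
    """Illustrative helper: map a source path under the destination root."""
    if not destination_path:
        # "destinationPath" omitted: source and target paths match.
        return source_path
    # Drop the leading slash so join() keeps the destination root.
    return posixpath.join(destination_path, source_path.lstrip("/"))


print(map_to_target("/foo/bar", "/new_root"))  # /new_root/foo/bar
print(map_to_target("/foo/bar"))               # /foo/bar
```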

Example

{
  "sourceAccount" : "...",
  "destinationAccount" : "...",
  "cloudRootPath" : "...",
  "replicationOption" : "KEEP_DATA_IN_CLOUD",
  "sourceService" : {
    "peerName" : "...",
    "clusterName" : "...",
    "serviceName" : "..."
  },
  "tableFilters" : [ {
    "database" : "...",
    "tableName" : "..."
  }, {
    "database" : "...",
    "tableName" : "..."
  } ],
  "exportDir" : "...",
  "force" : true,
  "replicateData" : true,
  "hdfsArguments" : {
    "sourceService" : {
      "peerName" : "...",
      "clusterName" : "...",
      "serviceName" : "..."
    },
    "sourcePath" : "...",
    "destinationPath" : "...",
    "mapreduceServiceName" : "...",
    "schedulerPoolName" : "...",
    "userName" : "...",
    "sourceUser" : "...",
    "numMaps" : 12345,
    "dryRun" : true,
    "bandwidthPerMap" : 12345,
    "abortOnError" : true,
    "removeMissingFiles" : true,
    "preserveReplicationCount" : true,
    "preserveBlockSize" : true,
    "preservePermissions" : true,
    "logPath" : "...",
    "skipChecksumChecks" : true,
    "skipListingChecksumChecks" : true,
    "skipTrash" : true,
    "replicationStrategy" : "STATIC",
    "preserveXAttrs" : true,
    "exclusionFilters" : [ "...", "..." ],
    "raiseSnapshotDiffFailures" : true
  },
  "replicateImpalaMetadata" : true,
  "runInvalidateMetadata" : true,
  "dryRun" : true,
  "numThreads" : 12345
}
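
Before submitting arguments like the ones above, a client may want to check the constraints stated in the property descriptions. The following Python sketch assumes the arguments are held in a plain dictionary; the helper check_args and the placeholder values are hypothetical and are not part of the Cloudera Manager API.

```python
from typing import Any, Dict, List

# "hdfsArguments" properties that are documented as ignored for Hive replication.
IGNORED_HDFS_KEYS = ("sourceService", "sourcePath", "dryRun")


def check_args(args: Dict[str, Any]) -> List[str]:
    """Return human-readable notes about an arguments dictionary (illustrative)."""
    notes = []
    hdfs = args.get("hdfsArguments")
    if args.get("replicateData") and hdfs is None:
        notes.append('"replicateData" is set but "hdfsArguments" is missing')
    for key in IGNORED_HDFS_KEYS:
        if hdfs and key in hdfs:
            notes.append(f'"hdfsArguments.{key}" is ignored for Hive replication')
    return notes


args = {
    # Placeholder values, chosen only for this example.
    "sourceService": {"clusterName": "cluster1", "serviceName": "hive1"},
    "replicateData": True,
    "dryRun": True,
}
print(check_args(args))  # reports the missing "hdfsArguments"

args["hdfsArguments"] = {"destinationPath": "/new_root"}
print(check_args(args))  # []
```

Returning notes rather than raising an exception lets the caller decide whether an ignored property is worth reporting or can be silently dropped.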