Hive Table Statistics
You can collect statistics on a Hive table by running the following commands from a Beeline client connected to HiveServer2:
analyze table <table name> compute statistics;
analyze table <table name> compute statistics for columns <all columns of a table>;
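When a table has many columns, writing the column list by hand is error-prone. The following is a minimal sketch, not part of the original procedure, that builds both statements from a table name and a list of column names. The function name build_analyze_statements and the table and column names are hypothetical; the generated statements would still need to be run from Beeline.

```python
def build_analyze_statements(table, columns):
    """Return the table-level and column-level ANALYZE statements as strings."""
    if not columns:
        raise ValueError("at least one column name is required")
    column_list = ", ".join(columns)
    return [
        f"analyze table {table} compute statistics;",
        f"analyze table {table} compute statistics for columns {column_list};",
    ]


# Hypothetical table and column names, used only for illustration.
for statement in build_analyze_statements("web_logs", ["ip", "url", "status"]):
    print(statement)
```

The sketch performs no quoting or validation of identifiers, so it assumes the names come from a trusted source such as the table's own schema.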
Configuring Hive to Store Statistics in MySQL
By default, Hive writes statistics to a Derby database backed by a file named /var/lib/hive/TempStatsStore. In production systems, however, Cloudera recommends that you store statistics in a dedicated external database. Storing Hive table statistics in PostgreSQL or Oracle is not supported. To configure Hive to store statistics in MySQL:
- Set up a MySQL server. For instructions on setting up MySQL, see Installing and Configuring a MySQL Database.
This database will be heavily loaded, so it should not be installed on the same host as anything critical such as the Hive Metastore Server, the database hosting the Hive Metastore, or Cloudera Manager Server. When collecting statistics on a large table and/or in a large cluster, this host may become slow or unresponsive.
- Create a statistics database in MySQL:
mysql> create database stats_db_name DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on stats_db_name.* TO 'stats_user'@'%' IDENTIFIED BY 'stats_password';
Query OK, 0 rows affected (0.00 sec)
- Add the following into the HiveServer2 Configuration Safety Valve for hive-site.xml:
<property>
  <name>hive.stats.dbclass</name>
  <value>jdbc:mysql</value>
</property>
<property>
  <name>hive.stats.jdbcdriver</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>hive.stats.dbconnectionstring</name>
  <value>jdbc:mysql://<stats_mysql_host>:3306/<stats_db_name>?useUnicode=true&amp;characterEncoding=UTF-8&amp;user=<stats_user>&amp;password=<stats_password></value>
</property>
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/share/java/mysql-connector-java.jar</value>
</property>
- Restart HiveServer2.
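Because the connection string contains ampersands, it has to be XML-escaped (each & written as &amp;) when it is pasted into hive-site.xml. The following is a rough sketch, not a Cloudera-provided tool, that generates the four property blocks from the steps above and lets Python's standard xml.etree.ElementTree module handle the escaping. The function name stats_properties and the sample host, database, user, and password values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def stats_properties(host, db, user, password, port=3306):
    """Return hive-site.xml <property> blocks for MySQL-backed statistics."""
    # Raw JDBC URL; ElementTree escapes each '&' as '&amp;' on output.
    url = (
        f"jdbc:mysql://{host}:{port}/{db}"
        "?useUnicode=true&characterEncoding=UTF-8"
        f"&user={user}&password={password}"
    )
    settings = {
        "hive.stats.dbclass": "jdbc:mysql",
        "hive.stats.jdbcdriver": "com.mysql.jdbc.Driver",
        "hive.stats.dbconnectionstring": url,
        "hive.aux.jars.path": "file:///usr/share/java/mysql-connector-java.jar",
    }
    blocks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        blocks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(blocks)


# Hypothetical values; substitute the ones chosen in the earlier steps.
print(stats_properties("stats-db.example.com", "stats_db_name",
                       "stats_user", "stats_password"))
```

The sketch does not URL-encode the user name or password, so it assumes neither contains characters that are special in a JDBC URL.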