Health
Cloudera Manager monitors the health of the services, roles, and hosts that are running in your clusters. This information is available in the following locations:
- Home page and tab, where various health results determine an overall health assessment of the service or role. The overall health of a role or service is a roll-up of its health tests: if any health test is Bad, the health of the service or role is Bad. If any health test is Concerning (but none are Bad), the health of the service or role is Concerning.
- Hosts tab, which shows summary results for the hosts.
- Status tab, which shows metrics for services, role instances, and hosts. These metrics are reflected in the results shown in the Health Tests panel when you have selected a service, role instance, or host.
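The roll-up rule described above amounts to taking the worst result among an entity's health tests. The following is a minimal sketch of that rule only, not Cloudera Manager code: the `Health` enum and the `roll_up` function are hypothetical names, only the three results named in this topic are modeled, and treating an entity with no test results as Good is an assumption made for the example.

```python
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical ordering of results, from best to worst."""
    GOOD = 0
    CONCERNING = 1
    BAD = 2


def roll_up(test_results):
    """Return the overall health: the worst individual test result.

    Assumption: an entity with no test results is reported as GOOD.
    """
    return max(test_results, default=Health.GOOD)


# One Concerning test and no Bad tests gives an overall result of Concerning.
overall = roll_up([Health.GOOD, Health.CONCERNING, Health.GOOD])
print(overall.name)
```

Ordering the results as integers keeps the rule to a single `max` call: any Bad test outranks any Concerning test, which in turn outranks Good.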
There are two types of health tests:
- Pass-fail tests - there are two types:
  - Compare a property to a yes-no value. For example, whether a service or role started as expected, a DataNode is connected to its NameNode, or a TaskTracker is (or is not) blacklisted.
  - Exercise a service lightly to confirm it is working and responsive. HDFS (NameNode), HBase, and ZooKeeper services perform these tests, which are referred to as "canary" tests.
- Metric tests - compare a property to a numeric value. For example, the number of file descriptors in use, the amount of disk space used or free, how much time was spent in garbage collection, or how many pages were swapped to disk in the previous 15 minutes. In these tests the property is compared to thresholds that determine whether the result is Good (for example, plenty of disk space available), Concerning (disk space getting low), or Bad (a critically low amount of disk space).
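A metric test of this kind can be pictured as a comparison against two thresholds. The sketch below uses the free disk space example; the function name, parameter names, and threshold values (20% and 10%) are illustrative assumptions, not Cloudera Manager defaults or property names.

```python
def classify_free_space(free_fraction, concerning_below=0.20, bad_below=0.10):
    """Classify a free-disk-space fraction (0.0 to 1.0) against two thresholds.

    The threshold values are hypothetical and chosen only for illustration.
    """
    if not 0.0 <= free_fraction <= 1.0:
        raise ValueError("free_fraction must be between 0.0 and 1.0")
    if free_fraction < bad_below:
        return "Bad"          # critically low amount of disk space
    if free_fraction < concerning_below:
        return "Concerning"   # disk space getting low
    return "Good"             # plenty of disk space available


for fraction in (0.50, 0.15, 0.05):
    print(f"{fraction:.0%} free -> {classify_free_space(fraction)}")
```

Checking the Bad threshold first means a value that crosses both thresholds is reported with the more severe result.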
By default most health tests are enabled and (if appropriate) configured with reasonable thresholds. You can modify threshold values by editing the monitoring properties under the entity's Configuration tab. You can also enable or disable individual or summary health tests, and in some cases specify what should be included in the calculation of overall health for the service, role instance, or host. See Configuring Monitoring Settings for more information.
For some health test results, you can also chart the associated metrics over a time range. See Viewing Service Status, Viewing Role Instance Status, and Host Details for more details.