Within the lifecycle of a cluster, the time will eventually come to decommission and delete some hosts from it. While Ambari Server can take care of this process within the UI, sometimes host entries are left behind on their way out of the Ambair database. This can lead to false alerts, log message warnings and false reporting metrics in Ambari Server from non-existent hosts. This increase in warnings and false reporting can also lead to poor performance from Ambari, so we will go over the tuning process for Ambari Server while we’re at it.
Example of such log warnings may look like the following;
WARN [alert-event-bus-1] AlertReceivedListener:530 - Unable to process alert yarn_nodemanager_health for an invalid service YARN and component NODEMANAGER on host myDeletedHost.cloud.local
So, how do you delete a host from Ambari that technically doesn’t exist? Glad you asked, here’s the process;
Fixing these issues will require the Ambari DB to have records altered. Please be sure to have a valid backup of your Ambari database before proceeding.
MariaDB [ambari]> select host_id, host_name from hosts where host_name='myDeletedHost.cloud.local'; +---------+--------------------------+ | host_id | host_name | +---------+--------------------------+ | 1002 | myDeletedHost.cloud.local| +---------+--------------------------+
delete from execution_command where task_id in (select task_id from host_role_command where host_id in (1002)); delete from host_version where host_id in (1002); delete from host_role_command where host_id in (1002); delete from serviceconfighosts where host_id in (1002); delete from hoststate where host_id in (1002); delete from kerberos_principal_host WHERE host_id='myDeletedHost.cloud.local'; delete from hosts where host_name in ('myDeletedHost.cloud.local'); delete from alert_current where history_id in ( select alert_id from alert_history where host_name in ('myDeletedHost.cloud.local'));
The order of operations is important here because the tables use Foreign Key Checks, to try and prevent orphaned entries. If you do you encounter issues where you cannot remove entries due to the Foreign Key checks, and you are CERTAIN you want to remove the entry anyway, you can disable them and re-enable them after your edits.
Turn off Foreign Key Checks:
SET foreign_key_checks = 0;
Turn on Foreign Key Checks:
SET foreign_key_checks = 1;
If you happen to continue having Ambari Server Performance alerts after removing the old host entries, you can try Tuning Ambari Server as recommended by Cloudera/Hortonworks. You will want to perform the following;
+-----------------------+-----------+-----------+ |# of Cluster Nodes |Xmx value |Xmn value | +-----------------------+-----------+-----------+ |100 - 400 |4 GB |2 GB | +-----------------------+-----------+-----------+ |400 - 800 |4 GB |2 GB | +-----------------------+-----------+-----------+ |800 - 1200 |8 GB |2 GB | +-----------------------+-----------+-----------+ |1200 - 1600 |16 GB |2.4 GB | +-----------------------+-----------+-----------+
In our experience for a small clusters of ~50 nodes, we have found that 4GB Xmx AND Xmn works best to keep Ambari happy.
Example for a 50 node cluster: 60 * 50 = 3600
server.ecCacheSize=3600
server.jdbc.connection-pool.acquisition-size=5 server.jdbc.connection-pool.max-age=0 server.jdbc.connection-pool.max-idle-time=14400 server.jdbc.connection-pool.max-idle-time-excess=0 server.jdbc.connection-pool.idle-test-interval=7200
set wait_timeout 28800; set interactive_timeout 28800; set global max_connections 128;
The Ambari Server properties “server.jdbc.connection-pool.max-idle-time” and “server.jdbc.connection-pool.idle-test-interval” must be lower than “wait_timeout” and “interactive_timeout” respectively. The settings in step 3 reflect this.
# ambari-server db-purge-history --cluster-name CLUSTER_NAME --from-date 2020-12-01 Using python /usr/bin/python Purge database history... Ambari Server configured for Embedded Postgres. Confirm you have made a backup of the Ambari Server database [y/n] y Ambari server is using db type Embedded Postgres. Cleanable database entries older than 2020-12-01 will be purged. Proceed [y/n] y Purging historical data from the database ... Purging historical data completed. Check the ambari-server.log for details. Ambari Server 'db-purge-history' completed successfully.
Restart Ambari Server and you should now be free of performance issues! If you found this article helpful, you can check out our other blog posts here.
Happy Hadooping!
Written by Ryan St. Louis