NHC: Node Health Check
Abstract
Why NHC?
Currently no standard; most sites use custom, home-grown scripts
Often site-specific
Usually lacking portability
Unreliable execution, reporting, parent performance
Need a simple, robust framework that is easy to understand and apply
What NHC does
Simply: Prevents jobs from running on unhealthy nodes