Monitoring Puppet – Part 2

As mentioned in one of my previous posts, there are further ways to observe the health of a Puppet agent. This post shows how I monitor an agent using its state file, /var/lib/puppet/state/state.yaml.

Every time the Puppet agent runs, it regenerates this file. Checking the age of the file therefore gives a hint as to whether the agent has run within a certain time window. The check uses the check_file_age plugin, called through NRPE.
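To make the idea concrete, here is a minimal Python sketch of what "checking the age of the file" means: the current time minus the file's modification time. This is not how the plugin itself is implemented; the helper name file_age_seconds is my own invention for illustration.

```python
import os
import time


def file_age_seconds(path, now=None):
    """Return how many seconds ago `path` was last modified (hypothetical helper)."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime


if __name__ == "__main__":
    # Path taken from the post; adjust it if your agent keeps its state elsewhere.
    state_file = "/var/lib/puppet/state/state.yaml"
    try:
        age = file_age_seconds(state_file)
        print(f"{state_file} was last written {age:.0f} seconds ago")
    except OSError as err:
        # Missing file or insufficient permissions: the age cannot be determined.
        print(f"cannot read {state_file}: {err}")
```

Running it on an agent host should print an age well below the run interval if the agent is healthy.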

The help screen gives all relevant information for configuring the plugin:

[root@somehost plugins]# ./check_file_age --help
check_file_age v1.4.15 (nagios-plugins 1.4.15)
  check_file_age [-w <secs>] [-c <secs>] [-W <size>] [-C <size>] -f <file>
  check_file_age [-h | --help]
  check_file_age [-V | --version]

  <secs>  File must be no more than this many seconds old (default: warn 240 secs, crit 600)
  <size>  File must be at least this many bytes long (default: crit 0 bytes)
[root@somehost plugins]#

I wanted a warning once the agent had not run for more than 35 minutes (2100 seconds) and a critical alert once the last run was more than 65 minutes (3900 seconds) ago. To achieve this, I added this line to the NRPE configuration file nrpe.cfg:

command[check_puppet_client_state_file]=/usr/lib64/nagios/plugins/check_file_age -w 2100 -c 3900 -f /var/lib/puppet/state/state.yaml
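The thresholds in that line can be sketched in Python as follows. This is only an illustration of the decision, not the plugin's source: the helper classify_age is hypothetical, and the strict "greater than" comparison is my assumption about how the limits are applied. The exit codes 0, 1 and 2 are the usual Nagios plugin convention for OK, WARNING and CRITICAL. The values were presumably picked relative to Puppet's default 30-minute run interval, so that one missed run warns and two missed runs go critical.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL = 0, 1, 2

WARN_SECS = 35 * 60  # 2100, the value passed with -w
CRIT_SECS = 65 * 60  # 3900, the value passed with -c


def classify_age(age_secs, warn=WARN_SECS, crit=CRIT_SECS):
    """Map a file age in seconds to a Nagios state (hypothetical helper)."""
    # Assumption: a limit is only exceeded when the age is strictly greater.
    if age_secs > crit:
        return CRITICAL
    if age_secs > warn:
        return WARNING
    return OK


if __name__ == "__main__":
    names = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}
    for minutes in (10, 40, 70):
        print(f"{minutes} min -> {names[classify_age(minutes * 60)]}")
```

If your agents use a different run interval, scale both thresholds accordingly.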

This check is then attached to every agent through an exported resource (which requires stored configurations) in the Puppet manifest:

   # Exported resource: declared on each agent, collected on the Nagios server
   @@nagios_service { "check_puppet_client_state_file_${hostname}":
      use                 => "generic-service",
      host_name           => $fqdn,
      service_description => "${prefix}Puppet client state file",
      check_command       => "check_nrpe!check_puppet_client_state_file",
   }

Beware that this check only shows whether the agent has run. It does not detect errors that occur during a run. For detecting errors, you may want to have a look at another post of mine.

Related topics

I have several other posts on monitoring Puppet that might be interesting:

  1. Monitoring Puppet – Part 1 shows how to keep the clients alive and how to check proper execution using the master’s /var/log/puppet/masterhttp.log file.
  2. Monitoring Puppet – Part 3 describes how to scan syslog for Puppet errors.
