The Usual Tech Ramblings

Nagios ping error

Stumbled on an odd one today. I was adding some new hosts to our Nagios server, and while I was looking at the interface, one of the servers kept popping up this error:

PING WARNING - System call sent warnings to stderr System call sent warnings to stderr

So I checked the output by running the same command the script was executing:

./libexec/check_ping -H {server} -w 100.0,20% -c 500,50% -p 5

And sure enough, the same error came up. A little baffled, I did some googling and stumbled on a post from somebody having a similar issue with a different plugin. Skimming the article, I noticed they had done some debugging with strace. Not sure why I didn’t think of doing that before, so I did:

strace ./libexec/check_ping -H {server} -w 100.0,20% \
  -c 500,50% -p 5

Sure enough, there was that same error message, buried deep in the strace output. However, something else caught my eye:

read(5, "Warning: time of day goes back ("..., 4096) = 130

Bingo… So I checked, without using the Nagios plugins.

ping {host}

And sure enough, there was that error message again:

Warning: time of day goes back (-282us), taking countermeasures.
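That lines up with the plugin's message: ping itself succeeds, but it writes a warning to stderr, and the check appears to report a warning whenever that happens. Here is a minimal Python sketch of that idea. It is not check_ping's actual code; `run_and_classify` and the fake ping command are hypothetical names for illustration, and mapping any non-zero exit code to CRITICAL is a simplifying assumption.

```python
import subprocess
import sys

# Nagios plugin exit codes (standard convention: 0 OK, 1 WARNING, 2 CRITICAL).
OK, WARNING, CRITICAL = 0, 1, 2


def run_and_classify(cmd):
    """Run cmd and return (state, message) in the style of a Nagios check.

    Assumption for this sketch: a non-zero exit code is CRITICAL, and any
    output on stderr turns an otherwise clean run into a WARNING.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        return CRITICAL, "command failed with exit code %d" % proc.returncode
    stderr_text = proc.stderr.strip()
    if stderr_text:
        # Report only the first stderr line to keep the status short.
        return WARNING, "stderr: " + stderr_text.splitlines()[0]
    return OK, "no warnings on stderr"


if __name__ == "__main__":
    # Hypothetical stand-in for ping: exits cleanly but warns on stderr.
    fake_ping = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write("
        "'Warning: time of day goes back (-282us), taking countermeasures.\\n')",
    ]
    state, message = run_and_classify(fake_ping)
    print(state, message)
```

The point is only that the command's exit status can be fine while stray stderr output still changes the reported state, which would explain why the host was flagged without any packet loss.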

Then I remembered: this is the machine that was having issues with clock sync because it’s running as a VMware guest, and I had set up ntpdate to run every few minutes. Ping times each round trip against the local clock, so the clock on the Windows server it is pinging shouldn’t come into it. More likely the guest’s own clock sporadically gets stepped backwards between a packet going out and its reply coming in, so the reply appears to arrive before the request was sent. As it appears to only occur once every 15-20 pings or so, I’m not that worried. It’s not triggering alarms; it just flags one of the ping result sets as “warning” and goes away on the next ping.
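The arithmetic behind a negative reading can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ping's code: `measured_rtt_us` and `monotonic_elapsed_us` are hypothetical helpers, and the 300us round trip and 582us backwards step are made-up numbers chosen only because they reproduce the -282us seen above.

```python
import time


def measured_rtt_us(true_rtt_us, clock_step_us):
    """Round trip as seen by a wall clock that was stepped mid-flight.

    true_rtt_us   -- how long the round trip really took, in microseconds
    clock_step_us -- how far the clock was adjusted while the packet was
                     in flight (negative means it was set backwards)
    """
    send_us = 1_000_000_000  # arbitrary wall-clock reading at send time
    recv_us = send_us + true_rtt_us + clock_step_us
    return recv_us - send_us


def monotonic_elapsed_us(action):
    """Time action() with a monotonic clock, which never runs backwards."""
    start = time.monotonic()
    action()
    return round((time.monotonic() - start) * 1_000_000)


if __name__ == "__main__":
    print(measured_rtt_us(300, 0))     # 300: clock untouched
    print(measured_rtt_us(300, -582))  # -282: clock stepped back mid-flight
    print(monotonic_elapsed_us(lambda: time.sleep(0.001)) >= 0)  # True
```

A monotonic clock sidesteps the problem for interval timing, which is why a backwards step only shows up in tools that measure with wall-clock time.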