13.07.2015 Views

Linux System Administration Recipes A Problem-Solution Approach


CHAPTER 3 ■ MONITORING AND UPDATING

The generic service definition is useful again here. I’ll show how to set up the default so that any service that has errors sends an e-mail. In the conf.d/generic-service_nagios2.cfg file, add the following to the service definition:

    notification_interval    1440
    is_volatile              0
    check_period             24x7
    normal_check_interval    5
    retry_check_interval     1
    max_check_attempts       10
    notification_period      24x7
    notification_options     c,r
    contact_groups           admins

notification_interval defines how often you get reminded, in minutes; 1440 minutes means every 24 hours. check_period defines when the service is expected to run; here, it’s all the time. Time periods such as 24x7 are defined in conf.d/timeperiods_nagios2.cfg and can be edited or added to if they don’t meet your needs. normal_check_interval and retry_check_interval are also in minutes. The service here is set to be checked every five minutes, but if an answer isn’t forthcoming and a retry is made, the retry happens every minute. Ten check attempts will be made before Nagios concludes that there’s something wrong with the service, and again, you can change this number.

notification_period sets when alerts should be sent (all the time), and notification_options sets which events trigger an alert. For hosts, d = notify on down states, u = notify on unreachable states, r = notify on host recoveries, and f = notify when the host starts and stops flapping. For services, w = notify on warning states, u = unknown states, c = critical states, r = recovery, and f = start/stop of flapping. Finally, contact_groups defines who to contact when a notification is required.

■ Note Flapping refers to when a service or host changes state too frequently, resulting in a large number of problem/recovery alerts. This can indicate config problems or real network problems.

Once you have all that in place, reload Nagios, and then try turning off SSH on your test client.
You should receive a message at the address you set in the contacts file, telling you that the client is not SSH accessible. Turn it back on, and you’ll get another alert telling you it’s OK again.

■ Note The default From line in the e-mail alerts is the nagios user; this may not be good if you have a mail server that wants a registered address before it will send. If you’re using exim4, you need to set the “untrusted user” option and then add the following to the end of the host-notify-by-email and notify-by-email commands in commands.cfg:

    -- -f address@example.com
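
As a quick way to sanity-check the service directives and the notification_options letters described earlier, here is a minimal Python sketch. It is not part of Nagios: the helper names (parse_directives, decode_options) and the lookup tables are hypothetical, and the parser assumes one "name value" pair per line, as in the listing from the text. The letter meanings are taken directly from the explanation above.

```python
# Illustrative only: these helpers are hypothetical, not a Nagios API.
# The option letters and their meanings come from the text above.
HOST_OPTIONS = {
    "d": "down",
    "u": "unreachable",
    "r": "recovery",
    "f": "flapping start/stop",
}
SERVICE_OPTIONS = {
    "w": "warning",
    "u": "unknown",
    "c": "critical",
    "r": "recovery",
    "f": "flapping start/stop",
}

# The directives from the text, one "name value" pair per line.
SERVICE_DEFINITION = """
notification_interval 1440
is_volatile 0
check_period 24x7
normal_check_interval 5
retry_check_interval 1
max_check_attempts 10
notification_period 24x7
notification_options c,r
contact_groups admins
"""


def parse_directives(text):
    """Turn 'name value' lines into a dict, skipping blank lines."""
    directives = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            directives[parts[0]] = parts[1].strip()
    return directives


def decode_options(value, table):
    """Expand a comma-separated option string such as 'c,r'."""
    return [table[letter.strip()] for letter in value.split(",")]


settings = parse_directives(SERVICE_DEFINITION)
alerts = decode_options(settings["notification_options"], SERVICE_OPTIONS)
# notification_interval is in minutes, so divide by 60 for hours.
reminder_hours = int(settings["notification_interval"]) / 60

print("Alert on:", ", ".join(alerts))
print("Reminder every", reminder_hours, "hours")
```

With the values from the text, c,r decodes to critical and recovery alerts, and the 1440-minute interval works out to a reminder every 24 hours. Passing HOST_OPTIONS instead of SERVICE_OPTIONS decodes a host definition's letters the same way.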
