Nagios: System and Network Monitoring shows how to configure and use Nagios, an open source system and network monitoring tool. Nagios makes it possible to continuously monitor network services (SMTP, POP3, HTTP, NNTP, PING, etc.), host resources (processor load, disk and memory usage, running processes, log files, etc.), and environmental factors (such as temperature). When Nagios detects a problem, it notifies the system administrator via email, pager, SMS, or another user-defined method; current status information, historical logs, and reports can also be accessed via a web browser. Nagios: System and Network Monitoring covers the Nagios core, all standard Nagios plug-ins, and selected third-party plug-ins, and shows readers how to write their own plug-ins. The book covers Nagios 2.0 and remains applicable to earlier versions.
Cover
Contents
Introduction
About This Book
Note of Thanks
1 Installation
1.1 Compiling the Source Code
1.2 Installing and Testing Plugins
1.3 Configuration of the Web Interface
2 Nagios Configuration
2.1 The Main Configuration File nagios.cfg
2.2 Objects: an Overview
2.3 Defining the Machines to Be Monitored, with
2.4 Grouping Computers Together with hostgroup
2.5 Defining Services to Be Monitored with service
2.7 Defining Addressees for Error Messages:
2.8 The Message Recipient: contactgroup
2.9 When Nagios Needs to Do Something: the
2.11 Templates
2.12 Configuration Aids for Those Too Lazy to Type
2.13 CGI Configuration in cgi.cfg
2.14 The Resources File resource.cfg
3.1 Checking the Configuration
3.2 Getting Monitoring Started
3.3 Overview of the Web Interface
4 Nagios Basics
4.1 Taking into Account the Network Topology
4.3 States of Hosts and Services
5 Service Checks and How They Are Performed
5.1 Testing Network Services Directly
5.3 The Nagios Remote Plugin Executor
5.4 Monitoring via SNMP
5.5 The Nagios Service Check Acceptor
6 Plugins for Network Services
6.1 Standard Options
6.2 Reachability Test with Ping
6.3 Monitoring Mail Servers
6.4 Monitoring FTP and Web Servers
6.5 Domain Name Server under Control
6.6 Querying the Secure Shell Server
6.7 Generic Network Plugins
6.8 Monitoring Databases
6.9 Monitoring LDAP Directory Services
6.10 Checking a DHCP Server
6.11 Monitoring UPS with the Network UPS Tools
7 Testing Local Resources
7.1 Free Hard Drive Capacity
7.2 Utilization of the Swap Space
7.3 Testing the System Load
7.4 Monitoring Processes
7.5 Checking Log Files
7.6 Keeping Tabs on the Number of Logged-in Users
7.7 Checking the System Time
7.8 Regularly Checking the Status of the Mail Queue
7.9 Keeping an Eye on the Modification Date of a File
7.10 Monitoring UPSs with apcupsd
7.11 Nagios Monitors Itself
7.12 Hardware Checks with LM Sensors
7.13 The Dummy Plugin for Tests
8.1 Negating Plugin Results
8.2 Inserting Hyperlinks with urlize
9 Executing Plugins via SSH
9.1 The check_by_ssh Plugin
9.2 Configuring SSH
9.3 Nagios Configuration
10 The Nagios Remote Plugin Executor (NRPE)
10.1 Installation
10.2 Starting via the inet Daemon
10.3 NRPE Configuration on the Computer to Be Monitored
10.4 Nagios Configuration
10.5 Indirect Checks
11 Collecting Information Relevant for Monitoring with SNMP
11.1 Introduction to SNMP
11.2 NET-SNMP
11.3 Nagios’s Own SNMP Plugins
11.4 Other SNMP-based Plugins
12 The Nagios Notification System
12.1 Who Should be Informed of What, When?
12.3 The Message Filter
12.4 External Notification Programs
12.5 Escalation Management
12.6 Dependences between Hosts and Services as a Filter Criterion
13 Passive Tests with the External Command File
13.1 The Interface for External Commands
13.2 Passive Service Checks
13.3 Passive Host Checks
13.4 Reacting to Out-of-Date Information of Passive Checks
14 The Nagios Service Check Acceptor (NSCA)
14.1 Installation
14.2 Configuring the Nagios Server
14.3 Client-side Configuration
14.4 Sending Test Results to the Server
14.5 Application Example I: Integrating syslog and Nagios
14.6 Application Example II: Processing SNMP Traps
15 Distributed Monitoring
15.1 Switching On the OCSP/OCHP Mechanism
15.2 Defining OCSP/OCHP Commands
15.3 Practical Scenarios
16 The Web Interface
16.1 Recognizing and Acting On Problems
16.3 Planning Downtimes
16.4 Additional Information on Hosts and Services
16.5 Configuration Changes through the Web Interfaces: the Restart Problem
17 Graphic Display of Performance Data
17.1 Processing Plugin Performance Data with Nagios
17.2 Graphs for the Web with Nagiosgraph
17.3 Preparing Performance Data for Evaluation with Perf2rrd
17.4 The Graphics Specialist drraw
17.5 Automated to a Large Extent: NagiosGrapher
17.6 Other Tools and the Limits of Graphic Evaluation
18 Monitoring Windows Servers
18.1 NSClient and NC_Net
18.2 NRPE for Windows: NRPE_NT
19 Monitoring Room Temperature and Humidity
19.1 Sensors and Software
19.2 The Nagios Plugin check_pcmeasure
20 Monitoring SAP Systems
20.1 Checking without a Login: sapinfo
20.2 Monitoring with SAP’s Own Monitoring System (CCMS)
Appendix A Rapidly Alternating States: Flapping
A.1 Flap Detection with Services
A.2 Flap Detection for Hosts
Appendix B Event Handlers
B.1 Execution Times for the Event Handler
B.3 The Handler Script
B.4 Things to Note When Using Event Handlers
Appendix C Writing Your Own Plugins: Monitoring Oracle with the Instant Client
C.1 Installing the Oracle Instant Client
C.3 A Wrapper Plugin for sqlplus
Appendix D An Overview of the Nagios Configuration Parameters
D.1 The Main Configuration File nagios.cfg
D.2 CGI Configuration in cgi.cfg
Index