Nagios Dependencies and Parents

Nagios 2.2 was released today, so I decided to update my Nagios install at home, which monitors my company’s server as well as Ken’s company server. I currently use Nagios to monitor ping, smtp, imap, pop, httpd, dns, and ftp. Just the basics. Once I get some free time, I am going to set up NRPE so I can monitor disk space, load, swap, etc. Since I had to update my install of Nagios anyway (just run the same ./configure script I ran when installing it, then make all, make install, make install-commandmode), I decided to finally figure out how to cut down the number of notifications I receive on a day when my home internet goes down.

What I wanted: Nagios to send an SMS when any service on my server was unavailable, or when the server itself was down.

What I did not want: Nagios to send me an SMS for every service when my local connection was down…after my local connection went back up. Not only was I getting messages that my local connection went down (after it went back up), I was also getting a text message for each service I was monitoring saying it was down, then recovered, when really only my local connection had gone down. Not good.

At first I thought I needed to use “host dependencies” in Nagios. The way I looked at it, my local router was dependent on the Comcast router, my server was dependent on my hosting company’s router, etc. Dependencies are helpful for more complicated setups (web servers that depend on database servers, etc.), but they turned out not to be what I needed here.

What I really needed, and finally figured out, was to define “parents” for each of my hosts. After doing some research about dependencies and parents, I realized that if I set up parents, I could finally get the outage notifications and reporting I was looking for. For my setup, I have five hosts:

  1. My remote server
  2. Ken’s remote server
  3. Our hosting company’s router
  4. Comcast’s router
  5. My local router

  • My server has a parent: the hosting company’s router
  • Ken’s server has a parent: the hosting company’s router
  • The hosting company’s router has a parent: Comcast’s router
  • Comcast’s router has a parent: my local router
  • My local router does not have any parents because it is on the local network
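
To see why this cuts down the noise, here is a rough sketch in Python of how I understand the parent logic. It is a simplified model, not Nagios’ actual code or configuration: the host names are made-up labels for the five hosts above, and the rule it assumes is that a host which fails its check counts as DOWN if it has no parents or at least one parent is UP, and as UNREACHABLE if every parent is itself down or unreachable.

```python
# Simplified model of parent-based reachability. This is an illustration,
# not Nagios' own code. Host names are hypothetical labels for my five hosts.
PARENTS = {
    "local-router": [],                   # on the local network, no parents
    "comcast-router": ["local-router"],
    "hosting-router": ["comcast-router"],
    "my-server": ["hosting-router"],
    "kens-server": ["hosting-router"],
}


def classify(host, failing, parents=PARENTS):
    """Return 'UP', 'DOWN', or 'UNREACHABLE' for one host.

    `failing` is the set of hosts whose checks are currently failing.
    """
    if host not in failing:
        return "UP"
    host_parents = parents[host]
    # A failing host with no parents has nothing upstream to blame.
    if not host_parents:
        return "DOWN"
    # If any parent is UP, the path to this host works, so the host is DOWN.
    if any(classify(p, failing, parents) == "UP" for p in host_parents):
        return "DOWN"
    # Every parent is DOWN or UNREACHABLE, so the host is only UNREACHABLE.
    return "UNREACHABLE"


def statuses(failing):
    """Classify every host in PARENTS for a given set of failing checks."""
    return {host: classify(host, failing) for host in PARENTS}


if __name__ == "__main__":
    # My home connection drops: everything past my local router fails.
    outage = {"comcast-router", "hosting-router", "my-server", "kens-server"}
    for host, state in statuses(outage).items():
        print(f"{host}: {state}")
```

In the home-outage case only the Comcast router comes out as DOWN; the hosting company’s router and both servers come out as UNREACHABLE. If I only ask to be notified about DOWN hosts, that should be one text message instead of a pile of them.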

So far, so good.

Helpful resources

Nagios 2.0 Stable

Just after I got Nagios 1.2 up and running nicely (it saved me over the last few days during the server hardware failure), I got an email letting me know Nagios 2.0 stable has been released. I have been looking forward to Nagios 2.0 for a while now, mostly because they simplified the conf files tremendously. Lots of great improvements and goodies in version 2.0. Each time I install this, it gets a little less painful…but I have yet to have a “quick” install.

Nagios Updated

Hi. I spent my exciting day with my old friend Nagios. Due to a few issues, I had to move my Nagios install to a new server, and while I had the time to move it, I also decided it was time to tweak it some. I wanted to point out that Nagios was updated to version 1.3 (change log). No major changes, but enough to upgrade.
Missed my Nagios articles?

P.S. The upgrade was a piece of cake. No issues (for once).

System Monitoring with Nagios – Part 4 of 4

I have decided to write a four part article on the benefits of using Nagios. The first article focused on why I chose Nagios/what it offers. The second article focused on installing Nagios on Mac OS X. The third article focused on configuring Nagios. This fourth article will focus on improving the Nagios interface and further customizing it.

Customizing Nagios…the fun continues.
Run Nagios at Boot with an Init Script: There is a great article on how to create a StartUpItem for Mac OS X. Scroll all the way to the bottom and follow the instructions.

Adding Icons: If you know me, you also know I need to work with a good interface. Nagios…leaves something to be desired, but you can do some little things to make it look better, like adding icons. Icons? Yeah. I use blue Apple logos for all my Mac OS X Client machines, grey Apple logos for my Mac OS X Server machines, Cisco icons for my networking equipment, HP printer icons for my printers, etc. How?

  • Download icons, create your own, or use the icons that come with Nagios. Make sure the icons are located in /usr/local/nagios/images/logos
  • Create a config file: /usr/local/nagios/etc/hostextinfo.cfg (see my working example in article 3). This config file allows you to attach an image, url, notes, etc. to each host.
  • Uncomment the extended host information line in the cgi.cfg file – around line 275. (xedtemplate_config_file=/usr/local/nagios/etc/hostextinfo.cfg)

Changing the look of the web interface: When I first saw the “stylesheets” folder in /usr/local/nagios/share I got excited. After taking a look at the stylesheets, I became less than excited. Hundreds of “font-family”, “color”, etc. styles are…well, not what I expected. The good news: a few good “find and replace” statements and you are set. I recommend doing the “find and replace” across multiple files all at once. Hopefully in version 2 of Nagios, they will go to using one stylesheet that controls everything. Please? 😉

System Monitoring with Nagios – Part 3 of 4

I have decided to write a four part article on the benefits of using Nagios. The first article focused on why I chose Nagios/what it offers. The second article focused on installing Nagios on Mac OS X. This third article focuses on configuring Nagios. The fourth article will focus on improving the Nagios interface and further customizing it.

Configuring Nagios…let the fun begin! With version 1.2 of Nagios, there are multiple files by default located in /usr/local/nagios/etc/:

  • cgi.cfg This file is used to define the settings for the Nagios CGIs. All of your basic CGI paths, authentication, and commands are in this file.
  • checkcommands.cfg This file is used to define some basic commands you can use to check your systems, such as “check smtp”, “check ftp”, etc.
  • contactgroups.cfg This file is used to define information about contact groups. Ex. If you have multiple admins for your network (network admin, web admin, database admin), you can define which admins are in certain groups. (Nice to set up so that the web admins are not notified when the file server goes down, and the network admins are not notified when the web servers go down – assuming the network admin has no responsibility for the web servers.)
  • contacts.cfg This file is used to define the contact information for the admins that need to be contacted if/when their servers go down. You can define pager numbers/email addresses/IM accounts, etc. in this file.
  • dependencies.cfg This file is used to define information on your network’s dependencies. You can define things in a way so that Nagios knows that your web server’s availability is dependent on your database server, and your database server is dependent on your local router, and your local router is dependent on your ISP’s router, etc. You can define both host dependencies and service dependencies here.
  • escalations.cfg This file is used to define information on escalating a known problem to other people/groups. This is handy if you have multiple tiers of IT staff, or if you have a small IT staff and want to make sure someone is notified of the problem. (Ex. The network admin is notified the mail server is down, but does not do anything in x minutes. After x minutes, a page, email, IM, etc. can then go to someone else or another group of people.)
  • hostgroups.cfg This file is used to define all the host groups. Ex. You can group all of your database servers, web servers, printers, network equipment, etc.
  • hosts.cfg This file is used to define all your hosts that you want to monitor. Each host will have an entry with the IP, host name, etc.
  • misccommands.cfg This file is used to define all the commands that will notify the admins (notify by pager, email, etc.).
  • nagios.cfg This file is used to define the main configuration information.
  • resource.cfg
  • services.cfg This file is used to define the actual servers and services you want to monitor. If you monitor several services (http, smtp, smb, etc) on one host, each service will have a listing.
  • timeperiods.cfg This file is used to define time periods. You can have varying time periods for different services/hosts. Ex. You can set up a “24×7” time period for all your high availability servers, but use a “work hours” time period for monitoring/reporting failures for something like your printers.
  • I also added icons and clickable URLs in another config file called hostextinfo.cfg

All these files…but what do I do with them? How about some working examples of the cfg files that were changed!

System Monitoring with Nagios – Part 2 of 4

I have decided to write a four part article on the benefits of using Nagios. The first article focused on why I chose Nagios/what it offers. This second article will focus on installing Nagios on Mac OS X. The third article will focus on configuring Nagios. The fourth article will focus on improving the Nagios interface and further customizing it.

Downloading Nagios 1.2 (and other files)

At the time of this article, the stable release of Nagios is 1.2. There is a beta out for version 2 (2.0b4), but I decided to use the stable release. There are two files you will need to download to properly install Nagios: the core distribution and the plugins. I would also recommend looking at the Nagios Exchange for extras, and of course xicons for some good looking replacement icons.

Installing Nagios 1.2

I used a few good web site tutorials on how to install Nagios, so no need to rewrite it. I will pass on the resources I used and, of course, make some random comments.

My random comments

  • After the install, you are left with several conf files that need to be renamed. Save yourself some time by using sed to rename the multiple files all at once:
    for i in *cfg-sample; do mv "$i" "$(echo "$i" | sed -e 's/cfg-sample$/cfg/')"; done
  • Do not forget about installing the plugins. Make sure you install them after you build and install the core distribution.
  • Consider using SSL on the server running Nagios so your password is not sent in the clear
  • Be prepared to spend a good amount of time on configuring the many conf files

In part 3 of 4 of my Nagios articles, I will go over my conf files and try to explain as much as possible so your Nagios configuration will go a little more smoothly than mine did my first time around.

System Monitoring with Nagios – Part 1 of 4

I have decided to write a four part article on the benefits of using Nagios. This first article will focus on why I chose Nagios/what it offers. The second article will focus on installing Nagios on Mac OS X. The third article will focus on configuring Nagios. The fourth article will focus on improving the Nagios interface and further customizing it.

If you manage a network with multiple servers or perhaps even just one server that runs multiple services (HTTP, SMTP, SMB, AFP, FTP, etc) and are looking for a network monitoring utility, look no further. Coming from a Mac OS background, I have used my share of monitoring utilities, and I am most impressed with Nagios.
In summary, Nagios monitors services or servers for failures/warnings so you, the Sys Admin, can take care of any problems as soon as they arise. You can set up Nagios to email you, page you, IM you, etc as soon as a problem is found with any of your servers.

Why Nagios?

  • Nagios is open source and O’Reilly ranks it as the #2 open source package for System Administrators
  • Nagios has a web interface. Regardless of your location, you can always check your network’s health as long as you have access to a web browser.
  • Not only can you monitor network services (HTTP, FTP, etc.), you can also monitor host resources (disk and memory usage, processes, log files, etc.). Nagios can monitor environmental factors too (temperature).
  • Reporting. You can easily create reports on trends, availability, alerts, notifications via the web interface
  • Plugins. You can easily develop your own host and service checks if Nagios does not have exactly what you need
  • Schedule downtime. We all have to upgrade our servers or restart them at some point. Nagios allows you to easily define “downtime” so you are not notified during scheduled maintenance.
  • You can use the web interface to acknowledge any problems (so you can stop getting notified over and over again until the problem is resolved).
  • Redundant and failover network monitoring. Great, you monitor your network and servers from within your own network, but what happens when that network goes down? Multiple installs (master and slave) of Nagios can be configured to communicate with each other so if one network cannot be contacted, the other Nagios install will take over.

The list goes on and on. Check back for part 2…the install.