LinuxQuestions.org
Old 01-21-2008, 06:46 PM   #1
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Rep: Reputation: 15
Anyone using Nagios?


If so, I would like to know: if you are monitoring a switch with, let's say, 50 interfaces (and you are monitoring the interfaces as well), what happens if you shut the switch down for a certain period of time? Would you get only one notification that the switch is down, or would you get 51 notifications: one that the switch is down, plus one for each interface being down?

Thank You
 
Old 01-22-2008, 07:25 AM   #2
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
It depends on how you have Nagios set up; it could go either way. The way I configure it, I would get a "host down" for the switch itself and then "host unreachable" for all the interfaces, with the switch itself set as the parent of the interfaces. You could, however, define each interface as an independent host, in which case you'd get 51 down messages. Or, if you disable the unreachable messages or play with the dependencies.cfg file, you could get just one down message for the switch itself.

You'd have to post at least the hosts.cfg and services.cfg files for me to give a real answer, and probably dependencies.cfg for me to answer with certainty.

Like most Linux apps, Nagios can do damn near anything; it is all in how you set it up.
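
A minimal sketch of the parent setup described above, assuming each interface is modelled as its own host. The host names, the 192.0.2.x addresses and the generic-switch template are placeholders for illustration, not taken from anyone's real config:

```
# Hypothetical names and addresses, for illustration only.
define host{
    use                   generic-switch      ; assumed template
    host_name             core-switch         ; the switch itself
    alias                 Core Switch
    address               192.0.2.1
    }

define host{
    use                   generic-switch
    host_name             core-switch-port1   ; one interface modelled as its own host
    alias                 Core Switch Port 1
    address               192.0.2.2
    parents               core-switch         ; shows as UNREACHABLE rather than DOWN while the parent is down
    notification_options  d,r                 ; leaving out "u" drops the unreachable notifications
    }
```

With notification_options left at d,u,r on the child hosts you would presumably still get one "unreachable" message per interface; with d,r only the switch's own "down" message should come through.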

Peace,
JimBass
 
Old 01-22-2008, 03:56 PM   #3
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
Thanks for your reply,

I have set up the switch as a host, using its management VLAN IP as the address, and each interface as a service, e.g.:

define host{
    use         generic-switch
    host_name   Sydswcore01
    alias       Sydney Core Switch 01
    address     10.10.1.253
    hostgroups  switches
    }


define service{
    use                 generic-service
    host_name           Sydswcore01
    service_description Gi2/1 IP:10.10.1.254 WAN-01 Link Status
    check_command       check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
    }

I couldn't find the files you mentioned above. I can't remember seeing them before; my version is probably different. I have files like switch.cfg, windows.cfg, commands.cfg, etc.

I guess I could test and see what happens, but I wanted to avoid getting 50 emails for the same device...
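
For reference, the "play with the dependencies" idea from the earlier reply could look roughly like the sketch below. It assumes a service called PING is also defined on Sydswcore01; that service name is hypothetical and does not appear in the config above. As far as I know, Nagios also holds back service notifications on its own while the service's host is DOWN or UNREACHABLE, so with interfaces defined as services this may turn out to be unnecessary; testing is the only way to be sure.

```
# Assumes a "PING" service exists on Sydswcore01 (hypothetical).
define servicedependency{
    host_name                       Sydswcore01
    service_description             PING
    dependent_host_name             Sydswcore01
    dependent_service_description   Gi2/1 IP:10.10.1.254 WAN-01 Link Status
    execution_failure_criteria      n       ; keep checking the interface regardless
    notification_failure_criteria   w,u,c   ; no interface notifications while PING is not OK
    }
```

One such definition would be needed per interface service, so for 50 interfaces it gets repetitive.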
 
Old 01-23-2008, 08:44 AM   #4
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
Yeah, I compile Nagios from source, and the files I mentioned get installed if you go through the full install. I have never worked with a packaged version of Nagios. Test it and see. Block the Nagios box's ability to reach the VLAN address of the switch, or simply change the address to something else (i.e. tell Nagios the switch lives at 172.16.0.235 instead of 10.10.1.253, assuming there is no 172.16.0.235). If you start getting emails, shut down Nagios, change the configs back, and restart.
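
As a sketch, the address swap would just be a temporary edit to the host definition posted earlier, something like:

```
define host{
    use         generic-switch
    host_name   Sydswcore01
    alias       Sydney Core Switch 01
    address     172.16.0.235    ; temporary test value, the real address is 10.10.1.253
    hostgroups  switches
    }
```

Running nagios -v against your main config file before each restart should catch any typos introduced while editing; where that file lives depends on how Nagios was installed.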

Peace,
JimBass
 
Old 01-23-2008, 09:31 AM   #5
lord-fu
Member
 
Registered: Apr 2005
Location: Ohio
Distribution: Slackware && freeBSD
Posts: 676

Rep: Reputation: 30
Nagios is amazing and takes some time to learn completely, but "playing" with it is what will get you there.
I say mess around and get the 50 emails, as JimBass suggested. In one location we monitor over 500 hosts and over 1000 services (yes, the 3D graph for this is amazing :] ). We receive maybe 10-15 notifications a week.

http://www.nagios.org/faqs/viewfaq.php?faq_id=145

Good luck, Nagios is w00t!!
 
Old 01-23-2008, 09:47 AM   #6
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
Man, I laughed when you worried about 50 emails. I have 8 Nagios installs, each monitoring between 20 and 110+ hosts, with multiple services on most hosts. I average probably 100-150 emails a day, and that's assuming nothing actually goes "hard down". I do have mine set so that when a front host goes down (like a firewall or router), we are given "host unreachable" messages for all the hosts behind it.

But yeah, Nagios is absolutely great. I use it to keep a lookout on everything, and with the nice web GUI, my boss can look at a webpage and see exactly where a problem is and what needs to be fixed. Unquestionably a great piece of work.

Peace,
JimBass
 
Old 01-23-2008, 05:41 PM   #7
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
I agree. I started using Nagios a few months ago and I have to say I am more and more satisfied. I guess the best way to learn what Nagios can do is to test. I also want to avoid whinging from the other sysadmins about how many emails they are getting from Nagios.
 
Old 01-23-2008, 06:08 PM   #8
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
By the way, how many devices/services is it recommended to monitor with one Nagios instance? For example, we have 8 sites with a total of 1500 users. Would one Nagios instance be capable of monitoring all this?
 
Old 01-23-2008, 06:21 PM   #9
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
Yeah, one instance of Nagios can easily do that. I don't know what you're monitoring per user, but I don't think the user count should matter; either way, Nagios can handle it easily. You might want to install it at different locations just for IP reachability. All of the things I monitor are reached over private IPs, so I need one server on each of those LANs. If you're monitoring things that can all be reached over the net, then one machine can easily handle it.

In regards to your test worries, define a new contact group with yourself as the only member. Then when you fail the switch, only you will get the notifications, not all the sysadmins.
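
Roughly, that could look like the sketch below. The contact name, group name, email address and the generic-contact template are placeholders; the host is the one posted earlier in the thread:

```
# Hypothetical contact and group names.
define contact{
    use             generic-contact         ; assumed template supplying notification periods, options and commands
    contact_name    testadmin
    alias           Test Admin
    email           testadmin@example.com
    }

define contactgroup{
    contactgroup_name   switch-test
    alias               Switch Failure Test
    members             testadmin
    }

define host{
    use             generic-switch
    host_name       Sydswcore01
    alias           Sydney Core Switch 01
    address         10.10.1.253
    hostgroups      switches
    contact_groups  switch-test             ; overrides the group inherited from the template
    }
```

The interface services would presumably need the same contact_groups line for the duration of the test, since they most likely inherit their own contact group from the service template.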

Peace,
JimBass
 
Old 01-23-2008, 06:28 PM   #10
trickykid
LQ Guru
 
Registered: Jan 2001
Posts: 24,149

Rep: Reputation: 269
Quote:
Originally Posted by creatorrr
By the way, how many devices/services is it recommended to monitor with one Nagios instance? For example, we have 8 sites with a total of 1500 users. Would one Nagios instance be capable of monitoring all this?
Don't leave out OpenNMS though. We're currently dropping Nagios for OpenNMS where I'm employed. Don't get me wrong, I like Nagios, but OpenNMS is way better, especially when upper management likes pretty graphs. It's like Nagios and Cacti rolled into one application, and it's smarter too, with autodiscovery.

OpenNMS will also notify you of only one outage if there's something dependent behind it. Say a router or switch goes down and you monitor ports on the switch: it won't page you for every single port on the switch that is down.

Last edited by trickykid; 01-23-2008 at 06:29 PM.
 
Old 01-23-2008, 08:58 PM   #11
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
Hmmm, I have spent far too much time with Nagios and Cacti to swap them for other monitoring software. I will have a look, though...

Anyway, these two do everything we need so far.
 
Old 01-23-2008, 09:02 PM   #12
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by JimBass
In regards to your test worries, define a new contact group with yourself as the only member. Then when you fail the switch, only you will get the notifications, not all the sysadmins.
Yeah, I can do that.

Another question: what practice do you follow on your network for monitoring devices/services? For example, for the more sensitive devices I configure Nagios to notify me the first time the device is unreachable. You know, the default is 3 times. I am wondering how other people set this up on their networks...
 
Old 01-23-2008, 09:44 PM   #13
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
I tend to monitor IP cameras over wireless networks, so I usually let a check fail 10 times before a notice is sent. I think you'll drive yourself crazy with a first-miss message. Even something working well will miss from time to time.

Take for example 2 computers right next to each other, plugged into a hub. Say you ping the second machine from the first all night long. When you go and check it in the morning, you'll find there were a few missed pings. It might have missed only 10 out of 100,000, but the success rate won't be exactly 100%.

I wouldn't go lower than 3 failures, but it's your system. Set it up as you want, and see if it works. If you get "false downs" as a result of notifying after too few failures, then increase the number of failures before you get notified. If it is something like a production server doing e-commerce, then you might well want a notice at the first failure.
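
In config terms the number of failures is the max_check_attempts directive. A sketch using the interface service posted earlier in the thread, with the value as the only thing to tune:

```
define service{
    use                 generic-service
    host_name           Sydswcore01
    service_description Gi2/1 IP:10.10.1.254 WAN-01 Link Status
    check_command       check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
    max_check_attempts  3       ; notify after 3 consecutive failed checks
    }
```

Setting it to 1 would notify on the first failure, and 10 is the more tolerant setting I described for the wireless cameras. If the directive is left out, the value comes from whatever template the service uses.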

Peace,
JimBass
 
Old 01-23-2008, 10:15 PM   #14
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
Yeah, I think you are right. OK, one last question. I can't manage to monitor, for example, the D or E drive on a Windows machine. I have this for the C drive:

check_command check_nt!USEDDISKSPACE!-l c -w 70 -c 80

I tried this for D but it doesn't work:

check_command check_nt!USEDDISKSPACE!-l d -w 70 -c 80

Any suggestions?
 
Old 01-23-2008, 10:27 PM   #15
creatorrr
Member
 
Registered: Nov 2007
Posts: 40

Original Poster
Rep: Reputation: 15
It didn't work because I am an idiot. The D drive was the CD-ROM; I changed it to E and it's fine now.
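
For anyone finding this later, the working check for the E drive sits in a service definition along these lines. The host name and service description are placeholders; the check_command line is the one from my previous post with the drive letter changed:

```
define service{
    use                 generic-service
    host_name           winserver           ; hypothetical Windows host name
    service_description E Drive Space
    check_command       check_nt!USEDDISKSPACE!-l e -w 70 -c 80
    }
```

Here -w 70 and -c 80 are the warning and critical thresholds, in percent of disk space used.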
 
  


All times are GMT -5.