LinuxQuestions.org
Old 08-29-2013, 09:45 AM   #1
digdougburns
LQ Newbie
 
Registered: Aug 2013
Posts: 3

Rep: Reputation: Disabled
Inherited Nagios box - email connection refused


Hi all,

I just started a new position and it appears that I'm now the Nagios admin. I'm fairly familiar with Linux, more so with CentOS than with the Ubuntu Server they're running here, but I'm managing all right.

Here's the issue. A couple of days ago I discovered that Nagios wasn't sending emails. Looking back through the log files, this seems to have been happening for months, but no one knew or cared enough to fix it. Originally the problem was that /usr/bin/mail didn't exist at all, so I installed mailutils, which got me the mail command, and we were off to the races. Or so I thought. I'm currently getting most if not all of the host UP messages from Nagios, but I rarely get any host DOWN messages. (I guess my Nagios box just likes to stay optimistic.) Anyway, my nagios.log file contains several errors that look like this:

[1377783520] SERVICE NOTIFICATION: admin;[server];CPU_check;CRITICAL;notify-by-email;Connection refused

I don't see any errors in the mail logs, and I have confirmed that postfix is running and listening on the appropriate port. The behavior is very sporadic: I get every host UP message without fail, but host DOWN messages either never arrive or arrive only occasionally. Any thoughts?
 
Old 08-29-2013, 10:14 AM   #2
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 2,166

Rep: Reputation: 751
Nagios can be configured on a host by host / service by service level to send e-mails on the following conditions:

Down / Unreachable / Recovery / Flapping / Downtime

So it's possible that for some reason or another the "Down" notifications have been disabled.

Check your nagios config files to see what notifications are enabled for each host. Look for:

Code:
        notification_options            d,u,r
        notifications_enabled           1
This says that down / unreachable / recovery are enabled.
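If there are a lot of object files to go through, a small script can do the looking for you. This is only a rough sketch in Python: the sample definitions and the host names web01 and db01 are made up for illustration, and it does not follow "use" templates, so settings inherited from a template will not show up.

```python
import re

# Hypothetical object definitions; web01 and db01 are made-up host names.
SAMPLE = """
define host {
    use                     generic-host
    host_name               web01
    notification_options    d,u,r
    notifications_enabled   1
}
define host {
    host_name               db01
    notification_options    u,r     ; 'd' missing, so no DOWN e-mails
    notifications_enabled   1
}
"""

def notification_settings(config_text):
    """Return one dict per 'define' block with its notification directives."""
    settings = []
    # Grab the object type and the text between the braces of each block.
    for kind, body in re.findall(r"define\s+(\w+)\s*\{(.*?)\}", config_text, re.S):
        directives = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1]
        options = directives.get("notification_options", "")
        settings.append({
            "type": kind,
            "host": directives.get("host_name"),
            "options": [o.strip() for o in options.split(",") if o.strip()],
            "enabled": directives.get("notifications_enabled"),
        })
    return settings

# Report hosts that cannot send DOWN notifications as configured here.
for entry in notification_settings(SAMPLE):
    if entry["type"] != "host":
        continue
    if "d" not in entry["options"] or entry["enabled"] == "0":
        print(entry["host"], "will not send DOWN notifications")
```

To run it against real files, read each config file's text and pass it to notification_settings() instead of SAMPLE.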

I'm sure there will be more things to try, but this is a good starting point.

Oh, and also check your server / client spam filters, just in case your alerts are being classed as spam!

Last edited by TenTenths; 08-29-2013 at 10:15 AM.
 
Old 08-29-2013, 10:26 AM   #3
digdougburns
LQ Newbie
 
Registered: Aug 2013
Posts: 3

Original Poster
Rep: Reputation: Disabled
All hosts appear to have d,u,r set. Also services look like they have c,r set. I don't see anything caught in our server spam folder or my client folder. Seems like that connection refused has to be related in some way, no?
 
Old 08-29-2013, 10:37 AM   #4
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 2,166

Rep: Reputation: 751
Not necessarily. In the log, it's recording:

[1377783520] - Timestamp
SERVICE NOTIFICATION: - Type Of Notification, HOST or SERVICE
admin; - Contact, send notification to this contact
[server]; - Host (Kind of self explanatory!)
CPU_check; - Service, the name of the particular check that failed.
CRITICAL; - Severity
notify-by-email; - Command used to send the notification.
Connection refused - Status/Results. This is the result of the service check that caused the notification; it's not the result of the notify-by-email command.
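The same breakdown can be written as a short Python sketch. It assumes the semicolon-separated layout shown above, and that HOST NOTIFICATION lines look the same minus the service field; the dictionary keys are just names picked for this example.

```python
from datetime import datetime, timezone

LINE = ("[1377783520] SERVICE NOTIFICATION: "
        "admin;[server];CPU_check;CRITICAL;notify-by-email;Connection refused")

def parse_notification(line):
    """Split one notification log line into named fields."""
    stamp, rest = line.split("] ", 1)       # "[1377783520" and the remainder
    kind, fields = rest.split(": ", 1)      # notification type and its fields
    parts = fields.split(";")
    when = datetime.fromtimestamp(int(stamp.lstrip("[")), tz=timezone.utc)
    if kind == "SERVICE NOTIFICATION":
        contact, host, service, state, command = parts[:5]
        output = ";".join(parts[5:])        # check output may contain ';'
    else:
        # Assumed layout for HOST NOTIFICATION: no service field.
        contact, host, state, command = parts[:4]
        service = None
        output = ";".join(parts[4:])
    return {
        "when": when,
        "type": kind,
        "contact": contact,
        "host": host,
        "service": service,
        "state": state,
        "command": command,
        "output": output,                   # result of the check, not of the mail command
    }

entry = parse_notification(LINE)
print(entry["when"].isoformat(), entry["service"], entry["state"], "->", entry["output"])
```

The number in square brackets is a Unix timestamp, which is why the sketch converts it to a readable UTC date.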

If it's host down notifications you're looking for, then you'll need to grep your log file for HOST NOTIFICATION to see whether Nagios is actually detecting the outages and trying to send you the notifications.
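If you would rather have a count than a screen full of grep output, something like this Python sketch tallies HOST NOTIFICATION entries by state. The two host lines in the sample are invented for illustration (web01, host-notify-by-email and the output text are assumptions), and the field order contact;host;state;command;output is assumed as well.

```python
from collections import Counter

# First line is from the log above; the two HOST lines are hypothetical.
SAMPLE_LOG = [
    "[1377783520] SERVICE NOTIFICATION: admin;[server];CPU_check;CRITICAL;notify-by-email;Connection refused",
    "[1377783600] HOST NOTIFICATION: admin;web01;DOWN;host-notify-by-email;CRITICAL - Host Unreachable",
    "[1377783900] HOST NOTIFICATION: admin;web01;UP;host-notify-by-email;PING OK",
]

def host_notification_states(lines):
    """Count HOST NOTIFICATION entries per state (DOWN, UP, ...)."""
    marker = "] HOST NOTIFICATION: "
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue                      # skip service notifications and other entries
        fields = line.split(marker, 1)[1].split(";")
        if len(fields) >= 3:
            counts[fields[2]] += 1        # third field is the host state
    return counts

print(host_notification_states(SAMPLE_LOG))
```

For a real log, pass an open file instead of the list, for example `with open("nagios.log") as log: print(host_notification_states(log))`, adjusting the path to wherever your install keeps the file. If DOWN never shows up in the counts, Nagios is not even trying to send those notifications.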

Hope this helps you understand what you're looking at / for in your log files.

Last edited by TenTenths; 08-29-2013 at 10:40 AM. Reason: Changed layout a bit.
 
Old 08-29-2013, 11:15 AM   #5
digdougburns
LQ Newbie
 
Registered: Aug 2013
Posts: 3

Original Poster
Rep: Reputation: Disabled
Wow, that is insanely helpful. I've been banging my head against this email "problem" for so long and never realized that it was actually a problem with the service. I disabled that broken service and it's basically fixed everything. I'm not getting flooded with alerts, I don't see the same errors, all, it seems, is well.

Now to figure out why that service is failing and I'll be in business! Thanks again so much for your help.

**EDIT** Looks like the listening daemon was removed from the host server. All is well and I've fixed the Nagios! Thanks for the help in explaining the error. I'll mark this as solved.

Last edited by digdougburns; 08-29-2013 at 11:22 AM. Reason: Update
 
Old 08-29-2013, 11:35 AM   #6
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 2,166

Rep: Reputation: 751
Quote:
Originally Posted by digdougburns View Post
Looks like the listening daemon was removed from the host server.
Yup, that'll cause the "Connection refused" errors.
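For anyone wondering why a missing daemon shows up that way: when nothing is listening on a TCP port, the connection attempt is typically rejected straight away, and a check that connects to that port reports the rejection as its result. A minimal Python sketch of the idea follows; the probe() helper and its return strings are made up for this example and are not part of Nagios.

```python
import socket

def probe(host, port, timeout=3.0):
    """Try a TCP connection and describe what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on that port, e.g. the daemon was removed.
        return "Connection refused"
    except OSError as exc:
        # Timeouts, unreachable hosts, name resolution failures, and so on.
        return f"error: {exc}"

# Demonstrate on the loopback interface with a temporary listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

print(probe("127.0.0.1", port))          # listener present
listener.close()
print(probe("127.0.0.1", port))          # listener gone
```

Pointing probe() at the monitored host and the port the check uses is a quick way to confirm whether the daemon is back.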
Quote:
Originally Posted by digdougburns View Post
All is well and I've fixed the Nagios! Thanks for the help in explaining the error. I'll mark this as solved.
You're welcome!
 
  

