I've got a Nagios server (on CentOS 5) and a monitored node (also on CentOS 5). I initially had a problem with SSH key exchange, which has since been solved, but I'm still receiving a "No route to host" error for one service check.
Nagios server: 10.0.100.130
monitored node: 10.0.100.143
Yet I can run the following from the Nagios server:
Code:
/usr/local/nagios/libexec/check_tcp -H 10.0.100.143 -p 5666
TCP OK - 0.000 second response time on port 5666|time=0.000361s;0.000000;0.000000;0.000000;10.000000
I can also run this from the Nagios server:
Code:
ssh 10.0.100.143 /usr/local/nagios/libexec/check_procs
PROCS OK: 603 processes
I can successfully ping 10.0.100.143 from the Nagios server as well.
Grepping for the monitored node in /var/log/messages on the Nagios server pulls this up:
Code:
Nov 10 00:00:00 nagiosbox nagios: CURRENT HOST STATE: monitorednode;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.21 ms
Nov 10 00:00:00 nagiosbox nagios: CURRENT SERVICE STATE: monitorednode;Home Page;CRITICAL;HARD;1;No route to host
Route and ifconfig info for both machines:
Code:
from monitored node:
ping 10.0.100.130
PING 10.0.100.130 (10.0.100.130) 56(84) bytes of data.
64 bytes from 10.0.100.130: icmp_seq=1 ttl=64 time=0.897 ms
monitored node ifconfig:
ifconfig
eth0 Link encap:Ethernet HWaddr 00:1D:09:2C:C3:2A
inet addr:10.0.100.143 Bcast:10.0.100.255 Mask:255.255.255.0
inet6 addr: fe80::21d:9ff:fe2c:c32a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:151840310 errors:0 dropped:0 overruns:0 frame:0
TX packets:20026487 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:145578488128 (135.5 GiB) TX bytes:2364444581 (2.2 GiB)
Interrupt:169 Memory:f8000000-f8012800
eth0:1 Link encap:Ethernet HWaddr 00:1D:09:2C:C3:2A
inet addr:10.0.100.144 Bcast:10.0.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:169 Memory:f8000000-f8012800
"route" from monitored node:
route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.0.100.0 * 255.255.255.0 U 0 0 0 eth0
169.254.0.0 * 255.255.0.0 U 0 0 0 eth0
default 10.0.100.1 0.0.0.0 UG 0 0 0 eth0
from Nagios box, ifconfig:
/sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:1C:23:C8:96:AE
inet addr:10.0.100.130 Bcast:10.0.100.255 Mask:255.255.255.0
inet6 addr: fe80::21c:23ff:fec8:96ae/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1968825668 errors:0 dropped:0 overruns:0 frame:0
TX packets:2112609296 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:708043528943 (659.4 GiB) TX bytes:995965269105 (927.5 GiB)
Interrupt:169 Memory:f8000000-f8011100
"route" from nagios box:
/sbin/route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
10.0.101.0 * 255.255.255.0 U 0 0 0 eth1
10.0.100.0 * 255.255.255.0 U 0 0 0 eth0
169.254.0.0 * 255.255.0.0 U 0 0 0 eth0
default 10.0.100.1 0.0.0.0 UG 0 0 0 eth0
I have a bucket container:
Code:
/usr/local/nagios/etc/servers/monitorednode.cfg:
define host{
use linux-server ; Inherit default values from a template
host_name monitorednode ; The name we're giving to this server
alias monitorednode ; A longer name for the server
address 10.0.100.143 ; IP address of the server
}
define service{
use generic-service
host_name monitorednode
service_description Home Page
check_command check_http!ww2
}
If I run ./check_http -H 10.0.100.143, I get a connection refused error ("Unable to open TCP socket"). I can't telnet to port 80 on that box either.
If I run ./check_http -H 10.0.100.144, I get:
Code:
OK - HTTP/1.1 301 Moved Permanently - 0.003 second response time |time=0.002535s;;;0.000000 size=434B;;;0
I can also telnet successfully to port 80 on 10.0.100.144.
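To compare the two addresses side by side, a small wrapper like the following could help. This is only a sketch: it assumes check_http lives in /usr/local/nagios/libexec (the same directory as check_tcp above), the function names state_name and probe are made up for illustration, and it relies on the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Assumption: plugin path matches the libexec directory used earlier in this post.
# It can be overridden through the CHECK_HTTP environment variable.
CHECK_HTTP="${CHECK_HTTP:-/usr/local/nagios/libexec/check_http}"

# Map a plugin exit code to the standard Nagios state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "OTHER($1)" ;;   # e.g. 127 when the plugin is not found
    esac
}

# Run the plugin against one address and print "<address> <state>".
probe() {
    rc=0
    "$CHECK_HTTP" -I "$1" >/dev/null 2>&1 || rc=$?
    echo "$1 $(state_name "$rc")"
}

# The primary address and the eth0:1 alias of the monitored node.
for addr in 10.0.100.143 10.0.100.144; do
    probe "$addr"
done
```

Running it from the Nagios server prints one line per address, which makes it easier to see whether only the primary address is failing.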
Someone mentioned that this error comes from the OS rather than from Nagios, specifically stating that the "Home Page" check, unlike the check_ping plugin, isn't looking at a valid host name or address. The problem is that I can't find any reference to "Home Page" anywhere.
These command definitions are from /usr/local/nagios/etc/objects/commands.cfg:
Code:
# 'check-host-alive' command definition
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
# 'check_ping' command definition
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
# 'check_http' command definition
define command{
command_name check_http
command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
}
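As far as I understand it, Nagios builds the actual command by substituting macros into command_line, with the text after the ! in check_command becoming $ARG1$. The sketch below imitates that substitution by hand for the "Home Page" service. It is not how Nagios itself implements it, the expand_check function is hypothetical, and the $USER1$ value is an assumption (it is normally set in resource.cfg, which isn't shown here).

```shell
#!/bin/sh
# Assumption: $USER1$ points at the plugin directory used elsewhere in this post.
USER1="/usr/local/nagios/libexec"
# The address from the host definition for monitorednode.
HOSTADDRESS="10.0.100.143"

# expand_check CHECK_COMMAND TEMPLATE
# Prints TEMPLATE with $USER1$, $HOSTADDRESS$ and $ARG1$ filled in.
# Only the first !-separated argument is handled; values containing
# '|' or '&' would confuse sed, which is acceptable for a sketch.
expand_check() {
    check_command=$1
    template=$2
    case "$check_command" in
        *!*) arg1=${check_command#*!}; arg1=${arg1%%!*} ;;
        *)   arg1="" ;;
    esac
    printf '%s\n' "$template" | sed \
        -e 's|\$USER1\$|'"$USER1"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$HOSTADDRESS"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g'
}

# The check_command from the service and the command_line from commands.cfg.
expand_check 'check_http!ww2' '$USER1$/check_http -I $HOSTADDRESS$ $ARG1$'
```

Under those assumptions this prints a check_http command line aimed at 10.0.100.143 with ww2 as a trailing argument, which can then be run by hand from the Nagios server to see whether it reproduces the error.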
Under /etc/rc.d/init.d/nagios I can see that I've got the paths right:
Code:
prefix="/usr/local/nagios"
exec_prefix="/usr/local/nagios"
exec="/usr/local/nagios/bin/nagios"
config="/usr/local/nagios/etc/nagios.cfg"
Code:
Nov 9 00:00:00 nagiosbox nagios: CURRENT SERVICE STATE: monitorednode;Home Page;CRITICAL;HARD;1;No route to host
Nov 10 00:00:00 nagiosbox nagios: CURRENT HOST STATE: monitorednode;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.21 ms
Nov 10 00:00:00 nagiosbox nagios: CURRENT SERVICE STATE: monitorednode;Home Page;CRITICAL;HARD;1;No route to host