Linux - Server
This forum is for the discussion of Linux software used in a server-related context.
Let's say we have a Linux / UNIX server, example1.com, which responds to ping requests, but when we try to connect to it through ssh or telnet, it does not respond. Similarly, we have a Windows server, example2.com, which responds to ping successfully (0% packet loss), but we can't connect to it through RDP.
Does it mean the servers are in a hung state? Can a server still respond (successfully) to a ping request while it is dying?
If yes, then consider a production environment in which we have 100 UNIX and Windows servers and we want to make sure that they are alive. I usually write a shell script which pings each of them; if a host replies, I ignore it, otherwise an email alert is sent to the admin / concerned group.
(This is critical because we have other services, such as databases, running on the servers. If the servers respond to ping but are not actually functioning properly, those services are likely to be impacted.)
A half-dead server can certainly often respond to ICMP while failing to open TCP connections, but there's no guarantee that this is what is happening. Maybe the ssh service itself has just frozen, or a firewall was updated.
Imagine: a network card on its own, without a running OS, can answer a ping request, so ping is not reliable for this purpose. A firewall can also block any port or protocol. If you want to be sure, I suggest you create and install your own health check service on all your hosts and ask that service about the state.
Quote:
consider a production environment wherein we have 100 UNIX and Windows servers and we want to make sure that they are alive. I usually write a shell script which ping(s) each of them and if a green signal is returned, I ignore this node otherwise an email alert is sent to the admin / concerned group.
Just checking ping is nowhere near enough to determine whether a host is down. As previous posters have indicated, a host can reply to pings while other services are impacted. Conversely, firewall configurations could prevent the host from responding to pings while other services are unaffected.
You need to define what services you are expecting on each of your hosts and check them accordingly.
Rather than a single script, you might want to consider a monitoring suite. My personal preference is Nagios, but I'm sure others will have their own opinions.
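As a rough illustration of checking the expected services rather than just pinging, here is a minimal Python sketch. The inventory dictionary and the function names are made up for this example, and the port numbers are the usual defaults for ssh and RDP, which may not match your setup. Note that a successful TCP connect only shows that something accepted the connection on that port; it does not prove the service behind it is working properly.

```python
import socket

# Hypothetical inventory: the TCP services each host is expected to offer.
# Port numbers are the common defaults and are an assumption here.
EXPECTED_SERVICES = {
    "example1.com": [22],    # ssh
    "example2.com": [3389],  # RDP
}


def tcp_service_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def failed_services(inventory, timeout=3.0):
    """Return the (host, port) pairs that did not accept a connection."""
    return [
        (host, port)
        for host, ports in inventory.items()
        for port in ports
        if not tcp_service_up(host, port, timeout)
    ]
```

Calling `failed_services(EXPECTED_SERVICES)` from cron and mailing the admin when the list is non-empty would mirror the ping script from the first post, just one layer higher. A monitoring suite does the same kind of thing with scheduling, retries and notifications already built in.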
Quote:
create-install your own health check service on all your hosts and ask that service about the state.
To make sure I got your point exactly, let me put it this way:
I put a script on server_1, server_2, ... server_n, and the script on each of these servers writes a log to some common place, say logging_server:\var\log\myLogs\servers_health.log, and then I check whether I get logs/answers from all the named servers?
No, not really. I mean a small daemon process which listens on a given port and replies to the central host. Something like "are you ok?" "yes/no/whatever".
On each host, the daemon knows how to check whether that host is working well, and it runs some tests periodically.
But you can also try writing a log to a common filesystem and checking those logs...
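To make the daemon idea a bit more concrete, here is a minimal Python sketch of such a health check service. Everything in it is an assumption for illustration: the port 9999, the disk-space check, and all the function and class names. For brevity it runs the checks on each request instead of periodically.

```python
import shutil
import socket
import socketserver


def disk_has_space(path="/", min_free_fraction=0.05):
    """Example local check: at least 5% of the filesystem at `path` is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


# Hypothetical set of local checks; each callable returns True when healthy.
CHECKS = {"disk": disk_has_space}


def health_status(checks):
    """Run every check and return 'yes' or 'no <names of failed checks>'."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that crashes counts as a failure
        if not ok:
            failed.append(name)
    return "yes" if not failed else "no " + ",".join(failed)


class HealthHandler(socketserver.StreamRequestHandler):
    def handle(self):
        self.rfile.readline()  # read the question; its content is ignored
        self.wfile.write((health_status(CHECKS) + "\n").encode())


def ask(host, port, timeout=3.0):
    """Central-host side: ask the daemon and return its one-line answer."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"are you ok?\n")
        return conn.makefile().readline().strip()


def serve(port=9999):
    """Run the daemon until interrupted (port 9999 is an arbitrary choice)."""
    with socketserver.TCPServer(("", port), HealthHandler) as server:
        server.serve_forever()
```

Each host would run `serve()`, and the central host would call `ask(host, 9999)` for every server, treating a timeout, a refused connection, or any answer other than "yes" as an alert. The useful part is that the answer comes from code running on the host itself, which a bare ping reply cannot tell you.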
Quote:
A half dead server can certainly often respond to ICMP but not open TCP connections
How do we make sure that a particular port is open only when the server is alive?
Quote:
Originally Posted by pan64;
A small daemon process which will listen on a given port and will reply to the central host. Something like "are you ok?" "yes/no/whatever".
The daemon process knows on every and each host how to check if it works well and runs some test periodically.
Quote:
No, not really. A small daemon process which will listen on a given port and will reply to the central host. Something like "are you ok?" "yes/no/whatever".
Why re-invent something that already exists? Most distros now come with an extendable SNMP daemon.
As the others said: if this is a serious production question, get a monitoring tool, e.g. Nagios, Zabbix, Zenoss, OpenNMS, etc.
I wouldn't write my own for 100 systems.
Re ping: that just tests the network connection to the remote host's network stack.
It tells you nothing about the state of the rest of the system/services.