Linux - Server: This forum is for the discussion of Linux software used in a server-related context.
Our Red Hat ES 3 file and print server has been crashing over the past two days. You can ping it, and when I scan the ports it shows ssh and others open, but I cannot ssh into it. I have looked through the logs but can't find any obvious issues.
The only odd thing I've noticed is that the messages log shows no activity from
april 18th 07 - 17:56 TO april 19th 07 - 08:21
AND
april 19th 07 - 08:21 TO april 19th 07 - 09:21
The above shows a one-hour gap before we physically rebooted the server at about 9:20.
The same issue happened today:
april 19th 07 - 19:24 TO april 20th 07 - 08:21
AND
april 20th 07 - 08:21 TO april 20th 07 - 09:21
Again, the above shows a one-hour gap before we rebooted the server. Please see the examples from the messages logs.
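Rather than scanning for silent periods by eye, a short script can list them. This is only a sketch: it assumes the traditional syslog line format ("Apr 19 08:21:03 host ..."), which carries no year, so the year is passed in. The `find_gaps` name and the 30-minute threshold are my own choices, not anything from the server.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2007, threshold=timedelta(minutes=30)):
    """Return (previous, current) timestamp pairs that are more than
    `threshold` apart. Lines without a leading syslog timestamp are skipped."""
    gaps = []
    previous = None
    for line in lines:
        try:
            # Traditional syslog timestamps occupy the first 15 characters.
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps


def report(path="/var/log/messages"):
    """Print every silent period found in the given log file."""
    with open(path, errors="replace") as log:
        for start, end in find_gaps(log):
            print(f"no entries between {start} and {end} ({end - start})")
```

Calling `report()` as root prints one line per silent period. A gap can come from a hung machine, but also from a clock correction, so treat the output as a pointer rather than proof.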
If it happened at 8:21 each day, that sounds like a timed event, which should immediately lead you to investigate scheduling tools. The basic one is cron. Have a look at /var/log/cron to see what may have kicked off at 8:21 (or more likely 8:20) or sometime before that (e.g. 8:00).
You can look in /var/spool/cron to see what the files there (if any) may be kicking off.
Also you can look in the files in /etc/cron* to see what they may be kicking off.
Another utility some people use is anacron - you'd have to check the man page for that for what files it uses as I don't use it.
Finally, there are of course commercial scheduling tools (and probably other open source ones), like Tivoli Workload Scheduler, that could be scheduling things. I don't know whether that one has a Linux agent, but it is fairly popular in large UNIX shops (it was previously called Maestro).
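To get every cron schedule line in one place, something like the following could be run as root. It is a rough sketch: the `cron_entries` helper and its regular expression are my own, and it only recognises lines that begin with minute and hour fields or an `@` shortcut, so scripts dropped into directories such as /etc/cron.daily show up only if they happen to contain such lines.

```python
import glob
import os
import re

# Matches lines that start like a crontab entry: "20 8 ..." or "@daily ...".
SCHEDULE = re.compile(r"^(@\w+|[\d*,/-]+\s+[\d*,/-]+)\s")


def cron_entries(patterns=("/etc/cron*", "/var/spool/cron")):
    """Yield (path, line) for each line that looks like a cron schedule."""
    for pattern in patterns:
        for top in sorted(glob.glob(pattern)):
            if os.path.isdir(top):
                files = sorted(
                    os.path.join(root, name)
                    for root, _dirs, names in os.walk(top)
                    for name in names
                )
            else:
                files = [top]
            for path in files:
                try:
                    with open(path, errors="replace") as handle:
                        for line in handle:
                            line = line.strip()
                            if SCHEDULE.match(line):
                                yield path, line
                except OSError:
                    continue  # unreadable (not root) or removed meanwhile


def show():
    """Print every schedule line with the file it came from."""
    for path, line in cron_entries():
        print(f"{path}: {line}")
```

Anything listed with an hour field of 8 (or `*`) and a minute at or shortly before 20 would be the first thing to look at.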
It's happened again: the server has frozen up. I have identified that the 8:21 to 9:21 gap was an NTP date update running at boot.
I have looked through the messages logs but can't find any obvious cause of the system hang. Can someone advise what I could do to identify what is causing the hang?
If everything is up but you can't log in, one culprit might be NFS (or possibly Samba) mounts. When you log in there is an attempt to check quotas on all drives (even if you haven't set any quotas). If you have mounted an NFS share (or maybe Samba) but the server that is the source of the mount is down, the filesystem is inaccessible but your mount table (/etc/mtab on Linux) still indicates it is mounted. You would therefore see long delays while it tried to check quotas. This might eventually let you in after a timeout.
Check to see if you have any mounts to this server from others via NFS or Samba. If so see if something is happening to those servers (e.g. daily reboot).
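One way to see whether this box has any network filesystems mounted is to read the kernel's mount table. A minimal sketch, assuming /proc/mounts is available and that the filesystem types of interest are nfs, nfs4, smbfs and cifs; the function names are made up for illustration.

```python
NETWORK_FS = {"nfs", "nfs4", "smbfs", "cifs"}


def network_mounts(mount_table):
    """Return (source, mount_point, fstype) for each network filesystem
    listed in text formatted like /proc/mounts."""
    found = []
    for line in mount_table.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] in NETWORK_FS:
            found.append((fields[0], fields[1], fields[2]))
    return found


def read_mount_table(path="/proc/mounts"):
    """Read the current mount table as text."""
    with open(path) as handle:
        return handle.read()
```

`network_mounts(read_mount_table())` returns an empty list when nothing is mounted over the network, in which case the quota theory is unlikely to apply.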
This server is used by about 17 users for Samba shares. I am currently keeping a couple of ssh sessions open to the server, monitoring CPU, RAM and UNIX users and tailing logs and processes, to hopefully catch whatever is causing the problem when it crashes the system.
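Since open ssh sessions die with the machine, it may also help to leave a small logger running on the server itself so the last reading before a hang is on disk after the reboot. The sketch below assumes a Python 3 interpreter is available on the box (a stock install of that era may not have one) and uses a made-up log path, /var/tmp/heartbeat.log. It records the load average plus the MemFree and SwapFree values from /proc/meminfo.

```python
import os
import time


def parse_meminfo(text):
    """Map each 'Key: value ...' line of /proc/meminfo to its first value."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[key.strip()] = parts[0]
    return values


def snapshot(loadavg_text, meminfo_text, now):
    """Format one heartbeat line from the raw /proc contents."""
    load = " ".join(loadavg_text.split()[:3])
    mem = parse_meminfo(meminfo_text)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return (f"{stamp} load={load} "
            f"MemFree={mem.get('MemFree', '?')}kB "
            f"SwapFree={mem.get('SwapFree', '?')}kB")


def run(path="/var/tmp/heartbeat.log", interval=10):
    """Append a heartbeat line every `interval` seconds, forcing it to disk."""
    with open(path, "a") as out:
        while True:
            with open("/proc/loadavg") as f:
                loadavg = f.read()
            with open("/proc/meminfo") as f:
                meminfo = f.read()
            out.write(snapshot(loadavg, meminfo, time.time()) + "\n")
            out.flush()
            os.fsync(out.fileno())  # make sure the line survives a hard hang
            time.sleep(interval)
```

After the next freeze, the final lines of the heartbeat file show roughly when the machine stopped and whether load or memory was climbing beforehand.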
Meaning it is a Samba server on which the filesystems exist natively and others mount to their systems OR meaning it is a Samba client that has filesystems mounted from other systems? If the latter then what I said about quotas may be the issue.
It's fairly safe to say that when a *nix system becomes completely unresponsive, it's usually a bad piece of critical hardware. If it truly is unresponsive, I'd look into the CPU and memory, and maybe try one of the many memory testing programs that will analyze that for you (memtest86 is a good one). It could be something as peripheral as a NIC or USB controller that could be replaced or simply turned off, but usually if those things are locking the system, you know about it before it ever gets a chance to come all the way up.
Before you go down any real investigative roads, I advise getting as much information about your problem as you can. Are you 100% sure that the system is completely unresponsive? Have you tried connecting to the console and checking it out from there? Are there any other symptoms that might give you some clues as to the general health of the server?
It is a Samba and CUPS server. The server has crashed a few times now and I have always tried to ssh into it with no luck. To be sure, I have also pinged and port scanned the server, and it showed it was up with the correct ports open.
I have then travelled to the physical location of the server to see if I can log in on the main console, but that has also crashed.