LinuxQuestions.org
Old 05-14-2006, 01:59 PM   #1
tisource
Member
 
Registered: Feb 2002
Posts: 322

Rep: Reputation: 30
Server hangs, nothing adds up


I have a SUSE Linux server, usually rock-solid, that suddenly started locking up every day, and I'm having a hard time determining what the problem is.

The first thing that happens is that services are no longer available... network communication "times out". So, I go to the server console to find that the display is black and won't respond to input. I can't even switch to another Bash console. However, the machine still responds to ping packets (with good ping response times).

The Caps-Lock and Scroll Lock appear to work (LEDs on the keyboard toggle state when the keys are pressed).

So, I have to hard-power the machine off, and then power it back on. Everything starts appropriately. But within 24 hours or so, this happens again.

So I started checking my logs, particularly /var/log/messages. I noticed that syslog keeps logging throughout the entire "down" period, while the server is unresponsive.

I suppose it could be from something I've done or changed, but I can't think of anything that would cause this. I don't "tinker" with my servers... I typically set them up (install) and then leave them alone.

Here is a typical excerpt from the logs (/var/log/messages):

May 14 12:26:55 fs2 kernel: lowmem_reserve[]: 0 0 0
May 14 12:26:55 fs2 kernel: Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB
May 14 12:26:55 fs2 kernel: Node 0 Normal: 9*4kB 24*8kB 12*16kB 2*32kB 0*64kB 1*128kB 1*256kB 0*5
May 14 12:26:55 fs2 kernel: Node 0 HighMem: empty
May 14 12:26:55 fs2 kernel: Swap cache: add 1109627, delete 1109624, find 243647/321774, race 1+2
May 14 12:26:55 fs2 kernel: Free swap: 0kB
May 14 12:26:55 fs2 kernel: 259724 pages of RAM
May 14 12:26:55 fs2 kernel: 6632 reserved pages
May 14 12:26:55 fs2 kernel: 14525 pages shared
May 14 12:26:55 fs2 kernel: 3 pages swap cached
May 14 12:26:55 fs2 kernel: Out of Memory: Killed process 12545 (httpd2-prefork).
May 14 12:26:55 fs2 kernel: iptables in DROP IN=eth1 OUT= MAC=00:0a:5e:3d:13:f9:00:12:80:32:a1:8
May 14 12:26:55 fs2 kernel: Mem-info:
May 14 12:26:55 fs2 kernel: Node 0 DMA per-cpu:
May 14 12:26:55 fs2 kernel: cpu 0 hot: low 2, high 6, batch 1
May 14 12:26:55 fs2 kernel: cpu 0 cold: low 0, high 2, batch 1
May 14 12:26:55 fs2 kernel: cpu 1 hot: low 2, high 6, batch 1
May 14 12:26:55 fs2 kernel: cpu 1 cold: low 0, high 2, batch 1
May 14 12:26:55 fs2 kernel: Node 0 Normal per-cpu:
May 14 12:26:55 fs2 kernel: cpu 0 hot: low 62, high 186, batch 31
May 14 12:26:55 fs2 kernel: cpu 0 cold: low 0, high 62, batch 31
May 14 12:26:55 fs2 kernel: cpu 1 hot: low 62, high 186, batch 31
May 14 12:26:55 fs2 kernel: cpu 1 cold: low 0, high 62, batch 31
May 14 12:26:55 fs2 kernel: Node 0 HighMem per-cpu: empty
May 14 12:26:55 fs2 kernel:
May 14 12:26:55 fs2 kernel: Free pages: 6016kB (0kB HighMem)
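
Editor's note: an excerpt like the one above can be summarized mechanically. The following is a minimal Python sketch, not part of the original post, that pulls the OOM-killer events, the "Free swap" readings, and the RAM size out of syslog lines. The regular expressions match the wording seen in this excerpt only (other kernel versions phrase these messages differently), and the 4 kB page size is an assumption taken from the "4kB" buckets in the log. The function name summarize is hypothetical.

```python
import re

# Patterns follow the wording in the excerpt above (2.6.5-era kernel).
# Other kernels word these messages differently, so treat them as assumptions.
OOM_RE = re.compile(r"Out of Memory: Killed process (\d+) \(([^)]+)\)")
FREE_SWAP_RE = re.compile(r"Free swap:\s*(\d+)kB")
RAM_PAGES_RE = re.compile(r"(\d+) pages of RAM")


def summarize(lines, page_kb=4):
    """Collect OOM kills, free-swap readings and RAM size from syslog lines.

    page_kb is the assumed page size in kB (4 kB is typical on x86).
    """
    summary = {"oom_kills": [], "free_swap_kb": [], "ram_mb": None}
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            summary["oom_kills"].append((int(match.group(1)), match.group(2)))
            continue
        match = FREE_SWAP_RE.search(line)
        if match:
            summary["free_swap_kb"].append(int(match.group(1)))
            continue
        match = RAM_PAGES_RE.search(line)
        if match:
            # pages * kB per page, converted to whole MB
            summary["ram_mb"] = int(match.group(1)) * page_kb // 1024
    return summary
```

Feeding it the lines of /var/log/messages would show at a glance which processes were killed and whether free swap was at 0 kB at the time, as it is in this excerpt.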

This is a Dell PowerEdge server with an Intel EM64T 3.4 GHz processor and 1 GB of RAM. It has a SATA 150 RAID 1 configuration (2x160GB drives via a 3Ware/AMCC Escalade RAID card). The Linux kernel version is 2.6.5-7.252-smp. It serves web (apache2), ftp (proftpd), email (cyrus/postfix), file (samba), etc. No, I no longer have a service contract with Dell (we couldn't afford to renew it, and they don't support SUSE Linux on this PowerEdge server anyway).

I ran clamscan, and it reports the machine is clean. I have plenty of spare disk space, and the machine (while it is responsive) averages 30-60MB free RAM.

What could be doing this? Are there other logs I should be checking?

Any help would be most appreciated, as this is a production server!
 
Old 05-14-2006, 04:53 PM   #2
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
Probably the power supply is going bad. Also use Knoppix or any other Linux live distribution to scan your installation for rootkits. The 2.6.5 kernel is very old and vulnerable to network and other attacks. I suggest upgrading to 2.6.12 or higher. Search the NSA web site for vulnerabilities in the services that you are running.
 
Old 05-14-2006, 05:02 PM   #3
tisource
Member
 
Registered: Feb 2002
Posts: 322

Original Poster
Rep: Reputation: 30
Why would you say the power supply is going bad?
 
Old 05-14-2006, 11:47 PM   #4
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
If you have done a rootkit scan, updated the kernel, and run memtest86, and you still have problems, suspect the power supply: it is the number one cause of computer-related problems. An Intel 3.4 GHz processor uses a lot more electricity than AMD's top-of-the-line processors. Power supplies get worse over time. Unfortunately, Dell systems use non-standard devices, so you may want to reconsider paying for their service to replace the hardware.
 
Old 05-15-2006, 12:48 AM   #5
tisource
Member
 
Registered: Feb 2002
Posts: 322

Original Poster
Rep: Reputation: 30
After a great deal of digging, I have come to the conclusion that the machine is locking up when the swap fills to 100%. There is no available RAM, no available swap... and the computer freezes.

Well, like I said earlier, some processes continue (i.e. syslog), and the keyboard responds, but the computer won't allow you to log in (the prompt hangs after you've typed your username and hit enter), and none of the network services are available.

As long as swap doesn't run out, nothing goes bad. I have a 2GB swap partition.
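
Editor's note: since the lockups coincide with swap reaching 100%, swap usage can be watched before it gets that far. Below is a minimal Python sketch, not from the original post, that parses the text of /proc/meminfo and reports the percentage of swap in use. The SwapTotal and SwapFree fields are standard in /proc/meminfo; the function names and the 80% warning threshold are hypothetical choices for illustration.

```python
def parse_meminfo(text):
    """Turn the text of /proc/meminfo into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_percent(info):
    """Percentage of swap in use; 0.0 when no swap is configured."""
    total = info.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - info.get("SwapFree", 0)) / total


def swap_alarm(info, warn_percent=80.0):
    """Return (alarm, used_percent); warn_percent is an arbitrary example threshold."""
    used = swap_used_percent(info)
    return used >= warn_percent, used


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Calling swap_alarm(read_meminfo()) from a periodic job and mailing the result when the alarm flag is set would give some warning before the machine becomes unreachable.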

Based on normal load (my baseline), the server rarely ever touches swap. I usually have 30 to 50 MB of free RAM. The fact that something is sucking up so much swap tells me that either (1) I have one or more processes that are bringing the server to its knees, or (2) I have a virus.

What would you suggest for rootkit scanning? All I have is clam antivirus, which takes forever.

Again, help is appreciated...
 
Old 05-15-2006, 01:19 AM   #6
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
Reconfigure Apache so it does not open so many threads, and minimize its memory usage. Also use sysctl to optimize memory usage. Set up a cron script that creates additional swap when needed.

I should have noticed the error message "May 14 12:26:55 fs2 kernel: Out of Memory: Killed process 12545 (httpd2-prefork)."
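
Editor's note: one common rule of thumb for limiting Apache's memory use is to cap the number of concurrent children so that they fit in physical RAM. The process name httpd2-prefork suggests the prefork MPM, where the relevant directive in Apache 2.0/2.2 is MaxClients. The Python sketch below only does the arithmetic; the function name and the example figures for reserved memory and per-child size are hypothetical and would have to be measured on the actual server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, avg_child_rss_mb):
    """Rough upper bound on prefork children that fit in RAM without swapping.

    total_ram_mb     -- physical RAM in MB
    reserved_mb      -- RAM set aside for the OS and other services (assumed figure)
    avg_child_rss_mb -- average resident size of one Apache child (assumed figure)
    """
    if avg_child_rss_mb <= 0:
        raise ValueError("avg_child_rss_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return less than 1, since Apache needs at least one child to serve requests.
    return max(1, int(usable_mb // avg_child_rss_mb))
```

For example, with about 1014 MB of RAM, a hypothetical 400 MB reserved for mail, Samba and the rest, and a hypothetical 20 MB per child, estimate_max_clients(1014, 400, 20) gives 30.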
 
Old 05-15-2006, 06:47 AM   #7
pmarques
LQ Newbie
 
Registered: Jul 2003
Posts: 28

Rep: Reputation: 15
Seems like a memory leak somewhere....

I had a problem with a server a while ago similar to yours.

It turned out that it was saslauthd that had a major leak and it would eat up all the memory from the machine until it was OOM killed.

I would let the machine run for a while, then use top and sort by memory usage. That should pinpoint the culprit straight away.
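
Editor's note: the same ranking that top gives when sorted by memory can be captured non-interactively, which is handy for logging it from cron while the leak develops. This is a minimal Python sketch, not from the original post, that reads the Name and VmRSS fields from /proc/<pid>/status and lists the largest processes by resident memory. The function names are hypothetical; processes without a VmRSS line (kernel threads) are reported as 0 kB.

```python
import os


def parse_status(text):
    """Extract (name, rss_kb) from the text of a /proc/<pid>/status file."""
    name, rss_kb = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_by_rss(proc_root="/proc", limit=10):
    """Return up to `limit` tuples (rss_kb, pid, name), largest resident size first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                name, rss_kb = parse_status(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]
```

Printing top_by_rss() every few minutes into a log file would leave a record of which process was growing in the hours before a lockup.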
 
Old 05-15-2006, 07:32 AM   #8
animehair
Member
 
Registered: Sep 2004
Location: NJ
Distribution: Gentoo
Posts: 104

Rep: Reputation: 15
rkhunter (http://www.rootkit.nl/) is a good rootkit scanner if you want to look into the possibility of an exploit.
 
Old 05-15-2006, 09:47 AM   #9
tisource
Member
 
Registered: Feb 2002
Posts: 322

Original Poster
Rep: Reputation: 30
First of all, thanks for the tips!

Second, I have done further research, and it is quite evident that clamav (aka clamscan) is the one hogging all the memory/swap. It is scanning archives, and I think that is what is killing it.

I think apache, because of its thread/memory usage, is the one getting killed as a result.

I've never used sysctl, so I don't know what it is all about.

I also don't know why clamav suddenly started doing this. I've had it scanning archives on a weekly basis for some time now (months). I'm unsure as to why it is giving me grief now.
 
Old 05-15-2006, 10:11 AM   #10
animehair
Member
 
Registered: Sep 2004
Location: NJ
Distribution: Gentoo
Posts: 104

Rep: Reputation: 15
Maybe there is an update that fixes a bug.
 
Old 05-15-2006, 12:05 PM   #11
tisource
Member
 
Registered: Feb 2002
Posts: 322

Original Poster
Rep: Reputation: 30
Stupid me... after doing even further research, I found that I had zip, rar and arj archive scanning enabled. I also had a 3 GB tar (a backup of a home directory on a Linux workstation) on the server, and I think clamav was trying to process that, chewed up swap and killed the machine.

I have turned off archive scanning for the time being, to see if the server locks up again. We will see if it hangs within the next 24 hours. If it hangs again, then I guess I have another problem...

Thanks thus far for everyone's help... it is most appreciated!
 
Old 05-15-2006, 04:32 PM   #12
skulbite
LQ Newbie
 
Registered: Apr 2005
Location: Barbados
Distribution: Mandrake/Mandriva
Posts: 15

Rep: Reputation: 0
Is there any way that you can exempt the compressed file or directory from the scan?
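
Editor's note: one way to approach this question is to first find out which archives are large enough to be worth exempting. The Python sketch below, which is not from the thread, walks a directory tree and lists archive files above a size threshold. The suffix list, the 1 GB default threshold, and the function name are hypothetical choices. The resulting paths could then be turned into exclusion patterns; clamscan documents --exclude and --exclude-dir options that take regular expressions, but check the man page of the installed version before relying on them.

```python
import os

# Example archive suffixes, based on the formats mentioned in this thread.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".zip", ".rar", ".arj")


def find_large_archives(root, min_bytes=1024 ** 3, suffixes=ARCHIVE_SUFFIXES):
    """Return (size_bytes, path) tuples for archives of at least min_bytes, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

Run against the directory that holds the backups, this would surface a file like the 3 GB tar mentioned above as a candidate for exclusion.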
 
  


All times are GMT -5.
