LinuxQuestions.org
Old 06-16-2011, 07:59 AM   #1
invader7
LQ Newbie
 
Registered: Jun 2011
Posts: 24

Rep: Reputation: Disabled
100% RAM usage, server hang


Hello, I have a Debian server hosting 2 sites. It works well for some hours (sometimes 2 days, sometimes 10 hours) and then suddenly it hangs until I stop the apache2 and mysql processes and restart them. During the hang my RAM is at 100% and I can't visit the websites. Here is my server log from the moment the problem starts until I stopped the processes.

You will find 2 cron jobs (update.sh) in it; they run a wget request every 10 minutes to update news via RSS.

My log is too big to post here: http://pastebin.com/fhMdJAVu
 
Old 06-16-2011, 12:32 PM   #2
Noway2
Senior Member
 
Registered: Jul 2007
Distribution: Gentoo
Posts: 2,125

Rep: Reputation: 781
Did you notice these messages:
Quote:
Jun 16 12:59:43 swq11 mysqld: 110616 12:59:43 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_7cf_5.MYI'; try to repair it
It looks like you are using temp tables and there may be something wrong with them.

There appear to be two main things running: your mail server and your scripts that access the web page. Of the two, your scripts are more likely to cause problems than the mail server, assuming you are running stock binaries, and you seem to have narrowed it down to the scripts.

Generally speaking, RAM or CPU usage at 100% means that something is stuck in a loop, acquiring resources, and the loop isn't exiting properly. It can also happen if you keep opening new connections without closing the old ones. I would look very closely at your scripts, and consider running them in a debugger to see whether they are looping on something when the problem occurs.
 
Old 06-16-2011, 12:37 PM   #3
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Hi -

1. From your log:
Code:
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13358]: (root) CMD (sh /home/online/mysite.gr/update.sh)
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13359]: (root) CMD (sh /home/online/mysecondsite.gr/update.sh)
Jun 16 14:00:39 swq11 postfix/pickup[12325]: 35D0B108A6F: uid=0 from=<root>
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: connect to mysql server 127.0.0.1: Too many connections
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: 35D0B108A6F: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:05 swq11 /USR/SBIN/CRON[13357]: (CRON) error (grandchild #13359 failed with exit status 4)
Jun 16 14:01:06 swq11 postfix/pickup[12325]: B15C8108A73: uid=0 from=<root>
Jun 16 14:01:06 swq11 postfix/cleanup[13320]: warning: BBAAC108A73: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:07 swq11 /USR/SBIN/CRON[13356]: (CRON) error (grandchild #13358 failed with exit status 4)
2. Two suggestions:
* Disable your update.sh scripts and see if the problem still occurs
.. AND ..
* Run "top" to determine exactly which process(es) are spiking RAM usage

ALSO:
* Make sure you have enough RAM/CPU to begin with. For example, are you near 100% RAM during "normal" usage?

PS:
Please read this article and make sure your Postfix configuration is correct.

PPS:
Read this article and make sure you have sufficient space in your /tmp partition.

Also run "df -h" and make sure /tmp is not 100% full when the problem occurs.
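
To capture this information at the moment of a hang, the two checks above can be scripted and run periodically. The following is a minimal Python sketch, not part of the original advice: it assumes a Linux /proc filesystem, and the function names (rss_kb, top_memory_processes, percent_used) are made up for illustration. It lists the processes with the largest resident memory, roughly what sorting "top" by memory shows, and reports how full /tmp is. The percentage can differ slightly from the one "df -h" prints, because df leaves blocks reserved for root out of its calculation.

```python
import os
import shutil


def rss_kb(status_text):
    """Return the VmRSS value (in kB) from the text of /proc/<pid>/status, or 0."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def top_memory_processes(limit=5, proc_root="/proc"):
    """List (rss_kb, pid, name) for the processes using the most resident memory."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here, so nothing to report
    rows = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The first line of the status file looks like "Name:\tmysqld".
        name = text.splitlines()[0].split(":", 1)[1].strip() if text else "?"
        rows.append((rss_kb(text), int(entry), name))
    return sorted(rows, reverse=True)[:limit]


def percent_used(path="/tmp"):
    """Return how full the filesystem holding path is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    for rss, pid, name in top_memory_processes():
        print(f"{rss:>10} kB  pid {pid:<7} {name}")
    print(f"/tmp is {percent_used('/tmp'):.1f}% full")
```

Appending this output to a file from cron every minute or so would leave a record of which process was growing just before the server stopped responding.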

Last edited by paulsm4; 06-16-2011 at 12:44 PM.
 
Old 06-16-2011, 01:14 PM   #4
invader7
LQ Newbie
 
Registered: Jun 2011
Posts: 24

Original Poster
Rep: Reputation: Disabled
Thanks for answering

Quote:
Originally Posted by Noway2
Did you notice these messages:
It looks like you are using temp tables and there may be something wrong with them.

There appear to be two main things running: your mail server and your scripts that access the web page. Of the two, your scripts are more likely to cause problems than the mail server, assuming you are running stock binaries, and you seem to have narrowed it down to the scripts.

Generally speaking, RAM or CPU usage at 100% means that something is stuck in a loop, acquiring resources, and the loop isn't exiting properly. It can also happen if you keep opening new connections without closing the old ones. I would look very closely at your scripts, and consider running them in a debugger to see whether they are looping on something when the problem occurs.
My 2 scripts have only 1 line each:
wget "http://......" I use this to update my RSS news. I use a well-known Joomla component, so I believe its code is correct.

I use temp tables (or rather, Joomla uses them), and I found out that it is "better" to have the MySQL tmp dir mounted on tmpfs than on /, so I did that and I think it's a huge change.
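
Because tmpfs is backed by RAM (and swap), files written there count toward memory use, so it can be worth confirming which filesystem the MySQL tmp dir actually sits on. A minimal Python sketch, assuming Linux and the usual /proc/mounts layout; the function name filesystem_type is hypothetical, and /tmp/mysqltmp is the path mentioned later in this thread:

```python
def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains path.

    mounts_text is the content of /proc/mounts, one mount per line:
    'device mountpoint fstype options dump pass'.
    """
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        prefix = point.rstrip("/") + "/"
        inside = path == point or path.startswith(prefix)
        # Prefer the longest matching mount point; later lines win ties.
        if inside and len(point) >= len(best_point):
            best_point, best_type = point, fstype
    return best_type


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as fh:
            print(filesystem_type("/tmp/mysqltmp", fh.read()))
    except OSError:
        print("no /proc/mounts here (not Linux?)")
```

If this prints tmpfs, MySQL's temporary tables are held in memory rather than on the / partition.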

Quote:
Originally Posted by paulsm4
Hi -

1. From your log:
Code:
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13358]: (root) CMD (sh /home/online/mysite.gr/update.sh)
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13359]: (root) CMD (sh /home/online/mysecondsite.gr/update.sh)
Jun 16 14:00:39 swq11 postfix/pickup[12325]: 35D0B108A6F: uid=0 from=<root>
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: connect to mysql server 127.0.0.1: Too many connections
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: 35D0B108A6F: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:05 swq11 /USR/SBIN/CRON[13357]: (CRON) error (grandchild #13359 failed with exit status 4)
Jun 16 14:01:06 swq11 postfix/pickup[12325]: B15C8108A73: uid=0 from=<root>
Jun 16 14:01:06 swq11 postfix/cleanup[13320]: warning: BBAAC108A73: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:07 swq11 /USR/SBIN/CRON[13356]: (CRON) error (grandchild #13358 failed with exit status 4)
2. Two suggestions:
* Disable your update.sh scripts and see if the problem still occurs
.. AND ..
* Run "top" to determine exactly which process(es) are spiking RAM usage

ALSO:
* Make sure you have enough RAM/CPU to begin with. For example, are you near 100% RAM during "normal" usage?

PS:
Please read this article and make sure your Postfix configuration is correct.

PPS:
Read this article and make sure you have sufficient space in your /tmp partition.

Also run "df -h" and make sure /tmp is not 100% full when the problem occurs.

2.
* My scripts have only one line each, so they are not the problem
* mysql uses 20% of CPU and ~1.5% of RAM; all other processes are at roughly 0.5% or less

ALSO:
* Usually my CPU is at ~10% used and my RAM at 35-50% used; the sites work fine with these numbers (my RAM is 4GB)

PS:
Postfix may be a problem too, with the connections. I will check it, thanks, but I don't think this is the problem that "hangs" the system resources.

PPS:
My MySQL tmp dir is /tmp/mysqltmp and it is mounted as a 2GB tmpfs; / has 5GB free.
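
A one-line script can still cause trouble if the request it makes is slow: cron starts a new run every 10 minutes whether or not the previous one has finished, and each pending request may keep a MySQL connection busy on the server side. wget documents exit status 4 as a network failure, which matches the cron errors in the log. One possible precaution, sketched here in Python and not taken from the actual update.sh, is to add a timeout and a lock so that runs cannot overlap. The lock path and function names are hypothetical, and the feed URL is passed on the command line because the thread does not give it.

```python
import fcntl
import sys
import urllib.request

LOCK_PATH = "/tmp/update-news.lock"  # hypothetical location for the lock file


def run_exclusively(lock_path, job):
    """Run job() unless another run holds the lock; return None when skipped."""
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        return job()  # the lock is released when the file is closed


def fetch(url, timeout=60):
    """Request url and return the HTTP status, giving up after timeout seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__" and len(sys.argv) == 2:
    status = run_exclusively(LOCK_PATH, lambda: fetch(sys.argv[1]))
    if status is None:
        print("previous update still running, skipping this run")
    sys.exit(0 if status == 200 else 1)
```

With this in place, a slow update makes later runs exit immediately instead of piling up behind it.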
 
Old 06-16-2011, 04:14 PM   #5
Noway2
Senior Member
 
Registered: Jul 2007
Distribution: Gentoo
Posts: 2,125

Rep: Reputation: 781
Quote:
I use temp tables (or rather, Joomla uses them), and I found out that it is "better" to have the MySQL tmp dir mounted on tmpfs than on /, so I did that and I think it's a huge change.
When you say that it's a huge change, are you saying that things improved?

If you are still having trouble, run the top command (may need to run as root to see all processes) which will tell you which process is at fault.
 
Old 06-16-2011, 05:01 PM   #6
invader7
LQ Newbie
 
Registered: Jun 2011
Posts: 24

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Noway2
When you say that it's a huge change, are you saying that things improved?

If you are still having trouble, run the top command (may need to run as root to see all processes) which will tell you which process is at fault.
I'd say that things are much better now, better than ever.
 
Old 06-16-2011, 08:14 PM   #7
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Hi -
Quote:
I'd say that things are much better now, better than ever.
Q: What changed?

FYI -

When you hit a "problem" (any kind of problem) on a busy system, it often causes a "snowball effect". Lots of things suddenly go wrong, very quickly. You run out of memory, you hit 100% CPU, you run out of connections - all at once! This can make it hard to debug the original, "root cause".

I'm guessing that maybe the "root cause" here was /tmp space, and that maybe you resolved it by moving "/tmp" to a different partition (with more space).

In any case, if the problem recurs, please run "top" (or just leave it running all the time) and "df -h". If /tmp fills up again, it's entirely possible that it will go from zero to maxed out very, very quickly.
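
One way to look for the root cause after a snowball like this is to find the earliest error or warning from each program in the syslog. A rough Python sketch, assuming the traditional syslog line format seen in the log excerpts in this thread; the regular expressions and the function name first_problems are illustrative only, and the sample lines are copied from the posts above:

```python
import re

# Fields: timestamp, host, program, optional [pid], message
SYSLOG_LINE = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) (\S+) ([^:\[\s]+)(?:\[\d+\])?: (.*)$"
)
PROBLEM = re.compile(r"\b(error|warning)\b", re.IGNORECASE)


def first_problems(lines):
    """Map each program name to the timestamp of its first error/warning line."""
    first = {}
    for line in lines:
        match = SYSLOG_LINE.match(line)
        if not match:
            continue
        timestamp, _host, program, message = match.groups()
        if PROBLEM.search(message) and program not in first:
            first[program] = timestamp
    return first


SAMPLE = [
    "Jun 16 12:59:43 swq11 mysqld: 110616 12:59:43 [ERROR] /usr/sbin/mysqld: "
    "Incorrect key file for table '/tmp/#sql_7cf_5.MYI'; try to repair it",
    "Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13358]: (root) CMD "
    "(sh /home/online/mysite.gr/update.sh)",
    "Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: connect to mysql "
    "server 127.0.0.1: Too many connections",
    "Jun 16 14:01:05 swq11 /USR/SBIN/CRON[13357]: (CRON) error "
    "(grandchild #13359 failed with exit status 4)",
]

if __name__ == "__main__":
    for program, timestamp in first_problems(SAMPLE).items():
        print(timestamp, program)
```

Among the lines quoted in this thread, the mysqld temp-table error appears about an hour before the "Too many connections" warning, which is consistent with the guess that /tmp was where things started.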

IMHO .. PSM
 
Old 06-17-2011, 01:42 PM   #8
invader7
LQ Newbie
 
Registered: Jun 2011
Posts: 24

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by paulsm4
Hi -

Q: What changed?

FYI -

When you hit a "problem" (any kind of problem) on a busy system, it often causes a "snowball effect". Lots of things suddenly go wrong, very quickly. You run out of memory, you hit 100% CPU, you run out of connections - all at once! This can make it hard to debug the original, "root cause".

I'm guessing that maybe the "root cause" here was /tmp space, and that maybe you resolved it by moving "/tmp" to a different partition (with more space).

In any case, if the problem recurs, please run "top" (or just leave it running all the time) and "df -h". If /tmp fills up again, it's entirely possible that it will go from zero to maxed out very, very quickly.

IMHO .. PSM
I didn't move /tmp to another partition; it is still on the / partition. I changed the MySQL tmp dir from /tmp to /tmp/mysqltmp (mysqltmp is mounted as a 2GB tmpfs), and my system has now been stable for a whole day. Memory usage stays at 2GB, CPU is below ~20%, and the sites are up; I can access pages faster than ever.

My 2 sites are Joomla sites with more than 20,000 news items each, in 2 databases on the same machine. The problem now is that when I request a page with news, it takes 5-10 seconds to execute the query and start loading the page. Maybe I have to add an index to the database tables that store the news?
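
Whether an index helps depends on the query Joomla actually runs, which the thread does not show. As an illustration only, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, with a hypothetical news table and made-up column names, to show how a query plan changes once an index matches the filter and sort columns:

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for sql as one string (the last column is the detail)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real Joomla tables will differ.
conn.execute(
    "CREATE TABLE news (id INTEGER PRIMARY KEY, catid INTEGER, created TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO news (catid, created, title) VALUES (?, ?, ?)",
    [(i % 10, f"2011-06-{i % 28 + 1:02d}", f"item {i}") for i in range(20000)],
)

LATEST = "SELECT title FROM news WHERE catid = ? ORDER BY created DESC LIMIT 20"

plan_before = query_plan(conn, LATEST, (3,))  # no usable index: scan, then sort
conn.execute("CREATE INDEX idx_news_cat_created ON news (catid, created)")
plan_after = query_plan(conn, LATEST, (3,))   # the new index is now used

print("before:", plan_before)
print("after: ", plan_after)
```

In MySQL the corresponding check is to put EXPLAIN in front of the slow SELECT and compare its output before and after CREATE INDEX.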
 
  



