Linux - Server This forum is for the discussion of Linux Software used in a server related context. |
06-16-2011, 07:59 AM
|
#1
|
LQ Newbie
Registered: Jun 2011
Posts: 24
Rep:
|
100% RAM usage, server hang
Hello, I have a Debian server hosting 2 sites. It works well for some hours (sometimes 2 days, sometimes 10 hours) and then suddenly it hangs until I stop the apache2 and mysql processes and restart them. During the hang my RAM is at 100% and I can't visit the websites. Here is my server log from the start of the problem until I stopped the processes.
You will find 2 cron jobs (update.sh); they make a wget request to update news via RSS every 10 minutes.
My log is too big to post here: http://pastebin.com/fhMdJAVu
|
|
|
06-16-2011, 12:32 PM
|
#2
|
Senior Member
Registered: Jul 2007
Distribution: Gentoo
Posts: 2,125
|
Did you notice these messages:
Quote:
Jun 16 12:59:43 swq11 mysqld: 110616 12:59:43 [ERROR] /usr/sbin/mysqld: Incorrect key file for table '/tmp/#sql_7cf_5.MYI'; try to repair it
|
It looks like you are using temp tables and there may be something wrong with them.
There appear to be two main things running: your mail server and your scripts that are accessing the web page. Of the two, your scripts are more likely to cause problems than the mail server, assuming you are running stock binaries, and you seem to have narrowed it down to the scripts.
Generally speaking, RAM or CPU usage at 100% often means that something is stuck in a loop, acquiring resources, and that loop isn't exiting properly. It can also happen if you are not closing your connections and only opening new ones. I would look very closely at your scripts, and consider running them in a debugger to see if they are looping on something when the problem occurs.
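One way to see whether connections are piling up is to compare the number of open MySQL connections with the server limit. The following is only a minimal sketch: it assumes the stock mysql command-line client with credentials available (for example in ~/.my.cnf), and the helper names conn_usage_pct and mysql_value are made up for illustration. Threads_connected and max_connections are the standard server status counter and variable.

```shell
#!/bin/sh
# Percentage of the connection limit currently in use (integer arithmetic).
conn_usage_pct() {
    current=$1
    max=$2
    echo $(( current * 100 / max ))
}

# Read one value from the server. -N drops the header row and -B prints
# tab-separated output, so the second field is the value.
mysql_value() {
    mysql -N -B -e "$1" | awk '{ print $2 }'
}

# Usage (needs a reachable server and credentials):
#   cur=$(mysql_value "SHOW GLOBAL STATUS LIKE 'Threads_connected'")
#   max=$(mysql_value "SHOW GLOBAL VARIABLES LIKE 'max_connections'")
#   echo "connections: $cur/$max ($(conn_usage_pct "$cur" "$max")%)"
```

If the percentage keeps climbing toward 100 between cron runs, that would be consistent with connections being opened and never released.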
|
|
|
06-16-2011, 12:37 PM
|
#3
|
LQ Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Rep:
|
Hi -
1. From your log:
Code:
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13358]: (root) CMD (sh /home/online/mysite.gr/update.sh)
Jun 16 14:00:02 swq11 /USR/SBIN/CRON[13359]: (root) CMD (sh /home/online/mysecondsite.gr/update.sh)
Jun 16 14:00:39 swq11 postfix/pickup[12325]: 35D0B108A6F: uid=0 from=<root>
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: connect to mysql server 127.0.0.1: Too many connections
Jun 16 14:00:39 swq11 postfix/cleanup[13320]: warning: 35D0B108A6F: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:05 swq11 /USR/SBIN/CRON[13357]: (CRON) error (grandchild #13359 failed with exit status 4)
Jun 16 14:01:06 swq11 postfix/pickup[12325]: B15C8108A73: uid=0 from=<root>
Jun 16 14:01:06 swq11 postfix/cleanup[13320]: warning: BBAAC108A73: virtual_alias_maps map lookup problem for root@my.server.com
Jun 16 14:01:07 swq11 /USR/SBIN/CRON[13356]: (CRON) error (grandchild #13358 failed with exit status 4)
2. Two suggestions:
* Disable your update.sh scripts and see if the problem still occurs
.. AND ..
* Run "top" to determine exactly which process(es) are spiking RAM usage
ALSO:
* Make sure you have enough RAM/CPU to begin with. For example, are you near 100% RAM during "normal" usage?
PS:
Please read this article and make sure your Postfix configuration is correct.
PPS:
Read this article and make sure you have sufficient space in your /tmp partition.
Also run "df -h" and make sure /tmp is not 100% full when the problem occurs.
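For a non-interactive alternative to watching "top", the resident memory of every process can be listed with ps and ranked. This is a sketch under assumptions: rank_by_rss is a made-up helper name, and it expects plain "PID RSS COMMAND" lines on stdin with RSS in KiB, which is what ps reports.

```shell
#!/bin/sh
# Read "PID RSS_KIB COMMAND" lines on stdin and print the N largest by RSS,
# converting RSS from KiB to whole MiB (fractions are truncated).
# Only the first word of the command name is kept.
rank_by_rss() {
    sort -k2,2nr | head -n "${1:-10}" |
        awk '{ printf "%s %d MiB %s\n", $1, $2 / 1024, $3 }'
}

# Usage: the trailing "=" on each column suppresses the ps header line.
#   ps -eo pid=,rss=,comm= | rank_by_rss 5
```

Saving that output to a file while the hang is in progress gives something concrete to compare against the numbers seen during normal operation.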
Last edited by paulsm4; 06-16-2011 at 12:44 PM.
|
|
|
06-16-2011, 01:14 PM
|
#4
|
LQ Newbie
Registered: Jun 2011
Posts: 24
Original Poster
Rep:
|
Thanks for answering
Quote:
Originally Posted by Noway2
Did you notice these messages:
It looks like you are using temp tables and there may be something wrong with them.
There appear to be two main things running: your mail server and your scripts that are accessing the web page. Of the two, your scripts are more likely to cause problems than the mail server, assuming you are running stock binaries, and you seem to have narrowed it down to the scripts.
Generally speaking, RAM or CPU usage at 100% often means that something is stuck in a loop, acquiring resources, and that loop isn't exiting properly. It can also happen if you are not closing your connections and only opening new ones. I would look very closely at your scripts, and consider running them in a debugger to see if they are looping on something when the problem occurs.
|
My 2 scripts have only 1 line in them:
wget "http://......" I use this to update my RSS news. I use a well-known Joomla component, so I believe its code is correct.
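Even a one-line wget job can pile up if the server is slow to answer, since cron starts a new copy every 10 minutes regardless. A hedged sketch of a more defensive update.sh follows; it assumes GNU wget 1.12 or later (which documents its exit statuses) and flock from util-linux. The function names and the lock file path are made up for illustration. If wget is the last command in the script, the "exit status 4" in the cron log above would correspond to wget's "network failure".

```shell
#!/bin/sh
# Map a GNU wget (1.12+) exit status to its documented meaning.
wget_status_text() {
    case "$1" in
        0) echo "success" ;;
        1) echo "generic error" ;;
        2) echo "parse error" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Fetch a URL with a bounded wait and a bounded number of tries.
# flock -n exits at once (status 1) if the previous run still holds the
# lock, so status 1 here can also mean "another copy is still running".
update_feed() {
    url=$1
    lock=${2:-/tmp/update-feed.lock}   # hypothetical lock file path
    flock -n "$lock" wget -q -O /dev/null --timeout=30 --tries=2 "$url"
    rc=$?
    echo "wget: $(wget_status_text "$rc")"
    return "$rc"
}

# Usage: update_feed "<the URL from update.sh>"
```

The timeout and retry values are arbitrary starting points, not recommendations from the thread.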
I use temp tables, or rather Joomla uses them, and I found out that it is "better" to have the MySQL tmp dir mounted on tmpfs than on /, so I did that and I think it's a huge change.
Quote:
Originally Posted by paulsm4
Hi -
2. Two suggestions:
* Disable your update.sh scripts and see if the problem still occurs
.. AND ..
* Run "top" to determine exactly which process(es) are spiking RAM usage
ALSO:
* Make sure you have enough RAM/CPU to begin with. For example, are you near 100% RAM during "normal" usage?
PS:
Please read this article and make sure your Postfix configuration is correct.
PPS:
Read this article and make sure you have sufficient space in your /tmp partition.
Also run "df -h" and make sure /tmp is not 100% full when the problem occurs.
|
2.
* My scripts have only one line, so they are not the problem.
* mysql uses 20% of CPU and ~1.5% of RAM; all other processes are below ~0.5%.
ALSO:
* Usually my CPU is at ~10% and my RAM at 35-50% used; the sites work fine with these numbers. (I have 4GB of RAM.)
PS:
Postfix may be a problem too, with the connections. I will check it, thanks, but I don't think this is the problem that "hangs" the system resources.
PPS:
My mysql tmp dir is /tmp/mysqltmp and it is mounted as a 2GB tmpfs; / has 5GB free.
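For reference, a setup like the one described (a 2GB tmpfs at /tmp/mysqltmp used as the MySQL temporary directory) could be written down roughly as follows. This is a sketch, not the poster's actual configuration: the helper names and the mode=1777 mount option are assumptions, while tmpdir is the standard mysqld option for the temporary-file location. The helpers only print the lines so they can be reviewed before being added to /etc/fstab and my.cnf by hand.

```shell
#!/bin/sh
# Print an /etc/fstab entry for a size-capped tmpfs.
# mode=1777 (sticky and world-writable, like /tmp) is an assumption; a
# directory owned by the mysql user with tighter permissions also works.
tmpfs_fstab_entry() {
    printf 'tmpfs %s tmpfs rw,size=%s,mode=1777 0 0\n' "$1" "$2"
}

# Print the matching my.cnf fragment; tmpdir tells mysqld where to
# create its on-disk temporary tables.
mysql_tmpdir_conf() {
    printf '[mysqld]\ntmpdir = %s\n' "$1"
}

tmpfs_fstab_entry /tmp/mysqltmp 2G
mysql_tmpdir_conf /tmp/mysqltmp
```

Keep in mind that whatever is stored on a tmpfs counts against RAM (and swap), so a 2GB tmpfs on a 4GB machine can take a large share of memory when big temporary tables are written.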
|
|
|
06-16-2011, 04:14 PM
|
#5
|
Senior Member
Registered: Jul 2007
Distribution: Gentoo
Posts: 2,125
|
Quote:
I use temp tables, or rather Joomla uses them, and I found out that it is "better" to have the MySQL tmp dir mounted on tmpfs than on /, so I did that and I think it's a huge change.
|
When you say that it's a huge change, are you saying that things improved?
If you are still having trouble, run the top command (may need to run as root to see all processes) which will tell you which process is at fault.
|
|
|
06-16-2011, 05:01 PM
|
#6
|
LQ Newbie
Registered: Jun 2011
Posts: 24
Original Poster
Rep:
|
Quote:
Originally Posted by Noway2
When you say that it's a huge change, are you saying that things improved?
If you are still having trouble, run the top command (may need to run as root to see all processes) which will tell you which process is at fault.
|
I'm saying that things are much better now, better than ever.
|
|
|
06-16-2011, 08:14 PM
|
#7
|
LQ Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Rep:
|
Hi -
Quote:
I'm saying that things are much better now, better than ever.
|
Q: What changed?
FYI -
When you hit a "problem" (any kind of problem) on a busy system, it often causes a "snowball effect". Lots of things suddenly go wrong, very quickly. You run out of memory, you hit 100% CPU, you run out of connections - all at once! This can make it hard to debug the original, "root cause".
I'm guessing that maybe the "root cause" here was /tmp space, and that maybe you resolved it by moving "/tmp" to a different partition (with more space).
In any case, if the problem recurs: please run "top" (or just run it all the time, in case the problem recurs), and please run "df -h". If /tmp fills up again, it's entirely possible that it might "go from zero to max'ed out" very, very quickly.
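Because the fill-up can happen faster than anyone can type "df -h", a small check run from cron could record it instead. A minimal sketch, with made-up helper names and a made-up log path; it relies on "df -P", which prints one line per filesystem in the POSIX layout with the use percentage in column 5.

```shell
#!/bin/sh
# Extract the use percentage (digits only) from "df -P" output on stdin.
# Line 1 is the header, line 2 describes the filesystem.
parse_use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Use percentage of the filesystem that holds the given path.
fs_use_pct() {
    df -P "$1" | parse_use_pct
}

# Succeed (exit 0) when usage is at or above the limit, so it can gate an alert.
fs_over_limit() {
    [ "$(fs_use_pct "$1")" -ge "$2" ]
}

# Usage inside a script called from cron (the log path is an assumption):
#   fs_over_limit /tmp 90 &&
#       echo "$(date) /tmp at $(fs_use_pct /tmp)%" >> /var/log/tmp-watch.log
```

The 90 percent threshold is only an example; a lower value gives more warning on a filesystem that fills in bursts.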
IMHO .. PSM
|
|
|
06-17-2011, 01:42 PM
|
#8
|
LQ Newbie
Registered: Jun 2011
Posts: 24
Original Poster
Rep:
|
Quote:
Originally Posted by paulsm4
Hi -
Q: What changed?
FYI -
When you hit a "problem" (any kind of problem) on a busy system, it often causes a "snowball effect". Lots of things suddenly go wrong, very quickly. You run out of memory, you hit 100% CPU, you run out of connections - all at once! This can make it hard to debug the original, "root cause".
I'm guessing that maybe the "root cause" here was /tmp space, and that maybe you resolved it by moving "/tmp" to a different partition (with more space).
In any case, if the problem recurs: please run "top" (or just run it all the time, in case the problem recurs), and please run "df -h". If /tmp fills up again, it's entirely possible that it might "go from zero to max'ed out" very, very quickly.
IMHO .. PSM
|
I didn't move /tmp to another partition; it is still on the / partition. I changed the mysql tmp dir from /tmp to /tmp/mysqltmp (mysqltmp is mounted on a 2GB tmpfs) and now my system has been stable for a whole day. Memory usage is at 2GB all the time, CPU is below ~20% and the sites are up; I can access pages faster than ever. My 2 sites are Joomla sites with more than 20,000 news items each, in 2 databases on the same machine. The problem now is that when I request a page with news, it takes 5-10 seconds to execute the query and start loading the page. Maybe I have to add an index to the database tables storing the news?
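Before adding an index it is worth asking MySQL how it runs the slow query. The sketch below only generates the SQL; the table and column names are placeholders, since the thread does not say which tables the news component uses, and index_sql is a made-up helper name. EXPLAIN and CREATE INDEX are standard MySQL statements.

```shell
#!/bin/sh
# Emit SQL that (1) shows the plan for a "latest items" query and
# (2) adds an index on the column used for ordering.
# Both arguments are placeholders for the real table and column names.
index_sql() {
    table=$1
    column=$2
    cat <<EOF
EXPLAIN SELECT * FROM $table ORDER BY $column DESC LIMIT 20;
CREATE INDEX idx_$column ON $table ($column);
EOF
}

# Usage (hypothetical names; review the output before running it):
#   index_sql news_table created | mysql your_database
```

In the EXPLAIN output, "type: ALL" together with "Using filesort" in the Extra column suggests a full table scan followed by a sort, which is the case an index on the ordering column usually helps. The real query should be taken from the slow query log rather than guessed.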
|
|
|
All times are GMT -5.