Linux - Enterprise: This forum is for all items relating to using Linux in the Enterprise.
Mmmmm - not good.
With RH, you should have sysstat, I would imagine. With the sar (and sadc) component you can get a history of what happens on your system.
Perhaps have a look at that.
Hi, I just checked sar's report but found nothing abnormal.
Is it something related to a kernel parameter, or a limit or defect that shows up when querying too many processes under /proc?
BTW, a quick question: what's the difference between /proc/pid and /proc/.pid?
Can you paste me the output of the following command on your server?
/usr/bin/strace ps -ef
This will give you an idea of the point where the ps command hangs. The problem is due to some defunct threads running on the server. Defunct (zombie) processes are ones that have already exited but stay in the process table because their parent has not reaped them yet. I believe it could be due to some kernel module not working correctly.
What is this server used for?
Your ps command hangs because it tries to read entries under /proc for processes that are zombies.
Rahul Khare.
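To check the zombie theory without going through ps at all, a small script can read the state field straight from /proc/<pid>/stat. This is only a rough sketch: the names `parse_stat` and `find_by_state` are made up for illustration, and it assumes the usual Linux stat layout of `pid (comm) state ppid ...`. State Z is a zombie; D (uninterruptible sleep) is also worth a look when commands hang.

```python
import os

def parse_stat(text):
    """Split one /proc/<pid>/stat line into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the LAST ')' as its boundaries.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    rest = text[rpar + 1:].split()
    return pid, comm, rest[0], int(rest[1])

def find_by_state(states=("Z", "D"), proc_root="/proc"):
    """Return sorted (pid, comm, state, ppid) tuples whose state is in `states`."""
    found = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, name, "stat")) as fh:
                info = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were looking, or odd format
        if info[2] in states:
            found.append(info)
    return sorted(found)
```

Calling `find_by_state(("Z",))` would list only the zombies, together with the parent PID that has not reaped them.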
Hi, thanks for your reply. This is really a good idea.
Anyway, it's unpredictable when the box hangs. I'm so unlucky that it never hangs when I'm there...
In fact, I've tried to trace the syscalls ps uses.
The server is used for a critical business application, which runs lots of processes (and lots of threads).
And... I'd like to ask again about /proc/.pid: how does it differ from the usual /proc/pid?
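Since the hang never shows up while someone is watching, one option is a small watchdog that runs the command on a schedule and records when it overruns. The sketch below is an assumption about how that could look, not something from this thread: `finishes_within` and `watch` are hypothetical helper names, and the timeout values are arbitrary.

```python
import subprocess
import time

def finishes_within(cmd, timeout_s, poll_s=0.05):
    """Return True if cmd exits within timeout_s seconds, else False."""
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return True
        time.sleep(poll_s)
    if proc.poll() is not None:  # finished right at the deadline
        return True
    # Best effort only: a process stuck in uninterruptible sleep will not
    # die until the kernel lets it go, so we do not wait for it here.
    proc.kill()
    return False

def watch(cmd, interval_s, timeout_s, rounds, log=print):
    """Run cmd `rounds` times; log a timestamp each time it overruns."""
    hangs = 0
    for _ in range(rounds):
        if not finishes_within(cmd, timeout_s):
            hangs += 1
            log(time.strftime("%Y-%m-%d %H:%M:%S") + " hang: " + " ".join(cmd))
        time.sleep(interval_s)
    return hangs
```

Something like `watch(["ps", "-ef"], 60, 30, 1440)` would check once a minute for roughly a day. Swapping the command for `["strace", "-f", "-tt", "-o", "/tmp/ps.trace", "ps", "-ef"]` (the output path is just an example) should leave a timestamped trace showing the last call made before the hang.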
When you say the box hangs, does the OS actually crash and kernel panic, or does it just lock/freeze? If you have another RH box on the same LAN it would be worth setting up netdump to see if that will provide you with a useful memory and syslog dump of the failure. Have you set up netdump in the past? It's *really* easy but fantastic for this kind of problem. Also, if you need to raise a call with Red Hat, you can then provide them with all the details they need to sort the problem out for you, assuming you have support, that is.
Other thoughts: has anything changed recently, no matter how small or 'unrelated'? Updated applications, new network switches, etc.?
As for /proc/pid vs /proc/.pid - I've only got /proc/pid on my 2.4 and 2.6 boxes, which is odd.
Thanks!
I think I've got some idea about /proc/.pid now:
they are "threads", which belong to a specific process ID.
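For comparison, on 2.6 and later kernels the threads of a process are listed under /proc/<pid>/task/, one directory per thread ID. A minimal sketch for listing them follows; `list_thread_ids` is a made-up name, and `proc_root` is only a parameter so the function can be pointed at a test directory.

```python
import os

def list_thread_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs found under <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())
```

For a single-threaded process the list should contain just the PID itself, since the main thread's ID equals the process ID.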