10-29-2003, 09:22 AM
|
#1
|
LQ Newbie
Registered: Oct 2003
Distribution: red-hat; slackware
Posts: 5
|
Software/hardware problem?
Hello all,
I'm posting my problem here, hoping someone has a good idea.
Hardware description:
Compaq ProLiant DL360
Dual processor
2 NICs with a virtual network interface, managed by Compaq software (redundancy)
Problem description:
Suddenly, one processor goes to 100% system time while the other does almost nothing.
At some point we lose all connections to the system, but it is still pingable (though very slow).
I suspect one system task is taking up the whole of one CPU...
But since I lose every connection and can't get in through the console, I was wondering if someone has an idea where to start.
Thanks a lot.
|
|
|
10-31-2003, 04:04 PM
|
#2
|
Member
Registered: Sep 2003
Location: Dallas, Tx, USA
Distribution: Red Hat, Gentoo, Libranet
Posts: 98
|
If you can't get in through the console, you have a very sick system.
About all I can suggest is a reboot (Ctrl-Alt-Del). That's not a good answer, but it's the only thing I can think of that might improve things.
One thought: You do know about virtual terminals, right? (Ctrl-Alt-F1, or F2 through F6.) If you can't get in via a GUI, those should still work. If you have a text-mode console and you still can't get in, then I'm back to "you have a very sick system".
|
|
|
10-31-2003, 04:11 PM
|
#3
|
LQ Newbie
Registered: Oct 2003
Distribution: red-hat; slackware
Posts: 5
Original Poster
|
Thanks for your reply.
Yes, I do know about Ctrl+Alt+function key, but even that doesn't work.
Between the moment the CPU goes to 100% system time and the reboot, I have about 4-5 minutes with a (of course) very slow terminal.
I was wondering if there are any special tools to determine why the system is using 100% CPU...
top or sar just says the CPU is fully used, but not by what...
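One way to get at the "by what" part is to compare the per-process system-time counters in /proc over a short interval. The following is only a rough sketch, assuming a Python 3 interpreter is available and that /proc/<pid>/stat has the usual layout (utime and stime as fields 14 and 15); `snapshot`, `rank_by_system_time` and `watch` are hypothetical helper names. If the kernel is spinning in interrupt context, the time may not be charged to any process, so an empty-looking result is itself a hint.

```python
import os
import time


def parse_pid_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, utime, stime)."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST ')' instead of on whitespace.
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    rest = tail.split()
    # rest[0] is field 3 (state), so utime (field 14) and stime (field 15)
    # are at rest[11] and rest[12]. Both are counted in clock ticks.
    return int(pid_text), comm, int(rest[11]), int(rest[12])


def snapshot(proc_root="/proc"):
    """Return {pid: (comm, utime, stime)} for every readable process."""
    procs = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, utime, stime = parse_pid_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the line was not in the expected form
        procs[pid] = (comm, utime, stime)
    return procs


def rank_by_system_time(before, after):
    """Sort processes by system-time ticks gained between two snapshots."""
    gains = []
    for pid, (comm, _utime, stime) in after.items():
        if pid in before:
            gains.append((stime - before[pid][2], pid, comm))
    gains.sort(reverse=True)
    return gains


def watch(interval=5.0, top_n=10):
    """Print the top_n processes by system time gained over `interval` seconds."""
    first = snapshot()
    time.sleep(interval)
    for gain, pid, comm in rank_by_system_time(first, snapshot())[:top_n]:
        print(f"{gain:8d} ticks  pid {pid:<6d} {comm}")
```

Calling `watch()` while the terminal is still responsive would list the processes whose system time grew the most during the interval.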
|
|
|
10-31-2003, 11:56 PM
|
#4
|
Member
Registered: Sep 2003
Location: Dallas, Tx, USA
Distribution: Red Hat, Gentoo, Libranet
Posts: 98
|
Well, sar (and its cousin vmstat) aren't supposed to tell you which process is eating up the CPU, but that's the whole purpose of top. You could try "ps aux --sort=-%cpu" -- that lists processes in descending order of CPU usage.
Does it say it's using all user time, all system, or a mix?
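To answer the user/system question per CPU rather than for the box as a whole, the counters in /proc/stat can be sampled twice and compared. This is a minimal sketch, assuming Python 3 and a kernel that exposes per-CPU lines ("cpu0", "cpu1", ...) in /proc/stat; `cpu_split` and `sample` are hypothetical names. Only the first four counters (user, nice, system, idle) are used, so on newer kernels that add iowait or irq columns the percentages are relative to those four alone.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (user, nice, system, idle)} in ticks, one per CPU."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 ..."; the aggregate "cpu" line is skipped.
        if len(fields) >= 5 and fields[0].startswith("cpu") and fields[0][3:].isdigit():
            counters[fields[0]] = tuple(int(value) for value in fields[1:5])
    return counters


def cpu_split(before, after):
    """Return per-CPU percentages of user, system and idle time between samples."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        d_user, d_nice, d_system, d_idle = (n - o for n, o in zip(new, old))
        total = d_user + d_nice + d_system + d_idle
        if total <= 0:
            continue  # no ticks elapsed, nothing to report
        result[cpu] = {
            "user": 100.0 * (d_user + d_nice) / total,  # nice is counted as user
            "system": 100.0 * d_system / total,
            "idle": 100.0 * d_idle / total,
        }
    return result


def sample(interval=2.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and print the split."""
    with open(path) as handle:
        first = parse_proc_stat(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_proc_stat(handle.read())
    for cpu, pct in sorted(cpu_split(first, second).items()):
        print(f"{cpu}: user {pct['user']:.1f}%  system {pct['system']:.1f}%  idle {pct['idle']:.1f}%")
```

A CPU stuck in the kernel would show up as close to 100% system in the output of `sample()`, with the other CPU showing a more ordinary mix.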
I really have no idea what might be going on at this point, so I'm throwing questions and suggestions out as they occur to me:
Are you using somebody's stock kernel, or did you recompile it?
Are you sure you're using an SMP kernel?
Assuming you are using an SMP kernel, what happens when you use a uniprocessor kernel (other than it only uses one CPU, of course)?
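A quick way to double-check the SMP question is to look at the kernel version string and at how many processors the kernel actually reports. The sketch below assumes Python 3, a Linux kernel whose version string (what `uname -v` prints) contains the word "SMP" when built with SMP support, and an x86-style /proc/cpuinfo with one "processor" line per CPU; `version_says_smp`, `count_processors` and `report` are hypothetical names.

```python
import platform


def version_says_smp(version_string):
    """True if the kernel version string contains the standalone word SMP."""
    return "SMP" in version_string.split()


def count_processors(cpuinfo_text):
    """Count the 'processor : N' entries in the text of /proc/cpuinfo."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key = line.split(":")[0].strip()
        if key == "processor":
            count += 1
    return count


def report(cpuinfo_path="/proc/cpuinfo"):
    """Print what the running kernel says about SMP and the CPUs it sees."""
    with open(cpuinfo_path) as handle:
        seen = count_processors(handle.read())
    # platform.version() returns the same string as `uname -v`.
    print("kernel release :", platform.release())
    print("version has SMP:", version_says_smp(platform.version()))
    print("processors seen:", seen)
```

If `report()` shows only one processor on a dual-CPU machine, the running kernel is probably not using the second CPU at all.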
Is there anything that the various "freak outs" seem to have in common? I suspect that the kernel is going into a loop while trying to do some kind of I/O (mainly because that's the major thing the kernel does). If you can pin it down to disk, network, video, etc. activity, that will make it easier to search for somebody with a similar problem.
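One way to narrow it down to a device is to compare /proc/interrupts before and during a "freak out" and see which interrupt line is climbing fastest. This is only a sketch, assuming Python 3 and the usual /proc/interrupts layout (a header row of CPU names, then one row per interrupt with a count per CPU and a description); `busiest_irqs` is a hypothetical name, and a fast-growing counter is a lead to follow up on, not proof of the cause.

```python
def parse_interrupts(text):
    """Return {label: (per_cpu_counts, description)} from /proc/interrupts text."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        tokens = rest.split()
        counts = []
        # Some rows (for example error counters) carry fewer counts than CPUs.
        for token in tokens[:ncpu]:
            if not token.isdigit():
                break
            counts.append(int(token))
        description = " ".join(tokens[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def busiest_irqs(before, after):
    """Sort interrupt lines by how much their total count grew between samples."""
    growth = []
    for label, (new_counts, description) in after.items():
        old_counts = before.get(label, ([], ""))[0]
        if len(old_counts) != len(new_counts):
            continue  # row appeared or changed shape between samples
        growth.append((sum(new_counts) - sum(old_counts), label, description))
    growth.sort(reverse=True)
    return growth
```

Reading /proc/interrupts twice a few seconds apart, passing each text through `parse_interrupts`, and printing the first few entries of `busiest_irqs` would show whether disk, network, or something else dominates.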
If you are using somebody's stock kernel, you might want to check their mailing lists/forums, especially if it's a recent release. This might be a known bug.
I doubt I contributed anything major that you hadn't already thought of, but maybe one of these will help.
Good luck,
CHL
|
|
|
11-04-2003, 07:42 AM
|
#5
|
LQ Newbie
Registered: Oct 2003
Distribution: red-hat; slackware
Posts: 5
Original Poster
|
Hello,
Thanks anyway.
It says all system time... while the other CPU is shared between user time and a little system time.
I agree with your idea that the kernel is going into a loop.
The kernel is SMP: Linux smtp2 2.4.2-2smp
I didn't compile it; an ex-colleague did.
I have no uniprocessor kernel, but I can recompile one.
For the moment the system is stable; I'm waiting and trying to dig a bit.
Anyway, thanks a lot.
Ben
|
|
|
All times are GMT -5.