We are running Red Hat 5.5 on two Sun Fire X4470 servers with the following setup:
8 GB memory kit with two 4 GB 1066 MHz DDR3 DIMMs
2 Intel Xeon X7560 8-core 2.26 GHz CPUs
300 GB 10K RPM 2.5" SAS hard disk drive with Marlin bracket
StorageTek 8 Gb Fibre Channel PCIe HBA single port Emulex
StorageTek 8 Gb Fibre Channel PCIe HBA dual port Emulex
Sun x4 PCI Express Quad Gigabit Ethernet UTP low profile adapter, with low profile bracket on board, standard bracket included, RoHS-6 compliant, Intel OEM card
Sun Storage 6 Gb SAS PCIe RAID HBA, Internal: 8 port and 512 MB memory
These are replacement machines for 3 others that are running Red Hat 5.2 with the same software. The new machines are experiencing periodic CPU spikes. At low load the spikes do not occur, but under moderate load the CPUs spike to between 80% and 90%.
These are the first 2 process lines from top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14034 fworks 15 0 9339m 1.5g 44m S 1642.8 2.4 1859:04 cor
12493 fworks 15 0 5418m 1.0g 126m S 297.8 1.6 989:04.11 cor
Any ideas on how to track the cause of spikes would be great.
Last edited by Boerboel_Boy; 03-15-2011 at 02:52 AM.
Reason: To add some detail
Distribution: Ubuntu, Red Hat, Solaris, HP-UX, FreeBSD.
Posts: 63
Let's start with top
First, please use top to show current resource usage. I happen to have run into this before at work.
Please cat /proc/interrupts and post the output here.
If we see an insane number of interrupts on IRQ 169 (disk I/O most of the time, sometimes video), you may be suffering from a bug: https://partner-bugzilla.redhat.com/....cgi?id=145530
You would also see ticks and IRQ errors at start-up. As soon as I saw that you were running a Sun host with RHEL 5, I thought "Wow, that reminds me of something."
This is just a possible explanation, and fixing it requires an upgrade to a new kernel.
Please post the output of top and a cat of /proc/interrupts, just to start off.
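A single cat of /proc/interrupts only shows totals since boot, so it can help to take two snapshots a few seconds apart and compare them. The following is a minimal sketch of that idea, not a tested diagnostic tool. The function names (parse_interrupts, busiest, watch) are made up for illustration, and it assumes a Python 3 interpreter is available on the host, which may not be the case on a stock RHEL 5 install.

```python
import time


def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (total_count, description)}."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        # Per-CPU counters come first; some rows (e.g. ERR) have fewer columns.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        result[label.strip()] = (sum(counts), description)
    return result


def busiest(before, after, top=5):
    """Return the `top` largest (delta, irq_label, description) between two snapshots."""
    deltas = []
    for irq, (count, description) in after.items():
        previous = before.get(irq, (0, ""))[0]
        deltas.append((count - previous, irq, description))
    deltas.sort(reverse=True)
    return deltas[:top]


def watch(interval=5.0, path="/proc/interrupts"):
    """Print the IRQs that fired most often during `interval` seconds."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    for delta, irq, description in busiest(first, second):
        print("%12d  IRQ %-6s %s" % (delta, irq, description))
```

Calling watch() while a spike is in progress, and again while the machine is quiet, would show whether one interrupt line (169 or any other) stands out during the spike.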
Attached is the output from interrupts. We have noticed a correlation between writing to file and the interrupts, but have not been able to correlate them directly. Process 14034 in the above post is a CORBA interface, and this process's CPU time skyrockets when we experience the spikes.
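Since the CPU time is concentrated in one multi-threaded process, one way to narrow it down is to look at per-thread CPU time under /proc/<pid>/task. The sketch below is only an illustration of that approach. The helper names (parse_task_stat, thread_cpu) are hypothetical, and it assumes Python 3 is available and that the caller has permission to read the process's /proc entries.

```python
import os


def parse_task_stat(line):
    """Return (tid, name, utime_ticks, stime_ticks) from a /proc/<pid>/task/<tid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    left = line.index("(")
    right = line.rindex(")")
    tid = int(line[:left])
    name = line[left + 1:right]
    fields = line[right + 1:].split()
    # fields[0] is stat field 3 (state); utime and stime are fields 14 and 15.
    utime = int(fields[11])
    stime = int(fields[12])
    return tid, name, utime, stime


def thread_cpu(pid):
    """Return [(total_ticks, tid, name), ...] for every thread of pid, busiest first."""
    rows = []
    task_dir = "/proc/%d/task" % pid
    for entry in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, entry, "stat")) as f:
                tid, name, utime, stime = parse_task_stat(f.read())
        except (OSError, ValueError):
            continue  # the thread exited while we were reading it
        rows.append((utime + stime, tid, name))
    rows.sort(reverse=True)
    return rows
```

Running thread_cpu(14034) twice, a few seconds apart during a spike, and comparing the tick counts would show which thread IDs are accumulating CPU time. Those IDs can then be matched against a thread dump of the application.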
Ouch, so it really is CORBA then. In my "professional" opinion, I would call OMG/Sun and/or Red Hat and see what they can do to support you on it. Otherwise, let's see what dmesg says, if anything. CORBA errors are fairly common and can be a continual headache. I suspect you get the alarms when you get those spikes of CPU utilization. I would hope this host is currently in a dev environment so you can test fixes.
Some people have luck with moving to a new CORBA version, and some have luck with simple Java updates. If you can find the offending object, or where it is having trouble with specific bindings, that is always useful as well. If you still have one of these events in the log, cat out messages and we can look through the output for clues to what orbd reports. That might give a better point of reference to work from.
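To pull the relevant entries out of the messages log rather than posting the whole file, a small filter along these lines could be used. This is a rough sketch only: the keyword list ("orbd", "corba") is a guess at what the entries might contain, /var/log/messages is assumed to be the log location, and the function names are made up for illustration.

```python
import re


def matching_lines(lines, keywords=("orbd", "corba")):
    """Return the lines that mention any keyword, case-insensitively."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_log(path="/var/log/messages", keywords=("orbd", "corba")):
    """Read a syslog-style file and return the lines that match the keywords."""
    with open(path, errors="replace") as f:
        return matching_lines(f, keywords)
```

Comparing the timestamps on the matching lines with the times of the CPU spikes would show whether the two line up.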
We have 2 identical machines and both show the same problem. The frustrating thing is that we already have 3 machines in place (which these are to replace) running the same applications, but these are 8x more powerful and running Red Hat 5.5 as opposed to 5.3. The only consolation is that we can test/probe these machines off-line at the moment, as the current production machines are stable, albeit running close to capacity at peak times.
I have extracted the dmesg output and we have now also put in a support call to Red Hat, so let's hope that produces something. Thanks for the help so far.