LinuxQuestions.org


newbie14 05-06-2013 03:07 PM

Kipmi0 eating up to 99.8% cpu on centos 6.4
 
We have CentOS 6.4, and top shows kipmi0 at 99.8% CPU and 0.0% memory, with a load average of 1.00. What should we do to fix this? Thank you.

siremaxus 05-06-2013 03:40 PM

Hi,

Maybe you can find your answer here:
http://www.serveradminblog.com/2011/02/kipmi0-problem/

hope it helps...

Sire Maxus

newbie14 05-06-2013 03:43 PM

Dear Sire,
I can't find /etc/sysconfig/lm_sensors on my CentOS 6.4 system.

newbie14 05-10-2013 01:24 PM

Hi,
Any help or pointers on how to resolve this?

siremaxus 05-10-2013 01:29 PM

Hi,

I've been thinking about this. If you don't have the lm_sensors package installed, then the problem must lie elsewhere.

Can you post more output from the "top" command, and perhaps install the sysstat package (yum install sysstat -y) and run some tests on your box?
Try iostat and "vmstat 5 5" and post your results here.
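
A rough sketch of how those samples could be gathered in one go. The function name, the output path and the default interval/count are only assumptions for illustration; tools that are not installed are skipped rather than treated as errors.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): run each tool once and
# append its output to a single file that can be pasted or attached.
# Usage: collect_samples [outfile] [interval_seconds] [count]
collect_samples() {
    out=${1:-/tmp/kipmi0-samples.txt}
    interval=${2:-5}
    count=${3:-5}
    : > "$out"                       # start with an empty file
    for cmd in "top -b -n 1" "vmstat $interval $count" "iostat $interval $count"; do
        tool=${cmd%% *}              # first word is the program name
        if command -v "$tool" >/dev/null 2>&1; then
            printf '### %s\n' "$cmd" >> "$out"
            $cmd >> "$out" 2>&1      # unquoted on purpose so it splits into words
        else
            printf '### %s (not installed, skipped)\n' "$cmd" >> "$out"
        fi
    done
    echo "$out"                      # print where the samples were written
}
```

Calling `collect_samples` with no arguments would take "vmstat 5 5" style samples; `collect_samples /tmp/quick.txt 1 1` is a faster dry run.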

Good Luck

Sire Maxus

newbie14 05-11-2013 02:42 AM

Hi,
What kind of tests should I run with sysstat? I'm not clear on that. How many iostat and top samples do you want? Thank you.

newbie14 05-12-2013 03:02 AM

Hi Sire,
Below are some of the samples I captured. Please let me know if they are not sufficient.

Quote:

top - 11:53:32 up 24 days, 23:03, 1 user, load average: 1.24, 1.10, 1.04
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 12.5%sy, 0.0%ni, 87.0%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7990988k total, 1787480k used, 6203508k free, 165400k buffers
Swap: 8126456k total, 0k used, 8126456k free, 1333796k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
130 root 39 19 0 0 0 R 99.8 0.0 26939:03 kipmi0
1059 root 20 0 0 0 0 S 0.3 0.0 1:06.99 jbd2/dm-2-8
7268 mysql 20 0 2685m 90m 7020 S 0.3 1.2 58:05.13 mysqld
1 root 20 0 19228 1500 1220 S 0.0 0.0 0:00.78 init


top - 11:53:47 up 24 days, 23:04, 1 user, load average: 1.18, 1.09, 1.04
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 12.5%sy, 0.0%ni, 87.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7990988k total, 1787604k used, 6203384k free, 165400k buffers
Swap: 8126456k total, 0k used, 8126456k free, 1333796k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
130 root 39 19 0 0 0 R 99.8 0.0 26939:18 kipmi0
1 root 20 0 19228 1500 1220 S 0.0 0.0 0:00.78 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.06 kthreadd


top - 11:59:42 up 24 days, 23:10, 1 user, load average: 1.07, 1.08, 1.03
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 12.5%sy, 0.0%ni, 87.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7990988k total, 1787372k used, 6203616k free, 165400k buffers
Swap: 8126456k total, 0k used, 8126456k free, 1333812k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
130 root 39 19 0 0 0 R 99.8 0.0 26945:12 kipmi0
4926 root 20 0 15128 1404 1008 R 0.3 0.0 0:01.05 top
1 root 20 0 19228 1500 1220 S 0.0 0.0 0:00.78 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.06 kthreadd


vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 6203740 165400 1333816 0 0 0 5 1 1 0 9 90 0 0
1 0 0 6203600 165400 1333836 0 0 0 7 1039 34 0 13 87 0 0
1 0 0 6203464 165408 1333836 0 0 0 2 1042 32 0 13 87 0 0
1 0 0 6203464 165408 1333836 0 0 0 5 1126 122 1 13 87 0 0
1 0 0 6203464 165408 1333836 0 0 0 2 1033 30 0 13 87 0 0



vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 6201780 165616 1335004 0 0 0 5 2 1 0 9 90 0 0
1 0 0 6201640 165616 1335004 0 0 0 119 1150 134 1 13 87 0 0
1 0 0 6201640 165616 1335004 0 0 0 4 1033 32 0 12 87 0 0
1 0 0 6201648 165616 1335004 0 0 0 26 1074 229 0 13 87 0 0
1 0 0 6201584 165616 1335004 0 0 0 12 1036 38 0 13 87 0 0

vmstat 5 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 6201768 165616 1335052 0 0 0 5 2 1 0 9 90 0 0
1 0 0 6201884 165616 1335052 0 0 0 0 1034 27 0 13 87 0 0
1 0 0 6201884 165616 1335052 0 0 0 0 1032 30 0 13 88 0 0
1 0 0 6201884 165616 1335052 0 0 0 0 1035 28 0 13 87 0 0
1 0 0 6201884 165616 1335052 0 0 0 7 1034 32 0 13 87 0 0



sar -u 1 3
Linux 2.6.32-358.2.1.el6.x86_64 (localhost.localdomain) 05/12/2013 _x86_64_ (8 CPU)

12:07:22 PM CPU %user %nice %system %iowait %steal %idle
12:07:23 PM all 0.00 0.00 12.62 0.00 0.00 87.38
12:07:24 PM all 0.00 0.00 12.50 0.00 0.00 87.50
12:07:25 PM all 0.00 0.00 12.62 0.00 0.00 87.38
Average: all





sar -P ALL 1 1
Linux 2.6.32-358.2.1.el6.x86_64 (localhost.localdomain) 05/12/2013 _x86_64_ (8 CPU)

12:08:17 PM CPU %user %nice %system %iowait %steal %idle
12:08:18 PM all 0.00 0.00 12.50 0.00 0.00 87.50
12:08:18 PM 0 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 1 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 2 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 3 0.00 0.00 100.00 0.00 0.00 0.00
12:08:18 PM 4 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 5 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 6 0.00 0.00 0.00 0.00 0.00 100.00
12:08:18 PM 7 0.00 0.00 0.00 0.00 0.00 100.00

Average: CPU %user %nice %system %iowait %steal %idle
Average: all 0.00 0.00 12.50 0.00 0.00 87.50
Average: 0 0.00 0.00 0.00 0.00 0.00 100.00
Average: 1 0.00 0.00 0.00 0.00 0.00 100.00
Average: 2 0.00 0.00 0.00 0.00 0.00 100.00
Average: 3 0.00 0.00 100.00 0.00 0.00 0.00
Average: 4 0.00 0.00 0.00 0.00 0.00 100.00
Average: 5 0.00 0.00 0.00 0.00 0.00 100.00
Average: 6 0.00 0.00 0.00 0.00 0.00 100.00
Average: 7 0.00 0.00 0.00 0.00 0.00 100.00




sar -P ALL 1 1
Linux 2.6.32-358.2.1.el6.x86_64 (localhost.localdomain) 05/12/2013 _x86_64_ (8 CPU)

12:08:50 PM CPU %user %nice %system %iowait %steal %idle
12:08:51 PM all 0.00 0.00 12.50 0.12 0.00 87.38
12:08:51 PM 0 0.00 0.00 0.00 1.00 0.00 99.00
12:08:51 PM 1 0.00 0.00 0.00 0.00 0.00 100.00
12:08:51 PM 2 0.00 0.00 0.00 0.00 0.00 100.00
12:08:51 PM 3 0.00 0.00 100.00 0.00 0.00 0.00
12:08:51 PM 4 0.00 0.00 0.00 0.00 0.00 100.00
12:08:51 PM 5 0.00 0.00 0.00 0.00 0.00 100.00
12:08:51 PM 6 0.00 0.00 0.00 0.00 0.00 100.00
12:08:51 PM 7 0.00 0.00 0.00 0.00 0.00 100.00

Average: CPU %user %nice %system %iowait %steal %idle
Average: all 0.00 0.00 12.50 0.12 0.00 87.38
Average: 0 0.00 0.00 0.00 1.00 0.00 99.00
Average: 1 0.00 0.00 0.00 0.00 0.00 100.00
Average: 2 0.00 0.00 0.00 0.00 0.00 100.00
Average: 3 0.00 0.00 100.00 0.00 0.00 0.00
Average: 4 0.00 0.00 0.00 0.00 0.00 100.00
Average: 5 0.00 0.00 0.00 0.00 0.00 100.00
Average: 6 0.00 0.00 0.00 0.00 0.00 100.00
Average: 7 0.00 0.00 0.00 0.00 0.00 100.00


sar -q 1 3
Linux 2.6.32-358.2.1.el6.x86_64 (localhost.localdomain) 05/12/2013 _x86_64_ (8 CPU)

12:10:17 PM runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
12:10:18 PM 1 246 1.01 1.02 1.00
12:10:19 PM 1 246 1.01 1.02 1.00
12:10:20 PM 1 247 1.01 1.02 1.00
Average: 1 246 1.01 1.02 1.00



sar -P ALL 1 1
Linux 2.6.32-358.2.1.el6.x86_64 (localhost.localdomain) 05/12/2013 _x86_64_ (8 CPU)

12:26:56 PM CPU %user %nice %system %iowait %steal %idle
12:26:57 PM all 0.00 0.00 12.61 0.00 0.00 87.39
12:26:57 PM 0 0.00 0.00 0.00 0.00 0.00 100.00
12:26:57 PM 1 0.00 0.00 100.00 0.00 0.00 0.00
12:26:57 PM 2 0.00 0.00 0.00 0.00 0.00 100.00
12:26:57 PM 3 0.00 0.00 0.00 0.00 0.00 100.00
12:26:57 PM 4 0.00 0.00 0.00 0.00 0.00 100.00
12:26:57 PM 5 0.00 0.00 0.00 0.00 0.00 100.00
12:26:57 PM 6 0.00 0.00 0.99 0.00 0.00 99.01
12:26:57 PM 7 0.00 0.00 0.00 0.00 0.00 100.00

Average: CPU %user %nice %system %iowait %steal %idle
Average: all 0.00 0.00 12.61 0.00 0.00 87.39
Average: 0 0.00 0.00 0.00 0.00 0.00 100.00
Average: 1 0.00 0.00 100.00 0.00 0.00 0.00
Average: 2 0.00 0.00 0.00 0.00 0.00 100.00
Average: 3 0.00 0.00 0.00 0.00 0.00 100.00
Average: 4 0.00 0.00 0.00 0.00 0.00 100.00
Average: 5 0.00 0.00 0.00 0.00 0.00 100.00
Average: 6 0.00 0.00 0.99 0.00 0.00 99.01
Average: 7 0.00 0.00 0.00 0.00 0.00 100.00

siremaxus 05-12-2013 04:38 AM

Hello,

Your system looks OK, apart from kipmi0 using 99.8% CPU (it is only using one core, not the whole CPU).
Overall your system is at about 12.5% to 12.6% CPU on average, and IOWAIT occasionally reaches 0.12%, which is not a bad number either.
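
To make the arithmetic concrete: one core pegged at 100% on an 8-CPU box is 100 / 8 = 12.5% of the total, which matches the "all" line. Below is a small sketch that does the division and picks the busy core out of "sar -P ALL" output. Both function names are made up for illustration, and the column positions are assumed to match the sar output posted in this thread (they are counted from the right so a 24-hour clock without AM/PM should also work).

```shell
#!/bin/sh
# Hypothetical helpers, assuming the sar column layout shown in this thread:
#   ... CPU %user %nice %system %iowait %steal %idle

# Share of the whole machine that one fully busy core represents.
share_of_total() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 100 / n }'
}

# Read "sar -P ALL" output on stdin and print the CPUs whose %system
# is at or above the threshold (default 90). "all", header and
# "Average:" lines are skipped.
busy_cpus() {
    awk -v t="${1:-90}" '
        NF >= 7 && $1 != "Average:" && $(NF-6) ~ /^[0-9]+$/ && $(NF-3) + 0 >= t {
            print "cpu" $(NF-6), $(NF-3)
        }'
}
```

For example, `share_of_total 8` prints 12.5, and `sar -P ALL 1 1 | busy_cpus` would list whichever core kipmi0 happens to be running on at that moment.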

If you could save the output of this command:
#ps -feaux > /root/process.txt

and then upload that file, we could check the kipmi0 process and any other process that could be making kipmi0 use that much CPU.

Good Luck

Sire Maxus

newbie14 05-12-2013 05:27 AM

1 Attachment(s)
Hi Sire,
What level of I/O wait would be considered bad or dangerous? I have uploaded the requested file. Thank you, I appreciate your help.

newbie14 05-13-2013 10:03 AM

Hi Sire,
Any updates based on the process list? Thank you.

siremaxus 05-13-2013 10:39 AM

Hi,

As far as I know, 20-25% is considered acceptable for IOWAIT; more than that signals an issue with the storage devices.
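
A minimal sketch of how that rule of thumb could be checked against "sar -u" output. The function name is made up, the 25% default is just the figure mentioned above and not an official limit, and the column layout is assumed to be the one sar printed earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read "sar -u" (or "sar -P ALL") output on stdin
# and print the time and %iowait of every "all" sample whose %iowait
# is above the threshold (default 25, a rule of thumb only).
# Assumed columns: ... CPU %user %nice %system %iowait %steal %idle
iowait_over() {
    awk -v t="${1:-25}" '
        NF >= 7 && $1 != "Average:" && $(NF-6) == "all" && $(NF-2) + 0 > t {
            print $1, $(NF-2)
        }'
}
```

Something like `sar -u 5 12 | iowait_over` would stay silent on a healthy box and only print the samples that exceed the threshold.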

Code:

root      129  0.0  0.0      0    0 ?        S    Apr17  0:00  \_ [pciehpd]
root      130 75.1  0.0      0    0 ?        RN  Apr17 27255:54  \_ [kipmi0]

The process "pciehpd" is related to hot-plug, and that is what is causing kipmi0 to use that much CPU.
Maybe some piece of hardware was attached recently, or the service didn't update cleanly when you updated your system.

One question I have not asked yet: have you rebooted your system after the update?

Sire Maxus

newbie14 05-13-2013 10:49 AM

Hi Sire,
I am sorry, I am kind of new to this area. Which of the previous commands is best for monitoring IOWAIT? IOWAIT signifies that there is a delay in the hard disk, right? I am sure there is no additional hardware attached to this machine. Possibly the service didn't update cleanly. How do I ensure a clean update? I always just run yum update and that's it. Normally I don't reboot. I did reboot once last month; after the reboot it was OK, but then it slowly climbed back to these values. I am curious how you linked pciehpd to kipmi0. And what exactly is the role of kipmi0?

siremaxus 05-13-2013 11:24 AM

Hello,

The "f" option of the ps command shows all processes together with their parent and child processes.
So there you have the process "pciehpd", which is the parent process of "kipmi0"; you can see the relationships between processes from the lines drawn to the left of the process name.
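
It may be worth double-checking the relationship directly rather than reading it off the drawing. In the forest view a child is drawn one level deeper than its parent, and entries at the same depth are siblings. A small sketch that reads the parent PID from /proc (Linux only; the function names are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helpers, Linux only: look up a process in /proc.

# Print the parent PID of the given PID.
parent_of() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Print the short command name of the given PID.
name_of() {
    cat "/proc/$1/comm"
}
```

With the PIDs from the listing above, `parent_of 130` and `parent_of 129` would show who started each one. Kernel threads normally all report PPID 2 ([kthreadd]), so if both print 2 they are siblings rather than parent and child.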

You can find more info on kipmi in these links:
http://www-01.ibm.com/support/docvie...2575fa0050f604
http://lists.us.dell.com/pipermail/l...ay/031305.html
https://supportcenter.checkpoint.com...tionid=sk43262
http://www.linux-archive.org/red-hat...el-thread.html

I hope this helps you,

Sire Maxus

newbie14 05-13-2013 11:28 AM

Hi sire,
Actually, I had already visited all of the given links via Google, and none of them works; e.g. service ipmi stop does not work either. I am quite lost on how exactly to solve this. Do you think another reboot will help? Is there any way to ensure a clean yum update?

siremaxus 05-13-2013 11:40 AM

Hi,
I thought maybe a reboot could help, but if you have already done that and the process still keeps hogging the CPU, then that is not a solution.
From all the links I've read, kipmi0 is related to IPMI, which is a set of interfaces usually used to monitor hardware, or used by some applications to monitor some processes.
From the links in my previous post, the IBM one says that it does not matter if kipmi reports high CPU usage; it only runs during idle time, and this is standard behavior for this process.

If it bothers you that the process uses so much CPU, then you can disable it as described in the following links:
http://www.novell.com/support/kb/doc.php?id=7003352 (Novell SUSE)
http://unix.stackexchange.com/questi...-on-centos-6-4 (CentOS)
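
One commonly cited knob for this is the ipmi_si module parameter kipmid_max_busy_us, which limits how long kipmid busy-polls before it sleeps. Whether your kernel exposes it, and whether it is the exact method those links describe, is an assumption on my part, so the sketch below only writes the value if the sysfs file is actually there and writable. The function name and the value 100 are just examples.

```shell
#!/bin/sh
# Hypothetical helper: set kipmid_max_busy_us if the parameter exists.
# Usage: set_kipmid_busy VALUE [param_file]
# The second argument exists only so the function can be tried against
# a scratch file; by default it points at the real sysfs path.
set_kipmid_busy() {
    param=${2:-/sys/module/ipmi_si/parameters/kipmid_max_busy_us}
    if [ ! -w "$param" ]; then
        echo "cannot write $param (parameter missing, or not root?)" >&2
        return 1
    fi
    echo "$1" > "$param"
    cat "$param"                     # show the value now in effect
}
```

Run as root, `set_kipmid_busy 100` would apply the change until the next reboot; making it permanent would need a modprobe option or a kernel boot parameter, depending on whether ipmi_si is a module or built in.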

Good Luck

Sire Maxus


All times are GMT -5.