LinuxQuestions.org

ravindert 07-25-2012 01:37 AM

kswapd0 eats 100% CPU, drives up system load
 
Hi,

On one of our Linux servers, the kswapd process is consuming about 100% CPU and the server is getting slow. The load on the server is also high. Can you please suggest how I can get rid of this problem?

Below is the output of the top command.


top - 15:36:35 up 236 days, 13:59, 8 users, load average: 43.16, 50.20, 51.47
Tasks: 835 total, 3 running, 828 sleeping, 0 stopped, 4 zombie
Cpu(s): 5.6%us, 5.4%sy, 0.0%ni, 43.7%id, 45.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 98992208k total, 98748432k used, 243776k free, 1408k buffers
Swap: 29786104k total, 16064776k used, 13721328k free, 62495492k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1321 root 20 -5 0 0 0 R 100.0 0.0 2958:02 kswapd1
17199 oracle 25 0 70.1g 4.6g 4.6g S 49.7 4.9 1813:11 oracle
15372 oracle 15 0 1688m 717m 716m S 8.6 0.7 221:38.03 oracle
1320 root 10 -5 0 0 0 S 8.3 0.0 2381:13 kswapd0
12607 oracle 15 0 1688m 646m 645m R 7.0 0.7 606:27.15 oracle
30998 oracle 16 0 70.2g 19g 19g D 4.6 20.9 49:54.40 oracle
6982 oracle 15 0 70.2g 14g 14g D 2.0 14.8 24:19.83 oracle
9172 oracle 16 0 70.6g 45g 44g D 1.7 48.0 145:13.82 oracle
28825 oracle 15 0 70.2g 2.8g 2.8g S 1.7 2.9 1:16.68 oracle
1413 irbpd 16 0 13424 1736 796 S 1.3 0.0 0:16.61 top
5437 irbpd 16 0 13292 1712 812 S 1.3 0.0 0:00.09 top
28770 oracle 16 0 70.1g 2.7g 2.7g S 1.3 2.9 1:21.49 oracle
28809 oracle 15 0 70.1g 2.7g 2.7g S 1.3 2.9 1:26.04 oracle
28898 oracle 15 0 70.1g 2.7g 2.7g S 1.3 2.9 1:47.36 oracle
3891 oracle 16 0 1688m 80m 74m D 1.0 0.1 0:16.29 oracle
6584 oracle 15 0 70.2g 4.8g 4.8g S 1.0 5.1 4:21.48 oracle
15725 oracle 18 0 70.2g 4.2g 4.2g D 1.0 4.5 1:35.73 oracle
24567 oracle 16 0 70.2g 3.9g 3.9g S 1.0 4.2 0:42.20 oracle
26551 oracle 15 0 70.2g 3.9g 3.8g S 1.0 4.1 0:46.42 oracle
26553 oracle 15 0 70.2g 3.3g 3.3g S 1.0 3.5 1:17.06 oracle
26555 oracle 15 0 70.2g 3.8g 3.8g S 1.0 4.0 0:47.78 oracle
28712 oracle 15 0 70.1g 2.8g 2.8g S 1.0 2.9 1:26.41 oracle
28759 oracle 15 0 70.2g 2.8g 2.8g S 1.0 2.9 1:15.14 oracle
28768 oracle 15 0 70.2g 2.8g 2.8g S 1.0 2.9 1:20.10 oracle
28772 oracle 15 0 70.2g 2.8g 2.8g S 1.0 2.9 1:06.29 oracle
28780 oracle 15 0 70.1g 2.7g 2.7g S 1.0 2.9 1:31.25 oracle
28787 oracle 15 0 70.1g 2.7g 2.7g S 1.0 2.9 1:20.19 oracle
28803 oracle 15 0 70.2g 2.8g 2.8g S 1.0 2.9 1:14.57 oracle
28805 oracle 15 0 70.1g 2.7g 2.7g S 1.0 2.9 1:26.23 oracle
28836 oracle 15 0 70.2g 2.8g 2.8g S 1.0 2.9 1:24.37 oracle
28945 oracle 15 0 70.1g 2.4g 2.3g S 1.0 2.5 2:28.66 oracle
28976 oracle 15 0 70.1g 2.3g 2.3g S 1.0 2.5 2:08.11 oracle
28980 oracle 15 0 70.1g 2.3g 2.3g S 1.0 2.5 2:31.61 oracle
29059 oracle 15 0 70.1g 2.3g 2.3g S 1.0 2.5 2:02.70 oracle
29066 oracle 15 0 70.2g 2.3g 2.3g S 1.0 2.5 2:10.24 oracle
29068 oracle 15 0 70.1g 2.3g 2.3g S 1.0 2.5 2:04.28 oracle
29082 oracle 15 0 70.1g 2.3g 2.3g S 1.0 2.5 2:00.03 oracle


Thanks
Ravinder

syg00 07-25-2012 01:51 AM

This is a natural consequence of the situation in your other two threads.

This is a symptom of a wider problem. Just as loadavg is.
Too much memory is in use (maybe the SGA, who knows), so thrashing forces kswapd to move too many pages in and out. The disk subsystem is obviously under-configured, so I/O is kept waiting, and your loadavg goes up.
Problems keep piling up.

You cannot directly affect the kswapd CPU%. Like I said, it's a symptom.
Fix the real problem(s).
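
One way to confirm the thrashing described above is to watch how fast pages move to and from swap. The following is only a rough sketch, assuming a Linux kernel whose /proc/vmstat exposes the `pswpin` and `pswpout` counters; the function names are made up for illustration and are not part of any tool.

```python
import time

def parse_vmstat(text):
    """Turn the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def swap_rates(before, after, seconds):
    """Pages swapped in/out per second between two parsed snapshots."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in ("pswpin", "pswpout")
    }

def sample_live(seconds=5, path="/proc/vmstat"):
    """Take two snapshots of the live counters, `seconds` apart (Linux only)."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return swap_rates(before, after, seconds)
```

Calling `sample_live()` on the affected server returns the two rates. Sustained non-zero values for both at once would be consistent with thrashing, though what counts as "too high" depends on the disks behind the swap device.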

ravindert 07-25-2012 02:05 AM

Quote:

Originally Posted by syg00 (Post 4737338)
This is a natural consequence of the situation in your other two threads.

This is a symptom of a wider problem. Just as loadavg is.
Too much memory is in use (maybe the SGA, who knows), so thrashing forces kswapd to move too many pages in and out. The disk subsystem is obviously under-configured, so I/O is kept waiting, and your loadavg goes up.
Problems keep piling up.

You cannot directly affect the kswapd CPU%. Like I said, it's a symptom.
Fix the real problem(s).



Thanks for the reply.

Can you please let me know which process is causing kswapd and the load to go high on the server, and how I can avoid this in future?


Thanks
Ravinder

rmugunthan 07-25-2012 02:07 AM

From your top command output, many of the oracle processes are in the D (uninterruptible sleep) state:

30998 oracle 16 0 70.2g 19g 19g D 4.6 20.9 49:54.40 oracle
6982 oracle 15 0 70.2g 14g 14g D 2.0 14.8 24:19.83 oracle
9172 oracle 16 0 70.6g 45g 44g D 1.7 48.0 145:13.82 oracle


These oracle processes are also using a lot of memory, which may be causing the problem.
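
Picking the D-state rows out of a long top listing by eye is error-prone, so here is a small sketch that filters them. It assumes the 12-column layout shown in the top output earlier in this thread (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND); the function name is hypothetical, and the sample rows are copied from that output.

```python
def uninterruptible(top_lines):
    """Return (pid, RES, command) for every task whose state column is 'D'."""
    found = []
    for line in top_lines.splitlines():
        fields = line.split()
        # Process rows have at least 12 columns and start with a numeric PID,
        # which skips the column header and the summary lines.
        if len(fields) >= 12 and fields[0].isdigit() and fields[7] == "D":
            found.append((int(fields[0]), fields[5], fields[11]))
    return found

# Rows taken from the top output posted in this thread.
SAMPLE = """\
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1321 root 20 -5 0 0 0 R 100.0 0.0 2958:02 kswapd1
30998 oracle 16 0 70.2g 19g 19g D 4.6 20.9 49:54.40 oracle
6982 oracle 15 0 70.2g 14g 14g D 2.0 14.8 24:19.83 oracle
9172 oracle 16 0 70.6g 45g 44g D 1.7 48.0 145:13.82 oracle
28825 oracle 15 0 70.2g 2.8g 2.8g S 1.7 2.9 1:16.68 oracle
"""

for pid, res, command in uninterruptible(SAMPLE):
    print(pid, res, command)
# prints:
# 30998 19g oracle
# 6982 14g oracle
# 9172 45g oracle
```

The same function can be fed the text of a batch-mode top capture, provided its columns match the layout assumed here.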

ravindert 07-25-2012 02:32 AM

Quote:

Originally Posted by rmugunthan (Post 4737348)
From your top command output, many of the oracle processes are in the D (uninterruptible sleep) state:

30998 oracle 16 0 70.2g 19g 19g D 4.6 20.9 49:54.40 oracle
6982 oracle 15 0 70.2g 14g 14g D 2.0 14.8 24:19.83 oracle
9172 oracle 16 0 70.6g 45g 44g D 1.7 48.0 145:13.82 oracle


These oracle processes are also using a lot of memory, which may be causing the problem.


Thanks a lot for the reply.

Can you please let me know how I can avoid such a situation in the future?


Thanks Again

rmugunthan 07-25-2012 07:02 AM

Have you checked processes 30998, 6982 and 9172? Are they SQL queries or JDBC connections?

chrism01 07-25-2012 09:09 PM

Also, a lot of processes are in state S http://slack-linux.blogspot.com.au/2...ate-codes.html.
Basically, as above, you need to find out what your Oracle DB/processes are actually doing and fix the root cause, not keep trying to fix the symptoms.
See post#2 by syg00

ravindert 07-25-2012 09:25 PM

Quote:

Originally Posted by chrism01 (Post 4738061)
Also, a lot of processes are in state S http://slack-linux.blogspot.com.au/2...ate-codes.html.
Basically, as above, you need to find out what your Oracle DB/processes are actually doing and fix the root cause, not keep trying to fix the symptoms.
See post#2 by syg00



Thanks Chris.

So we need to check the Oracle database. Is that what is causing the real problem?

chrism01 07-25-2012 09:49 PM

There's no way to tell from this distance. However, it appears that your system is almost exclusively running Oracle, so start there.
Note that you need to check the app programs calling Oracle as well as the Oracle DB procs.
In fact, you could start by killing off all the progs talking to Oracle and just run Oracle to see what happens.
It's interesting to see that you have only 8 users but 800+ processes; i.e. each one is (probably) generating 100 processes, which sounds odd to me.
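
To see where the 800+ processes actually come from, a per-owner tally is a more direct check than dividing by the "8 users" figure, which in top's header counts login sessions. A minimal sketch, assuming a procps-style `ps` that accepts `-eo user=` (one owner name per process, no header line); the function names and the sample input are illustrative only.

```python
import subprocess
from collections import Counter

def procs_per_user(ps_output):
    """Count processes per owner from one-username-per-line text."""
    return Counter(line.strip() for line in ps_output.splitlines() if line.strip())

def live_counts():
    """Run `ps -eo user=` and tally the owners it reports."""
    result = subprocess.run(["ps", "-eo", "user="],
                            capture_output=True, text=True, check=True)
    return procs_per_user(result.stdout)

# Hypothetical input for illustration only, not data from the server above.
sample = "oracle\noracle\nroot\noracle\nirbpd\n"
for user, count in procs_per_user(sample).most_common():
    print(user, count)
```

Running `live_counts().most_common(5)` on the server would show whether the oracle account owns nearly all of the processes, which would support starting the investigation with Oracle and the applications connected to it.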


All times are GMT -5.