isourabh.wadhwa |
07-30-2013 08:20 PM |
High load average | what does vmstat hint at?
TOP:
Code:
top - 17:09:39 up 47 days, 1:34, 13 users, load average: 6.54, 10.96, 11.27
Tasks: 274 total, 3 running, 271 sleeping, 0 stopped, 0 zombie
Cpu0 : 6.0%us, 44.9%sy, 0.0%ni, 48.8%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu1 : 6.3%us, 44.4%sy, 0.0%ni, 48.0%id, 0.3%wa, 0.0%hi, 1.0%si, 0.0%st
Cpu2 : 7.3%us, 44.0%sy, 0.0%ni, 48.0%id, 0.3%wa, 0.0%hi, 0.3%si, 0.0%st
Cpu3 : 7.0%us, 44.9%sy, 0.0%ni, 48.2%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16304972k total, 13700780k used, 2604192k free, 86396k buffers
Swap: 16779884k total, 304k used, 16779580k free, 1452304k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26856 batcheai 17 0 156m 33m 10m R 48.8 0.2 0:03.99 ruby /eai/apps/framework/bin/run_infa_component -e PRD -c RUN_WF_DAILY_LABEL_EXTRACT_FROM_VMI
26863 batcheai 18 0 154m 33m 10m R 48.8 0.2 0:03.96 ruby /eai/apps/framework/bin/run_infa_component -e PRD -c RUN_WF_GFK_INDUSTRY_DATA_IMPORT_ESP
26824 batcheai 19 0 159m 34m 10m S 47.2 0.2 0:04.04 ruby /eai/apps/framework/bin/run_legacy_component -e PRD -c RUN_CREATE_PLANNED_ORDERS_IDOCS_FOR_
26811 batcheai 16 0 164m 38m 13m S 45.2 0.2 0:04.00 ruby /eai/apps/custom/legacy/ruby/run_eai_envelope_splitter_router -e PRD -c RUN_EAI_ENVELOPE_SP
26251 as2admin 25 0 1603m 612m 12m S 13.6 3.8 898:26.42 /eai/apps/cleo/VLTrader/../VLTrader/jre/bin/java -Xmx1000M -Duser.timezone=America/Los_Angeles -
3957 root 10 -5 0 0 0 S 0.3 0.0 115:40.28 [nfsiod]
4326 root 15 0 159m 56m 9036 S 0.3 0.4 142:01.91 splunkd -p 8089 start
17137 colemi 16 0 12892 1344 836 S 0.3 0.0 0:03.73 top
1 root 15 0 10368 640 548 S 0.0 0.0 0:03.79 init [3]
2 root RT -5 0 0 0 S 0.0 0.0 0:43.55 [migration/0]
3 root 34 19 0 0 0 S 0.0 0.0 0:04.68 [ksoftirqd/0]
4 root RT -5 0 0 0 S 0.0 0.0 1:06.07 [migration/1]
5 root 34 19 0 0 0 S 0.0 0.0 0:04.78 [ksoftirqd/1]
6 root RT -5 0 0 0 S 0.0 0.0 0:49.25 [migration/2]
7 root 34 19 0 0 0 S 0.0 0.0 0:14.72 [ksoftirqd/2]
8 root RT -5 0 0 0 S 0.0 0.0 0:44.58 [migration/3]
9 root 34 19 0 0 0 S 0.0 0.0 0:08.95 [ksoftirqd/3]
10 root 10 -5 0 0 0 S 0.0 0.0 0:05.60 [events/0]
11 root 10 -5 0 0 0 S 0.0 0.0 0:01.45 [events/1]
12 root 10 -5 0 0 0 S 0.0 0.0 0:02.26 [events/2]
13 root 10 -5 0 0 0 S 0.0 0.0 0:01.73 [events/3]
14 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 [khelper]
151 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 [kthread]
158 root 10 -5 0 0 0 S 0.0 0.0 0:35.04 [kblockd/0]
159 root 10 -5 0 0 0 S 0.0 0.0 0:18.58 [kblockd/1]
160 root 10 -5 0 0 0 S 0.0 0.0 0:16.99 [kblockd/2]
161 root 10 -5 0 0 0 S 0.0 0.0 0:31.35 [kblockd/3]
162 root 15 -5 0 0 0 S 0.0 0.0 0:00.00 [kacpid]
323 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 [cqueue/0]
324 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 [cqueue/1]
325 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 [cqueue/2]
326 root 14 -5 0 0 0 S 0.0 0.0 0:00.00 [cqueue/3]
329 root 13 -5 0 0 0 S 0.0 0.0 0:00.00 [khubd]
331 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 [kseriod]
423 root 15 0 0 0 0 S 0.0 0.0 0:00.10 [khungtaskd]
About a minute later it is:
Code:
top - 17:11:02 up 47 days, 1:35, 13 users, load average: 3.54, 8.85, 10.49
Tasks: 289 total, 1 running, 287 sleeping, 0 stopped, 1 zombie
Cpu0 : 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 2.3%us, 0.7%sy, 0.0%ni, 96.7%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.3%sy, 0.0%ni, 98.7%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.3%us, 0.3%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16304972k total, 13825028k used, 2479944k free, 87224k buffers
Swap: 16779884k total, 304k used, 16779580k free, 1453396k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
30259 batcheai 15 0 147m 18m 14m S 1.0 0.1 0:00.03 pmcmd
25585 wadhwaso 15 0 12892 1304 840 R 0.7 0.0 0:00.28 top
4326 root 15 0 159m 56m 9036 S 0.3 0.4 142:02.04 splunkd
12384 as2admin 19 0 1416m 361m 13m S 0.3 2.3 7:31.96 java
17137 colemi 16 0 12892 1344 836 S 0.3 0.0 0:03.89 top
26251 as2admin 25 0 1600m 612m 12m S 0.3 3.8 898:31.12 java
30034 batcheai 18 0 165m 39m 13m S 0.3 0.2 0:01.37 ruby
30055 batcheai 15 0 164m 38m 13m S 0.3 0.2 0:01.40 ruby
1 root 15 0 10368 640 548 S 0.0 0.0 0:03.79 init
2 root RT -5 0 0 0 S 0.0 0.0 0:43.55 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:04.68 ksoftirqd/0
4 root RT -5 0 0 0 S 0.0 0.0 1:06.07 migration/1
5 root 34 19 0 0 0 S 0.0 0.0 0:04.78 ksoftirqd/1
6 root RT -5 0 0 0 S 0.0 0.0 0:49.26 migration/2
7 root 34 19 0 0 0 S 0.0 0.0 0:14.72 ksoftirqd/2
8 root RT -5 0 0 0 S 0.0 0.0 0:44.58 migration/3
9 root 34 19 0 0 0 S 0.0 0.0 0:08.95 ksoftirqd/3
10 root 10 -5 0 0 0 S 0.0 0.0 0:05.60 events/0
11 root 10 -5 0 0 0 S 0.0 0.0 0:01.45 events/1
12 root 10 -5 0 0 0 S 0.0 0.0 0:02.26 events/2
13 root 10 -5 0 0 0 S 0.0 0.0 0:01.73 events/3
vmstat:
Code:
vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
23 0 304 1926556 88004 1454152 0 0 10 20 1 0 10 12 75 2 0
34 0 304 1920852 88012 1454196 0 0 0 68 1836 479661 9 87 4 0 0
37 0 304 1913544 88028 1454196 0 0 0 84 1736 453677 10 87 3 0 0
33 0 304 1910072 88028 1454212 0 0 0 0 2228 480404 10 85 5 0 0
17 0 304 1905368 88036 1454224 0 0 0 44 1930 471783 11 83 6 0 0
24 0 304 1894424 88136 1454124 0 0 0 248 2028 452977 13 83 4 0 0
21 0 304 1885188 88152 1454232 0 0 0 36 5324 466238 13 75 10 2 0
24 1 304 1876956 88160 1454220 0 0 0 12 3439 471261 15 69 12 4 0
30 2 304 1870272 88160 1454232 0 0 0 0 4156 462486 15 68 12 4 0
24 1 304 1865228 88168 1454272 0 0 0 60 3700 474922 13 68 16 4 0
Those high "in" and "cs" values .. I know what they are:
in --> number of interrupts received by the system per second
cs --> rate of context switching in the process space (number/sec)
but what do they do? Is this what is affecting my system?
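For reference, vmstat's cs figure comes straight from the kernel's running counter in /proc/stat. A minimal sketch (assuming a Linux /proc filesystem) that samples it twice, one second apart, to reproduce the per-second rate:

```shell
#!/bin/sh
# Sample the kernel's cumulative context-switch counter twice, one
# second apart; the difference is what vmstat reports in its cs column.
c1=$(awk '/^ctxt/ {print $2}' /proc/stat)
sleep 1
c2=$(awk '/^ctxt/ {print $2}' /proc/stat)
cs_per_sec=$((c2 - c1))
echo "context switches/sec: $cs_per_sec"
```

The same file also has an `intr` line, which is where the in column comes from.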
---------- Post updated at 06:11 AM ---------- Previous update was at 05:49 AM ----------
I found something:
in: The number of interrupts per second, including the clock.
cs: The number of context switches per second.
(A context switch occurs when the currently running thread is different from the previously running thread, so it is taken off of the CPU.)
It is not uncommon to see the context-switch rate be approximately the same as the device interrupt rate (the in column).
If cs is high, it may indicate that too much process switching is occurring, wasting CPU time on scheduling overhead instead of useful work.
If a high cs rate comes with sy dominating the CPU (as in the vmstat output above, where sy sits at 83-87%), the system is doing more context switching than actual work.
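To see which processes are actually doing the switching, a rough sketch that ranks processes by their accumulated context switches, read from /proc/&lt;pid&gt;/status (voluntary means the process gave up the CPU to wait for something; nonvoluntary means it was preempted). pidstat -w from the sysstat package reports the same counters as per-second rates:

```shell
#!/bin/sh
# Rank processes by total (voluntary + nonvoluntary) context switches.
# Reads /proc/<pid>/status; errors from processes that exit mid-scan
# are suppressed.
for d in /proc/[0-9]*; do
  awk -v pid="${d#/proc/}" '
    /^Name:/                      { name = $2 }
    /^voluntary_ctxt_switches/    { v = $2 }
    /^nonvoluntary_ctxt_switches/ { nv = $2 }
    END { if (v != "") print v + nv, pid, name }
  ' "$d/status" 2>/dev/null
done | sort -rn | head -5
```

A process with a huge voluntary count is mostly waiting (locks, pipes, I/O); a huge nonvoluntary count means it keeps getting preempted, i.e. CPU contention.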
High r with high cs -> possible lock contention
Lock contention occurs whenever one process or thread attempts to acquire a lock held by another process or thread. The more granular the available locks, the less likely one process/thread will request a lock held by the other. (For example, locking a row rather than the entire table, or locking a cell rather than the entire row.)
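A toy illustration of the idea (the lock file name and loop counts are made up, and flock(1) from util-linux stands in for whatever lock the real workload uses): four processes all serialize on a single lock, so each spends most of its time blocked waiting for the others, and every block/wake cycle costs context switches rather than useful work.

```shell
#!/bin/sh
# Four contenders all take the SAME lock, so they run one at a time.
# With finer-grained locks (one per contender) they could proceed in
# parallel instead of queueing.
LOCK=/tmp/contention_demo.lock
: > "$LOCK"
for i in 1 2 3 4; do
  (
    n=0
    while [ "$n" -lt 25 ]; do
      flock "$LOCK" true   # acquire the lock, do trivial "work", release
      n=$((n + 1))
    done
  ) &
done
wait
echo "all contenders finished"
```

Watching vmstat while something like this runs shows cs climbing even though little real work gets done, which matches the high-r/high-cs pattern above.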
When you see blocked processes or high values in the waiting-on-I/O column (wa), it usually signifies either a real I/O issue, where you are waiting on file accesses, or paging I/O caused by a lack of memory on the system.
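Processes blocked on I/O show up in the "D" (uninterruptible sleep) state, which is what vmstat's b column counts. A quick way to list them (assuming a procps-style ps; on this box the [nfsiod] kernel thread would be a likely suspect if NFS were slow):

```shell
#!/bin/sh
# List processes currently in uninterruptible sleep (state D).
# These are usually stuck waiting on disk or NFS I/O.
blocked=$(ps -eo state=,pid=,comm= | awk '$1 ~ /^D/')
if [ -n "$blocked" ]; then
  echo "$blocked"
else
  echo "no blocked processes right now"
fi
```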
but I am still not able to understand the lock contention part. What should I do to resolve this issue?
|