LinuxQuestions.org (https://www.linuxquestions.org/questions/)
-   Linux - General (https://www.linuxquestions.org/questions/linux-general-1/)
-   -   Tutorial for understanding TOP Command. (https://www.linuxquestions.org/questions/linux-general-1/tutorial-for-understanding-top-command-822366/)

Raghu140 07-27-2010 01:20 AM

Tutorial for understanding TOP Command.
 
Dear All,

Can anyone suggest a good tutorial for understanding top? I have searched the net but am still not sure that I understand it. Moreover, what other commands are there to analyse memory usage at runtime so that we can detect memory leak problems? I am facing some very serious memory-related issues and am not able to detect the exact reason for them.

Please help.

Regards,
Raghu

Aquarius_Girl 07-27-2010 01:24 AM

Quote:

Originally Posted by Raghu140 (Post 4046516)
Can anyone suggest a good tutorial for understanding top? I have searched the net but am still not sure that I understand it.

Check out the following:

http://tldp.org/LDP/sag/html/system-resources.html
and
http://www.thegeekstuff.com/2010/01/...mand-examples/

Aquarius_Girl 07-27-2010 01:37 AM

Quote:

Originally Posted by Raghu140 (Post 4046516)
Moreover, what other commands are there to analyse memory usage at runtime so that we can detect memory leak problems? I am facing some very serious memory-related issues and am not able to detect the exact reason for them.

http://www.faqs.org/docs/Linux-HOWTO...ind-HOWTO.html
and
http://www.cyberciti.biz/faq/linux-check-memory-usage/
and
http://www.ibm.com/developerworks/li...brary/l-debug/

salasi 07-27-2010 06:09 AM

Quote:

Originally Posted by Raghu140 (Post 4046516)
Can anyone suggest a good tutorial for understanding top?

Top itself? You mean not atop, htop, etc.? Since no one has mentioned 'man top', I have to: you have probably read it already, but start there first.
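
For example (from memory, so double-check against your man page), a few of top's interactive keys that make the display far more useful:

Code:

top        # start top; then, inside top:
           #   M - sort processes by memory usage (%MEM)
           #   P - sort processes by CPU usage (the default)
           #   1 - show each CPU separately
           #   q - quit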

Quote:

Originally Posted by Raghu140 (Post 4046516)
Moreover what are other commands to analyse memory usgae at runtime so that we can detect memory leak problem. I am facing some very serious memory related issues and not able to detect the excat reason for it.

Again, I'll have to mention vmstat, even though I suspect that you want something different.

I think you probably should have read this before posting, and then thought about how to use tools which don't directly do exactly what you want in order to get useful information.
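
To be concrete, here is roughly how I'd use vmstat to watch for memory pressure over time (a sketch; the interval and count are up to you):

Code:

# one line of memory/swap/io/cpu statistics every 5 seconds, 10 samples
vmstat 5 10
# sustained non-zero si/so (swap-in/swap-out) columns are a far better
# sign of real memory pressure than a small "free" figure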

johnsfine 07-27-2010 07:50 AM

Quote:

Originally Posted by Raghu140 (Post 4046516)
Can anyone suggest me any good tutorial for understanding the top.

Surprisingly, with all the links people posted in this thread, no one posted this one:

http://www.linuxatemyram.com/

Quote:

analyse memory usage at runtime so that we can detect memory leak problems? I am facing some very serious memory-related issues
First read the above link. Most people who think they are seeing the symptoms of a serious memory leak are really just misinterpreting normal behavior of Linux. That link might help you understand whether the "serious memory related issues" you think you have are real.
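
As a minimal check (nothing specific to your system), the line to read in free's output on a kernel of this vintage is the "-/+ buffers/cache" one:

Code:

free -m
# on the "-/+ buffers/cache:" line, "used" is what applications really
# consume; the "free" value there includes buffer/cache memory that the
# kernel reclaims on demand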

If the problem is real, there are lots of tools for digging into the details. But none of that is simple. If you post the info that makes you believe you have a memory problem, that may make it easier for us to tell you specific tools and/or documentation to understand the problem.

Aquarius_Girl 07-27-2010 07:53 AM

Quote:

Originally Posted by johnsfine (Post 4046802)
Surprisingly, with all the links people posted in this thread, no one posted this one:

http://www.linuxatemyram.com/

That was a nice link. Thanks!

Raghu140 07-29-2010 04:49 AM

All of a sudden I get these error messages in the system log:

Jul 26 20:01:02 localhost kernel: mercd_write: unable to allocate memory 16128
Jul 26 20:01:02 localhost kernel: mercd_write: Unmatching Message Class 16128 and 52 35
Jul 26 20:01:02 localhost kernel: mercd_write: Current Message Class 0xfc0 Id 0x1
Jul 26 20:01:02 localhost kernel: mercd_write: Unmatching Message Class 16128 and 52 35

The system runs fine for three days. On the third day, when the load is at its peak, it gives the above error. I took all kinds of logs at that particular time:

top - 20:01:18 up 5 days, 10:10, 9 users, load average: 0.13, 0.03, 0.01
Tasks: 128 total, 1 running, 127 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1% us, 0.2% sy, 0.0% ni, 99.1% id, 0.6% wa, 0.0% hi, 0.0% si
Mem: 4151264k total, 4130204k used, 21060k free, 138644k buffers
Swap: 6144820k total, 4k used, 6144816k free, 3756412k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
25537 root      16   0  197m  27m 4548 S    1  0.7 10:58.00 SdpMedia
25413 root      16   0  239m  46m 6200 S    0  1.2  3:45.55 MGCv1.1
25773 root      15   0  4536 1728 1140 S    0  0.0  1:44.56 SDP_IVR
23695 root      16   0 16048  14m  14m S    0  0.4  0:08.20 rsi_lnk
 3633 root      15   0     0    0    0 S    0  0.0  0:28.33 smbiod
25762 root      16   0  4416 1524 1272 S    0  0.0  0:30.97 DB_MR
    1 root      16   0  2880  544  468 S    0  0.0  0:02.33 init


Date & Time: 2010-07-26 20:01

             total       used       free     shared    buffers     cached
Mem:          4053       4023         30          0        132       3653
-/+ buffers/cache:        237       3816
Swap:         6000          0       6000

My applications are SdpMedia, MGCv1.1, SDP_IVR and DB_MR.

Disk space is also OK. The applications show normal memory usage. I am not able to find the problem. The system hangs after some time.

Please help.

johnsfine 07-29-2010 05:52 AM

Quote:

Originally Posted by Raghu140 (Post 4048809)
localhost kernel: mercd_write: unable to allocate memory 16128

I don't know what that means. I can only explain why it doesn't mean a shortage of physical memory.

Quote:

I took all kinds of logs at this particular time:-
I assume you mean right after the failure.

There is very little free memory, so either the failing process didn't fully abort or it wasn't using much anonymous memory.

There was very little swap space used, which tends to indicate there was no recent memory pressure.

There are very high buffer and cache levels, also tending to indicate no recent memory pressure.

Top and similar tools tell you about CPU use, which is irrelevant to your problem, and about memory use which also seems to be irrelevant to the problem. So I think you need to be looking elsewhere.

Raghu140 07-29-2010 06:04 AM

My problem is that I am not able to locate the fault. The binary seems to be doing fine in terms of both memory and CPU, but the telephony card driver (mercd) says it is unable to allocate memory, while free says that enough memory is available. The system log does not say anything else. I am lost. How do I crack this problem?

johnsfine 07-29-2010 06:57 AM

Quote:

Originally Posted by Raghu140 (Post 4048874)
the telephony card driver (mercd) says it is unable to allocate memory, while free says that enough memory is available.

That error message is more likely to mean the driver is either unable to allocate kernel virtual memory or unable to allocate a particular range of low memory for some form of DMA. It does not mean a shortage of the kind of memory reported by free.

Is this a 32 bit or 64 bit system and what distribution and version of Linux is it?

In 32 bit Linux, you might be legitimately exceeding the 1GB limit on kernel virtual memory. There might also be some small resource leak in the mercd or other driver that quickly exhausts the 1GB virtual space before it even becomes obvious that there is a resource leak.

If you were using 64 bit Linux, the kernel virtual memory is nearly unlimited (you'll run out of something else before you run out of kernel virtual memory). So if it is legitimately using over 1GB of kernel virtual, 64 bit would just work. If it is leaking a kernel resource, 64 bit would delay the crash almost indefinitely and certainly long enough to make the leak obvious.

If it is really impractical to switch to 64 bit but you think kernel virtual memory is the problem, you can probably build a new 32 bit kernel using the option that gives the kernel 2GB of virtual address space.

One of your other posts said RHEL4. RHEL4 had a 32 bit kernel option for 4GB kernel virtual. If you have that kernel, then I'm pretty sure you're not exhausting kernel virtual memory. But that option is an ugly kludge and likely to trigger driver bugs that wouldn't occur in other kernels. So if you have that RHEL4 kernel, it would be better to switch to something else.
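
(As a side note, you can tell which kernel flavour you are running from uname; if I recall correctly, the RHEL4 4GB kernel virtual build shows up with a "hugemem" suffix:)

Code:

uname -r
# e.g. 2.6.9-55.ELsmp     - the ordinary SMP kernel
#      2.6.9-55.ELhugemem - the 4GB/4GB split kernel mentioned above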

Unfortunately, I don't know which tools you would use to investigate the status of kernel virtual memory. The following command gives a lot of info about kernel memory use (you should post its results), but I'm not sure it covers enough of the uses of kernel virtual memory to reveal a driver problem there.
Code:

cat /proc/slabinfo
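
You could also look at /proc/meminfo; on a 32 bit kernel it includes low-memory and vmalloc figures (field names quoted from memory, so treat this as a sketch):

Code:

# LowFree is what remains of the kernel's directly mapped low memory;
# VmallocUsed/VmallocChunk show kernel virtual (vmalloc) usage and the
# largest free block
grep -E 'LowTotal|LowFree|Vmalloc' /proc/meminfo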

Aquarius_Girl 07-29-2010 07:03 AM

Quote:

Originally Posted by johnsfine (Post 4048929)
Unfortunately, I don't know which tools you would use to investigate the status of kernel virtual memory.

Will this be of some help to him?
http://docs.sun.com/app/docs/doc/816...=en&n=1&a=view

johnsfine 07-29-2010 07:07 AM

Quote:

Originally Posted by anishakaul (Post 4048941)

I don't think so. That seems to report something about "kernel threads" and a lot about non-kernel virtual and physical memory. But it doesn't seem to report anything about kernel virtual memory.

Kernel virtual memory is just my wild guess at where the problem might be. I have nothing to support that.

But non-kernel virtual and physical memory is the topic already investigated well earlier in this thread and pretty much ruled out as a relevant factor.

Aquarius_Girl 07-29-2010 07:09 AM

Quote:

Originally Posted by johnsfine (Post 4048948)
I don't think so. That seems to report something about "kernel threads" and a lot about non-kernel virtual memory. But it doesn't seem to report anything about kernel virtual memory.

Thanks for looking!

Raghu140 07-29-2010 05:31 PM

I am using RHEL 4 update 5.
uname -a:
Linux localhost.localdomain 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux.
It's a 32 bit system.
vmstat:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 208 15956 20116 3886644 0 0 4 28 31 7 1 2 97 1

cat /proc/slabinfo:
slabinfo - version: 2.0
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
smb_request 18 30 256 15 1 : tunables 120 60 8 : slabdata 2 2
smb_inode_cache 480 517 368 11 1 : tunables 54 27 8 : slabdata 47 47
rpc_buffers 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4
rpc_tasks 8 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
rpc_inode_cache 6 7 512 7 1 : tunables 54 27 8 : slabdata 1 1
msi_cache 3 3 3840 1 1 : tunables 24 12 8 : slabdata 3 3
fib6_nodes 5 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
ip6_dst_cache 4 15 256 15 1 : tunables 120 60 8 : slabdata 1 1
ndisc_cache 1 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
rawv6_sock 6 11 704 11 2 : tunables 54 27 8 : slabdata 1 1
udpv6_sock 0 0 704 11 2 : tunables 54 27 8 : slabdata 0 0
tcpv6_sock 9 15 1216 3 1 : tunables 24 12 8 : slabdata 5 5
ip_fib_alias 10 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
ip_fib_hash 10 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
dm_tio 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
dm_io 0 0 20 185 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-(256) 0 0 3072 2 2 : tunables 24 12 8 : slabdata 0 0
dm-bvec-128 0 0 1536 5 2 : tunables 24 12 8 : slabdata 0 0
dm-bvec-64 0 0 768 5 1 : tunables 54 27 8 : slabdata 0 0
dm-bvec-16 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-4 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-1 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
dm-bio 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
ext3_inode_cache 4528 6090 552 7 1 : tunables 54 27 8 : slabdata 870 870
ext3_xattr 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
journal_handle 101 135 28 135 1 : tunables 120 60 8 : slabdata 1 1
journal_head 628 1458 48 81 1 : tunables 120 60 8 : slabdata 18 18
revoke_table 12 290 12 290 1 : tunables 120 60 8 : slabdata 1 1
revoke_record 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
scsi_cmd_cache 101 110 384 10 1 : tunables 54 27 8 : slabdata 11 11
uhci_urb_priv 0 0 44 88 1 : tunables 120 60 8 : slabdata 0 0
sgpool-128 32 33 2560 3 2 : tunables 24 12 8 : slabdata 11 11
sgpool-64 32 33 1280 3 1 : tunables 24 12 8 : slabdata 11 11
sgpool-32 35 36 640 6 1 : tunables 54 27 8 : slabdata 6 6
sgpool-16 36 36 320 12 1 : tunables 54 27 8 : slabdata 3 3
sgpool-8 177 180 192 20 1 : tunables 120 60 8 : slabdata 9 9
unix_sock 64 112 512 7 1 : tunables 54 27 8 : slabdata 16 16
ip_mrt_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
tcp_tw_bucket 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
tcp_bind_bucket 63 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
tcp_open_request 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
inet_peer_cache 2 61 64 61 1 : tunables 120 60 8 : slabdata 1 1
secpath_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
xfrm_dst_cache 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0
ip_dst_cache 18 60 256 15 1 : tunables 120 60 8 : slabdata 4 4
arp_cache 4 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
raw_sock 5 7 576 7 1 : tunables 54 27 8 : slabdata 1 1
udp_sock 20 28 576 7 1 : tunables 54 27 8 : slabdata 4 4
tcp_sock 97 98 1152 7 2 : tunables 24 12 8 : slabdata 14 14
flow_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
mqueue_inode_cache 1 7 576 7 1 : tunables 54 27 8 : slabdata 1 1
relayfs_inode_cache 0 0 348 11 1 : tunables 54 27 8 : slabdata 0
isofs_inode_cache 0 0 372 10 1 : tunables 54 27 8 : slabdata 0 0
hugetlbfs_inode_cache 1 11 344 11 1 : tunables 54 27 8 : slabdata 1
ext2_inode_cache 0 0 488 8 1 : tunables 54 27 8 : slabdata 0 0
ext2_xattr 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
dquot 0 0 144 27 1 : tunables 120 60 8 : slabdata 0 0
eventpoll_pwq 1 107 36 107 1 : tunables 120 60 8 : slabdata 1 1
eventpoll_epi 1 31 128 31 1 : tunables 120 60 8 : slabdata 1 1
kioctx 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0
kiocb 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
dnotify_cache 2 185 20 185 1 : tunables 120 60 8 : slabdata 1 1
fasync_cache 1 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
shmem_inode_cache 338 351 444 9 1 : tunables 54 27 8 : slabdata 39 39
posix_timers_cache 0 0 112 35 1 : tunables 120 60 8 : slabdata 0 0
uid_cache 5 61 64 61 1 : tunables 120 60 8 : slabdata 1 1
cfq_pool 107 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
crq_pool 192 192 40 96 1 : tunables 120 60 8 : slabdata 2 2
deadline_drq 0 0 52 75 1 : tunables 120 60 8 : slabdata 0 0
as_arq 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0
blkdev_ioc 49 185 20 185 1 : tunables 120 60 8 : slabdata 1 1
blkdev_queue 20 32 488 8 1 : tunables 54 27 8 : slabdata 4 4
blkdev_requests 200 200 160 25 1 : tunables 120 60 8 : slabdata 8 8
biovec-(256) 256 256 3072 2 2 : tunables 24 12 8 : slabdata 128 128
biovec-128 256 260 1536 5 2 : tunables 24 12 8 : slabdata 52 52
biovec-64 256 260 768 5 1 : tunables 54 27 8 : slabdata 52 52
biovec-16 256 260 192 20 1 : tunables 120 60 8 : slabdata 13 13
biovec-4 256 305 64 61 1 : tunables 120 60 8 : slabdata 5 5
biovec-1 462 904 16 226 1 : tunables 120 60 8 : slabdata 4 4
bio 440 527 128 31 1 : tunables 120 60 8 : slabdata 17 17
file_lock_cache 8 82 96 41 1 : tunables 120 60 8 : slabdata 2 2
sock_inode_cache 202 234 448 9 1 : tunables 54 27 8 : slabdata 26 26
skbuff_head_cache 538 920 192 20 1 : tunables 120 60 8 : slabdata 46 46
sock 11 30 384 10 1 : tunables 54 27 8 : slabdata 3 3
proc_inode_cache 769 1133 360 11 1 : tunables 54 27 8 : slabdata 103 103
sigqueue 9 54 148 27 1 : tunables 120 60 8 : slabdata 2 2
radix_tree_node 21294 24388 276 14 1 : tunables 54 27 8 : slabdata 1742 1742
bdev_cache 40 42 512 7 1 : tunables 54 27 8 : slabdata 6 6
mnt_cache 34 62 128 31 1 : tunables 120 60 8 : slabdata 2 2
audit_watch_cache 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
inode_cache 845 1199 344 11 1 : tunables 54 27 8 : slabdata 109 109
dentry_cache 5149 20332 152 26 1 : tunables 120 60 8 : slabdata 782 782
filp 1763 2080 192 20 1 : tunables 120 60 8 : slabdata 104 104
names_cache 47 47 4096 1 1 : tunables 24 12 8 : slabdata 47 47
avc_node 12 600 52 75 1 : tunables 120 60 8 : slabdata 8 8
key_jar 10 31 128 31 1 : tunables 120 60 8 : slabdata 1 1
idr_layer_cache 84 116 136 29 1 : tunables 120 60 8 : slabdata 4 4
buffer_head 703043 902925 52 75 1 : tunables 120 60 8 : slabdata 12039 12039
mm_struct 90 231 704 11 2 : tunables 54 27 8 : slabdata 21 21
vm_area_struct 3969 4815 88 45 1 : tunables 120 60 8 : slabdata 107 107
fs_cache 92 427 64 61 1 : tunables 120 60 8 : slabdata 7 7
files_cache 93 225 448 9 1 : tunables 54 27 8 : slabdata 25 25
signal_cache 145 440 192 20 1 : tunables 120 60 8 : slabdata 22 22
sighand_cache 155 183 1344 3 1 : tunables 24 12 8 : slabdata 61 61
task_struct 309 330 1408 5 2 : tunables 24 12 8 : slabdata 66 66
anon_vma 1513 2034 16 226 1 : tunables 120 60 8 : slabdata 9 9
pgd 90 476 32 119 1 : tunables 120 60 8 : slabdata 4 4
pmd 278 296 4096 1 1 : tunables 24 12 8 : slabdata 278 296
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0
size-131072 1 1 131072 1 32 : tunables 8 4 0 : slabdata 1 1
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0
size-65536 3 3 65536 1 16 : tunables 8 4 0 : slabdata 3 3
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0
size-32768 7 7 32768 1 8 : tunables 8 4 0 : slabdata 7 7
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0
size-16384 12 12 16384 1 4 : tunables 8 4 0 : slabdata 12 12
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0
size-8192 24 27 8192 1 2 : tunables 8 4 0 : slabdata 24 27
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8 : slabdata 0 0
size-4096 776 776 4096 1 1 : tunables 24 12 8 : slabdata 776 776
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8 : slabdata 0 0
size-2048 140 140 2048 2 1 : tun
size-1620(DMA) 0 0 1664 4 2 : tun
size-1620 32 32 1664 4 2 : tun
size-1024(DMA) 0 0 1024 4 1 : tun
size-1024 331 356 1024 4 1 : tun
size-512(DMA) 0 0 512 8 1 : tun
size-512 910 2472 512 8 1 : tun
size-256(DMA) 0 0 256 15 1 : tun
size-256 708 1920 256 15 1 : tun
size-128(DMA) 0 0 128 31 1 : tun
size-128 2048 4712 128 31 1 : tun
size-64(DMA) 0 0 64 61 1 : tun
size-64 7998 11468 64 61 1 : tun
size-32(DMA) 0 0 32 119 1 : tun
size-32 13668 18921 32 119 1 : tun
kmem_cache 165 165 256 15 1 : tun

The problem occurs after three days of continuous running, and I have observed that the time of occurrence is the same each time. When the load suddenly increases and the telephony card plays the wave file under high load (which happens on the third day after a restart), it runs into the problem mentioned above: it fails to play some requests while still playing others successfully, and the number of failures keeps increasing. Finally the system hangs and a kernel panic message is generated. After a restart it works fine for the next two days, and the problem repeats on the third day when the load suddenly increases. As per the card's capacity, the peak call load is only 30-50%, which should be OK. If we restart only the card driver (without a system restart), the system again runs fine for a day.

How can I see the total size of kernel memory available?

How can I increase the kernel memory size?

Will migrating to a higher version of RHEL (5.3, etc.) help?

Aquarius_Girl 07-30-2010 12:23 AM

Raghu,

Kindly put your output in code tags; it will be easier for others to read:
http://www.linuxquestions.org/questi...do=bbcode#code

