Tutorial for understanding the top command
Dear All,
Can anyone suggest a good tutorial for understanding top? I have searched the net but am still not sure I understand it. Also, what other commands can analyse memory usage at runtime so that we can detect a memory leak? I am facing some very serious memory-related issues and am not able to detect the exact reason. Please help. Regards, Raghu
Quote:
http://tldp.org/LDP/sag/html/system-resources.html and http://www.thegeekstuff.com/2010/01/...mand-examples/
Quote:
and http://www.cyberciti.biz/faq/linux-check-memory-usage/ and http://www.ibm.com/developerworks/li...brary/l-debug/
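Beyond the tutorials, the standard tools already on the box go a long way. A minimal sketch (the function name and snapshot size are just examples, not from any of the linked pages):

```shell
# Sketch: one timestamped snapshot of the largest resident-memory
# processes. Run it periodically (e.g. from cron) and compare RSS
# across runs -- a process whose RSS only ever grows is a leak suspect.
mem_snapshot() {
    date
    ps aux --sort=-rss | head -n "${1:-10}"   # largest RSS first
}

mem_snapshot 10
```

Pair it with `free -m` and `vmstat 5` for the system-wide view alongside the per-process numbers.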
Quote:
I think you probably should have read this before posting, and then should be thinking about how to use the tools which don't directly do what exactly you want, to get useful information.
Quote:
http://www.linuxatemyram.com/
Quote:
If the problem is real, there are lots of tools for digging into the details. But none of that is simple. If you post the info that makes you believe you have a memory problem, that may make it easier for us to tell you specific tools and/or documentation to understand the problem.
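To make that page's point concrete: with the old six-column procps `free` layout (total/used/free/shared/buffers/cached, as on RHEL4), memory actually available to applications is free plus buffers plus cached. A sketch assuming that layout (newer procps-ng prints an "available" column instead, which should be used directly; the numbers below are made-up example values):

```shell
# Sum the reclaimable columns from an old-style `free` Mem: line read
# on stdin: free ($4) + buffers ($6) + cached ($7).
mem_available() {
    awk '/^Mem:/ { print $4 + $6 + $7 }'
}

# Example with the six-column layout this sketch assumes:
echo 'Mem: 4053 4023 30 0 132 3653' | mem_available   # prints 3815
```

A box can show almost no "free" memory and still be perfectly healthy, because buffers and cache are given back on demand.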
Quote:
All of a sudden I get this error message in the var log:
Code:
Jul 26 20:01:02 localhost kernel: mercd_write: unable to allocate memory 16128
Jul 26 20:01:02 localhost kernel: mercd_write: Unmatching Message Class 16128 and 52 35
Jul 26 20:01:02 localhost kernel: mercd_write: Current Message Class 0xfc0 Id 0x1
Jul 26 20:01:02 localhost kernel: mercd_write: Unmatching Message Class 16128 and 52 35
The system runs fine for 3 days. On the 3rd day, when the load is at its peak, it gives the above error. I took all kinds of logs at this particular time:
Code:
top - 20:01:18 up 5 days, 10:10,  9 users,  load average: 0.13, 0.03, 0.01
Tasks: 128 total,   1 running, 127 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.1% us,  0.2% sy,  0.0% ni, 99.1% id,  0.6% wa,  0.0% hi,  0.0% si
Mem:   4151264k total,  4130204k used,    21060k free,   138644k buffers
Swap:  6144820k total,        4k used,  6144816k free,  3756412k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
25537 root      16   0  197m  27m 4548 S    1  0.7  10:58.00 SdpMedia
25413 root      16   0  239m  46m 6200 S    0  1.2   3:45.55 MGCv1.1
25773 root      15   0  4536 1728 1140 S    0  0.0   1:44.56 SDP_IVR
23695 root      16   0 16048  14m  14m S    0  0.4   0:08.20 rsi_lnk
 3633 root      15   0     0    0    0 S    0  0.0   0:28.33 smbiod
25762 root      16   0  4416 1524 1272 S    0  0.0   0:30.97 DB_MR
    1 root      16   0  2880  544  468 S    0  0.0   0:02.33 init
Date & Time: 2010-07-26 20:01
Code:
             total       used       free     shared    buffers     cached
Mem:          4053       4023         30          0        132       3653
-/+ buffers/cache:        237       3816
Swap:         6000          0       6000
My applications are SdpMedia, MGCv1.1, SDP_IVR and DB_MR. Disk space is also OK. The applications show normal memory usage. I am not able to find the problem. The system hangs after some time. Please help.
Quote:
There is very little free memory, so either the failing process didn't fully abort or it wasn't using much anonymous memory. There was very little swap space used, which tends to indicate there was no recent memory pressure. There are very high buffer and cache levels, also tending to indicate no recent memory pressure. Top and similar tools tell you about CPU use, which is irrelevant to your problem, and about memory use, which also seems to be irrelevant to the problem. So I think you need to be looking elsewhere.
My problem is that I am not able to locate the fault. The binary seems to be doing fine both in terms of memory and CPU, but the telephony card driver (mercd) says it is unable to allocate memory. free says that enough memory is available, and the var log messages do not say anything else. I am lost. How do I crack this problem?
Quote:
Is this a 32-bit or 64-bit system, and what distribution and version of Linux is it? In 32-bit Linux, you might be legitimately exceeding the 1GB limit on kernel virtual memory. There might also be some small resource leak in the mercd or another driver that quickly exhausts the 1GB virtual space before it even becomes obvious that there is a resource leak. If you were using 64-bit Linux, kernel virtual memory is nearly unlimited (you'll run out of something else before you run out of kernel virtual memory). So if it is legitimately using over 1GB of kernel virtual memory, 64-bit would just work. If it is leaking a kernel resource, 64-bit would delay the crash almost indefinitely, and certainly long enough to make the leak obvious. If it is really impractical to switch to 64-bit but you think kernel virtual memory is the problem, you can probably build a new 32-bit kernel using the option that gives the kernel 2GB virtual. One of your other posts said RHEL4. RHEL4 had a 32-bit kernel option for 4GB kernel virtual. If you have that kernel, then I'm pretty sure you're not exhausting kernel virtual memory. But that option is an ugly kludge and likely to trigger driver bugs that wouldn't occur in other kernels, so if you have that RHEL4 kernel, it would be better to switch to something else. Unfortunately, I don't know which tools you would use to investigate the status of kernel virtual memory. The following command gives a lot of info about kernel memory use (you should post its results), but I'm not sure it covers enough uses of kernel virtual memory to see a driver problem with kernel virtual memory. Code:
cat /proc/slabinfo
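The raw output is hard to eyeball, so one way to digest it, sketched below (field positions follow the slabinfo v2 header; the function name and defaults are just examples): multiply each cache's total object count by its object size to rank caches by memory held.

```shell
# Rank slab caches by approximate bytes held (num_objs * objsize).
# slabinfo v2 columns: name active_objs num_objs objsize objperslab ...
slab_usage() {
    awk '!/^(slabinfo|#)/ { printf "%12d  %s\n", $3 * $4, $1 }' \
        "${1:-/proc/slabinfo}" | sort -rn | head -n "${2:-15}"
}
```

Sampling this a few times a day and diffing the top entries shows whether some cache grows without bound between restarts.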
Quote:
http://docs.sun.com/app/docs/doc/816...=en&n=1&a=view
Quote:
Kernel virtual memory is just my wild guess at where the problem might be. I have nothing to support that. But non-kernel virtual memory and physical memory are topics already investigated well earlier in this thread and pretty much ruled out as relevant factors.
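A guess like that can still be tested cheaply. A sketch (the log path is arbitrary): take timestamped /proc/slabinfo samples over the three-day run-up to the failure and compare them.

```shell
# Append one timestamped slabinfo sample to a log file; schedule it
# from cron (e.g. */10 * * * *) so growth over days becomes visible.
slab_sample() {
    {
        date
        cat /proc/slabinfo 2>/dev/null   # may need root to read
    } >> "${1:-/tmp/slab.log}"
}

slab_sample /tmp/slab.log
```

If some cache climbs steadily right up to the crash, that names the leaking subsystem; if nothing grows, the kernel-memory theory weakens.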
Quote:
I am using RHEL 4 update 5.
uname -a:
Code:
Linux localhost.localdomain 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
It's a 32-bit system.
vmstat:
Code:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
 0  0    208  15956  20116 3886644    0    0     4    28   31     7  1  2 97  1
cat /proc/slabinfo:
Code:
slabinfo - version: 2.0
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
smb_request 18 30 256 15 1 : tunables 120 60 8 : slabdata 2 2
smb_inode_cache 480 517 368 11 1 : tunables 54 27 8 : slabdata 47 47
rpc_buffers 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4
rpc_tasks 8 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
rpc_inode_cache 6 7 512 7 1 : tunables 54 27 8 : slabdata 1 1
msi_cache 3 3 3840 1 1 : tunables 24 12 8 : slabdata 3 3
fib6_nodes 5 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
ip6_dst_cache 4 15 256 15 1 : tunables 120 60 8 : slabdata 1 1
ndisc_cache 1 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
rawv6_sock 6 11 704 11 2 : tunables 54 27 8 : slabdata 1 1
udpv6_sock 0 0 704 11 2 : tunables 54 27 8 : slabdata 0 0
tcpv6_sock 9 15 1216 3 1 : tunables 24 12 8 : slabdata 5 5
ip_fib_alias 10 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
ip_fib_hash 10 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
dm_tio 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
dm_io 0 0 20 185 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-(256) 0 0 3072 2 2 : tunables 24 12 8 : slabdata 0 0
dm-bvec-128 0 0 1536 5 2 : tunables 24 12 8 : slabdata 0 0
dm-bvec-64 0 0 768 5 1 : tunables 54 27 8 : slabdata 0 0
dm-bvec-16 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-4 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0
dm-bvec-1 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
dm-bio 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
ext3_inode_cache 4528 6090 552 7 1 : tunables 54 27 8 : slabdata 870 870
ext3_xattr 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
journal_handle 101 135 28 135 1 : tunables 120 60 8 : slabdata 1 1
journal_head 628 1458 48 81 1 : tunables 120 60 8 : slabdata 18 18
revoke_table 12 290 12 290 1 : tunables 120 60 8 : slabdata 1 1
revoke_record 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0
scsi_cmd_cache 101 110 384 10 1 : tunables 54 27 8 : slabdata 11 11
uhci_urb_priv 0 0 44 88 1 : tunables 120 60 8 : slabdata 0 0
sgpool-128 32 33 2560 3 2 : tunables 24 12 8 : slabdata 11 11
sgpool-64 32 33 1280 3 1 : tunables 24 12 8 : slabdata 11 11
sgpool-32 35 36 640 6 1 : tunables 54 27 8 : slabdata 6 6
sgpool-16 36 36 320 12 1 : tunables 54 27 8 : slabdata 3 3
sgpool-8 177 180 192 20 1 : tunables 120 60 8 : slabdata 9 9
unix_sock 64 112 512 7 1 : tunables 54 27 8 : slabdata 16 16
ip_mrt_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
tcp_tw_bucket 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
tcp_bind_bucket 63 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
tcp_open_request 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
inet_peer_cache 2 61 64 61 1 : tunables 120 60 8 : slabdata 1 1
secpath_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
xfrm_dst_cache 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0
ip_dst_cache 18 60 256 15 1 : tunables 120 60 8 : slabdata 4 4
arp_cache 4 20 192 20 1 : tunables 120 60 8 : slabdata 1 1
raw_sock 5 7 576 7 1 : tunables 54 27 8 : slabdata 1 1
udp_sock 20 28 576 7 1 : tunables 54 27 8 : slabdata 4 4
tcp_sock 97 98 1152 7 2 : tunables 24 12 8 : slabdata 14 14
flow_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
mqueue_inode_cache 1 7 576 7 1 : tunables 54 27 8 : slabdata 1 1
relayfs_inode_cache 0 0 348 11 1 : tunables 54 27 8 : slabdata 0
isofs_inode_cache 0 0 372 10 1 : tunables 54 27 8 : slabdata 0 0
hugetlbfs_inode_cache 1 11 344 11 1 : tunables 54 27 8 : slabdata 1
ext2_inode_cache 0 0 488 8 1 : tunables 54 27 8 : slabdata 0 0
ext2_xattr 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
dquot 0 0 144 27 1 : tunables 120 60 8 : slabdata 0 0
eventpoll_pwq 1 107 36 107 1 : tunables 120 60 8 : slabdata 1 1
eventpoll_epi 1 31 128 31 1 : tunables 120 60 8 : slabdata 1 1
kioctx 0 0 192 20 1 : tunables 120 60 8 : slabdata 0 0
kiocb 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0
dnotify_cache 2 185 20 185 1 : tunables 120 60 8 : slabdata 1 1
fasync_cache 1 226 16 226 1 : tunables 120 60 8 : slabdata 1 1
shmem_inode_cache 338 351 444 9 1 : tunables 54 27 8 : slabdata 39 39
posix_timers_cache 0 0 112 35 1 : tunables 120 60 8 : slabdata 0 0
uid_cache 5 61 64 61 1 : tunables 120 60 8 : slabdata 1 1
cfq_pool 107 119 32 119 1 : tunables 120 60 8 : slabdata 1 1
crq_pool 192 192 40 96 1 : tunables 120 60 8 : slabdata 2 2
deadline_drq 0 0 52 75 1 : tunables 120 60 8 : slabdata 0 0
as_arq 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0
blkdev_ioc 49 185 20 185 1 : tunables 120 60 8 : slabdata 1 1
blkdev_queue 20 32 488 8 1 : tunables 54 27 8 : slabdata 4 4
blkdev_requests 200 200 160 25 1 : tunables 120 60 8 : slabdata 8 8
biovec-(256) 256 256 3072 2 2 : tunables 24 12 8 : slabdata 128 128
biovec-128 256 260 1536 5 2 : tunables 24 12 8 : slabdata 52 52
biovec-64 256 260 768 5 1 : tunables 54 27 8 : slabdata 52 52
biovec-16 256 260 192 20 1 : tunables 120 60 8 : slabdata 13 13
biovec-4 256 305 64 61 1 : tunables 120 60 8 : slabdata 5 5
biovec-1 462 904 16 226 1 : tunables 120 60 8 : slabdata 4 4
bio 440 527 128 31 1 : tunables 120 60 8 : slabdata 17 17
file_lock_cache 8 82 96 41 1 : tunables 120 60 8 : slabdata 2 2
sock_inode_cache 202 234 448 9 1 : tunables 54 27 8 : slabdata 26 26
skbuff_head_cache 538 920 192 20 1 : tunables 120 60 8 : slabdata 46 46
sock 11 30 384 10 1 : tunables 54 27 8 : slabdata 3 3
proc_inode_cache 769 1133 360 11 1 : tunables 54 27 8 : slabdata 103 103
sigqueue 9 54 148 27 1 : tunables 120 60 8 : slabdata 2 2
radix_tree_node 21294 24388 276 14 1 : tunables 54 27 8 : slabdata 1742 1742
bdev_cache 40 42 512 7 1 : tunables 54 27 8 : slabdata 6 6
mnt_cache 34 62 128 31 1 : tunables 120 60 8 : slabdata 2 2
audit_watch_cache 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0
inode_cache 845 1199 344 11 1 : tunables 54 27 8 : slabdata 109 109
dentry_cache 5149 20332 152 26 1 : tunables 120 60 8 : slabdata 782 782
filp 1763 2080 192 20 1 : tunables 120 60 8 : slabdata 104 104
names_cache 47 47 4096 1 1 : tunables 24 12 8 : slabdata 47 47
avc_node 12 600 52 75 1 : tunables 120 60 8 : slabdata 8 8
key_jar 10 31 128 31 1 : tunables 120 60 8 : slabdata 1 1
idr_layer_cache 84 116 136 29 1 : tunables 120 60 8 : slabdata 4 4
buffer_head 703043 902925 52 75 1 : tunables 120 60 8 : slabdata 12039 12039
mm_struct 90 231 704 11 2 : tunables 54 27 8 : slabdata 21 21
vm_area_struct 3969 4815 88 45 1 : tunables 120 60 8 : slabdata 107 107
fs_cache 92 427 64 61 1 : tunables 120 60 8 : slabdata 7 7
files_cache 93 225 448 9 1 : tunables 54 27 8 : slabdata 25 25
signal_cache 145 440 192 20 1 : tunables 120 60 8 : slabdata 22 22
sighand_cache 155 183 1344 3 1 : tunables 24 12 8 : slabdata 61 61
task_struct 309 330 1408 5 2 : tunables 24 12 8 : slabdata 66 66
anon_vma 1513 2034 16 226 1 : tunables 120 60 8 : slabdata 9 9
pgd 90 476 32 119 1 : tunables 120 60 8 : slabdata 4 4
pmd 278 296 4096 1 1 : tunables 24 12 8 : slabdata 278 296
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0
size-131072 1 1 131072 1 32 : tunables 8 4 0 : slabdata 1 1
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0
size-65536 3 3 65536 1 16 : tunables 8 4 0 : slabdata 3 3
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0
size-32768 7 7 32768 1 8 : tunables 8 4 0 : slabdata 7 7
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0
size-16384 12 12 16384 1 4 : tunables 8 4 0 : slabdata 12 12
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0
size-8192 24 27 8192 1 2 : tunables 8 4 0 : slabdata 24 27
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8 : slabdata 0 0
size-4096 776 776 4096 1 1 : tunables 24 12 8 : slabdata 776 776
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8 : slabdata 0 0
size-2048 140 140 2048 2 1 : tun
size-1620(DMA) 0 0 1664 4 2 : tun
size-1620 32 32 1664 4 2 : tun
size-1024(DMA) 0 0 1024 4 1 : tun
size-1024 331 356 1024 4 1 : tun
size-512(DMA) 0 0 512 8 1 : tun
size-512 910 2472 512 8 1 : tun
size-256(DMA) 0 0 256 15 1 : tun
size-256 708 1920 256 15 1 : tun
size-128(DMA) 0 0 128 31 1 : tun
size-128 2048 4712 128 31 1 : tun
size-64(DMA) 0 0 64 61 1 : tun
size-64 7998 11468 64 61 1 : tun
size-32(DMA) 0 0 32 119 1 : tun
size-32 13668 18921 32 119 1 : tun
kmem_cache 165 165 256 15 1 : tun
The problem occurs after three days of continuous running, and the time of occurrence is the same each time: when the load suddenly increases and the telephony card plays the wave file under high load (this happens on the third day after a restart), it hits the mentioned problem. When the problem starts, it fails to play some requests while still successfully playing others; the number of failures then keeps increasing. Finally the system hangs and a kernel panic message is generated. We then restart, it works fine for the next two days, and the problem repeats on the third day when the load suddenly increases. As per the card capacity, peak call load is only 30-50%, which is OK. If we restart only the card driver (without a system restart), the system also runs fine for a day. How do I see the total size of kernel memory available? How do I increase the kernel memory size? Will migrating to a higher version of RHEL (5.3 etc.) help?
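On the "how much kernel memory is available" question: on a 32-bit highmem kernel, /proc/meminfo exposes the kernel's low-memory pool directly. A sketch (the function name is just an example; the Low* fields appear only on highmem kernels, i.e. typical 32-bit builds):

```shell
# Print the /proc/meminfo fields relevant to kernel-memory exhaustion
# on 32-bit: LowFree trending toward zero under load is the classic
# symptom. On 64-bit kernels only MemTotal and Slab will match.
kernel_mem() {
    grep -E '^(MemTotal|LowTotal|LowFree|Slab):' /proc/meminfo
}

kernel_mem
```

As noted earlier in the thread, RHEL4's hugemem kernel raises the kernel's share of address space, but a 64-bit kernel removes the limit cleanly.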
Raghu,
Kindly put your output in code tags; it will be easier for others to read: http://www.linuxquestions.org/questi...do=bbcode#code