Memory leak?
Hiya,
I have a problem with my Linux server (Slackware 10.2.0, kernel 2.4.31), and I think it is a memory leak. When I run the program free, I get this result:
Code:
root@home:/var/log# free
             total       used       free     shared    buffers     cached
Mem:        904496     836748      67748          0     235344     333820
-/+ buffers/cache:     267584     636912
Swap:      2000084      10404    1989680
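As a rough check of the numbers above, here is a minimal Python sketch of the arithmetic behind the two "used" figures. The values are copied by hand from the free output (in KiB); the variable names and the percent() helper are only illustrative. It assumes the "-/+ buffers/cache" row is simply used minus buffers minus cached, which matches the figures shown. Whether that buffers/cache portion is really reclaimable under pressure on this kernel is a separate question that the sketch does not answer.

```python
# Values copied from the `free` output above (KiB).
total = 904496
used = 836748
buffers = 235344
cached = 333820


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Memory in use once buffers and page cache are subtracted;
# this should match the "-/+ buffers/cache" row.
used_by_apps = used - buffers - cached

print("used incl. buffers/cache: %d KiB (%.1f%%)" % (used, percent(used, total)))
print("used excl. buffers/cache: %d KiB (%.1f%%)" % (used_by_apps, percent(used_by_apps, total)))
```

The first figure is the one the "Mem:" row reports; the second counts only what is left after buffers and cache are taken out.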
So it's using around 90% of my memory (most days it's 98% to 99%). Yesterday a lot of processes were killed by the kernel, I guess because memory was full. I checked some log files and found this:
syslog:
Code:
Jan 31 18:03:33 hablserv sshd[3040]: error: Could not get shadow information for NOUSER
Jan 31 18:03:35 hablserv sshd[3042]: error: Could not get shadow information for NOUSER
Jan 31 18:03:47 hablserv kernel: VM: killing process sshd
Jan 31 18:03:50 hablserv kernel: VM: killing process proftpd
Jan 31 18:04:00 hablserv kernel: VM: killing process eggdrop
Jan 31 18:04:12 hablserv kernel: VM: killing process mysqld
Jan 31 18:04:48 hablserv kernel: VM: killing process httpd
Jan 31 18:05:39 hablserv kernel: VM: killing process smbd
Jan 31 18:06:10 hablserv kernel: VM: killing process local
Jan 31 18:06:20 hablserv kernel: VM: killing process irssi
Jan 31 18:07:01 hablserv kernel: VM: killing process eggdrop
Jan 31 18:15:09 hablserv kernel: VM: killing process sshd
Jan 31 18:15:39 hablserv kernel: VM: killing process local
Jan 31 18:16:43 hablserv kernel: VM: killing process nmbd
Jan 31 18:17:27 hablserv kernel: VM: killing process eggdrop
Jan 31 18:17:27 hablserv kernel: VM: killing process httpd
Jan 31 18:17:27 hablserv kernel: VM: killing process psybnc
Jan 31 18:17:49 hablserv kernel: VM: killing process httpd
Jan 31 18:18:11 hablserv kernel: VM: killing process mysqld
Jan 31 18:18:16 hablserv kernel: VM: killing process ircd
Jan 31 18:18:31 hablserv kernel: VM: killing process local
Jan 31 18:18:48 hablserv kernel: VM: killing process pipe
Jan 31 18:18:48 hablserv kernel: VM: killing process stats
Jan 31 18:19:05 hablserv kernel: VM: killing process flush
Jan 31 18:19:07 hablserv kernel: VM: killing process eggdrop
Jan 31 18:19:10 hablserv kernel: VM: killing process trivial-rewrite
Jan 31 18:19:13 hablserv kernel: VM: killing process bot.pl
Jan 31 18:19:22 hablserv kernel: VM: killing process master
Jan 31 18:19:26 hablserv kernel: VM: killing process eggdrop
Jan 31 18:19:44 hablserv kernel: VM: killing process eggdrop
And here is the messages log:
Code:
Jan 31 18:03:35 hablserv sshd[3042]: Invalid user windows from 65.160.227.136
Jan 31 18:03:36 hablserv sshd[3042]: Failed password for invalid user windows from 65.160.227.136 port 42661 ssh2
Jan 31 18:03:46 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:04:35 hablserv last message repeated 8 times
Jan 31 18:04:35 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:04:37 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:04:47 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:05:35 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:05:39 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:06:13 hablserv last message repeated 2 times
Jan 31 18:15:08 hablserv last message repeated 8 times
Jan 31 18:16:28 hablserv last message repeated 2 times
Jan 31 18:17:09 hablserv last message repeated 5 times
Jan 31 18:17:03 hablserv sshd[3064]: Did not receive identification string from 217.120.219.224
Jan 31 18:17:11 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:17:37 hablserv last message repeated 7 times
Jan 31 18:17:39 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:17:39 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:18:11 hablserv last message repeated 4 times
Jan 31 18:19:13 hablserv last message repeated 18 times
Jan 31 18:19:19 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:19:20 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:19:22 hablserv last message repeated 2 times
Jan 31 18:19:25 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
Jan 31 18:19:26 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:19:44 hablserv last message repeated 2 times
Jan 31 18:19:44 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 31 18:19:44 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
Jan 31 18:19:44 hablserv kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
The last thing that happened was some kid trying to log in to my server, without any success. I didn't paste all the lines, as there are a lot of them. I'm not sure whether this has anything to do with it, because people hammer my sshd a lot. It has never caused a problem and they have never gotten in.
I think it's just a process with a memory leak or something, and of course I would like to know which one it is. Unfortunately I can't find it. After the processes were killed, I started them one by one and checked my memory after each one, but everything looked normal. After approximately 1 hour, my memory was full again.
I have tried tools like ps, top and slabtop (output below), but I still haven't found the process which causes this problem.
Does anybody have suggestions for what else I could try?
Here are the outputs of slabtop and ps. I shortened the outputs, by the way; if necessary I can post the complete results.
slabtop output:
Code:
Active / Total Objects (% used) : 909093 / 913799 (99.5%)
Active / Total Slabs (% used) : 54653 / 54748 (99.8%)
Active / Total Caches (% used) : 46 / 65 (70.8%)
Active / Total Size (% used) : 200557.89K / 201070.15K (99.7%)
Minimum / Average / Maximum Object : 0.01K / 0.22K / 128.00K
  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
319504 319448  99%    0.45K  39938        8    159752K inode_cache
325010 324962  99%    0.11K   9286       36     37144K dentry_cache
142320 142185  99%    0.09K   3558       42     14232K buffer_head
 88931  88832  99%    0.03K    787      128      3148K size-32
 19411  19391  99%    0.06K    329       64      1316K size-64
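To see how the listed slab caches compare with each other, here is a small Python sketch that totals the CACHE SIZE column of the rows above. The rows are pasted verbatim; SLABTOP_ROWS, parse_row() and the other names are just illustrative, and the parsing assumes the cache name is the last field and the cache size (with a trailing "K") is the one before it. Only the five caches shown here are counted, so the total is not the full slab usage.

```python
# Rows copied verbatim from the shortened slabtop output above.
SLABTOP_ROWS = """\
319504 319448 99% 0.45K 39938 8 159752K inode_cache
325010 324962 99% 0.11K 9286 36 37144K dentry_cache
142320 142185 99% 0.09K 3558 42 14232K buffer_head
88931 88832 99% 0.03K 787 128 3148K size-32
19411 19391 99% 0.06K 329 64 1316K size-64
"""


def parse_row(line):
    """Return (cache_name, cache_size_in_K) for one slabtop row."""
    fields = line.split()
    return fields[-1], int(fields[-2].rstrip("K"))


caches = [parse_row(line) for line in SLABTOP_ROWS.splitlines() if line.strip()]
total_kb = sum(size for _, size in caches)

# Largest cache first, with its share of the listed total.
for name, size in sorted(caches, key=lambda c: c[1], reverse=True):
    print("%-14s %8dK  %5.1f%% of listed caches" % (name, size, 100.0 * size / total_kb))
print("%-14s %8dK" % ("total listed", total_kb))
```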
ps output:
Code:
root 20324 6928 /usr/sbin/smbd -D 0.3 2736
root 20321 6928 /usr/sbin/smbd -D 0.3 2760
dennis 20698 4160 ./eggdrop 0.3 2836
dennis 20708 6172 ./stats 0.3 3016
hans 20459 6892 irssi 0.3 3528
dennis 20702 5220 ./eggdrop 0.4 3908
dennis 20689 5316 ./eggdrop 0.4 3948
root 20106 79624 /usr/sbin/httpd 0.5 5376
nobody 20357 79784 /usr/sbin/httpd 0.6 5528
nobody 20107 79784 /usr/sbin/httpd 0.6 5540
nobody 20109 79784 /usr/sbin/httpd 0.6 5540
nobody 20110 79784 /usr/sbin/httpd 0.6 5568
hans 20131 8932 /usr/bin/perl -w? ./bot.pl 0.6 5812
nobody 20108 82344 /usr/sbin/httpd 0.9 8712
nobody 20111 82344 /usr/sbin/httpd 0.9 8716
nobody 20358 82344 /usr/sbin/httpd 0.9 8716
nobody 20359 82344 /usr/sbin/httpd 0.9 8720
mysql 20094 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20095 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20096 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20097 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20098 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20099 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20100 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20101 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20102 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20103 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20709 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20726 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20731 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20733 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20734 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20736 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20739 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20741 51656 /usr/libexec/mysqld --based 1.7 15948
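For comparing processes, here is a hedged Python sketch that groups ps rows by command. It assumes the columns above are user, PID, virtual size, command, %MEM and resident size, in that order (the column meaning is my assumption, since the header is not shown), and it uses only a handful of rows copied from the listing. PS_ROWS, parse_row() and summary are illustrative names. It reports the largest resident size per command rather than a sum, because the identical mysqld entries look like threads sharing one address space, and adding them up would overcount.

```python
from collections import defaultdict

# A few rows copied verbatim from the ps listing above.
PS_ROWS = """\
root 20324 6928 /usr/sbin/smbd -D 0.3 2736
dennis 20702 5220 ./eggdrop 0.4 3908
nobody 20359 82344 /usr/sbin/httpd 0.9 8720
nobody 20108 82344 /usr/sbin/httpd 0.9 8712
mysql 20094 51656 /usr/libexec/mysqld --based 1.7 15948
mysql 20095 51656 /usr/libexec/mysqld --based 1.7 15948
"""


def parse_row(line):
    """Split one row; the command may contain spaces, so take it from the middle."""
    fields = line.split()
    user, pid, vsz = fields[0], int(fields[1]), int(fields[2])
    rss = int(fields[-1])                 # assumed: last column is resident size
    command = " ".join(fields[3:-2])      # everything between vsz and %MEM
    return user, pid, vsz, command, rss


by_command = defaultdict(list)
for line in PS_ROWS.splitlines():
    if line.strip():
        user, pid, vsz, command, rss = parse_row(line)
        by_command[command].append(rss)

# (largest resident size, number of entries, command), biggest first.
summary = sorted(((max(v), len(v), c) for c, v in by_command.items()), reverse=True)
for max_rss, count, command in summary:
    print("%8d  x%-2d %s" % (max_rss, count, command))
```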