Hello Team,
I am new to Linux system administration, and although there is a sysadmin, I am the one in charge of this end of the system.
On Wednesday last week, I noticed that my disk usage was increasing slowly, by about 1% every hour. I have always removed old MySQL backups first thing every day, and this always reduces my disk usage by 2GB.
The problem continued, however, and I have tried to troubleshoot it without any guidance. On Wednesday morning my disk usage was at 30% of 147GB. On Friday I deleted some large unused files in addition to expired backups and had 60% usage at around 6pm. This morning I woke up to find 95% usage and a whole lot of hung processes that look like this:
www-data 18092 4.1 1.0 91928 5068 pts/4 Sl+ Oct07 57:20 ./hlds_i686 -game cstrike +ip 0.0.0.0 +maxplayers 32 +map de
www-data 18101 0.0 0.0 23748 336 ? Ss Oct07 0:00 SCREEN -d -m ./hlds_run -game cstrike +ip 0.0.0.0 +maxplayer
www-data 18102 0.0 0.0 4008 276 pts/5 Ss+ Oct07 0:00 /bin/sh ./hlds_run -game cstrike +ip 0.0.0.0 +maxplayers 32
and other mail programs (set in cron) which had hung. I also couldn't log in as a user to MySQL on our web portal.
My conclusion is that I have a rogue process somewhere that is adding about 1% to disk usage every hour, and that this process must have started on Wednesday. However, I couldn't see the process, and I can't use iotop because it's not installed; when I try installing it, I get an error:
Code:
Err http://ke.archive.ubuntu.com karmic/universe iotop 0.3-1
404 Not Found
Failed to fetch http://ke.archive.ubuntu.com/ubuntu/pool/universe/i/iotop/iotop_0.3-1_all.deb 404 Not Found
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
and when I try to update the package repository, I get a lot of failures like:
Code:
W: Failed to fetch http://security.ubuntu.com/ubuntu/dists/karmic-security/main/binary-amd64/Packages.gz 404 Not Found [IP: 91.189.92.184 80]
So using iotop is out of the question.
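Without iotop, one thing that might still narrow this down is to measure which top-level directory is growing between two points in time. The following is only a rough sketch, assuming Python is available on the server; the function names `tree_size`, `snapshot` and `growth` are made up for illustration, and it compares apparent file sizes rather than watching I/O directly.

```python
import os
import stat


def tree_size(path, device):
    """Sum the apparent sizes of files under path, staying on one filesystem."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path, onerror=lambda err: None):
        # Prune subdirectories that live on a different filesystem.
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == device:
                    kept.append(d)
            except OSError:
                pass  # directory vanished or is unreadable
        dirnames[:] = kept
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable
    return total


def snapshot(root):
    """Map each top-level entry of root to its total size in bytes."""
    device = os.lstat(root).st_dev
    sizes = {}
    for name in os.listdir(root):
        path = os.path.join(root, name)
        try:
            st = os.lstat(path)
        except OSError:
            continue
        if stat.S_ISDIR(st.st_mode):
            # Skip other mounted filesystems such as /proc or /dev.
            if st.st_dev == device:
                sizes[path] = tree_size(path, device)
        else:
            sizes[path] = st.st_size
    return sizes


def growth(before, after):
    """Return (path, bytes_grown) pairs for entries that grew, largest first."""
    deltas = [(p, after[p] - before.get(p, 0)) for p in after]
    grew = [d for d in deltas if d[1] > 0]
    return sorted(grew, key=lambda d: d[1], reverse=True)
```

The idea would be to call `snapshot("/")` as root, wait an hour, call it again, and pass both results to `growth`; the directory at the top of the list is where the space is going, and the same comparison can then be repeated one level deeper.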
I decided to restart my server after removing 2 old MySQL backup .tar.gz files; I then had 91% usage, and all those www-data processes were killed.
I've run
and this is the result:
faith@sms:~$
Filesystem            Type   Size  Used  Avail  Use%  Mounted on
/dev/mapper/sms-root  ext4   146G  133G  4.9G   97%   /
udev                  tmpfs  243M  200K  243M   1%    /dev
none                  tmpfs  243M  0     243M   0%    /dev/shm
none                  tmpfs  243M  344K  242M   1%    /var/run
none                  tmpfs  243M  0     243M   0%    /var/lock
none                  tmpfs  243M  0     243M   0%    /lib/init/rw
/dev/sda5             ext2   228M  15M   202M   7%    /boot
Now, the only thing I can understand from this result is that the root filesystem is at 97% usage. How do I go on from here to find the file that is being written by that rogue process? I'm thinking I need to look for a file that was created on Wednesday 3rd Oct and has been updated every hour since; it is definitely very large, about or over 85GB in less than 5 days, I guess.
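To make the search I have in mind concrete, here is a minimal sketch of it, again assuming Python is available on the server. The function name `large_recent_files` and its parameters are hypothetical; it simply walks one filesystem and keeps regular files that are both above a size threshold and modified within the last few days.

```python
import heapq
import os
import stat
import time


def _same_device(path, device):
    """True if path exists and lives on the given filesystem device."""
    try:
        return os.lstat(path).st_dev == device
    except OSError:
        return False


def large_recent_files(root, min_bytes, max_age_days, limit=20):
    """Return up to `limit` (size, path) pairs, largest first, for regular
    files under root that are at least min_bytes in size and were modified
    within the last max_age_days days."""
    cutoff = time.time() - max_age_days * 86400
    device = os.lstat(root).st_dev
    hits = []
    for dirpath, dirnames, filenames in os.walk(root, onerror=lambda err: None):
        # Do not descend into other mounted filesystems (/proc, /dev, /boot).
        dirnames[:] = [d for d in dirnames
                       if _same_device(os.path.join(dirpath, d), device)]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, device nodes
            if st.st_size >= min_bytes and st.st_mtime >= cutoff:
                hits.append((st.st_size, path))
    return heapq.nlargest(limit, hits)
```

For example, `large_recent_files("/", 1024 ** 3, 6)` run as root would list files of 1 GiB or more on the root filesystem that were modified in the last 6 days. It cannot see a file that was already deleted but is still held open by a process, so an empty result would not rule that case out.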