LinuxQuestions.org
08-25-2009, 01:17 PM   #1
davidstvz
Member
 
Registered: Jun 2008
Posts: 405

Rep: Reputation: 31
One Part Horror Story; One Part Success Story


Or: Who Would Have Thought rm -r ./e could cause so much trouble?

I've been doing this sys admin thing for about a year now. I was pretty much a complete n00b when I started. I still have a lot to learn obviously.

This morning I was relaxing in my office, poking around on my server. I say my server, but I actually inherited this server from another admin, so there are things about it that are mysterious to me. I noticed a partition called scratch. After some investigation, I determined that it was used for exactly what it sounded like and everything on it was old enough to merit deleting (I have tape backups after all). I suppose I could have reformatted it, but deleting a few folders seemed easy enough.

I started with virusmail, using the command "rm -r ./virusmail", which worked fine. Encouraged, I turned to a directory simply called "e". It was a two-year-old backup for a professor who is notorious for hoarding backups and such going back decades. The directory held 2 GB of data. It was obviously a stale backup and not needed. I'm certain the individual didn't even know it was there. After a bit of thought I issued (as root) the command: rm -r ./e

And there was an uncomfortable pause.

The virusmail folder had gone away pretty quickly, so I figured this shouldn't take more than a few seconds. To reassure myself, I opened a second ssh terminal, and it just hung. Oh **** what did I do?! What if there was a softlink to the root directory in that folder? Would that have made this the equivalent of rm -r /* ?

When I ran upstairs to the server room, the console said something about being out of memory. Somehow the command had frozen the machine.

After initiating the nail-bitingly long reboot process (still wondering if I hadn't wiped out the system), everything seemed to work normally at first. Then I realized that whenever someone tried to run various programs, the program would just hang. Not good. I had little energy left for freaking out, though; I was more annoyed and determined to fix it at this point. The worst they can do is fire me. Then maybe I'll move out of state and work my way into the video game industry (something I probably should have done a long time ago).

I tried to kill various hanging processes. I couldn't remember if -9 was the signal and found a neat page lamenting that so many n00bs thought "kill -9 pid" is a good idea. It suggested:

Code:
kill pid         # sends a TERM, wait 5 seconds
kill pid         # yes, try again, wait 5 seconds
kill -INT pid    # wait for it
kill -INT pid    # damn, still not dead?
kill -KILL pid   # same thing as -9
kill -KILL pid   # something is wrong
Well, something was indeed wrong. Even kill -9 pid didn't work. A Google search turned up, what else, an LQ thread where I got the idea that an unkillable process is usually waiting on I/O. Maybe a drive was bad. That made sense after a freeze and surprise shutdown. After a bit of exploration I found that any attempt to access the /tmp partition caused a hang. I went from annoyance to complete relief.
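For anyone who wants the escalation above as a script, here is a minimal sketch rather than a definitive tool. It assumes Linux (it reads the State line from /proc/PID/status), and the names proc_state, is_gone, escalate_kill and the GRACE variable are made up for illustration. If the process survives even KILL, it reports whether the process is in state D (uninterruptible sleep), which is the usual sign of a process blocked on I/O.

```shell
#!/bin/sh
# Sketch only. Linux-specific: relies on /proc/PID/status.
# proc_state, is_gone, escalate_kill and GRACE are hypothetical names.

# Print the one-letter process state (R, S, D, Z, ...) or nothing if the
# process does not exist.
proc_state() {
    sed -n 's/^State:[[:space:]]*\([A-Za-z]\).*/\1/p' "/proc/$1/status" 2>/dev/null
}

# True if the process no longer exists, or has exited and is only waiting
# to be reaped by its parent (zombie, state Z).
is_gone() {
    state=$(proc_state "$1")
    [ -z "$state" ] || [ "$state" = "Z" ]
}

# Send TERM twice, INT twice, KILL twice, pausing GRACE seconds (default 5)
# after each signal. Returns 0 once the process is gone, 1 otherwise.
escalate_kill() {
    pid=$1
    for sig in TERM TERM INT INT KILL KILL; do
        is_gone "$pid" && return 0
        kill -s "$sig" "$pid" 2>/dev/null
        sleep "${GRACE:-5}"
    done
    is_gone "$pid" && return 0
    if [ "$(proc_state "$pid")" = "D" ]; then
        echo "PID $pid is in uninterruptible sleep (D), probably blocked on I/O" >&2
    fi
    return 1
}
```

Usage would be something like "escalate_kill 12345". A process stuck in state D does not act on any signal, KILL included, until the I/O it is waiting for completes or fails, so in that case the thing to look at is the device or filesystem it is waiting on.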

After a year of experience (thanks for all the help, LQ) that seemed like a really easy problem to fix. I ended up editing fstab with pico (vi tried to use /tmp and hung) so that /tmp would not be mounted on reboot. After reboot, root remade /tmp for me as I expected. The old pine program complained rather specifically that permissions needed to be 1777 on /tmp. I complied (verifying the truth of this on another server) and now everything seems to be working perfectly. Fortunately the root (/) partition has 20 GB of unused space and has only been at 1% capacity for the last 4 years, so /tmp should be able to exist on it happily. The old /tmp only had about 4 GB of stuff on it.
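A rough sketch of the two steps above, run against a scratch copy so it never touches the real /etc/fstab or /tmp. The device names and filesystem types in the sample fstab are invented for illustration, and stat -c is the GNU coreutils form, so this assumes a typical Linux box.

```shell
#!/bin/sh
# Sketch only: works in a throwaway directory, not on the live system.
scratch=$(mktemp -d)

# Hypothetical fstab contents; the real devices and types will differ.
cat > "$scratch/fstab" <<'EOF'
/dev/sda1  /      ext3  defaults  1 1
/dev/sda5  /tmp   ext3  defaults  1 2
EOF

# Comment out any active entry whose mount point (second field) is /tmp,
# so the partition is not mounted at the next boot.
awk '$2 == "/tmp" && $1 !~ /^#/ { print "#" $0; next } { print }' \
    "$scratch/fstab" > "$scratch/fstab.new"
commented=$(grep -c '^#' "$scratch/fstab.new")

# Recreate a tmp directory with mode 1777: world-writable plus the sticky
# bit, so users can only delete their own files.
mkdir "$scratch/tmp"
chmod 1777 "$scratch/tmp"
mode=$(stat -c '%a' "$scratch/tmp")

echo "entries commented out: $commented, tmp mode: $mode"
rm -r "$scratch"
```

On the real system the same idea is a "#" in front of the /tmp line in /etc/fstab and "chmod 1777 /tmp" after the reboot. Keeping a copy of the original fstab before editing is cheap insurance.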

I don't guess there is anything too important in those 4 GB of stuff. Probably just years of temporary files not being properly deleted. I suppose I could try to remount that partition as something else and investigate further, but I'm migrating to a new Debian server soon so I may just leave well enough alone.

Well, that was cathartic. Now I'm going to go beat my head against a wall for being so stupid, or maybe have some lunch. I have enough problems without causing my own! Then again, I learned several valuable things, though I still don't know why the rm -r ./e command failed. I'm going to be scared to ever use rm -r again (especially as root).
 
08-25-2009, 02:12 PM   #2
b0uncer
LQ Guru
 
Registered: Aug 2003
Distribution: CentOS, OS X
Posts: 5,131

Rep: Reputation: Disabled
I don't think "rm" should follow symlinks by default, but just remove the symlinks. I'm fairly sure this is how it worked the last time I did a recursive rm on a directory that surely contained symlinks... and if it did follow symlinks, wouldn't it be a pretty dangerous thing? If somebody (without root permissions) could write a symlink to / unnoticed, it would be a deathtrap should somebody run rm on it with enough privileges.
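This is easy to check safely in a throwaway directory. The sketch below builds a directory "e" that contains a symlink to a sibling directory, removes "e" recursively, and then looks at whether the link's target survived. The directory and file names are made up for the test.

```shell
#!/bin/sh
# Sketch only: everything happens inside a fresh temporary directory.
work=$(mktemp -d)

mkdir "$work/target" "$work/e"
echo "keep me" > "$work/target/file.txt"
ln -s "$work/target" "$work/e/link"   # symlink inside e pointing outside it

rm -r "$work/e"                       # the command under test

# Record the outcome before cleaning up.
if [ -f "$work/target/file.txt" ]; then survived=yes; else survived=no; fi
if [ -e "$work/e" ]; then e_left=yes; else e_left=no; fi
echo "target survived: $survived, e still present: $e_left"

rm -r "$work"
```

This prints "target survived: yes, e still present: no": rm -r unlinks the symlink itself and does not descend into the directory it points to.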

It's easy to get into more or less scary situations when there's some responsibility involved, like being a paid admin, I agree.
 
08-25-2009, 03:38 PM   #3
davidstvz
Member
 
Registered: Jun 2008
Posts: 405

Original Poster
Rep: Reputation: 31
I think you're right that rm shouldn't (and probably doesn't) follow symbolic links even with the -r flag. I'm not sure what caused the crash though. It seems to have been a memory error. I can't see how trying to rm too much stuff at once could run my system out of memory. It has 4 GB of physical memory and typically uses no more than 1 GB.

This would be the fourth OMFG moment in the last year. The first two were failing drives. It's a good thing my first priority in taking the job was getting the tape backups functioning again (the previous admin didn't back up often). The third was a router failure that turned out not to be under my direct responsibility (although it was my problem, of course). And finally this fun little issue.

I'm overdue for another drive failure, but perhaps I'll have more luck with the remaining drives. If not, there's always the tapes (speaking of which, time to go upstairs and swap tapes).
 