Quicker way to delete folders than rm -r folder_name
I need to delete 100,000 files in a massive directory tree daily.
Using rm -r folder_name takes hours.
Is there another Linux command that deletes folder trees faster?
Thanks,
nbdr
Welcome to LQ!!
What you need is some way of defining which folders to delete. This could be something simple like having them all in one place, or it could be a unique string in the filename or extension.
Take a look at the "find" command, with all its options.
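For example, something along these lines might do it (the path and directory-name pattern here are just placeholders, adjust them to your own layout):
Code:
# Remove every directory whose name matches a given pattern
# (/path/to/tree and 'tmp_*' are only examples)
find /path/to/tree -type d -name 'tmp_*' -prune -exec rm -rf {} +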
Thanks for the answers.
I know which folders to delete: they are all under the 'cache' folder. What can I do that is faster than rm -r cache/ ?
If rm could be faster, it would be faster. It has had many decades to improve.
Most of the work is done by the fs driver, so the only way to improve the performance would be to look at the filesystem itself (change filesystem, tune it, change options when formatting, etc.).
What filesystem are you using? If it's ext3 then I can see why it takes hours; I bet the same would take only minutes with JFS (fastest delete speed) or XFS.
That's not my experience with ext3 at all.
Code:
$ time for dir1 in $(seq 1 1000); do mkdir $dir1; for dir2 in $(seq 1 100); do mkdir $dir1/$dir2; done; done
real 5m14.906s
user 1m19.725s
sys 2m56.671s
$ time rm -rf *
real 0m9.218s
user 0m0.584s
sys 0m6.224s
Under ten seconds for 100,000 directories. This is a Sempron 3000+ (relatively old machine) with a SATA disk (not SATA 2). Filesystem is ext3; it's formatted with -O dir_index, though. Creation of directories is not that fast, but that's to be expected.
Out of curiosity, I also tested a loopback fs formatted and mounted with the standard options (no dir_index), just to be fair. The results are very similar.
Code:
real 5m55.984s
user 1m44.103s
sys 3m1.987s
$ time rm -rf *
real 0m10.178s
user 0m0.583s
sys 0m7.564s
A really small difference.
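For what it's worth, if anyone wants to try dir_index on an existing ext2/ext3 filesystem, something like this should do it (the device name is just an example, and the filesystem should be unmounted first):
Code:
# Enable hashed b-tree directory indexing on an existing ext2/ext3 fs
tune2fs -O dir_index /dev/sdb1
# Rebuild/optimize the indexes of directories that already exist
e2fsck -fD /dev/sdb1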
From my experience, ext3 is a very stable and fast filesystem overall, even if people usually don't like to admit it for whatever reason. Sure, some other fs's do X thing better, but they also do other things *much* worse. I find that ext3 does everything adequately.
If the OP really finds that deleting 100,000 files takes that long, there are a number of probable causes:
- A defective or experimental fs (not ext3), like reiser4 or ext4. I don't know if reiserfs (3.x) has problems with this, but I know firsthand that it does have serious problems with fragmentation.
- Defective hardware; look at the dmesg output for I/O errors when doing fs operations.
- Your CPU is being hogged by something else. Check top or htop.
There might be other possible problems. But rm is not one of them.
As mentioned, rm is pretty quick. What might be slowing it down is updating the directory entries as it goes. Try remounting with the noatime option.
Alternatively, as said, put the cache on a separate partition and wipe it with mkfs instead of deleting the files.
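A rough sketch of both ideas (the mount point and device here are placeholders for wherever the cache actually lives):
Code:
# Stop atime updates on the filesystem holding the cache
mount -o remount,noatime /srv/cache

# Or, if the cache sits on its own partition, re-create the filesystem
# instead of deleting the files one by one -- much faster for huge trees
umount /srv/cache
mkfs.ext3 -O dir_index /dev/sdb1
mount /dev/sdb1 /srv/cache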
Hadn't seen the post by i92guboj - I did some tests too. I just created 100,000 copies of a small (few hundred bytes) file. Took just under ten and a half minutes.
Rebooted and ran "rm -rf ..." - less than ten seconds.
Hardware RAID5 on an old idle quad (P-III based) Xeon server. EXT3 mounted noatime, nodiratime - because I always have them that way.
It's a cache of a website that is hosted on a shared server.
I think the files are stored on a storage cluster. I don't have any control over the filesystem or other stuff. The problem is that I exceed the 500,000-file limit every few days and have to delete manually until I optimize the caching.
So if you do that daily, wouldn't it make sense to set up a cron job or something that takes care of it for you?
++
That's the way to go. Just create a cron job. He might consider using a higher niceness so it doesn't hit the CPU so badly, though, sincerely, on a cluster I don't think the CPU is the problem. I am rather inclined to think it's something to do with the fs or the hardware.
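Something like this crontab entry could work as a sketch (the path, schedule and priorities are just placeholders):
Code:
# Clear the cache every night at 03:30, at low CPU and I/O priority
30 3 * * * nice -n 19 ionice -c3 find /srv/cache -mindepth 1 -delete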