LinuxQuestions.org
Old 01-20-2012, 05:57 PM   #1
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467
Blog Entries: 60

Rep: Reputation: 51
c library: recursive delete of directories


Is there a C library out there (unix only is fine) designed to safely handle the recursive deletion of non-empty directories? I found several posts on the Web from people asking the same question, and they were all told that they would either have to make an external call to "rm -rf", or implement the whole idea from scratch (say, with nftw()), or use some hacked-together function some guy wrote in five minutes but is afraid to use himself.

But it just seems hard to believe that it's been over forty years since Unix came along and there isn't some standard, safe, commonly accepted way of doing this. Is there something I've overlooked?
 
Old 01-20-2012, 08:40 PM   #2
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 374
I don't know that a recursive delete of non-empty directories is really something that needs to be implemented as a library call. My understanding of libraries is that they provide basic building blocks that a programmer then assembles to perform top-level specific tasks. What you describe is asking for a top-level specific task to be part of the library. In other words, I do not see a recursive-delete-of-non-empty-directories as a useful building block to some other task.

Be that as it may, there's no reason to rely on the code "some guy" wrote in five minutes. The code for rm and its recursive option is available for everyone and anyone to inspect and/or incorporate into their software. The rm command is part of the GNU Coreutils package. Download, extract, and have a little party!
 
Old 01-20-2012, 11:29 PM   #3
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467

Original Poster
Blog Entries: 60

Rep: Reputation: 51
Actually, I already have extracted the binutils code. The code for the rm program depends on remove.h and remove.c, which provide a rm() function and associated structures, but I do not know how easy it would be to cut and paste into my program: remove.c itself seems to be dependent on at least ten other source files from within the binutils package, most of which link to other headers and some of which are filled beginning to end with macro tests I've never even heard of. Many of the headers used do not even exist in the source tree, but are generated at the beginning of the make process.

No, I wouldn't be able to cut and paste from it, even if that was a good idea; somebody who knows how would have to turn the code into a separate library that I could link to.
 
Old 01-21-2012, 12:05 AM   #4
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 374
You do not necessarily have to copy-paste wholesale. My idea was to look at the code and incorporate the basic flow of the logic.

If that's a time-consuming process (more than you can afford), then perhaps a simple recursive function with calls to rmdir() would do.

Off-hand, the system functions the algorithm might use:
opendir() (man 3 opendir)
readdir() (man 3 readdir)
stat() (man 2 stat)
rmdir() (man 2 rmdir)
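
A minimal sketch of such a recursive function, for illustration only. The name remove_tree() is hypothetical, and the sketch makes two substitutions to the list above: it uses lstat() instead of stat() so that symbolic links are removed rather than followed, and it adds unlink() for non-directories. It assumes nothing else modifies the tree while it runs; a directory swapped for a symlink between the lstat() and opendir() calls would be followed. It also deletes entries while iterating with readdir(), which is common practice but whose effect on later readdir() results POSIX leaves unspecified.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: remove `path` and everything beneath it.
 * Returns 0 on success, -1 on the first error (errno is set). */
int remove_tree(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* lstat(), not stat(): a symlink is unlinked, never followed. */
    if (lstat(path, &st) == -1)
        return -1;
    if (!S_ISDIR(st.st_mode))
        return unlink(path);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int result = 0;
    struct dirent *entry;
    /* errno is cleared before each readdir() so that end-of-directory
     * (NULL, errno unchanged) can be told apart from a read error. */
    while (result == 0 && (errno = 0, entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        char child[PATH_MAX];
        int len = snprintf(child, sizeof child, "%s/%s", path, entry->d_name);
        if (len < 0 || (size_t)len >= sizeof child) {
            errno = ENAMETOOLONG;
            result = -1;
        } else {
            result = remove_tree(child);
        }
    }
    if (result == 0 && errno != 0)
        result = -1;            /* readdir() itself failed */

    int saved = errno;          /* closedir() may overwrite errno */
    closedir(dir);
    errno = saved;

    /* The directory is empty now (unless someone added entries). */
    if (result == 0)
        result = rmdir(path);
    return result;
}
```

Because every level builds a full path in a PATH_MAX buffer, very deep trees fail with ENAMETOOLONG instead of being removed; that limitation is part of why this is only a sketch.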

I know you were hoping for some library that would have all this done for you, but I'm not aware of one. Though, obviously if someone does know of one and comes by this thread, I would be happy for them to correct me.

As a side note, you said you found rm in the binutils package. It was my impression that GNU Binutils was for manipulation/information of binary executables. I would hate to think that there's a binutils-rm and a coreutils-rm floating around out there.
 
Old 01-21-2012, 05:23 AM   #5
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467

Original Poster
Blog Entries: 60

Rep: Reputation: 51
Err... meant to type coreutils. (was looking at binutils earlier in the day so the name stuck in my mind...)

Well, since this thread went nowhere, I guess I'm back where I started. Either reinvent the wheel, or continue making external calls to "rm".

Maybe for my next C project, I'll try to port that Coreutils function into a separate library. Or maybe I'll join the mailing list and beg them to do it. (♪ All I have to do, is dre-e-e-e-eammm... ♫)
 
Old 01-21-2012, 01:02 PM   #6
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Actually, the situation is pretty interesting right now.

Linux kernels starting from version 2.6.16 provide the syscalls openat(fd, dirname, O_RDONLY|O_DIRECTORY), which can be used to open a directory descriptor to a subdirectory using a descriptor to the parent directory and only the subdirectory name, and unlinkat(fd, name, flags), where flags is 0 for normal files and AT_REMOVEDIR for empty directories. (The O_PATH flag is not suitable here: it only appeared in kernel 2.6.39, and a descriptor opened with it cannot be used to read the directory's entries.) For the current directory, you can use AT_FDCWD for fd.

Using the above, a very simple and robust algorithm will only use as many descriptors as the depth of the deepest subdirectory, but will be totally immune from hard link and rename issues. In particular, you can rename one of the directories being deleted, without the algorithm getting confused. It will only depend on current working directory for the very first unlink/opendir, and it will be perfectly thread-safe.
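
A minimal sketch of the descriptor-based algorithm described above, not a definitive implementation. The function name remove_tree_at() is hypothetical. Beyond openat() and unlinkat(), it assumes fdopendir() and fstatat() are available (all four are specified in POSIX.1-2008), and it adds O_NOFOLLOW so that an entry swapped for a symlink after the fstatat() check is refused rather than followed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: remove the entry `name` found in the directory
 * open on `parentfd` (or AT_FDCWD), including everything beneath it.
 * Uses one descriptor per directory level. Returns 0 or -1 (errno set). */
int remove_tree_at(int parentfd, const char *name)
{
    struct stat st;

    assert(name != NULL);

    if (fstatat(parentfd, name, &st, AT_SYMLINK_NOFOLLOW) == -1)
        return -1;
    if (!S_ISDIR(st.st_mode))
        return unlinkat(parentfd, name, 0);

    /* O_RDONLY so the entries can be read; O_NOFOLLOW and O_DIRECTORY
     * make the open fail if the entry is no longer a real directory. */
    int fd = openat(parentfd, name, O_RDONLY | O_DIRECTORY | O_NOFOLLOW);
    if (fd == -1)
        return -1;

    DIR *dir = fdopendir(fd);   /* on success the stream owns fd */
    if (dir == NULL) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    int result = 0;
    struct dirent *entry;
    while (result == 0 && (errno = 0, entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        /* Children are addressed relative to fd, never by full path,
         * so renaming a parent directory does not confuse the walk. */
        result = remove_tree_at(fd, entry->d_name);
    }
    if (result == 0 && errno != 0)
        result = -1;            /* readdir() itself failed */

    int saved = errno;
    closedir(dir);              /* also closes fd */
    errno = saved;

    if (result == 0)
        result = unlinkat(parentfd, name, AT_REMOVEDIR);
    return result;
}
```

A call such as remove_tree_at(AT_FDCWD, "build") would start from the current working directory; only that first lookup depends on it.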

You might wish to use a loop to delete the contents of a directory recursively, until the directory itself can be removed. This is in case somebody is creating new files while you're trying to remove the tree. Because subdirectory deletion is not dependent on the current working directory, you could farm each subdirectory out to a separate thread from a thread pool, removing subdirectories in parallel.

To do the same in a portable manner, you need to do the tree removal in a child process, using fchdir() to descend into and go back up the tree, because the current working directory is common to all threads in the process. For the same reason, you cannot use more than one thread.

Finally, /bin/rm is a reliable workhorse for this. If you create a function which forks a child process and returns the child process pid, and in the child process, redirects standard input, output and error to /dev/null and calls execl("/bin/rm", "rm", "-rf", thing, (char *)NULL); you have an asynchronous tree deletion function done. The caller can go on doing something else productive, while the files are being removed. The caller can call a helper function, supplying the pid, to wait until the deletion is complete.
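
A minimal sketch of that fork-and-exec approach, under stated assumptions: the names remove_tree_async() and remove_tree_wait() are hypothetical, rm is assumed to live at /bin/rm, and a "--" argument is added (not part of the description above) so that a path beginning with '-' is not parsed as an option.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start "rm -rf -- path" in a child process.
 * Returns the child's pid, or -1 if fork() failed. */
pid_t remove_tree_async(const char *path)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid != 0)
        return pid;             /* parent (pid > 0) or failure (-1) */

    /* Child: point stdin, stdout and stderr at /dev/null. */
    int devnull = open("/dev/null", O_RDWR);
    if (devnull == -1)
        _exit(127);
    if (dup2(devnull, STDIN_FILENO) == -1 ||
        dup2(devnull, STDOUT_FILENO) == -1 ||
        dup2(devnull, STDERR_FILENO) == -1)
        _exit(127);
    if (devnull > STDERR_FILENO)
        close(devnull);

    /* The argument list must end with a null pointer. */
    execl("/bin/rm", "rm", "-rf", "--", path, (char *)NULL);
    _exit(127);                 /* reached only if execl() failed */
}

/* Hypothetical helper: wait for the child started above.
 * Returns 0 if rm exited with status 0, -1 otherwise. */
int remove_tree_wait(pid_t pid)
{
    int status;

    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Since rm's diagnostics go to /dev/null, the exit status is the only failure signal the caller gets. With -f, rm also exits successfully when the path does not exist, so a zero status means "the path is gone", not "something was deleted".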

I wouldn't mind writing the removal as a simple library, but I just cannot decide which of the three above approaches makes most sense. I personally prefer the first, but on Linux it will only work with kernels 2.6.16 and later, and elsewhere it needs a system that implements the POSIX.1-2008 *at() functions.
 
Old 01-21-2012, 03:15 PM   #7
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467

Original Poster
Blog Entries: 60

Rep: Reputation: 51
Quote:
Originally Posted by Nominal Animal
Finally, /bin/rm is a reliable workhorse for this. If you create a function which forks a child process and returns the child process pid, and in the child process, redirects standard input, output and error to /dev/null and calls execl("/bin/rm", "rm", "-rf", thing, (char *)NULL); you have an asynchronous tree deletion function done. The caller can go on doing something else productive, while the files are being removed. The caller can call a helper function, supplying the pid, to wait until the deletion is complete.
This is the path I decided to take (already implemented). Writing my own recursive delete code sounds interesting as a project in and of itself, but I don't want to make it part of the current project. Thanks for the interesting information about the openat and unlinkat syscalls, though.

If I implement my own simple library, or use somebody else's, it will have to meet my criteria for portability, and definitely not be Linux-only. Currently I am restricting myself to _XOPEN_SOURCE 700 features (POSIX.2, XPG4, SUSv4), and I want my code to compile (ideally) on any *nix system which meets those standards. Of course, there could be conditional preprocessor code for specific OSes.

Last edited by hydraMax; 01-21-2012 at 03:20 PM.
 
Old 02-02-2012, 06:05 PM   #8
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467

Original Poster
Blog Entries: 60

Rep: Reputation: 51
Now that my other project has reached beta, I intend to try to create a simple library that does recursive deletion, hopefully in a reliable and safe manner. However, does anyone have any additional insights and suggestions for me? To be honest, I am not sure yet what approach to take. I'm thinking the AT functions, as Nominal Animal mentioned, look quite useful, though I am also wondering if this should be based around a file tree walk with ftw/nftw via postorder traversal.
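
For reference, the nftw() postorder variant can be sketched in a few lines. This is an illustration under assumptions, not the code of any existing library: the names remove_entry() and remove_tree_nftw() are hypothetical, and the descriptor limit of 16 is an arbitrary choice. FTW_DEPTH requests the postorder traversal (a directory is reported after its contents), and FTW_PHYS keeps the walk from following symbolic links.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Callback for nftw(): delete one entry. remove() calls unlink() for
 * non-directories and rmdir() for directories. Returning nonzero
 * makes nftw() stop the walk and return that value. */
static int remove_entry(const char *path, const struct stat *st,
                        int type, struct FTW *ftw)
{
    (void)st;
    (void)type;
    (void)ftw;
    return remove(path) == 0 ? 0 : -1;
}

/* Hypothetical helper: remove `path` and everything beneath it.
 * Returns 0 on success, -1 on failure. */
int remove_tree_nftw(const char *path)
{
    assert(path != NULL);

    /* 16: maximum number of descriptors nftw() may hold open at once
     * (an arbitrary value chosen for this sketch). */
    return nftw(path, remove_entry, 16, FTW_DEPTH | FTW_PHYS);
}
```

This stays within _XOPEN_SOURCE 700, but it walks by path name, so it does not have the rename immunity of the descriptor-based approach.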

I tried to look through the remove.c code in coreutils, but it seems rather complicated and is intended to do more than what I am aiming for here (e.g., interactive deletions).
 
Old 02-10-2012, 03:09 PM   #9
hydraMax
Member
 
Registered: Jul 2010
Location: Skynet
Distribution: Debian + Emacs
Posts: 467

Original Poster
Blog Entries: 60

Rep: Reputation: 51
I just wanted to mention (for anyone that might read this thread in the future) I was able to create a small library with a simple function for recursive deletion of a file hierarchy:

Recursive Remove Library
 
  


All times are GMT -5.