Linux - General — This Linux forum is for general Linux questions and discussion. If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
i would like to test if a directory is empty. the obviously simple way is to read the list of names in the directory, skipping . and .. if they are there. but this forces another block to be read. i'd like to know if this can be determined from the inode, much like looking at the link count can tell you if it has any subdirectories (you can skip it if only listing directories). can this be done from stat() data?
the purpose of this is to acquire all the names in a directory tree as fast as possible (with the fewest total I/O operations).
The directory's inode can't tell you much. The minimum space allocated to a directory is 4096 bytes (1 block), and that will be the same whether the directory is empty or has a few files in it. Also, a directory can expand, but does not automatically shrink when files are removed (there's an option in fsck to accomplish that), so a large size for a directory just means that at one time it contained many files, whereas it might now be empty.
The find command does have a "-empty" test that will return "true" for an empty file or directory. Whether that is considered a "simple way" depends on the individual and the situation.
The days of being able to predict if you can save (real) I/O have long gone. There is so much caching going on you can't even presume to be able to replicate test results.
You are probably worrying about the wrong thing in the overall scheme of things.
the purpose is simply to increase efficiency in a file scan generator, to avoid trying to read the list of names if there are none. it appears that the filesystem code or kernel reads at least one empty 4k block from the directory when trying to read names. that is probably good evidence that there is no way to determine it, at least for the filesystems i have tried (ext2, ext3, ext4, btrfs, reiserfs). it's not a critical need. i can just go ahead and read the list of names and see if it is empty, or just not deal with being empty.
this project is a generator in python3 that yields each path in name-sorted order, with the file type (regular file vs directory, etc.) included in the yielded tuple.
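Something along these lines, perhaps. A minimal sketch of such a generator (the function name and tuple shape are my own guesses, not the poster's code); `os.scandir` is a good fit here because `DirEntry.is_dir()` can usually answer from the dirent type field without an extra `stat()` per file:

```python
import os

def walk_sorted(top):
    """Yield (path, is_dir) for every entry under top, depth-first,
    with each directory's entries in name-sorted order.
    Hypothetical sketch, not the poster's actual generator."""
    with os.scandir(top) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        # On most filesystems this reads the type from the dirent
        # itself, avoiding a separate stat() call per entry.
        is_dir = entry.is_dir(follow_symlinks=False)
        yield entry.path, is_dir
        if is_dir:
            yield from walk_sorted(entry.path)
```

An empty directory simply produces nothing from the `scandir` loop, so no separate emptiness check is needed.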
the only way to see if a something is in something else without asking someone else is to look for yourself. this is a basic truth applied in all areas of life.
i would like to test if a directory is empty. the obviously simple way is to read the list of names in the directory, skipping . and .. if they are there. but this forces another block to be read. i'd like to know if this can be determined from the inode much like looking at the link count can tell you if it has any subdirectories (you can skip it if only listing girectories). can this be done from stat() data?
the purpose of this is to aquire all the names in a ditrctory tree sf fasy as possible (the fewest titol I/O operations).
I got to the last sentence, and, I, uh...
If the point is to acquire all the names in a directory tree, then the fastest and most efficient way is to query for all the names in the directory directly. Adding a guard for the special case where the directory is empty is just going to waste time. The check, no matter how efficient, is not free.
And the information you want wouldn't be in the inode. It would be in the directory entry ("dirent"). It looks to me like the structure that provides access to the directory entry's children is intentionally not part of the public API, and you have to call readdir to get them. So the fastest way to check if a directory is empty is indeed to list it.
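Listing it doesn't have to mean reading the whole thing, though. A sketch of the "stop at the first entry" approach in Python (my own helper name, not from the thread):

```python
import os

def is_empty_dir(path):
    """True if the directory has no entries at all.
    os.scandir never yields "." or "..", so the first entry
    returned, if any, is a real name; we stop right there."""
    with os.scandir(path) as it:
        return next(it, None) is None
```

This still costs one read of the directory, but it never iterates past the first name, so a huge directory is rejected as non-empty just as cheaply as a small one.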
The . and .. links are present in a Unix-like filesystem, or when the kernel driver presents it Unix-style.
Assuming this is always true, you can see whether it has subdirectories (links > 2) or not (links = 2). But to see files you need readdir().
The glibc implementation of readdir(3) uses getdents64(2) internally with a buffer size of 32768. What this essentially means is that any number of readdir(3) calls that don't exceed that buffer size will not result in any additional I/O operations or additional context switches (due to syscalls).
In both cases there are only two getdents64() calls. It looks like you always get one additional getdents64() call when trying to read past the last directory entry with readdir(3).
Even if all the file names were approaching NAME_MAX, you'd still need over a hundred of them to exceed this buffer and cause additional I/O operations (assuming the VFS cache hasn't already cached them, which it probably has).
So, as others have said, not worth worrying about.