LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

03-06-2020, 09:11 PM   #1
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,684
Blog Entries: 31

Rep: Reputation: 176
testing if a directory is empty


I would like to test if a directory is empty. The obviously simple way is to read the list of names in the directory, skipping . and .. if they are there, but this forces another block to be read. I'd like to know if this can be determined from the inode, much like looking at the link count can tell you if it has any subdirectories (you can skip it if only listing directories). Can this be done from stat() data?

The purpose of this is to acquire all the names in a directory tree as fast as possible (the fewest total I/O operations).
 
03-06-2020, 09:41 PM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,780

Rep: Reputation: 2214
The directory's inode can't tell you much. The minimum space allocated to a directory is 4096 bytes (1 block), and that will be the same whether the directory is empty or has a few files in it. Also, a directory can expand, but does not automatically shrink when files are removed (there's an option in fsck to accomplish that), so a large size for a directory just means that at one time it contained many files, whereas it might now be empty.
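
One way to check this point on your own filesystem is to compare the stat() fields of a directory before and after files are created in it. The sketch below is only an illustration; the helper names `dir_stat_summary` and `compare_empty_and_filled` are made up for this example, and the numbers it reports are filesystem-specific, so any pattern you see should be treated as non-portable.

```python
import os
import tempfile


def dir_stat_summary(path):
    """Return the stat() fields that might look like hints about directory contents."""
    st = os.stat(path)
    return {"size": st.st_size, "blocks": st.st_blocks, "nlink": st.st_nlink}


def compare_empty_and_filled(n_files=3):
    """Stat a fresh temporary directory, add n_files empty files, and stat it again."""
    with tempfile.TemporaryDirectory() as d:
        before = dir_stat_summary(d)
        for i in range(n_files):
            # Create an empty regular file inside the directory.
            open(os.path.join(d, f"f{i}"), "w").close()
        after = dir_stat_summary(d)
    return before, after
```

Calling `compare_empty_and_filled()` and printing both dictionaries shows whether any of the three fields changed on the filesystem that holds your temporary directory.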

The find command does have a "-empty" test that will return "true" for an empty file or directory. Whether that is considered a "simple way" depends on the individual and the situation.
 
03-06-2020, 09:58 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,131

Rep: Reputation: 4121
The days of being able to predict if you can save (real) I/O have long gone. There is so much caching going on you can't even presume to be able to replicate test results.
You are probably worrying about the wrong thing in the overall scheme of things.
 
1 members found this post helpful.
03-07-2020, 12:23 AM   #4
Skaperen
Senior Member
 
Registered: May 2009
Location: center of singularity
Distribution: Xubuntu, Ubuntu, Slackware, Amazon Linux, OpenBSD, LFS (on Sparc_32 and i386)
Posts: 2,684

Original Poster
Blog Entries: 31

Rep: Reputation: 176
The purpose is simply to increase efficiency in a file scan generator, to avoid trying to read the list of names if there are none. It appears that the filesystem code or kernel reads at least one empty 4k block from the directory when trying to read names. That is probably good evidence that there is no way to determine that, at least for the filesystems I have tried (ext2, ext3, ext4, btrfs, reiserfs). It's not a critical need. I can just go ahead and read the list of names and see if it is empty, or just not deal with being empty.

This project is a generator in Python 3 that yields each path in name-sorted order, with the file type (regular file vs. directory, etc.) included in the yielded tuple.
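
A minimal sketch of what such a generator could look like follows. It is not the poster's actual code: the names `entry_type` and `walk_sorted` are hypothetical, and it assumes "name-sorted" means sorted by name within each directory, with a directory's contents yielded right after the directory itself. It uses os.scandir(), whose entries can usually answer the type questions from the directory listing itself, and it needs no special case for an empty directory because the loop simply runs zero times.

```python
import os


def entry_type(entry):
    """Classify an os.DirEntry without following symbolic links."""
    if entry.is_symlink():
        return "symlink"
    if entry.is_dir(follow_symlinks=False):
        return "dir"
    if entry.is_file(follow_symlinks=False):
        return "file"
    return "other"


def walk_sorted(top):
    """Yield (path, type) for everything under top, sorted by name per directory."""
    with os.scandir(top) as it:
        # Read the whole directory once, then sort the entries by name.
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        kind = entry_type(entry)
        yield entry.path, kind
        if kind == "dir":
            # Symlinks are never classified as "dir", so links are not followed.
            yield from walk_sorted(entry.path)
```

For example, `for path, kind in walk_sorted("."): print(kind, path)` lists the current directory tree.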

Last edited by Skaperen; 03-07-2020 at 12:25 AM.
 
03-07-2020, 02:51 AM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,131

Rep: Reputation: 4121
How much I/O (and time) is consumed starting Python? I think you need to get things in perspective.
 
1 members found this post helpful.
03-07-2020, 10:54 AM   #6
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
The only way to see if something is in something else without asking someone else is to look for yourself. This is a basic truth applied in all areas of life.
 
03-07-2020, 01:00 PM   #7
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,780

Rep: Reputation: 2214
Quote:
Originally Posted by Skaperen
It appears that the filesystem code or kernel reads at least one empty 4k block from the directory when trying to read names.
It is never empty. At a minimum, it contains the entries for "." and ".." .
 
1 members found this post helpful.
03-07-2020, 02:19 PM   #8
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,236

Rep: Reputation: 5320
Quote:
Originally Posted by Skaperen
I would like to test if a directory is empty. The obviously simple way is to read the list of names in the directory, skipping . and .. if they are there, but this forces another block to be read. I'd like to know if this can be determined from the inode, much like looking at the link count can tell you if it has any subdirectories (you can skip it if only listing directories). Can this be done from stat() data?

The purpose of this is to acquire all the names in a directory tree as fast as possible (the fewest total I/O operations).
I got to the last sentence, and, I, uh...

If the point is to acquire all the names in a directory tree, then the fastest and most efficient way is to query for all the names in the directory directly. Adding a guard to check for the special case where the directory is empty is just going to waste time. The check, no matter how efficient, is not free.

And the information you want wouldn't be in the inode. It would be in the directory entry ("dirent"). It looks to me like the structure that provides access to the directory entry's children is intentionally not part of the public API, and you have to call readdir to get them. So the fastest way to check if a directory is empty is indeed to list it.
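
In Python, "list it" does not have to mean reading every name. A minimal sketch, assuming Python 3.6 or later, is to ask os.scandir() for just the first entry; the function name `is_empty_dir` is made up for this example.

```python
import os


def is_empty_dir(path):
    """Return True if the directory at path holds nothing besides "." and ".."."""
    with os.scandir(path) as it:
        # scandir() never yields "." or "..", so getting any entry means non-empty.
        return next(it, None) is None
```

This still opens and reads the directory, so it illustrates the point above rather than avoiding it: a scanner that is about to list the directory anyway gains nothing from calling it first.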

Last edited by dugan; 03-07-2020 at 02:25 PM.
 
1 members found this post helpful.
03-07-2020, 05:58 PM   #9
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,805

Rep: Reputation: 1206
Trace a find -empty.
I guess it does a readdir().

The . and .. links are present in a Unix-like filesystem, or if the kernel driver presents it Unix-style.
Assuming this is always true, you can see whether it has subdirectories (links > 2) or not (links = 2). But to see files you need readdir().
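
The link-count rule can be sketched in Python as below. The function name `may_have_subdirs` is hypothetical, and the rule only holds where the filesystem keeps the traditional counts; some filesystems report a different fixed value for directories, so this sketch treats anything other than exactly 2 as "unknown, go and look".

```python
import os


def may_have_subdirs(path):
    """Return False only when the link count says there are no subdirectories."""
    nlink = os.lstat(path).st_nlink
    # Traditional layout: one link from the parent, one from ".", plus one
    # ".." link for each child directory. Exactly 2 therefore means no children.
    return nlink != 2
```

A scanner that only wants directories could skip readdir() when this returns False. As the post says, it tells you nothing about regular files.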
 
03-08-2020, 10:06 AM   #10
GazL
LQ Veteran
 
Registered: May 2008
Posts: 6,901

Rep: Reputation: 5025
Example program (I was curious):
Code:
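#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main( int argc, char *argv[] )
{
    /* Start at -2 so that the "." and ".." entries are not counted. */
    int count = -2 ;
    struct dirent *dent;

    DIR *d;

    d = opendir(argv[1]);
    /* Count every entry; readdir() returns NULL at the end of the directory. */
    while ( (dent = readdir(d)) != NULL )
        count++;
    
    if ( count > 0 )
        printf("Count %d\n", count);
    else
        puts("empty");
    
    return 0;
}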
(please excuse the lack of error/argument checking, I couldn't be bothered).

strace of ./a.out /var/empty:
Code:
openat(AT_FDCWD, "/var/empty", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
brk(NULL)                               = 0x215a000
brk(0x217b000)                          = 0x217b000
getdents64(3, /* 2 entries */, 32768)   = 48
getdents64(3, /* 0 entries */, 32768)   = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}) = 0
write(1, "empty\n", 6)                  = 6
strace of ./a.out /var/tmp (somewhere that's not empty):
Code:
openat(AT_FDCWD, "/var/tmp", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
brk(NULL)                               = 0xa68000
brk(0xa89000)                           = 0xa89000
getdents64(3, /* 22 entries */, 32768)  = 712
getdents64(3, /* 0 entries */, 32768)   = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1), ...}) = 0
write(1, "Count 20\n", 9)               = 9
Conclusion:

The glibc implementation of readdir(3) uses getdents64(2) internally with a buffer size of 32768. What this essentially means is that any number of readdir(3) calls that don't exceed that buffer size will not result in any additional I/O operations or additional context switches (due to syscalls).

As shown above, there are only two getdents64() calls in both cases. It looks like you always get one additional getdents64() call when trying to read past the last directory entry with readdir(3).

Even if all the file names are approaching NAME_MAX you'd still need over a hundred of them to exceed this buffer and cause additional I/O operations, assuming the VFS cache hasn't already cached them of course, which it probably has.
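
The "over a hundred" figure can be checked with a little arithmetic. The sketch below assumes the record layout described in getdents64(2) (8-byte d_ino, 8-byte d_off, 2-byte d_reclen, 1-byte d_type, then the NUL-terminated name) and assumes each record is padded to a multiple of 8 bytes. The padding is an assumption, but it matches the 48 bytes returned for "." and ".." in the strace output above. The function names are made up for this example.

```python
HEADER = 8 + 8 + 2 + 1   # d_ino + d_off + d_reclen + d_type
ALIGN = 8                # assumed padding of each record
BUFFER = 32768           # buffer size seen in the strace output
NAME_MAX = 255           # longest file name on Linux, in bytes


def record_len(name_len):
    """Size in bytes of one directory record for a name of name_len bytes."""
    raw = HEADER + name_len + 1            # +1 for the terminating NUL
    return (raw + ALIGN - 1) // ALIGN * ALIGN


def entries_per_buffer(name_len, buffer_size=BUFFER):
    """How many records of this size fit into one getdents64() buffer."""
    return buffer_size // record_len(name_len)
```

Under these assumptions `record_len(1) + record_len(2)` gives 48, and `entries_per_buffer(NAME_MAX)` gives 117, so a directory needs more than 117 maximum-length names before a second buffer is filled.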

So, as others have said, not worth worrying about.
 
1 members found this post helpful.
03-08-2020, 05:33 PM   #11
petelq
Member
 
Registered: Aug 2008
Location: Yorkshire
Distribution: openSUSE(Leap and Tumbleweed) and a (not so) regularly changing third and fourth
Posts: 627

Rep: Reputation: Disabled
Maybe you can work with something like
Code:
if [ $(ls -A)=0 ]; then echo " empty"
else
echo "files"
fi
You could, perhaps, build in a directory variable as a parameter but the bottom line is, I think dugan is right in his post above.

Last edited by petelq; 03-08-2020 at 05:42 PM.
 
03-09-2020, 02:42 AM   #12
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,805

Rep: Reputation: 1206
Quote:
Originally Posted by petelq
Maybe you can work with something like
Code:
if [ $(ls -A)=0 ]; then echo " empty"
else
echo "files"
fi
You could, perhaps, build in a directory variable as a parameter but the bottom line is, I think dugan is right in his post above.
That needs a small correction
Code:
if [ -z "$(ls -A)" ]; then
 
03-11-2020, 02:11 AM   #13
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
^ what about https://mywiki.wooledge.org/ParsingLs ?
Maybe something like
Code:
for i in * .*; ....
would be better?

Or, how about
Code:
stat .
It offers some information that seems to hint at a directory being filled with stuff, or empty, like 'Size' or 'Blocks' or 'Links'.
 
03-11-2020, 09:25 AM   #14
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,803

Rep: Reputation: 550
Quote:
Originally Posted by syg00
How much I/O (and time) is consumed starting Python?
~15ms on my older (G3440-based) system. Looking at an empty subdirectory can't take very long.

IMHO, if directory scanning is taking too long, we need to find an easy way to detect and skip browser cache and thumbnail directories. :^)
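
Skipping directories by name is straightforward with os.walk(), which lets the caller prune its descent by editing the list of subdirectory names in place. The sketch below is only an illustration: the names in `SKIP_NAMES` are hypothetical examples, not a reliable way to detect browser caches or thumbnail directories, and `walk_skipping` is a made-up helper.

```python
import os

# Hypothetical directory names to skip; adjust for the system being scanned.
SKIP_NAMES = {".cache", ".thumbnails"}


def walk_skipping(top, skip=SKIP_NAMES):
    """Yield file paths under top without descending into skipped directories."""
    for dirpath, dirnames, filenames in os.walk(top):
        # Assigning to dirnames[:] changes the list os.walk() will descend into.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Because pruning happens before the descent, a skipped directory is never opened, so none of its contents are read.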
 
03-11-2020, 12:41 PM   #15
petelq
Member
 
Registered: Aug 2008
Location: Yorkshire
Distribution: openSUSE(Leap and Tumbleweed) and a (not so) regularly changing third and fourth
Posts: 627

Rep: Reputation: Disabled
Quote:
Originally Posted by MadeInGermany
That needs a small correction
Code:
if [ -z "$(ls -A)" ]; then
I did a brief test with "$(ls -A)=0" before my original post and it worked. But your way's good also.

Last edited by petelq; 03-11-2020 at 03:49 PM.
 
  



All times are GMT -5. The time now is 07:17 PM.
