Linux - General
This Linux forum is for general Linux questions and discussion. If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
I have a list of files in a text document and I'd like to find out how much space all of those files use.
I was going through the list with a shell script and accumulating their file sizes, but I ran into a 2GB limit with the shell script. I know I could rewrite the script in Python or Perl, but before I go reinventing the wheel, is there some way to du only those files?
The man page for du says I can:
--files0-from=F
summarize disk usage of the NUL-terminated file names specified in file F
but it doesn't seem to work for more than one file in the given text doc. Is there some special way to null-terminate the file names in the text doc? I've tried one file per line, enclosing file names in quotes, adding '\0' at the end of the file name... nothing seems to work.
Maybe this can give You a clue? Given that Your filelist has newline-separated filenames (including those pesky spaces in filenames), replace \n with \0 using tr and tell du to read the list from stdin. Some creative use of cut could sum up the first column for You as well (it is tab-separated).
Quote:
lakris@ubuntu:~/projekt/scripts$ find /home/share/musik/Alice\ In\ Chains\ -\ Facelift/ -type f>filelist
lakris@ubuntu:~/projekt/scripts$ cat filelist
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 09 Put You Down.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 03 Sea Of Sorrow.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 12 - Real Thing.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 10 - Confusion.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 11 - I Know Somethin (Bout You).mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 05 I Can't Remember.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 07 It Ain't Like That.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 01 We Die Young.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 04 Bleed The Freak.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 08 Sunshine.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 02 Man In The Box.mp3
/home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 06 Love, Hate, Love.mp3
lakris@ubuntu:~/projekt/scripts$ cat filelist|tr "\n" "\0"| du --files0-from=-
3072 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 09 Put You Down.mp3
5468 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 03 Sea Of Sorrow.mp3
3808 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 12 - Real Thing.mp3
5388 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 10 - Confusion.mp3
4100 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 11 - I Know Somethin (Bout You).mp3
3480 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 05 I Can't Remember.mp3
4344 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 07 It Ain't Like That.mp3
2384 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 01 We Die Young.mp3
3776 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 04 Bleed The Freak.mp3
4468 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 08 Sunshine.mp3
4480 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 02 Man In The Box.mp3
6044 /home/share/musik/Alice In Chains - Facelift/Alice In Chains - Facelift - 06 Love, Hate, Love.mp3
lakris@ubuntu:~/projekt/scripts$
/Lakris
PS Just thought of something: if Your filelist is very long, You may have to use xargs because the list of arguments grows too big. It often happens when trying to operate on entire directory trees, with tar for example.
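To make that concrete, here is a minimal sketch of the xargs approach. The temporary directory and file names are invented purely for the example:

```shell
# Build a throwaway two-file list (hypothetical names, for illustration only)
tmpdir=$(mktemp -d)
printf 'data\n' > "$tmpdir/one.txt"
printf 'more data\n' > "$tmpdir/two.txt"
printf '%s\n' "$tmpdir/one.txt" "$tmpdir/two.txt" > "$tmpdir/filelist"

# xargs -0 reads NUL-delimited names and batches them, so the argument
# list passed to du never exceeds the kernel's ARG_MAX limit, no matter
# how long filelist gets.
tr '\n' '\0' < "$tmpdir/filelist" | xargs -0 du

rm -rf "$tmpdir"
```

One caveat: if the list is long enough that xargs splits it into several batches, du runs once per batch, so any per-invocation total (e.g. du -c) would be per batch, not a grand total.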
The problem with both of these methods is that they get the disk usage of each file individually. I want a summary of all files, i.e. "du -sh <list of files>", with only one line of output: the total disk usage of all of the files in the list.
Yes, and as I said, You can use more piping to extract the size, sum it up and display it, I'll even do it for You:
Quote:
lakris@ubuntu:~/projekt/scripts$ cat filelist | tr "\n" "\0" | du -b --files0-from=- | (sum=0; x=0; while read line; do x=`echo $line|cut -d" " -f1`; sum=$(($x+$sum)); done; echo $sum)
51912254
I love the *nix tools!
PS Maybe You meant that du itself could do the summation, but I tried it and couldn't get it to work; maybe it can't do that with files from a list...
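For what it's worth, GNU du does have a -c/--total flag that appends a single grand-total line, and in my reading of the coreutils documentation it combines with --files0-from. A sketch, with made-up file names standing in for a real list:

```shell
# Throwaway files and list (hypothetical names, for illustration only)
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/a.txt"
printf 'some longer content\n' > "$tmpdir/b file.txt"   # a name with spaces
printf '%s\n' "$tmpdir/a.txt" "$tmpdir/b file.txt" > "$tmpdir/filelist"

# -c appends one "total" line to du's output; tail -n 1 keeps only that
# summary line, giving a single grand total for the whole list.
tr '\n' '\0' < "$tmpdir/filelist" | du -ch --files0-from=- | tail -n 1

rm -rf "$tmpdir"
```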
Heads up: "du" gives you disk space, not file size. Small files take a certain minimum disk space. (Is this different for different file systems?) Depending on what you are using the data for, you could get misleading numbers.
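The difference is easy to see with a one-byte file: du -b reports the apparent size, while plain du reports allocated space, which is rounded up to whole filesystem blocks (often 4 KiB on ext4, but this does vary by filesystem):

```shell
tmpdir=$(mktemp -d)
printf 'x' > "$tmpdir/tiny"     # a file containing a single byte

du -b "$tmpdir/tiny"    # apparent size in bytes: 1
du -k "$tmpdir/tiny"    # allocated space in KiB; often 4 on ext4, filesystem-dependent

rm -rf "$tmpdir"
```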
Thanks for the heads up, but I'm aware of that. disk usage is actually better than file size for the purpose of this script, but I'd take either.
I've since rewritten the script in Python, but I would sure like to know if it's possible to get a summary of disk usage of a list of files with du.
Someone already said to run du on the list and then add up the numbers. Beyond that, the only way to get du to do it is to put them all in a folder.
du <foldername> gives the total of everything in the folder.
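That is, with -s (summarize) du collapses the whole tree into a single line. A quick sketch with a throwaway folder:

```shell
tmpdir=$(mktemp -d)
printf 'a\n' > "$tmpdir/f1"
printf 'bb\n' > "$tmpdir/f2"

# -s prints one total line for everything under the directory
du -s "$tmpdir"

rm -rf "$tmpdir"
```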
Quote:
Yes, and as I said, You can use more piping to extract the size, sum it up and display it, I'll even do it for You:
I love the *nix tools!
PS Maybe You meant that du itself could do the summarisation but I tried it and couldn't get it to work, maybe it can't do that with files from list...
I missed this one before.
I'll just add that the reason I was looking to du in the first place is that summing up file sizes in a shell script has a 2GB limit, and almost all of my searches come to more than 2GB. Well... 2GB with tcsh; I don't know about bash (and yes, I know the pros and cons of bash vs tcsh).
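One way around shell integer limits is to let awk do the addition: awk accumulates in double-precision floating point, so the sum isn't capped at 2^31-1 the way 32-bit shell arithmetic can be. A sketch, with file names invented for the example:

```shell
# Two files of known size and a list of their names (hypothetical paths)
tmpdir=$(mktemp -d)
head -c 1024 /dev/zero > "$tmpdir/a.bin"   # exactly 1 KiB
head -c 2048 /dev/zero > "$tmpdir/b.bin"   # exactly 2 KiB
printf '%s\n' "$tmpdir/a.bin" "$tmpdir/b.bin" > "$tmpdir/filelist"

# du -b prints apparent size in bytes; awk sums column 1 and prints the total.
tr '\n' '\0' < "$tmpdir/filelist" | du -b --files0-from=- | awk '{s += $1} END {print s}'   # prints 3072

rm -rf "$tmpdir"
```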