LinuxQuestions.org


patrick295767 05-10-2012 03:56 AM

robust total size of a list of files
 
Hi,

There are different methods but what is the most robust method to make the total size of the files and/or directories from a list of files (`ls -1 `)?

Code:

ls -1 | while read -r each ; do

tmpsize=$(du -b "${each}")
totalsize=??

done

echo "The total size of the list is: $totalsize"



thanks

em31amit 05-10-2012 04:04 AM

du can also print a grand total of the file sizes.


Check the "-c" option in the du man page.
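
A minimal sketch of what that could look like, assuming GNU du (the -b option is GNU-specific) and two throwaway files with made-up names created only for the illustration:

```shell
# Create two small example files in a scratch directory (hypothetical names).
tmpdir=$(mktemp -d)
printf 'hello' > "$tmpdir/a.txt"         # 5 bytes
printf 'hello world' > "$tmpdir/b.txt"   # 11 bytes

# -c appends a final "total" line; -b reports apparent sizes in bytes (GNU du).
# tail keeps only the total line, cut keeps only the number before the tab.
total=$(du -cb -- "$tmpdir/a.txt" "$tmpdir/b.txt" | tail -n 1 | cut -f1)
echo "total: $total bytes"

rm -rf "$tmpdir"
```

Without -b, du reports disk usage in blocks rather than byte counts, so the total would not match the sum of the file sizes.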

Tinkster 05-10-2012 04:11 AM

Quote:

Originally Posted by patrick295767 (Post 4674898)
Hi,

There are different methods but what is the most robust method to make the total size of the files and/or directories from a list of files (`ls -1 `)?

Code:

ls -1 | while read -r each ; do

tmpsize=$(du -b "${each}")
totalsize=??

done

echo "The total size of the list is: $totalsize"



thanks

You really don't want to use ls for this. What's the exact
requirement of this exercise?

Nominal Animal 05-10-2012 05:27 AM

For a list of files, I'd use
Code:

stat -c '%s' FILE(s).. | awk '{ s=s+$1 } END { printf("%.0f\n", s) }'
but for complete directory trees, I'd use
Code:

find DIR(S).. -type f -printf '%s\n' | awk '{ s=s+$1 } END { printf("%.0f\n", s) }'
If you want the output in millions of bytes, with one decimal digit, use printf("%.1f MB\n", s/1000000.0)

Remember that du lists disk usage, not file sizes.
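
As a concrete sketch of the directory-tree variant, assuming GNU find (for -printf) and a scratch tree whose file names are invented for the example, including one with spaces:

```shell
# Build a small scratch tree (hypothetical names) with a nested directory.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub dir"
printf '12345' > "$tmpdir/one"                        # 5 bytes
printf '1234567' > "$tmpdir/sub dir/two with spaces"  # 7 bytes

# find prints one size per regular file; awk adds them up.
# File names never reach awk, so spaces or newlines in names are harmless.
total=$(find "$tmpdir" -type f -printf '%s\n' |
        awk '{ s += $1 } END { printf("%.0f\n", s) }')
echo "total file size: $total bytes"

rm -rf "$tmpdir"
```

Only the sizes travel through the pipe, which is what makes this approach robust against unusual file names.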

em31amit 05-10-2012 05:36 AM

@Nominal

du lists disk usage, but usage by what? By a file, of course; everything is a file in Linux. So what is the difference between the size du lists for a file and the file size shown by the stat command?

Nominal Animal 05-10-2012 06:53 AM

Quote:

Originally Posted by em31amit (Post 4674967)
du lists disk usage

Exactly. Observe:
Code:

dd if=/dev/zero of=example-file bs=1 count=1 seek=1048574
This produces a sparse file. You can read it normally, but the initial zeros you see are not stored on disk, and do not consume disk space.

So, even though ls -l example-file outputs
Code:

-rw-rw-r-- 1 nominal animal 1048575 May 10 14:43 example-file
and find example-file -printf '%s\n' and stat -c '%s' example-file both output
Code:

1048575
the actual disk usage is much less than the file size. My ext4 partition I used for this example utilizes 4k blocks, so du -hs example-file outputs
Code:

4.0K        example-file
Do you now see the difference between file sizes and disk usage?

Directories also consume disk space. The amount depends on the number of entries, the length of the file and directory names in it, and the number of POSIX and extended attributes used. If the du command you run includes the directories, the result will reflect the actual disk space used, and will be significantly different to the total size of the relevant files. You cannot even say which one (disk usage or total file size) is greater, unless you check!
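
The same comparison can be made with a single tool. The following is only a sketch, assuming GNU stat, where %s is the file size, %b the number of allocated blocks and %B the size of each of those blocks; the scratch directory is made up for the example:

```shell
# Recreate the sparse example file in a scratch directory.
tmpdir=$(mktemp -d)
f="$tmpdir/example-file"
dd if=/dev/zero of="$f" bs=1 count=1 seek=1048574 2>/dev/null

apparent=$(stat -c '%s' "$f")     # file size in bytes
blocks=$(stat -c '%b' "$f")       # number of allocated blocks
blocksize=$(stat -c '%B' "$f")    # size in bytes of each block counted by %b
allocated=$((blocks * blocksize)) # disk usage in bytes

echo "file size:  $apparent bytes"
echo "disk usage: $allocated bytes"

rm -rf "$tmpdir"
```

On a filesystem that supports sparse files the second number should be far smaller than the first; the exact value depends on the filesystem and its block size.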

grail 05-10-2012 07:23 AM

Quote:

echo "The total size of the list is: $totalsize"
Someone should also point out that this line will never work as expected: the while loop is on the receiving end of a pipe, so it runs in a sub-shell, and any changes made to variables inside the loop are not visible outside the loop.
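
One possible way around it, sketched here under the assumption that the list is simply the contents of a directory and that GNU stat is available: loop over a glob instead of piping ls into while, so the loop body runs in the current shell. The scratch directory and file names are invented for the example.

```shell
# Scratch directory with two example files (hypothetical names).
tmpdir=$(mktemp -d)
printf 'abc' > "$tmpdir/first file"   # 3 bytes
printf 'defgh' > "$tmpdir/second"     # 5 bytes

totalsize=0
# No pipe here, so the loop runs in the current shell and
# totalsize keeps its value after "done".
for each in "$tmpdir"/*; do
    [ -f "$each" ] || continue            # skip anything that is not a regular file
    size=$(stat -c '%s' -- "$each")       # file size in bytes (GNU stat)
    totalsize=$((totalsize + size))
done

echo "The total size of the list is: $totalsize"

rm -rf "$tmpdir"
```

In bash, feeding the loop with process substitution (done < <(command)) instead of a pipe is another option that keeps the loop in the current shell.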

millgates 05-10-2012 07:24 AM

Quote:

Originally Posted by Nominal Animal (Post 4675006)
Exactly. Observe:
Code:

dd if=/dev/zero of=example-file bs=1 count=1 seek=1048574
This produces a sparse file. You can read it normally, but the initial zeros you see are not stored on disk, and do not consume disk space.

So, even though ls -l example-file outputs
Code:

-rw-rw-r-- 1 nominal animal 1048575 May 10 14:43 example-file
and find example-file -printf '%s\n' and stat -c '%s' example-file both output
Code:

1048575
the actual disk usage is much less than the file size. My ext4 partition I used for this example utilizes 4k blocks, so du -hs example-file outputs
Code:

4.0K        example-file
Do you now see the difference between file sizes and disk usage?

Directories also consume disk space. The amount depends on the number of entries, the length of the file and directory names in it, and the number of POSIX and extended attributes used. If the du command you run includes the directories, the result will reflect the actual disk space used, and will be significantly different to the total size of the relevant files. You cannot even say which one (disk usage or total file size) is greater, unless you check!

That is correct. However, look at the -b and --apparent-size switches for du which will show the correct result.

bigearsbilly 05-10-2012 08:50 AM

What about?

Code:

cat * | wc -c

well typically to compare directories I do

Code:

cat A/* | cksum
cat B/* | cksum

but I am
a) lazy
b) a guru
c) lateral thinking

not sure which

Nominal Animal 05-10-2012 10:37 AM

Quote:

Originally Posted by millgates (Post 4675027)
That is correct. However, look at the -b and --apparent-size switches for du which will show the correct result.

Only if the parameter list does not refer to any directories. When using
Code:

du -b --apparent-size ... DIRECTORY ...
the result includes the size of the directories, not just the files. It is therefore not the total size of the files, as specified in the title of this thread; it yields the total size of the specified files and directories.
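
A small sketch of that difference, assuming GNU du and GNU find and a scratch tree invented for the example; the first number counts the directory itself, the second only the regular files:

```shell
# Scratch tree (hypothetical names): one directory holding one 10-byte file.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/tree"
printf '0123456789' > "$tmpdir/tree/data"

# du -sb: apparent size of the files AND of the directory entry itself.
with_dirs=$(du -sb -- "$tmpdir/tree" | cut -f1)

# find + awk: sizes of regular files only.
files_only=$(find "$tmpdir/tree" -type f -printf '%s\n' |
             awk '{ s += $1 } END { printf("%.0f\n", s) }')

echo "du -sb (files and directories):  $with_dirs"
echo "find + awk (regular files only): $files_only"

rm -rf "$tmpdir"
```

How much the directory adds depends on the filesystem, so the first number is not predictable in general.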

millgates 05-10-2012 11:41 AM

Quote:

Originally Posted by Nominal Animal (Post 4675180)
Only if the parameter list does not refer to any directories. When using
Code:

du -b --apparent-size ... DIRECTORY ...
the result includes the size of the directories, not just the files. It is therefore not the total size of the files, as specified in the title of this thread; it yields the total size of the specified files and directories.

Isn't a directory also a file (everything's a file in unix)? By the way, the original question was (emphasis added):

Quote:

There are different methods but what is the most robust method to make the total size of the files and/or directories from a list of files (`ls -1 `)?
if you don't want directories, you can use find . -type f or whatever.

Nominal Animal 05-10-2012 12:32 PM

Quote:

Originally Posted by millgates (Post 4675234)
Isn't a directory also a file

There are enough differences that make the distinction meaningful.

Directory size is almost always an integral number of disk allocation blocks, making it useless for detecting e.g. additions or removals. (Note: Symlink size is almost always the length of the symlink target path.)

Directories cannot be hardlinked, only files can. (You can, however, bind-mount directories and files over other directories and files, so only the "top one" is accessible.)

Execute right to a directory allows you to stat() the directory and any file you know the name of in that directory. (Without read rights, you cannot scan the directory, so I guess read and write rights are analogous between files and directories.)

For mount points, /path/to/directory and its parent /path/to/directory/.. normally have different device numbers.
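
The note about sizes above can be checked quickly. This is only a sketch, assuming GNU stat (which does not follow symlinks unless -L is given); the link name and target are made up and the target does not need to exist:

```shell
# Scratch directory with a dangling symlink (hypothetical names).
tmpdir=$(mktemp -d)
ln -s 'some/target' "$tmpdir/link"

# Size of the symlink itself: normally the length of the target path.
linksize=$(stat -c '%s' "$tmpdir/link")
# Size of the directory: filesystem dependent, not a sum of its contents.
dirsize=$(stat -c '%s' "$tmpdir")

echo "symlink size:   $linksize"
echo "directory size: $dirsize"

rm -rf "$tmpdir"
```

The target string 'some/target' is 11 characters long, which is what the symlink size normally reports.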

patrick295767 05-20-2012 01:57 PM

thank you so so so much !!!!!

I like the --apparent-size way, and also the find approach from unspawn.

