LinuxQuestions.org


svenxix 09-14-2012 02:30 PM

How does ls calculate file size
 
The ls command can display the file size with the -s or -l switches. Where does it read this information from?

I know the du command recalculates file size by reading the file directly. Does the ls command do this or does it read the file size from the inode?

tronayne 09-14-2012 02:55 PM

I'm 99-44/100% sure it uses stat(); see man 2 stat.

Hope this helps some.

jefro 09-14-2012 08:42 PM

I get the feeling ls only reads the i-node or file allocation table and does not compute sizes. I could be way wrong on that.

syg00 09-14-2012 09:36 PM

strace might be instructive - especially a comparison of "ls" versus "ls -l".

tronayne 09-15-2012 08:04 AM

The stat, fstat and lstat calls all return information about a file, including its total size in bytes (see the structure definition below). The kernel fills this in from the file's inode, so no calculation needs to take place.

man 2 stat displays the details along with the structure definition:
Quote:

struct stat {
dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* inode number */
mode_t st_mode; /* protection */
nlink_t st_nlink; /* number of hard links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
dev_t st_rdev; /* device ID (if special file) */
off_t st_size; /* total size, in bytes */
blksize_t st_blksize; /* blocksize for file system I/O */
blkcnt_t st_blocks; /* number of 512B blocks allocated */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last status change */
};
The stat library functions are used extensively by many utilities, among others the stat utility (display file or file system status), cp (copy files) and ls (I'm pretty sure).

Below is a demonstration program, status.c, that displays all the information the stat structure holds about one file. It's only a demo, so it's not terribly useful for general purposes:
Code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <grp.h>
#include <pwd.h>
#include <time.h>

int main (int argc, char *argv[])
{
        time_t        now;
        struct        group        *grentry;
        struct        passwd        *pwentry;
        struct        stat        status;
        struct        tm        *tstr;

        if (argc != 2) {
                (void) fprintf (stderr, "usage:\t%s file\n", argv [0]);
                return (1);
        }
        /*        get stat structure for specified file; lstat() is used
                rather than stat() so that a symbolic link is reported
                as a link and not as the file it points to        */
        if (lstat (argv [1], &status) == -1) {
                (void) fprintf (stderr, "%s:\tcan't stat %s\n",
                    argv [0], argv [1]);
                return (2);
        }
        /*        display the type of file; the S_IS*() macros compare the
                whole file-type field, which a bitwise AND with S_IFREG
                and friends does not (their bit patterns overlap)        */
        if (S_ISREG (status.st_mode))
                (void) fprintf (stdout, "Regular File:\t");
        else if (S_ISDIR (status.st_mode))
                (void) fprintf (stdout, "Directory:\t");
        else if (S_ISFIFO (status.st_mode))
                (void) fprintf (stdout, "FIFO Special File:\t");
        else if (S_ISCHR (status.st_mode))
                (void) fprintf (stdout, "Character Special File:\t");
        else if (S_ISBLK (status.st_mode))
                (void) fprintf (stdout, "Block Special File:\t");
        else if (S_ISLNK (status.st_mode))
                (void) fprintf (stdout, "Symbolic Link:\t");
        else
                (void) fprintf (stdout, "Unknown File:\t");
        (void) fprintf (stdout, "%s\n", argv [1]);
        /*        inode numbers may not fit in an int; mode reads best in octal        */
        (void) fprintf (stdout, "Inode:\t\t%llu\n",
            (unsigned long long) status.st_ino);
        (void) fprintf (stdout, "Mode:\t\t%o\n", (unsigned int) status.st_mode);
        (void) fprintf (stdout, "Device:\t\t%d\n", (int) status.st_dev);
        (void) fprintf (stdout, "Blocksize:\t%d\n", (int) status.st_blksize);
        (void) fprintf (stdout, "Links:\t\t%d\n", (int) status.st_nlink);
        (void) fprintf (stdout, "Owner (%d):\t", (int) status.st_uid);
        /*        look up owner in /etc/passwd        */
        if ((pwentry = getpwuid (status.st_uid)) == (struct passwd *) NULL) {
                (void) fprintf (stdout, "not found\n");
        } else {
                (void) fprintf (stdout, "%s\n", pwentry->pw_name);
        }
        /*        look up group name in /etc/group        */
        (void) fprintf (stdout, "Group (%d):\t", (int) status.st_gid);
        if ((grentry = getgrgid (status.st_gid)) == (struct group *) NULL) {
                (void) fprintf (stdout, "not found\n");
        } else {
                (void) fprintf (stdout, "%s\n", grentry->gr_name);
        }
        /*        off_t can exceed the range of int, so print it as long long        */
        (void) fprintf (stdout, "Size:\t\t%lld\n", (long long) status.st_size);
        /*        tm_year counts years since 1900, so add 1900 for display        */
        now = status.st_atime;
        tstr = localtime (&now);
        (void) fprintf (stdout, "atime:\t\t%02d:%02d:%02d %02d/%02d/%04d\t",
            tstr->tm_hour, tstr->tm_min, tstr->tm_sec,
            tstr->tm_mon+1, tstr->tm_mday, tstr->tm_year+1900);
        (void) fprintf (stdout, "(Last Access)\n");
        now = status.st_ctime;
        tstr = localtime (&now);
        (void) fprintf (stdout, "ctime:\t\t%02d:%02d:%02d %02d/%02d/%04d\t",
            tstr->tm_hour, tstr->tm_min, tstr->tm_sec,
            tstr->tm_mon+1, tstr->tm_mday, tstr->tm_year+1900);
        (void) fprintf (stdout, "(Last File Status Change)\n");
        now = status.st_mtime;
        tstr = localtime (&now);
        (void) fprintf (stdout, "mtime:\t\t%02d:%02d:%02d %02d/%02d/%04d\t",
            tstr->tm_hour, tstr->tm_min, tstr->tm_sec,
            tstr->tm_mon+1, tstr->tm_mday, tstr->tm_year+1900);
        (void) fprintf (stdout, "(Last Data Modification)\n");
        return (0);
}

If you save the above as, say, status.c, you can build it with either of
Code:

make status
or
cc -o status status.c

Usage is
Code:

./status file.name
or
for FILE in *
do
    ./status "${FILE}"
done

It's just a demo, not a full-blown utility.

Hope this helps some.

SecretCode 09-15-2012 09:36 AM

Quote:

Originally Posted by svenxix (Post 4780538)
I know the du command recalculates file size by reading the file directly. Does the ls command do this or does it read the file size from the inode?

Where did you read this? And what does it mean? The file itself is just a stream. If the inode is "wrong" about the file size, there's nothing in the file that gives any better information.

(I'm including the inode pointer blocks in what I mean by the inode, even though some of them can be indirect. They're still part of the metadata not part of the file.)

((OK, now I think I know what you mean. If the file is sparse (or in some other cases) it may use less disk space, that is, fewer physical disk blocks, than its apparent size. The space used, as reported by du (and the Blocks: part of stat), is the combined size of all the disk blocks it uses ... but this is still recorded in the inode and indirect metadata, not in the "file itself". ls -l and the Size: part of stat report the apparent size, read from a single field in the inode. For most files, this will be a bit smaller than the space used, because the last disk block is usually not full. Either way, these are two different measurements; one is not more accurate than the other.))

SecretCode 09-15-2012 09:50 AM

For a pathological case, try
Code:

dd of=takes-no-space bs=1 count=0 seek=100G
Then
Code:

ls -lh takes-no-space
du takes-no-space
stat takes-no-space

ls says it's 100G in size; du says it takes no space; stat says both are correct. :)

jefro 09-15-2012 12:16 PM

The file actually is taking that space on the drive. That space is reserved and can't be used so I guess we get to the question of what is the size. Size on disk or actual size of file.

SecretCode 09-15-2012 12:33 PM

@jefro: if you're replying to me, no it isn't! It's not taking any space; no disk blocks are reserved.

SecretCode 09-15-2012 12:38 PM

10 terabytes works too. I promise I don't have that much disk.

Code:

$ dd of=takes-no-space bs=1 count=0 seek=10T
0+0 records in
0+0 records out
0 bytes (0 B) copied, 1.776e-05 s, 0.0 kB/s
[joe@sourdust: ~/tests] Sat Sep 15 18:36:48
$ ls -lh takes-no-space
-rw-rw-r-- 1 joe joe 10T 2012-09-15 18:36 takes-no-space

When I try 16T I get an error,
Code:

dd: failed to truncate to 17592186044416 bytes in output file `takes-no-space': File too large
which I guess is a file system limit.

colucix 09-15-2012 12:50 PM

Quote:

Originally Posted by svenxix (Post 4780538)
The ls command can display the file size with the -s or -l switches. Where does it read this information from?

Please take a look at the source code of the ls command (from the latest GNU coreutils, 8.19):
Code:

char const *size =
        (! f->stat_ok
        ? "?"
        : human_readable (unsigned_file_size (f->stat.st_size),
                          hbuf, file_human_output_opts, 1,
                          file_output_block_size));

Indeed, it simply formats the st_size field of a struct stat that was filled in earlier by a call from the stat family, as previously mentioned.


All times are GMT -5.