Disk Top Ten (dtt) release announcements and discussion

metaed · 01-24-2024, 05:15 PM

dtt is a better "where's my disk space" utility. It quickly summarizes filesystem usage into a "top ten" list of largest subtrees or individual objects. Its output has the unique property that it shows the ten largest single subtrees found at any depth without reporting redundant results. In other words, all the really big chunks float to the top. Sample output:

Code:


1076. $ dtt -h /usr
            9.7G (grand total - physical size - customary binary units)
              1G /usr/bin
            589M /usr/doc/rust-1.58.1/html
            537M /usr/include
            450M /usr/share/locale
            368M /usr/share/texmf-dist
            318M /usr/lib64/python3.9
            291M /usr/libexec/gcc/x86_64-slackware-linux/11.2.0
            256M /usr/share/fonts/TTF
            250M /usr/lib64/thunderbird
            249M /usr/lib64/firefox

1077. $

Subscribe to this thread for release announcements such as the one below, and for feature requests and general discussion about the tool.

New in 2.1 alpha (release candidate):

added --inodes option
made option parsing errors fatal (thank you 0XBF)
added support for Perl 5.38.2
made lots of clarifying edits to the manuals

Release 2.1 alpha source package, Slackware 15.0 slackbuild, manuals, and changelog are here: https://metaed.com/papers/dtt/. (If you cannot reach the site, you are connecting from a country that is geoblocked because of a high attack rate. At the moment that's CN, VN, RU, HK, SG, IN, KR, TW, BR, JP, and ID. Let me know which country and I will lower shields for you.)

For older discussion, see also the version 1.0 release announcement and the version 2.0 release announcement threads.

Credit goes to Windu for the suggestion to start a megathread.

babydr · 01-24-2024, 09:57 PM

Hello metaed, it seems kind of odd that using a set -B size would report file sizes of '0'.
I can understand why; it just seems odd that the report would not represent them with 0.4 or 0.5.
HTH, JimL
For example:

Code:

./usr/bin/dtt -B1M /home/vaxuser/VAX-DISKS
           13197 (grand total - physical size 1048576B-blocks)
            4092 /home/vaxuser/VAX-DISKS/vaxX-vms-rq0.dsk
            4092 /home/vaxuser/VAX-DISKS/vaxX-vms-rq3.dsk
            2927 /home/vaxuser/VAX-DISKS/VAX1-vms-rq0.dsk
            1436 /home/vaxuser/VAX-DISKS/vaxX-vms-rq1.dsk
             650 /home/vaxuser/VAX-DISKS/HOBBYISTV1.IMG
               0 /home/vaxuser/VAX-DISKS/VaxServer-ka655x.bin
               0 /home/vaxuser/VAX-DISKS/VaxStation-ka655x.bin
               0 /home/vaxuser/VAX-DISKS/ka655x.bin
               0 /home/vaxuser/VAX-DISKS/ka750_new.bin
               0 /home/vaxuser/VAX-DISKS/vax1-nvram.bin

An "ls -al" listing of the directory in question:

Code:

$ ls -al /home/vaxuser/VAX-DISKS
total 13514008
      4 drwxr-xr-x  2 vaxuser users       4096 Aug 13  2018 ./
      4 drwxr-x--x 22 vaxuser users       4096 Mar 25  2023 ../
 665908 -rw-r-----  1 vaxuser users  681885696 Jan 27  2018 HOBBYISTV1.IMG
2997104 -rw-r--r--  1 vaxuser users 4290601472 May 30  2022 VAX1-vms-rq0.dsk
      0 -rw-r--r--  1 vaxuser users          0 Mar 28  2018 VAX1-vms-rq1.dsk
    128 -rw-r--r--  1 vaxuser users     131072 Mar 21  2023 VaxServer-ka655x.bin
    128 -rw-r--r--  1 vaxuser users     131072 Mar 21  2023 VaxStation-ka655x.bin
    128 -rw-r--r--  1 vaxuser users     131072 Mar 21  2023 ka655x.bin
      4 -rw-r--r--  1 vaxuser users       1024 Apr 15  2018 ka750_new.bin
      4 -rw-r--r--  1 vaxuser users       1024 May 30  2022 vax1-nvram.bin
      4 -rw-r--r--  1 vaxuser users       1342 Apr 15  2018 vax1.ini
      4 -rw-r--r--  1 vaxuser users       1024 Mar 26  2023 vaxX-nvram.bin
4190044 -rw-r--r--  1 vaxuser users 4290600960 Jan 29  2019 vaxX-vms-rq0.dsk
1470480 -rw-r--r--  1 vaxuser users 1505766912 Apr 15  2018 vaxX-vms-rq1.dsk
4190044 -rw-r--r--  1 vaxuser users 4290600960 Oct 30  2018 vaxX-vms-rq3.dsk
      4 -rw-r--r--  1 vaxuser users       1516 Mar 26  2023 vaxX.ini
      4 -rw-r--r--  1 vaxuser users       1024 May 25  2018 vaxY-nvram.bin
      4 -rw-r--r--  1 vaxuser users       1400 May 23  2018 vaxY.ini
      4 -rw-r--r--  1 vaxuser users       1024 Jan 29  2019 vaxZ-nvram.bin
      4 -rw-r--r--  1 vaxuser users       2121 Dec 27  2018 vaxZ.ini

Windu · 01-25-2024, 01:10 AM

Could you please stop opening new threads every time you make a code change? It's just silly.
You can dedicate a single topic to that software of yours and then keep updating the first post with new information.

metaed · 01-25-2024, 12:45 PM

Quote:

Originally Posted by babydr

Seems kind of odd that using a set -B size would report file sizes of '0'.
I can understand why; it just seems odd that the report would not represent them with 0.4 or 0.5.
HTH, JimL

Thank you for being interested enough to ask the question. The answer is that I deliberately pattern dtt's behavior after df and du, except where I feel strongly that the legacy behavior interferes with dtt's primary job. du -B reports whole numbers, so dtt -B reports whole numbers. But where du -B rounds up, dtt -B rounds to the nearest integer. This is where I felt strongly enough to make a change, because the legacy behavior is misleading. Your 131072-byte file is a case in point. On a volume having a blocksize of 4096 bytes, it reserves 32 blocks (131072 bytes) on the volume. When the reporting unit of measure is -B1M (binary megabytes), du reports the file reserves ~1 megabyte on the volume, and this is misleading. The number reported by dtt, ~0 megabytes, is much closer to the actual 0.125 megabytes reserved.
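
To make the difference concrete, here is a minimal Python sketch of the two rounding rules described above, applied to the 131072-byte file. The function names are made up for illustration, and this is not the actual code of du or dtt. In particular, how dtt breaks an exact tie is an assumption (round half up here).

```python
MIB = 1024 * 1024  # the -B1M reporting unit, in bytes


def blocks_round_up(nbytes, block=MIB):
    """Round up to the next whole block (the du -B behavior described above)."""
    return -(-nbytes // block)  # integer ceiling division


def blocks_round_nearest(nbytes, block=MIB):
    """Round to the nearest whole block (the dtt -B behavior described above).

    Assumption: exact halves round up; the thread does not say how dtt
    treats ties.
    """
    return (2 * nbytes + block) // (2 * block)


reserved = 32 * 4096  # 32 blocks of 4096 bytes = 131072 bytes
print(blocks_round_up(reserved))       # 1
print(blocks_round_nearest(reserved))  # 0
```

The file really occupies 0.125 of a megabyte, so rounding to the nearest integer lands on 0 and rounding up lands on 1.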

metaed · 01-25-2024, 12:52 PM

Quote:

Originally Posted by Windu

You can dedicate a single topic to that software of yours and then keep updating the first post with new information.

Actually, I can't do that. LQ freezes any post, including the first post of a thread, after some time; I believe it is six months. Otherwise it's a good suggestion. Maybe I can rewrite the first post of this thread as a more release-independent intro, and then post new releases as replies beneath it. That is, if I get busy on it before the six-month timer expires.

pan64 · 01-25-2024, 12:55 PM

Quote:

Originally Posted by metaed

When the reporting unit of measure is -B1M (binary megabytes), du reports the file reserves ~1 megabyte on the volume, and this is misleading. The number reported by dtt, ~0 megabytes, is much closer to the actual 0.125 megabytes reserved.

When all your file sizes are rounded to 0 you may think your disk is empty and in the meantime probably your disk is full.

metaed · 01-25-2024, 01:21 PM

Quote:

Originally Posted by pan64

When all your file sizes are rounded to 0 you may think your disk is empty and in the meantime probably your disk is full.

Or you could use du, and when all your file sizes are rounded up to the next 1 megabyte, you may think your disk is full and in the meantime probably it's not.

The serious reply is, dtt subtotals and grand totals are calculated before rounding, so contributions of small files are never ignored. But also, normally you would use dtt to investigate a mount-point or a complex directory subtree such as /usr, not a single directory with 18 files in it. babydr picked a great test case for demonstrating and asking a specific question about dtt's rounding behavior. But it's not how you would use the tool. In a practical scenario you would not expect to see an output row rounded to 0.
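
A small Python sketch of why the order matters. The 1000 files of 128 KiB each are a hypothetical example, not taken from anyone's disk, and the helper is an illustration rather than dtt's code (it assumes ties round up).

```python
MIB = 1024 * 1024


def round_nearest(nbytes, block=MIB):
    """Round a byte count to the nearest whole block (ties round up)."""
    return (2 * nbytes + block) // (2 * block)


# Hypothetical directory: 1000 small files, 128 KiB each.
sizes = [128 * 1024] * 1000

# Total first, then round: the small files still add up.
total_then_round = round_nearest(sum(sizes))

# Round each file first, then total: every file has already become 0.
round_then_total = sum(round_nearest(s) for s in sizes)

print(total_then_round)  # 125
print(round_then_total)  # 0
```

Each file on its own rounds to 0 at -B1M, but a subtotal calculated before rounding still reports the 125 megabytes they occupy together.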

pan64 · 01-26-2024, 01:08 AM

Quote:

Originally Posted by metaed

Or you could use du, and when all your file sizes are rounded up to the next 1 megabyte, you may think your disk is full and in the meantime probably it's not.

Obviously I don't want to use other tools; that would mean this one is not good enough.

Quote:

Originally Posted by metaed

The serious reply is, dtt subtotals and grand totals are calculated before rounding, so contributions of small files are never ignored. But also, normally you would use dtt to investigate a mount-point or a complex directory subtree such as /usr, not a single directory with 18 files in it.

When you download, for example, the Linux kernel (tgz) and unpack it, you will get millions of files, most probably all of them small (a few KB). Using 1M as the block size will cause what babydr explained: you have dirs (a lot), but it looks like all the files are empty everywhere. That is just odd, not a problem at all. Most probably you ought to use a sign like . or o instead of 0 to show it is not exactly 0, just rounded. (or 0+).

metaed · 01-26-2024, 01:40 PM

Quote:

Originally Posted by pan64

When you download, for example, the Linux kernel (tgz) and unpack it, you will get millions of files, most probably all of them small (a few KB). Using 1M as the block size will cause what babydr explained: you have dirs (a lot), but it looks like all the files are empty everywhere

dtt's one job is to float the biggest chunks of disk usage to the top of its output, whether that's files or directory subtrees. So no, it won't clutter the report with individual little files. Here's how that actually looks:

Code:


963. $ wget --quiet https://mirrors.slackware.com/slackware/slackware64-15.0/patches/source/linux-5.15.145/linux-5.15.145.tar.xz

964. $ tar xf linux-5.15.145.tar.xz 

965. $ dtt -B1M linux-5.15.145
            1217 (grand total - physical size 1048576B-blocks)
             137 linux-5.15.145/arch
              69 linux-5.15.145/drivers/gpu/drm/amd/include/asic_reg/nbio
              65 linux-5.15.145/drivers/net/ethernet
              59 linux-5.15.145/drivers/gpu/drm/amd/include/asic_reg/dcn
              57 linux-5.15.145/Documentation
              48 linux-5.15.145/include
              48 linux-5.15.145/tools
              45 linux-5.15.145/fs
              43 linux-5.15.145/sound
              42 linux-5.15.145/drivers/net/wireless

966. $
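
For anyone curious how a report can list big chunks at any depth without redundant rows, here is one possible approach, sketched in Python over a small in-memory tree. It is an assumption made for illustration and not dtt's actual algorithm. The rule used: a directory is replaced by its children only when at least one child is big enough to rank above the current last row; otherwise the directory stays in the report as a single chunk. The tree, names, and sizes are hypothetical.

```python
def size(node):
    """Total size of a file (int) or of a directory (dict of children)."""
    if isinstance(node, int):
        return node
    return sum(size(child) for child in node.values())


def top_chunks(root_name, root, n=10):
    """Return up to n (path, size) rows, none of which contains another."""
    entries = [(f"{root_name}/{name}", node) for name, node in root.items()]
    while True:
        entries.sort(key=lambda e: size(e[1]), reverse=True)
        top = entries[:n]
        # Size a child must beat to earn a row of its own.
        cutoff = size(top[-1][1]) if len(top) == n else 0
        for i, (path, node) in enumerate(top):
            if isinstance(node, dict) and node:
                if max(size(c) for c in node.values()) > cutoff:
                    # Split: the parent row is replaced by its children,
                    # so a parent and its child are never both listed.
                    entries[i:i + 1] = [
                        (f"{path}/{k}", c) for k, c in node.items()
                    ]
                    break
        else:
            # Nothing left worth splitting.
            return [(path, size(node)) for path, node in top]


# Hypothetical tree: ints are file sizes, dicts are directories.
tree = {
    "bin": 100,
    "share": {"locale": 60, "fonts": 50, "misc": 5},
    "lib": {"a": 10, "b": 10, "c": 10},
}

for path, total in top_chunks("usr", tree, n=4):
    print(total, path)
```

In this toy run, usr/share is split because locale and fonts are big enough to earn their own rows, while usr/lib stays whole because none of its children would make the list.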

metaed · 01-26-2024, 02:14 PM

Quote:

Originally Posted by pan64

Most probably you ought to use a sign like . or o instead of 0 to show it is not exactly 0, just rounded. (or 0+).

I think you are looking for human-friendly output. In that case you would use the option --human-readable instead of -B. When you use --human-readable, the question doesn't come up. This is because --human-readable already varies the reporting unit of measure automatically from bytes, to kilobytes, megabytes, gigabytes, etc. and also adds a decimal place where appropriate. So small files, if they get reported, get reported without rounding to 0.
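
As a rough illustration of that kind of scaling, here is a minimal Python sketch. It is not dtt's formatter and does not try to reproduce its exact output; the function name, the unit letters, and the rule "show one decimal place below 10" are all assumptions.

```python
def human(nbytes):
    """Format a byte count with a binary unit suffix.

    Assumed rule: scaled values below 10 get one decimal place,
    larger values are shown as whole numbers.
    """
    units = ["B", "K", "M", "G", "T", "P"]
    value = float(nbytes)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    if unit != "B" and value < 10:
        return f"{value:.1f}{unit}"
    return f"{round(value)}{unit}"


print(human(131072))      # 128K
print(human(1342))        # 1.3K
print(human(1505766912))  # 1.4G
```

With the unit chosen per row, the 131072-byte file shows up as 128K rather than as 0 of some larger unit.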

Incidentally, the reason that utilities like df, du, and dtt offer options like -B is to support scripting. For that purpose, it actually is a hindrance to introduce prefixes and suffixes that may or may not be present, because the script would then have to parse them. Scripting is simple when the input is simple, complicated when the input is complicated.
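
To show how little parsing plain integers need, here is a short Python sketch that reads a -B style report. The sample text is copied from babydr's output earlier in the thread; the function name is made up, and the layout (a number, whitespace, then the label or path) is assumed from that sample rather than taken from the manual.

```python
def parse_report(text):
    """Turn report lines into (count, label) pairs.

    Assumes each non-empty line is: integer, whitespace, rest of line.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only once so paths containing spaces stay intact.
        count, label = line.split(maxsplit=1)
        rows.append((int(count), label))
    return rows


sample = """\
           13197 (grand total - physical size 1048576B-blocks)
            4092 /home/vaxuser/VAX-DISKS/vaxX-vms-rq0.dsk
             650 /home/vaxuser/VAX-DISKS/HOBBYISTV1.IMG
               0 /home/vaxuser/VAX-DISKS/ka655x.bin
"""

for count, label in parse_report(sample):
    print(count, label)
```

A row such as "0+" or "." would break the int() call and force the script to handle special cases.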

metaed · 01-26-2024, 03:21 PM

The 2.1 beta fixes errors found in the 2.1 alpha release candidate. For details see the changelog. The 2.1 beta can be downloaded from https://metaed.com/papers/dtt/. (If you cannot reach the site, you are connecting from a country that is geoblocked because of a high attack rate. At the moment that's CN, VN, RU, HK, SG, IN, KR, TW, BR, JP, and ID. Let me know which country and I will lower shields for you.)

metaed · 02-07-2024, 01:08 PM

dtt is a utility that prints a “top ten” report of directory trees and files found on a file-structured storage device, such as a Unix or Windows volume. In other words, it floats the really big chunks to the top.

dtt 2.2 is not a feature drop. It lays the groundwork for future development by simplifying many things about the build process, and also includes general source and documentation cleanup. For details see the dtt landing page, and the attached screenshots.

r1w1s1 · 03-02-2024, 09:00 PM

Quote:

Originally Posted by metaed

Release 2.1 alpha source package, Slackware 15.0 slackbuild, manuals, and changelog are here: https://metaed.com/papers/dtt/. (If you cannot reach the site, you are connecting from a country that is geoblocked because of a high attack rate. At the moment that's CN, VN, RU, HK, SG, IN, KR, TW, BR, JP, and ID. Let me know which country and I will lower shields for you.)

I cannot access the site; I'm connecting from Brazil.

pan64 · 03-03-2024, 02:00 AM

Is there any way to make it available for other distros?
Perhaps add -n|--number to print the top N lines.
Add short options, like -i|--inodes -a|--apparent-size -h|--help (or -?) -V|--version.
And, as was mentioned, it would be nice to have either a ~/.dttrc for user-specific defaults or an env var which may contain default options (like less).

scuzzy_dog · 03-05-2024, 09:53 PM

Useful tool. Thanks.