 LinuxQuestions.org [SOLVED] Need help with script that finds the 10 largest files

 03-25-2013, 11:09 PM #1 amber1 LQ Newbie   Registered: Mar 2013 Posts: 8 Blog Entries: 1 Rep:
Need help with script that finds the 10 largest files
How can I write a script that finds the 10 largest files in a filesystem? The script has to display the filenames and sizes in reverse order, largest first. I would appreciate it if someone could help. Thanks
03-25-2013, 11:37 PM   #2
nicksu
Member

Registered: Dec 2012
Posts: 35

Rep:
Can this work?
du -ha | sort -hr | head -n 10
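
One thing to keep in mind with the one-liner above: du -a reports every directory as well as every file, so the top of the list is usually taken by directories. A minimal sketch of a variant that considers regular files only is shown here. The function name largest_by_disk_usage is made up for illustration, and the sketch assumes GNU coreutils (for sort -h) and a find that supports "-exec ... {} +".

```shell
# largest_by_disk_usage DIR [COUNT]
# Hypothetical helper, not from the thread.
# Assumes GNU coreutils (sort -h) and a find that supports "-exec ... {} +".
largest_by_disk_usage() {
    local dir=${1:-.}
    local count=${2:-10}
    # -type f keeps directories out of the list;
    # "{} +" passes many files to each du call instead of one du per file.
    find "$dir" -type f -exec du -h {} + 2>/dev/null |
        sort -hr |
        head -n "$count"
}
```

Calling largest_by_disk_usage /var 10 would then list up to ten regular files under /var, largest disk usage first. The sizes are du's human-readable disk usage, not the byte lengths of the files.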

 03-26-2013, 12:19 AM #3 shivaa Senior Member   Registered: Jul 2012 Location: Grenoble, Fr. Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0 Posts: 1,800 Blog Entries: 4 Rep:
You will need following commands: find, du, sort, head
Well, it's not a good idea to offer you a ready-made script. Instead you can create it yourself with help of following cmds:
Code:
find / -exec du -sk {} \; 2>/dev/null | gawk '{gsub(/K/,"",$1); print $0}' > /tmp/files.txt
sort -nr -k1 /tmp/files.txt | head -10
\rm /tmp/files.txt
Last edited by shivaa; 03-26-2013 at 01:31 AM. Reason: Added 2>/dev/null
03-26-2013, 01:26 AM   #4
nicksu
Member

Registered: Dec 2012
Posts: 35

Rep:
Quote:
 Originally Posted by shivaa
You will need following commands: find, du, sort, head
Well, it's not a good idea to offer you a ready-made script. Instead you can create it yourself with help of following cmds:
Code:
find / -exec du -sk {} \; | gawk '{gsub(/K/,"",$1); print $0}' > /tmp/files.txt
sort -nr -k1 /tmp/files.txt | head -10
\rm /tmp/files.txt
Hi, but your find would add the current path to files.txt. Can you show how to avoid that using find?
I can only use grep, as in grep -v ".$", to keep the . out of files.txt.

 03-26-2013, 01:43 AM #5 shivaa Senior Member Registered: Jul 2012 Location: Grenoble, Fr. Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0 Posts: 1,800 Blog Entries: 4 Rep:
@nicksu: Searching for files in the / directory will print the absolute path, i.e. the complete path of each file, in /tmp/files.txt, which is the only way to search for files in the whole file system. Please explain your problem a little more; what exactly do you want to say? Otherwise, the following script runs fine:
Code:
#!/bin/bash
find / -exec du -sk {} \; 2>/dev/null | gawk '{gsub(/K/,"",$1); print $0}' > /tmp/files.txt
sort -nr -k1 /tmp/files.txt | head -10
\rm /tmp/files.txt
@amber1: Run the script as the root user or use sudo, because you will search for files in the / directory.

 03-26-2013, 01:54 AM #6 nicksu Member Registered: Dec 2012 Posts: 35 Rep:
Oh, my mistake. I used . instead of / in the find command.

 03-26-2013, 02:54 AM #7 nicksu Member Registered: Dec 2012 Posts: 35 Rep:
Hi, thanks to shivaa's help I put together the script below. I hope it helps.
Code:
#!/bin/bash
echo "please type in the path in which you want to check the largest files"
read path
# Disk usage of every regular file, largest first; keep the paths of the top 10
a=$(find "$path" -type f -exec du -h {} \; | sort -hr | head -n 10 | awk '{print $2}')
# Paths containing whitespace are split here, as in the original script
for i in $a
do
echo $(ls -hld $i) 2>/dev/null
done

 03-26-2013, 05:30 AM #8 shivaa Senior Member   Registered: Jul 2012 Location: Grenoble, Fr. Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0 Posts: 1,800 Blog Entries: 4 Rep:
@nicksu: Please go through this guide once. And note that:
1. There's no need to use a for loop; the -exec option will do that.
2. Do not use "head -10" with the find command, because it will then print the first 10 files, not all files.
Code:
#!/bin/bash
echo "please type in the path in which you want to check the largest files"; read path
find "$path" -exec du -sk {} \; 2>/dev/null | gawk '{gsub(/K/,"",$1); print $0}' > /tmp/files.txt
sort -nr -k1 /tmp/files.txt | head -10
\rm /tmp/files.txt

 03-26-2013, 06:10 AM #9 nicksu Member Registered: Dec 2012 Posts: 35 Rep:
Wow, what a guide! Thank you for sharing it.
As for your point 2, I am not so clear. I use "head -10" to print the first 10 lines because I have already run "sort -hr", which sorts the find result; "head -10" then fetches the first 10 lines. Why is that wrong?

 03-26-2013, 06:28 AM #10 colucix LQ Guru   Registered: Sep 2003 Location: Bologna Distribution: CentOS 6.5 OpenSuSE 12.3 Posts: 10,509 Rep: Please use a descriptive title for your thread excluding words like 'urgent' or 'help'. Using a proper title makes it easier for members to help you. This thread has been reported for title modification. Please do not add replies that address the thread title.
03-26-2013, 08:14 AM   #11
shivaa
Senior Member

Registered: Jul 2012
Location: Grenoble, Fr.
Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0
Posts: 1,800
Blog Entries: 4

Rep:
Quote:
 @nicksu: ...and for your point 2,I am not so clear.I use the "head -10" to print out the first 10 lines,because I have issued the "sort -hr" which sort the find result and then fetch the first 10 line by "head -10".Why wrong ?
It's not wrong, but sorting directly on the find command's result isn't good, and you haven't specified a sort field, i.e. -k1, which you should.
Second, making your script unnecessarily long is not a good idea when the work can be done in a few commands, so the for loop is not needed either.

However, please refer to the guide I mentioned, so that your doubts can be cleared.

Last edited by shivaa; 03-26-2013 at 08:15 AM.
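
The question about "head -10" comes down to where head sits in the pipeline. A small sketch with made-up sample numbers (the function names sizes, top_two and first_two_sorted are hypothetical, chosen only for this illustration) shows the difference between sorting and then truncating, and truncating and then sorting:

```shell
# Hypothetical sample sizes, one per line, standing in for find/du output.
sizes() { printf '%s\n' 3 10 7 1 9; }

# Sort first, then truncate: the two largest values overall.
top_two() { sizes | sort -rn | head -n 2; }

# Truncate first, then sort: only the first two input lines, reordered.
first_two_sorted() { sizes | head -n 2 | sort -rn; }

top_two            # prints 10, then 9
first_two_sorted   # prints 10, then 3
```

So a head placed after sort, as in the "sort -hr | head -n 10" pipeline, returns the largest entries; only a head placed before the sort would cut the list short too early.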

 03-26-2013, 09:00 AM #12 pan64 LQ Guru   Registered: Mar 2012 Location: Hungary Distribution: debian i686 (solaris) Posts: 8,109 Rep:
I do not like those chains of commands like find, du, grep, sed, awk, head, cut ... (not to speak of additional loops). Keep it simple:
Code:
find . -type f -exec du -sb {} \; | perl -e '
my %h;
while (<>) { @b = split; $h{$b[1]} = $b[0] }
$i = 2;
for my $k (sort { $h{$b} <=> $h{$a} } keys %h) {
    print "$k $h{$k}\n";
    last if $i++ > 3;
}
'
# the last 3 (after $i++ >) means print 3 lines, so you need to modify it....
# also the last print can be replaced to give formatted output:
# printf("%-30s %10d\n", $k, $h{$k})
 03-28-2013, 01:35 PM #13 rigor Member   Registered: Sep 2011 Posts: 212 Rep:
amber1,
The commands du and find both go through all files and directories beneath the starting point you give them, and so will cross file system boundaries, unless you tell them not to do so. That is, they will not necessarily be limited to a single file system. If the file system on which you start has other file systems mounted within it, the result would be the largest file on any of those file systems, not the largest file on the one file system.
Also, du tends to attribute all space used by all files and directories under a directory to the directory itself. So it would show the root of any directory tree as owning all the space used below it, falsely showing the root directory as the largest file.
Although the -exec option of the find command is a powerful capability to extend the usefulness of the find command, repeatedly having find execute another command, such as du, when it isn't necessary, can be rather slow.
A command sequence such as the following limits find to a single file system, making use of find's own capabilities to produce its output, only then sorting the result and grabbing the first ten lines.
Code:
find / -xdev -printf "%s %p\n" | sort -rn | head -10
You indicated you wanted the 10 largest files. If you meant that in the sense of simple files, not directories, so that you need to exclude directories from consideration, you could do the following.
Code:
find / -xdev -type f -printf "%s %p\n" | sort -rn | head -10
Last edited by rigor; 03-28-2013 at 01:40 PM.
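
The approach in the post above could be wrapped in a small function along the following lines. This is only a sketch under stated assumptions: the name largest_files and its arguments are invented for illustration, and -printf is a GNU find feature, so the code assumes GNU findutils and coreutils.

```shell
# largest_files DIR [COUNT]
# Hypothetical helper name, not from the thread.
# Assumes GNU find (for -printf) and GNU coreutils.
largest_files() {
    local dir=${1:-.}
    local count=${2:-10}
    # -xdev stays on the starting file system; -type f skips directories.
    # %s is the file size in bytes, %p the path; a tab separates the two.
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$count"
}
```

Run as root, largest_files / 10 would print up to ten lines of the form "size, tab, path", largest first. Because find prints the sizes itself, no du process is started per file.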

All times are GMT -5.