Old 04-18-2015, 04:55 AM   #1
Glenn D.
How can this list be sorted by file size, high to low?

Thanks Glenn

Commands used so far:
# rm index.html

# wget -c

# grep -i "\.xz" index.html | cut -d">" -f5,11 |cut -c10- | cut -d"<" -f1| sed 's/\">/ /'
amor-15.04.0.tar.xz 171K
analitza-15.04.0.tar.xz 223K
ark-15.04.0.tar.xz 233K
artikulate-15.04.0.tar.xz 3.7M
audiocd-kio-15.04.0.tar.xz 51K
blinken-15.04.0.tar.xz 553K
bomber-15.04.0.tar.xz 384K
bovo-15.04.0.tar.xz 103K
cantor-15.04.0.tar.xz 348K
cervisia-15.04.0.tar.xz 364K
dolphin-plugins-15.04.0.tar.xz 57K
dragon-15.04.0.tar.xz 390K
ffmpegthumbs-15.04.0.tar.xz 20K
filelight-15.04.0.tar.xz 281K
granatier-15.04.0.tar.xz 1.3M
gwenview-15.04.0.tar.xz 2.8M
jovie-15.04.0.tar.xz 381K
juk-15.04.0.tar.xz 433K
kaccessible-15.04.0.tar.xz 20K
kaccounts-integration-15.04.0.tar.xz 53K
kaccounts-providers-15.04.0.tar.xz 12K
kajongg-15.04.0.tar.xz 2.4M
kalgebra-15.04.0.tar.xz 263K
kalzium-15.04.0.tar.xz 3.8M
kamera-15.04.0.tar.xz 35K
kanagram-15.04.0.tar.xz 4.6M
kapman-15.04.0.tar.xz 1.7M
kapptemplate-15.04.0.tar.xz 689K
kate-15.04.0.tar.xz 1.6M
katomic-15.04.0.tar.xz 646K
kblackbox-15.04.0.tar.xz 254K
kblocks-15.04.0.tar.xz 1.2M
kbounce-15.04.0.tar.xz 1.5M
kbreakout-15.04.0.tar.xz 1.3M
kbruch-15.04.0.tar.xz 883K
kcachegrind-15.04.0.tar.xz 231K
kcalc-15.04.0.tar.xz 81K
kcharselect-15.04.0.tar.xz 83K
kcolorchooser-15.04.0.tar.xz 4.2K
kcron-15.04.0.tar.xz 170K
kde-base-artwork-15.04.0.tar.xz 7.1M
kde-baseapps-15.04.0.tar.xz 2.5M
kde-dev-scripts-15.04.0.tar.xz 299K
kde-dev-utils-15.04.0.tar.xz 50K
kde-runtime-15.04.0.tar.xz 7.5M
kde-wallpapers-15.04.0.tar.xz 86M
kde-workspace-4.11.18.tar.xz 13M
kdeartwork-15.04.0.tar.xz 134M
kdeedu-data-15.04.0.tar.xz 94K
kdegraphics-mobipocket-15.04.0.tar.xz 14K
kdegraphics-strigi-analyzer-15.04.0.tar.xz 39K
kdegraphics-thumbnailers-15.04.0.tar.xz 41K
kdelibs-4.14.7.tar.xz 11M
kdenetwork-filesharing-15.04.0.tar.xz 27K
kdenetwork-strigi-analyzers-15.04.0.tar.xz 14K
kdenlive-15.04.0.tar.xz 3.0M
kdepim-4.14.7.tar.xz 14M
kdepim-runtime-4.14.7.tar.xz 1.1M
kdepimlibs-4.14.7.tar.xz 2.7M
kdesdk-kioslaves-15.04.0.tar.xz 353K
kdesdk-strigi-analyzers-15.04.0.tar.xz 18K
kdesdk-thumbnailers-15.04.0.tar.xz 11K
kdewebdev-15.04.0.tar.xz 2.4M
kdf-15.04.0.tar.xz 146K
kdiamond-15.04.0.tar.xz 4.0M
kfloppy-15.04.0.tar.xz 52K
kfourinline-15.04.0.tar.xz 275K
kgamma-15.04.0.tar.xz 22K
kgeography-15.04.0.tar.xz 6.4M
kget-15.04.0.tar.xz 1.0M
kgoldrunner-15.04.0.tar.xz 2.0M
kgpg-15.04.0.tar.xz 791K
khangman-15.04.0.tar.xz 3.7M
kig-15.04.0.tar.xz 1.4M
kigo-15.04.0.tar.xz 1.4M
killbots-15.04.0.tar.xz 941K
kiriki-15.04.0.tar.xz 115K
kiten-15.04.0.tar.xz 11M
kjumpingcube-15.04.0.tar.xz 162K
klettres-15.04.0.tar.xz 2.7M
klickety-15.04.0.tar.xz 766K
klines-15.04.0.tar.xz 928K
kmag-15.04.0.tar.xz 85K
kmahjongg-15.04.0.tar.xz 1.0M
kmines-15.04.0.tar.xz 610K
kmix-15.04.0.tar.xz 379K
kmousetool-15.04.0.tar.xz 35K
kmouth-15.04.0.tar.xz 301K
kmplot-15.04.0.tar.xz 647K
knavalbattle-15.04.0.tar.xz 835K
knetwalk-15.04.0.tar.xz 764K
kolf-15.04.0.tar.xz 751K
kollision-15.04.0.tar.xz 203K
kolourpaint-15.04.0.tar.xz 1.1M
kompare-15.04.0.tar.xz 336K
konquest-15.04.0.tar.xz 396K
konsole-15.04.0.tar.xz 451K
kopete-15.04.0.tar.xz 6.0M
kpat-15.04.0.tar.xz 2.9M
kppp-15.04.0.tar.xz 684K
kqtquickcharts-15.04.0.tar.xz 20K
krdc-15.04.0.tar.xz 390K
kremotecontrol-15.04.0.tar.xz 1.0M
kreversi-15.04.0.tar.xz 464K
krfb-15.04.0.tar.xz 319K
kruler-15.04.0.tar.xz 126K
ksaneplugin-15.04.0.tar.xz 13K
kscd-15.04.0.tar.xz 91K
kshisen-15.04.0.tar.xz 133K
ksirk-15.04.0.tar.xz 4.8M
ksnakeduel-15.04.0.tar.xz 291K
ksnapshot-15.04.0.tar.xz 252K
kspaceduel-15.04.0.tar.xz 248K
ksquares-15.04.0.tar.xz 79K
kstars-15.04.0.tar.xz 12M
ksudoku-15.04.0.tar.xz 1.3M
ksystemlog-15.04.0.tar.xz 373K
kteatime-15.04.0.tar.xz 102K
ktimer-15.04.0.tar.xz 144K
ktouch-15.04.0.tar.xz 2.3M
ktp-accounts-kcm-15.04.0.tar.xz 86K
ktp-approver-15.04.0.tar.xz 20K
ktp-auth-handler-15.04.0.tar.xz 26K
ktp-common-internals-15.04.0.tar.xz 306K
ktp-contact-list-15.04.0.tar.xz 48K
ktp-contact-runner-15.04.0.tar.xz 16K
ktp-desktop-applets-15.04.0.tar.xz 25K
ktp-filetransfer-handler-15.04.0.tar.xz 22K
ktp-kded-module-15.04.0.tar.xz 38K
ktp-send-file-15.04.0.tar.xz 14K
ktp-text-ui-15.04.0.tar.xz 283K
ktuberling-15.04.0.tar.xz 4.1M
kturtle-15.04.0.tar.xz 189K
ktux-15.04.0.tar.xz 106K
kubrick-15.04.0.tar.xz 105K
kuser-15.04.0.tar.xz 132K
kwalletmanager-15.04.0.tar.xz 446K
kwordquiz-15.04.0.tar.xz 1.1M
libkcddb-15.04.0.tar.xz 155K
libkcompactdisc-15.04.0.tar.xz 74K
libkdcraw-15.04.0.tar.xz 100K
libkdeedu-15.04.0.tar.xz 128K
libkdegames-15.04.0.tar.xz 5.5M
libkeduvocdocument-15.04.0.tar.xz 97K
libkexiv2-15.04.0.tar.xz 134K
libkface-15.04.0.tar.xz 8.6M
libkgeomap-15.04.0.tar.xz 121K
libkipi-15.04.0.tar.xz 93K
libkmahjongg-15.04.0.tar.xz 1.6M
libkomparediff2-15.04.0.tar.xz 55K
libksane-15.04.0.tar.xz 79K
lokalize-15.04.0.tar.xz 930K
lskat-15.04.0.tar.xz 908K
marble-15.04.0.tar.xz 22M
mplayerthumbs-15.04.0.tar.xz 27K
okteta-15.04.0.tar.xz 491K
okular-15.04.0.tar.xz 1.5M
oxygen-icons-15.04.0.tar.xz 219M
pairs-15.04.0.tar.xz 2.7M
palapeli-15.04.0.tar.xz 1.7M
parley-15.04.0.tar.xz 4.5M
picmi-15.04.0.tar.xz 714K
poxml-15.04.0.tar.xz 31K
print-manager-15.04.0.tar.xz 91K
rocs-15.04.0.tar.xz 521K
signon-kwallet-extension-15.04.0.tar.xz 10K
step-15.04.0.tar.xz 377K
superkaramba-15.04.0.tar.xz 373K
svgpart-15.04.0.tar.xz 8.8K
sweeper-15.04.0.tar.xz 80K
umbrello-15.04.0.tar.xz 1.5M
zeroconf-ioslave-15.04.0.tar.xz 25K
Old 04-18-2015, 05:24 AM   #2
LQ Veteran
GNU sort is the usual tool for sorting. It handles "human readable" numbers as well as plain numerics; see the manpage.
Old 04-18-2015, 05:28 AM   #3
Hi Glenn,

For files of 1 MB or more:

grep -i "\.xz" index.html | cut -d">" -f5,11 |cut -c10- | cut -d"<" -f1| sed 's/\">/ /'  |  grep  M |  awk '{print $2" "$1}' |  sort -rn
For files smaller than 1 MB:

grep -i "\.xz" index.html | cut -d">" -f5,11 |cut -c10- | cut -d"<" -f1| sed 's/\">/ /'  |  grep  K |  awk '{print $2" "$1}' |  sort -rn
Hopefully this was helpful.
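
To get one combined list out of the two commands above, the M pass can simply be run before the K pass. This is only a sketch on four sample lines from the list: the function name sort_two_passes is made up, and it anchors the grep at the end of the line ('M$', 'K$') instead of the plain "grep M" used above.

```shell
# sort_two_passes: hypothetical helper; reads "name size" lines on stdin.
# First pass prints the M-sized files, second pass the K-sized ones,
# each sorted numerically, largest first.
sort_two_passes() {
  list=$(cat)
  printf '%s\n' "$list" | grep 'M$' | awk '{print $2" "$1}' | LC_ALL=C sort -rn
  printf '%s\n' "$list" | grep 'K$' | awk '{print $2" "$1}' | LC_ALL=C sort -rn
}

printf '%s\n' \
  'amor-15.04.0.tar.xz 171K' \
  'artikulate-15.04.0.tar.xz 3.7M' \
  'kate-15.04.0.tar.xz 1.6M' \
  'kcolorchooser-15.04.0.tar.xz 4.2K' | sort_two_passes
```

This only works because there are exactly two suffixes; a G-sized file would need a third pass.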
Old 04-18-2015, 05:29 AM   #4
Senior Member
Try "sort -h -r -k 2".

From the manpage on sort:
       -h, --human-numeric-sort
              compare human readable numbers (e.g., 2K 1G)
       -r, --reverse
              reverse the result of comparisons

       -k, --key=KEYDEF
              sort via a key; KEYDEF gives location and type
       KEYDEF  is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where
       F is a field number and C a character position in the field;  both  are
       origin 1, and the stop position defaults to the line's end.  If neither
       -t nor -b is in effect, characters in a  field  are  counted  from  the
       beginning of the preceding whitespace.  OPTS is one or more single-let‐
       ter ordering options  [bdfgiMhnRrV],  which  override  global  ordering
       options  for  that key.  If no key is given, use the entire line as the key.
I didn't remember that -h option...
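
As a quick check of the suggestion above, here is a small sketch on four lines taken from the original list. The function name sort_by_size is made up for illustration, and -h assumes GNU sort.

```shell
# sort_by_size: hypothetical helper; reads "name size" lines on stdin.
# -k 2,2 restricts the key to the size column, -h compares values such as
# 171K and 3.7M, -r puts the largest first.
# LC_ALL=C keeps "." as the decimal point regardless of locale.
sort_by_size() {
  LC_ALL=C sort -k 2,2 -h -r
}

printf '%s\n' \
  'amor-15.04.0.tar.xz 171K' \
  'artikulate-15.04.0.tar.xz 3.7M' \
  'kcolorchooser-15.04.0.tar.xz 4.2K' \
  'oxygen-icons-15.04.0.tar.xz 219M' | sort_by_size
```

In the original pipeline this would just be appended after the final sed, e.g. "... | sed 's/\">/ /' | sort -h -r -k 2".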
Old 04-18-2015, 10:58 AM   #5
If you are willing to settle for a computer-muggle approach, you can do it with a spreadsheet. I used LibreOffice Calc and Emacs.

1. Put your file list into a text file (suggest a file name like FILENAME.txt).
2. Edit the file (I used Emacs) to put a space before each K and M (space will be a delimiter in the spreadsheet).
3. Open the text file with LibreOffice Calc (the spreadsheet), choosing space as the delimiter. This will place the file length into two columns after the file name.
4. Mark the entire column containing the M and K.
5. Do a global find/replace, changing "M" to "=2^20" in the marked column.
6. Do a global find/replace, changing "K" to "=2^10" in the marked column. This will convert M and K to numbers.
7. In a fourth column, multiply the two previous columns to get the file size in bytes.
8. Do a descending sort on the entire spreadsheet using the newly computed column with the file length in bytes.

Here is what I got (after reconstructing the sorted list with Emacs):

oxygen-icons-15.04.0.tar.xz                         229638144 
kdeartwork-15.04.0.tar.xz                           140509184 
kde-wallpapers-15.04.0.tar.xz                       90177536  
marble-15.04.0.tar.xz                               23068672  
kdepim-4.14.7.tar.xz                                14680064  
kde-workspace-4.11.18.tar.xz                        13631488  
kstars-15.04.0.tar.xz                               12582912  
kdelibs-4.14.7.tar.xz                               11534336  
kiten-15.04.0.tar.xz                                11534336  
libkface-15.04.0.tar.xz                             9017753.6 
kde-runtime-15.04.0.tar.xz                          7864320   
kde-base-artwork-15.04.0.tar.xz                     7444889.6 
kgeography-15.04.0.tar.xz                           6710886.4 
kopete-15.04.0.tar.xz                               6291456   
libkdegames-15.04.0.tar.xz                          5767168   
ksirk-15.04.0.tar.xz                                5033164.8 
kanagram-15.04.0.tar.xz                             4823449.6 
parley-15.04.0.tar.xz                               4718592   
ktuberling-15.04.0.tar.xz                           4299161.6 
kdiamond-15.04.0.tar.xz                             4194304   
kalzium-15.04.0.tar.xz                              3984588.8 
artikulate-15.04.0.tar.xz                           3879731.2 
khangman-15.04.0.tar.xz                             3879731.2 
kdenlive-15.04.0.tar.xz                             3145728   
kpat-15.04.0.tar.xz                                 3040870.4 
gwenview-15.04.0.tar.xz                             2936012.8 
kdepimlibs-4.14.7.tar.xz                            2831155.2 
klettres-15.04.0.tar.xz                             2831155.2 
pairs-15.04.0.tar.xz                                2831155.2 
kde-baseapps-15.04.0.tar.xz                         2621440   
kajongg-15.04.0.tar.xz                              2516582.4 
kdewebdev-15.04.0.tar.xz                            2516582.4 
ktouch-15.04.0.tar.xz                               2411724.8 
kgoldrunner-15.04.0.tar.xz                          2097152   
kapman-15.04.0.tar.xz                               1782579.2 
palapeli-15.04.0.tar.xz                             1782579.2 
kate-15.04.0.tar.xz                                 1677721.6 
libkmahjongg-15.04.0.tar.xz                         1677721.6 
kbounce-15.04.0.tar.xz                              1572864   
okular-15.04.0.tar.xz                               1572864   
umbrello-15.04.0.tar.xz                             1572864   
kig-15.04.0.tar.xz                                  1468006.4 
kigo-15.04.0.tar.xz                                 1468006.4 
granatier-15.04.0.tar.xz                            1363148.8 
kbreakout-15.04.0.tar.xz                            1363148.8 
ksudoku-15.04.0.tar.xz                              1363148.8 
kblocks-15.04.0.tar.xz                              1258291.2 
kdepim-runtime-4.14.7.tar.xz                        1153433.6 
kolourpaint-15.04.0.tar.xz                          1153433.6 
kwordquiz-15.04.0.tar.xz                            1153433.6 
kget-15.04.0.tar.xz                                 1048576   
kmahjongg-15.04.0.tar.xz                            1048576   
kremotecontrol-15.04.0.tar.xz                       1048576   
killbots-15.04.0.tar.xz                             963584    
lokalize-15.04.0.tar.xz                             952320    
klines-15.04.0.tar.xz                               950272    
lskat-15.04.0.tar.xz                                929792    
kbruch-15.04.0.tar.xz                               904192    
knavalbattle-15.04.0.tar.xz                         855040    
kgpg-15.04.0.tar.xz                                 809984    
klickety-15.04.0.tar.xz                             784384    
knetwalk-15.04.0.tar.xz                             782336    
kolf-15.04.0.tar.xz                                 769024    
picmi-15.04.0.tar.xz                                731136    
kapptemplate-15.04.0.tar.xz                         705536    
kppp-15.04.0.tar.xz                                 700416    
kmplot-15.04.0.tar.xz                               662528    
katomic-15.04.0.tar.xz                              661504    
kmines-15.04.0.tar.xz                               624640    
blinken-15.04.0.tar.xz                              566272    
rocs-15.04.0.tar.xz                                 533504    
okteta-15.04.0.tar.xz                               502784    
kreversi-15.04.0.tar.xz                             475136    
konsole-15.04.0.tar.xz                              461824    
kwalletmanager-15.04.0.tar.xz                       456704    
juk-15.04.0.tar.xz                                  443392    
konquest-15.04.0.tar.xz                             405504    
dragon-15.04.0.tar.xz                               399360    
krdc-15.04.0.tar.xz                                 399360    
bomber-15.04.0.tar.xz                               393216    
jovie-15.04.0.tar.xz                                390144    
kmix-15.04.0.tar.xz                                 388096    
step-15.04.0.tar.xz                                 386048    
ksystemlog-15.04.0.tar.xz                           381952    
superkaramba-15.04.0.tar.xz                         381952    
cervisia-15.04.0.tar.xz                             372736    
kdesdk-kioslaves-15.04.0.tar.xz                     361472    
cantor-15.04.0.tar.xz                               356352    
kompare-15.04.0.tar.xz                              344064    
krfb-15.04.0.tar.xz                                 326656    
ktp-common-internals-15.04.0.tar.xz                 313344    
kmouth-15.04.0.tar.xz                               308224    
kde-dev-scripts-15.04.0.tar.xz                      306176    
ksnakeduel-15.04.0.tar.xz                           297984    
ktp-text-ui-15.04.0.tar.xz                          289792    
filelight-15.04.0.tar.xz                            287744    
kfourinline-15.04.0.tar.xz                          281600    
kalgebra-15.04.0.tar.xz                             269312    
kblackbox-15.04.0.tar.xz                            260096    
ksnapshot-15.04.0.tar.xz                            258048    
kspaceduel-15.04.0.tar.xz                           253952    
ark-15.04.0.tar.xz                                  238592    
kcachegrind-15.04.0.tar.xz                          236544    
analitza-15.04.0.tar.xz                             228352    
kollision-15.04.0.tar.xz                            207872    
kturtle-15.04.0.tar.xz                              193536    
amor-15.04.0.tar.xz                                 175104    
kcron-15.04.0.tar.xz                                174080    
kjumpingcube-15.04.0.tar.xz                         165888    
libkcddb-15.04.0.tar.xz                             158720    
kdf-15.04.0.tar.xz                                  149504    
ktimer-15.04.0.tar.xz                               147456    
libkexiv2-15.04.0.tar.xz                            137216    
kshisen-15.04.0.tar.xz                              136192    
kuser-15.04.0.tar.xz                                135168    
libkdeedu-15.04.0.tar.xz                            131072    
kruler-15.04.0.tar.xz                               129024    
libkgeomap-15.04.0.tar.xz                           123904    
kiriki-15.04.0.tar.xz                               117760    
ktux-15.04.0.tar.xz                                 108544    
kubrick-15.04.0.tar.xz                              107520    
bovo-15.04.0.tar.xz                                 105472    
kteatime-15.04.0.tar.xz                             104448    
libkdcraw-15.04.0.tar.xz                            102400    
libkeduvocdocument-15.04.0.tar.xz                   99328     
kdeedu-data-15.04.0.tar.xz                          96256     
libkipi-15.04.0.tar.xz                              95232     
kscd-15.04.0.tar.xz                                 93184     
print-manager-15.04.0.tar.xz                        93184     
ktp-accounts-kcm-15.04.0.tar.xz                     88064     
kmag-15.04.0.tar.xz                                 87040     
kcharselect-15.04.0.tar.xz                          84992     
kcalc-15.04.0.tar.xz                                82944     
sweeper-15.04.0.tar.xz                              81920     
ksquares-15.04.0.tar.xz                             80896     
libksane-15.04.0.tar.xz                             80896     
libkcompactdisc-15.04.0.tar.xz                      75776     
dolphin-plugins-15.04.0.tar.xz                      58368     
libkomparediff2-15.04.0.tar.xz                      56320     
kaccounts-integration-15.04.0.tar.xz                54272     
kfloppy-15.04.0.tar.xz                              53248     
audiocd-kio-15.04.0.tar.xz                          52224     
kde-dev-utils-15.04.0.tar.xz                        51200     
ktp-contact-list-15.04.0.tar.xz                     49152     
kdegraphics-thumbnailers-15.04.0.tar.xz             41984     
kdegraphics-strigi-analyzer-15.04.0.tar.xz          39936     
ktp-kded-module-15.04.0.tar.xz                      38912     
kamera-15.04.0.tar.xz                               35840     
kmousetool-15.04.0.tar.xz                           35840     
poxml-15.04.0.tar.xz                                31744     
kdenetwork-filesharing-15.04.0.tar.xz               27648     
mplayerthumbs-15.04.0.tar.xz                        27648     
ktp-auth-handler-15.04.0.tar.xz                     26624     
ktp-desktop-applets-15.04.0.tar.xz                  25600     
zeroconf-ioslave-15.04.0.tar.xz                     25600     
kgamma-15.04.0.tar.xz                               22528     
ktp-filetransfer-handler-15.04.0.tar.xz             22528     
ffmpegthumbs-15.04.0.tar.xz                         20480     
kaccessible-15.04.0.tar.xz                          20480     
kqtquickcharts-15.04.0.tar.xz                       20480     
ktp-approver-15.04.0.tar.xz                         20480     
kdesdk-strigi-analyzers-15.04.0.tar.xz              18432     
ktp-contact-runner-15.04.0.tar.xz                   16384     
kdegraphics-mobipocket-15.04.0.tar.xz               14336     
kdenetwork-strigi-analyzers-15.04.0.tar.xz          14336     
ktp-send-file-15.04.0.tar.xz                        14336     
ksaneplugin-15.04.0.tar.xz                          13312     
kaccounts-providers-15.04.0.tar.xz                  12288     
kdesdk-thumbnailers-15.04.0.tar.xz                  11264     
signon-kwallet-extension-15.04.0.tar.xz             10240     
svgpart-15.04.0.tar.xz                              9011.2    
kcolorchooser-15.04.0.tar.xz                        4300.8
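
For comparison, the same K/M-to-bytes conversion can be sketched on the command line with awk. The function name to_bytes is made up, it assumes K = 2^10 and M = 2^20 as in the steps above, and it rounds to whole bytes where the spreadsheet kept fractions such as 4300.8.

```shell
# to_bytes: hypothetical helper; reads "name size" lines on stdin and
# prints "name bytes", treating K as 1024 and M as 1024*1024.
to_bytes() {
  awk '{
    size = $2
    unit = substr(size, length(size), 1)      # last character: K or M
    mult = 1
    if (unit == "K") mult = 1024
    else if (unit == "M") mult = 1024 * 1024
    if (mult > 1) size = substr(size, 1, length(size) - 1)  # drop the suffix
    printf "%s %.0f\n", $1, size * mult       # rounded to whole bytes
  }'
}

printf '%s\n' \
  'amor-15.04.0.tar.xz 171K' \
  'artikulate-15.04.0.tar.xz 3.7M' \
  'kcolorchooser-15.04.0.tar.xz 4.2K' | to_bytes | LC_ALL=C sort -k 2,2 -n -r
```

Once the sizes are plain byte counts, an ordinary numeric sort (-n) is enough.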
Old 04-18-2015, 11:07 AM   #6
Senior Member
Using sort is much simpler. Only one step.
Old 04-18-2015, 11:31 AM   #7
LQ Guru
awk -F'<[^<>]*>' '/\.xz/ { print $11 " " $6 } ' index.html | sort -h -r
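
To see why fields 6 and 11 are the right ones, here is a sketch of the one-liner above run on a single made-up table row. The row is only a guess at what a line of index.html looks like (the icon name and DATE are placeholders), chosen so that the field numbers come out the same; the function name size_and_name is also hypothetical.

```shell
# Hypothetical table row; the real index.html may be laid out differently.
row='<td><img src="icon.gif" alt="[   ]"></td><td><a href="amor-15.04.0.tar.xz">amor-15.04.0.tar.xz</a></td><td align="right">DATE</td><td align="right">171K</td>'

# Every HTML tag acts as a field separator, so the text between tags
# becomes the fields. In this row $6 is the link text (the file name)
# and $11 is the size cell; most other fields are empty.
size_and_name() {
  awk -F'<[^<>]*>' '/\.xz/ { print $11 " " $6 }'
}

printf '%s\n' "$row" | size_and_name
```

Because the size is printed first, sort -h -r needs no -k option: the line starts with the key.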
Old 04-18-2015, 11:43 AM   #8
LQ Newbie
Here's what I did. I'll show it as a series of commands so you can follow along. Basically, the idea is to gather two sets of data: one for file names and another for file sizes. Then we can combine them and operate on the result.

You can also do this in a single step, but I find this easier to follow and work with :-)

$> wget -c
# now we have an index.html that we can work with
# I now gather file names
$> grep -o 'href=".*\.xz"' index.html|sed -e 's/href="\(.*\)"/\1/' > files.txt
# Next, I can gather file sizes - it's in a different column in the table in index.html
$> grep -Eo '[0-9]+\.?[0-9]*[KM]' index.html > sizes.txt
# Merge them both together to easily work with them
$> paste files.txt sizes.txt > filesizes.txt
# Now, we get to the meat of your question!
# Note that all file sizes are displayed in 'human readable' format
# K for kilobytes, M for megabytes and so on instead of plain bytes. 
# so, we use the 'numeric human format' available with sort and reverse the sorting
# file sizes are in the second column, so we give the key sort to be at index position 2
$> sort -k 2 -rh filesizes.txt 
# Hope this helps :-)
Incidentally, you'll find that most exercises in gathering information from files involve locating the exact place where that information lives and then taking some shortcuts. For example, I took a "shortcut" in picking files because I knew that files are simply hyperlinks in the index file, and that file sizes are numeric (that's why I used the extended regex format to get that information). The regex simply reads like this: 'Pick everything that starts with the digits 0 to 9, optionally followed by a period and another set of digits, but it has to be followed by either a K or an M.'
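
The regex can be tried on its own before pointing it at index.html. A small sketch, where the three input lines are made-up table cells and extract_sizes is a hypothetical name:

```shell
# Same pattern as in the sizes.txt step: one or more digits, an optional ".",
# optional further digits, then a K or M suffix.
# -o prints only the matching part of each line.
extract_sizes() {
  grep -Eo '[0-9]+\.?[0-9]*[KM]'
}

printf '%s\n' \
  '<td align="right">171K</td>' \
  '<td align="right">3.7M</td>' \
  '<td>no size here</td>' | extract_sizes
```

Since paste pairs lines purely by position, it may be worth comparing the line counts (wc -l files.txt sizes.txt) before merging, in case the pattern matched something other than a size.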

Have fun!
Old 04-18-2015, 02:23 PM   #9
Originally Posted by jpollard
Using sort is much simpler. Only one step.
Yeah ... I guess I wasn't sure how to deal with the Ks and Ms, without generating the full numerical file sizes in bytes. I may be missing something obvious.
Old 04-18-2015, 02:42 PM   #10
Originally Posted by flshope
Yeah ... I guess I wasn't sure how to deal with the Ks and Ms, without generating the full numerical file sizes in bytes. I may be missing something obvious.
One thing I'm not sure about yet: the manpage says "human readable", but doesn't say whether it can handle "KiB", "MiB", "GiB", "TiB", where the base is 1024 rather than 1000.
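
As far as I understand the GNU coreutils manual, sort -h never converts the values to bytes: it compares the sign first, then the suffix (K before M before G, and so on), and only then the number. So the 1000-versus-1024 question should not change the ordering, provided every value is scaled to its nearest suffix the way ls -h or du -h output is. A small sketch of the consequence, using made-up values:

```shell
# 2000K is larger than 1M in either base, but sort -h ranks any K value
# below any M value because the suffix is compared before the number.
printf '%s\n' '1M' '2000K' '512K' | LC_ALL=C sort -h
```

For the list in this thread that is not a problem, since no K value there exceeds 1M.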


