LinuxQuestions.org
08-08-2012, 06:50 PM   #1
kristo5747
Member
 
Registered: Jul 2010
Location: Earth
Distribution: Ubuntu 11.04 (Natty Narwhal)
Posts: 31

Rep: Reputation: 0
Why is sort -k not working all the time?


I have a script that puts a list of files in two separate arrays:

First, I get a file list from a ZIP file and fill `FIRST_Array()` with it. Second, I get a file list from a control file within the ZIP file and fill `SECOND_Array()` with it.

Code:
while read length date time filename
do
    FIRST_Array+=( "$filename" )
    echo "$filename" >> FIRST.report.out
done < <(/usr/bin/unzip -qql AAA.ZIP |sort -k12 -t~)
Third, I compare both arrays like so:

Code:
    diff -q <(printf "%s\n" "${FIRST_Array[@]}") <(printf "%s\n" "${SECOND_Array[@]}") |wc -l
I can tell that `diff` fails because I also write each array to a file, and `FIRST.report.out` and `SECOND.report.out` are simply not sorted in the same order.

1) FIRST.report.out (what's inside the ZIP file)


Code:
JGS-Memphis~AT1~Pre-Test~X-BanhT~JGMDTV387~6~P~1100~HR24-500~033072053326~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-GuinE~JGMDTV069~6~P~1100~H24-700~033081107519~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-MooreBe~JGM98745~40~P~1100~H21-200~029264526103~20120808~240914.XML
JGS-Memphis~FUN~Pre-Test~X-RossA~jgmdtv168~2~P~1100~H21-200~029415655926~20120808~240914.XML
2) SECOND.report.out (what's inside the ZIP's control file)

Code:
JGS-Memphis~AT1~Pre-Test~X-BanhT~JGMDTV387~6~P~1100~HR24-500~033072053326~20120808~240914.XML
JGS-Memphis~FUN~Pre-Test~X-RossA~jgmdtv168~2~P~1100~H21-200~029415655926~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-GuinE~JGMDTV069~6~P~1100~H24-700~033081107519~20120808~240914.XML
JGS-Memphis~PRE~DTV_PREP~X-MooreBe~JGM98745~40~P~1100~H21-200~029264526103~20120808~240914.XML
Using sort -k12 -t~ made sense to me, since ~ is the field delimiter and the file's date field ("20120808") is, by my count, in the 12th position. But it is not working consistently.

The sort is worse when my script processes bigger ZIP files. Why is sort -k not working all the time? How can I sort both arrays?

Last edited by kristo5747; 08-08-2012 at 06:57 PM.
 
08-11-2012, 03:44 PM   #2
kakaka
Member
 
Registered: Sep 2003
Posts: 382

Rep: Reputation: 87
Hi kristo5747!

Are you using a shell that expands ~ to your home directory, without escaping the ~?

You might want to see whether your code works any better with a backslash in front of the ~, as in \~
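
One way to test this idea, as a minimal sketch assuming bash or another POSIX-style shell: tilde expansion only applies at the start of a word, so the ~ in -t~ should reach sort unchanged whether or not it is escaped. The variable names below are just for illustration.

```shell
# Capture the argument exactly as a command such as sort would receive it.
plain=$(printf '%s\n' -t~)      # unescaped, as in the original script
escaped=$(printf '%s\n' -t\~)   # escaped with a backslash

echo "unescaped: $plain"
echo "escaped:   $escaped"

# If the two match, escaping makes no difference for this argument.
if [ "$plain" = "$escaped" ]; then
    echo "same argument either way"
fi
```

If both lines print the same text, the delimiter argument is probably not the cause of the inconsistent sort.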

Last edited by kakaka; 08-11-2012 at 03:45 PM.
 
08-13-2012, 02:33 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1958
By my count the 12th field is the last one, the "240914.XML". I think you want to use "-k 11".

However, if you use "-k 11" it sorts by all fields from the 11th to the end of the line. You have to use "-k 11,11" to limit the sort to only the 11th field. (See the sort info page for details.)

Finally, you'll probably also want to use numerical sorting.
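
The three points above can be sketched roughly as follows. This assumes a POSIX-style sort and uses two made-up, shortened file names with the same 12-field layout as the ones in the first post; the dates and the `names` and `sorted` variables are invented for illustration.

```shell
# Hypothetical 12-field names: field 11 is the date, field 12 is the last field.
names='s~a~b~c~d~1~P~1~m~x~20120809~100000.XML
s~a~b~c~d~1~P~1~m~y~20120808~200000.XML'

# Check which field each key number actually selects.
echo "field 11:"; printf '%s\n' "$names" | cut -d'~' -f11
echo "field 12:"; printf '%s\n' "$names" | cut -d'~' -f12

# Sort on the 11th field only (11,11) and compare it numerically (n).
sorted=$(printf '%s\n' "$names" | sort -t'~' -k11,11n)
printf '%s\n' "$sorted"
```

Counting the fields with cut first is a cheap way to confirm the key number before trusting the sort.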


Edit: looking again, the sort command is on the raw unzip output, but you only posted the final sorted output. There's no way to tell from what you posted if it's targeting the correct field. Perhaps there's something else in it that's affecting the sort on some lines.

So how about posting the raw input from unzip too so we can compare?
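
In the meantime, here is one possible explanation, offered only as a guess since the raw output was not posted. When every sort key compares equal, sort falls back to comparing whole lines (unless -s is given), and each raw unzip line starts with the length column that the `while read length date time filename` loop later throws away. The lines below are made up in that "length date time name" shape, with invented sizes and shortened names, to show the effect.

```shell
# Invented sample in the shape of a listing: length, date, time, name.
raw='  300  08-08-2012 10:00   A~20120808~1.XML
  100  08-08-2012 10:00   C~20120808~1.XML
  200  08-08-2012 10:00   B~20120808~1.XML'

# Key on the last "~" field: every key is "1.XML", so sort falls back to
# comparing whole lines, and those begin with the length column.
tied=$(printf '%s\n' "$raw" | LC_ALL=C sort -t'~' -k3 \
       | awk '{print $4}' | cut -d'~' -f1 | tr '\n' ' ')
echo "raw lines sorted, then names extracted: $tied"

# Drop the first three columns first, then sort the names themselves.
# (awk '{print $4}' assumes names without spaces, as in this sample.)
clean=$(printf '%s\n' "$raw" | awk '{print $4}' | LC_ALL=C sort \
        | cut -d'~' -f1 | tr '\n' ' ')
echo "names extracted, then sorted:           $clean"
```

If this is what is happening, sorting after the file names have been extracted, and running the control-file list through the same `LC_ALL=C sort`, should give diff two lists in the same order.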

Last edited by David the H.; 08-13-2012 at 02:41 PM. Reason: as posted
 
  


All times are GMT -5.