LinuxQuestions.org
Old 05-05-2006, 08:12 AM   #1
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Rep: Reputation: 30
tricky batch file job. needs vi/cc guru


I have a collection of txt files containing columns of data. Each file relates to one day and is named according to the date (in US format). I have one file for each day for about 6 years, from 1999 to 2006.

Each file is prefaced with a short blurb and finishes with more junk. The columns of data occupy lines 6 to 30 inclusive, and all other lines are not required.

I need to collate all the files into one file in sequential date order, with the unwanted lines removed and a new column added that shows the name of the file that the data came from (being the date of the file).

Ahh, and just to make it difficult, the remaining lines in each file are in backwards order, so they need reversing (line 30 should be line 1, line 29 should be line 2, etc.).

I'm pretty sure some vi grep sed magic will do this but I'm struggling.

The whole lot will then be imported into gnumeric for further work.

It's all to do with a hons degree project in Artificial Intelligence networks as applied to weather patterns.

Can anyone help?
 
Old 05-05-2006, 09:03 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

There are probably more ways to do this, but this should work:

Code:
for THISFILE in `ls [12][09]*`
do
  # keep lines 6-30, reverse sort them, prepend the file name
  sed -n '6,30p' "${THISFILE}" | \
  sort -r | \
  awk -v thisfile="${THISFILE}" '{ print thisfile, $0 }' >> outfile
done
A little breakdown:

for THISFILE in `ls [12][09]*` => do this for all target files,
sed -n '6,30p' "${THISFILE}" => print lines 6 to 30 (inclusive),
sort -r => reverse sort them,
awk -v thisfile="${THISFILE}" '{ print thisfile, $0 }' => prepend the file name,
>> outfile => append to some output file.
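
To see what the sed and awk stages do on their own, here is a minimal sketch. It assumes a throwaway 35-line sample file; the name 20060101 and the "row N" contents are made up for illustration and are not from the real data.

```shell
#!/bin/sh
# Build a throwaway 35-line sample so the line numbers are easy to follow.
# The file name 20060101 is hypothetical; it simply fits the [12][09]* pattern.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
i=1
while [ "$i" -le 35 ]; do
  echo "row $i"
  i=$((i + 1))
done > 20060101

# Stage 1: sed -n suppresses the default output; '6,30p' prints only lines 6-30.
# Stage 2: awk puts the file name in front of every line that survives.
result=$(sed -n '6,30p' 20060101 | awk -v thisfile=20060101 '{ print thisfile, $0 }')

# Show the first two of the 25 kept lines.
printf '%s\n' "$result" | head -n 2

cd / && rm -rf "$workdir"
```

The reversing stage is left out here on purpose, so the output stays in the file's original order.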

Hope this helps.
 
Old 05-05-2006, 09:17 AM   #3
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Original Poster
Rep: Reputation: 30
I bow before your awk & sed greatness

It looks like it'll do the job so long as the prepended file name forms its own column. I'll run a few tests now and see what flops out.

Thx, especially for the explanation..Practical Unix was starting to get blurry
 
Old 05-05-2006, 09:43 AM   #4
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Original Poster
Rep: Reputation: 30
ahhh, a further complication.

the original first column contains US format time
Code:
11p 10p 9p 8p 7p 6p 5p 4p 3p 2p 1p 12p 11a 10a 9a 8a 7a 6a 5a 4a 3a 2a 1a
sort is intermixing the am & pm and sorting them numerically

Code:
12a 12p 11a 11p 10p 10a 9a 9p 8a 8p 7a 7p 6a 6p 5a 5p 4a 4p 3a 3p 2a 2p 1a 1p
is there a way to just read the file from the bottom up instead of trying to sort it?
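
The mixing can be reproduced with a tiny sketch. The four rows below are made-up stand-ins for the hour column; the point is only that sort -r compares the text of each line, so the hour digits dominate and the a/p marker merely breaks ties.

```shell
#!/bin/sh
# Four sample rows in the file's original (newest first) order.
sample=$(printf '%s\n' '11 p' '10 p' '11 a' '10 a')

# sort -r orders lines by their text, highest first.
# LC_ALL=C pins the comparison to plain byte order so the result is predictable.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -r)

printf '%s\n' "$sorted"
```

The am and pm rows for the same hour end up next to each other, which is not the same thing as reading the file from the bottom up.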


and lastly... I followed most of what was going on in the script, but what did the [12][09] in the ls command signify?
 
Old 05-05-2006, 09:56 AM   #5
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi again,

Can you give a few examples of the actual datestring that is used?

The [12][09] is date related: I assumed the file names would be in a format like 20061131 (or similarly 11312006). The [12][09] part makes sure that only the relevant files are targeted (it should be tailored to your own needs).
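
A quick way to check what such a pattern picks up is to try it in a scratch directory. This is only a sketch; the three file names are hypothetical examples, two in a YYYYMMDD style and one in a MMDDYY.txt style.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical names: two start with the year, one starts with the month.
touch 19990105 20060131 020105.txt

# [12] matches a first character of 1 or 2,
# [09] matches a second character of 0 or 9,
# and * matches whatever follows.
matched=$(echo [12][09]*)
echo "$matched"

cd / && rm -rf "$workdir"
```

Names that begin with a two-digit month such as 02 are skipped, so the pattern needs adjusting if the files are named that way.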
 
Old 05-05-2006, 10:56 AM   #6
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by druuna
Hi again,

Can you give a few examples of the actual datestring that is used?

The [12][09] is date related: I assumed the file names would be in a format like 20061131 (or similarly 11312006). The [12][09] part makes sure that only the relevant files are targeted (it should be tailored to your own needs).
ahhh. I've put all the files in the same dir and will strip out the .txt shortly so I don't need to target the list.

file names
Code:
020105.txt  020405.txt  020705.txt  021005.txt  021305.txt
021605.txt  021905.txt  022205.txt  022505.txt  022805.txt
020205.txt  020505.txt  020805.txt  021105.txt  021405.txt
021705.txt  022005.txt  022305.txt  022605.txt  020305.txt
020605.txt  020905.txt  021205.txt  021505.txt  021805.txt
022105.txt  022405.txt  022705.txt
file contents (first few lines)
Code:
Rice University
Hourly Data for Past 24 hours.  Updated Wed, 02/09/2005  23:55

Hour   I Temp  O Temp  Chill   Dew     Heat    O Hum   Press   Rain    Dir     Speed   Solar   Light
        F      F      F      F      F      %Rh     InHg    In             mph     WM     Lc
11 p   68.2    47.2    47.2    39.3    47.2    74      30.845  0.00    233     2.1     3       0
10 p   68.6    48.2    40.9    39.2    48.2    71      30.833  0.00    192     21.6    3       0
09 p   69.3    50.0    45.6    42.0    50.0    74      30.805  0.00    234     11.4    3       0
08 p   69.6    52.6    48.5    44.2    52.6    73      30.730  0.00    212     12.9    3       0
07 p   69.7    53.2    49.8    45.8    53.2    76      30.709  0.00    204     10.8    3
which gives an output of
Code:
020805.txt 12 p   72.5    57.9    56.0    55.0    57.9    90      30.671  0.00    214     9.3     368     0
020805.txt 12 a   70.9    63.2    63.2    63.2    63.2    100     30.510  0.00    076     3.0     3       0
020805.txt 11 p   68.2    47.2    47.2    39.3    47.2    74      30.845  0.00    233     2.1     3       0
020805.txt 11 a   71.3    57.0    54.9    57.0    57.0    100     30.660  0.00    191     9.3     145     0
020805.txt 10 p   68.6    48.2    40.9    39.2    48.2    71      30.833  0.00    192     21.6    3       0

Last edited by otchie1; 05-05-2006 at 10:59 AM.
 
Old 05-05-2006, 11:09 AM   #7
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Replace:
sort -r
with:
tac -

If you use the -, tac reads standard input and not from a file.
And yes, tac is cat in reverse

tac is kinda Linux specific; I haven't seen it on Unix boxes (installed by default, that is).
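
For boxes where tac is missing, one possible fallback is to buffer the lines in awk and print them last to first. The sketch below uses four made-up rows and picks whichever tool is available; the awk one-liner is a swapped-in alternative, not something from the script above.

```shell
#!/bin/sh
# Four sample rows in the file's original (newest first) order.
sample=$(printf '%s\n' '11 p' '10 p' '11 a' '10 a')

if command -v tac >/dev/null 2>&1; then
  # tac prints its input last line first; line contents are never compared.
  reversed=$(printf '%s\n' "$sample" | tac)
else
  # Fallback: store every line in an array, then print from the end.
  reversed=$(printf '%s\n' "$sample" |
    awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }')
fi

printf '%s\n' "$reversed"
```

Either branch only flips the order, so the am rows stay together and come out ahead of the pm rows.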
 
Old 05-05-2006, 11:11 AM   #8
Rufus330Ci
Member
 
Registered: Aug 2002
Location: PA
Distribution: Mandrake Linux v10.2, RHEL3u8, RHEL4u4 & RHEL5 Client Beta2 for desktop
Posts: 59

Rep: Reputation: 15
Maybe you could add a few lines that first pick out the am rows (i.e. a) and append them to one file with >>, then do the same for the pm rows (i.e. p) into another file. Then run the sort druuna gave you on each file and merge the two with >>, am first, then pm. Does this sound reasonable, or would a complex if statement checking for am/pm be easier to code?
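
A variant of this idea that avoids the two temporary files is to compute a 24-hour sort key for each row and sort on that. This is only a sketch under assumptions: the hour and the a/p marker are taken to be separate fields (as in the "11 p" sample rows), 12 a is taken to mean midnight, and the five rows are made up.

```shell
#!/bin/sh
# Made-up rows: hour, a/p marker, one data value.
sample=$(printf '%s\n' '11 p 68.2' '12 p 72.5' '11 a 71.3' '12 a 70.9' '01 a 70.1')

# awk: turn the hour into 0-23 (12 a -> 0, 12 p -> 12) and put it in front,
#      separated from the original row by a tab.
# sort: order numerically on that key.
# cut: drop the key again, leaving the original row.
ordered=$(printf '%s\n' "$sample" |
  awk '{ h = $1 % 12; if ($2 == "p") h += 12; print h "\t" $0 }' |
  sort -n -k1,1 |
  cut -f2-)

printf '%s\n' "$ordered"
```

Unlike a plain reversal, this puts the rows in clock order no matter how they were ordered in the file.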
 
Old 05-05-2006, 11:12 AM   #9
Rufus330Ci
Member
 
Registered: Aug 2002
Location: PA
Distribution: Mandrake Linux v10.2, RHEL3u8, RHEL4u4 & RHEL5 Client Beta2 for desktop
Posts: 59

Rep: Reputation: 15
Woops I posted above when you were writing that druuna, my bad.
 
Old 05-05-2006, 11:19 AM   #10
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Original Poster
Rep: Reputation: 30
druuna..tac wonderful!! I would never have thought of looking for that. This is on a linux box..I just happen to have a history with Sparc stations so have a few Unix manuals laying around. Got a few linux ones as well but Moritsugu's Practical Unix published by Que is very readable.

crunching now....
 
Old 05-05-2006, 11:24 AM   #11
otchie1
Registered User
 
Registered: Apr 2004
Posts: 560

Original Poster
Rep: Reputation: 30
Woooohoooo..worked. Thx Druuna

Rufus, I see, yes that would have done the job too.

Right, now to process it in gnumeric :-)
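
One loose end is the "sequential date order" requirement: names in MMDDYY form sort by month first, so a plain file listing does not come out chronologically across years. The sketch below is one possible way to handle that. The collate function name is hypothetical, the files are assumed to be named MMDDYY.txt, and two-digit years 90-99 are assumed to mean 19xx with everything else 20xx. The two demo files contain made-up "line N" rows.

```shell
#!/bin/sh
# collate: hypothetical helper; expects MMDDYY.txt files in the current directory.
collate() {
  for f in [0-9][0-9][0-9][0-9][0-9][0-9].txt; do
    [ -e "$f" ] || continue
    # Emit "YYYYMMDD name" so the list can be sorted chronologically.
    echo "${f%.txt}" | awk -v name="$f" '{
      mm = substr($0, 1, 2); dd = substr($0, 3, 2); yy = substr($0, 5, 2)
      century = (yy + 0 >= 90) ? "19" : "20"   # assumption: 90-99 means 19xx
      print century yy mm dd, name
    }'
  done | sort | while read -r key f; do
    # Keep lines 6-30, flip their order, add the date stem as a new first column.
    sed -n '6,30p' "$f" | tac | awk -v stem="${f%.txt}" '{ print stem, $0 }'
  done
}

# Demo with two made-up files: 31 Dec 1999 and 1 Jan 2000.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for name in 010100.txt 123199.txt; do
  i=1
  while [ "$i" -le 31 ]; do echo "line $i"; i=$((i + 1)); done > "$name"
done

collated=$(collate)
# Print the first row taken from each file.
printf '%s\n' "$collated" | sed -n '1p;26p'

cd / && rm -rf "$workdir"
```

In real use, running collate > outfile in the data directory would write everything to one file; the first column holds the date stem with the .txt stripped.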
 
Old 05-05-2006, 11:27 AM   #12
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Glad to be of service
 
  


All times are GMT -5.
