LinuxQuestions.org
03-22-2013, 07:33 PM   #1
TullyGirl
bash script reading specified lines from multiple files and concatenating into one


Hi Everyone!

I'm trying to write a little bash script that goes into subdirectories, finds all the files named *.cluster.summary, grabs line 13, line 36, and line 40 through the end of the file, and concatenates those lines into a single output file. Each *.cluster.summary file is two directories down (i.e. ./firstdirectory/directorywithfileinit/). Lines 13 and 36 always exist, but the length of the file after line 40 varies. Importantly, I have a similar set of *.cluster.summary files in the parent directories that I don't want to copy lines from. I think I need a simple for loop like the one below, but I can't work out how to specify the lines or how to make sure it doesn't pick up the files in the parent directories. Any suggestions would be most appreciated!



#!/bin/bash

for clustersummaries in *.cluster.summary
do
sed -n '13,36,40' $clustersummaries >> clusteroutputfile
done
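
A note on the sed attempt above: a sed address is either a single line number or a range written first,last, and every address needs a command such as p after it, so sed cannot parse '13,36,40'. A minimal sketch of one way to write the selection in sed follows. The function name print_selected_lines is hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print line 13, line 36, and line 40 through the end
# of the file named in $1.
print_selected_lines() {
    # -n turns off sed's default printing, so only the p commands produce output.
    # '40,$p' is a range: from line 40 to the last line ($).
    sed -n -e '13p' -e '36p' -e '40,$p' "$1"
}
```

Inside the loop this would be called as print_selected_lines "$clustersummaries" >> clusteroutputfile. A file shorter than 40 lines simply contributes nothing for the range part.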
 
03-22-2013, 09:16 PM   #2
shivaa
Try this:
Code:
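#!/bin/bash
# Remove output left over from an earlier run; -f keeps rm quiet if there is none.
rm -f /tmp/clusteroutputfile
cd ./firstdirectory/directorywithfileinit/ || exit 1
for file in *.cluster.summary
do
    echo "Reading file $file"
    # Print only lines 13, 36 and 40 of the current file.
    awk 'NR ~ /^(13|36|40)$/ {print}' "$file" >> /tmp/clusteroutputfile
done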

Last edited by shivaa; 03-22-2013 at 09:18 PM.
 
03-22-2013, 09:38 PM   #3
TullyGirl
Original Poster
Thanks for the suggestion, Shivaa. I gave it a try and it works, but only for the *.cluster.summary file in that one directory. I have ~200 of them in separate subdirectories, and I want the loop to go through them recursively. Also, why the tmp file? I want to keep the output file to look at later; am I misunderstanding what the /tmp part does? Here is my adjusted version, which saves the output file in the parent directory where I want it. I just need to work out the "search recursively through all subdirectories for *.cluster.summary files" bit (and then get rid of the cd ./firstdirectory...?)

#!/bin/bash
cd ./firstdirectory/directorywithfileinit/ || exit 1
for file in *.cluster.summary
do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36|40)$/ {print}' "$file" >> ../../clusteroutputfile.txt
done
 
03-22-2013, 09:50 PM   #4
shivaa
You can save output wherever you want.

Anyway, if you have multiple subdirectories, use the find command:
Code:
cd ./firstdirectory/directorywithfileinit/ || exit 1
# Note: looping over $(find ...) splits on whitespace, so this assumes no spaces in the paths.
for file in $(find . -name '*.cluster.summary' -print)
do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36|40)$/ {print}' "$file" >> ../../clusteroutputfile.txt
done
Change the directory in the cd line to wherever you want the search to start; find then descends into every subdirectory below it. Adjust the relative path of the output file to match.
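
The original question also asked that copies of the files in parent directories be left alone. One possible way to get that, sketched here under stated assumptions: the -mindepth and -maxdepth options exist in GNU and BSD find but are not part of POSIX, and the function names list_summaries and report_summaries are hypothetical. With the top-level directory as the starting point, a file two directories down sits at depth 3.

```shell
#!/bin/bash
# Hypothetical helper: list *.cluster.summary files exactly two directories
# below the starting directory given as $1 (default: current directory).
list_summaries() {
    # Depth 3 = start/firstdirectory/seconddirectory/file, so files sitting
    # directly in start/ or in start/firstdirectory/ are skipped.
    find "${1:-.}" -mindepth 3 -maxdepth 3 -type f -name '*.cluster.summary'
}

# Hypothetical helper: loop over the matches one path per line, so that
# paths containing spaces stay intact (unlike for file in $(find ...)).
report_summaries() {
    list_summaries "$1" | while IFS= read -r file
    do
        echo "Reading file $file"
    done
}
```

Running report_summaries . from the top-level directory would print one "Reading file" line per matching file; the awk command would go inside the loop in place of, or after, the echo.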
 
03-22-2013, 09:59 PM   #5
TullyGirl
Original Poster
Ah, the find command addition works a treat. However, I noticed that the line selection grabs ONLY lines 13, 36, and 40; what I want is line 13, line 36, and then all lines from 40 onwards to the end of the file. How would I denote that?
 
03-22-2013, 10:19 PM   #6
shivaa
Then try this:
Code:
#!/bin/bash
cd ./firstdirectory/directorywithfileinit/ || exit 1
rm -f ../../clusteroutputfile.txt    ## Clear the old clusteroutputfile.txt file
for file in $(find . -name '*.cluster.summary' -print)
do
    echo "Reading file $file"
    # Lines 13 and 36, plus every line from 40 to the end of the file
    awk 'NR ~ /^(13|36)$/ || NR >= 40 {print}' "$file" >> ../../clusteroutputfile.txt
done
Note: The script appends to "../../clusteroutputfile.txt", so the old file has to be cleared before each run (the rm line does this); otherwise the new output is added after the old output.
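
The pieces of this thread can also be combined into a single function that is run from the top-level directory. This is a sketch under stated assumptions, not the script posted in the thread: it relies on -mindepth and -maxdepth (available in GNU and BSD find, not in POSIX), the name collect_summaries is hypothetical, and it uses FNR instead of NR. The switch matters because one awk call may receive several files here; NR keeps counting across all of them, while FNR restarts at 1 for each file.

```shell
#!/bin/bash
# Hypothetical helper: collect_summaries START_DIR OUTPUT_FILE
# Writes line 13, line 36, and lines 40..end of every *.cluster.summary file
# found exactly two directories below START_DIR into OUTPUT_FILE.
collect_summaries() {
    local start_dir=$1 output_file=$2

    # The single > redirection truncates OUTPUT_FILE once per run, so old
    # results never pile up and no separate rm step is needed.
    # -exec ... {} + hands many file names to one awk call, which is why the
    # per-file line counter FNR is used rather than NR.
    find "$start_dir" -mindepth 3 -maxdepth 3 -type f -name '*.cluster.summary' \
        -exec awk 'FNR == 13 || FNR == 36 || FNR >= 40' {} + > "$output_file"
}
```

A call such as collect_summaries . clusteroutputfile.txt would leave the output in the top-level directory. The files are processed in whatever order find returns them, so the order of the blocks in the output is not guaranteed.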
 
03-22-2013, 10:29 PM   #7
TullyGirl
Original Poster

Beautiful! Thank you so much for your help.
 
  

