 LinuxQuestions.org [SOLVED] bash script reading specified lines from multiple files and concatenating into one

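03-22-2013, 07:33 PM #1 TullyGirl LQ Newbie Registered: Mar 2013 Location: Los Angeles Posts: 4

Hi Everyone! I'm trying to write a little bash script to go into subdirectories and find all the files named *.cluster.summary, grab lines 13, 36, and 40 through to the end of the file, and then paste/concatenate those lines into one single output file. Each *.cluster.summary file is two directories down (i.e. ./firstdirectory/directorywithfileinit/), and although lines 13 and 36 will always exist, the length of the file after line 40 varies. Importantly, I have a similar set of *.cluster.summary files in parent directories that I don't want to copy lines from. I think I need to use a simple for loop like the one below, but I can't work out how to specify the lines and make sure that it doesn't go into parent directories. Any suggestions would be most appreciated!

    #!/bin/bash
    for clustersummaries in *.cluster.summary
    do
    sed -n '13,36,40' $clustersummaries >> clusteroutputfile
    done

03-22-2013, 09:16 PM #2 shivaa Senior Member Registered: Jul 2012 Location: Grenoble, Fr. Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0 Posts: 1,800 Blog Entries: 4

Try this:

Code:
    #!/bin/bash
    \rm /tmp/clusteroutputfile
    cd ./firstdirectory/directorywithfileinit/
    for file in *.cluster.summary
    do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36|40)$/ {print}' $file >> /tmp/clusteroutputfile
    done

Last edited by shivaa; 03-22-2013 at 09:18 PM.

03-22-2013, 09:38 PM #3 TullyGirl Original Poster

Thanks for the suggestion Shivaa - I gave it a try and it works for finding just one of the *.cluster.summary files, but I have ~200 of them in separate subdirectories and I want the loop to go through them recursively. Also - why the tmp file? I want to keep the output file to look at later; am I misunderstanding the function of the tmp piece? Here are the adjustments I made to save the output file in the parent directory where I want it. I just need to work out the "search recursively through all subdirectories for *.cluster.summary files" bit (and then get rid of the cd ./firstdirectory...?)

    #!/bin/bash
    cd ./firstdirectory/directorywithfileinit/
    for file in *.cluster.summary
    do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36|40)$/ {print}' $file >> ../../clusteroutputfile.txt
    done

03-22-2013, 09:50 PM #4 shivaa

You can save the output wherever you want. Anyway, if you have multiple subdirectories, use the find command like this:

Code:
    cd ./firstdirectory/directorywithfileinit/
    for file in $(find . -name '*.cluster.summary' -print)
    do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36|40)$/ {print}' $file >> ../../clusteroutputfile.txt
    done

You can change the starting directory (the "cd ./firstdirectory/directorywithfileinit/" line) to wherever you want the search for files to begin.

03-22-2013, 09:59 PM #5 TullyGirl Original Poster

Ah - the find command addition works a treat. However, I noticed that the line search piece grabs ONLY lines 13, 36, and 40; what I want it to do is grab line 13, line 36, and then all lines from 40 onwards to the end of the file. How would I denote that?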
 03-22-2013, 09:38 PM #3 TullyGirl LQ Newbie   Registered: Mar 2013 Location: Los Angeles Posts: 4 Original Poster Rep: Thanks for the suggestion Shivaa - I gave it a try and it works for just finding one of the *cluster.summary files but I have ~200 of them in separate sub-directories that I want to be able to get the loop to go through recursively. Also - why the tmp file? I want to keep the output file to look at later, am I misunderstanding the function of the tmp piece? Here are the adjustments I made that saves the output file in the parent directory where I want it - I just need to work out the "search recursively through all subdirectories for *cluster.summary files" bit (and then get rid of the cd ./firstdirectory...?) #!/bin/bash cd ./firstdirectory/directorywithfileinit/ for file in *.cluster.summary do echo "Reading file $file" awk 'NR ~ /^(13|36|40)$/ {print}' $file >> ../../clusteroutputfile.txt done  03-22-2013, 09:50 PM #4 shivaa Senior Member Registered: Jul 2012 Location: Grenoble, Fr. Distribution: Sun Solaris, RHEL, Ubuntu, Debian 6.0 Posts: 1,800 Blog Entries: 4 Rep: You can save output wherever you want. Anyway, if you have multiple sub-directories, use find cmd as: Code: cd ./firstdirectory/directorywithfileinit/ for file in$(find . -name '*.cluster.summary' -print) do echo "Reading file $file" awk 'NR ~ /^(13|36|40)$/ {print}' $file >> ../../clusteroutputfile.txt done You can change directory level i.e. "cd ./firstdirectory/directorywithfileinit/" accordingly i.e. from wherever you want to start searching the files.  03-22-2013, 09:59 PM #5 TullyGirl LQ Newbie Registered: Mar 2013 Location: Los Angeles Posts: 4 Original Poster Rep: Ah - the find cmd addition works a treat. However - I noticed that the line search piece grabs ONLY lines 13,36, and 40; what I want it to do is grab line 13, line 36, and then all lines from 40 onwards to the end of the file. How would I denote that?  
03-22-2013, 10:19 PM #6 shivaa

Then try this:

Code:
    #!/bin/bash
    cd ./firstdirectory/directorywithfileinit/
    \rm -f ../../clusteroutputfile.txt ## Clear the old clusteroutputfile.txt file
    for file in $(find . -name '*.cluster.summary' -print)
    do
    echo "Reading file $file"
    awk 'NR ~ /^(13|36)$/ || NR >= 40 {print}' $file >> ../../clusteroutputfile.txt
    done

Note: The "../../clusteroutputfile.txt" file has to be cleared before each run (the rm line does this); otherwise new output keeps being appended to the old output.
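For comparison, the same selection (line 13, line 36, and line 40 to the end) can also be written with plain line-number tests in awk, or with sed, which is closer to what the first post attempted. This is only a minimal sketch: the 45-line sample file and the variable names are made up for illustration and do not come from the thread.

```shell
#!/bin/bash
# Hypothetical sample input: 45 numbered lines standing in for one *.cluster.summary file.
sample=$(mktemp)
seq 1 45 > "$sample"

# sed: print line 13, line 36, and everything from line 40 to the last line ($).
sed_out=$(sed -n -e '13p' -e '36p' -e '40,$p' "$sample")

# awk: the same selection as numeric comparisons on the line number NR.
awk_out=$(awk 'NR == 13 || NR == 36 || NR >= 40' "$sample")

echo "$sed_out"
rm -f "$sample"
```

In a sed address, $ means the last line of the input, so 40,$p covers files of any length after line 40. The original attempt '13,36,40' fails because a sed address range takes at most two line numbers and still needs a command such as p.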
03-22-2013, 10:29 PM #7 TullyGirl Original Poster

Beautiful! Thank you so much for your help.
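The first post also asked that the search skip the similar *.cluster.summary files sitting in parent directories. One possible way to handle that, sketched here under the assumption that your find supports -mindepth and -maxdepth (GNU and BSD find do, but they are not part of POSIX), is to match only files exactly two directories below the starting point. The directory and file names below are hypothetical test data, not the layout from the thread.

```shell
#!/bin/bash
# Hypothetical layout: only the file two directories down should be read.
top=$(mktemp -d)
mkdir -p "$top/run1/sampleA"
seq 101 145 > "$top/top.cluster.summary"             # starting directory: skipped
seq 201 245 > "$top/run1/mid.cluster.summary"        # one directory down: skipped
seq 301 345 > "$top/run1/sampleA/a.cluster.summary"  # two directories down: kept

out="$top/clusteroutputfile.txt"

# Depth 3 from the starting point means start/dir/subdir/file.
# find hands all matching files to a single awk call, so FNR (the line number
# within the current file) is used instead of NR, which keeps counting across files.
find "$top" -mindepth 3 -maxdepth 3 -type f -name '*.cluster.summary' \
    -exec awk 'FNR == 13 || FNR == 36 || FNR >= 40' {} + > "$out"

result=$(cat "$out")
echo "$result"
rm -rf "$top"
```

Writing with > rather than >> replaces the output file on every run, so no separate rm step is needed, and passing the names through -exec avoids the word splitting that for file in $(find ...) applies to file names containing spaces.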


All times are GMT -5.
