LinuxQuestions.org
Old 02-29-2008, 02:01 PM   #1
willinusf
LQ Newbie
 
Registered: Jul 2004
Posts: 12

Rep: Reputation: 0
awk help


I have a file which is a catalog of molecules with information about those molecules. It is structured as follows:

blah
molecule1
info
blah
molecule2
info


The number of lines of info for each molecule varies. The header "blah" stays constant. I want to extract "blah" through info for each molecule and place the extracted data into a file named after that molecule. So,

blah
molecule1
info

Would go into a file named molecule1 with the extension mol2 (molecule1.mol2). All files would have this extension. I'm new to programming/scripting and would appreciate any help/comments. I've done this:

awk '/^molecule/,/blah/' file

But, that of course leaves out the initial header "blah" and I have no idea how to loop this. Thanks.

Will
 
Old 02-29-2008, 02:30 PM   #2
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Code:
awk -F'\n' 'NR>1{f=$1".mol2"; printf "blah\n%s", $0 > f; close(f)}' RS='blah\n' catalog
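For anyone new to RS, here is a rough sketch of what setting the record separator to the header line does. It assumes an awk that accepts a multi-character RS (gawk and mawk do; POSIX only promises a single character), and the sample catalog and the variable name summary are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory with a made-up catalog.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' blah molecule1 info blah molecule2 info > catalog

# With RS set to the header line, each record is one molecule's block.
# With -F'\n', each line of that block is a field, so $1 is the name.
# The text before the first "blah" is an empty record, hence NR > 1.
summary=$(awk -F'\n' 'NR > 1 { printf "record %d: name=%s\n", NR, $1 }' RS='blah\n' catalog)
printf '%s\n' "$summary"
```

Because the separator itself is consumed, the header has to be printed back explicitly if it should appear in the output files.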
 
Old 02-29-2008, 03:32 PM   #3
radoulov
Member
 
Registered: Apr 2007
Location: Milano, Italia/Варна, България
Distribution: Ubuntu, Open SUSE
Posts: 212

Rep: Reputation: 35
Another one (GNU Awk):

Code:
awk 'NF{close(f);print RS $0>(f=$1".mol2")}' ORS= RS="blah" catalog
If you don't run into the limit on open files,
you could simplify the code to:

Code:
awk 'NF{print RS $0>($1".mol2")}' ORS= RS="blah" catalog
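One thing to keep in mind when adding close(): awk keeps a redirected file open until it is closed or the program exits, and reopening a closed file with ">" starts it from scratch. A small sketch of that behaviour, using made-up file names (demo.txt, keep.txt) in a scratch directory:

```shell
#!/bin/sh
dir=$(mktemp -d) && cd "$dir" || exit 1

awk 'BEGIN {
    f = "demo.txt"
    print "first"  > f      # opens demo.txt, truncating it
    print "second" > f      # same open stream: written after "first"
    close(f)
    print "third"  > f      # reopened with ">": the earlier lines are lost
    close(f)

    g = "keep.txt"
    print "first"  >> g
    close(g)
    print "second" >> g     # ">>" keeps what was written before close()
}'

cat demo.txt    # third
cat keep.txt    # first, then second
```

So if the same molecule name could show up more than once in the catalog, ">>" is probably the safer redirection once close() is in play, at the cost of appending to files left over from an earlier run.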

Last edited by radoulov; 03-02-2008 at 08:09 AM. Reason: grammar ...
 
Old 02-29-2008, 09:06 PM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 240
There's a simple "algorithm" for this, so you can use it in any other language.
Code:
while IFS= read -r line
do
 case $line in
  blah )
        IFS= read -r name || break   # the line after the header is the molecule name
        file="${name}.mol2"          # name the output file after the molecule
        printf '%s\n%s\n' "$line" "$name" > "$file";;  # start the file with header and name
  *) [ -n "$file" ] && printf '%s\n' "$line" >> "$file";;  # append the info lines
 esac
done < "file"
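The same line-by-line idea carries over to awk without any record-separator tricks, so it should run on any POSIX awk. This is only a sketch: it assumes the header is exactly the line "blah" and that the line right after it is the molecule name. The variable names (want_name, out) and the sample catalog are made up.

```shell
#!/bin/sh
# Made-up catalog in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' blah molecule1 info-a info-b blah molecule2 info-c > catalog

awk '
  $0 == "blah" { want_name = 1; next }     # header seen: the next line is the name
  want_name    {                           # first line after the header
      if (out != "") close(out)            # finish the previous molecule
      out = $0 ".mol2"
      print "blah" > out                   # re-emit the header we skipped
      want_name = 0
  }
  out != ""    { print > out }             # name line and info lines
' catalog

cat molecule1.mol2
```

Only one output file is open at a time, so the number of molecules in the catalog should not matter.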
 
  

