LinuxQuestions.org

- - awk help (https://www.linuxquestions.org/questions/programming-9/awk-help-624810/)

I have a file which is a catalog of molecules with information about those molecules. It is structured as follows:

blah
molecule1
info
blah
molecule2
info

The number of lines of info for each molecule varies. The header "blah" stays constant. I want to extract "blah" through info for each molecule and place the extracted data into a file named after that molecule. So,

blah
molecule1
info

would go into a file named molecule1 with the extension mol2 (molecule1.mol2). All files would have this extension. I'm new to programming/scripting and would appreciate any help/comments. I've done this:

awk '/^molecule/,/blah/' file

But that, of course, leaves out the initial header "blah", and I have no idea how to loop this. Thanks.

Will

Code:

awk -F'\n' 'NR>1{print "blah\n" substr($0, 1, length($0)-1) > ($1 ".mol2")}' RS='blah\n' catalog

Another one (GNU Awk):

Code:

awk 'NF{close(f);print RS $0>(f=$1".mol2")}' ORS= RS="blah" catalog

If you don't run into problems with too many open files, you could change the code to:

Code:

awk 'NF{print RS $0>($1".mol2")}' ORS= RS="blah" catalog
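
A regexp RS is an extension (POSIX leaves a multi-character RS unspecified), so here is a line-by-line sketch that should also run in a plain POSIX awk. It assumes, as in the example, that the catalog starts with the header line and that the molecule name is the first field of the line right after it. The function name split_catalog and its optional header argument are made up for illustration.

```shell
# split_catalog CATALOG [HEADER]
# Hypothetical helper: writes one <name>.mol2 file per record into the
# current directory. HEADER defaults to "blah" (the constant header line).
split_catalog() {
  awk -v header="${2:-blah}" '
    # A header line starts a new record: close the previous file, if any.
    $0 == header { if (out != "") close(out); out = ""; next }
    # First line after the header: name the file after it, write the header.
    out == ""    { out = $1 ".mol2"; print header > out }
    # Copy the name line and every info line into the current file.
                 { print > out }
  ' "$1"
}
```

Call it as "split_catalog catalog". Each output file is closed before the next one is started, so the open-file limit should not come into play.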

Quote:

Originally Posted by willinusf (Post 3074175)

There's an "algorithm" for this, so you can use it in any other language.

Code:

file=
while IFS= read -r line
do
 case $line in
  blah )
        file= ;;                      # header: the molecule name is on the next line
  *)
        if [ -z "$file" ]; then
          file="$line.mol2"           # name the new file after the molecule
          printf 'blah\n' > "$file"   # start the new file with the header
        fi
        printf '%s\n' "$line" >> "$file" ;;  # append the name and info lines
 esac
done < "file"
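
Whichever variant you use, it may be worth checking the result afterwards: if two molecules share a name, the later record can end up overwriting or mixing with the earlier one. A rough sketch of such a check, assuming the header is exactly "blah" and the pieces are the only *.mol2 files in the directory (check_split is a made-up name):

```shell
# check_split CATALOG [DIR]
# Hypothetical helper: compares the number of header lines in CATALOG
# with the number of *.mol2 files in DIR (default: current directory).
check_split() {
  catalog=$1
  dir=${2:-.}
  # Lines that consist of exactly "blah" = number of records.
  records=$(grep -c -x 'blah' "$catalog" || true)
  files=0
  for f in "$dir"/*.mol2; do
    # The glob stays literal when nothing matches, so test for existence.
    if [ -e "$f" ]; then
      files=$((files + 1))
    fi
  done
  echo "records: $records, files: $files"
  # Succeed only when every record got its own file.
  [ "$records" -eq "$files" ]
}
```

Run it as "check_split file" in the directory holding the output; a non-zero exit status means the counts differ.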