LinuxQuestions.org - awk loops and deleting lines


Hi all,

I just found this forum by searching for some awk help and was wondering if any of you guys could aid me in a question I have.

I have a set of text files that are set up in the following way:

Quote:

Track 1: stat = 32, ifst = 1, ilst = 1, nit = 1 990720

t da hr stat k iop q x y p
200.0000 990720 0 32 1 0 0.000 159.500 -67.000 970.280

Track 2: stat = 32, ifst = 1, ilst = 6, nit = 6 990720

t da hr stat k iop q x y p
200.0000 990720 0 34 2 0 0.000 28.000 -65.500 973.100
200.2500 990720 600 44 1 0 0.982 30.500 -66.000 977.200
.....
.....

The number after "nit" in each "track" line defines how many lines of numbers are in each track. So Track 2 above is cut off and actually has 4 more lines since nit = 6 for that one. There are 1000+ tracks in each file.

What I need to do is delete any track for which nit <= 4.

I was thinking that the easiest way to do this would be to write a loop that reads nit from each line starting with "Track" and then, depending on its value, either keeps or deletes the next nit+3 lines (to account for the original track line itself and the blank lines separating each track).

I'm a complete novice at awk. My main programming experience is with C++/C# but unfortunately I am confined to a linux box at the moment.

Is this a good way to go about this task? Should I be looking into perl or python? I have no idea if they would be better suited to this task. I also don't have a clue how to delete an entire line, so even if someone could just clear that up for me, it would be much appreciated. :)

Thanks in advance for any help!

Try this

Code:

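/Track/ {
  # Read the value of nit: the field two positions after the literal "nit"
  for (i = 1; i <= NF; i++) {
    if ($i == "nit")
      nit = $(i+2)
  }
  # Print the track if nit > 4
  if (nit > 4) {
    print                       # the Track line itself
    i = 1
    # Copy the nit+4 lines that follow: blank, column header, nit data rows, two blanks.
    # Stop early if getline hits the end of the file.
    while (i <= nit+4 && (getline) > 0) {
      print
      i++
    }
  }
}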

I assume there are two blank lines between each track, otherwise you have to modify the while statement.
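If the number of blank lines between tracks is not constant, a flag-based variant avoids counting lines altogether. This is only a sketch, assuming every track starts with a line beginning with "Track" and that nit is always written as "nit = N"; the function name filter_tracks is made up for illustration. Note that "deleting" a line in awk simply means not printing it.

```shell
# Hypothetical helper: keep only tracks whose nit value is greater than 4.
filter_tracks() {
    awk '
    /^Track/ {
        keep = 0                            # default: drop this track
        for (i = 1; i <= NF - 2; i++)       # look for the literal field "nit"
            if ($i == "nit")
                keep = ($(i + 2) + 0 > 4)   # the value sits two fields later, after "="
    }
    keep                                    # print every line of a kept track
    ' "$@"
}
```

It could be called as `filter_tracks infile > outfile` (both file names are placeholders). Every line inherits the decision made at the most recent Track line, so the blank lines and the column header travel with their track.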

Quote:

Originally Posted by skray (Post 3566863)

I'm a complete novice at awk. My main programming experience is with C++/C# but unfortunately I am confined to a linux box at the moment.

Well, when you started learning C++/C#, what did you learn with? Books and the internet, right? It's the same with awk: read the docs! The GNU awk manual is a good place to start.

Quote:

Should I be looking into perl or python?

Those two languages are also viable solutions to your problem.

awk:

Code:

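awk 'BEGIN { RS = "Track" }
{
  # Pull out the number after "nit = ". gensub returns a string, so add 0
  # to force a numeric comparison (otherwise "10" > 4 would be compared as text).
  val = gensub(/.*nit = ([0-9]+) .*/, "\\1", "g", $0) + 0
  if (val > 4) {
    # RS was consumed as the separator, so put "Track" back in front.
    # (RT is empty after the last record, so it cannot be used for this.)
    printf "%s%s", RS, $0
  }
}' file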

Thanks, Ghostdog! I didn't know you can use \1 in gensub. Also the RS="Track" is cool! :)

Quote:

Originally Posted by colucix (Post 3566996)

Thanks, Ghostdog! I didn't know you can use \1 in gensub. Also the RS="Track" is cool! :)

Yes, you can. It's already documented; please check the link I gave the OP for the GNU manual. Note that gensub does not offer a way around greedy matching, and it's a GNU extension.
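For anyone without gawk, the number after nit can also be pulled out with match() and substr(), which are part of POSIX awk. A minimal sketch, assuming the value always appears as "nit = N" on the line; the function name nit_of is hypothetical:

```shell
# Hypothetical helper: print the nit value of every line that contains "nit = N".
nit_of() {
    awk '{
        # match() sets RSTART and RLENGTH to the position and length of the match
        if (match($0, /nit = [0-9]+/))
            # skip the 6 characters of "nit = " and keep only the digits
            print substr($0, RSTART + 6, RLENGTH - 6)
    }' "$@"
}

echo 'Track 2: stat = 32, ifst = 1, ilst = 6, nit = 6 990720' | nit_of   # prints: 6
```

Because only the digits are matched, there is no greedy ".*" to worry about here.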

Quote:

Originally Posted by ghostdog74 (Post 3567001)

Yes, you can. It's already documented; please check the link I gave the OP for the GNU manual. Note that gensub does not offer a way around greedy matching, and it's a GNU extension.

I always have the GNU awk manual at hand for reference, but I did not find that. Can you point me to the paragraph?

Edit: Found it. Just in the gensub paragraph:

Quote:

gensub provides an additional feature that is not available in sub or gsub: the ability to specify components of a regexp in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying ‘\N’ in the replacement text, where N is a digit from 1 to 9.

So... this feature is available for gensub only, not for sub or gsub. Good to know! :)
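A quick way to see the difference side by side. This is just an illustrative sketch: it assumes GNU awk is installed under the name gawk, and both function names are made up.

```shell
# Assumes GNU awk is installed as "gawk"; gensub does not exist in other awks.
nit_with_gensub() {
    # "\\1" in the replacement stands for the first parenthesized group
    gawk '{ print gensub(/nit = ([0-9]+) .*/, "\\1", 1) }'
}

# sub and gsub only know "&", which stands for the whole matched text.
mark_number_with_sub() {
    awk '{ sub(/[0-9]+/, "<&>"); print }'
}

echo 'nit = 6 990720' | nit_with_gensub        # prints: 6
echo 'nit = 6 990720' | mark_number_with_sub   # prints: nit = <6> 990720
```

Another difference worth knowing: gensub returns the modified string and leaves its target alone, while sub and gsub change the target in place and return the number of substitutions.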