LinuxQuestions.org


skray 06-08-2009 09:29 AM

awk loops and deleting lines
 
Hi all,

I just found this forum by searching for some awk help and was wondering if any of you guys could aid me in a question I have.

I have a set of text files that are set up in the following way:
Quote:

Track 1: stat = 32, ifst = 1, ilst = 1, nit = 1 990720

t da hr stat k iop q x y p
200.0000 990720 0 32 1 0 0.000 159.500 -67.000 970.280


Track 2: stat = 32, ifst = 1, ilst = 6, nit = 6 990720

t da hr stat k iop q x y p
200.0000 990720 0 34 2 0 0.000 28.000 -65.500 973.100
200.2500 990720 600 44 1 0 0.982 30.500 -66.000 977.200
.....
.....
The number after "nit" in each "track" line defines how many lines of numbers are in each track. So Track 2 above is cut off and actually has 4 more lines since nit = 6 for that one. There are 1000+ tracks in each file.

What I need to do is delete any track for which nit <= 4.

I was thinking that the easiest way to do this would be to write a loop that reads nit for each line starting with track and then either skip or delete nit+3 lines (to account for the blank lines in the files separating each track and the original track line itself) depending on the value of nit.

I'm a complete novice at awk. My main programming experience is with C++/C# but unfortunately I am confined to a linux box at the moment.

Is this a good way to go about this task? Should I be looking into perl or python? I have no idea if they would be better suited to this task. I also don't have any clue how to go about deleting an entire line, so even if someone could clear that up for me it would be much appreciated. :)

Thanks in advance for any help!

colucix 06-08-2009 10:10 AM

Try this
Code:

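/^Track/ {
  # Read the value of nit: it is the second field after the word "nit"
  nit = 0
  for (i = 1; i <= NF; i++) {
    if ($i == "nit")
      nit = $(i+2)
  }
  # Print the track if nit > 4
  if (nit > 4) {
    print
    # The rest of the track: blank, header, nit data lines, two blanks.
    # Stop early if getline hits end of file, so the last line is not repeated.
    i = 1
    while (i <= nit+4 && (getline) > 0) {
      print
      i++
    }
  }
}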

I assume there are two blank lines between tracks; otherwise you have to modify the while statement.
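
If the number of blank lines between tracks is not constant, a flag-based variant avoids counting lines altogether. This is only a sketch: it assumes every track starts with a line beginning with "Track" and that the count is written as "nit = N". The function name filter_tracks is made up, and the sample input is the (truncated) excerpt from the first post.

```shell
#!/bin/sh
# filter_tracks: hypothetical helper, keeps only tracks whose nit is greater than 4.
# Reads the named files, or standard input when no file is given.
filter_tracks() {
  awk '
  /^Track/ {
    keep = 0
    # Look for the three fields "nit", "=", N on the Track line
    for (i = 1; i < NF - 1; i++)
      if ($i == "nit" && $(i + 1) == "=")
        keep = ($(i + 2) + 0 > 4)
  }
  # Print every line, including the Track line itself, while the flag is set
  keep
  ' "$@"
}

# Sample taken from the question; Track 2 is cut off there as well.
filter_tracks <<'EOF'
Track 1: stat = 32, ifst = 1, ilst = 1, nit = 1 990720

t da hr stat k iop q x y p
200.0000 990720 0 32 1 0 0.000 159.500 -67.000 970.280


Track 2: stat = 32, ifst = 1, ilst = 6, nit = 6 990720

t da hr stat k iop q x y p
200.0000 990720 0 34 2 0 0.000 28.000 -65.500 973.100
200.2500 990720 600 44 1 0 0.982 30.500 -66.000 977.200
EOF
```

On a real file this would be called as filter_tracks file > file.new. Because the flag is only re-evaluated on Track lines, the blank lines and the header travel with whichever track they follow.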

ghostdog74 06-08-2009 11:29 AM

Quote:

Originally Posted by skray (Post 3566863)

I'm a complete novice at awk. My main programming experience is with C++/C# but unfortunately I am confined to a linux box at the moment.

Well, when you started learning C++/C, what did you learn with? Books and the internet, right? It's the same with learning awk: read the docs! Here

Quote:

Should I be looking into perl or python?
Those two languages are also viable solutions to your problem.


awk:
Code:

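awk 'BEGIN{RS="Track"}
NR>1{
 # gensub returns a string, so add 0 to get a numeric comparison
 val=gensub(/.*nit = ([0-9]+) .*/,"\\1","g",$0)+0
 if(val>4){
  # RT holds the separator after this record (empty for the last one),
  # so put the literal "Track" back in front instead
  printf "Track%s",$0
 }
}' file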


colucix 06-08-2009 11:33 AM

Thanks, Ghostdog! I didn't know you can use \1 in gensub. Also the RS="Track" is cool! :)

ghostdog74 06-08-2009 11:38 AM

Quote:

Originally Posted by colucix (Post 3566996)
Thanks, Ghostdog! I didn't know you can use \1 in gensub. Also the RS="Track" is cool! :)

Yes, you can. It's already documented; please check the link I gave the OP for the GNU manual. Note that gensub does nothing about greediness (the regex still takes the longest match), and it's a GNU extension, so it only works in gawk.
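
To make the greediness point concrete, here is a small sketch in plain POSIX awk, so gensub is not needed to see the effect. The helper name strip_to_nit and the input line with two "nit = " fields are invented for the demonstration; real track lines contain only one.

```shell
#!/bin/sh
# strip_to_nit: hypothetical helper, deletes everything up to and including "nit = ".
strip_to_nit() {
  awk '{ s = $0; sub(/.*nit = /, "", s); print s }'
}

# Invented input with two occurrences of "nit = "
echo "nit = 1, nit = 6 990720" | strip_to_nit
# prints: 6 990720
```

The leading .* takes the longest match it can, so it runs up to the last "nit = " on the line and the first occurrence is never the one that survives.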

colucix 06-08-2009 11:58 AM

Quote:

Originally Posted by ghostdog74 (Post 3567001)
Yes, you can. It's already documented; please check the link I gave the OP for the GNU manual. Note that gensub does nothing about greediness (the regex still takes the longest match), and it's a GNU extension, so it only works in gawk.

I always have the GNU awk manual at hand for reference, but I did not find that. Can you point me to the paragraph?

Edit: Found it. Just in the gensub paragraph:
Quote:

gensub provides an additional feature that is not available in sub or gsub: the ability to specify components of a regexp in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying ‘\N’ in the replacement text, where N is a digit from 1 to 9.
So... this feature is available for gensub only, not for sub or gsub. Good to know! :)
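
For an awk without gensub, roughly the same capture can be had with match() and substr(). This is a sketch that assumes the value is always written as "nit = " followed by digits; get_nit is a made-up helper name.

```shell
#!/bin/sh
# get_nit: hypothetical helper, prints the number after "nit = " on each Track line.
# Uses only POSIX awk features (match, RSTART, RLENGTH, substr).
get_nit() {
  awk '/^Track/ && match($0, /nit = [0-9]+/) {
    # RSTART and RLENGTH describe the whole match; skip the 6 characters of "nit = "
    print substr($0, RSTART + 6, RLENGTH - 6)
  }' "$@"
}

echo "Track 2: stat = 32, ifst = 1, ilst = 6, nit = 6 990720" | get_nit
# prints: 6
```

Since match() reports only the position and length of the whole match, the fixed prefix has to be trimmed by hand, which is the step that '\N' in gensub does for you.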


All times are GMT -5.