AWK - skip line if line contains pattern and print next line
Dear experts,
From a txt file in input:

0 this_pattern
1
2
3
4
0 this_pattern
5
6
7
8
9
10
11
0 this_pattern
etc.

I would like to find matching strings (e.g. "this_pattern") and print each match along with the following FIVE lines. It is mandatory that the following lines must not contain "this_pattern"; such a line has to be ignored, and the next line has to be printed instead. My output should be the following:

0 this_pattern
1
2
3
4
5
0 this_pattern
5
6
7
8
9
etc.

So far this is my AWK command:

awk '/ this_pattern / {nr[NR]; nr[NR+1]; nr[NR+2]; nr[NR+3]; nr[NR+4]; nr[NR+5]} ; NR in nr'

However, this does not cause lines containing "this_pattern" to be ignored. I essentially only need to say somehow that if line NR+n contains "this_pattern", it has to be ignored. Any help or a different approach would be highly appreciated.

Sincerely,
Udiubu |
I would have the pattern increment a counter. Then I would have a second statement print a line, and increment the counter, if the counter is already greater than zero. Then if the counter is greater than 5, reset it to zero.
|
I'd also anchor the pattern to the beginning of the line or to the beginning of the field. Use a ^ for that.
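As a rough illustration of the counter idea described above (one possible reading of it, not a solution to the full problem), the sketch below starts a counter at the match, prints and counts each following line, and resets the counter once five lines have gone by. It assumes the pattern sits in the second field, as in the sample input; the function name `print_block` is made up for the example.

```shell
# Hypothetical helper: print each matching line plus the five lines after it.
print_block() {
  awk '
    $2 ~ /^this_pattern/ { cnt = 1; print; next }  # match: print it, start counting
    cnt > 0              { print; cnt++ }          # inside a block: print and count
    cnt > 5              { cnt = 0 }               # five lines printed: stop
  ' "$@"
}

# Sample run: prints the matching line followed by 1, 2, 3, 4, 5.
printf '%s\n' '0 this_pattern' 1 2 3 4 5 6 7 | print_block
```

Note that a second match inside a running block simply restarts the count here; it is not skipped.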
|
Dear Turbocapitalist,
Thanks for your reply. It's not entirely clear to me exactly what you mean, though. Maybe a command line would be more helpful to get your point.
Udiubu |
The elegant way sets a counter that is decremented until 0:
Code:
awk '/this_pattern/ {cnt=6} (cnt && cnt--)'
By switching the order one can print the (5) following lines:
Code:
awk '(cnt && cnt--); /this_pattern/ {cnt=5}' |
The first line above produces the output shown as an example in #1 above.
It is much smoother than what I proposed. However, this being the newbie subforum, the shortcuts that awk takes can be pointed out: if an action statement is left off after the pattern, a print is assumed, and if the print has no parameters then $0 is assumed. So the following is the same as the first line above, but shows the 'hidden' print statement:
Code:
awk '/this_pattern/ {cnt=6}; (cnt && cnt--) {print}'
Code:
awk '$2 ~ /^this_pattern/ {cnt=6}; (cnt && cnt--) {print}' |
Thanks to both of you for the great suggestions.
I got the point of the commands. The problem is that this does not solve my issue. I am using

awk '/this_pattern/ {cnt=6} (cnt && cnt--)'

but I still get the following:

0 this_pattern
1
2
3
4
0 this_pattern
5
6
7
8
9
0 this_pattern

After each "this_pattern" line I would need FIVE lines not containing the string "this_pattern", which can simply be ignored.

0 this_pattern
1
2
3
4
5
0 this_pattern
5
6
7
8
9
0 this_pattern

In this sense, when the first "this_pattern" matches, the next five lines need to be printed, but the second "this_pattern" has to be skipped. Then, when the second "this_pattern" matches, the next five lines need to be printed. Or is it the case that once a line is skipped, it cannot be retrieved again? An ideal solution would be to say that the following lines must not contain alphabetic strings: if one does, ignore it. Hope this helps! Thank you so much for your valuable help.
Best,
Udiubu |
The following, if it meets another match while the counter is still running, skips the printing and decrementing:
Code:
awk '/this_pattern/ {if (cnt) next; cnt=6} (cnt && cnt--)' |
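To see what this variant does on the sample data from #1, it can be run on that input directly. This is only a test harness around the one-liner above; the function name `skip_inner_matches` is made up for the example.

```shell
# Hypothetical wrapper around the one-liner above, for trying it on sample data.
skip_inner_matches() {
  awk '/this_pattern/ {if (cnt) next; cnt=6} (cnt && cnt--)' "$@"
}

printf '%s\n' '0 this_pattern' 1 2 3 4 '0 this_pattern' 5 6 7 8 9 10 11 '0 this_pattern' |
  skip_inner_matches
# prints, one per line: 0 this_pattern, 1, 2, 3, 4, 5, 0 this_pattern
```

The second match is dropped while the counter is running, so no block is started from it; only the first block and the final match are printed.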
Wait. We're missing an explanation for where the extra "5" comes from. It is not in the input you have shown. But it is in the output:
Quote:
|
@Turbocapitalist - I think the '5' you mention actually comes from the line after this_pattern in the input file, hence the solution is a little more complicated. Not only do you need to provide data from additional lines, but you also then need a rewind type option.
If I am understanding correctly, and OP may correct me if not, I think the following example is a little clearer:
Code:
0 this_pattern
Code:
0 this_pattern
So my suggestion would be to create an array to store the required lines and, when you hit the limit (in this case 5), print out the array that has this many items. |
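One simple way to get the "rewind" effect with an array, sketched here under assumptions and not taken from any post in this thread, is to keep every line in memory and look ahead from each match once the input has been read. The function name `blocks_with_lookahead` and the variable `want` are made up for the example, and the whole file is held in memory, which may not suit very large inputs.

```shell
# Hypothetical helper: for every matching line, print it and the next
# five non-matching lines, looking ahead past any further matches.
blocks_with_lookahead() {
  awk -v want=5 '
    { line[NR] = $0 }                         # keep every line so we can "rewind"
    END {
      for (i = 1; i <= NR; i++) {
        if (line[i] !~ /this_pattern/) continue
        print line[i]                         # the matching line itself
        got = 0
        for (j = i + 1; j <= NR && got < want; j++)
          if (line[j] !~ /this_pattern/) {    # skip matches inside the block
            print line[j]
            got++
          }
      }
    }
  ' "$@"
}

printf '%s\n' '0 this_pattern' 1 2 3 4 '0 this_pattern' 5 6 7 8 9 10 11 '0 this_pattern' |
  blocks_with_lookahead
```

On the sample input this prints the first match with 1 to 5, the second match with 5 to 9, and the last match on its own because no lines follow it.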
Yes, it would be nice to see a better input/output example.
What I can imagine is to store the state somehow and print lines according to that state: Code:
awk ' |
Thanks everyone for the suggestions:
@grail: you got the point exactly, and your example is perfect to test. There should indeed be a rewind back to any pattern lines found along the way. However, I honestly do not really know how to implement it.
@MadeInGermany: this command works, but indeed once you skip a matched string along the next lines, it is simply lost and not recoverable.
awk '/this_pattern/ {if (cnt) next; cnt=6} (cnt && cnt--)'
Thanks for helping! |
Yes, information from a past cycle needs to be saved.
For example:
Code:
awk '/this_pattern/ {if (cnt) {save=$0; saved=1; next} else cnt=6} { if (cnt) {cnt--; print} else if (saved) {print "saved:"save; saved=0; print; cnt=6}}'
This might be close to your requirement (that I have still not got in full). |
Ok, so not pretty, but this is what I came up with so far:
Code:
/pattern/{ |
Very cool. I was wondering how to push something onto an array.
Isn't something needed at the end for a catch-all? There might be an array or two left over with fewer than 5 elements. Code:
END { |
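A self-contained sketch of the array idea with such a catch-all, written under assumptions and not a reconstruction of the code posted above: each match opens a new pending block, every non-matching line is appended to all pending blocks, a block is printed as soon as it holds five lines, and the END rule prints whatever is left over, even if short. The names `pending_blocks`, `emit`, `buf`, `len`, `first` and `want` are made up for the example.

```shell
# Hypothetical helper: stream the input, keeping one pending array per match.
pending_blocks() {
  awk -v want=5 '
    function emit(b,   k) {                   # print block b and free its lines
      for (k = 0; k <= len[b]; k++) { print buf[b, k]; delete buf[b, k] }
    }
    BEGIN { first = 1 }                       # oldest block not yet printed
    /this_pattern/ { n++; buf[n, 0] = $0; len[n] = 0; next }  # open a new block
    {
      for (b = first; b <= n; b++) buf[b, ++len[b]] = $0      # feed every open block
      while (first <= n && len[first] == want) emit(first++)  # oldest block is full
    }
    END { for (b = first; b <= n; b++) emit(b) }              # leftovers, possibly short
  ' "$@"
}

printf '%s\n' '0 this_pattern' 1 2 3 4 '0 this_pattern' 5 6 7 8 9 10 11 '0 this_pattern' |
  pending_blocks
```

Because older blocks always fill up before newer ones, printing the oldest full block first keeps the output in input order, and only the unfinished blocks stay in memory.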