Read lines after a specific word.

Kashif_Bash · 04-20-2012, 10:07 AM

I have a text file (basically a log file) that contains several occurrences of the string "address 5". I need to find the last occurrence and write that line plus the next 20 lines to a temp file for further manipulation.

Or, working backwards: go to the end of the file, find the first occurrence from there, and write that line and the next 20 lines to a temp file.

I'm using bash on CentOS.

Thanks in advance, and do let me know if my question isn't clear.

Nominal Animal · 04-20-2012, 10:13 AM

How about

Code:

grep -A 20 -e 'address 5' input > output

You get a line containing two dashes (--) in output between matches, unless the matches overlap, in which case you get the output from the first match continuously to 20 lines after the last match in that continuous segment.

If you only want the last match, try

Code:

grep -A 20 -e 'address 5' input | tail -n 21 > output

Edited: To include the match itself, you need the last 21 lines (the parameter to tail -n).
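
To see the line counts concretely, here is a minimal sketch run against a made-up file. The temporary directory, the 100-line sample, and the placement of the matches on lines 10 and 50 are all assumptions for illustration, and it relies on seq and mktemp being available.

```shell
# Build a hypothetical 100-line input with matches on lines 10 and 50.
tmpdir=$(mktemp -d)
log="$tmpdir/input"
seq 1 100 | sed -e 's/^/line /' \
    -e '10s/$/ address 5/' \
    -e '50s/$/ address 5/' > "$log"

# grep prints both groups (lines 10-30 and 50-70) with a "--" line between
# them; tail then keeps the last match plus its 20 following lines.
grep -A 20 -e 'address 5' "$log" | tail -n 21 > "$tmpdir/output"

head -n 1 "$tmpdir/output"   # prints "line 50 address 5"
```

This only holds when at least 20 lines follow the last match; otherwise tail reaches back into earlier output.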

danielbmartin · 04-20-2012, 10:59 AM

Quote:

Originally Posted by Nominal Animal

If you only want the last match, try

Code:

grep -A 20 -e 'address 5' input | tail -n 20 > output

Nitpick...
OP asked for the last match and the following 20 lines. I tested this and it returned only the following 20 lines.

Daniel B. Martin

Nominal Animal · 04-20-2012, 11:12 AM

Quote:

Originally Posted by danielbmartin

OP asked for the last match and the following 20 lines. I tested this and it returned only the following 20 lines.

Quite right, thanks for noticing that. The fix is obvious, using tail -n 21 instead. Fixed in the post.

David the H. · 04-21-2012, 10:49 AM

Quote:

Originally Posted by Nominal Animal

You get a line containing two dashes (--) in output between matches...

GNU grep now has two options for controlling this: you can change the separator or leave it out entirely.

Code:

grep --no-group-separator -A 20 -e 'address 5' input > output

grep --group-separator='*****' -A 20 -e 'address 5' input > output

No, they're not mentioned in the man page yet, but they are in the info page.
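
A quick way to compare the three behaviours, assuming a GNU grep recent enough to support both options. The sample file, the shell variable names, and the short -A 2 context are made up to keep the output small.

```shell
# Hypothetical 30-line input with matches on lines 5 and 20.
tmpdir=$(mktemp -d)
log="$tmpdir/input"
seq 1 30 | sed -e 's/^/line /' \
    -e '5s/$/ address 5/' \
    -e '20s/$/ address 5/' > "$log"

# Default: the two groups are separated by a "--" line.
default_out=$(grep -A 2 -e 'address 5' "$log")
# No separator at all between groups.
plain_out=$(grep --no-group-separator -A 2 -e 'address 5' "$log")
# Custom separator text between groups.
custom_out=$(grep --group-separator='*****' -A 2 -e 'address 5' "$log")

printf '%s\n' "$custom_out"
```

Dropping the separator matters when the output is counted with tail or head afterwards, since the "--" line would otherwise count as one of the lines.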

Kashif_Bash · 04-23-2012, 11:20 AM

Opening this thread again

Kashif_Bash · 04-24-2012, 05:34 PM

I'm sorry to say that I've found a bug in the following command:

Quote:

grep -A 20 -e 'New USB device found' /var/log/messages | tail -n 16 > usb_detail

The problem is that I'm trying to get the line containing "New USB device found" and the NEXT 16 lines. What this command actually does is go to the last occurrence of "New USB device found" and then write just the last 16 lines of the output to the temp file.

So if there are 20 lines after the last occurrence of "New USB device found", I get the final 16 of them. That is not what I'm looking for. I want the 16 lines immediately after "New USB device found".

I hope this makes it clearer.

descendant_command · 04-24-2012, 05:52 PM

'grep -A 16 ...' & 'tail -n 17'

David the H. · 04-25-2012, 11:32 PM

@descendant_command: Just what do you think the command you posted is supposed to do?

@Kashif_Bash: Think about how the commands you are using work. "grep -A20" prints every line that matches, plus the 20 lines following each one. That output is then sent into "tail -n16", which discards all but the last 16 lines.

What you really want is the first 17 lines of the last 21 lines of the output: the matching line plus the 16 lines after it. So what we need to do is run it through a second filter, head in this case.

Code:

grep -A 20 -e 'New USB device found' /var/log/messages | tail -n 21 | head -n 17 > usb_detail

However, that would still fail if the last match is followed by fewer than 20 lines, because tail would then reach back past the match into earlier output. I think this will serve you better:

Code:

tac /var/log/messages | grep -B16 -m1 'New USB device found' | tac > usb_detail

tac (cat backwards) prints the file in reverse line order, so you filter from the last line up. grep then stops at the first instance (-m1) of the string and prints it together with the 16 lines before it (-B16), or as many as there actually are. In the reversed stream those are the lines that follow the match in the original file. Finally, tac is used again to put the lines back in the correct order.
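
As a rough check of the short-tail case, here is a sketch on a made-up log where only 5 lines follow the last match. The temporary directory, the file contents, and the match positions are assumptions, and tac is taken to be the GNU coreutils version.

```shell
# Hypothetical sample log: 100 lines, matches on lines 10 and 95,
# so only 5 lines follow the last match.
tmpdir=$(mktemp -d)
log="$tmpdir/messages"
seq 1 100 | sed -e 's/^/line /' \
    -e '10s/$/ New USB device found/' \
    -e '95s/$/ New USB device found/' > "$log"

# Reverse the file, stop at the first match (the last one in the original),
# keep up to 16 lines "before" it, then restore the original order.
tac "$log" | grep -B16 -m1 'New USB device found' | tac > "$tmpdir/usb_detail"

cat "$tmpdir/usb_detail"   # lines 95 through 100 only
```

Nothing from before the match leaks into the result, which is the case where the tail-based pipeline goes wrong.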

It may be possible to come up with something cleaner using sed or awk. I'll have to think about it a bit.

Edit: Here's a quickly-knocked-out awk expression that appears to do it, although it's a bit cumbersome. Likely grail or someone will come along and embarrass me with a much simpler version.

Code:

awk '/New USB device found/ { t=""; f=1 } ; { if ( f != 0 && f <= 17 ) { t=t"\n"$0 ; f++ } else{ f=0 }} END{ sub(/^\n/,"",t) ; print t }' /var/log/messages > usb_detail

Edit2: Whoops, I had to revise the code. It was printing an unwanted newline before the output. But I'm not sure how to remove it except to use a sub function. I also decided to shorten the variables to one letter, to make it look a bit cleaner (t for "text" and f for "flag").
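
For comparison, the same idea can be sketched with an array buffer, which sidesteps the leading-newline cleanup. This is only one possible variant, not a replacement for the one-liner above; the names n, left, count and buf are made up, and the sample log is hypothetical.

```shell
# Hypothetical sample log: 100 lines, matches on lines 10 and 40.
tmpdir=$(mktemp -d)
log="$tmpdir/messages"
seq 1 100 | sed -e 's/^/line /' \
    -e '10s/$/ New USB device found/' \
    -e '40s/$/ New USB device found/' > "$log"

awk -v n=16 '
# Each match restarts the buffer and allows n + 1 lines (match plus n more).
/New USB device found/ { count = 0; left = n + 1 }
# While the allowance lasts, store the current line.
left > 0 { buf[++count] = $0; left-- }
# Only the buffer belonging to the last match survives to the end.
END { for (i = 1; i <= count; i++) print buf[i] }
' "$log" > "$tmpdir/usb_detail"

head -n 1 "$tmpdir/usb_detail"   # prints "line 40 New USB device found"
```

If the file has no match at all, the loop in END never runs and the output file is simply empty.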

descendant_command · 04-26-2012, 12:15 AM

@david

It's not a command - it is a hint as to how to adjust the previous command so that it will work as desired.

David the H. · 04-26-2012, 04:06 AM

Ah, I see it now. You really should take time to explain such things more clearly. I had taken the "&" to be part of the suggested change, which of course is not right at all.
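
Spelled out, the hint amounts to giving grep -A the number of following lines wanted and giving tail one more than that. A minimal sketch on a made-up log follows; the file and the match positions are assumptions, and it only works when at least 16 lines follow the last match.

```shell
# Hypothetical sample log: 100 lines, matches on lines 10 and 40.
tmpdir=$(mktemp -d)
log="$tmpdir/messages"
seq 1 100 | sed -e 's/^/line /' \
    -e '10s/$/ New USB device found/' \
    -e '40s/$/ New USB device found/' > "$log"

# -A 16 prints each match plus 16 lines; tail -n 17 keeps the last group.
grep -A 16 -e 'New USB device found' "$log" | tail -n 17 > "$tmpdir/usb_detail"

head -n 1 "$tmpdir/usb_detail"   # prints "line 40 New USB device found"
```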