LinuxQuestions.org

d.vanheeckeren 04-28-2013 11:32 AM

Stripping all but lines matching string from text files
 
Have a question...about Linux...so figured this site was a good place to ask. :D

I have a bunch of server log files, and I don't need to keep all the information in them, only certain lines.

For example, I want to keep ONLY lines with the keywords "Chat", "Global", "Execute", and "Broadcast". I need the entire line it's on, with date and time, but don't need all the other lines.

I did see another thread stripping out lines by keyword, but I want to do the opposite, stripping out everything BUT those lines.

Thanks in advance for any help!

[Edit] Oh yeah...I should have mentioned that this is for a shell script I want to write. It will take log files from a Samba network share, strip out everything but those lines, and save the results to a directory on the web server to allow public viewing of events in the logs (the script will run as a cron job on the web server).

TB0ne 04-28-2013 11:54 AM

We'll be glad to help you get a shell script going...so post what you've written/done so far, and where you're stuck. But we aren't going to write it for you.

If it was me, I'd just throw the log file through grep for whatever keywords you want, and output them to another file. Lots of information on the grep man page, and easily found on Google too:
Code:

grep 'Global\|Chat\|Execute\|Broadcast' /path/to/log.file > /path/to/output.file
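
A quick way to sanity-check that pattern is to run it against a throwaway file first. The sketch below is only an illustration: the sample log lines and the temporary file names are made up, not taken from any real server log. It also shows the same filter written for grep -E, where a plain | means "or" (the \| form relies on GNU grep's basic-regex syntax).

```shell
#!/bin/sh
# Hypothetical sample log; real log lines will look different.
tmp=$(mktemp -d)
cat > "$tmp/sample.log" <<'EOF'
04/28/13 11:01:02 Chat: hello everyone
04/28/13 11:01:05 Player joined the game
04/28/13 11:02:10 Global: server restart in 5 minutes
04/28/13 11:02:30 Score updated
04/28/13 11:03:00 Execute: sv_kick 3
04/28/13 11:03:15 Broadcast: play nice
EOF

# Basic regex with \| as the "or" operator, as in the command above.
grep 'Global\|Chat\|Execute\|Broadcast' "$tmp/sample.log" > "$tmp/bre.txt"

# Same filter as an extended regex; -E makes a bare | mean "or".
grep -E 'Global|Chat|Execute|Broadcast' "$tmp/sample.log" > "$tmp/ere.txt"

# Show the kept lines; they stay in their original (chronological) order.
cat "$tmp/bre.txt"
```

Both commands should keep the same lines, whole and in file order, since grep only drops non-matching lines and never reorders anything.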

d.vanheeckeren 04-28-2013 12:26 PM

Quote:

Originally Posted by TB0ne (Post 4940566)
But we aren't going to write it for you.

That's cool, I wasn't expecting that. :) I removed the pathnames, and I used the three question marks as a placeholder for the actual operation. I thought it should be really simple, so what I've got so far is this:
Code:

mv [logpath]/*.log [destinationpath]
for i in [destinationpath]/*.log; do ???/$i.txt; done

So I will try this now (but using cp instead of mv while testing):
Code:

mv [logpath]/*.log [destinationpath]
for i in [destinationpath]/*.log; do grep 'Global\|Chat\|Execute\|Broadcast' "$i" > "${i%.log}.txt"; done
rm [destinationpath]/*.log
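
For a dry run before touching real logs, a self-contained version of that loop might look like the sketch below. The src and dest directories are hypothetical stand-ins created under a temp directory, and the sample lines are invented. One detail worth noting: inside the loop the variable already holds the full path of each file (directory and .log included), so the output name is built from basename rather than by adding the path and extension again.

```shell
#!/bin/sh
# Stand-in directories; src and dest are hypothetical names for the
# log share and the web-server directory.
work=$(mktemp -d)
src="$work/logs"
dest="$work/public"
mkdir -p "$src" "$dest"
printf '%s\n' '11:00 Chat: hi' '11:01 Player joined' '11:02 Broadcast: gg' > "$src/server1.log"
printf '%s\n' '12:00 Score updated' '12:05 Execute: sv_ban 2' > "$src/server2.log"

for i in "$src"/*.log; do
    # If no .log files exist the glob stays literal; skip it.
    [ -e "$i" ] || continue
    # $i is a full path such as .../logs/server1.log;
    # basename strips the directory and the .log suffix.
    name=$(basename "$i" .log)
    # Read the log in place and write only matching lines to dest.
    grep 'Global\|Chat\|Execute\|Broadcast' "$i" > "$dest/$name.txt"
done

ls "$dest"
```

Because grep reads each log where it sits and writes elsewhere, the originals are never modified, which makes this safe to test with before adding any mv or rm step.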

Quote:

Originally Posted by TB0ne (Post 4940566)
If it was me, I'd just throw the log file through grep for whatever keywords you want, and output them to another file. Lots of information on the grep man page, and easily found on Google too:
Code:

grep 'Global\|Chat\|Execute\|Broadcast' /path/to/log.file > /path/to/output.file

I think grep is exactly what I was looking for! Simple and effective...sorry, I'm kind of a newbie to Linux, and didn't realize that grep could match multiple keywords at once. I was thinking I was going to have to do another loop for each keyword, but couldn't think of how to do that while still keeping the lines in chronological order. I should have looked though, sorry about that. I'll try it when I get back this evening, and if I run into any problems, I'll consult the man pages first. And thank you, TB0ne, for being willing to help!

TB0ne 04-28-2013 12:57 PM

You would be AMAZED at how many times folks DO expect a script written for them here. :)
No sweat, glad to help. This may fit your needs for now, but regular expressions are complicated and VERY powerful. A lot of Linux commands (like grep) can accept them, but getting them formatted correctly can be a chore at times. There are lots of pages that explain regex...my eyes glaze over after looking at them too long, though....

You shouldn't have to move your logs to another location (unless you WANT to), to get them read. You can just tell grep to go through them in place, and output to a different location for display. But if you're wanting to zero-out the log files so you only get new information, you have options:
  • The tail -f command, piped into grep. tail will grab any new lines going IN to the log file, and grep will check them for the string(s) and hork them out to the output file if they're found.
    Code:

    tail -f /path/to/log.file | grep 'Global\|Chat\|Execute\|Broadcast' > /path/to/output.file
  • Swatch or Logwatch: two utilities written specifically to watch log files. They may or may not work for you.
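
The follow-and-filter idea from the first bullet can be tried out safely with something like the sketch below. It is only a demo under stated assumptions: the paths live in a temp directory, the log lines are invented, and it relies on GNU tools (timeout from coreutils to end the demo after a few seconds, and grep's --line-buffered option so matches are written as they arrive rather than held in a buffer).

```shell
#!/bin/sh
# Hypothetical paths in a temp directory stand in for the real log
# and the public output file.
tmp=$(mktemp -d)
log="$tmp/server.log"
out="$tmp/filtered.txt"
: > "$log"

# Simulate the game server appending lines a moment after tail starts.
(
    sleep 1
    echo '11:00 Chat: hello' >> "$log"
    echo '11:01 Player joined' >> "$log"
    echo '11:02 Global: restart soon' >> "$log"
) &

# -n 0 makes tail start with new lines only. timeout stops tail after
# 3 seconds so this demo terminates; a real watcher would leave it
# running. --line-buffered flushes every match to the file at once.
timeout 3 tail -n 0 -f "$log" |
    grep --line-buffered 'Global\|Chat\|Execute\|Broadcast' > "$out"

wait
cat "$out"
```

Without -n 0, tail -f first prints the last 10 lines already in the file, so restarting the watcher could write those lines to the output a second time.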

cortman 04-28-2013 02:56 PM

Just as an FYI, I've found O'Reilly's Mastering Regular Expressions to be a really helpful reference. It's written simply and is easy to understand, yet it's very in-depth.

d.vanheeckeren 04-29-2013 09:00 AM

Thanks for all the help, guys. I never did get a chance last night to try that script, and today's gonna be really really busy, so I probably can't get to it until tomorrow. TB0ne and cortman, thanks for the suggestions, I'm going to see if I can find that book on regular expressions on the Amazon ebook store.

And even though I haven't had time to test it yet, I woke up in the middle of the night with another idea. LOL (My weird brain does that random kind of stuff) But anyway, here's my idea:
First, these logs are from a game server. The original reason I wanted to strip all but the desired lines was to keep a record of what happens: we're starting a Halo CE and UT99 community, and people are already complaining that an admin has kicked or banned them for no reason, so this way I'd have a record to verify what people claim. I said I wanted to make these publicly viewable too, and I was thinking just text files. But when I woke up in the middle of the night, I realized I'm gonna try making this script create an actual HTML page (shouldn't be that hard for simple HTML tags and some basic formatting, I would think), and limit each HTML file to a specific (so far undetermined) size. Then all I'll have to do is create an HTML page that displays a listing of its own directory, and the files will be viewable by anybody, leaving no more room for argument. So Wednesday, I'm gonna do my best to get it all done if I have enough time.
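
A rough sketch of how that could go: split the filtered text into fixed-size chunks, then wrap each chunk in a few HTML tags. Everything specific here is an assumption for illustration, including the temp directory, the page_ file prefix, the invented log lines, and the limit of 2 lines per page (the real size is still undecided). The sed step escapes &, < and > so that whatever players type in chat can't end up as markup on the page.

```shell
#!/bin/sh
# Hypothetical input: an already-filtered text file in a temp directory.
tmp=$(mktemp -d)
filtered="$tmp/filtered.txt"
printf '%s\n' \
    '11:00 Chat: <admin> hello & welcome' \
    '11:02 Global: restart soon' \
    '11:03 Execute: sv_kick 3' \
    '11:05 Broadcast: play nice' \
    '11:09 Chat: bye' > "$filtered"

# Placeholder limit; the real per-page size is still undecided.
lines_per_page=2

# split writes chunks named page_aa, page_ab, ... holding at most
# $lines_per_page lines each.
split -l "$lines_per_page" "$filtered" "$tmp/page_"

for chunk in "$tmp"/page_??; do
    {
        echo '<html><body><pre>'
        # Escape &, < and > so log text cannot inject markup.
        # The & replacement has to run first.
        sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' "$chunk"
        echo '</pre></body></html>'
    } > "$chunk.html"
    rm "$chunk"
done

ls "$tmp"
```

Limiting by line count keeps log lines whole. If the limit should be in bytes instead, GNU split has a -C option that caps the chunk size while still breaking only at line ends.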

On a side note, as these are game server logs from a Windows machine (Halo is running on Windows 7, the UT server on Linux), and not actual Linux server log files, I don't know if Swatch or Logwatch would help me or not, but I will check into them when I get a chance too.

And again, thanks for the toss-ins! :)


All times are GMT -5.