I have a log file with simple text lines (no metacharacters). Each entry starts with a + followed by the name of the person who added the entry to the log. So for example:
+Alan
line1
line2
more lines
some blanks lines
+David
line
more lines
still more lines
+Alan
...
...
+Chris
..
..
What is a good way to extract all the entries by, say, Alan? Blank lines must be kept as part of the entry.
Yes, awk is good for this.
RS is best if it's at the end of the record. But here the + is at the beginning, so perhaps a state variable is simpler. (Set the state if the + is met. If state is good then print the current line.)
Waiting for an attempt from the OP...
My first goal was to find a list of all unique names. I can do that with:
awk '/^\+/' logfile | awk '!a[$0]++'
But I am still struggling with filtering out all entries of the same person. Ideally I could call a function with any of the names as parameter and have all his/her entries printed.
My first question would be, where does this information come from, and can the output format be changed? Often times you can modify what the program(s) in question output, so if you could get all the data for each person on one line, that'd make it FAR easier to get out info for just one person. I'm not an awk expert, and I'd probably write a perl script to do this.
My approach would be to read the file line-by-line (you don't say how big these files are), and look for the + sign, then compare the name to what you're looking for...if you find it, all other lines would be pushed into an array, until you hit the NEXT line with a + at the beginning. Name match? Keep shoving data out to the array. Doesn't match? keep reading. When you're done, you'll have an array with all of Alan's data in it, and you can output to screen/file/whatever. There are also approximately 10,000 other ways to do this, but for quick-and-dirty (this sounds like homework, honestly), that'd be my approach.
I can't imagine any circumstance where you should need to pipe awk to awk - it has all the conditionals needed, and the END block for tidying up after the input has reached EOF if needed.
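As a rough sketch of that point, the unique-names pipeline can probably be collapsed into one awk process by combining both conditions. The function name unique_names is made up for illustration, and the + is escaped here because an unescaped + right after ^ is not well defined in every awk.

```shell
# unique_names is a hypothetical helper name, not something from the thread.
# Usage: unique_names logfile
unique_names() {
    # Print a line only if it starts with "+" and has not been seen before.
    awk '/^\+/ && !seen[$0]++' "$1"
}
```

Names come out in order of first appearance, still carrying their leading +.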
The typical solution for this sort of thing is to search for your key and set a flag - print while the flag is true. Turn the flag off at the next non-key. You can pass the key in by a bash variable.
Pretty straightforward.
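A minimal sketch of the flag idea, assuming each header line is exactly + followed by the name with no trailing spaces. The function name entries_by and the awk variables name and keep are illustrative choices, not from the thread.

```shell
# entries_by is a hypothetical helper: entries_by NAME LOGFILE
entries_by() {
    # Pass the shell value into awk with -v, then toggle a flag on header lines.
    awk -v name="$1" '
        /^\+/ { keep = ($0 == "+" name) }  # header line: set or clear the flag
        keep                               # flag set: print the line, blanks included
    ' "$2"
}
```

Calling entries_by Alan logfile should print every Alan block, header lines and blank lines included. An exact string comparison is used rather than a regex match so that a name such as Al does not also pick up Alan.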
Thanks for the improvements. Yes the single awk is neat and concise.
This log has just over 2000 lines.
The log was used by several engineers as they installed a refrigeration plant. It all went well. But I am now preparing an activity report on each engineer's contribution.
I think I may have found a way to pull all blocks of text for each person (tag) with this command:
awk '/^\+/{f=(/tag/)} f' logfile # replace tag with the name, e.g. Alan
No failures so far. I don't like the fact that the tag is hard-coded in the command. I guess I could use something like:
<logfile awk '/^\+/{f=($0 == "+" tag)} f' tag=Alan
If there is a more efficient way I would be grateful to hear.
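For the per-engineer report, one possible shortcut is to split the whole log in a single pass instead of running awk once per name. This is only a sketch under assumptions: the names must be safe to use as file names (no slashes), the output directory must already exist, and the function name split_by_engineer and the .txt suffix are invented here.

```shell
# split_by_engineer is a hypothetical helper: split_by_engineer LOGFILE OUTDIR
split_by_engineer() {
    awk -v dir="$2" '
        /^\+/ { out = dir "/" substr($0, 2) ".txt" }  # header line: switch output file
        out   { print > out }                         # write the line to the current file
    ' "$1"
}
```

Within a single awk run, print > out truncates each file only when it is first opened, so later blocks by the same engineer are appended to that engineer's file.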