[SOLVED] [BASH/SHELL] grep/extract/display lines under specific strings (irregular number of lines)
However, the downside of this method is that I need to give a static number of lines to be displayed, and it is not always 3.
I was hoping that maybe here someone would give me some better way to do that?
You need something with a bit of logic plus regex - awk, perl, python, whatever you're comfortable with.
Find your header lines, set a flag and get the next record - print while you have the flag set. When you reach another (non-wanted) header turn the flag off.
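A minimal sketch of that flag approach in awk. It assumes the layout implied elsewhere in the thread: header lines such as AAA-1 on their own line, followed by sub-items that start with '|'. The sample data, the function name extract_sections and the variable names want and flag are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input; the real file.out layout is assumed from the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
AAA-1
| - first
| - second
BBB-2
| - skipped
DFG-54
| - third
| - fourth
| - fifth
| - sixth
EOF

# Print the lines under each wanted header, however many there are.
# "want" is an extended regex listing the wanted headers (illustrative name).
extract_sections() {
    awk -v want='^(AAA-1|DFG-54)$' '
        $0 ~ want { flag = 1; next }   # wanted header: set the flag, skip the header line
        !/^\|/    { flag = 0 }         # any other non-item line is an unwanted header: clear the flag
        flag                           # print the line while the flag is set
    ' "$1"
}

extract_sections "$sample"
rm -f "$sample"
```

With this sample, the two lines under AAA-1 and the four lines under DFG-54 are printed and the BBB-2 section is skipped. More headers can be added to the alternation in want.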
If I run: cat file.out | egrep -A 3 "AAA-1|DFG-54" | egrep -v "AAA-1|DFG-54" it will be more or less what I need although limited to only 3 lines whereas I need all lines in each selected section.
Edit: Never mind, I misunderstood the original question. If you only want specific groups (rather than "all lines under ones starting with capital letters" as you state in the OP), you will need something with "looping" capabilities.
Last edited by individual; 01-08-2020 at 06:21 AM.
This can be done with grep and a simple regex - you just need a couple of things:
* A lookbehind to locate the headers without matching (lookbehinds require Perl regex, so -P)
* The ability to match each section in one go means crossing lines, so -z prevents grep splitting on newline.
The lookbehind part (?<=...) contains each of the required headers/prefixes (including a newline for each one to prevent blank lines), and is easy to add new headers to: (?<=AAA-1\n|DFG-54\n|ANOTHER-1\n)
The second part matches as many sub-items as possible by checking for their literal '| - ' prefix, with an optional newline that is matched only when needed (but not after headers). The [^\n]+ part could be replaced with a specific sub-pattern if further filtering is needed.
Otherwise the first line of the next section (DFG-54) was displayed on the same line as the last line of the first section (AAA-1). It's probably something specific to my grep version.
...It's probably something specific to my grep version.
Hrm, actually I think it was the system I tested on - I checked with a newer grep and also get the merged lines you mentioned.
Your modification will be fine if AAA-1 is the first section, but otherwise you might want to remove all newlines from the headers, make the optional one mandatory, then just trim the first line, i.e.:
Code:
grep -Poz '(?<=AAA-1|DFG-54)(\n\| - [^\n]+)+' file.out | sed 1d
There is a criterion for "stop printing, prt=0" and a criterion for "start printing, prt=1", and at the appropriate place there is "prt" meaning "print if true".
In this case, knowing that "search" won't start with a | character, one can condense it to
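(The condensed command itself is missing from this copy of the thread. What follows is only one possible reading of it, not the poster's actual code. It assumes that every header line starts with something other than '|' and every sub-item starts with '|'. The function name extract_condensed and the variable want are made up for illustration; prt is the name used in the post.)

```shell
#!/bin/sh
# One possible condensed form: a single rule handles both "start" and "stop".
# Assumption: any line not starting with "|" is a header.
extract_condensed() {
    awk -v want='^(AAA-1|DFG-54)$' '
        !/^\|/ { prt = ($0 ~ want); next }   # header: prt becomes 1 if wanted, else 0; header is not printed
        prt                                  # sub-item: print if prt is true
    ' "$1"
}

# Hypothetical sample data, fed through a temporary file.
tmp=$(mktemp)
printf '%s\n' 'AAA-1' '| - one' 'BBB-2' '| - two' 'DFG-54' '| - three' > "$tmp"
extract_condensed "$tmp"
rm -f "$tmp"
```

Because every header line resets prt, no separate "stop" rule is needed. The trade-off is that it relies on the '|' prefix to tell items from headers.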