awk, sed, grep and paragraphs
I need to extract paragraphs that are more than 4 lines long from a text file.
The paragraph length may vary according to the results from a wget request. The paragraphs are separated by blank lines and I need the entire contents of that paragraph to be returned in order to follow the redirects.
What would be the best way of doing this?
Welcome to LQ!
The quick & easy way:
What this does is quite simple: awk normally treats each
line (\n) as a record, and any run of whitespace as a
field separator. What we did here is tell it that a field
is anything ending in a line-end (FS), and that a record is
terminated by a sequence of 2 line-endings (RS, with nothing
else in between, AKA our empty line between paragraphs). The
rest is even simpler: if NF (the number of fields, AKA lines
with content) is greater than or equal to 4, perform the
default action (which is print, and which we have lazily
omitted). The significance of RS=ORS and FS=OFS respectively
is that we don't want the output to be reformatted with the
"standard" awk separators.
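
The one-liner itself did not survive in this copy of the thread. Going by the explanation, it set RS and ORS to two newlines, set FS and OFS to one newline, and used NF >= 4 as the pattern. What follows is only a minimal sketch in that spirit, not the original command. It assumes awk's paragraph mode (RS set to the empty string) instead of a literal two-newline RS, because a multi-character RS is an extension that not every awk supports, and paragraph mode also copes with a trailing newline at the end of the file. The helper name long_paragraphs and the variable min_lines are made up for illustration.

```shell
# long_paragraphs is a hypothetical helper name, not taken from the thread.
# Usage: long_paragraphs MIN_LINES [FILE...]
long_paragraphs() {
    min_lines=$1
    shift
    awk -v min="$min_lines" '
        BEGIN {
            RS  = ""       # paragraph mode: blank lines separate records
            FS  = "\n"     # every line of a paragraph becomes one field
            ORS = "\n\n"   # print a blank line after each paragraph
        }
        NF >= min          # no action given, so matching records are printed
    ' "$@"
}

# Paragraphs of 4 or more lines, as in the explanation above:
#   long_paragraphs 4 results.txt
# Strictly more than 4 lines, as the question asks:
#   long_paragraphs 5 results.txt
```

Here results.txt stands in for whatever file the wget output was saved to. Note that a separator line containing spaces or a carriage return does not count as blank in paragraph mode, so such input may need cleaning first.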
OP, did you find the explanation satisfactory? Nothing left unclear?
Not sure if the OP is aware of what "OP" means - depends on whether they are "forum-savvy" or not. :)