LinuxQuestions.org
Old 02-24-2010, 03:49 AM   #1
ThinkLinux
LQ Newbie
 
Registered: May 2009
Posts: 1

Rep: Reputation: 0
awk, sed, grep and paragraphs


Hi,

I need to extract paragraphs that are more than 4 lines long from a text file.
The paragraph length varies with the results of a wget request. The paragraphs are separated by blank lines, and I need the entire contents of each matching paragraph returned so that I can follow the redirects.

What would be the best way of doing this?

Thanks
 
Old 02-24-2010, 05:08 AM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,004
Blog Entries: 11

Rep: Reputation: 903
Hi,

welcome to LQ!

The quick & easy way:
Code:
awk 'BEGIN{RS=ORS="\n\n";FS=OFS="\n"}NF>=4' file
[edit]
What this does is quite simple: awk normally treats each
line (\n) as a record and any run of whitespace as a field
separator. Here we tell it that fields are separated by a
single line ending (FS) and that records are separated by two
consecutive line endings (RS, with nothing else in between,
AKA the empty line between paragraphs). The rest is even
simpler: if NF (the number of fields, here the number of lines
in the paragraph) is greater than or equal to 4, awk performs
the default action (which is to print the record, and which we
have lazily omitted). Setting RS=ORS and FS=OFS means the
output keeps the input's paragraph layout instead of being
reformatted to the "standard" awk separators.
[/edit]
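
For anyone who wants to try this out, here is a minimal sketch of the same idea, written under a couple of assumptions. It uses RS="" (awk's POSIX paragraph mode) instead of RS="\n\n", because a multi-character RS is treated as a regular expression by gawk and mawk but is unspecified in POSIX awk. The function name long_paragraphs, the min variable, and the sample text are made up for illustration and are not from the post above. Since the original question asks for "more than 4 lines", the example calls it with a minimum of 5.

```shell
#!/bin/sh
# long_paragraphs MIN [FILE...]
# Hypothetical helper (not from the post above): print every paragraph
# that has at least MIN lines, with a blank line after each one.
long_paragraphs() {
    min=$1
    shift
    # RS=""  : paragraph mode, records are separated by one or more blank lines
    # FS="\n": each line of the paragraph is one field, so NF is the line count
    # ORS    : put back the blank line that paragraph mode strips
    awk -v min="$min" 'BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" } NF >= min' "$@"
}

# Made-up sample input: paragraphs of 2, 4, and 5 lines.
sample='a1\na2\n\nb1\nb2\nb3\nb4\n\n\nc1\nc2\nc3\nc4\nc5\n'

# "More than 4 lines" means a minimum of 5; only the c1..c5 paragraph prints.
printf "$sample" | long_paragraphs 5
```

With a minimum of 4 the b1..b4 paragraph is printed as well, which matches the NF>=4 test in the one-liner. Paragraph mode also copes with several blank lines in a row between paragraphs, which a literal "\n\n" separator would not treat the same way.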


Cheers,
Tink

Last edited by Tinkster; 02-24-2010 at 11:10 AM. Reason: Added explanation - hope it helps.
 
Old 02-25-2010, 08:02 PM   #3
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,004
Blog Entries: 11

Rep: Reputation: 903
OP, did you find the explanation satisfactory? Nothing left unclear?
 
Old 04-09-2010, 02:22 PM   #4
Star_Gazer
Member
 
Registered: Aug 2009
Location: Virginia, United States
Distribution: openSUSE 11.2 KDE
Posts: 32

Rep: Reputation: 15
Quote:
Originally Posted by Tinkster View Post
OP, did you find the explanation satisfactory? Nothing left unclear?
It educated me some!

Not sure if the OP is aware of what "OP" means - depends on whether they are "forum-savvy" or not.

Clifton
 
  


Tags
awk, grep, sed


All times are GMT -5.
