LinuxQuestions.org
10-05-2008, 02:49 PM   #1
Renan_S2
Extract lines NOT in a block of text from a file


Hello, I hope I'm not in the wrong forum.

I have a text file in the format:

Code:
START

blah blah blah, blah blah blah
...
...
...

END

Comments: .....

START

...
...
...
...
...

END

Comments: .....
Now I need a way to extract just the text that is NOT within the "START ... END" blocks. How would I do this?
I've searched for a way to do this with awk/sed, but couldn't find one.

Don't know if I've managed to express myself properly...


Thanks.
 
10-05-2008, 03:31 PM   #2
nadroj
I found this link, which you should be able to use to achieve what you need: http://student.northpark.edu/pemente/sed/sed1line.txt
If "START" and "END" are 'keywords' in whatever you're doing (that is, they cannot appear exactly as "START" or "END" except to mark the start and end of a 'block'), then it will be straightforward. If they can appear inside a 'block' but never at the beginning of a line, it will also be straightforward. If they can appear at the beginning of a line inside a block, it will be much more difficult (I think).

I never use 'sed', but I am very familiar with regular expressions. Using the link above as a reference, I was able to write a simple regular expression that prints what you need. Note that sed's '-n' option may or may not be needed: it turns off sed's automatic printing of every line, so you want it when you print the matching lines explicitly (with 'p') and not when you delete the unwanted ones (with 'd').

So let's think about what you need: you want to print everything that is not in a block starting with "START" and ending with "END". This is equivalent to not printing the lines from one that begins with "START" up to one that begins with "END". Try to write a sed expression that prints the blocks running from a line that starts with "START" to a line that starts with "END". Once you have that, you simply negate it, so that these blocks are not printed rather than printed. That is your answer.

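For example, a simple regular expression for printing blocks that start with START up to lines that start with END would conceptually be: ^START.*^END (in sed, which reads one line at a time, this becomes a range between the two patterns rather than a single regex).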
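Because sed processes its input line by line, a pattern like this is normally written as a range address, /first/,/last/, instead of one regex spanning several lines. Here is a minimal sketch of the printing half only, assuming the markers sit at the start of their lines as in the sample. The function name print_blocks and the demo input are made up for illustration:

```shell
# Print only the START..END blocks read from standard input.
# -n turns off sed's automatic printing; p prints each line in the range.
print_blocks() {
  sed -n '/^START/,/^END/p'
}

# Tiny sample in the format from post #1 (contents invented for the demo).
printf '%s\n' 'START' 'blah blah' 'END' 'Comments: one' \
              'START' 'more blah' 'END' 'Comments: two' | print_blocks
```

This should print both blocks, markers included, and skip the two Comments lines. Getting from here to the answer is then a matter of not printing those lines, as described above.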

I'm just trying to explain it so that you can do it yourself. Give it a try, and if you can't get it I'll post the answer.

Last edited by nadroj; 10-05-2008 at 03:34 PM.
 
10-05-2008, 03:38 PM   #3
Renan_S2 (Original Poster)
This does it, I think:

Code:
sed '/START/,/END/d'
Input:

Code:
START

1
2
3
4

END

Comment: foo

START

5
6
7
8

END

Comment: bar
Output:

Code:
Comment: foo


Comment: bar
Thanks.
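Since the original question also mentioned awk, here is a rough awk counterpart of the same idea, using a flag to remember whether the current line is inside a block. It is only a sketch: the function name outside_blocks and the variable skip are invented for the example, and it assumes START and END never share a line.

```shell
# Print only the lines outside START..END blocks, reading standard input.
outside_blocks() {
  awk '
    /START/ { skip = 1 }   # a block opens: start skipping
    !skip   { print }      # print only lines outside a block
    /END/   { skip = 0 }   # a block closes: resume after this line
  '
}

printf '%s\n' 'START' '1' '2' 'END' 'Comment: foo' \
              'START' '3' '4' 'END' 'Comment: bar' | outside_blocks
```

On this input it should print just the two Comment lines.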
 
10-05-2008, 04:14 PM   #4
nadroj
Note that if the text between START and END also contains your keywords ("START", "END"), you will get unexpected output: sed ends the range at the first line that matches END, and any later line that matches START opens a new one. However, if these are in fact keywords and cannot be used inside the blocks, then what you have should be fine.

glad to help
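One way to reduce that risk, if the markers always stand alone on their own line (an assumption about the file, not something confirmed in this thread), is to anchor both patterns so that only a line consisting of exactly START or END counts. A sketch, with a made-up function name and made-up input:

```shell
# Stricter variant: only lines that are exactly START or END act as
# markers, so the words may appear as part of other lines.
outside_blocks_strict() {
  sed '/^START$/,/^END$/d'
}

printf '%s\n' 'START' 'THE END of part one' 'still inside' 'END' \
              'Comment: RESTART needed' | outside_blocks_strict
```

Here only the Comment line should survive, whereas the unanchored '/START/,/END/d' would stop deleting at 'THE END of part one' and then swallow the comment because it contains START. This still cannot cope with a line inside a block that is exactly END, and files with Windows line endings would need the trailing carriage return handled first.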
 
  

