LinuxQuestions.org
Old 02-22-2017, 11:59 PM   #1
typecasket
LQ Newbie
 
Registered: Sep 2014
Posts: 2

Rep: Reputation: Disabled
How to extract certain portion of a file that starts with a pattern?


Hi, given the sample INPUT.TXT below, how can I extract every section that starts with "HDRHEADER0001" using sed? I tried the sed command below, but it requires a trailing pattern.
sed -n "/^HEADER0001/,/^,<TRAILER>$/p" INPUT.TXT>OUTPUT.TXT
Sample INPUT.TXT:
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
HDRHEADER0002 X004010850P
BEG00SAD202611701012017021499CANW
DTM01020170214
N1ST 92 0642397236
N315829 RUE BELLERIVE
N4MONTREAL QCH1A5A6 CANADA
HDRHEADER0003 X004010850P
BEG00SAP521006901012017021399CANOUT B16885
DTM01020170213
N1STCEGEP SAINT LAURENT 92 0642385892
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
HDRHEADER0003 X004010850P
BEG00SAP521006901012017021399CANOUT B16885
DTM01020170213
Expected output:
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
HDRHEADER0001 X004010850P
BEG00SAD202659801032017021699CANE
 
Old 02-23-2017, 10:06 AM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,362

Rep: Reputation: 2001
First, there's a typo in your post since '^HEADER0001' doesn't match the actual lines that begin "HDRHEADER0001".

What defines the "section" that you want? If for "HDRHEADER0001" it's always just the next line, that's easy:
Code:
sed -n "/^HDRHEADER0001/,+1p" INPUT.TXT>OUTPUT.TXT
If it's everything up to but not including the next "HDRHEADER" line, that's going to be quite a bit more complicated. Yes, sed can do it (its language is Turing-complete), but it might not be a suitable tool for the job. That task would be simple in awk or perl, but somewhat convoluted in sed.
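For the "everything up to but not including the next HDRHEADER line" case, a minimal awk sketch might look like the following. It assumes every section starts with a line beginning with "HDRHEADER" and that only the "HDRHEADER0001" sections are wanted. The sample data and the variable names (sample, result, keep) are made up for illustration and are not taken from the original file.

```shell
#!/bin/sh
# Shortened, made-up sample in the same layout as INPUT.TXT.
sample='HDRHEADER0001 A
BEG first
DTM first
HDRHEADER0002 B
BEG second
HDRHEADER0001 C
BEG third
HDRHEADER0003 D
BEG fourth'

# On each header line, set "keep" to 1 if it is the wanted header, else 0.
# The bare "keep" pattern then prints every line while the flag is set.
result=$(printf '%s\n' "$sample" |
    awk '/^HDRHEADER/ { keep = ($0 ~ /^HDRHEADER0001/) } keep')

printf '%s\n' "$result"
```

Against the real file, the same awk program would be run as awk '...' INPUT.TXT > OUTPUT.TXT.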
 
Old 02-23-2017, 12:41 PM   #3
r3sistance
Senior Member
 
Registered: Mar 2004
Location: UK
Distribution: CentOS 6/7
Posts: 1,375

Rep: Reputation: 217
If you are just after the following line, you could use grep...

grep -A 1 "^HDRHEADER0001" INPUT.TXT > OUTPUT.TXT

should work.
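One thing to watch with grep -A: when the matches are not adjacent, grep typically prints a "--" separator line between the groups, so the output would not match the expected output exactly. A small sketch of one way around that, using made-up sample data and hypothetical variable names (sample, filtered):

```shell
#!/bin/sh
# Shortened, made-up sample in the same layout as INPUT.TXT.
sample='HDRHEADER0001 A
BEG first
HDRHEADER0002 B
BEG second
HDRHEADER0001 C
BEG third'

# grep -A 1 prints each match plus one following line; the second grep
# drops the "--" group separator lines. Caveat: it would also drop a
# real data line consisting only of "--".
filtered=$(printf '%s\n' "$sample" |
    grep -A 1 '^HDRHEADER0001' |
    grep -v '^--$')

printf '%s\n' "$filtered"
```

GNU grep also has a --no-group-separator option that suppresses the separator directly; other grep implementations may not support it.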
 
Old 02-23-2017, 08:36 PM   #4
typecasket
LQ Newbie
 
Registered: Sep 2014
Posts: 2

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rknichols View Post
First, there's a typo in your post since '^HEADER0001' doesn't match the actual lines that begin "HDRHEADER0001".

What defines the "section" that you want? If for "HDRHEADER0001" it's always just the next line, that's easy:
Code:
sed -n "/^HDRHEADER0001/,+1p" INPUT.TXT>OUTPUT.TXT
If it's everything up to but not including the next "HDRHEADER" line, that's going to be quite a bit more complicated. Yes, sed can do it (its language is Turing-complete), but it might not be a suitable tool for the job. That task would be simple in awk or perl, but somewhat convoluted in sed.
Sorry for the typo. Yes, there can be more than one line following "HDRHEADER0001".
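Since the section can span several lines, a sed-only approach needs to remember whether the current section is a wanted one. A rough sketch that uses the hold space as an on/off flag is shown below. It assumes every section begins with a line starting with "HDRHEADER"; the sample data and the variable names (sample, extracted) are invented for illustration.

```shell
#!/bin/sh
# Shortened, made-up sample in the same layout as INPUT.TXT.
sample='HDRHEADER0001 A
BEG first
DTM first
HDRHEADER0002 B
BEG second
HDRHEADER0001 C
BEG third
HDRHEADER0003 D
BEG fourth'

extracted=$(printf '%s\n' "$sample" | sed -n '
# Any header line: store flag 0 in the hold space.
/^HDRHEADER/{
    x
    s/.*/0/
    x
}
# The wanted header: overwrite the flag with 1.
/^HDRHEADER0001/{
    x
    s/.*/1/
    x
}
# Swap in the flag; if it is not 1, swap back and drop the line.
x
/^1$/!{
    x
    d
}
# Flag is 1: swap the line back in and print it.
x
p
')

printf '%s\n' "$extracted"
```

This uses only the x, s, d and p commands, so it should not depend on GNU extensions, though it is noticeably harder to read than an awk equivalent.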
 
  

