LinuxQuestions.org
Old 10-27-2008, 06:44 PM   #1
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Rep: Reputation: 1
Using sed to extract a pattern plus a number of positions after


I am having trouble extracting text using sed. I need to extract a PATTERN from a file, plus the 12 positions after the detected PATTERN. I've tried sed 's/.*\(LEN:.....HOST............\).*/\1/' FILE with no luck. I need the output to read:
HOST 02 0 02 11
HOST 02 0 02 07

A sample of the file looks like:
Code:
MADN SPECIFIED.  LEN OUTPUT IS FOR PRIMARY.
-------------------------------------------------------------------------------
LEN:     HOST  02 0 02 11
TYPE: MULTIPLE APPEARANCE DIRECTORY NUMBER
SNPA: 315
DIRECTORY NUMBER:     6344015               (NON-UNIQUE)
LINE CLASS CODE:  IBN
IBN TYPE:  MADN
MADN INFO - TYPE:SCA  PRIMARY:Y  RING:ALWAYS
MADN GROUP INFO - DENIAL_TRMT:SILENCE   BRIDGING:N
CUSTGRP:           KAFB  SUBGRP:0  NCOS: 52
SIGNALLING TYPE:  DIGITONE
CARDCODE:  6X17AC    GND: N  PADGRP: ONS  BNV: NL MNO: N
PM NODE NUMBER     :    54
PM TERMINAL NUMBER :    76
OPTIONS:
3WC COD CLF RAG AVT DGT CND NOAMA CNAMD NOAMA NAME PUBLIC BLDG 1460 SPB
5806344015 CPU 0 HOST 02 0 02 11 CFI $ I CFD N 111 A CFB N 111 A MWT STD Y
NO N
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
LEN:     HOST  02 0 02 07
TYPE: SINGLE PARTY LINE
SNPA: 315
DIRECTORY NUMBER:     6347277
LINE CLASS CODE:  IBN
IBN TYPE: STATION
CUSTGRP:           KAFB     SUBGRP: 0  NCOS: 53
SIGNALLING TYPE:  DIGITONE
CARDCODE:  6X17AC    GND: N  PADGRP: ONS  BNV: NL MNO: N
PM NODE NUMBER     :    54
PM TERMINAL NUMBER :    72
OPTIONS:
3WC COD CLF RAG LNR AVT PREMTBL DGT CND NOAMA CNAMD NOAMA NAME PUBLIC BLDG
1460 SPB 5976347277 CFI $ I
 
Old 10-27-2008, 07:09 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
Use a better tool - grep
Code:
grep -oE 'HOST.{12}' FILE
Presumes you don't need to filter further.
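If further filtering does turn out to be needed, one possible approach is to chain two greps: the first keeps only the LEN: lines, the second prints just the matched text. This is only a sketch; the function name len_host and the two input lines (shaped like the sample in post #1) are assumptions for illustration.

```shell
# Keep only LEN: lines first, then print just the matched text.
# -o prints only the match; -E allows the {12} interval without backslashes.
len_host() {
  grep 'LEN:' "$@" | grep -oE 'HOST.{12}'
}

# Only the LEN: line should survive; the HOST in the OPTIONS line is ignored.
printf '%s\n' \
  'LEN:     HOST  02 0 02 11' \
  '5806344015 CPU 0 HOST 02 0 02 11 CFI $ I CFD N 111 A' | len_host
# prints: HOST  02 0 02 11
```

With a file name as argument (len_host FILE), the first grep reads the file instead of standard input.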

Last edited by syg00; 10-27-2008 at 07:13 PM. Reason: Omitted filename
 
Old 10-27-2008, 07:19 PM   #3
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Thanks for the reply. I guess I forgot to mention that there are other instances of HOST throughout my text file that I do not need. So I need to match specifically the HOST that follows LEN:, plus the 12 positions after it.
 
Old 10-27-2008, 07:34 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
In which case you might want to use LEN: to address lines rather than as part of the pattern. Try something like
Code:
sed -nr '/LEN:/s/.*(HOST.{12}).*/\1/p' FILE
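To unpack that one-liner, here is the same command wrapped in a function with each part commented. It is a sketch under assumptions: the function name extract_len_host and the input lines are made up for illustration, and -E is used as the portable spelling of GNU sed's -r.

```shell
extract_len_host() {
  # -n        : print nothing unless told to
  # /LEN:/    : address, so the s command runs only on lines containing LEN:
  # (HOST.{12}): capture HOST plus the 12 characters after it
  # \1 ... p  : replace the whole line with the capture, print if it matched
  sed -nE '/LEN:/s/.*(HOST.{12}).*/\1/p' "$@"
}

printf '%s\n' \
  'LEN:     HOST  02 0 02 11' \
  'TYPE: MULTIPLE APPEARANCE DIRECTORY NUMBER' \
  '5806344015 CPU 0 HOST 02 0 02 11 CFI $ I' | extract_len_host
# prints: HOST  02 0 02 11
```

Because the address and the substitution are separate, lines that contain HOST but not LEN: are never touched.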

Last edited by syg00; 10-27-2008 at 07:36 PM.
 
Old 10-27-2008, 07:47 PM   #5
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
I tried your suggested command:
Code:
sed -nr '/LEN:/s/.*(HOST.{12}).*/\1/p' FILE > NEWFILE
NEWFILE came back empty. I also tried:

Code:
sed -nr '/LEN:...../s/.*(HOST.{12}).*/\1/p' FILE > NEWFILE
since there are 5 spaces before HOST, but this too resulted in an empty NEWFILE.
 
Old 10-27-2008, 07:59 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
Case issues? I'm also presuming you have GNU sed.
 
Old 10-27-2008, 08:11 PM   #7
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Yes I am using GNU sed. I do not see why the command you suggested doesn't work. I don't believe it's a case issue though. LEN and HOST are capitalized throughout the entire text file.
 
Old 10-27-2008, 08:16 PM   #8
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
Works for me on your data - maybe try escaping the parentheses as you did earlier. Might depend on (sed) version.
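A note on the escaping idea, as a sketch rather than a diagnosis of what went wrong here: with -r the parentheses and braces are already special, and escaping them makes them literal, so \1 would then have nothing to refer to. Escaping only fits when -r is dropped, in which case the group and the interval both need backslashes. The function name len_host_bre and the input lines are assumptions for illustration.

```shell
# Without -r/-E, sed uses basic regular expressions (BRE):
# groups are written \( \) and the interval is written \{12\}.
len_host_bre() {
  sed -n '/LEN:/s/.*\(HOST.\{12\}\).*/\1/p' "$@"
}

printf '%s\n' 'LEN:     HOST  02 0 02 07' 'SNPA: 315' | len_host_bre
# prints: HOST  02 0 02 07
```

This form should behave the same as the -r version on the sample data, and it does not rely on GNU extensions.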
 
Old 10-27-2008, 08:23 PM   #9
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Ahh ok. Escaping didn't work but
Code:
sed -nr '/LEN:/s/.*(HOST.{12}).*/\1/p' FILE > NEWFILE
seemed to work. I do not know why it didn't work before. Thanks for all your help. By the way, how would I get this to output each instance of HOST XX X X X on a new line?
 
Old 10-27-2008, 08:43 PM   #10
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
Huh? You shouldn't need to if the data is on separate lines as per your example - sed is a stream editor after all. You can just embed "\n" in the RHS of that expression if needed.
 
Old 10-27-2008, 09:19 PM   #11
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Ok I got it to work correctly now. One more request though, can I get it to search for and output HOST XX XX XX XX along with RST1 XX XX XX XX, RST2 XX XX XX XX, RST3 XX XX XX XX, or RST4 XX XX XX XX?
 
Old 10-27-2008, 09:59 PM   #12
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 11,821

Rep: Reputation: 924
You can - sed regexes support alternation (logical or).
Try the sed.sf.net page - it has a link to handy one-liners you might find instructive.
 
Old 10-27-2008, 10:16 PM   #13
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
Code:
sed -n '/^LEN:.*HOST/s/.*HOST  \(.*$\)/HOST \1/p' infile >outfile
I didn't notice RST1, etc in the original. Do you want
Code:
HOST 02 0 02 11 RST1 02 0 02 11 RST2 02 0 02 11 RST3 02 0 02 11 RST4 02 0 02 11
HOST 02 0 02 07 RST1 02 0 02 07 RST2 02 0 02 07 RST3 02 0 02 07 RST4 02 0 02 07
You can use the \1 register more than once in the replacement.
Code:
sed -n '/^LEN:.*HOST/s/.*HOST  \(.*$\)/HOST \1 RST1 \1 RST2 \1 RST3 \1 RST4 \1/p' junk2
HOST 02 0 02 11 RST1 02 0 02 11 RST2 02 0 02 11 RST3 02 0 02 11 RST4 02 0 02 11
HOST 02 0 02 07 RST1 02 0 02 07 RST2 02 0 02 07 RST3 02 0 02 07 RST4 02 0 02 07

sed -n '/^LEN:.*HOST/s/.*HOST  \(.*$\)/HOST \1 \nRST1 \1 \nRST2 \1 \nRST3 \1 \nRST4 \1\n/p' junk2
HOST 02 0 02 11
RST1 02 0 02 11
RST2 02 0 02 11
RST3 02 0 02 11
RST4 02 0 02 11

HOST 02 0 02 07
RST1 02 0 02 07
RST2 02 0 02 07
RST3 02 0 02 07
RST4 02 0 02 07
 
Old 10-28-2008, 12:49 AM   #14
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Cool

Thanks for all your replies. I finally came up with a script that works flawlessly:
Code:
sed -nr "/LEN:/s/.*(HOST.{12}|RST1.{12}|RST2.{12}|RST3.{12}|RST4.{12}).*/\1/p" FILE > NEWFILE
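The alternation above repeats .{12} in every branch. A possibly tidier equivalent factors the tag out with a bracket expression. This is a sketch: the function name len_units is made up, the RST1 line layout is an assumption modeled on the HOST lines (no RST line appears in the posted sample), and -E stands in for -r.

```shell
len_units() {
  # (HOST|RST[1-4]) matches any of the five tags; the outer group (\1)
  # captures the tag together with the 12 characters after it.
  sed -nE '/LEN:/s/.*((HOST|RST[1-4]).{12}).*/\1/p' "$@"
}

printf '%s\n' \
  'LEN:     HOST  02 0 02 11' \
  'LEN:     RST1  02 0 02 07' \
  'SNPA: 315' | len_units
# prints:
# HOST  02 0 02 11
# RST1  02 0 02 07
```

Single quotes are used so the shell leaves the backslash and braces alone; the double-quoted original also works here because \1 is not special inside double quotes.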
 
Old 10-28-2008, 01:58 AM   #15
roach7711x
LQ Newbie
 
Registered: Oct 2008
Posts: 18

Original Poster
Rep: Reputation: 1
Ok one more thing. Here is a sample of one record:

Code:
LEN:     HOST  00 0 06 27
TYPE: SINGLE PARTY LINE
SNPA: 315
DIRECTORY NUMBER:     6347250
LINE CLASS CODE:  M5112 SET
CUSTGRP:            KAFB  SUBGRP: 0  NCOS: 51  RING: Y
CARDCODE:  6X21AC    GND: N  PADGRP: PONS  BNV: NL MNO: Y
PM NODE NUMBER     :    50
PM TERMINAL NUMBER :    220
OPTIONS:
3WC MCH RAG AVT PREMTBL KSMOH NAME PUBLIC BLDN 1460
CPU 0 HOST 00 0 06 27 1 SPB 5976347250 MWT MWL Y NO N CFI 6347253 I 1 CFB N
111 A 1 CFD N 111 A 1

   KEY       DN
   ---       --
     1       DN          3156347250
     2       MDN         3156340412   SCA         PRIM:N  RING :ALWAYS  NCOS:52
     3       MDN         3156347283   SCA         PRIM:N  RING :ALWAYS  NCOS:16

   KEY     FEATURE
   ---     -------
     1        CPU     0 HOST  00 0 06 27  1
     1        SPB  5976347250
     2        SPB  5976340412
     4        MWT     MWL Y      NO N
     7        MCH
     8        RAG
     9        CFI                        6347253 I  1
     9        CFB N                            111         A  1
     9        CFD N                            111         A  1
    10        3WC
I would like to use the same script
Code:
sed -nr "/LEN:/s/.*(HOST.{12}|RST1.{12}|RST2.{12}|RST3.{12}|RST4.{12}).*/\1/p" FILE > NEWFILE
to also pull out the first appearance of NCOS: XXX and append it to the same line as HOST XX XX XX XX, RST1 XX XX XX XX, RST2 XX XX XX XX, RST3 XX XX XX XX and so on. The problem is that NCOS appears multiple times in each of these records, but I only need the first one.
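One way this could be approached, sketched with awk rather than sed because carrying a value from the LEN: line to a later line is awkward in a sed one-liner. Everything here is an assumption for illustration: the function name first_ncos, the choice to treat each LEN: line as the start of a record, and the choice to print the unit alone when a record has no NCOS.

```shell
first_ncos() {
  awk '
    # A LEN: line starts a new record: remember its unit and re-arm the search.
    /LEN:/ {
      if (unit != "") print unit            # previous record had no NCOS at all
      unit = ""
      # 16 = 4-character tag plus the 12 positions after it
      if (match($0, /HOST|RST[1-4]/)) unit = substr($0, RSTART, 16)
      next
    }
    # First NCOS in the current record: print it beside the unit, then stop looking.
    unit != "" && match($0, /NCOS: *[0-9]+/) {
      print unit, substr($0, RSTART, RLENGTH)
      unit = ""
    }
    END { if (unit != "") print unit }
  ' "$@"
}

printf '%s\n' \
  'LEN:     HOST  00 0 06 27' \
  'CUSTGRP:            KAFB  SUBGRP: 0  NCOS: 51  RING: Y' \
  '     2       MDN         3156340412   SCA         PRIM:N  RING :ALWAYS  NCOS:52' | first_ncos
# prints: HOST  00 0 06 27 NCOS: 51
```

Clearing unit after the first NCOS is what makes later NCOS fields in the same record (such as NCOS:52 in the KEY table) get skipped.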

Last edited by roach7711x; 10-28-2008 at 02:01 AM.
 
All times are GMT -5.