LinuxQuestions.org
12-03-2007, 03:29 AM   #1
naren_0101bits
Help needed in removing intermediate segments from a pipe delimited segment file


Hi,

I am stuck trying to apply some regular expressions to a file.

I have data which has multiple FHS and BTS segments like:

FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE

I want output that has only one FHS at the beginning and one BTS at the end.
All other FHS and BTS lines in the middle should be deleted.

The output should look like:

FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE


I would be glad if you could shed some light on how to solve this problem.

Thanks in advance.

Naren
 
12-03-2007, 03:38 AM   #2
bigearsbilly
egrep '^(FHS|BTS)'
 
12-03-2007, 03:42 AM   #3
matthewg42
Sounds like a job for Perl or awk (I have no doubt you will get awk posts from others, so here's a little Perl program to do it):
Code:
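#!/usr/bin/perl

use strict;
use warnings;

my $bts;              # most recent BTS line seen so far
my $fhs_printed = 0;  # true once the first FHS line has been printed

while (<>) {
    if (/^FHS/) {
        # Print only the first FHS line; skip any later ones.
        print unless $fhs_printed;
        $fhs_printed = 1;
    }
    elsif (/^BTS/) {
        # Hold back BTS lines; only the last one is printed, after the loop.
        $bts = $_;
    }
    else {
        print;
    }
}

if (defined $bts) {
    print $bts;
}
else {
    warn "no BTS line found... output is missing a BTS line\n";
}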
Just save that to a file, chmod 755 the file and run it with the input data file as an argument. Re-direct the output to another file and you have your modified data.

Last edited by matthewg42; 12-03-2007 at 03:45 AM. Reason: added warning if no BTS line found.
 
12-03-2007, 03:56 AM   #4
naren_0101bits
Original Poster
Hi bigearsbilly,
I do not want the FHS and BTS lines alone. I want all the segments, except for the FHS and BTS lines that fall between the first and last lines.

Hi matthewg42,
Thanks for the immediate reply. I am trying to do this in awk.
 
12-03-2007, 04:09 AM   #5
ghostdog74
Tested on your sample data only:
Code:
awk '
/^BTS/ { getline;
         if ($0 ~ /^FHS/) { next }         
}
{print}
' file
output:
Code:
# ./test.sh
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
 
12-03-2007, 06:16 AM   #6
naren_0101bits
Original Poster
Hi ghostdog74,

awk '
/^BTS/ { getline;
if ($0 ~ /^FHS/) { next }
}
{print}
' file

When I run this script, it does not output the final BTS segment line. I want output that has the first FHS and the last BTS, without the intermediate FHS and BTS lines.
 
12-03-2007, 07:27 AM   #7
AnanthaP
Try it in awk; here is the pseudo logic.

{
if fhs then { if it is the first fhs then print; }
elseif bts then store $0 in x;
else print;
}
END {
print the last stored bts from x;
}
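
The pseudo logic above might be sketched in awk roughly as follows. This is only an illustration: the function name `keep_outer` and the variable names are made up here, and it assumes that every FHS after the first should be dropped, as the original post asks.

```shell
# keep_outer: hypothetical helper; reads the named files, or stdin if none given.
keep_outer() {
    awk '
    /^FHS/ { if (!seen_fhs++) print; next }       # print only the first FHS
    /^BTS/ { last_bts = $0; have_bts = 1; next }  # hold back every BTS
    { print }                                     # all other lines pass through
    END { if (have_bts) print last_bts }          # emit the last BTS at the end
    ' "$@"
}
```

It could be run as `keep_outer infile > outfile`. Note that the stored BTS is printed after everything else, so any lines that follow the last BTS in the input would appear before it in the output.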

What are the lines really? Not a class test I hope.
End
 
12-03-2007, 07:42 AM   #8
PAix
So I called my script aaa, but as shown, using the file supplied, my result fully supports Ghostdog's assertion. I have highlighted the first and last lines of the output, but other than that it's as output. Did you cut and paste the script that Ghostdog provided?
Code:
ian@C4SL101D:~/bashandawk> cat aaa
#!/bin/sh
awk '
/^BTS/ { getline;
         if ($0 ~ /^FHS/) { next }
}
{print}
' infile
ian@C4SL101D:~/bashandawk> ./aaa
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
 
12-03-2007, 07:47 AM   #9
radoulov
If the first FHS and the last BTS are not the first and the last line respectively (otherwise it will be easier):

Code:
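awk '
f && /^FHS/ { fhs[FNR] }       # remember every FHS line after the first
/^FHS/      { f = 1 }          # the first FHS has now been seen
/^BTS/      { f1 = FNR }       # line number of the latest BTS
            { x[FNR] = $0 }    # buffer every line
END {
	for (i = 1; i <= FNR; i++)
		# skip every BTS except the last one, and every remembered FHS
		if (!((x[i] ~ /^BTS/) && (i != f1)) && !(i in fhs))
			print x[i]
}' filename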

Last edited by radoulov; 12-03-2007 at 07:51 AM.
 
12-03-2007, 09:37 AM   #10
makyo
Hi.

I never trust the user to supply clean data, so here is what I am using for testing, "data1":
Code:
chaff
detritus
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
garbage
junk
Looking at the two awk scripts, this file fed into "user1" (ghostdog74), produces:
Code:
% ./user1 data1
chaff
detritus
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
garbage
junk
and fed into "user2" (radoulov) produces:
Code:
% ./user2 data1
chaff
detritus
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
garbage
junk
Both of those awk scripts seem to work with "clean" data.
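
One way to check results like these mechanically is to count the FHS and BTS lines in a script's output. The sketch below is only an illustration: `check_envelope` is a made-up name, and it assumes the only requirements are exactly one FHS line, exactly one BTS line, and the FHS coming before the BTS.

```shell
# check_envelope: hypothetical helper; exit status 0 means the input has
# exactly one FHS line and exactly one BTS line, with the FHS first.
check_envelope() {
    awk '
    /^FHS/ { fhs++; if (!first_fhs) first_fhs = NR }  # count FHS, note the first
    /^BTS/ { bts++; last_bts = NR }                   # count BTS, note the last
    END {
        ok = (fhs == 1 && bts == 1 && first_fhs < last_bts)
        exit !ok
    }
    ' "$@"
}
```

For example, `./user1 data1 | check_envelope || echo "unexpected FHS/BTS lines"` would flag the missing BTS in the first run above.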

I have a multi-sed solution that I may post later ... cheers, makyo

Last edited by makyo; 12-03-2007 at 09:54 AM.
 
12-03-2007, 10:20 AM   #11
ghostdog74
@OP, the script I posted was only tested on your sample data. With the sample data posted by makyo, it will miss the last BTS. Here's a fix for it:
Code:
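awk '
/^BTS/ {
         l = $0                     # remember the BTS line
         if ((getline) <= 0) {      # BTS was the last line of the file
             print l
             next
         }
         if ($0 ~ /^FHS/) next      # BTS immediately followed by FHS: drop both
         print l                    # otherwise keep the BTS line
}
{print}
' file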
output:
Code:
# ./test.sh
chaff
detritus
FHS|12121|LOCAL|2323
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
MSH|10101|POTAMAS|2323
PID|121221|THOMAS|DAVID|23432
OBX|2342|H1211|3232
BTS|0000|MERSTO|LIABLE
garbage
junk
 
12-03-2007, 10:45 AM   #12
naren_0101bits
Original Poster
Hi all,

Thanks a lot for so many helpful, thought-provoking, and timely responses.


Naren
 
12-03-2007, 10:47 AM   #13
radoulov
Ha! I didn't notice that each intermediate BTS is immediately followed by an FHS.
So, if the final BTS is never followed by an FHS:

sed version:

Code:
sed '/^BTS/{N;/\nFHS/d}' filename
or (for older seds):

Code:
sed '/^BTS/{N;/\nFHS/d;}' filename
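
When the final BTS is the very last line of the file, `N` has no next line to read. GNU sed prints the pattern space before exiting in that case, but some other sed implementations exit without printing it, which would lose that BTS. A commonly used guard is `$!N`, sketched below; the function name `drop_bts_fhs_pairs` is only illustrative.

```shell
# drop_bts_fhs_pairs: hypothetical helper; reads the named files, or stdin.
drop_bts_fhs_pairs() {
    # $!N appends the next line only when the current line is not the last,
    # so a BTS on the final line is printed normally.
    sed '/^BTS/{
$!N
/\nFHS/d
}' "$@"
}
```

It could be run as `drop_bts_fhs_pairs filename`. Like the one-liners above, it only removes a BTS that is directly followed by an FHS.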
 
  

