LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 04-15-2011, 03:55 PM   #1
yjy4321
Member
 
Registered: Apr 2011
Distribution: Ubuntu
Posts: 42

Rep: Reputation: Disabled
How to delete multiple lines in a file using perl


I have a file that looks like the following:

digraph topology
{
"192.168.3.254" -> "10.1.1.11"[label="1.000", style=solid];
"192.168.3.254" -> "10.1.1.12"[label="1.000", style=solid];
"192.168.3.254" -> "10.1.1.10"[label="1.000", style=solid];
"192.168.3.254" -> "10.1.1.9"[label="1.000", style=solid];
(skip some lines...)
"10.1.1.9" -> "10.1.1.10"[label="1.000"];
"10.1.1.9" -> "10.1.1.11"[label="1.024"];
"10.1.1.9" -> "10.1.1.12"[label="1.076"];
"10.1.1.9" -> "192.168.3.254"[label="1.000"];
"10.1.1.10" -> "10.1.1.9"[label="1.000"];
"10.1.1.10" -> "10.1.1.11"[label="1.020"];
"10.1.1.10" -> "10.1.1.12"[label="1.067"];
"10.1.1.10" -> "192.168.3.254"[label="1.000"];
"10.1.1.11" -> "10.1.1.9"[label="1.028"];
"10.1.1.11" -> "10.1.1.10"[label="1.028"];
"10.1.1.11" -> "10.1.1.12"[label="1.053"];
"10.1.1.11" -> "192.168.3.254"[label="1.000"];
"10.1.1.12" -> "10.1.1.9"[label="1.099"];
"10.1.1.12" -> "10.1.1.10"[label="1.085"];
"10.1.1.12" -> "10.1.1.11"[label="1.057"];
"10.1.1.12" -> "192.168.3.254"[label="1.000"];
"192.168.3.254" -> "10.1.1.9"[label="1.000"];
"192.168.3.254" -> "10.1.1.10"[label="1.000"];
"192.168.3.254" -> "10.1.1.11"[label="1.000"];
"192.168.3.254" -> "10.1.1.12"[label="1.000"];
"192.168.3.254" -> "192.168.3.0/24"[label="HNA"];
"192.168.3.0/24"[shape=diamond];
}

I need to find some particular lines and delete them. For example, I need to delete the following lines:
"10.1.1.9" -> "10.1.1.11"[label="1.024"];
"10.1.1.11" -> "10.1.1.9"[label="1.028"];
"10.1.1.12" -> "10.1.1.11"[label="1.057"];
"10.1.1.11" -> "10.1.1.12"[label="1.053"];
"192.168.3.254" -> "192.168.3.0/24"[label="HNA"];
"192.168.3.0/24"[shape=diamond];

The order of these lines is random, so I cannot simply delete line #19, for example. You can see that the top four lines I want to delete come in pairs, so there might be some clever way to detect them: if a line has both "1.9" and "1.11", then delete the line. I am new to the Perl language.

The following is the code I have now. I think I just need to add some code inside the while loop that checks whether I want to delete the line $dotline before I write it to the NEW file.
Code:
#!/usr/bin/perl -w

$TOPPATH = "/tmp";
$NAME = "topology";
$FILENAME = "$TOPPATH/$NAME.dot";
$CONFFILENAME = "$TOPPATH/$NAME.conf";
$NEWFILENAME = "$TOPPATH/$NAME.new";
$EXT = "png";

`touch $TOPPATH/$NAME.$EXT`;

my $f;

(skip some lines...)

`touch $NEWFILENAME`;

my $newfile;
my $infile;
my $dotfile;
$newfile = $NEWFILENAME;
$infile = $CONFFILENAME;
$dotfile = $FILENAME;
open ( NEW , "> $newfile") or die "Can't open $newfile. $!";
open( IN , "< $infile") or die "Can't open $infile. $!";
open( DOT , "< $dotfile") or die "Can't open $dotfile. $!";
my $newline;
my $line;
my $dotline;

my $i = 0;
while( $dotline = <DOT> ) {
	$i++;
	# I think here should be the extra codes...
	#
	printf NEW "$dotline";
	if ($i == 3) {
		(skip some lines...)
	}
}

close(IN);
close(DOT);
close(NEW);

`cp $NEWFILENAME $FILENAME`;

`neato -Tpng -Gbgcolor=grey -Nfontsize=15 -Ncolor=black -Nfillcolor=green -Ecolor=blue -Earrowsize=2 $FILENAME -o $TOPPATH/$NAME.new`;

`mv $TOPPATH/$NAME.new $TOPPATH/$NAME.$EXT`;
`cp $TOPPATH/$NAME.dot $TOPPATH/$NAME-\$(date +'%Y-%m-%d-%H-%M-%S').dot`;
 
Old 04-15-2011, 06:52 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
I'm not sure I understood your criteria fully ... ?!

Getting rid of (not printing) lines that have both 1.9 and 1.11 in them:

Code:
printf NEW "$dotline" if($dotline !~ /\.1\.9/ && $dotline !~ /\.1\.11/);
Untested, should work.



Cheers,
Tink
 
Old 04-16-2011, 02:37 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,038

Rep: Reputation: 3203
It is not in your list of lines to be removed, but going by the logic you have explained and Tink's example, you would also be removing:
Quote:
"192.168.3.254" -> "10.1.1.11"[label="1.000", style=solid];
i.e. the very first entry in your example ... is this correct?
 
Old 04-18-2011, 04:20 PM   #4
yjy4321
Member
 
Registered: Apr 2011
Distribution: Ubuntu
Posts: 42

Original Poster
Rep: Reputation: Disabled
Thank you for the reply, Tinkster.
But that deletes all lines with ".1.9" OR ".1.11".

grail
It did not remove the very first entry in my example. It removed any lines with ".1.9" OR ".1.11"

Code:
#!/usr/bin/perl -w

(skip some lines...)

while( $dotline = <DOT> ) {
	# I think here should be the extra codes...
	#
	printf NEW "$dotline" if($dotline !~ /\.1\.9/ && $dotline !~ /\.1\.11/);
	(skip some lines...)

}

(skip some lines...)
 
Old 04-18-2011, 05:24 PM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
My bad - my predicate logic went bad once again; replace the && w/ ||.
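
The swap follows from De Morgan's law: "keep the line unless it matches both patterns" is the same as "keep it if it fails the first pattern OR fails the second". A minimal sketch of the two equivalent forms, assuming the patterns from post #2; the helper names keep_line and keep_line_or are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true when the line should be kept, i.e. when it
# does NOT contain both ".1.9" and ".1.11".
sub keep_line {
    my ($line) = @_;
    return !( $line =~ /\.1\.9/ && $line =~ /\.1\.11/ );
}

# The same test rewritten with De Morgan's law: negate each match and
# turn the && into ||.
sub keep_line_or {
    my ($line) = @_;
    return ( $line !~ /\.1\.9/ || $line !~ /\.1\.11/ );
}

my @samples = (
    qq{"10.1.1.9" -> "10.1.1.11"[label="1.024"];\n},        # both: dropped
    qq{"10.1.1.9" -> "192.168.3.254"[label="1.000"];\n},    # only .1.9: kept
    qq{"10.1.1.12" -> "10.1.1.10"[label="1.085"];\n},       # neither: kept
);

for my $line (@samples) {
    print $line if keep_line($line);
}
```

With !~ on both sides and &&, the line is printed only when neither pattern matches, which is why the first attempt dropped every line containing either address.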


Cheers,
Tink
 
Old 04-18-2011, 06:45 PM   #6
yjy4321
Member
 
Registered: Apr 2011
Distribution: Ubuntu
Posts: 42

Original Poster
Rep: Reputation: Disabled
Yes, it works with "||".
Is there a way to make the code simpler or cleaner?
Now I have code that looks like the following:
Code:
#!/usr/bin/perl -w

(skip some lines...)

while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( ($dotline !~ /\.9/ || $dotline !~ /\.10/) && 
				  ($dotline !~ /\.9/ || $dotline !~ /\.11/) && 
				  ($dotline !~ /\.9/ || $dotline !~ /\.12/) && 
				  ($dotline !~ /\.10/ || $dotline !~ /\.11/) && 
				  ($dotline !~ /\.10/ || $dotline !~ /\.12/) && 
				  ($dotline !~ /\.11/ || $dotline !~ /\.12/) );
	(skip some lines...)

}

(skip some lines...)
 
Old 04-18-2011, 08:49 PM   #7
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( $dotline !~ /\.9|\.1[10]/ || $dotline !~ /\.1[012]/ );
}


Should do the same job ...
But beware: your simplification (omitting the leading \.1) may inadvertently remove more than you expected.

Cheers,
Tink

Last edited by Tinkster; 04-18-2011 at 08:54 PM.
 
Old 04-18-2011, 10:58 PM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,038

Rep: Reputation: 3203
I know I am a relative noob here, but did I miss a split or something?
I am struggling to follow why we are testing $dotline twice?
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( $dotline !~ /\.1\.(9|1[0-2])"/ );
}
I included the closing quote (") as I am guessing we don't want to get rid of an address such as 10.1.9.123
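
To see what the closing quote buys, here is a small sketch comparing the pattern with and without it. The second sample line is invented for the comparison; it does not appear in the original file:

```perl
use strict;
use warnings;

# Illustrative patterns: the same alternation with and without the
# closing quote that marks the end of the address.
my $loose    = qr/\.1\.(?:9|1[0-2])/;
my $anchored = qr/\.1\.(?:9|1[0-2])"/;

my @samples = (
    qq{"10.1.1.9" -> "192.168.3.254"[label="1.000"];},      # real node .1.9
    qq{"10.1.9.123" -> "192.168.3.254"[label="1.000"];},    # hypothetical node
);

for my $line (@samples) {
    printf "%-55s loose=%s anchored=%s\n",
        $line,
        ( $line =~ $loose    ? 'match' : 'no match' ),
        ( $line =~ $anchored ? 'match' : 'no match' );
}
```

Without the quote, ".1.9" is found inside "10.1.9.123" as well; with it, only an address that ends in ".1.9" matches.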
 
Old 04-19-2011, 01:23 AM   #9
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.x
Posts: 18,442

Rep: Reputation: 2791
I'm guessing you don't use a lot of Perl? (Assuming I've understood your question correctly)
The
Code:
while() {}
construct is actually reading the next record in from the input file. The read returns undef when there are no more records, which ends the loop and skips to the end-of-file processing.
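
A minimal sketch of that read loop, assuming an in-memory string in place of the real .dot file so that it runs on its own (the lexical handle $dot and the counter are illustrative):

```perl
use strict;
use warnings;

# Hypothetical in-memory "file" so the sketch runs without touching disk.
my $data = qq{digraph topology\n{\n"10.1.1.9" -> "10.1.1.10"[label="1.000"];\n}\n};
open( my $dot, '<', \$data ) or die "Can't open in-memory file. $!";

my $count = 0;

# <$dot> returns the next line (newline included) or undef at end of file.
# Perl implicitly wraps this assignment in defined(), so a line such as "0"
# does not end the loop early.
while ( my $dotline = <$dot> ) {
    $count++;
    print "record $count: $dotline";
}
close($dot);

print "read $count records\n";
```

Each pass through the loop therefore sees exactly one line in $dotline, and any test inside the loop body applies to that line only.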

Last edited by chrism01; 04-19-2011 at 01:27 AM.
 
Old 04-19-2011, 01:30 AM   #10
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,038

Rep: Reputation: 3203
Hi Chris

No I was more asking why the others seem to be testing $dotline twice, ie looking at posts #6 and #7?

But thanks, I did know that

cheers
grail
 
Old 04-19-2011, 11:41 AM   #11
yjy4321
Member
 
Registered: Apr 2011
Distribution: Ubuntu
Posts: 42

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Tinkster View Post
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( $dotline !~ /\.9|\.1[10]/ || $dotline !~ /\.1[012]/ );
}
Thanks, but this code deletes all ".10" or ".11" lines. Following your suggestion, I added the leading "1", but without the "\.".

Here is my updated code, which does exactly what I want it to do. I colored the code so that it is easier to read. The number of lines is not too large, but I want to know whether I could use other variables and while loops to reduce it. I know that when I have working code I am not supposed to touch it until it stops working, but there is a possibility that this list could get much longer.
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( ($dotline !~ /1\.1/ || $dotline !~ /1\.[234]/) &&
				  ($dotline !~ /1\.2/ || $dotline !~ /1\.[34]/) &&
				  ($dotline !~ /1\.3/ || $dotline !~ /1\.4/) &&
				  ($dotline !~ /1\.5/ || $dotline !~ /1\.[678]/) &&
				  ($dotline !~ /1\.6/ || $dotline !~ /1\.[78]/) &&
				  ($dotline !~ /1\.7/ || $dotline !~ /1\.8/) &&
				  ($dotline !~ /1\.9/ || $dotline !~ /1\.1[012]/) &&
				  ($dotline !~ /1\.10/ || $dotline !~ /1\.1[12]/) &&
				  ($dotline !~ /1\.11/ || $dotline !~ /1\.12/) &&
				  ($dotline !~ /1\.13/ || $dotline !~ /1\.1[456]/) &&
				  ($dotline !~ /1\.14/ || $dotline !~ /1\.1[56]/) &&
				  ($dotline !~ /1\.15/ || $dotline !~ /1\.16/) &&
				  ($dotline !~ /0\/24/) );
}

Last edited by yjy4321; 04-19-2011 at 11:42 AM. Reason: grammer
 
Old 04-19-2011, 12:27 PM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,038

Rep: Reputation: 3203
Well I am not 100% sure I am on the right path, but how about:
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if($dotline !~ /^\s*"[^"]+\.1\.([1-9]|1[0-6])"[^"]+"[^"]+\.1\.([1-9]|1[0-6])".*/ && $dotline !~ /0\/24/);
}
 
Old 04-20-2011, 01:56 PM   #13
yjy4321
Member
 
Registered: Apr 2011
Distribution: Ubuntu
Posts: 42

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by grail View Post
Well I am not 100% sure I am on the right path, but how about:
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if($dotline !~ /^\s*"[^"]+\.1\.([1-9]|1[0-6])"[^"]+"[^"]+\.1\.([1-9]|1[0-6])".*/ && $dotline !~ /0\/24/);
}
It does not work. It deletes some lines I do not want to delete. Anyway, I have working code now, so I will not try to change it. Thank you all.
 
Old 04-21-2011, 01:04 AM   #14
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.x
Posts: 18,442

Rep: Reputation: 2791
@grail
Well I'm a bit confused because the OP said
Quote:
if a line has both "1.9" and "1.11",
which in perl is && not || ....

Maybe I'm misunderstanding post #1, but it seems to me he's got a (possibly) large file of records and wants to remove a smaller subset, contained in another file.
Assuming (as per the example) that the records to be removed match lines in the large file exactly, I'd create a hash using the smaller set as hash keys, then read through the large file once and, for each record, check whether it's in the hash of records to be deleted.
If so, skip to the next record; otherwise, write the record to the new file.
This will produce a file of records that excludes those in the to-be-deleted list.
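
A rough sketch of that hash lookup, assuming exact line matches. The helper name remove_listed and the in-memory arrays are stand-ins; a real script would read both lists from files:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the lines of @$records that do not appear
# in @$to_delete. Trailing newlines are ignored when comparing.
sub remove_listed {
    my ( $records, $to_delete ) = @_;

    # Build the lookup hash from the smaller set.
    my %delete;
    for my $rec (@$to_delete) {
        ( my $key = $rec ) =~ s/\r?\n\z//;
        $delete{$key} = 1;
    }

    # One pass over the large set; each lookup is a single hash access.
    my @kept;
    for my $rec (@$records) {
        ( my $key = $rec ) =~ s/\r?\n\z//;
        next if exists $delete{$key};    # listed for deletion: skip it
        push @kept, $rec;
    }
    return @kept;
}

my @dot = (
    qq{"10.1.1.9" -> "10.1.1.10"[label="1.000"];\n},
    qq{"10.1.1.9" -> "10.1.1.11"[label="1.024"];\n},
    qq{"192.168.3.0/24"[shape=diamond];\n},
);
my @unwanted = (
    qq{"10.1.1.9" -> "10.1.1.11"[label="1.024"];\n},
    qq{"192.168.3.0/24"[shape=diamond];\n},
);

print remove_listed( \@dot, \@unwanted );
```

This only helps when the unwanted lines are known verbatim, labels included; if the label values change between runs, a pattern-based test is still needed.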
 
Old 04-21-2011, 10:15 AM   #15
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,038

Rep: Reputation: 3203
Quote:
It deletes some lines I do not want to delete
My bad ... I did not notice that you removed numbers from either side, i.e. the left-hand side is missing 4, 8 and 12.

I would mention that even with what you have, you could use just a single line for each coloured bracket:
Code:
while( $dotline = <DOT> ) {
	printf NEW "$dotline" if( ($dotline !~ /1\.[1-3]/ || $dotline !~ /1\.[2-4]/) &&
				  ($dotline !~ /1\.[5-7]/ || $dotline !~ /1\.[6-8]/) &&
				  ($dotline !~ /1\.(9|1[01])/ || $dotline !~ /1\.1[0-2]/) &&
				  ($dotline !~ /1\.1[3-5]/ || $dotline !~ /1\.1[4-6]/) &&
				  ($dotline !~ /0\/24/) );
}
I also agree with chris though, using || instead of && seems odd when you are using an exclusion.
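
If the list of groups keeps growing, one option is to move the groups into data and compare the two endpoints directly. This is only a sketch under assumptions read off the patterns in post #11: nodes 10.1.1.1-4, 5-8, 9-12 and 13-16 each form a group, a link is dropped when both ends are in the same group, and any line mentioning 0/24 is dropped. The names @groups, %group_of and keep_line are made up for illustration:

```perl
use strict;
use warnings;

# Assumed grouping, inferred from post #11 (not confirmed by the OP).
my @groups = ( [ 1 .. 4 ], [ 5 .. 8 ], [ 9 .. 12 ], [ 13 .. 16 ] );

# Map each last octet to the index of its group.
my %group_of;
for my $i ( 0 .. $#groups ) {
    $group_of{$_} = $i for @{ $groups[$i] };
}

# Hypothetical helper: true when the line should be written to the new file.
sub keep_line {
    my ($line) = @_;

    # Drop anything that mentions the /24 network.
    return 0 if $line =~ m{0/24};

    # Capture the last octet of both endpoints of a link line.
    if ( $line =~ /^\s*"10\.1\.1\.(\d+)"\s*->\s*"10\.1\.1\.(\d+)"/ ) {
        my ( $from, $to ) = ( $group_of{$1}, $group_of{$2} );
        return 0 if defined $from && defined $to && $from == $to;
    }
    return 1;
}

my @samples = (
    qq{"10.1.1.9" -> "10.1.1.11"[label="1.024"];\n},          # same group: dropped
    qq{"10.1.1.9" -> "192.168.3.254"[label="1.000"];\n},      # kept
    qq{"192.168.3.254" -> "192.168.3.0/24"[label="HNA"];\n},  # 0/24: dropped
    qq{"192.168.3.0/24"[shape=diamond];\n},                   # 0/24: dropped
);

print grep { keep_line($_) } @samples;
```

Inside the existing loop this would be used as a guard on the write, and adding a group means adding one entry to @groups rather than another row of regexes. Because the two addresses are captured whole, a label value or a longer address cannot match by accident.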
 
  



All times are GMT -5. The time now is 12:56 PM.
