LinuxQuestions.org
Old 10-12-2007, 11:12 AM   #1
sal_paradise42
Member
 
Registered: Jul 2003
Location: Utah
Distribution: Gentoo FreeBSD 5.4
Posts: 150

Rep: Reputation: 16
Need a perl regexp master


Perl noob here, hoping to write a script that adds values to a SQL database at work.
Here is the relevant part of the code:
Code:
while (<file2>) {
    if (my ($mainbit) =
        m{
            ^
            \s*
            (?:\b(?:\d{1,3}\.){3}\d{1,3}\b)
            /
            (\d{1,2})
            \s is \s subnetted, \s
            \d+
            \s subnets \s*
            .*
            $
        }x) {
        $lastbit = $mainbit;
    } elsif (my ($type, $default, $subtype, $netip, $subnetbits, $ad, $metric, $destip, $uptime, $int) =
        m{
            ^
            ([CSIRMBDOE])                       # type
            (\*)? \s+                           # is default?
            ((?:EX)|(?:IA)|(?:N1)|(?:N2)|(?:E1)|(?:E2)|(?:L1)|(?:L2)|(?:ia))? \s*   # sub type
            (\b(?:\d{1,3}\.){3}\d{1,3}\b)       # network address
            (?:
                /
                (\d{1,2})                       # subnet bits
            )?
            \s
            \[(\d{1,3})                         # administrative distance
            /
            (\d{1,10})\]                        # metric
            \s via \s
            (\b(?:\d{1,3}\.){3}\d{1,3}\b)       # destination address
            (?:
                , \s
                ([^,]+)                         # uptime
                , \s
                ([^\s]+)                        # exit interface
            )?
            .*
            $
        }x) {
        $default = "" unless defined $default;
        $subtype = "" unless defined $subtype;
        $uptime = "" unless defined $uptime;
        $int = "" unless defined $int;
        $subnetbits = $lastbit unless defined $subnetbits;
        $count++;
        push(@networks, "$netip/$subnetbits");
        #print "$type\t$netip\t$destip\n";
    }
}
Basically, I'm trying to extract individual values from the following output:
Code:
 4 7600-UT01#show ip route vrf vrfData ospf
  5      69.0.0.0/8 is variably subnetted, 184 subnets, 7 masks
  6 O E2    69.4.191.252/30 
  7            [110/20] via 10.136.217.2, 00:28:33, Serial1/1.7/26:0.1
  8 O E2    69.4.191.244/30 
  9            [110/20] via 10.136.215.2, 00:28:33, Serial1/1.7/24:0.1
 10 O E2    69.4.191.240/30 
 11            [110/20] via 10.136.214.2, 00:28:33, Serial1/1.7/23:0.1
 12 O E2    69.4.191.220/30 
 13            [110/20] via 10.136.236.2, 00:28:33, Serial1/1.8/13:0.1
 14 O E2    69.4.191.216/30 [110/20] via 10.136.232.2, 00:28:33, Serial1/1.8/9:0.1
 15 O E2    69.4.191.212/30 
 16            [110/20] via 10.136.203.2, 00:28:33, Serial1/1.7/12:0.1
 17 O E2    69.4.191.208/30 [110/20] via 10.136.225.2, 00:28:33, Serial1/1.8/2:0.1
 18 O E2    69.4.191.192/28 [110/20] via 10.137.8.2, 00:28:33, Serial1/1.3/9:0.1
 19 O E2    69.4.191.176/28 [110/20] via 10.136.226.2, 00:28:33, Serial1/1.8/3:0.1
 20 O E2    69.4.191.168/29 
 21            [110/20] via 10.136.205.2, 00:28:33, Serial1/1.7/14:0.1
It works great when a route is on a single line, like line 18 of the output above, but it doesn't match lines 20 and 21 because that route is split across two lines. So I am looking for one of two solutions:
either have the script join a multi-line entry like that into one line before matching against $_, or change my regexp so that it also matches those lines somehow.
Thanks in advance.
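The first option (joining a wrapped entry into one line before matching) could look roughly like the sketch below. It is only an illustration: join_continuations is a made-up helper name, and it assumes that a continuation line is any line that starts with whitespace followed by "[", which is how the wrapped lines in the sample output look.

```perl
use strict;
use warnings;

# Hypothetical helper: fold each continuation line into the route line above it.
# Assumption: a continuation line starts with whitespace followed by "[".
sub join_continuations {
    my @joined;
    for my $raw (@_) {
        (my $line = $raw) =~ s/\s+\z//;        # copy, then trim trailing whitespace/newline
        if (@joined && $line =~ /^\s+\[/) {
            $line =~ s/^\s+//;                 # drop the indentation
            $joined[-1] .= " $line";           # append to the previous logical line
        } else {
            push @joined, $line;
        }
    }
    return @joined;
}

my @sample = (
    "O E2    69.4.191.168/29 \n",
    "           [110/20] via 10.136.205.2, 00:28:33, Serial1/1.7/14:0.1\n",
    "O E2    69.4.191.192/28 [110/20] via 10.137.8.2, 00:28:33, Serial1/1.3/9:0.1\n",
);
print "$_\n" for join_continuations(@sample);
```

Each element returned this way is a single-line entry, so a regexp written for one-line routes can be applied to it unchanged.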
 
Old 10-12-2007, 01:30 PM   #2
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244Reputation: 244Reputation: 244
Code:
while (<>) {
    chomp;      # strip record separator
    if ($. > 1) {
        $Last = $line;
        $line = $_;
    }
    if (!/O E2/) {
        $get = sprintf('%s %s', $Last, $line);        
        @l = split(" ", $get);
        print $l[2] . $l[5]; #change as necessary
    }
}
 
Old 10-13-2007, 01:05 AM   #3
sal_paradise42
Member
 
Registered: Jul 2003
Location: Utah
Distribution: Gentoo FreeBSD 5.4
Posts: 150

Original Poster
Rep: Reputation: 16
Pretty sure this is over my head.
What changes do I need to make, exactly?
When I use that piece of code alone, I no longer match any of the stuff I was matching before, but it does change the desired lines the way I want.
Is there a way to make a change to *only* those lines, right after "while (<file2>)"?
I.e. something like s/^O (?:E2)? etc\n\s+/$1 $2/, and have the rest of my code work with that?

Sorry, this is my first major undertaking in Perl.
 
Old 10-13-2007, 03:42 AM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244Reputation: 244Reputation: 244
Use it on the command line with the file you are parsing,

e.g.

# perl myscript.pl "file2"

To see the outcome at various points in the code, you can insert some print statements:
Code:
while (<>) {
    chomp;      # strip record separator
    if ($. > 1) { #skip first line 69.0.0.0/8 is variably subnetted, 184 subnets, 7 masks
        $Last = $line;
        $line = $_;
    }
    if (!/O E2/) {
        $get = sprintf('%s %s', $Last, $line);       
        print $get; # see what's going on..
        @l = split(" ", $get); #you know what split is don't you?
        print $l[2] . $l[5]; #change as necessary, depending on the fields you want to get.
    }
}
 
Old 10-13-2007, 02:48 PM   #5
sal_paradise42
Member
 
Registered: Jul 2003
Location: Utah
Distribution: Gentoo FreeBSD 5.4
Posts: 150

Original Poster
Rep: Reputation: 16
Thanks for the help, ghostdog. That clears up a few things, but I feel like I've come full circle. I ran into two new issues.
Your recommended script only prints the lines that match !/^O E2/, so I figured I'd put a print "$_\n" after the if statement, but then it prints the lines twice.
So if I have the following line:
Code:
O E2    69.4.191.252/30 
           [110/20] via 10.136.217.2, 01:06:46, Serial1/1.7/26:0.1
it will print out like this:
Code:
O E2    69.4.191.244/30 
O E2 69.4.191.244/30 [110/20] via 10.136.215.2, 00:13:50, Serial1/1.7/24:0.1 #exactly how I need it, but I can't get rid of the entry above. Tried with a regex but no luck
The other problem is that there are lines like the following:
Code:
O E2    69.4.190.64/30 [110/20] via 10.136.10.2, 01:06:46, Serial1/1.1/11:0.1
                       [110/20] via 10.136.248.2, 01:06:46, Serial1/1.8/25:0.1
It will print them out like this:
Code:
O E2 69.4.190.64/30 [110/20] via 10.136.10.2, 00:13:50, Serial1/1.1/11:0.1 [110/20] via 10.136.248.2, 00:13:50, Serial1/1.8/25:0.1
So the array positions are different.
This is probably fairly trivial stuff for an experienced programmer, but I am having a hard time figuring out ways around it.
Regardless, I'm excited about the possibilities at work as I figure some of this stuff out.
Thanks again for your help.
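One way around the shifting array positions, sketched here as an illustration only, is to stop relying on split positions and instead pull out every "[distance/metric] via hop, uptime, interface" group with a global match. The helper name next_hops and the hash keys are made up for this example, and the pattern assumes each group looks like the ones in the sample output.

```perl
use strict;
use warnings;

# Hypothetical helper: return one hash per next-hop group found on a joined line.
sub next_hops {
    my ($line) = @_;
    my @hops;
    # With /g in a while loop, each iteration continues where the previous match ended.
    while ($line =~ m{\[(\d+)/(\d+)\]\s+via\s+((?:\d{1,3}\.){3}\d{1,3}),\s+([^,]+),\s+(\S+)}g) {
        push @hops, { ad => $1, metric => $2, via => $3, uptime => $4, interface => $5 };
    }
    return @hops;
}

my $joined = 'O E2 69.4.190.64/30 [110/20] via 10.136.10.2, 00:13:50, Serial1/1.1/11:0.1 '
           . '[110/20] via 10.136.248.2, 00:13:50, Serial1/1.8/25:0.1';

# Prints one line per next hop, no matter how many the route has.
printf "%s %s %s\n", $_->{via}, $_->{uptime}, $_->{interface} for next_hops($joined);
```

Because each next hop becomes its own record, a route with two paths simply yields two records instead of a longer array.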
 
Old 10-13-2007, 06:14 PM   #6
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Please try to post a full explanation of your problem the first time. Not knowing what you want to do, or what the expected inputs and outputs are, really hinders us in answering your question.

That being said, here's my best guess at what you're trying to do.
Code:
use strict;
use warnings;

my ($last, $line, @lines, @fields);

while(<>){
        chomp;
        s/^\s+//; s/\s+$//;
        $line = $_;
        if (/^O E2/){
                push @lines, $line if ( do{ my @t = split " ", $line;@t} > 3);
                $last = $1 if $line =~ /(^O E2\s+.*?)(\s|$)/;
        } else {
                next unless $last;
                push @lines, join " ", $last, $line;
        }
}

for $line (@lines){
        @fields = split " ", $line;
        print "$line\n";
}
Here's an example of it in action.
Input:
Code:
     69.0.0.0/8 is variably subnetted, 184 subnets, 7 masks
O E2    69.4.191.252/30 
           [110/20] via 10.136.217.2, 00:28:33, Serial1/1.7/26:0.1
O E2    69.4.191.244/30 
           [110/20] via 10.136.215.2, 00:28:33, Serial1/1.7/24:0.1
O E2    69.4.191.240/30 
           [110/20] via 10.136.214.2, 00:28:33, Serial1/1.7/23:0.1
O E2    69.4.191.220/30 
           [110/20] via 10.136.236.2, 00:28:33, Serial1/1.8/13:0.1
O E2    69.4.191.216/30 [110/20] via 10.136.232.2, 00:28:33, Serial1/1.8/9:0.1
O E2    69.4.191.212/30 
           [110/20] via 10.136.203.2, 00:28:33, Serial1/1.7/12:0.1
O E2    69.4.191.208/30 [110/20] via 10.136.225.2, 00:28:33, Serial1/1.8/2:0.1
O E2    69.4.191.192/28 [110/20] via 10.137.8.2, 00:28:33, Serial1/1.3/9:0.1
O E2    69.4.191.176/28 [110/20] via 10.136.226.2, 00:28:33, Serial1/1.8/3:0.1
O E2    69.4.191.168/29 
           [110/20] via 10.136.205.2, 00:28:33, Serial1/1.7/14:0.1
O E2    69.4.190.64/30 [110/20] via 10.136.10.2, 01:06:46, Serial1/1.1/11:0.1
                       [110/20] via 10.136.248.2, 01:06:46, Serial1/1.8/25:0.1
Output
Code:
O E2    69.4.191.252/30 [110/20] via 10.136.217.2, 00:28:33, Serial1/1.7/26:0.1
O E2    69.4.191.244/30 [110/20] via 10.136.215.2, 00:28:33, Serial1/1.7/24:0.1
O E2    69.4.191.240/30 [110/20] via 10.136.214.2, 00:28:33, Serial1/1.7/23:0.1
O E2    69.4.191.220/30 [110/20] via 10.136.236.2, 00:28:33, Serial1/1.8/13:0.1
O E2    69.4.191.216/30 [110/20] via 10.136.232.2, 00:28:33, Serial1/1.8/9:0.1
O E2    69.4.191.212/30 [110/20] via 10.136.203.2, 00:28:33, Serial1/1.7/12:0.1
O E2    69.4.191.208/30 [110/20] via 10.136.225.2, 00:28:33, Serial1/1.8/2:0.1
O E2    69.4.191.192/28 [110/20] via 10.137.8.2, 00:28:33, Serial1/1.3/9:0.1
O E2    69.4.191.176/28 [110/20] via 10.136.226.2, 00:28:33, Serial1/1.8/3:0.1
O E2    69.4.191.168/29 [110/20] via 10.136.205.2, 00:28:33, Serial1/1.7/14:0.1
O E2    69.4.190.64/30 [110/20] via 10.136.10.2, 01:06:46, Serial1/1.1/11:0.1
O E2    69.4.190.64/30 [110/20] via 10.136.248.2, 01:06:46, Serial1/1.8/25:0.1
Notice that the last IP appears twice. I did this because you didn't quite explain how you wanted to deal with IP addresses that have more than one line associated with them. Use the second loop to parse your data; it should be straightforward. I have @fields set up for you and ready to go: $fields[number] will get you the specific field.
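As a rough illustration of what that second loop could do with the fields, here is a small sketch. The helper name route_fields and the hash keys are invented for the example. It counts fields from the end of the line, on the assumption that every joined line ends in "network [distance/metric] via hop, uptime, interface"; that way the positions do not depend on whether a sub type such as E2 is present.

```perl
use strict;
use warnings;

# Hypothetical helper: give names to the fields of one joined route line.
sub route_fields {
    my ($line) = @_;
    my @fields = split " ", $line;
    return unless @fields >= 6;          # too short to be a complete route line
    s/,$// for @fields;                  # strip the comma that trails some fields
    return (
        network   => $fields[-6],
        next_hop  => $fields[-3],
        uptime    => $fields[-2],
        interface => $fields[-1],
    );
}

my %r = route_fields("O E2    69.4.191.252/30 [110/20] via 10.136.217.2, 00:28:33, Serial1/1.7/26:0.1");
printf "%s via %s on %s\n", @r{qw(network next_hop interface)};
```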

If you don't understand the script and need me to explain some parts of it, just ask. I'm just too tired to put comments on it right now.

PS: Perl makes my head hurt

Last edited by angrybanana; 10-13-2007 at 06:33 PM.
 
Old 10-14-2007, 02:05 AM   #7
sal_paradise42
Member
 
Registered: Jul 2003
Location: Utah
Distribution: Gentoo FreeBSD 5.4
Posts: 150

Original Poster
Rep: Reputation: 16
Hey angry,
I didn't think I was going to end up with a completely new piece of code, but I guess I was totally in over my head, which is probably why my explanation was lacking.
That was a great guess at what I actually need.
Here is the input and here is the output
Your assumption was correct: if the IP has two lines associated with it, I want two entries for it. I also changed your line from
Code:
#$last = $1 if $line =~ /(^O E2\s+.*?)(\s|$)/;
to
Code:
$last = $1 if $line =~ /(^O (?:E2){0,1}\s+.*?)(\s|$)/;
since there are instances where the E2 is not present after ^O.

Anyway, as you can see, the problem I am running into is that I need to strip out the lines that match
Code:
^.*variably subnetted.*$
and
Code:
 7600-UT01#exit
Connection closed by foreign host.
I tried to do this with a regex after chomp. An s/blah// removes the text, but then the IP addresses right before that line get printed twice. Anyway, I am really close; I just need to strip out the lines that match those patterns.
Thanks for taking the time, and I apologize for my lack of understanding.
P.S. Can you explain what these lines are doing?
Code:
push @lines, $line if ( do{ my @t = split " ", $line;@t} > 3);
                $last = $1 if $line =~ /(^O E2\s+.*?)(\s|$)/;
 
Old 10-14-2007, 04:30 AM   #8
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Here are the changes; this should work now:
Code:
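use strict;
use warnings;

my ($last, $line, $limit, @lines, @fields);

while(<>){
        next if /variably subnetted/;   # skip the "variably subnetted" summary line
        last if /#exit/;                # stop reading once we see "#exit"
        chomp;
        s/^\s+//; s/\s+$//;             # trim leading and trailing whitespace
        $line = $_;
        if (/^O (E2)?/){
                # a full line has more than 3 fields with E2, more than 2 without
                if ($1){ $limit = 3 } else { $limit = 2 }
                push @lines, $line if ( do{ my @t = split " ", $line;@t} > $limit);
                # remember the start of the line so continuation lines can be joined to it
                $last = $1 if $line =~ /(^O (?:E2)?\s+.*?)(\s|$)/;
        } else {
                next unless $last;
                # continuation line: prefix it with the most recent route
                push @lines, join " ", $last, $line;
        }
}

for $line (@lines){
        @fields = split " ", $line;     # individual fields are available as $fields[number]
        print "$line\n";
}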
What I added were two new lines:
Code:
next if /variably subnetted/; #skip the line completely if it contains match
last if /#exit/; #exit the loop once we see "#exit"
As for what this is doing:
Code:
push @lines, $line if ( do{ my @t = split " ", $line;@t} > 3);
It splits the line on whitespace; if there are more than 3 fields, we assume that it's a full line and add it to @lines. Now that I know that the E2 is optional, I added an if statement that checks whether E2 was there or not, and sets $limit to either 3 or 2 accordingly.
The do block is there because I couldn't figure out a cleaner way to do this (I'm new to Perl myself). Basically it stores the split values in a temporary array @t, then returns that array, which the > comparison evaluates as its element count. If I don't do that it gives me a warning, which is annoying.
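For what it's worth, one alternative to the inline do block is to move the counting into a small sub. This is only a sketch of that idea; field_count is a made-up name, not something from the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: count the whitespace-separated fields on a line.
sub field_count {
    my ($line) = @_;
    my @parts = split " ", $line;
    return scalar @parts;               # an array in scalar context is its element count
}

for my $line ("O E2    69.4.191.168/29",
              "O E2    69.4.191.192/28 [110/20] via 10.137.8.2, 00:28:33, Serial1/1.3/9:0.1") {
    my $kind = field_count($line) > 3 ? "full" : "partial";
    print "$kind: $line\n";
}
```

With a helper like this, the push line would read along the lines of: push @lines, $line if field_count($line) > $limit;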

Code:
$last = $1 if $line =~ /(^O E2\s+.*?)(\s|$)/;
Here we store the first part of the line in $last. We do this so that we can join $last with $line later on if the IP has multiple lines associated with it. $1 is group one in the regex match (the first set of parentheses).
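To make the capture concrete, here is a small sketch that runs the E2-optional form of that pattern against a few trimmed lines. route_prefix is a made-up wrapper used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper: return what the script would store in $last, or undef.
sub route_prefix {
    my ($line) = @_;
    return $line =~ /(^O (?:E2)?\s+.*?)(\s|$)/ ? $1 : undef;
}

for my $line ("O E2    69.4.191.252/30",
              "O E2    69.4.191.216/30 [110/20] via 10.136.232.2, 00:28:33, Serial1/1.8/9:0.1",
              "[110/20] via 10.136.217.2, 00:28:33, Serial1/1.7/26:0.1") {
    my $prefix = route_prefix($line);
    print defined $prefix ? "captured: $prefix\n" : "no match: $line\n";
}
```

The lazy .*? stops at the first whitespace after the network, so both the partial and the full route line give back only the "O E2    network/bits" part, while the continuation line does not match at all.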

Hope that helps, best of luck.

Last edited by angrybanana; 10-14-2007 at 04:35 AM.
 
Old 10-14-2007, 06:17 PM   #9
sal_paradise42
Member
 
Registered: Jul 2003
Location: Utah
Distribution: Gentoo FreeBSD 5.4
Posts: 150

Original Poster
Rep: Reputation: 16
Thanks, angry and ghostdog. Working like a charm.
 
  

