LinuxQuestions.org

Leo-G 05-16-2013 11:24 AM

Perl Script to extract email
 
I'm having a little trouble with the Perl script below.

Code:

#!/usr/bin/perl
#Author Leo
use Email::Address;

#use strict;
my $file = "/var/log/maillog";
my $string="75E37A371C";
open(MAIL, $file);
my @buffer =<MAIL>;
close(MAIL);
my $lines=grep(/$string/, @buffer);

#print "@lines";

my @addresses = Email::Address->parse($lines);
print $addresses[0]->address;

I'm not sure how to print the output. I tried making $lines an array as well.

smallpond 05-16-2013 12:16 PM

You want lines to be an array, not a scalar:

Code:

my @lines=grep(/$string/, @buffer);

my @addresses = Email::Address->parse(@lines);
foreach (@addresses) {
  print
}


Leo-G 05-17-2013 02:35 AM

I tried that as well, but still no luck.

Code:

#!/usr/bin/perl
#Author Leo
use Email::Address;

#use strict;
my $file = "/var/log/maillog";
my $string="75E37A371C";
open(MAIL, $file);
my @buffer =<MAIL>;
close(MAIL);
my @lines=grep(/$string/, @buffer);

print "@lines";

my @addresses = Email::Address->parse(@lines);
#print "@addresses";
foreach (@addresses) {
print "$addresses[0]->address";
}

I'm not sure if the print is correct, though.

chrism01 05-17-2013 06:14 AM

The grep should work, but Email::Address->parse works on a single string, not an array.
Quote:

This class implements a regex-based RFC 2822 parser that locates email addresses in strings
http://search.cpan.org/~rjbs/Email-A...ail/Address.pm

Note that the example on that page
Code:

Class Methods

parse

      my @addrs = Email::Address->parse(
        q[me@local, Casey <me@local>, "Casey" <me@local> (West)]
      );

This looks a bit like an array (it uses [ ] characters), but q[...] is actually an example of quoting:
Quote:

with q, qq, and qw, delimiters other than parentheses can be used. [] and {} will work, as will just about any punctuation mark,
http://www.perlmonks.org/?node_id=401006
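
A quick way to convince yourself that q[...] produces a plain string and not an array is a sketch like the one below. The variable names are made up for illustration; only core Perl is used.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# q[...] is single-quoting with [ ] as delimiters; no array is created.
my $quoted = q[me@local, Casey <me@local>];
my $plain  = 'me@local, Casey <me@local>';

# Both spellings give exactly the same string.
print "identical strings\n" if $quoted eq $plain;

# ref() on a reference to the variable reports SCALAR, not ARRAY.
print ref(\$quoted), "\n";
```

Because this is single-quoting, the @local inside the brackets is not interpolated as an array either.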

Incidentally, I always start Perl with 'warnings' and 'strict' turned on; highly recommend you do the same.
You'll thank me later ;)
Code:

#!/usr/bin/perl -w

use strict;
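
With strict and warnings on, it is also worth checking that open succeeded, which the scripts earlier in the thread do not do. The following is only a sketch: matching_lines is a hypothetical helper name, and the log path is taken from the command line rather than hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines of $path that contain $needle.
sub matching_lines {
    my ($path, $needle) = @_;

    # Three-argument open with a lexical filehandle; die with the OS
    # error ($!) if the file cannot be read instead of carrying on.
    open(my $fh, '<', $path) or die "Cannot open $path: $!";

    # \Q...\E makes the needle match literally, not as a regex.
    my @hits = grep { /\Q$needle\E/ } <$fh>;
    close($fh);

    return @hits;
}

# Only runs when a log file path is given on the command line.
if (@ARGV) {
    print matching_lines($ARGV[0], '75E37A371C');
}
```

Usage would be something like `./email.pl /var/log/maillog`, assuming the script is saved as email.pl.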


Leo-G 05-17-2013 07:49 AM

Thank you chrism01, I will use strict and warnings from now on.

Can you tell me how I can extract only the email IDs?

I want to feed them into a database, and I am actually contemplating using shell instead.

chrism01 05-17-2013 08:23 AM

Basically, you need to loop through the @lines array one element at a time and parse each element (or indeed loop through @buffer).
In your code you've already specified the ID to match on, so I don't understand your question.

Do you want addresses or Ids?

Leo-G 05-17-2013 08:40 AM

Hi,

It prints out a line like the one below:

Code:

May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)

I want to extract only the email ID from this line. I am gathering data on 550 errors, and I want to check which email IDs are used frequently so I can block them.

chrism01 05-20-2013 02:17 AM

Code:

$var1='May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)';
$var2 = (split(/:/, $var1))[3];
print "var2 $var2\n";

#output
var2  75E37A371C

:)

HTH

Leo-G 05-20-2013 02:28 AM

Quote:

Originally Posted by chrism01 (Post 4954688)
Code:

$var1='May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)';
$var2 = (split(/:/, $var1))[3];
print "var2 $var2\n";

#output
var2  75E37A371C

:)

HTH

That gives the message ID; I want the email ID :(

chrism01 05-20-2013 02:37 AM

You mean 32248 ?
Code:

$var1='May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)';
$var1 =~ /\[([[:digit:]]+)\]/;
print "$1\n";

# out
32248


Leo-G 05-20-2013 03:04 AM

No, the output should be:

leo@leog.in

chrism01 05-20-2013 04:53 AM

Ah, the email address. The other two values are the sendmail process ID (32248) and the message ID (75E37A371C).
Code:

$var1='May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)';
$var2 = (split(/\s+/,$var1))[6];
$var2 =~ /=([a-z]+@.*),/;
print "$1\n";

#out
leo@leog.in
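
A slightly more tolerant variant, as a sketch only: it matches the to= field directly instead of relying on the field position, and it accepts digits, dots and upper case in the address. It assumes some log lines may wrap the address in angle brackets (to=<...>); extract_recipient is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the recipient out of a "to=..." field.
# Returns undef when the line has no such field.
sub extract_recipient {
    my ($line) = @_;

    # Optional < > around the address; stop at whitespace, comma, < or >.
    if ($line =~ /\bto=<?([^\s,<>]+\@[^\s,<>]+)>?/) {
        return $1;
    }
    return;
}

my $sample = 'May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)';
my $addr   = extract_recipient($sample);
print defined $addr ? "$addr\n" : "no recipient found\n";
```

For the sample line from this thread it prints leo@leog.in. The pattern is deliberately loose and is not a full RFC 2822 address check.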


Leo-G 05-20-2013 09:26 AM

Thank you Chris,

I seem to be getting the error below, though. Any idea why?

Use of uninitialized value $1 in concatenation (.) or string at ./email.pl line 24.


Code:

#!/usr/bin/perl
#Author Leo
use strict;
use warnings;


my $file = "maillog";
my $string="550";
open(MAIL, $file);
my @buffer =<MAIL>;
close(MAIL);
my @lines=grep(/$string/, @buffer);

#print "@lines";

foreach my $line (@lines){
#print $line


my $email = (split(/\s+/,$line))[6];
$email =~ /=([a-z]+@.*),/;

print "$1\n";

}


chrism01 05-21-2013 12:45 AM

You need to check exactly what you're getting in each element of each of those arrays.
Try printing them to a file and have a good look; you need to know exactly what data you're dealing with before you can craft algorithms/regexes to deal with them.

If you get that error, it means the regex did not match, i.e. there wasn't a matching email address in that string, so $1 was never set.
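
One way to avoid the warning is to use $1 only inside the branch where the match succeeded. The sketch below does that and also counts how often each address appears, since that was the stated goal. count_recipients is a hypothetical helper, and the second, third and fourth sample lines are made up for illustration (only the first comes from this thread).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often each recipient appears in @lines.
sub count_recipients {
    my (@lines) = @_;
    my %count;

    for my $line (@lines) {
        # $1 is only touched when the match actually succeeded.
        if ($line =~ /\bto=<?([^\s,<>]+\@[^\s,<>]+)>?/) {
            $count{$1}++;
        }
        else {
            warn "no recipient in: $line\n";
        }
    }
    return %count;
}

# First line is from the thread; the others are invented test data.
my @sample = (
    'May 16 21:00:53 mspwss sendmail[32248]: 75E37A371C: to=leo@leog.in, ctladdr= (664/664)',
    'May 16 21:05:10 mspwss sendmail[32250]: 75E37A371D: to=<someone@example.com>, ctladdr= (664/664)',
    'May 16 21:07:42 mspwss sendmail[32251]: 75E37A371E: to=leo@leog.in, ctladdr= (664/664)',
    'May 16 21:09:00 mspwss sendmail[32252]: 75E37A371F: some line without a recipient',
);

my %count = count_recipients(@sample);

# Most frequent address first; ties are broken alphabetically.
for my $addr (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    print "$count{$addr} $addr\n";
}
```

With the sample data this prints "2 leo@leog.in" followed by "1 someone@example.com", and the line without a to= field is reported on STDERR instead of triggering the uninitialized-value warning.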

Leo-G 05-21-2013 01:49 AM

Thank you Chris, I will check further


All times are GMT -5.