LinuxQuestions.org
04-10-2003, 01:12 PM   #1
mikeyt_333
Member
 
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353

Rep: Reputation: 30
Not grepping emails right


Hi,
I am trying to parse a file for email addresses. So here's the line I'm using with grep:

grep -E "^([a-zA-Z0-9_-])+([\.a-zA-Z0-9_-])*@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+" Export

When I run it I get results; however, the first letter of every email address found is truncated, and it also outputs info like this:

... Deferred: Connection timed out with blah.blah.blah.

So let's say the email address in the file is:

user1@linuxquestions.org

But when grepped the output is:

ser1@linuxquestions.org

and a space appears in front of it in the results as well. It's as if the results are displayed with a space replacing the first character of every match.

Regexes are still like magic to me, and I have the hardest time figuring them out. I thought I was getting better, but I guess not. Please, if anybody can help, that would be great! Thanks in advance!

Mike.
 
04-10-2003, 03:05 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,974
Blog Entries: 11

Rep: Reputation: 879
That strikes me as odd; your expression works here... What's your grep's version number?

Cheers,
Tink
 
04-10-2003, 04:10 PM   #3
mikeyt_333
Member
 
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353

Original Poster
Rep: Reputation: 30
I thought it was weird too; I have used this expression for many email comparison things. I'm using grep 2.4.2 on RH 7.2.

Thanks for the reply.
Mike.
 
04-10-2003, 05:37 PM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,974
Blog Entries: 11

Rep: Reputation: 879
grep (GNU grep) 2.5

Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.

Slack 8.1 ... odd ...

Sorry, I don't know what to say :/

Cheers,
Tink
 
04-10-2003, 05:52 PM   #5
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 64
Try using a perl regexp.
 
04-10-2003, 07:31 PM   #6
mikeyt_333
Member
 
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353

Original Poster
Rep: Reputation: 30
Do you mean write a script in perl to do the job, or use the format of a perl regexp?

Thanks for the help!
 
04-11-2003, 02:20 PM   #7
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 64
I tend to use perl, full stop, for most things. I doubt that using a perl-like regexp in grep would make much difference. Like Tinkster, I find that your grep command works for me. I can only assume there is something odd with either the file you are reading from (malformed characters or bad line endings) or another part of the script you are running this command from.

How are you calling the grep command? From the command line or from within a script? It may help if you post the script (if there is one) and the content of the file you are trying to grep.
 
04-11-2003, 03:20 PM   #8
mikeyt_333
Member
 
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353

Original Poster
Rep: Reputation: 30
I was running grep from the command line. I can't really post the contents of the file because it is huge (34 MB); it is just an inbox file from Netscape. However, there are characters in the file, after some of the email addresses, that look like this: ^M. I believe that is the Windows line-break character, but I can't see how that would influence it. The lines aren't that complicated, and I have tried this on another file and had it work. I wonder if I uploaded the file in binary mode... hmmm...

Thanks for the help.
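
One way to confirm whether a file really contains the ^M characters described above (carriage returns) is to count the affected lines and dump a line byte by byte. This is only a rough sketch: the sample file and its contents are made up for illustration and stand in for the real Export file.

```shell
# Hypothetical sample file; in the thread the real file is called "Export".
sample=$(mktemp)
printf 'user1@linuxquestions.org ... Deferred: Connection timed out\r\n' > "$sample"
printf 'user2@linuxquestions.org ... Sent\n' >> "$sample"

# Hold a literal carriage return (the character displayed as ^M) in a variable.
cr=$(printf '\r')

# Count the lines that contain at least one carriage return.
cr_lines=$(grep -c "$cr" "$sample")
echo "lines containing a carriage return: $cr_lines"

# Dump the first line byte by byte; od -c renders a carriage return as \r.
head -n 1 "$sample" | od -c

rm -f "$sample"
```

If the count is non-zero, the file has at least some DOS-style line endings or stray carriage returns.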
 
04-12-2003, 02:16 PM   #9
mikeyt_333
Member
 
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353

Original Poster
Rep: Reputation: 30
Figured it out,
I searched for those ^M characters and replaced them with \n, and that fixed it. Now, does anybody know how to have grep return only the exact matches and not the full lines where they match? For example, the line is:

blah@linuxquestions.org ... Deferred ...

Grep returns:

blah@linuxquestions.org ... Deferred ...

But I only want:

blah@linuxquestions.org

TIA!
Mike.
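
For anyone hitting the same problem, the clean-up step could look something like the sketch below. It deletes the carriage returns with tr rather than replacing them with \n as described above; replacing them would presumably also work, but would leave empty lines wherever a carriage return was already followed by a line feed. The file names and contents are placeholders, not the real mailbox.

```shell
# Hypothetical sample file standing in for the real "Export" mailbox.
sample=$(mktemp)
clean=$(mktemp)
printf 'user1@linuxquestions.org ... Deferred: Connection timed out\r\n' > "$sample"
printf 'user2@linuxquestions.org ... Sent\r\n' >> "$sample"

# Delete every carriage return; the line feeds that were already there remain.
tr -d '\r' < "$sample" > "$clean"

# Check the result: grep -c prints 0 (and exits non-zero) when nothing matches.
cr=$(printf '\r')
remaining=$(grep -c "$cr" "$clean" || true)
clean_lines=$(grep -c '' "$clean")
echo "lines still containing a carriage return: $remaining"
echo "lines in the cleaned file: $clean_lines"

rm -f "$sample" "$clean"
```

The original file is left untouched and the cleaned copy is written to a second file, so the conversion can be checked before anything is overwritten.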
 
04-12-2003, 02:27 PM   #10
david_ross
Moderator
 
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047

Rep: Reputation: 64
I think you will need to use awk.

I don't know of a way to do it with grep.
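
A possible sketch of both routes, assuming a GNU grep recent enough to have the -o (--only-matching) option; the grep 2.4.2 mentioned earlier in the thread may not have it. The pattern is a slightly adjusted variant of the one in the first post, not the original: the + is moved inside the last group, because with the original grouping the match would stop one character after the last dot. The sample file is made up for illustration.

```shell
# Hypothetical sample data.
sample=$(mktemp)
printf 'blah@linuxquestions.org ... Deferred ...\n' > "$sample"
printf 'no address on this line\n' >> "$sample"

# Adjusted variant of the thread's pattern (an assumption, see the note above).
pattern='^[a-zA-Z0-9_-]+[.a-zA-Z0-9_-]*@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+'

# Route 1: grep -o prints only the part of each line that matched.
only=$(grep -oE "$pattern" "$sample")
echo "$only"

# Route 2: awk's match() sets RSTART and RLENGTH, so substr() can cut out the match.
awk_only=$(awk 'match($0, /^[a-zA-Z0-9_-]+[.a-zA-Z0-9_-]*@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+/) {
    print substr($0, RSTART, RLENGTH)
}' "$sample")
echo "$awk_only"

rm -f "$sample"
```

Both routes print just the address and skip the line that has none.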
 
04-12-2003, 07:48 PM   #11
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,140
Blog Entries: 54

Rep: Reputation: 2791
AFAIK, Nutscrape uses the standard mbox format, which means you should be able to do "cat <nutscrape_mbox> | formail -czx From: -s" to get the "From" address, for instance.
If you need the third field from a header and know the fields are space-separated, you could do:
cat <nutscrape_mbox> | formail -czx From: -s | while read -r i; do i=( ${i} ); printf "%s\n" "${i[2]}"; done
Arrays count from zero up, so ${i[2]} is the third field.
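
The field-picking part can be tried without formail or a mailbox. The sketch below is a variation, not the command above: it lets read split the line into named variables instead of building an array, which should also work in shells without arrays. The input string is an invented placeholder, not real formail output.

```shell
# Invented, space-separated input standing in for one extracted header value.
header='one two three four'

# read assigns the first three fields to f1..f3 and the remainder to rest.
third=$(printf '%s\n' "$header" | while read -r f1 f2 f3 rest; do
    printf '%s\n' "$f3"
done)
echo "$third"
```

Here f3 plays the role of the array element at index 2, since the array version counts from zero.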
 
04-13-2003, 02:48 PM   #12
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,974
Blog Entries: 11

Rep: Reputation: 879
Code:
grep -E "^([a-zA-Z0-9_-])+([\.a-zA-Z0-9_-])*@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+" Export | cut -d" " -f1

Cheers,
Tink
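
To see what the cut stage does on its own, here is a small sketch with made-up input lines. It relies on the address being the first field and being followed by a space; the second case shows what happens when a tab follows instead.

```shell
# A line shaped like the ones in the thread: address first, then a space.
line='blah@linuxquestions.org ... Deferred ...'
first=$(printf '%s\n' "$line" | cut -d' ' -f1)
echo "$first"

# With a tab instead of a space there is no delimiter for cut to split on,
# so the whole line comes back unchanged.
tabbed=$(printf 'blah@linuxquestions.org\tDeferred\n' | cut -d' ' -f1)
echo "$tabbed"
```

Because the pattern in the grep stage is anchored with ^, the address is always the first field of any matching line, which is why the cut approach works for the log lines shown in this thread.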
 
  


All times are GMT -5.