Linux - General: This Linux forum is for general Linux questions and discussion.
If it is Linux-related and doesn't seem to fit in any other forum, then this is the place.
04-10-2003, 01:12 PM
|
#1
|
|
Member
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353
Rep:
|
Not grepping emails right
Hi,
I am trying to parse a file for email addresses. So here's the line I'm using with grep:
grep -E "^([a-zA-Z0-9_-])+([\.a-zA-Z0-9_-])*@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+" Export
When I run it I get results; however, the first letter of every email address found is truncated, and the output also includes info like this:
... Deferred: Connection timed out with blah.blah.blah.
So let's say the email address in the file is:
user1@linuxquestions.org
But when grepped the output is:
ser1@linuxquestions.org
There is also a space in front of it in the results. It's as if the results are displayed with a space replacing the first character of every match.
Regexes are still like magic to me, and I have the hardest time figuring them out. I thought I was getting better, but I guess not. If anybody can help, that would be great! Thanks in advance!
Mike.
|
|
|
|
04-10-2003, 03:05 PM
|
#2
|
|
Moderator
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,902
|
That strikes me as odd; your expression works here. What's your grep's version number?
Cheers,
Tink
|
|
|
|
04-10-2003, 04:10 PM
|
#3
|
|
Member
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353
Original Poster
Rep:
|
I thought it was weird too; I have used this expression for many email comparison tasks. I'm using grep 2.4.2 on RH 7.2.
Thanks for the reply.
Mike.
|
|
|
|
04-10-2003, 05:37 PM
|
#4
|
|
Moderator
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,902
|
grep (GNU grep) 2.5
Copyright 1988, 1992-1999, 2000, 2001 Free Software Foundation, Inc.
Slack 8.1 ... odd ...
Sorry, I don't know what to say :/
Cheers,
Tink
|
|
|
|
04-10-2003, 05:52 PM
|
#5
|
|
Moderator
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047
Rep:
|
Try using a perl regexp.
|
|
|
|
04-10-2003, 07:31 PM
|
#6
|
|
Member
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353
Original Poster
Rep:
|
Do you mean write a script in Perl to do the job, or use the format of a Perl regexp?
Thanks for the help!
|
|
|
|
04-11-2003, 02:20 PM
|
#7
|
|
Moderator
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047
Rep:
|
I tend to use Perl, full stop, for most things, but I doubt that using a Perl-like regexp in grep would make much difference. Like Tinkster, I find your grep command works for me. I can only assume there is something odd with either the file you are reading from (malformed characters or bad line endings) or another part of the script you are running this command from.
How are you calling the grep command? From the command line or from within a script? It may help if you post the script (if there is one) and the content of the file you are trying to grep.
|
|
|
|
04-11-2003, 03:20 PM
|
#8
|
|
Member
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353
Original Poster
Rep:
|
I was running grep from the command line. I can't really post the contents of the file because it is huge (34 MB); it is just an inbox file from Netscape. However, there are characters in the file, after some of the email addresses, that look like this: ^M. I believe that is the Windows line-break character, but I can't see how that would influence it. The lines aren't that complicated, but I have tried this on another file and had it work. I wonder if I uploaded the file in binary mode... hmm...
Thanks for the help.
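For anyone wanting to check a file for those ^M characters, here is a minimal sketch. The sample file and addresses are made up for illustration. ^M is how many tools display a carriage return (the first half of a Windows CRLF line ending). A terminal moves the cursor back to the start of the line when it prints one, so whatever follows can overwrite the beginning of the line, which would be consistent with the missing first letter described above.

```shell
# Build a small sample with carriage returns in it (hypothetical data).
sample=$(mktemp)
printf 'user1@linuxquestions.org\r\nuser2@linuxquestions.org ... Deferred\r\n' > "$sample"

# Hold a literal carriage return in a variable (portable across sh variants).
cr=$(printf '\r')

# Count the lines that contain at least one carriage return.
cr_lines=$(grep -c "$cr" "$sample")
echo "lines with CR: $cr_lines"

# Make the carriage returns visible: cat -v shows each one as ^M.
visible=$(cat -v "$sample")
printf '%s\n' "$visible"

rm -f "$sample"
```

On a real mailbox, running the grep -c line against the file itself is usually enough to tell whether carriage returns are present.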
|
|
|
|
04-12-2003, 02:16 PM
|
#9
|
|
Member
Registered: Jun 2001
Location: Up in the clouds
Distribution: Fedora et al.
Posts: 353
Original Poster
Rep:
|
Figured it out,
I searched and replaced those ^M characters with \n and that fixed it. Now, does anybody know how to have grep return only the exact matches and not the full lines where they match? For example, the line is:
blah@linuxquestions.org ... Deferred ...
Grep returns:
blah@linuxquestions.org ... Deferred ...
But I only want:
blah@linuxquestions.org
TIA!
Mike.
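The search-and-replace described above can be sketched with tr. This assumes a made-up sample line in which a carriage return sits between the address and the rest of the text; the pattern is the one from the first post.

```shell
# Hypothetical sample: a carriage return separates the address from the rest.
sample=$(mktemp)
printf 'user1@linuxquestions.org\r ... Deferred: Connection timed out\n' > "$sample"

# Pattern from the first post, single-quoted so the shell leaves it alone.
pattern='^([a-zA-Z0-9_-])+([\.a-zA-Z0-9_-])*@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+'

# Replace every carriage return with a newline, then grep the result.
matches=$(tr '\r' '\n' < "$sample" | grep -E "$pattern")
printf '%s\n' "$matches"

rm -f "$sample"
```

If the file simply has CRLF line endings, tr -d '\r' removes the carriage returns instead, without introducing blank lines.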
|
|
|
|
04-12-2003, 02:27 PM
|
#10
|
|
Moderator
Registered: Mar 2003
Location: Scotland
Distribution: Slackware, RedHat, Debian
Posts: 12,047
Rep:
|
I think you will need to use awk.
I don't know of a way to do it with grep.
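A minimal awk sketch, assuming (as in the example line above) that the address is the first whitespace-separated field. For comparison, GNU grep 2.5 and later should also offer -o (--only-matching), which prints only the matched text; the grep 2.4.2 mentioned earlier in the thread would not have it. The pattern below is a slight variant of the one in the thread: the final group is written as (\.[a-zA-Z0-9_-]+)+ because the original's last group matches a dot plus a single character, which is harmless when whole lines are printed but would cut the match short with -o.

```shell
# Example line in the shape shown in the thread.
line='blah@linuxquestions.org ... Deferred ...'

# awk: print only the first whitespace-separated field.
first_field=$(printf '%s\n' "$line" | awk '{ print $1 }')

# grep -o: print only the part of the line that matches the pattern.
# Variant pattern (an assumption, not the thread's exact regex); see note above.
pattern='^[a-zA-Z0-9_-]+[.a-zA-Z0-9_-]*@[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)+'
only_match=$(printf '%s\n' "$line" | grep -oE "$pattern")

printf '%s\n' "$first_field" "$only_match"
```

The awk form depends on where the address sits in the line; the grep -o form depends only on the pattern.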
|
|
|
|
04-12-2003, 07:48 PM
|
#11
|
|
Moderator
Registered: May 2001
Posts: 24,779
|
AFAIK, Nutscrape uses the standard mbox format, which means you should be able to get the "From" address, for instance, with:
cat <nutscrape_mbox> | formail -czx From: -s
If you need the 3rd field from a header and know the fields are space-separated, you could do:
cat <nutscrape_mbox> | formail -czx From: -s | while read i; do i=( ${i} ); printf "%s\n" "${i[2]}"; done
Arrays count from zero up, so ${i[2]} is the third field.
|
|
|
|
04-13-2003, 02:48 PM
|
#12
|
|
Moderator
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,902
|
Code:
grep -E "^([a-zA-Z0-9_-])+([\.a-zA-Z0-9_-])*@([a-zA-Z0-9_-])+(\.[a-zA-Z0-9_-])+" Export | cut -d" " -f1
Cheers,
Tink
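To see what the cut stage does, here is a small sketch on made-up input. cut -d" " -f1 keeps everything before the first space, so it relies on the address starting the line; a line with no space at all is passed through unchanged.

```shell
# Made-up lines in the shape discussed in the thread.
with_text=$(printf '%s\n' 'blah@linuxquestions.org ... Deferred ...' | cut -d" " -f1)
bare=$(printf '%s\n' 'user1@linuxquestions.org' | cut -d" " -f1)

printf '%s\n' "$with_text" "$bare"
```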
|
|
|
|
All times are GMT -5.