Automatically find and export email addresses?
I’m a complete newb (when it comes to Linux) and I’m not even sure if what I have in mind is possible, but I know Linux has a lot of capability so I think there is probably a way.
I have a couple of 40-page text files, exported from contact lists (from programs in Windows; I dual boot). These files are mostly junk I don’t need, but they are also full of email addresses I DO need. I have been manually going through them, finding the email addresses and cutting and pasting them into a separate list. It is tedious as hell. Is there any way to make a script or something that just searches a text file and exports every word containing the @ symbol? So I’m looking for a way to just automatically get all the email addresses out of a long text file and put them into a list. Is this possible? Thank you!
$ cat file.txt | grep @
|
Where "file.txt" is your input file. You can output the contents into a file called address.txt by adding the following to kilgoretrout's command:
Code:
> address.txt
If I may throw in my 2 cents.
Asymptote's solution will end up with one address in address.txt because every time the script finds an address, it will overwrite the previous one. A small matter of syntax: change asymptote's solution to read
Code:
>> address.txt
Not on my system! I tested it using the following code:
Code:
#List all files in the file system containing an "a" starting with |
grep @ file.txt > address.txt
Only 1 process/invocation, so you only need '>'. Note that cat file | grep pattern is a UUOC (Useless Use of cat).
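To make the '>' versus '>>' point concrete, here is a minimal sketch. The file names and the three sample lines are made up for the demo, and it runs in a scratch directory so it won't touch anything real.

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
cd "$(mktemp -d)"

# Hypothetical sample input: two lines with an address, one without.
printf '%s\n' 'alice@example.com' 'no address here' 'bob@example.com' > file.txt

# One grep invocation: '>' truncates address.txt once, when the shell opens it,
# and then every matching line is written to it.
grep @ file.txt > address.txt
wc -l < address.txt    # counts both matching lines, not just the last one

# '>>' appends instead of truncating, so a second run adds the matches again.
grep @ file.txt >> address.txt
wc -l < address.txt    # the count has doubled
```

So '>' is enough for a single command; '>>' only matters when several separate commands write to the same file.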
Good point - bigrigdriver threw me off.
|
I guess that's not going to work, because the addresses are not on lines of their own; every line of text has one in it. This is what part of my text file looks like:
"C","","Ellington","","cell@example.net","Page1","","","","","","","","","","","","","","","","","", "","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
"Cali","","Nichols","","ccni@example.com","","","","","","","","","","","","","","","","","","",""," ","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
"Carey","","Davis","","carey@example.com","ida-rmb","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","" ,"","","","","","","","","","","","","","","","",""
"Carla","","Kociolek","","cakoci@example.com","","","","","","","","","","","","","","","","","","", "","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
"Carmen","","Senter","","cirw@example.com","Page1","","","","","","","","","","","","","","","",""," ","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""," "
"Carol","","Foster","","caro@example.com","Page1","","","","","","","","","","","","","","","","","" ,"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""
"Carol","","Carr","","carr@example.net","Parent","","",
So the "grep" command just ends up copying the whole file. Thanks anyways for your help, guys.
Who the hell is carey davis??? Why are you emailing her ?!?! If that's who I think it is you and I are GOING TO HAVE A TALK!
|
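For records like the sample posted above, where every line contains an address, plain grep prints the whole line. A possible way around that, assuming your grep supports the -o option (GNU and BSD grep do), is to print only the part of each line that matches an address-like pattern. The sample records and the pattern below are my own illustration, and the pattern is a rough approximation, not a full address validator.

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
cd "$(mktemp -d)"

# Made-up records in the same quoted, comma-separated shape as the sample.
cat > file.txt <<'EOF'
"Ann","","Lee","","ann.lee@example.com","Page1","",""
"Bob","","Ray","","bray@example.net","","",""
EOF

# -o prints only the matched text, one match per line.
# -E enables extended regular expressions.
# Pattern: name characters, '@', domain characters, a dot, and 2+ letters.
grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' file.txt > address.txt

cat address.txt
```

Because the double quotes and commas are not in the character classes, each match stops at the field boundaries and only the bare address ends up in address.txt.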
A couple of thoughts:
First, those people probably wouldn't be too impressed having their email addresses posted on a forum, so perhaps post an example.com type line and delete the rest.
Otherwise, I know full well that some of the scripting gurus will give you a perfectly elegant solution to your problem from the command line, but I'm very ordinary at awk and all the rest of those tools. Have you tried opening this file as comma-delimited text in Excel/OO or similar, and pasting the address column into a text file? Crude, but effective if you are running a GUI and have OO or similar installed.
B
Edit: or you could just try this:
http://linux.die.net/man/1/cut
http://lowfatlinux.com/linux-columns-cut.html
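A rough sketch of the cut idea, under two assumptions that may not hold for a real export: the address is always the fifth comma-separated field, and no field contains a comma inside its quotes. The sample records are made up.

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
cd "$(mktemp -d)"

# Made-up records; the second one has an empty address field.
cat > file.txt <<'EOF'
"Ann","","Lee","","ann.lee@example.com","Page1","",""
"Dan","","Poe","","","Page1","",""
"Bob","","Ray","","bray@example.net","","",""
EOF

# cut -d, -f5  : split each line on commas and keep field 5
# tr -d '"'    : strip the surrounding double quotes
# grep @       : drop records whose address field was empty
cut -d, -f5 file.txt | tr -d '"' | grep @ > address.txt

cat address.txt
```

If the column position varies between exports, matching on the address pattern itself is likely to be more robust than counting fields.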
From the OP:
Quote:
I stand by my suggestion of using the append redirect. To get all of the addresses into one file, from all the files they are to be extracted from, I would loop through the files and append to the existing addresses.txt. There is no good reason (at least not one given by the OP) to have to run the script more than once to get the job done. And tommy.sean, don't give up on us so quickly. You didn't give us any indication of the file formats, or you would have received quite different suggestions. billymayday only hints at what those answers would have been.
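The loop-and-append approach might look something like this. The contacts*.txt names and the sample records are hypothetical, the address pattern is only an approximation, and grep's -o option is assumed to be available (it is in GNU and BSD grep).

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
cd "$(mktemp -d)"

# Two made-up export files standing in for the real contact lists.
printf '%s\n' '"A","","B","","a@example.com",""' > contacts1.txt
printf '%s\n' '"C","","D","","c@example.net",""' > contacts2.txt

# Start from an empty output file so a rerun does not duplicate entries.
: > addresses.txt

# Each pass of the loop is a separate grep invocation, so here '>>' is needed:
# '>' would truncate addresses.txt on every pass and keep only the last file.
for f in contacts*.txt; do
    grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' "$f" >> addresses.txt
done

cat addresses.txt
```

Since grep also accepts several file names at once, a single invocation over all the files with one '>' redirect should give the same result; the loop just makes the append behaviour explicit.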
Quick 'n dirty perl
Code:
#!/usr/bin/perl -w |
Isn't using "cut" simpler?
|
Well guys, I finished up the slow and tedious way. Thanks again for your help; at least I did learn some things. I would have to learn a lot more about Perl before I could have done it that way.
Thanks for the info. Should I close this thread or something now?