LinuxQuestions.org
Old 05-25-2008, 05:17 PM   #1
tommy.sean
LQ Newbie
 
Registered: May 2008
Posts: 3

Rep: Reputation: 0
Automatically find and export email addresses?


I'm a complete newb (when it comes to Linux) and I'm not even sure if what I have in mind is possible, but I know Linux has a lot of capability so I think there is probably a way.

I have a couple of 40-page text files, exported from contact lists (from programs in Windows; I have dual boot). These files are mostly junk I don't need, but they are also full of email addresses I DO need. I have been manually going through them, finding the email addresses and cutting and pasting them into a separate list. It is tedious as hell.

Is there any way to make a script or something that just searches a text file and exports every word containing the @ symbol? So I'm looking for a way to automatically get all the email addresses out of a long text file and put them into a list. Is this possible? Thank you!
 
Old 05-25-2008, 05:33 PM   #2
kilgoretrout
Senior Member
 
Registered: Oct 2003
Posts: 2,987

Rep: Reputation: 388
$ cat file.txt | grep @
 
Old 05-25-2008, 05:38 PM   #3
asymptote
Member
 
Registered: Mar 2008
Posts: 236

Rep: Reputation: 37
Where "file.txt" is your input file. You can output the contents into a file called address.txt by adding the following to kilgoretrout's command:
Code:
> address.txt
address.txt will be created automatically and will contain every line of the input that contains the @ symbol.
 
Old 05-25-2008, 05:51 PM   #4
bigrigdriver
LQ Addict
 
Registered: Jul 2002
Location: East Centra Illinois, USA
Distribution: Debian stable
Posts: 5,908

Rep: Reputation: 356
If I may throw in my 2 cents.

Asymptote's solution will end up with one address in address.txt because every time the script finds an address, it will overwrite the previous one.

A small matter of syntax: change asymptote's solution to read
Code:
>> address.txt
The double >> will append addresses to address.txt rather than overwrite it.
 
Old 05-25-2008, 05:58 PM   #5
asymptote
Member
 
Registered: Mar 2008
Posts: 236

Rep: Reputation: 37
Not on my system! I tested it using the following code:
Code:
#List the entries of every top-level directory (not a recursive search)
#whose names contain an "a" and place the search results in scan.txt
311-laptop:~/Desktop$ sudo ls /* | grep a > scan.txt

#contents of scan.txt
311-laptop:~/Desktop$ cat scan.txt 
/apt_get_update.cap
bash
bzcat
cat
dash
date
dnsdomainname
false
hostname
ld_static
loadkeys
nano
netcat
netstat
rbash
readlink
rnano
run-parts
tailf
tar
uname
zcat
abi-2.6.22-14-generic

Last edited by asymptote; 05-25-2008 at 06:01 PM.
 
Old 05-25-2008, 05:59 PM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
grep @ file.txt > address.txt

Only 1 process/invocation, so only need '>'

Note that cat file|grep pattern is UUOC (Useless Use of cat)
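
A minimal sketch of the difference between the two redirects, assuming a throwaway sample file (the file names and sample lines are made up for illustration and are not from the original poster's data):

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)

# Hypothetical sample input: three lines, two of which contain '@'.
printf '%s\n' 'alice@example.com' 'no address here' 'bob@example.com' > "$workdir/file.txt"

# A single grep run writes every matching line. '>' truncates the
# output file only once, when the shell opens it, not once per match.
grep @ "$workdir/file.txt" > "$workdir/address.txt"
after_overwrite=$(wc -l < "$workdir/address.txt")

# '>>' only matters when a later command adds to the same file.
grep @ "$workdir/file.txt" >> "$workdir/address.txt"
after_append=$(wc -l < "$workdir/address.txt")

echo "lines after '>': $after_overwrite, lines after '>>': $after_append"
```

Both matching lines survive the first command, which is why a single grep only needs '>'.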
 
Old 05-25-2008, 06:03 PM   #7
asymptote
Member
 
Registered: Mar 2008
Posts: 236

Rep: Reputation: 37
Good point - bigrigdriver threw me off.
 
Old 05-25-2008, 08:28 PM   #8
tommy.sean
LQ Newbie
 
Registered: May 2008
Posts: 3

Original Poster
Rep: Reputation: 0
I guess that's not going to work because the addresses are not on separate lines; they are in every line of text. This is what part of my text file looks like:

"C","","Ellington","","cell@example.net","Page1","","","","","","","","","","","","","","","","","", "","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""

"Cali","","Nichols","","ccni@example.com","","","","","","","","","","","","","","","","","","",""," ","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""

"Carey","","Davis","","carey@example.com","ida-rmb","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","" ,"","","","","","","","","","","","","","","","",""

"Carla","","Kociolek","","cakoci@example.com","","","","","","","","","","","","","","","","","","", "","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""

"Carmen","","Senter","","cirw@example.com","Page1","","","","","","","","","","","","","","","",""," ","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""," "

"Carol","","Foster","","caro@example.com","Page1","","","","","","","","","","","","","","","","","" ,"","","","","","","","","","","","","","","","","","","","","","","","","","","","","","","",""

"Carol","","Carr","","carr@example.net","Parent","","",

So the "grep" command just ends up copying the whole file. Thanks anyways for your help guys.

Last edited by tommy.sean; 05-26-2008 at 05:02 PM.
 
Old 05-25-2008, 09:56 PM   #9
asymptote
Member
 
Registered: Mar 2008
Posts: 236

Rep: Reputation: 37
Who the hell is carey davis??? Why are you emailing her ?!?! If that's who I think it is you and I are GOING TO HAVE A TALK!
 
Old 05-25-2008, 10:11 PM   #10
billymayday
LQ Guru
 
Registered: Mar 2006
Location: Sydney, Australia
Distribution: Fedora, CentOS, OpenSuse, Slack, Gentoo, Debian, Arch, PCBSD
Posts: 6,678

Rep: Reputation: 122
A couple of thoughts:

First, those people probably wouldn't be too impressed having their email addresses posted on a forum, so perhaps post an example.com-type line and delete the rest.

Otherwise, I know full well that some of the scripting gurus will give you a perfectly elegant solution to your problem from the command line, but I'm very ordinary at awk and all the rest of those tools. Have you tried opening this file as comma-delimited text in Excel/OO or similar, and pasting the address column into a text file? Crude, but effective if you are running a GUI and have OO or similar installed.


B

Edit, or you could just try this

http://linux.die.net/man/1/cut
http://lowfatlinux.com/linux-columns-cut.html
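
A rough sketch of how cut might be applied here, assuming the address is always the fifth comma-separated field as in the sample posted above. The temporary file and its contents are stand-ins for illustration, not the real export:

```shell
# Hypothetical sample in the same shape as the export, including a blank line.
sample=$(mktemp)
printf '%s\n' \
  '"C","","Ellington","","cell@example.net","Page1",""' \
  '' \
  '"Cali","","Nichols","","ccni@example.com","",""' > "$sample"

# -d, sets the delimiter to a comma and -f5 keeps only the fifth field.
# -s skips lines that contain no delimiter at all (the blank lines).
# tr -d '"' then strips the double quotes around each address.
addresses=$(cut -s -d, -f5 "$sample" | tr -d '"')
printf '%s\n' "$addresses"
```

This would give the wrong field on any line where a quoted value itself contains a comma, so it is worth checking the output against a few lines of the real file.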

Last edited by billymayday; 05-25-2008 at 10:23 PM.
 
Old 05-25-2008, 11:47 PM   #11
bigrigdriver
LQ Addict
 
Registered: Jul 2002
Location: East Centra Illinois, USA
Distribution: Debian stable
Posts: 5,908

Rep: Reputation: 356
From the OP:
Quote:
I have a couple of 40-page text files, exported from contact lists (from programs in Windows; I have dual boot). These files are mostly junk I don't need, but they are also full of email addresses I DO need. I have been manually going through them, finding the email addresses and cutting and pasting them into a separate list. It is tedious as hell.
He clearly indicates he has more than one file to extract addresses from.

I stand by my suggestion of using the append redirect. To get all of the addresses into one file, from all the files they are to be extracted from, a loop through the files, with an append to the existing addresses.txt, would be my way to do it.

There is no good reason (at least not one given by the OP) to have to run the script more than once to get the job done.

And tommy.sean, don't give up on us so quickly. You didn't give us any indication of the file formats, or you would have received quite different suggestions. billymayday only hints at what those answers would have been.
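
The loop-and-append idea could be sketched roughly as follows. The file names (contacts1.txt, contacts2.txt), their contents, and the use of cut for the extraction step are all assumptions made for illustration:

```shell
# Scratch directory with two hypothetical export files.
workdir=$(mktemp -d)
printf '%s\n' '"C","","Ellington","","cell@example.net","Page1"' > "$workdir/contacts1.txt"
printf '%s\n' '"Carol","","Carr","","carr@example.net","Parent"' > "$workdir/contacts2.txt"

out="$workdir/addresses.txt"
: > "$out"    # start from an empty output file

for f in "$workdir"/contacts*.txt; do
    # Fifth comma-separated field, quotes removed, appended to the
    # running list so earlier files are not overwritten.
    cut -s -d, -f5 "$f" | tr -d '"' >> "$out"
done

cat "$out"
```

Here '>>' is the right redirect because each pass of the loop is a separate command writing to the same file.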

Last edited by bigrigdriver; 05-25-2008 at 11:52 PM.
 
Old 05-26-2008, 12:45 AM   #12
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
Quick 'n dirty perl
Code:
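#!/usr/bin/perl -w
use strict;

# Usage: pass the input file, then the output file, as arguments.
my (
    $f1, $f1_rec, $f1_field, $f2
    );

$f1 = $ARGV[0];    # input file (comma-separated export)
$f2 = $ARGV[1];    # output file, opened in append mode

open(F2,">>", "$f2") or die "Unable to open $f2: $!\n";

open(F1,"<", "$f1") or die "Unable to open $f1: $!\n";
while ( defined ( $f1_rec = <F1> ) )
{
    # Take the 5th comma-separated field (index 4).
    $f1_field = (split(/,/, $f1_rec))[4];
    next unless defined $f1_field;    # skip blank or short lines
    $f1_field =~ s/"//g;              # strip the surrounding quotes
    print F2 "$f1_field\n";
}
close(F1) or die "Unable to close $f1: $!\n";
close(F2) or die "Unable to close $f2: $!\n";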
Assumes email is always 5th field as per example data given above.
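
If the address is not reliably in the same field, one alternative is to pull out anything that looks like an address wherever it sits on the line. This is only a sketch: it assumes a grep that supports the -o option (GNU and BSD grep do; it is not required by POSIX), the pattern is a loose approximation of an email address, and the sample lines are invented:

```shell
# Hypothetical sample where the address sits in different fields.
sample=$(mktemp)
printf '%s\n' \
  '"C","","Ellington","","cell@example.net","Page1",""' \
  '"Carey","carey@example.com","Davis","","","ida-rmb",""' > "$sample"

# -o prints only the matched text instead of the whole line;
# -E enables extended regular expressions.
found=$(grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' "$sample")
printf '%s\n' "$found"
```

Because it matches the address itself rather than a column, the quotes and empty fields never reach the output.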
 
Old 05-26-2008, 12:57 AM   #13
billymayday
LQ Guru
 
Registered: Mar 2006
Location: Sydney, Australia
Distribution: Fedora, CentOS, OpenSuse, Slack, Gentoo, Debian, Arch, PCBSD
Posts: 6,678

Rep: Reputation: 122
Isn't using "cut" simpler?
 
Old 05-26-2008, 04:57 PM   #14
tommy.sean
LQ Newbie
 
Registered: May 2008
Posts: 3

Original Poster
Rep: Reputation: 0

Well guys, I finished up the slow and tedious way. Thanks again for your help; at least I did learn some things. I would have to learn a lot more about Perl before I could have done it that way.
Thanks for the info. Should I close this thread or something now?
 
  

