LinuxQuestions.org
Closed Thread
Old 11-17-2014, 10:13 AM   #1
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Rep: Reputation: 22
Extract Emails from a Web site using Wget.


Hello all.
How can I extract all email addresses from a web site using wget? I used the command below but did not get any results:

wget -q -r -l 5 -O - www.example.com | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"

Any idea?
 
Old 11-17-2014, 10:25 AM   #2
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 21,948

Rep: Reputation: 5811
Quote:
Originally Posted by jokar.mohsen
Hello all.
How can I extract all email addresses from a web site using wget? I used the command below but did not get any results:

wget -q -r -l 5 -O - www.example.com | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"
Any idea?
No, since you, AGAIN, do not provide examples or details. We have NO IDEA how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?
 
Old 11-17-2014, 10:35 AM   #3
SAbhi
Member
 
Registered: Aug 2009
Location: Bangaluru, India
Distribution: CentOS 6.5, SuSE SLED/ SLES 10.2 SP2 /11.2, Fedora 11/16
Posts: 665

Rep: Reputation: Disabled
Same questions:

What are you trying to do?
How are the emails displayed on the website?
 
Old 11-17-2014, 10:45 AM   #4
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,237

Rep: Reputation: 1651
Quote:
this sounds VERY much like something a spammer would do
I agree with this - what is the reason behind wanting to scrape email addresses off of a website? I don't want to support a spamming campaign.
 
Old 11-17-2014, 12:44 PM   #5
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22
Quote:
Originally Posted by szboardstretcher
I agree with this - what is the reason behind wanting to scrape email addresses off of a website? I don't want to support a spamming campaign.
It is not spamming.
 
Old 11-17-2014, 12:47 PM   #6
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22
Quote:
Originally Posted by SAbhi
Same questions:

What are you trying to do?
How are the emails displayed on the website?
I want to use wget like Metasploit Modules and Maltego. I know that Maltego and Metasploit can extract emails from a web site very well, but how can I do it with wget? If you visit a website and click on "contact" you can see some email addresses for the website, but as you know, other pages have emails too, and I want to extract all emails from all pages.
 
Old 11-17-2014, 12:48 PM   #7
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22
Quote:
Originally Posted by TB0ne
No, since you, AGAIN, do not provide examples or details. We have NO IDEA how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?
It is not spamming. I can do it via Metasploit and Maltego, but I want to know how I can do it via wget. If you click on "Support" or "Contact" on a website you can see a list of email addresses, and I want to extract all emails from all pages via wget.
 
Old 11-17-2014, 12:51 PM   #8
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,237

Rep: Reputation: 1651
Sounds legit.

This will work:

Code:
curl http://www.regular-expressions.info/email.html | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"
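
To show what a pattern like this matches, here is a minimal sketch that runs a similar expression over inline sample text rather than a live page. The function name extract_emails and the sample addresses are made up for illustration and do not come from the thread. It assumes GNU grep (for -o and \b), and the pattern is deliberately simple: it will miss obfuscated addresses and top-level domains longer than four letters.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every email-like
# string found on standard input, one match per line.
extract_emails() {
    # -E: extended regex, so + and {2,4} need no backslashes
    # -i: case-insensitive match
    # -o: print only the matched text, not the whole line
    # No -r here: this reads standard input, not a directory tree.
    grep -Eio '\b[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,4}\b'
}

# Made-up sample text standing in for one downloaded page.
printf '%s\n' \
    'Contact: <a href="mailto:support@example.com">support</a>' \
    'Sales desk: Sales@Example.org, phone 555-0100' \
    'No address on this line.' |
    extract_emails
```

Whether a page yields anything depends on how the addresses are written in its HTML; addresses rendered by scripts or written as "name at example dot com" will not match.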
 
Old 11-18-2014, 05:45 AM   #9
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22

Quote:
Originally Posted by szboardstretcher
Sounds legit.

This will work:

Code:
curl http://www.regular-expressions.info/email.html | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"
Thank you so much, but this only extracts emails from the current page. It is like the Firefox add-on named "Email extractor", but what I mean is something that digs through all pages in a domain and extracts the emails from them. For example:

Code:
curl http://www.regular-expressions.info | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"

Any idea?
 
Old 11-19-2014, 03:01 AM   #10
SAbhi
Member
 
Registered: Aug 2009
Location: Bangaluru, India
Distribution: CentOS 6.5, SuSE SLED/ SLES 10.2 SP2 /11.2, Fedora 11/16
Posts: 665

Rep: Reputation: Disabled
Now it no longer sounds valid to me...!!!
 
Old 11-19-2014, 04:39 AM   #11
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22
Why is it not valid? Can wget do it?
 
Old 11-23-2014, 09:16 AM   #12
jokar.mohsen
Member
 
Registered: Jul 2008
Location: Tehran
Posts: 441

Original Poster
Rep: Reputation: 22
Quote:
Originally Posted by TB0ne
No, since you, AGAIN, do not provide examples or details. We have NO IDEA how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?
If you claim that you are a genius in Linux, help me solve my sound card problem. Some genius told me that I must reinstall Debian and it is so...
 
Old 11-23-2014, 10:32 AM   #13
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 21,948

Rep: Reputation: 5811
Quote:
Originally Posted by jokar.mohsen
Why is it not valid? Can wget do it?
Because you want to harvest email addresses from websites that aren't your own, which is something a spammer would do.
Quote:
Originally Posted by jokar.mohsen
If you claim that you are a genius in Linux, help me solve my sound card problem. Some genius told me that I must reinstall Debian and it is so...
Several things wrong here:
  • This is NOT the right thread for this, and crossposting is against LQ Rules. Thread reported to moderators, not only for suspected help in spamming, but for crossposting.
  • If you followed advice from someone, then ask THEM to help you. After years of using Linux, you should be able to handle a simple re-install.
  • People are TIRED of trying to help you, because you NEVER post details, have to be asked things repeatedly to even get a halfway answer, and don't acknowledge what you've been told. If you want help, pay attention to what gets suggested, try it, and ask clear questions. If you want to ignore everyone and do whatever you want, there is NO POINT IN POSTING.
 
Old 11-23-2014, 01:52 PM   #14
John VV
LQ Muse
 
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,454

Rep: Reputation: 2601
Quote:
I want to use wget like Metasploit Modules and Maltego
while the use of these tools is legal and NOT against the forum rules

IF one has to ask ....
then the use is very likely " not all that legal "

also email addresses on sites SHOULD!!! be in a database ( an encrypted one!!!)

unless it is the "contact information" that is in some( BUT NOT ALL ) sites html footer

if that
then it is easy to grab
just extract it from the page "footer"

Quote:
If you visit a website and click on "contact" you can see some email addresses
without REAL!!! examples we can NOT help

that information
MIGHT!!!
be in "plain text"?
might be a database driven page ?
might be coded in php ?
or perl?
or ruby?
and so on ....
it might be a redirect to a different site

without KNOWING there is NO WAY we really can help

Last edited by John VV; 11-23-2014 at 01:59 PM.
 
Old 11-23-2014, 08:55 PM   #15
SAbhi
Member
 
Registered: Aug 2009
Location: Bangaluru, India
Distribution: CentOS 6.5, SuSE SLED/ SLES 10.2 SP2 /11.2, Fedora 11/16
Posts: 665

Rep: Reputation: Disabled
Quote:
Originally Posted by jokar.mohsen
Why is it not valid? Can wget do it?
Sorry, I saw this late.

It doesn't sound valid to me because the data you want comes from a web application that is not owned by you. If it is a support email you are looking for, that's often available on the "Contact Us" page or in the footer, as advised in a previous comment. You can use the given curl method to get that. But asking for code that can extract each and every email address raises a concern: what are you trying to achieve with it?

In my opinion, everything has its limits, and beyond them it is not legal.
 
  

