LinuxQuestions.org


jokar.mohsen 11-17-2014 10:13 AM

Extract Emails from a Web site using Wget.
 
Hello all.
How can I extract all email addresses from a website using wget? I used the command below but got no results:

wget -q -r -l 5 -O - www.example.com | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"

Any idea?

TB0ne 11-17-2014 10:25 AM

Quote:

Originally Posted by jokar.mohsen (Post 5270922)
Hello all.
How can I extract all email addresses from a website using wget? I used the command below but got no results:

wget -q -r -l 5 -O - www.example.com | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"
Any idea?

No, since you AGAIN, do not provide examples or details. We have NO IDEA on how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?

SAbhi 11-17-2014 10:35 AM

Same questions:

Tell us what you are trying to do.
How are the emails displayed on the website?

szboardstretcher 11-17-2014 10:45 AM

Quote:

this sounds VERY much like something a spammer would do
I agree with this - what is the reason behind wanting to scrape email addresses off of a website? I don't want to support a spamming campaign.

jokar.mohsen 11-17-2014 12:44 PM

Quote:

Originally Posted by szboardstretcher (Post 5270936)
I agree with this - what is the reason behind wanting to scrape email addresses off of a website? I don't want to support a spamming campaign.

It is not Spamming.

jokar.mohsen 11-17-2014 12:47 PM

Quote:

Originally Posted by SAbhi (Post 5270929)
Same questions:

Tell us what you are trying to do.
How are the emails displayed on the website?

I want to use wget the way Metasploit modules and Maltego are used. I know that Maltego and Metasploit can extract emails from a website very nicely, but how can I do it with wget? If you visit a website and click on "Contact" you can see some email addresses for that website, but as you know, other pages have emails too, and I want to extract all emails from all pages.

jokar.mohsen 11-17-2014 12:48 PM

Quote:

Originally Posted by TB0ne (Post 5270927)
No, since you AGAIN, do not provide examples or details. We have NO IDEA on how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?

It is not spamming. I can do it via Metasploit and Maltego, but I want to know how I can do it via wget. If you click on "Support" or "Contact" on a website you can see a list of email addresses, and I want to extract all emails from all pages via wget.

szboardstretcher 11-17-2014 12:51 PM

Sounds legit.

This will work:

Code:

curl http://www.regular-expressions.info/email.html | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"
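
A quick way to sanity-check a pattern like this offline is to pipe a sample string into grep instead of a live page. This is only a sketch: the addresses below are made up for illustration, and it assumes GNU grep (for `\b`, `\+` and `-o`).

```shell
# Made-up sample text; no network access needed.
sample='Contact: Alice.Smith@example.com or support@mail.example.org. Not an address: foo@bar'

# -i: ignore case, -o: print only the matching part of each line.
found=$(printf '%s\n' "$sample" | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b")

printf '%s\n' "$found"
```

With this input, the two well-formed addresses are printed one per line, and `foo@bar` is skipped because it has no dot-separated suffix after the host part.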

jokar.mohsen 11-18-2014 05:45 AM

Quote:

Originally Posted by szboardstretcher (Post 5270992)
Sounds legit.

This will work:

Code:

curl http://www.regular-expressions.info/email.html | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"

Thank you so much, but this only extracts emails from the current page. It works like the Firefox add-on named "Email extractor", but what I mean is something that digs through all pages in a domain and extracts the emails from them. For example:

Code:

curl http://www.regular-expressions.info | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"

Any idea?

SAbhi 11-19-2014 03:01 AM

Now it no longer sounds valid to me...!!!

jokar.mohsen 11-19-2014 04:39 AM

Why is it not valid? Can wget do it?

jokar.mohsen 11-23-2014 09:16 AM

Quote:

Originally Posted by TB0ne (Post 5270927)
No, since you AGAIN, do not provide examples or details. We have NO IDEA on how email addresses may be stored or displayed on a website, so any commands that are offered may (or may NOT) work. Also, this sounds VERY much like something a spammer would do...what are you trying to accomplish with this?

If you claim that you are a genius in Linux, help me solve my sound card problem :). Some genius told me that I must reinstall Debian, and it is so...

TB0ne 11-23-2014 10:32 AM

Quote:

Originally Posted by jokar.mohsen (Post 5271806)
Why is it not valid? Can wget do it?

Because you want to harvest email addresses from websites that aren't your own, which is something a spammer would do.
Quote:

Originally Posted by jokar.mohsen
If you claim that you are a genius in Linux, help me solve my sound card problem. Some genius told me that I must reinstall Debian, and it is so...

Several things wrong here:
  • This is NOT the right thread for this, and crossposting is against LQ Rules. Thread reported to moderators, not only for suspected help in spamming, but for crossposting.
  • If you followed advice from someone, then ask THEM to help you. After years of using Linux, you should be able to handle a simple re-install.
  • People are TIRED of trying to help you, because you NEVER post details, have to be asked things repeatedly to even get a halfway answer, and don't acknowledge what you've been told. If you want help, pay attention to what gets suggested, try it, and ask clear questions. If you want to ignore everyone and do whatever you want, there is NO POINT IN POSTING.

John VV 11-23-2014 01:52 PM

Quote:

I want to use wget like Metasploit Modules and Maltego
While the use of these tools is legal and NOT against the forum rules, IF one has to ask .... then the use is very likely "not all that legal".

Also, email addresses on sites SHOULD!!! be in a database (an encrypted one!!!), unless it is the "contact information" that is in some (BUT NOT ALL) sites' HTML footer. If that is the case, then it is easy to grab: just extract it from the page "footer".

Quote:

If you visit a website and click on "contact" you can see some email addresses
Without REAL!!! examples we can NOT help.

That information MIGHT!!! be in "plain text", might be on a database-driven page, or might be coded in PHP, Perl, Ruby, and so on .... It might be a redirect to a different site.

Without KNOWING, there is NO WAY we can really help.
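
The point about how an address is displayed can be illustrated with a small offline test. This is only a sketch: the three lines below are invented examples of ways a page might carry an address, and it assumes GNU grep. A plain-text pattern like the one posted earlier in the thread only matches forms that contain a literal user@host.tld string.

```shell
# Invented examples of how one address might appear in page source.
page='<a href="mailto:info@example.com">Mail us</a>
Write to info [at] example [dot] com
<script>document.write("info" + "@" + "example.com")</script>'

# -c counts the lines that contain at least one literal address.
hits=$(printf '%s\n' "$page" | grep -ic "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b")

printf 'lines with a literal address: %s\n' "$hits"
```

Only the mailto line counts as a hit here; the obfuscated and script-assembled forms are not matched, which is why no single command can be promised to work without seeing the actual pages.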

SAbhi 11-23-2014 08:55 PM

Quote:

Originally Posted by jokar.mohsen (Post 5271806)
Why is it not valid? Can wget do it?

Sorry for seeing this late.

It doesn't sound valid to me because the output you want from the web application is not owned by you. If it is a support email you are looking for, that is often available on the "Contact Us" page or in the footer, as advised in a previous comment, and you can use the given curl method to get it. But asking for code that can extract each and every email address raises some concern about what you are trying to achieve with it.

In my opinion, everything has its limits, and beyond them it is not legal.


All times are GMT -5.