Extract emails from a website using wget
Hello all.
How can I extract all email addresses from a website using wget? I used the command below, but it returns no results:
Code:
wget -q -r -l 5 -O - www.example.com | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"
Any ideas?
Same questions: tell us what you are trying to do. How are the emails displayed on the website?
Sounds legit.
This will work:
Code:
curl -s http://www.regular-expressions.info/email.html | grep -io "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b"
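The same idea can be wrapped in a small filter that reads HTML on standard input. This is only a sketch: the name extract_emails is made up for illustration, the pattern is a loose approximation of an email address rather than a full validator, and grep -o is assumed to be available (GNU and BSD grep have it, but it is not POSIX).

```shell
#!/bin/sh
# extract_emails: hypothetical helper name, not part of any tool.
# Reads text on stdin and prints each address-like string once.
extract_emails() {
    # -E: extended regex, -i: ignore case, -o: print only the matched text
    grep -Eio '[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}' |
        tr 'A-Z' 'a-z' |   # normalise case so duplicates collapse
        sort -u            # one line per distinct address
}

# Usage sketch (the /contact path is an assumption):
#   curl -s http://www.example.com/contact | extract_emails
```

Reading from stdin keeps the fetching step (curl, wget, or a saved file) separate from the matching step.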
Code:
curl http://www.regular-expressions.info | grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b"
Any idea?
Now it no longer sounds valid to me!
Why is it not valid? Can wget do it?
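On the wget part of the question, one possible approach is to let wget save the pages into a directory and then search the saved files, rather than piping everything through -O -. A rough sketch follows; emails_in_dir is a made-up helper name, the depth of 1 is an arbitrary choice, and GNU grep is assumed for -r and -o.

```shell
#!/bin/sh
# emails_in_dir: hypothetical helper name.
# Prints each address-like string found in any file under directory $1.
emails_in_dir() {
    # -r: recurse into the directory, -h: omit file names,
    # -i: ignore case, -o: print only the matched text
    grep -rEhio '[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}' "$1" | sort -u
}

# Usage sketch (not run here; requires network access):
#   dir=$(mktemp -d)
#   wget -q -r -l 1 -P "$dir" www.example.com
#   emails_in_dir "$dir"
```

This only finds addresses that appear as plain text in the downloaded files.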
If one has to ask, then the use is very likely "not all that legal". Also, email addresses on sites SHOULD be in a database (an encrypted one!), unless it is the "contact information" that sits in the HTML footer of some (BUT NOT ALL) sites. If that is what you want, it is easy to grab: just extract it from the page footer.
That information MIGHT be in plain text. It might be a database-driven page. It might be coded in PHP, or Perl, or Ruby, and so on. It might be a redirect to a different site. Without KNOWING, there is NO WAY we can really help.
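For the plain-text footer case, contact addresses are often written as mailto: links, so one option is to pull out just those. A minimal sketch, assuming the page really does use mailto: links; mailto_links is a made-up name and grep -o is a GNU/BSD extension.

```shell
#!/bin/sh
# mailto_links: hypothetical helper name.
# Reads HTML on stdin and prints the address part of each mailto: link.
mailto_links() {
    # Match "mailto:" up to a quote, "?", angle bracket or space,
    # then drop the "mailto:" prefix and remove duplicates.
    grep -Eio "mailto:[^\"'?<> ]+" | cut -d: -f2- | sort -u
}

# Usage sketch:
#   curl -s http://www.example.com | mailto_links
```

Addresses that are generated by scripts or stored only in a database will not show up this way.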
It doesn't sound valid to me, because the data you want from the web application is not owned by you. If it is a support email you are looking for, that is often available on the "Contact Us" page or in the footer, as advised in a previous comment, and you can use the given curl method to get it. But asking for code that can extract each and every email address raises the question of what you are trying to achieve with it. In my opinion, everything has its limits, and beyond them it is not legal.