LinuxQuestions.org
Old 03-08-2014, 10:20 PM   #1
hari921019
LQ Newbie
 
Registered: Mar 2014
Posts: 20

Rep: Reputation: Disabled
Domain name to IP address


lynx -dump http://www.domain.com | grep -A999 "^References$" | tail -n +3 | awk '{print $2 }'

With the command above I'm getting a list of domain names instead of IP addresses.

How can I get the IP addresses instead?
 
Old 03-09-2014, 03:47 AM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
When I run those commands, I get a bunch of links (URLs), not a list of domain names.

To find the IP address of a hostname (domain names may or may not be associated with an IP address), a DNS lookup must be performed. The host command will do that.
 
Old 03-09-2014, 05:51 AM   #3
hari921019
LQ Newbie
 
Registered: Mar 2014
Posts: 20

Original Poster
Rep: Reputation: Disabled
Yes, links, sorry, my bad. May I know what steps I should take, and can you give me code for that? Thanks.
 
Old 03-09-2014, 05:56 AM   #4
hari921019
LQ Newbie
 
Registered: Mar 2014
Posts: 20

Original Poster
Rep: Reputation: Disabled
I'm trying to get the IP addresses for all the hyperlinks on a website. For example, I want the IP addresses of all the hyperlinks on cisco.com.
 
Old 03-09-2014, 06:46 AM   #5
david1941
Member
 
Registered: May 2005
Location: St. Louis, MO
Distribution: CentOS7
Posts: 267

Rep: Reputation: 58
Google Chrome has an extension for that:

IPvFoo is a Chrome extension that adds an icon to your location bar, indicating whether the current page was fetched using IPv4 or IPv6. When you click the icon, a pop-up appears, listing the IP address for each domain that served the page elements.

Dave
 
Old 03-09-2014, 07:20 AM   #6
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
As david1941 points out, there may be other ways to attack this problem. It all depends on why you need this data and what you're going to do with it. If you need to process the names in a script, a browser extension may not work for you.

Your set of commands filters the output from lynx -dump, which means you get a lot of URLs and, for some web sites, the text "links:" (the latter part of the "Hidden links:" header) followed by some more URLs.

You could filter on "^http://" (a "^" in a regular expression means "the start of the line") to lose any blank lines and the "links:" header. I've used "www.microsoft.com" in this example, as the page contains an unusually large number of links and should be a good test case:
Code:
lynx -dump http://www.microsoft.com \
  | grep -A999 "^References$" \
  | tail -n +3 \
  | awk '{print $2 }' \
  | grep "^http://"
(I've used backslash escaping to split the code across multiple lines, as this improves readability.)

The search-and-replace function in sed could then be used to remove the "http://" part as well as the path after the host name:
Code:
lynx -dump http://www.microsoft.com \
  | grep -A999 "^References$" \
  | tail -n +3 \
  | awk '{print $2 }' \
  | grep "^http://" \
  | sed -e "s/http:\/\///" -e "s/\/.*//"
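Links served over HTTPS are dropped by the "http://" filter above. As a rough sketch, assuming a sed that supports -E (both GNU and BSD sed do), the filter and the substitutions can be folded into one helper that also accepts https:// URLs, drops an explicit port, and removes duplicate names. The function name urls_to_hosts is made up for this example:

```shell
# urls_to_hosts: read one URL per line on stdin, print unique host names.
# (Hypothetical helper name, not part of the original pipeline.)
urls_to_hosts() {
  grep -E '^https?://' \
    | sed -E -e 's|^https?://||' -e 's|[/?#].*$||' -e 's|:[0-9]+$||' \
    | sort -u
}
```

It would slot in after the awk stage, for example: lynx -dump http://www.microsoft.com | grep -A999 "^References$" | tail -n +3 | awk '{print $2 }' | urls_to_hosts. URLs containing user:password@ or bracketed IPv6 literals are not handled by this sketch.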
That should leave you with a simple list of host names. You could use a while loop to read each name into a variable and feed that to the host command:
Code:
lynx -dump http://www.microsoft.com \
  | grep -A999 "^References$" \
  | tail -n +3 \
  | awk '{print $2 }' \
  | grep "^http://" \
  | sed -e "s/http:\/\///" -e "s/\/.*//" \
  | while read hname ; do 
      host $hname
    done
However, the output from the host command is somewhat unpredictable. The host name may have an A record, in which case one or more IP addresses are returned. It could also be a CNAME, in which case the host command will follow the pointer and resolve the target name (good), but will also report its progress as it goes along (not necessarily what you want to see in your list). Further filtering through grep and sed could be used to return just the IP address(es):
Code:
lynx -dump http://www.microsoft.com \
  | grep -A999 "^References$" \
  | tail -n +3 \
  | awk '{print $2 }' \
  | grep "^http://" \
  | sed -e "s/http:\/\///" -e "s/\/.*//" \
  | while read hname ; do
      host $hname \
        | grep "has address" \
        | sed "s/.*has address //"
    done
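Note that the grep above only matches IPv4 lines; for a name with AAAA records, host also prints "has IPv6 address" lines, which do not contain the string "has address". A small sketch of a filter that keeps both kinds follows. The helper name host_addrs is hypothetical, and the sample output is invented (documentation-range addresses) so that the example runs without a network:

```shell
# host_addrs: read `host` output on stdin, print one address per line.
host_addrs() {
  awk '
    / has address /      { print $4 }   # NAME has address ADDR
    / has IPv6 address / { print $5 }   # NAME has IPv6 address ADDR
  '
}

# Sample text in the format printed by host; names and addresses are made up.
sample='www.example.com is an alias for web.example.net.
web.example.net has address 192.0.2.10
web.example.net has IPv6 address 2001:db8::10'

printf '%s\n' "$sample" | host_addrs
```

Inside the loop, host $hname | host_addrs would take the place of the grep and sed stages.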
You now have a simple list of IP addresses, but not the corresponding hostnames. If you also want the host names, the output from the host command needs to be parsed line by line, as a hostname can resolve to more than one IP address. An echo statement inside a second while loop would do the trick:
Code:
lynx -dump http://www.microsoft.com \
  | grep -A999 "^References$" \
  | tail -n +3 \
  | awk '{print $2 }' \
  | grep "^http://" \
  | sed -e "s/http:\/\///" -e "s/\/.*//" \
  | while read hname ; do 
      host $hname \
        | grep "has address" \
        | sed "s/.*has address //" \
        | while read addr; do 
            echo "$addr $hname"
          done
    done
While this works, it is:
  • only one of many ways to solve this problem, and
  • almost certainly not the best way
In fact, many would find this script ridiculously convoluted, and would opt to replace most of the code with a much shorter and arguably better awk program.
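As one possible illustration of that point, the grep, tail, awk, grep and sed stages could be collapsed into a single awk program along these lines. This is a sketch rather than a drop-in replacement: the function name extract_hosts is hypothetical, and it assumes the link list follows a line consisting of just "References", as the pipeline above does. Unlike the original, it also accepts https:// links, strips an explicit port, and prints each host name only once.

```shell
# extract_hosts: read `lynx -dump` output on stdin, print each linked host once.
extract_hosts() {
  awk '
    /^References$/ { in_refs = 1; next }   # the link list starts after this header
    in_refs && $2 ~ /^https?:\/\// {
      split($2, parts, "/")                # parts[3] is host or host:port
      host = parts[3]
      sub(/:[0-9]+$/, "", host)            # drop an explicit port number
      if (!seen[host]++) print host        # print each host only the first time
    }
  '
}
```

The lookup loop would then read from it directly: lynx -dump http://www.microsoft.com | extract_hosts | while read hname ; do host $hname ; done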

My point was to demonstrate how you can use common filtering and substitution mechanisms to alter the output from one command into any format you like, not how to produce the shortest and most efficient solution to this particular problem.

I'd recommend you take a closer look at what commands like awk, sed, cut, and join (as well as regular expressions in general) can do when it comes to manipulating text.

Last edited by Ser Olmy; 03-09-2014 at 07:36 AM. Reason: typo
 
Old 03-09-2014, 08:19 AM   #7
hari921019
LQ Newbie
 
Registered: Mar 2014
Posts: 20

Original Poster
Rep: Reputation: Disabled
Really Helpful..Thanks a lot
 
  


All times are GMT -5.
