Parse out only specific characters from web page
Hi. I am using curl to download a specific web page; basically I need the CustName: field from the query run on whois.domaintools.com
If you type an IP address, the page returns the organization name, and I need only this information saved in a plain text file. I tried using grep -E, but it gets messy because there are many &nbsp; entities and semicolons located after the CustName value. Also, the whole page is returned as one long line, so grepping for CustName returns that same long line. The characters that follow the information I need are simply an HTML line break, which is '<br>'. I need to stop grabbing text at that point. So what I do is run Code:
curl -s http://whois.domaintools.com/ip.addr.of.domain > file
followed by Code:
grep -E -o "CustName.{120}" file
|
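One way to stop at the '<br>' marker is to let sed capture everything between "CustName:" and the next '<br>', after converting the &nbsp; entities to spaces. A sketch; the sample line below is made up, since the real page markup may differ:

```shell
# Hypothetical one-long-line snippet of the HTML the page returns
# (made up for illustration; the real markup may differ)
line='Some header stuff CustName:&nbsp;Example Widgets Inc.<br>Address: 123 Main St<br>'

# Replace &nbsp; entities with spaces, then capture the text
# between "CustName:" and the next "<br>"
org=$(printf '%s\n' "$line" \
  | sed -e 's/&nbsp;/ /g' \
        -e 's/.*CustName:[[:space:]]*\([^<]*\)<br>.*/\1/')
echo "$org"
```

Redirecting the last echo to a file gives the plain-text result; `[^<]*` is what keeps the capture from running past the first '<br>'.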
If you run 'elinks -dump' on the page, you get rendered plain-text output that you can grep or awk. Or else pipe the curl output through a parser like http://www.devshed.com/c/a/apache/logging-in-apache/2/ (see "Listing 3-1. A Simple Script to Use As a Filter")?
|
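For what it's worth, "parser" here just means anything that turns the raw HTML into structured plain text. Rendering with elinks puts each field on its own line, so a plain grep is enough. A sketch; the rendered text below is simulated, since the real page layout may differ:

```shell
# With the real page you would run something like:
#   elinks -dump http://whois.domaintools.com/ip.addr.of.domain | grep 'CustName'

# Simulated rendered output, standing in for elinks -dump:
rendered='NetRange: 192.0.2.0 - 192.0.2.255
CustName: Example Widgets Inc.
Address: 123 Main St'

printf '%s\n' "$rendered" | grep 'CustName'
```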
Hi. I'm not sure what "parser" means here. I saved the output of the curl command to a file and just ran grep on it. Sorry, I'm new to parsing and to Apache.
|
Perhaps I'm missing something here, but why not use "whois"?
Code:
whois ip.addr.of.domain | grep OrgName |
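To save just the organization name itself (without the field label) to a file, awk can split on the colon. A sketch using canned whois output, since the actual field names vary by registry (OrgName, CustName, org-name, ...):

```shell
# Sample of typical whois output (field names vary by registry)
whois_out='NetRange:       192.0.2.0 - 192.0.2.255
OrgName:        Example Widgets Inc.
OrgId:          EWI'

# With the real command:
#   whois ip.addr.of.domain | awk -F': *' '/^OrgName:/ {print $2}' > file
org=$(printf '%s\n' "$whois_out" | awk -F': *' '/^OrgName:/ {print $2}')
echo "$org"
```

The `-F': *'` sets the field separator to a colon plus any following spaces, so `$2` is the bare value.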
The problem is that whois does not seem to work with every IP address.
|