Grep parsing issue
I'm trying to extract IPs and port numbers from an HTML file, but the task ended up being more complicated than I had expected. Here is a sample from the file:
<td class=xl27 x:num="38178">2004-7-10</td> <td height=19 class=xl25 style='height:14.25pt'>80.110.116.179</td> <td class=xl25>5561</td>
I need to isolate "80.110.116.179" as well as "5561" (0-65535) and place them on the same line (xxx.xxx.xxx.xxx pN). I thought about using awk with "> <" as the delimiter, without any luck. I can't "cut" the output since some IPs are longer than others, and grep '[0-9]\{3\}.[0-9]\{3\}. [0-9]\{3\}' turned out to give too much junk data. There must be a simple way to accomplish this task that I am not seeing. Any CLI wiz up for a challenge? Thanks for your time.
________________ Selfsck |
Using egrep, this regular expression will match only lines that contain a legal IP address (0.0.0.0 - 255.255.255.255):
\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
Now you should be able to pipe the result through awk to cut out the address:
awk -F"[<>]" '{ print $3}'
This only works if the number of <'s and >'s in front of the IP address is the same on every matching line (and the closing < should be there, too). Hope this gets you going again. |
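For reference, here is a minimal sketch of that pipeline run against the sample row from the question. It rests on a few assumptions: the three cells sit on one line exactly as quoted, the scratch file and the variable names (sample, octet, ip_re, ip) are made up for the illustration, and \b is the word-boundary extension found in GNU grep. grep -E is the equivalent of egrep. With the whole row on one line, the address lands in field 7 rather than field 3, so the field number has to be adjusted to the real layout of the file.

```shell
#!/bin/sh
# Hypothetical scratch file holding the row from the question on one line,
# plus a line without any address to show that it gets filtered out.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<td class=xl27 x:num="38178">2004-7-10</td> <td height=19 class=xl25 style='height:14.25pt'>80.110.116.179</td> <td class=xl25>5561</td>
<td class=xl27>no address here</td>
EOF

# One octet (0-255), the same alternation used in the expression above.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
# Four octets separated by literal dots, wrapped in word boundaries (GNU \b).
ip_re="\\b$octet\\.$octet\\.$octet\\.$octet\\b"

# Keep only lines that contain a legal address, then split each line on
# '<' and '>'. For this one-line layout the address is field 7.
ip=$(grep -E "$ip_re" "$sample" | awk -F'[<>]' '{ print $7 }')
echo "$ip"

rm -f "$sample"
```

If every cell in the real file is on its own line, the original $3 is the field to print instead.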
I get a "Badly placed (." error message from the command; I'll try to play with the syntax a little. Besides, I still haven't found a way to display both IPs and port numbers on the same line... hmm, wonders |
You don't show the command you tried; it should be something like this:
egrep 'expression' <file>
There should be single (or double) quotes around the expression. Displaying 2 (or more) items on the same line can be done in various ways; here's just one. Fill 2 (or more) variables:
THIS="`grep 'foo' somefile | cut -d: -f2`"
THAT="`egrep 'bar' otherfile | awk '{ print $3}'`"
and print/echo these:
echo $THIS $THAT |
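Applied to the row from the question, the two-variable idea might look like the sketch below. It assumes the three cells are on one line exactly as quoted, so that splitting on < and > puts the address in field 7 and the port in field 11. The variable names row, IP, PORT and line are only illustrative, and $(...) is used in place of backticks, which behaves the same here.

```shell
#!/bin/sh
# The sample row from the question, kept verbatim via a quoted here-document.
row=$(cat <<'EOF'
<td class=xl27 x:num="38178">2004-7-10</td> <td height=19 class=xl25 style='height:14.25pt'>80.110.116.179</td> <td class=xl25>5561</td>
EOF
)

# Fill two variables, one per item, then echo them on the same line.
# Field 7 is the address and field 11 the port for this one-line layout.
IP=$(printf '%s\n' "$row" | awk -F'[<>]' '{ print $7 }')
PORT=$(printf '%s\n' "$row" | awk -F'[<>]' '{ print $11 }')
echo "$IP $PORT"

# The same result from a single awk call; the comma in print inserts a space.
# This form prints one "address port" pair per input row.
line=$(printf '%s\n' "$row" | awk -F'[<>]' '{ print $7, $11 }')
echo "$line"
```

Both echo commands print the address followed by the port. The single awk call is the easier one to point at a whole file, since it handles every row in one pass.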