LinuxQuestions.org

Programming: bash script to dynamically edit an html file (https://www.linuxquestions.org/questions/programming-9/bash-script-to-dynamically-edit-an-html-file-802004/)

melee 04-15-2010 11:54 AM

Hmmm...

So I tried both of these awk one-liners and they don't quite seem to work, though the first one seems to work better. This is the command I tried:

Code:

awk 'BEGIN{FS="[\\||>*<*]"}ARGV[1] == FILENAME{_[$1]=$2}ARGV[2] == FILENAME{if($5 in _){match($2,/[0-9]+/,pc);gsub(pc[0],pc[0]/2)}print $0"\n"gensub($5,_[$5],2)}' ipsandhostlist.txt sedtest.html
This script turned this:

Code:

  <tr>

        <td class=default width="30%"><a href="#172_27_1_107">172.27.1.107</a></td>

        <td class=default width="40%">Security warning(s) found</td></tr>

  <tr>

into this:

Code:

0 0 0 0<0t0r0>0
0 0 0 0<0t0r0>0
        <td class=default width="15%"><a href="#172_27_1_58">172.27.1.58</a></td>
        <td class=default width="15%"><a href="#172_27_1_58">usbedtstee01.hologic.corp</a></td>
        <td class=default width="20%">Security warning(s) found</td></tr>
        <td class=default width="20%">Security warning(s) found</td></tr>
0 0 0 0<0t0r0>0
0 0 0 0<0t0r0>0


So it did match the hostname to the ip (which is awesome!), but it also seemed to duplicate every other line and insert zeroes between the characters on most other lines. So how is this script finding the ip? Is it using the regex in:
Code:

{match($2,/[0-9]+/,pc);gsub(pc[0],pc[0]/2)}
or is it pulling it from a specific field?



Oh, and for what it's worth, the second script does the same as the first, except it doesn't match the hostname to the ip. It duplicates every line and inserts the extra zeroes.


Thanks for the help so far!

Kenhelm 04-15-2010 05:19 PM

Using GNU sed and bash
Code:

while read ip hostname;do
  line='<td class=default width="60%"><a href="#'${ip//./_}'">'
  ip=${ip//./\\.}        # Escape the dots: 172\.27\.1\.107
  sed -i "/$line/{ s/60%/30%/; h; s/$ip/$hostname/; x; G }" nessus.html
done < hostnames.txt

The above method runs sed once for each ip/hostname. If there are a large number of ip/hostnames it would be more efficient to create a script of sed commands so that all the editing of the file can be done with just a single run of sed.
Code:

> sedscript        # Start with an empty sedscript file
while read ip hostname;do
  line='<td class=default width="60%"><a href="#'${ip//./_}'">'
  ip=${ip//./\\.}        # Escape the dots: 172\.27\.1\.107
  echo "/$line/{ s/60%/30%/; h; s/$ip/$hostname/; x; G }" >> sedscript
done < hostnames.txt

sed -f sedscript nessus.html > newnessus.html
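
To see what the h / x / G sequence does to a single matching row, here is a minimal sketch that runs the same sed commands on one line. The hostname host1.example.com is made up for illustration, the row is modelled on the HTML shown earlier in the thread, and the variable names row, ip_re and result are my own.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the hostname is invented for this example.
ip='172.27.1.107'
hostname='host1.example.com'
row='<td class=default width="60%"><a href="#172_27_1_107">172.27.1.107</a></td>'

line='<td class=default width="60%"><a href="#'${ip//./_}'">'
ip_re=${ip//./\\.}        # Escape the dots: 172\.27\.1\.107

# s/60%/30%/  halve the width on the matching row
# h           copy the edited row into the hold space
# s/ip/host/  turn the pattern space into the hostname row
# x           swap: pattern space = ip row, hold space = hostname row
# G           append the hold space, giving "ip row, newline, hostname row"
result=$(printf '%s\n' "$row" |
  sed "/$line/{ s/60%/30%/; h; s/$ip_re/$hostname/; x; G; }")
printf '%s\n' "$result"
```

The output should be two rows, both at width 30%: the original ip row followed by a copy in which only the link text has become the hostname. The href anchor keeps its underscore form because the escaped dots in ip_re no longer match underscores.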


grail 04-15-2010 10:16 PM

Hi melee

I will try to break it down for you. Oh, and sorry about the duplication of the other lines (my bad), I will fix that too :)

BEGIN{FS="[\\||>*<*]"} - Set the field delimiters to be used. This is a bracket expression, so any single |, >, < or * character splits a field: the pipe is for ipsandhost, and > and < are for sedtest

ARGV[1] == FILENAME{_[$1]=$2} - while in file ipsandhost create an array where index is ip and value is host name

ARGV[2] == FILENAME{ - while in second file sedtest

if($5 in _) - if after splitting line the fifth field (ip address) is one of the indexes in array _

{match($2,/[0-9]+/,pc); - match is a function which looks for the regex /[0-9]+/ (one or more digits) in the string held in field 2 and stores the match in array pc (the three-argument form of match is a gawk extension)

gsub(pc[0],pc[0]/2) - for the whole line (represented by $0), replace every occurrence of the value stored in pc[0] (which was 60 in the example) with pc[0]/2 (i.e. 30);

print $0"\n"gensub($5,_[$5],2)} - print the original line ($0), then a newline ("\n"), then $0 again with the fifth field (the ip address) replaced by the value stored in the array for that address (_[$5]), but only at its second occurrence (2). The reason for the last part is that the search string has dots in it (the ones separating the parts of the ip address), and in a regex a dot matches any character. If you made the replacement global it would also replace 172_27_1_58, as the numbers are the same and the dot "." matches the underscore "_"

// extra stuff I should have put in. You will notice that the print above is now inside the 'if'
else print} - this will now print the line as is if it does not require a change

So new line looks like:
Code:

awk 'BEGIN{FS="[\\||>*<*]"}ARGV[1] == FILENAME{_[$1]=$2}ARGV[2] == FILENAME{if($5 in _){match($2,/[0-9]+/,pc);gsub(pc[0],pc[0]/2);print $0"\n"gensub($5,_[$5],2)}else print}' ipsandhostlist.txt sedtest.html
Let me know how we go?
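
Two points from the breakdown above can be checked in isolation: which field the ip address lands in, and why an unescaped dot also matches the underscore form. This is only a sketch. The -F value is what I take to be a simplified equivalent of the FS character class, the sample row is copied from the output earlier in the thread, and the variable names are my own. It avoids gawk-only functions so it should run under other awks too.

```shell
#!/usr/bin/env bash
row='        <td class=default width="60%"><a href="#172_27_1_58">172.27.1.58</a></td>'

# Split on any single one of | < > * and show the second and fifth fields.
# $1 is the leading whitespace, $3 is the empty string between ">" and "<".
fields=$(printf '%s\n' "$row" | awk -F'[|<>*]' '{ print $2; print $5 }')
printf '%s\n' "$fields"

# gsub returns the number of replacements it made. With the dots unescaped,
# the pattern matches both 172_27_1_58 (in the href) and 172.27.1.58.
count=$(printf '%s\n' "$row" | awk '{ print gsub("172.27.1.58", "X") }')
printf '%s\n' "$count"
```

The first awk prints the td tag text (field 2, where the width lives) and then the ip address (field 5). The second prints 2, which is why the one-liner uses gensub with a target of 2 rather than a global replacement.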

melee 04-15-2010 10:16 PM

Kenhelm,

This is exactly what I was looking for...

It's a testament to good coding that my script (which didn't work) was about 5 times longer than the one you posted. :)


All, thanks for your help!

grail 04-16-2010 12:28 AM

Quote:

echo "/$line/{ s/60%/30%/; h; s/$ip/$hostname/; x; G }" >> sedscript
I realise you said it wasn't so important, but just in case: if not all lines have 60 in them, this will not always work as intended.

Glad you have a solution.

melee 04-16-2010 03:57 PM

Hey Kenhelm (or anyone):

what's this part do?

Code:

ip=${ip//./\\.}
more specifically this part

Code:

{ip//./
I understand the rest of this is the escaping of the dots, but I'm a little confused about the rest...

Thanks

Kenhelm 04-16-2010 07:01 PM

${ip//./\\.} is bash parameter expansion (sometimes called 'parameter substitution' or 'variable substitution').
It's similar to the sed s/ / / command but it uses filename globbing patterns, not regular expressions.
Code:

var="some dogs are doggedly dogmatic"
echo ${var/dog/cat}        # replace first
some cats are doggedly dogmatic

echo ${var//dog/cat}        # replace all
some cats are catgedly catmatic

echo ${var/*d??/cat}        # globbing pattern
catmatic
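
Applied to the ip address from the script, the same expansion can be sketched as follows. The variable names anchor, escaped and first_only are just for illustration. Because the pattern is a glob and not a regex, the dot in the pattern is a literal dot and needs no escaping there.

```shell
#!/usr/bin/env bash
ip='172.27.1.107'

anchor=${ip//./_}       # replace all dots with underscores (the href form)
escaped=${ip//./\\.}    # replace all dots with backslash-dot, for a sed regex
first_only=${ip/./_}    # a single slash replaces only the first dot

printf '%s\n' "$anchor" "$escaped" "$first_only"
```

This prints 172_27_1_107, then 172\.27\.1\.107, then 172_27.1.107. It needs bash (or another shell with this expansion); a plain POSIX sh such as dash rejects the ${var//pattern/replacement} form.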


grail 04-17-2010 01:53 AM

Have you checked the revised awk as well, as I believe it works for all scenarios.

melee 04-17-2010 01:51 PM

Quote:

Originally Posted by Kenhelm (Post 3938083)

I didn't even know this existed... Thanks for the education!

melee 04-17-2010 01:53 PM

Quote:

Originally Posted by grail (Post 3938251)
Have you checked the revised awk as well, as I believe it works for all scenarios.

grail,

I haven't tried this yet as the script that Kenhelm provided worked perfectly. I'll give it a try though to see if it works and report back here.

