LinuxQuestions.org

bridrod 08-30-2009 10:33 AM

awk or sed to use CSV as input and XML as template and output to a single file
 
Guys,

I have an XML file I want to use as a template. Something like this:

<connection name="SERVER">
<connection_info>
<name>SERVER</name>
<protocol>SSH</protocol>
<host>SERVER</host>
</connection_info>
</connection>

I have a control file (CSV) whose first column contains all the server names, one per line. What I want to do is read the CSV, replace "SERVER" in the template with each server name, and write the result to a file. I want to repeat the process to the end of the CSV and append every result to the same output file.

TIA,

-Rod

catkin 08-30-2009 11:18 AM

How about using awk's getline in the BEGIN section to read the XML template (see http://www.cs.utah.edu/dept/old/texi...k_5.html#SEC28)? You could append each line to a variable, separated by newlines, so that you have the whole XML template in a single variable. Then, as awk reads and parses the CSV, you could use sub() or gsub() on a copy of that variable to replace the tokens in the template XML with values from the CSV. Finish off by redirecting print or printf output to the output file (see http://www.cs.utah.edu/dept/old/texi...k_6.html#SEC39).

Would help if you posted the CSV, too.
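
A minimal sketch of the approach described above, not a definitive implementation. It assumes the template from the first post is saved as template.xml, the CSV as servers.csv with the server name in column 1, and the result goes to connections.xml. All three file names and the awk variable names are hypothetical.

```shell
# Sample inputs so the sketch can be run as-is (file names are assumptions).
cat > template.xml <<'EOF'
<connection name="SERVER">
<connection_info>
<name>SERVER</name>
<protocol>SSH</protocol>
<host>SERVER</host>
</connection_info>
</connection>
EOF

printf '%s\n' 'Server1,IP1,version1' 'Server2,IP2,version1' > servers.csv

awk -F, -v tpl=template.xml '
BEGIN {
    # Read the whole template into one variable, joining lines with newlines.
    while ((getline line < tpl) > 0) {
        template = template sep line
        sep = "\n"
    }
    close(tpl)
}
{
    # Work on a copy so the template itself stays untouched.
    block = template
    # Replace every SERVER token with column 1 of the current CSV line.
    gsub(/SERVER/, $1, block)
    print block
}' servers.csv > connections.xml
```

Note that gsub() treats & and backslashes in the replacement text specially, so this sketch assumes the server names contain neither.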

bridrod 08-30-2009 11:25 AM

I will read those links and see if I can digest them. Thanks.

Well, my CSV is actually simple right now. Example:

Server1,IP1,version1
Server2,IP2,version1
Server3,IP3,version2
Server4,IP4,version1
Server5,IP5,version3
.
.
.

So I just need to read column 1, which holds the server names.

-Rod

catkin 08-30-2009 12:37 PM

That's nice and straightforward :)

As awk reads each line of the CSV file, you can:
  1. use split() (or otherwise, as you prefer) to parse it at the commas into an array
  2. make a copy of your XML template
  3. loop over the array using sub() or gsub() to change the tokens in the XML template copy to values from the CSV file
  4. print() or printf() with redirection to the output file
Or (neater), how about changing your XML template into a printf() format string? That way you could avoid the sub() or gsub() step.
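
The printf() format-string idea could be sketched roughly as follows. This is only an illustration under stated assumptions: the file names servers.csv and connections.xml and the variable names fmt and out are hypothetical, and the format string is the template from the first post with each SERVER replaced by %s.

```shell
# Sample CSV so the sketch can be run as-is (file name is an assumption).
printf '%s\n' 'Server1,IP1,version1' 'Server2,IP2,version1' > servers.csv

awk -F, -v out=connections.xml '
BEGIN {
    # The XML template as one printf format; each %s is a slot for a value.
    # Any literal percent sign in a template would have to be written as %%.
    fmt =     "<connection name=\"%s\">\n"
    fmt = fmt "<connection_info>\n"
    fmt = fmt "<name>%s</name>\n"
    fmt = fmt "<protocol>SSH</protocol>\n"
    fmt = fmt "<host>%s</host>\n"
    fmt = fmt "</connection_info>\n"
    fmt = fmt "</connection>\n"
}
{
    # Column 1 fills all three slots. The redirection truncates the output
    # file once, then every later printf in this run appends to it.
    printf(fmt, $1, $1, $1) > out
}' servers.csv
```

Because the values are passed as printf arguments and not spliced into the text, no sub() or gsub() pass over the template is needed.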

Tinkster 08-30-2009 01:06 PM

Code:

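# Split each CSV line into fields at the commas.
BEGIN{
  FS=","
}
# For every CSV line, print one <connection> block:
# $1 = server name, $2 = IP (used as host), $3 = version (used as protocol).
{
  print "<connection name=\""$1"\">"
  print "<connection_info>"
  print "<name>"$1"</name>"
  print "<protocol>"$3"</protocol>"
  print "<host>"$2"</host>"
  print "</connection_info>"
  print "</connection>"
}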

Code:

awk -f awkscript csv
<connection name="Server1">
<connection_info>
<name>Server1</name>
<protocol>version1</protocol>
<host>IP1</host>
</connection_info>
</connection>
<connection name="Server2">
<connection_info>
<name>Server2</name>
<protocol>version1</protocol>
<host>IP2</host>
</connection_info>
</connection>
<connection name="Server3">
<connection_info>
<name>Server3</name>
<protocol>version2</protocol>
<host>IP3</host>
</connection_info>
</connection>
<connection name="Server4">
<connection_info>
<name>Server4</name>
<protocol>version1</protocol>
<host>IP4</host>
</connection_info>
</connection>
<connection name="Server5">
<connection_info>
<name>Server5</name>
<protocol>version3</protocol>
<host>IP5</host>
</connection_info>
</connection>


bridrod 08-31-2009 02:11 PM

Wow, thanks for all the input!

Tinkster, your script was right on! Thanks! Exactly what I needed!

I have to admit, I have a hard time understanding sed, awk and regexp tools. They seem so easy for some, but they are too much for me to handle. If my script is even a little more than trivial, it just fails and I don't know why. Glad you guys are always awesome!

Thanks again!

-Rod

Birdy 03-13-2012 07:00 PM

Thanks for the nice and simple templating solution. I wanted to keep the template as a plain text file instead of an awk script with print statements. To accomplish this, I constructed the following command line, which takes a template and the CSV file and produces the same output:
Code:

cat template | sed '{:q;N;s/\n/\\n/g;t q}' | awk '{print "awk \x27 BEGIN{FS=\",\"}{print "$0"}\x27 csv"}' | sh
The template file contains:
Code:

"<connection name=\""$1"\">
<connection_info>
<name>"$1"</name>
<protocol>"$3"</protocol>
<host>"$2"</host>
</connection_info>
</connection>"



All times are GMT -5.