shell xml parser

macushk · 04-01-2008, 10:42 PM

Hi All,

One of my jobs is to check a text file that contains a few thousand lines of text like this:

<abc.txt>
|network
|network|eth0
|network|eth0|ip:STRING:192.168.1.1
|network|eth0|subnet:STRING:255.255.255.0
...

As you can see, the file is really similar to an XML file. At the moment I just open it in "vi" and search for the field I need to check (e.g. search for "eth0" and check the IP address).
Is there any way to parse the file into a more human-readable form, to prevent human error?

e.g.
1. I just enter the required field to search for (i.e. eth0).
2. The parser shows the child elements as a formatted table.

Thanks.

chrism01 · 04-02-2008, 12:36 AM

Can you give a short example of the required output?

macushk · 04-02-2008, 01:24 AM

sundialsvcs · 04-02-2008, 09:22 PM

Well, first of all, that's not an XML file.

Now having said that, there are (of course...) several ways to do it.

One tool that's fairly specialized for this purpose is called awk. This might be a very good tool for handling data like this, which is basically "pipe-delimited" ("|" is "pipe") once you get into the meat-and-potatoes of it.

Another more-advanced tool, actually a full-fledged programming language, is Perl. Think of it as "awk on steroids," because that's more-or-less what its original inventor came up with when faced with a task very similar to yours.

Poke around... there might be an even-simpler solution nearby. In this Internet age, the best approach to a problem is usually on-line research, in the hope-and-expectation of finding a tool that you can ... just use. It's a good bet, because no matter what it is you're facing, someone else has probably already been there first.

chrism01 · 04-03-2008, 12:26 AM

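Yes, I'd use Perl for that: split(/\|/, $rec) (the pipe has to be escaped, because a bare | in a regex means alternation and would split the record into single characters), and use either a hash or an array as temp storage depending on what the rest of the file looks like.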
http://perldoc.perl.org/
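
For anyone who would rather stay in the shell, the same split-and-store idea can be sketched with an awk associative array in place of a Perl hash. This is only a rough sketch: the helper name lookup is made up for illustration, and it assumes every value line has the form path:TYPE:value as in the sample.

```shell
# lookup PATH FILE (hypothetical helper): print the value stored at a full
# path such as "network|eth0|ip". Assumes value lines look like path:TYPE:value.
lookup() {
    awk -v want="$1" '
        {
            line = $0
            sub(/^[ \t]*\|/, "", line)         # drop the leading pipe (and any indent)
            p = index(line, ":")               # the first colon ends the path
            if (p == 0) next                   # structural line without a value
            path = substr(line, 1, p - 1)
            rest = substr(line, p + 1)         # "TYPE:value"
            q = index(rest, ":")
            if (q == 0) next
            value[path] = substr(rest, q + 1)  # store only the value, keyed by path
        }
        END { if (want in value) print value[want]; else exit 1 }
    ' "$2"
}
```

With the sample file, lookup 'network|eth0|ip' abc.txt prints 192.168.1.1, and the function returns non-zero when the path is not present.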

archtoad6 · 05-03-2008, 09:43 AM

Actually, if the sample is representative, sed will suffice:

There seem to be 3 simple steps:

Remove the leading ' |' strings.
sed 's,^ |,,'
Remove everything up to and including the last '|' symbol (the match is greedy).
sed 's,^.*|,,'
Transform ':STRING:' into '='.
sed 's,:STRING:,=,'

These can be linked:

Code:

sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,' "$INFILE" > "$OUTFILE"

My test:

Code:

$ echo '<abc.txt>
 |network
 |network|eth0
 |network|eth0|ip:STRING:192.168.1.1
 |network|eth0|subnet:STRING:255.255.255.0' \
| sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,'

<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
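
To get closer to the original request (enter one field, see only its children), the same substitutions can be limited to lines that contain the field between two pipes. A minimal sketch follows; the function name children_of is hypothetical, and it assumes the field name contains no characters that are special to sed.

```shell
# children_of NODE FILE (hypothetical helper): print key=value for every
# line that has |NODE| in its path. NODE is pasted into the sed pattern
# unescaped, so it must not contain regex metacharacters or slashes.
children_of() {
    # -n suppresses default output; only matching lines are rewritten and printed
    sed -n "/|$1|/{s,^.*|,,;s,:STRING:,=,;p;}" "$2"
}
```

Called as children_of eth0 abc.txt on the sample, this prints only the ip= and subnet= lines and skips the bare |network|eth0 line, because that line has no pipe after eth0.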

druuna · 05-03-2008, 10:52 AM

Hi,

Using awk:

awk -F\| '{ sub(/:STRING:/,"=") ; print $NF }' infile

Code:

$ awk -F\| '{ sub(/:STRING:/,"=") ; print $NF  }' infile
<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0

Hope this helps.
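
If a formatted table is wanted, as asked in the first post, the awk approach can be extended along these lines. This is a sketch rather than a finished tool: show_children is a made-up name, the column widths are arbitrary, and it assumes the leaf field always looks like key:TYPE:value.

```shell
# show_children NODE FILE (hypothetical helper): list the direct children of
# NODE that carry a value, as aligned columns: key, type, value.
show_children() {
    awk -F'|' -v node="$1" '
        NF >= 2 && $(NF - 1) == node {
            leaf = $NF                         # e.g. ip:STRING:192.168.1.1
            p = index(leaf, ":")
            if (p == 0) next                   # child without a value, skip it
            key  = substr(leaf, 1, p - 1)
            rest = substr(leaf, p + 1)         # "TYPE:value"
            q = index(rest, ":")
            if (q == 0) next
            # everything after the second colon is the value, even if it
            # contains further colons
            printf "%-10s %-8s %s\n", key, substr(rest, 1, q - 1), substr(rest, q + 1)
        }
    ' "$2"
}
```

For example, show_children eth0 abc.txt lists one row for ip and one for subnet, each with its type and value in separate columns.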