LinuxQuestions.org


macushk 04-01-2008 10:42 PM

shell xml parser
 
Hi All,

One of my jobs is to check a text file that contains a few thousand lines of text like this:

<abc.txt>
|network
|network|eth0
|network|eth0|ip:STRING:192.168.1.1
|network|eth0|subnet:STRING:255.255.255.0
...

As you can see, the file is really similar to an XML file. Right now I just open it in "vi" and search for the required field to check it (e.g. search for "eth0" and check the IP address).
Is there any way to parse the file into something easier for a human to read, to prevent human error?

e.g.
1. I just enter the required field to search for (i.e. eth0).
2. The parser shows the child elements as a formatted table.

Thanks.

chrism01 04-02-2008 12:36 AM

Can you give a short example of the required output?

macushk 04-02-2008 01:24 AM

<abc.txt>
|network
|network|eth0
|network|eth0|ip:STRING:192.168.1.1
|network|eth0|subnet:STRING:255.255.255.0
...
..

<abc.out>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
...
..

sundialsvcs 04-02-2008 09:22 PM

Well, first of all, that's not an XML file.

Now having said that, there are (of course...) several ways to do it.

One tool that's fairly specialized for this purpose is called awk. This might be a very good tool for handling data like this, which is basically "pipe-delimited" ("|" is "pipe") once you get into the meat-and-potatoes of it.
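As a rough illustration of what "pipe-delimited" buys you, here is a minimal sketch that lets awk split each line on "|" and indents the last path component by its depth. The function name show_tree is made up for this example, and it assumes every data line starts with "|" and that "|" never appears inside a name or value.

```shell
# Hypothetical helper: print each node's last path component, indented by depth.
# Assumption: data lines look like |a|b|c, so field 1 is always empty.
show_tree() {
    awk -F'|' '
        NF > 1 {
            # Depth 0 has NF == 2 (empty field + name); add two spaces per extra level.
            indent = ""
            for (i = 2; i < NF; i++) indent = indent "  "
            print indent $NF
        }
    '
}

# Example call on three lines from the sample file.
printf '%s\n' '|network' '|network|eth0' '|network|eth0|ip:STRING:192.168.1.1' | show_tree
```

Lines without any "|" (such as the <abc.txt> header) are skipped by the NF > 1 condition.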

Another more-advanced tool, actually a full-fledged programming language, is Perl. Think of it as "awk on steroids," because that's more-or-less what its original inventor came up with when faced with a task very similar to yours.

Poke around... there might be an even-simpler solution nearby. In this Internet age, the best approach to a problem is usually on-line research, in the hope-and-expectation of finding a tool that you can ... just use. It's a good bet, because no matter what it is you're facing, someone else has probably already been there first.

chrism01 04-03-2008 12:26 AM

Yes, I'd use Perl for that: split(/\|/, $rec) (the pipe has to be escaped, because a bare | means alternation in a regex) and use either a hash or an array as temp storage, depending on what the rest of the file looks like.
http://perldoc.perl.org/

archtoad6 05-03-2008 09:43 AM

Actually, if the sample is representative, sed will suffice:

There seem to be 3 simple steps:
  1. Remove the leading ' |' strings.
    sed 's,^ |,,'
  2. Remove everything up to the last '|' symbol (the '.*' match is greedy).
    sed 's,^.*|,,'
  3. Transform ':STRING:' into '='.
    sed 's,:STRING:,=,'
These can be linked:
Code:

sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,' "$INFILE" > "$OUTFILE"
My test:
Code:

$ echo '<abc.txt>
 |network
 |network|eth0
 |network|eth0|ip:STRING:192.168.1.1
 |network|eth0|subnet:STRING:255.255.255.0' \
| sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,'

<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
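One observation on the sed approach, as a sketch rather than a replacement for the command above: because '.*' in sed is greedy, 's,^.*|,,' deletes everything through the last '|' on the line, so it also covers a leading '|' with or without a space in front of it. Under that reading two substitutions are enough. The function name flatten_paths is hypothetical, and it assumes no value ever contains a '|'.

```shell
# Hypothetical helper: reduce each line to its last path component
# and rewrite key:STRING:value as key=value. Reads stdin, writes stdout.
flatten_paths() {
    sed -e 's,^.*|,,' -e 's,:STRING:,=,'
}

# Example call; the third input line has a leading space on purpose.
printf '%s\n' '<abc.txt>' '|network' ' |network|eth0' \
    '|network|eth0|ip:STRING:192.168.1.1' | flatten_paths
```

Lines without a '|', such as the <abc.txt> header, pass through unchanged.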


druuna 05-03-2008 10:52 AM

Hi,

Using awk:

awk -F\| '{ sub(/:STRING:/,"=") ; print $NF }' infile

Code:

$ awk -F\| '{ sub(/:STRING:/,"=") ; print $NF  }' infile
<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0

Hope this helps.
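To get closer to the original request (enter a field such as eth0 and see its children as a table), the same awk idea can be turned into a lookup. This is only a sketch: the function name show_node is invented here, and it assumes leaf entries always have the form key:TYPE:value as in the sample. It has not been checked against the rest of the real file.

```shell
# Hypothetical helper: list the key/value children of one node as a two-column table.
# Usage: show_node NODE < file
show_node() {
    awk -F'|' -v node="$1" '
        # Keep lines whose parent (second-to-last field) is the requested node.
        NF > 2 && $(NF - 1) == node {
            leaf = $NF
            # Only lines shaped like key:TYPE:value are printed.
            if (split(leaf, kv, ":") >= 3) {
                # Take everything after "key:TYPE:" so values containing ":" survive.
                value = substr(leaf, length(kv[1]) + length(kv[2]) + 3)
                printf "%-10s %s\n", kv[1], value
            }
        }
    '
}

# Example call on the sample data.
printf '%s\n' '|network' '|network|eth0' \
    '|network|eth0|ip:STRING:192.168.1.1' \
    '|network|eth0|subnet:STRING:255.255.255.0' | show_node eth0
```

With the real file this would be run as show_node eth0 < abc.txt; the column width of 10 is an arbitrary choice.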


All times are GMT -5.