LinuxQuestions.org
Old 04-01-2008, 11:42 PM   #1
macushk
LQ Newbie
 
Registered: Oct 2006
Posts: 7

Rep: Reputation: 0
shell xml parser


Hi All,

One of my jobs is to check a text file that contains a few thousand lines of text like this:

<abc.txt>
|network
|network|eth0
|network|eth0|ip:STRING:192.168.1.1
|network|eth0|subnet:STRING:255.255.255.0
...

As you can see, the file is really similar to an XML file. At the moment I just open it in "vi" and search for the field I need to check (e.g. search for "eth0" and check the IP address).
Is there any way to parse the file into a more human-readable form, to prevent human error?

e.g.
1. I enter just the field to search for (i.e. eth0).
2. The parser shows the child elements as a formatted table.

Thanks.
 
Old 04-02-2008, 01:36 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,258

Rep: Reputation: 2328
Can you give a short example of the required output?
 
Old 04-02-2008, 02:24 AM   #3
macushk
LQ Newbie
 
Registered: Oct 2006
Posts: 7

Original Poster
Rep: Reputation: 0
<abc.txt>
|network
|network|eth0
|network|eth0|ip:STRING:192.168.1.1
|network|eth0|subnet:STRING:255.255.255.0
...
..

<abc.out>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
...
..
 
Old 04-02-2008, 10:22 PM   #4
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 7,482

Rep: Reputation: 2377
Well, first of all, that's not an XML file.

Now having said that, there are (of course...) several ways to do it.

One tool that's fairly specialized for this purpose is called awk. This might be a very good tool for handling data like this, which is basically "pipe-delimited" ("|" is "pipe") once you get into the meat-and-potatoes of it.

Another more-advanced tool, actually a full-fledged programming language, is Perl. Think of it as "awk on steroids," because that's more-or-less what its original inventor came up with when faced with a task very similar to yours.

Poke around... there might be an even-simpler solution nearby. In this Internet age, the best approach to a problem is usually on-line research, in the hope-and-expectation of finding a tool that you can ... just use. It's a good bet, because no matter what it is you're facing, someone else has probably already been there first.
 
Old 04-03-2008, 01:26 AM   #5
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,258

Rep: Reputation: 2328
Yes, I'd use Perl for that: split(/\|/, $rec) (the pipe has to be escaped, because a bare | in a regex means alternation) and use either a hash or an array as temporary storage, depending on what the rest of the file looks like.
http://perldoc.perl.org/
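To stay with the shell tools used elsewhere in this thread, here is a rough awk sketch of the same split-and-look-up idea, aimed at the original request (enter a field, see its children as a table). The function name show_children is made up for illustration, and the sketch assumes every leaf line has the form name:TYPE:value, as in the sample.

```shell
#!/bin/sh
# show_children ELEMENT FILE   (hypothetical helper, not from the thread)
# Prints a two-column table (name, value) for every leaf line whose
# direct parent is ELEMENT in a pipe-delimited file such as abc.txt.
show_children() {
    awk -F'|' -v want="$1" '
        # Leaf lines look like ...|parent|name:TYPE:value
        NF >= 2 && $(NF-1) == want && $NF ~ /:/ {
            split($NF, part, ":")            # part[1] is the field name
            value = $NF
            sub(/^[^:]*:[^:]*:/, "", value)  # drop "name:TYPE:", keep the rest
            printf "%-10s %s\n", part[1], value
        }
    ' "$2"
}
```

Calling show_children eth0 abc.txt prints one row per child of eth0, with the name in the first column and the value in the second. Removing the name:TYPE: prefix with sub(), rather than taking the third piece of the split, keeps values that themselves contain colons intact.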
 
Old 05-03-2008, 10:43 AM   #6
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 233
Actually, if the sample is representative, sed will suffice:

There seem to be 3 simple steps:
  1. Remove the leading ' |' strings.
    sed 's,^ |,,'
  2. Remove everything up to and including the last remaining '|' symbol ('.*' is greedy).
    sed 's,^.*|,,'
  3. Transform ':STRING:' into '='.
    sed 's,:STRING:,=,'
These can be linked:
Code:
sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,' "$INFILE" > "$OUTFILE"
My test:
Code:
$ echo '<abc.txt>
 |network
 |network|eth0
 |network|eth0|ip:STRING:192.168.1.1
 |network|eth0|subnet:STRING:255.255.255.0' \
| sed 's,^ |,,;s,^.*|,,;s,:STRING:,=,'

<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
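
A possible simplification, offered as a sketch rather than a correction: because '.*' is greedy, the second substitution already removes everything through the last '|', including any leading ' |', so the first substitution can probably be dropped for data shaped like the sample. The wrapper name strip_paths is hypothetical.

```shell
#!/bin/sh
# strip_paths FILE   (hypothetical wrapper, not from the thread)
# Keeps only the last pipe-delimited element of each line and rewrites
# ":STRING:" as "=". Lines without a "|" pass through unchanged.
strip_paths() {
    sed 's,^.*|,,;s,:STRING:,=,' "$1"
}
```

This assumes the only type tag in the file is STRING; other tags, if the real file has any, would need their own substitution.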
 
Old 05-03-2008, 11:52 AM   #7
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2387
Hi,

Using awk:

awk -F\| '{ sub(/:STRING:/,"=") ; print $NF }' infile

Code:
$ awk -F\| '{ sub(/:STRING:/,"=") ; print $NF  }' infile
<abc.txt>
network
eth0
ip=192.168.1.1
subnet=255.255.255.0
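
If the flat output is hard to scan across a few thousand lines, the same awk approach can indent each element by its depth. This is only a sketch: the function name indent_tree and the two-space indent are arbitrary choices, and lines without any '|' (such as the <abc.txt> marker) are skipped.

```shell
#!/bin/sh
# indent_tree FILE   (hypothetical helper, not from the thread)
# Prints the last element of each pipe-delimited line, indented by two
# spaces per nesting level, with ":STRING:" rewritten as "=".
indent_tree() {
    awk -F'|' '
        NF > 1 {
            sub(/:STRING:/, "=")       # edits $0; awk re-splits the fields
            pad = ""
            for (i = 2; i < NF; i++)   # one indent step per parent element
                pad = pad "  "
            print pad $NF
        }
    ' "$1"
}
```

With the sample data, network stays at the left margin, eth0 is indented one level, and the ip and subnet lines two levels.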
Hope this helps.
 
  

