LinuxQuestions.org
Old 04-12-2010, 04:38 AM   #1
grob115
Member
 
Registered: Oct 2005
Posts: 542

Rep: Reputation: 32
XML to CSV


Just wondering if there are any tools available for doing this. I found CSV to XML but not the other way around. Basically I have the following structure:
<Level Grade=1>
<Content Step=A>Romero</Content>
<Content Step=B>Julie</Content>
<Content Step=C>Stephen</Content>
</Level>
<Level Grade=2>
<Content Step=A>Thomas</Content>
</Level>
<Level Grade=3>
<Content Step=A>Mary</Content>
<Content Step=B>Flora</Content>
<Content Step=C>Michael</Content>
<Content Step=D>Jerry</Content>
</Level>

And I want it to appear in the following structure:
Level, Content A, Content B, Content C, Content D
1, Romero, Julie, Stephen
2, Thomas, ,
3, Mary, Flora, Michael, Jerry

Any ideas?
 
Old 04-12-2010, 05:48 AM   #2
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by grob115
Just wondering if there are any tools available for doing this. I found CSV to XML but not the other way around. Basically I have the following structure:
<Level Grade=1>
<Content Step=A>Romero</Content>
<Content Step=B>Julie</Content>
<Content Step=C>Stephen</Content>
</Level>
<Level Grade=2>
<Content Step=A>Thomas</Content>
</Level>
<Level Grade=3>
<Content Step=A>Mary</Content>
<Content Step=B>Flora</Content>
<Content Step=C>Michael</Content>
<Content Step=D>Jerry</Content>
</Level>

And I want it to appear in the following structure:
Level, Content A, Content B, Content C, Content D
1, Romero, Julie, Stephen
2, Thomas, ,
3, Mary, Flora, Michael, Jerry

Any ideas?
Yes: grab a Perl module that parses XML and a Perl module that generates CSV, and glue them together.
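For illustration, here is the same "parser plus CSV writer, glued together" idea sketched in Python's standard library rather than Perl. It is only a sketch under assumptions: the sample as posted is not well-formed XML (the attribute values are unquoted and there is no single root element), so the input below quotes the attributes and wraps everything in a hypothetical <Levels> root. The function name xml_to_csv is made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumption: the sample from the first post, made well-formed by quoting
# the attribute values and adding a hypothetical <Levels> root element.
XML = """<Levels>
<Level Grade="1">
<Content Step="A">Romero</Content>
<Content Step="B">Julie</Content>
<Content Step="C">Stephen</Content>
</Level>
<Level Grade="2">
<Content Step="A">Thomas</Content>
</Level>
<Level Grade="3">
<Content Step="A">Mary</Content>
<Content Step="B">Flora</Content>
<Content Step="C">Michael</Content>
<Content Step="D">Jerry</Content>
</Level>
</Levels>"""


def xml_to_csv(xml_text):
    """Parse the XML with a real parser and emit one CSV row per <Level>."""
    root = ET.fromstring(xml_text)
    levels = root.findall("Level")

    # Collect every Step value seen anywhere, so all rows share the same columns.
    steps = sorted({c.get("Step") for lvl in levels for c in lvl.findall("Content")})

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Level"] + ["Content " + s for s in steps])
    for lvl in levels:
        by_step = {c.get("Step"): (c.text or "") for c in lvl.findall("Content")}
        # Missing steps become empty fields, which keeps the comma count equal.
        writer.writerow([lvl.get("Grade")] + [by_step.get(s, "") for s in steps])
    return out.getvalue()


print(xml_to_csv(XML), end="")
```

Because the columns are built from the Step attributes rather than from line positions, every row gets the same number of fields and the header comes for free.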
 
Old 04-12-2010, 05:53 AM   #3
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
A prototype that needs LOTS more work ;}
Code:
awk 'BEGIN{RS=ORS="<Level Grade";FS=OFS="\n"}{printf "%s,", gensub(/=([0-9]+).*/, "\\1","1",$1);for(i=1;i<=NF;i++){if($i ~ /Content Step/){printf "%s,",gensub(/^[^>]+>([^<]+)<.*/,"\\1","1", $i)}}printf "\n"}' file 
,
1,Romero,Julie,Stephen,
2,Thomas,
3,Mary,Flora,Michael,Jerry,
You get no headers, and the comma count is not the same on
all lines ....
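One possible way to add the missing header and even out the comma count is to post-process the awk output. The sketch below is in Python, and pad_rows is a hypothetical helper, not part of any tool mentioned here. It assumes the steps are lettered A, B, C, ... with no gaps, because the awk output no longer carries the Step attribute; a Level with only steps A and C could not be reconstructed this way.

```python
import string


def pad_rows(lines):
    """Turn ragged 'grade,name,name,' lines into a header plus equal-width rows."""
    rows = []
    for line in lines:
        fields = line.strip().split(",")
        # Drop the empty trailing field(s) left by a trailing comma.
        while fields and fields[-1] == "":
            fields.pop()
        if fields:  # skip lines that consisted only of commas
            rows.append(fields)

    width = max(len(r) for r in rows)
    # Assumption: steps are lettered A, B, C, ... with no gaps.
    header = ["Level"] + ["Content " + string.ascii_uppercase[i] for i in range(width - 1)]
    return [header] + [r + [""] * (width - len(r)) for r in rows]


# The four lines of output shown above.
awk_output = [",", "1,Romero,Julie,Stephen,", "2,Thomas,", "3,Mary,Flora,Michael,Jerry,"]
for row in pad_rows(awk_output):
    print(",".join(row))
```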
 
Old 04-12-2010, 05:57 AM   #4
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by Tinkster
A prototype that needs LOTS more work ;}
Code:
awk 'BEGIN{RS=ORS="<Level Grade";FS=OFS="\n"}{printf "%s,", gensub(/=([0-9]+).*/, "\\1","1",$1);for(i=1;i<=NF;i++){if($i ~ /Content Step/){printf "%s,",gensub(/^[^>]+>([^<]+)<.*/,"\\1","1", $i)}}printf "\n"}' file 
,
1,Romero,Julie,Stephen,
2,Thomas,
3,Mary,Flora,Michael,Jerry,
You get no headers, and the comma count is not the same on
all lines ....
XML is not a line-oriented format. Will your code work for records whose opening and closing tags are not on the same line?

Also, what will your code do with ill-formed XML?
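To make both questions concrete, here is a small Python sketch (an illustration only, not anyone's posted solution) using the standard library's xml.etree.ElementTree. The input assumes quoted attribute values, since that is what well-formed XML requires; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Same kind of data as in the thread, but with tags split across lines.
# Assumption: attribute values are quoted, as well-formed XML requires.
multiline = """<Level
    Grade="1"><Content
    Step="A">Romero</Content><Content Step="B">Julie
</Content></Level>"""

level = ET.fromstring(multiline)
names = [c.text.strip() for c in level.findall("Content")]
print(level.get("Grade"), names)


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An unquoted attribute value is rejected outright instead of being
# silently mis-parsed.
print(is_well_formed("<Level Grade=1></Level>"))
```

A real parser does not care where the line breaks fall, and it fails loudly on ill-formed input, whereas a line-oriented one-liner would most likely just print wrong or partial rows.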
 
Old 04-12-2010, 07:08 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Now I know Sergei is going to beat me about the head and ears, but I just thought I would put up an alternative to Tinkster's
as a learning solution for parsing files in general (although I acknowledge it is not the best way to approach this):

Code:
awk 'BEGIN{RS="</Level>\n";FS=">\n";OFS=","}{for(i=1;i<NF;i++)gsub(/.*[A-Z]>|<\/.*|<L.*=/,"",$i);gsub(OFS"$","")}1' infile
@Tinkster - I have removed the trailing commas, but the headers would take some more work
 
Old 04-12-2010, 07:19 AM   #6
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by grail
Now I know Sergei is going to beat me about the head and ears, but I just thought I would put up an alternative to Tinkster's
as a learning solution for parsing files in general (although I acknowledge it is not the best way to approach this):

Code:
awk 'BEGIN{RS="</Level>\n";FS=">\n";OFS=","}{for(i=1;i<NF;i++)gsub(/.*[A-Z]>|<\/.*|<L.*=/,"",$i);gsub(OFS"$","")}1' infile
@Tinkster - I have removed the trailing commas, but the headers would take some more work
Folks, I've seen too many production environment failures because of "childish" parsers. Childishness is assuming a certain line-orientedness when the language does not impose it.

The only legitimate case for writing a "childish" parser is when its input is generated automatically and whoever writes the parser can guarantee the input format won't change.
 
Old 04-12-2010, 07:34 AM   #7
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,511

Rep: Reputation: 3641
Maybe ...
http://www.google.com/linux?hl=en&q=xml2csv&btnG=Search

http://sourceforge.net/projects/java-xml2csv/files/

At least it can convert the included example with this command:
java -cp bin gs.xml2csv.xml2csv
.....
 
Old 04-12-2010, 07:59 AM   #9
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
@Sergei
Quote:
Folks, I've seen too many production environment failures because of "childish" parsers. Childishness is assuming a certain line-orientedness when the language does not impose it.

The only legitimate case for writing a "childish" parser is when its input is generated automatically and whoever writes the parser can guarantee the input format won't change.
Agree
Just my if it were oriented
 
Old 04-12-2010, 10:55 AM   #10
grob115
Member
 
Registered: Oct 2005
Posts: 542

Original Poster
Rep: Reputation: 32
First, thanks for the responses. Questions:
Sergei, can you show me some examples (maybe online) of these Perl modules? I don't know anything about them, and definitely nothing about gluing modules together.

knudfl, thanks for the link on the xml2csv java binary. Can you tell me where you found the way to input that command? I looked over the Sourceforge site but didn't find any documentation. Do they have both Windows and Linux versions?
 
Old 04-12-2010, 11:09 AM   #11
knudfl
LQ 5k Club
 
Registered: Jan 2008
Location: Copenhagen DK
Distribution: PCLinuxOS2023 Fedora38 + 50+ other Linux OS, for test only.
Posts: 17,511

Rep: Reputation: 3641
cd xml2csv/
cat xml2csv.bat

The file xml2csv.bat is a four-liner;
line 2 is: java -cp bin gs.xml2csv.xml2csv

.....
 
Old 04-12-2010, 11:58 AM   #12
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by grob115
First, thanks for the responses. Questions:
Sergei, can you show me some examples (maybe online) of these Perl modules? I don't know anything about them, and definitely nothing about gluing modules together.

knudfl, thanks for the link on the xml2csv java binary. Can you tell me where you found the way to input that command? I looked over the Sourceforge site but didn't find any documentation. Do they have both Windows and Linux versions?
http://search.cpan.org/search?query=xml+parser&mode=all ->
http://search.cpan.org/~msergeant/XM...2.36/Parser.pm ;
http://search.cpan.org/~andya/XML-De...XML/Descent.pm
.

As for writing the CSV, in your case it's probably faster to write it by hand than to use a module; still,

http://search.cpan.org/search?query=CSV&mode=all ->
http://search.cpan.org/~makamaka/Tex...ib/Text/CSV.pm ;
http://search.cpan.org/~djr/Class-CSV-1.03/CSV.pm ;
http://search.cpan.org/~gwyn/Data-Ta.../Dumper/CSV.pm .
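As a rough illustration of the "by hand versus module" trade-off, here is a Python sketch using its standard csv module (not one of the Perl modules linked above). The helper names and the "tricky" values are hypothetical. Joining by hand is fine for names like those in the sample, but a module starts to pay off once a field can contain a comma or a quote.

```python
import csv
import io


def join_by_hand(fields):
    # Fine as long as no field contains a comma, a quote or a newline.
    return ",".join(fields)


def join_with_csv(fields):
    # The csv module quotes and escapes fields only where needed.
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerow(fields)
    return out.getvalue().rstrip("\n")


simple = ["1", "Romero", "Julie"]
tricky = ["1", "Romero, Jr.", 'Julie "Jules"']  # hypothetical values

print(join_by_hand(simple))
print(join_with_csv(simple))
print(join_by_hand(tricky))
print(join_with_csv(tricky))
```

For the simple row both approaches give the same line. For the tricky row the hand-joined line reads back as four fields instead of three, while the module's output round-trips correctly.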
 
  


All times are GMT -5.