LinuxQuestions.org
Old 06-10-2009, 05:00 PM   #1
coady77
LQ Newbie
 
Registered: Jun 2009
Posts: 5

Rep: Reputation: 0
Rearrange lines in a file


Hi, all

I have a file with a format like the following:

<Fine>
<ItemInfo>
<ItemTitle><![CDATA[Magician's gambit]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemBarcode><![CDATA[T 5395]]></ItemBarcode>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>
<Fine>
<ItemInfo>
<ItemTitle><![CDATA[Sleeping beauty]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemBarcode><![CDATA[T 4563]]></ItemBarcode>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>
.
.
.
</Fine>
more records
</Fine>


How do I rearrange the lines so that, for each record,
<ItemBarcode>...</ItemBarcode> appears before <ItemTitle>...</ItemTitle>?
In other words, the result should be

<Fine>
<ItemInfo>
<ItemBarcode><![CDATA[T 5395]]></ItemBarcode>
<ItemTitle><![CDATA[Magician's gambit]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>
<Fine>
<ItemInfo>
<ItemBarcode><![CDATA[T 4563]]></ItemBarcode>
<ItemTitle><![CDATA[Sleeping beauty]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>


Thank you
 
Old 06-10-2009, 06:55 PM   #2
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,051

Rep: Reputation: 852
the tools you need are probably

grep, sed or awk
 
Old 06-10-2009, 07:10 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1958
It took me a while to get it right, but this seems to do the trick. It assumes that the file has only those fields and nothing else before or after them though.

Code:
awk 'BEGIN{FS="\n"; OFS="\n"; RS="</Fine>\n"}{print $1,$2,$5,$3,$4,$6,$7,"</Fine>"}' file.txt
To explain it: every instance of "</Fine>\n" is treated as a record separator, and each line inside a record is treated as a field. So for each record, the fields are printed in the order indicated, with the </Fine> added back at the end. Note that this relies on an awk such as gawk that accepts a multi-character RS.
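
For readers who don't have awk at hand, the same record-and-field idea can be sketched in Python. This is only an illustration of the technique described above, not David the H.'s code: `reorder_records` and `sample` are made-up names, and the sketch assumes, like the awk version, that every record consists of exactly these seven lines.

```python
def reorder_records(text, order=(0, 1, 4, 2, 3, 5, 6)):
    # Split on "</Fine>" plus newline (like RS), then reorder each record's lines.
    out = []
    for record in text.split("</Fine>\n"):
        if not record.strip():
            continue  # the piece after the last separator is normally empty
        fields = record.split("\n")  # one line per field, like FS="\n"
        # awk prints $1,$2,$5,$3,$4,$6,$7; Python counts from 0 instead of 1.
        out.extend(fields[i] for i in order)
        out.append("</Fine>")  # put the record separator back
    return "\n".join(out) + "\n"


# Hypothetical two-record sample in the format from the first post.
sample = (
    "<Fine>\n<ItemInfo>\n"
    "<ItemTitle><![CDATA[Magician's gambit]]></ItemTitle>\n"
    "<CollectionType><![CDATA[Library]]></CollectionType>\n"
    "<ItemBarcode><![CDATA[T 5395]]></ItemBarcode>\n"
    "<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>\n"
    "</ItemInfo>\n</Fine>\n"
    "<Fine>\n<ItemInfo>\n"
    "<ItemTitle><![CDATA[Sleeping beauty]]></ItemTitle>\n"
    "<CollectionType><![CDATA[Library]]></CollectionType>\n"
    "<ItemBarcode><![CDATA[T 4563]]></ItemBarcode>\n"
    "<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>\n"
    "</ItemInfo>\n</Fine>\n"
)

print(reorder_records(sample), end="")
```

Because the fields are picked by position, any extra line inside a record (or a header before the first record) would shift the indexes, which is the same limitation the awk version has.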

Last edited by David the H.; 06-10-2009 at 07:13 PM. Reason: fixed minor mistake
 
Old 06-10-2009, 07:39 PM   #4
coady77
LQ Newbie
 
Registered: Jun 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Thank you for the quick reply.

Unfortunately it's multiple records. Each record starts with <Fine> and ends with </Fine>.

And the file begins with:
<?xml version="1.0" encoding="UTF-8"?>
<!-- 9.0 (rc16) -->
<FSC-FinesExport version="1" date="20090609">
<Fines count="17800">

So the whole file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!-- 9.0 (rc16) -->
<FSC-FinesExport version="1" date="20090609">
<Fines count="17800">
<Fine>
<FineID>102</FineID>
<FineSiteShortName><![CDATA[abc]]></FineSiteShortName>
<FineDescription><![CDATA[Refund]]></FineDescription>
<FineCreatedDate>20060627</FineCreatedDate>
<FineAmount>-600</FineAmount>
<PatronInfo>
<PatronSiteShortName><![CDATA[SKY]]></PatronSiteShortName>
<PatronDistrictID><![CDATA[1234567]]></PatronDistrictID>
<PatronBarcode><![CDATA[1234567]]></PatronBarcode>
<PatronNameLast><![CDATA[last]]></PatronNameLast>
<PatronNameMiddle><![CDATA[middle]]></PatronNameMiddle>
<PatronNameFirst><![CDATA[first]]></PatronNameFirst>
</PatronInfo>

<ItemInfo>
<ItemTitle><![CDATA[Magician's gambit]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemBarcode><![CDATA[T 5395]]></ItemBarcode>
<ItemSiteShortName><![CDATA[abc]]></ItemSiteShortName>
</ItemInfo>
</Fine>
</Fine>
.
.
.
</Fine>
 
Old 06-10-2009, 08:26 PM   #5
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
if you have Python
Code:
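# Python 3. Splitting on the closing tag gives one chunk per <Fine> record.
with open("file") as f:
    chunks = f.read().split("</Fine>")

out = []
for chunk in chunks:
    lines = chunk.split("\n")
    # Find the title and barcode lines of this record, if it has them.
    title = next((n for n, s in enumerate(lines) if s.startswith("<ItemTitle>")), None)
    barcode = next((n for n, s in enumerate(lines) if s.startswith("<ItemBarcode>")), None)
    # Move the barcode in front of the title only when both are present.
    if title is not None and barcode is not None and title < barcode:
        lines.insert(title, lines.pop(barcode))
    out.append("\n".join(lines))

# Joining on the separator restores every </Fine>, barcode or not.
print("</Fine>".join(out), end="")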
output
Code:
# ./test.py
<Fine>
<ItemInfo>
<ItemBarcode><![CDATA[T 5395]]></ItemBarcode>
<ItemTitle><![CDATA[Magician's gambit]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>
<Fine>
<ItemInfo>
<ItemBarcode><![CDATA[T 4563]]></ItemBarcode>
<ItemTitle><![CDATA[Sleeping beauty]]></ItemTitle>
<CollectionType><![CDATA[Library]]></CollectionType>
<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>
</ItemInfo>
</Fine>
.
.
.
</Fine>
more records
</Fine>

Last edited by ghostdog74; 06-11-2009 at 12:50 AM. Reason: takes care of missing barcode(if any)
 
Old 06-10-2009, 08:49 PM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,254

Rep: Reputation: 2328
If you are going to be doing a lot of XML work, it would be a good idea to use something like Perl, which has a lot of XML-handling modules,
e.g.
http://search.cpan.org/~mirod/XML-Twig-3.32/Twig.pm
http://search.cpan.org/~grantm/XML-S.../XML/Simple.pm

for the full list, just go to search.cpan.org and enter 'xml'
 
Old 06-10-2009, 09:51 PM   #7
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 333

Rep: Reputation: 141
With GNU sed
Code:
sed '/<ItemTitle>/{N;h;d}; /<ItemBarcode>/G'  infile > outfile
 
Old 06-10-2009, 11:17 PM   #8
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by Kenhelm
With GNU sed
Code:
sed '/<ItemTitle>/{N;h;d}; /<ItemBarcode>/G'  infile > outfile
That's just beautiful. :}

The only (potential?) pitfall is: what happens if some
records don't have an ItemBarcode? Will that snippet
grab the Barcode from the next record? And so forth?



Cheers,
Tink
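
One rough way to explore that question is to emulate the one-liner's pattern-space and hold-space steps in Python. The sketch below is an emulation written for illustration, not GNU sed itself; `sed_emulation` is a made-up helper name, and the behaviour when `N` hits the end of input follows GNU sed's default of printing the pattern space.

```python
def sed_emulation(lines):
    # Emulates: sed '/<ItemTitle>/{N;h;d}; /<ItemBarcode>/G'
    out = []
    hold = ""  # sed's hold space starts out empty
    it = iter(lines)
    for pattern in it:
        if "<ItemTitle>" in pattern:
            nxt = next(it, None)
            if nxt is None:
                out.append(pattern)  # N with no next line: print and stop
                break
            hold = pattern + "\n" + nxt  # N joins the next line, h saves both
            continue  # d: nothing is printed for these two lines
        if "<ItemBarcode>" in pattern:
            pattern = pattern + "\n" + hold  # G appends the saved lines
        out.append(pattern)
    # Flatten the multi-line pattern spaces back into single lines.
    return "\n".join(out).split("\n")


with_barcode = [
    "<ItemInfo>",
    "<ItemTitle>A</ItemTitle>",
    "<CollectionType>L</CollectionType>",
    "<ItemBarcode>1</ItemBarcode>",
    "</ItemInfo>",
]
without_barcode = [line for line in with_barcode if "<ItemBarcode>" not in line]

print(sed_emulation(with_barcode))
print(sed_emulation(without_barcode))
```

If the emulation is faithful, a record without an ItemBarcode does not borrow one from the next record. Instead its ItemTitle line and the line after it are held and never printed, because the next ItemTitle overwrites the hold space. So the one-liner seems safe only when every record has both lines, with the barcode somewhere after the title.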
 
Old 06-11-2009, 12:01 AM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by Tinkster
That's just beautiful. :}
beautiful as a one liner. not beautiful when you want to decipher it
 
Old 06-11-2009, 12:12 AM   #10
jamescondron
Member
 
Registered: Jul 2007
Location: Scunthorpe, UK
Distribution: Ubuntu 8.10; Gentoo; Debian Lenny
Posts: 961

Rep: Reputation: 69
coady77: There are lots of great answers here, but you haven't posted any code of your own towards a solution, nor said which tools you'd be comfortable with. Whether it's awk, python, perl or sed (or a few others not mentioned yet), if you can't even specify how you'd want this done, you'll probably not understand the solutions, so this may be out of your range.

I'm not saying that to be nasty, but you're probably better off either paying a contractor (if this is at work) or stepping back and putting this project on hold until you have a better idea (if this is a personal project).

The fact of the matter is that a simple parser, when you know the rules the file uses, is a very, very quick job, as ghostdog74 demonstrated in python; I get the feeling you're not sure about this at all.

Last edited by jamescondron; 06-11-2009 at 12:14 AM. Reason: spelling, eugh
 
Old 06-11-2009, 11:47 AM   #11
coady77
LQ Newbie
 
Registered: Jun 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Thank you for all the help.

I don't have Python, but I am learning sed.

sed "/<ItemTitle>/{N;h;d}; /<ItemBarcode>/G;w c:\out.txt" file.txt

Works great!
 
Old 06-11-2009, 03:12 PM   #12
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 910
Eeeew ... you're using windows?! :D
 
Old 06-11-2009, 04:26 PM   #13
coady77
LQ Newbie
 
Registered: Jun 2009
Posts: 5

Original Poster
Rep: Reputation: 0
Yes, I am. Unfortunately.
 
Old 06-11-2009, 04:48 PM   #14
pg99
Member
 
Registered: May 2008
Location: UK
Distribution: Slackware
Posts: 73

Rep: Reputation: 18
since your data file is XML and you want to transform it to a slightly different format XML, the natural choice is surely to write an XSLT stylesheet.

A simple variant on the identity stylesheet should do it, something like this.
Code:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="node()|@*">
     <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="ItemInfo">
    <xsl:copy>
      <xsl:copy-of select="ItemBarcode"/>
      <xsl:copy-of select="ItemTitle"/>
      <xsl:copy-of select="CollectionType"/>
      <xsl:copy-of select="ItemSiteShortName"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
Save this XSLT to a file. The easiest way to run the transformation on Windows is then to add an xml-stylesheet reference near the top of your XML file and open the XML file in IE, which will run the transformation and show the resulting output.

This is the line you would add to your XML, directly after the XML declaration:
<?xml-stylesheet type="text/xsl" href="path/to/your.xsl"?>

<?xml version="1.0"?>
<-- it goes here
<FirstNode>

I'm a bit puzzled why you need to rearrange it like this though. To an XML-aware application that looks elements up by name, the order of these sibling nodes shouldn't matter; it's the hierarchy that's important.
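
If running XSLT is inconvenient, the same "parse it as XML and move the node" idea can be sketched with Python's standard `xml.etree.ElementTree` module. This is an alternative technique shown for illustration, not the stylesheet above; `barcode_first` and `doc` are made-up names. One caveat: ElementTree reads CDATA sections as plain text and writes them back as ordinary escaped text, so the `<![CDATA[...]]>` wrappers are not preserved. Whether the importing application cares about that is something to check.

```python
import xml.etree.ElementTree as ET


def barcode_first(xml_text):
    # Move each <ItemBarcode> to the front of its parent <ItemInfo>.
    root = ET.fromstring(xml_text)
    for info in root.iter("ItemInfo"):
        barcode = info.find("ItemBarcode")
        if barcode is None:
            continue  # a record without a barcode is left untouched
        info.remove(barcode)
        info.insert(0, barcode)
    return ET.tostring(root, encoding="unicode")


# Hypothetical cut-down document in the shape shown earlier in the thread.
doc = (
    "<Fines><Fine><ItemInfo>"
    "<ItemTitle><![CDATA[Sleeping beauty]]></ItemTitle>"
    "<CollectionType><![CDATA[Library]]></CollectionType>"
    "<ItemBarcode><![CDATA[T 4563]]></ItemBarcode>"
    "<ItemSiteShortName><![CDATA[ALKI]]></ItemSiteShortName>"
    "</ItemInfo></Fine></Fines>"
)

print(barcode_first(doc))
```

Unlike the line-based approaches, this does not depend on each element sitting on its own line, and a record with no barcode passes through unchanged.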

Last edited by pg99; 06-11-2009 at 04:53 PM. Reason: old version wasnt copying the ItemInfo node
 
Old 06-11-2009, 05:08 PM   #15
coady77
LQ Newbie
 
Registered: Jun 2009
Posts: 5

Original Poster
Rep: Reputation: 0
To answer briefly why I need to do this:

An exe file that came with application A extracts data from its SQL DB and produces an XML file ->
the XML file is imported into another application B -> App B processes the info and exports a txt file into App A

Well, app B merges <ItemTitle> and <ItemBarcode> into 1 field. However, this field holds at most 50 characters. So, if <ItemTitle> has more than 50 characters, <ItemBarcode> won't be in that field. Unfortunately, the <ItemBarcode> is what the app A people need.
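
The effect of the 50-character limit can be illustrated with a small Python sketch. How app B actually joins the two values is not stated in the thread, so the space separator, the `merged_field` helper and the sample title are all assumptions made for the example.

```python
FIELD_WIDTH = 50  # the limit described above


def merged_field(first, second, width=FIELD_WIDTH):
    # Assumed merge: join the two values with a space, then cut at the limit.
    return (first + " " + second)[:width]


# Hypothetical title that is longer than the field on its own.
title = "A very long item title that runs well past the fifty character limit"
barcode = "T 5395"

print(merged_field(title, barcode))  # title first: the barcode is cut off
print(merged_field(barcode, title))  # barcode first: the title is what gets cut
```

Under these assumptions, putting the barcode first means the truncation only ever shortens the title.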

So, I am just trying to manipulate the extracted data, hoping it will produce something that satisfies what the app A people want, knowing that asking the App B vendor to add a field to their DB, or to increase the max characters, would probably take months before I hear anything back from them.

So, I am more of a Windows/SQL person and sometimes need to use sed, awk, for and if for things like this.

But thank you very much for all the help.
 
  

