08-31-2004, 11:40 AM
#1
Member
Registered: Aug 2004
Location: bangalore india
Posts: 50
Splitting A Large Xml File
Hi,
I have a 200 MB XML file with 19,845 records, each with a different number of lines.
I want to separate the records and store each one in its own file (a new file for each record). Can anybody help me do this?
How can I do this in Linux, or in Java or C?
I tried using split in Linux, but it only cuts the file every fixed number of lines, not at record boundaries as I need.
Please help.

08-31-2004, 02:56 PM
#2
Senior Member
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140
What do your records look like?

08-31-2004, 03:10 PM
#3
Moderator
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,903
awk, Perl or Python would all lend themselves to such tasks :)
Cheers,
Tink

09-01-2004, 06:12 AM
#4
Member
Registered: Aug 2004
Location: bangalore india
Posts: 50
Original Poster
Each of my records looks like the one below:
<party xmlns:defns="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:ns5="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:ns4="http://INDL060BB:8080/home/SCOTT/AIG/xsd/" xmlns:ns3="http://INDL060BB:8080/home/SCOTT/AIG/xsd/" xmlns:ns2="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:ns1="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://INDL060BB:8080/home/SCOTT/AIG/xsd/party.xsd">
  <partyGdr>
    <addressGdr>
      <address />
      <addressType>Physical</addressType>
    </addressGdr>
    <aigLegalEntity>
      <aigLegalEntityType>AIGC</aigLegalEntityType>
      <corporateSplit>
        <domesticOrForeign>Foreign</domesticOrForeign>
        <generalOrLife>General</generalOrLife>
      </corporateSplit>
      <fcrClassificationCode>ForeignOffsGenEurB</fcrClassificationCode>
    </aigLegalEntity>
    <currentOwner>CICADA</currentOwner>
    <industryClassification>
      <activity>
        <activityType activityTypeScheme="" />
        <activityCode />
      </activity>
    </industryClassification>
    <lastUpdate>
      <timestamp>2004-07-02T08:47:58.000000</timestamp>
      <user>magellan</user>
    </lastUpdate>
    <names>
      <sequenceNumber>1</sequenceNumber>
      <legalName>American International Underwriters Overseas Association</legalName>
      <longName>American International Underwriters Overseas Association</longName>
      <shortName>American International Underwriters Overseas Association</shortName>
    </names>
    <originalOwner>CICADA</originalOwner>
    <parentage>
      <immediate>
        <partyId>AC0000755</partyId>
        <partyName>American International Group, Inc.</partyName>
        <providerAssignedId />
      </immediate>
    </parentage>
    <partyType partyTypeScheme="Party">AIG LEGAL ENTITY</partyType>
    <processingDirective>MOD</processingDirective>
    <processingDirectiveIssuer>GDR</processingDirectiveIssuer>
    <processingDirectiveDate>2004-04-23</processingDirectiveDate>
    <recordStatus>Active</recordStatus>
    <sourceSystem>
      <aigClientId>AIUOA</aigClientId>
      <aigClientParentId>AIG</aigClientParentId>
      <counterPartyName>American International Underwriters Overseas Association</counterPartyName>
      <internalId>D326C</internalId>
      <lastUpdate>
        <timestamp>2004-07-02T08:47:58.000000</timestamp>
        <user>magellan</user>
      </lastUpdate>
      <reportingDate>2003-03-31</reportingDate>
      <systemId>CPP</systemId>
    </sourceSystem>
  </partyGdr>
  <partyId>AC0000169</partyId>
  <partyName>American International Underwriters Overseas Association</partyName>
</party>

09-02-2004, 01:25 PM
#5
Member
Registered: Aug 2004
Location: bangalore india
Posts: 50
Original Poster
SAX in Java
Hi there,
I found that SAX can be used to do this, but I don't know Java. Can anybody help me split the file this way? Each record looks like the XML above.
Please.

09-02-2004, 06:35 PM
#6
Senior Member
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140
Try this code (in Perl):
Code:
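#!/usr/bin/perl
use strict;
use warnings;

# Configure these three variables before running.
my $xml_file    = "records.xml";
my $output_dir  = "/home/me/output";
my $file_prefix = "result_";

my $open  = 0;    # true while a record is being written
my $count = 0;    # number of records found so far
my $result;       # handle of the current output file

open my $xml, '<', $xml_file or die "can't open $xml_file: $!";
while (<$xml>) {
    if (/^<party\sxmlns/) {
        # Opening tag at the start of a line: begin a new output file.
        print "New record found\nCreating $file_prefix$count\n";
        open $result, '>', "$output_dir/$file_prefix$count"
            or die "Error : can't open $output_dir/$file_prefix$count: $!";
        print $result $_;
        $open = 1;
        $count++;
    } elsif (/^<\/party>/) {
        # Closing tag: write it and finish the current file.
        if ($open) {
            print $result $_;
            close $result;
            $open = 0;
        }
    } elsif ($open) {
        # Any other line inside a record is copied through unchanged.
        print $result $_;
    }
}
close $xml;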
Set the variables at the top of the script, then chmod +x it and run it with ./
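The script above matches lines rather than parsing XML, so it depends on each record's opening and closing party tags starting at the beginning of a line. A parser-based alternative, in the spirit of the SAX idea raised earlier, might look like the sketch below. It is written in Python (one of the languages suggested above) rather than Java, and it uses the standard library's xml.etree.ElementTree.iterparse rather than SAX itself. It assumes all records sit inside a single root element, which this thread does not confirm; the function name split_with_parser and its parameters are made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def split_with_parser(xml_path, output_dir, record_tag="party", prefix="result_"):
    """Stream through xml_path, writing each <record_tag> element to its own file.

    Assumes a well-formed document with one root element wrapping the records.
    Returns the number of records written.
    """
    os.makedirs(output_dir, exist_ok=True)
    count = 0
    # "end" events fire once an element and all of its children have been read.
    for _event, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag != record_tag:
            continue
        out_path = os.path.join(output_dir, f"{prefix}{count}")
        ET.ElementTree(elem).write(out_path, encoding="utf-8", xml_declaration=True)
        count += 1
        # Drop the record's children so memory use stays small on a large file.
        elem.clear()
    return count
```

Because each record is re-serialized, the output is equivalent XML but not byte-identical to the input; for instance, namespace declarations that no element or attribute actually uses are not written back out.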
Last edited by Cedrik; 09-02-2004 at 06:42 PM.

09-03-2004, 04:15 AM
#7
Member
Registered: Aug 2004
Location: bangalore india
Posts: 50
Original Poster
Hi Cedrik, thanks a lot, the program worked.

09-03-2004, 04:28 AM
#8
Senior Member
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140
Good. You may want to learn a little Perl to adapt the script to your needs, for example so that it takes the XML file and output directory as arguments rather than hard-coding them.
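As a rough illustration of that suggestion, here is the same line-based approach with the input file and output directory taken from the command line. It is a sketch in Python rather than Perl (Python was also suggested earlier in the thread), not a port anyone in the thread posted. The names split_records and main are illustrative, and UTF-8 input is an assumption.

```python
#!/usr/bin/env python3
"""Split an XML dump into one file per <party> record (line-based sketch)."""
import os
import re
import sys

# Same markers as the Perl script: a record opens with "<party xmlns..." and
# closes with "</party>", each at the start of a line.
RECORD_START = re.compile(r"^<party\sxmlns")
RECORD_END = re.compile(r"^</party>")


def split_records(xml_path, output_dir, prefix="result_"):
    """Write each record to output_dir/<prefix>N and return the record count."""
    os.makedirs(output_dir, exist_ok=True)
    count = 0
    out = None  # currently open record file, if any
    with open(xml_path, encoding="utf-8") as xml_file:  # encoding is an assumption
        for line in xml_file:
            if RECORD_START.match(line):
                if out is not None:
                    out.close()
                path = os.path.join(output_dir, f"{prefix}{count}")
                out = open(path, "w", encoding="utf-8")
                count += 1
            if out is not None:
                out.write(line)
                if RECORD_END.match(line):
                    out.close()
                    out = None
    if out is not None:  # last record had no closing tag
        out.close()
    return count


def main(argv):
    """Parse the two positional arguments and run the split."""
    if len(argv) != 3:
        print(f"usage: {argv[0]} XML_FILE OUTPUT_DIR", file=sys.stderr)
        return 2
    total = split_records(argv[1], argv[2])
    print(f"{total} records written to {argv[2]}")
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Saved under a hypothetical name such as split_records.py, it would be run as: python3 split_records.py records.xml /home/me/output. Files are numbered from 0, as in the Perl script.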