Splitting a Large XML File
Hi,
I have a 200 MB XML file with 19,845 records, each with a different number of lines. I want to separate the records and store each one in a new file (one new file per record). Can anybody help me with this? How can I do it in Linux, or in Java or C? I tried using split in Linux, but by default it cuts after a fixed number of lines, not at the record boundaries I need. Please help. |
What do your records look like?
|
awk, Perl or Python would lend themselves to
such tasks :) Cheers, Tink |
Each of my records looks like the code given below:
<party xmlns:defns="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" xmlns:ns5="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:ns4="http://INDL060BB:8080/home/SCOTT/AIG/xsd/" xmlns:ns3="http://INDL060BB:8080/home/SCOTT/AIG/xsd/" xmlns:ns2="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:ns1="http://INDL060BB:8080/home/SCOTT/AIG/xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://INDL060BB:8080/home/SCOTT/AIG/xsd/party.xsd">
  <partyGdr>
    <addressGdr>
      <address />
      <addressType>Physical</addressType>
    </addressGdr>
    <aigLegalEntity>
      <aigLegalEntityType>AIGC</aigLegalEntityType>
      <corporateSplit>
        <domesticOrForeign>Foreign</domesticOrForeign>
        <generalOrLife>General</generalOrLife>
      </corporateSplit>
      <fcrClassificationCode>ForeignOffsGenEurB</fcrClassificationCode>
    </aigLegalEntity>
    <currentOwner>CICADA</currentOwner>
    <industryClassification>
      <activity>
        <activityType activityTypeScheme="" />
        <activityCode />
      </activity>
    </industryClassification>
    <lastUpdate>
      <timestamp>2004-07-02T08:47:58.000000</timestamp>
      <user>magellan</user>
    </lastUpdate>
    <names>
      <sequenceNumber>1</sequenceNumber>
      <legalName>American International Underwriters Overseas Association</legalName>
      <longName>American International Underwriters Overseas Association</longName>
      <shortName>American International Underwriters Overseas Association</shortName>
    </names>
    <originalOwner>CICADA</originalOwner>
    <parentage>
      <immediate>
        <partyId>AC0000755</partyId>
        <partyName>American International Group, Inc.</partyName>
        <providerAssignedId />
      </immediate>
    </parentage>
    <partyType partyTypeScheme="Party">AIG LEGAL ENTITY</partyType>
    <processingDirective>MOD</processingDirective>
    <processingDirectiveIssuer>GDR</processingDirectiveIssuer>
    <processingDirectiveDate>2004-04-23</processingDirectiveDate>
    <recordStatus>Active</recordStatus>
    <sourceSystem>
      <aigClientId>AIUOA</aigClientId>
      <aigClientParentId>AIG</aigClientParentId>
      <counterPartyName>American International Underwriters Overseas Association</counterPartyName>
      <internalId>D326C</internalId>
      <lastUpdate>
        <timestamp>2004-07-02T08:47:58.000000</timestamp>
        <user>magellan</user>
      </lastUpdate>
      <reportingDate>2003-03-31</reportingDate>
      <systemId>CPP</systemId>
    </sourceSystem>
  </partyGdr>
  <partyId>AC0000169</partyId>
  <partyName>American International Underwriters Overseas Association</partyName>
</party> |
SAX in Java
Hi there,
I found that SAX can be used to do this, but I don't know Java. Can anybody help me do this splitting? Each record looks like the XML above. Please help. |
Try this code (in Perl):
Code:
#!/usr/bin/perl |
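A minimal sketch of the one-file-per-record split in Python (one of the languages suggested earlier in the thread). This is not the Perl script referred to above. It assumes every record is a `<party>` element sitting directly under a single root element, which the thread does not show. The function name `split_records` and the `record_00001.xml` file naming are made up for illustration.

```python
import os
import xml.etree.ElementTree as ET


def split_records(xml_path, out_dir, record_tag="party"):
    """Write each <record_tag> child of the root element to its own file.

    Assumption: records are direct children of one root element.
    Returns the number of files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    depth = 0
    root = None
    # iterparse reads the file as a stream instead of loading all of it.
    for event, elem in ET.iterparse(xml_path, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first element seen is the root
            depth += 1
            continue
        depth -= 1
        # After the decrement, depth == 1 means elem is a direct child of root.
        if depth == 1 and elem.tag == record_tag:
            count += 1
            elem.tail = None  # do not carry inter-record whitespace along
            out_path = os.path.join(out_dir, "record_%05d.xml" % count)
            ET.ElementTree(elem).write(
                out_path, encoding="utf-8", xml_declaration=True
            )
            root.remove(elem)  # release the record once it is written
    return count
```

Calling `split_records("parties.xml", "out")` (both paths hypothetical) would create `out/record_00001.xml`, `out/record_00002.xml` and so on. One caveat: ElementTree only writes the namespace declarations that an element or attribute actually uses, so unused `xmlns:...` declarations on `<party>` would not appear in the output files.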
Hi cedrik, thanks a lot, the program worked
:) |
;) Good. You may want to learn a little Perl to adapt the script to your needs, for example so that it takes the XML file and the output directory as arguments rather than hard-coding them...
|
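As a sketch of that suggestion in Python rather than Perl, the two paths can be read from the command line with the standard `argparse` module. The function name `parse_args` and the argument names `xml_file` and `output_dir` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Read the input XML path and the output directory from the command line."""
    parser = argparse.ArgumentParser(
        description="Split an XML file into one file per record."
    )
    parser.add_argument("xml_file", help="path to the large input XML file")
    parser.add_argument("output_dir", help="directory that receives the record files")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)
```

A script built around this could be run as `python split.py parties.xml out/` (file names hypothetical), and the paths would then be available as `args.xml_file` and `args.output_dir`.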