LinuxQuestions.org
10-27-2010, 09:15 AM   #16
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 336

Rep: Reputation: 141

This works for me with the posted file fragment.
The --recover option tells xmllint to
"Output any parsable portions of an invalid document" (from the man page).
It tries to repair the incomplete XML by adding the missing end tags.

Code:
sed -n '/^ <document>/,/^$/s/^ //p' file | tr -d '\n' | xmllint --format --recover -

-:1: parser error : Couldn't find end of Start Tag objectFie line 1
tLifePercentage</fieldName><fieldValue>0.00</fieldValue></objectField><objectFie
                                                                               ^
-:1: parser error : Premature end of data in tag level2Object line 1
tLifePercentage</fieldName><fieldValue>0.00</fieldValue></objectField><objectFie
                                                                               ^
-:1: parser error : Premature end of data in tag level1Object line 1
tLifePercentage</fieldName><fieldValue>0.00</fieldValue></objectField><objectFie
                                                                               ^
-:1: parser error : Premature end of data in tag level0Object line 1
tLifePercentage</fieldName><fieldValue>0.00</fieldValue></objectField><objectFie
                                                                               ^
-:1: parser error : Premature end of data in tag document line 1
tLifePercentage</fieldName><fieldValue>0.00</fieldValue></objectField><objectFie
                                                                               ^
<?xml version="1.0"?>
<document>
  <docRequestID>2010-10-22-11.57.22.903813</docRequestID>
  <docStylesheet>Thunderhead</docStylesheet>
  <requestType>claim</requestType>
  <level0Object>
    <objectType>transaction</objectType>
    <objectID>900</objectID>
    <objectSeq>1</objectSeq>
    <level1Object>
      <objectType>lifelite</objectType>
      <objectID>901</objectID>
      <objectSeq>1</objectSeq>
      <level2Object>
        <objectType>documentHeader</objectType>
        <objectID>100</objectID>
        <objectSeq>1</objectSeq>
        <objectField>
          <fieldID>1500</fieldID>
          <fieldName>transactionType</fieldName>
          <fieldValue>6</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1501</fieldID>
          <fieldName>lifeliteReference</fieldName>
          <fieldValue>000231263</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1502</fieldID>
          <fieldName>requestorUserid</fieldName>
          <fieldValue>LV20073</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1503</fieldID>
          <fieldName>requestDate</fieldName>
          <fieldValue>2010-10-22</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1504</fieldID>
          <fieldName>requestTime</fieldName>
          <fieldValue>6</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1505</fieldID>
          <fieldName>busProcess</fieldName>
          <fieldValue>LLP0101</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1506</fieldID>
          <fieldName>insert</fieldName>
          <fieldValue>N</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1507</fieldID>
          <fieldName>adviserName</fieldName>
          <fieldValue>PHIL</fieldValue>
        </objectField>
      </level2Object>
      <level2Object>
        <objectType>recipient</objectType>
        <objectID>110</objectID>
        <objectSeq>2</objectSeq>
        <objectField>
          <fieldID>1510</fieldID>
          <fieldName>rcpntPartyId</fieldName>
          <fieldValue>7510134</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1511</fieldID>
          <fieldName>companyCode</fieldName>
          <fieldValue>LVG</fieldValue>
        </objectField>
      </level2Object>
      <level2Object>
        <objectType>claim</objectType>
        <objectID>120</objectID>
        <objectSeq>3</objectSeq>
        <objectField>
          <fieldID>1107</fieldID>
          <fieldName>claimRef</fieldName>
          <fieldValue>V1058036</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1108</fieldID>
          <fieldName>totalClaimAmount</fieldName>
          <fieldValue>10000.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1109</fieldID>
          <fieldName>totalGroupClaimAmt</fieldName>
          <fieldValue>10000.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1533</fieldID>
          <fieldName>totalFundAmt</fieldName>
          <fieldValue>100000.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1110</fieldID>
          <fieldName>trivialityInd</fieldName>
          <fieldValue>N</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1111</fieldID>
          <fieldName>reducedPensionAmt</fieldName>
          <fieldValue>3750.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1112</fieldID>
          <fieldName>firstPaymentDate</fieldName>
          <fieldValue>2010-11-19</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1113</fieldID>
          <fieldName>paymentType</fieldName>
          <fieldValue>IN ADVANCE</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1114</fieldID>
          <fieldName>paymentInterval</fieldName>
          <fieldValue>QUARTERLY</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1115</fieldID>
          <fieldName>lumpSumAmt</fieldName>
          <fieldValue>25000.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1116</fieldID>
          <fieldName>residualSum</fieldName>
          <fieldValue>75000.00</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1117</fieldID>
          <fieldName>slaPerc</fieldName>
          <fieldValue>0.000</fieldValue>
        </objectField>
        <objectField>
          <fieldID>1118</fieldID>
          <fieldName>jointLifePercentage</fieldName>
          <fieldValue>0.00</fieldValue>
        </objectField>
        <objectFie/>
      </level2Object>
    </level1Object>
  </level0Object>
</document>

Last edited by Kenhelm; 10-27-2010 at 09:29 AM.
 
10-27-2010, 10:40 AM   #17
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
Quote:
Are you not getting the line spaces in the extracted xml then? I get one on every line.. any ideas how i can stop this?
No, I am not. As I said, I suspect your input file was written and saved under Windows. Do you have dos2unix or a similar tool that you can run over the file?
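One way to test that guess without any extra tools is sketched below. It assumes the symptom comes from Windows (CRLF) line endings; the sample file is made up for the demonstration and stands in for the real log. If dos2unix is installed, running it over the file should have much the same effect as the tr step.

```shell
#!/bin/bash
# Hypothetical sample file with Windows (CRLF) line endings,
# created in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
printf ' <document>\r\n <a>1</a>\r\n </document>\r\n\r\n' > "$workdir/sample.log"

# Count the lines that end in a carriage return.
# A non-zero count suggests the file was saved with DOS line endings.
crlf_lines=$(grep -c $'\r$' "$workdir/sample.log" || true)
echo "lines ending in CR: $crlf_lines"

# Strip every carriage return and write a cleaned copy.
tr -d '\r' < "$workdir/sample.log" > "$workdir/sample.unix.log"

# Re-check the cleaned copy; grep -c prints 0 when nothing matches.
remaining=$(grep -c $'\r$' "$workdir/sample.unix.log" || true)
echo "lines ending in CR after cleanup: $remaining"
```

Carriage returns also matter for a sed range that ends at /^$/: a line holding only a carriage return is not empty as far as sed is concerned, so the range would not stop where expected.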
 
10-28-2010, 04:10 AM   #18
hugh86
LQ Newbie
 
Registered: Oct 2010
Posts: 9

Original Poster
Rep: Reputation: 0
@Kenhelm

Did your code work, then? Did you just run that line, or did you use it alongside the script I posted? I'm new to all this and not sure whether it is an addition to what I have already done.

Code:
#!/bin/bash
echo "getXML"

echo -n "Enter the source file name WITH extension : "
read -r infile
echo "Processing... : "
sleep 1
echo -n "Enter output file name (extension not applicable) : "
read -r outfile
# Quote the file names so that names containing spaces still work.
sed -n '/Sending XML/,/Message sending ended/p' "${infile}" > "${outfile}"
echo "Processing XML... : "
sleep 1
echo "Success..Data should be in '$outfile' if compiled correctly"

Thank you
 
10-28-2010, 05:17 AM   #19
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 336

Rep: Reputation: 141
I just ran the line of code I posted.
This is your script with the code inserted.
Code:
#!/bin/bash
echo "getXML"

echo -n "Enter the source file name WITH extension : "
read -r infile
echo "Processing... : "
sleep 1
echo -n "Enter output file name (extension not applicable) : "
read -r outfile

# Select the XML block, strip the leading space, join the lines, then pretty-print.
sed -n '/^ <document>/,/^$/s/^ //p' "${infile}" |
tr -d '\n' |
xmllint --format - > "${outfile}"

echo "Processing XML... : "
sleep 1
echo "Success..Data should be in '$outfile' if compiled correctly"
/^ <document>/,/^$/ selects the lines from ' <document>' through the next empty line.
s/^ // removes the single leading space from each line. Because of the -n option and the p flag, only lines where that substitution succeeds are printed, so the empty line that ends the range is dropped.
tr -d '\n' removes the newlines, putting all the XML onto a single line.
If the XML is well-formed you shouldn't need xmllint's '--recover' option, but if you get parsing error messages, try putting it back in.
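To see what the sed and tr stages do on their own, here is a small sketch that leaves xmllint out. The sample log is invented for the demonstration: it only assumes, as in the posted fragment, that each XML line carries one leading space and that an empty line follows the XML.

```shell
#!/bin/bash
# Hypothetical sample log written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2010-10-22 Sending XML
 <document>
 <requestType>claim</requestType>
 </document>

Message sending ended
EOF

# Select from ' <document>' to the next empty line, strip one leading
# space from each selected line, then join the lines into one.
one_line=$(sed -n '/^ <document>/,/^$/s/^ //p' "$sample" | tr -d '\n')
echo "$one_line"
# prints: <document><requestType>claim</requestType></document>

rm -f "$sample"
```

The lines before ' <document>' and after the empty line never reach tr, which is why the surrounding log text does not end up in the output.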
 
  

