Hey guys,
I have a huge XML file like this:
Code:
<manufacturers>
<manufacturer_data>
<action>UPDATE</action>
<mfr_id>6515951</mfr_id>
<local_content>0</local_content>
<name>Johnsonville Sausage, Llc</name>
</manufacturer_data>
<manufacturer_data>
<action>INSERT</action>
<mfr_id>6594084</mfr_id>
<local_content>0</local_content>
<name>Foodmark</name>
</manufacturer_data>
</manufacturers>
<brands>
<brand_data>
<action>INSERT</action>
<brand_id>6594088</brand_id>
<mfr_id>6594084</mfr_id>
<local_content>0</local_content>
<name>Good Food Made Simple</name>
</brand_data>
<brand_data>
<action>INSERT</action>
<brand_id>6523125</brand_id>
<mfr_id>105873</mfr_id>
<local_content>0</local_content>
<name>Hawaiian(Tm) Kettle Style Potato Chips</name>
</brand_data>
<brand_data>
</brands>
Yesterday I asked for help extracting mfr_id from the list, and I used
Code:
grep mfr_id | sed -rn 's@</?mfr_id>@@gp'
to extract the IDs, which I then sorted and de-duplicated for my actual analysis.
Today, I am looking to extract <mfr_id> and <name>, but only from the <manufacturer_data> blocks.
The issue I am having:
- sed is extracting all instances of <name>, including the ones under <brand_data>.
So I need to:
- tell sed to "hold" the data between the <manufacturer_data> tags, search only within that block, strip the <mfr_id> and <name> tags, and print the two values in columns.
This is a little above my league. Can someone help me out?
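For reference, here is a rough sketch of the kind of thing being asked for. It uses awk rather than sed's hold space, since tracking "am I inside a <manufacturer_data> block" is easier to read that way. It assumes each element sits on its own line, as in the sample above, and it is not a real XML parser. The function name extract_mfr is made up for illustration.

```shell
# Hypothetical helper: prints "mfr_id<TAB>name" for each <manufacturer_data> block.
# Assumes one element per line, as in the sample; not a general XML parser.
extract_mfr() {
    awk '
    # Remove the opening/closing tag and any surrounding whitespace from a line.
    function strip(s, tag) {
        gsub("</?" tag ">", "", s)
        gsub(/^[ \t]+|[ \t\r]+$/, "", s)
        return s
    }
    # Entering a manufacturer block: start collecting and reset the fields.
    /<manufacturer_data>/   { inside = 1; id = ""; name = ""; next }
    # Leaving the block: print what was collected as two tab-separated columns.
    /<\/manufacturer_data>/ { if (inside) print id "\t" name; inside = 0; next }
    # Only look at mfr_id and name while inside a manufacturer block.
    inside && /<mfr_id>/    { id = strip($0, "mfr_id") }
    inside && /<name>/      { name = strip($0, "name") }
    ' "$@"
}
```

With the function defined, something like extract_mfr file.xml | sort -u (file.xml being a placeholder for the real file) should give one ID/name pair per manufacturer and skip the <name> entries under <brand_data>.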