LinuxQuestions.org - Scripting Help (Complex)

So, here's what I have and what I need.

I have approximately 800 pairs of files, each pair a .log and a .xml:

5559_27589_TVE_HULU_AE5559_AEN_BSST_158395_TVE_000_2398_60_20150721_000.log
5559_27589_nmr.xml

I also have an Excel spreadsheet of metadata.

The XML has specific fields that are empty and need the metadata inserted:

<NMRVodMetadata>
<ADIFileName/>
<DsrcId></DsrcId>
<DsrcName></DsrcName>
<AssetId></AssetId>
<AssetName></AssetName>
<EpisodeId></EpisodeId>
<WatermarkedTime>01/06/17 14:22:35</WatermarkedTime>
<SID>5559</SID>
<StartingTIC>27589</StartingTIC>
<EndingTIC>30179</EndingTIC>
<AllocatedStartTIC>27589</AllocatedStartTIC>
<AllocatedEndTIC>30206</AllocatedEndTIC>
<TicsRemaining>268405248</TicsRemaining>
<HDContent></HDContent>
<FileSizeBytes>1490688000</FileSizeBytes>
<FileSizeSeconds>2588</FileSizeSeconds>
<ApplicationName></ApplicationName>
<ApplicationVersion></ApplicationVersion>
<EncoderEngineName>Nielsen Watermark Engine</EncoderEngineName>
<EncoderEngineVersion>1.9.15</EncoderEngineVersion>
</NMRVodMetadata>

The orange fields are static values. The yellow fields are information that exists in the Excel doc.

Basically, for all the XMLs, we need to add the static values into the orange fields.

AND

- We need to correlate the ID# in the first column (158395) with the .log file, which has the value in the file name.
- Then, we need to correlate that specific .log file with the corresponding .xml file that shares the same prefix (5559_27589).
- Then we need to take the information in the columns next to the ID# and populate the blank fields in the XML with that information.

Any help in automating this process or giving me some CLI commands would be greatly appreciated.

Thanks.

Ok, so what have you done so far?

Hi mermelmadness and welcome to LQ,

Please read the link in TenTenth's post about how to ask a question, and here is a repeat of that link: How to ask a Question.

The point here is that while you have described the input and your desired output, you have not discussed what language you intend to do your project in, and whether or not you have tried to solve any part of this yet.

LQ members are volunteers and not here to provide "on demand" code. Instead we volunteer to help you to learn the tools you need to accomplish your goals. Along the way, when you show some effort, people will gladly offer suggested improvements.

What are your ideas for automation? Were you planning to write a script or a program? What CLI commands are you considering and have you checked the manual pages?

As above, but also, please use [code][/code] tags around your code and data as this will help maintain the formatting.

One suggestion would be to convert the Excel docs to CSV text files so they can be read easily.
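
As a rough illustration of why CSV helps, here is a minimal Python sketch that loads an exported sheet into a lookup table keyed by the ID#. It assumes the export has a header row and that the ID# is the first column, as described in the original post. The column names and sample values are made up for the example.

```python
import csv
import io

def load_metadata(csv_file):
    """Return {id: row_dict}, keyed by whatever the first column is called."""
    reader = csv.DictReader(csv_file)
    id_column = reader.fieldnames[0]  # assumption: the ID# is in the first column
    return {row[id_column].strip(): row for row in reader}

# Made-up sample standing in for the exported spreadsheet.
sample = io.StringIO(
    "ID,AssetName,EpisodeId\n"
    "158395,Example Asset,E01\n"
)
metadata = load_metadata(sample)
print(metadata["158395"]["AssetName"])
```

With a real export you would pass `open("metadata.csv", newline="")` instead of the `io.StringIO` sample (the file name is hypothetical).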

Well, I'm a newb myself, so for now I'd probably export the Excel file to CSV, and then go crazy with grep, head/tail/cut and paste.
The correlations you need should not be that big of a deal with the string-matching conditionals generally available in bash.
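
The same string matching can be sketched in Python instead of bash (a swapped-in language, not what the post above proposes). This assumes the file names always follow the pattern in the original post: the first two underscore-separated fields form the shared prefix, and the ID# appears as one of the later fields. The function names are hypothetical.

```python
def name_fields(log_name):
    """Split a .log file name into its underscore-separated fields."""
    return log_name.rsplit(".", 1)[0].split("_")

def xml_name_for_log(log_name):
    """Build the paired .xml name from the first two fields of the .log name."""
    return "_".join(name_fields(log_name)[:2]) + "_nmr.xml"

def find_id(log_name, known_ids):
    """Return the first field that matches a spreadsheet ID, or None."""
    # Assumption: skip the two prefix fields so an SID cannot be mistaken for an ID#.
    for field in name_fields(log_name)[2:]:
        if field in known_ids:
            return field
    return None

log = "5559_27589_TVE_HULU_AE5559_AEN_BSST_158395_TVE_000_2398_60_20150721_000.log"
print(xml_name_for_log(log))     # 5559_27589_nmr.xml
print(find_id(log, {"158395"}))  # 158395
```

Matching against the set of known IDs, rather than a fixed field position, avoids relying on the ID# always being the eighth field.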

Quote:

Originally Posted by mermelmadness (Post 5661157)

The XML has specific fields that are empty and need the metadata inserted:

Perl does XML parsing rather easily. Try looking at the XML::TreeBuilder::XPath module or one of the other XML parsers. There are also modules which can read CSV.
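
For comparison, here is a minimal sketch of the same idea using Python's standard `xml.etree.ElementTree` rather than a Perl module. It fills only elements that are currently empty and leaves populated ones untouched. The sample document is a cut-down version of the XML in the original post, and the inserted values are made up.

```python
import xml.etree.ElementTree as ET

def fill_empty_fields(xml_text, values):
    """Set text on child elements that are empty; leave filled ones alone."""
    root = ET.fromstring(xml_text)
    for tag, value in values.items():
        elem = root.find(tag)
        if elem is not None and not (elem.text or "").strip():
            elem.text = value
    return ET.tostring(root, encoding="unicode")

sample = """<NMRVodMetadata>
<ADIFileName/>
<AssetName></AssetName>
<SID>5559</SID>
</NMRVodMetadata>"""

# Made-up values; "SID" is included only to show that filled fields are not overwritten.
result = fill_empty_fields(
    sample,
    {"ADIFileName": "example.xml", "AssetName": "Example Asset", "SID": "0"},
)
print(result)
```

Note that a self-closing element such as `<ADIFileName/>` is written back as an open/close pair once it has text.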