01-27-2017, 04:35 PM
|
#1
|
LQ Newbie
Registered: Jan 2017
Location: Brooklyn, NY
Posts: 3
|
Scripting Help (Complex)
So, here's what I have and what I need.
I have approximately 800 pairs of files, each pair consisting of a .log and an .xml:
5559_27589_TVE_HULU_AE5559_AEN_BSST_158395_TVE_000_2398_60_20150721_000.log
5559_27589_nmr.xml
I also have an Excel spreadsheet of metadata.
The XML has specific fields that are empty and need the metadata inserted:
<NMRVodMetadata>
<ADIFileName/>
<DsrcId></DsrcId>
<DsrcName></DsrcName>
<AssetId></AssetId>
<AssetName></AssetName>
<EpisodeId></EpisodeId>
<WatermarkedTime>01/06/17 14:22:35</WatermarkedTime>
<SID>5559</SID>
<StartingTIC>27589</StartingTIC>
<EndingTIC>30179</EndingTIC>
<AllocatedStartTIC>27589</AllocatedStartTIC>
<AllocatedEndTIC>30206</AllocatedEndTIC>
<TicsRemaining>268405248</TicsRemaining>
<HDContent></HDContent>
<FileSizeBytes>1490688000</FileSizeBytes>
<FileSizeSeconds>2588</FileSizeSeconds>
<ApplicationName></ApplicationName>
<ApplicationVersion></ApplicationVersion>
<EncoderEngineName>Nielsen Watermark Engine</EncoderEngineName>
<EncoderEngineVersion>1.9.15</EncoderEngineVersion>
</NMRVodMetadata>
The orange fields are static values. The yellow fields are information that exists in the Excel doc.
Basically, for all the XMLs, we need to add the static values into the orange fields.
AND
- We need to correlate the ID# in the first column of the spreadsheet (158395) with the .log file that has that value in its file name.
- Then, we need to correlate that specific .log file with the corresponding .xml file that shares the same prefix (5559_27589).
- Then we need to take the information in the columns next to the ID# and populate the blank fields in the XML with that information.
Any help in automating this process or giving me some CLI commands would be greatly appreciated.
Thanks.
|
|
|
01-31-2017, 04:51 AM
|
#2
|
Senior Member
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7 / 8
Posts: 3,538
|
Ok, so what have you done so far?
|
|
|
01-31-2017, 07:38 AM
|
#3
|
Moderator
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,938
|
Hi mermelmadness and welcome to LQ,
Please read the link in TenTenth's post about how to ask a question, and here is a repeat of that link: How to ask a Question.
The point here is that while you have described the input and your desired output, you have not discussed what language you intend to do your project in, and whether or not you have tried to solve any part of this yet.
LQ members are volunteers and not here to provide "on demand" code. Instead we volunteer to help you to learn the tools you need to accomplish your goals. Along the way, when you show some effort, people will gladly offer suggested improvements.
What are your ideas for automation? Were you planning to write a script or a program? What CLI commands are you considering and have you checked the manual pages?
|
|
|
01-31-2017, 07:41 AM
|
#4
|
LQ Guru
Registered: Sep 2009
Location: Perth
Distribution: Arch
Posts: 10,030
|
As above, but also, please use [code][/code] tags around your code and data as this will help maintain the formatting.
One suggestion would be to convert the Excel docs to CSV text files so they can be easily read.
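As a rough illustration of why CSV helps, here is a minimal Python sketch that loads an exported sheet into a lookup table keyed by the ID column. The column names (`ID`, `AssetName`, `EpisodeId`) and the sample row are assumptions made up for the example; the real spreadsheet headers are not shown in this thread.

```python
import csv
import io

def load_metadata(csv_file, id_column="ID"):
    """Return {id: row_dict} from a CSV file object that has a header row."""
    reader = csv.DictReader(csv_file)
    return {row[id_column].strip(): row for row in reader}

# Hypothetical sample standing in for the exported spreadsheet.
sample = io.StringIO(
    "ID,AssetName,EpisodeId\n"
    "158395,Example Asset,EP001\n"
)
metadata = load_metadata(sample)
print(metadata["158395"]["AssetName"])
```

With a real export, `open("metadata.csv", newline="")` (a hypothetical file name) would take the place of the `io.StringIO` sample.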
|
|
|
01-31-2017, 07:42 AM
|
#5
|
Member
Registered: Jan 2017
Posts: 32
|
Well, I'm a newb myself, so for now I'd probably export the Excel file to CSV and then go crazy with grep, head/tail/cut and paste.
The correlations you need should not be that big of a deal with the string-matching conditionals generally available in bash.
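The same string matching could also be sketched in Python rather than bash. This assumes the file names split cleanly on underscores, that the first two fields form the shared prefix (as in `5559_27589`), and that the XML is always named `<prefix>_nmr.xml`. Only the one pair of names in the first post supports that. The function name `correlate` is made up for the example.

```python
from pathlib import Path

def correlate(log_name, known_ids):
    """Return (xml_name, asset_id) for one .log file name, or None if no ID matches."""
    fields = Path(log_name).stem.split("_")   # drop ".log", split on underscores
    prefix = "_".join(fields[:2])             # e.g. "5559_27589"
    for field in fields[2:]:
        if field in known_ids:                # ID taken from the spreadsheet
            return f"{prefix}_nmr.xml", field
    return None

log = "5559_27589_TVE_HULU_AE5559_AEN_BSST_158395_TVE_000_2398_60_20150721_000.log"
print(correlate(log, {"158395"}))
```

Searching for a known ID among the fields, instead of hard-coding its position, is a design choice that avoids depending on every file name having the same number of fields.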
Last edited by MrMeeSeeks; 01-31-2017 at 07:45 AM.
|
|
|
01-31-2017, 07:47 AM
|
#6
|
LQ Guru
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,638
|
Quote:
Originally Posted by mermelmadness
The XML has specific fields that are empty and need the metadata inserted:
|
Perl does XML parsing rather easily. Try looking at the XML::TreeBuilder::XPath module or one of the other XML parsers. There are also modules that can read CSV.
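For comparison, here is a minimal sketch of the fill-in step using Python's standard `xml.etree.ElementTree` instead of the Perl modules mentioned above. Which spreadsheet column maps to which element is not stated in the thread, so the values passed in are hypothetical. The trimmed sample XML keeps only a few of the elements from the first post.

```python
import xml.etree.ElementTree as ET

def fill_empty_fields(xml_text, values):
    """Set text on the elements named in `values`, but only where they are empty."""
    root = ET.fromstring(xml_text)
    for tag, text in values.items():
        elem = root.find(tag)
        # Skip missing elements and elements that already carry a value.
        if elem is not None and not (elem.text or "").strip():
            elem.text = text
    return ET.tostring(root, encoding="unicode")

sample = (
    "<NMRVodMetadata>"
    "<AssetId></AssetId>"
    "<AssetName></AssetName>"
    "<SID>5559</SID>"
    "</NMRVodMetadata>"
)
# Hypothetical values; the real ones would come from the spreadsheet row.
result = fill_empty_fields(
    sample, {"AssetId": "158395", "AssetName": "Example Asset", "SID": "0000"}
)
print(result)
```

Already populated elements such as `SID` are left untouched, so running this over all of the files should not overwrite the values the encoder wrote.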
|
|
|