[SOLVED] Script to print repeated values separated by line break
Hello everyone,
I've been trying on my own and searching for help without success so far. Maybe someone could help me, please; there may also be a better way to do it.
What I want is to print, for each report (SReport), its respective values in separate columns and, for nodes
that appear several times, add an "end of line" between each value. For example, in the first report (SReport), NA appears 3 times within the node "MR_NRanges", so I want NA printed like this: 763LF358LF852, where LF represents an end of line and is written out only to make it easy to see.
NA, NRB, SubRangeB and SubRangeE can appear inside the parent nodes "MR_NRanges" and "PK_NRanges" one or more times.
How do you know that MR_NRanges is a node? Looking at the XML data, I see nothing that sets this entry apart from the many others or shows that it is a node.
Also, whilst I am always happy to champion awk as a great tool, if you need such low-level access to data within an XML construct, I would recommend looking at something
like Ruby or Perl, which have specific modules for interacting with XML data.
I found an example using Ruby's REXML and am testing it with my XML input, trying to print only ReportName, ReportType, and the NA
and NRB values that belong to MR_NRanges. The output should be (for repeated nodes I'm putting a "," here):
OK ... I haven't played with this particular module much, and there is still a pesky comma in the wrong spot (which just annoys me), but this is what I have so far:
There is probably a cleaner method. I did see that you can iterate over items until a match is found and then print it, so it may be worth playing with that a little more.
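For readers following along, here is a minimal sketch of the kind of REXML extraction being discussed. The thread's actual input file and script are not shown, so the sample XML below is hypothetical: the element names come from the posts, but their nesting directly under SReport and the NRB values are assumptions. The helper names `texts` and `report_rows` are made up for this sketch.

```ruby
require 'rexml/document'

# Hypothetical sample input; the real file from the thread is not available.
SAMPLE_REPORT_XML = <<~XML
  <SReport>
    <ReportName>Report1</ReportName>
    <ReportType>TypeA</ReportType>
    <MR_NRanges>
      <NA>763</NA>
      <NRB>12</NRB>
      <NA>358</NA>
      <NRB>15</NRB>
      <NA>852</NA>
      <NRB>18</NRB>
    </MR_NRanges>
  </SReport>
XML

# Return the stripped text of every element matching `xpath` under `node`.
def texts(node, xpath)
  REXML::XPath.match(node, xpath).map { |e| e.text.to_s.strip }
end

# Build one row (an array of column strings) per SReport element.
# Repeated values are joined with "," as in the post above.
def report_rows(xml_string)
  doc = REXML::Document.new(xml_string)
  REXML::XPath.match(doc, '//SReport').map do |report|
    [
      texts(report, 'ReportName').first,
      texts(report, 'ReportType').first,
      texts(report, 'MR_NRanges/NA').join(','),
      texts(report, 'MR_NRanges/NRB').join(',')
    ]
  end
end

report_rows(SAMPLE_REPORT_XML).each { |row| puts row.join("\t") }
```

Collecting the matches into an array first and joining afterwards keeps the separator between values only, which sidesteps the stray comma mentioned above.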
Thank you for the fix. I was able to print the other values, but as you said, a comma still appears as the last character
for me too. Maybe storing the content of each row in a variable before printing would make it possible to remove
the extra "," and then print, but I wasn't able to change the code to store it in a variable first. Maybe you can help
me one more time to fix that part.
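The idea of storing the row before printing can be sketched as follows. This is not the script from the thread (which is not shown); `join_values` is a hypothetical helper, and the sample values are the NA values quoted in the first post.

```ruby
# Collect values into an array first, then join them, so the separator
# only ever appears between items and never after the last one.
def join_values(values, separator = ',')
  values.map(&:to_s).reject(&:empty?).join(separator)
end

na_values = []
%w[763 358 852].each { |value| na_values << value } # stands in for the XML loop
row = join_values(na_values)
puts row

# If a string with a trailing comma already exists, String#chomp can strip it.
puts '763,358,852,'.chomp(',')
```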
Following your example, I was able to print all the values for this input file, and they seem
to be printed the way I want. I'm not sure how to use the values as references in the hash.
What I've tried last is adding double quotes at the beginning and the end of each column that
contains repeated values.
Finally, I want the repeated values wrapped in double quotes, as below:
*(If a value appears only once, it can be printed with or without double quotes; it doesn't matter. For example, column 5 of row 2
has only one value, 256, so it could be printed without double quotes.)
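One way to sketch the quoting rule described above: join the values with a line break and wrap the result in double quotes only when there is more than one value. `format_column` is a hypothetical helper name, and the line-break separator is an assumption based on the thread title.

```ruby
# Join repeated values with a line break and quote the column only when
# it actually holds more than one value.
def format_column(values, separator = "\n")
  joined = values.join(separator)
  values.length > 1 ? %("#{joined}") : joined
end

puts format_column(%w[763 358 852]) # quoted, one value per line
puts format_column(%w[256])         # single value, left unquoted
```

If the output is meant to be opened as CSV, Ruby's csv library may be worth a look as well, since it quotes fields that contain line breaks on its own.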
Now I have my last 2 questions; if you have enough time, it would be great if you could help me again.
1- Since this script uses an XML module, I cannot concatenate several XML files into a single one and apply the script to it, because
the result is detected as a badly formed XML file. So, given several XML files with the same format in a directory, how can
I make this script process all of them and write the results to a single output file, with each line holding the values of one XML file?
* Each XML file has only one top node, called "SReport".
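A minimal sketch of question 1, assuming each file is parsed on its own and contributes one line to a shared output file. The helper names `line_for` and `combine_reports`, the file names, and the nesting of the elements under SReport are all assumptions made for illustration.

```ruby
require 'rexml/document'
require 'tmpdir'

# Build one output line from one XML file (ReportName plus the NA values).
def line_for(xml_path)
  doc = REXML::Document.new(File.read(xml_path))
  name = REXML::XPath.first(doc, '//SReport/ReportName')
  nas = REXML::XPath.match(doc, '//SReport/MR_NRanges/NA')
                    .map { |e| e.text.to_s.strip }
  [name ? name.text.to_s.strip : '', nas.join(',')].join("\t")
end

# Parse every *.xml file in `dir` separately and write one line per file.
# Returns the number of files processed.
def combine_reports(dir, output_path)
  files = Dir.glob(File.join(dir, '*.xml')).sort
  File.open(output_path, 'w') do |out|
    files.each { |path| out.puts line_for(path) }
  end
  files.length
end

# Demo with two throwaway files in a temporary directory.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'one.xml'),
             '<SReport><ReportName>R1</ReportName>' \
             '<MR_NRanges><NA>763</NA><NA>358</NA></MR_NRanges></SReport>')
  File.write(File.join(dir, 'two.xml'),
             '<SReport><ReportName>R2</ReportName>' \
             '<MR_NRanges><NA>256</NA></MR_NRanges></SReport>')
  output = File.join(dir, 'combined.txt')
  combine_reports(dir, output)
  puts File.read(output)
end
```

Parsing the files one at a time avoids the "bad format" error, because each document keeps its single SReport root.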
2- I've been trying to shorten the a.elements.each(...) commands by assigning the path to a string, but when I do that,
even though I do not get a syntax error, the output is not correct.
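The code that produced the wrong output is not shown, so the cause cannot be pinned down here. As a point of comparison, this sketch passes an XPath held in a string to `elements.each`, which REXML accepts the same way as a literal. The constant names, the `values_at` helper, and the sample XML are hypothetical; note that the path is relative to the element it is called on.

```ruby
require 'rexml/document'

# XPath strings kept in constants; both are relative to the SReport element.
MR_NA_PATH = 'MR_NRanges/NA'
PK_NA_PATH = 'PK_NRanges/NA'

# Hypothetical sample; the nesting is an assumption.
PATH_DEMO_XML = '<SReport>' \
                '<MR_NRanges><NA>763</NA><NA>358</NA></MR_NRanges>' \
                '<PK_NRanges><NA>9</NA></PK_NRanges>' \
                '</SReport>'

# Collect the text of every element under `parent` that matches `xpath`.
def values_at(parent, xpath)
  found = []
  parent.elements.each(xpath) { |e| found << e.text.to_s.strip }
  found
end

report = REXML::Document.new(PATH_DEMO_XML).root
puts values_at(report, MR_NA_PATH).join(',')
puts values_at(report, PK_NA_PATH).join(',')
```

If a stored path gives different results from the literal, it may be worth checking that the two strings are really identical and that the path is applied to the same parent element.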
Well, I am using a later version of Ruby (2.1.1p76), so I'm not sure if that makes a difference.
As for Dir, are you saying that even a simple "puts file", with nothing else, also prints nothing?
That works for me, on Linux of course, but I do not see why that would matter.
Yes, I tried simply printing the names of the files in the folder by putting the full path inside the Dir command, and it doesn't work that way; it only works if I put just the wildcard, like this: "*.xml".
Is there some other way to loop over all the files?
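One alternative that avoids glob patterns altogether is to list the directory and filter by extension. This is a sketch with a hypothetical helper name, `xml_files_in`, demonstrated on a temporary directory. If the full path in the failing attempt was written with backslashes, that could be the cause, since a backslash acts as an escape character in glob patterns; that is a guess, as the failing command is not shown.

```ruby
require 'tmpdir'

# List the XML files in `dir` without using a glob pattern.
# File.join builds the full path with forward slashes.
def xml_files_in(dir)
  Dir.entries(dir)
     .select { |name| File.extname(name).downcase == '.xml' }
     .sort
     .map { |name| File.join(dir, name) }
end

# Demo on a throwaway directory with two XML files and one other file.
Dir.mktmpdir do |dir|
  %w[b.xml a.xml notes.txt].each { |name| File.write(File.join(dir, name), '') }
  xml_files_in(dir).each { |path| puts File.basename(path) }
end
```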