[SOLVED] Script to print repeated values separated by line break
That would be due to you introducing text with spaces. The usual story will probably be that given enough time you will always find some form of xml that will not fit the pattern.
However, this one is a simple fix:
Your code works great. I've tried each step, one by one, and I think I understand the logic better.
I've modified your script to print headers taken from the node names, as below.
Code:
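require 'rexml/document'
include REXML
xmldoc = Document.new File.new("input.xml")
array_A = []        # index-free XPath of every element
array_B = []        # joined values, one entry per unique path
array_Headers = []  # node name for each entry in array_B
# strip positional indices such as [2] or [12] from each path
xmldoc.elements.each("//"){ |z| array_A << z.xpath.gsub(/\[\d+\]/,'') }
array_A.uniq.each{ |x|
  # keep only nodes whose text starts with an alphanumeric character
  if xmldoc.elements[x].has_text? && xmldoc.elements[x].text =~ /^[[:alnum:]]/
    array_B << xmldoc.get_elements(x).map{ |a| a.text }.join(",")
    array_Headers << x.sub(/^\/(.+\/)*(\w*)(\[\d*\])?/ ,'\2') # to get only node name
  end
}
# quote any column that holds more than one comma-joined value
print array_Headers.join("|") + "\n" + array_B.map{ |n| (n.include?(","))?"\"#{n}\"":n }.join("|")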
Although it prints all the values I want, the output is not presented well enough for some nodes. Using the real file, I found an issue with similar nodes from different categories. What I mean is:
In the input XML, after "<SecondSection>" comes a "<ThirdSection>...</ThirdSection>" with contact names for the different kinds of issues (IssuesTypeA, IssuesTypeB, IssueTypeC, etc.). The data for all contact persons is printed, but in different columns. I want to improve the output for this contact data by printing similar nodes in the same column (grouped with commas and surrounded by double quotes, as before). That is, print the headers "Prefix, GivenName,FamilyName,JobTitle,PhoneFix,PhoneMobile,Fax,Email" only once, and group all Prefix values with commas, all GivenName values with commas, and so on.
So, instead of what I currently get (the headers repeated 3 times for the 3 contacts)
Firstly, for printing the names as headers, just use .name:
Code:
array_Headers << x.name
As for your second requirement ... it is not possible. You are now requiring knowledge of the data, whereas the current solution says we do not care about the format of the data, but if it should fall under the same heading then we will group it all together.
The data you have created now has completely different node names, i.e. IssuesTypeA and IssuesTypeB, hence no path will ever equal both, so they are quite correctly split into separate values.
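To illustrate the point about paths, here is a minimal sketch using a made-up XML fragment (the Root, Contact and GivenName layout is an assumption for illustration, not the poster's real file). It shows that two GivenName nodes under differently named parents produce two distinct index-free paths, which is why the script puts them in separate columns.

```ruby
require 'rexml/document'

# Hypothetical sample data; the element layout is assumed, not taken from the real file.
xml = <<~XML
  <Root>
    <ThirdSection>
      <IssuesTypeA><Contact><GivenName>Ann</GivenName></Contact></IssuesTypeA>
      <IssuesTypeB><Contact><GivenName>Bob</GivenName></Contact></IssuesTypeB>
    </ThirdSection>
  </Root>
XML

doc = REXML::Document.new(xml)

# Build the index-free path of every GivenName element, as the script above does.
paths = REXML::XPath.match(doc, '//GivenName').map { |e| e.xpath.gsub(/\[\d+\]/, '') }
puts paths
# The two paths differ at the IssuesTypeA / IssuesTypeB step, so uniq keeps both.
puts paths.uniq.size
```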
I think we have probably gone far enough off the reservation with this question, as it has transformed several times. Also, as I stated earlier, you have now found yet another example where the current solution does not work. This may continue ad infinitum, as each time we solve a problem you provide a new issue.
Lastly, the new sections being added do not seem to match any of the initial data, so I am not sure if you are just taking on new things to see how to change the solution, but I will leave you with your new hurdle. From the different solutions so far you may be able to cobble together the 2 solutions (which would seem to be what you are now heading for) and see what you can come up with.
For some reason I get an error when trying "array_Headers << x.name", but it is a minor issue.
I know it looks like I'm changing things each time, but actually I only presented a representative sample
of the original XML to make it easier to understand. Your last code works correctly for what I asked; it only
happened that when I tested with the complete XML I saw that issue with the contact nodes. I'll use your previous
examples to try to get that output.
Many thanks again for the great help, support, patience and time provided.
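On the error reported with "array_Headers << x.name": the thread does not show the message, so this is a guess, but in the modified script x is a path String rather than a REXML element, and String has no name method. A small sketch of that assumption, with a made-up one-node document, looking the element up from its path before asking for its name:

```ruby
require 'rexml/document'

# Hypothetical one-contact document, used only to demonstrate the lookup.
doc = REXML::Document.new('<Root><Item><GivenName>Ann</GivenName></Item></Root>')

path = doc.elements['//GivenName'].xpath     # a String path, like x in the script
string_has_name = path.respond_to?(:name)    # a plain String does not respond to #name

element = doc.elements[path]                 # fetch the element back from its path
header  = element.name                       # the bare node name

puts string_has_name
puts header
```

If that is indeed the cause, "array_Headers << xmldoc.elements[x].name" would be the equivalent line inside the loop.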
As I said above, I think your biggest hurdle will be to assume you do not know the data and still have it fall in line when a node does not have the same name.
One thought I did have is that wildcards can be used so you may be able to replace the path so you could have:
Of course this supposes you know the "IssueTypeX" is going to exist and that there are multiples and you will replace all with this line.
I am also then not sure how you would go about getting the format you specified with the "IssueTypeX" preceding the value
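A rough sketch of the wildcard idea, under the assumption that the contacts sit at a path like /Root/ThirdSection/IssueTypeX/Contact/... (Root, Contact and the sample values are invented for illustration; only the section and field names come from the thread). Replacing the varying step with * lets one XPath collect the same field from every issue type, at the cost of needing to know where the wildcard belongs:

```ruby
require 'rexml/document'

# Hypothetical sample data; the structure is assumed, not the poster's real file.
xml = <<~XML
  <Root>
    <ThirdSection>
      <IssuesTypeA><Contact><GivenName>Ann</GivenName><Email>ann@example.com</Email></Contact></IssuesTypeA>
      <IssuesTypeB><Contact><GivenName>Bob</GivenName><Email>bob@example.com</Email></Contact></IssuesTypeB>
    </ThirdSection>
  </Root>
XML

doc = REXML::Document.new(xml)

fields = %w[GivenName Email]
columns = fields.map do |field|
  # "*" matches any element name at that step, so both issue types are collected.
  values = doc.get_elements("/Root/ThirdSection/*/Contact/#{field}").map(&:text)
  joined = values.join(',')
  # Same quoting rule as the script above: quote columns holding several values.
  joined.include?(',') ? "\"#{joined}\"" : joined
end

header_line = fields.join('|')
value_line  = columns.join('|')
puts header_line
puts value_line
```

This prints each header once with the grouped values beneath it, but it does not record which issue type each value came from.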