Please use ***[code][/code]*** tags around your code and data, to preserve the original formatting and to improve readability. Do not use quote tags, bolding, colors, "start/end" lines, or other creative techniques.
Also, when giving us data to work with, please make sure it's complete. I couldn't do any testing on what you gave me until I figured out how to get it into proper xml, with a defined namespace.
Anyway, line- and regex-based tools like sed and awk are not well suited to nested, tag-structured languages like xml/html. You should only use them when you can guarantee that the file format never varies.
It's much better in the long run to use a tool with a dedicated xml parser, like xmlstarlet.
http://xmlstar.sourceforge.net/
I'm still kind of a beginner at this, but I was able to extract the kind of data you wanted with these commands:
Code:
$ xmlstarlet sel -T -t -m '//cdf:rule-result' -v 'concat(@version," ",cdf:result)' -n file.xml
GEN005390 pass
GEN005450 fail
GEN005501 pass
GEN005505 fail
GEN005507 fail
GEN005510 fail
$ xmlstarlet sel -T -t -m '//cdf:rule-result[cdf:result="pass"]' -v 'concat(@version," ",cdf:result)' -n file.xml
GEN005390 pass
GEN005501 pass
$ xmlstarlet sel -T -t -m '//cdf:rule-result[not(cdf:result="pass")]' -v 'concat(@version," ",cdf:result)' -n file.xml
GEN005450 fail
GEN005505 fail
GEN005507 fail
GEN005510 fail
Someone more experienced in xpath manipulation could doubtless do much more with it.
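If xmlstarlet isn't available, roughly the same extraction can be sketched with the xml parser in Python's standard library. This is only a rough sketch under some assumptions: the sample document, the rule_results() helper name, and the namespace URI bound to the cdf prefix are all made up for illustration, since I don't know what your real file declares. Substitute whatever URI appears in your file's root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data. The namespace URI and the element layout are
# assumptions; use the URI your own file declares for the "cdf" prefix.
SAMPLE = """<cdf:TestResult xmlns:cdf="http://checklists.nist.gov/xccdf/1.1">
  <cdf:rule-result version="GEN005390"><cdf:result>pass</cdf:result></cdf:rule-result>
  <cdf:rule-result version="GEN005450"><cdf:result>fail</cdf:result></cdf:rule-result>
  <cdf:rule-result version="GEN005501"><cdf:result>pass</cdf:result></cdf:rule-result>
</cdf:TestResult>
"""

# Prefix-to-URI map, playing the same role as a namespace binding in xmlstarlet.
NS = {"cdf": "http://checklists.nist.gov/xccdf/1.1"}


def rule_results(xml_text, want=None):
    """Return (version, result) pairs, optionally keeping only result == want."""
    root = ET.fromstring(xml_text)
    rows = []
    # Find every rule-result element at any depth below the root.
    for rule in root.iterfind(".//cdf:rule-result", NS):
        result = rule.findtext("cdf:result", default="", namespaces=NS)
        if want is None or result == want:
            rows.append((rule.get("version"), result))
    return rows


# Print every rule with its result, one per line.
for version, result in rule_results(SAMPLE):
    print(version, result)
```

Calling rule_results(SAMPLE, "pass") keeps only the passing rules, like the second command above. For a real file you would read it with ET.parse("file.xml").getroot() instead of parsing a string.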