05-09-2012, 12:51 PM   #31
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1960

Actually, I don't think you need to make it that complex. The parent tag appears to be unique, and there's only one "value" in it, so it should be quite simple to grab.

xmlparts=$( xml sel -t -v '//hudson.model.StringParameterValue/value' -o , -v /build/result -o , -v '/build/culprits/string[last()]' build.xml "$dir/$file" )
Using two slashes in front of an element makes the matching "global": "//" is XPath shorthand for the descendant-or-self axis, so the element is matched at any depth, bounded only by any higher-level path specified in front of it. In any case, you can always provide the full tree path to the exact entry you want.
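
To try the command above on something small, here is a rough sketch. The sample file is a hypothetical, cut-down stand-in for a Jenkins build.xml (only the element names used in the command come from this thread; the "actions" wrapper and all the values are made up), and it assumes the binary is installed as xmlstarlet rather than xml.

```shell
#!/bin/sh
# Hypothetical, minimal stand-in for a build.xml file; the values and the
# <actions> wrapper are invented for illustration.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
<build>
  <actions>
    <hudson.model.StringParameterValue>
      <name>tagName</name>
      <value>v1.2</value>
    </hudson.model.StringParameterValue>
  </actions>
  <result>SUCCESS</result>
  <culprits>
    <string>alice</string>
    <string>bob</string>
  </culprits>
</build>
EOF

# extract prints "value,result,last culprit" for the file given as $1.
# "//" finds the StringParameterValue element at any depth; the other two
# expressions use full paths from the root.
extract() {
    xmlstarlet sel -t \
        -v '//hudson.model.StringParameterValue/value' -o , \
        -v /build/result -o , \
        -v '/build/culprits/string[last()]' "$1"
}

if command -v xmlstarlet >/dev/null 2>&1; then
    printf '%s\n' "$(extract "$sample")"
else
    echo 'xmlstarlet not found; skipping' >&2
fi
```

With this sample, the three fields should come out as v1.2, SUCCESS and bob, joined by commas.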

Now do we have everything?

Edit: BTW, it looks like the only reason your last command is failing is that it uses single quotes rather than double quotes inside the expression. Since the outer quotes are also single, the inner ones pair up with them and leave the tagName string unprotected by the shell. So no quotes actually get passed to the command.
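
The quoting effect described above can be seen without xmlstarlet at all. This is only a sketch: show_args is a made-up helper that prints what a command would receive, and the "param" element in the XPath is hypothetical (only [name='tagName'] comes from this thread).

```shell
#!/bin/sh
# show_args prints each argument it receives on its own line, in brackets,
# so we can see exactly what the shell hands to a command.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Single quotes inside single quotes: the inner quotes end and restart the
# outer string, so the shell removes them and tagName arrives unquoted.
show_args '//param[name='tagName']/value'

# Double quotes inside single quotes are ordinary characters to the shell,
# so they are passed through to the command unchanged.
show_args '//param[name="tagName"]/value'
```

The first call prints [//param[name=tagName]/value] and the second prints [//param[name="tagName"]/value], which is why only the double-quoted form reaches the XPath parser as a string comparison.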

Last edited by David the H.; 05-09-2012 at 01:04 PM.
05-09-2012, 02:16 PM   #32
Registered: Jan 2003
Location: des moines, ia
Distribution: suse RH
Posts: 123

Original Poster
Rep: Reputation: 16

that is interesting. It works. Just trying to make sure I understand it. I kinda do now. I learned a ton during this process.
returns an error.
Entity: line 23: parser error : Couldn't find end of Start Tag value-of line 23

That is why I went with [name='tagName'], and it worked until I tried to include the /description ... I think the brackets make the single quotes "local".

I believe that now provides what the requirements called for. Thank you so very much.
05-09-2012, 10:05 PM   #33
Senior Member
Registered: Nov 2005
Distribution: Arch
Posts: 3,163

Rep: Reputation: 1370
Originally Posted by j-me
returns an error.
Entity: line 23: parser error : Couldn't find end of Start Tag value-of line 23
Older versions of xmlstarlet (before 1.0.4) create an XSLT document as a string and then parse it. Double quotes are special in XML, so the parser gets tripped up; 1.0.4 and later escape these arguments.
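
To illustrate why an unescaped double quote breaks the generated stylesheet, here is a sketch of the idea in plain shell. It is not xmlstarlet's actual code: escape_attr is a made-up helper, and the "param" element in the XPath is hypothetical.

```shell
#!/bin/sh
# A hypothetical XPath expression containing double quotes.
expr='//param[name="tagName"]/value'

# escape_attr replaces the characters that are special inside a
# double-quoted XML attribute value with their entity references.
escape_attr() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/"/\&quot;/g' -e 's/</\&lt;/g'
}

# Naive string building: the first inner double quote ends the select
# attribute early, so the start tag is no longer well-formed XML.
naive=$(printf '<xsl:value-of select="%s"/>' "$expr")

# Escaped string building: the attribute stays intact, and the XML parser
# turns &quot; back into a double quote before XPath sees it.
safe=$(printf '<xsl:value-of select="%s"/>' "$(escape_attr "$expr")")

printf '%s\n' "$naive" "$safe"
```

The first line printed has a select attribute that ends at name=, which matches the "Couldn't find end of Start Tag value-of" message quoted above; the second line is well-formed.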
05-10-2012, 08:57 AM   #34
Nominal Animal
Senior Member
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 946
XML files should always be processed using XML tools, as the posters above have shown.

Still, if the build.xml files are limited to the format shown in the example, then it is certainly possible to parse them with plain awk. I would personally consider this only if using the proper tools were too slow or burdensome.

Anyway, here is the awk script:
#!/usr/bin/awk -f
# Note: a regular expression in RS needs an awk that supports it
# (gawk and mawk do; POSIX leaves multi-character RS unspecified).
BEGIN {
    # Each record starts at a "<"; ">" splits it into tag and content.
    RS = "[\t\n\v\f\r ]*<"
    FS = "[\t\n\v\f\r ]*>[\t\n\v\f\r ]*"

    element = ""    # XML element name
    content = ""    # Immediate content to element
    isopen  = 0     # Set if the previous record opened an element

    parents = 0     # Number of parent elements
    parent[0] = ""  # Current element
    parent[1] = ""  # Parent element to current element
}

# Per-file initialization
(FNR == 1) {

    # Per-file initialization.
    # Current file name (and path) is in FILENAME.

    result = ""
    description = ""
    timestamp = ""
    basename = FILENAME
    sub(/^.*\/Deploy_/, "Deploy_", basename)
    split(basename, path, "/")
}

# XML processing
{
    # The previous record opened an element: push it onto the parent stack.
    if (isopen) {
        isopen = 0
        if (length(element) > 0) {
            for (i = parents; i >= 0; i--)
                parent[i+1] = parent[i]
            parents++
        }
    }

    # Skip empty records (for example, anything before the first "<").
    if (NF < 1)
        next

    # Skip declarations, comments, and processing instructions.
    if ($1 ~ /^[!?]/)
        next

    if (NF > 2) {
        printf("Spurious > after %s.\n", $2) > "/dev/stderr"
        exit(1)
    }

    element = $1
    content = $2

    sub(/^[\t\n\v\f\r ]+/, "", content)
    sub(/[\t\n\v\f\r ]+$/, "", content)
    # To combine all whitespace in content to single spaces, add
    #   gsub(/[\t\n\v\f\r ]+/, " ", content)

    # Closing tag: check that it matches, then pop it off the parent stack.
    if (element ~ /^\//) {
        sub(/^\/+/, "", element)
        sub(/[\t\n\v\f\r ].*$/, "", element)
        if (parent[1] != element) {
            printf("%s: Element not open.\n", element) > "/dev/stderr"
            exit(1)
        }
        for (i = 1; i < parents; i++)
            parent[i] = parent[i+1]
        delete parent[parents]
        parents--
        next
    }

    # Self-closing tags (<name/>) do not open a new level.
    if (element ~ /\/$/)
        sub(/\/+$/, "", element)
    else
        isopen = 1

    # Drop any attributes, keeping only the element name.
    sub(/[\t\n\v\f\r ].*$/, "", element)
    parent[0] = element
}

# Actual processing starts here. Available:
#   parents         The number of parent elements for current node
#   parent[0]       The current element name
#   parent[1]       The name of closest parent element
#   parent[parents] The name of the root element
#   content         Immediate content following the element.
#                   Does not include content after any child elements,
#                   even if they do belong to the current element.

(parents == 1 && parent[1] == "build" && parent[0] == "description") {
    description = content
}

(parents == 1 && parent[1] == "build" && parent[0] == "result") {
    result = content
}

(parents == 2 && parent[2] == "build" && parent[1] == "culprits" && parent[0] == "string") {
    printf("%s,%s,%s,%s,%s\n", path[1], path[3], description, result, content)
}
Run it using
./above-script.awk jobs/*/*/*/build.xml
or, if you have too many files for a single command,
find jobs/ -mindepth 4 -maxdepth 4 -name build.xml -print0 | xargs -r0 ./above-script.awk
to get the expected output.
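
To see what the RS and FS settings in the script above do to the input, here is a stripped-down sketch that only prints the tag and the immediate content of each record. split_xml is a made-up name, the whitespace class is simplified to space, tab, CR and LF, and the three-line input is a hypothetical sample.

```shell
#!/bin/sh
# split_xml prints "tag|content" for each tag found on standard input.
split_xml() {
    awk '
        BEGIN {
            RS = "[ \t\r\n]*<"             # a record starts at each "<"
            FS = "[ \t\r\n]*>[ \t\r\n]*"   # ">" separates tag from content
        }
        NF >= 1 { print $1 "|" $2 }        # skip the empty record before the first "<"
    '
}

printf '%s\n' '<build>' '  <result>SUCCESS</result>' '</build>' | split_xml
```

With gawk or mawk this prints four lines: build|, result|SUCCESS, /result| and /build|. Closing tags show up as records whose tag starts with a slash, which is what the full script uses to pop the parent stack.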

