LinuxQuestions.org
05-09-2012, 01:51 PM   #31
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950

Actually, I don't think you need to make it that complex. The parent tag appears to be unique, and there's only one "value" in it, so it should be quite simple to grab.

Code:
xmlparts=$( xml sel -t -v '//hudson.model.StringParameterValue/value' -o , -v /build/result -o , -v '/build/culprits/string[last()]' build.xml "$dir/$file" )
Putting two slashes in front of an element makes the match "global": // is XPath shorthand for "at any depth", bounded only by whatever higher-level path is written in front of it. In any case, you can always give the full tree path to the exact entry you want.
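To illustrate the // shorthand, here is a minimal sketch. The sample file is a made-up, cut-down stand-in for a build.xml (its contents and its location under the temp directory are assumptions, not taken from the thread), and the tool is assumed to be installed as either xmlstarlet or xml. Both queries should select the same node.

```shell
#!/bin/sh
# Hypothetical, cut-down stand-in for a build.xml; real files contain more.
sample=${TMPDIR:-/tmp}/lq-sample-build.xml
cat > "$sample" <<'EOF'
<build>
  <actions>
    <hudson.model.ParametersAction>
      <parameters>
        <hudson.model.StringParameterValue>
          <name>tagName</name>
          <value>v1.2</value>
        </hudson.model.StringParameterValue>
      </parameters>
    </hudson.model.ParametersAction>
  </actions>
  <result>SUCCESS</result>
</build>
EOF

# Pick whichever name the tool is installed under, if any.
if command -v xmlstarlet >/dev/null 2>&1; then
    xs=xmlstarlet
elif command -v xml >/dev/null 2>&1; then
    xs=xml
else
    xs=
fi

if [ -n "$xs" ]; then
    # Short form: // finds the element at any depth.
    "$xs" sel -t -v '//hudson.model.StringParameterValue/value' -n "$sample"
    # Long form: the full path from the root, selecting the same node.
    "$xs" sel -t -v '/build/actions/hudson.model.ParametersAction/parameters/hudson.model.StringParameterValue/value' -n "$sample"
else
    echo "xmlstarlet not installed; skipping the query" >&2
fi
```

The short form is convenient when the element name is unique in the file; the long form is safer if the same name could turn up elsewhere in the tree.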


Now do we have everything?


Edit: BTW, it looks like the only reason your last command is failing is that it uses single quotes rather than double quotes inside the expression. Since the outer quotes are also single, the inner ones pair up with them instead and leave the tagName string unprotected by the shell. So no quotes actually get passed to the command.
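A minimal sketch of that quoting effect, using a hypothetical helper called show_arg that simply prints the argument it receives. The XPath is shortened for the example.

```shell
#!/bin/sh
# show_arg is a hypothetical helper: it prints its first argument between
# angle brackets, so you can see exactly what the shell hands to a command.
show_arg() { printf '<%s>\n' "$1"; }

# Single quotes inside single quotes: the inner pair just closes and reopens
# the outer quoting, so no quote characters reach the command.
show_arg '//StringParameterValue[name='tagName']/value'

# Double quotes inside single quotes are ordinary characters and survive.
show_arg '//StringParameterValue[name="tagName"]/value'

# Single quotes inside double quotes survive as well.
show_arg "//StringParameterValue[name='tagName']/value"
```

The first call prints <//StringParameterValue[name=tagName]/value>, with the quotes gone; the other two print the expression with its inner quotes intact.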

Last edited by David the H.; 05-09-2012 at 02:04 PM.
 
05-09-2012, 03:16 PM   #32
j-me
Member
 
Registered: Jan 2003
Location: des moines, ia
Distribution: suse RH
Posts: 123

Original Poster
Rep: Reputation: 16

That is interesting. It works. Just trying to make sure I understand it. I kinda do now. I learned a ton during this process.
The expression
Code:
/build/actions/hudson.model.ParametersAction/parameters/hudson.model.StringParameterValue[name="tagName"]/value
returns an error:
Entity: line 23: parser error : Couldn't find end of Start Tag value-of line 23
udson.model.ParametersAction/parameters/hudson.model.StringParameterValue[name="

That is why I went with [name='tagName'], and it worked until I tried to include the /description ... I think the brackets make the single quotes "local".

I believe that now provides what the requirements called for. Thank you so very much.
 
05-09-2012, 11:05 PM   #33
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,541

Rep: Reputation: 878
Quote:
Originally Posted by j-me
Code:
/build/actions/hudson.model.ParametersAction/parameters/hudson.model.StringParameterValue[name="tagName"]/value
returns an error.
Entity: line 23: parser error : Couldn't find end of Start Tag value-of line 23
udson.model.ParametersAction/parameters/hudson.model.StringParameterValue[name="
Older versions of xmlstarlet (1.0.x and earlier) build an XSLT document as a string and then parse it. Double quotes are special in XML, so the parser gets tripped up (1.0.4 and later will escape these args).
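As a rough illustration of the problem (this is not xmlstarlet's actual code), the sketch below pastes an XPath into an XML attribute the naive way, and then again with the special characters escaped. The helper name xml_attr_escape and the simplified value-of line are assumptions made for the example.

```shell
#!/bin/sh
# xml_attr_escape is a hypothetical helper: it escapes the characters that
# are special inside a double-quoted XML attribute value.
xml_attr_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/"/\&quot;/g'
}

xpath='//hudson.model.StringParameterValue[name="tagName"]/value'

# Naive: the first " inside the XPath ends the select attribute early,
# which is the kind of markup an XML parser rejects.
printf '<xsl:value-of select="%s"/>\n' "$xpath"

# Escaped: the attribute stays intact and the parser sees the whole XPath.
printf '<xsl:value-of select="%s"/>\n' "$(xml_attr_escape "$xpath")"
```

In the naive line the attribute value stops at name=, which matches the point where the error message in the quoted post cuts off.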
 
05-10-2012, 09:57 AM   #34
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 943
XML files should always be processed using XML tools, as the posters above have shown.

Still, if the build.xml files are limited to the format shown in the example, then it is certainly possible to parse them with plain awk. I would personally consider this only if using the proper tools was too slow or burdensome.

Anyway, here is the awk script:
Code:
#!/usr/bin/awk -f
BEGIN {
    RS = "[\t\n\v\f\r ]*<"
    FS = "[\t\n\v\f\r ]*>[\t\n\v\f\r ]*"

    element = ""    # XML element name
    content = ""    # Immediate content to element
    isopen  = 0

    parents = 0     # Number of parent elements
    parent[0] = ""  # Current element
    parent[1] = ""  # Parent element to current element
}

#
# Per-file initialization
#
(FNR == 1) {

    # Per-file initialization.
    # Current file name (and path) is in FILENAME.

    result = ""
    description = ""
    timestamp = ""
    basename = FILENAME
    sub(/^.*\/Deploy_/, "Deploy_", basename)
    split(basename, path, "/")

}

#
# XML processing
#

{
    if (isopen) {
        isopen = 0
        if (length(element) > 0) {
            for (i = parents; i >= 0; i--)
                parent[i+1] = parent[i]
            parents++
        }
    }

    if (NF < 1)
        next

    if ($1 ~ /^[!?]/)
        next

    if (NF > 2) {
        printf("Spurious > after %s.\n", $2) > "/dev/stderr"
        exit(1)
    }

    element = $1
    content = $2

    sub(/^[\t\n\v\f\r ]+/, "", content)
    sub(/[\t\n\v\f\r ]+$/, "", content)
    # To combine all whitespace in content to single spaces, add
    #   gsub(/[\t\n\v\f\r ]+/, " ", content)

    if (element ~ /^\//) {
        sub(/^\/+/, "", element)
        sub(/[\t\n\v\f\r ].*$/, "", element)
        if (parent[1] != element) {
            printf("%s: Element not open.\n", element) > "/dev/stderr"
            exit(1)
        }
        for (i = 1; i < parents; i++)
            parent[i] = parent[i+1]
        delete parent[parents]
        parents--
        next
    }

    if (element ~ /\/$/)
        sub(/\/+$/, "", element)
    else
        isopen = 1

    sub(/[\t\n\v\f\r ].*$/, "", element)
    parent[0] = element
}

# Actual processing starts here. Available:
#   parents         The number of parent elements for current node
#   parent[0]       The current element name
#   parent[1]       The name of closest parent element
#   parent[parents] The name of the root element
#   content         Immediate content following the element.
#                   Does not include content after any child elements,
#                   even if they do belong to the current element.

(parents == 1 && parent[1] == "build" && parent[0] == "description") {
    description = content
    next
}

(parents == 1 && parent[1] == "build" && parent[0] == "result") {
    result = content
    next
}

(parents == 2 && parent[2] == "build" && parent[1] == "culprits" && parent[0] == "string") {
    printf("%s,%s,%s,%s,%s\n", path[1], path[3], description, result, content)
}
Run it using
Code:
./above-script.awk jobs/*/*/*/build.xml
or, if you have too many files for a single command,
Code:
find jobs/ -mindepth 4 -maxdepth 4 -name build.xml -print0 | xargs -r0 ./above-script.awk
to get the expected output.
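The core trick in the script above is that it treats < as the record separator and > as the field separator, so each record is one tag, with the tag name in $1 and the text that follows it in $2. Here is a minimal sketch of just that idea. It uses single-character separators (the script itself uses regular expressions that also swallow the surrounding whitespace), and both the one-line input and the function name split_tags are made up for the example.

```shell
#!/bin/sh
# Split a tiny, made-up XML snippet into one record per tag.
# $1 is the tag name (closing tags keep their leading slash),
# $2 is the text immediately following the tag.
split_tags() {
    awk 'BEGIN { RS = "<"; FS = ">" }
         NF    { printf "%s|%s\n", $1, $2 }'
}

printf '%s' '<build><result>SUCCESS</result></build>' | split_tags
```

This prints one line per tag: build|, result|SUCCESS, /result| and /build|. The full script adds the parent[] stack on top of this so that it knows where in the tree each tag sits.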
 
  

