Please Help with AWK code to parse XML messages
Hi guys,
Can I please get some help with this code? I have an XML feed file, which is a rapidly changing temporary file, and I need to capture its content as soon as data arrives. Example of the data Quote:
Quote:
This is the awk code that I have so far, but it doesn't do what I need. All I want the code to do is run for 2 minutes, process the counts, write them to the output, then repeat the same process again and again. Code:
Will anyone be able to help me with this? Any help would be greatly appreciated. James |
Can it be a shell script?
Also, what are the numbers at the end of the output? |
Yes, it can be a shell script.
The numbers at the end are counts for each age, so if there are 2 males of age 34, then instead of writing male,34,1 twice, it's easier to have male,34,2. Thanks |
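For illustration, here is a minimal sketch of that kind of aggregation. It assumes the gender and age have already been extracted into plain "gender,age" lines, which is a hypothetical intermediate format and not the real XML feed; count_pairs is likewise just an illustrative name.

```shell
#!/bin/sh
# Hypothetical input: one "gender,age" pair per line (an assumption;
# the real feed is XML and would need to be parsed into this form first).
count_pairs() {
    awk -F, '
        { total[$1 "," $2]++ }                         # tally each gender,age pair
        END { for (k in total) print k "," total[k] }  # emit gender,age,count
    ' | sort                                           # awk array order is unspecified
}

printf 'male,34\nfemale,29\nmale,34\n' | count_pairs
```

The two male,34 lines collapse into a single line with a count of 2, instead of being written out twice with a count of 1.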
Below is a shell script that should help you out.
data.txt has Code:
cat data.txt Code:
#!/bin/bash Quote:
|
Please show an exact format for date + time?
Assuming the file is always the same format (and not currently including date + time) the following works: Code:
awk -F"[<>]+" '{gsub(/^.*="|"$/,"",$(NF-1));gsub(/^.*="|"$/,"",$5);total[$5,$(NF-1)]++}END{for( x in total)print x,total[x]}' file |
@cbtshare,
This is a good way for me to start, but the only problem is that I am reading the data from a rapidly changing ksh file. I am using a pipe to read the ksh file, and what I want is to read from the pipe every 2 minutes and write to an output file. Is this something that can be done using a shell script? Also, is there a way to add the counts to the loop? Thank you all again |
I assume you mean you are reading the output from a ksh file, not reading the ksh prog file.
Where does the 2 mins thing come from? Does the ksh prog produce a new file every 2 mins? Does it output for 2 mins then overwrite the same file? In either case, synchronisation is key to avoid losing data. In either case (or even if this is a continuous stream being output, e.g. like a logfile), I would highly recommend http://search.cpan.org/~mgrabnar/Fil...0.99.3/Tail.pm which is designed to handle those situations. I've used it myself; very handy. :) |
Quote:
Initialise with count=0 and, to increase the count, use let "count+=1". You can also use the sleep command anywhere you want to pause the script. |
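As a rough sketch of a counter combined with a pause, assuming a plain POSIX shell: run_batches and its two arguments are hypothetical names used only for this example. It uses sleep for the pause, since wait only waits for background jobs to finish.

```shell
#!/bin/sh
# run_batches N PAUSE: hypothetical helper that runs N passes,
# pausing PAUSE seconds after each one.
run_batches() {
    count=0
    while [ "$count" -lt "$1" ]; do
        count=$((count + 1))    # increment the counter (portable form of let "count+=1")
        echo "pass $count"      # placeholder for the real per-pass work
        sleep "$2"              # pause the script; wait would not pause here
    done
}

run_batches 3 1
```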
@chrism01,
Where does the 2 mins thing come from? Quote:
The ksh file produces a new message every couple of seconds, and each new message overwrites the previous one. And yes, you are right, I want to avoid losing data. Quote:
@cbtshare, I can't use the cron scheduler, and this is why I am not sure how I could solve this issue. All, please help. Thank you all again, James |
Quote:
In that case, I don't get the 2 mins thing at all. You've got to grab each msg immediately or you will lose it... So, you do need to use something like that Perl module or e.g. Code:
tail -f output_file | your post-processing prog
Going back to the bash solution, maybe instead of having the ksh file write to the ever-changing file, just pipe the output directly, thus: Code:
ksh_prog | post-process_prog |
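One possible shape for such a post-processing program is sketched below. It reads every message as it arrives, so nothing is lost, but only writes a timestamped summary once the interval has elapsed. It assumes the input has already been reduced to "gender,age" lines, a hypothetical format rather than the real XML, and summarise is an illustrative name. It calls the external date command instead of relying on awk time functions, which not every awk provides.

```shell
#!/bin/sh
# summarise [interval_seconds]
# Reads "gender,age" lines from stdin (hypothetical format) and prints
# "timestamp,gender,age,count" lines each time the interval has elapsed,
# plus a final summary at end of input.
summarise() {
    awk -F, -v interval="${1:-120}" '
        function now(   t, cmd) {
            cmd = "date +%s"; cmd | getline t; close(cmd)
            return t + 0
        }
        function flush_counts(   k, stamp, cmd) {
            cmd = "date \"+%Y-%m-%d %H:%M:%S\""
            cmd | getline stamp; close(cmd)
            for (k in total) print stamp "," k "," total[k]
            split("", total)    # portable way to empty the array
            fflush()            # push the summary out immediately
        }
        BEGIN { start = now() }
        {
            total[$1 "," $2]++                  # count every message as it arrives
            if (now() - start >= interval) {    # interval over: write and reset
                flush_counts()
                start = now()
            }
        }
        END { flush_counts() }                  # write whatever is left at EOF
    '
}

# Possible use (not run here): ksh_prog | summarise 120 >> summary.log
printf 'male,34\nfemale,29\nmale,34\n' | summarise 120
```

Two limitations of this sketch: the interval is only checked when a line arrives, so a quiet feed delays the summary, and running date for every line is slow on a busy feed.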
Thank you all for replying, but I think I haven't explained myself well.
I have a .ksh file which contains XML messages. What I need is to parse and capture these XML messages, then store the output in a log file. The two-minute interval is something I came up with because the log file could become very large if every message's output is written to it. But if I collect the messages for 2 minutes, then I will be able to get summary output like this example: Quote:
Can anyone please help with these questions? How can I parse the XML messages using this Perl code (the link)? After this, how can I save the output to a log file? Any advice will be appreciated. Thank you all again, James |
Hi guys,
Can someone please help with this issue? Thank you all |
Guys,
I don't want to bump my thread, but can I please get help with this problem? Thanks, James |
I am not sure I understand your current issue. You have been presented with code to parse the XML and retrieve data. Redirecting this into a new file should be trivial.
Are you able to explain where you are now stuck? |
@grail,
Which code are you referring to? If you are referring to the bash code, it does almost the same as my awk code. I can parse and retrieve the data, then redirect it to an output file. But the only problem with awk is that it reads the whole file at once, and what I want is to read part of the file each minute or so. If this is not possible, then I would parse the messages and log the output to a file, but this should include the current time and a count of the messages. For the Perl link, I am getting this error, and I am not sure how this should parse XML messages. Quote:
Thanks again James |