[SOLVED] How to merge this awk and sed codes in a single one?
Hi to all,
I have written this short code:
Code:
### 1) Print the first field of every line inside the wanted ranges, which excludes the unwanted lines #####
awk '/^Event/,/^$/{print $1};/Children/,/^$/{print $1};/Relative Event Code/,/^$/{print $1}' input |
#### 2) With only the 1st field of the wanted lines left, remove blank lines and lines starting with "--.." or "___..."
sed -e 's/^-.*//;s/^_.*$//;/^$/d' | ## I've tried to use sub(/^-.*/,"",$0) to emulate this sed line
#### 3) With all unwanted lines removed, print the header, then merge all lines into a single one, separated by ","
awk 'BEGIN{print "Event Code|Children Result|Relative Event Code"}
{
    if ( $0~/Event||Children||Relative||.*_.*/ ) # I would like to use an array instead of "$0", but I do not know how to
        printf("%s,", $0)                         # load the array the right way and then manipulate it.
    else
        printf(" %s\n", $0)
}' |
#### 4) After step 3, remove the unneeded strings and at the same time split the line so that
#### each "Event Code" block ends up on its own line
sed -e 's/^Event,//g;s/,Event,/\n/g;s/,Children,/|/g;s/,Relative,/|/g;s/,$//' | # I would like to replace this too
# with sub() so that it can go inside the awk code
#### 5) Print the fields separated by "|"
awk -F"|" '{print $1,$2,$3}' OFS="|" > output
Applied to this input:
Code:
______________________________________________________________________________
Start Relative
Event Code State Date? Event? Events?
-------- ------ --------------- ----- ---------
XYZGF$TY101_Procuct01 ON_HOLD Met No No
______________________________________________________________________________
Start Relative
Event Code State Date? Event? Events?
-------- ------ --------------- ----- ---------
XYZGF$TY102_Evecod01 ON_HOLD No Yes Yes
Result: s(PRQ$MAC111_xiib) and s(PRQ$MAC141_code) and
s(PRQ$MAC131_xiib_pres_sol) and s(PRQ$MAC134_pres_areatyp)
Children Result Current State T/F
---------------- -------------- ---
CORRECT(PRQ$MAC131_xiib_pres_sol) CORRECT OK
CORRECT(PRQ$MAC134_pres_areatyp) CORRECT OK
Relative Event Code Result
------------------ ---------
ABC$MAC101_dept_abc CORRECT(XYZGF$TY102_Evecod01)
______________________________________________________________________________
Start Relative
Event Code State Date? Event? Events?
-------- ------ --------------- ----- ---------
ABC$MAC101_dept_abc ON_HOLD No Yes No
Result: s(XYZGF$TY102_Evecod01)
Children Result Current State T/F
---------------- -------------- ---
CORRECT(XYZGF$TY102_Evecod01) ON_HOLD F
The code gives the output I need, but I would like to learn how to join all the awk parts into a single script, and how to use awk's replacement functions instead of sed, so that everything runs as one awk program.
The comments in the code explain what each part does and what I have tried, without success so far, in order to merge it.
Could somebody help me with this? If the merged code stays close to the original, so much the better.
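As a rough sketch of one way steps 1 and 2 could be folded together (not a tested replacement for the whole pipeline): sub(/^-.*/,"",$0) only empties a line, it does not delete it, so the empty line still gets printed. Skipping the line with `next` is closer to what the sed step does. The function name `filter_fields` and the `keep` flag are made up for this example, and it assumes the three sections are separated by blank lines so that the ranges never overlap.

```shell
# Hypothetical helper: steps 1 and 2 of the pipeline in a single awk call.
filter_fields() {
  awk '
    # Same three ranges as step 1, but each one only raises a flag.
    /^Event/,/^$/              { keep = 1 }
    /Children/,/^$/            { keep = 1 }
    /Relative Event Code/,/^$/ { keep = 1 }
    keep {
      keep = 0
      # Replaces the sed step: skip blank, "---" and "___" lines.
      if ($1 == "" || $1 ~ /^[-_]/) next
      print $1
    }
  ' "$@"
}
```

Called as `filter_fields input`, it should print the same list of first fields that the first awk plus the first sed produce, one per line.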
"Homework" questions are generally discouraged at LQ, but I did attempt to solve your problem. This is my solution (I tested it, and it seems to work):
Here is an alternative that assumes knowledge of other fields in the file:
Code:
#!/usr/bin/awk -f
BEGIN{
    print "Event Code|Children Result|Relative Event Code"
    RS = "_+\n" #Records separated by continuous underscores
}
/Event/{ #Only interested in records that contain the string 'Event'
    rel = child = 0 #Set false for whether or not a child or relative part of record have been read
    for(i=1;i<=NF;i++){
        test = 0 #Set test to false so nothing is printed unless test is true
        if($i ~ /\$/){ #Only interested in fields that contain the dollar symbol (this was inferred from your desired output)
            if($(i+1) ~ /\$/){ #If next field also contains a dollar symbol then we are looking at the relative section
                test++
                rel++
                if(!child) #If no child section has been processed, add a preceding pipe to indicate previous field missing
                    $i = "|"$i
                $i = $i"\n" #Always add newline after relative section
            }
            else if($i ~ /CORRECT\(/ && $(i+1) ~ /^(ON_HOLD|CORRECT)$/){ #Current field contain 'CORRECT(' string and next field contains strings shown, then it is a child section
                test++
                child++
                if($(i+3) ~ /CORRECT/) #If third field from current also contains string 'CORRECT', then there is another child so use comma else pipe
                    $i = $i","
                else
                    $i = $i"|"
            }
            else if($(i+1) == "ON_HOLD"){ #Assumes all Events will be 'ON_HOLD', inferred from input file
                test++
                $i = $i"|"
            }
        }
        if(test)
            printf("%s", $i)
    }
    if(!(rel || child)) #If neither child or rel were entered then print extra pipe to indicate missing fields
        print "|"
    else if(!rel && child)
        print ""
}
Last edited by grail; 03-13-2011 at 03:13 AM.
Reason: To add comments to further explain code
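To see what the RS = "_+\n" setting does on its own, here is a minimal sketch. A regular expression as RS is an extension (gawk and mawk support it; POSIX leaves a multi-character RS unspecified), and the function name `list_records` is invented for this example. With the default FS, newlines inside a record count as ordinary field separators, which is why the script above can walk a whole block with a single for loop over the fields.

```shell
# Hypothetical helper: show how a regex RS turns each block into one record.
list_records() {
  awk '
    BEGIN { RS = "_+\n" }   # a run of underscores ending a line closes a record
    NF {                    # the empty record before the first rule has NF == 0
      printf "record %d: %d fields, first=%s\n", ++n, NF, $1
    }
  ' "$@"
}
```

Running `list_records input` reports one record per underscore-delimited block, together with the number of whitespace-separated fields in it.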
Thanks for your help and time. Really, believe me, this is not homework. I'm not a student, nor even a real programmer, only a more or less self-taught awk/sed enthusiast. This is another person's question; I'm only trying, with my scarce programming knowledge, to help that person, as others have helped me before.
Well, your code works for me too, but I'm a little lost with it. I understand the kind of matrix "transposition" you do in the BEGIN block by playing with RS and FS.
Could you explain a little how it works when you define variables as read_XXXX=Y? Why do you assign =0, =1, =2, =3 to those variables? And how is it possible to use those variables further down without the "read_" part?
If it is not too much, could you explain how one of the if statements works? Then I'll try to work out the others by analogy :-)
I'm here to learn from you experts and to help when I can :-), so I'm a little lost with how your code works too.
So as not to be a nuisance, maybe you could explain just the part I find most complicated: how to merge the lines below the "Children Result" block.
Thanks again for all your help.
Best regards
I just fixed some mistakes I made. Here is the new script:
When it finds a line that starts with Event Code, it clears the event, child_results, and rel_code variables (used to hold the event code, child results, and relative event code respectively). Then it sets read_next (which tells my script what it is looking for next) to my "constant" read_event, so that the first thing on the next line is stored as the event. The "continue" then goes to the next loop iteration (and therefore to the next field in the record). Further down, instead of stopping reading the event as soon as one is found, it would probably be better to stop when a blank line is encountered. However, the script currently skips blank lines, so you would need to tweak that first.
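The read_next idea can be sketched in a much smaller, line-based form. This is not the script discussed in this thread, only an illustration of the pattern: read_none, the numeric values, the function name `extract_events`, and the rule that skips dashed lines are all assumptions made for the example. The point is that read_event is a label for a state, while event is a separate variable holding the data.

```shell
# Hypothetical helper: print the event code found under each "Event Code" header.
extract_events() {
  awk '
    BEGIN {
      # "Constants": ordinary variables whose numbers only label a state.
      read_none = 0; read_event = 1
      read_next = read_none
    }
    /^Event Code/ {            # header line: reset and remember what to read next
      event = ""
      read_next = read_event
      next
    }
    /^-/ { next }              # skip the dashed rule under the header
    read_next == read_event {  # first data line after the header
      event = $1
      read_next = read_none
      print event
    }
  ' "$@"
}
```

The numbers themselves carry no meaning; any distinct values would do, because they are only ever compared with read_next.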
Feel free to ask any other questions about my script. I'll try to fix my other mistake soon too.
To explain how my code merges the child result lines:
When the program encounters a Children Result line, it sets read_next to read_child_result to tell the script that the following lines (up to a blank line) should be added to the child_results variable. Then it sets first_child_result to 1 (true) to let the script know that the next line will be the first child result.
When the script reads the lines following the Children Result line, it first checks to see if first_child_result is true. If it is, it sets the variable child_results to the first thing on the line (child_results = splitline[1]), and then sets first_child_result to 0 (false). If it is not true, the script sets child_results to the concatenation of the current value of child_results, a comma, and the first thing on the line (child_results = child_results "," splitline[1]).
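The merging step described above can be sketched as follows. It reuses the names read_next, read_child_result, first_child_result, child_results and splitline from the explanation, but the surrounding rules (skipping dashed lines, printing at a blank line or at end of input, the value 2, and the function name `join_children`) are assumptions made to keep the example self-contained.

```shell
# Hypothetical helper: join the first field of each child line with commas.
join_children() {
  awk '
    BEGIN { read_none = 0; read_child_result = 2; read_next = read_none }
    /^Children Result/ {       # header: the following lines are child results
      read_next = read_child_result
      first_child_result = 1
      next
    }
    /^-/ { next }              # skip the dashed rule under the header
    /^$/ {                     # a blank line closes the section
      if (read_next == read_child_result) print child_results
      read_next = read_none
      next
    }
    read_next == read_child_result {
      split($0, splitline)     # splitline[1] is the first thing on the line
      if (first_child_result) {
        child_results = splitline[1]
        first_child_result = 0
      } else
        child_results = child_results "," splitline[1]
    }
    END { if (read_next == read_child_result) print child_results }
  ' "$@"
}
```

The first_child_result flag is what keeps a stray comma from appearing before the first entry.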
Many thanks to both of you for your great and complete help. Your code works perfectly and is a great example of how and when to use for() and if() else statements. I think I have to step through the code to understand it better than the partial picture I have so far from just reading it.
My last question about this is:
- How is it possible to define a variable and later use only part of its name? I mean, the defined variable is read_event, but further down the code sometimes uses just event, without "read_".
- What is the purpose of variable names like _varname, with a "_" at the beginning?
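A small sketch that may help with both questions. In awk a variable name is simply a run of letters, digits and underscores that does not start with a digit, so read_event and event are two unrelated variables that happen to share some letters. A leading underscore has no special meaning to awk; it is presumably just a naming convention, often used to mark temporary or internal variables. The function name `show_names` and the variable _tmp are made up for this example.

```shell
# Hypothetical helper: three independent variables with similar-looking names.
show_names() {
  awk 'BEGIN {
    read_event = 1          # one variable, used as a state label
    event = "XYZ"           # a different, unrelated variable holding data
    _tmp = read_event + 1   # the leading underscore is just part of the name
    print read_event, event, _tmp
  }'
}
```

Changing event never touches read_event, and the other way round.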