LinuxQuestions.org - [SOLVED] Help in Parsing data

Help in Parsing data

I have the string below:

Code:



Transaction_ID:SDP-DM-151204679 , Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30 , Transaction_Mode:WAP , Circle_ID:4 , Circle_Name:BJ ,Zone: , CustomerID:B_31563486 , MSISDN:7870904329 , IMSI:405876122068099 , IMEI: , Sub_Profile:Pre-Paid , CPID:Nazara , CPNAME:Nazara , Content_ID:NA , Content_Name:Java%20Games , Base_Price:50.0 , Charge_Code:code50 , Content_Price:50 , Content_Status: , Content_Type:Games , Other_Info:] , Static_ID:BJ#25848082 , Original_Content_Owner_ID:Nazara , External_Correlation_Id:2011021023585279271 , Product_Name: , Sender_MSISDN:56363 , Subscription_Channel: , Subscription_Type: , Location:BJ , PPL_FLAG:TRUE , CustomCDRInterceptor - CDR Info[Optional_Field1: , Optional_Field2: ,

I need output like this:

Code:



SDP-DM-151204679,2011-02-11 00:00:15 GMT+05:30,WAP,4,BJ,,B_31563486,7870904329,405876122068099,,Pre-Paid,Nazara,Nazara,NA,Java%20Games,50.0,code50,50,,Games,,BJ#25848082,Nazara,2011021023585279271,,56363,,,BJ,TRUE,,,

Show us what you have attempted - we can't (won't) write it for you.
Help maybe.

I am trying the following code, but it does not work as expected:

Code:

awk -F"," '{print split ( $0, s, ":" ); print $2}' filename

Quote:

Originally Posted by syg00 (Post 4254977)

Show us what you have attempted - we can't (won't) write it for you.
Help maybe.

Quote:

Originally Posted by saurabhmehan (Post 4254984)

awk -F"," '{print split ( $0, s, ":" ); print $2}' filename

This does not correctly handle the Transaction_DateTime field, whose value contains multiple colons. You also have to deal with the extra spaces before and after the commas. Something like this should do the trick:

Code:

awk 'BEGIN{RS=ORS=","}!/\n/{if ( $0 ~ /Transaction_DateTime/ ) $0 = gensub(/Transaction_DateTime:(.*)/,"\\1","g"); else $0 = gensub(/^.*:]*(.*)/,"\\1","g"); gsub(/^ +| +$/,""); print}END{printf "\n"}' file

This also takes into account the extra bracket in the "Other_Info:]" field. I'm not sure whether it is a typo, but the solution is the same either way. Note that gensub() is a gawk extension, so this needs GNU awk. Hope this helps.
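For anyone without gawk, here is a minimal sketch of the same idea in plain awk: split the line on commas, trim each piece, and keep everything after the first colon only, so the colons inside the timestamp survive. The shortened `sample` string and the `parse_record` function name are made up for illustration; they are not from the original posts.

```shell
#!/bin/sh
# Hypothetical shortened sample, cut down from the record in the first post.
sample='Transaction_ID:SDP-DM-151204679 , Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30 , Transaction_Mode:WAP ,Zone: , Other_Info:] ,'

# parse_record (illustrative name): reads one record per line on stdin.
parse_record() {
    awk '{
        n = split($0, part, ",")
        out = ""
        for (i = 1; i <= n; i++) {
            f = part[i]
            gsub(/^[ \t]+|[ \t]+$/, "", f)      # trim spaces around the field
            if (i == n && f == "") break        # nothing after the trailing comma
            p = index(f, ":")                   # position of the FIRST colon only
            v = (p > 0) ? substr(f, p + 1) : f  # keep the rest, extra colons included
            sub(/^[]]/, "", v)                  # drop the stray "]" in Other_Info
            out = out v ","
        }
        print out
    }'
}

printf '%s\n' "$sample" | parse_record
```

Using index() and substr() rather than a colon field separator is what keeps the full "2011-02-11 00:00:15 GMT+05:30" value in one piece.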

Or something like:

Code:

awk '{print $2}' FS="[]:]" ORS="," RS="[ \t]*,[ \t]*" file

If you want the output to end with a newline, add an END block that prints one.
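A small sketch of that END idea, assuming a single-character RS so it also runs on awks without regex record separators. The `sample` string and the `with_newline` name are only for illustration. It splits each record on ":" and prints the second piece, so it is fine for these simple fields but would truncate a value that itself contains colons.

```shell
#!/bin/sh
# Hypothetical three-field sample with no colons inside the values.
sample='Circle_ID:4 , Circle_Name:BJ ,Zone: ,'

# with_newline (illustrative name): comma-separated values plus a final newline.
with_newline() {
    awk 'BEGIN { RS = ORS = "," }
         /:/   { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "")   # trim the record
                 split($0, kv, ":"); print kv[2] }        # value after the colon
         END   { printf "\n" }'                           # finish the output line
}

printf '%s\n' "$sample" | with_newline
```

The /:/ pattern skips the leftover record after the last comma, which holds only the input's newline.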

Quote:

Originally Posted by grail (Post 4255074)

Or something like:

Code:

awk '{print $2}' FS="[]:]" ORS="," RS="[ \t]*,[ \t]*" file

If you wish to end on a new line you may wish to add an END command to put it in.

Yes, it would be easy. But, as I already mentioned, the problem is the extra colons in the Transaction_DateTime field. Your code drops the minutes, seconds and timezone offset, since they end up in $3, $4 and $5. Anyway, I'm sure you can come up with a more compact version of my code in post #4. ;)
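To see the truncation for yourself, this quick sketch feeds just the timestamp field through the same FS and prints every resulting field. The `field` variable and the `show_split` name are illustrative only.

```shell
#!/bin/sh
# The problem field on its own, copied from the sample record.
field='Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30'

# show_split (illustrative name): list each field produced by FS="[]:]".
show_split() {
    awk -F'[]:]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

printf '%s\n' "$field" | show_split
```

The value is cut at every colon, so printing $2 alone gives only the date and the hour.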

Sorry ... good point ... how about:

Code:

awk 'sub(/^[^:]+:[]]?/,"")' ORS="," RS="[ \t]*,[ \t]*" file

Quote:

Originally Posted by grail (Post 4255159)

Sorry ... good point ... how about:

Code:

awk 'sub(/^[^:]+:[]]?/,"")' ORS="," RS="[ \t]*,[ \t]*" file

Awesome! :)

Thanks Grail.

No probs ... remember to mark as SOLVED once you have a solution.