LinuxQuestions.org
Old 02-11-2011, 05:21 AM   #1
saurabhmehan
Member
 
Registered: Jul 2010
Posts: 44

Rep: Reputation: 0
Question Help in Parsing data


I have the string below:

Code:
Transaction_ID:SDP-DM-151204679 , Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30 , Transaction_Mode:WAP , Circle_ID:4 , Circle_Name:BJ ,Zone: , CustomerID:B_31563486 , MSISDN:7870904329 , IMSI:405876122068099 , IMEI: , Sub_Profile:Pre-Paid , CPID:Nazara , CPNAME:Nazara , Content_ID:NA , Content_Name:Java%20Games , Base_Price:50.0 , Charge_Code:code50 , Content_Price:50 , Content_Status: , Content_Type:Games , Other_Info:] , Static_ID:BJ#25848082 , Original_Content_Owner_ID:Nazara , External_Correlation_Id:2011021023585279271 , Product_Name: , Sender_MSISDN:56363 , Subscription_Channel: , Subscription_Type: , Location:BJ , PPL_FLAG:TRUE , CustomCDRInterceptor - CDR Info[Optional_Field1: , Optional_Field2: ,


I need output like this:

Code:
SDP-DM-151204679,2011-02-11 00:00:15 GMT+05:30,WAP,4,BJ,,B_31563486,7870904329,405876122068099,,Pre-Paid,Nazara,Nazara,NA,Java%20Games,50.0,code50,50,,Games,,BJ#25848082,Nazara,2011021023585279271,,56363,,,BJ,TRUE,,,
 
Old 02-11-2011, 05:47 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,352

Rep: Reputation: 1041
Show us what you have attempted - we can't (won't) write it for you.
Help maybe.
 
Old 02-11-2011, 05:53 AM   #3
saurabhmehan
Member
 
Registered: Jul 2010
Posts: 44

Original Poster
Rep: Reputation: 0
Question Parsing problem

I am trying the following code, but it does not work as expected:
Code:
awk -F"," '{print split ( $0, s, ":" ); print $2}' filename
Quote:
Originally Posted by syg00 View Post
Show us what you have attempted - we can't (won't) write it for you.
Help maybe.
 
Old 02-11-2011, 06:21 AM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,508

Rep: Reputation: 1957
Quote:
Originally Posted by saurabhmehan View Post
awk -F"," '{print split ( $0, s, ":" ); print $2}' filename
This does not handle the Transaction_DateTime field correctly, because its value contains multiple colons. You also have to deal with the extra spaces before and after the commas. Something like this should do the trick:
Code:
awk 'BEGIN{RS=ORS=","}!/\n/{if ( $0 ~ /Transaction_DateTime/ ) $0 = gensub(/Transaction_DateTime:(.*)/,"\\1","g"); else $0 = gensub(/^.*:]*(.*)/,"\\1","g"); gsub(/^ +| +$/,""); print}END{printf "\n"}' file
This also takes into account the extra bracket in the "Other_Info:]" field. I'm not sure whether it is a typo, but either way the solution does not change. Note that gensub is a gawk extension, so this needs GNU awk. Hope this helps.
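To make the multiple-colon problem concrete, here is a small sketch on just the Transaction_DateTime pair. It is only an illustration, not part of the command above: the variable names are made up, and it uses index/substr instead of gensub so it should run on any awk.

```shell
# Hypothetical sample: one key:value pair taken from the original string.
field='Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30'

# Splitting on every colon cuts the value itself into pieces.
pieces=$(printf '%s\n' "$field" | awk '{ print split($0, s, ":") }')

# Cutting only at the first colon keeps the value in one piece.
value=$(printf '%s\n' "$field" | awk '{ print substr($0, index($0, ":") + 1) }')

echo "pieces after split on ':': $pieces"
echo "value after first colon:   $value"
```

The split count is larger than 2, which is why taking "the second piece" loses part of the timestamp.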
 
Old 02-11-2011, 07:56 AM   #5
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,562

Rep: Reputation: 1939
Or something like:
Code:
awk '{print $2}' FS="[]:]" ORS="," RS="[ \t]*,[ \t]*" file
If you want the output to end with a newline, you can add an END block that prints one.
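A minimal sketch of such an END block, on a made-up three-pair sample rather than the real data. The function name is hypothetical, and I use a plain comma for RS and -v assignments here so the sketch does not depend on regex record separators.

```shell
# Hypothetical helper: print the value of each key:value pair, comma-terminated,
# then finish the output with a newline.
values_with_newline() {
  awk -v FS=":" -v RS="," -v ORS="," '
    { print $2 }          # print appends ORS, i.e. a comma
    END { printf "\n" }   # printf ignores ORS, so this emits a real newline
  '
}

# Made-up sample input (no trailing newline, to keep the last record clean).
result=$(printf 'a:1,b:2,c:3' | values_with_newline)
lines=$(printf 'a:1,b:2,c:3' | values_with_newline | wc -l)

echo "output: $result"
echo "newline-terminated lines: $lines"
```

printf is used in END because a plain print would append ORS (a comma) instead of a newline.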
 
Old 02-11-2011, 08:37 AM   #6
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,508

Rep: Reputation: 1957
Quote:
Originally Posted by grail View Post
Or something like:
Code:
awk '{print $2}' FS="[]:]" ORS="," RS="[ \t]*,[ \t]*" file
If you wish to end on a new line you may wish to add an END command to put it in.
Yes, it would be easy. But, as I already mentioned, the problem is with the extra colons in the Transaction_DateTime field. Your code prints only $2, so everything after the first colon inside the value is lost: the minutes, the seconds and the rest of the timezone end up in $3, $4 and $5. Anyway, I'm sure you can find a more compact version of my code in post #4.
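A quick way to see where the pieces go is to dump every field awk produces for that one pair. This is my own sketch: the sample holds only the DateTime pair and the variable names are made up.

```shell
# Sample record: just the DateTime pair from the original string.
field='Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30'

# Print every field awk produces with FS="[]:]".
printf '%s\n' "$field" |
  awk -F'[]:]' '{ for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i }'

# What {print $2} alone would keep, and how many fields there are in total.
second=$(printf '%s\n' "$field" | awk -F'[]:]' '{ print $2 }')
nf=$(printf '%s\n' "$field" | awk -F'[]:]' '{ print NF }')

echo "kept by print \$2: $second (out of $nf fields)"
```

On this sample $2 holds only the date and the hour, so the rest of the timestamp never reaches the output.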
 
Old 02-11-2011, 09:20 AM   #7
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,562

Rep: Reputation: 1939
Sorry ... good point ... how about:
Code:
awk 'sub(/^[^:]+:[]]?/,"")' ORS="," RS="[ \t]*,[ \t]*" file
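The one-liner above relies on RS being treated as a regular expression, which gawk and mawk accept. If your awk only honours a single-character RS (an assumption about your environment, not something reported in this thread), a rough line-based sketch of the same idea could look like this. The function name and the shortened sample are made up for illustration.

```shell
# Hypothetical helper: for each input line, strip "key:" (and an optional "]")
# from every comma-separated pair and re-join the values with commas.
strip_keys() {
  awk -F',' '{
    out = ""
    for (i = 1; i <= NF; i++) {
      f = $i
      gsub(/^[ \t]+|[ \t]+$/, "", f)   # trim blanks around the pair
      sub(/^[^:]*:[]]?/, "", f)        # drop the key, the first colon and an optional "]"
      out = out (i > 1 ? "," : "") f
    }
    print out
  }'
}

# Shortened sample in the same shape as the original string.
sample='Transaction_ID:SDP-DM-151204679 , Transaction_DateTime:2011-02-11 00:00:15 GMT+05:30 , Other_Info:] , Zone: , PPL_FLAG:TRUE ,'
result=$(printf '%s\n' "$sample" | strip_keys)

echo "$result"
```

Because it works line by line, each input line gives one output line that already ends with a newline, and the trailing comma of the input is kept as an empty last value.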
 
2 members found this post helpful.
Old 02-11-2011, 01:32 PM   #8
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,508

Rep: Reputation: 1957
Quote:
Originally Posted by grail View Post
Sorry ... good point ... how about:
Code:
awk 'sub(/^[^:]+:[]]?/,"")' ORS="," RS="[ \t]*,[ \t]*" file
Awesome!
 
1 member found this post helpful.
Old 02-15-2011, 09:14 AM   #9
saurabhmehan
Member
 
Registered: Jul 2010
Posts: 44

Original Poster
Rep: Reputation: 0
Cool Thanks

Thanks Grail.
Quote:
Originally Posted by grail View Post
Sorry ... good point ... how about:
Code:
awk 'sub(/^[^:]+:[]]?/,"")' ORS="," RS="[ \t]*,[ \t]*" file
 
Old 02-15-2011, 09:50 AM   #10
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,562

Rep: Reputation: 1939
No probs ... remember to mark as SOLVED once you have a solution.
 
  


All times are GMT -5.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration