LinuxQuestions.org
Old 07-31-2008, 02:45 AM   #16
burschik
Member
 
Registered: Jul 2008
Posts: 159

Rep: Reputation: 31

I agree that sed is cumbersome for large tasks, but this is a very small task. And the sed solution does seem to be quite a bit more concise than the others.
 
Old 07-31-2008, 02:46 AM   #17
radoulov
Member
 
Registered: Apr 2007
Location: Milano, Italia/Варна, България
Distribution: Ubuntu, Open SUSE
Posts: 212

Rep: Reputation: 38
Quote:
Originally Posted by burschik
I agree that sed is cumbersome for large tasks, but this is a very small task. And the sed solution does seem to be quite a bit more concise than the others.

Sure,
but consider what happens if you have to add more patterns like this ...

Quote:
If field starts with ...
 
Old 07-31-2008, 02:56 AM   #18
lmedland
LQ Newbie
 
Registered: Jun 2008
Location: England
Posts: 21

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by radoulov
This is for GNU Awk (otherwise you should escape the new lines in the ternary operator):

Code:
awk>new.csv -F', *' 'BEGIN {
  n = split("N C L", t, OFS)
  while (++i <= n) tt[t[i]] = sprintf("%02d", i)
  c = "01"
  }
NR == 1 { $(NF + 1) = "New code"; print; next }
{ $(NF + 1) = $NF ~ /^[NCL].*/ ? 
    tt[substr($NF, 1, 1)] substr($NF, 2, 3) c : 
      $NF }
1'  OFS=', ' filename
Sorry, for some reason I missed this post!!!

I tried this, but it inserts an extra newline before the new field. How do I correct this?

Here is the output:

Code:
Patient No, Balance, Payor ID

 ,New code
8388, 13, NBUP

, 01BUP01
8526, 315, 8526

, 8526

8550, 464.65, NBUP

, 01BUP01
Thank you
 
Old 07-31-2008, 03:14 AM   #19
radoulov
Member
 
Registered: Apr 2007
Location: Milano, Italia/Варна, България
Distribution: Ubuntu, Open SUSE
Posts: 212

Rep: Reputation: 38
Hm, is this MS Windows?
Try changing the FS:

from:

Code:
-F', *'
to

Code:
-F', '
 
Old 07-31-2008, 03:25 AM   #20
gnashley
Amigo developer
 
Registered: Dec 2003
Location: Germany
Distribution: Slackware
Posts: 4,928

Rep: Reputation: 612
This whole exercise looks a lot like homework to me...
 
Old 07-31-2008, 03:26 AM   #21
lmedland
LQ Newbie
 
Registered: Jun 2008
Location: England
Posts: 21

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by radoulov
Hm, is this MS Windows?
Try changing the FS:

from:

Code:
-F', *'
to

Code:
-F', '
Nope, Ubuntu. I'll give it a go, thanks again.
 
Old 07-31-2008, 03:42 AM   #22
lmedland
LQ Newbie
 
Registered: Jun 2008
Location: England
Posts: 21

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by gnashley
This whole exercise looks a lot like homework to me...
I can assure you it's not. I'm 27 and work in an IT department of just one... me... to do everything!

I needed to manipulate some data extracts from legacy systems as we can't afford to do a data cleanse.
 
Old 07-31-2008, 03:44 AM   #23
lmedland
LQ Newbie
 
Registered: Jun 2008
Location: England
Posts: 21

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by radoulov
Hm, is this MS Windows?
Try changing the FS:

from:

Code:
-F', *'
to

Code:
-F', '
Sorry, my mistake: the original CSV file was generated under Excel, so you're right, it was Windows.

I have since loaded it into OpenOffice and exported again and the routine above works. Thank you.
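
A note for anyone who hits the same symptom: a CSV saved on Windows usually ends each line with a carriage return plus a newline, and awk on Linux keeps that carriage return as part of the last field, which is the likely cause of the broken lines shown above. Re-exporting the file is one fix. A minimal sketch of another, assuming the trailing carriage return is the only problem, strips it inside awk before the fields are used. The function name strip_cr is hypothetical and not from this thread.

```shell
# Hypothetical helper: remove one trailing carriage return from each line.
strip_cr() {
  awk '{ sub(/\r$/, ""); print }' "$@"
}

# Usage: feed it two lines with Windows (CRLF) line endings.
printf 'Patient No, Balance, Payor ID\r\n8388, 13, NBUP\r\n' | strip_cr
```

The cleaned output can then be piped into the awk routine quoted above.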
 
Old 07-31-2008, 04:33 AM   #24
burschik
Member
 
Registered: Jul 2008
Posts: 159

Rep: Reputation: 31
Quote:
Originally Posted by radoulov
Sure,
but consider what happens if you have to add more patterns like this ...
Well, of course the thought had occurred to me, but the OP did not state that the number of patterns to process might increase. So I decided to go for a one-liner rather than a complete, modular, extensible solution. And for a one-liner, sed is perfectly valid, possibly even optimal.
 
Old 07-31-2008, 05:01 AM   #25
radoulov
Member
 
Registered: Apr 2007
Location: Milano, Italia/Варна, България
Distribution: Ubuntu, Open SUSE
Posts: 212

Rep: Reputation: 38
I agree and I must confess that usually I have the same approach (quick and dirty, but efficient).
 
Old 07-31-2008, 09:56 AM   #26
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
I like to work with fields instead of regexp, especially for structured data.
Code:
awk 'BEGIN{OFS=FS=", "
 c["N"]="01"
 c["C"]="02"
 c["L"]="03"
}
NR==1{print}
NR>1{ 
 print $1,$2, c[substr($3,1,1)] substr($3,2) "01"
}
' file
Tell me honestly, would you want to read this, or one long line of regexp?
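
One thing to watch with the lookup-table approach: a field that does not start with N, C or L has no entry in the table, so the prefix comes out empty. Judging by the output quoted earlier in the thread, the goal appears to be an extra "New code" column that copies such values unchanged. A sketch of that variant, under that assumption, follows. The function name add_code_column and the array name code are illustrative only.

```shell
# Hypothetical wrapper around a lookup-table awk script.
add_code_column() {
  awk 'BEGIN {
    OFS = FS = ", "
    code["N"] = "01"; code["C"] = "02"; code["L"] = "03"
  }
  # Header line: append the new column title.
  NR == 1 { print $0, "New code"; next }
  {
    first = substr($NF, 1, 1)
    # Known prefix: replace the letter with its code and append "01".
    # Anything else: copy the last field unchanged.
    if (first in code) print $0, code[first] substr($NF, 2) "01"
    else print $0, $NF
  }' "$@"
}

# Usage with sample rows taken from the thread.
printf 'Patient No, Balance, Payor ID\n8388, 13, NBUP\n8526, 315, 8526\n' | add_code_column
```

Supporting another prefix only takes one more entry in the code array.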
 
Old 07-31-2008, 10:44 AM   #27
burschik
Member
 
Registered: Jul 2008
Posts: 159

Rep: Reputation: 31
Quote:
Originally Posted by ghostdog74
I like to work with fields instead of regexp, especially for structured data.
Code:
awk 'BEGIN{OFS=FS=", "
 c["N"]="01"
 c["C"]="02"
 c["L"]="03"
}
NR==1{print}
NR>1{ 
 print $1,$2, c[substr($3,1,1)] substr($3,2) "01"
}
' file
Tell me honestly, would you want to read this, or one long line of regexp?
I'm not denying your program is more readable and more maintainable. I merely object to your claim that sed is unsuitable for the task. Moreover, "one long line of regexp" could also be written like this:

Code:
s/, N\([A-Z]\+\)/\0, 01\101/
s/, C\([A-Z]\+\)/\0, 02\101/
s/, L\([A-Z]\+\)/\0, 03\101/
Now, that also looks pretty readable to me.
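
The script above relies on two GNU sed extensions, \+ and \0. As a sketch for other sed implementations, assuming the same three prefixes and uppercase-only codes, the same substitutions can be written with [A-Z][A-Z]* and & instead. The function name add_codes_sed is hypothetical.

```shell
# Hypothetical wrapper: same three substitutions in portable sed syntax.
# "&" is the whole match; "\1" is the captured letters after the prefix.
add_codes_sed() {
  sed -e 's/, N\([A-Z][A-Z]*\)/&, 01\101/' \
      -e 's/, C\([A-Z][A-Z]*\)/&, 02\101/' \
      -e 's/, L\([A-Z][A-Z]*\)/&, 03\101/' "$@"
}

# Usage with sample rows taken from the thread.
printf '8388, 13, NBUP\n8526, 315, 8526\n' | add_codes_sed
# prints:
# 8388, 13, NBUP, 01BUP01
# 8526, 315, 8526
```

Unlike the awk versions, this leaves the header and non-matching rows untouched rather than adding a column to them.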
 
  


Tags
awk, condition, csv, gawk




All times are GMT -5.
