Old 10-02-2012, 10:14 AM   #1
grob115
Member
 
Registered: Oct 2005
Posts: 542

Rep: Reputation: 32
Regex or Sed


Hi, I need to perform a simple substitution: replace whitespace with a character, but only within a specific, identifiable part of each line. For example, if I have the following:
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street


I need to transform it to the following (not in Excel but with command-line tools, please) by adding hyphens to the address field. The replacement character can be anything; the hyphen is just an example.
Name, Sex, Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street


I need the flexibility of specifying some kind of pattern to frame where the replacement takes place. In this case:
Start Pattern = ^.*,\s(M|F),\s
Stop Pattern = $

Can someone please show me how this can be done? Thanks. I thought about using sed, but as far as I know it can only operate on the whole line and not on a portion of it. I'm not sure how to use a regex to restrict the replacement, aside from specifying start and stop patterns.
 
Old 10-02-2012, 11:22 AM   #2
whizje
Member
 
Registered: Sep 2008
Location: The Netherlands
Distribution: Slackware64 current
Posts: 594

Rep: Reputation: 141Reputation: 141
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/g'
Mary, F, 80-Maple-Street
Code:
-r   - use extended regular expressions
's   - substitute
/    - start of the regular expression to be replaced; in this case we want to replace a space only if the character before it is not a comma
(    - start of group
[^,] - before the space, any character except a comma may occur
)    - end of group; we save this character, otherwise it gets replaced too, e.g.:
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/-/g'
Mary, F, 8-Mapl-Street
Code:
[ ]  - a space; you can use [ ]+ instead to match runs of multiple spaces
/    - end of the regular expression, start of the replacement part
\1   - print the character captured between the start and end of the group
-    - print -
/    - end of the replacement
g    - do this globally; otherwise only the first space is converted, e.g.:
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/'
Mary, F, 80-Maple Street
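As a small sketch of the multiple-space case: with `[ ]+` in place of `[ ]`, a run of several spaces collapses into a single hyphen. The function name `hyphenate` is made up for illustration, and `-E` is used here as the portable spelling of GNU sed's `-r`.

```shell
# Hypothetical helper: replace each run of spaces that follows a
# non-comma character with a single hyphen.
hyphenate() {
  sed -E 's/([^,])[ ]+/\1-/g'
}

# The single space after each comma is left alone because the
# character before it is a comma.
echo "Mary, F, 80  Maple   Street" | hyphenate
```

Note that `[ ]*` would not be a safe substitute here: it also matches zero spaces, so a hyphen would be inserted after every non-comma character.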
 
Old 10-02-2012, 11:30 AM   #3
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428Reputation: 428Reputation: 428Reputation: 428Reputation: 428
Hi.

Here is an awk approach:
Code:
$ cat in 
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
$ awk  '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=, in
Name, Sex, Address
Tom,M,15-Broadway
Mary,F,80-Maple-Street
The magic `1' here is a "pattern" that always evaluates to TRUE; because it has no associated action, the action defaults to 'print $0'. $3 means that we want to perform the substitution only on the 3rd field (fields are delimited by commas).
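The same idea can be spelled out with an explicit BEGIN block and an explicit print, which may make the role of the trailing `1' clearer. This is only a sketch: the function name `hyphenate_field3` is invented here, and the output separator is set to ", " (an assumption on my part) so that the space after each comma survives when awk rebuilds the line.

```shell
# Hypothetical helper: hyphenate spaces in the third comma-separated field only.
hyphenate_field3() {
  awk '
    BEGIN { FS = " *, *"; OFS = ", " }   # split on commas, rejoin with ", "
    { gsub(/ +/, "-", $3); print }       # explicit print instead of the bare 1
  '
}

printf '%s\n' "Name, Sex, Address" "Tom, M, 15 Broadway" "Mary, F, 80 Maple Street" | hyphenate_field3
```

Because only $3 is passed to gsub, spaces in the other fields are never touched, which is what restricts the replacement to the address.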

Last edited by firstfire; 10-02-2012 at 12:00 PM.
 
Old 10-02-2012, 06:54 PM   #4
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660Reputation: 660Reputation: 660Reputation: 660Reputation: 660Reputation: 660
OP has this input file:
Code:
$ cat in 
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
firstfire, your awk ...
Code:
$ awk  '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=, in
... produced this...
Code:
Name, Sex, Address
Tom,M,15-Broadway
Mary,F,80-Maple-Street
... but OP wanted this ...
Code:
Name, Sex, Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street
Easy fix:
Code:
awk '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=', ' "$InFile"
Daniel B. Martin
 
Old 10-02-2012, 08:04 PM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928
Quote:
Originally Posted by whizje View Post
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/g'
Mary, F, 80-Maple-Street

You can easily enough incorporate his "condition", too:
Code:
echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" |  sed -r '/^[^ ]+, (M|F),/ s/([^,])[ ]/-/g'
Name, Sex, Street Address
Tom, M, 1-Broadway
Mary, F, 8-Mapl-Street
I added "Street" to the header line only to illustrate that the condition works ;}


Cheers,
Tink
 
Old 10-02-2012, 08:29 PM   #6
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660Reputation: 660Reputation: 660Reputation: 660Reputation: 660Reputation: 660
A couple of the proposed solutions in this thread took "80 Maple" and turned it into "8-Mapl". A 0 and an e were lost. That isn't right, is it?

Daniel B. Martin
 
Old 10-02-2012, 09:00 PM   #7
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928Reputation: 928
Quote:
Originally Posted by danielbmartin View Post
A couple of the proposed solutions in this thread took "80 Maple" and turned it into "8-Mapl". A 0 and an e were lost. That isn't right, is it?

Daniel B. Martin
No, no it's not ... and my apologies for not actually checking
the output of the command I quoted against the input, and blindly
assuming it did what was needed :)
Code:
echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" |  sed -r '/^[^ ]+, (M|F),/ s/([a-zA-Z0-9]+) +/\1-/g'
Name, Sex, Street Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street
This seems to do better :)


Cheers,
Tink
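For the original request (replace only between a start pattern and the end of the line), one possible sketch is a sed loop that keeps the start pattern inside a capture group and hyphenates one run of spaces per pass. This was not posted in the thread; the function name `hyphenate_after_prefix` is hypothetical, and `[^,]*, (M|F), ` is a simplified stand-in for the OP's `^.*,\s(M|F),\s` that assumes the name field contains no comma.

```shell
# Hypothetical helper: hyphenate spaces only after the "<name>, M|F, " prefix.
hyphenate_after_prefix() {
  sed -E \
    -e ':again' \
    -e 's/^([^,]*, (M|F), [^ ]*) +/\1-/' \
    -e 't again'
}

# The substitution replaces the first run of spaces after the prefix;
# "t again" repeats it until no space is left in the address.
printf '%s\n' "Name, Sex, Street Address" "Mary Ann, F, 80 Maple Street" | hyphenate_after_prefix
```

Since the prefix is matched and put back unchanged on every pass, spaces inside the name (as in "Mary Ann") stay as they are, and lines that do not match the prefix, such as the header, pass through untouched.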
 