LinuxQuestions.org


grob115 10-02-2012 10:14 AM

Regex or Sed
 
Hi, I need to perform a simple substitution: replace whitespace with a character, but only within a specific, identifiable part of each line. For example, if I have the following:
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street


I need to transform it to the following (not in Excel but on the command line, please) by adding hyphens to the address field. The replacement character can be anything; the hyphen is just an example.
Name, Sex, Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street


I need to have the flexibility of specifying some type of pattern to frame where I want the translation or replacement to take place. In this case:
Start Pattern = ^.*,\s(M|F),\s
Stop Pattern = $

Can someone please show me how this can be done? Thanks. I thought about using sed, but as far as I know it can only operate on the whole line and not on a portion of it. I'm not sure how to do a regex replacement that is bounded by start and stop patterns.

whizje 10-02-2012 11:22 AM

Code:

echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/g'
Mary, F, 80-Maple-Street

Code:

-r   - use extended regular expressions
's   - substitute
/    - start of the regular expression to be replaced; in this case we want to replace a space that follows any char except a comma
(    - start block
[^,] - the char before the space can be anything except a comma
)    - end block; we save this char, otherwise it also gets replaced, e.g.:

Code:

echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/-/g'
Mary, F, 8-Mapl-Street

Code:

[ ]  - a space; optionally use [ ]+ to match multiple spaces
/    - end of the regular expression, start of the replacement part
\1   - print the char saved between start block and end block
-    - print -
/    - end of the replacement
g    - do this globally, otherwise only the first space is converted, e.g.:

Code:

echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/'
Mary, F, 80-Maple Street
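
To illustrate the note about multiple spaces, here is a minimal sketch that collapses each run of spaces into a single hyphen. The function name hyphenate_spaces is made up for this example, and it assumes a sed that accepts -E for extended regular expressions (GNU sed treats it the same as -r).

```shell
# Hypothetical helper: replace every run of spaces that follows a
# non-comma character with a single hyphen. Reads stdin, writes stdout.
hyphenate_spaces() {
  sed -E 's/([^,]) +/\1-/g'
}

printf '%s\n' 'Mary, F, 80  Maple   Street' | hyphenate_spaces
# Mary, F, 80-Maple-Street
```

The single space after each comma is left alone because the character before it is a comma, which [^,] refuses to match.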


firstfire 10-02-2012 11:30 AM

Hi.

Here is an awk approach:
Code:

$ cat in
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
$ awk  '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=, in
Name, Sex, Address
Tom,M,15-Broadway
Mary,F,80-Maple-Street

The magic `1' here is a "pattern" that always evaluates to TRUE; because it has no associated action, the action defaults to 'print $0'. $3 means that we want to perform the substitution only on the 3rd field (fields are delimited by commas, with any surrounding spaces absorbed by FS).
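
For readers who find the bare 1 cryptic, the same idea can be spelled out with an explicit print. This is only a sketch: the function name hyphenate_field and its field-number argument are made up for illustration, and the separators are copied from the command above.

```shell
# Hypothetical helper: hyphenate spaces inside one comma-delimited field.
# The field number is the first argument and defaults to 3.
hyphenate_field() {
  awk -v n="${1:-3}" '
    BEGIN { FS = " *, *"; OFS = "," }   # same separators as the command above
    { gsub(/ +/, "-", $n); print }      # explicit print instead of the bare 1
  '
}

echo "Mary, F, 80 Maple Street" | hyphenate_field 3
# Mary,F,80-Maple-Street
```

Changing a field makes awk rebuild the line with OFS between the fields, which is why the spaces after the commas disappear in the output.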

danielbmartin 10-02-2012 06:54 PM

OP has this input file:
Code:

$ cat in
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street

firstfire, your awk ...
Code:

$ awk  '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=, in
... produced this...
Code:

Name, Sex, Address
Tom,M,15-Broadway
Mary,F,80-Maple-Street

... but OP wanted this ...
Code:

Name, Sex, Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street

Easy fix:
Code:

awk '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=', ' $InFile
Daniel B. Martin

Tinkster 10-02-2012 08:04 PM

Quote:

Originally Posted by whizje (Post 4795067)
Code:

echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/g'
Mary, F, 80-Maple-Street


You can easily enough incorporate his "condition", too:
Code:

echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" |  sed -r '/^[^ ]+, (M|F),/ s/([^,])[ ]/-/g'
Name, Sex, Street Address
Tom, M, 1-Broadway
Mary, F, 8-Mapl-Street

I only chucked the "Street" into the header to illustrate
that the condition works ;}


Cheers,
Tink

danielbmartin 10-02-2012 08:29 PM

A couple of the proposed solutions in this thread took "80 Maple" and turned it into "8-Mapl". A 0 and an e were lost. That isn't right, is it?

Daniel B. Martin

Tinkster 10-02-2012 09:00 PM

Quote:

Originally Posted by danielbmartin (Post 4795538)
A couple of the proposed solutions in this thread took "80 Maple" and turned it into "8-Mapl". A 0 and an e were lost. That isn't right, is it?

Daniel B. Martin

No, no it's not ... and my apologies for not actually checking
the output of the command I quoted against the input, and blindly
assuming it did what was needed :)
Code:

echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" |  sed -r '/^[^ ]+, (M|F),/ s/([a-zA-Z0-9]+) +/\1-/g'
Name, Sex, Street Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street

This seems to do better :)


Cheers,
Tink
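
The address in the command above chooses which lines are edited, but the substitution still runs over the whole line. If the replacement should only touch text after the start pattern (for example when the name itself contains a space), one possible sketch is a sed loop that converts one run of spaces per pass. The function name hyphenate_after_prefix is hypothetical, and -E is assumed to be available (it is in GNU sed and BSD sed).

```shell
# Hypothetical helper: hyphenate spaces only AFTER the ", M, " / ", F, " prefix.
# Each pass turns the first run of spaces behind the prefix into a hyphen;
# "t again" repeats the substitution until it no longer matches.
hyphenate_after_prefix() {
  sed -E -e ':again' \
         -e 's/^(.*, (M|F), [^ ]*) +/\1-/' \
         -e 't again'
}

printf '%s\n' 'Name, Sex, Street Address' 'Mary Ann, F, 80 Maple Street' |
  hyphenate_after_prefix
# Name, Sex, Street Address
# Mary Ann, F, 80-Maple-Street
```

Lines without the prefix never match, so the header passes through unchanged, and the space in "Mary Ann" survives because it sits before the prefix. Note that trailing spaces at the end of a line would also become a hyphen.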


All times are GMT -5.