Hi, I need to perform a simple substitution: replace whitespace with a character, but only within a specific, identifiable part of each line. For example, if I have the following:
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
I need to transform it into the following (not in Excel, but on the command line please) by adding hyphens to the address field. The replacement character can be anything; the hyphen is just an example.
Name, Sex, Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street
I need the flexibility of specifying some kind of pattern to frame where the replacement should take place. In this case:
Start Pattern = ^.*,\s(M|F),\s
Stop Pattern = $
Can someone please show me how this can be done? Thanks. I thought about using sed, but as far as I know it can only operate on the whole line, not on a portion of it. I'm not sure how to restrict a regex replacement to the text between a start and a stop pattern.
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/g'
Mary, F, 80-Maple-Street
Code:
-r - use extended regular expressions (GNU sed; -E is accepted by both GNU and BSD sed)
's - substitute
/ - start of the regular expression to match; here we want a space that follows any character except a comma
( - start of group
[^,] - the character before the space: anything except a comma
) - end of group; the character is saved so it can be put back, otherwise it gets replaced too, e.g.:
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/-/g'
Mary, F, 8-Mapl-Street
Code:
[ ] - a space; use [ ]+ to match a run of several spaces
/ - end of the regular expression, start of the replacement
\1 - the character captured by the group
- - a literal hyphen
/ - end of the replacement
g - apply globally; otherwise only the first match on each line is converted, e.g.:
Code:
echo "Mary, F, 80 Maple Street" | sed -r 's/([^,])[ ]/\1-/'
Mary, F, 80-Maple Street
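As a small sketch of the multiple-spaces case mentioned above: the variant below collapses each run of spaces into a single hyphen. The function name `hyphenate_runs` is made up for illustration, and the bracket expression is changed to `[^, ]` (an assumption on my part, not part of the original answer) so that extra spaces directly after a comma are left alone.

```shell
# Hypothetical helper: replace every run of spaces that follows a
# non-comma, non-space character with a single hyphen.
# -E selects extended regular expressions (same meaning as GNU sed's -r).
hyphenate_runs() {
    sed -E 's/([^, ]) +/\1-/g'
}

printf '%s\n' 'Mary, F, 80  Maple   Street' | hyphenate_runs
# prints: Mary, F, 80-Maple-Street
```

The single space after each comma survives because a comma can never be the captured character.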
$ cat in
Name, Sex, Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
$ awk '{gsub(" +", "-", $3);}1' FS=' *, *' OFS=, in
Name, Sex, Address
Tom,M,15-Broadway
Mary,F,80-Maple-Street
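The magic `1' here is a "pattern" that always evaluates to true and, because it has no associated action, the action defaults to 'print $0'. $3 means that the substitution is performed only on the 3rd field; FS=' *, *' splits the fields on commas with optional surrounding spaces. Note that OFS=, rebuilds every modified line with a bare comma, which is why the spaces after the commas disappear from the data rows but not from the untouched header.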
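If the ", " separators should survive, one possible variation (a sketch, not the poster's command) sets OFS to a comma plus a space and only touches rows whose second field is M or F, which mirrors the start pattern from the question. The function name `hyphenate_address` is hypothetical.

```shell
# Hypothetical helper: hyphenate spaces in the 3rd comma-separated field,
# but only on rows whose 2nd field is exactly M or F.
hyphenate_address() {
    awk 'BEGIN { FS = " *, *"; OFS = ", " }          # split on commas, rejoin with ", "
         $2 ~ /^(M|F)$/ { gsub(/ +/, "-", $3) }      # data rows only
         1' "$@"                                     # print every line
}

printf '%s\n' 'Name, Sex, Street Address' \
              'Tom, M, 15 Broadway' \
              'Mary, F, 80 Maple Street' | hyphenate_address
# prints:
# Name, Sex, Street Address
# Tom, M, 15-Broadway
# Mary, F, 80-Maple-Street
```

Only field 3 is modified, so an address that itself contains a comma would be hyphenated only up to that comma.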
You can easily enough incorporate his "condition", too:
Code:
echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" | sed -r '/^[^ ]+, (M|F),/ s/([^,])[ ]/-/g'
Name, Sex, Street Address
Tom, M, 1-Broadway
Mary, F, 8-Mapl-Street
I only added "Street" to the header to illustrate that
the condition works: the header line is left untouched ;}
A couple of the proposed solutions in this thread took "80 Maple" and turned it into "8-Mapl". A 0 and an e were lost. That isn't right, is it?
Daniel B. Martin
No, no it's not ... and my apologies for not actually checking
the output of the command I quoted against the input, and blindly
assuming it did what was needed :)
Code:
echo "Name, Sex, Street Address
Tom, M, 15 Broadway
Mary, F, 80 Maple Street
" | sed -r '/^[^ ]+, (M|F),/ s/([a-zA-Z0-9]+) +/\1-/g'
Name, Sex, Street Address
Tom, M, 15-Broadway
Mary, F, 80-Maple-Street
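The commands above pick the lines to edit but still substitute across the whole line, so a name such as "Mary Ann" would be hyphenated as well. A possible sketch that sticks closer to the start pattern from the question is a sed loop that replaces one run of spaces after the ", M, " or ", F, " prefix per pass and repeats until none are left. The function name `hyphenate_after_sex` is hypothetical, and a single space after each comma is assumed.

```shell
# Hypothetical helper: hyphenate spaces only AFTER the ", M, " / ", F, " prefix.
#   :a     label marking the start of the loop
#   s///   keep everything up to the next run of spaces after the prefix (\1),
#          then turn that run into one hyphen
#   ta     jump back to :a if the substitution changed something
hyphenate_after_sex() {
    sed -E -e ':a' -e 's/^(.*, (M|F), [^ ]*) +/\1-/' -e 'ta'
}

printf '%s\n' 'Name, Sex, Street Address' \
              'Mary Ann, F, 80 Maple Street' | hyphenate_after_sex
# prints:
# Name, Sex, Street Address
# Mary Ann, F, 80-Maple-Street
```

The space in "Mary Ann" is preserved because it sits inside the part matched by `.*`, before the prefix, and the header is skipped because it contains no ", M, " or ", F, ".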