LinuxQuestions.org
01-16-2012, 05:18 PM   #1
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,142

sed - text substitution in certain places


I want to replace OLD with NEW but only where OLD is somewhere between double quotes. I think this is done with regions but haven't been able to dope it out.

Example:
Quote:
Even in his old age Emerson said, "The old houses are better."
would become:
Quote:
Even in his old age Emerson said, "The new houses are better."
Daniel B. Martin
 
01-16-2012, 06:01 PM   #2
arshadul
LQ Newbie
 
Registered: Apr 2009
Distribution: CentOS 5.3, Ubuntu 8.0.4LTS
Posts: 4

Try the following:
Code:
sed -e 's/\(\".*\)old\(.*\"\)/\1new\2/g'
as in
Code:
$ echo "Even in his old age Emerson said, \"The old houses are better.\"" | sed -e 's/\(\".*\)old\(.*\"\)/\1new\2/g'
Even in his old age Emerson said, "The new houses are better."
 
01-16-2012, 09:47 PM   #3
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,142

Original Poster
Quote:
Originally Posted by arshadul
Try the following:
sed -e 's/\(\".*\)old\(.*\"\)/\1new\2/g'
Thank you, arshadul, for your prompt response. It works for the given example, but not in all cases. Allow me to restate the question, hoping to clarify.

I want to replace OLD with NEW anywhere and everywhere that OLD is somewhere between double quotes.

Examples:
Quote:
Even in his old age Emerson said, "The old houses are better."
old dogs old habits "old houses old cars" old men old computers
would become:
Quote:
Even in his old age Emerson said, "The new houses are better."
old dogs old habits "new houses new cars" old men old computers
Your suggestion handles the first line correctly but not the second.

Daniel B. Martin
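
To see why the first suggestion stops after one replacement, it may help to run it on the second sample line. This is only a sketch; the function name first_attempt is made up for illustration. The greedy .* in the first group runs as far right as it can, so the match picks the last "old" before the closing quote, and the g flag only resumes scanning after the end of that match. The earlier "old" inside the quotes is never revisited.

```shell
# Hypothetical wrapper around the command from post #2 (unchanged).
first_attempt() {
    sed -e 's/\(\".*\)old\(.*\"\)/\1new\2/g'
}

printf '%s\n' 'old dogs old habits "old houses old cars" old men old computers' | first_attempt
# Only the last quoted "old" changes:
# old dogs old habits "old houses new cars" old men old computers
```

Getting every occurrence therefore takes either repeated passes over the same line or a tool that splits the line at the quotes.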
 
01-16-2012, 11:07 PM   #4
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,577

I believe you would need sed's branching commands, either 'b' or 't', to process this correctly (I could be wrong, of course).
As an alternative, awk can do this rather easily:
Code:
awk 'BEGIN{ RS="\"" }{ORS = RT}!(NR%2){gsub(/old/,"new")}1' file
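
The one-liner above relies on RT, which is a gawk extension. Where gawk is not available, a similar effect can be sketched by splitting each line on the double quote, so that the even-numbered fields are the quoted parts. The function name quoted_old_to_new is invented here, and the sketch assumes quotes are paired within a single line and never escaped.

```shell
# Hypothetical helper: replace "old" with "new" only inside quoted fields.
quoted_old_to_new() {
    awk 'BEGIN { FS = OFS = "\"" }
         # Even-numbered fields lie between an opening and a closing quote.
         # Stopping at i < NF skips a trailing field whose quote is never closed.
         { for (i = 2; i < NF; i += 2) gsub(/old/, "new", $i) }
         1' "$@"
}

printf '%s\n' 'old dogs "old habits" old sayings "old men old coins" old houses' | quoted_old_to_new
# old dogs "new habits" old sayings "new men new coins" old houses
```

Because the field separator and the output separator are the same character, awk rebuilds the line with the quotes back in place.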
 
01-17-2012, 08:41 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Yes, a "t" loop probably is the best way to solve the above.

Code:
sed -r -e ':loop; s/["](.*)\bold\b(.*)["]/"\1new\2"/ ; t loop'
Using the -r option and [..] character class brackets also avoids the need for backslashing everything, making it a bit more readable.

One more minor issue is that "old" would also match sub-strings in words like "cold" and "olden", so I added the \b word boundary anchor at each end. To still match word variations like old/older/oldest, you can stick in yet another set of capture parentheses.

Code:
sed -r -e ':loop; s/["](.*)\bold(er|est)?\b(.*)["]/"\1new\2\3"/ ; t loop'

Last edited by David the H.; 01-17-2012 at 08:48 AM. Reason: minor code correction
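
For readers new to the "t" command, here is the first loop written out one command per line, as a sketch that assumes GNU sed (for -r and \b). The function name loop_replace is hypothetical. Each pass replaces one quoted "old"; "t" jumps back to the label only if that substitution succeeded, so the loop ends on the first pass that finds nothing left to change.

```shell
# Hypothetical wrapper around the first command in post #5.
loop_replace() {
    sed -r -e ':loop
        s/["](.*)\bold\b(.*)["]/"\1new\2"/
        t loop'
}

printf '%s\n' 'old dogs old habits "old houses old cars" old men old computers' | loop_replace
# Pass 1 rewrites the last quoted "old", pass 2 the one before it:
# old dogs old habits "new houses new cars" old men old computers
```

Note that the .* groups can still run across a quote character, so on a line with several quoted phrases the match stretches from the first quote to the last.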
 
01-17-2012, 09:14 AM   #6
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,142

Original Poster
Thank you, David the H., for your timely response. With further testing a flaw is detected. This flaw may be attributed to an ambiguity in the phrase "between double quotes."

Input line:
Quote:
old dogs "old habits" old sayings "old men old coins" old houses
Result:
Quote:
old dogs "new habits" new sayings "new men new coins" old houses
Desired result:
Quote:
old dogs "new habits" old sayings "new men new coins" old houses
Allow me to restate the question (again), to sharpen the spec.

I want to replace OLD with NEW anywhere and everywhere that OLD is somewhere between *pairs of* double quotes.

Daniel B. Martin
 
01-17-2012, 10:22 AM   #7
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Hmm, I see. That is a problem. And yes, I understand the requirements.

However, technically, I think that is what sed's doing. The "old sayings" string is between a pair of double-quotes, so the loop is affecting that too.

The thing is, handling matched pairs of characters like this, quotes, parentheses, whatever, has always been a very tricky thing to deal with. There have been quite a few long threads here discussing such things.

Ok, here's one more try that works on your sample text, at least.
Code:
sed -r -e ':loop; s/\B["]([^"]*)\bold(er|est)?\b([^"]*)["]\B/"\1new\2\3"/ ; t loop'
Overall, there probably isn't any single sed solution that would be able to handle every possible variation of text. You may have to go with awk or perl instead, like with grail's above, and just tackle each specific situation as it comes up.
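
A sketch of how the revised pattern behaves on the sample from post #6, again assuming GNU sed; the function name pair_replace is invented for illustration. Two things change: [^"]* keeps a match from spanning a quote character, and the \B outside each quote requires that the character next to the quote on the outside is not a word character. On this sample that stops the closing quote after "habits" from being treated as an opening quote, because it is preceded by the letter s.

```shell
# Hypothetical wrapper around the command from post #7.
pair_replace() {
    sed -r -e ':loop
        s/\B["]([^"]*)\bold(er|est)?\b([^"]*)["]\B/"\1new\2\3"/
        t loop'
}

printf '%s\n' 'old dogs "old habits" old sayings "old men old coins" old houses' | pair_replace
# old dogs "new habits" old sayings "new men new coins" old houses
```

The \B test is a heuristic: it depends on each quote touching a word on the inside and a space or punctuation on the outside, so text quoted differently may still need a different approach.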
 
01-17-2012, 12:08 PM   #8
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,577

The awk solution still works for the current example. If, as David said, words that merely contain "old" should not be changed, then simply use \<old\> for the regex.
 
01-17-2012, 01:00 PM   #9
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,395
Blog Entries: 2

I think the problem is with the description of the objective, which I believe would be better stated as 'I want to replace OLD with NEW anywhere and everywhere that OLD is somewhere between balanced double quotes.' Even that is somewhat ambiguous for some cases.
--- rod.
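
One way to make "balanced" concrete, at least line by line, is to count the quotes: a line with an odd number has an unpaired quote somewhere, and no pattern can say for certain which one it is. The rough sketch below flags such lines so they can be checked by hand before any substitution is run. The function name report_odd_quotes is hypothetical, and the sketch assumes a quoted phrase never continues onto the next line.

```shell
# Hypothetical helper: list lines that contain an odd number of double quotes.
report_odd_quotes() {
    # Splitting on the quote gives (number of quotes + 1) fields,
    # so an even, non-zero field count means an odd number of quotes.
    awk -F'"' 'NF > 0 && NF % 2 == 0 { print FNR ": " $0 }' "$@"
}

printf '%s\n' 'a "b" c' 'say "old' '' | report_odd_quotes
# 2: say "old
```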
 
01-17-2012, 09:21 PM   #10
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,142

Original Poster
Quote:
Originally Posted by David the H.
Code:
sed -r -e ':loop; s/\B["]([^"]*)\bold(er|est)?\b([^"]*)["]\B/"\1new\2\3"/ ; t loop'
I'm happy with this and modified it to suit the actual application. Thank you, David the H. for your effort and advice. Let's mark this one SOLVED!

Daniel B. Martin
 
  

