[SOLVED] sed - text substitution in certain places
Yes, a "t" loop probably is the best way to solve the above.
sed -r -e ':loop; s/["](.*)\bold\b(.*)["]/"\1new\2"/ ; t loop'
The -r option (extended regular expressions) and the [..] character class brackets also avoid the need to backslash everything, which makes the command a bit more readable.
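To see what the loop buys you, here is a small sketch on a made-up sample line. The function name loop_swap and the input are only illustrations, and GNU sed is assumed for -r and \b.

```shell
# Hypothetical wrapper around the command above (GNU sed assumed).
loop_swap() {
    sed -r -e ':loop; s/["](.*)\bold\b(.*)["]/"\1new\2"/ ; t loop'
}

# Each pass replaces one "old" between the quotes; "t loop" jumps back to
# the label for as long as the last substitution succeeded.
printf '%s\n' 'old news: "old dogs and old tricks" here' | loop_swap
# → old news: "new dogs and new tricks" here
```

A plain g flag would not be enough here, because a single match already spans everything from the first quote to the last one, so there is nothing left on the line for g to retry.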
One more minor issue is that "old" would also match sub-strings in words like "cold" and "olden", so I added the \b word boundary anchor at each end. To still match word variations like old/older/oldest, you can stick in yet another set of capture parentheses.
sed -r -e ':loop; s/["](.*)\bold(er|est)?\b(.*)["]/"\1new\2\3"/ ; t loop'
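A quick check of the word-boundary and variation handling, again on an invented sample line (variant_swap is just a name for this sketch, GNU sed assumed):

```shell
# Hypothetical wrapper around the old/older/oldest command above.
variant_swap() {
    sed -r -e ':loop; s/["](.*)\bold(er|est)?\b(.*)["]/"\1new\2\3"/ ; t loop'
}

# "cold" and "olden" are protected by \b; the optional (er|est) group is
# copied back through \2, so older becomes newer and oldest becomes newest.
printf '%s\n' '"the old, cold house; the older, olden barn; the oldest shed" end' | variant_swap
# → "the new, cold house; the newer, olden barn; the newest shed" end
```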
Last edited by David the H.; 01-17-2012 at 08:48 AM.
Reason: minor code correction
Hmm, I see. That is a problem. And yes, I understand the requirements.
However, technically, I think sed is doing exactly what the expression asks. The "old sayings" string does sit between a pair of double-quotes (the closing quote of one phrase and the opening quote of the next), so the loop is affecting that too.
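Here is a sketch that reproduces the effect. The sample line is my own invention, not the original poster's text, and greedy_swap is a hypothetical name for the earlier old/older/oldest command (GNU sed assumed).

```shell
# Hypothetical wrapper around the earlier command that uses greedy (.*).
greedy_swap() {
    sed -r -e ':loop; s/["](.*)\bold(er|est)?\b(.*)["]/"\1new\2\3"/ ; t loop'
}

# Two quoted phrases on one line: the greedy (.*) stretches from the first
# quote to the last, so the unquoted "old sayings" in the middle is hit too.
printf '%s\n' 'He said "old dog" about old sayings and "the oldest trick" today' | greedy_swap
# → He said "new dog" about new sayings and "the newest trick" today
```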
The thing is, handling matched pairs of characters like this, quotes, parentheses, whatever, has always been a very tricky thing to deal with. There have been quite a few long threads here discussing such things.
Ok, here's one more try that works on your sample text, at least.
sed -r -e ':loop; s/\B["]([^"]*)\bold(er|est)?\b([^"]*)["]\B/"\1new\2\3"/ ; t loop'
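For illustration, this is how that command behaves on an invented line with two quoted phrases and an unquoted "old" between them. The name paired_swap and the sample text are assumptions for this sketch, and GNU sed is assumed for -r, \b and \B.

```shell
# Hypothetical wrapper around the command above.
paired_swap() {
    sed -r -e ':loop; s/\B["]([^"]*)\bold(er|est)?\b([^"]*)["]\B/"\1new\2\3"/ ; t loop'
}

# [^"]* keeps a match from running across a quote character, and \B rejects
# a starting quote that directly follows a word character (a closing quote).
printf '%s\n' 'He said "old dog" about old sayings and "the oldest trick" today' | paired_swap
# → He said "new dog" about old sayings and "the newest trick" today
```

The \B test is only a heuristic: it assumes a closing quote sits right after a word character, or that whatever follows the next quote is a word character, so text with punctuation or spaces just inside the quotes can still be misread.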
Overall, there probably isn't any single sed solution that would be able to handle every possible variation of text. You may have to go with awk or perl instead, like grail's solution above, and just tackle each specific situation as it comes up.
I think the problem is with the description of the objective, which I believe would be better stated as 'I want to replace OLD with NEW anywhere and everywhere that OLD is somewhere between balanced double quotes.' Even that is somewhat ambiguous for some cases.