LinuxQuestions.org

__John_L. 02-04-2021 04:02 PM

Escape characters in sed stream
 
I'm trying to use sed on the linux command line, and am running into an issue with special characters.

I'd like to replace "~" with "E;" in the target file (without quotations). So far, using this command:

sed {s/\~/\&#7E\;/g}

on this data:

blah~blah

produces:

blah~#7E;blah

Close, but no cigar.

Thanks in advance for any insight you can provide.

astrogeek 02-04-2021 04:22 PM

I would first ask how you arrived at the expression you are using? Why does it include "&#7" in the replacement string if you do not want them in the replacement text? And why the curly braces?

And by extension, what happens if you remove those characters, have you tried to do that?

Mechanikx 02-04-2021 05:41 PM

I don't believe '~' is a special character, at least not in the regular expressions sed uses. It is special in the Bash shell, where it expands to the user's home directory, but it doesn't need to be escaped for sed.

FWIW sed -i 's/~/E;/g' file.txt worked for me
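A rough sketch of that distinction, assuming a POSIX-style shell such as Bash. The HOME value /home/demo is made up for the demonstration and is only set inside the subshell:

```shell
# At the start of a word the shell expands ~ to $HOME.
# HOME=/home/demo is a hypothetical value, set only inside this subshell.
at_start=$(HOME=/home/demo; echo ~/x)

# In the middle of a word the shell leaves ~ alone.
in_middle=$(echo a~b)

# sed's regular expressions treat ~ as an ordinary character, so no backslash is needed.
in_sed=$(echo 'a~b' | sed 's/~/X/')

printf '%s\n' "$at_start" "$in_middle" "$in_sed"
```

So the only place a tilde normally needs protecting is from the shell, and only when it starts a word.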

__John_L. 02-04-2021 05:46 PM

The forum software "edited" my desired target string.
Quote:

ampersand-hash-7-E-semicolon
The curly braces are standard for a sed command-line script, by my understanding.

The ampersand is the issue, I believe. It is a sed special character.

Ser Olmy 02-04-2021 05:52 PM

The curly braces are the problem (and I've never seen them used with sed before). This works:
Code:

echo blah~blah | sed 's/~/\&#7E;/'
And so does this:
Code:

echo blah~blah | sed "s/~/\&#7E;/"
And this:
Code:

echo blah~blah | sed {s/~/\\\&#7E\;/}
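Note the double escaping in the last example. It is necessary because the expression is unquoted, so the shell treats both \ and & as special characters in that context: \\ is what delivers a single backslash to sed, and \& is what delivers the ampersand.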
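One way to check this is to hand the same three words to printf instead of sed and look at what actually arrives. This is only a sketch; the variable names are made up, and it assumes a POSIX-style shell such as Bash:

```shell
# printf '%s' prints its argument unchanged, so it shows the script sed would receive.
single=$(printf '%s' 's/~/\&#7E;/')      # single quotes: the shell changes nothing
double=$(printf '%s' "s/~/\&#7E;/")      # double quotes: \& is not a shell escape, so the backslash survives
bare=$(printf '%s' {s/~/\\\&#7E\;/})     # unquoted: \\ becomes \, \& becomes &, \; becomes ;

printf '%s\n' "$single" "$double" "$bare"
```

All three end up giving sed the same s command with \& in the replacement. The unquoted form just needs more backslashes to get there, and it keeps its surrounding braces.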

Edit: The forum messed up the characters again.

__John_L. 02-04-2021 05:59 PM

Thanks all. My ignorance is showing.

syg00 02-04-2021 07:00 PM

Just to close this out, curly brackets are used to group commands, typically so that they run only when a selection criterion is met, such as a particular line number or a regex match. It is a standard, if not everyday, usage in sed.
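A minimal sketch of that kind of grouping. The two-line sample input and the /^#/ address are invented for illustration. Both substitutions run only on lines that start with '#':

```shell
# Hypothetical sample input: one line that matches the address, one that does not.
input='# a~b
c~d'

# The braces group two s commands under the single address /^#/.
# Putting each command on its own line keeps the script portable across sed versions.
out=$(printf '%s\n' "$input" | sed '/^#/{
s/~/\&#7E;/g
s/^# //
}')

printf '%s\n' "$out"
```

The second line passes through untouched because it never matches the address, so neither command in the group is applied to it.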

astrogeek 02-04-2021 08:10 PM

Quote:

Originally Posted by __John_L. (Post 6216198)
I'd like to replace "~" with "&#7E;" in the target file (without quotations).

Ah! I tripped over the forum-mangled post myself...

You don't need to escape the tilde, '~' for the regular expression, but you do need to escape the ampersand, '&' because it is special, and the semi-colon to hide it from the shell (i.e. for a different reason than the &).

The curly braces are unnecessary, but I would suggest single quotes around the whole expression in which case there is no need to additionally escape the tilde and semi-colon as the shell does not see them:

Code:

echo 'blah~blah' |sed 's/~/\&#7E;/g'
blah&#7E;blah


boughtonp 02-05-2021 07:44 AM

Quote:

Originally Posted by astrogeek (Post 6216281)
You don't need to escape the tilde, '~' for the regular expression, but you do need to escape the ampersand, '&' because it is special,

To clarify further, in sed replacement strings an ampersand represents "the matched text" (in GNU sed it's the same as \0, back-reference zero, though POSIX only specifies & and \1 through \9), and thus needs to be escaped to produce a literal ampersand character.

This doesn't apply in the match pattern - ampersands do not need to be escaped there.

(Aside: Some regex engines have a concept of a union operator for combining multiple character classes, and use a double ampersand for this; sed is not one of those.)
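A short sketch of the ampersand behaviour described above, using the thread's own sample text plus one made-up input ('a&b') for the match-pattern case:

```shell
# Unescaped & in the replacement stands for whatever the pattern matched.
wrapped=$(echo 'blah~blah' | sed 's/~/[&]/g')

# Escaped \& in the replacement is a literal ampersand.
literal=$(echo 'blah~blah' | sed 's/~/\&#7E;/g')

# In the match pattern & is an ordinary character and needs no escaping.
in_pattern=$(echo 'a&b' | sed 's/&/ and /')

printf '%s\n' "$wrapped" "$literal" "$in_pattern"
```

The single quotes matter here as well: they keep the shell from treating & and ; as its own operators before sed ever sees the script.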



All times are GMT -5.