
vincix 03-03-2017 04:01 PM

print only changed line with sed after double substitution

As the title says, I'm trying to print only the lines that have been changed with sed after a double substitution.
This is the text file:

The grand old Duke of York
He had ten thousand men
He marched them up to the top of the hill
And he marched them down again
And when they were up they were up
And when they were down they were down
And when they were only half-way up
They were neither up nor down

Normally I know that I need to use the -n option with the p flag, but in this case I'm trying to print only the lines that were altered by both substitutions. The two substitutions are 's/down/DOWN/' and 's/up/UP/'.
So for instance, sed -e 's/up/UP/' -e 's/down/DOWN/' file.txt will display the whole file, including the lines which haven't been altered.

sed -n -e 's/up/UP/p' -e 's/down/DOWN/p' file.txt won't work either, because lines containing both 'up' and 'down' are going to be displayed twice (once for each substitution).
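
To make the duplication concrete, here is a small self-contained sketch. It writes the rhyme to a scratch file (the temporary directory is only for illustration) and runs the two-expression command. Each successful s///p prints the pattern space as it stands at that moment, so the last line of the rhyme comes out twice, once half-converted and once fully converted.

```shell
# Recreate the rhyme in a scratch directory (illustrative only).
tmp=$(mktemp -d)
cat > "$tmp/file.txt" <<'EOF'
The grand old Duke of York
He had ten thousand men
He marched them up to the top of the hill
And he marched them down again
And when they were up they were up
And when they were down they were down
And when they were only half-way up
They were neither up nor down
EOF

# -n suppresses automatic printing; each p flag prints after its own
# substitution succeeds, so a line matching both is printed twice.
out=$(sed -n -e 's/up/UP/p' -e 's/down/DOWN/p' "$tmp/file.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```

The "neither" line shows up first as "They were neither UP nor down" and then again as "They were neither UP nor DOWN".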

So how should I go about this problem?

syg00 03-03-2017 04:15 PM

Use regex to do the substitution for both in one stanza.

vincix 03-04-2017 01:43 AM

I can only think of the \U option, which turns what it matches into uppercase. I don't want the solution, but I'd like to know in principle how I could actually use regex for two different strings and apply the same action to both. I was thinking of something like 's/"up|down"/\U/', which doesn't work because it interprets them literally. And neither would \| work between "up" and "down" :)

Turbocapitalist 03-04-2017 02:16 AM


Originally Posted by vincix (Post 5678742)
So how should I go about this problem?

You probably need to clarify the problem a little more. You can do a lot with t, b, and : alone. The t will branch if s/up/&/; succeeds, though in effect it is just a check.

If you are only ever going to be using GNU sed then you can do it more concisely with T instead and skip the jumping.
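
As a rough sketch of the GNU-only route, assuming the goal is to print only lines changed by both substitutions: T branches when no substitution has succeeded, and with no label it jumps to the end of the script. Combined with -n, a line that fails either substitution is never printed. The file name and scratch directory are illustrative, and T is a GNU extension, so this will not work on BSD or macOS sed.

```shell
# Recreate the rhyme in a scratch directory (illustrative only).
tmp=$(mktemp -d)
cat > "$tmp/duke.txt" <<'EOF'
The grand old Duke of York
He had ten thousand men
He marched them up to the top of the hill
And he marched them down again
And when they were up they were up
And when they were down they were down
And when they were only half-way up
They were neither up nor down
EOF

# GNU sed only: T jumps to the end of the script if the preceding
# substitution did not succeed, so p is reached only when both did.
out=$(sed -n -e 's/up/UP/g' -e 'T' -e 's/down/DOWN/g' -e 'T' -e 'p' "$tmp/duke.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```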

syg00 03-04-2017 02:19 AM

Perhaps you shouldn't be so keen to reject options. You might be pleasantly surprised.

Some tips if I might:
- look at "-r"
- check the (GNU) doco for the first few sentences describing the "s" command. Particularly re the matched portion of the pattern space.

(no need for branching in this case)

vincix 03-04-2017 02:58 AM

I'm not rejecting options. After all -n and -e are options. It's just that you suggested using regex and I still don't know exactly how I could solve the problem through regex alone. Yes, actually, I've already looked at -r, and sed does interpret | as or, but that doesn't seem to be the right solution. Anyway, I haven't heard of t, b, or :, so I'll have to read a little bit more.

astrogeek 03-04-2017 03:00 AM

Think like this...


...using the hints provided by syg00. It works.

I had to use the ADDRESS to get only the desired line(s), then using the not so subtle 's' hint provided by syg00 and a previously mentioned operator provided the right result.

Not to give everything away, here is an obfuscated example with the text in ud.txt:


$ sed -rn '/.../s/up|down/.../gp' ud.txt
They were neither UP nor DOWN

Perhaps syg00 has seen a way to get it without the address...?

MadeInGermany 03-04-2017 03:12 AM

With the t and d commands

sed '
t s2
' file.txt

With awk

awk 'sub(/up/,"UP") && sub(/down/,"DOWN")' file.txt
If you want to replace multiple ups and downs per line then you need the g modifier in sed or gsub in awk.
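
A hedged sketch of that last point follows. The sed listing above appears to have lost lines in this copy of the thread, so the script below is only one plausible shape for a t/d approach that reuses the s2 label, not necessarily what was posted. The sample lines are made up so that one line contains "up" twice.

```shell
# Made-up sample input: the first line has two "up"s and one "down".
tmp=$(mktemp -d)
printf '%s\n' 'up and up then down' 'up only' 'down only' > "$tmp/sample.txt"

# sed: the g flag replaces every match; t s2 jumps on success,
# otherwise d discards the line. After :s2 the same test is repeated.
sed_out=$(sed '
s/up/UP/g
t s2
d
:s2
s/down/DOWN/g
t
d
' "$tmp/sample.txt")

# awk: gsub replaces every match and returns the number of replacements,
# so the line is printed only when both counts are non-zero.
awk_out=$(awk 'gsub(/up/,"UP") && gsub(/down/,"DOWN")' "$tmp/sample.txt")

# For contrast, sub replaces only the first match on the line.
awk_sub_out=$(awk 'sub(/up/,"UP") && sub(/down/,"DOWN")' "$tmp/sample.txt")

printf '%s\n' "$sed_out" "$awk_out" "$awk_sub_out"
rm -rf "$tmp"
```

The first two commands print "UP and UP then DOWN", while the sub version leaves the second "up" in lower case.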

syg00 03-04-2017 04:19 AM


Originally Posted by vincix (Post 5678872)
I don't want the solution, but I'd like to know in principle how I could actually use regex

This is the approach that will encourage contributions. You are making the effort, I am happy to help. If you eventually feel lost, ask and I will supply my solution. It may not be correct, or sufficient, but hopefully we may all learn something by the exercise.

vincix 03-04-2017 12:42 PM

This is what I came up with:

sed -nE '/up|down/s/up|down/\U/gp' duke.txt
The problem is that "U" is interpreted as literal "U", and it doesn't convert to uppercase letters. How do I make sed interpret it correctly?

By the way, I think the correct option was -E, not -r, in order to make sed interpret extended regex. I was referring to -E when I said that sed was eventually interpreting | as "or".

Turbocapitalist 03-04-2017 01:22 PM

You'll probably use an ampersand & instead of \U

Here's another alternative:


sed -e '/down/s/up/&/; t; d;' duke.txt
Though neither example does much with regex, more with sed programming.

The t is a conditional jump. When used without a destination it defaults to a jump to the end of the sed script.
Thus if the // pattern matches AND the s/// substitution succeeds, hop over the command to delete the line.

vincix 03-04-2017 01:29 PM

I don't insist on doing it with regex (only). syg00 had suggested it at the beginning of the thread and that's why I was curious. I'm fine with using sed options. The question is, why doesn't \U work? I've seen several examples on the internet.

P.S. Only now did I see that on Mac it works only with -E (for extended regex), but on CentOS it seems to be working with -r (only?).

Turbocapitalist 03-04-2017 01:33 PM

In which context have you seen \U mentioned? I don't see it in the regex manual or in the manual for sed itself.


man 7 regex
man sed

Though \U does have a meaning in perl's pattern matching


man perlre

vincix 03-04-2017 01:34 PM

And yes, I was working with sed on Mac, and now I see it's behaving slightly differently on CentOS 7 when using \U. It doesn't interpret it as a literal \U, but it still doesn't work. It simply deletes both matches ("up" and "down").
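
A likely explanation, assuming GNU sed: in the replacement, \U is a case-conversion directive that applies to the text following it, and it is described in the GNU info manual rather than the man page. With nothing after \U the replacement is empty, which would account for the matches being deleted. Following it with & (the matched text) gives the intended result. The sketch below uses an illustrative scratch file; \U is a GNU extension and is not expected to work on BSD or macOS sed.

```shell
# Recreate the rhyme in a scratch directory (illustrative only).
tmp=$(mktemp -d)
cat > "$tmp/duke.txt" <<'EOF'
The grand old Duke of York
He had ten thousand men
He marched them up to the top of the hill
And he marched them down again
And when they were up they were up
And when they were down they were down
And when they were only half-way up
They were neither up nor down
EOF

# GNU sed only: \U& upper-cases the matched text. One s command handles
# both words, so each changed line is printed exactly once by the p flag.
out=$(sed -nE 's/up|down/\U&/gp' "$tmp/duke.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```

This prints every line changed by either substitution, once each. The /up|down/ address from the earlier attempt is not needed, because the p flag only fires when the substitution succeeds.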

Turbocapitalist 03-04-2017 01:44 PM

If you want portability you'll need to give up on \U in sed scripting and stay closer to POSIX.


sed 's/up/UP/g; t up; d; b; :up { s/down/DOWN/g; t; d; }' duke.txt
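
One caveat on portability: POSIX does not guarantee that the b, t and : commands can be followed by a semicolon, and some implementations read the label up to the end of the line. A more conservative sketch of the same idea, assuming the goal is still "print only lines changed by both substitutions", puts each command in its own -e expression. The scratch file is illustrative.

```shell
# Recreate the rhyme in a scratch directory (illustrative only).
tmp=$(mktemp -d)
cat > "$tmp/duke.txt" <<'EOF'
The grand old Duke of York
He had ten thousand men
He marched them up to the top of the hill
And he marched them down again
And when they were up they were up
And when they were down they were down
And when they were only half-way up
They were neither up nor down
EOF

# Each -e ends its command as a newline would, so labels are unambiguous.
# t up   : jump to :up if s/up/UP/g changed something, else fall into d.
# bare t : jump to the end of the script (line is printed) if
#          s/down/DOWN/g changed something, else fall into the final d.
out=$(sed -e 's/up/UP/g' -e 't up' -e 'd' -e ':up' \
          -e 's/down/DOWN/g' -e 't' -e 'd' "$tmp/duke.txt")
printf '%s\n' "$out"

rm -rf "$tmp"
```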
