LinuxQuestions.org - [SOLVED] substitute pattern that crosses new line


substitute pattern that crosses new line

This is a snippet of the file I'm trying to change:

Code:

1:      00:02:43:24 00:02:45:22 01:23

Why haven't you ever asked me

Warum hast du mich nie gefragt,



2:      00:02:46:03 00:02:49:04 03:01

what this film is about?

worum es in diesem Film geht?



3:      00:02:53:16 00:02:58:18 05:02

And why haven't I ever told you, anyway?

Und warum hab ich es dir eigentlich nie erzählt?



4:      00:03:02:13 00:03:07:00 04:12

Was it just you not being curious?

Warst du einfach nicht neugierig?



5:      00:03:09:09 00:03:12:00 02:16

Or also me being relieved

Oder war ich einfach erleichtert,

What I'm trying to do is delete the colon, add a new line and delete the whitespace before the time interval, so that the next line starts with the time interval, like this:

Code:

1

00:02:43:24 00:02:45:22 01:23

I'd like to do it with sed. This is what I've tried so far:

Code:

sed -r -e '/^[0-9]+:/{N;s/(^[0-9]+):\n\s+/\1\n/}' file.txt

I've also tried using \t+ instead of \s+, but to no avail. There's no match; the text doesn't change at all.
On the other hand, I'm not sure that using \n on both sides of the substitute command is going to work.

I did manage to create a new line after the number with which each of those lines begins, but I wasn't able to delete the whitespace that precedes the time interval, so that's not good enough:

Code:

sed -r -e '/^[0-9]+:/{N;s/(^[0-9]+):/\1\n/g}' file.txt

1

      00:02:43:24 00:02:45:22 01:23

Why haven't you ever asked me

Warum hast du mich nie gefragt,

Any ideas?

[later edit]
Now I've realised that the initial idea doesn't make sense: there is no \n right after the colon in the initial line (the whitespace and the time interval are on the same line as the number), so there can't possibly be a match there.

So this seems to be doing what I'm looking for :)

Code:

sed -r -e '/^[0-9]+:/{N;s/(^[0-9]+):\s+/\1\n/g}' file.txt
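
For anyone comparing the two attempts, here is a rough sketch that runs both on the first entry of the snippet. It assumes GNU sed (for -r and \s) and that the gap after the colon is spaces or tabs; the first attempt is written with the closing / that the s command needs, and the sample, first and later variables are made up for illustration.

```shell
# Hypothetical sample: the first entry from the snippet above.
sample=$(printf "1:      00:02:43:24 00:02:45:22 01:23\nWhy haven't you ever asked me")

# First attempt: expects a newline directly after the colon.
# After N the newline sits at the end of the timecode line instead,
# so nothing matches and the text comes out unchanged.
first=$(printf '%s\n' "$sample" | sed -r '/^[0-9]+:/{N;s/(^[0-9]+):\n\s+/\1\n/}')

# Later attempt: matches the colon plus the whitespace on the same line.
later=$(printf '%s\n' "$sample" | sed -r '/^[0-9]+:/{N;s/(^[0-9]+):\s+/\1\n/g}')

printf '%s\n---\n%s\n' "$first" "$later"
```

The part before the --- separator should be identical to the input, while the part after it has the number on its own line followed by the time interval.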

Good work!

You can simplify that a bit:

Code:

sed -r 's/^([0-9]+:)\s*/\1\n/' file.txt

But I'm trying to get rid of the colon :)

Quote:

Originally Posted by vincix (Post 5892818)

But I'm trying to get rid of the colon :)

Sorry, I missed that.

Easy to fix: simply move the closing parenthesis of the capture group so that the colon falls outside it:

Code:

sed -r 's/^([0-9]+):\s*/\1\n/' file.txt
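
As a quick check, here is a minimal sketch of that command run on the first entry from the snippet instead of file.txt. It assumes GNU sed, since \s and \n in the replacement are GNU extensions; the sample and result variables are only there for illustration.

```shell
# Hypothetical sample: the first entry from the snippet, built with printf.
sample=$(printf "1:      00:02:43:24 00:02:45:22 01:23\nWhy haven't you ever asked me")

# Keep the number, drop the colon and the whitespace after it,
# and start a new line in their place.
result=$(printf '%s\n' "$sample" | sed -r 's/^([0-9]+):\s*/\1\n/')

printf '%s\n' "$result"
# Prints:
# 1
# 00:02:43:24 00:02:45:22 01:23
# Why haven't you ever asked me
```

Lines that don't start with a number and a colon, such as the subtitle text, pass through untouched.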

I thought \n doesn't work without N. But now I realise that what N does is append the next line to the current one, with a \n in between, so that the line break can be matched. In your example (and in mine too, actually), \n is part of the replacement string, not of the pattern being matched.
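
A small sketch of that distinction, assuming GNU sed and a made-up two-line input: without N each line is matched on its own with its trailing newline already stripped, so \n in the pattern never matches, whereas after N both lines sit in the pattern space joined by \n.

```shell
# Made-up two-line input for illustration.
input=$(printf 'a\nb')

# Without N: sed strips the trailing newline before matching,
# so \n on the match side finds nothing and the text is unchanged.
without_n=$(printf '%s\n' "$input" | sed 's/a\nb/a+b/')

# With N: the next line is appended to the pattern space after an
# embedded \n, so the same pattern now matches across the line break.
with_n=$(printf '%s\n' "$input" | sed 'N;s/a\nb/a+b/')

printf '%s\n---\n%s\n' "$without_n" "$with_n"
```

The first result is still the two lines a and b; the second is the single line a+b.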

And yes, there's not much point in matching an address and opening a { } block if the pattern inside the substitution is the same as the address before the opening brace. I see what you're getting at. Indeed, much simpler :)

Glad that worked!

Equally glad to see that you are trying to understand it all! Very good exercise and time well spent for us both!

Thanks for the help :)