LinuxQuestions.org
Old 05-12-2020, 09:41 AM   #1
blueray
Member
 
Registered: Feb 2020
Location: Bangladesh
Distribution: Debian, Ubuntu, Linux Mint
Posts: 136

Rep: Reputation: 2
Regex: Put + Between Paragraphs


I need a regex one-liner for the following problem.

If there are multiple paragraphs after a line that has ::, then I have to put a + between each of those paragraphs.

Current Text

Code:
One dollar:: and eighty-seven cents. That was all. And sixty cents of it was in pennies. 

Thee:: It was easy to spot her. All you needed to do was look at her socks.

One would reach her knee while the other barely touched her ankle. 

While the argument:: seems to be different the truth is it's always the same. 

They both knew it, but neither has the courage or strength. 

The words:: hadn't flowed from his fingers for the past few weeks. 

He didn't understand why he couldn't even type a single word.

Was being satisfied enough?

She reached her goal:: exhausted. Even more chilling to her was that the euphoria that she thought she'd feel.

Spending time at national parks can be an exciting adventure. 

It seemed like it should have been so simple. 

Was it enough:: That was the question he kept asking himself. 

He knew that he was satisfied and he also knew it wasn't going to be enough.

It was just a burger. Why couldn't she understand that? 

Yes, he had promised her and yes, he had broken that promise.
Expected Output

Code:
One dollar:: and eighty-seven cents. That was all. And sixty cents of it was in pennies. 

Thee:: It was easy to spot her. All you needed to do was look at her socks.
+
One would reach her knee while the other barely touched her ankle. 

While the argument:: seems to be different the truth is it's always the same. 
+
They both knew it, but neither has the courage or strength. 

The words:: hadn't flowed from his fingers for the past few weeks. 
+
He didn't understand why he couldn't even type a single word.
+
Was being satisfied enough?

She reached her goal:: exhausted. Even more chilling to her was that the euphoria that she thought she'd feel.
+
Spending time at national parks can be an exciting adventure. 
+
It seemed like it should have been so simple. 

Was it enough:: That was the question he kept asking himself. 
+
He knew that he was satisfied and he also knew it wasn't going to be enough.
+
It was just a burger. Why couldn't she understand that? 
+
Yes, he had promised her and yes, he had broken that promise.
The solution I have tried is

Code:
$ perl -pe 's/(^.*::.*\n\n)/$1\n+\n/g' regex.txt
However, it only puts a + after the first paragraph.
 
Old 05-12-2020, 09:51 AM   #2
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,613

Rep: Reputation: 7962
Try this:
Code:
 sed '/\:\:/{N;s/\n$/\n+/}'
 
1 member found this post helpful.
Old 05-12-2020, 10:00 AM   #3
blueray
Member
 
Registered: Feb 2020
Location: Bangladesh
Distribution: Debian, Ubuntu, Linux Mint
Posts: 136

Original Poster
Rep: Reputation: 2
Please let me get back to you. It might take an hour.
 
Old 05-12-2020, 10:07 AM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,292
Blog Entries: 3

Rep: Reputation: 3718
Or try reversing the pattern: match the blank line and check that the line after it does not contain ::. That also means changing the input record separator to something other than a newline.

Code:
perl -0x1ff -pe 's/\n\n(?!.*::)/\n+\n/g'
See "man perlrun" and "man perlre"

That leaves a trailing plus after the last line if the file ends with a blank line, however.

Reading the whole file as a single record might not be the most practical with very large files. You might need a more complex one-liner or even something more than just a one-liner.

Edit:

Code:
perl -0x1ff -pe 's/\n\n(?!.*::)(?=.)/\n+\n/g;'

Last edited by Turbocapitalist; 05-12-2020 at 10:23 AM.
 
3 members found this post helpful.
Old 05-12-2020, 10:29 AM   #5
blueray
Member
 
Registered: Feb 2020
Location: Bangladesh
Distribution: Debian, Ubuntu, Linux Mint
Posts: 136

Original Poster
Rep: Reputation: 2
Thank you very much.
 
Old 05-12-2020, 10:33 AM   #6
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,292
Blog Entries: 3

Rep: Reputation: 3718
No problem. There is one negative lookahead assertion (?!…) and one positive lookahead assertion (?=…). They are useful on occasion. They are zero-width: they test what follows without consuming or capturing it. Again, see "man perlre" about that.
 
2 members found this post helpful.
Old 05-12-2020, 10:33 AM   #7
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,790

Rep: Reputation: 7304
Without any other options, -p processes the input line by line. No line can contain anything after its \n, so a pattern such as \n\n can never match. You have to change the record separator:
Code:
perl -0pe 's/(::\N*\n)\n/$1+\n/g'
 
1 member found this post helpful.
Old 05-12-2020, 11:11 AM   #8
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
@Turbocapitalist. Hats off!

@others. No, it's not that easy; see Turbocapitalist's solution. The OP didn't make it clear, but paragraphs that don't include :: are considered part of the previous :: entry.

Last edited by shruggy; 05-12-2020 at 11:18 AM.
 
Old 05-12-2020, 12:15 PM   #9
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,790

Rep: Reputation: 7304
Yes, now I understand.
Code:
perl -pe 'BEGIN{$/="::"} {s/(\n\N+\n)/+$1/g}'
 
1 member found this post helpful.
Old 05-12-2020, 12:31 PM   #10
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,292
Blog Entries: 3

Rep: Reputation: 3718
That looks more efficient. But that use of \N sure is buried deeply in the manual page.

The special variables also have more mnemonic names, available once the English module is loaded:

Code:
perl -MEnglish -pe 'BEGIN{$RS="::"} {s/(\n\N+\n)/+$1/g}'
The same module also provides full names for the variables:

Code:
perl -MEnglish -pe 'BEGIN{$INPUT_RECORD_SEPARATOR="::"} {s/(\n\N+\n)/+$1/g}'
A one-liner might lose its simplicity that way but a full script would be more readable with that module.
 
  


Tags
perl, regex, regexp

