LinuxQuestions.org
Old 03-06-2012, 05:00 PM   #1
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Rep: Reputation: 27
Using sed to substitute repetition -> single occurrence


Hi,

I want to use sed to substitute an occurrence of "a a a a a" (a,space,a,space,a,space,a,space,a ... any number of a's like this) with "a".

I just can't figure it out. I've googled it...

Thanks in advance.

PS. - I think it'd be cool to have a special forum section for command line stuff... I wasn't sure where to post this.
 
Old 03-06-2012, 05:19 PM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Blog Entries: 1

Rep: Reputation: 1251
Not sure if that'll suit your needs. You'd have to be more specific in your example. Ideally provide some sample input.

Code:
sed 's/\(\<a\>\).*/\1/' infile
EDIT: My example is not good for you. I can think of a few situations where it'll fail.

Last edited by sycamorex; 03-06-2012 at 05:21 PM.
 
Old 03-06-2012, 05:25 PM   #3
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by rm_-rf_windows
Hi,

I want to use sed to substitute an occurrence of "a a a a a" (a,space,a,space,a,space,a,space,a ... any number of a's like this) with "a".

I just can't figure it out. I've googled it...

Thanks in advance.

PS. - I think it'd be cool to have a special forum section for command line stuff... I wasn't sure where to post this.
Hi,

How about this:
Code:
$ echo 'a a a a a hab' |sed -r 's/(a *)*/a/'
ahab
$ echo 'a a a a ahab' |sed -r 's/(a *)*/a/'
ahab
Is this what you mean? I have to agree with sycamorex that your example is a bit vague.

PS: The command line stuff is usually fine in Linux General or in Programming if the problem is a bit more complex.

Last edited by crts; 03-06-2012 at 05:27 PM.
 
Old 03-06-2012, 05:31 PM   #4
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 374
There are lots of things sed can do, but I'm not sure this problem is best suited to it. Still, some sed guru may come along with a quick one-liner.

That said, my immediate thought for this problem would use tr and uniq. If the line of text is in a shell variable for instance, substitute newlines for the spaces, then send the result through uniq, and then substitute spaces for the newlines.

The result should be a line where any repeated block of non-space characters is reduced to a single occurrence.

Some input "sanitizing" may be needed first (e.g. converting multiple spaces to a single space). Also, this approach would not properly handle a repeated sequence that spans multiple lines. Then again, neither would sed without some additional complication.

EDIT:
Oh... yeah, I'm looking at this from the more general perspective that you do not know the exact text that will be repeated beforehand.

EDIT2:
Since my response feels naked without an example:
Code:
user@localhost$ echo "they practically practically sell themselves themselves themselves" | \
tr ' ' '\n' | \
uniq | \
tr '\n' ' ' | \
sed 's@ $@\n@'
they practically sell themselves
user@localhost$
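
A possible variation on the pipeline above, as a minimal sketch only: `tr -s` squeezes runs of spaces while translating them to newlines, which covers the "sanitizing" step, and `paste` rejoins the words without leaving a trailing space to clean up. The function name `dedupe_words` is made up for illustration and does not come from this thread.

```shell
# Hypothetical helper: collapse adjacent repeated words on a single line.
# 1. tr -s  : turn spaces into newlines and squeeze repeated newlines
# 2. uniq   : drop adjacent duplicate lines (one word per line at this point)
# 3. paste  : join the remaining lines back together with single spaces
dedupe_words() {
  printf '%s\n' "$1" |
    tr -s ' ' '\n' |
    uniq |
    paste -sd ' ' -
}

dedupe_words 'they  practically practically   sell themselves themselves themselves'
# prints: they practically sell themselves
```

Like the original pipeline, this works one line at a time and only removes repeats that are directly adjacent. A line that begins with spaces would come out with a leading space.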

Last edited by Dark_Helmet; 03-06-2012 at 05:39 PM.
 
Old 03-06-2012, 10:40 PM   #5
romagnolo
Member
 
Registered: Jul 2009
Location: Montaletto
Distribution: Debian GNU/Linux
Posts: 107

Rep: Reputation: 5
I think you are looking exactly for this:
Code:
sed -r 's/(a )+(a| )?/a/g' your_file >tmp; mv tmp your_file
If your_file contains this:
Quote:
I'm practica a a a ally selling myself.
your command will do:
Code:
$ sed -r 's/(a )+(a| )?/a/g' your_file >tmp; mv tmp your_file
$ cat your_file
I'm practically selling myself.
For reference, the only True manual of sed is the one written by Lee E. McMahon in 1978, here.
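
One caveat with the pattern above: `(a )+` also matches a single "a " at the end of an ordinary word, so "pizza is good" becomes "pizzais good". A hedged alternative, assuming the goal is only to collapse runs of two or more a's, is to require at least one " a" after the first "a". The function name `collapse_a` is hypothetical.

```shell
# Hypothetical helper: collapse "a a a ..." (two or more a's separated by
# single spaces) into one "a". A lone "a " is left untouched.
collapse_a() {
  sed -E 's/a( a)+/a/g'
}

printf '%s\n' "I'm practica a a a ally selling myself." | collapse_a
# prints: I'm practically selling myself.

printf '%s\n' 'pizza is good' | collapse_a
# prints: pizza is good
```

This still merges a word ending in "a" with a following word that starts with "a" ("pizza and" becomes "pizzand"), which is unavoidable if "practica a a a ally" is supposed to become "practically". With GNU sed, `sed -i -E 's/a( a)+/a/g' your_file` edits the file in place, so the temporary file and `mv` are not needed.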
 
Old 03-07-2012, 05:15 AM   #6
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Original Poster
Rep: Reputation: 27
Excellent, Romagnolo, thanks a lot!

That's exactly what I wanted (the post just before this one). Is it possible to make it more generic, so that any repeated "character + space" sequence is replaced by that single character?

God, I spent 4 hours trying to figure it out.

Are you Italian?

Bye.
 
Old 03-07-2012, 06:26 AM   #7
romagnolo
Member
 
Registered: Jul 2009
Location: Montaletto
Distribution: Debian GNU/Linux
Posts: 107

Rep: Reputation: 5
This works like a charm:
Code:
sed -r 's/(([[:graph:]]) )+(\2){1}/\2/g' your_file >tmp; mv tmp your_file
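
A hedged note on the pattern above: because the capturing group sits inside the repeated `( ... )+`, it can capture a different character on each repetition, so `\2` only refers to the last one captured. That may be one reason it misbehaves on some input. A minimal sketch that captures the character once and then requires " + the same character" to repeat is shown below; the function name `collapse_repeats` is made up for illustration.

```shell
# Hypothetical helper: collapse any "c c c ..." run (the same non-space
# character repeated, separated by single spaces) into one "c".
# \1 is captured once, outside the repeated group, so every repetition
# must be the same character.
collapse_repeats() {
  sed -E 's/([[:graph:]])( \1)+/\1/g'
}

printf '%s\n' 'x x x y y z' | collapse_repeats
# prints: x y z

printf '%s\n' 'that test' | collapse_repeats
# prints: thatest
```

The second call shows the price of the generic version: it cannot tell a stuttered letter from two words that happen to end and start with the same character, so it is only safe on input where that does not occur.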
Quote:
Originally Posted by rm_-rf_windows
Are you Italian?
Romagnolo!
 
Old 03-07-2012, 07:09 AM   #8
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Original Poster
Rep: Reputation: 27
Hi Romagnolo,

Thanks for the reply, but your last code snippet doesn't work on my end!

Romagnolo, eh? I'd like to be in Italy now.

See you later.
 
Old 03-07-2012, 07:11 AM   #9
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Original Poster
Rep: Reputation: 27
Incidentally, thanks to everybody who replied to this post. I'll try your suggestions too.
 
Old 03-18-2012, 10:12 AM   #10
rm_-rf_windows
Member
 
Registered: Jun 2007
Location: Europe
Distribution: Ubuntu
Posts: 292

Original Poster
Rep: Reputation: 27
A solution to this simple problem has been found at http://www.linuxquestions.org/questi...rrence-933087/ in message #3.

The solution uses the command line program "uniq".

Many thanks to all contributors to this and the linked thread.
 
  


All times are GMT -5.