LinuxQuestions.org
Old 05-03-2010, 07:11 AM   #1
jigg_fly
LQ Newbie
 
Registered: May 2010
Posts: 2

Rep: Reputation: 0
use sed to find string pattern and delete subsequent characters


I have a file with a number of strings like the ones below

string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt

I'm trying to delete everything after #** so that

string1#maasdfeaveasdfawefas
string2#mfaaebvrserhserh

becomes

string1#ma
string2#mf

I tried sed 's/#..*//g' but, as you will all know, it returns string1, string2, etc.
 
Old 05-03-2010, 07:19 AM   #2
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
Code:
echo 'string1#m1asdfe23easdf23wefas'  | sed 's/\(.*\#..\).*/\1/g'
 
Old 05-03-2010, 07:23 AM   #3
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
So what is the rule?

For example, is it: 'find the first "#" and delete everything after "#" plus 2 characters'?

Your code finds the pattern: '"#", followed by any character, then any number of characters'
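As a small sketch of that reading, here is the original attempt run against one of the sample lines from the first post. The variable name `result` is only illustrative.

```shell
# '#..*' reads as: a literal '#', then one character ('.'), then any
# number of further characters ('.*'). The whole match is deleted,
# so the '#' and everything after it disappear.
result=$(printf '%s\n' 'string1#m1asdfe23easdf23wefas' | sed 's/#..*//')
printf '%s\n' "$result"   # prints: string1
```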


Try this:
Code:
sed 's/\(#..\).*/\1/' filename
This uses a capture group to grab "#" plus any 2 characters (as part of the total matched expression), and the backreference \1 re-inserts the captured text in place of the total match.
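A minimal sketch of that command on the four sample lines from the first post, fed in on stdin rather than from a file. The variable names `sample` and `result` are assumptions made for the example.

```shell
# The four sample lines from the first post.
sample='string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt'

# \(#..\) captures '#' plus two characters; the trailing .* matches the
# rest of the line; \1 puts back only the captured part.
result=$(printf '%s\n' "$sample" | sed 's/\(#..\).*/\1/')
printf '%s\n' "$result"
# prints:
# string1#m1
# string2#mf
# anotherstring#ji
# anotherone#m1
```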

Go here for a really good SED tutorial:
http://www.grymoire.com/Unix/Sed.html
 
Old 05-03-2010, 07:53 AM   #4
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Quote:
Originally Posted by PMP View Post
Code:
echo 'string1#m1asdfe23easdf23wefas'  | sed 's/\(.*\#..\).*/\1/g'
This is not going to work....I can explain later (Have to be on a conference call.)
<<Edit: It was later established that I was wrong>>

Last edited by pixellany; 05-03-2010 at 11:11 AM.
 
Old 05-03-2010, 07:56 AM   #5
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
Quote:
Originally Posted by pixellany View Post
This is not going to work....I can explain later (Have to be on a conference call.)
Code:
-bash-3.2$ cat test
string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt
Code:
-bash-3.2$  sed 's/\(.*\#..\).*/\1/g' test
string1#m1
string2#mf
anotherstring#ji
anotherone#m1

It worked for me !! Happy to see your views !!

Last edited by PMP; 05-03-2010 at 07:59 AM. Reason: added output codes
 
Old 05-03-2010, 08:02 AM   #6
jigg_fly
LQ Newbie
 
Registered: May 2010
Posts: 2

Original Poster
Rep: Reputation: 0
Thanks to both pixellany and PMP.

I've tried both solutions and they both seem to work. I'm also curious as to why PMP's isn't ideal.

pixellany, I'm using yours as it seems to work faster on a big file.
 
Old 05-03-2010, 08:07 AM   #7
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Both pixellany's and PMP's example work.

@PMP: The first .* and the g option are not needed (but don't do any harm for the task at hand).

@pixellany: I would have chosen your example, but the "missing" first part (everything up to the #) can be confusing if you are not familiar with sed. As long as the first .* in PMP's example is part of the back referencing all is ok.
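Both remarks can be checked with a short sketch, assuming the sample data from the first post. The three expressions differ only in the leading `.*` and the `g` flag; the backslash before `#` is left out here, and the variable names are made up for the example.

```shell
sample='string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt'

# Leading .* inside the group, with the g flag.
with_both=$(printf '%s\n' "$sample" | sed 's/\(.*#..\).*/\1/g')
# Same expression without g: the trailing .* already consumes the rest of
# the line, so there is nothing left for a second substitution.
without_g=$(printf '%s\n' "$sample" | sed 's/\(.*#..\).*/\1/')
# No leading .*: text before the '#' is simply never touched.
without_lead=$(printf '%s\n' "$sample" | sed 's/\(#..\).*/\1/')

if [ "$with_both" = "$without_g" ] && [ "$without_g" = "$without_lead" ]; then
    verdict='identical'
else
    verdict='different'
fi
printf '%s\n' "$verdict"   # prints: identical
```

This only holds while each line contains a single `#`, as the sample data does.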
 
Old 05-03-2010, 08:07 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
@PMP - I think what pixellany is referring to is that if the input has any more hashes (#) in it, then yours will be a little greedy. Try this string:

Code:
string1#m1asdfe23easdf2#3wefas
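A sketch of what happens with that string under both expressions (written without the backslash before `#`; the variable names are only illustrative):

```shell
line='string1#m1asdfe23easdf2#3wefas'

# With a leading .* inside the group, the greedy .* runs as far right as it
# can, so the '#..' that gets kept is the LAST one on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\(.*#..\).*/\1/')

# Without the leading .*, the match starts at the FIRST '#'.
first=$(printf '%s\n' "$line" | sed 's/\(#..\).*/\1/')

printf '%s\n' "$greedy"   # prints: string1#m1asdfe23easdf2#3w
printf '%s\n' "$first"    # prints: string1#m1
```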
 
Old 05-03-2010, 08:16 AM   #9
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
@grail,
I used the sample data provided by the OP, who stated:

Quote:
I'm trying to delete everything after #** so that
I am also waiting for pixellany's view. Let him finish his conference call.
 
Old 05-03-2010, 08:50 AM   #10
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
I was in error!! The confusion came from the fact that the match in PMP's solution covers everything from the start of the line, whereas mine covers only the part that starts with "#..". It was not obvious at a glance that they were doing the same thing.

In PMP's solution, why does the "#" have to be escaped?
 
Old 05-03-2010, 08:56 AM   #11
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
It can be omitted. I escaped it to play safe and did not remove it later.
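A quick sketch of the unescaped form: `#` is an ordinary character inside a sed regular expression, so a plain `#` matches a literal hash. The variable name `result` is an assumption for the example.

```shell
# Same expression as before, with '#' written without a backslash.
result=$(printf '%s\n' 'string2#mfaaeb2vr1rhserh' | sed 's/\(.*#..\).*/\1/')
printf '%s\n' "$result"   # prints: string2#mf
```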
 
Old 05-03-2010, 01:15 PM   #12
rkski
Member
 
Registered: Jan 2009
Location: Canada
Distribution: CentOS 6.3, Fedora 17
Posts: 247

Rep: Reputation: 51
Isn't pixellany's solution more robust (not to mention more efficient) for the reason stated by grail in post#8?
 
Old 05-03-2010, 01:18 PM   #13
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

@rkski: pixellany's solution is indeed "better". The OP noticed that (see post #6).
 
Old 10-09-2013, 03:21 PM   #14
Ayubstation
LQ Newbie
 
Registered: Oct 2013
Posts: 1

Rep: Reputation: Disabled
Quote:


Originally Posted by pixellany View Post

This is not going to work....I can explain later (Have to be on a conference call.)



Code:
-bash-3.2$ cat test
string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt


Code:
-bash-3.2$ sed 's/\(.*\#..\).*/\1/g' test
string1#m1
string2#mf
anotherstring#ji
anotherone#m1

It worked for me !! Happy to see your views !!


How would I do it if I want to print only what comes after that pattern?
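One possible sketch, assuming "that pattern" means the first "#" plus the two characters that follow it: instead of keeping the captured part, delete everything up to and including it. The variable name `result` is illustrative.

```shell
# ^[^#]* matches everything before the first '#'; '#..' matches the hash
# and two more characters. Replacing all of that with nothing leaves only
# the text that follows.
result=$(printf '%s\n' 'anotherstring#ji89ensrsegr' | sed 's/^[^#]*#..//')
printf '%s\n' "$result"   # prints: 89ensrsegr
```

Lines without a "#" do not match and pass through unchanged.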
 
Old 10-09-2013, 04:53 PM   #15
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
OP asked for a sed solution and a good one has already been posted.
Therefore I will contribute an awk solution.

With this InFile ...
Code:
string1#m1asdfe23easdf23wefas
string2#mfaaeb2vr1rhserh
anotherstring#ji89ensrsegr
anotherone#m1ynmdt324nsdt
... this awk ...
Code:
awk '{print substr($0,1,index($0,"#")+2)}' "$InFile" >"$OutFile"
... produced this OutFile ...
Code:
string1#m1
string2#mf
anotherstring#ji
anotherone#m1
Daniel B. Martin
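One behaviour worth knowing about the awk approach: index() returns 0 when a line has no "#", so the one-liner then prints the first two characters of that line, whereas the sed solutions leave such a line untouched. The sketch below shows this, together with a guarded variant that is an assumption of this example and not part of the thread. The line `nohashhere` and the variable names are made up.

```shell
data='string1#m1asdfe23easdf23wefas
nohashhere'

# Original form: with no '#', index() is 0, so substr($0, 1, 2) is printed.
unguarded=$(printf '%s\n' "$data" | awk '{ print substr($0, 1, index($0, "#") + 2) }')

# Guarded variant: print the line unchanged when there is no '#'.
guarded=$(printf '%s\n' "$data" |
    awk '{ i = index($0, "#"); print (i ? substr($0, 1, i + 2) : $0) }')

printf '%s\n' "$unguarded"
# prints:
# string1#m1
# no
printf '%s\n' "$guarded"
# prints:
# string1#m1
# nohashhere
```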
 
  

