LinuxQuestions.org
Old 09-13-2006, 01:55 AM   #1
twantrd
Senior Member
 
Registered: Nov 2002
Location: CA
Distribution: redhat 7.3
Posts: 1,440

Rep: Reputation: 52
Remove string in sed


Hi,

I have a log file and I want to remove a string for parsing purposes. My log file looks like this

Code:
2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com U=root P=local S=344
I want to remove the username, U=root. However, the username can be anything else as well such as Sandy, Mike, Joe, etc. Therefore, I want to remove U=* but NOT the rest. The result should be

Code:
2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com P=local S=344
My sed syntax is
Code:
cat /tmp/log | sed '/U=.*/,/^$/d'
So, I'm trying to match everything after U= and stop at a blank line but it's not working right. Can someone shed some light? Thanks!

-twantrd
 
Old 09-13-2006, 02:02 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

This should work:

sed 's/ U=.* P/ P/' <infile>

It assumes that there's always a P= entry after U=<user>.
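To show how this differs from the original attempt, here is a minimal sketch. The second log line and the variable names are made up for illustration. An address range followed by d deletes whole lines, from the first line matching U= up to the next blank line (or the end of input if there is none), whereas s only edits the matched text inside a line.

```shell
# Two sample lines: the one from the first post plus a hypothetical second entry
log_lines=$(printf '%s\n' \
  '2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com U=root P=local S=344' \
  '2006-09-12 19:53:28 1GNKsR-0001yZ-00 Completed')

# Range delete: starts at the first line containing U= and, with no blank
# line to end the range, removes every line through the end of input
range_out=$(printf '%s\n' "$log_lines" | sed '/U=.*/,/^$/d')

# Substitution: rewrites only the " U=... P" part; other lines pass through
subst_out=$(printf '%s\n' "$log_lines" | sed 's/ U=.* P/ P/')

printf 'range delete gives: [%s]\n' "$range_out"
printf 'substitution gives:\n%s\n' "$subst_out"
```

With this input the range delete prints nothing at all, while the substitution keeps both lines and drops only U=root.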

Hope this helps.
 
Old 09-13-2006, 03:07 AM   #3
spirit receiver
Member
 
Registered: May 2006
Location: Frankfurt, Germany
Distribution: SUSE 10.2
Posts: 424

Rep: Reputation: 33
... and if you want to stop at space characters instead, you can use
Code:
sed 's/ U=[^[:space:]]*//'
 
Old 09-13-2006, 08:26 AM   #4
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 735

Rep: Reputation: 76
Hi, twantrd.
Quote:
Originally Posted by twantrd
Can someone shed some light?
You received 2 good specific answers for the problem.

One of the principles I try to get across to students in my classes is that regular expression quantifiers such as * are greedy by design: they match the longest possible string. So if you need to match fewer characters, you need to supply a constraint of some kind. As you saw, that took the form of additional text following the .* part of the pattern, or of a character class that excludes the delimiter.
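A small sketch of the greedy behaviour, assuming a made-up log line that has a second " P" field further along (the X=a P=b part is not from the original log):

```shell
# Sample line from the thread with a hypothetical extra " P=b" at the end
line='2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com U=root P=local S=344 X=a P=b'

# Greedy: .* runs to the LAST " P" on the line, so P=local and S=344 are lost
greedy=$(printf '%s\n' "$line" | sed 's/ U=.* P/ P/')

# Constrained: [^ ]* cannot cross a space, so only the user name is removed
constrained=$(printf '%s\n' "$line" | sed 's/ U=[^ ]*//')

printf '%s\n' "$greedy" "$constrained"
# 2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com P=b
# 2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com P=local S=344 X=a P=b
```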

That being said, however, it is easy to forget details like that when we're busy with a million other things ... cheers, makyo (26)

Last edited by makyo; 09-13-2006 at 08:30 AM.
 
Old 09-13-2006, 12:15 PM   #5
twantrd
Senior Member
 
Registered: Nov 2002
Location: CA
Distribution: redhat 7.3
Posts: 1,440

Original Poster
Rep: Reputation: 52
Thanks everyone!

Druuna, I understand your syntax. Makes sense. Thanks!

Spirit receiver, can you explain how yours works? I understand that [[:space:]] means an actual space but what does [^[:space:]]* mean? I thought [^ ]* will match everything EXCEPT a space. So, wouldn't the syntax be like this:

sed 's/ U=*[^[:space:]]//' - Match everything after U= but do not match space.

Obviously, this doesn't work as I've just tried it. Thanks.

-twantrd
 
Old 09-13-2006, 12:30 PM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Strange it doesn't work (it does at my end).

About the [^[:space:]]* construct: This will allow anything except (the ^) whitespace.

And yes, you could also write [^ ]*. But if there's a tab between root and P= instead of a 'real' space, [^ ]* will be greedy and keep matching past the tab, so the P= part will be removed as well (the match only stops where the first 'real' space is found).
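A quick sketch of that tab case, assuming a log line where a tab (written here as \t for printf) separates the user name from P=; the variable names are only for illustration:

```shell
# Hypothetical variant of the log line with a TAB between U=root and P=local
tab_line=$(printf '2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com U=root\tP=local S=344')

# [^ ]* only stops at a literal space, so it swallows the tab and P=local too
space_only=$(printf '%s\n' "$tab_line" | sed 's/ U=[^ ]*//')

# [^[:space:]]* stops at any whitespace, including the tab
posix_class=$(printf '%s\n' "$tab_line" | sed 's/ U=[^[:space:]]*//')

printf '%s\n' "$space_only" "$posix_class"
# First line:  ... <= root@blah.com S=344           (P=local is gone)
# Second line: ... <= root@blah.com<TAB>P=local S=344
```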

Hope this clears things up a bit.
 
Old 09-13-2006, 12:34 PM   #7
soggycornflake
Member
 
Registered: May 2006
Location: England
Distribution: Slackware 10.2, Slamd64
Posts: 249

Rep: Reputation: 31
Quote:
Originally Posted by twantrd
Thanks everyone!

Druuna, I understand your syntax. Makes sense. Thanks!

Spirit receiver, can you explain how yours works? I understand that [[:space:]] means an actual space but what does [^[:space:]]* mean? I thought [^ ]* will match everything EXCEPT a space. So, wouldn't the syntax be like this:

sed 's/ U=*[^[:space:]]//' - Match everything after U= but do not match space.

Obviously, this doesn't work as I've just tried it. Thanks.

-twantrd
I think you're confusing the shell meta-character * with the regex repetition operator *.

spirit receiver's pattern '...[^[:space:]]*...' says

[ match a character class (matches 1 character)
^ not (i.e. a character that doesn't match one of the following)
[:space:] match any whitespace character (space, tab, newline, etc.)
] closing outer bracket
* repeated zero or more times.

i.e. match any sequence of characters (including nothing) that does not contain whitespace (space, tab, newline, CR, etc.).

Whereas your string '...U=*[^[:space:]]...' matches U followed by any number of = characters (including none), followed by exactly one character that is not whitespace.

Bear in mind that in regular expressions, * doesn't match anything on its own (nor do ?, +, etc.); these are repetition operators which apply to the preceding pattern.
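To see the difference on the sample line from the first post, here is a minimal sketch (the variable names are just for illustration):

```shell
line='2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com U=root P=local S=344'

# * applies to the "=": " U", then zero or more "=", then ONE non-whitespace
# character, so only " U=r" is removed
misplaced=$(printf '%s\n' "$line" | sed 's/ U=*[^[:space:]]//')

# * applies to the bracket expression: " U=" plus the whole run of
# non-whitespace characters, so " U=root" is removed
intended=$(printf '%s\n' "$line" | sed 's/ U=[^[:space:]]*//')

printf '%s\n' "$misplaced" "$intended"
# 2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.comoot P=local S=344
# 2006-09-12 19:53:27 1GNKsR-0001yZ-00 <= root@blah.com P=local S=344
```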

Last edited by soggycornflake; 09-13-2006 at 12:37 PM.
 
Old 09-13-2006, 02:28 PM   #8
twantrd
Senior Member
 
Registered: Nov 2002
Location: CA
Distribution: redhat 7.3
Posts: 1,440

Original Poster
Rep: Reputation: 52
Ahh ok that makes sense. Thanks for the clarification!

-twantrd
 
  


All times are GMT -5.