LinuxQuestions.org


thesunlover 04-05-2018 07:54 AM

One sed command covers 3 patterns ?
 
Hi,

I need to write a script to delete the following pattern lines from some files.

johnsmith
johnsmith,
johnsmith:

"sed -i /^$username/d $file" doesn't work because users like johnsmith2 will be removed as well.

"sed -i /^$username[,:]/d $file" works for the "johnsmith," and "johnsmith:" cases.

"sed -i /^$username[$,:]/d $file" and "sed -i /^$username[\n,:]/d $file" don't work. The "johnsmith" line cannot be removed.

How can I use one sed command (or something else) to cover all these 3 cases?

Thank you in advance!

syg00 04-05-2018 08:01 AM

sed will honour "|" (or; written "\|" in basic regular expressions) as well as "$" as an end-of-line anchor.
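
A minimal sketch of that idea, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed both do) and using made-up sample lines. Each alternative is a complete line, and the $ anchor keeps johnsmith2 from matching:

```shell
username=johnsmith

# Sample input, invented for illustration.
input='johnsmith
johnsmith,
johnsmith:
johnsmith2'

# Delete a line only if it is exactly the name, the name plus ",", or the name plus ":".
# \$ passes a literal $ (the end-of-line anchor) through the double quotes to sed.
result=$(printf '%s\n' "$input" | sed -E "/^(${username}|${username},|${username}:)\$/d")

printf '%s\n' "$result"   # prints: johnsmith2
```

The -i option is left out here so the sketch reads from a pipe instead of editing a file in place.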

hazel 04-05-2018 08:05 AM

Use the question mark. It's a quantifier that means once or not at all. Also there is a character class that covers punctuation marks. Look it up. You already seem to know enough about regular expressions to combine these two notations to give you what you want.

TB0ne 04-05-2018 08:10 AM

Quote:

Originally Posted by thesunlover (Post 5839633)
Hi,
I need to write a script to delete the following pattern lines from some files.

johnsmith
johnsmith,
johnsmith:

"sed -i /^$username/d $file" doesn't work because users like johnsmith2 will be removed as well.
"sed -i /^$username[,:]/d $file" works for the "johnsmith," and "johnsmith:" cases.
"sed -i /^$username[$,:]/d $file" and "sed -i /^$username[\n,:]/d $file" don't work. The "johnsmith" line cannot be removed.

How can I use one sed command (or something else) to cover all these 3 cases?

I'd suggest you pick up a book on regexes. They are complex, but handy, and in your case can be used to accommodate punctuation. Try:
Code:

sed -i -e '/^johnsmith[[:punct:]]$/d' -e '/^johnsmith$/d' "$file"
You don't say if these are just separate lines, or if they're part of larger lines (like "user johnsmith has logged in at noon"), but sed can accommodate multiple -e statements. The first matches a line that consists of johnsmith followed by exactly one punctuation character of ANY kind. The second matches johnsmith alone on a line. That will leave you with johnsmith2, johnsmithallen, etc.
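
To see what the two expressions keep and drop, here is a small sketch run against made-up sample lines (the input is an assumption, and -i is omitted so the data comes from a pipe rather than a file). Note that [[:punct:]] also catches endings nobody asked about, such as a trailing dot:

```shell
# Sample input, invented for illustration.
input='johnsmith
johnsmith,
johnsmith:
johnsmith.
johnsmith2
johnsmithallen'

# First expression: name plus exactly one punctuation character.
# Second expression: the bare name.
result=$(printf '%s\n' "$input" |
    sed -e '/^johnsmith[[:punct:]]$/d' -e '/^johnsmith$/d')

printf '%s\n' "$result"
```

Only johnsmith2 and johnsmithallen survive. If just "," and ":" should count, a narrower bracket expression such as [,:] would be the safer choice.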

syg00 04-05-2018 08:17 AM

And in case you haven't guessed, there just might be a few ways of achieving your desired outcome.

How like linux ... :p

MadeInGermany 04-05-2018 08:28 PM

\| and \? are GNU extensions.
The latter is short for the more universal \{0,1\}
Here it must be followed by $ to ensure it's at the end of the line.
The [ ] should be within quotes to ensure the shell does not try a glob. Best have all sed code in quotes.
Code:

sed -i "/^$username[,:]\{0,1\}$/d" $file
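
A quick way to check what sed actually receives is to build the script in a variable and print it. This is only a sketch with a made-up username and sample lines. It assumes a POSIX-style shell such as sh or bash, where $username followed by [ is not treated as an array subscript:

```shell
username=johnsmith

# Inside double quotes the shell expands $username but leaves the bracket
# expression, the \{0,1\} interval and the trailing $ for sed to interpret.
script="/^$username[,:]\{0,1\}$/d"
printf '%s\n' "$script"   # prints: /^johnsmith[,:]\{0,1\}$/d

# Apply the script to sample lines (no -i, so nothing is edited in place).
result=$(printf '%s\n' 'johnsmith' 'johnsmith,' 'johnsmith2' | sed "$script")
printf '%s\n' "$result"   # prints: johnsmith2
```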

thesunlover 04-06-2018 08:44 AM

Hi, Thank you very much all for your prompt replies that are very helpful !!

This compact one works very well:

sed -i "/^$username[[:punct:]]\{0,1\}$/d" $file

The only case that it can't do is "johnsmith ,". Any idea?

thesunlover 04-06-2018 09:01 AM

This one is really good:

sed -i "/^$username[,:]\{0,1\}$/d" $file

Thanks MadeInGermany !

MadeInGermany 04-06-2018 09:12 AM

Ooh smileys. Please wrap your code in code tags (=> the # button at the top of the Wiki editor).
Quote:

Originally Posted by thesunlover (Post 5839994)
Hi, Thank you very much all for your prompt replies that are very helpful !!

This compact one works very well:

sed -i "/^$username[[:punct:]]\{0,1\}$/d" $file

The only case that it can't do is "johnsmith ,". Any idea?

You mean: allow an additional space before the punctuation character?

Code:

sed -i "/^$username[[:space:]]*[[:punct:]]\{0,1\}$/d" $file
Like [[:punct:]], which is a class of punctuation characters,
[[:space:]] is a class of whitespace characters (including Space, TAB, CR, vertical TAB).
There is also [[:blank:]], which is only a space or TAB character.
A following * means the preceding item can occur zero, one, or many times.
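
A small sketch of the whitespace-tolerant pattern, run on made-up sample lines (one of them has a TAB before the colon). The input is an assumption for illustration, and -i is dropped so no file is touched:

```shell
username=johnsmith

# Sample input, invented for illustration; \t in the format string is a TAB.
input=$(printf 'johnsmith ,\njohnsmith\njohnsmith\t:\njohnsmith2\njohnsmith ,x\n')

# Name, any amount of whitespace, at most one punctuation character, end of line.
result=$(printf '%s\n' "$input" |
    sed "/^${username}[[:space:]]*[[:punct:]]\{0,1\}\$/d")

printf '%s\n' "$result"
```

The lines johnsmith2 and "johnsmith ,x" remain, because the $ anchor requires the line to end right after the optional punctuation character.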

thesunlover 04-06-2018 09:17 AM

Hi,

A similar question: How to make the following line shorter, or put the three grep to one?

if grep -q ^$username$ $file || grep -q ^$username, $file || grep -q ^$username: $file ; then

Thanks.

thesunlover 04-06-2018 09:18 AM

Thank you much again MadeInGermany !

MadeInGermany 04-06-2018 10:09 AM

The same RE applies for grep (but no / / that in sed delimits the RE from other code):
Code:

if grep -q "^$username[:,]\{0,1\}$" $file; then
or
Code:

if grep -q "^$username[[:punct:]]\{0,1\}$" $file; then
or
Code:

if grep -q "^$username[[:blank:]]*[:,]\{0,1\}$" $file; then
or ...
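
As a rough illustration of the single-grep test, the sketch below writes two made-up entries to a temporary file (a stand-in for the real $file) and checks both a full name and a name that is only a prefix of an entry:

```shell
username=johnsmith
file=$(mktemp)   # temporary stand-in for the real file

printf '%s\n' 'johnsmith2' 'johnsmith:' > "$file"

# One grep replaces the three: name, then an optional "," or ":", then end of line.
if grep -q "^${username}[:,]\{0,1\}\$" "$file"; then
    found=yes
else
    found=no
fi
printf '%s\n' "$found"          # prints: yes

# A name that is only a prefix of an entry must not match.
if grep -q "^john[:,]\{0,1\}\$" "$file"; then
    prefix_found=yes
else
    prefix_found=no
fi
printf '%s\n' "$prefix_found"   # prints: no

rm -f "$file"
```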

thesunlover 04-06-2018 12:06 PM

Thanks MadeInGermany! Glad to know this good format working for grep as well.

Sorry for my mistake. I failed to make it clearer. Actually the patterns to remove are

johnsmith
johnsmith,John Smith...
johnsmith:John Smith...

So the following code doesn't fulfill the requests perfectly:

Code:

sed -i "/^$username[[:punct:]]\{0,1\}$/d" $file

TB0ne 04-06-2018 12:21 PM

Quote:

Originally Posted by thesunlover (Post 5840068)
Thanks MadeInGermany! Glad to know this good format working for grep as well.

Sorry for my mistake. I failed to make it clearer. Actually the patterns to remove are

johnsmith
johnsmith,John Smith...
johnsmith:John Smith...

So the following code doesn't fulfill the requests perfectly:
Code:

sed -i "/^$username[[:punct:]]\{0,1\}$/d" $file

I had asked you in post #4 for details about the input strings, and you failed to provide them. You've been given a lot of advice here, along with the solution to this particular problem, but you will have to think about it.

The "$" means "end-of-line". Therefore, "johnsmith,John Smith" won't match "johnsmith$", will it? The solution I gave you in post #4 can easily be modified to include the references about blank spaces as well. You should experiment and find your solution, since all the pieces have already been given to you.
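
The effect of the anchor is easy to check by counting matches. This sketch uses made-up sample lines in the clarified format (the text after the separator is invented) and shows that only the bare name matches the earlier pattern:

```shell
# Sample input, invented for illustration.
input='johnsmith
johnsmith,John Smith
johnsmith:John Smith'

# Count the lines the earlier pattern matches: the optional punctuation must be
# the last character before end of line, so lines with trailing text do not match.
count=$(printf '%s\n' "$input" | grep -c '^johnsmith[[:punct:]]\{0,1\}$')

printf '%s\n' "$count"   # prints: 1
```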

MadeInGermany 04-06-2018 12:54 PM

This can be handled by a \( \) group marker. A following quantifier handles the whole group.
Code:

sed -i "/^$username\([,:].*\)\{0,1\}$/d" $file
It becomes easier if we switch the RE type from BRE (basic regular expression) to ERE (extended regular expression, that is also in egrep or grep -E and in awk and in perl and ...)
Code:

sed -r -i "/^$username([,:].*){0,1}$/d" $file
or even shorter
Code:

sed -r -i "/^$username([,:].*)?$/d" $file
Last but not least, nothing speaks against two simple commands, as TB0ne posted already:
Code:

sed -i "/^$username$/d; /^$username[,:].*$/d" $file
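
For a quick check of the grouped form against the clarified input, here is a sketch on made-up sample lines. It uses -E, which GNU sed accepts as an equivalent of the -r shown above, and leaves out -i so the data comes from a pipe:

```shell
username=johnsmith

# Sample input, invented for illustration.
input='johnsmith
johnsmith,John Smith
johnsmith:John Smith
johnsmith2
johnsmithallen,Allen'

# Name, then optionally a "," or ":" followed by anything, then end of line.
result=$(printf '%s\n' "$input" | sed -E "/^${username}([,:].*)?\$/d")

printf '%s\n' "$result"
```

The lines johnsmith2 and johnsmithallen,Allen are kept, since the character right after the name is neither a separator nor the end of the line.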


All times are GMT -5.