Problems with regex in sed command
Hi all,
I am going through regex concepts these days, but it's really confusing... So forgive me if I ask some childish questions here... I have the following file named Test.txt:
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
command 1:
Code:
sed 's/Hary*/ hahahaha /g' Test.txt
Code:
hahahaha ry boss linux is a tuff job but u can get it only with consistency.... so hahahaha ry make sure that u r doing it...
command 2:
Code:
sed 's/Hary?/ hahahaha /g' Test.txt
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
command 3:
Code:
sed 's/\(Hary\)*/ # /g' Test.txt
Code:
# H # a # r # r # y # # b # o # s # s # # l # i # n # u # x # # i # s # # a # # t # u # f # f # # j # o # b # # b # u # t # # u # # c # a # n # # g # e # t # # i # t # # o # n # l # y # # w # i # t # h # # c # o # n # s # i # s # t # e # n # c # y # . # . # . # . # # s # o # # H # a # r # r # y # # m # a # k # e # # s # u # r # e # # t # h # a # t # # u # # r # # d # o # i # n # g # # i # t # . # . # . #
command 4:
Code:
sed 's/\(Hary\)?/ # /g' Test.txt
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
command 5:
Code:
sed 's/\(Hary\)\*/ # /g' Test.txt
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
command 6:
Code:
sed 's/\(Hary\)\?/ # /g' Test.txt
Code:
# H # a # r # r # y # # b # o # s # s # # l # i # n # u # x # # i # s # # a # # t # u # f # f # # j # o # b # # b # u # t # # u # # c # a # n # # g # e # t # # i # t # # o # n # l # y # # w # i # t # h # # c # o # n # s # i # s # t # e # n # c # y # . # . # . # . # # s # o # # H # a # r # r # y # # m # a # k # e # # s # u # r # e # # t # h # a # t # # u # # r # # d # o # i # n # g # # i # t # . # . # . #
Please help me... Thanks in advance.. Regards _Linux_Learner |
Okay, it might be useful to know what you intended to do with each of the commands. But, for now, I can give an understanding of the first one.
Code:
sed 's/Hary*/ hahahaha /g' Test.txt
Did you perhaps mean for the regex string to be "Harry*"? |
*Learner;
With regard to your original post: Too much information!! It does not take that many examples to illustrate a question or a problem. The problem **might** be as simple as recognizing that "*" inside of a sed regex is NOT a wildcard. If this and the post above don't solve the issue, then post a SHORT before-and-after example of the results you are looking for. |
"*" does NOT mean "any character". It means 0 or more of the previous character.
So "Hary*" means "Har" followed by 0 or more "y"s. Then sed replaces whatever matched with the second string. In "Harry boss linux ...", "Hary*" matches only the leading "Har" (zero "y"s, because the next character is "r"), and because sed then swaps out the match with the replacement text, the line becomes " hahahaha ry boss linux ..." |
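To see the point above in isolation, here is a minimal sketch using a shortened copy of the Test.txt line. The variable names are only illustrative, and the second input ("Haryyy boss") is a made-up string that is not in the original file.

```shell
# Shortened sample taken from the thread's Test.txt
line='Harry boss linux'

# "Hary*" = "Har" followed by zero or more "y".
# In "Harry" the character after "Har" is "r", so only "Har" is matched.
out_short=$(printf '%s\n' "$line" | sed 's/Hary*/ hahahaha /g')
printf '%s\n' "$out_short"    # prints " hahahaha ry boss linux"

# Hypothetical input with trailing y's: the star consumes all of them.
out_long=$(printf 'Haryyy boss\n' | sed 's/Hary*/X/g')
printf '%s\n' "$out_long"     # prints "X boss"
```

The leftover "ry" in the first result is simply the part of "Harry" that the pattern never covered.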
Quote:
The above explanation is OK. But please explain the results of the other 5 commands that I posted... If * is checking for 0 or more occurrences of y, then why is ? not checking for 0 or 1 occurrence of y? It should also give the same result... Thanks in advance.. Regards _Linux_Learner |
Please don't expect us to explain every detail..... Sometimes the best advice you get is when people help you to understand basic concepts.
What I have found is that I often have to run tests until I understand what is going on. Look up the definitions of the various "metacharacters" and try things until the basic concepts become clear. After posting this, I'll double-check, but:
* means any number (including zero) of the previous regex
? means that the previous regex either occurs once or does not occur at all, but never more than once
Simple tests will illustrate the difference. |
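A few such tests might look like the sketch below. It assumes GNU sed, where the -r flag switches on extended regexes so that a bare ? acts as a quantifier; the inputs "ab" and "Haryyy" are invented for illustration. The first test also hints at why a starred group can put the replacement between every character: zero repetitions is a valid match, so the empty string matches everywhere.

```shell
# '*' allows zero repetitions, so \(Hary\)* can match the empty string
# at every position of a line that does not contain "Hary".
out_empty=$(printf 'ab\n' | sed 's/\(Hary\)*/#/g')
printf '%s\n' "$out_empty"    # GNU sed prints "#a#b#"

# '*' takes any number of y's (GNU sed, extended regexes via -r)
out_many=$(printf 'Haryyy\n' | sed -r 's/Hary*/X/')
printf '%s\n' "$out_many"     # prints "X"

# '?' takes at most one y, so two y's are left over
out_opt=$(printf 'Haryyy\n' | sed -r 's/Hary?/X/')
printf '%s\n' "$out_opt"      # prints "Xyy"
```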
@pixellany - did some searching as it was baffling me a little why it wasn't working with "?":
Quote:
|
grail;
I think we're saying the same thing.... "?", meaning "optional" (AKA "occurring zero times or once"), is part of extended regexes. To use it in sed, you have to escape it, as you have shown, or turn on extended regexes using the sed -r flag. We seem to have lost our OP. |
I almost always use extended regular expressions (sed -r). They are more regular and expressive ;)
|
pixellany, What I was trying to point out is that sed requires slosh (\) or escape prior to the use of ? in a regex.
So the OP's example, sed 's/Hary?/ hahahaha /g' Test.txt, correctly returns no change, as the literal text "Hary?" is not in the file. However, this does work:
Code:
sed 's/Hary\?/ hahahaha /g' Test.txt |
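The difference between the literal and the escaped question mark can be checked side by side. This is a sketch assuming GNU sed (\? in basic regexes is a GNU extension); the sample text, which deliberately contains a literal "Hary?", is invented and not from Test.txt.

```shell
# Hypothetical sample containing both "Harry" and a literal "Hary?"
text='Harry and Hary? too'

# Basic regex, unescaped ?: it is an ordinary character, so only "Hary?" matches
lit=$(printf '%s\n' "$text" | sed 's/Hary?/X/g')
printf '%s\n' "$lit"    # prints "Harry and X too"

# Basic regex, escaped \?: the y becomes optional (GNU sed)
opt=$(printf '%s\n' "$text" | sed 's/Hary\?/X/g')
printf '%s\n' "$opt"    # prints "Xry and X? too"

# Extended regex (-r): the roles flip, a bare ? is the quantifier
ere=$(printf '%s\n' "$text" | sed -r 's/Hary?/X/g')
printf '%s\n' "$ere"    # prints "Xry and X? too"
```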
Quote:
|
Sorry I seem to have suffered from RTFA (A for answer) :redface:
Also, I did not know that previously, still sharpening my sedjitsu |
As part of the *n(i|u)x master plan for obfuscation (MPO), various utilities have different ways of turning on extended regexes (EREs):
sed -r
grep -E
egrep
awk (uses ERE by default)
other examples? |
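As a rough sketch of the list above, the same extended pattern can be handed to three of these tools. It assumes GNU sed for the -r flag; the pattern Harr?y and the sample line are chosen only for illustration.

```shell
line='Harry boss'

# sed: -r turns on extended regexes (GNU sed)
sed_out=$(printf '%s\n' "$line" | sed -r 's/Harr?y/X/')
printf '%s\n' "$sed_out"     # prints "X boss"

# grep: -E turns on extended regexes; -c counts matching lines
grep_out=$(printf '%s\n' "$line" | grep -E -c 'Harr?y')
printf '%s\n' "$grep_out"    # prints "1"

# awk: extended regexes by default, no flag needed
awk_out=$(printf '%s\n' "$line" | awk '{ sub(/Harr?y/, "X"); print }')
printf '%s\n' "$awk_out"     # prints "X boss"
```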