I am going through regex concepts these days, but it's really confusing... So forgive me if I ask some childish questions here...
I have the following file named Test.txt
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
Now look at the amazing regex expressions Harry.... they are tolling me badly...
Now I am running the following commands and the results I am getting are really hard for me to predict....
Command 1:
Code:
sed 's/Hary*/ hahahaha /g' Test.txt
Result 1:
Code:
hahahaha ry boss linux is a tuff job but u can get it only with consistency.... so hahahaha ry make sure that u r doing it...
Now look at the amazing regex expressions hahahaha ry.... they are tolling me badly...
Command 2:
Code:
sed 's/Hary?/ hahahaha /g' Test.txt
Result 2:
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
Now look at the amazing regex expressions Harry.... they are tolling me badly...
Command 3:
Code:
sed 's/\(Hary\)*/ # /g' Test.txt
Result 3:
Code:
# H # a # r # r # y # # b # o # s # s # # l # i # n # u # x # # i # s # # a # # t # u # f # f # # j # o # b # # b # u # t # # u # # c # a # n # # g # e # t # # i # t # # o # n # l # y # # w # i # t # h # # c # o # n # s # i # s # t # e # n # c # y # . # . # . # . # # s # o # # H # a # r # r # y # # m # a # k # e # # s # u # r # e # # t # h # a # t # # u # # r # # d # o # i # n # g # # i # t # . # . # . #
#
# N # o # w # # l # o # o # k # # a # t # # t # h # e # # a # m # a # z # i # n # g # # r # e # g # e # x # # e # x # p # r # e # s # s # i # o # n # s # # H # a # r # r # y # . # . # . # . # # t # h # e # y # # a # r # e # # t # o # l # l # i # n # g # # m # e # # b # a # d # l # y # . # . # . # # #
Command 4:
Code:
sed 's/\(Hary\)?/ # /g' Test.txt
Result 4:
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
Now look at the amazing regex expressions Harry.... they are tolling me badly...
Command 5:
Code:
sed 's/\(Hary\)\*/ # /g' Test.txt
Result 5:
Code:
Harry boss linux is a tuff job but u can get it only with consistency.... so Harry make sure that u r doing it...
Now look at the amazing regex expressions Harry.... they are tolling me badly...
Command 6:
Code:
sed 's/\(Hary\)\?/ # /g' Test.txt
Result 6:
Code:
# H # a # r # r # y # # b # o # s # s # # l # i # n # u # x # # i # s # # a # # t # u # f # f # # j # o # b # # b # u # t # # u # # c # a # n # # g # e # t # # i # t # # o # n # l # y # # w # i # t # h # # c # o # n # s # i # s # t # e # n # c # y # . # . # . # . # # s # o # # H # a # r # r # y # # m # a # k # e # # s # u # r # e # # t # h # a # t # # u # # r # # d # o # i # n # g # # i # t # . # . # . #
#
# N # o # w # # l # o # o # k # # a # t # # t # h # e # # a # m # a # z # i # n # g # # r # e # g # e # x # # e # x # p # r # e # s # s # i # o # n # s # # H # a # r # r # y # . # . # . # . # # t # h # e # y # # a # r # e # # t # o # l # l # i # n # g # # m # e # # b # a # d # l # y # . # . # . # # #
I have gone through a number of regex tutorials on Google, but I am still unclear about the above results...
Okay, it might be useful to know what you intended to do with each of the commands. But, for now, I can give an understanding of the first one.
Code:
sed 's/Hary*/ hahahaha /g' Test.txt
The regex "Hary*" matches "Har" followed by zero or more "y"s. So when sed reaches the name "Harry", it matches the first three letters "Har" and then zero "y"s, because the next character is "r". It then replaces that match with " hahahaha ", which leaves " hahahaha ry" as the final string. The /g flag just means do this for every occurrence on the line.
Did you perhaps mean for the regex string to be "Harry*"?
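To see the difference between the two patterns, here is a small sketch. It runs on a one-line sample copied from the start of Test.txt rather than on the file itself, and the variable names are only for illustration.

```shell
# Sample input: the first words of Test.txt
line='Harry boss linux'

# "Hary*" is "Har" plus zero or more "y": only "Har" matches, "ry" is left over
out_hary=$(printf '%s\n' "$line" | sed 's/Hary*/ hahahaha /g')

# "Harry*" is "Harr" plus zero or more "y": the whole name matches
out_harry=$(printf '%s\n' "$line" | sed 's/Harry*/ hahahaha /g')

printf '%s\n' "$out_hary"    # prints " hahahaha ry boss linux"
printf '%s\n' "$out_harry"   # prints " hahahaha  boss linux"
```

The second result has two spaces before "boss": one from the replacement text and one that was already in the line.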
"*" does NOT mean "any character". It means 0 or more of the previous character.
So "Hary*" means "Har" followed by 0 or more "y"s.
Then sed replaces whatever matched with the second string.
So "Hary*" matches only the "Har" in "Harry boss linux ..." (with zero "y"s, since the next character is "r"), and because sed swaps out just the matched text with the replacement, the line becomes " hahahaha ry boss linux ..."
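A quick way to check what "Hary*" really matches is to put the match in brackets using "&", which sed expands to the matched text. This is only a sketch on made-up sample words, not on Test.txt.

```shell
# "&" in the replacement stands for whatever the pattern matched,
# so wrapping it in brackets makes each match visible.
out=$(printf 'Ha Har Hary Haryyy Harry\n' | sed 's/Hary*/[&]/g')
printf '%s\n' "$out"   # prints: Ha [Har] [Hary] [Haryyy] [Har]ry
```

"Ha" is untouched because the pattern needs at least "Har", and in "Harry" the match stops after "Har" because the next character is not a "y".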
Hi,
The above explanation is OK. But please explain the results of the other 5 commands that I posted...
If * is checking for 0 or more occurrences of y, then why is ? not checking for 0 or 1 occurrence of y? It should also give the same result...
Please don't expect us to explain every detail..... Sometimes the best advice you get is when people help you to understand basic concepts.
What I have found is that I often have to run tests until I understand what is going on.
Look up the definitions of the various "metacharacters" and try things until the basic concepts become clear.
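As an example of that kind of experiment, the sketch below tries the grouped patterns from commands 3 and 5 on tiny made-up inputs. It assumes GNU sed, and the inputs and variable names are invented for the test.

```shell
# A group followed by "*" may repeat zero times, so it also matches the
# empty string at every position where "Hary" is absent.
out_star=$(printf 'abc\n' | sed 's/\(Hary\)*/#/g')
printf '%s\n' "$out_star"      # prints: #a#b#c#

# "\*" is an escaped, literal asterisk: this only matches the text "Hary*".
out_literal=$(printf 'Hary* Harry\n' | sed 's/\(Hary\)\*/#/g')
printf '%s\n' "$out_literal"   # prints: # Harry
```

The first result has the same shape as Result 3, where a " # " lands between every character. The second suggests why command 5 changed nothing: Test.txt never contains "Hary" followed by a real asterisk.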
After posting this, I'll double-check, but:
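* means any number of the previous regex, including none at all
? means that the previous regex either occurs or does not occur, but does not occur more than once.
"?", meaning "optional" (AKA "occurring zero times or once"), is part of extended regexes. In sed's default basic regexes, a bare "?" is just a literal question mark, which is why commands 2 and 4 changed nothing. To use it in sed, you have to escape it as "\?" (a GNU sed extension), as you have shown, or turn on extended regexes using the sed -r flag.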
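The three spellings can be compared side by side. This is a sketch assuming GNU sed; the sample text is made up, and -E is used as the more portable spelling of GNU sed's -r.

```shell
text='Harry Hary Hary?'

# Basic regex (sed's default): "?" is an ordinary character.
bre=$(printf '%s\n' "$text" | sed 's/Hary?/X/g')

# GNU sed extension: "\?" makes the preceding "y" optional.
gnu=$(printf '%s\n' "$text" | sed 's/Hary\?/X/g')

# Extended regex: "?" is the optional operator without a backslash.
ere=$(printf '%s\n' "$text" | sed -E 's/Hary?/X/g')

printf '%s\n' "$bre"   # prints: Harry Hary X
printf '%s\n' "$gnu"   # prints: Xry X X?
printf '%s\n' "$ere"   # prints: Xry X X?
```

With the default syntax, only the word that contains a real question mark is replaced.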
We seem to have lost our OP
Last edited by pixellany; 04-24-2010 at 09:02 AM.
Reason: typo
As part of the *n(i|u)x master plan for obfuscation (MPO), various utilities have different ways of turning on extended regexes (EREs)
sed -r
grep -E
egrep
awk (uses ERE by default)
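For illustration, here is the same extended regex handed to three of those tools. It is a sketch on made-up input; sed -E is used as the portable spelling of -r, and egrep is left out because grep -E does the same job.

```shell
words='Har
Hary
Haryy'

# Each tool is asked for whole lines matching "Har" plus an optional "y".
with_grep=$(printf '%s\n' "$words" | grep -E '^Hary?$')
with_sed=$(printf '%s\n' "$words" | sed -E -n '/^Hary?$/p')
with_awk=$(printf '%s\n' "$words" | awk '/^Hary?$/')

printf '%s\n' "$with_grep"   # prints "Har" and "Hary", one per line
```

All three should select the same two lines; "Haryy" is rejected because "?" allows at most one "y".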