echo "Find 5-character words which have the same letter in positions 1 and 3."
echo " Examples: fifth, mamma, sassy, total."
egrep '^(.).\1..$' $Work1 >$OutFile
... and it works.
To make the exercise more interesting I coded this ...
Code:
echo "Find 5-character words which have the same letter in positions 1 and 3"
echo " --and-- the character in positions 1 and 3 is not used elsewhere."
echo " Examples: fifth, total."
egrep '^(.)[^\1]\1[^\1][^\1]$' $Work1 >$OutFile
... and it produces an OutFile identical to the first exercise. Evidently using [^\1] to exclude a specific character from a character class is not doing the job.
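A quick way to see what [^\1] is really doing, as a minimal sketch: inside a bracket expression the backslash loses its special meaning, so [^\1] just means "any character except a backslash or the digit 1". The four-word list and the function names below are made up for illustration (the original $Work1 file is not available), and the sketch assumes GNU grep, which accepts back-references with -E.

```shell
# Hypothetical sample words; these stand in for the thread's $Work1 word list.
words='fifth
mamma
sassy
total'

# Inside a bracket expression "\1" is NOT a back-reference:
# [^\1] matches any character except a backslash or the digit 1.
bracket_matches() {
    printf '%s\n' "$words" | grep -E '^(.)[^\1]\1[^\1][^\1]$'
}

# The looser pattern from the first exercise, for comparison.
plain_matches() {
    printf '%s\n' "$words" | grep -E '^(.).\1..$'
}

# Both functions print all four words, which is why the two
# output files came out identical.
bracket_matches
```

The only lines the bracket version would reject are ones containing a literal backslash or a 1 in positions 2, 4, or 5, which ordinary dictionary words never do.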
I found a solution but it is a little out there. Also my reference was posted over 5 years ago so there may be an alternative now.
Anyhoo, here is what worked:
Code:
grep -P '^(.)(?!\1).\1(?!\1).(?!\1).$' $Work1 >$OutFile
It seems you cannot negate a back-reference inside a bracket expression, but you can put one inside a negative look-ahead. You will notice I have also switched from -E (what you are using) to -P for Perl regular expressions, which support look-aheads.
I will be interested to see if there is an alternative or even a way to shorten the current solution.
Lookaround assertions are zero-width patterns which match a specific pattern without including it in $& . Positive assertions match when their subpattern matches, negative assertions match when their subpattern fails. Lookbehind matches text up to the current match position, lookahead matches text following the current match position.
Code:
(?!pattern)
A zero-width negative lookahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar".
If you are looking for a "bar" that isn't preceded by a "foo", /(?!foo)bar/ will not do what you want. That's because the (?!foo) is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. Use lookbehind instead (see below).
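The three cases described above can be tried directly with grep -P. This is only a sketch: the function names and the sample input lines are invented for illustration, and it assumes a grep built with PCRE support.

```shell
# foo(?!bar): match "foo" only when "bar" does not follow it.
foo_not_before_bar() {
    grep -P 'foo(?!bar)'
}

# (?!foo)bar: the look-ahead examines the text starting AT "bar",
# so it never sees a preceding "foo" and "foobar" still matches.
misplaced_lookahead() {
    grep -P '(?!foo)bar'
}

# (?<!foo)bar: a negative look-behind checks the text BEFORE "bar".
bar_not_after_foo() {
    grep -P '(?<!foo)bar'
}

printf 'foobar\nfoobaz\n' | foo_not_before_bar    # prints: foobaz
printf 'foobar\n'         | misplaced_lookahead   # prints: foobar
printf 'foobar\nbazbar\n' | bar_not_after_foo     # prints: bazbar
```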
grep -P    enables Perl-compatible regular expressions (PCRE)
^(.)       matches any character at the start of the line and captures it as \1
(?!\1)     zero-width check: succeeds only if the next character is not \1; consumes and captures nothing
.          matches any character (which, given the check before it, cannot be \1)
\1         matches the captured character again
(?!\1)     same zero-width check as above
.          matches any character (again, not \1)
(?!\1)     same zero-width check as above
.          matches any character (again, not \1)
$          end of the line
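Putting the pieces of the breakdown back together gives one pattern. The sketch below runs it over the four example words from the original question; the inline word list, the variable name, and the function name are assumptions for illustration, standing in for the $Work1 file, and a grep with -P support is assumed.

```shell
# Hypothetical sample words standing in for the thread's $Work1 file.
words='fifth
mamma
sassy
total'

# Same letter in positions 1 and 3, and that letter appears nowhere else.
# Each (?!\1) guards the "." that immediately follows it.
pattern='^(.)(?!\1).\1(?!\1).(?!\1).$'

unique_first_third() {
    grep -P "$pattern"
}

printf '%s\n' "$words" | unique_first_third
# prints:
# fifth
# total
```

mamma and sassy drop out because position 4 repeats the captured letter, so the second (?!\1) fails there.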