([^)]*)
gm
1st Capturing Group ([^)]*)
Match a single character not present in the list below [^)]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
) matches the character ) with index 41 in decimal (29 in hexadecimal, 51 in octal) literally (case sensitive)
( # grouping
[^)] # Match a single character not present in the list below = anything but )
* # matches the previous token between zero and unlimited times, as many times as possible
) # closing paren of the group
so it means altogether any number of any char but ).
But there can be another solution:
Code:
(
any number of any char but )
)
It depends on your regex engine: an unescaped paren is either taken as a literal paren or used for grouping.
First, the expression is looking for "zero or more repetitions of" one of two possible characters.
Then, by enclosing the group in parentheses, it is identifying it as a string that can be returned to the caller of the string-matching function, giving the caller the exact sequence of characters (possibly in this case "empty string") that were matched. A regular expression can have any number of these parenthesized groups, each of which is returned as an element in an array of values.
So this gives you not only the yes/no fact that "there was a successful match," but also "exactly what matched" in various designated portions of the input string. You can be provided not only with the matching string value, but also with exactly where in the input string it was found.
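To illustrate what a capturing group hands back, here is a minimal sketch from the shell. It assumes a POSIX shell, a `sed` that accepts `-E`, and GNU grep's `-o` and `-b` options; the sample string and the variable names are made up for this example.

```shell
# Sample input (made up for this illustration).
line='call foo(bar, baz) now'

# sed -E uses extended syntax: \( and \) are literal parentheses,
# while the unescaped ( ) pair forms capturing group 1, referenced as \1.
inner=$(printf '%s\n' "$line" | sed -E 's/.*\(([^()]*)\).*/\1/')

# grep -o prints only the matched text; with GNU grep, -b prefixes
# the byte offset at which that match starts.
where=$(printf '%s\n' "$line" | grep -ob '([^()]*)')

printf 'captured text: %s\n' "$inner"
printf 'offset:match  = %s\n' "$where"
```

The first command shows "what matched" inside the group, the second shows "where" the match was found in the input.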
Last edited by sundialsvcs; 04-15-2022 at 01:52 PM.
That regex says to match for any of three characters & or 3 or 5, followed by a left parenthesis followed by zero or more characters that are not right parenthesis followed by right parenthesis.
Would it be easier to understand if you just simplified the regex to this?
Code:
[^)]*
You're not doing anything with the capturing groups, so you can take out the outer brackets.
I also don't get where you're seeing that it matches "three times". It matches every substring that starts with a character that isn't either whitespace or a closing parenthesis, so you should typically get a lot more than three matches.
Quote:
Originally Posted by sundialsvcs
First, the expression is looking for "zero or more repetitions of" one of two possible characters.
That's wrong. When a caret appears as the first character between two square brackets, it's a negation character.
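A quick way to see the difference the caret's position makes, as a rough sketch assuming grep supports `-o` (GNU and BSD grep do); the sample string and variable names are invented for the demonstration.

```shell
# Sample input containing a caret and a closing parenthesis.
sample='a^b)c'

# [^)] : the caret comes first, so the class is negated and
# matches every character except ")".
negated=$(printf '%s\n' "$sample" | grep -o '[^)]' | tr -d '\n')

# [)^] : the caret is not first, so it is an ordinary character and
# the class matches only ")" or "^".
literal=$(printf '%s\n' "$sample" | grep -o '[)^]' | tr -d '\n')

printf 'negated class matched: %s\n' "$negated"
printf 'literal class matched: %s\n' "$literal"
```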
Faki, obviously, the regex you wrote is wrong and it's not doing what you intended, whatever that is (you forgot to tell us). Were you looking for something to match the insides of parentheses?
Quote:
Would it be easier to understand if you just simplified the regex to this?
Code:
[^)]*
You're not doing anything with the capturing groups, so you can take out the outer brackets.
If these are basic (not extended) regular expressions, then those "outer brackets" are taken literally, and the expression
Code:
([^)]*)
means "a left parenthesis, followed by any number (zero or more) of characters that are not a right parenthesis, and finally a right parenthesis."
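The basic-versus-extended difference can be checked directly with grep. This is only a sketch, assuming a grep that supports `-o` and `-E`; the sample string and variable names are made up.

```shell
# Sample input (made up for this illustration).
sample='f(x) + g'

# Basic syntax (plain grep): ( and ) are literal characters, so the
# pattern matches a parenthesized chunk.
bre=$(printf '%s\n' "$sample" | grep -o '([^)]*)')

# Extended syntax (grep -E): ( and ) form a group, so the pattern
# reduces to [^)]* and matches runs of characters other than ")".
# Only the first reported match is kept here.
ere=$(printf '%s\n' "$sample" | grep -oE '([^)]*)' | sed -n '1p')

printf 'basic    first match: %s\n' "$bre"
printf 'extended first match: %s\n' "$ere"
```

The same pattern text therefore means two different things depending on which syntax the tool uses.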
This almost matches the innermost level of a multi-level parenthesized expression, which seems to be what was intended, but if so, the regex isn't quite right. If that were indeed the intent, this seems to work:
Code:
grep '([^()]*)'
It is useful to include the "--color" option so that you can see just what is being matched.
Note also that grep processes each input line separately, so a multi-line block of code such as in post #3 might not yield the expected result.
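Both points can be tried out on small made-up inputs. A minimal sketch, assuming grep supports `-o` and `-c`; the input strings and variable names are hypothetical.

```shell
# Nested input: only the innermost pair contains no further parentheses.
nested='a(b(c)d)e'
innermost=$(printf '%s\n' "$nested" | grep -o '([^()]*)')

# A parenthesized expression split over two lines: grep tests each
# line on its own, so neither line contains a complete "(...)".
# "|| true" only hides grep's non-zero exit status for "no match".
split_count=$(printf 'f(a,\nb)\n' | grep -c '([^()]*)' || true)

printf 'innermost match: %s\n' "$innermost"
printf 'matching lines in split input: %s\n' "$split_count"
```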
Here's something to consider: "a regular expression is a very tiny computer program, condensed into a single pregnant line of text." And this just might be the very best way to consider it ... because this actually is how the software implementations of "regular-expression handlers" approach the problem: they "compile" the expression into an intermediate form, then they "execute" it.
The "regex language" is not actually terribly complicated, at least in most cases. But, it is "extremely compact," and there are very many tutorials and practice-websites which can help you to understand it.
FYI: It is also useful to notice that there are many libraries of "pre-debugged regular expressions" out there, for various languages. So, before you spend too much time trying to "puzzle out" your particular problem as though it were brand-new, look to see if it hasn't already been solved. (The "regular expression syntax" used by most languages is also usually the same, so you might be able to "steal" a workable regex from some other language's library.)
Last edited by sundialsvcs; 04-17-2022 at 07:47 PM.