LinuxQuestions.org
Old 04-15-2022, 01:10 PM   #1
Faki
Member
 
Registered: Oct 2021
Posts: 574

Meaning of regular expression


Would like to understand the meaning of the following regular expression.

Code:
([^)]*)
 
Old 04-15-2022, 01:13 PM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

You can use www.regex101.com in such cases; it is an excellent site.
 
Old 04-15-2022, 01:20 PM   #3
Faki
Member
 
Registered: Oct 2021
Posts: 574

Original Poster
I have applied it to the following code but am still having difficulty.

Code:
(interactive
   (list (read-regexp "Regex: ")
	 (region-beginning)
	 (region-end) ))
Using www.regex101.com gave

Code:
([^)]*)
gm
1st Capturing Group ([^)]*)
Match a single character not present in the list below [^)]
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
) matches the character ) with index 41 (decimal; 29 hex or 51 octal) literally (case sensitive)
Counting the number of matches gives 3.
 
Old 04-15-2022, 01:22 PM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Is there a question here? You can also enter your sample text (on that page) and see how it works.
 
Old 04-15-2022, 01:35 PM   #5
Faki
Member
 
Registered: Oct 2021
Posts: 574

Original Poster
The question is still this. What does the following expression match? And how does it match three times on the mentioned example?

Code:
([^)]*)

Last edited by Faki; 04-15-2022 at 01:37 PM.
 
Old 04-15-2022, 01:41 PM   #6
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

OK, so go through it:
Code:
(       # grouping
[^)]    # Match a single character not present in the list below = anything but )
*       # matches the previous token between zero and unlimited times, as many times as possible
)       # closing paren of the group
So altogether it means: any number of any character except ).
But there is another possible reading:
Code:
(
any number of any char but )
)
It depends on your regex engine (whether a paren is taken as a literal paren or used for grouping).
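To make the two readings concrete, here is a minimal sketch. It assumes GNU grep, and the input string a(b)c and the variable names are made up for illustration. In basic syntax (grep's default) a bare paren is a literal character and \( \) groups; with -E it is the other way round.

```shell
# Made-up sample input, not from the thread.
input='a(b)c'

# Basic syntax (grep default): bare parens are literal characters.
bre_literal=$(printf '%s\n' "$input" | grep -o '([^)]*)')

# Extended syntax (-E): bare parens only group, so this is just [^)]*.
ere_group=$(printf '%s\n' "$input" | grep -Eo '([^)]*)')

# Escaping the parens flips the meaning in each flavour.
bre_group=$(printf '%s\n' "$input" | grep -o '\([^)]*\)')
ere_literal=$(printf '%s\n' "$input" | grep -Eo '\([^)]*\)')

printf 'basic, bare parens:\n%s\n' "$bre_literal"        # (b)
printf 'extended, bare parens:\n%s\n' "$ere_group"       # a(b and c, on separate lines
printf 'basic, escaped parens:\n%s\n' "$bre_group"       # a(b and c, on separate lines
printf 'extended, escaped parens:\n%s\n' "$ere_literal"  # (b)
```

So the same seven characters give either "a parenthesized chunk" or "runs of text with no closing paren", depending only on the flavour in use.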
 
Old 04-15-2022, 01:50 PM   #7
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

There are two parts to it:

First, the expression is looking for "zero or more repetitions of" one of two possible characters.

Then, by enclosing the group in parentheses, it is identifying it as a string that can be returned to the caller of the string-matching function, giving the caller the exact sequence of characters (possibly in this case "empty string") that were matched. A regular expression can have any number of these parenthesized groups, each of which is returned as an element in an array of values.

So this gives you not only the yes/no fact that "there was a successful match," but also "exactly what matched" in various designated portions of the input string. You can be provided not only with the matching string value, but also with exactly where in the input string it was found.
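A rough illustration of getting back what a group matched, and where. This is only a sketch, assuming GNU grep and sed; the input string and variable names are made up. Note that sed and plain grep use basic syntax, so here the capture group is written \( \) and the bare parens are literal characters.

```shell
# Made-up sample input, not from the thread.
input='foo(bar)baz'

# \( \) marks the capture group; \1 is whatever that group matched.
inside=$(printf '%s\n' "$input" | sed 's/.*(\([^)]*\)).*/\1/')

# With -o, -b prefixes each match with its 0-based byte offset in the input.
where=$(printf '%s\n' "$input" | grep -bo '([^)]*)')

printf 'captured: %s\n' "$inside"   # captured: bar
printf 'found at: %s\n' "$where"    # found at: 3:(bar)
```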

Last edited by sundialsvcs; 04-15-2022 at 01:52 PM.
 
Old 04-15-2022, 03:49 PM   #8
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,137
Blog Entries: 6

Quote:
The question is still this. What does the following expression match?
Why don't you try it and see for yourself?
Code:
text="
abcd()efg()hij&
12345
5789()abc
"

grep -Eo '([^)]*)' <<< "$text"

grep -Po '([^)]*)' <<< "$text"

grep -Eo '([^0-9]*)' <<< "$text"

echo "$%^&()[]{}123()345()789" | grep -o '([^)]*)'

echo "$%^&()[]{}123()345()789" | grep -Eo '([^)]*)'
etc.
 
Old 04-16-2022, 09:44 AM   #9
Faki
Member
 
Registered: Oct 2021
Posts: 574

Original Poster
I cannot understand how the following gives

Result:
Code:
()
()
()
Code:
echo "$%^&()[]{}123()345()789" | grep -o '([^)]*)'
I understand you want me to try for myself, but am not making sense of the results.
 
Old 04-16-2022, 10:11 AM   #10
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

With the -o option, grep prints only the matched (non-empty) parts of matching lines, with each such part on a separate output line.
Consider
Code:
bash-5.1$ echo "$%^&(x)[]{}123(xx)345(xxx)789" | grep -o '[&35]([^)]*)'
&(x)
3(xx)
5(xxx)
That regex says: match any one of the three characters &, 3 or 5, followed by a left parenthesis, followed by zero or more characters that are not a right parenthesis, followed by a right parenthesis.
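The same reasoning accounts for the three () lines in post #9. Here is a minimal sketch, assuming GNU grep (the variable names are mine). By default grep uses basic syntax, where the parens are literal, so each empty pair () is one match. With -E the parens only group and the pattern reduces to [^)]*, which gives four runs instead. The empty matches that [^)]* finds at each ) are not printed, because -o skips them.

```shell
# The string from post #9; single quotes keep the shell from touching "$".
input='$%^&()[]{}123()345()789'

# Basic syntax (grep default): "(" and ")" are literal characters.
basic=$(printf '%s\n' "$input" | grep -o '([^)]*)')

# Extended syntax (-E): the parens only group, so this is just [^)]*.
extended=$(printf '%s\n' "$input" | grep -Eo '([^)]*)')

printf 'basic syntax:\n%s\n' "$basic"        # three lines, each "()"
printf 'extended syntax:\n%s\n' "$extended"  # four lines: $%^&(  []{}123(  345(  789
```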

Last edited by allend; 04-16-2022 at 10:17 AM.
 
Old 04-16-2022, 10:12 AM   #11
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,224

Would it be easier to understand if you just simplified the regex to this?
Code:
[^)]*
You're not doing anything with the capturing groups, so you can take out the outer brackets.

I also don't get where you're seeing that it matches "three times". With the parentheses used for grouping, it matches every run of characters (whitespace included) that contains no closing parenthesis, so you should typically get more than three matches.

Quote:
Originally Posted by sundialsvcs
First, the expression is looking for "zero or more repetitions of" one of two possible characters.
That's wrong. When a caret appears as the first character between two square brackets, it's a negation character.

Faki, obviously, the regex you wrote is wrong and it's not doing what you intended, whatever that is (you forgot to tell us). Were you looking for something to match the insides of parentheses?

Last edited by dugan; 04-16-2022 at 11:25 AM.
 
Old 04-16-2022, 03:55 PM   #12
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Quote:
Originally Posted by dugan
Would it be easier to understand if you just simplified the regex to this?
Code:
[^)]*
You're not doing anything with the capturing groups, so you can take out the outer brackets.
If these are basic (not extended) regular expressions, then those "outer brackets" are taken literally, and the expression
Code:
([^)]*)
means "a left parenthesis, followed by any number (zero or more) of characters that are not a right parenthesis, and finally a right parenthesis."

This almost matches the innermost level of a multi-level parenthesized expression, which seems to be what was intended, but if so, the regex isn't quite right. If that were indeed the intent, this seems to work:
Code:
 grep '([^()]*)'
It is useful to include the "--color" option so that you can see just what is being matched.

Note also that grep processes each input line separately, so a multi-line block of code such as in post #3 might not yield the expected result.
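Both points can be checked with a short sketch. It assumes GNU grep; the nested input string and the variable names are made up for illustration.

```shell
# Made-up nested input, not from the thread.
nested='f(g(x), h(y))'

# Excluding "(" as well as ")" leaves only the innermost pairs.
innermost=$(printf '%s\n' "$nested" | grep -o '([^()]*)')

# Excluding only ")" lets a match start at an outer "(" and stop at the first ")".
first_try=$(printf '%s\n' "$nested" | grep -o '([^)]*)')

# A pair split across two lines is never matched, because grep tests each
# line on its own. grep -c exits non-zero on zero matches, hence "|| true".
split_count=$(printf 'a(b\nc)d\n' | grep -c '([^()]*)' || true)

printf 'innermost:\n%s\n' "$innermost"      # (x) and (y)
printf 'only ) excluded:\n%s\n' "$first_try"  # (g(x) and (y)
printf 'matching lines when split: %s\n' "$split_count"  # 0
```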

Last edited by rknichols; 04-16-2022 at 03:57 PM.
 
Old 04-16-2022, 04:34 PM   #13
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,137
Blog Entries: 6

Yup, that's nifty.

Code:
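text="(Mary)(had)a(little)lamb(.)x(123)"

# Basic syntax: literal parens around anything that contains no parens.
grep -o '([^()]*)' <<< "$text"

# Note: inside [...] a "!" is an ordinary character, not a negation (that is "^").
# So this requires the last character before ")" to be "!" or a digit.
grep -o '([^()]*[!0-9])' <<< "$text"

# Likewise: the last character before ")" must be "!" or a letter.
grep -o '([^()]*[!A-Za-z])' <<< "$text"

# Extended syntax: the parens only group, so this prints each run of non-paren characters.
grep -Eo '([^()]*)' <<< "$text"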
 
Old 04-17-2022, 07:38 PM   #14
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Quote:
Originally Posted by dugan
That's wrong. When a caret appears as the first character between two square brackets, it's a negation character.
Correct as noted. "Me bad. Oopsie!" The correct interpretation of ^) in this case is: "is not a right parenthesis."

Last edited by sundialsvcs; 04-17-2022 at 07:40 PM.
 
Old 04-17-2022, 07:45 PM   #15
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Here's something to consider: "a regular expression is a very tiny computer program, condensed into a single pregnant line of text." And this just might be the very best way to consider it ... because this actually is how the software implementations of "regular-expression handlers" approach the problem: they "compile" the expression into an intermediate form, then they "execute" it.

The "regex language" is not actually terribly complicated, at least in most cases. But, it is "extremely compact," and there are very many tutorials and practice-websites which can help you to understand it.

FYI: It is also useful to notice that there are many libraries of "pre-debugged regular expressions" out there, for various languages. So, before you spend too much time trying to "puzzle out" your particular problem as though it were brand-new, look to see if it hasn't already been solved. (The "regular expression syntax" used by most languages is also usually the same, so you might be able to "steal" a workable regex from some other language's library.)

Last edited by sundialsvcs; 04-17-2022 at 07:47 PM.
 
  

