LinuxQuestions.org


fanoflq 03-13-2017 03:24 PM

regex on \d versus [0-9]
 
#This grep regex produces no output using \d for digit:
Code:

$ ip addr | egrep -i "^\d{1}"
#This grep regex produces output using [0-9] for digits:
Code:

$ ip addr | egrep -i "^[0-9]{1}"
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
2: enp3s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
3: wlp2s0b1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
4: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
5: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000

What did I miss in the first egrep regex pattern?
Thank you.

rarog 03-13-2017 06:02 PM

\d is Perl regex notation, which egrep (POSIX extended regex) does not support. Try:
Code:

ip addr | grep -P "^\d{1}"
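
To illustrate the difference, here is a minimal sketch that runs the same anchored digit match in both dialects. The sample input is a made-up stand-in for `ip addr` output and the function names are hypothetical; `-P` is a GNU grep option and may be missing from some builds. Note that `{1}` is redundant, since a single `\d` or `[0-9]` already matches exactly one digit.

```shell
# Made-up sample input standing in for `ip addr` output.
# The last line is not real `ip addr` output; it only exists to start with "d".
sample_ip_addr() {
    printf '%s\n' \
        '1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536' \
        '    inet 127.0.0.1/8 scope host lo' \
        '2: enp3s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500' \
        'dummy line that starts with the letter d'
}

# POSIX ERE has no \d, so spell the digit class out.
match_ere_range() { sample_ip_addr | grep -E '^[0-9]'; }
match_ere_class() { sample_ip_addr | grep -E '^[[:digit:]]'; }

# PCRE understands \d, but only if this grep supports -P.
match_pcre() { sample_ip_addr | grep -P '^\d'; }

# The original pattern under ERE. stderr is silenced because newer GNU grep
# versions warn about the stray backslash; "|| true" hides a no-match status.
match_ere_backslash_d() { sample_ip_addr | grep -E '^\d' 2>/dev/null || true; }

# Both ERE spellings print the two lines that start with a digit.
match_ere_range
match_ere_class
```

With GNU grep, `^\d` under `-E` typically ends up being treated as a literal `d`, which would explain why the original command printed nothing instead of failing. That behavior is not specified by POSIX, so treat it as an observation rather than a guarantee; `match_ere_backslash_d` lets you check it on your own system.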

fanoflq 03-13-2017 06:05 PM

Thank you.

Is there a list of extended regex special characters in a man page somewhere, besides "Pattern Matching" in man bash?

norobro 03-13-2017 09:20 PM

Not a man page but a web page, here: https://www.gnu.org/software/finduti...ar-Expressions

HTH

pan64 03-14-2017 04:58 AM

Additionally, you can check http://www.pcre.org/, especially http://www.pcre.org/current/doc/html/pcre2grep.html and/or http://www.pcre.org/current/doc/html/pcre2pattern.html

Turbocapitalist 03-14-2017 05:16 AM

If you're talking about general POSIX regex and POSIX extended regex, then there is a manual page for that. There is also a manual page for Perl regular expressions.

Code:

man 7 regex
man perlre

Both should be on your system already. Perl pattern matching is well worth becoming familiar with: it is not only very common (often known as PCRE, or Perl-Compatible Regular Expressions) but also very, very useful.

fanoflq 03-14-2017 10:21 AM

On CentOS 7, minimal install, I could not find it.
Code:

># man -k regex
regexp_table (5)    - format of Postfix regular expression tables
Tie::Hash::NamedCapture (3pm) - Named regexp capture buffers

Which package provides man 7 regex?

pan64 03-15-2017 03:34 AM

You can always read it here: https://linux.die.net/man/7/regex
The man page in question is part of the manpages package:
http://packages.ubuntu.com/search?su...rchon=contents

fanoflq 03-15-2017 06:35 AM

Code:

[root@Centos7-1024ram-minimal ~]# yum list all | grep -i regex
ant-apache-regexp.noarch        1.9.2-9.el7         base
boost-regex.i686                1.53.0-26.el7       base
boost-regex.x86_64              1.53.0-26.el7       base
perl-PPIx-Regexp.noarch         0.034-3.el7         base
perl-XML-RegExp.noarch          0.04-2.el7          base
regexp.noarch                   1.5-13.el7          base
regexp-javadoc.noarch           1.5-13.el7          base

No regex.7.* package in yum repositories.

pan64 03-16-2017 02:04 AM

regex.7.gz is the man page file itself; manpages is the name of the package that contains it, but that is the Ubuntu name. It looks like the name of the package on CentOS is man-pages.

fanoflq 03-16-2017 11:00 AM

Thank you.
Where did you find this information locally, on your computer?

sundialsvcs 03-16-2017 11:11 AM

The difference ... is ... Unicode. :)

Here's a paragraph that might be relevant, from Perl's perldoc perlre: (emphasis mine)
Quote:

Unlike most locales, which are specific to a language and country pair, Unicode classifies all the characters that are letters somewhere in the world as "\w". For example, your locale might not think that "LATIN SMALL LETTER ETH" is a letter (unless you happen to speak Icelandic), but Unicode does.

Similarly, all the characters that are decimal digits somewhere in the world will match "\d"; this is hundreds, not 10, possible matches. And some of those digits look like some of the 10 ASCII digits, but mean a different number, so a human could easily think a number is a different quantity than it really is. For example, "BENGALI DIGIT FOUR" (U+09EA) looks very much like an "ASCII DIGIT EIGHT" (U+0038). And, "\d+", may match strings of digits that are a mixture from different writing systems, creating a security issue. [...] The "/a" modifier can be used to force "\d" to match just the ASCII 0 through 9.

Perl's implementation of regular expressions is a de facto standard, duplicated by most other languages, therefore this discussion should be directly relevant.

See also a discussion of so-called "POSIX bracket expressions" (N.B. not "character classes" ...) e.g. here. Particularly, in this case, [:digit:].
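
As a rough sketch of the quoted passage: the snippet below feeds BENGALI DIGIT FOUR (U+09EA) to Perl's `\d` with and without the `/a` modifier, and to a POSIX `[[:digit:]]` bracket expression. It assumes `perl` 5.14 or later is installed, and the function names are hypothetical. Note that the class name goes inside a bracket expression, so it is written `[[:digit:]]`, not a bare `[:digit:]`.

```shell
# U+09EA BENGALI DIGIT FOUR as UTF-8 bytes (E0 A7 AA), written as octal escapes.
bengali_four() { printf '\340\247\252\n'; }

# -CS makes Perl decode stdin as UTF-8, so \d sees one character, not three bytes.
perl_digit() {
    perl -CS -ne 'my $r = /^\d$/ ? "match" : "no match"; print "$r\n"'
}

# Same test with /a, which restricts \d to ASCII 0-9.
perl_digit_ascii() {
    perl -CS -ne 'my $r = /^\d$/a ? "match" : "no match"; print "$r\n"'
}

# POSIX bracket expression; in the C locale this matches only 0-9.
posix_digit() {
    if LC_ALL=C grep -Eq '^[[:digit:]]$'; then echo "match"; else echo "no match"; fi
}

if command -v perl >/dev/null 2>&1; then
    bengali_four | perl_digit          # Unicode-aware \d
    bengali_four | perl_digit_ascii    # \d limited to ASCII by /a
fi
bengali_four | posix_digit
```

If you only ever want the ten ASCII digits, `[0-9]`, `[[:digit:]]`, or `\d` with `/a` are the safer spellings.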

fanoflq 03-16-2017 11:44 AM

Since Perl is a de facto standard, do all commands allow Perl regex?
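
Not all of them. As a rough illustration, grep defaults to POSIX basic regex, while `grep -E`, `sed -E`, and awk use POSIX extended regex; Perl-style patterns are only available where a tool opts in, such as GNU grep's `-P`. The sketch below extracts the leading interface index from one line of the output posted earlier (shortened). The function names are hypothetical.

```shell
# One line from the `ip addr` output earlier in the thread, shortened.
line='3: wlp2s0b1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500'

# grep default: basic regex (BRE), where the interval needs backslashes.
index_grep_bre() { printf '%s\n' "$line" | grep -o '^[0-9]\{1,\}'; }

# grep -E: extended regex (ERE), where + works unescaped.
index_grep_ere() { printf '%s\n' "$line" | grep -oE '^[0-9]+'; }

# sed -E: ERE with a capture group.
index_sed_ere() { printf '%s\n' "$line" | sed -nE 's/^([0-9]+):.*/\1/p'; }

# awk: ERE; match() sets RSTART and RLENGTH for the matched text.
index_awk() {
    printf '%s\n' "$line" |
        awk 'match($0, /^[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
}

# grep -P: Perl-style \d, only where grep supports -P.
index_grep_pcre() { printf '%s\n' "$line" | grep -oP '^\d+'; }

index_grep_bre
index_grep_ere
index_sed_ere
index_awk
```

Each of the four calls prints `3`; only `index_grep_pcre` depends on Perl-style support.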

pan64 03-16-2017 11:48 AM

Quote:

Originally Posted by fanoflq (Post 5684231)
Where did you find this information locally, on your computer?

For Ubuntu and Debian there is a package search page (the one I linked), which will easily tell you the name of the package. But obviously you can look around on your own PC.
For CentOS you can try here: https://www.centos.org/docs/5/html/y...-packages.html, but it is probably better to try here: http://rpm.pbone.net/index.php3/stat/2/simple/1
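
To look this up locally, one approach is to ask `man` for the file path of the page and then ask the package manager which package owns that file. This is a minimal sketch, assuming a system with either `rpm` or `dpkg`; the function name `owner_of_manpage` is made up for this example.

```shell
# Print the package that owns an installed man page.
# Usage: owner_of_manpage SECTION PAGE   (for example: owner_of_manpage 7 regex)
owner_of_manpage() {
    section=$1
    page=$2
    # man -w prints the path of the page file instead of displaying it.
    path=$(man -w "$section" "$page" 2>/dev/null) || path=
    if [ -z "$path" ]; then
        echo "man page $page($section) is not installed" >&2
        return 1
    fi
    if command -v rpm >/dev/null 2>&1; then
        rpm -qf "$path"      # CentOS, RHEL, Fedora
    elif command -v dpkg >/dev/null 2>&1; then
        dpkg -S "$path"      # Debian, Ubuntu
    else
        echo "neither rpm nor dpkg was found" >&2
        return 1
    fi
}

# "|| true" keeps the script going when the page is missing.
owner_of_manpage 7 regex || true
```

If the page is not installed at all, the repositories can be searched by file name instead, for example with `yum provides '*/man7/regex.7*'` on CentOS or `apt-file search man7/regex.7` on Debian and Ubuntu (apt-file is a separate package).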


All times are GMT -5.