Programming: This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
03-27-2023, 08:42 PM
|
#16
|
Member
Registered: Apr 2022
Location: US
Distribution: Slackware64 15.0
Posts: 418
Rep:
|
Quote:
Originally Posted by syg00
How would that handle the afore-mentioned B-52?
|
It would return B. Based on OP's criteria as stated, any hyphenated "word" would normally be returned as multiple words, but 52 does not contain a letter so would not be returned.
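A minimal Python sketch of the behaviour described above. It assumes the criteria amount to "split on anything that is not a letter or digit, then keep only tokens that contain a letter"; the function name and the exact character classes are illustrative guesses, not the pattern actually used earlier in the thread.

```python
import re

def words_with_letters(text):
    """Split on non-alphanumeric characters, keep tokens containing a letter.

    Hypothetical helper: the ASCII-only classes are an assumption.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    return [t for t in tokens if re.search(r"[A-Za-z]", t)]

# "B-52" splits into "B" and "52"; "52" has no letter and is dropped.
print(words_with_letters("B-52"))
print(words_with_letters("a well-known B-52 bomber"))
```

Under these assumptions a hyphenated word such as "well-known" comes back as two words, and "B-52" comes back as "B" alone.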
|
|
1 members found this post helpful.
|
03-27-2023, 09:45 PM
|
#17
|
Senior Member
Registered: Aug 2005
Location: Brazil
Distribution: Debian
Posts: 1,462
Original Poster
Rep:
|
Quote:
Originally Posted by dugan
I take it you already know about \b?
|
Probably. The \b escape sequence may mean different things in different languages/toolsets.
|
|
|
03-27-2023, 11:23 PM
|
#18
|
LQ Guru
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,333
|
Quote:
Originally Posted by lucmove
Probably. The \b escape sequence may mean different things in different languages/toolsets.
|
Well, yes, but the language we're talking about is regular expressions. And in regular expressions, \b means "word boundary".
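Both points can be seen in one place. The sketch below uses Python's re module purely as an example toolset (the thread does not say which tool the OP uses): in a raw string \b reaches the regex engine as a word boundary, while in an ordinary Python string literal the same two characters become a backspace before the regex engine ever sees them.

```python
import re

text = "cat concat cats cat-nap"

# Raw string: \b is passed to the regex engine and means "word boundary".
whole_words = re.findall(r"\bcat\b", text)

# Ordinary string: Python itself converts "\b" to a backspace (0x08),
# so this pattern looks for backspace + "cat" + backspace.
backspaces = re.findall("\bcat\b", text)

print(whole_words)  # the standalone "cat" and the "cat" in "cat-nap"
print(backspaces)   # nothing, the text contains no backspace characters
```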
Last edited by dugan; 03-27-2023 at 11:32 PM.
|
|
|
03-29-2023, 06:33 AM
|
#19
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,927
|
\b is from PCRE (perl). Not yet standard in ERE (grep -E, sed -E, awk, ...)
It exists in a recent Linux glibc, but not all tools use it.
The older \< \> is always supported in Linux.
Compare with \s - the older [[:blank:]] works with all tools in Linux.
|
|
1 members found this post helpful.
|
03-29-2023, 10:09 AM
|
#20
|
Senior Member
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,723
|
Quote:
Originally Posted by MadeInGermany
\b is from PCRE (perl).
|
No, \b is not "from" PCRE and it is not "PCRE (perl)" either.
PCRE is a C library started in 1997 that was originally inspired by Perl v5, but PCRE is not Perl, and nor is it where \b for word boundary came from.
I can't say for sure where \b originated from - other than definitely not from PCRE - Perl had \b in (at least) v4 in the early 90s (no idea about earlier versions), and the original v1 of Gawk released in the late 80s originally had \b for word boundary - with v3 it was switched to \y to allow compatibility with original AWK using \b for backspace.
Quote:
The older \< \> is always supported in Linux.
|
Gawk had \< and \> in v2, but not in v1 - thus they were probably added early 90s - and it seems Howard Helman added them to Sed in 1991 - so it's possible \< and \> are newer constructs - despite many "modern" regex implementations for some reason not implementing them.
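As one example of a "modern" implementation without these anchors, Python's re module treats \< and \> as escaped punctuation, so they match literal angle brackets. The sketch below shows that and a lookaround-based emulation; the names WORD_START and WORD_END are made up for illustration.

```python
import re

text = "cat <cat> concat"

# In Python's re, \< and \> are just escaped "<" and ">",
# so this matches the literal string "<cat>".
literal = re.findall(r"\<cat\>", text)

# Hypothetical stand-ins for the GNU anchors, built from lookarounds.
WORD_START = r"(?<!\w)(?=\w)"  # a word begins here, like \<
WORD_END = r"(?<=\w)(?!\w)"    # a word ends here, like \>

pattern = WORD_START + "cat" + WORD_END
emulated = [m.start() for m in re.finditer(pattern, text)]

print(literal)   # the literal "<cat>"
print(emulated)  # offsets of the standalone "cat" occurrences
```

Unlike \b, the two lookaround pairs distinguish the start of a word from its end, which is the property \< and \> provide.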
|
|
1 members found this post helpful.
|
03-29-2023, 05:48 PM
|
#21
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,927
|
AFAIR perl4 had \< \> only, and \b was new in perl5.
|
|
|
03-29-2023, 11:57 PM
|
#22
|
LQ Addict
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,728
|
Quote:
Originally Posted by boughtonp
No, \b is not "from" PCRE and it is not "PCRE (perl)" either.
PCRE is a C library started in 1997 that was originally inspired by Perl v5, but PCRE is not Perl, and nor is it where \b for word boundary came from.
I can't say for sure where \b originated from - other than definitely not from PCRE - Perl had \b in (at least) v4 in the early 90s (no idea about earlier versions), and the original v1 of Gawk released in the late 80s originally had \b for word boundary - with v3 it was switched to \y to allow compatibility with original AWK using \b for backspace.
Gawk had \< and \> in v2, but not in v1 - thus they were probably added early 90s - and it seems Howard Helman added them to Sed in 1991 - so it's possible \< and \> are newer constructs - despite many "modern" regex implementations for some reason not implementing them.
|
https://perldoc.perl.org/perlre
perl does not know \< and \>. But it has \A and \Z (which are not the same, just probably similar).
|
|
1 members found this post helpful.
|
03-30-2023, 03:12 AM
|
#23
|
Senior Member
Registered: Dec 2011
Location: Simplicity
Distribution: Mint/MATE
Posts: 2,927
|
That is perl5.
I like their subtle humor:
Quote:
Perl officially stands for Practical Extraction and Report Language, except when it doesn't.
|
I think perl4, being from the pre-Web era, is completely extinguished.
|
|
|
03-30-2023, 03:32 AM
|
#24
|
Moderator
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,297
|
Quote:
Originally Posted by MadeInGermany
That is perl5.
I like their subtle humor:
I think perl4, being from the pre-Web era, is completely extinguished.
|
I hope not - it is the one burned into my brain cell!
|
|
|
03-30-2023, 08:43 AM
|
#25
|
Senior Member
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,723
|
Quote:
Originally Posted by MadeInGermany
AFAIR perl4 had \< \> only, and \b was new in perl5.
|
If that were true, it would be mentioned under https://perldoc.perl.org/5.6.2/perltrap#Perl4-to-Perl5-Traps
(It isn't.)
Quote:
Originally Posted by pan64
But it has \A and \Z (which are not the same, just probably similar).
|
They are not the same at all - they are equivalent to "^" and "$" when matching single lines, but for multiple lines "\A" and "\Z" only match once (at the start of the first line and the end of the last, respectively). In the latter case, usually one wants lowercase "\z", which matches only at the very end of the string, whereas "\Z" also matches just before a final newline.
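A rough illustration of the line-anchor versus string-anchor distinction, using Python's re module rather than Perl. One caveat matters here: Python's \Z matches only at the very end of the string, so it behaves like Perl's lowercase \z, not like Perl's \Z.

```python
import re

text = "one\ntwo\n"

# With MULTILINE, ^ matches at the start of every line...
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
# ...while \A still matches only at the very start of the string.
string_start = re.findall(r"\A\w+", text, re.MULTILINE)

# Without MULTILINE, $ matches at the end or just before a final newline.
dollar = re.search(r"two$", text)
# Python's \Z requires the true end of the string (like Perl's \z),
# so the trailing newline prevents a match.
strict_end = re.search(r"two\Z", text)

print(line_starts, string_start)
print(dollar is not None, strict_end is not None)
```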
Quote:
Originally Posted by MadeInGermany
I think perl4, being from the pre-Web era, is completely extinguished.
|
Except, despite being a rewrite, Perl 5 mostly maintained compatibility (aside from the issues at the link above).
Anyway, who still uses Perl 5? All the cool kids have moved to Perl 7...
Last edited by boughtonp; 03-30-2023 at 08:46 AM.
|
|
1 members found this post helpful.
|
03-30-2023, 08:45 AM
|
#26
|
Senior Member
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,723
|
Quote:
Originally Posted by boughtonp
Anyway, who still uses Perl 5? All the cool kids have moved to Perl 7...
|
Not really. :)
Last edited by boughtonp; 03-30-2023 at 08:47 AM.
|
|
|
All times are GMT -5.