LinuxQuestions.org
Old 03-27-2023, 08:42 PM   #16
metaed
Member
 
Registered: Apr 2022
Location: US
Distribution: Slackware64 15.0
Posts: 373

Rep: Reputation: 172

Quote:
Originally Posted by syg00 View Post
How would that handle the afore-mentioned B-52?
It would return B. Based on OP's criteria as stated, any hyphenated "word" would normally be returned as multiple words, but 52 does not contain a letter so would not be returned.
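
A minimal Python sketch of the behaviour described above, for illustration only. The OP's exact criteria are not restated in this part of the thread, so the sketch assumes that a candidate word is a run of ASCII letters and digits, and that only candidates containing at least one letter are returned. The function name `extract_words` is hypothetical.

```python
import re

def extract_words(text):
    """Return runs of letters/digits that contain at least one letter."""
    # Assumption: any non-alphanumeric character (hyphen included) separates candidates.
    candidates = re.findall(r"[A-Za-z0-9]+", text)
    # Drop candidates made only of digits, such as the "52" in "B-52".
    return [c for c in candidates if re.search(r"[A-Za-z]", c)]

print(extract_words("the B-52 bomber"))  # ['the', 'B', 'bomber']
```

Under these assumptions "B-52" splits at the hyphen into "B" and "52", and "52" is discarded because it has no letter.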
 
1 members found this post helpful.
Old 03-27-2023, 09:45 PM   #17
lucmove
Senior Member
 
Registered: Aug 2005
Location: Brazil
Distribution: Debian
Posts: 1,434

Original Poster
Rep: Reputation: 110
Quote:
Originally Posted by dugan View Post
I take it you already know about \b?
Probably. The \b escape sequence may mean different things in different languages/toolsets.
 
Old 03-27-2023, 11:23 PM   #18
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,249

Rep: Reputation: 5323
Quote:
Originally Posted by lucmove View Post
Probably. The \b escape sequence may mean different things in different languages/toolsets.
Well, yes, but the language we're talking about is regular expressions. And in regular expressions, \b means "word boundary".
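
As a small illustration of \b as a word boundary, here is a sketch using Python's `re` module (one regex flavour among many; other tools may differ, as the thread goes on to discuss). The sample strings are made up.

```python
import re

text = "catalog cat concat B-52"

# \b matches the empty position between a word character (\w) and a non-word one.
whole_word = re.findall(r"\bcat\b", text)

# The hyphen is not a word character, so a boundary falls on each side of it.
pieces = re.findall(r"\b\w+\b", "B-52")

print(whole_word)  # ['cat']
print(pieces)      # ['B', '52']
```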

Last edited by dugan; 03-27-2023 at 11:32 PM.
 
Old 03-29-2023, 06:33 AM   #19
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,832

Rep: Reputation: 1218
\b is from PCRE (perl). Not yet standard in ERE (grep -E, sed -E, awk, ...)
It exists in a recent Linux glibc, but not all tools use it.
The older \< \> is always supported in Linux.

Compare with \s - the older [[:blank:]] works with all tools in Linux.
 
1 members found this post helpful.
Old 03-29-2023, 10:09 AM   #20
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by MadeInGermany View Post
\b is from PCRE (perl).
No, \b is not "from" PCRE and it is not "PCRE (perl)" either.

PCRE is a C library started in 1997 that was originally inspired by Perl v5, but PCRE is not Perl, and nor is it where \b for word boundary came from.

I can't say for sure where \b originated from - other than definitely not from PCRE - Perl had \b in (at least) v4 in the early 90s (no idea about earlier versions), and the original v1 of Gawk released in the late 80s originally had \b for word boundary - with v3 it was switched to \y to allow compatibility with original AWK using \b for backspace.
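
The same backspace versus word-boundary ambiguity can be seen in Python, shown here only as an analogy to the AWK/Gawk situation above, not as a description of how Gawk works. In an ordinary string literal "\b" is a backspace; in a raw string the regex engine receives \b and treats it as a word boundary.

```python
import re

# In an ordinary Python string literal, "\b" is the backspace character (0x08).
assert "\b" == "\x08"

# In a raw string the backslash survives, so the regex engine sees \b = word boundary.
boundary = re.compile(r"\bword\b")
print(bool(boundary.search("a word here")))   # True

# Without the raw prefix, the pattern contains literal backspaces and does not match.
backspace = re.compile("\bword\b")
print(bool(backspace.search("a word here")))  # False

# Inside a character class, \b means backspace even in a raw pattern.
print(bool(re.search(r"[\b]", "x\x08y")))     # True
```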


Quote:
The older \< \> is always supported in Linux.
Gawk had \< and \> in v2, but not in v1 - thus they were probably added early 90s - and it seems Howard Helman added them to Sed in 1991 - so it's possible \< and \> are newer constructs - despite many "modern" regex implementations for some reason not implementing them.
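
Python's `re` module is one of the implementations without \< and \>; there, r"\<" simply matches a literal "<". A rough emulation, assuming \< means "boundary followed by a word character" and \> means "boundary preceded by a word character", can be sketched with lookarounds. The constant names `WORD_START` and `WORD_END` are made up for this example.

```python
import re

# Assumed equivalences, built from \b plus a lookaround:
WORD_START = r"\b(?=\w)"   # like \< : a boundary followed by a word character
WORD_END = r"\b(?<=\w)"    # like \> : a boundary preceded by a word character

text = "cat concat catalog"

starts_with_cat = re.findall(WORD_START + r"cat\w*", text)
ends_with_cat = re.findall(r"\w*cat" + WORD_END, text)

print(starts_with_cat)  # ['cat', 'catalog']
print(ends_with_cat)    # ['cat', 'concat']

# In Python, \< is just an escaped "<", not a start-of-word anchor.
print(re.search(r"\<cat", "cat"))  # None
```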

 
1 members found this post helpful.
Old 03-29-2023, 05:48 PM   #21
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,832

Rep: Reputation: 1218
AFAIR perl4 had \< \> only, and \b was new in perl5.
 
Old 03-29-2023, 11:57 PM   #22
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,039

Rep: Reputation: 7347
Quote:
Originally Posted by boughtonp View Post
No, \b is not "from" PCRE and it is not "PCRE (perl)" either.

PCRE is a C library started in 1997 that was originally inspired by Perl v5, but PCRE is not Perl, and nor is it where \b for word boundary came from.

I can't say for sure where \b originated from - other than definitely not from PCRE - Perl had \b in (at least) v4 in the early 90s (no idea about earlier versions), and the original v1 of Gawk released in the late 80s originally had \b for word boundary - with v3 it was switched to \y to allow compatibility with original AWK using \b for backspace.



Gawk had \< and \> in v2, but not in v1 - thus they were probably added early 90s - and it seems Howard Helman added them to Sed in 1991 - so it's possible \< and \> are newer constructs - despite many "modern" regex implementations for some reason not implementing them.

https://perldoc.perl.org/perlre
perl does not know \< and \>. But it has \A and \Z (which are not the same, just probably similar).
 
1 members found this post helpful.
Old 03-30-2023, 03:12 AM   #23
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,832

Rep: Reputation: 1218
That is perl5.
I like their subtle humor:
Quote:
Perl officially stands for Practical Extraction and Report Language, except when it doesn't.
I think perl4, being from the pre-Web era, is completely extinguished.
 
Old 03-30-2023, 03:32 AM   #24
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4206
Quote:
Originally Posted by MadeInGermany View Post
That is perl5.
I like their subtle humor:


I think perl4, being from the pre-Web era, is completely extinguished.
I hope not - it is the one burned into my brain cell!
 
Old 03-30-2023, 08:43 AM   #25
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by MadeInGermany View Post
AFAIR perl4 had \< \> only, and \b was new in perl5.
If that were true, it would be mentioned under https://perldoc.perl.org/5.6.2/perltrap#Perl4-to-Perl5-Traps

(It isn't.)


Quote:
Originally Posted by pan64 View Post
But it has \A and \Z (which are not the same, just probably similar).
They are not the same at all - they are equivalent to "^" and "$" when matching single lines, but for multiple lines "\A" and "\Z" only match once (for the first/last respectively). In the latter case, usually one wants lowercase "\z", which doesn't exclude a final newline.
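
The difference between line anchors and string anchors can be sketched in Python, with one caveat: the spelling is not the same as in Perl. In Python's `re`, \Z matches only at the very end of the string (the role Perl gives to \z), while $ without re.MULTILINE also matches just before a single final newline (the role of Perl's \Z). The sample text is made up.

```python
import re

text = "one\ntwo\n"

# With re.MULTILINE, ^ and $ match at every line; \A still matches only once.
first_words = re.findall(r"^\w+", text, re.MULTILINE)
only_first = re.findall(r"\A\w+", text, re.MULTILINE)
last_words = re.findall(r"\w+$", text, re.MULTILINE)

# Without re.MULTILINE, Python's $ tolerates one final newline (like Perl's \Z).
before_final_newline = re.findall(r"\w+$", text)

# Python's \Z is strict: it matches only at the very end (like Perl's \z).
strict_end = re.findall(r"\w+\Z", text)
strict_end_stripped = re.findall(r"\w+\Z", text.rstrip("\n"))

print(first_words)           # ['one', 'two']
print(only_first)            # ['one']
print(last_words)            # ['one', 'two']
print(before_final_newline)  # ['two']
print(strict_end)            # []
print(strict_end_stripped)   # ['two']
```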


Quote:
Originally Posted by MadeInGermany View Post
I think perl4, being from the pre-Web era, is completely extinguished.
Except, despite being a rewrite, Perl 5 mostly maintained compatibility (aside from the issues at the link above).

Anyway, who still uses Perl 5? All the cool kids have moved to Perl 7...


Last edited by boughtonp; 03-30-2023 at 08:46 AM.
 
1 members found this post helpful.
Old 03-30-2023, 08:45 AM   #26
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by boughtonp View Post
Anyway, who still uses Perl 5? All the cool kids have moved to Perl 7...
Not really. :)


Last edited by boughtonp; 03-30-2023 at 08:47 AM.
 
  

