LinuxQuestions.org
Old 09-26-2017, 06:48 AM   #1
MrMeeSeeks
Member
 
Registered: Jan 2017
Posts: 31

Rep: Reputation: Disabled
RegEx - character class containing brackets, how to escape correctly


Hey there,
somewhat dim question I guess, but:

I'm trying to figure out how to grep for a character class like [a-d\[\]\*], which, as I see it, should match any of abcd[]*, but doesn't in grep.
I'm also trying to match a class containing both single and double quotes.
No particular reason, just trying to get a better handle on grep and REs.

Now, I came across a statement suggesting that there is no actual escaping in grep's BREs, only in the shell. However, this post suggests there is both at once, so I tried double escaping like '[a\\'b]', which doesn't work either.
So, I am very confused as to how the escaping works and why it doesn't in these particular instances.


Also, something that deeply weirds me out: when I forget to quote a character class grep always matches capital C's and nothing else. Why on earth?

//okay, now I just noticed something that really freaks me out: when I try to match '[a\-b]' grep matches a, b and every digit. Why?

Best wishes

Last edited by MrMeeSeeks; 09-26-2017 at 06:52 AM.
 
Old 09-26-2017, 07:21 AM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,191

Rep: Reputation: 1388
Quote:
Originally Posted by MrMeeSeeks View Post
Now, I came across a statement suggesting that there is no actual escaping in grep's BREs, only in the shell. However, this post suggests there is both at once, so I tried double escaping like '[a\\'b]', which doesn't work either.
So, I am very confused as to how the escaping works and why it doesn't in these particular instances.
There is escaping both in BREs and in the shell; but BRE escaping does not use the same rules as shell escaping. In particular, BRE escaping inside of character classes is a bit... funny.
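For the class with both kinds of quote, the trouble is on the shell side rather than in the BRE: inside single quotes the shell treats a backslash literally and has no way to escape a ', so '[a\\'b]' ends the quoted string right after the two backslashes. A minimal sketch of two ways to get the pattern ["'] through to grep, assuming a POSIX-style shell (the sample lines and variable names are made up for illustration):

```shell
# Made-up sample input: one line with ', one with ", one with neither.
sample=$(printf '%s\n' "it's" 'say "hi"' 'plain')

# Double-quoted pattern: only the " needs a backslash for the shell.
dq=$(printf '%s\n' "$sample" | grep "[\"']")

# Single-quoted pattern: close the quotes, add an escaped ', reopen them.
sq=$(printf '%s\n' "$sample" | grep '['\''"]')

printf '%s\n' "$dq"
printf '%s\n' "$sq"
```

In both cases grep itself sees the same five characters, ["'], and neither quote is special inside the brackets.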

GNU Grep manual: Character Classes and Bracket Expressions:
Quote:
Most meta-characters lose their special meaning inside bracket expressions.

‘]’
ends the bracket expression if it’s not the first list item. So, if you want to make the ‘]’ character a list item, you must put it first.
[...]

‘-’
represents the range if it’s not first or last in a list or the ending point of a range.
Therefore, to match any of abcd[]*, you want [][abcd*]. To protect this from shell expansion, you should then wrap in quotes:
Code:
grep '[][abcd*]'
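A quick way to check that pattern, as a sketch with made-up sample input (the variable names are only for illustration):

```shell
# Made-up sample input: one candidate per line.
sample=$(printf '%s\n' 'a' 'e' '[' ']' '*' 'x-y')

# ']' is the first list item, so it is taken literally;
# '[' and '*' need no escaping inside the brackets.
matches=$(printf '%s\n' "$sample" | grep '[][abcd*]')

printf '%s\n' "$matches"
# prints the lines: a  [  ]  *
```

The lines e and x-y are not printed, since they contain none of the listed characters.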
Quote:
Originally Posted by MrMeeSeeks View Post
Also, something that deeply weirds me out: when I forget to quote a character class grep always matches capital C's and nothing else. Why on earth?
Hard to say without an example.
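One possibility, and this is only a guess since there is no example to go on: an unquoted [...] is also a shell glob, so if the current directory happens to contain a file whose one-character name is in the class (say, a file called C), the shell replaces the pattern with that filename before grep ever runs. A sketch of that effect in a scratch directory, assuming bash or sh default globbing (no nullglob or failglob); the file name C and the class [abC] are hypothetical:

```shell
# Work in an empty scratch directory so no stray files interfere.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# No matching file yet: the unquoted pattern is passed through unchanged.
unquoted_before=$(echo [abC])

# Hypothetical file named C: now the shell expands the glob to that name.
touch C
unquoted_after=$(echo [abC])

# Quoting always protects the pattern from the shell.
quoted=$(echo '[abC]')

cd - >/dev/null || exit 1
rm -r "$demo_dir"

printf '%s\n' "$unquoted_before" "$unquoted_after" "$quoted"
# prints: [abC]  C  [abC]
```

If something like this is going on, grep would effectively be run with the pattern C, which would explain matching only capital C's.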

Quote:
Originally Posted by MrMeeSeeks View Post
//okay, now I just noticed something that really freaks me out: when I try to match '[a\-b]' grep matches a, b and every digit. Why?
Backslash doesn't escape within a character class, so the pattern matches a, plus everything in the range from \ to b. The exact set of characters in that range depends on your locale's sort order. In your default locale, digits are sorted between \ and b.
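To see the range at work without the locale getting in the way, here is a sketch that forces LC_ALL=C (my assumption, purely to make the result reproducible), where \ to b covers the ASCII characters \ ] ^ _ ` a b. The sample input is made up. It also shows the manual's rule for a literal '-': put it first or last in the list.

```shell
# Made-up sample input, one character per line (the last one is a backslash).
sample=$(printf '%s\n' 'a' 'b' '-' '5' '_' '\')

# '[a\-b]' is 'a' plus the range from '\' to 'b'; in the C locale that
# range includes '_' and '\' itself, but neither '-' nor digits.
range_c=$(printf '%s\n' "$sample" | LC_ALL=C grep '[a\-b]')

# A '-' placed last in the list is a literal dash.
literal_dash=$(printf '%s\n' "$sample" | LC_ALL=C grep '[ab-]')

printf '%s\n' "$range_c"       # prints the lines: a  b  _  \
printf '%s\n' "$literal_dash"  # prints the lines: a  b  -
```

In other locales the first result can differ, which is where the digits come from.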
 
Old 09-26-2017, 08:01 AM   #3
MrMeeSeeks
Member
 
Registered: Jan 2017
Posts: 31

Original Poster
Rep: Reputation: Disabled
Thanks a lot!
Sorry I wasn't thorough enough. I had even been on that page, but at a glance I assumed it wouldn't help me.
 
  

