Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!


05-24-2006, 01:30 PM   #1
Regex nightmare

I've been investigating the filtering capabilities of Squid using very simple regex patterns: a text file with one rule per line.

I've read many tutorials, but they seem rather confusing. Here are a few imaginary examples.

Q: How do you block any website that has the exact string "/badword.php" in the URL?

Some tutorials claim that the regex should be "/badword\.php" (i.e. "/" is not a reserved character). Others claim that it should be "\/badword\.php" (i.e. "/" IS a reserved character). I've tried the first pattern and it appears to work.

Q: How do you block any address that has the exact string "http://badwords." in the URL?

Again, the candidates are "http:\/\/badwords\.", "http:\/\/badwords.*" or simply "http://badwords\." (the third pattern appears to work).

I'm confused. I've tested and tested and everything works... but what am I doing wrong?
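One way to see how the three candidate patterns differ is to run them against a couple of sample URLs. The sketch below uses Python's `re` module as a stand-in for Squid's regex matching; that is an assumption, since Squid uses its own regex library, though patterns this simple should behave the same way. The `is_blocked` helper and the sample URLs are made up for illustration.

```python
import re

def is_blocked(url, rules):
    """Return True if any rule (one regex per line, as in the rule file) matches the URL."""
    return any(re.search(rule, url) for rule in rules)

# The three candidate rules from the question, written as raw strings.
escaped_slashes = r"http:\/\/badwords\."
trailing_wildcard = r"http:\/\/badwords.*"
plain_slashes = r"http://badwords\."

# Hypothetical sample URLs: one that should be blocked, one near miss.
target = "http://badwords.example.com/index.html"
near_miss = "http://badwordsmith.example.com/"

for rule in (escaped_slashes, trailing_wildcard, plain_slashes):
    print(rule, is_blocked(target, [rule]), is_blocked(near_miss, [rule]))
```

In this sketch the first and third rules behave identically, while the `.*` variant also matches the near miss, because `.*` accepts any text (including none) after "badwords" instead of requiring a literal dot.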
05-24-2006, 11:18 PM   #2
The '/' character isn't special in regexes. The '\' is the escape character, which lets you match a special character (such as '.') literally. Putting \/ in a regex neither hurts nor helps; it just makes the regex ugly, as you have noticed.
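A quick check of both points, sketched with Python's `re` module (an assumption: Squid uses a different regex library, but these basics should carry over). The URLs are hypothetical.

```python
import re

url = "http://example.com/badword.php?id=1"      # hypothetical URL
lookalike = "http://example.com/badword-php"     # no literal dot before "php"

# '/' has no special meaning, so escaping it changes nothing.
plain_slash = re.search(r"/badword\.php", url) is not None
escaped_slash = re.search(r"\/badword\.php", url) is not None

# An unescaped '.' matches any single character, so it is looser than '\.'.
loose_dot = re.search(r"/badword.php", lookalike) is not None
strict_dot = re.search(r"/badword\.php", lookalike) is not None

print(plain_slash, escaped_slash)  # both spellings match
print(loose_dot, strict_dot)       # only the unescaped dot matches the lookalike
```

So the backslash before '/' is optional, but the backslash before '.' is what makes the match exact.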

But some programs (like sed) use the construct '/regex/command' to match and manipulate the string. There '/' is the delimiter, so you need to escape the slashes in 'http://'. Still confused? So am I.
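To illustrate why the delimiter matters, here is a toy model in Python of how a sed-style substitution command gets split into its parts. `split_sed_substitution` is a hypothetical helper written for this post, not how sed is actually implemented, and it ignores corner cases such as an escaped backslash in front of the delimiter.

```python
import re

def split_sed_substitution(command):
    """Split a sed-style 's<delim>pattern<delim>replacement<delim>' command.

    Hypothetical helper for illustration only; real sed parsing handles more cases.
    """
    if len(command) < 2 or command[0] != "s":
        raise ValueError("expected a command starting with 's'")
    delim = command[1]
    # Split on the delimiter only where it is not preceded by a backslash.
    parts = re.split(r"(?<!\\)" + re.escape(delim), command[2:])
    return delim, parts

# Unescaped slashes in the pattern are mistaken for delimiters: too many parts.
print(split_sed_substitution("s/http://badwords/X/"))

# Escaping the slashes keeps the pattern in one piece.
print(split_sed_substitution(r"s/http:\/\/badwords/X/"))

# Choosing another delimiter avoids the escaping altogether.
print(split_sed_substitution("s|http://badwords|X|"))
```

The first call yields five parts instead of the expected three (pattern, replacement, flags), which is the situation sed rejects. The last form works because sed's `s` command accepts a delimiter other than '/'.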

Hopefully, a real regexp guru will stop by.
05-25-2006, 02:10 AM   #3
Many tools and languages (sed, Perl and awk, for example) use a start and end character, e.g. '/', to mark where the regex begins and ends.
For substitutions, you need to use it three times, e.g. s/old/new/.
Note, however, that each tool or language has a slightly different regex engine, so consult the manual for the one you are using. Some languages even have their own regex engine plus the option of a Perl-compatible one (e.g. PHP does this with its preg_* functions, such as preg_match()).
The definitive guide (apart from language- or tool-specific books) is:
Mastering Regular Expressions
aka the 'Owl' book
Highly recommended.



All times are GMT -5.
