Linux - Software: This forum is for Software issues.
Situation
There are several versions (flavours?) of regex in our Penguin world:
basic regex ("obsolete" regex, per man regex)
extended regex ("modern" regex?)
plus the named regex types that GNU find accepts:
findutils-default, awk, egrep, ed, emacs, gnu-awk, grep, posix-awk, posix-basic, posix-egrep, posix-extended, posix-minimal-basic, sed
just to name a few.
There is also a similar but different idea: shell filename substitution (globbing).
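To make the basic/extended split concrete, here is a minimal sketch using grep. The sample lines and variable names are made up for illustration; the only thing it relies on is that grep uses basic regex by default and extended regex with -E.

```shell
#!/bin/sh
# Made-up sample lines, one per line.
sample=$(printf '%s\n' 'ab' 'aab' 'a{2}b')

# Basic regex (grep default): an interval needs backslashes, \{2\}.
bre_interval=$(printf '%s\n' "$sample" | grep 'a\{2\}b')

# Basic regex again: bare braces are ordinary characters.
bre_literal=$(printf '%s\n' "$sample" | grep 'a{2}b')

# Extended regex (grep -E): bare braces are the interval operator.
ere_interval=$(printf '%s\n' "$sample" | grep -E 'a{2}b')

# Shell filename pattern (glob), not a regex: * means "any string".
name='notes.txt'
case $name in
  *.txt) glob_match=yes ;;
  *)     glob_match=no ;;
esac

printf 'BRE with backslashes: %s\n' "$bre_interval"
printf 'BRE with bare braces: %s\n' "$bre_literal"
printf 'ERE with bare braces: %s\n' "$ere_interval"
printf 'glob *.txt matches %s: %s\n' "$name" "$glob_match"
```

The same pattern text, a{2}b, picks a different line depending on the flavour, which is about as small as this kind of surprise gets.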
Think of regex variants like shell variants. For basic stuff, I've never really needed to be aware of the differences. If my search keys are so esoteric that the flavour of regex will affect the output, I'm probably going to break up the search and iterate through it (awk, usually).
Complicated regex algorithms have always made my head hurt. They're notoriously difficult to test. There always winds up being some wacky bit of data you forgot to test against that makes your script flag a false match.
I just type a few more lines and do my search in stages. Simple to troubleshoot. Works in all flavours of regex, and it's friendly to the next poor soul who needs to understand what I intended.
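A rough sketch of what "search in stages" can look like. The log lines, field layout, and the 100 threshold are all invented for the example; the point is only that each stage is simple enough to check by eye.

```shell
#!/bin/sh
# Invented log lines: <service> <level> <duration>
log=$(printf '%s\n' \
  'web ERROR 250' \
  'cron ERROR 900' \
  'web INFO 300' \
  'db ERROR 50')

# Stage 1: keep only the ERROR lines.
stage1=$(printf '%s\n' "$log" | grep 'ERROR')

# Stage 2: drop anything that comes from cron.
stage2=$(printf '%s\n' "$stage1" | grep -v '^cron ')

# Stage 3: let awk do the numeric comparison on the third field.
stage3=$(printf '%s\n' "$stage2" | awk '$3 > 100')

printf 'after stage 1:\n%s\n' "$stage1"
printf 'after stage 2:\n%s\n' "$stage2"
printf 'after stage 3:\n%s\n' "$stage3"
```

Each intermediate result is kept in its own variable, so when the final answer looks wrong you can see which stage threw away the line you wanted.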
...but if you're struggling, and you expect to do anything involving any kind of code for any length of time, I highly, highly recommend buying or borrowing the O'Reilly book. Just the first chapter, "Introduction to Regular Expressions," made me want to never look back. Later chapters get into deeper techniques and specifics (like the yes-maddening differences between different languages and tools), but just that one chapter changed the whole way I interact with text.
Not sure, but it may be that some of your trouble comes from the shell, not from regex per se.
The shell sees symbols like "()", ".", "\" and so forth as significant, and depending on what you're doing, it may go something like,
Code:
[human types something]
=> [shell eats the symbols it recognizes and turns the poor human's text into something unintended]
=> [tool the human really meant to use gets fed something the human did not intend; behaves accordingly, spits out crap]
=> [regex gets the blame]
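One way to see that chain in action is to print exactly what a command receives. This is only a sketch: show_args is a hypothetical helper written for this example, and the scratch directory and file names are made up.

```shell
#!/bin/sh
# show_args (hypothetical helper): print each argument it receives in <>.
show_args() {
  for arg in "$@"; do
    printf '<%s>\n' "$arg"
  done
}

# Unquoted backslash: the shell removes it before the tool sees anything.
unquoted=$(show_args a\.b)

# Single-quoted: the tool receives the backslash intact.
quoted=$(show_args 'a\.b')

# Unquoted *: the shell replaces it with matching file names.
tmp=$(mktemp -d)
touch "$tmp/a.log" "$tmp/b.log"
globbed=$(cd "$tmp" && show_args *.log)
literal=$(cd "$tmp" && show_args '*.log')
rm -r "$tmp"

printf 'unquoted a\\.b  -> %s\n' "$unquoted"
printf 'quoted   a\\.b  -> %s\n' "$quoted"
printf 'unquoted *.log ->\n%s\n' "$globbed"
printf 'quoted   *.log -> %s\n' "$literal"
```

If the tool on the receiving end had been grep, the unquoted versions would have handed it a different pattern from the one typed, and regex would have taken the blame.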
How to get around that varies with the situation, and it can be tough sometimes. A dozen years into this and sometimes I don't even realize that's what's going wrong. The only advice I can give is to break things down into small pieces:
Code:
$ [command] | [command with regex]
gArBaGe oUtPuT
Swear;
Code:
$ [command] > /tmp/am_i_sure_my_regex_is_being_fed_what_i_expect
$ vi /tmp/am_i_sure_my_regex_is_being_fed_what_i_expect
...then...
Code:
$ [command] | [one piece of my regex]
$ [command] | [different piece of my regex]
$ echo [sample of the text i really care about] | grep [simple regex]
$ echo [larger/different subset] | grep [gradually increasing regex]
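Filled in with something concrete, that routine might look like the sketch below. The two input lines stand in for whatever [command] really prints, and the id= pattern is just an invented target.

```shell
#!/bin/sh
# Stand-in for "[command] > /tmp/...": save the input so it can be inspected.
tmp=$(mktemp)
printf '%s\n' 'user=alice id=42' 'user=bob id=x7' > "$tmp"

# Look at what is really there; sed's l command marks each line end with $.
sed -n l "$tmp"

# Grow the regex one piece at a time, checking the match count at each step.
step1=$(grep -c 'id=' "$tmp")            # both lines mention id=
step2=$(grep -c 'id=[0-9]' "$tmp")       # only one id starts with a digit
step3=$(grep 'id=[0-9][0-9]*$' "$tmp")   # id is digits all the way to the end

rm -f "$tmp"

printf 'lines with id=:        %s\n' "$step1"
printf 'lines with id=<digit>: %s\n' "$step2"
printf 'final match:           %s\n' "$step3"
```

When a count changes in a way you did not expect, the piece you just added is the one to stare at.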
You get the idea. And if the regex doesn't do what you want, it might be because of the shell. Stick "\" before characters in your regex that might be the cause. Sometimes the character causing the trouble is the escape character itself, so change that to "\\".
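For the case where the backslash itself is the problem, here is a small sketch of how many backslashes it takes to match one literal backslash with grep. The sample path is made up; the counts follow from the shell's quoting rules plus the regex's own escaping.

```shell
#!/bin/sh
# Made-up sample text containing one literal backslash.
path='C:\temp'

# Single quotes: the shell passes \\ through untouched,
# and the regex reads \\ as one literal backslash.
single=$(printf '%s\n' "$path" | grep -c 'C:\\temp')

# Double quotes: the shell turns each \\ into \ first,
# so it takes four to leave two for the regex.
double=$(printf '%s\n' "$path" | grep -c "C:\\\\temp")

# grep -F treats the pattern as a fixed string, so there is no regex layer.
fixed=$(printf '%s\n' "$path" | grep -c -F 'C:\temp')

printf 'single-quoted \\\\     : %s match\n' "$single"
printf 'double-quoted \\\\\\\\   : %s match\n' "$double"
printf 'fixed string (-F)    : %s match\n' "$fixed"
```

When the text being searched for has no wildcards in it at all, grep -F sidesteps half of the escaping problem.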
And yeah, it all can look messy at first but it does get easier with practice. Look through some of the scripts under /etc and you'll find shell logic that looks like line noise; somebody wrote that...
Oh, one last thing: in shell world, 'single' and "double" quotes can make all the difference, especially where this stuff is concerned. How so is left as an exercise for the already-bored reader.
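For readers who would rather skip the exercise, a minimal sketch of the difference. The variable name word and the sample lines are assumptions made for the example.

```shell
#!/bin/sh
word='cat'

# Single quotes: nothing is expanded, the text is passed exactly as typed.
single_pat=$(printf '%s' '^$word$')

# Double quotes: $word is expanded by the shell before grep ever runs.
# The trailing $ is not followed by a name, so it stays a literal $.
double_pat=$(printf '%s' "^$word$")

# With double quotes the pattern becomes ^cat$, matching only the exact line.
hits=$(printf '%s\n' 'cat' 'concat' | grep "^$word$")

printf 'single-quoted pattern: %s\n' "$single_pat"
printf 'double-quoted pattern: %s\n' "$double_pat"
printf 'lines matched:         %s\n' "$hits"
```

Rule of thumb from the above: use single quotes when the regex should reach the tool untouched, and double quotes only when you want the shell to substitute something into it.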
Edit: ...Ha! This web app doesn't like backslashes either, it wiped mine out the first time I tried to post (even though they were in quotes)! See, nobody is immune from trouble with regex, even the pros who run this site. Big mojo.
I like using regex. They're useful for doing pattern matching/expansion. I use a lot of regex in my shell scripts. I'm not an expert on them, but I know enough to get the results I want.
I keep a journal of the many regexes I've used and picked up on the net for reference.
By practicing. Not trying to be funny, if you USE regex you will learn. I agree that they are at times a pain in the b.tt, but they are also incredibly useful.