When you have a list of RE elements in [square brackets], the square-brackets part matches any one of the characters between the brackets. So [(a.*b)] matches any line containing a, b, ., *, ( or ).
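A good way to test your expressions is to turn on colourised output (newer GNU grep releases ignore GREP_OPTIONS; there, pass --colour=auto on the command line instead):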
Code:
export GREP_OPTIONS=--colour=auto
Then type in the grep command with no input file specified and nothing piped in, and just start typing. When you hit return, if your input line matches, you'll see it repeated with the matching parts coloured in:
Code:
$ export GREP_OPTIONS=--colour=auto
$ egrep '[(a.*b)(b.*a)]'
abcdefg.* (we like it)
abcdefg.* (we like it)
nothing works here
Note that the "nothing works here" line was not echoed back - it did not match the expression.
When you're done, press control-d (at the start of a line) to finish your test.
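The same test can also be run without typing interactively, by feeding the sample lines to grep through a pipe. This is only a minimal sketch, assuming a grep that accepts -E (the current spelling of egrep); bracket_test is a made-up helper name, and the sample lines are the two from the session above:

```shell
# A bracket expression matches ONE character out of the set, so
# '[(a.*b)(b.*a)]' means "any of ( a . * b )", not "a...b or b...a".
bracket_test() {
    printf '%s\n' "$@" | grep -E '[(a.*b)(b.*a)]'
}

# The first line contains several of those characters, so it is printed.
# The second line contains none of them, so it is dropped.
bracket_test 'abcdefg.* (we like it)' 'nothing works here'
```

Only the first line should come back, just as in the interactive test.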
There's probably a better way to do this, but here's how I'd make an expression meaning "match any line where it contains an a but no b, OR a b but no a":
Code:
^[^b]*a[^b]*$|^[^a]*b[^a]*$
Looks ugly, but here's how it works:
The expression is split into two parts by the | character, which means that the whole expression matches if either sub-expression matches (this needs egrep or grep -E). For example, a|b|c is the same as [abc].
OK the first of the two sub-expressions is:
Code:
^[^b]*a[^b]*$
Let's break it down. The ^ at the start of the expression means "only match this expression if it is anchored to the start of a line".
The [^b] part means "match any character which is not b" (yes, it's a little confusing that the ^ means "not" here and "start of line" outside [square brackets]).
The * means "the previous bit 0 or more times" (the previous bit being the [^b] - any non-b character).
The a is treated as a literal "a" character.
The [^b]* as before.
The $ means "end of line".
So that whole thing ^[^b]*a[^b]*$ means "a whole line containing an a, but no b's".
Now you should be able to see that the whole expression says "a line containing an a but no b's or a line containing a b but no a's".
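To check the whole expression against a few words, something like the following should do. It is a quick sketch rather than a definitive test; the helper name only_a_or_only_b and the four sample words are made up for illustration:

```shell
# "an a but no b" OR "a b but no a"; -E is needed for the | alternation.
only_a_or_only_b() {
    printf '%s\n' "$@" | grep -E '^[^b]*a[^b]*$|^[^a]*b[^a]*$'
}

# cat -> has an a, no b : printed
# bed -> has a b, no a  : printed
# cab -> has both       : dropped
# dog -> has neither    : dropped
only_a_or_only_b cat bed cab dog
```

Because both halves are anchored with ^ and $, a single stray b (or a) anywhere on the line is enough to make that half fail.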
Interesting thread. It's been a while since I took my bash class but I thought that
Code:
ls | grep a*
is only going to match any line with zero or more instances of "a"... and that's it. From the way I understood, on the bash command line the "|" (pipe) means take the output from pre-pipe and feed, as a file, the output to post-pipe. So why does grep even have the opportunity to match the glob because it already has the output from ls? I never knew that but it's good to know that now. I'll be sure to put "quotes" around my regular expressions from now on. I also didn't know that there is more than one type of regular expression.
Good stuff.
I do know this, though:
grep = Global Regular Expression Parser
Quote:
Originally Posted by zetabill
Interesting thread. It's been a while since I took my bash class but I thought that
Code:
ls | grep a*
is only going to match any line with zero or more instances of "a"... and that's it.
Please re-read post #8. It will usually work, but not for the reason you might think, and when more than one file in the current working directory matches a*, it won't work at all.
Quote:
Originally Posted by zetabill
From the way I understood, on the bash command line the "|" (pipe) means take the output from pre-pipe and feed, as a file, the output to post-pipe.
That's a little ambiguous because some programs don't take their input from standard input, or don't write their output to standard output. For example, most of the GNU core utilities (grep included) will take their input from standard input only if no files are specified as part of the command.
It is more precise to say that the pipe connects the standard output file handle of the command to the left of the pipe symbol to the standard input file handle of the command to the right of the pipe. Thus pipes are only useful for programs which read from stdin and/or write to stdout.
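The point about standard input can be checked directly. Here is a rough sketch, assuming a POSIX shell with mktemp available; the directory, file name and sample text are invented for the demo:

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
printf 'from the file\n' > "$demo_dir/input.txt"

# No file argument: grep reads the pipe (its standard input).
from_pipe=$(printf 'from the pipe\n' | grep 'from')

# File argument given: grep reads the file and never looks at the pipe.
from_file=$(printf 'from the pipe\n' | grep 'from' "$demo_dir/input.txt")

printf 'no file argument: %s\n' "$from_pipe"
printf 'file argument:    %s\n' "$from_file"
rm -r "$demo_dir"
```

In the second case the pipe is still connected, but grep simply has no reason to read from it.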
Quote:
Originally Posted by zetabill
So why does grep even have the opportunity to match the glob because it already has the output from ls?
The shell tries to expand glob patterns before passing that argument list to programs. So in the example above, if a* matches any files, that list of files is passed to grep, and grep will not see the pattern. Again, I refer you to post #8, which has a detailed example of what is going on.
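One way to see the expansion for yourself is to replace grep with a small function that just prints its arguments. This is only an illustrative sketch; show_args and the file names apple and avocado are made up, and it assumes mktemp is available:

```shell
# Print each argument in brackets, so the real argument list is visible.
show_args() {
    printf '[%s]' "$@"
    printf '\n'
}

orig_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch apple avocado

# Unquoted: the shell swaps a* for the matching file names, so grep
# would get "apple" as its pattern and "avocado" as a file to search.
unquoted=$(show_args grep a*)

# Quoted: the pattern reaches grep untouched.
quoted=$(show_args grep 'a*')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

cd "$orig_dir" || exit 1
rm -r "$demo_dir"
```

With two matching files, the unquoted form hands grep a file name to read, so whatever ls sends down the pipe is never looked at.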
When the shell interprets a command with pipes in it, it doesn't execute the processes in the order they appear on the command line, and then move the output through them. The pipes are connected together before any of the processes create any output and then the data flows through the pipeline.
Quote:
Originally Posted by zetabill
I never knew that but it's good to know that now. I'll be sure to put "quotes" around my regular expressions from now on. I also didn't know that there is more than one type of regular expression.
Good stuff.
I do know this, though:
grep = Global Regular Expression Parser
Another nice one. I wonder how many different "reasons" for the name grep there are?
Regarding bash, I just want to mention the Advanced Bash-Scripting Guide. It saved me a lot of headaches (next to the man page): some things I find faster in the man page, some are better explained in the guide ...
... I wonder how many different "reasons" for the name grep there are?
Like the Highlander: There Can Be Only One. ... cheers, makyo
Quote:
Originally Posted by Brian Kernighan
The name comes from the ed command g/regular-expression/p ...
The UNIX Programming Environment, Kernighan and Pike, 1984, Prentice-Hall, page 18
Also
Quote:
From Wikipedia, the free encyclopedia
grep is a command line utility that was originally written for use with the Unix operating system. The default behavior of grep takes a regular expression on the command line, reads standard input or a list of files, and outputs the lines containing matches for the regular expression.
The name comes from a command in the Unix text editor ed that takes the form:
g/re/p
which means "search globally for lines matching the regular expression, and print them". There are various command line switches available when using grep that modify the default behavior.
Other (incorrect) backronyms of the name exist, including: General Regular Expression Parser, General Regular Expression Print, Global Regular Expression Parser, and Global Regular Expression Print, though the last example is not entirely wrong. http://en.wikipedia.org/wiki/Grep
BWK's credentials include:
Quote:
Brian Kernighan
From Wikipedia, the free encyclopedia
Brian Wilson Kernighan, (IPA pronunciation: ['kɛrnɪˌhæn], the 'g' is silent); born 1942, is a computer scientist who worked at the Bell Labs and contributed to the design of the pioneering AWK and AMPL programming languages. He is also the author of the famous Hello, world program.
Kernighan's name became widely known through co-authorship of the first book on the C programming language with Dennis Ritchie. Kernighan has said that he had no part in the design of the C language: "It's entirely Dennis Ritchie's work". He authored many Unix programs, including ditroff. -- more at: http://en.wikipedia.org/wiki/Kernighan
I'm using Ubuntu in VMware.
When using grep '^[a-z]$' dict-file
the output is ABCDEFGHJIKLMOOPQRSTUVWXYabcdefghijklmnopqrstuvwxyz
That's capital A-Y, then lowercase a-z.
grep searches for substrings. You'll note that all strings found contain two consecutive alphabet characters (not unexpected when searching a dictionary).
Use ^ to match the beginning of a line, and $ to match the end of the line. Note that $ also has special meaning to the shell, so you might want to add some quotes.
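As for the capital letters showing up with '^[a-z]$': one possible explanation, offered here as an assumption since the thread does not settle it, is that in some locales a range like [a-z] follows the locale's collation order and can take in upper-case letters. A minimal sketch of two ways around that, using a made-up word list in place of dict-file:

```shell
# Hypothetical stand-in for dict-file: one entry per line.
words=$(printf '%s\n' a B c ab Z)

# Single quotes keep the shell away from $. The ^ and $ anchors tie the
# match to the whole line, so only one-character lines can match.
# LC_ALL=C makes [a-z] mean exactly the ASCII lower-case letters.
c_locale=$(printf '%s\n' "$words" | LC_ALL=C grep '^[a-z]$')

# [[:lower:]] asks for lower-case letters by class instead of by range.
by_class=$(printf '%s\n' "$words" | grep '^[[:lower:]]$')

printf 'C locale range:\n%s\n' "$c_locale"
printf 'POSIX class:\n%s\n' "$by_class"
```

Both variants should leave out B and Z, and also ab, since the anchors only allow a single character on the line.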
Yes, it's interesting stuff. I first learned of regular expressions when I read 'oreilly - bash in a nutshell'. I knew it was some way of using patterns to match text, but I didn't really take it all in.
The term 'regular expression' didn't help either!
Thanks to Matthew et al. I've got a better handle on it (although I still don't fully understand it, and I suppose it's what Matthew was saying: you have to practice and read, practice and re-read).
A big problem for me was realizing that the shell works in its own way and may misinterpret the regular expression if the user is not careful, hence the need for quoting etc.
The other problem I have is difficult to explain, but it's regarding the structure of command lines etc. It's hazy at the moment, but with each good explanation I understand a little more.
Oh ho!!! While searching through man pages, theory & linuxquestions.org
I have realised that there are things like:
1. Wildcard characters, which have been popularly used in DOS.
2. Regular expressions, as taught in subjects like Theory of Computational Science.
3. Extended regular expressions (R.E.).
4. Pattern matching, which looks very similar to R.E. but is not.
5. And finally, Perl-style regular expressions.
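To make a few of the differences in that list concrete, here is a small sketch comparing a shell wildcard, a basic regular expression and an extended one. The helper is_glob_match and the sample words are made up; Perl-style patterns (grep -P in GNU grep) are left out because not every grep build supports them:

```shell
# Shell wildcard (glob): * means "any string", and the pattern must
# match the WHOLE word.
is_glob_match() {
    case $1 in
        a*c) return 0 ;;
        *)   return 1 ;;
    esac
}

# Basic regular expression (plain grep): + is an ordinary character,
# so 'a+c' only finds the literal text a+c.
bre=$(printf '%s\n' abc aac 'a+c' | grep 'a+c')

# Extended regular expression (grep -E): + means "one or more of the
# previous item", so 'a+c' finds one or more a's followed by c.
ere=$(printf '%s\n' abc aac 'a+c' | grep -E 'a+c')

if is_glob_match abc; then printf 'glob a*c matches abc\n'; fi
printf 'BRE a+c matches: %s\n' "$bre"
printf 'ERE a+c matches: %s\n' "$ere"
```

The same three characters a+c pick out different lines depending on which kind of pattern the tool expects, which is why it pays to know which one you are writing.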