LinuxQuestions.org
Old 12-08-2006, 09:27 AM   #16
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65

When you have a list of RE elements in [square brackets], the square-brackets part matches any one of the characters between the brackets. So [(a.*b)] matches any line containing a, b, ., *, ( or ).
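A quick way to convince yourself of this is to feed grep a few lines non-interactively. This is only a minimal sketch; the sample lines are made up for illustration:

```shell
# Sample lines are invented for illustration.
# Inside the brackets every character is literal: ( a . * b )
matches=$(printf '%s\n' 'xyz' 'cat' '3*4' 'end.' 'hello' | grep '[(a.*b)]')

# Only lines containing at least one character from the set are kept.
printf '%s\n' "$matches"
```

The lines 'xyz' and 'hello' contain none of the listed characters, so grep drops them.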

A good way to test your expressions is to turn on colourised output:
Code:
export GREP_OPTIONS=--colour=auto
And then to type in the grep command with no input file specified and nothing piped in. Then just start typing. When you hit return, if your input line matches you'll see it repeated with the matching parts coloured in:
Code:
$ export GREP_OPTIONS=--colour=auto
$ egrep '[(a.*b)(b.*a)]'
abcdefg.* (we like it)
abcdefg.* (we like it)
nothing works here
Note that the "nothing works here" line was not echoed back - it did not match the expression.

When you're done, press control-d (at the start of a line) to finish your test.

There's probably a better way to do this, but here's how I'd make an expression meaning "match any line which contains an a but no b, OR a b but no a":
Code:
^[^b]*a[^b]*$|^[^a]*b[^a]*$
Looks ugly, but here's how it works:
  • The expression is split in two parts with the | character, which means that the whole expression matches if one or the other of the sub-expressions matches. For example a|b|c is the same as [abc].
  • OK the first of the two sub-expressions is:
    Code:
    ^[^b]*a[^b]*$
  • Let's break it down. The ^ at the start of the expression means "only match this expression if it is anchored to the start of a line".
  • The [^b] part means "match any character which is not b" (yes, it's a little confusing that the ^ means "not" here and "start of line" outside [square brackets]).
  • The * means "the previous bit 0 or more times" (the previous bit being the [^b] - any non-b character).
  • The a is treated as a literal "a" character.
  • The [^b]* as before.
  • The $ means "end of line".
  • So that whole thing ^[^b]*a[^b]*$ means "a whole line containing an a, but no b's".
  • Now you should be able to see that the whole expression says "a line containing an a but no b's or a line containing a b but no a's".
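If you'd rather check the expression against a fixed set of lines than type them in by hand, something like this should do. It is a sketch only: the sample words are made up, and grep -E is used as the equivalent of egrep:

```shell
# The sample words are invented for illustration.
re='^[^b]*a[^b]*$|^[^a]*b[^a]*$'

# 'cat' has an a but no b, 'bob' has a b but no a.
# 'cab' and 'banana' have both, 'dog' has neither.
matches=$(printf '%s\n' 'cat' 'bob' 'cab' 'dog' 'banana' | grep -E "$re")

printf '%s\n' "$matches"
```

Only 'cat' and 'bob' should come back, since each of them satisfies exactly one side of the |.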
Enjoy
 
Old 12-08-2006, 10:58 AM   #17
zetabill
Member
 
Registered: Oct 2005
Location: Rhode Island, USA
Distribution: Slackware, Xubuntu
Posts: 348

Rep: Reputation: 32
Interesting thread. It's been a while since I took my bash class but I thought that
Code:
ls | grep a*
is only going to match any line with zero or more instances of "a"... and that's it. From the way I understood, on the bash command line the "|" (pipe) means take the output from pre-pipe and feed, as a file, the output to post-pipe. So why does grep even have the opportunity to match the glob because it already has the output from ls? I never knew that but it's good to know that now. I'll be sure to put "quotes" around my regular expressions from now on. I also didn't know that there is more than one type of regular expression.

Good stuff.

I do know this, though:
grep = Global Regular Expression Parser
 
Old 12-08-2006, 01:02 PM   #18
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
Quote:
Originally Posted by zetabill
Interesting thread. It's been a while since I took my bash class but I thought that
Code:
ls | grep a*
is only going to match any line with zero or more instances of "a"... and that's it.
Please re-read post #8. It will usually work, but not for the reason you might think, and when you have more than one file matching a* in the current working directory it won't work at all.

Quote:
Originally Posted by zetabill
From the way I understood, on the bash command line the "|" (pipe) means take the output from pre-pipe and feed, as a file, the output to post-pipe.
That's a little ambiguous because some programs don't take their input from standard input, or don't write their output to standard output. For example, most of the GNU core utilities (grep included) will take their input from standard input only if no files are specified as part of the command.

It is more precise to say that the pipe connects the standard output file handle of the command to the left of the pipe symbol to the standard input file handle of the command to the right of the pipe. Thus pipes are only useful for programs which read from stdin and/or write to stdout.
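Here is a small sketch of the difference. The temporary file and its contents are made up for the example; the point is that grep only reads the pipe when it is given no file argument:

```shell
# A throwaway file; the name and contents are made up for this sketch.
tmp=$(mktemp)
printf 'from the file\n' > "$tmp"

# No file argument: grep reads standard input, i.e. the pipe.
from_pipe=$(printf 'from the pipe\n' | grep 'from')

# File argument given: grep reads the file and ignores the pipe.
from_file=$(printf 'from the pipe\n' | grep 'from' "$tmp")

printf '%s\n' "$from_pipe" "$from_file"
rm -f "$tmp"
```

In the second command the pipe is still connected to grep's standard input, but grep never reads from it.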

Quote:
Originally Posted by zetabill
So why does grep even have the opportunity to match the glob because it already has the output from ls?
The shell tries to expand glob patterns before passing that argument list to programs. So in the example above, if a* matches any files, that list of files is passed to grep, and grep will not see the pattern. Again, I refer you to post #8, which has a detailed example of what is going on.
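You can watch the expansion happen by letting printf stand in for grep and print the argument list it receives. This is a sketch under assumptions: the scratch directory and the file names apple and avocado are invented for the example:

```shell
# Scratch directory and file names are made up for this sketch.
dir=$(mktemp -d)
touch "$dir/apple" "$dir/avocado"

# Unquoted: the shell replaces a* with the matching file names
# before the command is started.
unquoted=$(cd "$dir" && printf '%s\n' grep a*)

# Quoted: the pattern is passed through untouched.
quoted=$(cd "$dir" && printf '%s\n' grep 'a*')

printf '%s\n' "$unquoted" "$quoted"
rm -rf "$dir"
```

In the unquoted case a real grep would take apple as its pattern and avocado as a file to search, so it would never look at the output of ls at all.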

When the shell interprets a command with pipes in it, it doesn't execute the processes in the order they appear on the command line, and then move the output through them. The pipes are connected together before any of the processes create any output and then the data flows through the pipeline.
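A rough way to see that both sides of a pipe run at the same time is to time two sleeps. This sketch assumes bash, since SECONDS is a bash variable that counts whole seconds:

```shell
# Both commands are started together, so the pipeline finishes in
# roughly two seconds rather than the four a one-after-the-other
# run would need.
SECONDS=0
sleep 2 | sleep 2
elapsed=$SECONDS

printf 'elapsed: about %s seconds\n' "$elapsed"
```

Because SECONDS only has one-second resolution the figure is approximate, but it should stay well short of four.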

Quote:
Originally Posted by zetabill
I never knew that but it's good to know that now. I'll be sure to put "quotes" around my regular expressions from now on. I also didn't know that there are more than one type of regular expression.

Good stuff.

I do know this, though:
grep = Global Regular Expression Parser
Another nice one. I wonder how many different "reasons" for the name grep there are?
 
Old 12-08-2006, 01:59 PM   #19
cdex
LQ Newbie
 
Registered: Aug 2002
Location: near berlin, germany
Distribution: Debian unstable
Posts: 11

Rep: Reputation: 0
Regarding bash, I just want to mention the Advanced Bash-Scripting Guide. It saved me a lot of headaches (next to the man page) -> some things I find faster in the man page, some are better explained in the guide ...
 
Old 12-08-2006, 02:26 PM   #20
makyo
Member
 
Registered: Aug 2006
Location: Saint Paul, MN, USA
Distribution: {Free,Open}BSD, CentOS, Debian, Fedora, Solaris, SuSE
Posts: 735

Rep: Reputation: 76
Hi.
Quote:
Originally Posted by matthewg42
... I wonder how many different "reasons" for the name grep there are?
Like the Highlander: There Can Be Only One. ... cheers, makyo
Quote:
Originally Posted by Brian Kernighan
The name comes from the ed command g/regular-expression/p ...
The UNIX Programming Environment, Kernighan and Pike, 1984, Prentice-Hall, page 18
Also
Quote:
From Wikipedia, the free encyclopedia

grep is a command line utility that was originally written for use with the Unix operating system. The default behavior of grep takes a regular expression on the command line, reads standard input or a list of files, and outputs the lines containing matches for the regular expression.

The name comes from a command in the Unix text editor ed that takes the form:

g/re/p

which means "search globally for lines matching the regular expression, and print them". There are various command line switches available when using grep that modify the default behavior.

Other (incorrect) backronyms of the name exist, including: General Regular Expression Parser, General Regular Expression Print, Global Regular Expression Parser, and Global Regular Expression Print, though the last example is not entirely wrong.
http://en.wikipedia.org/wiki/Grep
BWK's credentials include:
Quote:
Brian Kernighan
From Wikipedia, the free encyclopedia

Brian Wilson Kernighan, (IPA pronunciation: ['kɛrnɪˌhæn], the 'g' is silent); born 1942, is a computer scientist who worked at the Bell Labs and contributed to the design of the pioneering AWK and AMPL programming languages. He is also the author of the famous Hello, world program.

Kernighan's name became widely known through co-authorship of the first book on the C programming language with Dennis Ritchie. Kernighan has said that he had no part in the design of the C language: "It's entirely Dennis Ritchie's work". He authored many Unix programs, including ditroff. -- more at:
http://en.wikipedia.org/wiki/Kernighan
( edit 1: correct quote )

Last edited by makyo; 12-08-2006 at 03:20 PM.
 
Old 12-09-2006, 12:26 AM   #21
m4a1rifle
LQ Newbie
 
Registered: Dec 2006
Posts: 5

Original Poster
Rep: Reputation: 0
I'm using Ubuntu in VMware.
When using grep '^[a-z]$' dict-file
the output is ABCDEFGHJIKLMOOPQRSTUVWXYabcdefghijklmnopqrstuvwxyz
That's capital A-Y, then lowercase a-z.

Is it VMware's fault?
 
Old 03-05-2008, 05:05 AM   #22
Mr. ameya sathe
Member
 
Registered: Jul 2007
Distribution: RedHat Enterprise 5 Server Edition; Ubuntu 8.04 ; RHCE Certificate number: 805008741034103
Posts: 78
Blog Entries: 8

Rep: Reputation: Disabled
Regular expression containing a combination of letters and numbers

Given:
Regular expression
[[:alpha:]][0-9]*[[:alpha:]]

Using the above expression, I grep on the file
/usr/share/dict/linux.words
on the bash prompt.
i.e.

Code:
 grep '[[:alpha:]][0-9]*[[:alpha:]]' /usr/share/dict/linux.words
Then, the following output was observed-

10-point
10th
11-point
12-point


etc..

How come this output is seen?
Isn't the output supposed to be similar to

ab
a1b

i.e. a letter followed by a digit followed by a letter
OR
a letter followed by a letter.

Last edited by Mr. ameya sathe; 03-05-2008 at 08:04 AM. Reason: Correction in CODE i.e. Insertion of single quotes around the Regular expression
 
Old 03-05-2008, 05:07 AM   #23
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
please don't drag up dead threads. new question = new thread.
 
Old 03-05-2008, 05:15 AM   #24
kees-jan
Member
 
Registered: Sep 2004
Distribution: Debian, Ubuntu, BeatrIX, OpenWRT
Posts: 273

Rep: Reputation: 30
Grep searches for substrings. You'll note that all the strings found contain two consecutive alphabetic characters (not unexpected when searching a dictionary).

Use ^ to match the beginning of a line, and $ to match the end of the line. Note that $ also has special meaning to the shell, so you might want to add some quotes.
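As a sketch of the difference the anchors make, here is the expression run with and without them. The sample lines are made up (a few are taken from the output quoted above), so the real dictionary file isn't needed:

```shell
# Sample lines are made up; the real dictionary file is not needed here.
words=$(printf '%s\n' 'ab' 'a1b' '10-point' '10th' 'a12')

# Unanchored: a line is printed if the pattern matches anywhere in it,
# e.g. the "po" in 10-point or the "th" in 10th.
unanchored=$(printf '%s\n' "$words" | grep '[[:alpha:]][0-9]*[[:alpha:]]')

# Anchored: the whole line must be letter, optional digits, letter.
anchored=$(printf '%s\n' "$words" | grep '^[[:alpha:]][0-9]*[[:alpha:]]$')

printf 'unanchored:\n%s\nanchored:\n%s\n' "$unanchored" "$anchored"
```

The single quotes keep the shell from touching the $ and the brackets, so grep sees the expression exactly as written.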

Groetjes,

Kees-Jan
 
Old 03-05-2008, 07:46 AM   #25
jukebox55
Member
 
Registered: Aug 2007
Distribution: slackware 11
Posts: 101

Rep: Reputation: 15
yes its interesting stuff, i first learned of regular expressions when i read 'oreilly - bash in a nutshell', i knew it was some way of using patterns to match text, but i didnt really take it all in.

the term 'regular expression' didnt help either!

thanks to Matthew et al ive got a better handle on it (although i still dont fully understand it, and i suppose its what matthew was saying, you have to practice and read, practice and re-read.)

a big problem for me was realizing that the shell works in its own way and may misinterpret the regular expression if the user was not careful, hence the need for quoting etc.

the other problem i have is difficult to explain, but its regarding the structure of command lines etc. its hazy at the moment, but with each good explanation i understand a little more

Last edited by jukebox55; 03-05-2008 at 07:50 AM.
 
Old 03-05-2008, 11:09 AM   #26
Mr. ameya sathe
Member
 
Registered: Jul 2007
Distribution: RedHat Enterprise 5 Server Edition; Ubuntu 8.04 ; RHCE Certificate number: 805008741034103
Posts: 78
Blog Entries: 8

Rep: Reputation: Disabled
Oh ho!!! While searching through man pages, theory & linuxquestions.org
I have realised that there are things like
1. Wildcard characters, which have been popularly used in DOS.
2. Regular expressions as taught in subjects like Theory of Computational Science.
3. Extended regular expressions (R.E.).
4. Pattern matching, which looks very similar to R.E. but is not.
5. And, finally, Perl-style regular expressions.

Oh GOD!! Save Me. What should I learn??
 
  



Tags
bash, color, expression, globbing, grep, regular, scripting, shell


