LinuxQuestions.org
Old 01-14-2008, 08:43 AM   #1
huahd
LQ Newbie
 
Registered: Jan 2008
Posts: 27

Rep: Reputation: 15
Some questions on regular expressions (shouldn't be hard, so please help)


Hi!
I'm reading the book Linux: The Textbook by Syed Sarwar. Following the examples in the book, I tried the grep/egrep commands in bash on my Fedora 8 system and ran into some problems.
all the grep/egrep commands below are used to search the text file students, whose content is:

$cat students
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212

OK! question #1:
should all regular expressions be quoted using single or double quotes?
if i type $grep ^[A-H] students, does it simply go through the file to see if there is a line containing string "^[A-H]", rather than treat it as a regular expression?

question #2:
if the answer to #1 is yes, do '' and "" mean the same thing?

question #3:
look at the result:

$grep '[a-z]\{4\}' students
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212

why are all of the lines printed out? i thought the command means only to print those lines containing exactly 4 lowercase letters consecutively. that's what {n} means according to my man page GREP(1).

question #4:
again the command $grep -n '[a-z]\{4\}' students
why are there backslashes?
i executed $grep -n '[a-z]{4}' students and saw nothing.
i checked my man page GREP(1), which says: in basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
so does that mean my grep only supports basic regular expressions? however, GREP(1) also says: grep understands two different versions of regular expression syntax: "basic" and "extended." In GNU grep, there is no difference in available functionality using either syntax. In other implementations, basic regular expressions are less powerful. so it seems that my grep is not GNU grep, but some other implementation.
is that correct? if so, what implementation is my grep? and what is GNU grep anyway?

thanks!

Last edited by huahd; 01-14-2008 at 09:59 PM. Reason: improper format.
 
Old 01-14-2008, 11:33 AM   #2
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 60
You should probably spend a little time reading through a good tutorial on grep itself. I like this one: http://www.panix.com/~elflord/unix/grep.html Here are some brief answers though:

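1 & 2: You need to put quote marks around all but the simplest grep patterns. The quotes are for the shell, not for grep: without them bash may treat characters such as [, *, or \ as its own syntax before grep ever sees the pattern. If bash leaves the pattern alone, grep still treats ^[A-H] as a regular expression, not as a literal string. The single and double quote marks have slightly different meanings: single quotes pass everything through untouched, while double quotes still let the shell expand things like $variables. You will generally want the single ones. For more, see the section in that tutorial on quoting.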
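To make the quoting point concrete, here is a small sketch, assuming bash (or another POSIX-style shell) and a scratch directory. The three-line students file and the file named ^A are made up for the demonstration; they are not from the book.

```shell
# Sketch only: work in a scratch directory so nothing real is touched.
export LC_ALL=C            # make [A-H] mean the plain ASCII range
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'David Kendall' 'John Johnsen' 'Hassaan Sarwar' > students

# Single quotes: the shell hands ^[A-H] to grep untouched.
quoted=$(grep '^[A-H]' students)

# Create a (hypothetical) file whose name matches the shell glob ^[A-H].
touch '^A'
# Unquoted, the shell now expands the word to ^A before grep runs,
# so grep looks for lines starting with "A" and finds none.
unquoted=$(grep ^[A-H] students || true)

# Double quotes still let the shell expand $name; single quotes do not.
name=Sarwar
double=$(grep "$name" students)
single=$(grep '$name' students || true)   # searches for the literal text $name

printf 'quoted:   %s\n' "$quoted"
printf 'unquoted: %s\n' "$unquoted"
printf 'double:   %s\n' "$double"
printf 'single:   %s\n' "$single"
```

The unquoted command only misbehaves when a matching file name happens to exist in the current directory, which is exactly why it is safer to quote every time.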

3: All the lines match because each of those lines does have 4 consecutive lower-case letters. You are probably thinking that they shouldn't match if there are more than four letters, but the regex matches as soon as it hits the four lower-case letters in a row. Change the expression to this
Code:
grep '[a-z]\{8\}' students
and you will get different results. If you want four letters and then word end, you need to specify that in your search.
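As a rough illustration of the difference, the sketch below counts matching lines with -c instead of printing them. It recreates the students file from the first post in a scratch directory, and it assumes a grep that supports the -w (whole word) option, as GNU grep does; -w is only one possible way to say "four letters and then word end".

```shell
export LC_ALL=C            # keep [a-z] to plain ASCII lowercase
cd "$(mktemp -d)" || exit 1
cat > students <<'EOF'
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212
EOF

# Four lowercase letters in a row anywhere on the line: every line qualifies.
four=$(grep -c '[a-z]\{4\}' students)

# Eight in a row: only the lines with jjohnsen, msarwark and davidson.
eight=$(grep -c '[a-z]\{8\}' students)

# A whole word made of exactly four lowercase letters (assumes -w support):
# only the "john" in john.johnsen qualifies.
word=$(grep -c -w '[a-z]\{4\}' students)

printf 'four=%s eight=%s word=%s\n' "$four" "$eight" "$word"
```

Note that -w treats letters, digits and underscores as word characters, so d_kendall counts as one long word here.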

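4: In basic regular expressions, which is what plain grep uses by default, characters such as { and } only get their "magic" meaning when you put a backslash in front of them; without the backslash they are ordinary characters, which is why '[a-z]{4}' found nothing. In extended regular expressions (egrep, or grep -E) it is the other way round: { and } are magic on their own, and a backslash turns the magic off. Again, the tutorial will tell you more. GNU grep is one version of grep, and it is the one Linux distributions such as Fedora normally ship. When the man page says there is no difference in available functionality, it means both syntaxes can express the same things in GNU grep, not that they are written the same way. There are many different versions, and they sometimes have different default behaviors about important things like backslashes and magic characters.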
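A minimal sketch of the backslash behavior, assuming GNU grep; the sample file and its x{4} line are invented for this example so that the unescaped-brace pattern has something to find.

```shell
export LC_ALL=C
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'John Johnsen jjohnsen@psu.net' \
              'Kelly Kimberly kellyk@umich.gov' \
              'x{4}' > sample

# Basic syntax (plain grep): the repeat count needs backslashed braces.
bre=$(grep '[a-z]\{8\}' sample)

# Extended syntax (grep -E, historically egrep): bare braces do the same job.
ere=$(grep -E '[a-z]{8}' sample)

# Basic syntax with bare braces: they are ordinary characters, so this
# looks for a lowercase letter followed by the literal text {4}.
literal=$(grep '[a-z]{4}' sample)

printf 'bre:     %s\n' "$bre"
printf 'ere:     %s\n' "$ere"
printf 'literal: %s\n' "$literal"
```

To check which implementation is installed, running grep --version should print a line that names GNU grep if that is what you have.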
 
Old 01-15-2008, 07:04 AM   #3
huahd
LQ Newbie
 
Registered: Jan 2008
Posts: 27

Original Poster
Rep: Reputation: 15
thanks telemachos, i've read the grep tutorial
that helped me a lot
 
  


All times are GMT -5.
