Some questions on regular expressions (shouldn't be hard, so please help)
Hi!
i'm reading the book Linux: The Textbook by Syed Sarwar. Following the examples in the book, i tried the grep/egrep commands using bash on my Fedora 8, and then ran into some problems. all the grep/egrep commands below search the text file students, whose content is:

$ cat students
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212

OK!

question #1: should all regular expressions be quoted using single or double quotes? if i type
$ grep ^[A-H] students
does it simply go through the file to see if there is a line containing the string "^[A-H]", rather than treat it as a regular expression?

question #2: if the answer to #1 is yes, do '' and "" mean the same thing?

question #3: look at the result:
$ grep '[a-z]\{4\}' students
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212
why are all of the lines printed out? i thought the command means to print only those lines containing exactly 4 consecutive lowercase letters. that's what {n} means according to my man page GREP(1).

question #4: again the command
$ grep -n '[a-z]\{4\}' students
why are there backslashes? i executed
$ grep -n '[a-z]{4}' students
and saw nothing. i checked my man page GREP(1), which says:
in basic regular expressions the metacharacters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
so does that mean my grep only supports basic regular expressions? however, GREP(1) also says:
grep understands two different versions of regular expression syntax: "basic" and "extended." In GNU grep, there is no difference in available functionality using either syntax. In other implementations, basic regular expressions are less powerful.
so it seems that my grep is not a GNU grep, but some other implementation. is that correct? if so, what implementation is my grep? and what is GNU grep anyway?

thanks!
You should probably spend a little time reading through a good tutorial on grep itself. I like this one: http://www.panix.com/~elflord/unix/grep.html Here are some brief answers though:
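1 & 2: You need to put quote marks around all but the simplest grep patterns. The quotes are there for the shell, not for grep: grep always treats its pattern as a regular expression, and quoting only keeps the shell from interpreting characters like [, *, \ and $ before grep sees them. Single and double quote marks have slightly different meanings: inside double quotes the shell still expands things like $variables and backquotes, while inside single quotes nothing is touched. You will generally want the single ones. For more, see the section in that tutorial on quoting.

3: All the lines match because each of those lines does have 4 consecutive lower-case letters. You are probably thinking that they shouldn't match if there are more than four letters, but the regex matches as soon as it hits four lower-case letters in a row, wherever in the line they are. Change the expression to this and only the lines with a run of at least eight lower-case letters will be printed:
Code:
grep '[a-z]\{8\}' students

4: Various "magic" characters have to have a backslash in front of them in order to activate (or sometimes to turn off) their magic. In the basic syntax that plain grep uses, { is an ordinary character and \{ is the magic one; in the extended syntax (grep -E or egrep) it is the other way around. The sentence you quoted from the man page means that GNU grep can express the same things in either syntax, not that the two are written the same way. Again, the tutorial will tell you more.

GNU grep is one version of grep, and it is the one that Fedora ships. There are many different versions, and they sometimes have different default behaviors about important things like backslashes and magic characters.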
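If you want to check the answers above for yourself, here is a rough sketch that rebuilds the students file from your post in a scratch directory and counts matching lines with grep -c. The variable names, the scratch directory and the LC_ALL=C setting are my own additions for illustration (LC_ALL=C is there so that [a-z] means only the 26 ASCII lower-case letters), and the last pattern is just one possible way to ask for a run of exactly four letters.

```shell
#!/bin/sh
# Sketch only. LC_ALL=C is an assumption added here so that [a-z] covers
# exactly the ASCII lower-case letters, whatever the system locale is.
LC_ALL=C
export LC_ALL

# Rebuild the students file from the question in a scratch directory.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
cat > students <<'EOF'
John Johnsen john.johnsen@tp.com 503.555.1111
Hassaan Sarwar hsarwar@k12.st.or 503.444.2132
David Kendall d_kendall@msnbc.org 229.111.2013
John Johnsen jjohnsen@psu.net 301.999.8888
Kelly Kimberly kellyk@umich.gov 555.123.9999
Maham Sarwar msarwark@k12.st.or 713.888.0000
Jamie Davidson j.davidson@uet.edu 515.001.2932
Nabeel Sarwar nsarwar@xyz.net 434.555.1212
EOF

# 1 & 2: single quotes hand the pattern to grep untouched, and grep
# treats it as a regex (lines starting with a letter from A to H).
starts_a_to_h=$(grep -c '^[A-H]' students)

# Double quotes still let the shell expand $pattern before grep runs.
pattern='[a-z]\{8\}'
eight=$(grep -c "$pattern" students)

# 3: \{4\} is satisfied by any run of four or more lower-case letters.
four=$(grep -c '[a-z]\{4\}' students)

# 4: extended syntax (grep -E) takes bare braces; in basic syntax the
# unescaped braces are ordinary characters, so a literal "{4}" is sought.
extended=$(grep -E -c '[a-z]{8}' students)
literal_brace=$(grep -c '[a-z]{4}' students)

# One way to ask for a run of exactly four: require a non-letter (or the
# start/end of the line) on both sides of the run.
exactly_four=$(grep -E -c '(^|[^a-z])[a-z]{4}([^a-z]|$)' students)

echo "A-H: $starts_a_to_h  four+: $four  eight+: $eight"
echo "extended: $extended  literal brace: $literal_brace  exactly four: $exactly_four"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The two counts for eight-letter runs should agree, which is the point of the man page remark about basic and extended syntax having the same functionality, and the unescaped-brace count should be 0, which is why your command printed nothing.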
thanks telemachos, i've read the grep tutorial
that helped me a lot :)