LinuxQuestions.org
Old 05-17-2013, 05:29 PM   #1
asherbarasher
LQ Newbie
 
Registered: May 2013
Distribution: centos 6.x
Posts: 5

Rep: Reputation: Disabled
simple regex question


Hi everyone.
I am very new to Linux, so I think this is the best forum to post such a question.

I'm trying to construct a regex to work with sed. I need it to match words whose third character is z.
I don't understand why this doesn't work:
ls | sed -n '/..z*$/p'
A dot should stand for any single character, but this matches just about everything.
Also, if you can point me to a good regex tutorial for beginners,
it would be very helpful.
Thank you in advance.
 
Old 05-17-2013, 07:04 PM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,362

Rep: Reputation: 2751
Start here http://www.grymoire.com/Unix/Sed.html#uh-0.
Watch that use of '*': in a regex it means zero or more of the preceding character, so it does not behave like a shell wildcard.
 
Old 05-18-2013, 01:30 AM   #3
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by asherbarasher
I don't understand why this doesn't work:
ls | sed -n '/..z*$/p'
It's because 'z*' matches either no 'z', one 'z' or more than one 'z'. Maybe this will work better:

Code:
ls | sed -n '/..z.*$/p'
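A minimal sketch of the difference, using a made-up list of names in place of ls output (the names and variable names are assumptions for illustration only):

```shell
# Hypothetical names, one per line, standing in for the output of ls.
names='ab
abc
unzip
gzip
bluetooth-wizard'

# '..z*$': any two characters, then zero or more z, then end of line.
# Zero z's is allowed, so every name with at least two characters matches.
loose=$(printf '%s\n' "$names" | sed -n '/..z*$/p')

# '..z.*$': a z is now required, but the pattern is not anchored at the
# start, so the z may sit at the third position or anywhere after it.
unanchored=$(printf '%s\n' "$names" | sed -n '/..z.*$/p')

printf 'loose:\n%s\n\nunanchored:\n%s\n' "$loose" "$unanchored"
```

With these names, the first pattern returns all five lines, while the second returns only unzip and bluetooth-wizard. The second still accepts a z that comes later than the third character, because nothing ties the pattern to the start of the line.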
 
Old 05-18-2013, 02:33 AM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
Most (all ?) regex questions are simple. It's the resolution that gets complex ...
 
Old 05-18-2013, 04:30 AM   #5
asherbarasher
LQ Newbie
 
Registered: May 2013
Distribution: centos 6.x
Posts: 5

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by chrism01
Start here http://www.grymoire.com/Unix/Sed.html#uh-0.
Watch that use of '*': in a regex it means zero or more of the preceding character, so it does not behave like a shell wildcard.
Thanks for this link. I don't know how I didn't find it, but it's a great tutorial.

Quote:
It's because 'z*' matches either no 'z', one 'z' or more than one 'z'. Maybe this will work better:
No, it doesn't work either; it just lists everything in the directory.
 
Old 05-18-2013, 04:37 AM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,140

Rep: Reputation: 4123
In which case you should provide sample data so we can evaluate what is happening.

I was about to comment that last I looked the grymoire site was somewhat dated, but I see a recent attribution. Goodness.
 
Old 05-18-2013, 04:51 AM   #7
asherbarasher
LQ Newbie
 
Registered: May 2013
Distribution: centos 6.x
Posts: 5

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
In which case you should provide sample data so we can evaluate what is happening.

I was about to comment that last I looked the grymoire site was somewhat dated, but I see a recent attribution. Goodness.
Actually, I don't need it for a work task; I just wonder how to write this kind of regular expression.
I thought it would be pretty easy, but I got stuck unexpectedly.
 
Old 05-18-2013, 03:48 PM   #8
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by asherbarasher
No, it doesn't work either; it just lists everything in the directory.
Then samples of your data would be useful, as syg00 points out.

The regex actually works: on my system it lists .tar.gz, .zip, and a couple of files ending in 'z', in a directory containing several types of files.
 
Old 05-19-2013, 11:59 AM   #9
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
You shouldn't be parsing ls for filenames anyway. For simple name pattern matching you can almost always use simple globbing patterns.

Code:
printf '%s\n' ??z*
For matching by more advanced criteria, use find.
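A rough sketch of both approaches side by side, run in a throwaway directory with made-up file names (the names and variables are assumptions; -maxdepth is an extension found in GNU and BSD find rather than a POSIX option):

```shell
# Create a scratch directory with a few hypothetical file names.
dir=$(mktemp -d)
touch "$dir/unzip" "$dir/gzip" "$dir/size" "$dir/bluetooth-wizard"

# Globbing: the shell expands ??z* itself, so no ls is needed.
glob_matches=$(cd "$dir" && printf '%s\n' ??z*)

# find: the pattern is quoted so find, not the shell, interprets it.
# -maxdepth 1 keeps find from descending into subdirectories.
find_matches=$(cd "$dir" && find . -maxdepth 1 -name '??z*' | sort)

printf 'glob:\n%s\n\nfind:\n%s\n' "$glob_matches" "$find_matches"

# Clean up the scratch directory.
rm -rf "$dir"
```

Both select size and unzip here; find prints each name with a leading ./ because the search starts at the current directory.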
 
Old 05-19-2013, 12:09 PM   #10
asherbarasher
LQ Newbie
 
Registered: May 2013
Distribution: centos 6.x
Posts: 5

Original Poster
Rep: Reputation: Disabled
Quote:
Then samples of your data would be useful, as syg00 points out.
The regex actually works, in my system it lists .tar.gz, .zip, and a couple of files ending in 'z', in a directory containing several types of files.
I have no problems with the first and last characters; I use the ^ and $ operators respectively. But when I try to target the second or third character, that is where the problem is.
Here's an example:
[root@lab2 bin]# cd /usr/bin
[root@lab2 bin]# ls | sed -n '/..z.*$/p'
abrt-action-analyze-backtrace
abrt-action-analyze-c
abrt-action-analyze-core
abrt-action-analyze-oops
abrt-action-analyze-python
bluetooth-wizard
bunzip2
compiz
compiz-gtk
egroupwarewizard
eu-size
funzip
gettextize
gpg-zip
groupwarewizard
groupwisewizard
gunzip
hg-viz
htfuzzy
--omitted--
*****************
The same happens with grep.
As you can see, it lists everything in the directory. My question is: how can I filter the output based on what the second, third, or any other character is?

Last edited by asherbarasher; 05-19-2013 at 12:11 PM.
 
Old 05-19-2013, 01:10 PM   #11
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
It's simple. You use shell globbing, as I said before, or find.

Tools like grep/sed/awk are designed for text processing, not filename matching. Do not try to filter the output of ls for names or metadata.


One thing to remember about regex, by the way, is that it's unanchored by default. You do not need to use ^/$ unless you specifically need to match those positions exactly, and you don't need to give any more than is necessary to uniquely match the string. (e.g. '^..z' will return any string with 'z' as the third character.)

Globbing is more limited though, in that the pattern must match the entire string, usually with the use of "*" wildcards.
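To illustrate the anchoring difference, here is a small sketch that tests one made-up name against a regex and two glob patterns. It uses case for the glob test so no files are needed; the name and variable names are assumptions.

```shell
# Hypothetical name to test.
name='unzipsfx'

# Regex: unanchored at the end, so '^..z' only has to match a prefix.
if printf '%s\n' "$name" | grep -q '^..z'; then
    regex_result='match'
else
    regex_result='no match'
fi

# Glob without a trailing *: the pattern must cover the whole string,
# so only three-character names ending in z would match.
case $name in
    ??z) glob_short='match' ;;
    *)   glob_short='no match' ;;
esac

# Glob with a trailing *: the * absorbs the rest of the name.
case $name in
    ??z*) glob_full='match' ;;
    *)    glob_full='no match' ;;
esac

printf 'regex ^..z: %s\nglob ??z: %s\nglob ??z*: %s\n' \
    "$regex_result" "$glob_short" "$glob_full"
```

The regex and the ??z* glob both accept the name, while the bare ??z glob rejects it because it leaves the trailing characters unaccounted for.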
 
Old 05-19-2013, 01:29 PM   #12
divyashree
Senior Member
 
Registered: Apr 2007
Location: Bangalore, India
Distribution: RHEL,SuSE,CentOS,Fedora,Ubuntu
Posts: 1,386

Rep: Reputation: 135
Quote:
Originally Posted by asherbarasher
Hi everyone.
I am very new to Linux, so I think this is the best forum to post such a question.

I'm trying to construct a regex to work with sed. I need it to match words whose third character is z.
I don't understand why this doesn't work:
ls | sed -n '/..z*$/p'
A dot should stand for any single character, but this matches just about everything.
Also, if you can point me to a good regex tutorial for beginners,
it would be very helpful.
Thank you in advance.
This should definitely work the way you want.
With grep:

Code:
ls | grep -E '^..z.*$'
With sed:
Code:
ls | sed -n '/^..z.*$/p'
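As a quick check that the trailing .*$ is optional once the pattern is anchored at the start, the sketch below runs the long and short forms over the same made-up list of names (the names and variable names are assumptions):

```shell
# Hypothetical names standing in for directory entries.
names='unzip
gzip
size
bluetooth-wizard
compiz'

# Long form: anchored at both ends, with .* covering the rest of the line.
long_form=$(printf '%s\n' "$names" | grep -E '^..z.*$')

# Short form: only the start anchor and the three characters that matter.
short_form=$(printf '%s\n' "$names" | grep '^..z')

printf 'long form:\n%s\n\nshort form:\n%s\n' "$long_form" "$short_form"
```

Both forms select unzip and size from this list, since whatever follows the third character does not affect the match.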

Last edited by divyashree; 05-19-2013 at 01:31 PM.
 
Old 05-19-2013, 01:31 PM   #13
Diantre
Member
 
Registered: Jun 2011
Distribution: Slackware
Posts: 515

Rep: Reputation: 234
Quote:
Originally Posted by asherbarasher
As you can see, it lists everything in the directory.
I honestly don't know why it's not working for you. On my system, the following commands all list the same entries (in /usr/bin; find prefixes each one with ./):

Code:
$ ls | sed -n '/^..z/p'
$ ls | grep '^..z'
$ printf '%s\n' ??z*
$ find . -iname '??z*'
The four commands show all entries where the third letter is a 'z':

Code:
bzz
fiz
lrz
lrzip
lrztar
lrzuntar
lsz
maze
mkzftree
mozilla
p7zip
size
size86
unzip
unzipsfx
And as David the H. points out, it's better not to use grep and sed for this purpose.
 
Old 05-19-2013, 06:59 PM   #14
rabirk
Member
 
Registered: Dec 2012
Location: Maryland, US
Distribution: Debian
Posts: 87
Blog Entries: 8

Rep: Reputation: Disabled
I can't help with giving you the regex, but for a reference, try Regular-Expressions.info.
 
Old 05-19-2013, 08:46 PM   #15
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,362

Rep: Reputation: 2751
This is the book on regex: http://regex.info/book.html
 
  


All times are GMT -5. The time now is 10:25 PM.
