LinuxQuestions.org
Old 06-04-2012, 04:35 PM   #1
vonedaddy
Member
 
Registered: Aug 2004
Location: Philadelphia,PA
Posts: 185

Rep: Reputation: 17
Find files not already gzipped


I have a directory with a load of dns logs. Some are gzipped (ending in .gz) and some are just text files.

I run the following command:

ls | grep -v *.gz

expecting to get a list of the files that are NOT zipped, instead I get:

Binary file bind.log.1.120204.gz matches
Binary file bind.log.120125.gz matches
Binary file bind.log.120204.gz matches


Can someone explain what I am doing wrong? And why would I see this output instead of just the files that are not zipped?
 
Old 06-04-2012, 04:39 PM   #2
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
The wildcard in your grep (*.gz) is being expanded on the call.

I would use find:
Code:
find . -maxdepth 1 ! -iname "*.gz"

Last edited by suicidaleggroll; 06-04-2012 at 04:41 PM.
 
Old 06-04-2012, 04:43 PM   #3
vonedaddy
Member
 
Registered: Aug 2004
Location: Philadelphia,PA
Posts: 185

Original Poster
Rep: Reputation: 17
Quote:
Originally Posted by suicidaleggroll View Post
The wildcard in your grep (*.gz) is being expanded on the call.

I would use find:
Code:
find . -maxdepth 1 ! -iname "*.gz"
Thanks that works, but can you explain "expanded on the call" for me? I really like to learn the reason why.

Thanks!
 
Old 06-04-2012, 04:47 PM   #4
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,055

Rep: Reputation: Disabled
For grep, * means "the preceding item will be matched zero or more times".

Instead, write:
ls | grep -v gz$

You'll get all files whose name doesn't end in gz.

EDIT First sentence was wrong, corrected after reading post #8, thanks to David The H.

Last edited by Didier Spaier; 06-05-2012 at 02:00 PM. Reason: First sentence was wrong, corrected after reading post #8, thanks to David The H.
 
Old 06-04-2012, 04:47 PM   #5
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
When you execute the command "ls | grep -v *.gz", the shell expands *.gz to all of the .gz filenames in the cwd before it runs the command. In other words, grep does not receive "*.gz" as an argument; it receives the names of all the .gz files in the cwd. grep then uses the first of those names as its pattern and reads the remaining .gz files instead of the output of ls, which is where the "Binary file ... matches" lines come from. You would have to escape the * (\*), or put "*.gz" in quotes, for it to be passed to grep as you expect.

However, as Didier pointed out, "*.gz" doesn't mean the same thing to grep as it does to the shell; you would need "gz$" to do what you're looking for.
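
To see the expansion for yourself, here is a small sketch. It assumes a scratch directory and made-up file names (nothing here comes from the original poster's system), and it uses printf in place of the real command so the argument list becomes visible:

```shell
# Hypothetical file names in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bind.log bind.log.1.gz bind.log.2.gz

# printf prints one word per line, after the shell has expanded the glob.
unquoted=$(printf '%s\n' grep -v *.gz)
quoted=$(printf '%s\n' grep -v '*.gz')

printf 'unquoted:\n%s\n\nquoted:\n%s\n' "$unquoted" "$quoted"
```

In the unquoted case the words after -v are the .gz file names themselves; in the quoted case the single word *.gz is passed through untouched.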

Last edited by suicidaleggroll; 06-04-2012 at 04:51 PM.
 
Old 06-04-2012, 04:50 PM   #6
vonedaddy
Member
 
Registered: Aug 2004
Location: Philadelphia,PA
Posts: 185

Original Poster
Rep: Reputation: 17
Thanks for the explanations!
 
Old 06-04-2012, 05:01 PM   #7
em31amit
Member
 
Registered: Apr 2012
Location: /root
Distribution: Ubuntu, Redhat, Fedora, CentOS
Posts: 190

Rep: Reputation: 55
How about getting this done with only the "ls" command? GNU "ls" has rich features.


Code:
ls -l --ignore='*.gz'
 
Old 06-05-2012, 01:29 PM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.



In bash, you can also exclude files using a simple extended globbing rule.

Code:
shopt -s extglob	#It's not enabled by default
echo !(*.gz)		#Glob patterns can be used with almost any command,
			#as they're expanded by the shell before execution.
extended globbing
globbing
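
A rough, self-contained way to try the extended glob without changing your current shell's settings is sketched below. It assumes bash is installed, and the scratch directory and file names are invented for the demo. Starting bash with -O extglob switches the option on before the command string is parsed, which matters because bash has to know about the option before it reads a line containing !(...):

```shell
# Hypothetical file names in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bind.log bind.log.1 bind.log.120125.gz

# -O extglob enables the option before bash parses the -c string,
# so !(*.gz) expands to every name that does not match *.gz.
not_gz=$(bash -O extglob -c 'printf "%s\n" !(*.gz)')
printf '%s\n' "$not_gz"
```

In a script, the equivalent is to put "shopt -s extglob" on its own line ahead of the first use of the pattern.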


Quote:
Originally Posted by Didier Spaier View Post
For grep, * means "match what follows zero or more times".
Huh? It means no such thing. grep uses regular expressions in its pattern matching, and in regex, "*" means "match the previous character zero or more times". "*.gz" is a valid globbing pattern, but as a regular expression it does not do what you want: there is no previous character for the "*" to repeat, so grep's basic regular expressions treat the leading "*" as a literal asterisk. The regex equivalent for that glob pattern is "^.*\.gz$". "." in regex means "any character", "^" is "start of line", and "$" is "end of line". Note that the second period has to be backslash-escaped to make it literal.

Actually, the "^.*" part is really not necessary since the expression is anchored to the end of the line, so "\.gz$" is equivalent.

The "gz$" used above does also work for the most part, but do be aware that it will match any string that ends in "gz", e.g. "thingz".
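
The difference between the two patterns is easy to check with a few made-up names fed to grep on standard input (a sketch only; the names are hypothetical):

```shell
# Three sample names: a plain log, a gzipped log, and a name that merely ends in "gz".
names=$(printf '%s\n' bind.log bind.log.gz thingz)

# "gz$" drops anything ending in gz, including thingz.
loose=$(printf '%s\n' "$names" | grep -v 'gz$')

# "\.gz$" only drops names ending in a literal ".gz".
strict=$(printf '%s\n' "$names" | grep -v '\.gz$')

printf 'gz$ keeps:\n%s\n\n\\.gz$ keeps:\n%s\n' "$loose" "$strict"
```

With the loose pattern only bind.log survives; with the escaped dot, thingz is kept as well.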


Learning how to properly use regular expressions is one of the best bang-for-the-buck subjects you can spend your time on. A very large number of programs support, or even depend on, them.

Here are a few regular expressions tutorials:
http://mywiki.wooledge.org/RegularExpression
http://www.grymoire.com/Unix/Regular.html
http://www.regular-expressions.info/


Finally, be aware that parsing ls is generally not recommended. Use globbing patterns for simple file matching, and find for more complex ones, although it usually takes a bit more work to handle find's output safely.
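
As a minimal sketch of the globbing approach that works in any POSIX shell (no extglob needed), a plain loop with a case test can skip the compressed files. The scratch directory and file names are assumptions for the demo, and the printf line stands in for whatever you actually want to do with each file:

```shell
# Hypothetical file names, one of them containing a space.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch bind.log 'bind log.1' bind.log.gz

count=0
for f in *; do
    [ -f "$f" ] || continue      # skip anything that is not a regular file
    case $f in
        *.gz) continue ;;        # already compressed
    esac
    count=$((count + 1))
    printf 'not gzipped: %s\n' "$f"
done
```

Because the shell hands each name to the loop as one word, names with spaces or other odd characters need no special treatment as long as "$f" stays quoted.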

Edit: One more point. Unlike shell globbing, find searches recursively, so it would return all matching files in subdirectories as well. Use the -maxdepth option to restrict it to the current directory only, as demonstrated by suicidaleggroll. Also, be aware that the -name options use globbing patterns (but not bash's extended globs). There's a separate set of -regex options if you need more sophisticated pattern matching.
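
Putting those points together, one possible sketch is shown below. The directory layout and file names are invented for the demo, -type f is my addition to leave out "." and subdirectories, and printf stands in for the real command you would run on the files:

```shell
# Hypothetical layout: two uncompressed logs, one .gz, and a nested file.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir sub
touch bind.log 'bind log.1' bind.log.gz sub/nested.log

# -maxdepth 1 keeps find out of sub/; -type f drops "." and directories.
# -exec ... {} + passes the names as separate arguments, so a name
# containing spaces is never split into several words.
found=$(find . -maxdepth 1 -type f ! -name '*.gz' -exec printf '%s\n' {} + | LC_ALL=C sort)
printf '%s\n' "$found"
```

Letting find run the command itself with -exec is one way to sidestep the output-handling problems mentioned above, since the names are never turned into text that has to be parsed again.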

Here are a couple of good links about using find:
http://mywiki.wooledge.org/UsingFind
http://www.grymoire.com/Unix/Find.html

Last edited by David the H.; 06-05-2012 at 01:48 PM. Reason: additions & changes for clarity
 
Old 06-05-2012, 02:02 PM   #9
Didier Spaier
LQ Addict
 
Registered: Nov 2008
Location: Paris, France
Distribution: Slint64-15.0
Posts: 11,055

Rep: Reputation: Disabled
Thanks David, post #4 corrected.
 
  

