[SOLVED] How do I exclude multiple directories in awk with find?
Hello
I found a script on webmaster world that mostly does what I need it to, but have been making modifications to tailor it to my specific needs.
Here is the original command:
Code:
find . -type f | awk '!/\/\..*/ {dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",d,dir_list[d]}' | sort
I made it into a bash script that will monitor directories to let me know if they are getting too big.
Code:
#!/bin/bash
#dirwatch.sh
#set up the infinite loop
while true
do
#the important stuff
find . -mindepth 2 -type f | awk '!/\/\..*/ {dir=gensub(/(.+\/).+/,"\\1","g",$0); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",dir_list[d],d}'| sort -n
echo "";
#put the time/date stamp
date
echo ----------------;
#set how long to sleep between cycles
sleep 5
done
I know that /\/\..*/ tells awk to ignore hidden directories, but how do I define more directories to ignore (e.g. temp, var)? I've tried playing with -prune before the awk command with limited success. I know there are many ways to do the same thing, but I keep running into brick walls.
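For the -prune route, find can skip whole subtrees before awk ever sees them. A minimal sketch — the temp/var names and the /tmp/dw_demo scratch directory are just examples, not anything from the original script:

```shell
# Build a throwaway tree to demonstrate on.
rm -rf /tmp/dw_demo
mkdir -p /tmp/dw_demo/keep/sub /tmp/dw_demo/temp /tmp/dw_demo/var
touch /tmp/dw_demo/keep/sub/a.txt /tmp/dw_demo/temp/b.txt /tmp/dw_demo/var/c.txt
cd /tmp/dw_demo

# -prune stops find from descending into matching directories;
# everything else falls through to the -type f -print branch.
find . \( -name temp -o -name var \) -prune -o -type f -print
```

This prints only ./keep/sub/a.txt; the temp and var subtrees are never entered, which is cheaper than filtering their contents afterwards in awk.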
Just thought I would point out some redundancies in your awk:
1. dir=gensub(/(.+\/).+/,"\\1","g",$0) - sub, gsub and gensub operate on $0 unless otherwise specified, so the final argument is not required here.
2. dir_list[dir]++ - as dir is only used in this one spot, you could easily combine this with the previous step, ie dir_list[gensub(/(.+\/).+/,"\\1","g")]++
3. printf "%s %s\n",dir_list[d],d - two things on this one: a. as you are only appending a newline, you may as well use print; b. by using print, and because you have not changed the output field separator (OFS), you can use a comma to achieve the space, like so - print dir_list[d],d
Thanks for the help, guys. Colucix, I ended up adding && clauses to the awk statement to exclude directories. I also found out how to handle a directory name containing a space:
&&!/\/My\ Documents/
Grail, ended up incorporating all of your changes to neaten up the code. Thanks again
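Putting the && clauses together, the filter stage ends up looking something like this sketch (the proj/temp/"My Documents" names and /tmp/dw_fin path are illustrative only):

```shell
# Throwaway tree with one directory to keep and two to exclude.
rm -rf /tmp/dw_fin
mkdir -p /tmp/dw_fin/proj /tmp/dw_fin/temp "/tmp/dw_fin/My Documents"
touch /tmp/dw_fin/proj/x /tmp/dw_fin/temp/y "/tmp/dw_fin/My Documents/z"
cd /tmp/dw_fin

# Each && clause vetoes one directory pattern; a space is literal
# inside an awk regex constant, so no escaping is needed there.
find . -mindepth 2 -type f |
  awk '!/\/\..*/ && !/\/temp\// && !/\/My Documents\//'
```

Only ./proj/x survives the filter; the hidden-directory rule, the temp rule, and the space-containing rule each knock out their own paths.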
Sorry, I was a bit sleepy when I looked at this, and I do now have a suggestion for your actual question: you should just be able to use the pipe alternator in your regex.
That is with the escaped dot; otherwise ..* matches any sequence of one or more characters (so that every file would be excluded), since an unescaped dot means any single character. With the escape, the regexp matches any file or directory name that begins with a literal dot.
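A sketch of the alternation approach — one regex covering hidden directories plus any named exclusions (the temp/var names and /tmp/dw_alt tree are examples, not from the thread):

```shell
# Throwaway tree: one directory to keep, three to exclude.
rm -rf /tmp/dw_alt
mkdir -p /tmp/dw_alt/ok /tmp/dw_alt/temp /tmp/dw_alt/var /tmp/dw_alt/.hid
touch /tmp/dw_alt/ok/a /tmp/dw_alt/temp/b /tmp/dw_alt/var/c /tmp/dw_alt/.hid/d
cd /tmp/dw_alt

# The | alternator packs all exclusions into one group;
# \. keeps the dot literal so only dot-prefixed names match it.
find . -mindepth 2 -type f | awk '!/\/(\..*|temp|var)\//'
```

Only ./ok/a is printed; adding another directory to ignore is just one more |name inside the group.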
As for the backslash before the space: quoting (or escaping) prevents word splitting when the shell encounters a space character (or a tab or a newline) in a string. In other words, the space is interpreted literally as part of the string and not as a field separator.