LinuxQuestions.org


Telengard 04-19-2011 05:58 PM

Bash, find : How to avoid [...] pattern matching in file names expanded from "$var"?
 
I have two directories, dir1 and dir2. There are about 450 files in each directory. The files in dir2 should have exactly the same names as the files in dir1.

Some of the file names have [...] in them. I can't change the names of the files due to existing naming conventions.

My goal is to find files in dir2 with the same name as files in dir1 and execute commands on them. What follows is an abbreviated example of my failure until now.

Code:

foo$ ls -R
.:
dir1  dir2

./dir1:
file1  file2  file[a1]

./dir2:
file1  file2  file[a1]
foo$ cd dir1
dir1$ for src in * ; do find ../dir2/ -name "$src" ; done
../dir2/file1
../dir2/file2
../dir2/file1
dir1$

The file ../dir2/file[a1] was not listed, and ../dir2/file1 was listed twice! :(

The expansions of $src are as follows.

Code:

dir1$ for src in * ; do echo "$src" ; done
file1
file2
file[a1]
dir1$

find -name obeys shell pattern matching rules. When it sees the pattern file[a1], it treats [a1] as a bracket expression, so it looks for filea and file1 rather than the literal name file[a1].
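
A minimal sketch of that behaviour, run in a throwaway directory. The scratch directory and the extra file filea are assumptions made for illustration; they are not part of the setup above.

```shell
#!/bin/bash
# Scratch directory with made-up file names, for illustration only.
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/filea" "$tmp/file[a1]"

# -name treats its argument as a glob, so [a1] is a bracket expression
# that matches exactly one character: 'a' or '1'.
matched=$(find "$tmp" -type f -name 'file[a1]' | sed 's|.*/||' | LC_ALL=C sort)
printf '%s\n' "$matched"

rm -rf "$tmp"
```

The file literally named file[a1] is not in the result, because the pattern only describes five-character names ending in a or 1.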

I want to stop the pattern matching and treat the expansion of "$src" as a literal. How?

Edit
The reason I want to use find is because some of the files in dir2 are nested in subdirectories which don't exist in dir1. So I'm trying to match files like this:

./dir1/file1 -> ./dir2/subdir1/file1
./dir1/file2 -> ./dir2/subdir2/file2
./dir1/file[a1] -> ./dir2/file[a1]

There is no way for me to predict in advance whether or not the duplicate file in dir2 is in a subdirectory or how many levels deep it is nested.

estabroo 04-19-2011 06:32 PM

for src in * ; do [ -f "../dir2/$src" ] && echo "$src" ; done

Telengard 04-19-2011 06:48 PM

Quote:

Originally Posted by estabroo (Post 4329920)
for src in * ; do [ -f "../dir2/$src" ] && echo "$src" ; done

That would solve the problem in the example. The reason I want to use find is because the files in dir2 are nested within subdirectories which don't exist in dir1.

Sorry I didn't make that clear in my OP :o

estabroo 04-19-2011 07:47 PM

Well, this uses a find:

perl -MFile::Find -e 'map { $files{$_} = $_ } @ARGV; find({ wanted => sub {print "$_\n" if exists $files{$_}}}, "../dir2")' *

If you need the full path in ../dir2, then use this instead:
perl -MFile::Find -e 'map { $files{$_} = $_ } @ARGV; find({ wanted => sub {print "$File::Find::name\n" if exists $files{$_}}}, "../dir2")' *

Telengard 04-19-2011 08:46 PM

perl? Are you serious? What did I ever do to you?
:p

I'd much rather have a solution I can understand. I'd prefer one which doesn't require me to abandon Bash or GNU find. Most importantly, I'd really like to know an answer to the question posed in the OP.

Thanks for your replies so far, BTW. I do appreciate it :)

grail 04-19-2011 09:53 PM

How about:
Code:

$ set -f
$ for i in *; do find ../dir2/ -type f -name $i; done
../dir2/file[a1]
../dir2/file1
../dir2/file2
$ set +f

You need to include -type f because we have switched off globbing, and otherwise the directory itself would be included.

konsolebox 04-20-2011 01:41 AM

I guess you have no choice but to escape the glob characters found in the file names, i.e. process your src variable like this:
Code:

#!/bin/bash
for src in *; do
    src=${src//\*/\\\*}
    src=${src//\?/\\\?}
    src=${src//\[/\\\[}
    src=${src//\]/\\\]}
    find ../dir2/ -type f -name "$src"
done
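
The same idea can be wrapped in a function. This is only a sketch: glob_escape is a hypothetical helper name, and the scratch files are made up for the demonstration. It also escapes the backslash, which is special in find -name patterns as well, and it leaves ] alone because a closing bracket without an opening one is not special.

```shell
#!/bin/bash
# glob_escape is a hypothetical helper, not part of bash or find.
# It prints its argument with glob metacharacters backslash-escaped.
glob_escape() {
    local s=$1
    s=${s//\\/\\\\}   # backslash first, so later escapes are not doubled
    s=${s//\*/\\*}
    s=${s//\?/\\?}
    s=${s//\[/\\[}
    printf '%s' "$s"
}

# Demonstration in a scratch directory with made-up file names.
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file[a1]" "$tmp/what?" "$tmp/whatX"

found=$(for name in 'file[a1]' 'what?'; do
    find "$tmp" -type f -name "$(glob_escape "$name")" | sed 's|.*/||'
done)
printf '%s\n' "$found"

rm -rf "$tmp"
```

Only the two literal names are printed; file1 and whatX are no longer matched by accident.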


bigearsbilly 04-20-2011 11:11 AM

how about?

Code:


cd dir1
find /full/path/to/dir2 -type f -exec basename {} \; |

while read f; do
    [ -f "$f" ] && command "$f"
done


estabroo 04-20-2011 07:15 PM

Quote:

Originally Posted by Telengard (Post 4330017)
perl? Are you serious? What did I ever do to you?
:p

I'd much rather have a solution I can understand. I'd prefer one which doesn't require me to abandon Bash or GNU find. Most importantly, I'd really like to know an answer to the question posed in the OP.

Thanks for your replies so far, BTW. I do appreciate it :)

LOL, I can understand your reluctance, but perl users are like zombies, they want everyone else to be a zombie too (and eat braaaaiiinnns).

Having said that (join us), it doesn't require you to abandon bash, using a perl script or one-liner is the same as using find, sed, awk, whatever. You can just run it in a subshell and capture/use the output (really just join us).

Code:

#!/bin/bash

other_dir="../dir2"

for f in $(perl -MFile::Find  -e 'map { $files{$_} = $_ } @ARGV; find({ wanted => sub {print "$File::Find::name\n" if exists $files{$_}}}, '"'$other_dir'"')' *) ; do
  some_command $f
done


(see how easy it is to join us)

;)

grail 04-20-2011 07:52 PM

bloody perlites taking over the world ... hehe .. jk

Telengard 04-21-2011 12:10 AM

Sorry, but that's not the problem.
 
Quote:

Originally Posted by grail (Post 4330051)
Code:

$ set -f
$ for i in *; do find ../dir2/ -type f -name $i; done
../dir2/file[a1]
../dir2/file1
../dir2/file2
$ set +f


/me :study:
4.3.1 The Set Builtin - Bash Reference Manual

This doesn't work because the files listed are simply the contents of dir2 (and its subdirectories). The globbing is now being done by find instead of bash. I did a sample run with a unique file name in dir2 to show this.

Code:

foo$ ls -R
.:
dir1  dir2

./dir1:
file1  file2  file[a1]

./dir2:
file[a1]  ONLY_IN_DIR2  subdir1  subdir2

./dir2/subdir1:
file1

./dir2/subdir2:
file2
foo$ cd dir1/
dir1$ set -f
dir1$ for i in * ; do find ../dir2/ -type f -name $i ; done
../dir2/file[a1]
../dir2/subdir2/file2
../dir2/ONLY_IN_DIR2
../dir2/subdir1/file1
dir1$ set +f
dir1$

  • set -f turns globbing off
  • for i in * opens a loop which, on its first and only iteration, places an unaltered * character into variable i
  • find ../dir2/ -type f -name $i expands to find ../dir2/ -type f -name *
  • find internally performs the * pattern match against all file names from dir2 and prints the matching file names
  • done
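
The first two steps of that list can be checked in isolation. A small sketch, assuming a scratch directory with two made-up files:

```shell
#!/bin/bash
# Scratch directory with made-up files, for illustration only.
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file2"
cd "$tmp" || exit 1

set -f                  # disable pathname expansion (globbing)
count=0
for i in *; do          # the * is no longer expanded to file names
    count=$((count + 1))
    last=$i
done
set +f                  # restore pathname expansion

printf 'iterations=%d value=%s\n' "$count" "$last"

cd - >/dev/null || exit 1
rm -rf "$tmp"
```

The loop body runs once and the variable holds a bare *, which is then what find receives as its -name pattern.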

I want to find files in dir2 (and its subdirectories) with the same names as files in dir1 (with no subdirectories) and then perform commands on them.

This doesn't work either.

Code:

dir1$ for i in * ; do set -f ; find ../dir2/ -type f -name "$i" ; done ; set +f
../dir2/subdir1/file1
../dir2/subdir2/file2
../dir2/subdir1/file1
dir1$

It is clear that even though bash globbing is disabled, find still performs pattern matching internally. This leads me to believe that I need to rethink the way I'm using find. I don't think there is any way to disable pattern matching with the name tests in find.

Thank you for showing me how to disable globbing in Bash though :)

Telengard 04-21-2011 12:22 AM

This works great!
 
Quote:

Originally Posted by konsolebox (Post 4330194)
Code:

#!/bin/bash
for src in *; do
    # no need to worry about * or ? in my file names
    src=${src//\[/\\\[}
    src=${src//\]/\\\]}
    find ../dir2/ -type f -name "$src"
done


/me :study:
3.5.3 Shell Parameter Expansion - Bash Reference Manual

Your solution directly addresses the problem as explained in my OP. I have to give you props for reading carefully and understanding my help request even though I should have explained better. :hattip:

Code:

foo$ ls -R
.:
dir1  dir2

./dir1:
file1  file2  file[a1]  messed[[up]name  ONLY_IN_DIR1

./dir2:
ONLY_IN_DIR2  subdir1  subdir2

./dir2/subdir1:
file1

./dir2/subdir2:
deepdir  file2  file[a1]

./dir2/subdir2/deepdir:
messed[[up]name
foo$ cd dir1
dir1$ for src in *; do
> src=${src//\[/\\\[}
> src=${src//\]/\\\]}
> find ../dir2/ -type f -name "$src"
> done
../dir2/subdir1/file1
../dir2/subdir2/file2
../dir2/subdir2/file[a1]
../dir2/subdir2/deepdir/messed[[up]name
dir1$

Seems to work perfectly! (I'm excited because this is the first time I've found a real use for this form of parameter expansion.) I think I'd like to use fewer backslashes though.

Code:

dir1$ for src in * ; do src="${src//[/\[}" ; find ../dir2/ -type f -name "${src//]/\]}" -exec echo '{}' \; ; done
../dir2/subdir1/file1
../dir2/subdir2/file2
../dir2/subdir2/file[a1]
../dir2/subdir2/deepdir/messed[[up]name
dir1$

I have also learned that it is only necessary to escape the [ (opening bracket), because without it the ] (closing bracket) is non-special.

Code:

dir1$ for src in * ; do find ../dir2/ -type f -name "${src//[/\[}" -exec echo '{}' \; ; done
../dir2/subdir1/file1
../dir2/subdir2/file2
../dir2/subdir2/file[a1]
../dir2/subdir2/deepdir/messed[[up]name
dir1$

I do believe this is the solution I shall use :)
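
For reference, a self-contained sketch of the opening-bracket-only escape, using a scratch directory and made-up copies of the file names above. It assumes, as stated earlier, that the names contain no *, ? or backslash, since those would still be treated as pattern characters.

```shell
#!/bin/bash
# Scratch directory with made-up files, for illustration only.
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file[a1]" "$tmp/messed[[up]name"

found=$(for src in 'file[a1]' 'messed[[up]name'; do
    # Escape only the opening bracket; a lone ] is not special.
    pat=${src//\[/\\[}
    find "$tmp" -type f -name "$pat" | sed 's|.*/||'
done)
printf '%s\n' "$found"

rm -rf "$tmp"
```

file1 is present in the scratch directory but is not printed, which shows that file[a1] is now matched literally.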

Telengard 04-21-2011 12:41 AM

This works too!
 
Quote:

Originally Posted by bigearsbilly (Post 4330628)
Code:

cd dir1
find /full/path/to/dir2 -type f -exec basename {} \; |

while read f; do
    [ -f "$f" ] && command "$f"
done


This would work fine, except that it executes commands on files in dir1.

Code:

dir1$ find /tmp/foo/dir2 -type f -exec basename {} \; | while read f; do [ -f "$f" ] && echo $(dirname "$f")/$(basename "$f"); done
./file2
./file1
./messed[[up]name
./file1
dir1$

I want to perform commands on files in the dir2 hierarchy.

Code:

dir1$ find /tmp/foo/dir2 -type f |
> while read f ; do
> [ -f "$(basename "$f")" ] &&
> echo $(dirname "$f")/$(basename "$f")
> done
/tmp/foo/dir2/subdir2/file2
/tmp/foo/dir2/subdir2/file[a1]
/tmp/foo/dir2/subdir2/deepdir/messed[[up]name
/tmp/foo/dir2/subdir1/file1
dir1$

That's not a bad solution at all. It is exactly the kind of re-thinking I referred to in my reply to grail's post. It seems pretty obvious now, and I don't know why I didn't think of it. :doh:

I wonder what the limits or downside might be of piping the output of find into a loop?
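
Some known limits of that pattern: read without -r interprets backslashes, the default IFS strips leading and trailing whitespace from each line, a newline inside a file name splits it into two records, and a loop on the right-hand side of a pipe runs in a subshell, so variables set inside it are lost afterwards. A hedged sketch that avoids these, assuming bash and a find that supports -print0 (GNU and BSD find do); the directory layout is made up for illustration:

```shell
#!/bin/bash
# Scratch layout with made-up names, for illustration only.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1" "$tmp/dir2/sub dir"
touch "$tmp/dir1/file[a1]" "$tmp/dir1/two words" "$tmp/dir1/ONLY_IN_DIR1"
touch "$tmp/dir2/sub dir/file[a1]" "$tmp/dir2/two words" "$tmp/dir2/ONLY_IN_DIR2"

matches=()
# NUL-delimited records survive spaces, backslashes and newlines in names.
# Process substitution keeps the loop in the current shell, so the
# matches array is still populated after the loop ends.
while IFS= read -r -d '' path; do
    name=${path##*/}                     # basename without forking
    if [ -f "$tmp/dir1/$name" ]; then    # literal comparison, no globbing
        matches+=("$path")
    fi
done < <(find "$tmp/dir2" -type f -print0)

printf '%s\n' "${matches[@]}"
rm -rf "$tmp"
```

The order of the printed paths depends on how find walks the directory, so it should not be relied on.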

I'm marking this thread solved now. If anyone else has more interesting or more graceful solutions, then please feel free to add them here.

Telengard 04-21-2011 01:00 AM

That's just crazy talk
 
Quote:

Originally Posted by estabroo (Post 4331045)
Code:

for f in $(perl -MFile::Find  -e 'map { $files{$_} = $_ } @ARGV; find({ wanted => sub {print "$File::Find::name\n" if exists $files{$_}}}, '"'$other_dir'"')' *) ; do
  some_command $f
done


OMG, I think I just went permanently cross-eyed! :cry:

I wrote a few perl scripts in college. Still have the book in the basement somewhere. Maybe some day I'll dig it out and give it another go. Most likely not, though. :rolleyes:

Wonder if the Ruby and Python camps will be chipping in too?

grail 04-21-2011 02:03 AM

Well, here is something a little more extreme, but it will cover more than just []:
Code:

while read -r FILE
do
    find dir2 -type f -name "$FILE"
done< <(find dir1 -type f -printf "%f\n" | sed 's/[^[:alnum:]]/\\&/g')


