Old 10-01-2008, 09:40 AM   #1
LQ Newbie
Registered: Oct 2008
Posts: 2

Rep: Reputation: 0
Script to find file differences in two directory trees (bash)

I'm new to bash scripting, and I'll try to describe my problem as succinctly as possible.

I have two directory trees, which have the same structure. These trees go 6 or 7 levels deep, with a mixture of files/folders at every level (until the final level obviously). Let's call these trees dir1 and dir2.

I want to write a script that greps each file in dir1 against its counterpart in dir2, and then puts each file's results into a third folder, say /grepped.

If a file exists in dir1, but not in dir2 (or vice versa), then a grepped file should exist in /grepped with just the line "/dir1/foo/fileName.txt does not exist in /dir2" - or something equivalent.

So how do I traverse the directory tree? I can use test -d to figure out if something is a file or a directory, but then I'm not sure how to actually move around the tree.

Additionally, I searched google and this site for a solution to anything similar to this and was unable to find it. If someone has a good link, please send it my way.

Old 10-01-2008, 10:26 AM   #2
Registered: May 2003
Distribution: slack,gentoo
Posts: 57

Rep: Reputation: 16
Hey, it's quite easy. Forget about the tree and treat it as text, line by line.
Just create dir1.txt and dir2.txt: run find ./ > ../dir1.txt with dir1 as the working directory (so the list lands outside the tree), and do the same for dir2.

Then, from the directory that holds both lists, you can do something like

while IFS= read -r line; do
    grep -Fx -- "$line" dir2.txt >> grepped.txt
done < dir1.txt

I guess you know what to do next. Let me know if you need additional help.
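
For anyone following along, the list-based idea in this post can be sketched roughly as below. This is not the poster's exact method: it swaps the grep loop for sort and comm, which report in one pass which paths appear in only one list. The demo trees and file names are made up for illustration.

```shell
#!/bin/bash
# Sketch: list both trees as relative paths, then compare the sorted lists.
# The demo trees below are hypothetical, created only for this example.
export LC_ALL=C                      # same collation for sort and comm
work=$(mktemp -d)
mkdir -p "$work/dir1/a" "$work/dir2/a"
echo one   > "$work/dir1/a/common.txt"
echo one   > "$work/dir2/a/common.txt"
echo two   > "$work/dir1/a/only1.txt"
echo three > "$work/dir2/only2.txt"

# List regular files relative to each tree root; comm needs sorted input.
(cd "$work/dir1" && find . -type f | sort) > "$work/dir1.txt"
(cd "$work/dir2" && find . -type f | sort) > "$work/dir2.txt"

# comm -23: lines only in the first list; -13: only in the second; -12: in both.
only_in_dir1=$(comm -23 "$work/dir1.txt" "$work/dir2.txt")
only_in_dir2=$(comm -13 "$work/dir1.txt" "$work/dir2.txt")
in_both=$(comm -12 "$work/dir1.txt" "$work/dir2.txt")

echo "Only in dir1: $only_in_dir1"
echo "Only in dir2: $only_in_dir2"
echo "In both:      $in_both"
rm -rf "$work"
```

The paths in the "in both" list are the ones worth feeding to diff or grep afterwards.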
Old 10-01-2008, 03:43 PM   #3
LQ Newbie
Registered: Oct 2008
Posts: 2

Original Poster
Rep: Reputation: 0

#!/bin/bash
## This script takes two directory trees and creates three output types:
##   1. *.diffed, which hold the diff results if both dir1 and dir2 contain the file
##   2. lonely.dir1, which lists all the files present in dir1 but not dir2
##   3. lonely.dir2, which lists all the files present in dir2 but not dir1
## It deletes -> recreates a directory diffed/ in your run location

echo "Comparing $1 with $2........"

## Has to track the current directory
homedir=$(pwd)
cd "$1" || exit 1
all_files=$(find * -type f)
cd "$homedir" || exit 1

## Could put some sort of warning here
if [ -d "diffed" ]; then
   rm -r diffed
fi
mkdir diffed

## Note: looping over $all_files splits on whitespace, so file names with spaces will break
for f in $all_files; do

   ## Have to remove the /'s from $f for naming (g = every slash, not just the first)
   fslashesremoved=$(echo "$f" | sed 's_/__g')

   if [ -f "$1/$f" ]; then
      if [ -f "$2/$f" ]; then
         ## Have to check if there is a difference between files
         diff "$1/$f" "$2/$f" > /dev/null
         if [ $? != 0 ]; then
            ## echo "Writing diff between $1/$f and $2/$f"
            diff "$1/$f" "$2/$f" > "diffed/$fslashesremoved.diffed"
         fi
      else
         echo "$f: present in $1, but not in $2" >> diffed/lonely.dir1
      fi
   fi
done

cd "$2" || exit 1
extra_files=$(find * -type f)
cd "$homedir" || exit 1

## Now have to do the reverse from $2 to $1, but only have to check if they are present or not

for f in $extra_files; do

   if [ -f "$2/$f" ]; then
      if [ -f "$1/$f" ]; then
         ## Have to figure out how not to do something here
         echo stuff > /dev/null
      else
         echo "$f: present in $2, but not in $1" >> diffed/lonely.dir2
      fi
   fi
done

## Now we have a diffed directory that has lots of files
## What should we remove from them?
## Should also remove the *.*~ from this
Slano, it turns out I need a bit more functionality, since I want a list of the files that are missing from either directory. (But if I didn't need this, I would have done it your way.)

Here is what I actually settled on. I know it's very rough, but it works so far! I'm still trying to figure out how to say in code "if you don't find the file in the first directory, and it is present in the second directory". That's why you see the hack with echo stuff > /dev/null.

Also, I need to ignore all temporary files.
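
A possible way to handle both open points, offered as a sketch rather than a drop-in fix: the test command accepts "!" to negate a condition, so no empty branch is needed, and find can skip backup files with ! -name '*~'. The demo directories and file names below are invented for the example, and it assumes "temporary files" means names ending in "~".

```shell
#!/bin/bash
# Sketch: report files that are in the first tree but missing from the second,
# skipping editor backup files ending in "~". Demo data is hypothetical.
work=$(mktemp -d)
mkdir -p "$work/dir1/sub" "$work/dir2/sub"
echo a > "$work/dir1/sub/kept.txt"
echo a > "$work/dir2/sub/kept.txt"
echo b > "$work/dir1/sub/missing.txt"
echo c > "$work/dir1/sub/notes.txt~"

lonely=$(
  cd "$work/dir1" || exit 1
  # ! -name '*~' drops temporary files before the loop ever sees them.
  find . -type f ! -name '*~' | while IFS= read -r f; do
    # "!" negates the test: true when the counterpart is NOT a regular file.
    if [ ! -f "$work/dir2/$f" ]; then
      echo "$f: present in dir1, but not in dir2"
    fi
  done
)
echo "$lonely"
rm -rf "$work"
```

Reading the find output line by line with read -r also copes with spaces in file names, which a plain for loop over the list does not.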

Last edited by Syqers; 10-01-2008 at 03:46 PM.
Old 10-02-2008, 12:55 AM   #4
Mr. C.
Senior Member
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 61
Does the basis of diff -r not provide what you need?

It tells you:

a) which files exist in only one or the other directory tree
b) the differences between two corresponding files.

$ diff -r level0*
Only in level0.mirror/a/print: file
Only in level0/b/print: file
diff -r level0/bak/d/dir/file1 level0.mirror/bak/d/dir/file1
> I'm different
You can parse the output, since diff always prints lines in the formats shown above (e.g. "Only in ...", "> ..."), or you can specify your own output format.

You can also use the -q option to give you easier parsing:

$ diff -qr level0*
Only in level0.mirror/a/print: file
Only in level0/b/print: file
Files level0/bak/d/dir/file1 and level0.mirror/bak/d/dir/file1 differ
You can then run your own diffs on the files named in the "Files ... differ" lines.
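
One way to act on that output, as a rough sketch: split the diff -qr report into the "Only in" lines and the differing pairs, then diff each pair in full. The demo trees are made up, the variable names are just for illustration, and the sed pattern assumes no file name contains spaces or the text " and ".

```shell
#!/bin/bash
# Sketch: split "diff -qr" output into lonely files and differing pairs.
export LC_ALL=C   # keep diff's messages in English so the patterns match
work=$(mktemp -d)
mkdir -p "$work/level0/a" "$work/level0.mirror/a"
echo same > "$work/level0/a/same";    echo same > "$work/level0.mirror/a/same"
echo old  > "$work/level0/a/changed"; echo new  > "$work/level0.mirror/a/changed"
echo x    > "$work/level0/a/only_here"

# diff exits non-zero when the trees differ, which is expected here.
report=$(cd "$work" && diff -qr level0 level0.mirror) || true

# Lines starting with "Only in" name files present in just one tree.
lonely=$(printf '%s\n' "$report" | grep '^Only in ')

# "Files A and B differ" lines: keep just "A B".
differing=$(printf '%s\n' "$report" |
  sed -n 's/^Files \(.*\) and \(.*\) differ$/\1 \2/p')

echo "$lonely"
# Run a full diff on each differing pair.
printf '%s\n' "$differing" | while read -r left right; do
  (cd "$work" && diff "$left" "$right") || true
done
rm -rf "$work"
```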

Diff has plenty of good options - be sure to review the man page.
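
As one example of those options, and assuming GNU diff, -x PATTERN skips files whose base name matches the pattern, which may cover the wish to ignore temporary files. A small sketch with made-up demo directories:

```shell
#!/bin/bash
# Sketch: compare two trees with and without excluding "~" backup files.
export LC_ALL=C   # keep diff's messages in English
work=$(mktemp -d)
mkdir -p "$work/dir1" "$work/dir2"
echo a   > "$work/dir1/file"
echo a   > "$work/dir2/file"
echo tmp > "$work/dir1/file~"

# Without -x the backup file is reported; diff exits non-zero on differences.
with_tmp=$(cd "$work" && diff -qr dir1 dir2) || true

# With -x '*~' any file whose base name ends in "~" is ignored.
without_tmp=$(cd "$work" && diff -qr -x '*~' dir1 dir2) || true

echo "Without -x: $with_tmp"
echo "With -x:    $without_tmp"
rm -rf "$work"
```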

