LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Old 05-23-2009, 05:00 AM   #1
rohits1991
LQ Newbie
 
Registered: May 2009
Posts: 5

Rep: Reputation: Disabled
Shell Scripting : compare two files & append


I need to write a shell script to compare two files and display the result. If the two files are different, append them and store the result in a new file. How do I proceed? Can someone give me some code?
 
Old 05-23-2009, 05:19 AM   #2
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
man diff
 
Old 05-23-2009, 05:32 AM   #3
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Also
Code:
man comm
 
Old 05-23-2009, 05:42 AM   #4
jamescondron
Member
 
Registered: Jul 2007
Location: Scunthorpe, UK
Distribution: Ubuntu 8.10; Gentoo; Debian Lenny
Posts: 961

Rep: Reputation: 70
I would have thought it'd be far easier just to do it by hand, since then you don't have to worry about any error checking, but if that's how you want it...:

Code:
#!/bin/bash
#
# USAGE: ./script.sh /path/to/file/one /path/to/file/two

TEMP=temp_file

# Require exactly two arguments
if (( $# != 2 )); then
    echo "Usage: $0 /path/to/file/one /path/to/file/two" >&2
    exit 1
fi

# Save the differences, then count how many lines diff produced.
# Reading from stdin makes wc print only the number, not the file name.
diff "$1" "$2" > "$TEMP"
COUNT=$(wc -l < "$TEMP")

echo "$COUNT"

if (( COUNT == 0 )); then
    echo "No differences exist"
    rm "$TEMP"
else
    # Differences exist
    echo "Differences exist. Where would you like the differences saved?"
    read -r OUTPUT
    mv "$TEMP" "$OUTPUT"
fi
If you want to merge differences, use the 'patch' command. The thing is, if you just want to see the differences on stdout, just use 'diff'; check its man page if you can't guess the syntax from looking at this.
 
Old 05-25-2009, 08:03 AM   #5
rohits1991
LQ Newbie
 
Registered: May 2009
Posts: 5

Original Poster
Rep: Reputation: Disabled
Well, here's what I finally did:

c=$(diff file1 file2 | wc -l)
if ((c > 0))
then cat file1 file2 >> file3
else echo "The files are the same"
fi


It turned out easier than I had expected.
 
Old 05-25-2009, 08:20 AM   #6
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,083

Rep: Reputation: 405
Actually, it can be even simpler. All you need to do is look at the exit status of diff; there's no need to count the lines of its output with wc. Since the exit status is zero if the files are the same (and in the shell 0 means true), you need to negate the test with ! to detect a difference.

Sample code:

Code:
if ! diff file1 file2 > /dev/null; then
  echo diff
else
  echo not_diff
fi
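
For reference, diff has three kinds of exit status: 0 when the files are identical, 1 when they differ, and a value greater than 1 when something went wrong (for example, a missing file). With "! diff", an error is therefore reported as "different" too. Here is a minimal sketch that tells the three cases apart; the function name compare_files and the temporary files are made up for illustration:

```shell
#!/bin/bash
# compare_files is a hypothetical helper, not something diff provides.
compare_files() {
    diff "$1" "$2" > /dev/null 2>&1
    case $? in
        0) echo "same" ;;
        1) echo "different" ;;
        *) echo "error" ;;      # trouble, e.g. a file that does not exist
    esac
}

# Throwaway sample files for the demonstration
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'one\n' > "$dir/a"
printf 'one\n' > "$dir/b"
printf 'two\n' > "$dir/c"

compare_files "$dir/a" "$dir/b"        # identical contents
compare_files "$dir/a" "$dir/c"        # contents differ
compare_files "$dir/a" "$dir/missing"  # second file does not exist
```

Whether the error case deserves its own branch depends on the script; for a quick check between two files you know exist, the plain if/else above is enough.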
 
Old 05-26-2009, 10:14 AM   #7
rohits1991
LQ Newbie
 
Registered: May 2009
Posts: 5

Original Poster
Rep: Reputation: Disabled
Oh, thanks... I am not familiar with dev... I am an amateur, very new to shell...
 
Old 05-26-2009, 10:21 AM   #8
bjorke
LQ Newbie
 
Registered: May 2009
Location: Silicon Valley
Distribution: ubuntu, xandros, maemo, angstrom, android
Posts: 4

Rep: Reputation: 0
I'm curious about your use of ">>", since you say "append them and store them in a new file" -- your usage here implies (though it does not require) that "file3" already exists, and that you are adding the contents of files 1 and 2 to it.

Did you really mean "concatenate files 1 and 2 together and save the result as a new file"? Because then I think you may be better off using ">", which guarantees that the contents of "file3" are *only* what came from files 1 and 2.
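
The difference is easy to see with a small experiment. This is only a sketch: the file contents are invented, and everything happens in a temporary directory so nothing of yours is touched.

```shell
#!/bin/bash
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1

printf 'A\n' > file1
printf 'B\n' > file2
printf 'old\n' > file3          # file3 already exists and has content

cat file1 file2 >> file3        # append: the old line is kept
after_append=$(cat file3)

cat file1 file2 > file3         # truncate first: only file1 + file2 remain
after_overwrite=$(cat file3)

echo "--- after >>"
echo "$after_append"
echo "--- after >"
echo "$after_overwrite"
```

After ">>" the file holds old, A and B; after ">" it holds only A and B.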
 
Old 05-26-2009, 10:28 AM   #9
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,083

Rep: Reputation: 405
Quote:
Originally Posted by rohits1991 View Post
Oh, thanks... I am not familiar with dev... I am an amateur, very new to shell...
The redirection to /dev/null is not relevant to the issue; you could very well omit it and still get the same result. It just keeps the diff output from being shown on your screen, which is convenient in a script. You could do without it if you don't care:

Code:
if ! diff file1 file2; then
  echo diff
else
  echo not_diff
fi
The important thing is that you don't need to count the lines with wc, as I said. Just the exit status of the diff command is enough to decide if the files are different or not.
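
Putting the pieces of this thread together, the original task (compare, and combine the files only if they differ) could look roughly like the sketch below. The function name merge_if_different and the sample files are hypothetical, and it uses ">" on the assumption that the output should contain only the two inputs.

```shell
#!/bin/bash
# merge_if_different is a made-up name; arguments are FILE1 FILE2 OUTPUT.
merge_if_different() {
    if diff "$1" "$2" > /dev/null; then
        echo "The files are the same"
    else
        # '>' so that OUTPUT holds only FILE1 followed by FILE2
        cat "$1" "$2" > "$3"
        echo "The files differ; both were written to $3"
    fi
}

# Throwaway sample files for the demonstration
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'alpha\n' > "$dir/file1"
printf 'beta\n'  > "$dir/file2"

merge_if_different "$dir/file1" "$dir/file1" "$dir/unused"  # same file twice
merge_if_different "$dir/file1" "$dir/file2" "$dir/file3"   # different files
cat "$dir/file3"
```

Note that a diff failure (a missing input, say) also ends up in the else branch here, so a more careful script would check for that separately.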
 
Old 05-27-2009, 01:55 AM   #10
rohits1991
LQ Newbie
 
Registered: May 2009
Posts: 5

Original Poster
Rep: Reputation: Disabled
@bjorke,
Yeah, I thought of it... but my idea was that if I use '>>', I can perform both operations, depending on what I am asked to do...


@i92guboj
In any case, with either wc -l or /dev/null, the length of the code is not affected...
As a student of shell, I have nothing to think about beyond that.
Anyway, what are the differences in terms of the resources required to run the two versions: the one you suggested and the one I wrote?

Last edited by rohits1991; 05-27-2009 at 02:01 AM.
 
Old 05-27-2009, 10:18 AM   #11
i92guboj
Gentoo support team
 
Registered: May 2008
Location: Lucena, Córdoba (Spain)
Distribution: Gentoo
Posts: 4,083

Rep: Reputation: 405
Quote:
Originally Posted by rohits1991 View Post
@i92guboj
In any case, with either wc -l or /dev/null, the length of the code is not affected...
As a student of shell, I have nothing to think about beyond that.
You should.

Again, null is irrelevant here. You can eliminate it if the length of the code is your maximum priority.

When you use | wc, this is what happens:
  • a pipe is created (RAM wasted)
  • a new process (wc) is created, which means a context switch at some point, and then another when you go back
  • all the data has to pass from process A to process B, and the result has to come back again

All of this could have been avoided by just checking the return status. In terms of performance, there are light-years of difference between checking a return status and creating a pipe, spawning a new process, and passing ALL your data to it. Saying that your solution is suboptimal is a euphemism.
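
One side effect of the pipe is worth spelling out: by default the exit status of a pipeline is that of its last command (wc here), so diff's own answer is hidden and the script has to reconstruct it from the line count. A small bash-only sketch, using the PIPESTATUS array and made-up temporary files, shows both statuses:

```shell
#!/bin/bash
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'one\n' > "$dir/a"
printf 'two\n' > "$dir/b"

# Run the pipeline, then immediately save the per-command exit statuses;
# PIPESTATUS is overwritten by the very next command.
diff "$dir/a" "$dir/b" | wc -l > /dev/null
statuses=("${PIPESTATUS[@]}")

echo "diff exited with ${statuses[0]}, wc exited with ${statuses[1]}"
```

The files differ, so diff exits with 1, while wc succeeds and exits with 0; a plain "if diff ... | wc -l" would only ever see the 0.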

Of course, in this snippet you are not going to notice any difference. But when you start using loops and the files are big, things change. Even with an empty file the difference is noticeable:

Code:
$ touch foo # this creates an empty file
# Just checking the exit status
$ time for ((i=1; i<=1000; i++)); do if ! diff foo foo; then :; else :; fi; done

real    0m2.431s
user    0m0.708s
sys     0m1.380s
# pipe the results into a new process to count the lines and 
# then return the result back
$ time for ((i=1; i<=1000; i++)); do c=$(diff foo foo|wc -l); if [ $c -eq 0 ]; then :; else :; fi; done

real    0m5.583s
user    0m1.528s
sys     0m3.340s

# when the files are not empty and the lines have to be counted 
# it's worse
$ time for ((i=1; i<=1000; i++)); do if ! diff Xorg.0.log keysymdef.h > /dev/null; then :; else :; fi; done

real    0m5.567s
user    0m2.808s
sys     0m2.120s
$ time for ((i=1; i<=1000; i++)); do c=$(diff Xorg.0.log keysymdef.h|wc -l); if [ $c -eq 0 ]; then :; else :; fi; done

real    0m8.848s
user    0m3.656s
sys     0m4.124s
Note that I've even replaced the commands inside the branches with ":", so the loop is not delayed by running extra instructions (after all, all I want to measure is the time it takes to run the test).

Besides that, there's the fact that it's conceptually simpler. You run diff and ask diff whether the files are different. That makes much more sense, conceptually, than counting the differing lines and then seeing if the number is zero... That's my mental model, anyway... Of course, this will make no sense to you if you don't know what the exit status of a command means in shell scripting.

Last edited by i92guboj; 05-27-2009 at 10:22 AM.
 
Old 05-28-2009, 01:11 AM   #12
rohits1991
LQ Newbie
 
Registered: May 2009
Posts: 5

Original Poster
Rep: Reputation: Disabled
Needless to say... your last post contained only a few sentences that I can say I knew before...
Thank you so much! Next time I have to append two files, I vow that I won't use wc -l!!!
