I need to write a shell script to compare two files and display the result. If the two files are different, append them and store them in a new file. How do I proceed? Can someone give me some code?
I would have thought it'd be far easier just to do it by hand, since then you don't have to worry about any error checking, but if that's how you want it:
Code:
#!/bin/bash
#
# USAGE: ./script.sh /path/to/file/one /path/to/file/two
TEMP=temp_file
# Require exactly two arguments
if (( $# != 2 )); then
echo "Usage: $0 /path/to/file/one /path/to/file/two"
exit 1
fi
# Save the differences (if any) in the temporary file
diff "$1" "$2" > "$TEMP"
# Read from stdin so that wc prints only the number, not the file name
COUNT=$(wc -l < "$TEMP")
echo "$COUNT"
if (( COUNT == 0 )); then
echo "No differences exist"
rm "$TEMP"
else
# Differences exist
echo "Differences exist. Where would you like the differences saved?"
read -r OUTPUT
mv "$TEMP" "$OUTPUT"
fi
If you want to merge differences, use the 'patch' command. If you just want to see the differences on stdout, just use 'diff'. Check its man page if you can't guess the syntax from looking at this script.
Actually it can be even simpler. All you need is to look at the exit status of diff, there's no need to count the lines of the output using wc. Since the exit status is zero if the files are the same (and 0 is true), you need to negate the expression using !
Sample code:
Code:
if ! diff file1 file2 > /dev/null; then
echo diff
else
echo not_diff
fi
I'm curious about your use of ">>" since you say "append them and store them in a new file" -- your usage here implies (though does not require) that "file3" already exists, and you are adding the contents of files 1 and 2 to it.
Did you really mean "concatenate files 1 and 2 together and save as a new file"? Because then I think you may be better off using ">", which guarantees that the contents of "file3" are *only* what came from files 1 and 2.
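To make the ">" versus ">>" point concrete, here is a minimal sketch. The names file1, file2 and file3 are only placeholders created in a scratch directory, and file3 is given some old content on purpose to stand in for a file that already exists.

```shell
#!/bin/bash
# Hypothetical file names, created in a scratch directory for illustration.
workdir=$(mktemp -d)
printf 'one\n' > "$workdir/file1"
printf 'two\n' > "$workdir/file2"
printf 'old\n' > "$workdir/file3"   # pretend file3 already exists

# ">>" appends: the old contents of file3 survive.
cat "$workdir/file1" "$workdir/file2" >> "$workdir/file3"
appended=$(cat "$workdir/file3")

# ">" truncates first: file3 ends up holding only file1 followed by file2.
cat "$workdir/file1" "$workdir/file2" > "$workdir/file3"
overwritten=$(cat "$workdir/file3")

printf 'with >>:\n%s\n' "$appended"
printf 'with >:\n%s\n' "$overwritten"
rm -r "$workdir"
```

With ">>" the line "old" is still at the top of file3; with ">" it is gone.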
Oh, thanks. I am not familiar with dev... I am an amateur, very new to shell.
The redirection to /dev/null is not relevant to the issue; you could omit it and still get the same result. It just keeps the diff output from being shown on your screen, which is convenient in a script. You could do without it if you don't care:
Code:
if ! diff file1 file2; then
echo diff
else
echo not_diff
fi
The important thing is that you don't need to count the lines with wc, as I said. Just the exit status of the diff command is enough to decide if the files are different or not.
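Putting the pieces together for the original question (compare, and if the files differ, store both in a new file), one possible sketch follows. The function name compare_and_merge and its three-argument form are assumptions made for illustration, not something from the posts above. It relies on diff's documented exit statuses: 0 when the files are the same, 1 when they differ, and a larger value when something went wrong.

```shell
#!/bin/bash
# compare_and_merge is a hypothetical helper name, not from the thread.
# Usage: compare_and_merge FILE1 FILE2 OUTPUT
compare_and_merge() {
    local status=0
    # Only the exit status matters here, so the diff output is discarded.
    diff "$1" "$2" > /dev/null || status=$?
    case $status in
        0) echo "Files are identical" ;;
        1) echo "Files differ; writing both to $3"
           cat "$1" "$2" > "$3" ;;      # ">" so OUTPUT holds only these two files
        *) echo "diff failed (missing or unreadable file?)" >&2
           return 2 ;;
    esac
}
```

A call would look like `compare_and_merge file1 file2 file3`. Note that a plain `if ! diff ...` treats a diff error (for example a missing file) the same as "different", which is why this sketch checks for status 1 explicitly.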
@bjorke,
Yeah, I thought of it, but my idea was that if I use '>>', I can perform both operations according to what I am asked to do.
@i92guboj
In any case, with either wc -l or /dev/null, the length of the code is not affected.
As a student of shell, I have nothing to think about beyond that.
Anyway, what are the differences in terms of the resources required to run the two versions, the one you suggested and the one I wrote?
Last edited by rohits1991; 05-27-2009 at 02:01 AM.
Quote:
@i92guboj
In any case, with either wc -l or /dev/null, the length of the code is not affected.
As a student of shell, I have nothing to think about beyond that.
You should.
Again, /dev/null is irrelevant here. You can eliminate it if the length of the code is your top priority.
When you pipe into wc, this is what happens:
a pipe is created (RAM wasted)
a new process is created (wc), which means a context switch at some point, and then another when you go back
all the data has to pass from process A to process B, and the result back again
All of this could have been avoided by just checking the return status. In terms of performance, there are light-years of difference between checking a return status and spawning a pipe and a new process, then passing ALL your data to that process. Saying that your solution is suboptimal is an understatement.
Of course, in this snippet you are not going to notice any difference. But when you start using loops and the files are big, things change. Even with an empty file the difference is noticeable:
Code:
$ touch foo # this creates an empty file
# Just checking the exit status
$ time for ((i=1; i<=1000; i++)); do if ! diff foo foo; then :; else :; fi; done
real 0m2.431s
user 0m0.708s
sys 0m1.380s
# pipe the results into a new process to count the lines and
# then return the result back
$ time for ((i=1; i<=1000; i++)); do c=$(diff foo foo|wc -l); if [ $c -eq 0 ]; then :; else :; fi; done
real 0m5.583s
user 0m1.528s
sys 0m3.340s
# when the files are not empty and the lines have to be counted
# it's worse
$ time for ((i=1; i<=1000; i++)); do if ! diff Xorg.0.log keysymdef.h > /dev/null; then :; else :; fi; done
real 0m5.567s
user 0m2.808s
sys 0m2.120s
$ time for ((i=1; i<=1000; i++)); do c=$(diff Xorg.0.log keysymdef.h|wc -l); if [ $c -eq 0 ]; then :; else :; fi; done
real 0m8.848s
user 0m3.656s
sys 0m4.124s
Note that I've even replaced the commands inside the branches with ":", so the loop is not delayed by running extra instructions in the middle (after all, all I want to measure is the time it takes to run the test).
Besides that, there's the fact that it's conceptually simpler. You run diff and ask diff whether the files are different. That makes much more sense from a conceptual point of view than counting the differing lines and then seeing if the number is zero. It's my mental model, anyway. Of course, this will make no sense to you if you don't know what the exit status of a command means in shell scripting.
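For anyone unsure what "exit status" means here, the following is a small sketch of how the shell consumes it. The files a, b and c are placeholders created in a scratch directory; the point is only that "&&", "||", "!" and "if" all act on the number a command returns, with 0 meaning success.

```shell
#!/bin/bash
# Scratch files with hypothetical names, used only for this illustration.
workdir=$(mktemp -d)
printf 'same\n' > "$workdir/a"
printf 'same\n' > "$workdir/b"
printf 'other\n' > "$workdir/c"

# "&&" runs its right-hand side only when the command on the left exits with 0.
diff "$workdir/a" "$workdir/b" > /dev/null && same_msg="a and b: identical"

# "||" runs its right-hand side only on a non-zero exit; $? holds that status.
diff_status=0
diff "$workdir/a" "$workdir/c" > /dev/null || diff_status=$?

# "!" inverts the status, so this branch is taken when the files are not identical.
if ! diff "$workdir/a" "$workdir/c" > /dev/null; then
    neg_msg="a and c: different"
fi

echo "$same_msg"
echo "a and c: diff exit status $diff_status"
echo "$neg_msg"
rm -r "$workdir"
```

No line counting is involved at any point: diff itself reports whether the files match.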
Needless to say, your last post contained only a few sentences that I can say I knew before.
Thank you so much! Next time I have to append two files, I vow that I won't use wc -l!