LinuxQuestions.org


xmrkite 09-14-2007 10:53 PM

How do I compare two values in separate files
 
Hello all,

I have two files, one like this:

abc123
abc124
abc127
ABC128

and the second like this:

ABC123
abc125
abc126
abc128

What I want to do is find out which rows are unique to each file, ignoring case.

Ideally, the second row in the first file would be changed to this:
abc124 - unique

My files don't have the values above; those are just to show you the idea. Each set of files (I have many to compare) has around 500 to 600 lines.

-Thanks for the help, and I hope my question makes sense

David the H. 09-14-2007 11:27 PM

You want the diff command. This is what it was made for.

jay73 09-14-2007 11:33 PM

diff -ui file1 file2

xmrkite 09-15-2007 12:56 AM

That worked like a charm.

-Thank you very much!

xmrkite 09-15-2007 01:34 AM

Guess what...I spoke too soon!

The results totally confuse me. Maybe someone can help.

I will post some of the real files so you can see what I mean.

file 1:
dv1000
dv1000 cto
dv1000-dz678av
dv1000-dz731av
dv1000t
dv1000t cto
dv1001ap-pk809as
dv1001xx-pk807as
dv1002ap-pf352pa


file2:
dv1000 cto
dv1000
dv1000t cto
dv1001ap
dv1001xx
dv1002ap
dv1002xx
dv1003ap

My results:
-dv1000
dv1000 cto
-dv1000-dz678av
-dv1000-dz731av
-dv1000t
+dv1000
dv1000t cto
-dv1001ap-pk809as
-dv1001xx-pk807as
-dv1002ap-pf352pa
+dv1001ap
+dv1001xx
+dv1002ap
+dv1002xx
+dv1003ap

What I wanted was for the command to tell me that both files already contain dv1000. So why is there both a - and a + entry for dv1000 in the results? Here is what I ran:

diff -ui pavilion1.txt pavilion2.txt > diff.txt

Where did I go wrong?

jay73 09-15-2007 03:34 AM

diff -ui fileone filetwo | grep -v '[-+]' > outputfile

This will output any shared entries. No output = no shared items.

As for the - and + returned by the command you used, that's how differences are indicated: - marks a line that appears at that position in file one only, + marks a line that appears at that position in file two only. A line with neither - nor + occurs in both files at that position.

jay73 09-15-2007 03:44 AM

In fact, now that I come to think of it, this may be safer:

diff -ui fileone filetwo | grep -v "^[-+].*"
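A minimal sketch of what that filter leaves behind, using the four sample values from the first post. The temporary directory is only for illustration, and the extra @ in the pattern (to drop the hunk header lines as well) is an addition not in the command above:

```shell
workdir=$(mktemp -d)
printf '%s\n' abc123 abc124 abc127 ABC128 > "$workdir/fileone"
printf '%s\n' ABC123 abc125 abc126 abc128 > "$workdir/filetwo"

# Drop the ---/+++ file headers, the -/+ difference lines and the @@ hunk headers,
# then strip the single leading space that unified diff puts in front of shared lines.
# diff exits with status 1 when the files differ, hence the "|| true".
shared=$(diff -ui "$workdir/fileone" "$workdir/filetwo" | grep -v '^[-+@]' | sed 's/^ //' || true)
printf '%s\n' "$shared"

rm -r "$workdir"
```

Keep in mind that unified output shows only three lines of context around each change by default, so in longer files shared lines that sit far from any difference will not appear in this output.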

David the H. 09-15-2007 01:41 PM

Try using the -y flag in diff. It prints the contents of both files side by side in columns, so you can easily compare them and see which lines differ. It might not be very readable if the individual lines are very long, though.
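As a rough sketch of the side-by-side style, here is a run on a few entries borrowed from the files posted above. It assumes GNU diff; the temporary directory and the width of 60 columns are arbitrary choices for illustration:

```shell
workdir=$(mktemp -d)
printf '%s\n' dv1000 'dv1000 cto' dv1000t > "$workdir/file1"
printf '%s\n' 'dv1000 cto' dv1000 dv1002xx > "$workdir/file2"

# -y: two-column output; -i: ignore case; -W 60: total output width in columns.
# diff exits with status 1 when the files differ, hence the "|| true".
side_by_side=$(diff -y -i -W 60 "$workdir/file1" "$workdir/file2" || true)
printf '%s\n' "$side_by_side"

# Same comparison, hiding the rows that are identical in both files.
diff -y -i -W 60 --suppress-common-lines "$workdir/file1" "$workdir/file2" || true

rm -r "$workdir"
```

In this format a < in the gutter marks a line found only in the left file, a > marks a line found only in the right file, and a | marks a pair of lines that differ.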

There are lots of other flags you can use for different styles of output. Read the man page.

xmrkite 09-19-2007 05:55 PM

OK, I finally got it working. I had to do:

diff -uia --ignore-all-space fileone filetwo


That got it working for some reason. I just randomly tried the --ignore-all-space option, even though there are no spaces in either of the files, but it did work.
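The thread never establishes why --ignore-all-space made a difference. One possibility, which is only an assumption here, is invisible whitespace such as trailing blanks or carriage returns at the ends of lines. A minimal sketch for checking a file; sample.txt and its contents are hypothetical:

```shell
workdir=$(mktemp -d)
# Hypothetical sample: the first line ends in a space, the second in a carriage return.
printf 'dv1000 \ndv1000t\r\ndv1001ap\n' > "$workdir/sample.txt"

# Count the lines that end in whitespace (space, tab, carriage return).
trailing=$(grep -c '[[:space:]]$' "$workdir/sample.txt" || true)
printf 'lines with trailing whitespace: %s\n' "$trailing"

# Make invisible characters visible: sed's l command marks each line end with $
# and prints a carriage return as \r.
sed -n l "$workdir/sample.txt"

rm -r "$workdir"
```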

That's Linux commands for you, in my experience: sometimes they just don't make sense, but you can still get the job done. Gotta love it. It would have been way more trouble in Windows though, so I can't complain.

-Thanks for the help everyone.

chrism01 09-19-2007 09:49 PM

Actually, there are spaces in your example files ... eg 'dv1000 cto'

xmrkite 09-19-2007 10:36 PM

You're right on that one, but my main concern was why dv1000 showed up with both a - and a +. It should have had neither and appeared only once (indicating that it was in both files).

jay73 09-20-2007 12:36 AM

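Diff compares the two files in order, so an entry only shows up as shared if it sits in the same relative position in both files. Yours didn't: dv1000 comes before dv1000 cto in file 1 but after it in file 2.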
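For lists where the order of entries does not matter, one common alternative is to normalize the case, sort both files, and compare them with comm. A minimal sketch using the sample values from the first post; the temporary file names are made up for illustration:

```shell
workdir=$(mktemp -d)
printf '%s\n' abc123 abc124 abc127 ABC128 > "$workdir/file1"
printf '%s\n' ABC123 abc125 abc126 abc128 > "$workdir/file2"

# Lowercase each file, then sort and de-duplicate, because comm expects sorted input.
# LC_ALL=C makes sort and comm agree on the ordering.
tr '[:upper:]' '[:lower:]' < "$workdir/file1" | LC_ALL=C sort -u > "$workdir/a.sorted"
tr '[:upper:]' '[:lower:]' < "$workdir/file2" | LC_ALL=C sort -u > "$workdir/b.sorted"

# comm prints three columns: only in first, only in second, in both.
# -23 keeps column 1, -13 keeps column 2, -12 keeps column 3.
only_in_1=$(LC_ALL=C comm -23 "$workdir/a.sorted" "$workdir/b.sorted")
only_in_2=$(LC_ALL=C comm -13 "$workdir/a.sorted" "$workdir/b.sorted")
in_both=$(LC_ALL=C comm -12 "$workdir/a.sorted" "$workdir/b.sorted")

printf 'unique to file1:\n%s\n' "$only_in_1"
printf 'unique to file2:\n%s\n' "$only_in_2"
printf 'in both files:\n%s\n' "$in_both"

rm -r "$workdir"
```

Because both inputs are sorted first, the position of an entry in the original file no longer affects the result, at the cost of losing the original line order in the output.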


All times are GMT -5.