LinuxQuestions.org


pk1920 04-02-2017 05:45 AM

Need to diff two files as described below.
 
Hi all,
I am new to Linux shell scripting.
I want to compare two files. The first is my master file, which I will use as the base. The second is the messed-up one: some entries are missing from it and some extra entries are present. I need to know which entries are missing and which are extra compared to file1. Please help me.

Turbocapitalist 04-02-2017 06:02 AM

Welcome.

If they are two text files, the usual way is with diff.

pk1920 04-02-2017 06:42 AM

Quote:

Originally Posted by Turbocapitalist (Post 5691589)
Welcome.

If they are two text files, the usual way is with diff.

No, the files contain not only text but also numbers and time stamps.
Diff is not helping properly.

Turbocapitalist 04-02-2017 06:46 AM

Numbers, including time stamps, are text as far as computers are concerned. What goes wrong when you try diff for your data?

Also, can you go into more detail about the data and what kind of differences you are looking for? Some (sanitized) sample data would help, with examples of what you expect to find.

pk1920 04-02-2017 06:52 AM

Quote:

Originally Posted by Turbocapitalist (Post 5691595)
Numbers, including time stamps, are text as far as computers are concerned. What goes wrong when you try diff for your data?

Also, can you go into more detail about the data and what kind of differences you are looking for? Some (sanitized) sample data would help, with examples of what you expect to find.

OK, let me describe it with an example.
Suppose the first file, the base file, is:

StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0.2B1, OK
StartPatch, CDM_2.5.0.3B1, OK
EndPatch, CDM_2.5.0.3B1, SUCCESS
StartPatch, CDM_2.5.0_SM-10866B2, OK
EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS


and the second file is:

StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS

The third file should contain:
all the lines from file1 that are missing in file2, in their original sequence.
A Start/End pair should be treated as one entry.

Turbocapitalist 04-02-2017 06:57 AM

I see that diff works fine on that sample, in part because it is sorted / grouped. I get the following:

Code:

diff file1 file2
3,8d2
< StartPatch, CDM_2.5.0.2B1, OK
< StartPatch, CDM_2.5.0.3B1, OK
< EndPatch, CDM_2.5.0.3B1, SUCCESS
< StartPatch, CDM_2.5.0_SM-10866B2, OK
< EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
< StartPatch, CDM_2.5.0.REQUEST-6753B2, OK

The < means that the line printed is present in the first file (file1) and missing in the second file (file2).
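
If only the lists of missing and extra lines are wanted, without diff's change markers, one possible approach is whole-line matching with grep. This is a minimal sketch, assuming the file names file1 and file2 from above, a scratch directory created purely for illustration, and lines that match exactly (whitespace included). Unlike diff, it ignores line position and duplicate lines, so it may not suit every data set.

```shell
#!/bin/sh
# Sketch only: recreate the sample files from this thread in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0.2B1, OK
StartPatch, CDM_2.5.0.3B1, OK
EndPatch, CDM_2.5.0.3B1, SUCCESS
StartPatch, CDM_2.5.0_SM-10866B2, OK
EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
EOF
cat > "$tmp/file2" <<'EOF'
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
EOF

# -F: fixed strings, -x: match whole lines, -v: invert, -f: read patterns from a file.
# Lines of file1 that do not appear anywhere in file2, in file1 order:
missing=$(grep -Fxv -f "$tmp/file2" "$tmp/file1")
# Lines of file2 that do not appear anywhere in file1 (none for this sample):
extra=$(grep -Fxv -f "$tmp/file1" "$tmp/file2")

printf 'Missing from file2:\n%s\n' "$missing"
printf 'Extra in file2:\n%s\n' "$extra"
rm -r "$tmp"
```

For the sample data this should list the same six lines that diff marks with <, and nothing as extra.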

What is missing when you run it on a larger data set?

chrism01 04-02-2017 11:32 PM

I usually find that the following diff arguments produce nice output:
Code:

diff -Nuw origfile newfile >file.diff
Any decent editor, e.g. vim, will recognize the output syntax (via the .diff extension) and colour-code the records in file.diff for ease of reading.
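
For reference, in that unified output a leading - marks a line present only in origfile and a leading + marks a line present only in newfile, after two header lines starting with --- and +++. A rough sketch of pulling those lines out with standard tools follows; the three-line file contents are made up for illustration and are not from the original poster's data.

```shell
#!/bin/sh
# Sketch only: two tiny hypothetical files standing in for origfile and newfile.
tmp=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmp/origfile"
printf 'a\nc\nd\n' > "$tmp/newfile"

# -N: treat absent files as empty, -u: unified format, -w: ignore all whitespace.
diff -Nuw "$tmp/origfile" "$tmp/newfile" > "$tmp/file.diff"

# Skip the two header lines (--- and +++), then keep lines by their marker
# and strip the marker character itself.
removed=$(tail -n +3 "$tmp/file.diff" | grep '^-' | cut -c2-)
added=$(tail -n +3 "$tmp/file.diff" | grep '^+' | cut -c2-)

printf 'Only in origfile:\n%s\n' "$removed"
printf 'Only in newfile:\n%s\n' "$added"
rm -r "$tmp"
```

Here the line b should be reported as only in origfile and d as only in newfile.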

HTH


All times are GMT -5.