LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 04-02-2017, 05:45 AM   #1
pk1920
LQ Newbie
 
Registered: Apr 2017
Posts: 3

Rep: Reputation: Disabled
Need to diff two files as described below.


Hi all,
I am new to Linux shell scripting.
I want to compare two files. The first is my master file, i.e. the file I will use as the base, and the second is the messed-up one, in which some entries are missing and some extra ones are present. I need to know which entries are missing and which are extra compared to file1. Please help me.
 
Old 04-02-2017, 06:02 AM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,530
Blog Entries: 4

Rep: Reputation: 3832
Welcome.

If they are two text files, the usual way is with diff.
 
Old 04-02-2017, 06:42 AM   #3
pk1920
LQ Newbie
 
Registered: Apr 2017
Posts: 3

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Turbocapitalist View Post
Welcome.

If they are two text files, the usual way is with diff.
No, the files contain not only text but also numbers and time stamps.
Diff is not helping properly.
 
Old 04-02-2017, 06:46 AM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,530
Blog Entries: 4

Rep: Reputation: 3832
Numbers, including time stamps, are text as far as computers are concerned. What goes wrong when you try diff for your data?

Also, can you go into more detail about the data and what kind of differences you are looking for? Some (sanitized) sample data would help, with examples of what you expect to find.
 
Old 04-02-2017, 06:52 AM   #5
pk1920
LQ Newbie
 
Registered: Apr 2017
Posts: 3

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Turbocapitalist View Post
Numbers, including time stamps, are text as far as computers are concerned. What goes wrong when you try diff for your data?

Also, can you go into more detail about the data and what kind of differences you are looking for? Some (sanitized) sample data would help, with examples of what you expect to find.
OK, let me describe it with an example.
Suppose the first file, i.e. the base file, is:

StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0.2B1, OK
StartPatch, CDM_2.5.0.3B1, OK
EndPatch, CDM_2.5.0.3B1, SUCCESS
StartPatch, CDM_2.5.0_SM-10866B2, OK
EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS


and the second file is:

StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS

The third file should be:
all the lines from file1 that are missing in file2, in their original sequence.
Each Start/End pair should be treated as one entry.
 
Old 04-02-2017, 06:57 AM   #6
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,530
Blog Entries: 4

Rep: Reputation: 3832
I see that diff works fine on that sample, in part because it is sorted / grouped. I get the following:

Code:
diff file1 file2
3,8d2
< StartPatch, CDM_2.5.0.2B1, OK
< StartPatch, CDM_2.5.0.3B1, OK 
< EndPatch, CDM_2.5.0.3B1, SUCCESS
< StartPatch, CDM_2.5.0_SM-10866B2, OK
< EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
< StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
The < means that the line printed is present in the first file (file1) and missing in the second file (file2).
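To turn that output into the two lists asked for in post #1, something like the following might do. It is only a sketch: the sample data is copied from post #5, the output names missing.txt and extra.txt are made up for illustration, and it should be run in an empty scratch directory because it writes file1 and file2 there.

```shell
#!/bin/sh
# Sample data from post #5: file1 is the master, file2 the messed-up copy.
cat > file1 <<'EOF'
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0.2B1, OK
StartPatch, CDM_2.5.0.3B1, OK
EndPatch, CDM_2.5.0.3B1, SUCCESS
StartPatch, CDM_2.5.0_SM-10866B2, OK
EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
EOF

cat > file2 <<'EOF'
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
EOF

# diff exits with status 1 when the files differ, which is not an error here.
diff file1 file2 > all.diff || true

# Lines prefixed with "< " exist only in file1, i.e. they are missing from file2.
sed -n 's/^< //p' all.diff > missing.txt

# Lines prefixed with "> " exist only in file2, i.e. they are extra.
sed -n 's/^> //p' all.diff > extra.txt

echo "Missing from file2:"
cat missing.txt
echo "Extra in file2:"
cat extra.txt
```

With this sample, missing.txt should end up with the six lines shown above and extra.txt should be empty, since file2 contains nothing that file1 lacks.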

What is missing when you run it on a larger data set?
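If the larger file turns out not to be in the same order as the master, diff may flag lines that have merely moved. One alternative is to compare by ID rather than by position. The sketch below assumes that the second comma-separated field identifies a package and that its Start and End lines belong together; that is a guess at what "taken as one" means. The file2 here holds the same lines as in post #5 but deliberately reordered, and the output names missing_by_id.txt and extra_by_id.txt are made up. Run it in an empty scratch directory.

```shell
#!/bin/sh
# Master file, copied from post #5.
cat > file1 <<'EOF'
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
StartPatch, CDM_2.5.0.2B1, OK
StartPatch, CDM_2.5.0.3B1, OK
EndPatch, CDM_2.5.0.3B1, SUCCESS
StartPatch, CDM_2.5.0_SM-10866B2, OK
EndPatch, CDM_2.5.0_SM-10866B2, SUCCESS
StartPatch, CDM_2.5.0.REQUEST-6753B2, OK
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
EOF

# Same four lines as post #5, reordered on purpose (illustration only).
cat > file2 <<'EOF'
StartPatch, CDM_2.5.0_SM-11515B2, OK
EndPatch, CDM_2.5.0_SM-11515B2, SUCCESS
StartInstall, CDM_2.5B263, OK
EndInstall, CDM_2.5B263, SUCCESS
EOF

# Fields are separated by a comma plus optional spaces, so $2 is the ID.
awk -F', *' '
    # First file named on the command line (file2): remember each ID seen.
    NR == FNR { seen[$2] = 1; next }
    # Second file (file1): print lines whose ID never appeared in file2.
    !($2 in seen)
' file2 file1 > missing_by_id.txt

# Swapping the file arguments lists IDs that are extra in file2.
awk -F', *' 'NR == FNR { seen[$2] = 1; next } !($2 in seen)' \
    file1 file2 > extra_by_id.txt

cat missing_by_id.txt
```

The missing lines come out in file1's order, so each Start line stays next to its End line. Note that an ID counts as present if any line for it appears in file2, so a Start whose End has gone missing would not be reported by this sketch.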

Last edited by Turbocapitalist; 04-02-2017 at 06:59 AM.
 
Old 04-02-2017, 11:32 PM   #7
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,398

Rep: Reputation: 2780
I usually find that the following args to diff produce nice output:
Code:
diff -Nuw origfile newfile >file.diff
Any decent editor, e.g. vim, will recognise the output syntax (from the .diff extension) and colour-code the file.diff records for ease of reading.
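For anyone unfamiliar with those options: -N treats a file that does not exist as empty, -u selects the unified output format, and -w ignores all white space when comparing lines. A small sketch of what that looks like follows. The three-line origfile is a trimmed excerpt of the sample from post #5, and the extra spaces in newfile were added only to show the effect of -w. Run it in an empty scratch directory.

```shell
#!/bin/sh
# Trimmed excerpt of the sample data (illustration only).
printf '%s\n' \
    'StartInstall, CDM_2.5B263, OK' \
    'EndInstall, CDM_2.5B263, SUCCESS' \
    'StartPatch, CDM_2.5.0.2B1, OK' > origfile

# Second line has extra spaces; -w should make diff treat it as unchanged.
printf '%s\n' \
    'StartInstall, CDM_2.5B263, OK' \
    'EndInstall,   CDM_2.5B263, SUCCESS' > newfile

# diff exits with status 1 when the files differ, which is expected here.
diff -Nuw origfile newfile > file.diff || true

# In unified format, "-" marks lines only in origfile and "+" marks lines
# only in newfile; the "---" and "+++" lines at the top are file headers.
cat file.diff
```

In this example only the StartPatch line should show up with a leading "-", because the white-space difference on the EndInstall line is ignored.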

HTH
 
  


All times are GMT -5. The time now is 12:55 PM.
