LinuxQuestions.org
05-01-2005, 04:39 AM   #1
Marshalle
Member
 
Registered: Mar 2005
Location: Auckland, New Zealand
Distribution: OpenSuSE, SLE, Yoper, Sabayon
Posts: 68

Rep: Reputation: 15
How to extract difference between two files?


Hey there, I'm hoping someone can help me. I have a large collection of MP3s taken from my own CDs, and I decided to reformat my NTFS drive as Reiser (I have Suse 9.2 Pro). I backed up many of them to DVD but lost track somewhere along the way. So my problem is: how do I compile a list of the songs not yet backed up to DVD, for later burning? So far I have exported the contents of each DVD to a text file, and the original list to another text file. I figured I could use diff to find the differences, but I need to output every file in the master list that isn't on a DVD, and I can't quite figure it out. I.e. the files on the DVDs should be taken away from the master list. Any ideas? I've never actually had a response before, so let's see if I get some luck!

Thanks,

M
 
05-01-2005, 07:37 AM   #2
rjlee
Senior Member
 
Registered: Jul 2004
Distribution: Ubuntu 7.04
Posts: 1,990

Rep: Reputation: 67
diff keeps the information from both files to allow rolling back, whereas you want the actual set difference. I would do this with a little Perl script:

Code:
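#!/usr/bin/perl
# Print every line of filea that does not appear in fileb.
open a, "<filea" or die "Cannot open filea: $!";
open b, "<fileb" or die "Cannot open fileb: $!";
local $/; # Slurp mode: read each file in one go
my @a = split /\n/, <a>;
my @b = split /\n/, <b>;
my %b = map { $_ => 1 } @b; # Make hash of B
my @res = grep { !defined $b{$_} } @a; # Everything in A not in B
print join "\n", @res;
print "\n";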
I'm assuming that both files have one filename per line, with no newlines in filenames, and the same filename is rendered the same way in both files.

Replace filea with the file containing the list of all files, and fileb with the list of files already backed up.
Go to the directory where the list files are, then type
Code:
perl > diff.lst
Paste in the entire script, including the newline at the end, then press Ctrl+D.

Your difference file will be written to diff.lst.
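
For comparison, the same set difference can be sketched with the standard sort and comm tools instead of Perl. This is only an illustration, not the script above: the function name only_in_first is made up for the example, and unlike the Perl script the output comes back sorted with duplicate lines removed.

```shell
#!/bin/sh
# only_in_first FILE_A FILE_B   (hypothetical helper name)
# Print the lines of FILE_A that do not appear in FILE_B (the set difference).
# Note: the output is sorted and de-duplicated, unlike the Perl script.
only_in_first() {
    sorted_b=$(mktemp) || return 1
    # comm needs both inputs sorted the same way; LC_ALL=C forces plain
    # byte order for both sort and comm.
    LC_ALL=C sort -u -- "$2" > "$sorted_b"
    # -2 drops lines found only in FILE_B, -3 drops lines found in both,
    # so only the lines unique to FILE_A are left.
    LC_ALL=C sort -u -- "$1" | LC_ALL=C comm -23 - "$sorted_b"
    rm -f -- "$sorted_b"
}
```

Assuming the lists are called filea and fileb as above, it would be used as: only_in_first filea fileb > diff.lst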
 
05-02-2005, 07:38 AM   #3
Marshalle
Member
 
Registered: Mar 2005
Location: Auckland, New Zealand
Distribution: OpenSuSE, SLE, Yoper, Sabayon
Posts: 68

Original Poster
Rep: Reputation: 15
Well, thanks for the reply! It's 11:30 so I'm off to sleep, but I had a quick try and it seemed to produce all of the contents of both files. It's probably something I did, so I'll try again tomorrow when I'm actually half awake! Would it be easy to list multiple txt files in place of your fileb, to save me having to keep reapplying them to a new master file? Looks like Perl is something I should learn sometime! Thanks again!

M
 
05-03-2005, 04:04 AM   #4
rjlee
Senior Member
 
Registered: Jul 2004
Distribution: Ubuntu 7.04
Posts: 1,990

Rep: Reputation: 67
To remove the contents of multiple files, replace the line that opens fileb:
Code:
open b, "<fileb";
with this:
Code:
open b, "cat fileb filec filed filee |";
(where | is a pipe character). You may need to shell-escape the filenames.
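
If the quoting gets awkward, one way around it is to do the whole job from the shell, where "$@" keeps filenames with spaces intact. The sketch below swaps in grep for the Perl hash: POSIX grep accepts several -f options, so each list file can be passed as its own pattern file. The name not_in_any is invented for this example.

```shell
#!/bin/sh
# not_in_any MASTER LIST...   (hypothetical helper name)
# Print each line of MASTER that appears in none of the LIST files,
# keeping MASTER's original order.
not_in_any() {
    master=$1
    shift
    # With no lists there is nothing to subtract.
    if [ "$#" -eq 0 ]; then
        cat -- "$master"
        return
    fi
    # Rewrite the arguments "a b c" into "-f a -f b -f c".
    for list in "$@"; do
        shift
        set -- "$@" -f "$list"
    done
    # -F: patterns are fixed strings, not regular expressions
    # -x: a pattern must match the whole line
    # -v: print the lines that do NOT match any pattern
    grep -F -x -v "$@" -- "$master"
}
```

Usage, with the file names from the post: not_in_any filea fileb filec filed filee > diff.lst. Note that grep exits with status 1 when nothing is left over, which here just means everything in the master list is already in one of the other lists.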
 
05-03-2005, 05:10 AM   #5
Marshalle
Member
 
Registered: Mar 2005
Location: Auckland, New Zealand
Distribution: OpenSuSE, SLE, Yoper, Sabayon
Posts: 68

Original Poster
Rep: Reputation: 15
Well I put both of these examples together and they worked perfectly! I can't believe how fast it was either, I didn't even see the HDD light blink. So I sent a donation in, was soooo pleased. Thanks a million, great to know support like this is out there, now I can install this for other people with confidence.
 
10-01-2009, 04:02 AM   #6
burf
LQ Newbie
 
Registered: Oct 2009
Posts: 1

Rep: Reputation: 0
Thanks for the reply, it worked for me too

A great way to find the difference in two files
 
10-02-2009, 01:38 AM   #7
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.6, Centos 5.10
Posts: 16,324

Rep: Reputation: 2041
If you want to take up Perl (I recommend it), here are a couple of good links:
http://perldoc.perl.org/
http://www.perlmonks.org/?node=Tutorials
 
01-04-2011, 04:06 PM   #8
alessandra86
LQ Newbie
 
Registered: Jan 2011
Posts: 1

Rep: Reputation: 0
Hi!! I am new to Linux and I was trying to do the same thing as in this thread with my files, but it didn't work. I hope someone can help me.

I used the script and it worked fine with a small example that I made (two files with shared numbers), but when I use my real files, I don't get any differences between them.

I don't know, but maybe it is my file content. It has long lines with numbers and letters, like this:

F6U9T4101A8GYD
F6U9T4101BXUMB
F6U9T4101A23JN
F6U9T4101BVP37
F6U9T4101BU6FA
F6U9T4101BN8YQ
F6U9T4101ARLNS
F6U9T4101BYIWG

I have two files: one with the total set of IDs (44200) and the other with 5500 IDs. I need the difference, so I should end up with 38700 IDs in the diff file. But I don't know what happened. I hope you can help me!!!

Regards, alessa
 
01-05-2011, 01:45 PM   #9
rjlee
Senior Member
 
Registered: Jul 2004
Distribution: Ubuntu 7.04
Posts: 1,990

Rep: Reputation: 67
alessandra86, you don't say what you actually did get. But the script returns everything in the first file that isn't in the second, so if you didn't get anything then you might try again with the files the other way around…
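
A quick way to see which way around the files need to go is to count the lines in each direction. The sketch below is only a diagnostic aid: the name compare_counts is made up, and stripping carriage returns is an assumption (one possible cause of lines not matching is Windows line endings, but nothing in the thread confirms that is what happened here). If all 5500 IDs really are in the 44200-ID file, "only in first" should come out as 38700 and "only in second" as 0.

```shell
#!/bin/sh
# compare_counts FILE_A FILE_B   (hypothetical helper name)
# Report how many distinct lines are only in FILE_A, only in FILE_B, or in both.
compare_counts() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f -- "$a"; return 1; }
    # Assumption: stray carriage returns (Windows line endings) are noise,
    # so they are stripped before the lines are compared.
    tr -d '\r' < "$1" | LC_ALL=C sort -u > "$a"
    tr -d '\r' < "$2" | LC_ALL=C sort -u > "$b"
    # comm prints three columns: only in file 1, only in file 2, in both.
    # Each call below suppresses two of them and counts what is left.
    echo "only in first: $(LC_ALL=C comm -23 "$a" "$b" | wc -l | tr -d ' ')"
    echo "only in second: $(LC_ALL=C comm -13 "$a" "$b" | wc -l | tr -d ' ')"
    echo "in both: $(LC_ALL=C comm -12 "$a" "$b" | wc -l | tr -d ' ')"
    rm -f -- "$a" "$b"
}
```

For example: compare_counts all_ids.txt subset_ids.txt (both file names are placeholders). If "in both" is 0, the lines in the two files are not written identically, and the script's requirement that the same entry is rendered the same way in both files is not being met.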
 
  


All times are GMT -5.
