LinuxQuestions.org
11-11-2008, 12:41 PM   #1
phugoid
LQ Newbie
 
Registered: Mar 2006
Location: Dubai, United Arab Emirates
Distribution: Suse 10.0
Posts: 14

Rep: Reputation: 0
A tool to manage a number of almost-the-same hard disks?


Before I start writing something from scratch, I'm wondering if any of you know of a program/script to help manage large sets of files that are almost exactly the same.

For example, say I have 20 machines that start out with identical hard disk images. Then over time they diverge, and I'd like to compare them and pull out all the differences. The tool should work on a sub-directory as well, so that I can compare for example the /home/dan directory across all 20 machines.

I'm tempted to start writing some scripts for this. I would recurse through the directory structure, taking MD5 hashes of all the files and recording everything else I want to know (directory listings with file permissions, etc.). I would store this "fingerprint" of the hard disk in a text file. Then I could use other scripts to compare two or more fingerprints (from two or more hard disks). Naturally, I want to be able to exclude certain directories and files from the fingerprint (/proc, etc.).
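
The fingerprinting step described above could be sketched roughly as follows. This is a minimal illustration in Python, not a finished tool: the function names (`file_md5`, `fingerprint`, `write_fingerprint`), the tab-separated line layout, and the choice to record only regular files are all assumptions made for the example.

```python
import hashlib
import os
import stat


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint(root, exclude=()):
    """Yield (relative_path, permissions, md5) for each regular file under root.

    `exclude` holds paths relative to root (e.g. "proc") that are skipped.
    """
    excluded = {os.path.normpath(e) for e in exclude}
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = sorted(
            d for d in dirnames
            if os.path.normpath(os.path.join(rel_dir, d)) not in excluded
        )
        for name in sorted(filenames):
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            if rel_path in excluded:
                continue
            full_path = os.path.join(dirpath, name)
            info = os.lstat(full_path)
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, device nodes, etc.
            yield rel_path, stat.filemode(info.st_mode), file_md5(full_path)


def write_fingerprint(root, out_path, exclude=()):
    """Write one tab-separated line per file: path, permissions, hash."""
    with open(out_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for rel_path, mode, md5 in fingerprint(root, exclude):
            out.write(f"{rel_path}\t{mode}\t{md5}\n")
```

A call such as `write_fingerprint("/", "/tmp/host01.fp", exclude=["proc", "sys", "tmp"])` would produce one file per machine (the paths here are only examples). Unreadable files raise an error in this sketch, and file names containing tabs or newlines would break the line format, so a real script would need to handle both.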

I would greatly appreciate any ideas, advice and criticism at this point.
 
11-11-2008, 12:57 PM   #2
irishbitte
Senior Member
 
Registered: Oct 2007
Location: Brighton, UK
Distribution: Ubuntu Hardy, Ubuntu Jaunty, Eeebuntu, Debian, SME-Server
Posts: 1,213
Blog Entries: 1

Rep: Reputation: 88
A program like rsync might help you out here, though it's definitely not straightforward! What I have started doing of late is creating an image of the ideal install with PartImage, then restoring that image onto machines across the network, again using PartImage.

rsync will not tell you the differences per se, but if you look in the code for it, you might get some inspiration.
 
11-11-2008, 01:58 PM   #3
Poetics
Senior Member
 
Registered: Jun 2003
Location: California
Distribution: Slackware
Posts: 1,181

Rep: Reputation: 49
Were I attempting to do this, I'd do exactly what you are thinking, with the hashes. Some creative use of 'find' can exclude any directory you'd like, and piping the output to whatever file you want isn't difficult. The 'challenge' is in writing the code to compare the files.

I'm thinking each line in the hash file would be "/path/to/file<TAB>hash" for easy parsing. Split each line on the tab and you can compare the two files quickly using perl or whatever language you prefer. Push the files whose hashes differ into one array, and push any files that appear in only one of the two lists into another array. Then again, I'm a fan of making my output nicely organized.

Then again, couldn't you just use 'diff' on the hash files?
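
The comparison described in this post could be sketched as below. It is written in Python rather than perl, and it assumes each line of a hash file starts with the path, followed by a tab and then the hash (plus any other recorded fields). The names `load_fingerprint` and `compare` are made up for the example.

```python
def load_fingerprint(path):
    """Read a hash file into {file_path: everything_after_the_first_tab}."""
    entries = {}
    with open(path, encoding="utf-8", errors="surrogateescape") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            file_path, _, details = line.partition("\t")
            entries[file_path] = details
    return entries


def compare(left, right):
    """Return (changed, only_left, only_right) as sorted lists of paths."""
    # Paths present on both sides whose recorded hash/details differ.
    changed = sorted(p for p in left.keys() & right.keys() if left[p] != right[p])
    # Paths present on one side only.
    only_left = sorted(left.keys() - right.keys())
    only_right = sorted(right.keys() - left.keys())
    return changed, only_left, only_right
```

Loading two files with `load_fingerprint` and passing the results to `compare` gives the "discrepancies" list and the two "missing from one side" lists. Plain 'diff' on the two files also works when both are written in the same order, but it reports a file that moved position in the list as a difference, which the dictionary lookup avoids.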
 
11-14-2008, 11:39 PM   #4
phugoid
LQ Newbie
 
Registered: Mar 2006
Location: Dubai, United Arab Emirates
Distribution: Suse 10.0
Posts: 14

Original Poster
Rep: Reputation: 0
Thanks, both of you.

irishbitte: I will indeed have a look at rsync for some inspiration, thanks. In fact, rsync can probably do most of what I need. One advantage of my approach is that you could build a "fingerprint" file for a hard disk, and then disconnect the hard disk and just use that file for comparison purposes. As obscure as it may sound, that's a very interesting feature to me right now.

Poetics: agreed, one challenge is in presenting this info in a useful format. Thankfully this is a separate problem from building the hash file. Thanks for the reassurance - I will continue on this path.
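
For the many-machine case from the original question (20 disks rather than two), one possible presentation is to group hosts by the hash they report for each path and list only the paths where they disagree. The sketch below assumes the fingerprints are already loaded into dictionaries; `divergent_paths` and the host names are hypothetical.

```python
from collections import defaultdict


def divergent_paths(fingerprints):
    """Map each path that differs between hosts to {value: [hosts]}.

    `fingerprints` is {host_name: {file_path: hash_or_details}}.
    Hosts that lack a path are grouped under the key None.
    """
    all_paths = set()
    for entries in fingerprints.values():
        all_paths.update(entries)

    report = {}
    for path in sorted(all_paths):
        groups = defaultdict(list)
        for host in sorted(fingerprints):
            groups[fingerprints[host].get(path)].append(host)
        if len(groups) > 1:  # at least two hosts disagree about this path
            report[path] = dict(groups)
    return report


if __name__ == "__main__":
    # Tiny made-up example with three hosts.
    disks = {
        "host01": {"etc/hosts": "aaa", "home/dan/notes": "111"},
        "host02": {"etc/hosts": "aaa", "home/dan/notes": "222"},
        "host03": {"etc/hosts": "aaa"},
    }
    for path, groups in divergent_paths(disks).items():
        print(path, groups)
```

Because this works purely on the saved fingerprint data, none of the disks needs to be connected while the comparison runs.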
 
  

