Linux - Software
This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
11-11-2008, 01:41 PM  #1
LQ Newbie
Registered: Mar 2006
Location: Dubai, United Arab Emirates
Distribution: Suse 10.0
Posts: 14
A tool to manage a number of almost-the-same hard disks?
Before I start writing something from scratch, I'm wondering if any of you know of a program/script to help manage large sets of files that are almost exactly the same.
For example, say I have 20 machines that start out with identical hard disk images. Then over time they diverge, and I'd like to compare them and pull out all the differences. The tool should also work on a sub-directory, so that I can compare, for example, the /home/dan directory across all 20 machines.
I'm tempted to start writing some scripts for this. I would recurse through the directory structure, take MD5 hashes of all the files, and record everything else I want to know (a directory listing with file permissions, etc.). I would store this "fingerprint" of the hard disk in a text file. Then I could use other scripts to compare two or more fingerprints (from two or more hard disks). Naturally, I want the ability to exclude certain directories and files from the fingerprint (/proc, etc.).
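A rough sketch of the fingerprint step I have in mind (the exclusion list and names are just illustrative, not final):

```shell
# fingerprint: print "md5  path" for every regular file under $1,
# pruning pseudo-filesystems like /proc (the exclusion list is illustrative).
fingerprint() {
  root="$1"
  find "$root" \
      \( -path "$root/proc" -o -path "$root/sys" -o -path "$root/dev" \) -prune \
      -o -type f -print0 |
    xargs -0 md5sum 2>/dev/null |
    sort -k 2                      # sort by path so two fingerprints line up
}

# e.g.  fingerprint /mnt/disk1 > disk1.fp
```

Sorting by path means two fingerprint files from different disks can be compared line by line later.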
I would greatly appreciate any ideas, advice and criticism at this point.
11-11-2008, 01:57 PM  #2
Senior Member
Registered: Oct 2007
Location: Brighton, UK
Distribution: Ubuntu Hardy, Ubuntu Jaunty, Eeebuntu, Debian, SME-Server
Posts: 1,213
What might help you out here is a program like rsync, though it's definitely not straightforward. What I have started doing of late is to create an image of the ideal install using PartImage, then re-install that on machines across the network, again using PartImage.
rsync will not tell you the differences per se, but if you look in the code for it, you might get some inspiration.
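That said, a checksum-based dry run gets you part of the way: it will at least itemize which files differ between two trees without copying anything. A sketch, with illustrative paths:

```shell
# compare_trees: dry-run rsync report of differences between two trees.
#   -r recurse, -n dry run (copy nothing), -c compare by checksum rather
#   than size/mtime, -i itemize each difference found, --delete also
#   list files that exist only in the second tree.
compare_trees() {
  rsync -rnci --delete "$1/" "$2/"
}

# e.g.  compare_trees /mnt/disk1/home/dan /mnt/disk2/home/dan
```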
11-11-2008, 02:58 PM  #3
Senior Member
Registered: Jun 2003
Location: California
Distribution: Slackware
Posts: 1,181
Were I attempting to do this, I'd do exactly what you are thinking, with the hashes. Some creative 'find' invocations can exclude any directories you like, and piping the output to whatever file you want isn't difficult. The 'challenge' is in writing the code to compare the files.
I'm thinking each line in the hash file would be "/path/to/file hash" for easy parsing. Split each line on its delimiter and you can compare the two files quickly using Perl or whatever language you prefer. Push the discrepancies into one array, and push any files that exist in only one listing into another. Then again, I'm a fan of making my output nicely organized.
Or, for that matter, couldn't you just use 'diff' on the hash files?
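For instance, a small awk sketch could do that classification over two md5sum-style listings (the file names are hypothetical, and it assumes paths contain no whitespace):

```shell
# classify: compare two "md5  path" listings and report, per path,
# whether the hash CHANGED or the path exists in ONLY one listing.
classify() {
  awk '
    NR == FNR { h1[$2] = $1; next }           # first file: remember hash per path
    {
      if (!($2 in h1))       print "ONLY-2",  $2
      else if (h1[$2] != $1) print "CHANGED", $2
      delete h1[$2]
    }
    END { for (p in h1) print "ONLY-1", p }   # whatever remains was never seen
  ' "$1" "$2"
}

# e.g.  classify disk1.fp disk2.fp
```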
11-15-2008, 12:39 AM  #4
LQ Newbie (Original Poster)
Registered: Mar 2006
Location: Dubai, United Arab Emirates
Distribution: Suse 10.0
Posts: 14
Thanks, both of you.
irishbitte: I will indeed have a look at rsync for some inspiration, thanks. In fact, rsync can probably do most of what I need. One advantage of my approach is that you could build a "fingerprint" file for a hard disk, and then disconnect the hard disk and just use that file for comparison purposes. As obscure as it may sound, that's a very interesting feature to me right now.
Poetics: agreed, one challenge is in presenting this info in a useful format. Thankfully this is a separate problem from building the hash file. Thanks for the reassurance - I will continue on this path.