[SOLVED] Is the result of MD5 calculation consistent?
I am trying to generate MD5 hashes for a large number of files (.7z files) from Java, using MessageDigest. The problem is that the resulting MD5 hashes are not consistent; they change on every run, even for the same files. This is running on SUSE Linux 10. Could anyone tell me whether the MD5 calculation depends on the platform, and whether it can change across multiple iterations?
The files are changing, but their sizes and contents remain the same. This is what my program does:
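For reference, MD5 itself is deterministic: the same bytes give the same digest on any platform, so a changing hash means the bytes being hashed are changing. Below is a minimal sketch of hashing a file with MessageDigest, in case it helps rule out the hashing code. The class and method names (`Md5Check`, `md5OfBytes`, `md5OfFile`) are made up for illustration and are not from the original program.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not the poster's actual code.
class Md5Check {

    // Hash an in-memory byte array and return the digest as lowercase hex.
    static String md5OfBytes(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        return toHex(md.digest(data));
    }

    // Stream a file through MD5 in chunks so large archives are not loaded into memory.
    static String md5OfFile(Path file) throws IOException, NoSuchAlgorithmException {
        // A fresh MessageDigest per file avoids carrying state over from a previous file.
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n); // only feed the bytes actually read
            }
        }
        return toHex(md.digest());
    }

    // Convert digest bytes to a hex string, masking to avoid sign extension.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Well-known MD5 test vector for the ASCII string "abc".
        System.out.println(md5OfBytes("abc".getBytes(StandardCharsets.US_ASCII)));
        if (args.length > 0) {
            // Hashing the same unchanged file twice should print the same value twice.
            Path p = Paths.get(args[0]);
            System.out.println(md5OfFile(p));
            System.out.println(md5OfFile(p));
        }
    }
}
```

If this prints the same value for a file on every run but the program's hashes still differ, the archive bytes themselves are probably different between runs.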
1. Reads a set of .gz files.
2. Archives them to a set of 7z files.
3. Calculates MD5Hash of 7z files.
At the third step, I am getting different results for the same set of .gz files.
I have run my program on many other data sets and it works for all of them; it fails only for this particular set. I am stuck, with no idea how to debug this further.
This is natural. Using the MD5 algorithm on files something.gz and something.7z (or .Z, .xz, .bz2 etc) will give different results, even if they are both the compressed versions of the same something file.
I noticed recently that files compressed using xz (another LZMA-based format, like 7z) vary in size by a few bytes when built repeatedly. This may be the OP's problem.
I finally found the problem. Although the contents of the gz files remained the same, some of them had a different timestamp every time the process was run, because an external process was replacing or recreating them. The MD5 hash of the 7z files was changing because of this change in the timestamps of the input files.
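For anyone who hits the same thing, here is a rough sketch of the effect. The JDK has no built-in 7z support, so this uses `java.util.zip` as a stand-in; the assumption is that a 7z archive records each entry's modification time in a similar way. The class name, entry name and timestamps are made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; ZIP is used here in place of 7z.
class TimestampHashDemo {

    // Build an in-memory ZIP with one entry whose recorded modification time is mtimeMillis.
    static byte[] zipWithMtime(byte[] content, long mtimeMillis) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry("data.gz"); // made-up entry name
            entry.setTime(mtimeMillis);               // timestamp is stored in the archive
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    // MD5 of a byte array as lowercase hex.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] content = "same payload".getBytes(StandardCharsets.UTF_8);
        long t1 = 1_000_000_000_000L; // arbitrary fixed timestamp
        long t2 = t1 + 60_000L;       // same content, "recreated" one minute later

        String first = md5Hex(zipWithMtime(content, t1));
        String again = md5Hex(zipWithMtime(content, t1));
        String later = md5Hex(zipWithMtime(content, t2));

        System.out.println("same mtime, hashes equal: " + first.equals(again));
        System.out.println("different mtime, hashes equal: " + first.equals(later));
    }
}
```

The entry contents are identical in all three archives; only the stored timestamp differs, which is enough to change the archive bytes and therefore the hash. Hashing the uncompressed contents, or keeping the input timestamps fixed, would avoid the mismatch.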
Thanks to everyone for their valuable replies.