LinuxQuestions.org

Skaperen 11-10-2012 08:12 PM

MD5 collision risk
 
Suppose someone comes up with two different data files whose MD5 hashes collide. What is the chance that, for a given arbitrary string, the two files each appended to that string (i.e. the string comes first) will also collide in their MD5 hash?

That is, where this reports a difference:
Code:

diff file1 file2
but these give the same result:
Code:

cat file1 | md5sum
cat file2 | md5sum

what is the chance that these give the same result:
Code:

cat filex file1 | md5sum
cat filex file2 | md5sum

or these:
Code:

cat file1 filex | md5sum
cat file2 filex | md5sum

or these:
Code:

cat filex file1 filex | md5sum
cat filex file2 filex | md5sum

or even these:
Code:

cat filex file1 filey | md5sum
cat filex file2 filey | md5sum

The idea here: you have a system where files are provided by other people. Some of them suspect you manage the files by MD5 hash, have an interest in creating a collision, and manage to create a bogus file with the same MD5 hash as another one. Could a secret salt file, added to each file before hashing, reasonably be expected to obscure the hashing?
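
The five comparisons above can be run in one go with something like the following. This is only a rough sketch: `md5_of`, `check` and `compare_all` are helper names made up for illustration, and it assumes `md5sum` from GNU coreutils is on the PATH.

```shell
# Print the MD5 of the concatenation of all files given as arguments.
md5_of() {
    cat -- "$@" | md5sum | cut -d' ' -f1
}

# check LABEL HASH_A HASH_B: report whether the two hashes match.
check() {
    if [ "$2" = "$3" ]; then
        echo "$1: same"
    else
        echo "$1: different"
    fi
}

# compare_all FILE1 FILE2 FILEX FILEY: try each arrangement from the post.
compare_all() {
    f1=$1 f2=$2 fx=$3 fy=$4
    check "plain"            "$(md5_of "$f1")"             "$(md5_of "$f2")"
    check "filex first"      "$(md5_of "$fx" "$f1")"       "$(md5_of "$fx" "$f2")"
    check "filex last"       "$(md5_of "$f1" "$fx")"       "$(md5_of "$f2" "$fx")"
    check "filex both sides" "$(md5_of "$fx" "$f1" "$fx")" "$(md5_of "$fx" "$f2" "$fx")"
    check "filex and filey"  "$(md5_of "$fx" "$f1" "$fy")" "$(md5_of "$fx" "$f2" "$fy")"
}
```

Run it as `compare_all file1 file2 filex filey`; it prints one line per arrangement.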

catkin 11-10-2012 08:47 PM

Or you could use both MD5 and SHA1 sums.
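
As a minimal sketch of that idea, assuming `md5sum` and `sha1sum` from GNU coreutils are available (`dual_sum` is just an illustrative name), the two sums can be joined into one identifier, so a bogus file would have to collide under both hashes at once:

```shell
# Print "MD5:SHA1" for the given file as a combined identifier.
dual_sum() {
    md5=$(md5sum < "$1" | cut -d' ' -f1)
    sha1=$(sha1sum < "$1" | cut -d' ' -f1)
    printf '%s:%s\n' "$md5" "$sha1"
}
```

Usage would be `dual_sum somefile`.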

Hangdog42 11-11-2012 08:15 AM

Quote:

where some suspect you manage the files by MD5 hash,
Do you mean you're using the MD5 as a unique identifier for the files?

linosaurusroot 11-11-2012 01:50 PM

If the additional string is at the end and the original files were a multiple of 512 bits then the collision will still exist.

If you are aiming to use additional (secret) information for message authentication then use HMAC http://www.ietf.org/rfc/rfc2104.txt
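
For what it's worth, here is a rough sketch of computing an HMAC from the shell. It assumes the `openssl` command-line tool is installed and uses SHA-256 as the underlying hash, which is my choice for the example and not something RFC 2104 requires. `hmac_tag` is a made-up helper name. Note that a key passed on the command line may be visible to other users in the process list, so treat this as illustration only.

```shell
# hmac_tag KEY FILE: print the HMAC-SHA256 of FILE under KEY as hex.
# The hex digest is the last whitespace-separated field of openssl's output.
hmac_tag() {
    openssl dgst -sha256 -hmac "$1" < "$2" | awk '{print $NF}'
}
```

Usage would be `hmac_tag "$secret" somefile`; the tag changes if either the key or the file changes.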

sneakyimp 11-11-2012 05:53 PM

Quote:

Originally Posted by linosaurusroot (Post 4827151)
If the additional string is at the end and the original files were a multiple of 512 bits then the collision will still exist.

If you are aiming to use additional (secret) information for message authentication then use HMAC http://www.ietf.org/rfc/rfc2104.txt

This shouldn't be too hard to verify. Googling "md5 collisions" yields this page:
http://www.mathstat.dal.ca/~selinger/md5collision/

It's got two distinct strings that yield the same MD5 sum. You could try appending the same 512-bit string to each and see whether the collision persists.
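
A minimal sketch of that test, assuming the two colliding strings have already been saved to files and that GNU coreutils is available. The function names are hypothetical. `block_aligned` checks the condition from the quoted post (same length, a multiple of 512 bits, i.e. 64 bytes) before the suffix is appended.

```shell
# Succeeds if both files have the same size and it is a multiple of 64 bytes.
block_aligned() {
    size_a=$(wc -c < "$1")
    size_b=$(wc -c < "$2")
    [ "$size_a" -eq "$size_b" ] && [ $((size_a % 64)) -eq 0 ]
}

# Write a random 64-byte (512-bit) suffix to the given path.
make_suffix() {
    head -c 64 /dev/urandom > "$1"
}

# collides_with_suffix FILE_A FILE_B SUFFIX: succeeds if the two files
# still share an MD5 sum once the same suffix is appended to each.
collides_with_suffix() {
    [ "$(cat -- "$1" "$3" | md5sum)" = "$(cat -- "$2" "$3" | md5sum)" ]
}
```

For example: `make_suffix sfx; block_aligned a.bin b.bin && collides_with_suffix a.bin b.bin sfx && echo "collision persists"`, where `a.bin` and `b.bin` stand in for the two colliding files.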


All times are GMT -5.