It only sort of has to do with cryptography. A hash is the output of a one-way function on some input. One-way means that, given the output, there is no practical way to determine the input. For example, the function "input * 3" is a two-way function: if I am given the output 6, I know the input was 2. The function "input mod 7", on the other hand, throws information away: if I'm given the output 2, there is no way to know whether the input was 2, 9, 16, etc. (It's only a toy example, though. Coming up with SOME input that gives 2 is trivial, and a real hash function makes even that infeasible.) The catch with hashes is that they have what are known as collisions (the 2, 9, and 16 in this example all collide on an output of 2). ALL hashes have collisions, because there are far more possible inputs than possible outputs. The trick is to make collisions as unlikely, and as hard to find, as possible. MD5 is a VERY complicated function with 2^128 possible outputs, so the chance of two different inputs colliding by accident is negligible. This means that if you have a file and get an MD5 hash of that file, any other file with the same MD5 hash is almost certainly a copy of that file. The one caveat is that since 2004 researchers have known how to deliberately construct pairs of inputs with the same MD5 hash, so MD5 is fine for catching accidental corruption but shouldn't be trusted against someone who is actively trying to fool you.
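If it helps to see this in action, here is a rough Python sketch of both ideas: the toy "input mod 7" function with its collisions, and MD5 used to check whether two pieces of data are copies. The helper names (mod7, md5_hex) and the sample data are just made up for illustration; the only real library call is hashlib.md5 from Python's standard library.

```python
import hashlib


def mod7(n: int) -> int:
    """Toy 'hash': many different inputs map to the same output."""
    return n % 7


# Every input below 20 whose toy hash is 2. These all collide.
colliding = [n for n in range(20) if mod7(n) == 2]
print(colliding)


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample data standing in for file contents.
original = b"hello world\n"
copy = bytes(original)          # identical content
altered = b"hello world!\n"     # one extra character

# Identical content always gives an identical hash.
print(md5_hex(original) == md5_hex(copy))
# Even a tiny change gives a different hash (barring an astronomically
# unlikely accidental collision).
print(md5_hex(original) == md5_hex(altered))
```

For a real file you would read its bytes (for example with open(path, "rb")) and feed them to the same function. Comparing the resulting hex strings is the same check that the md5sum command line tool does.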
Last edited by forrestt; 01-18-2008 at 02:01 PM.