What is traditionally done is that a "hash function" is applied to the key, and the hash-function returns an integer. (The design of the hash function is intended to be such that it generates a well-distributed span of output-values so that all of the hash-buckets might be more-or-less-equally filled.)
The hash-function result is forced to be a positive integer using some process such as anding it with the value 0x7FFFFFF (which forces the leftmost or "sign" bit to be zero).
Finally, the arithmetic modulo operator is applied, taking the hash-value mod the number-of-buckets. (This operator gives the remainder from a division operation.)
A hash-lookup function derives its speed from this hashing-process. Once the proper 'bucket-number' has been determined, only the values in that particular bucket need to be searched for the desired value. If the hash-function is a good one, and the (should-always-be-prime...) number of buckets is appropriate, this will reduce the search-space to a usefully-small subset.
It should be noted that a hash-lookup structure is not strictly "an array" in the most-traditional sense. A traditional array is a contiguous block of storage... but "a traditional array" is not nearly as useful in-practice as is a hash-based structure. A hash-structure handles a variable number of entries with aplomb, and also does not waste storage if the key-value distribution is "sparse." The algorithm is simple, blisteringly fast, and "much better than nothing."
Last edited by sundialsvcs; 03-04-2008 at 07:15 PM.
|