Hi,
I am interested in fast hashers(targeted on alpha strings to be more specific). Here is one of my two hash-of-choice functions:
Code:
UINT FNV1A_Hash_WHIZ(const char *str, SIZE_T wrdlen)
{
const UINT PRIME = 1607;
UINT hash32 = 2166136261;
const char *p = str;
for(; wrdlen >= sizeof(DWORD); wrdlen -= sizeof(DWORD), p += sizeof(DWORD)) {
hash32 = (hash32 ^ *(DWORD *)p) * PRIME;
}
if (wrdlen & sizeof(WORD)) {
hash32 = (hash32 ^ *(WORD*)p) * PRIME;
p += sizeof(WORD);
}
if (wrdlen & 1)
hash32 = (hash32 ^ *p) * PRIME;
return hash32 ^ (hash32 >> 16);
}
The speed/collision performance on Intel Atom N450 1.66GHz:
Code:
C:\Test\hash>dir
11/07/2010 06:47 AM 84,992 hash.exe
10/27/2010 03:48 PM 146,973,879 wikipedia-en-html.tar.wrd
C:\Test\hash>hash wikipedia-en-html.tar.wrd
12561874 lines read
33554432 elements in the table (25 bits)
Sixtinsensitive+: 16153226 16031348 16039844 16048697 16026421| 16026421 [2251734] !FASTEST!
Hash_Sixtinsensitive: 16501364 16504662 16506868 16509768 16515127| 16501364 [2139242]
Whiz: 16081498 16080691 16080162 16079881 16085656| 16079881 [2189360] !Second-to-FASTEST!
Bernstein: 16222890 16208653 16195534 16205258 16207576| 16195534 [2074237]
K&R: 16181141 16153089 16147347 16144145 16144279| 16144145 [2083145]
x17 unrolled: 16196693 16200720 16203124 16195429 16196980| 16195429 [2410605]
x65599: 17213371 17215886 17211092 17225140 17218658| 17211092 [2102893]
FNV-1a: 17648084 17649994 17644643 17654437 17650937| 17644643 [2081195]
Sedgewick: 17715483 17711538 17706339 17709057 17711857| 17706339 [2080640]
Weinberger: 19795514 19798517 19792040 19783954 19796407| 19783954 [3541181]
Paul Larson: 16904068 16909317 16906315 16905215 16899096| 16899096 [2080111]
Paul Hsieh: 16731679 16720594 16932967 16731003 16725432| 16720594 [2180206]
One At Time: 18317093 18329453 18319193 18325990 18318877| 18317093 [2087861]
lookup3: 16591932 16600364 16593657 16594726 16605181| 16591932 [2084889]
Arash Partow: 16790714 16782529 16783414 16782986 16791156| 16782529 [2084572]
CRC-32: 16878539 16872606 16866719 16876415 16860548| 16860548 [2075088]
Ramakrishna: 16437073 16449160 16438019 16442447 16452696| 16437073 [2093253]
Fletcher: 74995725 74991965 74983105 75027126 75020605| 74983105 [9063797]
Murmur2: 16742975 16730183 16736149 16741761 16735690| 16730183 [2081476]
Hanson: 17286767 17288260 17273143 17280187 17371322| 17273143 [2129832]
Novak unrolled: 60320056 60322432 60315223 60308709 60325221| 60308709 [6318611]
SBox: 17677039 17685532 17686854 17681897 17680986| 17677039 [2084018]
MaPrime2c: 19286056 19296348 19307574 19294326 19295151| 19286056 [2084467]
C:\Test\hash>
Please, if anybody knows faster one let drop a link here.
The second function is called 'Sixtinsensitive' with source code at link below.
For more info:
http://www.strchr.com/hash_functions