LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   hash algorithm (https://www.linuxquestions.org/questions/programming-9/hash-algorithm-463183/)

scanner 07-12-2006 12:30 AM

hash algorithm
 
There is a project which need search for a entry with key of MAC address. It is known MAC address is 48 bits long and so directly using a static table with index of MAC address is really a lame idea. And I am considering hash table with MAC address as key. Have you any theme ecen idea for such theme? Any tip welcome.
Thank you everybody in advance.

chrism01 07-12-2006 01:54 AM

Not sure what you mean by 'static table', but anyway, it depends which lang you want to use eg SQL, Perl etc.

primo 07-12-2006 03:30 AM

Maybe:
http://www.partow.net/programming/hashfunctions/

I don't know if hashing MAC's may be a good idea, at least the complete 48 bits. You may take advantage of the first octects assigned to organizations to branch your search (a hash may help provided it generates unique keys for these first octects) and then do a binary or linear search later.

scanner 07-12-2006 05:17 AM

Quote:

Originally Posted by chrism01
Not sure what you mean by 'static table', but anyway, it depends which lang you want to use eg SQL, Perl etc.

'static table' here means static array. If I use standard C language, may you give me some advice?

arashpartow 07-12-2006 06:15 AM

A hash table is a good idea, but I would use an integer based hash function
rather than a string based hash function - mostly because of speed reasons otherwise
both will give you roughly the same collision rates.


the only thing is from what i understand mac's are unique, really really
unique, why not use the mac value itself as the hash key and mod (%) for the table size
or bucket count (which should be some large amount which is a prime number > 1000000),
then use some kind of chaining or double hashing for collision resolution.





Arash Partow
________________________________________________________
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.
http_//www_partow_net

sundialsvcs 07-13-2006 10:56 AM

Sure... just take the MAC address, perhaps xor the two halves together to make a more convenient integer, then take the result (unsigned) modulo the hash-table size. Make the size of the hash table some reasonable prime number.

Hashing is one of those very-rare "free" algorithms in the sense that it's very easy to implement and almost instantaneous. You don't have to struggle too much to find "the perfect algorithm" because the fact that you are hashing at all gives quite acceptable payoffs.

Incidentally... most higher-level languages, like (say) C++, offer various "dictionary" types in which all the hard work is done for you. It can be a very short step from C to C++, but one that pays off quite handsomely.

95se 07-13-2006 11:11 AM

Or you could go with a balanced BST, like a red-black tree. If your looking for a decent, well proven hash function, just use google. What language are you doing this in, if I may ask?

scanner 07-13-2006 10:14 PM

Quote:

Originally Posted by 95se
Or you could go with a balanced BST, like a red-black tree. If your looking for a decent, well proven hash function, just use google. What language are you doing this in, if I may ask?

Sure, I am using standard C.
Can you explain 'balanced BST' further?

scanner 07-13-2006 10:22 PM

Actully, MAC is not unique hash key and the other parameter will surely come as a part of the hash key. Till now, the profile of 'other parameter' is still open. For now, I am looking for a MAC hash algorithm with a low hitting ratio.
Anyway, everybody's replies is help a lot to me.
Thanks.

primo 07-13-2006 10:27 PM

Quote:

Originally Posted by scanner
Actully, MAC is not unique hash key and the other parameter will surely come as a part of the hash key. Till now, the profile of 'other parameter' is still open. For now, I am looking for a MAC hash algorithm with a low hitting ratio.
Anyway, everybody's replies is help a lot to me.
Thanks.

Take a look at http://www.gnu.org/software/gperf/

graemef 07-13-2006 11:45 PM

This problem reminds me of the bucket sort. You could probably modify that to get what you wanted.

scanner 07-14-2006 12:34 AM

For my target, I only need a hash table to accelerate search operation and MAC is one part of hash key. I prefer a octet outputed by this hash. Of course, checksum is easy to generate a octet output regardless of input length, but I have no idea about collision status in this case. In short, I am looking for such algorithm with low collision ratio. In addition, no requirement at all is for secret/encryption.

Do I explain that clear enough?

primo 07-14-2006 02:09 AM

Quote:

Originally Posted by scanner
For my target, I only need a hash table to accelerate search operation and MAC is one part of hash key. I prefer a octet outputed by this hash. Of course, checksum is easy to generate a octet output regardless of input length, but I have no idea about collision status in this case. In short, I am looking for such algorithm with low collision ratio. In addition, no requirement at all is for secret/encryption.

With cryptographic hash functions there's a lesser chance of collisions. The problem is that they are usually 128+ bytes and using a table that large isn't feasible. But hash functions are used mainly to speed off search operations and the storing part may be handled separate. You may use the result as a key to a red/black tree, but a MAC could serve as well. The only way is to experiment with the data yourself and see the results.

scanner 07-16-2006 05:38 AM

Quote:

Originally Posted by primo
With cryptographic hash functions there's a lesser chance of collisions. The problem is that they are usually 128+ bytes and using a table that large isn't feasible. But hash functions are used mainly to speed off search operations and the storing part may be handled separate. You may use the result as a key to a red/black tree, but a MAC could serve as well. The only way is to experiment with the data yourself and see the results.

Yes, when information on hash key is clear enough, I will test/evaluate all possible algorithm and determine final hash key.

Thanks everyone.

arashpartow 07-17-2006 10:41 PM

Quote:

Actully, MAC is not unique hash key and the other parameter
will surely come as a part of the hash key.
Actually they are, read the following:
http://en.wikipedia.org/wiki/MAC_address

Somewhere around the 13th word... oh btw thats the word after the
12th word, but before the 14th its the unique thing between
the two.




Arash Partow
________________________________________________________
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.
http://www.partow.net


All times are GMT -5. The time now is 02:42 AM.