LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices


Reply
  Search this Thread
Old 07-12-2006, 12:30 AM   #1
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Rep: Reputation: 15
Question hash algorithm


There is a project which need search for a entry with key of MAC address. It is known MAC address is 48 bits long and so directly using a static table with index of MAC address is really a lame idea. And I am considering hash table with MAC address as key. Have you any theme ecen idea for such theme? Any tip welcome.
Thank you everybody in advance.
 
Old 07-12-2006, 01:54 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,393

Rep: Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774Reputation: 2774
Not sure what you mean by 'static table', but anyway, it depends which lang you want to use eg SQL, Perl etc.
 
Old 07-12-2006, 03:30 AM   #3
primo
Member
 
Registered: Jun 2005
Posts: 542

Rep: Reputation: 34
Maybe:
http://www.partow.net/programming/hashfunctions/

I don't know if hashing MAC's may be a good idea, at least the complete 48 bits. You may take advantage of the first octects assigned to organizations to branch your search (a hash may help provided it generates unique keys for these first octects) and then do a binary or linear search later.
 
Old 07-12-2006, 05:17 AM   #4
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Original Poster
Rep: Reputation: 15
Question

Quote:
Originally Posted by chrism01
Not sure what you mean by 'static table', but anyway, it depends which lang you want to use eg SQL, Perl etc.
'static table' here means static array. If I use standard C language, may you give me some advice?
 
Old 07-12-2006, 06:15 AM   #5
arashpartow
LQ Newbie
 
Registered: Feb 2006
Posts: 5

Rep: Reputation: 0
A hash table is a good idea, but I would use an integer based hash function
rather than a string based hash function - mostly because of speed reasons otherwise
both will give you roughly the same collision rates.


the only thing is from what i understand mac's are unique, really really
unique, why not use the mac value itself as the hash key and mod (%) for the table size
or bucket count (which should be some large amount which is a prime number > 1000000),
then use some kind of chaining or double hashing for collision resolution.





Arash Partow
________________________________________________________
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.
http_//www_partow_net

Last edited by arashpartow; 07-12-2006 at 06:24 AM.
 
Old 07-13-2006, 10:56 AM   #6
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,827
Blog Entries: 4

Rep: Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984Reputation: 3984
Sure... just take the MAC address, perhaps xor the two halves together to make a more convenient integer, then take the result (unsigned) modulo the hash-table size. Make the size of the hash table some reasonable prime number.

Hashing is one of those very-rare "free" algorithms in the sense that it's very easy to implement and almost instantaneous. You don't have to struggle too much to find "the perfect algorithm" because the fact that you are hashing at all gives quite acceptable payoffs.

Incidentally... most higher-level languages, like (say) C++, offer various "dictionary" types in which all the hard work is done for you. It can be a very short step from C to C++, but one that pays off quite handsomely.
 
Old 07-13-2006, 11:11 AM   #7
95se
Member
 
Registered: Apr 2002
Location: Windsor, ON, CA
Distribution: Ubuntu
Posts: 740

Rep: Reputation: 32
Or you could go with a balanced BST, like a red-black tree. If your looking for a decent, well proven hash function, just use google. What language are you doing this in, if I may ask?
 
Old 07-13-2006, 10:14 PM   #8
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by 95se
Or you could go with a balanced BST, like a red-black tree. If your looking for a decent, well proven hash function, just use google. What language are you doing this in, if I may ask?
Sure, I am using standard C.
Can you explain 'balanced BST' further?

Last edited by scanner; 07-13-2006 at 11:40 PM.
 
Old 07-13-2006, 10:22 PM   #9
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Original Poster
Rep: Reputation: 15
Actully, MAC is not unique hash key and the other parameter will surely come as a part of the hash key. Till now, the profile of 'other parameter' is still open. For now, I am looking for a MAC hash algorithm with a low hitting ratio.
Anyway, everybody's replies is help a lot to me.
Thanks.
 
Old 07-13-2006, 10:27 PM   #10
primo
Member
 
Registered: Jun 2005
Posts: 542

Rep: Reputation: 34
Quote:
Originally Posted by scanner
Actully, MAC is not unique hash key and the other parameter will surely come as a part of the hash key. Till now, the profile of 'other parameter' is still open. For now, I am looking for a MAC hash algorithm with a low hitting ratio.
Anyway, everybody's replies is help a lot to me.
Thanks.
Take a look at http://www.gnu.org/software/gperf/
 
Old 07-13-2006, 11:45 PM   #11
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148Reputation: 148
This problem reminds me of the bucket sort. You could probably modify that to get what you wanted.
 
Old 07-14-2006, 12:34 AM   #12
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Original Poster
Rep: Reputation: 15
For my target, I only need a hash table to accelerate search operation and MAC is one part of hash key. I prefer a octet outputed by this hash. Of course, checksum is easy to generate a octet output regardless of input length, but I have no idea about collision status in this case. In short, I am looking for such algorithm with low collision ratio. In addition, no requirement at all is for secret/encryption.

Do I explain that clear enough?
 
Old 07-14-2006, 02:09 AM   #13
primo
Member
 
Registered: Jun 2005
Posts: 542

Rep: Reputation: 34
Quote:
Originally Posted by scanner
For my target, I only need a hash table to accelerate search operation and MAC is one part of hash key. I prefer a octet outputed by this hash. Of course, checksum is easy to generate a octet output regardless of input length, but I have no idea about collision status in this case. In short, I am looking for such algorithm with low collision ratio. In addition, no requirement at all is for secret/encryption.
With cryptographic hash functions there's a lesser chance of collisions. The problem is that they are usually 128+ bytes and using a table that large isn't feasible. But hash functions are used mainly to speed off search operations and the storing part may be handled separate. You may use the result as a key to a red/black tree, but a MAC could serve as well. The only way is to experiment with the data yourself and see the results.
 
Old 07-16-2006, 05:38 AM   #14
scanner
LQ Newbie
 
Registered: Jul 2006
Posts: 28

Original Poster
Rep: Reputation: 15
Smile

Quote:
Originally Posted by primo
With cryptographic hash functions there's a lesser chance of collisions. The problem is that they are usually 128+ bytes and using a table that large isn't feasible. But hash functions are used mainly to speed off search operations and the storing part may be handled separate. You may use the result as a key to a red/black tree, but a MAC could serve as well. The only way is to experiment with the data yourself and see the results.
Yes, when information on hash key is clear enough, I will test/evaluate all possible algorithm and determine final hash key.

Thanks everyone.
 
Old 07-17-2006, 10:41 PM   #15
arashpartow
LQ Newbie
 
Registered: Feb 2006
Posts: 5

Rep: Reputation: 0
Quote:
Actully, MAC is not unique hash key and the other parameter
will surely come as a part of the hash key.
Actually they are, read the following:
http://en.wikipedia.org/wiki/MAC_address

Somewhere around the 13th word... oh btw thats the word after the
12th word, but before the 14th its the unique thing between
the two.




Arash Partow
________________________________________________________
Be one who knows what they don't know,
Instead of being one who knows not what they don't know,
Thinking they know everything about all things.
http://www.partow.net
 
  


Reply

Tags
hash


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
need a hash algorithm ignoring input order Thinking Programming 3 07-12-2006 06:09 AM
hash bye md5 algorithm vishalbutte Programming 5 02-18-2006 10:54 PM
need a hash algorithm ignoring input order Thinking Programming 1 01-02-2006 05:15 PM
Change Password Hash Algorithm Trano Linux - Security 1 08-23-2005 07:48 AM

LinuxQuestions.org > Forums > Non-*NIX Forums > Programming

All times are GMT -5. The time now is 05:55 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration