Problem description
I have a long list of English words that I would like to hash. What would be a good hash function? So far, my hash function sums the ASCII values of the letters and then takes the result modulo the table size. I'm looking for something efficient and simple.
Recommended answer
Simply summing the letters is not a good strategy, because any permutation of the same letters gives the same result.
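To make the permutation problem concrete, here is a minimal sketch of the additive hash described in the question. The name `sum_hash` and the `table_size` parameter are illustrative assumptions, not taken from the original post.

```c
/* Additive hash as described in the question: sum the byte values,
   then reduce modulo the table size. table_size must be non-zero.
   The name sum_hash is hypothetical. */
unsigned long sum_hash(const char *str, unsigned long table_size)
{
    unsigned long sum = 0;

    while (*str != '\0')
        sum += (unsigned char)*str++;

    return sum % table_size;
}
```

Because addition is commutative, anagrams such as "listen" and "silent" always receive the same value from this function, whatever the table size.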
This one (djb2) is quite popular and works nicely with ASCII strings.
unsigned long hashstring(const unsigned char *str)
{
    unsigned long hash = 5381;
    int c;

    /* Walk the string until the terminating NUL byte. */
    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

    return hash;
}
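Since the question reduces the hash modulo the table size, the djb2 value can be mapped to a bucket the same way. The sketch below repeats the function so that it compiles on its own; `bucket_index` and `table_size` are hypothetical names introduced only for illustration.

```c
/* djb2, repeated here so the sketch is self-contained. */
static unsigned long hashstring(const unsigned char *str)
{
    unsigned long hash = 5381;
    int c;

    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + (unsigned long)c; /* hash * 33 + c */

    return hash;
}

/* Hypothetical helper: map a word to a slot in a table with
   table_size buckets. table_size must be non-zero. */
unsigned long bucket_index(const char *word, unsigned long table_size)
{
    return hashstring((const unsigned char *)word) % table_size;
}
```

Unlike a plain sum, each step multiplies the running value by 33 before adding the next byte, so the position of a letter affects the result and "ab" and "ba" hash differently.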
More information here.
If you need more alternatives and some performance measurements, read here.
Added: These are general-purpose hash functions, where the input domain is not known in advance (except perhaps for some very general assumptions, e.g. the function above works slightly better with ASCII input), which is the most common scenario. If you have a known, restricted domain (a fixed set of inputs), you can do better; see Fionn's answer.
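Fionn's answer is not reproduced here, so the following is not that method. It is only a rough sketch of one way a fixed input set can be exploited: search for a multiplier under which every known word lands in its own slot. All names (`mult_hash`, `find_collision_free_multiplier`, `MAX_SLOTS`) and the brute-force search itself are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 64 /* upper bound on table size for this sketch */

/* Multiplicative string hash with a tunable multiplier (djb2 uses 33). */
static unsigned long mult_hash(const char *s, unsigned long multiplier)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * multiplier + (unsigned char)*s++;

    return h;
}

/* Returns the first multiplier in [1, max_multiplier] for which all n
   words fall into distinct slots of a table with table_size slots,
   or 0 if no such multiplier is found.
   Requires n <= table_size <= MAX_SLOTS. */
unsigned long find_collision_free_multiplier(const char *const *words, size_t n,
                                             unsigned long table_size,
                                             unsigned long max_multiplier)
{
    unsigned char used[MAX_SLOTS];
    unsigned long m;
    size_t i;

    if (table_size == 0 || table_size > MAX_SLOTS || n > table_size)
        return 0;

    for (m = 1; m <= max_multiplier; m++) {
        memset(used, 0, sizeof used);
        for (i = 0; i < n; i++) {
            unsigned long slot = mult_hash(words[i], m) % table_size;
            if (used[slot])
                break;      /* collision: try the next multiplier */
            used[slot] = 1;
        }
        if (i == n)
            return m;       /* every word got its own slot */
    }
    return 0;
}
```

A search like this is only practical for small, fixed word lists; dedicated perfect-hash generators exist for larger sets.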