Hashing

hash table $ht$ contains * $b$ buckets ht[0],h1[1],...,ht[b-1] * A bucket has $s$ slots

loading density

\[ a = n/sb \]

n = hash 中已經放入的資料 s = 每個bucket 有幾個slots b = 幾個bucket

Note：若 load density 越大，代表 Hash Table 利用度高，但相對 Collision 機會也大

Collision and Overflow

Collision

所在的bucket已經有了資料

Overflow

所在的bucket已經滿了

Hash function

Division

\[ h(k) = k\%D \]

D = buckets

Folding

將key 切成幾個小片段$P_0,P_1,...,P_i$ 接下來將他們加在一起

Folding at the boundaries : 偶數做reverse

Overflow Handing

Chaining

每個bucket 皆使用linked list 實作

Open Addressing

尋找還沒有滿的bucket

Linear probing 當 $H (x)$ 發生 overflow 的時候，則探測 $(H (x)+i)% B$，$B$ 為 Bucket 數，$i = 1,2,3,…,B-1$，直到有 Bucket 可存 or Table 全滿為止

sample

#in #in intint

href="#__codelineno-0-1">#include<stdio.h> clude<stdlib.h> clude<string.h> class="w"> h[10005][105]; class="w"> main(){ int bucket,slot; scanf("bucket %d\n",&bucket); scanf("slot %d\n",&slot); char op[20]; memset(h,-1,sizeof(h)); while(scanf("%s",op)){ if(strcmp(op,"exit") == 0)break; int num; scanf("%d",&num); if(strcmp(op,"insert") == 0){ int key = num % bucket; int i = 0,flag = 0; while( i < bucket){ for(int j=0;j<slot;j++){ if(h[key][j] == -1){ h[key][j] = num; flag = 1; break; } } if(flag) break; i++; key = (key+1)%bucket; } } if(strcmp(op,"search") == 0){ int key = num % bucket; int i = 0,flag = 0; while( i < bucket){ for(int j=0;j<slot;j++){ if(h[key][j] == num){ printf("%d %d\n",key,j); flag = 1; break; } } if(flag) break; i++; key = (key+1)%bucket; } } if(strcmp(op,"delete") == 0){ int key = num % bucket; int i = 0,flag = 0; while( i < bucket){ for(int j=0;j<slot;j++){ if(h[key][j] == num){ h[key][j] = -1; flag = 1; break; } } if(flag) break; i++; key = (key+1)%bucket; } } } }

Quadratic probing (二次方探測)

會以下面的順序探測hash function $$ ht[h(k) \% b]\ ht[(h(k)+1) \% b],ht[(h(k) - 1) \% b]\ ht[(h(k)+2^2) \% b],ht[(h(k) - 2^2) \% b]\ ht[(h(k)+3^2) \% b],ht[(h(k) - 3^2) \% b]\ $$

b 需要是質數

Rehashing 有很多的hash function $h_1,h_2,...,h_m$ 滿了改用下一個直到找到可以存的地方或hash function用完為止

Performance

Worst-case find/insert/delete time is O(n) 在沒有Collision的情況下預期search是O(1)