$25
BLG 335E Analysis of Algorithms I
Project 3
In this project, you are asked to implement several hash functions on the given input files to store dataset in to a m-sized hash table. There are two input files: “vocab.txt” and “search.txt”. First file contains unique tokens which are collected from an English corpus. Each token exists in a line and index values of tokens are corresponding to in which line they are. The other file is given to be used in searching operations. This file contains tokens which are searched in a hash table after hash tables are constituted according to different hash functions.
1. Implement a m-sized hash table that uses linear probing strategy in order to store string elements by using the following hash function. Each slot of your hash table can only contain a string variable. (Note: You do not store i values during the insertion operation. You will use i values in searching operation as how you do in insertion operation.)
ℎ(𝑘,𝑖) =(𝑘+𝑖) 𝑚𝑜𝑑 𝑚 and 𝑖∈[0,𝑚−1]
2. Implement a m-sized hash table that uses double hashing strategy in order to store string elements by using the following hash function. Each slot of your hash table can only contain a string variable.
h(k, i) = (h1(k) + i * h2(k)) mod m and 𝑖∈[0,𝑚−1] h1(k) = k mod m
h2(k) = p – ( k mod p ), p is a prime number and p∈[0,𝑚−1]
3. Implement a m-sized hash table that uses universal hashing strategy in order to store string elements. In the case of any collision, you can get help from any open addressing strategy (e.g.
as in linear probing). Details of the universal hashing strategy is given below.
Step #1: Choose the table size m to be prime.
Step #2: Decompose the key k into 2 digits (at the end, r chunks) so that k = <k0, k1, ... , kr> and 0 ≤ ki < m. For this homework, r=3.
Step #3: Pick 𝑎= [𝑎0,𝑎1,…,𝑎𝑟] where each 𝑎𝑖 is generated randomly from [0,𝑚−1] interval.
Step #4: Apply hashing function given below:
Example for decomposition for k:
= 96356→k0=9, 1= 63, 2=56
= 1398→k0=0, 1= 13, 2= 98
= 548→k0=0, 1= 5, 2= 48
= 83→k0=0, 1= 0, 2= 83
= 7→k0=0, 1= 0, 2= 7
4.Search Functions
Write three functions and each of them searchs given set of keys in the hash table that you implemented in (1), (2) and (3) respectively.
5.The situation where a newly inserted key maps to an already occupied slot in hash table is called collision. In this part, hashing funtions which are implemented in this homework are compared according to collision numbers occur during the insertion / search operations.
Number of collisions when inserting/searching an item i is defined as follows: collisions = 0
Compute h(k,i) if h(k,i) is not empty then increment collisions by 1.
Using the definition above, insert all the elements in “vocab.txt” to the hash tables, compute the number of collisions for each strategy and fill in the table with the number of collisions occured for each strategy and m value. Similarly, search all the elements in “search.txt” and compute the number of collisions for each strategy. Comment on the results.
Insertion
Search
Linear
Double
Universal
Linear
Double
Universal
m = 17863
m = 17863
m = 21929
m = 21929