$35
Exercise 1 Building a Hash Table We want to compare the performance of hash tables implemented using chaining and open addressing. In this assignment, we will consider hash tables implemented using the multiplication and linear probing methods. We will (respectively) call the hash functions h and g and describe them below. Note that we are using the hash function h to define g.
Collisions solved by chaining (multiplication method): h(k) = ((A · k) mod 2w) >> (w − r)
Open addressing (linear probing): g(k,i) = (h(k) + i) mod 2r
In the formula above, r and w are two integers such that w > r, and A is a random number such that 2w−1 < A < 2w. In addition, let n be the number of keys inserted, and m the number of slots in the hash tables. Here, we set m = 2r and r = dw/2e. The load factor α is equal to mn .
We want to estimate the number of collisions when inserting keys with respect to keys and the choice of values for A.
We provide you a set of three template files within COMP251HW1.zip that you will complete. This file contains three classes, a main class and one for each hash function. Those contain several helper functions, namely generateRandom that enables you to generate a random number within a specified range. Details on which functions are included, how to use them, and where to add in your code can be found as comments in the java files. Please read them with attention.
Your first task is to complete the two java methods Open_Addressing.probe and Chaining.chain. These methods must implement the hash functions for (respectively) the linear probing and multiplication methods. They take as input a key k, as well as an integer 0 ≤ i < m for the linear probing method, and return a hash value in [0,m[.
Next, you will implement the method insertKey in both classes, which inserts a key k into the hash table and returns the number of collisions encountered before insertion, or the number of collisions encountered before giving up on inserting, if applicable. Note that for this exercise as well as for the rest of the homework, we define the number of collisions in open addressing as the number of keys encountered, or "jumped over" before inserting or removing a key. For chaining, we simply consider the number of other keys in the same bin at the time of insertion as the number collisions. You can assume the key is not negative.
You will also implement a method removeKey, this one only in Open_Addressing. This method should take as input a key k, and remove it from the hash table while visiting the minimum number of slots possible. Like insertKey, it should output the number of collisions. If the key is not in the hash table, the method should simply not change the hash table, and output the number of slots visited. You will notice from the code and comments that empty slots are given a value of −1. If applicable, you are allowed to use a different notation of your choice for slots containing a deleted element.
You can then implement tests in the main class. Make sure to test your assignment thoroughly by thinking about all the different situations that can occur when dealing with hash tables.
For this assignment, you will need to submit a zip file containing the completed version of the three provided java files on codePost.
Exercise 2 Least common multiple This problem aims to study an algorithm that computes, for an integer n ∈ N, the least common multiple (LCM) of integers ≤ n.
For a given integer n ∈ N, let Pn pxkk, where p1,p2,··· ,pk is a strictly increasing sequence of prime numbers between 2 and n and for each i ∈ {1,···k}, xi is the integer such that pxi i ≤ n < pxi i+1. For example, P9 = 23 · 32 · 5 · 7.
More precisely, we’re going to compute all Pj, j ∈ {1,··· ,n} and store pairs of integers (pα,p) in a heap, a binary tree where the element stored in the parent node is strictly smaller than those stored in children nodes. For two given pairs of integers (a,b) and (a0,b0), (a,b) < (a0,b0) if and only if a < a0. Let h denotes the tree height, we admit that h = Θ(logn). All levels of the binary tree are filled with data except for the level h, where elements are stored from the left to the right. After computing Pj, all pairs (pα,p) are stored in the heap such that p is a prime number smaller or equal to j and α is the smallest integer such that j < pα. For instance, after
computing P9, we store (16,2), (27,3), (25,5), and (49,7) in the heap.
The algorithm is iterative. We store in the variable LCM the least common multiple computed so far. At first, LCM= 2 is the LCM of integers smaller than 2 and the heap is constructed with only one node with value (4,2). After finish the (j −1)-th step, we compute the j-th step as follows:
1. If j is a prime number, multiply LCM by j and insert a new node (j2,j) in the heap.
2. Otherwise, if the root (pα,p) satisfies j = pα, then we multiply LCM by p, change the root’s value by (pα+1,p), and reconstruct the heap.
√
We’re going to prove, step by step, that the time complexity of this algorithm is O(n n).
3.1 -
In operation 1, a new node is inserted. What is the complexity of this operation?
3.2 -
In operation 2, the heap is reconstructed. What is the time complexity of this operation?
3.3 -
√
The number of prime numbers smaller than n concerned in the operation 2 is less than n. Prove that the number of times N we need to execute operation 2 to compute Pn is asymptotically negligible compared to n. Tip: you can prove this by proving N is o(n), where o (little o) denotes a strict upper bound.
3.4 -
√
Assume the complexity of assessing whether an integer is a prime number is n and suppose
√ multiplication has a time complexity of 1. Prove that the algorithm’s complexity is O(n n).
3.5 -
Prove that, for a given heap of height h with n nodes, we have h = Θ(logn). No partial credit will be awarded.