Starting from:

$35

CS202- Homework 3: Heaps and AVL Trees Solved

Your ZIP archive should contain the following files:pdf, the file containing the answers to Part 1,
h, MaxHeap.cpp, MinHeap.h, MinHeap.cpp, MedianHeap.h, MedianHeap.cpp, HuffmanHeap.h, HuffmanHeap.cpp, HuffmanQueue.h, HuffmanQueue.cpp, HuffmanCode.h, HuffmanCode.cpp, main.cpp files which contain the C++ source codes, and the Makefile.
Do not forget to put your name, student ID, and section number in all of these files. Well comment your implementation. Add a header as in Listing 1 to the beginning of each file:
Listing 1: Header style

/**

Title: Heaps and AVL Trees
Author: Name Surname
ID: 21XXXXXX
Section: X
Assignment: 3
Description: description of your code
*/

Do not put any unnecessary files such as the auxiliary files generated from your favorite IDE. Be careful to avoid using any OS dependent utilities (for example to measure the time).
1

You should prepare the answers of Part 1 using a word processor (in other words, do not submit images of handwritten answers).
Please use the algorithms as exactly shown in lectures.
Although you may use any platform or any operating system to implement your algorithms and obtain your experimental results, your code should work in a Linux environment (specifically on the Dijkstra ’dijkstra.ug.bcc.bilkent.edu.tr’ server). We will compile your programs with the g++ compiler and test your codes in a Linux environment. Thus, you may lose significant amount of points if your C++ code does not compile or execute in a Linux environment.
Please make sure that you are aware of the homework grading policy that is explained in the rubric for homeworks
This homework will be graded by your TA, Alihan Okka. Thus, please contact him directly for any homework related questions.
Attention: For this assignment, you are allowed to use the codes given in our textbook and/or our lecture slides. However, you ARE NOT ALLOWED to use any codes from other sources (including the codes given in other textbooks, found on the Internet, belonging to your classmates, etc.). Furthermore, you ARE NOT ALLOWED to use any data structure or algorithm related function from the C++ standard template library (STL).

Do not forget that plagiarism and cheating will be heavily punished. Please do the homework yourself.

Part 1 
[5 points] Insert the values 13,9,10,21,11,16,4,12,6 into an initially empty AVL tree in order. Show only the final tree after all insertions. Then, delete 10,9,6 in given order. Show only the final tree after all deletion operations.
[5 points] Insert 15,13,9,5,12,8,7,4,0,6,2,1,3 into an empty min-heap. Show only the final heap as a tree (not as an array) after all insertions.
[15 points] Let T1 and T2 be AVL trees such that the largest key in T1 is less than the smallest key in T2. Give an algorithm (in psuedocode) for procedure JOIN-AVLTREES(T1, T2) that joins these two AVL trees. The runtime should be O(logn), where n is the size of the resulting AVL tree. Justify in a few of sentences the correctness and efficiency of your algorithm.
Part 2 – Running Median 
Use the given file names and function signatures during implementation. Feel free to write helper functions to accomplish certain tasks but remember to give meaningful function names.

[7,5 points] Implement an array-based max-heap data structure named as MaxHeap for maintaining a list of integer values that supports the following operations:
Listing 2: MaxHeap

void insert(int value); // inserts integer into heap int peek(); // returns the value of the max element int extractMax(); // retrieves and removes the max element int size(); // returns the number of elements in the heap

Put your code into MaxHeap.h and MaxHeap.cpp files.

[7,5 points] Implement an array-based min-heap data structure named as MinHeap for maintaining a list of integer values that supports the following operations:
Listing 3: MinHeap

void insert(int value); // inserts integer into heap int peek(); // returns the value of the min element int extractMin(); // retrieves and removes the min element int size(); // returns the number of elements in the heap

Put your code into MinHeap.h and MinHeap.cpp files.

[20 points] Design a MedianHeap data structure reusing MaxHeap and MinHeap, which supports getting median of the elements in O(1) time and inserting an element in O(logn) If the length of the list is even, choose the larger value of the two middle elements as the median.
Listing 4: MedianHeap

void insert(int value); //inserts an element into MedianHeap int findMedian(); //returns the value of the median

Put your code into MedianHeap.h and MedianHeap.cpp files.

Part 3 – Huffman Coding 
Huffman coding, in essence, is a lossless data compression algorithm. Consider the data which is a sequence of characters encoded in ASCII. In ASCII, every character is encoded with the same number of bits: 8 bits per character. The common characters, e.g., alphanumeric characters, punctuation, control characters, etc., use only 7 bits; there are 128 different characters that can be encoded with 7 bits. Huffman coding on the other hand, compresses data by using fewer bits to encode more frequently occurring characters so that not all characters are encoded with 8 bits.

Suppose that we have a 100,000 character data file that we wish to store. We observe that the characters in the file occur with the frequencies given by Table 1. That is, only 6 different characters appear, and the character ’a’ occurs 45,000 times. If we store it without compression we would need 700,000 bits. However with Huffman coding, we can compress the file to (45 · 1 + 13 · 3 + 12 · 3 + 16 · 3 + 9 · 4 + 5 · 4)·1,000 = 224,000 bits.

Table 1: A character-coding

                                                                         a               b                c                d               e                f

Frequency (in thousands)             45             13              12              16              9                5

ASCII                                         1100001   1100010   1100011   1100100   1100101     1100110

Variable-length codeword             0             101            100            111          1101         1100

The main goal is to create a tree that corresponds to coding schemes that will be used to encode/decode a file. Huffman trees are a type of binary tree used in performing this kind of conversion. A Huffman tree includes all the characters of your data on the leaf nodes, with most frequent characters being closer to the root. The distance and path away from the root is used to determine what bits (what 0s and 1s) are used to represent each character. The Huffman tree of Table 1 is as follows.

In the pseudocode that follows, we assume that C is a set of n characters and that each character c in C is an object with an attribute c.freq giving its frequency. The algorithm uses a min-priority queue Q, keyed on the freq attribute, to identify the two least-frequent objects to merge together.

Algorithm 1: HUFFMAN(C)

n = |C|;
Q = C;
for i = 1 to n - 1 do
4allocate a new node z;

5z.left = x = EXTRACT_MIN(Q);

6z.right = y = EXTRACT_MIN(Q);

7z.freq = x.freq + y.freq;

8INSERT(Q,z);

end
return EXTRACT_MIN(Q)
The steps of creating the Huffman tree of Table I is given in the next page as an example [1]. Your first objective should be to implement the min-heap data structure as an array of pointers in which each pointer points to a MinHeapNode.

Listing 5: MinHeapNode

struct MinHeapNode

{

char character; // An input character int freq; // Frequency of the character

MinHeapNode *left, *right; // Left and right child

};

Put your min-heap code into HuffmanHeap.h and HuffmanHeap.cpp files. After you are finished, transform your heap into a min-priority queue implementing Huffman coding and put the codes into HuffmanQueue.h and HuffmanQueue.cpp files. Design the Huffman coding algorithm and put your code into HuffmanCode.h and HuffmanCode.cpp files.

Add a main function to the main.cpp file which reads an input file of characters and their frequencies (i.e., e 9), and writes their coding schemes (i.e., e 1101) on an output file. Take input and output file names as arguments. At the end, write a basic makefile which compiles all your code and creates an executable file named hw3.

References
[1] T.M. Cormen, C. E. Leiserson, R. L. Rivest & C. Stein. Introduction to Algorithms,

3rd edition. The MIT Press. 2001. 432

Your ZIP archive should contain the following files:pdf, the file containing the answers to Part 1,
h, MaxHeap.cpp, MinHeap.h, MinHeap.cpp, MedianHeap.h, MedianHeap.cpp, HuffmanHeap.h, HuffmanHeap.cpp, HuffmanQueue.h, HuffmanQueue.cpp, HuffmanCode.h, HuffmanCode.cpp, main.cpp files which contain the C++ source codes, and the Makefile.
Do not forget to put your name, student ID, and section number in all of these files. Well comment your implementation. Add a header as in Listing 1 to the beginning of each file:
Listing 1: Header style

/**

Title: Heaps and AVL Trees
Author: Name Surname
ID: 21XXXXXX
Section: X
Assignment: 3
Description: description of your code
*/

Do not put any unnecessary files such as the auxiliary files generated from your favorite IDE. Be careful to avoid using any OS dependent utilities (for example to measure the time).
1

You should prepare the answers of Part 1 using a word processor (in other words, do not submit images of handwritten answers).
Please use the algorithms as exactly shown in lectures.
Although you may use any platform or any operating system to implement your algorithms and obtain your experimental results, your code should work in a Linux environment (specifically on the Dijkstra ’dijkstra.ug.bcc.bilkent.edu.tr’ server). We will compile your programs with the g++ compiler and test your codes in a Linux environment. Thus, you may lose significant amount of points if your C++ code does not compile or execute in a Linux environment.
Please make sure that you are aware of the homework grading policy that is explained in the rubric for homeworks
This homework will be graded by your TA, Alihan Okka. Thus, please contact him directly for any homework related questions.
Attention: For this assignment, you are allowed to use the codes given in our textbook and/or our lecture slides. However, you ARE NOT ALLOWED to use any codes from other sources (including the codes given in other textbooks, found on the Internet, belonging to your classmates, etc.). Furthermore, you ARE NOT ALLOWED to use any data structure or algorithm related function from the C++ standard template library (STL).

Do not forget that plagiarism and cheating will be heavily punished. Please do the homework yourself.

Part 1 – 25 points
[5 points] Insert the values 13,9,10,21,11,16,4,12,6 into an initially empty AVL tree in order. Show only the final tree after all insertions. Then, delete 10,9,6 in given order. Show only the final tree after all deletion operations.
[5 points] Insert 15,13,9,5,12,8,7,4,0,6,2,1,3 into an empty min-heap. Show only the final heap as a tree (not as an array) after all insertions.
[15 points] Let T1 and T2 be AVL trees such that the largest key in T1 is less than the smallest key in T2. Give an algorithm (in psuedocode) for procedure JOIN-AVLTREES(T1, T2) that joins these two AVL trees. The runtime should be O(logn), where n is the size of the resulting AVL tree. Justify in a few of sentences the correctness and efficiency of your algorithm.
Part 2 – Running Median 
Use the given file names and function signatures during implementation. Feel free to write helper functions to accomplish certain tasks but remember to give meaningful function names.

[7,5 points] Implement an array-based max-heap data structure named as MaxHeap for maintaining a list of integer values that supports the following operations:
Listing 2: MaxHeap

void insert(int value); // inserts integer into heap int peek(); // returns the value of the max element int extractMax(); // retrieves and removes the max element int size(); // returns the number of elements in the heap

Put your code into MaxHeap.h and MaxHeap.cpp files.

[7,5 points] Implement an array-based min-heap data structure named as MinHeap for maintaining a list of integer values that supports the following operations:
Listing 3: MinHeap

void insert(int value); // inserts integer into heap int peek(); // returns the value of the min element int extractMin(); // retrieves and removes the min element int size(); // returns the number of elements in the heap

Put your code into MinHeap.h and MinHeap.cpp files.

[20 points] Design a MedianHeap data structure reusing MaxHeap and MinHeap, which supports getting median of the elements in O(1) time and inserting an element in O(logn) If the length of the list is even, choose the larger value of the two middle elements as the median.
Listing 4: MedianHeap

void insert(int value); //inserts an element into MedianHeap int findMedian(); //returns the value of the median

Put your code into MedianHeap.h and MedianHeap.cpp files.

Part 3 – Huffman Coding 
Huffman coding, in essence, is a lossless data compression algorithm. Consider the data which is a sequence of characters encoded in ASCII. In ASCII, every character is encoded with the same number of bits: 8 bits per character. The common characters, e.g., alphanumeric characters, punctuation, control characters, etc., use only 7 bits; there are 128 different characters that can be encoded with 7 bits. Huffman coding on the other hand, compresses data by using fewer bits to encode more frequently occurring characters so that not all characters are encoded with 8 bits.

Suppose that we have a 100,000 character data file that we wish to store. We observe that the characters in the file occur with the frequencies given by Table 1. That is, only 6 different characters appear, and the character ’a’ occurs 45,000 times. If we store it without compression we would need 700,000 bits. However with Huffman coding, we can compress the file to (45 · 1 + 13 · 3 + 12 · 3 + 16 · 3 + 9 · 4 + 5 · 4)·1,000 = 224,000 bits.

Table 1: A character-coding

                                                                         a               b                c                d               e                f

Frequency (in thousands)             45             13              12              16              9                5

ASCII                                         1100001   1100010   1100011   1100100   1100101     1100110

Variable-length codeword             0             101            100            111          1101         1100

The main goal is to create a tree that corresponds to coding schemes that will be used to encode/decode a file. Huffman trees are a type of binary tree used in performing this kind of conversion. A Huffman tree includes all the characters of your data on the leaf nodes, with most frequent characters being closer to the root. The distance and path away from the root is used to determine what bits (what 0s and 1s) are used to represent each character. The Huffman tree of Table 1 is as follows.

In the pseudocode that follows, we assume that C is a set of n characters and that each character c in C is an object with an attribute c.freq giving its frequency. The algorithm uses a min-priority queue Q, keyed on the freq attribute, to identify the two least-frequent objects to merge together.

Algorithm 1: HUFFMAN(C)

n = |C|;
Q = C;
for i = 1 to n - 1 do
4allocate a new node z;

5z.left = x = EXTRACT_MIN(Q);

6z.right = y = EXTRACT_MIN(Q);

7z.freq = x.freq + y.freq;

8INSERT(Q,z);

end
return EXTRACT_MIN(Q)
The steps of creating the Huffman tree of Table I is given in the next page as an example [1]. Your first objective should be to implement the min-heap data structure as an array of pointers in which each pointer points to a MinHeapNode.

Listing 5: MinHeapNode

struct MinHeapNode

{

char character; // An input character int freq; // Frequency of the character

MinHeapNode *left, *right; // Left and right child

};

Put your min-heap code into HuffmanHeap.h and HuffmanHeap.cpp files. After you are finished, transform your heap into a min-priority queue implementing Huffman coding and put the codes into HuffmanQueue.h and HuffmanQueue.cpp files. Design the Huffman coding algorithm and put your code into HuffmanCode.h and HuffmanCode.cpp files.

Add a main function to the main.cpp file which reads an input file of characters and their frequencies (i.e., e 9), and writes their coding schemes (i.e., e 1101) on an output file. Take input and output file names as arguments. At the end, write a basic makefile which compiles all your code and creates an executable file named hw3.

References
[1] T.M. Cormen, C. E. Leiserson, R. L. Rivest & C. Stein. Introduction to Algorithms,

3rd edition. The MIT Press. 2001. 432

More products