Starting from:

$34.99

COMP9319 Assignment 1: Entropy and Size Determination of Huffmam Solution


This is a warm-up assignment for you to practise the knowledge that you learned from the first 2 weeks and to get familiar with the programming tools (e.g., makefile). Your task in this assignment is to implement a C or C++ program that determines the average number of bits to encode a given file using static Huffman encoding and LZW as well as its entropy. The given file can be a text or binary file (e.g., images, videos or executable programs). When you determine the average number of bits, you assume that the size of the encoded file should be the minimum size (i.e., do not need to include any overheads such as the size of any statistic information) and should not consider adding any other encoding / further optimisation into it.
Your program should be called csize. Your program should accept: 1) an optional commandline argument -s and a number between 9 and 18 (inclusively) to specify the fixed width (in bits) of the codes for LZW (where the width determines the number of dictionary entries available). 2) a path to the given file to be encoded (e.g.,
~/folder1/subfolder2/file2.jpg). If -s is omitted, the default width of 12 bits is assumed.

Your program csize will output 3 numbers, one each line, and nothing else (e.g., do not output any spaces). These 3 numbers are the Shannon's entropy, the average of bits per symbol for static Huffman, and the average of bits per symbol for LZW, respectively. Each number must be round to two decimal places (even if the number is an integer, output 00 for the two decimal places). For example:
%wagner> csize ~/Desktop/test.jpg
2.76 3.00
3.40
%wagner>
In the above example, 2.76 is the entropy; 3.00 and 3.40 are for Huffman and LZW, respectively.

Marks will be deducted if you do not follow the output format described above. Your solution will be compiled and run on a typical CSE Linux machine e.g. grieg. Your solution should read the input file as read-only (because you might not have write permission to the file) and should not write out any external files. Any solution that fails to compile on a CSE Linux machine, fails to read a read-only file, or writes out external files, will receive zero points for the entire assignment.
Compilation and Usage Example

An example showing the usage and output of csize is as follows:
cs9319@grieg:~/a1$ ls -s test.txt
-rw------- 1 cs9319 cs9319 19 Jun 13 17:09 test.txt
cs9319@grieg:~/a1$ more test.txt
^WED^WE^WEE^WEB^WET
cs9319@grieg:~/a1$ cat test.txt
^WED^WE^WEE^WEB^WETcs9319@grieg:~/a1$ wc test.txt
0 1 19 test.txt
cs9319@grieg:~/a1$ csize test.txt
2.21 2.26
7.58
cs9319@grieg:~/a1$ csize -s 10 test.txt
2.21 2.26
6.32 cs9319@grieg:~/a1$
Performance
Your solution will be marked based on memory and runtime performance. Your soluton will not be tested against any files that are larger than 1MB. Any solution that takes more than 3 seconds (sum of the sys and usr time using the time command, /usr/bin/time) to run on a given test file on grieg will be killed and you will receive zero points for that auto test.
Documentation and Code Readability
Remarks
1. It is part of the assignment that you are responsible for the correctness of your program output. For example, you can test your program by starting with some small files and manually verify the entropy and average bit calculations.
4. Marks will be deducted for output of any extra text, other than the required, correct answer (i.e., the number of matches as shown in the examples above). This extra information includes (but not limited to) debugging messages, line numbers and so on.
Submission
Use the give command below to submit the assignment: give cs9319 a1 *.h *.c makefile or:
give cs9319 a1 *.h *.cpp makefile
Note that the give command is available on any CSE linux machines (such as wagner) but not on grieg. You can check your submission status by:
9319 classrun -check a1



More products