Programming Assignment 2
ATTENTION:
● Compress all of the Java source files (.java) into a single zip file.
● The name of the zip file should follow the convention below:
CS102_Sec1_Asgn2_YourSurname_YourName.zip
● Replace the placeholders “Sec1”, “YourSurname” and “YourName” with your actual section, surname and name.
Reducing Document Size
In this assignment, you are going to implement a Java program that reduces document size by replacing words with integers. You should have a method for reading and processing the input text file (“input.txt”) to generate two outputs:
• The first output (“map.txt”) should include which integer corresponds to which word.
• The second output (“encoded.txt”) should include the document where each word is replaced by an integer based on your mapping.
The input text file is guaranteed to contain only lowercase letters and no punctuation; however, you should preserve the line endings in your output. A sample input and corresponding outputs are as follows:
Sample Input/Output for the Process
input.txt:
the red bike was on the road
i kept riding the bike near the black road
the bike was black in the end because of the road
such a road and such a bike

map.txt:
0: the
1: red
2: bike
3: was
4: on
5: road
6: i
7: kept
8: riding
9: near
10: black
11: in
12: end
13: because
14: of
15: such
16: a
17: and

encoded.txt:
0 1 2 3 4 0 5
6 7 8 0 2 9 0 10 5
0 2 3 10 11 0 12 13 14 0 5
15 16 5 17 15 16 2
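
The first-occurrence mapping shown in the sample could be sketched roughly as follows. This is only one possible approach, not a required design: the class name FirstOccurrenceEncoder and the method names buildMap, formatMap and encode are hypothetical, and the sketch works on lines already held in memory rather than on files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative, not part of the assignment.
class FirstOccurrenceEncoder {

    // Assigns 0, 1, 2, ... to words in the order they first appear.
    static Map<String, Integer> buildMap(List<String> lines) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    // map.size() is the next unused integer
                    map.putIfAbsent(word, map.size());
                }
            }
        }
        return map;
    }

    // Produces the "integer: word" lines intended for map.txt.
    static List<String> formatMap(Map<String, Integer> map) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            out.add(e.getValue() + ": " + e.getKey());
        }
        return out;
    }

    // Replaces each word by its integer, keeping one output line per input line.
    static List<String> encode(List<String> lines, Map<String, Integer> map) {
        List<String> encoded = new ArrayList<>();
        for (String line : lines) {
            StringBuilder sb = new StringBuilder();
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) continue;
                if (sb.length() > 0) sb.append(' ');
                sb.append(map.get(word));
            }
            encoded.add(sb.toString());
        }
        return encoded;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "the red bike was on the road",
                "i kept riding the bike near the black road");
        Map<String, Integer> map = buildMap(lines);
        for (String s : encode(lines, map)) {
            System.out.println(s);
        }
    }
}
```

To connect this to the files, the lines could be read with Files.readAllLines(Paths.get("input.txt")) and the two results written with Files.write, both from java.nio.file. Note that Files.write ends every line with the platform line separator, which may or may not match the separator used in the input.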
Check whether your encoded files are smaller in size than the input. Include a second strategy for creating the word mapping. You are free to use any approach here; some suggestions include:
• Assigning random non-overlapping integers to each word.
• Assigning smaller integers to the words with high occurrence frequency.
• Assigning smaller integers to the longer words.
Compare whether the second strategy performs better or worse than the first. Keep both strategies in your final implementation as different methods.
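
As an illustration of one of the suggestions above (smaller integers for more frequent words), a minimal sketch is given below. The class name FrequencyMapper and the helper encodedLength are hypothetical, and breaking ties by first occurrence is an assumption made here, not a requirement of the assignment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative, not part of the assignment.
class FrequencyMapper {

    // Gives 0 to the most frequent word, 1 to the next, and so on.
    static Map<String, Integer> buildMap(List<String> lines) {
        // LinkedHashMap remembers first-occurrence order, used for ties below.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
            }
        }
        List<String> words = new ArrayList<>(counts.keySet());
        // The sort is stable, so equally frequent words keep first-occurrence order.
        words.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        Map<String, Integer> map = new LinkedHashMap<>();
        for (String w : words) map.put(w, map.size());
        return map;
    }

    // Number of characters in the encoded text (digits plus single spaces),
    // not counting line endings. Useful for comparing two mappings.
    static int encodedLength(List<String> lines, Map<String, Integer> map) {
        int total = 0;
        for (String line : lines) {
            int wordsOnLine = 0;
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) continue;
                total += String.valueOf(map.get(word)).length();
                wordsOnLine++;
            }
            if (wordsOnLine > 1) total += wordsOnLine - 1; // separating spaces
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b b", "c b c");
        System.out.println(buildMap(lines)); // prints {b=0, c=1, a=2}
    }
}
```

Calling encodedLength with the first-occurrence map and with the frequency map on the same input gives two numbers that can be compared directly. The size of map.txt is not included in that count.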
You should also have a method for the reverse operation. This reverse operation should receive “map.txt” and “encoded.txt” and reconstruct the original document as “decoded.txt”.
Sample Input/Output for the Reverse Process
map.txt:
11: in
0: the
1: red
16: a
17: and
8: riding
9: near
6: i
7: kept
10: black
12: end
13: because
14: of
15: such
2: bike
3: was
4: on
5: road

encoded.txt:
0 1 2 3 4 0 5
6 7 8 0 2 9 0 10 5
0 2 3 10 11 0 12 13 14 0 5
15 16 5 17 15 16 2

decoded.txt:
the red bike was on the road
i kept riding the bike near the black road
the bike was black in the end because of the road
such a road and such a bike
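
The reverse process in the sample could be sketched as below. The class name Decoder and the methods parseMap and decode are hypothetical. The sketch assumes every line of the map has the form “integer: word” and, as in the sample, does not rely on the map lines being in any particular order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; names are illustrative, not part of the assignment.
class Decoder {

    // Parses lines of the form "integer: word", in any order.
    static Map<Integer, String> parseMap(List<String> mapLines) {
        Map<Integer, String> map = new HashMap<>();
        for (String line : mapLines) {
            if (line.trim().isEmpty()) continue;
            int colon = line.indexOf(':');
            int id = Integer.parseInt(line.substring(0, colon).trim());
            String word = line.substring(colon + 1).trim();
            map.put(id, word);
        }
        return map;
    }

    // Replaces each integer by its word, keeping one output line per input line.
    static List<String> decode(List<String> encodedLines, Map<Integer, String> map) {
        List<String> decoded = new ArrayList<>();
        for (String line : encodedLines) {
            StringBuilder sb = new StringBuilder();
            for (String token : line.trim().split("\\s+")) {
                if (token.isEmpty()) continue;
                if (sb.length() > 0) sb.append(' ');
                sb.append(map.get(Integer.parseInt(token)));
            }
            decoded.add(sb.toString());
        }
        return decoded;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = parseMap(Arrays.asList("1: red", "0: the", "2: bike"));
        for (String s : decode(Arrays.asList("0 1 2", "2 0"), map)) {
            System.out.println(s);
        }
    }
}
```

A simple check of the whole program is to encode “input.txt”, decode the result, and confirm that “decoded.txt” matches the original input line by line.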
Preliminary Submission: You will submit an early version of your solution before the final submission. This version should at least include the following:
• The required functionality to read and create “map.txt” and “encoded.txt” should be complete using the initial approach (word indexes are determined based on first occurrence). Do not forget to test your method with sample files.
Not completing the preliminary submission on time results in a 50% reduction of this assignment’s final grade.