Starting from:

$29.99

EECS281 Programming Project 4 Gotta Catch 'em All! v1.1 Solution

Overview
In Trainer Evasion Mode (Part A), we assume you have not faced any trainers and you must fight some of them such that all pokemon in the country are reachable. You need to create Trainer-Free paths by fighting as few trainers as possible and by traveling the least distance (one metric will account for both of these).
In Champion Mode (Parts B and C), we assume you have visited all of the country, beaten every Trainer, and have access to Fly, meaning you can skip fighting trainers altogether and fly from pokemon to pokemon. You will be constructing a path (rather, a cycle, as you return home) to catch every pokemon while minimizing the distance traveled.
In this project, your output does not need to be exactly the same as ours to be correct. As long as it provides a valid answer, and follows the rules for formatting output, it can be in a different order and still be correct.
Project Goals
● Understand and implement MST algorithms. Be able to determine whether Prim’s or Kruskal’s is more efficient for a particular scenario.
● Understand and implement a Branch and Bound algorithm. Develop a fast and effective bounding algorithm.
● Explore various heuristic approaches to achieving a nearly-optimal solution as fast as possible. ○ Do some web searching for “TSP heuristics”, don’t choose one worse than O(n2).
● Use a visualization tool to help with debugging.

1 Pikachu is the mascot of Pokemon; it is unimportant for the sake of this project.
2 Surfing is where a trainer rides a water-type pokemon to travel on water. Flying is where a trainer rides a flying-type pokemon to fly directly to different locations
Version 03-30-22
Written by many 281 staff
Map Input
On startup, your program, poke, reads input from standard input (cin) describing the locations of the pokemon in the country. The country is mapped on a grid with water on the bottom left quadrant (see ‘Path Rules’). You will be given a list of M pokemon in the country with associated integer coordinates (x, y). Pokemon are identified by integer indices which correspond to the order in which they are read in (the first pokemon coordinate you read corresponds to pokemon location 0, the second pokemon coordinate to pokemon location
1, etc.). For parts B and C you always start at the location of the 0-th pokemon.
Sample:
5
6 1
2 3
-5 -4
-1 6
0 -1
The above sample can be visualized as in the figure below, where the numbers shown are the pokemon indices that starts at the 0-th pokemon, and the blue line indicates the coastline (for more details, see the 'Path Rules' section) which separates land from sea: pokemon 2 is in the sea, 4 is on the coastline, the rest are on
land..

There is more than one way to represent this configuration internally in your program, and this will affect your runtime. Choose your data structures wisely!
For Part A, there will always be at least two points. For Parts B and C, there will always be at least 3.
Path Rules
Distance
We represent the paths you take as a set of pairs of points or as a sequence of points. Your calculations should use Euclidean distance, and you should represent your distances as doubles.
In Parts B and C, you have access to Fly, meaning you can fly directly from location to location. You move along the line segments that connect pokemon. Please be very careful with rounding errors in your distance calculations; the autograder will allow you a 0.01 margin of error to account for rounding, but you should avoid rounding at intermediate steps.
The Sea
For the sake of simplicity, assume two land locations that "cross" the sea can be connected by a direct path. The example below shows two validly connected land pokemon, at locations A and B.

In this example, there is a direct path from A to B.
Other
Important: For parts B and C, you always start at the 0-th pokemon location.
You are also not allowed to ‘wrap around the edges of the world’ (you cannot go above the top of the map to arrive at the bottom).
Command Line Input
Your program, poke, should take the following case-sensitive command line options:
● -m, --mode <MODE>
This command line option must be specified, if it is not, print a useful error message to standard error (cerr) and exit(1). MODE is a required argument. Set the program mode to MODE. MODE must be one of MST, FASTTSP, or OPTTSP. The MODE corresponds to the algorithm poke runs (and therefore what it outputs).
● -h, --help
Print a short description of this program and its arguments and exit(0).
Valid examples of how to execute the program:
poke --mode MST (OK, but you must type the input by hand or copy/paste it) poke -h < inputFile.txt (OK, -h happens before we realize there’s no -m; input ignored) poke -m OPTTSP < inputFile.txt (OK, reads from a file on disk)
poke -m BLAH (BAD mode following -m)
Remember that when we redirect input, it does not affect the command line. Redirecting input just sends the file contents to cin. You should not have to modify your code to allow this to work; the operating system will handle it.
Algorithms
Part A – MST mode
Description
This mode will devise a path that visits every pokemon location while minimizing the total distance of that path.
Remember that in this part, you must respect the boundary between land and sea. Your MST cannot connect a pokemon in the sea to one on land, without passing through a pokemon on the coastline.
Output Format (for Part A only)
For the MST mode, you should print the total weight of the MST you generate by itself on a line; this weight is the sum of the weights of all edges in your MST (in terms of Euclidean distance). You should then print all edges in the MST. All output should be printed to standard output (cout).
The output should be of the format:
weight node node node node ......
The nodes are the pokemon location numbers corresponding to the vertices of the MST and a pair of nodes on a given line of the output describes an edge in the MST from the first node to the second. To be clear, the weight should be formatted as a double (2 decimal point precision is enough - see Appendix A), and the node numbers should all be integer values when printed. For example, given the example input file above, your MST mode output might be:
19.02
0 1
2 4
1 3
1 4
You should also always print the pairs of vertices that describe an edge such that the index on the left has a smaller integer value than the index on the right. In other words:
1 2
is a possible valid edge of output, but
2 1
is not.
If it is not possible to construct an MST for all pokemon (i.e. if there are pokemon on both the land and in the sea, with no pokemon on the coastline), your program should print the message "Cannot construct MST" to cerr and exit(1).
Parts B & C (FASTTSP & OPTTSP mode)
Description (for both Parts B and C)
In this mode, you will figure out how to travel to every pokemon and then return to your starting location. The route will always start at pokemon index 0, visit every other pokemon exactly once, and return to the starting point. Your job will thus be to solve the TSP (Traveling Salesperson Problem) and choose paths to pokemon locations so as to minimize the total distance traveled. Euclidean (straight-line) distance is used here again to compute distances between pokemon. Because you have now discovered a flying pokemon, there is no longer any restrictions on land-coastline-sea; you can fly from any pokemon location to any other.
For FASTTSP mode, you do not need to produce an optimal TSP tour, but your solution should be close to optimal. Because your FASTTSP algorithm does not need to be perfectly optimal, we expect it to run much faster than finding a perfect optimal solution. Do some online searching for “TSP heuristics”. There are several types, some easier to implement, some with better path lengths, some both.
For OPTTSP mode, you must find an optimal solution to the TSP (the actual minimum Euclidean distance necessary). More on the differences between OPTTSP and FASTTSP modes will be discussed later.
For both methods:
You must start the tour from the 0-th pokemon location (your starting location).
You must visit every pokemon exactly once (there's no point in returning to a place already checked), and at the end of the tour, you must return back to the 0-th location.
The length of a tour is defined as the sum of all pairwise distances traveled - that is, the sum of the lengths of all of the paths taken (using Euclidean distance).
Your program must print the indices of the locations in an order such that the length of this tour is as small as possible. More details about how to accomplish this are listed below.
Output Format (for both Parts B and C)
You should begin your output by printing the overall length of your tour on a line. On the next line, output the nodes in the order in which you visit them. The initial node should be the starting location index and the last node should be the location number directly before returning back to the 0-th location. The nodes in your tour should be printed such that they are separated by a single space. There can be a space after the last location listed. All output should be printed to standard output (cout).
For example, if given the input file above, your program could produce:
31.64
0 4 2 3 1
or
31.64
0 1 3 2 4
Part B Only Description
You should be able to produce a solution to the Traveling Salesperson Problem (TSP) that is as close as possible to the optimal tour length (though it does not need to be optimal). In the best case, your produced tour length will equal the optimal tour length.
You are allowed to use any combination of algorithms for this section that we have covered in class, including the MST algorithm you wrote for Part A.
You will need to be creative when designing your algorithms for this section. You are free to implement any other algorithm you choose, so long as it meets the time and memory constraints. However, you should not use any advanced algorithms or formulas (such as Simulated Annealing, Genetic Algorithms and Tabu search they are too slow) that are significantly different from what has been covered in class. Instead, creatively combine the algorithms that you already know and come up with concise optimizations. Your heuristic will very likely be greedy in some way, but there are different ways to be greedy! Just make sure to limit yourself to heuristics whose time complexity is O(n2).
Part C Only Description
To find an optimal tour, you could start with the brute force method of exhaustive enumeration that evaluates every tour and picks a smallest tour. By structuring this enumeration in a clever way, you could determine that some branches of the search cannot lead to optimal solutions. For example, you could compute lower bounds on the length of any full tour that can be found in a given branch. If such a lower bound exceeds the cost of a full solution you have found previously, you can skip this branch as hopeless. If implemented correctly, such a branch-and-bound method should always produce an optimal solution. It will not scale as well as sorting or searching algorithms do for other problems, but it should be usable with a small number of locations. Clever optimizations (identifying hopeless branches of search early) can make your algorithm a hundred times faster. Drawing TSP tours on paper and solving small location configurations to optimality by hand should be very useful. Remember that there is a tradeoff between the time it takes to run your bounding function and the number of branches that bound lets you prune.
4 folder) in the file named genPerms.txt, which is safe to copy from (PDF files are not).
Given an input set of N locations defined by integer coordinates, your job is to produce an optimal tour using branch-and-bound algorithms. Your program should always produce the shortest possible tour as a solution, even if computing that solution is time-consuming. You will be given a 35-second cpu time limit to generate your solution. If your program does not produce a valid solution, it will fail the test case. Your solution will also be judged by time and space budgets as per previous projects.
Libraries and Restrictions
We highly encourage the use of the STL for this project, with the exception of several prohibited features. Do not use:
● The C++11 regular expressions library
● The STL thread/atomics libraries (which spoil runtime measurements) ● Shared or unique pointers.
● Other libraries (e.g., boost, pthreads, etc).
Testing and Debugging
Part of this project is to prepare several test files that will expose defects in buggy solutions - your own or someone else’s. As this should be useful for testing and debugging your programs, we recommend that you first try to catch a few of our intentionally-buggy solutions with your test files, before completing your solution. The autograder will also tell you if one of your own test files exposes bugs in your solution.
Each test that you submit should consist of an input file. When we run your test files on one of
intentionally-buggy project solutions, we compare the output to that of a correct project solution. If the outputs differ, the test file is said to expose that bug.
Test files should be named test-n-MODE.txt where 1 ≤ n ≤ 10. The autograder’s buggy solutions will run your test files in the specified MODE. The mode must be MST, FASTTSP, or OPTTSP.
Submitting to the Autograder
Do all of your work (with all needed source code files, as well as test files) in some directory other than your home directory. This will be your "submit directory". Before you turn in your code, be sure that:
● Every source code and header file contains the following project identifier in a comment at the top of the file:
● // Project Identifier: 5949F553E20B650AB0FB2266D3C0822B13D248B0
● The Makefile must also have this identifier (in the first TODO block)
● You have deleted all .o files and your executable(s). Typing ‘make clean’ shall accomplish this.
● Your makefile is called Makefile. Typing ‘make -R -r’ builds your code without errors and generates an executable file called poke. The command-line options -R and -r disable automatic build rules, which will not work on the autograder.
● Your Makefile specifies that you are compiling with the gcc optimization option -O3. This is extremely important for getting all of the performance points, as -O3 can speed up code by an order of magnitude.
● The total size of your program and test files does not exceed 2MB.
● You don't have any unnecessary files (including temporary files created by your text editor and compiler, etc) or subdirectories in your submit directory (i.e. the .git folder used by git source code management).
● Your code compiles and runs correctly using version 6.2.0 of the g++ compiler. This is available on the CAEN Linux systems (that you can access via login.engin.umich.edu). Even if everything seems to work on another operating system or with different versions of GCC, the course staff will not support anything other than GCC 6.2.0 running on Linux (students using other compilers and OS did observe incompatibilities). To compile with g++ version 6.2.0 on CAEN you must put the following at the top of your Makefile:
PATH := /usr/um/gcc-6.2.0/bin:$(PATH)
LD_LIBRARY_PATH := /usr/um/gcc-6.2.0/lib64
LD_RUN_PATH := /usr/um/gcc-6.2.0/lib64
Turn in all of the following files:
● All your .h, .hpp, and/or .cpp files for the project
● Your Makefile
● Your test files
You must prepare a compressed tar archive (.tar.gz file) of all of your files to submit to the autograder. One way
to do this is to have all of your files for submission (and nothing else) in one directory. In this directory, run
dos2unix *; tar czvf ./submit.tar.gz *.cpp *.h Makefile test-*.txt
This will prepare a suitable file in your working directory. Alternatively, the 281 Makefile has useful targets fullsubmit and partialsubmit that will do this for you. Use the command make help to find out what else it can do!
Submit your project files directly to either of the two autograders at:
Grading
80 points -- Your grade will be derived from correctness and performance (runtime). This will be determined by the autograder. On this project we expect a much broader spread of runtimes than on previous projects. As with all projects, the test cases used for the final grading are likely to be different.
10 points -- Student test file coverage (effectiveness at exposing buggy solutions).
Runtime Quality Tradeoffs
In this project there is no single correct answer (unlike previous projects). Accordingly, the grading of your problem will not be as simple as a ‘diff’, but will instead be a result of evaluating your output. For example, if we gave you a square for Part A, you might choose any 3 of the 4 edges, and print them in any order, meaning there are 24 possible correct output files!
In addition to time and memory, your score for Part B test cases will be determined based on how close you are to a good solution, computed as a percentage. As with time and memory, having a cycle length that is over by a small amount is OK, but at some threshold you start losing points, and the further over you are, the more points you lose.
Hints and Advice
There's a project 4 tutorial video available. There's also a video that shows how a greedy nearest neighbor implementation looks while it's running (part B). It shows how this approach produces a suboptimal tour because it contains crossings, and how 2-OPT removes them.
It can be difficult to get this project correct without visualizing your MSTs and TSP tours. We have provided a visualization tool on the Autograder that you can use; follow this link: https://g281-2.eecs.umich.edu/p4viz/.
Running your code (on CAEN or locally) using valgrind can help you find and remove undefined (buggy) behavior and memory leaks from your code. This can save you from losing points in the final run when you mistakenly believe your code to be correct.
Make sure that you are using getopt_long() for handling command-line options.
Appendix A
In order to ensure that your output is within the tolerable margins of error for the autograder to correctly grade your program you must run the following lines of code before you output anything to cout. We highly recommend that you simply put them at the top of main() so that you don’t forget about them.
cout << std::setprecision(2); // Always show 2 decimal places
cout << std::fixed; // Disable scientific notation for large numbers You will need to #include <iomanip> to use this code.
Appendix B
One possible heuristic (NOT the only one) is to construct a starting point for your TSP tour by following MST edges and skipping previously visited vertices. By using this technique correctly, you can find a tour that is guaranteed to be no more than twice the optimal solution’s length (and use this “2x” check for debugging). You can then use this starting point to make adjustments and do better. This is not necessarily the best approach! It is one approach used to illustrate how a heuristic can be used, there are better ones out there. This example method is VERY hard to code. Do some research on other heuristics!
“Corner Cutting” algorithm illustrated
Suppose locations are distributed in an input map as shown below:

Below is an MST that would be formed from the above locations.

Below is a path taken by strictly following the edges of the MST.

However, by “cutting corners,” as described in the picture below, an effective TSP solution can be generated. This is possible because once a vertex is visited, it will not be visited again. In the above path that strictly follows the edges of the MST, the middle vertex is visited four times (visited after an outer vertex is visited). If the middle vertex is skipped after the first visit, a TSP tour shown below is achieved.

The reason we bring up this ‘twice-around-the-tree’ heuristic is to state the theorem that the resulting tours are no worse than 2x the MST cost.
The following is an explanation/proof sketch as to why the 2x bound is valid:
If you perform a DFS traversal of a tree and return to the initial node, you have visited every edge exactly twice ("going in" and "returning"), so this gives you exactly double the cost/length of the MST (the fact that the tree is minimal is not important for the logic of the proof - this works for any tree). Since the tree is spanning, you have visited all locations. The only problem with this twice-around-the-tree self-intersecting path is that it's not a tour.
It can be turned into a tour by taking shortcuts (i.e., taking shortcuts is important not only to reduce the length).
When considering other heuristics:
1. The theorem allows you to upper-bound the cost of optimal TSP tours based on MST length.
2. The theorem has a constructive proof - a heuristic that always produces TSP tours that satisfy this upper bound .
As a consequence, your heuristic should also satisfy the 2x upper bound; if not, you can just implement the twice-around-the-tree heuristic. However, we do not require this because there are much better heuristics – ones that are faster and produce better results.
Appendix C
Legend for test case names
Autograder test names are broken down by the mode used with each test:
● Name of test file starting with “A”: these test files are used with the MST mode.
● Name of test file starting with “B”: these test files are used with the FASTTSP mode. ● Name of test file starting with “C”: these test files are used with the OPTTSP mode.
Test files that start with “C” are much smaller in size than those that start with “A” and “B”. The number of pokemon for such files is smaller than 40 as compared to ten thousands of pokemon for the others. The reason is that TSP is NP-complete and finding an optimal solution in a reasonable amount of time is only possible for small-sized problems.
There are 8 bugs in our solution for you to find with your test files, you will start earning points after finding 3 bugs, and will receive full points for finding 7.

More products