Starting from:

$25

COMP550-Project 5 Special Topics in Motion Planning Solved

In this assignment, you are free to select a project from one of several below or propose your own interesting motion planning problem. Those who propose a project must receive prior approval from the instructor. This project may be completed in pairs.

Deliverables
First, a statement of what project you are working is due Friday November 8 at 1pm on Canvas.

An initial progress report is due Tuesday November 12th at 1pm on Canvas. This report should be short, no longer than one page in PDF format. At a minimum, the report should state who your partner is, which project you are working on, and what progress you have made thus far. Those who propose their own project or deviate significantly from a proposed project must receive approval from the instructor prior to submitting their progress report. Those completing the project in groups need to provide just one progress report.

Final submissions are due Wednesday November 27th at 1pm. Submissions are on Canvas and consist of two things. First, to submit your project, clean your build space with make clean, zip up the project directory into a file named Project2 <your NetID> <partner’s NetID>.zip. Your code must compile within a modern Linux environment. If your code compiled on the virtual machine, then it will be fine. If you deem it necessary, also include a README with details on compiling and executing your code.

Second, submit a written report that summarizes your experiences and findings from this project. The report should be no longer than 10 pages, in PDF format, and contain (at least) the following information:

•    A problem statement and a short motivation for why solving this problem is useful.

•    The details of your approach and an explanation of how/why this approach solves your problem.

•    A description of the experiments you conducted. Be precise about any assumptions you make on the robot or its environment.

•    A quantitative and qualitative analysis of your approach using the experiments you conduct.

•    Rate the difficulty of each exercise on a scale of 1–10 (1 being trivial, 10 being impossible). Give an estimate of how many hours you spent on each exercise, and detail what was the hardest part of the assignment.

Please submit the PDF write-up separate from the source code archive. Those completing the project in groups need to provide just one submission for code and report. In addition to the code and report, there will also be a short (∼5 minute) in-class presentation on your chosen topic. Details of the presentation will be announced closer to the presentation dates.

If your code or your report is missing by the deadline above, your project is late. The latest date for submission is Friday at 5pm before the last week of classes. Your presentation is during the last week of classes.

Project Topics
The topics proposed below are a broad spectrum of interesting and useful problems within robot motion planning. Each topic comes with its own set of questions and deliverables that should be addressed in the code and written report. For topics marked 450-only, you may only choose the topic if both you and your partner are enrolled in the undergraduate versions of the class. The graduate section of the topics is required if either you or your partner is enrolled in the graduate versions of the course, and optional if you are both enrolled in the undergraduate versions. No extra credit will be rewarded for undergraduate students completing the graduate requirement. Although it may not be stated explicitly, visualization is essential for virtually every project, report, and presentation to establish correctness of the implementation.


5.1 Path Clustering (450-Only)
Sampling-based motion planning algorithms can produce many different paths for a given query. However, many of these paths can be virtually identical to one another. Whether this is the case is often difficult to determine, particularly if the configuration space has more than 3 dimensions since it is difficult to visualize such a space. In this project you will implement a distance metric for paths using a given distance metric for configurations. The path distance metric is then used to cluster a collection of paths that begin and end at the same configurations.

Path Distance:

The distance from a path A to a path B can be defined by integrating the distance to the closest point on B for a point moving along A. This formulation, however, is problematic since two very different paths can still have distance 0 (see Figure 1(a) for an example).

 

        (a) Path distance should be sig-          (b) Increasingly complex environments which result in increasingly many path clusters.

nificantly greater than 0 between the solid and dotted curve. Figure 1: Path distance and path clustering.

A more sophisticated approach first constructs a monotonic mapping between curve-length parametrizations of A and B and then integrates the distance between corresponding points on A and B. Under the assumption that paths A and B are densely and uniformly sampled so that there are m states along A and n states along B, we can compute the distance using dynamic programming. Let Di,j be the distance between state i ∈ A and state j ∈ B using the standard distance metric for configurations. By taking advantage of the optimal substructure of this computation, we can compute the distance between the two paths by finding a minimum-distance mapping between the states in A and B. Let Ci,j denote the minimum distance mapping up to state i ∈ A and state j ∈ B. Then Cm,n is the total distance between two paths, where C1,1 = D1,1 and

Ci,j = min[Ci−1,j,Ci,j−1]+Di,j.

Path Clustering:

Once you have implemented a distance metric for paths, you can begin clustering paths that are close given your metric. You may choose any clustering algorithm: single-linkage (nearest neighbor) clustering, k-means, or more sophisticated techniques. Single-linkage and k-means are simple, heuristic approaches that attempt to group paths together into a set of clusters. Given the clustering output, one can visually assess the cluster quality using a similarity matrix. The similarity matrix is computed by sorting the data based on cluster id and then visualizing the elements of D (after normalization). An example of a good clustering and similarity matrix is shown in Figure 2.

If two paths are considered close according to the distance metric, but cannot be continuously deformed from one to another without passing through an obstacle, then these paths are topologically different. Ideally,

 

Figure 2: (Left) An example data set and grouping into three clusters. (Right) The similarity matrix for the clustering. Red squares along the diagonal indicate a good clustering.

we would like paths in different homotopy classes to end up in different clusters. This is a hard problem in general, but the following heuristic can potentially improve the clustering: if the straight-line path between ai ∈ A to bj ∈ B is not valid, then Di,j should change to a large value to discourage clustering paths in different homotopic classes. If all straight line paths between every ai and corresponding bj are valid, the original distance is returned.

Deliverables:

Implement the path distance function, with and without the homotopy check, and a clustering algorithm as described above. The implementation should work for any arbitrary, user-defined distance metric. Verify that the distance function reports a significant different between the two paths in Figure 1a. Evaluate your implementation in a variety of environments, like those in Figure 1b, to verify whether your clusters make sense. You may assume a point robot translating in R2.

How well does the path distance function and clustering algorithm classify your data? Does the homotopy heuristic improve the clustering? How did you select the number of clusters? Be sure to cluster both small and large numbers of paths, and to visualize the similarity matrix of your clustering when presenting your conclusions.

Protips:

Note that even in the simplest case with just one obstacle, there exist more than two clusters of paths (e.g., there is a cluster of paths that all go around the obstacle clockwise 5 times before reaching the goal). Generally speaking, there are infinitely many homotopy classes, thus infinitely many “ideal” clusterings.

To find many different paths, you must run a planner many times and extract a solution path from each run. It may be advantageous to generate paths from different planners for diversity. Moreover, you may also want to disable path simplification for some or all of the paths that you generate.

Bonus (5 pt):

Clustering algorithms can produce wildly different results on the same input. As a bonus experiment, compare and contrast two clustering algorithms on paths using the path distance function described above. You can choose any pair of clustering algorithms: single linkage (nearest neighbor) clustering, k-means, etc. Present a qualitative review of the clustering algorithms you choose. What are the strengths and weaknesses of the algorithms? Compare the clusterings computed by the algorithms.

 5.2 Planning Under Uncertainty (450-Only)
For physical robots, actuation is often imperfect: motors cannot instantaneously achieve a desired torque, the execution time to apply a set of inputs cannot be hit exactly, or wheels may unpredictably slip on loose terrain just to name a few examples. We can model the actuation error for many situations using a conditional probability distribution, where the probability of reaching a state q0 depends on the initial state q

and the input u, written succinctly as P(q0|q,u). Figure 3: Even under an optimal policy, a system with actuation uncertainty can fail to reach the goal (green circle) consistently.

Because we cannot anticipate the exact state of

From Alterovitz et al.; see below.

the robot a priori, algorithms that reason over uncertainty compute a policy instead of a path or trajectory since the robot is unlikely to follow a single path exactly. A policy is a mapping of every state of the system to an action. Formally, a policy π is a function π : Q → U, where Q is the configuration space and U is the control space. When we know the conditional probability distribution, an optimal policy to navigate the robot can be computed efficiently by solving a Markov decision process.

Markov decision process (MDP):

When the transitional probability distribution depends only on the current state, the robot is said to have the

Markov property. For Markov systems, an optimal policy is obtained by solving a Markov decision process (MDP). An MDP is a tuple (Q,U,P,R), where:

•    Q is a set of states

•    U is a set of controls applied at each state

•    P : Q×U ×Q → [0,1] is the probability of reaching q0 ∈ Q via initial state q ∈ Q and action u ∈ U.

•    R : Q → R is the reward for reaching a state q ∈ Q.

An optimal policy over an MDP maximizes the total expected reward the system receives when executing a policy. To compute an optimal policy, we utilize a classical method from dynamic programming to estimate the expected total reward for each state: Bellman’s equation. Let V0(q) be an arbitrary initial estimate for the maximum expected value for q, say 0. We iteratively update our estimate for each state until we converge to a fixed point in our estimate for all states. This fixed-point algorithm is known as value iteration and can be succinctly written as

                                                                        Vt+1(q) = R(q)+max∑P(q0|q,u)Vt(q0),                            (1)

u∈U q0

We stop value iteration when Vt+1(q) = Vt(q) for all q ∈ Q. Upon convergence, the optimal policy for each state q is the action u that maximizes the sum in the equation above.

Sampling-based MDP:

Efficient methods to solve an MDP assume that the state and control spaces are discrete. Unfortunately, a robot’s configuration and control spaces are usually continuous, preventing us from computing the optimal policy exactly. Nevertheless, we can leverage sampling-based methods to build an MDP that approximates the configuration and control space. The stochastic motion roadmap (SMR) is one method to approximate a continuous MDP. Building an SMR is similar to building a PRM; there is a learning phase and a query phase. During the learning phase, states are sampled uniformly at random and the probabilistic transition between states are discovered. During the query phase, an optimal policy over the roadmap is constructed to bring the system to a goal state with maximum probability.

Learning phase: During the learning phase, an MDP approximation of the evolution of a robot with uncertain actuation is constructed. First, n valid states are sampled uniformly at random. Second, the transition probabilities between states are discovered. SMR uses a small set of predefined controls for the robot. The transition probabilities can be estimated by simulating the system m times for each state-action pair. Note that m has to be large enough to be an accurate approximation of the transition probabilities. For simplicity,

we assume that the uncertainty is represented with a Gaussian distribution. Once a resulting state is generated from the state-action pair, you can match it with the n sampled states through finding its nearest neighbor, or if a sampled state can be found within a radius. It is also useful to add an obstacle state to the SMR to indicate transitions with a non-zero probability of colliding with an obstacle.

Query phase: The SMR that is constructed during the learning phase is used to extract an optimal policy for the robot. For a given goal state (or region), an optimal policy is computed over the SMR that maximizes the probability of success. To maximize the probability of success, simply use the value iteration algorithm from equation 1 with the following reward function:

(

0 R(q) =
if q is not a goal state,
1
if q is a goal state.
Note: it is assumed that the robot stops when it reaches a goal or obstacle state. In other words, there are no controls for the goal and obstacle states and Vt(q) = R(q) is constant for these states.

Deliverables:

Implement the SMR algorithm for motion planning under action uncertainty with a 2D steerable needle. Modeling a steerable needle is similar to a Dubins car; see section III of the SMR paper (cited below) for details on the exact configuration space and control set. Construct some interesting 2D environments and visualize the execution of the robot under your optimal policy. Simulated execution of the steerable needle is identical to the simulation performed when constructing the transition probabilities. Simulate your policy many times. Does the system always follow the same path? Does it always reach the goal? How sensitive is probability of success to the standard deviation of the Gaussian noise distribution?

Reference:

R. Alterovitz, T. Simeon, and K. Goldberg, The Stochastic Motion Roadmap: A Sampling Framework for´ Planning with Markov Motion Uncertainty. In Robotics: Science and Systems, pp. 233–241, 2007. http://goldberg.berkeley.edu/pubs/rss-Alterovitz2007 RSS.pdf.

Bonus (5 pt):

The value V(q) computed by value iteration for each state q is the estimated probability of reaching the goal state under the optimal policy of the SMR starting at q. As a bonus experiment, we can check how accurate this probability is. Keeping the environment, start, and goal fixed, generate 20 (or more) SMR structures with n sampled states and compute the probability of reaching the goal from the start state using value iteration. Then simulate the system many times (1000 or more) under the same optimal policies. How does the difference between the expected and actual probability of success change as n increases? Evaluate starting with a small n (around 1000), increasing until some sufficiently large n; a logarithmic scale for n may be necessary to show a statistical trend.

5.3 Elastic Kinematic Chains
Let an elastic kinematic chain be defined as a planar kinematic chain with n revolute joints, where each joint has an angular spring. The spring energy of the chain is ∑i u2i , where ui corresponds to joint angle i. Suppose the base link is fixed at (0,0) with orientation 0. Now imagine that the end effector is slowly moved around to different positions and orientations so that the arm is always at a stable equilibrium configuration. How many parameters would you need to describe all stable configurations for n joints? The surprising answer is 3 for any n! There is a relatively straightforward algorithm to determine which configurations are stable (although understanding its correctness is far from trivial).

 

Figure 4: A sequence of stable configurations for a chain with 4 joints. From McCarthy & Bretl; see below.

There is one small typo in the algorithm in the reference below. The line that reads

Pn−4 = (A†B)T(I −NK)TM(I −NK)(A†B)

should be changed to

Pn−4 = Qn−4+(A†B)T(I −NK)TM(I −NK)(A†B).

As you can tell from this one line, you will want to use a linear algebra package such as Eigen (a C++ header-only library) or Numpy (a Python library with many Matlab-like functions) for your implementation.

Reference:

Z. McCarthy and T. Bretl, Mechanics and Manipulation of Planar Elastic Kinematic Chains, in IEEE Intl. Conf. on Robotics and Automation, 2012. http://ieeexplore.ieee.org/xpl/freeabs all.jsp?arnumber=6224693

Deliverables: Implement the isStable function from the reference above and use it as a StateValidityChecker to plan paths between stable configurations of an elastic kinematic chain. Add an option to check whether the joint angles (denoted by ui in the paper) are within −π and π. Then use any OMPL planner to find a path of stable configurations between any pair of stable configurations. Next, write an algorithm that using this function finds a pair of configurations that requires a long path (in terms of number of states on the path). It is assumed that you will use path simplification on each path produced by the planner so that a long path is not simply the result of an unlucky run of the planner. How do the [−π,π] joint limits change the results?

Picking reasonable bounds for the 3-parameter space of stable configurations is not entirely obvious. For a 10-joint arm you could use the following bounds on the configuration space: [−15,15]×[−15,15]×[−π,π]. They may have to be adjusted for a different number of joints. Implementing the bonus below can help you visualize when the free space becomes almost fractal-like (in which case, you would need to specify a very small value for setStateValidityCheckingResolution). Consider the symmetries that should exist in your c-space obstacles. If your visualization does not match the expected symmetries, there may be an error in your computations.

Graduate: Write code that visualizes the free space of the three-parameter configuration space. This can be done by creating plots of 2D slices of the configuration space, as was done in the reference above. You don’t actually have to compute the boundaries of the configuration space obstacles. It is sufficient to create a dense 3D grid of the configuration space, where for each grid point you compute the state validity. You can then plot each slice of the grid by drawing invalid configurations as black pixels and free configurations as white pixels. If you are using the Python numpy module or Matlab, you can use the spy function.

5.4 Centralized Multi-robot Planning
Planning motions for multiple robots that operate in the same environment is a challenging problem. One method for solving this problem is to compute a plan for all of the robots simultaneously by treating all of the robots as a single composite system with many degrees of freedom. This is known as centralized multi-robot planning. A naive approach to solving this problem constructs a PRM for each robot individually, and then plans a path using a typical graph search in the composite PRM (the product of each PRM). Unfortunately, composite PRM becomes prohibitively expensive to store, let alone search. If there are k robots, each with a PRM of n nodes, the composite PRM has nk vertices!

 

Searching the exponentially large composite roadmap for a valid multi-robot path is a significant computational challenge. Recent work to solve this problem suggests implicitly searching the composite roadmap using a discrete version of the RRT algorithm (dRRT). At its core, dRRT grows a tree over the (implicit) composite roadmap, rooted at the start state, with the objective of connecting the start state to the goal state.

 The key steps of dRRT are as follows:

1.   Sample a (composite) configuration qrand uniformly at random.

2.   Find the state qnear in the dRRT nearest to the random sample.

3.   Using an expansion oracle, find the state qnew in the composite roadmap that is connected to qnear in the closest direction of qrand. (Figure 6).

4.   Add the path from qnear to qnew to the tree if it is collision free.        Figure 6: The dRRT expansion step.

5.   Repeat 1-4 until until the goal is successfully added to the tree.      From Solovey et al.; see below.

Deliverables:

Implement the dRRT algorithm for multi-robot motion planning. For n robots, you will first need to generate n

PRMs, one for each robot. If your robots are all the same, one PRM will suffice for every robot. Then use your dRRT algorithm to implicitly search the product of the n PRMs. The challenges here are the implementation of the expansion oracle (step 3) and the local planner to check for robot-robot collisions (step 4). The reference below gives details on one possible expansion oracle (section 3.1) and local planner (section 4.2). Evaluate your planner in scenarios with at least four robots, like those in Figure 5. For simplicity you may assume that the robots are planar rigid bodies. You may need to construct additional scenarios to evaluate different properties of the dRRT algorithm.

Address the following in your report: Is your dRRT implementation always able to find a solution? How does the algorithm scale as the number of robots increases? Elaborate on the relationship between the quality of the single robot PRMs and the time required to run dRRT. What do the final solution paths look like?

Reference:

K. Solovey, O. Salzman, and D. Halperin, Finding a needle in an exponential haystack: Discrete RRT for exploration of implicit roadmaps in multi-robot motion planning. In Algorithmic Foundations of Robotics, 2014. http://robot.cmpe.boun.edu.tr/wafr2014/papers/paper 20.pdf.
Figure 5: Two challenging multi-robot problems. Left: robots 1 and 2 need to swap positions with robot 4 and 3, respectively. Right: robot i,i = 1,...,4, needs to swap positions with robot i+4.


5.5 Decentralized Multi-robot Coordination
Planning motions for multiple robots simultaneously is a difficult computational challenge. Algorithms that provide hard guarantees for this problem usually require a central representation of the problem, but the size and the dimension of the search space limits these approaches to no more than a handful of robots. Decentralization or distributed coordination of multiple robots, on the other hand, is an effective (heuristic) approach that performs well in practical instances by limiting the scope of information reasoned over at any given time. There exist many effective methods in the literature to practically coordinate the motions of hundreds or even thousands of moving agents.

 

Figure 7: Two challenging multi-robot problems. Left: robots 1 and 2 need to swap positions with robot 4 and 3, respectively. Right: robot i,i = 1,...,4, needs to swap positions with robot i+4.

One popular decentralized approach is to assign a priority to each robot and compute motion plans starting with the highest priority. The robot with the highest priority ignores the position of all other robots. Motion plans for subsequent (lower priority) robots must respect the paths for the previously planned (higher priority) robots. In short, the higher priority robots become moving obstacles that lower priority robots must avoid. An example of such an approach is detailed in Berg et al. 2005, cited below.

 A more recent approach reasons over the relative velocities between robots, employing the concept of a reciprocal velocity obstacle to simultaneously select velocities for each robot that both avoid collision with other robots and allow each robot to progress toward its goal. This technique is used not only in robotics but also in animation and even video games (e.g., Warhammer 40k: Space Marine, Crysis 2). Briefly, a reciprocal velocity obstacle RVOBA defines the space of velocities of robot A that will cause A to collide with another robot B at some point

in the future, given the current position and velocity of A and B. Using Figure 8: The reciprocal velocity obstathis concept, a real-time coordination algorithm can be developed that cle formed by robot B’s current position

and velocity relative to robot A. From simulates the robots for some short time period, then recalculates the

Berg et al. 2008; see below.

velocities for each robot by finding a velocity that lies outside the union of each robot’s RVO.

Deliverables:

Implement one of the methods above for decentralized multi-robot coordination and evaluate the method in multiple scenarios with varying numbers of robots. Graduate students must implement reciprocal velocity obstacles. You may assume that the robots are disks that translate in the plane.

Address the following: Is your method always successful in finding valid paths? Expound upon the kinds of environments and/or robot configurations that may be easy or difficult for your approach? How well does your method scale as you increase the number of robots? Qualitatively, how do the solution paths look? Depending on the method you chose, answer the following questions.

Prioritized method: Compare and contrast the paths for higher priority robots and lower priority robots. How sensitive is your method to the assignment of priorities?

RVO method: Discuss the challenges faced by the RVO approach as the number of robots increases. Is there always a velocity that lies outside of the RVO? How practical do you think this algorithm is?

Protips:

For the prioritized method, the key challenge is the implementation of the prioritized state validity checker. Not only must this function check that the robot being planned does not collide with obstacles, but also for collisions with the higher priority robots that have been previously planned. The notion of time must be incorporated into the configuration space (and possibly the planner as well) to ensure collision-free coordination. A modified version of RRT, EST, or similar planner to generate the single-robot paths is sufficient.

The implementation of the RVO method requires reasoning over the velocities of your robots. You can greatly reduce the complexity of the implementation by planning geometrically and bounding the velocity by limiting the distance a robot can travel during any one simulation step. Moreover, the RVO implementation requires reasoning over the robot geometry; choose simple geometries (circles) and ensure your implementation is correct before considering more complex geometries.

Visualization, particularly animation, is key in debugging these methods. Matlab, Matplotlib, and related packages support generating videos from individual frames.

More products