ANP - Homework #2

•   Make sure to name the submitted file ID1-ID2.zip (or .pdf), where ID1 and ID2 are the students' IDs.

•   For hands-on parts, your source code should be submitted as well (include it in the above zip file).

Theoretical questions:


1. Consider a planning session at time instant 0 and a given set of non-myopic candidate actions A = {a0:L−1}, where L is the planning horizon. Denote the belief over state Xk at time instant k by bk = P(Xk | a0:k−1, z1:k), and assume the motion and observation models, P(Xk | Xk−1, ak−1) and P(z | X), are given. Consider an objective function J(·,·) of expected cumulative rewards given some reward (cost) function r(b, a).
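For reference, the belief defined above evolves according to the standard Bayesian recursion (a prediction step followed by a measurement update). This is background for the questions below rather than part of the requested derivations:

$$
b_{k+1}(X_{k+1}) \;\propto\; P(z_{k+1} \mid X_{k+1}) \int P(X_{k+1} \mid X_k, a_k)\, b_k(X_k)\, dX_k .
$$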

(a)    Write the objective function explicitly for the above-considered setting.

(b)    Derive a recursive formulation that expresses J(b0,a0:L−1), for a given non-myopic action a0:L−1 ∈ A, via J(b1,a1:L−1). Present the derivation in detail.

(c)    Consider now the corresponding value function V(bk, π) for some stationary policy π within the same setting.

i.        Write an explicit expression for the value function V(b0, π).

ii.      Similar to clause 1b, derive a recursive formulation that expresses V(b0, π) in terms of V(b1, π).

iii.    Discuss the difference between J(b0, a0:L−1) and V(b0, π).

Hands-on tasks: You are highly encouraged to use the Julia language for all hands-on tasks; however, this is not compulsory. If you use Julia, the supplied skeleton may be helpful.

1.    Consider a mobile robot navigating in a 2D environment. For simplicity, we consider only the position state of the robot, i.e. Xi ∈ R2 denotes the x-y robot position at time instant i. The motion and observation models, P(Xk+1 | Xk,ak) and P(zk | Xk), are given by

                    Xk+1 = f(Xk, ak) + w ,    zk = h(Xk) + v                                (1)

where w and v are process and measurement noise terms, with w ∼ N(0,Σw) and v ∼ N(0,Σv).

In this exercise, we shall consider simple linear models, i.e. f(Xk, ak) = F · Xk + ak with F = I2×2, where ak ∈ R2 is the commanded position displacement. We also consider position measurements such that zk ∈ R2 and h(Xk) ≡ Xk.


Assume a prior over the robot position at time instant 0 is available: b(X0) = P(X0) = N(µ0,Σ0).

(a)    Implement a function PropagateUpdateBelief, which updates a Gaussian belief according to a (given) performed action and captured observation, considering a recursive setting. A minimal sketch is given after the input/output bullets below.

•   Input: a Gaussian belief from some time instant k over Xk, i.e. b(Xk) = N(µk,Σk), action ak and observation zk+1.

•   Output: an updated posterior Gaussian belief at the next time step, i.e. b(Xk+1) = N(µk+1,Σk+1).
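The sketch below uses a standard Kalman predict/update step for the linear-Gaussian models of Eq. (1). The exact signature (the belief as an MvNormal from Distributions.jl, with the noise covariances Σw, Σv passed explicitly) is an assumption; the supplied skeleton may organize these quantities differently.

```julia
using LinearAlgebra, Distributions

# Sketch of PropagateUpdateBelief for the linear models of Eq. (1): F = I, h(X) = X.
# The belief is represented as a Gaussian MvNormal(μ, Σ); Σw, Σv are passed explicitly.
function PropagateUpdateBelief(b::MvNormal, a::Vector{Float64}, z::Vector{Float64},
                               Σw::Matrix{Float64}, Σv::Matrix{Float64})
    F = Matrix{Float64}(I, 2, 2)   # motion Jacobian (identity for this model)
    H = Matrix{Float64}(I, 2, 2)   # observation Jacobian (identity for this model)

    # Prediction step: propagate the belief through the motion model
    μ_pred = F * mean(b) + a
    Σ_pred = F * cov(b) * F' + Σw

    # Update step: Kalman gain and measurement correction with z_{k+1}
    K = Σ_pred * H' / (H * Σ_pred * H' + Σv)
    μ_post = μ_pred + K * (z - H * μ_pred)
    Σ_post = (I - K * H) * Σ_pred

    return MvNormal(μ_post, Symmetric(Σ_post))
end
```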

(b)    Implement a function SampleMotionModel that, given a robot state X and action a, generates the next state, i.e. X′ ∼ P(X′ | X,a), according to the motion model in Eq. (1). A short sketch follows the bullets below.

•   Input: State X, action a

•   Output: Next state X′
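The sketch below passes Σw explicitly, which is an assumption (the skeleton may keep the noise parameters in a model object instead):

```julia
using Distributions

# Sketch of SampleMotionModel for the linear motion model of Eq. (1):
# X' = X + a + w, with w ~ N(0, Σw).
function SampleMotionModel(X::Vector{Float64}, a::Vector{Float64}, Σw::Matrix{Float64})
    w = rand(MvNormal(zeros(length(X)), Σw))   # sample process noise
    return X + a + w
end
```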

(c)    Implement a function GenerateObservation, which generates an observation according to the observation model in Eq. (1), i.e. z ∼ P(z | X). A short sketch follows the bullets below.

•   Input: State X

•   Output: Observation z
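A corresponding sketch, under the same assumption that Σv is passed explicitly:

```julia
using Distributions

# Sketch of GenerateObservation for the observation model of Eq. (1):
# z = X + v, with v ~ N(0, Σv).
function GenerateObservation(X::Vector{Float64}, Σv::Matrix{Float64})
    v = rand(MvNormal(zeros(length(X)), Σv))   # sample measurement noise
    return X + v
end
```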

(d)   Consider b(X0) = N(µ0,Σ0) with µ0 = (0,0)T and Σ0 = I2×2, Σw = 0.1² · I2×2, Σv = 0.01² · I2×2, and ak = (1,0)T. Let the actual initial robot location (ground truth) be X0 = (0.5,0.2)T. A minimal simulation sketch covering clauses i-iv below is given after clause iv.


i.        Generate a possible robot trajectory τ = {X0,X1,...,XT}, where the robot starts at X0 and follows a given action sequence a0:T−1 = {a0,a1,...,aT−1}. Assume T = 10 and ai = (1,0)T for i ∈ [0,T−1].

ii.      Generate observations along the trajectory τ by generating a single zi for each Xi ∈ τ.

iii.    Calculate belief propagation along the trajectory τ while considering only the motion model (without observations). In other words, for each time step i ∈ [0,T], calculate P(Xi | a0:i−1, b(X0)) = N(µi^mm, Σi^mm), considering the prior belief b(X0) and the probabilistic motion model (mm). Draw on a plot the actual trajectory and these propagated beliefs[1] (in terms of mean and covariance) versus time.

iv.    Calculate the posterior beliefs along the trajectory τ, this time also using the observations generated in clause 1(d)ii. In other words, for each time step i ∈ [0,T], calculate P(Xi | a0:i−1, z1:i, b(X0)) = N(µi, Σi), considering the prior belief b(X0) and the probabilistic motion and observation models. Draw on a separate plot the actual trajectory and these posterior beliefs (in terms of mean and covariance) versus time.
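A minimal simulation sketch for clauses i-iv is given below. It assumes the helper functions sketched above (SampleMotionModel, GenerateObservation, PropagateUpdateBelief) are defined, and it plots only the means; covariance ellipses can be added with drawCovarianceEllipse or covellipse!, as noted in the footnote.

```julia
using Distributions, LinearAlgebra, Plots

# Parameters of clause 1(d)
Σw = 0.1^2  * Matrix{Float64}(I, 2, 2)
Σv = 0.01^2 * Matrix{Float64}(I, 2, 2)
b0 = MvNormal(zeros(2), Matrix{Float64}(I, 2, 2))    # prior b(X0) = N(μ0, Σ0)
X0 = [0.5, 0.2]                                      # ground-truth initial position
T  = 10
a  = [1.0, 0.0]                                      # a_i = (1, 0)ᵀ for all i

# i. Ground-truth trajectory and ii. one observation per pose
τ = [X0]
for i in 1:T
    push!(τ, SampleMotionModel(τ[end], a, Σw))
end
Z = [GenerateObservation(X, Σv) for X in τ[2:end]]   # z_1, ..., z_T

# iii. Motion-model-only propagation (no observations): prediction step only
beliefs_mm = MvNormal[b0]
for i in 1:T
    push!(beliefs_mm, MvNormal(mean(beliefs_mm[end]) + a, cov(beliefs_mm[end]) + Σw))
end

# iv. Posterior beliefs: predict and update with the generated observations
beliefs = MvNormal[b0]
for i in 1:T
    push!(beliefs, PropagateUpdateBelief(beliefs[end], a, Z[i], Σw, Σv))
end

# Plot ground truth vs. posterior means (covariance ellipses omitted here)
plt = plot([X[1] for X in τ], [X[2] for X in τ]; label="ground truth", marker=:circle)
plot!(plt, [mean(b)[1] for b in beliefs], [mean(b)[2] for b in beliefs];
      label="posterior mean", marker=:x)
```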

2.    We now modify the problem setting as follows. Several beacons are scattered in the environment. The locations of these beacons are known to the robot (denote the location of the jth beacon by Xjb), and they are represented by a matrix Xb ∈ R2×n, i.e. Xb = [X1b · · · Xnb], where n is the number of beacons.

Assume the robot obtains a relative position observation zrel ∈ R2 from a beacon if it is sufficiently close to it (distance less than d). For simplicity, assume the beacons are positioned sufficiently far apart that an observation from at most one beacon can be obtained at any possible robot location. Moreover, assume the accuracy of these observations deteriorates with the distance r to the beacon, starting from a certain given distance rmin ≤ d, such that for r ≤ d the measurement covariance is Σv = (0.01 · max(r, rmin))² · I2×2.

(a)    Write the corresponding observation model for the measurement zrel, as a function of robot location X, beacon location Xb, and additional parameters mentioned above.

(b)    Implement a function GenerateObservationFromBeacons, which generates an observation zrel according to the observation model from clause 2a. Assume Xb, r, rmin and d are given. A sketch follows the bullets below.

•   Input: Robot location X

•   Output: Measurement zrel and index of beacon within range; null otherwise.
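The sketch below bakes in one possible answer to clause 2a, namely a relative-position measurement zrel = Xjb − X + v with the range-dependent noise defined above; adapt it to your own model from clause 2a. The signature (passing Xb, d and rmin explicitly) is an assumption.

```julia
using Distributions, LinearAlgebra

# Sketch of GenerateObservationFromBeacons. Assumes the relative measurement model
# z_rel = Xb_j - X + v, with v ~ N(0, Σv) and Σv = (0.01 * max(r, rmin))^2 * I for
# r = ||Xb_j - X|| ≤ d (no observation otherwise).
function GenerateObservationFromBeacons(X::Vector{Float64}, Xb::Matrix{Float64},
                                        d::Float64, rmin::Float64)
    for j in 1:size(Xb, 2)
        Δ = Xb[:, j] - X
        r = norm(Δ)
        if r <= d
            Σv = (0.01 * max(r, rmin))^2 * Matrix{Float64}(I, 2, 2)
            zrel = Δ + rand(MvNormal(zeros(2), Σv))
            return zrel, j               # measurement and index of the beacon in range
        end
    end
    return nothing                       # no beacon within distance d
end
```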

(c)    Consider n = 9 beacons equally scattered in a 9×9 grid, and let the coordinate of its bottom-left corner be (0,0); to clarify, one of the beacons is located at (0,0). Assume b(X0) = N(µ0,Σ0) with µ0 = (0,0)T and Σ0 = I2×2, and Σw = 0.1² · I2×2. Let the actual initial robot location (ground truth) be X0 = (−0.5,−0.2)T. One possible construction of the beacon matrix Xb is sketched after this setup.


Consider an action sequence a0:T−1 = {a0,...,aT−1}, with ai = (0.1,0.1)T, i ∈ [0,T − 1], for T = 100. Further, let d = 1 and rmin = 0.1.
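The snippet below is one possible construction of Xb; reading "n = 9 beacons equally scattered in a 9×9 grid" as a 3×3 arrangement over [0,9]×[0,9] (spacing 4.5) is an interpretation, and the layout should be adjusted if you read it differently.

```julia
# One possible beacon layout for clause 2c (interpretation: 3×3 arrangement over
# [0,9]×[0,9] with the bottom-left beacon at (0,0), i.e. spacing 4.5).
Xb = hcat([[x, y] for x in 0.0:4.5:9.0 for y in 0.0:4.5:9.0]...)   # 2×9 matrix
```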

i.        Repeat clauses 1(d)i-1(d)iv. Also indicate the n beacons in all plots.

ii.      Repeat clauses 1(d)i-1(d)iv, this time considering a fixed measurement covariance Σv = 0.01² · I2×2 (a measurement is still obtained only if r ≤ d). Also indicate the n beacons in all plots.

iii.    Present in a new figure the localization estimation errors[2] X̃ versus time for the two cases above.

iv.    Present in a new figure the square root of the trace of the estimation covariance for the two cases above. Qualitatively compare and analyze the obtained results.

(d) Leveraging the above capabilities, let us now consider a simple belief space planning problem. Given a set of candidate action sequences, the robot has to choose the one that is expected to yield minimum uncertainty at the final state (XT). In this question we consider det(Σ) as the information-theoretic cost function c(b, a).

Further, consider N = 10 simple action sequences, where the jth action sequence aj0:T−1 = {aj0,...,ajT−1} is given by aji = (0.1, 0.1 · j/5)T for i ∈ [0,T−1].

i.      Generate a single trajectory realization τj for each of the N action sequences by following clause 1(d)i. Draw all N trajectories and indicate the beacon locations on a single plot.

ii.    Calculate the posterior beliefs along each trajectory τj, ∀j ∈ [1,N], and evaluate the information-theoretic cost. Plot the information-theoretic cost versus the trajectory index. Designate the best action sequence for the given cost, and explain why this result makes sense. A minimal sketch of this evaluation is given below.
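The sketch below reuses the functions sketched earlier. It assumes the same relative-measurement convention as in the GenerateObservationFromBeacons sketch (so the relative measurement is converted to a position measurement before the Kalman update), and it uses the measured distance to set the range-dependent noise; both choices are illustrative.

```julia
using Distributions, LinearAlgebra

# Sketch for clause 2(d)ii: roll out one trajectory per candidate action sequence,
# track the posterior belief, and score it with det(Σ_T). Assumes SampleMotionModel,
# GenerateObservationFromBeacons, PropagateUpdateBelief, Xb, Σw, d, rmin as above.
T, N = 100, 10
b0 = MvNormal(zeros(2), Matrix{Float64}(I, 2, 2))
X0 = [-0.5, -0.2]
costs = zeros(N)

for j in 1:N
    aj = [0.1, 0.1 * j / 5]                   # a_i^j, constant along the jth sequence
    X, b = X0, b0
    for i in 1:T
        X = SampleMotionModel(X, aj, Σw)      # propagate ground truth
        obs = GenerateObservationFromBeacons(X, Xb, d, rmin)
        if obs === nothing
            # No beacon in range: prediction only
            b = MvNormal(mean(b) + aj, cov(b) + Σw)
        else
            zrel, jb = obs
            rhat = norm(zrel)                 # measured distance to the beacon
            Σv_r = (0.01 * max(rhat, rmin))^2 * Matrix{Float64}(I, 2, 2)
            z = Xb[:, jb] - zrel              # implied position measurement of the robot
            b = PropagateUpdateBelief(b, aj, z, Σw, Σv_r)
        end
    end
    costs[j] = det(cov(b))                    # information-theoretic cost det(Σ_T)
end

jbest = argmin(costs)                         # index of the best action sequence
```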


 
[1] You can use the supplied drawCovarianceEllipse function to draw a 1-std covariance ellipse. Alternatively, the similar Julia function covellipse! from the StatsPlots.jl package is available.
[2] X̃ = X̂ − X, where X̂ is an estimate and X is the ground-truth value.
