Starting from:

$30

PGM - Assignment 2 - Solved

Probabilistic Graphical Models Exercise Sheet 2




Pictorial Structures
The goal of this exercise is to implement a simplified version of the pictorial structures model, although using an efficient algorithm, and test it on a pedestrian dataset. You are going to use a star model over (upper and lower) legs, head, and torso as follows:

 

Each Li in this distribution is actually a two-dimensional random variable representing image coordinates of the center of each body part: Li = (xi,yi). Scale and rotation is not considered here. Unary factors represent likelihoods pi(Li) and pairwise factors stand for kinematic priors pij(Li,Lj). You will learn the priors from a training dataset. Then, you will perform inference on test images, for which you are given the corresponding likelihood maps.

The first step is to download assignment02-data.zip from the lecture’s website and import the data.mat. The variable likelihoods{k,i} is the 2D likelihood map pi(Li) for test image number k. The images can be found in the directory testset for reference. The index i is defined as follows:

                                                         i            Body part

 

1     Lower Left leg 2  Upper Left leg

3                        Upper Right leg

4                        Lower Right leg

5                        Head

6                        Torso

The variable train contains the training data. More precisely, each row k of train{i} represents the coordinates Lki of body part i in training image k.

1           Learning Kinematic Priors
Points: 4
We set the prior as a 2D Gaussian pij(Li,Lj) ∼ N(Li − Tij(Lj);0,Σij) and we define the transformation Tij(Lj) = Lj + µij. This is a reasonable approximation for small joint rotations, as it is the case for pedestrians. The parameters to be determined for each prior are thus the covariance matrix Σij and the mean µij.

Implement the function pairwisePots = learnPairwisePots(train) which, for each body part i, computes the maximum-likelihood estimate of µij as a row vector pairwisePots{i,1} and of Σij as pairwisePots{i,2}. Note that there are only potentials between the torso and the remaining body parts, hence we assume j = 6 is fixed. Use the built-in Matlab functions mean and cov.

2           Maximal Marginal States
Points: 6
The goal is to compute maximal marginals of the model using the sumproduct algorithm. To this end, implement the corresponding function maxstates

= sumproduct(pairwisePots, unaryPots). It returns a 6x2 matrix of x,y coordinates.

•   Note that the graph is not a chain but a tree, so you have to think about the correct message scheduling.

•   You will need computations like f(Li) = PLj N(Li − Tij(Lj))g(Lj) in your code. However, summing over Lj is impractical due to the large state space. That is why you should implement the sum as a convolution (N ∗ (g ◦ Tij−1))(Li) := Ps N(Li − s)g(Tij−1(s)) with s = Tij(Lj) using built-in functions. We additionally assume a diagonal covariance so that the Gaussian is separable (simply ignore the off-diagonal entries of Σij). The Matlab functions fspecial, conv2 and the prepared file shiftimg.m (it shifts a given image for a given 2D offset) will be useful here. Take care when computing messages in the opposite direction.

•   You can visualize your maxima using drawmaxima.m. You can compare your results for the first 10 images with the official solution in the directory solution.

•   Hint: Recall that we work with (x,y) vectors but Matlab indexes its matrices by (row, column).

3           Modes
Points: 6
The goal is to compute the maximum state of the joint distribution using the min-sum algorithm (i.e. using negated log potentials). Implement the function maxstates = minsum(pairwisePots, unaryPots). It returns a 6x2 matrix of x,y coordinates.

•   For reasons of efficiency, we compute the minima using the generalized distance transform

DT−log[g(Tij−1(·))](Li) = mins δ(Li,s) − log[g(Tij−1(s))]

= min−log[N(Li − Tij(Lj))] − log[g(Lj)].

Lj

You can find the code in DT.m, the function takes the covariance matrix of the Gaussian as the second argument.

•   Unfortunately, DT does not give you the argmin, only the min. For this reason, you cannot backtrack and you need to implement the min-sum algorithm as in the lecture on max-sum algorithm earlier, i.e. computing all messages (two messages per edge) and taking a node-wise minimum. This will probably fail on potential ties (multiple modes) but that is fine in this exercise.

4           Evaluation
Points: 4
Now you are going to evaluate the model as a person detector by writing a script evaluation.m. To keep it simple, fix a bounding box of size 80x200px around the torso (horizontally centered at it, vertically offset in 1:2 ratio). A predicted bounding box is considered correct if it overlaps more than 50% with a ground-truth bounding box, otherwise the bounding box is considered a false positive detection. Ground truth can be found in the variable GT in the supplied mat file, each row is a rectangle [x1,y1,w,h]. Bounding box overlap is computed using the boxoverlap.m function which is provided for your convenience.

Compute bounding boxes for each test image using three ways of choosing the torso:

•   Torso as the result of min-sum.

•   Torso as the result of sum-product.

•   Torso as the maximum of a torso’s likelihood, which corresponds to using no model at all.


More products