Starting from:

$30

Machine perception-Feature points, matching, homography Solved

Introduction
This assignment will deal with automatic searching for correspondence points between two images. Correspondences are a key step when dealing with the task of aligning two or more images, for example building a panorama from multiple images. Methods that find correspondences between images usually start by finding feature points in them. Feature points denote regions in the image that have a high chance of re-detection in an image where the same scene is captured from a slightly different angle or viewing conditions.

Exercise 1: Feature points detectors
In this exercise you will implement two frequently used feature point detectors: the Hessian algorithm [1](p. 44) and the Harris algorithm.

(a) The Hessian detector is based on second derivatives matrix H(x,y) (also named Hessian matrix, hence the name of the algorithm) at point (x,y) in the image:

                                                                H  ,                                  (1)

where σ explicitly states that the second derivative is computed on a smoothed image. Hessian detector selects point (x,y) as a feature point if the determinant of the Hessian matrix exceeds a given threshold value t. If we want to create a variant of the Hessian detector that is scale independent (independent of the Gaussian filter that is used to smooth the image) we have to include a normalization factor of σ4 and thus we get a feature detection rule:

                                                   det(H(x,y)) = σ4(Ixx(x,y;σ)Iyy(x,y;σ) − Ixy(x,y;σ)2) > t.                        (2)

Implement a function hessian_points, that computes a Hessian determinant using the equation (2) for each pixel of the input image. As this computation can be very slow if done pixel by pixel, you have to implement it using vector operations

(without explicit for loops). Test the function using image from test_points.jpg as your input (do not forget to convert it to grayscale) and visualize the result.

Extend the function hessian_points by implementing a non-maximum suppression post-processing step that only retains responses that are higher than all the neighborhood responses and whose value are higher than a given threshold value thresh. The function should return the list of points that satisfy these properties.

Create a function that you will use plot the detected points (use plt.plot()) over the input image. Load the image test_points.jpg, process it and visualize the result. Set the input parameters of the Hessian detector to thresh=100 and σ = 1, and then change their values to get a feeling for the effect of the parameters.

 

Question: What kind of structures in the image are detected by the algorithm? How does the parameter σ affect the result?

(b) Implement the Harris feature point detector. This detector is based on the autocorrelation matrix C that measures the level of self-similarity for a pixel neighborhood for small deformations. At the lectures you have been told that the Harris detector chooses a point (x,y) for a feature point if both eigenvalues of the auto-correlation matrix for that point are large. This means that the neighborhood of (x,y) contains two well defined rectangular structures – i.e. a corner. Autocorrelation matrix can be computed using the first partial derivatives at (x,y) that are subsequently smoothed using a Gaussian filter:



where ∗ stands for convolution. Computing eigenvalues λ1 and λ2 of matrix C(x,y;σ,σ˜) is expensive, therefore we will use the following relations[1]

det(C) = λ1λ2                   (4) trace(C) = λ1 + λ2    (5) to compute the ratio r = λ1/λ2. If we assume that

                                                         2                                                        2                                               2                                    2

                                  trace (C)        (λ1 + λ2)          (rλ2 + λ2)         (r + 1)

                                                           =                     =                       =                ,

                                       detC                λ1λ2                                  rλ2λ2                                  r

we can express the feature point condition for (x,y) as:
(6)
det(C) − αtrace2C > t.
(7)
In practice we use σ˜ = 1.6σ, α = 0.06 for parameter values. Implement the condition (7) without using explicit computation of matrix C and without for loops, as you did for the core part of the Hessian detector.

Implement the feature point detector as function harris_points that computes values of equation (7) for all pixels, performs the non-maximum suppression postprocessing step as well as thresholding using threshold thresh. The function should return the list of points that satisfy these properties.

Load the image test_points.jpg and compute the Harris feature points. Compare the result with the feature points detected by the Hessian detector. Experiment with different parameter values. Do the feature points of both detectors appear on the same structures in the image?

 

Exercise 2: Matching local regions
On of the uses of feature points is searching for similar structures in different images. To do this we will need descriptors of the regions around these points. In this assignment you will implement some simple descriptors as well as their matching.

(a)      Use the function simple_descriptors to calculate descriptors for a list of feature points. Then, write a function find_correspondences that, given two lists of descriptors, calculates the similarities between all descriptors in both lists. Use the Hellinger distance. Finally, for each descriptor from the first list, find the most similar descriptor from the second list. A good idea might be to use the same list (perhaps shuffled) for both inputs to sanity check your code.

Note: Use your own functions gauss and gaussdx from previous assignments in order for the descriptor function to work correctly.

(b)     Write a script that loads images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_correspondences and visualizes the result. Use the function display_matches from the supplementary material for visualization. Experiment with different parameters for descriptor calculation and report on the changes that occur.

 

(c)      Implement a simple feature point matching algorithm. Write a function find_matches that is given two images as an input and returns a list of matched feature points from image 1 to image 2. The function should return a list of index pairs, where the first element is the index for feature points from the first image and the second element is the index for feature points from the second image.

Follow the algorithm below:

•    Execute a feature point detector to get stable points for both images (you can experiment with both presented detectors),

•    Compute simple descriptors for all detected feature points

•    Find best matches between descriptors in left and right images using the Hellinger distance, i.e. compute the best matches from the left to right image and then the other way around. In a post-processing step only select symmetric matches. A symmetric match is a match where a feature point   in the left image is matched to point   in the right image and at the same time point   in the right image is matched to the point  in left image. This way we get a set of point pairs where each point from the left image is matched to exactly one point in the right image as well as the other way around.

Use the function display_matches from the supplementary material to display all the symmetric matches. Write a script that loads images graf/graf1_small.jpg and graf/graf2_small.jpg, runs the function find_matches and visualizes the result.

Question: What do you notice when visualizing the correspondences? How robust are the matches?

 

(d)     F  Incorrect matches can occur when matching descriptors. Suggest and implement a simple method for eliminating at least some of these incorrect matches. You can use the ideas that were presented at the lectures or test your own ideas. Either way, you need to explain the idea behind the method as well as well as demonstrate that the number of incorrect matches is lowered when using the proposed method.

(e)     F  Implement a local feature detector of your choice (e.g. SIFT, SURF, BRIEF, HOG). Test it on other assignments and report on the changes (increased robustness etc.).

(f)       F  Record a video with your phone or webcam. Use OpenCV to detect keypoints of your choice, display them using cv2.drawKeypoints and save the video with the displayed keypoints. The video must demonstrate the keypoint detector’s robustness to rotation and scale change. Make the video at least 15s long.

Exercise 3: Homography estimation
In this assignment we are dealing with planar images, we can try and estimate a homography matrix H that maps one image to another using planar correspondences. In this exercise you will implement an algorithm that computes such a transformation using minimization of the mean square error. For additional information about the method, consult the lecture notes as well as the course literature [1] (in the literature the term direct linear transform (DLT) is frequently used to describe the idea).

We will start with a short overview of the minimization of mean square error on a simpler case of similarity transform estimation. The similarity transform is a linear transform that accounts for translation, rotation, and scale. The transformation of a point x can be written as x0 = f(x,p), where p = [p1,p2,p3,p4] is a vector of four parameters that define the transform such that

x0   .

Question: Looking at the equation above, which parameters account for translation and which for rotation and scale?

Question: Write down a sketch of an algorithm to determine similarity transform from a set of point correspondences . For more details consult the lecture notes.

You will now implement a similar, but more complex algorithm for homography estimation. For a reference point xr in the first image we compute a corresponding point xt in the second image as:

                                                         Hxr            = xt                                                                                                                                                         (8)

               ;     where  ,

where points xr and xt are written in homogeneous coordinates[2]. Using the equations (8) we get a system of linear equations with eight unknowns h11,h12,h13,h21,h22,h23,h31,h32:

 (9)

                                                                                                                       ,                                             (10)

that can be transformed to
 
 
 
h11xr + h12yr + h13 − xth31xr − xth32yr − xt
=
0
(11)
h21xr + h22yr + h23 − yth31xr − yth32yr − yt
=
0.
(12)
If we want to estimate the eight parameters that determine a homography, we need at least four pairs of matched feature points. As some matches can be imprecise, we can increase the accuracy of our estimate by using a larger number of matches  .

This way we get an overdetermined system of equations:

 
 
 
Ah
=
0             (13)


xr1

 0



 xr2



 0

 ...



  xrn



0
yr1 0

yr2 0

...

yrn

0
1

0

1

0

...

1

0
0

xr1 0 xr2

...

0 xrn
0

yr1 0

yr2

...

0 yrn
0

1

0

1

...

0

1
−xt1xr1

−yt1xr1

−xt2xr2

−yt2xr2

...

−xtnxrn

−ytnxrn
−xt1yr1

−yt1yr1

−xt2yr2

−yt2yr2

...

−xtnyrn

−ytnyrn
 h11 

−xt1  h12 

−xt2 

−yt1  hhh132122 

−yt2

...                    

 h23 

                          

−xtn  h31 

−ytn  h32 

h33
=
 

0

 0 

 

 0 

 

 0   , (14)  0 

 ... 

 

 

 0 

0
that can be (similarly to estimation of similarity transform) solved as a minimization of mean square error. If the matrix A is a square matrix then we get an exact solution of the system. In case of an over-determined system (e.g., n > 4) the matrix A is not square. This problem is usually [1] solved using a matrix pseudo-inverse ATA, that is square and can be therefore be split to eigenvectors and eigenvalues. A solution of such a system is the unit eigenvector of ATA that corresponds to the lowest eigenvalue (using function eig). The same solution can be obtained more efficiently using the singular-value decomposition (SVD) as an eigenvector that corresponds to the lowest eigenvalue of A (using function svd).

                                                                                          λ11 ...        0             v11 ...      v19         T

Asvd= UDV(15)

A vector h that contains the parameters of the homography matrix is obtained from the last column of matrix V and by normalizing the vector with the value of v99 to set the last element in h to 1:

                                                                                h .                                                      (16)

(a)      Write function estimate_homography, that approximates a homography between two images using a given set of matched feature points following the algorithm below.

•    Construct a matrix A using the equation (13).

•    Perform a matrix decomposition using the SVD algorithm: [U,S,V]=svd(A).

•    Compute vector h using equation (16).

•    Reorder the elements of h to a 3×3 matrix H (e.g. using the function reshape).

(b)      Load the two New York cityscape images (newyork/image1.jpg and newyork/image2.jpg), and four hand-annotated correspondence pairs in the file newyork.txt[3]. Use the function display_matches to display the pairs of points. Using these pairs estimate the homography matrix H from the first image to the second image and use function cv2.warpPerspective to transform the first image to the plane of the second image using the given homography, then display the result. Also test your algorithm with images graf/graf1.jpg, graf/graf2.jpg and points from graf.txt. Note: You can check the correctness of your implementation by comparing with the file newyork/H.txt

(c)      When you get the correct result for the given points, repeat the procedure by generating a set of matching points automatically using the find_matches function that you have implemented in the previous assignment. Select the first 10 best matches and use them to compute a homography. Visualize the result by transforming the first image to the space of the second image as well as displaying the matches between the images. How well have you managed to estimate the homography? Does the estimate improve if you use more/less matching pairs? What happens if you include imperfect matches in the set? Set the parameters (feature point detector, parameters, number of selected points, etc.) to get the best performance of the method.

Note: This does not need to produce perfect results every time because DLT is not robust to outliers.

 

(d)      F  Implement and test a more robust homography estimation algorithm using iterative reweighed least squares approach that you have heard about at lectures. Test the robustness of the algorithm on the image pairs from the previous task. Experiment with different numbers of correspondences used as an input to the estimation.

(e)      F Write your own function for mapping points using the homography matrix. It should work like OpenCV’s warpPerspective(). It should accept an image and a homography matrix and return the input image as remapped by the homography matrix. The size of the output should match the size of the input image. Hint: You will need to use homogeneous coordinates.

References
[1] D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Prentice Hall, 2002.


 
[1] In the following text we will omit the parameters of an auto-correlation matrix C(x,y;σ,σ˜) for clarity and write C instead.
[2] Homogeneous coordinates for a point in 2D space are obtained by adding a third coordinate and setting it to 1
[3] Use read_data function to read the points. Reshape the data into four columns. The first two columns contain x and y coordinates of the points in the first image and the second two columns contain the corresponding points for the second image.

More products