CMU 15-463: Assignment 5

The purpose of this assignment is to explore photometric stereo as a computational imaging technique for recovering the shape of an object. As we discussed in class, photometric stereo, in its most common form, takes as input images of an object under a fixed orthographic camera and multiple illumination directions. By assuming that the object is Lambertian and the illumination directional, per-pixel albedoes, normals, and eventually depth can be recovered using simple linear algebraic operations.

In the first part of the homework, you will use data captured by us to implement two types of photometric stereo: uncalibrated, where the lighting directions are unknown, and calibrated, where the lighting directions are known. By comparing the two, you will learn about the generalized bas-relief ambiguity [2] that exists in the uncalibrated case. In the second part, you will use uncalibrated photometric stereo to measure the shape of some objects of your choice (up to a generalized bas-relief ambiguity), by capturing images with your own camera. There are also two bonus parts, where you can implement two popular algorithms for resolving the bas-relief ambiguity.

You are strongly encouraged to read the papers by Belhumeur et al. [2] and Yuille and Snow [5], which discuss the generalized bas-relief ambiguity and uncalibrated photometric stereo. As always, there is a “Hints and Information” section at the end of this document that is likely to help.

1. Photometric stereo
For the first part of the homework, you will use a set of seven images of a face, measured using a near-orthographic camera with fixed viewpoint, and under different illuminations. These images are available as files ./data/input_N.tif in the homework ZIP archive, where N = {1,...,7}. They are linear images, corresponding to RAW files that have been demosaicked and converted to the (linear) sRGB color space.

Initialization (5 points). Load the seven images in Matlab, convert them to XYZ color space, and extract the luminance channel of each. Then, stack the seven luminance channels into a matrix I of size 7 × P, where P is the number of pixels in each luminance channel.
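For concreteness, here is a minimal Matlab sketch of this step; the underscore in the file names and the use of rgb2xyz (Image Processing Toolbox) with a linear-RGB input are assumptions to check against your setup:

```matlab
numImages = 7;
for n = 1:numImages
    rgb = im2double(imread(sprintf('./data/input_%d.tif', n)));
    % Inputs are linear, so skip the sRGB gamma during the conversion.
    xyz = rgb2xyz(rgb, 'ColorSpace', 'linear-rgb');
    lum = xyz(:,:,2);                  % Y channel = luminance
    if n == 1
        [H, W] = size(lum);
        I = zeros(numImages, H * W);   % 7 x P luminance matrix
    end
    I(n, :) = lum(:)';
end
```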

Uncalibrated photometric stereo. Our goal is to recover a 3 × 1 normal vector n and a scalar albedo a at each pixel of the camera. As we did in class, it will be convenient to consider at each pixel the pseudo-normal b = a · n. We can stack the pseudo-normals for all pixels into a 3 × P matrix B, which we call the pseudo-normal matrix.

Additionally, each of our seven input images is captured under some directional light, described by a 3×1 unit-norm vector li. We can stack the seven light vectors into a 3×7 matrix L, which we call the light matrix.

Photometric stereo relies on the “n-dot-l” shading model we discussed in class, which is valid under directional light and Lambertian reflectance. Under this model, we can relate the matrices I, L and B through a simple matrix product,

$$I = L^T \cdot B. \tag{1}$$

If all of our assumptions are satisfied exactly, the matrix I will have rank exactly 3. In practice, this will not be the case, because of noise and because the n-dot-l shading assumptions are never perfectly accurate. However, we can find a best approximation (in the least-squares sense) by using the SVD to recover the best rank-3 decomposition of matrix I. From this decomposition, we can recover estimates for the matrices L and B. In turn, from B we can use normalization to recover estimates for the 1 × P albedo matrix A and the 3 × P normal matrix N.

Unfortunately, these estimates are not unique. Let’s call Le and Be the light and pseudo-normal matrices obtained from the above SVD procedure, and let Q be an invertible 3 × 3 matrix. Then, the matrices LQ = Le · Q and BQ = Q^{-T} · Be approximate Equation (1) exactly as well as the original estimates Le and Be.

Use the SVD to recover Le and Be, and then convert Be to per-pixel albedoes Ae and normals Ne. Reshape Ae and Ne into single-channel and three-channel images with the same width and height as the original images, and show the results. Additionally, visualize the normals as a vector field. (See the help for Matlab function quiver for visualizing the normals.) Finally, select any non-diagonal matrix Q, and visualize the albedo AQ and normals NQ you compute from the corresponding BQ.
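A minimal sketch of this step, assuming I, H, and W from before; splitting the singular values evenly between the two factors is one arbitrary choice, since any other split is absorbed by the ambiguity Q discussed above:

```matlab
[U, S, V] = svd(I, 'econ');
Le = (U(:,1:3) * sqrt(S(1:3,1:3)))';   % 3x7 light estimate, I approx Le' * Be
Be = sqrt(S(1:3,1:3)) * V(:,1:3)';     % 3xP pseudo-normal estimate
Ae = sqrt(sum(Be.^2, 1));              % 1xP albedo: norm of each column
Ne = Be ./ max(Ae, eps);               % 3xP unit normals

albedoImg = reshape(Ae, H, W);         % single-channel albedo image
normalImg = reshape(Ne', H, W, 3);     % three-channel normal image
```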

Simple rendering (10 points). Use the albedoes and normals you recovered from Be and from BQ to predict what the person would look like (in grayscale) if illuminated from direction l = (0.58,−0.58,−0.58) and from direction l = (−0.58,−0.58,−0.58). Additionally, synthesize images using the normals and albedoes from BQ, after transforming the light vectors by Q. What do you observe when you compare the various images you synthesized?
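The following is a minimal sketch of such a rendering, assuming a pseudo-normal matrix B (either Be or BQ, 3 × P) and the image dimensions H, W from before:

```matlab
% Minimal n-dot-l re-rendering sketch; B can be Be or BQ (3xP).
l = [0.58; -0.58; -0.58];              % new light direction
img = reshape(max(l' * B, 0), H, W);   % albedo times clamped n-dot-l
imshow(img / max(img(:)));             % normalize for display
```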

Enforcing integrability (40 points). As we discussed in class, the per-pixel normals n(x,y) can be related, after appropriate normalization, to the x and y derivatives of the depth image z = f(x,y) corresponding to the surface of the object we are scanning. Therefore, the true normals are expected to be integrable, as otherwise they would not correspond to a true surface.

Arbitrary invertible transformations Q do not preserve integrability of normal fields. Therefore, we can try to resolve the ambiguity in the normals by trying to find a Q such that the corresponding normal field NQ is integrable. Unfortunately, while enforcing integrability does help remove some of the ambiguity, it is not sufficient for uniquely determining the true albedoes and normals. As shown by Belhumeur et al. [2], there exists a class of matrices of the form

$$G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \mu & \nu & \lambda \end{pmatrix}, \tag{2}$$

such that, for all µ and ν and for all λ ≠ 0, they are invertible and preserve integrability. That is, if a normal field N is integrable, then the transformed field

$$G^{-T} N \tag{3}$$

is also integrable! Therefore, by enforcing integrability, we can only hope to recover the per-pixel albedoes and normals up to a matrix of the form G. Put another way, enforcing integrability lets us reduce the degrees of freedom we have from nine (the entries of the original Q) down to three (the µ,ν,λ of G), but not down to zero. Matrices of the form of Equation (2) correspond to the generalized bas-relief (GBR) transformation, so-called because when µ = 0 and ν = 0 this corresponds to the transformation used in bas-relief sculptures to create a perception of 3D shape.

Our goal now is to find a matrix Q such that the corresponding normals NQ are integrable. We will follow the original derivation of Yuille and Snow [5], which solves for the matrix ∆ = Q^T instead of Q. To estimate ∆, first let’s denote by b̃ = ∆^{-1} · be the transformed pseudo-normal at each pixel. Then, we can write the integrability constraint at each pixel as:

$$\tilde{b}^{(3)} \frac{\partial \tilde{b}^{(2)}}{\partial x} - \tilde{b}^{(2)} \frac{\partial \tilde{b}^{(3)}}{\partial x} = \tilde{b}^{(3)} \frac{\partial \tilde{b}^{(1)}}{\partial y} - \tilde{b}^{(1)} \frac{\partial \tilde{b}^{(3)}}{\partial y}. \tag{4–5}$$

By substituting b̃ = ∆^{-1} · be, and after some manipulation, we arrive at the following linear equation:

$$\begin{pmatrix} A_1 & A_2 & A_3 & A_4 & A_5 & A_6 \end{pmatrix} \cdot x = 0, \tag{6}$$

where

$$A_1 = b_e^{(2)} \frac{\partial b_e^{(3)}}{\partial x} - b_e^{(3)} \frac{\partial b_e^{(2)}}{\partial x}, \tag{7}$$

$$A_2 = b_e^{(3)} \frac{\partial b_e^{(1)}}{\partial x} - b_e^{(1)} \frac{\partial b_e^{(3)}}{\partial x}, \tag{8}$$

$$A_3 = b_e^{(1)} \frac{\partial b_e^{(2)}}{\partial x} - b_e^{(2)} \frac{\partial b_e^{(1)}}{\partial x}, \tag{9}$$

$$A_4 = b_e^{(2)} \frac{\partial b_e^{(3)}}{\partial y} - b_e^{(3)} \frac{\partial b_e^{(2)}}{\partial y}, \tag{10}$$

$$A_5 = b_e^{(3)} \frac{\partial b_e^{(1)}}{\partial y} - b_e^{(1)} \frac{\partial b_e^{(3)}}{\partial y}, \tag{11}$$

$$A_6 = b_e^{(1)} \frac{\partial b_e^{(2)}}{\partial y} - b_e^{(2)} \frac{\partial b_e^{(1)}}{\partial y}, \tag{12}$$

and x is a 6 × 1 vector such that

$$\Delta = \begin{pmatrix} x_1 & x_4 & 0 \\ x_2 & x_5 & 0 \\ x_3 & x_6 & 1 \end{pmatrix}, \tag{13}$$

where the fixed third column corresponds to the three degrees of freedom the GBR affords us. You can find the details of the above derivation in Yuille and Snow [5].

You can now compute matrix ∆ by performing the following steps:

1.    Form the three-channel “pseudo-normal” image be, and compute its x and y spatial derivatives (use Matlab’s gradient). We recommend applying a small amount of Gaussian filtering to be before using gradient.

2.    Form a homogeneous linear system Ax = 0, by stacking together linear constraints of the form of Equation (6) for all image pixels.

3.    Solve for x, and estimate ∆ from it using Equation (13).

Once you have found ∆, apply it to the pseudo-normal matrix Be, and then visualize the resulting albedo (show the albedo image) and normals (show the normal image and normal vector field). Figure 1 shows the expected results.
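The following is a minimal Matlab sketch of these three steps plus the final application of ∆, assuming Be is the 3 × P pseudo-normal matrix and H, W the image dimensions; the Gaussian sigma is a tunable choice, and you may need to flip the sign of x so that the recovered normals point toward the camera.

```matlab
% Step 1: pseudo-normal image, smoothed, with x and y spatial derivatives.
beImg = reshape(Be', H, W, 3);
bx = zeros(H, W, 3); by = zeros(H, W, 3);
for c = 1:3
    beImg(:,:,c) = imgaussfilt(beImg(:,:,c), 1);      % slight Gaussian blur
    [bx(:,:,c), by(:,:,c)] = gradient(beImg(:,:,c));
end
b1 = beImg(:,:,1); b2 = beImg(:,:,2); b3 = beImg(:,:,3);

% Step 2: one row of A per pixel, columns A1..A6 of Equations (7)-(12).
col = @(M) M(:);
A = [ col(b2.*bx(:,:,3) - b3.*bx(:,:,2)), ...
      col(b3.*bx(:,:,1) - b1.*bx(:,:,3)), ...
      col(b1.*bx(:,:,2) - b2.*bx(:,:,1)), ...
      col(b2.*by(:,:,3) - b3.*by(:,:,2)), ...
      col(b3.*by(:,:,1) - b1.*by(:,:,3)), ...
      col(b1.*by(:,:,2) - b2.*by(:,:,1)) ];

% Step 3: the minimizer of ||A*x|| subject to ||x|| = 1 is the right
% singular vector associated with the smallest singular value.
[~, ~, V] = svd(A, 'econ');
x = V(:, end);
Delta = [x(1) x(4) 0;
         x(2) x(5) 0;
         x(3) x(6) 1];      % Equation (13)

Bint = Delta \ Be;          % apply Delta: b-tilde = Delta^{-1} * be
```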

 

Figure 1: Uncalibrated photometric stereo results. From left to right, estimated albedo image (times 10), normal image, normal vector field.

Normal integration (10 points). Now that you have a normal field, you can use it to compute a surface Z = f(x,y). First, compute from the normals the derivatives (∂f/∂x, ∂f/∂y); for a surface z = f(x,y) with unit normal n, these are ∂f/∂x = −n^{(1)}/n^{(3)} and ∂f/∂y = −n^{(2)}/n^{(3)}. Then, integrate these derivatives to compute the surface Z = f(x,y).

For the integration, you can experiment with the functions integrate_poisson.m and integrate_frankot.m provided in the ./src directory of the homework ZIP archive: The first integrates the derivative vector field by solving the Poisson equation, similar to the integration procedure you implemented in Homework 3. The second integrates the derivative vector field using a projection method by Frankot and Chellappa [3]. Try both functions, and use the result you like the most.
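A minimal sketch of both steps, assuming a three-channel unit-normal image N of size H × W × 3; the signatures of the provided functions are an assumption, so check their headers in ./src:

```matlab
% Derivatives from normals: for z = f(x,y) with normal (n1,n2,n3),
% f_x = -n1/n3 and f_y = -n2/n3 (guard against n3 near zero).
n3 = max(N(:,:,3), eps);
fx = -N(:,:,1) ./ n3;
fy = -N(:,:,2) ./ n3;

% Integration (assumed signatures; check the files in ./src).
Z = integrate_frankot(fx, fy);          % or: Z = integrate_poisson(fx, fy);
figure; imshow(Z, []);                  % depth image
figure; surf(Z, 'EdgeColor', 'none');   % 3D surface
axis equal; colormap gray; camlight;
```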

Visualize the final surface you reconstruct as both a depth image and a 3D surface. (See the help for Matlab function surf.) Figure 2 shows two views of the expected surface.

 

Figure 2: Uncalibrated photometric stereo results. Two views of the recovered face shape.

Additionally, experiment with GBR transformations G for different values of µ, ν, λ, until you find one that produces a reasonably undistorted face surface. Report the GBR you end up using, and show the corresponding albedoes, normals, and 3D surfaces.
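A minimal sketch of applying one candidate GBR to a 3 × P pseudo-normal matrix B, per Equations (2) and (3):

```matlab
% Hand-tuned GBR: pick mu, nu, lambda, transform, and inspect the surface.
mu = 0; nu = 0; lambda = 1;            % identity GBR to start; adjust by eye
G = [1 0 0; 0 1 0; mu nu lambda];      % Equation (2)
Bgbr = G' \ B;                         % G^{-T} * B, per Equation (3)
% Re-normalize Bgbr to albedo/normals and re-integrate as above.
```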

Calibrated photometric stereo (20 points). The file ./data/sources.mat in the homework ZIP archive contains the ground-truth light source vectors, in the form of the light matrix L. When the light directions have been calibrated, photometric stereo becomes considerably easier: given that both I and L are now known, all you have to do is solve the linear system of Equation (1) for the pseudo-normal matrix B.

Solve this linear system to recover per-pixel albedoes and normals. Additionally, perform normal integration as above, to recover the 3D surface z. Visualize the recovered albedoes, normals, and surface as before. Figures 3 and 4 show what you should expect to see. How do these results compare to the results of the uncalibrated case?
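A minimal sketch of the calibrated solve, assuming I is the 7 × P luminance matrix; the variable name stored inside sources.mat is an assumption, so inspect the file first:

```matlab
S = load('./data/sources.mat');   % field name below is an assumption
L = S.L;                          % 3x7 ground-truth light matrix
B = L' \ I;                       % least-squares solve of I = L' * B
A = sqrt(sum(B.^2, 1));           % 1xP per-pixel albedo
N = B ./ max(A, eps);             % 3xP unit normals
```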

Dealing with non-idealities. Note the poor estimates of the albedo in the area surrounding the nostrils. What is the source of this error? Can you think of a method for finding a better estimate of this information from these seven images?

2. Capture and reconstruct your own shapes
You will now perform uncalibrated photometric stereo using images you capture with your own camera. For this, you should select two objects to scan: First, select an object that approximately satisfies the assumptions of photometric stereo (very diffuse reflectance without much or any glossiness, few interreflections and occlusions). Second, select an object that partially violates the assumptions of photometric stereo (e.g., it has a somewhat glossy reflectance, or it has strong concavities).

For each object, capture at least seven images with a fixed camera and different lighting conditions. Make sure to consult the Hints section for information on how to best capture these images. Apply uncalibrated photometric stereo to the sequences of images you capture, and produce a reconstruction of the albedo, normals, and surfaces for each of the two objects. Since your reconstructions will be up to a GBR, you can manually experiment with different GBR transformations until you find the best surface result.

 

Figure 3: Calibrated photometric stereo results. From left to right, estimated albedo image, normal image, normal vector field.

For each object, show one of the input images you captured, and visualize the albedo, normals, and surface you reconstructed. Additionally, show a rendering of both objects under a new lighting direction of your preference.

3. Bonus: Resolving the GBR ambiguity
Following the discovery of the GBR ambiguity, there have been a number of techniques that use different heuristics to try to resolve the ambiguity when performing uncalibrated photometric stereo. Below we mention two that have been particularly successful. For up to 100 points of extra credit (50 points for each method), you can read the corresponding paper, implement their method, and apply it to both the images that came with the homework for Part 1, and the images you captured for Part 2. (You can still get partial credit for incomplete implementations, and for applying the method to just one set of images).

Entropy minimization. This technique was introduced by Alldrin and Kriegman [1]. The intuition behind it is that many real-world objects have a relatively small number of albedo values (e.g., different parts of the surface are painted with a small number of different colors). Therefore, among all possible GBR transformations, we should prefer the one that minimizes the variability of the recovered albedo values. The paper proposes measuring variability using entropy.
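As a starting point, here is a sketch of one plausible form of this objective, in the spirit of the method rather than Alldrin and Kriegman's exact formulation; the histogram binning is an arbitrary choice:

```matlab
% Entropy of the albedo distribution under a candidate GBR (mu, nu, lambda);
% minimize this over GBR parameters. B is a 3xP pseudo-normal matrix.
function ent = albedoEntropy(B, mu, nu, lambda)
    G = [1 0 0; 0 1 0; mu nu lambda];
    a = sqrt(sum((G' \ B).^2, 1));                    % candidate albedos
    p = histcounts(a, 256, 'Normalization', 'probability');
    p = p(p > 0);                                     % drop empty bins
    ent = -sum(p .* log2(p));                         % Shannon entropy (bits)
end
```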

Using perspective cameras. The GBR ambiguity is, in part, a consequence of the fact that we assume an orthographic camera. When the camera we use is perspective, then normals and albedoes can be reconstructed exactly. This was proven by Papadhimitri and Favaro [4], who also show how one can do the reconstruction in this case.
