Introduction
As a 3D scene representation, the light field records both the radiance and the direction of the light rays coming from a scene. A 4D light field is typically parameterized by two 2D planes, as illustrated in Fig. 1, where any pair of points from the two planes defines a light ray. Two typical devices can capture the light field of a scene: the camera array and the light-field camera. For example, the camera array shown in Fig. 2 consists of multiple cameras regularly installed on a plane, each observing the scene from a unique viewpoint. Thanks to the rich scene information they record, light fields enable many exciting applications. In this assignment, you are required to implement a program that synthesizes novel views from a recorded light field.
Fig. 1. Two-plane representation of a light field
Fig. 2. Camera array
Implementation Guideline
Data description. The provided light field data consist of 81 RGB sub-images (each of resolution 512 × 512) captured by a camera array of 9 × 9 views. All cameras share the same imaging settings, are regularly installed on a plane, and are oriented perpendicular to that plane. The detailed parameters, baseline = 30 mm, image size = 35 × 35 mm², and focal length (i.e. the distance between a viewpoint and its image plane) = 100 mm, are marked in Fig. 3.
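For reference, these fixed parameters can be gathered into constants. The following is a minimal C++ sketch; the names are illustrative, not prescribed by the assignment:

    const int    GRID_SIZE   = 9;                      // 9 x 9 viewpoints
    const int    IMG_RES     = 512;                    // 512 x 512 pixels per sub-image
    const double BASELINE    = 30.0;                   // spacing between adjacent viewpoints (mm)
    const double SENSOR_SIZE = 35.0;                   // physical image size (mm), 35 x 35
    const double FOCAL_LEN   = 100.0;                  // viewpoint-to-image-plane distance (mm)
    const double PIXEL_SIZE  = SENSOR_SIZE / IMG_RES;  // mm per pixel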
Fig. 3. Camera array parameters
Fig. 4. Target light ray interpolation from neighbour light rays
Implementation steps. A light field can be regarded as a set of light rays with known radiance and direction. For example, the pixel p(x_i, y_i) on the sub-image of viewpoint V(s_j, t_j) represents a light ray that passes through the 3D point P(X_j, Y_j, Z_j) with direction D(x_i, y_i, −f), where P is the position of viewpoint V(s_j, t_j) in the global 3D coordinate system (as defined in Fig. 3), f is the focal length, and (x_i, y_i) is the coordinate of p on the image plane. So, to synthesize an image from a new viewpoint, we only need to retrieve/resample the radiances (pixel values) from the collection of recorded pixel values (the image array). In summary, the algorithm can be implemented by the following steps:
1. Loading light field data – Read all the sub-images of the different viewpoints in order; they correspond to the grid viewpoints {V(s, t) | s = 0, 1, …, 8; t = 0, 1, …, 8} on the view plane. Specifically, the provided sub-image of viewpoint V(s, t) is named “cam_no.bmp”, where no = t * 9 + s. For the corresponding 3D coordinates, V(0,0) is located at P(−120, 120, 0) and V(8,8) at P(120, −120, 0).
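A sketch of the naming rule (whether the number is zero-padded is an assumption here; check it against the provided files):

    #include <string>

    // Filename of the sub-image of viewpoint V(s, t): "cam_no.bmp", no = t*9 + s.
    std::string subImageName(int s, int t)
    {
        return "cam_" + std::to_string(t * 9 + s) + ".bmp";
    }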
2. Building the global coordinate system – Before any computation, we need a 3D global coordinate system to describe the relative positions of the viewpoints and the directions of the light rays. Although any 3D Cartesian coordinate system would work, in this assignment the global coordinate system used in your program must follow the one defined in Fig. 3 (for a uniform testing standard).
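Combining the coordinate system of Fig. 3 with the two anchor positions given in step 1 (and the constants sketched earlier), the position of any grid viewpoint follows directly: x grows with s and y shrinks with t, in 30 mm steps.

    // Global position of grid viewpoint V(s, t): V(0,0) -> (-120, 120, 0),
    // V(8,8) -> (120, -120, 0), with a 30 mm step between neighbours.
    void viewpointPosition(int s, int t, double& X, double& Y, double& Z)
    {
        X = -120.0 + BASELINE * s;
        Y =  120.0 - BASELINE * t;
        Z = 0.0;
    }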
3. Defining the target view – To synthesize a new image, we need to define the parameters of the virtual camera, consisting of intrinsic and extrinsic parameters. The intrinsic parameters are the focal length f, the image size w × h, and the image resolution c × r; the extrinsic parameters are the viewpoint position (X, Y, Z) and the orientation vector (vx, vy, vz) of the virtual camera. To simplify the problem, you are only required to implement a variable viewpoint position, with the other parameters fixed to those of the camera array: f = 100 mm, w × h = 35 mm × 35 mm, c × r = 512 × 512, (vx, vy, vz) = (0, 0, −1).
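With these intrinsics, each pixel of the target view maps to a ray direction D(x_i, y_i, −f) as in the ray model above. The sketch below assumes a convention not spelled out in the handout beyond Fig. 3: the image x axis aligns with the global x axis, the image y axis with the global y axis, and pixel (0, 0) sits at the top-left corner, with pixel centres at half-pixel offsets.

    struct Vec3 { double x, y, z; };

    // Map pixel (col, row) of the virtual camera to its light-ray direction
    // D(x_i, y_i, -f), using the constants sketched earlier.
    Vec3 pixelToRayDir(int col, int row)
    {
        double x = (col + 0.5) * PIXEL_SIZE - SENSOR_SIZE / 2.0; // image-plane x (mm)
        double y = SENSOR_SIZE / 2.0 - (row + 0.5) * PIXEL_SIZE; // image-plane y (mm)
        Vec3 d = { x, y, -FOCAL_LEN };
        return d;
    }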
4. Resampling new light rays – Once the target view is defined, we can compute each pixel value by retrieving/resampling its corresponding 3D light ray, as detailed below (the bullets are condensed into a code sketch after Fig. 5).
• In Fig. 4, the new light ray (red dashed line) passes through the two parameterization planes; we first find its neighbour light rays (blue dashed lines) from the known viewpoints (blue nodes). (Hint: since the baseline is tiny relative to the distance between the scene and the camera, the blue dashed lines can be regarded as parallel to the red dashed line in your implementation.)
• In Fig. 5, for a light ray (e.g. a blue dashed line) from a known viewpoint, we can bilinearly interpolate its value from the 4-neighbourhood pixels on the image of this viewpoint.
• Once all four neighbour light rays are obtained, we compute the value of the new light ray of the target view by bilinearly interpolating between them.
Fig. 5. The known neighbour light rays of a target light ray
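The three bullets above can be condensed into one per-ray routine. The sketch below assumes the parallel-ray hint, the constants sketched earlier, and a hypothetical global array subImages holding the 81 decoded sub-images (adapt the layout to your own loader). sampleSubImage implements the per-view bilinear interpolation of Fig. 5; resampleRay then blends the four neighbouring views.

    // 81 sub-images in viewpoint order, indexed [t*9 + s][row][col][channel].
    unsigned char subImages[81][512][512][3];

    // Bilinearly sample the sub-image of V(s, t) at continuous image-plane
    // coordinates (x, y) in mm; returns false if the sample leaves the image.
    bool sampleSubImage(int s, int t, double x, double y, double rgb[3])
    {
        double col = (x + SENSOR_SIZE / 2.0) / PIXEL_SIZE - 0.5;
        double row = (SENSOR_SIZE / 2.0 - y) / PIXEL_SIZE - 0.5;
        if (col < 0.0 || col > 511.0 || row < 0.0 || row > 511.0) return false;
        int c0 = (int)col, r0 = (int)row;
        int c1 = (c0 < 511) ? c0 + 1 : c0;
        int r1 = (r0 < 511) ? r0 + 1 : r0;
        double a = col - c0, b = row - r0;
        unsigned char (*img)[512][3] = subImages[t * 9 + s];
        for (int i = 0; i < 3; ++i)
            rgb[i] = (1-a)*(1-b)*img[r0][c0][i] + a*(1-b)*img[r0][c1][i]
                   + (1-a)*b*img[r1][c0][i]     + a*b*img[r1][c1][i];
        return true;
    }

    // Resample the target ray through image-plane point (xp, yp) of the
    // virtual camera at P(X, Y, Z) with focal length f.
    bool resampleRay(double X, double Y, double Z, double f,
                     double xp, double yp, unsigned char rgb[3])
    {
        // Intersect the ray P + u*D, D = (xp, yp, -f), with the array plane
        // z = 0, i.e. u = Z / f.
        double ix = X + xp * Z / f;
        double iy = Y + yp * Z / f;

        // Continuous grid coordinates of the intersection point.
        double gs = (ix + 120.0) / BASELINE;
        double gt = (120.0 - iy) / BASELINE;
        if (gs < 0.0 || gs > 8.0 || gt < 0.0 || gt > 8.0) return false;
        int s0 = (int)gs, t0 = (int)gt;
        if (s0 > 7) s0 = 7;                  // keep s0+1, t0+1 inside the grid
        if (t0 > 7) t0 = 7;
        double a = gs - s0, b = gt - t0;

        // Parallel neighbour rays share the target ray's direction, rescaled
        // to the array's 100 mm focal length when f differs.
        double qx = xp * FOCAL_LEN / f;
        double qy = yp * FOCAL_LEN / f;

        int    sv[4] = { s0, s0 + 1, s0,     s0 + 1 };
        int    tv[4] = { t0, t0,     t0 + 1, t0 + 1 };
        double wv[4] = { (1-a)*(1-b), a*(1-b), (1-a)*b, a*b };
        double acc[3] = { 0.0, 0.0, 0.0 }, c[3];
        for (int k = 0; k < 4; ++k) {
            if (!sampleSubImage(sv[k], tv[k], qx, qy, c)) return false;
            for (int i = 0; i < 3; ++i) acc[i] += wv[k] * c[i];
        }
        for (int i = 0; i < 3; ++i) rgb[i] = (unsigned char)(acc[i] + 0.5);
        return true;
    }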
5. Saving the resultant image – When all pixel values of the new target view have been computed, save the synthesized image to the user-specified path.
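In case your bmp.cpp does not already provide one, here is a minimal sketch of a 24-bit uncompressed BMP writer. Pixels are supplied top-down as RGB; BMP stores rows bottom-up in BGR order, padded to 4-byte multiples.

    #include <cstdio>

    static void put32(FILE* f, unsigned v) { for (int i = 0; i < 4; ++i) std::fputc((v >> (8*i)) & 0xFF, f); }
    static void put16(FILE* f, unsigned v) { for (int i = 0; i < 2; ++i) std::fputc((v >> (8*i)) & 0xFF, f); }

    bool saveBMP(const char* path, const unsigned char* rgb, int w, int h)
    {
        FILE* f = std::fopen(path, "wb");
        if (!f) return false;
        int pad = (4 - (w * 3) % 4) % 4;
        unsigned imgSize = (unsigned)((w * 3 + pad) * h);
        std::fputc('B', f); std::fputc('M', f);                   // file header (14 bytes)
        put32(f, 54 + imgSize); put32(f, 0); put32(f, 54);
        put32(f, 40); put32(f, (unsigned)w); put32(f, (unsigned)h); // info header (40 bytes)
        put16(f, 1); put16(f, 24); put32(f, 0); put32(f, imgSize);
        put32(f, 2835); put32(f, 2835); put32(f, 0); put32(f, 0);   // 72 DPI, no palette
        for (int row = h - 1; row >= 0; --row) {                    // bottom-up rows
            for (int col = 0; col < w; ++col) {
                const unsigned char* p = rgb + (row * w + col) * 3;
                std::fputc(p[2], f); std::fputc(p[1], f); std::fputc(p[0], f); // BGR
            }
            for (int i = 0; i < pad; ++i) std::fputc(0, f);
        }
        std::fclose(f);
        return true;
    }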
Basic Requirements (80 points)
1. You are required to implement a special case of the algorithm described above: the target viewpoint lies on the camera array plane, i.e. its coordinate is (X, Y, 0) with X ∈ [−120, 120], Y ∈ [−120, 120]. Under this assumption, we can easily find the four known viewpoints neighbouring the target viewpoint, and all the target light rays can be interpolated from the light rays of these four known viewpoints, so there is no need to locate neighbour viewpoints for each individual new light ray (see the sketch below).
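A sketch of this special case, reusing the subImages array and constants introduced earlier: with the target viewpoint on the array plane and all intrinsics equal to the array's, a parallel blue ray from a neighbour viewpoint passes through the same pixel (col, row) of that view, so the whole image reduces to a single bilinear blend of four sub-images with fixed weights.

    // Synthesize the target view at (X, Y, 0) directly as a bilinear blend of
    // the four neighbouring sub-images; out holds 512*512 RGB pixels, top-down.
    void synthesizeOnPlane(double X, double Y, unsigned char* out)
    {
        double gs = (X + 120.0) / BASELINE;   // continuous grid coords of (X, Y)
        double gt = (120.0 - Y) / BASELINE;
        int s0 = (int)gs, t0 = (int)gt;
        if (s0 > 7) s0 = 7;                   // keep s0+1, t0+1 inside the grid
        if (t0 > 7) t0 = 7;
        double a = gs - s0, b = gt - t0;
        for (int row = 0; row < 512; ++row)
            for (int col = 0; col < 512; ++col)
                for (int i = 0; i < 3; ++i) {
                    double v =
                        (1-a)*(1-b) * subImages[ t0     *9 + s0    ][row][col][i]
                      + a*(1-b)     * subImages[ t0     *9 + s0 + 1][row][col][i]
                      + (1-a)*b     * subImages[(t0 + 1)*9 + s0    ][row][col][i]
                      + a*b         * subImages[(t0 + 1)*9 + s0 + 1][row][col][i];
                    out[(row * 512 + col) * 3 + i] = (unsigned char)(v + 0.5);
                }
    }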
2. Your program must be coded in ANSI C/C++ and use standard libraries only.
3. The compiled program must run in the Windows 10 command prompt as a console program, accept source bitmaps (.bmp format) with the following syntax, and save the generated image to the current directory.
C:\> viewSynthesis <LF_dir> <viewpoint_coord> <focal_length>

viewSynthesis is the executable file of your program; e.g. the command viewSynthesis light_field_views 200 200 0 100 synthesizes an image from the viewpoint position (200, 200, 0) with a focal length of 100. <LF_dir> is the directory where the light field sub-images are located. <viewpoint_coord> and <focal_length> are the 3D coordinate of the target viewpoint and the focal length of the virtual camera, respectively. Note that the coordinates and scalars conform to the definitions in Fig. 3. (A minimal argument-parsing sketch follows below.)
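The example command above implies five arguments: the directory, the three viewpoint coordinates, and the focal length. A minimal sketch of the console interface:

    #include <cstdio>
    #include <cstdlib>

    int main(int argc, char* argv[])
    {
        if (argc != 6) {
            std::fprintf(stderr, "usage: viewSynthesis <LF_dir> <X> <Y> <Z> <focal_length>\n");
            return 1;
        }
        const char* lfDir = argv[1];       // directory of the 81 sub-images
        double X = std::atof(argv[2]);     // target viewpoint coordinate (mm)
        double Y = std::atof(argv[3]);
        double Z = std::atof(argv[4]);
        double f = std::atof(argv[5]);     // virtual-camera focal length (mm)
        // ... load the sub-images from lfDir, synthesize the view, and save
        // the result to the current directory.
        return 0;
    }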
4. You are required to submit source code only. We will use the Visual Studio 2015 C++ compiler and compile your program via the Visual Studio command prompt (Tools Command Prompt) with the command line: C:\> cl viewSynthesis.cpp bmp.cpp (please ensure your source code compiles correctly with it).
Enhanced Features (20 points)
You are required to implement the full version of the algorithm described above, which means the target viewpoint can be located at any 3D position P(X, Y, Z) where X ∈ [−120, 120], Y ∈ [−120, 120], Z ∈ [0, ∞). In addition, your implementation is encouraged to support a variable focal length f ∈ (0, ∞). Note that when a query light ray falls outside the range of the recorded light field, it is assigned the background colour black, i.e. rgb = (0, 0, 0), as in the snippet below.
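With a routine like the resampleRay sketch above, this rule becomes a simple per-pixel fallback:

    unsigned char rgb[3];
    if (!resampleRay(X, Y, Z, f, xp, yp, rgb))
        rgb[0] = rgb[1] = rgb[2] = 0;   // ray left the recorded field: black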