TDT4195 Image Processing: Assignment 1

•   You can work on your own or in groups of up to 2 people.

•   Upload your code as a single ZIP file.

•   Upload your report as a single PDF file to blackboard.

•   You are required to use Python 3 for the programming assignments. For the deep learning part, all starter code is given in PyTorch, and we highly recommend that you use this framework.

•   The delivered code is taken into account in the evaluation. Ensure your code is well documented and as readable as possible.

Introduction
This assignment will give you an introduction to basic image processing with python, filtering in the spatial domain, and a simple introduction to building fully-connected neural networks with PyTorch.

With this assignment, we provide starter code for the programming tasks. You can download it from: https://github.com/hukkelas/TDT4195-StarterCode.

Before starting, please read through the assignment README.md.

Recommended readings

1.    Convolution (recommended to review the lecture slides before looking at these resources):

•   Cross-Correlation overview

•   Convolution overview

•   Cross correlation vs convolution

2.    3Blue1Brown introduction to neural networks. A very good introduction to neural networks with great visualizations.

3.    Deep Learning with PyTorch: A 60 Minute Blitz: A short introduction to PyTorch (similar to what will be given in the lectures).

Delivery

We ask you to follow these guidelines:

•   Report: Deliver your answers as a single PDF file. Include all tasks in the report, and mark each answer clearly with the task it belongs to (Task 1.a, Task 1.b, Task 2.c, etc.). There is no need to include your code in the report.

•   Plots in report: Ensure that the plots in the report are large and easily readable. You might want to use the "ylim" function in the matplotlib package to "zoom" in on your plots. Label the different graphs so that it is easy for us to see which graphs correspond to the train, validation, and test sets.

•   Source code: Upload your code as a zip file. In the assignment starter code, we have included a script (create_submission_zip.py) to create your delivery zip. Please use this, as this will structure the zipfile as we expect. (Run this from the same folder as all the python files).

To use the script, simply run: python3 create_submission_zip.py

•   Upload to blackboard: Upload the ZIP file with your source code and the report to blackboard before the delivery deadline.

Groups that do not follow these guidelines will lose points.
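As a rough illustration of the plotting guideline, the sketch below draws three labelled curves and limits the y-axis with ylim. The loss values and the output file name loss_plot.png are hypothetical placeholders, not results from this assignment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Hypothetical loss values; replace them with the ones you logged during training.
epochs = [1, 2, 3, 4, 5]
losses = {
    "Train": [0.90, 0.50, 0.35, 0.28, 0.25],
    "Validation": [0.95, 0.60, 0.45, 0.40, 0.38],
    "Test": [0.97, 0.62, 0.47, 0.42, 0.40],
}

plt.figure(figsize=(10, 6))        # a large figure stays readable in a PDF
for name, values in losses.items():
    plt.plot(epochs, values, label=name)
plt.ylim(0.0, 1.0)                 # "zoom" the y-axis to the range of interest
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()                       # shows which curve belongs to which dataset
plt.savefig("loss_plot.png")
```

Tightening the limits passed to ylim is usually enough to make small differences between the curves visible.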

Spatial Filtering
Task 1: Theory [1.5pt]
A digital image is constructed from an image sensor. An image sensor outputs a continuous voltage waveform that represents the image, and to construct a digital image, we need to convert this continuous signal. This conversion involves two processes: sampling and quantization.
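The two processes can be illustrated with a one-dimensional signal. This is a minimal sketch, assuming a sine wave as a stand-in for the sensor's voltage waveform; the function name sample_and_quantize is hypothetical.

```python
import numpy as np

def sample_and_quantize(signal, num_samples, bits):
    """Sample a continuous signal on [0, 1) and quantize it to 2**bits levels.

    `signal` is any function returning values in [0, 1]; it stands in for
    the sensor's voltage waveform (an assumption, not a sensor model).
    """
    # Sampling: evaluate the continuous signal at discrete, evenly spaced positions.
    positions = np.arange(num_samples) / num_samples
    samples = signal(positions)
    # Quantization: map each continuous amplitude to one of 2**bits integer levels.
    levels = 2 ** bits
    quantized = np.clip(np.floor(samples * levels), 0, levels - 1).astype(int)
    return positions, quantized

positions, quantized = sample_and_quantize(
    lambda t: 0.5 + 0.5 * np.sin(2 * np.pi * t), num_samples=8, bits=3
)
print(quantized)  # eight integer intensities in the range [0, 7]
```

Increasing num_samples refines the sampling grid, while increasing bits refines the set of intensity levels.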

(a)    [0.1pt] Explain in one sentence what sampling is.

(b)    [0.1pt] Explain in one sentence what quantization is.

(c)    [0.2pt] Looking at an image histogram, how can you see that the image has high contrast?

(d)   [0.5pt] Perform histogram equalization by hand on the 3-bit (8 intensity levels) image in Figure 1a. Your report must include all the steps you did to compute the histogram, the transformation, and the transformed image. Round down any resulting pixel intensities that are not integers (use the floor operator).

(e)    [0.1pt] What happens to the dynamic range if we apply a log transform to an image with a large variance in pixel intensities?

Hint: A log transform is given by s = c · log(1 + r), where r and s are the pixel intensities before and after the transformation, respectively, and c is a constant.

(f)     [0.5pt] Perform spatial convolution by hand on the image in Figure 1a using the kernel in Figure 1b. The convolved image should be 3×5. You are free to choose how you handle boundary conditions, but state your choice in the report.

    6  7  5  4  6              1  0  -1
    4  5  7  0  7              2  0  -2
    7  1  6  6  3              1  0  -1

    (a) A 3 × 5 image.         (b) A 3 × 3 Sobel kernel.

Figure 1: An image I and a convolutional kernel K. For the image, each square represents an image pixel, where the value inside is the pixel intensity in the [0,7] range (3-bit).
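The steps that Task 1(d) asks for (histogram, cumulative distribution, transformation, transformed image) can be sketched in Python. The image below is a hypothetical 2 × 4 example, not the one in Figure 1a, and the function name equalize_histogram is only illustrative. The sketch assumes the common transformation T(r) = floor((L − 1) · CDF(r)), where L is the number of intensity levels.

```python
def equalize_histogram(image, bits=3):
    """Histogram-equalize a small integer image given as a list of rows."""
    levels = 2 ** bits
    pixels = [p for row in image for p in row]
    # Step 1: histogram, i.e. how many pixels have each intensity.
    histogram = [pixels.count(level) for level in range(levels)]
    # Step 2: cumulative histogram (the unnormalized CDF).
    cumulative, running = [], 0
    for count in histogram:
        running += count
        cumulative.append(running)
    # Step 3: T(r) = floor((L - 1) * CDF(r)); integer division performs the
    # floor exactly and avoids floating-point rounding issues.
    transform = [((levels - 1) * c) // len(pixels) for c in cumulative]
    # Step 4: apply the transformation to every pixel.
    return [[transform[p] for p in row] for row in image]

# Hypothetical 2 x 4 image with 3-bit intensities (not Figure 1a).
toy_image = [[0, 1, 1, 2],
             [2, 2, 3, 7]]
equalized = equalize_histogram(toy_image)
print(equalized)  # [[0, 2, 2, 5], [5, 5, 6, 7]]
```

The intermediate lists histogram, cumulative, and transform correspond to the steps the report should show.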

Task 2: Programming [1.0pt]
In this task, you can choose to use either the provided python files (task2ab.py, task2c.py) or jupyter notebooks (task2ab.ipynb, task2c.ipynb).

Basic Image Processing

Converting a color image to grayscale representation can be done by taking a weighted average of the three color channels, red (R), green (G), and blue (B). One such weighted average - used by the sRGB color space - is:

$$\text{grey}_{i,j} = 0.212\,R_{i,j} + 0.7152\,G_{i,j} + 0.0722\,B_{i,j} \tag{1}$$
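The weighted average can be written as a dot product over the colour axis. This is a minimal sketch, assuming the image is a NumPy array of shape H × W × 3 in RGB order; the name to_greyscale is hypothetical and is not the signature used in the starter code.

```python
import numpy as np

def to_greyscale(rgb):
    """Convert an H x W x 3 RGB array to an H x W greyscale array."""
    weights = np.array([0.212, 0.7152, 0.0722])  # R, G, B weights from Equation 1
    # The product over the last axis computes the weighted sum for each pixel.
    return rgb @ weights

# Hypothetical 1 x 2 image: one pure-red pixel and one white pixel.
rgb = np.array([[[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]])
grey = to_greyscale(rgb)
print(grey.shape)  # (1, 2): the colour axis is gone
```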

Complete the following tasks in python3. Use the functions given in file task2ab.py in the starter code.

NOTE: Do not change the name of the file, the signature of the function, or the type of the returned image in the function. Task 2 will be automatically evaluated, and to ensure that the return output of your function has the correct shape, we have included a set of assertions at the end of the given code. Do not change this.

(a)    [0.1pt] Implement a function that converts an RGB image to greyscale. Use Equation 1. Implement this in the function greyscale.

In your report, include the image lake.jpg as a greyscale image.

(b)    [0.2pt] Implement a function that takes a greyscale image and applies the intensity transformation T(p) = 1 − p. Implement this in the function inverse.

In your report, include the result of applying the transformation to lake.jpg.

Tip: if the image is in the range [0,255], then the transformation must be changed to T(p) = 255−p.
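The transformation and the tip about the [0, 255] range can be combined in one helper. This is a sketch under the assumption that integer images use the [0, 255] range and floating-point images use [0, 1]; the name invert is hypothetical and differs from the starter code's inverse function.

```python
import numpy as np

def invert(image):
    """Apply T(p) = max_value - p to a greyscale image."""
    # Assumption: integer dtypes mean the [0, 255] range, float dtypes mean [0, 1].
    max_value = 255 if np.issubdtype(image.dtype, np.integer) else 1.0
    return max_value - image

print(invert(np.array([0, 100, 255], dtype=np.uint8)))  # [255 155   0]
print(invert(np.array([0.0, 0.25, 1.0])))               # [1.   0.75 0.  ]
```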

Spatial Convolution

Equation 2 shows two convolutional kernels: ha is a 3 × 3 Sobel kernel, and hb is a 5 × 5 approximated Gaussian kernel.

                                                                                                   (2)

(c)    [0.7pt] Implement a function that takes an RGB image and a convolutional kernel as input, and performs 2D spatial convolution. Assume the size of the kernel is odd numbered, e.g. 3 × 3, 5 × 5, or 7 × 7. You must implement the convolution operation yourself from scratch.

Implement the function in convolve_im.

You are not required to implement a procedure for adding or removing padding (you can return zero in cases when the convolutional kernel goes outside the original image).

In your report, test out the convolution function you made. Convolve the image lake.jpg with the sobel kernel (ha) and the smoothing kernel (hb) in Equation 2. Show both images in your report.

Tip: To convolve a color image, convolve each channel separately and concatenate them afterward.
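One possible shape for such a function is sketched below. It assumes zero padding at the borders (one choice among several) and an image stored as an H × W × C NumPy array. The names convolve_channel and convolve_rgb are hypothetical and do not match the starter code's convolve_im signature. The kernel used for illustration is the Sobel kernel from Figure 1b.

```python
import numpy as np

def convolve_channel(channel, kernel):
    """2D spatial convolution of one H x W channel with an odd-sized kernel.

    Pixels outside the image are treated as zero (zero padding), so the
    output has the same shape as the input.
    """
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    padded = np.pad(channel.astype(float), ((pad_h, pad_h), (pad_w, pad_w)))
    # Convolution flips the kernel along both axes; cross-correlation would not.
    flipped = kernel[::-1, ::-1]
    out = np.zeros(channel.shape, dtype=float)
    for i in range(channel.shape[0]):
        for j in range(channel.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * flipped)
    return out

def convolve_rgb(image, kernel):
    """Convolve each colour channel separately and stack the results."""
    channels = [convolve_channel(image[..., c], kernel)
                for c in range(image.shape[-1])]
    return np.stack(channels, axis=-1)

sobel = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]])  # kernel from Figure 1b

# Convolving a unit impulse returns the kernel itself, a handy sanity check.
impulse = np.zeros((3, 3))
impulse[1, 1] = 1
print(convolve_channel(impulse, sobel))
```

The explicit double loop is slow on large images, but it keeps the correspondence with the definition of convolution easy to follow.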

Neural Networks
Task 3: Theory [1.0pt]
A neural network consists of a number of parameters (weights or biases). To train a neural network, we require a cost function (also known as an error function, loss function, or an objective function). A typical cost function for regression problems is the L2 loss.

$$C(y, \hat{y}) = \frac{1}{2}(y - \hat{y})^2, \tag{3}$$

where ŷ is the output of our neural network, and y is the target value of the training example. This cost function is used to optimize our parameters by showing our neural network several training examples with given target values.

To find the direction we want to update our parameters, we use gradient descent. For each training example, we can update each parameter with the following:

$$\theta_{t+1} = \theta_t - \alpha \frac{\partial C}{\partial \theta_t}, \tag{4}$$

where α is the learning rate, and θ_t is the parameter at time step t.
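A single update can be worked through numerically. The sketch below rests on two assumptions that are not stated in the text: the L2 loss is taken in the common form C = ½(y − ŷ)², and the network is reduced to a one-weight model ŷ = w·x. All function names are hypothetical. The analytic gradient is compared against a finite-difference estimate as a sanity check.

```python
def l2_loss(y, y_hat):
    """Assumed form of the L2 loss: C = 0.5 * (y - y_hat)**2."""
    return 0.5 * (y - y_hat) ** 2

def gradient_step(w, x, y, learning_rate):
    """One gradient descent update for the one-weight model y_hat = w * x."""
    y_hat = w * x                      # forward pass
    grad = -(y - y_hat) * x            # chain rule: dC/dw = dC/dy_hat * dy_hat/dw
    return w - learning_rate * grad, grad

w, x, y = 0.0, 2.0, 1.0
w_next, grad = gradient_step(w, x, y, learning_rate=0.1)

# Central finite difference as an independent check of the analytic gradient.
eps = 1e-6
numeric = (l2_loss(y, (w + eps) * x) - l2_loss(y, (w - eps) * x)) / (2 * eps)
print(grad, numeric, w_next)  # the two gradients should agree closely
```

Here the gradient is negative, so the update increases w and moves the prediction towards the target.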

By using this knowledge, we can derive a typical approach to update our parameters over N training examples.

 

Algorithm 1 Stochastic Gradient Descent

1: procedure SGD
2:     w_0 ← 0
3:     for n = 0, ..., N do
4:         x_n, y_n ← Select training sample n
5:         ŷ_n ← Forward pass x_n through our network
6:         w_{n+1} ← w_n − α ∂C/∂w_n
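The procedure can be sketched in Python for a one-weight model. This is an illustration under stated assumptions rather than the course's reference implementation: the model is ŷ = w·x, the loss is assumed to be C = ½(y − ŷ)², the data are hypothetical, and the loop is repeated over several passes (epochs) so that the toy example converges.

```python
def sgd(samples, learning_rate=0.1, epochs=1):
    """Fit y_hat = w * x with one weight update per training sample."""
    w = 0.0                                # initialize the weight to zero
    for _ in range(epochs):
        for x, y in samples:               # select training sample n
            y_hat = w * x                  # forward pass through the "network"
            grad = -(y - y_hat) * x        # dC/dw for C = 0.5 * (y - y_hat)**2
            w = w - learning_rate * grad   # gradient descent update
    return w

# Hypothetical data generated by y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (0.5, 1.5)]
w_fit = sgd(data, learning_rate=0.1, epochs=50)
print(w_fit)  # close to 3.0
```

Because the weight is updated after every sample instead of after a full pass over the data, each step follows a noisy estimate of the overall gradient, which is what makes the method stochastic.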

 

(a)    [0.1pt] A single-layer neural network is a linear function. Give an example of a binary operation that a single-layer neural network cannot represent (either AND, OR, NOT, NOR, NAND, or XOR).

(b)    [0.1pt] Explain in one sentence what a hyperparameter for a neural network is. Give two examples of a hyperparameter.

(c)    [0.1pt] Why is the softmax activation function used in the last layer of neural networks trained to classify objects?

(d)   [0.5pt] Figure 2 shows a simple neural network. Perform a forward pass and backward pass on this network with the given input values. Use Equation 3 as the cost function and let the target value be y = 1.

Find and report the final values for  , and .

Explain each step in the computation, such that it is clear how you compute the derivatives.

(e)    [0.2pt] Compute the updated weights w1, w3, and b1 by using gradient descent and the values you found in Task 3(d). Use α = 0.1.
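As background for Task 3(c), the softmax function itself can be sketched in a few lines. The input scores are hypothetical, and subtracting the maximum is a common numerical-stability trick rather than part of the definition.

```python
import math

def softmax(logits):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])   # hypothetical scores for three classes
print(probs, sum(probs))           # non-negative values that sum to 1
```

The ordering of the scores is preserved, so the class with the largest score also receives the largest probability.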
