
MIDTERM

Advanced Machine Learning DATA 442/642

Exercise 1 

Show that the ℓ1 norm is a convex function (as all norms are), yet it is not strictly convex. In contrast, show that the squared Euclidean norm is a strictly convex function.
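A numerical illustration of both claims (not a proof): for two distinct points, the ℓ1 norm of the midpoint can equal the average of the norms, while the squared Euclidean norm gives a strict inequality.

```python
import numpy as np

# Two distinct points whose midpoint violates strictness for the l1 norm.
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
mid = (x + y) / 2

l1_mid = np.abs(mid).sum()                        # ||(x+y)/2||_1 = 1.0
l1_avg = (np.abs(x).sum() + np.abs(y).sum()) / 2  # average of norms = 1.0

sq_mid = np.dot(mid, mid)                         # ||(x+y)/2||^2 = 0.5
sq_avg = (np.dot(x, x) + np.dot(y, y)) / 2        # average = 1.0

print(l1_mid, l1_avg)  # equal: l1 is convex but not strictly convex
print(sq_mid, sq_avg)  # strictly smaller: squared norm is strictly convex
```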

Exercise 2

Let the observations resulting from an experiment be x_n, n = 1, 2, ..., N. Assume that they are independent and that they originate from a Gaussian PDF with mean µ and variance σ². Both the mean and the variance are unknown. Prove that the maximum likelihood (ML) estimates of these quantities are given by

µ̂_ML = (1/N) Σ_{n=1..N} x_n,    σ̂²_ML = (1/N) Σ_{n=1..N} (x_n − µ̂_ML)².
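As a numerical sanity check (not part of the proof), the closed-form ML estimates can be compared against NumPy's biased (1/N) sample statistics on simulated Gaussian data; the true parameters below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000
mu_true, sigma_true = 2.0, 1.5
x = rng.normal(mu_true, sigma_true, size=N)

mu_ml = x.sum() / N                    # (1/N) sum of x_n
var_ml = ((x - mu_ml) ** 2).sum() / N  # (1/N) sum of squared deviations

# The ML estimates coincide with the biased sample mean and variance,
# and approach the true values as N grows.
print(mu_ml, var_ml)
```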

Exercise 3

For the regression model y = Xθ + η, where the noise vector η = [η1, ..., ηN]⊤ comprises samples from a zero-mean Gaussian random variable with covariance matrix Σn, show that the Fisher information matrix is given by

I(θ) = X⊤ Σn⁻¹ X,

where X is the input matrix.
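A Monte Carlo sketch can make the result plausible: the Fisher information equals the covariance of the score (the gradient of the log-likelihood), so simulating the score at the true θ should reproduce X⊤Σn⁻¹X. The matrix sizes and parameter values below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, l = 5, 2
X = rng.normal(size=(N, l))
A = rng.normal(size=(N, N))
Sigma = A @ A.T + N * np.eye(N)   # an arbitrary SPD noise covariance
Sigma_inv = np.linalg.inv(Sigma)
theta = np.array([1.0, -2.0])

# Draw many noise vectors eta ~ N(0, Sigma) at once via a Cholesky factor.
M = 200_000
L = np.linalg.cholesky(Sigma)
eta = rng.normal(size=(M, N)) @ L.T

# Score at the true theta: X^T Sigma^-1 (y - X theta) = X^T Sigma^-1 eta.
scores = eta @ Sigma_inv @ X      # shape (M, l), one score per simulated y
emp = np.cov(scores.T)            # empirical covariance of the score

fim = X.T @ Sigma_inv @ X         # claimed Fisher information
print(np.max(np.abs(emp - fim)))  # small Monte Carlo discrepancy
```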

Exercise 4

Consider the regression problem described in one of our labs. Read the same audio file, then add white Gaussian noise at a 15 dB level and randomly "hit" 10% of the data samples with outliers (set the outlier values to 80% of the maximum value of the data samples).

(a)    Find the reconstructed data samples obtained by support vector regression. Employ the Gaussian kernel with σ = 0.004 and set ε = 0.003 and C = 1. Plot the fitted curve of the reconstructed samples together with the data used for training.

(b)    Repeat step (a) using C = 0.05, 0.1, 0.5, 5, 10, 100.

(c)    Repeat step (a) using different values of ε.

(d)    Repeat step (a) using σ = 0.001, 0.002, 0.01, 0.05, 0.1.

(e)    Comment on the results.
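A sketch of part (a), assuming scikit-learn's SVR and substituting a synthetic sinusoid for the lab's audio file, which is not reproduced here; scikit-learn expresses the Gaussian kernel width as gamma = 1/(2σ²). Plotting with matplotlib is omitted.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 500)
clean = np.sin(2 * np.pi * 5 * t)   # stand-in for the audio samples

# Add white Gaussian noise at 15 dB SNR.
p_signal = np.mean(clean ** 2)
p_noise = p_signal / 10 ** (15 / 10)
y = clean + rng.normal(scale=np.sqrt(p_noise), size=t.size)

# "Hit" 10% of the samples with outliers at 80% of the maximum value.
idx = rng.choice(t.size, size=t.size // 10, replace=False)
y[idx] = 0.8 * np.abs(y).max()

# Part (a)'s settings: sigma = 0.004, epsilon = 0.003, C = 1.
sigma, eps, C = 0.004, 0.003, 1.0
model = SVR(kernel="rbf", gamma=1 / (2 * sigma ** 2), epsilon=eps, C=C)
model.fit(t.reshape(-1, 1), y)
y_hat = model.predict(t.reshape(-1, 1))  # reconstructed samples
print(y_hat.shape)
```

For parts (b)-(d), the same fit can be wrapped in a loop over the listed C, ε, and σ values.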

Exercise 5 

Show, using Lagrange multipliers, that the ℓ2 minimizer in equation (9.18) from the textbook accepts the closed-form solution

θ̂ = X⊤(XX⊤)⁻¹y.

Now, show that for the system y = Xθ with X ∈ ℝ^{n×l} and n > l, the least squares solution is given by

θ̂ = (X⊤X)⁻¹X⊤y.
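The two closed forms can be checked numerically against NumPy's pseudo-inverse and least-squares routines; the dimensions below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Underdetermined case (n < l): minimum-l2-norm solution X^T (X X^T)^-1 y.
n, l = 5, 12
X = rng.normal(size=(n, l))
y = rng.normal(size=n)
theta_min_norm = X.T @ np.linalg.inv(X @ X.T) @ y
assert np.allclose(X @ theta_min_norm, y)                   # solves the system
assert np.allclose(theta_min_norm, np.linalg.pinv(X) @ y)   # matches the pseudo-inverse

# Overdetermined case (n > l): least squares solution (X^T X)^-1 X^T y.
n, l = 12, 5
X = rng.normal(size=(n, l))
y = rng.normal(size=n)
theta_ls = np.linalg.inv(X.T @ X) @ X.T @ y
assert np.allclose(theta_ls, np.linalg.lstsq(X, y, rcond=None)[0])
```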

Exercise 6 

Show that the null space of a full-rank N × l matrix X is a subspace of dimensionality l − N, for N < l.
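The rank-nullity statement can be checked numerically with SciPy's null_space; the dimensions below are illustrative, and a Gaussian random matrix is full rank with probability one.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(4)
N, l = 4, 10
X = rng.normal(size=(N, l))  # full rank almost surely
Z = null_space(X)            # orthonormal basis of the null space of X
print(Z.shape[1])            # dimensionality l - N = 6
```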

Exercise 7

Generate in Python a sparse vector θ ∈ ℝ^l, l = 100, with its first five components taking random values drawn from a normal distribution with zero mean and unit variance, and the rest being equal to zero. Also build a sensing matrix X with N = 30 rows having samples normally distributed, with zero mean and variance 1/N, in order to get 30 observations based on the linear regression model y = Xθ. Then perform the following tasks.

(a)    Use a LASSO implementation to reconstruct θ from y and X.

(b)    Repeat the experiment 500 times, with different realizations of X, in order to compute the probability of correct reconstruction (assume the reconstruction is exact when ‖θ̂ − θ‖₂ < 10⁻⁸).

(c)    Repeat the same experiment (500 times) with matrices of the form

                   +√(p/N),  with probability 1/(2p)
       X(i,j) =    0,        with probability 1 − 1/p
                   −√(p/N),  with probability 1/(2p)

for p equal to 1, 9, 25, 36, 64 (make sure that each row and each column of X has at least one nonzero component). Give an explanation of why the probability of reconstruction falls as p increases (observe that both the sensing matrix and the unknown vector are sparse).
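A reduced-scale sketch of parts (a)-(b), assuming scikit-learn's Lasso. The regularization strength alpha, the trial count, and the relaxed success threshold are illustrative choices: scikit-learn's coordinate-descent solver will not reach the 10⁻⁸ accuracy of a dedicated basis-pursuit solver, so a looser tolerance is used here.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
l, N, n_trials = 100, 30, 20

# Sparse ground truth: 5 Gaussian entries, the rest zero.
theta = np.zeros(l)
theta[:5] = rng.normal(size=5)

successes = 0
for _ in range(n_trials):
    X = rng.normal(scale=np.sqrt(1 / N), size=(N, l))  # variance 1/N entries
    y = X @ theta                                      # noiseless observations
    lasso = Lasso(alpha=1e-5, max_iter=50_000, tol=1e-10)
    lasso.fit(X, y)
    # Relaxed success criterion (the exercise asks for 1e-8 with an
    # exact solver).
    if np.linalg.norm(lasso.coef_ - theta) < 1e-2:
        successes += 1
print(successes / n_trials)  # estimated probability of correct reconstruction
```

Part (c) only changes how X is drawn: sample each entry from {+√(p/N), 0, −√(p/N)} with the stated probabilities and rerun the same loop for each p.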
