ECE595-Homework 1 Solved


Objective
As the first homework assignment, we would like you to refresh some of the concepts from the Background chapter and get some hands-on experience with Python. Here are the specific objectives.

(a)    Familiarize yourself with tools in Python that will be helpful to you in later parts of the course. In addition to basic Python functions and objects, you will gain experience working with functions that simulate random sampling from probability distributions and visualize the data;

(b)    Review some important concepts in linear algebra and probability, and warm up with some proof techniques that will be used later in the course.

Exercise 1: Installing Python and Getting Started (0 point)
To get started with the homework, please download and install Python on your local machine. Here are a few steps to guide you through. For additional information, e.g., video demonstrations, please visit our course website.

(a)    If you are a beginner to Python, we suggest you download Anaconda at https://www.anaconda.com/download/. Follow the instructions to install it on your local machine.

(b)    Once you have installed Anaconda, open an environment and install Spyder.

(c)     Make sure you have standard packages installed: scipy, numpy, matplotlib, cvxpy, cvxopt, and imageio.

(d)    After you have installed all these packages, open Spyder and type your hello world program.

import numpy as np
import scipy
import matplotlib.pyplot as plt
import cvxpy as cp
import csv
import imageio

print("Hello World!")

If you are already familiar with Python, you may skip this exercise. Please contact our teaching assistants if you need any help.

Exercise 2: Generating 1D Random Variables
In this exercise, we will use Python to draw random samples from a 1D Gaussian and visualize the data using a histogram.

(a)    Let X be a random variable with X ∼ N(µ,σ2). The PDF of X is written explicitly as

                                       fX(x) = 1/√(2πσ²) exp(−(x − µ)²/(2σ²)).                                       (1)

Prove that E[X] = µ and Var[X] = σ2.
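A sketch of the mean computation for part (a), substituting t = x − µ (the variance follows similarly, via integration by parts):

```latex
\mathbb{E}[X] = \int_{-\infty}^{\infty} x \, \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx
\;\overset{t = x-\mu}{=}\;
\underbrace{\int_{-\infty}^{\infty} \frac{t}{\sqrt{2\pi\sigma^2}} e^{-\frac{t^2}{2\sigma^2}} \, dt}_{=\,0 \text{ (odd integrand)}}
\;+\; \mu \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{t^2}{2\sigma^2}} \, dt}_{=\,1}
\;=\; \mu.
```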

(b)    Let µ = 0 and σ = 1 so that X ∼ N(0,1). Plot fX(x) using matplotlib.pyplot.plot for the range x ∈ [−3,3]. Use matplotlib.pyplot.savefig to save your figure.
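A minimal sketch of part (b); the grid density and output filename are arbitrary choices:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs headless
import matplotlib.pyplot as plt

# Evaluate the N(0,1) PDF on the range [-3, 3]
x = np.linspace(-3, 3, 601)
fx = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

plt.plot(x, fx)
plt.xlabel("x")
plt.ylabel("f_X(x)")
plt.savefig("gaussian_pdf.png")
```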

(c)     Let us investigate the use of histograms in data visualization.

(i)       Use numpy.random.normal to draw 1000 random samples from N(0,1).

(ii)     Make two histogram plots using matplotlib.pyplot.hist, with the number of bins m set to 4 and 1000.

(iii)    Use scipy.stats.norm.fit to estimate the mean and standard deviation of your data. Report the estimated values.

(iv)    Plot the fitted Gaussian curve on top of the two histogram plots using scipy.stats.norm.pdf.

(v)      Are the two histograms representative of your data’s distribution? How are they different in terms of data representation?
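The steps in part (c) can be sketched as follows; the random seed and output filenames are arbitrary choices:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend (arbitrary choice)
import matplotlib.pyplot as plt
from scipy.stats import norm

np.random.seed(0)  # seed fixed only for reproducibility
samples = np.random.normal(0, 1, 1000)  # (i) 1000 samples from N(0,1)

# (iii) estimate mean and standard deviation from the data
mu_hat, sigma_hat = norm.fit(samples)

# (ii) + (iv) histograms with m = 4 and m = 1000 bins, fitted curve on top
xs = np.linspace(-4, 4, 400)
for m in (4, 1000):
    plt.figure()
    plt.hist(samples, bins=m, density=True)
    plt.plot(xs, norm.pdf(xs, mu_hat, sigma_hat))
    plt.savefig(f"hist_{m}_bins.png")
```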

(d)    A practical way to estimate the optimal bin width is to make use of what is called the cross validation estimator of risk (CVER) of the dataset. Denoting h = (max data value − min data value)/m as the bin width, with m = the number of bins (assuming you applied no rescaling to your raw data), we seek the h∗ that minimizes the CVER Ĵ(h), expressed as follows:

                                       Ĵ(h) = 2/((n − 1)h) − ((n + 1)/((n − 1)h)) Σ_{j=1}^{m} p̂_j²,                                       (2)

where p̂_j is the empirical probability of a sample falling into the j-th bin, and n is the total number of samples.

Plot Ĵ(h) with respect to m, the number of bins, for m = 1,2,...,200. Find the m∗ that minimizes Ĵ(h), plot the histogram of your data with that m∗, and plot the Gaussian curve fitted to your data on top of your histogram. How is your current histogram different from those you obtained in part (c)?

Note: If you are interested in why Ĵ(h) plays an important role in estimating the optimal bin width of the histogram, see the additional note of this homework.
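The CVER sweep in part (d) might be computed as follows (the seed is an arbitrary choice; plotting is omitted for brevity):

```python
import numpy as np

np.random.seed(0)  # seed fixed only for reproducibility
data = np.random.normal(0, 1, 1000)
n = len(data)
span = data.max() - data.min()

risks = []
for m in range(1, 201):
    h = span / m  # bin width for m bins
    counts, _ = np.histogram(data, bins=m)
    p_hat = counts / n  # empirical probability of each bin
    # CVER: J(h) = 2/((n-1)h) - (n+1)/((n-1)h) * sum_j p_hat_j^2
    J = 2.0 / ((n - 1) * h) - (n + 1) / ((n - 1) * h) * np.sum(p_hat**2)
    risks.append(J)

m_star = int(np.argmin(risks)) + 1  # bin count minimizing the estimated risk
```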

Exercise 3: Generating 2D Random Variables
In this exercise, we consider the following question: suppose that we are given a random number generator that can only generate zero-mean unit variance Gaussians, i.e., X ∼ N(0,I); how do we transform the distribution of X to an arbitrary Gaussian distribution? We will first derive a few equations, and then verify them with an empirical example: drawing samples from the 2D Gaussian, applying the transform to the dataset, and checking whether the transformed dataset really follows the desired Gaussian.

(a)    Let X ∼ N(µ,Σ) be a 2D Gaussian. The PDF of X is given by

                                       fX(x) = 1/(2π|Σ|^(1/2)) exp(−(1/2)(x − µ)ᵀΣ⁻¹(x − µ)),                                       (3)

where in this exercise we assume

                                       X = [X1, X2]ᵀ,    x = [x1, x2]ᵀ,    µ = [ · , · ]ᵀ,    and    Σ = [ · · ; · · ].                                       (4)

(i)       Simplify the expression fX(x) for the particular choices of µ and Σ here. Show your derivation.

(ii)     Using matplotlib.pyplot.contour, plot the contour of fX(x) for the range x ∈ [−1,5]×[0,10].
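A sketch of the contour plot; the values of mu and Sigma below are hypothetical placeholders for the ones specified in the assignment:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Hypothetical parameters for illustration only; substitute the mu and
# Sigma given in Equation (4) of the assignment.
mu = np.array([2.0, 5.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Evaluate the 2D Gaussian PDF on the range [-1, 5] x [0, 10]
x1, x2 = np.meshgrid(np.linspace(-1, 5, 200), np.linspace(0, 10, 200))
pos = np.dstack((x1, x2))
f = multivariate_normal(mu, Sigma).pdf(pos)

plt.contour(x1, x2, f)
plt.xlabel("x1")
plt.ylabel("x2")
plt.savefig("gaussian_contour.png")
```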

(b)    Suppose X ∼ N(0,I). We would like to derive a transformation that can map X to an arbitrary Gaussian.

(i)       Let X ∼ N(0,I) be a d-dimensional random vector. Let A ∈ Rd×d and b ∈ Rd. Let Y = AX + b be an affine transformation of X. Let µY := E[Y] be the mean vector and ΣY := E[(Y − µY)(Y − µY)ᵀ] be the covariance matrix. Show that

                                                                                        µY = b,        and         ΣY = AAT.                                                                (5)

(ii)     Show that ΣY is symmetric positive semi-definite.

(iii)    Under what condition on A would ΣY become a symmetric positive definite matrix?

(iv)    Consider a random variable Y ∼ N(µY ,ΣY) such that

                                       µY = [ · , · ]ᵀ,        and        ΣY = [ · · ; · · ].

Determine A and b which could satisfy Equation (5).

Hint: Consider eigen-decomposition of ΣY . You may compute the eigen-decomposition numerically.
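Following the hint, a numerical sketch; the µY and ΣY below are hypothetical placeholders for the assignment's values:

```python
import numpy as np

# Hypothetical target parameters for illustration only; replace with the
# mu_Y and Sigma_Y specified in the assignment.
mu_Y = np.array([1.0, 2.0])
Sigma_Y = np.array([[2.0, 1.0],
                    [1.0, 2.0]])

# Eigen-decomposition Sigma_Y = U Lambda U^T (eigh handles symmetric input)
lam, U = np.linalg.eigh(Sigma_Y)

# A = U Lambda^{1/2} U^T satisfies A A^T = U Lambda U^T = Sigma_Y
A = U @ np.diag(np.sqrt(lam)) @ U.T
b = mu_Y  # since X has zero mean, mu_Y = b by Equation (5)

print(np.allclose(A @ A.T, Sigma_Y))  # True
```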

(c)     Now let us verify our results from part (b) with an empirical example.

(i)       Use numpy.random.multivariate_normal to draw 5000 random samples from the 2D standard normal distribution, and make a scatter plot of the data points using matplotlib.pyplot.scatter.

(ii)     Apply the affine transformation you derived in part (b)(iv) to the data points, and make a scatter plot of the transformed data points. Now check your answer by using the Python function numpy.linalg.eig to obtain the transformation and making a new scatter plot of the transformed data points.

(iii)    Do your results from parts (c)(i) and (ii) support your theoretical findings from part (b)? You are welcome to utilize Python functions you find useful and include plots in your answer.
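The verification in part (c) can be sketched as follows; Sigma_Y and b below are hypothetical placeholders for the values from part (b)(iv):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt

np.random.seed(0)  # seed fixed only for reproducibility
# (i) 5000 samples from the 2D standard normal
X = np.random.multivariate_normal(np.zeros(2), np.eye(2), 5000)

# (ii) hypothetical affine map; use the A and b you derived in (b)(iv)
Sigma_Y = np.array([[2.0, 1.0],
                    [1.0, 2.0]])
lam, U = np.linalg.eig(Sigma_Y)  # numerical eigen-decomposition
A = U @ np.diag(np.sqrt(lam)) @ U.T
b = np.array([1.0, 2.0])

Y = X @ A.T + b  # row-wise application of y = Ax + b

plt.figure()
plt.scatter(X[:, 0], X[:, 1], s=2, label="original")
plt.scatter(Y[:, 0], Y[:, 1], s=2, label="transformed")
plt.legend()
plt.savefig("scatter_transformed.png")

# Sample statistics should be close to b and Sigma_Y
print(Y.mean(axis=0))
print(np.cov(Y.T))
```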

Exercise 4: Norm and Positive Semi-Definiteness
The aim of this exercise is to reinforce your understanding of the vital concepts of norms, the two famous inequalities, eigen-decomposition, and the notion of positive (semi-)definiteness, which will be ubiquitous throughout the semester.

(a)    Schur’s lemma (one of the several named after Issai Schur) is one of the most commonly used inequalities in estimating quadratic forms. Given a matrix A ∈ Rm×n, vectors x ∈ Rm and y ∈ Rn, the inequality takes the form

                                       |xᵀAy| ≤ √(RC) ‖x‖₂‖y‖₂,    where    R = max_i Σ_j |Aij|    and    C = max_j Σ_i |Aij|.                                       (6)

Prove this inequality.

Hint: Use the Cauchy-Schwarz inequality.
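Assuming R and C denote the maximum absolute row sum and maximum absolute column sum of A (the standard form of Schur's test), a quick numerical sanity check of the inequality:

```python
import numpy as np

np.random.seed(0)  # seed fixed only for reproducibility
for _ in range(100):
    A = np.random.randn(4, 5)
    x = np.random.randn(4)
    y = np.random.randn(5)
    R = np.max(np.sum(np.abs(A), axis=1))  # max absolute row sum
    C = np.max(np.sum(np.abs(A), axis=0))  # max absolute column sum
    lhs = abs(x @ A @ y)
    rhs = np.sqrt(R * C) * np.linalg.norm(x) * np.linalg.norm(y)
    assert lhs <= rhs + 1e-12  # |x^T A y| <= sqrt(RC) ||x|| ||y||
```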

(b)    Recall from the lectures the concepts related to positive (semi-)definite matrices.

(i)       Prove that any positive definite matrix A is invertible.

(ii)     Find a function f : R2 → R whose Hessian is invertible but not positive definite anywhere in R2.

(iii)    Under what extra condition is any positive semi-definite matrix positive definite? Justify your answer.

(c)     Recall the concept of eigen-decomposition: for any symmetric matrix A ∈ Rn×n, there exist a diagonal matrix Λ ∈ Rn×n with the eigenvalues of A on its diagonal, and an orthogonal matrix U ∈ Rn×n with the (orthonormal) eigenvectors of A as its columns, such that A = UΛUT. Prove that there exists A† ∈ Rn×n such that the following holds:

                                                                                                     AA†A = A                                                                                     (7)

Hint: You can use the fact that, for symmetric A with rank k ≤ n, it is possible to eigen-decompose A such that the first k diagonal entries of Λ are nonzero, and the rest are all zeros. Then define A† = UΛ⁻UT where [Λ⁻]j,j = ([Λ]j,j)⁻¹ for 1 ≤ j ≤ k, and 0 everywhere else. A† is what is called the pseudoinverse of A.
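Under the hint's construction, the pseudoinverse property can be checked numerically (the rank-deficient test matrix below is an arbitrary example):

```python
import numpy as np

np.random.seed(0)  # seed fixed only for reproducibility
# Build an arbitrary symmetric matrix of rank 2 (size 4x4)
B = np.random.randn(4, 2)
A = B @ B.T

# Eigen-decompose A = U Lambda U^T
lam, U = np.linalg.eigh(A)

# Invert only the nonzero eigenvalues (the 1e-10 tolerance is arbitrary)
lam_inv = np.array([1.0 / l if abs(l) > 1e-10 else 0.0 for l in lam])
A_dag = U @ np.diag(lam_inv) @ U.T

print(np.allclose(A @ A_dag @ A, A))  # True
```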
