Machine-Learning-Homework 5 Solved

I. Gaussian Process

In this section, you are going to implement the Gaussian Process and visualize the result.

● Training data

o input.data is a 34x2 matrix. Every row corresponds to a 2D data point (Xi, Yi).

o Yi = f(Xi) + εi is a noisy observation, where εi ~ N(∙|0, β^(-1)). You can use β = 5 in this implementation.

● What you are going to do

o Part1: Apply Gaussian Process Regression to predict the distribution of f and visualize the result. Please use a rational quadratic kernel to compute similarities between different points.

Details of the visualization:

-    Show all training data points.

-    Draw a line to represent the mean of f in range [-60,60].

-    Mark the 95% confidence interval of f.  

(You can use matplotlib.pyplot to visualize the result, e.g. use matplotlib.pyplot.fill_between to mark the 95% confidence interval, or you can use any other package you like.)
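As a rough illustration of Part1, here is a minimal sketch of Gaussian process regression with a rational quadratic kernel. Several things are assumptions, not part of the assignment: the kernel parameterization k(a, b) = sigma^2 (1 + (a - b)^2 / (2 alpha length^2))^(-alpha), the parameter names `sigma`, `alpha`, and `length`, the helper names, and the synthetic data standing in for input.data. The variance returned is that of the latent f; adding 1/β would give the interval for a new noisy observation instead.

```python
import numpy as np

def rational_quadratic(xa, xb, sigma=1.0, alpha=1.0, length=1.0):
    """Rational quadratic kernel between two 1-D arrays (assumed parameterization)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return sigma**2 * (1.0 + sq_dist / (2.0 * alpha * length**2)) ** (-alpha)

def gp_predict(x_train, y_train, x_test, beta=5.0, **kernel_params):
    """Return the predictive mean and variance of f at x_test."""
    n = len(x_train)
    # Covariance of the noisy targets: C = K + (1/beta) I
    C = rational_quadratic(x_train, x_train, **kernel_params) + np.eye(n) / beta
    k_star = rational_quadratic(x_train, x_test, **kernel_params)   # shape (n, m)
    k_ss = np.diag(rational_quadratic(x_test, x_test, **kernel_params))
    # Cholesky solves avoid forming C^{-1} explicitly
    L = np.linalg.cholesky(C)
    weights = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = k_star.T @ weights
    v = np.linalg.solve(L, k_star)
    var = k_ss - np.sum(v**2, axis=0)
    return mean, var

def plot_gp(x_train, y_train, x_test, mean, var):
    """Scatter the data, draw the mean, and shade mean +/- 1.96 std."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    std = np.sqrt(np.maximum(var, 0.0))
    plt.scatter(x_train, y_train, color="k", label="training data")
    plt.plot(x_test, mean, label="mean of f")
    plt.fill_between(x_test, mean - 1.96 * std, mean + 1.96 * std,
                     alpha=0.3, label="95% interval")
    plt.xlim(-60, 60)
    plt.legend()
    plt.show()

# Synthetic stand-in for input.data (34 points, noise variance 1/beta)
rng = np.random.default_rng(0)
x_train = np.linspace(-50.0, 50.0, 34)
y_train = np.sin(x_train / 5.0) + rng.normal(scale=np.sqrt(1.0 / 5.0), size=34)
x_test = np.linspace(-60.0, 60.0, 200)

mean, var = gp_predict(x_train, y_train, x_test, beta=5.0,
                       sigma=1.0, alpha=1.0, length=1.0)
```

With the real data, something like `data = np.loadtxt("input.data")` followed by `x_train, y_train = data[:, 0], data[:, 1]` would replace the synthetic arrays, and `plot_gp(x_train, y_train, x_test, mean, var)` would draw the figure.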

o Part2: Optimize the kernel parameters by minimizing the negative marginal log-likelihood, and visualize the result again. (You can use scipy.optimize.minimize to optimize the parameters.)
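One possible way to set up Part2 is sketched below. For targets y with covariance C = K + (1/β) I, the negative marginal log-likelihood is 0.5 yᵀ C⁻¹ y + 0.5 log|C| + (n/2) log 2π. The kernel parameterization, the choice to optimize the logarithms of the parameters (which keeps them positive), the bounds, the L-BFGS-B method, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def rq_kernel(xa, xb, sigma, alpha, length):
    """Rational quadratic kernel between two 1-D arrays (assumed parameterization)."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return sigma**2 * (1.0 + sq_dist / (2.0 * alpha * length**2)) ** (-alpha)

def neg_log_marginal_likelihood(log_theta, x, y, beta=5.0):
    """Negative log p(y | x, theta), with theta given as log(sigma, alpha, length)."""
    sigma, alpha, length = np.exp(log_theta)
    n = len(x)
    C = rq_kernel(x, x, sigma, alpha, length) + np.eye(n) / beta
    L = np.linalg.cholesky(C)
    w = np.linalg.solve(L.T, np.linalg.solve(L, y))   # w = C^{-1} y
    # log|C| = 2 * sum(log(diag(L))), so 0.5 * log|C| is the plain sum
    return 0.5 * y @ w + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2.0 * np.pi)

# Synthetic stand-in for input.data
rng = np.random.default_rng(0)
x = np.linspace(-50.0, 50.0, 34)
y = np.sin(x / 5.0) + rng.normal(scale=np.sqrt(1.0 / 5.0), size=x.size)

theta0 = np.zeros(3)   # sigma = alpha = length = 1 as the starting point
result = minimize(neg_log_marginal_likelihood, theta0, args=(x, y),
                  method="L-BFGS-B", bounds=[(-5.0, 5.0)] * 3)
sigma_opt, alpha_opt, length_opt = np.exp(result.x)
```

The optimized values can then be passed back to the regression step to redraw the mean and the 95% interval. The objective is generally not convex, so trying a few different starting points may be worthwhile.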

             

II. SVM on MNIST dataset

Use SVM models to classify images of hand-written digits (the digit classes range only from 0 to 4, as shown in the figure below).

  

● Training data

o   X_train.csv is a 5000x784 matrix. Every row corresponds to a 28x28 grayscale image.

o   Y_train.csv is a 5000x1 matrix, which records the class of the training samples.

● Testing data

o X_test.csv is a 2500x784 matrix. Every row corresponds to a 28x28 grayscale image.

o Y_test.csv is a 2500x1 matrix, which records the class of the test samples.

● What you are going to do

o Part1: Use different kernel functions (linear, polynomial, and RBF kernels) and compare their performance.

o Part2: Please use C-SVC (soft-margin SVM; you can select it by setting parameters in the function input). Since there are some parameters you need to tune, please do a grid search to find the parameters of the best-performing model. For instance, C-SVC has a parameter C, and the RBF kernel adds another parameter γ; you can search for the pair (C, γ) that gives the best performance in cross-validation. (There are lots of resources on the internet; just google for it.)
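The grid search in Part2 could be organized roughly as follows. The helper names `grid_search`, `libsvm_cv_accuracy`, and `toy_score` are hypothetical, and the grid values are arbitrary. The libsvm wrapper assumes the official Python package is installed and importable as `libsvm.svmutil`; it relies on `svm_train` returning the cross-validation accuracy when the `-v` option is given. A toy scorer is used here so the sketch runs without the MNIST files.

```python
import itertools
import math

def grid_search(cv_accuracy, param_grid):
    """Try every parameter combination and keep the one with the best score.

    cv_accuracy: callable taking the parameters as keywords and returning
                 a cross-validation accuracy (higher is better).
    param_grid:  dict mapping parameter name -> list of candidate values.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def libsvm_cv_accuracy(y, x, folds=5):
    """Build a scorer around libsvm's n-fold cross-validation (assumes libsvm is installed)."""
    from libsvm.svmutil import svm_train  # imported lazily; not needed for the toy run

    def score(C, gamma):
        # -s 0: C-SVC, -t 2: RBF kernel, -v: n-fold CV, -q: quiet
        return svm_train(y, x, f"-s 0 -t 2 -c {C} -g {gamma} -v {folds} -q")

    return score

def toy_score(C, gamma):
    """Stand-in scorer that peaks at C = 10, gamma = 0.01 (for illustration only)."""
    return 100.0 - abs(math.log10(C) - 1.0) - abs(math.log10(gamma) + 2.0)

grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
best_params, best_score = grid_search(toy_score, grid)
```

With real data, `grid_search(libsvm_cv_accuracy(y_train, x_train), grid)` would replace the toy call. A logarithmically spaced grid is a common starting point, since useful values of C and γ often span several orders of magnitude.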

o Part3: Use a linear kernel and an RBF kernel together (thereby forming a new kernel function) and compare its performance with the others. You will need to find out how to use a user-defined kernel in libsvm.
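One way to approach Part3 is libsvm's precomputed-kernel mode (`-t 4`), where you compute the kernel matrix yourself and prepend a column of 1-based serial numbers to each row. The sketch below builds such matrices for K(a, b) = a·b + exp(-γ‖a - b‖²). The unweighted sum, the helper names, the value of `gamma`, and the random stand-in data are assumptions. The libsvm calls are left as comments because they need the libsvm Python package and the label vectors.

```python
import numpy as np

def linear_plus_rbf(A, B, gamma):
    """K[i, j] = A[i].B[j] + exp(-gamma * ||A[i] - B[j]||^2)."""
    linear = A @ B.T
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * linear)
    # Clip tiny negative values caused by floating-point round-off
    return linear + np.exp(-gamma * np.maximum(sq_dist, 0.0))

def to_libsvm_precomputed(K):
    """Prepend the serial-number column used by libsvm's precomputed format."""
    ids = np.arange(1, K.shape[0] + 1)[:, None]
    return np.hstack([ids, K])

# Random stand-ins for X_train.csv / X_test.csv (rows are samples)
rng = np.random.default_rng(0)
X_train = rng.random((6, 4))
X_test = rng.random((3, 4))
gamma = 1.0 / X_train.shape[1]

# Training rows hold K(train_i, train_j); test rows hold K(test_i, train_j)
K_train = to_libsvm_precomputed(linear_plus_rbf(X_train, X_train, gamma))
K_test = to_libsvm_precomputed(linear_plus_rbf(X_test, X_train, gamma))

# With libsvm installed and labels y_train / y_test available:
# from libsvm.svmutil import svm_problem, svm_parameter, svm_train, svm_predict
# prob = svm_problem(y_train, K_train.tolist(), isKernel=True)
# model = svm_train(prob, svm_parameter("-s 0 -t 4 -c 1 -q"))
# labels, acc, vals = svm_predict(y_test, K_test.tolist(), model)
```

Since γ is still a free parameter of the combined kernel, it can be tuned together with C by recomputing the kernel matrices for each candidate value.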
