I. Gaussian Process
In this section, you are going to implement Gaussian Process regression and visualize the result.
● Training data
o input.data is a 34x2 matrix. Every row corresponds to a 2D data point (Xi, Yi).
o Yi = f(Xi) + 𝜖i is a noisy observation, where 𝜖i ~ N(∙|0, β⁻¹), i.e. β is the noise precision and the noise variance is 1/β. You can use β = 5 in this implementation.
● What you are going to do
o Part1: Apply Gaussian Process Regression to predict the distribution of f and visualize the result. Please use a rational quadratic kernel to compute similarities between different points.
Details of the visualization:
- Show all training data points.
- Draw a line to represent the mean of f over the range [-60, 60].
- Mark the 95% confidence interval of f.
(You can use matplotlib.pyplot to visualize the result, e.g. use matplotlib.pyplot.fill_between to mark the 95% confidence interval, or you can use any other package you like.)
o Part2: Optimize the kernel parameters by minimizing the negative marginal log-likelihood, and visualize the result again. (You can use scipy.optimize.minimize to optimize the parameters.)
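Parts 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not a reference solution. It assumes the common rational quadratic form k(x, x') = σ² (1 + (x - x')² / (2αℓ²))^(-α). The parameter names sigma, alpha and length, the log-space optimization with box bounds, and the synthetic data standing in for input.data are all choices made for this example.

```python
import numpy as np
from scipy.optimize import minimize

BETA = 5.0  # noise precision suggested in the assignment


def rational_quadratic(xa, xb, sigma, alpha, length):
    """k(x, x') = sigma^2 * (1 + (x - x')^2 / (2 * alpha * length^2)) ** (-alpha)."""
    sq_dist = (xa.reshape(-1, 1) - xb.reshape(1, -1)) ** 2
    return sigma**2 * (1.0 + sq_dist / (2.0 * alpha * length**2)) ** (-alpha)


def gp_predict(x, y, x_star, params, beta=BETA):
    """Posterior mean and variance of f at the query points x_star."""
    cov = rational_quadratic(x, x, *params) + np.eye(len(x)) / beta
    k_star = rational_quadratic(x, x_star, *params)
    mean = k_star.T @ np.linalg.solve(cov, y)
    # k(x*, x*) equals sigma^2 for this kernel; only the diagonal is needed.
    prior_var = np.full(len(x_star), params[0] ** 2)
    var = prior_var - np.sum(k_star * np.linalg.solve(cov, k_star), axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off


def neg_log_marginal_likelihood(log_params, x, y, beta=BETA):
    """0.5 * y^T C^-1 y + 0.5 * log|C| + (n / 2) * log(2 * pi)."""
    params = np.exp(log_params)  # log-parameterization keeps all three positive
    cov = rational_quadratic(x, x, *params) + np.eye(len(x)) / beta
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (y @ np.linalg.solve(cov, y) + logdet + len(x) * np.log(2.0 * np.pi))


# Synthetic stand-in for input.data (assumption); with the real file one
# could load the two columns with np.loadtxt("input.data") instead.
rng = np.random.default_rng(0)
x = rng.uniform(-60.0, 60.0, size=34)
y = np.sin(x / 10.0) + rng.normal(0.0, np.sqrt(1.0 / BETA), size=34)
x_star = np.linspace(-60.0, 60.0, 200)

# Part1: prediction with hand-picked parameters (sigma, alpha, length).
initial = np.array([1.0, 1.0, 1.0])
mean0, var0 = gp_predict(x, y, x_star, initial)

# Part2: minimize the negative marginal log-likelihood over log-parameters.
result = minimize(neg_log_marginal_likelihood, np.log(initial), args=(x, y),
                  method="L-BFGS-B", bounds=[(-5.0, 5.0)] * 3)
best = np.exp(result.x)
mean1, var1 = gp_predict(x, y, x_star, best)

# 95% interval of f: mean +/- 1.96 posterior standard deviations.
lower = mean1 - 1.96 * np.sqrt(var1)
upper = mean1 + 1.96 * np.sqrt(var1)
```

The band can then be drawn with matplotlib.pyplot.fill_between(x_star, lower, upper, alpha=0.3) on top of a scatter plot of the training points. The variance here is that of the latent f. If the interval is meant to cover noisy observations instead, 1/β would be added to var before taking the square root.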
II. SVM on MNIST dataset
Use SVM models to classify images of hand-written digits (the digit classes range only from 0 to 4, as shown in the figure below).
● Training data
o X_train.csv is a 5000x784 matrix. Every row corresponds to a 28x28 grayscale image.
o Y_train.csv is a 5000x1 matrix, which records the class of the training samples.
● Testing data
o X_test.csv is a 2500x784 matrix. Every row corresponds to a 28x28 grayscale image.
o Y_test.csv is a 2500x1 matrix, which records the class of the test samples.
● What you are going to do
o Part1: Use different kernel functions (linear, polynomial, and RBF kernels) and compare their performance.
o Part2: Please use C-SVC (the soft-margin SVM; you can select it through the parameters passed to the training function). Since several parameters need to be tuned, please do a grid search to find the parameters of the best-performing model. For instance, C-SVC has a parameter C, and the RBF kernel adds another parameter 𝛾, so you can search for the pair (C, 𝛾) that gives the best performance in cross-validation. (There are lots of resources on the internet; just search for them.)
o Part3: Use the linear kernel and the RBF kernel together (the combination is itself a new kernel function) and compare its performance with that of the other kernels. You will need to find out how to use a user-defined kernel in libsvm.
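The grid search of Part2 and the user-defined kernel of Part3 might be combined along the following lines. This is a hedged sketch with several assumptions: the combined kernel is taken to be the plain sum of a linear and an RBF kernel, the helper names (linear_plus_rbf, to_precomputed, grid_search, libsvm_cv_score) are made up for this example, and the libsvm call assumes the official Python interface (libsvm.svmutil) is installed. libsvm's precomputed-kernel mode (-t 4) expects each row to start with a 1-based sample serial number, which is what to_precomputed adds.

```python
import itertools

import numpy as np


def linear_plus_rbf(xa, xb, gamma):
    """K(x, x') = x . x' + exp(-gamma * ||x - x'||^2), assumed here to be a plain sum."""
    linear = xa @ xb.T
    sq_dist = (np.sum(xa**2, axis=1)[:, None]
               + np.sum(xb**2, axis=1)[None, :]
               - 2.0 * linear)
    # Clip small negative values caused by round-off before exponentiating.
    return linear + np.exp(-gamma * np.maximum(sq_dist, 0.0))


def to_precomputed(kernel):
    """Prepend the 1-based serial-number column used by libsvm's -t 4 mode."""
    ids = np.arange(1, kernel.shape[0] + 1).reshape(-1, 1)
    return np.hstack([ids, kernel])


def grid_search(score_fn, c_values, gamma_values):
    """Return (best_score, best_c, best_gamma) over the Cartesian grid."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


def libsvm_cv_score(x_train, y_train, c, gamma, folds=3):
    """Cross-validation accuracy of C-SVC with the combined kernel.

    Assumes the libsvm Python package is available; it is imported lazily
    so the helpers above can be used without it.
    """
    from libsvm.svmutil import svm_parameter, svm_problem, svm_train

    kernel = to_precomputed(linear_plus_rbf(x_train, x_train, gamma))
    labels = np.asarray(y_train, dtype=float).ravel().tolist()
    prob = svm_problem(labels, kernel.tolist(), isKernel=True)
    # -s 0: C-SVC, -t 4: precomputed kernel, -v: n-fold cross-validation.
    param = svm_parameter(f"-s 0 -t 4 -c {c} -v {folds} -q")
    return svm_train(prob, param)  # with -v this is the CV accuracy


# Tiny illustration of the kernel helpers on made-up points.
points = np.array([[1.0, 0.0], [0.0, 1.0]])
example_kernel = to_precomputed(linear_plus_rbf(points, points, gamma=0.5))
```

With the training arrays loaded as X and y, a search could then look like grid_search(lambda c, g: libsvm_cv_score(X, y, c, g), [0.1, 1, 10], [1e-3, 1e-2]). The grid values here are placeholders rather than recommended settings. For the built-in kernels of Part1 and Part2, the same grid_search helper can wrap a scorer that passes -t 0, -t 1 or -t 2 (and -g for 𝛾) to libsvm instead of a precomputed matrix.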