You are given two files, train.csv and test.csv, containing the training data and test data respectively. You can download the files from here.
Each file contains two columns -- a feature and a label.
1. Understanding the data and simple curve fitting [5 + 25 = 30 marks]
a. Plot a feature vs label graph for both the training data and the test data.
b. Write code to fit a curve that minimizes the squared-error cost function using gradient descent (with learning rate 0.05), as discussed in class, on the training set. The model takes the form y = W^T Φ_n(x), where W ∈ R^(n+1) and Φ_n(x) = [1, x, x^2, x^3, ..., x^n]. The squared error is defined as

J(W) = Σ_{i=1}^{N} (y_i − W^T Φ_n(x_i))^2,

where N is the number of samples.
In your experiment, vary n from 1 to 9. In other words, fit 9 different curves (polynomials of degree 1, 2, ..., 9) to the training data, estimating the parameters W for each. Use the estimated W to predict labels on the test data and measure the squared error on the test set; call this the test error.
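A minimal sketch of the fitting step in 1b, assuming the feature and label columns have been loaded into NumPy arrays. The iteration count is an assumption (the assignment only fixes the learning rate), and the gradient is averaged over samples so that the 0.05 step stays stable; this rescaling changes the step size, not the minimizer.

```python
import numpy as np

def design_matrix(x, n):
    # Phi_n(x) = [1, x, x^2, ..., x^n], one row per sample
    return np.vander(x, n + 1, increasing=True)

def fit_gd(x, y, n, lr=0.05, iters=5000):
    # Batch gradient descent on J(W) = sum_i (y_i - W^T Phi_n(x_i))^2,
    # with the gradient averaged over the samples for stability.
    Phi = design_matrix(x, n)
    W = np.zeros(n + 1)
    for _ in range(iters):
        W -= lr * (2.0 / len(x)) * Phi.T @ (Phi @ W - y)
    return W

def squared_error(x, y, W, n):
    # J evaluated at W on the given data
    r = design_matrix(x, n) @ W - y
    return float(r @ r)
```

Note that for the higher degrees, gradient descent on raw powers of x only behaves well when x is scaled to roughly [0, 1] (or standardized); otherwise the x^9 column dominates and the iteration diverges.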
2. Visualization of the fitted curves [10 + 10 = 20 marks]
a. Draw separate plots of all 9 curves that you fit to the training dataset in 1b.
b. Report the squared error on both the train and test data for each value of n in a single plot: along the x-axis vary n from 1 to 9, and along the y-axis plot both the training error and the test error. Explain which value of n is most suitable for your dataset, and why.
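The error-vs-degree sweep in 2b can be sketched as follows. The data here is a synthetic stand-in (a noisy sine, an assumption for illustration); in the assignment you would load the feature/label columns from train.csv and test.csv instead.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for train.csv / test.csv
x_tr = rng.uniform(0, 1, 40)
y_tr = np.sin(2 * np.pi * x_tr) + 0.1 * rng.standard_normal(40)
x_te = rng.uniform(0, 1, 40)
y_te = np.sin(2 * np.pi * x_te) + 0.1 * rng.standard_normal(40)

def fit_gd(x, y, n, lr=0.05, iters=20000):
    # Same gradient-descent fit as in 1b (gradient averaged over samples)
    Phi = np.vander(x, n + 1, increasing=True)
    W = np.zeros(n + 1)
    for _ in range(iters):
        W -= lr * (2.0 / len(x)) * Phi.T @ (Phi @ W - y)
    return W

def sq_err(x, y, W):
    r = np.vander(x, len(W), increasing=True) @ W - y
    return float(r @ r)

train_err, test_err = [], []
for n in range(1, 10):
    W = fit_gd(x_tr, y_tr, n)
    train_err.append(sq_err(x_tr, y_tr, W))
    test_err.append(sq_err(x_te, y_te, W))

# Plot train_err and test_err against n (e.g. with matplotlib);
# a reasonable choice of n is where the test error bottoms out.
best_n = 1 + int(np.argmin(test_err))
```

Typically the training error keeps decreasing with n while the test error turns back up once the polynomial starts fitting noise, which is exactly the trade-off question 2b asks you to discuss.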
3. Regularization [15 + 15 = 30 marks]
Perform the following regularizations on the two curves for which you obtained the minimum and maximum training error above.
a. Perform Lasso regression, i.e., minimize the following cost function, varying λ = 0.25, 0.5, 0.75, 1:

J(W) = Σ_{i=1}^{N} (y_i − W^T Φ_n(x_i))^2 + λ Σ_{j=0}^{n} |W_j|
b. Perform Ridge regression, i.e., minimize the following cost function, varying λ = 0.25, 0.5, 0.75, 1:

J(W) = Σ_{i=1}^{N} (y_i − W^T Φ_n(x_i))^2 + λ Σ_{j=0}^{n} W_j^2
Plot both the training and test error for the two types of regularization in (a) and (b). What differences do you notice between the two kinds of regression? Which one would you prefer for this problem, and why?
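The two regularized fits can be sketched with the same gradient-descent loop, adding the penalty's (sub)gradient to the data term. Two assumptions here: the L1 term uses the subgradient sign(W) at the kink, and the bias W_0 is penalized along with the other coefficients (the assignment does not say either way).

```python
import numpy as np

def fit_regularized(x, y, n, lam, kind, lr=0.05, iters=20000):
    # Gradient descent on the regularized cost:
    #   kind="lasso": J(W) + lam * sum_j |W_j|   (subgradient sign(W_j))
    #   kind="ridge": J(W) + lam * sum_j W_j^2   (gradient 2*lam*W_j)
    # The data term's gradient is averaged over samples, as in 1b.
    Phi = np.vander(x, n + 1, increasing=True)
    W = np.zeros(n + 1)
    for _ in range(iters):
        grad = (2.0 / len(x)) * Phi.T @ (Phi @ W - y)
        if kind == "lasso":
            grad = grad + lam * np.sign(W)
        else:  # ridge
            grad = grad + 2.0 * lam * W
        W -= lr * grad
    return W
```

Looping lam over 0.25, 0.5, 0.75, 1 for both kinds and recording train/test squared error gives the plots the question asks for; a difference to look for is that Lasso tends to drive some coefficients to (near) zero while Ridge only shrinks them smoothly.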