1 Introduction
Linear Regression is a simple technique for predicting a real output y given an input x = (x1,x2,...,xP) via the linear model:
f(x) = β0 + Σ_{k=1}^{P} βk xk        (1)
Typically there is a set of training data T = (xi, yi), i = 1, ..., N, from which to estimate the coefficients β = [β0, β1, ..., βP]^T.
The Least Squares (LS) approach finds these coefficients by minimizing the sum of squares error
SSE = Σ_{i=1}^{N} (yi − f(xi))²        (2)
The linear model is limited because the output is a linear function of the input variables xk. However, it can easily be extended to more complex models by considering linear combinations of nonlinear functions, φk(x), of the input variables
f(x) = β0 + Σ_{k=1}^{P} βk φk(x)        (3)
In this case the model is still linear in the parameters, although it is nonlinear in x. Examples of nonlinear functions include polynomial and radial basis functions. This assignment aims to illustrate Linear Regression. In the first part, we'll experiment with linear and polynomial models. In the second part, we'll illustrate regularized Least Squares Regression.
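For example, a polynomial fit of degree P in a single variable corresponds to choosing φk(x) = x^k in (3), while a Gaussian radial basis function centred at μk corresponds to φk(x) = exp(−(x − μk)² / (2σ²)).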
2 Practical assignment
Note In this assignment you will often find, between braces as in {command}, suggestions of Python commands that may be useful for performing the requested tasks. You should consult the Python documentation, when necessary, for a description of how to use these commands. At the end of the session you should submit, in fenix, your code along with your answers.
2.1 Least Squares Fitting
(T) Write the matrix expressions for the LS estimate of the coefficients of a polynomial fit of degree P, and for the corresponding sum of squares error, given training data T = (xi, yi), i = 1, ..., N.
Write code to fit a polynomial of degree P to 1-D data variables x and y. Write your own code; do not use any ready-made Python function for LS estimation or for polynomial fitting [1]. {matmul, transpose, inv from numpy}
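A minimal sketch of one possible implementation is given below. It assumes x and y are 1-D numpy arrays, includes the intercept β0 as the first column of the design matrix, and uses only matmul, transpose (.T) and inv from numpy; the function names (design_matrix, fit_poly_ls, sse) are illustrative, not required.

import numpy as np

def design_matrix(x, P):
    # columns [1, x, x**2, ..., x**P]; the first column accounts for the intercept beta0
    return np.stack([x ** k for k in range(P + 1)], axis=1)

def fit_poly_ls(x, y, P):
    # LS estimate via the normal equations: beta = (X^T X)^{-1} X^T y
    X = design_matrix(x, P)
    XtX_inv = np.linalg.inv(np.matmul(X.T, X))
    return np.matmul(XtX_inv, np.matmul(X.T, y))

def sse(x, y, beta):
    # sum of squared errors of the polynomial defined by beta on (x, y)
    X = design_matrix(x, len(beta) - 1)
    r = y - np.matmul(X, beta)
    return np.sum(r ** 2)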
Load the data in files ‘data1_x.npy’ and ‘data1_y.npy’ and use your code to fit a straight line to variables y and x. {load from numpy} Plot the fit on the same graph as the training data. Comment. {plot, scatter from matplotlib}
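A possible usage sketch, assuming the files hold 1-D arrays of equal length and reusing the fit_poly_ls, design_matrix and sse functions sketched above:

import numpy as np
import matplotlib.pyplot as plt

x = np.load('data1_x.npy')
y = np.load('data1_y.npy')

beta = fit_poly_ls(x, y, 1)              # straight line = polynomial of degree 1
print('coefficients:', beta, 'SSE:', sse(x, y, beta))

xs = np.linspace(x.min(), x.max(), 200)  # dense grid for a smooth line
plt.scatter(x, y, label='training data')
plt.plot(xs, np.matmul(design_matrix(xs, 1), beta), 'r', label='LS fit')
plt.legend()
plt.show()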
Indicate the coefficients and the Sum of squared errors (SSE) you obtained.
Load the data in files ‘data2_x.npy’ and ‘data2_y.npy’, which contain noisy observations of a cosine function, with x ∈ [−1, 1], where the noise is Gaussian with a standard deviation of 0.15. Use your code to fit a second-degree polynomial to these data. Plot the training data and the fit. Comment.
Indicate the coefficients and the SSE you obtained. Comment.
Repeat item 4 using as input the data from files ‘data2a_x.npy’ and ‘data2a_y.npy’. These files contain the same data used in the previous exercise except for the presence of a couple of outlier points. Plot the training data and the fit. Comment.
Indicate the coefficients and the SSE you obtained. Compute, in addition, the SSE only on the inliers. Comment on the sensitivity of LS to outliers.
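One possible way to compute the SSE restricted to the inliers is to mask out the points with unusually large residuals; the threshold below is a placeholder (roughly three times the stated noise standard deviation), not a prescribed value.

# continuing from the fit on the data2a variables x, y with coefficients beta
X = design_matrix(x, len(beta) - 1)
residuals = y - np.matmul(X, beta)
inlier_mask = np.abs(residuals) < 0.5     # placeholder threshold (~3x the 0.15 noise std); adjust after inspecting the data
sse_inliers = np.sum(residuals[inlier_mask] ** 2)
print('SSE (all points):', np.sum(residuals ** 2), 'SSE (inliers only):', sse_inliers)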
2.2 Regularization
The goal of this second part is to illustrate linear regression with regularization. We’ll experiment with Ridge Regression and Lasso.
Load the data in files ‘data3_x.npy’ and ‘data3_y.npy’, which contain 3-dimensional features in variable x and a single output y. One of the features in x is irrelevant.
(T) Explain the Ridge and Lasso regularization methods, and explain how Lasso can be used for feature selection while Ridge cannot. Use expressions when appropriate.
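For reference, the penalized objectives commonly minimized are (the intercept β0 is typically not penalized, and exact scaling conventions vary between implementations; e.g., sklearn's Lasso divides the data term by 2N):

Ridge:  β̂ = argmin_β  Σ_{i=1}^{N} (yi − f(xi))² + α Σ_{k=1}^{P} βk²

Lasso:  β̂ = argmin_β  Σ_{i=1}^{N} (yi − f(xi))² + α Σ_{k=1}^{P} |βk|

The ℓ1 penalty tends to set some coefficients exactly to zero, whereas the ℓ2 penalty only shrinks them toward zero.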
Instantiate Ridge and Lasso models. {Ridge, Lasso from sklearn.linear_model}
Fit Ridge and Lasso models to your data for values of α in the range 10⁻³ to 10 with a step size of 0.01. Set the maximum number of iterations to 10000.
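A sketch of the fitting loop, assuming x has shape (N, 3) and y has shape (N,):

import numpy as np
from sklearn.linear_model import Ridge, Lasso

x = np.load('data3_x.npy')                # assumed shape (N, 3)
y = np.load('data3_y.npy')                # assumed shape (N,)

alphas = np.arange(1e-3, 10, 0.01)
ridge_coefs, lasso_coefs = [], []
for a in alphas:
    ridge_coefs.append(Ridge(alpha=a).fit(x, y).coef_)
    lasso_coefs.append(Lasso(alpha=a, max_iter=10000).fit(x, y).coef_)
ridge_coefs = np.array(ridge_coefs)       # shape (len(alphas), 3)
lasso_coefs = np.array(lasso_coefs)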
Plot the Lasso and Ridge coefficients against α, using a logarithmic scale for the α axis. For comparison, plot horizontal lines corresponding to the LS coefficients (α = 0) in the same figure. {plot from matplotlib}
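One way to produce the plot, continuing from the loop above and using LinearRegression only to obtain the α = 0 reference (the LS coefficients could equally be computed with the code from Section 2.1):

import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

ls_coefs = LinearRegression().fit(x, y).coef_   # alpha = 0 reference

for j in range(x.shape[1]):
    plt.semilogx(alphas, ridge_coefs[:, j], label=f'Ridge, feature {j}')
    plt.semilogx(alphas, lasso_coefs[:, j], '--', label=f'Lasso, feature {j}')
    plt.axhline(ls_coefs[j], color='k', linewidth=0.5)
plt.xlabel('α')
plt.ylabel('coefficient value')
plt.legend()
plt.show()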
Comment on what you observe in the plot. Identify the irrelevant feature.
Consider now only the Lasso method. Choose an adequate value for α. Plot y and the fit obtained for that value of α and compare with the LS fit. Compute the SSE in both cases. Comment.
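A sketch of the comparison; alpha_star below is a placeholder that should be replaced by the value chosen from the previous plot, and x, y are the data3 variables loaded earlier.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso, LinearRegression

alpha_star = 0.1                                   # placeholder; justify the actual choice from the coefficient plot
lasso = Lasso(alpha=alpha_star, max_iter=10000).fit(x, y)
ls = LinearRegression().fit(x, y)

sse_lasso = np.sum((y - lasso.predict(x)) ** 2)
sse_ls = np.sum((y - ls.predict(x)) ** 2)
print('SSE Lasso:', sse_lasso, 'SSE LS:', sse_ls)

plt.plot(y, '.', label='y')
plt.plot(lasso.predict(x), label='Lasso fit')
plt.plot(ls.predict(x), label='LS fit')
plt.legend()
plt.show()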
[1] If you are unable to write your own code, you may use LinearRegression and PolynomialFeatures from sklearn (set include_bias to False). However, this option carries a penalty of 2 points in the lab grade.