CS613 - Assignment 1 - Regression

Introduction
In this assignment you will perform linear regression on a dataset and use cross-validation to analyze your results. In addition to computing and applying the closed-form solution, you will also implement a gradient descent algorithm for linear regression from scratch.

As with all homeworks, you cannot use any functions that are against the “spirit” of the assignment, unless explicitly told to do so. For this assignment that would mean any linear regression functions. You may use statistical and linear algebra functions to do things like:

mean
std
cov
inverse
matrix multiplication
transpose
etc.
Datasets
Fish Length Dataset (x06Simple.csv)       This dataset consists of 44 rows of data each of the form:

Index
Age (days)
Temperature of Water (degrees Celsius)
Length of Fish
The first row of the data contains header information.

Data obtained from: http://people.sc.fsu.edu/~jburkardt/datasets/regression/regression.html

1             Theory
(10pts) Consider the following data:
−2

−  5

−3

  0  −6

 −2



  1



 5 −1



3
1 

−4

1 

3 

11

5 

0  −1

−3

1
Compute the coefficients for the linear regression using the least squares estimate (LSE), where the second column is the dependent variable (the value to be predicted) and the first column is the sole feature. Show your work, and remember to add a bias feature and to standardize the features. Compute this model using all of the data (don’t worry about separating into training and testing sets).
Confirm your coefficient and intercept term using the sklearn.linear_model LinearRegression function.
For the function g(x) = (x − 1)^4, where x is a single value (not a vector or matrix):
(a) (3pts) What is the gradient with respect to x? Show your work to support your answer.
(b) (3pts) What is the global minimum of g(x)? Show your work to support your answer.
(c) (3pts) Plot x vs g(x) using matplotlib and include this image in your report.
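For the first two parts, a minimal sketch of the sanity-check pattern (standardize the feature, prepend a bias column, solve the normal equations, then confirm against sklearn). The numbers below are made-up toy values, NOT the assignment's data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data for illustration only -- not the assignment's matrix.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 7.0])

# Standardize the feature, then add a bias column of ones.
xs = (x - x.mean()) / x.std(ddof=1)
Z = np.column_stack([np.ones_like(xs), xs])

# Closed-form LSE: theta = (Z^T Z)^(-1) Z^T y
theta = np.linalg.inv(Z.T @ Z) @ Z.T @ y

# Confirm with sklearn on the same standardized feature.
model = LinearRegression().fit(xs.reshape(-1, 1), y)
print(theta)                           # [intercept, coefficient]
print(model.intercept_, model.coef_)   # should agree with theta
```

Whether you use the sample (ddof=1) or population (ddof=0) standard deviation is a judgment call; just be consistent between your hand computation and the sklearn check.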
2             Closed Form Linear Regression
Download the dataset x06Simple.csv from Blackboard. This dataset has header information in its first row and then all subsequent rows are in the format:

ROWId, x_{i,1}, x_{i,2}, y_i

Your code should work on any CSV dataset whose first row is header information and whose first column is some integer index, followed by D columns of real-valued features and ending with a target value.

Write a script that:

Reads in the data, ignoring the first row (header) and first column (index).
Randomizes the data
Selects the first 2/3 (round up) of the data for training and the remaining for testing
Standardizes the data (except for the last column of course) using the training data
Computes the closed-form solution of linear regression
Applies the solution to the testing samples
Computes the root mean squared error (RMSE): RMSE = √( (1/N) Σ_{i=1}^{N} (Y_i − Ŷ_i)² ), where Ŷ_i is the predicted value for observation X_i.
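The steps above can be sketched as follows. The NumPy seeding/shuffling calls, the use of `pinv` (rather than `inv`, for robustness to collinear columns), and `ddof=1` are my choices, not requirements of the handout:

```python
import numpy as np

def closed_form_regression(path="x06Simple.csv"):
    # Read the CSV, dropping the header row and the index column.
    data = np.genfromtxt(path, delimiter=",", skip_header=1)[:, 1:]

    rng = np.random.default_rng(0)               # seed with zero
    rng.shuffle(data)                            # randomize the rows

    n_train = int(np.ceil(2 * len(data) / 3))    # first 2/3, rounded up
    train, test = data[:n_train], data[n_train:]
    Xtr, ytr = train[:, :-1], train[:, -1]
    Xte, yte = test[:, :-1], test[:, -1]

    # Standardize features using TRAINING statistics only.
    mu, sd = Xtr.mean(axis=0), Xtr.std(axis=0, ddof=1)
    Xtr, Xte = (Xtr - mu) / sd, (Xte - mu) / sd

    # Add the bias feature.
    Xtr = np.column_stack([np.ones(len(Xtr)), Xtr])
    Xte = np.column_stack([np.ones(len(Xte)), Xte])

    # Closed-form solution, then RMSE on the testing samples.
    theta = np.linalg.pinv(Xtr.T @ Xtr) @ Xtr.T @ ytr
    rmse = np.sqrt(np.mean((yte - Xte @ theta) ** 2))
    return theta, rmse
```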
Implementation Details

Seed the random number generator with zero prior to randomizing the data
Don’t forget to add in a bias feature!
In your report you will need:

The final model in the form y = θ_0 + θ_1 x_{:,1} + ...
The root mean squared error.
3             S-Folds Cross-Validation
Cross-Validation is a technique used to get reliable evaluation results when we don’t have that much data (and it is therefore difficult to train and/or test a model reliably).

In this section you will do S-Folds Cross-Validation for a few different values of S. For each run you will divide your data into S parts (folds), train and test S different models, and evaluate via root mean squared error. In addition, to observe the effect of system variance, we will repeat these experiments several times (shuffling the data each time prior to creating the folds). We will again be doing our experiment on the provided fish dataset. You may use sklearn KFold to perform this task.

Write a script that:

Reads in the data, ignoring the first row (header) and first column (index).
20 times does the following:
(a) Randomizes the data
(b) Creates S folds
(c) For i = 1 to S:
i. Select fold i as your testing data and the remaining (S − 1) folds as your training data
ii. Standardize the data (except for the last column of course) based on the training data
iii. Train a closed-form linear regression model
iv. Compute the squared error for each sample in the current testing fold
(d) You should now have N squared errors. Compute the RMSE for these.

You should now have 20 RMSE values. Compute the mean and standard deviation of these. The former should give us a better “overall” mean, whereas the latter should give us a feel for the variance of the models that were created.
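One way to sketch the repeated S-fold loop, assuming the data has already been loaded into a feature matrix X and target vector y (header and index removed); sklearn's KFold handles the fold bookkeeping, and the seed is set once, before the repeat loop:

```python
import numpy as np
from sklearn.model_selection import KFold

def repeated_sfold_rmse(X, y, S, repeats=20, seed=0):
    rng = np.random.default_rng(seed)       # seed once, BEFORE the 20 runs
    rmses = []
    for _ in range(repeats):
        order = rng.permutation(len(X))     # (a) reshuffle before folding
        Xs, ys = X[order], y[order]
        sq_errors = []
        for tr, te in KFold(n_splits=S).split(Xs):   # (b)-(c) the S folds
            mu = Xs[tr].mean(axis=0)
            sd = Xs[tr].std(axis=0, ddof=1)          # training stats only
            Ztr = np.column_stack([np.ones(len(tr)), (Xs[tr] - mu) / sd])
            Zte = np.column_stack([np.ones(len(te)), (Xs[te] - mu) / sd])
            theta = np.linalg.pinv(Ztr.T @ Ztr) @ Ztr.T @ ys[tr]
            sq_errors.extend((ys[te] - Zte @ theta) ** 2)
        rmses.append(np.sqrt(np.mean(sq_errors)))    # (d) one RMSE per run
    return np.mean(rmses), np.std(rmses)
```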
Implementation Details

Don’t forget to add a bias feature!
Set your seed value at the very beginning of your script (if you set it within the 20 tests, each test will have the same randomly shuffled data!).
In your report you will need:

The average and standard deviation of the root mean squared error for S = 3 over the 20 different seed values.
The average and standard deviation of the root mean squared error for S = 5 over the 20 different seed values.
The average and standard deviation of the root mean squared error for S = 20 over the 20 different seed values.
The average and standard deviation of the root mean squared error for S = N (where N is the number of samples) over the 20 different seed values. This is basically leave-one-out cross-validation.
4             Locally-Weighted Linear Regression
Next we’ll do locally-weighted closed-form linear regression. You may use sklearn train_test_split for this part.

Write a script to:

Read in the data, ignoring the first row (header) and first column (index).
Randomize the data
Select the first 2/3 of the data for training and the remaining for testing
Standardize the data (except for the last column of course) using the training data
Then for each testing sample:
(a) Compute the necessary distance matrix relative to the training data in order to compute a local model.
(b) Evaluate the testing sample using the local model.
(c) Compute the squared error of the testing sample.
Compute the root mean squared error (RMSE): RMSE = √( (1/N) Σ_{i=1}^{N} (Y_i − Ŷ_i)² ), where Ŷ_i is the predicted value for observation X_i.
Implementation Details

Seed the random number generator with zero prior to randomizing the data
Don’t forget to add in the bias feature!
Use the L1 distance when computing the distances d(a,b).
Let k = 1 in the similarity function β(a, b) = e^(−d(a,b)/k²).
Use all training instances when computing the local model.
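A sketch of the local model for a single testing sample under the details above (L1 distance, β(a, b) = e^(−d(a,b)/k²) with k = 1, all training instances). It assumes Xtr and the query are already standardized with a bias column prepended, and uses the standard weighted least-squares form θ = (XᵀWX)⁻¹XᵀWy:

```python
import numpy as np

def lwr_predict(Xtr, ytr, x_query, k=1.0):
    # L1 distance from the query to every training sample
    # (the bias column contributes zero, since both entries are 1).
    d = np.abs(Xtr - x_query).sum(axis=1)
    beta = np.exp(-d / k**2)                 # similarity weights
    W = np.diag(beta)
    # Weighted least squares: theta = (X^T W X)^(-1) X^T W y
    theta = np.linalg.pinv(Xtr.T @ W @ Xtr) @ Xtr.T @ W @ ytr
    return x_query @ theta
```

A fresh θ is computed for every testing sample; collect the squared error of each prediction and take the RMSE at the end.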
In your report you will need:

The root mean squared error.
5             Gradient Descent
As discussed in class Gradient Descent (Ascent) is a general algorithm that allows us to converge on local minima (maxima) when a closed-form solution is not available or is not feasible to compute.

In this section you are going to implement a gradient descent algorithm to find the parameters for linear regression on the same data used for the previous sections. You may NOT use any function from an ML library to do this for you, except sklearn train_test_split for the data.

Implementation Details

Seed the random number generator prior to your algorithm.
Don’t forget to add a bias feature!
Initialize the parameters of θ using random values in the range [-1, 1]
Do batch gradient descent
Terminate when the absolute value of the percent change in the RMSE on the training data is less than 2^−23, or after 1,000 iterations have passed (whichever occurs first).
Use a learning rate η = 0.
Make sure that your code can work for an arbitrary number of observations and an arbitrary number of features.
Write a script that:

Reads in the data, ignoring the first row (header) and first column (index).
Randomizes the data
Selects the first 2/3 (round up) of the data for training and the remaining for testing
Standardizes the data (except for the last column of course) based on the training data
While the termination criteria (mentioned above in the implementation details) haven’t been met:
(a) Compute the RMSE of the training data.
(b) While we can’t let the testing set affect our training process, also compute the RMSE of the testing data at each iteration of the algorithm (it’ll be interesting to see).
(c) Update each parameter using batch gradient descent.
Compute the RMSE of the testing data.
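The training loop above might look like the following sketch. The `eta` default is a placeholder (the handout's learning-rate value appears truncated), and the gradient shown is the usual batch gradient of the mean squared error:

```python
import numpy as np

def batch_gd(Xtr, ytr, eta=0.01, tol=2**-23, max_iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-1, 1, Xtr.shape[1])   # init parameters in [-1, 1]
    n = len(ytr)
    prev = np.sqrt(np.mean((Xtr @ theta - ytr) ** 2))
    history = [prev]                           # training RMSE per iteration
    for _ in range(max_iters):
        grad = (2.0 / n) * Xtr.T @ (Xtr @ theta - ytr)  # batch MSE gradient
        theta = theta - eta * grad
        rmse = np.sqrt(np.mean((Xtr @ theta - ytr) ** 2))
        history.append(rmse)
        # terminate when |percent change in training RMSE| < 2^-23
        if abs((rmse - prev) / prev) < tol:
            break
        prev = rmse
    return theta, history
```

Tracking the testing RMSE alongside `history` at each iteration gives the curve the report asks for; only the training RMSE drives the termination check.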
What you will need for your report

Final model
A graph of the RMSE of the training and testing sets as a function of the iteration
The final testing RMSE
