Exercise 1: Baseline predictor
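Compute the baseline predictor $\hat{\mathbf{R}}$ based on the following raw data matrix $\mathbf{R}$:
$$\mathbf{R} = \begin{bmatrix} 5 & - & 5 & 4 \\ - & 1 & 1 & 4 \\ 4 & 1 & 2 & 4 \\ 3 & 4 & - & 3 \\ 1 & 5 & 3 & - \end{bmatrix}$$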
(Hint: This involves a least squares problem with sixteen equations and nine variables. Feel free to use any programming language; for example, the backslash operator or pinv() in Matlab can be helpful. If the least squares problem has multiple solutions, take any one of them.
Show the matrix $\mathbf{A}$, the vectors $\mathbf{b}$ and $\mathbf{c}$, and $\hat{\mathbf{R}}$. Note that every rating must lie between 1 and 5. No need to submit the programming code.)
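The setup described in the hint can be sketched in Python with NumPy. This is a minimal sketch rather than a reference solution. It assumes the usual baseline model, in which a prediction is the global mean plus a user bias plus a movie bias, and it assumes that rows of the matrix are users and columns are movies; neither is spelled out on this sheet. The names `known`, `r_bar`, `b_user` and `b_movie` are illustrative. `np.linalg.lstsq` returns the minimum-norm solution, which plays the role of `pinv()` in the hint.

```python
import numpy as np

# Ratings from Exercise 1; np.nan marks a missing entry.
# Assumption: rows are users, columns are movies.
R = np.array([
    [5, np.nan, 5, 4],
    [np.nan, 1, 1, 4],
    [4, 1, 2, 4],
    [3, 4, np.nan, 3],
    [1, 5, 3, np.nan],
])

n_users, n_movies = R.shape
known = ~np.isnan(R)          # mask of observed ratings
r_bar = R[known].mean()       # global average over observed ratings

# One equation per observed rating: b_u + b_i = r_ui - r_bar.
users, movies = np.nonzero(known)   # row-major order, same order as R[known]
rows = np.arange(users.size)
A = np.zeros((users.size, n_users + n_movies))
A[rows, users] = 1.0                # user-bias columns
A[rows, n_users + movies] = 1.0     # movie-bias columns
c = R[known] - r_bar

# Minimum-norm least squares solution (A is rank deficient here).
b, *_ = np.linalg.lstsq(A, c, rcond=None)
b_user, b_movie = b[:n_users], b[n_users:]

# Baseline prediction for every (user, movie) pair, clipped to the rating range.
R_hat = np.clip(r_bar + b_user[:, None] + b_movie[None, :], 1.0, 5.0)
print(np.round(R_hat, 2))
```

Shifting every user bias up and every movie bias down by the same constant leaves each sum of a user bias and a movie bias unchanged, so the bias vector is not unique, but the prediction matrix is the same for every least squares solution.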
Exercise 2: Neighborhood predictor
Using the given $\mathbf{R}$ and the $\hat{\mathbf{R}}$ computed in the previous exercise, compute the neighborhood predictor $\hat{\mathbf{R}}^N$ with $L = 2$. Compute neighbors across the columns (movies).
(Hint: Note that every rating must lie between 1 and 5. Compute the similarity matrix, show the
neighbors of each movie, and find $\hat{\mathbf{R}}^N$. No need to submit the programming code.)
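The steps in the hint (similarity matrix, neighbors, predictor) can be sketched as follows. This is a sketch under several assumptions that the sheet does not state: similarity is the cosine of the baseline residuals taken over users who rated both movies, the $L$ neighbors of a movie are those with the largest absolute similarity, residuals are measured against the clipped baseline, and a neighbor that the user has not rated is skipped. The function names `baseline` and `neighborhood` are illustrative; follow the definitions given in class where they differ.

```python
import numpy as np

def baseline(R):
    """Least squares baseline (global mean + user bias + movie bias), clipped to [1, 5]."""
    known = ~np.isnan(R)
    n_users, n_movies = R.shape
    r_bar = R[known].mean()
    users, movies = np.nonzero(known)
    rows = np.arange(users.size)
    A = np.zeros((users.size, n_users + n_movies))
    A[rows, users] = 1.0
    A[rows, n_users + movies] = 1.0
    b, *_ = np.linalg.lstsq(A, R[known] - r_bar, rcond=None)
    return np.clip(r_bar + b[:n_users, None] + b[None, n_users:], 1.0, 5.0)

def neighborhood(R, R_hat, L=2):
    """Return the similarity matrix, the neighbor indices and the neighborhood predictor."""
    known = ~np.isnan(R)
    E = np.where(known, R - R_hat, 0.0)   # residuals, 0 where the rating is missing
    n_users, n_movies = R.shape

    # Cosine similarity between movie columns over users who rated both movies.
    D = np.zeros((n_movies, n_movies))
    for i in range(n_movies):
        for j in range(n_movies):
            if i == j:
                continue
            both = known[:, i] & known[:, j]
            denom = np.linalg.norm(E[both, i]) * np.linalg.norm(E[both, j])
            if denom > 0:
                D[i, j] = E[both, i] @ E[both, j] / denom

    # For each movie, keep the L other movies with the largest |similarity|.
    rank = np.abs(D)
    np.fill_diagonal(rank, -np.inf)       # a movie is never its own neighbor
    neighbors = np.argsort(-rank, axis=1)[:, :L]

    R_hat_N = R_hat.copy()
    for u in range(n_users):
        for i in range(n_movies):
            # Assumption: neighbors that user u has not rated are skipped.
            js = [j for j in neighbors[i] if known[u, j]]
            weight = sum(abs(D[i, j]) for j in js)
            if weight > 0:
                R_hat_N[u, i] += sum(D[i, j] * E[u, j] for j in js) / weight
    return D, neighbors, np.clip(R_hat_N, 1.0, 5.0)

R = np.array([
    [5, np.nan, 5, 4],
    [np.nan, 1, 1, 4],
    [4, 1, 2, 4],
    [3, 4, np.nan, 3],
    [1, 5, 3, np.nan],
])
R_hat = baseline(R)
D, neighbors, R_hat_N = neighborhood(R, R_hat, L=2)
print(np.round(D, 2))
print(neighbors + 1)          # 1-based movie indices
print(np.round(R_hat_N, 2))
```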
Exercise 3: Least squares
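a) Solve for $\mathbf{b}$ in the following least squares problem, by hand or using any programming language:
minimize $\|\mathbf{A}\mathbf{b} - \mathbf{c}\|_2^2$,
where
$$\mathbf{A} = \begin{bmatrix} 1 & 0 & 2 \\ 1 & 1 & 0 \\ 0 & 2 & 1 \\ 2 & 1 & 1 \end{bmatrix} \quad \text{and} \quad \mathbf{c} = \begin{bmatrix} 2 \\ 1 \\ 1 \\ 3 \end{bmatrix}.$$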
(Hint: take the derivative of $\|\mathbf{A}\mathbf{b} - \mathbf{c}\|_2^2$ with respect to $\mathbf{b}$.)
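Following the hint, the gradient of the objective is $2\mathbf{A}^T(\mathbf{A}\mathbf{b} - \mathbf{c})$, and setting it to zero gives the normal equations $\mathbf{A}^T\mathbf{A}\mathbf{b} = \mathbf{A}^T\mathbf{c}$. A minimal NumPy sketch that solves them for the data of part (a) and cross-checks the result against NumPy's own least squares routine (the name `b_check` is illustrative):

```python
import numpy as np

A = np.array([[1, 0, 2],
              [1, 1, 0],
              [0, 2, 1],
              [2, 1, 1]], dtype=float)
c = np.array([2, 1, 1, 3], dtype=float)

# Normal equations: (A^T A) b = A^T c.
# A has full column rank, so A^T A is invertible and b is unique.
b = np.linalg.solve(A.T @ A, A.T @ c)

# Cross-check with the built-in least squares solver.
b_check, *_ = np.linalg.lstsq(A, c, rcond=None)

print(b)
print(np.allclose(b, b_check))
```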
b) Solve the above least squares problem again with regularization. Vary the regularization parameter $\lambda$ over $\lambda = 0, 0.2, 0.4, 0.6, 0.8, \ldots, 4.8, 5.0$, and plot both $\|\mathbf{A}\mathbf{b} - \mathbf{c}\|_2^2$ and $\|\mathbf{b}\|_2^2$ against $\lambda$.
(Hint: take the derivative of
$\|\mathbf{A}\mathbf{b} - \mathbf{c}\|_2^2 + \lambda\|\mathbf{b}\|_2^2$
with respect to $\mathbf{b}$ to obtain a system of linear equations. Note that $\lambda$ ranges from 0 to 5.0 in steps of 0.2.)
(Hint: We briefly mentioned regularization in class; it is a method used to prevent overfitting. You should be able to do part (b) by differentiating $\|\mathbf{b}\|_2^2$ in the same way as the least squares term in part (a).)
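Differentiating the regularized objective as the hints suggest gives $2\mathbf{A}^T(\mathbf{A}\mathbf{b} - \mathbf{c}) + 2\lambda\mathbf{b} = 0$, that is, the linear system $(\mathbf{A}^T\mathbf{A} + \lambda\mathbf{I})\mathbf{b} = \mathbf{A}^T\mathbf{c}$. The sweep over $\lambda$ can be sketched as below. The helper `plot_curves` is hypothetical and assumes Matplotlib is installed; the two lists `residual_sq` and `norm_sq` hold the quantities to plot.

```python
import numpy as np

A = np.array([[1, 0, 2],
              [1, 1, 0],
              [0, 2, 1],
              [2, 1, 1]], dtype=float)
c = np.array([2, 1, 1, 3], dtype=float)

lambdas = np.linspace(0.0, 5.0, 26)   # 0, 0.2, 0.4, ..., 5.0
identity = np.eye(A.shape[1])

residual_sq = []   # ||A b - c||_2^2 for each lambda
norm_sq = []       # ||b||_2^2 for each lambda
for lam in lambdas:
    # Regularized normal equations: (A^T A + lam * I) b = A^T c.
    b = np.linalg.solve(A.T @ A + lam * identity, A.T @ c)
    residual_sq.append(np.sum((A @ b - c) ** 2))
    norm_sq.append(np.sum(b ** 2))

def plot_curves():
    """Plot both curves against lambda (assumes Matplotlib is available)."""
    import matplotlib.pyplot as plt
    plt.plot(lambdas, residual_sq, marker="o", label="squared residual norm")
    plt.plot(lambdas, norm_sq, marker="s", label="squared norm of b")
    plt.xlabel("lambda")
    plt.legend()
    plt.show()
```

Calling `plot_curves()` draws both curves on one set of axes. As $\lambda$ grows, the residual term can only increase and the norm of $\mathbf{b}$ can only decrease, which is a useful sanity check on the plot.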