Problem 1.
Consider the training objective J = ||Xw − t||² subject to ||w||² ≤ C for some constant C. How would the hypothesis class capacity, overfitting/underfitting, and bias/variance vary according to C?
                                 Larger C                  Smaller C
Model capacity (large/small?)    _____                     _____
Overfitting/Underfitting?        __fitting                 __fitting
Bias/variance (high/low?)        __ bias / __ variance     __ bias / __ variance
Note: No proof is needed.
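A minimal numerical sketch (illustration only, not part of the problem): it solves the constrained objective min ||Xw − t||² subject to ||w||² ≤ C for a few values of C on a made-up degree-9 polynomial example and prints the training and held-out errors. The data, feature map, and the particular values of C are arbitrary choices for building intuition about the table above.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=30)
t = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=30)    # synthetic noisy targets
X = np.vander(x, N=10, increasing=True)                   # degree-9 polynomial features
X_tr, t_tr, X_te, t_te = X[:20], t[:20], X[20:], t[20:]   # simple train / held-out split

def constrained_ls(C):
    """Solve min_w ||X_tr w - t_tr||^2  subject to  ||w||^2 <= C."""
    obj = lambda w: np.sum((X_tr @ w - t_tr) ** 2)
    cons = {"type": "ineq", "fun": lambda w: C - np.sum(w ** 2)}   # feasible iff value >= 0
    return minimize(obj, np.zeros(X.shape[1]), constraints=[cons], method="SLSQP").x

for C in (0.01, 1.0, 100.0):
    w = constrained_ls(C)
    print(f"C={C:7.2f}  train MSE={np.mean((X_tr @ w - t_tr) ** 2):.3f}  "
          f"held-out MSE={np.mean((X_te @ w - t_te) ** 2):.3f}")
```

The exact numbers depend on the synthetic data, but comparing the two error columns across values of C is one way to check your entries in the table.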
Problem 2.
Consider a one-dimensional linear regression model t^(n) ∼ N(w x^(n), σ_ϵ²) with a Gaussian prior w ∼ N(0, σ²). Show that the posterior of w is also a Gaussian distribution, i.e., w | x^(1), t^(1), ···, x^(N), t^(N) ∼ N(µ_post, σ²_post). Give the formulas for µ_post and σ²_post.
Hint: Work with p(w|D) ∝ p(w) p(D|w). You do not need to handle the normalizing term.
Note: If a prior has the same functional form (but typically with different parameters) as the posterior, it is known as a conjugate prior. The above conjugacy also applies to multi-dimensional Gaussians, but the formulas for the mean vector and the covariance matrix will be more complicated.
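As an illustration of the conjugacy claim in the note (not a solution), the sketch below evaluates the unnormalized log-posterior log p(w) + log p(D|w) on a grid of w values for made-up data and checks that it is an exact quadratic in w, which is precisely what it means for the posterior to be Gaussian. The values of σ, σ_ϵ, and the data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, sigma_eps = 2.0, 0.5                      # prior std and noise std (arbitrary)
x = rng.normal(size=20)                          # synthetic inputs x^(n)
t = 1.3 * x + sigma_eps * rng.normal(size=20)    # synthetic targets t^(n)

w_grid = np.linspace(-5.0, 5.0, 401)
log_prior = -w_grid ** 2 / (2 * sigma ** 2)                                  # log N(w; 0, sigma^2), up to a constant
log_lik = np.array([-np.sum((t - w * x) ** 2) for w in w_grid]) / (2 * sigma_eps ** 2)
log_post = log_prior + log_lik                                               # log p(w|D), up to the normalizer

# A Gaussian posterior means log_post is quadratic in w, so a degree-2
# polynomial fit should reproduce it up to floating-point error.
coeffs = np.polyfit(w_grid, log_post, deg=2)
max_dev = np.max(np.abs(np.polyval(coeffs, w_grid) - log_post))
print(f"max deviation of log-posterior from a quadratic: {max_dev:.2e}")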
Problem 3.
Give the prior distribution of w for linear regression, such that the maximum a posteriori estimation is equivalent to ℓ1-penalized mean square loss.