Problem 1 (30 points): Consider a classification problem with a binary class label Y and a single continuous feature X taking values in (−4, −2) ∪ (2, 4). Suppose (X, Y) is generated by choosing Y at random with P(Y = 1) = P(Y = 2) = 1/2 and then drawing X conditional on Y from a uniform distribution. Specifically, assume that the class-conditional densities of X are
p(x | Y = 1) = 1/2 for x ∈ (2, 4) (and 0 otherwise), and p(x | Y = 2) = 1/2 for x ∈ (−4, −2) (and 0 otherwise).
Throughout, we consider 0–1 loss; that is, the risk of a classifier is its probability of error.
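For concreteness, here is a minimal simulation sketch of this generative model in Python. It uses the pairing Y = 1 ↔ (2, 4) and Y = 2 ↔ (−4, −2) stated in the densities above; the problem is symmetric in the two classes, so the choice of pairing does not affect any of the answers below. The function name sample_xy is illustrative, not part of the problem.

```python
import numpy as np

def sample_xy(n, rng):
    """Draw n i.i.d. pairs (X, Y) from the model above.

    ASSUMPTION: labels pair with intervals as Y=1 <-> (2, 4) and
    Y=2 <-> (-4, -2); by symmetry the pairing does not matter.
    """
    y = rng.choice([1, 2], size=n)             # P(Y=1) = P(Y=2) = 1/2
    x = np.where(y == 1,
                 rng.uniform(2, 4, size=n),    # X | Y=1 ~ Unif(2, 4)
                 rng.uniform(-4, -2, size=n))  # X | Y=2 ~ Unif(-4, -2)
    return x, y

x, y = sample_xy(5, np.random.default_rng(0))  # e.g. five draws
```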
(a) What is the marginal distribution of X? What is the conditional distribution of Y given X?
(b) What is the Bayes rule f_B(x), and what is its risk P(Y ≠ f_B(X))? Explain.
(c) Let f̂_1(x; S) be the 1-nearest-neighbor classifier based on a training sample S = {(x_1, y_1), ..., (x_n, y_n)} of i.i.d. observations of (X, Y). What is the risk P(Y ≠ f̂_1(X; S))? Explain. (Here the risk is computed by integrating over both the training data and a new independent pair (X, Y); see also the simulation sketch after part (e).)
(d) Under the same scenario, calculate the risk of the 3-nearest-neighbor classifier f̂_3(x; S).
(e) Which method, 1-nearest neighbor or 3-nearest neighbor, has smaller risk in this problem?
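Answers to (c)–(e) can be sanity-checked empirically. The sketch below (reusing sample_xy from the snippet above) estimates the k-NN risk by Monte Carlo: each trial draws a fresh training sample S of size n and one independent test pair (X, Y), matching the definition of risk in part (c). The names knn_predict and mc_risk are illustrative, not from the problem, and n ≥ k is assumed so the vote is well defined.

```python
from collections import Counter
import numpy as np

def knn_predict(x_train, y_train, x, k):
    """Majority vote among the k training points nearest to x (1-D feature)."""
    nearest = np.argsort(np.abs(x_train - x))[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

def mc_risk(k, n, trials=20_000, seed=0):
    """Monte Carlo estimate of P(Y != f̂_k(X; S)) for training samples of size n."""
    rng = np.random.default_rng(seed)
    errors = 0
    for _ in range(trials):
        x_tr, y_tr = sample_xy(n, rng)   # fresh training sample S each trial
        x_te, y_te = sample_xy(1, rng)   # one new independent pair (X, Y)
        errors += knn_predict(x_tr, y_tr, x_te[0], k) != y_te[0]
    return errors / trials
```

For example, comparing mc_risk(1, 10) against mc_risk(3, 10) gives an empirical answer to (e); odd k avoids voting ties in the binary-label setting.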
Problem 2: ISLR Section 8.4, Problem 3.
Problem 3: ISLR Section 8.4, Problem 9, parts (a)–(g) (10 points each).