1. In this problem, consider a Bayes minimum-error classifier. Three classes are described by normal densities as follows:
p(x|Si) = N(x; mi, Σi), i = 1, 2, 3;  P(S1) = π1, P(S2) = π2, P(S3) = π3
in which each πi is a variable denoting the prior for class Si (to save on writing). Assume the πi are given.
In this problem, the algebra should be done by hand. The plots may be done by computer or by hand.
(a) Write an expression for the discriminant functions gi(x), in two forms:
(i) Expressed in terms of the Mahalanobis distance dM(x, mi) and other given terms;
(ii) Expressed only in terms of x, mi, Σi, and Σi⁻¹, grouped as follows:
gi(x) = x^T Wi x + wi^T x + w0(i)
and give expressions for Wi, wi, and w0(i) in terms of the same quantities.
(b) State the decision rule in terms of the discriminant functions gi(x). Is the classifier linear, quadratic, or neither?
(c) For this part you are also given:
m1 = [1, 2]^T,  m2 = [−1, 1]^T,  m3 = [−2, 2]^T
Σ1 = Σ2 = Σ3 = [1, −1; −1, 2]
(i) Give expressions for the discriminant functions g1(x), g2(x), and g3(x) in terms of the given numbers, in simplest form. Hint: before plugging in numbers, you can simplify the gi(x) by dropping some terms. Is the classifier linear, quadratic, or neither?
(ii) Let . Give equations for the 3 decision boundaries, in simplest form.
(iii) In (non-augmented) feature space, plot the class means, the 3 curves for dM²(x, mi) = 1, and the decision boundaries. Show clearly the decision regions and final boundaries.
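If you take the computer option for the plot, here is a minimal sketch (an illustration, not a prescribed method) using numpy and matplotlib. It assumes equal priors, in which case, with a shared covariance matrix, the minimum-error rule reduces to choosing the nearest mean in Mahalanobis distance; adjust for the priors specified in part (ii) if they differ.

```python
# Minimal sketch for 1(c)(iii): plot the means, the d_M^2(x, m_i) = 1
# ellipses, and the decision regions on a grid. Equal priors are assumed
# here for illustration; grid limits are arbitrary choices.
import numpy as np
import matplotlib.pyplot as plt

means = np.array([[1.0, 2.0], [-1.0, 1.0], [-2.0, 2.0]])
Sigma = np.array([[1.0, -1.0], [-1.0, 2.0]])
Sigma_inv = np.linalg.inv(Sigma)

xx, yy = np.meshgrid(np.linspace(-6, 5, 500), np.linspace(-3, 6, 500))
pts = np.dstack([xx, yy])                      # (H, W, 2) grid of feature vectors
diff = pts[:, :, None, :] - means              # (H, W, 3, 2): x - m_i for each class
dM2 = np.einsum('hwci,ij,hwcj->hwc', diff, Sigma_inv, diff)  # squared Mahalanobis distances

plt.contourf(xx, yy, dM2.argmin(axis=-1), levels=[-0.5, 0.5, 1.5, 2.5], alpha=0.3)
for i in range(3):
    plt.contour(xx, yy, dM2[:, :, i], levels=[1.0])          # ellipse d_M^2 = 1
    plt.plot(means[i, 0], means[i, 1], 'k+', markersize=10)
plt.gca().set_aspect('equal')
plt.xlabel('x1'); plt.ylabel('x2')
plt.show()
```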
2. A Naïve Bayes classifier is a Bayes classifier in which the features, conditioned on class, are assumed independent; thus:
p(x|Si) = ∏j=1…D p(xj|Si), ∀ i = 1, …, C.
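Since this problem uses Gaussian class-conditionals with diagonal covariance, a quick numerical check of the factorization may be helpful (numbers below are illustrative, not from the problem):

```python
# With a diagonal covariance, the 2-D normal density equals the product
# of its two 1-D marginals, i.e. the features are conditionally independent.
import numpy as np
from scipy.stats import multivariate_normal, norm

m = np.array([1.0, 2.0])                 # illustrative class mean
var = np.array([1.0, 2.0])               # diagonal entries (sigma_1^2, sigma_2^2)
x = np.array([0.3, -0.7])                # an arbitrary test point

joint = multivariate_normal.pdf(x, mean=m, cov=np.diag(var))
product = norm.pdf(x[0], loc=m[0], scale=np.sqrt(var[0])) * \
          norm.pdf(x[1], loc=m[1], scale=np.sqrt(var[1]))
print(np.isclose(joint, product))        # True: features independent given the class
```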
Most of what follows is similar to your work for Problem 1.
For a Naïve Bayes (minimum error) classifier, let the given class-conditional densities and priors be:
p(x|Si) = N(x; mi, Σi), i = 1, 2, 3
Σi = [(σ1(i))², 0; 0, (σ2(i))²]
P(S1) = π1, P(S2) = π2, P(S3) = π3
(a) Write an expression for the discriminant functions gi(x), in the form:
gi(x) = w11(i) x1² + w12(i) x1x2 + w22(i) x2² + w1(i) x1 + w2(i) x2 + w0(i)
and give expressions for all the weights in terms of given scalar quantities, in simplest form. Let the mean vector of class Si be denoted mi = [m1(i), m2(i)]^T.
(b) Is the classifier quadratic, linear, or neither?
(c) For this part you are also given:
m1 = [1, 2]^T,  m2 = [−1, 1]^T,  m3 = [−2, 2]^T;  (σ1(i))² = σ1² = 1,  (σ2(i))² = σ2² = 2,  ∀i
(i) Give expressions for the discriminant functions g1(x), g2(x), and g3(x) in terms of the given numbers, in simplest form. Hint: before plugging in numbers, you can simplify the gi(x) by dropping some terms. Is the classifier linear, quadratic, or neither?
(ii) Let . Give equations for the 3 decision boundaries, in simplest form.
(iii) In (non-augmented) feature space, plot the class means, the 3 curves for dM²(x, mi) = 1, i = 1, 2, 3, and the decision boundaries. Show clearly the decision regions and final boundaries.
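If you take the computer option again, the plotting sketch from Problem 1(c)(iii) can be reused; under the same equal-prior assumption, only the shared covariance changes:

```python
# Reuse the Problem 1 plotting sketch with the diagonal covariance of 2(c);
# equal priors are again assumed for illustration.
Sigma = np.diag([1.0, 2.0])        # (sigma_1^2, sigma_2^2) = (1, 2) for every class
Sigma_inv = np.linalg.inv(Sigma)
```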
(d) Compare your plot of 2(c)(iii) with your plot of 1(c)(iii). Does the dependence vs. independence of features make a substantial difference in the decision boundaries and regions?
3. This problem uses the minimum-risk criterion (instead of the minimum-error criterion) for classification (introduced in Lecture 21; covered in Discussion 12 and DHS 2.2). In this problem, please use our notation (Si instead of ωi for class i).
This problem is based on our 2-class tumor classification example. Suppose the class-conditional densities are known (or assumed) to be Gaussian:
p(x|Si) = N(x; mi, Σi), i = 1, 2
For a Bayes minimum risk classifier, answer the parts below.
(a) Write the decision rule for a 2-class Bayes minimum risk classifier, in terms of the conditional risks R(α1|x) and R(α2|x). Then, write the decision rule in terms of dM²(x, mi), Σ1, Σ2, λ11, λ12, λ21, λ22, P(S1), and P(S2).
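As a reminder of the starting point (this is the definition from DHS 2.2, not the requested final form), the conditional risks can be evaluated numerically from the posteriors. A tiny sketch with placeholder numbers:

```python
# Conditional risk of action alpha_i: R(alpha_i|x) = sum_j lam[i, j] * P(S_j|x);
# the minimum-risk rule picks the action with the smaller risk.
# All numbers here are placeholders; substitute the problem's values.
import numpy as np

lam = np.array([[0.0, 10.0],
                [1.0, 0.0]])          # placeholder lambda_ij (part (c) below gives the problem's values)
posteriors = np.array([0.7, 0.3])     # placeholder P(S_1|x), P(S_2|x)

cond_risk = lam @ posteriors          # [R(alpha_1|x), R(alpha_2|x)]
decision = cond_risk.argmin() + 1
print(cond_risk, "-> decide S%d" % decision)
```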
For parts (b)-(d) below, suppose also that on average, 80% of the tumors that are to be classified (based on their MRI images) are benign (class S1), and the other 20% are cancerous (class S2).
(b) Give estimates of the class priors P(S1) and P(S2).
For parts (c)-(e) below, suppose also that misclassifying a cancerous tumor as benign is considered much worse than misclassifying a benign tumor as cancerous. Thus we will use for our cost coefficients:
[λ11, λ12; λ21, λ22] = [0, 10; 1, 0]
Also, you are given that the mean vectors and covariance matrices for each class are known (or estimated) as:
m1 = [1, 4]^T,  m2 = [4, 2]^T
Σ1 = Σ2 = [0.5, −0.5; −0.5, 2]
(c) Solve for the decision boundary and regions; that is, give an expression (in simplest form, based on the given numbers) for the decision rule.
(d) Plot, in 2D feature space, the class means, the 2 curves for dM²(x, mi) = 1, i = 1, 2, and the decision boundary, and show the decision regions (by a small arrow at the boundary pointing into Γ1, or by labelling Γ1 and Γ2).
(e) If the guidelines for ordering an MRI change, and it is estimated that the data points coming in will be, on average, 50% benign and 50% cancerous, repeat part (d) (or add it to your part (d) plot, clearly labeled as (e)). How have the boundary and regions changed?
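For checking the part (d) and (e) plots by computer, one possible sketch (using the numbers as reconstructed above) evaluates the two conditional risks on a grid and draws the boundary where R(α1|x) = R(α2|x), for both prior settings:

```python
# Check sketch for 3(d)/(e): the boundary is the zero level set of
# R(alpha_1|x) - R(alpha_2|x). Unnormalized posteriors suffice, since the
# common factor p(x) > 0 does not change the sign of the difference.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

means = [np.array([1.0, 4.0]), np.array([4.0, 2.0])]
Sigma = np.array([[0.5, -0.5], [-0.5, 2.0]])
lam = np.array([[0.0, 10.0], [1.0, 0.0]])      # cost matrix from part (c)

xx, yy = np.meshgrid(np.linspace(-4, 9, 400), np.linspace(-4, 9, 400))
pts = np.dstack([xx, yy]).reshape(-1, 2)
like = np.stack([multivariate_normal.pdf(pts, m, Sigma) for m in means], axis=-1)

# (d): priors (0.8, 0.2); (e): priors (0.5, 0.5)
for priors, color, ls in [((0.8, 0.2), 'b', 'solid'), ((0.5, 0.5), 'r', 'dashed')]:
    post = like * np.array(priors)             # unnormalized posteriors
    risks = post @ lam.T                       # columns: R(alpha_1|x), R(alpha_2|x)
    diff = (risks[:, 0] - risks[:, 1]).reshape(xx.shape)
    plt.contour(xx, yy, diff, levels=[0.0], colors=color, linestyles=ls)

for m in means:
    plt.plot(*m, 'k+', markersize=10)
plt.gca().set_aspect('equal')
plt.show()
```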