Note: for a dataset, classification accuracy is defined as number of correctly classified data points divided by total number of data points.
Reminder: include a copy of your code as part of your homework submission, as one separate computer-readable pdf file, for all assignments that include computer problems in this class.
1. In this 3-class problem, you will use the one vs. one method for multiclass classification. Let the discriminant functions be:
g12(x) = −x1 − x2 + 5
g13(x) = −x1 + 3
g23(x) = −x1 + x2 − 1
and gji(x)=−gij(x).
The decision rule is:
x ∈ Sk iff gkj(x) > 0 for all j ≠ k.
Draw the decision boundaries and label decision regions Γi and any indeterminate regions.
Classify the points x = (4,1, 1), (1,5, 3), and (0,0, 1). If there is an indeterminate region, prove it by finding a point that doesn’t get classified according to the above rule. If there is no indeterminate region, so state.
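A hand classification can be cross-checked numerically. The following is a minimal sketch, assuming the decision rule uses a strict inequality (gkj(x) > 0 for all j ≠ k) and that a point satisfying the rule for no class is reported as indeterminate. The names PAIRWISE, g, and classify_ovo are illustrative and not part of the assignment, and the example point is arbitrary.

```python
# Pairwise discriminants g_ij(x) copied from the problem statement.
# x is a 2-element sequence (x1, x2).
PAIRWISE = {
    (1, 2): lambda x: -x[0] - x[1] + 5,
    (1, 3): lambda x: -x[0] + 3,
    (2, 3): lambda x: -x[0] + x[1] - 1,
}


def g(k, j, x):
    """Evaluate g_kj(x), using g_ji(x) = -g_ij(x) for the reversed pair."""
    if (k, j) in PAIRWISE:
        return PAIRWISE[(k, j)](x)
    return -PAIRWISE[(j, k)](x)


def classify_ovo(x, classes=(1, 2, 3)):
    """Return k if g_kj(x) > 0 for all j != k; return None if no class qualifies.

    Assumption: strict inequality, so points on a boundary can be indeterminate.
    """
    for k in classes:
        if all(g(k, j, x) > 0 for j in classes if j != k):
            return k
    return None


# Arbitrary example point (not one of the assigned points).
print(classify_ovo((1.0, 1.0)))
```

Because gji = −gij, at most one class can satisfy the rule at any point, so the order in which classes are tested does not affect the result.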
2. For the wine dataset, code up a nearest-means classifier with the following multiclass approach: one vs. rest. Use the original unnormalized data. Note that the class means should always be defined by the training data. Run the one vs. rest classifier using only the following two features: 1 and 2.
Note that the same guidelines as HW1 apply on coding the classifier(s) yourself vs. using available packages or routines, with one possible exception*.
Give the following:
(a) Classification accuracy on training set and on testing set.
(b) Plots showing each resulting 2-class decision boundary and regions (Sk′ vs. S̄k′, i.e., class k vs. the rest).
(c) A plot showing the final decision boundaries and regions (Γ1, Γ2, Γ3, indeterminate).
Hint 1: For (b) and (c), you can use PlotDecBoundaries(). Modify it if necessary.
Hint 2: *If using Python, you may optionally use scipy.spatial.distance.cdist in calculating Euclidean distance between matrix elements.
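The structure of a one vs. rest nearest-means classifier can be sketched as follows. This is a rough outline under stated assumptions, not a reference solution: the function names (fit_ovr_means, predict_ovr, accuracy) are hypothetical, the "rest" mean is taken as the mean of all training points not in class k, a point is labeled only when exactly one "k vs. rest" classifier claims it (otherwise it is marked indeterminate with -1), and the toy arrays stand in for the wine data, whose file name and layout are not given here.

```python
import numpy as np
from scipy.spatial.distance import cdist


def fit_ovr_means(X, y):
    """For each class k, store (mean of class k, mean of all other classes).

    Means are computed from the training data only.
    """
    return {k: (X[y == k].mean(axis=0), X[y != k].mean(axis=0))
            for k in np.unique(y)}


def predict_ovr(X, means, indeterminate=-1):
    """Assign class k only when exactly one 'k vs. rest' classifier picks k."""
    classes = sorted(means)
    votes = np.zeros((X.shape[0], len(classes)), dtype=bool)
    for col, k in enumerate(classes):
        # Column 0: distance to mean of class k; column 1: distance to rest mean.
        d = cdist(X, np.vstack(means[k]))
        votes[:, col] = d[:, 0] < d[:, 1]
    pred = np.full(X.shape[0], indeterminate)
    single = votes.sum(axis=1) == 1
    pred[single] = np.array(classes)[votes[single].argmax(axis=1)]
    return pred


def accuracy(y_true, y_pred):
    """Correctly classified points divided by total points."""
    return np.mean(y_true == y_pred)


# Toy data (NOT the wine dataset): two features, three classes.
X_train = np.array([[0., 0.], [0., 1.], [5., 0.], [5., 1.], [0., 5.], [1., 5.]])
y_train = np.array([1, 1, 2, 2, 3, 3])

means = fit_ovr_means(X_train, y_train)
print(accuracy(y_train, predict_ovr(X_train, means)))
```

With real data, the same functions would be fit on the training split and then applied unchanged to the test split, after selecting the two feature columns required by the problem.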
3. (a) Derive an expression for the discriminant function g(x) for a 2-class nearest-means classifier, based on Euclidean distance, for class means µ1 and µ2. Is the classifier linear?
(b) Continuing from part (a), for the following class means:
µ1 = [0, −2]ᵀ, µ2 = [0, 1]ᵀ
Plot the decision boundaries and label the decision regions.
(c) Repeat part (a) except for a 3-class classifier, using the maximal value method (MVM): find the three discriminant functions g1(x), g2(x), g3(x), given three class means µ1, µ2, and µ3. Express in simplest form. Is the classifier linear?
(d) Continuing from part (c) using MVM, for the following class means:
Plot the decision boundaries and label the decision regions.
Hint: Refer to Lecture 5 and (upcoming) Lecture 6 if you have trouble with this.
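A hand-derived set of discriminant functions can be checked numerically against the definition of the nearest-means rule. The sketch below compares two labelings on random points: the closest mean by Euclidean distance, and the largest value of a candidate gk(x). The candidate used here, gk(x) = µkᵀx − ½‖µk‖², is one possible simplification obtained by expanding −‖x − µk‖² and dropping the xᵀx term common to all classes; substitute your own result. The three means are illustrative placeholders, and the function names are hypothetical.

```python
import numpy as np


def nearest_mean_labels(points, means):
    """Label each point with the 1-based index of the closest mean."""
    diffs = points[:, None, :] - means[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=2)
    return np.argmin(sq_dist, axis=1) + 1


def mvm_labels(points, discriminants):
    """Label each point with the 1-based index of the largest g_k(x)."""
    scores = np.column_stack([g(points) for g in discriminants])
    return np.argmax(scores, axis=1) + 1


# Illustrative class means (placeholders, one row per class).
means = np.array([[0.0, -2.0], [0.0, 1.0], [2.0, 0.0]])

# Candidate discriminants g_k(x) = mu_k . x - 0.5 * ||mu_k||^2 (an assumption
# to be replaced by whatever form you derive by hand).
discriminants = [lambda P, m=m: P @ m - 0.5 * (m @ m) for m in means]

rng = np.random.default_rng(0)
points = rng.uniform(-5.0, 5.0, size=(500, 2))

agree = np.array_equal(nearest_mean_labels(points, means),
                       mvm_labels(points, discriminants))
print(agree)
```

If the two labelings disagree on many points, the derived gk(x) is probably missing a term or has a sign error.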
4. Extra credit. DHS Problem 5.9. (Note that DHS has a set of “Problems”, and a set of “Computer Exercises”, both at the end of each chapter. This is “Problem” 9 of Chapter 5.)
The problem statement starts “The convex hull of a set of vectors…”. Some versions of the DHS text may have a slightly different numbering of problems, so it’s best to check every time that you are going to solve an assigned problem.
Additional hint: Classify the point x twice, once based on x in the convex hull of S1 data points, and a second time based on x in the convex hull of S2 data points.
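To build intuition before writing the proof, convex hull membership can be tested numerically. A point x lies in the convex hull of vectors y1, …, yn exactly when there are weights αi ≥ 0 with Σ αi = 1 and Σ αi yi = x, which is a linear feasibility problem. The sketch below poses it with scipy.optimize.linprog and a zero objective; the function name in_convex_hull and the sample points are illustrative assumptions, and this is an exploration aid, not a substitute for the proof.

```python
import numpy as np
from scipy.optimize import linprog


def in_convex_hull(x, Y):
    """Return True if x is a convex combination of the rows of Y.

    Solves for alpha >= 0 with sum(alpha) = 1 and Y.T @ alpha = x.
    The objective is zero: only feasibility matters.
    """
    n = Y.shape[0]
    A_eq = np.vstack([Y.T, np.ones((1, n))])
    b_eq = np.append(x, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return bool(res.success)


# Illustrative data: corners of the unit square.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(in_convex_hull(np.array([0.5, 0.5]), square))
print(in_convex_hull(np.array([2.0, 2.0]), square))
```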