1. In the parts below, you will set up the primal Lagrangian and derive the dual Lagrangian for a support vector machine classifier, for the case of data that is separable in the expanded feature space.
For SVM learning, we stated the optimization problem for the linearly separable (in $u$-space) case as:
Minimize $J(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2$
s.t. $z_i(\mathbf{w}^T\mathbf{u}_i + w_0) - 1 \geq 0 \quad \forall i$
Assume our data is linearly separable in the expanded feature space. Also, nonaugmented notation is used throughout this problem.
(a) If the above set of constraints (second line of equations above) is satisfied, will all the training data be correctly classified?
(b) Write the Lagrangian function $L(\mathbf{w}, w_0, \boldsymbol{\lambda})$ for the minimization problem stated above. Use $\lambda_i$, $i = 1, 2, \ldots, N$, for the Lagrange multipliers. Also state the KKT conditions. (Hint: there are 3 KKT conditions.)
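For reference, here is a generic sketch of the KKT setup for an inequality-constrained minimization (minimize $f(x)$ subject to $g_i(x) \geq 0$); adapt this pattern to the SVM objective and constraints above:

$$L(x, \boldsymbol{\lambda}) = f(x) - \sum_i \lambda_i g_i(x), \qquad \lambda_i \geq 0, \quad g_i(x) \geq 0, \quad \lambda_i\, g_i(x) = 0 \;\; \forall i.$$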
(c) Derive the dual Lagrangian $L_D$, by proceeding as follows:
(i) Minimize $L$ w.r.t. the weights.
Hint: solve $\nabla_{\mathbf{w}} L = 0$ for the optimal weight $\mathbf{w}^*$ (in terms of $\lambda_i$ and other variables); and set $\partial L / \partial w_0 = 0$ and simplify.
(ii) Substitute your expressions from part (i) into $L$, and use your expression from $\partial L / \partial w_0 = 0$ as a new constraint, to derive $L_D$ as:

$$L_D = -\frac{1}{2}\left[\sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j z_i z_j \mathbf{u}_i^T \mathbf{u}_j\right] + \sum_{i=1}^{N} \lambda_i$$
subject to the (new) constraint $\sum_{i=1}^{N} \lambda_i z_i = 0$. Also give the other two KKT conditions on $\lambda_i$, which carry over from the primal form.
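As a consistency check for part (i), the stationarity conditions should reduce to the standard hard-margin SVM form (a sketch for verification, not a replacement for your own derivation):

$$\nabla_{\mathbf{w}} L = \mathbf{w} - \sum_{i=1}^{N} \lambda_i z_i \mathbf{u}_i = 0 \;\Rightarrow\; \mathbf{w}^* = \sum_{i=1}^{N} \lambda_i z_i \mathbf{u}_i, \qquad \frac{\partial L}{\partial w_0} = -\sum_{i=1}^{N} \lambda_i z_i = 0.$$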
2. In this problem you will do SVM learning on 2 data points, using the result from Problem 1 above.
You are given a training data set consisting of 2 data points, in a 2-class problem. In the expanded feature space, the data points are:
$$\mathbf{u}_1 = \begin{bmatrix} -1 \\ 0 \end{bmatrix} \in S_1; \qquad \mathbf{u}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \in S_2.$$
(a) Solve, by hand, for the SVM decision boundary and regions. To do this, use Lagrangian techniques to optimize $L_D$ w.r.t. $\boldsymbol{\lambda}$ subject to its equality constraint $\sum_{i=1}^{N} \lambda_i z_i = 0$, in order to find the optimal weight vector $\mathbf{w}^*$, and also find the optimal $w_0^*$. Plot the training data and the decision boundary in 2D (nonaugmented) $u$-space, and show which side of the boundary is positive ($\Gamma_1$).
Hints: Start from:

$$L_D'(\boldsymbol{\lambda}, \mu) = -\frac{1}{2}\left[\sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j z_i z_j \mathbf{u}_i^T \mathbf{u}_j\right] + \sum_{i=1}^{N} \lambda_i + \mu\left(\sum_{i=1}^{N} z_i \lambda_i\right)$$
in which the last term has been added to incorporate the equality constraint stated above. Once finished, check that the resulting $\boldsymbol{\lambda}$ satisfies the KKT conditions on $\lambda_i$ that you stated in Problem 1(c)(ii); one of these can also help you find $w_0^*$. A numerical sanity check is sketched below.
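If you want to cross-check your hand solution, a minimal numerical sketch follows (assuming scikit-learn is available; the hard-margin, separable case is approximated by taking C very large):

    import numpy as np
    from sklearn.svm import SVC

    # Training data in the expanded feature space (nonaugmented).
    U = np.array([[-1.0, 0.0],   # u1, in class S1
                  [ 0.0, 1.0]])  # u2, in class S2
    z = np.array([1, -1])        # labels: S1 -> +1, S2 -> -1

    # A very large C approximates the hard-margin (separable) SVM.
    svm = SVC(kernel="linear", C=1e10)
    svm.fit(U, z)

    print("w* =", svm.coef_[0])                   # optimal weight vector
    print("w0* =", svm.intercept_[0])             # optimal bias
    print("z_i * lambda_i =", svm.dual_coef_[0])  # signed dual variables

The printed dual coefficients are $z_i \lambda_i$ for the support vectors, so they should match the $\lambda_i$ you obtain by hand up to sign.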
(b) Use the data points and solution you found in part (a), and call its decision boundary $H$. Calculate, in 2D $u$-space, the distance between $\mathbf{u}_1$ and $H$, and the distance between $\mathbf{u}_2$ and $H$. Is there any other possible linear boundary in $u$-space that would give larger values for both distances (“margins”) than $H$ gives? (No proof required.)
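For the distance computation, the standard point-to-hyperplane formula applies (a reference sketch; recall that for a hard-margin SVM, the support vectors lie at distance $1/\|\mathbf{w}^*\|$ from $H$):

$$d(\mathbf{u}, H) = \frac{|\mathbf{w}^{*T}\mathbf{u} + w_0^*|}{\|\mathbf{w}^*\|}$$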