EE559 - Homework 5 Solved

1. For the 2-class perceptron with margin algorithm, using basic sequential GD with fixed increment, prove convergence for linearly separable training data by modifying the perceptron convergence proof covered in class.  You may write out the proof, or you may take the 3-page proof from the posted handout (or from lecture) and mark up the proof to show all changes as needed.  If you mark up the existing proof, be sure to mark everything that needs changing (e.g., if a change propagates through the proof, make all of the resulting changes for a complete answer).
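For reference, here is a minimal sketch (in Python) of the algorithm whose convergence the problem asks you to prove.  The names Z, b, and eta are illustrative choices; Z is assumed to hold the augmented, reflected training points (class-2 points negated), so correct classification with margin means w·z > b.

```python
import numpy as np

def perceptron_with_margin(Z, b=1.0, eta=1.0, max_epochs=1000):
    """2-class perceptron with margin: basic sequential GD, fixed increment.

    Z : (N, D) array of augmented, reflected training points
        (class-2 points negated, so correct classification means w @ z > b).
    b : margin, b > 0; setting b = 0 recovers the plain perceptron rule.
    """
    w = np.zeros(Z.shape[1])              # any initial weight vector is allowed
    for _ in range(max_epochs):
        updated = False
        for z in Z:                       # one pass through the data per epoch
            if w @ z <= b:                # misclassified or inside the margin
                w = w + eta * z           # fixed-increment update
                updated = True
        if not updated:                   # every point satisfies w @ z > b
            return w
    return w
```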

2.       You are given the following training data points in three pattern classes S1, S2, and S3:

                 {(0,1,−1,2)}∈S1; {(1,1,1,1),(2,1,1,1)}∈S2; {(−1,1,0,−1)}∈S3
Note that in our notation (throughout this class), for convenience we can write (x1, x2, x3, x4) with commas to denote a column vector (of dimension 4 in this case).

(a)       Find linear discriminant functions that correctly classify the training data, using the multiclass perceptron algorithm with the maximal-value method (given in Discussion 6 and in the posted handout).  Use augmented space (so first augment the data).  There are few enough iterations that this can be done by hand, or you may write code to do it if you prefer (a code sketch is given below, after the starting conditions).

 Use the following assumptions and starting point.  Assume the data points have already been shuffled, so use the training data in the order given above.  Use η(i)= 1  ∀i,  and initial weight vectors:

 

                                                   w^(1)(0) = −1,     w^(2)(0) = 1,     w^(3)(0) = 0.
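The multiclass perceptron with the maximal-value decision rule can be run with a short script such as the sketch below (Python).  Two assumptions are made here and should be checked against the posted handout: the initial conditions w^(1)(0) = −1, w^(2)(0) = 1, w^(3)(0) = 0 are read as constant vectors in the 5-dimensional augmented space, and on an error the update is applied only to the true class and the single strongest competing class.

```python
import numpy as np

# Training data from the problem, augmented with a leading 1, in the given order.
Z = np.array([
    [1.,  0.,  1., -1.,  2.],   # in S1
    [1.,  1.,  1.,  1.,  1.],   # in S2
    [1.,  2.,  1.,  1.,  1.],   # in S2
    [1., -1.,  1.,  0., -1.],   # in S3
])
labels = [0, 1, 1, 2]            # 0-based class indices for S1, S2, S3

# Assumption: initial weight vectors read as constant 5-vectors of -1, 1, 0.
W = np.stack([np.full(5, -1.0), np.ones(5), np.zeros(5)])

eta = 1.0                        # eta(i) = 1 for all i
for epoch in range(100):
    mistakes = 0
    for z, j in zip(Z, labels):
        g = W @ z                                        # g_k(z) = w^(k) . z
        k = max((c for c in range(3) if c != j), key=lambda c: g[c])
        if g[j] <= g[k]:                                 # not strictly maximal: update
            W[j] += eta * z
            W[k] -= eta * z
            mistakes += 1
    if mistakes == 0:            # a full error-free pass through the data
        break

print(W)                         # rows are the final w^(1), w^(2), w^(3)
```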

 

(b)       From this 5-dimensional feature space, consider points that lie in the plane P defined by all x such that x = (1, x1, x2, 0, 0).  Give the decision rule for points (x1, x2) that lie in this plane.  Plot, in 2-space, the decision boundaries and decision regions in plane P.
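For the plot in part (b), only the first three components of each weight vector matter, since x3 = x4 = 0 in plane P: g_k(x1, x2) = w0^(k) + w1^(k)·x1 + w2^(k)·x2.  A hypothetical helper for shading the maximal-value regions over a grid is sketched below; the function name and grid limits are illustrative, and W should be the 3 × 5 array of final weight vectors found in part (a).

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_regions_in_plane_P(W, lim=5.0, n=400):
    """Shade the maximal-value decision regions restricted to plane P.

    W : (3, 5) array whose rows are the final augmented weight vectors
        w^(1), w^(2), w^(3) from part (a).
    In P, x = (1, x1, x2, 0, 0), so g_k(x1, x2) = w0 + w1*x1 + w2*x2 per class.
    """
    x1, x2 = np.meshgrid(np.linspace(-lim, lim, n), np.linspace(-lim, lim, n))
    g = (W[:, 0, None, None]
         + W[:, 1, None, None] * x1
         + W[:, 2, None, None] * x2)          # shape (3, n, n)
    region = np.argmax(g, axis=0)             # winning class index at each grid point
    plt.contourf(x1, x2, region, levels=[-0.5, 0.5, 1.5, 2.5], alpha=0.4)
    plt.xlabel('x1')
    plt.ylabel('x2')
    plt.title('Decision regions in plane P')
    plt.show()
```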

3.    Widrow-Hoff learning.  Starting from the MSE criterion function, derive a learning algorithm using the basic sequential gradient descent technique, as follows:

(a)       Find an expression for J_n(w), and from that derive an expression for ∇_w J_n(w).
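For reference, using reflected augmented data points z_n with target values b_n (the Widrow-Hoff setup), and assuming the MSE criterion is written without a 1/2 factor (which is what makes the hint in part (b) relevant), the per-sample criterion and its gradient take the form

            J_n(w) = (wᵀ z_n − b_n)² ,        ∇_w J_n(w) = 2 (wᵀ z_n − b_n) z_n .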

(b)       Complete the derivation to get the sequential gradient descent algorithm based on MSE.  Your algorithm should be stated as a set of statements and formulas, and should include: any random shuffling of the data, what is allowed for initialization of w(0), the weight update formula, any restriction on η(i), and any formula relating the iteration index i to the data point index n.  You may omit a halting condition because that isn’t part of this homework problem.

 Hint:  Note that if your weight update formula has a positive constant (call it a) that multiplies η(i), you may set η′(i) = aη(i) and then drop the prime, to simplify your final formula.
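A minimal sketch of the resulting algorithm (Python) is given below.  The function name, the fixed number of epochs, and the per-epoch reshuffling scheme are illustrative choices, and the constant 2 from the gradient has already been absorbed into η as the hint suggests; η must still be positive and small enough (e.g., a decreasing schedule) for convergence.

```python
import numpy as np

def widrow_hoff(Z, b, eta=0.01, epochs=50, w0=None, seed=0):
    """Sequential gradient descent on the MSE criterion (Widrow-Hoff / LMS).

    Z   : (N, D) array of augmented (and reflected, for 2 classes) data points z_n
    b   : (N,) array of target values b_n
    eta : learning rate (the factor 2 from the gradient is absorbed here)
    w0  : initial weight vector; any starting value is allowed (zeros by default)
    """
    rng = np.random.default_rng(seed)
    N, D = Z.shape
    w = np.zeros(D) if w0 is None else np.asarray(w0, dtype=float)
    for _ in range(epochs):
        for n in rng.permutation(N):          # random shuffle; n is the data index
            # w(i+1) = w(i) - eta(i) * grad J_n(w(i)) = w(i) + eta(i) (b_n - w.z_n) z_n
            w = w + eta * (b[n] - w @ Z[n]) * Z[n]
    return w
```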
