CPTS570 Homework 1

1         Analytical Part
This part will be graded as a PASS or FAIL.

1.    Answer the following questions with a yes or no along with proper justification.

a.    Is the decision boundary of the voted perceptron linear?

b.   Is the decision boundary of the averaged perceptron linear?

2.    In class, we saw the Passive-Aggressive (PA) update, which tries to achieve a margin equal to one after each update. Derive the PA weight update for achieving a margin of M.

3.    Consider the following setting. You are provided with n training examples: (x1,y1,h1),···,(xn,yn,hn), where xi is the input example, yi is the class label (+1 or -1), and hi > 0 is the importance weight of the example. The teacher gave you some additional information by specifying the importance of each training example.

a.    How will you modify the perceptron algorithm to be able to leverage this extra information? Please justify your answer.

b.   How can you solve this learning problem using the standard perceptron algorithm? Please justify your answer. I’m looking for a reduction-based solution.

4.    Consider the following setting. You are provided with n training examples: (x1,y1),(x2,y2),···,(xn,yn), where xi is the input example, and yi is the class label (+1 or -1). However, the training data is highly imbalanced (say 90% of the examples are negative and 10% of the examples are positive) and we care more about the accuracy of positive examples.

a.    How will you modify the perceptron algorithm to solve this learning problem? Please justify your answer.

b.   How can you solve this learning problem using the standard perceptron algorithm? Please justify your answer. I’m looking for a reduction-based solution.

2         Programming and Empirical Analysis Part
1. Programming and empirical analysis question.

Implement a binary classifier with both perceptron and passive-aggressive (PA) weight update as shown below.

Algorithm 1 Online Binary-Classifier Learning Algorithm

 

Input: D = Training examples, T = maximum number of training iterations

Output: w, the final weight vector

1: Initialize the weights w = 0
2: for each training iteration itr ∈ {1,2,···,T} do
3:     for each training example (xt, yt) ∈ D do
4:         ŷt = sign(w · xt) // predict using the current weights
5:         if mistake then
6:             w = w + τ · yt · xt // update the weights
7:         end if
8:     end for
9: end for
10: return final weight vector w

 

For the standard perceptron, you will use τ = 1; for the Passive-Aggressive (PA) algorithm, you will compute the learning rate τ as follows:

τ = (1 − yt (w · xt)) / ‖xt‖²                                                                  (1)
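As a concrete sketch of the binary version of the learning loop, a minimal NumPy implementation of both update rules might look as follows. The function name `train_binary`, the `variant` flag, and the per-iteration mistake counts are my own additions; labels are assumed to already be ±1, and the PA rate is taken as τ = (1 − yt(w · xt)) / ‖xt‖² on a mistake:

```python
import numpy as np

def train_binary(X, y, T, variant="perceptron"):
    """Online binary-classifier learning (sketch of Algorithm 1).

    X: (n, d) array of examples; y: (n,) array of +1/-1 labels.
    variant: "perceptron" uses tau = 1; "pa" uses the PA rate
    tau = (1 - y_t (w . x_t)) / ||x_t||^2 on a mistake.
    """
    n, d = X.shape
    w = np.zeros(d)                       # line 1: initialize weights
    mistakes_per_iter = []
    for _ in range(T):                    # line 2: training iterations
        mistakes = 0
        for x_t, y_t in zip(X, y):        # line 3: pass over D
            y_hat = np.sign(w @ x_t)      # line 4: predict
            if y_hat != y_t:              # line 5: mistake
                if variant == "pa":
                    tau = (1 - y_t * (w @ x_t)) / (x_t @ x_t)
                else:
                    tau = 1.0
                w = w + tau * y_t * x_t   # line 6: update
                mistakes += 1
        mistakes_per_iter.append(mistakes)
    return w, mistakes_per_iter
```

The per-iteration mistake counts are exactly what the online learning curves in the empirical part ask you to plot.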

Implement a multi-class online learning algorithm with both perceptron and passive-aggressive (PA) weight update as shown below. Employ the single weight-vector representation (representation II, as discussed in class). This representation is defined as follows. Each training example is of the form (xt, yt), where xt ∈ ℝd is the input and yt ∈ {1,2,···,k} is the class (output) label. In this representation, you will have a single weight vector w ∈ ℝk·d, and the augmented feature function F(xt, y) ∈ ℝk·d will have k blocks of size d; it will have zeroes everywhere except for the y-th block, which will contain xt.
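The feature function of representation II can be sketched directly (a minimal version; the function name `F` follows the handout's notation, but classes are indexed 0..k−1 here rather than 1..k):

```python
import numpy as np

def F(x, y, k):
    """Augmented feature map for representation II.

    Returns a vector in R^{k*d}: k blocks of size d, all zero
    except the y-th block (classes indexed 0..k-1), which holds x.
    """
    d = x.shape[0]
    phi = np.zeros(k * d)
    phi[y * d:(y + 1) * d] = x
    return phi
```

With this map, a single dot product w · F(x, y) recovers the per-class score, so one long weight vector replaces k separate ones.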

 

Algorithm 2 Online Multi-Class Classifier Learning Algorithm

 

Input: D = Training examples, k = number of classes, T = maximum number of training iterations

Output: w, the final weight vector

1: Initialize the weights w = 0
2: for each training iteration itr ∈ {1,2,···,T} do
3:     for each training example (xt, yt) ∈ D do
4:         ŷt = argmaxy∈{1,2,···,k} w · F(xt, y) // predict using the current weights
5:         if mistake then
6:             w = w + τ · (F(xt, yt) − F(xt, ŷt)) // update the weights
7:         end if
8:     end for
9: end for
10: return final weight vector w

 

For the standard perceptron, you will use τ = 1; for the Passive-Aggressive (PA) algorithm, you will compute the learning rate τ as follows:

τ = (1 − (w · F(xt, yt) − w · F(xt, ŷt))) / ‖F(xt, yt) − F(xt, ŷt)‖²                          (2)
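The multi-class loop with both update rules can be sketched as follows (a minimal version; the names `feature` and `train_multiclass` and the `variant` flag are my own, and classes are indexed 0..k−1 rather than the handout's 1..k):

```python
import numpy as np

def feature(x, y, k):
    """Representation-II feature map: x placed in the y-th of k blocks."""
    d = x.shape[0]
    phi = np.zeros(k * d)
    phi[y * d:(y + 1) * d] = x
    return phi

def train_multiclass(X, y, k, T, variant="perceptron"):
    """Online multi-class learning (sketch of Algorithm 2)."""
    n, d = X.shape
    w = np.zeros(k * d)
    for _ in range(T):
        for x_t, y_t in zip(X, y):
            scores = [w @ feature(x_t, c, k) for c in range(k)]
            y_hat = int(np.argmax(scores))       # predict with current weights
            if y_hat != y_t:                     # mistake
                diff = feature(x_t, y_t, k) - feature(x_t, y_hat, k)
                if variant == "pa":
                    # w @ diff = w.F(x,y_t) - w.F(x,y_hat), so this is Eq. (2)
                    tau = (1 - w @ diff) / (diff @ diff)
                else:
                    tau = 1.0
                w = w + tau * diff               # update the weights
    return w
```

Note that because `feature` is sparse, a practical implementation would update only the yt-th and ŷt-th blocks of w rather than materializing full k·d vectors.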

You will use the Fashion-MNIST data (https://github.com/zalandoresearch/fashion-mnist). There is a fixed training and testing set.

Each example is a 28x28 grayscale image, associated with a label from 10 classes: 0 T-shirt/top, 1 Trouser, 2 Pullover, 3 Dress, 4 Coat, 5 Sandal, 6 Shirt, 7 Sneaker, 8 Bag, 9 Ankle boot.

You will use ONLY training data for training and testing data for evaluation.

5.1 Binary Classification  Learn a binary classifier to separate even labels (0, 2, 4, 6, 8) from odd labels (1, 3, 5, 7, 9).
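The even-versus-odd relabeling can be done in one vectorized step (a minimal sketch; the helper name `to_binary_labels` is my own):

```python
import numpy as np

def to_binary_labels(labels):
    """Map Fashion-MNIST class ids to +1 (even: 0,2,4,6,8) or -1 (odd)."""
    labels = np.asarray(labels)
    return np.where(labels % 2 == 0, 1, -1)
```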

a. Compute the online learning curve for both the Perceptron and PA algorithms by plotting the number of training iterations (1 to 50) on the x-axis and the number of mistakes on the y-axis. Compare the two curves and list your observations.

b. Compute the accuracy of both the Perceptron and PA algorithms on the training data and testing data for 20 training iterations. You will have two accuracy curves for the Perceptron and another two accuracy curves for the PA algorithm. Compare the four curves and list your observations.

c.  Repeat experiment (b) with the averaged perceptron. Compare the test accuracies of the plain perceptron and the averaged perceptron. What did you observe?

d. Compute the general learning curve (vary the number of training examples starting from 5000 in increments of 5000) for 20 training iterations. Plot the number of training examples on the x-axis and the testing accuracy on the y-axis. List your observations from this curve.
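For the averaged-perceptron experiment, one common formulation keeps a running sum of the weight vector after every example and predicts at test time with its average. A minimal sketch under that assumption (the handout does not fix an implementation, and the name `train_averaged` is my own; efficient versions avoid the explicit running sum):

```python
import numpy as np

def train_averaged(X, y, T):
    """Averaged perceptron: train as usual, but return the average of
    all intermediate weight vectors instead of the final one."""
    n, d = X.shape
    w = np.zeros(d)
    w_sum = np.zeros(d)
    for _ in range(T):
        for x_t, y_t in zip(X, y):
            if np.sign(w @ x_t) != y_t:
                w = w + y_t * x_t      # standard perceptron update (tau = 1)
            w_sum = w_sum + w          # accumulate after every example
    return w_sum / (T * n)             # averaged weight vector
```

Averaging typically smooths out the oscillations of the online updates, which is the effect part (c) asks you to look for in the test-accuracy comparison.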

5.2 Multi-Class Classification Learn a multi-class classifier to map images to the corresponding fashion label.

a. Compute the online learning curve for both the Perceptron and PA algorithms by plotting the number of training iterations (1 to 50) on the x-axis and the number of mistakes on the y-axis. Compare the two curves and list your observations.

b. Compute the accuracy of both the Perceptron and PA algorithms on the training data and testing data for 20 training iterations. You will have two accuracy curves for the Perceptron and another two accuracy curves for the PA algorithm. Compare the four curves and list your observations.

c.  Repeat experiment (b) with the averaged perceptron. Compare the test accuracies of the plain perceptron and the averaged perceptron. What did you observe?

d. Compute the general learning curve (vary the number of training examples starting from 5000 in increments of 5000) for 20 training iterations. Plot the number of training examples on the x-axis and the testing accuracy on the y-axis. List your observations from this curve.
