
CS 273A: Machine Learning

Homework 4

1 Setting up the data

The following snippet of code loads the dataset and shuffles it, before it is split into training and validation data:

import numpy as np
import mltools as ml

# Data Loading
X = np.genfromtxt('data/X_train.txt', delimiter=None)
Y = np.genfromtxt('data/Y_train.txt', delimiter=None)
X, Y = ml.shuffleData(X, Y)

1.    Print the minimum, maximum, mean, and the variance of all of the features. 5 points

2.    Split the dataset into training and validation sets, and rescale each, as:

Xtr, Xva, Ytr, Yva = ml.splitData(X, Y)
Xt, Yt = Xtr[:5000], Ytr[:5000]   # subsample for efficiency (you can go higher)
XtS, params = ml.rescale(Xt)      # Normalize the features
XvS, _ = ml.rescale(Xva, params)  # Normalize the validation features with the training parameters

Print the minimum, maximum, mean, and the variance of the rescaled features. 5 points
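The per-feature statistics and the rescaling step can be sketched with plain NumPy. The block below is a minimal illustration on synthetic data, assuming that rescaling means shifting each feature by the training mean and dividing by the training standard deviation. The helper names `feature_stats` and `standardize` are hypothetical and are not part of the course library, whose `ml.rescale` may differ in detail.

```python
import numpy as np

def feature_stats(X):
    """Return per-feature (column-wise) min, max, mean, and variance."""
    return X.min(axis=0), X.max(axis=0), X.mean(axis=0), X.var(axis=0)

def standardize(X, params=None):
    """Shift and scale columns; pass `params` to reuse the training transform."""
    if params is None:
        params = (X.mean(axis=0), X.std(axis=0))
    mu, sigma = params
    return (X - mu) / sigma, params

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(200, 4))  # synthetic stand-in for Xt
X_valid = rng.normal(loc=5.0, scale=3.0, size=(50, 4))   # synthetic stand-in for Xva

Xs_train, params = standardize(X_train)
Xs_valid, _ = standardize(X_valid, params)

for name, stat in zip(("min", "max", "mean", "var"), feature_stats(Xs_train)):
    print(name, np.round(stat, 3))
```

Under this assumption the rescaled training features have mean 0 and variance 1 exactly, while the validation features only come close, because they reuse the training parameters.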

2 Linear Classifiers
In this problem, you will use an existing implementation of logistic regression, from the last homework, to analyze its performance on the Kaggle dataset.

learner = ml.linearC.linearClassify()
learner.train(XtS, Yt, reg=0.0, initStep=0.5, stopTol=1e-6, stopIter=100)
learner.auc(XtS, Yt)  # train AUC
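For reference, the AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function below is a minimal NumPy sketch of that definition, counting ties as one half. The name `auc_score` is hypothetical, and the library's own `auc` method may handle ties or label encodings differently.

```python
import numpy as np

def auc_score(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

y = np.array([0, 0, 1, 1])
s = np.array([0.1, 0.4, 0.35, 0.8])
print(auc_score(y, s))  # 3 of the 4 pairs are ordered correctly -> 0.75
```

The pairwise comparison uses memory proportional to the number of positive-negative pairs, so it is suited to small checks rather than the full dataset.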

1.    One of the important aspects of using linear classifiers is the regularization. Vary the amount of regularization, reg, in a wide enough range, and plot the training and validation AUC as the regularization weight is varied. Show the plot.

2.    We have also studied the use of polynomial features to make linear classifiers more complex. Add degree 2 polynomial features, print out the number of features, and explain why it is what it is.

3.    Reuse your code that varied regularization to compute the training and validation performance (AUC) for this transformed data. Show the plot. 5 points
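The feature count asked for in part 2 can be checked by direct enumeration. The sketch below assumes the degree-2 expansion keeps the original features and adds all squares and pairwise products, without a constant column. Whether the course library's transform includes cross terms or a constant is an assumption to verify, and the helper `degree2_features` is hypothetical.

```python
import numpy as np
from itertools import combinations_with_replacement

def degree2_features(X):
    """Append x_i * x_j for all i <= j to the original columns."""
    n, d = X.shape
    pairs = list(combinations_with_replacement(range(d), 2))
    products = np.column_stack([X[:, i] * X[:, j] for i, j in pairs])
    return np.hstack([X, products])

X = np.arange(12, dtype=float).reshape(4, 3)  # 4 points, d = 3 features
X2 = degree2_features(X)
d = X.shape[1]
# d original features plus d*(d+1)/2 products (d squares and d*(d-1)/2 cross terms)
print(X2.shape[1], d + d * (d + 1) // 2)
```

Under these assumptions, d input features produce d + d(d+1)/2 columns. Dropping the cross terms would leave 2d.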

3 Nearest Neighbors
In this problem, you will analyze an existing implementation of K-nearest-neighbor classification for the Kaggle dataset. The K-nearest-neighbor classifier implementation supports two hyperparameters: the size of the neighborhood, K, and how much to weigh the distance to the point, α (0 means an unweighted average, and the higher the α, the more heavily the closer neighbors are weighted[1]). Note that you might have to subsample a lot for KNN to be efficient.

learner = ml.knn.knnClassify()
learner.train(XtS, Yt, K=1, alpha=0.0)
learner.auc(XtS, Yt)  # train AUC
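To make the roles of K and α concrete, the sketch below implements a distance-weighted vote for a single query point. It assumes exponential weights exp(-α d) on the Euclidean distance d, which reduce to an unweighted average at α = 0. The exact weighting used by the course library is not stated here, so the weighting form and the name `knn_soft_predict` are illustrative assumptions.

```python
import numpy as np

def knn_soft_predict(X_train, y_train, x_query, K=1, alpha=0.0):
    """Weighted fraction of class-1 labels among the K nearest neighbors."""
    dist = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    nearest = np.argsort(dist)[:K]
    weights = np.exp(-alpha * dist[nearest])  # assumed weighting scheme
    return float((weights * y_train[nearest]).sum() / weights.sum())

X_train = np.array([[0.0], [1.0], [3.0]])
y_train = np.array([1, 0, 0])
x_query = np.array([0.1])

p_flat = knn_soft_predict(X_train, y_train, x_query, K=3, alpha=0.0)   # plain average of labels
p_sharp = knn_soft_predict(X_train, y_train, x_query, K=3, alpha=2.0)  # closest point dominates
print(p_flat, p_sharp)
```

With α = 0 the three neighbors vote equally. With a larger α the nearest neighbor, which has label 1, carries most of the weight.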

1.    Plot the training and validation performance for an appropriately wide range of K, with α = 0.

2.    Do the same with unscaled/original data, and show the plots. 

3.    Since we need to select both the value of K and α, we need to vary both, and see how the performance changes. For a range of both K and α, compute the training and validation AUC (for unscaled or scaled data, whichever you think would be a better choice), and plot them in a two dimensional plot like so:

import matplotlib.pyplot as plt

K = range(1, 10, 1)  # Or something else
A = range(0, 5, 1)   # Or something else
tr_auc = np.zeros((len(K), len(A)))
va_auc = np.zeros((len(K), len(A)))
for i, k in enumerate(K):
    for j, a in enumerate(A):
        tr_auc[i][j] = ...  # train learner using k and a
        va_auc[i][j] = ...
# Now plot it
mat = tr_auc  # matrix to display; use va_auc for the validation plot
f, ax = plt.subplots(1, 1, figsize=(8, 5))
cax = ax.matshow(mat, interpolation='nearest')
f.colorbar(cax)
ax.set_xticklabels([''] + list(A))  # a range must be converted before concatenating
ax.set_yticklabels([''] + list(K))
plt.show()

Show both the plots, and recommend a choice of K and α based on these results. 
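Once the validation grid is filled in, a candidate pair can be read off with an argmax. The sketch below uses a synthetic score surface in place of `va_auc`, built so that its peak location is known. The values are placeholders for illustrating the indexing, not results.

```python
import numpy as np

K = list(range(1, 10))
A = list(range(0, 5))

# Synthetic surface standing in for va_auc; it peaks at K=4, alpha=1 by construction.
va_auc = np.array([[1.0 - 0.01 * (k - 4) ** 2 - 0.02 * (a - 1) ** 2 for a in A]
                   for k in K])

# argmax works on the flattened array, so convert back to (row, column) indices.
i, j = np.unravel_index(np.argmax(va_auc), va_auc.shape)
best_K, best_alpha = K[i], A[j]
print(best_K, best_alpha)
```

The selection should use the validation matrix rather than the training one, since at K = 1 each training point is its own nearest neighbor and the training AUC is typically perfect.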

4 Decision Trees
For this problem, you will be using a similar analysis of hyper-parameters for the decision tree implementation.

There are three hyper-parameters in this implementation that become relevant to its performance: maxDepth, minParent, and minLeaf, where the latter two specify the minimum number of data points necessary to split a node and form a node, respectively.

learner = ml.dtree.treeClassify(Xt, Yt, maxDepth=15)

1.    Keeping minParent=2 and minLeaf=1, vary maxDepth over a range of your choosing, and plot the training and validation AUC.

2.      Plot the number of nodes in the tree as maxDepth is varied (using learner.sz ). Plot another line in this plot by increasing either minParent or minLeaf (choose either, and by how much). 

3.    Set maxDepth to a fixed value, and plot the training and validation performance as the other two hyperparameters are varied over an appropriate range, using the same 2D plot we used for nearest-neighbors. Show the plots, and recommend a choice for minParent and minLeaf based on these results.
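As a sanity check for the node-count plot, a binary tree of depth d can have at most 2^(d+1) - 1 nodes when the root is counted as depth 0. The sketch below tabulates that bound together with a data-dependent cap. The depth convention and the helper `max_nodes` are assumptions for illustration, and the library's `learner.sz` may count nodes differently.

```python
def max_nodes(depth, n_points=None):
    """Upper bound on the node count of a binary tree whose root is at depth 0."""
    bound = 2 ** (depth + 1) - 1  # every level completely filled
    if n_points is not None:
        # With at least one point per leaf there are at most n_points leaves, and a
        # tree in which every split has two children has at most 2*leaves - 1 nodes.
        bound = min(bound, 2 * n_points - 1)
    return bound

for d in (1, 5, 10, 15):
    print(d, max_nodes(d), max_nodes(d, n_points=5000))
```

Real trees are usually smaller than either bound, because splitting also stops when a node is pure or when minParent or minLeaf rules out a split.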

5 Neural Networks
Last, we will explore the use of neural networks for the same Kaggle dataset. Neural networks have many possible hyper-parameters, such as the number of layers, the number of hidden nodes in each layer, the activation function of the hidden units, etc. These don’t even take into account the different hyper-parameters of the optimization algorithm.

nn = ml.nnet.nnetClassify()
nn.init_weights([XtS.shape[1], 5, 2], 'random', XtS, Yt)  # as many layers and nodes as you want
nn.train(XtS, Yt, stopTol=1e-8, stepsize=.25, stopIter=300)

1.    Vary the number of hidden layers and the nodes in each layer (we will assume each layer has the same number of nodes), and compute the training and validation performance. Show 2D plots, like for decision trees and K-NN classifiers, and recommend a network size based on the above.

2.    Implement a new activation function of your choosing, and introduce it as below:

def sig(z): return np.atleast_2d(z)
def dsig(z): return np.atleast_2d(1)
nn.setActivation('custom', sig, dsig)

Compare the performance of this activation function with logistic and htangent, in terms of the training and validation performance.
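As one possible choice of custom activation, the sketch below defines softplus, log(1 + exp(z)), together with its derivative, and checks the pair against a finite-difference slope. Softplus is only an example. The sketch assumes the derivative function is called with the same pre-activation input z as the activation itself, as the identity example suggests.

```python
import numpy as np

def sig(z):
    """Softplus activation, log(1 + exp(z)), computed in a numerically stable way."""
    return np.logaddexp(0.0, np.atleast_2d(z))

def dsig(z):
    """Derivative of softplus with respect to z, which is the logistic function."""
    return 1.0 / (1.0 + np.exp(-np.atleast_2d(z)))

# Finite-difference check that dsig matches the slope of sig.
z = np.linspace(-3.0, 3.0, 7)
eps = 1e-6
numeric = (sig(z + eps) - sig(z - eps)) / (2 * eps)
print(np.max(np.abs(numeric - dsig(z))))  # should be close to zero
```

A mismatch between an activation and its derivative silently corrupts the gradient updates, so a check like this is worth running before handing the pair to `nn.setActivation('custom', sig, dsig)`.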

6 Conclusions

Pick the classifier that you think will perform best, mention all of its hyper-parameter values, and explain the reason for your choice. Train it on as much data as you can, preferably all of X, submit the predictions on Xtest to Kaggle, and include your Kaggle username and leaderboard AUC in the report. Here’s the code to create the Kaggle submission:

Xte = np.genfromtxt('data/X_test.txt', delimiter=None)
learner = ...  # train one using X,Y
Yte = np.vstack((np.arange(Xte.shape[0]), learner.predictSoft(Xte)[:,1])).T
np.savetxt('Y_submit.txt', Yte, '%d, %.2f', header='ID,Prob1', comments='', delimiter=',')
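To preview what the submission file will look like without training anything, the same `np.vstack` and `np.savetxt` calls can be pointed at an in-memory buffer. The probabilities below are placeholders standing in for `learner.predictSoft(Xte)[:,1]`.

```python
import io
import numpy as np

probs = np.array([0.25, 0.75, 0.5])  # placeholder class-1 probabilities
Yte = np.vstack((np.arange(len(probs)), probs)).T  # one (ID, probability) row per test point

buf = io.StringIO()
np.savetxt(buf, Yte, '%d, %.2f', header='ID,Prob1', comments='', delimiter=',')
text = buf.getvalue()
print(text)
```

Because the format string covers both columns, `np.savetxt` ignores the `delimiter` argument and takes the separator from the format string itself, so each data row reads like `0, 0.25`.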