In this assignment, you will implement a fully connected neural network with one hidden layer in Python, starting from the skeleton code mlp_skeleton.py on Canvas (see the Files tab). The skeleton requires you to write the linear transformation, ReLU, and sigmoid cross-entropy layers as separate classes. You may add to the skeleton code as long as you follow its class structure.
Given N training examples in 2 categories, $\{(x_1, y_1), \dots, (x_N, y_N)\}$, your code should implement backpropagation using the cross-entropy loss (see Assignment 1 for the formula) on top of a sigmoid layer (e.g. $p(y = 1 \mid x) = 1 / (1 + e^{-f(x)})$), where you should train for an output $f(x) = w_2^\top g(W_1 x + b_1) + b_2$. Here $g(z) = \max(z, 0)$ is the ReLU activation function (note that Assignment #1 used a sigmoid activation, but here it is ReLU), and $W_1$ is a matrix with the number of rows equal to the number of hidden units and the number of columns equal to the input dimensionality.
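As a rough illustration of the layer structure described above, the forward pass could be sketched as follows. This is only a sketch under assumptions: the class names LinearTransform, ReLU, and SigmoidCrossEntropy, the forward method, and the helper forward_loss are hypothetical and may not match the names in mlp_skeleton.py; labels are assumed to be 0/1, and the loss is averaged over the batch.

```python
import numpy as np

class LinearTransform:
    """Affine layer: maps a batch x of shape (n, in_dim) to x @ W.T + b."""
    def __init__(self, W, b):
        self.W = W  # shape (out_dim, in_dim): rows = units, columns = inputs
        self.b = b  # shape (out_dim,)

    def forward(self, x):
        return x @ self.W.T + self.b

class ReLU:
    """Elementwise max(z, 0)."""
    def forward(self, z):
        return np.maximum(z, 0.0)

class SigmoidCrossEntropy:
    """Sigmoid followed by binary cross-entropy, averaged over the batch."""
    def forward(self, f, y):
        # -[y log s(f) + (1 - y) log(1 - s(f))] = log(1 + e^f) - y f,
        # computed as max(f, 0) + log(1 + e^-|f|) - y f to avoid overflow.
        per_example = np.maximum(f, 0.0) + np.log1p(np.exp(-np.abs(f))) - y * f
        return per_example.mean()

def forward_loss(x, y, hidden, relu, output, criterion):
    """Run x through hidden layer -> ReLU -> output layer -> loss."""
    f = output.forward(relu.forward(hidden.forward(x)))  # shape (n, 1)
    return criterion.forward(f, y)
```

Folding the sigmoid into the loss, rather than applying it as a separate layer, keeps the computation stable when the network output is large in magnitude.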
Please put the report (in PDF) and the source code into the same zip file, "firstname_lastname_hw2.zip", and submit this zip file on Canvas. Make sure your code runs and produces reasonable results!
Complete the project above and write a report (in PDF) that addresses the following questions:
Write a function that evaluates the trained network (5 points) and computes the subgradients of the loss with respect to the weights of both layers using backpropagation (5 points).
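One possible shape for the backpropagation step is sketched below in a single function, so that the chain rule is visible end to end; in the actual submission this logic would be split across the layer classes. The function name loss_and_grads and the parameter names W1, b1, w2, b2 are assumptions for illustration, labels are assumed to be 0/1, and the loss is assumed to be averaged over the batch. At the ReLU kink (pre-activation exactly 0) the sketch picks the subgradient 0.

```python
import numpy as np

def loss_and_grads(W1, b1, w2, b2, x, y):
    """Mean sigmoid cross-entropy loss and its (sub)gradients.

    W1: (h, d), b1: (h,), w2: (h,), b2: scalar, x: (n, d), y: (n,) in {0, 1}.
    """
    n = x.shape[0]
    # Forward pass.
    z = x @ W1.T + b1                 # hidden pre-activations, (n, h)
    a = np.maximum(z, 0.0)            # ReLU activations, (n, h)
    f = a @ w2 + b2                   # network output before the sigmoid, (n,)
    loss = np.mean(np.maximum(f, 0.0) + np.log1p(np.exp(-np.abs(f))) - y * f)

    # Backward pass. d(loss)/d(f) = (sigmoid(f) - y) / n.
    p = 0.5 * (1.0 + np.tanh(0.5 * f))  # sigmoid(f), written to avoid overflow
    df = (p - y) / n                    # (n,)
    grad_w2 = a.T @ df                  # (h,)
    grad_b2 = df.sum()
    da = np.outer(df, w2)               # gradient w.r.t. activations, (n, h)
    dz = da * (z > 0)                   # ReLU passes gradient only where z > 0
    grad_W1 = dz.T @ x                  # (h, d)
    grad_b1 = dz.sum(axis=0)            # (h,)
    return loss, (grad_W1, grad_b1, grad_w2, grad_b2)
```

A finite-difference comparison on a few randomly chosen weights is a cheap way to check the analytic gradients before training on the full dataset.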
Write a function that performs stochastic minibatch gradient descent training (5 points). You may use the deterministic approach of permuting the sequence of the data. Use the momentum approach described in the course slides.
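The training loop might look roughly like the sketch below. It assumes the classical momentum update (velocity = momentum * velocity - lr * gradient, then parameter += velocity); the course slides may define momentum slightly differently, so check them before relying on this form. The function names minibatch_indices, momentum_step, and train, and the grad_fn callback, are hypothetical.

```python
import numpy as np

def minibatch_indices(n, batch_size, rng):
    """Permute 0..n-1 once, then yield consecutive index chunks."""
    perm = rng.permutation(n)
    for start in range(0, n, batch_size):
        yield perm[start:start + batch_size]

def momentum_step(params, grads, velocities, lr, momentum):
    """Classical momentum update, applied in place to NumPy arrays."""
    for p, g, v in zip(params, grads, velocities):
        v *= momentum
        v -= lr * g
        p += v

def train(params, grad_fn, x, y, epochs=10, batch_size=32,
          lr=0.01, momentum=0.9, seed=0):
    """Minibatch SGD with momentum.

    params:  list of float NumPy arrays, updated in place.
    grad_fn: callable (params, x_batch, y_batch) -> list of gradients,
             one per parameter and with matching shapes.
    """
    rng = np.random.default_rng(seed)
    velocities = [np.zeros_like(p) for p in params]
    for _ in range(epochs):
        # A fresh permutation per epoch; every example is visited exactly once.
        for idx in minibatch_indices(len(x), batch_size, rng):
            grads = grad_fn(params, x[idx], y[idx])
            momentum_step(params, grads, velocities, lr, momentum)
    return params
```

Passing the gradient computation in as grad_fn keeps the optimizer independent of the network, so the same loop can be reused while tuning the number of hidden units, learning rate, minibatch size, and momentum.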
Train the network on the attached 2-class dataset extracted from CIFAR-10 (the data can be found in the cifar2classpyzip file on Canvas). The data has 10,000 training examples in 3072 dimensions and 2,000 testing examples. For this assignment, simply treat the dimensions as uncorrelated with one another. Train on all the training examples and tune your parameters (number of hidden units, learning rate, minibatch size, momentum) until you reach good performance on the testing set. What accuracy can you achieve? (20 points based on the report).
Training Monitoring: For each epoch in training, your function should evaluate the training objective, testing objective, training misclassification error rate (the error is 1 for each misclassified example and 0 for each correct one), and testing misclassification error rate (5 points).
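The per-epoch quantities above could be computed with a helper along these lines, called once on the training set and once on the testing set. This is a sketch under assumptions: the name objective_and_error is hypothetical, f denotes the network output before the sigmoid, labels are 0/1, and the objective is reported as a mean rather than a sum over examples.

```python
import numpy as np

def objective_and_error(f, y):
    """Mean sigmoid cross-entropy and misclassification rate.

    f: network outputs before the sigmoid, shape (n,).
    y: labels in {0, 1}, shape (n,).
    """
    # Stable form of log(1 + e^f) - y f.
    loss = np.mean(np.maximum(f, 0.0) + np.log1p(np.exp(-np.abs(f))) - y * f)
    # sigmoid(f) > 0.5 exactly when f > 0, so thresholding f at 0 suffices.
    pred = (f > 0).astype(y.dtype)
    error = np.mean(pred != y)  # 1 per misclassified example, 0 otherwise
    return float(loss), float(error)
```

Test accuracy is then simply one minus the testing error rate.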
Tuning Parameters: Please create three figures and save them in JPG format: i) test accuracy for different batch sizes, ii) test accuracy for different learning rates, iii) test accuracy for different numbers of hidden units.
Discussion: Discuss the performance of your neural network.