Tasks
1 General Questions About Logistic Regression [60 pts]
1. [10 Points] Explain why logistic regression is a discriminative classifier (as opposed to a generative classifier such as Naive Bayes).
2. [10 Points] Recall that the prediction rule for logistic regression is: if p(yj = 1|xj) > p(yj = 0|xj), predict 1; otherwise predict 0. What does the decision boundary of logistic regression look like? Justify your answer (e.g., try to write out the decision boundary as a function of w0, w1, w2, and the features xj,1, xj,2).
3. In this question, we will derive the logistic regression algorithm (the M(C)LE and its gradient). For simplicity, we assume the dataset is two-dimensional. Given a training set {(xi, yi); i = 1, ..., n}, where xi ∈ R2 is a feature vector and yi ∈ {0, 1} is a binary label, we want to find the parameters ŵ that maximize the likelihood of the training set, assuming a parametric model of the form

p(yi = 1 | xi; w) = 1 / (1 + exp(−(w0 + w1xi,1 + w2xi,2))),

where xi,1 and xi,2 denote the two components of xi.
(a) [20 Points] Below, we give a derivation of the conditional log likelihood. In this derivation, provide a short justification for why each line follows from the previous one.
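Assuming the standard Bernoulli likelihood for the model above, the derivation reads as follows (shown here in compact form; the final line is the Expression 8 referred to below):

l(w) = log Πj=1..n p(yj | xj; w)
     = Σj=1..n log p(yj | xj; w)
     = Σj=1..n log [ p(yj = 1 | xj; w)^yj · p(yj = 0 | xj; w)^(1−yj) ]
     = Σj=1..n [ yj log( p(yj = 1 | xj; w) / p(yj = 0 | xj; w) ) + log p(yj = 0 | xj; w) ]
     = Σj=1..n [ yj (w0 + w1xj,1 + w2xj,2) − log(1 + exp(w0 + w1xj,1 + w2xj,2)) ]   (8)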
Next, we will derive the gradient of the previous expression with respect to w0, w1, w2, i.e., ∂l(w)/∂wi, where l(w) denotes the log likelihood from part (a). We will perform a few steps of the derivation, and then ask you to do one step at the end. If we take the derivative of Expression 8 with respect to wi for i ∈ {1,2}, we get the following expression:

∂l(w)/∂wi = ∂/∂wi Σj yj (w0 + w1xj,1 + w2xj,2) − ∂/∂wi Σj log(1 + exp(w0 + w1xj,1 + w2xj,2))   (9)

The blue expression (the first sum) is linear in wi, so its derivative simplifies to Σj yj xj,i. For the red expression (the second sum), we use the chain rule as follows (first considering a single j ∈ [1, n]).
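Writing zj = w0 + w1xj,1 + w2xj,2 as shorthand (a name we introduce here for compactness), the chain rule gives

∂/∂wi log(1 + exp(zj)) = 1/(1 + exp(zj)) · ∂/∂wi (1 + exp(zj))
                       = exp(zj)/(1 + exp(zj)) · ∂zj/∂wi
                       = xj,i · exp(zj)/(1 + exp(zj))   (13)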
(b) [20 Points] Now, use Equation 13 (and the previous discussion) to show that overall, Expression 9, i.e., ∂l(w)/∂wi, is equal to

Σj xj,i (yj − p(yj = 1|xj; w)).
Hint: does Expression 13 look like a familiar probability?
Since the log likelihood is concave, it is easy to optimize using gradient ascent. The final algorithm is as follows. We pick a step size η, and then perform the update

wi ← wi + η · ∂l(w)/∂wi   (simultaneously for each i)

until the change in w is < ϵ.
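For concreteness, here is a minimal NumPy sketch of this procedure (the function and variable names are ours, not part of the starter code, and the default values of η and ϵ are arbitrary):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, eta=0.1, eps=1e-6, max_iters=100000):
    """Gradient ascent on the conditional log likelihood.
    X: (n, 2) array of features; y: (n,) array of 0/1 labels.
    Returns w = [w0, w1, w2], where w0 is the intercept."""
    Xb = np.hstack([np.ones((len(X), 1)), X])  # prepend 1s so w0 is updated uniformly
    w = np.zeros(Xb.shape[1])
    for _ in range(max_iters):
        p = sigmoid(Xb @ w)      # p(yj = 1 | xj; w) for every j
        grad = Xb.T @ (y - p)    # the gradient from part (b)
        w_next = w + eta * grad  # gradient-ascent step
        if np.linalg.norm(w_next - w) < eps:  # stop once the change is < eps
            return w_next
        w = w_next
    return w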
2 Logistic Regression Implementation [40 pts]
In this assignment you will implement simple linear classifiers and run them on the dataset provided with the starter package.
The goal of this assignment is to help you understand the fundamentals of the classic logistic regression method and become familiar with scientific computing tools in Python. You will also get experience in hyperparameter tuning and using proper train/validation/test data splits.
Download the starting code from the package (“Logistic Regression.zip”) provided to you.
The top-level notebook (“Logistic.ipynb”) will guide you through all of the steps. Setup instructions are below. The format of this assignment is inspired by the Stanford CS231n assignments, and we have borrowed some of their data loading and instructions in our assignment IPython notebook.
None of the parts of this assignment require the use of a machine with a GPU. You should be able to complete the assignment using your local machine.
Environment Setup (Local): You will need a Python environment set up with the appropriate packages.
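As a quick sanity check that your environment is ready, you can run something like the following (NumPy, Matplotlib, and IPython/Jupyter are our assumption of the required packages, based on the notebook format; consult the starter package for the definitive list):

import importlib

# NumPy, Matplotlib, and IPython are assumed requirements, not an official list.
for name in ("numpy", "matplotlib", "IPython"):
    try:
        mod = importlib.import_module(name)
        print(f"{name} {getattr(mod, '__version__', '(version unknown)')} OK")
    except ImportError:
        print(f"{name} is missing -- install it before starting the notebook")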
Reporting: Describe the hyperparameter tuning you tried for learning rate and number of epochs. Report the optimal hyperparameter setting you found in the list below. Also report your training, validation, and testing accuracy with your optimal hyperparameter setting.
• Optimal hyperparameters:
• Training accuracy:
• Validation accuracy:
• Test accuracy:
Also, create the following two plots (a plotting sketch is given after the list):
1. Fix the optimal learning rate, then create a plot whose x-axis varies the number of epochs and whose y-axis shows the training, validation, and test accuracy.
2. Fix the optimal number of epochs, then create a plot whose x-axis varies the learning rate and whose y-axis shows the training, validation, and test accuracy.
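Here is a minimal Matplotlib sketch usable for either plot; the function name is ours, and the commented example values are placeholders, not measured results:

import matplotlib.pyplot as plt

def plot_accuracy_curves(x_values, train_acc, val_acc, test_acc, x_label, log_x=False):
    # One curve per data split; all four lists must have the same length.
    plt.figure()
    plt.plot(x_values, train_acc, marker="o", label="train")
    plt.plot(x_values, val_acc, marker="o", label="validation")
    plt.plot(x_values, test_acc, marker="o", label="test")
    if log_x:
        plt.xscale("log")  # learning rates are usually swept on a log scale
    plt.xlabel(x_label)
    plt.ylabel("accuracy")
    plt.legend()
    plt.show()

# Plot 1: vary epochs at the optimal learning rate (placeholder values shown).
# plot_accuracy_curves([10, 50, 100, 200],
#                      [0.80, 0.86, 0.88, 0.89],   # training accuracy
#                      [0.78, 0.84, 0.85, 0.85],   # validation accuracy
#                      [0.77, 0.83, 0.84, 0.84],   # test accuracy
#                      "number of epochs")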
Finally, submit “logistic.py” on Canvas as well.