Natural Language Processing
Homework 1
20% of the course mark
1. Compute the partial derivatives of the ridge regression loss function with respect to the model weights.
For background on the required matrix calculus, you may find the paper “The Matrix Calculus You Need For Deep Learning” (https://arxiv.org/pdf/1802.01528.pdf) helpful.
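One practical way to sanity-check your derivation is a numerical gradient check. The sketch below assumes the common convention L(w) = ||Xw − y||² + λ||w||² (your course slides may scale the terms differently, e.g. by 1/2 or 1/n, which changes the gradient by the same constant) and compares the analytic gradient 2Xᵀ(Xw − y) + 2λw against central finite differences on random toy data:

```python
import numpy as np

# Assumed loss (check against your course's convention):
#   L(w) = ||Xw - y||^2 + lam * ||w||^2
# Analytic gradient: dL/dw = 2 X^T (Xw - y) + 2 lam w
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
w = rng.normal(size=3)
lam = 0.5

def loss(w):
    r = X @ w - y
    return r @ r + lam * (w @ w)

analytic = 2 * X.T @ (X @ w - y) + 2 * lam * w

# Central finite-difference estimate, one coordinate at a time
eps = 1e-6
numeric = np.array([
    (loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
    for e in np.eye(3)
])
max_err = np.abs(analytic - numeric).max()  # should be tiny if the derivation is right
```

If `max_err` is not close to machine precision, revisit the derivation before moving on to question 2, since the same gradient drives all three descent variants there.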
2. Implement gradient descent, stochastic gradient descent, and mini-batch gradient descent for ridge regression using the toolkit Scikit-Learn (https://scikit-learn.org/stable/). Use the California Housing Dataset subset available in this link to demonstrate your algorithms. Use appropriate visualization techniques, like the ones presented in the lectures, to show how well your implementations perform on this dataset (more specifically, that they are neither underfitting nor overfitting). Start by reading chapter 2 of the book “Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow” (https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/).
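The three variants differ only in how many examples feed each gradient step: all of them (batch), one (stochastic), or a small subset (mini-batch). Below is a minimal NumPy sketch of the mini-batch case, assuming the loss (1/|b|)||X_b w − y_b||² + λ||w||²; the function name and all hyperparameter defaults are illustrative, not tuned for the housing data:

```python
import numpy as np

def ridge_minibatch_gd(X, y, lam=0.1, lr=0.01, epochs=200, batch_size=8, seed=0):
    """Mini-batch gradient descent for ridge regression (illustrative sketch).

    Setting batch_size=1 gives stochastic gradient descent and
    batch_size=len(X) gives full-batch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Reshuffle each epoch so batches are not always the same
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            Xb, yb = X[b], y[b]
            # Gradient of (1/|b|)||Xb w - yb||^2 + lam ||w||^2
            grad = 2 * Xb.T @ (Xb @ w - yb) / len(b) + 2 * lam * w
            w -= lr * grad
    return w
```

A useful check while developing is to compare the result against the closed-form ridge solution w = (XᵀX + nλI)⁻¹Xᵀy on synthetic data; the iterates should land close to it for a small enough learning rate.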
3. Develop a sentiment classifier using logistic regression for the Twitter sentiment classification dataset available in this link. Start by reading the relevant chapters 4 and 5 of the “Speech and Language Processing” book by Jurafsky and Martin (http://web.stanford.edu/~jurafsky/slp3/), or any other relevant literature you can find. You should use the toolkit Scikit-Learn again, and you should evaluate your classifier using precision, recall, and F-measure.
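The Scikit-Learn pieces involved are a text vectorizer, the `LogisticRegression` estimator, and the evaluation metrics. The sketch below wires them together on a tiny made-up corpus (NOT the Twitter dataset from the assignment) just to show the shape of the pipeline; feature choices such as TF-IDF are one reasonable option, not a requirement:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_fscore_support

# Tiny illustrative corpus (stand-in for the real Twitter data)
train_texts = ["i love this", "great movie", "happy day", "what a wonderful time",
               "i hate this", "terrible movie", "sad day", "what an awful time"]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]   # 1 = positive, 0 = negative
test_texts = ["love the movie", "awful sad time"]
test_labels = [1, 0]

# Turn raw text into TF-IDF feature vectors
vec = TfidfVectorizer()
Xtr = vec.fit_transform(train_texts)
Xte = vec.transform(test_texts)

# Fit the logistic regression classifier
clf = LogisticRegression()
clf.fit(Xtr, train_labels)

# Evaluate with the metrics the assignment asks for
pred = clf.predict(Xte)
prec, rec, f1, _ = precision_recall_fscore_support(
    test_labels, pred, average="binary")
```

On the real dataset you would add proper preprocessing for tweets (tokenization, handling of hashtags and mentions) and report the metrics on a held-out test split; `classification_report` from `sklearn.metrics` prints precision, recall, and F-measure per class in one call.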
Note: You should hand in: (i) a PDF with the answer to question 1 and a detailed explanation of your solutions for questions 2 and 3, including citations to any relevant literature you used in developing your solutions; (ii) two Colab notebooks (ipynb files, using https://colab.research.google.com/) containing your code for questions 2 and 3, respectively. You should use Python 3.6 or later.