COMP551: Optimization and Text Classification

Background
The goal of this project is twofold. First, we want you to gain insight into the optimization process of gradient-based methods: this section asks you to implement several variations of gradient descent and analyze their impact on convergence speed. In the second part of the assignment, you will experiment with text data. That section encourages you to explore text-specific preprocessing techniques and to reapply concepts, such as cross-validation, that are central to machine learning. In both sections, you will perform binary classification with logistic regression.

Part 1: Optimization
In this section you will use the diabetes dataset, which you can find in the assignment folder. You do not need to perform any data cleaning or preprocessing; you can use the dataset as-is. We encourage you to build on the Logistic Regression notebook, which you can find here. For each bullet point, include clear and concise plots to support your discussion.

1.    Start by running the logistic regression code using the given implementation; this will serve as a baseline for the following steps. Find a learning rate and a number of training iterations such that the model has fully converged to a solution. Provide empirical evidence supporting your choice (e.g. training and validation accuracy as a function of the number of training iterations).
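A minimal full-batch baseline might look like the sketch below. It is not the course notebook's implementation; the function names are illustrative, and the synthetic data stands in for the diabetes dataset. Tracking accuracy per iteration gives you the convergence curve the question asks for.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_gd(X, y, lr=0.1, n_iters=1000):
    """Full-batch gradient descent for logistic regression.

    Returns the learned weights and the per-iteration training
    accuracy, which you can plot to justify convergence.
    """
    n, d = X.shape
    w = np.zeros(d)
    history = []
    for _ in range(n_iters):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n      # gradient of the mean cross-entropy loss
        w -= lr * grad
        history.append(np.mean((p > 0.5) == y))
    return w, history

# Synthetic stand-in data (replace with the diabetes dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w, history = fit_logistic_gd(X, y, lr=0.5, n_iters=500)
```

Plotting `history` (and the same quantity on a validation split) against the iteration index is one way to show the model has stopped improving.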

2.    Implement mini-batch stochastic gradient descent. Then, using growing mini-batch sizes (e.g. 8, 16, 32, ...), compare the convergence speed and the quality of the final solution to the fully batched baseline. Which configuration works best among the ones you tried?
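One way to structure the mini-batch variant is sketched below (again with illustrative names and synthetic stand-in data): shuffle once per epoch, then update on consecutive slices of the shuffled indices.

```python
import numpy as np

def fit_logistic_sgd(X, y, lr=0.1, batch_size=32, n_epochs=50, seed=0):
    """Mini-batch SGD for logistic regression.

    Each epoch shuffles the data, then performs one gradient update
    per mini-batch. batch_size = n recovers full-batch descent.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            p = 1.0 / (1.0 + np.exp(-(X[idx] @ w)))
            grad = X[idx].T @ (p - y[idx]) / len(idx)
            w -= lr * grad
    return w

# Synthetic stand-in data for a quick sanity check.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w = fit_logistic_sgd(X, y, lr=0.5, batch_size=16, n_epochs=30)
```

When comparing batch sizes, keep the comparison fair: either fix the number of epochs (equal passes over the data) or fix the number of updates, and say which you chose.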

3.    Add momentum to the gradient descent implementation. Trying multiple values for the momentum coefficient, compare it to regular gradient descent. Specifically, analyze the impact of momentum on the convergence speed and the quality of the final solution.
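The heavy-ball form of momentum is one common choice and is a small change to the update rule: keep a velocity vector that accumulates an exponentially decaying sum of past gradients. The sketch below uses illustrative names and synthetic stand-in data.

```python
import numpy as np

def fit_logistic_momentum(X, y, lr=0.1, beta=0.9, n_iters=500):
    """Full-batch gradient descent with heavy-ball momentum.

    v accumulates past gradients; beta = 0 recovers plain
    gradient descent, so the comparison in the question is direct.
    """
    n, d = X.shape
    w = np.zeros(d)
    v = np.zeros(d)
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y) / n
        v = beta * v + grad        # decaying sum of past gradients
        w -= lr * v
    return w

# Synthetic stand-in data for a quick sanity check.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w = fit_logistic_momentum(X, y, lr=0.1, beta=0.9, n_iters=300)
```

Note that momentum effectively scales the step size by roughly 1/(1 - beta), so when sweeping beta you may also need to adjust the learning rate.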

4.    Repeat the previous step for (a) the smallest and (b) the largest batch size you tried in step 2. In which setting (small mini-batch, large mini-batch, fully batched) is momentum most and least effective?

Part 2: Text Classification
In this part, you will use the fake news dataset. The goal is to detect which articles are generated by a computer and which are written by humans. The dataset has already been split into training, validation, and test sets. No preprocessing has been applied. A good place to start is the sklearn text data tutorial. For this part, we recommend using sklearn's Logistic Regression as your base model.

Get a basic version working. This includes building a preprocessing pipeline that maps raw text to features on which you can train a model. You can then go above and beyond, for example by seeing whether you can achieve more than 80% accuracy on the test set.
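A minimal pipeline might chain a TF-IDF vectorizer with sklearn's LogisticRegression, as sketched below. The toy corpus is illustrative only; the vectorizer settings (e.g. `ngram_range`) are starting points to tune with cross-validation, not recommended values.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

def build_text_pipeline():
    """Raw text -> TF-IDF features -> logistic regression classifier."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])

# Toy corpus standing in for the fake news dataset (0 = human, 1 = generated).
texts = [
    "the reporter filed a real news story",
    "careful human reporting on the real story",
    "generated fake text with repetitive phrasing",
    "fake content generated by a language model",
]
labels = [0, 0, 1, 1]

pipe = build_text_pipeline()
pipe.fit(texts, labels)
preds = pipe.predict(texts)
```

Because the whole pipeline is a single estimator, it can be dropped directly into `GridSearchCV` to cross-validate both vectorizer and classifier hyperparameters together.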
