$25
Dataset Info: The assignment targets document classification on the BBC news dataset. The dataset files have already been converted to a single CSV file for easier processing. The dataset has 2 columns: the first, ‘Article’, is the input for the task, while the second, ‘Class’, assigns the article to one of 5 classes: business, entertainment, politics, sport and tech. You can find more info at http://mlg.ucd.ie/datasets/bbc.htm
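As a rough illustration of the file layout described above, the sketch below reads (article, label) pairs from a CSV with ‘Article’ and ‘Class’ columns using only the standard library. The sample rows and the helper name `load_articles` are made up for illustration; the real file name and exact header spelling are assumptions to check against the provided CSV.

```python
import csv
import io
from collections import Counter

# Hypothetical two-row sample standing in for the real CSV file.
SAMPLE_CSV = '''Article,Class
"Shares rose sharply after the earnings report.",business
"The striker scored twice in the final.",sport
'''

def load_articles(handle):
    """Read (article, label) pairs from a CSV with 'Article' and 'Class' columns."""
    reader = csv.DictReader(handle)
    return [(row["Article"], row["Class"]) for row in reader]

rows = load_articles(io.StringIO(SAMPLE_CSV))

# Quick sanity check on class balance before any splitting.
label_counts = Counter(label for _, label in rows)
```

For the real data, the same function could be called on a file opened with `open(path, newline="", encoding="utf-8")`, where `path` points to the provided CSV.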
What you have to do:
1. Implement a feed-forward neural network for the sequence classification task. The model must have at least 2 hidden layers. Train the model for 5 epochs. Use a seed value of ‘1’ for reproducible results. Evaluate the model on performance metrics (accuracy, precision, recall) and number of parameters. Write your results in a separate file when submitting. The steps are:
a. Preprocess/clean the data (remove stopwords, etc.).
b. The input to the neural network will be the articles, with each token represented by a 100-dimensional vector (i.e. embedding size = 100).
c. Pad the input to the maximum article length or, if you are facing memory issues, fix a max length and truncate the longer sequences.
d. Use a 70:10:20 split for training, validation and testing.
e. Use of a dataloader is optional.
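Steps a, c and d, plus the parameter count asked for in the evaluation, can be sketched framework-free as below. This is a minimal sketch under stated assumptions, not a reference solution: the stopword list is a tiny placeholder, the PAD/UNK ids and all function names are hypothetical, and the parameter count assumes the padded embeddings are flattened before the first dense layer (other designs, such as averaging the embeddings, give different counts).

```python
import random
import re

# Assumption: a tiny illustrative stopword list; a real run would likely use
# a fuller list from an NLP library.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "on"}
PAD, UNK = 0, 1  # reserved token ids (hypothetical convention)

def clean(text):
    """Step a: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def build_vocab(token_lists):
    """Map each distinct token to an integer id, starting after PAD and UNK."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab

def encode(tokens, vocab, max_len):
    """Step c: truncate to max_len, then right-pad with PAD."""
    ids = [vocab.get(t, UNK) for t in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))

def split_70_10_20(items, seed=1):
    """Step d: shuffle with a fixed seed, then cut into train/val/test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10
    n_val = len(items) // 10
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

def count_parameters(vocab_size, max_len, hidden_sizes, n_classes, emb_dim=100):
    """Count weights and biases: embedding table, then dense layers applied
    to the flattened (max_len * emb_dim) input. Flattening is an assumption."""
    total = vocab_size * emb_dim
    fan_in = max_len * emb_dim
    for width in (*hidden_sizes, n_classes):
        total += fan_in * width + width
        fan_in = width
    return total
```

When counting parameters this way, `vocab_size` should include the two reserved ids, i.e. `len(vocab) + 2`. Building the vocabulary from the training split only keeps validation and test tokens from leaking into the model's input space.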
2. Neural networks for classification tasks commonly use ReLU for hidden layers and sigmoid/softmax for the final output. Run your model (created above) with different activation functions (ReLU, softmax/sigmoid, tanh) at their appropriate positions and calculate the performance change for each combination. Specify which works best for your model. Do the same for various optimization algorithms available (SGD,