CS571 Assignment 4: News Headlines Classification using a Naive Bayes Classifier

1.     Problem statement: Given a news headline, the objective is to predict the category of the news item. (Note: use only the headline as input when predicting the category.) For example:

               Short description: …

               Headline: Why Keeping a Food Journal Is Better Than Going on a Diet

               Date: …

               Link: …

               Authors: …

               Category: HEALTHY LIVING

 

Consider the following categories only: Business, Comedy, Sports, Crime, Religion, Healthy Living, Politics.

 

2.     Dataset: news_category_dataset.json 
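
As a rough sketch of how the headlines and labels might be pulled out of the dataset, the snippet below assumes the file holds one JSON object per line with "headline" and "category" keys, and that category labels are upper-case as in the example record. None of this is stated in the assignment, so check the actual file first. The name load_headlines is illustrative only.

```python
import json

# The seven categories named in the assignment, upper-cased on the
# assumption that labels look like "HEALTHY LIVING" in the example record.
TARGET_CATEGORIES = {
    "BUSINESS", "COMEDY", "SPORTS", "CRIME",
    "RELIGION", "HEALTHY LIVING", "POLITICS",
}


def load_headlines(lines):
    """Return (headlines, labels) for records in the target categories.

    `lines` is any iterable of JSON strings, such as an open file handle.
    Assumed layout: one JSON object per line with "headline" and
    "category" keys (an assumption, not confirmed by the assignment).
    """
    headlines, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        category = record.get("category", "").strip().upper()
        headline = record.get("headline", "").strip()
        # Keep only the headline and label; other fields are not used.
        if category in TARGET_CATEGORIES and headline:
            headlines.append(headline)
            labels.append(category)
    return headlines, labels


# Small in-memory sample; the second record is made up for illustration.
sample = [
    '{"category": "HEALTHY LIVING", "headline": "Why Keeping a Food Journal Is Better Than Going on a Diet"}',
    '{"category": "TRAVEL", "headline": "A hypothetical travel headline"}',
]
headlines, labels = load_headlines(sample)
```

With the real file, the same function could be called on an open handle, for example inside `with open("news_category_dataset.json", encoding="utf-8") as f:`.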

3.     Classification Algorithm:  

Naive Bayes 
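
The assignment does not say which Naive Bayes variant to use. The following is a minimal sketch of one common choice, a multinomial model over token counts with add-one (Laplace) smoothing. The class name, the alpha parameter and the toy training data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, docs, labels):
        # docs: list of token lists; labels: list of category strings.
        self.class_docs = Counter(labels)        # documents per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """Return log P(category) + sum of log P(token | category)."""
        scores = {}
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            score = math.log(n_docs / self.total_docs)  # log prior
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore tokens never seen in training
                score += math.log(
                    (counts[tok] + self.alpha)
                    / (total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy data, made up for illustration only.
train_docs = [
    ["stocks", "rise"],
    ["team", "wins", "match"],
    ["market", "stocks", "fall"],
    ["coach", "team", "loses"],
]
train_labels = ["BUSINESS", "SPORTS", "BUSINESS", "SPORTS"]
model = MultinomialNB().fit(train_docs, train_labels)
```

Working in log space avoids underflow when many small probabilities are multiplied, and the smoothing term keeps a token that is unseen in one category from driving that category's probability to zero.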

 

4.     Features: 

Train the classifier using the following features.

a.     Bag-of-words 

b.    TF-IDF 

c.     Create your own custom feature vectors. 

For example, a feature vector can contain the following features:

1.     Current word (Unigram)

2.     POS tag of current word

3.     Position of the word

4.     Length of the news instance
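
One possible way to turn the four example features into something a count-based classifier can consume is to encode each as a string token. This is only a sketch: the function name, the feature string format and the pos_tagger argument (a hypothetical callable mapping a token list to an equally long list of tags) are assumptions, and without a tagger every tag is set to the placeholder "UNK".

```python
def custom_feature_tokens(headline, pos_tagger=None):
    """Encode a headline as a list of string features.

    pos_tagger is a hypothetical callable: list of tokens -> list of tags.
    When it is None, every tag is the placeholder "UNK".
    """
    tokens = headline.split()
    if pos_tagger is not None:
        tags = pos_tagger(tokens)
    else:
        tags = ["UNK"] * len(tokens)

    # Length of the news instance, measured in whitespace tokens.
    feats = [f"len={len(tokens)}"]
    for position, (word, tag) in enumerate(zip(tokens, tags)):
        word = word.lower()
        feats.append(f"word={word}")             # current word (unigram)
        feats.append(f"pos={tag}")               # POS tag of current word
        feats.append(f"word@{position}={word}")  # word tied to its position
    return feats


example = custom_feature_tokens("Food Journal")
```

Each returned string can then be counted like an ordinary word, so the same Naive Bayes training code works for the custom model.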

 

Here, a total of 3 models need to be trained: one using Bag-of-words features, one using TF-IDF features, and one using custom feature vectors.
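
To make the difference between the Bag-of-words and TF-IDF features concrete, here is a small sketch that computes both from raw headlines. It assumes lower-cased whitespace tokenization and one specific TF-IDF variant (raw term count times log(N / document frequency)); other weightings exist, and the assignment does not prescribe one.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lower-case and split on whitespace.
    return text.lower().split()


def bag_of_words(docs):
    """Return one Counter of raw token counts per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    """Return one {token: weight} dict per document.

    weight = term count in the document * log(N / document frequency),
    where N is the number of documents.
    """
    bows = bag_of_words(docs)
    n_docs = len(bows)
    doc_freq = Counter()
    for bow in bows:
        doc_freq.update(bow.keys())  # count each token once per document
    return [
        {tok: count * math.log(n_docs / doc_freq[tok])
         for tok, count in bow.items()}
        for bow in bows
    ]


# Toy headlines, made up for illustration only.
docs = ["stocks rise", "stocks fall fall"]
bows = bag_of_words(docs)
weights = tf_idf(docs)
```

Under this variant a token that appears in every document gets weight 0, which shows how TF-IDF discounts words that do not help distinguish categories.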

 

For more information on feature selection, refer to the following paper:

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002).

 

5.     Evaluation: 

Perform 3-fold cross-validation for each model and report:

a.     Overall precision, recall and F1-score

b.    Category-wise precision, recall and F1-score
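
The evaluation could be sketched as below. Several choices here are assumptions rather than requirements of the assignment: folds are formed by taking every k-th item without shuffling, predictions from all folds are pooled before scoring, and "overall" is read as the macro-average over categories. The train_and_predict argument is a hypothetical callable that trains on the training fold and returns predictions for the test items.

```python
def kfold_indices(n, k=3):
    """Yield (train_idx, test_idx); fold i holds items with index % k == i."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


def per_category_scores(gold, pred):
    """Return {category: (precision, recall, f1)}."""
    scores = {}
    for cat in sorted(set(gold) | set(pred)):
        pairs = list(zip(gold, pred))
        tp = sum(g == cat and p == cat for g, p in pairs)
        fp = sum(g != cat and p == cat for g, p in pairs)
        fn = sum(g == cat and p != cat for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[cat] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall and F1 over categories."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


def cross_validate(items, labels, train_and_predict, k=3):
    """Pool predictions over k folds, then score them per category.

    train_and_predict(train_items, train_labels, test_items) is a
    hypothetical callable returning one predicted label per test item.
    """
    gold, pred = [], []
    for train, test in kfold_indices(len(items), k):
        fold_pred = train_and_predict(
            [items[i] for i in train],
            [labels[i] for i in train],
            [items[i] for i in test],
        )
        gold.extend(labels[i] for i in test)
        pred.extend(fold_pred)
    return per_category_scores(gold, pred)
```

On real data it is usually worth shuffling (or stratifying by category) before splitting, since the dataset may be ordered by date or category.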

 
