CS60075 Assignment 1 - Language Modeling Solution

Course Name: Natural Language Processing
Platform: Google Colab/Kaggle/Local machine
Task Definition: Language modeling is the task of predicting the next word or character in a document. Models trained this way can be applied to a wide range of natural language tasks such as text generation, classification, and question answering. The common language modeling techniques are N-gram Language Models and Neural Language Models. A model's language modeling capability is measured using cross-entropy and perplexity.
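For reference, the two metrics named above are usually defined as follows. This is the standard textbook form, not something the assignment text spells out; the symbols N (number of tokens) and w_i (the i-th token) are introduced here for illustration, and the natural logarithm is an assumed choice of base.

```latex
H = -\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_1, \dots, w_{i-1}),
\qquad
\mathrm{PPL} = \exp(H)
```

Lower values of both indicate that the model assigns higher probability to the observed text.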
N-gram Language model:
1. Implement an N-gram language model with Laplace smoothing and sentence generation.
2. Write a language model class with the functions below.
i) A smoothing function for applying Laplace smoothing to the n-gram frequency distribution.
iv) A generate-sentence function to generate a sentence using the language model.
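The class described above could be sketched roughly as follows. This is a minimal illustration, not the reference solution: the class name `NGramLM`, the `fit` helper, the boundary markers `<s>` and `</s>`, and the `max_len` and `seed` parameters are all assumptions made for the example, and only the smoothing and sentence-generation functions listed above are covered.

```python
import random
from collections import Counter

BOS, EOS = "<s>", "</s>"  # assumed boundary markers, not given in the assignment


class NGramLM:
    """Sketch of an n-gram language model with Laplace (add-one) smoothing."""

    def __init__(self, n=2):
        self.n = n
        self.ngram_counts = Counter()    # counts of full n-grams
        self.context_counts = Counter()  # counts of (n-1)-word contexts
        self.vocab = set()               # real words plus the end marker

    def fit(self, sentences):
        """Count n-grams over an iterable of tokenized sentences."""
        for tokens in sentences:
            padded = [BOS] * (self.n - 1) + list(tokens) + [EOS]
            self.vocab.update(padded[self.n - 1:])
            for i in range(self.n - 1, len(padded)):
                context = tuple(padded[i - self.n + 1:i])
                self.ngram_counts[context + (padded[i],)] += 1
                self.context_counts[context] += 1

    def smoothing(self, context, word):
        """Laplace-smoothed P(word | context) = (c(context, word) + 1) / (c(context) + V)."""
        context = tuple(context)
        numerator = self.ngram_counts[context + (word,)] + 1
        denominator = self.context_counts[context] + len(self.vocab)
        return numerator / denominator

    def generate_sentence(self, max_len=20, seed=None):
        """Sample words from the smoothed distribution until EOS or max_len."""
        rng = random.Random(seed)
        words = sorted(self.vocab)
        history = [BOS] * (self.n - 1)
        sentence = []
        for _ in range(max_len):
            # Last n-1 words of the history (empty for a unigram model).
            context = history[len(history) - (self.n - 1):]
            weights = [self.smoothing(context, w) for w in words]
            word = rng.choices(words, weights=weights, k=1)[0]
            if word == EOS:
                break
            sentence.append(word)
            history.append(word)
        return sentence


lm = NGramLM(n=2)
lm.fit([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(lm.smoothing(["the"], "cat"))
print(lm.generate_sentence(seed=0))
```

Because add-one smoothing gives every vocabulary word a non-zero probability after any context, sampled sentences from a small corpus tend to look noisy; that is expected behaviour for this smoothing scheme rather than a bug in the sketch.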
Neural Network-Based Language Model:
1. Implement a neural network-based language model and compute the perplexity of each input sentence under the trained model. Split the training data 90%/10%, holding out the 10% as a dev set, and tune the model on that dev set. PyTorch is the preferred framework.
2. Use Word2Vec or GloVe embeddings to initialize the model.
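The data split and the pretrained-embedding initialization described above might be prepared along these lines. This is a framework-free sketch under stated assumptions: the function names `train_dev_split` and `build_embedding_matrix` are hypothetical, `pretrained` is assumed to be a plain dict mapping words to vectors (for example, parsed from a GloVe text file), and the random range for out-of-vocabulary words is an arbitrary choice.

```python
import random


def train_dev_split(sentences, dev_fraction=0.1, seed=0):
    """Shuffle the sentences and hold out dev_fraction of them as a dev set."""
    shuffled = list(sentences)
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]  # (train, dev)


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return one embedding row per vocabulary word, in vocabulary order.

    Words present in `pretrained` copy their vector; all other words get
    small random values (an assumed fallback, not part of the assignment).
    """
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        vector = pretrained.get(word)
        if vector is None or len(vector) != dim:
            vector = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        matrix.append(list(vector))
    return matrix


sentences = [["sentence", str(i)] for i in range(100)]
train, dev = train_dev_split(sentences)
print(len(train), len(dev))

matrix = build_embedding_matrix(["cat", "dog"], {"cat": [0.5, -0.2]}, dim=2)
print(matrix[0])
```

In PyTorch, a matrix built this way can be converted with `torch.tensor(matrix)` and passed to `torch.nn.Embedding.from_pretrained`, with `freeze=False` if the embeddings should keep training.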
Evaluation Metric: Perplexity (for evaluating both models)
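Per-sentence perplexity can be computed from the probabilities a model assigns to each token, whichever of the two models produced them. The helper below is a minimal sketch; the name `sentence_perplexity` and the list-of-probabilities input format are assumptions made for illustration.

```python
import math


def sentence_perplexity(token_probs):
    """Perplexity of one sentence given the model probability of each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-probability (cross-entropy in nats), then exponentiate.
    cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(cross_entropy)


# A model that assigns probability 0.25 to every token behaves like a
# uniform choice among four words.
print(sentence_perplexity([0.25, 0.25, 0.25, 0.25]))
```

Working in log space avoids the underflow that multiplying many small probabilities directly would cause on long sentences.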
Dataset: English
Submission Materials: Python file, Google Drive link to the trained model, and a doc file with your results and observations.
