1 Introduction
In this assignment, you will get hands-on experience with the naive Bayes model and hidden Markov models (HMMs). The programming language for this homework is Python; only Python ≥ 3.0 is allowed.
2 Naive Bayes (60 pts)
In this section, you will implement a naive Bayes classifier. Write your code in the function naive_bayes in task1.py. When grading, we will call the function with different parameters.
• The dataset is a CSV file in which each line is an instance and the last column is that instance's label.
• No external library is allowed for this task.
• For the provided dataset, the accuracy you should achieve is 88.15.
• To compute the probabilities, you need the number of features and the set of possible values of each feature, so you will iterate over the dataset to collect them. Because a feature value may appear in one set but not in the other, iterate over both the training and test sets to collect the values correctly.
• The time limit for this task is 1 minute.
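The steps in the bullets above can be sketched as follows. This is only an illustrative outline, not the expected solution: the function name naive_bayes_sketch, its parameters, and the use of add-alpha (Laplace) smoothing are assumptions that the assignment text does not specify, and the required signature in task1.py may differ. Rows are assumed to be lists of strings already read from the CSV file, with the label in the last position.

```python
import math
from collections import Counter, defaultdict


def naive_bayes_sketch(train_rows, test_rows, alpha=1.0):
    """Categorical naive Bayes; returns test accuracy in percent.

    Hypothetical helper: each row is a list of feature values followed by
    the label. Add-alpha smoothing is an assumption, not a requirement.
    """
    n_features = len(train_rows[0]) - 1

    # Collect every value each feature takes in EITHER set, so the smoothing
    # denominator also accounts for values that occur only in the test set.
    values = [set() for _ in range(n_features)]
    for row in train_rows + test_rows:
        for j in range(n_features):
            values[j].add(row[j])

    # Count labels and (feature index, value) pairs per label on the training set.
    label_counts = Counter(row[-1] for row in train_rows)
    value_counts = defaultdict(Counter)  # label -> Counter over (j, value)
    for row in train_rows:
        for j in range(n_features):
            value_counts[row[-1]][(j, row[j])] += 1

    def predict(features):
        best_label, best_score = None, -math.inf
        for label, n_label in label_counts.items():
            # Work in log space to avoid underflow with many features.
            score = math.log(n_label / len(train_rows))
            for j, v in enumerate(features):
                num = value_counts[label][(j, v)] + alpha
                den = n_label + alpha * len(values[j])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    correct = sum(predict(row[:-1]) == row[-1] for row in test_rows)
    return 100.0 * correct / len(test_rows)
```

Only the standard library is used here (math and collections), which keeps the sketch within the "no external library" rule. Whether the reference accuracy was obtained with this exact smoothing scheme is not stated, so the result on the provided dataset may differ.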
3 Hidden Markov Models (40 pts)
In this section, you will implement the forward and Viterbi algorithms to solve the evaluation and decoding tasks of HMMs. Write your code in the functions forward and viterbi in task2.py. To test your implementation, you can use task2_runner.py.
• As an external library, only NumPy is allowed.
• The provided data consists of NumPy arrays with np.float64 items. Keep using np.float64 precision in your implementations.
• The time limit for this task is 1 minute (running task2_runner.py with all provided data).
• As a reference for HMMs, you can check this link.
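As a rough illustration of the two algorithms, the sketch below shows one common formulation. The function names, argument order, and array conventions are assumptions, not the interface required in task2.py: A[i, j] is taken to be the probability of moving from state i to state j, B[i, k] the probability of emitting observation symbol k in state i, pi the initial state distribution, and O a sequence of integer observation indices.

```python
import numpy as np


def forward_sketch(A, B, pi, O):
    """Evaluation: return P(O | model) with the forward algorithm."""
    alpha = pi * B[:, O[0]]               # alpha_1(i) = pi_i * b_i(o_1)
    for o in O[1:]:
        alpha = (alpha @ A) * B[:, o]     # sum over previous states, then emit
    return alpha.sum()


def viterbi_sketch(A, B, pi, O):
    """Decoding: return (most likely state path, its probability)."""
    T, N = len(O), A.shape[0]
    delta = np.zeros((T, N), dtype=np.float64)  # best path probability so far
    psi = np.zeros((T, N), dtype=np.int64)      # backpointers
    delta[0] = pi * B[:, O[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] * A
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, O[t]]
    # Backtrack from the most probable final state.
    path = np.zeros(T, dtype=np.int64)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()
```

Both functions multiply raw probabilities, so they stay in np.float64 as long as the inputs are np.float64. For long observation sequences the products can underflow; scaling or working in log space is a usual remedy, but whether the provided data requires it is not stated here.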