Problem 1: Recall the naive Bayes model.
● For simplicity, we only consider binary features
● The generative model is Pr[x, y] = Pr[y] ∏_d Pr[x_d | y], where Pr[y] is a categorical distribution over the classes and each Pr[x_d | y] is a Bernoulli distribution.
Here, a Bernoulli distribution parametrized by π means that Pr[X = 1] = π and Pr[X = 0] = 1 − π. It is a special case of the categorical distribution in which only two outcomes are considered.
● Such a model can be used to represent a document in text classification. For example, the target indicates Spam or NotSpam, and each feature indicates whether a word in the vocabulary occurs in the document.
Show that the decision boundary of naive Bayes is also linear.
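A minimal sketch of the argument, assuming two classes 𝑐1 and 𝑐2 and writing θ_{c,d} = Pr[x_d = 1 | y = c] for the Bernoulli parameters (this notation is introduced here for illustration, not taken from the problem): the log-odds of the posterior is linear in x.

```latex
% Log-odds of the naive Bayes posterior for two classes c_1 and c_2.
% theta_{c,d} = Pr[x_d = 1 | y = c] is the assumed Bernoulli parameter notation.
\log\frac{\Pr[y=c_1\mid x]}{\Pr[y=c_2\mid x]}
  = \log\frac{\Pr[y=c_1]}{\Pr[y=c_2]}
  + \sum_{d}\left[
      x_d \log\frac{\theta_{c_1,d}}{\theta_{c_2,d}}
      + (1-x_d)\log\frac{1-\theta_{c_1,d}}{1-\theta_{c_2,d}}
    \right]
  = w^\top x + b
```

Since each binary feature x_d enters only linearly, the decision boundary Pr[y = 𝑐1 | x] = Pr[y = 𝑐2 | x], i.e. w^⊤x + b = 0, is a hyperplane.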
Problem 2: Give a set of parameters for the stack of logistic regression models below that accomplishes the non-linear classification.
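A sketch of one possible parameter setting, assuming the stack in the (unreproduced) figure is the classic two-layer construction for XOR over two binary inputs: two hidden logistic units feeding one output logistic unit. The architecture, the XOR task, and all names below (sigmoid, W_hidden, etc.) are assumptions for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed architecture: two hidden logistic units h1, h2 over inputs
# (x1, x2), then one output logistic unit over (h1, h2).
# Large weights push each sigmoid near 0/1 so each unit acts like a gate.
W_hidden = np.array([[20.0, 20.0],     # h1 ~ OR(x1, x2)
                     [-20.0, -20.0]])  # h2 ~ NAND(x1, x2)
b_hidden = np.array([-10.0, 30.0])
w_out = np.array([20.0, 20.0])         # y ~ AND(h1, h2) = XOR(x1, x2)
b_out = -30.0

def predict(x):
    h = sigmoid(W_hidden @ x + b_hidden)
    return sigmoid(w_out @ h + b_out)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(float(predict(np.array(x, dtype=float))), 3))
# Prints ~0, ~1, ~1, ~0: the XOR labeling, which no single linear model can produce.
```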
Problem 3: Use the above example to show that the optimization of neural networks is non-convex.
Hint: As mentioned in class, you can rename 𝑐1 as 𝑐2 and 𝑐2 as 𝑐1. Then, you will have two models. Interpolating them will give you a very bad model.
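A sketch of the hint in code, assuming 𝑐1 and 𝑐2 name the two hidden units of the Problem 2 stack (an interpretation, since the original figure is not reproduced) and reusing the XOR parameters sketched above. Swapping the hidden units yields a second parameter vector that computes the identical function, hence has identical loss; the midpoint of the two collapses both hidden units into one and predicts badly. A convex loss surface would forbid the midpoint loss from exceeding the average of the endpoint losses.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(params, X, y):
    # Mean squared error of the assumed two-layer stack from Problem 2.
    W, b, w, c = params
    preds = np.array([sigmoid(w @ sigmoid(W @ x + b) + c) for x in X])
    return float(np.mean((preds - y) ** 2))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)  # XOR labels

# Model A: the XOR parameters sketched in Problem 2.
A = (np.array([[20.0, 20.0], [-20.0, -20.0]]),  # hidden weights
     np.array([-10.0, 30.0]),                   # hidden biases
     np.array([20.0, 20.0]),                    # output weights
     -30.0)                                     # output bias
# Model B: the same network with the two hidden units renamed/swapped.
# It computes the identical function, so it has the identical loss.
B = (A[0][::-1].copy(), A[1][::-1].copy(), A[2][::-1].copy(), A[3])
# Midpoint: average every parameter of A and B elementwise.
M = tuple((a + b) / 2 for a, b in zip(A[:3], B[:3])) + ((A[3] + B[3]) / 2,)

print("loss(A)   =", loss(A, X, y))  # ~0
print("loss(B)   =", loss(B, X, y))  # ~0, same function as A
print("loss(mid) =", loss(M, X, y))  # ~0.5: both hidden units have collapsed
# Convexity would require loss(mid) <= (loss(A) + loss(B)) / 2, which fails here,
# so the loss is non-convex in the network parameters.
```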