Problem 1.
Derive the gradients in softmax regression: $\frac{\partial J}{\partial w_{k,i}}$ and $\frac{\partial J}{\partial b_k}$.
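If it helps to sanity-check your derivation, here is a minimal finite-difference sketch, assuming $J$ is the cross-entropy loss of softmax regression on a single sample; the setup and names below are illustrative, not taken from the lecture notes:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def J(W, b, x, t):
    """Assumed loss: cross-entropy -log y_t for input x with true class t."""
    return -np.log(softmax(W @ x + b)[t])

def numeric_grad_W(W, b, x, t, eps=1e-6):
    """Central-difference estimate of dJ/dW, one entry (k, i) at a time.
    The same loop, perturbing entries of b instead, estimates dJ/db."""
    G = np.zeros_like(W)
    for k in range(W.shape[0]):
        for i in range(W.shape[1]):
            Wp, Wm = W.copy(), W.copy()
            Wp[k, i] += eps
            Wm[k, i] -= eps
            G[k, i] = (J(Wp, b, x, t) - J(Wm, b, x, t)) / (2 * eps)
    return G

# Compare whatever gradient you derive against this numeric estimate.
rng = np.random.default_rng(0)
K, D = 3, 4                          # classes, input dimension
W, b = rng.normal(size=(K, D)), rng.normal(size=K)
x, t = rng.normal(size=D), 1
print(numeric_grad_W(W, b, x, t))
```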
Problem 2.
Read the section “Logistic regression vs. softmax” in the lecture notes. It shows that a two-way softmax can be reduced to logistic regression.
Please show that logistic regression can also be reduced to a two-way softmax, i.e., for any parameter setting of the logistic regression model, there exists some parameter setting of the softmax regression model that produces the same predictions.
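For reference, the direction already covered in the lecture notes rests on the identity below, assuming the standard parameterization $z_k = \mathbf{w}_k^\top \mathbf{x} + b_k$ (the notation here is illustrative and may differ from the notes'):

\[
y_1 \;=\; \frac{e^{z_1}}{e^{z_1} + e^{z_2}}
    \;=\; \frac{1}{1 + e^{-(z_1 - z_2)}}
    \;=\; \sigma(z_1 - z_2),
\]

so a two-way softmax computes a logistic sigmoid of the score difference; this problem asks for the converse construction.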
Problem 3.
Consider a $K$-way classification. The predicted probability of a sample is $\mathbf{y} \in \mathbb{R}^K$, where $y_k$ is the predicted probability of the $k$th category. Suppose correctly predicting a sample of category $k$ leads to a utility $u_k$. Incorrect predictions do not have utilities or losses.
Give the decision rule, i.e., a mapping from $\mathbf{y}$ to $\hat{t}$, that maximizes the total expected utility.
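One way to formalize the objective, as an assumption about the intended setup: if the true category is distributed according to $\mathbf{y}$, then predicting $\hat{t} = t$ earns $u_t$ exactly when the true category is $t$ and nothing otherwise, so

\[
\mathbb{E}[\,U \mid \hat{t} = t\,] \;=\; y_t\, u_t .
\]

Your decision rule should maximize this quantity over the choice of $t$.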