Problem 1.
Suppose 90% of the samples are positive ($t = 1$) and 10% are negative ($t = 0$). Compute the precision (P), recall (R), and F1 scores of the majority guess (always predicting $t = 1$). If a denominator in your derivation is 0, compute the limit instead.
Explain the deficiency of the F1 score in this case, and discuss one possible treatment to resurrect the P, R, and F1 scores for this problem.
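As a sanity check for your derivation, the counts can be tallied directly. The minimal sketch below (Python; not part of the original assignment) assumes a hypothetical set of 100 samples and treats $t = 1$ as the positive class:

```python
# Majority guess on a hypothetical 100-sample set: 90 positives, 10 negatives.
# Always predicting t=1 hits every positive and mislabels every negative.
tp, fp, fn = 90, 10, 0

precision = tp / (tp + fp)                          # 0.9
recall = tp / (tp + fn)                             # 1.0
f1 = 2 * precision * recall / (precision + recall)  # ~0.947

print(precision, recall, f1)
```

Redoing the tally with $t = 0$ treated as the positive class is where the 0/0 denominator mentioned in the problem statement appears.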
Problem 2.
Prove that the sigmoid function $\sigma(z) = \frac{1}{1 + \exp(-z)}$ is symmetric about the point $(0, 0.5)$; in other words,
$$\sigma(-z) = 1 - \sigma(z).$$
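A quick numerical spot check of the identity at a few sample points (an illustration only, not a substitute for the proof):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Symmetry about (0, 0.5) is exactly the identity sigmoid(-z) = 1 - sigmoid(z).
for z in [-5.0, -1.0, 0.0, 0.5, 3.0]:
    assert abs(sigmoid(-z) - (1.0 - sigmoid(z))) < 1e-12
```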
Problem 3.
Prove that minimizing the loss
$$J = -t \log y - (1 - t) \log (1 - y)$$
is equivalent to minimizing the Kullback–Leibler (KL) divergence between $\mathbf{t}$ and $\mathbf{y}$, denoted by $\mathrm{KL}(\mathbf{t} \,\|\, \mathbf{y})$, where $\mathbf{t} = (1 - t, t)$ and $\mathbf{y} = (1 - y, y)$ are two Bernoulli distributions.¹

¹ For two discrete distributions $P = (p_1, \cdots, p_K)$ and $Q = (q_1, \cdots, q_K)$, the KL divergence is defined as
$$\mathrm{KL}(P \,\|\, Q) = \sum_{k=1}^{K} p_k \log \frac{p_k}{q_k}.$$
Note: The KL divergence is not symmetric between $P$ and $Q$. To minimize the KL, $Q$ must cover all the support of $P$; thus, the learned distribution may be smoother than it should be.
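The intuition behind the equivalence: for fixed $t$, $J$ and $\mathrm{KL}(\mathbf{t} \,\|\, \mathbf{y})$ differ only by the entropy of $\mathbf{t}$, which does not depend on $y$. The minimal sketch below illustrates this numerically with hypothetical values of $t$ and $y$ (the helper names are illustrative, and the convention $0 \log 0 = 0$ is used):

```python
import math

def loss_J(t, y):
    # Cross-entropy loss J = -t log y - (1 - t) log(1 - y)
    return -t * math.log(y) - (1 - t) * math.log(1 - y)

def kl_bernoulli(t, y):
    # KL((1-t, t) || (1-y, y)), with the convention 0 * log(0/q) = 0
    total = 0.0
    for p, q in ((1 - t, 1 - y), (t, y)):
        if p > 0:
            total += p * math.log(p / q)
    return total

t = 0.3  # hypothetical target distribution (1-t, t); same argument for t in {0, 1}
for y in (0.1, 0.3, 0.5, 0.9):
    # The gap J - KL is the entropy of t (~0.611 nats here), identical for
    # every y, so minimizing J over y and minimizing KL over y agree.
    print(y, loss_J(t, y) - kl_bernoulli(t, y))
```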