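Problem 1 For the K-means clustering problem, suppose the binary indicators (responsibilities) $r_{kn}$ are fixed for k = 1, 2, …, K and n = 1, 2, …, N. Derive the cluster centers $\mathbf{m}_k$, k = 1, 2, …, K, that minimize the following objective function J:
$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{kn} \, \lVert \mathbf{x}_n - \mathbf{m}_k \rVert^2$$
where $\mathbf{x}_n$ is the n-th data sample.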
Problem 2 Iris.xls contains 150 data samples of three Iris categories, labeled by outcome values 0, 1, and 2. Each data sample has four attributes: sepal length, sepal width, petal length, and petal width.
Implement the K-means clustering algorithm to group the samples into K = 3 clusters. Randomly choose three samples as the initial cluster centers. Calculate the objective function value J as defined in Problem 1 after the assignment step in each iteration. Exit the iterations if the following criterion is met: $J(\text{Iter}-1) - J(\text{Iter}) < \varepsilon$, where $\varepsilon = 10^{-5}$ and Iter is the iteration number. Plot the objective function value J versus the iteration number Iter. Comment on the result. Attach the code at the end of the homework.
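The loop described in Problem 2 can be sketched as below. This is a minimal NumPy sketch under stated assumptions, not a reference solution: the function name `kmeans`, its parameters (`max_iter`, `seed`), and the synthetic stand-in data `X_demo` are hypothetical, and J is assumed to be the usual sum of squared Euclidean distances from each sample to its assigned center. With the real data, `X_demo` would be replaced by the 150 × 4 attribute matrix read from Iris.xls.

```python
import numpy as np

def kmeans(X, K=3, eps=1e-5, max_iter=100, seed=0):
    """Plain K-means; returns centers, labels, and the history of J."""
    rng = np.random.default_rng(seed)
    # Initial centers: K distinct samples chosen at random.
    centers = X[rng.choice(len(X), size=K, replace=False)].astype(float)
    history = []
    for _ in range(max_iter):
        # Assignment step: squared distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Objective J, evaluated right after the assignment step.
        J = d2[np.arange(len(X)), labels].sum()
        history.append(J)
        # Stopping rule: J(Iter-1) - J(Iter) < eps.
        if len(history) > 1 and history[-2] - history[-1] < eps:
            break
        # Update step: each center becomes the mean of its assigned samples.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers, labels, np.array(history)

# Synthetic stand-in for the Iris attributes (assumption: 150 samples, 4 columns).
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 4))
                    for c in (0.0, 3.0, 6.0)])
centers, labels, J_history = kmeans(X_demo, K=3)
```

`J_history` holds one value per iteration, so plotting it against 1, 2, …, `len(J_history)` gives the requested curve of J versus Iter. Because neither the assignment step nor the update step can increase J, the curve should be non-increasing.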
Problem 3 Assume a data sample $\mathbf{x} \in \mathbb{R}^D$ comes from one of two classes, $C_1$ and $C_2$. Use logistic regression to classify it.
a. Write the mathematical expression for the logistic regression output, and state the criterion used for the final classification.
b. How many parameters (weights) need to be calculated/trained in this method?
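For orientation, the two-class setup in Problem 3 can be sketched numerically. The sketch assumes the common parameterization in which the output is a logistic sigmoid applied to a linear function of the input, with a 0.5 decision threshold; the names `sigmoid`, `predict_binary`, `w_demo`, and `b_demo`, and the example numbers, are hypothetical.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, mapping any real activation to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def predict_binary(x, w, b):
    """Return (p(C1 | x), predicted class index: 1 for C1, 2 for C2)."""
    p_c1 = sigmoid(w @ x + b)
    # Decision rule assumed here: choose C1 when p(C1 | x) >= 0.5.
    return p_c1, 1 if p_c1 >= 0.5 else 2

# Hypothetical weights for D = 3; real values would come from training.
w_demo = np.array([1.0, -2.0, 0.5])
b_demo = 0.0
p, cls = predict_binary(np.array([2.0, 0.5, 0.0]), w_demo, b_demo)
```

In this example the activation is 2.0 − 1.0 + 0.0 = 1.0, so `p` is the sigmoid of 1 (about 0.73) and the sample is assigned to $C_1$.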
Problem 4 Assume a data sample $\mathbf{x} \in \mathbb{R}^D$ comes from one of K classes, $C_1, C_2, \ldots, C_K$. Use logistic regression to classify it.
a. Write the mathematical expression for the logistic regression output, and state the criterion used for the final classification.
b. How many parameters (weights) need to be calculated/trained in this method?
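The K-class case in Problem 4 can be sketched the same way. The sketch assumes the softmax (multinomial) form of logistic regression with one weight vector and one bias per class, which is one common parameterization and not necessarily the one intended by the problem; `softmax`, `predict_multiclass`, `W_demo`, and `b_demo` are hypothetical names and values.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax over a 1-D activation vector."""
    a = a - a.max()          # shifting by the max does not change the result
    e = np.exp(a)
    return e / e.sum()

def predict_multiclass(x, W, b):
    """Return (class probabilities, predicted class index in 1..K)."""
    probs = softmax(W @ x + b)
    # Decision rule assumed here: pick the class with the largest probability.
    return probs, int(probs.argmax()) + 1

# Hypothetical parameters for K = 3 classes and D = 2 attributes.
W_demo = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [-1.0, -1.0]])
b_demo = np.zeros(3)
probs, cls = predict_multiclass(np.array([0.5, 2.0]), W_demo, b_demo)
```

Here the activations are (0.5, 2.0, −2.5), so the second class has the largest probability. The probabilities sum to one by construction.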