Problem 1 For the K-means clustering problem, when the binary indicators (responsibilities) r_nk are fixed for k = 1, 2, …, K and n = 1, 2, …, N, derive the cluster centers m_k, k = 1, 2, …, K, that minimize the following objective function J:
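The derivation can be sketched as follows, assuming J is the usual sum-of-squares distortion over data samples x_n (both the formula for J and the symbol x_n are assumptions here, since only r_nk and m_k are named in the problem statement). With the r_nk held fixed, J is quadratic in each m_k, so setting the gradient to zero gives the minimizer.

```latex
% Assumed objective: standard K-means distortion measure.
J = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \, \lVert \mathbf{x}_n - \mathbf{m}_k \rVert^2

% Gradient with respect to one center m_k (r_{nk} fixed):
\frac{\partial J}{\partial \mathbf{m}_k}
  = -2 \sum_{n=1}^{N} r_{nk} \, (\mathbf{x}_n - \mathbf{m}_k) = \mathbf{0}

% Solving for m_k (valid when cluster k has at least one member):
\mathbf{m}_k = \frac{\sum_{n=1}^{N} r_{nk} \, \mathbf{x}_n}{\sum_{n=1}^{N} r_{nk}}
```

Under this assumption each center is the mean of the samples assigned to it. The second derivative is 2 Σ_n r_nk times the identity matrix, which is positive definite for a non-empty cluster, so the stationary point is a minimum.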
Problem 2 Iris.xls contains 150 data samples of three Iris categories, labeled by outcome values 0, 1, and 2. Each data sample has four attributes: sepal length, sepal width, petal length, and petal width.
Implement the K-means clustering algorithm to group the samples into K = 3 clusters. Randomly choose three samples as the initial cluster centers. Calculate the objective function value J as defined in Problem 1 after the assignment step in each iteration. Exit the iterations if the following criterion is met: J(Iter−1) − J(Iter) < ε, where ε = 10^-5 and Iter is the iteration number. Plot the objective function value J versus the iteration number Iter. Comment on the result. Attach the code at the end of the homework.
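The procedure above can be sketched in Python with NumPy. This is a minimal sketch, not a reference solution: the function name `kmeans`, the `max_iter` safeguard, the `seed` argument, and the empty-cluster handling are all assumptions added for illustration, and J is assumed to be the sum of squared distances from each sample to its assigned center.

```python
import numpy as np

def kmeans(X, K=3, eps=1e-5, max_iter=100, seed=0):
    """Cluster X (shape N x D); return (centers, labels, J_history)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centers with K distinct, randomly chosen samples.
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    J_history = []
    for _ in range(max_iter):  # max_iter is a safeguard, not part of the task
        # Assignment step: squared distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Objective J, evaluated right after the assignment step.
        J_history.append(float(d2[np.arange(len(X)), labels].sum()))
        # Stopping rule: J(Iter-1) - J(Iter) < eps.
        if len(J_history) > 1 and J_history[-2] - J_history[-1] < eps:
            break
        # Update step: move each center to the mean of its assigned samples.
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:  # assumption: keep old center if cluster is empty
                centers[k] = members.mean(axis=0)
    return centers, labels, J_history
```

To use it on the assignment data, the four attribute columns of Iris.xls would first be loaded into an N x 4 array (for example with `pandas.read_excel`; the column layout of the file is not specified here), and `J_history` can then be plotted against the iteration index with `matplotlib.pyplot.plot`.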
Problem 3 Assume a data sample x ∈ ℝ^D comes from one of two classes, C_1 and C_2. Use logistic regression to do classification.
a. Write the math expression of the logistic regression output, and the criterion used for the final classification.
b. How many parameters (weights) need to be calculated/trained in this method?
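For orientation, a two-class logistic regression model can be sketched as below. This is a hedged illustration under common conventions, not the expected written answer: the names `w`, `b`, `predict_proba`, and `predict_class`, the inclusion of a bias term, and the 0.5 threshold are assumptions.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def predict_proba(X, w, b):
    """Modelled probability of class C_1 for each row of X (shape N x D)."""
    return sigmoid(X @ w + b)

def predict_class(X, w, b, threshold=0.5):
    """Return 1 (for C_1) where the probability reaches the threshold, else 2."""
    return np.where(predict_proba(X, w, b) >= threshold, 1, 2)

# Small usage example with D = 2 (weights chosen by hand, not trained).
w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0], [0.0, 2.0]])
classes = predict_class(X, w, b)
n_params = w.size + 1  # D weights plus one bias, under the assumptions above
```

With a weight vector of length D and a single bias, this parameterisation has D + 1 trainable values.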
Problem 4 Assume a data sample x ∈ ℝ^D comes from one of K classes, C_1, C_2, …, C_K. Use logistic regression to do classification.
a. Write the math expression of the logistic regression output, and the criterion used for the final classification.
b. How many parameters (weights) need to be calculated/trained in this method?
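The multiclass case is commonly handled with a softmax output. It is sketched below as a hedged illustration rather than the expected written answer: the names `W`, `b`, `softmax`, and `predict`, the weight layout (D x K), and the use of a bias per class are assumptions.

```python
import numpy as np

def softmax(A):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    A = A - A.max(axis=1, keepdims=True)
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def predict(X, W, b):
    """Return class probabilities (N x K) and the 1-based index of the largest."""
    P = softmax(X @ W + b)
    return P, P.argmax(axis=1) + 1

# Small usage example with D = 2 and K = 3 (weights chosen by hand, not trained).
W = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
b = np.zeros(3)
X = np.array([[3.0, 0.0], [0.0, 3.0], [-3.0, -3.0]])
P, classes = predict(X, W, b)
n_params = W.size + b.size  # K * (D + 1) under this parameterisation
```

This layout has K(D + 1) trainable values. Because adding the same vector to every class's weights leaves the softmax output unchanged, one class's parameters can be fixed to zero, which leaves (K − 1)(D + 1) free parameters.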