In this homework, you will implement a spectral clustering algorithm in Python. Here are the steps you need to follow:
1. You are given a two-dimensional data set in the file named hw08_data_set.csv, which contains 300 data points generated randomly from five bivariate Gaussian densities with the following parameters.
𝜇! = #++22..55( , Σ! = #+−00..86 −0.6( ,
+0.8 𝑁! = 50
𝜇" = #−+22..55( , Σ" = #++00..86 +0.6( ,
+0.8 𝑁" = 50
𝜇# = #−−22..55( , Σ# = #+−00..86 −0.6( ,
+0.8 𝑁# = 50
𝜇$ = #+−22..55( , Σ$ = #++00..86 +0.6( ,
+0.8 𝑁$ = 50
𝜇% = #++00..00( , Σ% = #++10..60 +0.0( ,
+1.6 𝑁% = 100
The given data points are shown in the following figure.
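As an illustration of how such a data set could be produced, the following minimal sketch draws 300 points from the five densities listed above. It is not the script that generated hw08_data_set.csv, so the sampled points will differ from the ones in the file; the function name `sample_data` and the seed are assumptions made for this example.

```python
import numpy as np

# Means, covariances and sample sizes exactly as listed in the assignment.
means = np.array([[+2.5, +2.5], [-2.5, +2.5], [-2.5, -2.5], [+2.5, -2.5], [0.0, 0.0]])
covs = np.array([
    [[0.8, -0.6], [-0.6, 0.8]],
    [[0.8, +0.6], [+0.6, 0.8]],
    [[0.8, -0.6], [-0.6, 0.8]],
    [[0.8, +0.6], [+0.6, 0.8]],
    [[1.6, 0.0], [0.0, 1.6]],
])
sizes = [50, 50, 50, 50, 100]


def sample_data(seed=0):
    """Draw N_k points from each Gaussian and stack them into one array.

    `seed` is a hypothetical parameter; the assignment does not state one.
    """
    rng = np.random.default_rng(seed)
    parts = [rng.multivariate_normal(m, c, n) for m, c, n in zip(means, covs, sizes)]
    return np.vstack(parts)


X = sample_data()  # array with one row per data point and two columns
```

For the homework itself, the points should be read from hw08_data_set.csv rather than resampled.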
2. You should first calculate the Euclidean distances between all pairs of data points. Pairs whose distance is less than or equal to 𝛿 = 1.25 are considered connected.
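Construct the matrix $\mathbf{B}$ as follows:
$b_{ij} = \begin{cases} 1, & \lVert \boldsymbol{x}_i - \boldsymbol{x}_j \rVert_2 < \delta \\ 0, & \text{otherwise} \end{cases}$
$b_{ii} = 0$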
You should also visualize this connectivity matrix by drawing a line between two data points if they are connected. Your figure should be similar to the following figure.
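One possible way to build the connectivity matrix is sketched below. It follows the strict inequality of the formula (the sentence above it says "less than or equal to"; for continuous data the two rarely differ). The names `connectivity_matrix`, `X_demo` and `edges_demo` are assumptions for this example, and the three demo points are made up so the result can be checked by hand.

```python
import numpy as np


def connectivity_matrix(X, delta=1.25):
    """Return B with b_ij = 1 if ||x_i - x_j||_2 < delta, else 0, and b_ii = 0."""
    X = np.asarray(X, dtype=float)
    # Pairwise differences via broadcasting: shape (N, N, D).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    B = (dist < delta).astype(int)
    np.fill_diagonal(B, 0)  # a point is not connected to itself
    return B


# Hypothetical toy data: only the first two points are closer than 1.25.
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
B_demo = connectivity_matrix(X_demo)

# Each row (i, j) with i < j is one connected pair, i.e. one line to draw.
edges_demo = np.argwhere(np.triu(B_demo, k=1) == 1)
```

Each pair in `edges_demo` can then be drawn as a line segment between the two corresponding points with whichever plotting library you use.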
$\mathbf{L}_{\text{symmetric}} = \mathbf{I} - \mathbf{D}^{-1/2} \mathbf{B} \mathbf{D}^{-1/2}$
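The formula above can be sketched as follows. The definition of 𝐃 does not appear in this text, so the code assumes the usual choice: a diagonal matrix whose entry d_ii is the number of points connected to point i (the row sum of 𝐁). Giving isolated points an inverse square root of 0 is also an assumption, made only to avoid division by zero. The function name `normalized_laplacian` is hypothetical.

```python
import numpy as np


def normalized_laplacian(B):
    """Compute I - D^{-1/2} B D^{-1/2}, assuming D = diag(row sums of B)."""
    B = np.asarray(B, dtype=float)
    degrees = B.sum(axis=1)
    # Assumption: points with no neighbours get 0 instead of 1/sqrt(0).
    inv_sqrt = np.zeros_like(degrees)
    connected = degrees > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degrees[connected])
    # Scaling rows and columns is equivalent to multiplying by D^{-1/2} on both sides.
    scaled = inv_sqrt[:, None] * B * inv_sqrt[None, :]
    return np.eye(B.shape[0]) - scaled


# Hypothetical toy graph: three points, all connected to each other.
B_demo = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
L_demo = normalized_laplacian(B_demo)
```

In the toy graph every point has degree 2, so every off-diagonal entry of `L_demo` is -1/2 and every diagonal entry is 1.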
5. Run the k-means clustering algorithm on the 𝐙 matrix to find 𝐾 = 5 clusters. When initializing your algorithm, use the following rows of the 𝐙 matrix as the initial centroids: 29, 143, 204, 271, and 277.
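A minimal sketch of k-means with fixed initial centroids is given below. It takes 𝐙 as given and does not construct it. The text does not say whether the listed row numbers are 0-based or 1-based, so the function simply takes 0-based indices and leaves that choice to the caller. The names `kmeans`, `initial_rows` and `max_iter`, and the toy matrix, are assumptions for this example.

```python
import numpy as np


def kmeans(Z, initial_rows, max_iter=100):
    """Plain k-means; centroids start at the rows of Z given by `initial_rows`."""
    Z = np.asarray(Z, dtype=float)
    centroids = Z[list(initial_rows)].copy()
    labels = None
    for _ in range(max_iter):
        # Distance from every row of Z to every centroid: shape (N, K).
        dist = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = Z[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return labels, centroids


# Hypothetical toy matrix with two well-separated groups.
Z_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels_demo, centroids_demo = kmeans(Z_demo, initial_rows=[0, 2])
```

On the real data the call would look like `kmeans(Z, initial_rows=[...])` with the five rows listed above, shifted by one if they turn out to be 1-based.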
6. Draw the clustering result obtained by your spectral clustering algorithm by coloring each cluster with a different color. Your figure should be similar to the following figure.
What to submit: You need to submit your source code in a single file (.py file) and a short report explaining your approach (.doc, .docx, or .pdf file).
How to submit: Submit the two files (source code and short report) you created to Blackboard. Submissions that do not follow these guidelines will not be graded.
Cheating policy: Very similar submissions will not be graded.