• Load the dataset “fisheriris” into the workspace.
o Study the dataset in terms of (a) the number of classes, (b) the number of features, and (c) what the data represents, i.e., gain some intuition about the problem domain. Based on your study, would you expect the features to perform well in this problem?
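As a starting point for (a) and (b), the counts can be read directly off the data once it is in memory. The sketch below assumes the measurements are in an array `X` (one row per sample) and the class labels in `y`; `summarize_dataset` is a hypothetical helper name chosen for illustration, not part of any toolbox.

```python
import numpy as np

def summarize_dataset(X, y):
    """Return sample, feature, and class counts for a labelled dataset."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Distinct labels and how many samples carry each one
    classes, counts = np.unique(y, return_counts=True)
    return {
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
        "n_classes": len(classes),
        "class_counts": dict(zip(classes.tolist(), counts.tolist())),
    }
```

In MATLAB, `load fisheriris` places the measurements in `meas` and the labels in `species`. In Python, one option (assuming scikit-learn is installed) is `X, y = load_iris(return_X_y=True)` from `sklearn.datasets`, which can then be passed to the helper above.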
• Compute the following quantities for each feature. Do you observe anything of interest from these statistics?
o Within-Class Variance: $s_w(i) = \sum_{j=1}^{M} P_j \sigma_{ij}$, where $\sigma_{ij}$ is the variance of the i-th feature in class j, and $P_j$ is the a priori probability of class j.
o Between-Class Variance: $s_b(i) = \sum_{j=1}^{M} P_j (\mu_{ij} - \mu_i)^2$, where $\mu_{ij}$ is the mean of the i-th feature in class j, and $\mu_i$ is the mean of the i-th feature.
• Compute and display the correlation coefficients exactly as shown below (left figure). Do you observe anything interesting from this display?
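The per-feature within-class and between-class variances, and the feature correlation matrix, can be sketched as follows. This is one possible reading of the definitions, under two assumptions that the handout does not state: the priors are estimated as class frequencies, and each class variance is the population variance (divide by N, not N-1). The function names are illustrative only.

```python
import numpy as np

def class_variances(X, y):
    """Return (s_w, s_b): within- and between-class variance per feature."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()      # assumption: P_j = class frequency
    overall_mean = X.mean(axis=0)       # mean of each feature over all samples
    s_w = np.zeros(X.shape[1])
    s_b = np.zeros(X.shape[1])
    for c, p in zip(classes, priors):
        Xc = X[y == c]                  # samples belonging to class c
        s_w += p * Xc.var(axis=0)       # assumption: population variance (ddof=0)
        s_b += p * (Xc.mean(axis=0) - overall_mean) ** 2
    return s_w, s_b

def feature_correlations(X):
    """Return the feature-by-feature correlation coefficient matrix."""
    # rowvar=False: columns are the variables, rows are the observations
    return np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
```

Under these assumptions the two quantities add up to the total variance of each feature, which gives a quick sanity check. A feature with a large between-class variance relative to its within-class variance is a plausible candidate for good class separation.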
• Display each of the four features versus the class label, exactly as shown below (right figure). What can you state about how well the features may perform in classification?
• Perform the following classification tasks.
Setosa vs. Versi+Virgi        All features            Batch_Perceptron and LS
Setosa vs. Versi+Virgi        Features 3 and 4 only   Batch_Perceptron and LS
Virgi vs. Versi+Setosa        All features            Batch_Perceptron and LS
Virgi vs. Versi+Setosa        Features 3 and 4 only   Batch_Perceptron and LS
Setosa vs. Versi vs. Virgi    Features 3 and 4 only   Multiclass LS
• For each case, report (a) whether the method converged, (b) the number of epochs, (c) the computed weight vector, (d) the number of training misclassifications, and, whenever appropriate, (e) a plot of the feature vectors together with the computed decision boundary.
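The classifiers named above can be sketched in Python as shown below. This is a minimal illustration, not the course's reference `Batch_Perceptron` or LS routines, whose exact interfaces are not given here. It assumes labels coded as +1 and -1 for the two-class tasks, a bias term appended as a final constant feature, a zero initial weight vector, and a fixed learning rate `rho`. All function names and default values are hypothetical.

```python
import numpy as np

def add_bias(X):
    """Append a constant 1 to every feature vector."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))])

def batch_perceptron(X, y, rho=0.01, max_epochs=1000):
    """Return (w, epochs, converged) for labels y in {+1, -1}."""
    Xa = add_bias(X)
    y = np.asarray(y, dtype=float)
    w = np.zeros(Xa.shape[1])               # assumption: start from w = 0
    for epoch in range(max_epochs):
        wrong = y * (Xa @ w) <= 0           # misclassified or on the boundary
        if not wrong.any():
            return w, epoch, True           # epoch = number of updates made
        # One batch update using every misclassified sample
        w += rho * (y[wrong, None] * Xa[wrong]).sum(axis=0)
    converged = not np.any(y * (Xa @ w) <= 0)
    return w, max_epochs, converged

def least_squares(X, y):
    """Return the weight vector minimizing the sum of squared errors to y."""
    w, *_ = np.linalg.lstsq(add_bias(X), np.asarray(y, dtype=float), rcond=None)
    return w

def training_errors(X, y, w):
    """Count training samples whose predicted sign differs from the label."""
    return int(np.sum(np.sign(add_bias(X) @ w) != np.asarray(y)))

def multiclass_least_squares(X, labels):
    """Fit one LS discriminant per class; return (W, training errors)."""
    Xa = add_bias(X)
    labels = np.asarray(labels)
    classes, idx = np.unique(labels, return_inverse=True)
    Y = np.eye(len(classes))[idx]           # one-hot targets, one column per class
    W, *_ = np.linalg.lstsq(Xa, Y, rcond=None)
    predicted = classes[np.argmax(Xa @ W, axis=1)]
    return W, int(np.sum(predicted != labels))
```

For a task such as Setosa vs. the rest, the labels could be built with `np.where(species == "setosa", 1, -1)`, and features 3 and 4 selected with `X[:, 2:4]`. The perceptron is only guaranteed to stop on its own when the two classes are linearly separable, which is why the sketch caps the number of epochs and reports a convergence flag.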
Upload your .m or .py file to Blackboard prior to the deadline.