685.621-Homework 3 Solved

1.   Problem 1

In this problem, develop code to analyze the Iris data set using the test statistics listed in Table 1.

Table 1: Data Analysis Statistics

Test Statistic          Statistical Function F(·)
Standard Deviation      $\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
The analysis should be done by feature followed by class of flower type. This analysis should provide insight into the Iris data set.

Note: The trimmed mean is a variation of the mean, calculated by removing values from the beginning and end of a sorted set of data; the average is then taken over the remaining values. This allows potential outliers to be removed when calculating the statistics of the data. Assuming the data in $x_s = [x_{1,s}, x_{2,s}, \cdots, x_{n,s}]$ is sorted, the resulting trimmed data is $x_{s,p} = [x_{1+p,s}, x_{2+p,s}, \cdots, x_{n-p,s}]$, where $p$ values are removed from each end. The trimmed mean thus prevents extreme values from influencing the mean of the data.
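A minimal sketch of this analysis, assuming Python with NumPy, SciPy, and scikit-learn (the helper name `analyze` and the 10% trim fraction are illustrative choices, not specified by the assignment):

```python
import numpy as np
from scipy import stats
from sklearn.datasets import load_iris

def analyze(values, trim_fraction=0.1):
    """Summary statistics for one feature of one flower class."""
    return {
        "mean": np.mean(values),
        "std": np.std(values, ddof=1),  # sample standard deviation
        "trimmed_mean": stats.trim_mean(values, trim_fraction),  # drop 10% per end
    }

iris = load_iris()
X, y = iris.data, iris.target

# Analysis by feature, followed by class of flower type.
for f, feature in enumerate(iris.feature_names):
    for c, flower in enumerate(iris.target_names):
        s = analyze(X[y == c, f])
        print(f"{feature:18s} {flower:11s} mean={s['mean']:.3f} "
              f"std={s['std']:.3f} trimmed={s['trimmed_mean']:.3f}")
```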

2.   Problem 2 Parts a and b

In this problem we will begin to analyze Iris data based on the class of flower type using linear discriminant analysis.

(a) Implement the two-class linear discriminant based on Fisher's Linear Discriminant (FLD) two-class separability (Fisher, 1936), described below. This is also shown in the two-class linear discriminant function presented in (Bishop, 2006), Section 4.1.1, Two classes. For this exercise you will want to separate your Iris data into three sets and focus on any two-class combination. For example, from the Iris data take the first 50 observations as class 1, the next 50 as class 2, and the final 50 as class 3. Using the two-class linear discriminant function, compare class 1 versus class 2, class 1 versus class 3, and finally class 2 versus class 3.

(b) For this problem you will want to expand the two-class case from part (a) to a three-class case, as presented in (Bishop, 2006), Section 4.1.2, Multiple classes; a sketch of this extension is given after the two-class derivation at the end of this section.

Now that we have our statistics set up, let's look at the mean and standard deviation between the classes (Iris flower types) and within the classes. To quantify the two-class separability of features, let's consider Fisher's Linear Discriminant (FLD) (Fisher, 1936). FLD is a simple technique that measures the discrimination of sets of real numbers. Without going into all of the theory of the FLD, let's focus on the primary components, assuming we have a two-class problem, equal class sample sizes, and covariance matrices generated from normal distributions. The within-class scatter matrix is defined as

$$S_W = \sum_{C} P_C S_C \tag{1}$$

where $S_C$ is the covariance matrix for class $C \in \{-1, +1\}$,

$$S_C = \sum_{\substack{i=1,\\ i \in C}}^{l_C} (x_i - \mu_C)(x_i - \mu_C)^T \tag{2}$$
and $P_C$ is the a priori probability of class $C$. That is, $P_C \approx k_C / k$, where $k_C$ is the number of samples in class $C$, out of a total of $k$ samples. The between-class scatter matrix is defined as

$$S_B = (\mu_{-1} - \mu_{+1})(\mu_{-1} - \mu_{+1})^T \tag{3}$$

where $\mu$ is the global mean vector

$$\mu = \frac{1}{k} \sum_{i=1}^{k} x_i \tag{4}$$

and the class mean vector $\mu_C$ is defined as

$$\mu_C = \frac{1}{k_C} \sum_{i \in C} x_i \tag{5}$$
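A minimal sketch of equations (1)–(5) for one two-class pairing of the Iris data, assuming NumPy and scikit-learn (the class labeling and variable names are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Two-class pairing: class -1 = setosa (label 0), class +1 = versicolor (label 1).
Xm, Xp = X[y == 0], X[y == 1]
k_m, k_p = len(Xm), len(Xp)
k = k_m + k_p

mu_m = Xm.mean(axis=0)              # class mean vector mu_{-1}, eq. (5)
mu_p = Xp.mean(axis=0)              # class mean vector mu_{+1}, eq. (5)
mu = (k_m * mu_m + k_p * mu_p) / k  # global mean vector, eq. (4)

def class_scatter(Xc, mu_c):
    """Class scatter matrix S_C, eq. (2)."""
    D = Xc - mu_c
    return D.T @ D

# Within-class scatter, eq. (1), with a priori probabilities P_C ~ k_C / k.
S_W = (k_m / k) * class_scatter(Xm, mu_m) + (k_p / k) * class_scatter(Xp, mu_p)

# Between-class scatter, eq. (3).
d = (mu_m - mu_p).reshape(-1, 1)
S_B = d @ d.T
```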

Now let's look at the criterion function $J(\cdot)$, written as follows:

$$J(w) = \frac{w^T S_B w}{w^T S_W w} \tag{6}$$

where w is calculated to optimize J(·) as follows:

$$w = S_W^{-1}(\mu_{+1} - \mu_{-1}) \tag{7}$$
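Continuing the sketch above (reusing `S_W`, `S_B`, `mu_m`, and `mu_p` defined there), the Fisher direction and criterion value might be computed as follows; `np.linalg.solve` is used rather than an explicit inverse for numerical stability:

```python
# Fisher direction, eq. (7): w = S_W^{-1} (mu_{+1} - mu_{-1}).
w = np.linalg.solve(S_W, mu_p - mu_m)

# Criterion value, eq. (6): between-class over within-class scatter along w.
J = (w @ S_B @ w) / (w @ S_W @ w)
```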

With $w$ for the Fisher Linear Discriminant obtained, the linear function yields the maximum ratio of the between-class scatter to the within-class scatter. Now let's determine a threshold $b$ that will allow us to decide which class a new observation belongs to. The optimal decision boundary, assuming each class has the same number of samples, can be calculated as follows:

$$b = -0.5\,(w^T \mu_{-1} + w^T \mu_{+1}) \tag{8}$$

Now, if we have a new input observation x we can determine which class the new observation belongs to based on the following

$$y = w^T x + b \tag{9}$$

where y < 0 is class −1 and y ≥ 0 is class +1.
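Continuing once more with `w`, `mu_m`, and `mu_p` from the sketches above, equations (8) and (9) could be realized as:

```python
# Decision threshold, eq. (8), assuming equal class sample sizes.
b = -0.5 * (w @ mu_m + w @ mu_p)

def classify(x):
    """Assign an observation x to class -1 or +1 via eq. (9)."""
    return -1 if (w @ x + b) < 0 else +1

# Example: the first Iris observation is setosa, so class -1 is expected.
print(classify(X[0]))
```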

The previous discussion is based on the FLD and is simplified as the two-class linear discriminant function presented in (Bishop, 2006), Section 4.1.1, Two classes.
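For part (b), a minimal sketch of the multiple-class extension in the spirit of (Bishop, 2006), Section 4.1.2: the scatter matrices are generalized to all three classes and the projection matrix is taken from the leading eigenvectors of $S_W^{-1} S_B$. This is one common formulation; the eigenvector criterion and the scaling of $S_B$ by class counts here are assumptions, not taken verbatim from the text:

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
classes = np.unique(y)
mu = X.mean(axis=0)                          # global mean vector

S_W = np.zeros((X.shape[1], X.shape[1]))
S_B = np.zeros_like(S_W)
for c in classes:
    Xc = X[y == c]
    mu_c = Xc.mean(axis=0)
    D = Xc - mu_c
    S_W += D.T @ D                           # within-class scatter
    d = (mu_c - mu).reshape(-1, 1)
    S_B += len(Xc) * (d @ d.T)               # between-class scatter

# Projection directions: leading eigenvectors of S_W^{-1} S_B
# (at most n_classes - 1 = 2 useful directions for three classes).
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
order = np.argsort(eigvals.real)[::-1]
W = eigvecs.real[:, order[:2]]
Z = X @ W                                    # projected Iris data
```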
