Functions Created
Here are some functions I have created to save time when completing questions.
xbar = function(matrix){
  # Sample mean vector: (1/n) X' 1
  n = dim(matrix)[1]
  onevec = rep(1, n)
  xbar = 1/n*t(matrix)%*%onevec
  return(xbar)
}
Spool = function(dat){
  # Pooled covariance matrix S_pooled = W/(n - g) from a list of group data
  # Getting the group sizes and number of parameters
  nbar = rep(0, length(dat))
  for (i in 1:length(dat)){
    nbar[i] = dim(dat[[i]])[1]
  }
  n = sum(nbar)
  params = dim(dat[[1]])[2]
  g = length(dat)
  # Getting the variance matrices
  S = list()
  for (i in 1:g){
    S = append(S, list(var(dat[[i]])))
  }
  # Calculating the within matrix W = sum_i (n_i - 1) S_i
  W = matrix(0, params, params)
  for (i in 1:g){
    W = W + (nbar[i] - 1)*S[[i]]
  }
  # S_pooled
  result = W/(n - g)
  return(result)
}
spectral = function(matrix){
  # Spectral decomposition: D (eigenvalues), P (eigenvectors) and the
  # matrix square root P D^(1/2) P'
  eig = eigen(matrix)
  D = diag(eig$values)
  P = eig$vectors
  matsqrt = P%*%sqrt(D)%*%t(P)
  output = list(D, P, matsqrt)
  names(output) = c('D', 'P', 'SqrtMat')
  return(output)
}
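As a quick sanity check (my addition, not part of the original assignment), the spectral helper can be verified on a small symmetric matrix: the returned square root multiplied by itself should recover the input.
# Quick check of the spectral helper on a small symmetric matrix
M = matrix(c(2, 1, 1, 2), 2, 2)
out = spectral(M)
out$SqrtMat%*%out$SqrtMat # should be approximately M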
Question 1
Consider the data given in the EXCEL file tab “q1”.
(a) State the multivariate linear regression model with all the necessary assumptions.
(b) Find the predicted model.
(c) Test the significance of the model.
(d) Regardless of your result in part (c), test if X1 is significant. How about X2?
(e) Find a 95% working Hotelling confidence region for the mean response when X1 = 192 and X2 = 152.
(f) Find a 95% working Hotelling prediction region for a new response when X1 = 192 and X2 = 152.
Solution
Part A
$$Y = X\beta + \varepsilon$$

where $Y = [Y_{(1)}, Y_{(2)}]$, $X = [\mathbf{1}, X_{(1)}, X_{(2)}]$, $\beta = [\beta_{(0)}, \beta_{(1)}, \beta_{(2)}]$ and $\varepsilon = [\varepsilon_{(1)}, \varepsilon_{(2)}]$.

Assumptions: $E(\varepsilon_{(i)}) = 0$, $V(\varepsilon_{(i)}) = \sigma_{ii}I$ and $\mathrm{cov}(\varepsilon_{(i)}, \varepsilon_{(j)}) = \sigma_{ij}I$.
Part B
For this section we will show both ways of getting the predicted model: one through the actual LS method, and the other using the lm function built into R.
Least Squares Method
Y = as.matrix(q1df[,1:2])
X = as.matrix(q1df[,3:4])
onevec = rep(1, dim(X)[1])
Xnew = cbind(onevec, X)
# Least squares estimate: (X'X)^(-1) X'Y
beta_coef = solve(crossprod(Xnew))%*%crossprod(Xnew, Y)
beta_coef
## y1 y2
## onevec 28.0020858 35.8023808
## x1 0.3876325 0.2446617
## x2 0.5766253 0.4713147
Using the built-in lm function in R
fit = lm(cbind(y1, y2) ~ x1 + x2, data = q1df)
summary(fit)
## Response y1 :
##
## Call:
## lm(formula = y1 ~ x1 + x2, data = q1df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.361 -3.794 0.001 4.041 17.925
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 28.0021 29.8446 0.938 0.358
## x1 0.3876 0.2454 1.579 0.129
## x2 0.5766 0.3673 1.570 0.131
##
## Residual standard error: 6.564 on 22 degrees of freedom
## Multiple R-squared: 0.5837, Adjusted R-squared: 0.5459
## F-statistic: 15.43 on 2 and 22 DF, p-value: 6.502e-05
##
##
## Response y2 :
##
## Call:
## lm(formula = y2 ~ x1 + x2, data = q1df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.6192 -3.2907 -0.5662 2.5827 10.7487
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35.8024 23.8778 1.499 0.148
## x1 0.2447 0.1964 1.246 0.226
## x2 0.4713 0.2938 1.604 0.123
##
## Residual standard error: 5.252 on 22 degrees of freedom
## Multiple R-squared: 0.5349, Adjusted R-squared: 0.4926
## F-statistic: 12.65 on 2 and 22 DF, p-value: 0.0002205
Part C
We have our null hypothesis to be $H_0 : X_1 = X_2 = 0$. Let $\hat{e} = Y - \hat{Y}$ and $\hat{\Sigma} = n^{-1}\hat{e}^T\hat{e}$. Below is the calculation for our sample variance.
Yhat = Xnew%*%beta_coef
ehat = Y - Yhat
n = dim(X)[1]
Sigmahat = 1/n*t(ehat)%*%ehat
Sigmahat
## y1 y2
## y1 37.9206 10.70300
## y2 10.7030 24.27345
Now to calculate our test statistic (which will be using the Wilks Lambda), we need to solve for SSE and SST. We have shown the calculation below. Recall that $SSE = n\hat{\Sigma}$ and $SST = (Y - \bar{Y})^T(Y - \bar{Y})$. Below are the given outputs for SSE and SST.
SSE = n*Sigmahat
SSE
## y1 y2
## y1 948.015 267.5750
## y2 267.575 606.8363
ybar = colMeans(Y)
n = nrow(Y)
m = ncol(Y)
r = ncol(X)
q = 1
Ybar = matrix(ybar, n ,m, byrow = T)
SST = crossprod(Y - Ybar)
SST
## y1 y2
## y1 2277.44 1230.04
## y2 1230.04 1304.64
We now finally calculate the Wilks Lambda and apply the Bartlett method. The Wilks Lambda is $\Lambda = \frac{|SSE|}{|SST|}$. Furthermore, the formula for the Bartlett test statistic is as follows:

$$-\left(n - r - 1 - \frac{1}{2}(m - r + q + 1)\right)\log(\Lambda) \sim \chi^2_{m(r-q)}$$

The output is shown below.
WilksLam = det(SSE)/det(SST)
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam)
df = m*(r-q)
1 - pchisq(Bartlett, df)
## [1] 1.420822e-05
Thus we get the Bartlett observed value to be 22.32338 with the p-value being 1.420822e-05. Based on the very small p-value, we will reject the null hypothesis. This implies that there is evidence that the model is significant.
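As an optional cross-check (my addition, not part of the original solution), the overall test can also be run with R's anova method for multivariate lm objects by comparing against the intercept-only model. It uses an F approximation to Wilks' Lambda, so its p-value will differ slightly from Bartlett's chi-square approximation.
# Optional cross-check: Wilks test via anova on nested multivariate fits
fit0 = lm(cbind(y1, y2) ~ 1, data = q1df)
anova(fit, fit0, test = "Wilks")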
Part D
Let $H_0 : X_2 = 0$.
Using the same procedure as part (c), we have our $\Lambda$ statistic to be 0.8552134, which, through Bartlett's method, gives a test statistic of 3.284489 with a p-value of 0.1935451. Based on the large p-value, we cannot reject $H_0$. This implies that X2 is not significant. Below is the output code.
#Testing if Beta(2) is significant
#H_0: Beta(2) or X2 = 0
X1 = Xnew[,1:2]
Betahat1 = solve(crossprod(X1))%*%crossprod(X1,Y)
Sigmahat1 = 1/n*crossprod(Y - X1%*%Betahat1)
E = n*Sigmahat
H = n*(Sigmahat1 - Sigmahat)
n = nrow(Y)
m = ncol(Y)
r = ncol(X1)
q = 1
WilksLam = det(E)/det(E + H)
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam)
df = m*(r-q)
1 - pchisq(Bartlett, df)
## [1] 0.1935451
Let $H_0 : X_1 = 0$.
Using the same procedure as part (c), we have our $\Lambda$ statistic to be 0.8787331, which, through Bartlett's method, gives a test statistic of approximately 2.7148 with a p-value of 0.2573347. Based on the large p-value, we cannot reject $H_0$. This implies that X1 is not significant. Below is the output code.
#Testing if Beta(1) is significant
#H_0: Beta(1) or X1 = 0
X2 = cbind(Xnew[,1], Xnew[,3])
Betahat2 = solve(crossprod(X2))%*%crossprod(X2,Y)
Sigmahat2 = 1/n*crossprod(Y - X2%*%Betahat2)
E = n*Sigmahat
H = n*(Sigmahat2 - Sigmahat)
n = nrow(Y)
m = ncol(Y)
r = ncol(X2)
q = 1
WilksLam = det(E)/det(E + H)
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam)
df = m*(r-q)
1 - pchisq(Bartlett, df)
## [1] 0.2573347
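The same two term-deletion tests can be reproduced (again as an optional cross-check, my addition rather than part of the original solution) by comparing nested multivariate fits:
# Optional cross-check: Wilks tests for dropping x2 and for dropping x1
fit_x1 = lm(cbind(y1, y2) ~ x1, data = q1df) # drops x2
fit_x2 = lm(cbind(y1, y2) ~ x2, data = q1df) # drops x1
anova(fit, fit_x1, test = "Wilks")
anova(fit, fit_x2, test = "Wilks")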
Part E
The formula for a $100(1-\alpha)\%$ confidence region is as follows:

$$x_0^T\hat{\beta}_{(i)} \pm \sqrt{\left(\frac{m(n-r-1)}{n-r-m}\right)F_{m,\,n-r-m}(\alpha)}\ \sqrt{x_0^T(X^TX)^{-1}x_0\,\hat{\Sigma}_{ii}}$$

Let $x_0 = [1, 192, 152]^T$. Let us now plug $x_0$ into the formula given above.
newobs = data.frame(x1 = 192, x2 = 152)
pred = predict(fit, newobs)
n = nrow(Y)
m = ncol(Y)
r = ncol(X)
table = sqrt( ((m*(n-r-1))/(n-r-m))*qf(0.95, df1 = m, df2 = n-r-m) )
x0 = c(1,192, 152)
sd1 = sqrt(t(x0)%*%solve(crossprod(Xnew))%*%x0*Sigmahat[1,1])
sd2 = sqrt(t(x0)%*%solve(crossprod(Xnew))%*%x0*Sigmahat[2,2])
sd = c(sd1, sd2)
CR_L = pred - table*sd
CR_U = pred + table*sd
CR = cbind(t(CR_L), t(CR_U))
colnames(CR) = c("Lower", "Upper")
CR
## Lower Upper
## y1 185.4438 194.7054
## y2 150.7123 158.1222
Hence our confidence region will be:

$$\begin{bmatrix} 190.0746 \\ 154.4173 \end{bmatrix} \pm 2.695139 \begin{bmatrix} 1.718209 \\ 1.374688 \end{bmatrix}$$
Part F
Very similar to our C.R. formula from part (e), our Hotelling prediction region is as follows:

$$x_0^T\hat{\beta}_{(i)} \pm \sqrt{\left(\frac{m(n-r-1)}{n-r-m}\right)F_{m,\,n-r-m}(\alpha)}\ \sqrt{\left(1 + x_0^T(X^TX)^{-1}x_0\right)\hat{\Sigma}_{ii}}$$

Let us now use R to calculate the P.R.
sd1 = sqrt( (1 + t(x0)%*%solve(crossprod(Xnew))%*%x0)*Sigmahat[1,1] )
sd2 = sqrt( (1 + t(x0)%*%solve(crossprod(Xnew))%*%x0)*Sigmahat[2,2] )
sd = c(sd1, sd2)
PR_L = pred - table*sd
PR_U = pred + table*sd
PR = cbind(t(PR_L), t(PR_U))
colnames(PR) = c("Lower", "Upper")
PR
## Lower Upper
## y1 172.8440 207.3051
## y2 140.6316 168.2029
Hence our prediction region will be:

$$\begin{bmatrix} 190.0746 \\ 154.4173 \end{bmatrix} \pm 2.695139 \begin{bmatrix} 6.3931987 \\ 5.115 \end{bmatrix}$$
Question 2
Timm (1975) reported the results of an experiment in which subjects' response times to “probe words” were measured at five positions (Y1 is at the beginning of the sentence, Y2 is in the first quartile of the sentence, Y3 is in the middle of the sentence, Y4 is in the third quartile of the sentence, and Y5 is at the end of the sentence). The data are recorded in the EXCEL file tab “q2”.
(a) Use the sample variance and obtain all the principal components.
(b) Timm specifically required the reduction in dimension to cover at least 90% of the total variance. How many principal components are needed? Why?
(c) Repeat parts (a) and (b) using the sample correlation matrix.
Solution
Part A
In this part, we will hard code the principal components using the sample variance. Note that we could also use the prcomp function, which is usually used to calculate P.C.s. First we need to calculate the sample variance; its output is shown below.
X = data.matrix(q2df)
n = dim(X)[1]
params = dim(X)[2]
onevec = rep(1, n)
#Mean Vector
xbar = 1/n*t(X)%*%onevec
#Variance Matrix
S = 1/(n - 1)*(t(X)%*%X - n*xbar%*%t(xbar))
S
## x1 x2 x3 x4 x5
## x1 65.09091 33.64545 47.59091 36.77273 25.42727
## x2 33.64545 46.07273 28.94545 40.33636 28.36364
## x3 47.59091 28.94545 60.69091 37.37273 41.12727
## x4 36.77273 40.33636 37.37273 62.81818 31.68182
## x5 25.42727 28.36364 41.12727 31.68182 58.21818
Now we get the eigenvalues and eigenvectors. The eigenvectors will represent the principal components.
#Getting the eigenvalues and vectors
eig = eigen(S)
eig$vectors
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.4727831 0.57631826 0.41685181 0.2285789 0.4672469
## [2,] -0.3918187 0.10826396 -0.45239805 0.6558114 -0.4472185
## [3,] -0.4875471 -0.09600807 0.47945493 -0.3689607 -0.6221505
## [4,] -0.4677199 0.12056819 -0.61953744 -0.5803451 0.2146494
## [5,] -0.4080320 -0.79522446 0.08869553 0.2114962 0.3853964
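Ahead of part (b), the proportion of total variance carried by each component can be read off the eigenvalues (a small addition, not in the original output):
# Proportion and cumulative proportion of total variance per component
round(eig$values/sum(eig$values), 4)
round(cumsum(eig$values)/sum(eig$values), 4)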
Hence our principal components will be as follows:

$$\begin{aligned}
Y_1 &= -0.4728X_1 - 0.3918X_2 - 0.4875X_3 - 0.4677X_4 - 0.4080X_5 \\
Y_2 &= 0.5763X_1 + 0.1082X_2 - 0.0960X_3 + 0.1206X_4 - 0.7952X_5 \\
Y_3 &= 0.4168X_1 - 0.4524X_2 + 0.4794X_3 - 0.6195X_4 + 0.0887X_5 \\
Y_4 &= 0.2286X_1 + 0.6558X_2 - 0.3690X_3 - 0.5803X_4 + 0.2115X_5 \\
Y_5 &= 0.4672X_1 - 0.4472X_2 - 0.6222X_3 + 0.2146X_4 + 0.3854X_5
\end{aligned}$$
Part B
result = prcomp(X, scale = F)
var_exp = result$sdev^2/sum(result$sdev^2)
plot(var_exp, type = 'b', col = 'green', yaxt = 'n', ylab = '', xlab = '')
par(new = T)
plot(cumsum(var_exp), xlab = "Principal Component",
     ylab = "Proportion of Variance Explained",
     main = "Scree Plot Using Covariance Matrix",
     ylim = c(0, 1), type = "b", col = 'blue')
legend(4.2, 0.4, legend = c("Proportion", "Cumulative"),
       col = c("green", "blue"), lty = 1:1, cex = 0.8)
[Figure: Scree Plot Using Covariance Matrix — the proportion (green) and cumulative proportion (blue) of variance explained, plotted against principal component.]
I would say that the first 3 P.C.s are needed to cover 90% of the total variance. When comparing P.C. 3 and P.C. 4, we only get an extra 5% of explained variance.
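Equivalently (a small addition, not in the original), summary on the prcomp object reports these cumulative proportions directly:
# Cumulative proportion of variance explained, straight from prcomp
summary(result)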
Part C
Using the same procedure as part (a), we first need to find the correlation matrix.
# Diagonal matrix of standard deviations, then the correlation matrix
# R = D^(-1/2) S D^(-1/2)
sd = sqrt(diag(diag(S), 5))
invsd = solve(sd)
cor = invsd%*%S%*%t(invsd)
cor
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0000000 0.6143902 0.7571850 0.5750730 0.4130573
## [2,] 0.6143902 1.0000000 0.5473897 0.7497770 0.5476595
## [3,] 0.7571850 0.5473897 1.0000000 0.6052716 0.6918927
## [4,] 0.5750730 0.7497770 0.6052716 1.0000000 0.5238876
## [5,] 0.4130573 0.5476595 0.6918927 0.5238876 1.0000000
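As a quick sanity check (my addition), base R's cov2cor applied to S should reproduce the same matrix:
# Sanity check: cov2cor(S) should match the hand-computed correlation matrix
all.equal(cov2cor(S), cor, check.attributes = FALSE)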
Now let us compute the P.C.s for the correlation matrix. But before we do that, note that when using the correlation matrix for PCA, we need to standardize our variables. Let $Z_i = \frac{X_i - \mu_i}{\sqrt{\sigma_{ii}}}$ for $i = 1, \ldots, 5$.
result = prcomp(scale(X), scale = F )
print(result)
## Standard deviations (1, .., p=5):
## [1] 1.8483758 0.7838567 0.7564879 0.5207797 0.3543867
##
## Rotation (n x k) = (5 x 5):
## PC1 PC2 PC3 PC4 PC5
## x1 -0.4418394 0.2006104 -0.6786078 0.2125365 -0.5087760
## x2 -0.4535595 0.4280646 0.3491277 0.6055405 0.3499642
## x3 -0.4727808 -0.3678765 -0.3754368 -0.2581448 0.6584479
## x4 -0.4536224 0.3934629 0.3345386 -0.7010073 -0.1899641
## x5 -0.4120276 -0.6974023 0.4058723 0.1734903 -0.3860467
Hence our principal components are as follows:

$$\begin{aligned}
Y_1 &= -0.4418Z_1 - 0.4536Z_2 - 0.4728Z_3 - 0.4536Z_4 - 0.4120Z_5 \\
Y_2 &= 0.2006Z_1 + 0.4281Z_2 - 0.3679Z_3 + 0.3935Z_4 - 0.6974Z_5 \\
Y_3 &= 0.6786Z_1 - 0.3491Z_2 + 0.3754Z_3 - 0.3345Z_4 - 0.4059Z_5 \\
Y_4 &= 0.2125Z_1 + 0.6055Z_2 - 0.2581Z_3 - 0.7010Z_4 + 0.1735Z_5 \\
Y_5 &= 0.5088Z_1 - 0.3500Z_2 - 0.6584Z_3 + 0.1900Z_4 + 0.3860Z_5
\end{aligned}$$
Question 3
Use the data in the EXCEL file tab “q2”.
(a) Find the canonical correlation between (x1, x2, x3) and (x4, x5).
(b) Test the significance of the canonical correlations.
(c) Regardless of your answer in part (b), is each canonical correlation individually significant?
Solution
X = data.matrix(q3df)
col_order = c('x4', 'x5', 'x1', 'x2', 'x3')
X = X[,col_order]
Part A
We first need to get the correlation matrix in order to calculate the canonical correlations.
cor(X)
## x4 x5 x1 x2 x3
## x4 1.0000000 0.5238876 0.5750730 0.7497770 0.6052716
## x5 0.5238876 1.0000000 0.4130573 0.5476595 0.6918927
## x1 0.5750730 0.4130573 1.0000000 0.6143902 0.7571850
## x2 0.7497770 0.5476595 0.6143902 1.0000000 0.5473897
## x3 0.6052716 0.6918927 0.7571850 0.5473897 1.0000000
Now let $X^{(1)} = (X_4, X_5)$ and $X^{(2)} = (X_1, X_2, X_3)$.
Furthermore, let the following $\Sigma$ matrices be submatrices of our correlation matrix:

$$\Sigma_{11} = \begin{bmatrix} 1 & 0.5239 \\ 0.5239 & 1 \end{bmatrix} \qquad
\Sigma_{12} = \begin{bmatrix} 0.5751 & 0.7498 & 0.6053 \\ 0.4131 & 0.5477 & 0.6919 \end{bmatrix}$$

$$\Sigma_{21} = \Sigma_{12}^T \qquad
\Sigma_{22} = \begin{bmatrix} 1 & 0.6144 & 0.7572 \\ 0.6144 & 1 & 0.5474 \\ 0.7572 & 0.5474 & 1 \end{bmatrix}$$

Let $A = \Sigma_{11}^{-1/2}\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11}^{-1/2}$. Let us use R to calculate $\Sigma_{11}^{-1/2}$ and $\Sigma_{22}^{-1}$.
cor11 = matrix(c(1.0000000, 0.5238876,
0.5238876, 1.0000000),
ncol = 2, byrow = T)
cor11_sqrt = spectral(cor11)
cor11_invsqrt = solve(cor11_sqrt$SqrtMat)
cor12 = matrix(c(0.5750730, 0.7497770, 0.6052716,
0.4130573, 0.5476595, 0.6918927),
ncol = 3, byrow = T)
cor21 = t(cor12)
cor22 = matrix(c(1.0000000, 0.6143902, 0.7571850,
0.6143902, 1.0000000, 0.5473897,
0.7571850, 0.5473897, 1.0000000),
ncol = 3, byrow = T)
cor22_inv = solve(cor22)
Now we can finally calculate our A matrix. This is shown below.
A = cor11_invsqrt%*%cor12%*%cor22_inv%*%cor21%*%cor11_invsqrt
A
## [,1] [,2]
## [1,] 0.4705581 0.2831067
## [2,] 0.2831067 0.4371090
To find the canonical correlations, we now need to compute the eigenvalues of the A matrix. Using R once again, we get the following output.
canon_cor = sqrt((eigen(A)$values))
canon_cor
## [1] 0.8587396 0.4125933
By taking the square root of the eigenvalues of the matrix A, we get the following canonical correlations:

$$\hat{\rho}_1^* = 0.8587396 \qquad \hat{\rho}_2^* = 0.4125933$$
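As an optional cross-check (my addition, not part of the original solution), the built-in cancor function should reproduce these values, since canonical correlations are invariant to the scaling of each variable set:
# Cross-check with stats::cancor (columns ordered as above)
cancor(X[, c("x4", "x5")], X[, c("x1", "x2", "x3")])$cor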
Part B
Let our null hypothesis be $H_0 : \rho_1^* = \rho_2^* = 0$ (equivalently, $\Sigma_{12} = 0$). We will be using what Bartlett suggests:

$$-\left[n - 1 - \frac{1}{2}(p + q + 1)\right]\log\prod_{i=1}^{p}\left(1 - (\hat{\rho}_i^*)^2\right) > \chi^2_{pq}(\alpha)$$

Let $p \le q$, where p and q are the numbers of variables in the two groups. Hence, p = 2 and q = 3. Using the test statistic formula from above, we get the p-value to be 0.09923, which is quite high. Hence we cannot reject our null hypothesis. This implies that our canonical correlations are not significant. Below is the R code to provide the evidence for our statement.
n = dim(X)[1]
p = min(3,2)
q = max(3,2)
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log(prod((1-canon_cor^2)))
Bartlett
## [1] 10.66704
df = p*q
1-pchisq(Bartlett, df)
## [1] 0.09922844
Part C
Let $H_0^{(1)} : \rho_1^* \neq 0,\ \rho_2^* = 0$. Hence our alternative hypothesis will be $H_a^{(1)} : \rho_2^* \neq 0$.
The Bartlett formula to find the test statistic is as follows:

$$-\left[n - 1 - \frac{1}{2}(p + q + 1)\right]\log\left(1 - (\hat{\rho}_2^*)^2\right) > \chi^2_{(p-1)(q-1)}(\alpha)$$

Using the above test statistic, our Bartlett observed value is 1.306275 with a p-value of 0.5204105. Due to the large p-value, we fail to reject our null hypothesis. Hence we do not have evidence that $\rho_2^*$ is significant. Below is the R code for this problem.
#Testing p_2 not equal to 0
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log((1-canon_cor[2]^2))
Bartlett
## [1] 1.306275
df = (p - 1)*(q-1)
1-pchisq(Bartlett, df)
## [1] 0.5204105
Let $H_0^{(2)} : \rho_2^* \neq 0,\ \rho_1^* = 0$. Hence our alternative hypothesis will be $H_a^{(2)} : \rho_1^* \neq 0$. Essentially we are using the same Bartlett formula, except we replace $\hat{\rho}_2^*$ with $\hat{\rho}_1^*$.
Using the above test statistic, our Bartlett observed value is 9.360764 with a p-value of 0.009275471. Due to the very small p-value, we reject our null hypothesis. Hence we have evidence that $\rho_1^*$ is significant. Below is the R code for this problem.
#Testing p_1 not equal to 0
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log((1-canon_cor[1]^2))
Bartlett
## [1] 9.360764
df = (p - 1)*(q - 1)
1-pchisq(Bartlett, df)
## [1] 0.009275471
Question 4
(a) Show that

$$-\frac{1}{2}(x - \mu_1)'\Sigma^{-1}(x - \mu_1) + \frac{1}{2}(x - \mu_2)'\Sigma^{-1}(x - \mu_2) = (\mu_1 - \mu_2)'\Sigma^{-1}x - \frac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2)$$
(b) Let

$$f_1(x) = 1 - |x| \quad \text{for } |x| \le 1$$
$$f_2(x) = 1 - |x - 0.5| \quad \text{for } -0.5 \le x \le 1.5$$
Solution
Part A
Handwritten notes.
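As a small illustrative addition (the formal proof is in the handwritten notes), the part (a) identity can be spot-checked numerically in R with arbitrary values:
# Numeric spot-check of the part (a) identity with arbitrary values
Sigma = matrix(c(2, 0.5, 0.5, 1), 2, 2)
Sinv = solve(Sigma)
mu1 = c(1, 2); mu2 = c(-1, 0); x = c(0.3, 0.7)
lhs = -0.5*t(x - mu1)%*%Sinv%*%(x - mu1) + 0.5*t(x - mu2)%*%Sinv%*%(x - mu2)
rhs = t(mu1 - mu2)%*%Sinv%*%x - 0.5*t(mu1 - mu2)%*%Sinv%*%(mu1 + mu2)
all.equal(c(lhs), c(rhs)) # TRUE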
Part B.1
[Figure: Probability Density Function Graph — f1(x) = (1 − |x|) (solid) and f2(x) = (1 − |x − 0.5|) (dashed) plotted against values of x.]
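The figure was presumably produced by code along these lines (a sketch, not the original code; the support of f2 is assumed to be [-0.5, 1.5]):
# Sketch of the density plot (assumed reconstruction, not the original code)
x = seq(-1.5, 2, by = 0.01)
f1 = ifelse(abs(x) <= 1, 1 - abs(x), 0)
f2 = ifelse(x >= -0.5 & x <= 1.5, 1 - abs(x - 0.5), 0)
plot(x, f1, type = "l", lty = 1, ylim = c(0, 1), xlab = "Values of x",
     ylab = "Density", main = "Probability Density Function Graph")
lines(x, f2, lty = 2)
legend("topright", legend = c("(1 - |x|)", "(1 - |x - 0.5|)"), lty = 1:2)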
Part B.2
Handwritten notes.
Part B.3
Handwritten notes.
Question 5
Use the data in the EXCEL file tab “q5”. Assume the data is a sample from two multivariate normal distributions.
(a) Identify the classification rule for the case p1 = p2 and c(1|2) = c(2|1).
(b) Identify the classification rule for the case p1 = 0.25 and c(1|2) is half of c(2|1).
(c) Identify the Bayesian rule using p1 = 0.6.
(d) Based on the rules given in part (a), which population will a new data point (50, 48, 47, 49) be classified into? How about using the rule in part (b)? Using the rule in part (c)?
Solution
A = as.matrix(q5df[q5df$Group == "A", 1:4])
B = as.matrix(q5df[q5df$Group == "B", 1:4])
dat = list(A, B)
Part A
All the formulation of the classification rule will be in the handwritten notes; however, here is the R code for the calculation.
# Mean Vectors
xbar1 = xbar(A)
xbar2 = xbar(B)
spooled = Spool(dat)
ahat = t(xbar1 - xbar2)%*%solve(spooled)
ybar1 = ahat%*%xbar1
ybar2 = ahat%*%xbar2
midpoint = 1/2*(ybar1 + ybar2)
midpoint
## [,1]
## [1,] -7.062536
# Hence the classification cutoff is -7.062536
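As an optional cross-check (my addition, assuming the MASS package is available), lda with equal priors gives a discriminant direction proportional to ahat above:
# Optional cross-check: Fisher's discriminant direction via MASS::lda
library(MASS)
ldafit = lda(Group ~ ., data = q5df, prior = c(0.5, 0.5))
ldafit$scaling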
Part B
Handwritten Notes
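Though the derivation is handwritten, the cutoff value quoted later in part D.2 can be computed in R (a small addition; it is ln[(c(1|2)/c(2|1))(p2/p1)] with the values given):
# Cutoff for p1 = 0.25 and c(1|2) = 0.5*c(2|1): ln[(0.5)(0.75/0.25)]
log(0.5*(0.75/0.25))
## [1] 0.4054651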
Part C
Handwritten Notes.
17Solution
4630 Assignment 3 Ravish Kamath 213893664
Part D.1
When we have equal costs and equal priors:
newobs = c(50, 48, 47, 49)
ahat%*%newobs
## [,1]
## [1,] -5.481962
Since the observation is bigger than the cutoff point, which was -7.062536, we will classify this obs. to population A.
Part D.2
When p1 = 0.25, and c(1|2) is half of c(2|1)
ahat%*%newobs - midpoint
## [,1]
## [1,] 1.580574
Since the cutoff value for this classification is 0.4055 and our observed value of 1.580574 exceeds it, we will also classify this obs. to population A.
Part D.3
Most of this will be shown in the handwritten notes; however, to calculate certain probabilities for the Bayesian rule, we need to use R. This is shown below.
#When we use Bayesian Rule
probA_newob = (2*pi)^(-2)*(det(spooled))^(-1/2)*
  exp((-1/2)*t(newobs - xbar1)%*%solve(spooled)%*%(newobs - xbar1))
probA_newob
## [,1]
## [1,] 1.475133e-09
probB_newob = (2*pi)^(-2)*(det(spooled))^(-1/2)*
  exp((-1/2)*t(newobs - xbar2)%*%solve(spooled)%*%(newobs - xbar2))
probB_newob
## [,1]
## [1,] 3.036662e-10
A_newob = (0.6*probA_newob)/(0.6*probA_newob + 0.4*probB_newob)
A_newob
## [,1]
## [1,] 0.8793235
B_newob = (0.4*probB_newob)/(0.6*probA_newob + 0.4*probB_newob)
B_newob
## [,1]
## [1,] 0.1206765
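Since the posterior probability for population A (0.8793) exceeds that for population B (0.1207), the Bayesian rule with p1 = 0.6 also classifies this observation into population A.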