MATH4630 Assignment 3 - Solved

Functions Created 
Here are some functions I have created to save time when completing questions. 
xbar = function(matrix){
  n = dim(matrix)[1]                # number of observations
  onevec = rep(1, n)
  xbar = 1/n*t(matrix)%*%onevec     # sample mean vector
  return(xbar)
}

Spool = function(dat){
  # Getting the n lengths and parameters
  nbar = rep(0, length(dat))
  for (i in 1:length(dat)){
    nbar[i] = dim(dat[[i]])[1]
  }
  n = sum(nbar)
  params = dim(dat[[1]])[2]
  g = length(dat)
  # Getting the variance matrices
  S = list()
  for (i in 1:g){
    S = append(S, list(var(dat[[i]])))
  }
  # Calculating the Within matrix
  W = matrix(0, params, params)
  Wnew = matrix(0, params, params)
  for (i in 1:g){
    Wnew = as.matrix(lapply(S[i], '*', (nbar[i] - 1)))
    Wnew = matrix(unlist(Wnew), ncol = params, byrow = T)
    W = W + Wnew
  }
  # s_pooled
  result = W/(n - g)
  return(result)
}

spectral = function(matrix){
  eig = eigen(matrix)
  D = diag(eig$values)              # diagonal matrix of eigenvalues
  P = eig$vectors                   # matrix of eigenvectors
  matsqrt = P%*%sqrt(D)%*%t(P)      # spectral square root
  output = list(D, P, matsqrt)
  names(output) = c('D', 'P', 'SqrtMat')
  return(output)
}
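As a quick sanity check of spectral (an illustrative example of my own, not part of the assignment), the square-root matrix should reproduce the original matrix when multiplied by itself:
M = matrix(c(2, 1, 1, 2), nrow = 2)  # small symmetric positive-definite matrix
sp = spectral(M)
sp$SqrtMat %*% sp$SqrtMat            # should return M up to rounding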

Question 1 
Consider the data given in the EXCEL file tab “q1”.
(a) State the multivariable linear regression model with all the necessary assumptions.
(b) Find the predicted model.
(c) Test the significance of the model.
(d) Regardless of your result in part (c), test if X1 is significant. How about X2?
(e) Find a 95% working Hotelling confidence region for the mean response when X1 = 192 and X2 = 152.
(f) Find a 95% working Hotelling prediction region for a new response when X1 = 192 and X2 = 152.
Solution 
Part A 
$$Y = X\beta + \varepsilon$$

where $Y = [Y_{(1)}, Y_{(2)}]$, $X = [\mathbf{1}, X_{(1)}, X_{(2)}]$, $\beta = [\beta_{(0)}, \beta_{(1)}, \beta_{(2)}]$, and $\varepsilon = [\varepsilon_{(1)}, \varepsilon_{(2)}]$.

Assumptions: $E(\varepsilon_{(i)}) = \mathbf{0}$, $V(\varepsilon_{(i)}) = \sigma_{ii}I$, and $\mathrm{cov}(\varepsilon_{(i)}, \varepsilon_{(j)}) = \sigma_{ij}I$.
Part B 
For this section we will show both ways of getting the predicted model: one through the explicit least squares (LS) calculation, and the other using the lm function built into R.
Least Squares Method 
Y = as.matrix(q1df[,1:2]) 
X = as.matrix(q1df[,3:4]) 
onevec = rep(1, dim(X)[1]) 
Xnew = cbind(onevec,X) 
beta_coef = solve(crossprod(Xnew))%*%crossprod(Xnew, Y) 
beta_coef 
## y1 y2 
## onevec 28.0020858 35.8023808 
## x1 0.3876325 0.2446617 
## x2 0.5766253 0.4713147 
Using the built in function in R 
fit = lm(cbind(y1, y2) ~ x1 + x2, data = q1df) 
summary(fit) 
## Response y1 : 
## 
## Call: 
## lm(formula = y1 ~ x1 + x2, data = q1df) 
## 
## Residuals: 
## Min 1Q Median 3Q Max 
## -13.361 -3.794 0.001 4.041 17.925 
## 
## Coefficients: 
## Estimate Std. Error t value Pr(>|t|) 
## (Intercept) 28.0021 29.8446 0.938 0.358 
## x1 0.3876 0.2454 1.579 0.129 
## x2 0.5766 0.3673 1.570 0.131 
## 
## Residual standard error: 6.564 on 22 degrees of freedom 
## Multiple R-squared: 0.5837, Adjusted R-squared: 0.5459 
## F-statistic: 15.43 on 2 and 22 DF, p-value: 6.502e-05 
## 
## 
## Response y2 : 
## 
## Call: 
## lm(formula = y2 ~ x1 + x2, data = q1df) 
## 
## Residuals: 
## Min 1Q Median 3Q Max 
## -7.6192 -3.2907 -0.5662 2.5827 10.7487 
## 
## Coefficients: 
## Estimate Std. Error t value Pr(>|t|) 
## (Intercept) 35.8024 23.8778 1.499 0.148 
## x1 0.2447 0.1964 1.246 0.226 
## x2 0.4713 0.2938 1.604 0.123 
## 
## Residual standard error: 5.252 on 22 degrees of freedom 
## Multiple R-squared: 0.5349, Adjusted R-squared: 0.4926 
## F-statistic: 12.65 on 2 and 22 DF, p-value: 0.0002205 
Part C 
We have our null hypothesis to be $H_0: X_1 = X_2 = 0$. Let $\hat{e} = Y - \hat{Y}$ and $\hat{\Sigma} = n^{-1}\hat{e}^T\hat{e}$. Below is the calculation for our sample variance.
Yhat = Xnew%*%beta_coef 
ehat = Y - Yhat 
n = dim(X)[1] 
Sigmahat = 1/n*t(ehat)%*%ehat 
Sigmahat 
## y1 y2 
## y1 37.9206 10.70300 
## y2 10.7030 24.27345 
Now to calculate our test statistic (which will use Wilks' Lambda), we need to solve for SSE and SST. Recall that $SSE = n\hat{\Sigma}$ and $SST = (Y - \bar{Y})^T(Y - \bar{Y})$. Below are the given outputs for SSE and SST.
SSE = n*Sigmahat 
SSE 
## y1 y2 
## y1 948.015 267.5750 
## y2 267.575 606.8363 
ybar = colMeans(Y) 
n = nrow(Y) 
m = ncol(Y) 
r = ncol(X) 
q = 1 
Ybar = matrix(ybar, n ,m, byrow = T) 
SST = crossprod(Y - Ybar) 
SST 
## y1 y2 
## y1 2277.44 1230.04 
## y2 1230.04 1304.64 
We now finally calculate the Wilks Lambda and complete the Bartlett method. Wilks' Lambda is

$$\Lambda = \frac{|SSE|}{|SST|}.$$

Furthermore, the formula for the Bartlett test statistic is as follows:

$$-\left(n - r - 1 - \tfrac{1}{2}(m - r + q + 1)\right)\log(\Lambda) \sim \chi^2_{m(r-q)}$$
The output is shown below. 
WilksLam = det(SSE)/det(SST) 
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam) 
df = m*(r-q) 
1 - pchisq(Bartlett, df) 
## [1] 1.420822e-05 
Thus we get the Bartlett observed value to be 22.32338 with a p-value of 1.420822e-05. Based on the very small p-value, we reject the null hypothesis. This implies that there is evidence that the model is significant.
Part D 
Let $H_0: X_2 = 0$.
Following the same procedure as part (c), we have our Λ statistic to be 0.8552134, which through Bartlett's method gives a test statistic of 3.284489 with a p-value of 0.1935451. Based on the large p-value, we cannot reject $H_0$. This implies that X2 is not significant. Below is the output code.
#Testing if Beta(2) is significant 
#H_0: Beta(2) or X2 = 0 
X1 = Xnew[,1:2] 
Betahat1 = solve(crossprod(X1))%*%crossprod(X1,Y) 
Sigmahat1 = 1/n*crossprod(Y - X1%*%Betahat1) 
E = n*Sigmahat 
H = n*(Sigmahat1 - Sigmahat) 
n = nrow(Y) 
m = ncol(Y) 
r = ncol(X1) 
q = 1 
WilksLam = det(E)/det(E + H) 
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam) 
df = m*(r-q) 
1 - pchisq(Bartlett, df) 
## [1] 0.1935451 
Let $H_0: X_1 = 0$.
Following the same procedure as part (c), we have our Λ statistic to be 0.8787331, which through Bartlett's method gives a test statistic of approximately 2.714754 (consistent with the Λ value and p-value below), with a p-value of 0.2573347. Based on the large p-value, we cannot reject $H_0$. This implies that X1 is not significant. Below is the output code.
#Testing if Beta(1) is significant 
#H_0: Beta(1) or X1 = 0 
X2 = cbind(Xnew[,1], Xnew[,3]) 
Betahat2 = solve(crossprod(X2))%*%crossprod(X2,Y) 
Sigmahat2 = 1/n*crossprod(Y - X2%*%Betahat2) 
E = n*Sigmahat 
H = n*(Sigmahat2 - Sigmahat) 
n = nrow(Y) 
m = ncol(Y) 
r = ncol(X2) 
q = 1 
WilksLam = det(E)/det(E + H) 
Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam) 
df = m*(r-q) 
1 - pchisq(Bartlett, df) 
## [1] 0.2573347 
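Since the two tests above repeat the same steps, here is a small helper (my own sketch, not part of the original solution) that wraps the reduced-model Wilks/Bartlett test; the keep argument gives the columns of the full design matrix retained under the null.
drop_test = function(Y, Xfull, keep, Sigmahat){
  n = nrow(Y); m = ncol(Y); q = 1
  Xred = Xfull[, keep, drop = FALSE]              # reduced design matrix
  Bred = solve(crossprod(Xred))%*%crossprod(Xred, Y)
  Sigred = 1/n*crossprod(Y - Xred%*%Bred)         # Sigma-hat under the null
  E = n*Sigmahat                                  # error SSP from the full model
  H = n*(Sigred - Sigmahat)                       # extra SSP due to dropped terms
  r = ncol(Xred)
  WilksLam = det(E)/det(E + H)
  Bartlett = -(n - r - 1 - 1/2*(m - r + q + 1))*log(WilksLam)
  c(WilksLambda = WilksLam, Bartlett = Bartlett,
    p.value = 1 - pchisq(Bartlett, df = m*(r - q)))
}
drop_test(Y, Xnew, keep = c(1, 2), Sigmahat)  # tests X2 = 0, as above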
Part E 
The formula for a 100(1 − α)% confidence region is as follows:

$$\tilde{x}_0^T \hat{\beta}_{(i)} \pm \sqrt{\frac{m(n-r-1)}{n-r-m}\, F_{m,\,n-r-m}(\alpha)}\; \sqrt{\tilde{x}_0^T (X^T X)^{-1}\tilde{x}_0\, \hat{\Sigma}_{ii}}$$

Let $\tilde{x}_0 = [1, 192, 152]^T$ (the leading 1 is the intercept term). Let us now substitute $\tilde{x}_0$ into the formula given above.
newobs = data.frame(x1 = 192, x2 = 152) 
pred = predict(fit, newobs) 
n = nrow(Y) 
m = ncol(Y) 
r = ncol(X) 
table = sqrt( ((m*(n-r-1))/(n-r-m))*qf(0.95, df1 = m, df2 = n-r-m) ) 
x0 = c(1,192, 152) 
sd1 = sqrt(t(x0)%*%solve(crossprod(Xnew))%*%x0*Sigmahat[1,1]) 
sd2 = sqrt(t(x0)%*%solve(crossprod(Xnew))%*%x0*Sigmahat[2,2]) 
sd = c(sd1, sd2) 
CR_L = pred - table*sd 
CR_U = pred + table*sd 
CR = cbind(t(CR_L), t(CR_U)) 
colnames(CR) = c("Lower", "Upper") 
CR 
## Lower Upper 
## y1 185.4438 194.7054 
## y2 150.7123 158.1222 
Hence our Confidence Region will be:

$$\begin{bmatrix} 190.0746 \\ 154.4173 \end{bmatrix} \pm 2.695139 \begin{bmatrix} 1.718209 \\ 1.374688 \end{bmatrix}$$
Part F 
Very similar to our C.R. formula from part (e), our Hotelling prediction region is as follows:

$$\tilde{x}_0^T \hat{\beta}_{(i)} \pm \sqrt{\frac{m(n-r-1)}{n-r-m}\, F_{m,\,n-r-m}(\alpha)}\; \sqrt{\left(1 + \tilde{x}_0^T (X^T X)^{-1}\tilde{x}_0\right) \hat{\Sigma}_{ii}}$$
Let us now use R to calculate the P.R.
sd1 = sqrt( (1 + t(x0)%*%solve(crossprod(Xnew))%*%x0)*Sigmahat[1,1] ) 
sd2 = sqrt( (1 + t(x0)%*%solve(crossprod(Xnew))%*%x0)*Sigmahat[2,2] ) 
sd = c(sd1, sd2) 
PR_L = pred - table*sd 
PR_U = pred + table*sd 
PR = cbind(t(PR_L), t(PR_U)) 
colnames(PR) = c("Lower", "Upper") 
PR 
## Lower Upper 
## y1 172.8440 207.3051 
## y2 140.6316 168.2029 
Hence our Prediction Region will be:

$$\begin{bmatrix} 190.0746 \\ 154.4173 \end{bmatrix} \pm 2.695139 \begin{bmatrix} 6.3931987 \\ 5.115 \end{bmatrix}$$
Question 2 
Timm (1975) reported the results of an experiment in which subjects' response times to “probe words” were measured at five positions (Y1 is at the beginning of the sentence, Y2 is in the first quartile of the sentence, Y3 is in the middle of the sentence, Y4 is in the third quartile of the sentence, and Y5 is at the end of the sentence). The data are recorded in the EXCEL file tab “q2”.
(a) Use the sample variance and obtain all the principal components.
(b) Timm specifically required the reduction in dimension to cover at least 90% of the total variance. How many principal components are needed? Why?
(c) Repeat parts (a) and (b) using the sample correlation matrix.
Solution 
Part A 
In this part, we will hand-code the principal components using the sample variance. Note that we could also use the prcomp function, which is the usual way to compute principal components (a cross-check with prcomp is shown after the eigenvector output below). First we need to calculate the sample variance matrix; its output is shown below.
X = data.matrix(q2df) 
n = dim(X)[1] 
params = dim(X)[2] 
onevec = rep(1, n) 
#Mean Vector 
xbar = 1/n*t(X)%*%onevec 
#Variance Matrix 
S = 1/(n - 1)*(t(X)%*%X - n*xbar%*%t(xbar))
S
## x1 x2 x3 x4 x5 
## x1 65.09091 33.64545 47.59091 36.77273 25.42727 
## x2 33.64545 46.07273 28.94545 40.33636 28.36364 
## x3 47.59091 28.94545 60.69091 37.37273 41.12727 
## x4 36.77273 40.33636 37.37273 62.81818 31.68182 
## x5 25.42727 28.36364 41.12727 31.68182 58.21818 
Now we get the eigenvalues and eigenvectors. The eigenvectors represent the principal components.
#Getting the eigenvalues and vectors 
eig = eigen(S) 
eig$vectors 
## [,1] [,2] [,3] [,4] [,5] 
## [1,] -0.4727831 0.57631826 0.41685181 0.2285789 0.4672469 
## [2,] -0.3918187 0.10826396 -0.45239805 0.6558114 -0.4472185 
## [3,] -0.4875471 -0.09600807 0.47945493 -0.3689607 -0.6221505 
## [4,] -0.4677199 0.12056819 -0.61953744 -0.5803451 0.2146494 
## [5,] -0.4080320 -0.79522446 0.08869553 0.2114962 0.3853964 
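As noted above, prcomp gives the same loadings (up to sign flips of whole columns); a quick cross-check:
prcomp(X, scale. = FALSE)$rotation  # columns should match the eigenvectors up to sign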
Hence our principal components will be as follows:

$$\begin{aligned}
Y_1 &= -0.4728X_1 - 0.3918X_2 - 0.4875X_3 - 0.4677X_4 - 0.4080X_5 \\
Y_2 &= 0.5763X_1 + 0.1083X_2 - 0.0960X_3 + 0.1206X_4 - 0.7952X_5 \\
Y_3 &= 0.4168X_1 - 0.4524X_2 + 0.4794X_3 - 0.6195X_4 + 0.0887X_5 \\
Y_4 &= 0.2286X_1 + 0.6558X_2 - 0.3690X_3 - 0.5803X_4 + 0.2115X_5 \\
Y_5 &= 0.4672X_1 - 0.4472X_2 - 0.6222X_3 + 0.2146X_4 + 0.3854X_5
\end{aligned}$$
Part B 
result = prcomp(X, scale = F)
var_exp = result$sdev^2/sum(result$sdev^2)
plot(var_exp, type = 'b', col = 'green', yaxt = 'n', ylab = '', xlab = '')
par(new = T)
plot(cumsum(var_exp), xlab = "Principal Component",
     ylab = "Proportion of Variance Explained",
     main = "Scree Plot Using Covariance Matrix",
     ylim = c(0, 1), type = "b", col = 'blue')
legend(4.2, 0.4, legend = c("Proportion", "Cumulative"),
       col = c("green", "blue"), lty = 1:1, cex = 0.8)

[Figure: Scree plot using the covariance matrix; proportion (green) and cumulative proportion (blue) of variance explained for each principal component.]
I would say that the first 3 P.C.s are needed to cover 90% of the total variance. Comparing P.C. 3 to P.C. 4, we only gain an extra 5% of explained variance.
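To make the 90% claim concrete, the cumulative proportions can be printed directly (using var_exp computed above):
round(cumsum(var_exp), 4)  # cumulative proportion of variance by number of PCs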
Part C 
Following the same procedure as part (a), we first need to find the correlation matrix.
sd = sqrt(diag(diag(S),5)) 
invsd = solve(sd) 
cor = invsd%*%S%*%t(invsd) 
cor 
## [,1] [,2] [,3] [,4] [,5] 
## [1,] 1.0000000 0.6143902 0.7571850 0.5750730 0.4130573 
## [2,] 0.6143902 1.0000000 0.5473897 0.7497770 0.5476595 
## [3,] 0.7571850 0.5473897 1.0000000 0.6052716 0.6918927 
## [4,] 0.5750730 0.7497770 0.6052716 1.0000000 0.5238876 
## [5,] 0.4130573 0.5476595 0.6918927 0.5238876 1.0000000 
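As a sanity check (my addition), the built-in correlation function should reproduce the hand-computed matrix:
max(abs(stats::cor(X) - cor))  # expect a value near 0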
Now let us compute the P.C. for the correlation matrix. Before we do that, note that when using the correlation matrix for PCA, we need to standardize our variables: let $Z_i = (X_i - \mu_i)/\sqrt{\sigma_{ii}}$ for $i = 1, \ldots, 5$.
result = prcomp(scale(X), scale = F ) 
print(result) 
## Standard deviations (1, .., p=5): 
## [1] 1.8483758 0.7838567 0.7564879 0.5207797 0.3543867 
## 
## Rotation (n x k) = (5 x 5): 
## PC1 PC2 PC3 PC4 PC5 
## x1 -0.4418394 0.2006104 -0.6786078 0.2125365 -0.5087760 
## x2 -0.4535595 0.4280646 0.3491277 0.6055405 0.3499642 
## x3 -0.4727808 -0.3678765 -0.3754368 -0.2581448 0.6584479 
## x4 -0.4536224 0.3934629 0.3345386 -0.7010073 -0.1899641 
## x5 -0.4120276 -0.6974023 0.4058723 0.1734903 -0.3860467 
Hence our principal components are as follows:

$$\begin{aligned}
Y_1 &= -0.4418Z_1 - 0.4536Z_2 - 0.4728Z_3 - 0.4536Z_4 - 0.4120Z_5 \\
Y_2 &= 0.2006Z_1 + 0.4281Z_2 - 0.3679Z_3 + 0.3935Z_4 - 0.6974Z_5 \\
Y_3 &= 0.6786Z_1 - 0.3491Z_2 + 0.3754Z_3 - 0.3345Z_4 - 0.4059Z_5 \\
Y_4 &= 0.2125Z_1 + 0.6055Z_2 - 0.2581Z_3 - 0.7010Z_4 + 0.1735Z_5 \\
Y_5 &= 0.5088Z_1 - 0.3500Z_2 - 0.6584Z_3 + 0.1900Z_4 + 0.3860Z_5
\end{aligned}$$
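Equivalently (a quick check of my own), correlation-based PCA can be done by an eigendecomposition of the cor matrix computed above; the square roots of its eigenvalues should match the standard deviations reported by prcomp:
sqrt(eigen(cor)$values)  # should match result$sdev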
Question 3 
Use the data in the EXCEL file tab “q2”.
(a) Find the canonical correlation between (x1, x2, x3) and (x4, x5).
(b) Test the significance of the canonical correlations.
(c) Regardless of your answer in part (b), is each canonical correlation individually significant?
Solution 
X = data.matrix(q3df) 
col_order = c('x4', 'x5', 'x1', 'x2', 'x3') 
X = X[,col_order] 
Part A 
We first need to get the correlation matrix in order to calculate the canonical correlations.
cor(X) 
## x4 x5 x1 x2 x3 
## x4 1.0000000 0.5238876 0.5750730 0.7497770 0.6052716 
## x5 0.5238876 1.0000000 0.4130573 0.5476595 0.6918927 
## x1 0.5750730 0.4130573 1.0000000 0.6143902 0.7571850 
## x2 0.7497770 0.5476595 0.6143902 1.0000000 0.5473897 
## x3 0.6052716 0.6918927 0.7571850 0.5473897 1.0000000 
Now let $X^{(1)} = (X_4, X_5)$ and $X^{(2)} = (X_1, X_2, X_3)$.
Furthermore, let the following Σ matrices be subsections of our correlation matrix:

$$\Sigma_{11} = \begin{bmatrix} 1 & 0.5239 \\ 0.5239 & 1 \end{bmatrix} \qquad
\Sigma_{12} = \begin{bmatrix} 0.5751 & 0.7498 & 0.6053 \\ 0.4131 & 0.5477 & 0.6919 \end{bmatrix}$$

$$\Sigma_{21} = \Sigma_{12}' \qquad
\Sigma_{22} = \begin{bmatrix} 1 & 0.6144 & 0.7572 \\ 0.6144 & 1 & 0.5474 \\ 0.7572 & 0.5474 & 1 \end{bmatrix}$$

Let $A = \Sigma_{11}^{-1/2}\,\Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}\,\Sigma_{11}^{-1/2}$. Let us use R to calculate $\Sigma_{11}^{-1/2}$ and $\Sigma_{22}^{-1}$.

cor11 = matrix(c(1.0000000, 0.5238876, 
0.5238876, 1.0000000), 
ncol = 2, byrow = T) 
cor11_sqrt = spectral(cor11) 
cor11_invsqrt = solve(cor11_sqrt$SqrtMat) 
cor12 = matrix(c(0.5750730, 0.7497770, 0.6052716, 
0.4130573, 0.5476595, 0.6918927), 
ncol = 3, byrow = T) 
cor21 = t(cor12) 
cor22 = matrix(c(1.0000000, 0.6143902, 0.7571850, 
0.6143902, 1.0000000, 0.5473897, 
0.7571850, 0.5473897, 1.0000000), 
ncol = 3, byrow = T) 
cor22_inv = solve(cor22) 
Now we can finally calculate our A matrix. This is shown below.
A = cor11_invsqrt%*%cor12%*%cor22_inv%*%cor21%*%cor11_invsqrt
A
## [,1] [,2] 
## [1,] 0.4705581 0.2831067 
## [2,] 0.2831067 0.4371090 
To find the canonical correlations, we now need to compute the eigenvalues of the A matrix. Using R once again, we obtain the following output.
canon_cor = sqrt((eigen(A)$values)) 
canon_cor 
## [1] 0.8587396 0.4125933 
By taking the square roots of the eigenvalues of the matrix A, we get the following canonical correlations:

$$\hat{\rho}^*_1 = 0.8587396 \qquad \hat{\rho}^*_2 = 0.4125933$$
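As a cross-check (my addition), base R's cancor, which works on the centered data matrices, should return the same canonical correlations:
cancor(X[, 1:2], X[, 3:5])$cor  # columns 1:2 are (x4, x5) after the reordering above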
Part B 
Let our null hypothesis be $H_0: \rho^*_1 = \rho^*_2 = 0$ (equivalently, $\Sigma_{12} = 0$). We will use the test that Bartlett suggests:

$$-\left[n - 1 - \tfrac{1}{2}(p + q + 1)\right]\log \prod_{i=1}^{p}\left(1 - (\hat{\rho}^*_i)^2\right) > \chi^2_{pq}(\alpha)$$
Let $p \le q$, where $p$ and $q$ are the numbers of variables in the two groups; hence $p = 2$ and $q = 3$. Using the test statistic formula above, we get a p-value of 0.09923, which is quite high. Hence we cannot reject our null hypothesis. This implies that our canonical correlations are not significant. Below is the R code that provides the evidence for our statement.
n = dim(X)[1] 
p = min(3,2) 
q = max(3,2) 
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log(prod((1-canon_cor^2)))
Bartlett 
## [1] 10.66704 
df = p*q 
1-pchisq(Bartlett, df) 
## [1] 0.09922844 
Part C 
Let $H_0^{(1)}: \rho^*_1 \neq 0,\ \rho^*_2 = 0$. Hence our alternative hypothesis will be $H_a^{(1)}: \rho^*_2 \neq 0$.
The Bartlett formula to find the test statistic is as follows:

$$-\left[n - 1 - \tfrac{1}{2}(p + q + 1)\right]\log\left(1 - (\hat{\rho}^*_2)^2\right) > \chi^2_{(p-1)(q-1)}(\alpha)$$
Using the above test statistic, our Bartlett observed value is 1.306275 with a p-value of 0.5204105. Due to the large p-value, we fail to reject our null hypothesis. Hence we do not have evidence that $\rho^*_2$ is significant. Below is the R code for this problem.
#Testing p_2 not equal to 0 
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log((1-canon_cor[2]^2))
Bartlett 
## [1] 1.306275 
df = (p - 1)*(q-1) 
1-pchisq(Bartlett, df) 
## [1] 0.5204105 
Let $H_0^{(2)}: \rho^*_2 \neq 0,\ \rho^*_1 = 0$. Hence our alternative hypothesis will be $H_a^{(2)}: \rho^*_1 \neq 0$. Essentially we use the same Bartlett formula, replacing $\hat{\rho}^*_2$ with $\hat{\rho}^*_1$.
Using the above test statistic, our Bartlett observed value is 9.360764 with a p-value of 0.009275471. Due to the very small p-value, we reject our null hypothesis. Hence we have evidence that $\rho^*_1$ is significant. Below is the R code for this problem.
#Testing p_1 not equal to 0 
Bartlett = -(n - 1 - 1/2*(p + q + 1))*log((1-canon_cor[1]^2))
Bartlett 
## [1] 9.360764 
df = (p - 1)*(q - 1) 
1-pchisq(Bartlett, df) 
## [1] 0.009275471 
Question 4 
(a) Show that

$$-\tfrac{1}{2}(x - \mu_1)'\Sigma^{-1}(x - \mu_1) + \tfrac{1}{2}(x - \mu_2)'\Sigma^{-1}(x - \mu_2) = (\mu_1 - \mu_2)'\Sigma^{-1}x - \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2)$$
(b) Let

$$f_1(x) = 1 - |x| \quad \text{for } |x| \le 1$$

$$f_2(x) = 1 - |x - 0.5| \quad \text{for } -0.5 \le x \le 1.5$$
Solution 
Part A 
Handwritten notes. 
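Since the full proof is in the handwritten notes, here is a compact sketch of the key expansion: the quadratic terms $x'\Sigma^{-1}x$ cancel, and the symmetry of $\Sigma^{-1}$ collects the remaining terms:

$$\begin{aligned}
&-\tfrac{1}{2}(x - \mu_1)'\Sigma^{-1}(x - \mu_1) + \tfrac{1}{2}(x - \mu_2)'\Sigma^{-1}(x - \mu_2) \\
&\quad= \mu_1'\Sigma^{-1}x - \mu_2'\Sigma^{-1}x - \tfrac{1}{2}\mu_1'\Sigma^{-1}\mu_1 + \tfrac{1}{2}\mu_2'\Sigma^{-1}\mu_2 \\
&\quad= (\mu_1 - \mu_2)'\Sigma^{-1}x - \tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2),
\end{aligned}$$

where the last step uses $\mu_1'\Sigma^{-1}\mu_1 - \mu_2'\Sigma^{-1}\mu_2 = (\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2)$, which holds because $\Sigma^{-1}$ is symmetric.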
Part B.1 
[Figure: “Probability Density Function Graph” plotting both densities against “Values of x”, with line types distinguishing (1 − |x|) from (1 − |x − 0.5|).]
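A sketch of R code that would reproduce this figure (the plotting range and styling are my assumptions):
f1 = function(x) pmax(1 - abs(x), 0)        # density supported on [-1, 1]
f2 = function(x) pmax(1 - abs(x - 0.5), 0)  # density supported on [-0.5, 1.5]
curve(f1, from = -1, to = 2, lty = 1, xlab = "Values of x", ylab = "Density",
      main = "Probability Density Function Graph")
curve(f2, from = -1, to = 2, lty = 2, add = TRUE)
legend("topright", legend = c("(1 - |x|)", "(1 - |x - 0.5|)"), lty = 1:2)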
Part B.2 
Handwritten notes. 
Part B.3 
Handwritten notes. 
Question 5 
Use the data in the EXCEL file tab “q5”. Assume the data is a sample from two multivariate normal distributions.
(a) Identify the classification rule for the case p1 = p2 and c(1|2) = c(2|1).
(b) Identify the classification rule for the case p1 = 0.25 and c(1|2) is half of c(2|1).
(c) Identify the Bayesian rule using p1 = 0.6.
(d) Based on the rule given in part (a), which population will a new data point (50, 48, 47, 49) be classified into? How about using the rule in part (b)? Using the rule in part (c)?
Solution 
A = as.matrix(q5df[q5df$Group == "A", 1:4])
B = as.matrix(q5df[q5df$Group == "B", 1:4])
dat = list(A, B) 
Part A 
The full formulation of the classification rule is in the handwritten notes; however, here is the R code for the calculation.
xbar = function(matrix){
  n = dim(matrix)[1]
  onevec = rep(1, n)
  xbar = 1/n*t(matrix)%*%onevec
  return(xbar)
}
# Mean Vectors 
xbar1 = xbar(A) 
xbar2 = xbar(B) 
spooled = Spool(dat) 
ahat = t(xbar1 - xbar2)%*%solve(spooled) 
ybar1 = ahat%*%xbar1 
ybar2 = ahat%*%xbar2 
midpoint = 1/2*(ybar1 + ybar2) 
midpoint 
## [,1] 
## [1,] -7.062536 
# Hence the classification cutoff is -7.062536
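As a quick check of the rule (my addition, not in the original solution), we can score the training observations and classify each to population A when its score is at least the midpoint (since ybar1 > ybar2):
scores = rbind(A, B)%*%t(ahat)                        # discriminant scores
pred = ifelse(scores >= as.numeric(midpoint), "A", "B")
truth = rep(c("A", "B"), c(nrow(A), nrow(B)))
table(truth, pred)                                    # apparent classification table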
Part B 
Handwritten Notes 
Part C 
Handwritten Notes. 
Part D.1 
When we have equal costs and equal priors:
newobs = c(50, 48, 47, 49) 
ahat%*%newobs 
## [,1] 
## [1,] -5.481962 
Since the observation's score (-5.481962) is larger than the cutoff point of -7.062536, we will classify this observation into population A.
Part D.2 
When p1 = 0.25 and c(1|2) is half of c(2|1):
ahat%*%newobs - midpoint 
## [,1] 
## [1,] 1.580574 
Since the score above (1.580574) exceeds the cutoff value of 0.4055 for this classification, we will also classify this observation into population A.
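The 0.4055 cutoff is ln[(c(1|2)/c(2|1)) (p2/p1)]; a one-line check (my addition):
log((1/2)*(0.75/0.25))  # ln(1.5) = 0.4054651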
Part D.3 
Most of this is shown in the handwritten notes; however, to calculate certain probabilities for the Bayesian rule, we need to use R. This is shown below.
#When we use Bayesian Rule 
probA_newob = (2*pi)^(-2)*(det(spooled))^(-1/2)*
  exp( (-1/2)*t(newobs - xbar1)%*%solve(spooled)%*%(newobs - xbar1) )
probA_newob 
## [,1] 
## [1,] 1.475133e-09 
probB_newob = (2*pi)^(-2)*(det(spooled))^(-1/2)*
  exp( (-1/2)*t(newobs - xbar2)%*%solve(spooled)%*%(newobs - xbar2) )
probB_newob 
## [,1] 
## [1,] 3.036662e-10 
A_newob = (0.6*probA_newob)/(0.6*probA_newob + 0.4*probB_newob) 
A_newob 
## [,1] 
## [1,] 0.8793235 
B_newob = (0.4*probB_newob)/(0.6*probA_newob + 0.4*probB_newob)
B_newob
## [,1]
## [1,] 0.1206765
Since P(A | new obs) = 0.8793 is larger than P(B | new obs) = 0.1207, the Bayesian rule also classifies the new observation into population A.