1. We study the effect of various breeds and diets on the milk yield of cows. A study is conducted on 9 cows and the following data are obtained:
Diet levels: 1, 2, 3
Breed 1: 18.8, 16.7, 19.8, 21.2, 23.9
Breed 2: 22.3, 15.9, 19.2, 21.8
(a) Express this as a two-factor model with no interaction in matrix form.
(b) Express this as a two-factor model with interaction in matrix form.
(c) Express the hypothesis that there is no interaction in terms of your parameters. Eliminate any redundancies.
(d) Input this data into R. Plot an interaction plot between breed and diet.
(e) Test for the presence of interaction.
(f) What are the degrees of freedom used for the interaction test?
(g) From the interaction model, what is the estimated amount of milk produced from breed 2 and diet 3?
(h) Fit an additive model. What is the estimated amount of milk produced from breed 2 and diet 3 now?
(i) Test the hypothesis (under the additive model) that the 2nd and 3rd diets are equivalent in terms of milk produced.
(j) Find a 95% confidence interval, under the additive model, for the amount of milk produced from breed 2 and diet 3. Use both matrix calculations and the estimable function from the gmodels package.
(k) Find the same confidence interval under the interaction model.
(l) Why is the second interval wider than the first?
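As a notation reminder for parts (a) to (c), one common way to write the two models is sketched below. The symbols are assumptions of this sketch, not notation fixed by the question: \mu is an overall mean, \tau_i the effect of breed i, \beta_j the effect of diet j, \xi_{ij} the interaction term, and n_{ij} the number of cows observed for breed i on diet j.

```latex
% Additive (no-interaction) two-factor model
y_{ijk} = \mu + \tau_i + \beta_j + \varepsilon_{ijk}

% Two-factor model with interaction
y_{ijk} = \mu + \tau_i + \beta_j + \xi_{ij} + \varepsilon_{ijk},
\qquad i = 1, 2, \quad j = 1, 2, 3, \quad k = 1, \dots, n_{ij}
```

Stacking the 9 observations gives the matrix form y = X\beta + \varepsilon, where each column of X is an indicator for one of the parameters above.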
2. We study the growth of peas when fed three different types of fertilizer. A study is conducted where the samples are divided into 6 “blocks”, corresponding to different plots of land. The data are stored in the npk data frame in R. This data frame contains 5 variables:
block: label of the block of the sample
N: indicator (0/1) for the application of nitrogen
P: indicator (0/1) for the application of phosphate
K: indicator (0/1) for the application of potassium
yield: yield of peas in pounds/plot
(a) Fit an additive linear model with all variables; then repeat without the block variables. Does the fitted model change? Are the block variables significant?
(b) Fit a model with the fertilizer variables and all pairwise interaction terms. Are the interaction terms significant?
(c) Perform variable selection using stepwise selection with AIC, starting from the model with no interaction terms (but considering them for inclusion). What do you find?
(d) What is the best treatment for peas, according to your final model? Find a 95% confidence interval for the yield of this treatment.
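A minimal R sketch of the setup for this question is given below. It is one possible starting point rather than a prescribed solution. The object names (fit_add, fit_noblock, fit_step) are hypothetical, and keeping block in the starting model and in the stepwise scope is an assumption of this sketch; the question leaves that choice to you.

```r
# npk ships with base R (package 'datasets'): block, N, P, K, yield.
data(npk)
str(npk)

# Additive models with and without the block factor.
fit_add     <- lm(yield ~ block + N + P + K, data = npk)
fit_noblock <- lm(yield ~ N + P + K, data = npk)

# F-test comparing the two nested models.
anova(fit_noblock, fit_add)

# Stepwise selection by AIC, starting from the additive model.
# Assumption: the upper scope allows all pairwise fertilizer interactions.
fit_step <- step(fit_add, scope = ~ block + (N + P + K)^2, trace = 0)
formula(fit_step)
```

Calling summary() on each fitted object shows the coefficient estimates and their t-tests; interpreting them is left to parts (a) to (d).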
3. A study was conducted to determine the effect of the size of the root system on the growth of Douglas-fir seedlings when they are planted out. Seedlings were obtained from three seed lots, and when they were planted out their root volume was classified as small (RV1), medium (RV2), or large (RV3). The heights of the seedlings were then measured at the end of the first growing season. The data from the experiment are given in the file douglas.csv.
(a) How have randomisation and blocking been used in the design of this experiment?
(b) Generate two interaction plots for the data. Is there any evidence of an interaction?
(c) Fit a model with interaction to the data and use it to find the fitted means for each combination of factor levels.
(d) Find a 95% confidence interval for the difference in height between a seedling with large root volume (RV3) and a seedling with medium root volume (RV2). Suppose that the seedling came from seed lot B349.
(e) Test for the presence of an interaction at the 5% significance level. Would it be meaningful to check the significance of the main effects? Why?
(f) Fit an additive model to the data using the lm command, and produce plots to justify the model assumption that the errors are normal and homoskedastic.
4. Suppose that y ∼ MVN(µ1, Σ), where
.
For what values of ρ are the sample mean and sample variance independent?
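One possible route, sketched under the assumption that \Sigma is positive definite, uses the standard independence criterion for a linear form By and a quadratic form y^T A y of a multivariate normal vector with covariance V: they are independent if and only if BVA = 0. The symbols n (the length of y), J (the n \times n matrix of ones) and s^2 (the sample variance) are introduced here for illustration only.

```latex
\bar{y} = \tfrac{1}{n}\,\mathbf{1}^{T}\mathbf{y},
\qquad
s^{2} = \tfrac{1}{n-1}\,\mathbf{y}^{T}\left(I - \tfrac{1}{n}J\right)\mathbf{y}

\bar{y} \text{ and } s^{2} \text{ are independent}
\iff
\mathbf{1}^{T}\,\Sigma\left(I - \tfrac{1}{n}J\right) = \mathbf{0}^{T}
```

The constant factors 1/n and 1/(n-1) do not affect whether the product vanishes, so they are dropped from the condition.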
5. In the one-way classification model, show that any linear combination of $\bar{y}_1 - \bar{y}_\cdot, \ldots, \bar{y}_k - \bar{y}_\cdot$ can be written as a linear combination of $\bar{y}_1, \ldots, \bar{y}_k$. Does the converse hold?