Questions 1–7 use the ‘sleep’ dataset, which you can download from the course website. This dataset contains (among other things) data on the body weight (kg) and brain weight (g) of 62 mammals. Use the following commands to read the data:
> mammals <- read.csv("../data/sleep.csv")
> mammals$BodyWt <- log(mammals$BodyWt)
> mammals$BrainWt <- log(mammals$BrainWt)
This creates a data frame, mammals, with components (among others) named BodyWt and BrainWt, then applies a logarithmic transformation to both BodyWt and BrainWt.
1. Fit a linear model explaining brain weight from body weight, using the lm command.
Display the summary of the fitted model, and then create a scatter plot of the data and superimpose the fitted regression line on it. Does it look like a reasonable fit?
Use diagnostic plots to assess if the model assumptions are satisfied.
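The steps in question 1 can be sketched as follows. This is a minimal illustration on simulated data, not the sleep dataset: the data frame `sim`, the seed and the coefficients 2 and 0.75 are arbitrary assumptions, and only the column names mirror those used above.

```r
# Simulated stand-in for the log-transformed data (hypothetical values,
# not the sleep dataset); column names mirror the ones used above.
set.seed(1)
sim <- data.frame(BodyWt = rnorm(50, mean = 1, sd = 2))
sim$BrainWt <- 2 + 0.75 * sim$BodyWt + rnorm(50, sd = 0.7)

# Fit the simple linear regression and show the usual summary table.
fit <- lm(BrainWt ~ BodyWt, data = sim)
print(summary(fit))

# Scatter plot with the fitted line superimposed.
plot(BrainWt ~ BodyWt, data = sim)
abline(fit)

# Default diagnostic plots for an lm object, arranged in a 2 x 2 grid.
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)
```

With the real data, `sim` would be replaced by `mammals`.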
2. Using the fitted model or otherwise, obtain:
(a) The least squares estimator of the parameters, b;
(b) The vector of residuals, e;
(c) The residual sum of squares, SSRes;
(d) The regression sum of squares, SSReg;
(e) The estimator for the variance of the errors, $s^2$;
(f) The standardised residuals;
(g) The leverages of the points;
(h) The Cook’s distances of the points;
(i) 95% confidence intervals for each of the parameters.
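As a hedged sketch of the "using the fitted model" route, base R has an extractor for most of these quantities. The data below are simulated (the names `x`, `y` and `fit` are hypothetical). SSReg is taken here as the uncorrected regression sum of squares, the convention that question 8 appears to use; a corrected version would subtract n times the squared mean of y.

```r
# Simulated stand-in data (hypothetical; not the sleep dataset).
set.seed(1)
x <- rnorm(30)
y <- 1 + 2 * x + rnorm(30, sd = 0.5)
fit <- lm(y ~ x)

b     <- coef(fit)                  # (a) least squares estimates
e     <- resid(fit)                 # (b) residuals
SSRes <- sum(e^2)                   # (c) residual sum of squares
SSReg <- sum(fitted(fit)^2)         # (d) uncorrected regression SS (assumed convention)
s2    <- SSRes / df.residual(fit)   # (e) estimate of the error variance
z     <- rstandard(fit)             # (f) standardised residuals
h     <- hatvalues(fit)             # (g) leverages
D     <- cooks.distance(fit)        # (h) Cook's distances
ci    <- confint(fit, level = 0.95) # (i) 95% confidence intervals

print(b)
print(ci)
```

Each quantity can also be computed directly from the design matrix, which is a useful cross-check on the extractor functions.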
3. Find a 95% confidence interval for the expected brain weight of a mammal weighing 50 kg.
4. Find a 95% prediction interval for the brain weight of a mammal weighing 50 kg.
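One possible approach to questions 3 and 4 uses `predict` with its `interval` argument. The sketch below runs on simulated data (the data frame `sim` and its coefficients are arbitrary assumptions). The point to note is that the model is fitted on the log scale, so the new body weight has to be supplied as `log(50)`, and the interval endpoints can be exponentiated to return to grams.

```r
# Simulated stand-in for the log-transformed data (hypothetical values).
set.seed(2)
sim <- data.frame(BodyWt = rnorm(40, mean = 1, sd = 2))       # log body weight
sim$BrainWt <- 2 + 0.75 * sim$BodyWt + rnorm(40, sd = 0.7)    # log brain weight
fit <- lm(BrainWt ~ BodyWt, data = sim)

# The predictor was logged before fitting, so 50 kg enters as log(50).
new <- data.frame(BodyWt = log(50))

ci_mean <- predict(fit, new, interval = "confidence", level = 0.95)
pi_new  <- predict(fit, new, interval = "prediction", level = 0.95)

# Intervals on the log scale, then back-transformed to the original scale.
print(ci_mean)
print(pi_new)
print(exp(ci_mean))
print(exp(pi_new))
```

The prediction interval is always wider than the confidence interval at the same point, because it also accounts for the error of the new observation.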
5. Find and draw a 95% joint confidence region for the parameters.
6. Test the following hypotheses, using the anova function.
(a) H0 : β = 0
(b) H0 : β1 = 0
(c) H0 : β0 = 0
(d) H0 : β = (2,1)
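A sketch of how each hypothesis can be expressed as a null model and compared with the full model through `anova`. The data are simulated and the object names (`full`, `null_a` and so on) are hypothetical. For (d), the hypothesised mean is fixed through `offset`, which is one way (assumed here, not prescribed by the question) to fit a model with no free parameters.

```r
# Simulated data for which beta = (2, 1) is true (hypothetical values).
set.seed(3)
n <- 40
x <- rnorm(n)
y <- 2 + 1 * x + rnorm(n)
full <- lm(y ~ x)

null_a <- lm(y ~ 0)                      # (a) beta = 0: no parameters at all
null_b <- lm(y ~ 1)                      # (b) beta1 = 0: intercept only
null_c <- lm(y ~ 0 + x)                  # (c) beta0 = 0: line through the origin
null_d <- lm(y ~ 0 + offset(2 + 1 * x))  # (d) beta = (2, 1): mean fully specified

# Each call compares a null model with the full model via an F test.
tab_a <- anova(null_a, full)
tab_b <- anova(null_b, full)
tab_c <- anova(null_c, full)
tab_d <- anova(null_d, full)
print(tab_a)
print(tab_b)
print(tab_c)
print(tab_d)
```

In every case the F statistic is the drop in residual sum of squares from null to full model, divided by its degrees of freedom, over the residual mean square of the full model.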
7. By visualising the raw data, justify the use of a double logarithmic transformation. Write down the final model for the (untransformed) brain weight vs. body weight.
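As a sketch of how a fit on the log-log scale translates back to the original scale, suppose the fitted model has the form below, where $b_0$ and $b_1$ stand for the estimated intercept and slope and $\varepsilon$ for the error term (notation assumed here for illustration). Exponentiating both sides gives a power law with multiplicative error:

```latex
\log(\text{BrainWt}) = b_0 + b_1 \log(\text{BodyWt}) + \varepsilon
\quad\Longrightarrow\quad
\text{BrainWt} = e^{b_0}\,\text{BodyWt}^{\,b_1}\,e^{\varepsilon}.
```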
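8. In this question we consider the hypothesis H0 : β = β∗. The test statistic for this hypothesis is
$$F = \frac{(b-\beta^*)^T X^T X (b-\beta^*)/p}{\mathrm{SSRes}/(n-p)},$$
where $n$ is the number of observations and $p$ the number of parameters.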
(a) Show that
$$(b-\beta^*)^T X^T X (b-\beta^*) = (y - X\beta^*)^T (y - X\beta^*) - (y - Xb)^T (y - Xb).$$
That is, it is the SSRes for the null model minus the SSRes for the full model. Also show that
$$(b-\beta^*)^T X^T X (b-\beta^*) \neq y^T X (X^T X)^{-1} X^T y - (\beta^*)^T X^T X \beta^*.$$
That is, in this case we can not write it as the SSReg for the full model minus the SSReg for the model under H0.
(b) Show directly that $(b-\beta^*)^T X^T X (b-\beta^*)$ and SSRes are independent, that is, without using our existing result that b and SSRes are independent.
Hint: set $q = y - X\beta^*$, then
i. Show that $(b-\beta^*)^T X^T X (b-\beta^*) = q^T X (X^T X)^{-1} X^T q$.
ii. Show that SSRes $= q^T [I - X (X^T X)^{-1} X^T] q$ and hence that these two quadratic forms are independent.
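Before attempting the proofs, the identities in question 8 can be checked numerically. The sketch below uses simulated data and an arbitrary `beta_star`; all object names are hypothetical. It is a sanity check under those assumptions and not a substitute for the algebra.

```r
# Simulated design matrix and response (hypothetical values).
set.seed(4)
n <- 25
X <- cbind(1, rnorm(n))
y <- X %*% c(2, 1) + rnorm(n)
beta_star <- c(1.5, 0.5)                      # arbitrary hypothesised value

XtX <- crossprod(X)                           # X'X
b   <- solve(XtX, crossprod(X, y))            # least squares estimate
H   <- X %*% solve(XtX) %*% t(X)              # hat matrix X (X'X)^{-1} X'

# Part (a): quadratic form versus difference of residual sums of squares.
lhs     <- drop(t(b - beta_star) %*% XtX %*% (b - beta_star))
ss_null <- sum((y - X %*% beta_star)^2)
ss_full <- sum((y - X %*% b)^2)
rhs     <- ss_null - ss_full

# The "difference of SSReg" expression, which in general differs from lhs.
alt <- drop(t(y) %*% H %*% y - t(beta_star) %*% XtX %*% beta_star)

# Part (b): both quantities as quadratic forms in q = y - X beta_star.
q   <- y - X %*% beta_star
qf1 <- drop(t(q) %*% H %*% q)                 # compare with lhs
qf2 <- drop(t(q) %*% (diag(n) - H) %*% q)     # compare with ss_full

print(c(lhs = lhs, rhs = rhs, alt = alt))
print(c(qf1 = qf1, qf2 = qf2, ss_full = ss_full))
print(max(abs(H %*% (diag(n) - H))))          # product of the two matrices, near 0
```

The last line shows that the product of the two matrices of the quadratic forms is numerically zero, which is the property the independence argument in part (b) relies on.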