Starting from:

$30

BIOSTATISTICS 755 HOMEWORK 2 -Solved

1. The longitudinal data from an insulin study contain 36 rabbits where 12 rabbits were 
randomly assigned to each of 3 groups: group 1 rabbits received the standard insulin 
mixture, group 2 rabbits received a mixture containing 1% less protamine than the 
standard, and group 3 rabbits received a mixture containing 5% less pro- tamine. 
Rabbits were injected with the assigned mixture at time 0, and blood sugar 
measurements taken on each rabbit at the time of injection (time 0) and 0.5, 1.0, 1.5, 
2.0, 2.5, and 3.0 hours post-injection. 
The data file “insulin” is on the course website. The variables appearing in columns 
are: (1) rabbit id, (2) insulin group, and (3-9) response (blood sugar level) at 7 time 
points. 
(a) (Create a spaghetti plot of the data with separate panels for each 
group. Comment on the heterogeneity in the data. 
Here we'll do separate plots (panels) for each group; 
Proc SGpanel data = insulin; 
PanelBy group / columns=3; 
series x=hour y=Ins / group =id LineAttrs= (pattern=1); 
run; 
The amount within subject variability in the data appears to be relatively high. The rankings of 
the insulin values bounce around a lot over the study period. There doesn’t appear to be much 
heterogeneity in the data. 
36 
35 
34 
33 
32 
31 
30 
29 
28 
27 
26 
25 
24 
23 
22 
21 
20 
19 
18 
17 
16 
15 
14 
13 
12 
11 
10 









id 
hour 
group = 3 
group = 2 
group = 1 
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 
40
60
80 
100 
120 
Ins(b) ( Create a plot that has the groups means over time on the same 
plot with different colors (if you’ll print/submit in color) or line-types (if you 
print/submit in black and white). What trends might be appropriate (e.g., pro
file, linear, quadratic, etc.)? 
proc sort data=insulin; 
by group hour; 
run; 
*Calculate the mean by group and hour; 
proc means mean data=insulin; 
by group hour; 
var Ins; 
output out = MN_GRP_dat mean = mn_GRP_Ins; 
proc print data = MN_GRP_dat; 
run; 
*Plot the mean insulin by hour and group (with color); 
Proc SGplot data = MN_GRP_dat; 
series x=hour y=mn_GRP_Ins / group =group LineAttrs=(pattern=1 thickness=3); 
run; 
*Plot the mean insulin by hour and group (with different line types); 
Proc SGplot data = MN_GRP_dat; 
series x=hour y=mn_GRP_Ins / group =group LineAttrs=(color=1 thickness=3); 
0.0 
0.5 
1.0 
1.5 
2.0 
2.5 
3.0 
hour 
60
80 
100 



group 
mn_GRP_Insrun; 
The trends here appear to be linear. Considering the small sample size (n=36) this is likely the 
best way to model time. 
(c) (1Fit a full interaction model using a profile analysis with the follow
ing covariance matrices. Hand in the estimates of the covariance matrices and 
model fit statistics (AIC and BIC) for each. Nothing else. 
i. Unstructured (heterogeneous and symmetric) 
proc mixed data=insulin; 
class ID group hour; 
model Ins = group hour group*hour; 
repeated hour/type=UN subject=ID; 
run; 
Estimated R Matrix for Subject 1 
Row 
Col1 
Col2 
Col3 
Col4 
Col5 
Col6 
Col7 

142.10 55.3734 27.3998 -20.4943 
8.1449 11.6053 -9.8090 
2 55.3734 
100.88 49.9615 
4.0966 
1.8378 
3.5719 -16.8321 
3 27.3998 49.9615 
116.31 -4.6389 -15.3237 -0.1450 
3.7044 
4 -20.4943 
4.0966 -4.6389 59.1381 21.6112 -10.9575 -27.5573 

8.1449 
1.8378 -15.3237 21.6112 88.2290 29.7329 
6.4097 
0.0 
0.5 
1.0 
1.5 
2.0 
2.5 
3.0 
hour 
60
80 
100 



group 
mn_GRP_InsEstimated R Matrix for Subject 1 
Row 
Col1 
Col2 
Col3 
Col4 
Col5 
Col6 
Col7 
6 11.6053 
3.5719 -0.1450 -10.9575 29.7329 89.1641 23.7737 
7 -9.8090 -16.8321 
3.7044 -27.5573 
6.4097 23.7737 95.7134 
Fit Statistics 
-2 Res Log Likelihood 
1721.6 
AIC (Smaller is Better) 
1777.6 
AICC (Smaller is Better) 1785.7 
BIC (Smaller is Better) 
1822.0 
ii. Compound Symmetry 
proc mixed data=insulin; 
class ID group hour; 
model Ins = group hour group*hour; 
repeated hour/type=CS subject=ID; 
run; 
Estimated R Matrix for Subject 1 
Row 
Col1 
Col2 
Col3 
Col4 
Col5 
Col6 
Col7 
1 98.7919 6.7365 6.7365 6.7365 6.7365 6.7365 6.7365 
2 6.7365 98.7919 6.7365 6.7365 6.7365 6.7365 6.7365 
3 6.7365 6.7365 98.7919 6.7365 6.7365 6.7365 6.7365 
4 6.7365 6.7365 6.7365 98.7919 6.7365 6.7365 6.7365 
5 6.7365 6.7365 6.7365 6.7365 98.7919 6.7365 6.7365 
6 6.7365 6.7365 6.7365 6.7365 6.7365 98.7919 6.7365 
7 6.7365 6.7365 6.7365 6.7365 6.7365 6.7365 98.7919 
Fit Statistics 
-2 Res Log Likelihood 
1766.1 
AIC (Smaller is Better) 
1770.1 
AICC (Smaller is Better) 1770.1 
BIC (Smaller is Better) 
1773.2 
iii. Heterogeneous Compound Symmetry.proc mixed data=insulin; 
class ID group hour; 
model Ins = group hour group*hour; 
repeated hour/type=CSH subject=ID; 
run; 
Estimated R Matrix for Subject 1 
Row 
Col1 
Col2 
Col3 
Col4 
Col5 
Col6 
Col7 
1 140.80 6.9994 7.5786 5.5232 6.5803 6.6238 6.9966 
2 6.9994 98.5251 6.3397 4.6203 5.5045 5.5409 5.8528 
3 7.5786 6.3397 115.51 5.0026 5.9600 5.9995 6.3371 
4 5.5232 4.6203 5.0026 61.3493 4.3436 4.3723 4.6184 
5 6.5803 5.5045 5.9600 4.3436 87.0789 5.2091 5.5023 
6 6.6238 5.5409 5.9995 4.3723 5.2091 88.2348 5.5387 
7 6.9966 5.8528 6.3371 4.6184 5.5023 5.5387 98.4468 
Fit Statistics 
-2 Res Log Likelihood 
1759.7 
AIC (Smaller is Better) 
1775.7 
AICC (Smaller is Better) 1776.3 
BIC (Smaller is Better) 
1788.4 
(d) (1 Based on the estimated covariance matrices what do you think is 
best and why? 
The diagonal term of the unstructured and CSH covariance matrices do appear to bounce around 
a good bit. There’s no consistent pattern though and the sample size is small, so it may be that 
the homogeneous is the best. The off-diagonal values of the unstructured have estimates that are 
close to zero and even negative. This is likely just noise as the sample size is small. To me, this 
says that the compound symmetric is likely the best. 
(e) () For the models that were fit in (c), which model has the best fit 
according to AIC and BIC? 
Compound Symmetric structure works the best among all information criteria. 
(f) Complete a likelihood ratio test between the following structures.To each of the results coincide with the results from AIC and BIC? 
i. Unstructured and Heterogeneous Compound Symmetry 
For the Unstructured vs CSH: 
D =1759.7 -1721.6 = 38.1 
df = 28-8 = 20 
A chi-squared with 20 df and alpha of 0.01 = 37.57, which is less than D. 
More specifically the p-value = 0.0086 
This shows that the unstructured fits significantly better than the CSH. This does not agree with 
AIC or BIC, which both prefer the CSH model. 
(g) ( Using the model that fit best from (c), test whether the time profiles 
of means are different in the groups. 
Type 3 Tests of Fixed Effects 
Effect 
Num DF Den DF F Value Pr > F 
group 

33 
12.10 0.0001 
time 

198 
121.67 <.0001 
group*time 
12 
198 
1.75 0.0587 
The interaction type III test is insignificant (though close). As a result, we cannot say that the 
mean profiles of the groups are different. 
(h) (Does time have a significant impact on the response (this may re
quire you to fit another model)? 
In the model with no interaction and the model with an interaction the "hour" variable is found to 
be significant. This is an indication that there is a significant time effect. 
(i) Using the model that fit best from (c), test and give an estimate of 
the difference in the mean response level at 0.5 hour from baseline (0 hour) in 
group 1. 
There are many ways to do this test. Below I ran a model that had time 0 and group 1 as the 
referent group. The estimate of the difference is -9.21. In that model the significance test for 
hour=0.5 is a test that the means at 0 and 0.5 hours are the same for group 1. 
Another way to complete this would be to use the 'pdiff' option. Solution for Fixed Effects 
Effect 
group time Estimate Standard 
Error 
DF t Value Pr > |t| 
Intercept 
100.78 
2.8693 33 
35.13 <.0001 
group 

1.3667 
4.0577 33 
0.34 0.7384 
group 

4.3917 
4.0577 33 
1.08 0.2870 
group 






time 
0.5 
-9.2083 
3.9170 198 
-2.35 0.0197 
time 

-17.6583 
3.9170 198 
-4.51 <.0001 
time 
1.5 
-25.0167 
3.9170 198 
-6.39 <.0001 
time 

-32.2417 
3.9170 198 
-8.23 <.0001 
time 
2.5 
-34.9667 
3.9170 198 
-8.93 <.0001 
time 

-36.3417 
3.9170 198 
-9.28 <.0001 
time 






group*time 2 
0.5 
-9.0167 
5.5394 198 
-1.63 0.1052 
group*time 2 

-8.4667 
5.5394 198 
-1.53 0.1280 
group*time 2 
1.5 
-10.2083 
5.5394 198 
-1.84 0.0668 
group*time 2 

-11.7333 
5.5394 198 
-2.12 0.0354 
group*time 2 
2.5 
-15.7000 
5.5394 198 
-2.83 0.0051 
group*time 2 

-16.9250 
5.5394 198 
-3.06 0.0026 
group*time 2 


.  

group*time 3 
0.5 
-3.3250 
5.5394 198 
-0.60 0.5490 
group*time 3 

-4.9333 
5.5394 198 
-0.89 0.3742 
group*time 3 
1.5 
-6.3917 
5.5394 198 
-1.15 0.2500 
group*time 3 

-10.6667 
5.5394 198 
-1.93 0.0556 
group*time 3 
2.5 
-13.2333 
5.5394 198 
-2.39 0.0178 
group*time 3 

-19.1167 
5.5394 198 
-3.45 0.0007 
group*time 3 



group*time 1 
0.5 





group*time 1 




.Solution for Fixed Effects 
Effect 
group time Estimate Standard 
Error 
DF t Value Pr > |t| 
group*time 1 
1.5 



group*time 1 




group*time 1 
2.5 





group*time 1 



 

group*time 1 


 

(j) (Using the model that fit best from (c), interpret at least two of the 
parameters in context of the problem. Have one of the parameters you interpret 
be from an interaction. 
Various interpretations accepted. 
Among rabbits in group 1, the average blood sugar level at 2 hours was 32.24 ng/mL less than 
the average blood sugar at baseline (95% CI: -39.85, -24.63). 
The difference in average blood sugar between baseline and 3 hours among rabbits in group 2 is 
16.92 ng/mL less than the difference in average blood sugar between baseline and 3 hours among 
rabbits in group 1 (95% CI: 6.06, 27.8).

More products