Starting from:

$30

BIOSTATISTICS 755 HOMEWORK 3 -Solved


1. The dataset MITgrowth.csv on the course website data are from a prospective study on body 
fat accretion in a cohort of 162 girls from the MIT Growth and Development Study. The study 
was designed to look at changes in percent body fat in girls before and after menarche. All 
subjects had to be pre-menarche and non-obese to enter the study. Observations were taken 
annually until 4 years after menarche. At each observation percent body fat was measured. 
Two time-scales are included: age, and time since menarche (which can be negative). Time since 
menarche is the more biologically relevant time scale to use. The variables (in order) are: Subject 
ID, Current Age (years), Age at Menarche (years), time relative to Menarche (years), Percent 
Body Fat. 
aProduce a spaghetti plot of the data using Age and then time relative to 
menarche. Which time scale appears to have a stronger relationship with percent 
body fat? Which time scale is best to answer the study question? Comment on the 
possible parametric methods that could be used to include time in the model. 
G
b. Fit a model with the time effect you found to be most appropriate in (a), 
with randomly varying intercepts and slopes. 
Grading: 
• 9 points for correcting fitting the random intercept and slope model. They don’t have 
to fit the model I fit, just the one that corresponds to what they found in a. 
• 2 points each for a, b, and c. Just right or wrong. 
Here I’m going to use the linear spline model. First, I need to add a variable that will allow for a 
different linear trend before and after menarche. 
data menarche2; 
set menarche; 
time2 = min(Time_r_men,0); 
run; 
proc mixed data = menarche2; 
class ID; 
model Perc_BF = Time_r_men time2/ solution alpha=0.05; 
random intercept Time_r_men/type=UN subject=ID; 
run; 
i. 
What is the estimated variance of the random intercepts? 
The variance of the random intercepts is 37.9136 
ii. 
What is the estimated variance of the random slope(s)? 
The variance of the random slope is 0.5333 
iii. 
What is the estimated correlation between the random intercepts and slopes? 
The estimated correlation is -0.2439. The estimated covariance is -1.0967 
(c) Fit a model with only randomly varying intercepts. What do you think about 
this model versus the previous model? Should this be used? Why? 

Fit Statistics 
-2 Res Log Likelihood 
6085.2 
AIC (Smaller is Better) 
6093.2 
AICC (Smaller is Better) 6093.3 
BIC (Smaller is Better) 
6105.6 
Here are the fit statistics from the random intercept model: 
proc mixed data = menarche2; 
class ID; 
model Perc_BF = Time_r_men time2/ solution alpha=0.05; 
random intercept /type=UN subject=ID; 
run; 
Fit Statistics 
-2 Res Log Likelihood 
6155.4 
AIC (Smaller is Better) 
6159.4 
AICC (Smaller is Better) 6159.5 
BIC (Smaller is Better) 
6165.6 
By AIC, BIC or a likelihood ratio test (which has p<0.001), the random intercept and slope model 
fits better. 
(d) ( Give a full description of your findings. Include interpretations of at least two 
regression coefficients in your description. All descriptions should be in context of the study 
and the goals of the study. 
Gs: 
Solution for Fixed Effects 
Effect 
Estimate 
Standard 
Error DF t Value Pr > |t| Alpha 
Lower 
Upper 
Intercept 
20.9902 
0.5205 161 
40.32 <.0001 
0.05 19.9623 22.0182 
Time_r_men 
0.5095 
0.1291 161 
3.95 0.0001 
0.05 
0.2545 
0.7646 
time2 
2.3535 
0.1867 724 
12.60 <.0001 
0.05 
1.9869 
2.7200 
(1) The study finds that there is a significant increase in body fat percentage before menarche. 
(2) Specifically, before menarche we expect an increase of 0.51 percentage points for each year 
they get older. (3) After menarche, there is a significantly higher increase in body fat 
percentage than before menarche. (4) Specifically, after menarche girls body fact increases 2.35 
percentage points more per year than before Menarche. (5) After Menarche girls body fat 
increases 2.86 (0.51 + 2.35) percentage points per year. 
2. A study was conducted to investigate two treatments for patients suffering from 
multiple sclerosis. 150 suffers of the disease were recruited into the study, and 75 were 
randomized to receive azathioprine (AZ) alone (group 1), and 75 were randomized to receive 
azathioprine plus methylprednisommne (AZ+MP, group 2). For each participant, a measure of 
auto-immunity, azathioprine AFCR, was planned at clinic visits at baseline (time 0, at initiation 
of treatment) and at 3, 6, 9, 12, 15, and 18 months thereafter. Multiple sclerosis (MS) affects 
the immune system. Low values of AFCR (approaching 0) are evidence that immunity is 
improving, which is hopefully associated with a better prognosis for suffers of MS. Also 
recorded for each subject was age at entry into the study and an indicator of whether or not 
the subject had had previous treatment with either of the study agents (0=no, 1=yes). The 
average age of the men across both treatment groups was 50.45, with SD 6.69. The primary 
scientific aims of the study are to investigate whether (i) both treatments (AZ or AZ+MP) lower 
AFCR over the 18 month period and (ii) whether treatment with AZ+MP results in different 
immune system response than does AZ alone, and, if so, how it is different in terms of response 
over time. It was also suspected that a subject’s age and prior history might be related to their 
AFCR level at baseline and to the rate at which AFCR changes during the 18 month period. The 
square root of AFCR is the response variable of interest (square roots were taken so that the 
AFCR observations better satisfy the assumption of normality). 
a. Complete an analysis of this data using a linear mixed model that results in the best possible 
answer to the scientific aims of the study. This should include exploratory data analysis (e.g., figures), model fit, discussion of the random effect(s) variance (or covariance), and a paragraph 
that accurately summarized your conclusions. It is encouraged to give the code to replicate your 
findings. 
Gy. 
The first thing I'd like to do is to look at a plot of the data by group.There really doesn't appear to be much of an effect of the group variable. It appears that a linear time trend may be 
sufficient for this data. Also, there may be more heterogeneity in the later months of the study so a random slope 
may be required. 
Random intercept only: 
Fit Statistics 
-2 Res Log Likelihood 
3375.1 
AIC (Smaller is Better) 
3379.1 
AICC (Smaller is Better) 3379.1 
BIC (Smaller is Better) 
3385.0 
Random intercept and slope: 
Fit Statistics 
-2 Res Log Likelihood 
3374.4 
AIC (Smaller is Better) 
3382.4 
AICC (Smaller is Better) 3382.5Fit Statistics 
BIC (Smaller is Better) 
3394.2 
Based on this analysis, it appears that the random slope is not required. So, we'll use the model 
with just the random intercept. Here's a summary of the model. 
Covariance Parameter Estimates 
Cov Parm Subject 
Estimate 
UN(1,1) 
id 
2.4026 
Residual 
3.0725 
Fit Statistics 
-2 Res Log Likelihood 
3375.1 
AIC (Smaller is Better) 
3379.1 
AICC (Smaller is Better) 3379.1 
BIC (Smaller is Better) 
3385.0 
Null Model Likelihood Ratio Test 
DF 
Chi-Square 
Pr > ChiSq 

219.60 
<.0001 
Solution for Fixed Effects 
Effect 
group Estimate Standard 
Error 
DF t Value Pr > |t| Alpha Lower 
Upper 
Intercept 
13.0730 
0.2464 138 
53.05 <.0001 
0.05 12.5857 13.5603 
time 
-0.1578 
0.01498 648 -10.53 <.0001 
0.05 -0.1872 -0.1283 
group 

0.2351 
0.3474 648 
0.68 0.4989 
0.05 -0.4471 
0.9173 
group 





time*group 2 
-0.08893 
0.02114 648 
-4.21 <.0001 
0.05 -0.1304 -0.04743 
time*group 1 


.Type 3 Tests of Fixed Effects 
Effect 
Num DF Den DF F Value Pr > F 
time 

648 366.15 <.0001 
group 

648 
0.46 0.4989 
time*group 

648 
17.70 <.0001 
Despite the first figure appearing that there was not an impact of the drug, the final model does 
show that the treatment was effective. Specifically, the AZ + MP group had their azathioprine 
decrease more rapidly than the group that had AZ alone. In the AZ group (which is group 1, the 
referent group) we expect that the outcome (sqrt of afcr) will decrease 0.158 units (95% CI - 
0.187, -0.128) for every month increase. In the AZ+MP group, the outcome will decrease 0.089 
units (95% CI -0.130, -0.047) more per month than the change in the AZ group. 
Note that I did not include age or prior into the model. You can adjust for these variables, but it 
is not necessary because the treatment was randomized. As a result, neither of these variables 
are related to the treatment group and are thus not confounders.

More products