STAT292: Assignment 3 (ANOVA)

There are five questions, worth a total of 100 marks. Question 1 starts on page 2.

Assignment Guidelines (one more time)

You are encouraged to discuss assignments with other students, but your submitted work must be your own.

The following Assignment Guidelines are helpful for all the assignments in Parts 2 and 3 of the course.

When you carry out a statistical test of hypothesis, you should state the following, when relevant:

•    Model equation.

•    Assumptions about the data, and comments about whether diagnostic graphs support those assumptions.

•    Null and alternative hypotheses.

•    ANOVA Table (if relevant), p-value.

•    Statistical conclusions. For example, “We reject H0 and conclude HA, that µ1 and µ2 differ at the 5% significance level”.

•    Interpretation of the statistical conclusions back to the original problem, using the original meaning of the response variable and any factors or covariates. For example, if comparing heights of two groups, “Female and male adults have different mean heights, with males being taller on average”.


1. Skink Temperatures
Skinks are tested for their preferred daytime temperature. Each one is placed in a long tank which is warmer at one end, cooler at the other. The temperature at the position where it settles is recorded. There are four different species of skink, and we wish to test (at the 5% level of significance) whether the species differ in their preferred temperature.

The following table gives the data.

Species   Preferred temperatures (°C)       Total   Mean
A         18 21 22 18 20 19 19 23 17 22     199     19.9
B         24 18 19 21 20 17 23 22 22 19     205     20.5
C         22 21 24 19 25 18 23 21 24 22     219     21.9
D         21 19 26 24 25 21 20 20 27 25     228     22.8

SAS output is given on pages 3 and 4.

(a)   When running the experiment, other possible factors such as time of day, light, and amount of food recently eaten are kept as near constant as possible. Why?

(b)   The skinks are not put in the tank together. Why?

(c)    Give values of n and p (the number of treatments) for this experiment. How many degrees of freedom are in the Treatments row, the Error row and the Total row of the ANOVA table? (Give the algebraic expressions and the actual values for this experiment.)

(d)   Use the output to write up the ANOVA in the style suggested in the Assignment Guidelines on page 1. You should include a statement of the (complete) model equation, and also comments on whether the assumptions are satisfied. Use a 5% significance level for the ANOVA test.

One-Way Analysis of Variance
Dependent Variable: Temperature

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       52.0750000    17.3583333      3.06   0.0402
Error             36      203.9000000     5.6638889
Corrected Total   39      255.9750000
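
As a cross-check on the ANOVA table above, the degrees of freedom and sums of squares can be recomputed from the raw data. The following is a minimal Python sketch, not part of the SAS analysis; the function name `one_way_anova` and the variable names are illustrative assumptions. It computes the F statistic only; the p-value still comes from the SAS output.

```python
# Preferred temperatures (°C) by species, copied from the data table.
temps = {
    "A": [18, 21, 22, 18, 20, 19, 19, 23, 17, 22],
    "B": [24, 18, 19, 21, 20, 17, 23, 22, 22, 19],
    "C": [22, 21, 24, 19, 25, 18, 23, 21, 24, 22],
    "D": [21, 19, 26, 24, 25, 21, 20, 20, 27, 25],
}


def one_way_anova(groups):
    """Return (df_model, df_error, ss_model, ss_error, f_value)."""
    values = [y for ys in groups.values() for y in ys]
    n = len(values)   # total number of observations
    p = len(groups)   # number of treatments
    grand_mean = sum(values) / n
    # Between-treatment sum of squares: group size times squared mean offset.
    ss_model = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # Within-treatment sum of squares: deviations from each group's own mean.
    ss_error = sum(
        (y - sum(ys) / len(ys)) ** 2 for ys in groups.values() for y in ys
    )
    df_model, df_error = p - 1, n - p
    f_value = (ss_model / df_model) / (ss_error / df_error)
    return df_model, df_error, ss_model, ss_error, f_value


df_model, df_error, ss_model, ss_error, f_value = one_way_anova(temps)
print(f"Model: df={df_model}, SS={ss_model:.4f}")
print(f"Error: df={df_error}, SS={ss_error:.4f}")
print(f"Total: df={df_model + df_error}")
print(f"F = {f_value:.2f}")
```

The Treatments, Error and Total degrees of freedom are p - 1, n - p and n - 1, which the sketch evaluates directly from the group sizes.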





 

Levene's Test for Homogeneity of Temperature Variance
ANOVA of Squared Deviations from Group Means

Source    DF   Sum of Squares   Mean Square   F Value   Pr > F
Species    3          86.1428       28.7143      1.36   0.2691
Error     36            757.4       21.0391
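
The heading "ANOVA of Squared Deviations from Group Means" describes how this version of Levene's test is built: each observation is replaced by its squared deviation from its own group mean, and an ordinary one-way ANOVA is run on those values. A minimal Python sketch of that idea follows; the helper name `anova_f` is an assumption made for illustration, and the p-value is not computed.

```python
# Preferred temperatures (°C) by species, copied from the data table.
temps = {
    "A": [18, 21, 22, 18, 20, 19, 19, 23, 17, 22],
    "B": [24, 18, 19, 21, 20, 17, 23, 22, 22, 19],
    "C": [22, 21, 24, 19, 25, 18, 23, 21, 24, 22],
    "D": [21, 19, 26, 24, 25, 21, 20, 20, 27, 25],
}


def anova_f(groups):
    """One-way ANOVA on a list of groups: (ss_between, ss_within, f_value)."""
    values = [y for ys in groups for y in ys]
    grand_mean = sum(values) / len(values)
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups
    )
    ss_within = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups for y in ys)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_value


# Step 1: squared deviation of every observation from its own group mean.
squared_devs = [
    [(y - sum(ys) / len(ys)) ** 2 for y in ys] for ys in temps.values()
]

# Step 2: ordinary one-way ANOVA on the squared deviations.
levene_ss_species, levene_ss_error, levene_f = anova_f(squared_devs)
print(f"Species SS = {levene_ss_species:.4f}")
print(f"Error SS   = {levene_ss_error:.1f}")
print(f"F          = {levene_f:.2f}")
```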



 

 

 

Means and Descriptive Statistics

Species   Mean of        Std. Dev. of    Minimum of    Maximum of
          Temperature    Temperature     Temperature   Temperature
(all)     21.275         2.5619253577    17            27
A         19.9           2.0248456731    17            23
B         20.5           2.2730302828    17            24
C         21.9           2.2335820757    18            25
D         22.8           2.8982753492    19            27



 

 

 

 



 

2. Nasal Sprays
Improvement in breathing airflow is measured for twenty-five people suffering from nasal congestion. They were treated with either a saline spray (A) or one of four nasal sprays (B, C, D, E) available over the counter in pharmacies.

Spray   Airflow improvement   Total   Mean
A       15  10  16  14   8      63    12.6
B       25  41  37  44  26     173    34.6
C       21   6   9  15  14      65    13.0
D       16   7  24  22  15      84    16.8
E       24  15  39  34  30     142    28.4
All                            527    21.08

Relevant SAS output follows, on pages 5 to 7.

Write a report that compares the five treatments using the guidelines on page 1. Ensure that you comment on all the included SAS output. Use a 5% significance level for all statistical tests. Make a recommendation for either a single best nasal spray, or a group of best choices which are similar in their effects; refer to the Tukey test to justify your decision.

One-Way Analysis of Variance
Results: Nasal Spray Example

The ANOVA Procedure

Class Level Information
Class   Levels   Values
Spray   5        A B C D E

Number of Observations Read   25
Number of Observations Used   25

Dependent Variable: Improvement

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4      1959.440000    489.860000      9.73   0.0002
Error             20      1006.400000     50.320000
Corrected Total   24      2965.840000
 



 

 

 

 

 

Levene's Test for Homogeneity of Improvement Variance
ANOVA of Squared Deviations from Group Means

Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
Type      4          11893.9        2973.5      1.59   0.2156
Error    20          37393.0        1869.6

Level of        Improvement
Type       N    Mean          Std Dev
A          5    12.6000000    3.43511281
B          5    34.6000000    8.67755726
C          5    13.0000000    5.78791845
D          5    16.8000000    6.68580586
E          5    28.4000000    9.28977933

 

 

 

Tukey's Studentized Range (HSD) Test for Improvement

Note: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                                   0.05
Error Degrees of Freedom                20
Error Mean Square                       50.32
Critical Value of Studentized Range     4.23186
Minimum Significant Difference          13.425

Means with the same letter are not significantly different.

Tukey Grouping     Mean     N   Type
     A             34.600   5   B
     A
B    A             28.400   5   E
B
B    C             16.800   5   D
     C
     C             13.000   5   C
     C
     C             12.600   5   A
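
The Minimum Significant Difference in the Tukey output follows from the critical value, the error mean square and the common group size: MSD = q × sqrt(MSE / n). A minimal Python sketch, using only the values printed in the SAS output above (variable names are illustrative), recomputes it and lists the pairs of sprays whose means differ by more than the MSD:

```python
import math
from itertools import combinations

# Values copied from the Tukey HSD output.
q_crit = 4.23186      # critical value of the studentized range
mse = 50.32           # error mean square
n_per_group = 5       # observations per spray

# Group means copied from the means table.
means = {"A": 12.6, "B": 34.6, "C": 13.0, "D": 16.8, "E": 28.4}

# Minimum significant difference for equal group sizes.
msd = q_crit * math.sqrt(mse / n_per_group)

# Pairs whose mean difference exceeds the MSD are significantly different.
significant = [
    (a, b)
    for a, b in combinations(sorted(means), 2)
    if abs(means[a] - means[b]) > msd
]

print(f"MSD = {msd:.3f}")
for a, b in significant:
    print(f"{a} vs {b}: difference = {abs(means[a] - means[b]):.1f}")
```

Pairs not listed share at least one letter in the Tukey grouping table.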



 

 

 

 

 

 

 

 

 

Nonparametric One-Way ANOVA
The NPAR1WAY Procedure: Nasal Spray Example

Wilcoxon Scores (Rank Sums) for Variable Improvement
Classified by Variable Type

             Sum of    Expected    Std Dev      Mean
Type   N     Scores    Under H0    Under H0     Score
A      5      36.50    65.0        14.682756     7.30
B      5     108.00    65.0        14.682756    21.60
C      5      35.00    65.0        14.682756     7.00
D      5      55.50    65.0        14.682756    11.10
E      5      90.00    65.0        14.682756    18.00

Average scores were used for ties.

Kruskal-Wallis Test
Chi-Square        15.8695
DF                4
Pr > Chi-Square   0.0032
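
The rank sums and the Kruskal-Wallis chi-square can be reproduced by ranking all 25 observations together (with average ranks for ties) and applying the usual tie-corrected formula. The sketch below is a plain-Python illustration under that assumption, not the NPAR1WAY implementation; all names are illustrative, and the p-value is not computed.

```python
from collections import Counter

# Airflow improvement by spray, copied from the data table.
improvement = {
    "A": [15, 10, 16, 14, 8],
    "B": [25, 41, 37, 44, 26],
    "C": [21, 6, 9, 15, 14],
    "D": [16, 7, 24, 22, 15],
    "E": [24, 15, 39, 34, 30],
}

pooled = sorted(y for ys in improvement.values() for y in ys)
n_total = len(pooled)

# Average rank (midrank) for each distinct value in the pooled sample.
first_position = {}
for position, value in enumerate(pooled, start=1):
    first_position.setdefault(value, position)
counts = Counter(pooled)
midrank = {v: first_position[v] + (counts[v] - 1) / 2 for v in counts}

# Sum of ranks within each spray group.
rank_sums = {g: sum(midrank[y] for y in ys) for g, ys in improvement.items()}

# Kruskal-Wallis statistic before the tie correction.
h_uncorrected = (
    12 / (n_total * (n_total + 1))
    * sum(r ** 2 / len(improvement[g]) for g, r in rank_sums.items())
    - 3 * (n_total + 1)
)

# Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
tie_correction = 1 - sum(t ** 3 - t for t in counts.values()) / (
    n_total ** 3 - n_total
)
h = h_uncorrected / tie_correction

print(rank_sums)
print(f"Chi-square = {h:.4f} on {len(improvement) - 1} df")
```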

 

 

 

 

 

3. Forensic dental X-rays
The extent to which X-rays can penetrate tooth enamel has been suggested as a suitable mechanism for differentiating between females and males in forensic medicine (e.g., think about shows like ‘CSI’ and parts of ‘NCIS’). The table below gives spectropenetration gradients for one tooth from each of eight females and eight males.

Gender   Y = spectropenetration gradient            Mean     Std. dev.
Female   4.8  5.3  3.7  4.1  5.6  4.0  3.6  5.0     4.5125   0.7605
Male     4.9  5.4  5.0  5.5  5.4  6.6  6.3  4.3     5.4250   0.7440

Note that a high reading reflects a fast drop-off in X-ray penetration, with less penetration by X-rays.

(a)   Explain why the teeth have been sampled from eight different people of each sex, and not eight teeth from one female and eight from one male.

(b)   Given that the researcher could afford to test n = 16 subjects, explain the advantages of choosing eight from each group.

(c)    SAS output from an ANOVA is on pages 9 and 10. Write a report, following the guidelines on page 1.

(d)   Explain why there is no point doing a Tukey test with these data.


 

One-Way Analysis of Variance
Results: X-ray Penetration Gradient

The ANOVA Procedure

Class Level Information
Class    Levels   Values
Gender   2        Female Male

Number of Observations Read   16
Number of Observations Used   16

Dependent Variable: Xray_grad

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1       3.33062500    3.33062500      5.88   0.0294
Error             14       7.92375000    0.56598214
Corrected Total   15      11.25437500

R-Square   Coeff Var   Root MSE   Xray_grad Mean
0.295940   15.14099    0.752318   4.968750

Source   DF     Anova SS   Mean Square   F Value   Pr > F
Gender    1   3.33062500    3.33062500      5.88   0.0294
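
With only two groups, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic, so the ANOVA above already is the single pairwise comparison. The following Python sketch illustrates that identity on the data from the table; it is a cross-check under the usual equal-variance assumption, with illustrative names, and it does not compute p-values.

```python
import math

# Spectropenetration gradients, copied from the data table.
female = [4.8, 5.3, 3.7, 4.1, 5.6, 4.0, 3.6, 5.0]
male = [4.9, 5.4, 5.0, 5.5, 5.4, 6.6, 6.3, 4.3]


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


n_f, n_m = len(female), len(male)
grand_mean = mean(female + male)

# One-way ANOVA quantities (the model has 1 degree of freedom).
ss_model = n_f * (mean(female) - grand_mean) ** 2 + n_m * (
    mean(male) - grand_mean
) ** 2
ss_error = sum_sq_dev(female) + sum_sq_dev(male)
df_error = n_f + n_m - 2
mse = ss_error / df_error
f_value = ss_model / mse

# Pooled two-sample t statistic uses the same error mean square.
t_value = (mean(male) - mean(female)) / math.sqrt(mse * (1 / n_f + 1 / n_m))

print(f"F = {f_value:.2f}, t = {t_value:.3f}, t^2 = {t_value ** 2:.2f}")
```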
 

 

 



 

 

 

 

 

 

Levene's Test for Homogeneity of Xray_grad Variance
ANOVA of Squared Deviations from Group Means

Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
Gender    1          0.00189       0.00189      0.01   0.9305
Error    14           3.3504        0.2393

Level of        Xray_grad
Gender     N    Mean          Std Dev
Female     8    4.51250000    0.76052144
Male       8    5.42500000    0.74402381
 

 

 

 

 

 

 

4. Personality types
In psychology, there are tests to classify people into one of many personality types. An experiment is run to find the extent of the influence of personality type on the subject’s score in a certain test. A random sample of four personality types is taken, and within each type a random sample of ten subjects is taken. Each subject is given the test, and the score Y is recorded, with data as follows:

Type   Test Score, Y
T1     50  52  44  49  60  51  40  41  54  39
T2     63  45  48  49  65  55  47  58  57  56
T3     50  52  47  48  44  56  55  39  51  53
T4     39  38  51  50  53  53  59  41  45  48

(a)   Explain why this is a random effects design, rather than a fixed effects design.

(b)   Some SAS output is given on pages 12 and 13. Note that the boxplots do not include estimates of group means, since any differences in population means are not the focus of this investigation.

Present a report and your conclusions. Include in your report comments on whether the relevant assumptions seem satisfied. Give your estimated components of variance, plus the percentage of the total variance of Y that is due to personality, along with the percentage unexplained.

Do you think personality type is important in determining the score on this particular test?

SAS Output for Personality Type Example

[Box plot of Score by personality type; figure not reproduced]

 

 

 

 

One-Way Analysis of Variance
Results
The ANOVA Procedure

Class Level Information
Class      Levels   Values
PersType   4        T1 T2 T3 T4

Number of Observations Read   40
Number of Observations Used   40

Dependent Variable: Score

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       279.675000     93.225000      2.23   0.1017
Error             36      1506.700000     41.852778
Corrected Total   39      1786.375000

R-Square   Coeff Var   Root MSE   Score Mean
0.156560   12.97117    6.469372   49.87500

Source     DF      Anova SS   Mean Square   F Value   Pr > F
PersType    3   279.6750000    93.2250000      2.23   0.1017
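
For a balanced one-way random effects design, the usual method-of-moments estimates of the variance components come from the two mean squares: the error variance is estimated by MSE, and the between-type variance by (MS_Type − MSE) / n, where n is the number of subjects per type. The sketch below applies those formulas to the mean squares printed above; it is a minimal illustration with assumed variable names, and it truncates a negative estimate at zero by convention.

```python
# Mean squares copied from the ANOVA table for the personality data.
ms_type = 93.225        # mean square for PersType
ms_error = 41.852778    # error mean square
n_per_type = 10         # subjects sampled within each personality type

# Method-of-moments estimates of the two variance components.
var_error = ms_error
var_type = max((ms_type - ms_error) / n_per_type, 0.0)

# Share of the total variance of Y attributed to each component.
var_total = var_type + var_error
pct_type = 100 * var_type / var_total
pct_unexplained = 100 * var_error / var_total

print(f"Estimated variance due to personality type: {var_type:.4f}")
print(f"Estimated error variance:                   {var_error:.4f}")
print(f"Percentage due to personality type: {pct_type:.2f}%")
print(f"Percentage unexplained:             {pct_unexplained:.2f}%")
```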
 

 

 

Levene's Test for Homogeneity of Score Variance
ANOVA of Squared Deviations from Group Means

Source     DF   Sum of Squares   Mean Square   F Value   Pr > F
PersType    3           2400.7         800.2      0.49   0.6926
Error      36          59004.7        1639.0
   
   
 

 

 

 

 

 

 

 

5. Phytoremediation
Phytoremediation (New Scientist, 20 Dec 1997, p.26) is a process by which plants are used to remove toxic metals from the soil. For example, sunflowers were used around Chernobyl, where there was radioactive contamination from a nuclear power station accident.

Certain plants take up toxic metals (e.g. zinc, cadmium, uranium) and accumulate them in their vacuoles as protection against chewing insects and infection.

Suppose that four species of plant were tested, at lower and higher soil pH, for their uptake of zinc, Y, measured in parts per million (ppm) of dry plant weight at the end of the trial.

Uptake of zinc, Y, (ppm):

 
                      Soil pH
Plant Name            5.5 (acid)            7 (neutral)
Lettuce                250    470    330     400    310    430
Martin red fescue     2850   2380   3130    1070    960   1300
Alpine pennycress     6340   4280   5170    2880   4330   3050
Bladder campion       3690   4750   5100    2360   1990   2140

(a)   What kind of design is this? Give the model equation, including an interaction term.

(b)   SAS analysis of the data using the model from part (a) was tried on both raw data Y and transformed data log Y. Diagnostic graphs from both analyses are given on pages 15 and 16. Explain, with reasons, whether it is better to analyse Y or log Y.

(c)    Further SAS output is given on pages 17 to 19. Present a report and your conclusions, following the usual guidelines. Use a 5% significance level.

SAS Output for Phytoremediation Example

[Diagnostic graphs for the analyses of Y and log Y; figures not reproduced]

 

Linear Models

The GLM Procedure

Class Level Information
Class   Levels   Values
pH      2        acid neutral
Plant   4        AlpineP BladderC Lettuce MartinRF

Number of Observations Read   24
Number of Observations Used   24

Dependent Variable: logZinc

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              7      24.03027190    3.43289599     92.71   <.0001
Error             16       0.59245521    0.03702845
Corrected Total   23      24.62272711

R-Square   Coeff Var   Root MSE   logZinc Mean
0.975939   2.589699    0.192428   7.430507

Source     DF      Type I SS   Mean Square   F Value   Pr > F
pH          1     1.46932364    1.46932364     39.68   <.0001
Plant       3    21.65807393    7.21935798    194.97   <.0001
pH*Plant    3     0.90287433    0.30095811      8.13   0.0016

Source     DF    Type III SS   Mean Square   F Value   Pr > F
pH          1     1.46932364    1.46932364     39.68   <.0001
Plant       3    21.65807393    7.21935798    194.97   <.0001
pH*Plant    3     0.90287433    0.30095811      8.13   0.0016
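
Because the design is balanced (three replicates in every pH by plant cell), the Type I and Type III sums of squares coincide and can be recomputed directly from marginal and cell means. The Python sketch below does this for the natural log of zinc uptake; taking `log` to mean the natural logarithm is an assumption, chosen because it is consistent with the reported logZinc mean of 7.43. Names are illustrative and p-values are not computed.

```python
import math

# Zinc uptake (ppm) by (soil pH, plant), copied from the data table.
zinc = {
    ("acid", "Lettuce"): [250, 470, 330],
    ("neutral", "Lettuce"): [400, 310, 430],
    ("acid", "MartinRF"): [2850, 2380, 3130],
    ("neutral", "MartinRF"): [1070, 960, 1300],
    ("acid", "AlpineP"): [6340, 4280, 5170],
    ("neutral", "AlpineP"): [2880, 4330, 3050],
    ("acid", "BladderC"): [3690, 4750, 5100],
    ("neutral", "BladderC"): [2360, 1990, 2140],
}
# Assumption: logZinc is the natural logarithm of the uptake.
log_zinc = {cell: [math.log(y) for y in ys] for cell, ys in zinc.items()}


def mean(xs):
    return sum(xs) / len(xs)


ph_levels = sorted({ph for ph, _ in log_zinc})
plants = sorted({plant for _, plant in log_zinc})
reps = 3  # replicates per cell

all_values = [y for ys in log_zinc.values() for y in ys]
grand = mean(all_values)
cell_mean = {cell: mean(ys) for cell, ys in log_zinc.items()}
ph_mean = {
    ph: mean([y for (a, _), ys in log_zinc.items() if a == ph for y in ys])
    for ph in ph_levels
}
plant_mean = {
    pl: mean([y for (_, b), ys in log_zinc.items() if b == pl for y in ys])
    for pl in plants
}

# Balanced two-way decomposition of the model sum of squares.
ss_ph = reps * len(plants) * sum((m - grand) ** 2 for m in ph_mean.values())
ss_plant = reps * len(ph_levels) * sum(
    (m - grand) ** 2 for m in plant_mean.values()
)
ss_cells = reps * sum((m - grand) ** 2 for m in cell_mean.values())
ss_inter = ss_cells - ss_ph - ss_plant
ss_error = sum(
    (y - cell_mean[cell]) ** 2 for cell, ys in log_zinc.items() for y in ys
)

df_error = len(all_values) - len(cell_mean)
mse = ss_error / df_error
df_inter = (len(ph_levels) - 1) * (len(plants) - 1)
f_inter = (ss_inter / df_inter) / mse

print(f"SS pH = {ss_ph:.4f}, SS Plant = {ss_plant:.4f}")
print(f"SS pH*Plant = {ss_inter:.4f}, SS Error = {ss_error:.4f}")
print(f"F for pH*Plant = {f_inter:.2f}")
```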
 

[Alternative interaction graph for phytoremediation example; figure not reproduced]
 

 

 

 
