Question 1: A researcher wants to study whether an existing drug causes a particular
kind of skin reaction. She collects data from an existing medical database; she samples
individuals who showed the skin reaction as well as individuals without a reaction. She
then examines how many individuals had taken the drug of interest.
(a) What kind of study is this? Be as specific as possible, and explain your
answer.
(b) Suppose the researcher finds that those who took the drug had a higher
chance of developing a skin reaction. Do you believe this result? Can you think of
other specific factors that might affect the conclusion? Explain your answers.
Question 2: A researcher runs a randomized experiment where study participants
are randomized to be given either a drug or a placebo. Another researcher wants
to perform an additional study on this same study group. He asks participants whether
they get more or less than 3 hours of exercise per week. He concludes that individuals
with more than 3 hours of exercise per week have lower blood pressure than those who
get less than 3 hours, on average. Are you worried about confounders in this conclusion?
Explain.
Question 3: We want to study the genetics of colour distribution in a population of
unicorns. Suppose that in unicorns there is a single gene that determines colour; there
are two variants of the gene, one called A and the other called a. Each unicorn has two
copies of the gene (one from each parent). The combination of the gene variants for copy
1 and copy 2 of the gene determines colour as follows:
Copy 1   Copy 2   Colour
A        A        Red
A        a        Pink
a        A        Pink
a        a        White
We want to determine if “random mating” is happening in this unicorn population. This
would mean that a newly born unicorn inherits variant A or a with probability
equal to the prevalence of each variant in the overall population.
(a) Let p be the proportion of the A variant in the population, and q be the
proportion of the variant a, so that q = 1 − p. Under random mating, each newborn
unicorn inherits A with probability p and a with probability q, and the two copies
inherited in each unicorn are independent of one another. Calculate the expected
proportions of unicorn colours under random mating.
(b) Suppose you know that p = 0.75 and q = 0.25, and that you collect the
following data from a sample of unicorns:
Colour Number of unicorns
Red 45
Pink 49
White 12
Use the appropriate statistical test to determine whether the random mating
assumption holds in this population. Write out your calculations for this part;
using R for this part will not result in credit.
Ravish Kamath
213893664
Question 1
A researcher wants to study whether an existing drug causes a particular
kind of skin reaction. She collects data from an existing medical database;
she samples individuals who showed the skin reaction as well as individuals
without a reaction. She then examines how many individuals had taken the
drug of interest.
(a) What kind of study is this? Be as specific as possible, and explain your
answer.
(b) Suppose the researcher finds that those who took the drug had a higher
chance of developing a skin reaction. Do you believe this result? Can
you think of other specific factors that might affect the conclusion?
Explain your answers.
Solution
(a) This is a retrospective observational study, and more specifically a
case-control study. The researcher is looking into an already existing
database, so she is looking into the past for her subjects and assigns
no treatment herself. She samples individuals based on the outcome,
those who already showed a skin reaction (cases) and those who did not
(controls), and then looks back to see who had taken the drug.
(b) The result is plausible, but because the study is observational it
shows an association rather than proof that the drug causes the
reaction. Other factors could affect the researcher's conclusion. For
example, seasonality (exposure to high levels of sunlight can cause
skin reactions), food allergies, and prior infections or diseases can
all produce similar skin reactions. These factors can cause a skin
reaction independently of the drug, and if they are more common among
the patients who took the drug they would distort the comparison.
However, depending on the sample size, such cases may be outliers, and
the drug may indeed cause the skin reaction.
4330: Assignment 1
Question 2
A researcher runs a randomized experiment where study participants are
randomized to be given either a drug or a placebo. Another researcher
wants to perform an additional study on this same study group. He asks
participants whether they get more or less than 3 hours of exercise per
week. He concludes that individuals with more than 3 hours of exercise per
week have lower blood pressure than those who get less than 3 hours, on
average. Are you worried about confounders in this conclusion? Explain.
Solution
Yes, there is a worry about confounders in this conclusion. Only the drug
assignment was randomized; exercise level was self-reported rather than
assigned, so the exercise comparison is observational. The first confounder
that comes to mind is the age of the participants. Older people may have less
time to exercise due to work, family, etc., and high blood pressure is more
common with age than among younger people. Because of the age confounder, it
may seem that fewer hours of exercise cause higher blood pressure, when it
could just be the age of those individuals that is causing the high blood
pressure.
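The confounding argument above can be illustrated with a small simulation. This is only a sketch under assumed numbers: the age range, the exercise probabilities, the blood pressure formula and the function names `simulate` and `mean_bp_gap` are all hypothetical and do not come from the study. In the simulated data, exercise has no effect on blood pressure at all, yet the less active group still shows a higher average because it is older.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate participants where age drives both exercise and blood pressure.

    All numbers here are made up for illustration. Exercise has NO causal
    effect on blood pressure in this model.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 70)
        # Older participants are less likely to exercise more than 3 hours/week:
        # probability 0.8 at age 20, falling linearly to 0.3 at age 70.
        p_active = 0.8 - 0.01 * (age - 20)
        active = rng.random() < p_active
        # Blood pressure depends on age plus noise only, not on exercise.
        bp = 100 + 0.5 * age + rng.gauss(0, 8)
        rows.append((age, active, bp))
    return rows


def mean_bp_gap(rows):
    """Mean blood pressure of the less active group minus the more active group."""
    low = [bp for _, active, bp in rows if not active]
    high = [bp for _, active, bp in rows if active]
    return mean(low) - mean(high)


rows = simulate()
crude_gap = mean_bp_gap(rows)

# Restrict to a narrow age band so the confounder is roughly held fixed.
band = [r for r in rows if 40 <= r[0] < 45]
band_gap = mean_bp_gap(band)

print(f"crude gap: {crude_gap:.2f}, gap within ages 40-44: {band_gap:.2f}")
```

The crude gap is clearly positive even though exercise does nothing in the model, while the gap within a narrow age band is close to zero. Comparing within strata of the suspected confounder is one simple way to check whether an association survives.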
Question 3
We want to study the genetics of colour distribution in a population
of unicorns. Suppose that in unicorns there is a single gene that
determines colour; there are two variants of the gene, one called A and the
other called a. Each unicorn has two copies of the gene (one from each
parent). The combination of the gene variants for copy 1 and copy 2 of the
gene determines colour as follows:
Copy 1   Copy 2   Colour
A        A        Red
A        a        Pink
a        A        Pink
a        a        White
We want to determine if “random mating” is happening in this unicorn
population. This would mean that a newly born unicorn inherits variant A
or a with probability equal to the prevalence of each variant in the
overall population.
(a) Let p be the proportion of the A variant in the population, and q be
the proportion of the variant a, so that q = 1 − p. Under random
mating, each newborn unicorn inherits A with probability p and a
with probability q, and the two copies inherited in each unicorn are
independent of one another. Calculate the expected proportions of
unicorn colours under random mating.
(b) Suppose you know that p = 0.75 and q = 0.25, and that you collect the
following data from a sample of unicorns:
Colour Number of unicorns
Red 45
Pink 49
White 12
Use the appropriate statistical test to determine whether the random mating
assumption holds in this population. Write out your calculations for
this part; using R for this part will not result in credit.
Solution
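(a) The two copies are inherited independently, so the probability of each
combination is the product of the two variant proportions. Pink arises
from two combinations (A a and a A), so both must be counted.
\begin{align*}
E(\text{Red Unicorn Colour}) &= \text{proportion of } A \times \text{proportion of } A \\
&= p \times p \\
&= p^2 \\
E(\text{Pink Unicorn Colour}) &= (\text{proportion of } A \times \text{proportion of } a) + (\text{proportion of } a \times \text{proportion of } A) \\
&= pq + qp \\
&= 2pq \\
&= 2p(1 - p) \\
E(\text{White Unicorn Colour}) &= \text{proportion of } a \times \text{proportion of } a \\
&= q \times q \\
&= q^2 \\
&= (1 - p)^2
\end{align*}
These three proportions sum to $(p + q)^2 = 1$.
(b) Let $n = 45 + 49 + 12 = 106$. The expected counts under random mating are
\begin{align*}
E(\text{Red Unicorn Colour}) &= (0.75)^2 \times 106 = 59.625 \\
E(\text{Pink Unicorn Colour}) &= 2 \times (0.75) \times (0.25) \times 106 = 39.75 \\
E(\text{White Unicorn Colour}) &= (0.25)^2 \times 106 = 6.625
\end{align*}
$H_0$: Random mating does occur in the unicorn population
$H_a$: Random mating does not occur
Let $\alpha = 0.05$ and $df = 3 - 1 = 2$ (no parameters are estimated, since p and q are known).
\begin{align*}
\chi^2 &= \frac{(45 - 59.625)^2}{59.625} + \frac{(49 - 39.75)^2}{39.75} + \frac{(12 - 6.625)^2}{6.625} \\
&\approx 3.5873 + 2.1525 + 4.3608 \\
&= 10.1006
\end{align*}
The critical value is $\chi^2_{0.05,\,2} = 5.991$, and the test statistic 10.1006
exceeds it. The corresponding p-value is $e^{-10.1006/2} \approx 0.0064$ (for
$df = 2$ the upper tail probability is $e^{-\chi^2/2}$), which is well below 0.05.
Hence we reject $H_0$: the data provide evidence that random mating does not
occur in the unicorn population.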
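As a check on the hand calculation for Question 3, the goodness-of-fit statistic can be recomputed in a few lines. This is a minimal sketch rather than part of the required written solution; the function names `expected_proportions` and `chi_square_gof` are hypothetical. It counts both A a and a A as pink, and it uses the fact that with 2 degrees of freedom the chi-square upper tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math


def expected_proportions(p):
    """Colour proportions under random mating when variant A has proportion p."""
    q = 1 - p
    # Pink covers both orderings (A a) and (a A), hence the factor of 2.
    return {"Red": p * p, "Pink": 2 * p * q, "White": q * q}


def chi_square_gof(observed, proportions):
    """Return the Pearson chi-square statistic and the expected counts."""
    n = sum(observed.values())
    expected = {colour: n * proportions[colour] for colour in observed}
    stat = sum(
        (observed[colour] - expected[colour]) ** 2 / expected[colour]
        for colour in observed
    )
    return stat, expected


observed = {"Red": 45, "Pink": 49, "White": 12}
stat, expected = chi_square_gof(observed, expected_proportions(0.75))

# Three categories and no estimated parameters give df = 2, for which the
# chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)

print(expected)
print(f"chi-square = {stat:.4f}, p-value = {p_value:.4f}")
```

The shortcut for the p-value only holds for df = 2; for any other number of categories a chi-square table or a statistics library would be needed instead.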