1. Inference for normal mean and deviation
A factory has a production line for manufacturing car windshields. A sample of windshields has been taken for testing hardness. The observed hardness values y1 can be found in the windshieldy1 dataset. The data can be accessed from the bsda R package as follows:
library(bsda)
data("windshieldy1")
head(windshieldy1)
## [1] 13.357 14.928 14.896 15.297 14.820 12.067
We may assume that the observations follow a normal distribution with an unknown standard deviation σ. We wish to obtain information about the unknown average hardness µ. For simplicity, we assume the standard uninformative prior discussed in the book, that is, p(µ, σ) ∝ σ^(-1). It is not necessary to derive the posterior distribution in the report, as it has already been done in the book.
Below are test examples that can be used. The functions below can also be tested with markmyassignment. Note! This is only a test case. You need to change to the full data windshieldy1 above when reporting your results.
windshieldy_test <- c(13.357, 14.928, 14.896, 14.820)
In the report, formulate (1) model likelihood, (2) the prior, and (3) the resulting posterior.
a) What can you say about the unknown µ? Summarize your results using a Bayesian point estimate (i.e. E(µ|y)), a posterior interval (95%), and plot the density. A test example can be found below for an uninformative prior. Note! Posterior intervals are also called credible intervals and are different from confidence intervals.
mu_point_est(data = windshieldy_test)
## [1] 14.5
mu_interval(data = windshieldy_test, prob = 0.95)
## [1] 13.3 15.7
b) What can you say about the hardness of the next windshield coming from the production line before actually measuring the hardness? Summarize your results using a Bayesian point estimate, a predictive interval (95%), and plot the density. A test example can be found below.
mu_pred_point_est(data = windshieldy_test)
## [1] 14.5
mu_pred_interval(data = windshieldy_test, prob = 0.95)
## [1] 11.8 17.2
Note! Predictive intervals are different from posterior intervals.
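For reference, the two distributions behind this distinction can be written side by side. This is a summary of the standard results for the prior p(µ, σ) ∝ σ^(-1), with n observations, sample mean ȳ and sample variance s²; the notation t_ν(location, squared scale) is an assumption made here, so check it against the equations in the book.

```latex
\mu \mid y \sim t_{n-1}\!\left(\bar{y},\; \frac{s^2}{n}\right),
\qquad
\tilde{y} \mid y \sim t_{n-1}\!\left(\bar{y},\; \left(1 + \frac{1}{n}\right) s^2\right)
```

Both share the same location and degrees of freedom; the predictive distribution is wider because it adds the sampling variability of a single new observation to the uncertainty about µ.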
Hint With a conjugate prior, a closed-form posterior is Student’s t form (see equations in the book). R users can use the dt function after normalising its input, that is, after shifting by the location and dividing by the scale. We have added an R function dtnew() in the bsda R package which does that. For generating samples, you can use the corresponding rtnew() function.
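The input normalisation mentioned in the hint can be sketched with base R only. The helper names dt_scaled and qt_scaled are hypothetical (they are not the package functions), and the location and scale values assume the standard results for the prior p(µ, σ) ∝ σ^(-1); treat this as a sketch to compare against the book, not as the reference solution.

```r
# Location-scale Student's t helpers built on base R's dt() and qt().
# Hypothetical names; these are NOT the functions shipped in the package.
dt_scaled <- function(x, df, mean = 0, scale = 1) {
  # Standardise the input, then divide by the scale (change of variables).
  dt((x - mean) / scale, df) / scale
}

qt_scaled <- function(p, df, mean = 0, scale = 1) {
  # Quantiles of the standard t, stretched by the scale and shifted by the mean.
  mean + scale * qt(p, df)
}

y <- c(13.357, 14.928, 14.896, 14.820)  # windshieldy_test from the text
n <- length(y)
ybar <- mean(y)
s <- sd(y)

# ASSUMPTION: marginal posterior of mu is t with n - 1 df,
# location ybar and scale s / sqrt(n).
post_interval <- qt_scaled(c(0.025, 0.975), df = n - 1,
                           mean = ybar, scale = s / sqrt(n))

# ASSUMPTION: posterior predictive has the same df and location,
# but the wider scale s * sqrt(1 + 1 / n).
pred_interval <- qt_scaled(c(0.025, 0.975), df = n - 1,
                           mean = ybar, scale = s * sqrt(1 + 1 / n))

print(round(post_interval, 1))
print(round(pred_interval, 1))

# Posterior density of mu on a grid, e.g. as input to plot(mu_grid, mu_dens).
mu_grid <- seq(ybar - 3, ybar + 3, length.out = 200)
mu_dens <- dt_scaled(mu_grid, df = n - 1, mean = ybar, scale = s / sqrt(n))
```

On the test vector, the two rounded intervals can be compared with the test outputs quoted in parts a) and b).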
2. Inference for the difference between proportions
An experiment was performed to estimate the effect of beta-blockers on mortality of cardiac patients. A group of patients was randomly assigned to treatment and control groups: out of 674 patients receiving the control, 39 died, and out of 680 receiving the treatment, 22 died. Assume that the outcomes are independent and binomially distributed, with probabilities of death of p0 and p1 under the control and treatment, respectively. Set up a noninformative or weakly informative prior distribution on (p0,p1).
In the report, formulate (1) model likelihood, (2) the prior, and (3) the resulting posterior.
a) Summarize the posterior distribution for the odds ratio, (p1/(1 − p1))/(p0/(1 − p0)). Compute the point estimate, a posterior interval (95%), and plot the histogram. Use Frank Harrell’s recommendations on how to state results in a Bayesian two-group comparison. Below is a test case showing how the odds ratio should be computed. Note! This is only a test case. You need to change to the real posteriors when reporting your results.
set.seed(4711)
p0 <- rbeta(100000, 5, 95)
p1 <- rbeta(100000, 10, 90)
posterior_odds_ratio_point_est(p0 = p0, p1 = p1)
## [1] 2.676
posterior_odds_ratio_interval(p0 = p0, p1 = p1, prob = 0.9)
## [1] 0.875 6.059
b) Discuss the sensitivity of your inference to your choice of prior density with a couple of sentences.
Hint With a conjugate prior, a closed-form posterior is the Beta form for each group separately (see equations in the book). You can use rbeta() to sample from the posterior distributions of p0 and p1, and then plug these samples into the odds ratio equation to get samples from the distribution of the odds ratio.
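The sampling route described in this hint can be sketched as follows. The helper name odds_ratio_draws is hypothetical, and the second half assumes independent uniform Beta(1, 1) priors on p0 and p1, which is only one of the noninformative choices the exercise allows; other priors change the Beta parameters accordingly.

```r
# Hypothetical helper: turn draws of p0 and p1 into draws of the odds ratio.
odds_ratio_draws <- function(p0, p1) {
  (p1 / (1 - p1)) / (p0 / (1 - p0))
}

set.seed(4711)
n_draws <- 100000

# Test-case draws copied from the text above (NOT the real posteriors).
p0 <- rbeta(n_draws, 5, 95)
p1 <- rbeta(n_draws, 10, 90)
or_test <- odds_ratio_draws(p0, p1)
or_test_mean <- mean(or_test)                                 # one possible point estimate
or_test_interval <- quantile(or_test, probs = c(0.05, 0.95))  # central 90% interval

# ASSUMPTION: independent uniform Beta(1, 1) priors on p0 and p1.
# Conjugacy then gives Beta(1 + deaths, 1 + survivors) for each group.
p0_post <- rbeta(n_draws, 1 + 39, 1 + 674 - 39)
p1_post <- rbeta(n_draws, 1 + 22, 1 + 680 - 22)
or_post <- odds_ratio_draws(p0_post, p1_post)

# A directly interpretable summary: posterior probability that the
# odds of death are lower under treatment than under control.
prob_or_below_1 <- mean(or_post < 1)
```

Reporting a quantity such as prob_or_below_1 alongside the interval is one way to state the result as a direct probability about the treatment effect; whether the point estimate should be the mean or another summary is left to the report.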
3. Inference for the difference between normal means
Consider a case where the same factory has two production lines for manufacturing car windshields. Independent samples from the two production lines were tested for hardness. The hardness measurements for the two samples y1 and y2 are given in the files windshieldy1.txt and windshieldy2.txt. These can be accessed directly with
data("windshieldy1")
data("windshieldy2")
We assume that the samples have unknown standard deviations σ1 and σ2.
In the report, formulate (1) model likelihood, (2) the prior, and (3) the resulting posterior.
Use uninformative or weakly informative priors and answer the following questions:
a) What can you say about µd = µ1 − µ2? Summarize your results using a Bayesian point estimate, a posterior interval (95%), and plot the histogram. Use Frank Harrell’s recommendations on how to state results in a Bayesian two-group comparison.
b) Given the model used, what is the probability that the means are exactly the same (µ1 = µ2)? Explain your reasoning.
Hint With a conjugate prior, a closed-form posterior is Student’s t form for each group separately (see equations in the book). You can use the rt() function to sample from the posterior distributions of µ1 and µ2, and use these samples to get samples from the distribution of the difference µd = µ1 − µ2. Be careful to scale and shift the samples according to their mean and variance values in R, as described above.
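A minimal sketch of the scale-and-shift step with rt() is given here. The helper name sample_mu_posterior is hypothetical, and it assumes the prior p(µ, σ) ∝ σ^(-1) for each group, so that the marginal posterior of each mean is t with n − 1 degrees of freedom, location ȳ and scale s/√n. The two input vectors are placeholders built from values printed earlier in this text, not the full datasets.

```r
# Hypothetical helper: draws from the marginal posterior of mu for one group,
# ASSUMING p(mu, sigma) proportional to 1 / sigma.
sample_mu_posterior <- function(y, n_draws) {
  n <- length(y)
  # rt() gives standard t draws; scale by s / sqrt(n), then shift by ybar.
  mean(y) + (sd(y) / sqrt(n)) * rt(n_draws, df = n - 1)
}

set.seed(1)
n_draws <- 100000

# Placeholder inputs only: the head() output and the test vector shown above.
# Replace with the full windshieldy1 and windshieldy2 data for the report.
y1_demo <- c(13.357, 14.928, 14.896, 15.297, 14.820, 12.067)
y2_demo <- c(13.357, 14.928, 14.896, 14.820)

mu1 <- sample_mu_posterior(y1_demo, n_draws)
mu2 <- sample_mu_posterior(y2_demo, n_draws)
mu_d <- mu1 - mu2  # draws of the difference, paired draw by draw

mu_d_mean <- mean(mu_d)                                      # point estimate
mu_d_interval <- quantile(mu_d, probs = c(0.025, 0.975))     # central 95% interval
prob_mu_d_negative <- mean(mu_d < 0)                         # P(mu1 < mu2 | data)
```

Because the two groups are modelled independently, subtracting the draws elementwise gives valid draws of µd, and hist(mu_d) then produces the requested histogram.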
Hint Posterior distributions of µ1 and µ2 are continuous, and thus the posterior distribution of the difference µd = µ1 − µ2 is also continuous. What is the probability that µd = 0?