Total pts: 10 (reproducibility) + 30 (Q1) + 20 (Q2) + 40 (Q3) = 100
General instructions for homeworks: Please follow the uploading file instructions according to the syllabus. You will give the commands to answer each question in its own code block, which will also produce plots that will be automatically embedded in the output file. Each answer must be supported by written statements as well as any code used. Your code must be completely reproducible and must compile.
Commenting code: Code should be commented. See the Google style guide for questions regarding commenting or how to write code: https://google.github.io/styleguide/Rguide.xml. No late homework will be accepted.
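As a hedged illustration of the reproducibility and commenting requirements, the following R sketch fixes the random seed so that repeated runs give identical output. The seed value and the function name `SimulateMean` are hypothetical and not part of the assignment.

```r
# Fix the seed so every run of this script produces the same draws.
# The value 123 is arbitrary (an assumption for illustration).
set.seed(123)

# Computes the sample mean of n standard normal draws.
#
# Args:
#   n: Number of draws (a positive integer).
#
# Returns:
#   The sample mean as a single numeric value.
SimulateMean <- function(n) {
  draws <- rnorm(n)
  mean(draws)
}

first.run <- SimulateMean(100)

# Resetting the seed before the second call reproduces the result exactly.
set.seed(123)
second.run <- SimulateMean(100)
identical(first.run, second.run)
```

Any code block that draws random numbers would need a seed set in this way for the compiled output to be the same on every run.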
1. Lab component (30 points total) Please refer to lab 2 and complete tasks 3–5.
(a) (10) Task 3
(b) (10) Task 4
(c) (10) Task 5
The goal of this problem is to see how a conjugate model relates to real data. You will get practice deriving a posterior distribution that you have not seen before, plotting densities as we did in class, and seeing a connection to real data. Finally, you will get practice thinking about when the model below might be appropriate in practice.
2. (20 points total) The Exponential-Gamma Model We write X ∼ Exp(θ) to indicate that X has the Exponential distribution, that is, its p.d.f. is
p(x|θ) = Exp(x|θ) = θ exp(−θx)1(x > 0).
The Exponential distribution has some special properties that make it a good model for certain applications. It has been used to model the time between events (such as neuron spikes, website hits, neutrinos captured in a detector), extreme values such as maximum daily rainfall over a period of one year, or the amount of time until a product fails (lightbulbs are a standard example).
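To make the density above concrete, here is a minimal R sketch that evaluates θ exp(−θx)1(x > 0) by hand, compares it with R's built-in `dexp` (whose `rate` argument plays the role of θ), and simulates waiting times with `rexp`. The value θ = 0.5, the evaluation points, and the name `ExpDensity` are illustrative assumptions.

```r
theta <- 0.5                 # hypothetical rate parameter
x <- c(0.1, 1, 2.5, 10)      # points at which to evaluate the density

# Density written out from the formula, including the indicator 1(x > 0).
ExpDensity <- function(x, theta) {
  theta * exp(-theta * x) * (x > 0)
}

manual <- ExpDensity(x, theta)
builtin <- dexp(x, rate = theta)
all.equal(manual, builtin)

# Simulated waiting times; their average should be close to 1 / theta,
# the mean of the Exponential distribution.
set.seed(1)
waits <- rexp(10000, rate = theta)
mean(waits)
```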
Suppose you have data x1,...,xn which you are modeling as i.i.d. observations from an Exponential distribution, and suppose that your prior is θ ∼ Gamma(a,b), that is,
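$$p(\theta) = \mathrm{Gamma}(\theta \mid a, b) = \frac{b^a}{\Gamma(a)}\, \theta^{a-1} \exp(-b\theta)\, 1(\theta > 0).$$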
(a) (5) Derive the formula for the posterior density, p(θ|x1:n). Give the form of the posterior in terms of one of the most common distributions (Bernoulli, Beta, Exponential, or Gamma).
(b) (5) Why is the posterior distribution a proper density or probability distribution function?
(c) (5) Now, suppose you are measuring the number of seconds between lightning strikes during a storm, your prior is Gamma(0.1,1.0), and your data is
(x1,...,x8) = (20.9,69.7,3.6,21.8,21.4,0.4,6.7,10.0).
Plot the prior and posterior p.d.f.s. (Be sure to make your plots on a scale that allows you to clearly see the important features.)
(d) (5) Give a specific example of an application where an Exponential model would be reasonable. Give an example where an Exponential model would NOT be appropriate, and explain why.
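For the plotting part of this problem, one possible way to overlay two Gamma densities in base R is sketched below. The shape and rate values are placeholders chosen only for illustration; they are not the prior or posterior from part (c). The sketch also assumes that the second parameter of Gamma(a,b) is a rate, which corresponds to the `rate` argument of `dgamma`.

```r
# Placeholder parameters (assumptions for illustration only).
shape.1 <- 2
rate.1 <- 1
shape.2 <- 5
rate.2 <- 2

# Grid of theta values; the range is chosen so both curves are visible.
theta <- seq(0.001, 8, length.out = 500)
dens.1 <- dgamma(theta, shape = shape.1, rate = rate.1)
dens.2 <- dgamma(theta, shape = shape.2, rate = rate.2)

# Draw the first density, then add the second on the same axes.
plot(theta, dens.1, type = "l", lty = 1,
     ylim = c(0, max(dens.1, dens.2)),
     xlab = expression(theta), ylab = "density",
     main = "Two Gamma densities")
lines(theta, dens.2, lty = 2)
legend("topright", legend = c("Gamma(2, 1)", "Gamma(5, 2)"), lty = c(1, 2))
```

Setting `ylim` from both curves keeps the taller density from being clipped. When one density is much more concentrated than the other, narrowing the `theta` grid is a simple way to keep the important features visible.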
The goal of this problem is to introduce you to a new family of distributions, get more practice deriving the posterior, and work with a posterior predictive distribution on your own for the first time. This will be an intense problem, so reach out if you’re having trouble!
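3. (40 points total) Priors, Posteriors, Predictive Distributions (Hoff, 3.9) An unknown quantity Y | θ has a Galenshore(a,θ) distribution if its density is given by
$$p(y \mid \theta) = \frac{2}{\Gamma(a)}\, \theta^{2a}\, y^{2a-1}\, e^{-\theta^2 y^2}$$
for y > 0,θ > 0,a > 0. Assume for now that a is known and θ is unknown and a random variable. For this density,
$$E[Y \mid \theta] = \frac{\Gamma(a + 1/2)}{\theta\, \Gamma(a)}$$
and
$$E[Y^2 \mid \theta] = \frac{a}{\theta^2}.$$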
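As a sanity check on this family, the R sketch below codes the Galenshore density and verifies numerically that it integrates to one and that its first two moments match closed forms. It assumes the density as stated in Hoff, Exercise 3.9, namely p(y|θ) = (2/Γ(a)) θ^(2a) y^(2a−1) exp(−θ²y²), with E[Y|θ] = Γ(a + 1/2)/(θΓ(a)) and E[Y²|θ] = a/θ². The function name `DGalenshore` and the values a = 2, θ = 1.5 are hypothetical.

```r
# Galenshore(a, theta) density, assuming the form in Hoff, Exercise 3.9:
#   p(y) = 2 / Gamma(a) * theta^(2a) * y^(2a - 1) * exp(-theta^2 * y^2), y > 0.
# Computed on the log scale for numerical stability.
DGalenshore <- function(y, a, theta) {
  log.dens <- log(2) - lgamma(a) + 2 * a * log(theta) +
    (2 * a - 1) * log(y) - theta^2 * y^2
  ifelse(y > 0, exp(log.dens), 0)
}

a <- 2         # hypothetical known parameter
theta <- 1.5   # hypothetical value of the unknown parameter

# Total probability: should be numerically close to 1.
total <- integrate(function(y) DGalenshore(y, a, theta), 0, Inf)$value

# First moment, numerically and from the closed form.
mean.num <- integrate(function(y) y * DGalenshore(y, a, theta), 0, Inf)$value
mean.formula <- gamma(a + 0.5) / (theta * gamma(a))

# Second moment, numerically and from the closed form.
second.num <- integrate(function(y) y^2 * DGalenshore(y, a, theta), 0, Inf)$value
second.formula <- a / theta^2

c(total = total, mean.num = mean.num, mean.formula = mean.formula,
  second.num = second.num, second.formula = second.formula)
```

The same function can be passed to `curve` or evaluated on a grid to plot members of the family for different parameter values.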
(a) (10) Identify a class of conjugate prior densities for θ. Assume the prior parameters are c and d, which are fixed and known. That is, state the prior distribution for θ, which will have known and fixed parameters c,d such that the resulting posterior is conjugate. Plot a few members of this class of densities.
(b) (5) Let Y1,...,Yn be i.i.d. Galenshore(a,θ). Find the posterior distribution of θ | y1:n using a prior from your conjugate class.
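(c) (10) Show that
$$\frac{p(\theta_a \mid y_{1:n})}{p(\theta_b \mid y_{1:n})} = \left(\frac{\theta_a}{\theta_b}\right)^{2(an+c)-1} \exp\left\{ (\theta_b^2 - \theta_a^2) \left( d^2 + \sum_{i=1}^{n} y_i^2 \right) \right\},$$
where
θa,θb ∼ Galenshore(c,d).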
Identify a sufficient statistic.
(d) (5) Determine E[θ | y1:n].
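(e) (10) Show that the form of the posterior predictive density for a new observation $\tilde{y}$ is
$$p(\tilde{y} \mid y_{1:n}) = \frac{2\, \tilde{y}^{2a-1}\, \Gamma(an + a + c)}{\Gamma(a)\, \Gamma(an + c)} \cdot \frac{\left( d^2 + \sum_{i=1}^{n} y_i^2 \right)^{an+c}}{\left( d^2 + \sum_{i=1}^{n} y_i^2 + \tilde{y}^2 \right)^{an+a+c}}.$$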