Stat4DS - Homework 002 - Solved

Statistical Methods in Data Science II & Lab


1a) Illustrate the characteristics of the statistical model for the dugong data [data available in the R code 2022-W-13-R2jags-code.R]. Lengths (Yi) and ages (xi) of 27 dugongs (sea cows) captured off the coast of Queensland have been recorded, and the following nonlinear regression model is considered in Carlin and Gelfand (1991):

$$Y_i \sim N(\mu_i, \tau^2)$$

$$\mu_i = f(x_i) = \alpha - \beta \gamma^{x_i}$$

Model parameters are α ∈ (1,∞), β ∈ (1,∞), γ ∈ (0,1), τ² ∈ (0,∞). Let us consider the following prior distributions:

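$$\alpha \sim N(0, \sigma_\alpha^2)$$

$$\beta \sim N(0, \sigma_\beta^2)$$

$$\gamma \sim \mathrm{Unif}(0, 1)$$

$$\tau^2 \sim IG(a, b) \quad \text{(Inverse Gamma)}$$
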
1b) Derive the corresponding likelihood function

1c) Write down the expression of the joint prior distribution of the parameters at stake and illustrate your suitable choice for the hyperparameters.

1d) Derive the functional form (up to proportionality constants) of all full-conditionals

1e) Which distributions can you recognize within standard parametric families, so that direct simulation from the full conditionals can be easily implemented?

1f) Using a suitable Metropolis-within-Gibbs algorithm, simulate a Markov chain (T = 10000) to approximate the posterior distribution for the above model

1g) Show the 4 univariate trace-plots of the simulations of each parameter

1h) Evaluate graphically the behaviour of the empirical averages $\hat{I}_t$ with growing t = 1,...,T

1i) Provide estimates for each parameter together with the approximation error and explain how you have evaluated such error

1l) Which parameter has the largest posterior uncertainty? How did you measure it?

1m) Which pair of parameters has the largest correlation (in absolute value)?


 

1n) Use the Markov chain to approximate the posterior predictive distribution of the length of a dugong aged 20 years.

1o) Provide the prediction for a different dugong aged 30 years

1p) Which prediction is less precise?

(write your answers and provide your R code for the numerical solution)
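
As a rough illustration of the kind of sampler asked for in 1f), the following R sketch runs a Metropolis-within-Gibbs chain for the model above. Several things in it are assumptions rather than part of the assignment: the data are synthetic stand-ins for the dugong data, the hyperparameter values, starting values, burn-in length and proposal scale `prop_sd` are arbitrary, the normal priors are taken as truncated to the stated parameter ranges, and all function and variable names are hypothetical. Under those assumptions the full conditionals of α and β are truncated normals, that of τ² is an inverse gamma, and γ is updated with a random-walk Metropolis step.

```r
# Metropolis-within-Gibbs sketch for Y_i ~ N(alpha - beta * gamma^x_i, tau2).
# ASSUMPTIONS: synthetic data, hyperparameters, starting values, proposal scale,
# and normal priors truncated to (1, Inf) for alpha and beta.
set.seed(123)

# Synthetic stand-in for the dugong data (n = 27); replace with the real x and y.
n <- 27
x <- sort(runif(n, 1, 31))
y <- rnorm(n, mean = 2.7 - 1.1 * 0.87^x, sd = 0.1)

# Hypothetical hyperparameters: vague normal priors, diffuse inverse gamma.
sigma2_alpha <- 1e4; sigma2_beta <- 1e4; a <- 0.001; b <- 0.001

# Draw from N(m, s^2) truncated to (lower, Inf) by inverting the upper-tail
# CDF on the log scale, which stays stable far in the tail.
rtnorm_lower <- function(m, s, lower) {
  log_p_up <- pnorm(lower, m, s, lower.tail = FALSE, log.p = TRUE)
  qnorm(log_p_up + log(runif(1)), m, s, lower.tail = FALSE, log.p = TRUE)
}

# Log full conditional of gamma (flat prior on (0, 1)), up to a constant.
log_fc_gamma <- function(g, alpha, beta, tau2) {
  if (g <= 0 || g >= 1) return(-Inf)
  -sum((y - alpha + beta * g^x)^2) / (2 * tau2)
}

T_iter <- 10000
chain <- matrix(NA_real_, T_iter, 4,
                dimnames = list(NULL, c("alpha", "beta", "gamma", "tau2")))
alpha_t <- 2; beta_t <- 1.5; gamma_t <- 0.9; tau2_t <- 0.1  # starting values
prop_sd <- 0.05  # random-walk step for gamma (tuning assumption)

for (it in seq_len(T_iter)) {
  # alpha | rest: normal truncated to (1, Inf)
  prec <- n / tau2_t + 1 / sigma2_alpha
  alpha_t <- rtnorm_lower(sum(y + beta_t * gamma_t^x) / tau2_t / prec,
                          sqrt(1 / prec), 1)

  # beta | rest: normal truncated to (1, Inf)
  gx <- gamma_t^x
  prec <- sum(gx^2) / tau2_t + 1 / sigma2_beta
  beta_t <- rtnorm_lower(sum(gx * (alpha_t - y)) / tau2_t / prec,
                         sqrt(1 / prec), 1)

  # tau2 | rest: inverse gamma, drawn as the reciprocal of a gamma variate
  tau2_t <- 1 / rgamma(1, shape = a + n / 2,
                       rate = b + sum((y - alpha_t + beta_t * gx)^2) / 2)

  # gamma | rest: random-walk Metropolis step with a symmetric proposal
  g_new <- gamma_t + rnorm(1, 0, prop_sd)
  log_ratio <- log_fc_gamma(g_new, alpha_t, beta_t, tau2_t) -
    log_fc_gamma(gamma_t, alpha_t, beta_t, tau2_t)
  if (log(runif(1)) < log_ratio) gamma_t <- g_new

  chain[it, ] <- c(alpha_t, beta_t, gamma_t, tau2_t)
}

# Posterior means after an (assumed) burn-in, and running means of gamma.
burn <- 1000
post_means <- colMeans(chain[-seq_len(burn), ])
running_mean_gamma <- cumsum(chain[, "gamma"]) / seq_len(T_iter)
```

The columns of `chain` can be passed to `plot(..., type = "l")` for trace plots, and `running_mean_gamma` plotted against the iteration index gives the kind of running-average display asked for in 1h). The proposal scale usually needs tuning on the real data.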


 

2) Let us consider a Markov chain (Xt)t≥0 defined on the state space S = {1,2,3} with the following transition

 

2a) Starting at time t = 0 in the state X0 = 1, simulate the Markov chain with distribution assigned as above for 1000 consecutive time steps

2b) Compute the empirical relative frequency of the three states in your simulation

2c) Repeat the simulation 500 times and record only the final state at time t = 1000 for each of the 500 simulated chains. Compute the relative frequency of the 500 final states. What distribution are you approximating in this way? Try to formalize the difference between this point and the previous point.

2d) Compute the theoretical stationary distribution π and explain how you have obtained it

2e) Is it well approximated by the simulated empirical relative frequencies computed in (b) and (c)?

2f) What happens if we start at t = 0 from state X0 = 2 instead of X0 = 1?
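
The steps in 2a) to 2d) can be sketched in R as below. The transition matrix `P` in this sketch is purely hypothetical, since the matrix of the assignment is not reproduced here; substitute the assigned one. The helper name `simulate_chain` and the other variable names are also illustrative assumptions. The sketch simulates one long path, collects the final states of 500 independent paths, and computes the stationary distribution as the normalised left eigenvector of `P` for eigenvalue 1.

```r
set.seed(42)

# HYPOTHETICAL transition matrix on S = {1, 2, 3} (rows sum to 1);
# replace with the matrix given in the assignment.
P <- matrix(c(0.0, 0.5, 0.5,
              0.6, 0.0, 0.4,
              0.3, 0.3, 0.4), nrow = 3, byrow = TRUE)

# Simulate n_steps transitions starting from state x0; returns X_0, ..., X_n.
simulate_chain <- function(P, x0, n_steps) {
  states <- seq_len(nrow(P))
  path <- integer(n_steps + 1)
  path[1] <- x0
  for (step in seq_len(n_steps)) {
    path[step + 1] <- sample(states, 1, prob = P[path[step], ])
  }
  path
}

# 2a-2b: one path of length 1000 and the relative frequency of each state.
path <- simulate_chain(P, x0 = 1, n_steps = 1000)
freq_path <- as.numeric(table(factor(path, levels = 1:3))) / length(path)

# 2c: final state X_1000 of 500 independent chains started at X_0 = 1.
finals <- replicate(500, simulate_chain(P, 1, 1000)[1001])
freq_final <- as.numeric(table(factor(finals, levels = 1:3))) / 500

# 2d: stationary distribution, solving pi P = pi via the eigenvector of t(P)
# whose eigenvalue is (numerically) closest to 1, normalised to sum to 1.
ev <- eigen(t(P))
k <- which.min(abs(ev$values - 1))
pi_stat <- Re(ev$vectors[, k])
pi_stat <- pi_stat / sum(pi_stat)

# Side-by-side comparison for 2e.
comparison <- rbind(path = freq_path, final = freq_final, stationary = pi_stat)
```

Calling `simulate_chain(P, x0 = 2, n_steps = 1000)` gives the corresponding experiment for 2f).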