STA442H1 S Assignment 2 Solution


1. For this question you have to simulate a dataset. Assume the outcome Y depends linearly on 50 covariates X1, X2, ..., X50. That is, the relationship is given by the following equation:
Y = β0 + β1X1 + β2X2 + ··· + β50X50 + ϵ    (1)
• Generate a training set of size 100. That is,
– Generate 100 random values of each X variable from the standard normal distribution. That is, X1 ∼ N(0,1), X2 ∼ N(0,1), ..., X50 ∼ N(0,1).
– Generate ϵ from the standard normal distribution as well: ϵ ∼ N(0,1).
– Generate the βs from uniform distributions: β1 to β20 from Uniform(0.5, 1.5) and β21 to β50 from Uniform(0.2, 0.4).
– Then generate Y using (1).
• Generate a test set of size 1000. That is,
– Generate 1000 random values of each X variable from the standard normal distribution. That is, X1 ∼ N(0,1), X2 ∼ N(0,1), ..., X50 ∼ N(0,1).
– Generate ϵ from the standard normal distribution as well: ϵ ∼ N(0,1).
– Use the same βs generated for the training set.
– Then generate Y using (1).
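The simulation steps above could be sketched in base R roughly as follows. This is only one possible setup, not the official solution: the seed, the helper name simulate_data, and the choice β0 = 0 are assumptions, since the question does not specify an intercept value.

```r
set.seed(442)  # arbitrary seed, assumed here only for reproducibility

n_train <- 100
n_test  <- 1000

# beta_1..beta_20 ~ Uniform(0.5, 1.5); beta_21..beta_50 ~ Uniform(0.2, 0.4)
beta0 <- 0  # assumption: the question does not give a value for the intercept
beta  <- c(runif(20, 0.5, 1.5), runif(30, 0.2, 0.4))

# Hypothetical helper: draws X and epsilon, then builds Y from equation (1)
simulate_data <- function(n, beta0, beta) {
  p <- length(beta)
  X <- matrix(rnorm(n * p), nrow = n, ncol = p,
              dimnames = list(NULL, paste0("X", seq_len(p))))
  eps <- rnorm(n)                           # epsilon ~ N(0, 1)
  Y <- beta0 + as.vector(X %*% beta) + eps  # equation (1)
  data.frame(Y = Y, X)
}

# The same betas are reused for both sets, as the question requires
train <- simulate_data(n_train, beta0, beta)
test  <- simulate_data(n_test, beta0, beta)
```

Each data frame has one column for Y followed by X1 to X50, so a model such as lm(Y ~ ., data = train) can be fitted on the training set and evaluated on the test set.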
2. For this problem you need to load the NHANES dataset using the following commands:
## If the packages are not installed, first run:
## install.packages('NHANES')
## install.packages('tidyverse')
library(tidyverse)
library(NHANES)
## Keep adults from the 2011_12 survey, the selected columns, and complete cases only
small.nhanes <- na.omit(NHANES[NHANES$SurveyYr == "2011_12" &
    NHANES$Age > 17, c(1, 3, 4, 8:11, 13, 25, 61)])
## Keep the first record for each subject ID
small.nhanes <- small.nhanes %>% group_by(ID) %>% filter(row_number() == 1)
These data were collected by the US National Center for Health Statistics (NCHS). The preceding code creates a small subset of the original NHANES dataset. Using this dataset, answer the following question:
Fit a mixed effects logistic regression. Only consider random intercept for subject ID. Use all the available predictors. Interpret the results.
