DSC423 - Data Analysis and Regression, Assignment 08 - Residual Analysis

1.       Short Essay (10 points).  Read the short PDF on George Box.  Explain in your own words the significance of “all models are wrong, but some are useful” as if you were interviewing for a job in data science.

2.       Previously, you used the PGA Tour dataset to predict Prize Money.  Apply a log transformation to Prize Money to create a new response variable.  Apply your knowledge of regression analysis to fit a regression model using the remaining predictors in your dataset.  If necessary, remove the non-significant variables.  Remove one variable at a time (the variable with the largest p-value first) and refit the model after each removal, until all remaining variables are significant.

a.       (10 points) Check for multicollinearity.  Explain your process.

b.       (10 points) Compare this model to the one you made in the previous assignment.  How did performing a log transformation impact the quality of the model?  Why? 

c.       (10 points) Analyze and discuss the residual plots. 

d.       (10 points) Determine whether there are any outliers and/or influential points.  If there are points in the dataset that need to be investigated, give one or more reasons to support each point chosen.  Discuss your answer.
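
The checks named in parts (a) and (d) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data below are synthetic stand-ins, not the PGA Tour dataset, and the names `x1`, `x2`, `x3`, `log_prize`, `fit_ols`, `vif`, and `influence` are hypothetical. The cutoffs in the final comments (VIF above 10, Cook's distance above 4/n, leverage above 2p/n) are common rules of thumb, not requirements of the assignment.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept. Returns coefficients, residuals, leverage."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverage = diagonal of the hat matrix, computed from the QR factor of A.
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q**2, axis=1)
    return beta, resid, leverage

def vif(X):
    """Variance inflation factor per column: 1 / (1 - R^2) from regressing it on the others."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        _, resid, _ = fit_ols(others, X[:, j])
        r2 = 1.0 - resid @ resid / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

def influence(X, y):
    """Internally studentized residuals, leverage, and Cook's distance."""
    _, resid, h = fit_ols(X, y)
    n, p = len(y), X.shape[1] + 1          # p counts the intercept
    mse = resid @ resid / (n - p)
    student = resid / np.sqrt(mse * (1.0 - h))
    cooks = student**2 * h / (p * (1.0 - h))
    return student, h, cooks

# Synthetic stand-in data (an assumption, not the assignment's dataset).
rng = np.random.default_rng(0)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1: collinear on purpose
x3 = rng.normal(size=n)
x3[0] = 6.0                                # unusual predictor value: high leverage
prize = np.exp(10 + 0.5 * x1 + 0.3 * x3 + rng.normal(scale=0.2, size=n))
log_prize = np.log(prize)                  # log-transformed response
log_prize[0] += 3.0                        # planted unusual response for row 0

X = np.column_stack([x1, x2, x3])
vif_values = vif(X)
student, leverage, cooks = influence(X, log_prize)

print("VIF per predictor:", np.round(vif_values, 1))
print("Rows with Cook's distance > 4/n:", np.where(cooks > 4 / n)[0])
print("Rows with leverage > 2p/n:", np.where(leverage > 2 * 4 / n)[0])
```

A large VIF suggests dropping or combining one of the correlated predictors before interpreting coefficients. A point flagged by both leverage and Cook's distance is a reasonable candidate to investigate, with the reason stated for each point as part (d) asks.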
