sc <- spark_connect(master = "local")
myremotedata <- copy_to(sc, mylocaldata, overwrite = TRUE)

# our group's formula for the best model that describes bodyfat:
# bodyfat ~ Wrist + log(Abdomen) + Weight^2
# unfortunately, ml_linear_regression has a problem with the log and exponent functions,
# so we have to simplify the formula
mymodel <- ml_linear_regression(x = myremotedata, formula = bodyfat ~ Wrist + Abdomen + Weight)
summary(mymodel)
spark_web(sc)
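
A possible workaround for the formula limitation noted in the comments, which is not what was run for this submission: compute the transformed predictors as ordinary columns with dplyr before fitting, so the formula only needs plain column names. The sketch below assumes the dplyr package is installed; the helper name add_model_terms and the column names log_Abdomen and Weight_sq are hypothetical. The Spark calls are left commented out because they need the live connection and the myremotedata table from the code above, and a model fitted this way would give different numbers from the output reported in 1b.

```r
library(dplyr)

# Hypothetical helper: adds the transformed predictors as new columns.
# It works on a local data frame; on a Spark table, dplyr translates
# log() and ^ into SQL expressions that Spark evaluates.
add_model_terms <- function(tbl) {
  tbl %>%
    mutate(
      log_Abdomen = log(Abdomen),
      Weight_sq   = Weight^2
    )
}

# With the objects defined above (requires a running Spark connection):
# mytransformed <- add_model_terms(myremotedata)
# mymodel2 <- ml_linear_regression(
#   x = mytransformed,
#   formula = bodyfat ~ Wrist + log_Abdomen + Weight_sq
# )
# summary(mymodel2)
```

Keeping the transformation in a separate function makes it easy to check on a small local data frame before running it against Spark.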

1b. Output of summary(model):

summary(mymodel)
Deviance Residuals:
     Min       1Q   Median       3Q      Max
-13.0803  -3.2463  -0.2175   3.2472   9.8018

Coefficients:
(Intercept)       Wrist     Abdomen      Weight
-27.9299169  -1.2448589   0.9751296  -0.1144609

R-Squared: 0.7277
Root Mean Squared Error: 4.358

1c. A screen capture image of the running Apache Spark web UI