Assignment 2: Linear Regression Solved

Task 1: Simple Linear Regression
1. Download the file ‘insurance.csv’ from our class Blackboard site.
2. Read this file into your R environment. Show the step that you used to accomplish this.

3. Filter the dataframe to create a new dataframe that only contains the records of people who are not smokers. Show the code that you used to do this.
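Steps 2 and 3 might look roughly like the sketch below. It is self-contained, so it first writes a tiny made-up CSV to a temporary file; for the real assignment you would point read.csv() at insurance.csv instead. The column name smoker and its "yes"/"no" coding are assumptions about the file, not something stated in the assignment.

```r
# Made-up rows standing in for insurance.csv (illustration only).
toy <- data.frame(
  age     = c(19, 33, 45, 52, 28, 61),
  smoker  = c("yes", "no", "no", "yes", "no", "no"),
  charges = c(16884, 4449, 8240, 30000, 3866, 13000)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Step 2: read the file into a data frame.
insurance <- read.csv(csv_path, stringsAsFactors = FALSE)

# Step 3: keep only the non-smokers.
# Assumes a `smoker` column coded "yes"/"no".
nonsmokers <- insurance[insurance$smoker == "no", ]
nrow(nonsmokers)
```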

– You will use this new dataframe for the rest of the assignment –
4. Using ggplot, create a scatterplot to depict the relationship between the input variable age and the output variable charges. Show your scatterplot, along with the code that you used to build it. What does this scatterplot suggest about the relationship between the two variables? Why (or why not) does this make intuitive sense to you?
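A minimal sketch of such a scatterplot follows. The data frame here is simulated so that the block runs on its own; the seed, the sample size, and the numbers are all made up and say nothing about the real insurance data.

```r
library(ggplot2)

# Simulated stand-in for the non-smoker data frame (illustration only).
set.seed(42)  # arbitrary seed, not an assigned one
nonsmokers <- data.frame(age = sample(18:64, 50, replace = TRUE))
nonsmokers$charges <- 1000 + 250 * nonsmokers$age + rnorm(50, sd = 1500)

# Age on the x-axis (input), charges on the y-axis (outcome).
p <- ggplot(nonsmokers, aes(x = age, y = charges)) +
  geom_point(alpha = 0.6) +
  labs(title = "Charges vs. age (toy data)", x = "Age", y = "Charges")
```

Calling print(p), or typing p at the console, renders the plot.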


5. Find the correlation between age and charges. Show the code that you used, and the results from your console, in a screenshot.
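The correlation can be computed with base R's cor(). The sketch below uses made-up data in which charges rise exactly linearly with age, so the result is 1 by construction; real data will give a value somewhere between -1 and 1.

```r
# Made-up, perfectly linear data (illustration only).
toy <- data.frame(age = c(20, 30, 40, 50, 60))
toy$charges <- 2000 + 250 * toy$age

# Pearson correlation between the two columns.
r <- cor(toy$age, toy$charges)
r
```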

6. Using your assigned seed value, create a data partition. Assign approximately 60% of the records to your training set, and the other 40% to your validation set. Show the code that you used to do this.
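One common way to build a 60/40 partition in base R is to sample row indices after setting the seed. In the sketch below the seed 123 is a placeholder for your assigned value, and df is a stand-in for the non-smoker data frame.

```r
set.seed(123)  # placeholder; substitute your assigned seed value

# Stand-in for the non-smoker data frame (illustration only).
df <- data.frame(id = 1:100)

# Randomly pick 60% of the row indices for training; the rest go to validation.
train_idx  <- sample(nrow(df), size = round(0.6 * nrow(df)))
training   <- df[train_idx, , drop = FALSE]
validation <- df[-train_idx, , drop = FALSE]

c(training = nrow(training), validation = nrow(validation))
```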

7. Using your training set, create a simple linear regression model, with the input variable age and the outcome variable charges. Show the step(s) that you used to do this. Include a screenshot of the summary of your model, along with the code you used to generate that summary.
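Fitting and summarizing the model takes two calls, lm() and summary(). The training set in this sketch is simulated (made-up seed and numbers) so that the block runs on its own.

```r
# Simulated stand-in for the training set (illustration only).
set.seed(1)  # arbitrary seed, not an assigned one
training <- data.frame(age = sample(18:64, 60, replace = TRUE))
training$charges <- 1000 + 250 * training$age + rnorm(60, sd = 1500)

# Simple linear regression: charges as the outcome, age as the only input.
model <- lm(charges ~ age, data = training)

# Coefficients, standard errors, R-squared, and related statistics.
model_summary <- summary(model)
model_summary
```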

8. What is the regression equation generated by your model? Make up a hypothetical input value and explain what it would predict as an outcome. To show the predicted outcome value, you can either use a function in R, or just explain what the predicted outcome would be, based on the regression equation and some simple math.
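The regression equation has the form charges = b0 + b1 * age, where b0 is the intercept and b1 is the age coefficient. The sketch below shows both routes to a prediction, the arithmetic and predict(), on made-up data built as charges = 2000 + 250 * age, so the fitted coefficients are known in advance. The input age of 35 is a hypothetical value.

```r
# Made-up, perfectly linear data (illustration only).
toy <- data.frame(age = c(20, 30, 40, 50, 60))
toy$charges <- 2000 + 250 * toy$age

model <- lm(charges ~ age, data = toy)
b0 <- coef(model)[["(Intercept)"]]  # intercept
b1 <- coef(model)[["age"]]          # slope: change in charges per year of age

# Route 1: plug the hypothetical age into the equation by hand.
by_hand <- b0 + b1 * 35

# Route 2: let predict() do the same arithmetic.
by_predict <- predict(model, newdata = data.frame(age = 35))
```

Both routes give 10750 here, because 2000 + 250 * 35 = 10750.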


9. Using the accuracy() function from the forecast package, assess the accuracy of your model against both the training set and the validation set. What do you notice about these results? Describe your findings in a couple of sentences.
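A sketch of the comparison, assuming the forecast package is installed: accuracy() is given a vector of predictions and the matching vector of actual values. The two data sets are simulated from the same made-up process, so they only illustrate the mechanics.

```r
library(forecast)

# Simulated training and validation sets (illustration only).
set.seed(1)  # arbitrary seed, not an assigned one
make_data <- function(n) {
  d <- data.frame(age = sample(18:64, n, replace = TRUE))
  d$charges <- 1000 + 250 * d$age + rnorm(n, sd = 1500)
  d
}
training   <- make_data(60)
validation <- make_data(40)

model <- lm(charges ~ age, data = training)

# Predicted vs. actual charges, once per partition.
train_acc <- accuracy(predict(model, training), training$charges)
valid_acc <- accuracy(predict(model, validation), validation$charges)

# Stack the two rows for a side-by-side comparison.
rbind(training = train_acc[1, ], validation = valid_acc[1, ])
```

Error metrics such as RMSE and MAE that are much worse on the validation set than on the training set would suggest overfitting; similar values suggest the model generalizes.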

Task 2: K-Nearest Neighbors
The model that we’ll build will aim to predict which species of fish is being sold at a popular urban fish market, using only the numeric attributes as inputs. The outcome variable of this model will be Species. The numeric attributes are described below:
Weight = weight of fish in grams (g)
Length1 = vertical length in cm
Length2 = diagonal length in cm
Length3 = cross length in cm
Height = height in cm
Width = diagonal width in cm
1. Download the file ‘fishmarket.csv’ from our class Blackboard site.
2. Read this file into your R environment. Show the step that you used to accomplish this.

3. Using your assigned seed value (from Assignment 2), partition your data into training (60%) and validation (40%) sets. Show the step(s) that you used to do this.

4. Make up a fake fish (yes, really!)
a. Give your fish a name (there’s no R code needed here, and you won’t use the name when you run k-nn… but give the fish a name anyway and just write it here).

b. Use the runif() function to give your fish values for each of the six numeric attributes. Use the min and max values from your training set as the lower and upper boundaries for runif().
# Draw one random value per attribute, bounded by the training set's min and max.
Weight <- runif(1, min(training$Weight), max(training$Weight))
Length1 <- runif(1, min(training$Length1), max(training$Length1))
Length2 <- runif(1, min(training$Length2), max(training$Length2))
Length3 <- runif(1, min(training$Length3), max(training$Length3))
Height <- runif(1, min(training$Height), max(training$Height))
Width <- runif(1, min(training$Width), max(training$Width))

# Store the fake fish's attribute values as a one-row data frame.
balloon_molly <- data.frame(Weight = 172, Length1 = 12,
                            Length2 = 43,
                            Length3 = 48,
                            Height = 10.841,
                            Width = 7)
5. Normalize your data using the preProcess() function from the caret package. Use Table 7.2 from the book as a guide for this. Show the step(s) that you used to do this.
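The normalization step can be sketched as follows, assuming the caret package is installed. To keep the block short it uses only two of the six numeric columns, with made-up values; new_fish is a hypothetical stand-in for the fake fish. The key point is that the centering and scaling parameters are learned from the training set alone and then applied to everything else.

```r
library(caret)

# Made-up training rows with two of the six numeric columns (illustration only).
training <- data.frame(
  Species = factor(c("Perch", "Perch", "Bream", "Bream", "Bream")),
  Weight  = c(200, 300, 400, 500, 600),
  Height  = c(6, 8, 10, 12, 14)
)
new_fish <- data.frame(Weight = 400, Height = 10)  # hypothetical fake fish
num_cols <- c("Weight", "Height")

# Learn the mean and standard deviation of each column from the training set only.
norm_values <- preProcess(training[, num_cols], method = c("center", "scale"))

# Apply the same transformation to the training data and to the new fish.
training_norm <- training
training_norm[, num_cols] <- predict(norm_values, training[, num_cols])
new_fish_norm <- predict(norm_values, new_fish)
```

The new fish sits exactly at the training means in this toy example, so both of its normalized values come out as 0.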


6. Using the knn() function from the FNN package, and using a k-value of 7, generate a predicted classification for your fish. Show the step(s) that you used to do this, along with the output in the console. What Species was your fish predicted to belong to? Also, who were your fish’s 7 nearest neighbors? What Species did they belong to? Show the step(s) that you used to find this out, along with a screenshot of the output in the console.
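A sketch of the classification and the neighbor lookup, assuming the FNN package is installed. The training data here is a made-up, already-normalized set with two columns and two well-separated species, and new_fish_norm is a hypothetical normalized fake fish. The object returned by FNN's knn() carries an "nn.index" attribute holding the row positions of the nearest neighbors in the training set.

```r
library(FNN)

# Made-up, already-normalized training data (illustration only).
bream_w <- c(0.8, 0.9, 1.0, 1.1, 1.2, 0.8, 0.9, 1.0, 1.1, 1.2)
bream_h <- c(0.9, 1.0, 1.1, 1.0, 0.9, 1.2, 1.3, 1.2, 1.1, 1.0)
train_x <- data.frame(Weight = c(bream_w, -bream_w),
                      Height = c(bream_h, -bream_h))
train_y <- factor(rep(c("Bream", "Perch"), each = 10))
new_fish_norm <- data.frame(Weight = 0.9, Height = 1.1)  # hypothetical fake fish

# Classify the new fish by majority vote among its 7 nearest neighbors.
nn <- knn(train = train_x, test = new_fish_norm, cl = train_y, k = 7)
predicted <- as.character(nn)

# Row positions of the 7 neighbors in the training set, and their species.
neighbor_idx     <- as.vector(attr(nn, "nn.index"))
neighbor_species <- train_y[neighbor_idx]
```

In this toy set the new fish sits inside the Bream cluster, so the prediction and all seven neighbors are Bream.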


7a. Use your validation set to help you determine an optimal k-value. Use Table 7.3 from the textbook as a guide here. Show the step(s) that you used to do this, along with the output in the console.

Console output (the same two warnings, repeated many times):
longer object length is not a multiple of shorter object length
Levels are not in the same order for reference and data. Refactoring data to match.

7b. Using either the base graphics package or ggplot, make a scatterplot with the various k values that you used in 7a on your x-axis, and the accuracy metrics on the y-axis.
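Steps 7a and 7b can be sketched together: loop over candidate k values, score each one on the validation set, then plot accuracy against k. This is a simplified stand-in for the textbook's approach, computing accuracy directly as the share of correct predictions. It assumes the FNN and ggplot2 packages are installed; the data is simulated, and the seed and the range 1 to 14 are arbitrary choices.

```r
library(FNN)
library(ggplot2)

# Simulated, already-normalized training and validation sets (illustration only).
set.seed(1)  # arbitrary seed, not an assigned one
make_fish <- function(n) {
  data.frame(
    Species = factor(rep(c("Bream", "Perch"), each = n)),
    Weight  = c(rnorm(n, 0.7), rnorm(n, -0.7)),
    Height  = c(rnorm(n, 0.7), rnorm(n, -0.7))
  )
}
train_norm <- make_fish(30)
valid_norm <- make_fish(20)
num_cols   <- c("Weight", "Height")

# 7a: validation accuracy for each candidate k.
accuracy_df <- data.frame(k = 1:14, accuracy = NA_real_)
for (i in seq_len(nrow(accuracy_df))) {
  pred <- knn(train = train_norm[, num_cols],
              test  = valid_norm[, num_cols],  # one prediction per validation row
              cl    = train_norm$Species,
              k     = accuracy_df$k[i])
  accuracy_df$accuracy[i] <-
    mean(as.character(pred) == as.character(valid_norm$Species))
}
best_k <- accuracy_df$k[which.max(accuracy_df$accuracy)]

# 7b: k on the x-axis, validation accuracy on the y-axis.
p <- ggplot(accuracy_df, aes(x = k, y = accuracy)) +
  geom_point() +
  scale_x_continuous(breaks = accuracy_df$k) +
  labs(x = "k", y = "Validation accuracy")
```

A "longer object length is not a multiple of shorter object length" warning at this step usually means the predictions and the reference labels have different lengths, for example when knn() is run on something other than the full validation set.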

8. Re-run your knn() function with this new k-value. What result did you obtain? Was it different from the one you saw in Step 9? Show the step(s) that you used to do this, along with the output in the console. Also, what were the outcome classes (Species) for each of your fish’s k-nearest neighbors?

# With the new k-value, knn() now classifies 'balloon_molly' as a Perch.
# Using k = 3 would indicate a new species. However, the index would never change from 1:95.