AISyE65001- Homework 02 Solved


•       Every learner should submit his/her own homework solutions. You are allowed to discuss the homework with each other (in fact, I encourage you to form groups and/or use the forums), but you may not copy someone else’s solution.

•       The homework will be peer-graded. In analytics modeling, there are often lots of different approaches that work well, and I want you to see not just your own, but also others’.

•       The homework grading scale reflects the fact that the primary purpose of homework is learning:

Rating | Meaning | Point value (out of 100)
4 | All correct (perhaps except a few details) with a deeper solution than expected | 100
3 | Most or all correct | 90
2 | Not correct, but a reasonable attempt | 75
1 | Not correct, insufficient effort | 50
0 | Not submitted | 0

Question 3.1

Using the same data set (credit_card_data.txt or credit_card_data-headers.txt) as in Question 2.2, use the ksvm or kknn function to find a good classifier:

(a)   using cross-validation (do this for the k-nearest-neighbors model; SVM is optional); and

(b)   splitting the data into training, validation, and test data sets (pick either KNN or SVM; the other is optional).
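As a starting point for both parts, here is a minimal sketch in R, not a reference solution. It assumes the kknn package is installed and that the response is a 0/1 numeric column; the helper names split_three_way and cv_accuracy, the 60/20/20 split, the 10 folds, and the response column name "R1" in the commented usage lines are all illustrative assumptions rather than anything specified by the assignment.

```r
library(kknn)  # assumed installed: install.packages("kknn")

# Part (b): randomly split a data frame into training, validation, and test
# parts. The 60/20/20 default is an illustrative choice, not a requirement.
split_three_way <- function(data, train_frac = 0.6, valid_frac = 0.2) {
  n <- nrow(data)
  idx <- sample(n)                      # random permutation of the row indices
  n_train <- floor(train_frac * n)
  n_valid <- floor(valid_frac * n)
  list(
    train = data[idx[seq_len(n_train)], , drop = FALSE],
    valid = data[idx[n_train + seq_len(n_valid)], , drop = FALSE],
    test  = data[idx[-seq_len(n_train + n_valid)], , drop = FALSE]
  )
}

# Part (a): manual k-fold cross-validation accuracy for one number of
# neighbors. `response` names a 0/1 numeric column in `data`.
cv_accuracy <- function(data, response, k_neighbors, n_folds = 10) {
  # Assign every row to one of n_folds folds at random
  folds <- sample(rep(seq_len(n_folds), length.out = nrow(data)))
  form <- as.formula(paste(response, "~ ."))
  correct <- 0
  for (f in seq_len(n_folds)) {
    train <- data[folds != f, , drop = FALSE]
    held  <- data[folds == f, , drop = FALSE]
    fit <- kknn(form, train = train, test = held,
                k = k_neighbors, scale = TRUE)
    # With a numeric response kknn returns a weighted average in [0, 1];
    # rounding turns it into a 0/1 class prediction.
    pred <- round(fit$fitted.values)
    correct <- correct + sum(pred == held[[response]])
  }
  correct / nrow(data)
}

# Hypothetical usage (file location and the column name "R1" are assumptions):
# cc <- read.table("credit_card_data-headers.txt", header = TRUE)
# sapply(c(5, 10, 15, 20), function(k) cv_accuracy(cc, "R1", k))
# parts <- split_three_way(cc)
```

With a three-way split, the validation part is used to choose among candidate models (for example, values of k), and the test part is used once at the end to estimate how the chosen model performs on unseen data.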

Question 4.1

Describe a situation or problem from your job, everyday life, current events, etc., for which a clustering model would be appropriate. List some (up to 5) predictors that you might use.

Question 4.2

The iris data set iris.txt contains 150 data points, each with four predictor variables and one categorical response. The predictors are the width and length of the sepal and petal of flowers and the response is the type of flower. The data is available from the R library datasets and can be accessed with iris once the library is loaded. It is also available at the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/Iris). The response values are only given to see how well a specific method performed and should not be used to build the model.

Use the R function kmeans to cluster the points as well as possible. Report the best combination of predictors, your suggested value of k, and how well your best clustering predicts flower type.
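One possible way to set up this comparison is sketched below in base R. The choice of the two petal measurements as predictors, the scaling step, k = 3, nstart = 25, and the seed are illustrative assumptions, not the expected answer; the species labels are used only after clustering, to evaluate the result.

```r
data(iris)       # built-in copy from the datasets package
set.seed(42)     # kmeans uses random starts; fix the seed for repeatability

# One candidate predictor subset (an assumption; try other combinations too)
predictors <- c("Petal.Length", "Petal.Width")
x <- scale(iris[, predictors])   # scaling is a modeling choice, not a requirement

# Total within-cluster sum of squares for k = 1..8, for an elbow plot
wss <- sapply(1:8, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)
# plot(1:8, wss, type = "b", xlab = "k", ylab = "Total within-cluster SS")

# Cluster with a suggested k, then compare clusters to the true species
fit <- kmeans(x, centers = 3, nstart = 25)
confusion <- table(cluster = fit$cluster, species = iris$Species)

# Fraction of points that belong to the majority species of their cluster
purity <- sum(apply(confusion, 1, max)) / nrow(iris)
print(confusion)
print(purity)
```

Repeating the last three steps for each predictor subset and each k gives a simple table from which to report the best combination.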
