COEN240 Homework 1 Solved

Problem 1 You have a set of $N$ training inputs $\mathbf{x}_n \in \mathbb{R}^M$, $n = 1, 2, \ldots, N$, $N \gg M$. The target outputs of the training inputs are $t_n \in \mathbb{R}$, $n = 1, 2, \ldots, N$. Build a linear regression model to predict the target value by $\mathbf{w}^T \mathbf{x}_n$.

Derive the closed-form solution for the weight vector $\mathbf{w} \in \mathbb{R}^M$ that minimizes the error function

$$E(\mathbf{w}) = \sum_{n=1}^{N} \left( \mathbf{w}^T \mathbf{x}_n - t_n \right)^2.$$

 

Problem 2 The Pima Indians diabetes data set (pima-indians-diabetes.xlsx) is a data set used to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. All patients here are females at least 21 years old of Pima Indian heritage. The dataset consists of M = 8 attributes and one target variable, Outcome (1 represents diabetes, 0 represents no diabetes). The 8 attributes include Pregnancies, Glucose, BloodPressure, BMI, insulin level, age, and so on. There are N=768 data samples.

Randomly select n samples from the diabetes class and n samples from the no-diabetes class, and use them as the training samples. The remaining data samples are the test samples. Build a linear regression model as described in Problem 1 with the training set, and test your model on the test samples to predict whether or not a test patient has diabetes. Assume the predicted outcome of a test sample is $\hat{t}$: if $\hat{t} \geq 0.5$ (closer to 1), classify it as diabetes; if $\hat{t} < 0.5$ (closer to 0), classify it as no diabetes. Run 1000 independent experiments, and calculate the prediction accuracy rate as (number of correct predictions / total number of predictions) × 100%. Let n=40, 80, 120, 160, 200, and plot the accuracy rate versus n. Comment on the result. Attach the code at the end of the homework.
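The experiment in Problem 2 could be organized roughly as follows. This is a hedged sketch, not the reference solution: it runs on synthetic stand-in data instead of pima-indians-diabetes.xlsx, the helper names `run_trial` and `accuracy_vs_n` are hypothetical, the appended constant-1 feature (a bias term) is an assumption not stated in the problem, and the 0.5 threshold treats a prediction of exactly 0.5 as diabetes.

```python
import numpy as np

def run_trial(X, y, n, rng):
    """One experiment: n random training samples per class, the rest for testing."""
    pos = rng.permutation(np.flatnonzero(y == 1))
    neg = rng.permutation(np.flatnonzero(y == 0))
    train = np.concatenate([pos[:n], neg[:n]])
    test = np.concatenate([pos[n:], neg[n:]])
    # Least-squares fit of w on the training set.
    w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    # Threshold the real-valued prediction at 0.5 to get a class label.
    pred = (X[test] @ w >= 0.5).astype(float)
    return np.mean(pred == y[test])

def accuracy_vs_n(X, y, ns, trials, rng):
    """Mean test accuracy in percent for each training size n."""
    return {n: 100.0 * np.mean([run_trial(X, y, n, rng) for _ in range(trials)])
            for n in ns}

# Synthetic stand-in data (NOT the real dataset): 768 samples, 8 attributes,
# with class sizes chosen so that each class has more than 200 samples.
rng = np.random.default_rng(0)
y = np.concatenate([np.ones(268), np.zeros(500)])
X = rng.normal(size=(768, 8)) + 1.5 * y[:, None]  # class-dependent mean shift
X = np.hstack([X, np.ones((768, 1))])             # assumption: bias feature

# The assignment asks for 1000 trials; 20 keeps this illustration fast.
acc = accuracy_vs_n(X, y, [40, 80, 120, 160, 200], trials=20, rng=rng)
for n, a in acc.items():
    print(f"n={n}: {a:.1f}%")
```

To use the real data, `X` and `y` would be loaded from the spreadsheet instead, with the trial count set to 1000 and the resulting dictionary plotted as accuracy against n.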

 
