CENG499 - Homework 3

1        Introduction
In this assignment, you will have the chance to get hands-on experience with support vector machines (SVMs) and linear regression. Python is the programming language of choice for this homework.

2        Support Vector Machines (40 pts)
In this task, you will improve your comprehension of SVMs by experimenting with various kernel functions and the hyperparameter C. For both subtasks, you will employ the SVM implementation (called SVC, link) of scikit-learn, a popular machine learning library. In each subtask, you will read the corresponding dataset with the NumPy library and plot the decision boundaries with Matplotlib, a popular plotting library. No other external library is allowed.

2.1       Kernel Functions
The dataset file for this subtask is task1_A.npz, and you will read it by using the following code snippet. X is the feature matrix (100, 2), whereas y holds the corresponding labels (100,). Notice that the file is put under the directory task1 (do not change the location).

import numpy as np

data = np.load("task1/task1_A.npz")
X, y = data["X"], data["y"]

Write your code in task1_A.py. When grading, your implementation will be called as follows.

python task1_A.py

By using the dataset, your program will train 4 different SVMs, each of which has a different kernel parameter. The values you will try for the kernel parameter are 'linear', 'sigmoid', 'poly', and 'rbf'. Stick with the default values for the other parameters of the SVC. As the output, your program will show 4 decision boundary plots on the screen. We have provided an example decision boundary plot below. You are to state the axes, the kernel name, and the training accuracy in your plots.
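For illustration, a minimal sketch of how task1_A.py could be structured follows. The plot_decision_boundary helper, its grid resolution, and the axis labels x1/x2 are our own illustrative choices, not requirements of the assignment.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC


def plot_decision_boundary(clf, X, y, title):
    # Evaluate the classifier on a dense grid and shade the predicted regions.
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                         np.linspace(y_min, y_max, 300))
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")  # training points
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.title(title)
    plt.show()


data = np.load("task1/task1_A.npz")
X, y = data["X"], data["y"]

for kernel in ["linear", "sigmoid", "poly", "rbf"]:
    clf = SVC(kernel=kernel)  # all other SVC parameters left at their defaults
    clf.fit(X, y)
    acc = clf.score(X, y)     # training accuracy
    plot_decision_boundary(clf, X, y, f"kernel={kernel}, training accuracy={acc:.2f}")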



2.2       The Hyperparameter C
The dataset file for this subtask is task1_B.npz, and you will read it by using the following code snippet. X is the feature matrix (100, 2), whereas y holds the corresponding labels (100,). Notice that the file is assumed to be under the directory task1 (do not change the location).

import numpy as np

data = np.load("task1/task1_B.npz")
X, y = data["X"], data["y"]

Write your code in task1_B.py. When grading, your implementation will be called as follows.

python task1_B.py

By using the dataset, your program will train 4 different SVMs, each of which has a different C parameter. The values you will try for the C parameter are 0.01, 0.1, 1, and 10. Use the polynomial kernel. Stick with the default values for the other parameters of the SVC. As the output, your program will show 4 decision boundary plots on the screen. We have provided an example decision boundary plot below. You are to state the axes, the C value, and the training accuracy in your plots.
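task1_B.py can mirror the sketch given in Section 2.1; only the loop changes. A compact, self-contained version under the same illustrative assumptions:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC

data = np.load("task1/task1_B.npz")
X, y = data["X"], data["y"]

for C in [0.01, 0.1, 1, 10]:
    clf = SVC(kernel="poly", C=C)  # polynomial kernel; other parameters at defaults
    clf.fit(X, y)
    acc = clf.score(X, y)          # training accuracy

    # Same illustrative mesh-grid plotting as in the Section 2.1 sketch.
    xx, yy = np.meshgrid(np.linspace(X[:, 0].min() - 1, X[:, 0].max() + 1, 300),
                         np.linspace(X[:, 1].min() - 1, X[:, 1].max() + 1, 300))
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.title(f"C={C}, training accuracy={acc:.2f}")
    plt.show()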



3        Linear Regression (60 pts)
In this section, you are going to perform regression on a dataset by using multivariate linear regression. Specifically, you will parse the input dataset, perform normalization on it, train the model with gradient descent, and compute the performance metric on the test set. You will write your code in the function linear_regression in task2.py. When grading, we will call the function with different parameters. The dataset for this task is taken from here (link). A sketch of one possible implementation is given after the list below.

•   No external library is allowed for this task.

•   As we stated above, we will use different datasets in grading. Although they have the same structure as the provided dataset, they may have a different number of features. In short, do not assume a fixed number of features.

•   To normalize both sets, use the following equation, where X is a feature, Xmin is the minimum value of the feature X, and Xmax is its maximum value. Do not apply normalization to the target value (the last column).

    $$X_{\text{norm}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

•   The performance metric for this task is RMSE (Root Mean Square Error), where m is the number of instances in the set, yi is the true value of the ith instance, and ŷi is the predicted value of the ith instance.

    $$\text{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}$$

•   For the provided dataset, the RMSE value you should achieve is 4.76 (when num_epochs is 1000 and learning_rate is 0.001).
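A minimal pure-Python sketch of how linear_regression might be structured follows. The signature (train_path, test_path, num_epochs, learning_rate), the comma-separated file format, and the read_dataset helper are assumptions, since the assignment does not spell them out; so is computing the min/max on the training set and applying them to both sets. Only the standard library is used, respecting the no-external-library rule.

import math


def linear_regression(train_path, test_path, num_epochs, learning_rate):
    # Hypothetical sketch: the real signature used in grading may differ.

    def read_dataset(path):
        # Assumed format: comma-separated features followed by the target value.
        rows = []
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line:
                    rows.append([float(v) for v in line.split(",")])
        return rows

    train, test = read_dataset(train_path), read_dataset(test_path)
    n_features = len(train[0]) - 1  # no fixed feature count is assumed

    # Min-max normalization per feature; the target (last column) is untouched.
    # Assumes x_max > x_min for every feature.
    for j in range(n_features):
        col = [row[j] for row in train]
        x_min, x_max = min(col), max(col)
        for row in train + test:
            row[j] = (row[j] - x_min) / (x_max - x_min)

    # Gradient descent on the mean squared error.
    weights = [0.0] * n_features
    bias = 0.0
    m = len(train)
    for _ in range(num_epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row in train:
            pred = sum(w * x for w, x in zip(weights, row[:-1])) + bias
            err = pred - row[-1]
            for j in range(n_features):
                grad_w[j] += err * row[j]
            grad_b += err
        for j in range(n_features):
            weights[j] -= learning_rate * (2 / m) * grad_w[j]
        bias -= learning_rate * (2 / m) * grad_b

    # RMSE on the test set.
    sq_err = sum((sum(w * x for w, x in zip(weights, row[:-1])) + bias - row[-1]) ** 2
                 for row in test)
    return math.sqrt(sq_err / len(test))

Under these assumptions, a call such as linear_regression("train.csv", "test.csv", 1000, 0.001) would return the test RMSE (the file names here are placeholders, not the graders' actual arguments).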
