Note 1: Your submission header must follow the format shown in the rounded rectangle above.
Note 4: All submitted materials must be legible, and figures/diagrams must be of good quality.
Note 5: Please use and check the Blackboard discussion board for further instructions, questions, answers, and hints.
1. Consider the 1-D dataset below and the bootstrap samples (bagging rounds 1 to 5) randomly generated during the bagging process. Show how a bagging algorithm can perfectly classify this data by drawing and writing the decision stumps for each round, the summary table of the trained decision stumps, and the combination table of your base classifiers with the final predictions. Hint: you might need to test alternative but equally accurate decision stumps on your training set to reach maximum accuracy.
Dataset:
x:  1   2   3   4   5
y: -1  -1   1   1  -1

Bootstrap samples:
Round 1: x = 1, 1, 2, 4, 5
Round 2: x = 3, 3, 4, 4, 5
Round 3: x = 1, 2, 2, 5, 5
Round 4: x = 1, 3, 4, 4, 5
Round 5: x = 1, 2, 3, 3, 4
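For intuition only (not a substitute for the hand-drawn stumps and tables the question asks for), here is a minimal sketch, assuming scikit-learn is available, that bags five decision stumps on this dataset. Note that sklearn draws its own random bootstrap samples rather than using the five fixed rounds listed above.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.array([[1], [2], [3], [4], [5]])   # the 1-D dataset above
y = np.array([-1, -1, 1, 1, -1])

# Bag five depth-1 trees (decision stumps): each stump is trained on its
# own bootstrap sample, and the ensemble predicts by majority vote.
bag = BaggingClassifier(DecisionTreeClassifier(max_depth=1),
                        n_estimators=5, random_state=0)
bag.fit(X, y)
pred = bag.predict(X)
print(pred)
```

A single stump can only split the axis once, so it cannot fit this labeling alone; the majority vote over differently-trained stumps is what can recover the full pattern.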
2. [12 points] Consider the different 1-D dataset below and rounds 1 to 3 randomly generated during the boosting process. Show how a boosting algorithm can perfectly classify this data by drawing and writing the decision stumps and weights for each round, the summary table of the trained decision stumps, and the combination table of your base classifiers with the weighted final predictions. Hint: there is a single best (most accurate) decision stump for each round.
Dataset:
x:  1   2   3   4   5
y:  1   1  -1  -1   1

Boosting rounds (sampled records):
Round 1: x = 1, 2, 3, 4, 4
Round 2: x = 5, 5, 5, 5, 5
Round 3: x = 3, 3, 4, 4, 5
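Again for intuition only, a sketch (assuming scikit-learn) of boosting three stumps on this dataset with AdaBoost; the library reweights the samples itself rather than using the fixed rounds shown above.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X = np.array([[1], [2], [3], [4], [5]])   # the 1-D dataset above
y = np.array([1, 1, -1, -1, 1])

# Three boosting rounds over depth-1 trees: each round up-weights the
# points the previous stump misclassified, and the final prediction is
# a weighted vote of the three stumps.
boost = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                           n_estimators=3, random_state=0)
boost.fit(X, y)
pred = boost.predict(X)
print(pred)
```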
3. [15 points] Complete the Python program (bagging_random_forest.py) that will read the file optdigits.tra (3,823 samples), which includes training instances of handwritten digits (optically recognized). Read the file optdigits.names to get detailed information about this dataset. Also check the files optdigits-orig.tra and optdigits-orig.names to see the original format of this data and how it was transformed to speed up the learning process (pre-processing phase). Your goal is to build a base classifier using a single decision tree, an ensemble classifier that combines multiple decision trees, and a Random Forest. To test the accuracy of those distinct models, you will use the file optdigits.tes (1,797 samples).
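A minimal sketch of the three models the program should compare. Since optdigits.tra/optdigits.tes are not included here, the sketch uses scikit-learn's bundled digits dataset (the same optically recognized handwritten-digit data) as a stand-in; in the actual program you would instead load the two files (comma-separated values, last column is the class label) with np.loadtxt and keep the given train/test split.

```python
# Sketch only: load_digits stands in for optdigits.tra / optdigits.tes.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    'single decision tree': DecisionTreeClassifier(random_state=0),
    'bagged trees': BaggingClassifier(n_estimators=20, random_state=0),
    'random forest': RandomForestClassifier(n_estimators=20, random_state=0),
}
scores = {}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)                 # train on the training split
    scores[name] = clf.score(X_te, y_te)  # accuracy on the held-out split
    print(f'{name}: {scores[name]:.3f}')
```

Typically the single tree is the weakest of the three and the Random Forest the strongest, which is the comparison the question is after.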
4. [20 points] Say you are given the training data set shown below. This is a binary classification task in which the instances are described by two integer-valued attributes.
a. [2 points] Draw the decision boundary and its parallel hyperplanes for a linear SVM with maximum margin (hard margin formulation) and identify the support vectors.
b. [2 points] If a black circle is added as a training sample in the position (7,5), does this affect the previously learned decision boundary? Explain why.
c. [2 points] If a yellow circle is added as a training sample in the position (4,2), does this affect the previously learned decision boundary? Explain why.
d. [2 points] If a black circle is added as a test sample in the position (7,5), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
e. [2 points] If a black circle is added as a test sample in the position (6,4), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
f. [2 points] If a yellow circle is added as a test sample in the position (4,2), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
g. [2 points] If a yellow circle is added as a test sample in the position (5,3), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
h. [2 points] If a black circle is added as a test sample in the position (5,3), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
i. [2 points] If a yellow circle is added as a test sample in the position (6,4), will this sample be classified correctly according to the previously learned decision boundary? Explain why.
j. [2 points] If a black circle is added as a training sample in the position (4,4), how will this affect the decision boundary when C = 1 and when C = ∞? Consider the soft-margin formulation.
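The effect asked about in (j) can be explored numerically. A minimal sketch assuming scikit-learn, with hypothetical points that are NOT the dataset from the question's figure: two separable clusters plus one added point near the opposite class, fit with a small C and with a very large C (approximating C = ∞).

```python
# Hypothetical stand-in data, not the figure's dataset.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 1], [1, 2],   # one class, lower-left cluster
              [5, 5], [6, 5], [5, 6],   # other class, upper-right cluster
              [4, 4]])                  # added point near the other class
y = np.array([-1, -1, -1, 1, 1, 1, -1])

margins = {}
for C in (1, 1e6):                      # 1e6 approximates C = infinity
    clf = SVC(kernel='linear', C=C).fit(X, y)
    w = clf.coef_[0]
    margins[C] = 2 / np.linalg.norm(w)  # margin width = 2 / ||w||
    print(f'C={C:g}: w={w}, b={clf.intercept_[0]:.3f}, '
          f'margin width={margins[C]:.3f}')
```

With a huge C the optimizer must fit the awkward point and the margin shrinks; with a small C it may accept some slack on that point, so the soft-margin width is never smaller than the hard-margin width.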
5. [11 points] Consider the following 1-dimensional data with two classes:
a. [3 points] Find the decision boundary of a linear SVM on this data (hard-margin formulation) and identify the support vectors (write the x coordinates to provide your answers).
b. [4 points] Find the solution parameters w and b for this linear SVM and the width of the margin. Hint: place the identified support vectors (positive and negative) into the formula y_i(w·x_i + b) = 1, since you know this formula holds for them.
c. [4 points] Suppose we remove the point (1,+) from this training set and train the SVM again. Find the new values of the solution parameters w and b and the width of the margin.
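As an illustration of the hint in (b), here is the calculation carried out with hypothetical support vectors (not the ones from this question's data): suppose the positive support vector sits at x = 5 and the negative one at x = 3. Then y_i(w·x_i + b) = 1 gives the two equations

    (+1)(5w + b) = 1   =>  5w + b =  1
    (-1)(3w + b) = 1   =>  3w + b = -1

Subtracting the second from the first yields 2w = 2, so w = 1 and b = -4, and the margin width is 2/|w| = 2, which is exactly the distance between the two support vectors in 1-D.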
6. [12 points] The quadratic kernel K(x, y) = (x·y + 1)² should be equivalent to mapping each x into a six-dimensional space where

Φ(x) = (1, √2·x1, √2·x2, x1², x2², √2·x1·x2)

for the case where x = (x1, x2). Demonstrate this equivalence by answering the following questions using the data points A = (1,2), B = (2,4).
a. 𝛷(𝐴)
b. 𝛷(𝐵)
c. Φ(A)·Φ(B)
d. K(A, B). Hint: your answers for (c) and (d) should be the same. By using the kernel function, the SVM "cheats" and performs significantly fewer calculations. This is known as the "kernel trick".
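The equivalence Φ(x)·Φ(y) = K(x, y) can be checked numerically. The sketch below uses the standard six-dimensional expansion of the quadratic kernel and hypothetical points (3,1) and (0,2), deliberately different from A and B so it does not give away the answers to (a)-(d).

```python
import numpy as np

def phi(x):
    # Six-dimensional feature map induced by K(x, y) = (x.y + 1)^2 in 2-D
    x1, x2 = x
    return np.array([1.0, np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1**2, x2**2, np.sqrt(2) * x1 * x2])

def K(x, y):
    return (np.dot(x, y) + 1) ** 2

x, y = np.array([3.0, 1.0]), np.array([0.0, 2.0])  # hypothetical points
lhs = phi(x) @ phi(y)   # explicit mapping: two 6-D vectors, then a dot product
rhs = K(x, y)           # kernel trick: one 2-D dot product, then a square
print(lhs, rhs)         # both equal 9 (up to floating-point rounding)
```

The left-hand side needs two explicit 6-D mappings plus a 6-term dot product; the kernel computes the same number from a single 2-D dot product, which is the point of the trick.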
7. [15 points] Complete the Python program (svm.py) that will also read the file optdigits.tra to build multiple SVM classifiers. You will simulate a grid search, trying to find which combination of four SVM hyperparameters (c, degree, kernel, and decision_function_shape) leads to the best prediction performance. To test the accuracy of those distinct models, you will also use the file optdigits.tes. You should update and print the accuracy, together with the hyperparameters, whenever it improves, and print at the end the highest accuracy with its corresponding hyperparameters.
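A minimal sketch of the grid-search loop svm.py needs. As in question 3, scikit-learn's bundled digits data stands in for optdigits.tra/optdigits.tes, and the value grids below are illustrative placeholders, not the ones the assignment may specify.

```python
# Sketch only: load_digits stands in for the optdigits files, and the
# hyperparameter grids here are illustrative assumptions.
from itertools import product
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

best_acc, best_params = 0.0, None
for c, degree, kernel, shape in product((1, 10), (2, 3),
                                        ('poly', 'rbf'), ('ovo', 'ovr')):
    clf = SVC(C=c, degree=degree, kernel=kernel,
              decision_function_shape=shape).fit(X_tr, y_tr)
    acc = clf.score(X_te, y_te)
    if acc > best_acc:              # print only when the accuracy improves
        best_acc = acc
        best_params = (c, degree, kernel, shape)
        print(f'accuracy {acc:.4f} with C={c}, degree={degree}, '
              f'kernel={kernel}, decision_function_shape={shape}')
print('highest accuracy:', best_acc, 'with', best_params)
```

Note that `degree` only matters for the 'poly' kernel, so some grid combinations are redundant; that is normal in a brute-force grid search.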