In this homework, you will implement a decision tree regression algorithm in Matlab, Python, or R. Here are the steps you need to follow:
Read Sections 9.2.2 and 9.3 from the textbook.
You are given a univariate regression data set, which contains 133 data points, in the file named csv. Divide the data set into two parts by assigning the first 100 data points to the training set and the remaining 33 data points to the test set.
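As a rough illustration of the split described above, here is a minimal Python sketch. The helper names `load_xy` and `split_train_test` are hypothetical, and the file layout (one header row followed by two numeric columns, x then y) is an assumption rather than something the assignment states, so adjust the reader to match the actual file.

```python
import csv


def load_xy(path):
    """Read (x, y) pairs from a CSV file.

    Assumes one header row and two numeric columns (x, then y);
    this layout is a guess and may not match the provided file.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the assumed header row
        return [(float(row[0]), float(row[1])) for row in reader if row]


def split_train_test(rows, n_train=100):
    """Put the first n_train rows in the training set and the rest in the test set."""
    return rows[:n_train], rows[n_train:]
```

The split keeps the original row order and does not shuffle, because the assignment asks for the first 100 points specifically.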
Implement a decision tree regression algorithm using the following pre-pruning rule: If a node has 𝑃 or fewer data points, convert this node into a terminal node and do not split further, where 𝑃 is a user-defined parameter.
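One possible shape for such a tree, sketched in plain Python for a single input variable. This is not the textbook's reference implementation: the names `sse`, `fit_tree`, and `predict`, the dictionary-based node layout, and the choice of midpoints between consecutive distinct x values as split thresholds are all assumptions made for illustration. The pre-pruning rule is the one stated above: a node with P or fewer points becomes a terminal node.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(xs, ys, P):
    """Grow a regression tree; a node with P or fewer points becomes a leaf."""
    distinct = sorted(set(xs))
    # Pre-pruning rule, plus a guard for nodes whose x values are all equal.
    if len(ys) <= P or len(distinct) < 2:
        return {"value": sum(ys) / len(ys)}
    best = None  # (total squared error, lower boundary, threshold)
    for lo, hi in zip(distinct, distinct[1:]):
        left = [y for x, y in zip(xs, ys) if x <= lo]
        right = [y for x, y in zip(xs, ys) if x > lo]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, lo, (lo + hi) / 2)  # assumed: midpoint threshold
    _, lo, threshold = best
    left = [(x, y) for x, y in zip(xs, ys) if x <= lo]
    right = [(x, y) for x, y in zip(xs, ys) if x > lo]
    return {
        "threshold": threshold,
        "left": fit_tree([x for x, _ in left], [y for _, y in left], P),
        "right": fit_tree([x for x, _ in right], [y for _, y in right], P),
    }


def predict(tree, x):
    """Walk down from the root to a leaf and return that leaf's mean."""
    while "threshold" in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["value"]
```

Each terminal node predicts the mean of the training targets that reach it, and each split is chosen to minimize the total squared error of the two children. The exhaustive search costs quadratic time per node, which is acceptable for 100 training points.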
Learn a decision tree by setting the pre-pruning parameter 𝑃 to 15. Draw training data points, test data points, and your fit in the same figure. Your figure should be similar to the following figure.
[Figure: training data points, test data points, and the fitted decision tree for P = 15]
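Calculate the root mean squared error (RMSE) of your decision tree for test data points. The formula for RMSE can be written as
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N_{\text{test}}} (y_i - \hat{y}_i)^2}{N_{\text{test}}}}$$
where $N_{\text{test}}$ is the number of test data points, $y_i$ is the observed value of the $i$-th test point, and $\hat{y}_i$ is the value predicted by the tree for that point.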
Your output should be similar to the following sentence.
RMSE is 26.8777 when P is 15
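The RMSE calculation and the output sentence can be sketched as follows. The function names `rmse` and `format_report` are hypothetical, and the numbers in the usage lines are toy values chosen only to make the arithmetic easy to check; they are not results from the assignment's data set.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def format_report(value, P):
    """Build the sentence in the format the assignment shows."""
    return f"RMSE is {value:.4f} when P is {P}"


# Toy check: a single error of 4 over 4 points gives sqrt(16 / 4) = 2.
toy_value = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0])
print(format_report(toy_value, 15))
```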
Learn decision trees by setting the pre-pruning parameter 𝑃 to 5, 10, 15, …, 50. Draw RMSE for test data points as a function of 𝑃. Your figure should be similar to the following figure.
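A possible way to organize the sweep over P is sketched below. The function `rmse_by_p` is hypothetical, and it assumes you pass in your own `fit(x_train, y_train, P)` and `predict(model, x)` functions with those signatures; neither signature is prescribed by the assignment. The plotting helper assumes matplotlib is installed and imports it only when called.

```python
import math


def rmse_by_p(fit, predict, train, test, p_values=range(5, 55, 5)):
    """Return a list of (P, test RMSE) pairs, one per value of P.

    fit and predict are caller-supplied; train and test are (xs, ys) tuples.
    """
    x_train, y_train = train
    x_test, y_test = test
    results = []
    for P in p_values:
        model = fit(x_train, y_train, P)
        errors = [(y - predict(model, x)) ** 2 for x, y in zip(x_test, y_test)]
        results.append((P, math.sqrt(sum(errors) / len(errors))))
    return results


def plot_rmse(results):
    """Plot test RMSE against P (requires matplotlib)."""
    import matplotlib.pyplot as plt

    ps = [p for p, _ in results]
    values = [v for _, v in results]
    plt.plot(ps, values, "-o")
    plt.xlabel("P")
    plt.ylabel("RMSE")
    plt.show()
```

The default `range(5, 55, 5)` yields 5, 10, 15, ..., 50, matching the values listed in the step above.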
What to submit: You need to submit your source code