You will use the Appliances energy prediction data set. Ignore the first attribute, which is a date-time variable, and remove the last attribute, which is a duplicate of the one before it. Use the first attribute (after removing the date-time variable), which denotes the Appliances Energy Use, as the response variable, and the remaining attributes as predictor variables. You have to discretize the response variable differently for the two sections, as noted below, since CSCI4390 will implement binary classification, whereas CSCI6390 will implement multiclass classification, both via an MLP.
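The column selection described above can be sketched as follows. This is only an illustration: the function name `load_response_and_predictors` is hypothetical, the two data rows are made up, and the header in the sample is assumed to mirror the real file's layout (date-time first, Appliances second, duplicate attribute last). It assumes every remaining column is numeric.

```python
import csv
import io

import numpy as np


def load_response_and_predictors(csv_file):
    """Return (energy, X, names) from an open CSV file with a header row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    # Drop column 0 (date-time) and the last column (duplicate attribute).
    rows = [[float(v) for v in row[1:-1]] for row in reader if row]
    data = np.array(rows)
    energy = data[:, 0]   # Appliances Energy Use: the raw response
    X = data[:, 1:]       # every other kept column is a predictor
    names = header[2:-1]  # predictor names, aligned with the columns of X
    return energy, X, names


# Made-up sample rows in the assumed column layout (not real measurements).
sample = io.StringIO(
    "date,Appliances,lights,T1,rv1,rv2\n"
    "2016-01-11 17:00:00,60,30,19.89,13.27,13.27\n"
    "2016-01-11 17:10:00,50,30,19.89,18.60,18.60\n"
)
energy, X, names = load_response_and_predictors(sample)
print(energy, X.shape, names)
```

For the real data, the same function would be called on a file opened with `open(path, newline="")`, where `path` points to your local copy of the data set.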
CSCI4390: Binary Classification
You will implement binary classification via the MLP training Algorithm 25.1 (Chapter 25, page 655). This algorithm assumes sigmoid activation for the hidden layer and squared error for the loss. However, you have to use ReLU activation for the hidden layer and binary cross-entropy for the loss. Consequently, you have to modify lines 10 and 11 so that line 10 uses the cross-entropy loss and line 11 uses the ReLU derivative. See section 25.1.1 for the derivative of ReLU, and section 25.2.3 for the derivative of the binary cross-entropy loss function.
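As a rough sketch of what the two modified steps compute for a single training point, the code below assumes a single sigmoid output neuron (the text does not state the output activation; a sigmoid is the usual partner of binary cross-entropy). Under that assumption the output net gradient simplifies to `o - y`, and the hidden net gradient masks the back-propagated error with the ReLU derivative. All names (`backprop_deltas`, `W_h`, `b_h`, `W_o`, `b_o`) and the weight shapes are illustrative choices, not the book's notation; check them against Algorithm 25.1 before relying on them.

```python
import numpy as np


def relu(net):
    return np.maximum(0.0, net)


def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))


def backprop_deltas(x, y, W_h, b_h, W_o, b_o):
    """Forward pass plus the two net gradients for one point.

    Assumed shapes: x (d,), W_h (d, m), b_h (m,), W_o (m, 1), b_o (1,), y (1,).
    """
    # Forward pass: ReLU hidden layer, sigmoid output (assumption).
    net_h = b_h + W_h.T @ x
    z = relu(net_h)
    o = sigmoid(b_o + W_o.T @ z)
    # Output net gradient: sigmoid output with binary cross-entropy gives o - y.
    delta_o = o - y
    # Hidden net gradient: ReLU derivative is 1 where net_h > 0, else 0.
    delta_h = (net_h > 0).astype(float) * (W_o @ delta_o)
    return z, o, delta_o, delta_h


# Small hand-checkable example: the second hidden unit is inactive.
x = np.array([1.0, 2.0])
W_h = np.array([[1.0, -1.0], [0.5, -1.0]])
W_o = np.array([[0.5], [2.0]])
z, o, delta_o, delta_h = backprop_deltas(
    x, np.array([1.0]), W_h, np.zeros(2), W_o, np.zeros(1)
)
print(z, o, delta_o, delta_h)
```

In the example the hidden net values are 2 and -3, so only the first hidden unit passes a gradient back.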
Note that the Appliances Energy Use attribute takes values in the range [10, 1080]. However, for binary classification, we need only two values, so for the purpose of this assignment you should consider energy use less than or equal to 50 as the positive class (1), and energy use higher than 50 as the negative class (0).
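The thresholding rule can be written in one line with NumPy. The helper name `binarize_energy` is made up for this sketch.

```python
import numpy as np


def binarize_energy(energy, threshold=50):
    """Map energy use to 1 (<= threshold, positive) or 0 (> threshold, negative)."""
    energy = np.asarray(energy)
    return (energy <= threshold).astype(int)


labels = binarize_energy([10, 50, 60, 1080])
print(labels)  # [1 1 0 0]
```

Note that a value of exactly 50 falls in the positive class.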
You should shuffle the data points before selecting 70% of the data for training and 30% for testing, so that both subsets contain a similar mix of the classes.
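One possible way to do the shuffle and the 70/30 split is to draw a random permutation of the row indices and cut it once. The function name `shuffle_split` and the fixed `seed` are choices made for this sketch; the seed only makes the split reproducible. A plain shuffle makes the class proportions similar in both subsets but does not guarantee that they are identical.

```python
import numpy as np


def shuffle_split(X, y, train_frac=0.7, seed=0):
    """Shuffle the rows of X and y together, then split into train and test."""
    X = np.asarray(X)
    y = np.asarray(y)
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)              # random ordering of the row indices
    n_train = int(round(train_frac * n))   # round to avoid float truncation
    train_idx, test_idx = perm[:n_train], perm[n_train:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]


# Toy data: row i of X holds the value i, and so does y[i].
X_toy = np.arange(10).reshape(10, 1)
y_toy = np.arange(10)
X_tr, y_tr, X_te, y_te = shuffle_split(X_toy, y_toy)
print(X_tr.shape, X_te.shape)  # (7, 1) (3, 1)
```

The same function works for either section, since it only needs the label vector to be built before the split.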
CSCI6390: Multiclass Classification
You will implement multiclass classification via the deep MLP algorithm in Algorithm 25.2 (Chapter 25, page 668). You should use ReLU activation for all hidden layers, and softmax for the output layer. See section 25.1.1 for the derivative of the activation functions, and the end of section 25.4.3 for the derivative of the multiclass cross-entropy function.
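The backward pass for this setup can be sketched as below, for one training point. With a softmax output and cross-entropy loss, the output net gradient reduces to `o - y` for a one-hot `y`; each hidden layer then applies the ReLU derivative mask. The function names, the list-based layout of `nets` and `Ws`, and the weight shapes are assumptions made for illustration, so compare them with Algorithm 25.2 rather than treating this as its implementation.

```python
import numpy as np


def softmax(net):
    """Numerically stable softmax of a 1-D vector."""
    shifted = net - np.max(net)
    e = np.exp(shifted)
    return e / e.sum()


def backward_deltas(nets, Ws, o, y):
    """Net gradients for all layers, ordered from first hidden layer to output.

    nets[l] is the net input of hidden layer l (0-based).
    Ws[l + 1] is assumed to have shape (size of layer l, size of next layer);
    Ws[0] (input to first hidden layer) is not needed here.
    """
    delta = o - y  # softmax output with cross-entropy loss
    deltas = [delta]
    for l in range(len(nets) - 1, -1, -1):
        # ReLU derivative: 1 where the net input is positive, else 0.
        delta = (nets[l] > 0) * (Ws[l + 1] @ delta)
        deltas.insert(0, delta)
    return deltas


# Hand-checkable example with two hidden layers and three classes.
nets = [np.array([1.0, -1.0]), np.array([-2.0, 3.0])]
Ws = [
    None,  # input-to-hidden weights, unused in this step
    np.array([[1.0, 2.0], [3.0, 4.0]]),
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 2.0]]),
]
o = softmax(np.zeros(3))
deltas = backward_deltas(nets, Ws, o, np.array([1.0, 0.0, 0.0]))
print(deltas)
```

In the example the output gradient is (-2/3, 1/3, 1/3), and the inactive ReLU units in each hidden layer receive a gradient of zero.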
Note that the Appliances Energy Use attribute takes values in the range [10, 1080]. However, for multiclass classification, we will convert these into four classes as follows: energy use less than or equal to 30 is class c1, energy use greater than 30 but less than or equal to 50 is class c2, energy use greater than 50 but less than or equal to 100 is class c3, and finally energy use higher than 100 is class c4. You need to do this conversion to create the categorical response variable before you select the train (70%) and test (30%) subsets.
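The four-way binning can be sketched with `np.digitize`, which with `right=True` assigns index i when `bins[i-1] < x <= bins[i]`. The helper names are hypothetical, and the integer labels 0 to 3 stand in for classes c1 to c4. The one-hot encoding is an optional extra that is convenient for a softmax output.

```python
import numpy as np


def energy_to_class(energy):
    """Map energy use to 0..3, standing for classes c1..c4."""
    # Upper bounds are inclusive: <=30, (30, 50], (50, 100], >100.
    return np.digitize(np.asarray(energy), bins=[30, 50, 100], right=True)


def one_hot(labels, num_classes=4):
    """One row per point, with a 1 in the column of its class."""
    return np.eye(num_classes)[labels]


classes = energy_to_class([10, 30, 40, 50, 60, 100, 110, 1080])
print(classes)  # [0 0 1 1 2 2 3 3]
Y = one_hot(classes)
print(Y.shape)  # (8, 4)
```

The boundary values 30, 50, and 100 land in the lower class in each case, matching the "less than or equal to" wording.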
You should shuffle the data points before selecting 70% of the data for training and 30% for testing, so that both subsets contain a similar mix of the classes.