A) Understand and explore a data set
Three data sets (A, B, and C) have been generated from normally distributed classes. Each data set describes a population of male and female students, where:
- The first column represents the height in feet.
- The second column represents the weight in pounds.
- The third (last) column corresponds to the gender (0 for male, and 1 for female).
Each data set contains 2,000 samples for each gender.
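To make the layout concrete, here is a minimal sketch that builds a data set with the same three columns. The class means and standard deviations are hypothetical values chosen only for illustration, and `make_dataset` is not part of the assignment; the provided sets A, B, and C should be loaded from their files instead.

```python
import random

def make_dataset(male_mean, female_mean, std, n_per_class=2000, seed=0):
    """Return rows of (height_ft, weight_lb, gender), with 0 = male and 1 = female."""
    rng = random.Random(seed)
    rows = []
    # enumerate gives label 0 to the male class and label 1 to the female class.
    for label, (mean_h, mean_w) in enumerate([male_mean, female_mean]):
        for _ in range(n_per_class):
            height = rng.gauss(mean_h, std[0])
            weight = rng.gauss(mean_w, std[1])
            rows.append((height, weight, label))
    return rows

# Hypothetical means and standard deviations, for illustration only.
data = make_dataset(male_mean=(5.8, 180.0), female_mean=(5.4, 140.0), std=(0.25, 20.0))
males = [row for row in data if row[2] == 0]
females = [row for row in data if row[2] == 1]
```

Splitting the rows by the last column, as done for `males` and `females`, is the same step needed before plotting each gender in its own color.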
For each data set, do the following:
1. Plot the data for male and female students.
2. Draw a separation line by hand. This line is a linear separator (or decision function) that separates the female students from the male students.
3. Determine the equation of this linear separator.
a. Write the definition of a neuron. Note: Think of the inequality we covered in class.
b. Determine the weights and threshold. Comment.
4. Calculate the number of false positives and false negatives (refer to the confusion matrix).
5. Calculate the accuracy, error, true positive rate, true negative rate, false positive rate, and false negative rate.
6. Compare the results for each data set and explain the differences. How are these data sets different?
Important: Treat “female” as the positive class. For example, a true positive is a sample whose actual class is “female” and whose prediction is also “female”.
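Steps 3 to 5 above can be sketched in a few lines. This is only an illustration under stated assumptions: the neuron is assumed to fire (predict female, 1) when net >= threshold, which may differ from the inequality used in class, and the weights, threshold, and four samples below are hypothetical values, not taken from sets A, B, or C.

```python
def neuron_predict(height, weight, w_height, w_weight, threshold):
    """Hard-threshold neuron: 1 (female) when net >= threshold, else 0 (male)."""
    net = w_height * height + w_weight * weight
    return 1 if net >= threshold else 0

def confusion_counts(labels, predictions):
    """Count TP, TN, FP, FN with female (1) as the positive class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return tp, tn, fp, fn

def rates(tp, tn, fp, fn):
    """Metrics from the confusion matrix; assumes both classes are present."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }

# Hypothetical samples: (height_ft, weight_lb, gender).
samples = [(5.9, 185.0, 0), (5.7, 170.0, 0), (5.3, 130.0, 1), (5.6, 160.0, 1)]

# Hypothetical separator: predict female when weight <= 150 lb,
# written as -1.0 * weight >= -150.0 to match the net >= threshold form.
predictions = [neuron_predict(h, w, 0.0, -1.0, -150.0) for h, w, _ in samples]
labels = [g for _, _, g in samples]

tp, tn, fp, fn = confusion_counts(labels, predictions)
metrics = rates(tp, tn, fp, fn)
```

In this toy example the 160 lb female falls on the male side of the line, so she is counted as a false negative. Note that accuracy and error always sum to 1, as do the TPR/FNR and TNR/FPR pairs, which is a quick consistency check on hand calculations.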
B) McCulloch-Pitts neurons
1. Create a truth table for the artificial neuron below. What is the functionality of this neuron?
2. Given the same set of weights and the functionality of the neuron determined above, what is the range of possible values for the threshold?
Note: Consider a unipolar hard-threshold activation function (the only possible inputs and outputs are 0 and 1).
Always start with the unit definition (net, output).
Hint: The truth table (similar to the one in class) should present the inequalities that demonstrate the functionality of the neuron (prove that it works as promised).
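The truth-table reasoning can be checked with a short sketch. The neuron's diagram is not reproduced here, so the weights (1, 1) and the AND target below are hypothetical stand-ins, and the unit is assumed to output 1 when net >= threshold. The helper names `mp_neuron`, `truth_table`, and `threshold_range` are illustrative only.

```python
from itertools import product

def mp_neuron(inputs, weights, threshold):
    """Unit definition: net = sum(w_i * x_i); output = 1 if net >= threshold else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0

def truth_table(weights, threshold):
    """List (inputs, net, output) for every binary input combination."""
    rows = []
    for inputs in product((0, 1), repeat=len(weights)):
        net = sum(w * x for w, x in zip(weights, inputs))
        rows.append((inputs, net, mp_neuron(inputs, weights, threshold)))
    return rows

def threshold_range(weights, target):
    """Return (low, high) with low < T <= high realizing target, or None if impossible."""
    nets_on, nets_off = [], []
    for inputs in product((0, 1), repeat=len(weights)):
        net = sum(w * x for w, x in zip(weights, inputs))
        (nets_on if target(inputs) == 1 else nets_off).append(net)
    # T must not exceed any net that should fire, and must exceed every net that should not.
    high = min(nets_on, default=float("inf"))
    low = max(nets_off, default=float("-inf"))
    return (low, high) if low < high else None

# Hypothetical example: two inputs, weights (1, 1), target function AND.
and_range = threshold_range((1, 1), lambda x: int(all(x)))
table = truth_table((1, 1), threshold=2)
```

Each row of `table` corresponds to one inequality in the hand-written truth table: a row with output 1 requires net >= T, and a row with output 0 requires net < T. Intersecting those inequalities gives the admissible threshold interval, which is what `threshold_range` computes.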