We would like to build a learning algorithm that will compute the probability that a person has chicken pox. So, given a set of training data where each instance is mapped to 1 for true or 0 for false: 1. Derive the proper error function to use for finding the ML hypothesis using Bayes’ Rule. You should go through a similar process as the one used to derive least squared error in the lessons. 2. Compare and contrast your result to the rule we derived for a deterministic function perturbed by zero-mean gaussian noise. What would a normal neural network using sum of squared errors do with these data? What if the data consisted of x,y pairs where y was an estimate of the probability instead of 0s and 1s? 2. Design a two-input perceptron that implements the boolean function 𝐴∧¬𝐵. Design a two-layer network of perceptrons that implements 𝐴 ⊕ 𝐵 (where ⊕ is XOR). 3. Derive the perceptron training rule and gradient descent training rule for a single unit with output 𝑜, where 𝑜 = 𝑤0 + 𝑤1𝑥1 + 𝑤1𝑥12 + … + 𝑤n𝑥n + 𝑤𝑛𝑥n2. What are the advantages of using gradient descent training rule for training neural networks over the perceptron training rule? 4. Explain how one can use Decision Trees to perform regression? Show that when the error function is squared error that the expected value at any leaf is the mean. Take the Boston Housing dataset (http:// lib.stat.cmu.edu/datasets/boston) and use Decision Trees to perform regression. 5. Suggest a lazy version of the eager decision tree learning algorithm ID3. What are the advantages and disadvantages of your lazy algorithm compared to the original eager algorithm? 6. Imagine you had a learning problem with an instance space of points on the plane and a target function that you knew took the form of a line on the plane where all points on one side of the line are positive and all those on the other are negative. If you were constrained to only use decision tree or nearest-neighbor learning, which would you use? Why? 7. Give the VC dimension of these hypothesis spaces, briefly explaining your answers: 1. An origin-centered circle (2D) 2. An origin-centered sphere (3D)