$30
Q1.
Consider the signal-flow graph of the perceptron shown in the above figure. The activation function $\varphi(v)$, where $v$ is the induced local field, can be designed by the user. If the activation function is chosen as a hard limiter (i.e., a step function), the model becomes the classical perceptron, whose decision boundary is known to be a hyperplane. In this problem, we explore other choices of the activation function and their effect on the decision boundary. Assume that the classification decision made by the perceptron is a simple threshold rule, defined as follows:
The observation vector x belongs to class C1 if the output y > ξ, where ξ is a user-defined threshold; otherwise, x belongs to class C2.
Consider the following three choices of activation function:
1) The activation function is a linear function: $\varphi(v) = av + b$;
2) The activation function is the logistic function: $\varphi(v) = \dfrac{1}{1 + e^{-2v}}$;
3) The activation function is the bell-shaped Gaussian function: $\varphi(v) = e^{-v^2/2}$.
For each case, investigate whether the resulting decision boundary is a hyperplane.
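The three candidate activation functions and the threshold rule can be probed numerically before attempting the analysis. The following is a minimal sketch, not a solution: it assumes the induced local field has the usual form v = wᵀx + b (the figure is not reproduced here), it assumes the rule assigns C1 when y exceeds the threshold, and all function names and the default values of a and b are hypothetical.

```python
import math

def phi_linear(v, a=1.0, b=0.0):
    # Linear activation; a and b are user-chosen constants (defaults are arbitrary).
    return a * v + b

def phi_logistic(v):
    # Logistic activation with slope parameter 2, as in choice 2).
    return 1.0 / (1.0 + math.exp(-2.0 * v))

def phi_gaussian(v):
    # Bell-shaped Gaussian activation, as in choice 3).
    return math.exp(-v * v / 2.0)

def induced_local_field(w, x, bias):
    # Assumed form of the induced local field: v = w^T x + bias.
    return sum(wi * xi for wi, xi in zip(w, x)) + bias

def classify(phi, w, x, bias, threshold):
    # Threshold rule: class C1 if the output exceeds the threshold, else C2.
    y = phi(induced_local_field(w, x, bias))
    return "C1" if y > threshold else "C2"
```

Evaluating `classify` over a grid of points x for each activation function gives a picture of the C1 region, which can then be compared against the analytical answer.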
Q2.
Consider the logic function EXCLUSIVE OR (XOR).

Truth table of XOR

x1   x2   y
0    0    0
1    0    1
0    1    1
1    1    0
It is well known that the XOR problem is not linearly separable. This seems obvious on visual inspection, but visual inspection cannot be accepted as a mathematical proof. Supply a rigorous mathematical proof of this statement.
Q3.
The perceptron can be used to implement numerous logic functions, such as AND, OR, COMPLEMENT and NAND, whose truth tables are given below.
AND

x1   x2   y
0    0    0
0    1    0
1    0    0
1    1    1

OR

x1   x2   y
0    0    0
0    1    1
1    0    1
1    1    1

COMPLEMENT

x    y
0    1
1    0

NAND

x1   x2   y
0    0    1
0    1    1
1    0    1
1    1    0
a). Demonstrate the implementation of the logic functions AND, OR, COMPLEMENT and NAND with weights selected by off-line calculation.
b). Demonstrate the implementation of the logic functions AND, OR, COMPLEMENT and NAND with weights selected by a learning procedure. Suppose the initial weights are chosen randomly and the learning rate is 1.0. Plot the trajectories of the weights for each case. Compare the results with those obtained in (a). Try other learning rates, and report your observations.
c). What happens if the perceptron is applied to the EXCLUSIVE OR function with weights selected by a learning procedure? Suppose the initial weights are chosen randomly and the learning rate is 1.0. Carry out the computer experiment and explain your findings.
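A possible starting point for the learning experiments is sketched below. It is only one way to set them up, under stated assumptions: the activation is a step function that outputs 1 when the induced local field is non-negative, the bias is treated as a weight on a constant input of 1, the initial weights are drawn uniformly from [-1, 1], and the names `step` and `train_perceptron` are hypothetical.

```python
import random

def step(v):
    # Hard limiter; the tie v == 0 is assigned to output 1 (an assumption).
    return 1 if v >= 0 else 0

def train_perceptron(samples, eta=1.0, max_epochs=100, seed=0):
    """Train on (inputs, target) pairs; return (weights, trajectory, converged)."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # w[0] is the bias weight, attached to a constant input of 1.
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
    trajectory = [list(w)]
    for _ in range(max_epochs):
        errors = 0
        for x, d in samples:
            xa = [1.0] + list(x)
            y = step(sum(wi * xi for wi, xi in zip(w, xa)))
            e = d - y
            if e != 0:
                errors += 1
                # Error-correction rule: w <- w + eta * e * x.
                w = [wi + eta * e * xi for wi, xi in zip(w, xa)]
            trajectory.append(list(w))
        if errors == 0:
            # A full pass without mistakes: the weights separate the data.
            return w, trajectory, True
    return w, trajectory, False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```

The returned `trajectory` holds the weight vector after every presentation of a sample, so it can be passed directly to a plotting routine. The `converged` flag reports whether a complete epoch passed without a weight update before `max_epochs` was reached.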
Q4.
A single-layer perceptron with a purely linear activation function can be used to fit a linear model to a set of input-output pairs. Suppose that we are given the following pairs:
{(0,0.5), (0.8, 1), (1.6, 4), (3, 5), (4.0, 6), (5.0, 9)} and a single linear neuron as shown in the following figure.
a). Find the solution of w and b using the standard linear least-squares (LLS) method. Plot out the fitting result.
b). Suppose that the initial weights are chosen randomly and the learning rate is 0.01. Find the solution of w and b using the least-mean-square (LMS) algorithm for 100 epochs. Plot the fitting result and the trajectories of the weights versus learning steps. Do the weights converge?
c). Compare the results obtained by LLS and the LMS methods.
d). Repeat the simulation study in (b) with different learning rates, and explain your findings.
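The two fitting procedures in this question can be set up along the following lines. This is a sketch under assumptions rather than a reference solution: the model is y = wx + b, LLS is computed from the closed-form expressions for a single input, LMS updates w and b after every sample in the order the pairs are listed, the initial values are drawn uniformly from [-1, 1], and the names `fit_lls` and `fit_lms` are hypothetical.

```python
import random

# Input-output pairs (x, d) from the problem statement.
PAIRS = [(0, 0.5), (0.8, 1), (1.6, 4), (3, 5), (4.0, 6), (5.0, 9)]

def fit_lls(pairs):
    # Closed-form least squares for the single-input model y = w*x + b.
    n = len(pairs)
    x_mean = sum(x for x, _ in pairs) / n
    d_mean = sum(d for _, d in pairs) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in pairs)
    sxd = sum((x - x_mean) * (d - d_mean) for x, d in pairs)
    w = sxd / sxx
    b = d_mean - w * x_mean
    return w, b

def fit_lms(pairs, eta=0.01, epochs=100, seed=0):
    # Sample-by-sample LMS; returns final (w, b) and the full weight trajectory.
    rng = random.Random(seed)
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    trajectory = [(w, b)]
    for _ in range(epochs):
        for x, d in pairs:
            e = d - (w * x + b)   # instantaneous error
            w += eta * e * x      # LMS update for the weight
            b += eta * e          # LMS update for the bias
            trajectory.append((w, b))
    return w, b, trajectory
```

Calling `fit_lms(PAIRS, eta=...)` with several values of `eta` and plotting each `trajectory` against the step index is one way to carry out the comparison asked for in (c) and (d).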
Q5.
Consider fitting a linear model to a set of input-output pairs (x(1), d(1)), (x(2), d(2)), …, (x(n), d(n)) observed in an interval of duration n, where the input x is an m-dimensional vector, $x = [x_1, x_2, \cdots, x_m]^T$. The linear model takes the following form:
$y(x) = w_1 x_1 + w_2 x_2 + \cdots + w_m x_m = w^T x$
Derive the formula to calculate the optimal parameter w* such that the following cost function J(w) is minimized.
$J(w) = \sum_{i=1}^{n} r(i)\, e(i)^2 = \sum_{i=1}^{n} r(i)\, \bigl(d(i) - y(x(i))\bigr)^2$
where r(i) > 0 are the weighting factors for each output error e(i).
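Once a formula for w* has been derived, it can be checked numerically by evaluating the cost at w* and at nearby perturbed weight vectors. The helper below is a minimal sketch for that purpose only; it evaluates the weighted cost as defined above and does not compute w*. The name `weighted_cost` and the argument layout (lists of input vectors, targets, and weighting factors) are assumptions.

```python
def weighted_cost(w, xs, ds, rs):
    # J(w) = sum_i r(i) * (d(i) - w^T x(i))^2
    total = 0.0
    for x, d, r in zip(xs, ds, rs):
        y = sum(wj * xj for wj, xj in zip(w, x))  # linear model output w^T x
        total += r * (d - y) ** 2
    return total
```

If the derived w* is correct, `weighted_cost` at w* should be no larger than its value at any perturbed weight vector.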