Homework 2
Write a computer program to evaluate the generalization error (GE), model prediction error (ME), and training error (TE) for the k-nearest neighbors learning approach. Use the provided k-NN Python code.
Learning model:
In the k-nearest neighbors learning approach, the output guess, $\hat{Y}_k$, is:
$$\hat{Y}_k = \hat{f}_k(x) = \frac{1}{k} \sum_{x_i \in N_k(x)} Y_i^{\text{training}}$$
where $N_k(x)$ is the set of the k nearest neighbors of x within the observed (training) data.
Data model:
Training data
Let us define the observed (training) data model.
• First, generate $N_{\text{training}} = 50$ uniformly spaced feature samples in the interval from zero to one. That is, $x_i = i\,\Delta x$, $i = 0, \dots, N_{\text{training}} - 1$, where $\Delta x = 1/N_{\text{training}}$.
• Next, generate $N_{\text{training}} = 50$ noise samples, $n_i$, as a normally distributed random process with zero mean and a standard deviation of 0.1.
(Hint: Use the npr.normal NumPy function.)
Now the observed noisy data can be calculated as:
$$Y_i^{\text{training}} = f(x_i) + n_i, \quad i = 0, \dots, N_{\text{training}} - 1,$$
where $f(x_i) = \sin(2\pi x_i)$.
Testing data
• To generate the testing data, $Y_j^{\text{testing}}$, $j = 1, \dots, N_{\text{testing}}$, the same procedure described above can be used with $N_{\text{testing}} = 300$.
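The data model above can be sketched as follows. This is only an illustration, not the provided course code: the helper name make_data and its seed argument are assumptions introduced here, and the seeding is included only so that repeated runs give the same samples.

```python
import numpy as np
import numpy.random as npr


def make_data(n_samples, noise_std=0.1, seed=None):
    """Return (x, y) with x_i = i / n_samples and y_i = sin(2*pi*x_i) + n_i.

    make_data is a hypothetical helper, not part of the provided k-NN code.
    """
    if seed is not None:
        npr.seed(seed)  # optional, only to make the run repeatable
    x = np.arange(n_samples) / n_samples           # x_i = i * dx, dx = 1 / n_samples
    noise = npr.normal(0.0, noise_std, n_samples)  # zero mean, std = noise_std
    y = np.sin(2 * np.pi * x) + noise              # noisy observations
    return x, y


# Training set (N = 50) and testing set (N = 300), drawn with different seeds
x_train, y_train = make_data(50, seed=0)
x_test, y_test = make_data(300, seed=1)
```

Using different seeds for the two sets keeps the testing noise independent of the training noise, which is what the generalization error is meant to probe.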
Evaluation:
The generalization error can be approximately calculated as:
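$$GE(k) = E\left[(Y - \hat{Y}_k)^2\right] \approx \frac{1}{N_{\text{testing}}} \sum_{j=1}^{N_{\text{testing}}} \left(Y_j^{\text{testing}} - \hat{f}_k(x_j^{\text{testing}})\right)^2,$$
the model prediction error as:
$$ME(k) = E\left[(f(x) - \hat{Y}_k)^2\right] \approx \frac{1}{N_{\text{testing}}} \sum_{j=1}^{N_{\text{testing}}} \left(f(x_j^{\text{testing}}) - \hat{f}_k(x_j^{\text{testing}})\right)^2,$$
and the training error as:
$$TE(k) = \frac{1}{N_{\text{training}}} \sum_{j=1}^{N_{\text{training}}} \left(Y_j^{\text{training}} - \hat{f}_k(x_j^{\text{training}})\right)^2.$$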
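The three error measures above can be computed along the following lines. This is a minimal sketch under stated assumptions: knn_predict is a brute-force stand-in written for illustration (the provided k-NN code may differ), the function names f, knn_predict, and errors are hypothetical, and the seed and the list of k values are arbitrary choices.

```python
import numpy as np
import numpy.random as npr


def f(x):
    """Noise-free target function f(x) = sin(2*pi*x)."""
    return np.sin(2 * np.pi * x)


def knn_predict(x_query, x_train, y_train, k):
    """Average the targets of the k training points nearest to each query.

    Brute-force stand-in for the provided k-NN code (an assumption).
    """
    dist = np.abs(x_query[:, None] - x_train[None, :])  # (n_query, n_train)
    nearest = np.argsort(dist, axis=1)[:, :k]           # indices of the k nearest
    return y_train[nearest].mean(axis=1)


def errors(k, x_train, y_train, x_test, y_test):
    """Return (GE, ME, TE) as mean squared differences for a given k."""
    pred_test = knn_predict(x_test, x_train, y_train, k)
    pred_train = knn_predict(x_train, x_train, y_train, k)
    ge = np.mean((y_test - pred_test) ** 2)     # noisy test targets vs. model
    me = np.mean((f(x_test) - pred_test) ** 2)  # noise-free f vs. model
    te = np.mean((y_train - pred_train) ** 2)   # noisy training targets vs. model
    return ge, me, te


npr.seed(0)  # arbitrary seed, only for repeatability
x_train = np.arange(50) / 50
y_train = f(x_train) + npr.normal(0.0, 0.1, 50)
x_test = np.arange(300) / 300
y_test = f(x_test) + npr.normal(0.0, 0.1, 300)

for k in (1, 3, 5, 10):  # illustrative values of k
    ge, me, te = errors(k, x_train, y_train, x_test, y_test)
    print(f"k={k:2d}  GE={ge:.4f}  ME={me:.4f}  TE={te:.4f}")
```

With k = 1 each training point is its own nearest neighbor, so TE is exactly zero in this sketch; that is a convenient sanity check before comparing GE and ME across k.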
Report: