
 Machine Learning
Homework 1

1        Bayesian Linear Regression
Given the training inputs $\mathbf{x}$ and the corresponding labels $\mathbf{t}$, we want to predict the label $t$ of a new test point $x$. In other words, we wish to evaluate the predictive distribution $p(t \mid x, \mathbf{x}, \mathbf{t})$.

A linear regression function can be expressed as follows, where $\phi(x)$ is a vector of basis functions:

$y(x, \mathbf{w}) = \mathbf{w}^\top \phi(x)$

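To predict $t$ for a new test point $x$ from the learned $\mathbf{w}$, we will:

•   Multiply the likelihood of the new data, $p(t \mid x, \mathbf{w})$, by the posterior distribution of $\mathbf{w}$ given the training set and its labels, $p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})$.

•   Integrate over $\mathbf{w}$ to find the predictive distribution:

$p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, d\mathbf{w}$.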

Now, please answer the following questions:

1.   Why do we need the basis function $\phi(x)$ for linear regression? What is the benefit of applying basis functions in linear regression? (5%)

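2.   Prove that the predictive distribution above has the form

$p(t \mid x, \mathbf{x}, \mathbf{t}) = \mathcal{N}(t \mid m(x), s^2(x))$

where

$m(x) = \beta\, \phi(x)^\top \mathbf{S} \sum_{n=1}^{N} \phi(x_n)\, t_n$

$s^2(x) = \beta^{-1} + \phi(x)^\top \mathbf{S}\, \phi(x).$

Here, the matrix $\mathbf{S}^{-1}$ is given by $\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \phi(x_n)\, \phi(x_n)^\top$, where $\alpha$ is the precision of the Gaussian prior $p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \alpha^{-1}\mathbf{I})$ and $\beta$ is the noise precision.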

(Hint: $p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w})\, p(\mathbf{w})$, and you may use the formulas shown on page 93.)

3.   Can we use a linear regression function for classification? Why or why not? Explain. (10%)
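As a numerical companion to question 2, the following is a minimal sketch of how the predictive mean and variance could be evaluated. It assumes a zero-mean Gaussian prior on the weights with precision alpha, for which S^{-1} = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T and m(x) = beta * phi(x)^T S sum_n phi(x_n) t_n. The polynomial basis, the synthetic sine data, and the values of alpha and beta are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def poly_basis(x, degree):
    """Map scalar inputs to [1, x, ..., x^degree] (an illustrative basis choice)."""
    x = np.asarray(x, dtype=float).reshape(-1)
    return np.vander(x, degree + 1, increasing=True)

def predictive(x_new, x_train, t_train, alpha, beta, degree):
    """Return the predictive mean m(x) and variance s^2(x) at each x_new."""
    Phi = poly_basis(x_train, degree)      # N x M design matrix
    phi_new = poly_basis(x_new, degree)    # K x M
    # S^{-1} = alpha * I + beta * sum_n phi(x_n) phi(x_n)^T (assumed prior precision alpha)
    S_inv = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    S = np.linalg.inv(S_inv)
    # m(x) = beta * phi(x)^T S sum_n phi(x_n) t_n
    mean = beta * phi_new @ S @ (Phi.T @ t_train)
    # s^2(x) = 1 / beta + phi(x)^T S phi(x)
    var = 1.0 / beta + np.einsum("km,mn,kn->k", phi_new, S, phi_new)
    return mean, var

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, size=20)
t_train = np.sin(np.pi * x_train) + rng.normal(0, 0.1, size=20)

beta = 100.0
mean, var = predictive(np.array([0.0, 0.5]), x_train, t_train,
                       alpha=1e-2, beta=beta, degree=3)
print(mean, var)
```

The variance never drops below 1/beta: the first term is the noise on the target, and the second term reflects the remaining uncertainty in the weights.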


2        Linear Regression
In this homework, you need to predict a student's chance of being admitted based on relevant resume data. The following two approaches must each be implemented:

•   Maximum likelihood approach (ML)

•   Maximum a posteriori approach (MAP)

 

The dataset provides a total of 500 students with 7 features. Can you use these features to predict the chance of admission to your own dream school?

One might consider the following steps to start the work:

1.   Download and inspect the dataset.

2.   Create a new Colab or Jupyter notebook file.

3.   Divide the dataset into training and validation sets.
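Step 3 above can be sketched as a shuffled split with NumPy. The 80/20 ratio, the helper name train_val_split, and the synthetic stand-in arrays are assumptions for illustration; in the notebook, X and t would instead be loaded from the provided CSV files.

```python
import numpy as np

def train_val_split(X, t, val_ratio=0.2, seed=0):
    """Shuffle the rows once, then hold out a fraction for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(round(len(X) * val_ratio))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], t[train_idx], X[val_idx], t[val_idx]

# Synthetic stand-in with the same shape as the described dataset (500 x 7).
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 7))
t = rng.uniform(size=500)

X_train, t_train, X_val, t_val = train_val_split(X, t)
print(X_train.shape, X_val.shape)  # (400, 7) (100, 7)
```

Fixing the seed keeps the split reproducible, which makes the RMS errors reported later comparable between runs.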

Dataset Description

•   dataset X.csv contains 7 resume features that serve as the input:

GRE score, TOEFL score, University rating, SOP, LOR, CGPA, Research

•   dataset T.csv contains the target: Chance of Admit

Specification

•   For problems marked Code Result, you must show your result in your .ipynb file or you will get no points.

•   For problems marked Explain, you must give a clear explanation or you will get low points.

•   You are also encouraged to discuss the problems that are not marked Explain.

1.   Feature selection

In real-world applications, the dimension of data is usually more than one. In the training stage, please fit the data by applying a polynomial function of the form

$y(\mathbf{x}, \mathbf{w}) = w_0 + \sum_{i=1}^{D} w_i x_i + \sum_{i=1}^{D} \sum_{j=1}^{D} w_{ij} x_i x_j \qquad (M = 2)$

and minimizing the error function.

 

(a)   In the feature selection stage, please apply polynomials of order M = 1 and M = 2 to the D = 7 dimensional input data. Please evaluate the corresponding RMS error on the training set and the validation set. (15%) Code Result

(b)   How would you analyze the weights of the M = 1 polynomial model to select the most contributive feature? Code Result, Explain (10%)
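One possible way to set up problem 1 is sketched below: build the design matrix for M = 1 or M = 2, fit the weights by least squares, and report the RMS error on both sets. The helper names, the synthetic data, and the choice to standardize features before comparing weight magnitudes are assumptions made for illustration, not requirements of the assignment.

```python
import numpy as np

def poly_features(X, M):
    """Design matrix for the polynomial model of order M (1 or 2)."""
    N, D = X.shape
    cols = [np.ones((N, 1)), X]  # w_0 column and the linear terms
    if M == 2:
        # All D*D products x_i x_j, matching the double sum over i and j.
        # Columns for (i, j) and (j, i) are duplicates; pinv copes with that.
        cols.append((X[:, :, None] * X[:, None, :]).reshape(N, D * D))
    return np.hstack(cols)

def fit_ml(Phi, t):
    """Least-squares (maximum likelihood) weights via the pseudo-inverse."""
    return np.linalg.pinv(Phi) @ t

def rms_error(Phi, w, t):
    return np.sqrt(np.mean((Phi @ w - t) ** 2))

# Synthetic data for illustration: feature 5 dominates by construction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 7))
t = (0.5 * X[:, 5] + 0.1 * X[:, 0] + 0.2 * X[:, 0] * X[:, 1]
     + 0.05 * rng.normal(size=500))
X_train, t_train, X_val, t_val = X[:400], t[:400], X[400:], t[400:]

# Standardize with training statistics so weight magnitudes are comparable.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
Z_train, Z_val = (X_train - mu) / sigma, (X_val - mu) / sigma

results, weights = {}, {}
for M in (1, 2):
    Phi_train, Phi_val = poly_features(Z_train, M), poly_features(Z_val, M)
    w = fit_ml(Phi_train, t_train)
    weights[M] = w
    results[M] = (rms_error(Phi_train, w, t_train), rms_error(Phi_val, w, t_val))
    print(f"M={M}: train RMS={results[M][0]:.4f}, validation RMS={results[M][1]:.4f}")

# For M = 1, w[0] is the bias and w[1:] are the per-feature weights.
top_feature = int(np.argmax(np.abs(weights[1][1:])))
print("largest |weight| at feature index", top_feature)
```

Comparing raw weights only makes sense when the features share a scale, which is why the sketch standardizes first; GRE scores and a binary Research flag, for example, live on very different ranges.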

2.   Maximum likelihood approach

(a)   Which basis function will you use to further improve your regression model: polynomial, Gaussian, sigmoidal, or a hybrid? Explain (5%)

(b)   Introduce the basis function you chose in (a) into the linear regression model and analyze the result. (Hint: You might want to discuss what happens when the model becomes too complex.) Code Result, Explain (10%)

$\phi(\mathbf{x}) = [\phi_1(\mathbf{x}), \phi_2(\mathbf{x}), \ldots, \phi_N(\mathbf{x}), \phi_{\mathrm{bias}}(\mathbf{x})]$

(c)    Apply N-fold cross-validation in your training stage to select at least one hyperparameter (order, number of parameters, ...) for the model, and discuss the results (underfitting, overfitting). Code Result, Explain (10%)
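Parts (b) and (c) can be combined as in the sketch below, which uses Gaussian basis functions with a trailing bias column and selects the number of basis functions by cross-validation. The Gaussian basis is only one of the options listed in (a); taking the first training points as centers, the width s = 3.0, five folds, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def gaussian_design(X, centers, s):
    """Gaussian basis exp(-||x - mu_j||^2 / (2 s^2)) plus a trailing bias column."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-sq_dist / (2.0 * s ** 2))
    return np.hstack([Phi, np.ones((X.shape[0], 1))])

def kfold_rms(X, t, n_basis, s, n_folds=5, seed=0):
    """Mean validation RMS error over n_folds folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Assumed heuristic: use the first n_basis training points as centers.
        centers = X[train_idx[:n_basis]]
        Phi_train = gaussian_design(X[train_idx], centers, s)
        w = np.linalg.pinv(Phi_train) @ t[train_idx]
        pred = gaussian_design(X[val_idx], centers, s) @ w
        errors.append(np.sqrt(np.mean((pred - t[val_idx]) ** 2)))
    return float(np.mean(errors))

# Synthetic data for illustration only.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 7))
t = np.tanh(X[:, 0]) + 0.05 * rng.normal(size=200)

cv_scores = {n: kfold_rms(X, t, n_basis=n, s=3.0) for n in (5, 20, 80)}
best_n = min(cv_scores, key=cv_scores.get)
print(cv_scores, "selected n_basis =", best_n)
```

Too few basis functions tend to underfit and too many tend to overfit; the cross-validated error gives a way to compare the candidates without touching the final validation set.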

3.   Maximum a posteriori approach

(a)   What is the key difference between the maximum likelihood approach and the maximum a posteriori approach? Explain (5%)

(b)   Use the maximum a posteriori approach to retest the model you designed in problem 2. You may choose a Gaussian distribution as the prior. Code Result (10%)

(c)    Compare the results of the maximum likelihood approach and the maximum a posteriori approach. Are they consistent with your conclusion in (a)? Explain (5%)
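For a zero-mean isotropic Gaussian prior with precision alpha and Gaussian noise with precision beta, the MAP weights reduce to regularized least squares with lambda = alpha / beta. The sketch below contrasts that with the maximum likelihood fit. The value of lam, the synthetic design matrix, and the simplification of regularizing the bias weight along with the others are assumptions for illustration.

```python
import numpy as np

def fit_ml(Phi, t):
    """Maximum likelihood weights: ordinary least squares via the pseudo-inverse."""
    return np.linalg.pinv(Phi) @ t

def fit_map(Phi, t, lam):
    """MAP weights under a zero-mean Gaussian prior: (lam*I + Phi^T Phi)^{-1} Phi^T t."""
    M = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)

# Small synthetic problem (30 samples, bias + 7 features) for illustration.
rng = np.random.default_rng(3)
Phi = np.hstack([np.ones((30, 1)), rng.normal(size=(30, 7))])
t = Phi @ rng.normal(size=8) + 0.5 * rng.normal(size=30)

w_ml = fit_ml(Phi, t)
w_map = fit_map(Phi, t, lam=5.0)  # lam = alpha / beta, value chosen arbitrarily
print("||w_ml||  =", np.linalg.norm(w_ml))
print("||w_map|| =", np.linalg.norm(w_map))
```

The prior pulls the weights toward zero, so the MAP solution has a smaller norm than the ML solution, and the two coincide as lam approaches zero.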

3        Rules
•   Please name the assignment as hw1 StudentID.zip (e.g. hw1 0123456.zip).

•   Your submission must contain three files:

–   .ipynb file containing all the results and code for this homework.

–   .py file exported from the .ipynb file.

–   .pdf file, the report containing your description of this homework.

•   Implementation will be graded on:

–   Completeness

–   Algorithm Correctness

–   Model description

–   Discussion

•   Only Python implementations are acceptable.

•   The NumPy library is recommended for the implementation.

•   Do not use high-level toolbox/module functions (e.g. sklearn, polyfit).

•   DO NOT PLAGIARIZE. (We will check program similarity scores.)
