COMP6211E-Homework 2 Solved

           Theoretical Problems 

1.     Consider the quadratic objective function

 

defined on x = [x1, x2] ∈ R2. Assume that we want to solve

x∗ = argmin_x Q(x),

starting from x0 = 0.

•    Find A and b so that Q(x) = (1/2) x⊤Ax − b⊤x.

•    For the gradient descent method with constant learning rate η, what range should η belong to? What is the optimal value of η, and what is the corresponding convergence rate?

•    For CG, how many iterations T are needed so that xT = x∗? Find the values of α1, β1, and α2.

•    For the Heavy-Ball method with constant η and β, what are the optimal values of (η, β) to achieve the fastest asymptotic convergence rate, and what is the corresponding convergence rate? (A numerical sketch of these methods follows this list.)
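As a companion to this problem, here is a minimal numerical sketch of the three methods on an illustrative quadratic Q(x) = (1/2) x⊤Ax − b⊤x. The particular A, b, step sizes, and iteration counts below are assumptions for illustration, not the instance used in the homework.

```python
import numpy as np

# Illustrative quadratic Q(x) = 0.5 x^T A x - b^T x; this A and b are
# placeholders, not the homework's instance.
A = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)

def grad(x):
    return A @ x - b

mu, L = np.linalg.eigvalsh(A)[0], np.linalg.eigvalsh(A)[-1]

# Gradient descent: converges for 0 < eta < 2/L; the classical optimal
# constant step is eta = 2/(L + mu), with rate (L - mu)/(L + mu).
eta = 2.0 / (L + mu)
x = np.zeros(2)
for _ in range(100):
    x = x - eta * grad(x)
print("GD        ", x, "error", np.linalg.norm(x - x_star))

# Heavy-ball: x_{t+1} = x_t - eta*grad(x_t) + beta*(x_t - x_{t-1});
# below is the classical tuning for quadratics.
eta_hb = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
x_prev = np.zeros(2)
x = np.zeros(2)
for _ in range(100):
    x, x_prev = x - eta_hb * grad(x) + beta * (x - x_prev), x
print("Heavy-ball", x, "error", np.linalg.norm(x - x_star))

# Conjugate gradient from x0 = 0: for a 2x2 SPD system it reaches x* in
# at most 2 iterations (alpha and beta_cg are the CG coefficients).
x = np.zeros(2)
r = b - A @ x
p = r.copy()
for t in range(2):
    alpha = (r @ r) / (p @ A @ p)
    x = x + alpha * p
    r_new = r - alpha * (A @ p)
    beta_cg = (r_new @ r_new) / (r @ r)
    p = r_new + beta_cg * p
    r = r_new
print("CG (2 it.)", x, "error", np.linalg.norm(x - x_star))
```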

2.   Consider the regularized logistic regression:

 

where xi ∈ Rd and yi ∈ {±1}. Assume ‖xi‖₂ ≤ 1 for all i.

•   Find the smoothness parameter L of f(w).

•   Find an estimate of the Lipschitz constant G in the region {w : f(w) ≤ f(0)} that holds for every dataset {xi} with ‖xi‖₂ ≤ 1. (A numerical sketch follows this list.)
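For intuition, the sketch below numerically probes the curvature of a regularized logistic objective of the assumed form f(w) = (λ/2)‖w‖₂² + (1/n) Σi log(1 + exp(−yi w⊤xi)); the data size, the value of λ, and this exact normalization are illustrative assumptions rather than the homework's (missing) definition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 0.1   # problem size and lambda are illustrative assumptions

# Random data with ||x_i||_2 <= 1 and labels in {+1, -1}.
X = rng.normal(size=(n, d))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
y = rng.choice([-1.0, 1.0], size=n)

def hessian(w):
    # Hessian of the assumed objective
    # f(w) = (lam/2)||w||^2 + (1/n) sum_i log(1 + exp(-y_i w.x_i)),
    # namely lam*I + (1/n) sum_i s_i (1 - s_i) x_i x_i^T, s_i = sigmoid(y_i w.x_i).
    s = 1.0 / (1.0 + np.exp(-y * (X @ w)))
    return lam * np.eye(d) + (X.T * (s * (1.0 - s))) @ X / n

# A smoothness parameter L must upper-bound the largest Hessian eigenvalue at
# every w; with ||x_i||_2 <= 1 and s(1-s) <= 1/4 one expects at most lam + 1/4.
for trial in range(5):
    w = rng.normal(size=d)
    print(f"lambda_max(Hessian) = {np.linalg.eigvalsh(hessian(w)).max():.4f}"
          f"   (lam + 1/4 = {lam + 0.25:.4f})")
```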

3.  Consider training data (xi, yi) such that ‖xi‖₂ ≤ 1 and yi ∈ {±1}, and suppose we would like to solve the linear SVM problem

 

using subgradient descent with w0 = 0 and learning rates ηt ≤ η < 1/λ.

•    Let C = {w : ‖w‖₂ ≤ R}. Find the smallest R such that, for all training data satisfying the assumptions of the problem, the iterates of subgradient descent without projection remain in C.


•    Find an upper bound on the Lipschitz constant G of f(w) in C. (A numerical sketch of this subgradient method appears below.)
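As a rough companion to this problem, the sketch below runs projection-free subgradient descent on an assumed objective f(w) = (λ/2)‖w‖₂² + (1/n) Σi max(0, 1 − yi w⊤xi) and tracks the largest iterate norm. The data, the value of λ, the constant step size, and the comparison radius 1/λ are illustrative assumptions, not asserted answers.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 100, 5, 0.5                # assumed problem size and lambda
eta = 0.5 / lam                        # constant step satisfying eta < 1/lam

X = rng.normal(size=(n, d))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))   # ||x_i|| <= 1
y = rng.choice([-1.0, 1.0], size=n)

def subgrad(w):
    # Subgradient of the assumed objective
    # (lam/2)||w||^2 + (1/n) sum_i max(0, 1 - y_i w.x_i):
    # the hinge contributes -y_i x_i where the margin is violated, 0 otherwise.
    margins = y * (X @ w)
    active = (margins < 1.0).astype(float)
    return lam * w - (X.T @ (active * y)) / n

w = np.zeros(d)
max_norm = 0.0
for t in range(1000):
    w = w - eta * subgrad(w)
    max_norm = max(max_norm, np.linalg.norm(w))

# Without projection the iterates stay bounded; 1/lam is one natural
# candidate radius to compare against under these assumptions.
print(f"max ||w_t|| = {max_norm:.4f},  1/lam = {1.0 / lam:.4f}")
```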

4.    Given a nonsmooth function, we would like to find its smooth approximation.

•   Find a closed form solution of

  .

•   Let x ∈ Rd, find

f(x) = min_{z∈Rd} [ ‖z‖₂ + (1/(2γ))‖z − x‖₂² ].

Proof. Let z achieve the minimum on the right-hand side. If ‖x‖₂ ≤ γ, then z = 0 is a solution, and

f(x) = ‖x‖₂²/(2γ).

If ‖x‖₂ > γ, then z = (1 − γ/‖x‖₂)x, and

f(x) = ‖x‖₂ − γ/2.

We thus obtain

f(x) = ‖x‖₂²/(2γ)    if ‖x‖₂ ≤ γ,
f(x) = ‖x‖₂ − γ/2    otherwise.
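A quick numerical sanity check of this closed form (in the scalar case): compare the piecewise expression with a direct grid minimization of |z| + (z − x)²/(2γ). The value of γ, the grid, and the test points are arbitrary illustrative choices.

```python
import numpy as np

gamma = 0.7   # example smoothing parameter

def huber(x):
    # Piecewise closed form from above (in 1D, so ||x||_2 = |x|):
    # x^2/(2*gamma) when |x| <= gamma, else |x| - gamma/2.
    return x * x / (2 * gamma) if abs(x) <= gamma else abs(x) - gamma / 2

def envelope(x, grid=np.linspace(-5, 5, 200001)):
    # Direct evaluation of min_z { |z| + (z - x)^2 / (2*gamma) } on a dense grid.
    return np.min(np.abs(grid) + (grid - x) ** 2 / (2 * gamma))

for x in [-2.0, -0.3, 0.0, 0.5, 1.5]:
    print(f"x={x:5.2f}  closed form={huber(x):.4f}  grid min={envelope(x):.4f}")
```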

 

Programming Problem 

We consider optimization with the smoothed hinge loss, and randomly generated data.

• Use the python template “prog-template.py”, and implement the functions marked with ’# implement’.

• Submit your code and outputs.

Please note that in real machine learning problems, strong-convexity settings are often associated with L2 regularization: it is necessary to choose a proper regularization to better estimate the true w. Thus, you should

1.    Choose the most appropriate setting mentioned in “prog-template.py” for real problems (including λ, γ, and the optimizer);

2.    Try several different values of λ and discuss how λ influences optimization and prediction, respectively. (A hedged sketch of a smoothed hinge loss objective follows below.)
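The template's actual function names and settings live in “prog-template.py”; independently of that file, the sketch below shows one common way to define a smoothed (Huber-style) hinge loss with smoothing parameter γ and an L2 weight λ, together with plain gradient descent on synthetic data. All names (smoothed_hinge, objective_and_grad), constants, and the exact form of the smoothing are assumptions for illustration, not the template's API.

```python
import numpy as np

def smoothed_hinge(margin, gamma):
    # Huber-style smoothing of max(0, 1 - margin); gamma controls how far
    # the quadratic region extends (an assumed, common definition).
    return np.where(margin >= 1.0, 0.0,
           np.where(margin >= 1.0 - gamma,
                    (1.0 - margin) ** 2 / (2.0 * gamma),
                    1.0 - margin - gamma / 2.0))

def smoothed_hinge_deriv(margin, gamma):
    # Derivative of the smoothed hinge with respect to the margin.
    return np.where(margin >= 1.0, 0.0,
           np.where(margin >= 1.0 - gamma, -(1.0 - margin) / gamma, -1.0))

def objective_and_grad(w, X, y, lam, gamma):
    # Assumed objective: f(w) = (lam/2)||w||^2 + (1/n) sum_i l_gamma(y_i w.x_i).
    margins = y * (X @ w)
    f = 0.5 * lam * (w @ w) + smoothed_hinge(margins, gamma).mean()
    g = lam * w + (X.T @ (smoothed_hinge_deriv(margins, gamma) * y)) / len(y)
    return f, g

# Tiny synthetic run with plain gradient descent (step size chosen ad hoc).
rng = np.random.default_rng(0)
n, d, lam, gamma = 200, 10, 0.01, 0.5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))
w = np.zeros(d)
for t in range(500):
    f, g = objective_and_grad(w, X, y, lam, gamma)
    w -= 0.5 * g
print(f"final objective {f:.4f}, training accuracy "
      f"{np.mean(np.sign(X @ w) == y):.3f}")
```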
