The following are the policies regarding this assignment.
2. Use LaTeX (online platforms such as Overleaf) for writing theory questions. Handwritten submissions will not be accepted.
3. Submit the assignment as a zip file containing the LaTeX source and the PDF file.
Questions
(a) $f(X) = (\det X)^{1/n}$ on $\operatorname{dom} f = \mathbf{S}^n_{++}$
(b) $f(x_1, x_2) = x_1 / x_2$ on $\mathbb{R}^2_{++}$
(c) $f(x_1, x_2) = 1/(x_1 x_2)$ on $\mathbb{R}^2_{++}$
(d) $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = \max_{i=1,2,\dots,k} \|A^{(i)} x - b^{(i)}\|$, where $A^{(i)} \in \mathbb{R}^{m \times n}$, $b^{(i)} \in \mathbb{R}^m$, and $\|\cdot\|$ is a norm on $\mathbb{R}^m$.
(e) The Gaussian distribution function $f$.
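Whatever convexity-related properties these functions are to be classified by, it can help to probe them numerically before attempting a proof. Below is a minimal Python sketch (illustrative only, not part of the required submission) that spot-checks midpoint convexity for function (c); the function name midpoint_gap and the sampling range are arbitrary choices.

import numpy as np

def midpoint_gap(f, x, y):
    # 0.5*(f(x) + f(y)) - f(midpoint); this is >= 0 for every pair x, y
    # exactly when f is midpoint convex on the sampled domain
    return 0.5 * (f(x) + f(y)) - f(0.5 * (x + y))

f_c = lambda v: 1.0 / (v[0] * v[1])   # function (c): 1/(x1*x2) on R^2_{++}

rng = np.random.default_rng(0)
gaps = [midpoint_gap(f_c, rng.uniform(0.1, 10.0, 2), rng.uniform(0.1, 10.0, 2))
        for _ in range(10_000)]
print(min(gaps) >= -1e-12)   # True here is consistent with (c) being convex on R^2_{++}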
Question 2. Let $f$ be a convex function, and let $\lambda_1 > 0$, $\lambda_i \leq 0$ for $i = 2,\dots,n$, with $\sum_i \lambda_i = 1$. Let $\operatorname{dom}(f)$ be affine, and for $x_1,\dots,x_n \in \operatorname{dom}(f)$, show that the following inequality always holds:
$$f(\lambda_1 x_1 + \cdots + \lambda_n x_n) \;\geq\; \lambda_1 f(x_1) + \cdots + \lambda_n f(x_n).$$
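Before attempting the proof, a quick numerical sanity check of this inequality can be reassuring; below is a minimal sketch, using the hypothetical test function $f(x) = x^2$ (convex, with affine domain $\mathbb{R}$).

import numpy as np

rng = np.random.default_rng(1)
f = lambda x: x ** 2                     # convex test function, dom f = R (affine)

for _ in range(10_000):
    n = int(rng.integers(2, 6))
    lam = np.concatenate(([0.0], -rng.uniform(0.0, 1.0, n - 1)))  # lambda_2, ..., lambda_n <= 0
    lam[0] = 1.0 - lam[1:].sum()                                   # forces lambda_1 >= 1 > 0 and sum = 1
    x = rng.standard_normal(n)
    lhs = f(lam @ x)                      # f(sum_i lambda_i x_i)
    rhs = lam @ f(x)                      # sum_i lambda_i f(x_i)
    assert lhs >= rhs - 1e-9              # the claimed inequality holds on every sampled instance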
Question 3. We say that a function $f$ is log-convex on the real interval $D = [a,b]$ if $\forall\, x, y \in D$ and
$\lambda \in [0,1]$, the function satisfies
$$f(\lambda x + (1-\lambda) y) \;\leq\; f(x)^{\lambda}\, f(y)^{1-\lambda}.$$
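As a quick numerical check of this definition (a sketch only, using the hypothetical log-convex function $f(x) = e^{x^2}$ on $D = [0, 2]$, which is also increasing there):

import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.exp(x ** 2)   # log f(x) = x^2 is convex, so f is log-convex; f is increasing on [0, 2]

for _ in range(10_000):
    x, y = rng.uniform(0.0, 2.0, size=2)
    lam = rng.uniform(0.0, 1.0)
    # definition of log-convexity: f(lam*x + (1-lam)*y) <= f(x)**lam * f(y)**(1-lam)
    assert f(lam * x + (1 - lam) * y) <= f(x) ** lam * f(y) ** (1 - lam) + 1e-9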
We will show that for an increasing log-convex function f : D → R and 0 ≤ t ≤ 1,
where
(a) First, we prove the following inequalities:
(i) For 0 < t < 1, the following holds
and
Question 4. Let $D = [a,b]$ and let $f : D \to \mathbb{R}$ be a convex or concave function of class $C^2$. Show that if
$|f'(x)| \geq \zeta$ for all $x \in D$, where $\zeta > 0$, then
Question 5. The basic idea behind many reinforcement learning algorithms is to estimate the action-value function Q∗(s,a) by using the Bellman equation as an iterative update,
$$Q_{i+1}(s,a) = \mathbb{E}_{s'}\!\left[\, r + \gamma \max_{a'} Q_i(s', a') \,\middle|\, s, a \,\right]$$
where $\{a\}$ are the actions, $\{s\}$ are the states, $r$ is the reward and $\gamma$ is a discount factor. Such iterative methods converge to the optimal action-value function as $i \to \infty$. [If you’re not familiar with Reinforcement Learning, read this short introduction to understand the terminology used: Reinforcement Learning, although it is not required to solve the question.]
In practice, however, this iteration is infeasible, and a neural network $Q(s,a;\theta)$ is used as a function approximator to estimate the optimal action-value function, $Q(s,a;\theta) \approx Q^*(s,a)$. During training, we minimize the mean-squared error in the Bellman equation, and the loss function of such a network is given as
$$L_i(\theta_i) = \mathbb{E}_{(s,a,r,s') \sim U(D)}\!\left[\left( r + \gamma \max_{a'} Q(s', a'; \theta_i^{-}) - Q(s, a; \theta_i) \right)^{2}\right]$$
where $e = (s,a,r,s')$ are the experiences forming the dataset $D$, and the parameters $\theta_i^{-}$ are held fixed while computing the target.
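To make the loss concrete, here is a minimal PyTorch-style sketch of how it is typically computed (the networks q_net with parameters $\theta_i$ and target_net with parameters $\theta_i^{-}$, the batch layout, and the omitted terminal-state handling are all hypothetical; this only illustrates the loss above and is not prescribed by the question):

import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    # batch is assumed to hold tensors sampled uniformly from D:
    # states (N, d), actions (N,) of dtype long, rewards (N,), next_states (N, d)
    states, actions, rewards, next_states = batch

    # Q(s, a; theta_i): value of the action actually taken in each transition
    q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # r + gamma * max_a' Q(s', a'; theta_i^-): computed with the fixed parameters,
    # so no gradient flows through theta_i^- (terminal-state masking omitted for brevity)
    with torch.no_grad():
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * next_q

    # mean-squared Bellman error L_i(theta_i)
    return F.mse_loss(q_values, target)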
Question 6. Let $x_1,\dots,x_n$ be non-negative points, and let $p_1,\dots,p_n$ be positive numbers such that $\sum_i p_i = 1$. Let $f : \operatorname{conv}\{x_1,\dots,x_n\} \to \mathbb{R}$ be a non-decreasing convex function. Then show that
(a)
(b)
Question 7.
(a) Show that the following definitions are equivalent:
A function $f$ is $L$-smooth with Lipschitz constant $L > 0$ if
• $\forall\, x, y \in \operatorname{dom}(f)$, $\|\nabla f(x) - \nabla f(y)\| \leq L \|x - y\|$ (i.e., $\nabla f$ is $L$-Lipschitz continuous)
• a quadratic function upper bounds $f$, i.e., $f(y) \leq f(x) + \nabla f(x)^{\top}(y - x) + \frac{L}{2}\|y - x\|^{2}$ for all $x, y \in \operatorname{dom}(f)$
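A quick numerical illustration of the two conditions (a sketch only, using the hypothetical example $f(x) = \frac{1}{2} x^{\top} A x$ with $A \succeq 0$, for which $\nabla f(x) = Ax$ and $L = \lambda_{\max}(A)$):

import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T                          # symmetric PSD, so f(x) = 0.5 * x^T A x is convex
L = np.linalg.eigvalsh(A).max()      # Lipschitz constant of grad f(x) = A x

f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

for _ in range(1000):
    x, y = rng.standard_normal(n), rng.standard_normal(n)
    # first condition: ||grad f(x) - grad f(y)|| <= L * ||x - y||
    assert np.linalg.norm(grad(x) - grad(y)) <= L * np.linalg.norm(x - y) + 1e-9
    # second condition: f(y) <= f(x) + grad f(x)^T (y - x) + (L/2) * ||y - x||^2
    assert f(y) <= f(x) + grad(x) @ (y - x) + 0.5 * L * np.linalg.norm(y - x) ** 2 + 1e-9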
(b) Let $f : \mathbb{R}^n \to \mathbb{R}$ be such that:
• $f$ is a convex function
• $\nabla f$ is Lipschitz continuous with Lipschitz constant $2\mu$
Show that, for all $x, y \in \mathbb{R}^n$,
Question 8. Implement numerically correct versions of the following functions:
1) Logistic Loss:
2) Hinge Loss/SVMs: . Here $y_i \in \{-1, +1\}$.
3) Least Squares Loss: . Here $y_i \in \mathbb{R}$.
Note: Write your code in the given notebook, Assignment1.ipynb, with your implementations of 1), 2), and 3), respectively. Do not modify the arguments.
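For intuition only, here is a minimal sketch of what numerically stable implementations might look like; the function names, signatures, and the particular margin-based parameterization are illustrative assumptions and are not the arguments fixed in Assignment1.ipynb (keep those as given).

import numpy as np

def logistic_loss(z):
    # stable evaluation of log(1 + exp(z)): for large z, exp(z) overflows,
    # so use the identity log(1 + exp(z)) = max(z, 0) + log(1 + exp(-|z|))
    z = np.asarray(z, dtype=float)
    return np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))

def hinge_loss(scores, y):
    # max(0, 1 - y * score), with labels y in {-1, +1}
    return np.maximum(0.0, 1.0 - y * scores)

def least_squares_loss(pred, y):
    # squared error (pred - y)^2, with real-valued targets y
    return (pred - y) ** 2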
2. Implement these functions using vectorized code and compare the results with the previous simple implementations.
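A sketch of the kind of comparison part 2 asks for, again with illustrative names: a simple element-by-element loop versus a vectorized version of the same stable logistic-loss formula.

import numpy as np
import time

def logistic_loss_loop(z):
    # simple per-element implementation of log(1 + exp(z)) using the stable identity
    return np.array([max(zi, 0.0) + np.log1p(np.exp(-abs(zi))) for zi in z])

def logistic_loss_vec(z):
    # vectorized version of the same computation
    z = np.asarray(z, dtype=float)
    return np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))

z = np.random.default_rng(0).standard_normal(100_000) * 50.0   # large magnitudes stress stability

t0 = time.perf_counter(); loop_out = logistic_loss_loop(z); t1 = time.perf_counter()
vec_out = logistic_loss_vec(z); t2 = time.perf_counter()
print(np.allclose(loop_out, vec_out), f"loop: {t1 - t0:.4f}s, vectorized: {t2 - t1:.4f}s")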