SDSC6015 - Assignment 1
#assignment #sdsc6015
Problem 1: Jensen’s inequality
Let $f$ be convex, $x_1, \dots, x_m \in \operatorname{dom}(f)$, and $\lambda_1, \dots, \lambda_m \in \mathbb{R}_+$ such that $\sum_{i=1}^{m} \lambda_i = 1$. Show that
$$f\left(\sum_{i=1}^{m} \lambda_i x_i\right) \le \sum_{i=1}^{m} \lambda_i f(x_i).$$
Proof.
- Base case ($m = 2$): for a convex function $f$, for all $x, y \in \operatorname{dom}(f)$ and $\theta \in [0, 1]$, we have
$$f(\theta x + (1-\theta) y) \le \theta f(x) + (1-\theta) f(y).$$
With $\lambda_1 + \lambda_2 = 1$, taking $\theta = \lambda_1$, $x = x_1$, $y = x_2$ gives
$$f(\lambda_1 x_1 + \lambda_2 x_2) = f(\lambda_1 x_1 + (1-\lambda_1) x_2) \le \lambda_1 f(x_1) + (1-\lambda_1) f(x_2) = \lambda_1 f(x_1) + \lambda_2 f(x_2),$$
so the inequality holds for $m = 2$.
- Inductive hypothesis: assume the inequality holds for $m = k$. That is, for any $x_1, \dots, x_k \in \operatorname{dom}(f)$ and $\lambda_1, \dots, \lambda_k \in \mathbb{R}_+$ with $\sum_{i=1}^{k} \lambda_i = 1$,
$$f\left(\sum_{i=1}^{k} \lambda_i x_i\right) \le \sum_{i=1}^{k} \lambda_i f(x_i).$$
- Inductive step ($m = k + 1$): let $\lambda_1, \dots, \lambda_{k+1} \in \mathbb{R}_+$ with $\sum_{i=1}^{k+1} \lambda_i = 1$.
  a. If $\lambda_{k+1} = 1$, then $\lambda_1 = \dots = \lambda_k = 0$ and the inequality reduces to $f(x_{k+1}) \le f(x_{k+1})$, which always holds.
  b. If $\lambda_{k+1} < 1$, then the weights $\frac{\lambda_i}{1-\lambda_{k+1}}$, $i = 1, \dots, k$, are nonnegative and sum to $1$, so both the base case and the inductive hypothesis apply:
$$\begin{aligned}
f\left(\sum_{i=1}^{k+1} \lambda_i x_i\right) &= f\left(\sum_{i=1}^{k} \lambda_i x_i + \lambda_{k+1} x_{k+1}\right) \\
&= f\left((1-\lambda_{k+1}) \sum_{i=1}^{k} \frac{\lambda_i}{1-\lambda_{k+1}} x_i + \lambda_{k+1} x_{k+1}\right) \\
&\le (1-\lambda_{k+1})\, f\left(\sum_{i=1}^{k} \frac{\lambda_i}{1-\lambda_{k+1}} x_i\right) + \lambda_{k+1} f(x_{k+1}) && \text{(convexity, case } m = 2\text{)} \\
&\le (1-\lambda_{k+1}) \sum_{i=1}^{k} \frac{\lambda_i}{1-\lambda_{k+1}} f(x_i) + \lambda_{k+1} f(x_{k+1}) && \text{(inductive hypothesis)} \\
&= \sum_{i=1}^{k} \lambda_i f(x_i) + \lambda_{k+1} f(x_{k+1}) = \sum_{i=1}^{k+1} \lambda_i f(x_i).
\end{aligned}$$
The inequality holds for $m = k + 1$.
By mathematical induction, Jensen’s inequality is proven.
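As an optional numerical sanity check (not part of the proof), the sketch below samples random convex combinations and verifies Jensen's inequality for the illustrative convex function $f(x) = x^2$; the function choice, sample counts, and tolerance are assumptions made only for this example.

```python
import numpy as np

# Sanity check of Jensen's inequality for the (illustrative) convex function f(x) = x**2.
rng = np.random.default_rng(0)

def f(x):
    return x ** 2

for _ in range(1000):
    m = rng.integers(2, 10)
    x = rng.normal(size=m)        # points x_1, ..., x_m
    lam = rng.random(m)
    lam /= lam.sum()              # convex weights: lambda_i >= 0, sum to 1
    lhs = f(np.dot(lam, x))       # f(sum_i lambda_i x_i)
    rhs = np.dot(lam, f(x))       # sum_i lambda_i f(x_i)
    assert lhs <= rhs + 1e-9, (lhs, rhs)

print("Jensen's inequality held on all sampled instances.")
```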
Problem 2
(i) Let $f_1, f_2, \dots, f_m$ be convex functions and $\lambda_1, \lambda_2, \dots, \lambda_m \in \mathbb{R}_+$. Show that $f := \sum_{i=1}^{m} \lambda_i f_i$ is convex on $\operatorname{dom}(f) := \bigcap_{i=1}^{m} \operatorname{dom}(f_i)$.
Since each $f_i$ is convex, for any $i$, any $x, y \in \operatorname{dom}(f)$, and any $\theta \in [0, 1]$, we have:
$$\begin{aligned}
f_i(\theta x + (1-\theta) y) &\le \theta f_i(x) + (1-\theta) f_i(y) \\
\lambda_i f_i(\theta x + (1-\theta) y) &\le \lambda_i \theta f_i(x) + \lambda_i (1-\theta) f_i(y) && (\lambda_i \ge 0) \\
\sum_{i=1}^{m} \lambda_i f_i(\theta x + (1-\theta) y) &\le \theta \sum_{i=1}^{m} \lambda_i f_i(x) + (1-\theta) \sum_{i=1}^{m} \lambda_i f_i(y) && \text{(summing over } i\text{)} \\
\therefore\quad f(\theta x + (1-\theta) y) &\le \theta f(x) + (1-\theta) f(y).
\end{aligned}$$
Therefore, $f$ is convex.
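As an optional sanity check for (i), the sketch below verifies the convexity inequality numerically for an illustrative nonnegative weighted sum; the particular functions ($|x|$, $x^2$, $e^x$) and weights are assumptions chosen for the example.

```python
import numpy as np

# Sanity check for Problem 2(i): a nonnegative weighted sum of convex functions
# satisfies the convexity inequality. Illustrative choices: f1 = |x|, f2 = x**2,
# f3 = exp(x), with arbitrary nonnegative weights.
rng = np.random.default_rng(1)
fs = [np.abs, np.square, np.exp]
lam = np.array([0.5, 2.0, 0.3])   # lambda_i >= 0

def f(x):
    return sum(l * fi(x) for l, fi in zip(lam, fs))

for _ in range(1000):
    x, y = rng.normal(size=2)
    theta = rng.random()
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    assert lhs <= rhs + 1e-9, (lhs, rhs)

print("Convexity inequality held for the weighted sum on all samples.")
```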
(ii) Let $f$ be a convex function with $\operatorname{dom}(f) \subseteq \mathbb{R}^d$, and let $g: \mathbb{R}^m \to \mathbb{R}^d$ be an affine function, meaning that $g(x) = Ax + b$ for some matrix $A \in \mathbb{R}^{d \times m}$ and some vector $b \in \mathbb{R}^d$. Show that the function $f \circ g$ (which maps $x$ to $f(Ax + b)$) is convex on $\operatorname{dom}(f \circ g) := \{x \in \mathbb{R}^m : g(x) \in \operatorname{dom}(f)\}$.
For any $x, y \in \operatorname{dom}(f \circ g)$ and $\theta \in [0, 1]$:
$$(f \circ g)(\theta x + (1-\theta) y) = f(g(\theta x + (1-\theta) y)).$$
Since $g$ is an affine function:
$$\begin{aligned}
g(\theta x + (1-\theta) y) &= A(\theta x + (1-\theta) y) + b = \theta(Ax + b) + (1-\theta)(Ay + b) = \theta g(x) + (1-\theta) g(y), \\
f(\theta g(x) + (1-\theta) g(y)) &\le \theta f(g(x)) + (1-\theta) f(g(y)) && \text{(convexity of } f\text{, since } g(x), g(y) \in \operatorname{dom}(f)\text{)} \\
\therefore\quad (f \circ g)(\theta x + (1-\theta) y) &\le \theta (f \circ g)(x) + (1-\theta) (f \circ g)(y).
\end{aligned}$$
Therefore, $f \circ g$ is convex.
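Similarly, an optional numerical check for (ii), assuming an illustrative convex $f$ (the Euclidean norm) and a random affine map $g(x) = Ax + b$:

```python
import numpy as np

# Sanity check for Problem 2(ii): f composed with an affine map g(x) = A x + b
# satisfies the convexity inequality. Illustrative f: the Euclidean norm.
rng = np.random.default_rng(2)
d, m = 4, 3
A = rng.normal(size=(d, m))
b = rng.normal(size=d)

def fog(x):
    return np.linalg.norm(A @ x + b)   # (f o g)(x) with f(z) = ||z||_2

for _ in range(1000):
    x = rng.normal(size=m)
    y = rng.normal(size=m)
    theta = rng.random()
    lhs = fog(theta * x + (1 - theta) * y)
    rhs = theta * fog(x) + (1 - theta) * fog(y)
    assert lhs <= rhs + 1e-9, (lhs, rhs)

print("Convexity inequality held for f o g on all samples.")
```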
Problem 3
Show that the quadratic function $f(x) = x^\top Q x + b^\top x + c$, with symmetric matrix $Q$, is smooth with parameter $2\|Q\|$.
That is, we need to show that for all $x, y$:
$$\|\nabla f(x) - \nabla f(y)\| \le 2\|Q\|\,\|x - y\|.$$
Compute the gradient (using the symmetry of $Q$, so that $\nabla(x^\top Q x) = (Q + Q^\top)x = 2Qx$):
$$\nabla f(x) = 2Qx + b.$$
Therefore:
$$\begin{aligned}
\nabla f(x) - \nabla f(y) &= 2Q(x - y), \\
\|\nabla f(x) - \nabla f(y)\| &= \|2Q(x - y)\| \le 2\|Q\|\,\|x - y\|.
\end{aligned}$$
Thus, $f$ is smooth with parameter $L = 2\|Q\|$.
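As an optional numerical check, the sketch below draws a random symmetric $Q$ and verifies the Lipschitz bound on the gradient; the dimension, distribution, and tolerance are illustrative assumptions.

```python
import numpy as np

# Sanity check for Problem 3: for f(x) = x^T Q x + b^T x + c with symmetric Q,
# the gradient 2Qx + b is Lipschitz with constant 2*||Q|| (spectral norm).
rng = np.random.default_rng(3)
d = 5
M = rng.normal(size=(d, d))
Q = (M + M.T) / 2                      # random symmetric Q
b = rng.normal(size=d)
L = 2 * np.linalg.norm(Q, 2)           # 2 * spectral norm of Q

def grad_f(x):
    return 2 * Q @ x + b

for _ in range(1000):
    x = rng.normal(size=d)
    y = rng.normal(size=d)
    lhs = np.linalg.norm(grad_f(x) - grad_f(y))
    rhs = L * np.linalg.norm(x - y)
    assert lhs <= rhs + 1e-9, (lhs, rhs)

print("Gradient Lipschitz bound 2*||Q|| held on all samples.")
```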
Problem 4
Consider the projected gradient descent algorithm with a convex differentiable function $f: X \to \mathbb{R}$. Suppose that for some $t \ge 0$, $x_{t+1} = x_t$. Prove that in this case, $x_t$ is a minimizer of $f$ over the closed and convex set $X$.
Given $x_{t+1} = x_t$, the projected gradient update $x_{t+1} = \Pi_X(x_t - \eta \nabla f(x_t))$ gives
$$x_t = \Pi_X(x_t - \eta \nabla f(x_t)).$$
For the Euclidean projection onto a closed convex set $X$, the point $p = \Pi_X(z)$ satisfies $\langle p - z, x - p \rangle \ge 0$ for all $x \in X$. Applying this with $z = x_t - \eta \nabla f(x_t)$ and $p = x_t$:
$$\begin{aligned}
\langle x_t - (x_t - \eta \nabla f(x_t)), x - x_t \rangle &\ge 0 \quad \forall x \in X \\
\langle \eta \nabla f(x_t), x - x_t \rangle &\ge 0 \quad \forall x \in X \\
\langle \nabla f(x_t), x - x_t \rangle &\ge 0 \quad \forall x \in X,
\end{aligned}$$
where the last step uses $\eta > 0$. By the first-order characterization of convexity, for every $x \in X$,
$$f(x) \ge f(x_t) + \langle \nabla f(x_t), x - x_t \rangle \ge f(x_t).$$
Therefore, $x_t$ is a minimizer of $f$ over $X$.
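The following sketch is only an illustration of this fixed-point property (not part of the proof): it runs projected gradient descent on an assumed objective $f(x) = \|x - c\|^2$ over the unit ball and, once the iterate stops moving, checks $\langle \nabla f(x_t), x - x_t \rangle \ge 0$ on sampled points of $X$. The objective, constraint set, step size, and tolerances are all illustrative choices.

```python
import numpy as np

# Illustration of Problem 4: projected gradient descent on the assumed objective
# f(x) = ||x - c||^2 over the unit ball X = {x : ||x|| <= 1}, with c outside X.
# Once x_{t+1} = x_t (a fixed point), <grad f(x_t), x - x_t> >= 0 for x in X.
rng = np.random.default_rng(4)
c = np.array([2.0, 1.0])               # illustrative target outside the unit ball

def grad_f(x):
    return 2 * (x - c)

def project_ball(z):
    n = np.linalg.norm(z)
    return z if n <= 1 else z / n      # Euclidean projection onto the unit ball

eta = 0.1
x = np.zeros(2)
for _ in range(1000):
    x_new = project_ball(x - eta * grad_f(x))
    if np.linalg.norm(x_new - x) < 1e-10:   # fixed point: x_{t+1} = x_t (numerically)
        break
    x = x_new

# Verify the optimality condition <grad f(x_t), z - x_t> >= 0 on sampled z in X.
for _ in range(1000):
    z = project_ball(rng.normal(size=2))    # a point of X
    assert grad_f(x) @ (z - x) >= -1e-8

print("Fixed point:", x, "- first-order optimality condition held on all samples.")
```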