作业初稿
SDSC6015 - Assignment 1
Problem 1: Jensen’s inequality
Let f be convex, x1,…,xm∈dom(f),λ1,…,λm∈R+ such that ∑i=1mλi=1 . Show that
f(i=1∑mλixi)≤i=1∑mλif(xi)
Proof.
-
对于m=2:
令 λ1+λ2=1,有:
f(λ1x1+λ2x2)f(λ1x1+(1−λ1)x2)≤λ1f(x1)+λ2f(x2)≤λ1f(x1)+(1−λ1)f(x2)
对于凸函数f,∀x,y∈dom(f) 和 θ∈[0,1],有:
f(θx+(1−θ)y)≤θf(x)+(1−θ)f(y)
使用m=2时成立
-
假设对于 m=k 成立
即对于 m=k, λ1,…,λm∈R+ such that ∑i=1mλi=1, 有:
f(i=1∑kλixi)≤i=1∑kλif(xi)
-
对于 m=k+1,有:
a. 对于λk+1=1:
则有:
f(xi)≤f(xi)
恒成立。
b. 对于λk+1<1:
则有:
f(i=1∑k+1λixi)f(i=1∑k+1λixi)∵f(i=1∑kλixi)∴(1−λk+1)f(i=1∑k1−λk+1λixi)∴f(i=1∑k+1λixi)=f(i=1∑kλixi+λk+1xk+1)=f((1−λk+1)i=1∑k1−λk+1λixi+λk+1xk+1)≤(1−λk+1)f(i=1∑k1−λk+1λixi)+λk+1f(xk+1)≤i=1∑kλif(xi)≤(1−λk+1)i=1∑k1−λk+1λif(xi)=i=1∑kλif(xi)≤i=1∑kλif(xi)+λk+1f(xk+1)=i=1∑k+1λif(xi)
不等式对 m=k+1 成立。
由数学归纳法,Jensen 不等式得证。
Problem 2
(i) Let f1,f2,…,fm be convex functions, λ1,λ2,…,λm∈R+ . Show that f:=∑i=1mλifi is convex on dom(f):=⋂i=1mdom(fi) .
由于每个 fi 是凸函数,对于任意 i,有:
fi(θx+(1−θ)y)λifi(θx+(1−θ)y)i=1∑mλifi(θx+(1−θ)y)∴f(θx+(1−θ)y)≤θfi(x)+(1−θ)fi(y)≤λiθfi(x)+λi(1−θ)fi(y)≤θi=1∑mλifi(x)+(1−θ)i=1∑mλifi(y)≤θf(x)+(1−θ)f(y)
因此,f 是凸函数。
(ii) Let f be a convex function with dom(f)⊆Rd,g:Rm→Rd an affine function, meaning that g(x)=Ax+b , for some matrix A∈Rd×m and some vector b∈Rd . Show that the function f∘g (that maps x to f(Ax+b)) is convex on dom(f∘g):={x∈Rm:g(x)∈dom(f)} .
(f∘g)(θx+(1−θ)y)=f(g(θx+(1−θ)y))
由于 g 是仿射函数:
g(θx+(1−θ)y)f(θg(x)+(1−θ)g(y))∴(f∘g)(θx+(1−θ)y)=A(θx+(1−θ)y)+b=θ(Ax+b)+(1−θ)(Ay+b)=θg(x)+(1−θ)g(y)≤θf(g(x))+(1−θ)f(g(y))≤θ(f∘g)(x)+(1−θ)(f∘g)(y)
f∘g 是凸函数。
Problem 3
Show that the quadratic function f(x)=x⊤Qx+b⊤x+c , with symmetric matrix Q, is smooth with parameter 2||Q||.
∥∇f(x)−∇f(y)∥≤2∥Q∥∥x−y∥
计算梯度:
∇f(x)=2Qx+b
因此:
∇f(x)−∇f(y)∥∇f(x)−∇f(y)∥=2Q(x−y)=∥2Q(x−y)∥≤2∥Q∥∥x−y∥
即f 是光滑的,参数 L=2∥Q∥。
Problem 4
Consider the projected gradient descent algorithm with a convex differentiable function f:X→R . Suppose that for some t⩾0,xt+1=xt . Prove that in this case, xt is a minimizer of f over the closed and convex set X.
给定 xt+1=xt,所以:
∵xt+1=xt∴xt=ΠX(xt−η∇f(xt))
令 z=xt−η∇f(xt), y=xt,所以:
⟨xt−(xt−η∇f(xt)),x−xt⟩⟨η∇f(xt),x−xt⟩⟨∇f(xt),x−xt⟩≥0∀x∈X≥0∀x∈X≥0∀x∈X
因此,xt 是 f 在 X 上的最小化点。