SDSC6015 - Assignment 1

#assignment #sdsc6015

Problem 1: Jensen’s inequality

Let $f$ be convex, $x_{1},\ldots,x_{m}\in\operatorname{dom}(f)$, and $\lambda_{1},\ldots,\lambda_{m}\in\mathbb{R}_{+}$ such that $\sum_{i=1}^{m}\lambda_{i}=1$. Show that

$$f\left( \sum_{i=1}^m \lambda_i x_i \right) \leq \sum_{i=1}^m \lambda_i f(x_i)$$

Proof.

  1. Base case $m=2$:
    For a convex function $f$, $\forall x, y \in \operatorname{dom}(f)$ and $\theta \in [0,1]$, the definition of convexity gives:

    $$f(\theta x + (1-\theta)y) \leq \theta f(x) + (1-\theta) f(y)$$

    Since $\lambda_1 + \lambda_2 = 1$, taking $\theta = \lambda_1$, $x = x_1$, $y = x_2$:

    $$f(\lambda_1 x_1 + \lambda_2 x_2) = f(\lambda_1 x_1 + (1 - \lambda_1) x_2) \leq \lambda_1 f(x_1) + (1 - \lambda_1) f(x_2) = \lambda_1 f(x_1) + \lambda_2 f(x_2)$$

    So the inequality holds for $m = 2$.

  2. Assume it holds for $m=k$.
    That is, for any $\lambda_{1},\ldots,\lambda_{k}\in\mathbb{R}_{+}$ such that $\sum_{i=1}^{k}\lambda_{i}=1$, we have:

    $$f\left( \sum_{i=1}^k \lambda_i x_i \right) \leq \sum_{i=1}^k \lambda_i f(x_i)$$

  3. For $m=k+1$, consider two cases:

    a. If $\lambda_{k+1}=1$:

    Then $\lambda_1 = \cdots = \lambda_k = 0$, and the inequality reduces to:

    $$f(x_{k+1}) \leq f(x_{k+1})$$

    This always holds.

    b. If $\lambda_{k+1}<1$:

    Applying the $m=2$ case with weights $1-\lambda_{k+1}$ and $\lambda_{k+1}$:

    $$\begin{align*} f\left( \sum_{i=1}^{k+1} \lambda_i x_i \right) & = f\left( \sum_{i=1}^{k} \lambda_i x_i + \lambda_{k+1} x_{k+1} \right)\\ & = f\left( (1 - \lambda_{k+1})\sum_{i=1}^{k} \frac{\lambda_i}{1 - \lambda_{k+1}} x_i + \lambda_{k+1} x_{k+1} \right) \\ & \leq (1 - \lambda_{k+1}) f\left(\sum_{i=1}^{k} \frac{\lambda_i}{1 - \lambda_{k+1}} x_i\right) + \lambda_{k+1} f(x_{k+1}) \end{align*}$$

    The weights $\frac{\lambda_i}{1 - \lambda_{k+1}}$ are nonnegative and sum to $1$, so the inductive hypothesis gives:

    $$(1 - \lambda_{k+1}) f\left(\sum_{i=1}^{k} \frac{\lambda_i}{1 - \lambda_{k+1}} x_i\right) \leq (1 - \lambda_{k+1}) \sum_{i=1}^{k} \frac{\lambda_i}{1 - \lambda_{k+1}} f(x_i) = \sum_{i=1}^{k} \lambda_i f(x_i)$$

    Therefore:

    $$f\left( \sum_{i=1}^{k+1} \lambda_i x_i \right) \leq \sum_{i=1}^{k}\lambda_i f(x_i) + \lambda_{k+1} f(x_{k+1}) = \sum_{i=1}^{k+1}\lambda_i f(x_i)$$

    The inequality holds for $m=k+1$.

By mathematical induction, Jensen’s inequality is proven.
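As a numerical sanity check (not part of the proof), the inequality can be verified for a sample convex function; the choice $f(x) = x^2$ and the particular points and weights below are illustrative assumptions:

```python
def jensen_gap(f, xs, lams):
    """RHS - LHS of Jensen's inequality; nonnegative when f is convex
    and the weights lams are nonnegative and sum to 1."""
    lhs = f(sum(l * x for l, x in zip(lams, xs)))
    rhs = sum(l * f(x) for l, x in zip(lams, xs))
    return rhs - lhs

# f(x) = x**2 is convex, so the gap should be >= 0 for any valid weights.
gap = jensen_gap(lambda x: x * x, [1.0, -2.0, 3.0], [0.2, 0.3, 0.5])
print(gap >= 0.0)  # → True
```

For an affine $f$ the gap is exactly zero, matching the fact that affine functions attain equality in Jensen's inequality.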

Problem 2

(i) Let $f_{1}, f_{2},\ldots, f_{m}$ be convex functions, $\lambda_{1},\lambda_{2},\ldots,\lambda_{m}\in\mathbb{R}_{+}$. Show that $f:=\sum_{i=1}^{m}\lambda_{i} f_{i}$ is convex on $\operatorname{dom}(f):=\bigcap_{i=1}^{m}\operatorname{dom}(f_{i})$.

Since each $f_i$ is convex, for any $x, y \in \operatorname{dom}(f)$, $\theta \in [0,1]$, and each $i$, we have:

$$f_i(\theta x + (1-\theta)y) \leq \theta f_i(x) + (1-\theta) f_i(y)$$

Multiplying by $\lambda_i \geq 0$ preserves the inequality, and summing over $i$:

$$\begin{align*} \lambda_i f_i(\theta x + (1-\theta)y) &\leq \lambda_i \theta f_i(x) + \lambda_i (1-\theta) f_i(y) \\ \sum_{i=1}^m \lambda_i f_i(\theta x + (1-\theta)y) &\leq \theta \sum_{i=1}^m \lambda_i f_i(x) + (1-\theta) \sum_{i=1}^m \lambda_i f_i(y) \\ \therefore f(\theta x + (1-\theta)y) &\leq \theta f(x) + (1-\theta) f(y) \end{align*}$$

Therefore, ff is convex.
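A quick numerical illustration of (i): with the hypothetical choices $f_1(x) = |x|$, $f_2(x) = x^2$ and weights $\lambda = (0.5, 2.0)$, the convexity inequality for the weighted sum can be checked directly:

```python
def combo_gap(fis, lams, x, y, theta):
    """RHS - LHS of the convexity inequality for f = sum_i lams[i]*fis[i];
    nonnegative when every fis[i] is convex and every weight is >= 0."""
    f = lambda z: sum(l * fi(z) for l, fi in zip(lams, fis))
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    return rhs - lhs

# f1(x) = |x|, f2(x) = x**2 are both convex; weights (0.5, 2.0) are >= 0.
print(combo_gap([abs, lambda x: x * x], [0.5, 2.0], -1.0, 3.0, 0.25) >= 0.0)  # → True
```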

(ii) Let $f$ be a convex function with $\operatorname{dom}(f)\subseteq\mathbb{R}^{d}$, and $g:\mathbb{R}^{m}\rightarrow\mathbb{R}^{d}$ an affine function, meaning that $g(x)=Ax+b$, for some matrix $A\in\mathbb{R}^{d\times m}$ and some vector $b\in\mathbb{R}^{d}$. Show that the function $f\circ g$ (that maps $x$ to $f(Ax+b)$) is convex on $\operatorname{dom}(f\circ g):=\left\{x\in\mathbb{R}^{m}:g(x)\in\operatorname{dom}(f)\right\}$.

$$(f \circ g)(\theta x + (1-\theta)y) = f(g(\theta x + (1-\theta)y))$$

Since $g$ is an affine function:

$$\begin{align*} g(\theta x + (1-\theta)y) &= A(\theta x + (1-\theta)y) + b \\ &= \theta (Ax + b) + (1-\theta)(Ay + b) \\ &= \theta g(x) + (1-\theta) g(y) \end{align*}$$

By the convexity of $f$, applied at the points $g(x), g(y) \in \operatorname{dom}(f)$:

$$\begin{align*} f(\theta g(x) + (1-\theta) g(y)) &\leq \theta f(g(x)) + (1-\theta) f(g(y)) \\ \therefore (f \circ g)(\theta x + (1-\theta)y) &\leq \theta (f \circ g)(x) + (1-\theta) (f \circ g)(y) \end{align*}$$

Therefore, $f \circ g$ is convex.
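Part (ii) can be illustrated in one dimension, where the affine map is $g(x) = ax + b$; the function $f(x) = x^2$ and the coefficients below are assumed for the example:

```python
def affine_composition_gap(f, a, b, x, y, theta):
    """RHS - LHS of the convexity inequality for h = f ∘ g with g(z) = a*z + b;
    nonnegative when f is convex, for any affine g."""
    h = lambda z: f(a * z + b)  # (f ∘ g)(z)
    lhs = h(theta * x + (1 - theta) * y)
    rhs = theta * h(x) + (1 - theta) * h(y)
    return rhs - lhs

# f(x) = x**2 composed with the affine map g(x) = -3x + 1 stays convex.
print(affine_composition_gap(lambda v: v * v, -3.0, 1.0, 0.0, 2.0, 0.4) >= 0.0)  # → True
```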

Problem 3

Show that the quadratic function $f(x)=x^{\top}Qx+b^{\top}x+c$, with symmetric matrix $Q$, is smooth with parameter $2\|Q\|$.

We need to show that the gradient is Lipschitz continuous with constant $2\|Q\|$:

$$\| \nabla f(x) - \nabla f(y) \| \leq 2 \|Q\| \| x - y \|$$

Compute the gradient (using the symmetry of $Q$):

$$\nabla f(x) = 2Qx + b$$

Therefore:

$$\begin{align*} \nabla f(x) - \nabla f(y) &= 2Q(x - y) \\ \| \nabla f(x) - \nabla f(y) \| &= \| 2Q(x - y) \| \leq 2 \|Q\| \| x - y \| \end{align*}$$

where the last step uses the definition of the operator norm, $\|Qv\| \leq \|Q\|\,\|v\|$.

Thus, $f$ is smooth with parameter $L = 2\|Q\|$.
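The bound can be checked numerically. A minimal pure-Python sketch for a hypothetical symmetric $2\times 2$ matrix $Q$, using the closed-form eigenvalues of a symmetric $2\times 2$ matrix for the spectral norm:

```python
import math

def grad_f(x, Q, b):
    """Gradient of f(x) = x^T Q x + b^T x + c for symmetric 2x2 Q: 2Qx + b."""
    return [2 * (Q[0][0] * x[0] + Q[0][1] * x[1]) + b[0],
            2 * (Q[1][0] * x[0] + Q[1][1] * x[1]) + b[1]]

def spectral_norm_sym2(Q):
    """Largest |eigenvalue| of a symmetric 2x2 matrix (closed form)."""
    half_tr = (Q[0][0] + Q[1][1]) / 2
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    disc = math.sqrt(half_tr ** 2 - det)  # always real for symmetric Q
    return max(abs(half_tr + disc), abs(half_tr - disc))

def grad_lipschitz_ratio(x, y, Q, b):
    """||grad f(x) - grad f(y)|| / ||x - y||; should never exceed 2*||Q||."""
    gx, gy = grad_f(x, Q, b), grad_f(y, Q, b)
    return math.hypot(gx[0] - gy[0], gx[1] - gy[1]) / math.hypot(x[0] - y[0], x[1] - y[1])

Q = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical symmetric example
b = [1.0, -1.0]
L = 2 * spectral_norm_sym2(Q)
print(grad_lipschitz_ratio([1.0, 0.0], [0.0, 1.0], Q, b) <= L)  # → True
```

The ratio attains $L$ exactly when $x - y$ is an eigenvector of the largest-magnitude eigenvalue, so the constant $2\|Q\|$ is tight.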

Problem 4

Consider the projected gradient descent algorithm with a convex differentiable function $f: X\rightarrow\mathbb{R}$. Suppose that for some $t\geq 0$, $x_{t+1}=x_{t}$. Prove that in this case, $x_{t}$ is a minimizer of $f$ over the closed and convex set $X$.

The projected gradient descent update is $x_{t+1} = \Pi_X(x_t - \eta \nabla f(x_t))$ with step size $\eta > 0$. Given $x_{t+1} = x_t$:

$$x_t = \Pi_X(x_t - \eta \nabla f(x_t))$$

The projection $y = \Pi_X(z)$ onto a closed convex set $X$ satisfies $\langle y - z, x - y \rangle \geq 0$ for all $x \in X$. Applying this with $z = x_t - \eta \nabla f(x_t)$ and $y = x_t$:

$$\begin{align*} \langle x_t - (x_t - \eta \nabla f(x_t)), x - x_t \rangle &\geq 0 \quad \forall x \in X \\ \langle \eta \nabla f(x_t), x - x_t \rangle &\geq 0 \quad \forall x \in X \\ \langle \nabla f(x_t), x - x_t \rangle &\geq 0 \quad \forall x \in X \end{align*}$$

where the last line uses $\eta > 0$. By the first-order characterization of convexity, for all $x \in X$:

$$f(x) \geq f(x_t) + \langle \nabla f(x_t), x - x_t \rangle \geq f(x_t)$$

Therefore, $x_t$ is a minimizer of $f$ over $X$.
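A small one-dimensional illustration of the fixed-point property, with the hypothetical problem of minimizing $f(x) = (x-3)^2$ over $X = [0, 1]$ (constrained minimizer $x^* = 1$):

```python
def project(x, lo, hi):
    """Euclidean projection onto the interval [lo, hi]."""
    return min(max(x, lo), hi)

def pgd_step(x, eta, grad, lo, hi):
    """One projected gradient descent step on X = [lo, hi]."""
    return project(x - eta * grad(x), lo, hi)

grad = lambda x: 2 * (x - 3)  # gradient of f(x) = (x - 3)**2
eta, x = 0.1, 0.0
for _ in range(50):
    x = pgd_step(x, eta, grad, 0.0, 1.0)

# The iterates reach the constrained minimizer x* = 1 and then stop moving:
print(x)                                      # → 1.0
print(pgd_step(x, eta, grad, 0.0, 1.0) == x)  # → True
```

Note that $\nabla f(1) = -4 \neq 0$: at a boundary minimizer the gradient need not vanish, only $\langle \nabla f(x_t), x - x_t \rangle \geq 0$ for all feasible $x$, which is exactly what the fixed-point argument uses.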