SDSC6007 Course 1-Introduction to Dynamic Programming Algorithms
#sdsc6007 English / Chinese Introduction The Discrete-Time Dynamic System The system has the form: $x_{k+1} = f_k(x_k, u_k, w_k), \quad k = 0, 1, \ldots, N-1$, where: $k$: the discrete time index; $N$: the horizon, i.e. the number of times control is applied; $x_k$: the state of the system, an element of the state set $S_k$; $u_k$: the control/decision variable/action to be selected at time $k$, chosen from the set $U_k(x_k)$; $w_k$: a random parameter (also called the disturbance); $f_k$: the function describing how the state is updated. Assumption: the $w_k$ are mutually independent. Their probability distribution...
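To make the state update $x_{k+1} = f_k(x_k, u_k, w_k)$ concrete, here is a minimal Python sketch of a rollout over the horizon $N$. The inventory example (`inventory_step`, `order_up_to_two`) and the demand distribution are hypothetical illustrations, not taken from the course notes.

```python
import random

def simulate(f, policy, sample_w, x0, N):
    """Roll out x_{k+1} = f_k(x_k, u_k, w_k) for k = 0, ..., N-1."""
    states = [x0]
    x = x0
    for k in range(N):
        u = policy(k, x)   # control u_k, chosen as a function of the state x_k
        w = sample_w(k)    # disturbance w_k, drawn independently at each step
        x = f(k, x, u, w)  # state update
        states.append(x)
    return states

# Hypothetical example: x_k is stock on hand, u_k the order, w_k the demand.
def inventory_step(k, x, u, w):
    return max(0, x + u - w)

def order_up_to_two(k, x):
    return max(0, 2 - x)

rng = random.Random(0)  # fixed seed so the rollout is reproducible
trajectory = simulate(inventory_step, order_up_to_two,
                      lambda k: rng.choice([0, 1, 2]), x0=0, N=5)
```

The returned list has $N + 1$ entries, $x_0$ through $x_N$, because the control is applied $N$ times.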
SDSC6007 Course 4-Markov Decision Processes (MDPs)
#sdsc6007 English / Chinese Elements of Reinforcement Learning Reinforcement learning includes the following five core elements: Agent and Environment: The agent performs actions, and the environment returns observations and rewards. Reward Signal: A scalar feedback signal indicating the agent’s performance at time t. Policy: Describes the agent’s behavior, mapping from states to actions. Value Function: Predicts the expected future reward (under a specific policy). Model: Predicts the b...
SDSC6012 - Question of Assignment 1
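#assignment #sdsc6012 Question 1 Trend component extraction (moving average method) The trend component is extracted with a centered moving average: $\text{Trend}_t = \frac{1}{k} \sum_{i=t-m}^{t+m} x_i$, where: $k$ is the window size (here 12, matching the annual cycle); $m = \lfloor k/2 \rfloor$ is the half-window width of the centered moving average; boundary handling: when $t < m$ or $t > n-m$, the mean is computed from the available data. Seasonal component extraction (period averaging method) Compute the detrended series: $d_t = x_t - \text{Trend}_t$. For each position $j$ in the cycle ($j = 0, 1, \ldots, 11$), compute the average: $s_j = \frac{1}{N_j} \sum_{k=0}^{N_j-1} d_{j+12k}$...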
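The two steps above (centered moving average, then per-position averaging of the detrended series) can be sketched in plain Python as follows. The function names are hypothetical. One assumption to flag: this sketch divides each window by the number of points it actually contains, which matches the stated boundary rule but differs from the $1/k$ factor in the interior when $k$ is even (the window $t-m, \ldots, t+m$ holds $2m+1$ points).

```python
def centered_moving_average(x, k=12):
    """Trend_t as the mean of x over the window [t-m, t+m], with m = k // 2.

    Near the boundaries the window is truncated to the available data.
    Assumption: we divide by the window length, not by k.
    """
    m = k // 2
    n = len(x)
    trend = []
    for t in range(n):
        window = x[max(0, t - m): min(n, t + m + 1)]
        trend.append(sum(window) / len(window))
    return trend

def seasonal_means(x, trend, period=12):
    """s_j: average of the detrended values d_j, d_{j+period}, d_{j+2*period}, ..."""
    d = [xt - tt for xt, tt in zip(x, trend)]
    return [sum(d[j::period]) / len(d[j::period])
            for j in range(min(period, len(d)))]

# Small illustration with a short window so the numbers are easy to check.
example_trend = centered_moving_average([1, 2, 3, 4, 5], k=2)
```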
SDSC6015 Course 1-Introduction / Preliminaries of Stochastic Optimization
#sdsc6015 English / Chinese Course Introduction and Preliminary Stochastic Optimization Main Problem Given labeled training data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \mathcal{Y}$, find weights $\theta$ to minimize: $\min_{\theta} f(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell(\theta, (x_i, y_i)), \quad n \text{ extremely large}$ Object...
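As a rough illustration of the objective $f(\theta) = \frac{1}{n} \sum_i \ell(\theta, (x_i, y_i))$, the sketch below evaluates the full average and a subsampled estimate of it. The squared loss with a linear predictor, the toy data, and the helper names are assumptions made for the example; the excerpt leaves $\ell$ generic.

```python
import random

def empirical_risk(theta, data, loss):
    """f(theta) = (1/n) * sum_i loss(theta, (x_i, y_i))."""
    return sum(loss(theta, (x, y)) for x, y in data) / len(data)

def squared_loss(theta, example):
    """Assumed loss: 0.5 * (<theta, x> - y)^2 for a linear predictor."""
    x, y = example
    pred = sum(t * xi for t, xi in zip(theta, x))
    return 0.5 * (pred - y) ** 2

def minibatch_risk(theta, data, loss, batch_size, rng):
    """Estimate f(theta) from a random subsample, useful when n is very large."""
    batch = rng.sample(data, batch_size)
    return empirical_risk(theta, batch, loss)

# Toy data generated by y = x[0] + 2 * x[1].
data = [((1.0, 0.0), 1.0), ((0.0, 1.0), 2.0), ((1.0, 1.0), 3.0)]
risk_at_truth = empirical_risk((1.0, 2.0), data, squared_loss)
risk_at_zero = empirical_risk((0.0, 0.0), data, squared_loss)
```

Sampling a small batch trades accuracy of the estimate for cost per evaluation, which is the reason stochastic methods are attractive when $n$ is extremely large.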
SDSC6015 Course 5
Mirror Descent Review of Mirror Descent Motivation Consider the simplex-constrained optimization problem: $\min_{x \in \triangle_d} f(x)$, where the simplex is $\triangle_d := \{x \in \mathbb{R}^d : \sum_{i=1}^d x_i = 1, x_i \geq 0, \forall i\}$. Assume the infinity norm of the gradient is bounded: $\|\nabla f(x)\|_\infty = \max_{i=1,\ldots,d} |[\nabla f(x)]_i| \leq 1$. Notation: $x$ is the optimization variable, $d$ is the dimension, and $\triangle_d$ is the probability simplex. Geometric interpretation: the simplex is the space of probability distributions, and the constrained problem asks for a solution that minimizes the loss under the probability constraints...
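The two conditions in this setup, membership in the simplex $\triangle_d$ and the bound $\|\nabla f(x)\|_\infty \leq 1$, are easy to check numerically. The sketch below is only an illustration; the helper names, the tolerance, and the linear objective $f(x) = \langle c, x \rangle$ are assumptions, not part of the notes.

```python
def in_simplex(x, tol=1e-9):
    """True if all entries are nonnegative and they sum to 1 (up to tol)."""
    return all(xi >= -tol for xi in x) and abs(sum(x) - 1.0) <= tol

def inf_norm(g):
    """Infinity norm: the largest absolute entry of g."""
    return max(abs(gi) for gi in g)

# Assumed example: f(x) = <c, x>, whose gradient is the constant vector c.
c = [0.5, -1.0, 0.25]
x = [0.2, 0.3, 0.5]
feasible = in_simplex(x)
gradient_bounded = inf_norm(c) <= 1.0
```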
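SDSC6015 Course 4
#sdsc6015 English / Chinese Projected Gradient Descent Projected gradient descent is an algorithm for constrained optimization problems: after each gradient step it projects the iterate back onto the feasible set so that the constraints stay satisfied. Definition of the constrained optimization problem The constrained optimization problem is formally defined as: $\min f(x)$ subject to $x \in X$, where: $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is the objective function; $X \subseteq \mathbb{R}^d$ is a closed convex set; $x \in \mathbb{R}^d$ is the optimization variable. Geometric interpretation: find the point that minimizes $f(x)$ while satisfying the constraint $x \in X$. Algorithm description The projected gradient descent iteration: For $t = 0, 1, 2, \ldots$ ...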
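The "gradient step, then project back onto $X$" idea can be sketched as below. The excerpt is cut off before the iteration is written out, so the exact form used here (a fixed step size and a box-shaped feasible set, where projection is coordinate-wise clipping) is an assumption chosen to keep the example small; the function names are hypothetical.

```python
def project_box(y, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d: clip each coordinate."""
    return [min(max(yi, lo), hi) for yi in y]

def projected_gradient_descent(grad, project, x0, step, iters):
    """Repeat: take a gradient step, then project back onto the feasible set."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        y = [xi - step * gi for xi, gi in zip(x, g)]  # unconstrained step
        x = project(y)                                # restore feasibility
    return x

# Assumed example: minimize ||x - target||^2 over the box [-1, 1]^3.
target = [2.0, -3.0, 0.5]
grad = lambda x: [2.0 * (xi - ci) for xi, ci in zip(x, target)]
x_hat = projected_gradient_descent(
    grad, lambda y: project_box(y, -1.0, 1.0), [0.0] * 3, step=0.25, iters=50)
```

For this objective the minimizer is the projection of `target` onto the box, so the first two coordinates end up on the boundary while the third reaches its unconstrained optimum.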
SDSC6015 Course 3-Steepest Gradient Descent and Subgradient Descent
#sdsc6015 English / Chinese Review Convex optimization problem The general form of a convex optimization problem is: $\min_{x \in \mathbb{R}^d} f(x)$, where $f$ is a convex function, $\mathbb{R}^d$ is a convex set, and $x^*$ is its minimizer: $x^* = \arg\min_{x \in \mathbb{R}^d} f(x)$. The update rule of gradient descent (GD) is: $x_{k+1} = x_k - \eta_{k+1} \nabla f(x_k)$, with $x_k$: the current parameter point; $\eta_k > 0$: the step size (learning rate); $x_{k+1}$: the updated parameter point. Smooth Functions Definition: if a function $f: \text{dom}(f) \to \mathbb{R}$...
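The update $x_{k+1} = x_k - \eta_{k+1} \nabla f(x_k)$ translates almost line for line into code. In the sketch below, the quadratic objective and the constant step size are assumptions picked so the iterates visibly converge; `gradient_descent` is a hypothetical helper name.

```python
def gradient_descent(grad, x0, step_sizes):
    """Apply x_{k+1} = x_k - eta_{k+1} * grad(x_k) for each step size in turn."""
    x = list(x0)
    for eta in step_sizes:
        g = grad(x)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x

# Assumed example: f(x) = (x_1 - 1)^2 + 2 * (x_2 + 2)^2, minimized at (1, -2).
grad_f = lambda x: [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)]
x_final = gradient_descent(grad_f, [0.0, 0.0], [0.2] * 100)
```

Passing the step sizes as a sequence keeps the sketch agnostic about whether $\eta_k$ is constant or decaying.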
SDSC6012 Course 5-Autoregressive models
#sdsc6012 English / Chinese Autoregressive Model (AR(p)) Definition and Form A p-th order autoregressive model, denoted as AR(p), has the form: $x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + w_t$, where: $w_t \sim \text{wn}(0, \sigma_w^2)$ is white noise with mean 0 and variance $\sigma_w^2$. $\phi_1, \phi_2, \ldots, \phi_p$ (with $\phi_p \neq 0$) are the ...
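A short simulation may help make the AR(p) recursion tangible. This is a sketch under stated assumptions: the white noise is taken to be Gaussian (the definition only fixes its mean and variance), the series is started from zeros, and a burn-in period is discarded to reduce the effect of that start. The function names and default values are hypothetical.

```python
import random

def ar_step(phi, history, w):
    """x_t = phi_1 * x_{t-1} + ... + phi_p * x_{t-p} + w_t.

    `history` holds past values with the most recent one last.
    """
    return sum(c * history[-i] for i, c in enumerate(phi, start=1)) + w

def simulate_ar(phi, sigma_w, n, seed=0, burn_in=100):
    """Simulate n values of an AR(p) process (Gaussian noise is an assumption)."""
    rng = random.Random(seed)
    p = len(phi)
    x = [0.0] * p                        # assumed zero initial conditions
    for _ in range(n + burn_in):
        w = rng.gauss(0.0, sigma_w)      # w_t with mean 0, std sigma_w
        x.append(ar_step(phi, x, w))
    return x[p + burn_in:]               # drop initial values and burn-in

series = simulate_ar([0.5, -0.2], sigma_w=1.0, n=20)
```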
SDSC6012 Course 5-Autoregressive Moving Average Models
#sdsc6012 English / Chinese Autoregressive Model (AR(p)) Definition and Form An autoregressive model of order $p$, denoted $AR(p)$, has the form: $x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + w_t$, where: $w_t \sim \text{wn}(0, \sigma_w^2)$ is white noise with mean 0 and variance $\sigma_w^2$. $\phi_1, \phi_2, \ldots, \phi_p$ (with $\phi_p \neq 0$) are the autoregressive coefficients. Intuition: the current value $x_t$ is a linear combination of its own $p$ past values plus a random disturbance. It captures the "inertia" or "memory" of the series, i.e. the influence of the series' own history on its current state. Stationary...
SDSC6007 - Assignment 1
#assignment #sdsc6007 Question 1 Ash Ketchum is preparing for his next trip and packing up. His backpack has a maximum weight capacity of $z$, and he wants to fill it with different quantities of $N$ different items. Denote $v_i$: the value of the $i$-th type of item; $w_i$: the weight of the $i$-th item; $x_i$: the number of items of type $i$ that are loaded in the backpack. Ash Ketchum is trying to maximize the total value of all the items in his backpack, i.e. t...
