迷麟の小站

Published 2025-09-03 | Updated 2025-10-19

#sdsc6007 Introduction. The Discrete-Time Dynamic System: the system has the form $x_{k+1} = f_k(x_k, u_k, w_k), \quad k = 0, 1, \ldots, N-1$, where: $k$ is the discrete time index; $N$ is the horizon, i.e. the number of times control is applied; $x_k$ is the state of the system, an element of the state set $S_k$; $u_k$ is the control/decision variable/action to be chosen at time $k$, selected from the set $U_k(x_k)$; $w_k$ is a random parameter (also called the disturbance); $f_k$ is the function describing how the state is updated. Assumption: the $w_k$ are mutually independent. Their probability distribution...

SDSC6007 Course 4-Markov Decision Processes (MDPs)

Published 2025-10-06 | Updated 2025-10-18

#sdsc6007 Elements of Reinforcement Learning Reinforcement learning includes the following five core elements: Agent and Environment: The agent performs actions, and the environment returns observations and rewards. Reward Signal: A scalar feedback signal indicating the agent’s performance at time t. Policy: Describes the agent’s behavior, mapping from states to actions. Value Function: Predicts the expected future reward (under a specific policy). Model: Predicts the b...

SDSC6012 - Question of Assignment 1

Published 2025-10-18 | Updated 2025-10-18

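#assignment #sdsc6012 Question 1. Trend component extraction (moving-average method): the trend component is extracted with a centered moving average, $\text{Trend}_t = \frac{1}{k} \sum_{i=t-m}^{t+m} x_i$, where $k$ is the window size (here 12, matching the annual cycle) and $m = \lfloor k/2 \rfloor$ is the half-window width of the centered moving average. Boundary handling: when $t < m$ or $t > n-m$, the mean is computed from the available data. Seasonal component extraction (period-averaging method): compute the detrended series $d_t = x_t - \text{Trend}_t$. For each position $j$ in the cycle ($j = 0, 1, \ldots, 11$), compute the average: $s_j = \frac{1}{N_j} \sum_{k=0}^{N_j-1} d_{j+12k}$ ...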

SDSC6015 Course 1-Introduction / Preliminaries of Stochastic Optimization

Published 2025-09-11 | Updated 2025-10-15

#sdsc6015 Course Introduction and Preliminary Stochastic Optimization. Main Problem: Given labeled training data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \mathcal{Y}$, find weights $\theta$ to minimize: $\min_{\theta} f(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell(\theta, (x_i, y_i)), \quad n \text{ extremely large}$. Object...

SDSC6015 Course 5

Published 2025-10-15 | Updated 2025-10-15

Mirror Descent. Mirror Descent review. Motivation: consider the simplex-constrained optimization problem $\min_{x \in \triangle_d} f(x)$, where the simplex is $\triangle_d := \{x \in \mathbb{R}^d : \sum_{i=1}^d x_i = 1,\ x_i \geq 0,\ \forall i\}$. Assume the gradient is bounded in the infinity norm: $\|\nabla f(x)\|_\infty = \max_{i=1,\ldots,d} |[\nabla f(x)]_i| \leq 1$. Notation: $x$ is the optimization variable, $d$ is the dimension, and $\triangle_d$ is the probability simplex. Geometric meaning: the simplex is the space of probability distributions; the constrained problem asks for a solution that minimizes the loss subject to the probability constraints...

SDSC6015 Course 4

Published 2025-10-14 | Updated 2025-10-15

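#sdsc6015 Projected Gradient Descent. Projected gradient descent is an algorithm for constrained optimization problems: after each gradient step it projects the iterate back onto the feasible set, so the constraint stays satisfied. Definition of the constrained optimization problem: the problem is formally defined as $\begin{aligned} &\min f(x) \\ &\text{subject to}\quad x \in X \end{aligned}$, where $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is the objective function, $X \subseteq \mathbb{R}^d$ is a closed convex set, and $x \in \mathbb{R}^d$ is the optimization variable. Geometric meaning: find the point that minimizes $f(x)$ subject to the constraint $x \in X$. Algorithm description: the projected gradient descent iteration is: For $t = 0, 1, 2, \ldots$ ...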
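The excerpt above cuts off before the iteration is written out, so the following is only a minimal sketch of the idea it describes (a gradient step followed by projection onto the feasible set). The feasible set (a Euclidean ball), the constant step size, the quadratic objective, and all function names are assumptions chosen for illustration, not taken from the post.

```python
import math

def project_onto_ball(x, radius):
    """Euclidean projection onto {x : ||x||_2 <= radius} (assumed feasible set X)."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [radius * xi / norm for xi in x]

def projected_gradient_descent(grad, project, x0, step_size, num_steps):
    """Alternate a gradient step with a projection back onto the feasible set."""
    x = project(x0)
    for _ in range(num_steps):
        # Unconstrained gradient step (constant step size is an assumption).
        y = [xi - step_size * gi for xi, gi in zip(x, grad(x))]
        # Project back so the iterate satisfies the constraint again.
        x = project(y)
    return x

# Hypothetical objective: f(x) = 0.5 * ||x - target||^2, gradient x - target.
target = [3.0, 4.0]
grad_f = lambda x: [xi - ti for xi, ti in zip(x, target)]
x_pgd = projected_gradient_descent(
    grad_f, lambda v: project_onto_ball(v, 1.0), [0.0, 0.0], 0.5, 100
)
```

Because the unconstrained minimizer `target` lies outside the unit ball, the iterates settle on the boundary point of the ball closest to it.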

SDSC6015 Course 3-Steepest Gradient Descent and Subgradient Descent

Published 2025-09-22 | Updated 2025-10-13

#sdsc6015 Review. Convex optimization problem: the general form of a convex optimization problem is $\min_{x \in \mathbb{R}^d} f(x)$, where $f$ is a convex function, $\mathbb{R}^d$ is a convex set, and $x^*$ is its minimizer: $x^* = \arg\min_{x \in \mathbb{R}^d} f(x)$. The update rule of gradient descent (GD) is $x_{k+1} = x_k - \eta_{k+1} \nabla f(x_k)$, where $x_k$ is the current parameter point, $\eta_k > 0$ is the step size (learning rate), and $x_{k+1}$ is the updated parameter point. Smooth Functions. Definition: if a function $f: \text{dom}(f) \to$ ...
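The gradient descent update rule in the excerpt above can be sketched in a few lines. This is an illustrative sketch only: it uses a constant step size in place of the per-iteration step size from the excerpt, and the quadratic test objective and all names are assumptions.

```python
def gradient_descent(grad, x0, step_size, num_steps):
    """Iterate x_{k+1} = x_k - eta * grad(x_k) with a constant step size eta."""
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical convex objective: f(x) = 0.5 * ||x - c||^2, whose gradient is x - c.
c = [1.0, -2.0]
grad_f = lambda x: [xi - ci for xi, ci in zip(x, c)]
x_final = gradient_descent(grad_f, x0=[0.0, 0.0], step_size=0.5, num_steps=50)
```

For this objective each step halves the distance to the minimizer `c`, so `x_final` ends up very close to `c`.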

SDSC6012 Course 5-Autoregressive models

Published 2025-10-10 | Updated 2025-10-11

#sdsc6012 Autoregressive Model (AR(p)). Definition and Form: A p-th order autoregressive model, denoted AR(p), has the form $x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + w_t$, where $w_t \sim \text{wn}(0, \sigma_w^2)$ is white noise with mean 0 and variance $\sigma_w^2$, and $\phi_1, \phi_2, \ldots, \phi_p$ (with $\phi_p \neq 0$) are the ...
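To make the AR(p) recursion in the excerpt above concrete, here is a minimal simulation sketch. It assumes Gaussian white noise (the definition only fixes mean and variance), a zero initial history, and a burn-in period that is discarded; the function name, parameters, and example coefficients are hypothetical.

```python
import random

def simulate_ar(phi, sigma_w, n, seed=0, burn_in=100):
    """Simulate n values of x_t = phi_1 x_{t-1} + ... + phi_p x_{t-p} + w_t."""
    rng = random.Random(seed)
    p = len(phi)
    x = [0.0] * p  # zero initial history (assumption)
    for _ in range(burn_in + n):
        w_t = rng.gauss(0.0, sigma_w)  # Gaussian white noise (assumption)
        # phi[j] multiplies the value j+1 steps back, i.e. phi_{j+1} * x_{t-(j+1)}.
        x_t = sum(phi[j] * x[-1 - j] for j in range(p)) + w_t
        x.append(x_t)
    return x[-n:]  # drop the initial history and the burn-in values

# Example coefficients chosen for illustration only.
series = simulate_ar([0.6, -0.3], sigma_w=1.0, n=200)
```

Fixing the seed keeps the run reproducible, and the burn-in reduces the influence of the arbitrary zero start on the returned values.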

SDSC6012 Course 5-Autoregressive Moving Average Models

Published 2025-10-10 | Updated 2025-10-11

#sdsc6012 Autoregressive Model (AR(p)). Definition and form: a $p$-th order autoregressive model, denoted $AR(p)$, has the form $x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \cdots + \phi_p x_{t-p} + w_t$, where $w_t \sim \text{wn}(0, \sigma_w^2)$ is white noise with mean 0 and variance $\sigma_w^2$, and $\phi_1, \phi_2, \ldots, \phi_p$ (with $\phi_p \neq 0$) are the autoregressive coefficients. Intuition: the current value $x_t$ is a linear combination of its own $p$ most recent past values plus a random disturbance. The model captures the "inertia" or "memory" of the series, i.e. the influence of the series' own history on its current state. Stationarity...

SDSC6007 - Assignment 1

Published 2025-09-28 | Updated 2025-10-11

SDSC6007 - Assignment 1 #assignment #sdsc6007 Question 1. Ash Ketchum is preparing for his next trip and packing up. His backpack has a maximum weight capacity of $z$, and he wants to fill it with different quantities of $N$ different items. Denote $v_i$: the value of the $i$-th type of item; $w_i$: the weight of the $i$-th type of item; $x_i$: the number of items of type $i$ that are loaded in the backpack. Ash Ketchum is trying to maximize the total value of all the items in his backpack, i.e. t...