SDSC6007 Course Information
Course Overview
Course Code: SDSC6007 & SDSC8006
Course Name: Dynamic Programming & Reinforcement Learning
Semester: Semester A 2025
Instructor: Clint Chin Pang Ho
Email: clint.ho@cityu.edu.hk
Office: LAU-16-228
Office Hour: By appointment
Teaching Assistants (TAs):
- Yanbo He (yanbohe3-c@my.cityu.edu.hk)
- Ellen Yi Wong (ywong692-c@my.cityu.edu.hk)
- Yuqi Zha (charlie.yqzha@my.cityu.edu.hk)
Assessment Method
| Component | Weight | Details |
|---|---|---|
| Assignments | 20% | Two assignments (10% each). Submitted online via Canvas in .pdf, .py, .mp4, or .txt format. |
| Midterm Exam | 20% | Closed-book exam. |
| Group Project | 30% | One group project. Details below. |
| Final Exam | 30% | Closed-book exam. |
- Late Submission Policy: If a submission is late by ( ) days (( )), the maximum score is ( ).
- GenAI Policy: Allowed for non-exam tasks (assignments and the project). Use must be properly cited, and students are fully accountable for all submitted materials.
Schedule and Teaching
| Topic Area | Key Content | Remarks |
|---|---|---|
| Dynamic Programming Algorithm | Foundational concepts and frameworks. | Includes historical context on Richard Bellman. |
| Deterministic Systems & Shortest Path | Modeling sequential decision problems. | |
| Markov Decision Processes (MDPs) | Theory and applications in operations research and control. | |
| Value Iteration, Policy Iteration, Linear Programming | Solution methodologies for MDPs. | A minimal example follows this table. |
| Model-Free Prediction & Control | Learning without explicit environment models. | |
| Value Function Approximation | Techniques for large-scale problems. | |
| Policy Gradient | Direct optimization of parameterized policies. | |
| Multi-Armed Bandits | Exploration vs. exploitation trade-offs. | |
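To give a flavour of the solution methodologies listed above, here is a minimal value-iteration sketch in Python. The toy MDP (two states, two actions, and the transition probabilities, rewards, and discount factor) is made up for illustration and is not course material.

```python
# Minimal value iteration on a toy 2-state, 2-action MDP (illustrative numbers only).
import numpy as np

gamma = 0.9                                  # discount factor (assumed for the example)
P = np.array([[[0.8, 0.2], [0.1, 0.9]],      # P[s, a, s'] = transition probability
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],                    # R[s, a] = expected immediate reward
              [0.5, 2.0]])

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') V(s')
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:     # stop once the values are numerically a fixed point
        V = V_new
        break
    V = V_new

print("V* ≈", V, "| greedy policy:", Q.argmax(axis=1))
```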
Project Requirements (30% of Final Grade)
- Nature: One group project focused on solving real-world problems using dynamic programming or reinforcement learning.
- Key Guidelines:
  - Projects must be original and independently designed for this course. Reuse of content from other courses or papers is prohibited.
  - Use of GenAI tools requires explicit citation.
  - Submit via Canvas in an allowed format (.pdf, .py, .mp4, .txt).
- Scoring: Based on scientific value and innovation; plagiarism will result in penalties.
Aims and Topics
- Aims:
  - Understand concepts and principles of DP and RL.
  - Formulate problems as DP/RL models and implement solvers in Python (see the sketch after this list).
  - Apply methodologies to real-world scenarios.
- Topics:
  - Dynamic Programming Algorithm
  - Markov Decision Processes
  - Value Iteration and Policy Iteration
  - Model-Free Control
  - Value Function Approximation
  - Policy Gradient
  - Multi-Armed Bandits
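As a small illustration of the Python implementation aim and the multi-armed bandits topic, the following epsilon-greedy sketch balances exploration and exploitation on a toy bandit; the arm means, epsilon, and horizon are assumed values chosen only for illustration.

```python
# Epsilon-greedy action selection on a toy 3-armed bandit (illustrative parameters only).
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])   # hidden expected reward of each arm (made up)
epsilon, n_steps = 0.1, 10_000

counts = np.zeros(len(true_means))       # number of pulls per arm
estimates = np.zeros(len(true_means))    # running mean reward per arm

for _ in range(n_steps):
    if rng.random() < epsilon:           # explore: try a random arm
        a = rng.integers(len(true_means))
    else:                                # exploit: pull the arm with the best current estimate
        a = int(np.argmax(estimates))
    reward = rng.normal(true_means[a], 1.0)
    counts[a] += 1
    estimates[a] += (reward - estimates[a]) / counts[a]   # incremental mean update

print("estimated arm means:", np.round(estimates, 2))
```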
Reference Books
- Bertsekas, D. P. (2019). Reinforcement Learning and Optimal Control.
- Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction.
- Puterman, M. L. (2005). Markov Decision Processes: Discrete Stochastic Dynamic Programming.
- Additional resources: the Silver (2015) and Brunskill (2019) reinforcement learning lecture series.