SDSC6007 Course Information

#sdsc6007 #course-information

Course Overview

Course Code: SDSC6007 & SDSC8006

Course Name: Dynamic Programming & Reinforcement Learning

Semester: Semester A 2025

Instructor: Clint Chin Pang Ho

Email: clint.ho@cityu.edu.hk

Office: LAU-16-228

Office Hour: By appointment

Teaching Assistants (TAs):

  • Yanbo He (yanbohe3-c@my.cityu.edu.hk)

  • Ellen Yi Wong (ywong692-c@my.cityu.edu.hk)

  • Yuqi Zha (charlie.yqzha@my.cityu.edu.hk)

Assessment Method

| Component | Weight | Details |
| --- | --- | --- |
| Assignments | 20% | Two assignments (10% each), submitted online via Canvas in .pdf, .py, .mp4, or .txt format. |
| Midterm Exam | 20% | Closed-book exam. |
| Group Project | 30% | One group project; details below. |
| Final Exam | 30% | Closed-book exam. |
  • Late Submission Policy: If a submission is \( t \) days late (\( t > 0 \)), the maximum attainable score is \( (0.75)^t \times 100\% \); for example, a submission two days late can earn at most \( 0.75^2 \times 100\% = 56.25\% \).

  • GenAI Policy: Allowed for non-exam tasks (assignments and project). Must be properly cited. Students are fully accountable for all submitted materials.

Schedule and Teaching

| Topic Area | Key Content | Remarks |
| --- | --- | --- |
| Dynamic Programming Algorithm | Foundational concepts and frameworks. | Includes historical context on Richard Bellman's work. |
| Deterministic Systems & Shortest Path | Modeling sequential decision problems. | |
| Markov Decision Processes (MDPs) | Theory and applications in operations research and control. | |
| Value Iteration, Policy Iteration, Linear Programming | Solution methods for MDPs. | |
| Model-Free Prediction & Control | Learning without an explicit model of the environment. | |
| Value Function Approximation | Techniques for large-scale problems. | |
| Policy Gradient | Direct optimization of parameterized policies. | |
| Multi-Armed Bandits | Exploration vs. exploitation trade-offs. | |
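
To give a concrete flavour of the value iteration entry above, here is a minimal sketch of the Bellman optimality update on a toy two-state MDP. The transition array P, reward array R, and discount factor are invented for illustration only and are not part of the course materials.

```python
import numpy as np

# Toy 2-state, 2-action MDP; P[s, a, s'] and R[s, a] are made up for illustration.
P = np.array([[[0.8, 0.2],    # state 0, action 0
               [0.1, 0.9]],   # state 0, action 1
              [[0.5, 0.5],    # state 1, action 0
               [0.0, 1.0]]])  # state 1, action 1
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9        # discount factor
V = np.zeros(2)    # initial value-function guess

for _ in range(1000):
    # Bellman optimality update: Q(s, a) = R(s, a) + gamma * sum_s' P(s, a, s') * V(s')
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:  # stop once successive iterates agree numerically
        V = V_new
        break
    V = V_new

greedy_policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged Q-values
print("V* ~=", V, "| greedy policy:", greedy_policy)
```

The loop stops when successive value functions differ by less than a small tolerance, which is the standard convergence check for a discounted problem; the greedy policy read off at the end is the usual way to recover a decision rule from the converged values.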

Project Requirements (30% of Final Grade)

  • Nature: One group project focused on solving real-world problems using dynamic programming or reinforcement learning.

  • Key Guidelines:

    • Projects must be original and independently designed for this course. Reuse of content from other courses or papers is prohibited.
    • Use of GenAI tools requires explicit citation.
    • Submit via Canvas in allowed formats (.pdf, .py, .mp4, .txt).
  • Scoring: Based on scientific value and innovation; plagiarism will result in penalties.

Aims and Topics

  • Aims:

    • Understand concepts and principles of DP and RL.
    • Formulate problems as DP/RL models and implement solvers in Python.
    • Apply methodologies to real-world scenarios.
  • Topics:

    • Dynamic Programming Algorithm
    • Markov Decision Processes
    • Value Iteration and Policy Iteration
    • Model-Free Control
    • Value Function Approximation
    • Policy Gradient
    • Multi-Armed Bandits
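
As a small taste of the multi-armed bandit topic listed above, the sketch below runs epsilon-greedy action selection on a toy Bernoulli bandit. The arm means, epsilon, and horizon are made-up illustrative values, not course-provided parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.7])  # made-up Bernoulli success probabilities per arm
n_arms, eps, horizon = len(true_means), 0.1, 5000

counts = np.zeros(n_arms)     # number of pulls per arm
estimates = np.zeros(n_arms)  # running average reward per arm

for _ in range(horizon):
    # Explore a random arm with probability eps; otherwise exploit the best estimate so far.
    if rng.random() < eps:
        arm = int(rng.integers(n_arms))
    else:
        arm = int(np.argmax(estimates))
    reward = float(rng.random() < true_means[arm])  # Bernoulli reward draw
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean update

print("estimated means:", np.round(estimates, 3))
print("pull counts:    ", counts)
```

The eps parameter directly controls the exploration vs. exploitation trade-off noted in the schedule: a larger eps gathers more information about every arm at the cost of pulling suboptimal arms more often.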

Reference Books

  • Bertsekas, D.P. (2019). Reinforcement Learning and Optimal Control.

  • Sutton, R.S. & Barto, A.G. (2018). Reinforcement Learning: An Introduction.

  • Puterman, M.L. (2005). Markov Decision Processes: Discrete Stochastic Dynamic Programming.

  • Additional resources: lecture series by Silver (2015) and Brunskill (2019).