Course Description
Algorithms for automated learning from interactions with the environment to optimize long-term performance. Markov decision processes, dynamic programming, temporal-difference learning, and Monte Carlo reinforcement learning methods. Credit will not be given for both CSE 337 and CSE 437. Prerequisites: Math 231 and CSE 109.
Instructor: Hector Munoz-Avila (Fall 2019)
Textbook
Richard S. Sutton and Andrew G. Barto, "Reinforcement Learning: An Introduction", 2nd Edition, Adaptive Computation and Machine Learning series, A Bradford Book, MIT Press, 2018 (1st Edition 1998). ISBN 978-0262039249