These are my notes from the second edition of “Reinforcement Learning: An Introduction” by Richard S. Sutton and Andrew G. Barto.

Chapter 1: Introduction

Chapter 2: Multi-armed Bandits

Chapter 3: Finite Markov Decision Processes

Chapter 4: Dynamic Programming

Chapter 5: Monte Carlo Methods

Chapter 6: Temporal-Difference Learning