Section 1: Welcome
Lecture 1 | Introduction | 00:03:01
Lecture 2 | Course Outline and Big Picture | 00:07:45
Lecture 3 | Where to get the Code | 00:04:27
Lecture 4 | Anyone Can Succeed in this Course | 00:11:46
Lecture 5 | Warmup | 00:15:26
Section 2: Return of the Multi-Armed Bandit
Section 3: High Level Overview of Reinforcement Learning
Lecture 1 | What is Reinforcement Learning | 00:07:58
Lecture 2 | On Unusual or Unexpected Strategies of RL | 00:05:59
Lecture 3 | From Bandits to Full Reinforcement Learning | 00:08:33
Section 4: Markov Decision Processes
Lecture 1 | MDP Section Introduction | 00:06:11
Lecture 2 | Gridworld | 00:12:25
Lecture 3 | Choosing Rewards | 00:03:49
Lecture 4 | The Markov Property | 00:06:02
Lecture 5 | Markov Decision Processes (MDPs) | 00:14:33
Lecture 6 | Future Rewards | 00:09:24
Lecture 7 | Value Functions | 00:04:58
Lecture 8 | The Bellman Equation (pt 1) | 00:08:38
Lecture 9 | The Bellman Equation (pt 2) | 00:06:33
Lecture 10 | The Bellman Equation (pt 3) | 00:06:01
Lecture 11 | Bellman Examples | 00:22:25
Lecture 12 | Optimal Policy and Optimal Value Function (pt 1) | 00:09:08
Lecture 13 | Optimal Policy and Optimal Value Function (pt 2) | 00:04:00
Lecture 14 | MDP Summary | 00:02:49
Section 5: Dynamic Programming
Lecture 1 | Intro to Dynamic Programming and Iterative Policy Evaluation | 00:02:58
Lecture 2 | Designing Your RL Program | 00:04:50
Lecture 3 | Gridworld in Code | 00:11:28
Lecture 4 | Iterative Policy Evaluation in Code | 00:12:08
Lecture 5 | Windy Gridworld in Code | 00:07:39
Lecture 6 | Iterative Policy Evaluation for Windy Gridworld in Code | 00:07:04
Lecture 7 | Policy Improvement | 00:02:42
Lecture 8 | Policy Iteration | 00:01:51
Lecture 9 | Policy Iteration in Code | 00:08:18
Lecture 10 | Policy Iteration in Windy Gridworld | 00:08:41
Lecture 11 | Value Iteration | 00:03:49
Lecture 12 | Value Iteration in Code | 00:06:26
Lecture 13 | Dynamic Programming Summary | 00:05:05
Section 6: Monte Carlo
Lecture 1 | Monte Carlo Intro |
Lecture 2 | Monte Carlo Policy Evaluation | 00:05:36
Lecture 3 | Monte Carlo Policy Evaluation in Code |
Lecture 4 | Policy Evaluation in Windy Gridworld | 00:03:29
Lecture 5 | Monte Carlo Control | 00:05:49
Lecture 6 | Monte Carlo Control in Code | 00:04:04
Lecture 7 | Monte Carlo Control without Exploring Starts | 00:02:50
Lecture 8 | Monte Carlo Control without Exploring Starts in Code | 00:02:51
Lecture 9 | Monte Carlo Summary |
Section 7: Temporal Difference Learning
Lecture 1 | Temporal Difference Intro | 00:01:33
Lecture 2 | TD(0) Prediction | 00:03:37
Lecture 3 | TD(0) Prediction in Code | 00:02:27
Lecture 4 | SARSA | 00:05:06
Lecture 5 | SARSA in Code | 00:03:38
Lecture 6 | Q Learning | 00:02:56
Lecture 7 | Q Learning in Code | 00:02:14
Lecture 8 | TD Summary | 00:02:24
Section 8: Approximation Methods
Lecture 1 | Approximation Intro | 00:04:03
Lecture 2 | Linear Models for Reinforcement Learning | 00:04:06
Lecture 3 | Features | 00:03:53
Lecture 4 | Monte Carlo Prediction with Approximation | 00:01:45
Lecture 5 | Monte Carlo Prediction with Approximation in Code |
Lecture 6 | TD(0) Semi-Gradient Prediction | 00:04:14
Lecture 7 | Semi-Gradient SARSA | 00:02:58
Lecture 8 | Semi-Gradient SARSA in Code | 00:04:08
Lecture 9 | Course Summary and Next Steps | 00:08:30
Section 9: Stock Trading Project with Reinforcement Learning
Lecture 1 | Beginners, halt! Stop here if you skipped ahead | 00:13:59
Lecture 2 | Stock Trading Project Section Introduction | 00:05:04
Lecture 3 | Data and Environment | 00:12:12
Lecture 4 | How to Model Q for Q-Learning | 00:09:27
Lecture 5 | Design of the Program | 00:06:35
Lecture 6 | Code pt 1 | 00:07:49
Lecture 7 | Code pt 2 | 00:09:30
Lecture 8 | Code pt 3 | 00:04:18
Lecture 9 | Code pt 4 | 00:07:08
Lecture 10 | Stock Trading Project Discussion | 00:03:27
Section 10: Setting Up Your Environment (FAQ by Student Request)
Lecture 1 | Windows-Focused Environment Setup 2018 | 00:20:13
Lecture 2 | How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow | 00:17:33
Section 11: Extra Help With Python Coding for Beginners (FAQ by Student Request)
Lecture 1 | How to Code by Yourself (part 1) | 00:15:54
Lecture 2 | How to Code by Yourself (part 2) | 00:09:23
Lecture 3 | Proof that using Jupyter Notebook is the same as not using it | 00:12:29
Lecture 4 | Python 2 vs Python 3 | 00:04:31
Section 12: Effective Learning Strategies for Machine Learning (FAQ by Student Request)
Lecture 1 | How to Succeed in this Course (Long Version) | 00:10:18
Lecture 2 | Is this for Beginners or Experts? Academic or Practical? Fast or Slow-Paced? | 00:21:58
Lecture 3 | Machine Learning and AI Prerequisite Roadmap (pt 1) | 00:11:13
Lecture 4 | Machine Learning and AI Prerequisite Roadmap (pt 2) | 00:16:07
Section 13: Appendix / FAQ Finale
Lecture 1 | What is the Appendix? | 00:02:42