#### Section 1 : Welcome

- Lecture 1 Introduction (3:01)
- Lecture 2 Course Outline and Big Picture (7:45)
- Lecture 3 Where to get the Code (4:27)
- Lecture 4 Anyone Can Succeed in this Course (11:46)
- Lecture 5 Warmup (15:26)

#### Section 2 : Return of the Multi-Armed Bandit

- Lecture 6 Section Introduction: The Explore-Exploit Dilemma (10:08)
- Lecture 7 Applications of the Explore-Exploit Dilemma (7:51)
- Lecture 8 Epsilon-Greedy Theory (6:55)
- Lecture 9 Calculating a Sample Mean (pt 1) (5:46)
- Lecture 10 Epsilon-Greedy Beginner's Exercise Prompt
- Lecture 11 Designing Your Bandit Program (3:59)
- Lecture 12 Epsilon-Greedy in Code (7:01)
- Lecture 13 Comparing Different Epsilons (5:53)
- Lecture 14 Optimistic Initial Values Theory (5:30)
- Lecture 15 Optimistic Initial Values Beginner's Exercise Prompt (2:17)
- Lecture 16 Optimistic Initial Values Code (4:08)
- Lecture 17 UCB1 Theory (14:23)
- Lecture 18 UCB1 Beginner's Exercise Prompt (2:03)
- Lecture 19 UCB1 Code (3:18)
- Lecture 20 Bayesian Bandits / Thompson Sampling Theory (pt 1) (12:33)
- Lecture 21 Bayesian Bandits / Thompson Sampling Theory (pt 2) (17:25)
- Lecture 22 Thompson Sampling Beginner's Exercise Prompt (2:40)
- Lecture 23 Thompson Sampling Code (4:54)
- Lecture 24 Thompson Sampling With Gaussian Reward Theory (11:14)
- Lecture 25 Thompson Sampling With Gaussian Reward Code (6:08)
- Lecture 26 Why don't we just use a library? (5:29)
- Lecture 27 Nonstationary Bandits (7:01)
- Lecture 28 Bandit Summary, Real Data, and Online Learning (6:20)
- Lecture 29 (Optional) Alternative Bandit Designs (9:52)
- Lecture 30 Suggestion Box (2:54)
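
Since Lectures 8-13 center on the epsilon-greedy rule and the incremental sample-mean update, a minimal sketch of that idea follows. It is illustrative only, not the course's own code; the arm means (0.2, 0.5, 0.75), the Gaussian rewards, and all names here are assumptions for the example.

```python
# Minimal epsilon-greedy bandit sketch (illustrative only, not the course's code).
import numpy as np

class Bandit:
    def __init__(self, true_mean):
        self.true_mean = true_mean   # true expected reward of this arm
        self.mean_estimate = 0.0     # running sample-mean estimate
        self.n = 0                   # number of pulls so far

    def pull(self):
        # assumed reward model: reward ~ N(true_mean, 1)
        return np.random.randn() + self.true_mean

    def update(self, x):
        # incremental sample-mean update
        self.n += 1
        self.mean_estimate += (x - self.mean_estimate) / self.n

def run(epsilon=0.1, num_trials=10000):
    bandits = [Bandit(m) for m in (0.2, 0.5, 0.75)]
    rewards = np.empty(num_trials)
    for t in range(num_trials):
        if np.random.random() < epsilon:
            j = np.random.randint(len(bandits))                      # explore
        else:
            j = int(np.argmax([b.mean_estimate for b in bandits]))   # exploit
        x = bandits[j].pull()
        bandits[j].update(x)
        rewards[t] = x
    return rewards.mean()

if __name__ == "__main__":
    print("average reward:", run())
```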

#### Section 3 : High Level Overview of Reinforcement Learning

- Lecture 31 What is Reinforcement Learning? (7:58)
- Lecture 32 On Unusual or Unexpected Strategies of RL (5:59)
- Lecture 33 From Bandits to Full Reinforcement Learning (8:33)

#### Section 4 : Markov Decision Processes

- Lecture 34 MDP Section Introduction (6:11)
- Lecture 35 Gridworld (12:25)
- Lecture 36 Choosing Rewards (3:49)
- Lecture 37 The Markov Property (6:02)
- Lecture 38 Markov Decision Processes (MDPs) (14:33)
- Lecture 39 Future Rewards (9:24)
- Lecture 40 Value Functions (4:58)
- Lecture 41 The Bellman Equation (pt 1) (8:38)
- Lecture 42 The Bellman Equation (pt 2) (6:33)
- Lecture 43 The Bellman Equation (pt 3) (6:01)
- Lecture 44 Bellman Examples (22:25)
- Lecture 45 Optimal Policy and Optimal Value Function (pt 1) (9:08)
- Lecture 46 Optimal Policy and Optimal Value Function (pt 2) (4:00)
- Lecture 47 MDP Summary (2:49)
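
Lectures 41-44 develop the Bellman equation for the state-value function. In standard notation (not taken from the course slides), it reads:

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} \sum_{r} p(s', r \mid s, a)\,\bigl[ r + \gamma\, V^{\pi}(s') \bigr]
```

Here \(\pi(a \mid s)\) is the policy, \(p(s', r \mid s, a)\) the environment dynamics, and \(\gamma\) the discount factor.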

#### Section 5 : Dynamic Programming

- Lecture 48 Intro to Dynamic Programming and Iterative Policy Evaluation (2:58)
- Lecture 49 Designing Your RL Program (4:50)
- Lecture 50 Gridworld in Code (11:28)
- Lecture 51 Iterative Policy Evaluation in Code (12:08)
- Lecture 52 Windy Gridworld in Code (7:39)
- Lecture 53 Iterative Policy Evaluation for Windy Gridworld in Code (7:04)
- Lecture 54 Policy Improvement (2:42)
- Lecture 55 Policy Iteration (1:51)
- Lecture 56 Policy Iteration in Code (8:18)
- Lecture 57 Policy Iteration in Windy Gridworld (8:41)
- Lecture 58 Value Iteration (3:49)
- Lecture 59 Value Iteration in Code (6:26)
- Lecture 60 Dynamic Programming Summary (5:05)
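
Iterative policy evaluation (Lectures 48-53) repeatedly sweeps the state space and applies the Bellman equation above as an update rule until the value function stops changing. A minimal tabular sketch, assuming dictionary-based transition and reward tables (illustrative names, not the course's Gridworld API):

```python
# Minimal iterative policy evaluation sketch for a tabular MDP (illustrative only).
def iterative_policy_evaluation(states, actions, policy, transition, reward,
                                gamma=0.9, threshold=1e-4):
    """
    policy[s][a]        -> probability of taking action a in state s
    transition[(s, a)]  -> list of (next_state, probability) pairs
    reward[(s, a, s2)]  -> immediate reward for that transition
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            new_v = 0.0
            for a in actions:
                for s2, p in transition.get((s, a), []):
                    new_v += policy[s][a] * p * (
                        reward.get((s, a, s2), 0.0) + gamma * V[s2])
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < threshold:   # stop when the largest change is small enough
            break
    return V
```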

#### Section 6 : Monte Carlo

- Lecture 61 Monte Carlo Intro
- Lecture 62 Monte Carlo Policy Evaluation (5:36)
- Lecture 63 Monte Carlo Policy Evaluation in Code
- Lecture 64 Policy Evaluation in Windy Gridworld (3:29)
- Lecture 65 Monte Carlo Control (5:49)
- Lecture 66 Monte Carlo Control in Code (4:04)
- Lecture 67 Monte Carlo Control without Exploring Starts (2:50)
- Lecture 68 Monte Carlo Control without Exploring Starts in Code (2:51)
- Lecture 69 Monte Carlo Summary
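
First-visit Monte Carlo policy evaluation (Lectures 62-64) estimates V(s) by averaging sampled returns rather than using the model. A minimal sketch, assuming a hypothetical play_episode() helper (not defined here) that returns one episode as a list of (state, reward) pairs, where each reward is the one received after leaving that state:

```python
# Minimal first-visit Monte Carlo prediction sketch (illustrative only).
def mc_prediction(play_episode, gamma=0.9, num_episodes=10000):
    V = {}        # state -> value estimate
    counts = {}   # state -> number of first visits
    for _ in range(num_episodes):
        episode = play_episode()   # hypothetical: [(s_t, r_{t+1}), ...]
        G = 0.0
        # walk backwards so G accumulates the discounted return
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = r + gamma * G
            # first-visit check: skip if s occurs earlier in the episode
            if s in (x[0] for x in episode[:t]):
                continue
            counts[s] = counts.get(s, 0) + 1
            V[s] = V.get(s, 0.0) + (G - V.get(s, 0.0)) / counts[s]
    return V
```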

#### Section 7 : Temporal Difference Learning

- Lecture 70 Temporal Difference Intro (1:33)
- Lecture 71 TD(0) Prediction (3:37)
- Lecture 72 TD(0) Prediction in Code (2:27)
- Lecture 73 SARSA (5:06)
- Lecture 74 SARSA in Code (3:38)
- Lecture 75 Q Learning (2:56)
- Lecture 76 Q Learning in Code (2:14)
- Lecture 77 TD Summary (2:24)
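
SARSA and Q-learning (Lectures 73-76) replace full-episode returns with one-step bootstrapped targets. Below is a minimal tabular Q-learning sketch, assuming a hypothetical env object with reset() and step() methods; it is not the course's Gridworld code.

```python
# Minimal tabular Q-learning sketch (illustrative only).
import random

def q_learning(env, actions, episodes=10000, gamma=0.9, alpha=0.1, epsilon=0.1):
    Q = {}  # (state, action) -> value estimate

    def best(s):
        # greedy action under the current estimates (missing entries default to 0)
        return max(actions, key=lambda a: Q.get((s, a), 0.0))

    for _ in range(episodes):
        s, done = env.reset(), False          # hypothetical env API
        while not done:
            # epsilon-greedy behavior policy
            a = random.choice(actions) if random.random() < epsilon else best(s)
            s2, r, done = env.step(a)         # hypothetical env API
            # off-policy target: max over next-state actions
            target = r + (0.0 if done else gamma * Q.get((s2, best(s2)), 0.0))
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s = s2
    return Q
```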

#### Section 8 : Approximation Methods

- Lecture 78 Approximation Intro (4:03)
- Lecture 79 Linear Models for Reinforcement Learning (4:06)
- Lecture 80 Features (3:53)
- Lecture 81 Monte Carlo Prediction with Approximation (1:45)
- Lecture 82 Monte Carlo Prediction with Approximation in Code
- Lecture 83 TD(0) Semi-Gradient Prediction (4:14)
- Lecture 84 Semi-Gradient SARSA (2:58)
- Lecture 85 Semi-Gradient SARSA in Code (4:08)
- Lecture 86 Course Summary and Next Steps (8:30)
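
The approximation methods in Lectures 79-85 swap the value table for a parameterized model. With a linear model, one semi-gradient TD(0) step (Lecture 83) takes the form sketched below; the feature function phi is a placeholder for whatever features Lecture 80 constructs, not something defined by the course listing.

```python
# Minimal semi-gradient TD(0) update with a linear value model (illustrative only).
import numpy as np

def td0_semi_gradient_step(w, phi, s, r, s2, done, alpha=0.01, gamma=0.9):
    v_s = w.dot(phi(s))
    v_s2 = 0.0 if done else w.dot(phi(s2))
    # "semi-gradient": the target r + gamma * v_s2 is treated as a constant,
    # so the gradient of the squared error is taken w.r.t. v_s only
    td_error = r + gamma * v_s2 - v_s
    return w + alpha * td_error * phi(s)   # updated weight vector
```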

#### Section 9 : Stock Trading Project with Reinforcement Learning

- Lecture 87 Beginners, halt! Stop here if you skipped ahead (13:59)
- Lecture 88 Stock Trading Project Section Introduction (5:04)
- Lecture 89 Data and Environment (12:12)
- Lecture 90 How to Model Q for Q-Learning (9:27)
- Lecture 91 Design of the Program (6:35)
- Lecture 92 Code pt 1 (7:49)
- Lecture 93 Code pt 2 (9:30)
- Lecture 94 Code pt 3 (4:18)
- Lecture 95 Code pt 4 (7:08)
- Lecture 96 Stock Trading Project Discussion (3:27)

#### Section 10 : Setting Up Your Environment (FAQ by Student Request)

- Lecture 97 Windows-Focused Environment Setup 2018 (20:13)
- Lecture 98 How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow (17:33)

#### Section 11 : Extra Help With Python Coding for Beginners (FAQ by Student Request)

- Lecture 99 How to Code by Yourself (part 1) (15:54)
- Lecture 100 How to Code by Yourself (part 2) (9:23)
- Lecture 101 Proof that using Jupyter Notebook is the same as not using it (12:29)
- Lecture 102 Python 2 vs Python 3 (4:31)

#### Section 12 : Effective Learning Strategies for Machine Learning (FAQ by Student Request)

- Lecture 103 How to Succeed in this Course (Long Version) (10:18)
- Lecture 104 Is this for Beginners or Experts? Academic or Practical? Fast or Slow-Paced? (21:58)
- Lecture 105 Machine Learning and AI Prerequisite Roadmap (pt 1) (11:13)
- Lecture 106 Machine Learning and AI Prerequisite Roadmap (pt 2) (16:07)

#### Section 13 : Appendix FAQ Finale

- Lecture 107 What is the Appendix? (2:42)