Stanford CS234: Reinforcement Learning | Winter 2019 | Lecture 2 – Given a Model of the World

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

Professor Emma Brunskill, Stanford University
https://stanford.io/3eJW8yT

Professor Emma Brunskill
Assistant Professor, Computer Science
Stanford AI for Human Impact Lab
Stanford Artificial Intelligence Lab
Statistical Machine Learning Group

To follow along with the course schedule and syllabus, visit: http://web.stanford.edu/class/cs234/index.html

0:00 Introduction
2:55 Full Observability: Markov Decision Process (MDP)
3:55 Recall: Markov Property
4:50 Markov Process or Markov Chain
5:53 Example: Mars Rover Markov Chain Transition Matrix, P
12:06 Example: Mars Rover Markov Chain Episodes
13:05 Markov Reward Process (MRP)
14:37 Return & Value Function
16:32 Discount Factor
18:23 Example: Mars Rover MRP
23:19 Matrix Form of Bellman Equation for MRP
26:52 Iterative Algorithm for Computing Value of an MRP
33:29 MDP Policy Evaluation, Iterative Algorithm
34:44 Policy Evaluation: Example & Check Your Understanding
36:39 Practice: MDP 1 Iteration of Policy Evaluation, Mars Rover Example
50:48 MDP Policy Iteration (PI)
55:44 Delving Deeper into Policy Improvement Step
