Reinforcement Learning in the OpenAI Gym (Tutorial) – Double Q Learning

Today we're going to use double Q-learning to deal with maximization bias in reinforcement learning. We'll use the CartPole environment from the OpenAI Gym.

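We get maximization bias when we use the same set of samples both to pick the maximizing action and to estimate the value of that action. Because the estimates are noisy, taking the max over them tends to overestimate the true value. We can deal with this by keeping two estimates of the action-value function and alternating between them: one picks the best action, and the other supplies the value of that action.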
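Here is a minimal sketch of what that update can look like in tabular form. It is not the code from the linked script, which may differ. It assumes the textbook variant where a coin flip decides which table gets updated, and it assumes states are hashable (for CartPole, that means the continuous observation has already been discretized into something like a tuple of bin indices). The names make_q_table, greedy_action, double_q_update and choose_action, and the default alpha and gamma, are made up for illustration.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    """Q-table mapping state -> list of action values, all starting at 0."""
    return defaultdict(lambda: [0.0] * n_actions)


def greedy_action(values):
    """Index of the largest value (the first one wins ties)."""
    return max(range(len(values)), key=lambda a: values[a])


def double_q_update(q1, q2, state, action, reward, next_state, done,
                    alpha=0.1, gamma=0.99, rng=random):
    """One tabular double Q-learning step (illustrative sketch).

    A coin flip picks which table to update. That table chooses the
    greedy next action; the OTHER table supplies the value of that
    action, so selection and evaluation use different estimates.
    """
    if rng.random() < 0.5:
        selector, evaluator = q1, q2
    else:
        selector, evaluator = q2, q1

    if done:
        # Terminal transition: nothing to bootstrap from.
        target = reward
    else:
        best_next = greedy_action(selector[next_state])
        target = reward + gamma * evaluator[next_state][best_next]

    selector[state][action] += alpha * (target - selector[state][action])


def choose_action(q1, q2, state, epsilon, n_actions, rng=random):
    """Epsilon-greedy on the sum of both tables (an assumed, common choice)."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    combined = [a + b for a, b in zip(q1[state], q2[state])]
    return greedy_action(combined)
```

In a training loop you would call choose_action to act, step the environment, then call double_q_update with the resulting transition. Only one of the two tables changes on any given step.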

Code for this video is here:
https://github.com/philtabor/Youtube-Code-Repository/blob/master/ReinforcementLearning/Fundamentals/doubleQLearning.py

Learn how to turn deep reinforcement learning papers into code:

Get instant access to all my courses, including the new Prioritized Experience Replay course, with my subscription service. $24.99 a month gives you instant access to 35 hours of instructional content plus access to future updates, added monthly.

Discounts available for Udemy students (enrolled longer than 30 days). Just send an email to sales@neuralnet.ai

https://www.neuralnet.ai/courses

Or, pick up my Udemy courses here:

Deep Q Learning:
https://www.udemy.com/course/deep-q-learning-from-paper-to-code/?couponCode=DQN-JUNE-22

Actor Critic Methods:
https://www.udemy.com/course/actor-critic-methods-from-paper-to-code-with-pytorch/?couponCode=AC-JUNE-22

Curiosity Driven Deep Reinforcement Learning:
https://www.udemy.com/course/curiosity-driven-deep-reinforcement-learning/?couponCode=ICM-JUNE-22

Natural Language Processing from First Principles:
https://www.udemy.com/course/natural-language-processing-from-first-principles/?couponCode=NLP-JUNE-22

Reinforcement Learning Fundamentals:
https://www.manning.com/livevideo/reinforcement-learning-in-motion

Here are some books / courses I recommend (affiliate links):
Grokking Deep Learning in Motion: https://bit.ly/3fXHy8W
Grokking Deep Learning: https://bit.ly/3yJ14gT
Grokking Deep Reinforcement Learning: https://bit.ly/2VNAXql

Come hang out on Discord here:
https://discord.gg/Zr4VCdv

Need personalized tutoring? Help on a programming project? Shoot me an email! phil@neuralnet.ai

Website: https://www.neuralnet.ai
GitHub: https://github.com/philtabor
Twitter: https://twitter.com/MLWithPhil

#OpenAIGym #ReinforcementLearning #DoubleQLearning