spirosrap/Deep-Reinforcement-Learning

Deep-Reinforcement-Learning

Deep Reinforcement Learning algorithms and code: explanations of research papers and their implementations. All algorithm implementations are written in PyTorch.

  1. REINFORCE: Vanilla Policy Gradient
  2. DQN: Deep Q-Learning, Mnih et al., 2013
  3. A3C/A2C: Asynchronous Methods for Deep RL, Mnih et al., 2016
  4. PPO: Proximal Policy Optimization, Schulman et al., 2017
  5. DDPG: Deep Deterministic Policy Gradient, Lillicrap et al., 2015

(The General folder contains general tips on Deep Reinforcement Learning.)

From OpenAI's "Spinning Up as a Deep RL Researcher (or Practitioner)": how to start in Deep RL, assuming you have a solid background in mathematics (1, 2), a general knowledge of deep learning, and familiarity with at least one deep learning library (such as PyTorch or TensorFlow):

OPEN AI

Which algorithms? You should probably start with vanilla policy gradient (also called REINFORCE), DQN, A2C (the synchronous version of A3C), PPO (the variant with the clipped objective), and DDPG, approximately in that order. The simplest versions of all of these can be written in just a few hundred lines of code (ballpark 250-300), and some of them even less (for example, a no-frills version of VPG can be written in about 80 lines). Write single-threaded code before you try writing parallelized versions of these algorithms. (Do try to parallelize at least one.)
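To make the "no-frills VPG" idea concrete, here is a minimal sketch of the REINFORCE update. It is not taken from this repository: to stay dependency-free it uses only the Python standard library instead of PyTorch, and it assumes a toy stateless bandit environment with a softmax policy over per-arm logits. The names `softmax`, `discounted_returns`, and `reinforce_bandit`, along with all hyperparameter values, are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Compute the return-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_bandit(arm_means, episodes=3000, steps=5, lr=0.05, gamma=0.99, seed=0):
    """Train a softmax policy on a toy bandit (hypothetical environment).

    Each arm pays a Gaussian reward around its mean. The policy parameters
    are one logit per arm, updated by gradient ascent on
    sum_t log pi(a_t) * G_t, with no baseline.
    """
    rng = random.Random(seed)
    n = len(arm_means)
    logits = [0.0] * n
    for _ in range(episodes):
        probs = softmax(logits)  # the policy is held fixed within an episode
        actions, rewards = [], []
        for _ in range(steps):
            a = rng.choices(range(n), weights=probs)[0]
            actions.append(a)
            rewards.append(rng.gauss(arm_means[a], 0.1))
        returns = discounted_returns(rewards, gamma)
        # For a softmax policy, d log pi(a) / d logit_i = 1[i == a] - pi(i).
        grad = [0.0] * n
        for a, g in zip(actions, returns):
            for i in range(n):
                indicator = 1.0 if i == a else 0.0
                grad[i] += (indicator - probs[i]) * g
        # Gradient ascent step, averaged over the steps of the episode.
        logits = [w + lr * d / steps for w, d in zip(logits, grad)]
    return softmax(logits)


if __name__ == "__main__":
    # Arm 1 has the higher mean reward, so its probability should grow.
    print(reinforce_bandit([0.0, 1.0]))
```

In a PyTorch version, the hand-written gradient would typically be replaced by building the loss `-(log_prob * G_t).sum()` and calling `backward()`, and the logits would come from a neural network conditioned on the state.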

Further algorithms to study (suggested at the OpenAI Hackathon):

How to study the RL Algorithms

Start with the simplest algorithm (REINFORCE). First, read the paper carefully. Then read the implementation and try to rewrite the code from scratch. Take care not to overfit to implementation details or to paper details.

Notes

My framework of choice is PyTorch, which is released under a free license (the Modified BSD license).

The implementations were taken from various sources, with a focus on simplicity and ease of understanding (including Udacity's repository for the Deep Reinforcement Learning Nanodegree). Numerous implementations are available, including very good modular ones, but my purpose is to master RL theory and algorithms; creating modular code is a secondary goal.

I have made minor corrections to the implementations to make them easier to understand and more consistent.

Sources
