proximal-policy-optimization

Hi, I am trying to use the PPO algorithm; however, it's not clear how to construct the stochastic policy. Should I use the Gaussian policy network?

Cool library by the way; I like the modularity!
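For continuous action spaces, a diagonal Gaussian is a common choice of stochastic policy for PPO; for discrete actions a categorical distribution is the usual counterpart. The sketch below is a minimal, library-independent illustration and does not use the API of any repository listed on this page. The names `gaussian_log_prob`, `sample_action`, and `ppo_clipped_objective` are hypothetical, and in practice the mean (and often the log standard deviation) would come from a neural network rather than being passed in by hand.

```python
import math
import random


def gaussian_log_prob(action, mean, log_std):
    """Log density of a diagonal Gaussian, summed over action dimensions."""
    total = 0.0
    for a, m, ls in zip(action, mean, log_std):
        std = math.exp(ls)
        total += -0.5 * ((a - m) / std) ** 2 - ls - 0.5 * math.log(2.0 * math.pi)
    return total


def sample_action(mean, log_std, rng=random):
    """Draw one action; each dimension is sampled independently."""
    return [rng.gauss(m, math.exp(ls)) for m, ls in zip(mean, log_std)]


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate (to be maximized)."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


# Usage: sample with the old policy, then score the same action under a new one.
old_mean, new_mean, log_std = [0.0, 0.0], [0.1, -0.1], [-0.5, -0.5]
action = sample_action(old_mean, log_std)
old_lp = gaussian_log_prob(action, old_mean, log_std)
new_lp = gaussian_log_prob(action, new_mean, log_std)
objective = ppo_clipped_objective(new_lp, old_lp, advantage=1.0)
```

Parameterizing the log standard deviation rather than the standard deviation itself keeps the scale positive without extra constraints, which is one reason this form is widely used.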

Some time around commit ae030395f56efca50a51335fe4f3367caf950066 we regressed, and the example code in gym_client.cpp no longer converges, presumably because of some difference between our observation normalization and the one in OpenAI Baselines.

I'll look in more detail this weekend and confirm if it's that exact commit causing the problem.
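For context, observation normalization in PPO implementations is typically a running mean and variance that is updated batch by batch, followed by standardizing and clipping each observation. The sketch below shows that general technique for scalar observations. It is not taken from gym_client.cpp or copied from OpenAI Baselines; the class name, the small initial count, and the clipping range of 10 are assumptions chosen for illustration.

```python
import math


class RunningMeanStd:
    """Running mean and variance, merged batch by batch (parallel variance formula)."""

    def __init__(self, epsilon=1e-4):
        self.mean = 0.0
        self.var = 1.0
        self.count = epsilon  # tiny prior count avoids division by zero

    def update(self, batch):
        n = len(batch)
        batch_mean = sum(batch) / n
        batch_var = sum((x - batch_mean) ** 2 for x in batch) / n
        delta = batch_mean - self.mean
        total = self.count + n
        new_mean = self.mean + delta * n / total
        # Combine the two sums of squared deviations, plus a correction
        # for the gap between the old mean and the batch mean.
        m2 = self.var * self.count + batch_var * n + delta ** 2 * self.count * n / total
        self.mean, self.var, self.count = new_mean, m2 / total, total


def normalize(obs, stats, clip=10.0, eps=1e-8):
    """Standardize one observation and clip it to [-clip, clip]."""
    z = (obs - stats.mean) / math.sqrt(stats.var + eps)
    return max(-clip, min(clip, z))


# Usage: update the statistics with each batch, then normalize with them.
stats = RunningMeanStd()
stats.update([1.0, 2.0, 3.0])
stats.update([4.0, 5.0])
normalized = normalize(4.0, stats)
```

Small differences in details like these (the initial count, the epsilon inside the square root, the clipping range, or whether statistics keep updating during evaluation) are the kind of thing worth comparing when two implementations behave differently.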

Here are 67 public repositories matching this topic...

MorvanZhou / Reinforcement-learning-with-tensorflow

ikostrikov / pytorch-a2c-ppo-acktr-gail

Khrylx / PyTorch-RL

zuoxingdong / lagom

cpnota / autonomous-learning-library

Documentation on StochasticPolicy

TianhongDai / reinforcement-learning-algorithms

Omegastick / pytorch-cpp-rl

Example code doesn't converge

ChenglongChen / pytorch-MADRL

nikhilbarhate99 / PPO-PyTorch

pekaalto / sc2aibot

miroblog / tf_deep_rl_trader

jcwleo / curiosity-driven-exploration-pytorch

lnpalmer / PPO

RLOpensource / Relational_Deep_Reinforcement_Learning

adik993 / ppo-pytorch

lcswillems / torch-ac

nav74neet / gail_gym

cxxgtxy / POP3D

TianhongDai / distributed-ppo

xwhan / walk_the_blocks

TianhongDai / google-football-pytorch

chagmgang / pysc2_rl

agakshat / spacefortress

apparatusbox / rlbox

jw1401 / PPO-Tensorflow-2.0

wisnunugroho21 / reinforcement_learning_ppo_rnd

RLOpensource / spinning_up_kr

marcelloaborges / Soccer-PPO

RLOpensource / Generative_Adversarial_Imitation_Learning

marcosfede / Reinforcement-Landing
