Introduction

This repository implements the Value Prediction Network (Oh et al., NIPS 2017) in TensorFlow.

@inproceedings{Oh2017VPN,
  title={Value Prediction Network},
  author={Junhyuk Oh and Satinder Singh and Honglak Lee},
  booktitle={NIPS},
  year={2017}
}

Our code is based on OpenAI's A3C implementation.

Dependencies

TensorFlow
Beautiful Soup
Golang
six (for py2/3 compatibility)
tmux (the start script opens up a tmux session with multiple windows)
htop (shown in one of the tmux windows)
gym
gym[atari]
universe
opencv-python
numpy
scipy

Training

The following command trains a value prediction network (VPN) with a plan depth of 3 on the deterministic Collect domain:

python train.py --config config/collect_deterministic.xml --branch 4,4,4 --alg VPN

The train_vpn script contains the commands for reproducing the main results of the paper.
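
As a rough illustration of how the --branch argument relates to plan depth, the sketch below parses a comma-separated spec such as 4,4,4. It assumes that each entry is the branching factor at one planning depth, so the number of entries equals the plan depth; this reading is an assumption, not taken from train.py, and the helper names parse_branch and plan_summary are hypothetical.

```python
def parse_branch(spec):
    """Parse a comma-separated branching spec such as "4,4,4" into a list of ints.

    Assumption: each entry is the branching factor at one planning depth.
    """
    factors = [int(tok) for tok in spec.split(",")]
    if any(b <= 0 for b in factors):
        raise ValueError("branching factors must be positive: %r" % spec)
    return factors


def plan_summary(spec):
    """Return (plan depth, number of leaves in the fully expanded lookahead tree)."""
    factors = parse_branch(spec)
    leaves = 1
    for b in factors:
        leaves *= b
    return len(factors), leaves


if __name__ == "__main__":
    depth, leaves = plan_summary("4,4,4")
    print("plan depth:", depth, "leaves:", leaves)
```

Under this assumption, 4,4,4 gives a plan depth of 3, which matches the description of the training command above.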

Notes

TensorBoard shows the performance of the epsilon-greedy policy. This is NOT the learning curve reported in the paper, because epsilon decreases from 1.0 to 0.05 over the first 1e6 steps. Instead, [logdir]/eval.csv shows the performance of the agent under the greedy policy.
Our code supports multi-GPU training. You can specify GPU IDs with the --gpu option (e.g., --gpu 0,1,2,3).
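
To see why the TensorBoard curve differs from the greedy evaluation, the epsilon schedule from the first note can be sketched as follows. This is a minimal sketch that assumes linear annealing; the note only gives the endpoints (1.0 and 0.05) and the horizon (1e6 steps), so the shape of the decay and the function name epsilon_at are assumptions, not the repository's code.

```python
def epsilon_at(step, eps_start=1.0, eps_end=0.05, anneal_steps=1000000):
    """Exploration rate at a given training step.

    Assumption: epsilon is annealed linearly from eps_start to eps_end
    over the first anneal_steps steps and then held at eps_end.
    """
    # Fraction of the annealing period that is still remaining, clipped to [0, 1].
    remaining = 1.0 - min(max(step, 0) / float(anneal_steps), 1.0)
    return eps_end + remaining * (eps_start - eps_end)


if __name__ == "__main__":
    for step in (0, 250000, 500000, 1000000, 2000000):
        print(step, round(epsilon_at(step), 3))
```

Early in training the agent acts almost entirely at random, so the epsilon-greedy returns logged during training understate what the greedy policy achieves at the same step.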
