junhyukoh/value-prediction-network

Introduction

This repository implements the Value Prediction Network (Oh et al., NIPS 2017) in TensorFlow.

@inproceedings{Oh2017VPN,
  title={Value Prediction Network},
  author={Junhyuk Oh and Satinder Singh and Honglak Lee},
  booktitle={NIPS},
  year={2017}
}

Our code is based on OpenAI's A3C implementation.

Dependencies

Training

The following command trains a value prediction network (VPN) with a plan depth of 3 on the deterministic Collect domain:

python train.py --config config/collect_deterministic.xml --branch 4,4,4 --alg VPN
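As a rough illustration of how the --branch value relates to plan depth, the sketch below parses a comma-separated string such as 4,4,4. It assumes that each entry is the number of branches expanded at one depth of the plan, so the number of entries equals the plan depth. The helper name plan_summary is hypothetical and is not part of this repository.

```python
def plan_summary(branch_spec):
    """Return (plan_depth, leaf_count) for a spec such as "4,4,4".

    Assumption: each comma-separated entry is the branching factor
    used at one depth of the plan. This helper is illustrative only
    and is not taken from train.py.
    """
    factors = [int(x) for x in branch_spec.split(",")]
    leaves = 1
    for b in factors:
        leaves *= b  # every node at this depth expands into b children
    return len(factors), leaves


depth, leaves = plan_summary("4,4,4")
print(depth, leaves)
```

Under this assumption, 4,4,4 corresponds to a plan depth of 3 with 4 * 4 * 4 = 64 leaf nodes in a fully expanded plan.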

The train_vpn script contains commands for reproducing the main result of the paper.

Notes

  • TensorBoard shows the performance of the epsilon-greedy policy. This is NOT the learning curve in the paper, because epsilon decreases from 1.0 to 0.05 over the first 1e6 steps. Instead, [logdir]/eval.csv shows the performance of the agent using the greedy policy.
  • Our code supports multi-GPU training. You can specify GPU IDs with the --gpu option (e.g., --gpu 0,1,2,3).
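
To make the first note concrete, here is a minimal sketch of an epsilon schedule that goes from 1.0 to 0.05 over the first 1e6 steps. The linear shape of the decay is an assumption (the note only gives the endpoints and the duration), and the function name epsilon_at is hypothetical rather than taken from this code base.

```python
def epsilon_at(step, eps_start=1.0, eps_end=0.05, anneal_steps=1000000):
    """Epsilon for the epsilon-greedy policy at a given training step.

    Assumption: the decay is linear between eps_start and eps_end.
    After anneal_steps, epsilon stays at eps_end.
    """
    if step >= anneal_steps:
        return eps_end
    frac = float(max(step, 0)) / anneal_steps  # fraction of annealing completed
    return eps_start + frac * (eps_end - eps_start)


for step in (0, 500000, 1000000, 2000000):
    print(step, epsilon_at(step))
```

Early in training the policy shown in TensorBoard is therefore mostly random, which is why its curve understates the performance of the greedy policy recorded in [logdir]/eval.csv.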
