ESTool

Evolved Biped Walker.

Implementations of various Evolution Strategies, such as GA, PEPG, CMA-ES, and OpenAI's ES, using a common interface.

The CMA-ES implementation wraps around pycma.

Background Reading:

A Visual Guide to Evolution Strategies

Evolving Stable Strategies

Using the Evolution Strategies Library

To use es.py, please check out the simple_es_example.ipynb notebook.

The basic concept is:

import numpy as np

# EvolutionStrategy is a stand-in for any solver in es.py,
# e.g. CMAES, SimpleGA, OpenES, or PEPG.
solver = EvolutionStrategy()

while True:

  # ask the ES to give us a set of candidate solutions
  solutions = solver.ask()

  # create an array to hold the fitness results.
  # solver.popsize = population size
  rewards = np.zeros(solver.popsize)

  # calculate the reward for each given solution
  # using your own evaluate() method
  for i in range(solver.popsize):
    rewards[i] = evaluate(solutions[i])

  # give rewards back to ES
  solver.tell(rewards)

  # get best parameter and its reward from the ES;
  # result() returns a tuple whose second entry is the best reward so far
  reward_vector = solver.result()

  if reward_vector[1] > MY_REQUIRED_REWARD:
    break
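
As a concrete, runnable variant of the loop above, here is a minimal sketch using the CMAES wrapper from es.py on a toy quadratic fitness function. The keyword arguments shown (popsize, sigma_init) are assumptions based on the common interface; check es.py for the exact signatures:

import numpy as np
from es import CMAES

NPARAMS = 10

def evaluate(solution):
  # toy fitness: negative squared distance to an all-ones target,
  # so the maximum possible reward is 0
  target = np.ones(NPARAMS)
  return -np.sum((solution - target) ** 2)

# constructor arguments here are assumptions; check es.py for the
# exact signature of CMAES and the other solvers
solver = CMAES(NPARAMS, popsize=40, sigma_init=0.5)

for generation in range(200):
  solutions = solver.ask()
  rewards = np.array([evaluate(s) for s in solutions])
  solver.tell(rewards)
  result = solver.result()  # (best_params, best_reward, ...)
  if result[1] > -1e-4:
    break

print("best reward found:", result[1])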

Parallel Processing Training with MPI

Please read the Evolving Stable Strategies article for more demos and use cases.

To use the training tool (relies on MPI):

python train.py bullet_racecar -n 8 -t 4

will launch a training job with 32 workers (8 MPI processes x 4 workers per process). The best model will be saved as a .json file in log/. This model should train in a few minutes on a 2014 MacBook Pro.

If you have access to a 64-core CPU machine, I recommend:

python train.py name_of_environment -e 16 -n 64 -t 4

This will calculate fitness values based on an average of 16 random runs, using 256 workers (64 MPI processes x 4 workers per process). In my experience this works reasonably well for most tasks defined inside config.py. A conceptual sketch of the episode averaging follows.
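
Conceptually, the -e flag averages the episode reward over several random rollouts before reporting it as the fitness value. Below is a minimal sketch of that averaging step; evaluate() and the seeding scheme are hypothetical placeholders, not the actual train.py code:

import numpy as np

def averaged_fitness(params, num_episodes=16):
  # average the reward over several rollouts with different random
  # seeds, so a single lucky episode does not inflate the fitness
  total = 0.0
  for _ in range(num_episodes):
    seed = np.random.randint(2**31 - 1)
    total += evaluate(params, seed=seed)  # evaluate() is a placeholder
  return total / num_episodes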

After training, to run pre-trained models:

python model.py bullet_ant log/name_of_your_json_file.json


bullet_ant pybullet environment, trained with PEPG.

Another example: to run a pre-trained minitaur duck model locally:

python model.py bullet_minitaur_duck zoo/bullet_minitaur_duck.cma.256.json


Custom Minitaur Env.

Training progress is tracked in the .hist.json file and in the screen output. The ordering of the fields is as follows (a parsing sketch is shown after the list):

  • generation count
  • time (seconds) taken so far
  • average fitness
  • worst fitness
  • best fitness
  • average standard deviation of params
  • average timesteps taken
  • max timesteps taken
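
For convenience, each record can be unpacked into named fields, as in the following minimal sketch. The file layout (a JSON list of records, each ordered as above) and the log filename are assumptions, not guaranteed by train.py:

import json
from collections import namedtuple

HistRecord = namedtuple("HistRecord", [
  "generation", "time_seconds", "avg_fitness", "worst_fitness",
  "best_fitness", "avg_param_stdev", "avg_timesteps", "max_timesteps",
])

# assumed layout: the .hist.json file holds a list of records,
# each record a list ordered exactly as documented above
with open("log/bullet_ant.pepg.16.64.hist.json") as f:  # hypothetical filename
  history = [HistRecord(*row) for row in json.load(f)]

print("latest best fitness:", history[-1].best_fitness)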

Using plot_training_progress.ipynb in an IPython notebook, you can plot the training logs from the .hist.json files. For example, for the bullet_ant task:


Bullet Ant training progress.
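
A minimal matplotlib sketch of such a plot, reusing the history list parsed in the earlier sketch:

import matplotlib.pyplot as plt

gens = [r.generation for r in history]
plt.plot(gens, [r.avg_fitness for r in history], label="average fitness")
plt.plot(gens, [r.best_fitness for r in history], label="best fitness")
plt.xlabel("generation")
plt.ylabel("fitness")
plt.legend()
plt.title("bullet_ant training progress")
plt.show()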

You need to install mpi4py, pybullet, gym, etc. to use the various environments, as well as roboschool/Box2D for some of the OpenAI gym envs.

On Windows, it is easiest to install mpi4py as follows:

  • Download and install mpi_x64.Msi from the HPC Pack 2012 MS-MPI Redistributable Package
  • Install a recent Visual Studio version with C++ compiler
  • Open a command prompt and run:

git clone https://github.com/mpi4py/mpi4py
cd mpi4py
python setup.py install

Modify the train.py script, replacing mpirun with mpiexec and -np with -n, as sketched below.
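
As an illustration only (the actual relaunch logic inside train.py may differ), the substitution amounts to choosing the MS-MPI launcher spelling on Windows:

import platform
import subprocess
import sys

def mpi_launch(num_procs, script_args):
  # hypothetical helper: pick the MPI launcher for the current platform.
  # Open MPI / MPICH use "mpirun -np"; Windows MS-MPI uses "mpiexec -n".
  if platform.system() == "Windows":
    cmd = ["mpiexec", "-n", str(num_procs)]
  else:
    cmd = ["mpirun", "-np", str(num_procs)]
  subprocess.check_call(cmd + [sys.executable] + script_args)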
