Dopamine



Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).

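Our design principles are:

Easy experimentation: Make it easy for new users to run benchmark experiments.
Flexible development: Make it easy for new users to try out research ideas.
Compact and reliable: Provide implementations for a few, battle-tested algorithms.
Reproducible: Facilitate reproducibility in results. In particular, our setup follows the recommendations given by Machado et al. (2018).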

In the spirit of these principles, this first version focuses on supporting the state-of-the-art, single-GPU Rainbow agent (Hessel et al., 2018) applied to Atari 2600 game-playing (Bellemare et al., 2013). Specifically, our Rainbow agent implements the three components identified as most important by Hessel et al.:

n-step Bellman updates (see e.g. Mnih et al., 2016)
Prioritized experience replay (Schaul et al., 2016)
Distributional reinforcement learning (C51)

For completeness, we also provide an implementation of DQN (Mnih et al., 2015). For additional details, please see our documentation.

We provide a set of Colaboratory notebooks which demonstrate how to use Dopamine.

We provide a website which displays the learning curves for all the provided agents, on all the games.

This is not an official Google product.

What’s new

Instructions

Install via source

Installing from source allows you to modify the agents and experiments as you please, and is likely to be the pathway of choice for long-term use. The instructions below assume that you will be running Dopamine in a virtual environment. A virtual environment lets you control which dependencies are installed for which program.

Dopamine is a TensorFlow-based framework, and we recommend you also consult the TensorFlow documentation for additional details. These instructions assume Python 3.6 or above.

First, download the Dopamine source:

git clone https://github.com/google/dopamine.git

Then create a virtual environment and activate it.

python3 -m venv ./dopamine-venv
source dopamine-venv/bin/activate
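
If you are unsure whether the activation step took effect, a small check like the following can help. It is only a sketch: the helper name in_virtual_environment is ours and is not part of Dopamine. It relies on the standard behaviour that, inside an environment created with python3 -m venv, sys.prefix differs from sys.base_prefix.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside an environment created by `python3 -m venv`, sys.prefix points at
    # the environment directory, while sys.base_prefix points at the base
    # Python installation. Outside a virtual environment the two are equal.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
    print("interpreter prefix:", sys.prefix)
```

Run it with the same python executable you intend to use for Dopamine; if it reports False, the pip commands that follow would install into your base Python instead.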

Finally, set up the environment and install Dopamine’s dependencies:

pip install -U pip
pip install -r dopamine/requirements.txt

Running tests

You can test whether the installation was successful by running the following:

cd dopamine
export PYTHONPATH=$PYTHONPATH:$PWD
python -m tests.dopamine.atari_init_test

Training agents

Atari games

The entry point to the standard Atari 2600 experiment is dopamine/discrete_domains/train.py. To run the basic DQN agent:

python -um dopamine.discrete_domains.train \
  --base_dir /tmp/dopamine_runs \
  --gin_files dopamine/agents/dqn/configs/dqn.gin

By default, this will kick off an experiment lasting 200 million frames. The command-line interface will output statistics about the latest training episode:

[...]
I0824 17:13:33.078342 140196395337472 tf_logging.py:115] gamma: 0.990000
I0824 17:13:33.795608 140196395337472 tf_logging.py:115] Beginning training...
Steps executed: 5903 Episode length: 1203 Return: -19.
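
If you want to track these per-episode statistics outside the console, one option is to parse the line yourself. The sketch below assumes the line keeps exactly the format shown in the sample above, which may differ between Dopamine versions. The function name parse_episode_line and the dictionary keys are hypothetical and not part of Dopamine.

```python
import re
from typing import Dict, Optional

# Pattern modelled on the sample line:
#   "Steps executed: 5903 Episode length: 1203 Return: -19."
_EPISODE_RE = re.compile(
    r"Steps executed:\s*(?P<steps>\d+)\s+"
    r"Episode length:\s*(?P<length>\d+)\s+"
    r"Return:\s*(?P<ret>-?\d+(?:\.\d*)?)"
)


def parse_episode_line(line: str) -> Optional[Dict[str, float]]:
    """Extract the episode statistics from one console line, or return None."""
    match = _EPISODE_RE.search(line)
    if match is None:
        # Not an episode-statistics line (for example a TensorFlow log line).
        return None
    return {
        "steps_executed": int(match["steps"]),
        "episode_length": int(match["length"]),
        "episode_return": float(match["ret"]),
    }


if __name__ == "__main__":
    sample = "Steps executed: 5903 Episode length: 1203 Return: -19."
    print(parse_episode_line(sample))
```

Lines that do not match, such as the tf_logging entries, yield None and can simply be skipped.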

To get finer-grained information about the process, you can adjust the experiment parameters in dopamine/agents/dqn/configs/dqn.gin, in particular by reducing Runner.training_steps and Runner.evaluation_steps, which together determine the total number of steps needed to complete an iteration. This is useful if you want to inspect log files or checkpoints, which are generated at the end of each iteration.
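
As a rough illustration of how these settings add up, the sketch below computes the step and frame budget of a run. The numbers used (200 iterations, 250,000 training steps, 125,000 evaluation steps, and a frame skip of 4) are assumptions about the stock dqn.gin and the Atari preprocessing, as is the parameter name Runner.num_iterations; check them against your own copy of the file. The function names are ours.

```python
def steps_per_iteration(training_steps: int, evaluation_steps: int) -> int:
    """Approximate number of agent steps in one iteration (train + eval)."""
    return training_steps + evaluation_steps


def training_frames(num_iterations: int, training_steps: int, frame_skip: int) -> int:
    """Approximate number of emulator frames consumed by training."""
    # Each agent step is assumed to repeat the chosen action for `frame_skip`
    # emulator frames.
    return num_iterations * training_steps * frame_skip


if __name__ == "__main__":
    # Assumed defaults; verify against dopamine/agents/dqn/configs/dqn.gin.
    num_iterations = 200        # assumed Runner.num_iterations
    train_steps = 250_000       # assumed Runner.training_steps
    eval_steps = 125_000        # assumed Runner.evaluation_steps
    frame_skip = 4              # assumed Atari frame skip

    print("steps per iteration:", steps_per_iteration(train_steps, eval_steps))
    print("training frames:", training_frames(num_iterations, train_steps, frame_skip))
```

Under these assumptions the training budget works out to 200 x 250,000 x 4 = 200 million frames, which matches the figure quoted above. Treat the per-iteration count as approximate, since a phase may run slightly past its step budget in order to finish the episode in progress.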

More generally, the whole of Dopamine is easily configured using the gin configuration framework.
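
Instead of editing a .gin file, parameters can usually be overridden on the command line. The sketch below only builds and prints such a command; it does not launch training. It assumes that train.py accepts a repeatable --gin_bindings flag, so confirm this with the --help output of your version. The override values are purely illustrative and the helper name build_train_command is ours.

```python
import shlex
from typing import Dict, List


def build_train_command(base_dir: str, gin_file: str,
                        bindings: Dict[str, object]) -> List[str]:
    """Build the argument list for a training run with gin overrides."""
    command = [
        "python", "-um", "dopamine.discrete_domains.train",
        "--base_dir", base_dir,
        "--gin_files", gin_file,
    ]
    # One --gin_bindings flag per override, written as "Name.param=value".
    # (Assumed flag name; string values would need gin-style quoting.)
    for name, value in bindings.items():
        command += ["--gin_bindings", f"{name}={value}"]
    return command


if __name__ == "__main__":
    cmd = build_train_command(
        base_dir="/tmp/dopamine_runs",
        gin_file="dopamine/agents/dqn/configs/dqn.gin",
        # Illustrative values for shorter iterations.
        bindings={"Runner.training_steps": 10000,
                  "Runner.evaluation_steps": 5000},
    )
    # Print a copy-pasteable shell command.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping overrides on the command line leaves the configuration files untouched, which makes it easier to see how a quick run differed from the defaults.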

Non-Atari discrete environments

We provide sample configuration files for training an agent on Cartpole and Acrobot. For example, to train C51 on Cartpole with default settings, run the following command:

python -um dopamine.discrete_domains.train \
  --base_dir /tmp/dopamine_runs \
  --gin_files dopamine/agents/rainbow/configs/c51_cartpole.gin

You can train Rainbow on Acrobot with the following command:

python -um dopamine.discrete_domains.train \
  --base_dir /tmp/dopamine_runs \
  --gin_files dopamine/agents/rainbow/configs/rainbow_acrobot.gin

Install as a library

Alternatively, you can install Dopamine as a Python library:

pip install dopamine-rl
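
To confirm which version pip installed, you can query the package metadata. This is a minimal sketch using the standard library's importlib.metadata, which requires Python 3.8 or later (newer than the 3.6 minimum mentioned above). The helper name installed_version is ours.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str = "dopamine-rl") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("dopamine-rl is not installed in this environment")
    else:
        print("dopamine-rl version:", version)
```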

References

Bellemare et al., The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 2013.

Machado et al., Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents. Journal of Artificial Intelligence Research, 2018.

Hessel et al., Rainbow: Combining Improvements in Deep Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2018.

Mnih et al., Human-level Control through Deep Reinforcement Learning. Nature, 2015.

Mnih et al., Asynchronous Methods for Deep Reinforcement Learning. Proceedings of the International Conference on Machine Learning, 2016.

Schaul et al., Prioritized Experience Replay. Proceedings of the International Conference on Learning Representations, 2016.

Giving credit

If you use Dopamine in your work, we ask that you cite our white paper. Here is an example BibTeX entry:

@article{castro18dopamine,
  author    = {Pablo Samuel Castro and
               Subhodeep Moitra and
               Carles Gelada and
               Saurabh Kumar and
               Marc G. Bellemare},
  title     = {Dopamine: {A} {R}esearch {F}ramework for {D}eep {R}einforcement {L}earning},
  year      = {2018},
  url       = {http://arxiv.org/abs/1812.06110},
  archivePrefix = {arXiv}
}