Gabriel Barth-Maron


I am a Staff Research Engineer at Google DeepMind in London. My research interests span reinforcement learning, data-efficient learning, multimodal modeling, and training large-scale models. I’m also interested in building tools that accelerate the pace of research in machine learning and AI.

Among other contributions, I pioneered Distributed Reinforcement Learning at DeepMind and in the wider academic community. Our papers Distributed Prioritized Experience Replay and Distributed Distributional Deterministic Policy Gradients (D4PG) helped demonstrate the effectiveness of Distributed Reinforcement Learning. We developed and open-sourced Acme, Reverb, and Launchpad to make Distributed RL easier.

Recently I have been working on extending transformers to multiple modalities. One example of this is Gato, a multi-modal, multi-task, multi-embodiment generalist policy. As part of Google DeepMind’s Gemini team I am working on the next generation of large-scale multimodal transformer models.

I hold a BA in mathematical economics and an ScM in computer science from Brown University.


Jul 5, 2023 Transactions on Machine Learning Research (TMLR) awarded Gato its first Outstanding Certification (Best Paper Award). Read the full post here.
Jun 20, 2023 Gato is used as the backbone for RoboCat, the first large transformer sequence model that can solve a large set of dexterous tasks on multiple real robotic embodiments with differing observation and action specifications.
Dec 9, 2022 Took part in a panel discussion on Scaling + Models at the 5th Robot Learning Workshop at NeurIPS. You can find a recording here.
Dec 3, 2022 Gave an invited talk on Gato at the NeurIPS workshop Foundation Models for Decision Making. You can find a recording here.

selected publications

  1. Distributed prioritized experience replay
    Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado Van Hasselt, and 1 more author
    arXiv preprint arXiv:1803.00933 2018
  2. Distributed distributional deterministic policy gradients
    Gabriel Barth-Maron, Matthew W Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, and 3 more authors
    arXiv preprint arXiv:1804.08617 2018
  3. Acme: A research framework for distributed reinforcement learning
    Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal Behbahani, Tamara Norman, and 5 more authors
    arXiv preprint arXiv:2006.00979 2020
  4. Reverb: a framework for experience replay
    Albin Cassirer, Gabriel Barth-Maron, Eugene Brevdo, Sabela Ramos, Toby Boyd, Thibault Sottiaux, and 1 more author
    arXiv preprint arXiv:2102.04736 2021
  5. Launchpad: a programming model for distributed machine learning research
    Fan Yang, Gabriel Barth-Maron, Piotr Stańczyk, Matthew Hoffman, Siqi Liu, Manuel Kroiss, and 2 more authors
    arXiv preprint arXiv:2106.04516 2021
  6. A Generalist Agent
    Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, and 5 more authors
    arXiv preprint arXiv:2205.06175 2022