I am a Staff Research Engineer at Google DeepMind in London. My research interests span reinforcement learning, data-efficient learning, multimodal modeling, and training large-scale models. I’m also interested in building tools that accelerate the pace of research in machine learning and AI.
Among other contributions, I pioneered distributed reinforcement learning at DeepMind and in the wider academic community. Our papers Distributed Prioritized Experience Replay and Distributed Distributional Deterministic Policy Gradients (D4PG) helped demonstrate the effectiveness of distributed reinforcement learning. We developed and open-sourced Acme, Reverb, and Launchpad to make distributed RL easier.
Recently I have been working on extending transformers to multiple modalities. One example of this is Gato, a multi-modal, multi-task, multi-embodiment generalist policy. As part of Google DeepMind’s Gemini team, I am working on the next generation of large-scale multimodal transformer models.
I hold a BA in mathematical economics and an ScM in computer science from Brown University.
|Jul 5, 2023||Transactions on Machine Learning Research (TMLR) awarded Gato its first Outstanding Certification (Best Paper Award). Read the full post here.|
|Jun 20, 2023||Gato is used as the backbone for RoboCat, the first large transformer sequence model that can solve a large set of dexterous tasks on multiple real robotic embodiments with differing observation and action specifications.|
|Dec 9, 2022||Part of a panel discussion on Scaling + Models at the 5th Robot Learning Workshop at NeurIPS. You can find a recording here.|
|Dec 3, 2022||Gave an invited talk on Gato at the NeurIPS workshop Foundation Models for Decision Making. You can find a recording here.|
- Distributed prioritized experience replay. arXiv preprint arXiv:1803.00933, 2018
- Distributed distributional deterministic policy gradients. arXiv preprint arXiv:1804.08617, 2018
- Acme: A research framework for distributed reinforcement learning. arXiv preprint arXiv:2006.00979, 2020
- Reverb: a framework for experience replay. arXiv preprint arXiv:2102.04736, 2021
- Launchpad: a programming model for distributed machine learning research. arXiv preprint arXiv:2106.04516, 2021
- A Generalist Agent. arXiv preprint arXiv:2205.06175, 2022