Gunshi Gupta

Deep Learning Researcher

University of Oxford

OATML

Wayve

MILA

Biography

Hey! I’ll soon be graduating as a Machine Learning D.Phil student in the OATML group at the University of Oxford. I’m supervised by Prof. Yarin Gal.

I’m currently working on designing methods, architectures and benchmarks to enable transformer-based agents to do long-horizon tasks by creating and accessing memories, through large-scale RL.

Some of the topics I have done research on over the previous three years are:

  • Leveraging advances in visual diffusion modeling for robotics
  • Mechanistic interpretability in transformer-based world models
  • Training generative world models for video games and robotics, and
  • Causally-correct, sample-efficient learning from imbalanced data.

I also collaborated closely with researchers from Toyota Research (Adrien Gaidon and Rowan McAllister) on topics related to causal robot learning.

Prior to my Ph.D, I was a deep learning researcher at Wayve, exploring reinforcement learning algorithms on autonomous driving data. I graduated from a Machine Learning Research Master’s at Mila (Sept 2020), where I did research on meta learning, continual learning and inverse reinforcement learning. I was also an ED&I Fellow with the MPLS department at the University of Oxford in the 2022-2023 cohort.

Download my resumé.

Interests

  • Policy Learning, Reinforcement Learning
  • Diffusion modeling
  • Continual Learning, Meta Learning
  • Memory-augmented models

Education

  • D.Phil Machine Learning (AIMS CDT), 2024

    University of Oxford

  • Research Master's in Machine Learning, 2020

    Montreal Institute for Learning Algorithms

  • B.Tech in Maths and Computing (Applied Mathematics), 2016

    Delhi Technological University (DTU/DCE)

Experience

Deep Learning Intern (RL/IL)

Microsoft Research

Apr 2023 – Jul 2023 Cambridge
  1. I contributed to a team submission to NeurIPS titled “WHAM: World and Human Action Modelling in a Modern Xbox Game”, exploring a VQGAN-transformer-based world-and-action model trained on 3 years of gameplay trajectories in a high-fidelity multi-player game.
  2. I developed an evaluation suite for mechanistic interpretability of transformer representations, to track the emergence of game-relevant concepts such as the locations of adversaries and the player's health resources.

Deep Learning Researcher

Wayve

Jul 2020 – Sep 2021 London
I was part of the policy learning team, which focused on exploring algorithms that can learn in a robust and sample-efficient manner, aided by expert demonstrations.

Graduate Research Assistant

Robotics Research Center, IIITH

Feb 2017 – Apr 2018 Hyderabad

Here I:

  • Developed a Multi Robot Visual SLAM framework for the Center for Artificial Intelligence and Robotics (CAIR, India) that was tested using the Husky UGV platform
  • Published “View-Invariant Intersection Recognition from Videos using Deep Network Ensembles” at IROS 2018

Software Developer

Microsoft

Jun 2016 – Feb 2017 Hyderabad
  • Built prediction and summarisation modules for employee performance feedback using deep learning and NLP
  • Organised workshops on ‘Machine Learning Fundamentals’ for Microsoft employees

Publications & Preprints


Contact