Research Scientist
Google, New York
E-Mail: vaishnavh at google.com
He/Him/His

Resume


Research Interests

I like thinking about when and why complex AI systems work, and how to make them better. I do this by pursuing minimal abstractions, insightful (counter)examples, and simple results. I am currently interested in understanding the limits of, and going beyond, the next-token prediction paradigm that underlies many current AI models (see this or this recent work). My prior work similarly identifies examples of failure and success across a broad range of settings in AI, including out-of-distribution generalization, generalization bounds, and GAN optimization, among other topics.

My work has been recognized with the Outstanding Paper Award at ICML 2025, the Outstanding New Directions Paper Award at NeurIPS 2019, an oral presentation at NeurIPS 2017, and a spotlight at ICLR 2021.

More on my research philosophy, which is shaped by what I have read.



Students I have worked with

I have had the good fortune of working closely with and/or mentoring the following students:


Select Papers (Google Scholar)


CONFERENCE PUBLICATIONS / FULL-LENGTH PREPRINTS

  • Deep sequence models memorize geometrically; it’s unclear why,
    NeurIPS 2025 Workshop on Foundations of Reasoning in Language Models,
    Shahriar Noroozizadeh, Vaishnavh Nagarajan, Elan Rosenfeld, Sanjiv Kumar
    [arxiv]

  • Roll the dice and look before you leap: Going beyond the creative limits of next-token prediction,
    International Conference on Machine Learning (ICML) 2025,
    (Double first author) Vaishnavh Nagarajan*, Chen Henry Wu*, Charles Ding and Aditi Raghunathan
    Winner of Outstanding Paper Award
    Oral presentation (1% acceptance)
    [arxiv][Poster][Oral Slides] [1h Talk Slides][Code]

  • The pitfalls of next-token prediction,
    International Conference on Machine Learning (ICML) 2024,
    (Double first author) Gregor Bachmann* and Vaishnavh Nagarajan*
    [arxiv][Poster][Slides][Simons Institute Talk][Code]
    • Also oral presentation at ICLR ‘24 Workshop “How Far Are We From AGI?”

  • Think before you speak: Training language models with pause tokens,
    International Conference on Learning Representations (ICLR) 2024,
    Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan
    [arxiv] [Poster]

  • Assessing Generalization via Disagreement,
    International Conference on Learning Representations (ICLR) 2022,
    (Double first author) Yiding Jiang*, Vaishnavh Nagarajan*, Christina Baek, J. Zico Kolter
    Accepted for Spotlight presentation, 5.2% acceptance
    [arxiv] [Poster]

  • Understanding the failure modes of out-of-distribution generalization,
    International Conference on Learning Representations (ICLR) 2021,
    Vaishnavh Nagarajan, Anders Andreassen and Behnam Neyshabur
    [arxiv] [Poster] [1h talk] [Slides]


WORKSHOP PAPERS
  • Theoretical Insights into Memorization in GANs,
    Neural Information Processing Systems (NeurIPS) 2017 - Integration of Deep Learning Theories Workshop
    Vaishnavh Nagarajan, Colin Raffel, Ian Goodfellow.
    [PDF]

  • Generalization in Deep Networks: The Role of Distance from Initialization,
    Neural Information Processing Systems (NeurIPS) 2017 - Deep Learning: Bridging Theory and Practice
    Vaishnavh Nagarajan and J. Zico Kolter.
    Spotlight talk
    [arxiv] [Poster]


THESIS
  • Explaining generalization in deep learning: progress and fundamental limits,
    Vaishnavh Nagarajan, 2021
    [arxiv]



Peer Review

Reviewer:

  • ICLR 2023, 2021 (outstanding reviewer award, top 10%)
  • NeurIPS 2024 (top 7%), 2023 (top 10%), 2021, 2020 (top 10%), 2019 (top 50%), 2018 (top 30%)
  • ICML 2024 & 2023 (Expert reviewer), 2022, 2021 (Expert reviewer, top 10%), 2020 (top 33%), 2019 (top 5%)
  • COLT 2019
  • ALT 2021
  • UAI 2022
  • AISTATS 2023 (top 10%), 2019
  • JMLR, Nature
  • Workshops: ICML 22 PODS, ICML 21 OPPO, ICLR 23 ME-FoMo, NeurIPS 23 DistShift, NeurIPS 23 R0-FoMo (area chair)

Area chair:

  • ICML 2025
  • NeurIPS 2025
  • COLM 2025

Last Updated: Jul 15 2025