Research Scientist
Google, New York
E-Mail: vaishnavh at google.com
He/Him/His

Resume

Research Interests

I like thinking about when and why complex AI systems work. I hope to do this by pursuing minimal abstractions, insightful (counter)examples, and clear, nuanced arguments. I am currently interested in understanding the limits of (and going beyond) the next-token prediction paradigm that underlies many current AI models (see my recent work below). My prior work similarly identifies examples of failure and success across a broad range of settings in AI, typically within what were the prevailing paradigms at the time: out-of-distribution generalization, uniform-convergence-based generalization bounds, and GAN optimization.

I am strongly against pushing too many papers into the void. I also enjoy deep collaborations where we meet often, so feel free to reach out to brainstorm!



Students I have worked with

I have had the good fortune of closely working with and/or mentoring the following students:


Select Papers (Google Scholar)


CONFERENCE PUBLICATIONS
  • Roll the dice and look before you leap: Going beyond the creative limits of next-token prediction,
    International Conference on Machine Learning (ICML) 2025,
    (Double first author) Vaishnavh Nagarajan*, Chen Henry Wu*, Charles Ding and Aditi Raghunathan
    Oral presentation (1% acceptance)
    [arxiv][Poster]

  • The pitfalls of next-token prediction,
    International Conference on Machine Learning (ICML) 2024,
    (Double first author) Gregor Bachmann* and Vaishnavh Nagarajan*
    [arxiv][Poster][Slides][Simons Institute Talk]
    • Also oral presentation at ICLR ‘24 Workshop “How Far Are We From AGI?”

  • Think before you speak: Training language models with pause tokens,
    International Conference on Learning Representations (ICLR) 2024,
    Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, Vaishnavh Nagarajan
    [arxiv] [Poster]

  • Assessing Generalization via Disagreement,
    International Conference on Learning Representations (ICLR) 2022,
    (Double first author) Yiding Jiang*, Vaishnavh Nagarajan*, Christina Baek, J. Zico Kolter
Spotlight presentation (5.2% acceptance)
    [arxiv] [Poster]

  • Understanding the failure modes of out-of-distribution generalization,
    International Conference on Learning Representations (ICLR) 2021,
    Vaishnavh Nagarajan, Anders Andreassen and Behnam Neyshabur
    [arxiv] [Poster]
    • Invited poster presentation at Conceptual Understanding of Deep Learning Workshop, Google Algorithms Workshop Series, 2021.


WORKSHOP PAPERS
  • Theoretical Insights into Memorization in GANs,
    Neural Information Processing Systems (NeurIPS) 2017 - Integration of Deep Learning Theories Workshop
    Vaishnavh Nagarajan, Colin Raffel, Ian Goodfellow.
    [PDF]

  • Generalization in Deep Networks: The Role of Distance from Initialization,
    Neural Information Processing Systems (NeurIPS) 2017 - Deep Learning: Bridging Theory and Practice
    Vaishnavh Nagarajan and J. Zico Kolter.
    Spotlight talk
    [arxiv] [Poster]


THESIS
  • Explaining generalization in deep learning: progress and fundamental limits,
    Vaishnavh Nagarajan, 2021
    [arxiv]



Peer Review

Reviewer:

  • ICLR 2023, 2021 (outstanding reviewer award, top 10%)
  • NeurIPS 2024 (top 7%), 2023 (top 10%), 2021, 2020 (top 10%), 2019 (top 50%), 2018 (top 30%)
  • ICML 2024 & 2023 (Expert reviewer), 2022, 2021 (Expert reviewer, top 10%), 2020 (top 33%), 2019 (top 5%)
  • COLT 2019
  • ALT 2021
  • UAI 2022
  • AISTATS 2023 (top 10%), 2019
  • JMLR, Nature
  • Workshops: PODS (ICML 2022), OPPO (ICML 2021), ME-FoMo (ICLR 2023), DistShift (NeurIPS 2023), R0-FoMo (NeurIPS 2023, area chair)

Area chair:

  • ICML 2025
  • NeurIPS 2025
  • COLM 2025

Last Updated: May 29, 2025