ML-NYC Speaker Series and Happy Hour: John Langford
Schedule
Wed Mar 25, 2026, 4:00 pm to 6:00 pm (UTC-04:00)
Location
Flatiron Institute | New York, NY
About this Event
The ML in NYC Speaker Series + Happy Hour is excited to host John Langford from Microsoft Research as our March speaker! His talk will take place Wednesday, March 25 at 4pm at the Flatiron Institute. As always, there will be a reception afterward for all attendees.
Title: Next-Latent Prediction Transformers Learn Compact World Models
Abstract: Transformers replace recurrence with a memory that grows with sequence length and self-attention that enables ad hoc lookups over past tokens. Consequently, they lack an inherent incentive to compress history into compact latent states with consistent transition rules, which often leads to learned solutions that generalize poorly. We introduce Next-Latent Prediction (NextLat), which extends standard next-token training with self-supervised predictions in the latent space. Specifically, NextLat trains a transformer to learn latent representations that are predictive of its next latent state given the next output token. Theoretically, we show that these latents provably converge to belief states, compressed information of the history necessary to predict the future. This simple auxiliary objective also injects a recurrent inductive bias into transformers, while leaving their architecture, parallel training, and inference unchanged. NextLat effectively encourages the transformer to form compact internal world models with its own belief states and transition dynamics -- a crucial property absent in standard next-token prediction transformers. Empirically, across benchmarks targeting core sequence modeling competencies -- world modeling, reasoning, planning, and language modeling -- NextLat demonstrates significant gains over standard next-token training in downstream accuracy, representation compression, and lookahead planning. NextLat stands as a simple and efficient paradigm for shaping transformer representations toward stronger generalization.
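To make the training objective in the abstract concrete, here is a minimal numpy sketch of the kind of combined loss NextLat describes: the usual next-token cross-entropy plus an auxiliary term that asks a small predictor to map the current latent state and the next token to the next latent state. All shapes, the predictor form `g([h_t; emb(x_{t+1})])`, and the weighting `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d, vocab = 6, 8, 10               # sequence length, latent dim, vocab size (assumed)
H = rng.normal(size=(T, d))          # stand-in per-step transformer latents h_1..h_T
x_next = rng.integers(0, vocab, T)   # next-token targets: x_next[t] is the token at t+1

W_out = rng.normal(size=(d, vocab)) * 0.1   # output head (illustrative)
E = rng.normal(size=(vocab, d)) * 0.1       # token embeddings fed to the predictor
W_pred = rng.normal(size=(2 * d, d)) * 0.1  # latent predictor g([h_t; emb(x_{t+1})])

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Standard next-token cross-entropy over the sequence
probs = softmax(H @ W_out)
ce_loss = -np.mean(np.log(probs[np.arange(T), x_next] + 1e-9))

# Auxiliary next-latent prediction: from (h_t, x_{t+1}) predict h_{t+1}.
# In a real training setup the target latent H[1:] would typically be
# treated as a constant (stop-gradient); numpy has no autograd, so this
# sketch only computes the loss value.
g_in = np.concatenate([H[:-1], E[x_next[:-1]]], axis=-1)
h_hat = g_in @ W_pred                # predicted next latents, shape (T-1, d)
latent_loss = np.mean((h_hat - H[1:]) ** 2)

lam = 0.5                            # auxiliary weighting (assumed hyperparameter)
total_loss = ce_loss + lam * latent_loss
```

Because the auxiliary target lives in the model's own latent space, minimizing it pressures the latents toward states that are self-consistent under a learned transition rule, which is the recurrent inductive bias the abstract refers to.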
Bio: John Langford is a Partner Research Manager at Microsoft Research New York, of which he was one of the founding members. He is known for his work on the Isomap embedding algorithm, CAPTCHA challenges, Cover Trees for nearest neighbor search, Contextual Bandits (which he coined) for reinforcement learning, and learning reductions. He is also the principal developer of Vowpal Wabbit and the primary author of the popular Machine Learning blog hunch.net. He was president of the International Conference on Machine Learning from 2019 to 2021. Dr. Langford was previously affiliated with Yahoo! Research, Toyota Technological Institute at Chicago, and IBM's Watson Research Center. He studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor's degree in 1997, and he received his Ph.D. in computer science from Carnegie Mellon University in 2002.
Where is it happening?
Flatiron Institute, 162 5th Avenue, New York, United States
Tickets: USD 0.00 (free admission)