Test-Time Training for LLMs: Models Should Keep Learning After Deployment
Pretraining taught us that neural networks can compress massive amounts of data into weights. But once we deploy an LLM, we usually stop updating those weights entirely. The model is frozen — it reads new inputs but never learns from them. Test-time training asks a more ambitious question: what if the model kept learning while it was being used? TTT-E2E is one practical answer. It lets a language model adapt its weights online from the very sequence it is reading. One consequence is dramatically stronger long-context behavior — but the deeper insight is that inference and learning don’t have to be separate phases. ...
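To make "adapting weights online from the sequence it is reading" concrete, here is a toy sketch of the idea — not TTT-E2E's actual recipe, which updates a full LLM end-to-end. The sketch assumes the simplest possible "language model" (a bigram logit table over a tiny vocabulary) and takes one SGD step on the next-token cross-entropy loss after each token is read; the function names `softmax` and `ttt_read` are illustrative, not from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ttt_read(seq, vocab_size, lr=0.5):
    """Read `seq` left to right. After scoring each next token, take one
    SGD step on that token's cross-entropy loss — learning while reading."""
    # Bigram logit table W[prev][next]; all zeros = uniform prediction.
    W = [[0.0] * vocab_size for _ in range(vocab_size)]
    losses = []
    for prev, nxt in zip(seq, seq[1:]):
        probs = softmax(W[prev])
        losses.append(-math.log(probs[nxt]))  # loss on the token just read
        # Gradient of cross-entropy w.r.t. the logits: probs - one_hot(nxt).
        for j in range(vocab_size):
            W[prev][j] -= lr * (probs[j] - (1.0 if j == nxt else 0.0))
    return losses

# A repetitive stream: a frozen model would score every cycle identically,
# but the adapting model predicts the later part of the same stream better.
seq = [0, 1, 2] * 40
losses = ttt_read(seq, vocab_size=3)
half = len(losses) // 2
early = sum(losses[:half]) / half
late = sum(losses[half:]) / (len(losses) - half)
print(late < early)  # → True
```

The point of the toy is the shape of the loop, not the model: prediction and a gradient step interleave inside a single forward pass over the input, so "inference" and "training" are the same phase.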