Developed over a year and a half, TTT models promise to process more data than transformers while using much less compute power. A fundamental part of transformers is the “hidden state,” a ...