Posts tagged #AI research

Do language models need sleep?

A CMU/UMD paper shows that what limits SSM-attention hybrids on deep reasoning isn't memory capacity but the number of passes over the context. A 'sleep' mechanism runs N recurrent passes at window eviction and consolidates the context into fast weights, moving computational depth offline.

Jakub Kontra, Developer