January 16, 2025:
Google Innovates LLMs with Cost-Efficient Titans Architecture - Google's new Titans architecture tackles LLM cost challenges by pairing a neural long-term memory module with traditional attention blocks. The design serves both short- and long-term memory tasks, letting models efficiently extend their effective memory at inference time. Titans surpasses traditional transformers and linear recurrent models on long-sequence tasks while using fewer parameters.
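The core idea of combining a sliding-window attention path with a memory module that is updated during inference can be illustrated with a toy sketch. The following is a minimal, illustrative implementation in NumPy, not Google's actual code: the memory is modeled as a simple linear associative map updated by a gradient-style "surprise" step, and all class names, shapes, and hyperparameters are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class ToyTitansLayer:
    """Toy sketch of a memory-augmented attention layer.

    Combines short-term attention over a small sliding window with a
    linear associative memory that keeps learning at inference time.
    Purely illustrative; shapes and the update rule are assumptions.
    """

    def __init__(self, dim, window=4, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.dim, self.window, self.lr = dim, window, lr
        self.M = np.zeros((dim, dim))  # long-term associative memory
        scale = 1.0 / np.sqrt(dim)
        self.Wq = rng.standard_normal((dim, dim)) * scale
        self.Wk = rng.standard_normal((dim, dim)) * scale
        self.Wv = rng.standard_normal((dim, dim)) * scale
        self.keys, self.vals = [], []  # sliding attention window

    def step(self, x):
        q, k, v = x @ self.Wq, x @ self.Wk, x @ self.Wv
        # Long-term path: read the associative memory with the query.
        recalled = self.M @ q
        # "Surprise"-style write: one gradient step on ||M k - v||^2,
        # so the memory keeps adapting during inference.
        err = self.M @ k - v
        self.M -= self.lr * np.outer(err, k)
        # Short-term path: attention over the last `window` tokens only,
        # which keeps per-step attention cost bounded on long sequences.
        self.keys.append(k)
        self.vals.append(v)
        self.keys = self.keys[-self.window:]
        self.vals = self.vals[-self.window:]
        K, V = np.stack(self.keys), np.stack(self.vals)
        attn = softmax(q @ K.T / np.sqrt(self.dim)) @ V
        # Combine both memory paths.
        return attn + recalled
```

In this sketch, attention cost stays constant per token because the window is fixed, while information older than the window survives only through the memory matrix, which is why repeated exposure to an association makes it recallable long after it has left the attention window.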
The architecture balances local context against persistent memory, reducing inference costs on long sequences and broadening the range of practical applications. Google aims to advance enterprise use cases with the approach and plans to release code for training and evaluating the models.