MultiHashFormer: Hash-based Generative Language Models • Paper • 2606.28057 • Published 7 days ago • 19
Deconstructing Attention: Investigating Design Principles for Effective Language Modeling • Paper • 2510.11602 • Published Oct 13, 2025 • 15