The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation Paper • 2606.20536 • Published 17 days ago • 12
Article Introducing North Mini Code: Cohere’s First Model For Developers CohereLabs • 25 days ago • 79
Article Fine-tune FLUX.2 [klein] with a LoRA under 60 minutes black-forest-labs • 30 days ago • 25
Qwen3.5 Collection Qwen3.5 is Qwen's new model family, including Qwen3.5 Small (0.8B, 2B, 4B, 9B) and Qwen3.5 Medium (35B-A3B, 27B, 122B-A10B, 397B-A17B). • 25 items • Updated 19 days ago • 161
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 ggerganov, ngxson, allozaur, lysandre, victor, julien-c • Feb 20 • 507
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 59
Hibiki-Zero Collection Streaming speech translation without the need for word-level alignments • 4 items • Updated May 9 • 4
CASA Collection CASA: Cross-Attention over Self-Attention for Efficient Vision-Language Fusion on long-context streaming inputs • 6 items • Updated Mar 9 • 8
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 100