The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 3.9k
principled-intelligence/gemma-4-E2B-it-text-only Feature Extraction • 5B • Updated Apr 3 • 1.77k • 6
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 727
meta-llama/Meta-Llama-3-8B-Instruct Text Generation • 8B • Updated Jun 18, 2025 • 1.41M • 4.62k
mistralai/Mistral-7B-Instruct-v0.2 Text Generation • 7B • Updated Jul 24, 2025 • 1.2M • 3.16k
TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ Text Generation • 47B • Updated Dec 14, 2023 • 399 • 141