mlx-community/NVIDIA-Nemotron-3-Nano-30B-A3B-OptiQ-4bit Text Generation • 32B • Updated 15 minutes ago
KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks Paper • 2606.03458 • Published 3 days ago • 47
mlx-community/NVIDIA-Nemotron-3-Nano-4B-OptiQ-4bit Text Generation • 0.8B • Updated about 19 hours ago • 101