emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation Paper • 2312.15185 • Published Dec 23, 2023
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix Paper • 2505.13032 • Published May 19, 2025 • 4
Omni-Captioner: Data Pipeline, Models, and Benchmark for Omni Detailed Perception Paper • 2510.12720 • Published Oct 14, 2025 • 2
Evaluating the Expressive Appropriateness of Speech in Rich Contexts Paper • 2605.09413 • Published May 10, 2026 • 5
Qwen3 Omni Captioner Demo • Generate a caption for any uploaded or recorded audio • 63