Automatic Speech Recognition
ESPnet
multilingual
audio
speech-translation
language-identification
Eval Results
How to use espnet/owsm_ctc_v4_1B with ESPnet:
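
    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    # Load the pretrained model by its Hugging Face model ID
    model = Speech2Text.from_pretrained("espnet/owsm_ctc_v4_1B")

    # Read the waveform and its sampling rate from disk
    speech, rate = soundfile.read("speech.wav")

    # Keep the text of the first (top-ranked) hypothesis
    text, *_ = model(speech)[0]
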
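
The ESPnet usage snippet above reads the sampling rate but never checks it. The sketch below inspects a WAV file before inference using only the Python standard library. The 16 kHz mono requirement is an assumption for illustration (this page does not state the expected input format), and `describe_wav` and `is_ready_for_inference` are hypothetical helper names, not part of ESPnet.

```python
import wave

# Assumption: the model expects 16 kHz mono audio. This page does not state it.
EXPECTED_RATE = 16000


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def is_ready_for_inference(path, expected_rate=EXPECTED_RATE):
    """Return True if the file is mono and sampled at the expected rate."""
    rate, channels, _ = describe_wav(path)
    return rate == expected_rate and channels == 1
```

If the check fails, the file would need to be resampled or downmixed before being passed to the model.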
Evaluation results (dataset: hf-audio/open-asr-leaderboard; date: 2025-01-16; source: open-asr-leaderboard, https://huggingface.co/hf-audio, user: hf-audio):

| task_id | value |
| --- | --- |
| mean_wer | 7.42 |
| rtfx | 453.97 |
| ami_wer | 13.1 |
| earnings22_wer | 13.66 |
| gigaspeech_wer | 10.83 |
| librispeech_clean_wer | 2.59 |
| librispeech_other_wer | 4.89 |
| spgispeech_wer | 2.55 |
| tedlium_wer | 4.43 |
| voxpopuli_wer | 7.35 |
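
As a quick consistency check on the results above, the sketch below recomputes `mean_wer` from the eight per-dataset WER values. It assumes that `mean_wer` is the unweighted average of those eight values, which this page does not state explicitly; `unweighted_mean` is a helper defined here for illustration.

```python
# Per-dataset WER values copied from the evaluation results above.
wer = {
    "ami": 13.1,
    "earnings22": 13.66,
    "gigaspeech": 10.83,
    "librispeech_clean": 2.59,
    "librispeech_other": 4.89,
    "spgispeech": 2.55,
    "tedlium": 4.43,
    "voxpopuli": 7.35,
}
reported_mean_wer = 7.42


def unweighted_mean(values):
    """Average the values with equal weight per dataset."""
    values = list(values)
    return sum(values) / len(values)


computed = unweighted_mean(wer.values())
print(f"computed mean WER: {computed:.3f}, reported: {reported_mean_wer}")
```

Under that assumption, the computed average agrees with the reported value to within rounding.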