SeamlessM4T-v2 Bahnar-Vietnamese S2TT
This model is a fine-tuned version of facebook/seamless-m4t-v2-large for Speech-to-Text Translation (S2TT) from Bahnar to Vietnamese.
Model Details
- Base model: facebook/seamless-m4t-v2-large
- Task: Speech-to-Text Translation (S2TT)
- Source language: Bahnar (bdq)
- Target language: Vietnamese (vie)
Note: This model only supports the Speech-to-Text Translation (S2TT) task.
Dataset
This model was trained on the Bahnar Speech Translation Dataset.
The dataset was curated from internet sources and processed using automatic alignment techniques. It contains Bahnar speech audio paired with Vietnamese translations.
For more details on the data creation process, please refer to the dataset README and repository below.
Usage with Transformers
import torch
import soundfile as sf
from transformers import AutoProcessor, SeamlessM4Tv2ForSpeechToText

model_id = "cuong06/seamlessm4t-v2-Bahnar-Vietnamese"
processor = AutoProcessor.from_pretrained(model_id)
model = SeamlessM4Tv2ForSpeechToText.from_pretrained(model_id)

# Load the Bahnar speech clip as a waveform plus its sampling rate.
audio, sampling_rate = sf.read("sample.wav")
inputs = processor(
    audio=audio,
    sampling_rate=sampling_rate,
    return_tensors="pt"
)

# Generate Vietnamese token IDs without tracking gradients.
with torch.no_grad():
    predicted_ids = model.generate(
        **inputs,
        tgt_lang="vie"
    )

# Decode the first (and only) sequence in the batch to text.
translation = processor.batch_decode(
    predicted_ids,
    skip_special_tokens=True
)[0]
print(translation)
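The snippet above passes the file's native sampling rate straight to the processor. The SeamlessM4T feature extractor is configured for 16 kHz mono input, so recordings at other rates or with several channels likely need converting first. The following is a minimal sketch of that step, assuming NumPy is available; `to_mono_16k` and `TARGET_SR` are hypothetical names, not part of this repository, and the linear interpolation used here is a crude stand-in for a proper anti-aliased resampler such as `torchaudio.functional.resample` or `librosa.resample`.

```python
import numpy as np

TARGET_SR = 16_000  # assumed input rate of the SeamlessM4T feature extractor


def to_mono_16k(audio, sampling_rate):
    """Downmix to mono and resample to 16 kHz using linear interpolation."""
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 2:
        # soundfile returns (frames, channels) for multi-channel files
        audio = audio.mean(axis=1)
    if sampling_rate == TARGET_SR:
        return audio
    # Number of output samples that covers the same duration at 16 kHz
    n_out = int(round(len(audio) * TARGET_SR / sampling_rate))
    src_t = np.arange(len(audio)) / sampling_rate  # timestamps of input samples
    dst_t = np.arange(n_out) / TARGET_SR           # timestamps of output samples
    return np.interp(dst_t, src_t, audio).astype(np.float32)
```

With this helper, one would call `audio = to_mono_16k(audio, sampling_rate)` after `sf.read(...)` and then pass `sampling_rate=TARGET_SR` to the processor.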
Evaluation
Results on the test set using beam search (beam_size=5):
| Metric | Score |
|---|---|
| BLEU | 24.58 |
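The usage snippet earlier in this card decodes with the model's default generation settings, so its output may differ from the beam-search setup reported here. As a rough sketch, the reported `beam_size=5` is assumed to correspond to the `num_beams=5` generation argument in Transformers; the `translate` helper below is a hypothetical wrapper, not the evaluation script used by the author.

```python
def translate(model, processor, audio, sampling_rate, num_beams=5):
    """Translate one Bahnar utterance into Vietnamese text with beam search."""
    inputs = processor(
        audio=audio,
        sampling_rate=sampling_rate,
        return_tensors="pt",
    )
    # num_beams=5 is assumed to match the beam_size=5 used for the reported score
    predicted_ids = model.generate(
        **inputs,
        tgt_lang="vie",
        num_beams=num_beams,
    )
    # Return the first (only) decoded sequence in the batch
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

Here `model` and `processor` are the objects loaded in the usage section, for example `translate(model, processor, audio, sampling_rate)`.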
sacreBLEU Signature
nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.6.
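To score new outputs comparably, the fields of this signature can be mapped onto the options of `sacrebleu.metrics.BLEU`: `case:lc` to lowercasing, `tok:13a` to the tokenizer, `smooth:exp` to the smoothing method, and `eff:no` to effective order being off. The sketch below shows one way to do that, assuming the `sacrebleu` package is installed; the helper names are hypothetical, and the exact scoring script behind the reported number is not given in this card.

```python
def parse_signature(signature):
    """Split a sacreBLEU signature such as 'case:lc|tok:13a' into a dict."""
    return dict(field.split(":", 1) for field in signature.split("|"))


def bleu_kwargs(signature):
    """Map signature fields to keyword arguments of sacrebleu.metrics.BLEU."""
    sig = parse_signature(signature)
    return {
        "lowercase": sig["case"] == "lc",        # lc = lowercase before scoring
        "tokenize": sig["tok"],                  # e.g. "13a"
        "smooth_method": sig["smooth"],          # e.g. "exp"
        "effective_order": sig["eff"] == "yes",  # off for corpus-level BLEU here
    }


def corpus_bleu(hypotheses, references, signature):
    """Corpus-level BLEU for one reference per hypothesis (nrefs:1)."""
    from sacrebleu.metrics import BLEU  # requires `pip install sacrebleu`

    metric = BLEU(**bleu_kwargs(signature))
    # sacreBLEU expects a list of reference streams, hence the extra list
    return metric.corpus_score(hypotheses, [references]).score
```

Calling `corpus_bleu(system_outputs, gold_translations, signature)` with the signature string above would then return a BLEU score on the same 0 to 100 scale as the table.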
Limitations
- The dataset was automatically collected and aligned from internet sources, so some noisy samples may remain.
- Performance may degrade on unseen dialects, noisy audio, or long-form speech.
- This model is intended only for Bahnar → Vietnamese speech translation.
Citation
If you use this model or the dataset, please cite the repository and dataset.
Repository
@misc{bahnar_vietnamese_s2tt,
  author = {Dam Cuong},
  title = {Bahnar-Vietnamese Speech-to-Text Translation},
  year = {2026},
  howpublished = {\url{https://github.com/damcuong8/Bahnar-Vietnamese-S2TT}}
}
Links
- Repository: https://github.com/damcuong8/Bahnar-Vietnamese-S2TT