ArTST: Arabic Text and Speech Transformer
arXiv
  • Hawau Olamide Toyin, Mohamed bin Zayed University of Artificial Intelligence
  • Amirbek Djanibekov, Mohamed bin Zayed University of Artificial Intelligence
  • Ajinkya Kulkarni, Mohamed bin Zayed University of Artificial Intelligence
  • Hanan Aldarmaki, Mohamed bin Zayed University of Artificial Intelligence
Document Type
Article
Abstract

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language. The model architecture follows the unified-modal framework, SpeechT5, that was recently released for English, and is focused on Modern Standard Arabic (MSA), with plans to extend the model to dialectal and code-switched Arabic in future editions. We pre-trained the model from scratch on MSA speech and text data, and fine-tuned it for the following tasks: Automatic Speech Recognition (ASR), Text-To-Speech synthesis (TTS), and spoken dialect identification. In our experiments comparing ArTST with SpeechT5, as well as with previously reported results in these tasks, ArTST performs on a par with or exceeds the current state of the art in all three tasks. Moreover, we find that our pre-training is conducive to generalization, which is particularly evident in the low-resource TTS task. The pre-trained model as well as the fine-tuned ASR and TTS models are released for research use.

Copyright © 2023, The Authors. All rights reserved.

DOI
10.48550/arXiv.2310.16621
Publication Date
10-25-2023
Keywords
  • Arabic languages
  • Arabic speech
  • Arabic texts
  • Automatic speech recognition
  • Modeling architecture
  • Modern standards
  • Open-source
  • Speech data
  • Speech technology
  • Standard Arabics
Comments

Preprint: arXiv

Archived with thanks to arXiv

Uploaded 30 November 2023

Citation Information
H.O. Toyin, A. Djanibekov, A. Kulkarni, and H. Aldarmaki, "ArTST: Arabic Text and Speech Transformer", arXiv, Oct 2023. doi:10.48550/arXiv.2310.16621