Yet Another Model for Arabic Dialect Identification
arXiv
  • Ajinkya Kulkarni, Mohamed bin Zayed University of Artificial Intelligence
  • Hanan Al Darmaki, Mohamed bin Zayed University of Artificial Intelligence
Document Type
Article
Abstract

In this paper, we describe a spoken Arabic dialect identification (ADI) model that consistently outperforms previously published results on two benchmark datasets: ADI-5 and ADI-17. We explore two architectures, ResNet and ECAPA-TDNN, coupled with two types of acoustic features: MFCCs and features extracted from the pre-trained self-supervised model UniSpeech-SAT Large, as well as a fusion of all four variants. We find that, individually, the ECAPA-TDNN network outperforms ResNet, and models with UniSpeech-SAT features outperform models with MFCCs by a large margin. Furthermore, a fusion of all four variants consistently outperforms the individual models. Our best models outperform previously reported results on both datasets, with accuracies of 84.7% and 96.9% on ADI-5 and ADI-17, respectively. © 2023, CC BY-NC-SA.

DOI
10.48550/arXiv.2310.13812
Publication Date
10-20-2023
Keywords
  • Acoustic features
  • Arabic dialects
  • Benchmark datasets
  • Best model
  • Dialect identification
  • Identification modeling
  • Individual modeling
  • Large margins
Comments

Preprint: arXiv

Archived with thanks to arXiv

Preprint License: CC BY NC SA 4.0

Uploaded 30 November 2023

Citation Information
A. Kulkarni and H. Aldarmaki, "Yet Another Model for Arabic Dialect Identification", arXiv, Oct 2023. doi:10.48550/arXiv.2310.13812