"Yet Another Model for Arabic Dialect Identification" by Ajinkya Kulkarni

Yet Another Model for Arabic Dialect Identification

arXiv

Ajinkya Kulkarni, Mohamed bin Zayed University of Artificial Intelligence
Hanan Al Darmaki, Mohamed bin Zayed University of Artificial Intelligence

Document Type

Article

Abstract

In this paper, we describe a spoken Arabic dialect identification (ADI) model that consistently outperforms previously published results on two benchmark datasets: ADI-5 and ADI-17. We explore two architectural variations, ResNet and ECAPA-TDNN, coupled with two types of acoustic features: MFCCs and features extracted from the pre-trained self-supervised model UniSpeech-SAT Large, as well as a fusion of all four variants. We find that individually, the ECAPA-TDNN network outperforms ResNet, and models with UniSpeech-SAT features outperform models with MFCCs by a large margin. Furthermore, a fusion of all four variants consistently outperforms individual models. Our best models outperform previously reported results on both datasets, with accuracies of 84.7% and 96.9% on ADI-5 and ADI-17, respectively. © 2023, CC BY-NC-SA.

DOI

10.48550/arXiv.2310.13812

Publication Date

10-20-2023

Keywords

Acoustic features,
Arabic dialects,
Benchmark datasets,
Best model,
Dialect identification,
Identification modeling,
Individual modeling,
Large margins

Comments

Preprint: arXiv

Archived with thanks to arXiv

Preprint License: CC BY-NC-SA 4.0

Uploaded 30 November 2023

Additional Links

arXiv link: https://doi.org/10.48550/arXiv.2310.13812

Citation Information

A. Kulkarni and H. Aldarmaki, "Yet Another Model for Arabic Dialect Identification", arXiv, Oct 2023. doi:10.48550/arXiv.2310.13812