Towards Equal Opportunity Fairness through Adversarial Learning
arXiv
  • Xudong Han, The University of Melbourne, Australia
  • Timothy Baldwin, The University of Melbourne, Australia & Mohamed bin Zayed University of Artificial Intelligence
  • Trevor Cohn, The University of Melbourne, Australia
Document Type
Article
Abstract

Adversarial training is a common approach for bias mitigation in natural language processing. Although most work on debiasing is motivated by equal opportunity, it is not explicitly captured in standard adversarial training. In this paper, we propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features and more explicitly model equal opportunity. Experimental results over two datasets show that our method substantially improves over standard adversarial debiasing methods, in terms of the performance-fairness trade-off. Copyright © 2022, The Authors. All rights reserved.

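To make the abstract's idea concrete, below is a minimal sketch (not the authors' released code) of an adversarial discriminator conditioned on the target class: the adversary sees both the hidden representation and an embedding of the class label, so it models the protected attribute within each class, which aligns more closely with equal opportunity than a class-agnostic adversary. All module names, dimensions, and the training convention noted in the comments are illustrative assumptions.

```python
# Sketch of a class-conditioned ("augmented") adversarial discriminator.
# Names and sizes are illustrative; this is not the paper's implementation.
import torch
import torch.nn as nn

class AugmentedDiscriminator(nn.Module):
    """Predicts the protected attribute from a hidden representation,
    augmented with an embedding of the target class."""
    def __init__(self, hidden_dim, num_classes, num_protected, class_emb_dim=32):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, class_emb_dim)
        self.net = nn.Sequential(
            nn.Linear(hidden_dim + class_emb_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_protected),
        )

    def forward(self, h, y):
        # h: (batch, hidden_dim) encoder representation from the main model
        # y: (batch,) target-class labels used to condition the adversary
        z = torch.cat([h, self.class_emb(y)], dim=-1)
        return self.net(z)

# In standard adversarial debiasing, the main model minimises its task loss
# while maximising the discriminator's loss on the protected attribute
# (e.g. via a gradient-reversal layer); conditioning on y makes that
# pressure per-class rather than global.
if __name__ == "__main__":
    disc = AugmentedDiscriminator(hidden_dim=256, num_classes=2, num_protected=2)
    h = torch.randn(8, 256)          # hidden states from the main model
    y = torch.randint(0, 2, (8,))    # target-class labels
    g = torch.randint(0, 2, (8,))    # protected-attribute labels
    adv_loss = nn.CrossEntropyLoss()(disc(h, y), g)
    print(adv_loss.item())
```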
DOI
10.48550/arXiv.2203.06317
Publication Date
March 11, 2022
Keywords
  • Machine learning
  • Natural language processing systems
  • Adversarial learning
  • De-biasing
  • Equal opportunity
  • Performance
  • Rich features
  • Target class
  • Trade-off
  • Economic and social effects
  • Artificial Intelligence (cs.AI)
  • Computation and Language (cs.CL)
Comments

IR deposit conditions: not described

Preprint available on arXiv

Citation Information
X. Han, T. Baldwin, and T. Cohn, "Towards Equal Opportunity Fairness through Adversarial Learning", 2022, arXiv:2203.06317