Towards Equal Opportunity Fairness through Adversarial Learning
  • Xudong Han, The University of Melbourne, Australia
  • Timothy Baldwin, The University of Melbourne, Australia & Mohamed bin Zayed University of Artificial Intelligence
  • Trevor Cohn, The University of Melbourne, Australia
Abstract
Adversarial training is a common approach for bias mitigation in natural language processing. Although most work on debiasing is motivated by equal opportunity, it is not explicitly captured in standard adversarial training. In this paper, we propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features and more explicitly model equal opportunity. Experimental results over two datasets show that our method substantially improves over standard adversarial debiasing methods, in terms of the performance-fairness trade-off. Copyright © 2022, The Authors. All rights reserved.
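The core idea in the abstract — feeding the target class to the discriminator so it can model class-conditional bias, in line with equal opportunity — can be illustrated with a minimal sketch. This is a hypothetical illustration of the input-augmentation step only, not the authors' implementation; the function name and shapes are assumptions.

```python
import numpy as np

def augment_with_class(hidden, labels, num_classes):
    """Concatenate a one-hot encoding of the target class onto each
    hidden representation. A discriminator fed this augmented input can
    learn class-conditional features, more directly reflecting equal
    opportunity (fairness conditioned on the true class) than a
    discriminator that sees the hidden state alone.

    hidden: (batch, dim) float array of encoder representations
    labels: (batch,) int array of target-class indices
    """
    one_hot = np.eye(num_classes)[labels]        # (batch, num_classes)
    return np.concatenate([hidden, one_hot], axis=1)

# Illustrative usage: 3 examples, 4-dim hidden states, 2 target classes
h = np.random.randn(3, 4)
y = np.array([0, 1, 1])
aug = augment_with_class(h, y, num_classes=2)    # shape (3, 6)
```

In standard adversarial debiasing the discriminator would receive only `h`; appending the one-hot label is the augmentation that lets it capture per-class protected-attribute leakage.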

Keywords
  • Machine learning,
  • Natural language processing systems,
  • Adversarial learning,
  • De-biasing,
  • Equal opportunity,
  • Performance,
  • Rich features,
  • Target class,
  • Trade off,
  • Economic and social effects,
  • Artificial Intelligence (cs.AI),
  • Computation and Language (cs.CL)

IR Deposit conditions: not described

Preprint available on arXiv

Citation Information
X. Han, T. Baldwin, and T. Cohn, "Towards Equal Opportunity Fairness through Adversarial Learning", 2022, arXiv:2203.06317