Article
Causal Balancing for Domain Generalization
arXiv
  • Xinyi Wang, Department of Computer Science, University of California, Santa Barbara, United States
  • Michael Saxon, Department of Computer Science, University of California, Santa Barbara, United States
  • Jiachen Li, Department of Computer Science, University of California, Santa Barbara, United States
  • Hongyang Zhang, David R. Cheriton School of Computer Science, University of Waterloo, Canada
  • Kun Zhang, Department of Philosophy, Carnegie Mellon University, United States & Mohamed bin Zayed University of Artificial Intelligence
  • William Yang Wang, Department of Computer Science, University of California, Santa Barbara, United States
Document Type
Article
Abstract

While machine learning models rapidly advance the state of the art on various real-world tasks, out-of-domain (OOD) generalization remains a challenging problem given the vulnerability of these models to spurious correlations. We propose a balanced mini-batch sampling strategy to transform a biased data distribution into a spurious-free balanced distribution, based on the invariance of the underlying causal mechanisms for the data generation process. We argue that the Bayes optimal classifiers trained on such a balanced distribution are minimax optimal across a diverse enough environment space. We also provide an identifiability guarantee of the latent variable model of the proposed data generation process, when utilizing enough training environments. Experiments are conducted on DomainBed, demonstrating empirically that our method obtains the best performance across 20 baselines reported on the benchmark. Copyright © 2022, The Authors. All rights reserved.
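
As a rough illustration of what a balanced mini-batch could look like, the following minimal Python sketch draws the same number of examples from every (label, attribute) group, so that within a batch the label carries no information about the attribute. It is not the authors' algorithm: the abstract describes a latent variable model, whereas this sketch assumes the spurious attribute is directly observed. The function name `balanced_batch`, the tuple layout, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def balanced_batch(examples, batch_size, rng):
    """Draw a mini-batch with equal counts per (label, attribute) group.

    examples: list of (features, label, attribute) tuples. `attribute` is
    a hypothetical, observed stand-in for the spurious factor (assumption).
    Balance only holds if every label/attribute combination is present.
    """
    groups = defaultdict(list)
    for ex in examples:
        groups[(ex[1], ex[2])].append(ex)

    per_group = batch_size // len(groups)
    batch = []
    for key in sorted(groups):
        # Sample with replacement so rare groups can still fill their quota.
        batch.extend(rng.choices(groups[key], k=per_group))
    rng.shuffle(batch)
    return batch


# Toy biased data: label 1 co-occurs with attribute "a" 90% of the time,
# and label 0 co-occurs with attribute "b" 90% of the time.
data = (
    [(i, 1, "a") for i in range(90)]
    + [(i, 1, "b") for i in range(10)]
    + [(i, 0, "b") for i in range(90)]
    + [(i, 0, "a") for i in range(10)]
)
batch = balanced_batch(data, 32, random.Random(0))
```

In the toy data the attribute predicts the label with 90% accuracy, while in the sampled batch each of the four groups contributes the same number of examples, so a classifier cannot gain from the attribute alone.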

DOI
10.48550/arXiv.2206.05263
Publication Date
6-10-2022
Keywords
  • Balancing
  • Benchmarking
  • Machine learning
Comments

IR Deposit conditions: non-described

Citation Information
X. Wang, M. Saxon, J. Li, H. Zhang, K. Zhang, and W.Y. Wang, "Causal Balancing for Domain Generalization", 2022, doi:10.48550/arXiv.2206.05263