Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling
Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence
  • Zhouyuan Huo, Google, CA, USA
  • Bin Gu, Mohamed Bin Zayed University of Artificial Intelligence & JD Finance Amer Corp., CA, USA
  • Heng Huang, JD Finance Amer Corp., CA, USA & University of Pittsburgh
Document Type
Conference Proceeding
Abstract

Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications. Warmup is one of the nontrivial techniques used to stabilize the convergence of large-batch training. However, warmup is an empirical method, and it is still unknown whether there is a better algorithm with theoretical underpinnings. In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training. We prove the convergence of our algorithm by introducing a new fine-grained analysis of gradient-based methods. Furthermore, the new analysis also helps to understand two other empirical tricks, layer-wise adaptive rate scaling and linear learning rate scaling. We conduct extensive experiments and demonstrate that the proposed algorithm outperforms the gradual warmup technique by a large margin and surpasses the convergence of the state-of-the-art large-batch optimizer when training advanced deep neural networks (ResNet, DenseNet, MobileNet) on the ImageNet dataset.
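
For readers unfamiliar with the three empirical tricks the abstract names (linear learning rate scaling, gradual warmup, and layer-wise adaptive rate scaling), the following is a minimal plain-Python sketch of their commonly cited forms. It is not the CLARS algorithm, which this record does not specify. All function names, parameter names, and default values here are hypothetical and chosen only for illustration.

```python
import math


def linear_scaled_lr(base_lr, base_batch, batch):
    """Linear scaling heuristic: grow the learning rate in proportion to batch size."""
    return base_lr * batch / base_batch


def warmup_lr(target_lr, step, warmup_steps):
    """Gradual warmup: ramp linearly up to target_lr over the first warmup_steps steps."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return target_lr
    return target_lr * (step + 1) / warmup_steps


def layerwise_lr(global_lr, weights, grads, trust_coef=0.001, weight_decay=0.0):
    """Layer-wise adaptive rate (LARS-style sketch): scale the global rate by
    trust_coef * ||w|| / (||g|| + weight_decay * ||w||) for a single layer.
    trust_coef and weight_decay defaults are illustrative assumptions."""
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grads))
    if w_norm == 0.0 or g_norm == 0.0:
        # Fall back to the global rate when the ratio is undefined.
        return global_lr
    return global_lr * trust_coef * w_norm / (g_norm + weight_decay * w_norm)


if __name__ == "__main__":
    # Hypothetical numbers: base rate 0.1 at batch 256, scaled to batch 8192.
    target = linear_scaled_lr(0.1, 256, 8192)
    for step in range(5):
        print(step, warmup_lr(target, step, warmup_steps=5))
    print(layerwise_lr(1.0, [3.0, 4.0], [0.6, 0.8]))
```

In this sketch the warmup schedule and the layer-wise ratio are independent: the first changes the global rate over time, while the second rescales it per layer from the weight and gradient norms.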

Publication Date
1-1-2021
Keywords
  • deep neural networks
Comments

IR Deposit conditions: non-described

Open Access version available on AAAI:

Citation Information
Z. Huo, B. Gu, and H. Huang, "Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling", in 35th AAAI Conference on Artificial Intelligence / 33rd Conference on Innovative Applications of Artificial Intelligence / 11th Symposium on Educational Advances in Artificial Intelligence, California, USA, February 2–9, 2021, pp. 7883-7890, https://ojs.aaai.org/index.php/AAAI/article/view/16962/16769