AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods
arXiv
  • Zheng Shi, Industrial and Systems Engineering, Lehigh University, Bethlehem, United States & IBM Corporation, Armonk, United States
  • Abdurakhmon Sadiev, Mohamed bin Zayed University of Artificial Intelligence & Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation
  • Nicolas Loizou, Johns Hopkins University, Baltimore, MD, United States
  • Peter Richtárik, Computer Science, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  • Martin Takac, Mohamed bin Zayed University of Artificial Intelligence
Document Type
Article
Abstract

We present AI-SARAH, a practical variant of SARAH. As a variant of SARAH, this algorithm employs the stochastic recursive gradient yet adjusts step-size based on local geometry. AI-SARAH implicitly computes step-size and efficiently estimates local Lipschitz smoothness of stochastic functions. It is fully adaptive, tune-free, straightforward to implement, and computationally efficient. We provide technical insight and intuitive illustrations on its design and convergence. We conduct extensive empirical analysis and demonstrate its strong performance compared with its classical counterparts and other state-of-the-art first-order methods in solving convex machine learning problems.

Copyright © 2021, The Authors. All rights reserved.
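
For readers unfamiliar with the stochastic recursive gradient mentioned in the abstract, the following is a minimal sketch of the classical SARAH update with a fixed step size, applied to a small synthetic least-squares problem. It does not reproduce AI-SARAH's adaptive, implicit step-size rule, which the abstract does not specify. The function names, the synthetic data, the step size `eta`, and the loop lengths are all illustrative assumptions, and the sketch restarts each outer loop from the last inner iterate as a simplification.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def grad_i(w, a, b):
    # Gradient of f_i(w) = 0.5 * (a.w - b)^2
    r = dot(a, w) - b
    return [r * x for x in a]

def full_grad(w, A, B):
    # Gradient of the average loss P(w) = (1/n) * sum_i f_i(w)
    n = len(A)
    g = [0.0] * len(w)
    for a, b in zip(A, B):
        gi = grad_i(w, a, b)
        g = [gj + gij / n for gj, gij in zip(g, gi)]
    return g

def loss(w, A, B):
    return sum(0.5 * (dot(a, w) - b) ** 2 for a, b in zip(A, B)) / len(A)

def sarah(A, B, w0, eta=0.1, outer=20, inner=20, seed=0):
    # Classical SARAH with a fixed step size eta (assumed value, not adaptive).
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(outer):
        # Outer loop: start from a full gradient, v_0 = grad P(w_0).
        v = full_grad(w, A, B)
        w_prev, w = w, [wj - eta * vj for wj, vj in zip(w, v)]
        for _ in range(inner):
            i = rng.randrange(len(A))
            # Recursive estimator: v_t = grad f_i(w_t) - grad f_i(w_{t-1}) + v_{t-1}
            g_new = grad_i(w, A[i], B[i])
            g_old = grad_i(w_prev, A[i], B[i])
            v = [gn - go + vj for gn, go, vj in zip(g_new, g_old, v)]
            w_prev, w = w, [wj - eta * vj for wj, vj in zip(w, v)]
    return w

# Synthetic, noise-free data (hypothetical, for illustration only).
data_rng = random.Random(1)
A = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
w_star = [1.0, -2.0, 0.5]
B = [dot(a, w_star) for a in A]
w0 = [0.0, 0.0, 0.0]

w_final = sarah(A, B, w0)
print("initial loss:", loss(w0, A, B))
print("final loss:", loss(w_final, A, B))
```

In this sketch `eta` must be chosen by hand, which is the tuning burden that, according to the abstract, AI-SARAH removes by adjusting the step size from local geometry.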

DOI
10.48550/arXiv.2102.09700
Publication Date
2-1-2021
Keywords
  • Gradient methods
  • Machine learning
  • Classical counterpart
  • Computationally efficient
  • Empirical analysis
  • Lipschitz
  • Local geometry
  • Performance
  • Step size
  • Stochastic functions
  • Stochastics
  • Stochastic systems
  • Machine Learning (cs.LG)
  • Optimization and Control (math.OC)
Comments

IR Deposit conditions: non-described

Preprint available on arXiv

Citation Information
Z. Shi, A. Sadiev, N. Loizou, P. Richtárik, and M. Takac, "AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods", 2021, arXiv:2102.09700