Skip to main content
Article
SP2: A Second Order Stochastic Polyak Method
arXiv
  • Shuang Li, UCLA, Los Angeles, United States
  • William J. Swartworth, UCLA, Los Angeles, United States
  • Martin Takac, Mohamed bin Zayed University of Artificial Intelligence
  • Deanna Needell, UCLA, Los Angeles, United State
  • Robert M. Gower, Center for Computational Mathematics, Flatiron Institute, Simons Foundation, New York, 10010, NY, United States
Document Type
Article
Abstract

Recently the SP (Stochastic Polyak step size) method has emerged as a competitive adaptive method for setting the step sizes of SGD. SP can be interpreted as a method specialized to interpolated models, since it solves the interpolation equations. SP solves these equation by using local linearizations of the model. We take a step further and develop a method for solving the interpolation equations that uses the local second-order approximation of the model. Our resulting method SP2 uses Hessian-vector products to speed-up the convergence of SP. Furthermore, and rather uniquely among second-order methods, the design of SP2 in no way relies on positive definite Hessian matrices or convexity of the objective function. We show SP2 is very competitive on matrix completion, non-convex test problems and logistic regression. We also provide a convergence theory on sums-of-quadratics. Copyright © 2022, The Authors. All rights reserved.

DOI
10.48550/arXiv.2207.08171
Publication Date
7-17-2022
Keywords
  • Machine learning,
  • Stochastic systems
Comments

IR Deposit conditions: non-described

Preprint available on arXiv

Citation Information
S. Li, W.J. Swartworth, M. Takac, D. Needell, and R.M. Gower, "SP2: A Second Order Stochastic Polyak Method", 2022, arXiv:2207.08171