[1] Robbins, H. and Monro, S. (1951) A Stochastic Approximation Method. The Annals of Mathematical Statistics, 22, 400-407.
[2] Bottou, L., Curtis, F.E. and Nocedal, J. (2018) Optimization Methods for Large-Scale Machine Learning. SIAM Review, 60, 223-311.
[3] Roux, N.L., Schmidt, M. and Bach, F.R. (2012) A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets. Neural Information Processing Systems, Lake Tahoe, 3 December 2012, 2663-2671.
[4] Schmidt, M.W., Roux, N.L. and Bach, F. (2017) Minimizing Finite Sums with the Stochastic Average Gradient. Mathematical Programming, 162, 83-112.
[5] Defazio, A., Bach, F. and Lacoste-Julien, S. (2014) SAGA: A Fast Incremental Gradient Method with Support for Non-Strongly Convex Composite Objectives. Neural Information Processing Systems, Montreal, 8 December 2014, 1646-1654.
[6] Johnson, R. and Zhang, T. (2013) Accelerating Stochastic Gradient Descent Using Predictive Variance Reduction. Neural Information Processing Systems, Lake Tahoe, 5 December 2013, 315-323.
[7] Konečný, J. and Richtárik, P. (2017) Semi-Stochastic Gradient Descent Methods. Frontiers in Applied Mathematics and Statistics, 3.
[8] Babanezhad, R., Ahmed, M.O., Virani, A., Schmidt, M.W., Konečný, J. and Sallinen, S. (2015) Stop Wasting My Gradients: Practical SVRG. Neural Information Processing Systems, Montreal, 7 December 2015, 2251-2259.
[9] Nguyen, L.M., Liu, J., Scheinberg, K. and Takáč, M. (2017) SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient. International Conference on Machine Learning, Sydney, 6 August 2017, 2613-2621.
[10] Tan, C., Ma, S., Dai, Y.H. and Qian, Y. (2016) Barzilai-Borwein Step Size for Stochastic Gradient Descent. Neural Information Processing Systems, Barcelona, 5 December 2016, 685-693.
[11] Liu, Y., Wang, X. and Guo, T. (2020) A Linearly Convergent Stochastic Recursive Gradient Method for Convex Optimization. Optimization Letters, 14, 2265-2283.
[12] Raydan, M. (1997) The Barzilai and Borwein Gradient Method for the Large Scale Unconstrained Minimization Problem. SIAM Journal on Optimization, 7, 26-33.
[13] Grippo, L., Lampariello, F. and Lucidi, S. (1986) A Nonmonotone Line Search Technique for Newton's Method. SIAM Journal on Numerical Analysis, 23, 707-716.
[14] Birgin, E.G., Martínez, J.M. and Raydan, M. (2000) Nonmonotone Spectral Projected Gradient Methods on Convex Sets. SIAM Journal on Optimization, 10, 1196-1211.
[15] Yuan, Y.X. (2006) A New Stepsize for the Steepest Descent Method. Journal of Computational Mathematics, 24, 149-156.
[16] Yuan, Y.X. (2008) Step-Sizes for the Gradient Method. AMS/IP Studies in Advanced Mathematics, 785-796.
[17] Dai, Y.H. and Yuan, Y.X. (2005) Analysis of Monotone Gradient Methods. Journal of Industrial and Management Optimization, 1, 181-192.
[18] Huang, Y., Dai, Y.H. and Liu, X.W. (2021) Equipping the Barzilai-Borwein Method with the Two-Dimensional Quadratic Termination Property. SIAM Journal on Optimization, 31, 3068-3096.