L<sup>p</sup> and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective

Godichon-Baggioni, Antoine

doi:10.1051/ps/2019011

L^p and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective

ESAIM: Probability and Statistics, Tome 23 (2019), pp. 841-873.

Suite au passage du modèle économique de la revue en S20, le texte intégral des articles des années 2019 et 2020 est accessible uniquement sur le site de l'éditeur et est réservé aux abonnés.

Résumé

An usual problem in statistics consists in estimating the minimizer of a convex function. When we have to deal with large samples taking values in high dimensional spaces, stochastic gradient algorithms and their averaged versions are efficient candidates. Indeed, (1) they do not need too much computational efforts, (2) they do not need to store all the data, which is crucial when we deal with big data, (3) they allow to simply update the estimates, which is important when data arrive sequentially. The aim of this work is to give asymptotic and non asymptotic rates of convergence of stochastic gradient estimates as well as of their averaged versions when the function we would like to minimize is only locally strongly convex.

Reçu le : 2018-09-07
Accepté le : 2019-06-14

MR Zbl

DOI : 10.1051/ps/2019011

Classification : 62L12, 62G35, 62L20
Mots-clés : Stochastic optimization, Stochastic gradient algorithm, averaging, Robust statistics

Affiliations des auteurs :

Godichon-Baggioni, Antoine ¹

@article{PS_2019__23__841_0,
     author = {Godichon-Baggioni, Antoine},
     title = {L\protect\textsuperscript{p} and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective},
     journal = {ESAIM: Probability and Statistics},
     pages = {841--873},
     publisher = {EDP-Sciences},
     volume = {23},
     year = {2019},
     doi = {10.1051/ps/2019011},
     mrnumber = {4045544},
     zbl = {1507.62276},
     language = {en},
     url = {http://archive.numdam.org/articles/10.1051/ps/2019011/}
}

TY  - JOUR
AU  - Godichon-Baggioni, Antoine
TI  - Lp and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective
JO  - ESAIM: Probability and Statistics
PY  - 2019
SP  - 841
EP  - 873
VL  - 23
PB  - EDP-Sciences
UR  - http://archive.numdam.org/articles/10.1051/ps/2019011/
DO  - 10.1051/ps/2019011
LA  - en
ID  - PS_2019__23__841_0
ER  -

%0 Journal Article
%A Godichon-Baggioni, Antoine
%T Lp and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective
%J ESAIM: Probability and Statistics
%D 2019
%P 841-873
%V 23
%I EDP-Sciences
%U http://archive.numdam.org/articles/10.1051/ps/2019011/
%R 10.1051/ps/2019011
%G en
%F PS_2019__23__841_0

Godichon-Baggioni, Antoine. L^p and almost sure rates of convergence of averaged stochastic gradient algorithms: locally strongly convex objective. ESAIM: Probability and Statistics, Tome 23 (2019), pp. 841-873. doi : 10.1051/ps/2019011. http://archive.numdam.org/articles/10.1051/ps/2019011/

Bibliographie
Cité par

[1] F. Bach, Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression. J. Mach. Learn. Res. 15 (2014) 595–627. | MR | Zbl

[2] Borwein and P.B. Borwein, Pi and the AGM. Wiley, New York (1987). | MR | Zbl

[3] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press (2004). | DOI | MR | Zbl

[4] H. Cardot and A. Godichon-Baggioni, Fast estimation of the median covariation matrix with application to online robust principal components analysis. TEST 26 (2017) 461–480. | DOI | MR | Zbl

[5] H. Cardot, P. Cénac and P.-A. Zitt, Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm. Bernoulli 19 (2013) 18–43. | DOI | MR | Zbl

[6] H. Cardot, P. Cénac, A. Godichon-Baggioni et al., Online estimation of the geometric median in hilbert spaces: non asymptotic confidence balls. Ann. Stat. 45 (2017) 591–614. | DOI | MR | Zbl

[7] P. Chaudhuri, Multivariate location estimation using extension of R-estimates through U-statistics type approach. Ann. Statist. 20 (1992) 897–916. | DOI | MR | Zbl

[8] P. Chaudhuri, On a geometric notion of quantiles for multivariate data. J. Am. Stat. Assoc. 91 (1996) 862–872. | DOI | MR | Zbl

[9] Y. Chen, X. Dang, H. Peng and H.L. Bart, Outlier detection with the kernelized spatial depth function. IEEE Trans. Pattern Anal. Mach. Intel. 31 (2009) 288–305. | DOI

[10] Cohen, A. Nedic and R. Srikant, On projected stochastic gradient descent algorithm with weighted averaging for least squares regression. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016) 2314–2318. | DOI

[11] M. Duflo, Algorithmes stochastiques. Springer, Berlin (1996). | MR | Zbl

[12] Duflo, Random iterative models, Vol. 34 of Applications of Mathematics (New York). Translated fromthe 1990 French original by Stephen S. Wilson and revised by the author. Springer-Verlag, Berlin (1997). | MR | Zbl

[13] S. Ghadimi and G. Lan, Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization I: a generic algorithmic framework. SIAM J. Optim. 22 (2012) 1469–1492. | DOI | MR | Zbl

[14] A. Godichon-Baggioni, Estimating the geometric median in hilbert spaces with stochastic gradient algorithms: Lp and almost sure rates of convergence. J. Multivar. Anal. 146 (2016) 209–222. | DOI | MR | Zbl

[15] A. Goh, C. Lenglet, P.M. Thompson and R. Vidal, A nonparametric Riemannian framework for processing high angular resolution diffusion images (HARDI). IEEE Conference on Computer Vision and Pattern Recognition (2009) 2496–2503.

[16] J.B.S. Haldane, Note on the median of a multivariate distribution. Biometrika 35 (1948) 414–417. | DOI | Zbl

[17] M. Hallin and D. Paindaveine, Semiparametrically efficient rank-based inference for shape. I. Optimal rank-based tests for sphericity. Ann. Stat. 34 (2006) 2707–2756. | MR | Zbl

[18] P. Huber and E. Ronchetti, Robust Statistics, 2nd edn. John Wiley and Sons (2009). | DOI | MR | Zbl

[19] J. Kemperman, The median of a finite measure on a Banach space. In Statistical data analysis based on the L₁ -norm and related methods (Neuchâtel, 1987). North-Holland, Amsterdam (1987) 217–230. | MR

[20] H.J. Kushner and G.G. Yin, Stochastic approximation and recursive algorithms and applications, Stochastic Modelling and Applied Probability. 2nd edn. Vol. 35 of Applications of Mathematics (New York). Springer-Verlag, New York (2003). | MR | Zbl

[21] R.A. Maronna, R.D. Martin and V.J. Yohai, Robust statistics. Theory and methods. Wiley Series in Probability and Statistics. John Wiley& Sons, Ltd., Chichester (2006). | MR | Zbl

[22] E. Moulines and F.R. Bach, Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In Advances in Neural Information Processing Systems (2011) 451–459.

[23] M. Pelletier, On the almost sure asymptotic behaviour of stochastic algorithms. Stoch. Process. Appl. 78 (1998) 217–244. | DOI | MR | Zbl

[24] M. Pelletier, Asymptotic almost sure efficiency of averaged stochastic algorithms. SIAM J. Control Optim. 39 (2000) 49–72. | DOI | MR | Zbl

[25] G.H. Polya, G. Harold and Littlewood, Inequalities. University Press (1952). | Zbl

[26] B. Polyak and A. Juditsky, Acceleration of stochastic approximation. SIAM J. Cont. Optim. 30 (1992) 838–855. | DOI | MR | Zbl

[27] H. Robbins and S. Monro, A stochastic approximation method. Ann. Math. Stat. (1951) 400–407. | DOI | MR | Zbl

[28] M. Rudelson, Recent developments in non-asymptotic theory of random matrices. Modern Aspects of Random Matrix Theory. In Vol. 72 of Proceedings of Symposia in Applied Mathematics (2014) 83–120. | DOI | MR | Zbl

[29] D. Ruppert, Efficient estimations from a slowly convergent robbins-monro process. Technical report, Cornell University Operations Research and Industrial Engineering (1988).

[30] R. Serfling, Depth functions in nonparametric multivariate inference. In Vol. 72 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science 72 (2006). | DOI | MR

[31] P. Turaga, A. Veeraraghavan and R. Chellappa, Statistical analysis on Stiefel and Grassmann manifolds with applications in computer vision. IEEE Conference on Computer Vision and Pattern Recognition (2008).

Cité par Sources :