An $\ell _1$-oracle inequality for the Lasso in finite mixture gaussian regression models

Meynet, Caroline

doi:10.1051/ps/2012016

ESAIM: Probability and Statistics, Volume 17 (2013), pp. 650-671.

Abstract

We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ₁-penalized maximum likelihood estimator. We provide an ℓ₁-oracle inequality satisfied by this Lasso estimator with respect to the Kullback-Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the ℓ₁-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ₁-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality is deduced from a finite mixture Gaussian regression model selection theorem for ℓ₁-penalized maximum likelihood conditional density estimation, which is inspired by Vapnik's method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart in [11].
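As a rough illustration of the kind of criterion an ℓ₁-penalized maximum likelihood estimator minimizes, the sketch below evaluates the average negative conditional log-likelihood of a finite mixture of Gaussian regressions plus an ℓ₁ penalty. The parametrization is an assumption made here for illustration (linear component means, one variance per component, penalty applied to the regression coefficients only); the function name `penalized_neg_log_likelihood` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def penalized_neg_log_likelihood(xs, ys, weights, betas, sigmas, lam):
    """Average negative conditional log-likelihood plus lam * l1-norm.

    Assumed (hypothetical) parametrization:
      xs      list of covariate vectors, one per observation
      ys      list of scalar responses
      weights mixture proportions, strictly positive and summing to 1
      betas   one regression coefficient vector per mixture component
      sigmas  one standard deviation per mixture component
      lam     non-negative regularization parameter
    """
    n = len(ys)
    total_log_lik = 0.0
    for x, y in zip(xs, ys):
        # Log of each weighted Gaussian component density evaluated at (x, y).
        log_terms = []
        for pi_k, beta_k, sigma_k in zip(weights, betas, sigmas):
            mean = sum(b * xj for b, xj in zip(beta_k, x))
            log_density = (-0.5 * math.log(2 * math.pi * sigma_k ** 2)
                           - (y - mean) ** 2 / (2 * sigma_k ** 2))
            log_terms.append(math.log(pi_k) + log_density)
        # Log-sum-exp over components for numerical stability.
        m = max(log_terms)
        total_log_lik += m + math.log(sum(math.exp(t - m) for t in log_terms))
    # l1-norm of all regression coefficients across components.
    l1_norm = sum(abs(b) for beta_k in betas for b in beta_k)
    return -total_log_lik / n + lam * l1_norm
```

Minimizing this quantity over the parameters for a fixed `lam` would give a Lasso-type estimator in this assumed setting; the sketch only evaluates the criterion and does not perform the optimization.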



Classification: 62G08, 62H30
Keywords: finite mixture of Gaussian regressions model, Lasso, ℓ₁-oracle inequalities, model selection by penalization, ℓ₁-balls

@article{PS_2013__17__650_0,
     author = {Meynet, Caroline},
     title = {An $\ell _1$-oracle inequality for the {Lasso} in finite mixture gaussian regression models},
     journal = {ESAIM: Probability and Statistics},
     pages = {650--671},
     publisher = {EDP-Sciences},
     volume = {17},
     year = {2013},
     doi = {10.1051/ps/2012016},
     language = {en},
     url = {http://archive.numdam.org/articles/10.1051/ps/2012016/}
}

TY  - JOUR
AU  - Meynet, Caroline
TI  - An $\ell _1$-oracle inequality for the Lasso in finite mixture gaussian regression models
JO  - ESAIM: Probability and Statistics
PY  - 2013
SP  - 650
EP  - 671
VL  - 17
PB  - EDP-Sciences
UR  - http://archive.numdam.org/articles/10.1051/ps/2012016/
DO  - 10.1051/ps/2012016
LA  - en
ID  - PS_2013__17__650_0
ER  -

%0 Journal Article
%A Meynet, Caroline
%T An $\ell _1$-oracle inequality for the Lasso in finite mixture gaussian regression models
%J ESAIM: Probability and Statistics
%D 2013
%P 650-671
%V 17
%I EDP-Sciences
%U http://archive.numdam.org/articles/10.1051/ps/2012016/
%R 10.1051/ps/2012016
%G en
%F PS_2013__17__650_0

Meynet, Caroline. An $\ell _1$-oracle inequality for the Lasso in finite mixture gaussian regression models. ESAIM: Probability and Statistics, Volume 17 (2013), pp. 650-671. doi: 10.1051/ps/2012016. http://archive.numdam.org/articles/10.1051/ps/2012016/

References

[1] P.L. Bartlett, S. Mendelson and J. Neeman, ℓ1-regularized linear regression: persistence and oracle inequalities. Probability Theory and Related Fields, Springer (2011).

[2] J.P. Baudry, Sélection de Modèle pour la Classification Non Supervisée. Choix du Nombre de Classes. Ph.D. thesis, Université Paris-Sud 11, France (2009).

[3] P.J. Bickel, Y. Ritov and A.B. Tsybakov, Simultaneous analysis of Lasso and Dantzig selector. Ann. Stat. 37 (2009) 1705-1732.

[4] S. Boucheron, G. Lugosi and P. Massart, Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press (2013).

[5] P. Bühlmann and S. Van De Geer, On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3 (2009) 1360-1392.

[6] E. Candes and T. Tao, The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35 (2007) 2313-2351.

[7] S. Cohen and E. Le Pennec, Conditional Density Estimation by Penalized Likelihood Model Selection and Applications, RR-7596. INRIA (2011).

[8] B. Efron, T. Hastie, I. Johnstone and R. Tibshirani, Least Angle Regression. Ann. Stat. 32 (2004) 407-499.

[9] M. Hebiri, Quelques questions de sélection de variables autour de l'estimateur Lasso. Ph.D. Thesis, Université Paris Diderot, Paris 7, France (2009).

[10] C. Huang, G.H.L. Cheang and A.R. Barron, Risk of penalized least squares, greedy selection and ℓ1-penalization for flexible function libraries. Submitted to the Annals of Statistics (2008).

[11] P. Massart, Concentration inequalities and model selection. Ecole d'été de Probabilités de Saint-Flour 2003. Lect. Notes Math. Springer, Berlin-Heidelberg (2007).

[12] P. Massart and C. Meynet, The Lasso as an ℓ1-ball model selection procedure. Electron. J. Stat. 5 (2011) 669-687.

[13] C. Maugis and B. Michel, A non asymptotic penalized criterion for Gaussian mixture model selection. ESAIM: PS 15 (2011) 41-68.

[14] G. McLachlan and D. Peel, Finite Mixture Models. Wiley, New York (2000).

[15] N. Meinshausen and B. Yu, Lasso type recovery of sparse representations for high dimensional data. Ann. Stat. 37 (2009) 246-270.

[16] R.A. Redner and H.F. Walker, Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev. 26 (1984) 195-239.

[17] P. Rigollet and A. Tsybakov, Exponential screening and optimal rates of sparse estimation. Ann. Stat. 39 (2011) 731-771.

[18] N. Städler, P. Bühlmann and S. Van De Geer, ℓ1-penalization for mixture regression models. Test 19 (2010) 209-256.

[19] R. Tibshirani, Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. Ser. B 58 (1996) 267-288.

[20] M.R. Osborne, B. Presnell and B.A. Turlach, On the Lasso and its dual. J. Comput. Graph. Stat. 9 (2000) 319-337.

[21] M.R. Osborne, B. Presnell and B.A. Turlach, A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 (2000) 389-404.

[22] A. Van Der Vaart and J. Wellner, Weak Convergence and Empirical Processes. Springer, Berlin (1996).

[23] V.N. Vapnik, Estimation of Dependencies Based on Empirical Data. Springer, New York (1982).

[24] V.N. Vapnik, Statistical Learning Theory. J. Wiley, New York (1990).

[25] P. Zhao and B. Yu, On model selection consistency of Lasso. J. Mach. Learn. Res. 7 (2006) 2541-2563.
