The aim of this paper is to build an estimate of an unknown density as a linear combination of functions from a dictionary. Inspired by Candès and Tao's approach, we propose minimizing the ℓ1-norm of the coefficients in the linear combination under an adaptive Dantzig constraint derived from sharp concentration inequalities. This allows us to consider a wide class of dictionaries. Under local or global structure assumptions, oracle inequalities are derived. These theoretical results are transposed to the adaptive Lasso estimate naturally associated with our Dantzig procedure. Then, the issue of calibrating these procedures is studied from both theoretical and practical points of view. Finally, a numerical study shows the significant improvement obtained by our procedures when compared with other classical procedures.
Keywords: calibration, concentration inequalities, Dantzig estimate, density estimation, dictionary, Lasso estimate, oracle inequalities, sparsity
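As an illustration of the ℓ1 minimization under a Dantzig constraint described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the constraint takes the generic form |(Gλ)_m − β̂_m| ≤ η_m, where G is the Gram matrix of the dictionary and β̂_m is the empirical mean of the m-th dictionary function over the sample. The function name `dantzig_density_coeffs`, the histogram dictionary, and the constant threshold `eta` are hypothetical choices made for the example; in particular, `eta` is only a placeholder and does not reproduce the paper's adaptive, data-driven threshold.

```python
import numpy as np
from scipy.optimize import linprog


def dantzig_density_coeffs(samples, dictionary, gram, eta):
    """Minimize ||lam||_1 subject to |(gram @ lam)_m - beta_hat_m| <= eta_m."""
    # beta_hat_m: empirical mean of the m-th dictionary function over the sample.
    beta_hat = np.array([np.mean(phi(samples)) for phi in dictionary])
    M = len(dictionary)
    # Split lam = u - v with u, v >= 0; at the optimum sum(u) + sum(v) = ||lam||_1.
    cost = np.ones(2 * M)
    A = np.hstack([gram, -gram])
    # Two-sided constraint written as two stacked one-sided inequalities.
    A_ub = np.vstack([A, -A])
    b_ub = np.concatenate([beta_hat + eta, eta - beta_hat])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:M] - res.x[M:]


# Hypothetical example: orthonormal histogram dictionary on [0, 1).
rng = np.random.default_rng(0)
n, M = 500, 16
samples = rng.beta(2.0, 5.0, size=n)
dictionary = [
    (lambda x, k=k: np.sqrt(M) * ((x >= k / M) & (x < (k + 1) / M)))
    for k in range(M)
]
gram = np.eye(M)  # the histogram functions are orthonormal in L2([0, 1))
eta = np.full(M, np.sqrt(2.0 * np.log(M) / n))  # placeholder threshold (assumption)
lam = dantzig_density_coeffs(samples, dictionary, gram, eta)
```

With an orthonormal dictionary the Gram matrix is the identity, so the linear program reduces to soft thresholding of the empirical coefficients; this gives a simple sanity check for the solver. For a redundant dictionary, `gram` would instead hold the inner products between dictionary functions.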
@article{AIHPB_2011__47_1_43_0,
  author = {Bertin, K. and Le Pennec, E. and Rivoirard, V.},
  title = {Adaptive {Dantzig} density estimation},
  journal = {Annales de l'I.H.P. Probabilit\'es et statistiques},
  pages = {43--74},
  publisher = {Gauthier-Villars},
  volume = {47},
  number = {1},
  year = {2011},
  doi = {10.1214/09-AIHP351},
  mrnumber = {2779396},
  zbl = {1207.62077},
  language = {en},
  url = {http://archive.numdam.org/articles/10.1214/09-AIHP351/}
}
Bertin, K.; Le Pennec, E.; Rivoirard, V. Adaptive Dantzig density estimation. Annales de l'I.H.P. Probabilités et statistiques, Volume 47 (2011) no. 1, pp. 43-74. doi: 10.1214/09-AIHP351. http://archive.numdam.org/articles/10.1214/09-AIHP351/