This paper is devoted to tail index estimation in the context of survey data. Assuming that the population of interest is described by a heavy-tailed statistical model, we prove that the survey scheme plays a crucial role in the design of consistent inference methods for extremes. As simulation experiments reveal, ignoring the sampling plan generally induces a significant bias, jeopardizing the accuracy of the extreme value statistics thus computed. The focus here is on the celebrated Hill method for tail index estimation: we show how to modify it to take the survey design into account. More precisely, under specific conditions on the first- and second-order inclusion probabilities, we establish the consistency of the proposed variant of the Hill estimator. Additionally, its asymptotic normality is proved in a specific situation. The use of this limit result to build Gaussian confidence intervals is thoroughly discussed and illustrated by numerical results.
Keywords: Survey sampling, tail index estimation, Hill estimator, Poisson survey scheme, rejective sampling
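To make the idea in the abstract concrete, here is a minimal Python sketch contrasting the classical Hill estimator with a design-weighted variant. The weighted form below is an assumption chosen for illustration: it plugs inverse inclusion probabilities (Horvitz-Thompson-type weights) into the usual Hill ratio, and it is not necessarily the exact estimator studied in the paper. The function names `hill`, `weighted_hill`, and `demo`, the Pareto population, and the inclusion rule in `demo` are all hypothetical.

```python
import math
import random


def hill(values, k):
    """Classical Hill estimator of gamma = 1/alpha from the k largest values."""
    xs = sorted(values, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(values)")
    threshold = xs[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in xs[:k]) / k


def weighted_hill(values, incl_probs, k):
    """Illustrative design-weighted Hill estimator (assumed form, not the paper's).

    Each of the k largest sampled values is weighted by 1 / inclusion probability,
    and the weighted log-excesses are normalised by the sum of the weights.
    """
    pairs = sorted(zip(values, incl_probs), key=lambda p: p[0], reverse=True)
    if not 1 <= k < len(pairs):
        raise ValueError("k must satisfy 1 <= k < len(values)")
    threshold = pairs[k][0]
    num = sum(math.log(x / threshold) / p for x, p in pairs[:k])
    den = sum(1.0 / p for _, p in pairs[:k])
    return num / den


def demo(seed=0, k=100):
    """Poisson sampling from a Pareto population with size-dependent inclusion."""
    rng = random.Random(seed)
    # Hypothetical population: Pareto with tail index alpha = 2 (gamma = 0.5).
    population = [rng.paretovariate(2.0) for _ in range(20000)]
    # Hypothetical informative design: larger units are more likely to be drawn.
    probs = [min(1.0, 0.05 * math.sqrt(x)) for x in population]
    # Poisson scheme: each unit is kept independently with its own probability.
    sample = [(x, p) for x, p in zip(population, probs) if rng.random() < p]
    xs = [x for x, _ in sample]
    ps = [p for _, p in sample]
    return hill(xs, k), weighted_hill(xs, ps, k)


if __name__ == "__main__":
    naive, weighted = demo()
    print("unweighted Hill:", naive)
    print("weighted Hill:  ", weighted)
```

With equal inclusion probabilities the two functions coincide, so any difference between them comes from the sampling design alone. The choice of `k` is left to the user here; the sketch does not attempt any data-driven selection of the number of extremes.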
@article{PS_2015__19__28_0,
  author = {Bertail, Patrice and Chautru, Emilie and Cl\'emen\c{c}on, St\'ephan},
  title = {Tail index estimation based on survey data},
  journal = {ESAIM: Probability and Statistics},
  pages = {28--59},
  publisher = {EDP-Sciences},
  volume = {19},
  year = {2015},
  doi = {10.1051/ps/2014011},
  mrnumber = {3374868},
  zbl = {1351.62019},
  language = {en},
  url = {http://archive.numdam.org/articles/10.1051/ps/2014011/}
}
Bertail, Patrice; Chautru, Emilie; Clémençon, Stéphan. Tail index estimation based on survey data. ESAIM: Probability and Statistics, Volume 19 (2015), pp. 28-59. doi: 10.1051/ps/2014011. http://archive.numdam.org/articles/10.1051/ps/2014011/