Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models
Annales de l'I.H.P. Probabilités et statistiques, Volume 44 (2008) no. 6, p. 1096-1127

In testing that a given distribution $P$ belongs to a parameterized family $\mathcal{P}$, one is often led to compare a nonparametric estimate $A_n$ of some functional $A$ of $P$ with an element $A_{\theta_n}$ corresponding to an estimate $\theta_n$ of $\theta$. In many cases, the asymptotic distribution of goodness-of-fit statistics derived from the process $n^{1/2}(A_n - A_{\theta_n})$ depends on the unknown distribution $P$. It is shown here that if the sequences $A_n$ and $\theta_n$ of estimators are regular in some sense, a parametric bootstrap approach yields valid approximations for the P-values of the tests. In other words, if $A_n^*$ and $\theta_n^*$ are analogs of $A_n$ and $\theta_n$ computed from a sample from $P_{\theta_n}$, the empirical processes $n^{1/2}(A_n - A_{\theta_n})$ and $n^{1/2}(A_n^* - A_{\theta_n^*})$ then converge jointly in distribution to independent copies of the same limit. This result is used to establish the validity of the parametric bootstrap method when testing the goodness-of-fit of families of multivariate distributions and copulas. Two types of tests are considered: some procedures compare the empirical version of a distribution function or copula with its parametric estimate under the null hypothesis; others measure the distance between parametric and nonparametric estimates of the distribution associated with the classical probability integral transform. The validity of a two-level bootstrap is also proved in cases where the parametric estimate cannot be computed easily. The methodology is illustrated using a new goodness-of-fit test statistic for copulas based on a Cramér-von Mises functional of the empirical copula process.
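
As a rough illustration of the procedure described in the abstract, the following Python sketch approximates a P-value by parametric bootstrap for a bivariate copula model. It is not the authors' implementation. The choices made here are assumptions for the sake of the example: the Clayton family as the null model, estimation of $\theta$ by inversion of Kendall's tau, a Cramér-von Mises statistic of the form $\sum_i \{C_n(U_i) - C_{\theta_n}(U_i)\}^2$ evaluated at the pseudo-observations, and all function names (`pseudo_obs`, `fit_clayton`, `cvm_stat`, `bootstrap_pvalue`, and so on).

```python
import numpy as np

def pseudo_obs(x):
    """Rescaled ranks R_ij / (n + 1), column by column (ties not handled)."""
    n = x.shape[0]
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (n + 1.0)

def kendall_tau(u):
    """Sample Kendall's tau of a bivariate sample, computed in O(n^2)."""
    d0 = np.sign(u[:, None, 0] - u[None, :, 0])
    d1 = np.sign(u[:, None, 1] - u[None, :, 1])
    n = u.shape[0]
    return (d0 * d1).sum() / (n * (n - 1))

def fit_clayton(u):
    """Estimate theta by inverting tau = theta / (theta + 2) (assumed estimator)."""
    tau = min(max(kendall_tau(u), 1e-3), 0.99)  # clamp so that theta stays positive
    return 2.0 * tau / (1.0 - tau)

def clayton_cdf(u, theta):
    """Clayton copula C_theta(u1, u2) for theta > 0."""
    return (u[:, 0] ** -theta + u[:, 1] ** -theta - 1.0) ** (-1.0 / theta)

def sample_clayton(n, theta, rng):
    """Draw n pairs from the Clayton copula by conditional inversion."""
    v = rng.uniform(size=n)
    w = rng.uniform(size=n)
    u2 = (v ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return np.column_stack([v, u2])

def cvm_stat(u, theta):
    """Sum of squared gaps between the empirical and the fitted copula."""
    # below[i, j] is True when observation j lies below observation i in both coordinates
    below = (u[None, :, 0] <= u[:, None, 0]) & (u[None, :, 1] <= u[:, None, 1])
    c_n = below.mean(axis=1)  # empirical copula at each pseudo-observation
    return float(((c_n - clayton_cdf(u, theta)) ** 2).sum())

def bootstrap_pvalue(x, n_boot=200, seed=0):
    """Return the observed statistic and its parametric bootstrap P-value."""
    rng = np.random.default_rng(seed)
    u = pseudo_obs(x)
    theta = fit_clayton(u)
    s_obs = cvm_stat(u, theta)
    s_star = np.empty(n_boot)
    for k in range(n_boot):
        # Resample from the fitted model, then redo ranking and estimation
        u_star = pseudo_obs(sample_clayton(len(x), theta, rng))
        s_star[k] = cvm_stat(u_star, fit_clayton(u_star))
    return s_obs, float((s_star > s_obs).mean())

if __name__ == "__main__":
    data = sample_clayton(100, 2.0, np.random.default_rng(42))
    stat, pval = bootstrap_pvalue(data, n_boot=100)
    print(f"S_n = {stat:.4f}, approximate P-value = {pval:.3f}")
```

The key point mirrored from the abstract is that each bootstrap replicate recomputes both the nonparametric estimate (here the empirical copula) and the parametric estimate $\theta_n^*$ from a sample drawn under the fitted model, so that the replicated statistics imitate the null distribution of the observed one.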

DOI: https://doi.org/10.1214/07-AIHP148
Classification: 62F05, 62F40, 62H15
Keywords: copula, goodness-of-fit test, Monte Carlo simulation, parametric bootstrap, P-values, semiparametric estimation
@article{AIHPB_2008__44_6_1096_0,
author = {Genest, Christian and R\'emillard, Bruno},
title = {Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models},
journal = {Annales de l'I.H.P. Probabilit\'es et statistiques},
publisher = {Gauthier-Villars},
volume = {44},
number = {6},
year = {2008},
pages = {1096-1127},
doi = {10.1214/07-AIHP148},
zbl = {1206.62044},
mrnumber = {2469337},
language = {en},
url = {http://www.numdam.org/item/AIHPB_2008__44_6_1096_0}
}

Genest, Christian; Rémillard, Bruno. Validity of the parametric bootstrap for goodness-of-fit testing in semiparametric models. Annales de l'I.H.P. Probabilités et statistiques, Volume 44 (2008) no. 6, pp. 1096-1127. doi: 10.1214/07-AIHP148. http://www.numdam.org/item/AIHPB_2008__44_6_1096_0/
