Gaussian process modeling is one of the most widely used approaches for building a metamodel of an expensive numerical simulator. The code outputs often correspond to physical quantities whose behavior is known a priori: chemical concentrations lie between 0 and 1, the output increases with respect to some parameter, and so on. Several approaches have been proposed to take such information into account. In this paper, we introduce a new theoretical framework for incorporating constraints in Gaussian process modeling, covering bound, monotonicity and convexity constraints. We also extend this framework to any type of linear constraint. The new methodology relies mainly on conditional expectations of the truncated multinormal distribution. We propose several approximations based on correlation-free assumptions, numerical integration tools and sampling techniques. From a practical point of view, we illustrate how the accuracy of Gaussian process predictions can be improved by including such constraints. Finally, we compare the approximate predictors on examples with bound, monotonicity and convexity constraints.
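To make the role of truncated normal moments concrete, here is a minimal sketch of the simplest possible case, not the authors' predictor. It assumes that the unconstrained Gaussian process prediction at a single point is normal with a known mean and standard deviation, and that the only constraint is a pair of bounds at that same point. Under those assumptions the constrained prediction is the mean of a univariate truncated normal distribution, which has a closed form. The function names are hypothetical, and the multivariate case discussed in the abstract requires the approximations it mentions rather than this formula.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_mean(mean, std, lower, upper):
    """Mean of N(mean, std^2) conditioned on lower <= Y <= upper.

    Assumption for illustration: one prediction point, one bound
    constraint, known predictive mean and standard deviation.
    """
    alpha = (lower - mean) / std
    beta = (upper - mean) / std
    mass = normal_cdf(beta) - normal_cdf(alpha)
    if mass <= 0.0:
        raise ValueError("interval has no probability mass at this precision")
    return mean + std * (normal_pdf(alpha) - normal_pdf(beta)) / mass


# An unconstrained prediction of 1.05 for a concentration known to lie in [0, 1]
# is pulled back inside the admissible interval.
constrained = truncated_normal_mean(1.05, 0.1, 0.0, 1.0)
print(constrained)
```

When the unconstrained mean sits at the center of the interval, the correction term vanishes and the prediction is unchanged. The correction matters most when the unconstrained prediction lies near or beyond a bound.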
@article{AFST_2012_6_21_3_529_0,
  author = {Da Veiga, S\'ebastien and Marrel, Amandine},
  title = {Gaussian process modeling with inequality constraints},
  journal = {Annales de la Facult\'e des sciences de Toulouse : Math\'ematiques},
  pages = {529--555},
  publisher = {Universit\'e Paul Sabatier, Institut de math\'ematiques},
  address = {Toulouse},
  volume = {Ser. 6, 21},
  number = {3},
  year = {2012},
  doi = {10.5802/afst.1344},
  zbl = {1279.60047},
  mrnumber = {3076411},
  language = {en},
  url = {http://archive.numdam.org/articles/10.5802/afst.1344/}
}
Da Veiga, Sébastien; Marrel, Amandine. Gaussian process modeling with inequality constraints. Annales de la Faculté des sciences de Toulouse : Mathématiques, Série 6, Tome 21 (2012) no. 3, pp. 529-555. doi : 10.5802/afst.1344. http://archive.numdam.org/articles/10.5802/afst.1344/