Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and Consistency

doi:10.5802/smai-jcm.62

Dunlop, Matthew M.¹; Helin, Tapio²; Stuart, Andrew M.³

¹ Courant Institute of Mathematical Sciences, New York University, New York, New York, 10012, USA
² School of Engineering Science, Lappeenranta–Lahti University of Technology, Lappeenranta, 53850, Finland
³ Computing & Mathematical Sciences, California Institute of Technology, Pasadena, California, 91125, USA

The SMAI Journal of computational mathematics, Volume 6 (2020), pp. 69-100.

Abstract

The Bayesian formulation of inverse problems is attractive for three primary reasons: it provides a clear modelling framework; it allows for principled learning of hyperparameters; and it can provide uncertainty quantification. The posterior distribution may in principle be sampled by means of MCMC or SMC methods, but for many problems it is computationally infeasible to do so. In this situation maximum a posteriori (MAP) estimators are often sought. Whilst these are relatively cheap to compute and have an attractive variational formulation, a key drawback is their lack of invariance under change of parameterization. It is nevertheless important to study MAP estimators, because they provide a link with classical optimization approaches to inverse problems, and the Bayesian link may be used to improve upon those approaches. The lack of invariance of MAP estimators under change of parameterization is a particularly significant issue when hierarchical priors are employed to learn hyperparameters. In this paper we study the effect of the choice of parameterization on MAP estimators when a conditionally Gaussian hierarchical prior distribution is employed. Specifically, we consider two parameterizations: the centred parameterization, the natural parameterization in which the unknown state is solved for directly; and the noncentred parameterization, which works with a whitened Gaussian as the unknown state variable and arises naturally when considering dimension-robust MCMC algorithms. MAP estimation is well-defined in the nonparametric setting only for the noncentred parameterization. However, we show that MAP estimates based on the noncentred parameterization are not consistent as estimators of hyperparameters; conversely, we show that limits of finite-dimensional centred MAP estimators are consistent as the dimension tends to infinity. We also consider empirical Bayesian hyperparameter estimation, show consistency of these estimates, and demonstrate that they are more robust with respect to noise than centred MAP estimates. An underpinning concept throughout is that hyperparameters may only be recovered up to measure equivalence, a well-known phenomenon in the context of the Ornstein–Uhlenbeck process. The applicability of the results is demonstrated concretely with the study of hierarchical Whittle–Matérn and ARD priors.
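
The difference between the centred and noncentred parameterizations can be illustrated in finite dimensions. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a linear forward map `A`, Gaussian observational noise with standard deviation `gamma`, a diagonal conditionally Gaussian prior with covariance `C(theta) = theta * diag(lam)`, and a flat hyperprior on `theta`. All function and variable names are hypothetical.

```python
import numpy as np

def centred_objective(u, theta, y, A, lam, gamma):
    """Centred MAP objective: negative log posterior density in (u, theta),
    up to an additive constant, assuming a flat hyperprior on theta."""
    c = theta * lam                                  # diagonal of C(theta) (assumed form)
    misfit = np.sum((y - A @ u) ** 2) / (2.0 * gamma ** 2)
    quadratic = 0.5 * np.sum(u ** 2 / c)             # 0.5 * u^T C(theta)^{-1} u
    log_det = 0.5 * np.sum(np.log(c))                # 0.5 * log det C(theta)
    return misfit + quadratic + log_det

def noncentred_objective(xi, theta, y, A, lam, gamma):
    """Noncentred MAP objective: the unknown is the whitened variable xi,
    with u = C(theta)^{1/2} xi and xi ~ N(0, I) a priori."""
    c = theta * lam
    u = np.sqrt(c) * xi                              # map whitened variable to the state
    misfit = np.sum((y - A @ u) ** 2) / (2.0 * gamma ** 2)
    return misfit + 0.5 * np.sum(xi ** 2)            # no log-determinant term appears

# Synthetic test problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 50
lam = 1.0 / np.arange(1, n + 1) ** 2                 # decaying prior variances
A = np.eye(n)                                        # identity forward map
gamma = 0.1
theta = 2.0
xi = rng.standard_normal(n)
u = np.sqrt(theta * lam) * xi                        # same state in both coordinates
y = A @ u + gamma * rng.standard_normal(n)

j_c = centred_objective(u, theta, y, A, lam, gamma)
j_nc = noncentred_objective(xi, theta, y, A, lam, gamma)
gap = j_c - j_nc
print(gap, 0.5 * np.sum(np.log(theta * lam)))
```

In this sketch the two objectives, evaluated at corresponding points `u = C(theta)^{1/2} xi`, differ exactly by the term `0.5 * log det C(theta)`. Because that term depends on `theta`, minimizing the two objectives will in general give different hyperparameter estimates, which is one way to see the lack of invariance of MAP estimators under reparameterization.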

Published: 2020-04-24

Classification: 62G05, 62C10, 62G20, 45Q05
Keywords: Bayesian inverse problems, hierarchical Bayesian, MAP estimation, optimization, nonparametric inference, hyperparameter inference, consistency of estimators.

@article{SMAI-JCM_2020__6__69_0,
     author = {Dunlop, Matthew M. and Helin, Tapio and Stuart, Andrew M.},
     title = {Hyperparameter {Estimation} in {Bayesian} {MAP} {Estimation:} {Parameterizations} and {Consistency}},
     journal = {The SMAI Journal of computational mathematics},
     pages = {69--100},
     publisher = {Soci\'et\'e de Math\'ematiques Appliqu\'ees et Industrielles},
     volume = {6},
     year = {2020},
     doi = {10.5802/smai-jcm.62},
     mrnumber = {4100532},
     zbl = {07207994},
     language = {en},
     url = {http://archive.numdam.org/articles/10.5802/smai-jcm.62/}
}

TY  - JOUR
AU  - Dunlop, Matthew M.
AU  - Helin, Tapio
AU  - Stuart, Andrew M.
TI  - Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and Consistency
JO  - The SMAI Journal of computational mathematics
PY  - 2020
SP  - 69
EP  - 100
VL  - 6
PB  - Société de Mathématiques Appliquées et Industrielles
UR  - http://archive.numdam.org/articles/10.5802/smai-jcm.62/
DO  - 10.5802/smai-jcm.62
LA  - en
ID  - SMAI-JCM_2020__6__69_0
ER  -

%0 Journal Article
%A Dunlop, Matthew M.
%A Helin, Tapio
%A Stuart, Andrew M.
%T Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and Consistency
%J The SMAI Journal of computational mathematics
%D 2020
%P 69-100
%V 6
%I Société de Mathématiques Appliquées et Industrielles
%U http://archive.numdam.org/articles/10.5802/smai-jcm.62/
%R 10.5802/smai-jcm.62
%G en
%F SMAI-JCM_2020__6__69_0

Dunlop, Matthew M.; Helin, Tapio; Stuart, Andrew M. Hyperparameter Estimation in Bayesian MAP Estimation: Parameterizations and Consistency. The SMAI Journal of computational mathematics, Volume 6 (2020), pp. 69-100. doi: 10.5802/smai-jcm.62. http://archive.numdam.org/articles/10.5802/smai-jcm.62/
