Special issue: Statistics for spatial and spatio-temporal data and the RESSTE network
Analyzing spatio-temporal data with R: Everything you always wanted to know – but were afraid to ask
[Données spatio-temporelles avec R : tout ce que vous avez toujours voulu savoir sans jamais avoir osé le demander]
Journal de la société française de statistique, Tome 158 (2017) no. 3, pp. 124-158.

Nous présentons un aperçu des modèles, méthodes et techniques (géo-)statistiques pour l’analyse et la prévision de processus spatio-temporels continus. De nombreuses approches sont possibles pour la construction de modèles statistiques pour ces processus, l’estimation de leurs paramètres et leur prédiction. Nous avons choisi de présenter l’approche par processus gaussien, la plus communément utilisée en statistiques spatiales et en géostatistiques, ainsi que son implémentation avec le logiciel R. La variable cible est la moyenne de la concentration quotidienne PM10 à l’échelle de la France, prédite à l’aide d’un modèle de transport en chimie de l’atmosphère et de séries d’observations obtenues à des stations de surveillance de la qualité de l’air. En suivant le fil d’une application réelle de grande dimension, nous comparons certains des paquets R les plus utilisés. Le code R permettant la visualisation des données, l’estimation des paramètres de la fonction de covariance spatio-temporelle ainsi que la sélection d’un modèle et la prédiction de la concentration de PM10 est également présenté afin d’illustrer l’enchaînement des étapes. Nous concluons avec une comparaison entre les paquets qui sont disponibles aujourd’hui ainsi que les pistes de développement qui nous paraissent intéressantes.

We present an overview of (geo-)statistical models, methods and techniques for the analysis and prediction of continuous spatio-temporal processes defined over continuous space. Various approaches exist for building statistical models for such processes, estimating their parameters and performing predictions. We cover the Gaussian process approach, which is very common in spatial statistics and geostatistics, and we focus on R-based implementations of the numerical procedures. To illustrate and compare the use of some of the most relevant packages, we work through a real-world application with high-dimensional data. The target variable is the daily mean PM10 concentration, predicted using a chemistry-transport model and observation series collected at monitoring stations across France in 2014. We give R code covering the full workflow, from importing data sets to predicting PM10 concentrations with a fitted parametric model, including data visualization, estimation of the parameters of the spatio-temporal covariance function, and model selection. We conclude with a comparison of the packages available today and a discussion of possible future developments.

Keywords: Space-time, Covariance function, Geostatistics, Kriging, Air pollution
Mots-clés : Fonction de covariance, Géostatistique, Krigeage, Pollution atmosphérique
@article{JSFS_2017__158_3_124_0,
     author = {RESSTE Network et al.},
     title = {Analyzing spatio-temporal data with {R:}  {Everything} you always wanted to know {\textendash} but were afraid to ask},
     journal = {Journal de la soci\'et\'e fran\c{c}aise de statistique},
     pages = {124--158},
     publisher = {Soci\'et\'e fran\c{c}aise de statistique},
     volume = {158},
     number = {3},
     year = {2017},
     mrnumber = {3720133},
     zbl = {1378.62139},
     language = {en},
     url = {http://archive.numdam.org/item/JSFS_2017__158_3_124_0/}
}
RESSTE Network et al. Analyzing spatio-temporal data with R:  Everything you always wanted to know – but were afraid to ask. Journal de la société française de statistique, Tome 158 (2017) no. 3, pp. 124-158. http://archive.numdam.org/item/JSFS_2017__158_3_124_0/
