Chain-referral sampling on stochastic block models
ESAIM: Probability and Statistics, Tome 24 (2020), pp. 718-738.

The discovery of the “hidden population”, whose size and membership are unknown, is made possible by assuming that its members are connected in a social network by their relationships. We explore these groups by a chain-referral sampling (CRS) method, where participants recommend the people they know. This leads to the study of a Markov chain on a random graph where vertices represent individuals and edges connecting any two nodes describe the relationships between corresponding people. We are interested in the study of CRS process on the stochastic block model (SBM), which extends the well-known Erdös-Rényi graphs to populations partitioned into communities. The SBM considered here is characterized by a number of vertices N, a number of communities (blocks) m, proportion of each community π = (π1, …, π$$) and a pattern for connection between blocks P = (λ$$N)$$. In this paper, we give a precise description of the dynamic of CRS process in discrete time on an SBM. The difficulty lies in handling the heterogeneity of the graph. We prove that when the population’s size is large, the normalized stochastic process of the referral chain behaves like a deterministic curve which is the unique solution of a system of ODEs.

DOI : 10.1051/ps/2020025
Classification : 05C80, 60J05, 60F17, 90B15, 92D30, 91D30
Mots-clés : Chain-referral sampling, random graph, social network, stochastic block model, exploration process, large graph limit, respondent driven sampling
@article{PS_2020__24_1_718_0,
     author = {Vo, Thi Phuong Thuy},
     title = {Chain-referral sampling on stochastic block models},
     journal = {ESAIM: Probability and Statistics},
     pages = {718--738},
     publisher = {EDP-Sciences},
     volume = {24},
     year = {2020},
     doi = {10.1051/ps/2020025},
     mrnumber = {4174418},
     zbl = {1455.60093},
     language = {en},
     url = {http://archive.numdam.org/articles/10.1051/ps/2020025/}
}
TY  - JOUR
AU  - Vo, Thi Phuong Thuy
TI  - Chain-referral sampling on stochastic block models
JO  - ESAIM: Probability and Statistics
PY  - 2020
SP  - 718
EP  - 738
VL  - 24
PB  - EDP-Sciences
UR  - http://archive.numdam.org/articles/10.1051/ps/2020025/
DO  - 10.1051/ps/2020025
LA  - en
ID  - PS_2020__24_1_718_0
ER  - 
%0 Journal Article
%A Vo, Thi Phuong Thuy
%T Chain-referral sampling on stochastic block models
%J ESAIM: Probability and Statistics
%D 2020
%P 718-738
%V 24
%I EDP-Sciences
%U http://archive.numdam.org/articles/10.1051/ps/2020025/
%R 10.1051/ps/2020025
%G en
%F PS_2020__24_1_718_0
Vo, Thi Phuong Thuy. Chain-referral sampling on stochastic block models. ESAIM: Probability and Statistics, Tome 24 (2020), pp. 718-738. doi : 10.1051/ps/2020025. http://archive.numdam.org/articles/10.1051/ps/2020025/

[1] E. Abbe, Community detection and stochastic block models. Found. Trends Commun. Inform. Theory 14 (2018) 1–162. | DOI | Zbl

[2] K.B. Athreya and P. Jagers (Eds.), Classical and Modern Branching Processes, Vol. 84. Springer Science & Business Media (2012). | MR

[3] S. Athreya and A. Röllin, Respondent driven sampling and sparse graph convergence. Electron. Commun. Probab. 23 (2018) 3. | DOI | MR | Zbl

[4] A. Bagheri and M. Saadati, Exploring the effectiveness of chain referral methods in sampling hidden populations. Indian J. Sci. Technol. 8 (2015) 1–8. | DOI

[5] A.D. Barbour, L. Holst and S. Janson, Poisson Approximation, Vol. 2 of Oxford Studies in Probability. The Clarendon Press, Oxford University Press, New York (1992). | MR | Zbl

[6] A. Barbour and G. Reinert, Approximating the epidemic curve. Electron. J. Probab. 18 (2013) 54. | DOI | MR | Zbl

[7] B. Bollobás and O. Riordan, Asymptotic normality of the size of the giant component via a random walk. J. Combinat. Theory Ser. B 102 (2012) 53–61. | DOI | MR | Zbl

[8] P. Billingsley, Convergence of Probability Measures. John Wiley & Sons, New York (1968). | MR | Zbl

[9] T. Britton and E. Pardoux, Stochastic Epidemic Models with Inference. Springer (2019). | DOI | MR

[10] A. Cousien, J.S. Dhersin, V.C. Tran and T.P.T. Vo, Respondent driven sampling on sparse Erdös-rényi graphs. Inpreparation (2019).

[11] R. Durrett, Random Graph Dynamics, Vol. 200, No. 7. Cambridge University Press, Cambridge (2007). | MR | Zbl

[12] N. Enriquez, G. Faraud and L. Ménard, Limiting shape of the depth first search tree in an Erdös-Rényi graph. Random Struct. Algorith. 56 (2020) 501–516. | DOI | MR | Zbl

[13] S.N. Ethier and T.G. Kurtz, Markov Processes Characterization and Convergence. John Wiley & Sons, New York (1986). | DOI | MR | Zbl

[14] A. Gadde, E.E. Gad, S. Avestimehr and A. Ortega, Active learning for community detection in stochastic block models. In 2016 IEEE International Symposium on Information Theory (ISIT) (2016) 1889–1893. | DOI

[15] M. Girvan and M.E.J. Newman, Community structure in social and biological networks. Proc. Natl. Acad. Sci. U.S.A. 99 (2002) 7821–7826. | DOI | MR | Zbl

[16] L.A. Goodman, Snowball sampling. Ann. Math. Statist. (1961) 148–170. | DOI | MR | Zbl

[17] D.D. Heckathorn, Respondent-driven sampling: a new approach to the study of hidden populations. Social Probl. 44 (1997) 74–99. | DOI

[18] D.D. Heckathorn, Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. Social Probl. 49 (2002) 11–34. | DOI

[19] P.W. Holland, K.B. Laskey and S. Leinhardt, Stochastic blockmodels: first steps. Social Netw. 5 (1983) 109–137. | DOI | MR

[20] A. Jakubowski, On the Skorokhod topology. Ann. Inst. Henri Poincaré 22 (1986) 263–285. | Numdam | MR | Zbl

[21] S. Janson, M. Luczak and P. Windridge, Law of large numbers for the SIR epidemic on a random graph with given degrees. Random Struct. Algorith. 45 (2014) 726–763. | DOI | MR | Zbl

[22] A. Joffe and M. Métivier, Weak convergence of sequences of semimartingales with applications to multitype branching processes. Adv. Appl. Probab 18 (1986) 20–65. | DOI | MR | Zbl

[23] P. Barbillon, S. Donnet, E. Lazega and A. Bar-Hen, Stochastic Block Models for Multiplex networks: an application to networks of researchers. Preprint (2015). | arXiv | MR

[24] M. Métivier, Semimartingales: A Course on Stochastic Processes, Vol. 2. Walter de Gruyter (2011). | MR | Zbl

[25] A. Shaghaghi, R.S. Bhopal and A. Sheikh, Approaches to recruiting ‘hard-to-reach’ populations into research: a review of the literature. Health Promotion Perspect. 1 (2011) 86.

[26] V.C. Tran, P. Moyal, L. Decreusefond and J.S. Dhersin, Limite en grand graphe d’un processus SIR décrivant la propagation d’une épidemie sur un réseau. Journées MAS et Journée en l’honneur de Jacques Neveu, August 2010, Talence, France (2010).

[27] R. Van Der Hofstad, Random Graphs and Complex Networks. Vol. 1. Cambridge University Press (2016). | DOI | MR

[28] T.P.T. Vo, Exploration of Random Graphs by the Respondent Driven Sampling method. Ph.D. thesis, University Paris 13 (2020).

Cité par Sources :

This work was done during the Ph.D. thesis of the author under the supervision of Jean-Stéphane Dhersin and Tran Viet Chi. The author was partially supported by the Chaire MMB (Modélisation Mathématique et Biodiversité of Veolia-Ecole Polytechnique-Museum National d’Histoire Naturelle-Fondation X) and by the ANR Econet (ANR-18-CE02-0010).