The logarithmic Zipf law in a general urn problem
ESAIM: Probability and Statistics, Tome 24 (2020), pp. 275-293.

The origin of power-law behavior (also known variously as Zipf’s law) has been a topic of debate in the scientific community for more than a century. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. In a highly cited article, Mark Newman [Contemp. Phys. 46 (2005) 323–351] reviewed some of the empirical evidence for the existence of power-law forms, however underscored that even though many distributions do not follow a power law, quite often many of the quantities that scientists measure are close to a Zipf law, and hence are of importance. In this paper we engage a variant of Zipf’s law with a general urn problem. A collector wishes to collect m complete sets of N distinct coupons. The draws from the population are considered to be independent and identically distributed with replacement, and the probability that a type-j coupon is drawn is denoted by p$$, j = 1, 2, …, N. Let T$$(N) the number of trials needed for this problem. We present the asymptotics for the expectation (five terms plus an error), the second rising moment (six terms plus an error), and the variance of T$$(N) (leading term) as N, when

p j =a j j=2 N+1 a j ,wherea j =lnj -p ,p>0.

Moreover, we prove that T$$(N) (appropriately normalized) converges in distribution to a Gumbel random variable. These “log-Zipf” classes of coupon probabilities are not covered by the existing literature and the present paper comes to fill this gap. In the spirit of a recent paper of ours [ESAIM: PS 20 (2016) 367–399] we enlarge the classes for which the Dixie cup problem is solved w.r.t. its moments, variance, distribution.

DOI : 10.1051/ps/2020011
Classification : 60F05, 60F99, 60G70
Mots-clés : Generalized Zipf law, Urn problems, coupon collector’s problem, double Dixie cup problem, Gumbel distribution, Laplace method for integrals – determination of higher order terms, Eulerian logarithmic integral
@article{PS_2020__24_1_275_0,
     author = {Doumas, Aristides V. and Papanicolaou, Vassilis G.},
     title = {The logarithmic {Zipf} law in a general urn problem},
     journal = {ESAIM: Probability and Statistics},
     pages = {275--293},
     publisher = {EDP-Sciences},
     volume = {24},
     year = {2020},
     doi = {10.1051/ps/2020011},
     mrnumber = {4090339},
     zbl = {1434.60069},
     language = {en},
     url = {http://archive.numdam.org/articles/10.1051/ps/2020011/}
}
TY  - JOUR
AU  - Doumas, Aristides V.
AU  - Papanicolaou, Vassilis G.
TI  - The logarithmic Zipf law in a general urn problem
JO  - ESAIM: Probability and Statistics
PY  - 2020
SP  - 275
EP  - 293
VL  - 24
PB  - EDP-Sciences
UR  - http://archive.numdam.org/articles/10.1051/ps/2020011/
DO  - 10.1051/ps/2020011
LA  - en
ID  - PS_2020__24_1_275_0
ER  - 
%0 Journal Article
%A Doumas, Aristides V.
%A Papanicolaou, Vassilis G.
%T The logarithmic Zipf law in a general urn problem
%J ESAIM: Probability and Statistics
%D 2020
%P 275-293
%V 24
%I EDP-Sciences
%U http://archive.numdam.org/articles/10.1051/ps/2020011/
%R 10.1051/ps/2020011
%G en
%F PS_2020__24_1_275_0
Doumas, Aristides V.; Papanicolaou, Vassilis G. The logarithmic Zipf law in a general urn problem. ESAIM: Probability and Statistics, Tome 24 (2020), pp. 275-293. doi : 10.1051/ps/2020011. http://archive.numdam.org/articles/10.1051/ps/2020011/

[1] C.M. Bender and S.A. Orszag, Advanced Mathematical Methods for Scientists and Engineers I: Asymptotic Methods and Perturbation Theory. Springer-Verlag, New York (1999). | DOI | MR | Zbl

[2] S. Boneh and V.G. Papanicolaou, General asymptotic estimates for the coupon collector problem. J. Comput. Appl. Math. 67 (1996) 277–289. | DOI | MR | Zbl

[3] R.K. Brayton, On the asymptotic behavior of the number of trials necessary to complete a set with random selection. Math. Anal. Appl. 7 (1963) 31–61. | DOI | MR | Zbl

[4] N.G. De Bruijn, Asymptotic Methods in Analysis, second edition. North Holland, Amsterdam (1961). | MR | Zbl

[5] P. Diaconis and S. Holmes, A Bayesian peek into Feller volume I, in Vol. 64, No. 3, In Memory of D. Basu, Part 2 (2002) 820–841. | MR | Zbl

[6] A.V. Doumas and V.G. Papanicolaou, The Coupon Collector’s problem revisited: asymptotics of the variance. Adv. Appl. Prob. 44 (2012) 166–195. | DOI | MR | Zbl

[7] A.V. Doumas and V.G. Papanicolaou, Asymptotics of the rising moments for the Coupon Collector’s Problem. Electron. J. Probab. 18 (2013) 41. | DOI | MR | Zbl

[8] A.V. Doumas and V.G. Papanicolaou, The Coupon Collector’s problem revisited: generalizing the double dixie cup problem of Newman and Shepp. ESAIM: PS 20 (2016) 367–399. | DOI | Numdam | MR | Zbl

[9] A.V. Doumas and V.G. Papanicolaou, Uniform versus Zipf distribution in a mixing collection process. Stat. Probab. Lett. 155 (2019) 108559. | DOI | MR | Zbl

[10] A.V. Doumas and V.G. Papanicolaou, Sampling from a mixture of different groups of coupons. Preprint (2018). | arXiv | MR

[11] R. Durrett, Probability: Theory and Examples, Third Edition, Duxbury Advanced Series, Brooks/Cole—Thomson Learning. Belmont, CA, USA (2005). | Zbl

[12] P. Erdős and A. Rényi, On a classical problem of probability theory. Magyar. Tud. Akad. Mat. Kutató Int. Közl. 6 (1961) 215–220. | MR | Zbl

[13] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I & II. John Wiley & Sons, Inc., New York (1966). | MR | Zbl

[14] P. Flajolet and R. Sedgewick, Analytic Combinatorics. Cambridge University Press, 1st Ed. (2009). | DOI | MR | Zbl

[15] P. Flajolet, D. Gardy and L. Thimonier, Birthday paradox, coupon collectors, caching algorithms and self-organizing search, Discr. Appl. Math. 39 (1992) 207–229. | DOI | MR | Zbl

[16] L. Holst, On Birthday, Collectors’, Occupancy and other classical Urn problems. Int. Stat. Rev. 54 (1986) 15–27. | DOI | MR | Zbl

[17] K.L. Locey and J.T. Lennon, Scaling laws predict global microbial diversity. Proc. Natl. Acad. Sci. USA 113 (2016) 5970–5975. | DOI

[18] H.M. Mahmoud, Pólya urn models. CRC Press, New York (2008). | DOI | MR | Zbl

[19] A.W. Marshall, I. Olkin and B. Arnold, Inequalities: Theory of Majorization and Its Applications. Springer, 2nd ed. (2009). | MR | Zbl

[20] P. Neal, The Generalised Coupon Collector Problem. J. Appl. Prob. 45 (2008) 621–629. | DOI | MR | Zbl

[21] M.E.J. Newman, Power laws, Pareto distributions and Zipf’s law. Contemp. Phys. 46 (2005) 323–351. | DOI

[22] D.J. Newman and L. Shepp, The double Dixie cup problem. Am. Math. Monthly 67 (1960) 58–61. | DOI | MR | Zbl

Cité par Sources :