Optimal assignment of sellers in a store with a random number of clients via the Armed Bandit model

Vázquez-Guevara, Víctor Hugo; Cruz-Suárez, Hugo; Velasco-Luna, Fernando

doi:10.1051/ro/2017015

Vázquez-Guevara, Víctor Hugo ¹ ; Cruz-Suárez, Hugo ¹ ; Velasco-Luna, Fernando ¹

¹ Facultad de Ciencias Físico Matemáticas, Benemérita Universidad Autónoma de Puebla, San Claudio y 18 sur. San Manuel, 72570, Puebla, Mexico.

RAIRO - Operations Research - Recherche Opérationnelle, Volume 51 (2017) no. 4, pp. 1119-1132.

Abstract

The technique of Dynamic Programming for Armed Bandits is employed to solve the problem of maximizing the randomly depreciated gains of a store with an unknown (finite, random) number of clients and a fixed (finite) number of sellers whose skills are also random and are represented as probability distributions that are themselves random. Hence, the Armed Bandit framework is considered with a horizon that is a random variable with finite support, a setting that, as far as the authors know, has not yet been discussed. In addition, numerical examples are detailed to illustrate the versatility and practical implementation of the approach in two general contexts, determined by the number of available products. With only one product, the problem coincides with that of maximizing the number of sales. With more than one product, the amount of sales is not necessarily governed by a Bernoulli distribution.
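To make the one-product setting concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each seller's unknown sale probability has a Beta prior, that the number of clients N is independent of the sales, and that there is no depreciation beyond the random horizon itself. Under those assumptions the expected number of sales equals the sum over t of P(N >= t) times the expected sale to client t, so the random horizon acts as a deterministic weight sequence in the backward recursion. The names `survival_weights` and `solve_bandit` are hypothetical.

```python
from functools import lru_cache


def survival_weights(horizon_pmf):
    """Return [P(N >= 1), ..., P(N >= M)] for a pmf {n: P(N = n)} on 1..M."""
    m = max(horizon_pmf)
    tail, weights = 0.0, []
    for n in range(m, 0, -1):  # accumulate the upper tail from M down to 1
        tail += horizon_pmf.get(n, 0.0)
        weights.append(tail)
    return weights[::-1]


def solve_bandit(priors, horizon_pmf):
    """Backward induction for a Bernoulli bandit with a random, finite horizon.

    priors      -- one (alpha, beta) Beta prior per seller (an assumption)
    horizon_pmf -- {n: P(N = n)}, the law of the number of clients
    Returns (expected number of sales, index of the seller to assign first).
    """
    weights = survival_weights(horizon_pmf)
    m = len(weights)

    @lru_cache(maxsize=None)
    def value(t, state):
        if t == m:  # no client can arrive beyond the support of N
            return 0.0, None
        best_value, best_seller = float("-inf"), None
        for k, (a, b) in enumerate(state):
            p = a / (a + b)  # posterior mean sale probability of seller k
            after_sale = state[:k] + ((a + 1, b),) + state[k + 1:]
            after_miss = state[:k] + ((a, b + 1),) + state[k + 1:]
            v = (p * (weights[t] + value(t + 1, after_sale)[0])
                 + (1 - p) * value(t + 1, after_miss)[0])
            if v > best_value:
                best_value, best_seller = v, k
        return best_value, best_seller

    return value(0, tuple(tuple(ab) for ab in priors))


if __name__ == "__main__":
    # Two sellers, between one and three clients (illustrative numbers only).
    print(solve_bandit([(1, 1), (2, 1)], {1: 0.2, 2: 0.5, 3: 0.3}))
```

The state is the tuple of posterior parameters, so memoization keeps the recursion tractable for small supports. The multi-product case mentioned in the abstract would need a non-Bernoulli reward model and is not covered by this sketch.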

Received: 2015-11-12
Accepted: 2017-03-02

Classification: 49L20, 90C40, 93E20
Keywords: Armed bandit model, dynamic programming, assignment of personnel, random horizon, Markov decision processes

@article{RO_2017__51_4_1119_0,
     author = {V\'azquez-Guevara, V{\'\i}ctor Hugo and Cruz-Su\'arez, Hugo and Velasco-Luna, Fernando},
     title = {Optimal assignment of sellers in a store with a random number of clients via the {Armed} {Bandit} model},
     journal = {RAIRO - Operations Research - Recherche Op\'erationnelle},
     pages = {1119--1132},
     publisher = {EDP-Sciences},
     volume = {51},
     number = {4},
     year = {2017},
     doi = {10.1051/ro/2017015},
     mrnumber = {3783937},
     zbl = {1396.49020},
     language = {en},
     url = {http://archive.numdam.org/articles/10.1051/ro/2017015/}
}

TY  - JOUR
AU  - Vázquez-Guevara, Víctor Hugo
AU  - Cruz-Suárez, Hugo
AU  - Velasco-Luna, Fernando
TI  - Optimal assignment of sellers in a store with a random number of clients via the Armed Bandit model
JO  - RAIRO - Operations Research - Recherche Opérationnelle
PY  - 2017
SP  - 1119
EP  - 1132
VL  - 51
IS  - 4
PB  - EDP-Sciences
UR  - http://archive.numdam.org/articles/10.1051/ro/2017015/
DO  - 10.1051/ro/2017015
LA  - en
ID  - RO_2017__51_4_1119_0
ER  -

%0 Journal Article
%A Vázquez-Guevara, Víctor Hugo
%A Cruz-Suárez, Hugo
%A Velasco-Luna, Fernando
%T Optimal assignment of sellers in a store with a random number of clients via the Armed Bandit model
%J RAIRO - Operations Research - Recherche Opérationnelle
%D 2017
%P 1119-1132
%V 51
%N 4
%I EDP-Sciences
%U http://archive.numdam.org/articles/10.1051/ro/2017015/
%R 10.1051/ro/2017015
%G en
%F RO_2017__51_4_1119_0

Vázquez-Guevara, Víctor Hugo; Cruz-Suárez, Hugo; Velasco-Luna, Fernando. Optimal assignment of sellers in a store with a random number of clients via the Armed Bandit model. RAIRO - Operations Research - Recherche Opérationnelle, Volume 51 (2017) no. 4, pp. 1119-1132. doi: 10.1051/ro/2017015. http://archive.numdam.org/articles/10.1051/ro/2017015/

References

R. Bellman, On the Theory of Dynamic Programming. Proc. of the National Academy of Sciences (1952).

D.A. Berry, Bandit Problems with random discounting, in Mathematical Learning Models: Theory and Algorithms. Springer Verlag (1983).

D.A. Berry and B. Fristedt, Bandit Problems. Chapman and Hall (1985).

H. Cruz-Suárez, R. Ilhuicatzi-Roldán and R. Montes-De-Oca, Markov Decision Processes on Borel Spaces with Total Cost and Random Horizon. J. Optimiz. Theory Appl. 162 (2014) 329–346.

D. Levhari and L.J. Mirman, Savings and consumption with uncertain horizon. J. Political Econ. 85 (1977) 265–281.

K.R. Parthasarathy, Probability Measures on Metric Spaces. Academic Press (1967).

J. Wakefield, Bayesian and Frequentist Regression Methods. Springer Verlag (2013).