The Construction of Generalized Dirichlet Process Distributions via Polya urn and Gibbs Sampling

Document Type: Original Article

Authors
Department of Statistics, Faculty of Mathematics & Statistics, University of Isfahan
Abstract
Bayesian nonparametric inference is increasingly in demand in statistical modeling because it incorporates flexible prior processes into complex data analysis. This paper presents the Polya urn scheme for the generalized Dirichlet process (GDP). It uses partition analysis to construct the joint distribution of a random sample from the GDP as a mixture prior distribution with a countable number of components. Using permutation theory, we present the components' weights in a computationally accessible form, which makes the resulting joint prior equation usable in practice. The advantages of our findings include tractable algebraic operations that lead to closed-form equations. We recommend the Polya urn Gibbs sampler algorithm, derive the full conditional posterior distributions, and, as an illustration, implement the algorithm to fit some popular statistical models in nonparametric Bayesian settings.

Volume 21, Issue 2
December 2022
Pages 111-132

  • Receive Date 15 February 2023
  • Revise Date 19 August 2023
  • Accept Date 15 September 2023