David Gunawan, Khue-Dung Dang, Matias Quiroz, Robert Kohn and Minh-Ngoc Tran
Additional contact information
David Gunawan: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).
Khue-Dung Dang: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).
Matias Quiroz: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) and Research Division.
Robert Kohn: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS).
Minh-Ngoc Tran: ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS) and Discipline of Business Analytics, University
Abstract: We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from, such as the prior, and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel and this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate the usefulness of the methodology for estimating three generalized linear models and a generalized additive model with large datasets.
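The reweight, resample and move cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it uses the full-data likelihood rather than a subsampled estimator, and a random-walk Metropolis move in place of the Hamiltonian Monte Carlo and block pseudo-marginal updates. The toy model (Gaussian observations with unknown mean and unit variance, a N(0, 10^2) prior), the linear temperature schedule, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np

def log_prior(theta, prior_sd=10.0):
    # Assumed N(0, prior_sd^2) prior, up to an additive constant.
    return -0.5 * (theta / prior_sd) ** 2

def log_lik(theta, y):
    # Full-data Gaussian log-likelihood (unit variance), up to a constant.
    # theta has shape (M,), y has shape (n,); the result has shape (M,).
    return -0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

def annealed_smc(y, n_particles=500, n_temps=20, n_moves=5, step=0.2, seed=0):
    rng = np.random.default_rng(seed)
    temps = np.linspace(0.0, 1.0, n_temps + 1)        # assumed linear schedule
    theta = rng.normal(0.0, 10.0, size=n_particles)   # start from the prior
    ll = log_lik(theta, y)
    for a_prev, a in zip(temps[:-1], temps[1:]):
        # Reweight: incremental weight is likelihood^(a - a_prev).
        logw = (a - a_prev) * ll
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Resample: multinomial resampling according to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        theta, ll = theta[idx], ll[idx]
        # Move: random-walk Metropolis steps targeting prior * likelihood^a.
        for _ in range(n_moves):
            prop = theta + step * rng.normal(size=n_particles)
            ll_prop = log_lik(prop, y)
            log_acc = (log_prior(prop) + a * ll_prop) - (log_prior(theta) + a * ll)
            accept = np.log(rng.uniform(size=n_particles)) < log_acc
            theta = np.where(accept, prop, theta)
            ll = np.where(accept, ll_prop, ll)
    return theta

# Usage on simulated data: the particle mean should sit near the sample mean.
y = np.random.default_rng(1).normal(2.0, 1.0, size=200)
particles = annealed_smc(y)
print(particles.mean(), y.mean())
```

In this sketch every move step evaluates the likelihood over all observations for every particle, which is the cost that the data subsampling described in the abstract is meant to reduce.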
Keywords: Hamiltonian Monte Carlo; Large datasets; Likelihood annealing
39 pages, April 1, 2019
Full text files
wp371.pdf Full text
RePEc:hhs:rbnkwp:0371