Khue-Dung Dang, Matias Quiroz, Robert Kohn, Minh-Ngoc Tran and Mattias Villani
Additional contact information
Khue-Dung Dang: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
Matias Quiroz: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Research Division, Sveriges Riksbank, Postal: Sveriges Riksbank, SE-103 37 Stockholm, Sweden
Robert Kohn: School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
Minh-Ngoc Tran: ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Discipline of Business Analytics, University of Sydney
Mattias Villani: ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Division of Statistics and Machine Learning, Linköping University, Department of Statistics, Stockholm University.
Abstract: Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions, with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. These iterations make HMC computationally costly, especially for problems with large datasets, because each one requires computing the posterior density and its derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this can be done in a principled way in an HMC-within-Gibbs framework, where the subsample is updated using a pseudo-marginal Metropolis-Hastings (MH) step and the parameters are then updated using an HMC step based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that use gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance-reducing control variates used in the estimators of the posterior density and its gradients.
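The two-block scheme described in the abstract can be sketched in code. Everything below is an illustrative assumption rather than the authors' implementation: the toy one-parameter logistic-regression model, the flat prior, the quadratic Taylor control variates expanded at a fixed point `theta_star`, and all tuning constants (subsample size `m`, step size `eps`, number of leapfrog steps `L`). The point of the sketch is the structure: the subsample indices `u` are refreshed with a pseudo-marginal MH step, and the HMC step for the parameter then uses that same subsample for both the leapfrog dynamics and the accept/reject decision.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
N, m = 2000, 100                     # full data size, subsample size
theta_true = 1.0
x = rng.normal(0.0, 1.0, size=N)     # toy logistic regression, one slope
y = rng.binomial(1, sigmoid(theta_true * x))

def ll_i(theta, idx):                # per-observation log-likelihood
    z = theta * x[idx]
    return y[idx] * z - np.log1p(np.exp(z))

def grad_ll_i(theta, idx):           # per-observation score in theta
    return x[idx] * (y[idx] - sigmoid(theta * x[idx]))

# Quadratic (Taylor) control variates expanded at a fixed reference point.
theta_star = 1.0                     # e.g. a preliminary estimate
all_idx = np.arange(N)
s_star = sigmoid(theta_star * x)
h_i = -(x**2) * s_star * (1 - s_star)        # per-obs second derivative
A = ll_i(theta_star, all_idx).sum()          # full-data sums, computed once
B = grad_ll_i(theta_star, all_idx).sum()
C = h_i.sum()

def q_i(theta, idx):                 # quadratic surrogate for ll_i
    d = theta - theta_star
    return ll_i(theta_star, idx) + grad_ll_i(theta_star, idx) * d + 0.5 * h_i[idx] * d**2

def ll_hat(theta, u):                # subsample estimate of the log-likelihood
    d = theta - theta_star
    resid = ll_i(theta, u) - q_i(theta, u)
    return A + B * d + 0.5 * C * d**2 + (N / m) * resid.sum()

def grad_ll_hat(theta, u):           # matching subsample gradient estimate
    d = theta - theta_star
    resid = grad_ll_i(theta, u) - (grad_ll_i(theta_star, u) + h_i[u] * d)
    return B + C * d + (N / m) * resid.sum()

def hmc_within_gibbs(iters, eps=0.02, L=15):
    """Flat prior, so ll_hat plays the role of the (negative) potential."""
    theta = theta_star
    u = rng.choice(N, size=m, replace=True)
    draws = np.empty(iters)
    for t in range(iters):
        # Block 1: pseudo-marginal MH update of the subsample indices u.
        u_prop = rng.choice(N, size=m, replace=True)
        if np.log(rng.uniform()) < ll_hat(theta, u_prop) - ll_hat(theta, u):
            u = u_prop
        # Block 2: HMC update of theta, dynamics AND acceptance on the same u.
        p0 = rng.normal()
        th, p = theta, p0
        p += 0.5 * eps * grad_ll_hat(th, u)
        for _ in range(L - 1):
            th += eps * p
            p += eps * grad_ll_hat(th, u)
        th += eps * p
        p += 0.5 * eps * grad_ll_hat(th, u)
        dH = (ll_hat(th, u) - 0.5 * p**2) - (ll_hat(theta, u) - 0.5 * p0**2)
        if np.log(rng.uniform()) < dH:
            theta = th
        draws[t] = theta
    return draws

draws = hmc_within_gibbs(1500)
```

Note that at the expansion point the residuals vanish, so `ll_hat(theta_star, u)` equals the full-data log-likelihood exactly for any subsample; the closer the chain stays to `theta_star`, the smaller the estimator's variance, which is the control-variate-quality effect the abstract refers to.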
Keywords: Large datasets; Bayesian inference; Stochastic gradient
47 pages, April 1, 2019
Full text: wp372.pdf
RePEc:hhs:rbnkwp:0372