Scandinavian Working Papers in Economics

Working Papers in Economics,
University of Gothenburg, Department of Economics

No 727: Confidence Set for Group Membership

Andreas Dzemski and Ryo Okui
Additional contact information
Andreas Dzemski: Department of Economics, School of Business, Economics and Law, Göteborg University, Postal: P.O. Box 640, SE 40530 GÖTEBORG, Sweden
Ryo Okui: Department of Economics, School of Business, Economics and Law, Göteborg University, Postal: P.O. Box 640, SE 40530 GÖTEBORG, Sweden

Abstract: We develop new procedures to quantify the statistical uncertainty from sorting units in panel data into groups using data-driven clustering algorithms. In our setting, each unit belongs to one of a finite number of latent groups and its regression curve is determined by which group it belongs to. Our main contribution is a new joint confidence set for group membership. Each element of the joint confidence set is a vector of possible group assignments for all units. The vector of true group memberships is contained in the confidence set with a pre-specified probability. The confidence set inverts a test for group membership. This test exploits a characterization of the true group memberships by a system of moment inequalities. Our procedure solves a high-dimensional one-sided testing problem and tests group membership simultaneously for all units. We also propose a procedure for identifying units for which group membership is obviously determined. These units can be ignored when computing critical values. We justify the joint confidence set under N, T → ∞ asymptotics where we allow T to be much smaller than N. Our arguments rely on the theory of self-normalized sums and high-dimensional central limit theorems. We contribute new theoretical results for testing problems with a large number of moment inequalities, including an anti-concentration inequality for the quasi-likelihood ratio (QLR) statistic. Monte Carlo results indicate that our confidence set has adequate coverage and is informative. We illustrate the practical relevance of our confidence set in two applications.

Keywords: Panel data; grouped heterogeneity; clustering; confidence set; machine learning; moment inequalities; joint one-sided tests; self-normalized sums; high-dimensional CLT; anti-concentration for QLR

JEL-codes: C23; C33; C38

85 pages, March 2018

Full text files

55922 HTML file Full text

Questions (including download problems) about the papers in this series should be directed to Marie Andersson.
Report other problems with accessing this service to Sune Karlsson.