Auke Rijpma, Jeanne Cilliers and Johan Fourie
Additional contact information
Auke Rijpma: Utrecht University
Jeanne Cilliers: Department of Economic History, Lund University, Postal: Department of Economic History, Lund University, Box 7083, S-220 07 Lund, Sweden
Johan Fourie: Stellenbosch University
Abstract: In this paper we describe the record linkage procedure to create a panel from Cape Colony census returns, or opgaafrolle, for 1787-1828, a dataset of 42,354 household-level observations. Based on a subset of manually linked records, we first evaluate statistical models and deterministic algorithms to best identify and match households over time. By using household-level characteristics in the linking process and near-annual data, we are able to create high-quality links for 84 percent of the dataset. We compare basic analyses on the linked panel dataset to the original cross-sectional data, evaluate the feasibility of the strategy when linking to supplementary sources, and discuss the scalability of our approach to the full Cape panel.
Keywords: census; machine learning; micro-data; record linkage; panel data; South Africa
49 pages, February 28, 2018
RePEc:hhs:luekhi:0172