Design-based Estimation with Record-Linked Administrative Files - ARCHIVED

Articles and reports: 11-522-X201300014265

Description:

Exact record linkage is an essential tool for exploiting administrative files, especially when one is studying the relationships among many variables that are not contained in a single administrative file. It is aimed at identifying pairs of records associated with the same individual or entity. The result is a linked file that may be used to estimate population parameters, including totals and ratios. Unfortunately, the linkage process is complex and error-prone because it usually relies on linkage variables that are non-unique and recorded with errors. As a result, the linked file contains linkage errors, including bad links between unrelated records and missing links between related records. These errors may lead to biased estimators when they are ignored in the estimation process. Previous work in this area has accounted for these errors using assumptions about their distribution. In general, the assumed distribution is only a very coarse approximation of the true distribution because the linkage process is inherently complex. Consequently, the resulting estimators may be subject to bias. A new methodological framework, grounded in traditional survey sampling, is proposed for obtaining design-based estimators from linked administrative files. It consists of three steps. First, a probabilistic sample of record pairs is selected. Second, a manual review is carried out for all sampled pairs. Finally, design-based estimators are computed based on the review results. This methodology leads to estimators with a design-based sampling error, even when the process is based solely on two administrative files. It departs from previous work, which is model-based, and provides more robust estimators. This result is achieved by placing manual reviews at the center of the estimation process. Using manual reviews effectively is crucial because they are a de facto gold standard regarding the quality of linkage decisions. The proposed framework may also be applied when estimating from linked administrative and survey data.
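
The three steps described above can be illustrated with a minimal sketch. The description does not specify the sampling design or the form of the estimator, so the sketch assumes simple random sampling without replacement of record pairs and a Horvitz-Thompson estimator of a total over true matches. The function name, the `review` callback (standing in for the manual review) and the `value` callback are hypothetical names introduced only for illustration, not the report's own method.

```python
import random


def ht_total_from_reviewed_sample(pairs, sample_size, review, value, seed=0):
    """Estimate the total of value(pair) over truly matching pairs.

    Assumptions (not taken from the report): pairs are sampled by simple
    random sampling without replacement, so every pair has the same
    inclusion probability sample_size / len(pairs), and the estimator
    is the Horvitz-Thompson estimator of a total.
    """
    rng = random.Random(seed)

    # Step 1: select a probabilistic sample of record pairs.
    sample = rng.sample(pairs, sample_size)
    inclusion_prob = sample_size / len(pairs)

    estimate = 0.0
    for pair in sample:
        # Step 2: manual review decides whether the pair is a true match.
        if review(pair):
            # Step 3: weight each confirmed match by 1 / inclusion probability.
            estimate += value(pair) / inclusion_prob
    return estimate


# Toy example: records 0..3 in file A and 0..3 in file B; a pair is a
# true match when both indices agree, and each match contributes 10.0.
toy_pairs = [(a, b) for a in range(4) for b in range(4)]
toy_review = lambda pair: pair[0] == pair[1]
toy_value = lambda pair: 10.0

full_estimate = ht_total_from_reviewed_sample(
    toy_pairs, len(toy_pairs), toy_review, toy_value
)
half_estimate = ht_total_from_reviewed_sample(
    toy_pairs, 8, toy_review, toy_value, seed=1
)
```

When every pair is sampled, the estimate equals the true total (40.0 in the toy example). With a smaller sample, the estimate varies from sample to sample, and that variation is the design-based sampling error the description refers to. In practice the number of candidate pairs is very large, so an unequal-probability or stratified design may be preferable; that choice is outside the scope of this sketch.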

Issue Number: 2013000
Author(s): Dasylva, Abel
Format: PDF
Release date: October 31, 2014
