Administrative data

Results

All (22) (0 to 10 of 22 results)

1. Probabilistic or deterministic? Linkage methods tested for the Résil program Archived
Articles and reports: 11-522-X202200100019
Description: The purpose of this article is to compare the linkage results for individuals from French tax sources with those of the 2019 Enquête Annuelle de Recensement (EAR), obtained through different methods. Such a comparison will decide whether the Répertoires Statistiques d'Individus et de Logements (Résil) program should be equipped with a probabilistic matching tool for its administrative source identification and matching engine.
Release date: 2024-03-25
2. Non-response follow-up for business surveys
Articles and reports: 12-001-X202200100006
Description:
In the last two decades, survey response rates have been steadily falling. In that context, it has become increasingly important for statistical agencies to develop and use methods that reduce the adverse effects of non-response on the accuracy of survey estimates. Follow-up of non-respondents may be an effective, albeit time and resource-intensive, remedy for non-response bias. We conducted a simulation study using real business survey data to shed some light on several questions about non-response follow-up. For instance, assuming a fixed non-response follow-up budget, what is the best way to select non-responding units to be followed up? How much effort should be dedicated to repeatedly following up non-respondents until a response is received? Should they all be followed up or a sample of them? If a sample is followed up, how should it be selected? We compared Monte Carlo relative biases and relative root mean square errors under different follow-up sampling designs, sample sizes and non-response scenarios. We also determined an expression for the minimum follow-up sample size required to expend the budget, on average, and showed that it maximizes the expected response rate. A main conclusion of our simulation experiment is that this sample size also appears to approximately minimize the bias and mean square error of the estimates.

Release date: 2022-06-21
3. Accessing the Canada Learning Bond: Meeting Identification and Income Eligibility Requirements
Articles and reports: 75F0002M2019007
Description:
Not having a Social Insurance Number (SIN) and not filing taxes may be barriers to accessing government programs and supports such as the Canada Education Savings Grant (CESG) and the Canada Learning Bond (CLB). Limited data availability has prevented a full assessment of the extent of these access challenges. This study attempts to address this knowledge gap by analyzing overall differences in SIN possession and tax-filing uptake by family income, level of parental education, family type, Indigenous identity of the child and age of children, using 2016 Census data augmented with tax-filing and SIN possession indicator flags.
Release date: 2019-06-21
4. Estimating Parental Leave in Canada Using Administrative Data Archived
Articles and reports: 11-633-X2017009
Description:
This document describes the procedures for using linked administrative data sources to estimate paid parental leave rates in Canada and the issues surrounding this use.
Release date: 2017-08-29
5. A note on regression estimation with unknown population size Archived
Articles and reports: 12-001-X201600114543
Description:
The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest, such as means or totals. It uses control totals of variables known at the population level that are included in the regression setup. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared, both theoretically and via a simulation study, with the regression estimators that strictly use the known totals.
Release date: 2016-06-22
6. Using Administrative Records to Evaluate Survey Data Archived
Articles and reports: 11-522-X201700014711
Description:
After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.
Release date: 2016-03-24
7. Comparing Survey Data to Administrative Sources: Immigration, Labour, and Demographic data from the Longitudinal and International Study of Adults Archived
Surveys and statistical programs – Documentation: 11-522-X201700014716
Description:
Administrative data, depending on its source and original purpose, can be considered a more reliable source of information than survey-collected data. It does not require a respondent to be present and understand question wording, and it is not limited by the respondent’s ability to recall events retrospectively. This paper compares selected survey data, such as demographic variables, from the Longitudinal and International Study of Adults (LISA) to various administrative sources for which LISA has linkage agreements in place. The agreement between data sources, and some factors that might affect it, are analyzed for various aspects of the survey.
Release date: 2016-03-24
8. Estimating the effects related to the timing of participation in employment assistance services using rich administrative data Archived
Articles and reports: 11-522-X201700014718
Description:
This study assessed whether starting participation in Employment Assistance Services (EAS) earlier after initiating an Employment Insurance (EI) claim leads to better impacts for unemployed individuals than participating later during the EI benefit period. As in Sianesi (2004) and Hujer and Thomsen (2010), the analysis relied on a stratified propensity score matching approach conditional on the discretized duration of unemployment until the program starts. The results showed that individuals who participated in EAS within the first four weeks after initiating an EI claim had the best impacts on earnings and incidence of employment while also experiencing reduced use of EI starting the second year post-program.
Release date: 2016-03-24
9. Sampling Procedures for Assessing Accuracy of Record Linkage Archived
Articles and reports: 11-522-X201700014729
Description:
The use of administrative datasets as a data source in official statistics has become much more common as there is a drive for more outputs to be produced more efficiently. Many outputs rely on linkage between two or more datasets, and this is often undertaken in a number of phases with different methods and rules. In these situations we would like to be able to assess the quality of the linkage, and this involves some re-assessment of both links and non-links. In this paper we discuss sampling approaches to obtain estimates of false negatives and false positives with reasonable control of both accuracy of estimates and cost. Approaches to stratification of links (non-links) to sample are evaluated using information from the 2011 England and Wales population census.
Release date: 2016-03-24
10. Estimating the Impact of Active Labour Market Programs using Administrative Data and Matching Methods Archived
Articles and reports: 11-522-X201700014740
Description:
In this paper, we discuss the impacts of Employment Benefit and Support Measures delivered in Canada under the Labour Market Development Agreements (LMDA). We use rich linked longitudinal administrative data covering all LMDA participants from 2002 to 2005. We apply propensity score matching as in Blundell et al. (2002), Gerfin and Lechner (2002), and Sianesi (2004), and produce national incremental impact estimates using difference-in-differences and a kernel matching estimator (Heckman and Smith, 1999). The findings suggest that both Employment Assistance Services and employment benefits such as Skills Development and Targeted Wage Subsidies had positive effects on earnings and employment.
Release date: 2016-03-24

Data (0) (0 results)

No content available at this time.

Analysis (19) (0 to 10 of 19 results)

1. Probabilistic or deterministic? Linkage methods tested for the Résil program Archived
Articles and reports: 11-522-X202200100019
Description: The purpose of this article is to compare the linkage results for individuals from French tax sources with those of the 2019 Enquête Annuelle de Recensement (EAR), obtained through different methods. Such a comparison will decide whether the Répertoires Statistiques d'Individus et de Logements (Résil) program should be equipped with a probabilistic matching tool for its administrative source identification and matching engine.
Release date: 2024-03-25
2. Non-response follow-up for business surveys
Articles and reports: 12-001-X202200100006
Description:
In the last two decades, survey response rates have been steadily falling. In that context, it has become increasingly important for statistical agencies to develop and use methods that reduce the adverse effects of non-response on the accuracy of survey estimates. Follow-up of non-respondents may be an effective, albeit time and resource-intensive, remedy for non-response bias. We conducted a simulation study using real business survey data to shed some light on several questions about non-response follow-up. For instance, assuming a fixed non-response follow-up budget, what is the best way to select non-responding units to be followed up? How much effort should be dedicated to repeatedly following up non-respondents until a response is received? Should they all be followed up or a sample of them? If a sample is followed up, how should it be selected? We compared Monte Carlo relative biases and relative root mean square errors under different follow-up sampling designs, sample sizes and non-response scenarios. We also determined an expression for the minimum follow-up sample size required to expend the budget, on average, and showed that it maximizes the expected response rate. A main conclusion of our simulation experiment is that this sample size also appears to approximately minimize the bias and mean square error of the estimates.

Release date: 2022-06-21
3. Accessing the Canada Learning Bond: Meeting Identification and Income Eligibility Requirements
Articles and reports: 75F0002M2019007
Description:
Not having a Social Insurance Number (SIN) and not filing taxes may be barriers to accessing government programs and supports such as the Canada Education Savings Grant (CESG) and the Canada Learning Bond (CLB). Limited data availability has prevented a full assessment of the extent of these access challenges. This study attempts to address this knowledge gap by analyzing overall differences in SIN possession and tax-filing uptake by family income, level of parental education, family type, Indigenous identity of the child and age of children, using 2016 Census data augmented with tax-filing and SIN possession indicator flags.
Release date: 2019-06-21
4. Estimating Parental Leave in Canada Using Administrative Data Archived
Articles and reports: 11-633-X2017009
Description:
This document describes the procedures for using linked administrative data sources to estimate paid parental leave rates in Canada and the issues surrounding this use.
Release date: 2017-08-29
5. A note on regression estimation with unknown population size Archived
Articles and reports: 12-001-X201600114543
Description:
The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest, such as means or totals. It uses control totals of variables known at the population level that are included in the regression setup. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared, both theoretically and via a simulation study, with the regression estimators that strictly use the known totals.
Release date: 2016-06-22
6. Using Administrative Records to Evaluate Survey Data Archived
Articles and reports: 11-522-X201700014711
Description:
After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.
Release date: 2016-03-24
7. Estimating the effects related to the timing of participation in employment assistance services using rich administrative data Archived
Articles and reports: 11-522-X201700014718
Description:
This study assessed whether starting participation in Employment Assistance Services (EAS) earlier after initiating an Employment Insurance (EI) claim leads to better impacts for unemployed individuals than participating later during the EI benefit period. As in Sianesi (2004) and Hujer and Thomsen (2010), the analysis relied on a stratified propensity score matching approach conditional on the discretized duration of unemployment until the program starts. The results showed that individuals who participated in EAS within the first four weeks after initiating an EI claim had the best impacts on earnings and incidence of employment while also experiencing reduced use of EI starting the second year post-program.
Release date: 2016-03-24
8. Sampling Procedures for Assessing Accuracy of Record Linkage Archived
Articles and reports: 11-522-X201700014729
Description:
The use of administrative datasets as a data source in official statistics has become much more common as there is a drive for more outputs to be produced more efficiently. Many outputs rely on linkage between two or more datasets, and this is often undertaken in a number of phases with different methods and rules. In these situations we would like to be able to assess the quality of the linkage, and this involves some re-assessment of both links and non-links. In this paper we discuss sampling approaches to obtain estimates of false negatives and false positives with reasonable control of both accuracy of estimates and cost. Approaches to stratification of links (non-links) to sample are evaluated using information from the 2011 England and Wales population census.
Release date: 2016-03-24
9. Estimating the Impact of Active Labour Market Programs using Administrative Data and Matching Methods Archived
Articles and reports: 11-522-X201700014740
Description:
In this paper, we discuss the impacts of Employment Benefit and Support Measures delivered in Canada under the Labour Market Development Agreements (LMDA). We use rich linked longitudinal administrative data covering all LMDA participants from 2002 to 2005. We apply propensity score matching as in Blundell et al. (2002), Gerfin and Lechner (2002), and Sianesi (2004), and produce national incremental impact estimates using difference-in-differences and a kernel matching estimator (Heckman and Smith, 1999). The findings suggest that both Employment Assistance Services and employment benefits such as Skills Development and Targeted Wage Subsidies had positive effects on earnings and employment.
Release date: 2016-03-24
10. Linking Canadian Patent records from the U.S. Patent office to Statistics Canada’s Business Register, 2000 to 2011 Archived
Articles and reports: 11-522-X201700014742
Description:
This paper describes the Quick Match System (QMS), an in-house application designed to match business microdata records, and the methods used to link the United States Patent and Trademark Office (USPTO) dataset to Statistics Canada’s Business Register (BR) for the period from 2000 to 2011. The paper illustrates the record-linkage framework and outlines the techniques used to prepare and classify each record and evaluate the match results. The USPTO dataset consisted of 41,619 U.S. patents granted to 14,162 distinct Canadian entities. The record-linkage process matched the names, city, province and postal codes of the patent assignees in the USPTO dataset with those of businesses in the January editions of the Generic Survey Universe File (GSUF) from the BR for the same reference period. As the vast majority of individual patent assignees are not engaged in commercial activity to provide taxable property or services, they tend not to appear in the BR. The relatively poor match rate of 24.5% among individuals, compared to 84.7% among institutions, reflects this tendency. Although the 8,844 individual patent assignees outnumbered the 5,318 institutions, the institutions accounted for 73.0% of the patents, compared to 27.0% held by individuals. Consequently, this study and its conclusions focus primarily on institutional patent assignees. The linkage of the USPTO institutions to the BR is significant because it provides access to business micro-level data on firm characteristics, employment, revenue, assets and liabilities. In addition, the retrieval of robust administrative identifiers enables subsequent linkage to other survey and administrative data sources. The integrated dataset will support direct and comparative analytical studies on the performance of Canadian institutions that obtained patents in the United States between 2000 and 2011.
Release date: 2016-03-24

Reference (3) (3 results)

1. Comparing Survey Data to Administrative Sources: Immigration, Labour, and Demographic data from the Longitudinal and International Study of Adults Archived
Surveys and statistical programs – Documentation: 11-522-X201700014716
Description:
Administrative data, depending on its source and original purpose, can be considered a more reliable source of information than survey-collected data. It does not require a respondent to be present and understand question wording, and it is not limited by the respondent’s ability to recall events retrospectively. This paper compares selected survey data, such as demographic variables, from the Longitudinal and International Study of Adults (LISA) to various administrative sources for which LISA has linkage agreements in place. The agreement between data sources, and some factors that might affect it, are analyzed for various aspects of the survey.
Release date: 2016-03-24
2. Use of Administrative Data to Increase the Efficiency of the Sample Design for the New National Travel Survey Archived
Surveys and statistical programs – Documentation: 11-522-X201700014749
Description:
As part of the Tourism Statistics Program redesign, Statistics Canada is developing the National Travel Survey (NTS) to collect travel information from Canadian travellers. This new survey will replace the Travel Survey of Residents of Canada and the Canadian resident component of the International Travel Survey. The NTS will take advantage of Statistics Canada’s common sampling frames and common processing tools while maximizing the use of administrative data. This paper discusses the potential uses of administrative data, such as Passport Canada files, Canada Border Services Agency files and Canada Revenue Agency files, to increase the efficiency of the NTS sample design.
Release date: 2016-03-24
3. Survey of Labour and Income Dynamics (SLID), 2003 Reference Year: Entry Exit Component for January 2004 Labour Interview and May 2004 Income Interview Archived
Surveys and statistical programs – Documentation: 75F0002M2005005
Description:
The Survey of Labour and Income Dynamics (SLID) conducts two annual interviews: the Labour interview in January and the Income interview in May. The data are collected using computer-assisted interviewing. Thus there are no paper questionnaires required for data collection. The questions, responses and interview flow for Labour and Income are documented in other SLID research papers. This document presents the information for the 2004 Entry Exit portion of the Labour and the Income interviews (for the 2003 reference year).
The Entry Exit Component consists of five separate modules. The Entry module is the first set of data collected. It is information collected to update household composition and place of residence. For each person identified in Entry, the Demographics module collects (or updates) the person's name, date of birth, sex and marital status. Then the Relationships module identifies (or updates) the relationship between each respondent and every other household member. Relationship data is not collected in the May Income interview. The Exit module includes questions on who to contact for the next interview and the names, phone numbers and addresses of two contacts to be used only if future tracing of respondents is required. An overview of the Tracing module is also included in this document.
Release date: 2005-06-16

Date modified: 2024-04-18