Statistics by subject – Administrative data

All (72) (25 of 72 results)

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2018014
    Description:

    The Canadian Mortality Database (CMDB) is an administrative database that collects information on cause of death from all provincial and territorial vital statistics registries in Canada. The CMDB lacks subpopulation identifiers to examine mortality rates and disparities among groups such as First Nations, Métis, Inuit and members of visible minority groups. Linkage between the CMDB and the Census of Population is an approach to circumvent this limitation. This report describes a linkage between the CMDB (2006 to 2011) and the 2006 Census of Population, which was carried out using hierarchical deterministic exact matching, with a focus on methodology and validation.

    Release date: 2018-02-14

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018012
    Description:

    This study investigates the extent to which income tax reassessments and delayed tax filing affect the reliability of Canadian administrative tax datasets used for economic analysis. The study is based on individual income tax records from the T1 Personal Master File and Historical Personal Master File for selected years from 1990 to 2010. These datasets contain tax records for approximately 100% of initial and all income tax filers, who submitted returns to the Canada Revenue Agency (CRA) before specific processing cut-off dates.

    Release date: 2018-01-11

  • Articles and reports: 12-001-X201700254871
    Description:

    This paper addresses how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated survey together with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. This model also makes it possible to utilize the higher frequency of the social media data to produce more precise estimates for the sample survey in real time, at the moment that statistics for the social media become available but the sample data are not yet available. The concept of cointegration is applied to address the question of the extent to which the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.

    Release date: 2017-12-21

  • Articles and reports: 82-003-X201601214687
    Description:

    This study describes record linkage of the Canadian Community Health Survey and the Canadian Mortality Database. The article explains the record linkage process and presents results about associations between health behaviours and mortality among a representative sample of Canadians.

    Release date: 2016-12-21

  • Articles and reports: 11-633-X2016001
    Description:

    Every year, thousands of workers lose their jobs as firms reduce the size of their workforce in response to growing competition, technological changes, changing trade patterns and numerous other factors. Thousands of workers also start a job with a new employer as new firms enter a product market and existing firms expand or replace employees who recently left. This worker reallocation process across employers is generally seen as contributing to productivity growth and rising living standards. To measure this labour reallocation process, labour market indicators such as hiring rates and layoff rates are needed. In response to growing demand for subprovincial labour market information and taking advantage of unique administrative datasets, Statistics Canada is producing hiring rates and layoff rates by economic region of residence. This document describes the data sources, conceptual and methodological issues, and other matters pertaining to these two indicators.

    Release date: 2016-06-27

  • Articles and reports: 12-001-X201600114544
    Description:

    In the Netherlands, statistical information about income and wealth is based on two large-scale household panels that are completely derived from administrative data. A problem with using households as sampling units in the sample design of panels is the instability of these units over time. Changes in the household composition affect the inclusion probabilities required for design-based and model-assisted inference procedures. Such problems are circumvented in the two aforementioned household panels by sampling persons, who are followed over time. At each period, the household members of these sampled persons are included in the sample. This is equivalent to sampling with probabilities proportional to household size, where households can be selected more than once but with a maximum equal to the number of household members. In this paper, properties of this sample design are described and contrasted with the Generalized Weight Share method for indirect sampling (Lavallée 1995, 2007). Methods are illustrated with an application to the Dutch Regional Income Survey.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114543
    Description:

    The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest such as means or totals. It uses control totals of variables known at the population level that are included in the regression set up. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared to the regression estimators that strictly use the known totals both theoretically and via a simulation study.

    Release date: 2016-06-22

  • Technical products: 11-522-X201700014742
    Description:

    This paper describes the Quick Match System (QMS), an in-house application designed to match business microdata records, and the methods used to link the United States Patent and Trademark Office (USPTO) dataset to Statistics Canada’s Business Register (BR) for the period from 2000 to 2011. The paper illustrates the record-linkage framework and outlines the techniques used to prepare and classify each record and evaluate the match results. The USPTO dataset consisted of 41,619 U.S. patents granted to 14,162 distinct Canadian entities. The record-linkage process matched the names, city, province and postal codes of the patent assignees in the USPTO dataset with those of businesses in the January editions of the Generic Survey Universe File (GSUF) from the BR for the same reference period. As the vast majority of individual patent assignees are not engaged in commercial activity to provide taxable property or services, they tend not to appear in the BR. The relatively poor match rate of 24.5% among individuals, compared to 84.7% among institutions, reflects this tendency. Although the 8,844 individual patent assignees outnumbered the 5,318 institutions, the institutions accounted for 73.0% of the patents, compared to 27.0% held by individuals. Consequently, this study and its conclusions focus primarily on institutional patent assignees. The linkage of the USPTO institutions to the BR is significant because it provides access to business micro-level data on firm characteristics, employment, revenue, assets and liabilities. In addition, the retrieval of robust administrative identifiers enables subsequent linkage to other survey and administrative data sources. The integrated dataset will support direct and comparative analytical studies on the performance of Canadian institutions that obtained patents in the United States between 2000 and 2011.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014748
    Description:

    This paper describes the creation of a database developed in Switzerland to analyze migration and the structural integration of the foreign national population. The database is created from various registers (register of residents, social insurance, unemployment) and surveys, and covers 15 years (1998 to 2013). Information on migration status and socioeconomic characteristics is also available for nearly 4 million foreign nationals who lived in Switzerland between 1998 and 2013. This database is the result of a collaboration between the Federal Statistics Office and researchers from the National Center of Competence in Research (NCCR)–On the Move.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014747
    Description:

    The Longitudinal Immigration Database (IMDB) combines the Immigrant Landing File (ILF) with annual tax files. This record linkage is performed using a tax filer database. The ILF includes all immigrants who have landed in Canada since 1980. In looking to enhance the IMDB, the possibility of adding temporary residents (TR) and immigrants who landed between 1952 and 1979 (PRE80) was studied. Adding this information would give a more complete picture of the immigrant population living in Canada. To integrate the TR and PRE80 files into the IMDB, record linkages between these two files and the tax filer database were performed. This exercise was challenging, in part due to the presence of duplicates in the files and conflicting links between the different record linkages.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014741
    Description:

    Statistics Canada’s mandate includes producing statistical data to shed light on current business issues. The linking of business records is an important aspect of the development, production, evaluation and analysis of these statistical data. As record linkage can intrude on one’s privacy, Statistics Canada uses it only when the public good is clear and outweighs the intrusion. Record linkage is experiencing a revival triggered by a greater use of administrative data in many statistical programs. There are many challenges to business record linkage. For example, many administrative files do not have common identifiers, information is recorded in non-standardized formats, information contains typographical errors, administrative data files are usually large in size, and finally the evaluation of multiple record pairings makes absolute comparison impractical and sometimes impossible. Due to the importance and challenges associated with record linkage, Statistics Canada has been developing a record linkage standard to help users optimize their business record linkage process. For example, this process includes building on a record linkage blocking strategy that reduces the number of record pairs to compare and match, making use of Statistics Canada’s internal software to conduct deterministic and probabilistic matching, and creating standard business name and address fields on Statistics Canada’s Business Register. This article gives an overview of the business record linkage methodology and looks at various economic projects that use record linkage at Statistics Canada; these include projects in the National Accounts, International Trade, Agriculture and the Business Register.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014750
    Description:

    The Educational Master File (EMF) system was built to allow the analysis of educational programs in Canada. At the core of the system are administrative files that record all of the registrations to post-secondary and apprenticeship programs in Canada. New administrative files become available on an annual basis. Once a new file becomes available, a first round of processing is performed, which includes linkage to other administrative records. This linkage yields information that can improve the quality of the file, allows further linkages to other data describing labour market outcomes, and is the first step in adding the file to the EMF. Once part of the EMF, information from the file can be included in cross-sectional and longitudinal projects to study academic pathways and labour market outcomes after graduation. The EMF currently consists of data from 2005 to 2013, but it evolves as new data become available. This paper gives an overview of the mechanisms used to build the EMF, with a focus on the structure of the final system and some of its analytical potential.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014739
    Description:

    Vital statistics datasets such as the Canadian Mortality Database lack identifiers for certain populations of interest such as First Nations, Métis and Inuit. Record linkage between vital statistics and survey or other administrative datasets can circumvent this limitation. This paper describes a linkage between the Canadian Mortality Database and the 2006 Census of Population, and the planned analysis using the linked data.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014755
    Description:

    The National Children’s Study Vanguard Study was a pilot epidemiological cohort study of children and their parents. Measures were to be taken from pre-pregnancy until adulthood. The use of extant data was planned to supplement direct data collection from the respondents. Our paper outlines a strategy for cataloging and evaluating extant data sources for use with large-scale longitudinal studies. Through our review, we selected five evaluation factors to guide a researcher through available data sources: 1) relevance, 2) timeliness, 3) spatiality, 4) accessibility, and 5) accuracy.

    Release date: 2016-03-24

  • Articles and reports: 82-003-X201600114306
    Description:

    This article is an overview of the creation, content, and quality of the 2006 Canadian Birth-Census Cohort Database.

    Release date: 2016-01-20

  • Articles and reports: 82-003-X201501014228
    Description:

    This study presents the results of a hierarchical exact matching approach to link the 2006 Census of Population to hospital data from the 2006/2007-to-2008/2009 Discharge Abstract Database for all provinces and territories (excluding Quebec). The purpose is to determine if the Census–DAD linkage performed similarly in different jurisdictions, and if linkage and coverage rates declined as time passed since the census.

    Release date: 2015-10-21

  • Articles and reports: 82-003-X201500614196
    Description:

    This study investigates the feasibility and validity of using personal health insurance numbers to deterministically link the Canadian Cancer Registry (CCR) and the Discharge Abstract Database to obtain hospitalization information about people with primary cancers.

    Release date: 2015-06-17

  • Articles and reports: 12-001-X201400214128
    Description:

    Users, funders and providers of official statistics want estimates that are “wider, deeper, quicker, better, cheaper” (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add “more relevant” and “less burdensome”. Since World War II, we have relied heavily on the probability sample survey as the best we could do - and that best being very good - to meet these goals for estimates of household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, et al. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates, but, to date, we have not done that for household surveys, at least not in the United States. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records and, increasingly, transaction and Internet-based data. I provide two examples - household income and plumbing facilities - to illustrate my thesis. I suggest ways to inculcate a culture of official statistics that focuses on the end result of relevant, timely, accurate and cost-effective statistics and treats surveys, along with other data sources, as means to that end.

    Release date: 2014-12-19

  • Technical products: 11-522-X201300014268
    Description:

    Information collection is critical for chronic-disease surveillance to measure the scope of diseases, assess the use of services, identify at-risk groups and track the course of diseases and risk factors over time with the goal of planning and implementing public-health programs for disease prevention. It is in this context that the Quebec Integrated Chronic Disease Surveillance System (QICDSS) was established. The QICDSS is a database created by linking administrative files covering the period from 1996 to 2013. It is an attractive alternative to survey data, since it covers the entire population, is not affected by recall bias and can track the population over time and space. In this presentation, we describe the relevance of using administrative data as an alternative to survey data, the methods selected to build the population cohort by linking various sources of raw data, and the processing applied to minimize bias. We will also discuss the advantages and limitations associated with the analysis of administrative files.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014274
    Description:

    What is big data? Can it replace and/or supplement official surveys? What are some of the challenges associated with utilizing big data for official statistics? What are some of the possible solutions? Last fall, Statistics Canada invested in a Big Data Pilot Project to answer some of these questions. This was the first business survey project of its kind. This paper will cover some of the lessons learned from the Big Data Pilot Project using Smart Meter Data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014264
    Description:

    While wetlands represent only 6.4% of the world’s surface area, they are essential to the survival of terrestrial species. These ecosystems require special attention in Canada, since that is where nearly 25% of the world’s wetlands are found. Environment Canada (EC) has massive databases that contain all kinds of wetland information from various sources. Before the information in these databases could be used for any environmental initiative, it had to be classified and its quality had to be assessed. In this paper, we will give an overview of the joint pilot project carried out by EC and Statistics Canada to assess the quality of the information contained in these databases, which has characteristics specific to big data, administrative data and survey data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014272
    Description:

    Two converging trends raise questions about the future of large-scale probability surveys conducted by or for National Statistical Institutes (NSIs). First, increasing costs and rising rates of nonresponse potentially threaten the cost-effectiveness and inferential value of surveys. Second, there is growing interest in Big Data as a replacement for surveys. There are many different types of Big Data, but the primary focus here is on data generated through social media. This paper supplements and updates an earlier paper on the topic (Couper, 2013). I review some of the concerns about Big Data, particularly from the survey perspective. I argue that there is a role for both high-quality surveys and big data analytics in the work of NSIs. While Big Data is unlikely to replace high-quality surveys, I believe the two methods can serve complementary functions. I attempt to identify some of the criteria that need to be met, and questions that need to be answered, before Big Data can be used for reliable population-based inference.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014284
    Description:

    The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.

    Release date: 2014-10-31

Data (0) (0 results)

Analysis (32) (25 of 32 results)

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2018014
    Description:

    The Canadian Mortality Database (CMDB) is an administrative database that collects information on cause of death from all provincial and territorial vital statistics registries in Canada. The CMDB lacks subpopulation identifiers to examine mortality rates and disparities among groups such as First Nations, Métis, Inuit and members of visible minority groups. Linkage between the CMDB and the Census of Population is an approach to circumvent this limitation. This report describes a linkage between the CMDB (2006 to 2011) and the 2006 Census of Population, which was carried out using hierarchical deterministic exact matching, with a focus on methodology and validation.

    Release date: 2018-02-14

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018012
    Description:

    This study investigates the extent to which income tax reassessments and delayed tax filing affect the reliability of Canadian administrative tax datasets used for economic analysis. The study is based on individual income tax records from the T1 Personal Master File and Historical Personal Master File for selected years from 1990 to 2010. These datasets contain tax records for approximately 100% of initial and all income tax filers, who submitted returns to the Canada Revenue Agency (CRA) before specific processing cut-off dates.

    Release date: 2018-01-11

  • Articles and reports: 12-001-X201700254871
    Description:

    In this paper the question is addressed how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated surveys with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. This model also makes it possible to utilize the higher frequency of the social media to produce more precise estimates for the sample survey in real time at the moment that statistics for the social media become available but the sample data are not yet available. The concept of cointegration is applied to address the question to which extent the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.

    Release date: 2017-12-21

  • Articles and reports: 82-003-X201601214687
    Description:

    This study describes record linkage of the Canadian Community Health Survey and the Canadian Mortality Database. The article explains the record linkage process and presents results about associations between health behaviours and mortality among a representative sample of Canadians.

    Release date: 2016-12-21

  • Articles and reports: 11-633-X2016001
    Description:

    Every year, thousands of workers lose their jobs as firms reduce the size of their workforce in response to growing competition, technological changes, changing trade patterns and numerous other factors. Thousands of workers also start a job with a new employer as new firms enter a product market and existing firms expand or replace employees who recently left. This worker reallocation process across employers is generally seen as contributing to productivity growth and rising living standards. To measure this labour reallocation process, labour market indicators such as hiring rates and layoff rates are needed. In response to growing demand for subprovincial labour market information and taking advantage of unique administrative datasets, Statistics Canada is producing hiring rates and layoff rates by economic region of residence. This document describes the data sources, conceptual and methodological issues, and other matters pertaining to these two indicators.

    Release date: 2016-06-27

  • Articles and reports: 12-001-X201600114544
    Description:

    In the Netherlands, statistical information about income and wealth is based on two large scale household panels that are completely derived from administrative data. A problem with using households as sampling units in the sample design of panels is the instability of these units over time. Changes in the household composition affect the inclusion probabilities required for design-based and model-assisted inference procedures. Such problems are circumvented in the two aforementioned household panels by sampling persons, who are followed over time. At each period the household members of these sampled persons are included in the sample. This is equivalent to sampling with probabilities proportional to household size where households can be selected more than once but with a maximum equal to the number of household members. In this paper properties of this sample design are described and contrasted with the Generalized Weight Share method for indirect sampling (Lavallée 1995, 2007). Methods are illustrated with an application to the Dutch Regional Income Survey.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114543
    Description:

    The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest such as means or totals. It uses control totals of variables known at the population level that are included in the regression set up. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared to the regression estimators that strictly use the known totals both theoretically and via a simulation study.

    Release date: 2016-06-22

  • Articles and reports: 82-003-X201600114306
    Description:

    This article is an overview of the creation, content, and quality of the 2006 Canadian Birth-Census Cohort Database.

    Release date: 2016-01-20

  • Articles and reports: 82-003-X201501014228
    Description:

    This study presents the results of a hierarchical exact matching approach to link the 2006 Census of Population to hospital data from the 2006/2007-to-2008/2009 Discharge Abstract Database (DAD) for all provinces and territories (excluding Quebec). The purpose is to determine whether the Census–DAD linkage performed similarly in different jurisdictions, and whether linkage and coverage rates declined as time passed since the census.

    Release date: 2015-10-21

  • Articles and reports: 82-003-X201500614196
    Description:

    This study investigates the feasibility and validity of using personal health insurance numbers to deterministically link the Canadian Cancer Registry (CCR) and the Discharge Abstract Database to obtain hospitalization information about people with primary cancers.

    Release date: 2015-06-17

  • Articles and reports: 12-001-X201400214128
    Description:

    Users, funders and providers of official statistics want estimates that are “wider, deeper, quicker, better, cheaper” (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add “more relevant” and “less burdensome”. Since World War II, we have relied heavily on the probability sample survey as the best we could do - and that best being very good - to meet these goals for estimates of household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, et al. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates, but, to date, we have not done that for household surveys, at least not in the United States. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records and, increasingly, transaction and Internet-based data. I provide two examples - household income and plumbing facilities - to illustrate my thesis. I suggest ways to inculcate a culture of official statistics that focuses on the end result of relevant, timely, accurate and cost-effective statistics and treats surveys, along with other data sources, as means to that end.

    Release date: 2014-12-19

  • Articles and reports: 82-003-X201300111764
    Description:

    This study compares two sources of information about prescription drug use by people aged 65 or older in Ontario: the Canadian Community Health Survey and the drug claims database of the Ontario Drug Benefit Program. The analysis pertains to cardiovascular and diabetes drugs because they are commonly used, and almost all are prescribed on a regular basis.

    Release date: 2013-01-16

  • Articles and reports: 91F0015M2005007
    Description:

    The Population Estimates Program at Statistics Canada uses internal migration estimates derived from administrative data sources. Two versions of migration estimates are currently available: preliminary (P), based on Child Tax Credit information, and final (F), produced using information from income tax reports. For some reference dates, the two can differ significantly. This paper summarises the research undertaken in Demography Division to modify the current method for preliminary estimates in order to decrease those differences. After a brief analysis of the differences, six methods are tested: 1) regression of out-migration; 2) regression of in- and out-migration separately; 3) regression of net migration; 4) the exponentially weighted moving average; 5) the U.S. Bureau of the Census approach; and 6) first difference regression. It seems that the method in which final and preliminary migration data are combined to estimate preliminary net migration (Method 3) is the best approach to improve convergence between preliminary and final estimates of internal migration for the Population Estimates Program. This approach allows for "smoothing" of some erratic patterns displayed by the former method while preserving CTB data's ability to capture current shifts in migration patterns.

    Release date: 2005-06-20

  • Articles and reports: 91F0015M2004006
    Description:

    The paper assesses and compares new and old methodologies for official estimates of migration within and among provinces and territories for the period 1996/97 to 2000/01.

    Release date: 2004-06-17

  • Articles and reports: 12-001-X19990024878
    Description:

    In his paper Fritz Scheuren considers the possible uses of administrative records to enhance and improve population censuses. After reviewing previous uses of administrative records in an international context, he puts forward several proposals for research and development towards increased use of administrative records in the American statistical system.

    Release date: 2000-03-01

  • Articles and reports: 12-001-X19970023620
    Description:

    Since France has no population registers, population censuses are the basis for its socio-demographic information system. However, between two censuses, some data must be updated, in particular at a high level of geographic detail, especially since censuses are tending, for various reasons, to be less frequent. In 1993, the Institut National de la Statistique et des Études Économiques (INSEE) set up a team whose objective was to propose a system to substantially improve the existing mechanism for making small area population estimates. Its task was twofold: to prepare an efficient and robust synthesis of the information available from different administrative sources, and to assemble a sufficient number of "good" sources. The "multi-source" system that it designed, which is reported on here, is flexible and reliable, without being overly complex.

    Release date: 1998-03-12

  • Articles and reports: 12-001-X198900214563
    Description:

    This paper examines the adequacy of estimates of emigrants from Canada and interprovincial migration data from the Family Allowance files and Revenue Canada tax files. The application of these data files in estimating total population for Canada, provinces and territories, was evaluated with reference to the 1986 Census counts. It was found that these two administrative files provided consistent and reasonably accurate series of data on emigration and interprovincial migration from 1981 to 1986. Consequently, the population estimates were fairly accurate. The estimate of emigrants derived from the Family Allowance file could be improved by using the ratio of adult to child emigrant rates computed from Employment and Immigration Canada’s immigration file.

    Release date: 1989-12-15

  • Articles and reports: 12-001-X198900114571
    Description:

    Statistics Canada is currently rebuilding its central register of economic entities. The new register views each economic entity as a network of legal and operating entities whose characteristics allow for the delineation of statistical entities. This network view, the profile, is determined through the ‘profiling’ process which involves contact with the economic entity. In 1986 a list of all entities in-scope for a profiling contact was required so that profiles could be obtained to initialize the new register. Administrative data were used to build this list. In the future, administrative data will be a source of information on changes that may have happened to economic entities. They may thus be used as a source of direct update or as a signal that a review of the structure of an entity is required. The paper begins with the objectives of the profiling process. The procedures for constructing the frame for the initial profiling process using several administrative data sources are then presented. These procedures include the application of concepts, the detection of overlap between sources, and the evaluation of data quality. Next, the role of administrative data in providing information on changes to business entities and in requesting profiles to be verified is presented. Then the results of a simulation study done to assess this role are reviewed. Finally, the paper concludes with a series of questions on the methodology of using administrative data to maintain profiles.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114572
    Description:

    The Survey of Income and Program Participation (SIPP) is a new Census Bureau panel survey designed to provide data on the economic situation of persons and families in the United States. The basic datum of SIPP is monthly income, which is reported for each month of the four-month reference period preceding the interview month. The SIPP Record Check Study uses administrative record data to estimate the quality of SIPP estimates for a variety of income sources and transfer programs. The project uses computerized record matching to identify SIPP sample persons in four states who are on record as having received payments from any of nine state or Federal programs, and then compares survey-reported dates and amounts of payments with official record values. The paper describes the project in detail and presents some early findings.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114575
    Description:

    The experience of the four Nordic countries illustrates the advantages and disadvantages of a register-based census of population and points to ways in which the disadvantages can be contained. Other countries see major obstacles to a register-based census: the lack of data systems of the kind and quality needed; and public concern about privacy and the power of the State. These issues go far beyond statistics; they concern policy and administration. The paper looks at the situation in two countries, the United Kingdom and Australia. In the United Kingdom past initiatives aimed at population registration in peacetime foundered and the present environment is hostile to any new initiative. But the government is going ahead with a controversial reform of local taxation that involves setting up new registers. In Australia the government tabled a Bill to introduce identity cards and an associated register, and advanced clearcut political arguments to support it; the Bill was later withdrawn. The paper concludes that the issues involved in reforming data systems deserve to be fully discussed and gives reasons why statisticians should take a leading part in the debate.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114573
    Description:

    The Census Bureau makes extensive use of administrative records information in its various economic programs. Although the volume of records processed annually is vast, even larger numbers will be received during the census years. Census Bureau mainframe computers perform quality control (QC) tabulations on the data; however, since such a large number of QC tables are needed and resources for programming are limited and costly, a comprehensive mainframe QC system is difficult to attain. Add to this the sensitive nature of the data and the potentially very negative ramifications from erroneous data, and the need becomes quite apparent for a sophisticated quality assurance system on the microcomputer level. Such a system is being developed by the Economic Surveys Division and will be in place for the 1987 administrative records data files. The automated quality assurance system integrates micro and mainframe computer technology. Administrative records data are received weekly and processed initially through mainframe QC programs. The mainframe output is transferred to a microcomputer and formatted specifically for importation to a spreadsheet program. Systematic quality verification occurs within the spreadsheet structure, as data review, error detection, and report generation are accomplished automatically. As a result of shifting processes from mainframe to microcomputer environments, the system eases the burden on the programming staff, increases the flexibility of the analytical staff, reduces processing costs on the mainframe, and provides the comprehensive quality assurance component for administrative records.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198800214584
    Description:

    When we examine postal addresses as they might appear in an administrative file, we discover a complex syntax, a lack of standards, various ambiguities and many errors. Therefore, postal addresses represent a real challenge to any computer system using them. PAAS (Postal Address Analysis System) is currently under development at Statistics Canada and aims to replace an aging routine used throughout the Bureau to decode postal addresses. PAAS will provide a means by which computer applications will obtain the address components, the standardized version of these components and the corresponding Address Search Key (ASK).

    Release date: 1988-12-15

  • Articles and reports: 12-001-X198700114467
    Description:

    Demands for statistics on all aspects of our lives, our society and our economy continue to grow. At the same time statistical agencies share with many respondents a growing concern over the mounting burden of response to surveys. One result of the search for alternative methods of satisfying statistical demands has been an increased emphasis on the use of administrative records for statistical purposes. This paper reviews recent experience at Statistics Canada in this area and discusses obstacles to the greater use of administrative records. Approaches to rendering administrative systems more useful for statistical purposes are reviewed, together with some important concerns related to information protection and record linkage.

    Release date: 1987-06-15

Reference (40) (25 of 40 results)

  • Technical products: 11-522-X201700014742
    Description:

    This paper describes the Quick Match System (QMS), an in-house application designed to match business microdata records, and the methods used to link the United States Patent and Trademark Office (USPTO) dataset to Statistics Canada’s Business Register (BR) for the period from 2000 to 2011. The paper illustrates the record-linkage framework and outlines the techniques used to prepare and classify each record and evaluate the match results. The USPTO dataset consisted of 41,619 U.S. patents granted to 14,162 distinct Canadian entities. The record-linkage process matched the names, city, province and postal codes of the patent assignees in the USPTO dataset with those of businesses in the January editions of the Generic Survey Universe File (GSUF) from the BR for the same reference period. As the vast majority of individual patent assignees are not engaged in commercial activity to provide taxable property or services, they tend not to appear in the BR. The relatively poor match rate of 24.5% among individuals, compared to 84.7% among institutions, reflects this tendency. Although the 8,844 individual patent assignees outnumbered the 5,318 institutions, the institutions accounted for 73.0% of the patents, compared to 27.0% held by individuals. Consequently, this study and its conclusions focus primarily on institutional patent assignees. The linkage of the USPTO institutions to the BR is significant because it provides access to business micro-level data on firm characteristics, employment, revenue, assets and liabilities. In addition, the retrieval of robust administrative identifiers enables subsequent linkage to other survey and administrative data sources. The integrated dataset will support direct and comparative analytical studies on the performance of Canadian institutions that obtained patents in the United States between 2000 and 2011.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014748
    Description:

    This paper describes the creation of a database developed in Switzerland to analyze migration and the structural integration of the foreign national population. The database is created from various registers (register of residents, social insurance, unemployment) and surveys, and covers 15 years (1998 to 2013). Information on migration status and socioeconomic characteristics is also available for nearly 4 million foreign nationals who lived in Switzerland between 1998 and 2013. This database is the result of a collaboration between the Federal Statistics Office and researchers from the National Center of Competence in Research (NCCR)–On the Move.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014747
    Description:

    The Longitudinal Immigration Database (IMDB) combines the Immigrant Landing File (ILF) with annual tax files. This record linkage is performed using a tax filer database. The ILF includes all immigrants who have landed in Canada since 1980. In looking to enhance the IMDB, the possibility of adding temporary residents (TR) and immigrants who landed between 1952 and 1979 (PRE80) was studied. Adding this information would give a more complete picture of the immigrant population living in Canada. To integrate the TR and PRE80 files into the IMDB, record linkages between these two files and the tax filer database were performed. This exercise was challenging in part due to the presence of duplicates in the files and conflicting links between the different record linkages.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014741
    Description:

    Statistics Canada’s mandate includes producing statistical data to shed light on current business issues. The linking of business records is an important aspect of the development, production, evaluation and analysis of these statistical data. As record linkage can intrude on one’s privacy, Statistics Canada uses it only when the public good is clear and outweighs the intrusion. Record linkage is experiencing a revival triggered by a greater use of administrative data in many statistical programs. There are many challenges to business record linkage. For example, many administrative files do not have common identifiers, information is recorded in non-standardized formats, information contains typographical errors, administrative data files are usually large in size, and finally the evaluation of multiple record pairings makes absolute comparison impractical and sometimes impossible. Due to the importance and challenges associated with record linkage, Statistics Canada has been developing a record linkage standard to help users optimize their business record linkage process. For example, this process includes building on a record linkage blocking strategy that reduces the number of record pairs to compare and match, making use of Statistics Canada’s internal software to conduct deterministic and probabilistic matching, and creating standard business name and address fields on Statistics Canada’s Business Register. This article gives an overview of the business record linkage methodology and looks at various economic projects that use record linkage at Statistics Canada; these include projects in the National Accounts, International Trade, Agriculture and the Business Register.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014750
    Description:

    The Educational Master File (EMF) system was built to allow the analysis of educational programs in Canada. At the core of the system are administrative files that record all of the registrations to post-secondary and apprenticeship programs in Canada. New administrative files become available on an annual basis. Once a new file becomes available, a first round of processing is performed, which includes linkage to other administrative records. This linkage yields information that can improve the quality of the file, allows further linkages to other data describing labour market outcomes, and is the first step in adding the file to the EMF. Once part of the EMF, information from the file can be included in cross-sectional and longitudinal projects, to study academic pathways and labour market outcomes after graduation. The EMF currently consists of data from 2005 to 2013, but it evolves as new data become available. This paper gives an overview of the mechanisms used to build the EMF, with focus on the structure of the final system and some of its analytical potential.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014739
    Description:

    Vital statistics datasets such as the Canadian Mortality Database lack identifiers for certain populations of interest such as First Nations, Métis and Inuit. Record linkage between vital statistics and survey or other administrative datasets can circumvent this limitation. This paper describes a linkage between the Canadian Mortality Database and the 2006 Census of Population and the planned analysis using the linked data.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014755
    Description:

    The National Children’s Study Vanguard Study was a pilot epidemiological cohort study of children and their parents. Measures were to be taken from pre-pregnancy until adulthood. The use of extant data was planned to supplement direct data collection from the respondents. Our paper outlines a strategy for cataloging and evaluating extant data sources for use with large-scale longitudinal studies. Through our review we selected five evaluation factors to guide a researcher through available data sources: 1) relevance, 2) timeliness, 3) spatiality, 4) accessibility, and 5) accuracy.

    Release date: 2016-03-24

  • Technical products: 11-522-X201300014268
    Description:

    Information collection is critical for chronic-disease surveillance to measure the scope of diseases, assess the use of services, identify at-risk groups and track the course of diseases and risk factors over time with the goal of planning and implementing public-health programs for disease prevention. It is in this context that the Quebec Integrated Chronic Disease Surveillance System (QICDSS) was established. The QICDSS is a database created by linking administrative files covering the period from 1996 to 2013. It is an attractive alternative to survey data, since it covers the entire population, is not affected by recall bias and can track the population over time and space. In this presentation, we describe the relevance of using administrative data as an alternative to survey data, the methods selected to build the population cohort by linking various sources of raw data, and the processing applied to minimize bias. We will also discuss the advantages and limitations associated with the analysis of administrative files.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014274
    Description:

    What is big data? Can it replace and/or supplement official surveys? What are some of the challenges associated with utilizing big data for official statistics? What are some of the possible solutions? Last fall, Statistics Canada invested in a Big Data Pilot project to answer some of these questions. This was the first business survey project of its kind. This paper will cover some of the lessons learned from the Big Data Pilot Project using Smart Meter Data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014264
    Description:

    While wetlands represent only 6.4% of the world’s surface area, they are essential to the survival of terrestrial species. These ecosystems require special attention in Canada, since that is where nearly 25% of the world’s wetlands are found. Environment Canada (EC) has massive databases that contain all kinds of wetland information from various sources. Before the information in these databases could be used for any environmental initiative, it had to be classified and its quality had to be assessed. In this paper, we will give an overview of the joint pilot project carried out by EC and Statistics Canada to assess the quality of the information contained in these databases, which has characteristics specific to big data, administrative data and survey data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014272
    Description:

    Two converging trends raise questions about the future of large-scale probability surveys conducted by or for National Statistical Institutes (NSIs). First, increasing costs and rising rates of nonresponse potentially threaten the cost-effectiveness and inferential value of surveys. Second, there is growing interest in Big Data as a replacement for surveys. There are many different types of Big Data, but the primary focus here is on data generated through social media. This paper supplements and updates an earlier paper on the topic (Couper, 2013). I review some of the concerns about Big Data, particularly from the survey perspective. I argue that there is a role for both high-quality surveys and big data analytics in the work of NSIs. While Big Data is unlikely to replace high-quality surveys, I believe the two methods can serve complementary functions. I attempt to identify some of the criteria that need to be met, and questions that need to be answered, before Big Data can be used for reliable population-based inference.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014284
    Description:

    The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014271
    Description:

    The purpose of this paper is to present the use of administrative records in the U.S. Census for Group Quarters, known elsewhere as collective dwellings. Group Quarters enumeration involves collecting data from such hard-to-access places as correctional facilities, skilled nursing facilities, and military barracks. We discuss benefits and constraints of using various sources of administrative records in constructing the Group Quarters frame for coverage improvement. This paper is a companion to the paper by Chun and Gan (2014), discussing the potential uses of administrative records in the Group Quarters enumeration.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014273
    Description:

    More and more data are being produced by an increasing number of electronic devices physically surrounding us and on the internet. The large amount of data and the high frequency at which they are produced have resulted in the introduction of the term ‘Big Data’. Because these data reflect many different aspects of our daily lives, and because of their abundance and availability, Big Data sources are very interesting from an official statistics point of view. However, first experiences obtained with analyses of large amounts of Dutch traffic loop detection records, call detail records of mobile phones and Dutch social media messages reveal that a number of challenges need to be addressed to enable the application of these data sources for official statistics. These challenges and the lessons learned during these initial studies will be addressed and illustrated by examples. More specifically, the following topics are discussed: the three general types of Big Data discerned, the need to access and analyse large amounts of data, how we deal with noisy data and look at selectivity (and our own bias towards this topic), how to go beyond correlation, how we found people with the right skills and mind-set to perform the work, and how we have dealt with privacy and security issues.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014283
    Description:

    The MIAD project of the Statistical Network aims at developing methodologies for an integrated use of administrative data (AD) in the statistical process. MIAD's main target is to provide guidelines for exploiting AD for statistical purposes. In particular, a quality framework has been developed, a mapping of possible uses has been provided and a schema of alternative informative contexts is proposed. This paper focuses on this latter aspect. In particular, we distinguish between dimensions that relate to features of the source connected with accessibility and characteristics that are connected to the AD structure and their relationships with the statistical concepts. We call the first class of features the framework for access and the second class of features the data framework. In this paper we mainly concentrate on the second class of characteristics, which relate specifically to the kind of information that can be obtained from the secondary source. In particular, these features relate to the target administrative population and measurement on this population and how it is (or may be) connected with the target population and target statistical concepts.

    Release date: 2014-10-31

  • Technical products: 11-522-X200800011004
    Description:

    The issue of reducing the response burden is not new. Statistics Sweden works in different ways to reduce response burden and to decrease the administrative costs of data collection from enterprises and organizations. According to legislation, Statistics Sweden must reduce response burden for the business community. Therefore, this work is a priority. The Government has set a fixed target of decreasing the administrative costs of enterprises by twenty-five percent by 2010. This goal also applies to data collection for statistical purposes. The goal concerns surveys for which response is compulsory under legislation. In addition to these surveys, there are many more surveys and a need to measure and reduce the burden from these surveys as well. In order to help measure, analyze and reduce the burden, Statistics Sweden has developed the Register of Data providers concerning enterprises and organizations (ULR). The purpose of the register is twofold: to measure and analyze the burden on an aggregated level, and to be able to inform each individual enterprise which surveys it is participating in.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011010
    Description:

    The Survey of Employment, Payrolls and Hours (SEPH) is a monthly survey using two sources of data: a census of payroll deduction (PD7) forms (administrative data) and a survey of business establishments. This paper focuses on the processing of the administrative data, from the weekly receipt of data from the Canada Revenue Agency to the production of SEPH's monthly estimates.

    The edit and imputation methods used to process the administrative data have been revised in the last several years. The goals of this redesign were primarily to improve data quality and to increase consistency with another administrative data source (T4), which is a benchmark measure for Statistics Canada's System of National Accounts people. An additional goal was to ensure that the new process would be easier to understand and to modify, if needed. As a result, a new processing module was developed to edit and impute PD7 forms before their data is aggregated to the monthly level.

    This paper presents an overview of both the current and new processes, including a description of challenges that we faced during development. Improved quality is demonstrated both conceptually (by presenting examples of PD7 forms and their treatment under the old and new systems) and quantitatively (by comparison to T4 data).

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011011
    Description:

    The Federation of Canadian Municipalities' (FCM) Quality of Life Reporting System (QOLRS) is a means of measuring, monitoring and reporting on the quality of life in Canadian municipalities. To address the challenge of administrative data collection across member municipalities, the QOLRS technical team collaborated on the development of the Municipal Data Collection Tool (MDCT), which has become a key component of the QOLRS data acquisition methodology. Offered as a case study on administrative data collection, this paper argues that the recent launch of the MDCT has enabled the FCM to access reliable pan-Canadian municipal administrative data for the QOLRS.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011009
    Description:

    The National Routing System is a multi-jurisdictional effort to improve the collection and validation of birth and death information from provincial vital event registries. Instead of waiting to send batch files at various points during the year, provinces send individual records as each event is registered. Timeliness is further enhanced by the adoption of data and technical standards. Data users no longer have to deal with multiple data formats and transfer media when compiling data from multiple sources. Similarly, data providers need to perform only a one-time transformation of their data to satisfy multiple clients.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011012
    Description:

    Justice surveys represent a unique type of survey undertaken by Statistics Canada. While they all essentially use administrative data, Statistics Canada has had considerable input into the type of data that is collected as well as quality assurance methods guiding the collection of this data. This is true in the areas of policing, courts and corrections. The main crime survey, the Uniform Crime Reporting Survey (UCR), is the focus of this paper and was designed to measure the incidence of crime in Canadian society and its characteristics. The data is collected by the policing community in Canada and transmitted electronically to Statistics Canada. This paper will begin by providing an overview of the survey and its distinctive properties, such as the use of intermediaries (software vendors) that convert data from the police's information systems into the UCR survey format, following nationally defined data requirements. This level of consistency is uncommon for an administrative survey and permits a variety of opportunities for improving the overall data quality and capabilities of the survey. Various methods such as quality indicators and feedback reports are used on a regular basis and frequent two-way communication takes place with the respondents to correct existing data problems and to prevent future ones. We will discuss recent improvements to both the data itself and our collection methods that have enhanced the usability of the survey. Finally, future development of the survey will be discussed including some of the challenges that currently exist as well as those still to come.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010981
    Description:

    One of the main characteristics of the 2001 Spanish Census of the Population was the use of an administrative Register of Population (El Padrón) to pre-print the questionnaires and the enumerators' record books for the census sections. In this paper we present the main characteristics of the relationship between the Population Register and the Census of Population, and the main changes foreseen for the next Census, which will take place in 2011.

    Release date: 2009-12-03

  • Technical products: 11-522-X200600110402
    Description:

    This paper explains how to append census area-level summary data to survey or administrative data. It uses examples from survey datasets present in Statistics Canada Research Data Centres, but the methods also apply to external datasets, including administrative datasets. Four examples illustrate common situations faced by researchers: (1) when the survey (or administrative) and census data both contain the same level of geographic identifiers, coded to the same year standard ("vintage") of census geography (for example, if both have 2001 DA); (2) when the two files contain geographic identifiers of the same vintage, but at different levels of census geography (for example, 1996 EA in the survey, but 1996 CT in the census data); (3) when the two files contain data coded to different vintages of census geography (such as 1996 EA for the survey, but 2001 DA for the census); (4) when the survey data are lacking in geographic identifiers, and those identifiers must first be generated from postal codes present on the file. The examples are shown using SAS syntax, but the principles apply to other programming languages or statistical packages.

    Release date: 2008-03-17
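
    The first two situations described in the abstract above can be sketched in Python rather than SAS. This is a minimal illustration under stated assumptions, not the paper's code: the function names, field names, area codes and values are all hypothetical, and a lower-to-higher level correspondence table is assumed to be available.

```python
# Hypothetical sketch: append census area-level summaries to survey records.
# All names, area codes and values are invented for illustration only.

def append_area_data(records, area_summaries, key):
    """Left-join area-level summaries onto records by a shared geographic code."""
    merged = []
    for rec in records:
        summary = area_summaries.get(rec[key])  # None when the area is absent
        row = dict(rec)
        row["matched"] = summary is not None    # flag unmatched records
        if summary is not None:
            row.update(summary)
        merged.append(row)
    return merged


def add_higher_level_code(records, lookup, low_key, high_key):
    """Attach a higher-level area code via a lower-to-higher correspondence."""
    return [{**rec, high_key: lookup.get(rec[low_key])} for rec in records]


# Situation 1: both files carry the same level and vintage of geography.
survey = [{"id": 1, "da": "A001"}, {"id": 2, "da": "A002"}, {"id": 3, "da": "A999"}]
census_da = {"A001": {"median_income": 52000}, "A002": {"median_income": 61000}}
case1 = append_area_data(survey, census_da, "da")

# Situation 2: same vintage, but the survey has a lower level (EA) than the
# census summaries (CT), so the CT code is derived first, then joined.
survey_ea = [{"id": 1, "ea": "E10"}, {"id": 2, "ea": "E11"}]
ea_to_ct = {"E10": "T1", "E11": "T1"}
census_ct = {"T1": {"pct_renters": 34.5}}
with_ct = add_higher_level_code(survey_ea, ea_to_ct, "ea", "ct")
case2 = append_area_data(with_ct, census_ct, "ct")
```

    Keeping every survey record and flagging unmatched ones, rather than dropping them, makes it easy to check how many records failed to link before any analysis is run.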

  • Technical products: 11-522-X200600110403
    Description:

    This paper reports research to introduce model-assisted estimation into the American Community Survey (ACS), a large-scale ongoing survey intended to replace the long-form sample in the U.S. decennial censuses. The proposed application integrates information from administrative records into ACS estimation. The approach to model-assisted estimation restricts the use of the administrative records to adjustments to the survey weights, while retaining the data on characteristics reported by respondents in the ACS. Although the ACS is a general-purpose survey not specifically tied to health, this case study may suggest possible methodological applications in areas of health statistics.

    Release date: 2008-03-17

  • Technical products: 11-522-X200600110449
    Description:

    Traditionally, administrative hospital discharge databases have been used mainly for administrative purposes. Recently, health services researchers and population health researchers have been using the databases for a wide variety of studies, in particular studies of health care outcomes. Tools such as comorbidity indexes have been developed to facilitate this analysis. Every time the coding system for diagnoses and procedures is revised or a new one is developed, these comorbidity indexes need to be updated. These updates are important in maintaining consistency when trends are examined over time.

    Release date: 2008-03-17

  • Technical products: 11-522-X200600110401
    Description:

    The Australian Bureau of Statistics (ABS) will begin the formation of a Statistical Longitudinal Census Data Set (SLCD) by choosing a 5% sample of people from the 2006 population census to be linked probabilistically with subsequent censuses. A long-term aim is to use the power of the rich longitudinal demographic data provided by the SLCD to shed light on a variety of issues which cannot be addressed using cross-sectional data. The SLCD may be further enhanced by probabilistically linking it with births, deaths, immigration settlements or disease registers. This paper gives a brief description of recent developments in data linking at the ABS, outlines the data linking methodology and quality measures we have considered and summarises preliminary results using Census Dress Rehearsal data.

    Release date: 2008-03-17
