Statistics by subject – Quality assurance

Filter results by

Help for filters and search
Currently selected filters that can be removed

Keyword(s)

Survey or statistical program

2 facets displayed. 0 facets selected.

Content

1 facet displayed. 0 facets selected.

Other available resources to support your research.

Browse our central repository of key standard concepts, definitions, data sources and methods.
All (217) (25 of 217 results)

  • Index and guides: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2017-04-21

  • Technical products: 12-586-X
    Description:

    The Quality Assurance Framework (QAF) serves as the highest-level governance tool for quality management at Statistics Canada. The QAF gives an overview of the quality management and risk mitigation strategies used by the Agency’s program areas. The QAF is used in conjunction with Statistics Canada management practices, such as those described in the Quality Guidelines.

    Release date: 2017-04-21

  • Technical products: 11-522-X201700014758
    Description:

    Several Canadian jurisdictions including Ontario are using patient-based healthcare data in their funding models. These initiatives can influence the quality of this data both positively and negatively as people tend to pay more attention to the data and its quality when financial decisions are based upon it. Ontario’s funding formula uses data from several national databases housed at the Canadian Institute for Health Information (CIHI). These databases provide information on patient activity and clinical status across the continuum of care. As funding models may influence coding behaviour, CIHI is collaborating with the Ontario Ministry of Health and Long-Term Care to assess and monitor the quality of this data. CIHI is using data mining software and modelling techniques (that are often associated with “big data”) to identify data anomalies across multiple factors. The models identify what the “typical” clinical coding patterns are for key patient groups (for example, patients seen in special care units or discharged to home care), so that outliers can be identified, where patients do not fit the expected pattern. A key component of the modelling is segmenting the data based on patient, provider and hospital characteristics to take into account key differences in the delivery of health care and patient populations across the province. CIHI’s analysis identified several hospitals with coding practices that appear to be changing or significantly different from their peer group. Further investigation is required to understand why these differences exist and to develop appropriate strategies to mitigate variations.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014723
    Description:

    The U.S. Census Bureau is researching uses of administrative records in survey and decennial operations in order to reduce costs and respondent burden while preserving data quality. One potential use of administrative records is to utilize the data when race and Hispanic origin responses are missing. When federal and third party administrative records are compiled, race and Hispanic origin responses are not always the same for an individual across different administrative records sources. We explore different sets of business rules used to assign one race and one Hispanic response when these responses are discrepant across sources. We also describe the characteristics of individuals with matching, non-matching, and missing race and Hispanic origin data across several demographic, household, and contextual variables. We find that minorities, especially Hispanics, are more likely to have non-matching Hispanic origin and race responses in administrative records than in the 2010 Census. Hispanics are less likely to have missing Hispanic origin data but more likely to have missing race data in administrative records. Non-Hispanic Asians and non-Hispanic Pacific Islanders are more likely to have missing race and Hispanic origin data in administrative records. Younger individuals, renters, individuals living in households with two or more people, individuals who responded to the census in the nonresponse follow-up operation, and individuals residing in urban areas are more likely to have non-matching race and Hispanic origin responses.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014722
    Description:

    The U.S. Census Bureau is researching ways to incorporate administrative data in decennial census and survey operations. Critical to this work is an understanding of the coverage of the population by administrative records. Using federal and third party administrative data linked to the American Community Survey (ACS), we evaluate the extent to which administrative records provide data on foreign-born individuals in the ACS and employ multinomial logistic regression techniques to evaluate characteristics of those who are in administrative records relative to those who are not. We find that overall, administrative records provide high coverage of foreign-born individuals in our sample for whom a match can be determined. The odds of being in administrative records are found to be tied to the processes of immigrant assimilation – naturalization, higher English proficiency, educational attainment, and full-time employment are associated with greater odds of being in administrative records. These findings suggest that as immigrants adapt and integrate into U.S. society, they are more likely to be involved in government and commercial processes and programs for which we are including data. We further explore administrative records coverage for the two largest race/ethnic groups in our sample – Hispanic and non-Hispanic single-race Asian foreign born, finding again that characteristics related to assimilation are associated with administrative records coverage for both groups. However, we observe that neighborhood context impacts Hispanics and Asians differently.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014717
    Description:

    Files that link data from Statistics Canada’s Postsecondary Student Information System (PSIS) with tax data can be used to examine the trajectories of students who pursue postsecondary education (PSE) programs and their post-schooling labour market outcomes. On one hand, administrative data on students linked longitudinally can provide aggregate information on student pathways during postsecondary studies, such as persistence rates, graduation rates and mobility. On the other hand, the tax data can supplement the PSIS data with information on employment outcomes, such as average and median earnings or earnings progress by employment sector (industry), field of study, education level and/or other demographic information, year over year after graduation. Two longitudinal pilot studies have been done using administrative data on postsecondary students of Maritimes institutions, longitudinally linked and linked to Statistics Canada tax data (the T1 Family File) for the relevant years. This article first focuses on the quality of information in the administrative data and the methodology used to conduct these longitudinal studies and derive indicators. Second, it focuses on some limitations of using administrative data, rather than a survey, to define certain concepts.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014716
    Description:

    Administrative data, depending on its source and original purpose, can be considered a more reliable source of information than survey-collected data. It does not require a respondent to be present and understand question wording, and it is not limited by the respondent’s ability to recall events retrospectively. This paper compares selected survey data, such as demographic variables, from the Longitudinal and International Study of Adults (LISA) to various administrative sources for which LISA has linkage agreements in place. The agreement between data sources, and some factors that might affect it, are analyzed for various aspects of the survey.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014726
    Description:

    Internal migration is one of the components of population growth estimated at Statistics Canada. It is estimated by comparing individuals’ addresses at the beginning and end of a given period. The Canada Child Tax Benefit and T1 Family File are the primary data sources used. Address quality and coverage of more mobile subpopulations are crucial to producing high-quality estimates. The purpose of this article is to present the results of evaluations of these elements using access to more tax data sources at Statistics Canada.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014725
    Description:

    Tax data are increasingly being used to measure and analyze the population and its characteristics. One of the issues raised by the growing use of this type of data relates to the definition of the concept of place of residence. While the census uses the traditional concept of place of residence, tax data provide information based on the mailing address of tax filers. Using record linkage between the census, the National Household Survey and tax data from the T1 Family File, this study examines the level of consistency of the place of residence between these two sources and its associated characteristics.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014711
    Description:

    After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014743
    Description:

    Probabilistic linkage is susceptible to linkage errors such as false positives and false negatives. In many cases, these errors may be reliably measured through clerical-reviews, i.e. the visual inspection of a sample of record pairs to determine if they are matched. A framework is described to effectively carry-out such clerical-reviews based on a probabilistic sample of pairs, repeated independent reviews of the same pairs and latent class analysis to account for clerical errors.

    Release date: 2016-03-24
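The clerical-review framework in the abstract above can be sketched in miniature. The snippet below stands in for the latent class analysis with a simple majority vote over repeated independent reviews of each sampled pair; the function name and all verdict data are invented for illustration:

```python
# Hypothetical sketch: estimating the false-positive rate of a record
# linkage from repeated clerical reviews of a probabilistic sample of
# linked pairs. Majority vote is a simplification of the latent class
# analysis the paper describes.
from collections import Counter

def false_positive_rate(reviews):
    """reviews: one verdict list per sampled pair, e.g. [[1, 1, 0], ...],
    where 1 means a reviewer judged the pair a true match."""
    false_pos = 0
    for verdicts in reviews:
        majority = Counter(verdicts).most_common(1)[0][0]
        if majority == 0:          # consensus: the linked pair is not a match
            false_pos += 1
    return false_pos / len(reviews)

sample = [[1, 1, 1], [1, 0, 1], [0, 0, 1], [0, 0, 0]]
print(false_positive_rate(sample))  # 2 of 4 sampled pairs judged false -> 0.5
```

A production version would weight each sampled pair by its sampling probability and model reviewer error rates explicitly, which is what the latent class approach adds.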

  • Technical products: 11-522-X201700014724
    Description:

    At the Institut national de santé publique du Québec, the Quebec Integrated Chronic Disease Surveillance System (QICDSS) has been used daily for approximately four years. The benefits of this system are numerous for measuring the extent of diseases more accurately, evaluating the use of health services properly and identifying certain groups at risk. However, in recent months, various problems have arisen that have required a great deal of careful thought. The problems have affected various areas of activity, such as data linkage, data quality, coordinating multiple users and meeting legal obligations. The purpose of this presentation is to describe the main challenges associated with using QICDSS data and to present some possible solutions. In particular, this presentation discusses the processing of five data sources that not only come from five different sources, but also are not mainly used for chronic disease surveillance. The varying quality of the data, both across files and within a given file, will also be discussed. Certain situations associated with the simultaneous use of the system by multiple users will also be examined. Examples will be given of analyses of large data sets that have caused problems. As well, a few challenges involving disclosure and the fulfillment of legal agreements will be briefly discussed.

    Release date: 2016-03-24

  • Articles and reports: 82-003-X201600114307
    Description:

    Using the 2012 Aboriginal Peoples Survey, this study examined the psychometric properties of the 10-item Kessler Psychological Distress Scale (a short measure of non-specific psychological distress) for First Nations people living off reserve, Métis, and Inuit aged 15 or older.

    Release date: 2016-01-20

  • Articles and reports: 82-003-X201501214295
    Description:

    Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.

    Release date: 2015-12-16
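As a rough illustration of the incremental cost-effectiveness ratios the study reports: an ICER is the cost difference between two screening strategies divided by the difference in quality-adjusted life years (QALYs). The strategies and figures below are invented, not taken from the simulation model:

```python
# Hypothetical ICER calculation; all costs and QALYs are invented.
def icer(cost_new, qaly_new, cost_ref, qaly_ref):
    """Incremental cost per quality-adjusted life year gained."""
    return (cost_new - cost_ref) / (qaly_new - qaly_ref)

# Invented comparison: a more intensive screening strategy vs. a reference
print(round(icer(cost_new=4800.0, qaly_new=15.25,
                 cost_ref=3200.0, qaly_ref=15.21)))  # 40000 ($ per QALY)
```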

  • Articles and reports: 82-003-X201501114243
    Description:

    A surveillance tool was developed to assess dietary intake collected by surveys in relation to Eating Well with Canada’s Food Guide (CFG). The tool classifies foods in the Canadian Nutrient File (CNF) according to how closely they reflect CFG. This article describes the validation exercise conducted to ensure that CNF foods determined to be “in line with CFG” were appropriately classified.

    Release date: 2015-11-18

  • Articles and reports: 82-003-X201500714205
    Description:

    Discrepancies between self-reported and objectively measured physical activity are well-known. For the purpose of validation, this study compares a new self-reported physical activity questionnaire with an existing one and with accelerometer data.

    Release date: 2015-07-15

  • Technical products: 11-522-X201300014284
    Description:

    The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014267
    Description:

    Statistics Sweden has, like many other National Statistical Institutes (NSIs), a long history of working with quality. More recently, the agency decided to start using a number of frameworks to address organizational, process and product quality. It is important to consider all three levels, since we know that the way we do things, e.g., when asking questions, affects product quality and therefore process quality is an important part of the quality concept. Further, organizational quality, i.e., systematically managing aspects such as training of staff and leadership, is fundamental for achieving process quality. Statistics Sweden uses EFQM (European Foundation for Quality Management) as a framework for organizational quality and ISO 20252 for market, opinion and social research as a standard for process quality. In April 2014, Statistics Sweden became the first National Statistical Institute to be certified according to ISO 20252. One challenge that Statistics Sweden faced in 2011 was to systematically measure and monitor changes in product quality and to clearly present them to stakeholders. Together with external consultants, Paul Biemer and Dennis Trewin, Statistics Sweden developed a tool for this called ASPIRE (A System for Product Improvement, Review and Evaluation). To assure that quality is maintained and improved, Statistics Sweden has also built an organization for quality comprising a quality manager, quality coaches, and internal and external quality auditors. In this paper I will present the components of Statistics Sweden’s quality management system and some of the challenges we have faced.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014264
    Description:

    While wetlands represent only 6.4% of the world’s surface area, they are essential to the survival of terrestrial species. These ecosystems require special attention in Canada, since that is where nearly 25% of the world’s wetlands are found. Environment Canada (EC) has massive databases that contain all kinds of wetland information from various sources. Before the information in these databases could be used for any environmental initiative, it had to be classified and its quality had to be assessed. In this paper, we will give an overview of the joint pilot project carried out by EC and Statistics Canada to assess the quality of the information contained in these databases, which has characteristics specific to big data, administrative data and survey data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014256
    Description:

    The American Community Survey (ACS) added an Internet data collection mode as part of a sequential mode design in 2013. The ACS currently uses a single web application for all Internet respondents, regardless of whether they respond on a personal computer or on a mobile device. As market penetration of mobile devices increases, however, more survey respondents are using tablets and smartphones to take surveys that are designed for personal computers. Using mobile devices to complete these surveys may be more difficult for respondents and this difficulty may translate to reduced data quality if respondents become frustrated or cannot navigate around usability issues. This study uses several indicators to compare data quality across computers, tablets, and smartphones and also compares the demographic characteristics of respondents that use each type of device.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014288
    Description:

    Probability-based surveys, those with samples selected through a known randomization mechanism, are considered by many to be the gold standard in contrast to non-probability samples. Probability sampling theory was first developed in the early 1930s and continues today to justify the estimation of population values from these data. Conversely, studies using non-probability samples have gained attention in recent years but they are not new. Touted as cheaper, faster (even better) than probability designs, these surveys capture participants through various “on the ground” methods (e.g., opt-in web survey). But, which type of survey is better? This paper is the first in a series on the quest for a quality framework under which all surveys, probability- and non-probability-based, may be measured on a more equal footing. First, we highlight a few frameworks currently in use, noting that “better” is almost always relative to a survey’s fit for purpose. Next, we focus on the question of validity, particularly external validity when population estimates are desired. Estimation techniques used to date for non-probability surveys are reviewed, along with a few comparative studies of these estimates against those from a probability-based sample. Finally, the next research steps in the quest are described, followed by a few parting comments.

    Release date: 2014-10-31

  • Articles and reports: 11F0019M2013351
    Description:

    Measures of subjective well-being are increasingly prominent in international policy discussions about how best to measure "societal progress" and the well-being of national populations. This has implications for national statistical offices, as calls have been made for them to include measures of subjective well-being in their household surveys (Organization for Economic Cooperation and Development 2013). Statistics Canada has included measures of subjective well-being - particularly life satisfaction - in its surveys for twenty-five years, although the wording of these questions and the response categories have evolved over time. Statistics Canada's General Social Survey (GSS) and Canadian Community Health Survey (CCHS) offer a valuable opportunity to examine the stability of life satisfaction responses and their correlates from year to year using a consistent analytical framework.

    Release date: 2013-10-11

  • Articles and reports: 82-003-X201300811857
    Description:

    Using data from the Canadian Cancer Registry, vital statistics and population statistics, this study examines the assumption of stable age-standardized sex- and cancer-site-specific incidence-to-mortality rate ratios across regions, which underlies the North American Association of Central Cancer Registries' (NAACCR) completeness of case indicator.

    Release date: 2013-08-21

  • Articles and reports: 82-003-X201300111764
    Description:

    This study compares two sources of information about prescription drug use by people aged 65 or older in Ontario - the Canadian Community Health Survey and the drug claims database of the Ontario Drug Benefit Program. The analysis pertains to cardiovascular and diabetes drugs because they are commonly used, and almost all are prescribed on a regular basis.

    Release date: 2013-01-16

Data (1) (1 result)

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

Analysis (38) (25 of 38 results)

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2017-04-21

  • Articles and reports: 82-003-X201600114307
    Description:

    Using the 2012 Aboriginal Peoples Survey, this study examined the psychometric properties of the 10-item Kessler Psychological Distress Scale (a short measure of non-specific psychological distress) for First Nations people living off reserve, Métis, and Inuit aged 15 or older.

    Release date: 2016-01-20

  • Articles and reports: 82-003-X201501214295
    Description:

    Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.

    Release date: 2015-12-16

  • Articles and reports: 82-003-X201501114243
    Description:

    A surveillance tool was developed to assess dietary intake collected by surveys in relation to Eating Well with Canada’s Food Guide (CFG). The tool classifies foods in the Canadian Nutrient File (CNF) according to how closely they reflect CFG. This article describes the validation exercise conducted to ensure that CNF foods determined to be “in line with CFG” were appropriately classified.

    Release date: 2015-11-18

  • Articles and reports: 82-003-X201500714205
    Description:

    Discrepancies between self-reported and objectively measured physical activity are well-known. For the purpose of validation, this study compares a new self-reported physical activity questionnaire with an existing one and with accelerometer data.

    Release date: 2015-07-15

  • Articles and reports: 11F0019M2013351
    Description:

    Measures of subjective well-being are increasingly prominent in international policy discussions about how best to measure "societal progress" and the well-being of national populations. This has implications for national statistical offices, as calls have been made for them to include measures of subjective well-being in their household surveys (Organization for Economic Cooperation and Development 2013). Statistics Canada has included measures of subjective well-being - particularly life satisfaction - in its surveys for twenty-five years, although the wording of these questions and the response categories have evolved over time. Statistics Canada's General Social Survey (GSS) and Canadian Community Health Survey (CCHS) offer a valuable opportunity to examine the stability of life satisfaction responses and their correlates from year to year using a consistent analytical framework.

    Release date: 2013-10-11

  • Articles and reports: 82-003-X201300811857
    Description:

    Using data from the Canadian Cancer Registry, vital statistics and population statistics, this study examines the assumption of stable age-standardized sex- and cancer-site-specific incidence-to-mortality rate ratios across regions, which underlies the North American Association of Central Cancer Registries' (NAACCR) completeness of case indicator.

    Release date: 2013-08-21

  • Articles and reports: 82-003-X201300111764
    Description:

    This study compares two sources of information about prescription drug use by people aged 65 or older in Ontario - the Canadian Community Health Survey and the drug claims database of the Ontario Drug Benefit Program. The analysis pertains to cardiovascular and diabetes drugs because they are commonly used, and almost all are prescribed on a regular basis.

    Release date: 2013-01-16

  • Articles and reports: 12-001-X201200211751
    Description:

    Survey quality is a multi-faceted concept that originates from two different development paths. One path is the total survey error paradigm that rests on four pillars providing principles that guide survey design, survey implementation, survey evaluation, and survey data analysis. We should design surveys so that the mean squared error of an estimate is minimized given budget and other constraints. It is important to take all known error sources into account, to monitor major error sources during implementation, to periodically evaluate major error sources and combinations of these sources after the survey is completed, and to study the effects of errors on the survey analysis. In this context survey quality can be measured by the mean squared error and controlled by observations made during implementation and improved by evaluation studies. The paradigm has both strengths and weaknesses. One strength is that research can be defined by error sources and one weakness is that most total survey error assessments are incomplete in the sense that it is not possible to include the effects of all the error sources. The second path is influenced by ideas from the quality management sciences. These sciences concern business excellence in providing products and services with a focus on customers and competition from other providers. These ideas have had a great influence on many statistical organizations. One effect is the acceptance among data providers that product quality cannot be achieved without a sufficient underlying process quality and process quality cannot be achieved without a good organizational quality. These levels can be controlled and evaluated by service level agreements, customer surveys, paradata analysis using statistical process control, and organizational assessment using business excellence models or other sets of criteria. All levels can be improved by conducting improvement projects chosen by means of priority functions. 
The ultimate goal of improvement projects is that the processes involved should gradually approach a state where they are error-free. Of course, this might be an unattainable goal, albeit one to strive for. It is not realistic to hope for continuous measurements of the total survey error using the mean squared error. Instead one can hope that continuous quality improvement using management science ideas and statistical methods can minimize biases and other survey process problems so that the variance becomes an approximation of the mean squared error. If that can be achieved we have made the two development paths approximately coincide.

    Release date: 2012-12-19
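The two development paths in the abstract above meet in the mean squared error, which decomposes into variance plus squared bias. The toy numbers below (invented) illustrate why driving biases toward zero makes the variance an approximation of the MSE:

```python
# MSE decomposition from the total survey error paradigm:
# MSE = variance + bias**2. Numbers are invented for illustration.
def mse(variance, bias):
    return variance + bias ** 2

print(mse(variance=4.0, bias=3.0))  # 13.0 -- bias dominates the error
# If quality improvement work eliminates the bias, the variance alone
# approximates the MSE, which is the convergence the article describes.
print(mse(variance=4.0, bias=0.0))  # 4.0
```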

  • Articles and reports: 82-003-X201200111625
    Description:

    This study compares estimates of the prevalence of cigarette smoking based on self-report with estimates based on urinary cotinine concentrations. The data are from the 2007 to 2009 Canadian Health Measures Survey, which included self-reported smoking status and the first nationally representative measures of urinary cotinine.

    Release date: 2012-02-15

  • Articles and reports: 82-003-X201100211437
    Description:

    This article examines the internal consistency of the English and French versions of the Medical Outcomes Study social support scale for a sample of older adults. The second objective is to conduct a confirmatory factor analysis to assess the factor structure of the English and French versions of the scale. A third purpose is to determine if the items comprising the scale operate in the same way for English- and French-speaking respondents.

    Release date: 2011-05-18

  • Articles and reports: 75-001-X200510613145
    Description:

    Changes in hours worked normally track employment changes very closely. Recently, however, employment has increased more than hours, resulting in an unprecedented gap. In effect, the average annual hours worked have decreased by the equivalent of two weeks. Many factors can affect the hours worked. Some are structural or cyclical - population aging, industrial shifts, the business cycle, natural disasters, legislative changes or personal preferences. Others are a result of the survey methodology. How have the various factors contributed to the recent drop in hours of work?

    Release date: 2005-09-21

  • Articles and reports: 12-001-X20050018085
    Description:

    Record linkage is a process of pairing records from two files and trying to select the pairs that belong to the same entity. The basic framework uses a match weight to measure the likelihood of a correct match and a decision rule to assign record pairs as "true" or "false" match pairs. Weight thresholds for selecting a record pair as matched or unmatched depend on the desired control over linkage errors. Current methods to determine the selection thresholds and estimate linkage errors can provide divergent results, depending on the type of linkage error and the approach to linkage. This paper presents a case study that uses existing linkage methods to link record pairs but a new simulation approach (SimRate) to help determine selection thresholds and estimate linkage errors. SimRate uses the observed distribution of data in matched and unmatched pairs to generate a large simulated set of record pairs, assigns a match weight to each pair based on specified match rules, and uses the weight curves of the simulated pairs for error estimation.

    Release date: 2005-07-21

  • Articles and reports: 12-001-X20050018083
    Description:

    The advent of computerized record linkage methodology has facilitated the conduct of cohort mortality studies in which exposure data in one database are electronically linked with mortality data from another database. This, however, introduces linkage errors due to mismatching an individual from one database with a different individual from the other database. In this article, the impact of linkage errors on estimates of epidemiological indicators of risk, such as standardized mortality ratios and relative risk regression model parameters, is explored. It is shown that the observed and expected numbers of deaths are affected in opposite directions and, as a result, these indicators can be subject to bias and additional variability in the presence of linkage errors.
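
A minimal numeric illustration of the effect on a standardized mortality ratio (SMR = observed/expected deaths). The figures are invented, and only the false-negative side is shown: missed links remove real deaths from the observed count and bias the SMR toward the null.

```python
# Hypothetical cohort: 120 true deaths against 100 expected deaths
# from reference-population rates, so the true SMR is 1.20.
true_observed = 120.0
expected_deaths = 100.0
true_smr = true_observed / expected_deaths

# Missed links (false negatives) drop real deaths from the observed
# count; with a 10% miss rate the estimated SMR is biased downward.
miss_rate = 0.10
linked_observed = true_observed * (1 - miss_rate)
biased_smr = linked_observed / expected_deaths  # 1.08 instead of 1.20
```

False positives push the other way, which is why the net bias depends on the balance of the two error types, as the article examines.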

    Release date: 2005-07-21

  • Articles and reports: 91F0015M2005007
    Description:

    The Population Estimates Program at Statistics Canada uses internal migration estimates derived from administrative data sources. Two versions of migration estimates are currently available: preliminary (P), based on Child Tax Benefit information, and final (F), produced using information from income tax reports. For some reference dates they can differ significantly. This paper summarises research undertaken in Demography Division to modify the current method for preliminary estimates in order to reduce those differences. After a brief analysis of the differences, six methods are tested: 1) regression of out-migration; 2) regression of in- and out-migration separately; 3) regression of net migration; 4) the exponentially weighted moving average; 5) the U.S. Census Bureau approach; and 6) first-difference regression. The method in which final and preliminary migration data are combined to estimate preliminary net migration (Method 3) appears to be the best approach for improving convergence between preliminary and final estimates of internal migration for the Population Estimates Program. This approach "smooths" some erratic patterns displayed by the former method while preserving the ability of the Child Tax Benefit (CTB) data to capture current shifts in migration patterns.
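
The regression-based methods above amount to calibrating preliminary figures against final figures from past reference periods. A minimal sketch, assuming a simple one-variable least-squares fit; the data values are invented and the actual methods use more elaborate specifications.

```python
def fit_ols(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

# Past periods: preliminary (CTB-based) vs. final (tax-based) net migration.
prelim = [1200.0, 1500.0, 900.0]
final = [1300.0, 1600.0, 1000.0]
intercept, slope = fit_ols(prelim, final)

# Adjusted preliminary estimate for a new reference period.
new_prelim = 1100.0
adjusted = intercept + slope * new_prelim
```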

    Release date: 2005-06-20

  • Articles and reports: 12-001-X20040027750
    Description:

    Intelligent Character Recognition (ICR) has been widely used as a new technology in data capture processing. It was used for the first time at Statistics Canada to process the 2001 Canadian Census of Agriculture. This involved many new challenges, both operational and methodological. This paper presents an overview of the methodological tools used to put in place an efficient ICR system. Since the potential for high levels of error existed at various stages of the operation, Quality Assurance (QA) and Quality Control (QC) methods and procedures were built into this operation to ensure a high degree of accuracy in the captured data. This paper describes these QA / QC methods along with their results and shows how quality improvements were achieved in the ICR Data Capture operation. This paper also identifies the positive impacts of these procedures on this operation.

    Release date: 2005-02-03

  • Articles and reports: 11F0019M2004219
    Description:

    This study investigates trends in family income inequality in the 1980s and 1990s, with particular attention paid to the recovery period of the 1990s.

    Release date: 2004-12-16

  • Articles and reports: 91F0015M2004006
    Description:

    The paper assesses and compares new and old methodologies for official estimates of migration within and among provinces and territories for the period 1996/97 to 2000/01.

    Release date: 2004-06-17

  • Articles and reports: 81-595-M2003009
    Description:

    This paper examines how the Canadian Adult Education and Training Survey (AETS) can be used to study participation in and impacts of education and training activities for adults.

    Release date: 2003-10-15

  • Articles and reports: 12-001-X20030016610
    Description:

    In the presence of item nonresponse, unweighted imputation methods are often used in practice, but they generally lead to biased estimators under uniform response within imputation classes. Following Skinner and Rao (2002), we propose a bias-adjusted estimator of a population mean under unweighted ratio imputation and random hot-deck imputation and derive linearization variance estimators. A small simulation study is conducted to study the performance of the methods in terms of bias and mean square error. Relative bias and relative stability of the variance estimators are also studied.
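
Random hot-deck imputation, as discussed above, fills each missing value with a donor drawn at random from the observed respondents. A minimal unweighted sketch within a single imputation class; real implementations draw donors within classes, and the naive mean of the completed data is the estimator whose bias the paper's adjustment targets.

```python
import random

def random_hot_deck(values, seed=12345):
    """Replace missing entries (None) with donor values drawn at random,
    with replacement, from the observed respondents."""
    rng = random.Random(seed)
    donors = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(donors) for v in values]
```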

    Release date: 2003-07-31

  • Articles and reports: 12-001-X20020026423
    Description:

    The reputation of a national statistical office (NSO) depends very much on the quality of the service it provides. Quality has to be a core value: providing a high quality service has to be the natural way of doing business. It has to be embedded in the culture of the NSO.

    The paper will outline what is meant by a high quality statistical service. It will also explore those factors that are important to ensuring a quality culture in an NSO. In particular, it will outline the activities and experiences of the Australian Bureau of Statistics in maintaining a quality culture.

    Release date: 2003-01-29

  • Articles and reports: 11F0019M2002181
    Description:

    We use data from the Canadian National Longitudinal Survey of Children and Youth to address two questions. To what extent do parents and children agree when asked identical questions about child well-being? To what extent do differences in their responses affect what one infers from multivariate analysis of the data? The correspondence between parent and child in the assessment of child well-being is only slight to fair. Agreement is stronger for more observable outcomes, such as schooling performance, and weaker for less observable outcomes, such as emotional disorders. We regress both sets of responses on a standard set of socio-economic characteristics. We also conduct formal and informal tests of the differences in what one would infer from these two sets of regressions.

    Release date: 2002-10-23

  • Articles and reports: 11F0019M2001166
    Description:

    This study assesses two potential problems with respect to the reporting of Employment Insurance (EI) and Social Assistance (SA) benefits in the Survey of Labour and Income Dynamics (SLID): (a) under-reporting of the monthly number of beneficiaries; and (b) a tendency to incorrectly report receiving benefits throughout the year, while in fact benefits may have been received only in certain months, leading to artificial spikes in the January starts and December terminations of benefit spells (seam effect). The results of the analysis show the following:

    (1) The rate of under-reporting of EI in SLID is about 15%. Although it varies by month (from 0% to 30%), it is fairly stable from year to year.

    (2) There are significant spikes in the number of January starts and December terminations of EI benefit spells. However, the spikes in January starts appear to represent a real phenomenon, rather than a seam problem. They mirror closely the pattern of establishment of new EI claims (the latter increase significantly in January as a result of the decline in employment following the Christmas peak demand). There are no corresponding statistics for EI claim terminations to assess the nature of December spikes.

    (3) The rate of under-reporting of SA in SLID is about 50%, significantly greater than for EI. The rate of under-reporting goes down to about 20% to 30%, if we assume that those who received SA, but did not report in which months they received benefits, received benefits throughout the year.

    (4) There are large spikes in the number of January starts and December terminations. As in the case of EI, the SA spikes could reflect a real phenomenon; after all, SA starts and terminations are affected by labour market conditions in the same way EI starts and terminations are. However, the SA spikes are much larger than the EI spikes, which increases the likelihood that they are, at least in part, due to a seam effect.

    Release date: 2001-09-11

  • Articles and reports: 12-001-X19990024877
    Description:

    In 1999 Statistics Sweden outlined a proposal for improved quality within the European Statistical System (ESS). The ESS comprises Eurostat and National Statistical Institutes (NSIs) associated with Eurostat. ... Basically, Statistics Sweden proposed the creation of a LEG (Leadership Expert Group) on Quality.

    Release date: 2000-03-01

  • Articles and reports: 12-001-X19990024876
    Description:

    Leslie Kish describes the challenges and opportunities of combining data from surveys of different populations. Examples include multinational surveys where the data from surveys of several countries are combined for comparison and analysis, as well as cumulated periodic surveys of the "same" population. He also compares and contrasts the combining of surveys with the combining of experiments.

    Release date: 2000-03-01

Reference (178) (25 of 178 results)

  • Index and guides: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • Technical products: 12-586-X
    Description:

    The Quality Assurance Framework (QAF) serves as the highest-level governance tool for quality management at Statistics Canada. The QAF gives an overview of the quality management and risk mitigation strategies used by the Agency’s program areas. The QAF is used in conjunction with Statistics Canada management practices, such as those described in the Quality Guidelines.

    Release date: 2017-04-21

  • Technical products: 11-522-X201700014758
    Description:

    Several Canadian jurisdictions including Ontario are using patient-based healthcare data in their funding models. These initiatives can influence the quality of this data both positively and negatively as people tend to pay more attention to the data and its quality when financial decisions are based upon it. Ontario’s funding formula uses data from several national databases housed at the Canadian Institute for Health Information (CIHI). These databases provide information on patient activity and clinical status across the continuum of care. As funding models may influence coding behaviour, CIHI is collaborating with the Ontario Ministry of Health and Long-Term Care to assess and monitor the quality of this data. CIHI is using data mining software and modelling techniques (that are often associated with “big data”) to identify data anomalies across multiple factors. The models identify what the “typical” clinical coding patterns are for key patient groups (for example, patients seen in special care units or discharged to home care), so that outliers can be identified, where patients do not fit the expected pattern. A key component of the modelling is segmenting the data based on patient, provider and hospital characteristics to take into account key differences in the delivery of health care and patient populations across the province. CIHI’s analysis identified several hospitals with coding practices that appear to be changing or significantly different from their peer group. Further investigation is required to understand why these differences exist and to develop appropriate strategies to mitigate variations.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014723
    Description:

    The U.S. Census Bureau is researching uses of administrative records in survey and decennial operations in order to reduce costs and respondent burden while preserving data quality. One potential use of administrative records is to utilize the data when race and Hispanic origin responses are missing. When federal and third party administrative records are compiled, race and Hispanic origin responses are not always the same for an individual across different administrative records sources. We explore different sets of business rules used to assign one race and one Hispanic response when these responses are discrepant across sources. We also describe the characteristics of individuals with matching, non-matching, and missing race and Hispanic origin data across several demographic, household, and contextual variables. We find that minorities, especially Hispanics, are more likely to have non-matching Hispanic origin and race responses in administrative records than in the 2010 Census. Hispanics are less likely to have missing Hispanic origin data but more likely to have missing race data in administrative records. Non-Hispanic Asians and non-Hispanic Pacific Islanders are more likely to have missing race and Hispanic origin data in administrative records. Younger individuals, renters, individuals living in households with two or more people, individuals who responded to the census in the nonresponse follow-up operation, and individuals residing in urban areas are more likely to have non-matching race and Hispanic origin responses.
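
One way such business rules can be expressed is a fixed source-priority order with a frequency-based fallback. This is a hypothetical sketch, not the Census Bureau's actual rule set; the source names and the priority order are invented for illustration.

```python
from collections import Counter

# Hypothetical priority order: prefer self-response, then federal
# administrative records, then third-party records.
SOURCE_PRIORITY = ["census_self_response", "federal_admin", "third_party"]

def resolve_response(responses):
    """responses: dict mapping source name -> reported value (or None).
    Take the highest-priority non-missing source; if no prioritized
    source reports a value, fall back to the most common value overall."""
    for source in SOURCE_PRIORITY:
        if responses.get(source) is not None:
            return responses[source]
    observed = [v for v in responses.values() if v is not None]
    return Counter(observed).most_common(1)[0][0] if observed else None
```

Comparing the resolved value against the census response for matched individuals is one way the discrepancy rates discussed above can be tabulated.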

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014722
    Description:

    The U.S. Census Bureau is researching ways to incorporate administrative data in decennial census and survey operations. Critical to this work is an understanding of the coverage of the population by administrative records. Using federal and third party administrative data linked to the American Community Survey (ACS), we evaluate the extent to which administrative records provide data on foreign-born individuals in the ACS and employ multinomial logistic regression techniques to evaluate characteristics of those who are in administrative records relative to those who are not. We find that overall, administrative records provide high coverage of foreign-born individuals in our sample for whom a match can be determined. The odds of being in administrative records are found to be tied to the processes of immigrant assimilation – naturalization, higher English proficiency, educational attainment, and full-time employment are associated with greater odds of being in administrative records. These findings suggest that as immigrants adapt and integrate into U.S. society, they are more likely to be involved in government and commercial processes and programs for which we are including data. We further explore administrative records coverage for the two largest race/ethnic groups in our sample – Hispanic and non-Hispanic single-race Asian foreign born, finding again that characteristics related to assimilation are associated with administrative records coverage for both groups. However, we observe that neighborhood context impacts Hispanics and Asians differently.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014717
    Description:

    Files with linked data from Statistics Canada's Postsecondary Student Information System (PSIS) and tax data can be used to examine the trajectories of students who pursue postsecondary education (PSE) programs and their post-schooling labour market outcomes. On one hand, administrative data on students linked longitudinally can provide aggregate information on student pathways during postsecondary studies, such as persistence rates, graduation rates and mobility. On the other hand, tax data can supplement the PSIS data to provide information on employment outcomes, such as average and median earnings or earnings progress by employment sector (industry), field of study, education level and/or other demographic information, year over year after graduation. Two longitudinal pilot studies have been done using administrative data on postsecondary students of Maritime institutions, linked longitudinally and to Statistics Canada tax data (the T1 Family File) for the relevant years. This article focuses first on the quality of information in the administrative data and the methodology used to conduct these longitudinal studies and derive indicators, and second on some limitations of using administrative data, rather than a survey, to define certain concepts.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014716
    Description:

    Administrative data, depending on its source and original purpose, can be considered a more reliable source of information than survey-collected data. It does not require a respondent to be present and understand question wording, and it is not limited by the respondent’s ability to recall events retrospectively. This paper compares selected survey data, such as demographic variables, from the Longitudinal and International Study of Adults (LISA) to various administrative sources for which LISA has linkage agreements in place. The agreement between data sources, and some factors that might affect it, are analyzed for various aspects of the survey.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014726
    Description:

    Internal migration is one of the components of population growth estimated at Statistics Canada. It is estimated by comparing individuals’ addresses at the beginning and end of a given period. The Canada Child Tax Benefit and T1 Family File are the primary data sources used. Address quality and coverage of more mobile subpopulations are crucial to producing high-quality estimates. The purpose of this article is to present the results of evaluations of these elements using access to more tax data sources at Statistics Canada.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014725
    Description:

    Tax data are being used more and more to measure and analyze the population and its characteristics. One of the issues raised by the growing use of this type of data relates to the definition of the concept of place of residence. While the census uses the traditional concept of place of residence, tax data provide information based on the mailing address of tax filers. Using record linkage between the census, the National Household Survey and tax data from the T1 Family File, this study examines the level of consistency between the place of residence in these two sources and the characteristics associated with discrepancies.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014711
    Description:

    After the 2010 Census, the U.S. Census Bureau conducted two separate research projects matching survey data to databases. One study matched to the third-party database Accurint, and the other matched to U.S. Postal Service National Change of Address (NCOA) files. In both projects, we evaluated response error in reported move dates by comparing the self-reported move date to records in the database. We encountered similar challenges in the two projects. This paper discusses our experience using “big data” as a comparison source for survey data and our lessons learned for future projects similar to the ones we conducted.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014743
    Description:

    Probabilistic linkage is susceptible to linkage errors such as false positives and false negatives. In many cases, these errors may be reliably measured through clerical reviews, i.e., the visual inspection of a sample of record pairs to determine whether they are matched. A framework is described to carry out such clerical reviews effectively, based on a probabilistic sample of pairs, repeated independent reviews of the same pairs, and latent class analysis to account for clerical errors.
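
The design-based part of such a framework, estimating a false-match rate from a probability sample of accepted pairs, can be sketched as follows. The latent class adjustment for clerical error is omitted, and the weights are assumed to be inverse inclusion probabilities.

```python
def false_match_rate(reviewed):
    """reviewed: list of (design_weight, is_true_match) pairs from a
    probability sample of accepted links, where design_weight is the
    inverse of the pair's inclusion probability. Returns the weighted
    share of accepted links judged to be false matches."""
    total = sum(w for w, _ in reviewed)
    false = sum(w for w, ok in reviewed if not ok)
    return false / total
```

Repeated independent reviews of the same pairs would replace the single `is_true_match` indicator with multiple codings, which is where the latent class model enters.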

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014724
    Description:

    At the Institut national de santé publique du Québec, the Quebec Integrated Chronic Disease Surveillance System (QICDSS) has been used daily for approximately four years. The benefits of this system are numerous for measuring the extent of diseases more accurately, evaluating the use of health services properly and identifying certain groups at risk. However, in the past months, various problems have arisen that have required a great deal of careful thought. The problems have affected various areas of activity, such as data linkage, data quality, coordinating multiple users and meeting legal obligations. The purpose of this presentation is to describe the main challenges associated with using QICDSS data and to present some possible solutions. In particular, this presentation discusses the processing of five data sources that not only come from five different sources, but also are not mainly used for chronic disease surveillance. The varying quality of the data, both across files and within a given file, will also be discussed. Certain situations associated with the simultaneous use of the system by multiple users will also be examined. Examples will be given of analyses of large data sets that have caused problems. As well, a few challenges involving disclosure and the fulfillment of legal agreements will be briefly discussed.

    Release date: 2016-03-24

  • Technical products: 11-522-X201300014284
    Description:

    The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014267
    Description:

    Statistics Sweden has, like many other National Statistical Institutes (NSIs), a long history of working with quality. More recently, the agency decided to start using a number of frameworks to address organizational, process and product quality. It is important to consider all three levels, since we know that the way we do things, e.g., when asking questions, affects product quality and therefore process quality is an important part of the quality concept. Further, organizational quality, i.e., systematically managing aspects such as training of staff and leadership, is fundamental for achieving process quality. Statistics Sweden uses EFQM (European Foundation for Quality Management) as a framework for organizational quality and ISO 20252 for market, opinion and social research as a standard for process quality. In April 2014, as the first National Statistical Institute, Statistics Sweden was certified according to the ISO 20252. One challenge that Statistics Sweden faced in 2011 was to systematically measure and monitor changes in product quality and to clearly present them to stakeholders. Together with external consultants, Paul Biemer and Dennis Trewin, Statistics Sweden developed a tool for this called ASPIRE (A System for Product Improvement, Review and Evaluation). To assure that quality is maintained and improved, Statistics Sweden has also built an organization for quality comprising a quality manager, quality coaches, and internal and external quality auditors. In this paper I will present the components of Statistics Sweden’s quality management system and some of the challenges we have faced.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014264
    Description:

    While wetlands represent only 6.4% of the world’s surface area, they are essential to the survival of terrestrial species. These ecosystems require special attention in Canada, since that is where nearly 25% of the world’s wetlands are found. Environment Canada (EC) has massive databases that contain all kinds of wetland information from various sources. Before the information in these databases could be used for any environmental initiative, it had to be classified and its quality had to be assessed. In this paper, we will give an overview of the joint pilot project carried out by EC and Statistics Canada to assess the quality of the information contained in these databases, which has characteristics specific to big data, administrative data and survey data.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014256
    Description:

    The American Community Survey (ACS) added an Internet data collection mode as part of a sequential mode design in 2013. The ACS currently uses a single web application for all Internet respondents, regardless of whether they respond on a personal computer or on a mobile device. As market penetration of mobile devices increases, however, more survey respondents are using tablets and smartphones to take surveys that are designed for personal computers. Using mobile devices to complete these surveys may be more difficult for respondents and this difficulty may translate to reduced data quality if respondents become frustrated or cannot navigate around usability issues. This study uses several indicators to compare data quality across computers, tablets, and smartphones and also compares the demographic characteristics of respondents that use each type of device.

    Release date: 2014-10-31

  • Technical products: 11-522-X201300014288
    Description:

    Probability-based surveys, those with samples selected through a known randomization mechanism, are considered by many to be the gold standard in contrast to non-probability samples. Probability sampling theory was first developed in the early 1930s and continues today to justify the estimation of population values from these data. Conversely, studies using non-probability samples have gained attention in recent years, but they are not new. Touted as cheaper and faster (even better) than probability designs, these surveys capture participants through various “on the ground” methods (e.g., opt-in web surveys). But which type of survey is better? This paper is the first in a series on the quest for a quality framework under which all surveys, probability- and non-probability-based, may be measured on a more equal footing. First, we highlight a few frameworks currently in use, noting that “better” is almost always relative to a survey’s fit for purpose. Next, we focus on the question of validity, particularly external validity when population estimates are desired. Estimation techniques used to date for non-probability surveys are reviewed, along with a few comparative studies of these estimates against those from a probability-based sample. Finally, the next research steps in the quest are described, followed by a few parting comments.

    Release date: 2014-10-31

  • Surveys and statistical programs – Documentation: 62F0026M2011001
    Description:

    This report describes the quality indicators produced for the 2009 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
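
Two of these indicators are straightforward to compute; a minimal sketch with invented figures. (The SHS definitions of slippage and imputation rates are more involved and not reproduced here.)

```python
def coefficient_of_variation(estimate, standard_error):
    """CV as a percentage: the standard error relative to the estimate,
    a measure of the estimate's relative precision."""
    return 100.0 * standard_error / estimate

def nonresponse_rate(eligible_units, responding_units):
    """Share of eligible sampled units that did not respond, in percent."""
    return 100.0 * (eligible_units - responding_units) / eligible_units
```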

    Release date: 2011-06-16

  • Technical products: 12-587-X
    Description:

    This publication shows readers how to design and conduct a census or sample survey. It explains basic survey concepts and provides information on how to create efficient and high quality surveys. It is aimed at those involved in planning, conducting or managing a survey and at students of survey design courses.

    This book contains the following information:

    - how to plan and manage a survey;
    - how to formulate the survey objectives and design a questionnaire;
    - things to consider when determining a sample design (choosing between a sample or a census, defining the survey population, choosing a survey frame, identifying possible sources of survey error);
    - choosing a method of collection (self-enumeration, personal interviews or telephone interviews; computer-assisted versus paper-based questionnaires);
    - organizing and conducting data collection operations;
    - determining the sample size, allocating the sample across strata and selecting the sample;
    - methods of point estimation and variance estimation, and data analysis;
    - the use of administrative data, particularly during the design and estimation phases;
    - how to process the data (which consists of all data handling activities between collection and estimation) and use quality control and quality assurance measures to minimize and control errors during various survey steps; and
    - disclosure control and data dissemination.

    This publication also includes a case study that illustrates the steps in developing a household survey, using the methods and principles presented in the book. This publication was previously only available in print format and originally published in 2003.
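
For the step on allocating the sample across strata, one common textbook choice is Neyman allocation, which assigns sample proportional to stratum size times stratum standard deviation. A minimal sketch with invented strata; the book itself covers this and other allocation schemes in detail.

```python
def neyman_allocation(n, strata):
    """strata: {name: (N_h, S_h)} with population size N_h and standard
    deviation S_h; allocate total sample n proportional to N_h * S_h."""
    weights = {h: N * S for h, (N, S) in strata.items()}
    total = sum(weights.values())
    return {h: round(n * w / total) for h, w in weights.items()}
```

With equal stratum sizes, the more variable stratum simply receives proportionally more sample.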

    Release date: 2010-09-27

  • Technical products: 11-522-X200800011014
    Description:

    In many countries, improved quality of economic statistics is one of the most important goals of the 21st century. First and foremost, the quality of National Accounts is in focus, regarding both annual and quarterly accounts. To achieve this goal, data quality regarding the largest enterprises is of vital importance. To assure that the quality of data for the largest enterprises is good, coherence analysis is an important tool. Coherence means that data from different sources fit together and give a consistent view of the development within these enterprises. Working with coherence analysis in an efficient way is normally a work-intensive task consisting mainly of collecting data from different sources and comparing them in a structured manner. Over the last two years, Statistics Sweden has made great progress in improving the routines for coherence analysis. An IT tool that collects data for the largest enterprises from a large number of sources and presents it in a structured and logical manner has been built, and a systematic approach to analyse data for National Accounts on a quarterly basis has been developed. The paper describes the work in both these areas and gives an overview of the IT tool and the agreed routines.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010954
    Description:

    Over the past year, Statistics Canada has been developing and testing a new way to monitor the performance of interviewers conducting computer-assisted personal interviews (CAPI). A formal process already exists for monitoring centralized telephone interviews. Monitors listen to telephone interviews as they take place to assess the interviewer's performance using pre-defined criteria and provide feedback to the interviewer on what was well done and what needs improvement. For the CAPI program, we have developed and are testing a pilot approach whereby interviews are digitally recorded and later a monitor listens to these recordings to assess the field interviewer's performance and provide feedback in order to help improve the quality of the data. In this paper, we will present an overview of the CAPI monitoring project at Statistics Canada by describing the CAPI monitoring methodology and the plans for implementation.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010940
    Description:

    Data Collection Methodology (DCM) enables the collection of good-quality data by providing expert advice and assistance on questionnaire design, methods of evaluation and respondent engagement. DCM assists in the development of client skills, undertakes research and leads innovation in data collection methods. This is done in a challenging environment of organisational change and limited resources. This paper will cover 'how DCM does business' with clients and the wider methodological community to achieve our goals.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010955
    Description:

    Survey managers are still discovering the usefulness of digital audio recording for monitoring and managing field staff. Its value so far has been for confirming the authenticity of interviews, detecting curbstoning, offering a concrete basis for feedback on interviewing performance and giving data collection managers an intimate view of in-person interviews. In addition, computer audio-recorded interviewing (CARI) can improve other aspects of survey data quality, offering corroboration or correction of response coding by field staff. Audio recordings may replace or supplement in-field verbatim transcription of free responses, and speech-to-text technology might make this technique more efficient in the future.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010968
    Description:

    Statistics Canada has embarked on a program of increasing and improving the usage of imaging technology for paper survey questionnaires. The goal is to make the process an efficient, reliable and cost-effective method of capturing survey data. The objective is to continue using Optical Character Recognition (OCR) to capture the data from questionnaires, documents and faxes received, whilst improving the process integration and Quality Assurance/Quality Control (QA/QC) of the data capture process. These improvements are discussed in this paper.
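One common form of QC for a capture process of this kind is to compare a sample of OCR-captured values against independently keyed values and track the error rate against an acceptance threshold. A hypothetical sketch, where the field values and the 1% threshold are assumptions for illustration, not a Statistics Canada standard:

```python
def qc_error_rate(captured, verified):
    """Field-level error rate for a QC sample: OCR-captured values
    compared against independently keyed 'truth' values."""
    if len(captured) != len(verified):
        raise ValueError("QC sample and verification must align")
    errors = sum(1 for c, v in zip(captured, verified) if c != v)
    return errors / len(captured)

# Hypothetical QC sample: one field has a letter 'O' misread for zero.
captured = ["1200", "85O", "430", "17", "96"]
verified = ["1200", "850", "430", "17", "96"]

rate = qc_error_rate(captured, verified)
batch_accepted = rate <= 0.01  # illustrative acceptance threshold
```

A rejected batch would typically be routed back for re-capture or manual keying; the sketch shows only the accept/reject decision itself.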

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010956
    Description:

    The use of Computer Audio-Recorded Interviewing (CARI) as a tool to identify interview falsification is quickly growing in survey research (Biemer, 2000, 2003; Thissen, 2007). Similarly, survey researchers are starting to expand the usefulness of CARI by combining recordings with coding to address data quality (Herget, 2001; Hansen, 2005; McGee, 2007). This paper presents results from a study included as part of the establishment-based National Center for Health Statistics' National Home and Hospice Care Survey (NHHCS), which used CARI behavior coding and CARI-specific paradata to: 1) identify and correct problematic interviewer behavior or question issues early in the data collection period, before either negatively impacts data quality; and 2) identify ways to diminish measurement error in future implementations of the NHHCS. During the first 9 weeks of the 30-week field period, CARI recorded a subset of questions from the NHHCS application for all interviewers. Recordings were linked with the interview application and output and then coded in one of two modes: Code by Interviewer or Code by Question. The Code by Interviewer method provided visibility into problems specific to an interviewer as well as more generalized problems potentially applicable to all interviewers. The Code by Question method yielded data that spoke to the understandability of the questions and other response problems. In this mode, coders coded multiple implementations of the same question across multiple interviewers. Using the Code by Question approach, researchers identified issues with three key survey questions in the first few weeks of data collection and provided guidance to interviewers on how to handle those questions as data collection continued.
    Results from coding the audio recordings (which were linked with the survey application and output) will inform question wording and interviewer training in the next implementation of the NHHCS, and guide future enhancement of CARI and the coding system.
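The two coding modes described above amount to tallying the same coded recordings along different axes: by interviewer to surface training issues, and by question to surface wording problems. A minimal sketch, where the behavior codes and names are illustrative assumptions, not the actual NHHCS coding scheme:

```python
from collections import defaultdict

# Hypothetical coded recordings: (interviewer, question, behavior_code).
recordings = [
    ("int_A", "Q1", "exact"),
    ("int_A", "Q3", "major_change"),
    ("int_B", "Q1", "exact"),
    ("int_B", "Q3", "major_change"),
    ("int_B", "Q7", "probe_needed"),
]

def tally(records, axis):
    """Tally behavior codes by interviewer (axis=0) or question (axis=1)."""
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec[axis]][rec[2]] += 1
    return {key: dict(codes) for key, codes in counts.items()}

by_interviewer = tally(recordings, 0)  # Code by Interviewer view
by_question = tally(recordings, 1)     # Code by Question view
```

A question repeatedly coded "major_change" across several interviewers (Q3 here) points to a wording problem; the same code accumulating under a single interviewer would instead suggest a training issue.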

    Release date: 2009-12-03
