Quality assurance

Skip to main content
Skip to footer

Language selection

Français

Search and menus

Search and menus

Search

Skip to filters. View results.

Results

All (33)

All (33) (0 to 10 of 33 results)

1. Creation of a composite quality indicator for administrative data-based estimates using clustering Archived
Articles and reports: 11-522-X202100100015
Description: National statistical agencies such as Statistics Canada have a responsibility to convey the quality of statistical information to users. The methods traditionally used to do this are based on measures of sampling error. As a result, they are not adapted to the estimates produced using administrative data, for which the main sources of error are not due to sampling. A more suitable approach to reporting the quality of estimates presented in a multidimensional table is described in this paper. Quality indicators were derived for various post-acquisition processing steps, such as linkage, geocoding and imputation, by estimation domain. A clustering algorithm was then used to combine domains with similar quality levels for a given estimate. Ratings to inform users of the relative quality of estimates across domains were assigned to the groups created. This indicator, called the composite quality indicator (CQI), was developed and experimented with in the Canadian Housing Statistics Program (CHSP), which aims to produce official statistics on the residential housing sector in Canada using multiple administrative data sources.
Keywords: Unsupervised machine learning, quality assurance, administrative data, data integration, clustering.
Release date: 2021-10-22
2. Challenges and results in using Audit trail data to monitor Labour Force Survey data quality Archived
Surveys and statistical programs – Documentation: 11-522-X201700014707
Description:
The Labour Force Survey (LFS) is a monthly household survey of about 56,000 households that provides information on the Canadian labour market. Audit Trail is a Blaise programming option, for surveys like LFS with Computer Assisted Interviewing (CAI), which creates files containing every keystroke and edit and timestamp of every data collection attempt on all households. Combining such a large survey with such a complete source of paradata opens the door to in-depth data quality analysis but also quickly leads to Big Data challenges. How can meaningful information be extracted from this large set of keystrokes and timestamps? How can it help assess the quality of LFS data collection? The presentation will describe some of the challenges that were encountered, solutions that were used to address them, and results of the analysis on data quality.
Release date: 2016-03-24
3. Total cost-effectiveness of mammography screening strategies Archived
Articles and reports: 82-003-X201501214295
Description:
Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.
Release date: 2015-12-16
4. A Framework for the Evaluation of Administrative Data Archived
Articles and reports: 11-522-X201300014284
Description:
The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.
Release date: 2014-10-31
5. Survey Quality Archived
Articles and reports: 12-001-X201200211751
Description:
Survey quality is a multi-faceted concept that originates from two different development paths. One path is the total survey error paradigm that rests on four pillars providing principles that guide survey design, survey implementation, survey evaluation, and survey data analysis. We should design surveys so that the mean squared error of an estimate is minimized given budget and other constraints. It is important to take all known error sources into account, to monitor major error sources during implementation, to periodically evaluate major error sources and combinations of these sources after the survey is completed, and to study the effects of errors on the survey analysis. In this context survey quality can be measured by the mean squared error and controlled by observations made during implementation and improved by evaluation studies. The paradigm has both strengths and weaknesses. One strength is that research can be defined by error sources and one weakness is that most total survey error assessments are incomplete in the sense that it is not possible to include the effects of all the error sources. The second path is influenced by ideas from the quality management sciences. These sciences concern business excellence in providing products and services with a focus on customers and competition from other providers. These ideas have had a great influence on many statistical organizations. One effect is the acceptance among data providers that product quality cannot be achieved without a sufficient underlying process quality and process quality cannot be achieved without a good organizational quality. These levels can be controlled and evaluated by service level agreements, customer surveys, paradata analysis using statistical process control, and organizational assessment using business excellence models or other sets of criteria. All levels can be improved by conducting improvement projects chosen by means of priority functions. The ultimate goal of improvement projects is that the processes involved should gradually approach a state where they are error-free. Of course, this might be an unattainable goal, albeit one to strive for. It is not realistic to hope for continuous measurements of the total survey error using the mean squared error. Instead one can hope that continuous quality improvement using management science ideas and statistical methods can minimize biases and other survey process problems so that the variance becomes an approximation of the mean squared error. If that can be achieved we have made the two development paths approximately coincide.
Release date: 2012-12-19
6. Assessment of validity of self-reported smoking status Archived
Articles and reports: 82-003-X201200111625
Geography: Canada
Description:
This study compares estimates of the prevalence of cigarette smoking based on self-report with estimates based on urinary cotinine concentrations. The data are from the 2007 to 2009 Canadian Health Measures Survey, which included self-reported smoking status and the first nationally representative measures of urinary cotinine.
Release date: 2012-02-15
7. Evaluating the hyperactivity/inattention subscale of the National Longitudinal Survey of Children and Youth Archived
Articles and reports: 82-003-X201000211234
Geography: Canada
Description:
This article evaluates the parent-reported Hyperactivity/Inattention Subscale of the National Longitudinal Survey of Children and Youth with data from cycle 1 (1994/1995) of the survey.
Release date: 2010-06-16
8. Survey of Household Spending 2005: Data Quality Indicators Archived
Surveys and statistical programs – Documentation: 62F0026M2010002
Description:
This report describes the quality indicators produced for the 2005 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
Release date: 2010-04-26
9. Testing for the 2011 Census of Canada Archived
Articles and reports: 11-522-X200800010950
Description:
The next census will be conducted in May 2011. Being a major survey, it presents a formidable challenge for Statistics Canada and requires a great deal of time and resources. Careful planning has been done to ensure that all deadlines are met. A number of steps have been planned in the questionnaire testing process. These tests apply to both census content and the proposed communications strategy. This paper presents an overview of the strategy, with a focus on combining qualitative studies with the 2008 quantitative study so that the results can be analyzed and the proposals properly evaluated.
Release date: 2009-12-03
10. The Measurement of Job Search and Unemployment in a Retrospective Setting Archived
Articles and reports: 75F0002M1992004
Description:
The accurate measurement of job search and unemployment has been a recurring problem in retrospective surveys. However, strategies to improve recall in such surveys have not been especially successful. Proposed solutions have included a) reducing the recall period and b) questioning whether the standard operationalization of labour force concepts is appropriate in a retrospective setting.
One difficulty in arriving at an appropriate line of questioning is that there does not exist a reliable benchmark source indicating what sort of search patterns one should be observing over the year. Current notions of labour force dynamics have been heavily influenced by linked-record gross change data, which for various reasons cannot be considered a reliable source. These data show numerous changes in status from month-to-month and generally paint a picture of labour force participation that suggests little behavioural consistency.
This study examines data from the Annual Work Patterns Survey (AWPS) and Labour Market Activity Survey (LMAS). It shows that the underreporting of job search in the AWPS (and to a lesser extent in the LMAS) is closely connected to the failure of respondents, in a significant number of cases, to report any job search prior to the start of a job, a problem for which there is a simple questionnaire solution.
Release date: 2008-02-29

Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (23)

Analysis (23) (0 to 10 of 23 results)

1. Creation of a composite quality indicator for administrative data-based estimates using clustering Archived
Articles and reports: 11-522-X202100100015
Description: National statistical agencies such as Statistics Canada have a responsibility to convey the quality of statistical information to users. The methods traditionally used to do this are based on measures of sampling error. As a result, they are not adapted to the estimates produced using administrative data, for which the main sources of error are not due to sampling. A more suitable approach to reporting the quality of estimates presented in a multidimensional table is described in this paper. Quality indicators were derived for various post-acquisition processing steps, such as linkage, geocoding and imputation, by estimation domain. A clustering algorithm was then used to combine domains with similar quality levels for a given estimate. Ratings to inform users of the relative quality of estimates across domains were assigned to the groups created. This indicator, called the composite quality indicator (CQI), was developed and experimented with in the Canadian Housing Statistics Program (CHSP), which aims to produce official statistics on the residential housing sector in Canada using multiple administrative data sources.
Keywords: Unsupervised machine learning, quality assurance, administrative data, data integration, clustering.
Release date: 2021-10-22
2. Total cost-effectiveness of mammography screening strategies Archived
Articles and reports: 82-003-X201501214295
Description:
Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.
Release date: 2015-12-16
3. A Framework for the Evaluation of Administrative Data Archived
Articles and reports: 11-522-X201300014284
Description:
The decline in response rates observed by several national statistical institutes, their desire to limit response burden and the significant budget pressures they face support greater use of administrative data to produce statistical information. The administrative data sources they must consider have to be evaluated according to several aspects to determine their fitness for use. Statistics Canada recently developed a process to evaluate administrative data sources for use as inputs to the statistical information production process. This evaluation is conducted in two phases. The initial phase requires access only to the metadata associated with the administrative data considered, whereas the second phase uses a version of data that can be evaluated. This article outlines the evaluation process and tool.
Release date: 2014-10-31
4. Survey Quality Archived
Articles and reports: 12-001-X201200211751
Description:
Survey quality is a multi-faceted concept that originates from two different development paths. One path is the total survey error paradigm that rests on four pillars providing principles that guide survey design, survey implementation, survey evaluation, and survey data analysis. We should design surveys so that the mean squared error of an estimate is minimized given budget and other constraints. It is important to take all known error sources into account, to monitor major error sources during implementation, to periodically evaluate major error sources and combinations of these sources after the survey is completed, and to study the effects of errors on the survey analysis. In this context survey quality can be measured by the mean squared error and controlled by observations made during implementation and improved by evaluation studies. The paradigm has both strengths and weaknesses. One strength is that research can be defined by error sources and one weakness is that most total survey error assessments are incomplete in the sense that it is not possible to include the effects of all the error sources. The second path is influenced by ideas from the quality management sciences. These sciences concern business excellence in providing products and services with a focus on customers and competition from other providers. These ideas have had a great influence on many statistical organizations. One effect is the acceptance among data providers that product quality cannot be achieved without a sufficient underlying process quality and process quality cannot be achieved without a good organizational quality. These levels can be controlled and evaluated by service level agreements, customer surveys, paradata analysis using statistical process control, and organizational assessment using business excellence models or other sets of criteria. All levels can be improved by conducting improvement projects chosen by means of priority functions. The ultimate goal of improvement projects is that the processes involved should gradually approach a state where they are error-free. Of course, this might be an unattainable goal, albeit one to strive for. It is not realistic to hope for continuous measurements of the total survey error using the mean squared error. Instead one can hope that continuous quality improvement using management science ideas and statistical methods can minimize biases and other survey process problems so that the variance becomes an approximation of the mean squared error. If that can be achieved we have made the two development paths approximately coincide.
Release date: 2012-12-19
5. Assessment of validity of self-reported smoking status Archived
Articles and reports: 82-003-X201200111625
Geography: Canada
Description:
This study compares estimates of the prevalence of cigarette smoking based on self-report with estimates based on urinary cotinine concentrations. The data are from the 2007 to 2009 Canadian Health Measures Survey, which included self-reported smoking status and the first nationally representative measures of urinary cotinine.
Release date: 2012-02-15
6. Evaluating the hyperactivity/inattention subscale of the National Longitudinal Survey of Children and Youth Archived
Articles and reports: 82-003-X201000211234
Geography: Canada
Description:
This article evaluates the parent-reported Hyperactivity/Inattention Subscale of the National Longitudinal Survey of Children and Youth with data from cycle 1 (1994/1995) of the survey.
Release date: 2010-06-16
7. Testing for the 2011 Census of Canada Archived
Articles and reports: 11-522-X200800010950
Description:
The next census will be conducted in May 2011. Being a major survey, it presents a formidable challenge for Statistics Canada and requires a great deal of time and resources. Careful planning has been done to ensure that all deadlines are met. A number of steps have been planned in the questionnaire testing process. These tests apply to both census content and the proposed communications strategy. This paper presents an overview of the strategy, with a focus on combining qualitative studies with the 2008 quantitative study so that the results can be analyzed and the proposals properly evaluated.
Release date: 2009-12-03
8. The Measurement of Job Search and Unemployment in a Retrospective Setting Archived
Articles and reports: 75F0002M1992004
Description:
The accurate measurement of job search and unemployment has been a recurring problem in retrospective surveys. However, strategies to improve recall in such surveys have not been especially successful. Proposed solutions have included a) reducing the recall period and b) questioning whether the standard operationalization of labour force concepts is appropriate in a retrospective setting.
One difficulty in arriving at an appropriate line of questioning is that there does not exist a reliable benchmark source indicating what sort of search patterns one should be observing over the year. Current notions of labour force dynamics have been heavily influenced by linked-record gross change data, which for various reasons cannot be considered a reliable source. These data show numerous changes in status from month-to-month and generally paint a picture of labour force participation that suggests little behavioural consistency.
This study examines data from the Annual Work Patterns Survey (AWPS) and Labour Market Activity Survey (LMAS). It shows that the underreporting of job search in the AWPS (and to a lesser extent in the LMAS) is closely connected to the failure of respondents, in a significant number of cases, to report any job search prior to the start of a job, a problem for which there is a simple questionnaire solution.
Release date: 2008-02-29
9. Quality measures related to the telephone first contact collection process in the Labour force survey Archived
Articles and reports: 11-522-X20050019487
Description:
The goal of this presentation is to present the different quality measures used to evaluate and manage the collection process related to the Telephone First Contact methodology in LFS.
Release date: 2007-03-02
10. Quality measures for enhanced quality of statistics in the European statistical system Archived
Articles and reports: 11-522-X20050019488
Description:
This paper sets out the importance of quality measures that can be used for monitoring purposes of these current and future information needs in the ESS. Particular emphasis is put on the needs for generalisation of initiatives in the ESS for the development and implementation of operational quality measures for enhanced quality of the statistical processes.
Release date: 2007-03-02

Reference (10)

Reference (10) ((10 results))

1. Challenges and results in using Audit trail data to monitor Labour Force Survey data quality Archived
Surveys and statistical programs – Documentation: 11-522-X201700014707
Description:
The Labour Force Survey (LFS) is a monthly household survey of about 56,000 households that provides information on the Canadian labour market. Audit Trail is a Blaise programming option, for surveys like LFS with Computer Assisted Interviewing (CAI), which creates files containing every keystroke and edit and timestamp of every data collection attempt on all households. Combining such a large survey with such a complete source of paradata opens the door to in-depth data quality analysis but also quickly leads to Big Data challenges. How can meaningful information be extracted from this large set of keystrokes and timestamps? How can it help assess the quality of LFS data collection? The presentation will describe some of the challenges that were encountered, solutions that were used to address them, and results of the analysis on data quality.
Release date: 2016-03-24
2. Survey of Household Spending 2005: Data Quality Indicators Archived
Surveys and statistical programs – Documentation: 62F0026M2010002
Description:
This report describes the quality indicators produced for the 2005 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
Release date: 2010-04-26
3. Survey of Household Spending 2003: Data Quality Indicators Archived
Surveys and statistical programs – Documentation: 62F0026M2005006
Description:
This report describes the quality indicators produced for the 2003 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
Release date: 2005-10-06
4. Survey of Household Spending 2002: Data Quality Indicators Archived
Surveys and statistical programs – Documentation: 62F0026M2004001
Description:
This report describes the quality indicators produced for the 2002 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
Release date: 2004-09-15
5. 2000 Survey of Household Spending - Data Quality Indicators Archived
Surveys and statistical programs – Documentation: 62F0026M2002001
Description:
This report describes the quality indicators produced for the 2000 Survey of Household Spending. It covers the usual quality indicators that help users interpret the data, such as coefficients of variation, non-response rates, slippage rates and imputation rates.
Release date: 2002-06-28
6. Modeling labour force careers for the lifepaths simulation model Archived
Surveys and statistical programs – Documentation: 11-522-X19990015648
Description:
We estimate the parameters of a stochastic model for labour force careers involving distributions of correlated durations employed, unemployed (with and without job search) and not in the labour force. If the model is to account for sub-annual labour force patterns as well as advancement towards retirement, then no single data source is adequate to inform it. However, it is possible to build up an approximation from a number of different sources.
Release date: 2000-03-02
7. Estimation using the generalized weight share method: The case of record linkage Archived
Surveys and statistical programs – Documentation: 11-522-X19990015680
Description:
To augment the amount of available information, data from different sources are increasingly being combined. These databases are often combined using record linkage methods. When there is no unique identifier, a probabilistic linkage is used. In that case, a record on a first file is associated with a probability that is linked to a record on a second file, and then a decision is taken on whether a possible link is a true link or not. This usually requires a non-negligible amount of manual resolution. It might then be legitimate to evaluate if manual resolution can be reduced or even eliminated. This issue is addressed in this paper where one tries to produce an estimate of a total (or a mean) of one population, when using a sample selected from another population linked somehow to the first population. In other words, having two populations linked through probabilistic record linkage, we try to avoid any decision concerning the validity of links and still be able to produce an unbiased estimate for a total of the one of two populations. To achieve this goal, we suggest the use of the Generalised Weight Share Method (GWSM) described by Lavallée (1995).
Release date: 2000-03-02
8. Qualitative Aspects of the Survey of Labour and Income Dynamics (SLID) Test 3A Data Collection Archived
Surveys and statistical programs – Documentation: 75F0002M1993007
Description:
This report presents a summary evaluation of the quality of the data collected during the Survey of Labour and Income Dynamics (SLID) field test of labour market activity data, held in January and February 1993.
Release date: 1995-12-30
9. Qualitative Aspects of the Survey of Labour and Income Dynamics (SLID) Test 3B Data Collection Archived
Surveys and statistical programs – Documentation: 75F0002M1993011
Description:
This report presents a summary evaluation of the quality of the data collected during the Survey of Labour and Income Dynamics (SLID) field test of income and wealth, held in April and May 1993.
Release date: 1995-12-30
10. Some Effects of Computer-assisted Interviewing on the Data Quality of the Survey of Labour and Income Dynamics Archived
Surveys and statistical programs – Documentation: 75F0002M1995007
Description:
This paper describes the impact of computer-assisted interviewing (CAI) on the quality of the Survey of Labour and Income Dynamics (SLID) data in three content areas: labour force activity, respondent-sensitive sources of income, and household relationships.
Release date: 1995-12-30

Report a problem or mistake on this page

Date modified:: 2024-04-24