Statistics by subject – Statistical methods

All (210) (25 of 210 results)

  • Journals and periodicals: 11-633-X
    Description:

    Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018012
    Description:

    This study investigates the extent to which income tax reassessments and delayed tax filing affect the reliability of Canadian administrative tax datasets used for economic analysis. The study is based on individual income tax records from the T1 Personal Master File and Historical Personal Master File for selected years from 1990 to 2010. These datasets contain tax records for approximately 100% of initial and all income tax filers who submitted returns to the Canada Revenue Agency (CRA) before specific processing cut-off dates.

    Release date: 2018-01-11

  • Articles and reports: 11-633-X2018011
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 30 years. The IMDB combines administrative files on immigrant admissions and non-permanent resident permits from Immigration, Refugees and Citizenship Canada (IRCC) with tax files from the Canada Revenue Agency (CRA). Information is available for immigrant taxfilers admitted since 1980. Tax records for 1982 and subsequent years are available for immigrant taxfilers.

    This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2018-01-08

  • Public use microdata: 89F0002X
    Description:

    The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation.

    Release date: 2018-01-08

  • Articles and reports: 18-001-X2017001
    Description:

    This working paper profiles Canadian firms involved in the development and production of bioproducts. It provides data on the number and types of bioproducts firms in 2015, covering bioproducts revenues, research and development, use of biomass, patents, products, business practices and the impact of government regulations on the sector.

    Release date: 2017-12-22

  • Journals and periodicals: 12-001-X
    Description:

    The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254871
    Description:

    This paper addresses the question of how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated survey with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. This model also makes it possible to utilize the higher frequency of the social media to produce more precise estimates for the sample survey in real time at the moment that statistics for the social media become available but the sample data are not yet available. The concept of cointegration is applied to address the question of the extent to which the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254872
    Description:

    This note discusses the theoretical foundations for the extension of the Wilson two-sided coverage interval to an estimated proportion computed from complex survey data. The interval is shown to be asymptotically equivalent to an interval derived from a logistic transformation. A mildly better version is discussed, but users may prefer constructing a one-sided interval already in the literature.

    Release date: 2017-12-21
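
As a rough illustration of the interval discussed in the note above, the sketch below computes the standard Wilson two-sided interval for a proportion. Handling the complex design by plugging in an effective sample size (`n_eff`, the nominal sample size divided by an assumed design effect) is an assumption made here for illustration and is not necessarily the extension derived in the note. The function name and its parameters are hypothetical.

```python
import math


def wilson_interval(p_hat: float, n_eff: float, z: float = 1.96) -> tuple[float, float]:
    """Standard Wilson two-sided interval for a proportion.

    p_hat : estimated proportion, between 0 and 1
    n_eff : effective sample size (assumption: nominal n / design effect)
    z     : normal quantile, 1.96 for a nominal 95% interval
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    if n_eff <= 0:
        raise ValueError("n_eff must be positive")

    z2 = z * z
    denom = 1.0 + z2 / n_eff
    # Centre of the interval is pulled from p_hat towards 1/2.
    centre = (p_hat + z2 / (2.0 * n_eff)) / denom
    # Half-width of the Wilson interval.
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n_eff + z2 / (4.0 * n_eff ** 2)) / denom
    return centre - half, centre + half
```

For example, `wilson_interval(0.5, 100)` gives an interval of roughly (0.404, 0.596). Unlike the usual Wald interval, the result stays inside [0, 1] even when the estimated proportion is 0 or 1.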

  • Articles and reports: 12-001-X201700254888
    Description:

    We discuss developments in sample survey theory and methods covering the past 100 years. Neyman’s 1934 landmark paper laid the theoretical foundations for the probability sampling approach to inference from survey samples. Classical sampling books by Cochran, Deming, Hansen, Hurwitz and Madow, Sukhatme, and Yates, which appeared in the early 1950s, expanded and elaborated the theory of probability sampling, emphasizing unbiasedness, model-free features, and designs that minimize variance for a fixed cost. During the period 1960-1970, theoretical foundations of inference from survey data received attention, with the model-dependent approach generating considerable discussion. Introduction of general-purpose statistical software led to the use of such software with survey data, which led to the design of methods specifically for complex survey data. At the same time, weighting methods, such as regression estimation and calibration, became practical and design consistency replaced unbiasedness as the requirement for standard estimators. A bit later, computer-intensive resampling methods also became practical for large-scale survey samples. Improved computer power led to more sophisticated imputation for missing data, use of more auxiliary data, some treatment of measurement errors in estimation, and more complex estimation procedures. A notable use of models was in the expanded use of small area estimation. Future directions in research and methods will be influenced by budgets, response rates, timeliness, improved data collection devices, and availability of auxiliary data, some of which will come from “Big Data”. Survey taking will be impacted by changing cultural behavior and by a changing physical-technical environment.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254887
    Description:

    This paper proposes a new approach to decompose the wage difference between men and women that is based on a calibration procedure. This approach generalizes two current decomposition methods that are re-expressed using survey weights. The first one is the Blinder-Oaxaca method and the second one is a reweighting method proposed by DiNardo, Fortin and Lemieux. The new approach provides a weighting system that enables us to estimate parameters of interest such as quantiles. An application to data from the Swiss Structure of Earnings Survey illustrates the value of this method.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254895
    Description:

    This note by Graham Kalton presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions”, in which J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254896
    Description:

    This note by Sharon L. Lohr presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions”, in which J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254897
    Description:

    This note by Chris Skinner presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions”, in which J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254894
    Description:

    This note by Danny Pfeffermann presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions”, in which J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 11-626-X2017077
    Description:

    On April 13, 2017, the Government of Canada tabled legislation to legalize the recreational use of cannabis by adults. This will directly impact Canada’s statistical system. The focus of this Economic Insights article is to provide experimental estimates for the volume of cannabis consumption, based on existing information on the prevalence of cannabis use. The article presents experimental estimates of the number of tonnes of cannabis consumed by age group for the period from 1960 to 2015. The experimental estimates rely on survey data from multiple sources, statistical techniques to link the sources over time, and assumptions about consumption behaviour. They are subject to revision as improved or additional data sources become available.

    Release date: 2017-12-18

  • Index and guides: 98-500-X
    Description:

    Provides information that enables users to effectively use, apply and interpret data from the Census of Population. Each guide contains definitions and explanations on census concepts as well as a data quality and historical comparability section. Additional information will be included for specific variables to help users better understand the concepts and questions used in the census.

    Release date: 2017-11-29

  • Technical products: 84-538-X
    Description:

    This document presents the methodology underlying the production of the life tables for Canada, provinces and territories, from reference period 1980/1982 onward.

    Release date: 2017-11-16

  • Technical products: 12-206-X
    Description:

    This report summarizes the achievements of the program sponsored by the three methodology divisions of Statistics Canada. This program covers research and development activities in statistical methods with potentially broad application in the Agency's survey programs, which would not otherwise have been carried out during the provision of methodology services to those survey programs. These activities also include tasks that provided client support in the application of past successful developments, in order to promote the use of the results of research and development work.

    Release date: 2017-11-03

  • Articles and reports: 11F0019M2017399
    Description:

    Canada is a trading nation that produces significant quantities of resource outputs. Consequently, the behaviour of resource prices that are important for Canada is germane to understanding the progress of real income growth and the prosperity of the country and the provinces. Demand and supply shocks or changes in monetary policy in international markets may exert significant influence on resource prices, and their fluctuations constitute an important avenue for the transmission of external shocks into the domestic economy. This paper develops historical estimates of the Bank of Canada commodity price index (BCPI) and links them to modern estimates. Using a collection of historical data sources, it estimates weights and prices with sufficient consistency to merit the construction of long-run estimates that may be linked to the modern Fisher BCPI.

    Release date: 2017-10-11

  • Articles and reports: 13-605-X201700114840
    Description:

    Statistics Canada is presently preparing the statistical system to be able to gauge the impact of the transition from illegal to legal non-medical cannabis use and to shed light on the social and economic activities related to the use of cannabis thereafter. While the system of social statistics captures some information on the use of cannabis, updates will be required to more accurately measure health effects and the impact on the judicial system. Current statistical infrastructure used to more comprehensively measure the use and impacts of substances such as tobacco and alcohol could be adapted to do the same for cannabis. However, available economic statistics are largely silent on the role illegal drugs play in the economy. Both social and economic statistics will need to be updated to reflect the legalization of cannabis, and the challenge is especially great for economic statistics. This paper provides a summary of the work that is now under way toward these ends.

    Release date: 2017-09-28

  • Index and guides: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • Articles and reports: 11-633-X2017009
    Description:

    This document describes the procedures for using linked administrative data sources to estimate paid parental leave rates in Canada and the issues surrounding this use.

    Release date: 2017-08-29

  • Articles and reports: 11-633-X2017008
    Description:

    The DYSEM microsimulation modelling platform provides a demographic and socioeconomic core that can be readily built upon to develop custom dynamic microsimulation models or applications. This paper describes DYSEM and provides an overview of its intended uses, as well as the methods and data used in its development.

    Release date: 2017-07-28

  • Articles and reports: 12-001-X201700114817
    Description:

    We present research results on sample allocations for efficient model-based small area estimation in cases where the areas of interest coincide with the strata. Although model-assisted and model-based estimation methods are common in the production of small area statistics, the underlying model and estimation method are rarely taken into account in the sample area allocation scheme. Therefore, we have developed a new model-based allocation named g1-allocation. For comparison, one recently developed model-assisted allocation is presented. These two allocations are based on an adjusted measure of homogeneity which is computed using an auxiliary variable and is an approximation of the intra-class correlation within areas. Five model-free area allocation solutions presented in the past are selected from the literature as reference allocations. Equal and proportional allocations need the number of areas and area-specific numbers of basic statistical units. The Neyman, Bankier and NLP (Non-Linear Programming) allocations need values for the study variable concerning area level parameters such as standard deviation, coefficient of variation or totals. In general, allocation methods can be classified according to the optimization criteria and use of auxiliary data. Statistical properties of the various methods are assessed through sample simulation experiments using real population register data. It can be concluded from simulation results that inclusion of the model and estimation method into the allocation method improves estimation results.

    Release date: 2017-06-22
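
To make the data requirements of two of the reference allocations named in the abstract above concrete, here is a minimal sketch of the textbook proportional and Neyman allocations. It does not implement the g1-allocation or the model-assisted allocation from the paper. The function names are hypothetical, and the results are left as fractional sample sizes, since the rounding rule is a separate design choice.

```python
from typing import List, Sequence


def proportional_allocation(n: float, sizes: Sequence[float]) -> List[float]:
    """Allocate a total sample n in proportion to the area sizes N_h."""
    total = sum(sizes)
    if total <= 0:
        raise ValueError("area sizes must sum to a positive number")
    return [n * size / total for size in sizes]


def neyman_allocation(n: float, sizes: Sequence[float], sds: Sequence[float]) -> List[float]:
    """Allocate a total sample n in proportion to N_h * S_h.

    sizes : number of basic statistical units N_h in each area
    sds   : standard deviation S_h of the study variable in each area
    """
    if len(sizes) != len(sds):
        raise ValueError("sizes and sds must have the same length")
    weights = [size * sd for size, sd in zip(sizes, sds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one area needs positive size and variability")
    return [n * w / total for w in weights]
```

With two equally sized areas whose standard deviations are 1 and 3, `neyman_allocation(100, [100, 100], [1.0, 3.0])` assigns 25 and 75 units, while `proportional_allocation(100, [100, 100])` assigns 50 to each. The proportional version needs only the area sizes, whereas the Neyman version also needs the area-level standard deviations of the study variable.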

Analysis (165)

Analysis (165) (25 of 165 results)

  • Journals and periodicals: 11-633-X
    Description:

    Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018012
    Description:

    This study investigates the extent to which income tax reassessments and delayed tax filing affect the reliability of Canadian administrative tax datasets used for economic analysis. The study is based on individual income tax records from the T1 Personal Master File and Historical Personal Master File for selected years from 1990 to 2010. These datasets contain tax records for approximately 100% of initial and all income tax filers, who submitted returns to the Canada Revenue Agency (CRA) before specific processing cut-off dates.

    Release date: 2018-01-11

  • Articles and reports: 11-633-X2018011
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 30 years. The IMDB combines administrative files on immigrant admissions and non-permanent resident permits from Immigration, Refugees and Citizenship Canada (IRCC) with tax files from the Canadian Revenue Agency (CRA). Information is available for immigrant taxfilers admitted since 1980. Tax records for 1982 and subsequent years are available for immigrant taxfilers.

    This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2018-01-08

  • Articles and reports: 18-001-X2017001
    Description:

    This working paper profiles Canadian firms involved in the development and production of Bioproducts. It provides data on the number and types of Bioproducts firms in 2015, covering bioproducts revenues, research and development, use of biomass, patents, products, business practices and the impact of government regulations on the sector.

    Release date: 2017-12-22

  • Journals and periodicals: 12-001-X
    Description:

    The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254871
    Description:

    In this paper the question is addressed how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated surveys with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. This model also makes it possible to utilize the higher frequency of the social media to produce more precise estimates for the sample survey in real time at the moment that statistics for the social media become available but the sample data are not yet available. The concept of cointegration is applied to address the question to which extent the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254872
    Description:

    This note discusses the theoretical foundations for the extension of the Wilson two-sided coverage interval to an estimated proportion computed from complex survey data. The interval is shown to be asymptotically equivalent to an interval derived from a logistic transformation. A mildly better version is discussed, but users may prefer constructing a one-sided interval already in the literature.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254888
    Description:

    We discuss developments in sample survey theory and methods covering the past 100 years. Neyman’s 1934 landmark paper laid the theoretical foundations for the probability sampling approach to inference from survey samples. Classical sampling books by Cochran, Deming, Hansen, Hurwitz and Madow, Sukhatme, and Yates, which appeared in the early 1950s, expanded and elaborated the theory of probability sampling, emphasizing unbiasedness, model free features, and designs that minimize variance for a fixed cost. During the period 1960-1970, theoretical foundations of inference from survey data received attention, with the model-dependent approach generating considerable discussion. Introduction of general purpose statistical software led to the use of such software with survey data, which led to the design of methods specifically for complex survey data. At the same time, weighting methods, such as regression estimation and calibration, became practical and design consistency replaced unbiasedness as the requirement for standard estimators. A bit later, computer-intensive resampling methods also became practical for large scale survey samples. Improved computer power led to more sophisticated imputation for missing data, use of more auxiliary data, some treatment of measurement errors in estimation, and more complex estimation procedures. A notable use of models was in the expanded use of small area estimation. Future directions in research and methods will be influenced by budgets, response rates, timeliness, improved data collection devices, and availability of auxiliary data, some of which will come from “Big Data”. Survey taking will be impacted by changing cultural behavior and by a changing physical-technical environment.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254887
    Description:

    This paper proposes a new approach to decompose the wage difference between men and women that is based on a calibration procedure. This approach generalizes two current decomposition methods that are re-expressed using survey weights. The first one is the Blinder-Oaxaca method and the second one is a reweighting method proposed by DiNardo, Fortin and Lemieux. The new approach provides a weighting system that enables us to estimate such parameters of interest like quantiles. An application to data from the Swiss Structure of Earnings Survey shows the interest of this method.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254895
    Description:

    This note by Graham Kalton presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254896
    Description:

    This note by Sharon L. Lohr presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254897
    Description:

    This note by Chris Skinner presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254894
    Description:

    This note by Danny Pfeffermann presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 11-626-X2017077
    Description:

    On April 13, 2017, the Government of Canada tabled legislation to legalize the recreational use of cannabis by adults. This will directly impact Canada’s statistical system. The focus of this Economic Insights article is to provide experimental estimates for the volume of cannabis consumption, based on existing information on the prevalence of cannabis use. The article presents experimental estimates of the number of tonnes of cannabis consumed by age group for the period from 1960 to 2015. The experimental estimates rely on survey data from multiple sources, statistical techniques to link the sources over time, and assumptions about consumption behaviour. They are subject to revision as improved or additional data sources become available.

    Release date: 2017-12-18

  • Articles and reports: 11F0019M2017399
    Description:

    Canada is a trading nation that produces significant quantities of resource outputs. Consequently, the behaviour of resource prices that are important for Canada is germane to understanding the progress of real income growth and the prosperity of the country and the provinces. Demand and supply shocks or changes in monetary policy in international markets may exert significant influence on resource prices, and their fluctuations constitute an important avenue for the transmission of external shocks into the domestic economy. This paper develops historical estimates of the Bank of Canada commodity price index (BCPI) and links them to modern estimates. Using a collection of historical data sources, it estimates weights and prices sufficiently consistently to merit the construction of long-run estimates that may be linked to the modern Fisher BCPI.

    Release date: 2017-10-11

  • Articles and reports: 13-605-X201700114840
    Description:

    Statistics Canada is presently preparing the statistical system to be able to gauge the impact of the transition from illegal to legal non-medical cannabis use and to shed light on the social and economic activities related to the use of cannabis thereafter. While the system of social statistics captures some information on the use of cannabis, updates will be required to more accurately measure health effects and the impact on the judicial system. Current statistical infrastructure used to more comprehensively measure the use and impacts of substances such as tobacco and alcohol could be adapted to do the same for cannabis. However, available economic statistics are largely silent on the role illegal drugs play in the economy. Both social and economic statistics will need to be updated to reflect the legalization of cannabis, and the challenge is especially great for economic statistics. This paper provides a summary of the work that is now under way toward these ends.

    Release date: 2017-09-28

  • Articles and reports: 11-633-X2017009
    Description:

    This document describes the procedures for using linked administrative data sources to estimate paid parental leave rates in Canada and the issues surrounding this use.

    Release date: 2017-08-29

  • Articles and reports: 11-633-X2017008
    Description:

    The DYSEM microsimulation modelling platform provides a demographic and socioeconomic core that can be readily built upon to develop custom dynamic microsimulation models or applications. This paper describes DYSEM and provides an overview of its intended uses, as well as the methods and data used in its development.

    Release date: 2017-07-28

  • Articles and reports: 12-001-X201700114817
    Description:

    We present research results on sample allocations for efficient model-based small area estimation in cases where the areas of interest coincide with the strata. Although model-assisted and model-based estimation methods are common in the production of small area statistics, the underlying model and estimation method are rarely taken into account in the sample area allocation scheme. Therefore, we have developed a new model-based allocation named g1-allocation. For comparison, one recently developed model-assisted allocation is presented. These two allocations are based on an adjusted measure of homogeneity which is computed using an auxiliary variable and is an approximation of the intra-class correlation within areas. Five model-free area allocation solutions presented in the past are selected from the literature as reference allocations. Equal and proportional allocations need the number of areas and area-specific numbers of basic statistical units. The Neyman, Bankier and NLP (Non-Linear Programming) allocations need values for the study variable concerning area level parameters such as standard deviation, coefficient of variation or totals. In general, allocation methods can be classified according to the optimization criteria and use of auxiliary data. Statistical properties of the various methods are assessed through sample simulation experiments using real population register data. It can be concluded from simulation results that inclusion of the model and estimation method into the allocation method improves estimation results.

    Release date: 2017-06-22
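
    To make two of the reference allocations named above concrete, here is a minimal sketch of proportional and Neyman allocation across strata. It is not the paper's g1-allocation; the function names and the sample figures are hypothetical and chosen only for illustration. Proportional allocation needs only the stratum sizes, while Neyman allocation also needs a standard deviation of the study variable in each stratum.

```python
def proportional_allocation(n, sizes):
    """Split a total sample of n across strata in proportion to stratum size N_h."""
    total = sum(sizes)
    return [n * N_h / total for N_h in sizes]


def neyman_allocation(n, sizes, std_devs):
    """Split a total sample of n in proportion to N_h * S_h.

    std_devs holds an assumed standard deviation S_h of the study
    variable in each stratum (illustrative input, not from the paper).
    """
    weights = [N_h * S_h for N_h, S_h in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


if __name__ == "__main__":
    sizes = [500, 300, 200]      # hypothetical stratum population sizes
    std_devs = [10.0, 20.0, 5.0]  # hypothetical stratum standard deviations
    print(proportional_allocation(100, sizes))
    print(neyman_allocation(100, sizes, std_devs))
```

    The results are unrounded; in practice the stratum sample sizes would be rounded and capped at the stratum population sizes.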

  • Articles and reports: 12-001-X201700114818
    Description:

    The protection of data confidentiality in tables of magnitude can become extremely difficult when working in a custom tabulation environment. A relatively simple solution consists of perturbing the underlying microdata beforehand, but the negative impact on the accuracy of aggregates can be too high. A perturbative method is proposed that aims to better balance the needs of data protection and data accuracy in such an environment. The method works by processing the data in each cell in layers, applying higher levels of perturbation for the largest values and little or no perturbation for the smallest ones. The method is primarily aimed at protecting personal data, which tend to be less skewed than business data.

    Release date: 2017-06-22

  • Articles and reports: 12-001-X201700114836
    Description:

    Web-push survey data collection that uses mail contact to request responses over the Internet, while withholding alternative answering modes until later in the implementation process, has developed rapidly over the past decade. This paper describes the reasons this innovative mixing of survey contact and response modes was needed, the primary ones being the declining effectiveness of voice telephone and slower than expected development of email/web only data collection methods. Historical and institutional barriers to mixing survey modes in this manner are also discussed. Essential research on the use of U.S. Postal address lists and the effects of aural and visual communication on survey measurement is then described, followed by a discussion of experimental efforts to create a viable web-push methodology as an alternative to voice telephone and mail response surveys. Multiple examples of current and anticipated web-push data collection uses are provided. This paper ends with a discussion of both the great promise and significant challenge presented by greater reliance on web-push survey methods.

    Release date: 2017-06-22

  • Articles and reports: 12-001-X201700114822
    Description:

    We use a Bayesian method to infer about a finite population proportion when binary data are collected using a two-fold sample design from small areas. The two-fold sample design has a two-stage cluster sample design within each area. A former hierarchical Bayesian model assumes that for each area the first stage binary responses are independent Bernoulli distributions, and the probabilities have beta distributions which are parameterized by a mean and a correlation coefficient. The means vary with areas but the correlation is the same over areas. However, to gain some flexibility we have now extended this model to accommodate different correlations. The means and the correlations have independent beta distributions. We call the former model a homogeneous model and the new model a heterogeneous model. All hyperparameters have proper noninformative priors. An additional complexity is that some of the parameters are weakly identified making it difficult to use a standard Gibbs sampler for computation. So we have used unimodal constraints for the beta prior distributions and a blocked Gibbs sampler to perform the computation. We have compared the heterogeneous and homogeneous models using an illustrative example and simulation study. As expected, the two-fold model with heterogeneous correlations is preferred.

    Release date: 2017-06-22

  • Articles and reports: 12-001-X201700114823
    Description:

    The derivation of estimators in a multi-phase calibration process requires a sequential computation of estimators and calibrated weights of previous phases in order to obtain those of later ones. Already after two phases of calibration the estimators and their variances involve calibration factors from both phases and the formulae become cumbersome and uninformative. As a consequence, the literature so far deals mainly with two phases, while three phases or more are rarely considered. The analysis in some cases is ad hoc for a specific design, and no comprehensive methodology has been developed for constructing calibrated estimators and, more challengingly, estimating their variances in three or more phases. We provide a closed form formula for the variance of multi-phase calibrated estimators that holds for any number of phases. By specifying a new presentation of multi-phase calibrated weights it is possible to construct calibrated estimators that have the form of multivariate regression estimators, which enables computation of a consistent estimator for their variance. This new variance estimator is not only general for any number of phases but also has some favorable characteristics. A comparison to other estimators in the special case of two-phase calibration and another independent study for three phases are presented.

    Release date: 2017-06-22

  • Articles and reports: 12-001-X201700114819
    Description:

    Structural time series models are a powerful technique for variance reduction in the framework of small area estimation (SAE) based on repeatedly conducted surveys. Statistics Netherlands implemented a structural time series model to produce monthly figures about the labour force with the Dutch Labour Force Survey (DLFS). Such models, however, contain unknown hyperparameters that have to be estimated before the Kalman filter can be launched to estimate state variables of the model. This paper describes a simulation aimed at studying the properties of hyperparameter estimators in the model. Simulating distributions of the hyperparameter estimators under different model specifications complements standard model diagnostics for state space models. Uncertainty around the model hyperparameters is another major issue. To account for hyperparameter uncertainty in the mean squared errors (MSE) estimates of the DLFS, several estimation approaches known in the literature are considered in a simulation. Apart from the MSE bias comparison, this paper also provides insight into the variances and MSEs of the MSE estimators considered.

    Release date: 2017-06-22

Reference (44) (25 of 44 results)

  • Index and guides: 98-500-X
    Description:

    Provides information that enables users to effectively use, apply and interpret data from the Census of Population. Each guide contains definitions and explanations on census concepts as well as a data quality and historical comparability section. Additional information will be included for specific variables to help users better understand the concepts and questions used in the census.

    Release date: 2017-11-29

  • Technical products: 84-538-X
    Description:

    This document presents the methodology underlying the production of the life tables for Canada, provinces and territories, from reference period 1980/1982 and onward.

    Release date: 2017-11-16

  • Technical products: 12-206-X
    Description:

    This report summarizes the achievements of the program sponsored by the three methodology divisions of Statistics Canada. This program covers research and development activities in statistical methods with potentially broad application in the Agency's survey programs, which would not otherwise have been carried out during the provision of methodology services to those survey programs. They also include tasks that provided client support in the application of past successful developments in order to promote the utilization of the results of research and development work.

    Release date: 2017-11-03

  • Index and guides: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • Technical products: 12-586-X
    Description:

    The Quality Assurance Framework (QAF) serves as the highest-level governance tool for quality management at Statistics Canada. The QAF gives an overview of the quality management and risk mitigation strategies used by the Agency’s program areas. The QAF is used in conjunction with Statistics Canada management practices, such as those described in the Quality Guidelines.

    Release date: 2017-04-21

  • Technical products: 91-621-X2017001
    Release date: 2017-01-25

  • Technical products: 75F0002M
    Description:

    This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.

    Release date: 2016-07-08

  • Technical products: 75F0002M2016003
    Description:

    Periodically, income statistics are updated to reflect the most recent population estimates from the Census. Accordingly, with the release of the 2014 data from the Canadian Income Survey, Statistics Canada has revised estimates for 2006 to 2013 using new population totals from the 2011 Census. This paper provides unrevised estimates alongside revised estimates for key income series, indicating where the revisions were significant.

    Release date: 2016-07-08

  • Technical products: 11-522-X
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.

    Release date: 2016-03-24

  • Technical products: 91-528-X
    Description:

    This manual provides detailed descriptions of the data sources and methods used by Statistics Canada to estimate population. They comprise postcensal and intercensal population estimates; base population; births and deaths; immigration; emigration; non-permanent residents; interprovincial migration; subprovincial estimates of population; population estimates by age, sex and marital status; and census family estimates. A glossary of principal terms is contained at the end of the manual, followed by the standard notation used.

    Until now, literature on the methodological changes for estimates calculations has always been spread throughout various Statistics Canada publications and background papers. This manual provides users of demographic statistics with a comprehensive compilation of the current procedures used by Statistics Canada to prepare population and family estimates.

    Release date: 2016-03-03

  • Classification: 12-603-X
    Description:

    Canadian Classification of Institutional Units and Sectors (CCIUS) 2012 is the departmental standard for classifying institutional units and sectors. This classification is used for economic statistics and includes definitions for its 171 classes. CCIUS 2012 was developed as a result of the implementation of international recommendations published in the 2008 System of National Accounts manual (SNA 2008).

    Release date: 2016-02-11

  • Technical products: 75F0002M2015003
    Description:

    This note discusses revised income estimates from the Survey of Labour and Income Dynamics (SLID). These revisions to the SLID estimates make it possible to compare results from the Canadian Income Survey (CIS) to earlier years. The revisions address the issue of methodology differences between SLID and CIS.

    Release date: 2015-12-17

  • Technical products: 91-621-X2015001
    Release date: 2015-09-17

  • Technical products: 12-002-X
    Description:

    The Research Data Centres (RDCs) Information and Technical Bulletin (ITB) is a forum by which Statistics Canada analysts and the research community can inform each other on survey data uses and methodological techniques. Articles in the ITB focus on data analysis and modelling, data management, and best or ineffective statistical, computational, and scientific practices. Further, ITB topics will include essays on data content, implications of questionnaire wording, comparisons of datasets, reviews on methodologies and their application, data peculiarities, problematic data and solutions, and explanations of innovative tools using RDC surveys and relevant software. All of these essays may provide advice and detailed examples outlining commands, habits, tricks and strategies used to make problem-solving easier for the RDC user.

    The main aims of the ITB are:

    - the advancement and dissemination of knowledge surrounding Statistics Canada's data;
    - the exchange of ideas among the RDC-user community;
    - the support of new users;
    - the co-operation with subject matter experts and divisions within Statistics Canada.

    The ITB is interested in quality articles that are worth publicizing throughout the research community, and that will add value to the quality of research produced at Statistics Canada's RDCs.

    Release date: 2015-03-25

  • Technical products: 12-002-X201500114147
    Description:

    Influential observations in logistic regression are those that have a notable effect on certain aspects of the model fit. Large sample size alone does not eliminate this concern; it is still important to examine potentially influential observations, especially in complex survey data. This paper describes a straightforward algorithm for examining potentially influential observations in complex survey data using SAS software. This algorithm was applied in a study using the 2005 Canadian Community Health Survey that examined factors associated with family physician utilization for adolescents.

    Release date: 2015-03-25

  • Index and guides: 99-002-X
    Description:

    This report describes sampling and weighting procedures used in the 2011 National Household Survey. It provides operational and theoretical justifications for them, and presents the results of the evaluation studies of these procedures.

    Release date: 2015-01-28

  • Technical products: 12-002-X201400111901
    Description:

    This document is for analysts/researchers who are considering doing research with data from a survey where both survey weights and bootstrap weights are provided in the data files. This document gives directions, for some selected software packages, about how to get started in using survey weights and bootstrap weights for an analysis of survey data. We give brief directions for obtaining survey-weighted estimates, bootstrap variance estimates (and other desired error quantities) and some typical test statistics for each software package in turn. While these directions are provided just for the chosen examples, there will be information about the range of weighted and bootstrapped analyses that can be carried out by each software package.

    Release date: 2014-08-07
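
    As a rough illustration of the workflow described above, the following minimal sketch computes a survey-weighted mean and a bootstrap variance estimate from a set of bootstrap weight replicates. It is not taken from the document: the function names and data are hypothetical, and it assumes the simplest form of the estimator (the average squared deviation of the replicate estimates from the full-sample estimate). Some surveys require an extra adjustment factor, so the survey's own documentation should be checked.

```python
import math


def weighted_mean(values, weights):
    """Survey-weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_variance(values, survey_weights, bootstrap_weights):
    """Bootstrap variance of the weighted mean.

    bootstrap_weights is a list of B replicate weight vectors, one
    weight per respondent in each vector (hypothetical layout).
    """
    full_estimate = weighted_mean(values, survey_weights)
    replicate_estimates = [weighted_mean(values, bw) for bw in bootstrap_weights]
    squared_devs = [(r - full_estimate) ** 2 for r in replicate_estimates]
    return sum(squared_devs) / len(replicate_estimates)


if __name__ == "__main__":
    values = [1.0, 2.0, 3.0]          # hypothetical responses
    survey_weights = [1.0, 1.0, 1.0]  # hypothetical final survey weights
    bootstrap_weights = [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]  # B = 2 replicates
    variance = bootstrap_variance(values, survey_weights, bootstrap_weights)
    print(weighted_mean(values, survey_weights), variance, math.sqrt(variance))
```

    The square root of the variance gives a standard error, from which a coefficient of variation or confidence interval can be derived.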

  • Index and guides: 99-001-X2011001
    Description:

    The National Household Survey User Guide is a reference document that describes the various phases of the National Household Survey (NHS). It provides an overview of the 2011 NHS content, sampling design and collection, data processing, data quality assessment and data dissemination. The National Household Survey User Guide may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the 2011 NHS.

    Release date: 2013-05-08

  • Technical products: 75F0002M2012003
    Description:

    The release of the 2010 Survey of Labour and Income Dynamics (SLID) data coincided with a historical revision of the 2006 to 2009 results. The survey weights were updated to take into account new population estimates based on the 2006 Census rather than the 2001 Census. This paper presents a summary of the impact of this revision on the 2006-2009 survey estimates.

    Release date: 2012-11-01

  • Technical products: 12-002-X201200111642
    Description:

    It is generally recommended that weighted estimation approaches be used when analyzing data from a long-form census microdata file. Since such data files are now available in the RDCs, there is a need to provide researchers there with more information about doing weighted estimation with these files. The purpose of this paper is to provide some of this information - in particular, how the weight variables were derived for the census microdata files and what weight should be used for different units of analysis. For the 1996, 2001 and 2006 censuses the same weight variable is appropriate regardless of whether people, families or households are being studied. For the 1991 census, recommendations are more complex: a different weight variable is required for households than for people and families, and additional restrictions apply to obtain the correct weight value for families.

    Release date: 2012-10-25

  • Technical products: 11-522-X2010000
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987. The Symposium 2010 is entitled "Social Statistics: The Interplay among Censuses, Surveys and Administrative Data".

    Release date: 2011-09-15

  • Surveys and statistical programs – Documentation: 62F0026M2011001
    Description:

    This report describes the quality indicators produced for the 2009 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2011-06-16

  • Technical products: 12-587-X
    Description:

    This publication shows readers how to design and conduct a census or sample survey. It explains basic survey concepts and provides information on how to create efficient and high quality surveys. It is aimed at those involved in planning, conducting or managing a survey and at students of survey design courses.

    This book contains the following information:

    - how to plan and manage a survey;
    - how to formulate the survey objectives and design a questionnaire;
    - things to consider when determining a sample design (choosing between a sample or a census, defining the survey population, choosing a survey frame, identifying possible sources of survey error);
    - choosing a method of collection (self-enumeration, personal interviews or telephone interviews; computer-assisted versus paper-based questionnaires);
    - organizing and conducting data collection operations;
    - determining the sample size, allocating the sample across strata and selecting the sample;
    - methods of point estimation and variance estimation, and data analysis;
    - the use of administrative data, particularly during the design and estimation phases;
    - how to process the data (which consists of all data handling activities between collection and estimation) and use quality control and quality assurance measures to minimize and control errors during various survey steps; and
    - disclosure control and data dissemination.

    This publication also includes a case study that illustrates the steps in developing a household survey, using the methods and principles presented in the book. This publication was previously only available in print format and originally published in 2003.

    Release date: 2010-09-27

  • Surveys and statistical programs – Documentation: 62F0026M2010002
    Description:

    This report describes the quality indicators produced for the 2005 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2010-04-26
