Statistical methods

Key indicators

Changing any selection will automatically update the page content.

Selected geographical area: Canada

Selected geographical area: Newfoundland and Labrador

Selected geographical area: Prince Edward Island

Selected geographical area: Nova Scotia

Selected geographical area: New Brunswick

Selected geographical area: Quebec

Selected geographical area: Ontario

Selected geographical area: Manitoba

Selected geographical area: Saskatchewan

Selected geographical area: Alberta

Selected geographical area: British Columbia

Selected geographical area: Yukon

Selected geographical area: Northwest Territories

Selected geographical area: Nunavut

Sort Help
entries

Results

All (2,296)

All (2,296) (0 to 10 of 2,296 results)

  • Articles and reports: 11-522-X202200100001
    Description: Record linkage aims at identifying record pairs related to the same unit and observed in two different data sets, say A and B. Fellegi and Sunter (1969) suggest each record pair is tested whether generated from the set of matched or unmatched pairs. The decision function consists of the ratio between m(y) and u(y),probabilities of observing a comparison y of a set of k>3 key identifying variables in a record pair under the assumptions that the pair is a match or a non-match, respectively. These parameters are usually estimated by means of the EM algorithm using as data the comparisons on all the pairs of the Cartesian product ?=A×B. These observations (on the comparisons and on the pairs status as match or non-match) are assumed as generated independently of other pairs, assumption characterizing most of the literature on record linkage and implemented in software tools (e.g. RELAIS, Cibella et al. 2012). On the contrary, comparisons y and matching status in ? are deterministically dependent. As a result, estimates on m(y) and u(y) based on the EM algorithm are usually bad. This fact jeopardizes the effective application of the Fellegi-Sunter method, as well as automatic computation of quality measures and possibility to apply efficient methods for model estimation on linked data (e.g. regression functions), as in Chambers et al. (2015). We propose to explore ? by a set of samples, each one drawn so to preserve independence of comparisons among the selected record pairs. Simulations are encouraging.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100002
    Description: The authors used the Splink probabilistic linkage package developed by the UK Ministry of Justice, to link census data from England and Wales to itself to find duplicate census responses. A large gold standard of confirmed census duplicates was available meaning that the results of the Splink implementation could be quality assured. This paper describes the implementation and features of Splink, gives details of the settings and parameters that we used to tune Splink for our particular project, and gives the results that we obtained.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100003
    Description: Estimation at fine levels of aggregation is necessary to better describe society. Small area estimation model-based approaches that combine sparse survey data with rich data from auxiliary sources have been proven useful to improve the reliability of estimates for small domains. Considered here is a scenario where small area model-based estimates, produced at a given aggregation level, needed to be disaggregated to better describe the social structure at finer levels. For this scenario, an allocation method was developed to implement the disaggregation, overcoming challenges associated with data availability and model development at such fine levels. The method is applied to adult literacy and numeracy estimation at the county-by-group-level, using data from the U.S. Program for the International Assessment of Adult Competencies. In this application the groups are defined in terms of age or education, but the method could be applied to estimation of other equity-deserving groups.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100004
    Description: In accordance with Statistics Canada’s long-term Disaggregated Data Action Plan (DDAP), several initiatives have been implemented into the Labour Force Survey (LFS). One of the more direct initiatives was a targeted increase in the size of the monthly LFS sample. Furthermore, a regular Supplement program was introduced, where an additional series of questions are asked to a subset of LFS respondents and analyzed in a monthly or quarterly production cycle. Finally, the production of modelled estimates based on Small Area Estimation (SAE) methodologies resumed for the LFS and will include a wider scope with more analytical value than what had existed in the past. This paper will give an overview of these three initiatives.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100005
    Description: Sampling variance smoothing is an important topic in small area estimation. In this paper, we propose sampling variance smoothing methods for small area proportion estimation. In particular, we consider the generalized variance function and design effect methods for sampling variance smoothing. We evaluate and compare the smoothed sampling variances and small area estimates based on the smoothed variance estimates through analysis of survey data from Statistics Canada. The results from real data analysis indicate that the proposed sampling variance smoothing methods work very well for small area estimation.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100006
    Description: The Australian Bureau of Statistics (ABS) is committed to improving access to more microdata, while ensuring privacy and confidentiality is maintained, through its virtual DataLab which supports researchers to undertake complex research more efficiently. Currently, the DataLab research outputs need to follow strict rules to minimise disclosure risks for clearance. However, the clerical-review process is not cost effective and has potential to introduce errors. The increasing number of statistical outputs from different projects can potentially introduce differencing risks even though these outputs from different projects have met the strict output rules. The ABS has been exploring the possibility of providing automatic output checking using the ABS cellkey methodology to ensure that all outputs across different projects are protected consistently to minimise differencing risks and reduce costs associated with output checking.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100007
    Description: With the availability of larger and more diverse data sources, Statistical Institutes in Europe are inclined to publish statistics on smaller groups than they used to do. Moreover, high impact global events like the Covid crisis and the situation in Ukraine may also ask for statistics on specific subgroups of the population. Publishing on small, targeted groups not only raises questions on statistical quality of the figures, it also raises issues concerning statistical disclosure risk. The principle of statistical disclosure control does not depend on the size of the groups the statistics are based on. However, the risk of disclosure does depend on the group size: the smaller a group, the higher the risk. Traditional ways to deal with statistical disclosure control and small group sizes include suppressing information and coarsening categories. These methods essentially increase the (mean) group sizes. More recent approaches include perturbative methods that have the intention to keep the group sizes small in order to preserve as much information as possible while reducing the disclosure risk sufficiently. In this paper we will mention some European examples of special focus group statistics and discuss the implications on statistical disclosure control. Additionally, we will discuss some issues that the use of perturbative methods brings along: its impact on disclosure risk and utility as well as the challenges in proper communication thereof.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100008
    Description: The publication of more disaggregated data can increase transparency and provide important information on underrepresented groups. Developing more readily available access options increases the amount of information available to and produced by researchers. Increasing the breadth and depth of the information released allows for a better representation of the Canadian population, but also puts a greater responsibility on Statistics Canada to do this in a way that preserves confidentiality, and thus it is helpful to develop tools which allow Statistics Canada to quantify the risk from the additional data granularity. In an effort to evaluate the risk of a database reconstruction attack on Statistics Canada’s published Census data, this investigation follows the strategy of the US Census Bureau, who outlined a method to use a Boolean satisfiability (SAT) solver to reconstruct individual attributes of residents of a hypothetical US Census block, based just on a table of summary statistics. The technique is expanded to attempt to reconstruct a small fraction of Statistics Canada’s Census microdata. This paper will discuss the findings of the investigation, the challenges involved in mounting a reconstruction attack, and the effect of an existing confidentiality measure in mitigating these attacks. Furthermore, the existing strategy is compared to other potential methods used to protect data – in particular, releasing tabular data perturbed by some random mechanism, such as those suggested by differential privacy.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100009
    Description: Education and training is acknowledged as fundamental for the development of a society. It is a complex multidimensional phenomenon, which determinants are ascribable to several interrelated familiar and socio-economic conditions. To respond to the demand of supporting statistical information for policymaking and its monitoring and evaluation process, the Italian National Statistical Institute (Istat) is renewing the education and training statistical production system, implementing a new thematic statistical register. It will be part of the Istat Integrated System of Registers, thus allowing relating the education and training phenomenon to other relevant phenomena, e.g. transition to work.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25
Data (9)

Data (9) ((9 results))

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2024-02-02

  • Profile of a community or region: 46-26-0002
    Description: The National Address Register (NAR) is a list of commercial and residential addresses in Canada that are extracted from Statistics Canada's Building Register and deemed non-confidential.
    Release date: 2023-12-19

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo enabled data holdings of Statistics Canada at various levels of geography including at the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2023-01-24

  • Table: 11-10-0074-01
    Geography: Census tract
    Frequency: Occasional
    Description:

    The divergence index (D-index) describes the degree that families with different income levels are mixing together in neighbourhoods. It compares neighbourhood (census tract, CT) discrete income distributions to a base distribution, which is the income quintiles of the neighbourhood’s census metropolitan area (CMA).

    Release date: 2020-06-22

  • Data Visualization: 71-607-X2019010
    Description: The Housing Data Viewer is a visualization tool that allows users to explore Statistics Canada data on a map. Users can use the tool to navigate, compare and export data.
    Release date: 2019-10-30

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

  • Table: 13-220-X
    Description: In the 1997 edition, new and revised benchmarks were introduced for 1992 and 1988. The indicators are used to monitor supply, demand and employment for tourism in Canada on a timely basis. The annual tables are derived using the National Income and Expenditure Accounts (NIEA) and various industry and travel surveys. Tables providing actual data and percentage changes, for seasonally adjusted current and constant price estimates are included. In addition, an analytical section provides graphs, and time series of first differences, percentage changes, and seasonal factors for selected indicators. Data are published from 1987 and the publication will be available on the day of release. New data are included in the demand tables for non-tourism commodities produced by non-tourism industries and in the employment tables covering direct tourism employment generated by non-tourism industries. This product was commissioned by the Canadian Tourism Commission to provide annual updates for the Tourism Satellite Account.
    Release date: 2003-01-08

  • Table: 11-516-X
    Description:

    The second edition of Historical statistics of Canada was jointly produced by the Social Science Federation of Canada and Statistics Canada in 1983. This volume contains about 1,088 statistical tables on the social, economic and institutional conditions of Canada from the start of Confederation in 1867 to the mid-1970s. The tables are arranged in sections with an introduction explaining the content of each section, the principal sources of data for each table, and general explanatory notes regarding the statistics. In most cases, there is sufficient description of the individual series to enable the reader to use them without consulting the numerous basic sources referenced in the publication.

    The electronic version of this historical publication is accessible on the Internet site of Statistics Canada as a free downloadable document: text as HTML pages and all tables as individual spreadsheets in a comma delimited format (CSV) (which allows online viewing or downloading).

    Release date: 1999-07-29

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29
Analysis (1,872)

Analysis (1,872) (0 to 10 of 1,872 results)

  • Articles and reports: 11-522-X202200100001
    Description: Record linkage aims at identifying record pairs related to the same unit and observed in two different data sets, say A and B. Fellegi and Sunter (1969) suggest each record pair is tested whether generated from the set of matched or unmatched pairs. The decision function consists of the ratio between m(y) and u(y),probabilities of observing a comparison y of a set of k>3 key identifying variables in a record pair under the assumptions that the pair is a match or a non-match, respectively. These parameters are usually estimated by means of the EM algorithm using as data the comparisons on all the pairs of the Cartesian product ?=A×B. These observations (on the comparisons and on the pairs status as match or non-match) are assumed as generated independently of other pairs, assumption characterizing most of the literature on record linkage and implemented in software tools (e.g. RELAIS, Cibella et al. 2012). On the contrary, comparisons y and matching status in ? are deterministically dependent. As a result, estimates on m(y) and u(y) based on the EM algorithm are usually bad. This fact jeopardizes the effective application of the Fellegi-Sunter method, as well as automatic computation of quality measures and possibility to apply efficient methods for model estimation on linked data (e.g. regression functions), as in Chambers et al. (2015). We propose to explore ? by a set of samples, each one drawn so to preserve independence of comparisons among the selected record pairs. Simulations are encouraging.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100002
    Description: The authors used the Splink probabilistic linkage package developed by the UK Ministry of Justice, to link census data from England and Wales to itself to find duplicate census responses. A large gold standard of confirmed census duplicates was available meaning that the results of the Splink implementation could be quality assured. This paper describes the implementation and features of Splink, gives details of the settings and parameters that we used to tune Splink for our particular project, and gives the results that we obtained.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100003
    Description: Estimation at fine levels of aggregation is necessary to better describe society. Small area estimation model-based approaches that combine sparse survey data with rich data from auxiliary sources have been proven useful to improve the reliability of estimates for small domains. Considered here is a scenario where small area model-based estimates, produced at a given aggregation level, needed to be disaggregated to better describe the social structure at finer levels. For this scenario, an allocation method was developed to implement the disaggregation, overcoming challenges associated with data availability and model development at such fine levels. The method is applied to adult literacy and numeracy estimation at the county-by-group-level, using data from the U.S. Program for the International Assessment of Adult Competencies. In this application the groups are defined in terms of age or education, but the method could be applied to estimation of other equity-deserving groups.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100004
    Description: In accordance with Statistics Canada’s long-term Disaggregated Data Action Plan (DDAP), several initiatives have been implemented into the Labour Force Survey (LFS). One of the more direct initiatives was a targeted increase in the size of the monthly LFS sample. Furthermore, a regular Supplement program was introduced, where an additional series of questions are asked to a subset of LFS respondents and analyzed in a monthly or quarterly production cycle. Finally, the production of modelled estimates based on Small Area Estimation (SAE) methodologies resumed for the LFS and will include a wider scope with more analytical value than what had existed in the past. This paper will give an overview of these three initiatives.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100005
    Description: Sampling variance smoothing is an important topic in small area estimation. In this paper, we propose sampling variance smoothing methods for small area proportion estimation. In particular, we consider the generalized variance function and design effect methods for sampling variance smoothing. We evaluate and compare the smoothed sampling variances and small area estimates based on the smoothed variance estimates through analysis of survey data from Statistics Canada. The results from real data analysis indicate that the proposed sampling variance smoothing methods work very well for small area estimation.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100006
    Description: The Australian Bureau of Statistics (ABS) is committed to improving access to more microdata, while ensuring privacy and confidentiality is maintained, through its virtual DataLab which supports researchers to undertake complex research more efficiently. Currently, the DataLab research outputs need to follow strict rules to minimise disclosure risks for clearance. However, the clerical-review process is not cost effective and has potential to introduce errors. The increasing number of statistical outputs from different projects can potentially introduce differencing risks even though these outputs from different projects have met the strict output rules. The ABS has been exploring the possibility of providing automatic output checking using the ABS cellkey methodology to ensure that all outputs across different projects are protected consistently to minimise differencing risks and reduce costs associated with output checking.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100007
    Description: With the availability of larger and more diverse data sources, Statistical Institutes in Europe are inclined to publish statistics on smaller groups than they used to do. Moreover, high impact global events like the Covid crisis and the situation in Ukraine may also ask for statistics on specific subgroups of the population. Publishing on small, targeted groups not only raises questions on statistical quality of the figures, it also raises issues concerning statistical disclosure risk. The principle of statistical disclosure control does not depend on the size of the groups the statistics are based on. However, the risk of disclosure does depend on the group size: the smaller a group, the higher the risk. Traditional ways to deal with statistical disclosure control and small group sizes include suppressing information and coarsening categories. These methods essentially increase the (mean) group sizes. More recent approaches include perturbative methods that have the intention to keep the group sizes small in order to preserve as much information as possible while reducing the disclosure risk sufficiently. In this paper we will mention some European examples of special focus group statistics and discuss the implications on statistical disclosure control. Additionally, we will discuss some issues that the use of perturbative methods brings along: its impact on disclosure risk and utility as well as the challenges in proper communication thereof.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100008
    Description: The publication of more disaggregated data can increase transparency and provide important information on underrepresented groups. Developing more readily available access options increases the amount of information available to and produced by researchers. Increasing the breadth and depth of the information released allows for a better representation of the Canadian population, but also puts a greater responsibility on Statistics Canada to do this in a way that preserves confidentiality, and thus it is helpful to develop tools which allow Statistics Canada to quantify the risk from the additional data granularity. In an effort to evaluate the risk of a database reconstruction attack on Statistics Canada’s published Census data, this investigation follows the strategy of the US Census Bureau, who outlined a method to use a Boolean satisfiability (SAT) solver to reconstruct individual attributes of residents of a hypothetical US Census block, based just on a table of summary statistics. The technique is expanded to attempt to reconstruct a small fraction of Statistics Canada’s Census microdata. This paper will discuss the findings of the investigation, the challenges involved in mounting a reconstruction attack, and the effect of an existing confidentiality measure in mitigating these attacks. Furthermore, the existing strategy is compared to other potential methods used to protect data – in particular, releasing tabular data perturbed by some random mechanism, such as those suggested by differential privacy.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100009
    Description: Education and training is acknowledged as fundamental for the development of a society. It is a complex multidimensional phenomenon, which determinants are ascribable to several interrelated familiar and socio-economic conditions. To respond to the demand of supporting statistical information for policymaking and its monitoring and evaluation process, the Italian National Statistical Institute (Istat) is renewing the education and training statistical production system, implementing a new thematic statistical register. It will be part of the Istat Integrated System of Registers, thus allowing relating the education and training phenomenon to other relevant phenomena, e.g. transition to work.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25
Reference (363)

Reference (363) (0 to 10 of 363 results)

  • Notices and consultations: 13-605-X
    Description: This product contains articles related to the latest methodological, conceptual developments in the Canadian System of Macroeconomic Accounts as well as the analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products, provides updates and supplements to information found in various guides and analytical articles touching upon a broad range of topics related to the Canadian economy.
    Release date: 2024-02-29

  • Surveys and statistical programs – Documentation: 32-26-0007
    Description: Census of Agriculture data provide statistical information on farms and farm operators at fine geographic levels and for small subpopulations. Quality evaluation activities are essential to ensure that census data are reliable and that they meet user needs.

    This report provides data quality information pertaining to the Census of Agriculture, such as sources of error, error detection, disclosure control methods, data quality indicators, response rates and collection rates.
    Release date: 2024-02-06

  • Surveys and statistical programs – Documentation: 75-005-M2023001
    Description: This document provides information on the evolution of response rates for the Labour Force Survey (LFS) and a discussion of the evaluation of two aspects of data quality that ensure the LFS estimates continue providing an accurate portrait of the Canadian labour market.
    Release date: 2023-10-30

  • Surveys and statistical programs – Documentation: 98-306-X
    Description:

    This report describes sampling, weighting and estimation procedures used in the Census of Population. It provides operational and theoretical justifications for them, and presents the results of the evaluations of these procedures.

    Release date: 2023-10-04

  • Surveys and statistical programs – Documentation: 84-538-X
    Geography: Canada
    Description: This electronic publication presents the methodology underlying the production of the life tables for Canada, provinces and territories.
    Release date: 2023-08-28

  • Surveys and statistical programs – Documentation: 32-26-0006
    Description: This report provides data quality information pertaining to the Agriculture–Population Linkage, such as sources of error, matching process, response rates, imputation rates, sampling, weighting, disclosure control methods and data quality indicators.
    Release date: 2023-08-25

  • Surveys and statistical programs – Documentation: 75-514-G
    Description: The Guide to the Job Vacancy and Wage Survey contains a dictionary of concepts and definitions, and covers topics such as survey methodology, data collection, processing, and data quality. The guide covers both components of the survey: the job vacancy component, which is quarterly, and the wage component, which is annual.
    Release date: 2023-05-25

  • Surveys and statistical programs – Documentation: 32-26-0002
    Description:

    This reference guide may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the Census of Agriculture.

    It provides an overview of the Census of Agriculture communications, content determination, collection, processing, data quality evaluation and dissemination activities. It also summarizes the key changes to the census and other useful information.

    Release date: 2022-04-14

  • Geographic files and documentation: 12-572-X
    Description:

    The Standard Geographical Classification (SGC) provides a systematic classification structure that categorizes all of the geographic area of Canada. The SGC is the official classification used in the Census of Population and other Statistics Canada surveys.

    The classification is organized in two volumes: Volume I, The Classification and Volume II, Reference Maps.

    Volume II contains reference maps showing boundaries, names, codes and locations of the geographic areas in the classification. The reference maps show census subdivisions, census divisions, census metropolitan areas, census agglomerations, census metropolitan influenced zones and economic regions. Definitions for these terms are found in Volume I, The Classification. Volume I describes the classification and related standard geographic areas and place names.

    The maps in Volume II can be downloaded in PDF format from our website.

    Release date: 2022-02-09

  • Surveys and statistical programs – Documentation: 12-004-X
    Description:

    Statistics: Power from Data! is a web resource that was created in 2001 to assist secondary students and teachers of Mathematics and Information Studies in getting the most from statistics. Over the past 20 years, this product has become one of Statistics Canada most popular references for students, teachers, and many other members of the general population. This product was last updated in 2021.

    Release date: 2021-09-02

Browse our partners page to find a complete list of our partners and their associated products.

Date modified: