Data analysis


Results

All (38)

All (38) (0 to 10 of 38 results)

  • Stats in brief: 89-20-00062023001
    Description: This course is intended for Government of Canada employees who would like to learn about evaluating the quality of data for a particular use. Whether you are a new employee interested in learning the basics, or an experienced subject matter expert looking to refresh your skills, this course is here to help.
    Release date: 2023-07-17

  • Articles and reports: 11-522-X202100100029
    Description:

    In line with the path taken by the European Statistical System, Istat is investing in innovative methods to harness Big Data sources and to use them for the production of new and enriched Official Statistics products. Big Data sources are not, in general, directly tractable with traditional statistical techniques; consider specific data types such as images and texts, which are examples of the Variety dimension of Big Data. This motivates and justifies the growing interest of National Statistical Institutes in data science techniques. Istat is currently using data science techniques, including machine learning techniques, in innovation projects and for the publication of experimental statistics. This paper will provide an overview of the main current projects by Istat and will focus on two specific Big Data-based production pipelines, related to the processing of text sources and imagery sources, respectively. The paper will highlight the main challenges of these two pipelines and the solutions put in place to address them.

    Key Words: Machine Learning; Text Processing; Image Processing; Big Data

    Release date: 2021-11-05

  • Articles and reports: 11-522-X202100100027
    Description:

    Privacy concerns are a barrier to applying remote analytics, including machine learning, on sensitive data via the cloud. In this work, we use a leveled fully Homomorphic Encryption scheme to train an end-to-end supervised machine learning algorithm to classify texts while protecting the privacy of the input data points. We train our single-layer neural network on a large simulated dataset, providing a practical solution to a real-world multi-class text classification task. To improve both accuracy and training time, we train an ensemble of such classifiers in parallel using ciphertext packing.

    Key Words: Privacy Preservation, Machine Learning, Encryption

    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100022
    Description:

    I provide an overview of the evolution of Statistical Disclosure Control (SDC) research over the last decades and how it has evolved to handle the data revolution with more formal definitions of privacy. I emphasize the many contributions by Chris Skinner in the research areas of SDC. I will review his seminal research, starting in the 1990s with his work on the release of UK Census sample microdata. This led to a wide range of research on measuring the risk of re-identification in survey microdata through probabilistic models. I also focus on other aspects of Chris’ research in SDC. Chris was the recipient of the 2019 Waksberg Award and sadly never got a chance to present his Waksberg Lecture at the Statistics Canada International Methodology Symposium. This paper follows the outline that Chris had prepared for that lecture, which was provided to me by his son, Tom Skinner.

    Key Words: Risk of Re-identification, Data Revolution, Privacy Models, Differential Privacy

    Release date: 2021-10-22

  • Articles and reports: 11-633-X2021003
    Description:

    Canada continues to experience an opioid crisis. While there is solid information on the demographic and geographic characteristics of people experiencing fatal and non-fatal opioid overdoses in Canada, there is limited information on the social and economic conditions of those who experience these events. To fill this information gap, Statistics Canada collaborated with existing partnerships in British Columbia, including the BC Coroners Service, BC Stats, the BC Centre for Disease Control and the British Columbia Ministry of Health, to create the Statistics Canada British Columbia Opioid Overdose Analytical File (BC-OOAF).

    Release date: 2021-02-17

  • Articles and reports: 13-605-X201900100009
    Description:

    In this paper, a preliminary set of statistical estimates of the amounts invested in Canadian data, databases and data science in recent years is presented. The results indicate rapid growth in investment in data, databases and data science over the last three decades and a significant accumulation of these kinds of capital over time.

    Release date: 2019-07-10

  • Articles and reports: 13-605-X201900100008
    Description:

    This paper aims to expand the current national accounting concepts and statistical methods for measuring data in order to shed light on some highly consequential changes in society that are related to the rising usage of data. The paper concludes by discussing possible methods that can be used to assign an economic value to the various elements in the information chain and tests these concepts and methods by presenting results for Canada as a first attempt to measure the value of data.

    Release date: 2019-06-24

  • Articles and reports: 11-633-X2019002
    Description:

    Survey data collection through mobile devices, such as tablets and smartphones, is underway in Canada. However, little is known about the representativeness of the data collected through these devices. In March 2017, Statistics Canada commissioned survey data collection through the Carrot Rewards Application and included 11 questions on the Carrot Rewards Mobile App Survey (Carrot) drawn from the 2017 Canadian Community Health Survey (CCHS).

    Release date: 2019-06-04

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (until the end of 2011, to date).

    Release date: 2016-10-26

Reference (3)

Reference (3) (3 results)

  • Surveys and statistical programs – Documentation: 16-001-M2010014
    Description: Quantifying how Canada's water yield has changed over time is an important component of the water accounts maintained by Statistics Canada. This study evaluates the movement in the series of annual water yield estimates for southern Canada from 1971 to 2004. We estimated the movement in the series using a trend-cycle approach and found that water yield for southern Canada has generally decreased over the period of observation.
    Release date: 2010-09-13

  • Surveys and statistical programs – Documentation: 62F0026M2005005
    Description:

    This discussion paper reviews the previous research into the subject of presenting historical time series and comparisons in constant dollars for the Survey of Household Spending (SHS), and its predecessor the Family Expenditure Survey (FAMEX). It examines two principal methods of converting spending data into constant dollars. The purpose of this discussion paper is to show interested parties how the two methods differ in complexity of implementation and interpretation.

    Release date: 2005-07-15

  • Surveys and statistical programs – Documentation: 12-584-G
    Description:

    This book introduces technical aspects of the Statistics Canada Total Work Accounts System (TWAS). The TWAS is designed to facilitate the analysis of issues that require simultaneous consideration of both paid work and unpaid productive work. Its key contribution is to allocate the deemed output of each episode of unpaid work activity to a specific beneficiary or group of beneficiaries (called "destinations"). The guide presents the criteria used to decide the allocation of each work episode to one of the destinations, as well as the pseudo code for DESTIN, the key variable of the System. This pseudo code allows programmers to quickly create the actual programming code needed to derive the DESTIN variable in their own microdata files of diary-based time-use records. The guide also discusses illustrative applications of the System, as well as its key limitations.

    Release date: 2002-02-12