Statistical techniques


Results

All (15) (0 to 10 of 15 results)

  • Articles and reports: 11-522-X202100100017
    Description: The outbreak of the COVID-19 pandemic required the Government of Canada to provide relevant and timely information to support decision-making around a host of issues, including personal protective equipment (PPE) procurement and deployment. Our team built a compartmental epidemiological model from an existing code base to project PPE demand under a range of epidemiological scenarios. This model was further enhanced using data science techniques, which allowed for the rapid development and dissemination of model results to inform policy decisions.

    Key Words: COVID-19; SARS-CoV-2; Epidemiological model; Data science; Personal Protective Equipment (PPE); SEIR

    Release date: 2021-10-22

  • Articles and reports: 82-003-X201901200003
    Description:

    This article provides a description of the Canadian Census Health and Environment Cohorts (CanCHECs), a set of population-based linked datasets of the household population at the time of census collection. The CanCHEC datasets are rich national data resources that can be used to measure and examine health inequalities across socioeconomic and ethnocultural dimensions for different periods and locations. These datasets can also be used to examine the effects of exposure to environmental factors on human health.

    Release date: 2019-12-18

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 12-001-X201600214663
    Description:

    We present theoretical evidence that efforts during data collection to balance the survey response with respect to selected auxiliary variables will improve the chances for low nonresponse bias in the estimates that are ultimately produced by calibrated weighting. One of our results shows that the variance of the bias – measured here as the deviation of the calibration estimator from the (unrealized) full-sample unbiased estimator – decreases linearly as a function of the response imbalance, which we assume is measured and controlled continuously over the data collection period. An attractive prospect is thus a lower risk of bias if one can manage the data collection to get low imbalance. The theoretical results are validated in a simulation study with real data from an Estonian household survey.

    Release date: 2016-12-20

  • Articles and reports: 12-001-X201600214676
    Description:

    Winsorization procedures replace extreme values with less extreme values, effectively moving the original extreme values toward the center of the distribution. Winsorization therefore both detects and treats influential values. Mulry, Oliver and Kaputa (2014) compare the performance of the one-sided Winsorization method developed by Clark (1995) and described by Chambers, Kokic, Smith and Cruddas (2000) to the performance of M-estimation (Beaumont and Alavi 2004) in highly skewed business population data. One aspect of particular interest for methods that detect and treat influential values is the range of values designated as influential, called the detection region. The Clark Winsorization algorithm is easy to implement and can be extremely effective. However, the resultant detection region is highly dependent on the number of influential values in the sample, especially when the survey totals are expected to vary greatly by collection period. In this note, we examine the effect of the number and magnitude of influential values on the detection regions from Clark Winsorization using data simulated to realistically reflect the properties of the population for the Monthly Retail Trade Survey (MRTS) conducted by the U.S. Census Bureau. Estimates from the MRTS and other economic surveys are used in economic indicators, such as the Gross Domestic Product (GDP).

    Release date: 2016-12-20

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (until the end of 2011, to date).

    Release date: 2016-10-26

  • Articles and reports: 11-633-X2016002
    Description:

    Immigrants comprise an ever-increasing percentage of the Canadian population—at more than 20%, which is the highest percentage among the G8 countries (Statistics Canada 2013a). This figure is expected to rise to 25% to 28% by 2031, when at least one in four people living in Canada will be foreign-born (Statistics Canada 2010).

    This report summarizes the linkage of the Immigrant Landing File (ILF) for all provinces and territories, excluding Quebec, to hospital data from the Discharge Abstract Database (DAD), a national database containing information about hospital inpatient and day-surgery events. A deterministic exact-matching approach was used to link data from the 1980-to-2006 ILF and from the DAD (2006/2007, 2007/2008 and 2008/2009) with the 2006 Census, which served as a “bridge” file. This was a secondary linkage in that it used linkage keys created in two previous projects (primary linkages) that separately linked the ILF and the DAD to the 2006 Census. The ILF–DAD linked data were validated by means of a representative sample of 2006 Census records containing immigrant information previously linked to the DAD.

    Release date: 2016-08-17

  • Articles and reports: 82-622-X2015009
    Description:

    The Canadian Cancer Registry (CCR) represents a collaborative effort between Statistics Canada and the thirteen provincial and territorial cancer registries to create a single database to report annually on cancer incidence and survival at the national and jurisdictional level. While gains have been made to ensure high quality, standardized, and comparable data, the CCR currently lacks information on cancer treatment. The Canadian Council of Cancer Registries (CCCR) identified the need to capture treatment data at the national level as a key strategic priority for 2013/2014. Record linkage was identified as one possible approach to fill this information gap.

    The purpose of this study is to examine the feasibility of using record linkage to add cancer treatment information for selected cancers: breast, colorectal and prostate. The objectives are twofold: to assess the quality of the linkage processes and the validity of using linked data to estimate cancer treatment rates at the provincial level. The study is based on the Canadian Cancer Registry (2005 to 2008) linked to the Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) for four provinces (Ontario, Manitoba, Nova Scotia and Prince Edward Island). The linkage was proposed by Statistics Canada, the CCCR and the Canadian Institute for Health Information (CIHI). The linkage was approved and conducted at Statistics Canada.

    Release date: 2015-11-23

  • Articles and reports: 82-003-X201300611796
    Geography: Canada
    Description:

    The study assesses the feasibility of using statistical modelling techniques to fill information gaps related to risk factors, specifically, smoking status, in linked long-form census data.

    Release date: 2013-06-19
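
The Winsorization entry above (12-001-X201600214676) describes replacing extreme values with less extreme ones and calls the range of values treated as influential the detection region. The following is a minimal sketch of that general idea only, assuming a fixed, hand-picked upper cutoff. It is not the Clark (1995) algorithm, which derives its cutoff from the sample, and the function names, the data and the cutoff value are all hypothetical.

```python
from typing import List


def winsorize_upper(values: List[float], cutoff: float) -> List[float]:
    """One-sided (upper) Winsorization: cap every value above `cutoff` at `cutoff`.

    `cutoff` is an assumed, user-supplied threshold for illustration only.
    """
    return [min(v, cutoff) for v in values]


def flag_influential(values: List[float], cutoff: float) -> List[float]:
    """Return the values that fall in the detection region (strictly above `cutoff`)."""
    return [v for v in values if v > cutoff]


# Hypothetical, skewed sample: one value is far larger than the rest.
sales = [12.0, 15.0, 9.0, 14.0, 250.0]
cutoff = 50.0  # assumed threshold, not derived from any survey

treated = winsorize_upper(sales, cutoff)
flagged = flag_influential(sales, cutoff)

print("treated:", treated)
print("flagged:", flagged)
```

With a fixed cutoff the detection region does not depend on the sample. The abstract's concern is that a data-driven cutoff makes the detection region depend on how many influential values the sample contains.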