Data analysis

Results

All (22) (0 to 10 of 22 results)

  • Articles and reports: 82-003-X202300200003
    Description: Utility scores are an important tool for evaluating health-related quality of life. Utility score norms have been published for Canadian adults, but no nationally representative utility score norms are available for non-adults. Using Health Utilities Index Mark 3 (HUI3) data from two recent cycles of the Canadian Health Measures Survey (i.e., 2016-2017 and 2018-2019), this is the first study to provide utility score norms for children aged 6 to 11 years and adolescents aged 12 to 17 years.
    Release date: 2023-02-15

  • Articles and reports: 11-522-X202100100018
    Description: Statistics Finland started publishing nowcasts of the trend indicator of output (TIO), the monthly indicator of real economic activity, to answer users' needs during the COVID-19 pandemic. The indicator was first published in April 2020, at the very beginning of the pandemic in Finland, and had a monthly release schedule until June 2021. The TIO nowcasts are produced using open-source data on truck traffic volumes at about 100 automatic measuring points in the Helsinki/Uusimaa region and the Economic Sentiment Indicator for Finland. Estimation is done using a machine learning approach and the methodology is based on previous work done by Statistics Finland and ETLA Economic Research.

    Key Words: nowcasting; flash estimates; machine learning; experimental statistics.

    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100025
    Description:

    We propose a longitudinal analysis with a point of view connected to the organizational changes that have taken place in the Italian National Institute of Statistics in recent years. In 2016 the Institute introduced a new Directorate, intending to standardize and generalize the business process of Data Collection according to the European standard of the GAMSO model. The paper discusses the pros and cons of this change from the perspective of survey participation. The ICT survey response rate analysis demonstrates an increase of around 20% since the beginning of the new organization; the paper focuses on the impact of the changes introduced with the new organization. We focused our attention on two specific subsets of respondents: the so-called "wanted", who have never answered an ICT survey or any other Istat survey, and the so-called "lost", who were included in two consecutive survey samples and answered in the previous edition but not in the current one. The paper aims to illustrate how an efficient organization of data collection benefits survey results and what kind of actions should be taken to catch the attention of the "wanted". Finally, we apply a logistic model measuring the probability that an enterprise responding in 2018 (t-1) also answered in 2019 (t). Taken together, the analyses suggest some actions that could be taken to improve respondents' participation, data quality, and respondents' perception of official statistics.

    Key Words: data collection strategy, response rate, paradata, response burden, ICT Survey.

    Release date: 2021-10-29

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2017006
    Description:

    This paper describes a method of imputing missing postal codes in a longitudinal database. The 1991 Canadian Census Health and Environment Cohort (CanCHEC), which contains information on individuals from the 1991 Census long-form questionnaire linked with T1 tax return files for the 1984-to-2011 period, is used to illustrate and validate the method. The cohort contains up to 28 consecutive fields for postal code of residence, but because of frequent gaps in postal code history, missing postal codes must be imputed. To validate the imputation method, two experiments were devised where 5% and 10% of all postal codes from a subset with full history were randomly removed and imputed.

    Release date: 2017-03-13

  • Articles and reports: 11-633-X2016003
    Description:

    Large national mortality cohorts are used to estimate mortality rates for different socioeconomic and population groups, and to conduct research on environmental health. In 2008, Statistics Canada created a cohort linking the 1991 Census to mortality. The present study describes a linkage of the 2001 Census long-form questionnaire respondents aged 19 years and older to the T1 Personal Master File and the Amalgamated Mortality Database. The linkage tracks all deaths over a 10.6-year period (until the end of 2011, to date).

    Release date: 2016-10-26

  • Articles and reports: 12-001-X201600114546
    Description:

    Adjusting the base weights using weighting classes is a standard approach for dealing with unit nonresponse. A common approach is to create nonresponse adjustments that are weighted by the inverse of the assumed response propensity of respondents within weighting classes under a quasi-randomization approach. Little and Vartivarian (2003) questioned the value of weighting the adjustment factor. In practice the models assumed are misspecified, so it is critical to understand the impact that weighting might have in this case. This paper describes the effects on nonresponse-adjusted estimates of means and totals for populations and domains computed using the weighted and unweighted inverse of the response propensities in stratified simple random sample designs. The performance of these estimators under different conditions such as different sample allocation, response mechanism, and population structure is evaluated. The findings show that, for the scenarios considered, the weighted adjustment has substantial advantages for estimating totals, and using an unweighted adjustment may lead to serious biases except in very limited cases. Furthermore, unlike the unweighted estimates, the weighted estimates are not sensitive to how the sample is allocated.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201500214238
    Description:

    Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling to sample hidden and/or hard-to-detect human populations such as drug users and sex workers. In their variant, an initial sample of venues is selected and the people found in the sampled venues are asked to name other members of the population to be included in the sample. Those authors derived maximum likelihood estimators of the population size under the assumption that the probability that a person is named by another in a sampled venue (link-probability) does not depend on the named person (homogeneity assumption). In this work we extend their research to the case of heterogeneous link-probabilities and derive unconditional and conditional maximum likelihood estimators of the population size. We also propose profile likelihood and bootstrap confidence intervals for the size of the population. The results of our simulation studies show that, in the presence of heterogeneous link-probabilities, the proposed estimators perform reasonably well provided that relatively large sampling fractions, say larger than 0.5, are used, whereas the estimators derived under the homogeneity assumption perform badly. The outcomes also show that the proposed confidence intervals are not very robust to deviations from the assumed models.

    Release date: 2015-12-17

  • Articles and reports: 82-003-X201100411598
    Geography: Canada
    Description:

    With longitudinal data, lifetime health status dynamics can be estimated by modeling trajectories. Health status trajectories measured by the Health Utilities Index Mark 3 (HUI3) modeled as a function of age alone and also of age and socio-economic covariates revealed non-normal residuals and variance estimation problems. The possibility of transforming the HUI3 distribution to obtain residuals that approximate a normal distribution was investigated.

    Release date: 2011-12-21

  • Articles and reports: 82-003-X200800410747
    Geography: Canada
    Description:

    A selective approach may be used in an ecological study where the aim is to choose a subset of units of analysis (UAs) and produce interpretations about a population of interest (PI) based solely on those UAs. The results for the PI will be reliable if that population is concentrated in the selected UAs and rare in other UAs. This article presents a graphical tool that helps determine whether these conditions are satisfied.

    Release date: 2008-12-17

Data (0) (0 results)

No content available at this time.

Analysis (21) (0 to 10 of 21 results)


Reference (1) (1 result)

  • Notices and consultations: 12-002-X20050018033
    Description:

    Dr. J. Douglas Willms and his staff at the Canadian Research Institute for Social Policy (CRISP) at the University of New Brunswick (Fredericton Campus) have developed a set of files for researchers interested in using Statistics Canada's National Longitudinal Survey of Children and Youth (NLSCY) data sets. "The Files" consist of SPSS data and syntax, which are intended to assist researchers in conducting more efficient longitudinal analyses using NLSCY data.

    Release date: 2005-06-23