Statistics by subject – Statistical methods

All (1,610) (25 of 1,610 results)

  • Articles and reports: 11-630-X2016003
    Description:

    This edition of Canadian Megatrends looks at changes in the causes of death from 1950 to 2012.

    Release date: 2016-03-21

  • Articles and reports: 82-003-X201600314338
    Description:

    This paper describes the methods and data used in the development and implementation of the POHEM-Neurological meta-model.

    Release date: 2016-03-16

  • Technical products: 91-528-X
    Description:

    This manual provides detailed descriptions of the data sources and methods used by Statistics Canada to estimate population. They comprise postcensal and intercensal population estimates; base population; births and deaths; immigration; emigration; non-permanent residents; interprovincial migration; subprovincial estimates of population; population estimates by age, sex and marital status; and census family estimates. A glossary of principal terms is contained at the end of the manual, followed by the standard notation used.

    Until now, literature on methodological changes to the calculation of estimates has been spread across various Statistics Canada publications and background papers. This manual provides users of demographic statistics with a comprehensive compilation of the current procedures used by Statistics Canada to prepare population and family estimates.

    Release date: 2016-03-03

  • Articles and reports: 89-654-X2016003
    Description:

    This paper describes the process that led to the creation of the new Disability Screening Questions (DSQ), jointly developed by Statistics Canada and Employment and Social Development Canada. The DSQ form a new module that can be added to general population surveys to allow comparisons of persons with and without a disability. The paper explains why there are two versions of the DSQ (a long and a short one), the difference between the two, and how each version can be used.

    Release date: 2016-02-29

  • Articles and reports: 11-630-X2016002
    Description:

    In this edition of Canadian Megatrends, we look at the increase in life expectancy in Canada from 1920–1922 to 2009–2011.

    Release date: 2016-02-26

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2016-02-11

  • Classification: 12-603-X
    Description:

    Canadian Classification of Institutional Units and Sectors (CCIUS) 2012 is the departmental standard for classifying institutional units and sectors. This classification is used for economic statistics and includes definitions for its 171 classes. CCIUS 2012 was developed as a result of the implementation of international recommendations published in the 2008 System of National Accounts manual (SNA 2008).

    Release date: 2016-02-11

  • Articles and reports: 11-630-X2016001
    Description:

    This edition of Canadian Megatrends explores the evolution of English-French bilingualism in Canada from 1901 to 2011.

    Release date: 2016-01-28

  • Articles and reports: 82-003-X201600114307
    Description:

    Using the 2012 Aboriginal Peoples Survey, this study examined the psychometric properties of the 10-item Kessler Psychological Distress Scale (a short measure of non-specific psychological distress) for First Nations people living off reserve, Métis, and Inuit aged 15 or older.

    Release date: 2016-01-20

  • Articles and reports: 82-003-X201600114306
    Description:

    This article is an overview of the creation, content, and quality of the 2006 Canadian Birth-Census Cohort Database.

    Release date: 2016-01-20

  • Articles and reports: 11-630-X2015009
    Description:

    In this edition of Canadian Megatrends, we look at increased participation of women in the paid workforce since the 1950s.

    Release date: 2015-12-17

  • Technical products: 75F0002M2015003
    Description:

    This note discusses revised income estimates from the Survey of Labour and Income Dynamics (SLID). These revisions to the SLID estimates make it possible to compare results from the Canadian Income Survey (CIS) to earlier years. The revisions address the issue of methodology differences between SLID and CIS.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214238
    Description:

    Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling to sample hidden and/or hard-to-detect human populations such as drug users and sex workers. In their variant, an initial sample of venues is selected and the people found in the sampled venues are asked to name other members of the population to be included in the sample. Those authors derived maximum likelihood estimators of the population size under the assumption that the probability that a person is named by another in a sampled venue (link-probability) does not depend on the named person (homogeneity assumption). In this work we extend their research to the case of heterogeneous link-probabilities and derive unconditional and conditional maximum likelihood estimators of the population size. We also propose profile likelihood and bootstrap confidence intervals for the size of the population. The results of our simulation studies show that, in the presence of heterogeneous link-probabilities, the proposed estimators perform reasonably well provided that relatively large sampling fractions, say larger than 0.5, are used, whereas the estimators derived under the homogeneity assumption perform badly. The outcomes also show that the proposed confidence intervals are not very robust to deviations from the assumed models.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214237
    Description:

    Careful design of a dual-frame random digit dial (RDD) telephone survey requires selecting from among many options that have varying impacts on cost, precision, and coverage in order to obtain the best possible implementation of the study goals. One such consideration is whether to screen cell-phone households in order to interview cell-phone only (CPO) households and exclude dual-user households, or to take all interviews obtained via the cell-phone sample. We present a framework in which to consider the tradeoffs between these two options and a method to select the optimal design. We derive and discuss the optimum allocation of sample size between the two sampling frames and explore the choice of optimum p, the mixing parameter for the dual-user domain. We illustrate our methods using the National Immunization Survey, sponsored by the Centers for Disease Control and Prevention.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214250
    Description:

    Assessing the impact of mode effects on survey estimates has become a crucial research objective due to the increasing use of mixed-mode designs. Despite the advantages of a mixed-mode design, such as lower costs and increased coverage, there is sufficient evidence that mode effects may be large relative to the precision of a survey. They may lead to incomparable statistics in time or over population subgroups and they may increase bias. Adaptive survey designs offer a flexible mathematical framework to obtain an optimal balance between survey quality and costs. In this paper, we employ adaptive designs in order to minimize mode effects. We illustrate our optimization model by means of a case-study on the Dutch Labor Force Survey. We focus on item-dependent mode effects and we evaluate the impact on survey quality by comparison to a gold standard.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214249
    Description:

    The problem of optimal allocation of samples in surveys using a stratified sampling plan was first discussed by Neyman in 1934. Since then, many researchers have studied the problem of the sample allocation in multivariate surveys and several methods have been proposed. Basically, these methods are divided into two classes: The first class comprises methods that seek an allocation which minimizes survey costs while keeping the coefficients of variation of estimators of totals below specified thresholds for all survey variables of interest. The second aims to minimize a weighted average of the relative variances of the estimators of totals given a maximum overall sample size or a maximum cost. This paper proposes a new optimization approach for the sample allocation problem in multivariate surveys. This approach is based on a binary integer programming formulation. Several numerical experiments showed that the proposed approach provides efficient solutions to this problem, which improve upon a ‘textbook algorithm’ and can be more efficient than the algorithm by Bethel (1985, 1989).

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214230
    Description:

    This paper develops allocation methods for stratified sample surveys where composite small area estimators are a priority, and areas are used as strata. Longford (2006) proposed an objective criterion for this situation, based on a weighted combination of the mean squared errors of small area means and a grand mean. Here, we redefine this approach within a model-assisted framework, allowing regressor variables and a more natural interpretation of results using an intra-class correlation parameter. We also consider several uses of power allocation, and allow the placing of other constraints such as maximum relative root mean squared errors for stratum estimators. We find that a simple power allocation can perform very nearly as well as the optimal design even when the objective is to minimize Longford’s (2006) criterion.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214231
    Description:

    Rotating panels are widely applied by national statistical institutes, for example, to produce official statistics about the labour force. Estimation procedures are generally based on traditional design-based procedures known from classical sampling theory. A major drawback of this class of estimators is that small sample sizes result in large standard errors and that they are not robust for measurement bias. Two examples showing the effects of measurement bias are rotation group bias in rotating panels, and systematic differences in the outcome of a survey due to a major redesign of the underlying process. In this paper we apply a multivariate structural time series model to the Dutch Labour Force Survey to produce model-based figures about the monthly labour force. The model reduces the standard errors of the estimates by taking advantage of sample information collected in previous periods, accounts for rotation group bias and autocorrelation induced by the rotating panel, and models discontinuities due to a survey redesign. Additionally, we discuss the use of correlated auxiliary series in the model to further improve the accuracy of the model estimates. The method is applied by Statistics Netherlands to produce accurate official monthly statistics about the labour force that are consistent over time, despite a redesign of the survey process.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214229
    Description:

    Self-weighting estimation through equal probability selection methods (epsem) is desirable for variance efficiency. Traditionally, the epsem property for (one phase) two stage designs for estimating population-level parameters is realized by using each primary sampling unit (PSU) population count as the measure of size for PSU selection along with equal sample size allocation per PSU under simple random sampling (SRS) of elementary units. However, when self-weighting estimates are desired for parameters corresponding to multiple domains under a pre-specified sample allocation to domains, Folsom, Potter and Williams (1987) showed that a composite measure of size can be used to select PSUs to obtain epsem designs when besides domain-level PSU counts (i.e., distribution of domain population over PSUs), frame-level domain identifiers for elementary units are also assumed to be available. The term depsem-A will be used to denote such (one phase) two stage designs to obtain domain-level epsem estimation. Folsom et al. also considered two phase two stage designs when domain-level PSU counts are unknown, but whole PSU counts are known. For these designs (to be termed depsem-B) with PSUs selected proportional to the usual size measure (i.e., the total PSU count) at the first stage, all elementary units within each selected PSU are first screened for classification into domains in the first phase of data collection before SRS selection at the second stage. Domain-stratified samples are then selected within PSUs with suitably chosen domain sampling rates such that the desired domain sample sizes are achieved and the resulting design is self-weighting. In this paper, we first present a simple justification of composite measures of size for the depsem-A design and of the domain sampling rates for the depsem-B design. Then, for depsem-A and -B designs, we propose generalizations, first to cases where frame-level domain identifiers for elementary units are not available and domain-level PSU counts are only approximately known from alternative sources, and second to cases where PSU size measures are pre-specified based on other practical and desirable considerations of over- and under-sampling of certain domains. We also present a further generalization in the presence of subsampling of elementary units and nonresponse within selected PSUs at the first phase before selecting phase two elementary units from domains within each selected PSU. This final generalization of depsem-B is illustrated for an area sample of housing units.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214248
    Description:

    Unit level population models are often used in model-based small area estimation of totals and means, but the models may not hold for the sample if the sampling design is informative for the model. As a result, standard methods, assuming that the model holds for the sample, can lead to biased estimators. We study alternative methods that use a suitable function of the unit selection probability as an additional auxiliary variable in the sample model. We report the results of a simulation study on the bias and mean squared error (MSE) of the proposed estimators of small area means and on the relative bias of the associated MSE estimators, using informative sampling schemes to generate the samples. Alternative methods, based on modeling the conditional expectation of the design weight as a function of the model covariates and the response, are also included in the simulation study.

    Release date: 2015-12-17

  • Articles and reports: 12-001-X201500214236
    Description:

    We propose a model-assisted extension of weighting design-effect measures. We develop a summary-level statistic for different variables of interest, in single-stage sampling and under calibration weight adjustments. Our proposed design effect measure captures the joint effects of a non-epsem sampling design, unequal weights produced using calibration adjustments, and the strength of the association between an analysis variable and the auxiliaries used in calibration. We compare our proposed measure to existing design effect measures in simulations using variables like those collected in establishment surveys and telephone surveys of households.

    Release date: 2015-12-17

  • Articles and reports: 82-003-X201501214295
    Description:

    Using the Wisconsin Cancer Intervention and Surveillance Monitoring Network breast cancer simulation model adapted to the Canadian context, costs and quality-adjusted life years were evaluated for 11 mammography screening strategies that varied by start/stop age and screening frequency for the general population. Incremental cost-effectiveness ratios are presented, and sensitivity analyses are used to assess the robustness of model conclusions.

    Release date: 2015-12-16

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2015-11-23

  • Articles and reports: 11-630-X2015008
    Description:

    In this edition of Canadian Megatrends, we look at changes in household size from 1941 to 2011.

    Release date: 2015-11-23

  • Articles and reports: 11-627-M2015005
    Description:

    This infographic demonstrates the journey of data and how respondents' answers to our surveys become useful data used to make informed decisions. The infographic highlights the Labour Force Survey (LFS), the Survey of Household Spending (SHS), and the Canadian Community Health Survey (CCHS).

    Release date: 2015-11-23

Data (8) (8 of 8 results)

  • Public use microdata: 89F0002X
    Description:

    The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and models, analytical software and user documentation.

    Release date: 2018-01-08

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

  • Table: 95F0495X2001012
    Description:

    This table contains information from the 2001 Census, presented according to the statistical area classification (SAC). The SAC groups census subdivisions according to whether they are a component of a census metropolitan area, a census agglomeration, a census metropolitan area and census agglomeration influenced zone (strong MIZ, moderate MIZ, weak MIZ or no MIZ) or of the territories (Northwest Territories, Nunavut and Yukon Territory). The SAC is used for data dissemination purposes.

    Data characteristics presented according to the SAC include age, visible minority groups, immigration, mother tongue, education, income, work and dwellings. Data are presented for Canada, provinces and territories. The data characteristics presented within this table may differ from those of other products in the "Profiles" series.

    Release date: 2004-02-27

  • Table: 53-222-X19980006587
    Description:

    The primary purpose of this article is to present new time series data and to demonstrate their analytical potential, not to provide a detailed analysis of these data. The analysis in section 5.2.4 deals primarily with the trends of major variables relating to domestic and transborder traffic.

    Release date: 2000-03-07

  • Table: 75M0007X
    Description:

    The Absence from Work Survey was designed primarily to fulfill the objectives of Human Resources Development Canada. They sponsor the qualified wage loss replacement plan which applies to employers who have their own private plans to cover employee wages lost due to sickness, accident, etc. Employers who fall under the plan are granted a reduction in their quotas payable to the Unemployment Insurance Commission. The data generated from the responses to the supplement will provide input to determine the rates for quota reductions for qualified employers.

    Although the Absence from Work Survey collects information on absences from work due to illness, accident or pregnancy, it does not provide a complete picture of people who have been absent from work for these reasons because the concepts and definitions have been developed specifically for the needs of the client. Absences in this survey are defined as being at least two weeks in length, and respondents are only asked the three reasons for their most recent absence and the one preceding it.

    Release date: 1999-06-29

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29

  • Table: 62-010-X19970023422
    Description:

    The current official time base of the Consumer Price Index (CPI) is 1986=100. This time base was first used when the CPI for June 1990 was released. Statistics Canada is about to convert all price index series to the time base 1992=100. As a result, all constant dollar series will be converted to 1992 dollars. The CPI will shift to the new time base when the CPI for January 1998 is released on February 27th, 1998.

    Release date: 1997-11-17

  • Public use microdata: 89M0005X
    Description:

    The objective of this survey was to collect attitudinal, cognitive and behavioral information regarding drinking and driving.

    Release date: 1996-10-21

Analysis (902) (25 of 902 results)

  • Articles and reports: 13-604-M2018087
    Description:

    Statistics Canada regularly publishes macroeconomic indicators on household assets, liabilities and net worth as part of the quarterly National Balance Sheet Accounts (NBSA). These accounts are aligned with the most recent international standards and are the source of estimates of national wealth for all sectors of the economy, including households, non-profit institutions, governments and corporations along with Canada’s wealth position vis-a-vis the rest of the world. While the NBSA provide high quality information on the overall position of households relative to other economic sectors, they lack the granularity required to understand vulnerabilities of specific groups and the resulting implications for economic wellbeing and financial stability.

    Release date: 2018-04-13

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2018-04-03

  • Journals and periodicals: 11-633-X
    Description:

    Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.

    Release date: 2018-03-27

  • Articles and reports: 11-633-X2018016
    Description:

    Record linkage has been identified as a potential mechanism to add treatment information to the Canadian Cancer Registry (CCR). The purpose of the Canadian Cancer Treatment Linkage Project (CCTLP) pilot is to add surgical treatment data to the CCR. The Discharge Abstract Database (DAD) and the National Ambulatory Care Reporting System (NACRS) were linked to the CCR, and surgical treatment data were extracted. The project was funded through the Cancer Data Development Initiative (CDDI) of the Canadian Partnership Against Cancer (CPAC).

    The CCTLP was developed as a feasibility study in which patient records from the CCR would be linked to surgical treatment records in the DAD and NACRS databases, maintained by the Canadian Institute for Health Information. The target cohort to whom surgical treatment data would be linked was patients aged 19 or older registered on the CCR (2010 through 2012). The linkage was completed in Statistics Canada’s Social Data Linkage Environment (SDLE).

    Release date: 2018-03-27

  • Articles and reports: 11-629-X2018002
    Description:

    Celebrate Statistics Canada’s centennial by looking back on our journey with Canada.

    Release date: 2018-03-16

  • Articles and reports: 11-633-X2018015
    Description:

    This paper discusses the process for estimating the volume of cannabis consumption in Canada by age group from 1960 to 2015. Cannabis consumption is estimated using a model that first estimates the number of cannabis consumers among 15- to 17-year-olds, 18- to 24-year-olds, 25- to 44-year-olds and 45- to 64-year-olds. This is accomplished by estimating cannabis consumption prevalence based on multiple survey data sources. For each age group, consumers are divided into categories based on annual frequency of consumption: once in the past year, less than once a month, one to three times a month, weekly (excluding daily) and daily. Each category of frequency of consumption is then associated with a quantity of cannabis consumed.

    Release date: 2018-02-21

  • Articles and reports: 82-003-X201800254908
    Description:

    This study examined nine national surveys of the household population which collected information about drug use during the period from 1985 through 2015. These surveys are examined for comparability. The data are used to estimate past-year (current) cannabis use (total, and by sex and age). Based on the most comparable data, trends in use from 2004 through 2015 are estimated.

    Release date: 2018-02-21

  • Articles and reports: 11-633-X2018014
    Description:

    The Canadian Mortality Database (CMDB) is an administrative database that collects information on cause of death from all provincial and territorial vital statistics registries in Canada. The CMDB lacks subpopulation identifiers to examine mortality rates and disparities among groups such as First Nations, Métis, Inuit and members of visible minority groups. Linkage between the CMDB and the Census of Population is an approach to circumvent this limitation. This report describes a linkage between the CMDB (2006 to 2011) and the 2006 Census of Population, which was carried out using hierarchical deterministic exact matching, with a focus on methodology and validation.

    Release date: 2018-02-14

  • Articles and reports: 11-633-X2018013
    Description:

    Since 2008, a number of population censuses have been linked to administrative health data and to financial data. These linked datasets have been instrumental in examining health inequalities and have been used in environmental health research. This paper describes the creation of the 1996 Canadian Census Health and Environment Cohort (CanCHEC)—3.57 million respondents to the census long-form questionnaire who were retrospectively followed for mortality and mobility for 16.6 years from 1996 to 2012. The 1996 CanCHEC was limited to census respondents who were aged 19 or older on Census Day (May 14, 1996), were residents of Canada, were not residents of institutions, and had filed an income tax return. These respondents were linked to death records from the Canadian Mortality Database or to the T1 Personal Master File, and to a postal code history from a variety of sources. This is the third in a set of CanCHECs that, when combined, make it possible to examine mortality trends and environmental exposures by socioeconomic characteristics over three census cycles and 21 years of census, tax, and mortality data. This report describes linkage methodologies, validation and bias assessment, and the characteristics of the 1996 CanCHEC. Representativeness of the 1996 CanCHEC relative to the adult population of Canada is also assessed.

    Release date: 2018-01-22

  • Articles and reports: 11-633-X2018012
    Description:

    This study investigates the extent to which income tax reassessments and delayed tax filing affect the reliability of Canadian administrative tax datasets used for economic analysis. The study is based on individual income tax records from the T1 Personal Master File and Historical Personal Master File for selected years from 1990 to 2010. These datasets contain tax records for approximately 100% of initial and all income tax filers, who submitted returns to the Canada Revenue Agency (CRA) before specific processing cut-off dates.

    Release date: 2018-01-11

  • Articles and reports: 11-633-X2018011
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 30 years. The IMDB combines administrative files on immigrant admissions and non-permanent resident permits from Immigration, Refugees and Citizenship Canada (IRCC) with tax files from the Canada Revenue Agency (CRA). Information is available for immigrant taxfilers admitted since 1980. Tax records for 1982 and subsequent years are available for immigrant taxfilers.

    This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2018-01-08

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2018-01-08

  • Articles and reports: 18-001-X2017001
    Description:

    This working paper profiles Canadian firms involved in the development and production of bioproducts. It provides data on the number and types of bioproducts firms in 2015, covering bioproducts revenues, research and development, use of biomass, patents, products, business practices and the impact of government regulations on the sector.

    Release date: 2017-12-22

  • Journals and periodicals: 12-001-X
    Description:

    The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.

    Release date: 2017-12-21

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254897
    Description:

    This note by Chris Skinner presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254888
    Description:

    We discuss developments in sample survey theory and methods covering the past 100 years. Neyman’s 1934 landmark paper laid the theoretical foundations for the probability sampling approach to inference from survey samples. Classical sampling books by Cochran, Deming, Hansen, Hurwitz and Madow, Sukhatme, and Yates, which appeared in the early 1950s, expanded and elaborated the theory of probability sampling, emphasizing unbiasedness, model free features, and designs that minimize variance for a fixed cost. During the period 1960-1970, theoretical foundations of inference from survey data received attention, with the model-dependent approach generating considerable discussion. Introduction of general purpose statistical software led to the use of such software with survey data, which led to the design of methods specifically for complex survey data. At the same time, weighting methods, such as regression estimation and calibration, became practical and design consistency replaced unbiasedness as the requirement for standard estimators. A bit later, computer-intensive resampling methods also became practical for large scale survey samples. Improved computer power led to more sophisticated imputation for missing data, use of more auxiliary data, some treatment of measurement errors in estimation, and more complex estimation procedures. A notable use of models was in the expanded use of small area estimation. Future directions in research and methods will be influenced by budgets, response rates, timeliness, improved data collection devices, and availability of auxiliary data, some of which will come from “Big Data”. Survey taking will be impacted by changing cultural behavior and by a changing physical-technical environment.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254895
    Description:

    This note by Graham Kalton presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254871
    Description:

    This paper addresses the question of how alternative data sources, such as administrative and social media data, can be used in the production of official statistics. Since most surveys at national statistical institutes are conducted repeatedly over time, a multivariate structural time series modelling approach is proposed to model the series observed by a repeated survey together with related series obtained from such alternative data sources. Generally, this improves the precision of the direct survey estimates by using sample information observed in preceding periods and information from related auxiliary series. This model also makes it possible to utilize the higher frequency of the social media data to produce more precise estimates for the sample survey in real time, at the moment that statistics for the social media become available but the sample data are not yet available. The concept of cointegration is applied to address the question of the extent to which the alternative series represent the same phenomena as the series observed with the repeated survey. The methodology is applied to the Dutch Consumer Confidence Survey and a sentiment index derived from social media.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254896
    Description:

    This note by Sharon L. Lohr presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254887
    Description:

    This paper proposes a new approach to decompose the wage difference between men and women that is based on a calibration procedure. This approach generalizes two current decomposition methods that are re-expressed using survey weights. The first one is the Blinder-Oaxaca method and the second one is a reweighting method proposed by DiNardo, Fortin and Lemieux. The new approach provides a weighting system that enables us to estimate parameters of interest such as quantiles. An application to data from the Swiss Structure of Earnings Survey illustrates the value of this method.

    Release date: 2017-12-21

  • Articles and reports: 12-001-X201700254872
    Description:

    This note discusses the theoretical foundations for the extension of the Wilson two-sided coverage interval to an estimated proportion computed from complex survey data. The interval is shown to be asymptotically equivalent to an interval derived from a logistic transformation. A mildly better version is discussed, but users may prefer constructing a one-sided interval already in the literature.

    Release date: 2017-12-21
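
    As a rough illustration of the kind of interval discussed in the note above, the sketch below computes a two-sided Wilson interval for a proportion, replacing the nominal sample size with an effective sample size n / deff. This is one common adaptation to complex survey data and is an assumption made here, not necessarily the construction used in the note; the function name `wilson_interval` and the `deff` parameter are hypothetical.

```python
import math


def wilson_interval(p_hat, n, deff=1.0, z=1.96):
    """Two-sided Wilson interval for a proportion.

    Assumption: the survey design is accounted for only through an
    effective sample size n / deff, where deff is the design effect.
    """
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    if n <= 0 or deff <= 0:
        raise ValueError("n and deff must be positive")

    n_eff = n / deff                      # effective sample size
    denom = 1.0 + z ** 2 / n_eff
    centre = (p_hat + z ** 2 / (2.0 * n_eff)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n_eff + z ** 2 / (4.0 * n_eff ** 2)
    )
    # Clip to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Simple random sampling (deff = 1) versus a clustered design (deff = 2).
lo_srs, hi_srs = wilson_interval(0.5, 100)
lo_cl, hi_cl = wilson_interval(0.5, 100, deff=2.0)
```

    A larger design effect shrinks the effective sample size, so the interval for the clustered design is wider than the one for simple random sampling.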

  • Articles and reports: 12-001-X201700254894
    Description:

    This note by Danny Pfeffermann presents a discussion of the paper “Sample survey theory and methods: Past, present, and future directions” where J.N.K. Rao and Wayne A. Fuller share their views regarding the developments in sample survey theory and methods covering the past 100 years.

    Release date: 2017-12-21

  • The Daily
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2017-12-18

  • Articles and reports: 11-626-X2017077
    Description:

    On April 13, 2017, the Government of Canada tabled legislation to legalize the recreational use of cannabis by adults. This will directly impact Canada’s statistical system. The focus of this Economic Insights article is to provide experimental estimates for the volume of cannabis consumption, based on existing information on the prevalence of cannabis use. The article presents experimental estimates of the number of tonnes of cannabis consumed by age group for the period from 1960 to 2015. The experimental estimates rely on survey data from multiple sources, statistical techniques to link the sources over time, and assumptions about consumption behaviour. They are subject to revision as improved or additional data sources become available.

    Release date: 2017-12-18

Reference (700) (25 of 700 results)

  • Technical products: 75F0002M
    Description:

    This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.

    Release date: 2018-04-05

  • Technical products: 75F0002M2018001
    Description:

    This study looks at changes introduced in 2018 to the methodology used for the census family low income measure, based on the T1 Family File (T1FF; tax filer data). By making these changes, the methodology becomes better aligned with other data sources at Statistics Canada, such as the Census of Population and the Canadian Income Survey. To account for changes in the methodology, new T1FF standard tables on the census family low income measure (after-tax income), going back to 2004 data, are introduced.

    Release date: 2018-04-05

  • Technical products: 75F0002M2018002
    Description:

    This study looks at the differences in after-tax low income measure (LIM) statistics from two data sources which both use administrative tax data as their principal inputs: the 2016 Census of Population and the T1 Family File (T1FF). It presents a summary of the two data sources and compares after-tax LIM statistics by focussing on unit of analysis, LIM thresholds and the percentage of population below the LIM. The study also explores what factors users may want to consider when choosing one data source over the other.

    Release date: 2018-04-05

  • Technical products: 84-538-X
    Description:

    This document presents the methodology underlying the production of the life tables for Canada, provinces and territories, from reference period 1980/1982 and onward.

    Release date: 2018-02-23

  • Surveys and statistical programs – Documentation: 71-526-X
    Description:

    The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.

    Release date: 2017-12-21

  • Index and guides: 98-500-X
    Description:

    Provides information that enables users to effectively use, apply and interpret data from the Census of Population. Each guide contains definitions and explanations on census concepts as well as a data quality and historical comparability section. Additional information will be included for specific variables to help users better understand the concepts and questions used in the census.

    Release date: 2017-11-29

  • Technical products: 12-206-X
    Description:

    This report summarizes the achievements of the program sponsored by the three methodology divisions of Statistics Canada. This program covers research and development activities in statistical methods with potentially broad application in the Agency's survey programs, which would not otherwise have been carried out during the provision of methodology services to those survey programs. The achievements also include tasks that provided client support in the application of past successful developments in order to promote the utilization of the results of research and development work.

    Release date: 2017-11-03

  • Index and guides: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • Technical products: 12-586-X
    Description:

    The Quality Assurance Framework (QAF) serves as the highest-level governance tool for quality management at Statistics Canada. The QAF gives an overview of the quality management and risk mitigation strategies used by the Agency’s program areas. The QAF is used in conjunction with Statistics Canada management practices, such as those described in the Quality Guidelines.

    Release date: 2017-04-21

  • Technical products: 91-621-X2017001
    Release date: 2017-01-25

  • Technical products: 75F0002M2016003
    Description:

    Periodically, income statistics are updated to reflect the most recent population estimates from the Census. Accordingly, with the release of the 2014 data from the Canadian Income Survey, Statistics Canada has revised estimates for 2006 to 2013 using new population totals from the 2011 Census. This paper provides unrevised estimates alongside revised estimates for key income series, indicating where the revisions were significant.

    Release date: 2016-07-08

  • Technical products: 11-522-X
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014753
    Description:

    The fact that the world is in continuous change and that new technologies are becoming widely available creates new opportunities and challenges for National Statistical Institutes (NSIs) worldwide. What if NSIs could access vast amounts of sophisticated data for free (or for a low cost) from enterprises? Could this facilitate the possibility for NSIs to disseminate more accurate indicators for the policy-makers and users, significantly reduce the response burden for companies, reduce costs for the NSIs and in the long run improve the living standards of the people in a country? The time has now come for NSIs to find the best practice to align legislation, regulations and practices in relation to scanner data and big data. Without common ground, the prospect of reaching consensus is unlikely. The discussions need to start with how to define quality. If NSIs define and approach quality differently, this will lead to a highly undesirable situation, as NSIs will move further away from harmonisation. Sweden was one of the leading countries that put these issues on the agenda for European cooperation; in 2012 Sweden implemented scanner data in the national Consumer Price Index after it was proven through research studies and statistical analyses that scanner data was significantly better than the manually collected data.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014714
    Description:

    The Labour Market Development Agreements (LMDAs) between Canada and the provinces and territories fund labour market training and support services to Employment Insurance claimants. The objective of this paper is to discuss the improvements over the years in the impact assessment methodology. The paper describes the LMDAs and past evaluation work and discusses the drivers to make better use of large administrative data holdings. It then explains how the new approach made the evaluation less resource-intensive, while results are more relevant to policy development. The paper outlines the lessons learned from a methodological perspective and provides insight into ways for making this type of use of administrative data effective, especially in the context of large programs.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014707
    Description:

    The Labour Force Survey (LFS) is a monthly household survey of about 56,000 households that provides information on the Canadian labour market. Audit Trail is a Blaise programming option, for surveys like the LFS with Computer Assisted Interviewing (CAI), which creates files containing every keystroke, edit and timestamp of every data collection attempt on all households. Combining such a large survey with such a complete source of paradata opens the door to in-depth data quality analysis but also quickly leads to Big Data challenges. How can meaningful information be extracted from this large set of keystrokes and timestamps? How can it help assess the quality of LFS data collection? The presentation will describe some of the challenges that were encountered, solutions that were used to address them, and results of the analysis on data quality.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014742
    Description:

    This paper describes the Quick Match System (QMS), an in-house application designed to match business microdata records, and the methods used to link the United States Patent and Trademark Office (USPTO) dataset to Statistics Canada’s Business Register (BR) for the period from 2000 to 2011. The paper illustrates the record-linkage framework and outlines the techniques used to prepare and classify each record and evaluate the match results. The USPTO dataset consisted of 41,619 U.S. patents granted to 14,162 distinct Canadian entities. The record-linkage process matched the names, city, province and postal codes of the patent assignees in the USPTO dataset with those of businesses in the January editions of the Generic Survey Universe File (GSUF) from the BR for the same reference period. As the vast majority of individual patent assignees are not engaged in commercial activity to provide taxable property or services, they tend not to appear in the BR. The relatively poor match rate of 24.5% among individuals, compared to 84.7% among institutions, reflects this tendency. Although the 8,844 individual patent assignees outnumbered the 5,318 institutions, the institutions accounted for 73.0% of the patents, compared to 27.0% held by individuals. Consequently, this study and its conclusions focus primarily on institutional patent assignees. The linkage of the USPTO institutions to the BR is significant because it provides access to business micro-level data on firm characteristics, employment, revenue, assets and liabilities. In addition, the retrieval of robust administrative identifiers enables subsequent linkage to other survey and administrative data sources. The integrated dataset will support direct and comparative analytical studies on the performance of Canadian institutions that obtained patents in the United States between 2000 and 2011.

    Release date: 2016-03-24
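
    The figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the overall match rate and the institutional patent count it derives are implied approximations computed here from rounded percentages, not figures reported by the paper, and the variable names are hypothetical.

```python
# Counts stated in the abstract.
individuals = 8_844      # individual patent assignees
institutions = 5_318     # institutional patent assignees
entities = 14_162        # distinct Canadian entities
patents = 41_619         # U.S. patents granted

# The two assignee groups should add up to the total number of entities.
assert individuals + institutions == entities

# Match rates stated in the abstract (rounded to one decimal place).
rate_individuals = 0.245
rate_institutions = 0.847

# Implied overall match rate: an entity-weighted average of the two rates.
# Approximate, because the inputs are rounded percentages.
matched = individuals * rate_individuals + institutions * rate_institutions
overall_rate = matched / entities

# Implied number of patents held by institutions (73.0% of all patents).
institution_patents = round(0.730 * patents)

print(f"Implied overall match rate: {overall_rate:.1%}")
print(f"Implied institutional patents: {institution_patents:,}")
```

    Because institutions hold most of the patents but are the smaller group of assignees, the entity-weighted match rate sits well below the institutional rate, which is consistent with the paper's decision to focus on institutional assignees.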

  • Technical products: 11-522-X201700014715
    Description:

    In preparation for the 2021 UK Census, the ONS has committed to an extensive research programme exploring how linked administrative data can be used to support conventional statistical processes. Item-level edit and imputation (E&I) will play an important role in adjusting the 2021 Census database. However, uncertainty associated with the accuracy and quality of available administrative data renders the efficacy of an integrated census-administrative data approach to E&I unclear. Current constraints that dictate an anonymised ‘hash-key’ approach to record linkage to ensure confidentiality add to that uncertainty. Here, we provide preliminary results from a simulation study comparing the predictive and distributional accuracy of the conventional E&I strategy implemented in CANCEIS for the 2011 UK Census to that of an integrated approach using synthetic administrative data with systematically increasing error as auxiliary information. In this initial phase of research we focus on imputing single year of age. The aim of the study is to gain insight into whether auxiliary information from admin data can improve imputation estimates and where the different strategies fall on a continuum of accuracy.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014710
    Description:

    The Data Warehouse has modernized the way the Canadian System of Macroeconomic Accounts (MEA) is produced and analyzed today. Its continuing evolution facilitates the amount and types of analytical work done within the MEA. It brings in the needed element of harmonization and confrontation as the macroeconomic accounts move toward full integration. The improvements in quality, transparency, and timeliness have strengthened the statistics that are being disseminated.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014724
    Description:

    At the Institut national de santé publique du Québec, the Quebec Integrated Chronic Disease Surveillance System (QICDSS) has been used daily for approximately four years. The benefits of this system are numerous for measuring the extent of diseases more accurately, evaluating the use of health services properly and identifying certain groups at risk. However, in the past months, various problems have arisen that have required a great deal of careful thought. The problems have affected various areas of activity, such as data linkage, data quality, coordinating multiple users and meeting legal obligations. The purpose of this presentation is to describe the main challenges associated with using QICDSS data and to present some possible solutions. In particular, this presentation discusses the processing of five data files that not only come from five different sources, but also are not mainly used for chronic disease surveillance. The varying quality of the data, both across files and within a given file, will also be discussed. Certain situations associated with the simultaneous use of the system by multiple users will also be examined. Examples will be given of analyses of large data sets that have caused problems. As well, a few challenges involving disclosure and the fulfillment of legal agreements will be briefly discussed.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014729
    Description:

    The use of administrative datasets as a data source in official statistics has become much more common as there is a drive for more outputs to be produced more efficiently. Many outputs rely on linkage between two or more datasets, and this is often undertaken in a number of phases with different methods and rules. In these situations we would like to be able to assess the quality of the linkage, and this involves some re-assessment of both links and non-links. In this paper we discuss sampling approaches to obtain estimates of false negatives and false positives with reasonable control of both accuracy of estimates and cost. Approaches to stratification of links (non-links) to sample are evaluated using information from the 2011 England and Wales population census.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014759
    Description:

    Many of the challenges and opportunities of modern data science have to do with dynamic aspects: evolving populations, the growing volume of administrative and commercial data on individuals and establishments, continuous flows of data and the capacity to analyze and summarize them in real time, and the deterioration of data absent the resources to maintain them. With its emphasis on data quality and supportable results, the domain of Official Statistics is ideal for highlighting statistical and data science issues in a variety of contexts. The messages of the talk include the importance of population frames and their maintenance; the potential for use of multi-frame methods and linkages; how the use of large scale non-survey data as auxiliary information shapes the objects of inference; the complexity of models for large data sets; the importance of recursive methods and regularization; and the benefits of sophisticated data visualization tools in capturing change.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014722
    Description:

    The U.S. Census Bureau is researching ways to incorporate administrative data in decennial census and survey operations. Critical to this work is an understanding of the coverage of the population by administrative records. Using federal and third party administrative data linked to the American Community Survey (ACS), we evaluate the extent to which administrative records provide data on foreign-born individuals in the ACS and employ multinomial logistic regression techniques to evaluate characteristics of those who are in administrative records relative to those who are not. We find that overall, administrative records provide high coverage of foreign-born individuals in our sample for whom a match can be determined. The odds of being in administrative records are found to be tied to the processes of immigrant assimilation – naturalization, higher English proficiency, educational attainment, and full-time employment are associated with greater odds of being in administrative records. These findings suggest that as immigrants adapt and integrate into U.S. society, they are more likely to be involved in government and commercial processes and programs for which we are including data. We further explore administrative records coverage for the two largest race/ethnic groups in our sample – Hispanic and non-Hispanic single-race Asian foreign born, finding again that characteristics related to assimilation are associated with administrative records coverage for both groups. However, we observe that neighborhood context impacts Hispanics and Asians differently.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014726
    Description:

    Internal migration is one of the components of population growth estimated at Statistics Canada. It is estimated by comparing individuals’ addresses at the beginning and end of a given period. The Canada Child Tax Benefit and T1 Family File are the primary data sources used. Address quality and coverage of more mobile subpopulations are crucial to producing high-quality estimates. The purpose of this article is to present the results of evaluations of these elements using access to more tax data sources at Statistics Canada.

    Release date: 2016-03-24

  • Technical products: 11-522-X201700014741
    Description:

    Statistics Canada’s mandate includes producing statistical data to shed light on current business issues. The linking of business records is an important aspect of the development, production, evaluation and analysis of these statistical data. As record linkage can intrude on one’s privacy, Statistics Canada uses it only when the public good is clear and outweighs the intrusion. Record linkage is experiencing a revival triggered by a greater use of administrative data in many statistical programs. There are many challenges to business record linkage. For example, many administrative files do not have common identifiers, information is recorded in non-standardized formats, information contains typographical errors, administrative data files are usually large in size, and finally the evaluation of multiple record pairings makes absolute comparison impractical and sometimes impossible. Due to the importance and challenges associated with record linkage, Statistics Canada has been developing a record linkage standard to help users optimize their business record linkage process. For example, this process includes building on a record linkage blocking strategy that reduces the number of record pairs to compare and match, making use of Statistics Canada’s internal software to conduct deterministic and probabilistic matching, and creating standard business name and address fields on Statistics Canada’s Business Register. This article gives an overview of the business record linkage methodology and looks at various economic projects which use record linkage at Statistics Canada; these include projects in the National Accounts, International Trade, Agriculture and the Business Register.

    Release date: 2016-03-24
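
    The blocking strategy mentioned in the abstract above can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the records are invented, the blocking key (the first three characters of the postal code) is a hypothetical choice, and the sketch pairs records within a single file, whereas the article concerns Statistics Canada's internal software and standards, which are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, block_key):
    """Return the set of record-id pairs that share a blocking key.

    Only records falling in the same block are paired, so far fewer
    comparisons are needed than when every record is compared with
    every other record.
    """
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])

    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical business records, invented for illustration only.
records = [
    {"id": 1, "name": "Acme Ltd", "postal": "K1A 0T6"},
    {"id": 2, "name": "ACME Limited", "postal": "K1A 0B1"},
    {"id": 3, "name": "Maple Inc", "postal": "M5V 2T6"},
    {"id": 4, "name": "Maple Incorporated", "postal": "M5V 3L9"},
    {"id": 5, "name": "Borealis Corp", "postal": "H3B 1A7"},
]


def postal_prefix(rec):
    """Blocking key (an assumption): first three postal code characters."""
    return rec["postal"][:3]


pairs = candidate_pairs(records, postal_prefix)
all_pairs = len(records) * (len(records) - 1) // 2

print(f"{len(pairs)} candidate pairs after blocking, {all_pairs} without")
```

    The surviving candidate pairs would then go on to deterministic or probabilistic comparison of names and addresses. The trade-off is that true matches whose blocking keys disagree, for example because of a mistyped postal code, are never compared.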

  • Technical products: 11-522-X201700014708
    Description:

    Statistics Canada’s Household Survey Frames (HSF) Programme provides various universe files that can be used alone or in combination to improve survey design, sampling, collection, and processing in the traditional “need to contact a household” model. Even as surveys are migrating onto this core suite of products, the HSF is starting to plan the changes to infrastructure, organisation, and linkages with other data assets in Statistics Canada that will help enable a shift to increased use of a wide variety of administrative data as input to the social statistics programme. The presentation will provide an overview of the HSF Programme and of the foundational concepts that will need to be implemented to expand linkage potential, and will identify strategic research being undertaken toward 2021.

    Release date: 2016-03-24
