Statistics by subject – Statistical methods

All (109) (25 of 109 results)

  • Articles and reports: 11F0019M2002181
    Description:

    We use data from the Canadian National Longitudinal Survey of Children and Youth to address two questions. To what extent do parents and children agree when asked identical questions about child well-being? To what extent do differences in their responses affect what one infers from multivariate analysis of the data? The correspondence between parent and child in the assessment of child well-being is only slight to fair. Agreement is stronger for more observable outcomes, such as schooling performance, and weaker for less observable outcomes, such as emotional disorders. We regress both sets of responses on a standard set of socio-economic characteristics. We also conduct formal and informal tests of the differences in what one would infer from these two sets of regressions.

    Release date: 2002-10-23

  • Articles and reports: 82-005-X20020016479
    Description:

    The Population Health Model (POHEM) is a policy analysis tool that helps answer "what-if" questions about the health and economic burden of specific diseases and the cost-effectiveness of administering new diagnostic and therapeutic interventions. This simulation model is particularly pertinent in an era of fiscal restraint, when new therapies are generally expensive and difficult policy decisions are being made. More important, it provides a base for a broader framework to inform policy decisions using comprehensive disease data and risk factors. Our "base case" models comprehensively estimate the lifetime costs of treating breast, lung and colorectal cancer in Canada. Our cancer models have shown the large financial burden of diagnostic work-up and initial therapy, as well as the high costs of hospitalizing those dying of cancer. Our core cancer models (lung, breast and colorectal cancer) have been used to evaluate the impact of new practice patterns. We have used these models to evaluate new chemotherapy regimens as therapeutic options for advanced lung cancer; the health and financial impact of reducing the hospital length of stay for initial breast cancer surgery; and the potential impact of population-based screening for colorectal cancer. To date, the most interesting intervention we have studied has been the use of tamoxifen to prevent breast cancer among high risk women.

    Release date: 2002-10-08

  • Technical products: 11-522-X2001001
    Description:

    Symposium 2001 was the eighteenth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular theme. In 2001, the theme was: "Achieving Data Quality in a Statistical Agency: a Methodological Perspective".

    Symposium 2001 was held from October 17 to October 19, 2001 in Hull, Quebec and it attracted over 560 people from 21 countries. A total of 83 papers were presented. Aside from translation and formatting, the papers, as submitted by the authors, have been reproduced in these proceedings.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016283
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The accurate recording of patients' Indigenous status in hospital separations data is critical to analyses of health service use by Aboriginal and Torres Strait Islander Australians, who have relatively poor health. However, the accuracy of these data is not well understood. In 1998, a methodology for assessing the accuracy of these data was piloted in 11 public hospitals. Data were collected for 8,267 patients using a personal interview and compared with the corresponding, routinely collected data. Among the 11 hospitals, the proportion of patients correctly recorded as Indigenous ranged from 55% to 100%. Overall, hospitals with high proportions of Indigenous persons in their catchment areas reported more accurate data. The methodology has since been used to assess data quality in hospitals in two Australian states and to promote best practice in data collection.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016277
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The advent of computerized record-linkage methodology has facilitated the conduct of cohort mortality studies in which exposure data in one database are electronically linked with mortality data from another database. In this article, the impact of linkage errors on estimates of epidemiological indicators of risk, such as standardized mortality ratios and relative risk regression model parameters, is explored. It is shown that these indicators can be subject to bias and additional variability in the presence of linkage errors, with false links and non-links leading to positive and negative bias, respectively, in estimates of the standardized mortality ratio. Although linkage errors always increase the uncertainty in the estimates, bias can be effectively eliminated in the special case in which the false positive rate equals the false negative rate within homogeneous states defined by cross-classification of the covariates of interest.

    Release date: 2002-09-12
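
    A hedged numerical illustration of the bias directions described above, under a deliberately simple model in which false links and false non-links are both expressed as rates relative to the number of true links (an assumption made for this sketch, not a formula taken from the paper):

```python
def observed_smr(true_deaths, expected_deaths, false_link_rate, false_nonlink_rate):
    """Toy model: false links add spurious deaths, false non-links drop true ones."""
    observed = true_deaths * (1.0 + false_link_rate - false_nonlink_rate)
    return observed / expected_deaths

print(observed_smr(100, 80, 0.00, 0.00))  # error-free SMR: 1.25
print(observed_smr(100, 80, 0.10, 0.00))  # false links only -> positive bias
print(observed_smr(100, 80, 0.00, 0.10))  # missed links only -> negative bias
print(observed_smr(100, 80, 0.10, 0.10))  # equal rates -> the bias cancels (1.25 again)
```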

  • Technical products: 11-522-X20010016262
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The demand for information on the electronic economy requires statistical agencies to assess the relevance of, and improve the quality of, their existing measurement programs. Innovations at the U.S. Census Bureau have helped the Bureau meet users' urgent needs for this information and improve the quality of the data. Through research conducted at the U.S. Census Bureau, as well as by tapping into the expertise of academia, the private sector and other government agencies, the new data on electronic commerce and electronic business processes have been strengthened. Using both existing and new data, this research has produced key new estimates of the size, scope and impact of the new economy.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016230
    Description:

    This publication consists of three papers, each addressing data quality issues associated with a large and complex survey. Two of the case studies involve household surveys of labour force activity and the third focuses on a business survey. The papers each address a data quality topic from a different perspective, but share some interesting common threads.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016241
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Leslie Kish long advocated the use of the "rolling sample" design. With non-overlapping, monthly panels that can be cumulated over different lengths of time for domains of different sizes, the rolling sample design enables a single survey to serve multiple purposes. The Census Bureau's new American Community Survey uses such a rolling sample design with annual averages to measure change at the state level, and three-year or five-year moving averages to describe progressively smaller domains. This paper traces Kish's influence on the development of the American Community Survey, and discusses some practical methodological issues that had to be addressed during the implementation of the design.

    Release date: 2002-09-12
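
    A minimal sketch of the cumulation idea described above, using hypothetical monthly panel estimates: one-year averages serve larger domains, while three- and five-year moving averages serve progressively smaller ones.

```python
import numpy as np

rng = np.random.default_rng(2002)

# 60 hypothetical estimates from non-overlapping monthly panels
monthly = 20.0 + 0.05 * np.arange(60) + rng.normal(0.0, 1.5, size=60)

def moving_average(series, months):
    """Cumulate monthly panel estimates over a window of the given length."""
    return np.convolve(series, np.ones(months) / months, mode="valid")

annual   = moving_average(monthly, 12)  # 1-year averages: larger domains, tracks change
three_yr = moving_average(monthly, 36)  # 3-year averages: smaller domains
five_yr  = moving_average(monthly, 60)  # 5-year averages: smallest domains
print(annual[-1], three_yr[-1], five_yr[-1])
```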

  • Technical products: 11-522-X20010016231
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    In 2000, the Behavioral Risk Factor Surveillance System (BRFSS) conducted monthly telephone surveys in 50 American states, the District of Columbia, and Puerto Rico; each jurisdiction was responsible for collecting its own survey data. In Maine, data collection was split between the state health department and ORC Macro, a commercial market research firm. Examination of survey outcome rates, selection biases and missing values for income suggests that the Maine health department data are more accurate. However, out of 18 behavioural health risk factors, only four are statistically different by data collector, and for these four factors, the data collected by ORC Macro seem more accurate.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016237
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Secondary users of health information often assume that administrative data provides a relatively sound basis for making important planning and policy decisions. If errors are evenly or randomly distributed, this assumption may have little impact on these decisions. However, when information sources contain systematic errors, or when systematic errors are introduced during the creation of master files, this assumption can be damaging.

    The most common systematic errors involve underreporting activities for a specific population; inaccurate re-coding of spatial information; and differences in data entry protocols, which have raised questions about the consistency of data submitted by different tracking agencies. The Central East Health Information Partnership (CEHIP) has identified a number of systematic errors in administrative databases and has documented many of these in reports distributed to partner organizations.

    This paper describes how some of these errors were identified and notes the processes that give rise to the loss of data integrity. The conclusion addresses some of the impacts these problems have for health planners, program managers and policy makers.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016229
    Description:

    This paper discusses the approach that Statistics Canada has taken to improve the quality of annual business surveys through their integration in the Unified Enterprise Survey (UES). The primary objective of the UES is to measure the final annual sales of goods and services accurately by province, in sufficient detail and in a timely manner.

    This paper describes the methodological approaches that the UES has used to improve financial and commodity data quality in four broad areas. These include improved coherence of the data collected from different levels of the enterprise, better coverage of industries, better depth of information (in the sense of more content detail and estimates for more detailed domains) and better consistency of the concepts and methods across industries.

    The approach to achieving quality has been to (a) establish a base measure of the quality of the business survey program prior to the UES, (b) measure the annual data quality of the UES, and (c) carry out specific studies to better understand the quality of UES data and methods.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016250
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper describes the Korea National Statistics Office's (KNSO) experiences in data quality assessment and introduces the strategies of institutionalizing the assessment procedure. This paper starts by briefly describing the definition of quality assessment, quality dimensions and indicators at the national level. It introduces the current situation of the quality assessment process in KNSO and lists the six dimensions of quality that have been identified: relevance, accuracy, timeliness, accessibility, comparability and efficiency. Based on the lessons learned from these experiences, this paper points out three essential elements required in an advanced system of data quality assessment: an objective and independent planning system, a set of appropriate indicators and competent personnel specialized in data quality assessment.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016267
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    In practice, a list of the desired collection units is not always available. Instead, a list of different units that are somehow related to the collection units may be provided, thus producing two related populations, UA and UB. An estimate for UB needs to be created; however, the sampling frame provided is only for the UA population.

    One solution for this problem is to select a sample from UA (sA) and produce an estimate for UB using the existing relationship between the two populations. This process may be referred to as indirect sampling. To assign a selection probability, or an estimation weight, for the survey units, Lavallée (1995) developed the generalized weight share method (GWSM). The GWSM produces an estimation weight that basically constitutes an average of the sampling weights of the units in sA.

    This paper discusses the types of non-response associated with indirect sampling and the possible estimation problems that can occur in the application of the GWSM.

    Release date: 2002-09-12
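
    A minimal sketch of the weight share idea in its basic unit-level form (the full GWSM also handles clusters and more general link structures); the link matrix, inclusion probabilities and y-values below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1995)

# links[j, k] = 1 if frame unit j (population UA) is linked to target unit k (population UB)
links = np.array([[1, 0, 0],
                  [1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 1]])
pi = np.array([0.5, 0.5, 0.25, 0.25, 0.25])  # inclusion probabilities of the UA units
y  = np.array([10.0, 20.0, 30.0])            # variable of interest on the UB units

def weight_share_estimate(in_sample):
    d = in_sample / pi                    # design weights of the sampled UA units (0 if not sampled)
    w = links.T @ d / links.sum(axis=0)   # each UB weight: average of the weights of its linked UA units
    return np.sum(w * y)                  # estimate of the UB total

# Simulation check under Poisson sampling: the expectation is close to the true total (60)
estimates = [weight_share_estimate(rng.random(5) < pi) for _ in range(20000)]
print(np.mean(estimates), y.sum())
```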

  • Technical products: 11-522-X20010016292
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Statistics can serve to benefit society, but, if manipulated politically or otherwise, statistics may also be used by the powerful as instruments to maintain the status quo or even to oppress. Statisticians working internationally, usually employed by international, supra-national or bilateral agencies, face a range of problems as they try to 'make a difference' in the lives of the poorest people in the world. One of the most difficult challenges statisticians face is the dilemma between open accountability and national sovereignty (in relation to what data are collected, the methods used and who is to have access to the results). Because of increasing globalization and new modalities of development co-operation and partnership, statisticians work in a constantly changing environment.

    This paper addresses the problems of improving the quality of cross-national data. It aims to raise consciousness of the role of statisticians at the international level; describe some of the constraints under which statisticians work; address principles which ought to govern the general activities of statisticians; and evaluate, in particular, the relevance of such principles to international statisticians. The paper also draws upon the recent Presidential Address to the Royal Statistical Society (presented June 2001; JRSS Series D, forthcoming).

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016273
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    For a multivariate survey based on simple random sampling, the problem of calculating an optimal sampling size becomes one of solving a stochastic programming problem in which each constraint corresponds to a bounded estimate of the variance for a commodity. The problem is stochastic because the set of data collected from a previous survey makes the components of each constraint random variables; consequently, the calculated size of a sample is itself a random variable and is dependent on the quality of that set of data. By means of a Monte Carlo technique, an empirical probability distribution of the optimal sampling size can be produced for finding the probability of the event that the prescribed precision will be achieved. Corresponding to each set of previously collected data, there is an optimal size and allocation across strata. While reviewing these over several consecutive periods of time, it may be possible to identify troublesome strata and to see a trend in the stability of the data. The review may reveal an oscillatory pattern in the sizes of the samples that might have evolved over time due to the dependency of one allocation on another.

    Release date: 2002-09-12
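
    A hedged sketch of the Monte Carlo idea: bootstrap the previous survey's data within strata and record the total sample size required, under Neyman allocation, to meet a variance target. The strata, data and target below are hypothetical, and the single-variable formula stands in for the multivariate stochastic program discussed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2001)

# Previous survey data for one commodity, by stratum (hypothetical)
prev = [rng.lognormal(mean=m, sigma=s, size=n)
        for m, s, n in [(3.0, 0.8, 120), (4.0, 0.6, 80), (5.0, 1.0, 50)]]
N_h = np.array([5000, 2000, 800])  # stratum population sizes
V0  = 4.0                          # target variance of the estimated population mean

def required_n(strata):
    """Total sample size under Neyman allocation meeting the variance bound V0."""
    S = np.array([np.std(x, ddof=1) for x in strata])
    N = N_h.sum()
    return (N_h @ S) ** 2 / (N ** 2 * V0 + N_h @ S ** 2)

# Empirical distribution of the "optimal" size across bootstrap replicates of the old data
sizes = [required_n([rng.choice(x, size=len(x), replace=True) for x in prev])
         for _ in range(2000)]
print(np.percentile(sizes, [5, 50, 95]))
```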

  • Technical products: 11-522-X20010016296
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The Canadian Labour Force Survey (LFS) is one of Statistics Canada's most important surveys. It is a monthly survey that collects data on each respondent's labour force status, the nature of their work or reason for not working, and their demographic characteristics. The survey sample consists of approximately 52,000 households. Coverage error is a measure of data quality that is important to any survey. One of the key measures of coverage error in the LFS is the percentage difference between the Census of Population estimates and the LFS population counts; this error is called slippage. A negative value indicates that the LFS has an overcoverage problem, while a positive value indicates an undercoverage problem. In general, slippage is positive, meaning that the LFS consistently misses people who should be enumerated.

    The purpose of this study was to determine why slippage is increasing and what can be done to remedy it. The study was conducted in two stages. The first stage was a historical review of the projects that have studied and tried to control slippage in the LFS, as well as the operational changes that have been implemented over time. The second stage was an analysis of factors such as vacancy rates, non-response, demographics, urban and rural status and the impact of these factors on the slippage rate.

    Release date: 2002-09-12
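
    A minimal sketch of the slippage measure as described above (the percentage difference between the census-based population estimate and the LFS weighted count, with positive values indicating undercoverage); the figures are hypothetical:

```python
def slippage(census_estimate, lfs_count):
    """Percentage difference between the census-based estimate and the LFS count."""
    return 100.0 * (census_estimate - lfs_count) / census_estimate

# e.g. a census-based estimate of 24.0 million persons against an LFS weighted
# count of 23.1 million gives slippage of about +3.75% (undercoverage).
print(slippage(24_000_000, 23_100_000))
```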

  • Technical products: 11-522-X20010016302
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This session provides three more contributions to the continuing discussion of national statistical offices' response to the topic of quality, in particular the subtopic of communicating quality. These three papers make the important and necessary assumption that national statistical offices have an obligation to report the limitations of the data; that users should know and understand those limitations; and that, as a result of understanding the limitations, users ought to be able to determine whether the data are fit for their purposes.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016247
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper describes joint research by the Office for National Statistics (ONS) and Southampton University regarding the evaluation of several different approaches to the local estimation of International Labour Office (ILO) unemployment. The need to compare estimators with different underlying assumptions has led to a focus on evaluation methods that are (partly at least) model-independent. Model-fit diagnostics that have been considered include: various residual procedures, cross-validation, predictive validation, consistency with marginals, and consistency with direct estimates within single cells. These diagnostics have been used to compare different model-based estimators with each other and with direct estimators.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016258
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    To fill statistical gaps in the areas of health determinants, health status and health system usage by the Canadian population at the health region levels (sub-provincial areas or regions of interest to health authorities), Statistics Canada established a new survey called the Canadian Community Health Survey (CCHS). The CCHS consists of two separate components: a regional survey in the first year and a provincial survey in the second year. The main purpose of the regional survey, for which collection took place between September 2000 and October 2001, was to produce cross-sectional estimates for 136 health regions in Canada, based on a sample of more than 134,000 respondents. This article focuses on the various measures taken at the time of data collection to ensure a high level of quality for this large-scale survey.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016306
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The paper deals with the problem of automatically detecting and correcting inconsistent or out-of-range data in a general process of statistical data collection. The proposed approach is capable of handling both qualitative and quantitative values. Its purpose is to overcome the computational limits of the Fellegi-Holt method while maintaining its positive features. As customary, data records must respect a set of rules in order to be declared correct. By encoding the rules as linear inequalities, we develop mathematical models for the problems of interest. As a first relevant point, by solving a sequence of feasibility problems, the set of rules itself is checked for inconsistency or redundancy. As a second relevant point, imputation is performed by solving a sequence of set-covering problems.

    Release date: 2002-09-12
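
    A hedged sketch of the two ingredients named above: edit rules encoded as linear inequalities, and error localization posed as a set-covering problem over the fields of a failing record. The fields and rules are hypothetical, and a complete treatment would also need the feasibility checks and implied edits the paper discusses.

```python
import itertools
import numpy as np

# Edit rules encoded as linear inequalities A @ x <= b over the record x = (age, years_married, hours_worked)
A = np.array([[-1.0, 0.0, 0.0],   # age >= 0
              [ 1.0, 0.0, 0.0],   # age <= 120
              [-1.0, 1.0, 0.0],   # years_married <= age - 14
              [ 0.0, 0.0, 1.0]])  # hours_worked <= 100
b = np.array([0.0, 120.0, -14.0, 100.0])

record = np.array([10.0, 3.0, 160.0])       # fails the marriage-age and hours edits
failed = np.where(A @ record > b)[0]

# Set covering: smallest set of fields whose modification touches every failed edit
# (a field "covers" an edit if it has a non-zero coefficient in that inequality).
solution = None
for size in range(1, A.shape[1] + 1):
    for fields in itertools.combinations(range(A.shape[1]), size):
        if all(any(A[r, f] != 0.0 for f in fields) for r in failed):
            solution = fields
            break
    if solution:
        break
print("failed edits:", failed, "fields to impute:", solution)
```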

  • Technical products: 11-522-X20010016290
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Over the last five years, the United Kingdom Office for National Statistics has been implementing a series of initiatives to improve the process of collecting business statistics data in the UK. These initiatives include the application of a range of new technology solutions to data collection; document imaging and scanned forms have replaced paper forms for all processes. For some inquiries, the paper form has been eliminated altogether by the adoption of Telephone Data Entry (TDE). Reporting all incoming data in electronic format has allowed workflow systems to be introduced across a wide range of data collection activities.

    This paper describes the recent history of these initiatives and covers proposals that are presently at a pilot stage or being projected for the next four years. It also examines the future strategy of TDE data collection via the Internet, and the current pilots and security issues under consideration.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016252
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The use of sample co-ordination in business surveys is crucial because it provides a way of smoothing out the survey burden. In many co-ordination methodologies, the random numbers representing the units are permanent and the sample selection method varies. In the microstrata methodology, however, it is the selection function that is permanent, while the random numbers are systematically rearranged between units for different co-ordination purposes: smoothing out the burden, updating panels or minimizing the overlap between two surveys. These rearrangements are made in the intersections of strata, known as microstrata. The microstrata method has good mathematical properties and provides a general approach to sample co-ordination in which births, deaths and strata changes are handled automatically. There are no particular constraints on stratification or on the rotation rates of panels. Two software programs have been written to implement this method and its evolutions: SALOMON in 1998 and MICROSTRAT in 2001.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016264
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Conducting a census by traditional methods is becoming more difficult. The possibility of cross-linking administrative files provides an attractive alternative to conducting periodic censuses (Laihonen, 2000; Borchsenius, 2000). This method was proposed in a recent article by Nathan (2001). The Institut National de la Statistique et des Études Économiques (INSEE) census redesign is based on the idea of a "continuous census," originally suggested by Kish (1981, 1990) and Horvitz (1986). The first approach, which could be feasible in France, can be found in Deville and Jacod's paper (1996). This particular article reviews the methodological developments and approaches used since INSEE started its population census redesign program.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016311
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    These notes discuss the importance of accuracy within the broader framework of data quality, which has been adopted by many statistical agencies.

    Accuracy is a product or service characteristic. The data quality process influences how accuracy and other quality attributes, such as timeliness, relevance and accessibility, are achieved. This paper studies Deming's ideas, as well as those of Juran and many others. It supports the distinction and disentanglement of these two kinds of data quality, both of which have been themes at the Conference.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016287
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    In this paper we discuss a specific component of a research agenda aimed at disclosure protections for "non-traditional" statistical outputs. We argue that these outputs present different disclosure risks than normally faced and hence may require new thinking on the issue. Specifically, we argue that kernel density estimators, while powerful (high quality) descriptions of cross-sections, pose potential disclosure risks that depend materially on the selection of bandwidth. We illustrate these risks using a unique, non-confidential data set on the statistical universe of coal mines and present potential solutions. Finally, we discuss current practices at the U.S. Census Bureau's Center for Economic Studies for performing disclosure analysis on kernel density estimators.

    Release date: 2002-09-12
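
    A hedged sketch of the bandwidth-dependent disclosure risk described above, using hypothetical mine-level data with one isolated large unit: a very small bandwidth concentrates a sharp spike of density on that single unit, while a larger bandwidth smooths it away.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(268)

# Hypothetical output data for 401 mines, one of them isolated and very large
output = np.concatenate([rng.normal(100.0, 15.0, 400), [400.0]])

grid = np.linspace(350.0, 450.0, 5)          # neighbourhood of the isolated unit
narrow = gaussian_kde(output, bw_method=0.02)
wide   = gaussian_kde(output, bw_method=0.5)

print(np.round(narrow(grid), 5))  # visible spike of density around the single large mine
print(np.round(wide(grid), 5))    # spike smoothed away
```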

Data (0) (0 results)

Analysis (25) (25 of 25 results)

  • Articles and reports: 12-001-X20020016414
    Description:

    Census-taking by traditional methods is becoming more difficult. The possibility of cross-linking administrative files provides an attractive alternative to conducting periodic censuses (Laihonen 2000; Borchsenius 2000). This was proposed in a recent article by Nathan (2001). The Institut national de la statistique et des études économiques (INSEE) census redesign is based on the idea of a 'continuous census,' originally suggested by Kish (1981, 1990) and Horvitz (1986). A first approach that would be feasible in France can be found in Deville and Jacod (1996). This article reviews methodological developments since INSEE started its population census redesign program.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016424
    Description:

    A variety of estimators for the variance of the General Regression (GREG) estimator of a mean have been proposed in the sampling literature, mainly with the goal of estimating the design-based variance. Under certain conditions, estimators can be easily constructed that are approximately unbiased for both the design-variance and the model-variance. Several dual-purpose estimators are studied here in single-stage sampling. These choices are robust estimators of a model-variance even if the model that motivates the GREG has an incorrect variance parameter.

    A key feature of the robust estimators is the adjustment of squared residuals by factors analogous to the leverages used in standard regression analysis. We also show that the delete-one jackknife estimator implicitly includes the leverage adjustments and is a good choice from either the design-based or model-based perspective. In a set of simulations, these variance estimators have small bias and produce confidence intervals with near-nominal coverage rates for several sampling methods, sample sizes and populations in single-stage sampling.

    We also present simulation results for a skewed population where all variance estimators perform poorly. Samples that do not adequately represent the units with large values lead to estimated means that are too small, variance estimates that are too small and confidence intervals that cover at far less than the nominal rate. These defects can be avoided at the design stage by selecting samples that cover the extreme units well. However, in populations with inadequate design information this will not be feasible.

    Release date: 2002-07-05
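
    A minimal single-stage simple random sampling sketch in the spirit of the estimators described above (not the authors' exact formulas): a GREG estimate of the population mean, a naive residual-based variance estimate, and one in which the residuals are adjusted by the regression leverages.

```python
import numpy as np

rng = np.random.default_rng(424)

# Simulated finite population with one auxiliary variable, and an SRS of size n
N, n = 5_000, 100
x_pop = rng.gamma(3.0, 10.0, N)
y_pop = 5.0 + 2.0 * x_pop + rng.normal(0.0, 15.0, N)
s = rng.choice(N, n, replace=False)
x, y = x_pop[s], y_pop[s]

# GREG estimator of the mean of y, using the known population mean of x
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
greg_mean = y.mean() + beta[1] * (x_pop.mean() - x.mean())

# Residuals, leverages, and two residual-based variance estimators
e = y - X @ beta
h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)  # diagonal of the hat matrix
f = n / N
v_naive    = (1 - f) / n * np.var(e, ddof=1)
v_adjusted = (1 - f) / n * np.var(e / (1 - h), ddof=1)      # leverage-adjusted residuals
print(greg_mean, v_naive, v_adjusted)
```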

  • Articles and reports: 12-001-X20020016422
    Description:

    In estimating variances so as to account for imputation for item non-response, Rao and Shao (1992) originated an approach based on adjusted replication. Further developments (particularly the extension to Balanced Repeated Replication of the jackknife replication of Rao and Shao) were made by Shao, Chen and Chen (1998). In this article, we explore how these methods can be implemented using replicate weights.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020019499
    Description:

    "In this Issue" is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. As well, it sometimes contains informations on structure or management changes in the journal.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016417
    Description:

    An approach to exploiting the data from multiple surveys and epochs by benchmarking the parameter estimates of logit models of binary choice and semiparametric survival models has been developed. The goal is to exploit the relatively rich source of socio-economic covariates offered by Statistics Canada's Survey of Labour and Income Dynamics (SLID), and also the historical time-span of the Labour Force Survey (LFS), enhanced by following individuals through each interview in their six-month rotation. A demonstration of how the method can be applied is given, using the maternity leave module of the LifePaths dynamic microsimulation project at Statistics Canada. The choice of maternity leave over job separation is specified as a binary logit model, while the duration of leave is specified as a semiparametric proportional hazards survival model with covariates together with a baseline hazard permitted to change each month. Both models are initially estimated by maximum likelihood from pooled SLID data on maternity leaves beginning in the period from 1993 to 1996, then benchmarked to annual estimates from the LFS from 1976 to 1992. In the case of the logit model, the linear predictor is adjusted by a log-odds estimate from the LFS. For the survival model, a Kaplan-Meier estimator of the hazard function from the LFS is used to adjust the predicted hazard in the semiparametric model.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016413
    Description:

    Leslie Kish long advocated a "rolling sample" design, with non-overlapping monthly panels which can be cumulated over different lengths of time for domains of different sizes. This enables a single survey to serve multiple purposes. The Census Bureau's new American Community Survey (ACS) uses such a rolling sample design, with annual averages to measure change at the state level, and three-year or five-year moving averages to describe progressively smaller domains. This paper traces Kish's influence on the development of the American Community Survey, and discusses some practical methodological issues that had to be addressed in implementing the design.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016408
    Description:

    Regression and regression-related procedures have become common in survey estimation. We review the basic properties of regression estimators, discuss implementation of regression estimation, and investigate variance estimation for regression estimators. The role of models in constructing regression estimators and the use of regression in non-response adjustment are also explored.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016419
    Description:

    Since some individuals in a population may lack telephones, telephone surveys using random digit dialling within strata may result in asymptotically biased estimators of ratios. The impact of not being able to sample the non-telephone population is examined. We take into account the propensity that a household owns a telephone when proposing a post-stratified telephone-weighted estimator, which seems to perform better than the typical post-stratified estimator in terms of mean squared error. Such coverage propensities are estimated using the Public Use Microdata Samples provided by the United States Census. Non-post-stratified estimators are considered when sample sizes are small. The asymptotic mean squared error of each of the estimators, along with its estimate based on a sample, is derived. Real examples are analysed using the Public Use Microdata Samples. Other forms of non-response are not examined herein.

    Release date: 2002-07-05
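
    A hedged sketch of a generic coverage adjustment consistent with the description above (not the authors' exact estimator): telephone-ownership propensities estimated from census microdata cells are used to inflate each telephone respondent's weight. All figures are hypothetical.

```python
import numpy as np

# Coverage propensities by region x income cell, estimated from census microdata
# counts of telephone households over all households (hypothetical counts).
phone_hh = np.array([[ 900, 1900, 2900],
                     [ 800, 1850, 2950]])
all_hh   = np.array([[1000, 2000, 3000],
                     [ 950, 1900, 3000]])
phi = phone_hh / all_hh

# Telephone-sample respondents: cell indices, responses and equal base weights (hypothetical)
region = np.array([0, 0, 1, 1, 0, 1])
income = np.array([0, 2, 1, 2, 1, 0])
y      = np.array([4.0, 9.0, 6.5, 8.0, 5.5, 3.5])
base_w = np.full(y.shape, 250.0)

w = base_w / phi[region, income]  # inflate by the inverse of the estimated coverage propensity
print("unadjusted mean:", np.average(y, weights=base_w))
print("coverage-adjusted mean:", np.average(y, weights=w))
```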

  • Articles and reports: 12-001-X20020016488
    Description:

    Sampling is a branch of and a tool for statistics, and the field of statistics was founded as a new paradigm in 1810 by Quetelet (Porter 1987; Stigler 1986). Statistics and statisticians deal with the effects of chance events on empirical data. The mathematics of chance had been developed centuries earlier to predict gambling games and to account for errors of observation in astronomy. Data were also compiled for commerce, banking, and government purposes. But combining chance with real data required a new theoretical view; a new paradigm. Thus, statistical science and its various branches, which are the products of the maturity of human development (Kish 1985), arrived late in history and academia. This article examines the new concepts in diverse aspects of sampling, which may also be known as new sampling paradigms, models or methods.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016421
    Description:

    As in most other surveys, non-response often occurs in the Current Employment Survey conducted monthly by the U.S. Bureau of Labor Statistics (BLS). In a given month, imputation using reported data from previous months generally provides more efficient survey estimators than ignoring non-respondents and adjusting survey weights. However, imputation also has an effect on variance estimation: treating imputed values as reported data and applying a standard variance estimation method lead to negatively biased variance estimators. In this article, we propose some variance estimators using the Grouped Balanced Half Sample method and re-imputation to take imputation into account. Some simulation results for the finite sample performance of the imputed survey estimators and their variance estimators are presented.

    Release date: 2002-07-05

  • Articles and reports: 12-001-X20020016420
    Description:

    The post-stratified estimator sometimes has empty strata. To address this problem, we construct a post-stratified estimator with post-strata sizes set in the sample. The post-strata sizes are then random in the population. The next step is to construct a smoothed estimator by calculating a moving average of the post-stratified estimators. Using this technique, it is possible to construct an exact theory of calibration on distribution. The estimator obtained is not only calibrated on distribution, it is also linear and completely unbiased. We then compare the calibrated estimator with the regression estimator. Lastly, we propose an approximate variance estimator that we validate using simulations.

    Release date: 2002-07-05

  • Articles and reports: 88-003-X20020026371
    Description:

    When constructing questions for questionnaires, one of the rules of thumb has always been "keep it short and simple." This article is the third in a series of lessons learned during cognitive testing of the pilot Knowledge Management Practices Survey. It studies the responses given to long questions, thick questionnaires and too many response boxes.

    Release date: 2002-06-14

  • Articles and reports: 88-003-X20020026369
    Description:

    Eliminating the "neutral" response in an opinion question not only encourages the respondent to choose a side, it gently persuades respondents to read the question. Learn how we used this technique to our advantage in the Knowledge Management Practices Survey, 2001.

    Release date: 2002-06-14

  • Articles and reports: 12-001-X20010026093
    Description:

    This paper presents weighting procedures that combine information from multiple panels of a repeated panel household survey for cross-sectional estimation. The dynamic character of a repeated panel survey is discussed in relation to estimation of population parameters at any wave of the survey. A repeated panel survey with overlapping panels is described as a special type of multiple frame survey, with the frames of the panels forming a time sequence. The paper proposes weighting strategies suitable for various multiple panel survey situations. The proposed weighting schemes involve an adjustment of weights in domains of the combined panel sample that represent identical time periods covered by the individual panels. A weight adjustment procedure that deals with changes in the panels over time is discussed. The integration of the various weight adjustments required for cross-sectional estimation in a repeated panel household survey is also discussed.

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010026096
    Description:

    Local polynomial regression methods are put forward to aid in exploratory data analysis for large-scale surveys. The proposed method relies on binning the data on the x-variable and calculating the appropriate survey estimates for the mean of the y-values at each bin. When binning on x has been carried out to the precision of the recorded data, the method is the same as applying the survey weights to the standard criterion for obtaining local polynomial regression estimates. The alternative of using classical polynomial regression is also considered, and a criterion is proposed to decide whether the nonparametric approach to modelling should be preferred over the classical approach. Illustrative examples are given from the 1990 Ontario Health Survey.

    Release date: 2002-02-28
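
    A minimal sketch of the limiting case noted above, in which binning to the precision of the recorded data reduces the method to applying the survey weights inside the standard local polynomial (here local linear) criterion; the data and weights are simulated.

```python
import numpy as np

rng = np.random.default_rng(1990)

# Simulated survey data: x, y and survey weights
n = 500
x = rng.uniform(0.0, 10.0, n)
y = np.sin(x) + 0.1 * x + rng.normal(0.0, 0.3, n)
w = rng.uniform(1.0, 4.0, n)

def local_linear(x0, x, y, w, h=0.8):
    """Survey-weighted local linear fit: minimize sum_i w_i K((x_i - x0)/h) (y_i - a - b(x_i - x0))^2."""
    k = np.exp(-0.5 * ((x - x0) / h) ** 2)       # Gaussian kernel weights
    sw = np.sqrt(w * k)
    X = np.column_stack([np.ones_like(x), x - x0])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]                               # intercept = fitted value at x0

grid = np.linspace(0.5, 9.5, 10)
print(np.round([local_linear(x0, x, y, w) for x0 in grid], 2))
```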

  • Articles and reports: 12-001-X20010026090
    Description:

    The number of calls in a telephone survey is used as an indicator of how difficult an intended respondent is to reach. This permits a probabilistic division of the non-respondents into non-susceptibles (those who will always refuse to respond), and the susceptible non-respondents (those who were not available to respond) in a model of the non-response. Further, it permits stochastic estimation of the views of the latter group and an evaluation of whether the non-response is ignorable for inference about the dependent variable. These ideas are implemented on the data from a survey in Metropolitan Toronto of attitudes toward smoking in the workplace. Using a Bayesian model, the posterior distribution of the model parameters is sampled by Markov Chain Monte Carlo methods. The results reveal that the non-response is not ignorable and those who do not respond are twice as likely to favor unrestricted smoking in the workplace as are those who do.

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010026094
    Description:

    This article reviews the methods that may be used to produce direct estimates for small areas, including stratification and oversampling, and forms of dual-frame estimation.

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010026097
    Description:

    A compositional time series is defined as a multivariate time series in which each of the series has values bounded between zero and one and the sum of the series equals one at each time point. Data with such characteristics are observed in repeated surveys when a survey variable has a multinomial response but interest lies in the proportion of units classified in each of its categories. In this case, the survey estimates are proportions of a whole subject to a unity-sum constraint. In this paper we employ a state space approach for modelling compositional time series from repeated surveys taking into account the sampling errors. The additive logistic transformation is used in order to guarantee predictions and signal estimates bounded between zero and one which satisfy the unity-sum constraint. The method is applied to compositional data from the Brazilian Labour Force Survey. Estimates of the vector of proportions and the unemployment rate are obtained. In addition, the structural components of the signal vector, such as the seasonals and the trends, are produced.

    Release date: 2002-02-28
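
    A minimal sketch of the additive logistic (log-ratio) transformation and its inverse, which is what guarantees that predictions mapped back from the transformed scale lie between zero and one and satisfy the unity-sum constraint; the composition shown is hypothetical.

```python
import numpy as np

def alr(p):
    """Additive log-ratio transform of a composition (last category as reference)."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])

def alr_inverse(z):
    """Back-transform: components are in (0, 1) and sum to one by construction."""
    e = np.exp(z)
    denom = 1.0 + e.sum(axis=-1, keepdims=True)
    return np.concatenate([e / denom, 1.0 / denom], axis=-1)

p = np.array([0.62, 0.05, 0.33])              # e.g. employed / unemployed / not in the labour force
z = alr(p)
print(alr_inverse(z), alr_inverse(z).sum())   # recovers p; sums to 1
print(alr_inverse(z + 0.1).sum())             # any prediction on the transformed scale maps back to a valid composition
```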

  • Articles and reports: 12-001-X20010026092
    Description:

    To augment the amount of available information, data from different sources are increasingly being combined. These databases are often combined using record linkage methods. When there is no unique identifier, a probabilistic linkage is used. In that case, a record on a first file is associated with a probability that it is linked to a record on a second file, and then a decision is taken on whether a possible link is a true link or not. This usually requires a non-negligible amount of manual resolution. It might then be legitimate to evaluate whether manual resolution can be reduced or even eliminated. This issue is addressed in this paper, where one tries to produce an estimate of a total (or a mean) of one population when using a sample selected from another population linked somehow to the first population. In other words, having two populations linked through probabilistic record linkage, we try to avoid any decision concerning the validity of links and still be able to produce an unbiased estimate for a total of one of the two populations. To achieve this goal, we suggest the use of the Generalised Weight Share Method (GWSM) described by Lavallée (1995).

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010026091
    Description:

    The theory of double sampling is usually presented under the assumption that one of the samples is nested within the other. This type of sampling is called two-phase sampling. The first-phase sample provides auxiliary information (x) that is relatively inexpensive to obtain, whereas the second-phase sample is used either to improve the estimate by means of a difference, ratio or regression estimator, or to follow up a sub-sample of non-respondent units. However, it is not necessary for one of the samples to be nested in the other or selected from the same frame. The case of non-nested double sampling is dealt with in passing in the classical works on sampling (Des Raj 1968; Cochran 1977). This method is now used in several national statistical agencies. This paper consolidates double sampling by presenting it in a unified manner. Several examples of surveys used at Statistics Canada illustrate this unification.

    Release date: 2002-02-28
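
    A minimal sketch of nested two-phase (double) sampling with a ratio estimator: a large, inexpensive first-phase sample measures only the auxiliary variable, and a small second-phase sub-sample measures the variable of interest. The population is simulated.

```python
import numpy as np

rng = np.random.default_rng(1977)

# Simulated population where y is expensive to measure but roughly proportional to a cheap x
N = 10_000
x = rng.gamma(shape=4.0, scale=25.0, size=N)
y = 1.8 * x + rng.normal(0.0, 20.0, size=N)

phase1 = rng.choice(N, size=2_000, replace=False)      # phase 1: measure x only
phase2 = rng.choice(phase1, size=200, replace=False)   # phase 2: sub-sample, measure y and x

ratio = y[phase2].mean() / x[phase2].mean()
ratio_estimate = x[phase1].mean() * ratio              # double-sampling ratio estimator of the mean of y

print("ratio estimate:", ratio_estimate)
print("phase-2-only estimate:", y[phase2].mean())
print("true mean:", y.mean())
```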

  • Articles and reports: 12-001-X20010026095
    Description:

    In this paper, we discuss the application of the bootstrap with a re-imputation step to capture the imputation variance (Shao and Sitter 1996) in stratified multistage sampling. We propose a modified bootstrap that does not require rescaling so that Shao and Sitter's procedure can be applied to the case where random imputation is applied and the first-stage stratum sample sizes are very small. This provides a unified method that works irrespective of the imputation method (random or nonrandom), the stratum size (small or large), the type of estimator (smooth or nonsmooth), or the type of problem (variance estimation or sampling distribution estimation). In addition, we discuss the proper Monte Carlo approximation to the bootstrap variance when using re-imputation together with resampling methods; in this setting, more care is needed than is typical. Similar results are obtained for the method of balanced repeated replications, which is often used in surveys and can be viewed as an analytic approximation to the bootstrap. Finally, some simulation results are presented to study the finite sample properties of various variance estimators for imputed data.

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010029567
    Description:

    "In this Issue" is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. As well, it sometimes contains information on structural or management changes in the journal.

    Release date: 2002-02-28

  • Articles and reports: 12-001-X20010026089
    Description:

    Telephone surveys are a convenient and efficient method of data collection. Bias may be introduced into population estimates, however, by the exclusion of nontelephone households from these surveys. Data from the U.S. Federal Communications Commission (FCC) indicates that five and a half to six percent of American households are without phone service at any given time. The bias introduced can be significant since nontelephone households may differ from telephone households in ways that are not adequately handled by poststratification. Many households, called "transients", move in and out of the telephone population during the year, sometimes due to economic reasons or relocation. The transient telephone population may be representative of the nontelephone population in general since its members have recently been in the nontelephone population.

    Release date: 2002-02-28

Reference (84) (25 of 84 results)

  • Technical products: 11-522-X2001001
    Description:

    Symposium 2001 was the eighteenth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular theme. In 2001, the theme was: "Achieving Data Quality in a Statistical Agency: a Methodological Perspective".

    Symposium 2001 was held from October 17 to October 19, 2001 in Hull, Quebec and it attracted over 560 people from 21 countries. A total of 83 papers were presented. Aside from translation and formatting, the papers, as submitted by the authors, have been reproduced in these proceedings.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016283
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The accurate recording of patients' Indegenous status in hospital separations data is critical to analyses of health service use by Aboriginal and Torres Strait Islander Australians, who have relatively poor health. However, the accuracy of these data is now well understood. In 1998, a methodology for assessing the data accuracy was piloted in 11 public hospitals. Data were collected for 8,267 patients using a personal interview, and compared with the corresponding, routinely collected data. Among the 11 hospitals, the proportion of patients correctly recorded as Indigenous ranged from 55 % to 100 %. Overall, hospitals with high proportions of Indigenous persons in their catchment areas reported more accurate data. The methodology has since been used to assess data quality in hospitals in two Australian states and to promote best practice data collection.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016277
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The advent of computerized record-linkage methodology has facilitated the conduct of cohort mortality studies in which exposure data in one database are electronically linked with mortality data from another database. In this article, the impact of linkage errors on estimates of epidemiological indicators of risk, such as standardized mortality ratios and relative risk regression model parameters, is explored. It is shown that these indicators can be subject to bias and additional variability in the presence of linkage errors, with false links and non-links leading to positive and negative bias, respectively, in estimates of the standardized mortality ratio. Although linkage errors always increase the uncertainty in the estimates, bias can be effectively eliminated in the special case in which the false positive rate equals the false negative rate within homogeneous states defined by cross-classification of the covariates of interest.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016262
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The demand for information on the electronic economy requires statistical agencies to assess the relevancy and improve the quality of their existing measurement programs. Innovations at the U.S. Census Bureau have helped the Bureau meet the user's urgent needs for this information, and improve the quality of the data. Through research conducted at the U.S. Census Bureau, as well as tapping into the expertise of academia, the private sector, and other government agencies, the new data on electronic commerce and electronic business processes has been strengthened. Using both existing and new data, the U.S. Census Bureau has discovered that research provides new key estimates of the size, scope, and impact of the new economy.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016230
    Description:

    This publication consists of three papers, each addressing data quality issues associated with a large and complex survey. Two of the case studies involve household surveys of labour force activity and the third focuses on a business survey. The papers each address a data quality topic from a different perspective, but share some interesting common threads.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016241
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Leslie Kish long advocated the use of the "rolling sample" design. With non-overlapping, monthly panels that can be cumulated over different lengths of time for domains of different sizes, the rolling sample design enables a single survey to serve multiple purposes. The Census Bureau's new American Community Survey uses such a rolling sample design with annual averages to measure change at the state level, and three-year or five-year moving averages to describe progressively smaller domains. This paper traces Kish's influence on the development of the American Community Survey, and discusses some practical methodological issues that had to be addressed during the implementation of the design.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016231
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. Its is intended for an audience of survey methodologists.

    In 2000, the Behavioral Risk Factor Surveillance System (BRFSS) conducted monthly telephone surveys in 50 American states, the District of Columbia, and Puerto Rico: each was responsible for collecting its own survey data. In Maine, data collection was split between the state health department and ORC Macro, a commercial market research firm. Examination of survey outcome rates, selection biases and missing values for income suggest that the Maine health department data are more accurate. However, out of 18 behavioural health risk factors, only four are statistically different by data collector, and for these four factors, the data collected by ORC Macro seem more accurate.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016237
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Secondary users of health information often assume that administrative data provides a relatively sound basis for making important planning and policy decisions. If errors are evenly or randomly distributed, this assumption may have little impact on these decisions. However, when information sources contain systematic errors, or when systematic errors are introduced during the creation of master files, this assumption can be damaging.

    The most common systematic errors involve underreporting activities for a specific population; inaccurate re-coding of spatial information; and differences in data entry protocols, which have raised questions about the consistency of data submitted by different tracking agencies. The Central East Health Information Partnership (CEHIP) has identified a number of systematic errors in administrative databases and has documented many of these in reports distributed to partner organizations.

    This paper describes how some of these errors were identified and notes the processes that give rise to the loss of data integrity. The conclusion addresses some of the impacts these problems have for health planners, program managers and policy makers.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016229
    Description:

    This paper discusses the approach that Statistics Canada has taken to improve the quality of annual business surveys through their integration in the Unified Enterprise Survey (UES). The primary objective of the UES is to measure the final annual sales of goods and services accurately by province, in sufficient detail and in a timely manner.

    This paper describes the methodological approaches that the UES has used to improve financial and commodity data quality in four broad areas. These include improved coherence of the data collected from different levels of the enterprise, better coverage of industries, better depth of information (in the sense of more content detail and estimates for more detailed domains) and better consistency of the concepts and methods across industries.

    The approach to achieving quality has been to (a) establish a baseline measure of the quality of the business survey program prior to the UES, (b) measure the annual data quality of the UES, and (c) carry out specific studies to better understand the quality of UES data and methods.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016250
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper describes the Korea National Statistics Office's (KNSO) experience with data quality assessment and introduces strategies for institutionalizing the assessment procedure. It starts by briefly defining quality assessment, quality dimensions and indicators at the national level. It then describes the current state of the quality assessment process at KNSO and lists the six dimensions of quality that have been identified: relevance, accuracy, timeliness, accessibility, comparability and efficiency. Based on the lessons learned from this experience, the paper points out three essential elements of an advanced system of data quality assessment: an objective and independent planning system, a set of appropriate indicators, and competent personnel specialized in data quality assessment.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016267
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    In practice, a list of the desired collection units is not always available. Instead, a list of different units that are somehow related to the collection units may be provided, producing two related populations, UA and UB. An estimate is needed for UB, but the only sampling frame available covers the population UA.

    One solution to this problem is to select a sample sA from UA and produce an estimate for UB using the existing relationship between the two populations, a process referred to as indirect sampling. To assign a selection probability, or an estimation weight, to the survey units, Lavallée (1995) developed the generalized weight share method (GWSM). The GWSM produces an estimation weight that is essentially an average of the sampling weights of the units in sA; a minimal sketch of this weight-sharing idea is given below.

    This paper discusses the types of non-response associated with indirect sampling and the possible estimation problems that can occur in the application of the GWSM.
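    As a rough illustration of the weight-sharing idea referred to above (assumed notation and hypothetical links, not Lavallée's exact formulation), the following sketch assigns each UB unit the summed design weights of the sampled UA units it is linked to, divided by its total number of links:

        # Weight-share sketch under assumed notation; links and weights are hypothetical.
        def weight_share(links, sampled_weights):
            """
            links: dict mapping each UB unit to the list of UA units it is linked to.
            sampled_weights: dict mapping each *sampled* UA unit to its design weight
                             (unsampled UA units simply do not appear).
            Returns a dict of estimation weights for the UB units.
            """
            weights_b = {}
            for j, linked_units in links.items():
                total_links = len(linked_units)                        # links to all of UA
                shared = sum(sampled_weights.get(i, 0.0) for i in linked_units)
                weights_b[j] = shared / total_links if total_links else 0.0
            return weights_b

        # Hypothetical example: UB unit "b1" is linked to UA units "a1" and "a2";
        # only "a1" was sampled, with design weight 10.
        print(weight_share({"b1": ["a1", "a2"], "b2": ["a2"]}, {"a1": 10.0}))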

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016292
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Statistics can serve to benefit society, but, if manipulated politically or otherwise, statistics may also be used by the powerful as instruments to maintain the status quo or even to oppress. Statisticians working internationally, usually employed by international, supra-national or bilateral agencies, face a range of problems as they try to 'make a difference' in the lives of the poorest people in the world. One of the most difficult challenges statisticians face is the dilemma between open accountability and national sovereignty (in relation to what data are collected, the methods used and who is to have access to the results). Because of increasing globalization and new modalities of development co-operation and partnership, statisticians work in a constantly changing environment.

    This paper addresses the problems of improving the quality of cross-national data. It aims to raise consciousness of the role of statisticians at the international level; describe some of the constraints under which statisticians work; set out principles that ought to govern the general activities of statisticians; and evaluate, in particular, the relevance of such principles to international statisticians. The paper also draws upon the recent Presidential Address to the Royal Statistical Society (presented June 2001; JRSS Series D, forthcoming).

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016273
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    For a multivariate survey based on simple random sampling, the problem of calculating an optimal sample size becomes one of solving a stochastic programming problem in which each constraint corresponds to a bounded estimate of the variance for a commodity. The problem is stochastic because the data collected in a previous survey make the components of each constraint random variables; consequently, the calculated sample size is itself a random variable and depends on the quality of those data. By means of a Monte Carlo technique, an empirical probability distribution of the optimal sample size can be produced, giving the probability that the prescribed precision will be achieved. Corresponding to each set of previously collected data, there is an optimal size and allocation across strata. Reviewing these over several consecutive periods may make it possible to identify troublesome strata and to detect trends in the stability of the data. The review may also reveal an oscillatory pattern in the sample sizes that has evolved over time because of the dependency of one allocation on another.
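    As a rough illustration of this Monte Carlo idea (not the paper's stochastic program: made-up data, a single stratum and simple random sampling), the sketch below bootstraps the previous survey's data, recomputes the sample size needed to satisfy a variance bound for each commodity, and builds the empirical distribution of the optimal size:

        # Made-up previous-survey data; Monte Carlo distribution of the SRS sample size
        # needed to bound the variance of each commodity mean.
        import random
        from statistics import variance

        random.seed(7)
        N = 5_000                                     # assumed population size
        prev = {"wheat":  [random.gauss(100, 20) for _ in range(300)],
                "barley": [random.gauss(50, 15) for _ in range(300)]}
        bounds = {"wheat": 4.0, "barley": 3.0}        # target variances of the estimated means

        def required_n(data, bound):
            """SRS sample size so that (1 - n/N) * s2 / n <= bound."""
            s2 = variance(data)
            return s2 / (bound + s2 / N)

        sizes = []
        for _ in range(1000):                         # Monte Carlo over bootstrap replicates
            boot = {k: random.choices(v, k=len(v)) for k, v in prev.items()}
            sizes.append(max(required_n(boot[k], bounds[k]) for k in prev))   # binding constraint

        sizes.sort()
        print("median optimal n:", round(sizes[500]))
        print("size giving roughly a 95% chance of meeting the precision:", round(sizes[949]))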

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016296
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The Canadian Labour Force Survey (LFS) is one of Statistics Canada's most important surveys. It is a monthly survey that collects data on each respondent's labour force status, the nature of their work or reason for not working, and their demographic characteristics. The survey sample consists of approximately 52,000 households. Coverage error is a measure of data quality that is important to any survey. One of the key measures of coverage error in the LFS is the percentage difference between the Census of Population estimates and the LFS population counts, known as slippage. A negative value indicates that the LFS has an overcoverage problem, while a positive value indicates an undercoverage problem. In general, slippage is positive, meaning that the LFS consistently misses people who should be enumerated.

    The purpose of this study was to determine why slippage is increasing and what can be done to remedy it. The study was conducted in two stages. The first stage was a historical review of the projects that have studied and tried to control slippage in the LFS, as well as the operational changes implemented over time. The second stage was an analysis of factors such as vacancy rates, non-response, demographics and urban or rural status, and of the impact of these factors on the slippage rate.
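    As a minimal illustration of the slippage measure described above (hypothetical counts, and assuming a simple percentage-difference form relative to the census-based estimate), the calculation looks like this:

        # Hypothetical counts; slippage as a percentage difference relative to the census-based estimate.
        def slippage(census_estimate, lfs_count):
            return 100 * (census_estimate - lfs_count) / census_estimate

        print(slippage(census_estimate=31_000_000, lfs_count=29_900_000))
        # A positive value indicates undercoverage; a negative value, overcoverage.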

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016302
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This session provides three more contributions to the continuing discussion of how national statistical offices should respond to the topic of quality, in particular the subtopic of communicating quality. These three papers share the important and necessary assumptions that national statistical offices have an obligation to report the limitations of the data; that users should know and understand those limitations; and that, by understanding the limitations, users ought to be able to determine whether the data are fit for their purposes.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016247
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper describes joint research by the Office for National Statistics (ONS) and Southampton University on the evaluation of several different approaches to the local estimation of International Labour Office (ILO) unemployment. The need to compare estimators with different underlying assumptions has led to a focus on evaluation methods that are, at least in part, model-independent. The model-fit diagnostics considered include various residual procedures, cross-validation, predictive validation, consistency with marginals, and consistency with direct estimates within single cells. These diagnostics have been used to compare different model-based estimators with each other and with direct estimators.
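    As a rough illustration of one such diagnostic (assumed data and a deliberately simple area-level model, not the ONS evaluation itself), the sketch below uses leave-one-area-out cross-validation to compare model-based predictions with direct estimates:

        # Assumed area-level data and a simple linear model; leave-one-area-out cross-validation.
        import random
        from statistics import mean

        random.seed(3)
        # Hypothetical areas: (direct unemployment estimate, auxiliary covariate)
        areas = [(5 + 0.8 * x + random.gauss(0, 1), x) for x in range(30)]

        def fit_slope_intercept(data):
            """Ordinary least squares for y = a + b * x, data given as (y, x) pairs."""
            ys, xs = zip(*data)
            xbar, ybar = mean(xs), mean(ys)
            b = sum((x - xbar) * (y - ybar) for y, x in data) / sum((x - xbar) ** 2 for x in xs)
            return ybar - b * xbar, b

        errors = []
        for i, (y_direct, x) in enumerate(areas):
            a, b = fit_slope_intercept(areas[:i] + areas[i + 1:])   # leave area i out
            errors.append((a + b * x) - y_direct)                   # model-based minus direct

        print("mean cross-validation error:", round(mean(errors), 3))
        print("root mean squared error:", round(mean(e * e for e in errors) ** 0.5, 3))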

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016258
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    To fill statistical gaps in the areas of health determinants, health status and health system usage by the Canadian population at the health region level (sub-provincial areas or regions of interest to health authorities), Statistics Canada established a new survey, the Canadian Community Health Survey (CCHS). The CCHS consists of two separate components: a regional survey in the first year and a provincial survey in the second year. The main purpose of the regional survey, for which collection took place between September 2000 and October 2001, was to produce cross-sectional estimates for 136 health regions in Canada, based on a sample of more than 134,000 respondents. This article focuses on the various measures taken at the time of data collection to ensure a high level of quality for this large-scale survey.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016306
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The paper addresses the problem of automatically detecting and correcting inconsistent or out-of-range data in a general statistical data collection process. The proposed approach handles both qualitative and quantitative values. Its purpose is to overcome the computational limits of the Fellegi-Holt method while maintaining its positive features. As is customary, data records must respect a set of rules in order to be declared correct. By encoding the rules as linear inequalities, we develop mathematical models for the problems of interest. First, by solving a sequence of feasibility problems, the set of rules itself is checked for inconsistency or redundancy. Second, imputation is performed by solving a sequence of set-covering problems.
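    As a minimal illustration of the encoding idea (assumed edit rules, not the paper's actual models), the sketch below expresses edits as linear inequalities, flags the violated ones for a record, and finds a smallest set of fields whose imputation could satisfy them, in the spirit of a set-covering formulation:

        # Assumed edit rules of the form a . x <= b; brute-force search for a minimal set of
        # fields whose change could satisfy every violated edit.
        from itertools import combinations

        FIELDS = ["age", "hours_worked", "income"]
        EDITS = [
            ({"age": -1},                      -15),   # age >= 15
            ({"hours_worked": 1},               80),   # hours_worked <= 80
            ({"hours_worked": 1, "age": -1},     0),   # hours_worked <= age
        ]

        def violated(record):
            return [(a, b) for a, b in EDITS
                    if sum(coef * record[f] for f, coef in a.items()) > b]

        def minimal_fields_to_impute(record):
            """Smallest set of fields that touches every violated edit (brute force)."""
            bad = violated(record)
            if not bad:
                return set()
            for size in range(1, len(FIELDS) + 1):
                for fields in combinations(FIELDS, size):
                    if all(any(f in a for f in fields) for a, _ in bad):
                        return set(fields)

        record = {"age": 14, "hours_worked": 90, "income": 20000}
        print(violated(record))
        print(minimal_fields_to_impute(record))   # e.g. {'age', 'hours_worked'}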

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016290
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Over the last five years, the United Kingdom Office for National Statistics has been implementing a series of initiatives to improve the process of collecting business statistics data in the UK. These initiatives include the application of a range of new technology solutions to data collection: document imaging and scanned forms have replaced paper forms for all processes. For some inquiries, the paper form has been eliminated altogether through the adoption of Telephone Data Entry (TDE). Receiving all incoming data in electronic format has allowed workflow systems to be introduced across a wide range of data collection activities.

    This paper describes the recent history of these initiatives and covers proposals that are presently at a pilot stage or being projected for the next four years. It also examines the future strategy of TDE data collection via the Internet, and the current pilots and security issues under consideration.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016252
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    The use of sample co-ordination in business surveys is crucial because it provides a way of smoothing out the survey burden. In many co-ordination methodologies, the random numbers representing the units are permanent and the sample selection method varies. In the microstrata methodology, by contrast, it is the selection function that is permanent, while the random numbers are systematically rearranged among units for different co-ordination purposes: smoothing out the burden, updating panels or minimizing the overlap between two surveys. These rearrangements are made within the intersections of strata, known as microstrata. The microstrata method has good mathematical properties and provides a general approach to sample co-ordination in which births, deaths and strata changes are handled automatically, with no particular constraints on stratification or on the rotation rates of panels. Two software programs have been written to implement this method and its evolutions: SALOMON in 1998 and MICROSTRAT in 2001.
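    For context, the following sketch shows a generic form of sample co-ordination based on permanent random numbers (it is not the microstrata method itself, and the units and sizes are hypothetical): each unit keeps a permanent random number, and two surveys reduce their overlap by starting their selections from different points:

        # Generic permanent-random-number co-ordination sketch; not the microstrata method.
        import random

        random.seed(11)
        units = {f"unit{i}": random.random() for i in range(20)}   # permanent random numbers

        def select(prns, n, start=0.0):
            """Take the n units whose PRN comes first after `start`, wrapping around at 1."""
            ordered = sorted(prns, key=lambda u: (prns[u] - start) % 1.0)
            return set(ordered[:n])

        survey1 = select(units, 5)             # selection starts at 0.0
        survey2 = select(units, 5, start=0.5)  # shifted start reduces overlap (negative co-ordination)
        print(sorted(survey1))
        print(sorted(survey2))
        print("overlap:", sorted(survey1 & survey2))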

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016264
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Conducting a census by traditional methods is becoming more difficult. The possibility of cross-linking administrative files provides an attractive alternative to periodic censuses (Laihonen, 2000; Borchsenius, 2000); this method was proposed in a recent article by Nathan (2001). The census redesign at the Institut National de la Statistique et des Études Économiques (INSEE) is based on the idea of a "continuous census," originally suggested by Kish (1981, 1990) and Horvitz (1986). A first approach that could be feasible in France can be found in Deville and Jacod (1996). This article reviews the methodological developments and approaches used since INSEE began its population census redesign program.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016311
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    These notes discuss the importance of accuracy within the broader framework of data quality, which has been adopted by many statistical agencies.

    Accuracy is a product or service characteristic; the data quality process influences how accuracy and other quality attributes, such as timeliness, relevance and accessibility, are achieved. This paper examines Deming's ideas, as well as those of Juran and many others, and supports the distinction and disentanglement of these two kinds of data quality, both of which have been themes at the Conference.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016287
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    In this paper we discuss a specific component of a research agenda aimed at disclosure protections for "non-traditional" statistical outputs. We argue that these outputs present different disclosure risks than those normally faced and hence may require new thinking on the issue. Specifically, we argue that kernel density estimators, while powerful, high-quality descriptions of cross-sections, pose potential disclosure risks that depend materially on the choice of bandwidth. We illustrate these risks using a unique, non-confidential data set on the statistical universe of coal mines and present potential solutions. Finally, we discuss current practices at the U.S. Census Bureau's Center for Economic Studies for performing disclosure analysis on kernel density estimators.
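    As a rough illustration of the bandwidth effect (made-up values, not the coal-mine data), the sketch below evaluates a Gaussian kernel density estimate on a grid: with a very small bandwidth the only high-density points are the underlying confidential values, whereas a large bandwidth smooths them away:

        # Made-up confidential values; shows how bandwidth controls what a KDE reveals.
        from math import exp, pi, sqrt

        data = [12.0, 15.5, 40.0, 41.0, 95.0]   # hypothetical confidential values

        def kde(x, sample, h):
            """Gaussian kernel density estimate at point x with bandwidth h."""
            return sum(exp(-0.5 * ((x - xi) / h) ** 2) / (h * sqrt(2 * pi))
                       for xi in sample) / len(sample)

        grid = [i / 2 for i in range(0, 201)]   # evaluate on 0, 0.5, ..., 100
        for h in (0.2, 10.0):
            dens = {x: kde(x, data, h) for x in grid}
            top = max(dens.values())
            peaks = [x for x, d in dens.items() if d > 0.9 * top]
            print(f"bandwidth {h}: high-density points: {peaks}")
        # With h = 0.2 the high-density points coincide exactly with the confidential values;
        # with h = 10 they form a broad region in which no single value stands out.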

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016279
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Rather than having to rely on traditional measures of survey quality, such as response rates, the Social Survey Division of the U.K. Office for National Statistics has been looking for alternative ways to report on quality. In order to achieve this, all the processes involved throughout the lifetime of a survey, from sampling and questionnaire design through to production of the finished report, have been mapped out. Having done this, we have been able to find quality indicators for many of these processes. By using this approach, we hope to be able to appraise any changes to our processes as well as to inform our customers of the quality of the work we carry out.

    Release date: 2002-09-12

  • Technical products: 11-522-X20010016291
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    Many types of Web surveys are not based on scientific sampling and do not represent any well-defined population. Even when Web surveys are based on a general sample, it is not known whether they yield reliable or valid results. One way to test the adequacy of Web surveys is to conduct experiments comparing Web surveys with well-established, traditional survey methods. One such test was performed by comparing the 2000 General Social Survey of the National Opinion Research Center with a Knowledge Networks Web survey.

    Release date: 2002-09-12
