Statistics by subject – Statistical methods

All (12 of 12 results)

  • Technical products: 11-522-X201300014287
    Description:

    The purpose of the EpiNano program is to monitor workers who may be exposed to intentionally produced nanomaterials in France. This program is based both on industrial hygiene data collected in businesses for the purpose of gauging exposure to nanomaterials at workstations and on data from self-administered questionnaires completed by participants. These data will subsequently be matched with health data from national medical-administrative databases (passive monitoring of health events). Follow-up questionnaires will be sent regularly to participants. This paper describes the arrangements for optimizing data collection and matching.

    Release date: 2014-10-31

  • Articles and reports: 12-001-X201000211378
    Description:

    One key to poverty alleviation or eradication in the third world is reliable information on the poor and their location, so that interventions and assistance can be effectively targeted to the neediest people. Small area estimation is one statistical technique used to monitor poverty and to decide on aid allocation in pursuit of the Millennium Development Goals. Elbers, Lanjouw and Lanjouw (ELL) (2003) proposed a small area estimation methodology for income-based or expenditure-based poverty measures, which is implemented by the World Bank in its poverty mapping projects, via the involvement of the central statistical agencies in many third world countries, including Cambodia, Lao PDR, the Philippines, Thailand and Vietnam, and is incorporated into the World Bank software program PovMap. In this paper, the ELL methodology, which consists of first modeling survey data and then applying that model to census information, is presented and discussed, with strong emphasis on the first phase, i.e., the fitting of regression models, and on the estimated standard errors at the second phase. Other regression model fitting procedures, such as the General Survey Regression (GSR) (as described in Lohr (1999), Chapter 11) and those used in existing small area estimation techniques, namely the Pseudo-Empirical Best Linear Unbiased Prediction (Pseudo-EBLUP) approach (You and Rao 2002) and the Iterative Weighted Estimating Equation (IWEE) method (You, Rao and Kovacevic 2003), are presented and compared with the ELL modeling strategy. The most significant difference between the ELL method and the other techniques is in the theoretical underpinning of the ELL model fitting procedure. An example based on the Philippines Family Income and Expenditure Survey is presented to show the differences in the parameter estimates, their corresponding standard errors, and the variance components generated by the different methods, and the discussion is extended to the effect of these differences on the estimated accuracy of the final small area estimates themselves. The need for sound estimation of variance components, as well as of regression estimates and their standard errors, for small area estimation of poverty is emphasized.

    Release date: 2010-12-21
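The two-phase ELL idea described above can be sketched in a few lines: fit a welfare regression on survey data, then apply the fitted model to census records and aggregate to small areas. This is a minimal illustration with entirely synthetic data and a simple one-covariate regression; the actual ELL methodology models error components and simulates welfare, which this sketch omits.

```python
import random

random.seed(42)

# Phase 1: fit a welfare regression on a small "survey" sample.
# y = 2 + 0.5*x + noise, where x is an auxiliary variable also
# available for every census record (all data here are synthetic).
survey_x = [random.uniform(0, 10) for _ in range(200)]
survey_y = [2.0 + 0.5 * x + random.gauss(0, 1) for x in survey_x]

n = len(survey_x)
mean_x = sum(survey_x) / n
mean_y = sum(survey_y) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(survey_x, survey_y))
     / sum((x - mean_x) ** 2 for x in survey_x))
a = mean_y - b * mean_x

# Phase 2: apply the fitted model to "census" records, then aggregate
# predicted welfare to small areas and estimate a poverty rate.
poverty_line = 4.5
areas = {"A": [random.uniform(0, 4) for _ in range(500)],   # poorer area
         "B": [random.uniform(6, 10) for _ in range(500)]}  # richer area
poverty_rate = {}
for area, xs in areas.items():
    preds = [a + b * x for x in xs]
    poverty_rate[area] = sum(p < poverty_line for p in preds) / len(preds)

print(poverty_rate)
```

The sketch recovers the expected ordering: the area whose auxiliary variable is low shows a high predicted poverty rate.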

  • Technical products: 11-522-X200800010962
    Description:

    The ÉLDEQ initiated a special data gathering project in March 2008 with the collection of biological materials from 1,973 families. During a typical visit, a nurse collects a blood or saliva sample from the selected child, makes a series of measurements (anthropometry, pulse rate and blood pressure) and administers questionnaires. Planned and supervised by the Institut de la Statistique du Québec (ISQ) and the Université de Montréal, the study is being conducted in cooperation with two private firms and a number of hospitals. This article examines the choice of collection methods, the division of effort among the various players, the sequence of communications and contacts with respondents, the tracing of families who are not contacted, and follow-up on the biological samples. Preliminary field results are also presented.

    Release date: 2009-12-03

  • Technical products: 11-522-X20030017723
    Description:

    This document examines the use of a follow-up survey of non-respondents to augment the respondents from the main survey where response rates are low.

    Release date: 2005-01-26
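One standard way to combine a main survey with a follow-up of nonrespondents is the Hansen-Hurwitz double-sampling estimator, which weights each group's mean by its share of the full sample. The paper's exact method may differ; the numbers below are invented for illustration.

```python
# Hansen-Hurwitz double-sampling estimator (a textbook approach to
# nonrespondent follow-up; all numbers here are synthetic).
n = 1000              # initial sample size
n_resp = 300          # respondents to the main survey (low response rate)
n_nonresp = n - n_resp

ybar_resp = 52.0      # mean outcome among main-survey respondents
ybar_follow = 40.0    # mean among a follow-up subsample of nonrespondents

# Weight each group's mean by its share of the full sample.
ybar_hat = (n_resp * ybar_resp + n_nonresp * ybar_follow) / n
print(ybar_hat)  # 43.6
```

Using only the respondent mean (52.0) would overstate the population mean when nonrespondents differ systematically, which is exactly the bias the follow-up survey corrects.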

  • Technical products: 11-522-X20020016725
    Description:

    In 1997, the US Office of Management and Budget issued revised standards for the collection of race information within the federal statistical system. One revision allows individuals to choose more than one race group when responding to federal surveys and other federal data collections. This change presents challenges for analyses that involve data collected under both the old and new race-reporting systems, since the data on race are not comparable. This paper discusses the problems created by these changes and the methods developed to overcome them.

    Since most people under both systems report only a single race, a common proposed solution is to bridge the transition by assigning a single-race category to each multiple-race reporter under the new system, and to conduct analyses using just the observed and assigned single-race categories. Thus, the problem can be viewed as a missing-data problem in which single-race responses are missing for multiple-race reporters and need to be imputed.

    The US Office of Management and Budget suggested several simple bridging methods to handle this missing-data problem. Schenker and Parker (Statistics in Medicine, forthcoming) analysed data from the National Health Interview Survey of the US National Center for Health Statistics, which allows multiple-race reporting but also asks multiple-race reporters to specify a primary race, and found that improved bridging methods could result from incorporating individual-level and contextual covariates into the bridging models.

    While Schenker and Parker discussed only three large multiple-race groups, the current application requires predicting single-race categories for several small multiple-race groups as well. Thus, problems of sparse data arise in fitting the bridging models. We address these problems by building combined models for several multiple-race groups, thus borrowing strength across them. These and other methodological issues are discussed.

    Release date: 2004-09-13
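The bridging idea can be sketched as a simple stochastic imputation: each multiple-race reporter is assigned a single race drawn from estimated primary-race probabilities for their group. The probability table below is invented; in the paper these probabilities come from fitted models with individual-level and contextual covariates, with small groups combined to borrow strength.

```python
import random

random.seed(1)

# Illustrative bridging table: probability that a reporter in each
# multiple-race group would name each single race as primary.
# These numbers are made up for the sketch.
bridge_probs = {
    ("white", "black"): {"white": 0.45, "black": 0.55},
    ("white", "asian"): {"white": 0.60, "asian": 0.40},
}

def impute_single_race(groups, rng=random):
    """Assign one single-race category per multiple-race reporter."""
    out = []
    for g in groups:
        cats, ps = zip(*bridge_probs[g].items())
        out.append(rng.choices(cats, weights=ps)[0])
    return out

reporters = [("white", "black")] * 1000 + [("white", "asian")] * 1000
assigned = impute_single_race(reporters)

# Share of the first group bridged to "black" should track 0.55.
share_black = assigned[:1000].count("black") / 1000
print(round(share_black, 2))
```

In aggregate, the imputed shares reproduce the bridging probabilities, which is what makes bridged tabulations comparable with single-race data.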

  • Articles and reports: 12-001-X20040016992
    Description:

    In the U.S. Census of Population and Housing, about one in six households receives a longer version of the census questionnaire, called the long form; all others receive a version called the short form. Raking, using selected control totals from the short form, has been used to create two sets of weights for long-form estimation: one for individuals and one for households. We describe a weight-construction method based on quadratic programming that produces household weights such that the weighted sums for individual characteristics and for household characteristics agree closely with selected short-form totals. The method is broadly applicable to situations where weights must be constructed to meet both size bounds and sum-to-control restrictions. Application to the situation where the controls are estimates with an estimated covariance matrix is also described.

    Release date: 2004-07-14
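The core of the weight-construction idea has a closed-form special case worth sketching: choose weights as close as possible to the starting weights, in least-squares distance, subject to a control-total constraint. All numbers below are invented, and the sketch drops the size bounds, which is what forces the paper's method to use a full quadratic program rather than this Lagrange solution.

```python
# Minimal calibration sketch: adjust starting weights d_i so that the
# weighted sum of a characteristic x_i hits a control total T, while
# staying as close as possible to d_i in least-squares distance.
d = [10.0, 10.0, 10.0, 10.0]   # starting weights (synthetic)
x = [1.0, 2.0, 2.0, 3.0]       # a household characteristic
T = 90.0                       # control total (e.g., from the short form)

# Closed-form Lagrange solution of: min sum((w-d)^2) s.t. sum(w*x) = T
lam = (T - sum(di * xi for di, xi in zip(d, x))) / sum(xi * xi for xi in x)
w = [di + lam * xi for di, xi in zip(d, x)]

print([round(wi, 3) for wi in w])
print(sum(wi * xi for wi, xi in zip(w, x)))  # hits 90.0 exactly
```

With several controls and lower/upper bounds on each weight, the same objective becomes a quadratic program in the spirit of the method the abstract describes.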

  • Articles and reports: 12-001-X20030026777
    Description:

    The Accuracy and Coverage Evaluation survey was conducted to estimate the coverage of the 2000 U.S. Census. After field procedures were completed, several types of missing data had to be addressed to apply dual-system estimation. Some housing units were not interviewed. Two noninterview adjustments were devised from the same set of interviews, one for each of two points in time. In addition, the resident, match, or enumeration status of some respondents was not determined. Methods applied in the past were replaced to accommodate a tighter schedule for computing and verifying the estimates. This paper presents the extent of missing data in the survey, describes the procedures applied, comparing them to past and current alternatives, and provides analytical summaries of the procedures, including comparisons of dual-system estimates of population under the alternatives. Because the resulting levels of missing data were low, it appears that alternative procedures would not have affected the results substantially; however, some changes in the estimates are noted.

    Release date: 2004-01-27
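Dual-system estimation is, at its core, a capture-recapture calculation: the census and the coverage survey are two independent "captures", and the overlap between them lets one estimate the people both missed. The simplest Lincoln-Petersen form can be sketched with invented counts; the census application adds weighting, matching, and the missing-data adjustments the abstract describes.

```python
# Dual-system (capture-recapture) estimator in its simplest
# Lincoln-Petersen form. All counts below are synthetic.
n_census = 9000    # people counted in the census
n_survey = 1000    # people counted in the coverage survey
n_match = 900      # people found in both lists

# If capture in the two systems is independent, the match rate
# n_match / n_survey estimates the census coverage rate, so:
N_hat = n_census * n_survey / n_match
print(N_hat)  # 10000.0
```

Here the survey implies the census covered 90% of the population, so the estimated total exceeds the census count; missing match or enumeration statuses feed directly into `n_match`, which is why the paper's imputation procedures matter.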

  • Articles and reports: 12-001-X20000015176
    Description:

    A components-of-variance approach and an estimated covariance error structure were used in constructing predictors of adjustment factors for the 1990 Decennial Census. The variability of the estimated covariance matrix is the suspected cause of certain anomalies that appeared in the regression estimation and in the estimated adjustment factors. We investigate alternative prediction methods and propose a procedure that is less influenced by variability in the estimated covariance matrix. The proposed methodology is applied to a data set composed of 336 adjustment factors from the 1990 Post Enumeration Survey.

    Release date: 2000-08-30

  • Articles and reports: 12-001-X199500214395
    Description:

    When redesigning a sample with a stratified multi-stage design, it is sometimes considered desirable to maximize the number of primary sampling units retained in the new sample without altering unconditional selection probabilities. For this problem, an optimal solution which uses transportation theory exists for a very general class of designs. However, this procedure has never been used in the redesign of any survey (that the authors are aware of), in part because even for moderately-sized strata, the resulting transportation problem may be too large to solve in practice. In this paper, a modified reduced-size transportation algorithm is presented for maximizing the overlap, which substantially reduces the size of the problem. This reduced-size overlap procedure was used in the recent redesign of the Survey of Income and Program Participation (SIPP). The performance of the reduced-size algorithm is summarized, both for the actual production SIPP overlap and for earlier, artificial simulations of the SIPP overlap. Although the procedure is not optimal and theoretically can produce only negligible improvements in expected overlap compared to independent selection, in practice it gave substantial improvements in overlap over independent selection for SIPP, and generally provided an overlap that is close to optimal.

    Release date: 1995-12-15
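The benefit of an overlap-maximizing transportation solution over independent reselection is easiest to see in the special case of one PSU selected per stratum. There, the maximal-coupling bound says the best achievable retention probability is the sum of the pointwise minima of the old and new selection probabilities, while independent selection achieves only the sum of products. This is a textbook special case, not the authors' reduced-size algorithm, and the probabilities below are invented.

```python
# One PSU per stratum: compare the best possible overlap (a coupling
# that keeps mass on the diagonal of the transportation array) with
# independent reselection. Probabilities are synthetic.
p = {"psu1": 0.5, "psu2": 0.3, "psu3": 0.2}   # old-design probabilities
q = {"psu1": 0.4, "psu2": 0.4, "psu3": 0.2}   # new-design probabilities

optimal_overlap = sum(min(p[k], q[k]) for k in p)       # maximal coupling
independent_overlap = sum(p[k] * q[k] for k in p)       # independent draws

print(optimal_overlap, independent_overlap)
```

The gap between the two numbers is the payoff the SIPP redesign was after: the same unconditional probabilities, but a far higher chance of keeping the PSUs already in the field.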

  • Articles and reports: 12-001-X199500114406
    Description:

    This paper discusses the design of visitor surveys. To illustrate, two recent surveys are described. The first is a survey of visitors to National Park Service areas nationwide throughout the year (1992). The second is a survey of recreational users of the three-river basin around Pittsburgh, Pennsylvania, during a twelve-month period. Both surveys involved sampling in time with temporal as well as spatial stratification. Sampling units had the form of site-period pairs for the stage before the final, visitor sampling stage. Random assignment of sample sites to periods permits the computation of unbiased estimates for the temporal strata (e.g., monthly and seasonal estimates) as well as estimates for strata defined by region and by type of use.

    Release date: 1995-06-15

  • Articles and reports: 12-001-X199200114497
    Description:

    The present article discusses a model-based approach to adjustment of the 1988 Census Dress Rehearsal data collected from test sites in Missouri. The primary objective is to develop procedures that can be used to model data from the 1990 Census Post-Enumeration Survey in April 1991 and to smooth survey-based estimates of the adjustment factors. In this paper we propose hierarchical Bayes (HB) and empirical Bayes (EB) procedures that meet this objective. The resulting estimators seem to improve consistently on the estimators of the adjustment factors based on dual-system estimation (DSE), as well as on the smoothed regression estimators.

    Release date: 1992-06-15
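The smoothing the abstract refers to can be sketched as empirical-Bayes shrinkage: each raw adjustment factor is pulled toward the overall mean, with more shrinkage when the sampling variance is large relative to the between-area variance. The sketch below assumes both variances are known and shrinks toward a common mean; the paper's HB/EB procedures estimate these quantities and use regression structure instead. All numbers are invented.

```python
# Empirical-Bayes shrinkage sketch for survey-based adjustment factors.
y = [1.10, 0.95, 1.20, 0.90, 1.05]   # raw factors (synthetic)
sampling_var = 0.010                  # within-area (survey) variance
between_var = 0.005                   # between-area variance

mean_y = sum(y) / len(y)
# Shrinkage weight on the common mean: larger when the survey is noisy.
B = sampling_var / (sampling_var + between_var)

smoothed = [B * mean_y + (1 - B) * yi for yi in y]
print([round(s, 3) for s in smoothed])
```

The smoothed factors have a smaller spread than the raw ones while preserving their average, which is the sense in which shrinkage "borrows strength" across areas.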

  • Articles and reports: 12-001-X198700114465
    Description:

    The two-stage rejection rule telephone sample design described by Waksberg (1978) is modified to improve the efficiency of telephone surveys of the U.S. Black population. Experimental tests of sample design alternatives demonstrate that: a) use of rough stratification based on telephone exchange names and states; b) use of large cluster definitions (200 and 400 consecutive numbers) at the first stage; and c) rejection rules based on racial status of the household combine to offer improvements in the relative precision of a sample, given fixed resources. Cost and error models are examined to simulate design alternatives.

    Release date: 1987-06-15
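The efficiency gain from a rejection rule can be seen in a small simulation of the two-stage design: dial one number per cluster, keep the cluster only if that first dial is eligible, then dial within kept clusters until a quota of further eligible numbers is found. Because eligible households cluster within banks of consecutive numbers, kept clusters yield eligible numbers at a much higher rate than cold dialing. The eligibility rates below are invented and the sketch ignores the race-based rejection rules the paper studies.

```python
import random

random.seed(7)

def simulate(n_clusters=2000, rate_in_kept=0.6, base_rate=0.2, k=5):
    """Two-stage rejection-rule dialing; returns eligible completes per dial."""
    dials, completes = 0, 0
    for _ in range(n_clusters):
        dials += 1
        if random.random() < base_rate:        # first-stage hit: keep cluster
            completes += 1
            found = 0
            while found < k:                   # second stage within cluster
                dials += 1
                if random.random() < rate_in_kept:
                    found += 1
            completes += k
    return completes / dials

hit_rate = simulate()
print(round(hit_rate, 3))
```

Under these assumed rates the design yields well over the 0.2 completes per dial that unclustered random-digit dialing would achieve, which is the kind of precision-per-cost gain the experiments in the paper quantify.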
