Statistics by subject – Statistical methods

All (19 of 19 results)

  • Articles and reports: 12-001-X198900214568
    Description:

    The paper describes a Monte Carlo study of simultaneous confidence interval procedures for k > 2 proportions, under a model of two-stage cluster sampling. The procedures investigated include: (i) standard multinomial intervals; (ii) Scheffé intervals based on sample estimates of the variances of cell proportions; (iii) Quesenberry-Hurst intervals adapted for clustered data using Rao and Scott's first- and second-order adjustments to X^2; (iv) simple Bonferroni intervals; (v) Bonferroni intervals based on transformations of the estimated proportions; (vi) Bonferroni intervals computed using the critical points of Student's t. In several realistic situations, actual coverage rates of the multinomial procedures were found to be seriously depressed compared with the nominal rate. The best-performing intervals, in terms of coverage rates and coverage symmetry (an extension of an idea due to Jennings), were the t-based Bonferroni intervals derived using log and logit transformations. Of the Scheffé-like procedures, the best performance was provided by Quesenberry-Hurst intervals in combination with first-order Rao-Scott adjustments.

    Release date: 1989-12-15
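
    A minimal sketch (Python) of the t-based Bonferroni intervals on the logit scale, the best performer in the study. The between-cluster variance formula and the data layout are assumptions made for illustration, not taken from the paper:

        import numpy as np
        from scipy import stats

        def bonferroni_logit_intervals(counts, sizes, alpha=0.05):
            """Simultaneous CIs for k proportions from two-stage cluster data.
            counts: (m, k) category counts per cluster; sizes: (m,) totals."""
            counts = np.asarray(counts, float)
            sizes = np.asarray(sizes, float)
            m, k = counts.shape
            n = sizes.sum()
            p = counts.sum(axis=0) / n                     # overall proportions
            # Simple between-cluster ("ultimate cluster") variance estimate --
            # an assumed stand-in for the paper's design-based estimator.
            w = sizes / n
            z = w[:, None] * (counts / sizes[:, None] - p)
            var_p = m / (m - 1) * (z ** 2).sum(axis=0)
            # Delta method on the logit scale, Bonferroni-adjusted t points.
            logit = np.log(p / (1 - p))
            se = np.sqrt(var_p) / (p * (1 - p))
            t = stats.t.ppf(1 - alpha / (2 * k), df=m - 1)
            lo, hi = logit - t * se, logit + t * se
            return 1 / (1 + np.exp(-lo)), 1 / (1 + np.exp(-hi))

    Back-transforming the logit endpoints keeps each interval inside (0, 1), one practical advantage of the transformed intervals.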

  • Articles and reports: 12-001-X198900214563
    Description:

    This paper examines the adequacy of estimates of emigrants from Canada and of interprovincial migration data drawn from the Family Allowance files and Revenue Canada tax files. The use of these data files in estimating the total population of Canada, the provinces and the territories was evaluated with reference to the 1986 Census counts. It was found that the two administrative files provided consistent and reasonably accurate series of data on emigration and interprovincial migration from 1981 to 1986; consequently, the population estimates were fairly accurate. The estimate of emigrants derived from the Family Allowance file could be improved by using the ratio of adult to child emigrant rates computed from Employment and Immigration Canada's immigration file.

    Release date: 1989-12-15

  • Articles and reports: 12-001-X198900214566
    Description:

    A randomized response model for sampling from dichotomous populations is developed in this paper. The model permits the use of continuous randomization and multiple trials per respondent. The special case of randomization with normal distributions is considered, and a computer simulation of such a sampling procedure is presented as an initial exploration of the effects such a scheme has on the amount of information in the sample. A portable electronic device that would implement the model is discussed. The results of a study conducted using the electronic randomizing device are presented; they show that randomized response sampling is a superior technique to direct questioning for at least some sensitive questions.

    Release date: 1989-12-15
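
    The paper's exact continuous-randomization model is not reproduced here; as a rough illustration, this sketch (Python, invented parameters) simulates a simple additive normal scrambling variant and recovers the proportion by the method of moments:

        import numpy as np

        rng = np.random.default_rng(0)

        def simulate_rr(n, p_true, mu=0.0, sigma=1.0, trials=3):
            """Each respondent holds a sensitive trait z in {0, 1} and reports
            z + Normal(mu, sigma) noise on each of several trials."""
            z = rng.binomial(1, p_true, size=n)
            reports = z[:, None] + rng.normal(mu, sigma, size=(n, trials))
            return reports.mean() - mu       # E[report] = p_true + mu

        print(simulate_rr(n=2000, p_true=0.15))   # approx. 0.15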

  • Articles and reports: 12-001-X198900214562
    Description:

    This paper presents a technique for developing appropriate confidence intervals around postcensal population estimates using a modification of the ratio-correlation method termed the rank-order procedure. It is shown that the Wilcoxon test can be used to decide if a given ratio-correlation model is stable over time. If stability is indicated, then the confidence intervals associated with the data used in model construction are appropriate for postcensal estimates. If stability is not indicated, the confidence intervals associated with the data used in model construction are not appropriate, and, moreover, likely to overstate the precision of postcensal estimates. Given instability, it is shown that confidence intervals appropriate for postcensal estimates can be derived using the rank-order procedure. An empirical example is provided using county population estimates for Washington state.

    Release date: 1989-12-15
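
    A sketch of the stability check (Python). How the paired observations are formed is an assumption here; the paper's own pairing of ratio-correlation quantities may differ:

        import numpy as np
        from scipy.stats import wilcoxon

        # Hypothetical residuals of the ratio-correlation model in two
        # successive estimation periods, paired by county.
        resid_t1 = np.array([0.012, -0.034, 0.008, 0.021, -0.017, 0.005])
        resid_t2 = np.array([0.041, -0.002, 0.030, 0.055, 0.011, 0.026])

        stat, pval = wilcoxon(resid_t1, resid_t2)
        if pval < 0.05:
            print("instability indicated: derive rank-order intervals")
        else:
            print("stability indicated: reuse construction-period intervals")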

  • Articles and reports: 12-001-X198900214567
    Description:

    Estimation procedures for obtaining consistent estimators of the parameters of a generalized logistic function, and of its asymptotic covariance matrix, under complex survey designs are presented. A correction to the Taylor estimator of the covariance matrix is made to produce a positive definite covariance matrix; the correction also reduces the small-sample bias. The estimation procedure is first presented for cluster sampling and then extended to more complex situations. A Monte Carlo study is conducted to examine the small-sample properties of F-tests constructed from alternative covariance matrices. The maximum likelihood estimation method, in which the survey design is completely ignored, is compared with the usual Taylor series expansion method and with the modified Taylor procedure.

    Release date: 1989-12-15
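
    The paper's specific correction is not reproduced here. A generic way to force an estimated covariance matrix to be positive definite is eigenvalue flooring, sketched below (Python) purely for intuition:

        import numpy as np

        def make_positive_definite(V, eps=1e-8):
            """Clip the eigenvalues of a symmetric covariance estimate."""
            V = (V + V.T) / 2                     # symmetrize first
            w, Q = np.linalg.eigh(V)
            return Q @ np.diag(np.maximum(w, eps)) @ Q.T

        V = np.array([[2.0, 1.9], [1.9, 1.7]])    # indefinite estimate
        print(np.linalg.eigvalsh(make_positive_definite(V)))  # all > 0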

  • Articles and reports: 12-001-X198900214564
    Description:

    It is sometimes required that a probability proportional to size without replacement (PPSWOR) sample of first-stage units (PSUs) in a multistage population survey design be updated to take account of new size measures that have become available for the whole population of such units. However, because of a considerable investment in within-PSU mapping, segmentation, listing, enumerator recruitment, etc., we would like to retain the same sample PSUs if possible, consistent with the requirement that selection probabilities may now be regarded as being proportional to the new size measures. The method described in this article differs from methods already described in the literature in that it is valid for any sample size and does not require enumeration of all possible samples. Further, it does not require that the old and the new sampling methods be the same, and hence it provides a convenient way not only of updating size measures but also of switching to a new sampling method.

    Release date: 1989-12-15
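
    The article's method is more general than this, but a classical precursor from the literature is easy to show: a Keyfitz-style update for one PSU per stratum, retaining the current selection with probability min(1, p_new/p_old). A sketch (Python, invented probabilities):

        import numpy as np

        rng = np.random.default_rng(1)

        def keyfitz_update(selected, p_old, p_new):
            """One PSU per stratum (probabilities sum to 1 in each vector).
            Retain the current PSU whenever the new probabilities allow."""
            if rng.random() < min(1.0, p_new[selected] / p_old[selected]):
                return selected
            # Otherwise reselect among PSUs whose probability increased,
            # proportionally to the increase.
            gain = np.maximum(p_new - p_old, 0.0)
            return rng.choice(len(gain), p=gain / gain.sum())

        print(keyfitz_update(2, np.array([0.2, 0.3, 0.5]),
                                np.array([0.3, 0.4, 0.3])))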

  • Articles and reports: 12-001-X198900214565
    Description:

    Empirical Bayes techniques are applied to the problem of “small area” estimation of proportions. Such methods have been previously used to advantage in a variety of situations, as described, for example, by Morris (1983). The basic idea here consists of incorporating random effects and nested random effects into models which reflect the complex structure of a multi-stage sample design, as was originally proposed by Dempster and Tomberlin (1980). Estimates of proportions can be obtained, together with associated estimates of uncertainty. These techniques are applied to simulated data in a Monte Carlo study which compares several available techniques for small area estimation.

    Release date: 1989-12-15
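
    The paper's nested random-effects models are richer than this, but the core shrinkage idea can be shown with a beta-binomial empirical Bayes sketch (Python, moment-based hyperparameters, invented counts):

        import numpy as np

        def eb_proportions(y, n):
            """Shrink raw small-area proportions y/n toward the overall mean
            using moment estimates of a common Beta(a, b) prior."""
            y, n = np.asarray(y, float), np.asarray(n, float)
            p = y / n
            m, v = p.mean(), p.var(ddof=1)       # assumes 0 < v < m(1 - m)
            s = m * (1 - m) / v - 1              # prior "sample size"
            a, b = m * s, (1 - m) * s
            return (y + a) / (n + a + b)         # posterior mean per area

        print(eb_proportions(y=[2, 15, 7, 0, 30], n=[20, 80, 40, 10, 120]))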

  • Articles and reports: 12-001-X198900214569
    Description:

    During the past 10 years or so, rapid progress has been made in the development of statistical methods of analysing survey data that take account of the complexity of the survey design. This progress has been particularly evident in the analysis of cross-classified count data. Developments in this area have included weighted least squares estimation of generalized linear models and associated Wald tests of goodness of fit and subhypotheses, corrections to standard chi-squared or likelihood ratio tests under loglinear models or logistic regression models involving a binary response variable, and jackknifed chi-squared tests. This paper illustrates the use of various extensions of these methods on data from complex surveys. The method of Scott, Rao and Thomas (1989) for weighted regression involving singular covariance matrices is applied to data from the Canada Health Survey (1978-79). Methods for logistic regression models are extended to Box-Cox models involving power transformations of cell odds ratios, and their use is illustrated on data from the Canadian Labour Force Survey. Methods for testing the equality of parameters in two logistic regression models, corresponding to two time points, are applied to data from the Canadian Labour Force Survey. Finally, a general class of polytomous response models is studied, and corrected chi-squared tests are applied to data from the Canada Health Survey (1978-79). Software to implement these methods using SAS facilities on a mainframe computer is briefly described.

    Release date: 1989-12-15
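
    A sketch of the simplest of these corrections (Python): a first-order design-effect adjustment to Pearson's X^2 in the spirit of Rao-Scott, with the mean generalized design effect assumed to be supplied from the survey design:

        import numpy as np
        from scipy.stats import chi2

        def first_order_corrected_x2(observed, expected, mean_deff):
            """Divide X^2 by the mean design effect and refer the result
            to the ordinary chi-squared distribution."""
            observed = np.asarray(observed, float)
            expected = np.asarray(expected, float)
            x2 = ((observed - expected) ** 2 / expected).sum()
            x2_adj = x2 / mean_deff
            df = observed.size - 1
            return x2_adj, chi2.sf(x2_adj, df)

        print(first_order_corrected_x2([120, 80, 50], [110, 90, 50], 1.8))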

  • Articles and reports: 12-001-X198900114576
    Description:

    A typical goal of health workers in the developing world is to ascertain whether or not a population meets certain standards, such as the proportion vaccinated against a certain disease. Because populations tend to be large, and resources and time available for studies limited, it is usually necessary to select a sample from the population and then make estimates regarding the entire population. Depending upon the proportion of the sample individuals who were not vaccinated, a decision will be made as to whether the coverage is adequate or whether additional efforts must be initiated to improve coverage in the population. Several sampling methods are currently in use. Among these is a modified method of cluster sampling recommended by the Expanded Programme on Immunization (EPI) of the World Health Organization. More recently, quality assurance sampling (QAS), a method commonly used for inspecting manufactured products, has been proposed as a potentially useful method for continually monitoring health service programs. In this paper, the QAS method is described and an example of how this type of sampling might be used is provided.

    Release date: 1989-06-15
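
    A sketch of a QAS-style acceptance rule (Python): sample n children, declare coverage adequate if at most d are unvaccinated, and study the rule's operating characteristic with the binomial distribution. The n and d values are illustrative only:

        from scipy.stats import binom

        n, d = 30, 3        # sample size and acceptance number (assumed)

        def prob_accept(p):
            """P(at most d unvaccinated in the sample | true proportion p)."""
            return binom.cdf(d, n, p)

        for p in (0.05, 0.10, 0.20, 0.30):
            print(f"p={p:.2f}  P(accept)={prob_accept(p):.3f}")

    Tuning n and d trades off the risk of accepting poor coverage against the risk of rejecting adequate coverage, exactly as in industrial acceptance sampling.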

  • Articles and reports: 12-001-X198900114580
    Description:

    Estimation of total numbers of hogs and pigs, sows and gilts, and cattle and calves in a state is studied using data obtained in the June Enumerative Survey conducted by the National Agricultural Statistics Service of the U.S. Department of Agriculture. It is possible to construct six different estimators using the June Enumerative Survey data. Three estimators involve data from area samples and three estimators combine data from list-frame and area-frame surveys. A rotation sampling scheme is used for the area frame portion of the June Enumerative Survey. Using data from the five years, 1982 through 1986, covariances among the estimators for different years are estimated. A composite estimator is proposed for the livestock numbers. The composite estimator is obtained by a generalized least-squares regression of the vector of different yearly estimators on an appropriate set of dummy variables. The composite estimator is designed to yield estimates for livestock inventories that are “at the same level” as the official estimates made by the U.S. Department of Agriculture.

    Release date: 1989-06-15
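
    The covariance estimation and dummy-variable coding in the paper are survey-specific, but the final step is ordinary generalized least squares, sketched here (Python, hypothetical numbers):

        import numpy as np

        def gls(y, X, V):
            """b = (X'V^-1 X)^-1 X'V^-1 y and its covariance matrix."""
            Vinv = np.linalg.inv(V)
            cov_b = np.linalg.inv(X.T @ Vinv @ X)
            return cov_b @ X.T @ Vinv @ y, cov_b

        # Six correlated estimators of one livestock total (assumed values).
        y = np.array([102.0, 98.0, 105.0, 101.0, 99.0, 103.0])
        X = np.ones((6, 1))
        V = np.diag([4.0, 5.0, 3.0, 6.0, 4.0, 5.0]) + 0.5  # common covariance
        b, cov_b = gls(y, X, V)
        print(b, np.sqrt(np.diag(cov_b)))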

  • Articles and reports: 12-001-X198900114578
    Description:

    The optimum allocation to strata for multipurpose surveys is often solved in practice by establishing linear variance constraints and then using convex programming to minimize the survey cost. Using the Kuhn-Tucker theorem, this paper gives an expression for the resulting optimum allocation in terms of Lagrangian multipliers. Using this representation, the partial derivative of the cost function with respect to the k-th variance constraint is found to be $-2 \alpha_k^* \, g(x^*) / v_k$, where $g(x^*)$ is the cost of the optimum allocation and where $\alpha_k^*$ and $v_k$ are, respectively, the k-th normalized Lagrangian multiplier and the upper bound on the precision of the k-th variable. Finally, a simple computing algorithm is presented and its convergence properties are discussed. The use of these results in sample design is demonstrated with data from a survey of commercial establishments.

    Release date: 1989-06-15
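
    The paper derives its own algorithm; for orientation, the underlying convex program can be written directly and handed to a general solver. A sketch (Python/SciPy) with invented costs, variances and bounds:

        import numpy as np
        from scipy.optimize import minimize

        c = np.array([1.0, 2.0, 1.5])              # per-unit cost by stratum
        W2S2 = np.array([[4.0, 9.0, 1.0],          # W_h^2 S_h^2, variable 1
                         [1.0, 4.0, 4.0]])         # W_h^2 S_h^2, variable 2
        v = np.array([0.05, 0.04])                 # variance upper bounds

        cons = [{"type": "ineq",
                 "fun": lambda n, k=k: v[k] - W2S2[k] @ (1.0 / n)}
                for k in range(len(v))]
        res = minimize(lambda n: c @ n, x0=np.full(3, 500.0),
                       bounds=[(2, None)] * 3, constraints=cons,
                       method="SLSQP")
        print(np.round(res.x, 1), round(res.fun, 1))

    At the optimum, the binding variance constraints carry nonzero Lagrangian multipliers, which is where the paper's derivative expression comes from.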

  • Articles and reports: 12-001-X198900114577
    Description:

    In this article the authors evaluate the relative performance of survey and diary data collection methods in the context of the long-distance telephone communication market. Based on an analysis of 1,530 respondents, the results indicate that two demographic variables, sex and income, are important in explaining the difference in survey reporting and diary recording of usage data.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114571
    Description:

    Statistics Canada is currently rebuilding its central register of economic entities. The new register views each economic entity as a network of legal and operating entities whose characteristics allow for the delineation of statistical entities. This network view, the profile, is determined through the ‘profiling’ process which involves contact with the economic entity. In 1986 a list of all entities in-scope for a profiling contact was required so that profiles could be obtained to initialize the new register. Administrative data were used to build this list. In the future, administrative data will be a source of information on changes that may have happened to economic entities. They may thus be used as a source of direct update or as a signal that a review of the structure of an entity is required. The paper begins with the objectives of the profiling process. The procedures for constructing the frame for the initial profiling process using several administrative data sources are then presented. These procedures include the application of concepts, the detection of overlap between sources, and the evaluation of data quality. Next, the role of administrative data in providing information on changes to business entities and in requesting profiles to be verified is presented. Then the results of a simulation study done to assess this role are reviewed. Finally, the paper concludes with a series of questions on the methodology of using administrative data to maintain profiles.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114574
    Description:

    Let A × B be the product space of two sets A and B, divided into matches (pairs representing the same entity) and nonmatches (pairs representing different entities). Linkage rules are those that divide A × B into links (designated matches), possible links (pairs for which we delay a decision), and nonlinks (designated nonmatches). Under fixed bounds on the error rates, Fellegi and Sunter (1969) provided a linkage rule that is optimal in the sense that it minimizes the set of possible links. The optimality depends on knowledge of certain probabilities that enter a crucial likelihood ratio. In applying the record linkage model, an independence assumption is often made that allows estimation of these probabilities. If the assumption is not met, a record linkage procedure using estimates computed under the assumption may not be optimal. This paper examines methods for adjusting linkage rules when the independence assumption is not valid. The presentation takes the form of an empirical analysis of lists of businesses for which the true match status is known. The number of possible links obtained using the standard and adjusted computational procedures may depend on the particular sample drawn; bootstrap methods (Efron 1987) are used to examine this sampling variation.

    Release date: 1989-06-15
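
    A sketch of the likelihood-ratio weight under the conditional independence assumption the paper scrutinizes (Python; the m- and u-probabilities and thresholds are assumed known here, whereas estimating them is the hard part):

        import numpy as np

        # Agreement probabilities for three comparison fields (hypothetical).
        m = np.array([0.95, 0.90, 0.80])   # P(field agrees | match)
        u = np.array([0.10, 0.05, 0.20])   # P(field agrees | nonmatch)

        def fs_weight(gamma):
            """Log2 likelihood ratio for agreement pattern gamma in {0,1}^3,
            assuming conditional independence across fields."""
            lr = np.where(gamma == 1, m / u, (1 - m) / (1 - u))
            return np.log2(lr).sum()

        UPPER, LOWER = 6.0, -3.0           # cutoffs set from error bounds
        w = fs_weight(np.array([1, 1, 0]))
        print(w, "link" if w > UPPER else
                 "nonlink" if w < LOWER else "possible link")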

  • Articles and reports: 12-001-X198900114581
    Description:

    This paper develops a design-consistent small domain estimator using a random effects model. The mean squared error of this estimator is then evaluated without assuming that the random effect component of the model is correct. Data from a complex sample survey show how this approach to mean squared error estimation, while perhaps too unstable to be used directly, can be employed to determine whether the design-consistent small domain estimator proposed here is better than the conventional design-based estimator.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114572
    Description:

    The Survey of Income and Program Participation (SIPP) is a new Census Bureau panel survey designed to provide data on the economic situation of persons and families in the United States. The basic datum of SIPP is monthly income, which is reported for each month of the four-month reference period preceding the interview month. The SIPP Record Check Study uses administrative record data to estimate the quality of SIPP estimates for a variety of income sources and transfer programs. The project uses computerized record matching to identify SIPP sample persons in four states who are on record as having received payments from any of nine state or Federal programs, and then compares survey-reported dates and amounts of payments with official record values. The paper describes the project in detail and presents some early findings.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114579
    Description:

    Estimation of the means of a characteristic for a population at different points in time, based on a series of repeated surveys, is briefly reviewed. By imposing a stochastic parametric model on these means, it is possible to estimate the parameters of the model and to obtain alternative estimators of the means themselves. We describe the case where the population means follow an autoregressive-moving average (ARMA) process and the survey errors can also be formulated as an ARMA process. An example using data from the Canadian Travel Survey is presented.

    Release date: 1989-06-15
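
    A compact illustration (Python) of the signal-extraction idea: let the population mean follow an AR(1) process (a special ARMA case) with white-noise survey error, and filter the published estimates with a scalar Kalman recursion. All parameter values are invented:

        import numpy as np

        def kalman_ar1(y, phi, q, r, m0=0.0, p0=10.0):
            """Filter y_t = theta_t + e_t, e_t ~ N(0, r), where
            theta_t = phi * theta_{t-1} + w_t, w_t ~ N(0, q)."""
            m, p, out = m0, p0, []
            for yt in y:
                m, p = phi * m, phi ** 2 * p + q        # predict
                k = p / (p + r)                         # Kalman gain
                m, p = m + k * (yt - m), (1 - k) * p    # update
                out.append(m)
            return np.array(out)

        y = np.array([5.1, 4.7, 5.6, 5.0, 4.4, 5.2])    # assumed estimates
        print(kalman_ar1(y, phi=0.9, q=0.05, r=0.4).round(2))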

  • Articles and reports: 12-001-X198900114573
    Description:

    The Census Bureau makes extensive use of administrative records information in its various economic programs. Although the volume of records processed annually is vast, even larger numbers will be received during the census years. Census Bureau mainframe computers perform quality control (QC) tabulations on the data; however, since such a large number of QC tables are needed and programming resources are limited and costly, a comprehensive mainframe QC system is difficult to attain. Add to this the sensitive nature of the data and the potentially very negative ramifications of erroneous data, and the need becomes quite apparent for a sophisticated quality assurance system at the microcomputer level. Such a system is being developed by the Economic Surveys Division and will be in place for the 1987 administrative records data files. The automated quality assurance system integrates micro and mainframe computer technology. Administrative records data are received weekly and processed initially through mainframe QC programs. The mainframe output is transferred to a microcomputer and formatted specifically for importation to a spreadsheet program. Systematic quality verification occurs within the spreadsheet structure, as data review, error detection, and report generation are accomplished automatically. By shifting processes from the mainframe to the microcomputer environment, the system eases the burden on the programming staff, increases the flexibility of the analytical staff, reduces mainframe processing costs, and provides a comprehensive quality assurance component for administrative records.

    Release date: 1989-06-15

  • Articles and reports: 12-001-X198900114575
    Description:

    The experience of the four Nordic countries illustrates the advantages and disadvantages of a register-based census of population and points to ways in which the disadvantages can be contained. Other countries see major obstacles to a register-based census: the lack of data systems of the kind and quality needed, and public concern about privacy and the power of the State. These issues go far beyond statistics; they concern policy and administration. The paper looks at the situation in two countries, the United Kingdom and Australia. In the United Kingdom, past initiatives aimed at peacetime population registration foundered, and the present environment is hostile to any new initiative; nevertheless, the government is going ahead with a controversial reform of local taxation that involves setting up new registers. In Australia, the government tabled a Bill to introduce identity cards and an associated register, advancing clear-cut political arguments to support it; the Bill was later withdrawn. The paper concludes that the issues involved in reforming data systems deserve to be fully discussed and gives reasons why statisticians should take a leading part in the debate.

    Release date: 1989-06-15
