Statistics by subject – Statistical methods

All (54) (25 of 54 results)

  • Index and guides: 92-371-X
    Description:

    This report deals with sampling and weighting, a process whereby certain characteristics are collected and processed for a random sample of dwellings and persons identified in the complete census enumeration. Data for the whole population are then obtained by scaling up the results for the sample to the full population level. The use of sampling may lead to substantial reductions in costs and respondent burden, or alternatively, can allow the scope of a census to be broadened at the same cost.

    Release date: 1999-12-07

  • Classification: 89F0077X199903B
    Description:

    The National Longitudinal Survey of Children and Youth (NLSCY) is the first Canada-wide survey of children. Starting in 1994, it will gather information on a sample of children and their life experiences. It will follow these children over time. The survey will collect information on children and their families, education, health, development, behaviour, friends, activities, etc.

    Along with 89F0077XPE (or XIE) issue 9903a, this document contains the various questionnaires used to gather information from parents, children, teachers and principals.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015024
    Description:

    A longitudinal study on a cohort of pupils in the secondary school has been conducted in an Italian region since 1986 in order to study the transition from school to working life. The information has been collected at every sweep by a mail questionnaire and, at the final sweep, by a face-to-face interview, where retrospective questions referring back to the whole observation period have been asked. The gross flows between different discrete states - still in the school system, in the labour force without a job, in the labour force with a job - may then be estimated both from prospective and retrospective data, and the recall effect may be evaluated. Moreover, the conditions observed by the two different techniques may be regarded as two indicators of the 'true' unobservable condition, thus leading to the specification and estimation of a latent class model. In this framework, a Markov chain hypothesis may be introduced and evaluated in order to estimate the transition probabilities between the states, once they are corrected for the classification errors. Since the information collected by mail shows a given amount of missing data in terms of unit nonresponse, the 'missing' category is also introduced in the model specification.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015019
    Description:

    The British Labour Force Survey (LFS) is a quarterly household survey with a rotating sample design that can potentially be used to produce longitudinal data, including estimates of labour force gross flows. However, these estimates may be biased due to the effect of non-response. Weighting adjustments are a commonly used method to account for non-response bias. We find that weighting may not fully account for the effect of non-response bias because non-response may depend on the unobserved labour force flows, i.e., the non-response is non-ignorable. To adjust for the effects of non-ignorable non-response, we propose a model for the complex non-response patterns in the LFS which controls for the correlated within-household non-response behaviour found in the survey. The results of modelling suggest that non-response may be non-ignorable in the LFS, causing the weighting estimates to be biased.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015037
    Description:

    For longitudinal data, mixed models are often used, since they allow analysts to take account of the correlation between different observations from the same individual. The finite mixture model may be considered as a special case of a mixed model. In this document, attention will be given to the maximum likelihood method. The maximization of the likelihood function for a finite mixture of distributions is generally more difficult than in the usual case of a single distribution and can require considerable time. The objective of this project was therefore primarily to identify the one or more algorithms that best meet the criteria of run time and of efficiency in finding the solution. To achieve this objective, a simulation study was carried out. Only the situation in which the dependent variable is dichotomous was considered. This situation is very useful in practice, since among other things it can be used to model discrete durations, such as the length of time in "low income" status.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015010
    Description:

    In 1994, Statistics Canada introduced a new longitudinal social survey that collects information from about 23,000 children spread over 13,500 households. The objective of the National Longitudinal Survey of Children and Youth is to measure the development and well-being of children until they reach adulthood. To this end, the survey gathers together information about the child, parents, neighbourhood as well as family and school environment. As a consequence, the data collected for each child is provided by several respondents, from parents to teachers, a situation which contributes to an increased disclosure risk. In order to reach a balance between confidentiality and the analytical value of released data, the survey produces three different microdata files with more or less information. The master file that contains all the information is only available by means of remote access. Hence, researchers do not have direct access to the data, but send their request in the form of software programs that are submitted by Statistics Canada staff. The results are then vetted for confidentiality and sent back to the researchers. The presentation will be devoted to the various disclosure risks of such a survey and to the tools used to reduce those risks.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015017
    Description:

    Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015036
    Description:

    Multivariate logistic regression, introduced by Glonek and McCullagh (1995) as a generalisation of logistic regression, is useful in the analysis of longitudinal data as it allows for dependent repeated observations of a categorical variable and for incomplete response profiles. We show how the method can be extended to deal with data from complex surveys and we illustrate it on data from the Swiss Labour Force Survey. The effect of the sampling weights on the parameter estimates and their standard errors is considered.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015016
    Description:

    Models for fitting longitudinal binary responses are explored using a panel study of voting intentions. A standard repeated measures multilevel logistic model is shown to be inadequate due to the presence of a substantial proportion of respondents who maintain a constant response over time. A multivariate binary response model is shown to be a better fit to the data.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015025
    Description:

    The log-linear modelling of categorical longitudinal survey data on income is studied. An emphasis is on inference about change. Special attention is paid to modelling of longitudinal data from two waves. A small illustration is based on data from the Canadian Survey of Labour and Income Dynamics.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015028
    Description:

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015023
    Description:

    The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices, with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variances. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015033
    Description:

    Victimizations are not randomly scattered through the population, but tend to be concentrated in relatively few victims. Data from the U.S. National Crime Victimization Survey (NCVS), a multistage rotating panel survey, are employed to estimate the conditional probabilities of being a crime victim at time t given the victimization status in earlier interviews. Models are presented and fit to allow use of partial information from households that move in or out of the housing unit during the study period. The estimated probability of being a crime victim at interview t given the status at interview (t-1) is found to decrease with t. Possible implications for estimating cross-sectional victimization rates are discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015021
    Description:

    The U.S. Bureau of the Census implemented major changes to the design of the Survey of Income and Program Participation (SIPP) with the panel begun in 1996. The revised survey design emphasized longitudinal applications and the Census Bureau attempted to understand and resolve the seam bias common to longitudinal surveys. In addition to the substantive and administrative redesign of the survey, the Census Bureau is improving the data processing procedures which yield microdata files for the public to analyse. The wave-by-wave data products are being edited and imputed with a longitudinal element rather than cross-sectionally, carrying forward information from a prior wave that is missing in the current wave. The longitudinal data products will be enhanced, both by the redesigned survey and new processing procedures. Simple methods of imputing data over time are being replaced with more sophisticated methods that do not attenuate seam bias. The longitudinal sample is expanding to include more observations which were nonrespondents in one or more waves. Longitudinal weights will be applied to the file to support person-based longitudinal analysis for calendar years or longer periods of time (up to four years).

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015018
    Description:

    This paper presents a method for handling longitudinal data in which individuals belong to more than one unit at a higher level, and also where there is missing information on the identification of the units to which they belong. In education, for example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students, the identity of either the primary or secondary school may be unknown. Likewise, in a longitudinal study, students may change school or class from one period to the next, so 'belonging' to more than one higher level unit. The procedures used to model these structures are extensions of a random effects cross-classified multilevel model.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015022
    Description:

    This article extends and further develops the method proposed by Pfeffermann, Skinner and Humphreys (1998) for the estimation of gross flows in the presence of classification errors. The main feature of that method is the use of auxiliary information at the individual level which circumvents the need for validation data for estimating the misclassification rates. The new developments in this article are the establishment of conditions for model identification, a study of the properties of a model goodness of fit statistic and modifications to the sample likelihood to account for missing data and informative sampling. The new developments are illustrated by a small Monte-Carlo simulation study.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015031
    Description:

    The U.S. Third National Health and Nutrition Examination Survey (NHANES III) was carried out from 1988 to 1994. This survey was intended primarily to provide estimates of cross-sectional parameters believed to be approximately constant over the six-year data collection period. However, for some variables (e.g., serum lead, body mass index and smoking behavior), substantive considerations suggest the possible presence of nontrivial changes in level between 1988 and 1994. For these variables, NHANES III is potentially a valuable source of time-change information, compared to other studies involving more restricted populations and samples. Exploration of possible change over time is complicated by two issues. First, some variables displayed substantial regional differences in level, which was of practical concern. Second, nontrivial changes in level over time can lead to nontrivial biases in some customary NHANES III variance estimators. This paper considers these two problems and discusses some related implications for statistical policy.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015029
    Description:

    In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects, which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We propose asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015035
    Description:

    In a longitudinal survey conducted for k periods, some units may be observed for fewer than k of the periods. Examples include surveys designed with partially overlapping subsamples, a pure panel survey with nonresponse, and a panel survey supplemented with additional samples for some of the time periods. Estimators of the regression type are exhibited for such surveys. An application to special studies associated with the National Resources Inventory is discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015026
    Description:

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015032
    Description:

    The objective of this research project is to examine the long-term consequences of being raised in a single parent household. We examine the impact of parental separation or divorce on the adult labour market behaviour of children ten to fifteen years after the event. In particular, we relate the family income and household characteristics of a cohort of individuals who are 16 to 19 years of age in 1982 to their labour market earnings, reliance on social transfers (UI and Income Assistance), and marital/fertility outcomes during the early 1990s, when they are in their late 20s and early 30s. Our data is based upon the linked income tax records developed by us at Statistics Canada, the Survey of Labour and Income Dynamics, and the National Longitudinal Survey of Children and Youth.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015034
    Description:

    A model of secondary school progression has been estimated using data from the 1991 School Leavers Survey conducted by Statistics Canada. The data on which the school progression model was based comprised current educational status and responses to retrospective questions on the timing of schooling events. These data were sufficient for approximate reconstruction of educational event histories of each respondent. The school progression model was designed to be included in a larger, continuous time micro-simulation model. Its main features involve estimation -- by age, month of birth and season for both sexes in each province -- of rates of graduation, of dropout, of return and of dropout graduation. Estimation was reinforced with auxiliary 1991 Census and administrative data.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015015
    Description:

    In epidemiology, analysis of longitudinal data is commonly accepted as providing the most robust measures of association between putative risk and selected outcomes such as death or cancer. SMARTIE is a SAS application for efficient analysis of longitudinal data. Based on person days at risk, it can handle multiple exits from and re-entries to risk, and derives outcome measures such as survival rates, Standardised Mortality Ratios (SMRs) and Cancer Incidence Ratios (SIRs). Summary data can be produced in a format easily ported to any modelling package such as Stats 5.0. We discuss the background to its development, the overall program structure, its command language, and finally we say something about the organization of outputs. Findings from survival studies using the Longitudinal Study of the Office for National Statistics (ONS) are used to demonstrate features of SMARTIE. This study is based on one per cent of the population of England and Wales. It is continually updated with the addition of new members and with information from birth, death and cancer records, and from the census.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015030
    Description:

    Two-phase sampling designs have been conducted in waves to estimate the incidence of a rare disease such as dementia. Estimation of disease incidence from a longitudinal dementia study has to appropriately adjust for data missing by death as well as the sampling design used at each study wave. In this paper we adopt a selection model approach to model the missing data by death and use a likelihood approach to derive incidence estimates. A modified EM algorithm is used to deal with data missing by sampling selection. The non-parametric jackknife variance estimator is used to derive variance estimates for the model parameters and the incidence estimates. The proposed approaches are applied to data from the Indianapolis-Ibadan Dementia Study.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015020
    Description:

    At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults 17 years or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we present a sketch of how our sample evolves and an analysis of the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category. We then describe the longitudinal weight distribution obtained from the two models, and the cross-sectional weights using the weight share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22
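
Several of the entries above deal with gross flows and transition rates estimated from longitudinal samples. As a rough illustration only, the sketch below shows one simple way such quantities can be computed from weighted longitudinal records: gross flows as weighted counts of units moving between states, and transition rates as those flows divided by the weighted total for each origin state. The function names, state labels, weights and sample records are hypothetical and are not taken from any of the listed papers, which use more elaborate estimators that adjust for nonresponse, attrition and classification error.

```python
from collections import defaultdict


def gross_flows(records):
    """Weighted counts of units moving from one state to another.

    records: iterable of (state_wave1, state_wave2, weight) tuples,
    where weight is an assumed longitudinal survey weight.
    """
    flows = defaultdict(float)
    for origin, destination, weight in records:
        flows[(origin, destination)] += weight
    return dict(flows)


def transition_rates(flows):
    """Divide each flow by its origin-state total so rates per origin sum to 1."""
    row_totals = defaultdict(float)
    for (origin, _), count in flows.items():
        row_totals[origin] += count
    return {
        (origin, destination): count / row_totals[origin]
        for (origin, destination), count in flows.items()
        if row_totals[origin] > 0
    }


# Hypothetical records: (state at wave 1, state at wave 2, longitudinal weight)
sample = [
    ("employed", "employed", 120.0),
    ("employed", "unemployed", 30.0),
    ("unemployed", "employed", 40.0),
    ("unemployed", "unemployed", 60.0),
    ("employed", "employed", 50.0),
]

flows = gross_flows(sample)
rates = transition_rates(flows)
```

In this sketch the weights simply scale each sample record up to the population level; no variance estimation or correction for misclassification is attempted.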

Data (1) (1 result)

  • Table: 75M0007X
    Description:

    The Absence from Work Survey was designed primarily to fulfill the objectives of Human Resources Development Canada. They sponsor the qualified wage loss replacement plan which applies to employers who have their own private plans to cover employee wages lost due to sickness, accident, etc. Employers who fall under the plan are granted a reduction in their quotas payable to the Unemployment Insurance Commission. The data generated from the responses to the supplement will provide input to determine the rates for quota reductions for qualified employers.

    Although the Absence from Work Survey collects information on absences from work due to illness, accident or pregnancy, it does not provide a complete picture of people who have been absent from work for these reasons because the concepts and definitions have been developed specifically for the needs of the client. Absences in this survey are defined as being at least two weeks in length, and respondents are only asked the three reasons for their most recent absence and the one preceding it.

    Release date: 1999-06-29

Analysis (23) (23 of 23 results)

  • Articles and reports: 12-001-X19990014716
    Description:

    Two design-based estimators of gross flows and transition rates are considered. One makes use of the cross-sectional samples for the estimation of the income class boundaries at each time period and the longitudinal sample for the estimation of counts of units in the longitudinal population (longitudinal counts); this is the mixed estimator. The other one is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts; this is the longitudinal estimator. We compare the two estimators in the presence of large attrition rates, by means of a simulation. We find that under a less than perfect model of compensation for attrition, the mixed estimator is usually more sensitive to model bias than the longitudinal estimator. Furthermore, we find that for the mixed estimator, the magnitude of this bias overshadows the small gain in precision when compared to the longitudinal estimator. The results are illustrated with data from the Survey of Labour and Income Dynamics and the Longitudinal Administrative Database of Statistics Canada.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014707
    Description:

    This paper introduces Poisson Mixture sampling, a family of sampling designs so named because each member of the family is a mixture of two Poisson sampling designs, Poisson πps sampling and Bernoulli sampling. These two designs are at opposite ends of a continuous spectrum, indexed by a continuous parameter. Poisson Mixture sampling is conceived for use with the highly skewed populations often arising in business surveys. It gives the statistician a range of different options for the extent of the sample coordination and the control of response burden. Some Poisson Mixture sampling designs give considerably more precise estimates than the usual Poisson πps sampling. This result is noteworthy, because Poisson πps is in itself highly efficient, assuming it is based on a strong measure of size.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014718
    Description:

    In this short note, we demonstrate that the well-known formula for the design effect intuitively proposed by Kish has a model-based justification. The formula can be interpreted as a conservative value for the actual design effect.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014712
    Description:

    This paper investigates a repeated sampling approach to take into account auxiliary information in order to improve the precision of estimators. The objective is to build an estimator with a small conditional bias by weighting the observed values by the inverses of the conditional inclusion probabilities. A general approximation is proposed in cases when the auxiliary statistic is a vector of Horvitz-Thompson estimators. This approximation is quite close to the optimal estimator discussed by Fuller and Isaki (1981), Montanari (1987, 1997), Deville (1992) and Rao (1994, 1997). Next, the optimal estimator is applied to a stratified sampling design and it is shown that the optimal estimator can be viewed as a generalised regression estimator for which the stratification indicator variables are also used at the estimation stage. Finally, the application field of this estimator is discussed in the general context of the use of auxiliary information.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014715
    Description:

    The Gallup Organization has been conducting household surveys to study state-wide prevalences of alcohol and drug (e.g., cocaine, marijuana, etc.) use. Traditional design-based survey estimates of use and dependence for counties and select demographic groups have unacceptably large standard errors because sample sizes in sub-state groups are too small. Synthetic estimation incorporates demographic information and social indicators in estimates of prevalence through an implicit regression model. Synthetic estimates tend to have smaller variances than design-based estimates, but can be very homogeneous across counties when auxiliary variables are homogeneous. Composite estimates for small areas are weighted averages of design-based survey estimates and synthetic estimates. A second problem generally not encountered at the state level but present for sub-state areas and groups concerns estimating standard errors of estimated prevalences that are close to zero. This difficulty affects not only telephone household survey estimates, but also composite estimates. A hierarchical model is proposed to address this problem. Empirical Bayes composite estimators, which incorporate survey weights, of prevalences and jackknife estimators of their mean squared errors are presented and illustrated.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014711
    Description:

    We consider the use of calibration estimators when outliers occur. An extension is obtained for the class of Deville and Särndal (1992) calibration estimators based on Wright (1983) QR estimators. It is also obtained by minimizing a general metric subject to constraints on the calibration variables and weights. As an application, this class of estimators helps us consider robust calibration estimators by choosing parameters carefully. This makes it possible, e.g., for cosmetic reasons, to limit robust weights to a predetermined interval. The use of robust estimators with a high breakdown point is also considered. In the specific case of the mean square metric, the estimator proposed by the author is a generalization of a Lee (1991) proposition. The new methodology is illustrated by means of a short simulation study.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014710
    Description:

    Most statistical offices use non-probability techniques to select the sample of commodities for which prices are collected for their Consumer Price Indexes. In the Netherlands, and in many other countries as well, those judgemental sampling methods come close to some kind of cut-off selection, in which a large part of the population (usually the items with the lowest expenditures) is deliberately left unobserved. This method obviously yields biased price index numbers. The question arises whether probability sampling would lead to better results in terms of the mean square error. We have considered simple random sampling, stratified sampling and systematic sampling proportional to expenditure. Monte Carlo simulations using scanner data on coffee, baby's napkins and toilet paper were carried out to assess the performance of the four sampling designs. Surprisingly perhaps, cut-off selection is shown to be a successful strategy for item sampling in the consumer price index.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X199900111395
    Description:

    In this Issue is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. As well, it sometimes contains information on structure or management changes in the journal.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014713
    Description:

    Robust small area estimation is studied under a simple random effects model consisting of a basic (or fixed effects) model and a linking model that treats the fixed effects as realizations of a random variable. Under this model a model-assisted estimator of a small area mean is obtained. This estimator depends on the survey weights and remains design-consistent. A model-based estimator of its mean squared error (MSE) is also obtained. Simulation results suggest that the proposed estimator and Kott's (1989) model-assisted estimator are equally efficient, and that the proposed MSE estimator is often much more stable than Kott's MSE estimator, even under moderate deviations of the linking model. The method is also extended to nested error regression models.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014709
    Description:

    We develop an approach to estimating variances for X-11 seasonal adjustments that recognizes the effects of sampling error and errors from forecast extension. In our approach, seasonal adjustment error in the central values of a sufficiently long series results only from the effect of the X-11 filtering on the sampling errors. Towards either end of the series, we also recognize the contribution to seasonal adjustment error from forecast and backcast errors. We extend the approach to produce variances of errors in X-11 trend estimates, and to recognize error in estimation of regression coefficients used to model, e.g., calendar effects. In empirical results, the contribution of sampling error often dominated the seasonal adjustment variances. Trend estimate variances, however, showed large increases at the ends of series due to the effects of fore/backcast error. Nonstationarities in the sampling errors produced striking patterns in the seasonal adjustment and trend estimate variances.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014717
    Description:

    The British Labour Force Survey (LFS) uses a rotating sample design, with each sample household retained for five consecutive quarters. Linking together the information on the same persons across quarters produces a potentially very rich source of longitudinal data. There are, however, serious risks of distortion in the results from such longitudinal linking, mainly arising from sample attrition, and from response errors, which can produce spurious flows between economic activity states. This paper describes the initial results of investigations by the Office for National Statistics (ONS) into the nature and extent of the problems.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014714
    Description:

    In this paper a general multilevel model framework is used to provide estimates for small areas using survey data. This class of models allows for variation between areas because of: (i) differences in the distributions of unit level variables between areas, (ii) differences in the distribution of area level variables between areas, and (iii) area specific components of variance which make provision for additional local variation which cannot be explained by unit-level or area-level covariates. Small area estimators are derived for this multilevel model formulation, and an approximation to the mean square error (MSE) of each small area estimate for this general class of mixed models is provided together with an estimator of this MSE. Both the approximation to the MSE and the estimator of MSE take into account three sources of variation: (i) the prediction MSE assuming that both the fixed and components of variance terms in the multilevel model are known, (ii) the additional component due to the fact that the fixed coefficients must be estimated, and (iii) the further component due to the fact that the components of variance in the model must be estimated. The proposed methods are estimated using a large data set as a basis for numerical investigation. The results confirm that the extra components of variance contained in multilevel models as well as small area covariates can improve small area estimates and that the MSE approximation and estimator are satisfactory.

    Release date: 1999-10-08

  • Articles and reports: 62F0014M1998013
    Description:

    The reference population for the Consumer Price Index (CPI) has been represented, since the 1992 updating of the basket of goods and services, by families and unattached individuals living in private urban or rural households. The official CPI is a measure of the average percentage change over time in the cost of a fixed basket of goods and services purchased by Canadian consumers.

    Because of the broadly defined target population of the CPI, the measure has been criticised for failing to reflect the inflationary experiences of certain socio-economic groups. This study examines this question for three sub-groups of the reference population of the CPI. It is an extension of earlier studies on the subject done at Statistics Canada.

    In this document, analytical consumer price indexes for sub-groups are compared to the analytical index for the whole population calculated at the national geographic level.

    The findings tend to agree with those of earlier Statistics Canada studies on sub-groups in the CPI reference population. Those studies have consistently concluded that a consumer price index established for a given sub-group does not differ substantially from the index for the whole reference population.

    Release date: 1999-05-13

  • Articles and reports: 12-001-X19980024351
    Description:

    To calculate price indexes, data on "the same item" (actually a collection of items narrowly defined) must be collected across time periods. The question arises whether such "quasi-longitudinal" data can be modeled in such a way as to shed light on what a price index is. Leading thinkers on price indexes have questioned the feasibility of using statistical modeling at all for characterizing price indexes. This paper suggests a simple state space model of price data, yielding a consumer price index that is given in terms of the parameters of the model.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024355
    Description:

    Two sampling strategies have been proposed for estimating the finite population total for the most recent occasion, based on the samples selected over two occasions involving varying probability sampling schemes. Attempts have been made to utilize the data collected on a study variable, in the first occasion, as a measure of size and a stratification variable for selection of the matched-sample on the second occasion. Relative efficiencies of the proposed strategies have been compared with suitable alternatives.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024348
    Description:

    Gross flows among labour force states are of great importance in understanding labour market dynamics. Observed flows are typically subject to classification errors, which may induce serious bias. In this paper, some of the most common strategies used to collect longitudinal information about labour force condition are reviewed, together with the modelling approaches developed to correct gross flows affected by classification errors. A general framework for estimating gross flows is outlined. Examples are given of different model specifications, applied to data collected with different strategies. Specifically, two cases are considered, i.e., gross flows from (i) the U.S. Survey of Income and Program Participation and (ii) the French Labour Force Survey, a yearly survey collecting retrospective monthly information.

    Release date: 1999-01-14
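
    To see why classification errors bias observed gross flows, the sketch below applies a misclassification matrix to a table of true flows. It assumes errors are independent between the two occasions and depend only on the true state, a common simplifying assumption that is not necessarily the model used in the paper; all numbers and names are hypothetical.

```python
def observed_flows(true_flows, misclass):
    """Expected observed flow proportions under independent classification errors.

    true_flows[i][j]: proportion truly in state i at time 1 and state j at time 2.
    misclass[i][a]:   probability of being recorded in state a when truly in state i.
    """
    k = len(true_flows)
    return [
        [
            sum(
                true_flows[i][j] * misclass[i][a] * misclass[j][b]
                for i in range(k)
                for j in range(k)
            )
            for b in range(k)
        ]
        for a in range(k)
    ]


# Two states (say employed, not employed) with no true movement at all.
true_flows = [[0.6, 0.0],
              [0.0, 0.4]]
# Hypothetical error rates: 5% of state 0 and 10% of state 1 are misrecorded.
misclass = [[0.95, 0.05],
            [0.10, 0.90]]

obs = observed_flows(true_flows, misclass)
for row in obs:
    print([round(v, 4) for v in row])
```

    Even though nobody changes state in the true table, the observed table shows non-zero off-diagonal cells, which is the kind of spurious flow that the modelling approaches reviewed in the paper aim to correct.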

  • Articles and reports: 12-001-X19980024349
    Description:

    Measurement of gross flows in labour force status is an important objective of the continuing labour force surveys carried out by many national statistics agencies. However, it is well known that estimation of these flows can be complicated by nonresponse, measurement errors, sample rotation and complex design effects. Motivated by nonresponse patterns in household-based surveys, this paper focuses on estimation of labour force gross flows, while simultaneously adjusting for nonignorable nonresponse. Previous model-based approaches to gross flows estimation have assumed nonresponse to be an individual-level process. We propose a class of models that allow for nonignorable household-level nonresponse. A simulation study is used to show that individual-level labour force gross flows estimates from household-based survey data may be biased and that estimates using household-level models can offer a reduction in this bias.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024350
    Description:

    In longitudinal surveys, simple estimates of change, such as differences of percentages may not always be efficient enough to detect changes of practical relevance, especially in sub-populations. The use of models, which can represent the dependence structure of the longitudinal survey, can help to solve this problem. One of the main characteristics observed by the Swiss Labour Force Survey (SLFS) is the employment status. As the survey is designed as a rotating panel, the data from the SLFS are multivariate categorical data, where a large proportion of the response profiles are missing by design. The multivariate logistic model, introduced by Glonek and McCullagh (1995) as a generalisation of logistic regression, is attractive in this context, since it allows for dependent repeated observations and incomplete response profiles. We show that, using multivariate logistic regression, we can represent the complex dependence structure of the SLFS by a small number of parameters, and obtain more efficient estimates of change.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024352
    Description:

    The National Population Health Survey (NPHS) is one of Statistics Canada's three major longitudinal household surveys providing an extensive coverage of the Canadian population. A panel of approximately 17,000 people is being followed up every two years for up to twenty years. The survey data are used for longitudinal analyses, although an important objective is the production of cross-sectional estimates. Each cycle, panel respondents provide detailed health information (H) while, to augment the cross-sectional sample, general socio-demographic and health information (G) is collected from all members of their households. This particular collection strategy presents several observable response patterns for Panel Members after two cycles: GH-GH, GH-G*, GH-**, G*-GH, G*-G* and G*-**, where "*" denotes a missing portion of data. The article presents the methodology developed to deal with these types of longitudinal nonresponse as well as with nonresponse from a cross-sectional perspective. The use of weight adjustments for nonresponse and the creation of adjustment cells for weighting using a CHAID algorithm are discussed.

    Release date: 1999-01-14
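
    The weight adjustment within cells can be sketched as follows. This is a generic weighting-class adjustment with the cells taken as given; the CHAID step that forms the cells in the article is not reproduced, and the record layout (`weight`, `cell`, `responded`) is a hypothetical one chosen for the example.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Inflate respondent weights so each cell keeps its full weighted total.

    units: list of dicts with keys 'weight', 'cell' and 'responded'.
    Returns new records for respondents only, with adjusted weights.
    A cell with no respondents contributes nothing, so in practice such
    cells would need to be collapsed with a neighbour beforehand.
    """
    total = defaultdict(float)       # weighted count of all sampled units
    responding = defaultdict(float)  # weighted count of respondents
    for u in units:
        total[u["cell"]] += u["weight"]
        if u["responded"]:
            responding[u["cell"]] += u["weight"]

    adjusted = []
    for u in units:
        if u["responded"]:
            factor = total[u["cell"]] / responding[u["cell"]]
            adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


sample = [
    {"cell": "A", "weight": 100.0, "responded": True},
    {"cell": "A", "weight": 100.0, "responded": True},
    {"cell": "A", "weight": 100.0, "responded": False},
    {"cell": "B", "weight": 50.0, "responded": True},
    {"cell": "B", "weight": 50.0, "responded": True},
]
for record in adjust_for_nonresponse(sample):
    print(record["cell"], record["weight"])
```

    The adjustment is only as good as the assumption that respondents and nonrespondents in the same cell are alike, which is why the choice of cells matters.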

  • Articles and reports: 12-001-X19980024354
    Description:

    This article deals with an attempt to cross-tabulate two categorical variables, which were separately collected from two large independent samples, and jointly collected from one small sample. It was assumed that the large samples have a large set of common variables. The proposed estimation technique can be considered a mix between calibration techniques and statistical matching. Through calibration techniques, it is possible to incorporate the complex designs of the samples in the estimation procedure, to fulfill some consistency requirements between estimates from various sources, and to obtain fairly unbiased estimates for the two-way table. Through the statistical matching techniques, it is possible to incorporate a relatively large set of common variables in the calibration estimation, by means of which the precision of the estimated two-way table can be improved. The estimation technique enables us to gain insight into the bias generally obtained, in estimating the two-way table, by sole use of the large samples. It is shown how the estimation technique can be useful to impute values of the one large sample (donor source) into the other large sample (host source). Although the technique is principally developed for categorical variables Y and Z, with a minor modification, it is also applicable for continuous variables Y and Z.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024347
    Description:

    We review the current status of various aspects of the design and analysis of studies where the same units are investigated at several points in time. These studies include longitudinal surveys, and longitudinal analyses of retrospective studies and of administrative or census data. The major focus is the special problems posed by the longitudinal nature of the study. We discuss four of the major components of longitudinal studies in general; namely, Design, Implementation, Evaluation and Analysis. Each of these components requires special considerations when planning a longitudinal study. Some issues relating to the longitudinal nature of the studies are: concepts and definitions, frames, sampling, data collection, nonresponse treatment, imputation, estimation, data validation, data analysis and dissemination. Assuming familiarity with the basic requirements for conducting a cross-sectional survey, we highlight the issues and problems that become apparent for many longitudinal studies.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024353
    Description:

    This paper studies response errors in the Current Population Survey of the U.S. Bureau of the Census and assesses their impact on the unemployment rates published by the Bureau of Labour Statistics. The measurement of these error rates is obtained from reinterview data, using an extension of the Hui and Walter (1980) procedure for the evaluation of diagnostic tests. Unlike prior studies which assumed that the reconciled reinterview yields the true status, the method estimates the error rates in both interviews. Using these estimated error rates, we show that the misclassification in the original survey creates a cyclical effect on the reported estimated unemployment rates. In particular, the degree of underestimation increases when true unemployment is high. As there was insufficient data to distinguish between a model assuming that the misclassification rates are the same throughout the business cycle, and one that allows the error rates to differ in periods of low, moderate and high unemployment, our findings should be regarded as preliminary. Nonetheless, they indicated that the relationship between the models used to assess the accuracy of diagnostic tests, and those measuring misclassification rates of survey data, deserves further study.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024356
    Description:

    In the nonsurvey setting, "exact" confidence intervals for proportions calculated using the binomial distribution are frequently used instead of intervals based on approximate normality when the number of positive counts is small. With complex survey data, the binomial intervals are not applicable, so intervals based on the assumed approximate normality of the sample-weighted proportion are used, even if the number of positive counts is small. We propose a simple modification of the binomial intervals to be used in this situation. Limited simulations are presented that show the coverage probability of the proposed intervals is superior to that of the normality-based intervals, logit-transform intervals, and intervals based on a Poisson approximation. Applications are given involving the prevalence of Human Immunodeficiency Virus (HIV) based on data from the third National Health and Nutrition Examination Survey, and the proportion of users of cocaine based on data from the Hispanic Health and Nutrition Examination Survey.

    Release date: 1999-01-14
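
    For readers unfamiliar with the two baseline intervals, the sketch below contrasts the standard "exact" (Clopper-Pearson) binomial interval with the normal-approximation interval for a simple random sample. It does not implement the survey modification proposed in the paper; the function names are illustrative, and the exact limits are found by plain bisection rather than a statistics library.

```python
import math


def binom_tail_above(k, n, p):
    """P(X > k) for X ~ Binomial(n, p); increasing in p."""
    cdf = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - cdf


def _solve_increasing(g, target):
    """Find p in [0, 1] with g(p) = target, for g increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(x, n, alpha=0.05):
    """Clopper-Pearson interval for x positives out of n."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2 (0 if x == 0).
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: binom_tail_above(x - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2 (1 if x == n).
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: binom_tail_above(x, n, p), 1 - alpha / 2)
    return lower, upper


def wald_interval(x, n, z=1.959963984540054):
    """Normal-approximation interval, truncated to [0, 1]."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)


print(exact_interval(0, 10), wald_interval(0, 10))
print(exact_interval(2, 20), wald_interval(2, 20))
```

    With zero positive counts the normal-approximation interval collapses to a single point, while the exact interval still has a positive upper limit; this is the small-count behaviour that motivates the use of binomial-type intervals.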

Reference (30)


  • Index and guides: 92-371-X
    Description:

    This report deals with sampling and weighting, a process whereby certain characteristics are collected and processed for a random sample of dwellings and persons identified in the complete census enumeration. Data for the whole population are then obtained by scaling up the results for the sample to the full population level. The use of sampling may lead to substantial reductions in costs and respondent burden, or alternatively, can allow the scope of a census to be broadened at the same cost.

    Release date: 1999-12-07

  • Classification: 89F0077X199903B
    Description:

    The National Longitudinal Survey of Children and Youth (NLSCY) is the first Canada-wide survey of children. Starting in 1994, it will gather information on a sample of children and their life experiences. It will follow these children over time. The survey will collect information on children and their families, education, health, development, behaviour, friends, activities, etc.

    Along with 89F0077XPE (or XIE) issue 9903a, this document contains the various questionnaires used to gather information from parents, children, teachers and principals.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015024
    Description:

    A longitudinal study on a cohort of pupils in the secondary school has been conducted in an Italian region since 1986 in order to study the transition from school to working life. The information has been collected at every sweep by a mail questionnaire and, at the final sweep, by a face-to-face interview, where retrospective questions referring back to the whole observation period have been asked. The gross flows between different discrete states - still in the school system, in the labour force without a job, in the labour force with a job - may then be estimated both from prospective and retrospective data, and the recall effect may be evaluated. Moreover, the conditions observed by the two different techniques may be regarded as two indicators of the 'true' unobservable condition, thus leading to the specification and estimation of a latent class model. In this framework, a Markov chain hypothesis may be introduced and evaluated in order to estimate the transition probabilities between the states, once they are corrected for the classification errors. Since the information collected by mail shows a given amount of missing data in terms of unit nonresponse, the 'missing' category is also introduced in the model specification.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015019
    Description:

    The British Labour Force Survey (LFS) is a quarterly household survey with a rotating sample design that can potentially be used to produce longitudinal data, including estimates of labour force gross flows. However, these estimates may be biased due to the effect of non-response. Weighting adjustments are a commonly used method to account for non-response bias. We find that weighting may not fully account for the effect of non-response bias because non-response may depend on the unobserved labour force flows, i.e., the non-response is non-ignorable. To adjust for the effects of non-ignorable non-response, we propose a model for the complex non-response patterns in the LFS which controls for the correlated within-household non-response behaviour found in the survey. The results of modelling suggest that non-response may be non-ignorable in the LFS, causing the weighting estimates to be biased.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015037
    Description:

    For longitudinal data, mixed models are often used, since they allow analysts to take account of the correlation between different observations from the same individual. The finite mixture model may be considered as a special case of a mixed model. In this document, attention will be given to the maximum likelihood method. The maximization of the likelihood function for a finite mixture of distributions is generally more difficult than in the usual case of a single distribution and can require considerable time. The objective of this project was therefore primarily to identify the one or more algorithms that best meet the criteria of run time and of efficiency in finding the solution. To achieve this objective, a simulation study was carried out. Only the situation in which the dependent variable is dichotomous was considered. This situation is very useful in practice, since among other things it can be used to model discrete durations, such as the length of time in "low income" status.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015010
    Description:

    In 1994, Statistics Canada introduced a new longitudinal social survey that collects information from about 23,000 children spread over 13,500 households. The objective of the National Longitudinal Survey of Children and Youth is to measure the development and well being of children until they reach adulthood. To this end, the survey gathers together information about the child, parents, neighbourhood as well as family and school environment. As a consequence, the data collected for each child is provided by several respondents, from parents to teachers, a situation which contributes to an increased disclosure risk. In order to reach a balance between confidentiality and the analytical value of released data, the survey produces three different microdata files with more or less information. The master file that contains all the information is only available by means of remote access. Hence, researchers do not have direct access to the data, but send their request in the form of software programs that are submitted by Statistics Canada staff. The results are then vetted for confidentiality and sent back to the researchers. The presentation will be devoted to the various disclosure risks of such a survey and to the tools used to reduce those risks.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015017
    Description:

    Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015036
    Description:

    Multivariate logistic regression, introduced by Glonek and McCullagh (1995) as a generalisation of logistic regression, is useful in the analysis of longitudinal data as it allows for dependent repeated observations of a categorical variable and for incomplete response profiles. We show how the method can be extended to deal with data from complex surveys and we illustrate it on data from the Swiss Labour Force Survey. The effect of the sampling weights on the parameter estimates and their standard errors is considered.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015016
    Description:

    Models for fitting longitudinal binary responses are explored using a panel study of voting intentions. A standard repeated measures multilevel logistic model is shown to be inadequate due to the presence of a substantial proportion of respondents who maintain a constant response over time. A multivariate binary response model is shown to be a better fit to the data.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015025
    Description:

    The log-linear modelling of categorical longitudinal survey data on income is studied. An emphasis is on inference about change. Special attention is paid to modelling of longitudinal data from two waves. A small illustration is based on data from the Canadian Survey of Labour and Income Dynamics.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015028
    Description:

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015023
    Description:

    The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices, with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variances. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015033
    Description:

    Victimizations are not randomly scattered through the population, but tend to be concentrated in relatively few victims. Data from the U.S. National Crime Victimization Survey (NCVS), a multistage rotating panel survey, are employed to estimate the conditional probabilities of being a crime victim at time t given the victimization status in earlier interviews. Models are presented and fit to allow use of partial information from households that move in or out of the housing unit during the study period. The estimated probability of being a crime victim at interview t given the status at interview (t-1) is found to decrease with t. Possible implications for estimating cross-sectional victimization rates are discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015021
    Description:

    The U.S. Bureau of the Census implemented major changes to the design of the Survey of Income and Program Participation (SIPP) with the panel begun in 1996. The revised survey design emphasized longitudinal applications and the Census Bureau attempted to understand and resolve the seam bias common to longitudinal surveys. In addition to the substantive and administrative redesign of the survey, the Census Bureau is improving the data processing procedures which yield microdata files for the public to analyse. The wave-by-wave data products are being edited and imputed with a longitudinal element rather than cross-sectionally, carrying forward information from a prior wave that is missing in the current wave. The longitudinal data products will be enhanced, both by the redesigned survey and new processing procedures. Simple methods of imputing data over time are being replaced with more sophisticated methods that do not attenuate seam bias. The longitudinal sample is expanding to include more observations which were nonrespondents in one or more waves. Longitudinal weights will be applied to the file to support person-based longitudinal analysis for calendar years or longer periods of time (up to four years).

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015018
    Description:

    This paper presents a method for handling longitudinal data in which individuals belong to more than one unit at a higher level, and also where there is missing information on the identification of the units to which they belong. In education, for example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students, the identity of either the primary or secondary school may be unknown. Likewise, in a longitudinal study, students may change school or class from one period to the next, so 'belonging' to more than one higher level unit. The procedures used to model these structures are extensions of a random effects cross-classified multilevel model.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015022
    Description:

    This article extends and further develops the method proposed by Pfeffermann, Skinner and Humphreys (1998) for the estimation of gross flows in the presence of classification errors. The main feature of that method is the use of auxiliary information at the individual level which circumvents the need for validation data for estimating the misclassification rates. The new developments in this article are the establishment of conditions for model identification, a study of the properties of a model goodness of fit statistic and modifications to the sample likelihood to account for missing data and informative sampling. The new developments are illustrated by a small Monte-Carlo simulation study.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015031
    Description:

    The U.S. Third National Health and Nutrition Examination Survey (NHANES III) was carried out from 1988 to 1994. This survey was intended primarily to provide estimates of cross-sectional parameters believed to be approximately constant over the six-year data collection period. However, for some variables (e.g., serum lead, body mass index and smoking behavior), substantive considerations suggest the possible presence of nontrivial changes in level between 1988 and 1994. For these variables, NHANES III is potentially a valuable source of time-change information, compared to other studies involving more restricted populations and samples. Exploration of possible change over time is complicated by two issues. First, some variables displayed substantial regional differences in level, which was of practical concern. Second, nontrivial changes in level over time can lead to nontrivial biases in some customary NHANES III variance estimators. This paper considers these two problems and discusses some related implications for statistical policy.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015029
    Description:

    In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects, which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We propose asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor Linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015035
    Description:

    In a longitudinal survey conducted for k periods, some units may be observed for fewer than k of the periods. Examples include surveys designed with partially overlapping subsamples, a pure panel survey with nonresponse, and a panel survey supplemented with additional samples for some of the time periods. Estimators of the regression type are exhibited for such surveys. An application to special studies associated with the National Resources Inventory is discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015026
    Description:

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015032
    Description:

    The objective of this research project is to examine the long-term consequences of being raised in a single-parent household. We examine the impact of parental separation or divorce on the adult labour market behaviour of children ten to fifteen years after the event. In particular, we relate the family income and household characteristics of a cohort of individuals who were 16 to 19 years of age in 1982 to their labour market earnings, reliance on social transfers (UI and Income Assistance), and marital/fertility outcomes during the early 1990s, when they were in their late 20s and early 30s. Our data are based upon the linked income tax records developed by us at Statistics Canada, the Survey of Labour and Income Dynamics, and the National Longitudinal Survey of Children and Youth.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015034
    Description:

    A model of secondary school progression has been estimated using data from the 1991 School Leavers Survey conducted by Statistics Canada. The data on which the school progression model was based comprised current educational status and responses to retrospective questions on the timing of schooling events. These data were sufficient for approximate reconstruction of educational event histories of each respondent. The school progression model was designed to be included in a larger, continuous-time micro-simulation model. Its main features involve estimation -- by age, month of birth and season for both sexes in each province -- of rates of graduation, of dropout, of return and of dropout graduation. Estimation was reinforced with auxiliary 1991 Census and administrative data.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015015
    Description:

    In epidemiology, analysis of longitudinal data is commonly accepted as providing the most robust measures of association between putative risk and selected outcomes such as death or cancer. SMARTIE is a SAS application for efficient analysis of longitudinal data. Based on person days at risk, it can handle multiple exits from and re-entries to risk, and derives outcome measures such as survival rates, Standardised Mortality Ratios (SMRs) and Cancer Incidence Ratios (SIRs). Summary data can be produced in a format easily ported to any modelling package such as Stats 5.0. We discuss the background to its development, the overall program structure, its command language, and finally the organization of outputs. Findings from survival studies using the Longitudinal Study of the Office for National Statistics (ONS) are used to demonstrate features of SMARTIE. This study is based on one per cent of the population of England and Wales. It is continually updated with the addition of new members and with information from birth, death and cancer records, and from the census.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015030
    Description:

    Two-phase sampling designs have been conducted in waves to estimate the incidence of a rare disease such as dementia. Estimation of disease incidence from a longitudinal dementia study has to adjust appropriately for data missing by death as well as for the sampling design used at each study wave. In this paper we adopt a selection model approach to model the missing data by death and use a likelihood approach to derive incidence estimates. A modified EM algorithm is used to deal with data missing by sampling selection. The non-parametric jackknife variance estimator is used to derive variance estimates for the model parameters and the incidence estimates. The proposed approaches are applied to data from the Indianapolis-Ibadan Dementia Study.

    Release date: 1999-10-22

  • Technical products: 11-522-X19980015020
    Description:

    At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults aged 17 years or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we present a sketch of how our sample evolves and an analysis of the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category. We then describe the longitudinal weight distribution obtained from the two models, and the cross-sectional weights using the weight share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22
