Sample survey theory and methods: Past, present, and future directions
Section 3. Inferential issues: 1950 to the present

3.1  Theoretical foundations

Attempts were made to integrate sample survey theory with mainstream statistical inference via the likelihood function. Godambe (1966) showed that the likelihood function from the full sample data including labels, regarding the vector of unknown population values as the parameter, provides no information on the non-sampled values and hence on the population total or mean. This uninformative feature of the likelihood function is due to the inclusion of labels in the data, which makes the sample unique. An alternative design-based route ignores some aspects of the sample data to make the sample non-unique and thus arrive at informative likelihood functions (Hartley and Rao, 1968; Royall, 1968). This non-parametric likelihood approach is similar to the currently popular empirical likelihood (EL) approach in mainstream statistical inference (Owen, 1988). The EL approach has been applied to sampling problems in recent years to estimate not only totals and means but also more complex parameters. Thus the efforts to integrate survey sampling with mainstream statistics have been partially successful.
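To fix ideas, a common formulation of an empirical likelihood for a mean under unequal probability sampling (a sketch in our notation, not tied to any particular paper) maximizes a design-weighted log-likelihood over probability masses assigned to the sampled units, subject to constraints that incorporate known auxiliary information:

$$\max_{p_1,\dots,p_n} \ \sum_{i \in s} d_i \log p_i \quad \text{subject to} \quad p_i > 0, \ \ \sum_{i \in s} p_i = 1, \ \ \sum_{i \in s} p_i x_i = \bar{X},$$

where $d_i$ is the design weight of unit $i$, $x_i$ is an auxiliary variable with known population mean $\bar{X}$, and the resulting estimator of the mean of $y$ is $\sum_{i \in s} \hat{p}_i y_i$.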

The model-dependent approach provides an alternative route to inference from survey data. The approach requires that the population structure obey a specified superpopulation model, and the distribution induced by the assumed model provides the basis for inferences (Brewer, 1963; Royall, 1970). Such inferences, conditional on the selected sample, can be appealing. However, the resulting estimators may be design inconsistent and, as such, can perform poorly in large samples under model misspecification (Hansen, Madow and Tepping, 1983).
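As a simple illustration of the prediction idea behind the model-dependent approach (a sketch in our notation), the population total is written as the sum of sampled and non-sampled values, and the non-sampled values are predicted under the assumed model:

$$\hat{Y} = \sum_{i \in s} y_i + \sum_{i \notin s} \hat{y}_i,$$

where $\hat{y}_i$ is the model prediction for non-sampled unit $i$. Under the ratio model with $E_m(y_i) = \beta x_i$ and variance proportional to $x_i$, the predictor $\hat{y}_i = \hat{\beta} x_i$ with $\hat{\beta} = \sum_{i \in s} y_i / \sum_{i \in s} x_i$ yields the classical ratio estimator of the total.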

A hybrid approach, called the model-assisted approach, attempts to combine the desirable features of the design-based and model-dependent methods; see Cassel, Särndal and Wretman (1976). The approach typically makes use of data external to the collected sample data, called auxiliary data. Procedures using auxiliary data include regression estimation, ratio estimation, and raking, methods with estimators linear in the variable of interest. Estimators using auxiliary information, particularly regression, were recognized very early as powerful estimators (Cochran, 1953). Computing power made regression estimation practical in the 1970s, but to be acceptable in large-scale surveys the regression weights need to be nonnegative. An early method for constructing nonnegative regression weights is that of Huang and Fuller (1978). Deville and Särndal (1992) gave a general method of constructing weights for design-consistent estimators. Model-assisted methods entertain only design-consistent estimators of the total that are also model unbiased under a working model. This approach leads to valid design-based inferences in large samples, regardless of the validity of the working model; the efficiency of the estimators, however, does depend on how well the working model approximates the true population structure. The most popular model-assisted estimators are the generalized regression (GREG) estimators, which are implemented in standard survey software packages.
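A common form of the GREG estimator of the total (a sketch in our notation) is

$$\hat{Y}_{\text{GREG}} = \sum_{i \in s} d_i y_i + \Big( \mathbf{X} - \sum_{i \in s} d_i \mathbf{x}_i \Big)^{\top} \hat{\mathbf{B}},$$

where $d_i = 1/\pi_i$ is the design weight, $\mathbf{X}$ is the known vector of population totals of the auxiliary variables, and $\hat{\mathbf{B}}$ is a design-weighted regression coefficient estimated from the sample. Equivalently, the estimator can be written as $\sum_{i \in s} w_i y_i$ with calibrated weights $w_i$ satisfying $\sum_{i \in s} w_i \mathbf{x}_i = \mathbf{X}$, which connects it to the weight-construction methods of Deville and Särndal (1992).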

Theoretical results for probability-based sampling emphasize the first two moments of the sample statistics. Central limit theorems have been used to justify normality-based confidence intervals. An early central limit theorem for simple random samples is that of Madow (1948). Hájek (1960) gave a central limit theorem for simple random sampling, and Hájek (1964) gave a theorem for rejective sampling. Bickel and Freedman (1984) gave a central limit theorem for stratified random sampling. Recent literature considers both sequences of fixed finite populations and sequences of finite populations that are samples from a superpopulation (Fuller, 2009b, Section 1.3.2).

Variance estimation was very costly, nearly prohibitive, in the 1930s and 1940s, and remains expensive today. Replication was adopted from the beginning as an efficient variance estimation method. As we noted, an early form of replication was introduced by Mahalanobis (1939, 1946), who called it "interpenetrating samples"; later authors called it "random groups". The method of random groups based on half samples was used by the U.S. Census Bureau in the 1950s and 1960s. McCarthy (1966, 1969) developed and described balanced half-sample variance estimation; also see Kish and Frankel (1970). Wolter (2007) contains an extensive discussion of balanced half samples. Also see Dippo, Fay and Morganstein (1984), Kish and Frankel (1974), Krewski and Rao (1981), and Rao and Shao (1999). The jackknife and bootstrap are the current versions of early replication procedures. Wolter (2007, Chapter 4) credits Durbin (1959) with the first use of the jackknife in finite population estimation. The use of the bootstrap in the classical setting dates from Efron (1979), but its application to unequal probability samples and finite populations is not immediate. Among the large number of papers on the jackknife and bootstrap for survey samples are McCarthy and Snowden (1985), an early version based on with-replacement sampling, and Rao and Wu (1988), a modified bootstrap based on "rescaling" for survey samples. Sitter (1992) discussed several bootstrap methods, including suggestions for obtaining integer bootstrap sample sizes. Antal and Tillé (2011) gave bootstrap methods appropriate for a wide range of designs, including Poisson sampling. Beaumont and Patak (2012) gave general bootstrap procedures.
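The following sketch illustrates the basic replication idea in the simplest, random-groups form for a single-stage design; the function and variable names are ours and it is not an implementation of the balanced half-sample, jackknife or bootstrap procedures cited above, which form replicates within the design strata and primary sampling units.

```python
import numpy as np

def random_groups_variance(y, weights, n_groups=10, seed=0):
    """Illustrative random-groups variance estimator (single-stage sketch).

    The sample is split at random into G groups, the weighted estimator is
    computed from each group alone, and the variability among the group
    estimates is used to estimate the variance of the full-sample estimator.
    """
    rng = np.random.default_rng(seed)
    groups = rng.permutation(len(y)) % n_groups  # random group labels 0..G-1

    def hajek_mean(yv, wv):
        # Weighted (Hajek) estimator of the population mean.
        return np.sum(wv * yv) / np.sum(wv)

    theta_full = hajek_mean(y, weights)
    theta_g = np.array([hajek_mean(y[groups == g], weights[groups == g])
                        for g in range(n_groups)])

    # Random-groups variance estimator of the full-sample statistic.
    v_hat = np.sum((theta_g - theta_full) ** 2) / (n_groups * (n_groups - 1))
    return theta_full, v_hat

# Hypothetical usage with numpy arrays y (responses) and w (design weights):
# est, var = random_groups_variance(y, w, n_groups=10)
```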

3.2  Analytic use of survey data

As we have remarked, the early work on probability sampling emphasized totals and means and many estimation procedures were developed for official statistics. However, from the beginning, survey samples were used by social scientists to answer subject matter questions with relevance beyond the finite population sampled. Deming and Stephan (1940) and Deming (1953) gave explicit consideration to the difference between “enumerative” and “analytic” use of survey and census data, also see Hartley (1959). The analytic estimates are sometimes called estimates for a superpopulation. Early analysts often treated survey sample data as a simple random sample and constructed estimates on that basis. The potential for bias that arises from ignoring the design led to estimation theory for analytic estimates. One component is comprised of tests for the effect of weights on estimates, see DuMouchel and Duncan (1983), Fuller (1984), and Korn and Graubard (1995). A second component has been the development of design based theory for complicated statistics. See Fuller (1975), Rao and Scott (1981, 1984), and Binder and Roberts (2003). The third approach builds the sampling design into the model (Skinner, 1994 and Pfeffermann and Sverchkov, 1999). A number of computer packages (SAS, SUDAAN, R, STATA) are now available for probability-based statistics and standard errors. Many of the algorithms date from the work at Iowa State University (Hidiroglou, Fuller and Hickman, 1976).
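A simple illustration of the design-based route to analytic statistics (a sketch in our notation) defines the finite population parameter $\theta_N$ as the solution of the census estimating equations $\sum_{i=1}^{N} u(\theta; y_i) = \mathbf{0}$ and estimates it by solving the design-weighted sample analogue

$$\sum_{i \in s} d_i \, u(\hat{\theta}; y_i) = \mathbf{0},$$

with variances obtained by Taylor linearization or by the replication methods of Section 3.1. Regression coefficients, logistic regression coefficients and other "complicated statistics" fit this framework.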

3.3  Missing data

Almost all samples (and experiments) have missing and incorrect data. Missing data in survey sampling are placed in two categories: unit-missing and item-missing, where, as the name implies, a missing unit means that all items in the response record are missing. An indicator of the importance of missing data in survey research is the monograph set edited by Madow, Nisselson and Olkin (1983). One method of handling missing data is to report the nature and number of missing items and tabulate the remaining items. This was common in the early years, but the implied assumption of exchangeability in such a procedure was often not reasonable. An early method of correcting for unit nonresponse was to use a substitute respondent, often interviewing someone "close" to the nonrespondent. A common modification at the analysis stage was, and remains, post-stratification (Deming, 1953; Thomsen, 1973; Kalton, 1983; Jagers, 1986). In the missing data literature, post-strata are often called cells. Regression estimators are direct extensions of cell estimators and are an important method of correcting for missing data (Fuller and An, 1998). Weighting methods for handling unit nonresponse are reviewed in Brick and Montaquila (2009).
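For example, a common cell-based weighting adjustment for unit nonresponse (a sketch in our notation) inflates the design weights of respondents within each cell so that they reproduce the full-sample weight total for that cell:

$$w_i^{*} = d_i \, \frac{\sum_{j \in s_c} d_j}{\sum_{j \in r_c} d_j}, \qquad i \in r_c,$$

where $s_c$ and $r_c$ denote the sampled and responding units in cell $c$. The implicit assumption is that response is random within cells, which is the sense in which cell (post-stratification) estimators correct for nonresponse.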

Various forms of imputation for item nonresponse have been used over time, with imputation performed by clerks prior to use of computers. An early formal model-based and computer-based imputation was the hot deck imputation procedure used by the U.S. Census Bureau in the 1947 Current Population Survey, see the description in Andridge and Little (2009). Improved computing power and theoretical advances (Little, 1982; Kalton and Kish, 1984; Rubin, 1974, 1976, 1987; Little and Rubin, 1987; Kim and Fuller, 2004) have made imputation a standard part of estimation for survey samples and an active area of research. Recent books are Kim and Shao (2013) and Little and Rubin (2014).
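The sketch below illustrates the basic idea of random hot-deck imputation within adjustment cells; the function and column names are hypothetical, and production hot decks (including the Census Bureau procedure described by Andridge and Little) typically use sequential donor selection, donor-use limits, or weighted draws rather than this simple scheme.

```python
import numpy as np
import pandas as pd

def hot_deck_impute(df, item, cell, seed=0):
    """Illustrative random hot-deck imputation within cells.

    For each cell, missing values of `item` are replaced by values drawn
    at random (with replacement) from responding units in the same cell.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for _, idx in out.groupby(cell).groups.items():
        block = out.loc[idx, item]
        donors = block.dropna().to_numpy()            # respondents in the cell
        missing = block.index[block.isna().to_numpy()]  # nonrespondents in the cell
        if len(donors) > 0 and len(missing) > 0:
            out.loc[missing, item] = rng.choice(donors, size=len(missing))
    return out

# Hypothetical usage:
# imputed = hot_deck_impute(survey_df, item="income", cell="region")
```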

3.4  Small area estimation

The increased use of models for small-domain estimates results from two factors. The first is the demand for estimates for small domains (e.g., geographic areas) in policy formulation, fund allocation and regional planning. The second is the large standard errors of many of the design-based domain estimators. Schaible (1996) and Purcell and Kish (1979) gave early examples of small area estimation; also see Gonzalez (1973) and Steinberg (1979). The U.S. Census Bureau used model-based methods for small area estimation as early as 1947 (Hansen et al., 1953, Vol. I, pages 483-486). More recently, linear mixed models involving both fixed and random effects have become important. Early uses of mixed models for small area estimation are Fay and Herriot (1979) and Battese, Harter and Fuller (1988). Some sets of small area estimates can be viewed as a reallocation of the domain estimates, retaining the direct design-consistent estimate of the grand total. Bayesian methods, in particular hierarchical Bayes, are increasingly being used because of their ability to handle complex models; see Rao and Molina (2015, Chapter 10). In response to growing demand, the literature has grown considerably, and the field now has regular meetings and a book (Rao, 2003) with a recent second edition (Rao and Molina, 2015).
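To illustrate the mixed-model approach (a sketch in standard notation), the Fay-Herriot area-level model combines a direct estimate $\hat{\theta}_i$ for area $i$ with a linking regression:

$$\hat{\theta}_i = \theta_i + e_i, \qquad \theta_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + v_i,$$

where the sampling errors $e_i$ have mean zero and known variances $\psi_i$, and the area effects $v_i$ have mean zero and variance $\sigma_v^2$. The resulting predictor is a weighted combination of the direct and synthetic (regression) estimators,

$$\tilde{\theta}_i = \gamma_i \hat{\theta}_i + (1 - \gamma_i)\, \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}, \qquad \gamma_i = \frac{\sigma_v^2}{\sigma_v^2 + \psi_i},$$

so areas with large sampling variance borrow more strength from the regression, which is the source of the efficiency gains over direct design-based domain estimators.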

3.5  Survey practice

Sample design and estimation topics that we have discussed are critical parts of a survey operation, but they represent a small fraction of the total. The quality of the final product is determined by the frame materials, the collection instrument, data collection, editing, processing, and presentation of results. Many error sources are difficult to measure, but those designing surveys make implicit cost estimates when they allocate resources to different parts of the survey operation. Groves and Lyberg (2010) review attempts to enumerate the components of survey quality and to bring them under a single umbrella. They credit Deming (1944) with an early description of error sources in sample surveys and describe the contributions of Dalenius (1974), Anderson, Kasper and Frankel (1979), Groves (1989), and Biemer and Lyberg (2003), among others. Groves and Heeringa (2006) proposed tools for actively controlling survey errors and costs that can lead to responsive designs for household surveys. In particular, paradata (measurements related to the process of collecting survey data) can be used to monitor field work, to make intervention decisions during data collection, and to deal with measurement error, nonresponse and coverage errors (Kreuter, 2013).
