Survey errors

What errors may affect the survey results?

Errors may occur at any stage during the collection and processing of survey data, whether it is a census or a sample survey. There are two main sources of survey error: sampling error (errors associated directly with the sample design and estimation methods used) and non-sampling error (a blanket term used to cover all other errors). Non-sampling errors are usually sub-divided as follows:

  • Coverage errors, which are mainly associated with the sampling frame, such as missing units, inclusion of units not in the population of interest, and duplication.
  • Response errors, which are caused by problems related to the way questions were phrased, the order in which the questions were asked, or respondents' reporting errors (also referred to as measurement error if possible errors made by the interviewer are included in this category).
  • Non-response errors, which are due to respondents either not providing information or providing incorrect information. Non-response increases the likelihood of bias in the survey estimates. It also reduces the effective sample size, thereby increasing the observed sampling error. However, the risk of bias when non-response rates are high is generally more dangerous than the reduction in sample size per se.
  • Data capture errors, which are due to coding or data entry problems.
  • Edit and imputation ("E&I") errors, which can be introduced during attempts to find and correct all the other non-sampling errors.

All of these sources may contribute to either, or both, of the two types of survey error. These are bias, or systematic error, and variance, or random error.

Sampling error is not an error in the sense of a mistake having been made in conducting the survey. Rather it indicates the degree of uncertainty about the 'true' value based on information obtained from the number of people that were surveyed.

It is reasonably straightforward for knowledgeable, experienced survey-taking organizations to control sampling error through the use of suitable sampling methods and to estimate its impact using information from the sample design and the achieved sample. Any statement about sampling errors, namely variance, standard error, margin of sampling error or coefficient of variation, can only be made if the survey data come from a probability sample.
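As a rough illustration of the measures named above, the following sketch computes them for the mean of a small sample. It assumes a probability sample drawn by simple random sampling from a large population and a 95% confidence level (multiplier 1.96). The function name and the data values are hypothetical and are not taken from any particular survey.

```python
import math
import statistics

def sampling_error_summary(values, z=1.96):
    """Summarize sampling error for the mean of a simple random sample.

    Assumes simple random sampling from a large population (no finite
    population correction); z=1.96 is the usual 95% multiplier.
    """
    n = len(values)
    mean = statistics.mean(values)
    # Estimated variance of the sample mean (sample variance divided by n).
    variance_of_mean = statistics.variance(values) / n
    standard_error = math.sqrt(variance_of_mean)
    return {
        "mean": mean,
        "variance": variance_of_mean,
        "standard_error": standard_error,
        "margin_of_sampling_error": z * standard_error,
        "coefficient_of_variation": standard_error / mean,
    }

# Hypothetical responses (for example, hours per week), for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in sampling_error_summary(sample).items():
    print(f"{name}: {value:.3f}")
```

More complex sample designs (stratified, clustered or weighted) need variance formulas that reflect the design, so this simple version would not apply to them as written.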

The non-sampling errors, especially potential biases, are the most difficult to detect, to control and to measure, and require careful planning, training and testing.

How will the accuracy of the survey results be measured and reported?

The combined effect of bias and variance is the total survey error, which, if available, is the best measure of the overall accuracy of the survey results. For most surveys, however, only an estimate of sampling error is available. The most commonly presented measure is usually referred to as the margin of error. It should properly be called the margin of sampling error, because it does not incorporate any information about non-sampling errors. The same comment applies to confidence intervals, as they are computed directly from the margin of sampling error.

For that reason, confidence intervals and margins of sampling error alone are not enough to judge the quality of survey results. If the quality of statistical estimates is important to you in the use of your survey results, then you should seek a survey provider that is able to calculate and report all aspects of survey reliability.
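One common way to express the combined effect of bias and variance is the mean squared error, which adds the squared bias to the variance. The sketch below uses that formulation with hypothetical figures. In practice the bias is rarely known, so the values here are assumptions chosen only to show how a margin of sampling error can understate the overall error.

```python
import math

def root_mean_squared_error(bias, standard_error):
    """Combine bias and sampling variance: MSE = bias**2 + standard_error**2."""
    return math.sqrt(bias ** 2 + standard_error ** 2)

# Hypothetical figures, in percentage points, for illustration only.
standard_error = 1.5   # derived from the sample design; usually reported
bias = 4.0             # from non-sampling sources; rarely known in practice

print(f"Margin of sampling error (95%): {1.96 * standard_error:.2f}")
print(f"Root mean squared error:        {root_mean_squared_error(bias, standard_error):.2f}")
```

When the bias is zero, the root mean squared error equals the standard error, which is the only case where the sampling error alone describes the accuracy of the estimate.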

What influences the margin of sampling error?

The margin of sampling error is influenced by several factors:

  • The homogeneity of the population: the more the persons differ from one another in relation to the variables measured, the larger the sample must be.
  • The level or prevalence of the variables being measured: The rarer a characteristic is in the population, the harder it is to measure accurately.
  • The efficiency of the sample design being used.
  • Sample size, which is based on a sample design that will yield the most accurate estimates possible at a given cost.
  • Response rate, which determines the achieved sample size.

How big should the sample be?

The sample size directly affects the margin of sampling error that is reported with the survey results.

The margin of sampling error provides a legitimate estimate of the error due to sampling only if a probability sampling method was used to select the sample. Generally speaking, the more people that are interviewed, the smaller the sampling error becomes.
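The relationship between sample size and sampling error can be sketched for an estimated proportion. The example assumes simple random sampling from a large population and a 95% confidence level. The function name and the sample sizes are illustrative assumptions.

```python
import math

def margin_of_sampling_error(p, n, z=1.96):
    """Approximate margin of sampling error for an estimated proportion p.

    Assumes simple random sampling from a large population; z=1.96
    corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 gives the largest margin for a given sample size.
for n in (100, 400, 1600):
    margin = margin_of_sampling_error(0.5, n)
    print(f"n = {n:>4}: +/- {100 * margin:.1f} percentage points")
```

Under these assumptions the margin shrinks with the square root of the sample size, so quadrupling the number of interviews only halves the margin of sampling error.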

Note: Don't put all your faith in the survey results simply because the margin of sampling error is relatively small. This is only one possible source of error in a survey.

Will the margin of sampling error be the same for all survey estimates?

The margin of sampling error depends on the size of the sample surveyed. Therefore, estimates for sub-groups of the survey population, for which the sample size is smaller by definition, will have a larger margin of sampling error than the overall estimate for the total survey population.

The margin of sampling error also depends on the behaviour of the variable being measured. So even under the same sample design and with the same sample size, the margin of sampling error may be larger for one variable than for another simply because its values are more widely dispersed in the population being surveyed.
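Both effects described above, a smaller sub-group sample and a more widely dispersed variable, can be seen in a short sketch for a sample mean. It assumes simple random sampling from a large population and a 95% confidence level. The sample sizes, standard deviations and variable labels are hypothetical.

```python
import math

def margin_for_mean(std_dev, n, z=1.96):
    """Approximate 95% margin of sampling error for a sample mean.

    Assumes simple random sampling from a large population; std_dev is
    the standard deviation of the variable among the units surveyed.
    """
    return z * std_dev / math.sqrt(n)

# Hypothetical sample sizes and standard deviations, for illustration only.
cases = [
    ("total sample, variable A", 10.0, 1000),
    ("sub-group,    variable A", 10.0, 250),
    ("total sample, variable B", 30.0, 1000),
]
for label, std_dev, n in cases:
    print(f"{label}: n = {n:>4}, margin = +/- {margin_for_mean(std_dev, n):.2f}")
```

In this sketch, a sub-group one quarter the size of the total sample has twice the margin, and a variable with three times the standard deviation has three times the margin at the same sample size.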

Does a small margin of sampling error necessarily mean that the survey results are reliable?

Not necessarily. If the survey estimate is itself small, a margin of sampling error of only a few percentage points can be large relative to the estimate, so the estimate should be interpreted with caution. Base your interpretation on how the information will be used and on the consequences of making an incorrect decision based on that result.
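One way to see why small estimates need caution is to express the margin of sampling error as a share of the estimate itself. The sketch below does this for two hypothetical estimates that carry the same margin of 3 percentage points. The function name and the figures are assumptions for illustration.

```python
def relative_margin(estimate, margin):
    """Margin of sampling error expressed as a fraction of the estimate."""
    return margin / estimate

# Hypothetical estimates, each with a margin of 3 percentage points.
for estimate in (0.50, 0.05):
    ratio = relative_margin(estimate, 0.03)
    low, high = estimate - 0.03, estimate + 0.03
    print(f"estimate {estimate:.0%}: range {low:.0%} to {high:.0%}, "
          f"margin is {ratio:.0%} of the estimate")
```

The same absolute margin is a small fraction of the larger estimate but more than half of the smaller one, which is why the smaller estimate is far less precise in relative terms.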

What is the typical response rate for a survey?

Response rates vary widely depending on a number of factors. Virtually all surveys suffer from some non-response, and non-respondents may be different from respondents in ways that affect the survey results. A low response rate increases the potential impact of bias and can be much more damaging than a small sample with a high response rate.
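The link between response rate and bias can be sketched with a simple model in which every sampled unit is either a respondent or a non-respondent. Under that assumption, the bias of an estimate based on respondents only is the non-response rate multiplied by the difference between the two groups. The function name and all figures below are hypothetical. In a real survey the non-respondent value is unknown, which is what makes this bias hard to measure.

```python
def nonresponse_bias(response_rate, respondent_mean, nonrespondent_mean):
    """Bias of the respondent-only mean relative to the full-sample mean.

    Treats each sampled unit as either a respondent or a non-respondent:
    bias = (1 - response_rate) * (respondent_mean - nonrespondent_mean).
    """
    return (1 - response_rate) * (respondent_mean - nonrespondent_mean)

# Hypothetical figures: 60% of respondents and 40% of non-respondents
# have the characteristic being measured.
for rate in (0.9, 0.3):
    bias = nonresponse_bias(rate, 0.60, 0.40)
    print(f"response rate {rate:.0%}: bias = {100 * bias:+.1f} percentage points")
```

For the same difference between the two groups, the bias grows as the response rate falls, and unlike sampling error it does not shrink when more people are interviewed.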

Previous experience and choice of data collection method should provide an estimate of likely response rates. Some of the techniques that can help to maximize response rates include the following:

Providing advance notification: An advance letter explains the background of the survey and encourages participation.

Including effective introductions in your material: This approach can increase the credibility and perceived importance of the survey. In your introduction, it's important to do the following:

  • Identify the name of the organization conducting the survey
  • Guarantee confidentiality to all your respondents
  • Be honest about the length of the interview
  • Explain the uses and the benefits of the survey

Ensuring your interviewers are well trained: Preparing your interviewers before they meet with respondents is a must. Before sending them out into the field, ensure that they are:

  • Able to explain "random selection" (a frequently asked question)
  • Professional in their approach
  • Able to read out questions accurately
  • Prepared to probe and clarify responses

If quota sampling will be used and the respondents will be, for example, the first 1,000 willing to respond, then the results of the survey should be interpreted with caution. In this example, there is no information about how many people were approached in total in order to get the 1,000 interviews. There is also no information about how the respondents may be different from those who did not respond.
