Editing does little to improve the actual survey results if no corrective action is taken when items fail the rules set out during the editing process. Once all of the data have been edited using the applied rules and a file is found to have missing data, imputation is usually done as a separate step.
Non-response and invalid data reduce the quality of the survey results. Imputation resolves the problems of missing, invalid or incomplete responses identified during editing, as well as any editing errors that might have occurred. At this stage, all of the data are screened for errors because respondents are not the only ones capable of making mistakes; errors can also occur during coding and editing.
Imputation procedures are designed to fill in the gaps. When errors are detected, invalid, missing or incomplete entries are replaced with appropriate values, and answers are supplied for questions that received no response. Changes are made to the minimum number of fields needed for the completed record to pass all of the edits. This procedure is best accomplished by those with full access to the microdata and in possession of good auxiliary information.
The imputation procedures are decided upon during the planning and development stages of a survey. Some problems are eliminated earlier through contact with the respondent or by manually studying the questionnaire, but it is generally impossible to resolve all problems because of concerns about response burden, cost and timeliness. The imputation procedure is therefore used to handle the remaining edit failures.
Although imputation can improve the quality of the final data, care must be taken to choose an appropriate imputation methodology. Some methods of imputation do not preserve the relationship between variables. In fact, some can actually distort the underlying distributions.
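As a rough illustration of how a method can distort a distribution, the sketch below uses mean imputation, which is chosen here only as a convenient example and is not named in the text above. The income values are hypothetical. Replacing every missing value with the observed mean keeps the mean unchanged but shrinks the spread of the data.

```python
import statistics

# Hypothetical survey variable; None marks a missing response.
incomes = [20.0, 35.0, 50.0, 65.0, 80.0, None, None, None]

# Values actually reported by respondents.
observed = [x for x in incomes if x is not None]
mean_obs = statistics.mean(observed)

# Mean imputation: every gap receives the same value (the observed mean).
imputed = [x if x is not None else mean_obs for x in incomes]

sd_observed = statistics.stdev(observed)
sd_imputed = statistics.stdev(imputed)

# The mean is preserved, but the standard deviation is understated.
print(round(sd_observed, 1), round(sd_imputed, 1))  # 23.7 17.9
```

Because all imputed records share one value, any relationship between this variable and others in the file would also be weakened for those records.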
There are several approaches to consider when imputing data. Usually, deductive imputation is the first method used. This method is used when a value can be deduced with certainty, and it can be carried out during the collection, capture, editing, or later stages of data processing. Deductive imputation applies when there is only one possible response to the question (e.g., all the values are given but the total or subtotal is missing).
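The missing-total case can be sketched in a few lines. This is a minimal illustration, not a production routine: the function name `deduce_missing`, the field names and the figures are all hypothetical. It assumes the components must add up to the total, and it fills a gap only when exactly one field in that identity is missing, since that is the only situation in which the value is determined with certainty.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[float]]


def deduce_missing(
    record: Record, parts: List[str], total_key: str = "total"
) -> Tuple[Record, Optional[str]]:
    """Fill one missing field using the identity sum(parts) == total.

    Returns a copy of the record and the name of the imputed field,
    or None if nothing could be deduced with certainty.
    """
    fields = parts + [total_key]
    missing = [f for f in fields if record.get(f) is None]
    if len(missing) != 1:
        # Either nothing is missing or the gap is not uniquely determined.
        return dict(record), None

    out = dict(record)
    target = missing[0]
    if target == total_key:
        # All components reported: the total is their sum.
        out[target] = sum(record[p] for p in parts)
    else:
        # Total and all other components reported: solve for the gap.
        out[target] = record[total_key] - sum(
            record[p] for p in parts if p != target
        )
    return out, target


# Hypothetical record with a missing total.
record = {"wages": 1200.0, "rent": 300.0, "other": 50.0, "total": None}
completed, imputed_field = deduce_missing(record, ["wages", "rent", "other"])
print(completed["total"], imputed_field)  # 1550.0 total
```

Returning the name of the imputed field alongside the record makes it easy to flag which values were deduced rather than reported.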
Some other types of imputation methods include:
The method of imputation can vary from survey to survey and, depending on particular circumstances, sometimes even within the same survey. These methods can be applied either manually or with an automated system. When imputation is done manually, the imputed value is determined by calling the respondent or is based on the judgment of a subject-matter specialist. To help facilitate imputation, Statistics Canada has written specialized programs that impute data based on the methodological input of experienced statisticians who have analysed the survey and suggested how best to impute meaningful data.
Imputation methods can be performed automatically, manually or in combination. Done properly, imputation limits the biases caused by not having a complete and accurate record; contains an audit trail for evaluation purposes; and ensures that the imputed records are internally consistent. A good imputation procedure is automated, objective and efficient.