Statistics Canada

Appendix I: Glossary


Population of interest: the collection of all units (for example, vehicle-days) for which the information is required.

Survey population: the collection of all units (for example, vehicle-days) for which the information can be realistically provided to the survey. The survey population may differ from the population of interest due to the operational difficulty of identifying all the units that belong to the population of interest. A list of all units in the survey population with their classification information (for example, geographical, vehicle characteristics, date) is used for sample design, selection and estimation.

Stratification: the partition of the survey population into non-overlapping, relatively homogeneous groups with respect to certain characteristics, such as geographical classification or size. These groups are called strata and are used for sample allocation and selection.
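As an illustration only, the partition can be sketched in a few lines of Python. The unit records, the field names (`region`, `vehicle_type`) and the helper `stratify` are hypothetical and are not taken from the survey. The sketch shows that each unit falls into exactly one stratum.

```python
from collections import defaultdict

def stratify(units, key):
    """Partition units into strata; each unit lands in exactly one group."""
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)
    return dict(strata)

# Hypothetical list of survey-population units with classification information.
units = [
    {"id": 1, "region": "Ontario", "vehicle_type": "light"},
    {"id": 2, "region": "Ontario", "vehicle_type": "heavy"},
    {"id": 3, "region": "Quebec", "vehicle_type": "light"},
    {"id": 4, "region": "Ontario", "vehicle_type": "light"},
]

# Stratify by region and vehicle type (an assumed choice of characteristics).
strata = stratify(units, key=lambda u: (u["region"], u["vehicle_type"]))

for label, members in strata.items():
    print(label, [u["id"] for u in members])
```

Because every unit is assigned by a single key, the strata cannot overlap and together they cover the whole list.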

Sampling weight: a raising factor attached to each sampled unit (vehicle-day) to obtain estimates for the population from a sample. The basic concept of the sampling weight can be explained by using the representation rate. For example, if 2 units are selected at random out of 10 population units, then each selected unit represents 5 units in the population, including itself, and is given a sampling weight of 5. A survey with a complex sample design, such as the CVS, requires a more complicated way of calculating the sampling weight. However, the sampling weight is still equal to the number of units in the registration lists that the sampled unit represents.
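The simple case above (2 units selected out of 10) can be worked through in Python. This is a minimal sketch that assumes simple random sampling; it does not reproduce the more complex weighting a survey such as the CVS requires, and the function name and the reported values are hypothetical.

```python
def sampling_weight(population_size: int, sample_size: int) -> float:
    """Weight under simple random sampling: population units per sampled unit."""
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    return population_size / sample_size

# 2 units selected out of 10: each sampled unit represents 5 population units.
weight = sampling_weight(population_size=10, sample_size=2)

# Hypothetical values reported by the two sampled units (for example, kilometres).
sample_values = [120.0, 80.0]

# Weighted estimate of the population total.
estimated_total = sum(weight * value for value in sample_values)

print(weight)           # 5.0
print(estimated_total)  # 1000.0
```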

Editing: the application of checks that identify missing, invalid or inconsistent entries or that point to data records that are potentially in error. Some of these checks involve logical relationships that follow directly from the concepts and definitions. Others are more empirical in nature or are obtained as a result of the application of statistical tests or procedures.
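A rough sketch of such checks is given below in Python. The record fields (`trips`, `distance_km`) and the rules themselves are assumptions made for illustration; they are not the edits actually applied by the survey. The sketch covers one check of each kind: missing, invalid and inconsistent.

```python
def run_edits(record):
    """Return the list of edit failures found on one record."""
    failures = []
    distance = record.get("distance_km")
    trips = record.get("trips")

    # Missing entry: a required field has no value.
    if distance is None:
        failures.append("missing distance_km")
    # Invalid entry: the value lies outside the allowed range.
    elif distance < 0:
        failures.append("invalid distance_km")
    # Inconsistent entries: a logical relationship between fields is violated.
    elif trips == 0 and distance > 0:
        failures.append("inconsistent: distance reported without trips")

    return failures

print(run_edits({"trips": 2, "distance_km": 35.0}))  # []
print(run_edits({"trips": 2, "distance_km": None}))  # ['missing distance_km']
print(run_edits({"trips": 0, "distance_km": 35.0}))
```

Records that return a non-empty list would then be followed up or passed on to imputation.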

Imputation: the process used to resolve problems of missing, invalid or inconsistent responses identified during editing. This is done by changing some of the responses or missing values on the record being edited to ensure that a plausible, internally coherent record is created. Some problems are eliminated earlier through contact with the respondent or through manual study of the questionnaire. It is generally impossible to resolve all problems at these early stages because of concerns about response burden, cost and timeliness. Imputation is then used to handle the remaining edit failures, since it is desirable to produce a complete and consistent file containing imputed data. Although imputation can improve the quality of the final data by correcting for missing, invalid or inconsistent responses, some methods of imputation do not preserve the relationships between variables or can actually distort underlying distributions.
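The last point can be illustrated with a small Python sketch. It uses mean imputation, chosen here only because it is easy to follow and not because the survey is known to use it, and the reported values are hypothetical. Filling every missing value with the mean keeps the mean unchanged but shrinks the spread of the data.

```python
from statistics import mean, pvariance

# Hypothetical reported values and a count of records with a missing value.
reported = [10.0, 20.0, 30.0, 40.0]
n_missing = 4

# Mean imputation: every missing value is replaced by the mean of the reported values.
fill_value = mean(reported)
completed = reported + [fill_value] * n_missing

print(mean(reported), mean(completed))            # the mean is preserved
print(pvariance(reported), pvariance(completed))  # the variance is reduced
```

A file completed this way is consistent and has no gaps, but its distribution is more concentrated around the mean than the reported data suggest.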