# Concepts, variables and classifications

## Scope and purpose

Concepts are general or abstract ideas that express the social and/or economic phenomena to be studied. They are the subjects of inquiry and analysis that are of interest to users.

A variable consists of two components: a statistical unit and a property. A statistical unit is the unit of observation or measurement for which data are collected or derived (e.g. persons or households in social surveys, and enterprises or establishments in business surveys) (Statistics Canada, 2008). A property is a characteristic or attribute of the statistical unit. Definitions of variables must be unambiguous and clearly specified in the context of the analytical purposes for which the data are to be collected (Statistics Canada, 2004).

A classification is a systematic grouping of the values that a variable can take. It comprises mutually exclusive classes that cover the full set of values, and it often provides a hierarchical structure for aggregating data so as to facilitate analysis and interpretation. More than one classification can be used to represent data for a given variable (Statistics Canada, 2004).
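The two structural requirements of a classification, mutually exclusive classes and full coverage of the values, can be checked mechanically. The following is a minimal sketch only: the age-group classes, the half-open range convention and the function names are assumptions made for illustration and are not taken from any Statistics Canada standard.

```python
# Hypothetical age-group classification (illustrative, not an official standard).
# Each class is a half-open range [low, high) of values of the variable "age".
AGE_GROUPS = {
    "0-14": (0, 15),
    "15-64": (15, 65),
    "65+": (65, 200),
}


def classes_for(value, classification):
    """Return the names of every class whose range contains the value."""
    return [
        name
        for name, (low, high) in classification.items()
        if low <= value < high
    ]


def check_classification(values, classification):
    """Return the values that do not fall into exactly one class.

    An empty list of matches means the classes do not cover the value;
    more than one match means the classes are not mutually exclusive.
    """
    problems = {}
    for value in values:
        matches = classes_for(value, classification)
        if len(matches) != 1:
            problems[value] = matches
    return problems
```

Calling `check_classification(range(0, 120), AGE_GROUPS)` returns an empty dictionary when every age from 0 to 119 falls into exactly one class. A value reported with no matching class points to a coverage gap, and a value reported with several classes points to an overlap.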

## Principles

Statistics Canada aims to ensure that the information it produces provides a consistent and coherent picture of the Canadian economy, society and environment, and that its various datasets can be analyzed together and in combination with information from other sources. To achieve this, the Agency follows conceptual frameworks, makes use of standard names and definitions for populations, statistical units, concepts, variables and classifications in statistical programs, and uses consistent collection and processing methods for the production of statistical data across surveys. The Statistics Canada Policy on Standards governs how this is to be done (Statistics Canada, 2004). Where applicable, three types of standards should be respected, in descending order of compulsion: departmental standards, recommended standards and program-specific standards (Statistics Canada, 2004).

## Guidelines

### Using standards

• Specify concepts and variables clearly and relate them to their intended use. Emphasis should be placed on the use of standard definitions of concepts, variables, classifications, statistical units and populations established under the Statistics Canada Policy on Standards (Statistics Canada, 2004). In choosing naming conventions, take into account the similarity or dissimilarity of existing standards and usage. Use titles from existing standards only for what is defined in the standards.

• Use standard definitions to make it possible to compare data collected from different sources and to integrate data across sources (Statistics Canada, 2004). Statistics Canada has standard classifications of industries, products, instructional programs, occupations, financial accounting and geography (Statistics Canada 2007a (NAICS), 2007b (NAPCS), 2000 (CIP), 2006a (NOC-S), 2006b (COA) and 2007c (SGC)), as well as of a large number of other domains used for social and economic statistics.

• In addition to Statistics Canada's standard classifications, there are international standard classifications produced by the United Nations Statistical Office, the International Labour Office, Eurostat, and other international and regional agencies. The Standards Division has produced official concordances to a number of international standard classifications. When there is a requirement to provide data to international agencies, use official concordances when they are available.

• Use standard units of observation to facilitate the comparison of data. Classifications are usually designed with particular units of observation in mind. For example, the North American Industry Classification System (NAICS) is designed primarily for classifying establishments.

• Be aware of derived statistical activities or statistical frameworks (e.g., the System of National Accounts) whose definitions of concepts and variables may have a significant effect on specific data collection activities (Statistics Canada, 1989).

• Sometimes there is more than one way to measure a concept. The variables and classifications chosen to measure a concept will also need to take into account factors such as the ease of obtaining the information required, the respondent burden imposed, the collection method, the context in which the question(s) must be asked, the processing of the data (especially editing, imputation and weighting techniques), whether the information can be obtained from administrative records, and the costs associated with collection and processing. Thus, the measurement approach adopted may be more or less successful in providing the desired interpretation of the concept. A variable chosen at one point in time may become obsolete later if new factors come into play, and may then need to be modified or changed. It is therefore important to ensure that the latest approved version of the variable is used. Updated standards are made available on the Statistics Canada website.

• In the absence of an official standard, examine the concepts, variables and classifications being used by related statistical programs and consult with the Standards Division when necessary.
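The guideline above on using official concordances when providing data to international agencies can be pictured as a lookup from national class codes to international ones. The sketch below is illustrative only: the codes `N01`, `I-A` and so on, the `CONCORDANCE` table and the `convert` function are all hypothetical, and the sketch assumes that each national class maps to a single international class, which is not always true of real concordances.

```python
# Hypothetical concordance from national class codes to international class
# codes. Assumption: each national class maps to exactly one international class.
CONCORDANCE = {
    "N01": "I-A",
    "N02": "I-A",
    "N03": "I-B",
}


def convert(codes, concordance):
    """Translate national codes through the concordance.

    Returns the converted codes and, separately, any codes that the
    concordance does not cover, so that gaps are reported rather than
    silently dropped or guessed.
    """
    converted = []
    unmapped = []
    for code in codes:
        if code in concordance:
            converted.append(concordance[code])
        else:
            unmapped.append(code)
    return converted, unmapped
```

Keeping unmapped codes in a separate list makes any departure from the concordance visible, so that it can be described and justified in the documentation.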

### Using classifications

• To maximize flexibility of use, code microdata and maintain files at the lowest possible level of the appropriate classification. Aggregation at a higher level may be required for particular analytical purposes or to satisfy confidentiality or data reliability constraints. Wherever possible, use the hierarchy of the classification in terms of the classes or higher-level aggregations of the standard. If this is not possible, follow a common collapsing strategy for aggregation and document differences between the standard and adopted levels of classifications/aggregations used. Use classifications that reflect both the most detailed and the collapsed levels. Make clear to users how these fit into higher-level (i.e. less detailed) classifications.
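Coding at the most detailed level and aggregating afterwards can be sketched as follows. This is a minimal illustration under stated assumptions: the records and codes are invented, and the hierarchy is assumed to be prefix-based, so that the leading characters of a detailed code identify its parent class. Classifications whose hierarchy is not expressed through code prefixes would need an explicit parent lookup instead.

```python
from collections import defaultdict

# Hypothetical microdata: (record id, most detailed class code, value).
# Assumption: the leading characters of a code identify its parent class.
MICRODATA = [
    ("r1", "1111", 10.0),
    ("r2", "1112", 5.0),
    ("r3", "1121", 7.5),
    ("r4", "2111", 2.5),
]


def aggregate(records, level):
    """Sum values by the first `level` characters of each detailed code."""
    totals = defaultdict(float)
    for _record_id, code, value in records:
        if len(code) < level:
            raise ValueError(
                f"code {code!r} is less detailed than level {level}"
            )
        totals[code[:level]] += value
    return dict(totals)
```

Because the file keeps the four-character codes, the same records can be tabulated at any higher level on request, for example `aggregate(MICRODATA, 3)` or `aggregate(MICRODATA, 1)`. Had the data been stored only at the one-character level, the more detailed tabulations could not be recovered.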

## Quality indicators

Main quality elements: coherence, interpretability, relevance

Describe key statistical concepts, including the statistical measure, the population, variables, units, domains and time reference. This information gives users an understanding of the relevance of the output to their needs.

Provide accurate references when standard concepts, variables and classifications are adopted.

Describe, justify and, if possible, measure (qualitatively if not quantitatively) any departures from standards. This gives users an indicator of relevance and aids interpretability.

