Statistics Canada

Introduction

The simplified flowchart below shows how raw data are transformed into information. Data processing takes place once all of the relevant data have been collected. They are gathered from various sources and entered into a computer where they can be processed to produce information (output).

Figure 1. Data processing flowchart

A flowchart showing the processing of data.

Data processing includes the following steps:

Coding

First, before raw data can be entered into a computer, they must be coded. In order to do this, survey responses must be labeled, usually with simple, numerical codes. This can be done by the interviewer in the field or even by an office employee. The data coding step is important because it makes data entry and data processing easier.

Surveys have two types of questions: closed questions and open questions. The type of question determines the type of coding performed. A closed question allows only a fixed number of predetermined responses. These responses will have already been coded.

The following question, asked in the 1998 Time Use Survey (Sport), is an example of a closed question:

To what degree is sport important in providing you with the following benefits?

<1/>Very important

<2/>Somewhat important

<3/>Not important
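As a minimal sketch of how pre-coded closed responses might be handled in software, the Python snippet below maps each allowed answer to a numeric code. It assumes the codes 1 to 3 suggested by the example above; the names `IMPORTANCE_CODES` and `code_closed_response` are hypothetical and do not describe any Statistics Canada system.

```python
# Hypothetical code list for the closed question above (codes assumed to be 1-3).
IMPORTANCE_CODES = {
    "very important": 1,
    "somewhat important": 2,
    "not important": 3,
}


def code_closed_response(answer: str) -> int:
    """Return the numeric code for a closed-question answer.

    Raises ValueError when the answer is not one of the allowed responses.
    """
    key = answer.strip().lower()
    if key not in IMPORTANCE_CODES:
        raise ValueError(f"Not an allowed response: {answer!r}")
    return IMPORTANCE_CODES[key]
```

Because the set of responses is fixed in advance, anything outside the list can simply be rejected rather than interpreted.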

An open question implies that any response is allowed, making subsequent coding more difficult. In order to code an open question, the processor must sample a number of responses, and then design a code structure that includes all possible answers.

The following is an example of an open question:

What sports do you participate in?

Specify (28 characters)______________
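The sampling step described above can be sketched in Python. This is an illustration under stated assumptions, not the method used by any particular survey: the sample responses are made up, `normalize` and `build_code_structure` are hypothetical helper names, and codes are simply assigned in order of frequency.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lower-case a write-in answer and collapse extra whitespace."""
    return " ".join(text.lower().split())


def build_code_structure(sample: list[str], first_code: int = 1) -> dict[str, int]:
    """Assign a code to each distinct answer in the sample, most frequent first."""
    counts = Counter(normalize(r) for r in sample)
    # Sort by descending frequency, then alphabetically so the result is repeatable.
    ordered = sorted(counts, key=lambda a: (-counts[a], a))
    return {answer: first_code + i for i, answer in enumerate(ordered)}


# Made-up sample of write-in answers, for illustration only.
sample_responses = ["Hockey", "soccer", "hockey ", "Curling", "Soccer", "hockey"]
code_structure = build_code_structure(sample_responses)
```

In practice a code structure built from a sample usually also needs a catch-all code for answers that did not appear in the sample.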

In the Census and almost all other surveys, the codes for each question field are pre-marked on the questionnaire. When the questionnaire is processed, the codes are entered directly into the database and prepared for data capture. The following is an example of pre-marked coding:

What language does this person speak most often at home?

<18/>English

<19/>French

<20/>Other — Specify ____________
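A pre-marked question with an "Other, specify" option mixes both kinds of coding: the fixed answers already carry their codes, while the write-in text still has to be kept for later coding. The sketch below assumes the codes 18 to 20 shown in the example above; `CodedAnswer` and `code_home_language` are hypothetical names.

```python
from typing import NamedTuple, Optional

# Codes as pre-marked in the example above (assumed to be 18, 19 and 20).
HOME_LANGUAGE_CODES = {"English": 18, "French": 19}
OTHER_CODE = 20


class CodedAnswer(NamedTuple):
    code: int
    write_in: Optional[str]  # text kept for later coding, only for "Other"


def code_home_language(answer: str) -> CodedAnswer:
    """Return the pre-marked code, keeping the write-in text for 'Other'."""
    answer = answer.strip()
    if answer in HOME_LANGUAGE_CODES:
        return CodedAnswer(HOME_LANGUAGE_CODES[answer], None)
    return CodedAnswer(OTHER_CODE, answer)
```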

Statistics Canada has developed standard codes, called classifications or standards, to help sort people, places and things into specialized groups. See the section on classification for more information.

Automated coding systems

Statistics Canada is constantly testing programs that will automate repetitive and routine tasks. Coding is a prime candidate for this type of automation.

Some of the advantages of an automated coding system are that the process becomes

  • faster,
  • more consistent, and
  • more economical.

There are already many automated systems in use. For example, the Labour Force Survey data files are collected from the Regional Offices of Statistics Canada and are run through an automated coding system that assigns industry and occupation codes based on the Standard Industrial Classification and the Standard Occupational Classification. The rejected records (those whose written responses cannot be matched to a code) are the only ones that must be coded manually.
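The split between automatically coded records and rejects can be sketched as follows. This is only an illustration: the text does not say how the real system matches responses, so exact matching on normalized text is an assumption, and the reference entries, the `OCC-` codes and the function name `auto_code` are all made up.

```python
# Made-up reference entries for illustration only; real classifications
# use their own codes and far larger reference files.
REFERENCE_FILE = {
    "registered nurse": "OCC-001",
    "truck driver": "OCC-002",
    "elementary school teacher": "OCC-003",
}


def auto_code(responses, reference=REFERENCE_FILE):
    """Split written responses into coded records and rejects.

    A response is coded only when its normalized text matches a reference
    entry exactly; everything else is returned for manual coding.
    """
    coded, rejected = [], []
    for text in responses:
        key = " ".join(text.lower().split())  # lower-case, collapse spaces
        if key in reference:
            coded.append((text, reference[key]))
        else:
            rejected.append(text)
    return coded, rejected


coded, rejected = auto_code(["Truck Driver", "registered  nurse", "drives a rig"])
```

Here the first two responses are coded automatically and "drives a rig" is left in `rejected` for a human coder, which mirrors the division of work described above.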

The 1991 Census of Population used an automated coding system to code language, religion and ethnic origin information. These codes are not part of a standard classification system, but they are a standard used by the Census to facilitate analysis and tabulations.

The next step in data processing is inputting the coded data into a computer database. This step is known as data capture.