Appendix 1: Glossary of terms
The descendants of the original inhabitants of North America. The Canadian Constitution recognizes three groups of Aboriginal people – First Nations (or North American Indian people, including Status and non-Status Indians), Métis and Inuit. These are three separate peoples with unique heritages, languages, cultural practices and spiritual beliefs.
A Statistics Canada microdata set for a given survey, available for use in Research Data Centres (RDCs) across Canada. RDCs provide researchers with access, in a secure university setting, to microdata from population and household surveys. The centres are staffed by Statistics Canada employees. They are operated under the provisions of the Statistics Act in accordance with all the confidentiality rules and are accessible only to researchers with approved projects who have been sworn in under the Statistics Act as 'deemed employees'.
The bootstrap method is an approach for estimating error in a dataset related to sampling. Sampling introduces error because data are not taken from the entire population, but only a sub-section, called a sample, which is then used to make estimates for the whole population. There are several methods for estimating the level of sampling error. The bootstrap method selects a number of subsamples from the main sample and produces estimates for each subsample. The sampling error is estimated as a function of the observed differences between estimates from the different subsamples.
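The resampling idea described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: it draws equal-probability subsamples with replacement and ignores survey weights and the sample design, so it is not the procedure used for any particular Statistics Canada survey. The function name, the parameters and the data values are hypothetical.

```python
import random
import statistics

def bootstrap_standard_error(sample, estimator, n_replicates=500, seed=0):
    """Estimate the sampling error of `estimator` by resampling `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    replicate_estimates = []
    for _ in range(n_replicates):
        # Draw a subsample of the same size, with replacement, from the main sample.
        subsample = [sample[rng.randrange(n)] for _ in range(n)]
        replicate_estimates.append(estimator(subsample))
    # The spread of the subsample estimates approximates the sampling error.
    return statistics.stdev(replicate_estimates)

ages = [2, 3, 5, 1, 4, 3, 2, 5, 0, 4]  # made-up sample values
standard_error = bootstrap_standard_error(ages, statistics.mean)
print(round(standard_error, 2))
```

In practice, the user guide for a survey specifies how variance should be estimated, and that guidance takes precedence over a generic sketch like this one.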
Census Metropolitan Area (CMA) and Census Agglomeration (CA)
Area consisting of one or more neighbouring municipalities situated around a major urban core. A census metropolitan area must have a total population of at least 100,000, of which 50,000 or more live in the urban core. A census agglomeration must have an urban core population of at least 10,000.
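The population thresholds in this definition can be expressed as a small classification rule. The sketch below encodes only the two thresholds quoted above; the function name is hypothetical, and other delineation criteria that may apply in practice are not modelled.

```python
def classify_area(total_population, urban_core_population):
    """Apply the CMA and CA population thresholds from the definition."""
    if total_population >= 100_000 and urban_core_population >= 50_000:
        return "CMA"
    if urban_core_population >= 10_000:
        return "CA"
    return "neither"

print(classify_area(250_000, 120_000))
print(classify_area(60_000, 30_000))
```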
Census subdivision (CSD)
This is the general term for municipalities (as determined by provincial / territorial legislation) or areas treated as municipal equivalents for statistical purposes (e.g., Indian reserves, Indian settlements and unorganized territories).
Census of population
A census is the collection of information about all units in a population, sometimes also called a 100% sample survey. Under the Statistics Act of 1971, it is a statutory requirement to conduct a nationwide census every five years. The Census of Population provides information needed by community groups, businesses and governments to develop plans for education and training, seniors' housing, day care, fire protection, public transport, and many other programs.
As used in demography, a number of people having a common characteristic, for example, all persons in a given population who were born in 1940, or all persons suffering from a particular disease.
This is a term used within Statistics Canada to describe information that is subject to the secrecy provisions of the Statistics Act. Information is deemed confidential either because it directly identifies a responding unit, for example, by name, or because it could permit specific responding units to be identified, even when the data are stripped of identifiers, due to the information's detail or its geographical structure or format.
Confidentiality denotes an implied trust relationship between the person providing the information and the individual or organization collecting it. This relationship is built on the assurance that the information will not be disclosed without the person's permission. Under the Statistics Act, information that would identify an individual, business or institution cannot be disclosed without their knowledge or consent.
Coverage is the extent to which every person or unit intended for inclusion in a survey or census is in fact counted and counted only once. Coverage errors occur when persons or units of the survey or census are missed (under-coverage) or over-counted (over-coverage). Studies are often conducted by Statistics Canada to provide estimates of under-coverage and over-coverage of a given survey or census or to examine related issues. For example, Statistics Canada has studied and analyzed the extent to which cell-phone use affects coverage for telephone surveys.
CV – Coefficient of variation
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. In this process of estimation, some level of error is inevitable. The coefficient of variation (CV) is a way of expressing the sampling error associated with an estimate. First, a standard error or 'average' error of the estimate is calculated. The CV is then obtained by dividing the standard error of the estimate by the estimate itself and expressing the resulting fraction as a percentage. The lower the CV, the higher the data quality (see Margin of error).
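As a worked illustration of the calculation just described, the following sketch divides a standard error by its estimate and expresses the result as a percentage. The function name and the figures are made up for the example.

```python
def coefficient_of_variation(estimate, standard_error):
    """Return the CV as a percentage of the estimate."""
    return 100.0 * standard_error / estimate

# An estimate of 2,000 children with a standard error of 100 children.
print(coefficient_of_variation(2000, 100))
```

With these figures the CV is 5%, meaning the standard error is one-twentieth the size of the estimate.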
Observations and measurements collected during a survey, census or other study. Facts or figures from which conclusions can be drawn.
A degree or level of confidence that the data and statistical information are "fit for use". The particular issues of quality or fitness for use that must be addressed by Statistics Canada are relevance, accuracy, timeliness, accessibility, interpretability and coherence.
Dataset / Database
An organized and sorted list of facts or information about a set of individuals, households, businesses, or other relevant units. A Statistics Canada dataset is usually generated by a survey or administrative data, stored on a computer, and organized in such a way that it may be accessed easily by a wide variety of statistical application programs.
The process of providing statistical products and services to the general public and to specific data users. Statistics Canada disseminates data and analysis in the form of survey results, research reports, technical papers, periodical magazines, census products, and research compendia. Online products date from 1996 to the present. Historical material can be located using the Library Catalogue. Statistics Canada information is also distributed to an approved network of depository libraries. The objective of dissemination activities is to provide relevant information in a timely fashion, in useful formats, and through accessible channels. Activities in place to support the dissemination of products include client consultation services, marketing, promotions, user-training and other client services.
A new variable constructed by applying logical or mathematical operations to one or more existing variables in order to meet particular data needs. For example, an age variable can be derived from date of birth information. As another example, a derived variable could be obtained called 'presence of a chronic health condition' based on whether or not a respondent answered 'yes' at least once to a series of questions asking about specific chronic health conditions such as asthma, diabetes, heart disease, etc.
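Both examples in this definition can be sketched in code. The function names, field names and dates below are hypothetical, and the chronic-condition rule simply follows the 'yes at least once' wording above.

```python
from datetime import date

def derive_age(date_of_birth, reference_date):
    """Derive age in completed years from a date of birth."""
    years = reference_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

def has_chronic_condition(answers):
    """Derive a yes/no flag: True if any condition question was answered 'yes'."""
    return any(answer == "yes" for answer in answers.values())

print(derive_age(date(2002, 6, 15), date(2006, 5, 16)))
print(has_chronic_condition({"asthma": "no", "diabetes": "yes", "heart_disease": "no"}))
```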
Editing is a process that ensures survey data are accurate, complete and consistent. A set of editing rules or conditions is applied to a dataset. Data which do not meet the conditions are examined and corrected where appropriate.
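One way to picture a set of editing rules is as named conditions that each record must satisfy. The sketch below flags records that fail a rule so they can be examined; the rules, field names and records are invented for illustration and do not come from any actual survey.

```python
def find_edit_failures(records, rules):
    """Return (record index, rule name) pairs for every failed edit rule."""
    failures = []
    for index, record in enumerate(records):
        for rule_name, rule in rules.items():
            if not rule(record):
                failures.append((index, rule_name))
    return failures

# Hypothetical edit rules for a record describing a young child.
edit_rules = {
    "age_in_range": lambda r: 0 <= r["age"] <= 5,
    "school_needs_min_age": lambda r: not r["attends_school"] or r["age"] >= 4,
}

records = [
    {"age": 3, "attends_school": False},
    {"age": 2, "attends_school": True},  # inconsistent with the second rule
    {"age": 9, "attends_school": True},  # outside the allowed age range
]
failures = find_edit_failures(records, edit_rules)
print(failures)
```

Flagging and correcting are kept separate here on purpose: the definition says failing data are examined and corrected only where appropriate.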
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. The accuracy of such an estimate is a measure of how much the estimate differs from the correct or "true" figure. Departures from true figures are known as errors. Errors can arise from many sources, but can be grouped into a few broad categories: coverage errors, non-response errors, response errors, processing errors and sampling errors.
Coverage errors occur when persons or units of the survey are missed (under-coverage) or over-counted (over-coverage).
Non-response errors occur when it proves impossible to obtain a complete questionnaire from a person, household, or organization. Although certain adjustments for missing data can be made during processing, non-response means some loss of accuracy is inevitable.
Response errors indicate that a response may not be entirely accurate. The respondent may have misinterpreted the question or may not know the answer, especially if it is given for an absent household member, for example.
Processing errors include mistakes made during data entry, coding, tabulation or other forms of data manipulation.
Sampling error is the difference between the results of the weighted sample and the results that would have been obtained from the total population. The actual sampling error is of course unknown, but it is possible to calculate an "average" value, known as the "standard error".
Using results of the weighted sample to estimate the characteristics of the total population.
A term that came into common usage in the 1970s to replace the word "Indian," which many people found offensive. Although the term First Nation is widely used, no legal definition of it exists. Among its uses, the term "First Nations peoples" refers to the North American Indian people in Canada, both Status and non-Status. Many people have also adopted the term "First Nation" to replace the word "band" in the name of their community.
A list, map, or conceptual specification of the units comprising the survey population from which persons can be selected. For example, a telephone or city directory, or a list of members of a particular association or group.
The number of times an event or item occurs in a dataset.
A chart or table showing how often each value or range of values of a variable appears in a dataset. It is sometimes called a one-way frequency table to indicate that the distribution contains counts for one variable only.
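A one-way frequency table can be built by counting how often each value occurs. The following sketch uses made-up ages; the function name is hypothetical.

```python
from collections import Counter

def frequency_distribution(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

ages = [3, 1, 4, 3, 5, 3, 1]
for value, frequency in frequency_distribution(ages):
    print(f"{value}\t{frequency}")
```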
G - H - I
Imputation involves replacing either missing or invalid data with valid data. This is normally performed using predetermined rules or with the use of data from a 'statistical neighbour' – another responding unit who has similar characteristics. Imputation is often combined with data editing.
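The 'statistical neighbour' approach can be sketched as follows. This is a simplified illustration, assuming numeric matching characteristics and a plain sum of absolute differences as the measure of similarity; the function name, field names and records are hypothetical, and actual imputation systems are considerably more elaborate.

```python
def impute_from_neighbour(records, target_field, matching_fields):
    """Fill missing values of `target_field` from the most similar complete record."""
    donors = [r for r in records if r[target_field] is not None]

    def distance(donor, recipient):
        # Similarity measure assumed for this sketch: sum of absolute differences.
        return sum(abs(donor[f] - recipient[f]) for f in matching_fields)

    completed = []
    for record in records:
        if record[target_field] is None:
            nearest = min(donors, key=lambda donor: distance(donor, record))
            # Copy the record so the original data are left unchanged.
            record = {**record, target_field: nearest[target_field]}
        completed.append(record)
    return completed

survey_records = [
    {"age": 2, "household_size": 4, "hours_in_care": 20},
    {"age": 5, "household_size": 3, "hours_in_care": 35},
    {"age": 4, "household_size": 3, "hours_in_care": None},  # missing value
]
imputed = impute_from_neighbour(survey_records, "hours_in_care", ["age", "household_size"])
print(imputed[2]["hours_in_care"])
```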
A unit that meets all criteria for the survey. For the ACS, in the provinces, a unit was in scope if he or she was under 6 years of age, was Aboriginal, and did not live on a reserve (except for some First Nation communities in Quebec). In the territories, a unit was in scope if he or she was under 6 years of age (Aboriginal or not, living on reserve or not).
The Canadian federal legislation, first passed in 1876, that sets out certain federal government obligations, and regulates the management of Indian reserve lands. The act has been amended several times, most recently in 1985.
A group of North American Indian people for whom lands have been set apart and money is held by the Crown. Each band has its own governing band council, usually consisting of one or more chiefs, and several councillors. Community members choose the chief and councillors by election, or sometimes through traditional custom. The members of a band generally share common values, traditions and practices rooted in their ancestral heritage. Today, many bands prefer to be known as First Nations.
Data that have been recorded, classified, organized, related or interpreted within a framework so that meaning emerges.
Organization of results from Statistics Canada activities, including data files, databases, tables, graphs, maps, and text. This organization can be either pre-defined (standard information product) or made in response to special requests (customized information product). Information products can be made available on either print or electronic media.
Interpretability reflects the ease with which the user may understand, properly use and analyze the data or information. The degree of interpretability is largely determined by: the adequacy of definitions on concepts, target populations and variables; terminology underlying the data; and information on any limitations of the data.
"Inuit" means "people" in Inuktitut, the language of Inuit people. Most Inuit live in the Northwest Territories, Nunavut, Northern Quebec and Labrador.
Inuit Nunaat is the homeland of Inuit of Canada. It includes communities in Nunatsiavut (Northern coastal Labrador), Nunavik (Northern Quebec), the territory of Nunavut and the Inuvialuit region (Northwest Territories). These regions collectively encompass the area traditionally used and occupied by Inuit in Canada.
The singular form of the word Inuit (i.e. 'a person').
J - K - L
A form of regression analysis used when the response variable is a binary variable (a variable having two possible values).
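To make the idea concrete, the sketch below fits a logistic regression with one explanatory variable by plain gradient descent. It is a teaching illustration only: the data are invented, the function names and tuning values are hypothetical, and statistical software uses more refined fitting methods and accounts for survey weights.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, n_iterations=5000):
    """Fit P(y = 1) = sigmoid(intercept + slope * x) by gradient descent."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(intercept + slope * x) - y
            grad_intercept += error
            grad_slope += error * x
        # Step against the average gradient of the negative log-likelihood.
        intercept -= learning_rate * grad_intercept / n
        slope -= learning_rate * grad_slope / n
    return intercept, slope

# Binary response (0 or 1) observed at eight values of x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
intercept, slope = fit_logistic(xs, ys)
print(round(sigmoid(intercept + slope * 8), 2))
```

A positive fitted slope means the estimated probability of the response being 1 rises as x increases.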
Margin of error
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. In this process of estimation, some level of error is inevitable. The margin of error, a measure used to build confidence intervals, serves as a rough indicator of the precision of an estimate. For example, pollsters often say that a certain percentage of the population, plus or minus the margin of error (expressed in percentage points), is likely to vote for a certain candidate, 19 times out of 20. To calculate the margin of error, which in this example corresponds to a 95% confidence interval, the pollster would use the equivalent of plus or minus two standard errors of the estimate (see Standard error).
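The pollster's calculation can be sketched as below, using the rule of thumb of two standard errors given above. The function names and figures are hypothetical.

```python
def margin_of_error(standard_error, multiplier=2.0):
    """Margin of error as a multiple of the standard error (2 for roughly 95%)."""
    return multiplier * standard_error

def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return the lower and upper bounds of the confidence interval."""
    margin = margin_of_error(standard_error, multiplier)
    return estimate - margin, estimate + margin

# An estimate of 40% with a standard error of 1.5 percentage points.
print(confidence_interval(40.0, 1.5))
```

Here the result would be reported as 40%, plus or minus 3 percentage points, 19 times out of 20.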
A set of research methods and techniques applied to a particular field of study. At Statistics Canada, methodology refers to survey methodology.
People of mixed North American Indian and European ancestry who identify themselves as Métis people, as distinct from North American Indian people, Inuit or non-Aboriginal people. The Métis have a unique culture that draws on their diverse ancestral origins, such as Scottish, French, Ojibway and Cree.
Files of records pertaining to individual responding units.
N - O
A non-Status Indian is a person who identifies as First Nation or North American Indian but is not registered under the Indian Act.
North American Indian
A term that describes all the Aboriginal people in Canada who are not Inuit or Métis. North American Indian peoples are one of three groups of people recognized as Aboriginal in the Constitution Act, 1982. This also refers to First Nations people including Status and non-Status Indians.
Data collected for a given variable about a particular responding unit. Examples include the specific values for a responding unit on characteristics such as age, gender or marital status – the observations might be '77', 'woman' and 'widowed'.
Out of scope
A sampled unit that does not meet all criteria for being surveyed. For the ACS, in the provinces, a person could be out of scope by, for example, being 6 or more years of age or by being non-Aboriginal. In the territories, a person being 6 or more years of age would be out of scope.
The complete group of units to which survey results are to apply. These units may be persons, households, businesses, institutions, etc. The term "Target Population" is often used to refer to all potentially surveyed units, as defined in a clear, precise way by the survey study. This is the population for which information is wanted.
A post-censal survey is one where surveyed units are selected based upon their responses to the Census of Population. These surveys are generally conducted shortly after the Census data have been processed.
A proportion refers to how many responses fall into a given response category in relation to the total responses. It is calculated by dividing the frequency of the response category by the total number of responses to the question.
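The division described above can be applied to every response category at once. The function name and the responses in this sketch are made up.

```python
from collections import Counter

def proportions(responses):
    """Return the proportion of all responses that fall in each category."""
    counts = Counter(responses)
    total = len(responses)
    return {category: count / total for category, count in counts.items()}

print(proportions(["yes", "yes", "no", "yes"]))
```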
PUMF – Public use microdata file
Public use microdata files give users access to records for individual responding units so that they can conduct their own research or analysis. Each file is a non-identifiable dataset containing characteristics pertaining to the units of the survey (e.g., individuals, households or businesses). All such datasets have been authorized for release to the public by the Statistics Canada Microdata Release Committee. The dataset contains no confidential information in that individual identifiers have been removed and any data combination or geography which could potentially reveal the identity of a responding unit has been modified.
Q - R
A record is the data for an individual responding unit in a file containing data for all of a survey's responding units.
A Status or Registered Indian is a person who is registered under the Indian Act. The act sets out the requirements for determining who is a Status Indian.
A statistical method which tries to predict the value of a characteristic by studying its relationship with one or more other characteristics. This relationship is expressed through the means of a regression equation.
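The simplest regression equation is a straight line, y = intercept + slope × x, fitted by ordinary least squares. The sketch below shows that one case only, with invented data and a hypothetical function name; regression in general allows several explanatory characteristics and other functional forms.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = intercept + slope * 5  # predicted value of y when x = 5
print(intercept, slope, predicted)
```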
Research Data Centres (RDCs)
The Research Data Centre program provides researchers with access, in a secure Statistics Canada-governed setting, to microdata from population and household surveys. The RDC program is part of an initiative by Statistics Canada, the Social Sciences and Humanities Research Council (SSHRC) and university consortia to help strengthen Canada's social research capacity and to support the policy research community. The program is also supported by the Canada Foundation for Innovation (CFI) and the Canadian Institutes of Health Research (CIHR).
The respondent is the person providing the information for the surveyed unit, which could be a person, household, business or institution. In the case of ACS, the respondent is the parent or guardian of the selected child.
The responding unit refers to the surveyed unit for which a response is obtained. In the case of ACS, it would be the child for whom a response is obtained from the parent or guardian. This term is defined to distinguish it from the term "respondent" which in the case of ACS refers to the parent or guardian providing the information for the child.
The proportion of a sample for which a response to a questionnaire is obtained, usually expressed as a percentage. Non-response covers those who refused to participate as well as persons whom the survey was unable to reach.
Rural areas include all territory lying outside urban areas. An urban area has a minimum population concentration of 1,000 persons and a population density of at least 400 persons per square kilometre, based on the current census population count. Taken together, urban and rural areas cover all of Canada. Rural population includes all population living in the rural fringes of census metropolitan areas (CMAs) and census agglomerations (CAs), as well as population living in rural areas outside CMAs and CAs.
A set of specifications that describe the sampling elements of a survey in detail. These elements include population, frame, survey units, sample size, sample selection and estimation method.
The process of selecting some part of a population to observe so as to estimate something of interest about the whole population. Examples of different sampling methods include simple random sampling, stratified random sampling, cluster sampling and multi-stage sampling.
Sampling rate / Sampling fraction
The size of a sample divided by the total population being estimated.
Sampling or sampled unit
The unit selected by the sample design and from which measurements are taken for a survey. Examples include persons, households, families or businesses. For ACS, the sampling unit is the child.
Standard deviation measures the spread or dispersion of a data set around the mean. It is the most widely used measure of spread. Mathematically, the standard deviation is the square root of variance.
In a sample survey, results from the sample are used to estimate what the findings would be if the whole population were to be measured. The results of the weighted sample differ somewhat from the results that would have been obtained from the total population; this difference is known as sampling error. The actual sampling error is of course unknown, but it is possible to calculate an "average" value, known as the "standard error".
An Act regarding statistics of Canada. Includes the definition of Statistics Canada's mandate: There shall continue to be a statistics bureau under the Minister, to be known as Statistics Canada, the duties of which are:
- to collect, compile, analyze, abstract and publish statistical information relating to the commercial, industrial, financial, social, economic and general activities and condition of the people;
- to collaborate with departments of government in the collection, compilation and publication of statistical information, including statistics derived from the activities of those departments;
- to take the census of population of Canada and the census of agriculture of Canada as provided in this Act;
- to promote the avoidance of duplication in the information collected by departments of government; and
- generally, to promote and develop integrated social and economic statistics pertaining to the whole of Canada and to each of the provinces thereof and to coordinate plans for the integration of those statistics.
See Registered Indian.
Stratified sampling, stratification
A sampling procedure in which the population is divided into homogeneous subgroups or strata and the selection of samples is done independently in each stratum.
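The procedure can be sketched as grouping units into strata and then drawing a simple random sample independently within each one. The function name, the strata and the sample sizes below are hypothetical, and simple random selection within strata is an assumption of this sketch.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, sample_sizes, seed=0):
    """Select units independently within each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for stratum, members in strata.items():
        # Simple random sample without replacement inside this stratum.
        sample.extend(rng.sample(members, sample_sizes[stratum]))
    return sample

# Hypothetical frame of 20 urban and 10 rural units.
frame = [("urban", i) for i in range(20)] + [("rural", i) for i in range(10)]
sample = stratified_sample(frame, lambda unit: unit[0], {"urban": 4, "rural": 2})
print(sample)
```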
The process by which particular data are prevented from being released based on criteria designed to protect confidentiality. 'Cell' suppression refers to procedures used to protect sensitive tabular data from disclosure – a cell being an individual entry in a table.
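A very simple form of cell suppression replaces any table cell whose count falls below a minimum with a marker. In the sketch below, the threshold of 5, the marker 'x' and the function name are all assumptions chosen for illustration; they are not rules taken from this glossary. The sketch also does not hide additional cells that could be used to recover a suppressed value from row or column totals.

```python
def suppress_small_cells(table, minimum_count=5, marker="x"):
    """Replace counts below `minimum_count` with a suppression marker."""
    return {
        cell: (count if count >= minimum_count else marker)
        for cell, count in table.items()
    }

counts_by_region = {"Region A": 12, "Region B": 3, "Region C": 5}
print(suppress_small_cells(counts_by_region))
```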
The selected unit from which measurements are taken for a sample survey or a Census. Examples include persons, households, families or businesses. For ACS, the surveyed unit (which is also the sampled unit since ACS is a sample survey) is the child.
T - U - V
A Status or Registered Indian who belongs to a First Nation that signed a treaty with the Crown.
Same as surveyed unit.
An urban area has a minimum population concentration of 1,000 persons and a population density of at least 400 persons per square kilometre, based on the current census population count. All territory outside urban areas is classified as rural. Taken together, urban and rural areas cover all of Canada. The urban population includes all population living in the urban cores, secondary urban cores and urban fringes of census metropolitan areas (CMAs) and census agglomerations (CAs), as well as the population living in urban areas outside CMAs and CAs.
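The two criteria in this definition can be written as a simple test. The sketch assumes that the population and land area of a candidate area are already known; the function name and figures are hypothetical.

```python
def classify_settlement(population, area_km2):
    """Classify an area as urban or rural using the two criteria above."""
    density = population / area_km2  # persons per square kilometre
    if population >= 1_000 and density >= 400:
        return "urban"
    return "rural"

print(classify_settlement(5_000, 10))  # density of 500 persons per km2
print(classify_settlement(2_000, 10))  # density of 200 persons per km2
```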
These guides accompany Statistics Canada survey datasets, such as analytical files and Public Use Microdata Files (PUMF), providing the detailed technical information required to use the data appropriately. The guide typically contains important information to know prior to data analysis: weighting variables to use, procedures related to the estimate of variance, and precautions to take in the dissemination of the data.
A characteristic that may assume more than one set of values to which a numerical measure can be assigned (e.g., income, age and weight).
A measure of spread for a given characteristic or variable in a dataset. It indicates how much variability exists for that characteristic. Technically, it is calculated as the average squared deviation from the mean of each observation in the data set for a particular variable.
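The calculation can be sketched directly from this definition. Following the wording above, the sum of squared deviations is divided by the number of observations; note that when estimating from a sample, the divisor is often the number of observations minus one instead. The function names and values are made up.

```python
import math

def variance(values):
    """Average squared deviation of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum((value - mean) ** 2 for value in values) / len(values)

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

observations = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(observations), standard_deviation(observations))
```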
W - X - Y - Z
A weight is the average number of units in the population that a unit in the survey represents. Examples of a unit include a person or a household. Weights are applied to responding units in a sample database in order to ensure that, when making inferences from the survey data to population parameters, estimates of characteristics for the total population are obtained.
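To illustrate how weights turn sample records into population estimates, the sketch below sums the weights of the responding units that have a given characteristic. The field names, weights and records are invented, and the function names are hypothetical.

```python
def weighted_total(records, has_characteristic):
    """Estimate the number of population units with the characteristic."""
    return sum(r["weight"] for r in records if has_characteristic(r))

def weighted_proportion(records, has_characteristic):
    """Estimate the share of the population with the characteristic."""
    total_weight = sum(r["weight"] for r in records)
    return weighted_total(records, has_characteristic) / total_weight

# Three responding units representing 50, 30 and 20 children respectively.
respondents = [
    {"weight": 50, "in_daycare": True},
    {"weight": 30, "in_daycare": False},
    {"weight": 20, "in_daycare": True},
]
print(weighted_total(respondents, lambda r: r["in_daycare"]))
print(weighted_proportion(respondents, lambda r: r["in_daycare"]))
```

Although two of the three sampled children are in daycare, the weighted estimate is 70 of the 100 children represented, which shows why unweighted sample counts should not be read as population figures.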