Canadian Community Health Survey (CCHS)

Household weights documentation

June 2011

  1. Introduction
  2. Weighting overview
  3. Sample weighting

1. Introduction

This document describes the weighting process used in the creation of the household weight for the Canadian Community Health Survey (CCHS). In using this weight, users should note that the survey is designed to represent individuals and not designed to represent households. Certain steps are taken in the design to ensure that the sample is representative of different demographic groups and this may affect how well the sample represents households of different compositions. Also, since the calibration for the household weight is done at the provincial level, it is possible to yield reliable estimates at the national and provincial level only. It is felt that individual responses to certain questions can be used to represent the household. This weight should only be used for variables where it can be assumed that the responses from the individual clearly represent the household and that the response would not be affected by who responded within the household. This is highlighted by the fact that throughout the document, we refer to responding persons. Users must remember that the responses from this selected individual are assumed to represent the household when using the household weight.

Those familiar with the CCHS will notice that these weight adjustments are very similar to those used for other weights that have been produced in the past. A CCHS interview can be seen as a two–part process. First, the interviewer gets the complete roster of the people living within the household (Household Response). Second, (s)he interviews the selected person within the household (Person Response). In the calculation of the household weight, the individual responses are used to represent the household. Nonresponse adjustments for both stages are still included since nonresponse can occur at either part of the interview process. Note that nonresponse adjustments for the household can be based on characteristics of the individual respondent. Since the survey is designed to collect information from the individual, the characteristics of the individual can have an effect on nonresponse of the household.

2. Weighting overview

In order for estimates produced from survey data to be representative of the covered population and not just the sample itself, users must incorporate the survey weights in their calculations. A survey weight is given to each person included in the final sample, that is, the sample of persons having answered the survey. This weight corresponds to the number of households in the entire population that are represented by the respondent.

The CCHS has recourse to three sampling frames for its sample selection: an area frame acting as the primary frame and two frames formed of telephone numbers complementing the area frame. Since only minor differences differentiate the two frames formed of telephone numbers in terms of weighting, they are treated together. They are referred to as being part of the telephone frame.

The weighting strategy treats both the area and telephone frames independently. Weights resulting from these two frames are afterwards combined into a single set of weights through a step called "integration". After some adjustments, this integrated weight becomes the final weight. Note that depending on the need, one or two frames were used for the selection of the sample within a given health region (HR). The weighting strategy deals with this aspect at the integration step.

3. Sample weighting

As mentioned previously, units from both the area and telephone frames are treated separately up to the integration step. These weighting steps for the household weight, up to and including the integration of the frames, are the same as the steps from the main weight. Please refer to the CCHS User Guide for more information about these steps. The final three weighting steps, person nonresponse, winsorization and calibration, are explained in sub–sections 3.1–3.3.

Although these two frames were used to cover the three territories, some modifications had to be made to the way they were used. These modifications substantially affected the weighting of these three regions and are described in sub–section 3.4.

Diagram A presents an overview of the different adjustments that are part of the weighting strategy, in the order in which they are applied. A numbering system is used to identify each adjustment applied to the weight and will be used throughout the section. Letters A and T are used as prefixes to refer to adjustments applied to the units on the Area and Telephone frames respectively, while prefix I identifies adjustments applied from the Integration step onwards.

Diagram A: Weighting strategy overview (Household weight)

3.1 Person–level nonresponse (I2)

A CCHS interview can be seen as a two–part process. First, the interviewer gets the complete roster of the people within the household. Second, (s)he interviews the selected person. In some cases, interviewers can only get through the first part, either because they cannot get in touch with the selected person or because that selected person refuses to be interviewed. Such individuals are defined as person nonrespondents and an adjustment factor must be applied to the weights of the respondents to account for this nonresponse. Using the same methodology that was used in the treatment of household nonresponse (see User Guide, Section 8.2 – A4), the adjustment is applied within response homogeneity groups based on characteristics available for both respondents and non–respondents. All characteristics collected when creating the roster of household members are available for the creation of the groups as well as geographic information and some paradata. The scoring method is used to define the classes. In the end, the following adjustment factor is calculated within each group:

Formula 1

Weight I1 is multiplied by the above adjustment factor to produce weight I2. Nonresponding persons are dropped from the weighting process from this point onward.
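
Formula 1 is not reproduced in this text version. As an illustration only, the sketch below assumes the usual weighting-class form of such a factor: within each response homogeneity group, the sum of I1 weights over all selected persons divided by the sum of I1 weights over responding persons. The function name and the dictionary keys ('group', 'w_i1', 'responded') are hypothetical and do not come from the CCHS files.

```python
from collections import defaultdict


def adjust_for_person_nonresponse(units):
    """Return the responding units with an added 'w_i2' weight.

    Each unit is a dict with hypothetical keys:
      'group'     - response homogeneity group label
      'w_i1'      - weight after integration (I1)
      'responded' - True if the selected person was interviewed
    """
    selected_total = defaultdict(float)    # sum of I1 over all selected persons
    respondent_total = defaultdict(float)  # sum of I1 over responding persons
    for u in units:
        selected_total[u["group"]] += u["w_i1"]
        if u["responded"]:
            respondent_total[u["group"]] += u["w_i1"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue  # nonresponding persons are dropped from this point onward
        # Assumed form of the factor; the source formula is not shown here.
        factor = selected_total[u["group"]] / respondent_total[u["group"]]
        adjusted.append({**u, "w_i2": u["w_i1"] * factor})
    return adjusted


sample = [
    {"group": "g1", "w_i1": 100.0, "responded": True},
    {"group": "g1", "w_i1": 50.0, "responded": False},
    {"group": "g1", "w_i1": 50.0, "responded": True},
]
result = adjust_for_person_nonresponse(sample)
```

Under this assumed form, the respondents in a group carry the total I1 weight of everyone selected in that group, so the group total is preserved after nonrespondents are dropped.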

3.2 Winsorization (I3)

Following the series of adjustments applied to the respondents, some units may come out with extreme weights compared with other units in the same domain of interest. For the household weights, the domains include province by household size, where household size is defined by: 1–person household, 2–person household and at least 3–person household. Some responding households could represent a large proportion of their province by household size domain or have a large impact on the variance. In order to prevent this, the weights of the outlier units that represent a large proportion of their domain are adjusted downward using a “Winsorization” trimming approach.

3.3 Calibration (I4)

The last step in obtaining the final CCHS household weight is calibration. Calibration is done using the program CALMAR to ensure that the sum of the final weights corresponds to the household estimates defined at the province by household size level. These groups of interest are defined by the sizes: 1–person household, 2–person household and at least 3–person household. At the same time, the weights are seasonally adjusted to ensure that each two–month collection period is equally represented within the sample. In terms of geography, all calibration is at the provincial level only.

The household count estimates are based on the most recent census. The average of these monthly estimates is used to calibrate for each of the province by household size post–strata within a collection period. The weight I3 is therefore adjusted to obtain the final weight I4 with the help of the adjustment factor I4 defined as follows:

Formula 2

Consequently, the weight I4 corresponds to the final CCHS household weight that can be found on the household weight file with the variable name WTS_MHH for the master weight and WTS_SHH for the share weight.
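
Formula 2 is not reproduced in this text version. The sketch below is a simplified illustration that assumes a plain post-stratification ratio: the household count estimate for a province by household size post-stratum divided by the sum of I3 weights in that post-stratum. It leaves out the collection-period dimension and is not a substitute for CALMAR. The function names and dictionary keys are hypothetical.

```python
from collections import defaultdict


def household_size_class(n_persons):
    """Map household size to the three classes named in the text: 1, 2, 3+."""
    return min(n_persons, 3)


def calibrate(units, control_totals):
    """Scale 'w_i3' so weights in each post-stratum sum to its control total.

    units: dicts with hypothetical keys 'province', 'hh_size', 'w_i3'.
    control_totals: {(province, size_class): household count estimate}.
    """
    weight_sum = defaultdict(float)
    for u in units:
        key = (u["province"], household_size_class(u["hh_size"]))
        weight_sum[key] += u["w_i3"]

    calibrated = []
    for u in units:
        key = (u["province"], household_size_class(u["hh_size"]))
        # Assumed ratio form of the adjustment factor I4.
        factor = control_totals[key] / weight_sum[key]
        calibrated.append({**u, "w_i4": u["w_i3"] * factor})
    return calibrated


units = [
    {"province": "ON", "hh_size": 1, "w_i3": 10.0},
    {"province": "ON", "hh_size": 1, "w_i3": 30.0},
    {"province": "ON", "hh_size": 5, "w_i3": 20.0},
]
controls = {("ON", 1): 80.0, ("ON", 3): 50.0}  # made-up totals for illustration
final = calibrate(units, controls)
```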

3.4 Particular aspects of the weighting in the three territories

The sampling frame used in the three territories is somewhat different from the one used in the provinces. Therefore the weighting strategy is adapted to comply with these differences. This section summarises the changes applied to the weighting steps in the territories.

For the area frame, an additional stage of selection is added in the territories where each territory is initially stratified into groups of communities and one community is selected within each group. Note that the capital of each territory forms a stratum on its own and is consequently selected automatically at this first stage. This has an effect in the computation of the probability of selection and therefore in the value of the initial weight (A0). Once the initial weight is calculated, the same series of adjustments (A1 to A4) is applied to the area frame units. The household–level adjustment classes are built in the same way as for the provinces, using the same set of variables available.

For the weighting of the telephone frame units, it should first be noted that only the RDD frame is used for the territories, and exclusively in the Yukon and Northwest Territories capitals. All of the standard telephone adjustments are applied.

The two sets of weights (area and telephone) are subsequently integrated, adjusted for person–level nonresponse, winsorized and finally calibrated in a similar way to what is done for the provinces, with the exception of four details. First, the integration is applied only to units located in the Yukon and Northwest Territories capitals since the other communities are covered only by the area frame. Second, for Nunavut, the household counts used for calibration only represent the 10 largest communities (73% of the households) because of the under–coverage of the area frame (for more details, see User Guide, section 5.4.1). Third, in the Yukon and Northwest Territories, starting with the 2008 and 2007/2008 reference periods, calibration is used to control for the proportion of households located inside the capital cities versus the proportion of households located outside of the capital cities. The same approach has been adapted for Nunavut starting in 2009. Finally, due to the differences in collection strategies, the number of collection periods used in calibration for the seasonal effect in the territories is different from the provinces. In 2009, two 6–month periods are used in the three territories.

3.5 Creation of the Share weight (WTS_SHH)

Along with the master file and PUMF, which contain all CCHS responding persons, a share file is created which contains only a portion (>90%) of the original CCHS responding persons. The individuals on this share file have agreed to share their data with certain partners. To compensate for the loss of some respondents from the file, the weights of these "sharers" must be adjusted by the factor:

Formula 3

Similar to the nonresponse adjustments, this factor is calculated within homogeneity groups, where in this case, individuals with similar estimated propensity to share will be grouped together. The final weight after this adjustment is called WTS_SHH.

3.6 The Food Security Module in 2010

The Food Security Module (FSC) is one of the few modules that are part of the CCHS where it can be appropriate to use the household weights. For 2007–2008, the FSC module was part of the common content, and was therefore asked of respondents in all provinces and territories. For the 2009–2010 collection period, the FSC module is part of optional content and was not selected for some of the provinces. Given this change, it is no longer appropriate to calculate national estimates with variables in this module since the results would only represent those provinces and territories where the module was asked. For more information on optional content selection, please refer to the appendices of the User Guide.

Canadian Community Health Survey (CCHS) – Errata

Date: October 2011

To: CCHS master and share microdata files

Subject: Error corrected in the Smoking Module – Modified version derived variable, SMKDSTY

Product(s) affected: Share and Master microdata files

Year(s) affected: 2001, 2003, 2005, 2007, 2007–2008, 2009

Description of the problem(s):
In 2010, the programming of the response categories for this derived variable was changed. Respondents who answered SMK_202=3, SMK_05D=9, SMK_01A in (7,8), and SMK_01B=1 were being classified as SMKDSTY=5 and should have been classified as SMKDSTY=99. A new condition and brackets were added to ensure that the category was being assigned correctly to all cases.

Suggested correction(s): To correctly process this derived variable for 2001, 2003, 2005, 2007, 2007–2008 and 2009 please use the specifications below.

Correction steps: The correction is highlighted and in bold font.
SMKDSTY = 5 (SMK_202 = 3 AND ((SMK_05D = 2 OR SMK_05D = 6) AND (SMK_01A = 1 OR SMK_01B = 1)))
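
As a minimal sketch, the condition above can be expressed as a boolean test. The function name is hypothetical, and the sketch covers only category 5; the other SMKDSTY categories are not specified in this notice.

```python
def is_smkdsty_5(smk_202, smk_05d, smk_01a, smk_01b):
    """Return True when the printed condition for SMKDSTY = 5 holds."""
    return (
        smk_202 == 3
        and (smk_05d == 2 or smk_05d == 6)
        and (smk_01a == 1 or smk_01b == 1)
    )


# The case described in the notice (SMK_05D = 9, SMK_01A = 7, SMK_01B = 1)
# no longer falls into category 5 under this condition.
case_from_notice = is_smkdsty_5(3, 9, 7, 1)
```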

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613-951-1746
Electronic mail: CCHS-ESCC@statcan.gc.ca

Date: October 2011

To: Users of the 2009 and 2009–2010 Master and Share files

Subject: Reversed variable labels

Product(s) affected: 2009 and 2009–2010 Master and share files

Year(s) affected: 2009 and 2009–2010

Description of the problem(s):
Two “Sources of Income” response categories, “Child Tax Benefit” and “Social Assistance or Welfare”, in questions INC_6J and INC_6K were reversed in the 2009 and 2009–2010 master and share files. The 2010 files are not affected.

Suggested correction(s): Users should modify the format programs in order to switch the two answer categories, i.e. INC_6J should refer to “Child Tax Benefit” and INC_6K should refer to “Social Assistance or Welfare”.

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: CCHS-ESCC@statcan.gc.ca

Date: October 2011

To: SAS and SPSS users of 2005–2009 files

Subject: Incorrect variable labels

Product(s) affected: Cycle 3.1 (2005), Cycle 4.1 (2007, 2008 and 2007–2008 files) and 2009 master, share, rapid response and BC buy–in share files

Year(s) affected: 2005, 2007, 2008, 2007–2008, 2009

Description of the problem(s):
The labels attached to the EDUDH04 and EDUDR04 variables are incorrect in the layout file provided for SAS and SPSS users. The labels for EDUDH04 and EDUDR04 should be ‘Highest level/edu. – respond (not household) 4 levels – (D)’

Suggested correction(s): Modify the format *._lbe program with the correct label

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: CCHS-ESCC@statcan.gc.ca

Date: June 2011

To: Users of the CCHS 2005, 2007, 2007–2008 Master and Share microdata files, and 2005, 2007–2008 Public Use Microdata Files (PUMF)

Subject: Flow error during collection related to MAM_038 question (on hysterectomy)

Product(s) affected: 2005, 2007, 2007–2008 Master and Share microdata files and 2005, 2007–2008 PUMF.

Year(s) affected: 2005, 2007, and 2007–2008

Description of the problem(s):
The high number of "Not Stated" responses for 2005 and 2007 resulted from an error in the application flow. Women aged 50 and over should have skipped only question MAM_Q037 (in 2005) / HWT_Q1 (in 2007), but instead they also skipped MAM_Q038.

Suggested correction(s): The error was corrected starting with the CCHS 2008 data.

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: June 2011

To: Users of the 2009 CCHS master and share microdata files

Subject: Error corrected in the Smoking Module – Modified version derived variable, SMKDSTY

Product(s) affected: 2009 Share and Master microdata files.

Year(s) affected: 2001, 2003, 2005, 2007, 2007–2008, 2009

Description of the problem(s): In 2010, the programming of the response categories for this derived variable was changed. Respondents who answered SMK_202=3, SMK_05D=5, SMK_01A=2, and SMK_01B=1 were being classified as SMKDSTY=99 and should have been classified as SMKDSTY=5. A new condition and brackets were added to ensure that the category was being assigned correctly to all cases.

Suggested correction(s): To correctly process this derived variable for 2001, 2003, 2005, 2007, 2007–2008 and 2009 please use the specifications below.

Correction process: The correction is highlighted and in bold font.
SMKDSTY = 5 (SMK_202 = 3 AND ((SMK_05D = 2 OR SMK_05D = 6) AND (SMK_01A = 1 OR SMK_01B = 1)))

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: June 2011

To: Users of the 2009 CCHS master and share microdata files

Subject: Error corrected in the Physical Activities Module – Modified version derived variable, PACFLTI

Product(s) affected: 2009 Share and Master microdata files.

Year(s) affected: 2007, 2009

Description of the problem(s):
In 2010, the programming of the response categories for this derived variable was changed. Respondents who provided a mix of valid answers and nonresponse to PAC_1V, PAC_7 or PAC_8 are now coded to category 1 or 2 in PACFLTI. Previously, a nonresponse to any of PAC_1V, PAC_7 or PAC_8 resulted in a nonresponse code in PACFLTI.

Suggested correction(s): To correctly process this derived variable for 2007, 2008, 2007–2008 and 2009, please use the specifications below.

Correction steps: The order of conditions was changed. The correction is highlighted and in bold font.
PACFLTI  Condition
9        ADM_PRX = 1
1        PAC_1V = 2 or PAC_7 = 1 or PAC_8 = 1
2        (PAC_1V = 1) and (PAC_7 = 2, 3) and (PAC_8 = 2, 3)
9        (PAC_1V = DK, R, NS) or (PAC_7 = DK, R, NS) or (PAC_8 = DK, R, NS)
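
Because the correction concerns the order of the conditions, a short sketch may help. It assumes the conditions are evaluated top to bottom and the first match wins. The function name is hypothetical, and the numeric codes 7, 8 and 9 used for DK, R and NS are an assumption for illustration; check the data dictionary for the actual codes.

```python
def derive_pacflti(adm_prx, pac_1v, pac_7, pac_8, nonresponse_codes=(7, 8, 9)):
    """Apply the corrected PACFLTI conditions in the order listed."""
    if adm_prx == 1:
        return 9
    # Checked before nonresponse, so one qualifying answer is enough
    # even when the other answers are DK, R or NS.
    if pac_1v == 2 or pac_7 == 1 or pac_8 == 1:
        return 1
    if pac_1v == 1 and pac_7 in (2, 3) and pac_8 in (2, 3):
        return 2
    if (pac_1v in nonresponse_codes
            or pac_7 in nonresponse_codes
            or pac_8 in nonresponse_codes):
        return 9
    return None  # combination not covered by the printed conditions


mixed_answer = derive_pacflti(adm_prx=2, pac_1v=2, pac_7=7, pac_8=3)
```

With the earlier ordering, the nonresponse in PAC_7 would have taken precedence for the mixed case shown above.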

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: June 2010

To: Users of the 2007, 2008 and 2007–2008 CCHS master and share microdata files

Subject: Error corrected in the Household Food Security Status – Modified version derived variable, FSCDHFS2

Product(s) affected: 2007, 2008, 2007–2008 Share and Master microdata files.

Year(s) affected: 2007, 2008 and 2007–2008

Description of the problem(s): Some households with children were improperly classified as moderately food insecure but should have been classified as severely food insecure as a result of a specification error. The error was corrected starting with the CCHS 2009 data.

Suggested correction(s): To recalculate this derived variable for 2007, 2008 and 2007–2008, please use the specifications below.

Correction steps: The correction is highlighted and in bold font.

[DHHTDKS = 1 and
(2 <= FSCASUM <= 5) and
(2 <= FSCCSUM <= 4)] or

[DHHTDKS = 1 and
(((2 <= FSCASUM <= 5) and (FSCCSUM <= 4)) or
((FSCASUM <= 5) and (2 <= FSCCSUM <= 4)))]

or [DHHTDKS = 0 and
(2 <= FSCASUM <= 5)]

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 2009

To: Users of the 2008 CCHS master and share microdata files

Subject: Error with the flow of some answers in question CCC_Q073

Product(s) affected: 2008 Share and Master microdata files.

Year(s) affected: 2008

Description of the problem(s): Respondents who answered question CCC_Q073 as “2 – No”, “Refusal” or “Don’t Know” skipped to question CCC_Q081, while they should have flowed to condition CCC_C073A.

Therefore, respondents who did not take medication for hypertension are automatically excluded from the universe of questions CCC_Q073A and CCC_Q073B.

Suggested correction(s): The error was corrected starting with the CCHS 2009 data.

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 2009

To: Users of the 2007 and 2005 CCHS master and share files

Subject: Certain values were assigned to the wrong variables.

Product(s) affected:
2007: Master file and share files for all provinces and territories.
2005: Master file and its subsamples, share file and its subsamples and public use microdata file and its subsamples.

Year(s) affected: 2007 and 2005

Description of the problem(s): When 2007 and 2005 data was processed, the values of certain variables were assigned to other variables. See the table below for the modules and variables affected and the provinces affected in the case of optional content.

2007

Module: Breast examination (BRX)
Content type: Optional
Provinces affected: New Brunswick, Saskatchewan and the Northwest Territories
Variables containing incorrect values: BRX_16N, BRX_16O, BRX_16M

Module: Home health care services (HMC)
Content type: Optional
Provinces affected: Ontario
Variables containing incorrect values: HMC_10I, HMC_10C, HMC_10D, HMC_10E, HMC_10F, HMC_10G, HMC_10H, HMC_15N, HMC_15O, HMC_15M, HMC_16I, HMC_16C, HMC_16D, HMC_16E, HMC_16F, HMC_16G, HMC_16H

Module: Mammography (MAM)
Content type: Optional
Provinces affected: Newfoundland and Labrador, Nova Scotia, New Brunswick, Ontario, Saskatchewan and the Northwest Territories
Variables containing incorrect values: MAM_36N, MAM_36O, MAM_36M

2005

Module: Breast examination (BRXE)
Content type: Optional
Provinces/territories affected: Ontario, Yukon
Variables: BRXE_16N, BRXE_16O, BRXE_16M

Module: Home health care services (HMCE)
Content type: Common
Provinces/territories affected: All
Variables: HMCE_10C, HMCE_10D, HMCE_10E, HMCE_10F, HMCE_10G, HMCE_10H, HMCE_10I, HMCE_15M, HMCE_15N, HMCE_15O, HMCE_16C, HMCE_16D, HMCE_16E, HMCE_16F, HMCE_16G, HMCE_16H, HMCE_16I

Module: Mammography (MAME)
Content type: Common
Provinces/territories affected: All
Variables: MAME_36M, MAME_36N, MAME_36O

Module: Alcohol use (ALCE)
Content type: Common
Provinces/territories affected: All
Variables: ALCE_7M, ALCE_7N

Module: Sexual behaviour (SXBE)
Content type: Common
Provinces/territories affected: All
Variables: SXBE_13E, SXBE_13F, SXBE_13G

Suggested correction(s): Users must recover the correct values from the variables where they are found. The table below shows the correspondence between the variables containing incorrect values (column A) and the names of the variables to which they must be renamed (column B) to obtain the correct values. The table is shown by product year.

Suggested correction(s)

Column A: Variables with incorrect values
Column B: Name of the renamed variable

2007                      2005
Column A    Column B      Column A    Column B
BRX_16N     BRX_16O       BRXE_16N    BRXE_16O
BRX_16O     BRX_16M       BRXE_16O    BRXE_16M
BRX_16M     BRX_16N       BRXE_16M    BRXE_16N
HMC_10I     HMC_10H       HMCE_10I    HMCE_10H
HMC_10C     HMC_10I       HMCE_10C    HMCE_10I
HMC_10D     HMC_10C       HMCE_10D    HMCE_10C
HMC_10E     HMC_10D       HMCE_10E    HMCE_10D
HMC_10F     HMC_10E       HMCE_10F    HMCE_10E
HMC_10G     HMC_10F       HMCE_10G    HMCE_10F
HMC_10H     HMC_10G       HMCE_10H    HMCE_10G
HMC_15N     HMC_15O       HMCE_15N    HMCE_15O
HMC_15O     HMC_15M       HMCE_15O    HMCE_15M
HMC_15M     HMC_15N       HMCE_15M    HMCE_15N
HMC_16I     HMC_16H       HMCE_16I    HMCE_16H
HMC_16C     HMC_16I       HMCE_16C    HMCE_16I
HMC_16D     HMC_16C       HMCE_16D    HMCE_16C
HMC_16E     HMC_16D       HMCE_16E    HMCE_16D
HMC_16F     HMC_16E       HMCE_16F    HMCE_16E
HMC_16G     HMC_16F       HMCE_16G    HMCE_16F
HMC_16H     HMC_16G       HMCE_16H    HMCE_16G
MAM_36N     MAM_36O       MAME_36N    MAME_36O
MAM_36O     MAM_36M       MAME_36O    MAME_36M
MAM_36M     MAM_36N       MAME_36M    MAME_36N
                          ALCE_7N     ALCE_7M
                          ALCE_7M     ALCE_7N
                          SXBE_13F    SXBE_13E
                          SXBE_13G    SXBE_13F
                          SXBE_13E    SXBE_13G

Correction process:

  1. Create a temporary file including the variables in column A.
  2. Rename the variables in column A to temporary variables based on the corresponding variables in column B (e.g. BRX_16M to BRX_16N_00, and HMC_10D to HMC_10C_00, etc.).
  3. Rename the temporary variables (e.g. BRX_16N_00, HMC_10C_00, etc.) to the correct variables as indicated in column B (e.g. BRX_16N, HMC_10C, etc.).
  4. Combine the temporary file which now includes the variables with their correct values with the main data file.
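
The renames form cycles (for example BRX_16N to BRX_16O to BRX_16M and back to BRX_16N), which is why the steps above go through temporary names. The sketch below illustrates the idea on a single record held as a dictionary. The function name and the variable OTHER_VAR are hypothetical; the mapping shown is the 2007 BRX portion of the correspondence table.

```python
def rename_with_temporaries(record, column_a_to_b):
    """Return a copy of record with column A names replaced by column B names."""
    # Steps 1-2: copy affected values under temporary names (suffix _00) so
    # that no value is overwritten while the names are rotated.
    temporary = {
        f"{new_name}_00": record[old_name]
        for old_name, new_name in column_a_to_b.items()
        if old_name in record
    }
    # Unaffected variables are carried over unchanged.
    corrected = {k: v for k, v in record.items() if k not in column_a_to_b}
    # Steps 3-4: drop the temporary suffix and merge back into the record.
    for temp_name, value in temporary.items():
        corrected[temp_name[: -len("_00")]] = value
    return corrected


BRX_2007 = {"BRX_16N": "BRX_16O", "BRX_16O": "BRX_16M", "BRX_16M": "BRX_16N"}
row = {"BRX_16N": 1, "BRX_16O": 2, "BRX_16M": 3, "OTHER_VAR": 9}
fixed_row = rename_with_temporaries(row, BRX_2007)
```

Renaming the variables one at a time without temporaries would overwrite a value before it had been moved.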

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 2009

To: 2007 master or share file users

Subject: Incorrect variable labels

Product(s) affected: Master and share files

Year(s) affected: 2007

Description of the problem(s):
The labels attached to certain variables are incorrect in the master file and the share files and in their respective data dictionaries. The table below gives the variable names along with their old and new labels.

Variable   Question                          Labels in the files and codebooks   Must be replaced by:
SXB_13F    Contraceptive method last time    Other                               Contraceptive injections
SXB_13G    Contraceptive method last time    Contraceptive injections            None
SXB_13E    Contraceptive method last time    None                                Other

Suggested correction(s): Rename the variables with the correct names.

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 2009

To: Users of the 2005 master or share file

Subject: The household weight of the master and all share files is invalid (small error)

Product(s) affected: Household weight file hs_hhwt.txt

Year(s) affected: 2005

Description of the problem(s): The main household weight variable (WTSE_MHH on the master file and WTSE_SHH on the share file) in all HS_HHWT.txt files is invalid and differs from the corresponding FWGT weight in the B5_HH.txt bootstrap file.

Since Bootvar uses the variable FWGT to calculate the final estimates, any analysis of the 2005 household weights using Bootvar would be correct and would not need to be redone. However, any preliminary analysis based only on the variables WTSE_MHH or WTSE_SHH would be incorrect and would need to be revised.

Users should know that the errors were considered minimal.

Suggested correction(s): The master and share files have been redone with the corrected household weight and are available upon request.

Correction steps: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: December 21, 2007

To: Data users of cycle 2.1, sub–sample 3 – Master and share file

Subject: Error in the number of persons interviewed in the following document: “Guidelines for the use of sub–sample variables”

Cycle(s) affected: Cycle 2.1

Product(s) affected: Guidelines for the use of sub–sample variables – Master and share file

Description of the problem(s):

Page 7 of the document:

  • Number 18,981 replaces 18,091 in the following sentence:
    “A total of 18,981 respondents were interviewed for HSAS at the same time as their CCHS interview.”
  • Number 13,024 replaces 12,031 in the following sentence:
    “A total of 13,024 respondents were re–contacted after having been interviewed previously for CCHS.”

Suggested correction: N/A

Corrective Pseudo–code: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 12, 2007

To: Data Users and licensees of the Canadian Community Health Survey data, Cycles 2.1 and 3.1, Public Use Microdata File

Subject: Derived variable on work stress scale – Job strain (WSTCDJST and WSTEDJST)

Cycle(s) affected:

Cycle 2.1 (optional content selected by 12 health regions within Newfoundland and Labrador, Ontario and Saskatchewan)

Cycle 3.1 (optional content selected by all health regions in Quebec and Saskatchewan).

Product(s) affected:

Derived variable on work stress scale – Job strain (WSTCDJST and WSTEDJST)

Referred to hereafter as WSTnDJST where n = C or E.

Description of the problem(s):

The Job strain scale should reflect the ratio of the psychological demands and decision making leeway in accordance with the principle that strong demands combined with weak decision making autonomy generate more stress.

Certain variables from the denominator of the WSTnDJST derived variable are incorrectly specified in the Derived Variable (DV) Specifications document for cycles 2.1 and 3.1. The data are therefore erroneous. The ratio of strong demands to decision making leeway results in scores that are too high.

Suggested correction(s):

Public use microdata file (PUMF):
Cannot be corrected by users given the fact that only the derived variables exist on the file. User support via remote access is available upon request.

Share and Master Files:
A temporary reformatting step aiming to invert certain variables must be added to the specifications for the WSTnDJST variable. The step to invert the categories (to arrange them from 4 to 0 rather than from 0 to 4) must be applied to the following variables: WSTn_401, WSTn_402, WSTn_403, WSTn_405 and WSTn_409.

A “patch file” is available on request.

Corrective Pseudo–code:
The two following temporary reformatting steps are executed and then the WSTnDJST variable is created according to the DV specifications:

Step 1: Temporary Reformatting

Modify the scale of responses for the questions WSTn_401 to WSTn_406 and WSTn_409 from 1 to 5, to 0 to 4

If WSTn_401 <= 5 then WSTn_401 = (WSTn_401 - 1)
If WSTn_402 <= 5 then WSTn_402 = (WSTn_402 - 1)
If WSTn_403 <= 5 then WSTn_403 = (WSTn_403 - 1)
If WSTn_404 <= 5 then WSTn_404 = (WSTn_404 - 1)
If WSTn_405 <= 5 then WSTn_405 = (WSTn_405 - 1)
If WSTn_406 <= 5 then WSTn_406 = (WSTn_406 - 1)
If WSTn_409 <= 5 then WSTn_409 = (WSTn_409 - 1)

Step 2: Temporary Reformatting

Invert the scale of responses for the questions WSTn_401 to WSTn_403, WSTn_405 and WSTn_409, from 0 to 4, to 4 to 0

If WSTn_401 <= 4 then WSTn_401 = (4 - WSTn_401)
If WSTn_402 <= 4 then WSTn_402 = (4 - WSTn_402)
If WSTn_403 <= 4 then WSTn_403 = (4 - WSTn_403)
If WSTn_405 <= 4 then WSTn_405 = (4 - WSTn_405)
If WSTn_409 <= 4 then WSTn_409 = (4 - WSTn_409)

Step 3:

See WSTnDJST in the Derived Variable (DV) Specifications document.
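
Steps 1 and 2 can be sketched as follows for a single record held as a dictionary. This is an illustration of the pseudo-code above, not the patch file; the function and constant names are hypothetical. Values above the valid range (for example nonresponse codes) are left unchanged, as in the pseudo-code.

```python
RESCALED_ITEMS = ("401", "402", "403", "404", "405", "406", "409")
INVERTED_ITEMS = ("401", "402", "403", "405", "409")


def reformat_job_strain_items(record, cycle_letter):
    """Return a copy of record with the temporary reformatting applied.

    cycle_letter is 'C' or 'E', i.e. the n in WSTn_401 and so on.
    """
    out = dict(record)
    # Step 1: shift valid answers from the 1-to-5 scale to 0-to-4.
    for item in RESCALED_ITEMS:
        name = f"WST{cycle_letter}_{item}"
        if name in out and out[name] <= 5:
            out[name] = out[name] - 1
    # Step 2: invert selected items from 0-to-4 to 4-to-0.
    for item in INVERTED_ITEMS:
        name = f"WST{cycle_letter}_{item}"
        if name in out and out[name] <= 4:
            out[name] = 4 - out[name]
    return out


example = {"WSTC_401": 1, "WSTC_404": 5, "WSTC_409": 7}
reformatted = reformat_job_strain_items(example, "C")
```

In the example, WSTC_401 goes from 1 to 0 and is then inverted to 4, WSTC_404 goes from 5 to 4 and is not inverted, and WSTC_409 keeps its value of 7. WSTnDJST is then derived from the reformatted values according to the DV specifications.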

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 12, 2007

To: Data Users and licensees of the Canadian Community Health Survey data, Cycle 3.1, Public Use Microdata File (PUMF)

Subject: Question universes in 6 modules in the PUMF data dictionary for Cycle 3.1 are incorrect.

Cycle(s) affected: Cycle 3.1

Product(s) affected:

Data dictionary of the Cycle 3.1 PUMF in English [English Data Dictionary (Freqs).pdf] and in French [French Data Dictionary (Freqs).pdf].

Description of the problem(s):

The question universes shown in the data dictionary are incorrect for the following modules:

  • Smoking (SMK)
  • Colorectal Cancer Screening (CCS)
  • Exposure to second–hand smoke (ETS)
  • Prostate Cancer Screening (PSA)
  • Smoking – Physician counseling (SPC)
  • Youth Smoking (YSM)

Suggested correction(s):

  • Smoking (SMK):
    • SMKEDYCS: Respondents who answered SMKE_202 = (1, 7 or 8) or SMKE_01A = 8 and SMKE_01B = 8
  • Colorectal Cancer Screening (CCS):
    • CCSEFOPT: All respondents
    • CCSE_180: Respondents aged 35 and over with CCSEFOPT = 1
    • CCSE_182: Respondents who answered CCSE_180 = (1, 7 or 8)
    • CCSE_83A: Respondents who answered CCSE_180 = (1, 7 or 8)
    • CCSE_83B: Respondents who answered CCSE_180 = (1, 7 or 8)
  • Exposure to second–hand smoke (ETS):
    • ETSE_10: Respondents with DHHEDHSZ > 1 or who answered (SMKE_202 = (3, 7 or 8) or (SMKE_01A = 8 and SMKE_01B = 8))
    • ETSE_G11: Respondents who answered ETSE_10 = (1, 7 or 8)
    • ETSE_20: Respondents who answered SMKE_202 = (3, 7 or 8) or (SMKE_01A = 8 and SMKE_01B = 8)
    • ETSE_20B: Respondents who answered SMKE_202 = (3, 7 or 8) or (SMKE_01A = 8 and SMKE_01B = 8)
  • Prostate Cancer Screening (PSA):
    • PSAEFOPT: All respondents
    • PSAE_170: Males aged 35 and over with PSAEFOPT = 1
    • PSAE_172: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73A: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73B: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73C: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73G: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73D: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73E: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_73F: Respondents who answered PSAE_170 = (1, 7 or 8)
    • PSAE_174: Males aged 35 and over with PSAEFOPT = 1
    • PSAE_175: Respondents who answered PSAE_174 = (1, 7 or 8) or PSAE_170 = 8
  • Smoking – Physician counseling (SPC):
    • SPCEFOPT: All respondents
    • SPCE_10: Respondents with SPCEFOPT = 1 who answered (SMKE_202 = (1 or 2) or SMKE_06A = 1 or SMKE_09A = 1 and HCUE_1AA = (1, 7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 or SMKE_202 = (7 or 8) or SMKE_06A = (7 or 8) or SMKE_09A = (7 or 8))
    • SPCE_11: Respondents with SPCEFOPT = 1 who answered SPCE_10 = (1, 7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 or SMKE_202 = (7 or 8) or SMKE_06A = (7 or 8) or SMKE_09A = (7 or 8) or HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) or SMKE_06A = 1 or SMKE_09A = 1)
    • SPCE_12: Respondents with SPCEFOPT = 1 who answered SPCE_11 = (1, 7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 or SMKE_202 = (7 or 8) or SMKE_06A = (7 or 8) or SMKE_09A = (7 or 8) or HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) or SMKE_06A = 1 or SMKE_09A = 1)
    • SPCE_13: Respondents with SPCEFOPT = 1 who answered SPCE_11 = (1, 7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 or SMKE_202 = (7 or 8) or SMKE_06A = (7 or 8) or SMKE_09A = (7 or 8) or HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) or SMKE_06A = 1 or SMKE_09A = 1)
    • SPCE_14A: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14B: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14C: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14D: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14E: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14F: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_14G: Respondents with SPCEFOPT = 1 who answered SPCE_13 = (1, 7 or 8) or SPCE_12 = (7 or 8) or SPCE_11 = (7 or 8) or SPCE_10 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8 and SMKE_202 = (7 or 8) and SMKE_06A = (7 or 8) and SMKE_09A = (7 or 8) and HCUE_1AA = (7 or 8) and (SMKE_202 = (1 or 2) and SMKE_06A = 1 and SMKE_09A = 1)
    • SPCE_20: Respondents with SPCEFOPT = 1 and [DENEFOPT = 2 who answered (SMKE_202 = (1, 2, 7 or 8) or SMKE_06A = (1, 7 or 8) or SMKE_09A = (1, 7 or 8) or SMKE_01A = 8 or SMKE_01B = 8)] and (HCUE_02E > 0 and < 100 or HCUE_02E = (997 or 998) or HCUE_01 = 8)
    • SPCE_21: Respondents with SPCEFOPT = 1 who answered SPCE_20 = (1, 7 or 8) or (DENE_132 = (1, 97 or 98) or with DENEFOPT = 2 who answered HCUE_02E = (997 or 998) or HCUE_01 = 8) and (SMKE_202 = (1, 2, 7 or 8) or SMKE_06A = (1, 7 or 8) or SMKE_09A = (1, 7 or 8) or SMKE_01A = 8 and SMKE_01B = 8)
    • SPCE_22: Respondents with SPCEFOPT = 1 who answered SPCE_21 = (1, 7 or 8) or SPCE_20 = (7 or 8) or (DENE_132 = (97 or 98) or with DENEFOPT = 2 who answered HCUE_02E = (997 or 998) or HCUE_01 = 8) and (SMKE_202 = (1, 2, 7 or 8) or SMKE_06A = (1, 7 or 8) or SMKE_09A = (1, 7 or 8) or SMKE_01A = 8 and SMKE_01B = 8)
  • Youth Smoking (YSM):
    • YSMEG1: Respondents aged less than 20 who answered SMKE_202 = (1, 2, 7 or 8) or (SMKE_01A = 8 and SMKE_01B = 8)
    • YSME_2: Respondents aged less than 20 who answered YSME_1 = (8, 9, 10, 11, 12, 97 or 98) or SMKE_202 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8
    • YSME_3: Respondents aged less than 20 who answered YSME_1 = (1, 2, 3, 4, 5, 6, 7, 97 or 98) or YSME_2 = (1, 7 or 8) or SMKE_202 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8
    • YSME_4: Respondents aged less than 20 who answered YSME_1 = (1, 2, 3, 4, 5, 6, 7, 97 or 98) or YSME_2 = (1, 7 or 8) or SMKE_202 = (7 or 8) or SMKE_01A = 8 and SMKE_01B = 8
    • YSME_5: Respondents aged less than 20 who answered SMKE_202 = (1, 2, 7 or 8) or (SMKE_01A = 8 and SMKE_01B = 8)
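
A universe statement of this kind can be read as a boolean filter on the respondent record. The Python sketch below, offered only as an illustration, encodes the ETSE_10 and ETSE_G11 universes as they are parenthesized in the list above. The function names and the dictionary representation of a record are assumptions, and the code values are compared exactly as listed, without interpreting what each code means.

```python
# Illustrative only: universe statements written as boolean filters.
# Assumption: a respondent record is a dict of variable name -> integer code.


def in_etse_10_universe(r):
    """DHHEDHSZ > 1, or SMKE_202 in (3, 7, 8), or SMKE_01A and SMKE_01B both 8."""
    return (
        r["DHHEDHSZ"] > 1
        or r["SMKE_202"] in (3, 7, 8)
        or (r["SMKE_01A"] == 8 and r["SMKE_01B"] == 8)
    )


def in_etse_g11_universe(r):
    """Respondents who answered ETSE_10 = 1, 7 or 8."""
    return r["ETSE_10"] in (1, 7, 8)
```

Statements written without parentheses around a mix of "and" and "or" (for example, several of the SPC universes) would need their grouping confirmed before being coded this way.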

Corrective Pseudo–code: N/A

Contact us: We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Date: October 12, 2007

To: Data Users and licensees of the Canadian Community Health Survey data, Cycle 3.1, Public Use Microdata File (PUMF)

Subject: Derived variable documentation contains misleading information indicating that some variables are included in the PUMF when they are not.

Cycle(s) affected: Cycle 3.1

Product(s) affected:

Cycle 3.1 PUMF derived variable documentation in English (DERIVE_E.pdf) and in French (DERIVE_F.pdf).

Description of the problem(s):

The derived variable descriptions for the following modules:

  • Smoking – Cessation Aids (SCA)
  • Smoking – Nicotine dependence (NDE)
  • Smoking – Stages of change (SCH)

indicate that the derived variables are available in the PUMF, while they are not. Since these modules have only been selected by one or two territories, the information is sensitive to respondent identity disclosure and is therefore not included in the PUMF.

Suggested correction: N/A

Corrective Pseudo–code: N/A

Contact us:

We regret any inconvenience this may have caused you or your organization and thank you in advance for your understanding.

Should you have any questions, please do not hesitate to contact us at:

Data Access and Information Services

Health Statistics Division
613–951–1746
Electronic mail: HD-DS@statcan.gc.ca

Canadian Community Health Survey (CCHS)

Annual component – 2010
Common Content

Derived Variable (DV) Specifications

Table of Contents

ADL Activities of Daily Living (1 DV)
1 ) ADLF6R – Need for help with instrumental activities of daily living

ALC Alcohol use (1 DV)
1 ) ALCDTTM – Type of Drinker (12 Months)

ALD Alcohol use – Dependence (4 DVs)
1 ) ALDDSF – Alcohol Dependence Scale (Short Form Score) – 12–Month
2 ) ALDDPP – Probability of Caseness to Respondents (Alcohol Dependence) – 12–Month
3 ) ALDDINT – Alcohol Interference 12–Month – Mean
4 ) ALDFINT – Flag for Alcohol Interference 12–Month

ALW Alcohol use during the past week (2 DVs)
1 ) ALWDWKY – Weekly Consumption
2 ) ALWDDLY – Average Daily Alcohol Consumption

CCC Chronic conditions (1 DV)
1 ) CCCDDIA – Diabetes type

CHP Contacts with health professionals (2 DVs)
1 ) CHPDMDC – Number of Consultations with Medical Doctor/Paediatrician
2 ) CHPFCOP – Consultations with Health Professionals

CPG Problem gambling (6 DVs)
1 ) CPGFGAM – Gambling Activity – Gambler vs. Non–gambler
2 ) CPGDSEV – Problem Gambling Severity Index (PGSI) – Modified Version
3 ) CPGDTYP – Type of Gambler
4 ) CPGDACT – Number of Types of Gambling Activities in the List Used to Calculate CPGI
5 ) CPGDINT – Gambling Interference – Mean
6 ) CPGFINT – Flag for Gambling Interference

DHH Dwelling and household variables (10 DVs)
1 ) DHHDSAGE – Age of spouse
2 ) DHHDYKD – Number of Persons in Household Less Than 16 Years of Age
3 ) DHHDOKD – Number of Persons in Household 16 or 17 Years of Age
4 ) DHHDLE5 – Number of Persons in Household Less Than 6 Years of Age
5 ) DHHD611 – Number of Persons in Household between 6 and 11 Years of Age
6 ) DHHDL12 – Number of Persons in Household Less Than 12 Years of Age
7 ) DHHDL18 – Number of Persons in Household Less than 18 Years of Age
8 ) DHHDLVG – Living/Family Arrangement of Selected Respondent
9 ) DHHDECF – Economic Family Status (Household Type)
10 ) DHHDHSZ – Household Size

DIS Distress (3 DVs)
1 ) DISDK6 – Distress Scale – K6
2 ) DISDCHR – Chronicity of Distress and Impairment Scale
3 ) DISDDSX – Distress Scale – K10

DPS Depression (4 DVs)
1 ) DPSDSF – Derived Depression Scale – Short Form Score
2 ) DPSDPP – Depression Scale – Probability of Caseness to Respondents
3 ) DPSDWK – Number of Weeks Feeling Depressed – 12–Months
4 ) DPSDMT – Specific Month Last Felt Depressed

DRV Driving and safety (1 DV)
1 ) DRVFSBU – Passenger Seat Belt Use (Motor Vehicle)

DSU Dietary supplement use – Vitamins and minerals (1 DV)
1 ) DSUDCON – Frequency of Consumption of Vitamin or Mineral Supplements

EDU Education (4 DVs)
1 ) EDUDH04 – Highest Level of Education – Household, 4 Levels
2 ) EDUDH10 – Highest Level of Education – Household, 10 Levels
3 ) EDUDR04 – Highest Level of Education – Respondent, 4 Levels
4 ) EDUDR10 – Highest Level of Education – Respondent, 10 Levels

FDC Food choices (3 DVs)
1 ) FDCFAVD – Avoids Certain Foods for Certain Content Reasons
2 ) FDCFCAH – Chooses or Avoids Certain Foods Because of Certain Health Concerns
3 ) FDCFCHO – Chooses Certain Foods for Certain Content Reasons

FSC Food security (3 DVs)
1 ) FSCDHFS2 – Household Food Security Status – Modified version
2 ) FSCDAFS2 – Food Security – Adult Status
3 ) FSCDCFS2 – Food Security – Child Status

FVC Fruit and vegetable consumption (8 DVs)
1 ) FVCDJUI – Daily Consumption – Fruit Juice
2 ) FVCDFRU – Daily Consumption – Other Fruit
3 ) FVCDSAL – Daily Consumption – Green Salad
4 ) FVCDPOT – Daily Consumption – Potatoes
5 ) FVCDCAR – Daily Consumption – Carrots
6 ) FVCDVEG – Daily Consumption – Other Vegetables
7 ) FVCDTOT – Daily Consumption – Total Fruit and Vegetable
8 ) FVCGTOT – Grouping of Daily Consumption – Total Fruit and Vegetable

GEN General health (3 DVs)
1 ) GENDHDI – Perceived Health
2 ) GENDMHI – Perceived Mental Health
3 ) GENGSWL – Satisfaction with life in general – (G)

GEO Geography variables (18 DVs)
1 ) GEODPC – Postal Code
2 ) GEODHR4 – Health Region
3 ) GEODBCHA – Health Authority – British Columbia
4 ) GEODSHR – Quebec Sub–Health Region
5 ) GEODDHA – Nova Scotia District Health Authority (DHA)
6 ) GEODRHA – Regional Health Authority – Alberta
7 ) GEODLHA – British Columbia Local Health Authority (LHA)
8 ) GEODLHN – Ontario Local Health Integration Network
9 ) GEODDA06 – 2006 Census Dissemination Area (DA)
10 ) GEODFED – 2006 Census Federal Electoral District (FED)
11 ) GEODCSD – 2006 Census Subdivision (CSD)
12 ) GEODCD – 2006 Census Division (CD)
13 ) GEODSAT – Statistical Area Classification Type (SAT)
14 ) GEODCMA6 – 2006 Census Metropolitan Area (CMA)
15 ) GEODPG09 – Peer Group
16 ) GEODUR – Urban–Rural Classification
17 ) GEODUR2 – Urban–Rural Classification – Grouped
18 ) GEODPSZ – Population Size Group

HMC Home care services (1 DV)
1 ) HMCFRHC – Received Home Care

HUI Health utilities index (8 DVs)
1 ) HUIDVIS – Vision Health Status
2 ) HUIDHER – Hearing Health Status
3 ) HUIDSPE – Speech Health Status
4 ) HUIDMOB – Ambulation Health Status
5 ) HUIDDEX – Dexterity Health Status
6 ) HUIDEMO – Emotion Health Status
7 ) HUIDCOG – Cognition Health Status
8 ) HUIDHSI – Health Utilities Index

HUP Health utilities index – Pain and discomfort (1 DV)
1 ) HUPDPAD – Pain Health Status

HWT Height and weight – Self–reported (5 DVs)
1 ) HWTDHTM – Height (Metres) – Self–Reported
2 ) HWTDWTK – Weight (Kilograms) – Self–Reported
3 ) HWTDBMI – Body Mass Index (self–reported)
4 ) HWTDISW – BMI classification for adults aged 18 and over (self–reported) – international standard
5 ) HWTDCOL – BMI classification for children aged 12 to 17 (self–reported) – Cole classification system

IDG Illicit drug use (16 DVs)
1 ) IDGFLCA – Cannabis Drug Use – Lifetime (Including "One Time Only" Use)
2 ) IDGFLCM – Cannabis Drug Use – Lifetime (Excluding "One Time Only" Use)
3 ) IDGFYCM – Cannabis Drug Use – 12 month (Excluding "One Time Only" Use)
4 ) IDGFLCO – Cocaine or Crack Drug Use – Lifetime
5 ) IDGFLAM – Amphetamine (Speed) Drug Use – Lifetime
6 ) IDGFLEX – MDMA (ecstasy) Drug Use – Lifetime
7 ) IDGFLHA – Hallucinogens, PCP or LSD Drug Use – Lifetime
8 ) IDGFLGL – Glue, Gasoline, or Other Solvent Use – Lifetime
9 ) IDGFLHE – Heroin Drug Use – Lifetime
10 ) IDGFLST – Steroid Use – Lifetime
11 ) IDGFLA – Any Illicit Drug Use – Lifetime (Including "One Time Only" Use of Cannabis)
12 ) IDGFLAC – Any Illicit Drug Use – Lifetime (Excluding "One Time Only" Use of Cannabis)
13 ) IDGFYA – Any Illicit Drug Use – 12–Month (Including "One Time Only" Use of Cannabis)
14 ) IDGFYAC – Any Illicit Drug Use – 12–Month (Excluding "One Time Only" Use of Cannabis)
15 ) IDGDINT – Illicit Drug Interference 12–Month – Mean
16 ) IDGFINT – Flag for Illicit Drug Interference – 12–Month

INC Income (6 DVs)
1 ) INCDHH – Total Household Income – All Sources
2 ) INCDPER – Personal Income – All Sources
3 ) INCDADR – Adjusted household income ratio – National level
4 ) INCDRCA – Distribution of household income – National level
5 ) INCDRPR – Distribution of household income – Provincial level
6 ) INCDRRS – Distribution of household income – Health region level

INJ Injuries (4 DVs)
1 ) INJDTBS – Type of Injury by Body Site
2 ) INJDCAU – Cause of Injury
3 ) INJDCBP – Cause of Injury by Place of Occurrence
4 ) INJDSTT – Injury Status

INW Workplace injury (2 DVs)
1 ) INWDOCG – Injury at Work – Occupation Group
2 ) INWDING – Injury at work – Industry Group

LBS Labour force (5 DVs)
1 ) LBSDHPW – Total usual hours worked per week
2 ) LBSDPFT – Full–time/part–time working status (for total usual hours)
3 ) LBSDWSS – Working status last week
4 ) LBSDING – Industry Group
5 ) LBSDOCG – Occupation Group

MAS Mastery (1 DV)
1 ) MASDM1 – Derived Mastery Scale

MEX Maternal experiences – Breastfeeding (2 DVs)
1 ) MEXDEBF – Length of exclusive breastfeeding
2 ) MEXFEB6 – Exclusively breastfed for at least 6 months (or more)

NEU Neurological conditions (38 DVs)
1 ) NEUDNCR – Has a neurological condition – selected respondent
2 ) NEUDNCH – Presence of neurological condition in the household
3 ) NEUDMHR – Has migraine headaches – selected respondent
4 ) NEUDMHH – Number of persons in the household with migraine headaches
5 ) NEUDEPR – Has epilepsy – selected respondent
6 ) NEUDEPH – Number of persons in the household with epilepsy
7 ) NEUDCPR – Has cerebral palsy – selected respondent
8 ) NEUDCPH – Number of persons in the household with cerebral palsy
9 ) NEUDSBR – Has spina bifida – selected respondent
10 ) NEUDSBH – Number of persons in the household with spina bifida
11 ) NEUDHCR – Has hydrocephalus – selected respondent
12 ) NEUDHCH – Number of persons in the household with hydrocephalus
13 ) NEUDMDR – Has muscular dystrophy – selected respondent
14 ) NEUDMDH – Number of persons in the household with muscular dystrophy
15 ) NEUDDYR – Has dystonia – selected respondent
16 ) NEUDDYH – Number of persons in the household with dystonia
17 ) NEUDTSR – Has Tourette's syndrome – selected respondent
18 ) NEUDTSH – Number of persons in the household with Tourette's syndrome
19 ) NEUDPDR – Has Parkinson's disease – selected respondent
20 ) NEUDPDH – Number of persons in the household with Parkinson's disease
21 ) NEUDALR – Has ALS (Lou Gehrig’s disease/amyotrophic lateral sclerosis) – selected respondent
22 ) NEUDALH – Number of persons in the household with ALS (Lou Gehrig's disease)
23 ) NEUDHDR – Has Huntington's disease – selected respondent
24 ) NEUDHDH – Number of persons in the household with Huntington's disease
25 ) NEUDSTR – Suffers from the effects of a stroke – selected respondent
26 ) NEUDSTH – Number of persons in the household that suffer from the effects of a stroke
27 ) NEUDBIR – Has a neurological condition caused by a brain injury – selected respondent
28 ) NEUDBIH – Number of persons in the hhld with a neurological condition caused by a brain injury
29 ) NEUDBTR – Has a neurological condition caused by a brain tumour – selected respondent
30 ) NEUDBTH – Number of persons in the hhld with a neurological condition caused by brain tumour
31 ) NEUDSIR – Has a neurological condition caused by a spinal cord injury – selected respondent
32 ) NEUDSIH – Number of persons in the hhld with neurological condition caused by a spinal cord injury
33 ) NEUDSCR – Has a neurological condition caused by a spinal cord tumour – selected respondent
34 ) NEUDSCH – Number of persons in the hhld with a neurological condition caused by a spinal cord tumour
35 ) NEUDADR – Has Alzheimer's disease or other dementia – selected respondent
36 ) NEUDADH – Number of persons in the household with Alzheimer's or other dementia
37 ) NEUDMSH – Number of persons in the household with multiple sclerosis
38 ) NEUDMSR – Has multiple sclerosis – selected respondent

OH2 Oral health 2 (2 DVs)
1 ) OH2FLIM – Social Limitation Due to Oral Health Status
2 ) OH2FOFP – Oral and Facial Pain and Discomfort

PAC Physical activities (9 DVs)
1 ) PACDEE – Daily Energy Expenditure in Leisure Time Physical Activities
2 ) PACFLEI – Participant In Leisure Time Physical Activity
3 ) PACDFM – Average Monthly Frequency of Leisure Time Physical Activity Lasting Over 15 Minutes
4 ) PACDFR – Frequency of All Leisure Time Physical Activity Lasting Over 15 Minutes
5 ) PACFD – Participant In Daily Leisure Time Physical Activity Lasting Over 15 Minutes
6 ) PACDPAI – Leisure Time Physical Activity Index
7 ) PACDLTI – Transportation and Leisure Time Physical Activity Index
8 ) PACDTLE – Daily Energy Expenditure in Transportation and Leisure Time Physical Activities
9 ) PACFLTI – Participant In Transportation or Leisure Time Physical Activity

PAF Physical activities – Facilities at work (1 DV)
1 ) PAFFACC – Access to Physical Activity Facilities at Work

PWB Psychological well–being (1 DV)
1 ) PWBDPWB – Psychological Well–Being Manifestation Scale (WBMMS)

RAC Restriction of activities (2 DVs)
1 ) RACDIMP – Impact of Health Problems
2 ) RACDPAL – Participation and Activity Limitation

SAC Sedentary activities (2 DVs)
1 ) SACDTOT – Total Number of Hours Per Week Spent In Sedentary Activities
2 ) SACDTER – Total number of hours per week spent in sedentary activities (excluding reading)

SAM Sample variables (2 DVs)
1 ) SAMDSHR – Permission to Share Data
2 ) SAMDLNK – Permission to Link

SCA Smoking cessation methods (1 DV)
1 ) SCADQUI – Attempted/Successful Quitting

SCH Smoking – Stages of change (1 DV)
1 ) SCHDSTG – Smoking Stages of Change (Current and Former Smokers)

SDC Socio–demographic characteristics (10 DVs)
1 ) SDCCCB – Country of birth code
2 ) SDCGCB – Country of birth – grouped
3 ) SDCDLHM – Language(s) spoken at home
4 ) SDCDAIM – Age at time of immigration
5 ) SDCFIMM – Immigration flag
6 ) SDCDRES – Length of time in Canada since immigration
7 ) SDCDLNG – Language(s) in which respondent can converse
8 ) SDCDFL1 – First official language learned and still understood
9 ) SDCDABT – Aboriginal Identity
10 ) SDCDCGT – Cultural / Racial Background

SFE Self–esteem (1 DV)
1 ) SFEDE1 – Derived Self–Esteem Scale

SFR Health status (SF–36) (10 DVs)
1 ) SFRDPFS – Physical Functioning Scale
2 ) SFRDSFS – Social Functioning Scale
3 ) SFRDPRF – Role Functioning (Physical) Scale
4 ) SFRDMRF – Role Functioning (Mental) Scale
5 ) SFRDGMH – General Mental Health Scale
6 ) SFRDVTS – Vitality Scale
7 ) SFRDBPS – Bodily Pain Scale
8 ) SFRDGHP – General Health Perceptions Scale
9 ) SFRDPCS – Summary Measure of Physical Health
10 ) SFRDMCS – Summary Measure of Mental Health

SMK Smoking (3 DVs)
1 ) SMKDSTY – Type of Smoker
2 ) SMKDSTP – Number of Years Since Stopped Smoking Completely
3 ) SMKDYCS – Number of Years Smoked Daily (Current Daily Smokers Only)

SSA Social support – Availability (4 DVs)
1 ) SSADTNG – Tangible Social Support – MOS Subscale
2 ) SSADAFF – Affection – MOS Subscale
3 ) SSADSOC – Positive Social Interaction – MOS Subscale
4 ) SSADEMO – Emotional or Informational Support – MOS Subscale

UPE Use of protective equipment (3 DVs)
1 ) UPEFILS – Wears Protective Equipment when In–Line Skating
2 ) UPEFSKB – Wears Protective Equipment when Skateboarding
3 ) UPEFSNB – Wears Protective Equipment when Snowboarding

WTM Waiting times (9 DVs)
1 ) WTMDSO – Number of Waiting Days to See a Medical Specialist – Seen Specialist
2 ) WTMDSN – Number of Waiting Days to See a Medical Specialist – Not Seen Specialist
3 ) WTMDSA – Number of Acceptable Waiting Days to See a Medical Specialist
4 ) WTMDCO – Number of Waiting Days to Receive Non–Emergency Surgery – Surgery Done
5 ) WTMDCN – Number of Waiting Days to Receive Non–Emergency Surgery – Surgery Not Done
6 ) WTMDCA – Number of Acceptable Waiting Days to Receive Non–Emergency Surgery
7 ) WTMDTO – Number of Waiting Days for Diagnostic Test – Test Done
8 ) WTMDTN – Number of Waiting Days for Diagnostic Test – Test Not Done
9 ) WTMDTA – Number of Acceptable Waiting Days for Diagnostic Test

For the complete document in PDF format, contact Client Services (613-951-1746; hd-ds@statcan.gc.ca), Health Statistics Division

Location of study of person, name

The data for this variable are reported using the following classification(s) and/or list(s):

'Location of study' refers to the province, territory or country where the person obtained his or her highest certificate, diploma or degree. It refers to the location of the institution granting the certificate, diploma or degree, not the location of the person at the time he or she obtained it. The location is reported according to current boundaries.

'Person' refers to an individual and is the unit of analysis for most social statistics programmes.

Secondary (high) school diploma or equivalent of person, category

The data for this variable are reported using the following classification(s) and/or list(s):

'Secondary (high) school diploma or equivalent' refers to whether or not persons have completed a secondary school or high school diploma, graduation certificate, or its equivalent. If other education qualifications above high school are held, this variable also indicates the highest additional certificate, diploma or degree.

'Person' refers to an individual and is the unit of analysis for most social statistics programmes.

Location of study of person, category

The data for this variable are reported using the following classification(s) and/or list(s):

'Location of study' refers to the province, territory or country where the person obtained his or her highest certificate, diploma or degree. It refers to the location of the institution granting the certificate, diploma or degree, not the location of the person at the time he or she obtained it. The location is reported according to current boundaries.

'Person' refers to an individual and is the unit of analysis for most social statistics programmes.

Field of study of person, type

The data for this variable are reported using the following classification(s) and/or list(s):

'Field of study' refers to the discipline or area of learning/training associated with a particular course or programme of study.

Note: In the 2011 National Household Survey, 'Field of study' is referred to as 'Major field of study'.

'Person' refers to an individual and is the unit of analysis for most social statistics programmes.

Certificate, diploma or degree of person 15 years or over, type

The data for this variable are reported using the following classification(s) and/or list(s):

'Certificate, diploma or degree' refers to a certificate, diploma or degree obtained by the person from an accredited educational institution. It also includes educational certificates or diplomas awarded to the person by provincial or federal authorities, such as the journeyman/woman designation or teaching and nursing certificates. The certificate, diploma or degree must be awarded based principally on evaluation of educational attainment and not on attendance.

'Person 15 years or over' refers to an individual whose age is 15 years or over.

School attendance of person, category

The data for this variable are reported using the following classification(s) and/or list(s):

'School attendance' refers to whether a person attended, either full time or part time, any accredited educational institution or program during all or part of a specified reference period. The person may have attended more than one educational institution or have been enrolled in more than one program. Attendance is counted only for courses which could be used as credits towards a certificate, diploma or degree from an educational institution or program such as elementary or secondary school, registered apprenticeship program, trade school, college, CEGEP or university. Educational institutions also include seminaries, schools of nursing, private business schools, private or public trade schools, institutes of technology, vocational schools, and schools for people who are deaf or blind. Attendance includes participation in courses or programs offered over the Internet, through correspondence and by other non-traditional methods of delivery. Attendance does not include training received from an employer unless it could be used as credit towards a certificate, diploma or degree from an accredited educational institution. A person is considered to have attended an educational institution if they were enrolled during the reference period but were absent, for example, due to illness.

Note: In the 2011 National Household Survey, 'School attendance' is referred to as 'Attendance at school'.

'Person' refers to an individual and is the unit of analysis for most social statistics programmes.

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate the level and monthly trend of retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. Because retail trade is an important industry in the Canadian economy, governments are also able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after-sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the second as retail trade in constant dollars. The current dollar estimate is calculated by aggregating the weighted value of sales for all retail outlets. The constant dollar estimate is calculated by first adjusting the sales values to a base year, using the Consumer Price Index, and then summing the resulting values.
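As a rough illustration of the constant dollar calculation described above, the sketch below deflates current dollar sales to base-year prices and sums the results. The function name, the index values and the use of a simple ratio of the base-year index to the current index are assumptions made for illustration; the source does not specify the CPI series or the base year used.

```python
def constant_dollar_total(weighted_sales, cpi, cpi_base=100.0):
    """Deflate each current dollar sales value to base-year prices, then sum.

    weighted_sales: weighted sales values in current dollars
    cpi:            price index value that applies to each sales value
    cpi_base:       index value of the base year (assumed to be 100)
    """
    return sum(s * cpi_base / p for s, p in zip(weighted_sales, cpi))


# Hypothetical figures: two outlets, prices 25% above the base year.
current_total = sum([500.0, 750.0])
constant_total = constant_dollar_total([500.0, 750.0], [125.0, 125.0])
print(current_total, constant_total)
```

With these made-up figures, the constant dollar total is lower than the current dollar total because prices have risen since the base year.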

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed-upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2007 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employers, as well as businesses with no employees that have annual sales greater than $30,000 and a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, which serve as cost centres or discretionary expense centres for which data on all costs, including labour and depreciation, can be reported by the business), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR, and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)
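The coverage rules above can be sketched as a simple filter. This is only an illustration: the `Establishment` record and its field names are hypothetical and do not reflect the actual layout of the Business Register. Note that the listed non-covered codes already fall outside the stated 441100 to 453999 range, so the prefix check is kept only to mirror the list.

```python
from dataclasses import dataclass
from typing import Optional

# Non-covered NAICS groups listed in the text.
NON_COVERED_PREFIXES = ("4541", "4542", "45431", "45439")


@dataclass
class Establishment:
    naics: str                  # six-digit NAICS code
    gbi: Optional[float]        # gross business income on the BR (None if missing)
    is_ancillary: bool = False
    is_future: bool = False


def in_target_population(e: Establishment) -> bool:
    """Apply the retail NAICS range and the exclusions described above."""
    in_retail_range = 441100 <= int(e.naics) <= 453999
    non_covered = e.naics.startswith(NON_COVERED_PREFIXES)
    has_gbi = e.gbi is not None and e.gbi != 0
    return (in_retail_range and not non_covered and has_gbi
            and not e.is_ancillary and not e.is_future)


print(in_target_population(Establishment("445110", 2_000_000.0)))  # grocery store
print(in_target_population(Establishment("454110", 500_000.0)))    # electronic shopping
```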

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions. We further stratify the population by size.
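To make the selection step concrete, here is a minimal sketch of simple random sampling without replacement within strata. The stratum keys, cluster identifiers and sample sizes are invented for the example, and the fixed seed is used only to make the sketch reproducible; none of this reflects the production system.

```python
import random
from collections import defaultdict


def stratified_srs(units, n_by_stratum, seed=0):
    """Select a simple random sample without replacement in each stratum.

    units:        list of (cluster_id, stratum) pairs
    n_by_stratum: dict mapping stratum -> sample size for that stratum
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for cluster_id, stratum in units:
        by_stratum[stratum].append(cluster_id)

    sample = {}
    for stratum, members in by_stratum.items():
        # Never ask for more units than the stratum contains.
        n = min(n_by_stratum.get(stratum, 0), len(members))
        sample[stratum] = rng.sample(members, n)
    return sample


# Hypothetical strata: (industry group, region, size class).
take_some = ("4451", "ON", "take-some")
take_all = ("4451", "ON", "take-all")
frame = [(f"c{i}", take_some) for i in range(10)]
frame += [(f"c{i}", take_all) for i in range(10, 13)]
sample = stratified_srs(frame, {take_some: 4, take_all: 3})
print(sample)
```

In the take-all stratum the sample size equals the stratum size, so every cluster is selected.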

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
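One way to picture the ten percent rule for the take-none stratum is to accumulate the smallest units until their combined size would exceed the limit. The sketch below does this under stated assumptions: the function name and the data are hypothetical, and the source does not describe how the stratum boundary is actually determined.

```python
def take_none_units(size_by_unit, max_share=0.10):
    """Return the ids of the smallest units whose combined size measure
    stays at or below max_share of the total."""
    total = sum(size_by_unit.values())
    cumulative = 0.0
    excluded = []
    # Walk through the units from smallest to largest.
    for unit_id, size in sorted(size_by_unit.items(), key=lambda kv: kv[1]):
        if cumulative + size > max_share * total:
            break
        cumulative += size
        excluded.append(unit_id)
    return excluded


# Hypothetical size measures for five units (total = 1000).
sizes = {"a": 10, "b": 20, "c": 30, "d": 140, "e": 800}
excluded = take_none_units(sizes)
print(excluded)
```

Here the three smallest units account for 6% of the total; adding the fourth would exceed the limit, so it stays in the surveyed portion.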

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, the proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level; below this level, the response is rejected and considered a unit nonresponse. In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or for which we tried to receive extracted data. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
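The weighted and un-weighted rates defined above share the same structure: a numerator restricted to units with a qualifying response status, over a denominator covering all units considered. The following sketch shows that structure only. The status labels ("reported", "refusal") and the unit records are hypothetical stand-ins for the response statuses i to ix, whose exact coding is not given in this document.

```python
def weighted_response_rate(units, responding_statuses):
    """Share of weighted sales that comes from units counted as responding."""
    total = sum(u["weight"] * u["sales"] for u in units)
    responded = sum(u["weight"] * u["sales"] for u in units
                    if u["status"] in responding_statuses)
    return responded / total


def unweighted_response_rate(units, responding_statuses):
    """Share of units (questionnaires) counted as responding."""
    responded = sum(1 for u in units if u["status"] in responding_statuses)
    return responded / len(units)


# Hypothetical surveyed units; sales may be reported or imputed values.
units = [
    {"weight": 1.0, "sales": 600.0, "status": "reported"},
    {"weight": 2.0, "sales": 100.0, "status": "reported"},
    {"weight": 4.0, "sales": 50.0, "status": "refusal"},
]
weighted = weighted_response_rate(units, {"reported", "converted_refusal"})
unweighted = unweighted_response_rate(units, {"reported", "converted_refusal"})
print(weighted, unweighted)
```

The two rates differ because the weighted rate reflects how much of the estimate rests on responding units, while the un-weighted rate counts questionnaires regardless of their size.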

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Service Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1
Weighted response rates by NAICS, for all provinces/territories: April 2011

                                                  Weighted Response Rates
                                                  Total   Survey   Administrative
NAICS - Canada
Motor Vehicle and Parts Dealers                   93.7    94.6     55
Automobile Dealers                                96.1    96.4     56.3
New Car Dealers                                   97.4    97.4
Used Car Dealers                                  75      77.4     56.3
Other Motor Vehicle Dealers                       71.4    75       52.1
Automotive Parts, Accessories and Tire Stores     84.4    88.1     58.1
Furniture and Home Furnishings Stores             83.3    87.4     44.8
Furniture Stores                                  86.8    88.7     44.5
Home Furnishings Stores                           77.3    84.6     44.8
Electronics and Appliance Stores                  87.7    89       59.8
Building Material and Garden Equipment Dealers    90.4    93.6     60.6
Food and Beverage Stores                          80.9    86.1     26
Grocery Stores                                    81.8    87.9     23
Grocery (except Convenience) Stores               83.5    89.7     19.6
Convenience Stores                                59.8    62       47.6
Specialty Food Stores                             67.9    74.6     40.3
Beer, Wine and Liquor Stores                      80.1    81       46.7
Health and Personal Care Stores                   90.6    92.8     67.8
Gasoline Stations                                 83.1    84.3     64.6
Clothing and Clothing Accessories Stores          86.4    88.1     47.7
Clothing Stores                                   85.6    87.3     47.4
Shoe Stores                                       94.2    95.4     38.8
Jewellery, Luggage and Leather Goods Stores       83.4    85.6     52.7
Sporting Goods, Hobby, Book and Music Stores      84.9    91.8     32
General Merchandise Stores                        98.6    99.2     6.7
Department Stores                                 100     100
Other General Merchandise Stores                  97.2    98.4     6.7
Miscellaneous Store Retailers                     82.6    86.5     55.9
Total                                             88.4    91       45.6
Regions
Newfoundland and Labrador                         87.4    88.4     20.3
Prince Edward Island                              89.9    90.8     33.2
Nova Scotia                                       94.2    95.1     71.5
New Brunswick                                     87.6    91.2     38.2
Québec                                            90      93.9     39.9
Ontario                                           88.7    91.1     48.6
Manitoba                                          88.7    88.9     79.1
Saskatchewan                                      89.5    91.3     39.2
Alberta                                           83.4    85.4     49.6
British Columbia                                  88.8    91.5     43.2
Yukon Territory                                   89.4    89.4
Northwest Territories                             87.1    87.1
Nunavut                                           65.7    65.7

1. There are no administrative records used for new car dealers.

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
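For readers unfamiliar with this kind of ratio edit, the sketch below follows the commonly published form of the Hidiroglou-Berthelot method: ratios are centred on their median, scaled by a size effect, and compared with bounds built from the quartiles. The parameter values (U, A, C) are illustrative assumptions, the function name is hypothetical, and the settings used in the MRTS are not stated in this document.

```python
import statistics


def hb_outlier_flags(previous, current, U=0.4, A=0.05, C=4.0):
    """Flag outlying period-to-period ratios (Hidiroglou-Berthelot style).

    previous, current: strictly positive values for the same units
    U, A, C:           tuning parameters (illustrative defaults only)
    """
    ratios = [c / p for p, c in zip(previous, current)]
    r_med = statistics.median(ratios)

    # Symmetric transformation of the ratios around their median.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]

    # Size effect: changes in large units carry more importance.
    e = [si * max(p, c) ** U for si, p, c in zip(s, previous, current)]

    q1, e_med, q3 = statistics.quantiles(e, n=4, method="inclusive")
    d_q1 = max(e_med - q1, abs(A * e_med))
    d_q3 = max(q3 - e_med, abs(A * e_med))
    lower, upper = e_med - C * d_q1, e_med + C * d_q3
    return [ei < lower or ei > upper for ei in e]


# Hypothetical group: seven stable units and one whose sales tripled.
previous = [100.0] * 8
current = [98.0, 99.0, 100.0, 101.0, 102.0, 103.0, 104.0, 300.0]
flags = hb_outlier_flags(previous, current)
print(flags)
```

Only the last unit is flagged; the small month-to-month movements of the other units fall well inside the acceptance bounds.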

The second set of edits consists of an edit known as the share of market edit. With this method, one is able to edit all respondents, even those for whom historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, if the weighted contribution of a respondent to the group’s total is too large, the response is flagged as an outlier.
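The share of market edit can be pictured as follows. This is a minimal sketch: the 50% threshold and the function name are assumptions chosen for the example, since the document does not state what share is considered too large, and the group total is assumed to be positive.

```python
def share_of_market_flags(weights, sales, max_share=0.5):
    """Flag units whose weighted sales exceed max_share of the group total."""
    contributions = [w * y for w, y in zip(weights, sales)]
    total = sum(contributions)  # assumed to be greater than zero
    return [c / total > max_share for c in contributions]


# Hypothetical group of four respondents in the same industry and region.
weights = [1.0, 5.0, 5.0, 5.0]
sales = [9000.0, 100.0, 120.0, 80.0]
market_flags = share_of_market_flags(weights, sales)
print(market_flags)
```

The first unit contributes 9,000 of a weighted total of 10,500, so it is the only one flagged.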

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. This is done by assigning values when they are missing on the record being edited, to ensure that estimates are of high quality and that the record is plausible and internally consistent. Due to concerns about response burden, cost and timeliness, it is generally impossible to follow up with all respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
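The first type of historical imputation, a general trend from one historical source, can be sketched as a ratio applied to the non-respondent's previous value. The function name and data are hypothetical, and the sketch ignores details the document mentions elsewhere, such as excluding flagged outliers from the trend and forming imputation groups by industry and geography.

```python
def impute_by_trend(previous, current):
    """Fill missing current values (None) with previous value x group trend.

    The trend is the ratio of current to previous totals, computed over the
    units that have usable data in both periods.
    """
    pairs = [(p, c) for p, c in zip(previous, current) if c is not None]
    trend = sum(c for _, c in pairs) / sum(p for p, _ in pairs)
    return [c if c is not None else p * trend
            for p, c in zip(previous, current)]


# Hypothetical group: the third unit did not respond this month.
previous_month = [100.0, 200.0, 400.0]
current_month = [150.0, 300.0, None]
completed = impute_by_trend(previous_month, current_month)
print(completed)
```

Respondents grew by a factor of 1.5 between the two months, so the missing value is imputed as 1.5 times the unit's previous value.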

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that is used for units with a simple structure (e.g., enterprises with only one establishment), and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
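The Horvitz-Thompson step described above amounts to weighting each sampled unit's sales by the inverse of its inclusion probability and summing by domain. A minimal sketch follows; the domain labels, sales values and probabilities are invented for the example, and the function is not part of the Generalized Estimation System.

```python
from collections import defaultdict


def horvitz_thompson_totals(sample):
    """Estimate domain totals as the sum of sales / inclusion probability.

    sample: iterable of (domain, sales, inclusion_probability) tuples
    """
    totals = defaultdict(float)
    for domain, sales, pi in sample:
        totals[domain] += sales / pi  # the sampling weight is 1 / pi
    return dict(totals)


# Hypothetical sample: one take-all unit (pi = 1) and two take-some units.
sample_units = [
    (("4451", "ON"), 1000.0, 1.0),
    (("4451", "ON"), 200.0, 0.25),
    (("4471", "QC"), 300.0, 0.5),
]
ht_totals = horvitz_thompson_totals(sample_units)
print(ht_totals)
```

A unit selected with probability 0.25 carries a weight of 4, so it stands in for four units of the population in its domain.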

For more information on the methodology for modeling sales from administrative data sources which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
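For a stratified simple random sample without replacement, the textbook variance estimator of a total sums, over strata, N_h^2 (1 - n_h/N_h) s_h^2 / n_h, where N_h is the stratum population size, n_h the sample size and s_h^2 the sample variance. The sketch below applies that standard formula to invented data; it is an illustration of the design-based calculation, not the production code, and it assumes at least two sampled units in every stratum that is not fully enumerated.

```python
import math
import statistics


def stratified_total_and_variance(strata):
    """Estimated total and its variance under stratified SRS without replacement.

    strata: list of (N_h, sample_values), N_h being the stratum population size
    """
    total = 0.0
    variance = 0.0
    for N_h, y in strata:
        n_h = len(y)
        total += N_h * statistics.mean(y)
        if n_h < N_h:  # take-all strata contribute no sampling variance
            s2 = statistics.variance(y)  # sample variance, divisor n_h - 1
            variance += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    return total, variance


# Hypothetical strata: a take-all stratum of 3 units and a take-some
# stratum where 4 of 10 units were sampled.
strata = [(3, [500.0, 700.0, 900.0]), (10, [10.0, 20.0, 30.0, 40.0])]
est_total, est_variance = stratified_total_and_variance(strata)
cv = math.sqrt(est_variance) / est_total
print(est_total, est_variance, cv)
```

The coefficient of variation, the square root of the variance divided by the estimate, is the form in which precision targets are expressed in the sampling section.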

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time."1 Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

1. "A Note on the Seasonal adjustment of Economic Time Series", Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data are seasonally adjusted using the X-12-ARIMA2 software. The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series, pre-adjusted and extrapolated if applicable, is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages³. The moving averages used to estimate the main components (the trend and seasonality) are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.
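To illustrate how a moving average can eliminate an undesirable component, the following Python sketch applies a centred 2x12 moving average to a monthly series. This filter is commonly used as a first trend estimate in X-11 type procedures, but the text above does not specify which averages the survey uses, so the filter choice, the function name centered_ma_2x12 and the data are assumptions made for illustration only.

```python
import numpy as np

def centered_ma_2x12(series):
    """Centred 2x12 moving average of a monthly series (illustrative only).

    Weights are 1/24 on the two end months and 1/12 on the eleven months
    in between, so every calendar month receives a total weight of 1/12
    and a stable 12-month seasonal pattern is averaged out.
    The first and last six months are NaN because the window is incomplete.
    """
    y = np.asarray(series, dtype=float)
    weights = np.r_[1 / 24, np.full(11, 1 / 12), 1 / 24]
    trend = np.full(y.shape, np.nan)
    if y.size >= weights.size:
        trend[6:-6] = np.convolve(y, weights, mode="valid")
    return trend

# Hypothetical data: a linear trend plus a fixed seasonal pattern summing to zero.
months = np.arange(36)
seasonal = np.tile([-5, -3, 0, 2, 4, 6, 5, 3, 0, -2, -4, -6], 3)
raw = 100 + 0.5 * months + seasonal

trend = centered_ma_2x12(raw)
print(trend[6:12])
```

Because the weights are symmetric and sum to one, the filter leaves the linear trend unchanged while removing the seasonal pattern. A single atypical value would spread into thirteen neighbouring trend values, which is why outlier detection matters.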

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
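The idea of forcing annual totals can be sketched in Python as follows. The text above does not state which benchmarking method is applied, so this example uses a simple pro-rata scaling as a stand-in; the function name force_annual_totals and the data are hypothetical.

```python
import numpy as np

def force_annual_totals(adjusted, original, period=12):
    """Scale each year of the adjusted series so its total matches the original.

    Pro-rata forcing: every month in a year is multiplied by the same factor.
    Both series must cover the same whole number of years.
    """
    adj = np.asarray(adjusted, dtype=float)
    orig = np.asarray(original, dtype=float)
    if adj.shape != orig.shape or adj.size % period != 0:
        raise ValueError("series must have equal length and cover complete years")
    adj = adj.reshape(-1, period)
    orig = orig.reshape(-1, period)
    factors = orig.sum(axis=1) / adj.sum(axis=1)  # one factor per year
    return (adj * factors[:, None]).ravel()

# Hypothetical two-year example.
original = np.tile(np.arange(1, 13) * 10.0, 2)  # annual total of 780
adjusted = np.full(24, 60.0)                    # annual total of 720
forced = force_annual_totals(adjusted, original)
print(forced.reshape(2, 12).sum(axis=1))
```

Pro-rata scaling can create a step between December and January when the factors of consecutive years differ, so a production system would likely use a smoother benchmarking method.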

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.
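The difference between direct and indirect adjustment can be illustrated with a short Python sketch. The adjustment used here is a deliberately crude stand-in (each month is divided by a factor built from monthly means) and is not the X-11 method; the function name simple_adjust and both series are hypothetical.

```python
import numpy as np

def simple_adjust(series, period=12):
    """Toy multiplicative adjustment, NOT X-11.

    Each month is divided by a seasonal factor equal to that calendar
    month's mean divided by the overall mean of the series.
    """
    y = np.asarray(series, dtype=float).reshape(-1, period)
    factors = y.mean(axis=0) / y.mean()
    return (y / factors).ravel()

# Two hypothetical component series with different trends and seasonal patterns.
t = np.arange(24)
series_a = (100 + 2 * t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12))
series_b = 50 * (1 + 0.3 * np.cos(2 * np.pi * t / 12))

direct = simple_adjust(series_a + series_b)                  # adjust the total
indirect = simple_adjust(series_a) + simple_adjust(series_b)  # sum adjusted parts

print(np.max(np.abs(direct - indirect)))
```

The indirect total is additive by construction, while the direct adjustment of the total generally differs from it, which is the discrepancy a reconciliation process is meant to remove.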

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as nonsampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors of the estimate, i.e. between $11,520,000 and $12,480,000.
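The arithmetic in this example can be reproduced with a few lines of Python. The function names below are hypothetical and the sketch simply restates the CV formula and the one and two standard error intervals described above.

```python
def cv_percent(estimate, std_error):
    """Coefficient of variation as a percentage: S(X) * 100% / X."""
    return std_error * 100.0 / estimate

def standard_error(estimate, cv_pct):
    """Recover the standard error from an estimate and its CV in percent."""
    return estimate * cv_pct / 100.0

def confidence_interval(estimate, cv_pct, n_std_errors=1):
    """Interval of n standard errors on either side of the estimate.

    n_std_errors=1 gives roughly 68% confidence, n_std_errors=2 roughly 95%.
    """
    half_width = n_std_errors * standard_error(estimate, cv_pct)
    return estimate - half_width, estimate + half_width

estimate = 12_000_000
cv = 2.0  # percent

print(standard_error(estimate, cv))          # 240000.0
print(confidence_interval(estimate, cv, 1))  # (11760000.0, 12240000.0)
print(confidence_interval(estimate, cv, 2))  # (11520000.0, 12480000.0)
```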

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.
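A check for direct disclosure of this kind could be sketched in Python as follows. The actual confidentiality rules and thresholds are not given in this document, so the function name and the parameters min_respondents, top_n and max_share are hypothetical values chosen only to show the two conditions (few respondents, or dominance by a few contributors).

```python
def is_direct_disclosure_risk(contributions, min_respondents=3, top_n=2, max_share=0.9):
    """Flag a tabulation cell as a possible direct disclosure.

    contributions: values reported by each respondent in the cell.
    All thresholds are hypothetical, not the agency's actual rules.
    """
    values = sorted((abs(v) for v in contributions), reverse=True)
    # Condition 1: the cell is composed of too few respondents.
    if len(values) < min_respondents:
        return True
    total = sum(values)
    if total == 0:
        return False
    # Condition 2: the largest top_n contributors dominate the cell.
    return sum(values[:top_n]) / total > max_share

print(is_direct_disclosure_risk([100, 50]))              # too few respondents
print(is_direct_disclosure_risk([1000, 5, 5, 5]))        # dominated cell
print(is_direct_disclosure_risk([10, 10, 10, 10, 10]))   # no flag
```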