Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. The information collected is used to estimate the level and monthly trend of retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. Because retail trade is an important industry in the Canadian economy, governments are also able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the second as retail trade in constant dollars. The current dollar estimate is calculated by aggregating the weighted value of sales for all retail outlets. The constant dollar estimate is calculated by first adjusting the sales values to a base year, using the Consumer Price Index, and then summing the resulting values.
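As a rough illustration of the two measures, the sketch below sums weighted sales to obtain a current dollar total, and deflates each value by a price index before summing to obtain a constant dollar total. The function names, the record layout and all figures are hypothetical, and a single index value (equal to 100 in the base year) is assumed for simplicity.

```python
def current_dollar_total(records):
    """Sum weighted sales at the prices of the reference period."""
    return sum(r["weight"] * r["sales"] for r in records)


def constant_dollar_total(records, price_index, base_index=100.0):
    """Deflate weighted sales to base-year prices, then sum the results."""
    return sum(r["weight"] * r["sales"] * base_index / price_index for r in records)


# Hypothetical outlets: sampling weight and reported sales in dollars.
outlets = [
    {"weight": 1.0, "sales": 500_000.0},
    {"weight": 4.0, "sales": 60_000.0},
]

current = current_dollar_total(outlets)
# Assumed index of 125 relative to a base year value of 100.
constant = constant_dollar_total(outlets, price_index=125.0)
```

With these made-up figures, prices are 25% above the base year, so the constant dollar total is the current dollar total divided by 1.25.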

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed-upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2012 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employer businesses, as well as businesses with no employees that have annual sales greater than $30,000 and a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services that support the production of goods and services for the market by more than one establishment within the enterprise, and that serve as a cost centre or a discretionary expense centre for which the business can report data on all costs, including labour and depreciation), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR, and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)
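The coverage rules above can be read as a simple filter on frame records. The following sketch is only an illustration: the function name, its arguments and the flags for ancillary and future establishments are hypothetical, not fields of the BR.

```python
# NAICS groups listed above as non-covered.
NON_COVERED_PREFIXES = ("4541", "4542", "45431", "45439")


def in_target_population(naics, gbi, is_ancillary=False, is_future=False):
    """Return True if an establishment record passes the coverage rules."""
    code = str(naics)
    if not 441100 <= int(code) <= 453999:
        return False  # outside the NAICS range quoted for the retail sector
    if code.startswith(NON_COVERED_PREFIXES):
        return False  # defensive check; these codes also fall outside the range
    if is_ancillary or is_future:
        return False
    # Missing or zero gross business income (GBI) is excluded.
    return gbi is not None and gbi != 0


covered = in_target_population(445110, gbi=250_000.0)
```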

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector, selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions. The population is further stratified by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
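To make the roles of the three kinds of size strata concrete, here is a minimal sketch of selection within strata. The stratum labels, unit identifiers and sample sizes are invented for the example, and the data layout is an assumption rather than the survey's production system.

```python
import random


def select_sample(strata, seed=0):
    """Select clusters stratum by stratum.

    `strata` maps a stratum label to a dict holding the stratum `kind`
    ("take-all", "take-some" or "take-none"), its list of `units` and,
    for take-some strata, the number of units `n` to draw.
    """
    rng = random.Random(seed)
    sample = {}
    for label, stratum in strata.items():
        units = stratum["units"]
        if stratum["kind"] == "take-all":
            sample[label] = list(units)  # census: every unit is selected
        elif stratum["kind"] == "take-some":
            # Simple random sample without replacement within the stratum.
            sample[label] = rng.sample(units, stratum["n"])
        else:
            sample[label] = []  # take-none: estimated from administrative data
    return sample


# Hypothetical strata for one industry group and region.
strata = {
    "large": {"kind": "take-all", "units": ["A1", "A2"]},
    "medium": {"kind": "take-some", "units": [f"M{i}" for i in range(20)], "n": 5},
    "small": {"kind": "take-none", "units": [f"S{i}" for i in range(100)]},
}
selected = select_sample(strata)
```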

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect, from a sample of retailers, monthly data on retail sales and the number of trading locations by province or territory, and on inventories of goods owned and intended for resale. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, the proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For a statistical establishment to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold. Below that threshold, the response is rejected and considered a unit non-response, and the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or for which we tried to receive extracted data. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were not contacted in a particular month because they are non-monthly reporters and their response is unavailable for that month.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
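The weighted and un-weighted rates defined above share the same structure: a count or a weighted sales total for units in the numerator statuses, divided by the same quantity for units in the denominator statuses. The sketch below illustrates that structure only. The status labels, field names and figures are hypothetical stand-ins for the response statuses i to ix.

```python
def weighted_rate(units, numerator_statuses):
    """Weighted sales of units in the numerator statuses over all weighted sales."""
    total = sum(u["weight"] * u["sales"] for u in units)
    responded = sum(
        u["weight"] * u["sales"] for u in units if u["status"] in numerator_statuses
    )
    return responded / total


def unweighted_rate(units, numerator_statuses, denominator_statuses):
    """Number of units in the numerator statuses over the number in the denominator statuses."""
    numerator = sum(1 for u in units if u["status"] in numerator_statuses)
    denominator = sum(1 for u in units if u["status"] in denominator_statuses)
    return numerator / denominator


# Hypothetical surveyed units; sales of non-respondents are imputed values.
units = [
    {"weight": 1.0, "sales": 1000.0, "status": "reported"},
    {"weight": 2.0, "sales": 300.0, "status": "converted_refusal"},
    {"weight": 2.0, "sales": 200.0, "status": "refusal"},
    {"weight": 5.0, "sales": 40.0, "status": "no_contact"},
]
RESPONDING = {"reported", "converted_refusal"}
ALL_STATUSES = RESPONDING | {"refusal", "no_contact"}

estimation_rate = weighted_rate(units, RESPONDING)
collection_rate = unweighted_rate(units, RESPONDING, ALL_STATUSES)
```

In this made-up example half of the units responded, but they account for about 73% of weighted sales, which is why the two types of rates can differ noticeably.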

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1: Weighted response rates (%) by NAICS, for all provinces and territories: December 2013

NAICS - Canada                                  Total   Survey   Administrative
Motor Vehicle and Parts Dealers                  94.9     95.7             57.3
Automobile Dealers                               96.8     97.1             56.3
New Car Dealers (Note 1)                         98.0     98.0
Used Car Dealers                                 74.7     78.6             56.3
Other Motor Vehicle Dealers                      74.5     75.6             64.9
Automotive Parts, Accessories and Tire Stores    86.1     89.8             53.2
Furniture and Home Furnishings Stores            89.3     94.3             39.0
Furniture Stores                                 94.1     96.4             52.9
Home Furnishings Stores                          82.3     90.8             31.7
Electronics and Appliance Stores                 92.2     92.5             71.3
Building Material and Garden Equipment Dealers   83.7     86.2             54.4
Food and Beverage Stores                         89.8     92.9             53.4
Grocery Stores                                   93.8     96.7             61.8
Grocery (except Convenience) Stores              94.9     97.3             67.0
Convenience Stores                               76.3     86.4             19.6
Specialty Food Stores                            69.2     79.8             28.2
Beer, Wine and Liquor Stores                     82.6     84.2             17.2
Health and Personal Care Stores                  91.6     92.6             78.0
Gasoline Stations                                78.6     79.2             67.1
Clothing and Clothing Accessories Stores         85.8     87.1             45.2
Clothing Stores                                  86.5     87.8             39.4
Shoe Stores                                      87.7     88.6             14.2
Jewellery, Luggage and Leather Goods Stores      81.7     83.0             61.3
Sporting Goods, Hobby, Book and Music Stores     90.4     94.9             39.5
General Merchandise Stores                       99.0     99.5             35.0
Department Stores                               100.0    100.0
Other General Merchandise Stores                 98.1     99.0             35.0
Miscellaneous Store Retailers                    78.9     84.1             31.5
Total                                            90.4     92.1             54.6

Regions                                         Total   Survey   Administrative
Newfoundland and Labrador                        83.1     84.4             31.7
Prince Edward Island                             88.1     89.4             18.3
Nova Scotia                                      91.3     92.6             53.4
New Brunswick                                    88.5     90.2             56.9
Québec                                           88.3     90.5             59.4
Ontario                                          91.8     93.3             55.5
Manitoba                                         90.4     91.0             52.4
Saskatchewan                                     92.8     94.6             54.3
Alberta                                          90.0     91.6             54.5
British Columbia                                 91.1     93.4             43.5
Yukon Territory                                  86.5     86.5
Northwest Territories                            80.9     80.9
Nunavut                                          70.3     70.3

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
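For readers unfamiliar with this kind of ratio edit, the sketch below follows the commonly published form of the method: ratios are centred on their median, scaled by a size effect, and compared with bounds built from the quartiles. The tuning constants U, A and C, the function name and the data are illustrative assumptions; the values actually used in the MRTS are not given here. All values are assumed to be strictly positive and to belong to one edit group.

```python
import statistics


def hb_outliers(previous, current, U=0.5, A=0.05, C=4.0):
    """Flag outlying current/previous ratios within one edit group.

    U, A and C are tuning constants; the defaults are illustrative only.
    """
    ratios = [c / p for p, c in zip(previous, current)]
    r_med = statistics.median(ratios)
    # Symmetric transformation of each ratio around the median ratio.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]
    # Size effect: the same relative change matters more for larger units.
    e = [si * max(p, c) ** U for si, p, c in zip(s, previous, current)]
    e_med = statistics.median(e)
    q1, _, q3 = statistics.quantiles(e, n=4)
    d_low = max(e_med - q1, abs(A * e_med))
    d_high = max(q3 - e_med, abs(A * e_med))
    lower, upper = e_med - C * d_low, e_med + C * d_high
    return [ei < lower or ei > upper for ei in e]


# Hypothetical sales for eight similar units: last month and current month.
previous = [100, 200, 150, 120, 300, 250, 180, 90]
current = [105, 210, 155, 126, 310, 260, 900, 95]
flags = hb_outliers(previous, current)
```

In this made-up group, only the seventh unit, whose sales jump from 180 to 900, falls outside the bounds.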

The second set of edits consists of an edit known as the share of market edit. With this method, one is able to edit all respondents, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent whose weighted contribution to the group’s total is too large will be flagged as an outlier.
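A minimal sketch of such a share-based check is shown below. The threshold `max_share`, the function name and the figures are hypothetical; the text does not state how large a contribution must be to be flagged.

```python
def share_of_market_flags(units, max_share=0.5):
    """Flag units whose weighted sales exceed `max_share` of the group total.

    `max_share` is an illustrative threshold, not a documented value.
    """
    total = sum(u["weight"] * u["sales"] for u in units)
    if total == 0:
        return [False for _ in units]
    return [u["weight"] * u["sales"] / total > max_share for u in units]


# Hypothetical respondents from one industry-by-geography group.
group = [
    {"weight": 1.0, "sales": 100.0},
    {"weight": 2.0, "sales": 150.0},
    {"weight": 10.0, "sales": 200.0},  # weighted contribution of 2,000
    {"weight": 3.0, "sales": 100.0},
]
share_flags = share_of_market_flags(group)
```

Here the third unit contributes about 74% of the weighted group total of 2,700 and is the only one flagged.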

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for units that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. This is done by assigning values when they are missing on the record being edited, so that estimates are of high quality and the record is plausible and internally consistent. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
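The exact formulas are not given here, but a trend imputation is often implemented as a ratio: the non-respondent's historical value is moved forward by the growth observed among comparable respondents. The sketch below illustrates that idea together with the direct replacement fallback. The function names and figures are hypothetical, and the ratio form is an assumption rather than the documented MRTS specification.

```python
def trend_impute(historical_value, respondents):
    """Impute a current value by applying the respondents' aggregate trend.

    `respondents` is a list of (historical, current) pairs for units in the
    same imputation class that passed the statistical edits.
    """
    trend = sum(cur for _, cur in respondents) / sum(hist for hist, _ in respondents)
    return historical_value * trend


def direct_replacement(historical_value):
    """Last-resort option: carry the historical value forward unchanged."""
    return historical_value


# Hypothetical imputation class: (previous month, current month) sales.
clean_respondents = [(100.0, 110.0), (200.0, 210.0), (300.0, 340.0)]
imputed = trend_impute(50.0, clean_respondents)
```

With these figures the respondents grew by 10% in aggregate, so a non-respondent with sales of 50 last month is imputed at about 55.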

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that is used for units with a simple structure (e.g., enterprises with only one establishment), and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
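The weighting step for the survey portion can be sketched as follows: each sampled unit's sales are multiplied by the inverse of its inclusion probability and summed within its domain. The field names, domain labels and figures are invented for the example.

```python
from collections import defaultdict


def ht_totals_by_domain(sample):
    """Horvitz-Thompson totals: sum of sales / inclusion probability, by domain."""
    totals = defaultdict(float)
    for unit in sample:
        sampling_weight = 1.0 / unit["inclusion_prob"]
        totals[unit["domain"]] += sampling_weight * unit["sales"]
    return dict(totals)


# Hypothetical sampled units; a take-all unit has inclusion probability 1.
sample = [
    {"domain": ("Grocery Stores", "Ontario"), "inclusion_prob": 1.0, "sales": 1000.0},
    {"domain": ("Grocery Stores", "Ontario"), "inclusion_prob": 0.25, "sales": 200.0},
    {"domain": ("Shoe Stores", "Alberta"), "inclusion_prob": 0.25, "sales": 150.0},
    {"domain": ("Shoe Stores", "Alberta"), "inclusion_prob": 0.5, "sales": 300.0},
]
domain_totals = ht_totals_by_domain(sample)
```

A unit drawn with probability 0.25 carries a weight of 4, so it stands in for four units of the population in its domain total.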

For more information on the methodology for modeling sales from administrative data sources, which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
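As an illustration, the standard textbook variance estimator for a total under stratified simple random sampling without replacement can be written as below, along with the coefficient of variation (the square root of the variance divided by the estimate). The data are invented, and the sketch ignores complications such as domains that differ from strata, imputation and the modeled non-survey portion.

```python
import math
import statistics


def stratified_total_and_variance(strata):
    """Estimate a total and its variance under stratified SRS without replacement.

    `strata` is a list of (N, values) pairs: the stratum population size and
    the sampled values. Take-some strata need at least two sampled values.
    """
    total = 0.0
    variance = 0.0
    for N, values in strata:
        n = len(values)
        total += N * statistics.mean(values)
        if n < N:  # take-all strata (n == N) add no sampling variance
            s2 = statistics.variance(values)  # sample variance, divisor n - 1
            variance += N**2 * (1 - n / N) * s2 / n
    return total, variance


# Hypothetical strata: one take-all stratum and one take-some stratum.
strata = [
    (3, [900.0, 1100.0, 1000.0]),
    (20, [100.0, 120.0, 80.0, 100.0]),
]
estimate, var = stratified_total_and_variance(strata)
cv = math.sqrt(var) / estimate
```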

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time." [1] Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.
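The two calendar effects described above can be made concrete with a short sketch. It counts how often each day of the week occurs in a given month, and computes the date of Easter with the widely used anonymous Gregorian algorithm. The function names are chosen for this illustration only; this is not the regression model used in production.

```python
import calendar
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_counts(year, month):
    """Return the number of Mondays, Tuesdays, ... in the given month."""
    counts = [0] * 7
    _, days_in_month = calendar.monthrange(year, month)
    for day in range(1, days_in_month + 1):
        counts[date(year, month, day).weekday()] += 1  # Monday is 0
    return dict(zip(DAY_NAMES, counts))


def easter_date(year):
    """Date of Easter Sunday (Gregorian calendar, anonymous algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


# The same month has a different mix of weekdays from one year to the next,
# and Easter moves between March and April.
for yr in (2013, 2014):
    print(yr, weekday_counts(yr, 3), easter_date(yr))
```

In this example March has five Sundays in both 2013 and 2014 but five Fridays only in 2013, and Easter falls in March in 2013 but in April in 2014.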

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackouts or other unexpected events causing variations in respondents’ activities).

[1] "A Note on the Seasonal Adjustment of Economic Time Series", Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data are seasonally adjusted using the X-12-ARIMA [2] software. The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages [3]. The moving averages used to estimate the main components (the trend and seasonality) are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.
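To show how moving averages can separate the components, here is a much simplified sketch. It is not X-11 or X-12-ARIMA. It assumes an additive monthly series that starts in January, estimates the trend-cycle with a single centred 2x12 moving average, and averages the detrended values by calendar month to obtain seasonal effects. It performs no iteration, outlier correction or calendar adjustment, and the function names are hypothetical.

```python
def centered_ma_2x12(series):
    """Centred 2x12 moving average; None where the 13-term window does not fit."""
    trend = [None] * len(series)
    for t in range(6, len(series) - 6):
        window = series[t - 6:t + 7]                  # 13 consecutive months
        trend[t] = (0.5 * window[0] + sum(window[1:12]) + 0.5 * window[12]) / 12
    return trend


def seasonal_adjust_additive(series):
    """Remove an estimated additive monthly seasonal effect from `series`."""
    if len(series) < 24:
        raise ValueError("at least 24 monthly observations are needed")
    trend = centered_ma_2x12(series)
    sums, counts = [0.0] * 12, [0] * 12
    for t, (value, tc) in enumerate(zip(series, trend)):
        if tc is not None:
            sums[t % 12] += value - tc                # detrended value for that month
            counts[t % 12] += 1
    means = [s / c for s, c in zip(sums, counts)]
    centre = sum(means) / 12                          # force effects to sum to zero
    seasonal = [m - centre for m in means]
    return [value - seasonal[t % 12] for t, value in enumerate(series)]


# Synthetic three-year series: linear trend plus a fixed seasonal pattern.
pattern = [-5, -3, 0, 2, 4, 1, -2, -4, 3, 6, -1, -1]
raw = [100 + 2 * t + pattern[t % 12] for t in range(36)]
adjusted = seasonal_adjust_additive(raw)
print([round(v, 2) for v in adjusted[:6]])
```

On this synthetic series the adjustment recovers the linear trend. Real series also contain irregular movements and outliers, which is why the production method iterates and corrects atypical points.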

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
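The source does not say which technique is used to force the annual totals. The simplest possible approach is pro-rata scaling within each year, sketched below as an assumption for illustration. It takes the series to cover complete years, and the function name and figures are hypothetical. Production benchmarking methods are usually designed to avoid the jumps between years that pro-rata scaling can create.

```python
def force_annual_totals(adjusted, original, period=12):
    """Scale each year of `adjusted` so its total equals that of `original`."""
    if len(adjusted) != len(original) or len(original) % period != 0:
        raise ValueError("series must have equal length and cover whole years")
    forced = []
    for start in range(0, len(original), period):
        original_year = original[start:start + period]
        adjusted_year = adjusted[start:start + period]
        factor = sum(original_year) / sum(adjusted_year)   # one factor per year
        forced.extend(value * factor for value in adjusted_year)
    return forced


# Hypothetical quarterly example (period=4) to keep the numbers short.
original = [10.0, 20.0, 30.0, 40.0, 20.0, 20.0, 20.0, 20.0]
adjusted = [24.0, 24.0, 24.0, 24.0, 19.0, 21.0, 19.0, 21.0]
forced = force_annual_totals(adjusted, original, period=4)
print(sum(forced[:4]), sum(forced[4:]))
```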

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as non-sampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake; or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV(X) = (S(X) / X) × 100%
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error on either side of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors on either side of the estimate, i.e. between $11,520,000 and $12,480,000.
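The arithmetic in this example can be reproduced with a few lines of code. The sketch below recovers the standard error from the estimate and its CV, then builds the interval as the estimate plus or minus a chosen number of standard errors. The function names are illustrative only.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation expressed as a percentage of the estimate."""
    return 100.0 * standard_error / estimate


def confidence_interval(estimate, cv_pct, n_standard_errors=1):
    """Interval of +/- n standard errors around the estimate."""
    standard_error = estimate * cv_pct / 100.0
    half_width = n_standard_errors * standard_error
    return estimate - half_width, estimate + half_width


estimate = 12_000_000
print(confidence_interval(estimate, 2.0, 1))  # (11760000.0, 12240000.0), about 68%
print(confidence_interval(estimate, 2.0, 2))  # (11520000.0, 12480000.0), about 95%
print(cv_percent(estimate, 240_000))          # 2.0
```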

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.
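The two conditions for direct disclosure named above can be sketched as a simple cell check. The thresholds used here (a minimum of three respondents, and the two largest respondents accounting for more than 90% of the cell) are hypothetical values chosen for illustration; the actual confidentiality rules and parameters are not given in this document.

```python
def is_sensitive(cell_values, min_respondents=3, top_n=2, max_share=0.90):
    """Flag a tabulation cell at risk of direct disclosure.

    All three thresholds are hypothetical defaults for this sketch.
    """
    values = sorted(cell_values, reverse=True)
    if len(values) < min_respondents:
        return True                      # too few respondents in the cell
    total = sum(values)
    if total == 0:
        return False                     # nothing to dominate
    top_share = sum(values[:top_n]) / total
    return top_share > max_share         # cell dominated by a few companies


print(is_sensitive([100, 50]))                    # True: only two respondents
print(is_sensitive([950, 30, 10, 10]))            # True: top two hold 98%
print(is_sensitive([300, 250, 200, 150, 100]))    # False
```

A check of this kind covers direct disclosure only. Residual disclosure, where a suppressed cell can be recovered from published totals, requires examining the table as a whole.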

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate level and monthly trend for retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. As an important industry in the Canadian economy, governments are able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways; including the effects of price change on sales and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the latter as retail trade in constant dollars. The method of calculating the current dollar estimate is to aggregate the weighted value of sales for all retail outlets. The method of calculating the constant dollar estimate is to first adjust the sales values to a base year, using the Consumer Price Index, and then sum up the resulting values.

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply based conceptual framework in that establishments are groups into industries according to similarity in production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2012 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register ( BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employers with businesses, and businesses with no employees with annual sales that have a Goods and Services Tax (GST) or annual revenue that declares individual taxes.  annual sales greater than $30,000 that have a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, and serves as a cost centre or a discretionary expense centre for which data on all its costs including labour and depreciation can be reported by the business), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry groups (the mainly, but not only four digit level NAICS), and the geographical regions consisting of the provinces and territories, as well as three provincial sub-regions. We further stratify the population by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census), at most, two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure its relevancy.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked a respondent) must meet a minimum threshold level below which the response would be rejected and considered a unit nonresponse.  In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or tried to receive extracted data for them. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii/ Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi/ Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaire collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all questionnaire in-scope we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Service Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1:
Weighted response rates by NAICS, for all provinces and territories: November 2013
Table summary
This table displays the results of Weighted response rates by NAICS Weighted Response Rates (appearing as column headers).
  Weighted Response Rates
Total Survey Administrative
NAICS - Canada  
Motor Vehicle and Parts Dealers 91.6 92.2 64.1
Automobile Dealers 93.3 93.7 56.0
New Car Dealers Note 1 94.8 94.8  
Used Car Dealers 67.8 70.1 56.0
Other Motor Vehicle Dealers 66.8 67.3 63.6
Automotive Parts, Accessories and Tire Stores 87.1 88.9 71.2
Furniture and Home Furnishings Stores 83.5 87.2 47.6
Furniture Stores 88.6 90.1 58.7
Home Furnishings Stores 75.7 82.0 42.4
Electronics and Appliance Stores 88.9 89.8 45.7
Building Material and Garden Equipment Dealers 88.8 92.6 58.9
Food and Beverage Stores 88.7 90.7 65.7
Grocery Stores 92.4 94.1 73.4
Grocery (except Convenience) Stores 95.0 96.3 79.7
Convenience Stores 53.7 58.8 24.6
Specialty Food Stores 65.9 73.0 38.3
Beer, Wine and Liquor Stores 80.5 82.0 21.5
Health and Personal Care Stores 89.7 89.9 86.1
Gasoline Stations 75.5 75.9 68.0
Clothing and Clothing Accessories Stores 87.7 88.9 47.6
Clothing Stores 88.5 89.6 50.0
Shoe Stores 90.1 91.3  
Jewellery, Luggage and Leather Goods Stores 78.4 80.3 51.4
Sporting Goods, Hobby, Book and Music Stores 90.1 92.8 51.8
General Merchandise Stores 98.3 99.1 21.0
Department Stores 100.0 100.0  
Other general merchadise stores 96.8 98.2 21.0
Miscellaneous Store Retailers 76.5 81.5 33.3
Total 88.8 90.2 61.6
Regions  
Newfoundland and Labrador 83.6 84.9 35.9
Prince Edward Island 87.2 88.7 9.2
Nova Scotia 90.3 91.1 67.1
New Brunswick 86.0 87.8 54.7
Québec 88.0 89.8 64.9
Ontario 89.3 90.7 58.6
Manitoba 86.1 86.6 59.6
Saskatchewan 91.4 92.6 64.3
Alberta 87.5 88.5 65.0
British Columbia 91.2 92.9 59.9
Yukon Territory 86.8 86.8  
Northwest Territories 84.2 84.2  
Nunavut 71.6 71.6  

Weighted Response Rates

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Whenever the data fail a preliminary edit check, the respondent is followed up to validate the potentially erroneous values. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, statistical editing, which is more empirical in nature, is done after data collection. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that no adjustments are made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
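As an illustration of the ratio edit described above, the following is a minimal sketch of the Hidiroglou-Berthelot calculation for one group of similar respondents. The function name and the parameter values (u, a and c) are assumptions chosen for the example; the parameters actually used in the MRTS are not given here. The sketch assumes strictly positive values and would be applied separately within each industry and/or geography group.

```python
from statistics import median, quantiles


def hb_outliers(current, previous, u=0.5, a=0.05, c=4.0):
    """Flag outliers among period-to-period ratios (Hidiroglou-Berthelot).

    current, previous: positive values for the same units, in the same order.
    u, a, c: illustrative tuning parameters (assumed, not the MRTS settings).
    """
    ratios = [cur / prev for cur, prev in zip(current, previous)]
    r_med = median(ratios)

    # Symmetric transformation of each ratio around the median ratio,
    # scaled by unit size so that large units receive more attention.
    effects = []
    for r, cur, prev in zip(ratios, current, previous):
        s = 1 - r_med / r if r < r_med else r / r_med - 1
        effects.append(s * max(cur, prev) ** u)

    q1, _, q3 = quantiles(effects, n=4)
    e_med = median(effects)

    # Interquartile distances, with a floor that avoids a zero-width interval.
    d_low = max(e_med - q1, abs(a * e_med))
    d_high = max(q3 - e_med, abs(a * e_med))

    lower = e_med - c * d_low
    upper = e_med + c * d_high
    return [e < lower or e > upper for e in effects]


previous_month = [100.0] * 10
current_month = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.0, 105.0, 400.0]
flags = hb_outliers(current_month, previous_month)
```

In this example only the last unit, whose sales quadrupled while the rest of the group moved by a few percent, is flagged.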

The second set of edits consists of an edit known as the share of market edit. With this method, all respondents can be edited, even those for which historical and auxiliary data are unavailable, because the method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent is flagged as an outlier if its weighted contribution to the group’s total is too large.
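A minimal sketch of the share of market edit follows. The threshold max_share is an assumed value for illustration only, since the cut-off used in production is not stated; the function name is likewise hypothetical.

```python
def share_of_market_outliers(sales, weights, max_share=0.5):
    """Flag units whose weighted sales exceed max_share of the group total.

    sales, weights: current-month values and weights for one group.
    max_share: assumed threshold, not the MRTS setting.
    """
    weighted = [w * x for w, x in zip(weights, sales)]
    total = sum(weighted)
    if total == 0:
        return [False] * len(weighted)
    return [wx / total > max_share for wx in weighted]


# Weighted sales are 100, 80 and 900, so the third unit holds about 83% of the group.
share_flags = share_of_market_outliers([50.0, 40.0, 900.0], [2.0, 2.0, 1.0])
```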

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Similarly, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for units that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. Values are assigned where they are missing on the record being edited, so that estimates are of high quality and each record is plausible and internally consistent. Because of concerns about response burden, cost and timeliness, it is generally impossible to follow up with every respondent in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
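The first and third historical methods can be sketched as follows. This is an illustration under stated assumptions: the trend is taken as the ratio of current to previous month totals among units that reported in both months, and direct replacement is used only when no trend can be computed. The function name and data layout are hypothetical, and the grouping of units, the exclusion of outliers, the regression method and the month-specific order of preference are not modelled.

```python
def impute_by_trend(current, previous):
    """Impute missing current-month values from previous-month values.

    current, previous: dicts mapping unit id to a value, or None if missing.
    Returns a new dict in which missing current values are filled where possible.
    """
    # Units reporting in both months define the trend (assumed form of the model).
    pairs = [
        (current[u], previous[u])
        for u in current
        if current[u] is not None and previous.get(u) is not None
    ]
    sum_prev = sum(p for _, p in pairs)
    trend = sum(c for c, _ in pairs) / sum_prev if sum_prev else None

    imputed = {}
    for unit, value in current.items():
        prev = previous.get(unit)
        if value is not None:
            imputed[unit] = value          # reported data are kept as is
        elif prev is not None and trend is not None:
            imputed[unit] = prev * trend   # general trend method
        elif prev is not None:
            imputed[unit] = prev           # direct replacement, last resort
        else:
            imputed[unit] = None           # no historical data available
    return imputed


current = {"a": 150.0, "b": 300.0, "c": None, "d": None}
previous = {"a": 100.0, "b": 200.0, "c": 50.0, "d": None}
result = impute_by_trend(current, previous)
```

Here the reporting units grew by a factor of 1.5, so unit "c" is imputed as 75.0, while unit "d" has no historical value and would fall through to the methods based on administrative data.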

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. The first is a general trend that is used for units with a simple structure, e.g. enterprises with only one establishment, and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
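The Horvitz-Thompson estimate for the survey portion can be sketched as below. The record layout, the domain labels and the function name are assumptions made for the example; the production system is the Generalized Estimation System, which is not reproduced here.

```python
from collections import defaultdict


def horvitz_thompson_totals(records):
    """Sum sales weighted by the inverse inclusion probability, by domain.

    records: iterable of (domain, sales, inclusion_probability) tuples.
    """
    totals = defaultdict(float)
    for domain, sales, prob in records:
        sampling_weight = 1.0 / prob   # number of population units represented
        totals[domain] += sampling_weight * sales
    return dict(totals)


sample = [
    ("grocery/ON", 500.0, 1.0),    # take-all unit, represents only itself
    ("grocery/ON", 40.0, 0.25),    # take-some unit, represents four units
    ("grocery/ON", 60.0, 0.25),
    ("clothing/QC", 30.0, 0.5),
]
totals = horvitz_thompson_totals(sample)
```

The domain attached to each record is its current classification, which may differ from the stratum in which the unit was originally sampled.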

For more information on the methodology for modeling sales from administrative data sources which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
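For a stratified simple random sample without replacement, the usual textbook estimator of the variance of an estimated total sums, over strata, N_h^2 (1 - n_h/N_h) s_h^2 / n_h, where N_h is the stratum population size, n_h the sample size and s_h^2 the sample variance. The sketch below applies that formula; the function name and input layout are assumptions, and it ignores the effects of imputation and of units changing domain, which a production system would have to handle.

```python
from statistics import variance


def stratified_total_variance(strata):
    """Estimate the variance of an estimated total under stratified SRSWOR.

    strata: list of (population_size, sample_values) pairs.
    """
    total = 0.0
    for pop_size, values in strata:
        n = len(values)
        if n == pop_size:
            continue  # take-all stratum: no sampling variance
        fpc = 1 - n / pop_size  # finite population correction
        total += pop_size ** 2 * fpc * variance(values) / n
    return total


strata = [
    (3, [500.0, 700.0, 900.0]),        # take-all stratum
    (10, [10.0, 20.0, 30.0, 40.0]),    # take-some stratum, 4 of 10 sampled
]
var_total = stratified_total_variance(strata)
std_error = var_total ** 0.5
```

In this example the take-some stratum contributes a variance of 2,500, so the standard error of the estimated total is 50.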

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time."[1] Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

[1] "A Note on the Seasonal Adjustment of Economic Time Series", Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data have been seasonally adjusted using the X-12-ARIMA[2] software. The technique first corrects the initial series for undesirable effects, such as the trading-day and Easter holiday effects, using a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using appropriate moving averages[3]. The moving averages used to estimate the main components, the trend and seasonality, are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
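The method used to force the annual totals is not specified here. As an illustration of the idea only, the sketch below uses simple pro-rata scaling, in which each month of a calendar year is multiplied by the same factor. This is an assumption made for the example, not the production method; benchmarking methods used in practice typically spread the adjustment more smoothly so that no step appears between December and January.

```python
def force_annual_total(adjusted, raw):
    """Scale one year of seasonally adjusted values to match the raw annual total.

    adjusted, raw: twelve monthly values for the same calendar year.
    Pro-rata scaling is an illustrative assumption, not the MRTS method.
    """
    factor = sum(raw) / sum(adjusted)
    return [value * factor for value in adjusted]


raw_year = [100.0] * 11 + [200.0]   # strong December peak, annual total 1,300
adjusted_year = [104.0] * 12        # seasonally adjusted, annual total 1,248
forced_year = force_annual_total(adjusted_year, raw_year)
```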

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, either a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonally adjusted total is derived by summing the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as nonsampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors of the estimate, i.e. between $11,520,000 and $12,480,000.
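The arithmetic in this example can be reproduced with a short sketch. The function names are hypothetical; the multipliers of one and two standard errors follow the approximate 68% and 95% levels quoted above.

```python
def standard_error(estimate, cv_percent):
    """Recover the standard error from an estimate and its CV in percent."""
    return estimate * cv_percent / 100


def confidence_interval(estimate, cv_percent, n_std_errors):
    """Interval of n_std_errors standard errors on either side of the estimate."""
    se = standard_error(estimate, cv_percent)
    return estimate - n_std_errors * se, estimate + n_std_errors * se


se = standard_error(12_000_000, 2)
interval_68 = confidence_interval(12_000_000, 2, 1)
interval_95 = confidence_interval(12_000_000, 2, 2)
```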

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate level and monthly trend for retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. As an important industry in the Canadian economy, governments are able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways; including the effects of price change on sales and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the latter as retail trade in constant dollars. The method of calculating the current dollar estimate is to aggregate the weighted value of sales for all retail outlets. The method of calculating the constant dollar estimate is to first adjust the sales values to a base year, using the Consumer Price Index, and then sum up the resulting values.

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply based conceptual framework in that establishments are groups into industries according to similarity in production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2012 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register ( BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employers with businesses, and businesses with no employees with annual sales that have a Goods and Services Tax (GST) or annual revenue that declares individual taxes.  annual sales greater than $30,000 that have a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, and serves as a cost centre or a discretionary expense centre for which data on all its costs including labour and depreciation can be reported by the business), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry groups (the mainly, but not only four digit level NAICS), and the geographical regions consisting of the provinces and territories, as well as three provincial sub-regions. We further stratify the population by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census), at most, two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure its relevancy.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked a respondent) must meet a minimum threshold level below which the response would be rejected and considered a unit nonresponse.  In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or tried to receive extracted data for them. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires that we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
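As an illustration of how the weighted and un-weighted rates can differ, the following Python sketch computes both for a handful of made-up units. The field names, the status labels and the figures are hypothetical and are not MRTS data; the sketch collapses the response statuses defined above into a single "responded" flag.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    weight: float  # estimation weight
    sales: float   # value of the variable of interest (reported or imputed)
    status: str    # hypothetical labels: "responded", "refused", "no_contact"


def weighted_response_rate(units):
    """Weighted sales of responding units over weighted sales of all units."""
    total = sum(u.weight * u.sales for u in units)
    responded = sum(u.weight * u.sales for u in units if u.status == "responded")
    return responded / total


def unweighted_response_rate(units):
    """Number of responding units over number of units in the list."""
    return sum(u.status == "responded" for u in units) / len(units)


units = [
    Unit(weight=1.0, sales=900.0, status="responded"),  # one large unit
    Unit(weight=5.0, sales=20.0, status="refused"),
    Unit(weight=5.0, sales=20.0, status="responded"),
    Unit(weight=5.0, sales=20.0, status="no_contact"),
]

weighted = weighted_response_rate(units)
unweighted = unweighted_response_rate(units)
print(f"weighted: {weighted:.1%}, un-weighted: {unweighted:.1%}")
```

In this example half of the units responded, but they account for most of the weighted sales, so the weighted rate is much higher than the un-weighted rate. This is why the weighted rate speaks to the quality of estimation and the un-weighted rate to the efficiency of collection.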

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1: Weighted response rates by NAICS, for all provinces and territories: October 2013

Weighted Response Rates
                                                 Total  Survey  Administrative
NAICS - Canada
Motor Vehicle and Parts Dealers                   93.2    93.8            66.7
Automobile Dealers                                95.0    95.3            65.1
New Car Dealers (Note 1)                          96.4    96.4
Used Car Dealers                                  72.4    74.0            65.1
Other Motor Vehicle Dealers                       70.3    71.3            61.1
Automotive Parts, Accessories and Tire Stores     86.4    88.9            70.7
Furniture and Home Furnishings Stores             89.6    92.9            60.2
Furniture Stores                                  93.7    96.0            55.5
Home Furnishings Stores                           82.9    87.1            62.7
Electronics and Appliance Stores                  89.1    90.1            53.8
Building Material and Garden Equipment Dealers    92.1    93.5            82.3
Food and Beverage Stores                          92.2    93.6            76.1
Grocery Stores                                    92.9    94.2            79.7
Grocery (except Convenience) Stores               95.5    96.6            83.2
Convenience Stores                                58.1    59.1            52.2
Specialty Food Stores                             68.7    72.0            55.3
Beer, Wine and Liquor Stores                      95.5    96.2            69.7
Health and Personal Care Stores                   90.5    90.7            88.6
Gasoline Stations                                 81.9    82.2            77.2
Clothing and Clothing Accessories Stores          89.4    91.0            37.8
Clothing Stores                                   90.3    92.0            30.8
Shoe Stores                                       90.0    90.6            54.6
Jewellery, Luggage and Leather Goods Stores       80.6    82.4            56.6
Sporting Goods, Hobby, Book and Music Stores      86.2    92.8            25.1
General Merchandise Stores                        98.3    98.9            35.8
Department Stores                                100.0   100.0
Other general merchandise stores                  96.9    97.9            35.8
Miscellaneous Store Retailers                     78.9    82.4            54.7
Total                                             91.1    92.3            70.7
Regions
Newfoundland and Labrador                         92.9    93.6            71.8
Prince Edward Island                              89.9    90.2            70.2
Nova Scotia                                       92.5    92.7            87.6
New Brunswick                                     88.4    89.9            64.5
Québec                                            91.0    92.3            75.2
Ontario                                           92.8    93.9            72.2
Manitoba                                          89.0    89.6            61.9
Saskatchewan                                      91.6    92.6            71.6
Alberta                                           88.3    89.9            58.1
British Columbia                                  90.7    91.8            69.4
Yukon Territory                                   86.1    86.1
Northwest Territories                             85.6    85.6
Nunavut                                           72.3    72.3

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, statistical editing is done after data collection and is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby the ratio of the respondent’s current month data to historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
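The ratio edit described above can be sketched in Python as follows. This is a minimal illustration of the commonly described form of the Hidiroglou-Berthelot method, not the MRTS implementation: the function name, the parameter values U, A and C, and the sample figures are all assumptions made for the example.

```python
import statistics


def hb_outliers(current, previous, U=0.4, A=0.05, C=4.0):
    """Flag outliers among period-to-period ratios (Hidiroglou-Berthelot style).

    current, previous: positive values for the same units in two periods.
    U, A, C: tuning parameters; the defaults are illustrative assumptions.
    """
    ratios = [c / p for c, p in zip(current, previous)]
    r_med = statistics.median(ratios)

    # Transform the ratios so that decreases and increases relative to the
    # median ratio are measured on a comparable scale centred on zero.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]

    # Size effect: with U > 0, the same relative change counts for more
    # when it occurs in a larger unit.
    e = [si * max(c, p) ** U for si, c, p in zip(s, current, previous)]

    q1, e_med, q3 = statistics.quantiles(e, n=4, method="inclusive")
    d_q1 = max(e_med - q1, abs(A * e_med))
    d_q3 = max(q3 - e_med, abs(A * e_med))
    lower, upper = e_med - C * d_q1, e_med + C * d_q3
    return [ei < lower or ei > upper for ei in e]


# Hypothetical sales for seven similar units; the sixth grows five-fold.
previous = [100.0, 120.0, 80.0, 200.0, 150.0, 90.0, 110.0]
current = [102.0, 118.0, 83.0, 205.0, 149.0, 450.0, 112.0]
flags = hb_outliers(current, previous)
print(flags)
```

Only the sixth unit is flagged. In line with the text, a flagged value is excluded from the ratio-based imputation models but is not itself adjusted at this stage.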

The second set of edits consists of the share of market edit. With this method, all respondents can be edited, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent is flagged as an outlier if its weighted contribution to the group’s total is too large.
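A share of market edit of this kind can be sketched as below. The source does not state the threshold that defines "too large", so the `max_share` value of 0.5, the function name and the figures are hypothetical.

```python
def share_of_market_outliers(weighted_sales, max_share=0.5):
    """Flag units whose weighted contribution to the group total exceeds max_share.

    max_share is a hypothetical threshold chosen for illustration.
    """
    total = sum(weighted_sales)
    if total == 0:
        return [False] * len(weighted_sales)
    return [ws / total > max_share for ws in weighted_sales]


# Weighted current-month sales for one industry-by-geography group.
group = [120.0, 95.0, 110.0, 2400.0, 130.0]
flags = share_of_market_outliers(group)
print(flags)
```

Because only current month data are needed, the check applies equally to units with no history, which is the case the ratio-based edit cannot cover.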

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. Values are assigned where they are missing on the record being edited, to ensure that estimates are of high quality and that each record is plausible and internally consistent. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
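The first type of historical imputation can be illustrated with a simple trend calculation. The sketch below assumes that the trend is the ratio of current to previous month totals among units that reported in the current month, which is one common way of forming such a trend; the source does not give the exact formula, and the function name and figures are hypothetical.

```python
def trend_impute(previous, current):
    """Fill missing current-month values (None) using the respondents' trend.

    The trend is the ratio of current to previous totals over the units
    that reported this month (an assumed form of the 'general trend').
    """
    pairs = [(p, c) for p, c in zip(previous, current) if c is not None]
    trend = sum(c for _, c in pairs) / sum(p for p, _ in pairs)
    return [c if c is not None else p * trend for p, c in zip(previous, current)]


previous = [100.0, 200.0, 50.0, 150.0]
current = [110.0, 220.0, None, 165.0]  # third unit did not respond
imputed = trend_impute(previous, current)
print(imputed)
```

Here the respondents grew by 10% in total, so the non-respondent's previous value of 50 is carried forward as roughly 55. In practice the units flagged as outliers by the statistical edits would be left out of the trend calculation.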

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that is used for units with a simple structure, e.g. enterprises with only one establishment, and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
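The weighting step for the survey portion can be sketched as follows. This is a minimal illustration of a Horvitz-Thompson total by domain, not the Generalized Estimation System itself; the record layout, domain labels, inclusion probabilities and sales values are hypothetical.

```python
from collections import defaultdict


def ht_totals(sample):
    """Horvitz-Thompson totals by domain: sum of sales / inclusion probability."""
    totals = defaultdict(float)
    for unit in sample:
        weight = 1.0 / unit["inclusion_prob"]  # sampling weight
        totals[unit["domain"]] += weight * unit["sales"]
    return dict(totals)


sample = [
    # take-all unit: selected with certainty, so it represents only itself
    {"domain": ("Grocery", "ON"), "sales": 5000.0, "inclusion_prob": 1.0},
    # take-some units: each represents ten units in the population
    {"domain": ("Grocery", "ON"), "sales": 300.0, "inclusion_prob": 0.1},
    {"domain": ("Grocery", "ON"), "sales": 250.0, "inclusion_prob": 0.1},
    {"domain": ("Shoes", "QC"), "sales": 400.0, "inclusion_prob": 0.25},
]
totals = ht_totals(sample)
print(totals)
```

The domain attached to each unit would be its most recent classification rather than its sampling stratum, so a unit that has changed industry or location contributes to its new domain while keeping its original sampling weight.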

For more information on the methodology for modeling sales from administrative data sources which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time."1 Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

1. «A Note on the Seasonal Adjustment of Economic Time Series», Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data are seasonally adjusted using the X-12-ARIMA2 software. The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages3. The moving averages used to estimate the main components—the trend and seasonality—are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as nonsampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors of the estimate, i.e. between $11,520,000 and $12,480,000.
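The arithmetic in this example can be reproduced with a few lines of Python. The function names are hypothetical; the figures are those given in the text.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation as a percentage: S(X) * 100 / X."""
    return standard_error * 100.0 / estimate


def confidence_interval(estimate, cv_pct, k=1):
    """Interval of k standard errors on either side of the estimate.

    k=1 corresponds to roughly 68% confidence and k=2 to roughly 95%.
    """
    standard_error = estimate * cv_pct / 100.0
    return estimate - k * standard_error, estimate + k * standard_error


estimate = 12_000_000
low68, high68 = confidence_interval(estimate, 2.0, k=1)
low95, high95 = confidence_interval(estimate, 2.0, k=2)
print(f"68%: {low68:,.0f} to {high68:,.0f}")
print(f"95%: {low95:,.0f} to {high95:,.0f}")
```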

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate level and monthly trend for retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. As an important industry in the Canadian economy, governments are able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways; including the effects of price change on sales and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the latter as retail trade in constant dollars. The method of calculating the current dollar estimate is to aggregate the weighted value of sales for all retail outlets. The method of calculating the constant dollar estimate is to first adjust the sales values to a base year, using the Consumer Price Index, and then sum up the resulting values.

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply based conceptual framework in that establishments are groups into industries according to similarity in production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2012 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register ( BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employers with businesses, and businesses with no employees with annual sales that have a Goods and Services Tax (GST) or annual revenue that declares individual taxes.  annual sales greater than $30,000 that have a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, which serve as cost centres or discretionary expense centres for which data on all costs, including labour and depreciation, can be reported by the business), future establishments, establishments with a missing or zero gross business income (GBI) value on the BR, and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)
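
The coverage rules above can be read as a simple filter over BR establishments. The following is a minimal sketch under stated assumptions: the `Establishment` record and its field names are hypothetical and do not reflect the actual BR layout; only the NAICS range and the exclusion list are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

RETAIL_NAICS_MIN = 441100
RETAIL_NAICS_MAX = 453999
NON_COVERED_PREFIXES = ("4541", "4542", "45431", "45439")


@dataclass
class Establishment:
    """Hypothetical, simplified view of a statistical establishment."""
    naics: int                 # six-digit NAICS code
    gbi: Optional[float]       # gross business income; None if missing
    is_ancillary: bool = False
    is_future: bool = False


def in_target_population(est: Establishment) -> bool:
    """Apply the exclusions listed above, then the retail NAICS range."""
    if est.is_ancillary or est.is_future:
        return False
    if est.gbi is None or est.gbi == 0:
        return False
    if str(est.naics).startswith(NON_COVERED_PREFIXES):
        return False
    return RETAIL_NAICS_MIN <= est.naics <= RETAIL_NAICS_MAX
```

Note that the non-covered codes all begin with 454 and therefore already fall outside the 441100 to 453999 range; the explicit prefix check is kept only to mirror the list in the text.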

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions. The population is further stratified by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
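
The ten percent limit on the take-none stratum can be illustrated with a small sketch. The procedure below (sort units by size and keep adding the smallest ones while their combined share stays within the limit) is an assumption made for illustration; the text does not state how the stratum boundary is actually determined, and the function name is hypothetical.

```python
def take_none_units(size_measures, max_share=0.10):
    """Return indices of the smallest units whose combined size measure
    stays at or below max_share of the total (illustrative rule only)."""
    total = sum(size_measures)
    order = sorted(range(len(size_measures)), key=lambda i: size_measures[i])
    chosen, running = [], 0.0
    for i in order:
        if running + size_measures[i] > max_share * total:
            break
        running += size_measures[i]
        chosen.append(i)
    return chosen


# Illustrative size measures for five units in one industry/region cell.
sizes = [800.0, 20.0, 140.0, 10.0, 30.0]
small = take_none_units(sizes)   # indices of units left unsurveyed
```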

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, the proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level, below which the response is rejected and considered a unit non-response. In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that were contacted or for which data extraction was attempted. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
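
The rates above all share the same structure: a sum (weighted) or a count (un-weighted) over units in a numerator set of response statuses, divided by the same quantity over a denominator set. The sketch below shows that structure only. The status labels (`reported`, `converted_refusal`, `refusal`, `no_contact`), the record layout and the figures are hypothetical and stand in for the response statuses i to ix defined above.

```python
def weighted_response_rate(units, responding_statuses):
    """Weighted sales of responding units over all weighted sales, in %."""
    total = sum(u["weight"] * u["sales"] for u in units)
    responded = sum(u["weight"] * u["sales"]
                    for u in units if u["status"] in responding_statuses)
    return 100.0 * responded / total if total else float("nan")


def unweighted_response_rate(units, numerator_statuses, denominator_statuses):
    """Count of units in the numerator set over count in the denominator set, in %."""
    num = sum(1 for u in units if u["status"] in numerator_statuses)
    den = sum(1 for u in units if u["status"] in denominator_statuses)
    return 100.0 * num / den if den else float("nan")


# Illustrative units only (not survey data).
units = [
    {"weight": 1.0, "sales": 1200.0, "status": "reported"},
    {"weight": 4.0, "sales": 50.0, "status": "refusal"},
    {"weight": 2.0, "sales": 100.0, "status": "converted_refusal"},
    {"weight": 4.0, "sales": 100.0, "status": "no_contact"},
]
responding = {"reported", "converted_refusal"}
everyone = responding | {"refusal", "no_contact"}

estimation_rate = weighted_response_rate(units, responding)
collection_rate = unweighted_response_rate(units, responding, everyone)
```

In this example the weighted rate (70%) is higher than the un-weighted rate (50%) because the responding units account for most of the weighted sales, which is the distinction the two families of rates are meant to capture.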

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Service Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1: Weighted response rates by NAICS, for all provinces and territories: September 2013

                                                  Weighted response rates (%)
                                                  Total   Survey   Administrative
NAICS - Canada
Motor Vehicle and Parts Dealers                    93.5     94.2     61.4
Automobile Dealers                                 95.7     96.0     59.7
New Car Dealers (Note 1)                           96.8     96.8
Used Car Dealers                                   76.9     80.0     59.7
Other Motor Vehicle Dealers                        72.7     71.2     82.0
Automotive Parts, Accessories and Tire Stores      79.9     85.6     42.2
Furniture and Home Furnishings Stores              88.9     94.1     40.8
Furniture Stores                                   95.1     97.0     58.3
Home Furnishings Stores                            78.0     88.2     32.5
Electronics and Appliance Stores                   90.4     90.9     73.9
Building Material and Garden Equipment Dealers     83.1     85.7     62.6
Food and Beverage Stores                           92.0     94.0     66.2
Grocery Stores                                     93.7     95.2     76.4
Grocery (except Convenience) Stores                95.2     96.1     84.6
Convenience Stores                                 74.4     83.0     24.5
Specialty Food Stores                              63.8     72.3     28.6
Beer, Wine and Liquor Stores                       92.1     94.2     18.8
Health and Personal Care Stores                    90.9     92.0     74.5
Gasoline Stations                                  84.4     85.5     66.8
Clothing and Clothing Accessories Stores           90.5     91.9     40.9
Clothing Stores                                    91.9     93.4     36.6
Shoe Stores                                        89.7     91.0     13.1
Jewellery, Luggage and Leather Goods Stores        80.2     80.8     69.0
Sporting Goods, Hobby, Book and Music Stores       88.4     93.0     42.7
General Merchandise Stores                         98.6     98.8     84.2
Department Stores                                 100.0    100.0
Other general merchandise stores                   97.5     97.7     84.2
Miscellaneous Store Retailers                      84.9     89.1     42.6
Total                                              91.1     92.7     62.2
Regions
Newfoundland and Labrador                          89.7     90.4     65.9
Prince Edward Island                               87.0     88.0     21.3
Nova Scotia                                        91.4     92.5     59.6
New Brunswick                                      85.3     87.2     56.8
Québec                                             89.7     91.6     64.0
Ontario                                            93.5     95.0     62.1
Manitoba                                           89.5     89.7     79.2
Saskatchewan                                       91.1     93.0     52.7
Alberta                                            89.3     90.7     62.6
British Columbia                                   91.3     92.9     60.0
Yukon Territory                                    82.7     82.7
Northwest Territories                              84.8     84.8
Nunavut                                            73.1     73.1

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
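
A minimal sketch of a ratio edit of this kind is given below, following the published form of the method: ratios are centred on their median with a symmetric transformation, scaled by a size effect, and compared with bounds built from the quartiles. The parameter values `u`, `a` and `c`, the function name and the data are assumptions made for illustration; the settings and grouping actually used in the MRTS are not stated in this document.

```python
import statistics


def hb_outliers(current, previous, u=0.5, a=0.05, c=4.0):
    """Flag outliers among units in one comparison group.

    current, previous: positive values for the same units in two periods.
    u, a, c: tuning parameters (assumed values, not the MRTS settings).
    Returns a list of booleans, True where the unit is flagged.
    """
    ratios = [x / y for x, y in zip(current, previous)]
    r_med = statistics.median(ratios)

    # Symmetric transformation of the ratios around their median.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]

    # Size effect: larger units get more importance.
    e = [si * max(x, y) ** u for si, x, y in zip(s, current, previous)]

    q1, e_med, q3 = statistics.quantiles(e, n=4, method="inclusive")
    d_q1 = max(e_med - q1, abs(a * e_med))
    d_q3 = max(q3 - e_med, abs(a * e_med))

    lower, upper = e_med - c * d_q1, e_med + c * d_q3
    return [ei < lower or ei > upper for ei in e]


# Illustrative data: the last unit triples its sales month over month.
previous = [100.0, 200.0, 150.0, 120.0, 300.0, 250.0, 180.0, 90.0, 400.0]
current = [100.0, 204.0, 147.0, 121.2, 297.0, 257.5, 174.6, 90.0, 1200.0]
flags = hb_outliers(current, previous)
```

Only the last unit is flagged here; the other units move by a few percent in either direction and stay inside the bounds.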

The second set of edits consists of an edit known as the share of market edit. With this method, one is able to edit all respondents, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, if the weighted contribution of a respondent to the group’s total is too large, the response is flagged as an outlier.
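
The share of market edit can be sketched as follows. The threshold `max_share` is a hypothetical value chosen for illustration; the document does not state what share is considered too large.

```python
def share_of_market_outliers(sales, weights, max_share=0.5):
    """Flag units whose weighted sales exceed max_share of the group total."""
    contributions = [w * y for y, w in zip(sales, weights)]
    total = sum(contributions)
    if total == 0:
        return [False] * len(contributions)
    return [c / total > max_share for c in contributions]


# Illustrative group of four respondents using current month data only.
sales = [100.0, 120.0, 5000.0, 90.0]
weights = [10.0, 10.0, 1.0, 10.0]
flags = share_of_market_outliers(sales, weights)
```

In this example the third unit contributes roughly 62% of the weighted group total and is the only one flagged.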

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. This is done by assigning values when they are missing on the record being edited to ensure that estimates are of high quality and that a plausible, internal consistency is created. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
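
To illustrate the first type of method, the sketch below imputes a non-respondent's value by applying a trend, estimated from responding units, to the non-respondent's own historical value. Computing the trend as a ratio of sums over donors is an assumption made for illustration, as are the function and parameter names; the document does not give the exact form of the trend.

```python
def general_trend_impute(historical_value, donors_current, donors_historical):
    """Impute a current value as the unit's historical value times a trend.

    The trend is taken as the ratio of current to historical totals over
    responding units (donors) that were not flagged as outliers.
    """
    trend = sum(donors_current) / sum(donors_historical)
    return historical_value * trend


# Illustrative values: donors grew by 10% since the historical period.
imputed = general_trend_impute(500.0, [110.0, 220.0], [100.0, 200.0])
```

The third type of method described above corresponds to the special case where the trend is taken to be 1, so that the historical value is carried forward unchanged.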

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. The first is a general trend that is used for units with a simple structure (e.g., enterprises with only one establishment), and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
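
The weighting and domain summation described above can be sketched in a few lines. The record layout (domain, sales, inclusion probability) and the figures are assumptions for illustration; the production estimates are computed with the Generalized Estimation System, not with this code.

```python
from collections import defaultdict


def horvitz_thompson_totals(records):
    """Estimate total sales by domain.

    records: iterable of (domain, sales, inclusion_probability) tuples,
    where domain is the unit's current industry/geography classification.
    """
    totals = defaultdict(float)
    for domain, sales, pi in records:
        totals[domain] += sales / pi   # sampling weight = 1 / pi
    return dict(totals)


# Illustrative sample: one take-all unit and two take-some units.
records = [
    (("Grocery Stores", "Ontario"), 1000.0, 1.0),
    (("Grocery Stores", "Ontario"), 200.0, 0.25),
    (("Shoe Stores", "Quebec"), 50.0, 0.1),
]
totals = horvitz_thompson_totals(records)
```

A unit sampled with probability 0.25 carries a weight of 4, so its 200 in sales represents 800 in the population; the take-all unit represents only itself.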

For more information on the methodology for modeling sales from administrative data sources, which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
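
For a stratified simple random sample without replacement, the textbook variance of an estimated total adds, over strata, the squared stratum population size times the finite population correction times the sample variance divided by the stratum sample size. The sketch below applies that standard formula; it ignores domain estimation, non-response and imputation, so it should be read as an illustration rather than as the survey's production variance calculation. The function name and the figures are hypothetical.

```python
import statistics


def stratified_total_and_variance(strata):
    """Estimated total and its variance under stratified SRS without replacement.

    strata: list of (N_h, sample_values) pairs, one per stratum, where N_h
    is the stratum population size.
    """
    total, variance = 0.0, 0.0
    for N_h, y in strata:
        n_h = len(y)
        total += N_h * statistics.fmean(y)
        if n_h < N_h:   # a take-all stratum has no sampling variance
            s2 = statistics.variance(y)   # requires n_h >= 2
            variance += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    return total, variance


# Illustrative strata: a take-all stratum and a take-some stratum.
strata = [
    (3, [100.0, 200.0, 300.0]),   # all 3 units sampled
    (10, [10.0, 20.0]),           # 2 of 10 units sampled
]
total, variance = stratified_total_and_variance(strata)
standard_error = variance ** 0.5
```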

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time." [1] Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

[1] "A Note on the Seasonal Adjustment of Economic Time Series", Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data have been seasonally adjusted using the X-12-ARIMA software [2]. The technique essentially consists of first correcting the initial series for undesirable effects, such as the trading-day and Easter holiday effects, using a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated, if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages [3]. The moving averages used to estimate the main components, the trend and seasonality, are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.
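
As an illustration of how a moving average eliminates an undesirable component, the sketch below applies a centred 2x12 moving average to a monthly series. This filter is commonly used for a first trend-cycle estimate in X-11 type procedures, but the specific filters applied to the MRTS series are not stated in this document, so the choice is an assumption. The function name and the series are illustrative.

```python
def centered_2x12_moving_average(series):
    """Centred 2x12 moving average for a monthly series.

    Returns a list the same length as series, with None for the first
    and last six months, where the full 13-term window is unavailable.
    """
    weights = [1 / 24] + [1 / 12] * 11 + [1 / 24]
    out = [None] * len(series)
    for t in range(6, len(series) - 6):
        window = series[t - 6:t + 7]
        out[t] = sum(w * x for w, x in zip(weights, window))
    return out


# Illustrative series: a flat level of 100 plus a seasonal pattern that
# sums to zero over twelve months.
seasonal = [10, -5, 3, -8, 0, 7, -2, 4, -6, 1, -9, 5]
series = [100 + seasonal[t % 12] for t in range(36)]
trend = centered_2x12_moving_average(series)
```

Because each calendar month receives a total weight of one twelfth, a stable seasonal pattern cancels out and the filter returns the underlying level of 100 wherever the full window is available.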

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as nonsampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within the interval extending one standard error on either side of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within the interval extending two standard errors on either side of the estimate, i.e. between $11,520,000 and $12,480,000.
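
The arithmetic in this example can be reproduced with a short sketch. The function names are illustrative only and are not part of any Statistics Canada system; the multipliers 1 and 2 correspond to the approximate 68% and 95% confidence levels quoted in the text.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation as a percentage: S(X) * 100% / X."""
    return standard_error * 100.0 / estimate


def confidence_interval(estimate, cv_pct, k=1):
    """Interval of k standard errors on either side of the estimate.

    k=1 gives roughly 68% confidence and k=2 roughly 95%.
    """
    standard_error = estimate * cv_pct / 100.0
    return (estimate - k * standard_error, estimate + k * standard_error)


# Example from the text: an estimate of $12,000,000 with a CV of 2%.
low_68, high_68 = confidence_interval(12_000_000, 2.0, k=1)
low_95, high_95 = confidence_interval(12_000_000, 2.0, k=2)
```

With these inputs the one-standard-error interval runs from $11,760,000 to $12,240,000 and the two-standard-error interval from $11,520,000 to $12,480,000.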

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

 
 

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate level and monthly trend for retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. As an important industry in the Canadian economy, governments are able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after-sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Constant Dollars: The value of retail trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the latter as retail trade in constant dollars. The method of calculating the current dollar estimate is to aggregate the weighted value of sales for all retail outlets. The method of calculating the constant dollar estimate is to first adjust the sales values to a base year, using the Consumer Price Index, and then sum up the resulting values.
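
As a minimal sketch of the constant dollar calculation, the adjustment to a base year can be written as a simple deflation, assuming a price index scaled so that the base year equals 100. The function name, the index scaling and the level of detail at which the index is applied are assumptions made for illustration; the survey's actual deflation procedure may differ.

```python
def constant_dollar_total(current_sales, price_indexes, base_index=100.0):
    """Deflate each current dollar value to the base year, then sum.

    current_sales: sales values in current dollars.
    price_indexes: matching price index values (base year = base_index).
    """
    return sum(
        sales * base_index / index
        for sales, index in zip(current_sales, price_indexes)
    )


# Hypothetical values: two groups of sales with indexes of 110 and 105.
total = constant_dollar_total([110.0, 210.0], [110.0, 105.0])
```

In this hypothetical case the current dollar total is 320, while the constant dollar total is 300 once price change is removed.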

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2012 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all businesses with employees, and businesses with no employees that have annual sales greater than $30,000 and a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, which serve as cost centres or discretionary expense centres for which data on all costs, including labour and depreciation, can be reported by the business), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions. The population is further stratified by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
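
The selection step of such a design can be sketched as follows. This is an illustration under simplifying assumptions, not the survey's production procedure: each unit stands for one cluster, the stratum labels and sampling fractions are hypothetical, a take-all stratum is represented by a fraction of 1 and a take-none stratum by a fraction of 0.

```python
import random


def select_stratified_sample(frame, sampling_fractions, seed=0):
    """Simple random sampling without replacement within each stratum.

    frame: list of (unit_id, stratum) pairs.
    sampling_fractions: dict mapping stratum -> fraction between 0 and 1.
    Returns a dict mapping each selected unit_id to its sampling weight
    (stratum population size divided by stratum sample size).
    """
    rng = random.Random(seed)
    units_by_stratum = {}
    for unit_id, stratum in frame:
        units_by_stratum.setdefault(stratum, []).append(unit_id)

    sample = {}
    for stratum, units in units_by_stratum.items():
        fraction = sampling_fractions.get(stratum, 0.0)
        n = round(fraction * len(units))
        if n == 0:
            continue  # take-none stratum: nothing is surveyed
        weight = len(units) / n
        for unit_id in rng.sample(units, n):
            sample[unit_id] = weight
    return sample


# Hypothetical frame: 4 take-all, 10 take-some and 6 take-none units.
frame = (
    [(f"A{i}", "take-all") for i in range(4)]
    + [(f"S{i}", "take-some") for i in range(10)]
    + [(f"N{i}", "take-none") for i in range(6)]
)
fractions = {"take-all": 1.0, "take-some": 0.5, "take-none": 0.0}
sample = select_stratified_sample(frame, fractions)
```

Every take-all unit is selected with a weight of 1, half of the take-some units are selected with a weight of 2, and no take-none unit enters the sample.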

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level below which the response would be rejected and considered a unit nonresponse. In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or for which we tried to obtain extracted data. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for the exclusion of units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
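
The difference between the weighted and un-weighted rates can be illustrated with a short sketch. The record layout and field names below are hypothetical, and the response statuses (i to ix above) are reduced to a single responded/not responded flag for simplicity; sales values for non-respondents are assumed to be imputed values.

```python
def weighted_response_rate(units):
    """Share (in %) of weighted sales coming from responding units."""
    total = sum(u["weight"] * u["sales"] for u in units)
    responded = sum(u["weight"] * u["sales"] for u in units if u["responded"])
    return 100.0 * responded / total


def unweighted_response_rate(units):
    """Share (in %) of units that responded, regardless of their size."""
    return 100.0 * sum(1 for u in units if u["responded"]) / len(units)


# Hypothetical units: one large respondent and two small units.
units = [
    {"weight": 1.0, "sales": 900.0, "responded": True},
    {"weight": 10.0, "sales": 5.0, "responded": False},
    {"weight": 10.0, "sales": 5.0, "responded": True},
]
weighted = weighted_response_rate(units)
unweighted = unweighted_response_rate(units)
```

Here two of the three units responded, an un-weighted rate of about 66.7%, but they account for 95% of weighted sales, which is why the weighted rate is the one used to assess the quality of estimation.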

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1
Weighted response rates by NAICS, for all provinces/territories: August 2013

                                                  Weighted Response Rates
NAICS - Canada                                   Total  Survey  Administrative
Motor Vehicle and Parts Dealers                   92.7    93.4            69.9
Automobile Dealers                                94.0    94.3            60.0
New Car Dealers¹                                  95.6    95.6
Used Car Dealers                                  67.8    69.2            60.0
Other Motor Vehicle Dealers                       81.0    82.3            76.0
Automotive Parts, Accessories and Tire Stores     88.2    91.4            68.2
Furniture and Home Furnishings Stores             88.4    92.3            50.5
Furniture Stores                                  93.4    95.4            53.7
Home Furnishings Stores                           79.3    85.7            48.9
Electronics and Appliance Stores                  87.3    88.1            48.0
Building Material and Garden Equipment Dealers    88.1    92.1            54.3
Food and Beverage Stores                          87.8    90.5            56.2
Grocery Stores                                    90.8    93.5            62.3
Grocery (except Convenience) Stores               93.5    95.9            67.3
Convenience Stores                                55.9    61.1            24.4
Specialty Food Stores                             67.5    76.0            31.7
Beer, Wine and Liquor Stores                      81.4    82.8            21.8
Health and Personal Care Stores                   88.3    88.8            81.2
Gasoline Stations                                 80.9    81.7            67.4
Clothing and Clothing Accessories Stores          89.6    91.0            43.7
Clothing Stores                                   90.6    91.9            44.0
Shoe Stores                                       91.6    93.0
Jewellery, Luggage and Leather Goods Stores       79.7    81.3            56.0
Sporting Goods, Hobby, Book and Music Stores      85.1    90.5            33.6
General Merchandise Stores                        99.0    99.1            87.8
Department Stores                                100.0   100.0
Other General Merchandise Stores                  98.1    98.3            87.8
Miscellaneous Store Retailers                     82.0    86.4            35.9
Total                                             89.5    91.1            59.5

Regions                                          Total  Survey  Administrative
Newfoundland and Labrador                         90.0    91.7            13.6
Prince Edward Island                              90.6    91.5            19.5
Nova Scotia                                       91.3    92.7            58.3
New Brunswick                                     88.2    89.9            64.1
Québec                                            89.3    92.2            54.7
Ontario                                           90.0    91.2            66.2
Manitoba                                          89.7    90.2            61.4
Saskatchewan                                      90.6    91.8            61.2
Alberta                                           87.7    89.2            58.5
British Columbia                                  89.9    91.5            58.1
Yukon Territory                                   85.9    85.9
Northwest Territories                             82.1    82.1
Nunavut                                           72.3    72.3

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data is regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
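
A minimal sketch of a ratio edit of this kind is given below, following the general form of the Hidiroglou-Berthelot method: ratios are centred on their median with a symmetric transformation, scaled by unit size, and compared with bounds built from the quartiles. The parameter values U, A and C, the function name and the use of a single edit group are assumptions made for illustration; the values used in the MRTS are not given in this document. All inputs are assumed to be strictly positive.

```python
import statistics


def hb_outlier_flags(current, previous, U=0.5, A=0.05, C=4.0):
    """Flag outlying current/previous ratios within one edit group.

    U controls the importance given to unit size, A guards against
    quartiles that are too close to the median, and C sets the width
    of the acceptance interval. All three are illustrative values.
    """
    ratios = [c / p for c, p in zip(current, previous)]
    r_med = statistics.median(ratios)

    # Symmetric transformation so that ratios below and above the
    # median are treated alike.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]

    # Size-adjusted effects: large units get more attention.
    effects = [
        s_i * max(c, p) ** U for s_i, c, p in zip(s, current, previous)
    ]

    q1, e_med, q3 = statistics.quantiles(effects, n=4)
    d_q1 = max(e_med - q1, abs(A * e_med))
    d_q3 = max(q3 - e_med, abs(A * e_med))
    lower, upper = e_med - C * d_q1, e_med + C * d_q3

    return [e < lower or e > upper for e in effects]


# Hypothetical group: stable units plus one that grew fivefold.
previous = [100.0] * 9
current = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 100.0, 500.0]
flags = hb_outlier_flags(current, previous)
```

Only the last unit, whose sales went from 100 to 500, is flagged; the units with month-to-month changes of a few percent fall inside the acceptance interval.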

The second set of edits consists of an edit known as the share of market edit. With this method, one is able to edit all respondents, even those where historical and auxiliary data is unavailable. The method relies on current month data only. Therefore, within a group of respondents that are similar in terms of industrial group and/or geography, if the weighted contribution of a respondent to the group’s total is too large, it will be flagged as an outlier.
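
The share of market edit can be sketched in a few lines. The threshold of 50% used below is purely hypothetical, since the document does not state what share is considered too large, and the function name is likewise illustrative.

```python
def share_of_market_flags(weighted_sales, max_share=0.5):
    """Flag units whose weighted sales exceed max_share of the group total.

    weighted_sales: current month weighted sales for one edit group.
    max_share: hypothetical threshold, expressed as a proportion.
    """
    total = sum(weighted_sales)
    return [total > 0 and value / total > max_share for value in weighted_sales]


# Hypothetical group in which one unit dominates the total.
flags_share = share_of_market_flags([10.0, 20.0, 30.0, 940.0])
```

In this group the last unit contributes 94% of the weighted total and is flagged, without any reference to historical or auxiliary data.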

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as an outlier will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. This is done by assigning values when they are missing on the record being edited to ensure that estimates are of high quality and that a plausible, internal consistency is created. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
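
The first type of method, the general trend, can be sketched as a ratio of totals. This is an assumption about its general form rather than the exact MRTS formula: the trend is taken as the ratio of current to historical sales among respondents not flagged as outliers, and is applied to the historical value of each non-respondent. The record layout is hypothetical.

```python
def impute_by_trend(records):
    """Impute missing current values with a trend from clean respondents.

    records: list of dicts with 'previous', 'current' (None if missing)
    and an optional 'outlier' flag set by statistical editing.
    Returns the list of current values with missing ones imputed.
    """
    donors = [
        r for r in records
        if r["current"] is not None and not r.get("outlier", False)
    ]
    trend = sum(r["current"] for r in donors) / sum(r["previous"] for r in donors)

    # Reported values, including outliers, are kept as reported.
    return [
        r["current"] if r["current"] is not None else trend * r["previous"]
        for r in records
    ]


# Hypothetical group: two clean respondents, one non-respondent, one outlier.
records = [
    {"previous": 100.0, "current": 110.0},
    {"previous": 200.0, "current": 220.0},
    {"previous": 50.0, "current": None},
    {"previous": 100.0, "current": 1000.0, "outlier": True},
]
completed = impute_by_trend(records)
```

The trend of 1.1 is computed from the two clean respondents only, so the non-respondent is imputed at about 55, while the outlier keeps its reported value but does not influence the trend.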

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that is used for units with a simple structure, e.g. enterprises with only one establishment, and the second, called median-average, is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
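
The Horvitz-Thompson estimate for the survey portion can be sketched as follows. The tuple layout and domain labels are hypothetical; the only element taken from the text is that each response is weighted by the inverse of its inclusion probability and the weighted values are summed by domain.

```python
from collections import defaultdict


def horvitz_thompson_totals(sample):
    """Estimate total sales by domain from sampled units.

    sample: iterable of (domain, sales, inclusion_probability) tuples.
    Each unit contributes sales / inclusion_probability, i.e. its sales
    multiplied by its sampling weight.
    """
    totals = defaultdict(float)
    for domain, sales, inclusion_probability in sample:
        totals[domain] += sales / inclusion_probability
    return dict(totals)


# Hypothetical sample: a take-all unit and two take-some units.
estimates = horvitz_thompson_totals([
    ("grocery-ON", 100.0, 1.0),   # take-all: represents itself only
    ("grocery-ON", 40.0, 0.5),    # take-some: represents 2 units
    ("shoes-QC", 10.0, 0.25),     # take-some: represents 4 units
])
```

The first domain is estimated at 180 (100 + 40 x 2) and the second at 40 (10 x 4).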

For more information on the methodology for modeling sales from administrative data sources which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
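For a stratified simple random sample without replacement, the textbook variance estimator of an estimated total is the sum over strata of N_h^2 (1 - n_h/N_h) s_h^2 / n_h, where N_h is the stratum population size, n_h the stratum sample size and s_h^2 the sample variance within the stratum. The sketch below applies that formula; the function name and data are hypothetical, and the production system may apply adjustments that are not shown here.

```python
from statistics import variance


def stratified_total_variance(strata):
    """Variance of an estimated total under stratified SRS without replacement.

    `strata` is a list of (N_h, sampled_values) pairs, where N_h is the
    number of units in the stratum population.
    """
    total_var = 0.0
    for N_h, values in strata:
        n_h = len(values)
        if n_h == N_h:
            continue  # take-all stratum: every unit observed, no sampling variance
        if n_h < 2:
            raise ValueError("need at least two sampled units per take-some stratum")
        s2 = variance(values)  # sample variance with divisor n_h - 1
        total_var += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    return total_var


# Hypothetical design: a take-all stratum of 3 units and a take-some
# stratum in which 4 of 20 units were sampled.
strata = [
    (3, [500.0, 700.0, 900.0]),
    (20, [40.0, 60.0, 50.0, 70.0]),
]
var_total = stratified_total_variance(strata)
std_error = var_total ** 0.5
```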

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates. Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time." [1] Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackouts or other unexpected events causing variations in respondents’ activities).

[1] "A Note on the Seasonal Adjustment of Economic Time Series", Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data have been seasonally adjusted using the X-12-ARIMA software. The technique consists essentially of first correcting the initial series for undesirable effects, such as the trading-day and Easter holiday effects, using a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series, pre-adjusted and extrapolated if applicable, is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using appropriate moving averages. The moving averages used to estimate the main components (the trend and seasonality) are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.
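To illustrate how a moving average can eliminate one component, the sketch below applies a centred 2x12 moving average, the filter conventionally used for a first trend-cycle estimate on monthly series in X-11. It is a single smoothing step under simplified assumptions, not the full iterative X-11 procedure; the function name and the artificial series are hypothetical.

```python
def centered_2x12_moving_average(series):
    """Return a first trend estimate for a monthly series.

    Uses 13 symmetric weights (1/24 at both ends, 1/12 in between).
    Positions where the window does not fit are returned as None.
    """
    weights = [1 / 24] + [1 / 12] * 11 + [1 / 24]
    half = 6
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        trend[t] = sum(w * x for w, x in zip(weights, window))
    return trend


# Artificial series: a linear trend plus a seasonal pattern that repeats
# every 12 months and sums to zero over a year.
pattern = [-3, -2, -1, 0, 1, 2, 3, 2, 1, 0, -1, -2]
series = [t + pattern[t % 12] for t in range(36)]
trend = centered_2x12_moving_average(series)
```

On this artificial series the filter removes the stable seasonal pattern and returns the linear trend at every interior month. The first and last six months have no estimate, which is one reason the series is extrapolated with the ARIMA model before the filters are applied.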

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
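The simplest way to picture this forcing step is a pro-rata adjustment, sketched below: every month of the seasonally adjusted series in a given year is scaled by the same factor so that the yearly totals agree. This is an assumption made for illustration; the method actually applied in production is not described in this statement and is likely more refined, for example to avoid artificial steps between December and January. The function name and figures are hypothetical.

```python
def force_annual_total(adjusted_months, raw_months):
    """Scale one year of seasonally adjusted values to match the raw annual total."""
    adjusted_total = sum(adjusted_months)
    if adjusted_total == 0:
        raise ValueError("cannot scale a series whose annual total is zero")
    factor = sum(raw_months) / adjusted_total
    return [value * factor for value in adjusted_months]


# Hypothetical year: raw sales total 1200, adjusted series totals 1188.
raw_year = [80, 85, 95, 100, 105, 100, 100, 100, 95, 100, 110, 130]
adjusted_year = [99.0] * 12
forced_year = force_annual_total(adjusted_year, raw_year)
```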

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as nonsampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error on either side of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors on either side of the estimate, i.e. between $11,520,000 and $12,480,000.
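The arithmetic of this example can be reproduced with a short sketch. The function names are hypothetical; the intervals use one and two standard errors, matching the approximate 68% and 95% confidence levels quoted above.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation as a percentage: S(X) * 100 / X."""
    return standard_error * 100.0 / estimate


def confidence_interval(estimate, cv_pct, n_standard_errors):
    """Interval of n standard errors on either side of the estimate."""
    standard_error = estimate * cv_pct / 100.0
    margin = n_standard_errors * standard_error
    return (estimate - margin, estimate + margin)


estimate = 12_000_000
ci_68 = confidence_interval(estimate, 2.0, 1)  # one standard error
ci_95 = confidence_interval(estimate, 2.0, 2)  # two standard errors
```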

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Non-residential building construction price index — Weights for each census metropolitan area

| Year | Halifax, Nova Scotia | Montréal, Quebec | Ottawa-Gatineau, Ontario part, Ontario/Quebec | Toronto, Ontario | Calgary, Alberta | Edmonton, Alberta | Vancouver, British Columbia | Seven census metropolitan area composite |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1992 | 1.8 | 18.9 | 6.1 | 50.3 | 3.9 | 5.3 | 13.7 | 100 |
| 1993 | 1.9 | 18.2 | 8.4 | 41.3 | 5.1 | 6.4 | 18.7 | 100 |
| 1994 | 1.6 | 15.6 | 9.9 | 35.0 | 5.1 | 7.3 | 25.5 | 100 |
| 1995 | 1.4 | 17.1 | 8.8 | 31.3 | 4.7 | 6.9 | 29.8 | 100 |
| 1996 | 1.3 | 16.2 | 7.2 | 30.1 | 5.1 | 5.1 | 35.0 | 100 |
| 1997 | 1.1 | 14.3 | 6.6 | 31.6 | 6.2 | 5.1 | 35.1 | 100 |
| 1998 | 1.0 | 12.9 | 6.1 | 34.4 | 8.3 | 5.4 | 31.9 | 100 |
| 1999 | 1.0 | 12.6 | 5.9 | 39.3 | 12.2 | 6.8 | 22.2 | 100 |
| 2000 | 1.4 | 12.2 | 5.7 | 44.7 | 11.6 | 6.4 | 18.0 | 100 |
| 2001 | 2.2 | 13.3 | 6.9 | 43.2 | 11.6 | 6.7 | 16.1 | 100 |
| 2002 | 1.9 | 17.3 | 7.5 | 43.3 | 9.4 | 6.6 | 14.0 | 100 |
| 2003 | 1.5 | 20.6 | 7.9 | 39.1 | 9.5 | 7.1 | 14.3 | 100 |
| 2004 | 0.9 | 19.9 | 6.6 | 43.7 | 9.7 | 6.8 | 12.4 | 100 |
| 2005 | 1.5 | 16.4 | 5.6 | 48.4 | 9.6 | 6.4 | 12.1 | 100 |
| 2006 | 1.9 | 14.0 | 6.1 | 45.5 | 13.3 | 6.8 | 12.4 | 100 |
| 2007 | 2.1 | 13.5 | 5.9 | 37.2 | 17.2 | 8.1 | 16.0 | 100 |
| 2008 | 2.0 | 14.1 | 5.5 | 31.3 | 22.1 | 8.6 | 16.4 | 100 |
| 2009 | 2.1 | 13.9 | 4.2 | 31.4 | 22.0 | 10.0 | 16.4 | 100 |
| 2010 | 2.2 | 13.6 | 4.8 | 32.4 | 21.8 | 11.1 | 14.1 | 100 |
| 2011 | 2.6 | 13.1 | 5.4 | 35.3 | 16.8 | 13.4 | 13.4 | 100 |
| 2012 | 2.3 | 15.5 | 5.8 | 38.9 | 13.7 | 11.6 | 12.2 | 100 |
| 2013 | 2.4 | 16.3 | 5.7 | 40.4 | 10.9 | 11.0 | 13.3 | 100 |

Response rate for the Survey on Sexual Misconduct in the Canadian Armed Forces, 2016

Subpopulation                                 Response rate (percent)
Total                                         53
Sex
  Male                                        51
  Female                                      62
Age Group
  <25                                         22
  25-29                                       39
  30-34                                       51
  35-39                                       61
  40-44                                       66
  45-49                                       71
  50+                                         72
Force
  Regular Forces                              61
  Primary Reservists                          36
Rank
  Senior Officer                              78
  Junior Officer                              66
  Senior NCM                                  70
  Junior NCM                                  41
Capability Component Description (CCD)
  Army                                        45
  Navy                                        46
  Air                                         70
  CMP                                         57
  Other                                       66
Environment
  Air                                         70
  Land                                        48
  Sea                                         50

Survey on Living with Chronic Diseases in Canada

User Guide

November 2011


1.0 Introduction

The Survey on Living with Chronic Diseases in Canada (SLCDC) is a cross-sectional survey that collects information related to the experiences of Canadians with chronic health conditions. Sponsored by the Public Health Agency of Canada (PHAC), the SLCDC takes place every two years, with two chronic diseases covered in each survey cycle. The 2011 survey focused on diabetes and respiratory conditions: asthma and chronic obstructive pulmonary disease (COPD).

The target population for the SLCDC is Canadians living in private dwellings in the ten provinces. Residents of the three territories, persons living on Indian Reserves, residents of institutions, and full-time members of the Canadian Armed Forces are excluded from this survey. The 2011 SLCDC included persons aged 20 years or older with diabetes, persons aged 12 years or older with asthma, and persons aged 35 years or older with COPD.

There were two data collection periods for the 2011 SLCDC: October and November of 2010 and March and April of 2011.

The purpose of this document is to facilitate the manipulation of the SLCDC data file and to describe the methodology used.

Any questions about the data sets and their use or about ordering of custom tabulations should be directed to:

Client Custom Services, Health Statistics Division: 613–951–1746
E–mail: statcan.hd-ds.statcan@statcan.gc.ca

2.0 Background

The Survey on Living with Chronic Diseases in Canada (SLCDC) is a follow-up to the Canadian Community Health Survey (CCHS), an annual cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. Since the CCHS relies upon a large sample and identifies several diagnosed chronic conditions, it serves both as sample frame for the SLCDC and a source of additional health and socio-demographic information.

The central objective of the SLCDC is to gather information related to the experiences of persons living with chronic diseases, including diagnosis of a chronic health condition, care received from health professionals, medication use and self-management of their condition.

The survey was sponsored by the Public Health Agency of Canada.

3.0 Objective

The purpose of the SLCDC is to provide information on the impact of chronic disease on individuals, as well as how people with chronic disease manage their health condition. More specifically, the survey had the following objectives:

  • To assess the impact of chronic health conditions on quality of life
  • To provide more information on how people manage their chronic health conditions
  • To identify health behaviours which influence disease outcomes
  • To identify barriers to self-management of chronic health conditions

4.0 Survey content

This section provides a general discussion of the consultation process used in survey content development and gives a summary of the final content selected for inclusion in the SLCDC.

The SLCDC content was developed based on an ongoing consultation process between the Health Statistics Division at Statistics Canada and the Public Health Agency of Canada (PHAC), with significant input from members of expert advisory groups in the areas of diabetes and respiratory conditions. Content selection was based on objectives and data requirements specified by PHAC. Members of the PHAC project team were consulted on a regular basis throughout development and testing of the SLCDC questionnaires. The end result of the consultation process was two SLCDC questionnaires: (1) a diabetes-specific questionnaire and (2) an asthma and COPD-specific questionnaire.

A summary describing each of the modules on the diabetes and respiratory conditions questionnaires is provided in Section 4.2.

4.1 Qualitative testing

As previously stated, the 2011 SLCDC consisted of two different questionnaires: a diabetes questionnaire and a respiratory conditions questionnaire. The questionnaires were developed by Statistics Canada, in collaboration with PHAC. Diabetes and respiratory conditions expert groups were also consulted during content development. The questionnaires were translated by the Official Languages and Translation Division of Statistics Canada. Both questionnaires (in English and French) were tested by Statistics Canada's Questionnaire Design and Review Centre (QDRC) using one-on-one interviews.

Qualitative testing was conducted to assess the content and flow of the SLCDC questionnaires. The questionnaires were administered face-to-face with respondents. The one-on-one interviews explored the four steps in the cognitive process of responding to the questionnaire: understanding the question and response categories, recalling/searching for the requested information, thinking about the answer and making a judgment about what to report, and reporting the answer.

Qualitative testing was conducted in February 2010. English testing took place in Toronto and French testing in Montreal. The frame used to select respondents for the interviews was the 2009 CCHS. A total of 36 participants took part in the testing, representing a cross-section of persons who reported in their CCHS interview having either diabetes, asthma, chronic bronchitis or chronic obstructive pulmonary disease (COPD) diagnosed by a doctor or other health professional. All qualitative interviews were conducted by trained interviewers from QDRC and observed by members of the SLCDC project team, including personnel from STC's Health Statistics Division and PHAC. Some of the key findings from the qualitative testing are discussed below.

Key findings from testing the diabetes questionnaire:

In general, participants found the questionnaire straightforward and easy to answer. There were some questions where respondents were unsure of the terminology used in the questionnaire. For example, many respondents did not understand the term "A1C test" until a clarifying note was read to them. Certain questions in the Coping and Support module were poorly understood by respondents, especially in French.

Following qualitative testing, it was decided to remove the first five questions in the Coping and Support module and re-name the module to Support and Well-being. It was also decided to remove several questions related to difficulties experienced accessing health care and diabetes healthcare teams.

Key findings from testing the respiratory questionnaire:

Overall, participants judged the respiratory questionnaire to be clear-cut and easy to answer. Some respondents felt that the questionnaire was geared towards persons with more severe respiratory problems. One important observation made was that respondents were better able to report the colour of their inhaler rather than its type (rescue vs. controller), so this was emphasized in interviewer training.

The main change following qualitative testing was the decision to use the name of the respondent's respiratory condition in the dynamic text of the questionnaire rather than the term "breathing problems". The Smoking History module was expanded and the Medication Cost module was removed.

4.2 Final questionnaire content

This section outlines the modules comprising the content of the SLCDC diabetes and respiratory questionnaires. The diabetes questionnaire was made up of 15 modules, and the respiratory questionnaire of 17 modules.

Diabetes

GENX
General health: The general health module is used to collect data on self-perceived health, satisfaction with life, self-perceived mental health and self-perceived stress. It is the same as the module used in the CCHS.

The intent was to make respondents feel comfortable before asking them specific questions about their diabetes.
CNDX
Confirmation of diabetes diagnosis: This module is used to confirm that the respondent has received a diagnosis of diabetes from a health professional and the type of diabetes that they have. The module is also used to screen out women whose only diagnosis of diabetes occurred during pregnancy. The screening questions are modified from the Chronic conditions module in the CCHS. The remaining questions are new.

A follow-up question is asked if the respondent says that they do not have diabetes to help determine why there is a discrepancy between what was reported in the CCHS and in the SLCDC. Even if the respondent says that they do not have diabetes because their condition is controlled by medication or lifestyle changes, they will continue with the survey. The experiences of persons who can control their condition through medication or lifestyle changes are of interest to the survey.
XHUX
Health care utilization: The Health care utilization module asks respondents about the health professional that is most responsible for treating their diabetes and whether in the past 12 months they have seen a variety of health professionals, as well as whether they have had difficulties receiving ongoing care for their diabetes.
CODX
Clinical monitoring: This module asks respondents about tests and exams that are commonly given by health professionals to monitor diabetes and its complications. The module asks about the frequency of the tests as well as their results.

Test results and measurements can be used to determine how well-controlled the respondent's diabetes is.
MEDX
Medication use: This module asks respondents about prescription medications, including insulin injections, taken to control blood sugar, blood pressure and blood cholesterol. Respondents are also asked about compliance with their prescriptions as well as whether they take herbal or naturopathic remedies for their diabetes.

Diabetes has a number of associated modifiable risk factors (e.g., diet, weight control, alcohol consumption) and researchers are interested in knowing whether changes to these modifiable risk factors are associated with a reduced rate of prescription medication use in people with diabetes.
ICDX
Insurance coverage: This module asks respondents whether they have insurance that covers the partial or full cost of prescription medications, glucose monitoring equipment, dental care and vision care.

The cost of prescription drugs, glucose monitoring supplies, dental care and vision care can prevent some people from taking their prescription drugs, self-monitoring blood sugar levels or seeking preventative care.
CLDX
Clinical recommendations: This module is made up of a series of questions that ask about things that a doctor or other health professional may have suggested to a respondent to help them manage their diabetes.

The purpose of these questions is to measure the extent to which doctors are following best-practice guidelines for the treatment of diabetes by discussing with their diabetic patients the importance of diet, exercise, weight control, etc.
SMDX
Self-management: This module asks respondents about things they may be doing to help manage their diabetes. The module covers activities that are recognized as being important for the management of diabetes (diet, exercise, weight control, etc.). Follow-up questions measuring barriers to change are asked of respondents who are not currently engaging in activities that are considered modifiable risk factors for diabetes, specifically diet, exercise and weight control.

Questions are similar to those asked in the Clinical recommendations module. The purpose of these questions is to measure the extent to which respondents are applying best-practice guidelines to the treatment of their diabetes.
MODX
Self-monitoring: This module asks questions about diabetes monitoring respondents do themselves outside of a health professional's office. Topics include frequency of blood sugar self-monitoring, frequency of blood pressure monitoring and frequency of foot checks for sores or irritations.
DCDX
Diabetes complications: This module asks about health conditions and complications that are associated with diabetes. Conditions include cataracts, kidney failure, high blood pressure, etc. Respondents must have had the condition diagnosed by a health professional. The number of complications suffered by a respondent can be an indication of how severe or well-controlled their diabetes is.
RADX
Restriction of activities: This module asks a series of questions about being limited in daily or usual activities in the past 12 months because of their diabetes.

When combined with information from the rest of the interview, this module can be used to compare persons who report having activity limitations due to diabetes with those who do not. In conjunction with the module on diabetes complications, this module will help identify persons with more severe symptoms of diabetes.

Persons with more complications and more activity limitations will likely differ across variables such as health care utilization and medication use, compared to persons with fewer complications and limitations.
RWDX
Restriction of work-related activities: The purpose of this module is to identify respondents who have had to modify their paid work because of their diabetes. The module includes questions on the respondent's work life, including current and past employment status, and changes made to work activities due to their diabetes.
SWDX
Support and well-being: Having the support of family and friends, and being able to deal with emotions such as stress have been shown to be beneficial to people with diabetes.

This module asks about self-perceived social support available to the respondent, and whether they have ever needed help for their emotions or mental health in order to cope with their diabetes.
PADX
Patient activation: This module asks a series of questions to determine how involved respondents are in their diabetes treatment and whether they feel confident to handle problems and complications that may develop.
ADMX
Administration: The module asks respondents' permission to link their information from the SLCDC to their responses from the 2010 CCHS. Respondents are then asked if this information can be shared with Statistics Canada's share partners.

Respiratory

GENX
General health: The general health module is used to collect data on self-perceived health, satisfaction with life, self-perceived mental health and self-perceived stress. This module is the same as the module used in the CCHS.

The intent was to make respondents feel comfortable before asking them specific questions about their respiratory condition.
DHRX
Diagnosis and family history: This module is used to confirm that the respondent has received a diagnosis of asthma or chronic bronchitis/emphysema/COPD from a health professional.

For respondents who indicate that they do not have a breathing problem, there is another question to determine why there is a discrepancy between what was reported in the CCHS and in the SLCDC. Respondents who say they feel better or they take medication to control their breathing problem continue with the survey.

The experiences of persons who can control their condition through medication or lifestyle changes are of interest to the survey.

Other questions in this module ask age at diagnosis, the type of respiratory condition, and whether the respondent has any blood relatives who have ever been diagnosed with any of asthma, chronic bronchitis, emphysema, or COPD.
SSRX
Symptoms and severity: Questions in this module ask about the frequency and severity of the main symptoms of asthma and COPD.

In conjunction with the module on restriction of activities, this module will help identify persons with more severe respiratory symptoms. Persons with more severe symptoms and more activity limitations will likely differ across variables such as health care utilization and medication use, compared to persons with less severe symptoms.
TRRX
Triggers: Questions in this module ask respondents about things that bring on the symptoms of their asthma or COPD, or make them worse. A list of common triggers is read to the respondents and they are asked to indicate which ones affect them.
HUHX
Health care utilization: This module refers to contacts respondents had with various health professionals regarding their asthma or COPD in the past 12 months. Information is also sought about the total number of visits to a doctor, to the emergency room and nights spent in hospital in the previous 12 months because of respondent's asthma or COPD.
MERX
Medication use: This module covers the current use of prescription medications for asthma and COPD. It collects detailed information on the two main types of inhalers used to treat these respiratory conditions (reliever and controller), as well as the use of corticosteroids and antibiotics, which are treatments used in more severe cases of the conditions. The module also has questions about reasons for not taking prescription medications for asthma or COPD and medication compliance.
HCRX
Health conditions: This module includes questions about health conditions that are associated with respiratory conditions, such as sleep apnea, osteoporosis, and heart failure. The age of the respondent determines which conditions they are asked about.
ALRX
Allergies: This module asks respondents with asthma about allergies they have and whether they have received treatment for them. Only allergies that have been diagnosed by a health professional as the result of allergy testing are of interest.
RARX
Restriction of activities: This module includes a series of questions about whether the respondent has been limited in their daily or usual activities in the past 12 months because of their asthma or COPD.

When combined with information from the rest of the interview, this module can be used to compare persons who report having activity limitations due to asthma or COPD with those who do not.

In conjunction with the module on symptoms and severity, this module will help identify persons with more severe symptoms. Persons with more severe symptoms and more activity limitations will likely differ across variables such as health care utilization and medication use, compared to persons with less severe symptoms.
RWRX
Restriction of work-related activities: The purpose of this module is to identify respondents who have had to modify their paid work because of their respiratory condition.

This module includes a number of questions on the respondent's work life, including current and past employment status, changes made to work activities due to their respiratory condition, and exposure to dust, fumes, or gases at work.
RERX
Restriction of educational activities: The purpose of this module is to identify respondents who have had to modify their educational activities because of their respiratory condition.

This module is only asked of respondents under 50 years of age and focuses only on current school activities.
RVRX
Restriction of volunteer activities: The purpose of this module is to identify respondents who have had to modify their volunteer activities because of their respiratory condition.

This module includes a number of questions on the respondent's current and past volunteer status and changes made to volunteer activities due to their respiratory condition.
SMRX
Self-management: The Self-management module asks respondents about things that a doctor or health professional may have suggested they do to help manage their breathing problems. Respondents are also asked whether they did any of these things to help manage their asthma or COPD. The module covers a number of activities and changes that are recognized as being important for the management of respiratory conditions (e.g. seeing an asthma or COPD educator or changing the home environment).

These questions are of interest to researchers who want to know if health professionals are recommending certain activities and whether people with asthma or COPD are making changes to their behaviour and environment as a result of their diagnosis.
SWRX
Support and well-being: Having the support of family and friends, and being able to deal with emotions such as stress have been shown to be beneficial to people with asthma and COPD.

This module asks about self-perceived social support available to the respondent, and whether they have ever needed help for their emotions or mental health in order to cope with their asthma or COPD. There are also additional questions for respondents with COPD asking whether they have discussed their wishes for care in case of hospitalization and as their condition progresses.
SHRX
Smoking history: This module asks respondents about cigarette smoking over the course of their lifetime. The module contains questions about the length of time they smoked daily and occasionally, and the number of cigarettes they smoked.

The purpose of this module is to calculate "pack years", which is the number of cigarettes smoked daily multiplied by the number of years smoked divided by 20. This calculation is used by researchers and physicians to determine the intensity of an individual's tobacco exposure over the course of their lifetime.
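
The pack-years formula stated above can be sketched in a few lines. This is an illustration only: the function name and the example respondent are hypothetical, and the sketch covers only the daily-smoking formula as written, not how the survey combines periods of daily and occasional smoking.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Pack years = (cigarettes smoked daily x years smoked) / 20.

    The divisor 20 is the one given in the text (cigarettes per pack).
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    return cigarettes_per_day * years_smoked / 20


# Hypothetical respondent: 15 cigarettes a day for 30 years.
print(pack_years(15, 30))  # 22.5
```

One pack a day for ten years gives 10 pack years under the same formula.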
SCRX
Smoking cessation: This module asks respondents who are current smokers about their intention to quit smoking in the near future, if they have received advice or information from their doctor about quitting, as well as if they have tried to quit in the previous 12 months and any methods they may have used to quit. Respondents are also asked questions about exposure to second-hand smoke in their home.
ADMX
Administration: This module asks respondents' permission to link their information from the SLCDC to their responses from the 2010 CCHS. Respondents are then asked if this information can be shared with Statistics Canada's share partners.

Respondents aged 12 and 13 first answer the link and share questions for themselves. Then their parent or guardian is asked to give permission to share and link the data. Both the respondent's and their parent's or guardian's permissions are required to link and share the data.

5.0 Sample Design

5.1 Target population

The 2011 Survey on Living with Chronic Disease in Canada (SLCDC) targets Canadians living in private dwellings in the ten provinces with one of the following conditions:

  • Asthma (excluding people who also have COPD) and aged 12 years or older as of December 31st 2010
  • Diabetes (excluding women who are only diabetic during pregnancy) and aged 20 years or older as of December 31st 2010
  • COPD (including people who also have asthma) and aged 35 years or older as of December 31st 2010.

The conditions had to have been diagnosed by a health professional. Residents of the three territories; persons living on Indian Reserves, Crown lands, or in institutions; full-time members of the Canadian Forces and residents of certain remote regions were not in-scope for this survey and were excluded. These exclusions represent about 2% of the overall Canadian population.

5.2 Domains of interest

The SLCDC aims to produce reliable estimates at the national level by age group and sex. The targeted age groups differ between conditions:

  • For asthma (excluding people who also have COPD), they are 12 – 24, 25 – 39, 40 – 54, and 55+
  • For diabetes (excluding women who are only diabetic during pregnancy), they are 20 – 64 and 65+
  • For COPD (including people who also have asthma), they are 35+.

However, since the 2011 SLCDC sample was selected from the 2010 Canadian Community Health Survey (CCHS), the sample size of the SLCDC was limited by the number of people with the conditions in the CCHS. It is therefore possible that some age groups will end up having insufficient sample size and need to be collapsed for analytical purposes.

5.3 Sampling frame

The 2011 SLCDC used the 2010 CCHS to select its sample. The SLCDC employs a two-phase design in which the first phase is the CCHS sample and the second phase is the SLCDC sample.

The CCHS sample is selected from multiple frames. The first frame is an area frame designed for the Canadian Labour Force Survey (LFS). The second is a list frame of telephone numbers. About half of the CCHS sample is selected from the area frame and the other half is selected from the list frame. For more detailed information on the CCHS sampling process, refer to the 2010 CCHS User Guide.

5.4 Sample size and allocation

In order to produce reliable estimates at the national level by age group and sex, CCHS respondents were stratified by condition, age group and sex. Because the age groups of interest differ between conditions and there were units with more than one condition (e.g. both asthma and diabetes), the age groups used in stratification and allocation consist of the intersection of the age groups listed under section 5.2.

To reduce response burden, it was decided that respondents could receive only one questionnaire. People with two conditions (see Footnote 1) were randomly assigned the questionnaire that corresponded to one of their conditions. The sample allocation by questionnaire was done by sex and age groups in proportion to the number of 2010 CCHS respondents for each condition.

Some in-scope units were excluded before sample selection. Units were excluded if they were a proxy respondent or if they did not agree to share or link their CCHS data. These exclusions represented 13% of the first phase sample and were taken into account at the estimation stage since they are part of the population of interest.

Since the number of CCHS respondents with each of the three conditions was not large, the decision was made to include all such units in the 2011 SLCDC sample. In the end, after exclusions, as well as assigning only one questionnaire to units with two conditions, there were 3,650 units in the asthma sample, 3,747 units in the diabetes sample and 1,733 units in the COPD sample. Tables 5.1, 5.2 and 5.3 show the sample sizes of each of the three conditions by age groups and sex:

Table 5.1: SLCDC sample size by stratum for the asthma condition

Stratum (Sex, Age group)      2011 SLCDC sample size
Female 12 to 24 years old     512
Female 25 to 39 years old     544
Female 40 to 54 years old     427
Female 55+ years old          708
Total Female                  2,191
Male 12 to 24 years old       504
Male 25 to 39 years old       326
Male 40 to 54 years old       272
Male 55+ years old            357
Total Male                    1,459
Total                         3,650

Table 5.2: SLCDC sample size by stratum for the diabetes condition

Stratum (Sex, Age group)      2011 SLCDC sample size
Female 20 to 64 years old     769
Female 65+ years old          1,086
Total Female                  1,855
Male 20 to 64 years old       875
Male 65+ years old            1,107
Total Male                    1,892
Total                         3,747

Table 5.3: SLCDC sample size by stratum for the COPD condition

Stratum (Sex, Age group)      2011 SLCDC sample size
Female 35+ years old          1,078
Male 35+ years old            655
Total                         1,733

6.0 Data collection

Collection for the SLCDC took place in October and November of 2010 and continued in March and April of 2011. Over the collection period, a total of 6,573 valid interviews were conducted using computer assisted telephone interviewing (CATI).

6.1 Computer-assisted interviewing

Computer-assisted interviewing (CAI) offers two main advantages over other collection methods. First, CAI offers a case management system and data transmission functionality. This case management system automatically records important management information for each attempt on a case and provides reports for the management of the collection process. CAI also provides an automated call scheduler, i.e. a central system to optimize the timing of call-backs and the scheduling of appointments used to support CATI collection.

The case management system routes the questionnaire applications and sample files from Statistics Canada's main office to regional collection offices (in the case of CATI). Data returning to the main office take the reverse route. To ensure confidentiality, the data are encrypted before transmission. The data are then decrypted once they are on a separate secure computer with no remote access.

Second, CAI allows for custom interviews for every respondent based on their individual characteristics and survey responses. This includes:

  • Questions that are not applicable to the respondent are skipped automatically.
  • Edits to check for inconsistent answers or out-of-range responses are applied automatically and on-screen prompts are shown when an invalid entry is recorded. Immediate feedback is given to the respondent and the interviewer is able to correct any inconsistencies.
  • Question text, including reference periods and pronouns, is customised automatically based on factors such as the age and sex of the respondent, the date of the interview and answers to previous questions.

6.2 SLCDC application development

For the SLCDC, a CATI application was utilized. The application consisted of entry, survey content, and exit components.

Entry and exit components contain standard sets of questions designed to guide the interviewer through contact initiation, respondent confirmation, tracing (if necessary) and determination of case status. The survey content component consisted of the SLCDC diabetes and respiratory questionnaire modules, which made up the bulk of the application. Development and testing of the CATI application began in April 2010. There were three stages of internal testing: block testing, integrated testing and end-to-end testing.

Block testing consists of independently testing each content module or "block" to ensure skip patterns, logic flows and text, in both official languages, are specified correctly. Skip patterns or logic flows across modules are not tested at this stage as each module is treated as a stand-alone questionnaire. Once all blocks are verified by several testers, they are added together along with the entry and exit components into an integrated application. This newly integrated application is then ready for the next stage of testing.

Integrated testing occurs when all of the tested modules are added together, along with the entry and exit components, into an integrated application. This second stage of testing ensures that key information such as age and gender is passed from the sample file to the entry, exit and survey content components of the application. It also ensures that variables affecting skip patterns and logic flows are correctly passed between modules within the survey content component. Since, at this stage, the application essentially functions as it would in the field, all possible scenarios faced by interviewers are simulated to ensure proper functionality. These scenarios test various aspects of the entry and exit components, including establishing contact, confirming that the correct respondent has been found, determining whether a case is in scope and creating appointments.

End-to-end testing occurs when the fully integrated application is placed in a simulated collection environment. The application is loaded onto computers that are connected to a test server. Data are then collected, transmitted and extracted in real time, exactly as would be done in the field. This last stage of testing allows for the testing of all technical aspects of data input, transmission and extraction for the SLCDC application. It also provides a final chance of finding errors within the entry, survey content and exit components.

6.3 Interviewer training

In October 2010 and March 2011, representatives from Statistics Canada's Collection Planning and Management Division visited the four regional offices participating in the collection of the SLCDC data (Halifax, Sherbrooke, Sturgeon Falls, and Edmonton). The purpose of the visits was to train the regional office project managers and teams of interviewers for the SLCDC diabetes and respiratory surveys. Members of the SLCDC project team from Health Statistics Division also attended the training sessions to present information about the background and development of the SLCDC, and to offer additional support and clarify any questions or concerns that may have arisen.

The focus of these sessions was to make interviewers comfortable using the SLCDC application and familiarise interviewers with survey content. The training sessions covered the following topics:

  • goals and objectives of the survey
  • survey methodology
  • application functionality
  • review of the questionnaire content and exercises
  • mock interviews to simulate difficult situations and practise ways of dealing with non-response
  • survey management

One of the key aspects of the training was a focus on minimizing non-response. Exercises to minimise non-response were prepared for interviewers. The purpose of these exercises was to have the interviewers practise convincing reluctant respondents to participate in the survey.

6.4 The interview

Sample units selected from the frame were interviewed from centralised call centres using the CATI application. The CATI interviewers were supervised by a senior interviewer located in the same call centre.

To obtain the best possible response rate, many practices were used to minimise non-response, including:

Introductory letters

Before the start of the collection period, introductory letters explaining the purpose of the survey were sent to the targeted respondents. The letters described the importance of the survey and provided examples of how the SLCDC data would be used.

Mailing address information was not available for all respondents from the 2010 CCHS. For cases where mailing addresses were not available, an introductory letter was not sent out.

Initiating contact

Interviewers were instructed to make all reasonable attempts to obtain interviews. When the timing of the interviewer's call was inconvenient, an appointment was made to call back at a more convenient time. Numerous call-backs were made at different times on different days.

When a respondent was no longer available at the phone number provided on the 2010 CCHS, tracing of the respondent was initiated. In order to trace respondents, alternate contacts provided by the respondent on the 2010 CCHS were used to obtain the respondent's new telephone number.

Refusal conversion

For individuals who at first refused to participate in the survey, a letter was sent from the regional office to the respondent, stressing the importance of the survey and the targeted respondent's participation. This was followed by a second call from a senior interviewer, a project supervisor or another interviewer to try to convince the respondent of the importance of participating in the survey.

Language barriers

To remove language as a barrier to conducting interviews, the regional offices recruit interviewers with a wide range of language competencies. When necessary, cases were transferred to an interviewer with the language competency needed to complete an interview.

Proxy interviews

Proxy interviews were not permitted for the SLCDC.

6.5 Field operations

The SLCDC consisted of two six-week collection periods. Half of the sample was collected in each collection period. The regional collection offices were instructed to use the first two weeks of each collection period to complete 40% of the cases, with the rest of the collection period being used to finalize the remaining sample and to follow up on outstanding non-response cases.

Transmission of cases from the regional offices to head office was the responsibility of the regional office project supervisor, senior interviewer and the technical support team. These transmissions were performed nightly and all completed cases were sent to Statistics Canada's head office.

6.6 Quality control and collection management

During the SLCDC collection period, several methods were used to ensure data quality and to optimize collection. These included using internal measures to verify interviewer performance and the use of a series of ongoing reports to monitor various collection targets and data quality.

CATI interviewers were randomly chosen for validation. Validation during CATI collection consisted of senior interviewers monitoring interviews to ensure proper techniques and procedures (reading the questions as worded in the application, not prompting respondents for answers, etc.) were followed by the interviewers. In addition, members of the survey team from head office visited a number of regional offices to observe collection at various times during the collection period.

A series of reports were produced to effectively track and manage collection targets and to assist in identifying other collection issues. Cumulative reports were generated daily showing response rates, refusal rates and out-of-scope rates. The link and share rates were calculated weekly. Customised reports were also created and used to examine specific data quality issues that arose during collection.

One issue that arose during data collection of the 2009 SLCDC was a higher than expected out-of-scope rate. As a result, a series of questions was developed and included in the 2011 questionnaires to follow up with any respondents who reported that they had not been diagnosed with diabetes, asthma, or COPD. These questions were aimed at identifying respondents who had been diagnosed, but were no longer experiencing symptoms or who were able to manage their condition through medication or changes to their lifestyle. This reduced the number of out-of-scope cases.

For the 2011 SLCDC, only COPD produced a higher than expected out-of-scope rate. The question on the CCHS that is used to identify respondents with COPD is: "Do you have chronic bronchitis, emphysema or chronic obstructive pulmonary disease or COPD?" After the SLCDC collection, the notes and remarks for the cases that were coded out-of-scope were examined to gain a better understanding of why respondents reported having COPD on the CCHS but did not on the SLCDC. There was an indication that some respondents were confused about the term "chronic bronchitis". Many seemed to confuse it with the form of bronchitis that is a bacterial or viral infection; they reported having "chronic bronchitis" on the CCHS and were screened into the SLCDC as a result.

The impact of the out-of-scope cases on weighting and data quality will be discussed in Chapters 8 and 9, respectively.

7.0 Data processing

7.1 Editing

Most editing of the data was performed at the time of the interview by the computer-assisted interviewing (CAI) application. It was not possible for interviewers to enter out-of-range values and flow errors were controlled through programmed skip patterns. For example, CAI ensured that questions that did not apply to the respondent were not asked.

In response to some types of inconsistent or unusual reporting, warning messages were invoked but no corrective action was taken at the time of the interview. Where appropriate, edits were instead developed to be performed after data collection at Head Office. Inconsistencies were usually corrected by setting one or both of the variables in question to "not stated".

7.2 Coding

Pre-coded answer categories were supplied for all suitable variables. Interviewers were trained to assign the respondent's answers to the appropriate category.

In the event that a respondent's answer could not be easily assigned to an existing category, several questions also allowed the interviewer to enter a long-answer text in the "Other" category.

7.3 Creation of derived and grouped variables

To facilitate data analysis and to minimise the risk of error, a number of variables on the file have been derived using items found on the SLCDC questionnaire. Derived variables generally have a "D" or "G" in the fifth character of the variable name. In some cases, the derived variables are straightforward, involving collapsing of response categories. In other cases, several variables have been combined to create a new variable. The Derived Variables Documentation (DV) provides details on how these more complex variables were derived. For more information on the naming convention, please go to Section 11.3.

7.4 Weighting

The principle behind estimation in a probability sample such as the SLCDC is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of the population, each person in the sample represents 50 persons in the population.

The weighting phase is a step which calculates, for each record, what this number is. This weight appears on the microdata file, and must be used to derive meaningful estimates from the survey. For example, if the number of individuals who have ever taken insulin injections for their diabetes is to be estimated, this would be done by selecting the records referring to those individuals in the sample with that characteristic and summing the weights entered on those records.
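
The two ideas above (a weight as the inverse of the sampling fraction, and an estimate as a sum of weights) can be sketched as follows. The records, the field layout and the function names are hypothetical; the weights on the actual microdata file vary from record to record.

```python
# A 2% simple random sample gives every sampled person a weight of 1 / 0.02 = 50.
sampling_fraction = 0.02
design_weight = 1 / sampling_fraction

# Hypothetical microdata records: (survey weight, ever took insulin injections?)
records = [
    (design_weight, True),
    (design_weight, False),
    (design_weight, True),
    (design_weight, False),
]


def estimate_total(rows):
    """Estimated population count: sum the weights of records with the characteristic."""
    return sum(weight for weight, has_characteristic in rows if has_characteristic)


def estimate_proportion(rows):
    """Estimated proportion: weighted count with the characteristic over total weight."""
    return estimate_total(rows) / sum(weight for weight, _ in rows)


print(design_weight)                 # 50.0
print(estimate_total(records))       # 100.0
print(estimate_proportion(records))  # 0.5
```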

Details of the method used to calculate these weights are presented in Chapter 8.

8.0 Weighting

To ensure estimates produced from survey data are representative of the surveyed population and not just the sample itself, users must incorporate the survey weights in their calculations. A survey weight is given to each person included in the final sample, which consists of all in-scope respondents to the survey. Intuitively, the weight corresponds to the number of persons in the population that are represented by the respondent.

As described in Chapter 5, the SLCDC survey frame is composed of respondents to the 2010 CCHS. The starting point for the SLCDC weighting process is therefore the 2010 CCHS share weight. For more information on this weight, please refer to the 2010 CCHS User Guide.

8.1 Weight adjustment for sample weight

Table 8.1 presents an overview of the different adjustments that are part of the weighting strategy for SLCDC 2011, in the order in which they are applied.

Table 8.1: Weighting steps for SLCDC

  • CD 1 – Proxy-Link Adjustment
  • CD 2 – Selection Criteria Adjustment
  • CD 3 – Out-of-Scope in SLCDC Adjustment
  • CD 4 – Non-response in SLCDC Adjustment
  • CD 5 – Share-Link (Final) Adjustment
  • CD 6 – Winsorization
  • CD 7 – Post-Stratification

8.1.1 Proxy-link adjustment

The first step of weighting for SLCDC was to drop the CCHS units in the territories, since they were not part of the target population. The next step was to adjust for the fact that some CCHS respondents were excluded from the SLCDC for practical reasons, even though they were still part of the target population. The reasons for their exclusion are as follows:

  • People who did not agree to link their 2010 CCHS information were excluded. One of the objectives of the SLCDC was to link the SLCDC survey responses with the 2010 CCHS and provide the linked files to survey share partners. Without their permission to link, there was no reason to survey these CCHS respondents.
  • People for whom their 2010 CCHS interview was done by proxy were excluded since the SLCDC questionnaire could not be answered by proxy.

Since these CCHS respondents belonged to the population of interest, adjustments were made to allocate their weights to the remaining CCHS respondents.

The adjustment process starts with the share weights from the 2010 CCHS, and within each cell (defined as the intersection of condition, age group, sex, and region) the adjustment is calculated as:

Formula 1

The weight wgtCD1 is calculated as wgts3*adjCD1, where wgts3 is the final 2010 CCHS share weight. After the adjustment is calculated, the excluded units are dropped from the file.
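
The formula itself is not reproduced in this text. As a minimal sketch, assuming the usual ratio form for this kind of reallocation (total share weight of all units in a cell divided by the total share weight of the retained units in that cell), the step could look like the following. The record layout, the cell label and the `excluded` flag are hypothetical; only `wgts3`, `adjCD1` and `wgtCD1` are names from the text.

```python
from collections import defaultdict


def proxy_link_adjust(units):
    """Reallocate the weight of excluded units to retained units in the same cell.

    units: list of dicts with keys 'cell', 'wgts3' and 'excluded' (bool).
    Returns the retained units with 'wgtCD1' = wgts3 * adjCD1 added.
    Assumed form: adjCD1 = sum of wgts3 over the cell / sum of wgts3 over retained units.
    """
    total = defaultdict(float)
    kept = defaultdict(float)
    for u in units:
        total[u["cell"]] += u["wgts3"]
        if not u["excluded"]:
            kept[u["cell"]] += u["wgts3"]

    adjusted = []
    for u in units:
        if u["excluded"]:
            continue  # excluded units are dropped after the adjustment
        adj_cd1 = total[u["cell"]] / kept[u["cell"]]
        adjusted.append({**u, "wgtCD1": u["wgts3"] * adj_cd1})
    return adjusted


# Hypothetical cell with one excluded (proxy or non-link) unit.
units = [
    {"cell": "asthma/F/12-24/region A", "wgts3": 100.0, "excluded": False},
    {"cell": "asthma/F/12-24/region A", "wgts3": 300.0, "excluded": False},
    {"cell": "asthma/F/12-24/region A", "wgts3": 100.0, "excluded": True},
]
result = proxy_link_adjust(units)
print([u["wgtCD1"] for u in result])  # [125.0, 375.0]
```

The adjusted weights still sum to 500, the cell total before the excluded unit was dropped.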

8.1.2 SLCDC selection criteria adjustment

In the SLCDC sampling design, CCHS respondents were stratified by age group and sex (see Chapter 5). In each age group by sex stratum, a unit could have either one condition (asthma, COPD, or diabetes) or two conditions (COPD and diabetes or asthma and diabetes). Those with COPD and asthma were considered as having only COPD. All of those identified as having one condition were selected for the SLCDC and no weight adjustments were necessary. To lower response burden for those with two conditions, the sample unit could receive only one questionnaire. As a result of this selection, weight adjustments were necessary. The adjustments for units with two conditions were calculated as follows:

Formula 2 Formula 3 Formula 4

The weight for those who received the asthma questionnaire was calculated as wgtCD2a = wgtCD1*adjCD2a. Similarly, the weight for those who received the COPD questionnaire was calculated as wgtCD2c = wgtCD1*adjCD2c and for those who received the diabetes questionnaire was wgtCD2d = wgtCD1*adjCD2d.
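
The adjustment formulas are not reproduced in this text. The sketch below illustrates the general principle only, under the assumption that a unit with two conditions is weighted up by the inverse of its probability of being assigned a given questionnaire. The assignment probabilities, the starting weight and the function name are hypothetical and are not taken from the survey.

```python
import random


def assign_questionnaire(probabilities, rng):
    """Randomly pick one questionnaire for a unit with two conditions.

    probabilities: dict mapping questionnaire name -> assignment probability.
    Returns (chosen questionnaire, inverse-probability adjustment factor).
    """
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("assignment probabilities must sum to 1")
    names = list(probabilities)
    chosen = rng.choices(names, weights=[probabilities[n] for n in names], k=1)[0]
    return chosen, 1.0 / probabilities[chosen]


# Hypothetical unit with asthma and diabetes and a hypothetical wgtCD1 of 120.
rng = random.Random(2011)
assignment_probabilities = {"asthma": 0.4, "diabetes": 0.6}
chosen, adj_cd2 = assign_questionnaire(assignment_probabilities, rng)
wgt_cd1 = 120.0
wgt_cd2 = wgt_cd1 * adj_cd2
print(chosen, round(wgt_cd2, 1))
```

Units with a single condition would keep their weight unchanged, which corresponds to an adjustment factor of 1.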

8.1.3 SLCDC out-of-scope adjustment

After collection, units were classified into two main groups: resolved cases and unresolved cases. Resolved cases are units where contact was made with the CCHS respondent and it was confirmed whether or not they still had the condition (in-scope or out-of-scope). The out-of-scope units were dropped from the file. The unresolved cases are units that could not be contacted, so it was not possible to know whether they were in-scope or out-of-scope. Therefore, logistic models were used (based on the resolved cases) to estimate the probability for an unresolved case to be in-scope. This probability was then used in adjusting the weights.

The weight for the unresolved units that received the asthma questionnaire was calculated as wgtCD3a = wgtCD2a*p_inscope, where p_inscope is the predicted probability of being in scope. Similarly, wgtCD3c = wgtCD2c*p_inscope for COPD units and wgtCD3d = wgtCD2d*p_inscope for diabetes units. This adjustment reduces the total weight of the unresolved units by the predicted number of out-of-scope units that they represent in the population. The units remain in the file to be treated in the non-response adjustment.
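
A minimal sketch of this step is given below. The logistic coefficients, the single covariate and the starting weight are hypothetical; the actual models and their covariates are not described in this text. Only the final multiplication, wgtCD3 = wgtCD2 * p_inscope, is taken from the description above.

```python
import math


def p_inscope(intercept, coefficients, covariates):
    """Logistic prediction: 1 / (1 + exp(-(b0 + sum of b_k * x_k)))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-z))


def adjust_unresolved(weight_cd2, probability):
    """wgtCD3 = wgtCD2 * p_inscope for an unresolved unit."""
    return weight_cd2 * probability


# Hypothetical fitted model and unresolved unit; none of these values come from the survey.
p = p_inscope(intercept=2.0, coefficients=[-0.5], covariates=[1.0])
print(round(p, 3))                            # 0.818
print(round(adjust_unresolved(200.0, p), 1))  # 163.5
```

In this example a unit that represented 200 people is taken to represent about 163.5 in-scope people; the remaining weight corresponds to the predicted out-of-scope share.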

8.1.4 SLCDC non-response adjustment

Note that at this point all the unresolved cases were considered to be non-respondents since their weights were adjusted for out-of-scope. Similarly, all the resolved cases that remained (i.e. in-scope units) were respondents. Logistic models using mainly CCHS auxiliary variables were built to predict the probabilities of being a respondent. From the predicted probabilities, response homogeneous groups (RHGs) were created. To ensure the non-response adjustment did not change the estimated number of people with a condition at the stratum level or at the regional level, the RHGs were created within each stratum by region.

The adjustment was calculated within each RHG as follows:

Formula 5 Formula 6 Formula 7

The asthma weight wgtCD4a was calculated as wgtCD3a*adjCD4a. Similarly, the COPD weight wgtCD4c was calculated as wgtCD3c*adjCD4c and the diabetes weight wgtCD4d was wgtCD3d*adjCD4d. After the adjustment, the non-responding units were dropped from the file.
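
The adjustment formulas are not reproduced in this text. A common form for a non-response adjustment within a response homogeneous group, given here as an assumption rather than as the survey's exact formula, is the ratio of the weighted total of all units still on the file to the weighted total of the respondents. For the asthma questionnaire this would read:

```latex
\[
\mathrm{adjCD4a}_{g} \;=\;
\frac{\displaystyle\sum_{i \in g} \mathrm{wgtCD3a}_{i}}
     {\displaystyle\sum_{i \in g,\; i \text{ respondent}} \mathrm{wgtCD3a}_{i}},
\qquad
\mathrm{wgtCD4a}_{i} \;=\; \mathrm{wgtCD3a}_{i} \times \mathrm{adjCD4a}_{g}
\quad \text{for each respondent } i \in g ,
\]
```

where g denotes an RHG and the numerator covers both respondents and non-respondents (the unresolved units with their out-of-scope-adjusted weights). The COPD and diabetes adjustments would have the same form with wgtCD3c and wgtCD3d. Under this form the weighted total in each RHG is unchanged when the non-respondents are dropped.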

8.1.5 Share-link adjustment

Only the information for the people who agreed to share and link their SLCDC data will be released. Of the 2011 SLCDC respondents, 99% agreed to share and link. Since the people who did not agree to share or link were still in the population of interest, adjustments were made to allocate the weights of the non-sharers / non-linkers to the remaining units. The probability of a respondent agreeing to share and link their SLCDC data is predicted using a logistic regression model. From the predicted probabilities, response homogeneous groups (RHGs) were created the same way as described in 8.1.4.

The adjustment was calculated within each RHG as follows:

Formula 8 Formula 9 Formula 10

At this point, the weight wgtCD5a for the asthma respondents who agreed to share and link their SLCDC data was calculated as wgtCD4a*adjCD5a. Similarly, wgtCD5c = wgtCD4c*adjCD5c and wgtCD5d = wgtCD4d*adjCD5d. After the adjustment, the respondents who did not agree to share or link their 2011 SLCDC data were dropped from the file.

8.1.6 Winsorization

Following the series of weight adjustments, some units may come out with extreme weights compared to other units of the same domain of interest. These units can represent a large proportion of their strata or have a large impact on the variance. To prevent this, the weight of these outlier units is adjusted downward using a "winsorization" trimming approach similar to the one used by CCHS. After winsorization, the weights for asthma, COPD, and diabetes became wgtCD6a, wgtCD6c, and wgtCD6d, respectively.
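
The trimming idea can be sketched as follows. The rule used to set the cap (three times the median weight of the domain) and the weights are hypothetical; the threshold actually used by the CCHS approach is not given in this text.

```python
from statistics import median


def winsorize_weights(weights, cap):
    """Trim every weight above `cap` down to `cap`; leave the others unchanged."""
    return [min(w, cap) for w in weights]


# Hypothetical weights for one domain of interest, with one extreme value.
weights = [120.0, 150.0, 135.0, 900.0]
cap = 3 * median(weights)  # hypothetical cut-off rule, for illustration only
trimmed = winsorize_weights(weights, cap)

print(cap)      # 427.5
print(trimmed)  # [120.0, 150.0, 135.0, 427.5]
```

Trimming lowers the weighted total of the domain, which is why a further adjustment is applied in the next step.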

8.1.7 Post-stratification

To ensure the total numbers of estimated people with the conditions agree with the counts prior to winsorization by stratum and region, post-stratification is employed. Within each intersection of stratum and region, the following adjustment factors were calculated:

Formula 11 Formula 12 Formula 13

The final asthma weight wgtCD7a was calculated as wgtCD6a*adjCD7a. Similarly, the final COPD weight wgtCD7c = wgtCD6c*adjCD7c and the final diabetes weight wgtCD7d = wgtCD6d*adjCD7d. The weights wgtCD7a, wgtCD7c, and wgtCD7d correspond to the final 2011 SLCDC weight that can be found under the variable name WTSX_S.
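
The adjustment factors are not reproduced in this text. As a sketch, assuming each factor is the ratio of the weighted total before winsorization to the weighted total after winsorization within a stratum-by-region cell, the step could be written as below. The record layout and the cell label are hypothetical.

```python
from collections import defaultdict


def post_stratify(units):
    """Rescale winsorized weights so each cell total matches its pre-winsorization total.

    units: list of dicts with keys 'cell', 'wgtCD5' (before winsorization)
    and 'wgtCD6' (after winsorization). Adds 'wgtCD7' = wgtCD6 * adjCD7.
    """
    before = defaultdict(float)
    after = defaultdict(float)
    for u in units:
        before[u["cell"]] += u["wgtCD5"]
        after[u["cell"]] += u["wgtCD6"]
    return [
        {**u, "wgtCD7": u["wgtCD6"] * before[u["cell"]] / after[u["cell"]]}
        for u in units
    ]


# Hypothetical cell in which one weight was trimmed from 800 to 400.
units = [
    {"cell": "F 55+/region A", "wgtCD5": 100.0, "wgtCD6": 100.0},
    {"cell": "F 55+/region A", "wgtCD5": 100.0, "wgtCD6": 100.0},
    {"cell": "F 55+/region A", "wgtCD5": 800.0, "wgtCD6": 400.0},
]
final = post_stratify(units)
print([round(u["wgtCD7"], 1) for u in final])  # [166.7, 166.7, 666.7]
```

The final weights sum to 1,000 again, the cell total before trimming, while the extreme weight remains smaller than it was originally.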

8.2 Bootstrap weights

Coordinated bootstrap weights are used for SLCDC because of its dependence on the 2010 CCHS sample. Hence, the starting point for the SLCDC bootstrap weights was the 500 replicates from the 2010 CCHS share bootstrap file. Each bootstrap replicate was adjusted using the seven adjustments listed in Table 8.1.
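
How replicate weights are typically used to estimate a variance can be sketched as follows. The variance formula shown (average squared deviation of the replicate estimates from the full-sample estimate) is a common convention and is an assumption here, since this text does not state the formula. The data are hypothetical and use four replicates instead of 500.

```python
import math


def weighted_total(values, weights):
    """Weighted total of a 0/1 characteristic (or any numeric variable)."""
    return sum(v * w for v, w in zip(values, weights))


def bootstrap_variance(full_estimate, replicate_estimates):
    """Average squared deviation of replicate estimates from the full-sample estimate."""
    b = len(replicate_estimates)
    if b == 0:
        raise ValueError("at least one replicate is required")
    return sum((r - full_estimate) ** 2 for r in replicate_estimates) / b


# Hypothetical data: three respondents, final weights and four sets of bootstrap weights.
has_characteristic = [1, 0, 1]
final_weights = [400.0, 300.0, 600.0]
replicate_weights = [
    [390.0, 310.0, 590.0],
    [410.0, 290.0, 610.0],
    [405.0, 300.0, 605.0],
    [395.0, 300.0, 595.0],
]

full = weighted_total(has_characteristic, final_weights)
reps = [weighted_total(has_characteristic, w) for w in replicate_weights]
variance = bootstrap_variance(full, reps)

print(full)                           # 1000.0
print(variance)                       # 250.0
print(round(math.sqrt(variance), 2))  # 15.81
```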

9.0 Data quality

9.1 Out-of-scope cases

The out-of-scope rates of SLCDC vary among the different conditions. The rates are 7% for asthma, 18% for COPD, and 3% for diabetes. There may be several reasons for units becoming out-of-scope:

  • Respondents were incorrectly classified as having the condition according to the CCHS. For example, a number of respondents reported that their condition had not been diagnosed by a health professional, a requirement to be in-scope for the SLCDC.
  • Respondents indicated that they did not have the condition because they no longer had symptoms.
  • Respondents were conditioned by the CCHS to answer "no" to certain questions, knowing that they would then be screened out of the survey. In a certain sense, these units can be considered refusals.

Due to out-of-scope units, the total number of people having the condition differs between the CCHS and the SLCDC. This is especially true for the COPD condition. The CCHS likely includes some respondents who report having the condition but really do not (false positives). However, the SLCDC likely excludes some respondents who really do have the condition but who indicate that they do not to avoid completing the survey (false negatives). The objectives of the analysis dictate which survey should be used. For example, the CCHS provides a time series of the prevalence rates by condition while the SLCDC can be considered a one-time survey. In addition, the CCHS data should be used when looking at co-morbidities with other conditions. However, the SLCDC is able to provide detailed information about the quality of life and health behaviours of persons living with the chronic diseases.

9.2 Response rates

A total of 9,130 people were selected to take part in the 2011 SLCDC: 3,650 for the asthma questionnaire, 1,733 for the COPD questionnaire and 3,747 for the diabetes questionnaire.

For the asthma questionnaire, there were 238 units deemed out-of-scope among the resolved cases (units that had been contacted and could be classified as in- or out-of-scope). Among the unresolved cases (units with no contact, so it was not possible to determine whether they were in- or out-of-scope), the logistic model predicted 68 units to be out-of-scope (see section 8.1.3 for more detailed information on the use of the logistic model). Therefore, there were a total of 306 modelled out-of-scope units. Of the 3,344 modelled in-scope units, 2,507 cases responded to the survey and agreed to share their data with the share partners and to link back to their CCHS responses. This resulted in a response rate of 75.0%. Table 9.1 below contains a summary of the SLCDC response rates by age group and sex for asthma.

Table 9.1: SLCDC initial sample size, modelled in-scope rate and response rate by sex and age group for the asthma questionnaire
| Sex | Age group | Sample selected | Modelled no. of in-scope units | Modelled in-scope rate (%) | Respondents | Response rate (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Female | 12 to 24 years old | 512 | 483 | 94.3 | 321 | 66.5 |
| Female | 25 to 39 years old | 544 | 511 | 93.9 | 360 | 70.5 |
| Female | 40 to 54 years old | 427 | 397 | 93.0 | 289 | 72.8 |
| Female | 55+ years old | 708 | 648 | 91.5 | 566 | 87.6 |
| Total Female | | 2,191 | 2,039 | 93.1 | 1,536 | 75.3 |
| Male | 12 to 24 years old | 504 | 459 | 91.1 | 331 | 72.1 |
| Male | 25 to 39 years old | 326 | 294 | 90.2 | 197 | 67.0 |
| Male | 40 to 54 years old | 272 | 247 | 90.8 | 185 | 74.9 |
| Male | 55+ years old | 357 | 305 | 85.4 | 258 | 84.6 |
| Total Male | | 1,459 | 1,305 | 89.4 | 971 | 74.4 |
| Total | | 3,650 | 3,344 | 91.6 | 2,507 | 75.0 |

For the COPD questionnaire, there were 315 units deemed out-of-scope among the resolved cases (units that had been contacted and could be classified as in- or out-of-scope). Among the unresolved cases (units with no contact, so it was not possible to determine whether they were in- or out-of-scope), the logistic model predicted 57 units to be out-of-scope. Therefore, there were a total of 372 modelled out-of-scope units. Of the 1,361 modelled in-scope units, 1,133 cases responded to the survey and agreed to share their data with the share partners and to link back to their CCHS responses. This resulted in a response rate of 83.2%. Table 9.2 below contains a summary of the SLCDC response rates by age group and sex for COPD.

Table 9.2: SLCDC initial sample size, modelled in-scope rate and response rate by sex and age group for the COPD questionnaire
| Sex | Age group | Sample selected | Modelled no. of in-scope units | Modelled in-scope rate (%) | Respondents | Response rate (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Female | 35+ years old | 1,078 | 870 | 80.7 | 728 | 83.7 |
| Male | 35+ years old | 655 | 491 | 75.0 | 405 | 82.5 |
| Total | | 1,733 | 1,361 | 78.5 | 1,133 | 83.2 |

For the diabetes questionnaire, there were 129 units deemed out-of-scope among the resolved cases (units that had been contacted and could be classified as in- or out-of-scope). Among the unresolved cases (units with no contact, so it was not possible to determine whether they were in- or out-of-scope), the logistic model predicted 28 units to be out-of-scope. Therefore, there were a total of 157 modelled out-of-scope units. Of the 3,590 modelled in-scope units, 2,933 cases responded to the survey and agreed to share their data with the share partners and to link back to their CCHS responses. This resulted in a response rate of 81.7%. Table 9.3 below contains a summary of the SLCDC response rates by age group and sex for diabetes.

Table 9.3: SLCDC initial sample size, modelled in-scope rate and response rate by sex and age group for the diabetes questionnaire
| Sex | Age group | Sample selected | Modelled no. of in-scope units | Modelled in-scope rate (%) | Respondents | Response rate (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Female | 20 to 64 years old | 769 | 735 | 95.6 | 589 | 80.1 |
| Female | 65+ years old | 1,086 | 1,042 | 95.9 | 874 | 83.9 |
| Total Female | | 1,855 | 1,777 | 95.8 | 1,463 | 82.3 |
| Male | 20 to 64 years old | 875 | 841 | 96.1 | 678 | 80.6 |
| Male | 65+ years old | 1,017 | 972 | 95.6 | 792 | 81.5 |
| Total Male | | 1,892 | 1,813 | 95.8 | 1,470 | 81.1 |
| Total | | 3,747 | 3,590 | 95.8 | 2,933 | 81.7 |
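
The arithmetic behind the overall response rates quoted in this section can be sketched as follows. This is an illustrative Python sketch only, not the program used by Statistics Canada; the function name `response_rate` and the dictionary layout are assumptions, while the counts are those quoted in the text above.

```python
def response_rate(selected, resolved_out, modelled_out, respondents):
    """Return (modelled in-scope units, response rate in %).

    Modelled in-scope units = sample selected minus all modelled
    out-of-scope units (resolved cases plus model-predicted cases).
    """
    in_scope = selected - (resolved_out + modelled_out)
    return in_scope, 100.0 * respondents / in_scope


# (sample selected, out-of-scope among resolved cases,
#  out-of-scope predicted by the logistic model, respondents)
questionnaires = {
    "asthma": (3650, 238, 68, 2507),
    "COPD": (1733, 315, 57, 1133),
    "diabetes": (3747, 129, 28, 2933),
}

results = {name: response_rate(*counts) for name, counts in questionnaires.items()}
for name, (in_scope, rate) in results.items():
    print(f"{name}: {in_scope} modelled in-scope units, response rate {rate:.1f}%")
```

Run as written, this reproduces the 3,344, 1,361 and 3,590 modelled in-scope units and the 75.0%, 83.2% and 81.7% response rates reported for the three questionnaires.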

9.3 Data interpretation

Since the 2011 SLCDC is a follow-up survey that collected additional data from targeted respondents from the 2010 CCHS, the two surveys share the same survey population. However, their reference periods can be different. The reference period for the 2010 CCHS is the 2010 calendar year while the data collected by the SLCDC reflects the status of the same survey population in October – November 2010 or March – April 2011, depending on whether the unit belonged to the first or second collection period. Under most circumstances, this will not affect the comparison of data from the CCHS and SLCDC. However, interpretation of estimates from the SLCDC should consider the reference period if it is felt that this would affect the responses from respondents.

9.4 Survey errors

The estimates derived from this survey are based on a sample of persons. Somewhat different estimates might have been obtained if a complete census had been taken using the same questionnaire, interviewers, supervisors, processing methods, etc. The differences between the estimates obtained from the sample and those resulting from a census taken under similar conditions are called the sampling error of the estimate.

Errors which are not related to sampling may occur at almost every phase of a survey operation. Interviewers may misunderstand instructions, respondents may make errors in answering questions, the answers may be incorrectly entered on the questionnaire and errors may be introduced in the processing and tabulation of the data. These are all examples of non-sampling errors.

Over a large number of observations, randomly occurring errors will have little effect on estimates derived from the survey. However, errors occurring systematically will contribute to biases in the survey estimates. Considerable time and effort are taken to reduce non-sampling errors in the survey. Quality assurance measures are implemented at each step of the data collection and processing cycle to monitor the quality of the data. These measures include the use of highly skilled interviewers, extensive training of interviewers with respect to the survey procedures and questionnaire, observation of interviewers to detect problems of questionnaire design or misunderstanding of instructions, procedures to ensure that data capture errors are minimized, and coding and edit quality checks to verify the processing logic.

9.4.1 The frame

The 2011 SLCDC was a supplement to the 2010 CCHS, which is based mainly on an area frame and a telephone frame. The coverage of the 2011 SLCDC should then be the same as the CCHS in the ten provinces; the SLCDC does not cover the territories. For the ten provinces, it is unlikely that the under-coverage of CCHS and SLCDC will introduce any significant bias into the survey data.

9.4.2 Non-response

A major source of non-sampling errors in surveys is the effect of non-response on the survey results. The extent of non-response varies from partial non-response (failure to answer just one or some questions) to total non-response. In the case of the 2011 SLCDC, only complete responses were kept for the survey. It is worthwhile to note that respondents tend to complete the questionnaire once they start the interview so partial non-response tends to be rare. Total non-response occurs because the interviewer is either unable to contact the respondent, or the respondent refuses to participate in the survey. Total non-response was handled by adjusting the weight of individuals who responded to the survey to compensate for those who did not respond.

It is important to note that the 2011 SLCDC interviews took place several months after the 2010 CCHS interviews. As a result, some units could not be contacted because they had moved or changed their phone number. The strategy of breaking the collection into two waves (October – November 2010 and March – April 2011) reduced this risk. For these unresolved (non-contact) cases, logistic models were used to estimate the proportion of in-scope and out-of-scope units (see section 8.1.3 for more details).

9.4.3 Measurement of sampling error

Since it is an unavoidable fact that estimates from a sample survey are subject to sampling error, sound statistical practice calls for researchers to provide users with some indication of the magnitude of this sampling error. This section of the documentation outlines the measures of sampling error which Statistics Canada commonly uses. It urges users to provide similar measures when producing estimates from this microdata file.

The basis for measuring the potential size of sampling errors is the standard error of the estimates derived from survey results. However, because of the large variety of estimates that can be produced from a survey, the standard error of an estimate is usually expressed relative to the estimate to which it pertains. This resulting measure, known as the coefficient of variation (CV) of an estimate, is obtained by dividing the standard error of the estimate by the estimate itself and is expressed as a percentage of the estimate.

A fictitious example is used to illustrate. Suppose that, based on the survey results, 45.1% of Canadians visited a health care professional in the past twelve months for their chronic disease and this estimate has a standard error of 0.009. Then the coefficient of variation of the estimate is calculated as:

$$\mathrm{CV} = \frac{0.009}{0.451} \times 100\% = 2.0\%$$

There is more information on the calculation of coefficients of variation in Chapter 10.
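
The fictitious example above can also be worked through in a few lines of code. The sketch below is illustrative only; the function name `coefficient_of_variation` is an assumption and does not refer to any Statistics Canada program.

```python
def coefficient_of_variation(estimate, standard_error):
    """CV = standard error divided by the estimate, expressed in percent."""
    return 100.0 * standard_error / estimate


# Fictitious example from the text: estimate of 45.1% (0.451)
# with a standard error of 0.009.
cv = coefficient_of_variation(0.451, 0.009)
print(f"CV = {cv:.1f}%")
```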

10.0 Guidelines for tabulation, analysis and release

This section of the documentation outlines the guidelines to be adhered to by users tabulating, analyzing, publishing or otherwise releasing any data derived from the survey data files. With the aid of these guidelines, users of microdata should be able to produce figures that are in close agreement with those produced by Statistics Canada and, at the same time, will be able to develop currently unpublished figures in a manner consistent with these established guidelines.

10.1 Rounding guidelines

In order that estimates for publication or other release derived from these data files correspond to those produced by Statistics Canada, users are urged to adhere to the following guidelines regarding the rounding of such estimates:

  1. Estimates in the main body of a statistical table are to be rounded to the nearest hundred units using the normal rounding technique. In normal rounding, if the first or only digit to be dropped is 0 to 4, the last digit to be retained is not changed. If the first or only digit to be dropped is 5 to 9, the last digit to be retained is raised by one. For example, in normal rounding to the nearest 100, if the last two digits are between 00 and 49, they are changed to 00 and the preceding digit (the hundreds digit) is left unchanged. If the last two digits are between 50 and 99, they are changed to 00 and the preceding digit is incremented by 1;
  2. Marginal sub-totals and totals in statistical tables are to be derived from their corresponding unrounded components and then are to be rounded themselves to the nearest 100 units using normal rounding;
  3. Averages, proportions, rates and percentages are to be computed from unrounded components (i.e., numerators and/or denominators) and then are to be rounded themselves to one decimal using normal rounding. In normal rounding to a single decimal, if the first or only digit to be dropped is 0 to 4, the last digit to be retained is not changed. If the first or only digit to be dropped is 5 to 9, the last digit to be retained is increased by 1;
  4. Sums and differences of aggregates (or ratios) are to be derived from their corresponding unrounded components and then are to be rounded themselves to the nearest 100 units (or the nearest one decimal) using normal rounding;
  5. In instances where, due to technical or other limitations, a rounding technique other than normal rounding is used resulting in estimates to be published or otherwise released that differ from corresponding estimates published by Statistics Canada, users are urged to note the reason for such differences in the publication or release document(s);
  6. Under no circumstances are unrounded estimates to be published or otherwise released by users. Unrounded estimates imply greater precision than actually exists.
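
The normal rounding technique described in guidelines 1 to 3 can be sketched as follows. This is an illustrative sketch only: the function names are assumptions, and Python's `decimal` module with `ROUND_HALF_UP` is used because Python's built-in `round()` rounds exact halves to the nearest even digit, which is not the normal rounding described above.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_hundred(value):
    """Normal rounding to the nearest 100 (a dropped 50 to 99 rounds up)."""
    hundreds = (Decimal(str(value)) / 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(hundreds * 100)


def round_to_one_decimal(value):
    """Normal rounding to one decimal (a dropped digit of 5 to 9 rounds up)."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(round_to_hundred(12349))      # last two digits 49: hundreds digit unchanged
print(round_to_hundred(12350))      # last two digits 50: hundreds digit raised by one
print(round_to_one_decimal(45.25))  # dropped digit is 5: retained digit raised by one

# Guideline 2: totals are derived from unrounded components, then rounded.
components = [1249, 1249]
total_from_unrounded = round_to_hundred(sum(components))
total_from_rounded = sum(round_to_hundred(c) for c in components)
print(total_from_unrounded, total_from_rounded)
```

The last line shows why the order of operations matters: rounding the unrounded total gives 2500, whereas summing the already rounded components gives 2400.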

10.2 Sample weighting guidelines for tabulation

The sample design used for this survey was not self-weighting. That is to say, the sampling weights are not identical for all individuals in the sample. When producing simple estimates including the production of ordinary statistical tables, users must apply the proper sampling weight. If proper weights are not used, the estimates derived from the data files cannot be considered to be representative of the survey population, and will not correspond to those produced by Statistics Canada.

Users should also note that some software packages might not allow the generation of estimates that exactly match those available from Statistics Canada because of their treatment of the weight field.

10.2.1 Definitions: categorical estimates, quantitative estimates

Before discussing how the survey data can be tabulated and analyzed, it is useful to describe the two main types of point estimates of population characteristics that can be generated from the data files.

Categorical estimates:

Categorical estimates are estimates of the number or percentage of the surveyed population possessing certain characteristics or falling into some defined category. How often individuals experience joint pain is an example of such an estimate. An estimate of the number of persons possessing a certain characteristic or exhibiting certain behaviours may also be referred to as an estimate of an aggregate.

Example of categorical question:

How often does your health professional check your blood pressure at your diabetes related appointments? (CODX_04)
Always
Often
Sometimes
Rarely
Never

Quantitative estimates:

Quantitative estimates are estimates of totals or of means, medians and other measures of central tendency of quantities based upon some or all of the members of the surveyed population.

An example of a quantitative estimate is the average age at which individuals are first diagnosed with asthma. The numerator is an estimate of the sum of the ages at which individuals with asthma were first diagnosed with this condition, and the denominator is an estimate of the number of individuals who have been diagnosed with asthma.

Example of quantitative question:

How old were you when you were first diagnosed with asthma?
(DHRX_07)
Age of diagnosis

10.2.2 Tabulation of categorical estimates

Estimates of the number of people with a certain characteristic can be obtained from the data files by summing the final weights of all records possessing the characteristic of interest.

Proportions and ratios of the form $\hat{X}/\hat{Y}$ are obtained by:

  1. summing the final weights of records having the characteristic of interest for the numerator ($\hat{X}$);
  2. summing the final weights of records having the characteristic of interest for the denominator ($\hat{Y}$); then
  3. dividing the numerator estimate by the denominator estimate.
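
The three steps above can be sketched as follows. The records, their weights and the response codes for CODX_04 are hypothetical values invented for illustration (the actual codes are defined in the data dictionary); only the variable names WTSX_S and CODX_04 come from this document, and the helper `weighted_total` is an assumption.

```python
# Hypothetical records: WTSX_S is the final weight, CODX_04 a categorical answer.
records = [
    {"WTSX_S": 150.0, "CODX_04": 1},
    {"WTSX_S": 200.0, "CODX_04": 2},
    {"WTSX_S": 250.0, "CODX_04": 1},
    {"WTSX_S": 400.0, "CODX_04": 5},
]


def weighted_total(rows, has_characteristic):
    """Sum the final weights of the records that satisfy the condition."""
    return sum(r["WTSX_S"] for r in rows if has_characteristic(r))


# Step 1: numerator, weighted count of records with the answer coded 1.
x_hat = weighted_total(records, lambda r: r["CODX_04"] == 1)
# Step 2: denominator, weighted count of all records of interest.
y_hat = weighted_total(records, lambda r: True)
# Step 3: proportion.
proportion = x_hat / y_hat
print(x_hat, y_hat, proportion)
```

An unweighted count would give 2 out of 4 records (50%), whereas the weighted proportion is 40%, which illustrates why the final weight must be used.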

10.2.3 Tabulation of quantitative estimates

Estimates of sums or averages for quantitative variables can be obtained using the following three steps (only step 1 is necessary to obtain the estimate of a sum):

  1. multiplying the value of the variable of interest by the final weight and summing this quantity over all records of interest to obtain the numerator ($\hat{X}$);
  2. summing the final weights of records having the characteristic of interest for the denominator ($\hat{Y}$); then
  3. dividing the numerator estimate by the denominator estimate.

For example, to obtain the estimate of the average age at which individuals are diagnosed with asthma, first compute the numerator ($\hat{X}$) by summing the products of the value of variable DHRX_07 and the weight WTSX_S. The denominator ($\hat{Y}$) is obtained by summing the final weights of those records with a value of "2" for the variable CONFLAG. Divide $\hat{X}$ by $\hat{Y}$ to obtain the average age at which individuals are diagnosed with asthma.
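
A minimal sketch of this calculation is given below. The three records and their values are hypothetical; the variable names DHRX_07, WTSX_S and CONFLAG are those used in the text. The sketch assumes that the same CONFLAG filter defines the records of interest for both the numerator and the denominator.

```python
# Hypothetical records: DHRX_07 = age at first asthma diagnosis,
# WTSX_S = final weight, CONFLAG = flag used to select the records.
records = [
    {"DHRX_07": 10, "WTSX_S": 100.0, "CONFLAG": 2},
    {"DHRX_07": 30, "WTSX_S": 300.0, "CONFLAG": 2},
    {"DHRX_07": 50, "WTSX_S": 200.0, "CONFLAG": 1},  # excluded by the filter
]

of_interest = [r for r in records if r["CONFLAG"] == 2]

# Step 1: numerator, the sum of (value x final weight) over records of interest.
x_hat = sum(r["DHRX_07"] * r["WTSX_S"] for r in of_interest)
# Step 2: denominator, the sum of the final weights over the same records.
y_hat = sum(r["WTSX_S"] for r in of_interest)
# Step 3: weighted average age at diagnosis.
average_age = x_hat / y_hat
print(x_hat, y_hat, average_age)
```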

10.3 Guidelines for statistical analysis

The SLCDC is based upon a complex design, with stratification, multiple stages of selection and unequal probabilities of selection of respondents. Using data from such complex surveys presents problems to analysts because the survey design and the selection probabilities affect the estimation and variance calculation procedures that should be used.

While many analysis procedures found in statistical packages allow weights to be used, the meaning or definition of the weight in these procedures can differ from what is appropriate in a sample survey framework, with the result that while in many cases the estimates produced by the packages are correct, the variances that are calculated are almost meaningless.

For many analysis techniques (for example linear regression, logistic regression, analysis of variance), a method exists that can make the application of standard packages more meaningful. If the weights on the records are rescaled so that the average weight is one (1), then the results produced by the standard packages will be more reasonable; they still will not take into account the stratification and clustering of the sample's design, but they will take into account the unequal probabilities of selection. The rescaling can be accomplished by using in the analysis a weight equal to the original weight divided by the average of the original weights for the sampled units (people) contributing to the estimator in question.
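
The rescaling described above amounts to dividing each weight by the average weight of the records used in the analysis. The sketch below is illustrative only; the function name `rescale_weights` and the example weights are assumptions.

```python
def rescale_weights(weights):
    """Divide each weight by the mean weight so the rescaled weights average 1."""
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]


# Hypothetical final weights of the records contributing to the analysis.
original = [150.0, 200.0, 250.0, 400.0]
rescaled = rescale_weights(original)
print(rescaled)
print(sum(rescaled) / len(rescaled))
```

The rescaled weights sum to the number of records rather than to the population size, so they should be used only for this kind of analysis and not for producing population totals.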

10.4 Release guidelines

Before releasing and/or publishing any estimate from the data file, users must first determine the number of sampled respondents having the characteristic of interest (for example, the number of respondents with arthritis who experience joint pain). If this number is less than 30, the un-weighted estimate should not be released regardless of the value of the coefficient of variation for this estimate. For weighted estimates based on sample sizes of 30 or more, users should determine the coefficient of variation of the rounded estimate and follow the guidelines below.

Table 10.1 Sampling variability guidelines
| Type of estimate | CV (in %) | Guidelines |
| --- | --- | --- |
| Acceptable | 0.0 ≤ CV ≤ 16.6 | Estimates can be considered for general unrestricted release. Requires no special notation. |
| Marginal | 16.6 < CV ≤ 33.3 | Estimates can be considered for general unrestricted release but should be accompanied by a warning cautioning subsequent users of the high sampling variability associated with the estimates. Such estimates should be identified by the letter E (or in some other similar fashion). |
| Unacceptable | CV > 33.3 | Statistics Canada recommends not to release estimates of unacceptable quality. However, if the user chooses to do so then estimates should be flagged with the letter F (or in some other fashion) and the following warning should accompany the estimates: "The user is advised that …(specify the data) … do not meet Statistics Canada's quality standards for this statistical program. Conclusions based on these data will be unreliable and most likely invalid. These data and any consequent findings should not be published. If the user chooses to publish these data or findings, then this disclaimer must be published with the data." |

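The decision rules in section 10.4 and Table 10.1 can be sketched as a small function. This is an illustrative sketch only; the function name `release_category` and the returned labels are assumptions, while the thresholds (30 respondents, CVs of 16.6% and 33.3%) are those stated above.

```python
def release_category(n_respondents, cv_percent):
    """Classify an estimate using the sample-size rule and the CV thresholds."""
    if n_respondents < 30:
        # Fewer than 30 sampled respondents: not released, whatever the CV.
        return "Do not release"
    if cv_percent <= 16.6:
        return "Acceptable"
    if cv_percent <= 33.3:
        return "Marginal (flag E)"
    return "Unacceptable (flag F)"


print(release_category(250, 2.0))
print(release_category(45, 20.5))
print(release_category(60, 40.0))
print(release_category(12, 5.0))
```
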
10.5 Variances and coefficients of variation

The computation of exact coefficients of variation is not a straightforward task since there is no simple mathematical formula that would account for all SLCDC sampling frame and weighting aspects. Therefore, other methods such as re-sampling methods must be used in order to estimate measures of precision. Among these methods, the bootstrap method is the one recommended for analysis of SLCDC data.

The computation of coefficients of variation (or any other measure of precision) with the use of the bootstrap method requires access to information that is considered confidential.

A macro program called "Bootvar" was developed to give users easy access to the bootstrap method. The Bootvar program is available in SAS and SPSS formats, and is made up of macros that calculate the variances of totals, ratios, differences between ratios, and linear and logistic regressions.

Although some standard statistical packages allow sampling weights to be incorporated in the analyses, the variances that are produced often do not take into account the stratified and clustered nature of the design properly, whereas the exact variance program would do so.
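
The general idea of variance estimation with bootstrap replicate weights can be sketched as follows. This is not the Bootvar program and does not reproduce its internals: the function names, the toy data and the four replicates are assumptions for illustration (the SLCDC file is described above as having 500 replicates), and the variance is computed here as the average squared deviation of the replicate estimates from the full-sample estimate, which is one common form of the bootstrap variance estimator.

```python
import math


def weighted_proportion(values, weights):
    """Weighted proportion of records whose indicator value is 1."""
    return sum(w for v, w in zip(values, weights) if v) / sum(weights)


def bootstrap_cv(values, final_weights, replicate_weights):
    """Return (estimate, CV in %) using a set of bootstrap replicate weights."""
    theta = weighted_proportion(values, final_weights)
    replicates = [weighted_proportion(values, w) for w in replicate_weights]
    # Assumed estimator: mean squared deviation around the full-sample estimate.
    variance = sum((t - theta) ** 2 for t in replicates) / len(replicates)
    return theta, 100.0 * math.sqrt(variance) / theta


# Toy data: four records, one final weight and four hypothetical replicates.
values = [1, 0, 1, 0]
final_weights = [100.0, 100.0, 100.0, 100.0]
replicate_weights = [
    [200.0, 0.0, 100.0, 100.0],
    [0.0, 200.0, 100.0, 100.0],
    [100.0, 100.0, 100.0, 100.0],
    [100.0, 100.0, 100.0, 100.0],
]

theta, cv = bootstrap_cv(values, final_weights, replicate_weights)
print(f"estimate = {theta:.3f}, CV = {cv:.1f}%")
```

The same estimate is recomputed once per replicate weight, and the spread of those replicate estimates is what measures the sampling variability.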

11.0 File usage

This section begins by describing the data files and how they can be accessed, followed by the weight variable and how it should be used when doing tabulations. It then explains the variable naming convention employed by the SLCDC.

11.1 Data file

The SLCDC consists of two different data files: one for diabetes and one for respiratory conditions. Both of these data files have been linked to the 2010 CCHS.

Since the variables from the two surveys are on the same data file, users must be aware of which variables they are using in their analysis in order to ensure consistency in their estimates. For example, some demographic variables (age, sex and province of residence) were collected on both the CCHS and the SLCDC. More information on how to differentiate the variables from the SLCDC and CCHS is provided in Sections 11.3 and 11.4.

Unlike other CCHS data files, the SLCDC does not have a Master file separate from a Share file. Rather, the SLCDC data file contains only the respondents who agreed to link their SLCDC data to their 2010 CCHS data. Furthermore, only respondents who agreed to share the linked data with the share partners are included on the data file.

The data can be accessed in a number of ways, which are described in the following sections.

11.1.1 Share partners

Share partners have access to the data under the terms of the data sharing agreements. These data files contain only information on respondents who agreed to share their data with Statistics Canada's partners. The share partners for the SLCDC are the Public Health Agency of Canada (the survey sponsor), Health Canada and some provincial health departments. Statistics Canada also asks respondents living in Quebec for their permission to share their data with the Institut de la Statistique du Québec. The share file is released only to these organizations. Personal identifiers are removed from the share files to respect respondent confidentiality. Users of these files must first certify that they will not disclose, at any time, any information that might identify a survey respondent.

11.1.2 Research Data Centres

The Research Data Centre (RDC) Program allows researchers to use the survey data in a secure environment in several universities across Canada. Researchers must submit research proposals that, once approved, give them access to the RDC. For more information, please consult the following web page: RDC.

11.1.3 Custom tabulations

One way to provide access to the data files is to offer users the option of having staff in Client Services of the Health Statistics Division prepare custom tabulations. This service is offered on a cost recovery basis. It allows users who do not possess knowledge of tabulation software products to obtain custom results. The results are screened for confidentiality and reliability concerns before release. For more information, please contact Client Services at 613-951-1746 or by e–mail at statcan.hd-ds.statcan@statcan.gc.ca.

11.2 Use of weight variable

The weight variable WTSX_S represents the SLCDC sampling weight. For a given respondent, the sampling weight can be interpreted as the number of people the respondent represents in the population. This weight must always be used when computing statistical estimates in order to make inferences at the population level possible. The production of un-weighted estimates is not recommended. The sample allocation, as well as the survey design, can cause such results to not correctly represent the population. Refer to Chapter 8 on weighting for a more detailed explanation on the creation of this weight.

11.3 Variable naming convention

The SLCDC adopted a variable naming convention that allows data users to easily use and identify the data based on module and condition. The variable naming convention follows the mandatory requirement of restricting variable names to a maximum of eight characters for ease of use by analytical software products.

11.3.1 Variable name component structure in SLCDC

Each of the eight characters in a variable name contains information about the type of data contained in the variable.

Positions 1-2:
Module reference (e.g. SS – Symptoms and severity, ME – Medication use, SM – Self-management and HU – Health care utilization)
Position 3:
Questionnaire-specific reference (D – Diabetes, R – Respiratory)
Position 4:
Reference to the Survey on Living with Chronic Diseases in Canada (X)
Position 5:
Variable type (_ – question, D – derived variable)
Positions 6-8:
Question number

For example: The variable corresponding to Question 1, Health care utilization module, Respiratory questionnaire, SLCDC (HURX_01):

Position 1-2:
HU Comes from the Health care utilization module
Position 3:
R Respiratory questionnaire component
Position 4:
X SLCDC
Position 5:
_ underscore (_ = collected data)
Position 6-8:
01 question number (& answer option where applicable)
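
The positional structure illustrated above can be sketched as a small parser. This is an illustrative sketch only: the names `VariableName` and `parse_variable_name` are assumptions, and the sketch handles only names that follow the two-character module pattern shown in the example (it does not handle three-character section names such as GEN or ADM).

```python
from typing import NamedTuple


class VariableName(NamedTuple):
    module: str         # positions 1-2
    questionnaire: str  # position 3
    survey: str         # position 4
    var_type: str       # position 5
    question: str       # positions 6-8


QUESTIONNAIRES = {"D": "Diabetes", "R": "Respiratory"}


def parse_variable_name(name):
    """Split an SLCDC variable name into its positional components."""
    if not 6 <= len(name) <= 8:
        raise ValueError(f"unexpected length for variable name: {name!r}")
    if name[3] != "X":
        raise ValueError(f"position 4 is not 'X', so not an SLCDC variable: {name!r}")
    return VariableName(
        module=name[0:2],
        questionnaire=QUESTIONNAIRES.get(name[2], "unknown"),
        survey=name[3],
        var_type=name[4],
        question=name[5:8],
    )


parsed = parse_variable_name("HURX_01")
print(parsed)
```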

The following values are used for the section name component of the variable name:

11.3.2 Positions 1-3 variable / questionnaire section name
GEN General health (on both diabetes and respiratory questionnaires)
CN Confirmation of diabetes diagnosis
DH Diagnosis and family history
SS Symptoms and severity
TR Triggers
CO Clinical monitoring
HU Health care utilization
ME Medication use
IC Insurance coverage
HC Health conditions
AL Allergies
CL Clinical recommendations
RA Restriction of activities
RW Restriction of work related activities
RE Restriction of educational activities
RV Restriction of volunteer activities
SM Self-management
MO Self-monitoring
SW Support and well-being
SH Smoking history
SC Smoking cessation
DC Diabetes complications
PA Patient activation
ADM Administration (on both diabetes and respiratory questionnaires)

The third position of the variable name consists of either a D if the module is on the diabetes questionnaire or an R if the module is on the respiratory questionnaire. A number of modules are on both questionnaires but with different questions for respondents with diabetes and respiratory conditions.

11.3.3 Position 4: Cycle and survey name

The X in position four of the variable name indicates that the variable is part of the SLCDC.

11.3.4 Position 5 variable type
| Code | Variable type | Description |
| --- | --- | --- |
| _ | Collected variable | A variable that appeared directly on the questionnaire |
| C | Coded variable | A variable coded from one or more collected variables (e.g., SIC, Standard Industrial Classification code) |
| D | Derived variable | A variable calculated from one or more collected or coded variables, usually calculated during head office processing (e.g., Health Utility Index) |
| F | Flag variable | A variable calculated from one or more collected variables (like a derived variable), but usually calculated by the data collection computer application for later use during the interview (e.g., work flag) |
| G | Grouped variable | Collected, coded, suppressed or derived variables collapsed into groups (e.g., age groups) |

11.3.5 Positions 6-8: variable name

In general, the last three positions follow the variable numbering used on the questionnaire. The letter "Q" used to represent the word "question" is removed, and all question numbers are presented in a two-digit format. For example, question Q01A in a questionnaire becomes simply 01A, and question Q15 becomes simply 15.

For questions which allow for more than one response option (also referred to as a "mark-all" question), the final position in the variable naming sequence is represented by a letter. For this type of question, new variables were created to differentiate between a "yes" and "no" answer for each response option. For example, if Q2 had 4 response options, the new questions would be named Q2A for option 1, Q2B for option 2, Q2C for option 3, etc. If only options 2 and 3 were selected, then Q2A = No, Q2B = Yes, Q2C = Yes and Q2D = No.
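
The expansion of a "mark-all" question into one yes/no variable per response option can be sketched as follows. The function name `expand_mark_all` is an assumption, and the sketch uses the question labels of the example above (Q2A, Q2B, ...) rather than actual file variable names.

```python
import string


def expand_mark_all(question, n_options, selected):
    """Create one Yes/No entry per response option, lettered A, B, C, ..."""
    return {
        f"{question}{string.ascii_uppercase[i]}": "Yes" if (i + 1) in selected else "No"
        for i in range(n_options)
    }


# Example from the text: Q2 has 4 response options; options 2 and 3 were selected.
answers = expand_mark_all("Q2", 4, {2, 3})
print(answers)
```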

11.4 Variable name component structure in CCHS

Since the SLCDC data files have been linked to the CCHS, it is important to be able to distinguish between the surveys from which the variables originate. The variable naming conventions for the CCHS and SLCDC are very similar. The only difference is that the SLCDC uses an X in the fourth position to indicate that the variable comes from the SLCDC.

The example below shows the age variable from the SLCDC and CCHS:
SLCDC: DHHX_AGE
CCHS: DHH_AGE

Users should therefore be aware which variables they are using in order to ensure consistency in their estimates.

Notes

Footnote 1

According to the definitions of the conditions, nobody in the SLCDC sample could be classified to both asthma and COPD. As a result, the maximum number of conditions to which anyone could be classified was two: COPD and diabetes or asthma and diabetes.


Notice of release of North American Industry Classification System (NAICS) Canada 2017 Version 2.0

March 6, 2017

This is an information notice.

The North American Industry Classification System (NAICS) Canada 2017 Version 2.0 replaces NAICS Canada 2017 Version 1.0 as the departmental standard for classifying industry data.

The structure of NAICS remains largely unchanged. Changes have been made to Internet publishing and marijuana cultivation. Internet-only publishing has been moved from 519130 Internet publishing and broadcasting, and web search portals (which is renamed Internet broadcasting and web search portals) to various publishing industries under 511 Publishing industries.

Marijuana growing under cover is classified to 111419 Other food crops grown under cover.

In addition, the definition for 523910 Miscellaneous intermediation has been revised, but the scope of this industry has not changed.

New examples have been added to 541690 Other scientific and technical consulting services.

NAICS Canada 2017 Version 2.0 will be used by the Statistical Registers and Geography Division for the classification of units on the Business Register system, and statistical programs will start implementing NAICS 2017 Version 2.0 as of January 2018.

Contact information

For more information, please contact Standards Division.

Changes in NAICS Canada 2017 Version 2.0

111419 Other food crops grown under cover (US)

This Canadian industry comprises establishments, not classified to any other Canadian industry, primarily engaged in growing food crops under glass or protective cover.

Illustrative example(s)

  • herb and spice crops grown under cover
  • hydroponic crops, grown under cover
  • market gardening, greenhouse
  • vegetable food crops grown in greenhouses

Exclusion(s)

  • growing vegetable and melon bedding plants, in open fields (see 111219 Other vegetable (except potato) and melon farming)
  • raising both aquatic animals and plants in integrated growing operations, aquaponics (see 112510 Aquaculture)

All examples

  • fruit farming, berry grown under cover
  • herb and spice crops grown under cover
  • herb farming (e.g., ginseng, echinacea, etc.), greenhouse grown
  • hydroponic crops, grown under cover
  • marijuana, grown under cover
  • market gardening, greenhouse
  • seaweed, grown under cover
  • vegetable farming, under cover
  • vegetable food crops grown in greenhouses

111999 All other miscellaneous crop farming (CAN)

This Canadian industry comprises establishments, not classified to any other Canadian industry, primarily engaged in growing crops.

Illustrative example(s)

  • crop and animal (livestock) combination farming (primarily crop)
  • field crop combination farm (except grain and oil seeds; or fruit and vegetable)
  • herb farming (except greenhouse grown)
  • hop, growing or farming
  • peanut farming
  • spice farming (except greenhouse grown)

Inclusion(s)

  • general crop farming or combination crop farming (except combination fruit and vegetable farming)

Exclusion(s)

  • growing a combination of fruit and vegetable (see 111993 Fruit and vegetable combination farming)
  • growing a combination of oilseeds and grains (see 111190 Other grain farming)
  • growing cotton (see 111920 Cotton farming)
  • growing greenhouse, nursery and floriculture products (see 1114 Greenhouse, nursery and floriculture production)
  • growing hay (see 111940 Hay farming)
  • growing sugar cane (see 111930 Sugar cane farming)
  • growing tobacco (see 111910 Tobacco farming)
  • growing tree nuts and fruit (see 1113 Fruit and tree nut farming)
  • growing wheat, corn, rice, soybeans, and other grains and oilseeds (see 1111 Oilseed and grain farming)

All examples

  • chicory plants and roots, growing
  • combination field crop farming
  • crop and animal (livestock) combination farming (primarily crop)
  • crop farms, general
  • crops, hay and grain farming
  • field crop combination farm (except grain and oil seeds; or fruit and vegetable)
  • general crop farming or combination crop farming (except combination fruit and vegetable farming)
  • ginseng and echinacea herb growing (except greenhouse grown)
  • grain and forage crops
  • grain and sugar beets, farming
  • grass, clover and alfalfa seed growing
  • hemp growing
  • herb farming (except greenhouse grown)
  • hop, growing or farming
  • hops and grain growing, combination
  • osier growing
  • peanut farming
  • potatoes and cauliflower growing, combination
  • potatoes and grain growing, combination
  • potatoes and holly growing, combination
  • potatoes and turnips growing, combination
  • spice farming (except greenhouse grown)
  • sugar beet farming
  • tea farming

511 Publishing industries

This subsector comprises establishments primarily engaged in publishing newspapers, periodicals, books, databases, software and other works. These works are characterized by the intellectual creativity required in their development and are usually protected by copyright. Publishers distribute, or arrange for the distribution of copies of these works.

Publishing establishments may create the works in-house, or contract for, purchase, or compile works that were originally created by others. These works may be published in one or more formats including traditional print form, electronic and online. Publishers of multimedia products, such as interactive children's books, multimedia CD-ROM and digital video disk (DVD) reference books, and musical greeting cards are also included. Establishments in this subsector may print, reproduce or offer direct online access to the works themselves or they may arrange with others to carry out such functions.

51111 Newspaper publishers

This industry comprises establishments primarily engaged in carrying out operations necessary for producing and distributing newspapers, including gathering news; writing news columns, feature stories and editorials; and selling and preparing advertisements. These establishments may publish newspapers in print, electronic form or online.

511110 Newspaper publishers (CAN)

This Canadian industry comprises establishments primarily engaged in carrying out operations necessary for producing and distributing newspapers. These establishments may publish newspapers in print, electronic form or online.

Illustrative example(s)

  • newspapers, publishing
  • newspapers, publishing and printing

Inclusion(s)

  • gathering news, writing news columns, feature stories and editorials; and selling and preparing advertisements
  • publishing newspapers in print, electronic form or online

Exclusion(s)

  • printing, but not publishing, newspapers (see 32311 Printing)
  • selling media time or space for media owners (see 541840 Media representatives)
  • supplying information, such as news, reports and pictures, to the news media (see 519110 News syndicates)

All examples

  • ethnic newspapers, publishing
  • gathering news, writing news columns, feature stories and editorials; and selling and preparing advertisements
  • newspaper branch offices, editorial and advertising
  • newspaper publishing and commercial printing combined
  • newspaper publishing and job printing combined
  • newspapers, publishing
  • newspapers, publishing and printing
  • newspapers, publishing exclusively on Internet
  • publishers or publishing of newspaper, combined with printing
  • publishing newspapers in print, electronic form or online

51112 Periodical publishers

This industry comprises establishments, known as magazine or periodical publishers, primarily engaged in carrying out operations necessary for producing and distributing magazines and other periodicals, including gathering, writing, soliciting and editing articles, and preparing and selling advertisements. Periodicals are published at regular intervals, typically on a weekly, monthly or quarterly basis. These periodicals may be published in print, electronic form or online.

511120 Periodical publishers (CAN)

This Canadian industry comprises establishments, known as magazine or periodical publishers, primarily engaged in carrying out operations necessary for producing and distributing magazines and other periodicals. Periodicals are published at regular intervals, typically on a weekly, monthly or quarterly basis. These periodicals may be published in print, electronic form or online.

Illustrative example(s)

  • advertising periodicals, publishing
  • comic books in issue format, publishing
  • magazine publishing
  • newsletters publishing
  • periodicals, all formats, publishing

Inclusion(s)

  • gathering, writing, soliciting and editing articles, and preparing and selling advertisements, in periodical publishing
  • publishing periodicals in print, electronic form or online

Exclusion(s)

  • printing, but not publishing, periodicals (see 32311 Printing)
  • publishing directories and databases (see 511140 Directory and mailing list publishers)
  • publishing newspapers (see 511110 Newspaper publishers)
  • publishing sheet music (see 512230 Music publishers)
  • selling media time or space for media owners (see 541840 Media representatives)

All examples

  • advertising periodicals, publishing
  • agricultural magazines and periodicals, publishing and printing combined
  • agricultural magazines and periodicals, publishing or publisher
  • comic books in book format, publishing and printing combined
  • comic books in issue format, publishing
  • crossword puzzle publishing
  • financial magazines and periodicals, publishing and printing combined
  • financial magazines and periodicals, publishing or publisher
  • gathering, writing, soliciting and editing articles, and preparing and selling advertisements, in periodical publishing
  • guide, periodical, journal and magazine publishing, exclusively on Internet (e.g., school, technical, television, trade)
  • juvenile magazines and periodicals, publishing and printing combined
  • juvenile magazines and periodicals, publishing or publisher
  • magazine publishers, publishing and printing combined
  • magazine publishing
  • magazines, publishing and printing combined
  • newsletters publishing
  • newsletters, publishing and printing combined
  • periodical publishers
  • periodicals, all formats, publishing
  • periodicals, publishing and printing combined
  • professional magazines and periodicals, publishing and printing combined
  • professional magazines and periodicals, publishing or publisher
  • publishing periodicals in print, electronic form or online
  • radio guide, publishing or publishers
  • radio guide and schedule publishers or publishing, exclusively on Internet
  • radio or television guides, publishing and printing combined
  • radio schedules and guides, publisher or publishing
  • radio schedules and guides, publishing and printing combined
  • radio, TV, transport, and similar timetables, publishing
  • religious magazines and periodicals, publisher or publishing
  • religious magazines and periodicals, publishing and printing combined
  • scholarly journals, publishing and printing combined
  • scholarly journals, publishing or publisher
  • scholastic magazines and periodicals, publishing and printing combined
  • scholastic magazines and periodicals, publishing or publisher
  • shoppers and real estate guides, publishing and printing
  • technical magazines and periodicals, publishing and printing combined
  • technical magazines and periodicals, publishing or publisher
  • television guides, publishing or publishers
  • trade journals, publishing and printing combined
  • trade journals, publishing or publisher
  • trade magazines and periodicals, publishing and printing combined
  • trade magazines and periodicals, publishing or publisher

51113 Book publishers

This industry comprises establishments primarily engaged in carrying out various design, editing and marketing activities necessary for producing and distributing books of all kinds, such as text books; technical, scientific and professional books; and mass market paperback books. These books may be published in print, audio, electronic form, or online.

511130 Book publishers (CAN)

This Canadian industry comprises establishments primarily engaged in carrying out various design, editing and marketing activities necessary for producing and distributing books of all kinds. These books may be published in print, audio, electronic form, or online.

Illustrative example(s)

  • books, publishing
  • publishing maps, street guides and atlases
  • school textbooks, publishing
  • travel guide, publishing

Inclusion(s)

  • producing and distributing text books; technical, scientific and professional books; and mass market paperback books
  • publishing books in print, audio, electronic form or online
  • publishing books online

Exclusion(s)

  • direct selling, but not publishing, books (e.g., book clubs) (see 454110 Electronic shopping and mail-order houses)
  • printing, but not publishing, books (see 32311 Printing)
  • publishing music books (see 512230 Music publishers)

All examples

  • almanacs, publishing and printing combined
  • almanacs, publishing or publishers
  • atlases, publishing and printing combined
  • book copyright licensing agency
  • book publishers and exclusive agents
  • book publishing online
  • books, manuals and textbook, publishing exclusively on Internet (e.g., school, technical)
  • books, printing and publishing combined
  • books, publishing
  • book publishing exclusively on Internet (e.g., almanacs, atlases, dictionaries, encyclopedia, fiction)
  • dictionaries, publishing
  • dictionaries, publishing and printing combined
  • encyclopedias, publishing and printing combined
  • encyclopedias, publishing or publisher
  • exclusive agents, books
  • fiction books, publishing and printing combined
  • fiction books, publishing or publishers
  • guides, street, publishers or publishing
  • maps, publishing and printing combined
  • maps, publishing or publisher
  • non-fiction, professional or religious books, publishing exclusively on Internet
  • non-fiction books, publishers or publishing
  • non-fiction books, publishing and printing combined
  • producing and distributing text books; technical, scientific and professional books; and mass market paperback books
  • professional books, publishing and printing combined
  • professional books, publishing or publishers
  • publishing books in print, audio, electronic form, or online
  • publishing maps, street guides and atlases
  • religious books, publishers or publishing
  • religious books, publishing and printing combined
  • school textbooks, publishing
  • school textbooks, publishing and printing combined
  • street guides, publishing and printing combined
  • technical books, publishing and printing combined
  • technical books, publishing or publishers
  • technical manuals and papers, publishing and printing combined
  • technical manuals and papers, publishing or publishers
  • travel guide, publishers or publishing exclusively on Internet
  • travel guide, publishing
  • travel guide, publishing and printing combined

51114 Directory and mailing list publishers

This industry comprises establishments primarily engaged in publishing compilations and collections of information or facts that are logically organized to facilitate their use. These collections may be published in one or more formats, such as print, electronic form or online. Electronic versions may be provided directly to customers by the establishment or third party vendors.

511140 Directory and mailing list publishers (CAN)

This Canadian industry comprises establishments primarily engaged in publishing compilations and collections of information or facts that are logically organized to facilitate their use. These collections may be published in one or more formats, such as print, electronic form or online. Electronic versions may be provided directly to customers by the establishment or third party vendors.

Illustrative example(s)

  • address and mailing list compilers
  • directories, publishing
  • electronic database (machine readable) publishing
  • telephone directories, publishing

Inclusion(s)

  • providing electronic versions of directories and mailing lists directly to customers by the establishment or third party vendors
  • publishing directories and mailing lists in one or more formats such as print, electronic form or online

Exclusion(s)

  • designing, developing and publishing computer software products (see 51121 Software publishers)
  • duplicating electronic media, such as CD-ROMs and DVDs (see 334610 Manufacturing and reproducing magnetic and optical media)
  • printing, but not publishing, business directories, telephone books and similar products (see 32311 Printing)
  • providing online access to databases developed by others (see 519130 Internet broadcasting and web search portals)
  • publishing encyclopaedias (see 511130 Book publishers)

All examples

  • address and mailing list compilers
  • address list, publishers or publishing
  • address list, publishing and printing combined
  • business directory, publishing and printing combined
  • catalog of collections, publishing
  • catalog of collections, publishing and printing combined
  • compiling mailing lists
  • directories, publishing
  • directories, publishing and printing combined
  • directory publishing exclusively on Internet (e.g., telephone, business)
  • directory (e.g., business, telephone), publishing or publisher
  • electronic database (machine readable) publishing
  • Internet domain registration services
  • providing electronic versions of directories and mailing lists directly to customers by the establishment or third party vendors
  • publishing directories and mailing lists in one or more formats such as print, electronic form or online
  • shipping register, publishing
  • shipping register, publishing and printing combined
  • telephone directories, publishing
  • telephone directories, publishing and printing combined

511190 Other publishers (CAN)

This Canadian industry comprises establishments, not classified to any other Canadian industry, primarily engaged in publishing other works such as calendars, colouring books, greeting cards and posters.

Illustrative example(s)

  • art prints, publishing
  • calendars, publishing
  • catalogues (e.g., mail order, store and merchandise), publishing
  • diaries and time schedulers, publishing
  • greeting cards and postcards, publishing

Exclusion(s)

  • publishing books, maps and atlases (see 511130 Book publishers)
  • publishing directories and mailing lists (see 511140 Directory and mailing list publishers)
  • publishing magazines and periodicals (see 511120 Periodical publishers)
  • publishing music (see 512230 Music publishers)
  • publishing newspapers (see 511110 Newspaper publishers)

All examples

  • art prints, publishing
  • art prints, publishing and printing combined
  • calendars, publishers or publishing
  • calendars, publishing
  • catalogues (i.e., mail order, store and merchandise), publishing and printing combined
  • catalogues (e.g., mail order, store and merchandise), publishing
  • colouring books, publishing
  • diaries and time schedulers, publishing
  • discount coupon books, publishing and publishers
  • greeting cards and postcards, publishing
  • greeting cards, publishing and printing combined
  • limited edition art prints, publishing
  • pamphlets publishers or publishing exclusively on Internet
  • pattern and plan (e.g., clothing patterns), publishers or publishing exclusively on Internet
  • patterns, paper (i.e., clothing), publishing and printing combined
  • patterns, paper (i.e., clothing), publishing or publishers
  • poster publishers or publishing exclusively on Internet
  • posters, publishing and printing combined
  • posters, publishing or publisher
  • publishers or publishing exclusively on Internet (e.g., art, calendar, catalog of collections, coloring book)
  • publishers or publishing exclusively on Internet (e.g., diary, time scheduler, discount coupon book, greeting card)
  • publishing and printing greeting cards
  • publishing maps and street guides exclusively on Internet
  • racetrack form or program publishers or publishing exclusively on Internet
  • race track programs, publisher or publishing
  • race track programs, publishing and printing combined
  • racing forms, publishing and printing combined, or publishing only
  • racing forms, publishing or publishers
  • yearbooks (i.e., high school, college, university), publishing or publisher
  • yearbooks (i.e., high school, college, university), publishing or publisher exclusively on Internet
  • yearbooks, publishing and printing combined

511211 Software publishers (except video game publishers) (CAN)

This Canadian industry comprises establishments primarily engaged in software publishing, not including video games. These establishments carry out operations necessary for producing and distributing computer software, such as designing, providing documentation, assisting in installation and providing support services to software purchasers.

Illustrative example(s)

  • publishing packaged software (except video games)
  • publishing packaged software (except video games), including designing and developing

Inclusion(s)

  • designing and publishing or publishing of software only

Exclusion(s)

  • custom designing software to meet the needs of specific users (see 541514 Computer systems design and related services (except video game design and development))
  • mass duplication of software (see 334610 Manufacturing and reproducing magnetic and optical media)
  • providing access to software for clients from a central host site (see 518210 Data processing, hosting, and related services)
  • video game software publishing (see 511212 Video game publishers)
  • video game software publishing (including designing and developing) (see 511212 Video game publishers)

All examples

  • applications software, computer, publishing (including designing and developing), packaged
  • computer software publishing (including designing and developing), packaged
  • computer software, packaged, publishers or publishing
  • designing and publishing or publishing of software only
  • operating systems software, computer, publishing (including designing and developing), packaged
  • programming languages and compilers, publishing
  • programming languages and compilers, publishing (including designing and developing), packaged
  • publishing and reproducing software in integrated facilities
  • publishing packaged software (except video games)
  • publishing packaged software (except video games), including designing and developing
  • software publishers
  • software publishing exclusively on Internet
  • software, computer, packaged, publishers or publishing
  • utility software, computer, packaged, publishers or publishing

511212 Video game publishers (CAN)

This Canadian industry comprises establishments primarily engaged in video game publishing. These establishments carry out operations necessary for producing and distributing computer video game software, such as designing video games, providing documentation, and providing support services to video game purchasers.

Illustrative example(s)

  • video game designing and developing (with publishing)
  • video game software publishing
  • video game software publishing (including designing and developing)

Inclusion(s)

  • designing and publishing or publishing video games only

Exclusion(s)

  • custom designing video games to meet the needs of specific users (see 541515 Video game design and development services)
  • mass duplication of video games (see 334610 Manufacturing and reproducing magnetic and optical media)
  • providing access to video games for clients from a central host site (see 518210 Data processing, hosting, and related services)

All examples

  • designing and publishing or publishing video games only
  • video game designing and developing (with publishing)
  • video game software publishing
  • video game software publishing exclusively on Internet
  • video game software publishing (including designing and developing)

512230 Music publishers (CAN)

This Canadian industry comprises establishments primarily engaged in acquiring and registering copyrights in musical compositions, in accordance with the law, and promoting and authorizing the use of these compositions in recordings, on radio and television, in motion pictures, live performances, print, multimedia or other media. These establishments represent the interests of songwriters or other owners of musical compositions in generating revenues from the use of such works, generally through licensing agreements.

Illustrative example(s)

  • music copyright buying and licensing
  • sheet music publishing

Inclusion(s)

  • copyrights or acting as administrators of music copyrights on behalf of copyright owners

Exclusion(s)

  • songwriters who act as their own publishers (see 711513 Independent writers and authors)

All examples

  • administration of music copyrights for others
  • copyrights or acting as administrators of music copyrights on behalf of copyright owners
  • music books (i.e., bound sheet music), publishers or publishing
  • music copyright buying and licensing
  • music copyright, authorizing use
  • music publishers or publishing
  • music publishing or publishers exclusively on Internet
  • music, publishing and printing combined
  • musical performance rights, publishing and licensing of
  • sheet music publishing
  • sheet music, publishing and printing combined
  • songs, publisher or publishing
  • songs, publishing
  • songs, publishing and printing combined

51913 Internet broadcasting and web search portals

This industry comprises establishments exclusively engaged in broadcasting content on the Internet or operating websites, known as web search portals, which use a search engine to generate and maintain extensive databases of Internet addresses and content in an easily searchable format. The Internet broadcasting establishments in this industry provide textual, audio, and/or video content of general or specific interest. These establishments do not provide traditional (non-Internet) versions of the content that they broadcast. Establishments known as web search portals often provide additional Internet services, such as e-mail, connections to other websites, auctions, news, and other limited content, and serve as a home base for Internet users.

519130 Internet broadcasting and web search portals (CAN)

This Canadian industry comprises establishments exclusively engaged in broadcasting content on the Internet or operating websites, known as web search portals, which use a search engine to generate and maintain extensive databases of Internet addresses and content in an easily searchable format. The Internet broadcasting establishments in this industry provide textual, audio, and/or video content of general or specific interest. These establishments do not provide traditional (non-Internet) versions of the content that they broadcast. Establishments known as web search portals often provide additional Internet services, such as e-mail, connections to other websites, auctions, news, and other limited content, and serve as a home base for Internet users.

Illustrative example(s)

  • Internet broadcasting (e.g., television, video)
  • Internet search portal and websites, operating
  • Web broadcasting
  • Web search portals, operating

Exclusion(s)

  • developing databases for the purpose of credit reporting (see 56145 Credit bureaus)
  • providing Internet access (see 517 Telecommunications)
  • publishing databases (see 511140 Directory and mailing list publishers)
  • retailing new and used goods using the Internet (see 44-45 Retail trade)
  • traditional or combined broadcasting (see 515 Broadcasting (except Internet))

All examples

  • game sites (exclusively on Internet), operating
  • Internet broadcasting (e.g., audio, radio)
  • Internet broadcasting (e.g., television, video)
  • Internet broadcasting and Web search portals
  • Internet search portals, operating
  • Internet search websites
  • Internet sports sites
  • operating file sharing websites (e.g., textual, audio or video content)
  • social networking websites
  • video on demand (VOD), distribution on the Internet
  • web broadcasting
  • web communities, operating
  • web search portals, operating

52391 Miscellaneous intermediation

This industry comprises establishments primarily engaged in acting as principals (except investment bankers, securities dealers, and commodity contracts dealers) in buying or selling of financial contracts, generally on a spread basis. Principals are investors that buy or sell for their own account.

523910 Miscellaneous intermediation (US)

This Canadian industry comprises establishments primarily engaged in acting as principals (except investment bankers, securities dealers, and commodity contracts dealers) in buying or selling of financial contracts, generally on a spread basis. Principals are investors that buy or sell for their own account.

Illustrative example(s)

  • investment clubs
  • land speculation
  • oil royalty dealing (acting as principals)
  • syndicates, investment
  • venture capital companies

All examples

  • bare trustee
  • buying income tax refunds
  • dealing (e.g., minerals, oil or gas royalties or leases, mortgages, tax certificates)
  • gas and oil royalty dealers
  • house flipping services
  • individuals investing in financial contracts on own account
  • investment clubs
  • investment companies investing on own account
  • investment companies, investing in financial contracts
  • land speculation
  • mineral royalties or leases, dealing
  • mortgages, buying and selling (rediscounting)
  • oil royalty dealing (acting as principals)
  • real estate as a trading stock of the seller, buying and selling of
  • syndicates, investment
  • tax liens dealing (i.e., acting as principals)
  • venture capital companies
  • viatical settlement companies

541690 Other scientific and technical consulting services

This Canadian industry comprises establishments, not classified to any other Canadian industry, primarily engaged in providing advice and assistance to other organizations on scientific and technical issues.

Illustrative example(s)

  • agricultural consulting (technical) services
  • agrology consulting services
  • agronomy consulting services
  • economic consulting services
  • energy consulting services
  • livestock breeding consulting services
  • occupational health and safety consulting services
  • physics consulting services
  • safety consulting services

Exclusion(s)

  • contract drilling for oil and gas (see 213111 Oil and gas contract drilling)
  • engineering consulting services (see 541330 Engineering services)
  • oil field crew supervision (see 561110 Office administrative services)
  • providing computer systems integration and design services (see 541514 Computer systems design and related services (except video game design and development))
  • quality control (product inspection) service (see 561990 All other support services)
  • real estate advisory services (see 531390 Other activities related to real estate)
  • services to oil and gas extraction (see 213118 Services to oil and gas extraction)

All examples

  • Aboriginal affairs advisor (technical consulting)
  • agricultural consulting (technical) services
  • agrology consulting services
  • agronomy consulting services
  • audio equipment consulting services
  • beer making consultant
  • biological consulting services
  • building envelope consulting services
  • chemical consulting services
  • dairy herd consulting services
  • defense consulting services
  • economic consulting services
  • energy consulting services
  • expert witness testifying in court
  • fishing consultant service, boundary issue disputes
  • forestry consulting services (except engineer)
  • geochemical consulting services
  • geology consultant
  • geopolitical consulting
  • geo-political consulting services
  • health management consulting services
  • health risk management consultants
  • hydrology consulting services
  • import and export trade consulting services
  • legal information consultant (advice to the general public)
  • livestock breeding consulting services
  • mine lighting consultant
  • motion picture consulting services
  • nuclear energy consulting services
  • occupational health and safety consulting services
  • occupational health and wellbeing consultant
  • paramedic consultant
  • personal growth consultant
  • pharmaceutical consultant
  • physics consulting services
  • product quality consulting services (except quality control service)
  • professional career advancement consultant
  • radio consulting services
  • roof consulting services
  • safety consulting services
  • security consulting services
  • technical consultant services
  • waterproof consulting services
  • wine consulting services

CVs for operating revenue - Commercial and industrial machinery and equipment rental and leasing

Table summary
This table displays the results of CVs for operating revenue - Commercial and industrial machinery and equipment rental and leasing. The information is grouped by Geography (appearing as row headers), CVs for operating revenue, calculated using percentage units of measure (appearing as column headers).

Geography                     CVs for operating revenue (percentage)
Canada                        2.32
Newfoundland and Labrador     1.32
Prince Edward Island          0.00
Nova Scotia                   0.71
New Brunswick                 0.00
Quebec                        2.48
Ontario                       3.08
Manitoba                      1.39
Saskatchewan                  1.32
Alberta                       4.52
British Columbia              5.62
Yukon                         0.00
Northwest Territories         0.00
Nunavut                       0.00
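
For readers unfamiliar with the measure, a coefficient of variation (CV) is conventionally defined as the estimated standard error of an estimate, expressed as a percentage of the estimate itself. The notation below is an assumption for illustration and is not taken from the survey's own documentation; here $\hat{Y}$ stands for the estimated operating revenue of a given geography.

```latex
\mathrm{CV}(\hat{Y}) \;=\; \frac{\sqrt{\widehat{\mathrm{Var}}(\hat{Y})}}{\hat{Y}} \times 100\%
```

Under this definition, the Canada-level CV of 2.32 would correspond to an estimated standard error of about 2.32% of the national operating revenue estimate.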