Calculation of Volume of Retail Trade Sales

Introduction

This document presents the methodology used to produce volume measures of retail trade sales. To achieve this goal, information from the Consumer Price Index (CPI), the Retail Commodity Survey (RCS) and the Monthly Retail Trade Survey (MRTS) is combined to produce monthly estimates for the total retail trade industry.

Purpose of Deflation

Changes in current dollar retail sales can be decomposed into two elements: a price element, that is, the part of the growth linked to price variations, and a volume element, which covers the change in quantities and quality of the goods and services sold. The chained dollar and constant price data for retail sales provide two evaluations of the changes in the volume of sales. The volume measures are obtained by removing from the current dollar value of sales the price variations measured by appropriate price indexes. This process is known as deflation.

Derivation of Retail Sales Price Indices

The price indexes used for deflation come from the Consumer Price Index (CPI – survey no. 2301) program. Adjustments are made to the CPI indexes to exclude changes in retail sales taxes, since the retail sales data exclude HST, GST or PST (see Note 1), whereas the CPI includes the effect of these tax changes. An unpublished dataset from Consumer Prices Division is used to adjust the CPI to the retail sales concept.

The Retail Commodity Survey (RCS – survey no. 2008) collects information for 120 exhaustive categories of goods sold by type of retail outlet. This survey provides a breakdown of the retail outlet’s total sales by commodity. Each RCS commodity is matched with the most suitable CPI component, or a weighted combination of CPI components. The RCS is the cornerstone of the methodology since its two dimensions allow the transformation of commodity prices into industry prices weighted by commodity sold.

The Monthly Retail Trade Survey (MRTS – survey no. 2406) produces data on retail trade sales in current dollars by type of store. In order to bring consistency to the data, the published estimates from RCS by type of retail outlet are benchmarked at the micro data level to the MRTS results. There is one exception: department stores (NAICS 452110) differ between the two surveys because the RCS includes concession sales while the MRTS does not.

Deriving a commodity breakdown for most recent months

Volume estimates of retail sales are published about fifty days after the end of the reference period, but the RCS data that are essential to produce them are available quarterly, within 90 days of the reference quarter. Therefore, a projection method is used to derive a commodity breakdown of MRTS data for the most recent months. For months when an RCS breakdown is not available, shares from the most recent month of RCS data serve as a starting point. The previous year’s difference in shares between the month being projected and the month being used is then added to the shares of the most recent month available (see Note 2). The objective of this projection method is to obtain a more up-to-date weighting structure that takes into consideration the seasonality in the goods sold. The calculated shares are applied to the current dollar value of sales by store type provided by the MRTS to derive the current dollar value of sales by store type and commodity.
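
The projection rule can be sketched in a few lines of Python. This is a minimal illustration rather than the production system: the function name, the commodity labels and all figures are hypothetical, and shares are assumed to be stored as fractions that sum to 1 within a store type.

```python
def project_shares(latest, prev_year_target, prev_year_latest):
    """Project commodity shares for a month without an RCS breakdown.

    Each argument maps a commodity to its share of a store type's sales.
    """
    return {
        commodity: latest[commodity]
        + prev_year_target[commodity]
        - prev_year_latest[commodity]
        for commodity in latest
    }


# Hypothetical shares for one store type (each set sums to 1).
shares_mar_2012 = {"food": 0.50, "clothing": 0.30, "garden": 0.20}
shares_jun_2011 = {"food": 0.45, "clothing": 0.30, "garden": 0.25}
shares_mar_2011 = {"food": 0.52, "clothing": 0.31, "garden": 0.17}

shares_jun_2012 = project_shares(shares_mar_2012, shares_jun_2011, shares_mar_2011)

# Apply the projected shares to the MRTS current dollar sales of the store type.
mrts_sales_jun_2012 = 1000.0  # hypothetical value
sales_by_commodity = {c: s * mrts_sales_jun_2012 for c, s in shares_jun_2012.items()}
```

Because each input set of shares sums to 1, the projected shares also sum to 1. The sketch does not guard against a projected share falling below zero, which a real system would presumably need to handle.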

Deflation and Aggregation

Volume at constant prices

To calculate 2007 dollar volume data, the price indexes are rescaled so that each index averages 100 over this reference year. Current dollar sales by store type and commodity are then divided by their respective price indexes to derive constant price sales. Finally, the sales in volume at constant prices are summed over all commodities to derive the volume of sales at constant prices by store type. The volume of total sales at constant prices is the sum of the volumes of sales at constant prices by store type.
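
As an illustration of the rebasing and deflation steps, the following Python sketch rescales an index to a reference year and deflates a current dollar value. The function names and figures are hypothetical, and only two reference-year months are shown to keep the example short.

```python
def rebase_index(index, reference_year):
    """Rescale a monthly index so it averages 100 over the reference year.

    `index` maps (year, month) tuples to index values.
    """
    ref_values = [v for (year, _month), v in index.items() if year == reference_year]
    ref_average = sum(ref_values) / len(ref_values)
    return {period: 100.0 * v / ref_average for period, v in index.items()}


def constant_price_sales(current_dollar_sales, price_index):
    """Deflate current dollar sales with an index whose reference year equals 100."""
    return 100.0 * current_dollar_sales / price_index


# Hypothetical index for one store type and commodity.
raw_index = {(2007, 1): 110.0, (2007, 2): 112.0, (2012, 6): 122.1}
index_2007 = rebase_index(raw_index, 2007)

# Hypothetical current dollar sales of 550 for June 2012.
volume_jun_2012 = constant_price_sales(550.0, index_2007[(2012, 6)])
```

Summing such deflated values over commodities gives the constant price volume for a store type, and summing over store types gives the total.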

Volume in chained dollars

The total volume of sales in chained dollars corresponds to the geometric mean of two evaluations of the variations in volume between two consecutive months. The first evaluation is based on the aggregated price of the previous month and the other is based on the price of the current month. Chained dollar estimates for total retail sales are derived from constant dollar estimates of sales by store type using a Fisher index formula. Only the total volume of sales of the retail sector as a whole can be computed in chained dollars.
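
The source does not spell out the formula, so the following Python sketch uses the standard textbook Fisher volume index as an assumption: the geometric mean of a volume change valued at the previous month's prices and one valued at the current month's prices. Function names, store types and figures are hypothetical.

```python
from math import sqrt


def fisher_volume_ratio(q_prev, q_curr, p_prev, p_curr):
    """Fisher volume change between two consecutive months.

    q_* map store type to constant price volume, p_* to the matching price level.
    """
    stores = list(q_prev)
    # Volume change valued at the previous month's prices (Laspeyres-type).
    laspeyres = sum(p_prev[s] * q_curr[s] for s in stores) / sum(
        p_prev[s] * q_prev[s] for s in stores
    )
    # Volume change valued at the current month's prices (Paasche-type).
    paasche = sum(p_curr[s] * q_curr[s] for s in stores) / sum(
        p_curr[s] * q_prev[s] for s in stores
    )
    return sqrt(laspeyres * paasche)


def chain_levels(reference_level, monthly_ratios):
    """Cumulate month-to-month Fisher ratios into a chained dollar series."""
    levels = [reference_level]
    for ratio in monthly_ratios:
        levels.append(levels[-1] * ratio)
    return levels


# Hypothetical data for two store types.
q_may = {"grocery": 100.0, "clothing": 50.0}
q_jun = {"grocery": 104.0, "clothing": 49.0}
p_may = {"grocery": 1.00, "clothing": 1.20}
p_jun = {"grocery": 1.02, "clothing": 1.18}

ratio = fisher_volume_ratio(q_may, q_jun, p_may, p_jun)
chained = chain_levels(160.0, [ratio])
```

Because each monthly link uses its own pair of price structures, chained levels for components do not add up to the chained total, which is consistent with only the retail total being computed in chained dollars.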

Implicit price indexes

Implicit price indexes are derived by dividing the current dollar sales by the volume in chained dollars or the volume in constant prices.  They can be thought of as the change in the average price of goods sold at retail stores.  They reflect both changes in prices and changes in the composition of goods and services sold.

Seasonal Adjustment

Current dollar sales and implicit price indexes by store type are both seasonally adjusted using the X-12-ARIMA method.  Seasonally adjusted volume sales by type of store are derived indirectly by dividing seasonally adjusted current dollar sales by their corresponding seasonally adjusted implicit price indexes by type of store.  Seasonally adjusted volumes of sales by type of store are then aggregated to derive seasonally adjusted volumes of total sales.
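
The indirect derivation can be sketched as follows. The seasonal adjustment itself (X-12-ARIMA) is not reproduced here: the seasonally adjusted inputs are taken as given, and all names and figures are hypothetical. Implicit price indexes are assumed to be expressed with the reference period equal to 100.

```python
def implicit_price_index(current_dollar_sales, volume):
    """Implicit price index (reference period = 100)."""
    return 100.0 * current_dollar_sales / volume


def sa_volume(sa_current_dollar_sales, sa_implicit_price_index):
    """Seasonally adjusted volume derived indirectly from two adjusted series."""
    return 100.0 * sa_current_dollar_sales / sa_implicit_price_index


# Hypothetical seasonally adjusted inputs by store type.
sa_sales = {"grocery": 520.0, "clothing": 230.0}
sa_prices = {"grocery": 104.0, "clothing": 115.0}

sa_volumes = {s: sa_volume(sa_sales[s], sa_prices[s]) for s in sa_sales}
sa_total_volume = sum(sa_volumes.values())
```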

Notes

  1. HST stands for Harmonized Sales Tax; GST for Goods and Services Tax; and PST for Provincial Sales Tax.

  2. For example, when data were released for June 2012, the most recent RCS data available were for March 2012, so the shares were computed as: Shares June 2012 = Shares March 2012 + Shares June 2011 – Shares March 2011.

Data quality, concepts and methodology: Definitions

The definitions used for the production of statistical tables of Canadian vital statistics data are based on those recommended by the World Health Organization (Note 1) and the United Nations (Note 2).

Age of mother. Age the mother attained at her last birthday preceding delivery.

Birth. The complete expulsion or extraction from its mother of a product of conception, irrespective of the duration of the pregnancy. See also "Fetal death (stillbirth)" and "Live birth".

Birth and fertility rates

  1. Age-specific fertility rate (ASFR): The number of live births per 1,000 women in a specific age group. Five-year age groups were used in these tabulations (ranging from 15 to 19 to 45 to 49 years).
  2. Age-specific fertility rate, women 15 to 19 years: Live births to women under age 20 per 1,000 women aged 15 to 19.
  3. Crude birth rate: The number of live births per 1,000 population.
  4. Total fertility rate (TFR): An estimate of the average number of live births a woman can be expected to have in her lifetime, based on the age-specific fertility rates (ASFR) of a given year. The TFR is the sum of the single-year-of-age age-specific fertility rates.
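
The rates defined above can be written as short Python functions. This is a sketch with hypothetical names and figures. One assumption is made explicit: since the ASFRs are expressed per 1,000 women, their sum is divided by 1,000 here so that the TFR reads as births per woman.

```python
def age_specific_fertility_rate(live_births, women):
    """Live births per 1,000 women in an age group."""
    return 1000.0 * live_births / women


def crude_birth_rate(live_births, population):
    """Live births per 1,000 population."""
    return 1000.0 * live_births / population


def total_fertility_rate(single_year_asfrs):
    """Sum of single-year-of-age ASFRs (per 1,000), expressed per woman."""
    return sum(single_year_asfrs) / 1000.0


# Hypothetical example: a flat rate of 40 per 1,000 at every age from 15 to 49.
asfrs = [40.0] * 35
tfr = total_fertility_rate(asfrs)
```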

Birth weight. The first weight of the fetus or newborn obtained immediately after birth, expressed in grams.

  1. Extremely low birth weight: Birth weight under 1,000 grams.
  2. Very low birth weight: Birth weight under 1,500 grams.
  3. Low birth weight: Birth weight under 2,500 grams.
  4. Normal birth weight: Ranges from 2,500 to 4,499 grams.
  5. High birth weight: Birth weight of 4,500 or more grams.
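
The thresholds above can be applied with a small helper. Note that the low birth weight categories are nested as defined (a weight under 1,000 grams is also under 1,500 and under 2,500 grams); this hypothetical function returns only the most specific category, which is a presentation choice and not something the definitions prescribe.

```python
def birth_weight_category(grams):
    """Return the most specific birth weight category for a weight in grams."""
    if grams < 1000:
        return "extremely low"
    if grams < 1500:
        return "very low"
    if grams < 2500:
        return "low"
    if grams < 4500:
        return "normal"
    return "high"


categories = [birth_weight_category(g) for g in (900, 1400, 2499, 2500, 4499, 4500)]
```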

Delivery. A delivery may consist of one or more live born or stillborn fetuses. The number of deliveries in a given period will be equal to or less than the number of births because multiple births (twins, triplets or higher-order births) are counted as single deliveries.

Fetal death (stillbirth). See Stillbirth definition.

Fetal death (stillbirth) rate. See Stillbirth rate definition.

Live birth. The complete expulsion or extraction from its mother of a product of conception, irrespective of the duration of the pregnancy, which, after such separation, breathes or shows any other evidence of life, such as beating of the heart, pulsation of the umbilical cord, or definite movement of voluntary muscles, whether or not the umbilical cord has been cut or the placenta is attached.

Marital status of mother. Refers to the legal conjugal status of the mother at the time of the delivery. Persons in common-law relationships are assigned to their legal marital status category. A single person is one who has never been married, or a person whose marriage has been annulled and who has not remarried. A separated person is legally married but is not living with his or her spouse because the couple no longer wants to live together. A divorced person is one who has obtained a legal divorce and has not remarried. A married person is one who is legally married and not separated. A person whose spouse has died and who has not remarried is widowed.

Mean age of mother. The mean (average) age of mother for Canada, a province or a territory is calculated by summing the mothers' ages at their last birthday preceding delivery, and then dividing the sum by the total number of live births in that jurisdiction. To estimate mid-year mean age, a statistic often used in analyses, add 0.5 to mean age.

Mean birth weight. The mean (average) birth weight for Canada, a province or a territory is calculated by summing the first weight of each live newborn (obtained immediately after birth), and then dividing the sum by the total number of live births in that jurisdiction.

Median birth weight. The median is the middle value in a set of ordered numbers (for example, newborns' birth weight ranked from lightest to heaviest). In the case of an even number of observations, the median is the average of the two middle values.

Multiple birth. A delivery that results in more than one birth, whether live born or stillborn. This includes the delivery of twins, triplets, quadruplets, quintuplets and more.

Parity of mother. The number of live births a woman has had to date (excludes fetal deaths or stillbirths). A woman with zero parity has had no live births; a woman of parity 1 has had one live birth, of parity 2, two live births, and so on. In the case of a first delivery resulting in live twins, the woman has a parity of 1 after the first twin is born and a parity of 2 after the second twin is born.

Population. Persons whose usual place of residence is somewhere in Canada, including Canadian government employees stationed abroad and their families, members of the Canadian Armed Forces stationed abroad and their families, crews of Canadian merchant vessels, and non-permanent residents of Canada.

Mid-year (July 1) population estimates are used to calculate the rates in vital statistics tables.

Provinces and territories. Unless otherwise stated, the geographic distribution of births and fetal deaths (stillbirths) in the tables of this publication is based on the mother's usual place of residence.

Nunavut came into being officially as a Territory of Canada on April 1, 1999. The name Northwest Territories applies to a Territory with different geographic boundaries before and after April 1, 1999.

Stillbirth (fetal death). Death prior to the complete expulsion or extraction from its mother of a product of conception, irrespective of the duration of pregnancy; the death is indicated by the fact that after such separation the fetus does not breathe or show any other evidence of life, such as beating of the heart, pulsation of the umbilical cord, or definite movement of voluntary muscles. Only fetal deaths where the product of conception has a birth weight of 500 grams or more or the duration of pregnancy is 20 weeks or longer are registered in Canada.

In Quebec (as well as in Saskatchewan prior to 2001 and in New Brunswick prior to November 1996), only fetal deaths (stillbirths) weighing 500 or more grams must be reported, regardless of the gestation period.

Because of these differences in reporting requirements, fetal death (stillbirth) data are presented for two gestation periods: 20 or more weeks of gestation (including fetal deaths or stillbirths with unknown weeks of gestation), and 28 or more weeks of gestation (excluding unknown weeks of gestation).

Stillbirth (fetal death) rate. The number of fetal deaths (stillbirths) per 1,000 live births plus fetal deaths (stillbirths).
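
As a worked illustration of this rate, with a hypothetical function name and figures, note that the denominator is total births (live births plus stillbirths), not live births alone.

```python
def stillbirth_rate(stillbirths, live_births):
    """Stillbirths per 1,000 total births (live births plus stillbirths)."""
    return 1000.0 * stillbirths / (live_births + stillbirths)


# Hypothetical counts: 5 stillbirths and 995 live births.
rate = stillbirth_rate(5, 995)
```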

Type of birth. Type of birth refers to the plurality of a delivery, that is, whether the delivery results in the birth of one or more live born or stillborn infants.

Weeks of gestation. The interval, in completed weeks, between the first day of the mother's last menstrual period and the day of delivery (that is, the duration of pregnancy). It can also be any estimate of that interval, based on ultrasound, a physical examination or other method. Canadian birth registration documents do not specify how the gestational age was calculated. Pre-term refers to a period of gestation less than 37 completed weeks; term, 37 through 41 completed weeks; and post-term, 42 or more completed weeks.
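
The three gestation categories translate directly into a small helper, shown here as a sketch with a hypothetical function name. The input is the number of completed weeks.

```python
def gestation_category(completed_weeks):
    """Classify a gestation period given in completed weeks."""
    if completed_weeks < 37:
        return "pre-term"
    if completed_weeks <= 41:
        return "term"
    return "post-term"


labels = [gestation_category(w) for w in (36, 37, 41, 42)]
```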

Notes

1. World Health Organization (WHO). International Statistical Classification of Diseases and Related Health Problems, Tenth Revision, Volumes 1 and 2 (ICD–10). Geneva, 1992.

2. United Nations. Principles and Recommendations for a Vital Statistics System. Statistical Papers, Series M, No. 19, Rev. 1. New York, 1974.

Monthly Wholesale Trade Survey

1. Objective, Uses and Users

1.1. Objectives

The Monthly Wholesale Trade Survey (MWTS) provides information on the performance of the wholesale trade sector and is an important indicator of the health of the Canadian economy. In addition, the business community uses the data to analyse market performance.

1.2. Use

The estimates provide a measure of the health and performance of the wholesale trade sector. Information collected is used to estimate level and monthly trend for wholesale sales and inventories. At the end of each year, the estimates provide a preliminary look at annual wholesale sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Wholesalers can use the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Wholesale associations are able to monitor industry performance and promote their wholesale industries. Investors can monitor industry growth, which can result in better access to investment capital by wholesalers. Governments are able to understand the role of wholesalers in the economy, which aids in the development of policies and tax incentives. Because wholesale trade is an important industry in the Canadian economy (5 to 6% of the Gross Domestic Product, depending on the year), governments are also better able to determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, Variables and Classifications

2.1. Concepts

Wholesale trade is generally the intermediate step in the distribution of merchandise. The sector comprises establishments primarily engaged in the buying and selling of merchandise and providing logistics, marketing and support services.

Wholesalers are organized to sell merchandise in large quantities to retailers, business and institutional clients. However, some wholesalers, in particular those that supply non-consumer capital goods, sell merchandise in single units to final users.  The sector recognizes two main types of wholesalers: wholesale merchants and wholesale agents and brokers.

Wholesale merchants buy and sell merchandise on their own account, that is, they take title to the goods they sell. They generally operate from warehouse or office locations and they may ship from their own inventory or arrange for the shipment of goods directly from the supplier to the client. In addition to the sales of goods, they may provide, or arrange for the provision of, logistics, marketing and support services, such as packaging and labelling, inventory management, shipping, handling of warranty claims, in-store or co-op promotions, and product training. Dealers of machinery and equipment, such as dealers of farm machinery and heavy-duty trucks, also fall within this category. They are known by a variety of trade designations depending on their relationship with suppliers or customers, or the distribution method they employ.

Examples include wholesale merchants, wholesale distributors, drop shippers, rack-jobbers, import-export merchants, buying groups, dealer-owned cooperatives and banner wholesalers. For purposes of industrial classification, wholesale merchants are classified by industry according to the principal lines of commodities sold. A description of each industrial group included in the accompanying statistical data is shown in Appendix IV. As most businesses sell several kinds of commodities, the classification assigned to a business generally reflects either the individual commodity or the commodity group which is the primary source of the establishment’s receipts, or some mixture of commodities which characterizes the establishment’s business.

Wholesale agents and brokers buy and sell merchandise owned by others on a fee or commission basis. They do not take title to the goods they buy or sell, and they generally operate at or from an office location. Wholesale agents and brokers are known by a variety of trade designations including import-export agents, wholesale commission agents, wholesale brokers, and manufacturers’ representatives or agents.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes parts used in generating repair and maintenance revenue, labour revenue from repair and maintenance, sales of goods manufactured as a secondary activity by the wholesaler, and revenue from rental and leasing of office space, other real estate, and goods and equipment. Commission revenue and fees earned by wholesale merchants from buying and selling merchandise on account of others are also included. Other operating revenue, such as operating subsidies and grants and revenue from shipping, handling and storing goods for others, is excluded.

Inventories are defined as the book value, i.e., the value maintained in the accounting records, of all stock owned at month end and intended for resale. This includes stock in selling outlets, in warehouses, in transit, or on consignment to others. It also includes stock owned within and outside Canada. Inventories held on consignment from others (not owned), and store and office supplies and any other supplies not to be sold are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For wholesalers, this would normally be a distribution centre.

Sales in volume: The value of wholesale trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as wholesale trade in current dollars and the latter as wholesale trade in volume. The current dollar estimate is calculated by aggregating the weighted value of sales for all wholesale outlets. The volume estimate is calculated by first adjusting the sales values to a base year, using the price indexes, and then summing the resulting values.

2.3. Classifications

The Monthly Wholesale Trade Survey is based on the definition of wholesale trade under the NAICS (North American Industry Classification System). NAICS is the agreed upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 24 industries based on the 2012 North American Industry Classification System (NAICS) industries. The 24 industries are further aggregated to 7 sub-sectors which correspond exactly to the 3-digit NAICS codes for wholesale trade industries, with the exception of the following: wholesale agents and brokers; and petroleum and oilseed and grain wholesaler-distributors.

Geographically, sales estimates are produced for Canada and each province and territory. Inventory estimates are produced only for Canada as a whole.

3. Coverage and Frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Wholesale Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employer businesses and businesses with no employees with annualized sales that have a Goods and Services Tax (GST) account or annual revenue coming from individual income tax.

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MWTS consists of all statistical establishments on the BR that are classified to the wholesale sector using the North American Industry Classification System (NAICS) (approximately 90,000 establishments), excluding unincorporated businesses with no employees and with annual sales less than $30,000. The NAICS code range for the wholesale sector is 410000 to 419999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial/territorial boundaries; and provides data on the value of output together with the cost of principal intermediate inputs used along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, which serve as cost centres or discretionary expense centres for which data on all costs, including labour and depreciation, can be reported by the business), future establishments, establishments for which economic signals indicate a null or missing revenue, and establishments in the following non-covered NAICS:

  • 41112 (oilseed and grain)
  • 412 (petroleum products)
  • 419 (agents and brokers)
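
The code-based part of these coverage rules can be sketched as a simple filter. This is an illustration under stated assumptions: the `Establishment` record and its field names are hypothetical and are not the Business Register's actual layout, and the exclusions that depend on other information (ancillary or future establishments, null or missing revenue signals) are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Establishment:
    naics: str            # six-digit NAICS code as a string
    incorporated: bool
    has_employees: bool
    annual_sales: float   # in dollars


# Non-covered NAICS: oilseed and grain, petroleum products, agents and brokers.
EXCLUDED_PREFIXES = ("41112", "412", "419")


def in_target_population(e):
    """Apply the NAICS range, non-covered codes and small-business exclusion."""
    if not e.naics.startswith("41"):  # wholesale sector: 410000 to 419999
        return False
    if e.naics.startswith(EXCLUDED_PREFIXES):
        return False
    if not e.incorporated and not e.has_employees and e.annual_sales < 30_000:
        return False
    return True


# Hypothetical establishments.
examples = [
    Establishment("413110", True, True, 5_000_000.0),   # covered wholesaler
    Establishment("412110", True, True, 9_000_000.0),   # petroleum products
    Establishment("414510", False, False, 20_000.0),    # small unincorporated
    Establishment("441110", True, True, 2_000_000.0),   # not wholesale
]
flags = [in_target_population(e) for e in examples]
```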

4. Sampling

The MWTS sample consists of 7,500 groups of establishments (clusters) classified to the Wholesale Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MWTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industrial groups (mainly, but not only four digit level NAICS), and the geographical regions consisting of the provinces and territories. We further stratify the population by size. The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2).

The size strata consist of one take-all (census), at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
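
A stratified selection with take-all, take-some and take-none size strata can be sketched as below. The record layout, the single sampling fraction and the seed are hypothetical simplifications. The actual design allocates sample sizes to meet target coefficients of variation, which is not attempted here.

```python
import random
from collections import defaultdict


def select_sample(units, sampling_fraction, seed=12345):
    """Return {unit id: design weight} for a stratified simple random sample.

    Each unit is a dict with keys "id", "industry", "region" and "size",
    where "size" is "take_all", "take_some" or "take_none".
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[(u["industry"], u["region"], u["size"])].append(u["id"])

    sample = {}
    for (_industry, _region, size), ids in strata.items():
        if size == "take_none":
            continue  # not surveyed; estimated from administrative data
        if size == "take_all":
            chosen = ids  # census stratum
        else:
            n = max(1, round(sampling_fraction * len(ids)))
            chosen = rng.sample(ids, n)  # simple random sample without replacement
        weight = len(ids) / len(chosen)
        for unit_id in chosen:
            sample[unit_id] = weight
    return sample


# Hypothetical population for one industrial group and province.
population = (
    [{"id": f"A{i}", "industry": "4131", "region": "ON", "size": "take_all"} for i in range(3)]
    + [{"id": f"S{i}", "industry": "4131", "region": "ON", "size": "take_some"} for i in range(10)]
    + [{"id": f"N{i}", "industry": "4131", "region": "ON", "size": "take_none"} for i in range(5)]
)
sample = select_sample(population, sampling_fraction=0.5)
```

The design weights of the selected units sum to the number of units in the surveyed strata (13 in this example), while the take-none units receive no weight because they are covered separately.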

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MWTS is a repeated survey with maximization of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample. MWTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths also occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in wholesale trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MWTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire Design

The questionnaire collects monthly data on wholesale sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of wholesalers. For the 2004 redesign, most questionnaires were subject to cosmetic changes only, with the exception of the inclusion of Nunavut. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and Non-response

6.1. Response and Non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MWTS, some non-response will occur.

For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level below which the response would be rejected and considered a unit non-response. In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when non-respondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of non-response bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation.

In order to measure the amount of non-response that occurs each month various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MWTS, two types of rates are calculated (unweighted and weighted). In order to assess the efficiency of the collection process, unweighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two unweighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or for which we tried to receive extracted data. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

- Survey Response rate (estimation) = Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

- Admin Response rate (estimation) = Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

- Total Response rate (estimation) = Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Unweighted rates:

- Survey Response rate (collection) = Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

- Admin Response rate (extraction) = Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.
(% of questionnaires collected over all in-scope questionnaires)

- Collection Results Rate = Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv except for excluded units that were contacted because their response is unavailable for a particular month since they are non-monthly reporters.

- Extraction Results Rate = Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above
(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and unweighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
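
The weighted and unweighted rates listed above share a common shape, which the following Python sketch illustrates. The status labels, record layout and figures are hypothetical stand-ins for the response statuses i to ix, which the survey system defines in more detail.

```python
def weighted_rate(units, numerator_statuses):
    """Share of weighted sales accounted for by units in the given statuses."""
    numerator = sum(
        u["weight"] * u["sales"] for u in units if u["status"] in numerator_statuses
    )
    denominator = sum(u["weight"] * u["sales"] for u in units)
    return numerator / denominator


def unweighted_rate(units, numerator_statuses, denominator_statuses):
    """Ratio of unit counts between two sets of response statuses."""
    numerator = sum(1 for u in units if u["status"] in numerator_statuses)
    denominator = sum(1 for u in units if u["status"] in denominator_statuses)
    return numerator / denominator


# Hypothetical surveyed units.
surveyed = [
    {"weight": 1.0, "sales": 500.0, "status": "used"},
    {"weight": 2.0, "sales": 100.0, "status": "used"},
    {"weight": 2.0, "sales": 150.0, "status": "refusal"},
    {"weight": 4.0, "sales": 50.0, "status": "no_contact"},
]

RESPONDED = {"used", "unresolved", "converted_refusal"}
ALL_IN_SCOPE = RESPONDED | {"refusal", "no_contact"}

survey_response_rate_estimation = weighted_rate(surveyed, RESPONDED)
survey_response_rate_collection = unweighted_rate(surveyed, RESPONDED, ALL_IN_SCOPE)
```

In this example the weighted rate (700 / 1,200) is higher than the unweighted rate (2 / 4) because the responding units account for a larger share of weighted sales than of the unit count.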

Use of Administrative Data:

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MWTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

Inventories for establishments where sales are GST-based are derived using the MWTS imputation system. The imputation system uses the previous month’s values together with the month-to-month and year-to-year changes observed in surveyed establishments of similar size.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Wholesale Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available. To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, cases are generally assigned to the same interviewer each month. This establishes a personal relationship between interviewer and respondent and builds respondent trust.

7. Data Collection and Capture Operations

Collection of the data is performed by Statistics Canada’s Regional Offices. Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. There is also follow-up of non-response. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MWTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are also used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MWTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method whereby a ratio of the respondent’s current month data over historical (i.e. last month, or same month last year) or administrative data is analyzed. When the respondent’s ratio differs significantly from ratios of respondents who are similar in terms of industrial group and/or geography group, the response is deemed an outlier.

The second set of edits consists of an edit known as the share of market edit. With this method, all respondents can be edited, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent is flagged as an outlier if its weighted contribution to the group’s total is too large.
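A share of market check of this kind might look like the sketch below. The threshold max_share is a hypothetical value; the cut-off actually used by the survey is not stated in this document.

```python
def share_of_market_flags(values, weights, max_share=0.5):
    """Flag units whose weighted value exceeds max_share of the group total.

    values: current-month values for the units of one group.
    weights: the corresponding sampling weights.
    max_share: hypothetical threshold (an assumption for illustration).
    """
    weighted = [v * w for v, w in zip(values, weights)]
    total = sum(weighted)
    if total == 0:
        return [False] * len(weighted)
    return [wv / total > max_share for wv in weighted]

# Hypothetical group of four units; the last one dominates the weighted total.
market_flags = share_of_market_flags([100, 120, 90, 2000], [5, 5, 5, 1])
```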

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as an outlier will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MWTS is the process used to assign replacement values for missing data. This is done by assigning values when they are missing on the record being edited to ensure that estimates are of high quality and that a plausible, internal consistency is created. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent micro data file, imputation is used to handle the remaining missing cases.

In the MWTS, imputation for missing values can be based on either historical or administrative data. The appropriate method is selected according to a strategy that is based on whether historical data is available, administrative data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent.
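As an illustration of the first type, a general trend imputation can be sketched as below. The exact formula used by the MWTS is not given here, so the ratio-of-sums trend and the function name are assumptions.

```python
def trend_impute(historical_value, donors_current, donors_historical):
    """Impute a missing current value from one historical value of the unit.

    The unit's historical value is scaled by the trend observed among similar
    responding units (donors), measured as a ratio of sums. Donors flagged as
    outliers by the statistical edits would be left out of both lists.
    """
    trend = sum(donors_current) / sum(donors_historical)
    return historical_value * trend

# Hypothetical example: donors grew by 10%, so 200 is carried forward to about 220.
imputed_value = trend_impute(200.0, [110.0, 220.0], [100.0, 200.0])
```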

Depending upon the particular reference month, there is an order of preference that exists so that a top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that is used for units with a simple structure (e.g. enterprises with only one establishment), and the second, called median-average, is used for units with a more complex structure.

Finally, it should be noted that inventories in the MWTS where sales are derived from monthly GST data are also imputed by the MWTS imputation systems. The imputed values are calculated using the same imputation methods that are in place for missing data from non-respondents.

10. Estimation

Estimation is a process that approximates unknown population parameters using only the part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For wholesale sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industrial group or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
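The weighting step for the survey portion can be sketched as follows. The tuple layout and the domain labels are hypothetical; the sketch only shows how responses are weighted by the inverse of their inclusion probabilities and summed by domain.

```python
from collections import defaultdict

def horvitz_thompson_totals(units):
    """Estimate domain totals from sampled units.

    units: iterable of (domain, sales, inclusion_probability) tuples.
    Take-all units have an inclusion probability of 1 and represent only
    themselves; take-some units are weighted up by 1 / probability.
    """
    totals = defaultdict(float)
    for domain, sales, prob in units:
        totals[domain] += sales / prob  # sampling weight = 1 / probability
    return dict(totals)

# Hypothetical sample: two units in domain "A", one in domain "B".
ht_totals = horvitz_thompson_totals([
    ("A", 100.0, 1.0),   # take-all unit, weight 1
    ("A", 50.0, 0.25),   # take-some unit, weight 4
    ("B", 80.0, 0.5),    # take-some unit, weight 2
])
```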

For wholesale inventories, the sample selected for estimating sales is used to derive an estimate through the use of a Horvitz-Thompson estimator for the survey portion. A sample-based ratio is then used to produce the estimate for the non-survey portion, and the estimate of the total is derived as the sum of the survey and non-survey portion estimates.

For more information on the methodology for modeling sales from administrative data sources (i.e. GST data) which also contributes to the estimates of the survey portion, refer to ‘Monthly Wholesale Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MWTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.
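For reference, the textbook variance of an estimated total under stratified simple random sampling without replacement can be computed as in the sketch below. It is an assumption that the survey applies this standard formula without further adjustment, and the data layout is hypothetical.

```python
from statistics import variance

def stratified_total_variance(strata):
    """Variance of an estimated total under stratified SRS without replacement.

    strata: list of (population_size, sample_values) pairs, one per stratum.
    Each stratum contributes N^2 * (1 - n/N) * s^2 / n, where s^2 is the
    sample variance. Take-all strata (n == N) contribute nothing.
    """
    total = 0.0
    for pop_size, sample in strata:
        n = len(sample)
        if n == pop_size:
            continue  # completely enumerated stratum: no sampling variance
        total += pop_size ** 2 * (1 - n / pop_size) * variance(sample) / n
    return total

# Hypothetical design: one take-some stratum (4 of 10 units) and one take-all stratum.
design_variance = stratified_total_variance([
    (10, [1.0, 2.0, 3.0, 4.0]),
    (3, [5.0, 6.0, 7.0]),
])
```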

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time." (See note 1.) Economic time series such as the Monthly Wholesale Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Wholesale Trade Survey data are seasonally adjusted using the X-12-ARIMA software (see note 2). The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series, pre-adjusted and extrapolated if applicable, is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages (see note 3). The moving averages used to estimate the main components (the trend and seasonality) are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series. Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data Quality Evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error.

Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as non-sampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV(X) = (S(X) / X) x 100%

where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error on either side of the estimate, i.e. between $11,760,000 and $12,240,000. Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors on either side of the estimate, i.e. between $11,520,000 and $12,480,000.
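The arithmetic of this example can be reproduced with the short sketch below. The function names are illustrative, and the multipliers of one and two standard errors are the usual normal approximations for roughly 68% and 95% confidence.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation expressed as a percentage of the estimate."""
    return standard_error / estimate * 100

def confidence_interval(estimate, cv_pct, n_se=2):
    """Interval of n_se standard errors on either side of the estimate."""
    standard_error = estimate * cv_pct / 100
    return estimate - n_se * standard_error, estimate + n_se * standard_error

# The example from the text: an estimate of $12,000,000 with a CV of 2%.
interval_68 = confidence_interval(12_000_000, 2, n_se=1)
interval_95 = confidence_interval(12_000_000, 2, n_se=2)
```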

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure Control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible “direct disclosure”, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Notes

  1. «A Note on the Seasonal Adjustment of Economic Time Series», Canadian Statistical Review, August 1974.

  2. For more information, see X-12-ARIMA Reference Manual Version 0.3 (2007), U.S. Census Bureau.

  3. Ladiray, D. and Quenneville, B. (2001). Seasonal Adjustment with the X-11 Method. New York: Springer-Verlag, Lecture Notes in Statistics no. 158.

Sales in volume for Wholesale Trade

Introduction

With the September 2012 release of the Monthly Wholesale Trade Survey (MWTS) results (reference month July 2012), a new deflation methodology for wholesale sales has been implemented.

This new methodology improves on the previous one, and consequently its results are not strictly comparable to those already published, although the overall trends are similar. The CANSIM table 081-0013 containing the previous estimates has been terminated and the improved results can be found in CANSIM table 081-0015.

The purpose of this document is to present the improved methodology for producing the volume measures of sales from the MWTS, as well as highlight the differences from the previous approach.

Purpose of Deflation

Changes in the value of sales collected at current prices (i.e. at the time the sales took place) by the MWTS may be attributable to changes in prices or to changes in quantities sold, or both. To study the activity of the wholesale sector, it is often desirable to remove the variations due to price changes from the values at current prices in order to obtain an indicator of the changes in the quantities sold, i.e. an indicator of the volume of sales. This process is known as deflation.

Derivation of Wholesale Sales Price Indexes

To deflate wholesale sales, suitable price indexes must be used. In the new deflation methodology for wholesale sales, the main price indexes used are the selling price indexes obtained from the Wholesale Services Price Index (WSPI) program. This program produces monthly data that are released on a quarterly basis with about a four month lag. Hence, they are not available in time to deflate the most recent observations of wholesale sales.

It was thus necessary to construct price indexes to extend the WSPI-based ones for the most current months. The growth rates of these derived price indexes are used until they are replaced by the WSPI-based ones once they become available.
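The extension step can be sketched as follows, assuming both indexes are held as monthly lists that start in the same month. The function name and the sample values are hypothetical.

```python
def extend_index(wspi_index, derived_index):
    """Extend a WSPI-based index with the growth rates of a derived index.

    wspi_index: index values for the months where WSPI data are available.
    derived_index: index values for the same months plus the most recent ones.
    The WSPI-based levels are kept as published; each later month applies the
    month-to-month growth rate of the derived index to the last level.
    """
    extended = list(wspi_index)
    for t in range(len(wspi_index), len(derived_index)):
        growth = derived_index[t] / derived_index[t - 1]
        extended.append(extended[-1] * growth)
    return extended

# Hypothetical example: two months of WSPI data, four months of derived index.
extended_index = extend_index([100.0, 102.0], [100.0, 101.0, 103.02, 104.0502])
```

When new WSPI data are released, rerunning the same step with a longer wspi_index replaces the provisional months with observed values.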

In what follows, we describe how price indexes, with base year 2007, are computed for the deflation of wholesale sales. We first describe how the WSPI data are used and then how the derived price indexes are constructed.

Price indexes based on the WSPI

From the WSPI program, monthly selling price indexes are available at the 5-digit North American Industry Classification System (NAICS) industry level. These selling price indexes are weighted together to obtain a sales price index for each of the wholesale trade industries covered by the MWTS. Those industries are called trade groups.

The weights used to combine the selling price indexes into a trade group price index are the proportions of the sales of the 5-digit NAICS industries within each trade group. These weights are obtained from the Annual Wholesale Trade Survey (AWTS). They vary from year to year; i.e. the 2007 proportions of sales are used in 2007, those of 2008 in 2008, and so on. For the two most recent years, the last available annual data from the AWTS are used.
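A minimal sketch of this weighting is given below. The text says only that the selling price indexes are "weighted together", so the weighted arithmetic mean is an assumption, and the industry labels are placeholders rather than actual NAICS codes.

```python
def trade_group_index(selling_price_indexes, annual_sales):
    """Combine industry selling price indexes into a trade group index.

    selling_price_indexes: {industry: index value for the month}.
    annual_sales: {industry: annual sales from the AWTS for the weight year}.
    Each industry's weight is its share of the trade group's sales.
    """
    total_sales = sum(annual_sales.values())
    return sum(
        selling_price_indexes[industry] * sales / total_sales
        for industry, sales in annual_sales.items()
    )

# Hypothetical trade group made up of two industries.
group_index = trade_group_index(
    {"industry_a": 110.0, "industry_b": 100.0},
    {"industry_a": 300.0, "industry_b": 100.0},
)
```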

Derived price indexes

To extend the WSPI-based price indexes, derived price indexes for each trade group had to be constructed based on assumptions that capture the main elements thought to affect wholesalers’ selling prices. These derived price indexes are based on the prices of the commodities traded and on the proportion of the fluctuations in the exchange rate of the dollar that is immediately passed on to the trade group’s customers.

a) Main assumptions

Wholesalers trade a portion of the total supply in Canada of a commodity. The total supply is the sum of domestic production and imports. A wholesale price index for each commodity traded is obtained by combining a domestic production price index with an import price index.

Wholesalers sell domestically and on export markets with perhaps differentiated prices. It is assumed however that they set their prices according to the changes in the prices of the commodities that they trade, whether the commodities are exported or not.

It is also assumed that the variations in the price of a commodity are the same across wholesale trade groups. This means that a commodity sold by various trade groups has a unique price index, but the weight of that commodity varies across trade groups.

b) Wholesale commodity prices

A wholesale price index with base year 2007 for each commodity would be obtained by combining a domestic production price index with an import price index using a 2007 import weight. But since there was no wholesale commodity survey in 2007, the commodity imports’ shares were obtained instead from the 2008 Wholesale Origin and Destination of Goods (WODG) data collected on the AWTS.

Most of the domestic production prices are taken from the Industrial Product Price Index program. For some farm products, data from the Farm Product Price Index program are used. The Commercial Software Price Index as well as the Consumer Price Index for Digital Computing Equipment and Devices, adjusted for major sales tax changes, are also used.

For the import components, the fixed weighted (Laspeyres) import price indexes on a customs basis from the International Trade Price Indexes program are used.

c) Trade group prices

The commodities sold by each trade group, as well as their proportions in the group’s total sales, are known from the 2008 WODG results. These proportions are used to combine the wholesale commodity prices into a price index for the trade group’s sales. The trade group price indexes are weighted harmonic means of the commodity price indexes.
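The two combination steps described in b) and c) can be sketched as below. The weighted harmonic mean follows the text; the arithmetic combination of the domestic and import price indexes is an assumption, since the document does not state the formula used at that step.

```python
def commodity_index(domestic_index, import_index, import_share):
    """Wholesale price index for one commodity (arithmetic mix: an assumption)."""
    return (1 - import_share) * domestic_index + import_share * import_index

def weighted_harmonic_mean(indexes, shares):
    """Trade group price index as a weighted harmonic mean of commodity indexes.

    indexes: commodity price indexes.
    shares: each commodity's proportion of the trade group's sales.
    """
    return sum(shares) / sum(w / p for w, p in zip(shares, indexes))

# Hypothetical trade group selling two commodities in equal proportions.
mixed_commodity = commodity_index(100.0, 120.0, import_share=0.25)
harmonic_group_index = weighted_harmonic_mean([100.0, 125.0], [0.5, 0.5])
```

A harmonic mean weighted by sales shares is consistent with deflation: dividing the group's sales by this index gives the same volume as deflating each commodity's sales by its own index and summing.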

For a few trade groups selling a wide variety of commodities, we included only those commodities accounting for at least 95% of the group’s sales, as the exclusion of the other ones with little weight has essentially no effect on the trade group’s price index.

d) Adjustment for the exchange rate of the dollar

Many of the import prices used in the derivation of the wholesale commodity price indexes fully and immediately reflect the exchange rate fluctuations of the dollar. However, wholesalers do not necessarily adjust their prices immediately to compensate for those fluctuations; generally, they will change their prices to reflect only a proportion of them and maybe with a lag.

A comparison of the trade group price indexes with the selling price indexes from the WSPI program showed that the price indexes for many trade groups required an adjustment to remove a bias caused by the incomplete pass-through of the fluctuations in the exchange rate of the dollar.

These pass-through adjustments were evaluated by a linear regression of the ratio of the trade group price index to the WSPI-based price index on the exchange rate of the dollar vis-à-vis the U.S. currency. The adjusted trade group price indexes are the derived price indexes.
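One way to picture this adjustment is the sketch below. It fits the regression described above by ordinary least squares and then divides the trade group index by the fitted ratio. That last step is an assumption, as the document does not spell out how the regression results are applied, and all names and values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def adjust_for_pass_through(group_index, wspi_index, exchange_rate):
    """Remove the part of the index explained by exchange rate movements.

    The ratio group_index / wspi_index is regressed on the exchange rate,
    and the group index is divided by the fitted ratio (assumed procedure).
    """
    ratio = [g / w for g, w in zip(group_index, wspi_index)]
    intercept, slope = fit_line(exchange_rate, ratio)
    return [g / (intercept + slope * fx)
            for g, fx in zip(group_index, exchange_rate)]

# Hypothetical data in which the bias is exactly linear in the exchange rate.
fx_rate = [1.0, 1.1, 1.2, 0.9]
wspi_based = [100.0, 101.0, 102.0, 99.0]
unadjusted = [w * (0.5 + 0.5 * fx) for w, fx in zip(wspi_based, fx_rate)]
adjusted = adjust_for_pass_through(unadjusted, wspi_based, fx_rate)
```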

e) An exception

For one trade group, NAICS 4142 - Home Entertainment Equipment and Household Appliance Wholesaler-Distributors, it was found that even the adjusted price index was not appropriately tracking the selling price index from the WSPI program.

Hence, for this particular trade group the derived price index is formed instead from a combination of two CPI components, adjusted for major sales tax changes. The two CPI components are those for Audio Equipment and Video Equipment, which are combined using their weights in the CPI.

Derivation of the Volume of Wholesale Sales

As indicated previously, changes in the value of wholesale sales may be attributable to changes in the prices of the commodities sold, or to the quantities sold, or to both. With deflation, a measure of the volume of sales can be obtained for the analysis of the changes in the quantities sold, removing the effect of price changes.

Two measures of the total volume of wholesale sales are computed. One is the volume of sales at constant prices (with and without seasonal adjustment); the other is the volume of sales in chained dollars (only available seasonally adjusted).

Volume at constant prices (Laspeyres formula)

The volume of sales at constant prices uses the relative importance of the products’ prices in a previous period, currently the year 2007, to evaluate the change in the quantities sold. This year is called the base year. The resulting deflated values are said to be “at 2007 prices.” Using the prices of a previous period to measure current activity provides a representative measurement of the current volume of activity with respect to that period.

The price indexes used to obtain the volume of sales at constant prices are the extended price indexes, i.e. the WSPI-based price indexes extended with the derived price indexes described earlier.

The nominal (current dollars) sales of each trade group are divided by their respective extended price index, and then the total volume of sales at constant prices is obtained by adding the volume of sales across the 25 trade groups covered by the MWTS.
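For a single month, this calculation amounts to the sketch below. The trade group labels and values are hypothetical, and the price indexes are assumed to be expressed with the base year equal to 100.

```python
def volume_at_constant_prices(sales, price_indexes):
    """Total volume of sales at base-year prices for one month.

    sales: {trade_group: current-dollar sales}.
    price_indexes: {trade_group: extended price index, base year = 100}.
    Each group's sales are deflated by its own index, then summed.
    """
    return sum(
        value / (price_indexes[group] / 100)
        for group, value in sales.items()
    )

# Hypothetical month with two trade groups.
total_volume = volume_at_constant_prices(
    {"group_1": 220.0, "group_2": 100.0},
    {"group_1": 110.0, "group_2": 80.0},
)
```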

The unadjusted and seasonally adjusted volumes at constant prices are computed similarly. In the computation of the seasonally adjusted volume of sales, however, the price indexes are seasonally adjusted directly using the X-12-ARIMA program if appropriate.

Volume in chained dollars (Fisher formula)

The total volume of sales in chained dollars is the geometric mean of two evaluations of the change in the quantities sold between two consecutive months. One evaluation uses the prices of the previous month to evaluate the change; the other uses the prices of the current month.

Since the general tendency for commodity prices is to increase, the evaluation based on the prices of the previous month tends to overstate the change in quantities; that is, as prices increase, buyers tend to buy more of the cheaper commodities. Therefore, using the prices of a previous period to value the quantities bought currently may lead to an overstatement of the change in quantities.

Similarly, the evaluation of the change in the quantities sold using the prices of the current month will tend to understate the change in quantities as this approach gives more weight to the lower priced commodities than to the higher priced ones.

Hence, the geometric average of the two evaluations of the monthly change in quantities (with the previous and current monthly prices) mitigates these under- and over-statements. The volume of sales in chained dollars thus captures the effect of the most recent price changes in the change in volume, as it combines the changes in volume measured with respect to both the current and previous month’s prices.

The total volume of sales in chained dollars is computed monthly, and then the monthly variations are chained (compounded) to provide a time series of the changes in volumes. The time series is then scaled to be equal to the total value of wholesale sales in current dollars for the year 2007.
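The chaining procedure can be sketched as follows, using the trade group price indexes and volumes at constant prices as the price and quantity data. The function names and data layout are assumptions made for the illustration.

```python
from math import sqrt

def fisher_link(p_prev, q_prev, p_cur, q_cur):
    """Fisher volume change between two consecutive months.

    Geometric mean of the change valued at previous-month prices
    (Laspeyres) and the change valued at current-month prices (Paasche).
    """
    laspeyres = (sum(p * q for p, q in zip(p_prev, q_cur))
                 / sum(p * q for p, q in zip(p_prev, q_prev)))
    paasche = (sum(p * q for p, q in zip(p_cur, q_cur))
               / sum(p * q for p, q in zip(p_cur, q_prev)))
    return sqrt(laspeyres * paasche)

def chained_volume(prices, volumes, base_months, base_value):
    """Chain the monthly Fisher links and scale the result.

    prices, volumes: one list per month, one entry per trade group.
    base_months: positions of the reference-year months in the series.
    base_value: current-dollar sales total for the reference year.
    """
    chain = [1.0]
    for t in range(1, len(prices)):
        link = fisher_link(prices[t - 1], volumes[t - 1], prices[t], volumes[t])
        chain.append(chain[-1] * link)
    scale = base_value / sum(chain[t] for t in base_months)
    return [level * scale for level in chain]

# Hypothetical two-month, two-group example in which all volumes double.
chained_series = chained_volume(
    prices=[[1.0, 1.0], [1.0, 2.0]],
    volumes=[[10.0, 10.0], [20.0, 20.0]],
    base_months=[0],
    base_value=20.0,
)
```

By construction, the Fisher link lies between the Laspeyres and Paasche evaluations, which is how it mitigates the over- and under-statements discussed above.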

As the only monthly price and quantity information available are the price and volume data for the 25 trade groups covered by the MWTS, the volume of sales in chained dollars is only computed for the Wholesale Trade sector as a whole.

As well, it is only produced in seasonally adjusted form, since chaining monthly raw volume variations may result in hard-to-interpret monthly fluctuations.

Improvements over the Previous Methodology

The new methodology for the deflation of wholesale sales brings various improvements to the previous one. These improvements include:

  • The use of observed wholesale selling price indexes (when available) instead of derived trade group price indexes.
  • When the WSPI data are not available, a pass-through adjustment is applied if necessary to the derived trade group price indexes. There were no such adjustments previously.
  • An improved derived price index for the trade group NAICS 4142 - Home Entertainment Equipment and Household Appliance Wholesaler-Distributors.
  • Where appropriate, seasonal adjustment is performed directly on the trade group price index. Previously, it was the deflated sales of each trade group that were seasonally adjusted directly.
  • The base year/reference year has been updated from 2002 to 2007.

Volume of Wholesale Sales for 2004-2006

Above, we described how the volume of wholesale sales at 2007 prices was obtained for the period starting in January 2007. But the MWTS data based on NAICS begin in January 2004. In order to provide as long a time series of the volume of wholesale sales as possible, we also deflated the period 2004-2006.

For the year 2006, we used the selling price indexes from the WSPI program as described above. The WSPI-based price indexes were extended backward, for the period 2004-2005, using the derived trade group price indexes described earlier, with a base year of 2002. That is, the share of imports in each wholesale commodity price was assumed to be equal to the share of imports in the total Canadian supply of that commodity in 2002 according to the Input-Output Tables. As well, the proportions of each commodity in the sales of each trade group were obtained from the 2001 Wholesale Trade Commodity Survey by Origin and Destination, as there was no wholesale trade commodity survey in 2002.

The segment at 2002 prices for the years 2004-2006 was then linked to the level of the segment at 2007 prices, by preserving its monthly growth rates.
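
A minimal sketch of this kind of linking is given below. It assumes, for illustration only, that the two segments overlap in a single link month (the last month of the older segment and the first month of the newer one); the text does not state how the link period was chosen, and the figures are invented.

```python
def link_segments(old_segment, new_segment):
    """Rescale old_segment to the level of new_segment at the link month.

    Assumption for this sketch: old_segment[-1] and new_segment[0] refer to
    the same month. Multiplying the older segment by a single constant factor
    leaves its month-to-month growth rates unchanged.
    """
    factor = new_segment[0] / old_segment[-1]
    rescaled = [value * factor for value in old_segment]
    # Drop the duplicated link month from the older segment.
    return rescaled[:-1] + list(new_segment)


old_prices_segment = [80.0, 84.0, 88.2]   # invented volumes at the older base
new_prices_segment = [100.0, 102.0]       # invented volumes at the newer base
linked = link_segments(old_prices_segment, new_prices_segment)
```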

Downloading MARC Records

Download MARC (Machine-Readable Cataloging) records from the Statistics Canada Library.

The Z39.50 protocol enables you to search and retrieve MARC records from the Statistics Canada Library database using software connected to the Internet. This service does not require registration; simply configure your ILS Z39.50 server with the following parameters:

Database: Enterprise
Host Name: sttc.sirsidynix.net
Port number: 7619

Once you have configured your ILS Z39.50 server, consult your ILS documentation for how to proceed with downloading Library catalogue MARC records.

Monthly Retail Trade Survey (MRTS) Data Quality Statement

Objectives, uses and users
Concepts, variables and classifications
Coverage and frames
Sampling
Questionnaire design
Response and nonresponse
Data collection and capture operations
Editing
Imputation
Estimation
Revisions and seasonal adjustment
Data quality evaluation
Disclosure control

1. Objectives, uses and users

1.1. Objective

The Monthly Retail Trade Survey (MRTS) provides information on the performance of the retail trade sector on a monthly basis, and when combined with other statistics, represents an important indicator of the state of the Canadian economy.

1.2. Uses

The estimates provide a measure of the health and performance of the retail trade sector. Information collected is used to estimate level and monthly trend for retail sales. At the end of each year, the estimates provide a preliminary look at annual retail sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Retailers rely on the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Retail associations are able to monitor industry performance and promote their retail industries. Investors can monitor industry growth, which can result in better access to investment capital by retailers. Governments are able to understand the role of retailers in the economy, which aids in the development of policies and tax incentives. Because retail trade is an important industry in the Canadian economy, governments are also able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, variables and classifications

2.1. Concepts

The retail trade sector comprises establishments primarily engaged in retailing merchandise, generally without transformation, and rendering services incidental to the sale of merchandise.

The retailing process is the final step in the distribution of merchandise; retailers are therefore organized to sell merchandise in small quantities to the general public. This sector comprises two main types of retailers, that is, store and non-store retailers. The MRTS covers only store retailers. Their main characteristics are described below. Store retailers operate fixed point-of-sale locations, located and designed to attract a high volume of walk-in customers. In general, retail stores have extensive displays of merchandise and use mass-media advertising to attract customers. They typically sell merchandise to the general public for personal or household consumption, but some also serve business and institutional clients. These include establishments such as office supplies stores, computer and software stores, gasoline stations, building material dealers, plumbing supplies stores and electrical supplies stores.

In addition to selling merchandise, some types of store retailers are also engaged in the provision of after-sales services, such as repair and installation. For example, new automobile dealers, electronic and appliance stores and musical instrument and supplies stores often provide repair services, while floor covering stores and window treatment stores often provide installation services. As a general rule, establishments engaged in retailing merchandise and providing after sales services are classified in this sector. Catalogue sales showrooms, gasoline service stations, and mobile home dealers are treated as store retailers.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes commission revenue and fees earned from selling goods and services on account of others, such as selling lottery tickets, bus tickets, and phone cards. It also includes parts and labour revenue from repair and maintenance; revenue from rental and leasing of goods and equipment; revenues from services, including food services; sales of goods manufactured as a secondary activity; and the proprietor’s withdrawals, at retail, of goods for personal use. Other revenue from rental of real estate, placement fees, operating subsidies, grants, royalties and franchise fees are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For retailers, this would normally be a store.

Sales in volume: The value of retail trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the second as retail trade in constant dollars. The current dollar estimate is calculated by aggregating the weighted value of sales for all retail outlets. The constant dollar estimate is calculated by first adjusting the sales values to a base year, using the Consumer Price Index, and then summing the resulting values.

2.3. Classification

The Monthly Retail Trade Survey is based on the definition of retail trade under the NAICS (North American Industry Classification System). NAICS is the agreed-upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 21 industries based on special aggregations of the 2007 North American Industry Classification System (NAICS) industries. The 21 industries are further aggregated to 11 sub-sectors.

Geographically, sales estimates are produced for Canada and each province and territory.

3. Coverage and frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Retail Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employers, as well as businesses with no employees that have annual sales greater than $30,000 and a Goods and Services Tax (GST) account (the BR does not include unincorporated businesses with no employees and with annual sales less than $30,000).

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MRTS consists of all statistical establishments on the BR that are classified to the retail sector using the North American Industry Classification System (NAICS) (approximately 200,000 establishments). The NAICS code range for the retail sector is 441100 to 453999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial boundaries; and provides data on the value of output, together with the cost of principal intermediate inputs used, along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, and serves as a cost centre or a discretionary expense centre for which data on all its costs including labour and depreciation can be reported by the business), future establishments, establishments with a missing or a zero gross business income (GBI) value on the BR and establishments in the following non-covered NAICS:

  • 4541 (electronic shopping and mail-order houses)
  • 4542 (vending machine operators)
  • 45431 (fuel dealers)
  • 45439 (other direct selling establishments)

4. Sampling

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industry group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions. The population is further stratified by size.

The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2). The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
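
The idea of size stratification can be sketched as follows. This is a simplified illustration and not the MRTS procedure: the take-all threshold, the unit names and the size values are hypothetical, only one take-some stratum is shown, and in practice the boundaries are set within each industry by region cell.

```python
def assign_size_strata(sizes, take_all_threshold, take_none_share=0.10):
    """Assign units to take-all, take-some or take-none strata by size.

    sizes: dict mapping unit id to its size measure.
    take_all_threshold: hypothetical cut-off above which every unit is surveyed.
    take_none_share: the smallest units are left unsampled as long as together
        they account for at most this share of the total size measure.
    """
    total = sum(sizes.values())
    strata = {}
    cumulative = 0.0
    # Walk through the units from smallest to largest.
    for unit, size in sorted(sizes.items(), key=lambda item: item[1]):
        if size >= take_all_threshold:
            strata[unit] = "take-all"
        elif cumulative + size <= take_none_share * total:
            cumulative += size
            strata[unit] = "take-none"
        else:
            strata[unit] = "take-some"
    return strata


sizes = {"a": 20.0, "b": 30.0, "c": 150.0, "d": 300.0, "e": 500.0}
strata = assign_size_strata(sizes, take_all_threshold=400.0)
```

Here units a and b together hold 5% of the total and fall in the take-none stratum, while unit e exceeds the threshold and is always surveyed.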

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximisation of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample.  MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire design

The Monthly Retail Trade Survey incorporates the following sub-surveys:

Monthly Retail Trade Survey - R8

Monthly Retail Trade Survey (with inventories) – R8

Survey of Sales and Inventories of Alcoholic Beverages

The questionnaires collect monthly data on retail sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of retailers. The items on the questionnaires have remained unchanged for several years. For the 2004 redesign, the general questionnaires were subject to cosmetic changes only. The questionnaire for Sales and Inventories of Alcoholic Beverages underwent more extensive changes. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and nonresponse

6.1. Response and non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MRTS, some non-response will occur. For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level, below which the response is rejected and considered a unit nonresponse. In such an instance, the business is classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when nonrespondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of nonresponse bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two un-weighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units that we tried to contact or for which we tried to obtain extracted data. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

Survey Response rate (estimation) =
Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

Admin Response rate (estimation) =
Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

Total Response rate (estimation) =
Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Un-weighted rates:

Survey Response rate (collection) =
Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

Admin Response rate (extraction) =
Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.

(% of questionnaires collected over all in-scope questionnaires)

Collection Results Rate =
Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv, except that it excludes units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

Extraction Results Rate =
Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above

(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and un-weighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
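
The difference between the weighted and un-weighted rates can be illustrated with a short Python sketch. The record layout and the status labels ("reported", "refused") are hypothetical stand-ins for the response statuses defined above, and the figures are invented.

```python
def weighted_response_rate(units, responding_statuses):
    """Weighted sales of responding units over weighted sales of all units."""
    responded = sum(u["weight"] * u["sales"] for u in units
                    if u["status"] in responding_statuses)
    total = sum(u["weight"] * u["sales"] for u in units)
    return responded / total


def unweighted_response_rate(units, responding_statuses):
    """Count of responding units over count of all units."""
    responded = sum(1 for u in units if u["status"] in responding_statuses)
    return responded / len(units)


# Invented sample: one large take-all unit and two smaller sampled units.
units = [
    {"weight": 1.0, "sales": 500.0, "status": "reported"},
    {"weight": 4.0, "sales": 50.0, "status": "reported"},
    {"weight": 4.0, "sales": 75.0, "status": "refused"},
]
weighted_rate = weighted_response_rate(units, {"reported"})
unweighted_rate = unweighted_response_rate(units, {"reported"})
```

In this example two of three units responded (about 67% un-weighted), and they account for 70% of weighted sales.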

Use of Administrative Data

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Service Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Retail Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

Table 1 contains the weighted response rates for all industry groups as well as for total retail trade for each province and territory. For more detailed weighted response rates, please contact the Marketing and Dissemination Section at (613) 951-3549, toll free: 1-877-421-3067 or by e-mail at retailinfo@statcan.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, to build trust and rapport between the interviewers and respondents, cases are generally assigned to the same interviewer each month. This action establishes a personal relationship between interviewer and respondent, and builds respondent trust.

7. Data collection and capture operations

Collection of the data is performed by Statistics Canada’s Regional Offices.

Table 1
Weighted response rates by NAICS, for all provinces/territories: July 2012

                                                Weighted response rates (%)
                                                Total    Survey   Administrative
NAICS - Canada
Motor Vehicle and Parts Dealers                 90.8     91.5     66.1
Automobile Dealers                              92.2     92.6     56.6
New Car Dealers (Note 1)                        93.6     93.6
Used Car Dealers                                70.3     72.6     56.6
Other Motor Vehicle Dealers                     78.2     79.1     72.9
Automotive Parts, Accessories and Tire Stores   86.9     90.7     63.8
Furniture and Home Furnishings Stores           88.0     90.9     58.8
Furniture Stores                                92.4     93.8     67.4
Home Furnishings Stores                         80.4     85.3     53.8
Electronics and Appliance Stores                90.4     91.2     64.9
Building Material and Garden Equipment Dealers  91.9     93.5     78.9
Food and Beverage Stores                        85.7     90.9     28.8
Grocery Stores                                  85.1     91.3     24.1
Grocery (except Convenience) Stores             87.1     93.3     20.0
Convenience Stores                              62.2     64.9     49.0
Specialty Food Stores                           66.2     73.1     41.9
Beer, Wine and Liquor Stores                    92.2     93.0     62.5
Health and Personal Care Stores                 82.7     82.5     85.8
Gasoline Stations                               84.2     84.7     76.0
Clothing and Clothing Accessories Stores        89.0     90.6     39.1
Clothing Stores                                 90.4     92.0     29.2
Shoe Stores                                     89.2     89.3     80.4
Jewellery, Luggage and Leather Goods Stores     79.1     81.6     51.8
Sporting Goods, Hobby, Book and Music Stores    86.4     93.1     32.6
General Merchandise Stores                      99.4     99.5     86.0
Department Stores                               100.0    100.0
Other General Merchandise Stores                98.8     99.0     86.0
Miscellaneous Store Retailers                   78.3     82.2     48.8
Total                                           88.9     90.9     55.6

Regions
Newfoundland and Labrador                       94.0     94.7     75.8
Prince Edward Island                            90.8     91.6     43.9
Nova Scotia                                     94.2     95.0     77.4
New Brunswick                                   83.4     85.5     56.0
Québec                                          89.2     92.8     46.0
Ontario                                         90.2     92.1     56.2
Manitoba                                        88.1     88.6     66.8
Saskatchewan                                    88.9     90.2     61.2
Alberta                                         86.8     87.7     68.9
British Columbia                                86.8     88.6     55.0
Yukon Territory                                 87.8     87.8
Northwest Territories                           84.1     84.1
Nunavut                                         91.5     91.5

Note 1: There are no administrative records used for new car dealers.

Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey are introduced to the survey via an introductory letter that informs the respondent that a representative of Statistics Canada will be calling. This call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, as well as to answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potential erroneous data following any failed preliminary edit check of the data. Once validated, the collected data is regularly transmitted to the head office in Ottawa.

Secondly, editing known as statistical editing is also done after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby a ratio of the respondent’s current month data over historical (last month, same month last year) or auxiliary data is analyzed. When the respondent’s ratio differs significantly from ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
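
For illustration, a common textbook formulation of the Hidiroglou-Berthelot edit is sketched below in Python. It is not the MRTS implementation: the parameter values (u, a, c) are placeholders, the data are invented, and all values are assumed to be strictly positive.

```python
import statistics


def hb_outlier_flags(current, previous, u=0.5, a=0.05, c=4.0):
    """Flag units whose period-to-period ratio is unusual within their group.

    u: exponent controlling how much weight large units receive (0 to 1)
    a: guards against quartile distances that are too small
    c: multiplier setting the width of the acceptance interval
    """
    ratios = [x / y for x, y in zip(current, previous)]
    r_med = statistics.median(ratios)
    # Symmetric transformation of the ratios around their median
    s = [(1 - r_med / r) if r < r_med else (r / r_med - 1) for r in ratios]
    # Effect: transformed ratio scaled by the size of the unit
    effects = [si * max(x, y) ** u for si, x, y in zip(s, current, previous)]
    q1, e_med, q3 = statistics.quantiles(effects, n=4)
    d_low = max(e_med - q1, abs(a * e_med))
    d_high = max(q3 - e_med, abs(a * e_med))
    return [e < e_med - c * d_low or e > e_med + c * d_high for e in effects]


previous = [100.0] * 9
current = [101.0, 99.0, 102.0, 98.0, 100.0, 103.0, 97.0, 100.0, 300.0]
flags = hb_outlier_flags(current, previous)
```

Only the last unit, whose sales tripled while its peers moved by a few percent, is flagged.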

The second set of edits consists of an edit known as the share of market edit. With this method, one is able to edit all respondents, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, if the weighted contribution of a respondent to the group’s total is too large, the respondent is flagged as an outlier.
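
A minimal sketch of such a share-based check follows. The threshold of 50% is purely hypothetical, since the limits actually used are not given here, and the figures are invented.

```python
def share_of_market_flags(weighted_sales, max_share):
    """Flag units whose weighted sales exceed max_share of their group's total.

    weighted_sales: dict mapping unit id to weight * reported sales, for one
        group of similar respondents (same industry and/or geography).
    max_share: hypothetical limit on a single unit's share of the group total.
    """
    group_total = sum(weighted_sales.values())
    return {unit: value / group_total > max_share
            for unit, value in weighted_sales.items()}


group = {"a": 50.0, "b": 60.0, "c": 40.0, "d": 850.0}
share_flags = share_of_market_flags(group, max_share=0.5)
```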

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for units that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MRTS is the process used to assign replacement values for missing data. Values are assigned where they are missing on the record being edited, so that estimates are of high quality and the record is plausible and internally consistent. Due to concerns of response burden, cost and timeliness, it is generally impossible to do all follow-ups with the respondents in order to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data is available, auxiliary data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending upon the particular reference month, there is an order of preference that exists so that top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
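
The first type of method, the general trend, can be sketched as follows. The ratio-of-sums form of the trend is an assumption made for this illustration, the exclusion of outliers described in the editing section is omitted, and the figures are invented.

```python
def impute_by_trend(current, previous):
    """Impute missing current values from previous values and a group trend.

    current: dict mapping unit id to the reported value, or None if missing.
    previous: dict mapping unit id to the historical value (e.g. last month).
    The trend is the ratio of current to previous totals among respondents.
    """
    respondents = [u for u, value in current.items() if value is not None]
    trend = (sum(current[u] for u in respondents)
             / sum(previous[u] for u in respondents))
    return {u: (value if value is not None else previous[u] * trend)
            for u, value in current.items()}


previous_month = {"a": 100.0, "b": 200.0, "c": 50.0}
current_month = {"a": 110.0, "b": 220.0, "c": None}
completed = impute_by_trend(current_month, previous_month)
```

Respondents grew by 10% in this example, so the non-respondent's previous value of 50 is carried forward as approximately 55.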

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods. The annual GST sales are used for two types of methods. One is a general trend that will be used for simple structure, e.g. enterprises with only one establishment, and a second type is called median-average that is used for units with a more complex structure.

10. Estimation

Estimation is a process that approximates unknown population parameters using only part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
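The weighting and summation by domain described above can be sketched in a few lines. This is a minimal illustration, not the Generalized Estimation System itself; the field names (`domain`, `sales`, `inclusion_prob`) and the sample values are hypothetical.

```python
from collections import defaultdict


def horvitz_thompson_totals(units):
    """Sum weighted sales by domain.

    units: iterable of dicts with keys 'domain', 'sales' and
    'inclusion_prob' (hypothetical field names).
    """
    totals = defaultdict(float)
    for u in units:
        weight = 1.0 / u["inclusion_prob"]  # sampling weight
        totals[u["domain"]] += weight * u["sales"]
    return dict(totals)


sample = [
    # Take-all unit: inclusion probability 1, so it represents only itself.
    {"domain": ("furniture", "ON"), "sales": 500.0, "inclusion_prob": 1.0},
    # Take-some unit: inclusion probability 0.25, so it represents 4 units.
    {"domain": ("furniture", "ON"), "sales": 40.0, "inclusion_prob": 0.25},
    {"domain": ("grocery", "QC"), "sales": 30.0, "inclusion_prob": 0.5},
]
totals = horvitz_thompson_totals(sample)
```

Because the domain is read from each unit's current classification rather than from its sampling stratum, a unit that changes industry or location simply contributes its weighted sales to a different domain.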

For more information on the methodology for modeling sales from administrative data sources which also contributes to the estimates of the survey portion, refer to ‘Monthly Retail Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
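For a stratified simple random sample without replacement, a standard textbook estimator of the variance of an estimated total sums, over strata, N_h^2 (1 - n_h/N_h) s_h^2 / n_h, where N_h is the stratum population size, n_h the stratum sample size and s_h^2 the sample variance. The sketch below applies that formula; it is an assumption that this matches the production calculation, and the function name and data are hypothetical.

```python
from statistics import variance


def stratified_total_variance(strata):
    """Estimated variance of a stratified SRSWOR total.

    strata: list of (N_h, sample_values) pairs, where N_h is the stratum
    population size and sample_values are the sampled units' sales.
    """
    total_var = 0.0
    for N_h, values in strata:
        n_h = len(values)
        if n_h == N_h or n_h < 2:
            # Take-all strata contribute no sampling variance; with fewer
            # than two units the sample variance is undefined, so the
            # stratum is skipped in this sketch.
            continue
        s2_h = variance(values)  # sample variance with n_h - 1 denominator
        total_var += N_h**2 * (1 - n_h / N_h) * s2_h / n_h
    return total_var


strata = [
    (3, [500.0, 700.0, 900.0]),      # take-all stratum: fully enumerated
    (10, [10.0, 20.0, 30.0, 40.0]),  # take-some stratum: 4 of 10 sampled
]
var_total = stratified_total_variance(strata)
standard_error = var_total ** 0.5
```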

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time."1 Economic time series such as the Monthly Retail Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.
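The changing day-of-week composition of a month can be checked directly with the standard library. The sketch below counts how often each weekday occurs in a given month; the function name is hypothetical.

```python
import calendar


def weekday_counts(year, month):
    """Return a list of 7 counts, indexed Monday=0 through Sunday=6."""
    days_in_month = calendar.monthrange(year, month)[1]
    counts = [0] * 7
    for day in range(1, days_in_month + 1):
        counts[calendar.weekday(year, month, day)] += 1
    return counts


march_2024 = weekday_counts(2024, 3)
march_2025 = weekday_counts(2025, 3)
```

March 2024 contains four Mondays and five Fridays, whereas March 2025 contains five Mondays and four Fridays, so the same calendar month offers a different mix of trading days from one year to the next.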

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

1. «A Note on the Seasonal Adjustment of Economic Time Series», Canadian Statistical Review, August 1974.

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Retail Trade Survey data have been seasonally adjusted using the X-12-ARIMA2 software. The technique essentially consists of first correcting the initial series for undesirable effects, such as the trading-day and the Easter holiday effects, using a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages3. The moving averages used to estimate the main components—the trend and seasonality—are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
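The document does not say how the annual totals are forced. The simplest possible approach, shown here purely as an assumption, is to prorate each year's seasonally adjusted values by a single factor; the function name and data are hypothetical.

```python
def prorate_to_annual_total(adjusted, original):
    """Scale one year's seasonally adjusted values so that they sum to
    the annual total of the original (raw) series."""
    factor = sum(original) / sum(adjusted)
    return [value * factor for value in adjusted]


# Raw series with a strong December peak, and a flat seasonally adjusted
# series whose annual total is slightly too low.
original = [100.0] * 11 + [220.0]
adjusted = [109.0] * 12
benchmarked = prorate_to_annual_total(adjusted, original)
```

A single annual factor can create a step between December and January of adjacent years, which is why smoother benchmarking methods are generally preferred in practice.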

Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.

12. Data quality evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as non-sampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake; or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV (X) = S(X) * 100% / X
where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error of the estimate, i.e. between $11,760,000 and $12,240,000.

Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors of the estimate, i.e. between $11,520,000 and $12,480,000.
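The arithmetic of this example can be reproduced with a short sketch. The function names are hypothetical; the multipliers 1 and 2 are the ones used in the text for 68% and 95% confidence.

```python
def standard_error(estimate, cv_percent):
    """Recover the standard error from an estimate and its CV in percent."""
    return estimate * cv_percent / 100.0


def confidence_interval(estimate, cv_percent, multiplier):
    """Interval of `multiplier` standard errors on each side of the estimate."""
    se = standard_error(estimate, cv_percent)
    return estimate - multiplier * se, estimate + multiplier * se


se = standard_error(12_000_000, 2)             # 240,000
ci_68 = confidence_interval(12_000_000, 2, 1)  # one standard error each side
ci_95 = confidence_interval(12_000_000, 2, 2)  # two standard errors each side
```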

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible "direct disclosure", which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

 

Monthly Wholesale Trade Survey Data Quality Statement

1. Objective, Uses and Users

1.1. Objectives

The Monthly Wholesale Trade Survey (MWTS) provides information on the performance of the wholesale trade sector and is an important indicator of the health of the Canadian economy. In addition, the business community uses the data to analyse market performance.

1.2. Use

The estimates provide a measure of the health and performance of the wholesale trade sector. Information collected is used to estimate the level and monthly trend of wholesale sales and inventories. At the end of each year, the estimates provide a preliminary look at annual wholesale sales and performance.

1.3. Users

A variety of organizations, sector associations, and levels of government make use of the information. Wholesalers can use the survey results to compare their performance against similar types of businesses, as well as for marketing purposes. Wholesale associations are able to monitor industry performance and promote their wholesale industries. Investors can monitor industry growth, which can result in better access to investment capital by wholesalers. Governments are able to understand the role of wholesalers in the economy, which aids in the development of policies and tax incentives. Because wholesale trade is an important industry in the Canadian economy (5 to 6% of the Gross Domestic Product, depending on the year), governments are able to better determine the overall health of the economy through the use of the estimates in the calculation of the nation’s Gross Domestic Product (GDP).

2. Concepts, Variables and Classifications

2.1. Concepts

Wholesale trade is generally the intermediate step in the distribution of merchandise. The sector comprises establishments primarily engaged in the buying and selling of merchandise and providing logistics, marketing and support services.

Wholesalers are organized to sell merchandise in large quantities to retailers, business and institutional clients. However, some wholesalers, in particular those that supply non-consumer capital goods, sell merchandise in single units to final users.  The sector recognizes two main types of wholesalers: wholesale merchants and wholesale agents and brokers.

Wholesale merchants buy and sell merchandise on their own account, that is, they take title to the goods they sell. They generally operate from warehouse or office locations and they may ship from their own inventory or arrange for the shipment of goods directly from the supplier to the client. In addition to the sales of goods, they may provide, or arrange for the provision of, logistics, marketing and support services, such as packaging and labelling, inventory management, shipping, handling of warranty claims, in-store or co-op promotions, and product training. Dealers of machinery and equipment, such as dealers of farm machinery and heavy-duty trucks, also fall within this category. They are known by a variety of trade designations depending on their relationship with suppliers or customers, or the distribution method they employ.

Examples include wholesale merchants, wholesale distributors, drop shippers, rack-jobbers, import-export merchants, buying groups, dealer-owned cooperatives and banner wholesalers. For purposes of industrial classification, wholesale merchants are classified by industry according to the principal lines of commodities sold. A description of each industrial group included in the accompanying statistical data is shown in Appendix IV. As most businesses sell several kinds of commodities, the classification assigned to a business generally reflects either the individual commodity or the commodity group which is the primary source of the establishment’s receipts, or some mixture of commodities which characterizes the establishment’s business.

Wholesale agents and brokers buy and sell merchandise owned by others on a fee or commission basis. They do not take title to the goods they buy or sell, and they generally operate at or from an office location. Wholesale agents and brokers are known by a variety of trade designations including import-export agents, wholesale commission agents, wholesale brokers, and manufacturer’s representatives or agents.

2.2. Variables

Sales are defined as the sales of all goods purchased for resale, net of returns and discounts. This includes parts used in generating repair and maintenance revenue, labour revenue from repair and maintenance, sales of goods manufactured as a secondary activity by the wholesaler, and revenue from rental and leasing of office space, other real estate, and goods and equipment. Any commission revenue and fees earned by wholesale merchants from buying and selling merchandise on account of others are also included. Other operating revenue such as operating subsidies and grants, shipping, handling, and storing goods for others is excluded.

Inventories are defined as the book value, i.e., the value maintained in the accounting records, of all stock owned at month end and intended for resale. This includes stock in selling outlets, in warehouses, in transit, or on consignment to others. It also includes stock owned within and outside Canada. Inventories held on consignment from others (not owned), and store and office supplies and any other supplies not to be sold are excluded.

Trading Location is the physical location(s) in which business activity is conducted in each province and territory, and for which sales are credited or recognized in the financial records of the company. For wholesalers, this would normally be a distribution centre.

Sales in volume: The value of wholesale trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as wholesale trade in current dollars and the second as wholesale trade in volume. The current dollar estimate is calculated by aggregating the weighted value of sales for all wholesale outlets. The volume estimate is calculated by first adjusting the sales values to a base year, using price indexes, and then summing the resulting values.
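A minimal sketch of the volume calculation follows, assuming fixed-base price indexes expressed with the base year equal to 100. The group names, index values and the function name `volume_of_sales` are hypothetical, and the level of detail at which deflation is actually carried out is not stated here.

```python
def volume_of_sales(current_dollar_sales, price_indexes):
    """Deflate sales by group to base-year prices and sum the results.

    Both arguments are dicts keyed by a (hypothetical) group name; each
    price index is expressed with the base year equal to 100.
    """
    return sum(
        sales / (price_indexes[group] / 100.0)
        for group, sales in current_dollar_sales.items()
    )


sales = {"machinery": 1100.0, "food": 520.0}    # current dollars
indexes = {"machinery": 110.0, "food": 104.0}   # base year = 100
volume = volume_of_sales(sales, indexes)         # about 1500 in base-year dollars
```

Deflating each group separately before summing matters: prices rose 10% for one group and 4% for the other, so a single overall index would give a different answer.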

2.3. Classifications

The Monthly Wholesale Trade Survey is based on the definition of wholesale trade under the NAICS (North American Industry Classification System). NAICS is the agreed-upon common framework for the production of comparable statistics by the statistical agencies of Canada, Mexico and the United States. The agreement defines the boundaries of twenty sectors. NAICS is based on a production-oriented, or supply-based, conceptual framework in that establishments are grouped into industries according to similarity in the production processes used to produce goods and services.

Estimates appear for 24 industries based on the 2007 North American Industry Classification System (NAICS). The 24 industries are further aggregated to 7 sub-sectors which correspond exactly to the 3-digit NAICS codes for wholesale trade industries, with the exception of the following: wholesale agents and brokers; and petroleum and oilseed and grain wholesaler-distributors.

Geographically, sales estimates are produced for Canada and each province and territory. Inventory estimates are produced only for Canada as a whole.

3. Coverage and Frames

Statistics Canada’s Business Register (BR) provides the frame for the Monthly Wholesale Trade Survey. The BR is a structured list of businesses engaged in the production of goods and services in Canada. It is a centrally maintained database containing detailed descriptions of most business entities operating within Canada. The BR includes all incorporated businesses, with or without employees. For unincorporated businesses, the BR includes all employer businesses and businesses with no employees with annualized sales that have a Goods and Services Tax (GST) account or annual revenue coming from individual income tax.

The businesses on the BR are represented by a hierarchical structure with four levels, with the statistical enterprise at the top, followed by the statistical company, the statistical establishment and the statistical location. An enterprise can be linked to one or more statistical companies, a statistical company can be linked to one or more statistical establishments, and a statistical establishment to one or more statistical locations.

The target population for the MWTS consists of all statistical establishments on the BR, excluding unincorporated businesses with no employees and with annual sales less than $30,000, that are classified to the wholesale sector using the North American Industry Classification System (NAICS) (approximately 90,000 establishments). The NAICS code range for the wholesale sector is 410000 to 419999. A statistical establishment is the production entity or the smallest grouping of production entities which: produces a homogeneous set of goods or services; does not cross provincial/territorial boundaries; and provides data on the value of output together with the cost of principal intermediate inputs used along with the cost and quantity of labour used to produce the output. The production entity is the physical unit where the business operations are carried out. It must have a civic address and dedicated labour.

The exclusions to the target population are ancillary establishments (producers of services in support of the activity of producing goods and services for the market of more than one establishment within the enterprise, which serve as a cost centre or a discretionary expense centre for which data on all costs, including labour and depreciation, can be reported by the business), future establishments, establishments for which economic signals indicate a null or missing revenue, and establishments in the following non-covered NAICS:

  • 41112 (oilseed and grain)
  • 412 (petroleum products)
  • 419 (agents and brokers)
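The NAICS part of the coverage rule (the wholesale code range minus the three non-covered groups) can be expressed as a simple check. This sketch deliberately ignores the other exclusions listed above, such as ancillary or future establishments and the size threshold; the function name and the codes passed to it are illustrative only.

```python
EXCLUDED_PREFIXES = ("41112", "412", "419")  # non-covered NAICS listed above


def in_covered_wholesale_naics(naics_code):
    """True if a 6-digit NAICS code is in the wholesale range
    (410000 to 419999) and not in a non-covered group."""
    code = str(naics_code)
    in_wholesale_range = (
        len(code) == 6 and code.isdigit() and 410000 <= int(code) <= 419999
    )
    return in_wholesale_range and not code.startswith(EXCLUDED_PREFIXES)


checks = {
    code: in_covered_wholesale_naics(code)
    for code in ("413110", "411120", "412110", "419110", "445110")
}
```

Only the first code is kept: the next three fall in the non-covered groups and the last lies outside the wholesale range.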

4. Sampling

The MWTS sample consists of 7,500 groups of establishments (clusters) classified to the Wholesale Trade sector selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industrial group and geographical region. The MWTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industrial group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories. The population is further stratified by size. The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2).

The size strata consist of one take-all (census), at most two take-some (partially sampled) strata, and one take-none (non-sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the estimates are produced through the use of administrative data.
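The selection logic can be sketched as simple random sampling without replacement within each stratum, with a sampling fraction of 1 for the take-all stratum and 0 for the take-none stratum. The stratum labels, fractions, unit identifiers and function name below are hypothetical, and the real design also stratifies by industry and region.

```python
import random


def select_sample(population, fractions, seed=0):
    """Draw a simple random sample without replacement in each stratum.

    population: list of (unit_id, stratum) pairs.
    fractions: dict mapping stratum -> sampling fraction
               (1.0 for take-all, 0.0 for take-none).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_stratum = {}
    for unit_id, stratum in population:
        by_stratum.setdefault(stratum, []).append(unit_id)
    sample = {}
    for stratum, units in by_stratum.items():
        n = round(fractions[stratum] * len(units))
        sample[stratum] = rng.sample(units, n)
    return sample


population = (
    [(f"A{i}", "take-all") for i in range(4)]
    + [(f"S{i}", "take-some") for i in range(20)]
    + [(f"N{i}", "take-none") for i in range(50)]
)
fractions = {"take-all": 1.0, "take-some": 0.25, "take-none": 0.0}
sample = select_sample(population, fractions)
```

Here all four large units are selected, five of the twenty medium units are drawn at random, and none of the fifty small units receives a questionnaire.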

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MWTS is a repeated survey with maximization of monthly sample overlap. The sample is kept month after month, and every month new units are added (births) to the sample. MWTS births, i.e., new clusters of establishment(s), are identified every month via the BR’s latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths also occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in wholesale trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MWTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

5. Questionnaire Design

The questionnaire collects monthly data on wholesale sales and the number of trading locations by province or territory and inventories of goods owned and intended for resale from a sample of wholesalers. For the 2004 redesign, most questionnaires were subject to cosmetic changes only, with the exception of the inclusion of Nunavut. The modifications were discussed with stakeholders and the respondents were given an opportunity to comment before the new questionnaire was finalized. If further changes are needed to any of the questionnaires, proposed changes would go through a review committee and a field test with respondents and data users to ensure their relevance.

6. Response and Non-response

6.1. Response and Non-response

Despite the best efforts of survey managers and operations staff to maximize response in the MWTS, some non-response will occur.

For statistical establishments to be classified as responding, the degree of partial response (where an accurate response is obtained for only some of the questions asked of a respondent) must meet a minimum threshold level. Below that threshold, the response is rejected and considered a unit non-response; the business is then classified as not having responded at all.

Non-response has two effects on data: first it introduces bias in estimates when non-respondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, its impact on the overall quality and the risk of non-response bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation.

In order to measure the amount of non-response that occurs each month various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MWTS, two types of rates are calculated (unweighted and weighted). In order to assess the efficiency of the collection process, unweighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation. Within each of these types of rates, there are distinct rates for units that are surveyed and for units that are only modeled from administrative data that has been extracted from GST files.

To get a better picture of the success of the collection process, two unweighted rates called the ‘collection results rate’ and the ‘extraction results rate’ are computed. They are computed by dividing the number of respondents by the number of units for which contact or data extraction was attempted. Non-monthly reporters (respondents with special reporting arrangements where they do not report every month but for whom actual data is available in subsequent revisions) are excluded from both the numerator and denominator for the months where no contact is performed.

In summary, the various response rates are calculated as follows:

Weighted rates:

- Survey Response rate (estimation) = Sum of weighted sales of units with response status i / Sum of survey weighted sales

where i = units that have either reported data that will be used in estimation or are converted refusals, or have reported data that has not yet been resolved for estimation.

- Admin Response rate (estimation) = Sum of weighted sales of units with response status ii / Sum of administrative weighted sales

where ii = units that have data that was extracted from administrative files and are usable for estimation.

- Total Response rate (estimation) = Sum of weighted sales of units with response status i or response status ii / Sum of all weighted sales

Unweighted rates:

- Survey Response rate (collection) = Number of questionnaires with response status iii / Number of questionnaires with response status iv

where iii = units that have either reported data (unresolved, used or not used for estimation) or are converted refusals.

where iv = all of the above plus units that have refused to respond, units that were not contacted and other types of non-respondent units.

- Admin Response rate (extraction) = Number of questionnaires with response status vi / Number of questionnaires with response status vii

where vi = in-scope units that have data (either usable or non-usable) that was extracted from administrative files

where vii = all of the above plus units that have refused to report to the administrative data source, units that were not contacted and other types of non-respondent units.
(% of questionnaires collected over all in-scope questionnaires)

- Collection Results Rate = Number of questionnaires with response status iii / Number of questionnaires with response status viii

where iii = same as iii defined above

where viii = same as iv, but excluding units that were not contacted because their response is unavailable for a particular month since they are non-monthly reporters.

- Extraction Results Rate = Number of questionnaires with response status ix / Number of questionnaires with response status vii

where ix = same as vi with the addition of extracted units that have been imputed or were out of scope

where vii = same as vii defined above
(% of questionnaires collected over all in-scope questionnaires we tried to collect)

All the above weighted and unweighted rates are provided at the industrial group, geography and size group level or for any combination of these levels.
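The difference between the weighted and unweighted rates can be illustrated with a small sketch. The status labels, field names and function names are hypothetical simplifications of response statuses i to ix above, and the sales of non-responding units are assumed to be imputed values.

```python
def weighted_response_rate(units, responding_statuses):
    """Share of weighted sales coming from units with a responding status."""
    total = sum(u["weight"] * u["sales"] for u in units)
    responded = sum(
        u["weight"] * u["sales"]
        for u in units
        if u["status"] in responding_statuses
    )
    return responded / total


def unweighted_response_rate(units, responding_statuses):
    """Share of units (questionnaires) with a responding status."""
    responded = sum(1 for u in units if u["status"] in responding_statuses)
    return responded / len(units)


units = [
    {"weight": 1.0, "sales": 600.0, "status": "reported"},  # large take-all unit
    {"weight": 4.0, "sales": 50.0, "status": "reported"},
    {"weight": 4.0, "sales": 50.0, "status": "refusal"},    # sales are imputed
]
responding = {"reported", "converted refusal"}
weighted = weighted_response_rate(units, responding)      # 800 / 1000
unweighted = unweighted_response_rate(units, responding)  # 2 of 3 units
```

The weighted rate is higher than the unweighted rate here because the largest unit responded, which is why the weighted rate is the better indicator of the quality of the estimate while the unweighted rate reflects collection effort.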

Use of Administrative Data:

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MWTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

Inventories for establishments where sales are GST-based are derived using the MWTS imputation system. The imputation system uses the previous month’s values and the month-to-month and year-to-year changes in surveyed establishments of similar size.

For more information on the methodology used for modeling sales from administrative data sources, refer to ‘Monthly Wholesale Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

6.2. Methods used to reduce non-response at collection

Significant effort is spent trying to minimize non-response during collection. Methods used, among others, are interviewer techniques such as probing and persuasion, repeated re-scheduling and call-backs to obtain the information, and procedures dealing with how to handle non-compliant (refusal) respondents.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available. To minimize total non-response for all variables, partial responses are accepted. In addition, questionnaires are customized for the collection of certain variables, such as inventory, so that collection is timed for those months when the data are available.

Finally, cases are generally assigned to the same interviewer each month. This establishes a personal relationship between interviewer and respondent and builds trust and rapport.

7. Data Collection and Capture Operations

Collection of the data is performed by Statistics Canada’s Regional Offices. Respondents are sent a questionnaire or are contacted by telephone to obtain their sales and inventory values, as well as to confirm the opening or closing of business trading locations. There is also follow-up of non-response. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants are introduced to the survey via a letter informing them that a representative of Statistics Canada will be calling. The purpose of this call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, and answer any questions that the respondent may have.

8. Editing

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MWTS, data editing is done at two different time periods.

First of all, editing is done during data collection. Once data are collected via the telephone, or via the receipt of completed mail-in questionnaires, the data are captured using customized data capture applications. All data are subjected to data editing. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are also used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MWTS, the current month’s responses are edited against the respondent’s previous month’s responses and/or the previous year’s responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Secondly, statistical editing, which is more empirical in nature, is done after data collection. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in the statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method whereby a ratio of the respondent’s current month data over historical (i.e. last month, or same month last year) or administrative data is analyzed. When the respondent’s ratio differs significantly from ratios of respondents who are similar in terms of industrial group and/or geography group, the response is deemed an outlier.
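The ratio-based check described above can be sketched as follows. This is a minimal illustration of the general Hidiroglou-Berthelot approach, not the MWTS production code: the function name and the tuning values U, A and C are assumptions chosen for the example, and the survey's actual parameters and grouping rules are not given here. The function is meant to be applied to one group of similar respondents at a time.

```python
from statistics import median, quantiles

def hb_outliers(current, previous, U=0.4, A=0.05, C=4.0):
    """Flag outlying current/previous ratios within one group of respondents.

    U, A and C are illustrative tuning values (assumptions), not survey settings.
    Returns one boolean per respondent: True means flagged as an outlier.
    """
    ratios = [c / p for c, p in zip(current, previous)]
    r_med = median(ratios)
    # Transform ratios so that increases and decreases are treated symmetrically.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]
    # Scale by unit size so that large units are examined more closely.
    effects = [si * max(c, p) ** U for si, c, p in zip(s, current, previous)]
    e_med = median(effects)
    q1, _, q3 = quantiles(effects, n=4)
    # The A term prevents the interval from collapsing when effects are tightly clustered.
    d_low = max(e_med - q1, abs(A * e_med))
    d_high = max(q3 - e_med, abs(A * e_med))
    lower, upper = e_med - C * d_low, e_med + C * d_high
    return [not (lower <= e <= upper) for e in effects]

previous = [100, 120, 90, 110, 105, 95, 100, 115]
current = [102, 123, 91, 113, 106, 97, 300, 117]
flags = hb_outliers(current, previous)  # only the unit that tripled is flagged
```

The same function could be run against last month, the same month last year, or an administrative value, depending on which comparison data are available for the group.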

The second set of edits consists of an edit known as the share of market edit. With this method, all respondents can be edited, even those for which historical and auxiliary data are unavailable, because the method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent is flagged as an outlier if its weighted contribution to the group’s total is too large.
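As a rough illustration of the share of market edit, the sketch below computes each respondent's weighted contribution to its group total and flags any contribution above a cut-off. The function name and the `max_share` threshold are hypothetical; the actual threshold used by the survey is not stated in this document.

```python
def share_of_market_outliers(values, weights, max_share=0.5):
    """Flag units whose weighted value exceeds max_share of the group total.

    max_share is a hypothetical threshold used only for illustration.
    """
    weighted = [v * w for v, w in zip(values, weights)]
    total = sum(weighted)
    if total == 0:
        return [False] * len(weighted)
    return [wv / total > max_share for wv in weighted]

# Four respondents in one industry-by-geography group (made-up figures).
flags_som = share_of_market_outliers([10, 12, 11, 200], [5, 5, 5, 1])
```

Because only current month data are used, the check still works for a respondent with no history.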

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as an outlier will not be included in the imputation models (those based on ratios). Also, data that are flagged as outliers in the share of market edit will not be included in the imputation models where means and medians are calculated to impute for responses that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

9. Imputation

Imputation in the MWTS is the process used to assign replacement values for missing data. Values are assigned where they are missing on the record being edited, so that estimates are of high quality and each record is plausible and internally consistent. Due to concerns about response burden, cost and timeliness, it is generally impossible to follow up with every respondent in order to resolve missing responses. Since it is desirable to produce a complete and consistent micro data file, imputation is used to handle the remaining missing cases.

In the MWTS, imputation for missing values can be based on either historical or administrative data. The appropriate method is selected according to a strategy that is based on whether historical data is available, administrative data is available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (previous month, data from next month or data from same month previous year). The second type is a regression model where data from previous month and same month previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent.

Depending upon the particular reference month, there is an order of preference that exists so that a top quality imputation can result. The historical imputation method that was labelled as the third type above is always the last option in the order for each reference month.
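The first type of historical imputation, the general trend, can be sketched as a simple ratio calculation. The version below assumes the trend is the ratio of current to previous totals among responding units in the same imputation group, which is a common way to form such a trend but is an assumption here; the function and argument names are illustrative.

```python
def trend_impute(prev_value, donors_current, donors_previous):
    """Impute a missing value by applying the respondents' trend to the
    non-respondent's historical value.

    donors_* hold values for responding units in the same group; units
    flagged as outliers would be left out of these lists.
    """
    trend = sum(donors_current) / sum(donors_previous)
    return prev_value * trend

# A non-respondent reported 200 last month; respondents grew by 10% overall.
imputed = trend_impute(200.0, [110.0, 220.0], [100.0, 200.0])
```

Direct replacement, the third type, corresponds to a trend of exactly 1.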

The imputation methods using administrative data are automatically selected when historical information is unavailable for a non-respondent. The administrative data source (annual GST sales) is the basis of these methods, of which there are two types. The first is a general trend that is used for units with a simple structure, e.g. enterprises with only one establishment. The second, called median-average, is used for units with a more complex structure.

Finally, it should be noted that inventories in the MWTS where sales are derived from monthly GST data are also imputed by the MWTS imputation systems. The imputed values are calculated using the same imputation methods that are in place for missing data from non-respondents.

10. Estimation

Estimation is a process that approximates unknown population parameters using only the part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For wholesale sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industrial group or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
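The Horvitz-Thompson estimate for the survey portion can be illustrated with a short sketch. Each sampled unit's sales are multiplied by its sampling weight (the inverse of its inclusion probability) and summed by domain. The function name, the tuple layout and the figures are made up for the example.

```python
from collections import defaultdict

def horvitz_thompson_totals(units):
    """Estimate total sales by domain.

    units: iterable of (domain, sales, inclusion_probability) for sampled units.
    """
    totals = defaultdict(float)
    for domain, sales, pi in units:
        totals[domain] += sales / pi  # sampling weight is 1 / pi
    return dict(totals)

sample = [
    ("A", 100.0, 1.0),   # take-all unit represents only itself
    ("A", 50.0, 0.25),   # take-some unit represents four units
    ("B", 40.0, 0.5),
]
estimates = horvitz_thompson_totals(sample)  # {'A': 300.0, 'B': 80.0}
```

Grouping by the unit's current domain rather than its sampling stratum mirrors the point made above that classification changes are reflected immediately in the estimates.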

For wholesale inventories, the sample selected for estimating sales is used to derive an estimate through the use of a Horvitz-Thompson estimator for the survey portion. A sample-based ratio is then used to produce the estimate for the non-survey portion, and the estimate of the total is derived as the sum of the survey and non-survey portion estimates.

For more information on the methodology for modeling sales from administrative data sources (i.e. GST data) which also contributes to the estimates of the survey portion, refer to ‘Monthly Wholesale Trade Survey: Use of Administrative Data’ under ‘Documentation’ of the IMDB.

The measure of precision used for the MWTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
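For a stratified simple random sample without replacement, the textbook estimator of the variance of an estimated total is the sum over strata of N_h² (1 − n_h/N_h) s_h² / n_h, where N_h is the stratum population size, n_h the sample size and s_h² the sample variance. The sketch below implements that standard formula. It is not taken from the Generalized Estimation System, and the function name and data layout are assumptions.

```python
from statistics import variance

def stratified_total_variance(strata):
    """Estimate the variance of a stratified SRSWOR total.

    strata: list of (N_h, sample_values) pairs, one per stratum.
    Sampled (take-some) strata need at least two observations.
    """
    total = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        if n_h == N_h:
            continue  # take-all stratum contributes no sampling variance
        fpc = 1 - n_h / N_h  # finite population correction
        total += N_h ** 2 * fpc * variance(sample) / n_h
    return total

# One take-some stratum (10 units, 2 sampled) and one take-all stratum.
var_total = stratified_total_variance([(10, [1.0, 3.0]), (2, [5.0, 7.0])])
```

The square root of this quantity is the standard error used in the coefficient of variation.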

11. Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous year. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years.

Time series contain the elements essential to the description, explanation and forecasting of the behaviour of an economic phenomenon: "They are statistical records of the evolution of economic processes through time."1 Economic time series such as those produced by the Monthly Wholesale Trade Survey can be broken down into five main components: the trend-cycle, seasonality, the trading-day effect, the Easter holiday effect and the irregular component.

The trend represents the long-term change in the series, whereas the cycle represents a smooth, quasi-periodical movement about the trend, showing a succession of growth and decline phases (e.g., the business cycle). These two components—the trend and the cycle—are estimated together, and the trend-cycle reflects the fundamental evolution of the series. The other components reflect short-term transient movements.

The seasonal component represents sub-annual, monthly or quarterly fluctuations that recur more or less regularly from one year to the next. Seasonal variations are caused by the direct and indirect effects of the climatic seasons and institutional factors (attributable to social conventions or administrative rules; e.g., Christmas).

The trading-day component originates from the fact that the relative importance of the days varies systematically within the week and that the number of each day of the week in a given month varies from year to year. This effect is present when activity varies with the day of the week. For instance, Sunday is typically less active than the other days, and the number of Sundays, Mondays, etc., in a given month changes from year to year.
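The changing weekday composition of a month is easy to verify. The sketch below counts how many times each day of the week occurs in a given month using Python's standard `calendar` module; the function name is illustrative.

```python
import calendar
from collections import Counter

DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday")

def weekday_counts(year, month):
    """Return the number of occurrences of each weekday in the given month."""
    _, n_days = calendar.monthrange(year, month)
    counts = Counter(calendar.weekday(year, month, day)
                     for day in range(1, n_days + 1))
    return {name: counts[i] for i, name in enumerate(DAYS)}

march_2008 = weekday_counts(2008, 3)  # five Saturdays, Sundays and Mondays
march_2009 = weekday_counts(2009, 3)  # five Sundays, Mondays and Tuesdays
```

Counts of this kind are the sort of input a trading-day regression would use to estimate the relative weight of each day.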

The Easter holiday effect is the variation due to the shift of part of April’s activity to March when Easter falls in March rather than April.

Lastly, the irregular component includes all other more or less erratic fluctuations not taken into account in the preceding components. It is a residual that includes errors of measurement on the variable itself as well as unusual events (e.g., strikes, drought, floods, major power blackout or other unexpected events causing variations in respondents’ activities).

Thus, the latter four components—seasonal, irregular, trading-day and Easter holiday effect—all conceal the fundamental trend-cycle component of the series. Seasonal adjustment (correction of seasonal variation) consists in removing the seasonal, trading-day and Easter holiday effect components from the series, and it thus helps reveal the trend-cycle. While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

Since April 2008, Monthly Wholesale Trade Survey data are seasonally adjusted using the X-12-ARIMA2 software. The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are estimated using regression models with ARIMA errors (auto-regressive integrated moving average models). The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series (pre-adjusted and extrapolated if applicable) is seasonally adjusted by the X-11 method.

The X-11 method is used for analysing monthly and quarterly series. It is based on an iterative principle applied in estimating the different components, with estimation being done at each stage using adequate moving averages3. The moving averages used to estimate the main components—the trend and seasonality—are primarily smoothing tools designed to eliminate an undesirable component from the series. Since moving averages react poorly to the presence of atypical values, the X-11 method includes a tool for detecting and correcting atypical points. This tool is used to clean up the series during the seasonal adjustment. Outlying data points can also be detected and corrected in advance, within the regARIMA module.

Lastly, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series. Unfortunately, seasonal adjustment removes the sub-annual additivity of a system of series; small discrepancies can be observed between the sum of seasonally adjusted series and the direct seasonal adjustment of their total. To ensure or restore additivity in a system of series, a reconciliation process is applied or indirect seasonal adjustment is used, i.e. the seasonal adjustment of a total is derived by the summation of the individually seasonally adjusted series.
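The idea of forcing annual totals can be illustrated with the simplest possible approach, a pro-rata scaling of one year of seasonally adjusted values. This is only a sketch of the concept: this document does not state which benchmarking method the survey uses, and production methods typically also preserve month-to-month movement across year boundaries, which plain pro-rata scaling does not. The function name is illustrative.

```python
def force_annual_total(adjusted, original):
    """Scale one year of seasonally adjusted values so that their sum
    equals the annual total of the original (raw) series."""
    factor = sum(original) / sum(adjusted)
    return [value * factor for value in adjusted]

raw = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 11.0, 10.5, 10.0, 10.5, 11.0, 11.5]
sa = [10.0] * 12  # made-up seasonally adjusted values
forced = force_annual_total(sa, raw)
```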

12. Data Quality Evaluation

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error.

Sampling error results when observations are made only on a sample and not on the entire population. All other errors arising from the various phases of a survey are referred to as non-sampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when GST data for records being modeled for a particular month are not representative of the actual record for various reasons; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors.

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for large businesses), general economic conditions and historical trends.

A common measure of data quality for surveys is the coefficient of variation (CV). The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.

The formula used to calculate coefficients of variation (CV) as percentages is:

CV(X) = (S(X) / X) x 100%

where X denotes the estimate and S(X) denotes the standard error of X.

Confidence intervals can be constructed around the estimates using the estimate and the CV. Thus, for our sample, it is possible to state with a given level of confidence that the expected value will fall within the confidence interval constructed around the estimate. For example, if an estimate of $12,000,000 has a CV of 2%, the standard error will be $240,000 (the estimate multiplied by the CV). It can be stated with 68% confidence that the expected value will fall within one standard error of the estimate, i.e. between $11,760,000 and $12,240,000. Alternatively, it can be stated with 95% confidence that the expected value will fall within two standard errors of the estimate, i.e. between $11,520,000 and $12,480,000.
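The arithmetic in this example can be reproduced with a few lines of code. The function names are illustrative; the calculation follows the CV formula given above.

```python
def standard_error(estimate, cv_percent):
    """Recover the standard error from an estimate and its CV in percent."""
    return estimate * cv_percent / 100

def confidence_interval(estimate, cv_percent, n_se=1):
    """Interval of n_se standard errors on either side of the estimate
    (n_se=1 for roughly 68% confidence, n_se=2 for roughly 95%)."""
    se = standard_error(estimate, cv_percent)
    return estimate - n_se * se, estimate + n_se * se

se = standard_error(12_000_000, 2)                   # 240000.0
ci_68 = confidence_interval(12_000_000, 2, n_se=1)   # (11760000.0, 12240000.0)
ci_95 = confidence_interval(12_000_000, 2, n_se=2)   # (11520000.0, 12480000.0)
```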

Finally, due to the small contribution of the non-survey portion to the total estimates, bias in the non-survey portion has a negligible impact on the CVs. Therefore, the CV from the survey portion is used for the total estimate that is the summation of estimates from the surveyed and non-surveyed portions.

13. Disclosure Control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible “direct disclosure”, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.
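A direct disclosure check of this kind is often expressed as two tests on each tabulation cell: a minimum number of respondents and a dominance test on the largest contributors. The sketch below shows the general shape of such a check. All three parameter values are hypothetical, since the actual confidentiality rules and thresholds are not given in this document.

```python
def is_sensitive_cell(contributions, min_respondents=3, top_n=2, max_share=0.8):
    """Return True if a cell should be considered for suppression.

    contributions: each respondent's contribution to the cell total.
    min_respondents, top_n and max_share are hypothetical parameters.
    """
    if len(contributions) < min_respondents:
        return True  # too few respondents
    total = sum(contributions)
    if total <= 0:
        return False
    largest = sorted(contributions, reverse=True)[:top_n]
    return sum(largest) / total > max_share  # dominated by a few companies

few = is_sensitive_cell([50, 30])              # True: only two respondents
dominated = is_sensitive_cell([90, 5, 3, 2])   # True: top two hold 95%
safe = is_sensitive_cell([30, 25, 25, 20])     # False
```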

Notes

  1. «A Note on the Seasonal Adjustment of Economic Time Series», Canadian Statistical Review, August 1974.

  2. For more information, see X-12-ARIMA Reference Manual Version 0.3 (2007), U.S. Census Bureau.

  3. Ladiray, D. and Quenneville, B. (2001). Seasonal Adjustment with the X-11 Method. New York: Springer-Verlag, Lecture Notes in Statistics no. 158.

Who we are, what we do and who does what

Who we are

From start to finish, your survey is in good hands. As a client of Statistics Canada, you are automatically the beneficiary of the same world-class expertise that regularly delivers Canada's social and economic statistics to the nation and the world.

There are no shortcuts to excellence. Every step in the survey process is subjected to exacting quality controls by highly qualified professionals—each trained and experienced in their statistical specialty. Statistics Canada's teams of experts work together with the understanding that all steps in the process are equally important toward achieving peerless results.

What we do

The saying "you get what you pay for" also rings true when it comes to conducting surveys. Having earned its reputation for excellence over many decades of uncompromising dedication to producing factual, unbiased, quality information, Statistics Canada's model of rigorous practices is the standard to which other statistical organizations can only aspire.

When clients hire Statistics Canada to conduct their surveys they are hiring skilled professionals, using state-of-the-art infrastructure, to ensure that their surveys are successful.

Who does what

A great deal of talent, skill and experience defines Statistics Canada's level of service, ensuring that you get the best value for your money:

  • Survey planners help you to define your information needs and determine the best and most economical way to achieve them.
  • Methodologists use your input about the kind of information you need to determine the sample design, keeping in mind the data collection method that will maximize your return on investment.
  • Questionnaire development and design experts create your questionnaire and then test and re-test it to make sure it will measure exactly what you want.
  • Data collection experts include highly trained interviewers and data collection teams.
  • Data processors edit, code and weight your collected data, then document each phase of the survey, the file, the record layout, each variable in the file and indicators of data quality.
  • Analysts interpret the survey results. These are highly skilled and experienced individuals with extensive knowledge about making sense of data.
  • Disseminators and communicators ensure that your survey receives as widespread attention as you wish.

Communiqué

May 2012

Summary

  • Changes to Canada's domestic travel survey, Travel Survey of Residents of Canada (TSRC), between 2011 and previous years are of sufficient magnitude that they will likely result in a break in the historical series.
  • Based on preliminary findings, survey partners and other users can anticipate changes in the volume/value estimates for 2011 relative to previous years that are beyond the ones expected because of economic or demographic changes.
  • The direction and size of changes in volume/value estimates for 2011 are still unknown and may not be the same across all regions or levels of geography.

Overview of Changes

  • Like the 2010 TSRC, the 2011 study is based on a single rotation of the Labour Force Survey (Footnote 1). Unlike the 2010 version, the new design collects information on overnight trips taken by adult Canadian residents over a two-month rather than one-month period (Footnote 2).
  • Between 2010 and 2011, the number of trips for which a respondent was asked to report spending, lodging and activities in a detailed manner changed. Other changes included main purpose categories and the boundaries of in-scope versus out-of-scope trips (Footnote 3).
  • For a full description of changes between TSRC 2010 and TSRC 2011, readers are advised to consult Differences Between the 2011 Redesigned TSRC and the 2010 TSRC, available on Statistics Canada's website (Footnote 4).

More Specifically

  • In the 2011 survey, limited information is collected for all in-scope reported trips via a trip roster. Depending on the number of trips reported, full details are collected for up to three trips. Details include trip spending, nights spent in lodging types at specific locations, and activities. Statistics Canada's selection system to identify trips to be reported in detail favours out-of-province, more recent and overnight trips.
  • Statistics Canada developed a complex imputation procedure to assign characteristics of fully reported trips to those for which only limited information was captured in the trip roster. Approximately 13% of overnight trips and 25% of same-day trips that were rostered but not explored in detail will have characteristics of other people's trips assigned to them via this imputation procedure. This process could result in some anomalous findings, particularly with respect to activities, lodging types and locations of overnight stays for trips with more than one overnight location.
  • Because the methodologies are different, pooling of the 2010 and 2011 TSRC files is not feasible. Thus, the number of respondent records available for geographic or sector analysis will be smaller than the numbers available in pooled data for 2009 and/or 2010 reference years (Footnote 5).

    As a consequence of the two-month recall period for overnight trips, each respondent has the opportunity to report more overnight trips than in the one-month recall surveys of 2009 and 2010. Despite the increase in trip records in the 2011 file (including those with full details reported by the respondent and those with imputed trip characteristics), the total available for analysis falls short of the number of trip records contained in the 2009-2010 pooled file. Thus, data for some sub-provincial locations may be less reliable for 2011 than corresponding estimates from the 2009-2010 pooled files.

  • The manner in which main purpose of trip is asked has changed. As of 2011, respondents are asked whether a trip was for (1) personal or (2) business or work-related reasons. Subsequently, respondents are asked to provide a more specific reason for the trip such as to visit friends or relatives; for holidays, leisure, or recreation; to go to a conference, convention or trade show (business), and the like. Additionally, routine other business trips that were considered out-of-scope prior to 2011 are now considered in-scope.

    Early indications suggest that these conceptual and wording changes may have altered overall volume estimates and the relative shares of trips by main purpose.

What to Expect

  • The TSRC 2011 data file is scheduled for external audit in November 2012 and for release by Statistics Canada in December 2012 (Footnote 6).
  • Following the TSRC 2011 release, the TSRC Working Group will explore the feasibility of creating adjustment factors at the national and provincial levels to permit comparisons between reference years 2010 and 2011 (bridging).
  • The TSRC Working Group will also examine options for pooling 2011 and 2012 TSRC files.