Data quality, concepts and methodology: Statistical methodology

Survey design

Three sources of data were combined to form a census of all units in the population of interest. These consisted of:

  1. Annualized data from the Quarterly Survey of Financial Statements (QFS) obtained from the Industrial Organization and Finance Division at Statistics Canada.
  2. A survey of provincial or federal level government business enterprises (GBE) that operated in the business sector, with data obtained from the Public Sector Statistics Division at Statistics Canada.
  3. Administrative corporate taxation data in the form of the T2 Corporation Income Tax Return and the General Index of Financial Information (GIFI) obtained from the Tax Data Division at Statistics Canada.

The frame contains 1,345,664 units in the population of interest. The Quarterly Survey of Financial Statements (QFS) provided consolidated data for 3,575 of the larger enterprises, and the survey of government business enterprises provided data for 121 enterprises. Data for the remaining units were obtained from administrative corporate taxation data. Although the vast majority of units are covered by the administrative source, that source is less significant in terms of its contribution to assets and operating revenues (see Text table 1).
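As a quick arithmetic check of the counts quoted above, the following sketch derives the number of units left to the administrative source and each source's share of the frame. It assumes that every unit in the frame is covered by exactly one source; the variable and function names are illustrative only.

```python
# Unit counts quoted in the text above.
FRAME_UNITS = 1_345_664
QFS_UNITS = 3_575
GBE_UNITS = 121

# Assumption: each unit in the frame is covered by exactly one source,
# so the administrative count is whatever remains.
admin_units = FRAME_UNITS - QFS_UNITS - GBE_UNITS


def share_of_frame(units: int) -> float:
    """Return the percentage of frame units represented by `units`."""
    return 100.0 * units / FRAME_UNITS


if __name__ == "__main__":
    sources = [("QFS", QFS_UNITS), ("GBE", GBE_UNITS), ("Administrative", admin_units)]
    for name, units in sources:
        print(f"{name}: {units:,} units ({share_of_frame(units):.2f}% of frame)")
```

Under that assumption the administrative source covers 1,341,968 units, which is why it dominates the unit count even though its contribution to assets and revenues is smaller.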

Collection and processing

For reference years 1999 and 2000, data collected from the Quarterly Survey of Financial Statements were annualized and then combined with data from a supplementary annual questionnaire that was mailed to survey respondents. The supplementary annual questionnaire was designed to obtain additional detailed information on operating expenses not available from the QFS. Beginning with reference year 2001, the supplementary questions were added to the Quarterly Survey of Financial Statements and the supplementary annual questionnaire was eliminated.

Information from all three data sources was provided in different formats with different sets of variables. In order to merge the data it was necessary to transform all three data sources into a common set of variables that contained a complete set of financial statement information. Certain details were omitted in the process due to the unavailability of data from all sources.

QFS and GBE data were collected at the enterprise level, whereas GIFI data were collected at the non-consolidated single legal entity level. Data for single legal entities belonging to a corporate family (multi-legals) were then rolled up to the enterprise level.
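The roll-up from legal entities to enterprises can be pictured with a minimal sketch. The text does not describe the actual procedure, so plain summation is an assumption here (it does not, for instance, eliminate balances between members of the same corporate family), and the field name `enterprise_id` and the sample figures are hypothetical.

```python
from collections import defaultdict


def roll_up(entity_records):
    """Sum legal-entity records to one record per enterprise.

    Each record is a dict with an 'enterprise_id' key (hypothetical name)
    plus numeric financial variables. Plain summation is an assumption:
    it does not eliminate inter-company balances.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for record in entity_records:
        enterprise = record["enterprise_id"]
        for variable, value in record.items():
            if variable != "enterprise_id":
                totals[enterprise][variable] += value
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {ent: dict(values) for ent, values in totals.items()}


# Made-up legal-entity records: E1 is a two-entity corporate family.
records = [
    {"enterprise_id": "E1", "assets": 100.0, "operating_revenue": 40.0},
    {"enterprise_id": "E1", "assets": 50.0, "operating_revenue": 10.0},
    {"enterprise_id": "E2", "assets": 75.0, "operating_revenue": 30.0},
]
enterprise_totals = roll_up(records)
```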

Edit and imputation

Several checks are performed on the data to verify internal consistency and identify extreme values. Imputation for complete non-response is performed by two general methods. The preferred and most common method makes use of historical information about the non-responding unit and current trends in the principal characteristics of similar units. When historical information is not available, such as in the case of births, a donor of similar size and industry is substituted for the missing unit.
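The two imputation approaches can be illustrated with a rough sketch. The exact formulas are not given in the text, so the ratio-based trend adjustment and the nearest-size donor rule below are assumptions chosen for illustration; all function names, field names and figures are hypothetical.

```python
def trend_ratio(similar_current, similar_previous):
    """Ratio of current to previous totals among similar responding units."""
    return sum(similar_current) / sum(similar_previous)


def impute_from_history(previous_value, ratio):
    """Carry a unit's previous value forward, scaled by the current trend."""
    return previous_value * ratio


def pick_donor(target_size, target_industry, respondents):
    """Return the respondent in the same industry that is closest in size.

    Returns None when no respondent shares the industry. Used when the
    missing unit has no history (for example, a birth).
    """
    same_industry = [r for r in respondents if r["industry"] == target_industry]
    if not same_industry:
        return None
    return min(same_industry, key=lambda r: abs(r["size"] - target_size))


# Method 1: historical value adjusted by the trend among similar units.
ratio = trend_ratio([150.0, 300.0], [100.0, 200.0])
imputed_revenue = impute_from_history(80.0, ratio)

# Method 2: donor substitution for a unit with no history.
respondents = [
    {"id": "A", "industry": "retail", "size": 10.0, "revenue": 5.0},
    {"id": "B", "industry": "retail", "size": 48.0, "revenue": 21.0},
    {"id": "C", "industry": "mining", "size": 50.0, "revenue": 90.0},
]
donor = pick_donor(50.0, "retail", respondents)
```

In this made-up example the donor is unit B rather than unit C: C is closer in size, but it belongs to a different industry.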

Text table 3 indicates the effect of imputation on operating revenues broken down by industry grouping.

Although government business enterprises account for only 7.0% of total assets and 3.5% of total operating revenues, they have a significant presence in certain industries. For example, GBEs hold 65.9% of the assets in the utilities industry and generate 49.1% of the operating revenues in the arts, entertainment and recreation industry (see Text table 2).

Estimation

Since data is obtained from one of the three data sources for each enterprise in the population of interest, estimates are derived from the simple tabulation of data.

The combined survey results were analyzed before publication. Generally, this entailed a detailed review of the individual responses (especially for the largest enterprises), a review of general economic conditions and trends, and comparisons with other relevant sub-annual surveys.

Due to certain financial reporting constraints, data for enterprises in the insurance industry could not be obtained through the administrative data source. Data for the industry are therefore derived using QFS weighted estimates rather than a census.
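The difference between simple tabulation and a weighted estimate can be sketched as follows. For a census, every unit represents only itself, which is equivalent to a weight of 1; for a sample, each responding unit's value is multiplied by a weight before summing. The weighting scheme actually used for the insurance industry is not described here, so the weights and values in this example are purely illustrative.

```python
def estimate_total(values, weights=None):
    """Estimate a population total from unit-level values.

    With no weights, this is a simple tabulation (census case: each unit
    counts once). With weights, each value is scaled by its weight before
    summing (sample case).
    """
    if weights is None:
        weights = [1.0] * len(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values))


# Census-style tabulation: every enterprise in the population is present.
census_total = estimate_total([10.0, 20.0, 30.0])

# Weighted estimate: the second unit stands in for five similar units
# (an illustrative weight, not one taken from the survey).
weighted_total = estimate_total([10.0, 20.0], weights=[1.0, 5.0])
```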

Data accuracy

While considerable effort was made to ensure high standards throughout all collection and processing operations, the resulting estimates are inevitably subject to a certain degree of error. There are two categories of errors in statistical information: sampling errors and non-sampling errors. Non-sampling errors are the only type that applies to this program, given that no sampling process was used to produce these estimates. 1

Non-sampling errors can arise from a variety of sources and are difficult to measure. Their importance can differ according to the purpose to which the data are being put. Non-sampling errors include gaps in the information provided by corporations in their tax returns and errors in processing, such as data capture errors.

Reference period

The objective of this annual series is to cover business activity within a calendar reference period. Data derived from the Quarterly Survey of Financial Statements approximate the calendar period. The government business enterprise data reflect fiscal periods, which are often governed by the April to March fiscal year of governments. However, beginning with the 2002 reference year, the government business enterprise data have been adjusted to reflect the calendar period. The administrative data from the Canada Revenue Agency (CRA) are based on financial statements filed by corporations along with their annual income tax returns. Historically, data from fiscal periods ending at any time from January to December were included in the reference year. However, beginning with the 2004 reference year and the revised 2003 data, income tax returns for fiscal periods ending from April to March have been included in order to better represent business activity in the calendar period.
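The two rules for assigning tax-return fiscal periods to a reference year can be sketched as a small function. The text does not state which calendar year an April to March window is attached to; this sketch assumes that fiscal periods ending from April of year Y through March of year Y + 1 are assigned to reference year Y. The function and parameter names are hypothetical.

```python
from datetime import date


def reference_year(fiscal_period_end: date, april_to_march: bool = True) -> int:
    """Assign a fiscal period to a reference year based on its end date.

    april_to_march=True  (newer rule; the year mapping is an assumption):
        periods ending April of year Y through March of year Y + 1 -> Y.
    april_to_march=False (historical rule):
        periods ending January through December of year Y -> Y.
    """
    if april_to_march and fiscal_period_end.month <= 3:
        return fiscal_period_end.year - 1
    return fiscal_period_end.year


# A fiscal period ending 31 March 2005 falls in reference year 2004 under
# the newer rule, but in 2005 under the historical rule.
newer = reference_year(date(2005, 3, 31))
historical = reference_year(date(2005, 3, 31), april_to_march=False)
```

Under this assumption, the April to March window is centred near the end of the calendar year, so the fiscal periods it captures overlap the calendar reference year more closely than under the historical rule.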

Confidentiality

The confidentiality of the reported statistics is protected under the provisions of the Statistics Act. For this reason, statistics are released in aggregate form only, with no potential identification of individually reported information. The confidentiality provisions of the Statistics Act override the provisions of the Access to Information Act to guarantee the confidentiality of the reported data of individual respondents.

Limitations of the data

To be valid for either time-series or cross-sectional analysis, the definitions of data must be consistent within and across time periods. Put differently, the differences and similarities in the data must reflect only real differences and similarities, not differences in the concepts or definitions used in preparing the data.

The ability to use the data for analysis depends on the conceptual framework in which the data is being used.

These data are consistent with the Generally Accepted Accounting Principles (GAAP) of the Canadian Institute of Chartered Accountants. As such, they do not agree with, for example, the concepts of the Canadian System of National Accounts (CSNA). Even if the GAAP concepts are appropriate for the application of the data, there may still be some problems of consistency (between units or over time) for items where GAAP does not prescribe a particular treatment or allows some latitude.

One of the general problems with GAAP for some uses is that it prescribes a historical cost treatment of assets (i.e. their cost at the time of acquisition). This means that comparisons over time and across industries may not be valid for balance sheet data or for ratios derived from the balance sheet.
