Data quality, concepts and methodology: Methodology and data quality

Survey frame and sample selection

Every five years, the Census of Agriculture collects information on agricultural operations across Canada, including institutional farms, community pastures, Indian reserves, etc. The Census of Agriculture provides a list of farms and their crop areas from which a probability sample for the March Farm Survey is selected.

The target population for the March Farm Survey includes all farms in Canada enumerated in the Census of Agriculture, except those on Indian reserves and those in the Northwest Territories, Yukon, Nunavut and the Atlantic region. Institutional farms are also excluded from the target population.

Probability surveys can use two types of sampling frames: list and area. In the March Farm Survey, only the list frame is used in sample selection. This list frame is stratified into homogeneous groups on the basis of Census characteristics (such as farm size and crop area) and sub-provincial geographic boundaries. A sample of approximately 12,600 farms was drawn from the list frame for the March 2011 Farm Survey.
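As an illustration of drawing a sample from a stratified list frame, the following minimal sketch selects a simple random sample without replacement within each stratum and attaches a design weight of N_h / n_h. The frame records, the stratum definition (region by size class) and the per-stratum sample sizes are hypothetical; the actual stratification and allocation used for the March Farm Survey are not described here.

```python
import random
from collections import defaultdict


def stratified_sample(farms, stratum_key, sample_sizes, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    farms        : list of dicts describing farms on the list frame
    stratum_key  : function mapping a farm to its stratum label
    sample_sizes : dict mapping stratum label -> number of farms to draw
    Returns a list of (farm, design_weight) pairs, where the design
    weight is N_h / n_h for the farm's stratum.
    """
    rng = random.Random(seed)

    # Group the frame into strata.
    strata = defaultdict(list)
    for farm in farms:
        strata[stratum_key(farm)].append(farm)

    sample = []
    for label, members in strata.items():
        # Never ask for more farms than the stratum contains.
        n_h = min(sample_sizes.get(label, 0), len(members))
        if n_h == 0:
            continue
        weight = len(members) / n_h
        for farm in rng.sample(members, n_h):
            sample.append((farm, weight))
    return sample


# Hypothetical frame of 100 farms, stratified by region and size class.
frame = [
    {"id": i,
     "region": "A" if i < 60 else "B",
     "size": "large" if i % 3 == 0 else "small"}
    for i in range(100)
]
allocation = {("A", "large"): 10, ("A", "small"): 5,
              ("B", "large"): 7, ("B", "small"): 3}
selected = stratified_sample(frame, lambda f: (f["region"], f["size"]), allocation)
```

Because every stratum is sampled, the design weights of the selected farms add up to the number of farms on the frame.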

Data collection

The March 2011 Farm Survey was carried out from March 24 to March 31. Data were collected using the computer-assisted telephone interview (CATI) system.

Edit and imputation

With the CATI system, edit procedures can be applied at the time of the interview. Computer-programmed edit checks in the CATI system alert interviewers to possible data errors during the interview, so that the interviewer and respondent can correct them immediately. CATI significantly reduces the need for subsequent telephone follow-up, thereby reducing respondent burden and survey processing time.

Response rate

By the end of the collection period, usually 80% of the questionnaires have been fully completed. The survey refusal rate is approximately 8% to 9%. The remainder of the sample is accounted for by non-contact and other non-response. In cases of total and partial non-response, initial sample weights are adjusted by a process called "raising factor adjustment". No imputation is performed for missing values.
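The exact form of the raising factor adjustment is not given here. A common approach, shown in the sketch below purely as an assumption, is to inflate the weights of responding units within each stratum by the ratio of the total weight of all sampled units to the total weight of respondents. The field names ('stratum', 'weight', 'responded') are hypothetical, and the sketch covers total non-response only.

```python
def adjust_for_nonresponse(units):
    """Reweight respondents so they also represent non-respondents.

    units : list of dicts with keys 'stratum', 'weight', 'responded'
    Returns a new list containing respondents only, each with its weight
    multiplied by (sampled weight total / respondent weight total) for
    its stratum. This ratio is an assumed form of the raising factor.
    """
    # totals[stratum] = [weight of all sampled units, weight of respondents]
    totals = {}
    for u in units:
        t = totals.setdefault(u["stratum"], [0.0, 0.0])
        t[0] += u["weight"]
        if u["responded"]:
            t[1] += u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue
        sampled_w, respondent_w = totals[u["stratum"]]
        factor = sampled_w / respondent_w
        adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


# Hypothetical stratum: 5 sampled farms with weight 10 each, 4 respond.
sampled = [
    {"stratum": "S1", "weight": 10.0, "responded": i != 4}
    for i in range(5)
]
respondents = adjust_for_nonresponse(sampled)
```

In this example the four respondents each carry a weight of 12.5 after adjustment, so the stratum still represents 50 farms.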

Sampling and non-sampling errors

The statistics contained in this publication are based on a random sample of agricultural operations and, as such, are subject to sampling and non-sampling errors. The overall quality of the estimates depends on the combined effect of these two types of errors.

Sampling errors arise because estimates are derived from sample data and not from the entire population. These errors depend on factors such as sample size, sampling design and the method of estimation. An important feature of probability sampling is that sampling errors can be measured from the sample itself.

Non-sampling errors are not related to sampling and may occur at any stage of the survey operation. Non-response, for example, is an important source of non-sampling error. Other examples include coverage errors, differences in the interpretation of questions, incorrect information from respondents, and mistakes in recording, coding and processing the data.

Estimation

The survey data collected are weighted in order to produce unbiased level indicators which are representative of the population. These level indicators then undergo a validation process, based on subject matter analysis and consultation with provincial statisticians, before final estimates are published.

Estimates of farm stocks of grains are obtained from a survey of farm operations, but a major tool used to verify these estimates is the farm supply-disposition (or supply-demand) balance sheet. This table reflects on-farm activity only, before grain enters the commercial system. Total supply and total disposition must be equal.

The supply is composed of opening farm stocks and production. The disposition comprises deliveries, seed use, closing farm stocks, and feed, waste and dockage. The production and farm stock data are estimated from Farm Surveys conducted by Statistics Canada. Seed use data are based on average seeding rates.

A major portion of the deliveries consists of licensed grain deliveries, for which data are obtained from the Canadian Grain Commission (CGC). Statistics Canada (StatCan) adjusts these deliveries during the estimation process to account for quality problems and lags in the CGC data. The adjustments are calculated mainly from commercial supply-demand tables using data available in the CGC publication Grain Statistics Weekly. However, the deliveries published in the StatCan farm supply-disposition tables reflect the CGC published data plus StatCan estimates for unlicensed deliveries to both domestic and export markets and to condominium storage.

The feed, waste and dockage (fwd) component is a residual in the balance sheet. Indicators such as the number of grain consuming animal units, harvest conditions affecting grain quality, established ratios of dockage to delivered grain and grain inspections are used to ensure data accuracy. An unusual estimate in this component may indicate a problem with another data series such as deliveries or may show a change in feeding patterns. Farm stocks are estimated from survey indicators in conjunction with the other components of the balance sheet. Therefore, any apparent fwd anomalies are unlikely to reflect problems with the level of the farm stocks.
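The balance sheet identity described above (opening farm stocks plus production equals deliveries plus seed use plus closing farm stocks plus feed, waste and dockage) can be sketched as follows, with feed, waste and dockage computed as the residual. The function name and the figures are hypothetical and serve only to show the arithmetic.

```python
def fwd_residual(opening_stocks, production, deliveries, seed_use, closing_stocks):
    """Feed, waste and dockage as the residual of the farm balance sheet.

    Supply      = opening farm stocks + production
    Disposition = deliveries + seed use + closing farm stocks + fwd
    Since supply must equal disposition, fwd is whatever remains.
    """
    total_supply = opening_stocks + production
    return total_supply - (deliveries + seed_use + closing_stocks)


# Hypothetical figures, in thousands of tonnes.
fwd = fwd_residual(
    opening_stocks=5000,
    production=23000,
    deliveries=18000,
    seed_use=900,
    closing_stocks=4500,
)
```

With these illustrative inputs the residual is 4,600 thousand tonnes. A negative or implausibly large residual would be the kind of anomaly that prompts a review of the other series.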

National supply and disposition tables provide further information to aid in estimating farm stocks. More detailed information on supply and disposition tables can be found in the October issue of Statistics Canada catalogue 22-007-X, Cereals and Oilseeds Review.

Revisions

Stocks data are subject to revision for two years after first being published. Any revisions are published in the July 31 stocks report, which is released in September.

The following table contains some statistics which indicate the magnitude and direction of past revisions to the March 31 farm stocks estimates. The magnitude is measured by the average percent change between the preliminary and final estimates. The direction of revisions is indicated by counting the number of years that the preliminary estimate is above or below the final estimate. The data indicate, for example, that the preliminary estimates of March 31 farm stocks of wheat are revised by a magnitude of, on average, 3.6% and usually in an upward direction.
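The two revision measures described above can be computed as in the sketch below. It assumes that the percent change is taken in absolute value and relative to the preliminary estimate; the text does not specify either choice. The figures are made up for illustration and are not actual stocks data.

```python
def revision_summary(pairs):
    """Summarize revisions between preliminary and final estimates.

    pairs : list of (preliminary, final) estimates, one pair per year
    Returns the mean absolute percent change (relative to the preliminary
    estimate, an assumption) and counts of years in which the preliminary
    estimate was above or below the final estimate.
    """
    pct_changes = [100 * abs(final - prelim) / prelim for prelim, final in pairs]
    above = sum(1 for prelim, final in pairs if prelim > final)
    below = sum(1 for prelim, final in pairs if prelim < final)
    return {
        "mean_abs_pct_change": sum(pct_changes) / len(pct_changes),
        "prelim_above_final": above,
        "prelim_below_final": below,
    }


# Hypothetical (preliminary, final) estimates for three years.
summary = revision_summary([(100, 104), (200, 206), (50, 49)])
```

Here the preliminary estimate is below the final estimate in two of the three years, which corresponds to revisions that are usually upward.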

Data quality

The March 31 farm stock estimates are based on level indicators obtained from a probability survey of farming operations. The potential error introduced by sampling can be estimated from the sample itself by using a statistical measure called the "coefficient of variation" (c.v.). Over repeated surveys, 95 times out of 100, the relative difference between a sample estimate and the value that would have been obtained from an enumeration of all farming operations would be less than twice the c.v. This range of values is referred to as the "confidence interval". Although published estimates may not exactly equal the level indicators because of the validation process, they remain within the confidence interval of the survey level indicators. For the March Farm Survey, coefficients of variation range from 5% to 10% for the major crops. Coefficients of variation for specialty crops and small areas of major crops are usually within 10% to 25%.
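The rule of thumb above (a relative difference of less than twice the c.v., 95 times out of 100) translates into an interval of the estimate plus or minus 2 × c.v. × estimate. The sketch below applies that rule; the function names and the figures are hypothetical.

```python
def confidence_interval(estimate, cv):
    """Approximate 95% confidence interval from a coefficient of variation.

    estimate : survey level indicator
    cv       : coefficient of variation as a fraction (0.05 means 5%)
    Uses the twice-the-c.v. rule: estimate +/- 2 * cv * estimate.
    """
    half_width = 2 * cv * estimate
    return estimate - half_width, estimate + half_width


def within_interval(published, estimate, cv):
    """Check whether a published figure lies inside the interval."""
    low, high = confidence_interval(estimate, cv)
    return low <= published <= high


# Hypothetical level indicator of 10,000 (thousand tonnes) with a 5% c.v.
low, high = confidence_interval(10000, 0.05)
ok = within_interval(10400, 10000, 0.05)
```

With a 5% c.v. the interval spans roughly 9,000 to 11,000, so a validated estimate of 10,400 would fall inside it.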

Data confidentiality

Data confidentiality is ensured under the Statistics Act, which prohibits the divulging of individual or aggregated data where individuals or businesses might be identified.
