About Agriculture–Population linkage

An important benefit of conducting the Census of Agriculture jointly with the Census of Population is that information from the two censuses can be linked by means of an automated matching process in order to create the Agriculture–Population Linkage database. This database contains all Census of Agriculture variables and most of the variables (such as income, education, occupation, etc.) included on the long Census of Population questionnaire distributed to a random sample of 20% of all households in Canada. The Agriculture–Population Linkage database permits the cross-tabulation of socio-economic characteristics of farm operators and their families with the agricultural characteristics of farm operations (for example, the age, education and income of dairy farm operators).

Initially created for the 1971 censuses, Agriculture–Population linkage databases are available for the 1971, 1981, 1986, 1991, 1996, 2001 and 2006 censuses.

Automated matching process

The fundamentals of the Agriculture–Population automated matching process are simple. A farm operator completes a Census of Agriculture questionnaire as well as either a short or long Census of Population questionnaire, distributed to 80% and 20% of all households respectively. A unique household identifier is assigned to both the agriculture and population questionnaires when they are dropped off, and this identifier becomes the key for the match. Data from all successfully matched Census of Agriculture and long Census of Population questionnaires are linked to form the Agriculture–Population Linkage database. The 1991 to 2006 Censuses of Agriculture allowed respondents to report up to three operators per farm, and all farm operators were included in the matching process. With this additional information, the relationship between family members living in the same household and operating the same farm can be analyzed. As well, operators in different households operating the same farm can be included in the analysis.
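The matching step described above can be sketched as a join on the household identifier. This is only an illustration, not the production process: the field names (`household_id`, `form`, `farm_id`, `operator`, `farm_type`, `household_income`) and the sample records are hypothetical, and the population file is simplified to one record per household.

```python
def link_records(agriculture, population):
    """Link operator records to long-questionnaire household records.

    Both inputs are lists of dicts sharing a 'household_id' key
    (a hypothetical field name standing in for the unique identifier).
    """
    # Index only the households that completed the long questionnaire.
    long_form = {
        hh["household_id"]: hh for hh in population if hh["form"] == "long"
    }

    linked = []
    for op in agriculture:
        hh = long_form.get(op["household_id"])
        if hh is not None:
            # A successful match: combine agriculture and population variables.
            linked.append({**op, **hh})
    return linked


# Hypothetical records: farm F1 has two operators living in different households.
agriculture = [
    {"farm_id": "F1", "operator": 1, "household_id": "H1", "farm_type": "dairy"},
    {"farm_id": "F1", "operator": 2, "household_id": "H2", "farm_type": "dairy"},
    {"farm_id": "F2", "operator": 1, "household_id": "H3", "farm_type": "grain"},
]
population = [
    {"household_id": "H1", "form": "long", "household_income": 60000},
    {"household_id": "H2", "form": "short"},
    {"household_id": "H3", "form": "long", "household_income": 45000},
]

linked = link_records(agriculture, population)
```

In this sketch the operator in household H2 drops out of the linked file because that household received the short questionnaire, while operators from different households on the same farm remain identifiable through the shared `farm_id`.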

Sampling and weighting

Because every question on the short Census of Population questionnaire also appeared on the long questionnaire, Census of Population data were collected either from 100% of the population (the short-questionnaire questions) or on a sample basis (the additional long-questionnaire questions, asked of a random sample of one in five households). The Agriculture–Population Linkage database draws on the long questionnaire, so its data must also be weighted up to compensate for sampling, with the exception of data pertaining to very large agricultural operations and their operators. The data associated with these very large agricultural operations and their operators were included on the database but excluded from the weighting procedure. If the operators of these large agricultural operations received only a short Census of Population questionnaire, their data for all supplementary questions contained on the long questionnaire were imputed using the responses of similar operators included in the 20% sample of households.
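As a rough illustration of why a one-in-five sample must be weighted up, the sketch below assigns a starting weight equal to the inverse of the sampling fraction. The `very_large` flag is a hypothetical field, and giving excluded operations a weight of 1 (each record represents only itself) is an assumption about what exclusion from the weighting procedure implies. The final weights described in this section come from a more elaborate procedure.

```python
def starting_weight(record, sampling_fraction=0.2):
    """Return an illustrative starting weight for one linked record."""
    if record["very_large"]:
        # Assumption: very large operations are kept on the database
        # but represent only themselves.
        return 1.0
    # A household sampled at 20% stands in for 1 / 0.2 = 5 households.
    return 1.0 / sampling_fraction


# Hypothetical linked records: three sampled operators and one very large operation.
records = [
    {"operator_id": "A", "very_large": False},
    {"operator_id": "B", "very_large": False},
    {"operator_id": "C", "very_large": False},
    {"operator_id": "D", "very_large": True},
]

weights = [starting_weight(r) for r in records]
estimated_operators = sum(weights)
```

Summing the weights gives the estimated number of operators represented by the sample, here three sampled operators standing for five each plus one very large operation counted once.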

The weights were calculated using a method known as the Generalized Least Squares Estimation Procedure. They were calculated independently in each of 206 geographic regions across Canada, defined as "weighting areas". Each weighting area represents between 1,700 and 8,000 persons from the farm population and respects, as much as possible, the boundaries of census agricultural regions, census divisions and census consolidated subdivisions. Efforts were also made to keep weighting areas comparable between 1996, 2001 and 2006 while respecting these geographic boundaries. Characteristics referred to as "constraints" were also identified. These were agricultural and population characteristics of primary importance to data users for which data were already available on a 100% basis. In each weighting area, the Generalized Least Squares Estimation Procedure ensured that sample estimates of most of these constraints would be very close to the known population counts. The level of agreement depended on the scarcity of the constraints: constraints common to many units had high agreement, while rare constraints had lower agreement. At the Canada level, half of the constraints had discrepancies between sample estimates and population counts of less than 0.3% of the population count, 90% of the constraints had discrepancies of less than 2.1%, and all constraints had discrepancies of less than 5%. None of the largest constraints had discrepancies higher than 0.1%.
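The idea of adjusting weights so that sample estimates agree with known counts can be illustrated with textbook linear calibration, a simplified relative of generalized least squares weighting. It is not the procedure actually used for this database, whose details are not given here. The function names, the starting weights of 5, and the two constraints (total operators and dairy operators) are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def calibrate(start_weights, x, known_totals):
    """Linear calibration: w_i = d_i * (1 + x_i . lam), chosen so that the
    weighted totals of each constraint column equal known_totals."""
    p = len(known_totals)
    pairs = list(zip(start_weights, x))
    gram = [[sum(d * xi[j] * xi[k] for d, xi in pairs) for k in range(p)]
            for j in range(p)]
    shortfall = [known_totals[j] - sum(d * xi[j] for d, xi in pairs)
                 for j in range(p)]
    lam = solve(gram, shortfall)
    return [d * (1 + sum(l * v for l, v in zip(lam, xi))) for d, xi in pairs]


# Hypothetical weighting area: four sampled operators, starting weight 5 each.
# Constraint columns: (is an operator, is a dairy operator).
x = [(1, 1), (1, 0), (1, 0), (1, 1)]
start = [5.0, 5.0, 5.0, 5.0]
known = [22, 8]  # hypothetical 100% counts: 22 operators, 8 dairy operators

final = calibrate(start, x, known)
```

With only two constraints the calibrated estimates match the known counts exactly. With many constraints, some of them rare, exact agreement is generally not achievable for all of them, which is consistent with the discrepancies reported above.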

The Agriculture–Population database contains agricultural data (farm operations and farm operators) and population data (person, household, census family and economic family). For each of these components, weights have been calculated at the person level, household level, census family level and economic family level.
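A weighted cross-tabulation of the kind this database supports can be sketched as follows. The records, field names (`farm_type`, `education`, `person_weight`) and weight values are hypothetical. The point is only that each cell is a sum of weights at the level being tabulated (here, persons), not a count of records.

```python
from collections import defaultdict


def weighted_crosstab(records, row_key, col_key, weight_key):
    """Sum the chosen weight over each (row, column) combination."""
    table = defaultdict(float)
    for rec in records:
        table[(rec[row_key], rec[col_key])] += rec[weight_key]
    return dict(table)


# Hypothetical linked operator records carrying a person-level weight.
operators = [
    {"farm_type": "dairy", "education": "university", "person_weight": 4.0},
    {"farm_type": "dairy", "education": "high school", "person_weight": 5.5},
    {"farm_type": "grain", "education": "university", "person_weight": 6.0},
    {"farm_type": "dairy", "education": "university", "person_weight": 5.0},
]

table = weighted_crosstab(operators, "farm_type", "education", "person_weight")
```

Tabulating households or families instead of operators would use the corresponding household-level or family-level weight in place of `person_weight`.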

For any given geographic area, the weighted population, household, family or farm totals or subtotals may differ from those shown in previous releases containing Census of Agriculture data collected on a 100% basis. Such differences are due to sampling. The discrepancies for the variables used to define the constraints in the generalized least squares weighting were described above. For any variable highly correlated with at least one of the variables used to define a constraint, the discrepancy will be similar to that of the constraint. For other variables, the discrepancy will depend on the strength of the relationship with the variables used to define the constraints, and could be large if no relationship exists.
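When comparing a weighted total with a previously released 100% count, the discrepancy can be expressed as a percentage of the population count, as in the figures quoted above. The function name and the numbers in this sketch are hypothetical.

```python
def relative_discrepancy(sample_estimate, population_count):
    """Absolute discrepancy as a percentage of the known population count."""
    return abs(sample_estimate - population_count) / population_count * 100


# Hypothetical example: a weighted estimate of 10,030 farms
# against a 100% count of 10,000 farms.
discrepancy_pct = relative_discrepancy(10030, 10000)
```

In this made-up case the discrepancy is about 0.3% of the population count.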