Data quality

Spatial data quality elements provide information on the fitness-for-use of a spatial database by describing why, when and how the data are created, and how accurate the data are. The elements include an overview describing the purpose and usage, as well as specific quality elements reporting on the lineage, positional accuracy, attribute accuracy, logical consistency and completeness. This information is provided to users for all spatial data products disseminated for the census.

Lineage

Lineage describes the history of the spatial data, including descriptions of the source material from which the data were derived, and the methods of derivation. It also contains the dates of the source material, and all transformations involved in producing the final digital files.

The geographic area boundaries, names, codes, and the relationships among the various geographic levels are found on Statistics Canada's Spatial Data Infrastructure. The data for administrative areas are updated using information from provincial and territorial sources. The data for statistical areas are updated using the results of the previous census and input from users.

The province/territory boundary layer, including the inland water file, was identical to the one used in 2001. It was derived from the generalized census division layer by aggregating the census divisions that share a province/territory identifier. This spatial file was then linked to the attribute data from the Query Base.

The census division boundary layer was created by clipping the 2006 census division boundary file (derived from the National Geographic Base) with the dissolved, generalized province/territory boundary file described above and the inland water file also used in 2001. Slivers were removed from the resulting layer, which was then linked with attribute data from the Query Base.

The base ecumene layer was created by integrating those 2006 dissemination areas (DAs) selected for their significant agricultural activity in nine provinces with the DA components selected in Newfoundland and Labrador (see the General Methodology subsection for DA selection criteria). Every DA or DA component polygon was classified as either being an ecumene DA (meeting the agricultural activity criteria) or not being an ecumene DA.

The base ecumene layer was divided into three component layers: main ecumene, other ecumene pockets (outside the main ecumene) and non-ecumene pockets (within the main ecumene). Different generalization criteria were applied to each of these component layers in order to create a product for small-scale mapping. The final criteria were determined after testing and mapping several options.

External ecumene pockets under 2,000 hectares were removed and those of 2,000 or more hectares were enlarged to increase their visibility on small-scale maps. Neighbouring pockets were then grouped together to create larger and even more visible ecumene pockets.

Internal non-ecumene pockets under 15,000 hectares were removed and those of 15,000 or more hectares were generalized, but not enlarged.
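
The two pocket rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not the procedure used to build the product. The function name, the pocket type labels and the action labels are hypothetical; only the 2,000 and 15,000 hectare thresholds come from the text.

```python
# Thresholds taken from the text; all names and labels are illustrative.
EXTERNAL_ECUMENE_MIN_HA = 2_000
INTERNAL_NON_ECUMENE_MIN_HA = 15_000


def pocket_action(pocket_type: str, area_ha: float) -> str:
    """Return the generalization action for one pocket.

    pocket_type is a hypothetical label: "external_ecumene" for ecumene
    pockets outside the main ecumene, "internal_non_ecumene" for
    non-ecumene pockets within the main ecumene.
    """
    if pocket_type == "external_ecumene":
        # Under 2,000 ha: removed. 2,000 ha or more: enlarged.
        return "remove" if area_ha < EXTERNAL_ECUMENE_MIN_HA else "enlarge"
    if pocket_type == "internal_non_ecumene":
        # Under 15,000 ha: removed. 15,000 ha or more: generalized only.
        if area_ha < INTERNAL_NON_ECUMENE_MIN_HA:
            return "remove"
        return "generalize"
    raise ValueError(f"unknown pocket type: {pocket_type!r}")
```

The sketch covers only the size test. The grouping of neighbouring pockets and the manual generalization of the main ecumene are not modelled.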

After the internal non-ecumene and external ecumene pockets were dealt with, the main ecumene was generalized manually for small-scale display. These three component layers and the separately treated Newfoundland and Labrador polygons were then reintegrated to produce a generalized layer.

The 2006 census division boundary file developed above was intersected with the resulting generalized ecumene, keeping the census division unique identifier as the basic attribute. In the ecumene/CD file, polygons outside the ecumene were coded as either LAND or WATER.

The inland water file was created by selecting water features from the National Geographic Database's hydrographic reference layers. These reference data were sourced from the National Topographic Data Base (1:50,000 and 1:250,000) and the Digital Chart of the World (1:1,000,000). Each feature was assigned a rank based on its size and/or cultural importance, with the largest and most important features receiving the lowest rank values. Only the features with the lowest rank value are included with the ecumene boundary files.
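
The rank-based selection can be illustrated with a short Python sketch. It assumes that each water feature carries an integer rank where a smaller value means a larger or more important feature, as the text describes. The `WaterFeature` type, the field names and the sample records are hypothetical placeholders, not actual contents of the hydrographic layers.

```python
from typing import NamedTuple


class WaterFeature(NamedTuple):
    name: str  # placeholder label, not a real feature name
    rank: int  # smaller value = larger or more important feature


def select_lowest_rank(features: list[WaterFeature]) -> list[WaterFeature]:
    """Keep only the features whose rank equals the smallest rank present."""
    if not features:
        return []
    lowest = min(f.rank for f in features)
    return [f for f in features if f.rank == lowest]


# Made-up sample records for illustration.
features = [
    WaterFeature("feature A", 1),
    WaterFeature("feature B", 3),
    WaterFeature("feature C", 1),
    WaterFeature("feature D", 2),
]
selected = select_lowest_rank(features)
```

With these sample records, `selected` holds features A and C, the two records that share the smallest rank value.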

Positional accuracy

Positional accuracy refers to the absolute and relative accuracy of the positions of geographic features. Absolute accuracy is the closeness of the coordinate values in a dataset to values accepted as or being true. Relative accuracy is the closeness of the relative positions of features to their respective relative positions accepted as or being true. Descriptions of positional accuracy include the quality of the final file or product after all transformations.

The boundaries are derived from the Spatial Data Infrastructure. The data in the Spatial Data Infrastructure are stored in double precision. This precision allows features that are next to each other on the ground to be placed in the correct position on the map, relative to each other, without overlap. However, the absolute positional accuracy of the features in the database varies depending on the source of the features.

The Spatial Data Infrastructure is not Global Positioning System (GPS)-compliant. However, every possible attempt is made to ensure that the geographic area boundaries maintained in the Spatial Data Infrastructure respect the limits of the administrative entities that they represent (e.g., census division and census subdivision) or on which they are based (e.g., census metropolitan area or census agglomeration). The positional accuracy of these limits depends on the source materials used by Statistics Canada to identify their location. In addition, because of the importance placed on relative positional accuracy, the positional accuracy of other geographic data stored within the Spatial Data Infrastructure (e.g., road network data and hydrographic data) is considered when positioning the limits of the geographic areas.

While the boundaries were originally derived from the National Geographic Base, they have been greatly generalized (particularly on the shorelines and the boundary of Canada) and are not positionally consistent with data on the base.

Attribute accuracy

Attribute accuracy refers to the accuracy of the quantitative and qualitative information attached to each feature (such as population for an urban area, street name, census subdivision name and code).

As noted under Lineage, the attributes (names, types and codes) for all geographic areas displayed on the maps are sourced from the Spatial Data Infrastructure. The names and types for administrative geographic areas have been updated from the 2001 Census using source materials from provincial and territorial authorities.

The attribute data associated with the polygons in the boundary files were independently verified against the data in the Spatial Data Infrastructure and found to be accurate.

Logical consistency

Logical consistency describes the fidelity of relationships encoded in the data structure of the digital spatial data.

In each boundary file, all geographic areas have been verified to have a unique identifier that is valid for the 2006 Census.
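
A check of this kind can be sketched in a few lines of Python. The sketch assumes that the identifiers of a boundary file are available as a list of strings and that a reference set of valid 2006 identifiers exists. The function name and the sample codes are made up for illustration and do not describe the verification tool actually used.

```python
from collections import Counter


def check_identifiers(ids, valid_ids):
    """Return (duplicates, invalid) for a list of area identifiers.

    duplicates: identifiers that appear more than once in the file.
    invalid: identifiers that are absent from the reference set.
    """
    counts = Counter(ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    invalid = sorted(i for i in counts if i not in valid_ids)
    return duplicates, invalid


# Made-up codes, for illustration only.
reference = {"0001", "0002", "0003"}
duplicates, invalid = check_identifiers(
    ["0001", "0002", "0002", "9999"], reference
)
```

A file passes the check when both returned lists are empty. In this made-up example, "0002" is reported as a duplicate and "9999" as invalid.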

Boundaries found in this product are consistent with those found in other spatial products produced as part of the suite of 2006 Census products.

The hydrographic data files were specially created for the boundary files to enable thematic mapping at a national scale.

The land area for geographic areas present in GeoSuite may not be consistent with that computed from the cartographic boundary files. This is because the water features used in the creation of the cartographic boundary files are based on a set of hydrographic features that was created for thematic mapping.

Topological consistency

Topological consistency describes the correctness of the explicitly encoded topological characteristics of a dataset.

This product was checked to ensure that the polygons were consistent with the geographic units being represented. Very small polygons and slivers (resulting from the integration of different layers of information) were removed.
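
One common way to flag very small polygons and slivers is to test each polygon's area and its compactness. The Python sketch below illustrates that general technique under stated assumptions: coordinates are planar, each polygon is a simple ring without a repeated closing vertex, and both thresholds are hypothetical. The text does not state which criteria were applied to this product.

```python
import math


def ring_area_and_perimeter(ring):
    """Shoelace area and perimeter of a simple polygon ring.

    ring is a list of (x, y) vertices in planar units, without a
    repeated closing vertex.
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def is_sliver(ring, min_area, min_compactness):
    """Flag polygons that are very small or very thin.

    Compactness is 4*pi*area / perimeter**2: 1.0 for a circle and close
    to 0 for a long, thin shape. Both thresholds are hypothetical.
    """
    area, perimeter = ring_area_and_perimeter(ring)
    if perimeter == 0.0:
        return True  # degenerate ring with no extent
    compactness = 4.0 * math.pi * area / perimeter ** 2
    return area < min_area or compactness < min_compactness


square = [(0, 0), (100, 0), (100, 100), (0, 100)]
thin_strip = [(0, 0), (1000, 0), (1000, 0.5), (0, 0.5)]
tiny = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

With `min_area=100` and `min_compactness=0.05`, the square is kept, the thin strip is flagged for its shape and the tiny polygon is flagged for its area.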

Consistency with other products

Due to extensive generalization, the boundaries in the various files of this product are not consistent with the 2006 Census Cartographic Boundary Files, the 2006 Census Road Network File or the 2006 Census Road Network and Geographic Attribute File.

Completeness

Completeness refers to the degree to which geographic features, their attributes and their relationships are included or omitted in a dataset. It also includes information on selection criteria, definitions used, and other relevant mapping rules.

Each boundary file contains the complete set of geographic areas for that level of the geographic hierarchy.

In the agricultural ecumene boundary layer, at least one ecumene pocket exists for 246 of the 288 census divisions in Canada. Of the remaining 42 census divisions, only 4 had no farms in the 2006 Census of Agriculture. Each ecumene polygon has the following two attributes: a census division unique identifier code (CDuid) and a Census of Agriculture standard geographic area unique identifier code (AGuid).
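
The counts above can be cross-checked with simple arithmetic. The Python sketch below uses only the three figures given in the text; the variable names are illustrative, and the derived values (census divisions with farms but no ecumene pocket, and the coverage percentage) are computed here rather than quoted from the source.

```python
TOTAL_CDS = 288        # census divisions in Canada (from the text)
CDS_WITH_POCKET = 246  # at least one ecumene pocket (from the text)
CDS_WITHOUT_FARMS = 4  # no farms in the 2006 Census of Agriculture

# Census divisions with no ecumene pocket at all.
cds_without_pocket = TOTAL_CDS - CDS_WITH_POCKET

# Of those, the census divisions that did report farms.
cds_with_farms_but_no_pocket = cds_without_pocket - CDS_WITHOUT_FARMS

# Share of census divisions that have at least one ecumene pocket.
coverage_pct = round(100 * CDS_WITH_POCKET / TOTAL_CDS, 1)
```

The difference reproduces the 42 census divisions mentioned in the text, and it follows that 38 census divisions reported farms but have no ecumene pocket in the generalized layer.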