1 Data, statistical information and statistics
1.3 Data quality

Generally speaking, statistical information is evaluated in terms of its “fitness for use” - that is, the extent to which the statistical information can be relied upon to fulfill the user’s information needs. At Statistics Canada, fitness for use is considered along six quality dimensions.

Quality dimensions and examples of questions to ask

Relevance

Does the statistical information matter to Canadians?

  • Does it fill a data gap?
  • Is it useful in building policies?
  • Does it aid in long-term planning?
  • Can it promote new initiatives?

Accessibility

Can users access the statistical information?

  • Is it easy to access?
  • Is it affordable?
  • Is it organized and easy to locate?
  • Can users who encounter difficulties in accessing information request assistance?

Accuracy

Is the statistical information representative of the targeted measurement?

  • Does it cover the required population and period of reference?
  • Are there known sources of under-coverage?
  • Are methods transparent?
  • Was the information produced without external influence?

Timeliness

Is the lag between the period of reference and the availability of the statistical information acceptable?

  • Is the statistical information available when it is the most needed?
  • Are you willing to accept lower accuracy to get the data faster?
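
As a rough illustration only, the lag described above can be measured as the number of days between the end of the reference period and the release date, then compared against what the user considers acceptable. The function name, the dates, and the 90-day tolerance in this sketch are hypothetical; they are not Statistics Canada values.

```python
from datetime import date


def release_lag_days(reference_period_end: date, release_date: date) -> int:
    """Return the number of days between the end of the reference period and the release."""
    return (release_date - reference_period_end).days


# Hypothetical example: a quarter ending 31 March 2023, released 30 May 2023.
lag = release_lag_days(date(2023, 3, 31), date(2023, 5, 30))

# Assumed user tolerance in days; each user sets this according to their own needs.
max_acceptable_lag = 90

print(f"Lag: {lag} days; acceptable: {lag <= max_acceptable_lag}")
```

Whether a given lag is acceptable depends on the user: a 60-day lag may suit long-term planning but not a decision that must be made within weeks.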

Interpretability

Metadata are the information that provides context to data.

  • Are metadata available and complete?
  • Are they useful?
  • Are they reliable?
  • Are they available at the same time as the statistical information to which they pertain?

Coherence

Is the statistical information consistent over time, between regions and across sub-populations?

  • Does it use standard concepts and classifications?
  • Was it produced using methods that are common to other statistical products?
  • Is it comparable to previously released statistical information?

Often, a compromise is necessary between quality dimensions. For example, the need for timeliness can impact accuracy: publishing statistical information quickly reduces the time available for ensuring the accuracy of the information.

Besides data quality, it is also important to consider the ethics of data that are collected and of processes used to produce the statistical information. Ethically sourced data are collected in a transparent manner and used in a meaningful way that will not cause harm to the respondents.

