Statistics Canada

Reliability of the Data

All the possible sources of error are examined below.

Definitions are taken from “A compendium of methods of error evaluation in censuses and surveys,” Statistics Canada, Catalogue no. 13-564E.

Coverage

“Coverage errors are introduced whenever the sampling frame...does not adequately represent the target population at the time of the survey.”1

Coverage is a minor source of error. The survey covers all known and suspected large R&D performers and funders, i.e., those believed to have R&D expenditures of at least $1,000,000.

Administrative data are used for the small R&D performers or funders. Because companies have up to 18 months after their fiscal year end to claim a tax credit for their R&D expenditures, the administrative data for recent years are incomplete. Underreporting due to this time lag is estimated to be less than 8% and is largely corrected by imputing estimates, based on industry trends, for all known performers that have not yet submitted a claim.
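The correction for late claims can be illustrated with a minimal sketch. The source does not give the imputation formula, so everything below is an assumption: the function name `impute_late_claims`, the record fields `industry`, `prev` and `curr`, and the choice of an industry growth ratio (current-year total divided by prior-year total among companies that have already claimed) as the "industry trend".

```python
from collections import defaultdict

def impute_late_claims(records):
    """Fill in missing current-year expenditures from an industry trend.

    records: list of dicts with hypothetical keys
      'industry' - industry label
      'prev'     - prior-year R&D expenditure
      'curr'     - current-year expenditure, or None if no claim yet
    """
    prev_total = defaultdict(float)
    curr_total = defaultdict(float)

    # Build the trend only from companies that have already claimed.
    for r in records:
        if r["curr"] is not None:
            prev_total[r["industry"]] += r["prev"]
            curr_total[r["industry"]] += r["curr"]

    result = []
    for r in records:
        if r["curr"] is None:
            base = prev_total[r["industry"]]
            # With no claimants in the industry, assume no change (ratio 1.0).
            trend = curr_total[r["industry"]] / base if base else 1.0
            result.append({**r, "curr": r["prev"] * trend, "imputed": True})
        else:
            result.append({**r, "imputed": False})
    return result
```

Flagging each imputed record keeps the imputed share of the total traceable when the late claims eventually arrive and replace the estimates.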

Response

“A response error occurs whenever a characteristic is misreported in a census or a survey.”1

As a result of a reconciliation of federal and industrial accounts of government grants and contracts, we think that industrial R&D performance estimates may be slightly low. This is caused by the non-reporting of industrial R&D funded by contract. Such work is sometimes not distinguishable from non-R&D contract work.

The accuracy of companies’ estimates of future expenditures has also been a problem in the past, particularly in the wells and petroleum products industries.

Non-response

“Non-response occurs when information required for a survey unit is missing. This could happen because the unit cannot be contacted, because the unit is unable to provide the information requested, or because the unit refuses to cooperate in the survey.”1

Non-response is a potential problem in four areas. The first is the estimate of R&D expenditures two years past the base year. If the respondent provides no estimate, editors make one, usually based on the expenditure of the preceding year or a slight increase over it.

The second involves the administrative data used for the smaller R&D performers. These represent 10% of all R&D performed by businesses. Certain information is not asked of them. However, the missing data are imputed from the replies of the larger performers in the same industry.

The third concerns companies inadvertently not included in the survey. Because a number of sources are used to create the mailing lists, it is unlikely that major performers would be overlooked.

Failure of surveyed companies to reply is the fourth type of non-response. We believe non-response error to be minor; it may result in a slight under-estimation of R&D expenditures.

Coding

“A coding operation in a survey or census is defined as the operation where data on questionnaires or source documents are transformed into a format which is suitable for input to the data capture operation. This often involves the assignment of codes for ‘write-in’ entries but may also be a fairly straightforward transcription operation.”1

Uncorrected coding errors are unlikely because of the examination of numerous tables and listings prepared for data analysis before publication tables are created.

Data capture

“The data capture operation in a census or survey consists of converting the data received on questionnaires (e.g., respondent answers) to a machine readable format.”1

All data capture for science statistics is manual: key-edit or typed entry at a computer terminal.

Significant uncorrected data capture errors are unlikely because of the examination of numerous tables and listings prepared for data analysis before publication tables are created.

Edit and imputation

“The edit procedure usually consists of:  (i) checking each field of every record to ascertain whether it contains a valid code or entry; (ii) checking codes or entries in certain predetermined combinations of fields to ascertain whether codes or entries are consistent with one another... The imputation procedure consists of changing values in some of the fields in records which failed the edit rules with a view to ensuring that the resultant data records satisfy all edit rules.”1

Although a number of edits are applied, every failed edit check is corrected after consideration by editors. Automatic imputations are made only for the smaller R&D performers and funders.
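The two kinds of edit named in the definition, field validity checks and cross-field consistency checks, can be sketched as follows. This is only an illustration: the field names (`industry`, `current`, `capital`, `total`), the code list `VALID_INDUSTRIES`, the rule that the total equals the sum of its parts, and the tolerance are all hypothetical and are not taken from the survey's actual edit rules.

```python
# Hypothetical list of acceptable industry codes.
VALID_INDUSTRIES = {"A", "B", "C"}

def run_edits(record, tolerance=0.5):
    """Return a list of edit failures for one record (empty if it passes)."""
    failures = []

    # (i) Validity: each field must hold an acceptable code or entry.
    if record.get("industry") not in VALID_INDUSTRIES:
        failures.append("industry: invalid code")
    for field in ("current", "capital", "total"):
        value = record.get(field)
        if not isinstance(value, (int, float)) or value < 0:
            failures.append(f"{field}: missing or negative")

    # (ii) Consistency: checked only when every field is individually valid.
    if not failures:
        gap = abs(record["current"] + record["capital"] - record["total"])
        if gap > tolerance:
            failures.append("total: does not equal current + capital")

    return failures
```

The function only reports failures and changes nothing itself, which mirrors the practice described above of leaving the correction to an editor.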

Sampling

“Sampling error occurs whenever survey results are based on a sample of units from a survey frame... Obviously there is no sampling error in complete enumeration surveys.”1

Although a complete enumeration is carried out of known and suspected R&D performers and funders, records received from the administrative data do not provide as much information as those from respondents completing the long form. Certain data are imputed for records from the administrative file based on the patterns of long form respondents in the same industry. Thus, as a result of the 2004 survey, the 2004 business enterprise sector R&D expenditures would be based on full enumeration, but about 10% of the expenditures for 2005 and 2006 would have been imputed.
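One way to impute detail for an administrative record from the patterns of long form respondents is to split its total in the same proportions as the long form totals for its industry. The sketch below assumes that approach; the source does not specify the method, and the function names, record layout and category labels are hypothetical.

```python
def industry_shares(long_form):
    """Compute each category's share of expenditure, by industry.

    long_form: list of dicts such as
      {'industry': 'X', 'breakdown': {'current': 60.0, 'capital': 40.0}}
    """
    totals = {}
    for rec in long_form:
        cat_totals = totals.setdefault(rec["industry"], {})
        for category, amount in rec["breakdown"].items():
            cat_totals[category] = cat_totals.get(category, 0.0) + amount

    shares = {}
    for industry, cat_totals in totals.items():
        grand_total = sum(cat_totals.values())
        if grand_total > 0:
            shares[industry] = {c: a / grand_total for c, a in cat_totals.items()}
    return shares

def impute_breakdown(total, industry, shares):
    """Split an administrative record's total using its industry's shares.

    Returns None when no long form respondents exist for the industry,
    so the case can be referred to an editor instead of being guessed.
    """
    if industry not in shares:
        return None
    return {c: total * share for c, share in shares[industry].items()}
```

Because the shares for an industry sum to one, the imputed categories add back to the reported total, so the imputation changes the detail but not the overall expenditure figure.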


Note

  1. “A compendium of methods of error evaluation in censuses and surveys,” Statistics Canada, Statistical Services Field, November 1978, Catalogue no. 13-564E.