Scope and purpose
Administrative records are data collected for the purpose of
carrying out various non-statistical programs. For example, administrative
records are maintained to regulate the flow of goods and people across
borders, to respond to the legal requirements of registering particular
events such as births and deaths, and to administer benefits such as pensions
or obligations like taxation. As such, the records are collected with
a specific decision-taking purpose in mind, and so the identity of the
unit corresponding to a given record is crucial. In contrast, in the case
of statistical records, on the basis of which no action concerning
an individual is intended or even allowed, the identity of individuals
is of no interest once the database has been finalized.
Administrative records present a number of advantages to a statistical
agency and to analysts. Demands for statistics on all aspects of our lives,
our society and our economy continue to grow. These demands often occur
in a climate of tight budgetary constraints. Statistical agencies also
share with many respondents a growing concern over the mounting burden
of response to surveys. Respondents may also react negatively if they
feel they have already provided similar information (e.g., revenue) to
administrative programs and surveys. Administrative records, because they
already exist, do not require the cost of direct data collection nor do
they impose a further burden on respondents. It is important to note that
the explosion of technology has also permitted statistical agencies to
overcome the limitations caused by the processing of large datasets. For
all these reasons, administrative records are becoming increasingly usable
and are being used for statistical purposes.
Statistical uses of administrative records include (i) use for survey
frames, directly as the frame or to supplement an existing frame, (ii)
replacement of data collection (e.g., use of taxation data for small businesses
in lieu of seeking survey data for them), (iii) use in editing and imputation,
(iv) direct tabulation, (v) indirect use in estimation (e.g., as auxiliary
information in calibration estimation, benchmarking or calendarisation),
and (vi) survey evaluation, including data confrontation (e.g.,
comparison of survey estimates with estimates from a related administrative
program).
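As an illustration of use (v), the following minimal sketch applies a simple ratio adjustment, one of the most basic ways of using an administrative total as auxiliary information in estimation. The function name `ratio_estimate`, the variable names and all figures are hypothetical and chosen only for illustration; a production calibration system would handle several auxiliary variables and variance estimation.

```python
def ratio_estimate(sample, admin_total_x):
    """Estimate the population total of y using a known administrative total of x.

    sample is an iterable of (weight, x, y) tuples, where x is the value found
    on the administrative file and y is the value collected by the survey.
    """
    weighted_x = sum(w * x for w, x, _ in sample)
    weighted_y = sum(w * y for w, _, y in sample)
    if weighted_x == 0:
        raise ValueError("weighted auxiliary total is zero; ratio is undefined")
    # Scale the administrative total by the ratio observed in the sample.
    return admin_total_x * weighted_y / weighted_x


# Made-up data: three sampled businesses, each with a design weight of 10.
sample = [(10.0, 100.0, 20.0), (10.0, 200.0, 50.0), (10.0, 300.0, 50.0)]
estimate = ratio_estimate(sample, admin_total_x=6600.0)
```

The adjustment only helps to the extent that the administrative variable is well correlated with the survey variable and is defined on the same population.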
Principles
It is Statistics Canada's policy to use administrative records whenever
they present a cost-effective alternative to direct data collection. As
with any data acquisition program, consideration of the use of administrative
records for statistical purposes is a matter of balancing the costs and
benefits. Administrative records start with a huge advantage: they avoid
further data collection costs and respondent burden, provided the coverage
and the conceptual framework of the administrative data are compatible
with the target population. Depending on the use, it is often valuable
to combine an administrative source with another source of information.
The use of administrative records may raise concerns about the privacy
of the information in the public domain. These concerns are even more
important when the administrative records are linked to other sources
of data. The Policy
on Informing Survey Respondents (Statistics Canada, 1998a) requires
that Statistics Canada provide all respondents with information such
as the purpose of the survey, the confidentiality protection, the record
linkage plans and the identity of the parties to any agreements to share
the information provided by those respondents. Record linkage must be
in compliance with the Agency's Policy
on Record Linkage (Statistics Canada, 1996a). In particular, all requests
for record linkage must be submitted to the Confidentiality and Legislation
Committee and approved by the Policy Committee.
The use of administrative data may require the statistical agency to
implement only some, usually just a few, of the survey steps discussed
in previous sections. This is because many of the survey steps (e.g.,
direct collection and data capture) are performed by the administrative
organization. As a result, guidelines beyond those previously presented
are required to suggest ways to compensate for any differences in the
quality goals of the source organization (e.g., to compensate for the outgoing
quality from the data capture, which is often uncontrolled).
One must keep in mind the fundamental reason for the existence of these
administrative records: they are the result of an administrative program
that was put in place for administrative reasons. Often the statistical
uses of these records were unknown when the program was implemented and
the statistical agency invariably has limited impact on the development of
the program. For that reason, any decisions related to the use of administrative
records must be preceded by an assessment of such records in terms of
their coverage, content, concepts and definitions, the quality assurance
and control procedures put in place by the administrative program to ensure
their quality, the frequency of the data, the timeliness in receiving
the data by the statistical agency and the stability of the program over
time. Obviously, the cost of obtaining the administrative records is also
a key factor in the decision whether to use such records.
Guidelines
- Many of the guidelines in earlier sections are applicable to administrative
records. Sampling and data capture guidelines (see sections on Sampling
and on Data collection and capture operations)
will be relevant if administrative records exist only on paper and have
to be coded and captured. These guidelines will also be of value for
administrative data available in electronic form, including electronic data
interchange (EDI) and electronic data reporting (EDR). Note that these data,
because they exist in electronic form, may
be inherently less stable and subject to additional errors arising from
data treatment and transmission processes at source. Editing and dissemination
guidelines (see sections on Editing and on
Data dissemination) apply to all cases
where a file of individual administrative records is obtained or created
for subsequent processing and analysis.
- Consider privacy implications of the publication of information from
administrative records. Although the Statistics Act provides Statistics
Canada with the authority to access administrative records for statistical
purposes, this use may not have been foreseen by the original suppliers
of information (Statistics Canada, 1970). Therefore, programs should
be prepared to explain and justify the public value and innocuous nature
of this secondary use.
- Collaborate with the designers of new or redesigned administrative
systems. This can help in building statistical requirements into administrative
systems from the start. Such opportunities are rare, but when they happen,
the eventual statistical value of the statistical agency’s participation
can far exceed the time expended on the exercise.
- Maintain continuing liaison with the provider of administrative records.
Liaison with the provider is necessary at the beginning of the use of
administrative records. However, it is even more important to keep in
close contact with the supplier at all times so that the statistical
agency is not surprised by any impending changes, and can even influence
them. Providing the supplier with statistical information and with feedback
on weaknesses found in the data can be of value to the supplier, leading to
a strengthening of the administrative source.
- Understand the context under which the administrative organization
created the administrative program (e.g., legislation, objectives, and
needs). It has a profound impact on (i) the universe covered, (ii) the
contents, (iii) the concepts and definitions used, (iv) the frequency
and timeliness, (v) the quality of the recorded information, and (vi)
the stability over time.
- Study each data item in the administrative records that are planned
to be used for statistical purposes. Investigate its quality. Understand
the concepts, definitions and procedures underlying its collection and
processing by the administrative organization. Some of the items might
be of very poor quality and thus might not be fit for use. For example,
the quality of classification coding (e.g., occupation, industrial activity,
geography) might not be sufficient for some statistical uses or might
limit its use.
- Like data collected by means of a survey, administrative data are
also subject to partial and total nonresponse. In some instances, the
lack of timeliness in obtaining all administrative data introduces greater
nonresponse. Some guidelines provided in the section on Response
and nonresponse will thus apply. Unless nonrespondents can be followed
up and responses obtained, develop an imputation or a weight-adjustment
procedure to deal with this nonresponse (see sections on Imputation
and on Estimation). Administrative sources
are sometimes outdated. Therefore, as part of the imputation process,
give special attention to the identification of active and/or inactive
units. Some imputation or transformation may also be required in cases
where some of the units report the data at a different frequency (e.g.,
weekly or quarterly) than the one desired (e.g., monthly).
- Keep in mind that if the information that individuals or businesses provide
to the administrative source can cause them gains or losses, there
may be biases in the information supplied. Special studies may be needed
in order to assess and understand these sources of error.
- Document the nature and quality of the administrative data once assessed.
Documentation helps statisticians decide the uses to which the administrative
data are best suited. Choose appropriate methodologies for the statistical
program based on administrative data and inform users of the methodology
and data quality.
- Keep in mind that the longevity of the source of administrative data
and its continued scope is usually entirely in the hands of the administrative
organization. The administrative considerations that originally dictated
the concepts, definitions, coverage, frequency, timeliness and other
attributes of the administrative program may, over time, undergo changes
that distort time series derived from the administrative source. Be
aware of such changes, and deal with their impact on the statistical
program.
- Implement continuous or periodic assessment of incoming data quality.
Assurance that data quality is being maintained is important because
the statistical agency does not control the data collection process.
This assessment may consist of implementing additional safeguards and
controls (e.g., the use of statistical quality control methods and procedures,
edit rules) when receiving the data, comparisons with other sources
or sample follow-up studies.
- When record linkage of administrative records is necessary (e.g.,
for tracing respondents, for supplementing survey data, or for data
analysis), conform to the Agency's Policy
on Record Linkage. Privacy concerns that may arise when a single
administrative record source is used are multiplied when linkage is
made to other sources. In such cases, the subjects may not be aware
that information supplied on two separate occasions is being combined.
The Policy on Record Linkage is designed to ensure that the public value
of each record linkage truly outweighs any intrusion on privacy that
it represents.
- It is not always easy to combine an administrative source with another
source of information. This is especially true when a common matching
key for both sources is not available and record linkage techniques
are used. In this case, select the type of linkage methodology (i.e.,
exact matching or statistical matching) in accordance with the objectives
of the statistical program. When the purpose is frame creation and maintenance,
edit and imputation or weighting, exact matching is appropriate. When
the sources are linked for performing some data analyses that are impossible
otherwise, consider statistical matching, i.e., matching of records
with similar statistical properties (see Cox and Boruch, 1988; Scheuren
and Winkler, 1993; Kovacevic, 1999).
- When record linkage is to be performed, make appropriate use of existing
software. Statistics Canada’s Generalized Record Linkage Software
is but one example of a number of well-documented packages.
- When data from more than one administrative source are combined, pay
additional attention to reconcile potential differences in their concepts,
definitions, reference dates, coverage, and the data quality standards
applied at each data source. Examples are education data sources, health
and crime reports, and registries of births, marriages, licenses, and
registered vehicles, which are provided by various organizations and
government agencies.
- Some administrative data are longitudinal in nature (e.g., income
tax, goods and services tax). When records from different reference
periods are linked, they are very rich data mines for researchers. Remain
especially vigilant when creating such longitudinal and person-oriented
databases, as their use raises very serious privacy concerns. Use the
identifier with care, as a unit may change identifiers over time. Track
down such changes to ensure proper temporal data analysis. In some instances
the same unit may have two or more identifiers for the same reference
period, thus introducing duplication in the administrative file. If
this occurs, develop an unduplication mechanism.
- Administrative information is sometimes used to replace a set of questions
that would otherwise be asked of the respondent. In this instance, permission
from the respondent may have to be obtained. Follow the Policy
on Informing Survey Respondents in this regard. When consent is
not obtained, put collection procedures in place for the equivalent
survey questions to be asked of the respondents.
- Administrative files are often very large and their use can sometimes
lead to significant processing costs and timeliness issues. Depending
on the need, make use of a random sample from large administrative files
to reduce costs.
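Two of the guidelines above, removing duplicate units and drawing a random sample from a large file, can be sketched together as follows. This is a minimal illustration under stated assumptions, not a description of any Statistics Canada system: the record layout (`id`, `period`, `revenue`), the `crosswalk` of alternate identifiers and the rule of keeping the first record seen are all hypothetical.

```python
import random


def unduplicate(records, crosswalk):
    """Keep one record per unit and reference period.

    crosswalk maps an alternate identifier to the unit's canonical
    identifier; identifiers absent from the crosswalk are taken as canonical.
    Keeping the first record seen is an arbitrary rule for this sketch.
    """
    seen = set()
    unique = []
    for record in records:
        unit = crosswalk.get(record["id"], record["id"])
        key = (unit, record["period"])
        if key not in seen:
            seen.add(key)
            unique.append({**record, "id": unit})
    return unique


def reservoir_sample(records, n, seed=None):
    """Draw a simple random sample of up to n records in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, record in enumerate(records):
        if i < n:
            # Fill the reservoir with the first n records.
            reservoir.append(record)
        else:
            # Replace a stored record with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = record
    return reservoir


# Made-up file: A1 and A2 are two identifiers for the same unit.
records = [
    {"id": "A1", "period": "2023", "revenue": 10},
    {"id": "A2", "period": "2023", "revenue": 10},
    {"id": "B1", "period": "2023", "revenue": 7},
    {"id": "C1", "period": "2023", "revenue": 3},
]
crosswalk = {"A2": "A1"}

unique = unduplicate(records, crosswalk)
sample = reservoir_sample(unique, n=2, seed=1)
```

Removing duplicates before sampling keeps the selection probability equal across units. Reservoir sampling is used here because it reads the file once and does not need the file size in advance, which suits very large files; other sampling designs may be preferable depending on the need.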
References
Brackstone, G.J. (1987). Issues in the use of administrative records
for statistical purposes. Survey Methodology, 13, 29–43.
Cox, L.H. and Boruch, R.F. (1988). Record linkage, privacy and statistical
policy. Journal of Official Statistics, 4, 3–16.
Hidiroglou, M.A., Latouche, M.J., Armstrong, B., and Gossen, M. (1995).
Improving survey information using administrative records: the case of
the Canadian Employment Survey. Proceedings of the Annual Research
Conference, U.S. Bureau of the Census, 171–197.
Kovacevic, M. (1999). Record linkage and statistical matching –
they aren’t the same! SSC Liaison, 13(3), 24–29.
Michaud, S., Dolson, D., Adams, D., and Renaud, M. (1995). Combining
administrative and survey data to reduce respondent burden in longitudinal
surveys. Proceedings of the Section on Survey Research Methods,
American Statistical Association, 11–20.
Monty, A. and Finlay, H. (1994). Strengths and weaknesses of administrative
data sources: experiences of the Canadian Business Register. Statistical
Journal of the United Nations ECE, 11, 205–210.
Scheuren, F. and Winkler, W.E. (1993). Regression analysis of data files
that are computer matched. Survey Methodology, 19, 39–58.
Statistics Canada (1970). The Statistics Act. Ottawa,
Canada.
Statistics Canada (1996a). Policy
on Record Linkage. Policy Manual, 4.1.
Statistics Canada (1998a). Policy
on Informing Survey Respondents. Policy Manual, 1.1.
Wolfson, M., Gribble, S., Bordt, M., Murphy, B. and Rowe, G. (1987).
The Social Policy Simulation Database: an example of survey and administrative
data integration. Proceedings of the International Symposium on
the Statistical Uses of Administrative Data, Statistics Canada,
201–229.