
Support Activities

Time series
Record linkage
Data analysis resource centre (DARC) support
Generalized systems
Quality assurance
Training and consultation
Conferences
Survey methodology

For more information on the program as a whole, contact:
Mike Hidiroglou (613-951-0251, mike.hidiroglou@statcan.gc.ca).

Time series

The main purposes of these projects are as follows:

  • To provide consultation services in time series;
  • To apply, develop and offer advice on seasonal adjustment methods with respect to the estimation of time series components (Trend, Seasonal, Irregular, Easter, Trading-Day, Holiday and User-defined), as illustrated in the sketch following this list;
  • To develop and improve benchmarking, interpolation, reconciliation and calendarization methods;
  • To integrate the methods into the appropriate software package; and
  • To develop and maintain courses in time series.
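
Much of this work rests on the additive decomposition of a series into y = trend + seasonal + irregular. As a minimal illustration only, the Python sketch below performs a classical decomposition; it is not the X-12-ARIMA algorithm used in production, and the function and its defaults are our own.

  import numpy as np

  def classical_decomposition(y, period=12):
      # Illustrative additive decomposition: y = trend + seasonal + irregular
      y = np.asarray(y, dtype=float)
      n = len(y)
      # Trend: centered moving average (a 2x(period) average when period is even)
      w = np.ones(period) / period
      if period % 2 == 0:
          w = np.convolve(np.ones(2) / 2, w)
      half = len(w) // 2
      trend = np.full(n, np.nan)
      trend[half:n - half] = np.convolve(y, w, mode="valid")
      # Seasonal: average the detrended values at each position in the cycle,
      # constrained to sum to zero over a full cycle
      detrended = y - trend
      factors = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
      factors -= factors.mean()
      seasonal = np.tile(factors, n // period + 1)[:n]
      return trend, seasonal, y - trend - seasonal

  # With a few years of monthly data in y, the seasonally adjusted series
  # is y minus the estimated seasonal component:
  # trend, seasonal, irregular = classical_decomposition(y)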

Alumni:

Dr. Estela Bee Dagum and Pierre Cholette completed their work term as part of the alumni program in December 2009. During their stay, they directed reading and discussion groups about their 2006 book on Benchmarking, Temporal Distribution and Interpolation Methods for Time Series. Dr. Dagum also provided consultation and training on trend estimation and seasonal adjustment. Cholette provided advice and support on calendarization methods.

Seasonal adjustment:

A new prototype system for time series processing using SAS PROC X12 and Statistics Canada's in-house SAS Proc TSRaking was developed and successfully put into production with the Labour Force Survey and at the Canadian Real Estate Association. The prototype was documented and presented at JSM 2009 as a topic contributed poster titled ‘Recent Developments in Statistics Canada’s Time Series Processing System: Transition to SAS® PROC X12’ by Ferland and Fortier (2009). This work included extensive collaboration with the SAS Institute regarding PROC X12.

Internal tools to facilitate the use of and transition between SAS Proc X12 and X-12-ARIMA software from the US Census Bureau were developed.

Benchmarking:

Benchmarking refers to techniques used to ensure coherence between time series data of the same target variable measured at different frequencies, for example, sub-annually and annually. Research progress in this area is described in various papers written, published or submitted for publication during the review period:

The paper “A Nonparametric Iterative Smoothing Method for Benchmarking and Temporal Distribution” by Quenneville, Fortier and Gagné (2009a) was published in Computational Statistics & Data Analysis.
 
The paper “Benchmarking via the regression method with working error-models,” by Chen and Wu (2009) was sent to a journal for possible publication.

Quenneville, Fortier and Gagné (2009b) presented a topic contributed paper at JSM 2009 titled ‘Illustration and Convergence Property of the Nonparametric Iterative Smoothing Method for Benchmarking and Temporal Distribution.’

Gagné and Quenneville (2009) presented a topic contributed paper at JSM 2009 titled ‘Testing Time Series Data Compatibility for Benchmarking.’

Quenneville, Picard and Fortier (2010) started to work on a paper for an invited presentation for JSM2010 tentatively titled “Interpolation, Benchmarking and Temporal Distribution with Natural Splines.”
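
To make the benchmarking constraint concrete, the following Python sketch simply prorates a quarterly indicator so that each year's quarters sum to the corresponding annual benchmark. It illustrates the constraint being imposed, not the nonparametric or regression methods in the papers above (which also smooth the adjustments across year boundaries); the data are hypothetical.

  import numpy as np

  def prorate(indicator, benchmarks, per_year=4):
      # Scale each year's block of sub-annual values so it sums to the benchmark
      x = np.asarray(indicator, dtype=float)
      out = x.copy()
      for j, b in enumerate(benchmarks):
          block = slice(j * per_year, (j + 1) * per_year)
          out[block] *= b / x[block].sum()
      return out

  # Two years of quarterly data benchmarked to annual totals of 50 and 55
  print(prorate([10, 12, 11, 13, 11, 13, 12, 14], [50, 55]))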

Reconciliation:

Reconciliation refers to the techniques used to impose contemporaneous aggregation constraints on tables of time series, so that the “cell” series add up to the appropriate “marginal total” series. Fortier and Quenneville (2009) presented an invited paper at JSM 2009 titled ‘Reconciliation and Balancing of Accounts and Time Series, from Concepts to a SAS Procedure’ to summarize and justify the method used. Quenneville and Fortier (2010) began writing an overview and summary paper titled “Restoring Accounting Constraints in Time Series – Methods and Software for a Statistical Agency” for possible publication in a book in honour of Dr. David Findley.
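
For each time period, the computation reduces to adjusting a table of cell values until the rows and columns match their marginal total series. The Python sketch below implements plain iterative proportional fitting (raking) for one period; it is illustrative only, not the algorithm inside Proc TSRaking, and the numbers are hypothetical.

  import numpy as np

  def rake(cells, row_totals, col_totals, max_iter=100, tol=1e-10):
      # Alternately rescale rows and columns until both sets of margins are met
      x = np.asarray(cells, dtype=float)
      rows = np.asarray(row_totals, dtype=float)
      cols = np.asarray(col_totals, dtype=float)
      for _ in range(max_iter):
          x *= (rows / x.sum(axis=1))[:, None]
          x *= cols / x.sum(axis=0)
          if np.allclose(x.sum(axis=1), rows, atol=tol):
              break
      return x

  # One period's 2x2 table of cell values forced to add to its margins
  print(rake([[10, 20], [30, 40]], row_totals=[35, 65], col_totals=[45, 55]))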

Software development:

Version 1.03.001 of the Forillon system, which groups the time series procedures, has been released. This version contains Proc Benchmarking, a procedure to benchmark sub-annual indicator series to control totals (benchmarks), and Proc TSRaking, a procedure to restore additivity in a system of time series. An intranet site is available, including installation instructions. Assistance is also available through Forillon@statcan.gc.ca.

Bérubé and Fortier (2009) presented a topic contributed paper at JSM 2009 titled ‘PROC TSRAKING: An In-house SAS® Procedure for Balancing Time Series’ to introduce the reconciliation procedure.

Trend estimation:

Dr. Sylvia Bianconcini, professor from the University of Bologna, Italy, visited the TSRAC from June 3 to August 31. Dagum and Bianconcini (2009) presented an invited paper at JSM 2009 titled ‘Recent developments in short-term trend prediction for real time analysis.’ Bianconcini and Dagum (2009) presented a topic contributed paper at JSM 2009 titled ‘Equivalent reproducing kernels for smoothing spline predictors.’
 
Martin Romero, Meszaros, Elliot and McLaren (2009) presented a topic contributed paper at JSM 2009 titled ‘Issues in trend-cycle estimates for official statistics.’

Courses:

The course ‘Seasonal Adjustment with X-12-ARIMA’ (course number 0434) has been revised to use a new version of the interface, called WinX12, recently released by the US Census Bureau. The development of the course notes and practical exercises for a course on “Reconciliation of Time Series with Proc TSRaking” (course number 0237) was started.

Quality guidelines:

Quality guidelines on benchmarking and related techniques, and on seasonal adjustment and trend-cycle estimation were updated and released. See Statistics Canada (2009a) and Statistics Canada (2009b).

Additional presentations and papers:

Various presentations and papers have been given or written reporting on research work that took place at the Time Series Research and Analysis Centre; see Fortier (2009) and Quenneville (2009).

Ad hoc consultation:

TSRAC staff continued to offer ad hoc support on time series issues as part of the resource centre’s mission: partially funded work on calendarization methodology in general, calendarization for the monthly surveys, estimation of indicator series for calendarization purposes, trend estimation on environment data, and various other topics.

For further information, contact:
Benoit Quenneville (613-951-1605, benoit.quenneville@statcan.gc.ca) or
Susie Fortier (613-951-4751, susie.fortier@statcan.gc.ca).

References

Bianconcini, S., and Dagum, E.B. (2009). Equivalent reproducing kernels for smoothing spline predictors. In JSM Proceedings, Business and Economic Section. Alexandria, VA: American Statistical Association.

Dagum, E.B., and Bianconcini, S. (2009). Recent developments in short-term trend prediction for real time analysis. In JSM Proceedings, Business and Economic Section. Alexandria, VA: American Statistical Association.

Record linkage

The objectives of the Record Linkage Resource Centre (RLRC) are to:

  • Provide consulting services to both internal and external users of record linkage methods, including recommendations on software and methodology and collaborative work on record linkage applications.
  • Evaluate alternative record linkage methods and develop improved methods.
  • Evaluate software packages for record linkage and, where necessary, develop prototype versions of software incorporating methods not available in existing packages.
  • Assist in the dissemination of information concerning record linkage methods, software and applications to interested persons both within and outside Statistics Canada.

Below is a list of our activities in 2009-2010:

  • Continued to provide support to the development team (SDD) for the Generalized Record Linkage System (GRLS) and GLINK (Fundy), including participation in GRLS/GLINK User Group meetings.
  • Provided consulting services to external users of record linkage methods.
  • Provided various consulting services to internal users of record linkage methods.
  • Produced a bibliography of record linkage literature.
  • Performed an exchange of documentation with the US Census Bureau on the BigMatch software and with the Italian National Statistical Institute (ISTAT) on the RELAIS (Record Linkage At Istat) software.
  • Conducted a survey of record linkage activities in the Methodology Branch in order to build an inventory of information, encourage knowledge sharing and improve the coherence of record linkage activities.

For further information, contact:
Abdelnasser Saidi (613-951-0328, abdelnasser.saidi@statcan.gc.ca).

Data analysis resource centre (DARC) support

The Data Analysis Resource Centre (DARC) is a centre of data analysis expertise within Statistics Canada whose purpose is to encourage, suggest and provide sound analytical methods and tools for use with Statistics Canada data. Since DARC receives funding for its support activities from a number of sources in addition to the Methodology Block Fund, all of its support activities are mentioned in this report. DARC also uses a portion of its Methodology Block Fund allocation for research. Those research activities are also partially funded by Data Analysis Research (1919) and are described in the report for that project.

Consultations

In 2009/10, DARC carried out a large number of consultations on statistical methods with analysts from many areas within Statistics Canada. Some problems related to more traditional analytical approaches with survey data, such as estimation and inference for descriptive statistics and the fitting of linear and logistic models to survey data. Other consultations involved newer, more complex issues, such as the use of hierarchical models with survey data and the integration of data from more than one survey into a single analysis. Some consultations were not based on survey data: one example was an extensive consultation to study the coverage complexities of the T2LEAP database, which is used for the longitudinal analysis of corporate employers; another was a consultation with BSSTSD on methodologies and best practices for comparing the business outcomes of firms that do and do not participate in a particular program.

Again this year, there was considerable consultation with other methodologists. Some methodologists consulting DARC were doing analysis themselves, such as analysis to evaluate a program; examples were the consultations with LFS methodologists studying the impact of moving to internet interviews and with CHMS methodologists who wished to predict the impact on variance estimates of changing the number of sites sampled in the survey. Other methodologists were supporting subject-matter analysts of the survey data for which they were responsible and wanted additional information on suitable analytical approaches.

As well, the level of consultative support given to the Research Data Centres remained high. The majority of consultations involved questions about appropriate methods and tools for complex survey data. Many of these consultations were with the Statistics Canada analysts who support the RDC researchers, but some were with the researchers directly. A common problem related to the pooling of data from more than one survey into a single analysis.

Several consultations also took place with people external to both Statistics Canada and the RDC network. One consultation was with researchers from the Institute for Clinical Evaluative Sciences, who sought assistance with calculating survey bootstrap variance estimates for population risk predictions, based on research results that DARC had developed in the previous year. Another was with researchers from DND analyzing data from a periodic survey carried out in Afghanistan. A large consultation with DND, which was mainly cost-recovered, involved a variety of analyses of 2006 Census data to study employment-related characteristics of military spouses and to compare these to other population groups.

As well as giving advice on a methodological approach, many consultations also included support for determining and using suitable software for an analysis.

The consultation function also included technical reviewing and refereeing of articles for both internal and external journals, including Health Reports, Canadian Social Trends, JOS, the RDC Information and Technical Bulletin, and the proceedings of the Survey Methods Section of the SSC and of the Joint Statistical Meetings.

Provision of Training

During 2009/10 DARC was responsible for presenting Course 0438, Statistical Analysis of Survey Data. Both parts (0438A and 0438B) were presented in both English and French. Additional material was also developed. Some methodologists outside of DARC were also involved.

A seminar was given at the RDC analysts’ conference in November 2009 on pressing analytical questions faced by the analysts. The topics covered were based on submissions from the analysts. Some of the same topics were covered in a seminar presented at CIQSS in February 2010, where the researchers were invited, in advance, to propose problems they were encountering when analysing Statistics Canada data.

DARC members were involved in the brainstorming sessions for each offering of the Data Interpretation Workshop, in the project reviews, and in presenting a seminar on the analysis of survey data. There has been follow-up with some of the participants who wanted further assistance with their analysis projects.

DARC presented a seminar in the seminar series aimed at new recruits on the topic of the analysis of survey data.

A seminar was presented in the seminar series on Policies, Standards and Guidelines in February 2010. The topic was the Analysis and Presentation of Data.

Software Evaluation, Development and Promotion

During 2009/10, DARC continued to examine various commercial software packages with respect to their suitability for analyzing Statistics Canada survey data. In addition to keeping abreast of new features in software that can make use of survey bootstrap weights for design-based analysis (e.g., SUDAAN 10, Stata 10 and SAS 9.2), work was done on identifying what advances are being made in allowing for design information in software for multilevel modeling and for fixed effects regression. Some modifications were also made to a SAS macro that, when combined with the HLM6 software, allowed variance estimation for a select set of hierarchical models through the use of survey bootstrap weights.
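
The computation that these packages automate with survey bootstrap weights is itself straightforward: re-estimate the statistic once per set of replicate weights and measure the spread of the replicates around the full-sample estimate. A minimal Python sketch with hypothetical data follows; the weighted mean stands in for whatever statistic is of interest.

  import numpy as np

  def bootstrap_variance(y, weights, bootstrap_weights):
      # Full-sample point estimate (here, a weighted mean)
      theta = np.average(y, weights=weights)
      # One replicate estimate per set of bootstrap weights
      reps = np.array([np.average(y, weights=w) for w in bootstrap_weights])
      # Survey bootstrap variance: mean squared deviation of the replicates
      return theta, np.mean((reps - theta) ** 2)

  y = [4.0, 7.0, 5.0, 9.0]
  w = [1.5, 2.0, 1.0, 2.5]
  bw = [[1.2, 2.1, 1.1, 2.6], [1.8, 1.7, 0.9, 2.4], [1.4, 2.2, 1.2, 2.3]]
  print(bootstrap_variance(y, w, bw))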

A seminar entitled “Introduction to SAS-callable SUDAAN” was presented twice during the year to people who had newly gained access to SUDAAN.

DARC was involved in the process of extending access to software packages SUDAAN and Stata to analysts from a large number of divisions outside of Methodology. Time was also spent on supporting the use of these software packages.

Other activities

DARC resources were used to support PhD students at Statistics Canada on NICDS/MITACS internships.
 
DARC members were involved in a variety of organizational functions for the program of the Survey Methods Section of the SSC in 2009 and 2010 and for the program of the 2009 Methodology Symposium.

Georgia Roberts sat on the editorial boards of Health Reports and Canadian Social Trends and sat on a committee reviewing analysis software at Statistics Canada.

DARC members were responsible for rewriting the chapter on Data Analysis and Presentation that was included in the newest edition of the Quality Guidelines (Statistics Canada, 2009c).

For further information, contact:
Georgia Roberts (613-951-1471, georgia.roberts@statcan.gc.ca).

Generalized systems

Confid2:

Confid2 is the generalized software used to suppress sensitive cells in a data release. It preserves the confidentiality of data providers while masking the minimum amount of information. Version 1.0 of Confid2 was released in May 2009, and was the subject of a presentation at SSC in June. In January 2010, version 1.02 of Confid2 was released.
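
Rules of the kind Confid2 applies are typically dominance rules: a cell is sensitive when its largest contributors account for too much of its total. The C2 rule itself is in-house and its parameters are not documented here; the Python sketch below shows a generic (n,k) dominance rule with hypothetical parameter values.

  def is_sensitive(contributions, n=1, k=0.85):
      # Generic (n,k) dominance rule, illustrative only (not Confid2's C2 rule):
      # a cell is sensitive if its n largest contributors account for more
      # than a fraction k of the cell total.
      total = sum(contributions)
      if total <= 0:
          return False
      top_n = sum(sorted(contributions, reverse=True)[:n])
      return top_n / total > k

  # A cell dominated by one large respondent is flagged for suppression
  print(is_sensitive([900, 50, 30, 20]))     # True: top contributor holds 90%
  print(is_sensitive([300, 280, 250, 170]))  # False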

Enhancements contained in Confid2 include the following:

  • Proc Sensitivity: The new confidentiality rule C2 is now available. C2 is an in-house dominance rule whose adoption, in replacement of the Duffett rules, was approved by the Methods and Standards Committee.
  • A bug in the case of multiple decomposition has been fixed. The process of defining the constraints required for suppression has been improved.
  • %Suppress: The new options CVAR1 and CVAR2 may be used to specify a cost variable that is used to prioritize cells for suppression.
  • %Audit: The macro uses SAS/Connect to perform parallel processing on the same machine in order to decrease execution time.

A tutorial and website for Confid2 were also made available in January. The tutorial guides new users through the system's resources and describes the steps to follow to implement a simple disclosure control strategy with Confid2. A data set and exercises accompany the tutorial to aid in understanding how the system operates.

BANFF:

The Banff system for edit and imputation was developed to satisfy most of the edit and imputation requirements of economic surveys at Statistics Canada. Banff can be called through SAS programs, through tasks in SAS Enterprise Guide, or through metadata with the Banff Processor. The Banff Processor allows users to enter their edits in a spreadsheet format, which is then read and parsed into SAS code. The Banff Processor was released internally in August 2009 and user support was provided. The production version of Banff 2.04 (including the Processor) is expected to be released in the spring of 2010.
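
As a rough illustration of what an edit specification amounts to (the Banff Processor's own metadata format is not reproduced here), the Python sketch below applies simple record-level edits expressed as predicates; the field names and rules are hypothetical.

  # Each edit is a (description, predicate) pair applied to a record (a dict)
  EDITS = [
      ("revenue must be non-negative", lambda r: r["revenue"] >= 0),
      ("expenses must be non-negative", lambda r: r["expenses"] >= 0),
      ("profit must equal revenue minus expenses",
       lambda r: abs(r["profit"] - (r["revenue"] - r["expenses"])) < 1e-6),
  ]

  def failed_edits(record):
      # Return the descriptions of all edits the record violates
      return [desc for desc, check in EDITS if not check(record)]

  print(failed_edits({"revenue": 100.0, "expenses": 40.0, "profit": 55.0}))
  # ['profit must equal revenue minus expenses']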

GSAM:

GSAM is the generalized sampling system. The beta version of the second phase allocation was released in December 2009. The second phase allocation re-introduces the S2P Minimize Total Cost Allocation function for a second phase sample under SRSWOR. This function determines the second phase sample allocation of a two-phase sample design under SRSWOR, given a first phase sample and CV constraints. Each CV constraint is defined as a combination of a domain and a target variable list; a domain can be either a specific logical expression or a BY group definition. The function can also be used to overcome the limit of 20 CV constraints in the S1E Minimize Total Cost Allocation function for a one-phase sample under SRSWOR: the entire population (or sampling frame) is simply used as the first phase sample. The production version of GSAM 2.4 is expected to be released at the end of May 2010.
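
The core computation behind allocation under CV constraints can be illustrated for a single stratum: under SRSWOR, the CV of the estimated mean satisfies CV² = (1/n − 1/N)·S²/Ȳ², which can be inverted for the smallest n meeting a target CV. The Python sketch below does only this single-constraint inversion with hypothetical inputs; the GSAM functions solve the much harder multi-constraint cost-minimization problem.

  import math

  def min_n_srswor(N, S, ybar, target_cv):
      # Smallest SRSWOR sample size n with CV(estimated mean) <= target_cv,
      # from CV^2 = (1/n - 1/N) * S^2 / ybar^2
      n0 = (S / (target_cv * ybar)) ** 2  # required size ignoring the fpc
      n = n0 / (1 + n0 / N)               # finite population correction
      return min(N, math.ceil(n))

  # Hypothetical stratum: N = 10000 units, S = 50, mean 200, target CV of 2%
  print(min_n_srswor(N=10000, S=50.0, ybar=200.0, target_cv=0.02))  # 154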

StatMx:

StatMx is a collection of SAS macros that provide additional functions and capabilities beyond those currently available in either the Generalized Sampling System (GSAM) or the Generalized Estimation System (GES). The methodology described in the paper “A new face on two-phase sampling with calibration estimators” by C. Särndal and V. Estevao (Survey Methodology, June 2009 – vol. 35, no. 1) is being used to estimate the variance of the estimates from two-phase designs.

Several activities were undertaken this year. The Calibration Weights Function document was updated to make it more readable and to correct some errors and omissions. A stand-alone macro was developed for handling the estimation of proportions; before running the estimation function, the macro adds the necessary dummy variables to a copy of the sample file for the required proportions of interest and creates the corresponding parameter file for their estimation. The feasibility of adding the option to output the design effect for the parameter estimates in the calibration estimation function was assessed, and a prototype for E1 designs was created and tested. A review of Michel Ferland’s implementation of the Lavallée-Hidiroglou algorithm was initiated, for possible inclusion in StatMx.
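
The idea behind the proportions macro is standard: each category of interest becomes a 0/1 dummy variable, and the weighted mean of the dummy is the estimated proportion. A minimal Python sketch follows (the variable names are hypothetical, and the StatMx macro additionally writes the parameter file required by the estimation function).

  import numpy as np

  def weighted_proportions(categories, weights):
      # Estimate category proportions as weighted means of 0/1 dummy variables
      categories = np.asarray(categories)
      weights = np.asarray(weights, dtype=float)
      props = {}
      for c in np.unique(categories):
          dummy = (categories == c).astype(float)  # the added dummy variable
          props[c] = np.average(dummy, weights=weights)
      return props

  print(weighted_proportions(["yes", "no", "yes", "yes"], [1.0, 2.0, 1.5, 0.5]))
  # proportions: no = 0.4, yes = 0.6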

In January 2010, StatMx 3.5 was released. Enhancements include the following:

  • A correction to the merging of the point estimates and their variances in the Calibration Estimation function for two-phase (E2) designs when some empty domains are present (empty domains contain no sample data);
  • The Stratification and Allocation function is now able to process much larger datasets; and
  • Some data management methods to improve performance.

Methodology management discussed the current status of StatMx and potential options for its long-term development and support. They are assessing the future use and expansion of StatMx modules.

For further information, contact:
Laurie Reedman (613 951-7301, laurie.reedman@statcan.gc.ca).

Quality assurance

Quality Control Data Analysis System (QCDAS):

QCDAS is a SAS-based system with a menu-driven user interface. It is used to analyze, estimate and tabulate summary statistics for quality control programs, and to generate reports reflecting the analysis. The new SAS-based QCDAS was put into production in April 2009 and has completely replaced the previous MS Access system. During the first few months of production, a few bugs were identified and corrected. The French version of the interface screens was developed and reviewed and should be implemented in April 2010. Draft copies of the user manual and an administrator’s guide were developed and are currently under review.

Generic Intelligent Character Recognition (ICR):

ICR is the technology used for data capture, which combines automated machine capture (using optical character, mark and image recognition) with the manual heads-up capture of data by operators. The current system used is called “AnyDoc”. BSMD has developed and implemented a generic quality control for this ICR system covering the document preparation, scanning, automated data entry (ADE) and manual key-from-image (KFI) stages of data processing. Upgrades to the AnyDoc system itself necessitated analysis to determine what changes and adjustments are required to the QC methodology. A team from Statistics Canada (with one representative from BSMD) visited the US Census Bureau to learn firsthand how their quality control for automated machine capture works. This team is tasked with planning a redesign of the data capture process in OID in order to streamline operations and make better use of emerging technology. The contract with AnyDoc has been renewed for three years, after which time there is the option to move to the US Census Bureau system instead.

A study was done to evaluate whether changes need to be made to the quality control parameters used in the generic ICR system. Different scenarios were run using data from 11 surveys. A report on the findings was written and changes were proposed to the parameters, namely the number of standard deviations from the target error rate for the ADE and KFI processes and the target error rate for the KFI process.
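
The parameters in question define control limits of the usual form: with a target error rate p0 and a verified sample of n units, a process is flagged when the observed error rate exceeds p0 by more than z standard deviations of a binomial proportion. The Python sketch below illustrates the rule with hypothetical values; the actual parameters of the generic ICR system are those discussed above.

  import math

  def out_of_control(errors, n, target_rate, z=3.0):
      # Flag a batch whose observed error rate exceeds the upper control limit
      # target_rate + z * sqrt(target_rate * (1 - target_rate) / n)
      observed = errors / n
      ucl = target_rate + z * math.sqrt(target_rate * (1 - target_rate) / n)
      return observed > ucl

  # Hypothetical KFI batch: 18 errors in 400 verified fields, 2% target rate
  print(out_of_control(errors=18, n=400, target_rate=0.02))  # True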

Automated Coding Parsing and Scoring Methodology (ACTR):

ACTR is used to automatically assign predefined codes to responses to open-ended questions. This is done in two steps. In the first step, both the input and reference text are parsed by a user-defined parsing strategy in order to reduce the text to a standard form. Parsing deals with problems such as common spelling variations and abbreviations, and the parsing strategy plays a strong role in determining the success rate of the coding process. The second step is to match the parsed input text to a list of parsed descriptions in a reference file and assign the associated code when a match is successful. Matching can be direct or indirect. For indirect matching, a weight is assigned to each matching word in the input phrase, and a score for the phrase is computed based on the weights and the number of words in common between the input and the reference descriptions. Research was undertaken to improve indirect matching by modifying the word weighting and phrase scoring methodology. Analysis of changes made to the current weighting and scoring formula, along with an alternative method of calculating the phrase score, was presented to the BSMD Technical Committee in April. Following the committee's recommendations, more work was done and a proposal that is expected to improve the match rate was written. A new interface for ACTR is currently being developed by SDD, and preliminary versions were tested by BSMD. The recent developments in ACTR will be presented at SSC in June and possibly at the Advisory Committee in the fall of 2010.
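
To make the weighting and scoring concrete, one common scheme weights each word by its rarity in the reference file and scores a candidate match by the weight of the shared words relative to the weight of all words in either phrase. The Python sketch below is illustrative only and is not ACTR's actual formula; the reference descriptions are hypothetical.

  import math

  def word_weights(reference_phrases):
      # Weight each word by its rarity across the reference descriptions
      n = len(reference_phrases)
      doc_freq = {}
      for phrase in reference_phrases:
          for word in set(phrase.split()):
              doc_freq[word] = doc_freq.get(word, 0) + 1
      return {w: math.log(n / df) + 1.0 for w, df in doc_freq.items()}

  def score(input_phrase, ref_phrase, weights):
      # Score = weight of shared words / weight of all words in either phrase
      a, b = set(input_phrase.split()), set(ref_phrase.split())
      shared = sum(weights.get(w, 1.0) for w in a & b)
      total = sum(weights.get(w, 1.0) for w in a | b)
      return shared / total if total else 0.0

  refs = ["retail sale of shoes", "wholesale of shoes", "retail sale of books"]
  w = word_weights(refs)
  print(max(refs, key=lambda r: score("sale of shoes", r, w)))
  # 'retail sale of shoes' scores highest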

Consultation:

Advice on quality control methods was provided as requested to various clients, including the Census of Population, Agriculture Division, and Geography Division.

For further information, contact:
Laurie Reedman (613 951-7301, laurie.reedman@statcan.gc.ca).

Training and consultation

The Statistical Training Committee (CFS) coordinates the development and delivery of 20 regularly-scheduled courses in survey methods, sampling theory and practice, questionnaire design, time series methods and statistical methods for data analysis. During the year, a total of 25 scheduled sessions were given in either English or French.

The suite of courses continues to expand. As many as five new courses are currently being developed on the following topics: analysis of longitudinal data, calibration, automated coding and processing, reconciliation of time series and methodological issues for new collection methods.

For more information, contact:
François Gagnon (613-951-1463, francois.gagnon@statcan.gc.ca).

Conferences

2008 Symposium:

The 24th International Symposium on Methodological Issues took place on October 28-31, 2008, at the Palais des congrès in Gatineau. The theme was “Data Collection: Challenges, Achievements and New Directions”.

The individual papers for the 2008 Symposium proceedings were approved by Dissemination Division. The papers were loaded onto the Statistics Canada website and a CD-ROM was created. An official release of the proceedings took place on December 3, 2009. A CD-ROM was mailed to all participants.

The work on the 2008 Symposium has now been completed.

2009 Symposium:

The 25th International Methodology Symposium was held from October 27 to 30, 2009, at the Palais des congrès in Gatineau under the theme “Longitudinal Surveys: from Design to Analysis”. Close to 500 people were in attendance. Peter Lynn was the keynote speaker, and Graham Kalton was the Waksberg Award speaker. Two workshops were also given on the first day of the symposium:

  • Michelle Simard and François Brisebois: “Over 15 Years of Longitudinal Surveys at Statistics Canada: Lessons and Innovations”;
  • Sophia Rabe-Hesketh and Anders Skrondal: “Multilevel Modeling of Longitudinal Data”.

In the months leading up to the Symposium, the organizing committee put together a program made up of over 60 presentations from 15 different countries. It also took care of the details surrounding registration, operations and facilities. The proceedings should be disseminated by late 2010.

2010 Symposium:

The preparations for the 26th International Methodology Symposium are well under way. The Symposium will take place from October 27 to 29, 2010, at the Crowne Plaza Hotel in Ottawa, and the theme will be “Social Statistics: The Interplay among Censuses, Surveys and Administrative Data”. The keynote speaker will be Jelke Bethlehem, and Ivan Fellegi will be the Waksberg Award speaker. The Symposium will be preceded by a day of workshops.

For further information, please contact:
Pierre Lavallée (613-951-2892, Pierre.Lavallee@statcan.gc.ca) or
Sylvie Gauthier (613-951-4691, Sylvie.Gauthier@statcan.gc.ca).

Survey methodology

Survey Methodology (SM) is an international journal that publishes articles in both official languages on various aspects of statistical development relevant to a statistical agency. Its editorial board includes world-renowned leaders in survey methods from the government, academic and private sectors. It is one of just two major journals in the world dealing with methodology for official statistics.

The 2009-2010 year was marked by various changes to the Management and Editorial Boards: John Kovar now acts as chairman, Mike Hidiroglou is the new editor, and Susie Fortier steps in as the production manager. The Journal also welcomed two new assistant editors, Guylaine Dubreuil and Cynthia Bocci. Other changes to the list of associate editors are also being finalized.

The June 2009 issue, SM 35-1, was released on June 22nd, 2009. The issue contains nine papers.

The December 2009 issue, SM 35-2, was released on December 23rd, 2009. It contains 11 papers, including the ninth in the annual invited paper series in honour of Joseph Waksberg. The recipient of the 2009 Waksberg Award is Graham Kalton.

In the reference period, the journal received 51 paper submissions from various authors.

Survey Methodology is now included in the ISI Web of Knowledge. The last six issues have been indexed (33-1 to 35-2; June 2007 to December 2009). SM is scheduled to receive its first impact factor from the Journal Citation Reports in the summer of 2010. SM is also newly covered by SCOPUS in the Elsevier Bibliographic Databases, starting with the June 2008 issue.

Electronic versions of SM are now available online from issue 25-2 (December 1999) to issue 35-2 (December 2009). Electronic indexes and abstracts are also available back to issue 22-2 (December 1996). The conversion of back issues to an electronic version continues, and work is under way to produce a CD of the first issues.

Documentation of all phases of the production of Survey Methodology continues, including discussions with managers of the Customer Relationship Management System, among others.

For more information, contact:
Susie Fortier (613 951-4751, susie.fortier@statcan.gc.ca).