Statistics Canada - Statistique Canada
Statistics Canada Quality Guidelines

Data analysis (and presentation)

Scope and purpose

Data analysis is the process of transforming raw data into usable information, often presented in the form of a published analytical article. The basic steps in the analytic process are identifying an issue, asking meaningful questions, developing answers to those questions through examination and interpretation of data, and communicating the message to the reader.

Analytical results can underscore the usefulness of data sources by shedding light on issues. Some Statistics Canada programs even depend on analytical output as a major data product because, for confidentiality reasons, the microdata cannot be released to the public. In recent years, the Agency has placed emphasis on increasing the amount of relevant analysis done internally with Statistics Canada data.

Data analysis also has an important role in the survey development and revision process. It can have a crucial impact on data quality by helping to identify problems related to data quality and by influencing future improvements to the survey process. Analysis is essential for understanding results from previous surveys and pilot studies, for planning new statistical activities, for providing information on data gaps, for designing surveys, and for formulating quality objectives.

Principles

A statistical agency is concerned with the relevance and usefulness to users of the information contained in its data. Analysis is the principal tool for obtaining information from the data. Analysis results may be categorized into two general types: (a) descriptive results, which relate to the survey population at the time that the data were collected - for example, the median income in the year that the population was surveyed; and (b) analytical results, which relate to a population that often extends beyond the population actually surveyed - for example, the chance of someone having a particular chronic disease.

To be effective, the analyst needs to know the audience and the issues of concern (both current and those likely to emerge in the future) when identifying topics and suitable ways to present results. Study of background information allows the analyst to choose appropriate data sources and statistical methods. Any conclusions presented in an analytical study, including those that can impact on public policy, must be supported by the data being analyzed.

Guidelines

  • Ensure that the data are appropriate for the analysis to be carried out. This requires investigation of a wide range of details, such as whether the survey population sufficiently approximates the target population of the analysis, whether the variables and their concepts and definitions are relevant to the study, whether the longitudinal or cross-sectional nature of the survey is appropriate for the analysis, whether the sample size in the study domain is sufficient to obtain meaningful results, and whether the ascertained quality of the data from the survey supports these results.

  • If more than one data source is being used for the analysis, investigate whether the sources are consistent and how they may be appropriately combined.

  • Consider whether imputed values should be included in the analysis and if so, how they should be handled (see section on Imputation).

  • Consider how unit and/or item nonresponse should be handled in the analysis.

  • Choose an analytical method that is appropriate for the question being investigated.

  • When making comparisons between two groups of individuals, businesses, or other units, control for extraneous factors. If significant differences between the groups are found as a result of statistical tests, then consider alternative plausible explanations for the differences.

  • Since most analyses are based on observational studies rather than on the results of a controlled experiment, avoid drawing conclusions concerning causality.

  • Use diagnostic techniques to assess the analytical model.

  • Beware of focusing on short-term trends without inspecting them in light of medium- and long-term trends. Frequently, short-term trends are merely minor fluctuations around a more important medium- and/or long-term trend.

  • Where possible, avoid arbitrary time reference points, such as the change from last year to this year. Instead, use meaningful points of reference, such as the last major turning point for economic data, generation-to-generation differences for demographic statistics, and legislative changes for social statistics.

  • Consult with experts both on the subject matter and on the statistical methods.

  • Analytical methods that ignore the survey design can be useful, provided the model being assumed in the analysis is correct. However, alternative methods that incorporate the sample design information, frequently called design-based methods, will generally be effective even when some aspects of the model are incorrectly specified. Assess whether the survey design information can be incorporated into the analysis and, if so, how this should be done. Having determined the appropriate analytical method, investigate the software choices that are available to apply the method. [See Binder and Roberts (2001) for a definition of ignorable survey designs, and Binder and Roberts (2003) and Skinner, Holt and Smith (1989) for discussion of ignoring the survey design. See Statistics Canada (2003a), Chambers and Skinner (2003), Korn and Graubard (1999), Lehtonen and Pahkinen (1995), Lohr (1999), Thomas (1993), and Skinner, Holt and Smith (1989) for a number of examples showing the benefits of design-based analytical methods.]

  • Before beginning to write, prepare an outline of the article. When preparing the outline, consider such questions as: “What issue am I addressing? What data am I using? Can I eliminate any irrelevant data? What analytical methods are appropriate? What results do I want to highlight? What are my interesting findings?”

  • Focus the article on the important variables and topics. Trying to be too comprehensive will often interfere with a strong story line.

  • Arrange ideas in a logical order and in order of relevance or importance. Use headings, sub-headings and sidebars to strengthen the organization of the article.

  • Keep the language as simple as the subject permits. Depending on the targeted audience for the article, some loss of precision may sometimes be an acceptable tradeoff for more readable text.

  • Use graphs in addition to text and tables to communicate the message. Use headings that capture the meaning (e.g., “Women’s earnings still trail men’s”) in preference to traditional chart titles (e.g., “Income by age and sex”). Always help readers understand the information in the tables and charts by discussing it in the text.

  • When tables are used, take care that the overall format contributes to the clarity of the data in the tables and prevents misinterpretation. This includes spacing; the wording, placement and appearance of titles; row and column headings and other labeling.

  • Explain rounding practices or procedures. In the presentation of rounded data, do not use more significant digits than are consistent with the accuracy of the data.

  • When presenting details about rates, be careful to distinguish between percentage change and change in percentage points. Define the base used for rates.

  • Ensure that all references are accurate and are referred to in the text.

  • Check for errors in the article. Check details such as the consistency of figures used in the text, tables and charts, the accuracy of external data, and simple arithmetic.

  • Ensure that the intentions stated in the introduction are fulfilled by the rest of the article. Make sure that the conclusions are consistent with the evidence.

  • Have the article reviewed by at least two other persons. Where appropriate, verify the quality of the translation.

  • As a good practice, consider giving a presentation on the analysis results. This is another kind of peer review that can help improve the article. Always do a dry run of presentations involving external audiences.
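
As a rough illustration of the guideline on design-based methods, the sketch below compares an unweighted sample mean with a design-weighted mean on a made-up stratified sample. The incomes, the weights and the function names (`unweighted_mean`, `weighted_mean`) are hypothetical and are not taken from any Statistics Canada survey or software. The only point is that ignoring unequal selection probabilities can shift an estimate.

```python
# Hypothetical stratified sample: (income, design_weight).
# The design weight is the number of population units that each
# respondent is assumed to represent.
sample = [
    (30_000, 100),   # stratum A: sampled at a low rate, so large weights
    (32_000, 100),
    (90_000, 10),    # stratum B: oversampled, so small weights
    (95_000, 10),
    (100_000, 10),
    (85_000, 10),
]


def unweighted_mean(rows):
    """Mean that ignores the survey design (treats all units alike)."""
    values = [y for y, _ in rows]
    return sum(values) / len(values)


def weighted_mean(rows):
    """Design-weighted mean: weighted total divided by the sum of weights."""
    total_weight = sum(w for _, w in rows)
    return sum(y * w for y, w in rows) / total_weight


naive = unweighted_mean(sample)
design_based = weighted_mean(sample)

print(f"unweighted mean:      {naive:,.0f}")
print(f"design-weighted mean: {design_based:,.0f}")
```

With these made-up numbers the unweighted mean is 72,000, while the design-weighted mean is 41,250, because the oversampled high-income stratum dominates the unweighted figure. The sketch covers point estimation only. Variance estimation under a complex design needs more design information than the weights (for example, strata and clusters) and is not attempted here.
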
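
The guideline on rates, which asks authors to distinguish percentage change from change in percentage points and to define the base, can be made concrete with a small calculation. The rates and function names below are hypothetical and serve only as a sketch of the arithmetic.

```python
def percentage_point_change(old_rate, new_rate):
    """Absolute difference between two rates already expressed in percent."""
    return new_rate - old_rate


def percentage_change(old_rate, new_rate):
    """Relative change in percent, using the old rate as the base."""
    return (new_rate - old_rate) / old_rate * 100


# Hypothetical rates, both expressed in percent.
old_rate, new_rate = 4.0, 5.0

pp = percentage_point_change(old_rate, new_rate)
pct = percentage_change(old_rate, new_rate)

print(f"change in percentage points: {pp:.1f}")
print(f"percentage change:           {pct:.1f}%")
```

A move from 4% to 5% is a rise of one percentage point, but a 25% increase relative to the base of 4%. Stating the base explicitly avoids readers confusing the two figures.
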

References

Binder, D.A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.

Binder, D.A. and Roberts, G. (2001). Can informative designs be ignorable? Newsletter of the Survey Research Methods Section, American Statistical Association, Issue 12.

Binder, D.A. and Roberts, G.R. (2003). Design based methods for estimating model parameters. In Analysis of Survey Data, R.L. Chambers and C.J. Skinner (eds.), Wiley, Chichester, 29-48.

Chambers, R.L. and Skinner, C.J. (eds.) (2003). Analysis of Survey Data. Wiley, Chichester.

Korn, E.L. and Graubard, B.I. (1999). Analysis of Health Surveys. Wiley, New York.

Lehtonen, R. and Pahkinen, E.J. (1995). Practical Methods for Design and Analysis of Complex Surveys. Wiley, Chichester.

Lohr, S.L. (1999). Sampling: Design and Analysis. Duxbury Press.

Skinner, C.J., Holt, D. and Smith, T.M.F. (1989). Analysis of Complex Surveys. Wiley, Chichester.

Statistics Canada (1995). Policy on the Review of Information Products. Policy Manual, 2.5.

Statistics Canada (2001a). Guidelines on Writing Analytical Articles. Communications Division.

Statistics Canada (2003a). Analysis Handbook. Prepared by the Data Analysis Resource Centre, Methodology Branch.

Statistics Canada (2003e). The Official Style Guide. Editorial Services, Communications Division. See http://icn-rci.statcan.ca/10/10d/10d_000_e.htm (STC intranet site). Publication updated regularly.

Thomas, D.R. (1993). Inference using complex data from surveys and experiments. Canadian Psychology, 34, 415-431.




Date Modified: 2008-11-24