Statistics Canada

Student worksheet
Analysing 2001 Census microdata

Your task

During this activity, you will do research using microdata from the 2001 Census. To find the files, go to http://www.statcan.gc.ca/start-debut-eng.html > Learning resources > Resources by school subject > Mathematics > Grades 9 to 12 > Data (tab) > 2001 Census microdata file subsets.

You will have the opportunity to:

  • ask questions and make predictions based on the data provided
  • use technology to generate graphics using the microdata and then analyze these graphs
  • develop a research question that makes use of the microdata

You are encouraged to work with a partner.

Student instructions:

  1. View the 2001 Census attributes from the Statistics Canada website.
    1. How many attributes are there?
    2. How many attributes are numeric?
    3. List two numeric attributes that interest you.
    4. What is the difference between a numeric attribute and a categorical attribute?
    5. How many attributes are categorical?
    6. List two categorical attributes that interest you.
  2. Download the microdata from the 2001 Census from the Statistics Canada website. Open the microdata with your statistical software (e.g., data analysis or spreadsheet software). The microdata are available as separate files: one for each province, one for Canada, and one for the three territories combined. Write the name of the file you chose.
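
If your software does not label attribute types for you, a short script can give a first guess. The sketch below is only an illustration: the sample rows and column names are invented, not taken from the census file, and it assumes the data can be read as comma-separated text. Be careful with attributes stored as number codes (for example, 1 = Married); they look numeric to this check but are really categorical.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded microdata file.
# The column names and values are invented for illustration only.
SAMPLE = """age,province,total_income,marital_status
34,Ontario,42000,Married
19,Quebec,8000,Single
52,Alberta,61000,Divorced
"""


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify_columns(rows):
    """Label a column 'numeric' if every non-empty value parses as a number."""
    kinds = {}
    for name in rows[0]:
        values = [row[name] for row in rows if row[name] != ""]
        if values and all(is_number(v) for v in values):
            kinds[name] = "numeric"
        else:
            kinds[name] = "categorical"
    return kinds


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
kinds = classify_columns(rows)
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

To try this on a real file, you would replace `io.StringIO(SAMPLE)` with an opened file, then check each result against the attribute descriptions on the website.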

One-variable analysis:

  1. From the two numeric attributes you selected above, choose one variable to analyse. Write the name of the variable you chose.
    1. Using available software, calculate the mean, median and standard deviation of the values for the selected numeric attribute.
    2. Produce two different graphical representations of the distribution of your selected numeric attribute. If producing a histogram, try different bin widths. If the software permits, overlay the mean or median value on your graphs.
    3. Which graph displays the information more clearly? Explain your answer.
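
The calculations above can be sketched in a few lines of Python, in case you want to check your software's results. The ages below are invented for illustration and are not census values. The `histogram` helper is a hypothetical function written for this sketch; it only counts values per bin, so you can see how changing the bin width changes the shape of the distribution.

```python
import statistics

# Hypothetical ages, invented for illustration (not census values).
ages = [19, 23, 25, 31, 34, 34, 40, 47, 52, 65]

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
stdev_age = statistics.stdev(ages)  # sample standard deviation (divides by n - 1)


def histogram(values, bin_width):
    """Count values per bin; each bin covers start up to start + bin_width."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


def print_histogram(values, bin_width):
    """Print a simple text histogram (empty bins are not shown)."""
    print(f"Bin width {bin_width}:")
    for start, count in histogram(values, bin_width).items():
        print(f"  {start:3d} to {start + bin_width - 1:3d} | {'*' * count}")


print(f"mean = {mean_age:.1f}, median = {median_age:.1f}, "
      f"standard deviation = {stdev_age:.1f}")
print_histogram(ages, 10)
print_histogram(ages, 20)
```

Note that `statistics.stdev` treats the data as a sample. Use `statistics.pstdev` if you want the population standard deviation, and check which one your own software reports.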

Two-variable analysis:

Two numeric attributes: Select two numeric attributes that you think might be related. Write their names.

  1. When we have two numeric attributes, we can create a scatterplot of the two variables. We can then compute and overlay a line of best fit to help us judge how strongly the attributes are correlated. Create this graph.
  2. Please fill in the information below from your graph.
    1. Equation of the line of best fit
    2. Slope of the line of best fit
    3. y-intercept of the line of best fit
    4. Value of r² for the line of best fit
    5. Value of r, the correlation coefficient
  3. How strong is the correlation? Explain.
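
The values asked for above can also be worked out by hand or with a short script. The sketch below uses the usual least-squares formulas; the data (hours worked and weekly wages) are invented for illustration, and `best_fit` is a hypothetical helper name, not a function from any statistics package.

```python
import math

# Hypothetical (x, y) data, invented for illustration (not census values).
hours = [10, 20, 30, 40, 50]
wages = [150, 290, 460, 610, 740]


def best_fit(xs, ys):
    """Return slope, y-intercept and correlation coefficient r (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = best_fit(hours, wages)
print(f"Line of best fit: y = {slope:.2f}x + {intercept:.2f}")
print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```

A value of r close to 1 or to -1 suggests a strong linear relationship, and a value close to 0 suggests a weak one. Always look at the scatterplot as well, since r only measures how well a straight line fits.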

Two categorical attributes:

  1. Select two categorical attributes that you think might be related. Write their names.
  2. Create a graph to determine whether there seems to be any relationship (association) between these two attributes.
  3. I can tell that there is / is not a relationship between the two variables because…
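
One way to compare two categorical attributes is a two-way table of counts, converted to row percentages. The sketch below shows the idea with invented records; the attribute names and categories are assumptions made for illustration, not the labels used in the census file.

```python
from collections import Counter

# Hypothetical (sex, labour force status) records, invented for illustration.
records = [
    ("Female", "Employed"), ("Female", "Employed"), ("Female", "Employed"),
    ("Female", "Not employed"),
    ("Male", "Employed"), ("Male", "Employed"),
    ("Male", "Not employed"), ("Male", "Not employed"),
]

counts = Counter(records)                        # count of each (row, column) pair
row_totals = Counter(row for row, _ in records)  # total count for each row category


def row_percentages(counts, row_totals):
    """Convert pair counts to percentages within each row category."""
    table = {}
    for (row, col), n in counts.items():
        table.setdefault(row, {})[col] = 100 * n / row_totals[row]
    return table


table = row_percentages(counts, row_totals)
for row, cols in table.items():
    for col, pct in cols.items():
        print(f"{row:7s} {col:13s} {pct:5.1f}%")
```

If the percentages are similar in every row, the two attributes do not appear to be related. Large differences between rows suggest a relationship, which a stacked or side-by-side bar graph would also show.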

One categorical and one numeric attribute:

  1. Select one categorical and one numeric attribute that you think might be related. Write their names.
  2. Create a graph to determine whether there seems to be any relationship between these two attributes. (Hint: Plot the numeric attribute on the x-axis and use box plots.)
  3. What trends can you identify in your graph? Please explain these trends, if possible.
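
A box plot is drawn from five numbers: the minimum, first quartile, median, third quartile and maximum. The sketch below computes these for each category so that you can compare groups. The data (housing tenure and income) are invented for illustration, and `five_number_summary` is a hypothetical helper. Software packages use slightly different quartile rules, so your values may differ a little from what your software shows.

```python
import statistics

# Hypothetical (category, value) pairs, invented for illustration.
data = [
    ("Owner", 30000), ("Owner", 42000), ("Owner", 55000),
    ("Owner", 61000), ("Owner", 90000),
    ("Renter", 12000), ("Renter", 25000), ("Renter", 31000),
    ("Renter", 38000), ("Renter", 47000),
]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)


# Group the numeric values by category.
groups = {}
for category, value in data:
    groups.setdefault(category, []).append(value)

summaries = {name: five_number_summary(vals) for name, vals in groups.items()}
for name, (low, q1, med, q3, high) in summaries.items():
    print(f"{name}: min={low}, Q1={q1}, median={med}, Q3={q3}, max={high}")
```

Comparing the medians and the spread between Q1 and Q3 across categories is one way to describe the trends you see in your box plots.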

Developing a research question:

  1. Develop one research question that uses these microdata files and that would make an interesting subject for an analysis project. You can also use microdata from the Mathematics page to examine changes over time and geography by comparing 1991 and 2001 Census results for different provinces. In addition, you can compare detailed results from the 2001 Census microdata with over 1,700 variables of aggregate census data available for the same and more detailed geographic areas on E-STAT.