Data Visualization: An introduction - Transcript
(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Data Visualization: An introduction")
Data Visualization: An introduction
Welcome to part one of a multi-part series on data visualization. This video will provide an introductory overview of data visualization, and how to use it to tell your story.
This video addresses the data visualization competency. By the end of this video, you should have a deeper understanding of data visualization, and how it can be used to present data in an interesting and aesthetically pleasing way.
We will go over when it should be used, and we will give you some examples of the different types of data visualization techniques that exist.
Steps of the data journey
This diagram is a visual representation of the data journey: from collecting the data, to cleaning, exploring, describing and understanding the data, to analyzing the data, and lastly to communicating to others the story the data tell.
Step 4: Tell the story
Data visualization can occur at different steps of the data journey, depending on what you're using it for. In this video, we'll be focusing primarily on how to present data in a way that helps tell the story.
(Diagram of the Steps of the data journey: Step 1 - Find, gather, protect; Step 2 - explore, clean, describe; Step 3 - analyze, model; Step 4 - tell the story. The data journey is supported by a foundation of stewardship, metadata, standards and quality.)
Data visualization is the graphical representation of information and data. It is a combination of art and science as it uses tools such as charts, graphs and maps to make trends and patterns that might be hidden in a large data set much easier to understand.
Why use data visualization?
But how does data visualization make trends and patterns easier to understand?
Vision is such an important part of how we experience the world. Perhaps because it's how we've always survived. How we've found food, avoided threats, created art that preserves our culture and histories. And since the brain absorbs and processes visual information faster than any other kind of stimulus, presenting information through graphics can be incredibly effective.
So it only makes sense that as technology has evolved, so has the way we present the information we're trying to share with the world.
(4 images where, starting from the left, an apple pie, cherry pie, blueberry pie and "other" pie are sorted with a squinting face with tongue out emoji as a 5th image on the far right.)
For example, think about the following question: What is the most popular kind of pie? If you really wanted to know the most popular type of pie in your hometown, you might decide to conduct a survey.
This survey would ask everyone in town what kind of pie is their favourite. Apple? Cherry? Blueberry? Some other flavour? And finally, an option for people who really just don't like pie at all. Once you've acquired your data, there are several ways to communicate the results.
Option 1: Text
The first option is text. You could consider creating a written report describing the figures that reads something like "Of the 100 people surveyed, 40 preferred apple pie, 30 preferred blueberry and 20 preferred cherry. Additionally, five people chose a flavour other than those in the list, and five said they didn't like pie at all."
Option 2: Table
(Image of a table where the left and right columns list the different pie flavours and the count of respondents preferring each flavour, respectively: "Apple = 40"; "Blueberry = 30"; "Cherry = 20"; "Other = 5"; "I don't like pie = 5"; "Total = 100".)
In this situation, where we're just trying to find out the most popular pie flavour, we might decide that reading a full analysis of the results is unnecessary.
This is where the option to receive the exact same results in a table could be preferable. When reading a table, it's all about the numbers. Here we can clearly see that most people prefer apple pie without having to take the time to read through a lot of text.
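As a minimal sketch of this option, the following Python snippet prints the survey counts quoted above as a plain-text table and picks out the largest count. The function names `format_table` and `most_popular` are illustrative assumptions and are not shown in the video.

```python
# Survey counts taken from the pie example in this transcript.
survey = {
    "Apple": 40,
    "Blueberry": 30,
    "Cherry": 20,
    "Other": 5,
    "I don't like pie": 5,
}


def format_table(counts):
    """Return the counts as a two-column plain-text table with a total row."""
    # Pad every label to the width of the longest one so the numbers line up.
    width = max(len(label) for label in list(counts) + ["Total"])
    lines = [f"{label:<{width}}  {count:>3}" for label, count in counts.items()]
    lines.append(f"{'Total':<{width}}  {sum(counts.values()):>3}")
    return "\n".join(lines)


def most_popular(counts):
    """Return the answer with the highest count."""
    return max(counts, key=counts.get)


print(format_table(survey))
print("Most popular:", most_popular(survey))
```

With the numbers aligned in one column, the largest value is easy to spot, which is the point the table option is making.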
So, a good thing to note here is that when you're trying to compare more than two numbers, you will probably want to look into presenting your data in a more visual way, rather than textual.
Option 3: Visual
(A series of images with 4 apple pies; 3 blueberry pies; 2 cherry pies; and half a pie for those who like other pies and the other half for those who do not like pies.)
A third way to present the results of our pie survey is without many words or numbers at all. Option three is where data visualization comes in. From this picture it's instantly clear that apple pie is the most popular.
Types of data visualization
(Simplified image of a series of different types of data visualizations: (left) Graphics; Charts; Maps; Tables; Pictographs; Infographics; Dashboards (Right).)
There are many different ways of presenting data visually, such as graphs, charts, maps, tables, pictographs, infographics and dashboards. On the next few slides we'll look at what each one is best at showing.
(Text on screen: Showing relationship between two things)
(Image of a scatter plot on display with the title on top: "Total revenue from ice-cream sales, 2019 ($CAD)". The vertical (y) and horizontal (x) axes represent the revenue ($CAD) and the temperature (Celsius), respectively.)
A scatter plot is great for showing the relationship between two values. In this graph we can clearly see the relationship between temperature on the horizontal axis and ice cream sales on the vertical axis. We can see how ice cream revenues increased with increasing temperatures.
(Text on screen: Showing trends through time)
(Image of a line graph on display with the title on top: "Canada's official poverty line". The vertical (y) and horizontal (x) axes represent the proportion of the population (%) and the year, respectively.)
A line graph is a good way to show how something changes over time. This one shows how the share of the population living below Canada's official poverty line has been declining in recent years, from 12.1% in 2015 to 8.7% in 2018.
(Text on screen: Showing a comparison between several things)
(Image of a bar chart on display with the title on top: "Cannabis use in the past three months by age, Canada - Fourth quarter 2019". The vertical (y) and horizontal (x) axes represent the proportion of cannabis users (%) and the age group (years), respectively. From left to right, the bars represent the age groups: "15 to 24"; "25 to 34"; "35 to 44"; "45 to 54"; "55 to 64" and "65 and over".)
A bar chart is better when you want to compare different groups of things. Here we compare the use of cannabis among Canadians by age group. The chart clearly shows that cannabis use is higher among those in the younger age groups compared with older age groups.
(Text on screen: Showing the composition of a whole)
(Image of a circular pie chart titled on the top: Six provinces cultivated "vinifera and French hybrid" grapes for winemaking in 2018. The pie chart is composed of 3 asymmetric slices.)
A pie chart is the perfect tool for showing the composition of a whole, or the distribution of something. Here, we see that in 2018 Ontario produced more grapes for winemaking than all the other provinces combined.
(Text on Screen: Putting data into geographical context)
(Image of the map of Canada where each province has a different gradient of blue representing the unemployment rate, where the darker blues represent a higher unemployment rate in percentage points. Dark regions are areas with no data.)
Here is an example of a map being used as data visualization. It shows how the job vacancy rate differs across provinces. The job vacancy rate for each province in Canada is indicated by the shading on the map.
(Text on Screen: Used to show many categories, and provide more detail and precision than many other data visualization methods)
(Image of a table where the left most column represents the age group; the middle and right major columns represent "All families with children" and "Total children in all families", respectively. Both major columns contain sub columns representing the years 2015; 2016 and 2017.)
Tables are used to show many categories and provide more detail and precision than many other data visualization methods. In this table we see the number of families with children compared to the total number of children in all families for different age ranges of children.
(Text on Screen: Simple but instantly interpretable)
((Reuse of the pie survey) A series of images with 4 apple pies; 3 blueberry pies; 2 cherry pies; and half a pie for those who like other pies and the other half for those who do not like pies.)
This data visualization from the pie example is a pictograph. A pictograph is the representation of data using images. This is one of the simplest ways to represent statistical data. The popularity of different kinds of pie is represented by the number of pies. In this pictograph, each pie represents 10 individuals. While a pictograph has very low precision, our brains interpret the message instantly.
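The scale described here, one pie per 10 individuals, can be sketched as a text pictograph. This is only an illustration under stated assumptions: the function name `pictograph_row`, the symbols `O` (a whole pie) and `c` (half a pie), and the choice to round each count to the nearest half pie are all hypothetical and not taken from the video.

```python
# Survey counts taken from the pie example in this transcript.
survey = {
    "Apple": 40,
    "Blueberry": 30,
    "Cherry": 20,
    "Other": 5,
    "I don't like pie": 5,
}

PEOPLE_PER_PIE = 10  # each whole pie symbol stands for 10 respondents


def pictograph_row(count, per_symbol=PEOPLE_PER_PIE, full="O", half="c"):
    """Return a row of symbols, rounding the count to the nearest half symbol."""
    # Integer arithmetic: number of half symbols, with ties rounded up.
    halves = (4 * count + per_symbol) // (2 * per_symbol)
    whole, extra_half = divmod(halves, 2)
    return full * whole + half * extra_half


for label, count in survey.items():
    print(f"{label:<16}  {pictograph_row(count)}")
```

Because every count is rounded to the nearest half pie, small differences between counts disappear. That is the low precision the narration mentions, traded for a picture that can be read at a glance.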
(Text on Screen: Used to tell a comprehensive data story)
(An image containing an infographic titled: "Family matters - information on the splitting of household tasks. Who does what?". The infographic contains facts and conclusions on the subject matter.)
An infographic is several data visualizations put together to tell a more comprehensive data story. Typically, an infographic portrays the state of something at a particular point in time. Like a poster.
In this example, several data points are put together to tell a story about who does the chores in a family. From this infographic we learn that some chores are done equally by men and women, like dishes, shopping and organizing the social life. While laundry and meal prep are more likely to be done by women, outdoor work is most likely done by men.
Finally, the infographic reveals that the distribution of tasks depends on who's in the labour force at the time.
(Text on Screen: Used to inform business decisions and are updated at regular intervals)
(An image containing a dashboard where tables, charts and graphics are used to display several issues related to human resources.)
A dashboard is several data visualizations put together, often to inform business decisions. Dashboards are usually updated regularly and show changes over time. The colour, size, and position of the individual graphics are used strategically to focus attention on different aspects.
This dashboard, for example, uses tables, charts and graphics to display information used to manage human resources.
How to choose the right visualization
The right visualization depends on several factors.
What type of data do you have? Are there relationships in the data? Or are they changing over time? Are you making comparisons or showing the composition of something? And who's your audience? What story do you want to tell them? Are differences by geographic region important to them? How much precision do they want or need? Is your audience making business decisions based on the information you're sharing? Or, is it simply to inform?
On the previous slides you saw some different types of data visualizations and what each one can be used for.
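One way to summarize those pairings is a simple lookup from purpose to visualization type. The sketch below is only an illustration: the pairings follow the slides in this video, but the purpose labels used as keys and the function name `suggest_visualization` are hypothetical, and a real choice would also weigh the audience and the story being told.

```python
# Purpose -> visualization type, following the slides in this video.
# The purpose labels (keys) are hypothetical shorthand, not official terms.
VISUALIZATION_FOR_PURPOSE = {
    "relationship between two things": "scatter plot",
    "trend through time": "line graph",
    "comparison between several things": "bar chart",
    "composition of a whole": "pie chart",
    "geographical context": "map",
    "detail and precision": "table",
    "instantly interpretable message": "pictograph",
    "comprehensive data story": "infographic",
    "regularly updated business decisions": "dashboard",
}


def suggest_visualization(purpose):
    """Return the visualization type the video pairs with the given purpose."""
    key = purpose.strip().lower()
    if key not in VISUALIZATION_FOR_PURPOSE:
        known = ", ".join(sorted(VISUALIZATION_FOR_PURPOSE))
        raise ValueError(f"Unknown purpose {purpose!r}. Known purposes: {known}")
    return VISUALIZATION_FOR_PURPOSE[key]


print(suggest_visualization("Trend through time"))
print(suggest_visualization("composition of a whole"))
```

A lookup like this is a starting point rather than a rule, since the same data can often support more than one of these purposes.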
Recap of Key points
(Text on Screen: Data visualization is the graphical representation of information and data.; Vision is an important part of how we experience the world.; There are many different ways of presenting data visually.)
In this video, you learned that data visualization is the graphical representation of information and data.
A picture truly is worth 1000 words. Just make sure you choose the right picture to accurately represent your data and effectively get your message across. Watch for more videos in this series featuring good practices for data visualization.
(The Canada Wordmark appears.)