Storytelling with Data Visualization

Autodesk - technical meeting

Anabelle Laurent

November 30, 2021

1 / 41

Why is Data Visualization important? 📊

2 / 41

Why is Data Visualization important? 📊

  • Universal way to communicate information

  • Provides a clear and effective message

  • Helps find patterns and trends, and spot extreme values

  • Makes data memorable

  • Maintains the audience's interest

3 / 41

What makes a good visualization? 🤔

4 / 41

What makes a good visualization? 🤔

  • Reveals a trend or relationship between variables

  • Always has, at minimum, a caption, axes, scales, and symbols

  • Uses distinct and legible symbols (for example, through contrast)

  • Has a caption that conveys as much information as possible

  • No noise: keeps information to the minimum needed

  • Uses the correct graph type for the kind of data being presented

5 / 41

Disclaimer

This workshop does not provide the code behind its plots; all of them were made in R using RStudio (see the last slides for more details)

Artwork by @allison_horst

6 / 41

Visualizing distribution

Artwork by @allison_horst

7 / 41

Visualizing distribution: histograms

For plotting the distribution of a single quantitative variable

Try different bin widths for best visual appearance.

  • Small bin width -> peaky and busy histogram

  • Large bin width -> features might disappear
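
The effect of bin width can be tried out with a short sketch. This is not the workshop's code: it assumes ggplot2 and uses the `faithful` dataset that ships with R, and the two bin widths are arbitrary values chosen for illustration.

```r
library(ggplot2)

# `faithful` ships with base R; `eruptions` is a single quantitative variable.
base_plot <- ggplot(faithful, aes(x = eruptions))

# Small bin width: many narrow bars, a peaky and busy histogram.
hist_narrow <- base_plot + geom_histogram(binwidth = 0.05)

# Large bin width: a few wide bars, fine detail is averaged away.
hist_wide <- base_plot + geom_histogram(binwidth = 1)

# Count the bins each choice produces.
n_bins_narrow <- nrow(layer_data(hist_narrow))
n_bins_wide <- nrow(layer_data(hist_wide))
```

Printing `hist_narrow` and `hist_wide` side by side shows the trade-off; a value in between usually reads best.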

8 / 41

Visualizing distribution: density plot

Try different bandwidths for best visual appearance

  • Small bandwidth -> peaky and busy density

  • Large bandwidth -> features are smoothed out and the density might look Gaussian
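
A minimal sketch of the bandwidth trade-off, assuming ggplot2 and the built-in `faithful` dataset (not the workshop's own code). The `adjust` argument multiplies the default bandwidth; the values 0.2 and 3 are arbitrary.

```r
library(ggplot2)

base_plot <- ggplot(faithful, aes(x = eruptions))

# `adjust` scales the default bandwidth picked by the density estimator.
dens_small <- base_plot + geom_density(adjust = 0.2)  # peaky, busy curve
dens_large <- base_plot + geom_density(adjust = 3)    # very smooth curve

# The same idea in base R, to read off the bandwidths actually used.
bw_small <- density(faithful$eruptions, adjust = 0.2)$bw
bw_large <- density(faithful$eruptions, adjust = 3)$bw
```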

9 / 41

Visualizing multiple distributions

10 / 41

Visualizing multiple distributions

  • The peaks of the density plot are where there is the highest concentration of points

  • For several distributions, density plots work better than histograms.

11 / 41

Visualizing multiple distributions

12 / 41

Visualizing multiple distributions: ridgeline plot

13 / 41

Visualizing multiple distributions: ridgeline plot

A ridgeline plot shows the distribution of a numeric variable for several groups. It works well when there are many groups (at least 5-6) or when the distributions overlap each other.

14 / 41

Visualizing distributions: boxplot

A boxplot can summarize the distribution of a numeric variable for several groups

15 / 41

Visualizing distributions: boxplot

A boxplot does not show the number of observations.

16 / 41

Visualizing distributions: boxplot with jitter

Boxplots with jitter show:

  • the distribution of the data

  • whether the groups are balanced or unbalanced in number of observations.
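
As an illustration (not the workshop's code), the sketch below overlays jittered points on a boxplot with ggplot2. The data frame `d` is hypothetical, simulated so that group "C" has far fewer observations than the others.

```r
library(ggplot2)

# Hypothetical unbalanced data: group "C" has only 8 observations.
set.seed(1)
d <- data.frame(
  group = rep(c("A", "B", "C"), times = c(60, 40, 8)),
  value = c(rnorm(60, 10), rnorm(40, 12), rnorm(8, 11))
)

p <- ggplot(d, aes(x = group, y = value)) +
  # Hide the boxplot's own outlier points so they are not drawn twice.
  geom_boxplot(outlier.shape = NA) +
  # Spread points horizontally only; height = 0 leaves the values untouched.
  geom_jitter(width = 0.15, height = 0, alpha = 0.5)
```

The sparse cloud of points over group "C" makes the imbalance visible, which the box alone would hide.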

17 / 41

Visualizing distributions: boxplot with jitter

Avoiding overlapping points makes the plot easier to read

18 / 41

Visualizing distributions: violin plot

  • Violins are equivalent to density estimates

  • They are useful to represent bimodal data.
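
A small sketch of why violins help with bimodal data, assuming ggplot2. The data are simulated for illustration and are not from the workshop.

```r
library(ggplot2)

# Hypothetical data: one unimodal group and one bimodal group.
set.seed(42)
d <- data.frame(
  group = rep(c("unimodal", "bimodal"), each = 200),
  value = c(rnorm(200, 5), rnorm(100, 2, 0.5), rnorm(100, 8, 0.5))
)

p <- ggplot(d, aes(x = group, y = value)) +
  # The violin outline is a mirrored density estimate for each group.
  geom_violin() +
  # A narrow boxplot inside the violin, for comparison.
  geom_boxplot(width = 0.1)
```

The boxplot of the bimodal group is a single box, while the violin shows the two separate bulges.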

19 / 41

Visualizing associations among quantitative variables

20 / 41

Relationship between 2 numeric variables: scatterplot

21 / 41

Relationship between 2 numeric variables: scatterplot + linear fit

22 / 41

Relationship between 2 numeric variables: scatterplot + quadratic fit

⚠️ A linear fit is widely used, but it is not always the best fit; try a quadratic fit too.
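
One way to compare the two fits, sketched with ggplot2 and the built-in `mtcars` dataset (an assumption for illustration; the workshop's data and code are not shown in these slides).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  # Linear fit: a straight line.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "grey40") +
  # Quadratic fit: a second-degree polynomial in x.
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE, colour = "red")

# Fit the same two models directly to compare them numerically.
fit_linear <- lm(mpg ~ hp, data = mtcars)
fit_quadratic <- lm(mpg ~ poly(hp, 2), data = mtcars)

r2_linear <- summary(fit_linear)$r.squared
r2_quadratic <- summary(fit_quadratic)$r.squared

# AIC penalizes the extra term, so it is a fairer comparison than R-squared.
aic_table <- AIC(fit_linear, fit_quadratic)
```

R-squared never decreases when a term is added to a nested model, so a criterion such as AIC, together with a look at the plot, is a safer guide than R-squared alone.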

23 / 41

Relationship between 2 numeric variables: scatterplot

24 / 41

Multi-panel plots

Split a single plot using one variable with many levels

25 / 41

Multi-panel plots

Split a single plot using the combinations of two discrete variables.

26 / 41

Multi-panel plots

⚠️ Different scales across panels can lead to misinterpretation
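
The three multi-panel variants can be sketched with ggplot2 facets. This is an illustration on the built-in `mtcars` dataset, not the workshop's code.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# One variable: one panel per level of `cyl`, axes shared across panels.
p_wrap <- base_plot + facet_wrap(~ cyl)

# Two discrete variables: rows by `am`, columns by `cyl`.
p_grid <- base_plot + facet_grid(am ~ cyl)

# Free y scales: each panel gets its own y axis, so bar or point heights
# are no longer directly comparable between panels.
p_free <- base_plot + facet_wrap(~ cyl, scales = "free_y")
```

Shared scales are the default for a reason: with `scales = "free_y"` the reader has to check every axis before comparing panels.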

27 / 41

Bubble plot

A bubble plot is a scatterplot of 3 numeric variables: two on the axes and a third mapped to the size of the points
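
A minimal bubble plot sketch with ggplot2, using the built-in `mtcars` dataset as a stand-in (the workshop's own data and code are not given here).

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  # Semi-transparent points keep overlapping bubbles readable.
  geom_point(alpha = 0.5) +
  # Make bubble area, not radius, proportional to the value.
  scale_size_area(max_size = 10) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", size = "Horsepower")
```

`scale_size_area()` is used so that a car with twice the horsepower gets a bubble with twice the area, which avoids exaggerating differences.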

28 / 41

Tell a story with your data 📖

29 / 41

Tell a story with your data

Before visualizing your data, you must:

  • Know your audience

  • Know the level of data detail expected

  • Give enough context

  • Ask yourself: What do I want my audience to know or remember from the data I am presenting?

30 / 41

Tell a story with your data

Don't be repetitive, but be consistent (theme, color scheme, font size, etc.)

31 / 41

Tell a story with your data

Guide your audience by pointing out specific values

32 / 41

Tell a story with your data

Guide your audience by pointing out specific values

33 / 41

Tell a story with your data

Customize your plot using highlighting

34 / 41

Tell a story with your data

Customize your plot using highlighting + text
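
Highlighting plus a text label can be approximated with plain ggplot2 layers, as sketched below. This is a swapped-in technique for illustration, not the gghighlight/ggrepel approach suggested by the deck's library list; the dataset (`mtcars`) and the choice of which point to call out are assumptions.

```r
library(ggplot2)

# Keep the car names as a regular column so they can be used as labels.
car_data <- transform(mtcars, model = rownames(mtcars))

# The single observation to call out: the car with the highest mpg.
top <- car_data[which.max(car_data$mpg), ]

p <- ggplot(car_data, aes(x = wt, y = mpg)) +
  # Everything else stays in a muted grey.
  geom_point(colour = "grey70") +
  # The highlighted point is drawn on top, larger and in colour.
  geom_point(data = top, colour = "red", size = 3) +
  # A text label names the highlighted point.
  geom_text(data = top, aes(label = model), vjust = -1)
```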

35 / 41

Interactive graphics with ggplotly

[Interactive plot: Life expectancy vs GDP per capita in 2007. x-axis: GDP per capita (US$); y-axis: Life expectancy (years); legends: Population, continent (Africa, Americas, Asia, Europe, Oceania)]

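
A plot like the one on this slide could be built roughly as follows. This is a sketch, not the workshop's code: it assumes the gapminder and plotly packages listed at the end of the deck, and the mapping of population to point size is inferred from the legend titles.

```r
library(ggplot2)
library(gapminder)
library(plotly)

# Keep only the 2007 observations, one row per country.
gap_2007 <- subset(gapminder, year == 2007)

p <- ggplot(gap_2007,
            aes(x = gdpPercap, y = lifeExp, size = pop, colour = continent)) +
  geom_point(alpha = 0.7) +
  labs(
    title = "Life expectancy vs GDP per capita in 2007",
    x = "GDP per capita (US$)",
    y = "Life expectancy (years)",
    size = "Population"
  )

# Convert the static ggplot into an interactive plotly widget.
interactive_plot <- ggplotly(p)
```

Printing `interactive_plot` in RStudio or an R Markdown document gives hover tooltips, zooming, and legend toggling.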
36 / 41

Data visualization using an interactive web app

A case study: the ISOFAST web app

Problem Statement:

  • Make the most of the cumulative experiment data collected since 2006
  • Need for data-driven insights (an overview at the network level, not only at the farm level)

Development of ISOFAST

  • Audience: farmers, local agronomists, researchers
  • Easy-to-navigate user interface
  • Effective data visualizations
  • Economic analysis for adaptive decision making
37 / 41

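R libraries used for this presentation

library(ggplot2)      # plotting
library(dplyr)        # data manipulation
library(tidyr)        # data reshaping
library(gapminder)    # example dataset
library(gghighlight)  # highlighting parts of a plot
library(ggrepel)      # non-overlapping text labels
library(dygraphs)     # interactive time-series charts
library(plotly)       # interactive graphics (ggplotly)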
38 / 41

Resources to go deeper into Data Viz

39 / 41

Thank you for your attention

✉️ my email: alaurent@iastate.edu

Slides created via the R package xaringan.

41 / 41
