Reflecting on exploratory versus explanatory data visualization

2021-01-16

I still haven’t created examples for the #TidyTuesday project.

But while looking at other submissions and comparing them with the visualizations I was preparing to create, I gained some real insight into the difference between exploratory and explanatory data visualizations. It came from reflecting on why I liked certain examples more than others, and more than my own.

Exploratory data analysis, as the name implies, is about exploring the data. These figures can be quite complex and show a lot of data.

I noticed this faceted plot example. It is a nice faceted plot and cannot be understood with one look. It took me some time to read the legend and scan back and forth across all the years to understand its meaning.

This is what makes this a good exploratory plot. It invites the viewer to explore and think about the work and data.

Although it is complex, I think it is an exemplar of an exploratory plot, much like an infographic.

Here’s another good exploratory plot showing a network of the 300 most common transatlantic slave routes.

#TidyTuesday Week 25.
Network graph linking the 300 most common transatlantic slave routes. The routes are grouped according to random walks, highlighting some of the colonies of each nation. pic.twitter.com/QzH1EdHc02
— MissingNotAtRandom (@AtMissing) June 18, 2020

I really enjoyed this plot because of the various annotations scattered throughout the visual. They add meaning and make the plot easier to understand.

On the other hand, there are explanatory plots.

These are made with more thought and purpose behind what they are meant to show.

For example, the linked plot below is comparing the number of paintings acquired from a prolific artist, Joseph Mallord William Turner, versus everyone else.

I went ultra-simple for #TidyTuesday, but a new thing for me was using the {#glue} 📦 which I love. Code on my GitHub https://t.co/eMRb3GP0G0. A visualisation about the Tate's favourite artist. pic.twitter.com/EfzRddNAbm
— Jack Davison (@JDavison_) January 15, 2021

These plots are typically not complex. The plot above is a standard histogram of the kind you learn in middle or high school. However, it is very effective at telling you a “story” or message.

To me, it shows how prolific an artist Joseph Mallord William Turner was, and how many paintings the Tate Art Museum has acquired.

These points are immediately clear.

I wrote this post because, as I was creating my own visualizations for the #TidyTuesday project, I noticed that I didn’t feel as drawn to my own examples as to the others I found.

I then reflected on what kind of plot I was making and what insights or information I could learn from it. I realized I didn’t have a clear purpose in creating the plot other than to try out a particular ggplot2 extension package, ggalt.

Although I may be overthinking it, this single exploration into the #TidyTuesday project has reminded me of what makes a good visualization. I hope to finally participate, share, and continue to learn from making more visualizations.

Side note: a great resource on exploratory data analysis is NIST’s Engineering Statistics Handbook.

Eric Leung, Code and Data Learnings
