In data visualization, simplicity is key to conveying information effectively. The most impactful charts are often those that strip away unnecessary complexity so the viewer can focus immediately on the essential insight, whether that is an outlier, a comparison, or a trend. Fewer visual elements mean more clarity and direct attention where it matters most.
Consider the common task of comparing two series across categories, such as the populations of two continents over time. Familiar bar chart approaches can fall short of providing immediate clarity. Let’s examine this with a practical example.
```r
library(tidyverse)
library(gapminder)
library(patchwork)

# Total population per continent and year, Americas and Europe only
d <- gapminder %>%
  filter(continent %in% c("Americas", "Europe")) %>%
  group_by(continent, year) %>%
  summarize(pop = sum(pop))

# Stacked bars (the default position for geom_col)
p1 <- ggplot(d, aes(year, pop, fill = continent)) +
  geom_col()

# Side-by-side (dodged) bars
p2 <- ggplot(d, aes(year, pop, fill = continent)) +
  geom_col(position = "dodge")

# Place the two plots in a single column with patchwork
p1 + p2 +
  plot_layout(ncol = 1)
```
In the stacked bar chart (top), working out the year in which the population of the Americas surpassed Europe’s takes guesswork. It appears to be somewhere between 1960 and 1980, but the exact point is hard to pin down at a glance. The dodged bar chart (bottom) is somewhat clearer, but it clutters the plot with twice as many bars, and the reader has to compare each pair mentally to judge the size of the difference. Scanning pair after pair is also tiring, so neither chart is ideal for quick, intuitive reading.
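If the exact crossover year matters, it can also be computed directly rather than read off a chart. The following is a minimal base R sketch; the helper name `first_crossover` and the toy numbers are hypothetical and are not gapminder figures.

```r
# Return the first label at which series `a` is larger than series `b`,
# or NA if that never happens. Inputs are assumed to be sorted by `label`.
first_crossover <- function(label, a, b) {
  stopifnot(length(label) == length(a), length(a) == length(b))
  idx <- which(a > b)
  if (length(idx) == 0) return(NA)
  label[idx[1]]
}

# Toy values for illustration only (not gapminder figures).
toy <- data.frame(
  year = c(1950, 1960, 1970, 1980),
  a    = c(10, 14, 19, 25),
  b    = c(15, 16, 17, 18)
)

crossover_year <- first_crossover(toy$year, toy$a, toy$b)
print(crossover_year)
```

With the data from this post, the same helper could be applied once `d` is reshaped to one column per continent, for example with `tidyr::spread(d, continent, pop)`, by passing the `year`, `Americas`, and `Europe` columns.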
Inspired by Mike Bostock’s work, a more streamlined alternative is the comparative bar chart. It consolidates the two related values into a single bar and highlights the difference between them with color-coded segments, which reduces visual noise and puts the comparison itself front and center.
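One way to read this description is that each bar is made of a neutral segment both values share, topped by a colored segment for whichever value is larger. The sketch below illustrates that idea in base R only, so it runs without extra packages. It is an assumption about the general technique, not necessarily how Bostock’s chart or any particular package implements it; the function name `comparative_segments` and the toy values are hypothetical.

```r
# Split two series into the three stacked segments of a comparative bar:
# the part both share, plus the excess of whichever series is larger.
comparative_segments <- function(a, b) {
  stopifnot(length(a) == length(b))
  rbind(
    shared   = pmin(a, b),      # height common to both series
    a_excess = pmax(a - b, 0),  # non-zero only where a is larger
    b_excess = pmax(b - a, 0)   # non-zero only where b is larger
  )
}

# Toy values for illustration only (not gapminder figures).
labels <- c("1950", "1960", "1970", "1980")
a <- c(10, 14, 19, 25)
b <- c(15, 16, 17, 18)

segs <- comparative_segments(a, b)

# One stacked bar per label: a neutral base, with a colored tip
# showing which series leads and by how much.
barplot(
  segs,
  names.arg   = labels,
  col         = c("grey80", "steelblue", "tomato"),
  border      = NA,
  legend.text = c("Shared", "a larger", "b larger"),
  args.legend = list(x = "topleft", bty = "n")
)
```

Each bar’s total height equals the larger of the two values, and at most one of the two colored segments is non-zero, so the color of the tip alone tells the reader which series is ahead.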
To make such charts easy to create, the compareBars htmlwidget for R was developed. The package simplifies the creation of comparative bar charts, offering a cleaner and more impactful way to present comparative data.
```r
library(compareBars)

# Reshape to one column per continent, then draw one bar per year
d %>%
  spread(continent, pop) %>%
  mutate(year = factor(year)) %>%
  compareBars(year, Americas, Europe)
```
The comparative bar chart makes the point at which the Americas’ population exceeded Europe’s immediately visible. It also shows the size of the population difference in each year far more clearly. The result is a cleaner and more compelling visualization for comparative data.
For those who want to tailor these charts further, the compareBars package offers various customization options. The README on GitHub describes the full range of features available for refining comparative bar charts to suit your data.