In data visualization, the principle of “less is more” is paramount. An effective chart directs the viewer’s attention straight to the key insight, whether that is an outlier, a contrast, or a pattern. Simplicity in design enhances impact.
Consider these common bar chart examples generated using R and ggplot2:
library(tidyverse)
library(gapminder)
library(patchwork)

d <- gapminder %>%
  filter(continent %in% c("Americas", "Europe")) %>%
  group_by(continent, year) %>%
  summarize(pop = sum(pop))

p1 <- ggplot(d, aes(year, pop, fill = continent)) +
  geom_col()

p2 <- ggplot(d, aes(year, pop, fill = continent)) +
  geom_col(position = "dodge")

p1 + p2 + plot_layout(ncol = 1)
In the stacked bar graph (top), it is hard to pinpoint the year in which the population of the Americas surpassed Europe’s. You might estimate somewhere between 1960 and 1980, but a precise answer is difficult at a glance. The dodged bar chart (bottom) is slightly clearer, yet its many bars add visual clutter: the viewer has to compare each pair of bars and mentally work out the size of the difference. Neither chart is ideal for quick, impactful insights.
Mike Bostock’s approach points to a more streamlined alternative: the comparative bar graph. This technique combines both series into a single bar per category and highlights the difference between them. Replicating Bostock’s method in R initially required complex data manipulation and extensive coding (as detailed in this RPubs example), which made the need for simplification and abstraction apparent. This led to the development of compareBars, an htmlwidget for R.
The compareBars package offers a cleaner and more intuitive solution:
library(compareBars)

d %>%
  spread(continent, pop) %>%
  mutate(year = factor(year)) %>%
  compareBars(year, Americas, Europe)
With a comparative bar graph, the moment the Americas’ population exceeded Europe’s becomes instantly evident. Furthermore, the visualization provides a clearer understanding of the population magnitude difference year by year. This results in a more compelling and less cluttered representation of the data.
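To make the idea behind such a chart concrete, here is a minimal sketch of the same comparison done by hand with the tidyverse. It is an illustration only, not how compareBars is implemented: the column names `shared`, `excess`, and `larger`, the helper objects `wide`, `segments`, and `crossover`, and the grey overlay colour are all assumptions introduced for this example. Each year's bar is split into the part both continents share (the smaller of the two totals) and the excess that belongs to whichever continent is larger.

```r
library(tidyverse)
library(gapminder)

# Total population per continent and year, one column per continent
wide <- gapminder %>%
  filter(continent %in% c("Americas", "Europe")) %>%
  group_by(continent, year) %>%
  summarize(pop = sum(pop), .groups = "drop") %>%
  pivot_wider(names_from = continent, values_from = pop)

# Split each year's bar into a shared part and an excess part
# (hypothetical column names, chosen for this sketch)
segments <- wide %>%
  mutate(
    shared = pmin(Americas, Europe),
    excess = abs(Americas - Europe),
    larger = if_else(Americas > Europe, "Americas", "Europe")
  )

# First year in which the Americas total exceeds Europe's
crossover <- segments %>%
  filter(Americas > Europe) %>%
  summarize(year = min(year)) %>%
  pull(year)

# Draw the full bar coloured by the larger continent,
# then overlay the shared part in a neutral grey
p <- ggplot(segments, aes(factor(year))) +
  geom_col(aes(y = shared + excess, fill = larger)) +
  geom_col(aes(y = shared), fill = "grey70") +
  labs(x = "year", y = "population", fill = "larger")
```

Printing `p` draws the chart, and `crossover` holds the year the comparative bar graph makes visible. The htmlwidget wraps this kind of bookkeeping behind a single function call.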
For customization options and further details, explore the compareBars README on GitHub. Embrace the power of comparative bar graphs for enhanced data storytelling.