Line graphs are the Swiss army knives of data visualisation. They can be almost anything… which is both good and bad.
Line graphs are slow to interpret
Many graphs serve one clear purpose. Take the five graphs below:
Even without labels, it’s clear what role each of these graphs serves:
Pie chart—components of a total
Thermometer—progress toward a goal amount
Speedometer—percentage of the largest possible value
Histogram—distribution of values
Box plot—statistical summaries of several datasets
In other words, if I’m presented with one of the graphs above, I have an immediate head start on interpreting it. If, instead, I’m presented with a line graph, I’m forced to read the axis labels and limits first.
Deciphering text is the slow way to take in information. Shape is fastest, then colour, and only then text. This so-called Sequence of Cognition, popularised by Alina Wheeler, is something marketers need to know about.
I typically write 100–200 lines of code each time I develop a scientific figure that is destined for publication. This is a dangerous length because it’s easy to create a functioning mess. With shorter code fragments, it’s feasible to start over from scratch, and with thousands of lines of code, it makes sense to invest time upfront to organise and plan. But in between these extremes lurks the temptation to write a script that feels coherent at the time, but just creates problems for future you.
Let’s say you want to create a moderately complicated figure like this:
A script for this figure could be envisaged as a series of sequential steps:
1. Read data in from a CSV file
2. Remove any flagged data
3. Create four subplots
4. Plot the first line of data against time
5. Label the y axis
6. Set the y axis limit
7. Repeat steps 4–6 for the second and third lines of data
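The steps above could be sketched roughly as follows. This is a minimal illustration, not the script behind the figure: the column names (`time`, `a`, `b`, `c`, `flag`), the inline CSV text standing in for a file, the axis labels and limits, and the helper names are all assumptions made for the example. Wrapping the plot, label, and limit steps in one helper is just one way to express the "repeat" step. The fourth subplot is left empty because the list above describes only three lines of data.

```python
import csv
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical stand-in for the CSV file: time, three data columns, and a flag column
CSV_TEXT = """time,a,b,c,flag
0,1.0,10.0,0.1,0
1,1.5,11.0,0.2,0
2,9.9,99.0,9.9,1
3,2.0,12.5,0.4,0
"""


def read_data(file_obj):
    """Read every row into a dict of floats keyed by column name."""
    return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(file_obj)]


def remove_flagged(rows):
    """Keep only the rows whose flag column is zero."""
    return [row for row in rows if row["flag"] == 0]


def plot_line(ax, rows, column, ylabel, ylim):
    """Plot one column against time, then label and limit the y axis."""
    ax.plot([r["time"] for r in rows], [r[column] for r in rows])
    ax.set_ylabel(ylabel)
    ax.set_ylim(*ylim)


rows = remove_flagged(read_data(io.StringIO(CSV_TEXT)))
fig, axs = plt.subplots(4, 1, sharex=True)  # four stacked subplots

# Assumed labels and limits for the three lines of data
panels = [
    ("a", "A (units)", (0, 3)),
    ("b", "B (units)", (0, 15)),
    ("c", "C (units)", (0, 1)),
]
for ax, (column, ylabel, ylim) in zip(axs, panels):
    plot_line(ax, rows, column, ylabel, ylim)
```

To write the result to disk, something like `fig.savefig("figure.png")` would follow; the filename is again only a placeholder.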
Comments within code are harmless, right? They don’t affect run-time, so you might as well use them whenever there’s any doubt something is unclear.
I hope you aren’t nodding your head, because a liberal use of comments is the wrong approach. Not all types of code comments are evil, but many are rightfully despised by programmers as (i) band-aid solutions to bad code, (ii) redundant, or even (iii) worse than no comment at all.
The same is true for scientific figures and their captions. In fact, many of the rules discussed in the post Best Practices for Writing Code Comments remain valid when we replace comments and code with captions and figures, respectively.
Too many scientific figures are ugly. I see three possible reasons:
Laziness: scientists could make nice figures, but don’t put in the effort
Obliviousness: scientists are unaware their figures are ugly
Indifference: scientists care only about the data, not their presentation
Take the following published scientific figure (suitably disguised):
Let’s list the problems: (1) Space is poorly used and data are cramped. (2) Text is bold for no reason. (3) Multiple fonts are used. (4) Tick marks are barely visible. (5) Some labels don’t fit in their respective boxes. (6) Axis values are unnecessarily repeated. (7) Dashed and dash-dotted lines are ugly. (8) Mathematical symbols are not italicised.
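Several of the problems listed above can be addressed with a handful of plotting defaults. The sketch below uses matplotlib as an assumed tool (the software behind the original figure is not stated), and the data, axis labels, and panel layout are placeholders. The numbers in the comments refer to the problems in the list.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.rcParams.update({
    "font.family": "sans-serif",   # (3) one font family throughout
    "font.weight": "normal",       # (2) no bold text without a reason
    "axes.labelweight": "normal",
    "axes.titleweight": "normal",
    "xtick.major.size": 5,         # (4) tick marks long enough to see
    "ytick.major.size": 5,
    "xtick.major.width": 1,
    "ytick.major.width": 1,
    "mathtext.default": "it",      # (8) mathematical symbols in italics
})

# (1), (5) constrained layout spaces the panels so labels are not squeezed
fig, axs = plt.subplots(2, 2, sharex=True, sharey=True, constrained_layout=True)
for ax in axs.flat:
    ax.plot([0, 1, 2], [0, 1, 4], linestyle="-")  # (7) solid line; placeholder data
    ax.set_xlabel(r"$t$ (s)")
    ax.set_ylabel(r"$x$ (m)")
    ax.label_outer()  # (6) show axis values only on the outer edges of the grid
```

Defaults like these only go so far. Whether space is well used still depends on the data and on how many panels the figure really needs.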
“There is this scientific convention of: ‘You put the images on one side, then you put the text to decipher it on the other side.’” That’s Jonathan Corum, science graphics editor for the New York Times, politely critiquing one of the ways in which a typical scientific paper creates unnecessary work for the reader, or “cognitive overhead.”
Decipher is the key word above (and a word I’ll use again below). If deciphering is necessary, it will precede understanding, but that doesn’t mean it is necessary. “No one intends to build a product with large cognitive overhead, but it happens if there isn’t forethought and recognition for it.”
How many colours do you need to visualise data scored on a five-point scale?
If you went with the obvious answer of five colours, here’s what you get:
The green and grey figure wins in two ways. First, it tells a story: about a third of respondents view Wikipedia favourably. (Although there are other interpretations of the data shown, a good figure emphasises a single message.) Second, the grey and green version just looks better.
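A two-colour version of a five-point scale might be sketched as below, assuming matplotlib. The response shares are made up for illustration and are not the data behind the figure. The category names and the choice to colour only the two favourable categories green are likewise assumptions about how such a figure could be built.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up shares (percent) for illustration only; they sum to 100
SHARES = {
    "Very unfavourable": 10,
    "Unfavourable": 17,
    "Neutral": 40,
    "Favourable": 22,
    "Very favourable": 11,
}
FAVOURABLE = {"Favourable", "Very favourable"}
GREEN, GREY = "tab:green", "0.75"


def colour_for(category):
    """Green for the categories the figure should emphasise, grey for the rest."""
    return GREEN if category in FAVOURABLE else GREY


fig, ax = plt.subplots(figsize=(6, 1.5))
left = 0
for category, share in SHARES.items():
    # Stack each category end to end along a single horizontal bar
    ax.barh(0, share, left=left, color=colour_for(category), edgecolor="white")
    left += share

ax.set_xlim(0, 100)
ax.set_yticks([])
ax.set_xlabel("Share of respondents (%)")

favourable_share = sum(SHARES[c] for c in FAVOURABLE)
```

With these invented numbers, `favourable_share` is 33, so the green segments alone carry the "about a third" message and the grey segments recede into context.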
“Everything should be made as simple as possible, but no simpler,” said Einstein. Except, he didn’t. His version of the quote was four times longer.
I’m not surprised that it took a non-scientist to paraphrase and create the short, popular version. As scientists, we are not accustomed to brevity. We want to provide every detail. We read papers filled with columns of 10pt text. We construct figures with dozens of lines and colours. We spare no bit of white space when we design posters. And don’t get me started on logos for scientific campaigns (long story short: too many elements, too many colours, and too literal).
We lack minimalism.
You may argue that detail, nuance, and chains of logic—hallmarks of science—are not easily reduced to 280 characters or a sexy soundbite. I don’t disagree. But there are still aspects of minimalism we should embrace.
Web developers and programmers have a lot riding on the quality of their product. If it behaves in an unusual manner, it’ll frustrate users. If it’s unattractive, it won’t attract users. If it’s no good, it will lose users. Users are the primary concern. Consequently, there are fields dedicated to this concern: User Interface (UI) and User Experience (UX).
Because web developers and programmers have a much wider potential audience than scientists, they have a better handle on the importance and behaviour of the end user. Scientists could learn a thing or two about UI/UX.
A computer is a better artist than I am. If I can tell it what to draw, it will produce attractive results. To make a nice schematic, the hardest part is to tell the computer what I want to draw. Fortunately for us so-called left-brain types prevalent throughout the sciences, a familiarity with scientific software can overcome a lack of artistic talent, allow rapid iteration of a design, and even provide creative inspiration.
Invoking my scientific software skills, I am able to produce elegant figures: