The Goldilocks level of abstraction in science

Science is full of abstractions. A line plot is an abstraction. A false-colour image is an abstraction. Even scientific notation like 6.0 × 10²³ is an abstraction. Abstractions like these are central to science, invoked all the time, and easy to understand. Other abstractions are far from simple and may cause more confusion than clarity.

Abstraction is a slippery slope. Things can quickly get out of hand. Let’s start with a simple abstraction like velocity. The units, metres per second or kilometres per hour, tell us what it is: how fast something moves. I imagine most scientists would still be comfortable if I increase the level of abstraction by calculating the acceleration (i.e., moving from m s⁻¹ to m s⁻²). But what if I had instead started talking about a quantity in m² s⁻¹, not m s⁻²? Diffusivity and viscosity are such quantities; they measure how quickly something spreads. All I did was change m s⁻¹ to m² s⁻¹ and I’ve taken a concept a kid can understand to something that may trip up an undergraduate physicist.

And that’s adding only a single extra level of abstraction. How about the following sentence:

The kinematic turbulent kinetic energy input can be represented by u*²cₚ where cₚ is the phase speed of the peak of the slope spectrum of the surface displacement field.

Notice the three “of the”. The authors (Craig and Banner) are piling abstraction upon abstraction.

Is such an information-dense sentence a problem? It depends. Without context, the sentence is impenetrable. If, however, the authors have built up the individual components piece by piece in the preceding text, then this concise statement may be surprisingly informative. (The paper as a whole is great, but the sentence above is a little laughable.)

Levels of abstraction

In Death by Black Hole, Neil deGrasse Tyson describes six levels of abstraction separating a star from how an astronomer learns about a star. Level 0 is the star itself, 1 is a picture of it, 2 is the light from that picture, 3 is the spectrum of that light, 4 is the pattern of lines in that spectrum, and 5 is the shifts in that pattern. David Deutsch illustrates the separation nicely by noting that astronomers these days never look up at the sky (except perhaps in their spare time).

Any scientist could easily come up with a similar multi-level chain of logic for their own work. When conversing with colleagues in the same field, we often do so in the realm of the 4th and 5th levels of abstraction. This is where many of the secrets that we’re trying to discover are hidden.

Any level of abstraction has the potential to distil a result to its essential elements. It can just as easily, however, cause your reader to stumble. This raises a question: if and when is abstraction worthwhile?

Even low-level abstractions can be problematic

It is easy to think something is simple because it is familiar, and vice versa.

Area, for example, is a straightforward concept, a level one or two abstraction. But that doesn’t necessarily make it intuitive. While reading a paper recently, I came across the quantity 300 000 km². Although I recognised this to be a significant portion of the ocean, I had little feel for its true size. Later in the paper, however, it was alternately described as approximately 1500 km long and up to 300 km wide. This presentation as a length × width, to me at least, was much more intuitive.

Of course, there are situations when area is more intuitive than length × width. Anyone looking to rent or buy a house quickly becomes comfortable thinking in terms of 100 m² or 1000 ft² rather than, say, 10 m × 10 m or 20 ft × 50 ft.
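As a quick sanity check on the ocean example, here is a minimal Python sketch that relates the quoted area to the quoted length and maximum width. Treating the region as sitting inside a 1500 km × 300 km bounding rectangle is my own assumption for illustration, not something the paper states, and the function names are hypothetical.

```python
def mean_width_km(area_km2: float, length_km: float) -> float:
    """Average width of a region, given its area and its length."""
    return area_km2 / length_km


def fill_fraction(area_km2: float, length_km: float, max_width_km: float) -> float:
    """Fraction of the length x max-width bounding rectangle that the region covers."""
    return area_km2 / (length_km * max_width_km)


# Values quoted in the text: 300 000 km^2, about 1500 km long, up to 300 km wide.
area, length, max_width = 300_000, 1500, 300

print(f"mean width: {mean_width_km(area, length):.0f} km")  # mean width: 200 km
print(f"fraction of bounding rectangle: {fill_fraction(area, length, max_width):.2f}")  # 0.67
```

The two descriptions are consistent: a region that is "up to" 300 km wide and averages about 200 km wide over 1500 km gives the stated area.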

How, then, should we present low-level abstractions?

Quick fixes for low-level abstractions

When you work with certain quantities long enough, you get a feel for what they mean and their relative size, as in the house-buying example above. If you don’t have a typical value to compare against, the quantities may as well be arbitrary numbers. By the way, this is something salespeople know well. If selling a car, say, they’ll start off with a high initial price in order to make every subsequently lower price seem like a comparatively good deal.

Baselines are key. Sometimes there are obvious reference points. In astronomy, there is the astronomical unit (AU), approximately the mean distance from the Earth to the Sun. Discussing distances in this unit not only saves having to think of numbers in the millions of kilometres, but it makes such large distances more intuitive. On the other end of the scale, atomic physicists measure in electron volts (eV). This avoids having to converse in really small numbers all the time (1 eV is 1.6 × 10⁻¹⁹ J).
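To make the point concrete, here is a small Python sketch that re-expresses values in these reference units. The conversion constants are the defined values of the AU and the eV; the two example inputs (a Sun–Jupiter distance of roughly 778 million km and a hydrogen ionisation energy of roughly 2.18 × 10⁻¹⁸ J) are rounded figures chosen for illustration, and the function names are hypothetical.

```python
# Defined values: 1 AU = 149 597 870.7 km, 1 eV = 1.602176634e-19 J.
AU_IN_KM = 149_597_870.7
EV_IN_JOULES = 1.602176634e-19


def km_to_au(distance_km: float) -> float:
    """Express a distance in astronomical units."""
    return distance_km / AU_IN_KM


def joules_to_ev(energy_j: float) -> float:
    """Express an energy in electron volts."""
    return energy_j / EV_IN_JOULES


# Rounded, illustrative inputs (assumptions, not values from the text).
print(f"{km_to_au(778e6):.1f} AU")       # 5.2 AU
print(f"{joules_to_ev(2.18e-18):.1f} eV")  # 13.6 eV
```

In both cases an awkward power of ten becomes a number you can hold in your head.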

If there isn’t an intuitive unit readily available, compare against something meaningful. It’s like talking about distances in terms of football fields or volume in terms of Olympic swimming pools. In my field of oceanography, diffusivity (the rate of spreading I mentioned earlier) can be made intuitive by referencing it to a widely used, canonical value of 10⁻⁴ m² s⁻¹.
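One way to do this in practice is to report a diffusivity as a multiple of the canonical value rather than as a bare number. The sketch below is only an illustration: the function name and the wording of the output are my own choices, and the sample inputs are made up.

```python
CANONICAL_DIFFUSIVITY = 1e-4  # m^2 s^-1, the reference value quoted in the text


def relative_to_canonical(kappa: float) -> str:
    """Describe a diffusivity (m^2 s^-1) relative to the canonical value."""
    ratio = kappa / CANONICAL_DIFFUSIVITY
    if ratio >= 1:
        return f"{ratio:g} times the canonical value"
    return f"1/{1 / ratio:g} of the canonical value"


# Made-up sample values, purely for illustration.
for kappa in (1e-3, 1e-4, 1e-5):
    print(f"{kappa:g} m^2 s^-1 is {relative_to_canonical(kappa)}")
```

A reader who has never met a diffusivity before can still tell at a glance whether the mixing is strong or weak relative to the usual benchmark.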

Like its magnitude, the sign of a quantity can also be unintuitive. In my own research, for example, I often come across ambiguous phrases that could be easily clarified with simple, intuitive labels. For example, terms like positive buoyancy flux or negative heat flux divergence are often just unnecessarily complex ways to describe a process that stabilises or warms the ocean. To be clear, the problem here isn’t the use of jargon; it’s being unnecessarily abstract.

Labelling a plot can reduce abstraction. Here, “warming” has been used to simplify the otherwise abstract quantity of a negative heat flux divergence, and vice versa for cooling.
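The same labelling idea can be applied wherever the numbers are reported, not just on a plot. Here is a minimal Python sketch that attaches the plain-language label to a heat flux divergence. It follows the sign convention described above (negative means warming); the function name and the sample values are hypothetical.

```python
def label_heat_flux_divergence(divergence: float) -> str:
    """Translate the sign of a heat flux divergence (W m^-3) into plain language.

    Convention taken from the text: negative divergence means warming,
    positive divergence means cooling.
    """
    if divergence < 0:
        return "warming"
    if divergence > 0:
        return "cooling"
    return "no change"


# Made-up sample values, purely for illustration.
for value in (-0.5, 0.0, 0.2):
    print(f"{value:+.1f} W m^-3 -> {label_heat_flux_divergence(value)}")
```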

On their own, these changes are all minor. Yet, small increases in objective complexity can produce big increases in perceived complexity. Each bit of cognitive effort you save the reader is a bit more they can expend on the real science, the truly abstract stuff.

Where abstraction prevails

In How to Read Water, Tristan Gooley cautions against abstracting away the beauty of water waves:

As soon as we start using terms like wavelength and period there is a risk that we feel the beauty leaching away, or as the ocean scientist Willard Bascom put it, there’s a danger that the study of the ocean falls into the hands of those who’ve never seen the sea.

This isn’t actually Gooley’s point though. He follows by noting

But try to befriend these terms, as they are only labels that will accelerate your ability to read waves.

Anyone who has studied physics knows that wavelength and period apply to more than just water waves. Learning about water waves can help you understand light, earthquakes, atoms, or springs. As Nikola Tesla put it (or at least as is attributed to him), if you wish to understand the Universe, think of energy, frequency and vibration. That’s the beauty of abstraction.

Building up to high-level abstraction

These days, there’s enough scientific literature to go around that few readers will sit down with a paper and work through everything themselves until they understand it. You may well have great theories or observations, but if your paper jumps straight to level 5, you run the risk of losing your reader before you’ve even started.

In my work, there’s a particular journal, the Journal of Fluid Mechanics, that is full of fascinating papers. But I don’t read many of them. At least in my opinion, papers in this journal are notorious for ramping up the levels of abstraction too quickly. Often by Figure 2, there’ll be plots with quantities of level 4 and 5 abstraction. That may be okay if I’m sufficiently familiar with the topic. If, instead, I happen to come across the paper during a search, I’ll often pass it up because the time investment is too great for a paper that may or may not be relevant. And I don’t know if it’s going to be relevant because it started off too high-level.

Ben Orlin sums up this sentiment nicely:

A scientist who can’t convey his thinking will end up on a lonely island of thought, whose ideas never reach other shores, while the scientist who can share her truth enjoys a hero’s welcome from the grateful crowd.

He actually said mathematician, not scientist, but it generalises easily. He cites the example of a 12-year-old in his math class helping to explain to everyone else how de Moivre’s formula works by focusing on a single case for an exponent of 2. Orlin’s own attempts to convey understanding had failed because he’d started with the general case for an exponent of n. His lesson: build up to the higher-level abstraction.

In a paper I’m currently writing, I build up to high-level concepts figure by figure. Figure 1 shows the platform and the oceanographic temperature sensors it contains. Figure 2 shows the temperature those sensors measured, together with the subsequently derived vertical gradient of temperature. Figure 3 shows spectra of the measured temperature. Figure 4 shows how those spectra change throughout the day. Figure 5 shows the turbulence quantities inferred from the spectra and temperature gradient. After that, the real science happens.

Claus Wilke shares this same thought in Fundamentals of Data Visualization:

I start with a figure that is as close as possible to showing the raw data, and in subsequent figures I show increasingly more derived quantities. Derived quantities (such as percent increases, averages, coefficients of fitted models, and so on) are useful to summarize key trends in large and complex datasets. However, because they are derived they are less intuitive, and if we show a derived quantity before we have shown the raw data our audience will find it difficult to follow. On the flip side, if we try to show all trends by showing raw data we will end up needing too many figures and/or being repetitive.

Where I stand on abstraction

I ended up studying physical oceanography because it wasn’t too abstract. Other possible avenues stemming from my physics undergrad seemed either too small (atomic physics) or too large (astrophysics) to fathom. I wanted to research something I could touch and feel, or at least measure with instruments that I can touch and feel. Ironically, though, physical oceanography is full of abstraction. Not only that, but the most interesting and satisfying concepts are the abstract ones.

I followed the same trajectory writing this post. Initially, my underlying theme was to be: “Hey, Joe Scientist, a lot of your abstraction is unnecessary. Rein it in. Keep it simple.” Yet I found it hard to substantiate this claim. Instead, I continually found myself recognising that science without abstraction would just be photos, poorly comprehended observations, and long-winded, ungeneralisable descriptions.

The only way to reconcile the arguments both for and against abstraction is to recognise that, like all good things, it is best in moderation.

Author: Ken Hughes

Post-doctoral research scientist in physical oceanography
