Clichés of scientific writing

Novel writers use an average of 100 clichés for every 100 000 words. Or about one every four pages. That’s what Ben Blatt found by comparing a range of novels against a list of 4000 clichés. How does scientific writing compare?

In one sense, scientific writing avoids clichés. A scientist isn’t going to write that their new results put the nail in the coffin of the outgoing theory, that they were careful to dot their i’s and cross their t’s so as to follow the methods of Jones et al. by the book, that Brown et al.’s finding is a diamond in the rough, or that two possible interpretations are six of one and half a dozen of the other.

In another sense, scientific writing is full of clichés. Our writing often feels like a fill-in-the-blanks exercise: the results of this study show X, these findings are in good agreement with Y, or Z is poorly understood and needs further study. Need more examples? Check out the Manchester Academic Phrasebank, a collection of phrases from the academic literature that are “content neutral and generic in nature.”

Perhaps I’m being too harsh. Maybe scientific writing isn’t full of dull, procedural, and formulaic expressions. Maybe it’s confirmation bias in that I’ve read hundreds of scientific papers, so it’s easy to recall at least a few such expressions.

This calls for some data. Specifically, an inspection of the text of 360 papers that I’ve collected over the years in my field of physical oceanography (published since 2000). I’ve used this set of papers before in a similar post. Combining an automated search with some manual intervention, I checked the 360 papers for a range of clichés: third-person constructs, hedging terms, overzealous assertions, and directives for future work.

The author dislikes third-person statements

It’s a myth that scientific writing should demonstrate dispassionate observation. But let’s say you do buy into the myth and want to pretend you’re somehow impartial and uninvolved in the experiments that you’re reporting. When it comes time to recognise your own role, you’ll be forced to describe yourself and co-authors in the third person as “the authors”. As in, for example, “the categories were selected by the authors”. This awkward phrasing appears in 6% of the papers (a small number, fortunately). And not only is it awkward, it can be ambiguous. Another 6% of papers use “the authors” to refer to other writers, not themselves.

The most common scenario for third-person references is phrases along the lines of “as far as the authors are aware, no studies have considered process X”. The second-most common usage of “the authors” is in abstracts, as if for some reason they need to be even more dispassionate? Though maybe a desire to remain neutral is not actually relevant here. Case in point: Acknowledgements, the one section of a scientific paper where personality is always allowed. Yet, if I include that section in my count, the 6% ramps up to over 30%. “The authors thank …”, “The authors wish to thank …”, “The authors acknowledge …”, etc. Why gratitude is often expressed in this odd manner beats me.

There’s no good reason to use “the authors” when “we” is simpler, shorter, and better. That’s why 96% of the papers use “we” at least once. (Six of the 14 that don’t use “we” are single-author papers, but none of these use “I” instead.) On average, the pronoun comes up 22 times per paper. One paper features 247 uses. Ironically, that paper has a single author (though it’s mathematical convention to use “we” regardless).

[Figure: “we”]
I’d make a joke about 50 or more uses of “we” being a sign of narcissism, except that I average 60 in my own papers.

Hedging our bets

“Extraordinary claims require extraordinary evidence”. A flip side of this oft-repeated aphorism is that if you have only ordinary evidence, you can make only ordinary claims. What’s a sure-fire way to ensure your claim is ordinary? Hedge.

Take the word “suggest”. There’s at least one use of suggest (or suggests, suggested, suggesting, suggestion) in 87% of the papers. On average the word shows up five times per paper, or about once for every three pages. When you get to using “suggest” more than once per page (as 10% of papers do), that’s a sign of overuse.
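For the curious, a count like this can be sketched in a few lines of Python. This is a minimal illustration, not the script behind the numbers above: the function name `word_family_stats`, the exact regular expression, and the toy texts are all assumptions made up for the example.

```python
import re

# Whole-word match for suggest, suggests, suggested, suggesting, suggestion(s).
# (Assumed pattern; it deliberately skips "suggestive".)
SUGGEST = re.compile(r"\bsuggest(?:s|ed|ing|ions?)?\b", re.IGNORECASE)


def word_family_stats(papers, pattern=SUGGEST):
    """Return (share of papers with at least one match, mean matches per paper)."""
    counts = [len(pattern.findall(text)) for text in papers]
    share = sum(c > 0 for c in counts) / len(counts)
    mean = sum(counts) / len(counts)
    return share, mean


# Toy stand-ins for full paper texts (hypothetical, not from the 360 papers).
toy_papers = [
    "Our results suggest mixing is weak. This suggestion is tested below.",
    "The data are consistent with theory.",
    "Suggesting otherwise would be premature; models suggested the same.",
    "We measure turbulence directly.",
]

share, mean = word_family_stats(toy_papers)
print(f"{share:.0%} of papers, {mean:.2f} uses per paper")
```

Dividing each paper’s count by its page count would give the uses-per-page figure, assuming page counts are available alongside the text.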

Close cousins of suggest are “consistent with” and “likely”. Both occur, on average, three times per paper. As it happens, both terms showed up at least once in 256 of the 360 papers. The stronger phrase “agrees with” (or agree without the s) is 10 times less common.

A different type of hedge is “believe”. This is used in 1/4 of papers, implying that at least 3/4 of us agree that science isn’t about “belief”. It’s about facts, evidence, theories, experiments. (Obviously, using the word “believe” doesn’t imply a scientist disagrees with the statement. And seldom is the word used more than once in a paper.)

This is important. That’s important. Everything is important.

We all strive to do important science. But in some sense, importance is a zero-sum game. If everything is important, then nothing is. But that doesn’t stop us from claiming importance, however tenuous it is.

“Important” or “importance” shows up in 96% of papers. That’s as often as “we”! (Though “we” is used three times more often when tallying up all uses.) Importance is referenced, on average, seven times per paper. But don’t use that number as a guideline for your own writing since it is skewed by a few large values. Instead aim for something closer to the mode of the distribution: three times per paper. Better yet, aim for a single thing being important.
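To see how a few large values drag the average away from the typical paper, here is a small sketch using Python’s standard `statistics` module. The counts are invented for illustration (they are not the real per-paper tallies) and were chosen so that the mean and mode land on seven and three.

```python
from statistics import mean, mode

# Hypothetical per-paper counts of "important"/"importance":
# most papers use it a few times, one uses it a lot.
counts = [0, 1, 2, 3, 3, 3, 3, 4, 5, 6, 9, 45]

avg = mean(counts)       # pulled upward by the single large value
typical = mode(counts)   # the most common count

print(f"mean = {avg}, mode = {typical}")
```

Drop the one paper with 45 uses and the mean falls to about 3.5, while the mode stays put. That is why the mode is the safer guideline here.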

[Figure: “importance”]

Like “important”, but more assertive, is “crucial”. This is used far less often: it shows up in only 20% of papers. And only 20% of that 20% uses it more than once. You might say that the comparatively limited usage of “crucial” is consistent with our propensity to hedge.

Leaving it for later

The cliché that something creates more questions than it answers often applies to science. Suggestions for future research are an acceptable approach to flesh out a discussion (though you shouldn’t end with them). Yet I was surprised at how few papers explicitly identify such suggestions.

“Future study/studies/work/research/experiment” showed up in 18% of papers. “Beyond the scope” showed up in 9%. “Cannot explain”, “do not explain” or “does not explain” showed up in 4%.
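Multi-word phrases like these need a little more care than single words, because text extracted from a PDF often breaks a phrase across lines. Below is a rough sketch of how the three searches might be written. The patterns, the helper `share_with`, and the toy texts are assumptions for illustration, not the search I actually ran.

```python
import re

# \s+ lets a phrase span a line break or repeated spaces.
FUTURE = re.compile(
    r"\bfuture\s+(?:study|studies|work|research|experiments?)\b", re.IGNORECASE
)
SCOPE = re.compile(r"\bbeyond\s+the\s+scope\b", re.IGNORECASE)
NO_EXPLAIN = re.compile(r"\b(?:cannot|do\s+not|does\s+not)\s+explain\b", re.IGNORECASE)


def share_with(pattern, papers):
    """Fraction of papers in which the pattern appears at least once."""
    return sum(bool(pattern.search(text)) for text in papers) / len(papers)


# Toy stand-ins for full paper texts (hypothetical, not from the 360 papers).
toy_papers = [
    "Future work should resolve the boundary layer.",
    "A full budget is beyond the scope of this study.",
    "Our model does not explain the observed asymmetry; future\nexperiments may.",
    "We present observations from two moorings.",
]

for name, pattern in [("future", FUTURE), ("scope", SCOPE), ("explain", NO_EXPLAIN)]:
    print(name, share_with(pattern, toy_papers))
```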

Here’s where I might acknowledge the shortcomings of my methods and note that in future work I’d check whether there are phrases equivalent to those above that I forgot and thereby excluded from my counts. But this is a blog post, not a scientific paper, so I won’t.

What isn’t cliché in scientific writing?

Among the 360 papers, there’s a single use of “gigantic”. I also recently came across the hedge “hopingly,” which is sufficiently rare that WordPress is underlining it as a spelling mistake as I type. But, of course, single words don’t really count. Real answers to the question of what isn’t cliché might be humour, contractions, or one-word sentences and single-sentence paragraphs. I look forward to the days when these become common.

Endnotes

Asides inspired by the main text, but that didn’t quite fit.

  1. Regular clichés, like those in the second paragraph, seldom occur within the body text of scientific papers. Yet they are common fodder for titles, as shown by Google Scholar searches for All in a day’s work, Back to square one, or Don’t judge a book by its cover.
  2. For a particularly notable academic cliché combination, consider one of the topic sentences in a well-known psychology paper: “To clarify the distinctive nature of our proposal it is useful to briefly consider prior research on overconfidence”. The first 17 of the 18 words are a generic framing of the only meaningful word in the sentence. (Do check out the paper though. Its topic, the illusion of explanatory depth, is fascinating and relevant to the practice of science in general.)
  3. The Academic Phrasebank notes on its homepage that it was designed for non-native speakers of English, but it is the native speakers that have ended up as the majority of its users.
  4. I acknowledge that the fill-in-the-blanks approach is a good way to get started writing, but some pushback against this approach is warranted.
  5. Like “we” and “important/importance”, another common word is “data”, which shows up, on average, 22 times per paper (“dataset” is included in this count). 2% of papers had triple-digit usage. Conversely, 22 of the 360 papers did not use the word. Of those 22 papers, 20 were focused on either theory or simulations.
  6. Awkward second-person references are worse than third-person. 10% of papers mentioned “the reader”, whereas <1% of papers mentioned “you” in any second-person sense. (That 10% excludes the boilerplate phrase used in 10 papers that “the reader is referred to the web version of this article” [for colour figures].)
  7. In a similar vein to “the author” and “the reader”, there’s the phrase “this paper”. This phrase is used in 60% of papers. On average, it was used 1.4 times per paper, which is reasonable. Four of the 360 papers, however, had double-digit uses of the phrase.
  8. I wanted to include stats for “possible/possibly” in the hedging section. But these words can be used in many ways other than hedging, so I left them out. Similarly, I excluded “critical” in the importance section as that has specific meanings in my field.
  9. The heading This is important. That’s important. Everything is important should be read in Oprah’s voice.
  10. Rather than proclaiming importance where it’s not due, this paper is more honest: “From a practical perspective, that result, of course, is only moderately interesting.”

Author: Ken Hughes

Post-doctoral research scientist in physical oceanography

5 thoughts on “Clichés of scientific writing”

  1. Dear Ken Hughes,

    I read your mildly amusing and well-written post entitled “Clichés of scientific writing” with glee.

    As for Cat Diamond’s question asked on 26 November 2019 “Why Can’t More Scientists Write Like Darwin?”, the following excerpt from my academic post entitled “Do Plants and Insects Coevolve? 🥀🐝🌺🦋” co-authored with Dr Craig Eisemann at https://soundeagle.wordpress.com/2016/08/17/do-plants-and-insects-coevolve/ can in part provide a good answer:

    … Unlike Charles Darwin who lived in the age of the “gentleman scholar” and enjoyed substantial financial inheritance and professional freedom to pursue wide-ranging interests from biology to geology, modern-day academics and researchers are increasingly specialized or vocationalized, blindingly honing their skillsets on pinpointing minutiae to outshine others in their respective microniches. Gone are the big narratives and grand syntheses, unless one has the time, fortitude and resources to become a maverick pursuing truly revolutionary research or going against prevailing trends to wield long and meandering strokes on the large canvass of a book (such as Darwin’s 365-page The Various Contrivances by Which Orchids are Fertilised by Insects), let alone a watershed multi-chapter magnum opus (as exemplified by Darwin’s 502-page On the Origin of Species)…

    In general, I would support any satisfying solutions to render scientific publications less cut and dry and much more engaging and holistic.
