A better way to rank the scientific literature

Feeling like your scientific papers aren’t getting the attention they deserve? Wanna bump up your citation counts for the next decade? Then consider dying young. It apparently helps: a posthumous spike in recognition arises owing to the promotional efforts of colleagues.

This morbid example is but one of many arguments that citations in the scientific literature are not a true meritocracy. Another example: last month I hypothesised that many papers are cited only because they’re new, not because their content is new. It makes me think there’s a better way to rank references.

Scorning citation metrics is a favourite pastime of scientists (up there with scorning p values). The standard argument is that distilling a study’s quality to a single value is simplistic. But what if we double down? What if we focus more on numbers when it comes to citations?

The problem with citations? They’re all created equal. We never rank them, but maybe we should. Rather than the binary judgement of whether to cite a paper or not, let’s start scoring them. Or at least put them in a better order. Let’s recognise that a citation in the Discussion is worth two in the Introduction.

My epiphany came after reading a few of Scott Berkun’s books. In his books, he invokes a Ranked Bibliography, reasoning that

Traditional bibliographies provide little value. They obscure the relative value of prior works, and fail to indicate how the author used them (devoured, skimmed, or as a paperweight).

Rather than list the material he referred to in alphabetical order, he orders them in terms of how influential they were. It’s a simple, yet profound change.

Ranked Bibliographies could improve the way we rank the influence of scientists, but that’s arguably frivolous, and it isn’t the purpose here. Instead, the act of ranking could make interacting with the scientific literature more efficient for everyone.

What is a Ranked Bibliography?

Scott Berkun’s system is straightforward: for each book or article he cites, he counts up the number of notes he made from it during his research. He then simply sorts the bibliography by these numbers. (He sees the assumption that all notes influenced him equally as the method’s biggest flaw. I’ll note other possible issues later.)
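The counting-and-sorting step is simple enough to sketch in a few lines of Python. This is only an illustration: the input format (one source name per research note), the source names, and the function name `rank_bibliography` are all made up here, not taken from Berkun’s books.

```python
from collections import Counter


def rank_bibliography(note_sources):
    """Return (source, note_count) pairs, most-noted source first.

    note_sources: one entry per research note, naming the source
    the note was taken from (a hypothetical input format).
    """
    counts = Counter(note_sources)
    # Sort by descending note count; break ties alphabetically so
    # the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical notes: each string is the source a note came from.
notes = [
    "Jones 2010", "Smith 2000", "Jones 2010",
    "Davis 2015", "Jones 2010", "Davis 2015",
]
ranked = rank_bibliography(notes)
for position, (source, n_notes) in enumerate(ranked, start=1):
    print(f"{position}. {source} ({n_notes} notes)")
```

Like the original method, this treats every note as equally influential. Weighting notes differently would mean replacing the simple count with a sum of per-note weights.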

The Ranked Bibliography from Scott Berkun’s Myths of Innovation. Note the highlighted scores. Source: Scott’s Twitter.

I see two possible ranking systems for the scientific literature. One is to follow Scott’s method and assign a number or score to every reference. The other is to rely on a more holistic measure of relevance (forgoing a numerical scoring system). If either method seems too subjective, remember that we list authors on papers in subjective ways. If we listed authors on papers alphabetically, we’d lose all the implied information about who did the most work.

Since references are no longer listed alphabetically at the end, a ranking system requires a numbered reference format like Vancouver (possibly in conjunction with an author–year system). Something like the following:

Unlike Smith (2000)[37], we use the method of Jones (2010)[4], which has been used previously for … (Davis, 2015)[20].

In this example, Jones (2010) is the paper ranked fourth-most influential to the current study. Smith (2000) and Davis (2015) are still relevant, but much less so.
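To show how the rank numbers could be generated rather than typed by hand, here is a minimal Python sketch. It assumes the bibliography has already been ordered by influence. The three-entry list, the helper names `build_rank_lookup` and `cite`, and the use of square brackets as a plain-text stand-in for a superscript are all choices made for this illustration.

```python
def build_rank_lookup(ranked_keys):
    """Map each (author, year) key to its 1-based rank in the bibliography."""
    return {key: position for position, key in enumerate(ranked_keys, start=1)}


def cite(author, year, lookup, parenthetical=False):
    """Format an author-year citation followed by its rank in brackets."""
    rank = lookup[(author, year)]
    if parenthetical:
        return f"({author}, {year})[{rank}]"
    return f"{author} ({year})[{rank}]"


# Hypothetical bibliography, already ordered from most to least influential.
ranked_keys = [("Jones", 2010), ("Davis", 2015), ("Smith", 2000)]
lookup = build_rank_lookup(ranked_keys)

sentence = (
    f"Unlike {cite('Smith', 2000, lookup)}, "
    f"we use the method of {cite('Jones', 2010, lookup)}."
)
print(sentence)
```

Because the numbers come from the position in the ranked list, reordering the bibliography automatically updates every in-text citation.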

Problems that a Ranked Bibliography addresses

In-text citations would be weighted by relevance.
Every scientist knows the feeling of reading a paper, coming across an interesting statement, and then having to decide whether to pause reading so as to follow up on the associated reference. Sometimes this means flipping to the bibliography, checking the reference, and guesstimating from the title whether it’s relevant. Having a number indicating relevance right there in the text would aid this decision.

Inconsequential citations would be identified as such.
Perfunctory. It’s a word I learned while undertaking a cursory search of what’s been published about citations in the scientometrics[1] literature. It means ‘carried out with a minimum of effort or reflection’. A manual analysis of 30 physics papers back in 1975 classed 40% of citations as perfunctory. A much more recent study is more blunt:

Many fluff citations exist.

They’re referring to fill-in-the-blank citations often found in the Introduction. The pieces we know to cite whenever we mention a particular topic. The papers we can confidently accept by reading half the abstract.

While fluff citations are necessary for framing papers, a ranking system would rightly place these references at the low-relevance end of the bibliography.

The scientific literature would feel less daunting.
Reading a paper with 30–50 references can be discouraging if you’re unfamiliar with said references. It feels like there’s so much to learn before you’ll attain an adequate level of background knowledge. But what if only a couple of those references are central to the current study?

The advice of others can limit the need to sample everything. It’s like choosing music. In Rockonomics, Alan Krueger points out that Spotify has more than 30 million tracks (~200 years to listen to it all). To cope with such a vast collection, we rely on recommendations from friends. In a similar vein, a Ranked Bibliography would curate the literature, making it more feasible to be aware of key past results and new advances.

Ranked Bibliographies would also help when looking for follow-up studies. For example, I was recently looking for any papers that built directly on the one I’d just finished reading. It’d been cited 33 times, so I was left to wade through 33 titles and abstracts on the off chance that one or two were closely related. Such a search would be more efficient if the results were ordered by how influential the original paper was to each citing study. (This is different to ordering the follow-up studies by how many citations they each have, which is what Google Scholar does.)
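As a thought experiment, such a search might look like the following Python sketch. It assumes every citing paper publishes its bibliography in ranked order, which no search engine or journal currently provides. The paper titles, the data layout, and the function names are invented for this example.

```python
def rank_of(target, ranked_bibliography):
    """1-based position of the target paper in a citing paper's ranked bibliography."""
    return ranked_bibliography.index(target) + 1


def sort_citing_papers(target, citing_papers):
    """Order citing papers so those that rank the target highest come first.

    citing_papers: dict mapping a citing paper's title to its ranked
    bibliography (most influential reference first).
    """
    return sorted(
        citing_papers,
        key=lambda title: rank_of(target, citing_papers[title]),
    )


# Hypothetical citing papers and their ranked bibliographies.
citing_papers = {
    "Paper A": ["Other 1", "Other 2", "Target", "Other 3"],
    "Paper B": ["Target", "Other 1"],
    "Paper C": ["Other 4", "Target"],
}
ordered = sort_citing_papers("Target", citing_papers)
print(ordered)
```

Sorting by raw position favours papers with short bibliographies. Dividing the position by the bibliography’s length would be one way to compare papers with very different numbers of references.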

Problems with a Ranked Bibliography

Ranked Bibliographies work well for Scott Berkun’s books. He assesses many sources with the intent of writing a single book intended for popular consumption. Compared with the authors of scientific papers, he is less obligated to track down primary sources or give credit to the first person responsible for an idea or result. Given that, it’s reasonable to ask whether a Ranked Bibliography causes as many problems as it solves.

Subjectivity is the biggest problem:

  • If you’re writing several papers based on an overlapping set of references, how do you decide how much each one influenced each individual paper?
  • Without a clear, quantitative way to score or order references, different authors will order the literature in ad hoc, idiosyncratic ways, nullifying advantages gained from a ranking system.
  • You could even argue that ranking gives scientists more leverage to game the citation system (as it stands, that’s likely less of a problem than you might think).

Another problem relates to the Matthew Effect: scientists or papers that are already well known or cited can garner more recognition merely through their existing recognition (the rich get richer). A Ranked Bibliography may exacerbate this undesirable effect. This is the counterargument to my suggestion that a Ranked Bibliography would curate the literature. Continuing with the Rockonomics example, Krueger also points out that the popularity of artists and songs is highly susceptible to random perturbations subject to positive feedbacks. I mean, how else does one explain Drake being the most streamed artist of the decade? (Hint: it’s not his singing voice.)

Review papers would also benefit unduly from a Ranked Bibliography system. They are already overcited, and scoring a paper based on how many useful facts and figures it contains would only make this worse.

This was a thought exercise (mostly)

If I ever write a book, I’ll consider a Ranked Bibliography. But that’s the only scenario in which I’ll have a say over how the bibliography is formatted[2]. There’s too much inertia in the scientific publishing industry for something as esoteric as bibliography formatting to change. That said, machine learning and natural language processing may change the game. It’s already possible to automatically classify the way individual citations are being used (e.g., as background, as contrast, or as methodology). The startup scite, for example, classifies whether papers are being ‘mentioned’, ‘supported’, or ‘disputed’. Fair warning: if you try it on a paper of your own, don’t be disappointed when other papers citing yours merely mention it. Once. In the introduction. Perfunctorily.


1. Scientometrics is a weird blend of hard science and social science. On one hand, standard statistical analyses. On the other, turgid and pretentious writing. Take the following excerpt from an abstract in the journal Scientometrics:

This new synthesis is embodied in a citation classification system, the citation cube, with dimensions of normative compliance, symbolic consensus, and disinterestedness (self-citation).

Huh? (Yes, I realise that a scientist calling out writing for being overly academic is the pot calling the kettle black.)

2. In addition to adopting Scott Berkun’s Ranked Bibliography, books would all benefit from including other back matter that he adds for entertainment’s sake such as

  • Calendar time since project begun
  • Number of drafts
  • Completely abandoned outlines/drafts
  • Words in the book
  • Total words written (including those cut)
  • Book written with: quill pens, by oil lamp.
  • Book actually written with: Mac Office 2013 (almost as frustrating)



Author: Ken Hughes

Post-doctoral research scientist in physical oceanography
