Nonlinearity in scientific productivity

During her rise, but before becoming a poker champion, Maria Konnikova was counselled by her coach that she was winning prize money in too many tournaments.

Wait, why wouldn’t she want to win prize money in every tournament? And what’s that got to do with a post about productivity in science? A loose answer to both questions: nonlinearity.

Maria’s initial goal in tournaments was to survive until enough other players had lost so that she reached the threshold, say the top 15%, to earn prize money. To reach this threshold, she was playing cautiously. Too cautiously, that is, for a realistic shot at the big money that goes to the top-placed finishers. Given how poker and its payouts work, a good player is better served by aiming high and winning a few large prizes (and hence incurring many failures) than by collecting many small wins.

Science poses the same conundrum. Instead of poker chips, we’re betting time. You can spend years on a high-risk, high-reward project and, if you’re lucky, you make a big breakthrough. Or you play it safe and produce incremental contributions.

Say you have five published papers. Ignoring the issue that citations are a poor metric for quality, would you rather your papers have citation counts of (A) 1000, 2, 1, 0, 0 or (B) 100, 100, 100, 100, 100? Scenario A has roughly twice the total of B, but suggests that four of the five papers are ignored. Measured by h-index (2 for A, 5 for B) or i10-index (1 for A, 5 for B), B is the better choice. If you picked B, would you change your mind if the first number was 10 000 rather than 1000?
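To make the comparison concrete, here is a minimal sketch that computes the total, h-index, and i10-index for the two scenarios. The function names are mine, and the definitions are the standard ones: the h-index is the largest h such that h papers have at least h citations each, and the i10-index is the number of papers with at least 10 citations.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# The two hypothetical citation records from the text.
scenario_a = [1000, 2, 1, 0, 0]
scenario_b = [100, 100, 100, 100, 100]

for name, record in [("A", scenario_a), ("B", scenario_b)]:
    print(name, sum(record), h_index(record), i10_index(record))
```

Raising the first number in scenario A to 10 000 changes the total but leaves both indices untouched, which is exactly the tension the question is poking at.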

These hypotheticals are a roundabout way to demonstrate that scientific productivity isn’t easily defined, let alone measured. That hasn’t, of course, stopped people from musing over it. (Programmers are also intrigued by the question of gauging productivity, and I’ll borrow liberally from their point of view throughout this post.)

Multiplicative productivity

Bill Shockley, in a 1957 paper, observed that publication rates of scientists at the Los Alamos Lab increased exponentially, not linearly, when ordered from least to most productive. In other words, a small number of scientists have exceptionally large publication rates. But why?

Shockley was famous for winning the 1956 Nobel Prize in physics. He was infamous for being paranoid to the extent of recording phone calls, and having very backward views on race. He was also fascinated by productivity and intelligence. Despite his various legacies, his hypotheses about productivity are hard to dismiss out of hand.

One hypothesis was that the complex task of producing a scientific paper can be broken down into eight separate tasks such as thinking of a good problem, writing adequately, and profiting from criticism (full list in the end notes). If, as he assumed, these abilities are independent, then their effect on overall productivity is to multiply together. A scientist need only be 50% better than average at each task to be about 25 times (i.e., 1.5^8 ≈ 25.6) more productive overall.
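The arithmetic behind that claim is short enough to check directly. This is only a sketch of the multiplicative assumption as described above, not Shockley's own formulation; the function name and the "additive" comparison are mine.

```python
import math


def multiplicative_productivity(task_factors):
    """Overall productivity if per-task factors multiply (the independence assumption)."""
    return math.prod(task_factors)


# A scientist who is 50% better than average at each of eight tasks.
factors = [1.5] * 8

overall = multiplicative_productivity(factors)
average = sum(factors) / len(factors)

print(f"multiplicative: {overall:.1f}x")  # product of the eight factors
print(f"simple average: {average:.1f}x")  # what a non-multiplying model would give
```

The gap between the two printed numbers is the whole argument: the same modest per-task advantage looks unremarkable if abilities average out and enormous if they compound.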

Joel Spolsky, best known for his role in creating Stack Overflow, speculates about a similar multiplicative idea at play when it comes to hiring software engineers. In his book Smart and Gets Things Done, he points out that finding a great engineer requires applications from a big talent pool. To get a big talent pool requires removing obstacles for the applicants. Examples of obstacles include applicants living in the wrong city, not having the right visas, or coming from the “wrong” schools. If removing each obstacle doubles the number of applicants, then the size of the talent pool grows exponentially with the number of obstacles removed: three obstacles removed means eight times as many applicants.

Both Shockley and Spolsky also commented on, and agreed about, the value for money of the most productive people. Shockley notes

The relationship of salary to productivity shows that rewards do not keep pace with increasing production.

And Spolsky notes

Great people are much, much more valuable than average people. In programming, they are three to ten times as productive, while only costing 20% or 30% more. And they hit high notes that nobody else can hit.

Factorial productivity

You know what grows faster than polynomials or exponentials? Factorials. Quick recap: factorials are denoted with an exclamation mark and, as an example, 4! = 4×3×2×1 = 24. This is the number of possible ways to order four elements, say the letters, A, B, C, and D. A small increase to 6! and the result is already 720. 10! is a bit over 3.6 million. Factorials are why you haven’t won the lottery.

Graphs of the nonlinear quantities x squared, 2 to the power of x, and x factorial
Though a little slow off the mark, factorials grow far faster than typical nonlinear quantities.
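The "slow off the mark" behaviour is easy to verify numerically. The following sketch tabulates the three quantities from the figure for small integers; the helper name is hypothetical.

```python
import math


def growth_table(xs):
    """Return (x, x squared, 2 to the power x, x factorial) for each integer x."""
    return [(x, x**2, 2**x, math.factorial(x)) for x in xs]


print(f"{'x':>2} {'x^2':>4} {'2^x':>5} {'x!':>8}")
for x, square, power, fact in growth_table(range(1, 11)):
    print(f"{x:>2} {square:>4} {power:>5} {fact:>8}")
```

For x up to 3 the factorial is no larger than either of the other two; from x = 4 onwards it leads, and by x = 10 it is in the millions while the others are still at 100 and 1024.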

Factorials factor into another of Shockley’s hypotheses (lame pun intended). Say one scientist can keep four distinct ideas in their mind, whereas another can keep six. The latter isn’t 6/4 or 1.5 times more likely to discover a connection between two ideas, but rather 6!/4! = 720/24 = 30 times more likely.
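As a quick check of that ratio, and of how fast it grows as the gap widens, here is a small sketch. It simply follows the reasoning as stated above (comparing the number of orderings of the ideas); the function name is mine.

```python
import math


def ordering_ratio(more_ideas, fewer_ideas):
    """Ratio of the number of orderings of two sets of ideas: more! / fewer!."""
    return math.factorial(more_ideas) // math.factorial(fewer_ideas)


print(ordering_ratio(6, 4))  # six ideas versus four
print(ordering_ratio(8, 4))  # eight ideas versus four
```

Holding two more ideas gives a factor of 30; holding four more gives a factor of 1680, even though the linear ratio has only gone from 1.5 to 2.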

This time, the parallel idea from the software engineering world comes from Frederick Brooks’s book The Mythical Man-Month, first published in 1975. Brooks’s premise was that developing robust, production-level software involves being aware of the possible interactions between functions in a program. The number of interactions grows combinatorially, to use his word, with the number of functions. If nothing else, someone who can keep track in their head of the role of an extra few functions can much more quickly recognise or prevent bugs.

The title of his book is another nod to nonlinear productivity. Taking the number of men and multiplying it by the number of months they work on the project will not give you an unambiguous value for the total output. The unit of a man-month is, despite its appeal to a project manager, a myth.

Are some people really that much more productive?

It’s easy to pick holes in Shockley’s algebraic formulations of scientific productivity. His choice of the eight separate tasks needed to put together a paper is rather arbitrary. Less arbitrary, but still debatable, is his assumption that productivity at each separate task multiplies. You could reasonably ask why someone who is 50% more productive at each task isn’t merely 50% more productive overall. And whether his specific list is still relevant 60 years later is open to question (as discussed in a comment thread elsewhere). But the gist of his ideas was simple, so of course they’re going to miss any kind of nuance.

That the truth about productivity is more nuanced is exactly the conclusion of Bill Nichols’s study The End to the Myth of Individual Programmer Productivity. One of his main findings was that productivity averages out. Many of his test subjects exhibited a large range in productivity between tasks. Seldom did he find programmers who were consistently productive. But note my use of seldom. He did admit to a few consistently productive programmers, who sound suspiciously like the scientists on the tail of the distribution that Shockley was talking about.

Perhaps the biggest assumption implicit in the arguments so far is that science (or programming) is an individual pursuit, and hence that productivity is an individual trait. While it may often feel this way, there are always other people involved. Maybe they present bureaucratic hurdles; maybe they allow unique opportunities; or maybe they’re a team member who doesn’t accomplish a lot on their own but acts as a force multiplier. Or as Google executive Urs Hölzle pointed out:

Your greatest impact as an engineer comes through hiring someone who is as good as you or better. Because over the next year, they double your productivity. There’s nothing else you can do to double your productivity. Even if you’re a genius, that’s extremely unlikely to happen.

This sort of increase in productivity with every hire sets up a self-reinforcing cycle: it encourages other great programmers to want to work at the same place. Paul Graham points out the nonlinearity that comes with this:

You won’t attract good hackers in linear proportion to how good an environment you create for them.  The tendency [of hackers] to clump means it’s more like the square of the environment.

Specialist or generalist?

Physicist Enrico Fermi has been called the last man who knew everything. Hyperbole aside, he died in 1954. In other words, science these days requires specialisation. It always has, of course, but more so in the 21st century. Yet the multiplicative idea of productivity mentioned above suggests that a single skill will not do. Scientists who cannot convey their theories in words are ineffective. As are prolific writers without substance. In discussing innovators, a group to which scientists belong, Scott Berkun summarises that

There are three kinds of people who are rare in this world: those who are excellent communicators, those who find interesting and useful ideas, and those who can convert an idea into a realistic plan. It’s exceptionally rare for one person to be good at all three.

Sound familiar? This is a lot like Shockley’s idea using three broad abilities instead of eight narrow ones.

So what’s a scientist to do? Double down on their specialisation in hopes of any scientific breakthrough—big or small—or cultivate a range of skills to better document and communicate those breakthroughs? My thoughts firmly favour the latter. As I wrote on the About page when I created this site five years ago: scientists with an appreciation of design and knowledge of the correct tools to use are much better placed to convey their work in journals, at conferences, and to each other. Or, as Leo Breiman put it, the trick to being a scientist is to be open to using a wide variety of tools.

End notes

I came across Shockley’s paper about productivity several years ago, but wasn’t aware of his many other claims to fame, if you will, until stumbling across his backstory in the book How Would You Move Mount Fuji?. The title alludes to one of many possible lateral and logical thinking puzzles used by employers during interviews for tech companies. These questions are intended as a surrogate test for the types of intelligence that translate to the job at hand, but are going out of fashion for good reason. Shockley was one of the early proponents of such questions. Yet, as retold in the opening chapter of the puzzle book, rather than being impressed when an applicant gave a quick, correct answer to what was expected to require tedious arithmetic, Shockley almost threw a tantrum. He did not want to be outsmarted, or even equalled. For the full story, see Broken Genius: The Rise and Fall of William Shockley, Creator of the Electronic Age.

Acknowledging they were a partial listing and not in order of importance, Shockley listed his eight separate tasks for publishing a scientific paper as the following: 1) ability to think of a good problem, 2) ability to work on it, 3) ability to recognize a worthwhile result, 4) ability to make a decision as to when to stop and write up the results, 5) ability to write adequately, 6) ability to profit constructively from criticism, 7) determination to submit the paper to a journal, 8) persistence in making changes (if necessary as a result of journal action).

Author: Ken Hughes

Post-doctoral research scientist in physical oceanography
