Inspired by 250 things an architect should know but 60% less ambitious
-
Anscombe’s quartet
Four distinct datasets (x vs y) that produce the same summary statistics (mean, variance, correlation coefficient, and line of best fit)
-
HSL colour space
A colour space that defines colours in terms of their Hue (e.g., red or blue), Saturation (vivid to washed out), and Lightness (white to black)
-
Whitespace
The area within a design (website, poster, figure, etc.) that lacks text, images, or other elements
-
HARKing
A questionable approach to research: Hypothesising After the Results are Known
-
Fermi problems
Problems in which an answer cannot be estimated outright but is instead derived as the product of more easily estimated quantities (e.g., how many grains of rice are eaten across the world every year?)
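A minimal sketch of the rice example in Python. Every input below is an illustrative order-of-magnitude guess (an assumption, not sourced data); the point is only that the answer comes from multiplying quantities that are each easier to guess.

```python
# Fermi estimate: grains of rice eaten worldwide per year.
# Every input is a rough guess (an assumption), not a measurement.
world_population = 8e9            # people, roughly
fraction_eating_rice = 0.5        # assumed share of people eating rice on a given day
grams_per_person_per_day = 100    # assumed daily portion of dry rice, in grams
grains_per_gram = 50              # assumed, i.e. about 0.02 g per grain
days_per_year = 365

grains_per_year = (world_population * fraction_eating_rice
                   * grams_per_person_per_day * grains_per_gram
                   * days_per_year)
print(f"about {grains_per_year:.0e} grains per year")
```

Changing any single guess by a factor of two changes the answer by the same factor, which is why such estimates are usually trusted only to within an order of magnitude.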
-
Use cases of .png and .jpg images
The JPG format is optimised for photos, whereas PNGs are for graphs and diagrams
-
That vs which
That and which, although similar, have opposing implications about whether a clause is restrictive or not
-
McNamara Fallacy
Basing a decision on only numbers or other objective measures without reference to any qualitative factors
-
Version control
A tool for tracking and recording all changes to software and other digital files as they evolve
-
Serial position effect
The human tendency to better remember what happened at the start and the end and forget what happened in the middle
-
Zenodo and Figshare
Online repositories for datasets, code, and other research output
-
Your carbon footprint
A typical person living in a western country will have an annual footprint of 5–20 tonnes CO2
-
Matthew effect
Well-known scientists get cited more often than lesser-known ones, leading to a positive feedback loop
-
Hand-waving solutions
A metaphor for an answer that might gloss over details, be vague, or rely on many approximations
-
Logarithmic scales
Scales that increase geometrically (e.g., 1, 2, 4, 8, 16, …) rather than linearly (2, 4, 6, 8, …)
-
“Data” is a plural
It can sound odd, but “data were collected” is correct and “data was collected” is not
-
Left-branching sentences
A sentence structure to avoid because the initial words only make sense as the sentence nears its end
-
Regression to the mean
A statistical tendency for outliers in an initial experiment to deviate less in a subsequent experiment
-
ImageMagick
Software for all manner of image manipulations and conversions that can be run from the command line
-
Bash shell
The default command-line interface on most Linux systems
-
Butterworth filters
A widely used approach for smoothing time-series data
-
Golden ratio
The value 1.618…; an aesthetically pleasing aspect ratio for a rectangle among many other claims to fame
-
Types of map projections
Flattening the earth to a two-dimensional image can be achieved in numerous ways, each with its own pros and cons
-
DOI and PMID
Unique digital identifiers that can point to publications, datasets, software, and more
-
Edward Tufte
An early name in data visualisation and author of several books on the topic
-
Widows and orphans
A line at the beginning or end of a paragraph that is separated from the rest by a page break
-
Construction cost of the Large Hadron Collider
One of the most expensive scientific experiments took ~3 billion Swiss Francs to build (or ~5 billion US dollars back in 2001)
-
Resolution of an electron microscope
Electron microscopes can resolve objects as small as 0.1 nanometers
-
Adding epicycles
Tweaking a fundamentally flawed theory in a last-ditch effort to make it explain observations
-
Why governments fund basic scientific research
Among many reasons, basic scientific research (i) lowers the barrier for firms that want to develop new products and (ii) develops skilled scientists and engineers who can capitalise on research undertaken elsewhere
-
William Shockley’s thoughts on productivity
Shockley speculated that a small number of scientists can be exponentially more productive in total because the creation of a scientific paper is the combination of many individual tasks, and productivity in each of these tasks multiplies together to give overall productivity
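A quick illustration of the multiplication, as a sketch only: the number of tasks (8) and the 50% per-task advantage are hypothetical values chosen for the arithmetic, not figures taken from Shockley.

```python
# Hypothetical numbers: a scientist who is 50% better than a peer
# at each of 8 independent tasks that go into producing a paper.
advantage_per_task = 1.5   # assumed
number_of_tasks = 8        # assumed

# Per-task advantages multiply rather than add.
overall_ratio = advantage_per_task ** number_of_tasks
print(f"overall productivity ratio: {overall_ratio:.1f}")
```

A modest edge at every step compounds into a ratio of roughly 25 under these assumptions.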
-
Kerning
Adjusting the spacing between individual letters in text to improve aesthetics
-
How last authorship varies across fields
Depending on scientific field, the last author either did the least work, is the group leader, obtained funding for the project, or has a surname near the end of the alphabet
-
Project Jupyter
An open source project that simplifies and promotes interactive use of many programming languages
-
Uncertainty propagation
The uncertainty of a derived quantity (e.g., kinetic energy derived from speed and mass) can be calculated from the uncertainties of the input quantities following simple, though sometimes tedious, arithmetic
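For the kinetic energy example, a minimal sketch using the standard first-order formula, which assumes the uncertainties in mass and speed are small and independent. The function name and the input values are made up for illustration.

```python
import math

def kinetic_energy_with_uncertainty(mass, sigma_mass, speed, sigma_speed):
    """Return (E, sigma_E) for E = 0.5 * m * v**2, assuming independent errors."""
    energy = 0.5 * mass * speed**2
    # Relative uncertainties add in quadrature; because speed is squared,
    # its relative uncertainty counts twice.
    relative = math.sqrt((sigma_mass / mass)**2 + (2 * sigma_speed / speed)**2)
    return energy, energy * relative

# Illustrative inputs: m = 2.0 ± 0.1 kg, v = 3.0 ± 0.3 m/s
energy, sigma_energy = kinetic_energy_with_uncertainty(2.0, 0.1, 3.0, 0.3)
print(f"E = {energy:.1f} ± {sigma_energy:.1f} J")
```

Here the speed term dominates: a 10% uncertainty in speed contributes 20% to the energy, against 5% from the mass.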
-
SSH (Secure Shell)
The standard way to access a remote server via the command line
-
Difference between a hyphen and a minus sign
Although similar, they should not be confused; a hyphen (-) is a short dash used to combine words, whereas a minus sign is longer (−)
-
Optimal number of characters per line
A line of text should have 60–70 characters (counting spaces) for a single-column layout and 40–50 for multiple columns (see page 32 of Detail in Typography)
-
The .eps file type
A predecessor to PDF that was developed in the late 1980s and is almost obsolete
-
Stroke and fill
For line drawings, the edge is known as the stroke and the interior is known as the fill
-
The Greek alphabet
The order doesn’t matter, but knowing the individual letters is worthwhile
-
Text anti-aliasing
The smoothing of text to improve its appearance (especially relevant at coarse resolution)
-
Triptychs
A three-panel image or collection of images (and an easy way to create an attractive title slide)
-
Active voice
Better than passive voice in most cases
-
Fast Fourier transform
An algorithm that makes much of modern technology possible
-
Pseudoscience
Statements and methods purportedly grounded in science but obviously flawed
-
ORCID
A unique digital identifier for a researcher that is linked with their scholarly works
-
Transistors
Your phone likely has billions of them
-
RAM
A computer’s short-term memory in a sense (distinct from the long-term memory that is the hard drive)
-
Effective cost of a night’s worth of observation from a large telescope
About $50 000
-
Functions in programming
One of the building blocks of any programming language that, typically, (i) takes one or more inputs, (ii) processes those inputs, and (iii) returns an output
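A small Python example of the three parts; the function name is arbitrary and chosen only for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """(i) Take an input, (ii) convert it, (iii) return the result."""
    return celsius * 9 / 5 + 32

boiling_point_f = celsius_to_fahrenheit(100)
print(boiling_point_f)
```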
-
Argument from authority fallacy
The incorrect assumption that a claim is true because it is coming from an authority figure
-
Preregistered studies
Studies in which the methodology and hypothesis are published before data are obtained
-
Strawman argument
Misrepresenting a claim or changing its context so as to make it easier to argue against
-
The amount of freely available satellite data
NASA, for example, currently has about 30 active earth-observing satellites producing about 30 TB of data each day
-
For loops
The simplest way in most programming languages to make a computer do something again and again
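As a minimal Python sketch, the loop below repeats the same step for each number from 1 to 5:

```python
squares = []
for n in range(1, 6):      # n takes the values 1, 2, 3, 4, 5 in turn
    squares.append(n * n)  # the repeated step: square n and store it
print(squares)
```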
-
Illusion of explanatory depth
Most people are overconfident in their understanding of a complex phenomenon or procedure until they try to explain it step by step
-
How regression works
Calculating a line of best fit is one of those things everyone should do manually at least once to understand the procedure that can otherwise be a black box
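One way to do it by hand is ordinary least squares, sketched below in plain Python with no libraries. The function name and the toy data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the means
    return slope, intercept

# Toy data lying exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```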
-
Bayesian statistics
An approach to statistics in which probabilities are continually updated as new information is obtained
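The updating step is Bayes' rule. A minimal sketch follows, with hypothetical numbers: a hypothesis with a 1% prior probability, and evidence that is seen 90% of the time if the hypothesis is true and 10% of the time if it is false.

```python
def update(prior, likelihood_if_true, likelihood_if_false):
    """Return the posterior probability of a hypothesis via Bayes' rule."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

belief = 0.01                  # assumed prior
for _ in range(2):             # two independent pieces of supporting evidence
    belief = update(belief, 0.9, 0.1)
    print(round(belief, 3))
```

Each posterior becomes the prior for the next piece of evidence, which is the sense in which the probability is continually updated.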
-
System 1 and 2 thinking
Two distinct ways of thinking: system 1 is fast and driven by intuition and emotion, whereas system 2 is slower and more deliberate
-
Pasteur’s quadrant
Use-inspired basic research, or the view that basic and applied research aren’t mutually exclusive
-
Sayre’s law
In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake
-
Planning fallacy
The tendency to underestimate the time needed to complete a task (e.g., writing a scientific paper) even with prior experience in the same or similar tasks
-
Floating point numbers
The system used by computers that allows a small number of bits (a zero or one) to represent a wide range of numbers (e.g., 64 bits can be used to closely approximate any number, positive or negative, up to 1.8×10^308)
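A short Python check of both the range and the "closely approximate" caveat, assuming the usual 64-bit IEEE 754 floats that Python uses on essentially all platforms:

```python
import math
import sys

largest = sys.float_info.max         # largest finite 64-bit float
print(largest)

# 0.1, 0.2, and 0.3 have no exact binary representation,
# so the sum differs from 0.3 in the last bits.
print(0.1 + 0.2 == 0.3)
print(math.isclose(0.1 + 0.2, 0.3))  # compare with a tolerance instead
```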
-
The Dobzhansky Template
A format coined by scientist-turned-filmmaker Randy Olson that aims to drill down to the essence of an idea: Nothing in ___ makes sense except in light of ___ (e.g., nothing in biology makes sense except in light of evolution)
-
Newman design squiggle
A visual metaphor for the design process that works equally well for the process of doing science
-
Gestalt Laws
Design laws, grounded in psychology, for how humans perceive combinations of objects or elements
-
Einstellung effect
The tendency to rely on approaches that worked in the past even when better methods are available
-
Bike-shedding
Also known as the Law of Triviality, bike-shedding is placing undue emphasis on minor matters, such as the design of the bike shed when planning a nuclear power plant
-
Simpson’s Paradox
Subsets of a dataset, all of which have a negative statistical trend, can still produce a positive trend in the overall dataset
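A toy demonstration in plain Python. The two groups below are invented so that each has a slope of −1 while the pooled data slope upward.

```python
def slope(points):
    """Least-squares slope of y against x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Made-up data: within each group, y falls as x rises.
group_a = [(1, 3), (2, 2), (3, 1)]
group_b = [(6, 8), (7, 7), (8, 6)]

slope_a = slope(group_a)
slope_b = slope(group_b)
slope_all = slope(group_a + group_b)   # pooled data
print(slope_a, slope_b, slope_all)
```

The pooled trend is positive only because group B sits higher on both axes than group A, so the difference between groups swamps the trend within them.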
-
Parkinson’s law
Work expands to fill the time available for its completion
-
Epistemic trespassing
When an expert in a given field trespasses into another and makes claims where they lack expertise
-
Decline effect
The strength or effect size of a scientific result tends to decline over successive replications
-
Base rate neglect
Misjudging the probability of an event due to more intuitive individuating information (e.g., thinking it’s more likely than not that someone who is 6-foot-8 plays basketball professionally, when in fact the chances are a fraction of 1%)
-
Identifiable victim effect
The desire to assist a specific individual facing a certain hardship but not a large, unknown group of people facing the same hardship
-
John Ioannidis
A somewhat controversial physician/scientist perhaps best known for his claim that most published research findings are false
-
Texas sharpshooter fallacy
Deriving incorrect conclusions by overly focusing on clusters of data points that may have arisen by chance
-
Survivorship bias
A type of selection bias in which the dataset contains only people who made it past some hurdle
-
BANs: big-ass numbers
One of the simplest ways to visualise data in which, in place of graphs, a few select metrics are displayed as numbers in large text
-
Anchoring bias
The tendency (and salesperson’s boon) for people to focus on relative changes from an initial value rather than the absolute amount
-
The difference between science and engineering
Scientists aim to generate new knowledge and engineers aim to apply knowledge to solve real-world problems
-
Researcher degrees of freedom
A measure of the flexibility a scientist has in developing, analysing, and publishing an experiment
-
Banking to 45°
As a rule of thumb, the aspect ratio of a line graph should be one in which the changes to be emphasised have a slope of ~45°
-
The linear model of innovation
The conjecture that basic research informs applied research, which promotes development and production, which ultimately lead to economic growth
-
Daryl Bem’s precognition paper
An infamous study that passed peer review and purportedly shows that people can essentially see briefly into the future
-
Starting with the cake
A teaching philosophy that starts with the big picture rather than tedious fundamentals
-
arXiv
One of the original preprint servers (launched in 1991)
-
Principle of least astonishment
A guideline that encourages a design (say, an interface or piece of software) to be built to behave in a way that most users expect it to
-
Germanic vs Latinate words
Words with a German heritage tend to be simpler and less pretentious than those from Latin
-
Altmetrics
A type of citation measure that counts mentions in blogs, tweets, and other social media rather than standard citations in scientific papers
-
scite.ai
A (now expensive) AI service that summarises the different ways a paper is cited (supported, contrasted, or mentioned) rather than merely counting the number of citations
-
Root cause analysis
A problem solving technique that looks to solve the underlying issue rather than the immediate (and possibly superficial) problem
-
Complex vs complicated
Something that is complicated may involve a tedious number of straightforward steps, whereas something that is complex may have multiple nonlinear interactions and emergent behaviour
-
WEIRD subjects
People from Western, Educated, Industrialized, Rich, and Democratic societies who are over-represented in scientific studies involving human subjects
-
Inkscape
A vector graphics editor that is more than sufficient for a scientist’s needs
-
Donald Knuth
A computer scientist notable for, among many things, the creation of the TeX typesetting language and his decision to forgo email as of Jan 1, 1990
-
Oblique, isometric, and one- and two-point perspective
Four standard ways to project a three-dimensional object into two dimensions
-
The second law of thermodynamics
Entropy of an isolated system cannot decrease or, more simply, heat flows spontaneously from hot to cold
-
The rate of sea level rise
The current global average is about 4 mm/yr, but this varies regionally depending on the vertical movement of land
-
The Oxford comma
The comma placed before “and” or “or” in a list of three or more items