Python spoilt me, Part 2

In a previous post, I listed a range of Matlab’s idiosyncrasies and flaws that seemed so much more blatant once I returned from several years of Python use. This post is a continuation, except this time highlighting ways in which Python makes life simpler rather than Matlab making life more difficult.

If you haven’t tried Python and you’re on the fence about whether it’s worth learning, let the points below convince you.

Python loops are shorter, easier, and neater

Too often with Matlab, I find myself having to create an index to loop over elements in an array. Consider the following example of applying myfunc to every element of array:

for ii = 1:length(array)
    myfunc(array(ii))
end

While this gets the job done, it obscures the purpose of the loop. In general, Python allows cleaner ways to loop over elements and often makes indices unnecessary. Consider the equivalent Python code:

for element in array:
    myfunc(element)

Python also has handy looping tools like enumerate, zip, and list comprehensions. These are tools you never knew you wanted until you learn them and subsequently can’t do without. In fact, the example above can be achieved in a single line. Not only that, but the output (as a list) can be assigned in the same line:

newarray = [myfunc(element) for element in array]
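To give a flavour of the other tools mentioned above, here is a minimal sketch of enumerate and zip alongside a filtering list comprehension. The data and the body of myfunc are made up purely for illustration.

```python
# Hypothetical data and function, for illustration only
depths = [0, 10, 20]
temperatures = [15.2, 14.8, 12.1]


def myfunc(value):
    """Stand-in for whatever processing is needed."""
    return 2 * value


# enumerate: get the index and the element together, no manual counter
indexed = [(ii, depth) for ii, depth in enumerate(depths)]

# zip: step through two sequences in lockstep
pairs = []
for depth, temp in zip(depths, temperatures):
    pairs.append((depth, temp))

# A list comprehension can also filter as it goes
deep_values = [myfunc(depth) for depth in depths if depth > 5]
```

None of these need an explicit index, which is the point: the loop says what it operates on rather than how to find it.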

These examples barely scratch the surface of how much easier loops are in Python. Other than being able to define functions wherever I like, easy and quick loops are perhaps what I miss most when using Matlab rather than Python.

Python instills good formatting

A for-loop without indentation is poor form, but perfectly valid in Matlab. By contrast, indentation is an essential ingredient in a Python for-loop. By itself, this wouldn’t be a big deal, but it is representative of a larger issue: Python’s syntax generally enables better, easier-to-read code.

Python also has a definitive style guide: PEP 8. It describes the conventions to follow for seemingly every possible matter related to code layout: the number of spaces to indent, the amount of whitespace between words and lines, the order to import modules, the locations where line breaks should occur, the alignment of code across different lines, and many more.

I have my editor set up to show an unobtrusive message whenever I stray from one of the PEP 8 recommendations. These small, occasional reminders keep my code clean and consistent: consistent not just with my own code, but with everyone else’s, because they follow the same guidelines.

Matlab’s editor highlights a few errors and suggests a few improvements, but PEP 8 covers a much wider range of issues.

Python boosts transferable programming skills

I learned scientific computing solely by using Matlab for four years. However, I never gained a working knowledge of programming concepts such as classes, object-oriented programming, namespaces, and basic memory management until I started to use Python. Admittedly, these aren’t typically relevant to my day-to-day coding, but I find that having some awareness of the concepts encourages me to think more carefully about the best way to implement something.

Python also turns up in a number of places unrelated to scientific computing, especially when using Linux, so knowing the language can come in handy. For example, Python scripts are used by Autokey, a handy Linux utility that lets me automate countless tedious tasks such as getting a word count or running a Google search of highlighted text, regardless of the application being used.

Many complex tasks can be achieved in only a few lines

Creating a map, laying out a multi-panel figure, or creating a publication-quality plot from an external netCDF file are all tasks that can be implemented incredibly easily in Python.

A map of New Zealand, for example, is as simple as

from mpl_toolkits.basemap import Basemap
m = Basemap(projection='eqdc', width=1200e3,
            height=1800e3, lat_0=-41, lon_0=173,
            resolution='h')
m.fillcontinents(color='silver')

One line to import the appropriate module, another to choose the basic map properties, and finally a command to draw with the desired colour:

[Figure: map of New Zealand produced by the Basemap code above]

Multi-panelled plots are even simpler. Let’s say I want a 2 × 3 panel plot in which all of the plots share both the x and y axes. And it needs to be 7 × 3 inches, so as to fit nicely across a page. This can be achieved with a single command:

import matplotlib.pyplot as plt

fig, axs = plt.subplots(
    nrows=2, ncols=3, figsize=(7, 3),
    sharex=True, sharey=True)

Individual axes can then be accessed based on their coordinates, e.g., axs[0, 0].plot(...
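To make the indexing concrete, here is a minimal sketch that fills two of the six panels. The data are made up, and the Agg backend is an assumption chosen only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

fig, axs = plt.subplots(
    nrows=2, ncols=3, figsize=(7, 3),
    sharex=True, sharey=True)

# Made-up data, purely for illustration
x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# axs is a 2 x 3 array of axes, indexed as [row, column]
axs[0, 0].plot(x, y)        # top-left panel
axs[1, 2].plot(x, y[::-1])  # bottom-right panel

# axs.flat visits every panel in turn, handy for common formatting
for ax in axs.flat:
    ax.grid(True)
```

Because the axes are shared, the limits set by the data in one panel apply to all the others.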

Setting up the plot yourself isn’t even required sometimes. The excellent xarray module, together with a standard netCDF dataset, can create publication-quality plots in very few steps. Here’s an example from a 3D ocean model (MITgcm) output, where I want to plot a vertical slice of temperature (THETA) along the Y direction between ±10 km from 0 in the X direction.

from xarray import open_dataset
ds = open_dataset('output filename')
ds.THETA.isel(Y=0).sel(X=slice(-10e3, 10e3)).plot()

[Figure: vertical slice of temperature (THETA) plotted by the xarray code above]

It’s not quite publication ready just yet, but pretty damn good for three lines. Note, also, how indices are specified by Y=... and X=.... No need to remember whether X is the first, second, or third dimension. Easier for the person writing the code, and much easier for anyone reading it later.

Okay, Matlab has a few good things going

I’ve now written almost two posts highlighting the advantages of Python relative to Matlab. Of course, it is not all one sided. There are a handful of ways in which Matlab makes life easy compared to Python:

  • Defining a Matlab vector is simply start:step:end, whereas the equivalent in Python is a little more verbose: np.r_[start:end:step] (note the different argument order, and that Python excludes end).
  • Matlab structs are easier to create than dicts, their Python equivalent (though structs are perhaps not as easy to work with once created).
  • Comparing arrays elementwise in Matlab doesn’t require a separate function, e.g., compare A | (B & C) (Matlab) with np.logical_or(A, np.logical_and(B, C)), although NumPy’s | and & operators do the same job when the arrays are boolean.
  • Colormaps in Python are more cumbersome than in Matlab, where they can simply be an array of numbers with three columns.
  • Python has long been missing an equivalent to Matlab’s switch/case statements (match/case only arrived in Python 3.10).
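For what it’s worth, the struct and switch/case points have common workarounds. The sketch below is one hedged way to get struct-like dot access (types.SimpleNamespace from the standard library) and a switch-like construct (a dict lookup with a default). All of the names and values are hypothetical.

```python
from types import SimpleNamespace

# A dict needs quotes and brackets...
mooring = {'name': 'M1', 'depth': 250}

# ...whereas SimpleNamespace gives Matlab-like dot access
mooring_ns = SimpleNamespace(name='M1', depth=250)
mooring_ns.lat = -41.0  # fields can be added on the fly, as with a struct


def describe(kind):
    """Switch-like dispatch: look up the case in a dict, with a default."""
    cases = {
        'ctd': 'profile',
        'adcp': 'velocity',
    }
    return cases.get(kind, 'unknown')
```

Here describe('ctd') returns 'profile', and any unlisted kind falls through to 'unknown', much like an otherwise branch.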

Still want more Python features?

There’s a pretty exhaustive list of Python gems on this StackOverflow post.


Scientific rationale: convenient little white lies?

“Physics is like sex: sure, it may give some practical results, but that’s not why we do it,” quipped Richard Feynman. The oceanographer Curtis Ebbesmeyer provides a similar, albeit less memorable, quote when describing his early work on water slabs (aka snarks), which had relevance to both military and pollution issues: “such practical matters did not interest me. I found snarks fascinating, even beautiful in their own right.” The introductions to many scientific papers, however, are framed in terms of practical results. Hence the rhetorical question implied in the title: are the rationales we as scientists publish convenient little white lies, simply a way to validate undertaking the science that we find personally interesting and intrinsically satisfying?


Python spoilt me; returning to Matlab is hard

Using Python daily for more than three years as part of my scientific workflow and then abruptly returning to regular Matlab use has made me realise how much better Matlab could be and how evident its idiosyncrasies are. Conversely, while I had noticed that Python makes things simple, it is Matlab’s comparative flaws that really made me appreciate just how much the community has achieved in the past decade in making Python an indispensable scientific tool.

An aim of this post is to recognise Python’s impressive convenience and versatility. Unfortunately, however, this post develops more naturally by taking the pessimistic approach of highlighting Matlab’s flaws. What follows are several minor, and a few major, annoyances that I’ve noticed on returning to Matlab.


Benefits of an e-reader for academics

E-readers are no good for reading scientific papers. They’re grayscale, they’re too small, and flipping back and forth between pages takes time. That said, my e-reader has two key benefits for me as a scientist/academic. It provides a truly offline method to read content later and it lets me read books that are only available as PDFs.

An example of how a given blog post looks on my e-reader


Palatino and Source Sans Pro, the only fonts a scientist needs

The title of this post is both my subjective opinion and the TL;DR version of this post. If you’re interested in why I no longer bother with any other fonts, let me explain.



Organise scripts and figures easily with Jupyter Notebooks

Keeping track of scripts used to generate figures is difficult. Before realising that Jupyter Notebooks could solve most of my problems, I would have directories with dozens of scripts with filenames of varying levels of ambiguity. Names that probably meant something to me at the time, but are hardly descriptive months or years later. Names like ISW_plume_plots.m, new_ISW_model_plots.m, and plot_model_behaviour.m. A certain PhD comic springs to mind.

Regardless of whether it’s Python, R, Julia, Matlab, or pretty much any other type of code, Jupyter Notebooks solve the problem. For example, I use a single notebook to archive the code for all figures in a paper and, more importantly, I can associate each set of code with the figure it generates. Rather than trying to remember what file I want, I need only remember which figure I want. (I say archive because I much prefer to do the bulk of my exploratory analysis in an editor. Alternatively, JupyterLab may work better for you.)


This is the best decade to be a grad student

Catching up on the literature is a daunting aspect of graduate studies. As a physical oceanographer, I regularly cite work from 30 to 40 years ago. In that time, and all the way back to the turn of the 20th century, the scientists before me got to answer all the low-hanging-fruit problems and write the papers that will be cited thousands of times. They leave behind the messy, complex, and esoteric questions for the current grad students. Surely, then, I would think the 60s or 70s or even earlier would have been the best time to be a grad student?
