“Clean Code” Book Club: Chapter 3, Functions (Part 1)

Posted on Sat 27 April 2024

Continuing with our book club, this week we started with chapter 3 (“Functions”). Since it’s a longer chapter, we’re splitting it into two; this week focussed on pages 31–40.

(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 3: Functions

At the beginning of this week’s discussion, it quickly became clear that we all had reservations about the advice in this chapter. Several people said that the basic advice seemed sound but the author, Robert Martin, is overdoing it.

I’ll go one step further: This chapter is fundamentally flawed.

That flaw, I think, is that it’s too obsessed with counting lines of code, when the actual question is whether code is “clean”.1 Now, much of the time, LoCs might correlate with “cleanliness”; but make no mistake, LoCs are used because they are easy to measure, not because they are an accurate measure.2 And if you agressively optimize for a flawed metric, you will miss your actual goal; your code will be utterly terrible and you will be so blinded by your metric that you cannot see that.


We ended chapter 2 with a code example that did too many things at once and thus failed to make the point it was supposed to make. Unfortunately, with listings 3-1 and 3-2, we begin chapter 3 in the same way. There are several things changing between these listings:

  1. renaming (of the function itself and several variables)
  2. cleaning up the logic (for non-test pages, code would write page content to buffer and then back to page; that is skipped)
  3. and extracting functions

Steps 1 and 2 are unquestionably improvements. However, the focus in this chapter should be on step 3—and that feels a bit like a sleight of hand in this example, because the extracted code is simply hidden from listing 3-2. I could get almost the same effect by adding the comments // include setup pages and // include teardown pages in front of the if blocks in listing 3-1 and then collapsing those blocks in my IDE to hide them from view.

Small!

There is a famous quote, usually (incorrectly) attributed to Albert Einstein:

Everything should be made as simple as possible, but no simpler.

Unfortunately, Martin appears to have missed the second part of that quote:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. (p. 34)

Do shorter functions tend to be easier to comprehend than longer functions? Probably, yes. But there is a trade-off that Martin is missing: Individual functions don’t stand on their own. We don’t read them like a Buzzfeed listicle of “13 Haikus That Will Restore Your Faith in Humanity”3. We read functions, because we want to understand a chunk of code. And if that chunk of code is split up into five different functions, we need to understand those five different functions and how they interact with each other. Splitting code up into many smaller functions based on some arbitrarily set line count limit just replaces excessive length with excessive nesting depth.

Instead of counting lines of code, I’d suggest the following hypothesis: The optimal length of a function is commensurate with the chunk size of the programmer’s working memory. Like Martin, I can’t provide objective justification for this; but at the very least, my hypothesis is a lot more plausible than an arbitrary four-line limit.

Do One Thing

Functions should do one thing. They should do it well. They should do it only. (p. 35)

(Formatting [sic].)

At a high level, that sounds like reasonably good advice. The problem is: How do you define “one thing”? Is an island one thing? Is a beach on that island one thing? Is a grain of sand on that beach one thing? Arguably, each of them is “one thing”—just at different levels of abstraction.

One Level of Abstraction per Function

Martin uses this notion of “levels of abstraction” and asserts that each function should only have one level of abstraction. Which sounds reasonable enough … until you notice that he considers getPathNameForPage(page) and pageCrawler.getFullPath(page) different levels of abstraction. (From listing 3-7.)

And yes, technically, one may be able to argue that these are ever-so-slightly-different levels of abstraction. But whatever tiny clarity we may gain by separating these subtly different levels, we immediately pay in additional complexity by adding an extra layer of nested functions.

We then talked a few minutes about the “Stepdown Rule”, that reading code from top to bottom should descend from the highest abstraction level to the lowest:

  • This may not even work in C(++), which can be picky about how you order code.
  • It won’t work for single-file Python scripts, either: Code is executed from top to bottom, so low-level helper functions need to be defined above the main code block that uses them.
  • Is this similar to literate programming? Not quite; literate programming is e.g. well suited for data pipelines, since it models a linear flow of data; whereas this stepdown rule is about serializing a tree of nested functions.

Switch Statements

Polymorphism can replace the need for switch statements in some contexts; and the example given in listings 3-4 and 3-5 nicely illustrates that point. But Martin asserts that all switch statements should be met with polymorphism, which seems too dogmatic.4

Since we were discussing how this would look in Python, here’s a Python version of listing 3-5. It uses an Abstract Base Class to define the employee class structure; but unlike Java, idiomatic Python code would avoid the overhead of defining single-use EmployeeFactory interface and implementation classes, when a simple makeEmployee function would suffice.

from abc import ABC, abstractmethod

class Employee(ABC):
    """Abstract Base Class for all employees.
    Child classes must implement methods labeled `@abstractmethod`.
    """
    def __init__(self, record):
        self.employeeRecord = record

    @abstractmethod
    def isPayday(): pass

    @abstractmethod
    def calculatePay(): pass

    @abstractmethod
    def deliverPay(pay): pass

class CommissionedEmployee(Employee):
    def isPayday():
        # must implement this method here
        pass

    def calculatePay():
        # must implement this method here
        pass

    def deliverPay(pay):
        # must implement this method here
        pass

class HourlyEmployee(Employee):
    pass  # must implement all three methods here

class SalariedEmployee(Employee):
    pass  # must implement all three methods here

def makeEmployee(record):
    match record.type:
        case "Commissioned":
            return CommissionedEmployee(record)
        case "Hourly":
            return HourlyEmployee(record)
        case "Salaried":
            return SalariedEmployee(record)
        case _:
            raise ValueError(f"Invalid employee type: {record.type}")

class EmployeeRecord:
    type = "Commissioned"  # hardcoded for demo purposes

record = EmployeeRecord()
employee = makeEmployee(record)
print(type(employee))  # will print `<class 'CommissionedEmployee'>`

Use Descriptive Names

This is pretty much a reprise of chapter 2; there’s not much to comment on here.

We continue next week with function arguments, side effects and more …

  1. For some Stewart-ian “I know it when I see it” definition of “clean”, which is notoriously hard to measure. ↩︎
  2. I’m reminded of this classic story. ↩︎
  3. “Number 11 will surprise you!” ↩︎
  4. Though I guess if all you have is an AbstractHammerFactory, everything looks like a nail … ↩︎

“Clean Code” Book Club: Chapter 2, Meaningful Names

Posted on Thu 18 April 2024

After last week’s introduction, in the second session of this book club we dive into the main content of the book. And, if you believe the old joke, we start with one of the hardest chapters:

There are two hard problems in computer science: Naming things, cache invalidation and off-by-one errors.

(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 2: Meaningful Names

Like about half of the chapters in this book, this one was not written by Robert Martin, but by one of his collaborators; in this case, Tim Ottinger.

In general, my impression of this chapter was fairly uneven. It discusses many important aspects of naming, formulates clear guidelines—some of which I was already aware of, others I’ll need to keep in mind—and is very conscious of how context-dependent naming is. Superficially, some guidelines may seem contradictory; but to me, this is not a flaw in the writing, but rather an acknowledgement that naming choices have tradeoffs—being a good developer simply means being better at making reasonable tradeoffs. Unfortunately, several examples seemed unhelpful or even actively misleading to me.1

Use Intention-Revealing Names

If a name requires a comment, then the name does not reveal its intent. (p. 18)

For example, naming a variable d does not make the intent clear; whereas naming it maxAgeInDays does. This is such an obvious, basic guideline; and yet I’ve been guilty of breaking it. (Though if we’re honest, so has every programmer, right?)

Now, if this applies to naming function arguments as well, does that mean we shouldn’t (need to) describe arguments in the function’s docstring? We discussed this a bit in the book club; and we think that would be an overly broad interpretation of this guideline. For example, a function may take an argument maxAgeInDays but have a non-obvious behaviour for special values like 0. Such larger context belongs in a comment; the basic intent of the argument is still clear from its name alone.

Here’s a longer example:2

# Bad: Names don’t reveal meaning of the variables.
# It isn’t clear what the code is doing at a high level.
def getThem():
    list1 = []
    for x in theList:
        if x[0] == 4:
            list1.append(x)
    return list1

# Better: High-level intent is clear from reading code.
def getFlaggedCells():
    flaggedCells = []
    for cell in gameBoard:
        if cell.status = CellStatus.FLAGGED
            flaggedCells.append(cell)
    return flaggedCells

Avoid Disinformation

The first example in this section recommends not to use hp as a variable name for a hypotenuse, since it could also be used (alongside aix and sco3) as the name of a Unix variant.

I don’t think this is a great example. For one, I struggle to imagine a context where there is serious doubt whether a variable refers to a hypotenuse or to HP-UX. But also, hp is a fairly bad variable name for other reasons anyway—it’s fairly cryptic, hard to pronounce, … if you do need a name for a hypotenuse, at least make it hyp!

The following example (don’t call a variable accountsList if it’s not actually a list) makes a good point, though. And even if it is a list, encoding that in the name is often unnecessary. I like the convention of simply using a plural noun (accounts) to name such a variable—whether it’s a list, set, dict or similar class—since it enables beautifully readable code like for account in accounts:.

It is very helpful if names for very similar things sort together alphabetically (p. 20)

This is very practical API design advice! For example, in NumPy, np.as<TAB> offers the autocomplete options asarray, ascontiguousarray, asfortranarray, asmatrix, asscalar, etc., so even if I know only one or two of these functions, it’s obvious where to look for related ones. In contrast, if they were named np.arrayfrom, np.fortranarrayfrom etc., I’d have to look through the whole documentation to find the one that’s most appropriate.

Make Meaningful Distinctions

Two variables named time and time2 may be distinguishable to the compiler/interpreter, but they are not meaningfully distinct to a human reader. Use e.g. startTime and endTime instead—or rawTime and correctedTime, or whatever makes sense in context.

Similarly, the chapter notes that “noise words” are not a meaningful distinction. For example, if two objects are named accountData and account, it’s unclear to a reader what the difference is. Or, in an earlier Java code sample in this chapter, there were both an ArrayList and a List. If you’re not already very familiar with the API, can you tell what the difference might be? I had to look it up.

Use Pronouncable Names

If you can’t pronounce it, you can’t discuss it […] This matters, because programming is a social activity. (p. 22)

Well said!

Use Searchable Names

Grepping for a NAMED_CONSTANT is easy, grepping for a digit is basically hopeless. A very good point, which I hadn’t given much thought to before.

This section also has the following rule of thumb, which is more widely applicable:

The length of a name should correspond to the size of its scope (p. 22)

That makes sense: If the scope is small, then I can see all the necessary context at a glance and the name doesn’t need to duplicate that information. Whereas if the scope is too large to be glanceable, I need the name of the variable to provide me with context.

Avoid Encodings

This section discusses Hungarian Notation or the common convention of prefixing e.g. member variables with m_. This, as the chapter says, is largely unnecessary in even halfway modern IDEs. Experienced developers learn to ignore it after a bit, while new developers joining a codebase have an extra hurdle to reading the code. Especially in an academic context, where the software is often not the main product but just a tool to enable a research project, new developers (e.g. PhD students or postdocs) might not have the time or motivation to understand such cryptic encodings. Maintainers therefore need to expend significant effort to enforce such conventions, or the code will become inconsistent and thus misleading over time.

Often, such encodings also make names hard to pronounce—see above!


Another kind of encoding—not mentioned in the book, but highly relevant to research software—is mathematical notation. Just because it’s legible when properly typeset, doesn’t mean it’s legible in code.

For example, sub- or superscript indices are extremely useful in typeset equations, but often don’t translate well into code. Or I recently encountered an example, where a physics paper used γ and Γ for distinct but related variables. In the typeset equations, this was a clear distinction; but in code, having variables named gamma and Gamma would be terribly confusing.

Avoid Mental Mapping

This chapter left me a bit puzzled. It’s not that I disagree with anything in here; it’s just that the examples don’t make it clear how this is any different from the section “Use Intention-Revealing Names” at the beginning of the chapter … 🤷

Class and Method Names

Classes and objects should be nouns or noun phrases; methods and functions should be verbs or verb phrases.

Good advice. I mostly do this intuitively, but it’s good to be aware of this.

Don’t Be Cute

If colleagues can’t understand your code without googling pop-culture references or asking you to explain some inside joke, that’s a problem.

I know it’s tempting to use that really clever joke … but don’t. If it’s otherwise safe for work, feel free to post it in the #random channel on your team’s Slack, where anyone who doesn’t get the reference can safely ignore it; if it’s risqué, post it on your AfterDark Mastodon account.

Pick One Word per Concept … and One Concept per Word (a.k.a. Don’t Pun)

These two sections are closely related to the “Make Meaningful Distinctions” guideline earlier.

For example, fetch, retrieve and get may be different, but they’re not meaningfully distinct to a reader, so pick one and use it for all equivalent methods across different classes.

But on the flip side: Don’t use the same word for conceptually different things. This can sometimes be subtle. For example, appending an object to a list is kinda like adding it, right? Well … that’s just English being imprecise. Consider these examples:

>>> [1,2] + [3,4]
[1, 2, 3, 4]
>>> [1,2] + 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list

Instead of list.add(element), the Python standard library rightfully calls this function list.append(element).4

Use Solution/Problem Domain Names

Remember that the people who read your code will be programmers. (p. 27)

That’s not necessarily true for research software—a lot of our collaborators are experts in their specific field of research, but have no formal programming training. Thus, while the book recommends programming terms first, and domain terms as a fallback, for us the better approach is often going to be the other way around.

Add Meaningful (But Not Gratuitous) Context

The last two sections tell us that we want to be right in The Goldilocks Zone of Context—just enough to inform, not so much that it overwhelms.

The basic idea here makes a lot of sense. But unfortunately, I found the example code listings here worse then useless. To me, the original version of the code is better than the “improved” version in almost every way: The “improved” version of the code has 1.5 times as many lines of code. Instead of 1 function, it has 1 class containing 5 functions; all for a simple three-way if-elseif-else statement. Instead of executing linearly from top to bottom, execution now jumps up and down between 3 of the 5 functions.

Supposedly, that example code demonstrates that having the three variables number, verb and pluralModifier defined inside a GuessStatisticsMethods class instead of a printGuessStatistics function adds more meaningful context? I didn’t see much of a difference; and even if there is one, the example is terribly ill-suited to illustrating that difference, since it includes major unrelated code changes (completely restructuring the code flow) and even changes the functionality (instead of printing a message, the “improved” code returns a string).

Oh, and the class and function names? For a chapter that’s all about meaningful naming, I found them surprisingly obtuse. Here’s a Python version of the initial code example:

def printGuessStatistics(candidate, count):
    if count == 0:
        print(f"There are no {candidate}s.")
    else if count == 1:
        print(f"There is 1 {candidate}.")
    else:
        print(f"There are {count} {candidate}s.")

(Note: The one significant detail that got lost in this translation is that candidate was a char in the Java version.)

The name printGuessStatistics doesn’t make much sense here without additional context which this code sample doesn’t offer. But let’s assume that these are results from a survey asking people to guess the right answer from some options A, B, C and D. That would explain the Guess part, but Statistics is still a bit of a misnomer—printGuessCounts would be more accurate. Also, why are three functions in the “improved” version (which basically correspond to the print function calls in my Python version) called thereAreNoLetters, thereIsOneLetter and thereAreManyLetters? This is violating the guideline of “One Word Per Concept”—Letters refers to the same thing as candidate—and the guideline that method names should be verb phrases.

With all these complaints about that example out of the way, what would be a better example for adding meaningful context? Off the top of my head: Maybe we could have gone back to the getFlaggedCells example at the start of this chapter and discussed how having a CellStatus enum adds useful context, compared to having a bunch of independent constants like FLAGGED, EMPTY and NONE.

But it’s certainly worth thinking about—what are other, better examples?

  1. Of course, some of that will be due to my perspective: The code examples use Java (which I have almost no experience writing) and often draw from a business context. In contrast, my primary language is Python and I work mainly on scientific software; so some cultural differences are unavoidable. ↩︎
  2. I will write examples in Python in this post. Partly so I don’t just copy the author’s code; and partly, because it’s a nice exercise to translate examples into a different language and verify whether the guidelines still apply, independent of syntax details. ↩︎
  3. Yes, that SCO. “The Most Hated Company in Tech”, which became bankrupt just as this book came out. ↩︎
  4. Ironically, one of the first Java code examples in this chapter uses list1.add(x);. I think it’s a missed opportunity that the author explicitly brings up the add/append example, but doesn’t acknowledge this. ↩︎

“Clean Code” Book Club: Foreword, Introduction and Chapter 1

Posted on Wed 10 April 2024

I just started a book club in our RSE group. The first book we’re going to read is “Clean Code” by Robert C. Martin and today we discussed the Foreword, Introduction and first chapter. (A sample PDF which includes these sections is at the publisher’s website.)

This post collects most of my notes and some points from our group discussion. (Completeness is explicitly a non-goal; otherwise I would not have been able to finish this post in a reasonable time.)

Foreword

This jumped out at me:

Back in my days working in the Bell Labs Software Production Research organization we had some back-of-the-envelope findings that suggested that consistent indentation style was one of the most statistically significant indicators of low bug density. (p. xxii)

With language-enforced indentation, more advanced IDEs and automated code formatters (black, ruff, clang-format, …) running in CI or as pre-commit hooks, this particular correlation may no longer be true. But does the underlying mechanism still hold true? Does attention to detail still correlate with code quality? I’m inclined to believe it does.

If you care about attention to detail, this foreword contains plenty of quotes and proverbs that will encourage this. (Of course, if you don’t care, how likely are you to pick up this book in the first place?)

Introduction

This book will make you work, and work hard. […] You’ll be reading code—lots of code. And you will be challenged to think about what’s right about that code and what’s wrong with it. (p. xxvi)

I’ve read PEP 8 (the Python style guide) more than once. I largely agree or at least find it reasonable1—but as I read it, I rarely stop to think about it; let alone discuss it with others. Already I’m excited to go through this book in a book club, rather than on my own!

The book is divided into three parts. In addition to the description given the introduction, I’ll include my initial impression based on a handful of skimmed page.

  • Twelve surprisingly compact2 chapters that use short code examples to introduce “principles, patterns and practices”—from basics like naming variables and writing comments to advanced topics like system architecture and concurrency. (These should be well suited for a book club—I’m genuinely excited to discuss many of these!)
  • Three chapters discussing more complex examples in greater detail. This is where the hard work promised in the quote above will come in. (Going through these may require a very different format; more like a code review session than a book club?)
  • Finally, a systematic collection of code smells and heuristics encountered throughout the book. (This sounds like a valuable reference to consult over the coming years.)

But first, we’ll get to the first chapter, which doesn’t belong to either part.

Chapter 1: Clean Code

You are reading this book for two reasons. First, you are a programmer. Second, you want to be a better programmer. Good. We need better programmers. (p. 1)

Oh, hush! ☺️ You’re just trying to butter me up …

This chapter starts with a brief section “There Will Be Code”, talking about how the need for coding will continue to exist, despite (or because of) new higher-level abstractions. While events of the past year or so, with “AI” and LLMs becoming mainstream, were impossible to foresee when this book was written, this section has held up quite well.

Over the next sections of this chapter, Robert Martin talks about code rot, illustrating its potential consequences (decreasing developer productivity, potentially long and painful from-scratch rewrites). And while it’s tempting for developers to blame external factors (requirements and schedules imposed by management and marketing) for code rot, Martin argues that it’s not that easy: It is our job to defend code quality! (This section reminded me a lot of The Phoenix Project, which I’m currently reading.)

While the details vary for us as RSEs in academia—we have PIs instead of CEOs, our schedules are dictated by conference dates instead of marketing departments—the basic idea is the same. (Being in a central RSE group does give us a bit more independence from PIs, so we are better able to push back.)

Following this, Martin quotes several prominent developers about their definitions of “clean code”. Their responses (elegance, efficiency, readability, simplicity, care, beauty, …) are notably subjective—clearly, this is more of an art than a science. And in the following section, Martin explicitly acknowleges that:

Don’t make the mistake of thinking that we are somehow “right” in any absolute sense. (p. 13)

This book presents one school of thought—no absolute truths. Some things are a matter of personal taste. Others may vary with context—what’s true for Java may not be true for Python, and vice versa. (Actually, it might be interesting to compare the first part of this book with PEP 8?)

And speaking of PEP 8: While details may differ (we’ll see over the following weeks), the underlying thinking is clearly similar. Consider, for example, this quote from the book:

We are constantly reading old code as part of the effort to write new code. (p. 14)

Does that sound familiar?

Towards the end of the chapter, Martin quotes the Boy Scout Rule (“Leave the campground cleaner than you found it.”) and argues it is our professional duty to apply this to code.

I love this idea in the abstract, but—contrary to what the foreword tells you—the devil is in the details: What if clean-up work increases the size of our diffs and makes it a little harder to understand the main point of a PR? Or, even worse, what if an effort to “clean up” ends up introducing a subtle new bug? Something to ponder while we read future chapters …

Next up, chapter 2 (“Meaningful Names”).

  1. The 79 character maximum line length being a major exception 🫣 ↩︎
  2. 10 pages on objects and data structures? 16 pages on concurrency? ↩︎

Software Matters

Posted on Sun 24 March 2024

In May 20231, the Gran Sasso national laboratory in Italy hosted a workshop on supernova neutrino detection. Since I’m a maintainer of several open source software projects in this field, the organisers invited me to give a talk on software. After talking about the software used by the SNEWS network, and SNEWPY specifically2, I closed with a part to remind people that software matters. This is a writeup of that part.


Software matters.

More and more often, what we call a “PhD in Astrophysics” is actually a “PhD in software development with applications to astrophysics”. And that means problems in the software will lead to problems in the physics results. Here are three examples I’ve seen myself in the last few years:

  • “Oh, that input file contains the flux of muon or tau neutrinos, not the sum of both! Oops …”
  • “We had a bug in the script that we noticed too late; so the tabulated values in the paper are inaccurate.”
  • “We compared two event generators and found that one of them implemented an old cross section that was off by ~30%.”

(Note that I’ve tweaked these examples slightly to remove identifying details. The point is not to blame any particular person or software; it’s that these things happen to all of us.)

So: Software matters. It can truly make or break a physics result. And if we all write the same code from scratch again and again, we will waste more time and produce more bugs. That means less physics and worse physics. So let’s not do that.

With SNEWPY, we have well-tested open-source software that covers a lot of common tasks when working with supernova neutrinos. It’s easy to integrate SNEWPY into your own software, too, so you can focus on your particular project rather than reinventing the wheel. Let’s use this to achieve more and better physics results!


  1. A historical note: Back when I prepared this talk, I didn’t think of myself as a research software engineer—though I already was one, de facto. I’m writing it up now, while I’m still new to the RSE community, so I can look back at 2023-me’s perspective in a few months or years. It’ll be interesting to see how well this holds up over time … ↩︎
  2. For the purposes of this post, all you need to know is that SNEWPY does most of the busywork for you when studying supernova neutrinos. ↩︎

Hello World

Posted on Sat 16 March 2024

Hello, world! 👋