“Clean Code” Book Club: Chapter 3, Functions (Part 1)

Posted on Sat 27 April 2024

Continuing with our book club, this week we started with chapter 3 (“Functions”). Since it’s a longer chapter, we’re splitting it into two; this week focussed on pages 31–40.

(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 3: Functions

At the beginning of this week’s discussion, it quickly became clear that we all had reservations about the advice in this chapter. Several people said that the basic advice seemed sound but the author, Robert Martin, is overdoing it.

I’ll go one step further: This chapter is fundamentally flawed.

That flaw, I think, is that it’s too obsessed with counting lines of code, when the actual question is whether code is “clean”.¹ Now, much of the time, LoCs might correlate with “cleanliness”; but make no mistake, LoCs are used because they are easy to measure, not because they are an accurate measure.² And if you agressively optimize for a flawed metric, you will miss your actual goal; your code will be utterly terrible and you will be so blinded by your metric that you cannot see that.

We ended chapter 2 with a code example that did too many things at once and thus failed to make the point it was supposed to make. Unfortunately, with listings 3-1 and 3-2, we begin chapter 3 in the same way. There are several things changing between these listings:

renaming (of the function itself and several variables)
cleaning up the logic (for non-test pages, code would write page content to buffer and then back to page; that is skipped)
and extracting functions

Steps 1 and 2 are unquestionably improvements. However, the focus in this chapter should be on step 3—and that feels a bit like a sleight of hand in this example, because the extracted code is simply hidden from listing 3-2. I could get almost the same effect by adding the comments // include setup pages and // include teardown pages in front of the if blocks in listing 3-1 and then collapsing those blocks in my IDE to hide them from view.

Small!

There is a famous quote, usually (incorrectly) attributed to Albert Einstein:

Everything should be made as simple as possible, but no simpler.

Unfortunately, Martin appears to have missed the second part of that quote:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. (p. 34)

Do shorter functions tend to be easier to comprehend than longer functions? Probably, yes. But there is a trade-off that Martin is missing: Individual functions don’t stand on their own. We don’t read them like a Buzzfeed listicle of “13 Haikus That Will Restore Your Faith in Humanity”³. We read functions, because we want to understand a chunk of code. And if that chunk of code is split up into five different functions, we need to understand those five different functions and how they interact with each other. Splitting code up into many smaller functions based on some arbitrarily set line count limit just replaces excessive length with excessive nesting depth.

Instead of counting lines of code, I’d suggest the following hypothesis: The optimal length of a function is commensurate with the chunk size of the programmer’s working memory. Like Martin, I can’t provide objective justification for this; but at the very least, my hypothesis is a lot more plausible than an arbitrary four-line limit.

Do One Thing

Functions should do one thing. They should do it well. They should do it only. (p. 35)

(Formatting [sic].)

At a high level, that sounds like reasonably good advice. The problem is: How do you define “one thing”? Is an island one thing? Is a beach on that island one thing? Is a grain of sand on that beach one thing? Arguably, each of them is “one thing”—just at different levels of abstraction.

One Level of Abstraction per Function

Martin uses this notion of “levels of abstraction” and asserts that each function should only have one level of abstraction. Which sounds reasonable enough … until you notice that he considers getPathNameForPage(page) and pageCrawler.getFullPath(page) different levels of abstraction. (From listing 3-7.)

And yes, technically, one may be able to argue that these are ever-so-slightly-different levels of abstraction. But whatever tiny clarity we may gain by separating these subtly different levels, we immediately pay in additional complexity by adding an extra layer of nested functions.

We then talked a few minutes about the “Stepdown Rule”, that reading code from top to bottom should descend from the highest abstraction level to the lowest:

This may not even work in C(++), which can be picky about how you order code.
It won’t work for single-file Python scripts, either: Code is executed from top to bottom, so low-level helper functions need to be defined above the main code block that uses them.
Is this similar to literate programming? Not quite; literate programming is e.g. well suited for data pipelines, since it models a linear flow of data; whereas this stepdown rule is about serializing a tree of nested functions.

Switch Statements

Polymorphism can replace the need for switch statements in some contexts; and the example given in listings 3-4 and 3-5 nicely illustrates that point. But Martin asserts that all switch statements should be met with polymorphism, which seems too dogmatic.⁴

Since we were discussing how this would look in Python, here’s a Python version of listing 3-5. It uses an Abstract Base Class to define the employee class structure; but unlike Java, idiomatic Python code would avoid the overhead of defining single-use EmployeeFactory interface and implementation classes, when a simple makeEmployee function would suffice.

from abc import ABC, abstractmethod

class Employee(ABC):
    """Abstract Base Class for all employees.
    Child classes must implement methods labeled `@abstractmethod`.
    """
    def __init__(self, record):
        self.employeeRecord = record

    @abstractmethod
    def isPayday(): pass

    @abstractmethod
    def calculatePay(): pass

    @abstractmethod
    def deliverPay(pay): pass

class CommissionedEmployee(Employee):
    def isPayday():
        # must implement this method here
        pass

    def calculatePay():
        # must implement this method here
        pass

    def deliverPay(pay):
        # must implement this method here
        pass

class HourlyEmployee(Employee):
    pass  # must implement all three methods here

class SalariedEmployee(Employee):
    pass  # must implement all three methods here

def makeEmployee(record):
    match record.type:
        case "Commissioned":
            return CommissionedEmployee(record)
        case "Hourly":
            return HourlyEmployee(record)
        case "Salaried":
            return SalariedEmployee(record)
        case _:
            raise ValueError(f"Invalid employee type: {record.type}")

class EmployeeRecord:
    type = "Commissioned"  # hardcoded for demo purposes

record = EmployeeRecord()
employee = makeEmployee(record)
print(type(employee))  # will print `<class 'CommissionedEmployee'>`

Use Descriptive Names

This is pretty much a reprise of chapter 2; there’s not much to comment on here.

We continue next week with function arguments, side effects and more …

For some Stewart-ian “I know it when I see it” definition of “clean”, which is notoriously hard to measure. ↩
I’m reminded of this classic story. ↩
“Number 11 will surprise you!” ↩
Though I guess if all you have is an AbstractHammerFactory, everything looks like a nail … ↩