“Clean Code” Book Club: Chapter 3, Functions (Part 1)
Posted on Sat 27 April 2024
Continuing with our book club, this week we started with chapter 3 (“Functions”). Since it’s a longer chapter, we’re splitting it into two; this week focussed on pages 31–40.
(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)
Chapter 3: Functions
At the beginning of this week’s discussion, it quickly became clear that we all had reservations about the advice in this chapter. Several people said that the basic advice seemed sound but the author, Robert Martin, is overdoing it.
I’ll go one step further: This chapter is fundamentally flawed.
That flaw, I think, is that it’s too obsessed with counting lines of code, when the actual question is whether code is “clean”.1 Now, much of the time, LoCs might correlate with “cleanliness”; but make no mistake, LoCs are used because they are easy to measure, not because they are an accurate measure.2 And if you agressively optimize for a flawed metric, you will miss your actual goal; your code will be utterly terrible and you will be so blinded by your metric that you cannot see that.
We ended chapter 2 with a code example that did too many things at once and thus failed to make the point it was supposed to make. Unfortunately, with listings 3-1 and 3-2, we begin chapter 3 in the same way. There are several things changing between these listings:
- renaming (of the function itself and several variables)
- cleaning up the logic (for non-test pages, code would write page content to buffer and then back to page; that is skipped)
- and extracting functions
Steps 1 and 2 are unquestionably improvements.
However, the focus in this chapter should be on step 3—and that feels a bit like a sleight of hand in this example, because the extracted code is simply hidden from listing 3-2.
I could get almost the same effect by adding the comments // include setup pages
and // include teardown pages
in front of the if
blocks in listing 3-1 and then collapsing those blocks in my IDE to hide them from view.
Small!
There is a famous quote, usually (incorrectly) attributed to Albert Einstein:
Everything should be made as simple as possible, but no simpler.
Unfortunately, Martin appears to have missed the second part of that quote:
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. (p. 34)
Do shorter functions tend to be easier to comprehend than longer functions? Probably, yes. But there is a trade-off that Martin is missing: Individual functions don’t stand on their own. We don’t read them like a Buzzfeed listicle of “13 Haikus That Will Restore Your Faith in Humanity”3. We read functions, because we want to understand a chunk of code. And if that chunk of code is split up into five different functions, we need to understand those five different functions and how they interact with each other. Splitting code up into many smaller functions based on some arbitrarily set line count limit just replaces excessive length with excessive nesting depth.
Instead of counting lines of code, I’d suggest the following hypothesis: The optimal length of a function is commensurate with the chunk size of the programmer’s working memory. Like Martin, I can’t provide objective justification for this; but at the very least, my hypothesis is a lot more plausible than an arbitrary four-line limit.
Do One Thing
Functions should do one thing. They should do it well. They should do it only. (p. 35)
(Formatting [sic].)
At a high level, that sounds like reasonably good advice. The problem is: How do you define “one thing”? Is an island one thing? Is a beach on that island one thing? Is a grain of sand on that beach one thing? Arguably, each of them is “one thing”—just at different levels of abstraction.
One Level of Abstraction per Function
Martin uses this notion of “levels of abstraction” and asserts that each function should only have one level of abstraction.
Which sounds reasonable enough … until you notice that he considers getPathNameForPage(page)
and pageCrawler.getFullPath(page)
different levels of abstraction. (From listing 3-7.)
And yes, technically, one may be able to argue that these are ever-so-slightly-different levels of abstraction. But whatever tiny clarity we may gain by separating these subtly different levels, we immediately pay in additional complexity by adding an extra layer of nested functions.
We then talked a few minutes about the “Stepdown Rule”, that reading code from top to bottom should descend from the highest abstraction level to the lowest:
- This may not even work in C(++), which can be picky about how you order code.
- It won’t work for single-file Python scripts, either: Code is executed from top to bottom, so low-level helper functions need to be defined above the main code block that uses them.
- Is this similar to literate programming? Not quite; literate programming is e.g. well suited for data pipelines, since it models a linear flow of data; whereas this stepdown rule is about serializing a tree of nested functions.
Switch Statements
Polymorphism can replace the need for switch
statements in some contexts; and the example given in listings 3-4 and 3-5 nicely illustrates that point.
But Martin asserts that all switch
statements should be met with polymorphism, which seems too dogmatic.4
Since we were discussing how this would look in Python, here’s a Python version of listing 3-5.
It uses an Abstract Base Class to define the employee class structure; but unlike Java, idiomatic Python code would avoid the overhead of defining single-use EmployeeFactory
interface and implementation classes, when a simple makeEmployee
function would suffice.
from abc import ABC, abstractmethod
class Employee(ABC):
"""Abstract Base Class for all employees.
Child classes must implement methods labeled `@abstractmethod`.
"""
def __init__(self, record):
self.employeeRecord = record
@abstractmethod
def isPayday(): pass
@abstractmethod
def calculatePay(): pass
@abstractmethod
def deliverPay(pay): pass
class CommissionedEmployee(Employee):
def isPayday():
# must implement this method here
pass
def calculatePay():
# must implement this method here
pass
def deliverPay(pay):
# must implement this method here
pass
class HourlyEmployee(Employee):
pass # must implement all three methods here
class SalariedEmployee(Employee):
pass # must implement all three methods here
def makeEmployee(record):
match record.type:
case "Commissioned":
return CommissionedEmployee(record)
case "Hourly":
return HourlyEmployee(record)
case "Salaried":
return SalariedEmployee(record)
case _:
raise ValueError(f"Invalid employee type: {record.type}")
class EmployeeRecord:
type = "Commissioned" # hardcoded for demo purposes
record = EmployeeRecord()
employee = makeEmployee(record)
print(type(employee)) # will print `<class 'CommissionedEmployee'>`
Use Descriptive Names
This is pretty much a reprise of chapter 2; there’s not much to comment on here.
We continue next week with function arguments, side effects and more …
- For some Stewart-ian “I know it when I see it” definition of “clean”, which is notoriously hard to measure. ↩
- I’m reminded of this classic story. ↩
- “Number 11 will surprise you!” ↩
- Though I guess if all you have is an
AbstractHammerFactory
, everything looks like a nail … ↩