“Clean Code” Book Club: Chapter 9, Unit Tests

Posted on Sun 14 July 2024

Continuing with our book club on Robert Martin’s “Clean Code”, last week we discussed chapter 9 (“Unit Tests”). (As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 9: Unit Tests

This chapter—the only testing-related one in the book—focusses specifically on unit tests. There’s nothing here on integration tests; nothing on doctests; almost nothing on continuous integration or automation, except that tests should be “convenient to run”. And while the latter is understandable, given that the book was written over 15 years ago, it is a significant omission for a modern reader.

In the chapter’s introduction, Martin starts by describing an ad-hoc testing procedure common in the mid-90s, then lauding how the Agile and TDD movements have caused a “mad rush” to integrate testing into programming.

The Three Laws of TDD

While Martin is a strong proponent of Test Driven Development (TDD), I think his description of it here is doing TDD a disservice by being eyeroll-inducingly dogmatic:

First Law: You may not write production code until you have written a failing unit test.

Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.

Third Law: You may not write more production code than is sufficient to pass the currently failing test. (p. 122)

So, on the one hand, Martin calls these “laws” and wrote them to be extremely prescriptive; he clearly wants us to take them literally. On the other hand, if we take them literally, then we’re never allowed to refactor our production code or test code, because refactoring, by definition, does not affect whether tests fail.

Kent Beck, who “rediscovered” TDD, wrote a much better summary of the process of TDD. To the above three steps (Martin’s “laws”), he adds a step 0 (think about the test scenarios to cover) and a step 4 (refactoring).[1]

Keeping Tests Clean

Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code. (p. 124)

Yep.

Though I’d add that, typically, test code should be strictly simpler than production code. (And this, in turn, makes it easier to keep the tests clean.)

It is unit tests that keep our code flexible, maintainable, and reusable. (p. 124)

… because they enable us to make changes to our code, without having to worry about introducing new bugs by accident. We talked for a bit about our personal experiences where tests—or their absence—impacted how confident we were about making changes to code and how that impacted our development process.

A corollary: When we discover a bug, in addition to fixing it, we should think about how to improve our tests so they discover similar bugs in the future. We won’t be able to avoid making mistakes; but at least we can learn from them and try to fail better next time.

That can mean adding new unit tests for a currently untested part of the code; or covering an edge case we didn’t think about earlier; or adding new elements to our test suite (e.g. running the test suite under multiple operating systems, once we realize parts of our code are subtly OS-dependent[2]).
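As a hedged illustration of that last case: Python’s pathlib provides PurePosixPath and PureWindowsPath, which can be instantiated on any operating system, so a regression test can exercise both path flavours even before a second CI runner exists. The function url_path_to_local below is hypothetical, invented only for this sketch.

```python
from pathlib import PurePosixPath, PureWindowsPath


def url_path_to_local(url_path, path_cls):
    """Hypothetical helper: map a URL path like 'docs/a/b.txt' to a local path.

    URL paths always use '/', so we split on '/' explicitly and let the
    path class decide which separator the local path uses.
    """
    parts = [p for p in url_path.split("/") if p]
    return path_cls(*parts)


def test_posix_flavour():
    assert str(url_path_to_local("docs/a/b.txt", PurePosixPath)) == "docs/a/b.txt"


def test_windows_flavour():
    # Regression test for the (hypothetical) bug: on Windows the local
    # separator is a backslash, not the slash used in the URL.
    assert str(url_path_to_local("docs/a/b.txt", PureWindowsPath)) == "docs\\a\\b.txt"


if __name__ == "__main__":
    test_posix_flavour()
    test_windows_flavour()
```

This does not replace running the suite on several operating systems; it only pins down this one class of bug cheaply.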

Domain-Specific Testing Language

I’m not completely sold on this. Yes, developing your own Domain-Specific Testing Language may make your tests look cleaner; but the price for that is a significantly increased amount of helper code. (At what point do you need to start writing a separate test suite to test your DSTL?)

While Martin doesn’t discuss this trade-off, his first example (listings 9-1 and 9-2) gives a good example where this approach helps produce cleaner code.

A Dual Standard

Then, however, we get to listings 9-3 and 9-4 … 😱

The “before” version of the code (listing 9-3) was pretty much fine, whereas Martin’s “improved” version in listing 9-4 is awful. From the terrible function name (wayTooCold();) to its unexpected side effect (executing controller.tic();) to the cryptic assertEquals("HBchL", hw.getState());, it breaks several rules that Martin himself wrote down earlier in this book. And perhaps worst of all: the code is hard to talk about.[3]

(The supposed main point of this subsection, by the way, was that the getState() function introduced for this DSTL is implemented in a cleaner but less efficient way. Martin argues that while this is not suitable for the embedded real-time system running in production, it is preferable for a more powerful test system. But yet again, this book is not making that point well because the example contains too many other distractions.)
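To make the readability complaint concrete, here is a rough Python sketch (not the book’s Java code). The encode_state function imitates the one-letter-per-flag idea behind the "HBchL" string as I understand it; the HardwareState dataclass and its field names are my own assumptions. Comparing dataclass instances keeps the test down to a single assert but names every flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareState:
    # Field names are assumptions for this sketch, not taken from the book.
    heater: bool
    blower: bool
    cooler: bool
    hi_temp_alarm: bool
    lo_temp_alarm: bool


def encode_state(state):
    """Cryptic variant: one letter per flag, uppercase meaning 'on'."""
    flags = [
        (state.heater, "h"),
        (state.blower, "b"),
        (state.cooler, "c"),
        (state.hi_temp_alarm, "h"),
        (state.lo_temp_alarm, "l"),
    ]
    return "".join(letter.upper() if on else letter for on, letter in flags)


way_too_cold = HardwareState(
    heater=True, blower=True, cooler=False,
    hi_temp_alarm=False, lo_temp_alarm=True,
)

# Cryptic: the reader has to work out which 'h' is which.
assert encode_state(way_too_cold) == "HBchL"

# Explicit: the expectation names every flag.
expected = HardwareState(
    heater=True, blower=True, cooler=False,
    hi_temp_alarm=False, lo_temp_alarm=True,
)
assert way_too_cold == expected
```

The explicit version is longer, but a failing comparison can be read aloud as “hi_temp_alarm should be False”, which is the property the string encoding lacks.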

One Assert per Test?

Here, Martin rejects the school of thought that claims every test should have only one assert statement, suggesting instead a “one concept per test” rule. After reading earlier chapters, I found this surprisingly pragmatic.
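A small sketch of what “one concept per test” can look like in Python; parse_version is a made-up function used only for illustration. The first test has three asserts, but they all check a single concept (a well-formed version string is split into its numeric parts). Rejecting malformed input is a different concept and gets its own test.

```python
def parse_version(text):
    """Hypothetical function under test: '1.2.3' -> (1, 2, 3)."""
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


def test_parses_well_formed_version():
    # Three asserts, one concept: the parts of a valid version string.
    major, minor, patch = parse_version("1.20.3")
    assert major == 1
    assert minor == 20
    assert patch == 3


def test_rejects_malformed_version():
    # A separate concept: input with too few parts is an error.
    try:
        parse_version("1.2")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a two-part version")


if __name__ == "__main__":
    test_parses_well_formed_version()
    test_rejects_malformed_version()
```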

  1. I also much prefer Beck’s overall tone to the one Martin adopted in this section. Instead of Martin’s dogmatic commandments, Beck simply describes a pragmatic approach to programming.
  2. “Of course the path separator in URLs and local file paths is always the same. I mean, no reasonable OS would use anything other than a slash as a path separator, right?” 🫣
  3. “The state is uppercase h, uppercase b, lowercase c, uppercase h, uppercase l but it should be uppercase h, uppercase b, lowercase c, lowercase h, uppercase l.”