“Clean Code” Book Club: Conclusions

Posted on Mon 04 November 2024

The final chapters of Clean Code contained step-by-step examples for code refactorings in chapters 14–16, as well as a list of rules in chapter 17. Both were not quite suitable for the format of a book club, so we worked through one of the refactoring chapters together but stopped after that.

Now that we’re done, I spent the past six weeks or so pondering how I feel about this book. My initially high hopes for the book were disappointed by chapter 2 or 3. Over the following chapters, I increasingly found myself wondering, who this book is for. I wouldn’t recommend it to experienced programmers, who would get very little out of it. I certainly wouldn’t recommend it to beginners—while there is plenty of good advice here, there’s also enough bad advice1 and utterly terrible example code that it might do more harm than good. Maybe an intermediate Java programmer might get something useful out of it? But for RSEs like us, it was definitely not a useful book.

Despite all that, I still enjoyed the book club immensely. I’m a physicist by training; and while I got a lot of experience writing software during my PhD and postdoc, I received almost no formal training in software development. Being self-taught used to give me impostor syndrome—still does, sometimes—especially when I started working as a Research Software Engineer. Reading through this book, realizing I was already familiar with most of the contents, discussing it with colleagues and being able to make convincing arguments where I disagree with advice in the book … all this gave me a lot more confidence. More confidence, almost certainly, than I would’ve gotten from reading a better book.

So for me, at this particular time in my life, this may have been just the right book? Very strange, indeed.

  1. or at least, advice that should be taken with far more grains of salt than the disclaimer in chapter 1 provides

“Clean Code” Book Club: Chapters 10–13

Posted on Wed 11 September 2024

Continuing with our book club on Robert Martin’s “Clean Code”, in July and August we’ve discussed chapters 10 (“Classes”) to 13 (“Concurrency”). (As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 10: Classes

This chapter was written by Robert Martin together with Jeff Langr.

The first rule of classes is that they should be small. The second rule of classes is that they should be smaller than that. (p. 136)

That’s a less than auspicious way to start the chapter.

However, instead of using lines of code, this chapter uses responsibilities of a class as a measure. This is far more reasonable, since it’s actually related to code cleanliness. As a guideline, it cites the Single Responsibility Principle:

a class or module should have one, and only one, reason to change. (p. 138)

That is good advice in principle; but once again, we find ourselves struggling with a philosophical question: What is one reason? How fine-grained do we define our concepts? The authors advocate for a very fine-grained approach, using the metaphor that we want our tools neatly organized into toolboxes instead of in a single jumbled pile. However, I’m wary of overdoing it, since a proliferation of toolboxes can become overwhelming, too! (Imagine if—instead of a drawer for kitchen utensils—your kitchen had separate drawers for whisks, for slotted spoons, for mixing spoons, for potato mashers, for ladles, for spatulas, for measuring spoons, for skimmers, for graters, etc.)

Cohesion is suggested as a useful indicator: High cohesion indicates that all parts of a class operate together (or “change for the same reason”), while low cohesion indicates that a class might be split into multiple, more cohesive parts. This discussion made me wonder whether there are tools to visualize code cohesion. (Though if we make our classes as small as the authors would like us to, it’s probably fairly easy to determine the cohesion with a quick look over the code.)

Also, there were several editing issues in code listing 10-10 (for example, the order of subclasses differed from order of functions in 10-9, for no discernible reason; and it looks like a few lines in the PlaceholderList and PrepareInsertSql classes were deleted accidentally, so both classes got smushed into each other).

Chapter 11: Systems

Since the contents were highly Java-specific and not transferable to any other languages, we skipped this chapter in our book club. (Several of us noted that this chapter felt very out of place in the book, both in terms of writing style and target audience.)

Chapter 12: Emergence

This was a nice summary of many key aspects from previous chapters, so we didn’t have too much to discuss. I found it a little strange to have such a chapter located at this point in the book—it might have worked better to swap this with chapter 13, to close the first part of the book with this summary chapter, before getting into the worked examples in chapters 14–16.

Chapter 13: Concurrency

In this chapter, Brett L. Schuchert took up the unenviable task of summarizing concurrency in 13 pages (plus a 30 page appendix with more detailed code examples). Due to this short length, the chapter felt lacking in several areas: for example, there was no discussion of the distinction between multithreading and multiprocessing; and subsections on different execution models (Producer-Consumer, Reader-Writer, Dining Philosophers) or writing correct shut-down code contained brief descriptions of the problems, but no actionable advice. Both the Java-specific focus of the book and its age introduced further gaps, since many concepts or approaches that are more prominent in other languages or simply younger than ~15 years (async/await, promises and futures, actor model, …) were missing completely.1

Those inherent limitations aside, the chapter did a good job dispelling some myths and misconceptions around concurrency. I especially liked the example demonstrating (including the full byte code in the appendix!) how even a simple line of code like ++lastIdUsed; is non-atomic and can lead to threading issues.

Concurrency Defense Principles

Referring back to the Single Responsibility Principle (“A method/class/component should have a single reason to change”), this subsection highlights that concurrency design is a reason in its own right. Code to manage concurrency should thus be separated from domain-specific code. The three corollaries discussed here (limit the scope of data, use copies of data, threads should be as independent as possible) effectively lead towards a functional programming style; though sadly the author did not explicitly mention this.

Testing Threaded Code

To me, this subsection was a highlight of the chapter, containing clear descriptions of the problems in testing multithreaded code and actionable advice on solving them:

  • Treat spurious failures as candidate threading issues
  • Get your nonthreaded code working first
  • Make your threaded code pluggable
  • Make your threaded code tunable
  • Run with more threads than processors
  • Run on different platforms
  • Instrument your code to try and force failures
  1. That’s before we get into implementation details of individual languages, other than Java. E.g. most relevant to us, Python has a couple peculiarities like the Global Interpreter Lock (which enforces that only one thread at a time executes Python bytecode) and how it interacts with binary extension modules like NumPy.

“Clean Code” Book Club: Chapter 9, Unit Tests

Posted on Sun 14 July 2024

Continuing with our book club on Robert Martin’s “Clean Code”, last week we’ve discussed chapter 9 (“Unit Tests”). (As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 9: Unit Tests

This chapter—the only testing-related one in the book—focusses specifically on unit tests. There’s nothing here on integration tests; nothing on doctests; almost nothing on continuous integration or automation, except that tests should be “convenient to run”. And while the latter is understandable, given that the book was written over 15 years ago, it is a significant omission for a modern reader.

In the chapter’s introduction, Martin starts by describing an ad-hoc testing procedure common in the mid-90s, then lauding how the Agile and TDD movements have caused a “mad rush” to integrate testing into programming.

The Three Laws of TDD

While Martin is a strong proponent of Test Driven Development (TDD), I think his description of it here is doing TDD a disservice by being eyeroll-inducingly dogmatic:

First Law: You may not write production code until you have written a failing unit test.

Second Law: You may not write more of a unit test than is sufficient to fail, and not compiling is failing.

Third Law: You may not write more production code than is sufficient to pass the currently failing test. (p. 122)

So, on the one hand, Martin calls these “laws” and wrote them to be extremely prescriptive—he clearly wants us to take these literally. On the other hand, if we take these literally, then we’re never allowed to refactor our production code or test code; because refactoring, by definition, does not affect whether tests fail.

Kent Beck, who “rediscovered” TDD, wrote a much better summary of the process of TDD. To the above three steps (Martin’s “laws”), he adds a step 0 (think about the test scenarios to cover) and a step 4 (refactoring).1

Keeping Tests Clean

Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code. (p. 124)

Yep.

Though I’d add that, typically, test code should be strictly simpler than production code. (And this, in turn, makes it easier to keep the tests clean.)

It is unit tests that keep our code flexible, maintainable, and reusable. (p. 124)

… because they enable us to make changes to our code, without having to worry about introducing new bugs by accident. We talked for a bit about our personal experiences where tests—or their absence—impacted how confident we were about making changes to code and how that impacted our development process.

A corollary: When we discover a bug, in addition to fixing it, we should think about how to improve our tests so they discover similar bugs in the future. We won’t be able to avoid making mistakes; but at least we can learn from them and try to fail better next time.

That can mean adding new unit tests for a currently untested part of the code; or covering an edge case we didn’t think about earlier; or adding new elements to our test suite (e.g. run the test suite under multiple operating systems, once we realized parts of our code are subtly OS-dependent2).

Domain-Specific Testing Language

I’m not completely sold on this. Yes, developing your own Domain-Specific Testing Language may make your tests look cleaner; but the price for that is a significantly increased amount of helper code. (At what point do you need to start writing a separate test suite to test your DSTL?)

While Martin doesn’t discuss this trade-off, his first example (listings 9-1 and 9-2) gives a good example where this approach helps produce cleaner code.

A Dual Standard

Then, however, we get to listings 9-3 and 9-4 … 😱

The “before” version of the code (listing 9-3) was pretty much fine; whereas Martin’s “improved” version in listing 9-4 is awful. From the terrible function name (wayTooCold();) to its unexpected side effect (executing controller.tic();) to the cryptic assertEquals(“HBchL”, hw.getState());, it breaks several rules that Martin himself wrote down earlier in this book. And perhaps worst of all—the code is hard to talk about.3

(The supposed main point of this subsection, by the way, was that the getState() function introduced for this DSTL is implemented in a cleaner but less efficient way. Martin argues that while this is not suitable for the embedded real-time system running in production, it is preferable for a more powerful test system. But yet again, this book is not making that point well because the example contains too many other distractions.)

One Assert per Test?

Here, Martin rejects the school of thought that claims every test should have only one assert statement; suggesting instead a “one concept per test” rule. After reading earlier chapters, I found this surprisingly pragmatic.

  1. I also much prefer his overall tone to the one Martin adopted in this section. Instead of Martin’s dogmatic commandments, Beck simply describes a pragmatic approach to programming.
  2. “Of course the path separator in URLs and local file paths is always the same. I mean, no reasonable OS would use anything other than a slash as a path separator, right?” 🫣
  3. “The state is uppercase h, uppercase b, lowercase c, uppercase h, uppercase l but it should be uppercase h, uppercase b, lowercase c, lowercase h, uppercase l.”

“Clean Code” Book Club: Chapter 8, Boundaries

Posted on Mon 17 June 2024

Continuing with our book club on Robert Martin’s “Clean Code”, last week we’ve discussed chapter 8 (“Boundaries”). (As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 8: Boundaries

Here’s a boundary I like to set: I don’t want a sketch of a boy leering at a towel-clad woman in the bathroom in my programming books. The cover image for this chapter is kinda creepy.


This chapter, written by James Grenning, covers ways of interacting with third-party code (or even first-party code that’s not yet written; or that you don’t want to closely couple to).

One key point that came up again and again in our discussion was that not all dependencies are equal. On one hand, there are things like a logging framework—an example Grenning uses throughout this chapter—, where a typical application might use a small number of functions hundreds of times throughout the code base. On the other hand, there are dependencies like SciPy, where some piece of research software may use a mix of statistical functions, numerical integration or interpolation algorithms, constants and more: the overall API surface it’s exposed to is much larger, but most elements are used a small number of times or even just once.

Unfortunately, this chapter focusses solely on the first case; it does not discuss how to identify cases where these approaches may not be appropriate. I was also missing a more nuanced discussion of the downsides of the approaches advocated in this chapter.

Using Third-Party Code

Providers of third-party packages and frameworks strive for broad applicability so they can work in many environments and appeal to a wide audience. Users, on the other hand, want an interface that is focused on their particular needs. (p. 114)

The example here is a good illustration:

public class Sensors {
    private Map sensors = new HashMap();

    public Sensor getById(String id) {
        return (Sensor) sensors.get(id);
    }

    // ...
}

The Sensors class is a Map of individual sensors; but instead of exposing the full API of Java’s Map to the caller, it encapsulates the actual Map and provides its own, more limited API.

This interface lets the Sensors class explicitly enforce design and business rules; e.g. ensuring that any object added to the Map is of the Sensor type. It also allows the Sensors implementation to change over time, without changing the interface exposed to callers.

On the other hand, if the interface is too restricted, it could be frustrating to callers. E.g., if I had a Sensors class in Python, I’d expect it to support language idioms like

for sensor in sensors:
    # do something with each sensor here

and the class would feel terribly broken if that didn’t work.

Another potential downside is inconsistency: Over time, developers become familiar with the interface of common data structures (like maps, lists, etc.) in their language; so if Sensors inherits the interface of Map, they will be able to pick that up instantly. However, if Sensors implements its custom API that deviates from the familiar interface (e.g. by having a function getByID instead of get), it has a much steeper learning curve.

Now, whether the advantages are worth these trade-offs depends on the context: Developers on a large corporate Java app may value the enforced structure higher; whereas I, as maintainer of open-source scientific Python codes, prefer a more familiar interface for occasional users or first-time contributors. However, it’s a little disappointing that the book did not acknowledge these trade-offs.

Exploring and Learning Boundaries / Learning log4j / Learning Tests Are Better Than Free

These three closely related sections have led to extensive discussion during our book club. There were several points here, that we were really struggling with.

First, the suggestion that every project should write their own custom wrapper for log4j. A convenience function to set sensible defaults during initialization? Sure. But a full wrapper? That may make sense in some situations—but as above, it comes with trade-offs that the chapter simply doesn’t acknowledge. And for a dependency like SciPy, writing my own wrapper simply doesn’t make sense.

Next, we discussed “learning tests”. We commonly explore third-party code by playing around with it in the REPL before using it in our own code; but contrary to the chapter’s claim, writing tests still has a nontrivial overhead.1

We also weren’t sure, what benefit the “learning tests” are supposed to deliver. The sample code in the chapter only showed very basic tests that verify that the code executes successfully; i.e. that the log4j functions aren’t removed/renamed or get a completely different signature (such as additional required arguments). But in compiled languages, these are checks that a compiler would already perform, before even running the tests. More subtle semantic changes to an—outwardly identical—API may not be caught by the compiler; but even then, existing tests of your own code are effectively “integration tests” for your dependencies, that would catch such changes if they affect your own code.

So: What is the benefit of these learning tests? If I think about it for a while, I might come up with some edge case; but is that truly worth the ongoing cost (energy usage, time, complexity, …) of this significantly larger test suite?

Using Code That Does Not Yet Exist

This all seems sensible; and the Adapter Pattern introduced here can indeed be quite powerful. Unfortunately, as above, the chapter doesn’t discuss the limits of this approach. Sure, it works great if the API you’re mocking is fairly small—just a single function in this example. But what happens if you have a much more complex dependency, whose API design isn’t obvious?

  1. Though Java didn’t get an official REPL until 2017, long after “Clean Code” was written.

“Clean Code” Book Club: Chapters 5–7

Posted on Tue 11 June 2024

Continuing our book club on Robert Martin’s “Clean Code”, over the past month we’ve started to rotate who leads the discussion on each chapter. When I’m not the discussion lead, my notes have been a bit less detailed. Accordingly, posts from now on may be shorter and less comprehensive; completeness continues to explicitly be a non-goal.

In this post, I’ll cover chapters 5 (“Formatting”), 6 (“Objects and Data Structures”) and 7 (“Error Handling”).

Chapter 5: Formatting

Pick a vaguely reasonable style, use it, be pragmatic about it. That’s it.

Robert Martin discusses his personal style preferences and thinking behind them in some detail in this chapter. Overall, it’s a reasonable style; but we didn’t discuss details since that would just risk devolving into bike-shedding.

Instead, we used most of the hour-long session to talk about our experiences with automatted tools. Code formatters like black, clang-format or others may be helpful; on the other hand, automated tools are notoriously bad at being pragmatic and may be more frustrating than helpful in some cases.

Chapter 6: Objects and Data Structures

Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions. (p. 95)

Once you get used to the fact that Martin’s definition of “data structures” is different from what we typically think of as data structures, this distinction makes sense.1 Martin then explains nicely how data structures lead to procedural code; and how this makes it easy to add new functions that operate on these data structures, without having to change existing data structures. On the other hand, Martin explains, objects enable object-oriented code, which makes it easy to add new objects without having to change existing functions.

This complementarity is quite elegant; however, we discussed in the book club that it’s not always helpful in practice. Research software is often quite exploratory, so when writing some initial code, it may be uncertain how it will be extended in the future; or it may be necessary to add both objects/data structures and functions.

The Law of Demeter

This section led to a lot of discussion and we were somewhat sceptical.

For one, method chaining is not bad per se; while it may be considered sloppy style in Java, it is a very established pattern e.g. in jQuery (which is explicitly designed to enable this).

But more importantly, we were not convinced by the law itself. According to the Law of Demeter, Martin says, the following code …

Options opts = ctxt.getOptions();
File scratchDir = opts.getScratchDir();
final String outputDir = scratchDir.getAbsolutePath();
String outFile = outputDir + "/" + className.replace('.', '/') + ".class";
FileOutputStream fout = new FileOutputStream(outFile);
BufferedOutputStream bos = new BufferedOutputStream(fout);

… is bad form, because it knows too much about the objects returned by getOptions() and getScratchDir(). But passing those returned objects as an argument into a helper function, which then knows the exact same things about these objects, would supposedly be fine. I’m sure there are more complex examples with a layered architecture, where this would make sense. But in this specific example that would just add unnecessary layers of indirection to obey a constraint that feels arbitrary.

In Martin’s defense, he doesn’t propose adding helper functions. Instead, he suggests creating another function on the ctxt object. Naïvely, one might come up with a ctxt.getAbsolutePathOfScratchDirectoryOption(); function as a direct replacement for ctxt.getOptions().getScratchDir().getAbsolutePath();. However, exposing all chained functions like that would lead to a combinatorial explosion.

Instead, Martin considers how that scratch path is actually used by the calling code, and recommends a function that specifically fulfills this use case:

String classFileName = className.replace('.', '/') + ".class";
BufferedOutputStream bos = ctxt.createScratchFileStream(classFileName);

The downside here, which Martin ignores, is that as different use cases appear, this could equally lead to an explosion of methods on the ctxt object. (And what’s more, it requires the ctxt object to be aware of any higher level code calling it and to change in response to changes in higher level code. Not a great software design.)

Chapter 7: Error Handling

Error handling is important, but if it obscures logic, it’s wrong. (p. 103)

Leaving aside some Java-specific bits that weren’t applicable to other programming languages that our group commonly uses, this was an interesting chapter that sparked some good and nuanced discussion in our book club.

Use Exceptions Rather Than Return Codes

In principle, we agreed that exceptions—where available—are preferable to return codes.

The code example for this section quite nicely separates error handling from application logic. However, it assumes that all we do in response to errors is to log them. What if we want to handle the errors in other ways, e.g. by restoring a previous state or prompting the user for input? That would by necessity mix error handling and application logic.

Write Your Try-Catch-Finally Statement First

We agreed it’s typically a good idea to think about the overall structure first (What am I going to do? How can it fail? How do I handle failure?), before writing the implementation details. However, the example goes far beyond that, giving step-by-step instructions on how to do test-driven development (TDD). That is rather beside the point here and doesn’t make much sense without explaining the motivations for using TDD, which will only be covered in a later chapter.

Define Exception Classes in Terms of a Caller’s Needs

Often, the caller doesn’t care and doesn’t want to know about the details of how an operation failed; it just matters whether it did, in fact, fail. Adding a wrapper class to hide away those details, as recommended in this section, works well enough in those cases. On the other hand, it would usually be preferable if the library implements a class hierarchy of exceptions. That way, callers can choose whether to look out for, e.g., a generic ConnectionError or whether to handle each ConnectionError subclass (ConnectionAbortedError, ConnectionRefusedError, etc.) individually; all without having to write their own wrapper.

Define the Normal Flow

Hm … on the one hand, this concrete example looks cleaner when using a default value. But on the other hand, picking a particular default value is tying the implementation of the ExpenseReportDAO class very closely to the higher-level business logic using it, which could lead to severe issues. For example, what if the business logic requires a different default value in some other circumstances? (E.g., when claiming expenses for conference attendences, the default is a per diem; but when claiming expenses for hosting a visitor, there is no per diem.) Or what if the per diem varies by location and needs to be entered by the admin staff manually?

If the ExpenseReportDAO class raises a MissingExpensesError and lets the higher-level business logic handle it differently depending on context, that’s all straightforward to implement. But if it returns a fixed default value automatically, handling other cases becomes much more difficult. (Perhaps even impossible, if the calling code can’t distinguish between, say, a per diem of £30 and actual expenses that happen to sum to £30.)

  1. Some details of this chapter were very Java-specific. For example, in Python it’s common to access member variables directly, rather than defining getters and setters; there is also no enforced distinction between private and public variables. In practice, though, naming conventions (e.g. using a leading underscore for “private” variables) or documentation usually fulfill that function well enough, so the main points of this chapter apply in Python, too.

“Clean Code” Book Club: Chapter 4, Comments

Posted on Wed 08 May 2024

Continuing with our book club on Robert Martin’s “Clean Code”, this week we’ve discussed chapter 4 (“Comments”). (As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

After the last chapter, which I had severe issues with, this one got better again. My overall impression is, though, that it’s suffering a lot from a lack of editing: In addition to several redundant passages, there are a few mistakes in sample code, several redundant subsections and plenty of redundancies.

Chapter 4: Comments

The proper use of comments is to compensate for our failure to express ourself in code. (p. 54)

We had rather mixed feelings about the introduction. Martin does a good job explaining potential problems with comments (they can easily become outdated and misleading) and arguing that time spent keeping them up to date is often better spent making code more readable so it doesn’t require comments in the first place. However, he’s really over-doing his spiel: It’s hard to take a statement like “Comments are always failures” seriously when it’s followed by a section listing eight different kinds of “Good Comments”.

Good Comments

We fully agreed with four of the types of good comments (“Explanation of Intent”, “Clarification”, “Warning of Consequences” and “Amplification”) and didn’t have much to discuss or add to those. The other four types led to some discussion:

Legal Comments

For example, copyright and authorship statements are necessary and reasonable things to put into a comment at the start of each source file. (p. 55)

Are they? IANAL, but isn’t it sufficient to have a license in the repository, rather than in each source file? (And why just source files? What about documentation files? Or graphics, like a software’s logo?)

Also, while authorship information in the style of Copyright (C) 2024 by $COMPANY_NAME may work for software written by a single company, how could you meaningfully do this for an open-source project that is developed by, potentially, thousands of individuals? I occasionally see something like Copyright (C) 2024 by $PROJECT_NAME contributors, which is basically tautological. (“This code was written by the people who wrote this code.”)

Informative Comments

This section starts with the (vague enough to be useless) statement that “it is sometimes useful to provide basic information with a comment”. It then gives two examples, each followed by a paragraph explaining that it would be better to change the code and avoid this comment.

Why was this section written in the first place? And how did it make it past an editor?

Clarification

The second example from the “Informative Comments” section—adding a human-readable version of a regex—would have worked well here. (Arguably better than the example actually given here: If the code reads a.compareTo(a) == 0;, adding the comment // a == a really doesn’t offer much benefit.)

TODO Comments

We weren’t big fans of these. Martin argues that IDEs make it easy to locate these comments, so it’s easy for programmers to keep track of them. But that doesn’t square with the chapter introduction: If a major reason to avoid comments is that programmers will not put in the effort to keep them up to date even when editing that exact part of the code, then why would you assume programmers are going to put in the effort to regularly look for TODO comments across the whole code base?

Instead, we discussed keeping track of TODOs in an issue tracker instead; to ensure they’re all in a single place, rather than some being in an issue tracker and some scattered throughout the code base. In the cases where it’s actually necessary to add a TODO comment in the code—e.g. if code may be confusing otherwise—it’s then best to add a link to the respective issue to the TODO comment and include further details and discussion in the issue. This also makes it clear how to find out whether a particular TODO is still relevant.

Bad Comments

Among bad comments, we found ourselves fully agreeing with about half the subsections (“Misleading Comments”, “Journal Comments”, “Don’t Use a Comment When You Can Use a Function or a Variable” (This one in particular had a very nice example!), “Commented-Out Code”, “HTML Comments”, “Nonlocal Information” and “Too Much Information”) and didn’t have much to comment on there.

Mumbling

There’s a mistake in the example for this subsection: The code listing uses a loadedProperties.load function, but the text below refers to loadProperties.load every time. While subtle, this is a potentially important distinction here: loadedProperties strongly implies that some properties (presumably the defaults?) were already loaded; which would make the comment in this example a little clearer.

More generally, while unclear or confusing comments are indeed undesirable, we didn’t find this advice very actionable. Often, the person writing a comment may not even realise that it’s ambiguous and that it may be interpreted differently by another reader.

Redundant Comments

In the second example here, Martin argues that a set of javadocs comments is redundant and thus useless clutter that should be avoided.

/**
 * The container event listeners for this Container.
 */
protected ArrayList listeners = new ArrayList();

/**
 * The Loader implementation with which this Container is
 * associated.
 */
protected Loader loader = null;

This will depend on the context, of course, but we were sceptical about this. For public APIs at least, we’d err on the side of documenting them consistently, even if several of the comments are redundant. A user’s first impression when some functions or variables are missing documentation is not going to be “Oh, clearly they left out documentation because it would be redundant”; it’s gonna be “oh, the documentation is inconsistent; I can’t trust it.”

Mandated Comments

Here, Martin is just repeating his earlier point about redundant information in javadocs, using a different example. By the way: Have you spotted the error in the code listing 4-3?1

Noise Comments

In the first example of this subsection, Martin is repeating his point about redundant javadocs for the third time. Is this a total lack of editing or a badly written meta-joke about redundancy?

The second example (listing 4-4) is at least different, but I found it quite irritating. In this listing, the programmer added a comment saying //Give me a break! when some exception handling code throws yet another exception. Noise? Sure. A little unprofessional? Perhaps. But also harmless and cathartic and very human.2

Martin’s advice here—the programmer should’ve extracted a try/catch block into a separate function instead—feels emotionally tone-deaf to me. (Not just because I disagree; but also because it’s a complete non sequitur and sounds patronizing: “You seem frustrated. Have you tried doing Yoga extracting a few lines into a separate function?”)

Scary Noise

This is the fourth time in this chapter that Martin is telling us that javadocs can be noisy. Give me a break!

Position Markers

Here, we agreed that these are okay when used sparingly; which even Martin seems to accept. (Though we were a bit puzzled that he writes “the noisy train of slashes at the end” should “especially” be eliminated, when this is exactly what makes the marker so visible and thus useful. That’s like arguing fire alarms are okay sometimes, but they shouldn’t be so loud.)

Closing Brace Comments

Agreed; these are unnecessary clutter. A combination of code indentation and IDEs highlighting matching braces and/or current scope removes the need for this.

Attributions and Bylines

Agreed; git blame is preferable.

(That said, this is just a variant of the “Journal Comments”; so why not merge the two subsections?)

Inobvious Connection

I agree with this point in principle; but what is or isn’t obvious depends heavily on context. Martin’s example here doesn’t strike me as particularly good:

/*
 * start with an array that is big enough to hold all the pixels
 * (plus filter bytes), and an extra 200 bytes for header info
 */
this.pngBytes = new byte[((this.width + 1) * this.height * 3) + 200];

What is a filter byte? Does it relate to the +1? Or to the *3? Both? Is a pixel a byte? Why 200? (p. 70)

I don’t have experience writing image manipulation code; but even to me, it seems fairly obvious that the pixels array is width * height * 3 (where 3 stands for three colour values per pixel—red, green and blue), so the + 1 must correspond to the filter bytes. A developer working on this low level image manipulation code will almost certainly have much more experience in this area than I do; so more detailed explanations are likely unnecessary here.

Also, the exact composition of the PNG header info bytes is arguably irrelevant here? If the comment were to go into detail here—rather than elsewhere in the code, where the header bytes are actually handled—that would be an example of non-local information, which Martin rightly advised against just one page earlier.

Function Headers

Isn’t this just a special case of “Redundant Comments” and “Noise Comments”? Why do these two fairly vague sentences need their own subsection?

Javadocs in Nonpublic Code

I don’t think public vs. nonpublic is the right distinction here. Just because an API is private, doesn’t mean that the people using that API won’t benefit from good docs, just like external users of a public API will.

The better distinction here is between a (public or nonpublic) API on the one hand and internal helper functions on the other hand (which are only relevant to developers who are working on that particular file). (Maybe that is actually what Martin means here and his language contrasting “nonpublic code” with “public APIs” just isn’t very precise.)

  1. The docstring and function signature contain int durationInMinutes, but the implementation sets cd.duration = duration;.
  2. To be clear: More explicit swearing, violent or insulting language would be completely unacceptable in a professional or open-source code base. But an exasperated “Give me a break!” is something completely different.

“Clean Code” Book Club: Chapter 3, Functions (Part 2)

Posted on Thu 02 May 2024

Continuing with our book club, this week we’ve finished chapter 3 (“Functions”), covering pages 40–52. For boring reasons, I missed part of the discussion this time, so this will be a shorter than usual post. I’ll focus solely on the “Function Arguments” subsection, which inspired by far the most discussion, and skip the rest.

A few days ago, a colleague shared the post “It's probably time to stop recommending Clean Code” with our book club. I wrote last week’s notes before reading that post, so I found it encouraging to notice overlap between both—and sometimes quite cathartic to see vocal criticism of other issues in the chapter that subconsciously irked me but that I wasn’t able to articulate precisely.

Chapter 3: Functions

Function Arguments

The ideal number of arguments for a function is zero. Next comes one, followed closely by two. Three arguments should be avoided where possible. More than three requires very special justification—and then shouldn’t be used anyway. (p. 40)

As so often in this chapter, I think the basic point (too many arguments tend to correlate to messy code) is reasonable; but by focussing too much on a metric that’s easy to measure, but not very accurate, Martin once again gives advice that leads to bad code.

For example:

includeSetupPage() is easier to understand than includeSetupPageInto(newPageContent). The argument is at a different level of abstraction than the function name and forces you to know a detail that isn’t particularly important at that point. (p. 40)

Skipping the argument makes it easier to understand that one line, perhaps. But if I want to understand what the code does, knowing where that setup page gets included is pretty important. And leaving out that single argument means I have to instead look up and read the whole definition of the includeSetupPage() function.

Another example can be found in this extract from listing 3-7:

public class SetupTeardownIncluder {
  private PageData pageData;
  private boolean isSuite;
  private WikiPage testPage;
  private StringBuffer newPageContent;
  private PageCrawler pageCrawler;

  private String render(boolean isSuite) throws Exception {
    this.isSuite = isSuite;
    if (isTestPage()) // <-- this is the offending line
      includeSetupAndTeardownPages();
    return pageData.getHtml();
  }
}

If what is a test page?

It’s gotta be one of the five member variables listed above … but can you guess which of them? Defining isTestPage() without any arguments means I have to glance up at the list of member variables, which already makes the code harder to understand than simply having that one argument.

What makes this case particularly egregious, though, is that the naming is actively misleading: Based on both the class names and variable names, clearly only one of the five (the WikiPage testPage) is a “page”—so surely that would be the page that isTestPage() checks, right? Wrong.1


In a later sub-section, Martin argues that flag arguments (i.e., any argument that is a boolean) are “a truly terrible practice”, because they mean that a function “does more than one thing” and should be split into two different functions. But often, flag arguments don’t determine whether to do one thing or another; they just specify one small aspect of how that one thing is done. For example: Should the function do a “dry run” and list the changes that would happen or actually modify the data? Should it modify an array in place or return results in a new array? Should it print verbose output or not?

Splitting such things up into separate functions would be clunky; especially if multiple flags could apply independently. (Imagine that API: performOperationVerboselyInPlace(), performOperationQuietlyInPlace(), performOperationVerboselyNondestructive(), performOperationQuietlyNondestructive(), … 😱)

Martin is correct that a context-less true or false in a function call is often confusing. But in other languages, like Python, that is easily solved by using keyword arguments. For example:

deleteOldFiles("/usr/local/", True)  # Confusing :(
deleteOldFiles("/usr/local/", dry_run=True)  # Perfectly clear :)

This also solves a problem Martin mentions for functions with multiple arguments that don’t have an obvious ordering (e.g., assertEquals(expected, actual) or assertEquals(actual, expected)).

Some languages even go one step further: Objective-C (example here) and Swift (example below) distinguish internal and external keyword names:

func divide(numerator: Int, byDenominator denominator: Int) -> Int? {
    if denominator == 0 {
        return nil
    }

    return numerator / denominator
}

So while the arguments are named numerator and denominator in the function body, the function can be called as divide(numerator: 42, byDenominator: 7), which sounds even closer to an English sentence.

  1. The correct answer would have been pageData—and the only way to figure that out is, once again, to look up and read through the function definition.

“Clean Code” Book Club: Chapter 3, Functions (Part 1)

Posted on Sat 27 April 2024

Continuing with our book club, this week we started with chapter 3 (“Functions”). Since it’s a longer chapter, we’re splitting it into two; this week focussed on pages 31–40.

(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 3: Functions

At the beginning of this week’s discussion, it quickly became clear that we all had reservations about the advice in this chapter. Several people said that the basic advice seemed sound but the author, Robert Martin, is overdoing it.

I’ll go one step further: This chapter is fundamentally flawed.

That flaw, I think, is that it’s too obsessed with counting lines of code, when the actual question is whether code is “clean”.1 Now, much of the time, LoCs might correlate with “cleanliness”; but make no mistake, LoCs are used because they are easy to measure, not because they are an accurate measure.2 And if you agressively optimize for a flawed metric, you will miss your actual goal; your code will be utterly terrible and you will be so blinded by your metric that you cannot see that.


We ended chapter 2 with a code example that did too many things at once and thus failed to make the point it was supposed to make. Unfortunately, with listings 3-1 and 3-2, we begin chapter 3 in the same way. There are several things changing between these listings:

  1. renaming (of the function itself and several variables)
  2. cleaning up the logic (for non-test pages, code would write page content to buffer and then back to page; that is skipped)
  3. and extracting functions

Steps 1 and 2 are unquestionably improvements. However, the focus in this chapter should be on step 3—and that feels a bit like a sleight of hand in this example, because the extracted code is simply hidden from listing 3-2. I could get almost the same effect by adding the comments // include setup pages and // include teardown pages in front of the if blocks in listing 3-1 and then collapsing those blocks in my IDE to hide them from view.

Small!

There is a famous quote, usually (incorrectly) attributed to Albert Einstein:

Everything should be made as simple as possible, but no simpler.

Unfortunately, Martin appears to have missed the second part of that quote:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. (p. 34)

Do shorter functions tend to be easier to comprehend than longer functions? Probably, yes. But there is a trade-off that Martin is missing: Individual functions don’t stand on their own. We don’t read them like a Buzzfeed listicle of “13 Haikus That Will Restore Your Faith in Humanity”3. We read functions, because we want to understand a chunk of code. And if that chunk of code is split up into five different functions, we need to understand those five different functions and how they interact with each other. Splitting code up into many smaller functions based on some arbitrarily set line count limit just replaces excessive length with excessive nesting depth.

Instead of counting lines of code, I’d suggest the following hypothesis: The optimal length of a function is commensurate with the chunk size of the programmer’s working memory. Like Martin, I can’t provide objective justification for this; but at the very least, my hypothesis is a lot more plausible than an arbitrary four-line limit.

Do One Thing

Functions should do one thing. They should do it well. They should do it only. (p. 35)

(Formatting [sic].)

At a high level, that sounds like reasonably good advice. The problem is: How do you define “one thing”? Is an island one thing? Is a beach on that island one thing? Is a grain of sand on that beach one thing? Arguably, each of them is “one thing”—just at different levels of abstraction.

One Level of Abstraction per Function

Martin uses this notion of “levels of abstraction” and asserts that each function should only have one level of abstraction. Which sounds reasonable enough … until you notice that he considers getPathNameForPage(page) and pageCrawler.getFullPath(page) different levels of abstraction. (From listing 3-7.)

And yes, technically, one may be able to argue that these are ever-so-slightly-different levels of abstraction. But whatever tiny clarity we may gain by separating these subtly different levels, we immediately pay in additional complexity by adding an extra layer of nested functions.

We then talked a few minutes about the “Stepdown Rule”, that reading code from top to bottom should descend from the highest abstraction level to the lowest:

  • This may not even work in C(++), which can be picky about how you order code.
  • It won’t work for single-file Python scripts, either: Code is executed from top to bottom, so low-level helper functions need to be defined above the main code block that uses them.
  • Is this similar to literate programming? Not quite; literate programming is e.g. well suited for data pipelines, since it models a linear flow of data; whereas this stepdown rule is about serializing a tree of nested functions.

Switch Statements

Polymorphism can replace the need for switch statements in some contexts; and the example given in listings 3-4 and 3-5 nicely illustrates that point. But Martin asserts that all switch statements should be met with polymorphism, which seems too dogmatic.4

Since we were discussing how this would look in Python, here’s a Python version of listing 3-5. It uses an Abstract Base Class to define the employee class structure; but unlike Java, idiomatic Python code would avoid the overhead of defining single-use EmployeeFactory interface and implementation classes, when a simple makeEmployee function would suffice.

from abc import ABC, abstractmethod

class Employee(ABC):
    """Abstract Base Class for all employees.
    Child classes must implement methods labeled `@abstractmethod`.
    """
    def __init__(self, record):
        self.employeeRecord = record

    @abstractmethod
    def isPayday(): pass

    @abstractmethod
    def calculatePay(): pass

    @abstractmethod
    def deliverPay(pay): pass

class CommissionedEmployee(Employee):
    def isPayday():
        # must implement this method here
        pass

    def calculatePay():
        # must implement this method here
        pass

    def deliverPay(pay):
        # must implement this method here
        pass

class HourlyEmployee(Employee):
    pass  # must implement all three methods here

class SalariedEmployee(Employee):
    pass  # must implement all three methods here

def makeEmployee(record):
    match record.type:
        case "Commissioned":
            return CommissionedEmployee(record)
        case "Hourly":
            return HourlyEmployee(record)
        case "Salaried":
            return SalariedEmployee(record)
        case _:
            raise ValueError(f"Invalid employee type: {record.type}")

class EmployeeRecord:
    type = "Commissioned"  # hardcoded for demo purposes

record = EmployeeRecord()
employee = makeEmployee(record)
print(type(employee))  # will print `<class 'CommissionedEmployee'>`

Use Descriptive Names

This is pretty much a reprise of chapter 2; there’s not much to comment on here.

We continue next week with function arguments, side effects and more …

  1. For some Stewart-ian “I know it when I see it” definition of “clean”, which is notoriously hard to measure.
  2. I’m reminded of this classic story.
  3. “Number 11 will surprise you!”
  4. Though I guess if all you have is an AbstractHammerFactory, everything looks like a nail …

“Clean Code” Book Club: Chapter 2, Meaningful Names

Posted on Thu 18 April 2024

After last week’s introduction, in the second session of this book club we dive into the main content of the book. And, if you believe the old joke, we start with one of the hardest chapters:

There are two hard problems in computer science: Naming things, cache invalidation and off-by-one errors.

(As usual, this post collects most of my notes and some points from our group discussion; completeness is explicitly a non-goal.)

Chapter 2: Meaningful Names

Like about half of the chapters in this book, this one was not written by Robert Martin, but by one of his collaborators; in this case, Tim Ottinger.

In general, my impression of this chapter was fairly uneven. It discusses many important aspects of naming, formulates clear guidelines—some of which I was already aware of, others I’ll need to keep in mind—and is very conscious of how context-dependent naming is. Superficially, some guidelines may seem contradictory; but to me, this is not a flaw in the writing, but rather an acknowledgement that naming choices have tradeoffs—being a good developer simply means being better at making reasonable tradeoffs. Unfortunately, several examples seemed unhelpful or even actively misleading to me.1

Use Intention-Revealing Names

If a name requires a comment, then the name does not reveal its intent. (p. 18)

For example, naming a variable d does not make the intent clear; whereas naming it maxAgeInDays does. This is such an obvious, basic guideline; and yet I’ve been guilty of breaking it. (Though if we’re honest, so has every programmer, right?)

Now, if this applies to naming function arguments as well, does that mean we shouldn’t (need to) describe arguments in the function’s docstring? We discussed this a bit in the book club; and we think that would be an overly broad interpretation of this guideline. For example, a function may take an argument maxAgeInDays but have a non-obvious behaviour for special values like 0. Such larger context belongs in a comment; the basic intent of the argument is still clear from its name alone.

Here’s a longer example:2

# Bad: Names don’t reveal meaning of the variables.
# It isn’t clear what the code is doing at a high level.
def getThem():
    list1 = []
    for x in theList:
        if x[0] == 4:
            list1.append(x)
    return list1

# Better: High-level intent is clear from reading code.
def getFlaggedCells():
    flaggedCells = []
    for cell in gameBoard:
        if cell.status = CellStatus.FLAGGED
            flaggedCells.append(cell)
    return flaggedCells

Avoid Disinformation

The first example in this section recommends not to use hp as a variable name for a hypotenuse, since it could also be used (alongside aix and sco3) as the name of a Unix variant.

I don’t think this is a great example. For one, I struggle to imagine a context where there is serious doubt whether a variable refers to a hypotenuse or to HP-UX. But also, hp is a fairly bad variable name for other reasons anyway—it’s fairly cryptic, hard to pronounce, … if you do need a name for a hypotenuse, at least make it hyp!

The following example (don’t call a variable accountsList if it’s not actually a list) makes a good point, though. And even if it is a list, encoding that in the name is often unnecessary. I like the convention of simply using a plural noun (accounts) to name such a variable—whether it’s a list, set, dict or similar class—since it enables beautifully readable code like for account in accounts:.

It is very helpful if names for very similar things sort together alphabetically (p. 20)

This is very practical API design advice! For example, in NumPy, np.as<TAB> offers the autocomplete options asarray, ascontiguousarray, asfortranarray, asmatrix, asscalar, etc., so even if I know only one or two of these functions, it’s obvious where to look for related ones. In contrast, if they were named np.arrayfrom, np.fortranarrayfrom etc., I’d have to look through the whole documentation to find the one that’s most appropriate.

Make Meaningful Distinctions

Two variables named time and time2 may be distinguishable to the compiler/interpreter, but they are not meaningfully distinct to a human reader. Use e.g. startTime and endTime instead—or rawTime and correctedTime, or whatever makes sense in context.

Similarly, the chapter notes that “noise words” are not a meaningful distinction. For example, if two objects are named accountData and account, it’s unclear to a reader what the difference is. Or, in an earlier Java code sample in this chapter, there were both an ArrayList and a List. If you’re not already very familiar with the API, can you tell what the difference might be? I had to look it up.

Use Pronouncable Names

If you can’t pronounce it, you can’t discuss it […] This matters, because programming is a social activity. (p. 22)

Well said!

Use Searchable Names

Grepping for a NAMED_CONSTANT is easy, grepping for a digit is basically hopeless. A very good point, which I hadn’t given much thought to before.

This section also has the following rule of thumb, which is more widely applicable:

The length of a name should correspond to the size of its scope (p. 22)

That makes sense: If the scope is small, then I can see all the necessary context at a glance and the name doesn’t need to duplicate that information. Whereas if the scope is too large to be glanceable, I need the name of the variable to provide me with context.

Avoid Encodings

This section discusses Hungarian Notation or the common convention of prefixing e.g. member variables with m_. This, as the chapter says, is largely unnecessary in even halfway modern IDEs. Experienced developers learn to ignore it after a bit, while new developers joining a codebase have an extra hurdle to reading the code. Especially in an academic context, where the software is often not the main product but just a tool to enable a research project, new developers (e.g. PhD students or postdocs) might not have the time or motivation to understand such cryptic encodings. Maintainers therefore need to expend significant effort to enforce such conventions, or the code will become inconsistent and thus misleading over time.

Often, such encodings also make names hard to pronounce—see above!


Another kind of encoding—not mentioned in the book, but highly relevant to research software—is mathematical notation. Just because it’s legible when properly typeset, doesn’t mean it’s legible in code.

For example, sub- or superscript indices are extremely useful in typeset equations, but often don’t translate well into code. Or I recently encountered an example, where a physics paper used γ and Γ for distinct but related variables. In the typeset equations, this was a clear distinction; but in code, having variables named gamma and Gamma would be terribly confusing.

Avoid Mental Mapping

This chapter left me a bit puzzled. It’s not that I disagree with anything in here; it’s just that the examples don’t make it clear how this is any different from the section “Use Intention-Revealing Names” at the beginning of the chapter … 🤷

Class and Method Names

Classes and objects should be nouns or noun phrases; methods and functions should be verbs or verb phrases.

Good advice. I mostly do this intuitively, but it’s good to be aware of this.

Don’t Be Cute

If colleagues can’t understand your code without googling pop-culture references or asking you to explain some inside joke, that’s a problem.

I know it’s tempting to use that really clever joke … but don’t. If it’s otherwise safe for work, feel free to post it in the #random channel on your team’s Slack, where anyone who doesn’t get the reference can safely ignore it; if it’s risqué, post it on your AfterDark Mastodon account.

Pick One Word per Concept … and One Concept per Word (a.k.a. Don’t Pun)

These two sections are closely related to the “Make Meaningful Distinctions” guideline earlier.

For example, fetch, retrieve and get may be different, but they’re not meaningfully distinct to a reader, so pick one and use it for all equivalent methods across different classes.

But on the flip side: Don’t use the same word for conceptually different things. This can sometimes be subtle. For example, appending an object to a list is kinda like adding it, right? Well … that’s just English being imprecise. Consider these examples:

>>> [1,2] + [3,4]
[1, 2, 3, 4]
>>> [1,2] + 3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list

Instead of list.add(element), the Python standard library rightfully calls this function list.append(element).4

Use Solution/Problem Domain Names

Remember that the people who read your code will be programmers. (p. 27)

That’s not necessarily true for research software—a lot of our collaborators are experts in their specific field of research, but have no formal programming training. Thus, while the book recommends programming terms first, and domain terms as a fallback, for us the better approach is often going to be the other way around.

Add Meaningful (But Not Gratuitous) Context

The last two sections tell us that we want to be right in The Goldilocks Zone of Context—just enough to inform, not so much that it overwhelms.

The basic idea here makes a lot of sense. But unfortunately, I found the example code listings here worse then useless. To me, the original version of the code is better than the “improved” version in almost every way: The “improved” version of the code has 1.5 times as many lines of code. Instead of 1 function, it has 1 class containing 5 functions; all for a simple three-way if-elseif-else statement. Instead of executing linearly from top to bottom, execution now jumps up and down between 3 of the 5 functions.

Supposedly, that example code demonstrates that having the three variables number, verb and pluralModifier defined inside a GuessStatisticsMethods class instead of a printGuessStatistics function adds more meaningful context? I didn’t see much of a difference; and even if there is one, the example is terribly ill-suited to illustrating that difference, since it includes major unrelated code changes (completely restructuring the code flow) and even changes the functionality (instead of printing a message, the “improved” code returns a string).

Oh, and the class and function names? For a chapter that’s all about meaningful naming, I found them surprisingly obtuse. Here’s a Python version of the initial code example:

def printGuessStatistics(candidate, count):
    if count == 0:
        print(f"There are no {candidate}s.")
    else if count == 1:
        print(f"There is 1 {candidate}.")
    else:
        print(f"There are {count} {candidate}s.")

(Note: The one significant detail that got lost in this translation is that candidate was a char in the Java version.)

The name printGuessStatistics doesn’t make much sense here without additional context which this code sample doesn’t offer. But let’s assume that these are results from a survey asking people to guess the right answer from some options A, B, C and D. That would explain the Guess part, but Statistics is still a bit of a misnomer—printGuessCounts would be more accurate. Also, why are three functions in the “improved” version (which basically correspond to the print function calls in my Python version) called thereAreNoLetters, thereIsOneLetter and thereAreManyLetters? This is violating the guideline of “One Word Per Concept”—Letters refers to the same thing as candidate—and the guideline that method names should be verb phrases.

With all these complaints about that example out of the way, what would be a better example for adding meaningful context? Off the top of my head: Maybe we could have gone back to the getFlaggedCells example at the start of this chapter and discussed how having a CellStatus enum adds useful context, compared to having a bunch of independent constants like FLAGGED, EMPTY and NONE.

But it’s certainly worth thinking about—what are other, better examples?

  1. Of course, some of that will be due to my perspective: The code examples use Java (which I have almost no experience writing) and often draw from a business context. In contrast, my primary language is Python and I work mainly on scientific software; so some cultural differences are unavoidable.
  2. I will write examples in Python in this post. Partly so I don’t just copy the author’s code; and partly, because it’s a nice exercise to translate examples into a different language and verify whether the guidelines still apply, independent of syntax details.
  3. Yes, that SCO. “The Most Hated Company in Tech”, which became bankrupt just as this book came out.
  4. Ironically, one of the first Java code examples in this chapter uses list1.add(x);. I think it’s a missed opportunity that the author explicitly brings up the add/append example, but doesn’t acknowledge this.

“Clean Code” Book Club: Foreword, Introduction and Chapter 1

Posted on Wed 10 April 2024

I just started a book club in our RSE group. The first book we’re going to read is “Clean Code” by Robert C. Martin and today we discussed the Foreword, Introduction and first chapter. (A sample PDF which includes these sections is at the publisher’s website.)

This post collects most of my notes and some points from our group discussion. (Completeness is explicitly a non-goal; otherwise I would not have been able to finish this post in a reasonable time.)

Foreword

This jumped out at me:

Back in my days working in the Bell Labs Software Production Research organization we had some back-of-the-envelope findings that suggested that consistent indentation style was one of the most statistically significant indicators of low bug density. (p. xxii)

With language-enforced indentation, more advanced IDEs and automated code formatters (black, ruff, clang-format, …) running in CI or as pre-commit hooks, this particular correlation may no longer be true. But does the underlying mechanism still hold true? Does attention to detail still correlate with code quality? I’m inclined to believe it does.

If you care about attention to detail, this foreword contains plenty of quotes and proverbs that will encourage this. (Of course, if you don’t care, how likely are you to pick up this book in the first place?)

Introduction

This book will make you work, and work hard. […] You’ll be reading code—lots of code. And you will be challenged to think about what’s right about that code and what’s wrong with it. (p. xxvi)

I’ve read PEP 8 (the Python style guide) more than once. I largely agree or at least find it reasonable1—but as I read it, I rarely stop to think about it; let alone discuss it with others. Already I’m excited to go through this book in a book club, rather than on my own!

The book is divided into three parts. In addition to the description given the introduction, I’ll include my initial impression based on a handful of skimmed page.

  • Twelve surprisingly compact2 chapters that use short code examples to introduce “principles, patterns and practices”—from basics like naming variables and writing comments to advanced topics like system architecture and concurrency. (These should be well suited for a book club—I’m genuinely excited to discuss many of these!)
  • Three chapters discussing more complex examples in greater detail. This is where the hard work promised in the quote above will come in. (Going through these may require a very different format; more like a code review session than a book club?)
  • Finally, a systematic collection of code smells and heuristics encountered throughout the book. (This sounds like a valuable reference to consult over the coming years.)

But first, we’ll get to the first chapter, which doesn’t belong to either part.

Chapter 1: Clean Code

You are reading this book for two reasons. First, you are a programmer. Second, you want to be a better programmer. Good. We need better programmers. (p. 1)

Oh, hush! ☺️ You’re just trying to butter me up …

This chapter starts with a brief section “There Will Be Code”, talking about how the need for coding will continue to exist, despite (or because of) new higher-level abstractions. While events of the past year or so, with “AI” and LLMs becoming mainstream, were impossible to foresee when this book was written, this section has held up quite well.

Over the next sections of this chapter, Robert Martin talks about code rot, illustrating its potential consequences (decreasing developer productivity, potentially long and painful from-scratch rewrites). And while it’s tempting for developers to blame external factors (requirements and schedules imposed by management and marketing) for code rot, Martin argues that it’s not that easy: It is our job to defend code quality! (This section reminded me a lot of The Phoenix Project, which I’m currently reading.)

While the details vary for us as RSEs in academia—we have PIs instead of CEOs, our schedules are dictated by conference dates instead of marketing departments—the basic idea is the same. (Being in a central RSE group does give us a bit more independence from PIs, so we are better able to push back.)

Following this, Martin quotes several prominent developers about their definitions of “clean code”. Their responses (elegance, efficiency, readability, simplicity, care, beauty, …) are notably subjective—clearly, this is more of an art than a science. And in the following section, Martin explicitly acknowleges that:

Don’t make the mistake of thinking that we are somehow “right” in any absolute sense. (p. 13)

This book presents one school of thought—no absolute truths. Some things are a matter of personal taste. Others may vary with context—what’s true for Java may not be true for Python, and vice versa. (Actually, it might be interesting to compare the first part of this book with PEP 8?)

And speaking of PEP 8: While details may differ (we’ll see over the following weeks), the underlying thinking is clearly similar. Consider, for example, this quote from the book:

We are constantly reading old code as part of the effort to write new code. (p. 14)

Does that sound familiar?

Towards the end of the chapter, Martin quotes the Boy Scout Rule (“Leave the campground cleaner than you found it.”) and argues it is our professional duty to apply this to code.

I love this idea in the abstract, but—contrary to what the foreword tells you—the devil is in the details: What if clean-up work increases the size of our diffs and makes it a little harder to understand the main point of a PR? Or, even worse, what if an effort to “clean up” ends up introducing a subtle new bug? Something to ponder while we read future chapters …

Next up, chapter 2 (“Meaningful Names”).

  1. The 79 character maximum line length being a major exception 🫣
  2. 10 pages on objects and data structures? 16 pages on concurrency?