Code comments


I read a blog post by Jeff Atwood some time ago (warning: this is an old post: https://blog.codinghorror.com/coding-without-comments/) that got me thinking about code comments again. I forget how I came across the blog post but it made me realise that I never, ever comment.

I used to think that comments were an essential part of software development because that’s what I was taught at university and, if I’m honest, they’re still a good idea for junior developers. I have found, though, that in most codebases comments are merely an excuse for poorly-written code. In fact, I would almost go so far as to say that the definition of well-written code is code that needs no comments.

I’d like to clear one thing up before we really begin: I’m talking about comments, not documentation. XML or Javadoc comments for IntelliSense are fantastic, if done correctly, for documenting the public surface of your API. From now on when I use the word ‘comment’ I’m talking about double-slashed comments (// such as this) in the middle of a piece of code.

When do we see comments?

If you’ve got comments in your code then you’re probably in one of the following situations:

  1. You’re doing something that looks strange or unnecessary
     This often happens in legacy codebases and is a shortcut in the interests of time
  2. You’re implementing a really complicated algorithm or high-performance code that requires you to get “clever”
     This might be a legitimate use of comments to walk someone through a difficult-to-understand algorithm
  3. You’re writing code that is intended to be a learning tool, such as in a beginners’ programming course
     I’m not convinced that this is appropriate for production code
  4. Your code, simply put, is bad
     Easily the most common scenario

When you follow programming principles such as SOLID, encapsulation and CQS, and you keep an eye out for the usual code smells such as the length of methods and the number of their parameters, you almost never need comments to explain what’s going on. For example, below is a method that adheres to the principles I outlined above.

public void ProcessPaymentsForOrder(OrderId orderId)
{
	var order = orderRepository.Retrieve(orderId);
	var payments = paymentGenerator.GenerateForOrder(order);

	foreach(var payment in payments)
		paymentProcessor.Process(payment);

	orderRepository.Save(order);
}

How could this code be improved by the addition of comments? Would adding the line

	// Process payment
	paymentProcessor.Process(payment);

make it easier to read or clearer to understand than the original?

To my eye comments are a code smell, meaning that they’re not necessarily wrong by themselves, but they’re a warning sign that your code is not in good shape. In the same way that the number of dependencies of a class can be a warning that the class is violating the Single Responsibility Principle, the presence of code comments normally indicates that a method is too long and is probably doing more than one thing.
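As a rough illustration of that warning sign, here is a sketch of a method where the comments act as section headings, followed by the same logic with each section pulled out into a named method. The invoice types and every name in it are hypothetical, invented purely for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, made up for this illustration.
public record InvoiceLine(decimal UnitPrice, int Quantity);

public static class InvoiceCalculator
{
    // Before: the comments mark the separate jobs hiding inside one method.
    public static decimal TotalWithComments(IEnumerable<InvoiceLine> lines, decimal taxRate)
    {
        // Add up the lines
        decimal subtotal = 0m;
        foreach (var line in lines)
            subtotal += line.UnitPrice * line.Quantity;

        // Add tax
        return subtotal + subtotal * taxRate;
    }

    // After: each commented section is a method whose name replaces the comment.
    public static decimal Total(IEnumerable<InvoiceLine> lines, decimal taxRate) =>
        AddTax(Subtotal(lines), taxRate);

    private static decimal Subtotal(IEnumerable<InvoiceLine> lines) =>
        lines.Sum(line => line.UnitPrice * line.Quantity);

    private static decimal AddTax(decimal amount, decimal taxRate) =>
        amount + amount * taxRate;
}
```

Both versions return the same result. The difference is that in the second one the method names do the job the comments were doing, and they get renamed along with the code.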

Bad or incorrect documentation is actually worse than no documentation

After all, when it comes to describing how your system operates, what’s your source of truth? The code, or the comment above it? Take the following example:

// Get all customers who do not live in the UK
var customers = allCustomers.Where(x => x.country != "US");

We now have a disparity between the comment and the code: the comment says “Get all customers not in the UK” but the code says “Get all customers not in the US”. One of them contains a typo, but which one? No user has reported a bug, so we’d probably assume that the code is correct. If the code is still quite new, though, it could be a genuine bug that nobody has reported yet. This duality of conflicting information reminds me of an old proverb:

Segal’s Law:

A man with a watch knows what time it is. A man with two watches is never sure.

The takeaway from this is that the code is reality and the comment is someone’s opinion, therefore you should trust the code. In which case, what’s the point of the comment? It’s actually quite similar to user interface design where, perhaps contrary to intuition, adding more information might add to the confusion for the user / reader. The greatest user interface that can be designed is the one that needs no user manual.

There’s an example of user interface design that might help me make this point (as is sort-of done here: http://blog.jgc.org/2010/06/elevator-button-problem.html) and that’s the simple issue of elevator buttons. Most people will be familiar with a lift call system that presents the user with two buttons: up and down (also: there are no instructions written anywhere for the user to read). If presented with these two buttons on their own the vast majority of users will think to themselves “I’m currently on the fifth floor and I want to get to the tenth, therefore I need to go up” and they will press the ‘up’ button, which is what the system designers intended.

If, however, we choose to present some more information about the system to the user, such as a display indicating which floor the lift is currently on, suddenly the waters are muddied. Some users will continue to reason that “I want to go up” but some will see the current location of the lift and take that into their decision of which button to press: “I am on the fifth floor and the lift is currently on the eighth, therefore I want the lift to come down to me”. Although we are not technically designing a user interface when we write code, many of the same concepts apply. The idea I want to introduce is that code which would have been perfectly well understood by the reader might now be misunderstood because of the addition of a comment (whether by the original author or a later passing programmer).

This last point might be purely subjective, but I personally am able to grok well-written code faster and more accurately than I can English prose. E.g.

customer.Orders.Sum(order => order.Total)

vs

// get total value of all orders for customer

Code comments are a barrier to refactoring

As my programming experience grows I find myself more and more concerned about the refactorability (if you’ll allow me to invent a word) of the code I work with. It’s also one of the reasons I now test almost exclusively outside-in. That’s a topic for another day, but suffice it to say that I need to be left free to chop, change and reshape the code constantly as I work on it to keep its readability and maintainability high. Comments hinder this process by their nature: they are plain English, which means they can’t be reworked by an IDE tool, so any reworking of comments has to be done manually. This can very quickly become untenable as we extract methods, inline variables, define interfaces, convert method arguments to class dependencies and introduce classes from method parameters; comments have no place in this quickly changing environment.

Most comments can be obviated by refactoring. Therefore you should continue to refactor, rewrite and rename until eventually there are no comments left.
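One way this can play out, sketched with the customer filter from earlier: rather than explaining the filter in a comment, the intent moves into a method name and a parameter. The Customer record and the LivingOutside extension method are assumptions made up for this illustration, not code from a real system.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, made up for this illustration.
public record Customer(string Name, string Country);

public static class CustomerFilters
{
    // The method name and its parameter say what the comment used to say,
    // so there is no second description to drift out of step with the code.
    public static IEnumerable<Customer> LivingOutside(
        this IEnumerable<Customer> customers, string country) =>
        customers.Where(customer => customer.Country != country);
}
```

The call site then reads as allCustomers.LivingOutside("UK"). The country code could still be the wrong one, of course, but now there is only one place for it to be wrong.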

In summary

I’m not saying that comments are always bad, and I am not saying that comments never provide value. If I’m given a piece of code to decipher and it has not been particularly well written then I’d much rather it was commented than not. But well-written code that doesn’t need comments is better. If it would help to rank different types of code for their readability then it would go something like this (with the best result at the top):

  1. Uncommented, well-written code
  2. Commented, well-written code
  3. Commented, poorly-written code
  4. Uncommented, poorly-written code

I think we’d all be in agreement that to find oneself with “uncommented, poorly-written code” (#4) is a pretty dire situation. Some programmers might try to improve the prognosis by adding comments and therefore progressing to #3, but as we can see this isn’t the greatest stride forward that can be made. I am arguing that we should instead strive for #1 by refactoring and rewriting, never adding comments in the first place. We can also see from #2 in this list the curious notion that well-written code can actually be improved by removing the comments, because:

Comments that describe how the code achieves its goals are just a second version of the code. As a second version, they either agree with the code and add no value or they disagree with the code and provide negative value.

https://visualstudiomagazine.com/articles/2013/06/01/roc-rocks.aspx

What’s the real reason people write comments? Because writing good code is hard.
