"Clean Code: A Handbook of Agile Software" book review and summary

This book should be a mandatory read for every programmer. At the very least, some book on producing clean code should be mandatory, since 80% or more of the work in software is what we call "maintenance". More specifically, the ratio of time spent reading code to time spent writing it is over 10:1.

As programmers, we should have the mindset of a repairman or auto mechanic more than that of a producer of good software. And if we are "mechanics", the software equivalent of inspecting machines and replacing worn parts is to refactor relentlessly. Unfortunately, this rarely happens in practice.

Writing clean code is hard work. Knowing the principles is not enough. It is a lot like producing a work of art: most people can tell whether a piece is good or not, but very few can produce a good one. Additionally, there are many definitions of "clean code".

Before delving into the details, it's worth keeping in mind this introductory warning from the book:
"Martial artists do not all agree about the best martial art, or the best technique within a martial art. Often master martial artists will form their own schools of thought. [...] None of these different schools is absolutely right. Yet within a particular school we act as though the teachings and techniques are right."
Indeed, the same goes for this book. It is opinionated, but the author hopes that, at the very least, the thought put into the principles and heuristics will be acknowledged.

Here is a list of pieces of advice, rules and heuristics that I collected from the book and that I try to keep in mind. These are the most important to me:

Use consistent naming throughout the codebase. There is no single right name, per se. There are good names, and there are also names that suit a particular project or organization better for a number of reasons. Once a naming standard is chosen, it should be applied everywhere.

A function should do only one thing. How do we know whether it is doing one thing? If the function performs only those steps that are one level below the stated name of the function, then it is doing one thing. In other words, the concept described by the name of the function is decomposed into steps one level of abstraction below.
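As a rough illustration of the "one level below the name" test, here is a minimal sketch. The Invoice class and its method names are invented for this example and are not taken from the book.

```java
// Hypothetical example: all names are invented for illustration.
class Invoice {
    private final double[] lineAmounts;
    private final double taxRate;

    Invoice(double[] lineAmounts, double taxRate) {
        this.lineAmounts = lineAmounts.clone();
        this.taxRate = taxRate;
    }

    // "total" reads as a few steps, each one level of abstraction below its name.
    double total() {
        double subtotal = subtotal();
        double tax = taxOn(subtotal);
        return subtotal + tax;
    }

    // Lower-level detail: how the line amounts are summed.
    private double subtotal() {
        double sum = 0;
        for (double amount : lineAmounts) {
            sum += amount;
        }
        return sum;
    }

    // Lower-level detail: how tax is derived from an amount.
    private double taxOn(double amount) {
        return amount * taxRate;
    }
}
```

If total() also contained the summing loop, it would mix two levels of abstraction, which is usually a sign that it does more than one thing.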

The ideal number of arguments for a function is zero, and a function should almost never have more than three. Two arguments sometimes make sense, for example when they are components of a single value, as in Point p = new Point(0, 0);. Two arguments aren’t necessarily evil, but they do come at a cost.
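One common way to cut down an argument list is to group values that belong together into their own type. The sketch below assumes two invented classes, Point2D and Circle2D; it is only meant to show the idea, not a definitive design.

```java
// Hypothetical types, invented for illustration.
class Point2D {
    final double x;
    final double y;

    Point2D(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class Circle2D {
    final Point2D center;
    final double radius;

    // Two arguments instead of three: x and y travel together as one Point2D.
    Circle2D(Point2D center, double radius) {
        this.center = center;
        this.radius = radius;
    }

    double area() {
        return Math.PI * radius * radius;
    }
}
```

A call such as new Circle2D(new Point2D(0, 0), 2.0) makes it harder to mix up the coordinates with the radius than a three-argument constructor would.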

The switch statement should rarely be used. A valid use is within the abstract factory pattern.
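To make this concrete, the sketch below buries a single switch inside a factory, so the rest of the code relies on polymorphism. All names (ShapeKind, Shape, ShapeFactory and so on) are assumptions made up for this example.

```java
// Hypothetical names throughout; a sketch, not the book's listing.
enum ShapeKind { CIRCLE, SQUARE }

interface Shape {
    double area();
}

class CircleShape implements Shape {
    private final double radius;
    CircleShape(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class SquareShape implements Shape {
    private final double side;
    SquareShape(double side) { this.side = side; }
    public double area() { return side * side; }
}

// Abstract factory: callers depend on this interface only.
interface ShapeFactory {
    Shape create(ShapeKind kind, double size);
}

class DefaultShapeFactory implements ShapeFactory {
    // The one switch in the system; everything downstream uses Shape polymorphically.
    public Shape create(ShapeKind kind, double size) {
        switch (kind) {
            case CIRCLE:
                return new CircleShape(size);
            case SQUARE:
                return new SquareShape(size);
            default:
                throw new IllegalArgumentException("Unknown shape kind: " + kind);
        }
    }
}
```

Adding a new shape then means touching the enum and this one factory, instead of hunting for every switch on the shape kind.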

Ideally, the body of each “if” or “switch” branch should be a single line, usually a function call.

Flag (boolean) arguments are ugly and they mean that the function does more than one thing.
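A small sketch of the usual remedy: replace the flag with one intention-revealing method per behavior. The Greeting class is invented for illustration.

```java
// Hypothetical class, invented for illustration.
class Greeting {
    private final String name;

    Greeting(String name) {
        this.name = name;
    }

    // Instead of render(boolean formal), each behavior gets its own method.
    String renderFormal() {
        return "Dear " + name + ",";
    }

    String renderCasual() {
        return "Hi " + name + "!";
    }
}
```

At the call site, greeting.renderFormal() says what happens, whereas greeting.render(true) forces the reader to look up what the flag means.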

Don’t use output arguments. Have the function change its owning object instead.
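As a sketch of what this means in practice, compare a function that mutates a buffer passed in as an argument with a method that changes the state of its own object. The Report class below is a made-up example.

```java
// Hypothetical class, invented for illustration.
class Report {
    private final StringBuilder body = new StringBuilder();

    void appendLine(String line) {
        body.append(line).append('\n');
    }

    // Rather than a static appendFooter(StringBuilder out) that mutates its argument,
    // the report changes its own state.
    void appendFooter() {
        body.append("-- end of report --\n");
    }

    String text() {
        return body.toString();
    }
}
```

With report.appendFooter() there is no doubt about what is being modified; with appendFooter(buffer) the reader has to check the signature to find out whether the argument is an input or an output.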

Side effects are lies: the function name claims to do one thing but the function does a number of other things as well. Even if you change the function name to admit it, a function shouldn’t do more than one thing.

Exceptions are better than returning error codes. They clutter the code less.

Comments should be avoided. They often lie. The reason is simple: programmers can’t realistically maintain them. Inaccurate comments are far worse than no comments at all.

One should write good Javadoc comments, especially when writing a public API. However, Javadocs can be just as dishonest as any other kind of comment.

If one function calls another, they should be vertically close, and the caller should be above the callee, if at all possible. This gives the program a natural flow and greatly enhances the readability of the whole module.

Objects and data structures are complementary. Objects hide the data behind abstractions and expose functions. Data structures expose the data and have no meaningful functions. Sometimes we want objects, but other times we want simple data structures with procedures operating on them. We must understand this without prejudice and choose the approach best for the job at hand. 

However, hybrid structures with meaningful functions and public variables (or public accessors and mutators) have the worst of both worlds.
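The two styles can be sketched side by side. The class names below (RectangleData, Geometry, FuelTank) are invented for this illustration.

```java
// Data structure: exposes its data and has no meaningful behavior.
class RectangleData {
    public double width;
    public double height;
}

// Procedure operating on the data structure from the outside.
class Geometry {
    static double area(RectangleData r) {
        return r.width * r.height;
    }
}

// Object: hides its data behind an abstraction and exposes behavior.
class FuelTank {
    private final double capacityLitres;
    private final double litres;

    FuelTank(double capacityLitres, double litres) {
        this.capacityLitres = capacityLitres;
        this.litres = litres;
    }

    // Callers learn the fill level without knowing how it is stored.
    double percentFull() {
        return 100.0 * litres / capacityLitres;
    }
}
```

The procedural style makes it easy to add new procedures without touching the data structures; the object style makes it easy to add new kinds of objects without touching existing functions. A hybrid would be something like FuelTank with public fields added on top.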


Rules for error handling:
- Exceptions
    o Prefer exceptions to error codes. With exceptions one can untangle error handling from the algorithm.
    o It’s better to use unchecked exceptions.
    o Create informative error messages with the exceptions.
- Don’t pass or return null
    o Return a special case object or throw an exception instead. If calling a null-returning method from a third-party library, wrap that method.
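The rules above can be sketched together as follows. LegacyDirectory stands in for a third-party API that may return null; it and every other name here are assumptions made up for the example.

```java
// Hypothetical stand-in for a third-party API that returns null on a miss.
class LegacyDirectory {
    String lookupEmail(String user) {
        return "ada".equals(user) ? "ada@example.com" : null;
    }
}

// Unchecked exception carrying an informative message.
class UnknownUserException extends RuntimeException {
    UnknownUserException(String user) {
        super("No email address on record for user '" + user + "'");
    }
}

// Wrapper: callers of Directory never see null.
class Directory {
    private final LegacyDirectory legacy;

    Directory(LegacyDirectory legacy) {
        this.legacy = legacy;
    }

    String emailOf(String user) {
        String email = legacy.lookupEmail(user);
        if (email == null) {
            throw new UnknownUserException(user);
        }
        return email;
    }
}
```

Code that calls emailOf no longer needs null checks scattered through the algorithm; the failure path is handled in whichever catch block cares about UnknownUserException.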

With third-party libraries we can encapsulate the classes we need. With this approach, should the library change, we only need to refactor one place in our codebase. We also restrict the number of operations that can be performed on an object from the third-party library.
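Here is a minimal sketch of that kind of encapsulation. To keep it self-contained, java.util.Map plays the role of the third-party class; Sensor and Sensors are invented names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain type.
class Sensor {
    final String id;

    Sensor(String id) {
        this.id = id;
    }
}

// Wraps the library collection; the rest of the code never sees Map directly.
class Sensors {
    private final Map<String, Sensor> byId = new HashMap<>();

    void register(Sensor sensor) {
        byId.put(sensor.id, sensor);
    }

    // Only the operations the application needs are exposed: no clear(), no remove().
    Sensor getById(String id) {
        Sensor sensor = byId.get(id);
        if (sensor == null) {
            throw new IllegalArgumentException("No sensor registered with id '" + id + "'");
        }
        return sensor;
    }
}
```

If the underlying collection were ever swapped for something else, only Sensors would need to change.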

Test code is just as important as production code. It must be kept just as clean. The author assures us that clean test code doesn’t degrade in the face of a changing codebase.

Clean test code is readable test code.

Testing code doesn’t need to be as performant as production code but it does have to be as clean.

Have only a single concept per test.

Classes should be small. A class should have a single responsibility. The single responsibility principle states that a class should have one, and only one, reason to change.

Cohesion in classes: a class should have a small number of variables and methods that manipulate most of those variables.

Separation of concerns: separate creation from use. This can be done using dependency injection, for example.
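A minimal sketch of constructor-based dependency injection, without any framework. TimeSource, SystemTimeSource and SessionTimer are names made up for this example.

```java
// Hypothetical abstraction over "what time is it".
interface TimeSource {
    long nowMillis();
}

class SystemTimeSource implements TimeSource {
    public long nowMillis() {
        return System.currentTimeMillis();
    }
}

class SessionTimer {
    private final TimeSource timeSource;
    private final long startedAt;

    // The collaborator is handed in; SessionTimer never calls "new SystemTimeSource()".
    SessionTimer(TimeSource timeSource) {
        this.timeSource = timeSource;
        this.startedAt = timeSource.nowMillis();
    }

    long elapsedMillis() {
        return timeSource.nowMillis() - startedAt;
    }
}
```

Production code would pass in a SystemTimeSource, while a test can pass in a fake whose time it controls. That is also why this kind of separation tends to make code easier to test.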

Handle cross-cutting concerns with AOP, which is minimally invasive; Spring AOP is one example. With Spring, because few lines of code are needed, the application is almost decoupled from Spring.

Whether designing systems or modules, create the simplest thing that can possibly work.

Tight coupling makes it difficult to write tests. Following a simple rule that says we need to have tests and run them continuously impacts our system’s adherence to the goals of low coupling and high cohesion.

Case Studies

Chapters 14 to 16 are case studies on refactoring code.
In chapter 16, the SerialDate case study, we are presented with an interesting problem. SerialDate is an abstract class with a public static method called “createInstance” which is, as the name suggests, a factory method.

To create the object, the abstract class refers to a concrete implementation class, which is definitely bad practice.

Should the abstract class provide a public static method to create instances of some subclasses? Probably not.

Should we remove it? Only as an exercise in code cleaning, since it's part of a public API.


The book finds a compromise and offers an overengineered solution where the class instantiation is deferred to a combination of the singleton, decorator and abstract factory patterns, thus isolating the knowledge about implementation classes.

Apart from being verbose, there’s something wrong about an abstract class returning an instance of one of its implementations, whether it has direct knowledge of the class or not.

Let’s say we remove the method. We then have to deal with a number of public methods that use it: getPreviousDayOfWeek, getNearestDayOfWeek, and getFollowingDayOfWeek. The good news is that, although these methods were originally static, they were refactored into instance methods as part of this code cleaning exercise. That means that creating a new object, as these methods are supposed to do, is not a problem whenever they are called.

One way to do this is to turn the aforementioned methods into template methods that call a factory method declared as abstract. Abstract static methods are not allowed in Java, though, and factory methods are conventionally static. Static methods cannot be overridden either. So, how to solve this conundrum?

If you realize that, from a logical standpoint, the semantics of abstract and static do not exclude the possibility of a method being both, and that only the language itself prevents it, you can shrug your shoulders, say “not my fault”, and unapologetically leave the factory method as an instance method.
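To show roughly what I mean, here is a sketch of a template method that defers object creation to an abstract instance-level factory method. DayDate and OrdinalDayDate are simplified, hypothetical stand-ins and not the real SerialDate API; the day numbering (day 0 is a Monday) is also an assumption made only for this example.

```java
// Simplified, hypothetical stand-in for SerialDate; not the real API.
abstract class DayDate {
    // Days since an arbitrary epoch; day 0 is treated as a Monday (assumption).
    abstract int ordinal();

    // Instance-level factory method: each subclass decides which concrete class to build.
    protected abstract DayDate newInstance(int ordinal);

    // 0 = Monday ... 6 = Sunday.
    int dayOfWeek() {
        return Math.floorMod(ordinal(), 7);
    }

    // Template method: the algorithm lives here, object creation is left to the subclass.
    DayDate getFollowingDayOfWeek(int targetDayOfWeek) {
        int offset = Math.floorMod(targetDayOfWeek - dayOfWeek(), 7);
        if (offset == 0) {
            offset = 7; // the result must fall strictly after this date
        }
        return newInstance(ordinal() + offset);
    }
}

class OrdinalDayDate extends DayDate {
    private final int ordinal;

    OrdinalDayDate(int ordinal) {
        this.ordinal = ordinal;
    }

    int ordinal() {
        return ordinal;
    }

    protected DayDate newInstance(int ordinal) {
        return new OrdinalDayDate(ordinal);
    }
}
```

The abstract class never names a concrete class, and because the method is called on an existing instance, there is always an object around to ask for a new one.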

Applying the Fundamental Theorem of Software Engineering (adding a layer of indirection), it is also possible to find a different solution, but the one I can think of is too convoluted to be considered.

Conclusion

It seems obvious that programmers who don't write clean code are willingly making development harder for themselves and their company, about as smart a decision as walking on one leg for some spurious, crazy reason.

Part of writing clean code is hard work, but a big part of it comes almost for free: spend a few days reading up on a programming language's best practices, be organized, be careful with naming, follow guidelines. It is just plain dumb how carelessly code is written in many organizations.

That's why a good book on clean code provides a vast ROI for any programmer. And this book is a good one. I found the rules and heuristics presented to make sense and be insightful.

Unfortunately, there are some grammar and spelling errors that cheapen the quality of the writing ("it's" as a possessive, more than once; "complimentary" instead of "complementary"; etc.).
The Java code presented doesn't follow some best practices, such as preferring "List" over arrays. For a more detailed comparison, refer to Joshua Bloch's excellent "Effective Java™, Second Edition".

All in all, a fundamental book for any programmer.
