"Code Complete, 2nd edition" Book Review and Summary

This is a book about "software construction", which comprises activities such as detailed design, coding, and debugging, and to a lesser extent the other activities in software development. Consequently, the activities only superficially covered (or not covered at all) by this book are management, requirements development, software architecture, UI design, system testing, and maintenance.

Software construction is an important topic because the gap between the best software engineering practice and the average practice is very wide – perhaps wider than in any other engineering discipline.

All aspects of code construction are described in excruciating detail in this text. A good part of it is common sense or common knowledge (for good programmers, at least), but the level of detail, experience, and thought put into it is so high that the book is best used as a reference whenever we are looking for advice on a specific situation. It will deliver a wealth of hard data, instructions, and ideas.

Otherwise, as a more casual read, it is advisable to keep your eyes open for the not-so-obvious bits of advice that hide in the text.

Most of the advice in the coding-related chapters is conveyed more concisely, and with examples, in the book "Clean Code". Code Complete is, however, more theoretical and less opinionated than "Clean Code".

The first edition was published in 1993, while this second edition is from 2004. The author claims that 95% of the first edition holds up, which seems quite remarkable for a fast-evolving discipline. Presumably, as the rate of change slows down with time (barring any paradigm shift), even more than 95% of this edition will have held up by 2013. You can do the math for whenever you are reading this.

The author claims that construction has been neglected because it's thought of as a relatively mechanical process. Not only is that not the case, according to him, but construction is also the only activity that is guaranteed to get done.

The source code is, in a certain sense, the only artifact that is always up to date; therefore it should be of high quality.

The precept "measure twice, cut once" is highly relevant to the construction part of software development, which can account for up to two thirds of the total project cost. The worst software projects end up doing construction two or three times. Doing such an expensive part twice is, unsurprisingly, a very bad idea.

Prerequisites

A focus on prerequisites can reduce costs regardless of whether you use an iterative or a sequential approach.

Why are prerequisites important? If a good problem definition hasn't been specified, you might be solving the wrong problem. If good requirements work hasn't been done, you might have missed important details of the problem. If good architectural design hasn't been done, you might be solving the right problem the wrong way.

Design

Design is the activity that links requirements to coding.

Managing complexity is the most important technical topic in software development.

Twofold approach to managing complexity:
  • Minimize the amount of essential complexity that one has to deal with at any time.
  • Keep accidental complexity from proliferating.
Essential complexity is the complexity intrinsic to the problem itself and cannot be removed; accidental complexity is introduced by the solution and can be kept in check.

Desirable Characteristics of a Design

  • Minimal complexity.
  • Ease of maintenance. 
  • Minimal connectedness.
  • Extensibility.
  • Reusability.
  • High fan-in. Fan-in is the number of classes that use a given class. A high fan-in suggests good use of utility classes at the lower levels of the system.
  • Low-medium fan-out. Fan-out is the number of classes that a given class uses.
  • Leanness. No extra parts.
  • Stratification. Keep the levels of decomposition stratified so that the system can be viewed at any single level and still give a consistent view.

Design Patterns

Commonly used design patterns reduce complexity, reduce errors, provide heuristic value and streamline communication.

Design practices

These are heuristics too, but they describe the steps you can take to get good results: Iterate, Divide and Conquer, Top-Down and Bottom-Up, Experimental Prototyping and Collaborative Design.

Classes and routines

On the topics of classes and routines, most of the material is either common sense, common knowledge, or can be found over and over in other books. But the discussion about interface semantics is a little more interesting, in my opinion.

Each interface has a programmatic part and a semantic part. The semantic part is a set of assumptions that cannot be enforced by the compiler, and therefore it should be documented. Asserts and other techniques can be used either to convert semantic assumptions into programmatic checks or to make the semantics more obvious. Some such techniques are described in the book "Clean Code"; see, for example, Chapter 17, G31, "Hidden Temporal Couplings".
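
As a minimal sketch of this idea (my own example, not the book's; the class and method names are illustrative), a routine whose interface semantically assumes a sorted input can turn that assumption into a programmatic check with an assert:

```java
import java.util.List;

public class Search {
    // Semantic part of the interface: 'values' must be sorted ascending.
    // The compiler cannot enforce this, so we document it and assert it.
    public static int binarySearch(List<Integer> values, int target) {
        assert isSorted(values) : "binarySearch requires a sorted list";
        int low = 0, high = values.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;  // overflow-safe midpoint
            int v = values.get(mid);
            if (v < target) low = mid + 1;
            else if (v > target) high = mid - 1;
            else return mid;
        }
        return -1; // not found
    }

    private static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) return false;
        }
        return true;
    }
}
```

Run with assertions enabled (java -ea) and the semantic assumption becomes a checked, visible part of the interface during development.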

On the topic of inheritance the same is true: it's mostly common knowledge, but the chapter includes a good definition of the Liskov Substitution Principle and what it really means, namely that there are no semantic differences in subclass implementations. For example, the method "InterestRate()" on "CheckingAccount" or "SavingsAccount" returns the interest the bank pays, whereas on an "AutoLoanAccount" subclass that same method returns the interest the consumer has to pay.
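
In code, that violation looks like this (a sketch of my own, following the book's account-class example; the concrete rates are made up):

```java
abstract class Account {
    // Semantic contract: the rate the bank pays the customer.
    abstract double interestRate();
}

class CheckingAccount extends Account {
    double interestRate() { return 0.01; } // bank pays the customer
}

class SavingsAccount extends Account {
    double interestRate() { return 0.03; } // bank pays the customer
}

// LSP violation: same signature, opposite meaning.
class AutoLoanAccount extends Account {
    double interestRate() { return 0.07; } // the customer pays the bank!
}
```

Any caller that iterates over Accounts and sums up the interest the bank owes will be silently wrong for AutoLoanAccount; the cure is to rename or restructure so that subclasses don't change the semantics.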

Regarding inheritance, the takeaway, as any developer knows, is that it tends to increase complexity and we should maintain a bias against it.

Defensive Programming

A good program never produces poor-quality output, regardless of the quality of the input. Instead, it outputs nothing, produces an error message, or complains about the input.

Bad inputs can be handled in a lot of different ways. Options for handling an error include:
  • Return a value that is good enough, or makes sense in some way
  • Log an error message
  • Return an error code or throw an exception
  • Display an error message
  • Call a central error-processing routine
  • Shut down
  • Handle the error locally
Which method to use is influenced by the balance between correctness and robustness that we desire in our program.
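
As a rough sketch (mine, not the book's), here are two of those options side by side: substituting the closest legal value for robustness versus refusing to continue for correctness:

```java
public class Temperature {
    private static final double MIN_CELSIUS = -273.15;

    // Robustness: substitute the closest legal value and keep going
    // (fine for, say, a consumer weather display).
    public static double clampCelsius(double reading) {
        return Math.max(reading, MIN_CELSIUS);
    }

    // Correctness: refuse to proceed with impossible data
    // (appropriate for, say, safety-critical code).
    public static double requireValidCelsius(double reading) {
        if (reading < MIN_CELSIUS) {
            throw new IllegalArgumentException(
                "Temperature below absolute zero: " + reading);
        }
        return reading;
    }
}
```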

The Pseudocode Programming Process 

Or PPP, for short. This chapter is an introduction to creating routines by first writing pseudocode, a process with which any developer is familiar.

The key to well-written pseudocode is writing English-like statements at the level of intent and avoiding the syntax of any specific programming language.
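
A quick sketch of what that looks like in practice (my example, not the book's): the pseudocode is written first as comments, then fleshed out into code, and the comments are kept as documentation of intent:

```java
public class Sensor {
    // Pseudocode first, at the level of intent:
    //   keep a running total and a count of valid readings
    //   skip any negative reading
    //   add every other reading to the total and bump the count
    //   return the average, or zero if no reading was valid
    public static double averageValidReadings(double[] readings) {
        double total = 0;                        // running total...
        int count = 0;                           // ...and count of valid readings
        for (double r : readings) {
            if (r < 0) {
                continue;                        // skip any negative reading
            }
            total += r;                          // add it to the total
            count++;                             // bump the count
        }
        return count == 0 ? 0.0 : total / count; // average, or zero if none valid
    }
}
```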

Alternatives to the PPP are test-driven development and just "hacking". The author opines that the PPP is the superior alternative. I concur.

Variables and Statements

Parts III (variables) and IV (statements) of the book are not very interesting for good programmers or those who have read "Clean Code". They are mostly a compendium of common-sense ideas and well-known practices that will be of most benefit to beginners or to those who use this book as a reference.

In chapter 11, Hungarian notation is introduced. This is an interesting topic, and the notation still has some valid uses in modern software development. Unfortunately, the book doesn't go on to discuss it in more detail.

Another interesting (although minor) tidbit that I hadn't considered before is this: in the case of an "if" without an "else", the book advises including the "else" clause anyway, with a short comment explaining why nothing needs to be done in that case.
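
In code, the advice looks like this (my own sketch):

```java
public static void logIfVerbose(boolean verbose, String message) {
    if (verbose) {
        System.out.println(message);
    } else {
        // Nothing to do: in quiet mode the message is intentionally dropped.
    }
}
```

The empty "else" with its comment tells the next reader that the omission was a decision, not an oversight.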

Lastly, this excerpt about recursion is interesting: "For a small group of problems, recursion can produce simple, elegant solutions. For a slightly larger group of problems, it can produce simple, elegant, hard-to-understand solutions. For most problems, it produces massively complicated solutions—in those cases, simple iteration is usually more understandable. Use recursion selectively."
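
A trivial illustration of the last point (mine, not the book's): the same computation written both ways; outside of genuinely recursive problems, the iterative form is usually the easier one to follow.

```java
// Recursive: compact, but a large n risks a StackOverflowError.
static long sumTo(int n) {
    return n == 0 ? 0 : n + sumTo(n - 1);
}

// Iterative: same result, bounded stack usage, easy to step through.
static long sumToIteratively(int n) {
    long total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}
```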

Software Quality

Software has external and internal quality characteristics.

External: Correctness, Usability, Efficiency, Reliability, Integrity, Adaptability, Accuracy, Robustness.

Internal: Maintainability, Flexibility, Portability, Reusability, Readability, Testability, Understandability.

Improving any one of these characteristics can degrade others; hence the importance of finding an optimal balance and choosing a set of characteristics to emphasize for the system being built.

A software-quality program is composed of: software-quality objectives, explicit quality-assurance activity, testing strategy, software-engineering guidelines, informal technical reviews, formal technical reviews, external audits.

There are many defect-removal processes. It's important to bear in mind that no single one of them achieves more than about 75% defect removal, and most achieve roughly half that. Using a wide variety of techniques is the only way to achieve high defect-removal rates. Most organizations, however, use a test-heavy approach, which is not effective.

The General Principle of Software Quality is that improving quality reduces costs. The best way to improve both productivity and quality is to reduce the time spent reworking code. Rework can arise from changes in design, changes in requirements, or debugging. Debugging and its associated refactoring, for example, take about 50% of the time in a traditional, dated development cycle.

Collaborative Construction

A human reviewer can spot different kinds of errors than testing can, such as unclear error messages, hard-coded values, and repeated code.

A formal inspection is very effective at detecting defects while being relatively economical. It also has one of the highest defect-detection rates (behind beta tests and prototyping).

Developer Testing

Testing is the most popular quality-improvement activity. To a degree this is unfortunate, given the value of the collaborative techniques we have just seen.

Most developer testing is white-box testing, that is, testing code whose inner workings the developer is aware of. White-box testing is advantageous because it allows the developer to test the class more thoroughly.

Is it better to write tests before or after the code? Some reasons to write test cases first are:
  •  It doesn't take more effort than writing them after
  •  Defects are detected earlier
  •  Requirement defects are exposed sooner
The author thinks testing first is a good general approach.

It's impossible to achieve exhaustive testing. We should aim to choose the tests most likely to find errors. Some methods for doing that are the following:
  • Structured basis testing: count the number of possible paths through a routine and create the minimum set of tests that exercises all of them. The number of paths is computed by starting at 1 and adding 1 for each loop or selection statement, plus 1 for each case in a case statement. A worked sketch follows this list.
  • Data-flow testing: based on the idea that there are at least as many errors in data usage as in control flow. Variables can be in different states, and we'd like to test all the combinations where a variable is first "defined" and then "used". A reasonable strategy is to start with structured basis testing and then add the cases needed to complete the data-flow testing.
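
Here is the worked sketch of structured basis testing (my own example): one "if" plus one loop gives 1 + 1 + 1 = 3 paths, so three test cases cover the basis set.

```java
public class Stats {
    // Path count: start at 1, +1 for the 'if', +1 for the 'for' loop = 3.
    // Minimal structured-basis test set:
    //   1. values == null           -> the 'if' is taken
    //   2. values == new double[0]  -> the loop body is never entered
    //   3. values == {1.0, 2.0}     -> the loop body is entered
    public static double sum(double[] values) {
        if (values == null) return 0.0;
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }
}
```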

It is also useful to test for: boundary conditions, too little data, too much data, the wrong kind of data, the wrong size of data, and uninitialized data.

Refactoring

Code evolves substantially during development. Because they are not straightforward processes, coding, debugging, and unit testing consume one third to two thirds of the work on a project. Additionally, modern approaches are more code-centered and more willing to change code frequently.

Code evolution is an inevitable process, therefore, it is to our advantage to plan for it.

Evolution should improve the internal quality of the program.

Note that these ideas are similar to the philosophy outlined in the book "Clean Code".

Code tuning strategies

Performance can be addressed at the strategic and tactical levels.

Users are interested in performance, but high throughput, on-time delivery, zero downtime, and a well-designed interface are more important to them.

Performance is only loosely related to code speed. Sacrificing other characteristics for code speed is dumb.

If efficiency is a priority, there are multiple levels at which to pursue it: program requirements, program design, class design, operating-system interactions, hardware, compilation, and code tuning. Note that code tuning, which is what many people think of when they hear the word "performance", is only one of the possibilities.

In general, don't optimize until the program is complete and correct and it has been determined that performance needs to improve.

Common sources of slowness: I/O operations, paging, system calls, interpreted languages, and errors.
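
A minimal sketch of "measure first" (my example, not the book's): time the suspected hot spot before touching it, and tune only if the numbers confirm it matters.

```java
public class Timing {
    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = doWork();                // suspected hot spot
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("doWork: " + result + " in " + elapsedMs + " ms");
    }

    // Stand-in for whatever routine is under suspicion; replace with the
    // real candidate, and optimize only after measurement confirms it dominates.
    static long doWork() {
        long total = 0;
        for (int i = 0; i < 10_000_000; i++) total += i;
        return total;
    }
}
```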

System Considerations

The size of a project changes the relative amount of work in its different activities. As size grows, construction activities scale linearly, but other activities, such as architecture, system testing, documentation, planning, and requirements, scale up faster.

The amount of code and the number of people involved aren't the only variables in a project's activity makeup. More subtle are the quality and complexity of the final product: a program intended for commercial distribution needs to be extensively tested and documented.

How to foster good code in a company: assign two people to every part of the project, require code sign-offs, circulate good code examples, treat code as a public asset, reward good code, and don't necessarily impose strict standards.

Measurements

There are numerous measurements that can be made on a software project. Any project attribute can be measured in a way that is better than not measuring it at all. Arguing against measurement is arguing that it's better not to know what's happening on a project.

Software Craftsmanship

  • Conquer complexity
  • Pick a process
  • Write code for people
  • Program *into* the language
  • Focus attention by using conventions
  • Program in terms of the problem domain
  • Watch for warning signs
  • Iterate
  • Beware of religious issues
