7effects: Delightful Documentation

please visit 7effects.com

Everybody hates documentation.

Reading it is a waste of time, because it is unhelpful and out of date. Writing it is hard, boring and a waste of time, because you know people won't read it, and it will soon be out of date anyway.

Notice the vicious circle there: all the documentation that you see is bad, so that is the kind you write yourself, not knowing better.

Well, I actually like doing documentation. But then, my documentation is completely different from the usual - more interesting to write and far more helpful to read. I am going to tell you how.

The worst documentation I ever saw is still carved into my memory. It described a COBOL program. For those who, luckily, do not know COBOL, it looks like this:

00029 MOVE ZEROES TO RECORD-AREA-1.
00030 ADD A TO B GIVING C.

(Notice the capitals? This was long ago: computers were so slow and stupid that you had to SHOUT. Lower case had not been invented yet.)

Here is what the documentation said:

Line 00029 fills RECORD-AREA-1 with zeroes.
Line 00030 adds A to B and puts the result into C.

It was like that for page after page. It was not a short program; COBOL never is. Is there anything more pointless than documentation like that?

It is an extreme example of what not to do. But it is what people do automatically. They think the job is to describe the code. Completely wrong!

Why?

The first thing is that you are translating a precise computer language into a vague human one. Natural languages are shorter and more expressive than programming languages. They also have multiple interpretations, and misunderstandings are common. When you document as above, you throw away precise facts and give secondhand rumour instead. Besides, you are only repeating what the code already says!

Secondly, this sort of documentation is guaranteed to be out of date. Think about it. If the documentation describes the code, then the description has to be updated whenever the code changes. For every little change.

Now, you, being a professional, always update your documentation, of course. Others, fixing a crash late at night perhaps, might possibly forget. Especially if the documentation is outside the program, in a separate folder somewhere on the network. What are the chances, do you think?

So the two fundamental problems with how people generally write documentation are:

it is unhelpful

it is out of date

Not to mention being hard work.

How to do better? Is it possible to write documentation that is helpful and always up-to-date? Maybe even easy and interesting?

1. Forget external documents.

Comments embedded in the code are the only place for documentation.

But what about Program Specifications, Design Documents, System Architecture, etc.?

Forget them.

Do not misunderstand me: those are all important. Rather, they were important during the design process. (Planning the architecture is vital to prevent painting yourself into a corner later.) But after coding starts, they are history.

Those design documents are like scaffolding - essential while constructing a building. But no sensible person keeps the scaffolding after the building is up. And you certainly do not gold plate the scaffolding with masses of red tape and heavy, detailed standards.

That is why you should not update the design documents when coding finds gaps or errors. Those specifications are obsolescent, and updating them is a waste. (Of course, a big gap that affects many parts of the system may require the architect to issue an update. That is a different matter.)

2. Do not describe what the code does, nor how it works.

"Surely that's the whole point of documentation?", you may be thinking.

Wrong!

If you want to know what and how, the correct place to look is the code, not a secondhand imprecise description in a natural language. The code is definitive; the comment may be misleading.

As with all rules, this suggestion is a guide. It can be broken when necessary (rarely).

If there is a piece of code that is particularly tricky, or uses the language in unfamiliar ways, then you had better explain that in a comment. You will probably need it yourself after you have forgotten the trick in a few months.
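As a rough sketch of that exception, here is a small Python function built on a bit trick. The function name is my own illustration, not taken from any real system. The comment earns its place because the trick is not obvious from the code alone.

```python
def is_power_of_two(n: int) -> bool:
    # Trick: a power of two has exactly one bit set in binary.
    # Subtracting 1 clears that bit and sets every bit below it,
    # so n & (n - 1) is zero only when n had a single bit set.
    # The n > 0 test excludes zero and negative numbers, which
    # would otherwise slip through (0 & -1 is also zero).
    return n > 0 and (n & (n - 1)) == 0
```

Without the comment, most readers would have to work the trick out by hand on a few sample values.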

But be careful here. Many times I have found a neat one-liner that does the work of ten lines of code. But then I have abandoned it, when I realised that the explanation would be longer than the original ten lines.

Having said what not to do, let us look at what kind of documentation is useful. What do we want from good documentation?

3. Give context and overview.

I start all my main programs with a comment that summarises the whole system.

"This is the XYZ System which enables authorised business partners to order widgets from us. Such orders are passed to the PQR Workflow System and also to the MLK Billing Database."

For lower level modules, I start by describing the situation when we are called.

"An external customer has queried an order in the XYZ System and an administrator needs to change its status. The admin has already been authorised by Module EFG, and has navigated through screens U, V and W."

The important point is that such comments tell you things that are not in the code. The very opposite of what bad documentation does. We add information instead of repeating it. Or accidentally misrepresenting it.

Also notice that the comments are at such a high level that they will not go out of date. The underlying code - implementation details - may change a lot without affecting the overview. Eventually there may be such a major change in functionality that you have to update the overview, but that should be rare. And it will probably only take a few sentences, even then.
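Here is a minimal sketch of how such a situation comment might sit at the top of a lower level module, written in Python. The Order class, the change_status function and the status values are hypothetical names invented for illustration; only the XYZ System and Module EFG come from the example above.

```python
"""
Order status change (part of the XYZ System).

Situation when we are called: an external customer has queried an order
and an administrator needs to change its status. The admin has already
been authorised by Module EFG, so no permission checks are repeated here.
"""

from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record; a real system would carry many more fields.
    order_id: int
    status: str


def change_status(order: Order, new_status: str) -> Order:
    # Return a new record rather than modifying the caller's copy.
    return Order(order.order_id, new_status)
```

The body of change_status could be rewritten completely without a single word of the opening comment going out of date.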

4. Give the purpose of the code.

Having set the scene, we continue with more information that is not in the code. We say why this program or module or subroutine exists. What its purpose is. What it aims to achieve.

Of course, the reader could work that out by studying the code in detail. The point is to save him the trouble of doing that, by summarising the purpose of each chunk of code.

This is also a debugging aid. Suppose a function says that it is going to return the average of some numbers. You can instantly see something is wrong if the code calls a square root function.

The purpose of a piece of code is the most important information that you can give to anyone who is trying to understand or modify or debug it.

Again, this is high level, summary information. You will seldom need to update it, even if implementation details change.
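A small sketch in Python of a purpose comment, assuming a hypothetical billing report that wants to know the value of a typical order. The function name and the rule for empty input are my own assumptions.

```python
from typing import Sequence


def average_order_value(order_totals: Sequence[float]) -> float:
    """Purpose: tell the billing report how much a typical order is worth.

    Returns the arithmetic mean of the totals. An empty input gives 0.0,
    so that a day with no orders still produces a report line.
    """
    if not order_totals:
        return 0.0
    return sum(order_totals) / len(order_totals)
```

The comment says why the function exists and what it promises. If a later edit slipped a square root into the return line, the mismatch with the stated purpose would be visible at a glance.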

5. Talk about data interconnections.

The need for this does not always come up, but it is one of my little favourites.

Usually the main data being processed follows the structure of the code. (It jolly well should, if you have structured properly!) The data comes in at the top of a program/module/subroutine, gets processed, then passes out of the bottom.

Sometimes, though, there is minor data that flows "sideways" between modules that are not directly connected. It might be a flag that is set in module A, and used by distant modules F, M and Q.

Normally we think of a program as a flow of instructions: one line of code followed by another. (Allowing for subroutine calls and suchlike.)

Here we consider the flow of data instead. Think of it as commenting a data variable. When it is declared, you say "This holds information about xxxx. It is set in module A. Used in modules F, M and Q." In module F, the comment would say "This value was set up in module A during ... when the user did ... Here we will use the value to do ...".

Of course, this is implementation detail. So it may need updating if the code changes. At least we have made it easy by listing all the places which use that data.

More important, this information is not obvious from the code. You would have to look at all modules A, F, M and Q to understand how and why the data was being used.
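A rough sketch in Python of commenting a sideways flag. Everything concrete here is hypothetical: the express delivery flag, the surcharge amount and the function names merely stand in for the modules A and F mentioned above.

```python
from dataclasses import dataclass

# Hypothetical surcharge, for illustration only.
EXPRESS_SURCHARGE = 5.0


@dataclass
class Session:
    # Holds whether the customer asked for express delivery.
    # Set in:  module A (order entry), when the user ticks the express box.
    # Used in: modules F (pricing), M (warehouse picking), Q (invoicing).
    express_delivery: bool = False


def module_a_enter_order(session: Session, express_box_ticked: bool) -> None:
    # This is the only place where the flag is written.
    session.express_delivery = express_box_ticked


def module_f_price(session: Session, base_price: float) -> float:
    # express_delivery was set up in module A during order entry.
    # Here we use it to decide whether to add the express surcharge.
    if session.express_delivery:
        return base_price + EXPRESS_SURCHARGE
    return base_price
```

With the declaration comment in place, someone changing the flag knows which other modules to check before touching it.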

That, fundamentally, is the whole point of good documentation - making things easy for whoever needs to understand or change our code in the future.

Personally, I rather like documenting this way.

It shares many of the pleasures of good coding. When I write programs, I enjoy finding elegant code that is fast, easy to understand, free of bugs and is easily updated for the most likely changes. Similarly, when I write documentation comments, I aim for short, easy to understand wording that is helpful, stays up to date and avoids misunderstandings (namely, documentation bugs).

Summary

1. comments, not external documents

2. do not describe the code

3. context, overview

4. purpose

5. data interconnections

Comments should add information that is not obvious in the code.

Do you have any tips to add to the list?


Wednesday, March 5, 2008
