Thoughts concerning API-Documentation: August 2011

Wednesday, 17 August 2011

How to describe the relation between spec and code?

In nearly all of my projects I have had to answer two kinds of questions many times:

Which changed lines of code belong to a specific ticket in the issue tracker? Example: after deploying the version of our application that includes feature #42, the application often crashes because of a memory leak. We need to review all changed lines of code, e.g. for unexpected side effects.
Why has a specific piece of code been implemented the way it is? Example: the business expert believes that there is an error in the computation of discount rates for our customers. We need to read the corresponding code, understand the computation of discounts and explain it to the business expert. The expert then states that this is a bug we are responsible for. Now we have to prove that the implemented behavior was specified, by showing when and where (e.g. in a specification document).

In both examples we need a mapping between code and spec documents (e.g. issues in an issue tracker like Atlassian JIRA):

We know the issue and need the changed lines of code.
We know the changed lines of code and need the issue.

How could this relation be realized in the code? A simple approach is to mention the issue id in the commit comments of the version control system, e.g. "Fix issue #42: fix memory leak by clearing the cache regularly". With this approach we are able to identify all changed files for a given issue as well as the issues for a changed file.
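A minimal sketch of how such commit comments could be turned into the mapping in both directions. It assumes the commit log has already been exported as pairs of message and changed files, and that issue ids always appear as "#<number>". The function name build_issue_index and the file names are hypothetical and only serve as illustration.

```python
import re
from collections import defaultdict

# Assumed convention: issue ids appear in commit messages as "#<number>".
ISSUE_ID = re.compile(r"#(\d+)")


def build_issue_index(commits):
    """Map issue ids to changed files and changed files to issue ids.

    `commits` is an iterable of (message, changed_files) pairs, for
    example exported from the log of the version control system.
    """
    files_by_issue = defaultdict(set)
    issues_by_file = defaultdict(set)
    for message, changed_files in commits:
        for issue in ISSUE_ID.findall(message):
            for path in changed_files:
                files_by_issue[issue].add(path)
                issues_by_file[path].add(issue)
    return files_by_issue, issues_by_file


# Hypothetical commit log; the file names are made up for this example.
commits = [
    ("Fix issue #42: fix memory leak by clearing the cache regularly",
     ["Cache.java", "CacheCleaner.java"]),
    ("Implement issue #7: discount rates",
     ["Discount.java", "Cache.java"]),
]

files_by_issue, issues_by_file = build_issue_index(commits)
print(sorted(files_by_issue["42"]))          # ['Cache.java', 'CacheCleaner.java']
print(sorted(issues_by_file["Cache.java"]))  # ['42', '7']
```

The index only works at file level: it says that Cache.java was touched for issues #42 and #7, but not which lines belong to which issue.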

But when existing code is modified, this approach might be too coarse. We need the issue at a finer granularity, per line. It would be possible to extract this information from the file's history in the version control system, but this can be very time-consuming. This is why, in my current project, I started to mark changed lines with inline comments:

// Changes due to issue #42

…

doSomeStuff();

…

// End changes due to issue #42
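Marker comments of this form can also be evaluated mechanically. The following is a rough sketch, assuming the markers are always written exactly as "// Changes due to issue ..." and "// End changes due to issue ...", with one or more "#<number>" ids after them. The function name changed_lines_by_issue is hypothetical.

```python
import re
from collections import defaultdict

# Assumed marker wording, taken from the comments shown above.
START = re.compile(r"//\s*Changes due to issue\s+(.*)")
END = re.compile(r"//\s*End changes due to issue\s+(.*)")
ISSUE_ID = re.compile(r"#(\d+)")


def changed_lines_by_issue(source):
    """Return {issue id: [(start_line, end_line), ...]}.

    Line numbers are 1-based and refer to the marker comments themselves.
    """
    open_since = {}             # issue id -> line of its start marker
    ranges = defaultdict(list)
    for number, line in enumerate(source.splitlines(), start=1):
        end = END.search(line)
        start = START.search(line)
        if end:
            # One end marker may close several issues at once.
            for issue in ISSUE_ID.findall(end.group(1)):
                if issue in open_since:
                    ranges[issue].append((open_since.pop(issue), number))
        elif start:
            for issue in ISSUE_ID.findall(start.group(1)):
                open_since.setdefault(issue, number)
    return dict(ranges)


source = """\
// Changes due to issue #42
doSomeStuff();
// Changes due to issue #07
doSomeMoreStuff();
// End changes due to issue #42, #07
"""

result = changed_lines_by_issue(source)
print(result)  # {'42': [(1, 5)], '07': [(3, 5)]}
```

The sketch ignores start markers that are never closed, so it depends on the markers being maintained consistently by hand.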

These marker comments have the big advantage that the changed lines for an issue can be found with a simple full-text search. But there is also a big disadvantage: after several changes the readability of the code suffers because of the marker comments:

// Changes due to issue #42

…

doSomeStuff();

…

// Changes due to issue #07

…

doSomeMoreStuff();

…

// End changes due to issue #42, #07

I do not yet know a more sophisticated solution that avoids this disadvantage, and I would be pleased about comments from the readers of this blog :). Feel free to post your experiences and opinions.

Monday, 15 August 2011

Extension of my thoughts concerning the "multi verb in identifiers" dilemma

Today I discussed my preceding blog post, and the two strategies it introduced for solving the dilemma, with some colleagues. During this discussion I realized that there is an important precondition for both strategies. In my last post I discussed the identifier "generate and send invoice" and described two ways to refactor it:

Generalize (and merge) the verbs: "charge invoice"
Specialize the adjective: "send ready-to-print invoice"

Both strategies rely strongly on the semantics of the noun "invoice" (the accusative object if you read the identifier as an imperative sentence). This noun is a business term and it carries a business meaning: a message which lists the ordered articles or services and the amount of money the recipient has to pay for them.

I asked myself whether the strategies would work for a noun without such business semantics, e.g. "generate and send mail". The verb "to charge" implies providing an invoice, but it does not make sense for an arbitrary message. So the identifier "charge mail" would lead to serious trouble, because one could understand it to mean that the recipient has to pay for the mail (because the mail is charged). Applying the second strategy causes a similar problem: "send ready-to-print mail" does not tell the whole story, since the content of the mail is not characterized. A mail could also just be a simple covering letter.

This means that both described strategies have a strong precondition: an accusative object with a specific meaning in the operation's domain. This leads to the following refactoring algorithm for operation identifiers with two predicates:

1) Specialize the accusative object as precisely as possible.
2a) Either generalize (and merge) the two verbs,
2b) or specialize the accusative object with an adjective.
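To make the steps concrete, here is a small sketch of how the resulting identifiers could look in code. Everything in it is an assumption made for illustration: the Invoice type, its fields, and the helper operations generate and send are hypothetical. It also assumes that the adjective "ready-to-print" is meant to imply that generating the invoice is part of the operation.

```python
from dataclasses import dataclass


# Step 1: the object is a specific business term (Invoice), not a generic mail.
@dataclass
class Invoice:
    recipient: str
    amount: float
    ready_to_print: bool = False
    sent: bool = False


def generate(invoice: Invoice) -> Invoice:
    """Render the invoice so that it is ready to print."""
    invoice.ready_to_print = True
    return invoice


def send(invoice: Invoice) -> Invoice:
    """Deliver the invoice to its recipient (only recorded as a flag here)."""
    invoice.sent = True
    return invoice


# Starting point: two predicates in one identifier.
def generate_and_send_invoice(invoice: Invoice) -> Invoice:
    return send(generate(invoice))


# Step 2a: generalize (and merge) the two verbs into one business verb.
def charge_invoice(invoice: Invoice) -> Invoice:
    return send(generate(invoice))


# Step 2b: keep one verb and specialize the object with an adjective.
def send_ready_to_print_invoice(invoice: Invoice) -> Invoice:
    return send(generate(invoice))


charged = charge_invoice(Invoice("ACME", 100.0))
print(charged.ready_to_print, charged.sent)  # True True
```

All three operations behave the same in this sketch; only the identifier changes, which is the point of the refactoring.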

I would like to thank my colleagues Gerwin Abbing (Blue Carat AG) and Stefan Jockenhövel (Paragon Systemhaus GmbH) for the interesting and inspiring discussions on this topic.