A unit test by any other name…

Everybody is, at least to some extent, familiar with the following quote:

What’s in a name? that which we call a rose
By any other name would smell as sweet;
So Romeo would, were he not Romeo call’d,
Retain that dear perfection which he owes
Without that title….

— Romeo and Juliet, Act II, Scene II

It is with these words that Juliet attempts to persuade Romeo that they are their own persons and that the fact that they are, by name, members of feuding families should not keep them apart. Ever since, this phrase has been used to mean that two concepts are (or can be) the same even if they are named differently.

I was put in mind of this quote today by a discussion I was in at work. The discussion was one that comes up every now and then in my work environment: the value, need and effort of unit testing. And it involved the usual suspects: one thinking that unit testing is overrated and a lot of work for little return (especially during initial development); one seeing some value but thinking that unit testing is only useful for larger or more complex pieces of code; one thinking that unit testing is very important but not caring whether people work test-first or test-last; and me, a test-first purist.

At this point I should probably explain a little about my working environment. I am currently staffed through my employer at a company where IT is an important supporting activity but not a core activity. That is, I am at a large company that has its own IT department but makes its money doing something else. That same company is also a very good employer and has, over the years, been very successful at maintaining employee loyalty; as a result it has a lot of employees who have been in the IT department for years (decades even) and have worked on a multitude of projects, languages and platforms. It has not, however, built up any deep expertise on some of the more intricate parts of software engineering, so that company has contracted consulting services from my employer. My employer is very good at many aspects of software engineering, especially project management. My employer is very bad at generating employee loyalty however (as most of the authors on Gridshore can testify to), so my employer employs a lot of young people with fresh ideas (and sometimes lacking in necessary experience).

As a result of the above, my working environment is a mix of somewhat older developers who have worked a certain way for a long time and show varying degrees of willingness to switch gears, and younger developers who have not been working for a long time at all and therefore have no problem at all adopting new ideas (since they have none of their own that need displacing). Sometimes this situation leads to contention between proponents and opponents of a new idea. And at the same time it leads to the odd situation that the same new idea has varying degrees of acceptance in both camps (for and against). Test-driven development, and especially working test-first, is such an area. As sketched above, the discussion at my project usually leads to four different points of view once you factor in pro/con and (for want of a better word) Puritanism:

	Puritan	Liberal
Pro	Test-driven, test first is a must	Unit testing is a must, but working test-last is okay
Con	Unit Testing ist Evil und verboten! Alles kaputt!	Unit testing is a pain that takes time and involves a lot of work; however, sometimes it adds some value

In my work I may rightfully be considered a Pro Puritan. I work test-first unless it is thoroughly impossible to test at all and I frown upon people who do work test-last.

My status as a test-driven puritan is somewhat surprising considering my background. By rights I am not a software engineer at all but a computing scientist. And not just a computing scientist, but one from the Eindhoven University of Technology, home of one of the fathers of computing science, Edsger Dijkstra. In the computing science department of my alma mater, “testing” is a very nasty word. Testing is done by icky people who don’t use rigorous methods to develop software, who don’t know the ins-and-outs of using predicate calculus as an a-priori proving technique that derives a correct program from its specification. Testing, in short, is what you do if you are a very pathetic loser indeed. It might therefore be considered odd that I would develop into a puritan after having that drilled into me.

And yet, after a few years of careful consideration, it has occurred to me that it is this very sense of formal rigor that my education instilled in me that drives me towards the more puritan view of testing, of testing first and allowing the test to drive the development. I shall attempt to explain further.

Let me start by saying that I still, after all these years, feel that my old department is correct. If you need provably correct software, formal rigor is the way to go. A-priori proof is your method and testing is a poor substitute. But business software development isn’t like that. Business software development rarely has the sense of urgency that comes with the imperative that something must be flawless — good enough will often do and is usually more economical. Not to mention that formal rigor is rarely achievable in today’s IT business, which is full of people who think that rigor is something that happens to dead people.

In the IT business (consulting business really), the process of producing correct programs is reduced from a formal development methodology backed by mathematical certainties to coming up with something that sort of works (i.e. compiles and at least manipulates the entities involved in the problem domain in some way) and then improving upon that initial solution by testing and tweaking and testing and bugfixing and testing and so on until one reaches a final result that is perhaps not perfect, but good enough to fool the customer on the live environment. Now, while it is not exactly the same thing, the above process shows parallels to a mathematical technique for solution finding called local search.

Local search, for those of you not familiar with it, is a form of optimization technique. Optimization problems in mathematics are problems in which you are given a well-defined set S of elements from which you must retrieve “the best one”. In order to do this you are also given a function f that maps each element of S to a real number (f: S → ℝ). The problem is now (more formally) to find the element of S such that the value assigned to that element by f is maximal (i.e. find s^* ∈ S such that ∀ s ∈ S: f(s) ≤ f(s^*)). Optimization problems are, in general, mathematically hard to solve optimally, so many techniques exist that offer varying degrees of certainty of finding the absolutely optimal solution while still guaranteeing that some solution is found.
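As a small worked instance of this definition (the set and the function below are chosen purely for illustration and do not come from any particular problem), take eleven integers and a function that peaks at 3:

```latex
S = \{0, 1, \dots, 10\}, \qquad f(s) = -(s - 3)^2, \qquad
s^* = \arg\max_{s \in S} f(s) = 3, \qquad
f(s^*) = 0 \ge f(s) \ \text{for all } s \in S .
```

Here S is small enough to check every element by hand; the interesting cases are the ones where it is not.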

Local search is one of the many optimization techniques available. It works by picking a random element s from the set S and then stepping through the neighbors of s within S (the assumption being that you have a concept of being able to reach other elements of S from s in one or more steps). It keeps doing this until it finds an element s′ that is a better solution than all of its neighbors. This method offers no guarantee whatsoever of ever actually finding a solution (you can, theoretically, cycle forever through a local group of equally rotten solutions) and no guarantee of finding anything better than a locally optimal solution (i.e. you can find a solution that is better than all its neighbors, only to find that you are standing on a molehill at the bottom of a very deep ravine). However, on average local search does pretty well, which is why it is used often (both as a way of handling relatively simple problems and as a starting point for better solutions to more complex problems). Astute readers will already see the parallels with test-driven development.
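The procedure described above can be sketched in a few lines of Python. This is only one simple variant (greedy hill climbing), and it makes two assumptions that are mine rather than part of the description: it moves only to a strictly better neighbor, and it gives up after a fixed number of steps, so it cannot cycle forever. The names local_search, LANDSCAPE, height and adjacent are hypothetical and exist only for this illustration.

```python
from typing import Callable, Iterable, TypeVar

E = TypeVar("E")  # type of the elements of the search space S


def local_search(
    start: E,
    neighbors: Callable[[E], Iterable[E]],
    f: Callable[[E], float],
    max_steps: int = 1_000,
) -> E:
    """Climb from `start` to an element that no neighbor strictly beats."""
    current = start
    for _ in range(max_steps):
        # Keep only the neighbors that score strictly higher than `current`.
        better = [n for n in neighbors(current) if f(n) > f(current)]
        if not better:
            return current  # local optimum: nothing nearby is better
        current = max(better, key=f)  # step to the best of the better neighbors
    return current  # step budget exhausted (assumed safeguard)


# Hypothetical search space: positions 0..14 on a landscape with two peaks.
LANDSCAPE = [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 4, 5, 6, 5, 4]


def height(i: int) -> float:
    """The function f: how good the solution at position i is."""
    return LANDSCAPE[i]


def adjacent(i: int) -> list[int]:
    """The neighbors of position i: one step left or right, within bounds."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(LANDSCAPE)]
```

Starting at position 1, local_search(1, adjacent, height) stops at position 3, the molehill; starting at position 8 it reaches position 12, the highest point. Which of the two you end up on depends entirely on where you start, which is why the starting element is usually picked at random (and often more than once).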

In TDD, our problem is to find/develop a program that meets specification S closely enough to be acceptable for use on the live environment. What is close enough is measured by our unit test, which consists of a number of test cases. For a unit test for specification S, let T be the number of test cases. Let p be some program that we hope will meet our specification and let t(p) be the number of test cases that pass if we run our unit test against p. We are now looking for any program p such that t(p)/T = 1.
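To make the measure concrete, here is a minimal sketch of t(p) and t(p)/T, assuming a deliberately trivial specification ("add two integers") and a unit test of three test cases. The specification, the test cases and the candidate programs are all invented for this illustration; a test case that raises an exception is simply counted as failing.

```python
from typing import Callable, Sequence

Program = Callable[[int, int], int]    # a candidate program p
TestCase = Callable[[Program], bool]   # one test case: True if p passes it

# Hypothetical unit test for the specification "add two integers" (T = 3).
TEST_CASES: Sequence[TestCase] = [
    lambda p: p(1, 2) == 3,
    lambda p: p(0, 0) == 0,
    lambda p: p(-1, 1) == 0,
]


def t(program: Program, test_cases: Sequence[TestCase] = TEST_CASES) -> int:
    """Number of test cases that pass when run against `program`."""
    passed = 0
    for case in test_cases:
        try:
            if case(program):
                passed += 1
        except Exception:
            pass  # a crashing program fails that test case
    return passed


def pass_ratio(program: Program, test_cases: Sequence[TestCase] = TEST_CASES) -> float:
    """The quantity t(p)/T; we are looking for a program where this equals 1."""
    return t(program, test_cases) / len(test_cases)


# Three hypothetical candidate programs, from poor to acceptable.
def multiply(a: int, b: int) -> int:
    return a * b


def always_zero(a: int, b: int) -> int:
    return 0


def add(a: int, b: int) -> int:
    return a + b
```

Against this unit test, multiply passes one test case, always_zero passes two and add passes all three, so only add reaches t(p)/T = 1. Note that always_zero scores higher than multiply without being any closer to a sensible program, which hints at how much the quality of the unit test matters.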

Of course, once we have found such a program p, we will have no guarantee that it is the best program we can choose. It may be clumsily written, it may be horribly complex, full of code smells or just damn ugly. Such is the nature of search and experimentation. However, we do know that it is one of the optimal programs possible as measured by our unit test: our local search of implementing and tweaking has led us to an optimum (one of the many possible).

There are some remarks to be made about this. The first is that, in my view, this semi-formalization explains why TDD should work test-first. We are searching through the solution space for a program that is optimal against our unit test (this being our estimate of meeting the specification S). It is not given to us to change the specification and therefore we should not change the way conformance to S is measured. We can only influence which program we are offering up to be measured by the unit test. Working test-last reverses the search; you run the risk of doing a local search for a unit test that passes on your program rather than searching for a program that passes on your unit test. In test-first development, you already run the risk of your unit test not being an accurate measure of conformance to the specification; in test-last development, this problem is gravely compounded.

The second remark is that TDD as seen in this light does not diminish the creative element of your work as a software developer. In any optimization process it is true that the quality of the initial choice of element and the strategy of neighbor selection greatly influence the outcome of the entire optimization. No less so in TDD. Making all the tests pass leads you to a sufficient solution. It is your insight as a developer that leads you to the elegant solution and increases the probability that your solution will be truly optimal.

Third, it is clear that TDD forces you to divide your problem domain into smaller problem chunks for which you must seek a solution, as a method of bridling complexity. Optimization problems give fewer and fewer guarantees as the solution space becomes harder to navigate. It is bad enough that you cannot necessarily find the optimum; if you cannot even decide which neighbor is better, you are completely lost. In terms of unit testing this means that the developer is forced to come up with segmentations of his problem and composable part-solutions in order to have any chance of TDD working at all.

A unit test by any other name…