Friday, September 22, 2006

Tips-First for Test-First

Of all the exciting ideas and revelations that came from Kent Beck's original XP book, Test-First Programming has been the one that most significantly affected the way I work on a day-to-day basis.

I love programming test-first. It's a great way to take a large, amorphous task and solve it piece by piece. It's also a nice morale boost -- "Hey, I know that my code does nine things. Let's go for ten..."

Here are a bunch of things I wish somebody had told me about test-first programming.

Unit Testing is not All Testing

So, after I started doing test-first, I walked around for about six months all, "my code is perfect because I wrote tests". My smugness came crashing down when testers found some bugs. My code was better because I wrote tests, but it turns out I had made some dicey assumptions about the inputs, so my tests passed while the code was still incorrect. Test-first does not give you a complete test suite. You still need to do acceptance testing, and you still need to do GUI testing where appropriate.

However, you can still automate a very large percentage of acceptance tests. The more you can automate the tests, the more they'll be run, and the happier you'll be.

Test-First is a structure for writing good code at least as much as it's a means for verifying code

Code that has been written in a test-first style tends to have certain qualities. Small methods, loosely coupled. Small objects, loosely coupled. Code that causes side effects (such as output) tends to be separated from code that doesn't. These are all side effects of what's easy to test -- it's easy to write small methods in a tight test-first loop. And dependencies between methods or objects make tests harder to write.

As it happens, those qualities -- tight cohesion and loose coupling -- are exactly what characterizes the best software architectures. My experience is that I wind up with much better code architectures from test-first than I do when I try to guess the design before I start. (Which is not to say that a little bit of pre-design can't be helpful, just that it can be overdone.)
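Here's a rough sketch of what that separation tends to look like. The class and method names are made up for illustration, and it's plain Java with no test framework so it runs on its own. The calculation and the formatting return values you can assert on, and the one method that actually prints is too small to hide a bug.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical example: names and numbers are illustrative only.
class ReceiptExample {
    // Pure logic: no output, so a test can call it and check the result.
    static int totalCents(List<Integer> pricesInCents) {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }

    // Pure formatting: still no output, just a String to assert on.
    static String formatTotal(int cents) {
        return String.format(Locale.ROOT, "Total: $%d.%02d", cents / 100, cents % 100);
    }

    // The only method with a side effect. It does nothing but glue and print.
    static void printReceipt(List<Integer> pricesInCents) {
        System.out.println(formatTotal(totalCents(pricesInCents)));
    }

    public static void main(String[] args) {
        printReceipt(Arrays.asList(250, 199, 1));
    }
}
```

Testing totalCents and formatTotal never touches the console, which is the whole point.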

Test-first is better suited for some things than others

That doesn't mean that you shouldn't try, of course. Test-first is vital in cases where you know the input and output, but not the process. It's also critical in cases where your program can be incorrect in subtle ways. It's somewhat less important for things that will visibly or loudly break. GUIs are a challenge because GUI layouts tend to change in ways that can break unit tests. GUI behaviors are more stable and easier to test. Again, though, you should try to have automated coverage of even those areas that weren't developed test-first.

Trust the process. Look ahead on tests, not implementation.

It works. The tight process is: Write a test. Run the test so that it fails. Make the simplest fix that will pass the test. Run the test so that it passes. Refactor. Keep to that tight loop. Resist the temptation to guess about what you'll need to pass the next test. What I usually do is put a list of the tests I'm going to write in comments in my test class -- that's my lookahead plan, and keeps me from forgetting something. But the design I do in my actual code comes during the refactor step, which is where I see duplication and abstraction.
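To make the lookahead idea concrete, here's a minimal sketch of a test class partway through the loop. I'm assuming nameLength() is just the sum of the three name lengths, and I'm using a tiny hand-rolled check method rather than a real test framework so the example stands alone. The comment list is the plan; the implementation only does what the written tests demand.

```java
// Hypothetical sketch: Person and its behavior are assumptions for illustration.
class NameLengthTestFirst {
    // Class under test. This is the simplest code that passes the two tests
    // written so far; nothing here anticipates the tests still on the list.
    static class Person {
        private final String first;
        private final String middle;
        private final String last;

        Person(String first, String middle, String last) {
            this.first = first;
            this.middle = middle;
            this.last = last;
        }

        int nameLength() {
            return first.length() + middle.length() + last.length();
        }
    }

    // Lookahead plan: tests I still intend to write, kept as comments.
    // TODO: empty middle name
    // TODO: null last name (throw, or count as zero?)
    // TODO: names with leading or trailing whitespace

    static void testThreeShortNames() {
        check(15, new Person("noel", "david", "rappin").nameLength());
    }

    static void testLongerLastName() {
        check(19, new Person("noel", "david", "flintstone").nameLength());
    }

    // Stand-in for a framework's assertEquals.
    static void check(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        testThreeShortNames();
        testLongerLastName();
        System.out.println("all tests passed");
    }
}
```

Each TODO becomes a failing test first, and only then a change to Person.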

The earlier you start the better off you are.

It gets increasingly hard to convert a code base to test-first the longer you wait. I've even had 20-line classes that needed significant refactoring to unit test (mostly because output was intertwined with functionality -- the code was better after the refactoring). Test-first is a good place to start, anyway -- pick something to test, and go.

Treat your tests like code and refactor them

Pretty much every test-first guide gives sort of a perfunctory nod to, "oh yeah, keep your test code clean." But I think this could stand a little more attention. For one thing, your unit tests are critically important to your ability to deliver quality code -- and they have no tests of their own. The cleaner your tests are, the better you'll be able to see issues with the tests themselves.

One thing that has worked nicely for me is extracting sets of assertions into custom assert methods. If you are continually making the same five assertions to check validity of your objects, throw them into an assert_instance method of some kind. Another common case is making the same assertion over a range of values -- move the for loop to your custom assertion and pass the range endpoints in.

There are two big advantages to doing this consistently. The first is that it's easier to see what's going on from one line of assert_person(expected_name, expected_addr) than from five lines of assert_equals. The second is that it ensures that you actually do make all the assertions every time. Hey, everybody slacks, and test-first is about setting up tests as quickly as possible. If you can trigger all umpteen assertions on your class with one method call, you're more likely to do the whole set every time, rather than just picking one or two at random each time.
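A minimal sketch of both kinds of custom assertion, in plain Java. The Person fields, the isAdult() rule, and the helper names are all hypothetical, and the assertEquals here is a hand-written stand-in for whatever your test framework provides.

```java
import java.util.Objects;

// Hypothetical example: the domain class and the age rule are made up.
class CustomAssertExample {
    static class Person {
        final String name;
        final String addr;
        final int age;

        Person(String name, String addr, int age) {
            this.name = name;
            this.addr = addr;
            this.age = age;
        }

        boolean isAdult() {
            return age >= 18;
        }
    }

    // One call replaces a block of repeated field-by-field assertions.
    static void assertPerson(String expectedName, String expectedAddr, Person actual) {
        assertEquals(expectedName, actual.name);
        assertEquals(expectedAddr, actual.addr);
    }

    // The same assertion over a range: the loop lives in the assertion,
    // and the test just passes the endpoints (both inclusive).
    static void assertAdultForAges(boolean expected, int fromAge, int toAge) {
        for (int age = fromAge; age <= toAge; age++) {
            assertEquals(expected, new Person("x", "y", age).isAdult());
        }
    }

    // Stand-in for a framework's assertEquals.
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        assertPerson("Noel", "1 Main St", new Person("Noel", "1 Main St", 30));
        assertAdultForAges(false, 0, 17);
        assertAdultForAges(true, 18, 120);
        System.out.println("all assertions passed");
    }
}
```

The test body ends up reading like a list of claims about the class, which is what you want to see when something fails.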

Don't reuse instance variables

This is another refactoring issue. It's tempting as you add new unit test cases to do something like this:
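Person p = new Person("noel", "david", "rappin");
assertEquals(15, p.nameLength());
p = new Person("noel", "david", "flintstone");
assertEquals(19, p.nameLength());
p = new Person("pebbles", "david", "flintstone");
assertEquals(22, p.nameLength());
p = new Person("betty", "david", "flintstone");
assertEquals(22, p.nameLength());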

The last test fails -- quick, what's it testing? Okay, now we have to trace the life of that instance variable all the way back up. It's hard to read, and prone to dangerous errors. You should never reuse an instance variable like this in a unit test -- every assertion should, where it's at all feasible, be completely distinct:

// Builds a fresh Person for every assertion, so no state carries over.
void assertNameLength(int expected, String first, String middle, String last) {
    Person p = new Person(first, middle, last);
    assertEquals(expected, p.nameLength());
}

void testNameLength() {
    assertNameLength(15, "noel", "david", "rappin");
    assertNameLength(19, "noel", "david", "flintstone");
    assertNameLength(22, "pebbles", "david", "flintstone");
    assertNameLength(22, "betty", "david", "flintstone");
}

Now when the last test fails, you can actually see what's going on.

Avoid tautologies

The scariest issue you can have with tests is a test that passes when it should fail, allowing you to continue blithely along, ignorant of a bug you should have already caught. There will come a day, for instance, when you will forget to put any assertions in a test. There are a couple of things you can do to make tautologies less likely.
  1. Follow the process. The process says each test has to fail before you add code. Adding tests that you already know will pass can easily lead to writing a test that will never fail.

  2. If you have constants for text or numerical values in your code, don't reuse those in the tests -- use the literal or create a separate constant in the test.

  3. Be careful with Mock Objects. Try not to test the things that you are explicitly inserting in the Mock when it's created.
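Point 2 is the easiest one to show in code. This is a small sketch with a made-up Greeter class: the first check borrows the production constant and so agrees with the code no matter what the constant says, while the second spells out the expected text and would catch a typo in it.

```java
// Hypothetical example: Greeter and its constant are illustrative only.
class TautologyExample {
    static class Greeter {
        static final String GREETING = "Hello, ";

        static String greet(String name) {
            return GREETING + name;
        }
    }

    // Tautological: builds the expected value from the same constant the
    // code uses, so it still passes if GREETING is changed to "Helo, ".
    static boolean weakTestPasses() {
        return (Greeter.GREETING + "Noel").equals(Greeter.greet("Noel"));
    }

    // Independent: the literal is written out in the test, so a change to
    // GREETING makes this check fail.
    static boolean strongTestPasses() {
        return "Hello, Noel".equals(Greeter.greet("Noel"));
    }

    public static void main(String[] args) {
        System.out.println("weak test passes: " + weakTestPasses());
        System.out.println("strong test passes: " + strongTestPasses());
    }
}
```

Both checks pass today. The difference only shows up on the day somebody edits the constant.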

Mock Objects Rule

Mock Objects are the missing link in helping you test all the things that are traditionally hard to unit test, like databases, GUIs, and web servers. Anywhere your code depends on an external system or person, the Mock can stand in for that third party and let you send and receive data in a testable way. Mock Object packages exist for a variety of languages, and using one will save you time and effort on your tests.
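If you haven't seen the idea before, here's a bare-bones, hand-written version, assuming a made-up MailSender interface in front of the external system. A real Mock Object package does this bookkeeping for you; this sketch only shows the shape of it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: every name here is invented for illustration.
class MockExample {
    // The external system, hidden behind an interface.
    interface MailSender {
        void send(String to, String body);
    }

    // Code under test. It only knows about the interface.
    static class WelcomeService {
        private final MailSender sender;

        WelcomeService(MailSender sender) {
            this.sender = sender;
        }

        void welcome(String email) {
            sender.send(email, "Welcome!");
        }
    }

    // Hand-written stand-in: sends nothing, just records what it was asked to do.
    static class RecordingMailSender implements MailSender {
        final List<String> sent = new ArrayList<>();

        @Override
        public void send(String to, String body) {
            sent.add(to + ": " + body);
        }
    }

    public static void main(String[] args) {
        RecordingMailSender mock = new RecordingMailSender();
        new WelcomeService(mock).welcome("noel@example.com");
        if (mock.sent.size() != 1 || !mock.sent.get(0).equals("noel@example.com: Welcome!")) {
            throw new AssertionError("unexpected calls: " + mock.sent);
        }
        System.out.println("mock recorded: " + mock.sent.get(0));
    }
}
```

Note that the assertion is about what WelcomeService did with the mock, not about values the test stuffed into the mock itself, which keeps it clear of the tautology trap above.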

Hope that helps -- go out and test something.