I'm at the airport, coming home from one of my favorite conferences: Software Practice Advancement. It is a conference but it is also a yearly hangout for a group of people who have a terrifying amount of insight and experience. Many of them are from London, so you get to hear about all of the incredible things that happen in bank projects. That may sound mundane, but it isn't. I don't think there's a domain quite like it now. It mixes high end mathematics, a business interface, incredible time pressure and enormous processing constraints.
This year, I did a session with Steve Freeman and Meirion Morgan called Functional Finance. It was, essentially, and introduction to OCaml using examples from Quantitative Analysis. I don't really know much about quantitative analysis, but I was able to do the intro to OCaml bit for the session. All in all, it was fun, but one of the most interesting things to me, was the last piece of preparation that we did.
Before the conference, I went to Steve's house and we sat down to work on the last example. It was a bear. It contained a triply nested loop on a single array, whose inner loop iterated backwards while performing an accumulating calculation on the current and previous element. We tried to imagine what combination of maps and folds we would need to make the code more functional.
I wasn't much help as I was jetlagged, but Steve attacked the example viciously. One of the things that he did was very interesting. He noticed that the inner loop contained special case logic for the beginning of the array, and that the special case logic could go away if the array was extended leftward.. if it, in effect, had a -1 index. I'd seen and done this sort of thing before, but it occurred to me that I'd never seen it written up. It's a very handy technique. Special cases can often be eliminated if you extend the edges of collections.
For a more concrete example, imagine a program that scores a bowling game. We all know that bowling games have ten frames, but the rules of bowling allow a variable number of throws per frame. If you throw a strike, you have only one throw for that frame but the score for that frame is 10 plus the sum of the next two throws. It seems like this is simple enough logic to code, but the tenth frame is special. A strike in the tenth frame, again, wants two more throws. Does this mean that we should use a data structure which can have three possible throws per frame? We could do that, but we could also have a dummy eleventh frame hanging off the end of the array which the tenth frame can dip into when calculating its score. Then, the logic for scoring is uniform. A frame's score is the sum of its throws, except in the case of spares and strikes which both use throws from the next frame to calculate their own score.
The pattern shows up quite a bit in games. The game Conway's Life is a good example. Cells are alive in the next generation if the sum of their neighbors is within a set of constraints. Cells on the edge of the playing field don't have as many neighbors as cells that are not on the edge, but you can fix that by extending the edge one cell in each direction. You can calculate for cells that are between (1,1) and (n-1,n-1), and allow their calculations to use values in the range (0,0) to (n,n). It's an easy way to code Conway's Life if you decide not to do something more ambitious with the edges like wrapping them around.
Edge Extension is a nice technique, but it does have trade-offs. For one thing, special case logic is a bit more explicit than Edge Extension. If you are working on something like the bowling game, something which models a real world process, you'll probably find yourself documenting the differences between your code and the normal understanding of the domain. But, on the other hand, special case logic is often nasty, complex, and error prone. If we can get rid of it, we're often better off.

Comments (3)
I think of this technique as very similar to the Null Object Pattern; it's another way to handle (certain) special cases without special code.
Posted by Carl Manaster | March 21, 2008 10:30 PM
Carl, yes I came close to mentioning that. It seems to share the same downside too -- slight obfuscation.
Posted by Michael Feathers | March 22, 2008 9:09 AM
Vicious, moi?
I learned the idea on my first programming course, writing linked lists in Pascal on an Atlas (I think it was powered by donkeys on a treadmill). Knowing better than my lecturer, I ignored his advice and tried to code them with special end cases rather than a barrier value. I used up a lot of line-printer paper until I got the idea.
On the other hand, the worst story about using barriers I heard was from Bill Janssen. I can't remember how far Bill was from the source of the story, but the gist of it was as follows. Bill, or a friend, or a mythical programmer, was working with a program back in the days when memory was a few K. The program had a loop and he couldn't see how it finished. Eventually, he gave up and went back to original author who said, "When it's been through enough cycles the program counter overflows and the computer resets. Easy."
Posted by Steve Freeman | March 24, 2008 4:41 PM