Where leading programmers explain how they find
unusual and carefully designed solutions

Recent Posts

Michael Feathers

Periodically, people try to distill the essence of object-orientation. They come up with rules of thumb that pack a powerful punch. One of these is “Tell, Don’t Ask.” The Pragmatic Programmers came up with that phrase and it’s a great piece of advice. When possible, prefer to tell an object to do something rather than asking it for some data. If you ask an object for some data, massage it, and then pass it back to the same object, you’ve found an operation which probably belongs on that other object.
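Here is a minimal C++ sketch of the difference. The Account class and both helper functions are made up for illustration and don't come from any real system.

```cpp
#include <cassert>

// Hypothetical class, used only to contrast the two styles.
class Account {
public:
    explicit Account(long cents) : balance_(cents) {}

    // "Ask" style: expose the data and let callers do the work.
    long balance() const { return balance_; }
    void setBalance(long cents) { balance_ = cents; }

    // "Tell" style: the caller says what it wants; the rule lives here.
    void applyFee(long feeCents) { balance_ -= feeCents; }

private:
    long balance_;
};

// Ask: fetch the data, massage it, and pass it back to the same object.
inline void chargeFeeByAsking(Account& account, long feeCents) {
    long current = account.balance();
    account.setBalance(current - feeCents);
}

// Tell: one message, no round trip.
inline void chargeFeeByTelling(Account& account, long feeCents) {
    account.applyFee(feeCents);
}
```

Both helpers leave the account in the same state. The difference is where the subtraction lives: in the caller, or in the object that owns the balance.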

Rules of thumb are great, and no one expects them to apply every place, but I’ve seen some interesting things when I’ve tried to push “tell, don’t ask” to the extreme. I end up with pipelined architectures. Data comes into the system and it’s processed by an object. The object sends a message to another object, and eventually you end up with some event or action that is produced by the last set of objects in the call chain. This sort of design works, but it’s definitely not the way you'd tackle most design problems unless you're working in shell.

I was reminded of this sort of architecture a while ago when I was looking at some code that Steve Freeman and Nat Pryce have been using as examples in their JMock2 tutorials. A common theme around their work is that correct use of mocks in test-driven development can lead to a rather different style of design. Rather than telling an object to do something and then asking whether it was done, you tell an object to do something and then see what happens to its collaborators. This leads to designs that are somewhat like what I described above. They are a bit like a workflow. In conversations about larger systems, Steve tells me it’s better. There’s less design impact under new requirements. And, it makes sense. Each object is responsible for doing one thing and notifying its successor. That’s a bit less than what you have when your objects are responsible for doing things as well as coordinating interaction with objects that give them results.
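A rough C++ sketch of that workflow style might look like the following. None of it is taken from the JMock2 tutorials: the LineListener interface and the three stages are invented names, and the Recorder at the end of the chain simply plays the part a mock would play in a test.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical interface: every stage accepts a line and returns nothing.
class LineListener {
public:
    virtual ~LineListener() = default;
    virtual void line(const std::string& text) = 0;
};

// Drops empty lines and tells its successor about the rest.
class BlankFilter : public LineListener {
public:
    explicit BlankFilter(LineListener& next) : next_(next) {}
    void line(const std::string& text) override {
        if (!text.empty()) next_.line(text);
    }
private:
    LineListener& next_;
};

// Upper-cases each line and passes it on.
class Shouter : public LineListener {
public:
    explicit Shouter(LineListener& next) : next_(next) {}
    void line(const std::string& text) override {
        std::string loud = text;
        for (char& c : loud)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        next_.line(loud);
    }
private:
    LineListener& next_;
};

// End of the chain: remembers what it was told so a test can inspect it.
class Recorder : public LineListener {
public:
    void line(const std::string& text) override { seen.push_back(text); }
    std::vector<std::string> seen;
};
```

To check the behavior you wire a BlankFilter to a Shouter to a Recorder, push lines in at the front, and look at what the Recorder saw. Nothing in the chain is ever asked a question.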

No, this “push” architecture is a bit different. It’s almost the opposite of functional programming. In the purest form of functional programming, you never tell; you ask. And when you have lazy evaluation, the system only does as much as it really needs to when it’s answering your question. But, despite the differences, there are similarities between these “push” and “pull” systems. Both of them seem to have a relatively stateless substrate that data flows along.

So, there’s this confluence between “tell, don’t ask” and a particular approach to TDD. What’s happening here? I think it comes back to something central. Return values push work back on callers. It's not enough to do some work and decide to send a message to a collaborator; you have to wait for something to come back to you and then, perhaps, do some more work. We're better off not returning values.

We can carry it a bit further also. If we aren't returning values, maybe we're better off sending asynchronous messages rather than synchronous ones? If we aren't waiting for return values, why not?

Curiously enough, this model is somewhat like the one used by Erlang. Erlang has developed quite a buzz recently. It presents a concurrency model that is almost entirely different from the norm. In Erlang, creating new processes is cheap. It might even be cheaper than creating objects in OO languages. The idea behind Erlang is that if you can make a large number of processes and guarantee that they never share state, you can develop more robust systems. Each process receives messages, does its work, and sends messages to other processes. The message sends are largely asynchronous. Sounds familiar, doesn’t it?
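To make the shape of that model concrete, here is a small single-threaded C++ sketch. It is not how Erlang works internally and it has no real concurrency. The Process class and runUntilIdle scheduler are assumptions made up for this example, and they only show the "receive, work, send on" pattern with mailboxes.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a process: a handler plus a private mailbox.
class Process {
public:
    using Handler = std::function<void(const std::string&)>;
    explicit Process(Handler handler) : handler_(std::move(handler)) {}

    // Asynchronous send: enqueue and return at once; nothing comes back.
    void send(const std::string& message) { mailbox_.push_back(message); }

    // Deliver at most one pending message. The bool is for the scheduler
    // only; senders never see it.
    bool step() {
        if (mailbox_.empty()) return false;
        std::string message = mailbox_.front();
        mailbox_.pop_front();
        handler_(message);
        return true;
    }

private:
    Handler handler_;
    std::deque<std::string> mailbox_;
};

// Round-robin scheduler: keep stepping until every mailbox is empty.
inline void runUntilIdle(const std::vector<Process*>& processes) {
    bool worked = true;
    while (worked) {
        worked = false;
        for (Process* p : processes)
            if (p->step()) worked = true;
    }
}
```

A sender calls send and moves on. Whatever the receiver does with the message happens later, when the scheduler gets around to it.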

Ralph Johnson has written that Erlang is really OO at its core. Processes are objects. They are little processing hubs that do work and send results to their neighbors. In a way, this is much closer to the metaphor of objects than we’re used to. Alan Kay has mentioned that messaging is primary to OO, that the metaphor is cells in an organism. Cells send chemical messages to other cells, and I’m no biologist, but I doubt they are synchronous.

I nearly didn't post this blog tonight. It's terribly abstract, and it sort of flies off in several directions. Seriously, I can only think about this sort of thing for so long before I'm ready to knock myself in the head just to reconnect with something tangible, but it is nice to mull some of these things. You never know what you'll end up with.

If you have time, try the extreme "tell, don't ask" experiment. Try to write a little system using only methods with no return values. If you do, ask yourself afterwards how you really decide whether to ask or tell in your designs. There's a lot to learn there.

Michael Feathers

Recently, I was doing some refactoring with a C# team. We were moving along well, but there were some annoyances. One of them was the back-and-forth navigation that we had to do whenever we chose to extract a method, make it abstract in a base class, and then override it in a derived class. It was a series of refactoring steps that the available C# refactoring tool couldn’t deal with, and it was a bit harder than it would’ve been if we were using Java.

Why?

Well, the problem was the override/virtual syntax in C#. When you have a method that you’d like to override, you have to go to it, make it virtual and then go to the overrides and explicitly mark them with the override keyword.

The thing is, I know why they designed .NET this way. Microsoft has always tried to enable clean incremental deployment of software. It’s been a concern for them from the beginning. COM was designed specifically to allow incremental deployment and MFC was written the way it was to avoid the fragile base class problem.

So I understand that Microsoft wanted to make redeployment of base classes safer, but, really, I wish that I didn’t have to retype and adjust declarations to do it. The IDE should help more.

A year or so ago, I met up with Ivan Moore and he was talking about a tool he wanted to write for Java. What it would do is load up a set of classes and add private to the declaration of every field and method that could be private. It was a cool idea, and he had a very cool name for it: Thatcher (it privatizes things).

Some friends and I were talking to him about it and although many of us agreed that it would be a great tool to have, a few of us were a bit apprehensive. We’d been burned by frameworks that were overly restrictive and we felt that if private were a tool-generated default, it could be misused rather badly.

As we spoke, we had another idea. What if your IDE did the analysis for you, and visually marked every field and method that was not used outside the class? Would that be enough?

Yes, there are a lot of things to think about. One is the scope of analysis. Another is the social effect of replacing prohibition with information. But, in general, I don’t think we make our IDEs do enough of this work. It would be great to have a Haskell editor that derives and shows you the type signature of a function, or a C++ IDE that annotates a view with markers for side-effects. There is a lot that can be done.

I know it's weird to say that IDEs don't do enough for us in a time when nearly all IDEs give you incremental compilation, refactoring, code completion, and hints. But, sadly, the fact that all of that is possible now hasn't yet touched or influenced language design in the mainstream. And it should.

If we ask our tools to give us more information about our programs we might find that we have less to type, and less to change when we do need to make changes. I think I'd like that.

Alberto Savoia

This blog is a follow-up to a previous entry in which I wrote about how thinking about beautiful code also made me think about the opposite end of the spectrum - the kind of code that most developers somewhat rudely refer to as "crap".

If I had to write a tool that would attempt to decide if a particular piece of code could be considered beautiful, I would not know where to start. Detecting crappy code, however, seemed more within reach; so we threw caution to the wind and we gave it a shot. Read on and you can decide if that shot would have been better used on something else (like our foot).

[Cue in sound of a giant can of worms being opened]

Disclaimer

The CRAP metric and the crap4j plug-in are, at this point, highly experimental in nature. The CRAP formula is based on what we believe are sound principles and ideas; but at this time the actual numerical values used in its calculation and interpretation should only be considered a starting point. We plan to refine the metric as we gain experience. Of course, it's also possible that we'll "scrap the CRAP" if we determine that it's not useful.


If you are adventurous and care enough about the topic, read on, download the plug-in, run it on some code, and work with us to improve it (or at least give us feedback, good or bad). Otherwise, I suggest you check back in a few months - after we've gained some experience with this early prototype and done the first round of tuning and cleaning up.

Introduction

There is no fool-proof, 100% objective and accurate way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and some empirical evidence – is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit a “This is crap!” response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into “Oh crap!”

Since writing automated tests (e.g., using JUnit) for complex code is particularly difficult, crappy code usually comes with few, if any, automated tests. The presence of automated tests indicates not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but also that the developers cared enough and had enough time to write tests – which is a good sign for the people inheriting the code.

Because the combination of high complexity and lack of tests appears to be a good indicator of crappy code, my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (CRAP) score uses cyclomatic complexity and code coverage from automated tests to help estimate the effort and risk associated with maintaining legacy code. We started working on an open-source experimental tool called “crap4j” that calculates the CRAP score for Java code. We need more experience and time to fine tune it, but the initial results are encouraging and we have started to experiment with it in-house.

Crap4J is currently a prototype and it’s implemented as an Eclipse (3.2.1 or later) plug-in which finds and runs any JUnit tests in the project to calculate the coverage component. If you are interested in contributing to crap4j’s open-source effort to support other environments and test frameworks (e.g. TestNG) and/or coverage tools (e.g. Emma) please let us know. Instructions for installing the crap4j plug-in are below, but first let’s re-introduce our first pass for the CRAP formula.

Michael Feathers

A few months ago, I was at a conference and I ran a session called Design Sense with a friend of mine, Emmanuel Gaillot. We presented over a hundred snippets of code and asked participants to rate them. The idea was to see just how much we could all agree on good design. We gathered a lot of information and learned quite a bit.

The examples came from all over. Many were from open source projects. Some were examples from the web (a couple of the highest scoring snippets were excerpts from Python code that Peter Norvig wrote), and some of them were from open source things that Emmanuel and I had done in the past. When we asked people to rate them, we didn’t tell them who had written them, so it was fun to see just what they thought of some of our code. It turns out that there was one snippet of mine that I was particularly proud of, but it didn’t score very high. Here it is:

#define TEST(name,classUnderTest) \
  class classUnderTest##name##Test : public Test \
  { \
  public: \
    classUnderTest##name##Test () : Test (#name "Test") {} \
    void runTest (TestResult& result_); \
  } classUnderTest##name##Instance; \
  void classUnderTest##name##Test::runTest (TestResult& result_) 

Yeah, I know. It does look ugly, but here’s the story. That macro is from a C++ testing harness that I wrote a long time ago, and ugly as it was, I had good intentions. The idea was to make it drop dead easy for people to write test cases in C++. And, I didn’t want to use C++ templates either. At that time, it seemed that there were too many people who had compilers that didn’t support them very well.

In JUnit, it’s easy. You write a new class, mark it with some JUnit annotations, and the framework takes over. JUnit uses reflection to find the class and find the appropriate testing methods. Then it creates the test objects, and runs them. It’s all hands-free. In C++, however, there is no reflection, so you are stuck registering test cases, grouping cases into suites, and registering suites.

The goal was to make writing and registering a test as simple as this:

TEST(recording, Vise)
{
  vise.openSection("aSection");
  ASSERT_TRUE(vise.isRecording());
}

And that macro does the job. Here’s how:

The TEST macro takes two tokens as arguments and uses them to construct the name of a new class using the token pasting operator (##). The macro defines the class and declares an instance that is created when the program starts up. The class of that instance inherits from a class named Test whose objects register themselves with a global test runner when they are constructed. The macro ends with the declaration of a method named runTest. People who are using the macro supply code in a set of braces after the macro call. That code becomes the body of runTest.
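The post doesn't show the rest of the harness, so the following is only a sketch of what the surrounding machinery could look like. TestResult, the Test base class with its registry, the CHECK macro, and runAllTests are stand-ins I've assumed for illustration. Only the TEST macro itself is copied from above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumed stand-in: the original harness's TestResult is not shown.
struct TestResult {
    int run = 0;
    int failures = 0;
};

// Assumed stand-in for the Test base class: objects register on construction.
class Test {
public:
    explicit Test(const std::string& name) : name_(name) {
        registry().push_back(this);
    }
    virtual ~Test() = default;
    virtual void runTest(TestResult& result_) = 0;
    const std::string& name() const { return name_; }

    // Function-local static, so the list exists before any global Test
    // instance is constructed.
    static std::vector<Test*>& registry() {
        static std::vector<Test*> tests;
        return tests;
    }

private:
    std::string name_;
};

// Hypothetical assertion macro: counts a failure in the enclosing result_.
#define CHECK(condition) \
  do { if (!(condition)) ++result_.failures; } while (0)

// The macro from the post, unchanged.
#define TEST(name,classUnderTest) \
  class classUnderTest##name##Test : public Test \
  { \
  public: \
    classUnderTest##name##Test () : Test (#name "Test") {} \
    void runTest (TestResult& result_); \
  } classUnderTest##name##Instance; \
  void classUnderTest##name##Test::runTest (TestResult& result_)

// Runs every test that registered itself during static initialization.
inline TestResult runAllTests() {
    TestResult result;
    for (Test* test : Test::registry()) {
        ++result.run;
        test->runTest(result);
    }
    return result;
}

TEST(addition, Arithmetic)
{
  CHECK(1 + 1 == 2);
}

TEST(brokenOnPurpose, Arithmetic)
{
  CHECK(1 + 1 == 3);
}
```

With these two tests in the file, runAllTests() reports two tests run and one failure. The registry is a function-local static rather than a plain global because the test instances are themselves globals, and a plain global list might not be constructed yet when the first test tries to register.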

Once all of that machinery is there, you can create an arbitrary number of tests:

TEST(recording, Vise)
{
  vise.openSection("aSection");
  CHECK(vise.isRecording());
}

TEST(gripAValue, Vise)
{
  vise.openSection("aSection");
  vise.grip("3", __FILE__, __LINE__);
  vise.closeSection();

  CHECK_FALSE(vise.isRecording());
  LONGS_EQUAL(1, store.size("aSection"));
}

The thing that makes the C++ preprocessor so powerful is the fact that you can use it to create unhygienic macros – it’s all just simple text replacement and you can replace text anywhere, even across language constructs. TEST uses this power to generate a declaration that should be followed by a block. It’s almost Ruby-esque that way.

I hate C++’s preprocessor, but I love it also. When you’re in a corner and you need to create an abstraction that doesn’t really align with C++’s grammar, you’re not completely lost. You do, however, pay a price.

It’s been a while since I wrote that code, and I’ve found better ways of doing similar things since then. But, sometimes making one thing ugly allows you to make many other things nice, and that can be a fair trade.

Note: The test macro above was inspired by a similar macro in Mike Hill's TestKit framework. The idea lives on in UnitTest++ and a few other C++ testing frameworks.

Michael Feathers

I was sitting with some friends the other week and a question came up. Someone asked, “so, when did you last write a for-loop?” We all chimed in, one after another, and the answers varied. Some of us had written fors earlier in the day; others, the previous week. Some of my friends hadn’t written a for in months. And, no, it wasn’t because they’d left programming, it was because they were working in languages where you either don’t have fors or you don’t have to use them often… languages like Ruby, Smalltalk, OCaml, and Haskell.

Well, more power to them. I work in C, C++, and Java periodically. For-loops will always be a part of C. In C++, you can avoid fors with algorithms and functors (although the syntax is a bit wild), but in Java, you’re still stuck with them unless you opt for the Apache collections library or some homegrown solution. Yes, there’s talk about adding closures to Java, but it’s not there yet.

There’s something very annoying about writing a for. For one thing, you have to deal with a lot of detail that is really “off the side” of what you want to do. If you want to iterate an array in C, you can do this:


for(int n = 0; n < size; ++n) 
    sum += elements [n];

Isn’t it ridiculous that most of the complexity is in the first line, rather than the second? It’s all bookkeeping.
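For comparison, this is roughly what the C++ "algorithms and functors" route looks like for the same sum. The function names and the Summer functor are mine, chosen for the example. std::accumulate and std::for_each are the standard library algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// The library walks the range; we only supply the starting value.
inline int sumOf(const std::vector<int>& elements) {
    return std::accumulate(elements.begin(), elements.end(), 0);
}

// A hand-written functor: an object that can be called like a function.
struct Summer {
    int total = 0;
    void operator()(int value) { total += value; }
};

// std::for_each returns the functor it was given, with its state intact.
inline int sumWithFunctor(const std::vector<int>& elements) {
    return std::for_each(elements.begin(), elements.end(), Summer()).total;
}
```

The bookkeeping is gone from the call site, though the functor version shows why I call the syntax a bit wild.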

Isn’t it even more ridiculous that we’ve been doing this for years? And, sadly, it’s not because we haven’t tried to abstract our way out of it. Since the beginning of “structured programming” people have looked into ways to make iteration less onerous. I remember Pascal and Ada with their “upto”s and “downtos”.. and there was the CLU language, which introduced the concept of an iterator, which was great, but then we had to live with this for years in Java:


for (Iterator it = orders.iterator(); it.hasNext(); )  {
   Order order = (Order)it.next();
   …
}

It was just insane. And, yes, the for syntax introduced in Java 5 is nicer, but, really, just give me a block:


 orders.each { | order | ... }

So we're moving forward. People are moving towards blocks and closures, but why did it take so long?
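As a sketch of the same idea outside Ruby, here is how a block-style each can be approximated in C++ using lambdas, which only arrived in C++11. The Order and Orders types and the totalQuantity helper are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types, for illustration only.
struct Order {
    std::string item;
    int quantity;
};

class Orders {
public:
    void add(Order order) { orders_.push_back(std::move(order)); }

    // Internal iteration: the collection owns the loop and the caller
    // supplies the block.
    template <typename Block>
    void each(Block block) const {
        for (const Order& order : orders_) block(order);
    }

private:
    std::vector<Order> orders_;
};

// The call site reads much like: orders.each { | order | ... }
inline int totalQuantity(const Orders& orders) {
    int total = 0;
    orders.each([&total](const Order& order) { total += order.quantity; });
    return total;
}
```

There is still a for inside each, but it is written once, in the collection, and callers never see it.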

Part of it, I think, is a historical fear of inefficiency. For the longest while, the idea of using a block in the context of iteration was scary for many people. There's more indirection. And, regardless of whether it was slower, it looked slower. Another part, I think, was a fear among language designers that the vast majority of programmers wouldn't be able to handle blocks or that they would be put off by them. And I think that today we can say that that is obviously wrong. It's just a shame that it took so long to figure out.

So, if I run into those same friends in five years and we have the same conversation, I wonder what the answers will be? Will any of us still be writing fors? Only the C folks.. that's my bet.

Michael Feathers

I’ve been diving into OCaml recently. It’s a fun language and a bit different as the statically typed, type-inferencing languages go. OCaml is not only a full-bore functional programming language; it also compiles to native code on many platforms and it's very fast. You can’t call it a propeller-head language. That won’t do. People do some incredibly practical work in OCaml. It’s sort of the functional programming language equivalent to Python.

Part of the fun of OCaml is its syntax. It's a little quirky. List elements are separated not by commas, but by semicolons:

List.iter GlDraw.vertex2 [-1., -1.; 0., 1.; 1., -1.];

There are some other strange choices as well. If you want to send a message, you use the hash symbol between the object name and method rather than a dot:

  method run result =
      result#test_started;
      self#set_up;
      try self#run_test result with
        TestFailure message ->
          result#add_failure test_name message;
      self#tear_down

I’m not sure why the language designers made these choices, but I suspect that history has something to do with it. OCaml is derived from Caml which, in turn, is derived from Robin Milner’s ML. Language design is evolutionary and early choices, to some extent, constrain later choices. One thing that is obvious, though, is that OCaml definitely isn’t in the C language family so if you're used to Java, C++, or C#, it does look a bit alien. For me, it's been hard to tell whether its syntax is any more inconsistent than the syntax of C-derived languages. It may just be that my eyes aren’t tuned to it yet.

It’s hard to talk about it without devolving into a language war, but syntax does matter. It affects more than aesthetics; it can affect correctness as well.

One of the bits of syntax I love the most is the way that Ruby handles instance variables. In Ruby, you preface them with an ‘@’. My first impression when I saw this was “Wow, what a pain! You have to have ‘@’s all over the place.” But, as I used Ruby, I appreciated how important that choice was. In C++ and Java, it is relatively easy to assign an instance variable to itself in a constructor. That never comes up in Ruby. Moreover, you aren’t tempted to make up silly names to disambiguate constructor arguments and instance variables.
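Here is a small C++ sketch of the self-assignment mistake. Both classes are invented for the example, and the first one is wrong on purpose. Some compilers will warn about it, but the code is legal.

```cpp
#include <cassert>

// Deliberately wrong: inside the constructor body, `count` names the
// parameter, so the statement assigns the parameter to itself and the
// member keeps its default value.
class BuggyCounter {
public:
    explicit BuggyCounter(int count) { count = count; }
    int value() const { return count; }
private:
    int count = 0;
};

// One common fix: qualify the member with this->, which is roughly the
// job that '@' does in Ruby.
class FixedCounter {
public:
    explicit FixedCounter(int count) { this->count = count; }
    int value() const { return count; }
private:
    int count = 0;
};
```

BuggyCounter(5) ends up holding 0, while FixedCounter(5) holds 5. The other usual workaround is to rename one of the two, which is where the silly names come from.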

No, I think syntax does matter, and I don't think it's wholly subjective.

Here's a quicksort in Erlang:

qsort([]) -> [];
qsort([Pivot|Rest]) -> qsort([ X || X <- Rest, X < Pivot]) 
        ++ [Pivot]
        ++ qsort([ Y || Y <- Rest, Y >= Pivot]).

And, here's a similar quicksort in Haskell:

quicksort [] = []
quicksort (s:xs) = quicksort [x|x <- xs,x < s] 
        ++ [s] 
        ++ quicksort [x|x <- xs,x >= s] 

Which one do you like better? Which one seems easier on the eyes? Are there some objective criteria for syntax in language design?