About five years ago, I was visiting a company on the east coast and I sat with one of their developers. We were about to make some changes in a method that was a couple hundred lines long and I said “Let’s break this method down a bit; let’s do some extractions.” The developer looked at me and said “Oh. You’re an academic.”
I don’t think I’ll ever forget that incident. I knew that I probably wouldn’t get much mileage out of saying “Hey, the best programmers I know would say that this method is ridiculously long, if you don’t see it that way, I don’t know what to say.” So, we worked together for a while and it snuck up on him. We got to a point where it was obvious that we had to do something different, and sure enough it was extract method.
The fact is I didn’t have the heart to tell him that it was a bit worse than he thought. I’m not an academic, but I do like to factor my code to a very fine grain. I like to break methods down into itty bitty pieces so that every single one of them does exactly one thing. They’re one to five lines, ten or twenty lines max; and, in my opinion, they’re beautiful.
Here’s an example. I was writing a utility a while ago that produces specialized source code wrappers for classes in Java libraries. As I was working on it, I noticed repetitive formatting. I pulled it out into a class and this is what I ended up with:
    package glaze.tool;

    public class Formatter {
        private String text = "";
        private String indentation = "";

        public void newLine() {
            newLine("");
        }

        public void newLine(String lineText) {
            newText("\n" + indentation + lineText);
        }

        public void newText(String newText) {
            text += newText;
        }

        public void indent() {
            indentation += "\t";
        }

        public void outdent() {
            if (indentation.length() > 0)
                indentation = indentation.substring(1);
        }

        public String getText() {
            return text;
        }
    }
The methods are ridiculously small but they form a little language that reads rather well in code. Here’s an example of its use:
    package glaze.tool;

    import java.lang.reflect.Method;

    public class GlazeInterface extends GlazeType {
        private final static String INTERFACE_NAME_PREFIX = "I";

        public GlazeInterface(Class clazz, ClassStore store) {
            super(clazz, store);
        }

        @Override
        protected void buildDeclaration() {
            out.newLine("public interface " + getName() + " {");
            out.indent();
            buildMethods();
            out.outdent();
            out.newLine("}");
            out.newLine();
        }

        protected GlazeMethod makeGlazedMethod(Method method) {
            return new GlazeInterfaceMethod(method, store);
        }

        public String getName() {
            return INTERFACE_NAME_PREFIX + wrappedClassName();
        }
    }
Now, if you’ve encountered code like this before, code where all of the methods are very short, you might have cursed the person who wrote it. It looks overfactored, but is it? Most of the methods ended up this small because I was very zealous about removing duplication, and it’s hard to argue for duplication. On the other hand, I have to admit that the code is a bit opaque. It takes time and study to see how the pieces fit together, but once you understand, changes are often easier and you don’t have to touch as much code to make them.
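To make the "little language" concrete, here is a minimal sketch that runs the same call sequence as buildDeclaration() against a copy of the Formatter. FormatterDemo, render and the IWidget method signatures are made up for illustration; the real GlazeType, buildMethods() and ClassStore are not shown in the post, so a fixed list of signatures stands in for them.

```java
// Hypothetical demo class; only the Formatter methods come from the post.
class FormatterDemo {

    // Copy of the Formatter above, nested so the example compiles on its own.
    static class Formatter {
        private String text = "";
        private String indentation = "";

        public void newLine() { newLine(""); }

        public void newLine(String lineText) {
            newText("\n" + indentation + lineText);
        }

        public void newText(String newText) { text += newText; }

        public void indent() { indentation += "\t"; }

        public void outdent() {
            if (indentation.length() > 0)
                indentation = indentation.substring(1);
        }

        public String getText() { return text; }
    }

    // Same sequence of calls as buildDeclaration(), with the method
    // signatures passed in instead of discovered by reflection.
    static String render(String interfaceName, String... methodSignatures) {
        Formatter out = new Formatter();
        out.newLine("public interface " + interfaceName + " {");
        out.indent();
        for (String signature : methodSignatures) {
            out.newLine(signature);
        }
        out.outdent();
        out.newLine("}");
        out.newLine();
        return out.getText();
    }

    public static void main(String[] args) {
        System.out.print(render("IWidget", "void draw();", "int size();"));
    }
}
```

Because newLine puts the line break before the text rather than after it, the rendered string starts with an empty line.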
I spend a lot of time helping people in ugly sprawling code, so it's nice when I get a chance to factor code finely. I don't think I'm alone. I've seen a number of examples of this coding style over the years: a Java port of the HotDraw framework, an early version of Emacs, and various implementations of the STL in C++.
If you haven't taken a piece of code and tried to squeeze out every bit of duplication, try it as an experiment. See what happens.

Comments (12)
I recently wrote a Python script after having spent some time in Haskell, and I found I was writing code like this -- functions that were often 1 or 2 lines, rarely more than 5 -- and when they got to 10 lines, I started looking at them like they needed refactoring.
In fact I noticed I was automatically pulling out functions that encapsulated a single piece of knowledge -- even when the function was only used once.
The result is code that is extremely easy to read top down -- in each function you only ever have to deal with one thing. The fact that you can't understand it bottom up is basically irrelevant, because for most programs you never start to look at them that way.
A few years ago, I remember reading someone on c2.com saying that they usually had methods that were just 5 - 10 lines long, and I couldn't imagine how they ever got anything done in 5 lines. In some languages, with some APIs that are at just the wrong 'level', it can in fact be difficult to get anything done, but more and more this style is natural to me.
Posted by Luke Plant | July 24, 2007 3:50 PM
I believe strongly in DRY, although I (and a lot of other people, I think) interpret it in practice as "don't repeat yourself too much if you can help it."
Posted by warren | July 24, 2007 4:38 PM
What a shame Java doesn't have blocks, you could factor that formatter a little further if you did. Here's a Smalltalky version of what I mean:
    out declare: ('public interface ', self name) body: [ printBuildMethodsOn: out ]
The body of declare:body: would look something like:
Or could you get the same/similar effect with an inner class?
Posted by Piers Cawley | July 24, 2007 4:46 PM
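On the inner-class question in Piers Cawley's comment: one way it might look in Java is to pass the body as a Runnable written as an anonymous inner class. This is only a sketch under assumptions. BlockFormatter, declare and BlockFormatterDemo are invented names, and this is not the Smalltalk declare:body: implementation the comment refers to.

```java
// Hypothetical sketch: BlockFormatter and declare() are not from the post.
class BlockFormatter {
    private final StringBuilder text = new StringBuilder();
    private String indentation = "";

    public void newLine(String lineText) {
        text.append("\n").append(indentation).append(lineText);
    }

    // Writes the header, runs the body one level deeper, then closes the brace.
    public void declare(String header, Runnable body) {
        newLine(header + " {");
        String saved = indentation;
        indentation += "\t";
        try {
            body.run();
        } finally {
            indentation = saved; // restored even if the body throws
        }
        newLine("}");
    }

    public String getText() {
        return text.toString();
    }
}

class BlockFormatterDemo {
    static String render() {
        final BlockFormatter out = new BlockFormatter();
        // The anonymous inner class plays the role of the Smalltalk block.
        out.declare("public interface IWidget", new Runnable() {
            public void run() {
                out.newLine("void draw();");
            }
        });
        return out.getText();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

With this shape the caller never calls indent or outdent directly, so the two cannot get out of step. On Java 8 and later the anonymous class can be written as a lambda.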
I generally agree with you. I like code with methods/functions small and to the point. And this is precisely why I dislike embedded Java. Due to the particular way the compiler works, the more methods and classes you have, the larger the final binary, even if that removes a lot of duplication. The current code I work on started with small methods, and we had to merge them into hundred- or thousand-line monsters so that the code would fit in the size limit.
Functional languages are quite powerful for refactoring and limiting duplication, thanks to first-class functions and closures. I am thinking of trying Lisp, because Lisp macros seem to be able to go beyond that and allow refactorings that are not possible in any other language.
Posted by Florian | July 24, 2007 8:49 PM
A few years ago I had a similar "a ha!" moment regarding refactoring and small methods. My employer had hired Fred George as an XP coach and he set the standard: there shall be no Java method longer than three lines. I thought this was ridiculous, as did some of the others in our group, but a few public (as in "get the projector and gather 'round") refactoring sessions made a believer of me. After going through the process the code looks better, is more understandable, and, perhaps just as important, developers understand the problem and its solution better than they did before.
Posted by Bob Rogers | July 25, 2007 12:41 PM
It's great to see so much positive response. I still encounter many people who are freaked out by this style of development. You have to work with them to get them to see the advantages. It definitely takes some acclimation.
Posted by Michael Feathers | July 26, 2007 2:11 AM
I compulsively do this and I'm glad I finally know the name for the process. I always have this strange guilty feeling when I write code that is redundant, even if there's no other way around it (e.g., those times when you repeat a single line of code once or twice, which isn't enough to warrant writing a function, or even a loop).
Very nice though. I'll have to remember that name.
(I like this site btw, I'll have to pick up a copy of the book. It's nice to see there are other people out there that appreciate elegant/beautiful coding practices.)
Posted by Aaron | August 8, 2007 11:36 PM
To me, beautiful code gets the coupling right. The Formatter example is problematic in this regard. For example, take these two lines out of context:
    indentation += "\t";
    indentation = indentation.substring(1);
The 1 and "\t" are implicitly coupled through asymmetric, imperative code. This is one of the main causes of software rot (entropy is a poor analogy, imiho), which you seem concerned to eliminate.
Here's why implicit coupling is problematic. Say I move to a shop that likes to indent without tabs and indents by four. I then write:
    indentation += "    ";
    indentation = indentation.substring(4);
It's very difficult to see that the length of the string and the 4 are coupled. Indeed, it's hard to see that the string is exactly four spaces. What's most important is that the lines that I brought together are in fact separated by several other lines, and are only coupled by the names of the methods: indent and outdent.
Nit picking? Not in my opinion. Implicit coupling crops up in most code. We are taught imperative coding in school, because that's what the teachers know. Teachers don't write a lot of code, and if they do, it's often throwaway code. Refactoring isn't really something most teachers have time for. Extreme refactoring is unheard of.
You can refactor this example statelessly and have the coupling be explicit. Unfortunately, it's hard in Java, because it requires closures, or at a minimum, the ability to use other abstractions like a list of lines of code as an intermediate form.
Almost every line of code (that's not declaration) in the Formatter is implicitly coupled. Another example is the fact that the indentation variable must be used, or indentation won't happen. It's an unnecessary artifact of imperative coding, and one of the byproducts of our hyper-focus on object-oriented programming in schools. Even the Smalltalk example makes the same mistake: indent and outdent should be explicitly coupled as one function.
Another example of implicit coupling is that out is the object used for producing the result. This is an artifact of the unnecessary coupling between formatting and output, but coupling correctly is hard. A more stateless approach would return a successive concatenation of strings or, more simply, build up a list of strings with newlines added later. Each would be added in a relation to the other -- not as in the above example by virtue of them simply being in the same module. This would eliminate the need for this line of code, also:
    if (indentation.length() > 0)
This imperative code is protecting against the stateful design choice that the caller controls indentation through outdent, which is an unnecessary function, as I pointed out earlier.
I hope this feedback helps you to write more stateless and more explicitly-coupled code in the future.
P.S. you probably want to explore lexically and dynamically scoped lambdas before you dive into macros.
Posted by Rob Nagler | August 9, 2007 10:31 AM
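Rob Nagler's comment does not include an implementation, so the following is only one possible reading of "a list of strings with newlines added later", written in plain Java without closures. Every name here (Lines, indented, block, join, the IWidget signatures) is an assumption made for the sketch. Indenting and outdenting collapse into a single pure function, and no object holds a current indentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names appear in the post or the comment.
final class Lines {
    private Lines() {}

    static final String INDENT = "\t";

    // Returns a new list with every line one level deeper; the input is untouched.
    static List<String> indented(List<String> lines) {
        List<String> result = new ArrayList<String>();
        for (String line : lines) {
            result.add(INDENT + line);
        }
        return result;
    }

    // A block is a header, an indented body and a closing brace.
    static List<String> block(String header, List<String> body) {
        List<String> result = new ArrayList<String>();
        result.add(header + " {");
        result.addAll(indented(body));
        result.add("}");
        return result;
    }

    // Newlines are added only here, at the very end.
    static String join(List<String> lines) {
        StringBuilder text = new StringBuilder();
        for (String line : lines) {
            text.append(line).append("\n");
        }
        return text.toString();
    }

    public static void main(String[] args) {
        List<String> methods = Arrays.asList("void draw();", "int size();");
        System.out.print(join(block("public interface IWidget", methods)));
    }
}
```

Since there is no outdent, there is nothing to guard against, and changing the indentation style means changing INDENT in one place. Nested blocks work because indented is applied to whatever list the inner block returns.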
Rob:
Thanks. I agree and I take your suggestion to heart. I hadn't thought about the implicit coupling.
I was wondering: do you have an implementation in mind that you'd like to share? It would be nice to see this sort of thing in a stateless style.
Posted by Anonymous | August 9, 2007 11:36 AM
I don't have an implementation in mind; that would be putting the cart before the horse. I don't understand the problem statement as put forth by Michael:
My guess is that the particular problem doesn't matter. Michael was trying to explain his approach to programming. The comments I wrote earlier today were a vain attempt 1) to let off steam due to my excessive workload and 2) to point out another view of the meaning of beautiful code. It was fun to think about how many ways the code above was implicitly coupled. Then again, if it's Java, it's probably implicitly coupled. This isn't the right forum to explain this statement so feel free to contact me offline.
Posted by Rob Nagler | August 10, 2007 1:12 AM
Rob:
Apologies. That was me as "anonymous" above. I forgot to put in my name.
Posted by Michael Feathers | August 10, 2007 10:13 AM
just a few small suggestions
- you can use a StringBuffer / StringBuilder inside your class. Most JVMs implement the + operator on String with one of these helper classes anyway, but this way it doesn't have to be re-created all the time.
- to avoid implicit coupling, you can use an indent amount and an indent string constant (or even make it configurable) as such:
    private static final int INDENT_AMOUNT = 1;
    private static final String INDENT_STRING = "\t";
    private int indentation = 0;
and then
    public void newLine(String lineText) {
        buf.append("\n");
        for (int i = 0; i < indentation; i++) {
            buf.append(INDENT_STRING);
        }
        newText(lineText);
    }

    public void indent() {
        indentation += INDENT_AMOUNT;
    }

    public void outdent() {
        indentation = Math.max(0, indentation - INDENT_AMOUNT);
    }

    public String getText() {
        return buf.toString();
    }
indenting with 4 spaces would mean:
    INDENT_AMOUNT = 4;
    INDENT_STRING = " ";
now, you could do away with INDENT_AMOUNT and set INDENT_STRING to "    ", but that's arguably less verbose.
Posted by Gergely Nagy | August 14, 2007 5:41 AM
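The fragments in Gergely Nagy's comment leave out the buf field and the rest of the class, so here is a minimal self-contained sketch of the counter-based idea. It takes the "make it configurable" option and drops INDENT_AMOUNT in favour of a single indent unit passed to the constructor. CountingFormatter, indentUnit and level are names assumed for illustration.

```java
// Hypothetical sketch: CountingFormatter and its constructor argument are
// assumptions, not code from the post or the comment.
class CountingFormatter {
    private final StringBuilder buf = new StringBuilder();
    private final String indentUnit; // e.g. "\t" or four spaces
    private int level = 0;           // nesting depth, not a character count

    CountingFormatter(String indentUnit) {
        this.indentUnit = indentUnit;
    }

    public void newLine() {
        newLine("");
    }

    public void newLine(String lineText) {
        buf.append("\n");
        for (int i = 0; i < level; i++) {
            buf.append(indentUnit);
        }
        buf.append(lineText);
    }

    public void indent() {
        level++;
    }

    public void outdent() {
        level = Math.max(0, level - 1); // never drops below zero
    }

    public String getText() {
        return buf.toString();
    }

    public static void main(String[] args) {
        CountingFormatter out = new CountingFormatter("    ");
        out.newLine("public interface IWidget {");
        out.indent();
        out.newLine("void draw();");
        out.outdent();
        out.newLine("}");
        System.out.println(out.getText());
    }
}
```

Because the depth is stored as a number, the indent string appears in exactly one place, and no substring length has to be kept in sync with it.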