Where leading programmers explain how they find
unusual and carefully designed solutions

Greg Wilson

Recent Posts

Greg Wilson

As multicore CPUs and distributed applications become the norm rather than the exception, more and more programmers are discovering that concurrent programming is a lot harder than sequential programming---so much so, in fact, that traditional "hack and patch" approaches simply don't work. Race conditions and deadlocks are hellish hard to reproduce under controlled circumstances; if you don't design and analyze concurrent systems before putting them together, the odds are long that they'll never really work.

I therefore wasn't surprised that several of the contributors to Beautiful Code chose concurrency as their topic. While beauty is desirable elsewhere, I think it's essential in the brave new parallel world the industry is rushing toward. I'd therefore like to set a challenge to readers of this blog: we'll send a free copy of the book to the 20 people who submit the most elegant, correct, solution to the problem described below by this Halloween (October 31, 2007). Answers don't have to consist of running programs---sufficiently detailed pseudocode is fine---but you must explain (a) why you believe your solution works, and (b) what makes it beautiful. You can submit your entry at the Beautiful Code wiki site; the rules of the game are given there, and below.


Speed Scrabble is a game for two or more players that takes only a few minutes to play. The rules are simple:

  1. Spread the tiles face down on the table. Each player chooses 15 tiles at random, and puts them in front of herself; the rest are set aside.
  2. Someone says "Go!" Every player turns over 7 of her tiles and starts trying to make words with them. Words must crisscross in the standard Scrabble way, and the players' "boards" are independent---each player only uses her own characters.
  3. As soon as any player has made a board using all 7 of her characters, she shouts, "Go!" Every player turns over another character, so that all players now have 8 characters to use. Players may rearrange their pre-existing words when and as they please---they do not have to stick to their earlier choices.
  4. When any player has used all 8 of her characters, she shouts, "Go!", and everyone turns over their ninth tile. This repeats until everyone has turned over all 15 tiles. The winner is whoever makes a legal 15-character board first.

The challenge is to design the architecture of a "referee" program for a computer Speed Scrabble tournament. Contestants will submit classes that extend an abstract base class called IPlayScrabble that the contest organizers will provide. These classes will be loaded by a SpeedScrabble program, which will create an instance of each one and pass it the dictionary of words that can be used in the game, and its first seven tiles.

Each player will run in its own thread, in parallel with its competitors. As soon as it has made a board with the tiles it has, it will request another tile from the "referee". The first player to complete a 15-tile board wins.

This is trickier than it first seems because players have to be able handle "interrupts", i.e., notice when the referee is trying to give them another tile that they didn't ask for because some other player has taken the lead. (It's often the case that a set of $N$ tiles has no solution, but an $N+1$-tile extension does, so not taking tiles as soon as they're offered is a good way to come in last.) At the same time, when the central referee is handing out tiles, it must never wait for a player to accept one before moving on to the next player, because if it takes player P two seconds to say, "Yes, thank you, carry on," that's two seconds that player P+1 has lost.

So: how will the central referee hand out characters? What methods will you give players, and which ones will they be trusted to implement themselves? Most importantly, how will you convince sceptics that your synchronization and communication scheme can't deadlock, and won't ever starve one player because of something another player has done?

Greg Wilson

One of the reasons there's less beautiful code in the world than most of us would like is that academic computer scientists don't actually seem to care much about it. Other than a lecture or two about commenting and sensible variable names, they don't talk about it in class---certainly not as much as professors of architecture talk about the esthetics of their discipline---and most journal editors never even look at their contributors' code. The result is a vicious circle: people don't think other people care about beautiful code, so they don't talk about it, so everyone believes it isn't really important.

One welcome exception is John Wiley & Sons' journal Software: Practice & Experience. Now in its thirty-seventh year, SP&E has published everything from Bourne's "A Design for a Text Editor" and first description of Make to papers on efficient plagiarism detection for large code repositories and an empirical study of Java bytecodes.

One of my favorite SP&E papers from the past few years is Jing-Chao Chen's "Building a new sort function for a C library", from 2004 [1]. You might think that sorting (at least for data sets smaller than Google's) is a solved problem; if so, Chen's abstract is worth reading in full:

Up to now, the most efficient sort function for a C library was a variant of Hoare's Quicksort (qsort, for short), which was proposed by Bentley and Mcllroy in the early 1990s. Here we call this library function BM qsort. This paper introduces a library function which is based on Chen's Proportion Extend Sort (psort). This work is inspired by the fact that psort is generally faster than qsort, and in the worst case, qsort requires O(n2) comparisons, while psort requires O(n log n). In our library function, many tricks used in BM qsort have been adopted, such as sorting a small array by an insertion sort and the handling of equal elements. Some new tricks are also proposed, however, such as the adaptive partitioning scheme. To assess more effectively the behavior of our function, the test cases are enhanced. The empirical results show that our function is robust and faster than BM qsort. On particular classes of inputs, such as 'already sorted', it is linear time.

Chen' Proportion Extended Sort doesn't outperform Quicksort in big-Oh terms---nothing could---but over the course of eighteen pages, this paper shows how and why it does better in practical terms. It's a great example of how practitioners can learn from one another, and gradually improve on one another's work. It is also, once you understand it, some very elegant code.

[1] Jin-Chao Chen: "Building a new sort function for a C library". Software: Practice & Experience, 34:777-795, April 2004.