In Design Sense, Michael Feathers ponders how to teach what is and is not a good design. He mentions an exercise to help think like an artist; flip a picture upside-down. This eliminates the bias of recognition; it's not a face or a flower, it's just a series of lines and shapes. Is there anything similar for coding? Where are our biases?

Programmers tend to be overly detail oriented. We get hung up on what the code does and how it does it. I've observed that struggling programmers focus on (and argue about) syntax details. Is it "if ( condition )" or "if( condition )" or "if (condition)"? Do I use camelCase or underscore_delimited? Should I use a regex or substr()? Can I tweak this line to make it run faster?

Experienced programmers look at structure. How deeply nested is the code? How well encapsulated? Long subroutines with lots of nesting raise red flags and are immediate targets for extract method refactoring. Compound statements are looked upon with skepticism; would a temp variable make it simpler?

How do you think structurally? As with the artist-to-be turning their picture upside-down to eliminate the distraction of recognition, programmers must eliminate their bias towards syntax details. A simple way to do this is to zoom out, literally. Make your font so small the syntax details are no longer visible, only the indentation. There's nothing left to consider but structure.

Let's take a look at an example from ExtUtils::MakeMaker, a very old and very complicated Perl module. Look at one of the most complicated sub-modules, ExtUtils::MM_Unix. Colored syntax highlighting helps, so you might want to have a look in your favorite editor. Lower the font size until the letters are almost indistinguishable; 7 point Monaco works well for me.

MM_Unix%20from%2010%2C000%20feet.jpg

Structural quirks stand out immediately. c_o() contains an odd, repeating indentation structure with duplication.

        push @m, '
    .c.s:
    	$(CCCMD) -S $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.c
    ';
        push @m, '
    .c$(OBJ_EXT):
    	$(CCCMD) $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.c
    ';
        push @m, '
    .C$(OBJ_EXT):
    	$(CCCMD) $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.C
    ' if !$Is_OS2 and !$Is_Win32 and !$Is_Dos; #Case-specific
        push @m, '
    .cpp$(OBJ_EXT):
    	$(CCCMD) $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.cpp

    .cxx$(OBJ_EXT):
    	$(CCCMD) $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.cxx

    .cc$(OBJ_EXT):
    	$(CCCMD) $(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE) $*.cc
    ';
        join "", @m;

First thing is to clear up all the clutter caused by the duplication and put in some vertical whitespace.

        my $command = '$(CCCMD)';
        my $flags   = '$(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE)';

        push @m, qq{
    .c.s:
    	$command -S $flags $*.c
    };

        push @m, qq{
    .c$(OBJ_EXT):
    	$command $flags $*.c
    };

        push @m, qq{
    .C$(OBJ_EXT):
    	$command $flags $*.C
    } if !$Is_OS2 and !$Is_Win32 and !$Is_Dos; #Case-specific

        push @m, qq{
    .cpp$(OBJ_EXT):
    	$command $flags $*.cpp

    .cxx$(OBJ_EXT):
    	$command $flags $*.cxx

    .cc$(OBJ_EXT):
    	$command $flags $*.cc
    };

        return join "", @m;

This turns out to be a series of pushes onto an array with a special case in the middle. A little restructuring moves the special case to the end and collapses two pushes into one.

        my $command = '$(CCCMD)';
        my $flags   = '$(CCCDLFLAGS) "-I$(PERL_INC)" $(PASTHRU_DEFINE) $(DEFINE)';

        push @m, qq{
    .c.s:
    	$command -S $flags \$*.c

    .c$(OBJ_EXT):
    	$command $flags \$*.c

    .cpp$(OBJ_EXT):
    	$command $flags \$*.cpp

    .cxx$(OBJ_EXT):
    	$command $flags \$*.cxx

    .cc$(OBJ_EXT):
    	$command $flags \$*.cc
    };

        push @m, qq{
    .C$(OBJ_EXT):
    	$command \$*.C
    } if !$Is_OS2 and !$Is_Win32 and !$Is_Dos; #Case-specific

        return join "", @m;

Now deeper issues can be considered; does that special case at the end belong in an OS-specific sub-class? Peer closer and you notice a discrepancy between what the condition is doing (checking the presence of various operating systems) and what the comment says (looking for a case-sensitive filesystem). Hmmm!

cflags() spans more than a page and is an obvious candidate for breaking up. The condition commented "Expand hints for this extension via the shell" begs for extraction.

        my($name);
        ( $name = $self->{NAME} . "_cflags" ) =~ s/:/_/g ;
        if ($prog = $Config{$name}) {
            # Expand hints for this extension via the shell
            print STDOUT "Processing $name hint:\n" if $Verbose;
            my(@o)=`cc=\"$cflags{cc}\"
              ccflags=\"$cflags{ccflags}\"
              optimize=\"$cflags{optimize}\"
              perltype=\"$cflags{perltype}\"
              optdebug=\"$cflags{optdebug}\"
              eval '$prog'
              echo cc=\$cc
              echo ccflags=\$ccflags
              echo optimize=\$optimize
              echo perltype=\$perltype
              echo optdebug=\$optdebug
              `;
            my($line);
            foreach $line (@o){
                chomp $line;
                if ($line =~ /(.*?)=\s*(.*)\s*$/){
                    $cflags{$1} = $2;
                    print STDOUT "  $1 = $2\n" if $Verbose;
                } else {
                    print STDOUT "Unrecognised result from hint: '$line'\n";
                }
            }
        }

The long, embedded lists in constants() also stand out.

        for my $macro (qw(

                  AR_STATIC_ARGS DIRFILESEP DFSEP
                  NAME NAME_SYM 
                  VERSION    VERSION_MACRO    VERSION_SYM DEFINE_VERSION
                  XS_VERSION XS_VERSION_MACRO             XS_DEFINE_VERSION
                  INST_ARCHLIB INST_SCRIPT INST_BIN INST_LIB
                  INST_MAN1DIR INST_MAN3DIR
                  MAN1EXT      MAN3EXT
                  INSTALLDIRS INSTALL_BASE DESTDIR PREFIX
                  PERLPREFIX      SITEPREFIX      VENDORPREFIX
                       ),
                       (map { ("INSTALL".$_,
                              "DESTINSTALL".$_)
                            } $self->installvars),
                       qw(
                  PERL_LIB    
                  PERL_ARCHLIB
                  LIBPERL_A MYEXTLIB
                  FIRST_MAKEFILE MAKEFILE_OLD MAKE_APERL_FILE 
                  PERLMAINCC PERL_SRC PERL_INC 
                  PERL            FULLPERL          ABSPERL
                  PERLRUN         FULLPERLRUN       ABSPERLRUN
                  PERLRUNINST     FULLPERLRUNINST   ABSPERLRUNINST
                  PERL_CORE
                  PERM_RW PERM_RWX

                  ) ) 
        {
            next unless defined $self->{$macro};

            # pathnames can have sharp signs in them; escape them so
            # make doesn't think it is a comment-start character.
            $self->{$macro} =~ s/#/\\#/g;
            push @m, "$macro = $self->{$macro}\n";
        }

They suggest they would be better off in their own methods. The change improves not only readability, but also flexibility, because a subclass can now override with their own lists.

        ...

        for my $macro ( $self->list_of_constants ) 
        {
            next unless defined $self->{$macro};

            # pathnames can have sharp signs in them; escape them so
            # make doesn't think it is a comment-start character.
            $self->{$macro} =~ s/#/\\#/g;
            push @m, "$macro = $self->{$macro}\n";
        }
    
        ...
    
    sub list_of_constants {
        my $self = shift;
        return qw(
                  AR_STATIC_ARGS DIRFILESEP DFSEP
                  NAME NAME_SYM 
                  VERSION    VERSION_MACRO    VERSION_SYM DEFINE_VERSION
                  XS_VERSION XS_VERSION_MACRO             XS_DEFINE_VERSION
                  INST_ARCHLIB INST_SCRIPT INST_BIN INST_LIB
                  INST_MAN1DIR INST_MAN3DIR
                  MAN1EXT      MAN3EXT
                  INSTALLDIRS INSTALL_BASE DESTDIR PREFIX
                  PERLPREFIX      SITEPREFIX      VENDORPREFIX
                       ),
                       (map { ("INSTALL".$_,
                              "DESTINSTALL".$_)
                            } $self->installvars),
                       qw(
                  PERL_LIB    
                  PERL_ARCHLIB
                  LIBPERL_A MYEXTLIB
                  FIRST_MAKEFILE MAKEFILE_OLD MAKE_APERL_FILE 
                  PERLMAINCC PERL_SRC PERL_INC 
                  PERL            FULLPERL          ABSPERL
                  PERLRUN         FULLPERLRUN       ABSPERLRUN
                  PERLRUNINST     FULLPERLRUNINST   ABSPERLRUNINST
                  PERL_CORE
                  PERM_RW PERM_RWX
                 );
    }

The "in and out and in and out..." indentation of init_dirscan() is caused by a long chain of if/elsif/else conditions that could be simplified.

            if (-d $name){
                next if -l $name; # We do not support symlinks at all
                next if $self->{NORECURS};
                $dir{$name} = $name if (-f $self->catfile($name,"Makefile.PL"));
            } elsif ($name =~ /\.xs\z/){
                my($c); ($c = $name) =~ s/\.xs\z/.c/;
                $xs{$name} = $c;
                $c{$c} = 1;
            } elsif ($name =~ /\.c(pp|xx|c)?\z/i){  # .c .C .cpp .cxx .cc
                $c{$name} = 1
                    unless $name =~ m/perlmain\.c/; # See MAP_TARGET
            } elsif ($name =~ /\.h\z/i){
                $h{$name} = 1;
            } elsif ($name =~ /\.PL\z/) {
                ($pl_files{$name} = $name) =~ s/\.PL\z// ;
            } elsif (($Is_VMS || $Is_Dos) && $name =~ /[._]pl$/i) {
                # case-insensitive filesystem, one dot per name, so foo.h.PL
                # under Unix appears as foo.h_pl under VMS or fooh.pl on Dos
                local($/); open(PL,$name); my $txt = ; close PL;
                if ($txt =~ /Extracting \S+ \(with variable substitutions/) {
                    ($pl_files{$name} = $name) =~ s/[._]pl\z//i ;
                }
                else { 
                    $pm{$name} = $self->catfile($self->{INST_LIBDIR},$name); 
                }
            } elsif ($name =~ /\.(p[ml]|pod)\z/){
                $pm{$name} = $self->catfile($self->{INST_LIBDIR},$name);
            }

Since the code uses an inline documentation style, with each method's documentation appearing just before the method itself, it's easy to identify undocumented and under-documented methods. Color makes the large comment blocks in dynamic_lib() stand out.

        my $libs = '$(LDLOADLIBS)';

        if (($Is_NetBSD || $Is_Interix) && $Config{'useshrplib'}) {
            # Use nothing on static perl platforms, and to the flags needed
            # to link against the shared libperl library on shared perl
            # platforms.  We peek at lddlflags to see if we need -Wl,-R
            # or -R to add paths to the run-time library search path.
            if ($Config{'lddlflags'} =~ /-Wl,-R/) {
                $libs .= ' -L$(PERL_INC) -Wl,-R$(INSTALLARCHLIB)/CORE -Wl,-R$(PERL_ARCHLIB)/CORE -lperl';
            } elsif ($Config{'lddlflags'} =~ /-R/) {
                $libs .= ' -L$(PERL_INC) -R$(INSTALLARCHLIB)/CORE -R$(PERL_ARCHLIB)/CORE -lperl';
            }
        }

Any code with a story attached is worth some investigation. Long comments mean the code doesn't explain itself. Sometimes this means bad code, sometimes this means code to handle an odd situation, but it always means complexity. That means get it out into its own subroutine where it can be better tested, documented and not interrupt the narrative of the main code flow. A simple extract method takes care of that.


        my $libs = '$(LDLOADLIBS)';
        $libs .= $self->_special_shared_lib_flags;

    ...

    # Use nothing on static perl platforms, and to the flags needed
    # to link against the shared libperl library on shared perl
    # platforms.  We peek at lddlflags to see if we need -Wl,-R
    # or -R to add paths to the run-time library search path.
    sub _special_shared_lib_flags {
        my $self = shift;

        my $flags = '';
        if (($Is_NetBSD || $Is_Interix) && $Config{'useshrplib'}) {
            if ($Config{'lddlflags'} =~ /-Wl,-R/) {
                $flags = ' -L$(PERL_INC) -Wl,-R$(INSTALLARCHLIB)/CORE -Wl,-R$(PERL_ARCHLIB)/CORE -lperl';
            } elsif ($Config{'lddlflags'} =~ /-R/) {
                $flags = ' -L$(PERL_INC) -R$(INSTALLARCHLIB)/CORE -R$(PERL_ARCHLIB)/CORE -lperl';
            }
        }
    
        return $flags;
    }

After looking through the routines for a while one wonders if alphabetical order is best. The large size of the class (3700+ lines) and the sheer amount of time it takes to look through it all suggests it is too large and consideration should be given to breaking it down.

Like an artist stepping back from her work, this simple method of lowering the font size allows the programmer to forget for a while the syntactical details and business rules. Instead you can observe nothing but the structure and, by doing so, spot points in the code that need extra care. I've maintained ExtUtils::MM_Unix for years now, yet this simple exercise revealed a pile of simple and obvious improvements.