Friday, March 20, 2009

Power Of Naming

This is from a moderately recent copy of my Meaningful Names paper, and was translated into Java in the book Clean Code. Here it is in original context with python code examples.

The simple act of using a better name (instead of applying a better comment) can reduce the difficulty of working with the code we write.
What is the purpose of this python code?
             list1 = []
             for x in theList:
                     if x[0] == 4:
                             list1 += [x];
             return list1
Why is it hard to tell what this code is doing? Clearly there are no complex expressions. Spacing and indentation are reasonable. There are only three variables and two constants mentioned at all. There aren't even any fancy classes or overloaded operators, just a list of lists (or so it seems).
The problem isn't the simplicity of the code but the implicity of the code (to coin a phrase): the degree to which the context is not explicit in the code itself. The code requires me to answer questions such as:

  1. What kinds of things are in theList?

  2. What is the significance of the zeroeth subscript of an
    item in theList?

  3. What is the significance of the value 4?

  4. How would I use the list being returned?
This information is not present in the code sample, but it could have been. Say that we're working in a mine sweeper game. We find that the board is a list of cells called theList. Let's rename that to theBoard.
Each cell on the board is represented by a simple array. We further find that the zeroeth subscript is the location of a status value, and that a status value of 4 means 'flagged'. Just
by giving these concepts names we can improve the code considerably:
             flaggedCells = []
             for cell in theBoard:
                     if cell[STATUS_VALUE] == FLAGGED:
                             flaggedCells += [cell]
             return flaggedCells
Notice that the simplicity of the code is not changed. It still has exactly the same number of operators and constants, with exactly the same number of nesting levels.
We can go further and write a simple class for cells instead of using an array of ints. It can include an intention-revealing function (call it isFlagged) to hide the magic numbers. It results in a new version of the function:

             flaggedCells = []
             for cell in theBoard:
                     if cell.isFlagged():
                             flaggedCells += [cell]
             return flaggedCells

The flagged cells line is weird. It works, but what it does is create an immediate array containing just the one cell, and then add that array to the existing flaggedCells array. That’s funky. Why not use the append method instead?
             flaggedCells = []
             for cell in theBoard:
                     if cell.isFlagged():
                             flaggedCells.append(cell)
             return flaggedCells

That is much more obvious, I think. It’s a tiny change, but when we named the operation isFlagged(), the only uncomfortable bit is the way that we’re appending to the array. This is more straightforward, without changing much. It removes one temporary array, which is technically simpler.

But now it’s rather obvious that Python provides a much more terse way to write the code, and in this case it comes out rather clear and readable — and it eliminates the append() call entirely:
             return [ cell for cell in theBoard if cell.isFlagged() ]

Even with the function collapsed to a list comprehension, it's not at all difficult to understand.

This is the power of naming.  If it is this simple, and changes our code so much, what is a reasonable excuse for choosing poor names? 

No comments:

Post a Comment