Monday, February 22, 2010

Simple V. Clear V. Easy V. Primitive

I once posted a blog entry about the meaning of the word "Simple" as used in "The Simplest Thing That Could Possibly Work", which generated a little buzz and some first steps toward resolution. The problem, I realized, is that "simple" gets mixed up with "easy", "primitive", and "clear" in most unfortunate ways. This was as true of me as of the others in the conversation.

As I search the web for words like simple and complex, clear or confusing, perhaps easy and hard, I realize that we're all splitting hairs in different ways. I suggest that we consider some more crisp boundaries when discussing qualities of code. Here are the axes I suggest:
Simple v. Complex: relating to number of parts and steps involved
Clear v. Puzzling: relating to ease of comprehension
Easy v. Difficult: relating to ease of implementation
Developed v. Primitive: relating to degree of investment in types
With the terms divided this way, simple v. clear v. easy v. primitive is a much simpler categorization.

Discussion

The phrase "The Simplest Thing That Could Possibly Work" is aimed at avoiding blockage. Ward Cunningham points at the idea of the shortest path to a solution as "simplest", while still regarding refactoring as necessary to reach an increasingly simple form (adding a touch of fuzz to "simplest" in its original context). He also addresses the problem of clarity, but clearly "Easy" is part of his process.

I have touched on this in my article on using better names, where I show an example of perplexing Python code and, through the act of renaming, make the code quite clear.

One of the problems with the article's original code is that it is too primitive, and being so very primitive increases the number of parts (methods, variables, constants). Eventually I create a method (is_flagged) which moves the primitive subscripting and literal values out of the way, leaving the example routine in a state that is not only more clear, but also more simple than before. The addition of a class with a named method makes the code more developed (less primitive) while also making it more clear (less perplexing) and leaving fewer twiddly bits in the original routine (simpler). Doing this required relatively little investment of my time and effort (easy) beyond reverse-engineering the routine.

In the final example, I make the code considerably more terse. This brevity is enabled by the clarity, which is in turn enabled by the simplicity. The result is something most Python programmers (and most non-Python programmers) can apprehend at a glance.
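The progression described above can be sketched roughly as follows. This is not the code from the original article, only a hypothetical reconstruction: the names get_them, Cell, STATUS_INDEX, FLAGGED, and flagged_cells are assumptions chosen for illustration, as is the idea that a status value sits at index 0.

```python
# A hypothetical reconstruction; every name here is an illustrative assumption.

# Primitive and perplexing: what is x[0], and why 4?
def get_them(the_list):
    list1 = []
    for x in the_list:
        if x[0] == 4:
            list1.append(x)
    return list1


STATUS_INDEX = 0  # assumed position of the status value within a cell
FLAGGED = 4       # assumed status code meaning "flagged"


class Cell:
    """Wraps the raw list so callers no longer subscript it directly."""

    def __init__(self, values):
        self.values = values

    def is_flagged(self):
        return self.values[STATUS_INDEX] == FLAGGED


# More developed, clearer, and terse: few parts remain in the routine itself.
def flagged_cells(game_board):
    return [cell for cell in game_board if cell.is_flagged()]
```

The subscript and the literal values have not vanished. They have moved into is_flagged, where they need to be understood only once.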

Simple

We can choose to understand "simplicity" as a simple count. Each variable, constant, and operator is a part of the routine. Each branch in the routine is a part. When there are a great many parts involved, the routine is "complex" even if it is easy to understand. If there are very few parts, it operates simply even if it is terse and cryptic.

Likewise we can count the steps involved in an operation. The more steps, the more complex. A step is a single operation, whether assignment, function call, calculation, or instantiation.

It might be interesting to note the similarity to the cyclomatic complexity metric's use of "complexity", which is a count of the independent paths through the code.
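To make the counting idea concrete, here is a minimal sketch that tallies parts using Python's standard ast module. The choice of which node types count as "operators" or "branches" is my own assumption rather than an established metric, and the path count (branch points plus one) is only a crude stand-in for the complexity metric mentioned above. The function names count_parts and rough_path_count are hypothetical.

```python
import ast

# Node types treated as branch points and operators in this rough tally.
# Both groupings are assumptions made for illustration.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)
OPERATOR_NODES = (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Subscript)


def count_parts(source):
    """Return a rough tally of the 'parts' in a piece of Python source."""
    names = set()
    tally = {"constants": 0, "operators": 0, "branches": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            names.add(node.id)  # count each distinct variable once
        elif isinstance(node, ast.Constant):
            tally["constants"] += 1
        elif isinstance(node, ast.Compare):
            tally["operators"] += len(node.ops)  # one per comparison operator
        elif isinstance(node, OPERATOR_NODES):
            tally["operators"] += 1
        elif isinstance(node, BRANCH_NODES):
            tally["branches"] += 1
    tally["variables"] = len(names)
    return tally


def rough_path_count(source):
    """Branch points plus one: a crude approximation of a path count."""
    return count_parts(source)["branches"] + 1
```

A tally like this says nothing about whether the code is clear. It only measures how many pieces a reader has to hold at once.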

Clear

Clarity is more-or-less orthogonal to Simplicity. The first, unworked example in my article was quite unclear. I can hardly call it complex, because the statement "if x[0] == 4" has only one variable, two operators, and two constants. That is a rather simple structure, even if we add in the for loop and the additional list variable to hold the selected elements. Though it is simple and primitive, it would be sinfully perverse to refer to it as "clear."

While code can be simple and yet unclear, it may also be fairly complex yet clear in intent and usage. These two attributes are confused because sufficiently complicated code becomes perplexingly unclear, and sufficiently simplified code usually becomes relatively obvious in intent (or at least becomes easy to clarify with a simple act of naming).

One might add "brevity" as another virtue, with the note that simplicity, clarity, and brevity together combine to form something we refer to as "elegance." Complexity and obfuscation combine to form something one may refer to as "opulence" but which most would recognize as "a mess."

Easy

An easy solution is best described via Twitter by Gary Bernhardt (in an echo of Ward Cunningham) as a short "distance to a solution". It may be easy to drop a nested loop with an "if" statement into the current routine. It might be easy to re-purpose a variable for "just this little block of code". Gary's full quote about code obfuscators is that "they focus on their distance to a solution over the readers' distance to an understanding." A short distance to a solution is "easy", whereas a short "distance to understanding" is "clear."


Note that a solution that is easy to implement may be neither simple nor clear nor brief nor well-developed. It might be easy to write an if/else, duplicate the entire method in each clause of the if, and then modify one of them to cover a new condition. It would make the code doubly complex (duplicating the operations and operators, plus one) and much less clear (as you'd have to hunt for the differences).
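As a rough illustration of that duplicate-and-modify shortcut, consider the sketch below. The invoice scenario, the function names, and the MEMBER_DISCOUNT rate are all hypothetical, invented only to show the shape of the problem.

```python
# Easy to write: copy the whole body into both branches and tweak one copy.
def invoice_total_easy(prices, is_member):
    if is_member:
        total = 0
        for price in prices:
            total += price
        total = total * 0.9  # the only real difference, buried in the copy
        return round(total, 2)
    else:
        total = 0
        for price in prices:
            total += price
        return round(total, 2)


MEMBER_DISCOUNT = 0.9  # hypothetical rate, named so the difference is visible


# Simpler and clearer: shared steps appear once, the difference stands alone.
def invoice_total(prices, is_member):
    total = sum(prices)
    if is_member:
        total *= MEMBER_DISCOUNT
    return round(total, 2)
```

Both functions return the same results, but a reader of the first has to compare two nearly identical blocks line by line to find the one line that matters.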

That being said, we like things to be easy, provided they don't compromise simplicity or clarity. Our ideal is to maximize simplicity and clarity so that future solutions will be easy to implement well.

Primitive

We all know of the code smell called "Primitive Obsession", which has some deleterious effects on simplicity and clarity.

If we have primitive obsession, we surround our classes and methods with little clouds of variables. A classic example is passing latitude and longitude packed as integers instead of passing a coordinate object. Each method using the pair takes on more parameters, the parameters have less specific types, the routine must perform more primitive operations, and it passes these values to its subordinate routines which take on a similar burden. Duplication of functionality is virtually guaranteed.
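A minimal sketch of the latitude/longitude example might look like this. The packing scheme (integer millionths of a degree), the Coordinate class, and the method names are assumptions made for illustration, not a description of any particular system.

```python
from dataclasses import dataclass

MICRODEGREES = 1_000_000  # assumed packing: integer millionths of a degree


# Primitive: four loosely typed parameters, and every caller must remember
# both the argument order and the packing scale.
def is_north_of_packed(lat_a, lon_a, lat_b, lon_b):
    return lat_a > lat_b


@dataclass(frozen=True)
class Coordinate:
    """A developed type that carries the pair, and its rules, as one value."""

    latitude: float   # degrees
    longitude: float  # degrees

    @classmethod
    def from_packed(cls, packed_lat, packed_lon):
        # The unpacking knowledge lives in exactly one place.
        return cls(packed_lat / MICRODEGREES, packed_lon / MICRODEGREES)

    def is_north_of(self, other):
        return self.latitude > other.latitude
```

With the developed type, a routine takes one parameter per location instead of two, and the unpacking arithmetic is no longer repeated in every routine that touches the pair.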

Complexity cannot vanish from a system, but it can be moved to a place where it can be treated as if it were very simple indeed.

Sometimes a more primitive solution is appropriate. In Python, I once replaced a monstrous dictionary with a primitive list of tuples and received a huge performance payback. The more primitive container was right, but it made my code less clear and less simple. The more primitive solution was not easier to write or to read. Mind you, the real problem in my example was that I had chosen the wrong developed abstraction. I replaced it with a purpose-built class that was clear, simple, and performant.

Why Does It Matter?

If we work in agile teams, our goal is always to build a future in which programming is easier, faster, and cleaner. To respond quickly to changes, we must be able to read and understand code quickly. If we allow simplicity and clarity to degrade in the short term, we find ourselves having an increasingly difficult time producing quality software in the near-term future.

2 comments:

  1. Tim

    great post,

    I find that people with a waterfall/classical RUP mindset frequently confuse simple with primitive, or unsophisticated.

    It actually took me a while to get that simple is actually hard to do, and it actually takes a fair amount of sophistication, discipline and dedication to come up with software that is both simple and capable of doing the job that it needs to.

  2. Simple can be easy, but it can also be hard. It's a different kind of optimizing. Clear is harder yet, but it's usually possible.

    Simple and Primitive and Easy are the most concrete of these attributes. Clear is rather harder to define and prove, so we fall back on social programming methods to optimize that one.
