Thursday, April 23, 2009

Code Patching in Legacy Messes

I really don't want to make this sound like a big deal. The whole thing happens in less than a week and really only affects a handful of files. Nonetheless we went through the gamut of emotions common to maintaining ugly code, and used a range of tools common to trying to do a craftsmanlike job in a legacy code base. Even with a small task, it can be a challenge to stay engaged and positive.

I'm working with a good partner on some bad code that wraps some SQL and does some dynamic web stuff for reports. If we were working in SQL, we could have written the query and been done with it. Even if it were SQL and a pile of if statements it would be comprehensible. The problem is that the code is clever and complicated. The people I work with have cleaned it up considerably from its original consultant-delivered form. We got to work with the good version.

It is a rather clever framework that hides the SQL behind a report by causing you to write chunks of SQL as literal strings in various disjointed locations, involving about three or four families of enums and collections of dozens of new objects, some near-duplicates of others. Some routines help write the code. Several routines accept parameters they don't use, and some sections of code are written in one place and discarded or replaced later in the routine. It's a little funky. No, it's a big funky.

The design principle most consistently followed is maximize the surface area of your code. It seems to ensure that adding a small feature to a report will require touching the program in many places. The second principle must be something about making the time sequence of events more obscure. Or about eager near-duplication.

Dead and live code were entertwined. We'd remove the dead code, but this is fragile and ugly old code and frankly we're afraid of it. We don't have the time budget to clean it all up. We work for a more forward-thinking company, so a few guys are already working on the replacement. Soon this code will die and be buried amidst celebration (locked in Ryan's Book Of The Dead Code), but for now a customer really wants this change and iteration is winding down. We want them happy, and want this done well.

Of course, it's heavily commented. Every method has a javadoc-style comment repeating the name of the method. Usually a three-to-five line comment once you count the flowerboxing. Sometimes they would repeat the names and types of parameters, too. Heavily, uselessly commented.

We had five useful tools that made this task less daunting and ultimately successful:
  1. The first weapon in our arsenal was teamwork. We would try to keep each other from getting depressed, and from refactoring the whole thing. We have a deadline, and we need to make a change. Can't clean it all up, have to make progress. We would each take breaks and were honest about when we were glazing and unable to keep thinking about it.
  2. The second was exploration. We spent the first pair-day just digging around in the code and trying to figure out what it was doing. We read the tests, seeing that they are a kind of gold standard test where a scenario was set up and clearly the resulting SQL was pasted into the test as the expected value. We can tell by the idiosyncratic formatting of the SQL. In the second day we're doing more experimentation. We wonder what happens if you comment something out. What if we change this string or that one? Can we instantiate this class and see what it actually contributes to the code? Developers who had worked in the code gave us a boost.
  3. The third tool was the tests. If we didn't have the gold standard tests when we picked up the code, we would surely have had to write them in order to make any reasonable progress at all. We added little to the body of tests, but it was enough.
  4. The fourth tool was a running application. It's always important to remember that the code is not the thing. It is the design for the product only. The product is information management. Without running the application, one cannot make informed decisions or informed guesses.
  5. The fifth tool was our analyst. Michael was helpful when we had requirements questions, and of course we had a few. The requirement was not all that complex, but we still needed an answer now and then.
It seemed like we had to spend a lot of non-productive time in questions and exploration and poking about, but we nailed the change, simplified a few small areas of the program, and improved the date handling today. No, it's not going up on my refrigerator door, but it works and is better than when we first found it.

PS: Mr. Smarty Pants Complex Coder, we're eroding your mess already, and soon it will be gone completely. Be happy or be sad, but it is happening.

PPS: I almost always use the word 'clever' in a pejorative sense, as I consistently did here. It is something I got from Walter Moore, and is a habit I'll keep for a lifetime perhaps.


  1. I've never liked treating clever as bad. I think clever is a bad word to use to describe what you hate.

  2. I stand by anti-cleverness. Smart is good. Brilliant is good. But when someone is exercising their cleverness, I think it's bad.

    You want the best algorithm for the job at hand, and you want it written in the most obvious way possible. Sometimes it only gets to a certain level of non-cleverness, but you deal.

    What I hate is when people write things like (!(x^y)) for bools meaning (x==y) and when they try to see just how deeply they can nest lambdas and get away with it, or when they write perl in C or cobol in C++ or rely on Java/C# reflection when simple straight-forward coding will do.

    And this stupid framework where we could simply write the SQL or we can write dozens and dozens of in-obvious lines of code to piece the query together at some point in the future, YET we have to enter the snippets in order.

    Sometimes people make things complicated because making a Rube Goldberg machine is great fun, or because they can't find the simpler route, or both.

    When I finish reverse-engineering someone's code and realize it does something relatively simple (compared to reverse-engineering the code) then I can only think "that's clever" and reach for the delete key.

    Smart is good. Clever is bad. Being simple is good. Complicated is bad.

    Here republished:
    The Zen of Python, by Tim Peters

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!

  3. The anti-cleverness you describe is anti-badness. I think we all can agree that bad is bad. But you and most of the other anti-clever crusaders have no definition for it other than it made your life miserable or you at least don't like it.

    And then you assume someone you never met thought they were being smart. That's where your ice gets really thin. You weren't there and you don't know what the constraints were at the time.

    And what really makes it irksome is that you have a whole library of well defined "bad code smells" which are a lot more specific. Why don't you use those? Those have meaning and in many cases they have remedies.

    It comes across as smug blaming of the guy who left, and dirties what I've observed is an otherwise a healthy data driven approach.