Thursday, August 6, 2009

Any Road To Rome is Also A Road From Rome

I'm back (with my partner) working on the same stinky stuff I blogged about earlier. That code all made it back to the main codeline where we can refactor it but we had to lose a lot of our refactoring in order to get it back in. It was a lesson learned. We had to abandon some changes because we let our code diverge too far from trunk. Now we know about code perturbation.

Then there were bugs in the mudball function, and bugs are always bad news. These were mostly of the "implicit" variety. Code that falls through the "if" traps in a certain way, combined with some default settings in some of the variable(s) at the start of the method, in between the interwoven concerns and feature-envy ...

I really want to get in there and get it under control (which is to say, wrapped in a fuzzy blanket of unit tests). Sadly the blanket does not start out soft and fuzzy. It's a blanket of glass shards and barbed wire. In this case, the test doubles (testable classes) are loaded with so many details that they cloud the story they're trying to tell. We end up with a dozen lines of code to set up a test double followed by one invocation and an assert. Those dozen lines are hard-won by tracing the function with a debugger until we find where the code turns wrong, then perhaps extracting some expressions into functions before adding more setup to the test case to clean up the flow.

This is okay. Every fat, heavily-coupled legacy routine's characterization tests require massive, intricate setup. There are parameters to construct, seams to find and exploit, prefactoring, code spelunking, and all kinds of reverse-engineering involved. The immediate form it takes is a nearly-embarassing set of ugly fat tests.

We're writing the intricate, ugly tests. The class that contained the original ugly method is growing as various paragraphs of code and unclear expressions get moved into new methods. The new methods often have nothing to do with the containing class and really should be moved to more appropriate classes (e.g. "managerObject.hasExpired(session)" probably should be "session.HasExpired"), but here they are. If we were building from scratch, the current state of the code would be unacceptable.

At this time we should remember that every road into Rome is also a road out of Rome.

However, we are not building from scratch but scratching the code apart, so we need this intermediate state in order to get control of the code. One method makes it more possible to detect an effect on one of the related objects. Another makes it possible to fake (through an inheritance seam) a failure that throws an exception. Another reduces the number of variables the main routine needs to manage. This is reverse-engineering, and involves some dead-end assumptions and experimentation. Each stage is taking us closer to understanding the intended effects of the big honking function.

When we have a pretty good specification via unit tests, we can focus on cleaning the code properly. Until we have the tests built up, refactoring is irresponsible and hazardous (hence some of the new bugs). We will relocate methods, probably inline a few, probably extract several more. Statements dealing with one concern will be grouped and extracted until the main function has a single level of abstraction. Flags will be eliminated. If statements will be obviated (if possible) or extracted.

Yeah, the code was in a mess, and all the work we're doing is making it even messier as it enables the work that drives us out of the mess.

We will be done when the tests are clear and tell a good story, and the code that implements the tests is not so detail-oriented and intricate. At that point, only the NDA will keep me from showing you all the cool things we've done. I expect to see that day tomorrow or the next. It's going to be glorious!