Friday, October 8, 2010

Explaining Refactoring

I recently collected some information from Jira and git about our software process.

I am mulling over an idea, and a little feedback is welcome. To help understand the context, among our products we have one codebase that was originally written pre-agile in a non-TDDed manner but greatly improved during and after the company's agile transition. It is pretty good stuff, and constantly improving in quality and functionality, consistently delivering real value to customers. We're pretty proud of it and its future. There is a great team working on it. Sometimes, though, we find the need for a large-scale refactoring somewhere in the codebase, and we have the courage to tackle these.

Basically the problem is that refactoring sounds bad to product managers. When it's done well, there is no visible difference in the product. When it isn't done perfectly, it injects defects. The obvious knee-jerk reaction is to resist any large-scale refactoring efforts.

As an aside: refactoring ought to be invisible. It should be part of every task, and not a task on its own. If a system is built ground-up using TDD then no "refactoring tasks" should appear in release notes and no "refactoring tickets" need be filed. Sometimes, however, there is legacy code with significant technical debt or significant code cruft in new code. That ugliness might be in key areas of the application. Sometimes it is worthwhile to dedicate time to bringing a piece of code under control of tests. Even so, if people outside of your development team are aware of refactoring it is because something is wrong somewhere. It is a process smell.

Of course the product manager is right about the risk of large-scale refactorings and also about the lack of end-user value from effective efforts. What is missing is an understanding of the value of improvements to code virtues, maintainability, and leverage. Product managers don't see code.

Refactoring done well speeds code changes. The reward is that we are able to sustainably produce more features faster, with better quality. The punishment for not refactoring is that the code becomes continually worse and changes become slower and more defective. We need continual refactoring and testing, and sometimes we need an initial "boost" to areas that are hard to test. This is only obvious to people who are elbows-deep in the code, and product managers don't see code.

Since indeed cleaning up ugly code is somewhat risky, we technical people should have a solid means to judge whether any large-scale effort is truly worth doing. Otherwise we may waste many weeks on a crazy "ugly code" witch hunt without significantly improving the fluidity of development. The product managers don't want that, and neither do we.

Product managers see a bug here, a flaw there, a new bit of functionality, a slipped date, a funky side-effect here and there, all disconnected. They can't see the problems in the code because they can't see code. Some of them don't even believe that the technical problems really exist because it sounds like programmers are just making excuses.

I hacked together a simple similarity comparison of the text of our Jira tickets and compared them to code change records for those tickets. My efforts met with great disappointment. Two similarly-described bugs may not involve the same code at all. Conversely, if I show the tickets whose resolutions do involve the same file, non-programmers strain to see any commonality between them.
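For anyone who wants to try a comparison like this, here is a minimal sketch. It is not the script described above. It assumes ticket summaries are available as plain text and that each ticket has already been mapped to the set of files changed for it. The Jaccard measure, the function names, and the PROJ-n tickets and file paths are all illustrative assumptions.

```python
import re
from itertools import combinations

def tokens(text):
    """Return the set of lowercased words in a ticket summary."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def compare_tickets(summaries, files_by_ticket):
    """Return (ticket_a, ticket_b, text_similarity, file_overlap) per pair."""
    rows = []
    for a, b in combinations(sorted(summaries), 2):
        text_sim = jaccard(tokens(summaries[a]), tokens(summaries[b]))
        file_sim = jaccard(files_by_ticket.get(a, set()),
                           files_by_ticket.get(b, set()))
        rows.append((a, b, text_sim, file_sim))
    return rows

# Hypothetical data, invented purely for illustration.
summaries = {
    "PROJ-1": "Login page crashes on empty password",
    "PROJ-2": "Login page crashes on long username",
    "PROJ-3": "Export report shows wrong totals",
}
files_by_ticket = {
    "PROJ-1": {"auth/session.py"},
    "PROJ-2": {"ui/forms.py"},
    "PROJ-3": {"auth/session.py", "reports/export.py"},
}

rows = compare_tickets(summaries, files_by_ticket)
for a, b, text_sim, file_sim in rows:
    print(f"{a} vs {b}: text={text_sim:.2f} files={file_sim:.2f}")
```

In this made-up data, PROJ-1 and PROJ-2 read alike but share no files, while PROJ-1 and PROJ-3 read nothing alike and share a file. That is the same mismatch between descriptions and code described above.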

While product managers see only holes, programmers see only gophers. If product managers could see which sets of bugs are connected to which source code files then they might demand remediation, but only for the code that really affects delivery and quality. At least in a rational world there would be such a level of data-based decision-making.

One answer may be to have the product managers work in the code, but I doubt that coaches will often be successful with this strategy. Instead, we may need a better way to illustrate the underlying connections.

This is where the heat map comes in, but it was written from a programmer's point of view. As a step in a new direction, I coded up a report to list the tickets for any given file. Maybe in the coming weeks I can convert the data to a form that product managers can see and digest without becoming programmers first.
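A tickets-per-file report of this kind could be sketched roughly as follows. This is not the report mentioned above, only one way it might be done. It assumes commit subjects contain Jira-style keys such as PROJ-123, and that the log text comes from something like `git log --name-only --pretty=format:@@%s`, where the `@@` prefix is an arbitrary marker chosen here to tell subject lines from file paths. The sample log and all names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed Jira-style key format: uppercase project prefix, dash, number.
TICKET_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def tickets_by_file(log_text, marker="@@"):
    """Map each file path to the set of ticket keys whose commits touched it."""
    index = defaultdict(set)
    current = set()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(marker):
            # A subject line starts a new commit; remember its ticket keys.
            current = set(TICKET_KEY.findall(line))
        elif line:
            # Any other non-blank line is a path changed by that commit.
            index[line].update(current)
    return index

def report(index, path):
    """Return the sorted ticket keys recorded for one file."""
    return sorted(index.get(path, set()))

# Hypothetical log text in the assumed format.
sample_log = """\
@@PROJ-1 Fix crash on empty password
auth/session.py

@@PROJ-3 Correct export totals
auth/session.py
reports/export.py

@@Tidy whitespace
ui/forms.py
"""

index = tickets_by_file(sample_log)
print(report(index, "auth/session.py"))
```

Sorting the files by how many tickets each has collected would give a crude ranking of where the trouble concentrates, which may be easier for a non-programmer to read than the raw list.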

If you have any interesting ideas for further research or credible presentation of this data, drop me a comment below. I'd love to walk in some day with a new way for my partners who are product managers to understand our work so that technical debt remediation can be properly assessed and prioritized. I'll settle for better criteria on which to base my choice of large-scale refactoring efforts.

It's a good dream. Waiting for your comments.