Tuesday, September 22, 2009

The Elusive Code Quality Argument

I was reading Uncle Bob's latest blog this morning about messy code and technical debt. I wanted to make a comment about the problems programming shops face, but decided to do it here instead.

The problem with clean code is twofold:
1) people who can't see it don't believe in it
2) some people who should be able to see it don't believe in it

People who can't see it don't believe in it.


One of the heartbreaking lessons from the Big Ball Of Mud talk on Wednesday at Agile2009 is that people working two levels of management above your head do not know that the code is messy. Joe and Brian popped up a slide of Mike Rowe, and quipped that you can't bring him out to wade in the muck in a way that non-programmers can understand. Oh, the code is stinky and messy and bad, but only you can see it.

If you can't see the difference between clean and ugly code, it all sounds like a "programmer myth". It seems daft to take time for refactoring. After all, when the programmers finish refactoring, the code doesn't do anything new, but the programmers feel better. How much money do we lose to make programmers feel better?

We need quality (in low bug count, low regression count, sustainable productivity) but can't afford time for quality practices (TDD, pairing, and clean code). Discounting this dubious "clean code" thing, it must be because the programmers aren't very good. Which is right, as far as it goes. Better programmers make better code which can be enhanced more readily. But doesn't that imply that our fastest programmers must be our best programmers?


Some people who should be able to see it don't believe in it.


Not all programmers can see mess. If they could see it, then they wouldn't make so much of it.

What if one makes a new program by copying an existing program and hastily hacking it into a workable shape (ignoring duplication and testing) and drops it into the release for tomorrow? Isn't that a big win for my team? If it's done quickly, doesn't that make me a good programmer?

Maybe the jury is out until we hear back from the users. Is my responsibility to hack code out quickly, or to make stuff that works in actual users' hands? What about when my peers come along to fix something: have I helped or hindered them? Quick hacks stop well before they reach 'done.' Though hacks look good in the short term, they are just deferring work to post-release. It would be wrong to reward this behavior.

A number of otherwise capable and productive programmers can't tell mess from brilliance. Their code is complex, confusing, implicit, indirect, cryptic, and poorly organized, but it works and they feel good about it. They may have reached some level of success for continually pouring out working code, yet their code is a shambles. James Grenning would say such a person is like a cook who never cleans the kitchen.

The primary factors determining how quickly we will program today are the quality of the code we're working in and our ability to do work well. Clean, clear, obvious, straightforward code makes us better and faster; poor code makes us slower and more likely to make mistakes. John Goodsen from RadSoft always told me that the secret to going fast was not to slap things together but to make fewer, more correctable mistakes. This level of disciplined work is not a waste of time, but a small-yet-potent investment in future productivity.

We've learned that the longer a bug remains undetected, the more it will cost to locate, isolate, and eliminate it. Cleaner code will reduce the incidence of bugs, and TDD will speed their discovery. Ugly code will encourage the creation of bugs, and lack of TDD will allow them to remain undetected for longer periods. Sending bugs out to the customers erodes good will, which nobody wants. As a coping mechanism, exhaustive manual testing is costly in time and money. Code cleaning and TDD together are a waste preventative rather than a waste of money and time.

Duplication of code is a common form of "messy code", generally caused by copy-and-paste programming. It is particularly ugly because developers may fix one copy of the code (perhaps in a report) not knowing that it has been duplicated elsewhere (perhaps in another report or screen). Later we report bugs that look like recurrence/regression but really they are just bug duplicates. Going back to fix a bug multiple times is an expensive waste of user patience. Eliminating duplication is waste removal, not actually a form of waste at all.
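The report-versus-screen problem can be sketched in a few lines of Python. Everything in this sketch is hypothetical (the function names, the 10% tax rate, the rounding bug); it only illustrates how a fix applied to one pasted copy leaves the other copy wrong, and how a single shared rule removes the second bug report.

```python
TAX_PERCENT = 10  # hypothetical tax rate, for illustration only


# Before: the same calculation pasted into two places.
def report_total_cents(prices_cents):
    """Copy used by a report. Someone fixed the rounding bug here."""
    subtotal = sum(prices_cents)
    return subtotal + (subtotal * TAX_PERCENT + 50) // 100  # rounds half up


def screen_total_cents(prices_cents):
    """Copy used by a screen. Nobody knew it existed, so it still truncates."""
    subtotal = sum(prices_cents)
    return subtotal + subtotal * TAX_PERCENT // 100  # old behavior: truncates


# After: one rule, called from both the report and the screen.
def total_cents(prices_cents):
    """Single shared calculation; a fix here reaches every caller."""
    subtotal = sum(prices_cents)
    return subtotal + (subtotal * TAX_PERCENT + 50) // 100


if __name__ == "__main__":
    order = [1000, 555]  # two items, priced in cents
    print("report copy:", report_total_cents(order))
    print("screen copy:", screen_total_cents(order))
    print("shared rule:", total_cents(order))
```

With the pasted copies, the report and the screen disagree by a cent on the same order, and the user sees what looks like a regression of a bug that was already "fixed."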

Cleaning our code and testing our code make us go faster, but the effects are not immediate. It may not be obvious that we are going faster by taking time to clean and refactor our code, by using TDD and pair programming, but these are the practices that we use to avoid having code returned by QA or unhappy users. If we measure from the time we pick up an assignment until the time it really works for our users, we find that TDD, refactoring, pair programming, and like practices greatly speed development. If we only measure from the time we pick up an assignment until we release the buggy feature, then all these practices seem to slow us down. You have to choose the measurements that really matter.
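The two ways of measuring can be put side by side in a small Python sketch. The `Story` record, its field names, and the dates are all invented for illustration, not taken from any real project; the point is only that the same two pieces of work rank in opposite order depending on where the clock stops.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Story:
    """Hypothetical record of one assignment; all fields are assumptions."""
    name: str
    picked_up: date
    released: date
    working_for_users: date  # when the last post-release fix landed


def days_to_release(story: Story) -> int:
    """Clock stops at release, whether or not the feature works."""
    return (story.released - story.picked_up).days


def days_to_done(story: Story) -> int:
    """Clock stops when the feature really works for users."""
    return (story.working_for_users - story.picked_up).days


# Made-up example data: a quick hack that bounces back, and slower clean work.
hack = Story("quick hack", date(2009, 9, 1), date(2009, 9, 3), date(2009, 9, 24))
clean = Story("test-driven", date(2009, 9, 1), date(2009, 9, 8), date(2009, 9, 8))

if __name__ == "__main__":
    for story in (hack, clean):
        print(story.name, days_to_release(story), days_to_done(story))
```

Measured to release, the hack looks faster; measured to done, it is the slower of the two.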


Where does this leave us?


If some programmers can't tell clean code from messy code, most managers cannot tell, and most sales and product people can't tell, and if the benefits of refactoring trail the initial feature work by weeks, months, or years, then aren't we without hope of improvement?

We are without hope of external rescue. It is unlikely that any non-developers in authority will mandate or even approve the practices that will get us out of our mess. If things are going to be better, it will be because we make them better. We don't need permission, but if we care about our products then we do need to use hygienic practices in our daily programming.

There is hope, but it is only us.

5 comments:

  1. I've been toying with this:

    It's based on my poor understanding of quality from reading a lot lately.

    The idea is that we understand that our team is a system and it has the capability of producing 70 story points plus or minus 20 with 95% confidence sprint to sprint.

    Daily we tend to produce 140 lines of code plus or minus 30 with 95% confidence.

    When our sprint turns up at the end with only 15 points accomplished, the sprint is lost. But there's a good chance we had several days of 10 loc produced, and maybe one early on with 400 loc, way out of our normal variance.

    If we had measured loc daily, without trying to artificially optimize loc, we would have had, 8 days ago, hard proof that something was up.

    Anyway, it's an idea I've been toying with. I need a personal metric I can easily measure to try this. Daily loc isn't convenient in my environment. (Which is weird, long story.)

    I think where things went bad historically was when we started moralizing loc and bug counts. We viewed them as people ranking systems. We should have treated them like the oil pressure gauge on a stock car.

  2. I didn't relate that to your post very well. How bad is the code? What does it look like?

    Well, when I rushed in that 400 loc beast because the customer and product manager really wanted it fast, the whole rest of the sprint ran on 10 loc per day and no other stories got done.

    That's more visible than me whining about it.

  3. I'm not sure I know how your comments relate to my posting.

    Was your point that cramming in 400 lines of crap plays to the crowd and is why it's so common, or that cramming in 400 lines of crap is a good idea because it makes people happy?

    Or that clean code is BS and I'm whining about people being more productive than I am?

  4. I should really get, like, my own blog. Oh wait, I like totally have one.

    Measuring loc is an example of a way we could make the invisible visible.

    Maybe everybody believes in quality code, but no one has ever seen any, or maybe no one who matters has ever seen any.

    The people that matter have seen a lot of cargo culting in the name of quality that resulted in waste. They've usually seen that several times.

    So what I'm saying is counting lines of code is a measurement that's amoral. It can be measured and monitored, and with some history we can use it to notice that a sprint is off the rails and try to pin down a cause.

    Maybe a better example is time spent in stand up meetings. 30 minutes 3 days running means something's up, and smart management would intervene intelligently.

    Anyway, I really need to demonstrate this and tell the story, in my own blog.

  5. Tim,

    We lose much of our visibility of quality because 1) quality is typically measured in terms of tests, and 2) we can only test what we spec.

    If we had specs like "new developers have to understand this stuff in less than a week", "defect investigations have to complete in less than 2 days", we'd all very quickly understand that there's more to quality than "the features work".
