Friday, March 22, 2013

Product Health Rules!

Dealing with product health is simple in theory.

You need to have a central build-and-test server and a repo that is treated as the central repo for the developers (in git, servers have no built-in roles). It has to be set up to run all the tests, whether they are unit tests, story tests (Cucumber, etc.), or what have you.

Now, the thing you have to know is the state of your local machine, and the state of the build server.

When I say GREEN, I mean "builds and all tests pass." When I say RED, I mean that something doesn't build or did not pass all the tests.

The Rules


The rules in precedence order are:
1. GET TO GREEN.
2. Green to green; anything else is obscene.
You need to know that your code is good, and the server's code is good, and you can push your code to the server.

"But wait", you might say, "there are states unaccounted for here. What about pushing green to red?"

"But Tim!" you may cry, "my code isn't the code that broke the build/tests! Why should I slow down when my code is okay?"

If you work all by yourself, you can probably afford a little slack here. If you have a whole team in the same code base, or if you have many teams in the code base, then breakage affects everyone. The only sane way I know to manage product health is for everyone to follow the same simple rules in a disciplined way.

Rules Illustrated


Let's assume you have a largish team, or a few teams working together on the product, and these rules will be more obvious.







Green to green is great. You can pull from green builds, you can push your green code to the green build. Nobody gets frustrated, nobody gets hurt.







Can I pull from a green build when I'm in a "red" state? Well, it's possible, but it's likely to cause a lot of confusion as you try to find out what you've done to turn yours red. You may end up merging over broken code and accidentally turn your machine green, but you'll never quite know what you did.

Better to revert the local code so you're green, then pull the latest. Programming is complicated enough that we don't really need to make it worse.






Pushing to a perfectly good green build from a red local build? Obscene! It breaks the build and that's not good for anyone. It does happen, but it's bad. Usually, it happens when people don't know if their own build is green or not. I've done it myself, when I didn't run all the tests locally. I hate breaking the build, especially when I find it was preventable.


What about pushing from a green build to a broken build? Won't that just increase the balance of good code on the server side? It might even fix the build? 

Well, if you are the person fixing a broken build, it's fine. Otherwise you are getting in the way of the person who is trying to fix the build, and that's just rude. Don't make their job harder.

By pushing good code onto the broken build, you are slowing down the build's transition to green. Our first-priority rule is to get to green. You're in the way. Why not wait until it goes green, then merge, test, and push? 
  





What about pulling from a red build? You want to have the most recent code to start your next change, right?  Yes, but you will never be sure whether "red" on your box is referring to your error or the one you pulled from the main code line.  Worse, you might get used to seeing red, and stop reacting to errors on your own computer! Better to leave a red build alone.

Exception: if you are pulling from a red server to fix the build, you have the first-priority rule on your side.







Hey, if the build is already broken, what does it matter if I push more brokenness at it?  Once again, pushing to a red server will complicate the process of getting it to green. Someone may figure out the original problem, and then be totally confused when they merge in your new problem on top of it.

It won't do to get used to the idea of brokenness and then compound it. This is the famous "broken window effect" that psychologists talk about. It's the last thing our code base needs.
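The cases above boil down to a very small decision table. The sketch below is just my reading of the rules, not a tool anyone ships: the names `may_sync` and `fixing_build` are made up for illustration, and I'm assuming the build fixer also starts from a green local state before pulling the red code.

```python
GREEN = "GREEN"
RED = "RED"


def may_sync(local, server, fixing_build=False):
    """Return True if pushing to or pulling from the server is okay.

    local and server are each GREEN or RED. fixing_build is True only
    for the person who is actively repairing a red server build.
    """
    if local != GREEN:
        # Get your own machine to green first (revert if you must).
        return False
    if server == GREEN:
        # Green to green is great.
        return True
    # Red server: stay out of the way unless you are the one fixing it.
    return fixing_build


if __name__ == "__main__":
    # Print the whole table for someone who is not fixing the build.
    for local in (GREEN, RED):
        for server in (GREEN, RED):
            verdict = "go ahead" if may_sync(local, server) else "wait"
            print(f"local {local:5} / server {server:5}: {verdict}")
```

Only one of the four combinations says "go ahead" for an ordinary developer, which is the whole point of rule two.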

The Loophole!

Hey, we've found a loophole! If we never write tests, then our code is never technically red. All of the tests that exist continue to run green and we can check in as long as nobody else breaks the build!


Okay, Mr. Smarty-pants, that gets you around the rules, but leaves you with unknown brokenness in the code. You don't really know that your code works, or that it will still work after merging with other people's code.

By testing, you are reducing the period between the addition of a bug and the detection and removal of the same bug. You reduce the number of people who have to be involved. By not testing, you are saying that it's okay that your bug might be caught by external QA or by a customer in the field. You are even leaving open the possibility that it might cause greater issues by generating bad data that is used to subtly steer a customer's system into a more complicated and painful kind of failure.

If you're unlucky, your error will last long enough to cost the company millions in customer service time, debugging, customer downtime, loss of reputation, and even product failure.

Or maybe you'll get away with it this time. 

Is the question mark really better than writing a few tests? I'm thinking that faking product health is no better than ignoring it. Besides, TDD is fast and easy when you get used to it. It is one of the fast, safe practices that build teams' reputations.


