Tuesday, May 11, 2010

Mocking Done Badly

using(Playback) {
This is the last part of a common style of mock test. We see the system-under-test (SUT) being asked to perform some action, with some parameter passed to it. We know it's a mock test because of the Playback bit.
What is being tested here? Well, something about PerformAction is being tested. Not much there. No clear assert written at the bottom of the test to capture our attention.
Let me narrate as I work through the actual code before me, and maybe the readers can respond to tell me whether this really is mocking done badly, or I'm "just not used to good mocking."

I struggle not to look at the test name, so the code can speak to me more directly.
  • The parameter is not interesting; it is just a string.
  • There is another variable, which is a stub, but it's not used inside playback.
  • There is a 'record' section, set aside with 'using' and not (thankfully) a #region.
  • In the record section, we call one function with the parameter string, returning the stub. So there is some kind of look-up method called exactly once. The string is the key, the stub is the result. Looks like database avoidance.
  • Another function is called, taking the stub returned by the first. It is called exactly once, and returns the number '2'.
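Pulling the bullets above together, the whole test might look something like this. This is my reconstruction in Rhino Mocks record/playback style, on the assumption that this is Rhino Mocks; every identifier here (ILookup, IProcessor, IRecord, ItemProcessor, Find, Process, "bar") is invented for illustration and does not come from the actual code.

```csharp
// Hypothetical reconstruction of the test described above.
// All names are my invention, not the original code.
[Test]
public void ProcessExactlyOneFooWhenGivenParameterBar()
{
    var mocks = new MockRepository();
    var lookup = mocks.StrictMock<ILookup>();
    var processor = mocks.StrictMock<IProcessor>();
    var stub = mocks.Stub<IRecord>();   // never referenced inside playback

    using (mocks.Record())
    {
        // One look-up: the string is the key, the stub is the result.
        Expect.Call(lookup.Find("bar")).Return(stub).Repeat.Once();
        // One processing call, taking the stub, returning the number 2.
        Expect.Call(processor.Process(stub)).Return(2).Repeat.Once();
    }
    using (mocks.Playback())
    {
        new ItemProcessor(lookup, processor).PerformAction("bar");
    }
}
```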
At this point I've read the whole test. Hmmm... what are we doing here again?
  • Check the test name: ProcessExactlyOneFooWhenGivenParameterBar.  Oh. So maybe the "Repeat.Once()" part of the mock is meaningful. But what's really going on? 
  • I read the neighboring test (there is only one). It calls the same method with the parameter "all", and expects to see a call to a different method underneath, one that collects a list of things to process.  Okay, "all" is a magic name that means 'do them all' and it collects the list. That helps a tiny bit. The name of that test confirms my understanding, but I can't really follow the original test much better with that knowledge in hand.
  • The second-to-last desperate measure is called upon*: I read the code.  This is backward. I should never read the production code to explain the test. I should read the tests to explain the production code. My assumptions hold so far.
  • The method for actually doing the processing is the one that is expected to be called once, and is to use the stub returned by a look-up method. The return code seems arbitrary; I extract it and call it "arbitraryRecordCount".
  • Some method renaming is done to make the tests more meaningful. I double-check the production code to make sure it reads better too. As is usually the case, it does.
  • I realize that the logging was redundant and ill-placed, and I move it to a single, more appropriate spot. This simplifies the production code even more, and all tests still pass.
  • I bring a bit more detail up into the test in hopes that it will make it more clear. It does, but now I have more setup. There is an urge to break the mock setup into paragraphs, a sure sign I'm gathering cruft.
  • I realize there is a test missing, one that would have made more sense of an existing test. I add it. It duplicates the first test a bit. I extract a little, but worry about making it so abstract that you can't really follow it.
  • I set up the original test to be a bit more explanatory.
  • It is at this point that I realize that the setup for this test fully simulates the sensing and acting of one pass through the SUT's method, if all IF statements are ignored/assumed. It is right, and having read the code I understand why it is right. I see the values threaded from one return to the next call, etc. I fear I've just proven that "IF" works in this language.
  • Now I see that processing involves doing one look-up, calling a method that does a more detailed look-up, and then using the latter look-up's result to perform an update.  I can see that this is what the code does by looking at the tests. It's rather non-abstract, and not remotely in business language.
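If my reading is right, the production flow that these tests pin down amounts to something roughly like the sketch below. Again, every name (PerformAction's collaborators, ProcessAll, Find, Process) is hypothetical, inferred from the tests rather than copied from the code.

```csharp
// Rough reconstruction of the production flow the tests describe.
// All names are hypothetical.
public int PerformAction(string key)
{
    if (key == "all")
        return ProcessAll();             // the neighboring test's path

    var record = lookup.Find(key);       // the first, coarse look-up
    return processor.Process(record);    // detailed look-up, then the update
}
```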
I do have to wonder whether these tests have any real value other than cementing the code's current implementation into the test suite.  If we extract a method or inline a method, I know that these tests will fail. The new method will not be "incorrect", only "different."  It bothers me that the tests will fail if I improve the structure of the SUT.
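For contrast, a state-based version of the same test, written against a hand-rolled fake instead of strict expectations, would survive extract-method and inline-method refactorings because it asserts on outcomes rather than on the exact sequence of internal calls. All names here are hypothetical too:

```csharp
// Hypothetical alternative: assert on observable outcome, not call sequence.
[Test]
public void ProcessesExactlyTheFooNamedByTheParameter()
{
    var fakeLookup = new InMemoryLookup();   // hand-rolled fake, not a mock
    fakeLookup.Add("bar", new Record("bar"));

    var sut = new ItemProcessor(fakeLookup);
    sut.PerformAction("bar");

    Assert.AreEqual(1, fakeLookup.ProcessedCount);
    Assert.IsTrue(fakeLookup.WasProcessed("bar"));
}
```

Restructuring the inside of PerformAction changes none of these assertions.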
This is test-last development, which is a poor way to do testing, and that most likely makes it a poor way to do mocking as well. To be fair, the tests are fast and decoupled from the database, etc.
It takes much longer to write about this than to do it, yet I took the time so that there is some basis for discussion. Is my code conscience overly-sensitive, or does this have several kinds of wrong in it?
Is this mocking done well, or done poorly? You tell me.

* the last desperate measure is to trace execution with the debugger.

Sunday, May 2, 2010

Making Stuff is Making Stuff

On the television as I type this is a show called "Amazing Wedding Cakes", which is unsurprisingly about cake making. In the current episode a bride has ordered 190 individual single-serving cakes. The team is struggling with a trade-off between doing the work on time and doing the work really well.

We all do that. Software is so very much the same. We simply don't have time to obsess over every detail, but we also can't skimp on essential details. We need to make both radical and incremental improvement, but we can't abandon our charter to deliver quality.

This has been a running meme this week: whether one can afford quality, and whether skimping on quality will help a team go faster.  I've gone on record enough times saying that I don't think you go faster writing poor code, and I echo Uncle Bob Martin in saying that our most significant impediment is the bad code we have to work in.

In the end, the boss was pretty angry about the care that was taken with each of the cakes, but the work was done on time and with pretty high quality. The customers were delighted.

Of course, while that is a common story, the difference between cakes and software is that cake makers don't have to build all their other cakes by modifying the cake they've just made.  If there is an internal defect in one of the mini-cakes, it won't cause the others to fail and it will not cause failures in all of the cakes they make in the future. Wedding cakes are a one-time thing. Software lasts and accrues changes.

That is not to talk down the cake industry. My wife makes cakes, and I know how much work goes into the structure, flavor, and aesthetics. It's a really cool and really competitive artistic industry.

The point is just that some things are short-term and highly visible, while others are more long-term and mostly invisible. If we treated cakes like a multi-year investment it would be silly, and it is similarly stupid to treat software as a bunch of one-off, short-term projects.