Tuesday, November 4, 2008

Extreme Measures

  • Shorten iterations to force priority
  • SME can only help others complete tasks
  • Require 40% stories 100% done at midpoint
  • Revert/discard work over three weeks old
  • Random weekly team roster to force closure
  • Stir pairs twice daily
  • Eliminate individual tasks

Sometimes one has to take extreme measures to help a team over the hump in their agile transition. It is hard to adjust work habits without a work environment that depends on the new behaviors. These extreme measures may stick, or they may be training wheels for extreme programming.

Shorten Iterations
Shorten iterations to force priority. Cause the Customer role to pick fewer things to do, more often. This also should force developers to reach closure on cases more quickly. If the team is used to letting things lag and stack up for some future day, shortening the iteration can help them get into the habit of finishing things more quickly and taking on less work.

SME Has No Tasks
SME can only help others complete tasks. This rule forces collective code ownership. If a subject matter expert is not allowed to "do his own work" then he/she must do it through other people. This means more people with their hands in the same code, and also means a higher "truck number" for the team.

40 At The 50
Require 40% of stories to be 100% done at the iteration's midpoint if the team is still trying to assign work to individuals. If work is split up among individuals, it is normal that much of the work completes (if it completes) on the last day of the iteration. If the team is expected to organize around completing stories every day or two, then they have to work together in a new way. Normally, tracking velocity will take care of the problem, since having 100% of the work 90% done means a velocity of zero. In the cases where velocity is not motivation enough, you may need to enforce the "forty by the fifty" (40% completely done by the 50% mark of the iteration) rule.

Destroy Unfinished Work
Revert/discard work over three weeks old. Nothing proves sincerity like throwing work away. If you roll up your sleeves and delete some of the old tasks that have been pecked at over the course of weeks or months but never completed, it actually helps your team focus on the things that *really* need to be done, which improves your velocity. Incomplete work product is defined in Lean processes as "waste". If it were really all that important, it would have been driven to completion. It's trash. Take it out.

Random Roster
Randomize the team roster weekly to force closure of stories. Divide the team in half. Perhaps have half of them fix bugs while the other half works on new features. In most non-agile teams, people are used to having work slop over the edges of the iteration, and so they claim "done" when they're not really "done done". So randomize the teams. Now nobody can count on having next week to finish up the work they've committed to this week.

In addition, moving from team to team means that they will have the opportunity/obligation to work on parts of the system that are unfamiliar to them. This motivates the cleaning of ugly code and the shoring up of weak tests. It costs velocity, but improves truck number and code.

It is uncomfortable to live with change and uncertainty this way, but it will push people to rely on tests for features they don't know, and to ensure the tests pass before they hand the code off to a peer.

Stir Pairs
Stir pairs twice daily if you find people migrating to "pair marriages". You want to avoid having the same people partner up over and over. Things go stale that way, and people tend to partner with people at their own skill level rather than learning from people who are more skilled and sharing the burden of teaching those less skilled. If the partners are stirred occasionally, there is no undue burden and no hiding.

No Individual Tasks
Eliminate individual tasks by requiring that all production code have two sets of eyes at a minimum. Require pairing and TDD for all code. If this sounds extreme to you, you haven't been working in a shop that truly practices the XP style. This is actually part of the original process, and has been taught as-given for quite a long time now.

I don't generally advocate heavy-handed measures, but sometimes you have to create a system that teaches the practices you want people to learn... if only as a temporary measure.

Python Pimpl Pattern

A classic unit test blunder is to make use of the system time freely in your code. Another blunder is to monkey-patch your preferred time function.

I was working with some ATs which failed because they were written with a date in mind, and the calendar has marched on since those days. The answer is fairly obvious: override the date function. With a little searching, I found a utility fixture for forcing a given date/time. It worked as long as I ran the test in isolation, but failed when I ran the test in its suite.

Code in the system performed imports as "from mx.DateTime import now", and 'now' became a stable reference to whatever mx.DateTime.now happened to be at that moment. If you later change the reference in mx.DateTime, it doesn't affect the stable reference. The name binds at the time the importing module is loaded.

Now, Python does some nice optimization. When you import a module, it doesn't necessarily read the file from disk. If the module is already loaded (it is cached in sys.modules), Python merely maps names from the module's namespace into the current namespace, as requested by the import statement.

So the file Importer.py imports using "from mx.DateTime import now". If that happens after the fixture has monkey-patched mx.DateTime.now to some silly lambda method, then 'now' in Importer points to the lambda. If, on the other hand, it was imported prior to the monkey patch, 'now' points to the original function. If mx.DateTime.now is changed after Importer imported it, it has no effect. That's true even if the change is to set it back to mx.DateTime.now's original value.

Now let's say that Importer did "import mx.DateTime" and didn't bind 'now' to mx.DateTime.now, but instead called the method as mx.DateTime.now(). Now the monkey patch is fine. The reference is indirect, via lookup, and not via a bound reference. If we always called mx.DateTime.now(), then monkey-patching ("mx.DateTime.now = lambda: DateTime(blah)") would work, and un-patching it would work too. Some would say "problem solved". I suppose that would do it. But in Python, we consider this kind of patching to be evil. We try to respect module boundaries and not make implicit changes.
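
To make the binding rules concrete, here is a minimal sketch (mine, not from the system in question) that uses the standard time module in place of mx.DateTime:
----- BindingDemo.py

import time                    # late binding: each call looks up time.time
from time import time as now   # early binding: 'now' is frozen right here

original = time.time
time.time = lambda: "patched"

print time.time()   # "patched" -- found by lookup through the module
print now()         # the real clock -- the bound name never sees the patch

time.time = original           # un-patching helps only the module lookups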

We can write our own 'now' function in a module, have it call mx.DateTime.now(), and replace it in tests to force a particular date, but that puts us back in the same trouble if anyone writes "from TimsModule import now". The stable reference problem comes back for TimsModule just as it did for mx.DateTime.

So we need a function that can be used via a bound reference or called via the module path, and still give us the results we want. Back in C++ days, J. Coplien wrote up the envelope/letter pattern (aka pImpl). We need a function that delegates its implementation (like a 'strategy'). This is easy, since all functions in Python are objects:
------ NowFunction.py

from mx.DateTime import now as originalNowFunction

def now():
    "Delegate to whatever implementation is currently installed."
    return now.implementation()

now.implementation = originalNowFunction


Now we need an example of a program which imports now() and calls it repeatedly, so that we can prove that it is affected dynamically by changes to the implementation:

----- Importer.py

from NowFunction import now

def lookNow():
    "Watch how now() changes implementation"
    for i in xrange(35):
        yield now()



What's left is a program that manipulates the now function and demonstrates that the first file is getting the full benefit of setting and unsetting the implementation. Something that will set it to various values and back. Maybe based on some well-known programming example (with no attempt at optimizing or playing code golf):
----- test.py

import Importer
from NowFunction import now, originalNowFunction

# FizzBuzz-style demo: swap the implementation while the generator runs.
for n, value in enumerate(Importer.lookNow()):
    if (n % 3) == 0:
        now.implementation = lambda: "fizz"
    if (n % 5) == 0:
        now.implementation = lambda: "buzz"
    if (n % 5) == 0 and (n % 3) == 0:
        now.implementation = lambda: "fizzbuzz"
    if (n % 7) == 0:
        now.implementation = originalNowFunction
    print n, "Got", value, ", next sample", now()
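
With the delegating function in place, the date-forcing fixture that started all this becomes simple and, more importantly, reversible. Here is a sketch (the test case and the canned value are mine, purely for illustration):
----- FixtureExample.py

import unittest
from NowFunction import now, originalNowFunction

class FixedDateTest(unittest.TestCase):
    def setUp(self):
        # Force a known "current" date for the duration of each test.
        # (A real fixture would install a DateTime, not a string.)
        now.implementation = lambda: "2008-11-04"

    def tearDown(self):
        # Un-patching is reliable: importers hold the delegating 'now',
        # so restoring the implementation restores them all at once.
        now.implementation = originalNowFunction

    def testSeesTheCannedDate(self):
        self.assertEqual("2008-11-04", now())

if __name__ == '__main__':
    unittest.main()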

Monday, November 3, 2008

Acceptance Test Qualities

I'm involved in writing a new agile guide with Jeff Langr. We are taking agile concepts and trying to boil them down to the simplest forms that cover the bases reasonably well.

It is rather like playing The Three Things (AKA "the Two Things") game for Agile software development. An example:

Acceptance Tests

  • Define “done done” for stories
  • Must be automated
  • Document all uses of the system
  • Should be usable as the basis for system documentation
  • Do not replace exploratory tests
  • Run in as-close-as-possible-to-production environment


This list is intended as a small set of reminders, so that when one is in the midst of a project, one might find some guidance. Is the test really fit for use as documentation or written as programmer-ese? Is it describing the feature well enough to guide development? Is the Continuous Integration environment running it in a naive or unusual system configuration? Should we run these tests manually?

The bullet list should speak to you. If not, then read through the explanation below.

Define “done done” for stories

Clearly some of the greatest value in ATs is that they are executable specifications. No work should be assigned without some ATs first being created that describe the feature fairly fully. I tend not to require fully comprehensive AT coverage, but I find that sometimes I am wrong not to. This point is as important as it is difficult. We are frequently finding "missed requirements" or "unexpected interactions." The answer for these is probably not full Big Design Up-Front (BDUF), but a more agile way to deal with corrections and changes.

Must be automated

ATs really have to be automated. Manual testing simply cannot scale. We can expect to run every automated test we've ever written a few times a day, but could hardly expect to run all of the manual tests we could have written even once every two weeks. Automation doesn't just make testing convenient, it makes continual testing possible.

Document all uses of the system

Even uses of a system that predate the team's agile transition still need tests. This is because the second value of acceptance tests is in preventing regressions or detecting brokenness. It is never a good time to be ignorant of the fact that you've broken your system.

Should be usable as the basis for system documentation

The third value of the ATs is that they document the system. That should make it easier for people whose job is also to document the system. Often this power of testing is overlooked, especially when the tests are written in a non-literate style.

Do not replace exploratory tests

Of course, automated tests are never complete and features are prone to have unintended interactions or consequences. Professional testers are valuable teammates. Their exploratory testing may uncover things that programmers, intimate with the workings of their code, might not.

Run in as-close-as-possible-to-production environment

Finally, tests need to run on something as close to the target platform as possible. Tests that pass on a development box can still fail in the production configuration; it happens. It's better to find any platform issues earlier in the process. If the tests include a database, it ought to be the same kind of database you'll see in production. Likewise file systems, network hardware & software, etc. It might be handy to have a CI system run the tests once on a development-like system and then install and run them again on a production-like environment.

Agile Progress and Branching

This week, and last, we are doing our work in the release candidate (RC) branch, which will eventually be merged to trunk. We maintain a "stable trunk" system, with the RC as our codeline (for now). This is an intermediate step on our way to continuous integration.

Partly because of the change in version control, the team has learned to rely more upon the tests, and is writing them quickly. We have had a noticeable increase in both unit tests (UTs) and automated user acceptance tests (UATs) in only one week. There were some problems with people checking in code for which some tests did not pass, but they have learned very quickly that this is quite unwelcome.

We are painfully aware of the time it takes to run both test suites. The UTs suffer from a common testability problem, in that they were written to use the database, and they sometimes tend to be subsystem tests rather than true unit tests. When they are scoped down and mocking is applied, they should be much faster. Sadly, we are using one of those ORM frameworks that wants to own our objects and bind them tightly to the database, so we will have to go through some more contortions to get our objects free of the database. This is common, but always troublesome. The features that make a framework convenient can be the same ones that frustrate all attempts at building moderately comprehensive test suites. Our unit tests take over 10 minutes on my computer, and the UATs take much longer. =8-o

We have been closing down old branches for a while now (releasing backlogged work), which can only increase our productivity by decreasing the "drag" of branch maintenance and troublesome integrations. We have not outlawed development branches, but we will commit to always doing a small amount of work directly in the RC, with larger tasks, or those with uncertain results, branched for now.

We have a nosetest-based harness for gathering coverage information from unit tests, and I hooked up coverage.py to collect the same data for our UATs. It's not a perfect system yet, but we can at least start to chart some trends.

Our Continuous Integration effort is nascent. I'm going to try to find a way to set up buildbot to run all our unit tests (at least) and then to launch the UATs through FitNesse (always a pain to automate). I'm expecting a lot of fun here.

Our informative workspace initiative is coming along. We have UT counts and timing graphs, the same for UATs, working card-walls for our tasks, simple process information, etc. Some of our programmers have been producing monthly production charts to track the amount of money moving through the system, etc.

Overall, we're doing a pretty good job of transitioning. We have challenges, but we've come a long way.

Sunday, October 19, 2008

Branching

The team I have been working with has been reporting a surprising amount of time spent merging branches forward, merging them to an RC, and dealing with merges that were not done correctly. This just supports my thesis that branching is a wasteful practice.

What I think makes sense is that a team (whole team) is given sufficiently small jobs that any one of them can be completed within an iteration. Anything the team starts it also finishes. The team also commits to the creation and use of unit tests (preferably via TDD) so that the tests they have written give them confidence that they've broken nothing with their most recent changes.

I see the team working in a common code line for all new work. This can be trunk, or a common branch. As work completes, the CI server runs all the tests (UT and AT and anything else they can automate). If all the tests pass, the branch is tagged as a release candidate. When it is time to release, an existing release candidate is chosen as the next release. If there are always a number of release candidates available, releasing should not stop the development team from working.

The branches needed are a quick-fix branch (the last production release), the current RC (being certified for production), and the common code line (current development). That's a total of three. I can concede the need for branches for experimental work: tasks that should be discarded if they don't work out. I also realize that a company may need to keep multiple production branches open for customers who have not upgraded yet. But for many companies (IT departments and SaaS producers) three branches can be enough. No matter how I've tried, I have never managed to get a production system to use fewer than three. I think it is close to the minimum.
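
In Subversion terms (the paths are hypothetical, just to make the shape concrete), the whole scheme might look like this:

    repo/
        trunk/                   # common code line: all new work lands here
        branches/rc-2008.11/     # current RC, being certified for production
        branches/prod-2008.10/   # last production release, quick fixes only
        branches/spike-foo/      # optional: experimental work we may discard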

This simplification pays off in reduced waste.

Speed of development does not come from the sources people often attribute. It is not a matter of typing faster, racing through work, increased pressure, rampant caffeine abuse, exceptional effort by genius programmers, cutting corners, or working longer hours (though any of those may create a temporary "bump"). I am firmly convinced that increased development velocity is a result of having higher quality work products and a simpler system to work in. This is why Lean and XP and other Agile techniques appeal to me. Having capable programmers who can enter code quickly doesn't hurt, of course. Such programmers are even more productive in a simpler system.

By delaying the point of commitment so that the customer may "cherry pick" changes for a release, complicated branching ensures a last minute "integration hell" panic at the end of every release. Last-minute integration decreases quality and complicates planning.

Mind you, in an open-source project it is still reasonable to practice branch-per-change management, because it is acceptable that many (most?) submissions may be rejected or revised, and may take many weeks or months to reach an acceptable state.

In commercial applications, a simple three-branch system can be a much more efficient way to get code out the door.

Wednesday, October 8, 2008

START

RandsInRepose writes the best article I've seen in a long time. I am also daunted by large tasks, and especially large learning tasks. Even large drudgery tasks can give me pause from time to time. His answer is so simple, and so reasonable.

Start.
Iterate.
Mix it up.

I noticed in the tree story that he had one more bit of advice -- get some help.

Nice points to ponder.

Monday, October 6, 2008

Branching as a Coping Mechanism

Branching is a coping mechanism.

There is a lot of branching and release-time cherry-picking of features in the commercial software world. It seems to alleviate some pain and make the process more manageable, but I fear this is at too great a cost. Many development groups branch due to dysfunction.

Note that I am not talking about open source projects. I'm specifically speaking about commercial projects where programming talent is not free. I'm also not talking down distributed version control, where every checkout is really a separate branch. I am concerned more about use of branches for the wrong reasons and at a high cost.

Often branching is used to put off integration. This is a losing game. After so many changes to the trunk a branch will no longer slide seamlessly into the release stream. We can work around that problem (itself a work-around) if we merge forward all branches periodically so that they track the trunk more closely. This activity is time spent on features that may not be released this week or even next. Or maybe ever.

Branching can seem like a handy way to create flexibility for the product team, as they can de-/re-prioritize as they please up until the last minute. This would follow the lean principle of deferring decisions to the last responsible moment, if it were not for the word "responsible". They've already spent the money to have the features implemented by the time they decide not to release them. Wouldn't it have been more responsible to spend developer time and effort only on things they definitely need to deliver?

In several businesses, the branches are a vote of no-confidence in the development team. If a team regularly accepts more work than it can complete, it makes sense to keep the work separated until it really completes. This is clearly a failing in both planning and execution. Better to assign only work that is completable in a reasonable time, and to build with techniques like TDD that give a better sense of real completion.

In some cases, branching is required because the QA team is so backlogged that it may be weeks or months before they will have the opportunity to test the work that has been done. In this case, the answer is probably not to make an even larger backlog for the QA team. Unfinished work is waste. There is little reason for piling waste up on top of waste.

The underlying problem is that branches are a perishable inventory. More is not better. Each incomplete branch has to be maintained or eliminated. Tossing out the work is obvious waste. Continuing to maintain branches you're not releasing is also waste. Letting the code rot until it no longer can merge to the trunk is also waste.

If a team takes on small, immediately-releasable units of work and completes them quickly, then branching becomes unnecessary. If the work they do is the most important work they can do, then the flexibility of *not* releasing that work is unwelcome. If the team practices CI so that it always has something ready to release, then it is hard to find reasons to branch. If your agile team became truly agile, you could obviate most branching and free up a lot of manpower for other important tasks.

But imagine that you had to ask developers to make a new branch for their work, and had to explain that it was because you didn't bother to break their task into reasonably-sized slices, that you want to be able to de-prioritize it, that they will have to maintain the branch until someone gets around to testing it, and that you don't really expect to be able to release it.

There are good reasons for branching, such as when making experimental changes you may very well want to discard. Alternatively if you examine your reasons for branching, you may not like what you learn.

If you find yourself in a dysfunctional branching hell, the answer is not to stop branching. The answer is to obviate branching by producing less, producing it better (with more and better testing), and integrating it sooner. Eliminate the need, then the practice.