Tuesday, August 18, 2009

Don't Geocache Your Variables

GeoCaching is a quirky, slightly geeky, fun treasure hunting game made possible by the ubiquity of GPS devices on the market. Essentially, person A will hide some object somewhere on the face of the earth, then publish the location as GPS coordinates. Person B will find their way to the given coordinates, take the object left by A, and replace it with another little gift/trophy for person C who might come along later. The locations may be at the bottom of a lake, top of a mountain, inside a cave, in a hollow tree, underground, in the hulk of an abandoned vehicle, etc. The game, therefore is partly in hiding the items in fun places, and partly in finding the items.

Geocaching in "meatspace" is fine, and probably a lot of fun (I've not had the pleasure personally). Geocaching in software is wrong.

I have found classes which contain something like this:

public class Location{
private Item XXX;
public Item Xxx {
get { return XXX; }
set { XXX = value; }
// A bunch of other variables
// A bunch of methods that deal with other variables,
// ... but never mention XXX or Xxx ever again.

Now the problem is that the Location doesn't need the Item at all, yet it is the place where some other class' method, perhaps CacheFiller.SetItem(Item x) will store item x. At some point in the program another class' method, like CacheRetriever.GetItem(), will retrieve the value.

Why is the data in Location? Because Location is somewhere in the intersection of the set of objects CacheFiller and CacheRetriever have in common. In other words, it is a convenient place.

Convenience is a pretty good reason, yes? Isn't it important to write new features quickly? You bet your bottom line it is! The code we write today is the code that's in our way tomorrow. What kind of week do we want to have next week?

If we are contract coders, and also selfish jerks, we might not care too much about the guys who have to deal with our code in the future. Even so, "the future" may be as soon as next month or next week, and short-cutting our code today will mean we (ourselves!) can do less tomorrow.

When I worked for Uncle Bob Martin the first time (Hi Bob) we traveled the world teaching object oriented design. Uncle Bob said in a maximally cohesive class, all of the methods used all of the variables. Most classes will not be totally cohesive, but it certainly is suspicious and smelly if a class has variables that are never used by its methods, or has methods that do not use any of its own variables. Low cohesion is poor design.

That having been said, you can get an official Agile Otter Cohesive Design Waiver in two circumstances:

  1. For certain utility classes have no variables of their own, and work by using a low-level API methods, provided the utility does not store the API's variables in other classes.

  2. For certain tuple-like "data truck" objects which have no behavior of their own, provided no other classes in the system are performing operations on the data (ie. feature envy).

I'm finding that for many of the beany objects I encounter, I find a number of other places in the code that are performing the object's behaviors. Remember that "having no methods" is not the same as "having no behavior". Frequently these are duplicated in multiple inconvenient places (close to UI or database code, intermixed with the rendering of reports, etc). The feature envy can be solved by extracting and migrating methods, and might result in a much more cohesive class.

When I find data-less utility classes, I sometimes find that they are geocaching some variables in other non-static objects. It suggests that there are proper classes possible that would combine functions from the utility classes and parts of the data objects together in a more cohesive whole.

Let's reiterate some examples for those who joined us late:

  1. When methods of a class touch their own instance methods, that is cohesion. In general, cohesion is good and should be maximized.

  2. When methods of one class touch the variables of other classes (even through getter/setter methods) that is coupling. In general, coupling is considered to be necessary but not so good, is to be minimized.

Coupling is not so good because it frustrates testing and refactoring. The places where a change is made and where the change causes a failure are not very close together. It is best when cause and effect are more closely situated in time and space. The more coupled a module is, the easier it is to break and the harder code is to separate and repurpose. For instance, the code in a triply-nested 'if' statement inside a switch/case statement in the query generation section of an Initialize method of a code-behind for a UI form is a little hard to test in fitnesse or NUnit. Testing is very important and we should not make it hard to do important things.

Cohesion is good because a cohesive class can be tested in greater isolation from UI and network and database issues. It can provide its own evidences of activity. It can be mocked-out more easily. It can be understood in a single IDE window. This makes it easy to do important things, and there are advantages in making important things easy to do.

Geocaching variables is the practice of locating them in places remote to their origins and destinations where other classes who are aware of their secret locations can re-acquire them later.

If you geocache your variables, you are increasing coupling and minimizing cohesion. In practical terms, that means it is harder for your coworkers to be sure that they're not breaking something when they refactor code you've written. It will be harder for them to write tests, which makes it easier for them to make undetected mistakes. Mistakes prevent systems from being accepted and deployed in a timely manner. Being unable to deploy systems keeps companies from being paid, and may cause them to lose paying customers permanently. Being paid is a good thing.

That means that geocaching variables is one of the ways to make your company stop paying you. That can make your spouse sick from anxiety and cause your children to grow up poor and wretched. It may or may not cause you to be attacked by velociraptors. It may speed the coming alien attack/ice age/global warming. It could bring about an apocalypse.

Most likely, though, it will slow you down and make your coworkers ashamed of your code.