Friday, March 19, 2010

Full Test Access With Data Hiding for Production

I'm talking to users of statically-typed languages today. Functional and dynamic language people, cut them some slack and realize I'm not talking to you.

Breaking with Conventional Wisdom

There are certainly Object-Oriented design pundits who will argue that you should never, ever have public variables in a class. The reason is good enough: the object should be responsible for its own state. You should tell it to do things, and it should do them. Likewise, you should not manipulate its variables for it, since that requires a strong understanding of the valid states of the object, and this understanding is a strong coupling. Strong couplings between classes are bad things. All of that wisdom is correct and proven.

However, many programmers make bean-ish objects where every private variable has a getter and a setter. This helps to preserve uniformity of access (you can't tell if it's calculated or stored, method or variable), but is otherwise the same as exposing public variables. When uniform access is important, this is a good practice.

The Proposal

When writing tests, data-hiding and uniformity of access are unhelpful. While many say that a class should be tested through the interface it uses in production, we still recognize that testing is improved when there is unimpeded access to a class. How often do you hear of people forgoing tests or adding a "testable" work-around simply because a method is declared protected or private?

There is a way to harmonize both the desire for object self-management and the need for testing with full access.

Some (hopefully most) classes are not built to be used directly by other classes, but through interfaces. A method outside of the interface does not exist to the classes that use the interface. This invisibility is "data hiding" and is a good thing. Private methods approximate this invisibility too well, making details of representation invisible to tests as well as to users.

If we program through interfaces, then we find that adding public methods to classes (not their interfaces) is a non-issue. We are without fear of programmers pillaging our object's state. The non-interface methods do not pop up in the code hints or code completion windows. As far as callers are concerned they do not exist, and an attempt to call one of our public non-interface methods through the interface is a compiler error. We have the data hiding we need.

Tests will usually be working against concrete implementations, and are not tied to the small set of methods presented by the interface. Tests automatically have "privileged access" to methods outside the interface of the class. Data hiding therefore can exist alongside full test access.
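To make that concrete, here is a minimal Java sketch. The names Counter, TallyCounter, and Turnstile are hypothetical and chosen only for illustration. Production code holds a Counter and cannot reach the extra public members, while a test that constructs the concrete TallyCounter can read and set them freely.

```java
// Hypothetical example: Counter, TallyCounter, and Turnstile are
// illustrative names, not taken from any real library.

// The interface is all that production code ever sees.
interface Counter {
    void increment();
    int current();
}

// The implementation exposes extra public members for the benefit of tests.
class TallyCounter implements Counter {
    public int count;   // public on purpose: not part of Counter
    public int resets;  // public on purpose: not part of Counter

    @Override
    public void increment() {
        count++;
    }

    @Override
    public int current() {
        return count;
    }

    // Public, but invisible through the Counter interface.
    public void reset() {
        count = 0;
        resets++;
    }
}

// Production code depends only on the interface.
class Turnstile {
    private final Counter counter;

    Turnstile(Counter counter) {
        this.counter = counter;
    }

    void admit() {
        counter.increment();
        // counter.reset();   // would not compile: reset() is not in Counter
        // counter.count = 0; // would not compile: count is not in Counter
    }
}

class InterfaceHidingDemo {
    public static void main(String[] args) {
        // A test works against the concrete class and sees everything.
        TallyCounter tally = new TallyCounter();
        tally.count = 41;                     // arrange state directly
        new Turnstile(tally).admit();         // act through production code
        System.out.println(tally.current());  // prints 42
        tally.reset();
        System.out.println(tally.resets);     // prints 1
    }
}
```

The two commented-out lines in Turnstile show the hiding at work: the members are public, yet the compiler rejects them because the static type is Counter.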

Some classes inherit implementation from others. It's a code sharing mechanism with a long and storied history. These derived classes will have access to the public methods of their base classes. So test them well.

Side Effects

There is an additional advantage to all classes implementing interfaces, in that interfaces make it easy to build test doubles. Some languages have facilities for partial-mocking of classes, but an interface only makes the work easier.
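As a rough illustration, a hand-written test double needs nothing more than a second implementation of the interface. The names Mailer, RecordingMailer, and Greeter below are hypothetical; the sketch uses no mocking library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: Mailer, RecordingMailer, and Greeter are
// illustrative names, not taken from any real library.

interface Mailer {
    void send(String recipient, String body);
}

// Test double: records each recipient instead of sending any mail.
class RecordingMailer implements Mailer {
    public final List<String> recipients = new ArrayList<>();

    @Override
    public void send(String recipient, String body) {
        recipients.add(recipient);
    }
}

// Class under test: knows only the Mailer interface.
class Greeter {
    private final Mailer mailer;

    Greeter(Mailer mailer) {
        this.mailer = mailer;
    }

    void welcome(String recipient) {
        mailer.send(recipient, "Welcome!");
    }
}

class TestDoubleDemo {
    public static void main(String[] args) {
        RecordingMailer mailer = new RecordingMailer();
        new Greeter(mailer).welcome("ann@example.com");
        System.out.println(mailer.recipients); // prints [ann@example.com]
    }
}
```

The double's recorded state is itself public, which is harmless for the same reason: Greeter only ever sees a Mailer.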

Moreover, most statically-typed languages have IDEs with refactoring support, so that "extract interface" is a relatively trivial thing to do. If it is easy to make an interface, easy to use an interface, easy to mock an interface, and the interface effectively hides the extra methods used by testing, it is hard to understand why people do not use them more.

UI components are a useful class of objects used through interfaces. They are called by some kind of a UI framework, through interfaces it provides. The UI framework doesn't know your class from any other class. So why do the UI classes keep private methods and private variables? The only callers of the class cannot call them even if they were made public. If some other classes in the UI need to use a UI component, let it be used through an interface which constrains the set of usable class features.

Test harnesses and test fixtures are another instance of classes that do not need private variables. Against whom are they protecting their private state? Against the test runners? Against production programmers? Against other tests? Horse feathers! There is no real need for tests to protect their internals.

Sound Bite

IFF we program to interfaces, THEN private implementation variables and methods are not only redundant but also impede testing. Let's consider a little shift in our thinking if it will make our work easier.


  1. "If we program through interfaces..."
    What do you mean by the term interface? Is an interface the full body of public methods available to a class? If you have no private variables or methods, how do you store the class data, or encapsulate functions that need to be reused by various public methods of the class?

  2. By "interface" I'm talking of declared interfaces, which are approximated in C++ as pure virtual base classes or abstract base classes. In Java and C# they're called "interfaces" instead.

    So by "programming to interfaces" I mean that you always pass, receive, return, and manipulate objects through interfaces (unless it is beyond absurd to do so, as with integers).

    In that case, the methods published in the base class are the only methods the world knows. The only exceptions are the implementation class itself and possibly the factory or factory method that makes it.

    So if I take an interface that has three methods, and the class implementing it has twelve, I can only see the three. To see more I would have to "pierce the veil" by down-casting from the interface to the concrete class, which is generally a poor idea. Since everyone knows the interface and not the implementation, the data and inherited methods in the implementer are unreachable.

    In cases where a class is protected by an interface, the data and utility methods can be public if you like, because "public" loses its meaning.

    That's the point: the tests are written to the implementation and get full disclosure, while the classes that use the interface have a limited view.

  3. I'm largely in agreement with you, Tim, but there is (at least) one downside with interfaces (the Java type, that is): they can't be changed after they have been published. You can't even add a new method, or remove something that isn't being used. Unless, that is, you control all the implementations of the interface as well.

    My approach of late, in Java, is to use interfaces when I can distill the concept down to 1-3 methods. Anything bigger and I probably haven't got the real essence of what the thing is, so it's probably going to evolve, and in that case, I'll use a public class as the interface. (I might still make the public class abstract, and keep the concrete implementations behind a factory interface, so I could still define public non-interface methods in those subclasses and it amounts to the same thing.)
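
The approach described in the last comment might look roughly like the Java sketch below. The names Shape and Square are hypothetical, and a static factory method stands in for the factory interface the commenter mentions, which is an assumption made to keep the example short. Callers see only the abstract class, which can later gain a concrete method such as describe() without breaking existing subclasses.

```java
// Hypothetical example: Shape and Square are illustrative names only.

// A public abstract class plays the role of the interface.
abstract class Shape {
    public abstract double area();

    // A concrete method can be added later without touching subclasses.
    public String describe() {
        return "shape with area " + area();
    }

    // Factory method (an assumption standing in for a factory interface):
    // callers never name the concrete subclass.
    public static Shape square(double side) {
        return new Square(side);
    }
}

// Concrete implementation with public state outside the Shape contract.
class Square extends Shape {
    public double side; // reachable from tests, not through Shape

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

class AbstractClassDemo {
    public static void main(String[] args) {
        // Production view: only the Shape contract is visible.
        Shape shape = Shape.square(2.0);
        System.out.println(shape.describe()); // prints shape with area 4.0
        // shape.side = 3.0; // would not compile: side is not in Shape

        // Test view: the concrete class is fully open.
        Square square = new Square(2.0);
        square.side = 3.0;
        System.out.println(square.area());    // prints 9.0
    }
}
```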