Showing posts sorted by relevance for query TDD.

Tuesday, September 26, 2017

Why TDD?


Why do we care if people call what they're doing TDD or BDD, and why do we care whether they actually do it?

Saying It


When people mislabel what they’re doing and refer to it as “BDD” or “TDD” or “Scrum” or “Agile” when it isn't, it screws up all the conversations that follow until we manage to unravel that they’re really just doing automated testing, or iterations, or what-have-you.

Clarity in a conversation has the same value as clarity in code, and maybe more so.

Someone told me that they were doing TDD, and we were well into the conversation before I realized that everything I had said for several minutes had been totally misconstrued.

What they thought was TDD was “holding a testing sprint before release.”

So, the value in crisp terminology is improved communication.

What is the value proposition for misusing terms?

I guess some people — even in the agile world — don’t know that TDD and BDD are processes, not artifacts or tools; and that neither is just another name for “automated testing.”

You’d think we all know better, but no.


Why does TDD beat just writing tests afterward with a test-coverage tool?


It doesn't have to be TDD, you know. Cope says he can get equivalent results using DbC, and Cope isn’t given to making such statements lightly. So there are different ways to get similar results (it seems).

If you find other ways of achieving the same goals (or over-achieving by also solving other goals) then, by all means, teach me what you learn.

Of course, you can use test coverage tools with TDD, so really having the coverage tool is a wash. There is nothing about TDD that forbids having measurement.

Let's stick to TDD v. Test-After for a moment:

  • It brings testing across the RW/BS line. Test-after tends to live on the BS side, even though we like to think it doesn’t.
  • It begins with the developer as a user of a class/function and ends up with the developer as an implementer.
  • It keeps the code runnable at all times, since you can't have the code "up on blocks" for hours at a time and run the tests.
  • The feedback loops are hella fast. You know that everything was fine 10 seconds ago, and you’ve only written two lines of code and now it’s broken.


Typically, the code is written from the implementer's point of view, so it has an inside-out API. You get the algorithm right, then expose the variables via some function calls and make sure it does its job. This can be complicated in some cases, so the code I'm typing at 14:30 may be going into a file that was last run yesterday. Or the day before. If I'm writing it all in one go, I don't have to keep it runnable. After all, it only needs to work when it's done.

And then it’s done. Well, you know, other than tests.

The code could conceivably be shipped, and the author has checked and double-checked it on the fly. We've had the debugger out to fix some problems so we know it works.

Still, the Powers That Be say there have to be automated tests. So the work immediately takes on an obligatory, foot-dragging feel, but we're professionals, so we push through…

 … and then we see that to test some variation, we would have to change the code. That’s one thing when it’s being written, but dammit the code is done and I don’t want to change it just to make the tests pass; how much do I need this test? This is another source of emotional foot-dragging and obligation. Maybe we push through that and change the code or write a complicated test.

… and writing the test we see that it’s a pain to pass all these parameters. It’s a little ticklish because there are 3 integers in the parameter list in a row so it’s easy to get the wrong value in the wrong place. OTOH, the code is done and changing the interface will mean changing the finished code and all the tests. Emotional foot-dragging kicks in. Can we just leave it the way it is? Note that in TDD we would be faced with this decision repeatedly, increasing the likelihood that we would revise the interface for safety's sake.

There are other points of friction. While there should logically be NO DIFFERENCE in doing test-after instead of test-before, it somehow just never turns out that way.

The only thing that test-after ensures is that there are tests; one of the less interesting side-effects of TDD. TDD isn’t about writing tests; it’s about driving the development of code in a special way with the side-effect that there is pretty good test coverage at the end.

Don't take my word for it.


Anyone is welcome to try an experiment.

If you are not sure that TDD makes any sense, or that it's effective, then try doing TDD for two weeks and then doing test-after for two weeks and evaluate the quality of the code and the quality of the tests and the quality of the experience.

  • Is the non-TDD way as safe? 
  • Is the code as frequently runnable and integrated?
  • Is the feedback as fast? 
  • Does it result in as good a code interface? 
  • Does it feel like the testing is a natural part of the work flow? 
  • Does it lend to refactoring as easily? 
  • Is the code just as good (accurate, simple, clear)?
  • What other benefits do you get from the non-TDD method? 

And don't just limit that to test-after. There are probably better ways than TDD, and we'll find them by going forward. Likely the answer is not "just don't do it" but in doing something even more simple, valuable, and profound in its place.

FIND THAT BETTER WAY!

And then come here and post a link to what you've learned. I'm all ears!

Monday, January 15, 2024

Fundamentally Wrong

The Problem

Several friends, and also some critics (and some people who are both), have shared an article with me describing how TDD is fundamentally wrong and how test-after development is better.

To be fair, the process described here is fundamentally wrong:



The problems with step #1 would indeed lead to the wasteful problems described below it. The recommendation here would certainly be better than the process described above:



TDD Is Fundamentally Wrong is Fundamentally Wrong


Now, the problem with this article is more fundamental than the problem being described.

TDD does not mean "Write all the tests then all the code"

It has never meant that.

That is not TDD.

That is some other misbegotten travesty that has no name.


This is the fifth or sixth time I've heard someone describe TDD as writing all the tests first. In all cases except one, it has been described by people who self-describe as anti-TDD, and who write articles decrying the foolishness that they identify as TDD (which is not TDD).

I have never seen anyone do TDD that way -- even unsuccessfully.  I have never seen anyone even try to do TDD that way. I would never sit by while someone tried to do that and called it TDD. That's simply not the process.

The one time that I read an article that actually recommended doing that, it was from a Microsoft publication early in the XP/Agile days. The public outcry was great and sudden, and the article was retracted. I don't think I've ever seen anyone else recommend that approach,  because that approach is so obviously flawed and was never the process used by original XP teams.

So What Is TDD?


TDD was originally described as a three-step dance that repeats over and over as one develops code.

You can take that straight from the horse's mouth (so to speak):



To these three steps, we (at Industrial Logic) added a fourth and final step. Others may also have independently added this fourth step; I don't know.

  1. Write a test that does not pass, and in fact cannot pass, because the functionality it describes does not exist.  This failing test is the RED step, so-called because test-running programs generally produce the results of a failed test run colored red.
  2. Write the code that passes the test.  When the test passes, it is typical that the test-running program will present the results colored green, so this is often called the GREEN step. The code may not be optimal or beautiful, but it does (only) the thing the test(s) require it to do.
  3. Refactor the code and test so that both are readable, well-structured, clean, simple, and basically full of virtue (so far).  Refactoring requires the presence of tests, so this way we can refactor as soon as the code passes the tests, rather than waiting until after all the code for our feature/story/task is finished. We refactor very frequently.
  4. Integrate with the code base. This will include at least making a local commit. Most likely it will be a local commit and also a pull from the main branch (preferably git pull -r). More than half the time, it also includes a push to the shared branch so everyone else can benefit from our changes and detect any integration issues early.


 We repeat this cycle for the next test. The whole cycle may repeat 4-10 times an hour.


We do 1-2-3-4-1-2-3-1-2-3-4, we do not do 111111111-222222222-44444-(maybe someday)333.  These are not batched.

Was It A Misunderstanding of the List Method?

Some people misunderstood Kent Beck's List Method, in which you begin with a step 0 of writing down a list of the tests you think you will need to pass for your change to be successful (see the screen shot and link to Kent Beck's article). 

Note that you only make a list of tests. You do not write them all in code.

As you enter the TDD cycle, you take a test from the list. That may be the first test, the easiest test, the most essential test, or the most architecturally significant test. You follow the 4-step (or 3-step) dance as above. 

If you realize a test is unnecessary, you scratch it off the list. Don't write tests if the functionality they describe is already covered by an existing test.

As you make discoveries, you add new tests to the list. That discovery may lead you to scratch some unwritten tests off the list. That's normal.

Eventually, you will note that all the tests on your list have been scratched out. Either you implemented them, or you realized they're unnecessary. This applies to the tests you discovered as well as the original list.

You've done everything you can think of doing that is relevant to this task, so you must be done. This is doubly true if you have a partner also thinking with you, or even more certain if you have a whole ensemble cast working with you.

You never had to predict the full implementation of the features and write tests against that speculative future state.

It's a tight inner cycle, not a series of batches.

Do I disagree with the article, then?


Indeed, "write all the tests first" would only work for the most trivial and contrived practice example. It would never suffice in real work, where 11/12ths of what we do is learning, reading, and thinking.

As far as the process being unworkable, I totally agree.

As far as that process being TDD, I totally disagree. 

That characterization of TDD is fundamentally wrong.


Tuesday, November 12, 2013

TDD: more to know

The basics are well-known:


But there is so much more to know.

I have been gathering little sound bites for you which may help you build your skills and knowledge.

Please feel free to drop additional factoids or questions. I'm happy to explain any of these at length if you like.

Here is my list:
  1. Your code has two parts: the part you have covered with TDD, and the part that requires you to use a debugger.
  2. Microtests are F.I.R.S.T.  (you cannot TDD after writing the code)
  3. Only microtests are appropriate for TDD; other tests are useful, but not for TDD.
  4. Microtests are not all your tests - you need other levels of test still. 
  5. TDD does not validate your system; it only speeds development and improves quality.
  6. TDD without a pair programming partner is like programming while wearing only one shoe.
  7. TDD's power is in the shortness of the cycle and the constant invitation to evaluate and refactor; this is why test-after doesn't yield the same results.
  8. We absolutely will unabashedly change production code in order to make tests pass (e.g. to make it testable)
  9. If you don't do the 'refactor' and 'integrate' steps of the cycle, it can only end in tears.
  10. Depending on clock or file system or database or web services or anything outside the function under test disqualifies your test as a microtest.
  11. Don't try to test the unit in context; this way lies trouble. Test it in artificial context.
  12. Don't build a bigger framework for testing; make a smaller test.
  13. You have to know what is the micro-unit of code you're trying to test.
  14. TDD is not slow. It just seems like it is going to be slow.
  15. You don't have to read the code to understand a good test. 
  16. TDD is a specific discipline. It can be done wrong or badly.
  17. Tests have their own anti-patterns.
Visit They're Called Microtests by Mike Hill, and then maybe consider learning more at Industrial Logic.

Tuesday, September 6, 2011

You Cannot Possibly Do TDD after Coding

Just for the record: it is flat out impossible to "do the tdd" after the code is finished. This is just a matter of definition.

You can write tests after the code is finished, but that has no relationship to TDD whatsoever. None. Nada. Zip.

In TDD you write a new, failing test. Next you write enough code to pass the test. Then you refactor. This repeats until the code is done, writing and running tests continually. It shapes the way the code is written. It is a technique for making sure you write the right code, and that you do so in an incremental and iterative way.

In ATDD, you have failing acceptance tests. You take the first failing test (or the easiest) and use TDD to build the code so that that part of the AT passes. You run the ATs after each major step that you've built using TDD. When the ATs are all green, you have completed the feature. This helps avoid stopping early and also helps avoid gold-plating.

If tests were product, then it would make no difference whether you do them first or last, and redundancy would be silly and worth avoiding.

If tests are rightfully seen as process, then redundancy is "coherence" and not worth avoiding. It helps us move forward.

The biggest problem with TDD is that if you're not doing the process, then the product (tests) don't make sense.

Hate TDD? Then there's a good chance you're not actually doing it.

Tuesday, September 22, 2009

The Elusive Code Quality Argument

I was reading Uncle Bob's latest blog this morning about messy code and technical debt. I wanted to make a comment about the problems programming shops face, but decided to do it here instead.

The problem with clean code is twofold:
1) people who can't see it don't believe in it
2) some people who should be able to see it don't believe in it

People who can't see it don't believe in it.


One of the heartbreaking lessons from the Big Ball Of Mud talk on Wednesday at Agile2009 is that people working two levels of management above your head do not know that the code is messy. Joe and Brian popped up a slide of Mike Rowe and quipped that you can't bring him out to wade through the muck in a way that non-programmers can understand. Oh, the code is stinky and messy and bad, but only you can see it.

If you can't see the difference between clean and ugly code, it all sounds like a "programmer myth". It seems daft to take time for refactoring. After all, when the programmers finish refactoring the code doesn't do anything new, but the programmers feel better. How much money do we lose to make programmers feel better?

We need quality (in low bug count, low regression count, sustainable productivity) but can't afford time for quality practices (TDD, pairing, and clean code). Discounting this dubious "clean code" thing, it must be because the programmers aren't very good. Which is right, as far as it goes. Better programmers make better code which can be enhanced more readily. But doesn't that imply that our fastest programmers must be our best programmers?


Some people who should be able to see it don't believe in it.


Not all programmers can see mess. If they could see it, then they wouldn't make so much of it.

What if one makes a new program by copying an existing program and hastily hacking it into a workable shape (ignoring duplication and testing) and drops it into the release for tomorrow? Isn't that a big win for my team? If it's done quickly, doesn't that make me a good programmer?

Maybe the jury is out until we hear back from the users. Is my responsibility to hack code out quickly, or to make stuff that works in actual users' hands? What about when my peers come along to fix something: have I helped or hindered them? Quick hacks stop well before they reach 'done.' Though hacks look good in the short term, they are just deferring work to post-release. It would be wrong to reward this behavior.

A number of otherwise capable and productive programmers can't tell mess from brilliance. Their code is complex, confusing, implicit, indirect, cryptic, and poorly organized, but it works and they feel good about it. They may have reached some level of success for continually pouring out working code, yet their code is a shambles. James Grenning would say such a person is like a cook who never cleans the kitchen.

The primary factors determining how quickly we will program today are the quality of the code we're working in, and our ability to do work well. Clean, clear, obvious, straightforward code makes us better and faster, poor code makes us slower and more likely to make mistakes. John Goodsen from RadSoft always told me that the secret to going fast was not to slap things together but to make fewer, more correctable mistakes. This level of disciplined work is not a waste of time, but a small-yet-potent investment in future productivity.

We've learned that the longer a bug remains undetected, the more it will cost to locate, isolate, and eliminate it. Cleaner code will reduce the incidence of bugs, and TDD will also speed discovery of bugs. Ugly code will encourage the creation of bugs, and lack of TDD will allow them to remain undetected for longer periods. Sending bugs out to the customers erodes good will, which nobody wants. As a coping mechanism, exhaustive manual testing is costly in time and money. Code cleaning and TDD together are a waste preventative rather than a waste of money and time.

Duplication of code is a common form of "messy code", generally caused by copy-and-paste programming. It is particularly ugly because developers may fix one copy of the code (perhaps in a report) not knowing that it has been duplicated elsewhere (perhaps in another report or screen). Later, bugs get reported that look like recurrences or regressions, but they are really just duplicates of the same bug. Going back to fix a bug multiple times is an expensive waste of user patience. Eliminating duplication is waste removal, not actually a form of waste at all.

Cleaning our code and testing our code make us go faster, but the effects are not immediate. It may seem counterintuitive that we are going faster by taking time to clean and refactor our code, by using TDD and pair programming, but these are the practices that we use to avoid having code returned by QA or unhappy users. If we measure from the time we pick up an assignment until the time it really works for our users, we find that TDD, refactoring, pair programming, and like practices greatly speed development. If we only measure from the time we pick up until we release the buggy feature, then all these practices seem to slow us down. You have to choose the measurements that really matter.


Where does this leave us?


If some programmers can't tell clean code from messy code, most managers cannot tell, and most sales and product people can't tell, and if the benefits of refactoring trail the initial feature work by weeks, months, or years, then aren't we without hope of improvement?

We are without hope of external rescue. It is unlikely that any non-developers in authority will mandate or even approve the practices that will get us out of our mess. If things are going to be better, it will be because we make them better. We don't need permission, but if we care about our products then we do need to use hygienic practices in our daily programming.

There is hope, but it is only us.

Tuesday, September 29, 2009

Tragedy of the Doubtful Solution

The Background

I (re)tell a story frequently about a place I worked in the 90s. There was a piece of code with an absurd cyclomatic complexity score, running literally hundreds of lines in length, and being called from myriad places in the code base.

The code was written to check to see if two ranges were overlapping. Being written in a poor 4GL, it took four parameters representing the starting and stopping dates of two time ranges. Such a simple task for the code to be so horrible and lengthy. In C it would look rather like this:
 bool ranges_are_overlapping(int a, int b, int c, int d){
     // ... hundreds of lines of nested conditionals, excerpted below ...
 }

It quickly became clear to the most casual reader that the ranges were a..b and c..d. Well, sort of. The code was defensive and was written with the understanding that the range-defining arguments could be somewhat unordered:
 if ( (a<b) && (c<d) ) {
    // blah blah
 }
 else if ( (a==b) && (c<d) ){
   // blah blah
 }
 else if ( (a>b) && (c<d) ){
    // blah blah
 }
 else if ( (a<b) && (c==d) ) {
    // blah blah
 }
 else if ( (a==b) && (c==d) ){
   // blah blah
 }
 else if ( (a>b) && (c==d) ){
    // blah blah
 }
 else if ( (a<b) && (c>d) ) {
    // blah blah
 }
 else if ( (a==b) && (c>d) ){
   // blah blah
 }
 else if ( (a>b) && (c>d) ){
    // blah blah
 }

Now, of course, even once we square away the range markers, there are a whole host of possibilities. A could be less than, equal to, or greater than C, and B could likewise be greater than, less than, or equal to D.
     if ( (a<c) && (b<d)) {
        // Lets call this "inner block"
        if ( b > c ) {
        }
        else if ( b == c ) {
        }
        else if ( b < c) {
        }
        // And on we go ....
     }
     else if ( (a<c) && (b==d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a<c) && (b>d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a==c) && (b<d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a==c) && (b==d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a==c) && (b>d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a>c) && (b<d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a>c) && (b==d)) {
       // cut/paste/edit "inner block" from above
     }
     else if ( (a>c) && (b>d)) {
       // cut/paste/edit "inner block" from above
     }

Of course, each permutation of the range start and end relationship would require a full repetition of the comparisons of the range starts and ends complete with cut-n-paste-n-edit inner blocks. We do not attempt to recreate the entire mess here.

I bet the programmer responsible for this routine wrote more lines of code that day than anyone else on the team. What he lacked in efficiency, he made up in diligence.

The code was correct in its results, but it was huge and slow and tedious to desk-check. It had no tests, automated or otherwise.

I replaced it with the rather ordinary and obvious solution:

 bool ranges_are_overlapping(int a, int b, int c, int d){
     int left_start = min(a,b);
     int left_end = max(a,b);

     int right_start = min(c,d);
     int right_end = max(c,d);

     bool distinct = (left_end < right_start) || (right_end < left_start);
     return !distinct;
 }

Straighten out the begin/end so you don't need a Cartesian explosion of if statements, and then look at the ranges. When either range ends before the other starts, there is no overlap. Otherwise, you have overlap.

It is hardly a miracle of profound logic or deep mathematical insight. Nor of excessive typing and careful editing.

Young Tim actually had to work that out on paper. It worked great, was much smaller and simpler, and ran much faster than the ugly monstrosity that was there before.

The reason I'm writing this is not to express that I'm revolted by bad code or to brag that I replaced it with good code. It is not to ridicule diligent-yet-misguided junior programmers. The point of this blog is what happened next.

What Happened Next

I was called into the boss' office. Someone noticed my code improvement! He showed me the old code and the new, both printed out in neat stacks on his desk. I grinned and said "Yes, it's much faster now."

He didn't smile back. He frowned.

He told me that it was clear from the original code that the author had thought through all of the possible scenarios and had accounted for them. Mine, on the other hand, showed no such diligence. I clearly had not put any real effort into my work. My answer was too small. Even though they couldn't make it fail in testing (yet), he knew that I must have left something out. As a result of disbelief in my abilities, he was rolling back my change.

I was upset that a small working algorithm was about to be tossed out and a messy steaming pile of code put back in. I was more upset with the accusation that I did not think through the cases, and that (despite all evidence to the contrary!) I had written an insufficient piece of code.

I tried to defend my solution, even pulling out my scrap paper with all the overlap scenarios on it, but it was too late. The decision was made. I and my solution were obviously inferior. Chastised, I returned to work at my desk.

I began to lose interest in the company where I had previously intended to spend my entire career. In time I left and have had a good time of it.

The Payoff

Why blog about it in 2009? Because my TDD associates have blogged and tweeted about how, in TDD, the code becomes more generic as the tests become more specific.

If the Tim of 1994 had known about TDD, he would have built up a large and specific base of tests covering all the cases represented by the old code. His tests would have demonstrated that he'd thought through all the permutations, and the simpler solution might have been fielded. Tim would have been saved a moment of humiliation, and that poor application would have gotten a boost in performance.

TDD would have made the code better, and it would have improved the experience I had there. It would have given me visible evidence of my thinking. With a body of unit tests, there is proof that we've thought things through. An oblique, small solution cannot provide that on its own.

Young programmers: consider this advice. The more elegant solutions you devise will need a body of proof if you are to survive clue-challenged technical managers. If you don't do TDD for the sake of the code, do it for yourself.

Tuesday, February 27, 2024

Definition-by-Dysfunction

 I've done it. You've seen me.

You've done it. I watched you do it.

We've probably argued about it.

The Defining Dysfunctions

I published a blog post some time ago on the Industrial Logic website about programming together vs programming under surveillance. It's a relatively simple piece, and it identifies a problem we have in the world when it comes to just about any technique or discipline.

When I suggested that people mistake group programming for working under surveillance, an incredulous reader exclaimed “How could it possibly be anything else!?”

So here's the thing: instead of actually researching what pair programming is and how it works, a person just sat down at a keyboard with another person and tried "doing pair programming" without any pre-study or preparation. They ended up with one person bored, watching the other program.

This is a widely-known dysfunction or "failure pattern" known as "Worker/Watcher." It's not how pair programming is done. 

The two people who had this one unpleasant, uninformed, wasteful experience came into it with little more than a sound bite ("two people coding together"), guessed at how it was done ("one person types"), and had a poor experience. 

Since they didn't start with a good definition of pair programming, that experience became the definition of pair programming. 

They had a defining dysfunction. 

Unless something new happens, they will forever see pair programming as a wasteful and pointless practice.

Is that the real definition of pair programming? 

Is that what it's about and how it's done? 

Not remotely. But it is the one touch-point they have -- the one experience they have had of it -- and "pair programming" is the name of that experience now.

Do they want to try it again? Clearly not. They know what it is, and it's nonsense. 

They will likely take to social media to decry the BS that is pair programming and save everyone else from this wasteful and unprofitable behavior.

You and I have likely done the same. 

Give those complainers some credit, because at least they tried it (or something that they imagined was it) first.

Some Examples Might Be Useful Here

I ranted against scrum™ for years, because I let the dysfunctions I routinely saw become the definition of the process for me. If anyone were to actually try doing scrum™ it would be a pretty good way of working, but nobody does.

Sadly, a lot of people in that space have adopted the defining dysfunctions. As far as they are concerned, the "right way" to do scrum™ is to have a titled person assign individual work tickets to developers, who strive to serve the maximum number of tickets per fortnight to raise the velocity of the team -- striving to do "twice the work in half the time" (a soundbite). This isn't remotely what the defining document of the scrum™ method describes, but it is what is often taught as "doing scrum."

Are you an agile hater because it's all Jira tickets, meetings, estimation, and work crammed in to meet artificial deadlines? I'd hate that too. I do hate that. It's just not agile. It's not even scrum™.

Do you hate agile because agile is "no documentation," "no design," and "no estimates"?  Well, that's a worthy distaste. It's also a mischaracterization.

Do you hate TDD because you have to write all the tests first, and you don't even know what shape the answer is going to take, so it's impossible? Well, that's a bad process, and it's not TDD. It's not the definition of TDD, it's just a failure mode.

Why Bother Trying?

Sometimes people don't even have to have a bad experience to adopt a defining dysfunction. They hear a sound bite or title and imagine dysfunctions. They (efficiently) go straight to disdaining the practice based on their imagined defining dysfunctions.

Do you think that Psychological Safety means you can't ever say anything that might possibly upset someone? Does Radical Candor mean that you can say whatever you want without consequence? Wrong, and wrong. Those are defining dysfunctions built from guesses.

If we only guess at a discipline, a philosophy, or a behavior, and don't actually bother to investigate what it's intended to be, what it really means, and how people actually perform it, then we don't have the basis for an honest opinion. If we have only experienced it as a bad attempt at a good idea, we haven't formed a valid opinion.

We're Too Smart For That!

Have you jumped to conclusions based on naive attempts or imagination alone?

I'm betting we both have.  It's a human thing, and we're all human. It doesn't much matter how smart or experienced you are -- you've done this. I might just be projecting, but I've seen too many examples to believe it to be less than universal. Please prove me wrong!

I'm willing to bet that you've done so this week. Let's both look out for these mistakes, because I'm betting that we could be more successful at many things if we took the time to understand them.



Monday, April 2, 2012

Really TDD-ing: Less simple than you think


TDD is a pretty simple system, as popularly described.  For an individual doing a kata or playing with some example code, it is really just three steps -- red (write a failing test), green (make it pass), refactor -- as pictured on card #44 of Agile in a Flash.

It's the right way to think about TDD, and the right way to get started.

When you get to a real project in a team environment you have many other issues to consider.  In a modern team environment, with a DVCS, it looks more like this:
  1. Get the current code (git pull, git clone, hg pull -u, whatever)
  2. Run all the tests and be sure they're all really green, to avoid blaming yourself for unrelated breakages.
  3. If there are no tests, write a trivial failing test ("assertTrue(false)") to prove your testing setup (IDE, scripts, makefiles) really works. Then delete the trivial test.
  4. Write one "real" red test. INSIST ON RED.
  5. Write code to turn it green.
  6. Local commit "save your game"
  7. Refactor (or intentionally choose not to).
  8. Check to be sure you're still green.
  9. Local commit.
  10. Push your change to the shared repo (or have a darned good reason not to).
  11. Start from (1) again.
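Steps 3 through 5 might look like this in Python, pytest-style. (The `Account` class and its `disable` method are illustrative names, not from any real project.)

```python
# Step 3 (throwaway): prove the harness runs. This test must fail; then delete it.
#     def test_wiring():
#         assert False
#
# Steps 4 and 5: a "real" red test, then just enough code to turn it green.

class Account:
    def __init__(self):
        self.active = True

    def disable(self):
        self.active = False


def test_disable_deactivates_account():
    account = Account()
    account.disable()
    assert account.active is False
```

Run it with pytest; insist on seeing the test fail before `disable` exists (step 4), then make it pass and commit (step 6).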
The first two steps are really important, because our peers may have checked in code without being sure that all the tests pass. We don't always work with perfect Agile coworkers (especially in a company with multiple teams, or one in transition, or both).

If you don't commit on either side of a refactoring (steps 6 and 9), you can quickly end up with no recourse but to revert, losing minutes or hours of work. Because the hard part of programming is the learning and thinking, losing code is really a triviality, but it isn't fun to retype code you have already built once. It's good to have a known "green" state to return to, and the more recent the better.

There is more to the flow than I listed, though. If you get a red test, you have to back up a few steps. If you did anything significant, you will want to push to the main code line after each refactoring and possibly after each green bar (steps 9 and 6, respectively). 

It's often wise to put the "real" code to turn the test green (step 5) into your test class initially, and then migrate it to the production code during the refactoring break (step 7). It keeps you from having to bounce around between a bunch of files and tightens your cycle considerably.

Since you are pair-programming -- as sensible people do -- you might switch pairs at every red (step 4) or every green test result (step 5). That's a small ritual that keeps everyone's head in the game. 

If you have a messy code base and/or a refactoring that touches code in a great many places, you may want to do the refactoring (step 7) in the fresh main line code, and then pull the refactoring back to your local copy. Trying to keep large refactorings in a branch will cause a lot of integration pain. Best to get it to everyone soon, and get it over with.

If you use feature branching (God help you) then everything gets harder to do, and you'll feel a pressure to avoid pulling code updates and rerunning tests (steps 1 & 2). Avoiding them makes it worse. The longer you're off the main code line, the worse all your merges will be. Get back on soonest.

If you don't have a DVCS (again, God help you!) then it all gets rather messy because non-distributed version control means you can't have a local commit. Some people say that's better, but I say it ain't*. A remote commit takes a lot more time than a local savepoint, and having dozens of them at once, every minute or two from every pair, can drag a reasonably good server to its knees.



* Apologies to Billy Joel, and of course my 4th grade English teacher.


Monday, March 14, 2022

What does Tim have against "private" methods?

A big thread erupted, full of misunderstandings and miscommunications, about the idea of "private" methods.

It all started quite innocently (I maintain) when someone asked how we felt about testing private methods. Some people jumped in with "Absolutely Not! Never! That's wrong! Test via public interfaces."

I thought a little longer, and said "I'm not sure" and then later "I'm not sure that 'private' is even needed."

This is where the problems started, and maybe here I can clarify what I meant by it all. 

People assume (and insist) that I could only possibly mean that they should substitute 'public' for all protected and private members, polluting the interface, and inviting the violation of a class' internal state. 

That was never my intention, and it still isn't. Still, this is what people insist that I must have meant from the start. I suppose this is because that's what they imagined me to mean and it's hard to admit that you're wrong, or perhaps because this is social media, where people never miss an opportunity to double down and pile on. It's like a sport.

It's also entirely possible that I didn't communicate well at all in short tweets. Maybe that's all on me. I can't say.

Well, you've come this far, so let's see if I can't communicate better in blog than in twitter, since I can write in peace without people accusing me of meaning the wrong thing while I try to explain what I actually mean.

If I've managed to make sense with this post, let me know. 

If I've not, the forum is open for clarifying questions below.

Autocomplete is a big deal.

The good thing about 'private' and 'protected' -- the thing that actually makes programming easier and better -- is that the methods aren't offered by auto-complete. The idea of locking people out of calling the method? Not even remotely a second-degree concern.

People program by autocomplete. You type the name of a thing, press the dot, and you get a list of things you can do. You may never look at the documentation, or read the example code, but you will see the autocomplete hundreds of times a day.

When you want to disable an account, there is likely to be a 'disable' method attached to an account. You choose that method, run the tests, ship it. That's how you normally will navigate.

When you don't see a method you want, you might pop up a level and use the module or some management API interface. Again, you type the name and a dot, and you look for 'disable' in the recommended names. 

Private and Protected are two of the ways to keep functions out of that list, and if they're not in the list, you're not likely to even consider calling them.

Arbitrarily Located Functions

A developer is implementing a class and realizes that they need to compare date ranges. They write that up, drop it into a private method, and all is well.

Or is it? 

Why is that method located in this class? Does it have high cohesion? No. Does it support the purpose of this class directly? No. 

Why then? Because this class calls the function, and this is the file I was editing when I wrote the function.

In other words, the function doesn't belong here according to any rule of design, but it wasn't found anywhere else either. The developer drops it into the class with 'private' and moves on.

What I've found is that often the private methods in one class are repeatedly reinvented in private methods of other classes. Sometimes they're even copied and pasted directly from one class to another.

Why? Because this is the class they had open in the editor when they realized they needed the function, and it's not available for calling because it's private in the other class.  So they copied it.

Arbitrarily located functions, hidden from view, repeated by invention and copying, likely hiding the opportunity for Single Point Of Truth (SPOT) abstractions and libraries that would make everyone's job a little easier.

This isn't a rare occurrence. 

By being hidden and arbitrarily located, they invite duplication and reinvention. 
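As a sketch of the alternative (module and names hypothetical): the date-range comparison from the example above becomes a public function in a small utility module -- one place to call, one place to test.

```python
# datetools.py -- a hypothetical shared utility module (a Single Point Of Truth)
# instead of a private method buried in whichever class was open in the editor.
from datetime import date


def ranges_overlap(start_a, end_a, start_b, end_b):
    """True if the inclusive ranges [start_a, end_a] and [start_b, end_b] share a day."""
    return start_a <= end_b and start_b <= end_a
```

Being public and placed where it has cohesion, it can be found, reused, and tested directly, e.g. `ranges_overlap(date(2022, 1, 1), date(2022, 1, 31), date(2022, 1, 15), date(2022, 2, 15))` returns `True`.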

Testability of Private Methods? None

When we talk about testing private methods, it's a non-starter. If you put the word 'private' on the front of a method in most languages, you cannot call that method from a test. That means you can't possibly test private methods.

Okay, there is a way, but it involves reflection and that's an obscenity. If you ever use reflection to crack into some legacy code and test hidden methods, make sure you remove that hack as soon as humanly possible (or sooner) because it's a travesty. It's awful. It will fail you if the method is renamed or moved. Reflection is generally a bad idea, and I wouldn't do that (much, or for long).
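In Python, the nearest equivalent of that hack is cracking name mangling. A sketch (hypothetical class) of just how fragile it is:

```python
# A hypothetical class with a name-mangled "private" method.
class Legacy:
    def __total(self):
        return 42


# The test must hard-code the mangled name as a string. Rename the class or
# the method and nothing warns you -- this just starts failing at runtime.
hidden = getattr(Legacy(), "_Legacy__total")
assert hidden() == 42
```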

If you are doing TDD (and why wouldn't you?) you test the public methods, and if they're using private methods then the private methods are being exercised. If you have meaningful tests and assertions, then the private methods are being tested through the public interface and that's probably okay.

If you're not doing TDD (and why not?) then you may not have code that tests all the private behaviors. You may have to read all the code in order to exercise it properly, which is a pain.

If you're doing test-after development (but why?), you've written the code and now you have to write tests around it. Every private method is called from some public method, and you're going to have to figure out how to get down into that private method's code and recognize whether it's working correctly or not.

By being private, those methods have cut off the easiest route to testability -- calling the method directly.


Should You Test Implementation?

You should: implementation is how code behaves and you have to test behavior.

There are caveats here:

  • If you're structure-aware in your tests, they'll resist structural refactoring
  • If you're time-sequence-aware in your tests, they'll resist time-order refactoring
  • If you're using reflection, you're poisoning our future and must be stopped (half-grin)

There are rules, of course. You don't lock down things that you want to change, and you don't leave unspoken the things that must be true. Your tests specify how your world behaves at a certain level.

Sometimes that's too high a level and there are mini-behaviors you need to check too.

I've seen cases where testing only at the interface required hundreds of tests, but testing at the internal level required barely a dozen. This is because at the higher level you need all the combinations of all the paths of all the subordinate functions; at lower levels, you need one test for each path in a function.
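A hypothetical sketch of that arithmetic: two small helpers with three paths each, combined by one public function.

```python
def shipping_band(weight_kg):          # 3 paths
    if weight_kg < 1:
        return "light"
    if weight_kg < 10:
        return "medium"
    return "heavy"


def speed_band(days):                  # 3 paths
    if days <= 1:
        return "overnight"
    if days <= 5:
        return "standard"
    return "economy"


def quote(weight_kg, days):
    return f"{shipping_band(weight_kg)}/{speed_band(days)}"

# Through quote() alone, covering every pairing takes 3 * 3 = 9 tests.
# Testing the helpers directly takes 3 + 3 = 6, plus one wiring test of quote().
# Add more helpers or more paths, and the multiplication gets ugly fast.
```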

High-level end-to-end and integration tests take a long time to run, and microtests are cheap and fast.

We tend to do microtesting (or unit testing) so that we can afford to run the tests in a tight TDD cycle. 

To argue against testing implementation functions is to argue against unit testing and microtesting on principle.

If you should only use the most-public interface, then shouldn't you only test at the UI level? Yeah, this is not a good idea. 

So how can we get at the lower-level functions? If they're all private, we can't; but it is totally possible if they are public methods on lower-level classes that are composed ("encapsulated") by the API classes. 


Interfaces and Implementations

In many languages, you can declare a public interface, and that interface can be implemented by other classes.

It's accepted that one should always program to the interface in those languages. This way, one can use only the declared, public interface of the implementations. This keeps the implementations substitutable (per the Liskov Substitution Principle). 

Non-interface methods of the implementers are already hidden: callers have no access to them via the interface. 

If you're using the interface, and you press the dot, then only methods in the public interface are presented. The other methods of the implementer may as well not exist, because they are not available here.

If you have segregated interfaces, this is even nicer. It may be that two or three public interfaces are the right way to think and design interactions, but combining a few of those interfaces together is the better way to implement the behaviors. No user of one interface has any awareness that the other interfaces exist on the object, nor of the public methods of the other interfaces, nor any of the public methods of the implementers.

In order to reach the public methods of the implementer -- or even to know that they exist -- a caller would have to down-cast from the interface to the implementation (a no-no that justifies a sharp crack across the knuckles with a yardstick). 

This is convenient. This means that all methods of the implementers can be public without exposing those methods to callers.  Since they're public, you can write tests directly against them. Because the implementation is being tested also through the interface, it's easy to ensure that it behaves as a whole implementation.

Why not make them private? Because it's redundant and limits testing. 

Without polluting the public interface, one has a fully open and testable class.
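Here is a minimal sketch of that arrangement in Python, using the `abc` module (all names are illustrative):

```python
from abc import ABC, abstractmethod


class AccountActions(ABC):             # the declared public interface
    @abstractmethod
    def disable(self): ...


class DatabaseAccount(AccountActions):
    """Implementer: its extra public methods never appear on the interface."""

    def __init__(self):
        self.active = True

    def disable(self):                 # interface method
        self.write_back(active=False)

    def write_back(self, active):      # public and directly testable,
        self.active = active           # but invisible through AccountActions


def close_out(account: AccountActions):
    account.disable()                  # callers see only the interface's methods
```

Tests can call `write_back` directly on a `DatabaseAccount`, while code written against `AccountActions` never sees it.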


Thinking a little more deeply

Private methods, when we choose to use them, may signal a need for us to think more deeply about our situation and strategies.

  1. In a Rat's Nest:
    Sometimes people are doing test-last (after coding) and they have a complex method. You shouldn't have big, complex methods if you have any other choice. To try to manage the complexity, they may extract some private methods.
    In order to test through the API, you have to navigate the rat's nest. You will have to set up deep data structures and some combination of boolean conditions that will allow you to get to the private method's function call, and then to exercise the code inside that method.
    It's easier to make the method's access less restricted and test it directly than to have dozens of long and complicated tests. Perhaps package-private, perhaps protected so you can cheat with inherit-to-expose. 
    It's better to test the code easily than to write awful, fragile, internals-aware tests.
  2. Missed Abstractions
    Where one class has a bunch of private methods, often there is some cohesion between those methods. If one were to set the private methods side-by-side, one might recognize that there have been missed abstractions.
    Maybe some of the methods are generic string, date, or math functions. These could be public methods in a utility package or more-primitive type. If you move them to the "right" place, they can be fully testable and can be reused within the codebase. They would not be public methods of the class you're working on.
    Perhaps some of those represent a lower-level concept that is munged into the current class. If they were pulled out, they could become testable, public methods on the new class. They would still not appear in the public interface of the class you're working on.
  3. In Languages without Private
    In Python, Smalltalk, and similar languages there's no 'private' and we've been okay with that for a long time (since the '70s for Smalltalk). 
    While people say that encapsulation is a core feature of OOD (and it is) it isn't the "private" keyword that causes encapsulation. It's done via composition.
    In a Python module, you can choose what classes you expose and which you don't. You have a class with a public API, which is composed of classes/functions that aren't part of the public API. They're tested directly within the module and don't have a "private" keyword associated with them. They might not have underscore-decorated names or any other semblance of access protections. 
  4. Other APIs
    In any language, the Model class or the API class may have a simple interface into the module (as described above) but the module may have a lot of complicated functions and logic divided into multiple classes and functions. 
    The model class doesn't have any 'private' methods at all -- it just calls the public methods of other classes in the module. Those classes have copious tests and may not have any methods declared private or protected, because only the API and the tests call those methods -- they're not visible to the outside world (protection from Hyrum's Law).
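A minimal Python sketch of that arrangement (module and names hypothetical):

```python
# pricing.py -- hypothetical module; only Pricing is meant to be imported.

class TaxTable:
    """Internal collaborator: every method public, every method directly testable."""

    def rate_for(self, region):
        return {"EU": 0.20, "US": 0.07}.get(region, 0.0)


class Pricing:
    """The module's public API, encapsulating by composition."""

    def __init__(self, taxes=None):
        self.taxes = taxes if taxes is not None else TaxTable()

    def total(self, net, region):
        return net * (1 + self.taxes.rate_for(region))

# Tests inside the module exercise TaxTable.rate_for directly; outside callers
# autocomplete on Pricing and never see TaxTable at all.
```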

So What Do You Do Instead

  • Don't pollute the public interface of a class with private methods. If you simply flip the private methods to be public, you'll lose understandability of the interface and create unwanted dependencies. This is what I recommend not doing. Keep interfaces clear and clean.
  • Move your methods to the places where they belong (where they have cohesion) and can be easily found by others.
  • Try not to need private methods. They should be rare. Remember, many OO languages don't even have the concept of private and they're just fine without them.
  • In some legacy-code, test-after situations, you might raise the accessibility of a method to public, protected, or module-private in order to support refactoring via better tests (in the short term).
  • Consider using encapsulation properly: by composing behaviors under an API, rather than by housing all behaviors in the API's class.
  • In some languages, you have 'interface' or 'protocol' classes that declare an interface to use. Do that when there is a clear public interface. 
And, of course, I would be remiss if I didn't admit that I do sometimes create private methods. I try not to, and when I find a better way than settling on private, I am usually happier.  I sometimes leave a little cruft like private methods until I have more information and can see a better design form; it may take weeks or months. 

So yeah, there are private methods in my code. I just don't see that as "good coding" and a "solution." It's temporary.

Thursday, May 7, 2009

Top-down/Bottom-up

I always find I do a better job if I top-down my TDD work. I have a good context, goals are clearer, and the strategy doesn't get lost in the tactical work. It's such a good way to work.

On the other hand, I find myself constantly battling the urge to bottom-up the solution (and losing fairly often). When I bottom-up the work, I risk having a solution that doesn't really work in context and I have more rework. So what is the pressure that makes me want to bottom up?

I think it's because TDD is easiest when you are closest to the unit you're testing. It is easiest to isolate, and the tests are smaller and clearer than at higher levels. In addition, you never have a higher-level red test to ignore (or comment out, as I've seen people do) while you get a lower-level test working.

I also think it has to do with the human "optimization drive" to skip all the navigation and jump to the file we think we'll have to change anyway. Maybe this is stronger in technical people, maybe not. It often works out just fine, but sometimes not.

I think I want to optimize away the "sometimes not".

Wednesday, December 31, 2008

Quality Rant

Men more brilliant than ourselves have tried for decades to get the idea of quality across to businesspeople and tradespeople (including software craftsmen) and have had only very limited success.

TDD is just another grandchild of Quality.
http://en.wikipedia.org/wiki/W._Edwards_Deming#Deming.27s_14_points

Deming's points didn't stop applying just because we're in software. The excuse "Our problems are different" was specifically listed as an obstacle to real improvement.

But still we feel we have to justify the desire to increase the quality of our products. Does this seem silly? I spend far too much time trying to get people to build quality in via TDD and JIT inspection (AKA pairing) and collaboration, and still they feel that this is slowing them down.

"We all *know* the sun circles the earth, because it rises in the east and sets in the west. Stupid heliocentric theory is a fun pastime for intellectuals, but doesn't work in the real world."

My vent for the day, from very real frustrations.

For the lazy, busy, or browser-impaired:

* Create constancy of purpose toward improvement
* Adopt the new philosophy
* Cease dependence on inspection
* Move toward a single supplier for any one item
* Improve constantly and forever
* Institute training on the job
* Institute leadership
* Drive out fear
* Break down barriers between departments
* Eliminate slogans
* Eliminate management by objective
* Remove barriers to pride of workmanship
* Institute education and self-improvement
* The transformation is everyone's job.

Thursday, December 1, 2016

TDD: Start With A Failing Test

The question was asked:
In Test-Driven Development, what does it mean to start with a failing test? 

This is not a complicated question, so let me give the short answer:

  • Write a test that can't possibly succeed because you have not yet implemented the feature; but which would succeed if the part of the feature it's testing were written.
  • You want it to be a good test: clear, obvious, simple, discrete.
  • You want it to fail, so you see what the error message will look like -- whether it will provide enough information when someday it fails unexpectedly.
  • Then you write enough of the feature that the test passes (but not the whole feature).
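A minimal illustration (the `slugify` function and its names are hypothetical):

```python
# The first challenge: this test could not pass before slugify() existed --
# running it produced a clear NameError (red).
def test_slugify_lowercases_and_joins():
    assert slugify("Hello World") == "hello-world"


# Just enough code to beat the challenge (green) -- not the whole feature:
def slugify(title):
    return "-".join(title.lower().split())
```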

The idea is like a video game. You write a test, which is your first challenge. Then you beat that challenge and save your game (to version control) so you can come back. You layer on the challenges until you've beaten the game (written the feature).

There is more to the TDD cycle, but this is enough to answer the one question.

BTW, the same "accumulation of phases that work" is the preferred approach to writing stories, which add to features... the whole world of test-driven is about thin, tested, integrated slices continually being built and integrated.

Wednesday, July 9, 2025

Who (the heck) Am I?

I'm Tim Ottinger.

You may already know me.

I'm a long-time developer, agilist, and XPer: a specialist in CI/CD, teaming and ensembles, TDD, and software delivery in general.

I wrote the second chapter of Clean Code.  I am the originator and co-author of Agile in a Flash with Jeff Langr, and I wrote Use VIM Like a Pro.

I'm mentioned in the acknowledgements of many other books on code craft, agility, and design patterns, because I've been heavily involved in the tech community and have reviewed and edited their pre-publication materials.

I worked with "Uncle Bob" Martin at Object Mentor (twice). With his company, I taught physicists at Stanford Linear Accelerator Center how to improve software design for high-energy physics. I worked with companies like Caterpillar, Xerox, and many others.

I worked with Ian Murdock (who founded the Debian Project) at his company, Progeny Linux Systems. We created software to aid in the creation of custom commercial Linux distributions.

I built industrial balancing machines with ITW.

I have worked with Industrial Logic (Joshua Kerievsky's company) in the USA for the past 14 years. We helped companies create their flagship products. We worked with companies in oncology, heavy manufacturing, agriculture, insurance, respiratory health, and others. I've designed and delivered training courses. I've helped transform legacy code. I've coached software development executives.


I'm known as "the agile otter" (a play on my connections with XP and agile practices, and a play on my last name).


I have taught modern software practices and code craft to thousands of people around the world, including the USA, Canada, India, China, Germany, Hungary, Poland, Norway, Australia, and other countries. I enjoy international travel and love serving software organizations, whether large or small. 


I'm particularly well known for code craft, TDD, CI, CD, and refactoring.

In recent years, I've worked to bring Lean flow to organizations, helping them reduce the unpredictability, delay, and frustrations of typical delivery processes. This has been rewarding and fascinating, and will continue to make a big difference for the companies I've served.

I'm living in Scotland. I speak one human language and many programming languages.

Friday, March 20, 2009

First glance: SPE and TDD

SPE is a pretty nice python editor. I'm using Ubuntu Intrepid Ibex's default version (0.8.4) and have had pretty good results most of the time. It has nice features like functional auto-completion, file navigation, debugging, UML generation, session history, shell window, tab and syntax checking, etc. Many of these features, sadly, I don't really need. I probably need a bigger project so I can really appreciate the UML and documentation generators. In today's little app, it hardly matters.

Overall, it is a very reasonable full-featured editor for python programmers.

SPE 0.8.4 suffers from the same malady many of the others suffer from: no real support for TDD. I can switch to a script and run it with Ctrl-R (after making sure I do a save on each file that has changed).

I would like to see the next version contain a Save-All feature and a nosetests runner. It would be even cooler if the nosetests runner would fire up automagically after a save or save-all or else invoke save-all on the way to running the tests. That would be so sweet.

I would give a more thorough review, but I'm learning the pmock library while learning SPE, and that's already taking more time from actual programming than I'd like.

Friday, April 24, 2009

Cool TDD Antipatterns Appearance on Stack Overflow

There is a cool outgrowth from James Carr's collaborative paper on TDD Antipatterns. There is a Stack Overflow article where you can vote to determine the most prevalent of the ugly smells.

I was a little pleased to see "Free Ride" as the number one the last time I looked. That's just pride of ownership; I'm not glad it's common.

Monday, July 21, 2025

Who Are You To Say What's Right?

 This is probably too snarky, but bear with me:


If I fill my fuel tank with non-fuel or the wrong fuel, it will damage my car and fail to perform.

Is that because fueling the car is a bad idea?

  • If fueling a car is a good idea, shouldn't there be a million ways to do it so that it's easy for people? This "petrol only" rule is too restrictive to be useful in the real world.
  • Every time I put diesel in my petrol vehicle, the mechanic pulling my gas tank and cleaning the fuel lines tells me I'm doing it wrong. Don't they realize there's more than one way? Just because my way consistently fails doesn't make me wrong!
  • Why should all of those elite gate-keepers put down people who choose their own fuels? Why are they so petty and close-minded?
  • Don't you think that people should get to choose to put whatever they like in the gas tank? This top-down mandate by car makers is undemocratic! It's authoritarian! Where is the psychological safety?
  • Who are you to tell me what fuel to use? You're no better than I am, just because you're lucky enough to afford a car that runs every day!
  • My cousin put petrol in the petrol tank and got a flat tire on the same day. Your petrol causes flat tires! What do you say to that!?

Look, it doesn't matter what we're talking about here: If you do it poorly, you will probably fail.

The correlation is quite high.

Is it possible that you could do it correctly and still have issues? 

It is more likely that your issues are not related to the thing you did RIGHT, but to the things still not done well.

If you are going to change the spark plugs, don't claim the freedom to put the plug cables back in any order that you find aesthetically pleasing. Your car won't work if you do. There is a right way and a wrong way.

If you are making a cake, you might want to use baking ingredients and not gardening supplies. You might have some latitude in technique and proportion, but if you mix dirt and fertilizer and put it out in the sun, what you're doing is not baking a cake. That's gardening. You can't expect to serve it with ice cream at your child's birthday party to the acclaim of the assembled children.

Why is it that some people have done TDD for decades and it has always worked for them, and your first attempt for a couple of afternoons in 2021 didn't turn out?

Is it because TDD doesn't work, and they're all a bunch of liars, or because your attempt was inexpert and maybe didn't employ the same strategies and approaches that they would have?