Monday, January 15, 2024

Fundamentally Wrong

The Problem

Several friends, some critics, and some people who are both have shared an article with me. It claims that TDD is fundamentally wrong and that test-after development is better.

To be fair, the process described here is fundamentally wrong:



The problems with step #1 would indeed lead to the wasteful problems described below it. The recommendation here would certainly be better than the process described above:



TDD Is Fundamentally Wrong is Fundamentally Wrong


Now, the problem with this article is more fundamental than the problem being described.

TDD does not mean "Write all the tests then all the code"

It has never meant that.

That is not TDD.

That is some other misbegotten travesty that has no name.


This is the fifth or sixth time I've heard anyone describe TDD as writing all the tests first. In all cases except one, it has been described by people who self-describe as being anti-TDD, and who write articles decrying the foolishness that they identify as TDD (which is not TDD).

I have never seen anyone do TDD that way -- even unsuccessfully.  I have never seen anyone even try to do TDD that way. I would never sit by while someone tried to do that and called it TDD. That's simply not the process.

The one time that I read an article that actually recommended doing that, it was from a Microsoft publication early in the XP/Agile days. The public outcry was great and sudden, and the article was retracted. I don't think I've ever seen anyone else recommend that approach,  because that approach is so obviously flawed and was never the process used by original XP teams.

So What Is TDD?


TDD was originally described as a three-step dance that repeats over and over as one develops code.

You can take that straight from the horse's mouth (so to speak):



To these three steps, we (at Industrial Logic) added a fourth and final step.  Others may also have independently added this 4th step, I don't know.

  1. Write a test that does not pass, and in fact cannot pass, because the functionality it describes does not exist.  This failing test is the RED step, so-called because test-running programs generally produce the results of a failed test run colored red.
  2. Write the code that passes the test.  When the test passes, it is typical that the test-running program will present the results colored green, so this is often called the GREEN step. The code may not be optimal or beautiful, but it does (only) the thing the test(s) require it to do.
  3. Refactor the code and test so that both are readable, well-structured, clean, simple, and basically full of virtue (so far).  Refactoring requires the presence of tests, so this way we can refactor as soon as the code passes the tests, rather than waiting until after all the code for our feature/story/task is finished. We refactor very frequently.
  4. Integrate with the code base. This will include at least making a local commit. Most likely it will be a local commit and also a pull from the main branch (preferably git pull -r). More than half the time, it also includes a push to the shared branch so everyone else can benefit from our changes and detect any integration issues early.


 We repeat this cycle for the next test. The whole cycle may repeat 4-10 times an hour.


We do 1-2-3-4-1-2-3-1-2-3-4, we do not do 111111111-222222222-44444-(maybe someday)333.  These are not batched.
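To make the rhythm concrete, here is a minimal sketch of where a small piece of code might stand after two trips around the cycle. The leap-year problem and the name is_leap_year are assumptions chosen purely for illustration; they do not come from the article under discussion.

```python
import unittest


# Cycle 1, RED:   test_year_divisible_by_4_is_leap was written first and
#                 failed because is_leap_year did not exist yet.
# Cycle 1, GREEN: `return year % 4 == 0` was enough to pass it.
# Cycle 2, RED:   the century test failed against that implementation.
# Cycle 2, GREEN: the century condition was added, and nothing more.
def is_leap_year(year: int) -> bool:
    """Hypothetical function under test, as it stands after two cycles."""
    return year % 4 == 0 and year % 100 != 0


class LeapYearTest(unittest.TestCase):
    def test_year_divisible_by_4_is_leap(self):
        self.assertTrue(is_leap_year(2024))

    def test_century_is_not_leap(self):
        self.assertFalse(is_leap_year(1900))
```

Note what is deliberately absent: the 400-year rule. No test has asked for it yet, so the code does not do it. That would be the next trip around the cycle.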

Was It A Misunderstanding of the List Method?

Some people misunderstood Kent Beck's List Method, in which you begin with a step 0 of writing down a list of the tests you think you will need to pass for your change to be successful (see the screen shot and link to Kent Beck's article). 

Note that you only make a list of tests. You do not write them all in code.

As you enter the TDD cycle, you take a test from the list. That may be the first test, the easiest test, the most essential test, or the most architecturally significant test. You follow the 4-step (or 3-step) dance as above. 

If you realize a test is unnecessary, you scratch it off the list. Don't write tests if the functionality they describe is already covered by an existing test.

As you make discoveries, you add new tests to the list. That discovery may lead you to scratch some unwritten tests off the list. That's normal.

Eventually, you will note that all the tests on your list have been scratched out. Either you implemented them, or you realized they're unnecessary. This applies to the tests you discovered as well as the original list.

You've done everything you can think of doing that is relevant to this task, so you must be done. This is doubly true if you have a partner also thinking with you, or even more certain if you have a whole ensemble cast working with you.

You never had to predict the full implementation of the features and write tests against that speculative future state.

It's a tight inner cycle, not a series of batches.

Do I disagree with the article, then?


Indeed, the "write all tests first" approach would only work for the most trivial and contrived practice example. It would never suffice in real work where 11/12ths of what we do is learning, reading, and thinking.

As far as the process being unworkable, I totally agree.

As far as that process being TDD, I totally disagree. 

That characterization of TDD is fundamentally wrong.


Friday, January 5, 2024

Python Listicle!

People often ask me (directly, or just generally posting to some social site) how they can learn Python quickly.  

Learning Python is one of those things where one can begin quite easily and quickly, but there is some depth to the language that one will want to understand and use once one gets past the most elementary early uses. 

If you are learning from tutorials, you might want to follow along in a REPL. You can try running Python locally (see ipython and/or bpython), as a Jupyter notebook, or in Repl.it (if you want to keep your local machine Python-free for the time being).

You will probably want to install an IDE, though. There are many Python IDEs and Editors in the world, but PyCharm is the king of them all. Nothing else even comes close.

So, here are some great places to start:
  • Learn X In Y Minutes is great for experienced developers who are unfamiliar with the syntax and idioms of Python. It's all learn-by-example and is highly recommended for programmers who are exploring Python for the first time.
  • For less experienced developers, consider the official Beginner's Guide or the W3 Schools tutorial first.
  • Regardless of your level, you will want to bookmark the Official Docs which include reference material and tutorial material.
  • You will get a lot of good tips and deeper lessons from Arjan Codes on YouTube, or the many excellent lessons at Real Python. This is true whether you are an expert, intermediate, or beginner. There is a lot of content to explore, so don't try to take it all in over the course of a weekday.
  • A language without a great standard library is just a syntax. The Python Module of the Week gives some of the best in-depth exposition you can find. Definitely spend time there!
  • The Python community has created so many additional libraries at The Python Package Index. Here you can search, research, and learn about the many frameworks and libraries that make Python the best choice for so many jobs in real-world applications.
  • Every feature you'll use started as one of the Python Enhancement Proposals (PEPs). PEPs are to Python what RFCs are to the internet and the World Wide Web.  If you need to deeply understand a feature's purpose and intention, this is the place to go.
  • What's New In Python is a crucial resource for experienced developers to keep up with changes in the language. Besides being a bullet list of new features, there's some very good expository writing there and links to the relevant PEPs.

That is a lot, I know, but if you choose one of these resources according to your needs at the moment, I think you will be well satisfied. 

Wednesday, December 20, 2023

Leadership.

 I have this very simple/simplistic view on leadership.

People will become a follower if:
a) They believe the person is competent
b) They will personally benefit from that person's competence
Whether it's a minister, a businessperson, a writer, a local organizer, or a criminal doesn't matter all that much.

We like to impart character and integrity to our leaders, and we like to pretend that we chose them because of their superior traits, but it doesn't matter as much as we would like to pretend. If those were really the criteria, we would never fall for con men and tricksters.
Remember that people followed some pretty unsavory characters in the past and many do now.

I had to chew on this for years, because people can be radical followers of some pretty awful characters. Why would they be so blind to the character of their heroes?

It seems consistent now, that it's down to two factors.
The perceptions don't even have to be correct. Sometimes confidence masquerades as competence. In politics, business, religion, or social life the bombastic and dynamic personality is often chosen over the quietly competent because confidence looks like competence to the outsider.

Some people speak with such conviction that others are tricked into believing they know what they're talking about.
It always comes down to people "backing the winner" - the good/bad guy who is "on our side" and who is aligned with our interests. They're "going to win" so we want to be on their side, and benefit from that support.
Their competence will benefit us, so they're the "good guy."
That draws followers.
It defines leaders.

Wednesday, November 22, 2023

Choose your Expression: Structural Matching, IF-ELSE, and Dictionaries

 So, I have a command line utility that collects and presents some time-series data.

What it is isn't important, but dates are involved.

You can specify start dates and/or end dates with options --after and --until.  If you specify neither, you get everything.

This programming idea is not so interesting on its own, and I have multiple expressions that all work just fine. It's not a programming puzzle I am here to present.

Instead, I'm curious about which version speaks to you, which teaches you, which repulses you. 

More than that, I'm interested in WHY. 

Here is an if-then-else version:
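The listing that belongs here is not reproduced in this copy, so what follows is a stand-in sketch of what an if/else version could look like, not the original code. The function name select_records, the (date, value) record shape, the sample data, and the choice of inclusive bounds are all assumptions.

```python
from datetime import date


def select_records(records, after=None, until=None):
    """Keep (date, value) pairs within the optional inclusive bounds."""
    if after is None and until is None:
        return list(records)
    elif after is None:
        return [record for record in records if record[0] <= until]
    elif until is None:
        return [record for record in records if record[0] >= after]
    else:
        return [record for record in records if after <= record[0] <= until]


# Hypothetical sample data, for illustration only.
RECORDS = [
    (date(2023, 11, 1), 1),
    (date(2023, 11, 10), 2),
    (date(2023, 11, 20), 3),
]
```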


Here is a similar version using a dictionary:
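The listing that belongs here is not reproduced in this copy, so this is a stand-in sketch of one way a dictionary-driven version could be written, not the original code. The name select_records, the (date, value) record shape, the sample data, and the inclusive bounds are assumptions.

```python
from datetime import date


def select_records(records, after=None, until=None):
    """Keep (date, value) pairs within the optional inclusive bounds."""
    key = (after is not None, until is not None)
    in_range = {
        (False, False): lambda day: True,
        (True, False): lambda day: after <= day,
        (False, True): lambda day: day <= until,
        (True, True): lambda day: after <= day <= until,
    }
    return [record for record in records if in_range[key](record[0])]


# Hypothetical sample data, for illustration only.
RECORDS = [
    (date(2023, 11, 1), 1),
    (date(2023, 11, 10), 2),
    (date(2023, 11, 20), 3),
]
```

In this sketch, the first statement decides which case applies and the last one applies it. Everything in between is a lookup table.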



And one that uses structural matching:



Some people will naturally prefer if/else for the simple reason of familiarity. They see a lot of if/then/else logic, and so there isn't much to learn or think about. They may call it "simpler" but it is not. 

The dictionary version has more parts, but they're very simple parts, and all the actual work is done in the first and last sentence. 

The structural matching is using newer syntax, and is both less verbose than the if/else and more verbose than the dictionary version.

Which strikes you as the most valuable expression of the idea? 

Which would you rather write?

Which would you rather edit?

Which would you rather test? 

What is your reasoning behind the appeal of your chosen method (pun intended)?




Wednesday, August 30, 2023

CSS Specificity Rundown

 CSS really is fun. 

No, seriously. I'm not being sarcastic here. 

Even though I've been in software a long, long, long time, I hadn't really studied CSS before last year (shocking, I know) and so I've been behind in my training. 

I had an opportunity to dive in more, and all was going well until I started playing with media queries and ran into a specificity problem that wasn't so obvious (to me, though it may have been to you).

So, to help people who are treading the same path, here are some aids on specificity:


* W3 Schools has a fun "try it and learn" approach to general CSS Specificity

* Saucelabs specifically breaks down specificity and media queries

* Halodoc provides some best practices to avoid troubles.

* Specificity with Darth Vader and Stormtroopers and stuff at Smashing Magazine.


These should get you past the worst of your troubles nicely. If you do get stuck, then it's always nice to experiment with a local HTML/CSS document or maybe fire up a REPL.IT instance for HTML, CSS, and JS. It's all fun.

PS: Because we need it so often, here is a CSS Selector Cheat Sheet.

Wednesday, April 6, 2022

Code Smells Listicle

 After many times looking up various resources on code smells and code smell taxonomies, I finally decided to make a listicle (list article) of these. 

Enjoy:

On the "positive" side of the ledger, we also have virtues:
  • The original Code Virtues article from Pragmatic Programmers, republished at Medium.
  • The explainer article (feel free to start here) at Industrial Logic.

Monday, March 14, 2022

What does Tim have against "private" methods?

 A big thread erupted, all full of misunderstandings and miscommunications, about the idea of "private" methods. 

It all started quite innocently (I maintain) when someone asked how we felt about testing private methods. Some people jumped in with "Absolutely Not! Never! That's wrong! Test via public interfaces."

I thought a little longer, and said "I'm not sure" and then later "I'm not sure that 'private' is even needed."

This is where the problems started, and maybe here I can clarify what I meant by it all. 

People assume (and insist) that I could only possibly mean that they should substitute 'public' for all protected and private members, polluting the interface, and inviting the violation of a class' internal state. 

That was never my intention and still isn't. Still, this is what people insist that I must have meant from the start. I suppose this is because that's what they imagined me to mean and it's hard to admit that you're wrong, or perhaps because this is social media so people never miss an opportunity to double-down and pile on if at all possible. It's like a sport. 

It's also entirely possible that I didn't communicate well at all in short tweets. Maybe that's all on me. I can't say.

Well, you've come this far, so let's see if I can't communicate better in a blog than on Twitter, since I can write in peace without people accusing me of meaning the wrong thing while I try to explain what I actually mean.

If I've managed to make sense with this post, let me know. 

If I've not, the forum is open for clarifying questions below.

Autocomplete is a big deal.

The good thing about 'private' and 'protected' -- the thing that actually makes programming easier/better -- is that the methods aren't offered by autocomplete. The idea of locking people out of calling the method? Not even remotely a second-degree concern.

People program by autocomplete. You type the name of a thing, press the dot, and you get a list of things you can do. You may never look at the documentation, or read the example code, but you will see the autocomplete hundreds of times a day.

When you want to disable an account, there is likely to be a 'disable' method attached to an account. You choose that method, run the tests, ship it. That's how you normally will navigate.

When you don't see a method you want, you might pop up a level and use the module or some management API interface. Again, you type the name and a dot, and you look for 'disable' in the recommended names. 

Private and Protected are two of the ways to keep functions out of that list, and if they're not in the list, you're not likely to even consider calling them.

Arbitrarily Located Functions

A developer is implementing a class and realizes that they need to compare date ranges. They write that up, drop it into a private method, and all is well.

Or is it? 

Why is that method located in this class? Does it have high cohesion? No. Does it support the purpose of this class directly? No. 

Why then? Because this class calls the function, and this is the file I was editing when I wrote the function.

In other words, the function doesn't belong here according to any rule of design, but it wasn't found anywhere else either. The developer drops it into the class with 'private' and moves on.

What I've found is that often the private methods in one class are repeatedly reinvented in private methods of other classes. Sometimes they're even copied and pasted directly from one class to another.

Why? Because this is the class they had open in the editor when they realized they needed the function, and it's not available for calling because it's private in the other class.  So they copied it.

Arbitrarily located functions, hidden from view, repeated by invention and copying, likely hiding the opportunity for Single Point Of Truth (SPOT) abstractions and libraries that would make everyone's job a little easier.

This isn't a rare occurrence. 

By being hidden and arbitrarily located, they invite duplication and reinvention. 
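As a rough illustration of the alternative, here is a sketch in Python where the date-range comparison lives on a small type of its own instead of hiding as a private method of whichever class happened to need it first. DateRange, Subscription, and their method names are hypothetical, and treating the ranges as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Hypothetical home for date-range logic; both ends are inclusive."""
    start: date
    end: date

    def overlaps(self, other: "DateRange") -> bool:
        """True when the two ranges share at least one day."""
        return self.start <= other.end and other.start <= self.end


class Subscription:
    """Hypothetical caller: it uses the comparison but does not own it."""

    def __init__(self, active: DateRange):
        self.active = active

    def conflicts_with(self, other: "Subscription") -> bool:
        return self.active.overlaps(other.active)
```

The comparison can now be found, reused, and tested directly, and Subscription's own interface is no wider than before.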

Testability of Private Methods? None

When we talk about testing private methods, it's a non-starter. If you put the word 'private' on the front of a method in most languages, you cannot call that method from a test. That means you can't possibly test private methods.

Okay, there is a way, but it involves reflection and that's an obscenity. If you ever use reflection to crack into some legacy code and test hidden methods, make sure you remove that hack as soon as humanly possible (or sooner) because it's a travesty. It's awful. It will fail you if the method is renamed or moved. Reflection is generally a bad idea, and I wouldn't do that (much, or for long).

If you are doing TDD (and why wouldn't you?) you test the public methods, and if they're using private methods then the private methods are being exercised. If you have meaningful tests and assertions, then the private methods are being tested through the public interface and that's probably okay.

If you're not doing TDD (and why not?) then you may not have code that tests all the private behaviors. You may have to read all the code in order to exercise it properly, which is a pain.

If you're doing test-after development (but why?) you've written the code and now you have to write tests around it. Every private method is called from some public method and you're going to have to figure out how to get down into that private method code and recognize if it's working correctly or not.

By being private, those methods have cut off the easiest route to testability -- calling the method directly.


Should You Test Implementation?

You should: implementation is how code behaves and you have to test behavior.

There are caveats here:

  • If you're structure-aware in your tests, they'll resist structural refactoring
  • If you're time-sequence-aware in your tests, they'll resist time-order refactoring
  • If you're using reflection, you're poisoning our future and must be stopped (half-grin)

There are rules, of course. You don't lock down things that you want to change, and you don't leave unspoken the things that must be true. Your tests specify how your world behaves at a certain level.

Sometimes that's too high a level and there are mini-behaviors you need to check too.

I've seen cases where testing only at the interface required hundreds of tests, but testing at the internal level requires barely a dozen. This is because at the higher level, you need all the combinations of all the paths of all the subordinate functions. At lower levels, you need one for each path in a function.
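The arithmetic behind that difference can be sketched in a few lines. The function names and path counts below are invented for illustration, and the product is the worst case: it assumes every path in one function can combine with every path in the others.

```python
from math import prod

# Hypothetical path counts for three subordinate functions behind one API call.
paths_per_function = {"parse": 4, "validate": 5, "price": 6}

# Through the public interface: every combination of paths (worst case).
tests_through_interface = prod(paths_per_function.values())

# Against each function directly: one test per path.
tests_against_functions = sum(paths_per_function.values())

print(tests_through_interface, tests_against_functions)
```

With these made-up numbers, that is 120 tests against 15, and the gap widens with every function added to the chain.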

High-level end-to-end and integration tests take a long time to run, and microtests are cheap and fast.

We tend to do microtesting (or unit testing) so that we can afford to run the tests in a tight TDD cycle. 

To argue against testing implementation functions is to argue against unit testing and microtesting on principle.

If you should only use the most-public interface, then shouldn't you only test at the UI level? Yeah, this is not a good idea. 

So how can we get at the lower-level functions? If they're all private, we can't, but it is totally possible if they are public methods on lower-level classes that are composed ("encapsulated") by the API classes. 


Interfaces and Implementations

In many languages, you can declare a public interface, and that interface can be implemented by other classes.

It's accepted that one should always program to the interface in those languages. This way, one can only use the declared, public interface of the implementations. This keeps the implementations substitutable (per the Liskov Substitution Principle). 

Non-interface methods of the implementers are already hidden: callers have no access to them via the interface. 

If you're using the interface, and you press the dot, then only methods in the public interface are presented. The other methods of the implementer may as well not exist, because they are not available here.

If you have segregated interfaces, this is even nicer. It may be that two or three public interfaces are the right way to think and design interactions, but combining a few of those interfaces together is the better way to implement the behaviors. No user of one interface has any awareness that the other interfaces exist on the object, nor of the public methods of the other interfaces, nor any of the public methods of the implementers.

In order to reach the public methods of the implementer, a caller would have to down-cast from the interface to the implementation (a no-no that justifies a sharp crack across the knuckles with a yardstick) in order to even know that other methods exist. 

This is convenient. This means that all methods of the implementers can be public without exposing those methods to callers.  Since they're public, you can write tests directly against them. Because the implementation is being tested also through the interface, it's easy to ensure that it behaves as a whole implementation.

Why not make them private? Because it's redundant and limits testing. 

Without polluting the public interface, one has a fully open and testable class.
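Here is a minimal sketch of that arrangement in Python, using typing.Protocol (Python 3.8+) to stand in for a declared interface. Disableable, Account, can_be_disabled, and offboard are hypothetical names, and the rule that only zero-balance accounts can be disabled is invented for the example. Python checks these hints with external tools rather than at runtime, so read "hidden" here as "not offered by an editor that honors the type hints."

```python
from typing import Protocol


class Disableable(Protocol):
    """The declared public interface: the only thing callers are given."""

    def disable(self) -> None: ...


class Account:
    """Hypothetical implementer. All of its methods are public."""

    def __init__(self, balance: int = 0):
        self.balance = balance
        self.enabled = True

    def disable(self) -> None:
        if self.can_be_disabled():
            self.enabled = False

    # Public on the implementer, but not part of Disableable, so code
    # written against the interface is never offered it.
    def can_be_disabled(self) -> bool:
        return self.balance == 0


def offboard(target: Disableable) -> None:
    """A caller programmed to the interface: it can see only disable()."""
    target.disable()
```

A test can call can_be_disabled() directly on an Account, while offboard() and everything else written against Disableable stays unaware that it exists.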


Thinking a little more deeply

Private methods, when we choose to use them, may signal a need for us to think more deeply about our situation and strategies.

  1. In a Rat's Nest:
    sometimes people are doing test-last (after coding) and they have a complex method. You shouldn't have big, complex methods if you have any other choice. To try to manage the complexity, they may extract some private methods.
    In order to test through the API, you have to navigate the rat's nest. You will have to set up deep data structures and some combination of boolean conditions that will allow you to get to the private method's function call, and then to exercise the code inside that method.
    It's easier to make the method's access less restricted and test it directly than to have dozens of long and complicated tests. Perhaps package-public, perhaps protected so you can cheat with inherit-to-expose. 
    It's better to test the code easily than to write awful, fragile, internals-aware tests.
  2. Missed Abstractions
    Where one class has a bunch of private methods, often there is some cohesion between those methods. If one were to set the private methods side-by-side, one might recognize that there have been missed abstractions.
    Maybe some of the methods are generic string, date, or math functions. These could be public methods in a utility package or more-primitive type. If you move them to the "right" place, they can be fully testable and can be reused within the codebase. They would not be public methods of the class you're working on.
    Perhaps some of those represent a lower-level concept that is munged into the current class. If they were pulled out, they could become testable, public methods on the new class. They would still not appear in the public interface of the class you're working on.
  3. In Languages without Private
    In Python, Smalltalk, and similar languages there's no 'private' and we've been okay with that for a long time (since the 70s for Smalltalk). 
    While people say that encapsulation is a core feature of OOD (and it is) it isn't the "private" keyword that causes encapsulation. It's done via composition.
    In a Python module, you can choose what classes you expose and which you don't. You have a class with a public API, which is composed of classes/functions that aren't part of the public API. They're tested directly within the module and don't have a "private" keyword associated with them. They might not have underscore-decorated names or any other semblance of access protections. 
  4. Other APIs
    In any language, the Model class or the API class may have a simple interface into the module (as described above) but the module may have a lot of complicated functions and logic divided into multiple classes and functions. 
    The model class doesn't have any 'private' methods at all - it just calls the public methods of other classes in the module. Those classes have copious tests and may not have any methods declared Private or Protected, because only the API and tests call those methods - they're not visible to the outside world (protection from Hyrum's Law)

So What Do You Do Instead

  • Don't pollute the public interface of a class with private methods. If you simply flip the private methods to be public, you'll lose understandability of the interface and create unwanted dependencies. This is what I recommend not doing. Keep interfaces clear and clean.
  • Move your methods to the places where they belong (where they have cohesion) and can be easily found by others.
  • Try not to need private methods. They should be rare. Remember, many OO languages don't even have the concept of private and they're just fine without them.
  • In some legacy code test-after situations, you might raise accessibility of a method to public, protected, or module-private in order to support refactoring via better tests (in the short term).
  • Consider using encapsulation properly: by composing behaviors under an API, rather than by housing all behaviors in the API's class.
  • In some languages, you have 'interface' or 'protocol' classes that declare an interface to use. Do that when there is a clear public interface. 
And, of course, I would be remiss if I didn't admit that I do sometimes create private methods. I try not to, and when I find a better way than settling on private, I am usually happier.  I sometimes leave a little cruft like private methods until I have more information and can see a better design form; it may take weeks or months. 

So yeah, there are private methods in my code. I just don't see that as "good coding" and a "solution." It's temporary.