Wednesday, February 3, 2016

Naming good is what Position?

I want to drop in a quick note about variable and class naming, one of my favorite hobby horses.

I talked a little bit about choosing more completable names already. I think that is very important, and you would be well-served by paying attention to that bit.

Today let's consider the serial position effect.

In any programming namespace, we're going to come across some naming issues. Here is a non-comprehensive, non-exclusive list:

  • Objects with a common base will want to include the base class name in the derived class name, leaving us with MonoNUnitTestRunner derived from NUnitTestRunner derived from TestRunner, with the pattern possibly breaking where TestRunner derives from CodeEvaluationRunner, and on and on.
  • Objects in languages with declared interfaces have a tendency to want to declare the interfaces in their names. An iUserAction will have a UserActionImpl in many cases, whose children may follow the previous rule.
  • People love naming the patterns they use in implementing an interface. You will see names like UserActionVisitorFactory.
  • Noise words will creep in, so that you may have a Subscriber class with a base class tied to persistence. You can't call them both Subscriber unless you split them into separate namespaces, so you'll tend to name one SubscriberInfo or SubscriberDataObject or the like. Not to mention SubscriberObject and SubscriberManager and BaseSubscriber. Of course "manager", "info", "data", and "object" are pure noise words in an OO system, but they sure are prevalent.
  • Fuzziness and generality will be signaled by the use of vague terms. Identifier is general; FactoryManager is general, possibly pattern-related, and vague. Often when we don't have a clear, unique purpose for an object, we'll use vague terms. It might be a code smell. It means that other, similar terms will have to carry "warts" or "qualifications" to be unique. Maybe there must be a UniqueShortStringId so that we don't confuse it with the UUID or GUID we use in Id.
  • People suspicious of the 'magic' of namespaces may (for some reason) feel uncomfortable letting the compiler sort out namespaces and class names, and may embed namespaces and class names in their object and method names. I know, it's redundant and seems silly, but it happens. We end up with CustomerManagement.Customer.getCustomerName(). Sigh.
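
To make that last item concrete, here is a minimal Python sketch that flags words in a member name that merely repeat its enclosing namespace or class. The helper names split_words and redundant_words are hypothetical, invented for this illustration, and the word-splitting rule is a simple assumption about camelCase names rather than anything definitive.

```python
import re


def split_words(identifier: str) -> list[str]:
    """Split a camelCase or PascalCase identifier into its words."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)


def redundant_words(qualified_name: str) -> list[str]:
    """Return words of the last segment that already appear in an enclosing segment."""
    # Drop a trailing "()" and separate the enclosing scopes from the member name.
    *scopes, member = qualified_name.rstrip("()").split(".")
    seen = {word.lower() for scope in scopes for word in split_words(scope)}
    return [word for word in split_words(member) if word.lower() in seen]


# "Customer" is already supplied by the namespace and the class.
print(redundant_words("CustomerManagement.Customer.getCustomerName()"))  # ['Customer']
```

A name like CustomerManagement.Customer.getName() comes back with nothing flagged, which is roughly what trusting the namespace looks like.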

Is it wrong to have compound names? Probably not, and besides, "wrong" is an uninteresting qualifier. Maybe better questions will help:
  • Is it helpful? 
  • Is it only a coping mechanism for over-used namespaces?
  • Is it a signal that our code is degrading? 
  • Is it a sign that we're doing a good job? 
  • Is it a temporary state, that might be leading us toward more manageable code? 

Rather than answer ethical questions that we don't have room for in this space, let's consider instead that we have compound names, for better or worse.

What ordering should we use?

The serial position effect tells us that we will recall the first and last parts of the name most clearly. This has been illustrated at an even smaller scale by silly Facebook posts like this:
"If you can raed tihs, tehn you hvae a vrey rrae gfit. Olny sveen penecrt of plpoe are albe to usdtnernad tihs txet. You hvae an azimang bairn!"
That, of course, contains misinformation, because just about anyone who can read English at all can read it after looking at it for a few seconds.

The interesting bit, though, is that you really don't struggle much when the middles are scrambled, because your brain doesn't latch onto the middles.

Maybe this is why these really short paragraphs are easier to read than the long ones above, and the short bullet points easier than the longer ones.

It seems that if we are going to use long compound names, we might be better off if we:

  • Put the most important word at the front to support dot-programming.
  • Put the next most important word at the end of the name, so it's memorable and recognizable.
  • Either delete the other parts because they're noise, or bury them in the middle.
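
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names split_words and reorder are hypothetical, the noise-word list is just the one mentioned earlier in this post, and deciding which word is "most important" stays a human judgment that the caller passes in.

```python
import re

# Noise words called out earlier in the post; an assumption, not a standard list.
NOISE_WORDS = {"manager", "info", "data", "object"}


def split_words(identifier: str) -> list[str]:
    """Split a camelCase or PascalCase identifier into its words."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)


def reorder(identifier: str, first: str, last: str) -> str:
    """Put `first` at the front and `last` at the end; drop noise, bury the rest."""
    words = split_words(identifier)
    if first not in words or last not in words:
        raise ValueError("first and last must be words of the identifier")
    # Everything else keeps its original order in the middle, minus noise words.
    middle = [
        word for word in words
        if word not in (first, last) and word.lower() not in NOISE_WORDS
    ]
    return "".join([first, *middle, last])


# Treating "Visitor" as the most important word is a judgment call for this example.
print(reorder("UserActionVisitorFactory", first="Visitor", last="Factory"))
# VisitorUserActionFactory
```

Whether the reordered name is actually better is for you and your team to judge; the sketch only shows the mechanics of the ordering.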

Give it a shot today when you're programming. It might lead you to clearer, more easily read names!