Monday, June 29, 2009

Code Perturbation and Extensive Branch Mods

Balance these facts:
  1. Refactoring is a good thing, but is also perturbation.
  2. Gratuitous perturbation is a bad thing.
Refactoring is good because confusing, stupid, repetitive, complicated code is the devil. We can't even pretend it is not the devil, because we catch bad code sneaking away with a slice of our souls from time to time. Bad code is bad. Refactoring makes bad code better. Some perturbation is a very good thing.

Gratuitous perturbation of the code base is a bad thing. If I make a million changes to a million places, then diffing (and therefore merging) is going to be a slice of hell. This is a different devil, but still a devil.

The problem with perturbation is that it makes it hard to maintain branches. Branched development is a good thing sometimes, providing some isolation for a very short period of time, and some ability to compose and recompose a release. Branching becomes odious, however, when the code in a branch differs greatly from the code in a shared codeline (often trunk). Merging becomes difficult, manual, and fraught with error.

In a branch, you want to perturb less, and refactor new areas of code. Your merges will work. Alternatively, you want to refactor the trunk first and then merge it to your branch, so at least they're both similar code bases.

Gratuitous perturbation would be reformatting the source base, renaming globals, and the like. In the branch, you don't really want to do that. In the shared trunk, you might want to do that. It depends on how the people in branches will handle it. By the way, you probably DO want to do big things like reformatting all the code and eliminating all globals. You just have to do it when not much else is going on, or else you want to coordinate with people working in branches.

Given some routine:
public int doSomething() {
    // 15 lines of old code
    // 12 lines of new code go here
    // 30 lines of old code
    // 2 lines of new code go here
}

61 is a horribly unhealthy line count. Clean functions are under a dozen lines, and closer to 6. I'll bet you those 14 lines of new code don't line up perfectly under the name and intent of the function, nor do they have the same level of abstraction (except in this made-up case, where they "do something").

You can put those new lines inline, as shown, but that is a mess. You might not be able to sleep at night or ingest foods afterward, but you are physically capable. If you can add code inline without becoming ill, then you probably should retune your sense of smell.

You can refactor. I'm betting doSomething will break down into a hierarchy of smaller functions that have meaningful names. But if you refactor this in a branch, then the shared codeline merges will be a big pain. If you want your merges to be reasonably easy, you'll have to either do the refactoring in both code lines or do it in the shared line and merge it down to the branch. It's double work now, but it avoids harder work later, and who is to say it will happen only twice? It's an insurance premium. It could be wasted, but it could pay off big time.

The other option is to add only two lines of code to the existing, ugly function. Those two lines would be calls to new functions, which contain all the rest of the new stuff. This causes minimal perturbation. The diff shows two lines inserted in the function, and two new functions being added beneath. Further merges will be pretty easy to deal with. Once the branch moves to trunk, then refactoring in trunk might be more reasonable.
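
Here is a minimal sketch of that low-perturbation option, assuming Java. The class name, the helper names (applyNewRule, finishNewWork), and the arithmetic are all made up for illustration; the single-line "old code" statements stand in for the long runs of existing code that stay untouched.

```java
// Hypothetical example: every name and number here is invented for illustration.
class MinimalPerturbation {

    public int doSomething() {
        int total = 0;
        total += 15;                   // stands in for the 15 lines of old code
        total = applyNewRule(total);   // inserted line 1: a call instead of 12 inline lines
        total += 30;                   // stands in for the 30 lines of old code
        total = finishNewWork(total);  // inserted line 2: a call instead of 2 inline lines
        return total;
    }

    // New function added beneath the old one; it would hold the 12 new lines.
    private int applyNewRule(int value) {
        return value * 2;
    }

    // Second new function; it would hold the other 2 new lines.
    private int finishNewWork(int value) {
        return value + 1;
    }
}
```

The diff against the old function is just the two call lines, so a later merge from trunk has very little to collide with. The new functions are pure additions.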

Here then is the moral of our tale, and the motto to live by:
Don't be caught with extensive changes in a branch.

Friday, June 19, 2009

SVN fail most sighted

The svn fail I see most is like this:

<<<< working
# some line
>>>>> other

Now, how is it a conflict that I added a line of code and trunk didn't? I will freely admit that the diffing stuff is very smart and not very easy. I want to cut a lot of slack, and I'm happy that this is easy to resolve, but I really have to wonder how that is not an update rather than a conflict.

Preferences On Code Style

Please help me read your code. I know you don't owe me anything, and you can run your code even if it doesn't pass the Agile Otter Sniff Test. I appreciate all of that. But I think that you and I can both do a better job if we're just up-front about things.

I find little speed bumps in most code, and they break my fragile concentration. Maybe writing on index cards has made me parsimonious, but now I believe that less is more. I can read your code better if there is less of it, and it's more obvious.
  1. A function should not have many effects on the code. Don't code things into the same function just because they happen at nearly the same time.
  2. You do not have to shoehorn your new code into an existing class. Clear a space for it.
  3. Extract classes when it makes sense to do so.
  4. Use less horizontal space. Long lines and lines with long blank leaders cause my eyes to cross and make me scroll my windows to see if there's something I want to read. This is more important if I'm using an IDE, because I'll have tiled views to the left and right.
  5. Use less vertical space. Don't double-space everything. Don't add meaningless blank lines. All you're doing is making me scroll more. Don't put a space between a comment and the line of code it is explaining. This is more important in an IDE because toolbars and other window tiles take up the top and bottom. Sometimes I'm stuck in a 60x12 space trying to read 120 x 240 functions.
  6. Your functions do not need a blank line after the opening brace and before the closing brace. Get value for your vertical whitespace, as if it were costly.
  7. Stop flowerboxing all your comments. I used to like that, but now the signal-to-noise ratio makes me nuts. A one-line comment should be just one line long.
  8. Do not make needless comments. If the code says what it does, the comment doesn't have to. Face it, non-programmers are NOT going to read your code. Needless or redundant comments are annoying and distract me from the code. I delete them without asking for permission, so expect to lose them.
  9. Consider removing the big banners telling me that the default constructor I'm looking at is a default constructor. I'm simple-minded and particular, not stupid as a rock.
  10. Pay attention to naming. When namespace names and variable names and filenames don't stand apart crisply, I forget which is which. I'm simple-minded, so make the distinctions clear and meaningful. In particular, don't use two names which vary by one phoneme.
  11. The same class names in two different namespaces is confusing. You can't prevent it all the time, but you can try to make sure that when it happens it is meaningful.
  12. Don't be afraid to extract methods, introduce variables, etc. to make the code more obvious. Obvious counts.
  13. Don't make me have to remember what functions were called prior to this function call. Relying on other calls to initialize fields to certain values will just tick me off. I can only hold a little context in my head at once. When I'm looking to make a change, I don't even want to hold all of your class' context in my head. I have other things in mind.
  14. You commented out several paragraphs of code, and now I have to skip over all that crap to read the live code. If you need to keep it around, use version control.
Those things help me. How can I make my code more pleasant for you?
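
As one illustration of item 13, here is a small sketch, assuming Java, of a class that does not depend on an earlier call to set up its fields. ReportBuilder and its methods are hypothetical names, not taken from any real code base.

```java
import java.util.List;

// Hypothetical example: names are invented for illustration.
class ReportBuilder {
    private final List<String> rows;

    // Everything the object needs arrives through the constructor,
    // so no caller has to remember to invoke some init step first.
    ReportBuilder(List<String> rows) {
        this.rows = List.copyOf(rows);
    }

    // Safe to call at any time: it relies only on state set at construction.
    String render() {
        return String.join("\n", rows);
    }
}
```

A reader looking at render() only has to check the constructor to know what state it can count on, which keeps the context small.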

Not Fitting In An Iteration

I was in denial for quite a long time. I thought that there were really no tasks that couldn't be broken down and implemented in phases. I'm in a change now that is trying my ideals.

Of course, this is a cross-cutting concern that deals with a big "ility." In particular it deals with scalability but I don't want to provide a bunch of detail that will distract us from the point.

The code is legacy in both the MFeathers meaning, "without unit tests," and in the sense of "handed down from one generation to another, unsuspecting one." The new generation has done some excellent work getting huge tracts of land cleared and fenced with TDD and AT and what-have-you. Really, quite the transformation. The original designers had a philosophy and working style that did not survive the transformation (we think for the better), so there are architectural/design decisions being unmade on a regular basis.

In particular, there is this giant jellyfish of a design decision that's gotten in the way. It has long, long, long tentacles that extend far into the depths of layers of code, across the type system in funny ways, and into the realm of architectural concerns. When it's fixed, it will make the system better in many ways, and will clear the way to a whole host of other improvements. In short, it may be the coolest subproject in the whole company.

The jellyfish represents the munging of two or three separate concerns in one mechanism. It is a facility that was so amazingly handy that developers used it whenever they could. Remember that one man's fuzzy boundaries are another man's flexible solution. Now the concerns have to be split and the mechanism changed.

We've managed to dredge up one stinging tentacle after the other, but there are still several more. In the course of doing so, we've had to make a branch (a short-term fork, really) and we spend a pretty significant amount of time merging code from the trunk.

I was commenting to a pair partner (Hi, Nick) the other day that we should have worked out a way to get this thing out in iteration-sized buckets. As soon as I said it I realized that we would have, had we known that the finished result was going to look like it does now/so-far.

This is not the first jellyfish I've met while swimming in legacy waters. In another company, Ed worked on a problem for an entire year and yet there were unexpected avenues in data access that still complicated the process. Not because Ed wasn't thorough and smart, but things can get out of hand politically and technically. Politics complicated the technical work, and there was little fun to go around.

I'm trying to recover and determine how we could have made these changes in smaller steps, staying in a nice, green, running trunk with the rest of the team. I just can't see how we could have done it without knowing the many things we learned through refactoring and exploring and periodic cul-de-sacs in the code. It was bigger than any of our heads.

So what is the point?
  • Is the uncertainty the problem, and could we have killed it first?
  • An opportunity for links & advice from my small, but wise, readership.
  • The merge I'm waiting on is sucking all my CPU and enthusiasm, and I had to do something.

svn and patches don't mix

Patch files are no fun. Look at this little bit from the SVN Red Bean Book:
In this particular example, there really isn't much difference. But svn merge has special abilities that surpass the patch program. The file format used by patch is quite limited; it's able to tweak file contents only. There's no way to represent changes to trees, such as the addition, removal, or renaming of files and directories. Nor can the patch program notice changes to properties. If Sally's change had, say, added a new directory, the output of svn diff wouldn't have mentioned it at all. svn diff outputs only the limited patch format, so there are some ideas it simply can't express.

I only mention this because I've had a recent hiney bite in the combination of patch and svn.

Tuesday, June 16, 2009

Culture of Blame

Sometimes you read an article and realize, "Hey! I've been there!". I was following some tweets the other day and was directed to an article on The Blame Game, which explains how things work in a culture of blame.

I've definitely seen this at work at least two or three times in my 30 years of programming. It's a tough situation. Here is a diagram about how the system remains and escalates over time:

I did find that promoting transparency in a culture of blame is not so much a way of creating change as a way of attracting blame. It was a difficult lesson.

Wednesday, June 3, 2009

My Story Board is The Same As Yours!

I saw a recent article describing one team's use of a Story board. I mention it here, because it's the same way I've set up the last two. Love it or hate it, it's also what I've been doing.

Tuesday, June 2, 2009

Customer/Vendor Relationship outside of IT

Symmetry and Simplicity on a Monday

I just watched an Ira Glass video about taste and sticking with your work, and practicing. Of course, any attempt to compare myself to Glass will end in disappointment for me.

I work with a great programmer. He holds more of the system in his head than I do, and is more comfy with the environment. We were sketching different ways to go forward, and before lunch he showed me a direction I really liked. It harmonized ideas from various conversations, and is well-timed.

I did feel something was a little off, but I was confused about what it was. I blame it on Mondays, especially since I was busy with many things this weekend (departing friends, graduation open house, etc.). I got lost between C# syntax and C or Python at one time, and only frustrated myself (and surely my gracious partner too).

Finally I realized that I was being confused over an asymmetry I had noticed earlier. I couldn't even phrase the concern (well, without being intense) other than it was asymmetrical. That sounded about as OCD as possible. Eventually I realized that I was confusing two levels of wrapping/unwrapping and I think we had them confused in code too.

The problem in a nutshell was that we had some doubly-wrapped data in a formatted packet and didn't have crisp boundaries established. Without the code being clearly segregated, I was not able to keep track of which wrapper we were in, and it seemed we were mixing level 1 wrapping with level 2 wrapping (to ill effect). I realized I was confused, and I realized that the code looked confused too, and we refactored it into a more sane state.

Right now, it looks good. It makes sense and we might be able to continue development by extending it cleanly to many of the types we work with. It has symmetry and clarity, and even a little bulletproofing tossed in.

I was hazy (perhaps from sleeping poorly) and could not tell that I was munging two concepts in my head. As soon as I realized the problem, I became lucid and energized, instead of being frustrated and inarticulate. Understanding gives me energy to work; confusion sucks the oxygen from my air. The code was suddenly easy to clarify and test.

Perhaps I need to practice what I have read about thinking clearly. That would be a good lesson to take with me.

What I am likely to remember, though, is that my "taste" was dead on even when our work was a little off. This made Ira Glass' words ring through to me again. You can know what's right, and can know that "this isn't it" even when you're muddling through on a (mentally) cloudy Monday.

Glass' observation seems to be universal (enough) if it applies to both him and mere me.