Migration Driven Development

Dotan Reis
Published in CodeX
5 min read · Mar 19, 2021

This week I faced a moral problem while planning a rewrite of a part of our system. The problem was that I had recently read a post by Joel Spolsky saying that rewriting your code from scratch is, and I quote:

the single worst strategic mistake that any software company can make

To be fair, the post talked about rewriting the entire system, giving the Netscape 6.0 rewrite as an example, but I strive to be cautious in cold water, so I thought, am I making a mistake?

Cautious in cold water (Photo by Oscar Sutton on Unsplash)

I was alarmed to read that:

there is absolutely no reason to believe that you are going to do a better job than you did the first time

But then I thought — but I didn’t write it the first time! I was relieved for just a moment until I read that this only means that:

you don’t actually have “more experience”. You’re just going to make most of the old mistakes again, and introduce some new problems that weren’t in the original version.

This was quite a bummer. I really thought the code needed a rewrite! Luckily, another post, this one by Herb Caudill, discussed some rewrites that were justified and successful. In a nutshell, the post says that a rewrite is sometimes justified, not because the code is subjectively bad, but because it’s demonstrably holding you back, or because you understand your needs better now than you did before.

One of my favorite points of Spolsky’s was that old code has advantages precisely because it’s old. It has been tested and debugged. It has faced the complexities of the outside world. A lot of its ugliness may be directly caused by this: tweak after tweak, ifs and hacks that solve real issues that were unknown before and may still be unknown to you as you rewrite. (Don’t look at the jug but at what’s inside it.)

Or as Caudill sums it up:

What you’ve already created has value.

What you’ve already created has value. (Image from a recommended but unrelated talk about DevOps)

All this brings me to the purpose of this post, which is, as the title subtly hints:

Migration Driven Development

This is an approach I’ve been inadvertently developing every time I had to do a large rewrite or refactor. I believe it helps you plan a rewrite so that you don’t lose the value in the existing code, and so that you find out quickly when the new code underperforms.

The idea is this:

Plan the rewrite according to your migration plan.

That is, you are not done planning until you have a full migration plan that’s integrated into the development plan. The development steps include exactly how you integrate every piece of code into the production environment after you write it, and before you write more. By integration I don’t mean just deploying code that does nothing; I mean replacing existing functions or calls. Use every piece of new code right after you write it.

For example, let’s say I have a service that holds all the customer data, and I want to extract a part of it, let’s say info about the customer’s hobbies. The way the old service saves and updates them is all ugly and wrong, there’s a lot of code I don’t understand, etc. Right now the Customer model looks like this:

{ customerId: string, firstName: string, phoneNumber: string, hobbies: string[] }

I might want to write a new service for hobbies, implement all the features, then run some migration and be done with it. But that might mean losing all the value that’s in the current design. Maybe the ugly code is hiding some insights about how the system needs to behave?

So instead, I would plan in advance how the migration will work and integrate it into my workflow. I might start by creating the new service with only a very basic API for creating and deleting hobbies, and call it from the original service only in the places where it now creates or deletes hobbies, so that the data is duplicated. Then I would gradually roll out reading data from the new service, with a fallback to the original one. Meanwhile, I’ll move small chunks of logic between the services. Only when I’m confident I didn’t break anything and the data in both services matches exactly will I stop using the old collection.
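To make the first steps concrete, here is a minimal sketch of the duplicated writes and the gradually rolled-out read with a fallback. Everything in it is an assumption for illustration: the names HobbyStore, InMemoryHobbyStore, MigratingHobbies and rolloutBucket are made up, the stores are in-memory stand-ins for the two services, and the percentage-based rollout is just one simple way to do a gradual rollout.

```typescript
// All names here (HobbyStore, InMemoryHobbyStore, MigratingHobbies) are
// hypothetical stand-ins for the old and new services, not a real API.
interface HobbyStore {
  add(customerId: string, hobby: string): void;
  remove(customerId: string, hobby: string): void;
  // Returns undefined when this store has no record of the customer.
  list(customerId: string): string[] | undefined;
}

class InMemoryHobbyStore implements HobbyStore {
  private data = new Map<string, Set<string>>();

  add(customerId: string, hobby: string): void {
    const hobbies = this.data.get(customerId) ?? new Set<string>();
    hobbies.add(hobby);
    this.data.set(customerId, hobbies);
  }

  remove(customerId: string, hobby: string): void {
    this.data.get(customerId)?.delete(hobby);
  }

  list(customerId: string): string[] | undefined {
    const hobbies = this.data.get(customerId);
    return hobbies ? Array.from(hobbies).sort() : undefined;
  }
}

// Maps a customer to a stable bucket in 0..99, so the same customer always
// lands on the same side of the rollout.
function rolloutBucket(customerId: string): number {
  let hash = 0;
  for (let i = 0; i < customerId.length; i++) {
    hash = (hash * 31 + customerId.charCodeAt(i)) % 100;
  }
  return hash;
}

class MigratingHobbies {
  private legacy: HobbyStore;
  private next: HobbyStore;
  private readRolloutPercent: number; // 0 = never read from the new service, 100 = always try it first

  constructor(legacy: HobbyStore, next: HobbyStore, readRolloutPercent: number) {
    this.legacy = legacy;
    this.next = next;
    this.readRolloutPercent = readRolloutPercent;
  }

  // Writes go to both services, so the data is duplicated.
  addHobby(customerId: string, hobby: string): void {
    this.legacy.add(customerId, hobby);
    this.next.add(customerId, hobby);
  }

  removeHobby(customerId: string, hobby: string): void {
    this.legacy.remove(customerId, hobby);
    this.next.remove(customerId, hobby);
  }

  // Reads try the new service for customers inside the rollout and fall
  // back to the original service when the new one has nothing for them.
  getHobbies(customerId: string): string[] {
    if (rolloutBucket(customerId) < this.readRolloutPercent) {
      const fromNew = this.next.list(customerId);
      if (fromNew !== undefined) {
        return fromNew;
      }
    }
    return this.legacy.list(customerId) ?? [];
  }
}
```

In this sketch the old store stays the source of truth until the rollout reaches 100 percent, and a customer who was never written to the new store still gets an answer through the fallback.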

Then I would gradually move more logic into the new service, and only when the original service has become just a proxy to the new one will I have other services call the new one directly. It might take more time, but the chances of discovering after the fact that there’s some piece of logic I didn’t implement are slim.
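The end state of that step might look roughly like the sketch below. It is only an illustration under assumptions: HobbiesApi, NewHobbiesService, LegacyCustomerService and hobbySummary are invented names, and the trim-and-lowercase rule stands in for whatever logic actually had to move from the old service.

```typescript
// Hypothetical shared contract; both services expose the same hobby calls.
interface HobbiesApi {
  addHobby(customerId: string, hobby: string): void;
  getHobbies(customerId: string): string[];
}

class NewHobbiesService implements HobbiesApi {
  private hobbies = new Map<string, string[]>();

  addHobby(customerId: string, hobby: string): void {
    // Stand-in for logic moved over from the old service in an earlier
    // step (an assumed rule: normalize the name and skip duplicates).
    const normalized = hobby.trim().toLowerCase();
    const current = this.hobbies.get(customerId) ?? [];
    if (current.indexOf(normalized) === -1) {
      current.push(normalized);
    }
    this.hobbies.set(customerId, current);
  }

  getHobbies(customerId: string): string[] {
    // Return a copy so callers cannot mutate the stored list.
    return (this.hobbies.get(customerId) ?? []).slice();
  }
}

// The original service after all hobby logic has moved out of it:
// every hobby call is a one-line delegation to the new service.
class LegacyCustomerService implements HobbiesApi {
  private hobbiesService: HobbiesApi;

  constructor(hobbiesService: HobbiesApi) {
    this.hobbiesService = hobbiesService;
  }

  addHobby(customerId: string, hobby: string): void {
    this.hobbiesService.addHobby(customerId, hobby);
  }

  getHobbies(customerId: string): string[] {
    return this.hobbiesService.getHobbies(customerId);
  }
}

// A caller written against the contract does not care which service it
// gets, so switching it to the new service is a wiring change only.
function hobbySummary(api: HobbiesApi, customerId: string): string {
  return api.getHobbies(customerId).join(", ");
}
```

Once the old service looks like this, pointing a caller such as hobbySummary at the new service directly should change nothing observable, which is the signal that the switch is safe.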

Migration (Photo by Julia Craice on Unsplash)

I think this method has a few advantages:

  1. It’s much less scary and dangerous to run migrations when they are thoroughly integrated into the development process.
  2. You don’t lose the value of the old code (unless you decide to).
  3. You will fail mid-process (or even before you start) if the new design is inferior to the old one in some way, rather than finding out only at the end, possibly after an irreversible migration step.
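One way to make the third point tangible is a parity check that runs while both services are still live and refuses to proceed if they disagree. The sketch below is only an illustration: ReadHobbies, Mismatch, findMismatches and assertSafeToCutOver are hypothetical names, and comparing hobbies regardless of order is an assumption about what "matches" means here.

```typescript
// Hypothetical reader signature: given a customer, return their hobbies.
type ReadHobbies = (customerId: string) => string[];

interface Mismatch {
  customerId: string;
  legacy: string[];
  next: string[];
}

// Assumption: order does not matter, so both lists are sorted before
// being compared element by element.
function sameHobbies(a: string[], b: string[]): boolean {
  if (a.length !== b.length) {
    return false;
  }
  const sortedA = a.slice().sort();
  const sortedB = b.slice().sort();
  return sortedA.every((hobby, i) => hobby === sortedB[i]);
}

// Reads every customer from both services and collects the disagreements.
function findMismatches(
  customerIds: string[],
  readLegacy: ReadHobbies,
  readNext: ReadHobbies,
): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const customerId of customerIds) {
    const legacy = readLegacy(customerId);
    const next = readNext(customerId);
    if (!sameHobbies(legacy, next)) {
      mismatches.push({ customerId, legacy, next });
    }
  }
  return mismatches;
}

// Fails loudly before any irreversible step, such as dropping the old
// collection, if the two services still disagree.
function assertSafeToCutOver(mismatches: Mismatch[]): void {
  if (mismatches.length > 0) {
    const ids = mismatches.map((m) => m.customerId).join(", ");
    throw new Error("Services disagree for customers: " + ids);
  }
}
```

Running something like this after every small step is what turns "figuring it out only at the end" into an early, cheap failure.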

But all good things come with a price:

  1. It might take longer to write the new code.
  2. The system will spend more time mid-migration, which may be problematic if other people are working on the code you want to replace.
  3. You might not be able to write the code you wanted. If you want radical changes, migrating might be impossible until you finish. (Which is kind of the point, because such a rewrite means you lose the old code’s value.)

Migration pictures are fun (Photo by Jordi Fernandez on Unsplash)

The takeaway here is this: At any point in the process, if you’re not backwards compatible, you should assume you’ve created new bugs. The approach I suggested reduces that risk to almost nothing. If the existing code is so bad that you think you can forgo its honey, then good luck, and I hope you find better code bases to work on in the future.

Dotan Reis
CodeX

Software developer @ riseup. MA student @ The Cohn Institute in Tel Aviv University