Database migrations done right - Michael Brunton-Spall
These patterns often break the change into multiple database and application deployments. For example, the pattern of adding a non-nullable column to a database schema could require:
- schema change to add a nullable column
- update the software to write to the nullable column and handle nulls on read
- perform data migration to update the null columns to have the correct data
- execute a schema change to set the column to not-nullable
- remove the null-handling code from the app
If you can execute schema changes or deploy code around once a week or fortnight, then executing that process could take you two months. If you can make these changes hours or minutes apart, then this is a couple of days work for a developer at most.
The application of these patterns requires understanding that you need to make very small changes, each released to live as fast as possible and with as quick feedback as you can get.
We can apply all of the database refactoring patterns by following one pretty simple principle:
Every change you make must be backward compatible with the rest of the system
So looking back at our add non-null column change we can identify it like this:
- Add nullable column to database – System keeps adding rows, nulls are fine, reads ignore the null
- Code change to write correct value to new rows, and handle reading unexpected nulls – Database doesn't change, now we have some null rows and some rows with data
- Run data migration to fill the other columns – This might be a script, or a bit of code in the application, either way your app doesn't care about any row, it handles data and nulls just fine
- Add the non-null constraint – The database now has no nulls and your new code is writing the correct data.
- Remove the code that handles the null case – it won't happen anymore.
Read full article from Database migrations done right - Michael Brunton-Spall