Monday, 1 October 2012

Database Version Control

I have at least a passing interest in how to manage databases and version control.

I want to compare two different approaches to version control of databases. There are two major challenges with managing databases: version control and deployment. This post focusses specifically on tools for SQL Server, but the same issues would exist for most relational databases.

SSDT - SQL Server Data Tools

SSDT is the successor to 'Data Dude' from Microsoft. It focusses more on the version control side of managing databases. It integrates with Visual Studio 2010 and later, including TFS version control. You can point it at a database and say: generate a model from this. You can 'build' databases with syntax checking of your databases, generating errors & warnings when missing tables or columns are referenced. This is basically just another type of Visual Studio Project.

Version Control

The tool manages tables, procs, triggers, functions etc as create statements. In other words, tables are stored as:
CREATE TABLE MyTable
(Column1 int)
procs, functions etc are similarly stored.

Deployment

Deployments are managed by comparing your version controlled model to the database you want to upgrade. SSDT provides options to either generate a script to upgrade the database (in SQL CMD mode) or it will execute the upgrade against the database.

DbUp

DbUp concentrates solely on managing upgrade scripts and the process for running those scripts.

Version Control

DbUp is not concerned with syntax checks or version control.

Deployment

DbUp focuses on ensuring that upgrade scripts are run just once on the database. Each script is stored in the version table, if the script has already been run it doesn't run the script.

Comparing the approaches

The two approaches are almost polar opposites. SSDT focusses on version control, DbUp focusses more on deploying changes.

Why the SSDT approach is great

You can treat tables/procs/etc like any other piece of code. You can see a history of all checkins that have been made to a table, when columns were added etc. You can run a comparison of what was there before against what is there now. You can also see a snapshot in time of what was there.

This is very much focussed on the version control.

Where the SSDT approach falls down

When it comes up upgrades, SSDT will generate a single upgrade script. This has the following problems:
  1. You are trusting SSDT to generate an upgrade script that will work. In some cases (primarily when there is data in the database) it is impossible for SSDT to generate a valid upgrade script. I'm not too comfortable with trusting the magic tool to generate a valid upgrade script
  2. The upgrade script is a single script. If the script fails part of the way through (eg through a change to the database between the time the script was generated and the time it was run), you have a major problem. And your major problem is occurring at the worst possible time: deploying to production.
  3. The deployment is not repeatable. Any script that is generated is specific to the database being upgraded. It is common to have a Uat/Joint test/Staging environment before production that is used as a testbed for deploying to production. Typically there is some variation between these environments, which can be handled by SSDT with command variables. As a result you will need to generate a different script for each environment. This means that the upgrades are not repeatable.
There are other failings of SSDT (worthy of a blog posting on their own).

Why the DbUp approach is great

DbUp ensures that upgrade scripts scripts are run against the database just once. This means deployments across multiple environments are repeatable.

Where the DbUp approach falls down

DbUp focusses entirely on the database upgrade to the detriment of version control. The reality is that in most cases databases include real code (procs, triggers, functions etc). Simply versioning changes makes it very hard to see what changes have been made.

This is an issue even for changes to tables. It makes it hard to see what the state a single table was at a given time and how that table has changed over time, given that you have collection of change scripts for a table.

There must be a better way...

DbUp is a deployment strategy with no real version control strategy.

SSDT is a version control strategy with a deeply flawed deployment strategy.

Wouldn't it be nice if there was some sort of middle ground that provided the best of both worlds?

4 comments:

  1. Previously I have separated database work into structural changes (requiring the DbUp approach), e.g. changes to table structure, plus code changes (changes to stored procs, etc), in which simply the latest version is always deployed.

    ReplyDelete
    Replies
    1. I think this is a pretty good approach. It very much depends on how much code (procs, functions etc) you've got in the database. The more code, the greater need to manage it through more standard version control approaches.

      Delete
    2. I believe if your database development releases follow software development best practices (which it should) for example continuous integration and automated deployment, situations such as what you described below will not arise. Database code, along with application code should always follow the proper release cycle and hence you will not see any surprises when releasing from a pre-production to production environment. In other words, if you are taking advantage of SSDT then all database code should be managed through this tool.


      The upgrade script is a single script. If the script fails part of the way through (eg through a change to the database between the time the script was generated and the time it was run), you have a major problem. And your major problem is occurring at the worst possible time: deploying to production.

      Delete
    3. Agreed, if best practices are followed then the size of the delta will be smaller, and hence the risk will be lower. It does depend on the environment you are working in. It's often hard to ensure that all database code will be managed through the tool, ops teams can make ad hoc changes that aren't captured.

      It's worth being aware of the different approaches though, depending on the situation one might work better than the other.

      This post is also 4 years old, so the tooling has definitely changed in this time.

      Delete