8.05.2024
20.06.2022
Ruby on Rails
Backend
Tutorial

Safe data migrations in Rails

Paweł Strzałkowski
Chief Technology Officer

Afraid to run data migrations in production? Don't be. There are a few easy tricks which can help you feel secure.

… what data migrations?

There are times when data in your application has to be explicitly adjusted. You might have changed the data model or found an inconsistency in the data. It happens. Except for the smallest ones, such changes shouldn't be performed within db:migrate tasks as part of the deployment process. They may take a long time, and CI is usually not prepared for long-running operations.

Basically, sometimes we need to run a custom script to change production data. Let's see how we can make it less scary.

Organize well

Create a folder, for example db/data_migrations, and put every data migration definition inside.

Keep them separated from the application

Remember that db/data_migrations folder we've created? It's not autoloaded into your application by default. Each time you want to use a data migration, you have to load its file explicitly using the require method.

require 'PATH_TO_APP/db/data_migrations/perform_an_important_change_data_migration'

You will never run it by accident.

Keep it consistent

All data migration classes should follow the same pattern, so that they are predictable and safe to use.

# db/data_migrations/perform_an_important_change_data_migration.rb

module DataMigrations
  class PerformAnImportantChangeDataMigration
    def self.call
      ...
    end
  end
end

Then, run it in the Rails console:

> require_relative 'db/data_migrations/perform_an_important_change_data_migration'
 => true
> DataMigrations::PerformAnImportantChangeDataMigration.call

or wrap it in a rake task:

# lib/tasks/data_migration.rake

require_relative '../../db/data_migrations/perform_an_important_change_data_migration'

namespace :data_migration do
  desc 'Performs an Important Data Migration Change'
  task perform_an_important_change: :environment do
    DataMigrations::PerformAnImportantChangeDataMigration.call
  end
end

Add a safety switch

This is a game changer. When you run a data migration in the production environment, you usually cannot be 100% certain what will come out of it. A single invalid record can ruin your day in the middle of the run. Use the following safety mechanism:

# db/data_migrations/perform_an_important_change_data_migration.rb

module DataMigrations
  CommitForbidden = Class.new(StandardError)

  class PerformAnImportantChangeDataMigration
    def self.call(commit_changes: false)
      ActiveRecord::Base.transaction do
        ...

        raise CommitForbidden unless commit_changes
      end
    end
  end
end

The data migration can be performed as a dry run to check if it performs correctly.
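With the class above, a dry run does not end quietly. CommitForbidden propagates out of the transaction block after the rollback, because ActiveRecord re-raises any exception other than ActiveRecord::Rollback. Below is a minimal sketch of a wrapper that turns this into a readable result. DataMigrations::Runner is a hypothetical helper introduced here for illustration; it is not part of Rails or of the original migration.

```ruby
# Sketch only: Runner is a hypothetical helper, not part of Rails.
module DataMigrations
  # Same error class as in the migration file; guarded so it is not
  # redefined when the migration file has already been loaded.
  CommitForbidden = Class.new(StandardError) unless const_defined?(:CommitForbidden, false)

  module Runner
    # Calls the migration and reports how the run ended.
    # Any other exception (for example an invalid record) still propagates.
    def self.run(migration, commit_changes: false)
      migration.call(commit_changes: commit_changes)
      :committed
    rescue CommitForbidden
      # By the time we get here the transaction has been rolled back.
      :rolled_back
    end
  end
end
```

In the console, DataMigrations::Runner.run(DataMigrations::PerformAnImportantChangeDataMigration) would then return :rolled_back for a dry run, and the same call with commit_changes: true would return :committed.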

If the commit_changes argument is false (the default), the migration raises CommitForbidden at the end of the transaction block, which rolls back every change made inside it. This lets you check the performance and the outcome of the migration just as if you were using sandbox mode. Once you are ready and certain, add commit_changes: true and run it for real.

I have run it in a multi-tenant (by schema) setup with ~30000 updates in a single run. It worked like a charm. It also allows you to print out every change to the console. You may show the output to your client and double-check that it matches expectations.
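Printing every change can be as simple as a small logger used inside the transaction block. The sketch below assumes a hypothetical DataMigrations::ChangeLog class; the name, the output format and the labels are illustrative and do not come from the original migration.

```ruby
module DataMigrations
  # Hypothetical helper: collects and prints attribute changes during a run.
  class ChangeLog
    Entry = Struct.new(:label, :attribute, :before, :after)

    attr_reader :entries

    # io defaults to the console; pass another IO object to capture the output.
    def initialize(io: $stdout)
      @io = io
      @entries = []
    end

    # Records one change and prints it; unchanged values are skipped.
    def record(label, attribute, before, after)
      return if before == after

      @entries << Entry.new(label, attribute, before, after)
      @io.puts "#{label}: #{attribute} #{before.inspect} -> #{after.inspect}"
    end

    # One-line summary to print at the end of the run.
    def summary
      "#{@entries.size} change(s) recorded"
    end
  end
end
```

Inside the migration, a call such as log.record("User##{user.id}", :email, old_email, new_email) before each update (with a hypothetical User model) produces a list of changes that can be reviewed after a dry run, since the printed output survives the rollback.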

Test it

Write a few tests. There is no migration too small or too big for testing. A test helps you to think in the context of real data and gives a chance to realize that edge cases exist.
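One way to make edge cases easy to test is to keep the per-record transformation in a small pure method and cover it with a few cases. The sketch below uses Minitest and a hypothetical NormalizeEmail step; both the module and the normalization rules are assumptions made for illustration. A fuller test would also run the whole migration against records in the test database.

```ruby
require 'minitest/autorun'

module DataMigrations
  # Hypothetical migration step: normalizes a stored email value.
  # Kept as a pure method so edge cases can be tested without a database.
  module NormalizeEmail
    def self.normalize(value)
      return nil if value.nil?

      stripped = value.strip
      stripped.empty? ? nil : stripped.downcase
    end
  end
end

class NormalizeEmailTest < Minitest::Test
  def test_downcases_and_strips_whitespace
    assert_equal 'john@example.com',
                 DataMigrations::NormalizeEmail.normalize("  John@Example.COM\n")
  end

  # Edge case: records that never had a value.
  def test_keeps_nil_as_nil
    assert_nil DataMigrations::NormalizeEmail.normalize(nil)
  end

  # Edge case: records that hold only whitespace.
  def test_turns_blank_strings_into_nil
    assert_nil DataMigrations::NormalizeEmail.normalize('   ')
  end
end
```

Writing the nil and blank cases down is often the moment you realize that production data contains them.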

Ask somebody to check it

Even if you are at the super-pro-master level, you may still make a mistake. Even if you practice continuous deployment without code review, data migrations are tricky because they escape the classic flow. The best tests won't help if you haven't thought of an exception to a rule. Ask a colleague to give it a quick read. It may save you a week of fixing the mess.

Clean after yourself

Data migrations tend to become obsolete quite quickly. Once a migration has been run, you may remove both the test and the migration itself. The files will remain in your repository's history forever, so there is no need to pollute the codebase.

Let's sum it up!

Data migrations are needed in almost every application. We change the schema and we change the business approach, and both have to be reflected in the data. There are ways to do it safely. Don't rush, use safety mechanisms and... good luck :)
