Data migrations with space.upgrade()

Example on GitHub: migrations

In this tutorial, you learn to write migrations that include data migration using the space.upgrade() function.

Prerequisites

Before starting this tutorial, complete the Basic tt migrations tutorial. As a result, you have a sharded Tarantool EE cluster that uses an etcd-based configuration storage. The cluster has a space with two indexes.

Writing a complex migration

Complex migrations require data migration along with schema migration. Connect to the router instance and insert some tuples into the space before proceeding to the next steps.

$ tt connect myapp:router-001-a

myapp:router-001-a> require('crud').insert_object_many('writers', {
    {id = 1, name = 'Haruki Murakami', age = 75},
    {id = 2, name = 'Douglas Adams', age = 49},
    {id = 3, name = 'Eiji Mikage', age = 41},
}, {noreturn = true})

The next migration changes the space format incompatibly: instead of one name field, the new format includes two fields first_name and last_name. To apply this migration, you need to change each tuple’s structure preserving the stored data. The space.upgrade function helps with this task.

Create a new file 000003_alter_writers_space.lua in /migrations/scenario. Prepare its initial structure the same way as in previous migrations:

local function apply_scenario()
--  migration code
end
return {
    apply = {
        scenario = apply_scenario,
    },
}

Start the migration function with the new format description:

local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()

Примечание

box.space.writers.index.age:drop() drops an existing index. This is done because indexes rely on field numbers and may break during this format change. If you need the age field indexed, recreate the index after applying the new format.

Next, create a stored function that transforms tuples to fit the new format. In this case, the function extracts the first and the last name from the name field and returns a tuple of the new format:

box.schema.func.create('_writers_split_name', {
    language = 'lua',
    is_deterministic = true,
    body = [[
    function(t)
        local name = t[3]

        local split_data = {}
        local split_regex = '([^%s]+)'
        for v in string.gmatch(name, split_regex) do
            table.insert(split_data, v)
        end

        local first_name = split_data[1]
        assert(first_name ~= nil)

        local last_name = split_data[2]
        assert(last_name ~= nil)

        return {t[1], t[2], first_name, last_name, t[4]}
    end
    ]],
})

Finally, call space:upgrade() with the new format and the transformation function as its arguments. Here is the complete migration code:

local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()

    box.schema.func.create('_writers_split_name', {
        language = 'lua',
        is_deterministic = true,
        body = [[
        function(t)
            local name = t[3]

            local split_data = {}
            local split_regex = '([^%s]+)'
            for v in string.gmatch(name, split_regex) do
                table.insert(split_data, v)
            end

            local first_name = split_data[1]
            assert(first_name ~= nil)

            local last_name = split_data[2]
            assert(last_name ~= nil)

            return {t[1], t[2], first_name, last_name, t[4]}
        end
        ]],
    })

    local future = space:upgrade({
        func = '_writers_split_name',
        format = new_format,
    })

    future:wait()
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

Learn more about space.upgrade() in Upgrading space schema.

Publishing the migration

Publish the new migration to etcd.

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp" \
                        migrations/scenario/000003_alter_writers_space.lua

Примечание

You can also publish all migrations from the default location /migrations/scenario. All other migrations stored in this directory are already published, so tt skips them.

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp"

Applying the migration

Apply the published migrations:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret

Connect to the router instance and check that the space and its tuples have the new format:

$ tt connect myapp:router-001-a

myapp:router-001-a> require('crud').get('writers', 2)
---
- rows: [2, 401, 'Douglas', 'Adams', 49]
  metadata: [{'name': 'id', 'type': 'number'}, {'name': 'bucket_id', 'type': 'number'},
    {'name': 'first_name', 'type': 'string'}, {'name': 'last_name', 'type': 'string'},
    {'name': 'age', 'type': 'number'}]
- null
...

Next steps

Learn to use migrations for data schema definition on new instances added to the cluster in Extending the cluster.