Data migrations with space.upgrade()
Example on GitHub: migrations
In this tutorial, you learn to write migrations that include data migration using the space.upgrade() function.
Before starting this tutorial, complete the Basic tt migrations tutorial. As a result, you have a sharded Tarantool EE cluster that uses an etcd-based configuration storage. The cluster has a space with two indexes.
Complex migrations require data migration along with schema migration. Connect to the router instance and insert some tuples into the space before proceeding to the next steps.
$ tt connect myapp:router-001-a
myapp:router-001-a> require('crud').insert_object_many('writers', {
    {id = 1, name = 'Haruki Murakami', age = 75},
    {id = 2, name = 'Douglas Adams', age = 49},
    {id = 3, name = 'Eiji Mikage', age = 41},
}, {noreturn = true})
The next migration changes the space format incompatibly: instead of one name field, the new format includes two fields, first_name and last_name.
To apply this migration, you need to change each tuple's structure while preserving the stored data. The space.upgrade() function helps with this task.
Create a new file 000003_alter_writers_space.lua in /migrations/scenario.
Prepare its initial structure the same way as in previous migrations:
local function apply_scenario()
    -- migration code
end

return {
    apply = {
        scenario = apply_scenario,
    },
}
Start the migration function with the new format description:
local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()
Note
box.space.writers.index.age:drop() drops an existing index. This is done because indexes rely on field numbers and may break during this format change. If you need the age field indexed, recreate the index after applying the new format.
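The note above leaves the index recreation to the reader. A minimal sketch of that step follows, assuming a non-unique index on age is what you need. The helper name recreate_age_index is hypothetical, and the index options are assumptions rather than the settings used in the earlier tutorial.

```lua
-- Hypothetical helper (not part of the tutorial): recreates the secondary
-- index on the `age` field. `space` is expected to be a Tarantool space
-- object such as box.space.writers.
local function recreate_age_index(space)
    return space:create_index('age', {
        -- Refer to the field by name, so the index follows the new format
        parts = {{field = 'age', type = 'number'}},
        unique = false,        -- assumption: several writers may share an age
        if_not_exists = true,  -- a repeated call does not raise an error
    })
end
```

Call it with box.space.writers once the format change has completed, for example at the end of the migration function or in a separate follow-up migration.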
Next, create a stored function that transforms tuples to fit the new format. In this case, the function extracts the first and the last name from the name field and returns a tuple of the new format:
    box.schema.func.create('_writers_split_name', {
        language = 'lua',
        is_deterministic = true,
        body = [[
            function(t)
                local name = t[3]
                local split_data = {}
                local split_regex = '([^%s]+)'
                for v in string.gmatch(name, split_regex) do
                    table.insert(split_data, v)
                end
                local first_name = split_data[1]
                assert(first_name ~= nil)
                local last_name = split_data[2]
                assert(last_name ~= nil)
                return {t[1], t[2], first_name, last_name, t[4]}
            end
        ]],
    })
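To see how the transformation behaves before running it on a cluster, you can try the same logic in a plain Lua interpreter. The sketch below copies the function body into a local function named split_name (a hypothetical name) and passes Lua tables in place of Tarantool tuples. The single-word name is made-up sample data.

```lua
-- Standalone copy of the transformation, runnable without Tarantool.
-- Input layout (old format): {id, bucket_id, name, age}.
local function split_name(t)
    local name = t[3]
    local split_data = {}
    -- Collect whitespace-separated words of the name
    for v in string.gmatch(name, '([^%s]+)') do
        table.insert(split_data, v)
    end
    local first_name = split_data[1]
    assert(first_name ~= nil)
    local last_name = split_data[2]
    assert(last_name ~= nil)
    -- Output layout (new format): {id, bucket_id, first_name, last_name, age}
    return {t[1], t[2], first_name, last_name, t[4]}
end

local upgraded = split_name({2, 401, 'Douglas Adams', 49})
print(table.concat(upgraded, ', '))

-- A name with a single word has no last name, so the assertion fails
local ok = pcall(split_name, {4, 123, 'Mononym', 30})
print(ok)
```

Two details follow from the code: a name that consists of one word fails the assertion, and any words after the second are dropped. Check the stored data for such names before applying the migration.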
Finally, call space:upgrade() with the new format and the transformation function as its arguments. Here is the complete migration code:
local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()

    box.schema.func.create('_writers_split_name', {
        language = 'lua',
        is_deterministic = true,
        body = [[
            function(t)
                local name = t[3]
                local split_data = {}
                local split_regex = '([^%s]+)'
                for v in string.gmatch(name, split_regex) do
                    table.insert(split_data, v)
                end
                local first_name = split_data[1]
                assert(first_name ~= nil)
                local last_name = split_data[2]
                assert(last_name ~= nil)
                return {t[1], t[2], first_name, last_name, t[4]}
            end
        ]],
    })

    local future = space:upgrade({
        func = '_writers_split_name',
        format = new_format,
    })
    future:wait()
end

return {
    apply = {
        scenario = apply_scenario,
    },
}
Learn more about space.upgrade() in Upgrading space schema.
Publish the new migration to etcd.
$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp" \
    migrations/scenario/000003_alter_writers_space.lua
Note
You can also publish all migrations from the default location /migrations/scenario. All other migrations stored in this directory are already published, so tt skips them.
$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp"
Apply the published migrations:
$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
    --tarantool-username=client --tarantool-password=secret
Connect to the router instance and check that the space and its tuples have the new format:
$ tt connect myapp:router-001-a
myapp:router-001-a> require('crud').get('writers', 2)
---
- rows: [2, 401, 'Douglas', 'Adams', 49]
  metadata: [{'name': 'id', 'type': 'number'}, {'name': 'bucket_id', 'type': 'number'},
    {'name': 'first_name', 'type': 'string'}, {'name': 'last_name', 'type': 'string'},
    {'name': 'age', 'type': 'number'}]
- null
...
Learn to use migrations for data schema definition on new instances added to the cluster in Extending the cluster.