space_object:create_index()

object space_object

space_object:create_index(index-name[, index_opts])

Creating an index for a space is mandatory before inserting tuples into it or selecting tuples from it. The first created index, which is used as the primary index, must be unique.

Parameters:
	space_object (`space_object`) – an object reference
	index_name (`string`) – name of the index, which should conform to the rules for object names
	index_opts (`table`) – index options (see index_opts)

Returns:	index object
Return type:	index_object

Possible errors:

too many parts
index '…' already exists
primary key must be unique
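
These errors are raised as ordinary Lua errors, so a caller can catch them with pcall(). The following is a minimal sketch, assuming an instance where box.cfg{} has already been called; the space name errors_demo is made up for illustration.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('errors_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })

-- Creating an index with the same name again raises an error;
-- pcall() turns the error into a return value.
local ok, err = pcall(function()
    return demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })
end)

print(ok)            -- false
print(tostring(err)) -- the "already exists" error message
```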

Building or rebuilding a large index will cause occasional yields so that other requests will not be blocked. If the other requests cause an illegal situation such as a duplicate key in a unique index, building or rebuilding such index will fail.

Example:

-- Create a space --
bands = box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

index_opts

object index_opts

Index options that include the index name, type, identifiers of key fields, and so on. These options are passed to the space_object:create_index() method.

Note

These options are also passed to index_object:alter().

index_opts.type: The index type.

Type: string

Default: TREE

Possible values: TREE, HASH, RTREE, BITSET
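
As an illustration of the type option, the sketch below creates one HASH index and one TREE index on a hypothetical space types_demo. It assumes box.cfg{} has already been called and the default memtx engine.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('types_demo')

-- A HASH index must be unique, so it is a natural fit for a primary key.
demo:create_index('primary', {
    type = 'HASH',
    parts = { { field = 1, type = 'unsigned' } },
})

-- TREE is the default type; it is written out here for clarity.
demo:create_index('by_name', {
    type = 'TREE',
    unique = false,
    parts = { { field = 2, type = 'string' } },
})

print(demo.index.primary.type, demo.index.by_name.type)
```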

index_opts.id: A unique numeric identifier of the index, which is generated automatically.

Type: number

Default: last index’s ID + 1

index_opts.unique

Specify whether the index is unique. When true, the index cannot contain the same key value twice.

Type: boolean
Default: true

Example:

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })
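
To make the effect of unique visible, the sketch below inserts a duplicate key into a unique and a non-unique secondary index. The space unique_demo and its data are hypothetical; box.cfg{} is assumed to have been called.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('unique_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })
-- unique = true is the default, so band names must not repeat.
demo:create_index('name', { parts = { { field = 2, type = 'string' } } })
-- unique = false allows several tuples with the same year.
demo:create_index('year', { parts = { { field = 3, type = 'unsigned' } }, unique = false })

demo:insert({ 1, 'Roxette', 1986 })

-- Same band name again: rejected by the unique 'name' index.
local ok = pcall(function() return demo:insert({ 2, 'Roxette', 1986 }) end)
print(ok) -- false

-- Same year, different name: accepted.
demo:insert({ 2, 'Scorpions', 1986 })
```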

index_opts.if_not_exists: Specify whether to swallow an error on an attempt to create an index with a duplicated name.

Type: boolean

Default: false
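
A typical use of if_not_exists is an initialization script that may run more than once. The following sketch assumes box.cfg{} has been called; the space name idempotent_demo is hypothetical.

```lua
-- Hypothetical space; if_not_exists on the space makes this line repeatable too.
local demo = box.schema.space.create('idempotent_demo', { if_not_exists = true })

-- First call creates the index.
demo:create_index('primary', {
    parts = { { field = 1, type = 'unsigned' } },
    if_not_exists = true,
})

-- Second call finds the existing index and does not raise an error.
demo:create_index('primary', {
    parts = { { field = 1, type = 'unsigned' } },
    if_not_exists = true,
})
```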

index_opts.parts

Specify the index's key parts.

Type: a table of key_part values
Default: {1, 'unsigned'}

Example:

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Note

Alternative way to declare index parts

Before version 2.7.1, if an index consisted of a single part, had extra options such as is_nullable or collation, and was defined like this:

my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})

(with a single pair of braces), Tarantool ignored those options.

Since version 2.7.1, the extra braces may be omitted in an index definition, and both forms are allowed:

-- with extra braces
my_space:create_index('one_part_idx', {parts = {{1, 'unsigned', is_nullable=true}}})

-- without extra braces
my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})
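
One way to check that the option was applied in both forms is to inspect the parts of the created index. This is only a sketch: it assumes Tarantool 2.7.1 or later, a prior box.cfg{} call, and a hypothetical space braces_demo.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('braces_demo')
demo:create_index('primary', { parts = { { 1, 'unsigned' } } })

-- Same key part declared with and without the extra braces.
local with_braces = demo:create_index('with_braces',
    { parts = { { 2, 'unsigned', is_nullable = true } } })
local without_braces = demo:create_index('without_braces',
    { parts = { 2, 'unsigned', is_nullable = true } })

-- On 2.7.1 and later both indexes are expected to report the option.
print(with_braces.parts[1].is_nullable, without_braces.parts[1].is_nullable)
```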

index_opts.dimension: The RTREE index dimension.

Type: number

Default: 2

index_opts.distance: The RTREE index distance type.

Type: string

Default: euclid

Possible values: euclid, manhattan
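
The dimension and distance options only matter for RTREE indexes. The sketch below builds a two-dimensional RTREE index with manhattan distance and asks for the nearest point. The space rtree_demo and its points are made up; box.cfg{} is assumed to have been called.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('rtree_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })

-- RTREE indexes are non-unique and index a field of type 'array'.
demo:create_index('spatial', {
    type = 'RTREE',
    unique = false,
    dimension = 2,
    distance = 'manhattan',
    parts = { { field = 2, type = 'array' } },
})

demo:insert({ 1, { 1, 1 } })
demo:insert({ 2, { 5, 5 } })

-- NEIGHBOR returns tuples ordered by distance from the given point.
local nearest = demo.index.spatial:select({ 0, 0 }, { iterator = 'NEIGHBOR', limit = 1 })
print(nearest[1][1]) -- id of the point closest to (0, 0)
```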

index_opts.sequence: Create a generator for indexes using a sequence object. Learn more from specifying a sequence in create_index().

Type: string or number
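
A short sketch of the sequence option follows. It assumes box.cfg{} has been called; the sequence name demo_seq and the space name sequence_demo are hypothetical.

```lua
-- Hypothetical sequence and space used only for this illustration.
box.schema.sequence.create('demo_seq')
local demo = box.schema.space.create('sequence_demo')

-- The primary index uses the default parts ({1, 'unsigned'})
-- and takes missing key values from the sequence.
demo:create_index('primary', { sequence = 'demo_seq' })

-- A null in the key field asks Tarantool to generate the next value.
demo:insert({ box.NULL, 'first' })
demo:insert({ box.NULL, 'second' })
```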

index_opts.func: Specify the identifier of the functional index function.

Type: string

index_opts.hint

Since: 2.6.1

Specify whether hint optimization is enabled for the TREE index:

If true, the index works faster.
If false, the index size is reduced by half.

Type: boolean
Default: true
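
As a sketch of the trade-off described above, the following creates a secondary TREE index with hints disabled to save memory. It assumes Tarantool 2.6.1 or later, the memtx engine, and a prior box.cfg{} call; the space name hint_demo is hypothetical.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('hint_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })

-- hint = false trades some lookup speed for a smaller index.
demo:create_index('compact', {
    type = 'TREE',
    hint = false,
    unique = false,
    parts = { { field = 2, type = 'string' } },
})
```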

index_opts.bloom_fpr

Vinyl only

Specify the bloom filter’s false positive rate.

Type: number
Default: vinyl.bloom_fpr

index_opts.page_size

Vinyl only

Specify the size of a page used for read and write disk operations.

Type: number
Default: vinyl.page_size

index_opts.range_size

Vinyl only

Specify the default maximum range size (in bytes) for a vinyl index.

Type: number
Default: vinyl.range_size

index_opts.run_count_per_level

Vinyl only

Specify the maximum number of runs per level in the LSM tree.

Type: number
Default: vinyl.run_count_per_level

index_opts.run_size_ratio

Vinyl only

Specify the ratio between the sizes of different levels in the LSM tree.

Type: number
Default: vinyl.run_size_ratio
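
The vinyl-only options can be combined in a single create_index() call. The values below are arbitrary illustrations, not tuning advice; the space name vinyl_demo is hypothetical and box.cfg{} is assumed to have been called.

```lua
-- Hypothetical vinyl space used only for this illustration.
local demo = box.schema.space.create('vinyl_demo', { engine = 'vinyl' })

demo:create_index('primary', {
    parts = { { field = 1, type = 'unsigned' } },
    bloom_fpr = 0.01,            -- bloom filter false positive rate
    page_size = 8192,            -- bytes per page
    range_size = 134217728,      -- 128 MiB maximum range size
    run_count_per_level = 3,     -- runs allowed per LSM level
    run_size_ratio = 4,          -- size ratio between LSM levels
})
```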

key_part

object key_part

A descriptor of a single part in a multipart key. A table of parts is passed to the index_opts.parts option.

key_part.field

Specify the field number or name.

Note

To create a key part by a field name, you need to specify space_object:format() first.

Type: string or number

Examples: Creating an index using field names and numbers

key_part.type: Specify the field type. If the field type is specified in space_object:format(), key_part.type inherits this value.

Type: string

Default: scalar

Possible values: listed in Indexed field types

key_part.collation

Specify the collation used to compare field values. If the field collation is specified in space_object:format(), key_part.collation inherits this value.

Type: string
Possible values: listed in the box.space._collation system space

Example:

-- Create a space --
box.schema.space.create('tester')

-- Use the 'unicode' collation --
box.space.tester:create_index('unicode', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode' } } })

-- Use the 'unicode_ci' collation --
box.space.tester:create_index('unicode_ci', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode_ci' } } })

-- Insert test data --
box.space.tester:insert { 'ЕЛЕ' }
box.space.tester:insert { 'елейный' }
box.space.tester:insert { 'ёлка' }

-- Returns an empty result (no exact match) --
select_unicode = box.space.tester.index.unicode:select({ 'ЁлКа' })
-- Returns 'ёлка' --
select_unicode_ci = box.space.tester.index.unicode_ci:select({ 'ЁлКа' })

key_part.is_nullable

Specify whether nil (or its equivalent such as msgpack.NULL) can be used as a field value. If the is_nullable option is specified in space_object:format(), key_part.is_nullable inherits this value.

You can set this option to true if:

the index type is TREE
the index is not the primary index

It is also legal to insert nothing at all when using trailing nullable fields. Within indexes, such null values are always treated as equal to other null values and are always treated as less than non-null values. Nulls may appear multiple times even in a unique index.

Type: boolean
Default: false

Example:

box.space.tester:create_index('I', {unique = true, parts = {{field = 2, type = 'number', is_nullable = true}}})

Warning

It is legal to create multiple indexes for the same field with different is_nullable values or to call space_object:format() with a different is_nullable value from what is used for an index. When there is a contradiction, the rule is: null is illegal unless is_nullable=true for every index and for the space format.
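
The behaviour of nulls in a unique nullable index can be sketched as follows. The space nullable_demo is hypothetical and has no format, and box.cfg{} is assumed to have been called.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('nullable_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })
demo:create_index('optional', {
    unique = true,
    parts = { { field = 2, type = 'number', is_nullable = true } },
})

-- Nulls may repeat even though the index is unique.
demo:insert({ 1, box.NULL })
demo:insert({ 2, box.NULL })
-- A trailing nullable field may be omitted entirely.
demo:insert({ 3 })
-- Non-null values still have to be unique.
demo:insert({ 4, 10 })

-- Nulls sort before non-null values, so tuple 4 comes last.
local ordered = demo.index.optional:select()
print(ordered[#ordered][1])
```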

key_part.exclude_null

Since: 2.8.2

Specify whether an index can skip tuples with null at this key part. You can set this option to true if:

the index type is TREE
the index is not the primary index

If exclude_null is set to true, is_nullable is set to true automatically. Note that this option can be changed dynamically. In this case, the index is rebuilt.

Such indexes do not store the filtered tuples at all, so indexing is faster.

Type: boolean
Default: false
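
The sketch below shows a tuple being skipped by an index with exclude_null. It assumes Tarantool 2.8.2 or later and a prior box.cfg{} call; the space name exclude_null_demo is hypothetical.

```lua
-- Hypothetical space used only for this illustration.
local demo = box.schema.space.create('exclude_null_demo')
demo:create_index('primary', { parts = { { field = 1, type = 'unsigned' } } })

-- Tuples with null in field 2 are not stored in this index.
demo:create_index('sparse', {
    unique = false,
    parts = { { field = 2, type = 'string', exclude_null = true } },
})

demo:insert({ 1, 'a' })
demo:insert({ 2, box.NULL })

print(demo:count())              -- both tuples are in the space
print(demo.index.sparse:count()) -- only the tuple with a non-null field 2
```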

key_part.path

Specify the path string for a map field.

Type: string

See the examples below:

Creating an index using the path option for map fields
Creating a multikey index using the path option with [*]

Examples

Creating an index using field names and numbers

create_index() can use field names or field numbers to define key parts.

Example 1 (field names):

To create a key part by a field name, you need to specify space_object:format() first.

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Example 2 (field numbers):

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 1 } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 2 } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 3 } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { 3, 2 } })

Creating an index using the path option for map fields (JSON-path indexes)

To create an index for a field that is a map (a path string and a scalar value), specify the path string during index creation:

parts = {field-number, 'data-type', path = 'path-name'}

The index type must be TREE or HASH, and the contents of the field must always be maps with the same path.

Example 1 (the simplest use of path):

box.schema.space.create('space1')
box.space.space1:create_index('primary', { parts = { { field = 1,
                                                       type = 'scalar',
                                                       path = 'age' } } })
box.space.space1:insert({ { age = 44 } })
box.space.space1:select(44)

Example 2 (path combined with format() and JSON syntax, for extra clarity):

box.schema.space.create('space2')
box.space.space2:format({ { 'id', 'unsigned' }, { 'data', 'map' } })
box.space.space2:create_index('info', { parts = { { 'data.full_name["firstname"]', 'str' },
                                                  { 'data.full_name["surname"]', 'str' } } })
box.space.space2:insert({ 1, { full_name = { firstname = 'John', surname = 'Doe' } } })
box.space.space2:select { 'John' }

Creating a multikey index using the path option with [*]

The string in a path option can contain [*], which is called an array index placeholder. Indexes defined this way are useful for JSON documents that all have the same structure.

For example, when creating an index on field #2 for a string document that will start with {'data': [{'name': '...'}, {'name': '...'}], the parts section in the create_index request could look like this:

parts = {{field = 2, type = 'str', path = 'data[*].name'}}

Then tuples containing names can be retrieved quickly with index_object:select({key-value}).

A single field can have multiple keys, as in this example, which retrieves the same tuple twice because there are two keys, 'A' and 'B', which both match the request:

my_space = box.schema.space.create('json_documents')
my_space:create_index('primary')
multikey_index = my_space:create_index('multikey', {parts = {{field = 2, type = 'str', path = 'data[*].name'}}})
my_space:insert({1,
         {data = {{name = 'A'},
                  {name = 'B'}},
          extra_field = 1}})
multikey_index:select({''}, {iterator = 'GE'})

The result of the select request looks like this:

tarantool> multikey_index:select({''},{iterator='GE'})
---
- - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
  - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
...
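
For contrast, selecting by one exact key returns each matching tuple once for that key. The sketch below uses a separate hypothetical space multikey_demo with a non-unique multikey index, and assumes box.cfg{} has been called.

```lua
-- Hypothetical space used only for this illustration.
local docs = box.schema.space.create('multikey_demo')
docs:create_index('primary')
docs:create_index('names', {
    unique = false,
    parts = { { field = 2, type = 'str', path = 'data[*].name' } },
})

docs:insert({ 1, { data = { { name = 'A' }, { name = 'B' } } } })
docs:insert({ 2, { data = { { name = 'B' }, { name = 'C' } } } })

-- Both tuples contain the key 'B'; only tuple 2 contains 'C'.
local with_b = docs.index.names:select({ 'B' })
local with_c = docs.index.names:select({ 'C' })
print(#with_b, #with_c)
```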

The following restrictions exist:

[*] must be alone or must be at the end of a name in the path.
[*] must not appear twice in the path.
If an index has a path with x[*], then no other index can have a path with x.component.
[*] must not appear in the path of a primary key.
If an index has unique=true and has a path with [*], then duplicate keys from different tuples are disallowed, but duplicate keys within the same tuple are allowed.
The field's value must have the structure that the path definition implies, or be nil (nil is not indexed).
In a space with multikey indexes, a tuple cannot contain more than ~8,000 elements indexed that way.

Creating a functional index

Functional indexes are indexes that call a user-defined function to form the index key, instead of relying on the key that Tarantool forms by default. Functional indexes are useful for condensing, truncating, reversing, or otherwise customizing the index.

There are several recommendations for building functional indexes:

The function definition must expect a tuple, which has the contents of fields at the time a data-change request happens, and must return a tuple, which has the contents that will be put in the index.
The create_index definition must include the specification of all key parts, and the custom function must return a table that has the same number of key parts with the same types.
The space must use the memtx engine.
The function must be persistent and deterministic (see Creating a function with body).
The key parts must not depend on JSON paths.
The function must access key-part values by index, not by field name.
Functional indexes must not be primary-key indexes.
Neither a functional index nor the function it uses can be altered, so the only way to change them is to drop the index and create it again.
Only sandboxed functions are suitable for functional indexes.

Example:

A function could make a key using only the first letter of a string field.

Make a space. The space needs a primary-key field, which is not the field that will be used for the functional index:
```
box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })
```
Make a function. The function expects a tuple. In this example, it works on tuple[2] because the key source is field number 2 of the inserted data. Use string.sub() from the string module to get the first character:
```
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
```

Make the function persistent using box.schema.func.create():

box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })

Create a functional index. Specify the fields whose values will be passed to the function. Specify the function:

box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })

Insert a few tuples. Select using only the first letter; this works because the first letter is the key. Alternatively, select using the same function that was used for insertion:

box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))

The results of the two select requests look like this:

tarantool> box.space.tester.index.func_index:select('w')
---
- - ['a', 'wombat']
...
tarantool> box.space.tester.index.func_index:select(box.func.my_func:call({{'tester','wombat'}}));
---
- - ['a', 'wombat']
...

Here is the full code of the example:

box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })
box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })
box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))

Functions for functional indexes can return multiple keys. Such functions are called "multikey" functions.

To create a multikey function, the options of box.schema.func.create() must include is_multikey = true. The return value must be a table of tuples. If a multikey function returns N tuples, then N keys will be added to the index.

Example:

tester = box.schema.space.create('withdata')
tester:format({ { name = 'name', type = 'string' },
                { name = 'address', type = 'string' } })
name_index = tester:create_index('name', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple)
       local address = string.split(tuple[2])
       local ret = {}
       for _, v in pairs(address) do
         table.insert(ret, {utf8.upper(v)})
       end
       return ret
     end]]
box.schema.func.create('address',
        { body = function_code,
          is_deterministic = true,
          is_sandboxed = true,
          is_multikey = true })
addr_index = tester:create_index('addr', { unique = false,
                                           func = 'address',
                                           parts = { { field = 1, type = 'string',
                                                  collation = 'unicode_ci' } } })
tester:insert({ "James", "SIS Building Lambeth London UK" })
tester:insert({ "Sherlock", "221B Baker St Marylebone London NW1 6XE UK" })
addr_index:select('Uk')
