Tarantool combines an in-memory DBMS and a Lua server in a single platform providing ACID-compliant storage. It comes in two editions: Community and Enterprise. The use cases for Tarantool vary from ultra-fast cache to product data marts and smart queue services.

Here are some of Tarantool’s key characteristics:

Easy handling of OLTP workloads: processes hundreds of thousands of requests per second (RPS)
Data integrity: write-ahead log (WAL) and data snapshots
Cooperative multitasking: transactions are performed in lightweight coroutines with no inter-thread locking
Advanced indexing: composite indexes, locale support, indexing by nested fields and arrays
Compute close to data: Lua server and Just-In-Time compiler on board
Durable distributed storage: multiple failover modes and Raft-based synchronous replication available

Tarantool allows executing code alongside data, which helps increase the speed of operations. Developers can implement any business logic with Lua, and a single Tarantool instance can also receive SQL requests.

Tarantool has a variety of compatible modules (Lua rocks). You can pick the ones that you need and install them manually.

Tarantool runs on Linux (x86_64, aarch64), macOS (x86_64, aarch64), and FreeBSD (x86_64).

You can use Tarantool with a programming language you’re familiar with. For this purpose, a number of connectors are provided.

Editions

Tarantool comes in two editions: the open-source Community Edition (CE) and the commercial Enterprise Edition (EE).

Tarantool Community Edition

Tarantool Community Edition lets you develop applications and speed up a system in operation. It features synchronous replication, affords easy scalability, and includes tools to develop efficient applications. The Tarantool community helps with any practical questions regarding the Community Edition.

Tarantool Enterprise Edition

Tarantool Enterprise Edition provides advanced tools for administration, deployment, and security management, along with premium support services. This edition includes all the Community Edition features and is more predictable in terms of solution cost and maintenance. The Enterprise Edition is shipped as an SDK and includes a number of closed-source modules.

Note

In this documentation, topics related to Enterprise Edition features are marked with an Enterprise Edition admonition.

The Enterprise Edition provides an extended feature set for developing and managing clustered Tarantool applications, such as:

Security audit log.
SSL support for traffic encryption.
Centralized configuration storage.
Supervised failover.
Tuple compression.
Non-blocking DDL.
Security enforcement features.
Read views.
Write-ahead log extensions.
Flight recorder.
Tarantool bindings to OpenLDAP.
Enterprise database connectivity: Oracle and any ODBC-supported DBMS (for example, MySQL, Microsoft SQL Server).
Static package for standalone Linux systems.

The Enterprise Edition is distributed in the form of an SDK, which includes the following key components:

The extended Enterprise version of the tt utility.
Tarantool Cluster Manager – a web-based visual tool for managing Tarantool clusters.

Use cases

Fast first-class storage

Primary storage
- No secondary storage required
Tolerance to high write loads
Support of relational approaches
Composite secondary indexes
- Data access, data slices
Predictable request latency

Advanced cache

Write-behind caching
Secondary index support
Complex invalidation algorithm support

Smart queue

Support of various identification techniques
Advanced task lifecycle management
- Task scheduling
- Archiving of completed tasks

Data-centric applications

Arbitrary data flows from many sources
Incoming data processing
Storage
Background cycle processing
- Scheduling support

Getting started

This section will get you acquainted with Tarantool.

Installing Tarantool

This section explains how to download and set up Tarantool Enterprise Edition and run a sample application provided with it. To learn how to download and install Tarantool Community Edition, see the Download page.

Note

The tt utility provides the ability to install and work with multiple Tarantool versions.

System requirements

The recommended system requirements for running Tarantool Enterprise are as follows.

Hardware requirements

To fully ensure the fault tolerance of a distributed data storage system, at least three physical computers or virtual servers are required.

For testing/development purposes, the system can be deployed using a smaller number of servers. However, it is not recommended to use such configurations for production.

Software requirements

As host operating systems, Tarantool Enterprise Edition supports Red Hat Enterprise Linux and CentOS versions 7.5 and higher.

Note

Tarantool Enterprise can run on other systemd-based Linux distributions, but it is not tested on them and may not work as expected.
glibc 2.17-260.el7_6.6 or higher is required. Check the installed version and update it if needed:
```
$ rpm -q glibc
glibc-2.17-196.el7_4.2
$ yum update glibc
```
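
The check above can also be scripted. The following is a minimal sketch, not part of the Tarantool distribution: version_at_least is a hypothetical helper name, and it relies on GNU sort -V, whose ordering approximates but does not exactly reproduce RPM version comparison. The version strings are the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is not lower than version $2.
# Relies on GNU `sort -V`, which only approximates RPM's comparison rules.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

required='2.17-260.el7_6.6'
# On a real host, take the installed version from rpm instead:
#   installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' glibc)
installed='2.17-196.el7_4.2'

if version_at_least "$installed" "$required"; then
    echo "glibc $installed is recent enough"
else
    echo "glibc $installed is older than $required, run: yum update glibc"
fi
```

With the sample values, the script takes the second branch, which is why the yum update glibc step follows the rpm query in the example above.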

Network requirements

Hereinafter, “storage servers” or “Tarantool servers” are the computers used to store and process data, and “administration server” is the computer used by the system operator to install and configure the product.

A Tarantool cluster has a full mesh topology, so every Tarantool server must be able to send traffic to and receive traffic from the TCP/UDP ports used by the cluster’s instances (see advertise_uri: <host>:<port> and config: advertise_uri: '<host>:<port>' in /etc/tarantool/conf.d/*.yml for each instance). For example:

# /etc/tarantool/conf.d/*.yml

myapp.s2-replica:
  advertise_uri: localhost:3305 # this is a TCP/UDP port
  http_port: 8085

all:
  ...
  hosts:
    storage-1:
      config:
        advertise_uri: 'vm1:3301' # this is a TCP/UDP port
        http_port: 8081
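
To check the full mesh requirement before deployment, it can help to list every advertised endpoint and probe it from each server. The sketch below is one possible approach, not a Tarantool tool: list_advertise_uris and probe_uri are hypothetical helper names, the sed pattern only handles the one-line advertise_uri form shown above, and the probe assumes bash with /dev/tcp support and the coreutils timeout command. It checks TCP connectivity only.

```shell
#!/bin/sh
# Hypothetical helper: print each advertise_uri value (<host>:<port>)
# found in the given YAML files, one per line.
list_advertise_uris() {
    sed -n "s/^[[:space:]]*advertise_uri:[[:space:]]*'\{0,1\}\([^'[:space:]#]*\).*/\1/p" "$@"
}

# Hypothetical helper: succeed if a TCP connection to <host>:<port> opens
# within two seconds. Assumes bash with /dev/tcp and coreutils `timeout`.
probe_uri() {
    host=${1%:*}
    port=${1##*:}
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage on a Tarantool server (not executed here):
#   for uri in $(list_advertise_uris /etc/tarantool/conf.d/*.yml); do
#       probe_uri "$uri" && echo "ok   $uri" || echo "FAIL $uri"
#   done
```

Running the loop on every server, not just one, is what verifies the mesh: each server needs to reach all the others.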

To configure remote monitoring or to connect via the administrative console, the administration server should be able to access the following TCP ports on Tarantool servers:

22 to use the SSH protocol,
the ports specified in the instance configuration (http_port in the example above) to monitor HTTP metrics.

Additionally, it is recommended to apply the following settings for sysctl on all Tarantool servers:

$ # TCP KeepAlive setting
$ sysctl -w net.ipv4.tcp_keepalive_time=60
$ sysctl -w net.ipv4.tcp_keepalive_intvl=5
$ sysctl -w net.ipv4.tcp_keepalive_probes=5

This optional setup of the Linux network stack helps speed up the troubleshooting of network connectivity when the server physically fails. To achieve maximum performance, you may also need to configure other network stack parameters that are not specific to the Tarantool DBMS. For more information, please refer to the Network Performance Tuning Guide section of the RHEL7 user documentation.
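
As a worked example of what these values mean: an idle connection is first probed after tcp_keepalive_time seconds and is dropped after tcp_keepalive_probes unanswered probes sent tcp_keepalive_intvl seconds apart. The sketch below computes the resulting approximate detection time and compares it with the usual Linux defaults (7200, 75, and 9). The detection_time helper is a hypothetical name. Note that values set with sysctl -w do not survive a reboot; to persist them, the same keys can be placed in a file under /etc/sysctl.d/.

```shell
#!/bin/sh
# Approximate number of seconds before the kernel drops an idle TCP
# connection to a dead peer: idle time + interval * number of probes.
# Arguments: tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes.
detection_time() {
    echo $(( $1 + $2 * $3 ))
}

tuned=$(detection_time 60 5 5)          # values recommended above
defaults=$(detection_time 7200 75 9)    # usual Linux defaults

echo "tuned:    about ${tuned}s"        # about 85s
echo "defaults: about ${defaults}s"     # about 7875s
```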

Package contents

The latest release packages of Tarantool Enterprise are available in the customer zone on the Tarantool website. Please contact support@tarantool.io for access.

Each package is distributed as a tar + gzip archive and includes the following components and features:

Static Tarantool binary for simplified deployment in Linux environments.
tt command-line utility that provides a unified command-line interface for managing Tarantool-based applications. See tt CLI utility for details.
Tarantool Cluster Manager – a web-based interface for managing Tarantool EE clusters. See Tarantool Cluster Manager for details.
Selection of open and closed source modules.
Sample application walking you through all included modules.

Archive contents:

tarantool is the main executable of Tarantool.
tt is the tt command-line utility.
tcm is the Tarantool Cluster Manager executable.
examples/ is the directory containing sample applications:
- pg_writethrough_cache/ is an application showcasing how Tarantool can cache data written to, for example, a PostgreSQL database;
- ora_writebehind_cache/ is an application showcasing how Tarantool can cache writes and queue them to, for example, an Oracle database;
- docker/ is an application designed to be easily packed into a Docker container;
rocks/ is the directory containing a selection of additional open and closed source modules included in the distribution as an offline rocks repository. See the rocks reference for details.
templates/ is the directory containing template files for your application development environment.

Installation

The delivered tar + gzip archive should be uploaded to a server and unpacked:

$ tar xvf tarantool-enterprise-sdk-<version>.tar.gz

No further installation is required as the unpacked binaries are almost ready to go. Go to the directory with the binaries (tarantool-enterprise) and add them to the executable path by running the script provided by the distribution:

$ source ./env.sh

Make sure you have sufficient privileges to run the script and that the file is executable. If not, use the chmod and chown commands to adjust its permissions and ownership.
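
After sourcing env.sh, it is worth confirming that the shell can find the executables listed under Archive contents. The following sketch uses only the POSIX command -v builtin; check_tools is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given commands are on PATH.
# Returns non-zero if at least one of them is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools tarantool tt tcm || echo "Run 'source ./env.sh' from the tarantool-enterprise directory first."
```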

Creating your first Tarantool database

Example on GitHub: create_db

In this tutorial, you create a Tarantool database, write data to it, and select data from this database.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Install Tarantool.

Note

The tt utility provides the ability to install Tarantool software using the tt install command.

Creating an application

The tt create command can be used to create an application from a predefined or custom template. In this tutorial, the application layout is prepared manually:

Create a tt environment in the current directory using the tt init command.
Inside the instances.enabled directory of the created tt environment, create the create_db directory.
Inside instances.enabled/create_db, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment. In this example, there is one instance:
```
instance001:
```
- config.yaml contains basic instance configuration:
```
groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
```
  The instance in the configuration accepts incoming requests on the 3301 port.
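
The manual steps above can also be scripted. This is a minimal sketch that assumes tt init has already been run in the current directory; mkdir -p is used so that the script also works if instances.enabled does not exist yet. The file contents are the same as shown above.

```shell
#!/bin/sh
# Create the application directory inside the tt environment.
app_dir=instances.enabled/create_db
mkdir -p "$app_dir"

# One instance to run in this environment.
cat > "$app_dir/instances.yml" <<'EOF'
instance001:
EOF

# Basic configuration: instance001 listens on 127.0.0.1:3301.
cat > "$app_dir/config.yaml" <<'EOF'
groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
EOF

ls "$app_dir"
```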

Working with the database

Starting an instance

Start the Tarantool instance from the tt environment directory using tt start:
```
$ tt start create_db
```

To check the running instance, use the tt status command:

$ tt status create_db
INSTANCE               STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
create_db:instance001  RUNNING  8685  RW    ready   running  --

Connect to the instance with tt connect:
```
$ tt connect create_db:instance001
   • Connecting to the instance...
   • Connected to create_db:instance001

create_db:instance001>
```
This command opens an interactive Tarantool console with the create_db:instance001> prompt. Now you can enter requests in the command line.

Creating a space

Create a space named bands:

create_db:instance001> box.schema.space.create('bands')
---
- engine: memtx
  before_replace: 'function: 0x010229d788'
  field_count: 0
  is_sync: false
  is_local: false
  on_replace: 'function: 0x010229d750'
  temporary: false
  index: []
  type: normal
  enabled: false
  name: bands
  id: 512
- created
...

Format the created space by specifying field names and types:

create_db:instance001> box.space.bands:format({
                           { name = 'id', type = 'unsigned' },
                           { name = 'band_name', type = 'string' },
                           { name = 'year', type = 'unsigned' }
                       })
---
...

Creating indexes

Create the primary index based on the id field:

create_db:instance001> box.space.bands:create_index('primary', { parts = { 'id' } })
---
- unique: true
  parts:
  - fieldno: 1
    sort_order: asc
    type: unsigned
    exclude_null: false
    is_nullable: false
  hint: true
  id: 0
  type: TREE
  space_id: 512
  name: primary
...

Create the secondary index based on the band_name field:

create_db:instance001> box.space.bands:create_index('secondary', { parts = { 'band_name' } })
---
- unique: true
  parts:
  - fieldno: 2
    sort_order: asc
    type: string
    exclude_null: false
    is_nullable: false
  hint: true
  id: 1
  type: TREE
  space_id: 512
  name: secondary
...

Writing and selecting data

Insert three tuples into the space:

create_db:instance001> box.space.bands:insert { 1, 'Roxette', 1986 }
---
- [1, 'Roxette', 1986]
...
create_db:instance001> box.space.bands:insert { 2, 'Scorpions', 1965 }
---
- [2, 'Scorpions', 1965]
...
create_db:instance001> box.space.bands:insert { 3, 'Ace of Base', 1987 }
---
- [3, 'Ace of Base', 1987]
...

Select a tuple using the primary index:

create_db:instance001> box.space.bands:select { 3 }
---
- - [3, 'Ace of Base', 1987]
...

Select tuples using the secondary index:

create_db:instance001> box.space.bands.index.secondary:select{'Scorpions'}
---
- - [2, 'Scorpions', 1965]
...

Creating a sharded cluster

Example on GitHub: sharded_cluster_crud

In this tutorial, you get a sharded cluster up and running on your local machine and learn how to manage the cluster using the tt utility. This cluster uses the following external modules:

vshard enables sharding in the cluster.
crud allows you to manipulate data in the sharded cluster.

The cluster created in this tutorial includes five instances: one router and four storages. The storages constitute two replica sets.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Install Tarantool.

Note

The tt utility provides the ability to install Tarantool software using the tt install command.

Creating a cluster application

The tt create command can be used to create an application from a predefined or custom template. For example, the built-in vshard_cluster template enables you to create a ready-to-run sharded cluster application.

In this tutorial, the application layout is prepared manually:

Create a tt environment in the current directory by executing the tt init command.
Inside the empty instances.enabled directory of the created tt environment, create the sharded_cluster_crud directory.
Inside instances.enabled/sharded_cluster_crud, create the following files:
- instances.yml specifies instances to run in the current environment.
- config.yaml specifies the cluster configuration.
- storage.lua contains code specific for storages.
- router.lua contains code specific for a router.
- sharded_cluster_crud-scm-1.rockspec specifies external dependencies required by the application.

The next section, Developing the application, shows how to configure the cluster and write code for routing read and write requests to different storages.

Developing the application

Configuring instances to run

Open the instances.yml file and add the following content:

storage-a-001:
storage-a-002:
storage-b-001:
storage-b-002:
router-a-001:

This file specifies instances to run in the current environment.
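
The application layout, including instances.yml, can also be prepared from the shell. The sketch below is one way to do it, assuming it is run from the tt environment directory; the remaining files are created empty and filled in during the following steps.

```shell
#!/bin/sh
app_dir=instances.enabled/sharded_cluster_crud
mkdir -p "$app_dir"

# Placeholders for the files that are filled in later in this tutorial.
# touch does not overwrite files that already exist.
touch "$app_dir/config.yaml" "$app_dir/storage.lua" "$app_dir/router.lua" \
      "$app_dir/sharded_cluster_crud-scm-1.rockspec"

# Four storages and one router, one instance name per line.
for name in storage-a-001 storage-a-002 storage-b-001 storage-b-002 router-a-001; do
    echo "${name}:"
done > "$app_dir/instances.yml"

cat "$app_dir/instances.yml"
```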

Configuring the cluster

This section describes how to configure the cluster in the config.yaml file.

Step 1: Configuring credentials

Add the credentials configuration section:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [ replication ]
    storage:
      password: 'secret'
      roles: [ sharding ]

In this section, two users with the specified passwords are created:

The replicator user with the replication role.
The storage user with the sharding role.

These users are intended to maintain replication and sharding in the cluster.

Important

It is not recommended to store passwords as plain text in a YAML configuration. To learn how to load passwords from safe storage such as external files or environment variables, see Loading secrets from safe storage.

Step 2: Specifying advertise URIs

Add the iproto.advertise section:

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

In this section, the following options are configured:

iproto.advertise.peer specifies how to advertise the current instance to other cluster members. In particular, this option informs other replica set members that the replicator user should be used to connect to the current instance.
iproto.advertise.sharding specifies how to advertise the current instance to a router and rebalancer.

The cluster topology defined in the following section also specifies the iproto.advertise.client option for each instance. This option accepts a URI used to advertise the instance to clients. For example, Tarantool Cluster Manager uses these URIs to connect to cluster instances.

Step 3: Configuring bucket count

Specify the total number of buckets in a sharded cluster using the sharding.bucket_count option:

sharding:
  bucket_count: 1000
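
As a quick arithmetic check, assuming buckets are spread evenly over replica sets of equal weight, each of the two storage replica sets defined in this tutorial receives 500 buckets. This matches the available_rw: 500 values that vshard.router.info() reports in the Checking the cluster status section. The variable names are illustrative.

```shell
#!/bin/sh
bucket_count=1000   # sharding.bucket_count from config.yaml
replicasets=2       # storage-a and storage-b

per_replicaset=$(( bucket_count / replicasets ))
echo "buckets per replica set: $per_replicaset"   # 500
```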

Step 4: Defining the cluster topology

Define the cluster topology inside the groups section. The cluster includes two groups:

storages includes two replica sets. Each replica set contains two instances.
routers includes one router instance.

Here is a schematic view of the cluster topology:

groups:
  storages:
    replicasets:
      storage-a:
        # ...
      storage-b:
        # ...
  routers:
    replicasets:
      router-a:
        # ...

To configure storages, add the following code inside the groups section:

storages:
  roles: [ roles.crud-storage ]
  app:
    module: storage
  sharding:
    roles: [ storage ]
  replication:
    failover: manual
  replicasets:
    storage-a:
      leader: storage-a-001
      instances:
        storage-a-001:
          iproto:
            listen:
            - uri: '127.0.0.1:3302'
            advertise:
              client: '127.0.0.1:3302'
        storage-a-002:
          iproto:
            listen:
            - uri: '127.0.0.1:3303'
            advertise:
              client: '127.0.0.1:3303'
    storage-b:
      leader: storage-b-001
      instances:
        storage-b-001:
          iproto:
            listen:
            - uri: '127.0.0.1:3304'
            advertise:
              client: '127.0.0.1:3304'
        storage-b-002:
          iproto:
            listen:
            - uri: '127.0.0.1:3305'
            advertise:
              client: '127.0.0.1:3305'

The main group-level options here are:

roles: This option enables the roles.crud-storage role provided by the CRUD module for all storage instances.
app: The app.module option specifies that code specific to storages should be loaded from the storage module. This is explained below in the Adding storage code section.
sharding: The sharding.roles option specifies that all instances inside this group act as storages. A rebalancer is selected automatically from two master instances.
replication: The replication.failover option specifies that a leader in each replica set should be specified manually.
replicasets: This section configures two replica sets that constitute cluster storages.

To configure a router, add the following code inside the groups section:
```
routers:
  roles: [ roles.crud-router ]
  app:
    module: router
  sharding:
    roles: [ router ]
  replicasets:
    router-a:
      instances:
        router-a-001:
          iproto:
            listen:
            - uri: '127.0.0.1:3301'
            advertise:
              client: '127.0.0.1:3301'
```
The main group-level options here are:
- roles: This option enables the roles.crud-router role provided by the CRUD module for a router instance.
- app: The app.module option specifies that code specific to a router should be loaded from the router module. This is explained below in the Adding router code section.
- sharding: The sharding.roles option specifies that an instance inside this group acts as a router.
- replicasets: This section configures a replica set with one router instance.

Resulting configuration

The resulting config.yaml file should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [ replication ]
    storage:
      password: 'secret'
      roles: [ sharding ]

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

sharding:
  bucket_count: 1000

groups:
  storages:
    roles: [ roles.crud-storage ]
    app:
      module: storage
    sharding:
      roles: [ storage ]
    replication:
      failover: manual
    replicasets:
      storage-a:
        leader: storage-a-001
        instances:
          storage-a-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
              advertise:
                client: '127.0.0.1:3302'
          storage-a-002:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'
              advertise:
                client: '127.0.0.1:3303'
      storage-b:
        leader: storage-b-001
        instances:
          storage-b-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3304'
              advertise:
                client: '127.0.0.1:3304'
          storage-b-002:
            iproto:
              listen:
              - uri: '127.0.0.1:3305'
              advertise:
                client: '127.0.0.1:3305'
  routers:
    roles: [ roles.crud-router ]
    app:
      module: router
    sharding:
      roles: [ router ]
    replicasets:
      router-a:
        instances:
          router-a-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
              advertise:
                client: '127.0.0.1:3301'
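
Each instance listed in instances.yml is expected to have a matching entry in config.yaml, so a quick consistency check can catch typos before the cluster is started. The sketch below is an illustrative, text-based check rather than a real YAML parser: check_instances is a hypothetical helper that assumes one instance name per line in instances.yml, as in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper: check that each instance named in instances.yml ($1)
# also appears as a key in config.yaml ($2). Text-based, not a YAML parser.
check_instances() {
    status=0
    for name in $(sed -n 's/^\([A-Za-z0-9_-]\{1,\}\):.*/\1/p' "$1"); do
        if grep -q "^[[:space:]]*${name}:" "$2"; then
            echo "ok:      $name"
        else
            echo "missing: $name"
            status=1
        fi
    done
    return "$status"
}

# Usage from the tt environment directory (not executed here):
#   check_instances instances.enabled/sharded_cluster_crud/instances.yml \
#                   instances.enabled/sharded_cluster_crud/config.yaml
```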

Adding storage code

Open the storage.lua file and define a space and indexes inside box.watch() as follows:

box.watch('box.status', function()
    if box.info.ro then
        return
    end

    box.schema.create_space('bands', {
        format = {
            { name = 'id', type = 'unsigned' },
            { name = 'bucket_id', type = 'unsigned' },
            { name = 'band_name', type = 'string' },
            { name = 'year', type = 'unsigned' }
        },
        if_not_exists = true
    })
    box.space.bands:create_index('id', { parts = { 'id' }, if_not_exists = true })
    box.space.bands:create_index('bucket_id', { parts = { 'bucket_id' }, unique = false, if_not_exists = true })
end)

The box.schema.create_space() function creates a space. Note that the created bands space includes the bucket_id field. This field represents a sharding key used to partition a dataset across different storage instances.
space_object:create_index() creates two indexes based on the id and bucket_id fields.

Note

In a sharded space, uniqueness by secondary index is only guaranteed within a single shard, not across the whole cluster.

Adding router code

Open the router.lua file and load the vshard module as follows:

local vshard = require('vshard')

Configuring build settings

Open the sharded_cluster_crud-scm-1.rockspec file and add the following content:

package = 'sharded_cluster_crud'
version = 'scm-1'
source  = {
    url = '/dev/null',
}

dependencies = {
    'vshard == 0.1.27',
    'crud == 1.5.2'
}
build = {
    type = 'none';
}

The dependencies section includes the specified versions of the vshard and crud modules. To install dependencies, you need to build the application.

Building the application

In the terminal, open the tt environment directory. Then, execute the tt build command:

$ tt build sharded_cluster_crud
   • Running rocks make
No existing manifest. Attempting to rebuild...
   • Application was successfully built

This installs the vshard and crud modules defined in the *.rockspec file to the .rocks directory.

Working with the cluster

Starting instances

To start all instances in the cluster, execute the tt start command:

$ tt start sharded_cluster_crud
   • Starting an instance [sharded_cluster_crud:storage-a-001]...
   • Starting an instance [sharded_cluster_crud:storage-a-002]...
   • Starting an instance [sharded_cluster_crud:storage-b-001]...
   • Starting an instance [sharded_cluster_crud:storage-b-002]...
   • Starting an instance [sharded_cluster_crud:router-a-001]...

Bootstrapping a cluster

After starting instances, you need to bootstrap the cluster as follows:

Connect to the router instance using tt connect:

$ tt connect sharded_cluster_crud:router-a-001
   • Connecting to the instance...
   • Connected to sharded_cluster_crud:router-a-001

Call vshard.router.bootstrap() to perform the initial cluster bootstrap and distribute all buckets across the replica sets:
```
sharded_cluster_crud:router-a-001> vshard.router.bootstrap()
---
- true
...
```

Checking the cluster status

To check the cluster status, execute vshard.router.info() on the router:

sharded_cluster_crud:router-a-001> vshard.router.info()
---
- replicasets:
    storage-b:
      replica:
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3305
        name: storage-b-002
      bucket:
        available_rw: 500
      master:
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3304
        name: storage-b-001
      name: storage-b
    storage-a:
      replica:
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3303
        name: storage-a-002
      bucket:
        available_rw: 500
      master:
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3302
        name: storage-a-001
      name: storage-a
  bucket:
    unreachable: 0
    available_ro: 0
    unknown: 0
    available_rw: 1000
  status: 0
  alerts: []
...

The output includes the following sections:

replicasets: contains information about storages and their availability.
bucket: displays the total number of read-write and read-only buckets that are currently available for this router.
status: a number from 0 to 3 that indicates whether there are any issues with the cluster. 0 means that there are no issues.
alerts: might describe the exact issues related to bootstrapping a cluster, for example, connection issues, failover events, or unidentified buckets.

Writing and selecting data

To insert sample data, call crud.insert_many() on the router:

crud.insert_many('bands', {
    { 1, box.NULL, 'Roxette', 1986 },
    { 2, box.NULL, 'Scorpions', 1965 },
    { 3, box.NULL, 'Ace of Base', 1987 },
    { 4, box.NULL, 'The Beatles', 1960 },
    { 5, box.NULL, 'Pink Floyd', 1965 },
    { 6, box.NULL, 'The Rolling Stones', 1962 },
    { 7, box.NULL, 'The Doors', 1965 },
    { 8, box.NULL, 'Nirvana', 1987 },
    { 9, box.NULL, 'Led Zeppelin', 1968 },
    { 10, box.NULL, 'Queen', 1970 }
})

Calling this function distributes the data approximately evenly across the cluster nodes.

To get a tuple by the specified ID, call the crud.get() function:

sharded_cluster_crud:router-a-001> crud.get('bands', 4)
---
- rows:
  - [4, 161, 'The Beatles', 1960]
  metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
    {'name': 'band_name', 'type': 'string'}, {'name': 'year', 'type': 'unsigned'}]
- null
...

To insert a new tuple, call crud.insert():

sharded_cluster_crud:router-a-001> crud.insert('bands', {11, box.NULL, 'The Who', 1962})
---
- rows:
  - [11, 652, 'The Who', 1962]
  metadata: [{'name': 'id', 'type': 'unsigned'}, {'name': 'bucket_id', 'type': 'unsigned'},
    {'name': 'band_name', 'type': 'string'}, {'name': 'year', 'type': 'unsigned'}]
- null
...

Checking data distribution

To check how data is distributed across the replica sets, follow the steps below:

Connect to any storage in the storage-a replica set:

$ tt connect sharded_cluster_crud:storage-a-001
   • Connecting to the instance...
   • Connected to sharded_cluster_crud:storage-a-001

Then, select all tuples in the bands space:

sharded_cluster_crud:storage-a-001> box.space.bands:select()
---
- - [1, 477, 'Roxette', 1986]
  - [2, 401, 'Scorpions', 1965]
  - [4, 161, 'The Beatles', 1960]
  - [5, 172, 'Pink Floyd', 1965]
  - [6, 64, 'The Rolling Stones', 1962]
  - [8, 185, 'Nirvana', 1987]
...

Connect to any storage in the storage-b replica set:

$ tt connect sharded_cluster_crud:storage-b-001
   • Connecting to the instance...
   • Connected to sharded_cluster_crud:storage-b-001

Select all tuples in the bands space to make sure it contains another subset of data:

sharded_cluster_crud:storage-b-001> box.space.bands:select()
---
- - [3, 804, 'Ace of Base', 1987]
  - [7, 693, 'The Doors', 1965]
  - [9, 644, 'Led Zeppelin', 1968]
  - [10, 569, 'Queen', 1970]
  - [11, 652, 'The Who', 1962]
...

Getting started with Tarantool Cluster Manager

Enterprise Edition

This tutorial uses Tarantool Enterprise Edition.

Example on GitHub: tcm_get_started

In this tutorial, you get Tarantool Cluster Manager up and running on your local system, deploy a local Tarantool EE cluster, and learn to manage the cluster from the TCM web UI.

To complete this tutorial, you need:

A Linux machine with glibc 2.17 or later.
A web browser: Chromium-based (Chromium version 108 or later), Mozilla Firefox 101 or later, or another up-to-date browser.
The Tarantool Enterprise Edition SDK 3.0 or later in the tar.gz archive. See Installing Tarantool for information about getting the archive.

For more detailed information about using TCM, refer to Tarantool Cluster Manager.

Setting up Tarantool EE

Extract the Tarantool EE SDK archive:
```
$ tar -xvzf tarantool-enterprise-sdk-gc64-<VERSION>-<HASH>-r<REVISION>.linux.x86_64.tar.gz
```
This creates the tarantool-enterprise directory beside the archive. The directory contains three executables for key Tarantool EE components:
- tarantool – Tarantool Enterprise Edition.
- tt – the tt command-line utility.
- tcm – Tarantool Cluster Manager.
Add the Tarantool EE components to the executable path by executing the env.sh script included in the distribution:
```
$ source tarantool-enterprise/env.sh
```

To check that the Tarantool EE executables tarantool, tt, and tcm are available in the system, print their versions:

$ tarantool --version
Tarantool Enterprise 3.0.0-0-gf58f7d82a-r23-gc64
Target: Linux-x86_64-RelWithDebInfo
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/home/centos/release/sdk/tarantool/static-build/tarantool-prefix -DENABLE_BACKTRACE=TRUE
Compiler: GNU-9.3.1
C_FLAGS: -fexceptions -funwind-tables -fasynchronous-unwind-tables -static-libstdc++ -fno-common -msse2  -fmacro-prefix-map=/home/centos/release/sdk/tarantool=. -std=c11 -Wall -Wextra -Wno-gnu-alignof-expression -fno-gnu89-inline -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2
CXX_FLAGS: -fexceptions -funwind-tables -fasynchronous-unwind-tables -static-libstdc++ -fno-common -msse2  -fmacro-prefix-map=/home/centos/release/sdk/tarantool=. -std=c++11 -Wall -Wextra -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2
$ tt version
Tarantool CLI EE 2.1.0, linux/amd64. commit: d80c2e3
$ tcm version
1.0.0-0-gd38b12c2

Starting TCM

Tarantool Cluster Manager is ready to run out of the box. To start TCM, run the following command:

$ tcm --storage.etcd.embed.enabled

Important

The TCM bootstrap log in the terminal includes a message with the credentials to use for the first login. Make sure to save them somewhere.

Jan 24 05:51:28.443 WRN Generated super admin credentials login=admin password=qF3A5rjGurjAwmlYccJ7JrL5XqjbIHY6

The --storage.etcd.embed.enabled option makes TCM start its own instance of etcd on bootstrap. This etcd instance is used for storing the TCM configuration.
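
Because the generated password appears only in the bootstrap log, it can be convenient to capture it when starting TCM. The following sketch extracts the login and password from a log line in the format shown above. The extract_credentials helper and the tcm.log file name are hypothetical, and the log format is assumed to stay as in this example.

```shell
#!/bin/sh
# Hypothetical helper: read TCM log lines on stdin and print
# "<login> <password>" for the generated super admin credentials.
extract_credentials() {
    sed -n 's/.*Generated super admin credentials login=\([^ ]*\) password=\([^ ]*\).*/\1 \2/p'
}

# Sample line copied from the bootstrap log above.
line='Jan 24 05:51:28.443 WRN Generated super admin credentials login=admin password=qF3A5rjGurjAwmlYccJ7JrL5XqjbIHY6'
printf '%s\n' "$line" | extract_credentials

# With a saved log (assumes the output was captured, for example
# with `tcm --storage.etcd.embed.enabled 2>&1 | tee tcm.log`):
#   extract_credentials < tcm.log
```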

Note

During development, it is also convenient to use the TCM-embedded etcd as a configuration storage for Tarantool EE clusters connected to TCM. Learn more in Centralized configuration storages.

Logging into TCM

Open a web browser and go to http://127.0.0.1:8080/.
Enter the username and the password you got from the TCM bootstrap log in the previous step.
Click Log in.

After a successful login, the TCM web UI opens.

Setting up a Tarantool EE cluster

To prepare a Tarantool EE cluster, complete the following steps:

Define the cluster connection settings in TCM.
Configure the cluster in TCM.
Start the cluster instances locally using the tt utility.

Defining the cluster’s connection settings in TCM

A freshly installed TCM has a predefined cluster named Default cluster. It doesn’t have any configuration or topology out of the box. Its initial properties include the etcd and Tarantool connection parameters. Check these properties to find out where TCM sends the cluster configuration that you write.

To view the Default cluster’s properties:

Go to Clusters and click Edit in the Actions menu opposite the cluster name.
Click Next on the General tab.
Find the connection properties of the configuration storage that the cluster uses. By default, it’s an etcd running on port 2379 (default etcd port) on the same host. The key prefix used for the cluster configuration is /default. Click Next.
Check the Tarantool user that TCM uses to connect to the cluster instances. It’s guest by default.

Configuring a cluster in TCM

TCM provides a web-based editor for writing cluster configurations. It is connected to the configuration storage (etcd in this case): all changes you make in the browser are sent to etcd in one click.

To write the cluster configuration and upload it to the etcd storage:

Go to Configuration.
Click + and provide an arbitrary name for the configuration file, for example, all.

Paste the following YAML configuration into the editor:

credentials:
  users:
    guest:
      roles: [super]
groups:
  group-001:
    replicasets:
      replicaset-001:
        replication:
          failover: manual
        leader: instance-001
        instances:
          instance-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
              advertise:
                client: '127.0.0.1:3301'
          instance-002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
              advertise:
                client: '127.0.0.1:3302'
          instance-003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'
              advertise:
                client: '127.0.0.1:3303'

This configuration sets up a cluster of three nodes in one replica set: one leader and two followers.

Click Apply to send the configuration to etcd.

When the cluster configuration is saved, you can see the cluster topology on the Stateboard page:

However, the cluster instances are offline because they aren’t deployed yet.

Deploying the cluster locally

To deploy a local cluster based on the configuration from etcd:

Go to the system terminal you used when setting up Tarantool.

Create a new tt environment in a directory of your choice:

$ mkdir cluster-env
$ cd cluster-env/
$ tt init

Inside the instances.enabled directory of the created tt environment, create the cluster directory.
```
$ mkdir instances.enabled/cluster
$ cd instances.enabled/cluster/
```
Inside instances.enabled/cluster, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment. In this example, there are three instances:
```
instance-001:
instance-002:
instance-003:
```
- config.yaml instructs tt to load the cluster configuration from etcd. The specified etcd location matches the configuration storage of the Default cluster in TCM:
```
config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /default
```

Start the cluster from the tt environment root (the cluster-env directory):

$ tt start cluster

To check how the cluster started, run tt status. The output should look like this:

$ tt status cluster
INSTANCE              STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
cluster:instance-001  RUNNING  8747  RW    ready   running  --
cluster:instance-002  RUNNING  8748  RO    ready   running  --
cluster:instance-003  RUNNING  8749  RO    ready   running  --

Managing the cluster in TCM

To learn to interact with a cluster in TCM, complete typical database tasks such as:

Checking the cluster state.
Creating a space.
Writing data.
Viewing data.

Checking cluster state

To check the cluster state in TCM, go to Stateboard. Here you see the overview of the cluster topology, health, memory consumption, and other information.

Connecting to an instance

To view detailed information about an instance, click its name in the instances list on the Stateboard page.

To connect to the instance interactively and execute code on it, go to the Terminal tab.

Creating a space

Go to the terminal of instance-001 (the leader instance) and run the following code to create a formatted space with a primary index in the cluster:

box.schema.space.create('bands')
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})
box.space.bands:create_index('primary', { type = "tree", parts = { 'id' } })

Writing data

Since instance-001 is a read-write instance (its box.info.ro is false), the write requests must be executed on it. Run the following code in the instance-001 terminal to write tuples in the space:

box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

Reading data

Check the space’s tuples by running a read request on instance-001:

box.space.bands:select { 3 }

This is how it looks in TCM:

Checking replication

To check that the data is replicated across instances, run the read request on any other instance – instance-002 or instance-003. The result is the same as on instance-001.

Note

If you try to execute a write request on any instance but instance-001, you get an error because these instances are configured to be read-only.

Viewing data in TCM

TCM web UI includes a tool for viewing data stored in the cluster. To view the space tuples in TCM:

Click an instance name on the Stateboard page.
Open the Actions menu in the top-right corner and click Explorer.

This opens the page that lists user-created spaces on the instance.
Click View in the Actions menu of the space you want to see. The page shows all the tuples added previously.

Platform

This section contains documentation for the Tarantool platform consisting of a database and an application server.

Concepts

Storage engines

A storage engine is a set of low-level routines that store and retrieve values. Tarantool offers a choice of two storage engines:

memtx is the in-memory storage engine used by default.
vinyl is the on-disk storage engine.

For details, check the Storage engines section.

Data model

Tarantool is a NoSQL database. It stores data in spaces, which can be thought of as tables in a relational database, and tuples, which are analogous to rows. There are six basic data operations in Tarantool.

The platform allows describing the data schema but does not require it.

Tarantool supports highly customizable indexes of various types.

To ensure data persistence and recover quickly in case of failure, Tarantool uses mechanisms like the write-ahead log (WAL) and snapshots.

For details, check the Data model page.

Fibers and cooperative multitasking

Tarantool executes code in fibers that are managed via cooperative multitasking. Learn more about Tarantool’s thread model.

For details, check the page Fibers, yields, and cooperative multitasking.

Transactions

Tarantool’s ACID-compliant transaction model lets the user choose between two modes of transactions.

The default mode allows for fast monopolistic atomic transactions. It doesn’t support interactive transactions, and in case of an error, all transaction changes are rolled back.

The MVCC mode relies on a multi-version concurrency control engine that allows yielding within a longer transaction. This mode only works with the default in-memory memtx storage engine.

For details, check the Transactions page.

Replication

Replication allows keeping the data in copies of the same database for better reliability.

Several Tarantool instances can be organized in a replica set. They communicate and transfer data via the iproto binary protocol. Learn more about Tarantool’s replication architecture.

By default, replication in Tarantool is asynchronous. A transaction committed locally on the master node may not get replicated onto other instances before the client receives a success response. Thus, if the master reports success and then dies, the client might not see the result of the transaction.

With synchronous replication, transactions on the master node are not considered committed or successful before they are replicated onto a number of instances. This is slower, but more reliable. Synchronous replication in Tarantool is based on an implementation of the RAFT algorithm.

For details, check the Replication section.

Sharding

Tarantool implements database sharding via the vshard module. For details, go to the Sharding page.

Triggers

Tarantool allows specifying callback functions that run upon certain database events. They can be useful for resolving replication conflicts. For details, go to the Triggers page.

Application server

Using Tarantool as an application server, you can write applications in Lua, C, or C++. You can also create reusable modules.

To increase the speed of code execution, Tarantool has a Lua Just-In-Time compiler (LuaJIT) on board. LuaJIT compiles hot paths in the code – paths that are used many times – thus making the application work faster. To enable developers to work with LuaJIT, Tarantool provides tools like the memory profiler and the getmetrics module.

To learn how to use Tarantool as an application server, refer to the guides in the How-to section.

Storage engines

A storage engine is a set of low-level routines which actually store and retrieve tuple values. Tarantool offers a choice of two storage engines:

memtx is the in-memory storage engine used by default.
vinyl is the on-disk storage engine.

You can find all the details on the engines in the dedicated sections:

Storing data with memtx

The memtx storage engine is used in Tarantool by default. The engine keeps all data in random-access memory (RAM), and therefore has a low read latency.

Tarantool prevents data loss in emergencies, such as an outage or a Tarantool instance failure, in the following ways:

Tarantool persists all data changes by writing requests to the write-ahead log (WAL) that is stored on disk. Also, Tarantool periodically takes the entire database snapshot and saves it on disk. Learn more: Data persistence.
In a distributed application, synchronous replication is used to keep the data consistent on a quorum of replicas. Although replication is not directly a storage engine topic, it is a part of the answer regarding data safety. Learn more: Replicating data.

This section briefly covers the following topics and refers to other sections that explain them in detail.

Memory model
Data persistence
Accessing data
Replicating data
Summary

Memory model

There is a fixed number of independent execution threads. The threads don’t share state. Instead they exchange data using low-overhead message queues. While this approach limits the number of cores that the instance uses, it removes competition for the memory bus and ensures peak scalability of memory access and network throughput.

Only one thread, namely, the transaction processor thread (hereafter, the TX thread) can access the database, and there is only one TX thread for each Tarantool instance. In this thread, transactions are executed in a strictly consecutive order. Multi-statement transactions exist to provide isolation: each transaction sees a consistent database state and commits all its changes atomically. At commit time, a yield happens and all transaction changes are written to WAL in a single batch. In case of errors during transaction execution, a transaction is rolled back completely. Read more in the following sections: Transaction model, Transaction mode: MVCC.

Within the TX thread, there is a memory area allocated for Tarantool to store data. It’s called Arena.

Data is stored in spaces. Spaces contain database records – tuples. To access and manipulate the data stored in spaces and tuples, Tarantool builds indexes.

Special allocators manage memory allocations for spaces, tuples, and indexes within the Arena. The slab allocator is the main allocator used to store tuples. Tarantool has a built-in module called box.slab which provides the slab allocator statistics that can be used to monitor the total memory usage and memory fragmentation. For more details, see the box.slab module reference.

Also inside the TX thread, there is an event loop. Within the event loop, there are a number of fibers. Fibers are cooperative primitives that allow interaction with spaces, that is, reading and writing the data. Fibers can interact with the event loop and between each other directly or by using special primitives called channels. Due to the usage of fibers and cooperative multitasking, the memtx engine is lock-free in typical situations.

To interact with external users, there is a separate network thread also called the iproto thread. The iproto thread receives a request from the network, parses and checks the statement, and transforms it into a special structure—a message containing an executable statement and its options. Then the iproto thread ships this message to the TX thread and runs the user’s request in a separate fiber.
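The division of labor between the iproto thread and the TX thread can be sketched as a toy model. This is only an illustration in Python, not Tarantool code: the request format, the `run_instance` function, and the dictionary standing in for the Arena are all assumptions made for the example.

```python
import queue
import threading

def run_instance(raw_requests):
    """Parse requests in a network thread, apply them in a single TX thread."""
    inbox = queue.Queue()          # message queue between the two threads
    space = {}                     # stands in for data kept in the Arena
    results = []

    def network_thread():
        # Parse each raw request and ship it to the TX thread as a message.
        for raw in raw_requests:
            op, key, *rest = raw.split()
            inbox.put((op, int(key), rest[0] if rest else None))
        inbox.put(None)            # sentinel: no more requests

    def tx_thread():
        # Only this thread touches `space`, so no locks are needed.
        while (msg := inbox.get()) is not None:
            op, key, value = msg
            if op == "replace":
                space[key] = value
                results.append("ok")
            elif op == "select":
                results.append(space.get(key))

    threads = [threading.Thread(target=network_thread),
               threading.Thread(target=tx_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return space, results

space, results = run_instance(["replace 1 Roxette", "select 1", "select 2"])
```

Because requests reach the data through one queue and one consumer, they are applied in arrival order, which mirrors the strictly consecutive execution described above.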

Data persistence

Tarantool ensures data persistence as follows:

After executing data change requests in memory, Tarantool writes each such request to the write-ahead log (WAL) files (.xlog) that are stored on disk. Tarantool does this via a separate thread called the WAL thread.

Tarantool periodically takes a snapshot of the entire database and saves it on disk. This speeds up the instance’s restart: when there are too many WAL files, it can be difficult for Tarantool to restart quickly.

To save a snapshot, there is a special fiber called the snapshot daemon. It reads the consistent content of the entire Arena and writes it on disk into a snapshot file (.snap). Because of cooperative multitasking, Tarantool cannot write directly to disk, since that is a locking operation. That is why Tarantool interacts with disk via a separate pool of threads from the fio library.

So, even in emergency situations such as an outage or a Tarantool instance failure, when the in-memory database is lost, the data can be restored fully during Tarantool restart.

What happens during the restart:

Tarantool finds the latest snapshot file and reads it.
Tarantool finds all the WAL files created after that snapshot and reads them as well.
When the snapshot and WAL files have been read, there is a fully recovered in-memory data set corresponding to the state when the Tarantool instance stopped.
While reading the snapshot and WAL files, Tarantool builds the primary indexes.
When all the data is in memory again, Tarantool builds the secondary indexes.
Tarantool runs the application.
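The restart sequence above can be sketched in a few lines of Python. The data layout here (a snapshot as a dictionary with an `lsn` field, WAL entries as tuples) is an assumption chosen for the illustration and does not reflect the real .snap or .xlog formats.

```python
def recover(snapshot, wal):
    """Rebuild the in-memory data set from a snapshot plus newer WAL entries.

    snapshot: {"lsn": int, "tuples": {primary_key: tuple}}
    wal:      list of (lsn, op, key, tuple_or_None), ordered by lsn
    Returns (data, secondary_index_by_name, last_applied_lsn).
    """
    data = dict(snapshot["tuples"])        # load the latest snapshot
    last_lsn = snapshot["lsn"]
    for lsn, op, key, tup in wal:          # replay WAL entries in order
        if lsn <= last_lsn:
            continue                       # already contained in the snapshot
        if op == "replace":
            data[key] = tup
        elif op == "delete":
            data.pop(key, None)
        last_lsn = lsn
    # A secondary index is built once all the data is in memory again.
    by_name = {tup[1]: key for key, tup in data.items()}
    return data, by_name, last_lsn

snapshot = {"lsn": 2, "tuples": {1: (1, "Roxette", 1986),
                                 2: (2, "Scorpions", 1965)}}
wal = [
    (2, "replace", 2, (2, "Scorpions", 1965)),   # older than the snapshot
    (3, "replace", 3, (3, "Ace of Base", 1987)),
    (4, "delete", 1, None),
]
data, by_name, last_lsn = recover(snapshot, wal)
```

In this sketch the primary index is simply the dictionary keyed by the first field, so it is populated while the files are read, and the secondary index is derived afterwards.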

Accessing data

To access and manipulate the data stored in memory, Tarantool builds indexes. Indexes are also stored in memory within the Arena.

Tarantool supports a number of index types intended for different usage scenarios. The possible types are TREE, HASH, BITSET, and RTREE.

Select queries are possible against secondary index keys as well as primary keys. Indexes can have multi-part keys.

For detailed information about indexes, refer to the Indexes page.

Replicating data

Although this topic is not directly related to the memtx engine, it completes the overall picture of how Tarantool works in case of a distributed application.

Replication allows multiple Tarantool instances to work on copies of the same database. The copies are kept in sync because each instance can communicate its changes to all the other instances. It is implemented via WAL replication.

To send data to a replica, Tarantool runs another thread called relay. Its purpose is to read the WAL files and send them to replicas. On a replica, the fiber called applier is run. It receives the changes from a remote node and applies them to the replica’s Arena. All the changes are written to WAL files via the replica’s WAL thread as if they were made locally.

By default, replication in Tarantool is asynchronous: if a transaction is committed locally on a master node, it does not mean it is replicated onto any replicas.

Synchronous replication exists to solve this problem. Synchronous transactions are not considered committed, and the client does not receive a response, until they are replicated onto some number of replicas.
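The difference between the two modes comes down to when the master may report success. The following Python sketch illustrates that decision only; the `commit` function and the convention that the master counts toward the quorum are assumptions of this example, not a description of Tarantool's internals.

```python
def commit(replica_acks, quorum=None):
    """Return True if the master may report success to the client.

    replica_acks: list of booleans, one per replica, True if the replica
                  has confirmed receiving the transaction.
    quorum:       None for asynchronous replication; otherwise the number
                  of instances that must hold the change (in this sketch
                  the master itself counts as one of them).
    """
    if quorum is None:
        return True                      # async: the local commit is enough
    confirmed = 1 + sum(replica_acks)    # the master plus confirming replicas
    return confirmed >= quorum

async_ok = commit([False, False])                 # no replica has the data yet
sync_waiting = commit([False, False], quorum=2)   # must wait for one replica
sync_ok = commit([True, False], quorum=2)         # one replica confirmed
```

In the asynchronous case the client can see success while no replica holds the change, which is exactly the window the synchronous mode closes.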

For more information on replication, refer to the corresponding chapter.

Summary

The key points describing how the in-memory storage engine works can be summarized as follows:

All data is in RAM.
Access to data is from one thread.
Tarantool writes all data change requests in WAL.
Data snapshots are taken periodically.
Indexes are built to access the data.
WAL can be replicated.

Storing data with vinyl

Tarantool is a transactional and persistent DBMS that maintains 100% of its data in RAM. The greatest advantages of in-memory databases are their speed and ease of use: they demonstrate consistently high performance, but you never need to tune them.

A few years ago we decided to extend the product by implementing a classical storage engine similar to those used by regular DBMSs: it uses RAM for caching, while the bulk of its data is stored on disk. We decided to make it possible to set a storage engine independently for each table in the database, which is the same way that MySQL approaches it, but we also wanted to support transactions from the very beginning.

The first question we needed to answer was whether to create our own storage engine or use an existing library. The open-source community offered a few viable solutions. The RocksDB library was the fastest growing open-source library and is currently one of the most prominent out there. There were also several lesser-known libraries to consider, such as WiredTiger, ForestDB, NestDB, and LMDB.

Nevertheless, after studying the source code of existing libraries and considering the pros and cons, we opted for our own storage engine. One reason is that the existing third-party libraries expected requests to come from multiple operating system threads and thus contained complex synchronization primitives for controlling parallel data access. If we had decided to embed one of these in Tarantool, we would have made our users bear the overhead of a multithreaded application without getting anything in return. The thing is, Tarantool has an actor-based architecture. The way it processes transactions in a dedicated thread allows it to do away with the unnecessary locks, interprocess communication, and other overhead that accounts for up to 80% of processor time in multithreaded DBMSs.

The Tarantool process consists of a fixed number of “actor” threads

If you design a database engine with cooperative multitasking in mind right from the start, it not only significantly speeds up the development process, but also allows the implementation of certain optimization tricks that would be too complex for multithreaded engines. In short, using a third-party solution wouldn’t have yielded the best result.

Algorithm

Once the idea of using an existing library was off the table, we needed to pick an architecture to build upon. There are two competing approaches to on-disk data storage: the older one relies on B-trees and their variations; the newer one advocates the use of log-structured merge-trees, or “LSM” trees. MySQL, PostgreSQL, and Oracle use B-trees, while Cassandra, MongoDB, and CockroachDB have adopted LSM trees.

B-trees are considered better suited for reads, and LSM trees for writes. However, with SSDs becoming more widespread, and given that SSD read throughput is several times greater than write throughput, the advantages of LSM trees in most scenarios were more obvious to us.

Before dissecting LSM trees in Tarantool, let’s take a look at how they work. To do that, we’ll begin by analyzing a regular B-tree and the issues it faces. A B-tree is a balanced tree made up of blocks, which contain sorted lists of key-value pairs. (Topics such as filling and balancing a B-tree or splitting and merging blocks are outside of the scope of this article and can easily be found on Wikipedia). As a result, we get a container sorted by key, where the smallest element is stored in the leftmost node and the largest one in the rightmost node. Let’s have a look at how insertions and searches in a B-tree happen.

Classical B-tree

If you need to find an element or check its membership, the search starts at the root, as usual. If the key is found in the root block, the search stops; otherwise, the search visits the rightmost block holding the largest element that’s not larger than the key being searched (recall that elements at each level are sorted). If the first level yields no results, the search proceeds to the next level. Finally, the search ends up in one of the leaves and probably locates the needed key. Blocks are stored and read into RAM one by one, meaning the algorithm reads $\log_B(N)$ blocks in a single search, where N is the number of elements in the B-tree and B is the number of elements per block. In the simplest case, writes are done similarly: the algorithm finds the block that holds the necessary element and updates (inserts) its value.

To better understand the data structure, let’s consider a practical example: say we have a B-tree with 100,000,000 elements, a block size of 4096 bytes, and an element size of 100 bytes. Thus each block will hold up to 40 elements (all overhead considered), and the B-tree will consist of around 2,570,000 blocks and 5 levels: the first four will have a size of 256 Mb, while the last one will grow up to 10 Gb. Obviously, any modern computer will be able to store all of the levels except the last one in filesystem cache, so read requests will require just a single I/O operation.
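The figures in this example can be reproduced with a short calculation. The sketch below is in Python and assumes every block is completely full, which is why its results are close to, but not exactly, the rounded numbers quoted in the text; the function name `btree_levels` is made up for the illustration.

```python
import math

ELEMENTS = 100_000_000
BLOCK_SIZE = 4096          # bytes
PER_BLOCK = 40             # elements per block, overhead considered

def btree_levels(elements, per_block):
    """Return block counts per level, leaves first, root last."""
    levels = [math.ceil(elements / per_block)]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / per_block))
    return levels

levels = btree_levels(ELEMENTS, PER_BLOCK)
leaf_bytes = levels[0] * BLOCK_SIZE
upper_bytes = sum(levels[1:]) * BLOCK_SIZE

print("levels:", len(levels))
print("blocks:", sum(levels))
print("upper levels, MiB:", round(upper_bytes / 2**20))
print("leaf level, GB:", round(leaf_bytes / 1e9, 2))
```

With fully packed blocks this gives 5 levels and about 2.56 million blocks, with roughly 250 MiB above the leaf level and about 10 GB in the leaves.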

But if we change our perspective, B-trees don’t look so good anymore. Suppose we need to update a single element. Since working with B-trees involves reading and writing whole blocks, we would have to read in one whole block, change our 100 bytes out of 4096, and then write the whole updated block to disk. In other words, we would be forced to write 40 times more data than we actually modified!

If you take into account the fact that an SSD block has a size of 64 Kb+ and not every modification changes a whole element, the extra disk workload can be greater still.

Authors of specialized literature and blogs dedicated to on-disk data storage have coined two terms for these phenomena: extra reads are referred to as “read amplification” and writes as “write amplification”.

The amplification factor (multiplication coefficient) is calculated as the ratio of the size of actual read (or written) data to the size of data needed (or actually changed). In our B-tree example, the amplification factor would be around 40 for both reads and writes.
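As a quick check of the definition, the ratio can be computed directly. This Python snippet uses the sizes from the B-tree example; the 64 KiB device block is taken as an exact figure here only for the sake of the arithmetic.

```python
def amplification(bytes_transferred, bytes_needed):
    """Ratio of data actually read or written to data needed or changed."""
    return bytes_transferred / bytes_needed

# Updating one 100-byte element inside a 4096-byte B-tree block.
btree_factor = amplification(4096, 100)

# The same update when the device works in whole 64 KiB blocks.
ssd_factor = amplification(64 * 1024, 100)

print(round(btree_factor, 2), round(ssd_factor, 2))
```

The first ratio is what the text rounds to 40; the second shows how much larger the factor becomes once the device block size is taken into account.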

The huge number of extra I/O operations associated with updating data is one of the main issues addressed by LSM trees. Let’s see how they work.

The key difference between LSM trees and regular B-trees is that LSM trees don’t just store data (keys and values), but also data operations: insertions and deletions.

LSM tree:

Stores statements, not values:
- REPLACE
- DELETE
- UPSERT
Every statement is marked by LSN
Append-only files, garbage is collected after a checkpoint
Transactional log of all filesystem changes: vylog

For example, an element corresponding to an insertion operation has, apart from a key and a value, an extra byte with an operation code (“REPLACE” in the image above). An element representing the deletion operation contains a key (since storing a value is unnecessary) and the corresponding operation code—“DELETE”. Also, each LSM tree element has a log sequence number (LSN), which is the value of a monotonically increasing sequence that uniquely identifies each operation. The whole tree is first ordered by key in ascending order, and then, within a single key scope, by LSN in descending order.
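The ordering rule at the end of this paragraph can be illustrated with a small Python sketch. The `Statement` class and the sample data are invented for the example and do not mirror vinyl's on-disk format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Statement:
    key: int
    lsn: int
    op: str                      # "REPLACE" or "DELETE"
    value: Optional[str] = None  # DELETE statements carry no value

def sort_key(stmt):
    # Ascending by key; within one key, descending by LSN (newest first).
    return (stmt.key, -stmt.lsn)

statements = [
    Statement(7, 101, "REPLACE", "a"),
    Statement(3, 102, "REPLACE", "b"),
    Statement(7, 103, "DELETE"),
    Statement(3, 104, "REPLACE", "c"),
]
level = sorted(statements, key=sort_key)
order = [(s.key, s.lsn, s.op) for s in level]
```

With this ordering, the first statement encountered for any key is always the most recent one, which is what makes lookups and merges straightforward.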

A single level of an LSM tree

Filling an LSM tree

Unlike a B-tree, which is stored completely on disk and can be partly cached in RAM, when using an LSM tree, memory is explicitly separated from disk right from the start. The issue of volatile memory and data persistence is beyond the scope of the storage algorithm and can be solved in various ways—for example, by logging changes.

The part of an LSM tree that’s stored in RAM is called L0 (level zero). The size of RAM is limited, so L0 is allocated a fixed amount of memory. For example, in Tarantool, the L0 size is controlled by the vinyl_memory parameter. Initially, when an LSM tree is empty, operations are written to L0. Recall that all elements are ordered by key in ascending order, and then within a single key scope, by LSN in descending order, so when a new value associated with a given key gets inserted, it’s easy to locate the older value and delete it. L0 can be structured as any container capable of storing a sorted sequence of elements. For example, in Tarantool, L0 is implemented as a B+*-tree. Lookups and insertions are standard operations for the data structure underlying L0, so we won’t dwell on those.

Sooner or later the number of elements in an LSM tree exceeds the L0 size and that’s when L0 gets written to a file on disk (called a “run”) and then cleared for storing new elements. This operation is called a “dump”.

Dumps on disk form a sequence ordered by LSN: LSN ranges in different runs don’t overlap, and the leftmost runs (at the head of the sequence) hold newer operations. Think of these runs as a pyramid, with the newest ones closer to the top. As runs keep getting dumped, the pyramid grows higher. Note that newer runs may contain deletions or replacements for existing keys. To remove older data, it’s necessary to perform garbage collection (this process is sometimes called “merge” or “compaction”) by combining several older runs into a new one. If two versions of the same key are encountered during a compaction, only the newer one is retained; however, if a key insertion is followed by a deletion, then both operations can be discarded.
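A compaction of several runs can be sketched as a merge that keeps the newest statement per key. The Python code below is a simplified illustration: the `compact` function and its `drop_tombstones` flag are assumptions of the example, and the flag is meant to be set only when the oldest run takes part, so that no older version of a deleted key can remain on disk.

```python
def compact(runs, drop_tombstones=False):
    """Merge runs into one, keeping only the newest statement per key.

    Each run is a list of (key, lsn, op, value) tuples.
    """
    newest = {}
    for run in runs:
        for key, lsn, op, value in run:
            if key not in newest or lsn > newest[key][1]:
                newest[key] = (key, lsn, op, value)
    merged = sorted(newest.values())          # keys are unique: order by key
    if drop_tombstones:
        # A deletion and the insertion it cancels are both discarded.
        merged = [s for s in merged if s[2] != "DELETE"]
    return merged

older = [(1, 10, "REPLACE", "a"), (2, 11, "REPLACE", "b")]
newer = [(2, 12, "DELETE", None), (3, 13, "REPLACE", "c")]

partial = compact([newer, older])
full = compact([newer, older], drop_tombstones=True)
```

In `partial` the deletion of key 2 survives as a marker; in `full` both the insertion and the deletion of key 2 are gone.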

The key choices determining an LSM tree’s efficiency are which runs to compact and when to compact them. Suppose an LSM tree stores a monotonically increasing sequence of keys (1, 2, 3, …) with no deletions. In this case, compacting runs would be useless: all of the elements are sorted, the tree doesn’t have any garbage, and the location of any key can unequivocally be determined. On the other hand, if an LSM tree contains many deletions, doing a compaction would free up some disk space. However, even if there are no deletions, but key ranges in different runs overlap a lot, compacting such runs could speed up lookups as there would be fewer runs to scan. In this case, it might make sense to compact runs after each dump. But keep in mind that a compaction causes all data stored on disk to be overwritten, so with few reads it’s recommended to perform it less often.

To ensure it’s optimally configurable for any of the scenarios above, an LSM tree organizes all runs into a pyramid: the newer the data operations, the higher up the pyramid they are located. During a compaction, the algorithm picks two or more neighboring runs of approximately equal size, if possible.

Multi-level compaction can span any number of levels
A level can contain multiple runs

All of the neighboring runs of approximately equal size constitute an LSM tree level on disk. The ratio of run sizes at different levels determines the pyramid’s proportions, which allows optimizing the tree for write-intensive or read-intensive scenarios.

Suppose the L0 size is 100 Mb, the ratio of run sizes at each level (the vinyl_run_size_ratio parameter) is 5, and there can be no more than 2 runs per level (the vinyl_run_count_per_level parameter). After the first 3 dumps, the disk will contain 3 runs of 100 Mb each—which constitute L1 (level one). Since 3 > 2, the runs will be compacted into a single 300 Mb run, with the older ones being deleted. After 2 more dumps, there will be another compaction, this time of 2 runs of 100 Mb each and the 300 Mb run, which will produce one 500 Mb run. It will be moved to L2 (recall that the run size ratio is 5), leaving L1 empty. The next 10 dumps will result in L2 having 3 runs of 500 Mb each, which will be compacted into a single 1500 Mb run. Over the course of 10 more dumps, the following will happen: 3 runs of 100 Mb each will be compacted twice, as will two 100 Mb runs and one 300 Mb run, which will yield 2 new 500 Mb runs in L2. Since L2 now has 3 runs, they will also be compacted: two 500 Mb runs and one 1500 Mb run will produce a 2500 Mb run that will be moved to L3, given its size.
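The sequence of dumps and compactions in this example can be replayed with a small simulation. The Python sketch below is a simplified model, not vinyl's scheduler: it assumes that a run belongs to level k while its size is below 100 × 5^k Mb, that a level is compacted as a whole once it holds more than 2 runs, and that no data is removed by compaction.

```python
L0_SIZE = 100        # Mb written by each dump
SIZE_RATIO = 5       # run size ratio between neighboring levels
MAX_RUNS = 2         # runs allowed per level before compaction

def level_of(run_size):
    """Level 1 holds runs below 500 Mb, level 2 below 2500 Mb, and so on."""
    level, limit = 1, L0_SIZE * SIZE_RATIO
    while run_size >= limit:
        level, limit = level + 1, limit * SIZE_RATIO
    return level

def simulate(dumps):
    levels = {}                                   # level number -> run sizes
    log = []                                      # (dump number, new run size)
    for n in range(1, dumps + 1):
        levels.setdefault(1, []).append(L0_SIZE)  # dump L0 as a new run
        level = 1
        while len(levels.get(level, [])) > MAX_RUNS:
            merged = sum(levels[level])           # compact the whole level
            levels[level] = []
            target = level_of(merged)
            levels.setdefault(target, []).append(merged)
            log.append((n, merged))
            level = target                        # re-check where it landed
    return {k: v for k, v in levels.items() if v}, log

state, log = simulate(25)
```

Under these assumptions, the model passes through the same states as the text: a 300 Mb run after 3 dumps, a 500 Mb run in L2 after 5, a 1500 Mb run after 15, and a single 2500 Mb run in L3 after 25.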

This can go on infinitely, but if an LSM tree contains lots of deletions, the resulting compacted run can be moved not only down, but also up the pyramid due to its size being smaller than the sizes of the original runs that were compacted. In other words, it’s enough to logically track which level a certain run belongs to, based on the run size and the smallest and greatest LSN among all of its operations.

Controlling the form of an LSM tree

If it’s necessary to reduce the number of runs for lookups, then the run size ratio can be increased, thus bringing the number of levels down. If, on the other hand, you need to minimize the compaction-related overhead, then the run size ratio can be decreased: the pyramid will grow higher, and even though runs will be compacted more often, they will be smaller, which will reduce the total amount of work done. In general, write amplification in an LSM tree is described by this formula: $x \cdot \log_{x}\left(\frac{N}{L_0}\right)$ or, alternatively, $x \cdot \frac{\ln\left(\frac{N}{L_0}\right)}{\ln(x)}$, where $N$ is the total size of all tree elements, $L_0$ is the level zero size, and $x$ is the level size ratio (the level_size_ratio parameter). At $\frac{N}{L_0} = 40$ (the disk-to-memory ratio), the plot would look something like this:
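To get a feel for the formula, it can be evaluated for a few values of the level size ratio. The Python sketch below does this for a disk-to-memory ratio of 40; the function name and the chosen set of ratios are assumptions of the example.

```python
import math

def write_amplification(disk_to_memory, x):
    """x * log_x(N / L0), with N / L0 passed as disk_to_memory."""
    return x * math.log(disk_to_memory) / math.log(x)

ratios = [2, 3, 4, 5, 10]
table = {x: round(write_amplification(40, x), 2) for x in ratios}
best = min(ratios, key=lambda x: write_amplification(40, x))

for x, value in table.items():
    print(x, value)
```

Among these ratios the smallest value is reached at x = 3. Treating x as a real number, the expression x / ln(x) is minimized at x = e, so the curve is fairly flat between 2 and 4 and grows steadily beyond that.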

As for read amplification, it’s proportional to the number of levels. The lookup cost at each level is no greater than that for a B-tree. Getting back to the example of a tree with 100,000,000 elements: given 256 Mb of RAM and the default values of vinyl_run_size_ratio and vinyl_run_count_per_level, write amplification would come out to about 13, while read amplification could be as high as 150. Let’s try to figure out why this happens.

Search

When doing a lookup in an LSM tree, what we need to find is not the element itself, but the most recent operation associated with it. If it’s a deletion, then the tree doesn’t contain this element. If it’s an insertion, we need to grab the topmost value in the pyramid, and the search can be stopped after finding the first matching key. In the worst-case scenario, that is if the tree doesn’t hold the needed element, the algorithm will have to sequentially visit all of the levels, starting from L0.
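The lookup procedure can be sketched as a walk over the levels from newest to oldest. In this Python illustration each level is reduced to a dictionary holding the newest statement per key; the `lookup` function and the sample data are assumptions of the example.

```python
def lookup(key, levels):
    """Return (value_or_None, levels_visited) for the given key.

    levels: list of dicts ordered from newest (L0) to oldest; each dict
            maps key -> (op, value) for the newest statement it holds.
    """
    visited = 0
    for level in levels:
        visited += 1
        if key in level:
            op, value = level[key]
            # The first match is the most recent operation: stop here.
            return (None if op == "DELETE" else value), visited
    return None, visited      # worst case: every level was visited

levels = [
    {5: ("DELETE", None)},                          # L0
    {5: ("REPLACE", "old"), 8: ("REPLACE", "x")},   # newer run
    {9: ("REPLACE", "y")},                          # oldest run
]
deleted = lookup(5, levels)
found = lookup(8, levels)
missing = lookup(42, levels)
```

The lookup for key 42 is the expensive case: it touches every level before concluding that the key is absent.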

Unfortunately, this scenario is quite common in real life. For example, when inserting a value into a tree, it’s necessary to make sure there are no duplicates among primary/unique keys. So to speed up membership checks, LSM trees use a probabilistic data structure called a “Bloom filter”, which will be covered a bit later, in a section on how vinyl works under the hood.

Range searching

In the case of a single-key search, the algorithm stops after encountering the first match. However, when searching within a certain key range (for example, looking for all the users with the last name “Ivanov”), it’s necessary to scan all tree levels.

Searching within a range of [24,30)

The required range is formed the same way as when compacting several runs: the algorithm picks the key with the largest LSN out of all the sources, ignoring the other associated operations, then moves on to the next key and repeats the procedure.
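As a rough illustration of this procedure, here is a Python sketch. It is a simplification under stated assumptions: each source is a dictionary rather than a sorted file, a value of None marks a deletion, and the range_search name is hypothetical. A real implementation would merge sorted iterators instead of building a dictionary.

```python
from typing import Dict, List, Optional, Tuple

# A source (L0 or a run) maps key -> (lsn, value); value None marks a deletion.
Source = Dict[int, Tuple[int, Optional[str]]]


def range_search(sources: List[Source], low: int, high: int) -> List[Tuple[int, str]]:
    """Return live (key, value) pairs with low <= key < high, in key order."""
    newest: Dict[int, Tuple[int, Optional[str]]] = {}
    for source in sources:  # every source has to be scanned
        for key, (lsn, value) in source.items():
            in_range = low <= key < high
            if in_range and (key not in newest or lsn > newest[key][0]):
                newest[key] = (lsn, value)  # keep only the largest LSN per key
    return [(k, v) for k, (_, v) in sorted(newest.items()) if v is not None]


sources = [
    {24: (9, None), 26: (8, "c")},
    {24: (3, "a"), 25: (4, "b"), 30: (5, "z")},
    {26: (1, "old"), 29: (2, "d")},
]
print(range_search(sources, 24, 30))
```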

Deletion

Why would one store deletions? And why doesn’t it lead to a tree overflow in the case of for i=1,10000000 put(i) delete(i) end?

With regard to lookups, deletions signal the absence of a value being searched; with compactions, they clear the tree of “garbage” records with older LSNs.

While the data is in RAM only, there’s no need to store deletions. Similarly, you don’t need to keep them following a compaction if they affect, among other things, the lowest tree level, which contains the oldest dump. Indeed, if a value can’t be found at the lowest level, then it doesn’t exist in the tree.

We can’t delete from append-only files
Tombstones (delete markers) are inserted into L0 instead

Deletion, step 1: a tombstone is inserted into L0

Deletion, step 2: the tombstone passes through intermediate levels

Deletion, step 3: in the case of a major compaction, the tombstone is removed from the tree
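The three steps can be modeled with a small Python sketch. The data layout (dictionaries keyed by string, None as the tombstone) and the compact function are assumptions for illustration only.

```python
from typing import Dict, List, Optional, Tuple

Run = Dict[str, Tuple[int, Optional[str]]]  # key -> (lsn, value); None = tombstone


def compact(runs: List[Run], includes_last_level: bool) -> Run:
    """Merge runs, keeping only the newest operation for each key."""
    merged: Run = {}
    for run in runs:
        for key, (lsn, value) in run.items():
            if key not in merged or lsn > merged[key][0]:
                merged[key] = (lsn, value)
    if includes_last_level:
        # Major compaction: nothing older can exist below, so tombstones can go.
        merged = {k: v for k, v in merged.items() if v[1] is not None}
    return merged


runs = [{"a": (5, None)}, {"a": (2, "x"), "b": (3, "y")}]
print(compact(runs, includes_last_level=False))
print(compact(runs, includes_last_level=True))
```

In an intermediate compaction the tombstone for "a" has to survive, because an older value of "a" may still live in a lower level; only a compaction that includes the last level can drop it.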

If a deletion is known to come right after the insertion of a unique value, which is often the case when modifying a value in a secondary index, then the deletion can safely be filtered out while compacting intermediate tree levels. This optimization is implemented in vinyl.

Advantages of an LSM tree

Apart from decreasing write amplification, the approach that involves periodically dumping level L0 and compacting levels L1-Lk has a few advantages over the approach to writes adopted by B-trees:

Dumps and compactions write relatively large files: typically, the L0 size is 50-100 Mb, which is thousands of times larger than the size of a B-tree block.
This large size allows efficiently compressing data before writing it. Tarantool compresses data automatically, which further decreases write amplification.
There is no fragmentation overhead, since there’s no padding/empty space between the elements inside a run.
All operations create new runs instead of modifying older data in place. This allows avoiding those nasty locks that everyone hates so much. Several operations can run in parallel without causing any conflicts. This also simplifies making backups and moving data to replicas.
Storing older versions of data allows for the efficient implementation of transaction support by using multiversion concurrency control.

Disadvantages of an LSM tree and how to deal with them

One of the key advantages of the B-tree as a search data structure is its predictability: all operations take no longer than $log_{B}(N)$ to run. Conversely, in a classical LSM tree, both read and write speeds can differ by a factor of hundreds (best case scenario) or even thousands (worst case scenario). For example, adding just one element to L0 can cause it to overflow, which can trigger a chain reaction in levels L1, L2, and so on. Lookups may find the needed element in L0 or may need to scan all of the tree levels. It’s also necessary to optimize reads within a single level to achieve speeds comparable to those of a B-tree. Fortunately, most disadvantages can be mitigated or even eliminated with additional algorithms and data structures. Let’s take a closer look at these disadvantages and how they’re dealt with in Tarantool.

Unpredictable write speed

In an LSM tree, insertions almost always affect L0 only. How do you avoid idle time when the memory area allocated for L0 is full?

Clearing L0 involves two lengthy operations: writing to disk and memory deallocation. To avoid idle time while L0 is being dumped, Tarantool uses writeaheads. Suppose the L0 size is 256 Mb and the disk write speed is 10 Mb per second. Then it would take about 26 seconds to dump L0. If the insertion speed is 10,000 RPS, with each key having a size of 100 bytes, new data arrives at about 1 Mb per second. So while L0 is being dumped, it’s necessary to reserve 26 Mb of RAM, effectively slicing the L0 size down to 230 Mb.
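The arithmetic in this example can be reproduced with a short Python sketch. It assumes decimal megabytes and rounds the dump time up to whole seconds; the dump_budget function is hypothetical and only mirrors the numbers in the text.

```python
import math
from typing import Tuple


def dump_budget(l0_mb: float, disk_mb_per_s: float, rps: int, key_bytes: int) -> Tuple[int, float, float]:
    """Return (dump_seconds, reserve_mb, usable_l0_mb) for one L0 dump."""
    dump_seconds = math.ceil(l0_mb / disk_mb_per_s)
    incoming_mb_per_s = rps * key_bytes / 1_000_000
    reserve_mb = dump_seconds * incoming_mb_per_s  # data arriving during the dump
    return dump_seconds, reserve_mb, l0_mb - reserve_mb


print(dump_budget(256, 10, 10_000, 100))  # (26, 26.0, 230.0)
```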

Tarantool does all of these calculations automatically, constantly updating the rolling average of the DBMS workload and the histogram of the disk speed. This allows using L0 as efficiently as possible and it prevents write requests from timing out. But in the case of workload surges, some wait time is still possible. That’s why we also introduced an insertion timeout (the vinyl_timeout parameter), which is set to 60 seconds by default. The write operation itself is executed in dedicated threads. The number of these threads (4 by default) is controlled by the vinyl_write_threads parameter. Having at least two write threads allows doing dumps and compactions in parallel, which is also necessary for ensuring system predictability.

In Tarantool, compactions are always performed independently of dumps, in a separate execution thread. This is made possible by the append-only nature of an LSM tree: after a dump, runs are never changed, and compactions simply create new runs.

Delays can also be caused by L0 rotation and the deallocation of memory dumped to disk: during a dump, L0 memory is owned by two operating system threads, a transaction processing thread and a write thread. Even though no elements are being added to the rotated L0, it can still be used for lookups. To avoid read locks when doing lookups, the write thread doesn’t deallocate the dumped memory, instead delegating this task to the transaction processor thread. Following a dump, memory deallocation itself happens instantaneously: to achieve this, L0 uses a special allocator that deallocates all of the memory with a single operation.

anticipatory dump
throttling

The dump is performed from the so-called “shadow” L0 without blocking new insertions and lookups

Unpredictable read speed

Optimizing reads is the most difficult optimization task with regard to LSM trees. The main complexity factor here is the number of levels: each additional level not only makes lookups much slower, but also tends to require significantly more RAM for any optimization. Fortunately, the append-only nature of LSM trees allows us to address these problems in ways that would be nontrivial for traditional data structures.

page index
bloom filters
tuple range cache
multi-level compaction

Compression and page index

In B-trees, data compression is either the hardest problem to crack or a great marketing tool—rather than something really useful. In LSM trees, compression works as follows:

During a dump or compaction all of the data within a single run is split into pages. The page size (in bytes) is controlled by the vinyl_page_size parameter and can be set separately for each index. A page doesn’t have to be exactly of vinyl_page_size size—depending on the data it holds, it can be a little bit smaller or larger. Because of this, pages never have any empty space inside.

Data is compressed by Facebook’s streaming algorithm called “zstd”. The first key of each page, along with the page offset, is added to a “page index”, which is a separate file that allows the quick retrieval of any page. After a dump or compaction, the page index of the created run is also written to disk.

All .index files are cached in RAM, which allows finding the necessary page with a single lookup in a .run file (in vinyl, this is the extension of files resulting from a dump or compaction). Since data within a page is sorted, after it’s read and decompressed, the needed key can be found using a regular binary search. Decompression and reads are handled by separate threads, and are controlled by the vinyl_read_threads parameter.
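A minimal Python sketch of this two-step search follows. It assumes integer keys and in-memory lists standing in for pages on disk; build_page_index and find are hypothetical names, and compression is left out.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Page = List[Tuple[int, str]]  # sorted (key, value) pairs


def build_page_index(pages: List[Page]) -> List[int]:
    """Collect the first key of each page; this small list stays in RAM."""
    return [page[0][0] for page in pages]


def find(pages: List[Page], page_index: List[int], key: int) -> Optional[str]:
    """Locate the one candidate page, then binary-search inside it."""
    page_no = bisect_right(page_index, key) - 1
    if page_no < 0:
        return None  # the key is smaller than every key in the run
    page = pages[page_no]  # stands in for one disk read plus decompression
    keys = [k for k, _ in page]
    pos = bisect_right(keys, key) - 1
    if pos >= 0 and keys[pos] == key:
        return page[pos][1]
    return None


pages = [[(1, "a"), (4, "b")], [(7, "c"), (9, "d")], [(12, "e")]]
page_index = build_page_index(pages)
print(find(pages, page_index, 9), find(pages, page_index, 5))
```

Only one page per run is ever touched: the index narrows the search to a single candidate page before any data is read.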

Tarantool uses a universal file format: for example, the format of a .run file is no different from that of an .xlog file (log file). This simplifies backup and recovery as well as the usage of external tools.

Bloom filters

Even though using a page index enables scanning fewer pages per run when doing a lookup, it’s still necessary to traverse all of the tree levels. There’s a special case where checking that particular data is absent, which means scanning all of the tree levels, is unavoidable: I’m talking about insertions into a unique index. If the data being inserted already exists, then inserting the same data into a unique index should lead to an error. The only way to throw an error in an LSM tree before a transaction is committed is to do a search before inserting the data. Such reads form a class of their own in the DBMS world and are called “hidden” or “parasitic” reads.

Another operation leading to hidden reads is updating a value in a field on which a secondary index is defined. Secondary indexes are regular LSM trees that store differently ordered data. In most cases, in order not to have to store all of the data in all of the indexes, a value associated with a given key is kept in whole only in the primary index (any index that stores both a key and a value is called “covering” or “clustered”), whereas the secondary index only stores the fields on which a secondary index is defined, and the values of the fields that are part of the primary index. Thus, each time a change is made to a value in a field on which a secondary index is defined, it’s necessary to first remove the old key from the secondary index, and only then can the new key be inserted. At update time, the old value is unknown, and it is this value that needs to be read from the primary index “under the hood”.

For example:

update t1 set city='Moscow' where id=1

To minimize the number of disk reads, especially for nonexistent data, nearly all LSM trees use probabilistic data structures, and Tarantool is no exception. A classical Bloom filter is made up of several (usually 3-to-5) bit arrays. When data is written, several hash functions are calculated for each key in order to get corresponding array positions. The bits at these positions are then set to 1. Due to possible hash collisions, some bits might be set to 1 twice. We’re most interested in the bits that remain 0 after all keys have been added. When looking for an element within a run, the same hash functions are applied to produce bit positions in the arrays. If any of the bits at these positions is 0, then the element is definitely not in the run. The probability of a false positive in a Bloom filter is calculated using Bayes’ theorem: each hash function is an independent random variable, so the probability of a collision simultaneously occurring in all of the bit arrays is infinitesimal.
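The structure described here, with one bit array per hash function, can be sketched in Python as follows. This is a generic illustration rather than vinyl’s implementation: the array count, the array size, and the use of salted SHA-256 as the hash family are all assumptions, and each bit is stored as a whole byte for readability.

```python
import hashlib
from typing import List


class BloomFilter:
    """Partitioned Bloom filter: one bit array per hash function."""

    def __init__(self, bits_per_array: int = 1024, arrays: int = 4) -> None:
        self.bits_per_array = bits_per_array
        self.arrays: List[bytearray] = [bytearray(bits_per_array) for _ in range(arrays)]

    def _positions(self, key: str) -> List[int]:
        # Derive one position per array by salting the hash with the array number.
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big")
            % self.bits_per_array
            for i in range(len(self.arrays))
        ]

    def add(self, key: str) -> None:
        for array, pos in zip(self.arrays, self._positions(key)):
            array[pos] = 1

    def might_contain(self, key: str) -> bool:
        # A single 0 proves absence; all 1s only means "possibly present".
        return all(array[pos] for array, pos in zip(self.arrays, self._positions(key)))
```

A run whose filter answers "no" can be skipped without touching the disk; a "yes" still has to be confirmed by reading the page.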

The key advantage of Bloom filters in Tarantool is that they’re easily configurable. The only parameter that can be specified separately for each index is called vinyl_bloom_fpr (FPR stands for “false positive ratio”) and it has the default value of 0.05, which translates to a 5% FPR. Based on this parameter, Tarantool automatically creates Bloom filters of the optimal size for partial-key and full-key searches. The Bloom filters are stored in the .index file, along with the page index, and are cached in RAM.
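To give a feel for what an optimal size means for a given FPR, the sketch below applies the textbook sizing formulas for a classical Bloom filter. Whether vinyl uses exactly these formulas is an assumption; the bloom_parameters function is hypothetical.

```python
import math
from typing import Tuple


def bloom_parameters(keys: int, fpr: float) -> Tuple[int, int]:
    """Return (total_bits, hash_count) from the textbook Bloom filter formulas."""
    if keys <= 0 or not 0 < fpr < 1:
        raise ValueError("keys must be positive and fpr must be in (0, 1)")
    # m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2
    total_bits = math.ceil(-keys * math.log(fpr) / math.log(2) ** 2)
    hash_count = max(1, round(total_bits / keys * math.log(2)))
    return total_bits, hash_count


bits, hashes = bloom_parameters(1_000_000, 0.05)
print(f"{bits / 8 / 1024:.0f} KiB, {hashes} hash functions")
```

At a 5% FPR this comes to a little over 6 bits per key, so a filter for a million keys fits in well under a megabyte of RAM.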

Caching

A lot of people think that caching is a silver bullet that can help with any performance issue. “When in doubt, add more cache”. In vinyl, caching is viewed rather as a means of reducing the overall workload and consequently, of getting a more stable response time for those requests that don’t hit the cache. vinyl boasts a unique type of cache among transactional systems called a “range tuple cache”. Unlike, say, RocksDB or MySQL, this cache doesn’t store pages, but rather ranges of index values obtained from disk, after having performed a compaction spanning all tree levels. This allows the use of caching for both single-key and key-range searches. Since this method of caching stores only hot data and not, say, pages (you may need only some data from a page), RAM is used in the most efficient way possible. The cache size is controlled by the vinyl_cache parameter.

Garbage collection control

Chances are that by now you’ve started losing focus and need a well-deserved dopamine reward. Feel free to take a break, since working through the rest of the article is going to take some serious mental effort.

An LSM tree in vinyl is just a small piece of the puzzle. Even with a single table (or so-called “space”), vinyl creates and maintains several LSM trees, one for each index. But even a single index can consist of dozens of LSM trees. Let’s try to understand why this might be necessary.

Recall our example with a tree containing 100,000,000 records, 100 bytes each. As time passes, the lowest LSM level may end up holding a 10 Gb run. During compaction, a temporary run of approximately the same size will be created. Data at intermediate levels takes up some space as well, since the tree may store several operations associated with a single key. In total, storing 10 Gb of actual data may require up to 30 Gb of free space: 10 Gb for the last tree level, 10 Gb for a temporary run, and 10 Gb for the remaining data. But what if the data size is not 10 Gb, but 1 Tb? Requiring that the available disk space always be several times greater than the actual data size is financially impractical, not to mention that it may take dozens of hours to create a 1 Tb run. And in the case of an emergency shutdown or system restart, the process would have to be started from scratch.

Here’s another scenario. Suppose the primary key is a monotonically increasing sequence—for example, a time series. In this case, most insertions will fall into the right part of the key range, so it wouldn’t make much sense to do a compaction just to append a few million more records to an already huge run.

But what if writes predominantly occur in a particular region of the key range, whereas most reads take place in a different region? How do you optimize the form of the LSM tree in this case? If it’s too high, read performance is impacted; if it’s too low—write speed is reduced.

Tarantool “factorizes” this problem by creating multiple LSM trees for each index. The approximate size of each subtree may be controlled by the vinyl_range_size configuration parameter. We call such subtrees “ranges”.

Factorizing large LSM trees via ranging

Ranges reflect a static layout of sorted runs
Slices connect a sorted run into a range

Initially, when the index has few elements, it consists of a single range. As more elements are added, its total size may exceed the maximum range size. In that case a special operation called “split” divides the tree into two equal parts. The tree is split at the middle element in the range of keys stored in the tree. For example, if the tree initially stores the full range of -inf…+inf, then after splitting it at the middle key X, we get two subtrees: one that stores the range of -inf…X, and the other storing the range of X…+inf. With this approach, we always know which subtree to use for writes and which one for reads. If the tree contained deletions and each of the neighboring ranges grew smaller as a result, the opposite operation called “coalesce” combines two neighboring trees into one.
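The routing and split logic can be sketched in Python. This toy model makes several assumptions: ranges hold plain sorted key lists, the size limit is a key count rather than a byte size, and the split copies keys for brevity. The RangedIndex class and its method names are hypothetical.

```python
from bisect import bisect_right
from typing import List


class RangedIndex:
    """Keys partitioned into ranges; boundaries[i] is the lower bound of range i + 1."""

    def __init__(self, max_range_len: int) -> None:
        self.max_range_len = max_range_len  # stand-in for a size limit in bytes
        self.boundaries: List[int] = []     # no boundaries: one range, -inf...+inf
        self.ranges: List[List[int]] = [[]]

    def range_for(self, key: int) -> int:
        """Every key belongs to exactly one range, found by binary search."""
        return bisect_right(self.boundaries, key)

    def insert(self, key: int) -> None:
        i = self.range_for(key)
        keys = self.ranges[i]
        if key not in keys:
            keys.append(key)
            keys.sort()
        if len(keys) > self.max_range_len:
            self._split(i)

    def _split(self, i: int) -> None:
        keys = self.ranges[i]
        middle = keys[len(keys) // 2]  # split at the middle key X
        self.ranges[i:i + 1] = [[k for k in keys if k < middle],
                                [k for k in keys if k >= middle]]
        self.boundaries.insert(i, middle)


index = RangedIndex(max_range_len=4)
for key in range(1, 8):
    index.insert(key)
print(index.boundaries, index.ranges)
```

Because the boundaries are sorted and non-overlapping, both reads and writes for a given key always go to the same single range.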

Split and coalesce don’t entail a compaction, the creation of new runs, or other resource-intensive operations. An LSM tree is just a collection of runs. vinyl has a special metadata log that helps keep track of which run belongs to which subtree(s). This log has the .vylog extension and its format is compatible with an .xlog file. Similarly to an .xlog file, the metadata log gets rotated at each checkpoint. To avoid the creation of extra runs with split and coalesce, we have also introduced an auxiliary entity called “slice”. It’s a reference to a run containing a key range and it’s stored only in the metadata log. Once the reference counter of a run drops to zero, the corresponding file gets removed. When it’s necessary to perform a split or to coalesce, Tarantool creates slice objects for each new tree, removes older slices, and writes these operations to the metadata log, which literally stores records that look like this: <tree id, slice id> or <slice id, run id, min, max>.

This way all of the heavy lifting associated with splitting a tree into two subtrees is postponed until a compaction and then is performed automatically. A huge advantage of dividing all of the keys into ranges is the ability to independently control the L0 size as well as the dump and compaction processes for each subtree, which makes these processes manageable and predictable. Having a separate metadata log also simplifies the implementation of both “truncate” and “drop”. In vinyl, they’re processed instantly, since they only work with the metadata log, while garbage collection is done in the background.

Advanced features of vinyl

Upsert

In the previous sections, we mentioned only two operations stored by an LSM tree: deletion and replacement. Let’s take a look at how all of the other operations can be represented. An insertion can be represented via a replacement—you just need to make sure there are no other elements with the specified key. To perform an update, it’s necessary to read the older value from the tree, so it’s easier to represent this operation as a replacement as well—this speeds up future read requests by the key. Besides, an update must return the new value, so there’s no avoiding hidden reads.

In B-trees, the cost of hidden reads is negligible: to update a block, it first needs to be read from disk anyway. Creating a special update operation for an LSM tree that doesn’t cause any hidden reads is really tempting.

Such an operation must contain not only a default value to be inserted if a key has no value yet, but also a list of update operations to perform if a value does exist.

At transaction execution time, Tarantool just saves the operation in an LSM tree, then “executes” it later, during a compaction.

The upsert operation:

space:upsert(tuple, {{operator, field, value}, ... })

Non-reading update or insert
Delayed execution
Background upsert squashing prevents upserts from piling up
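The idea of a non-reading write with delayed execution can be modeled in Python. This is a conceptual sketch only: it keeps a per-key list of pending upserts in memory, models just the "+" and "=" operators, and assumes no older value exists on disk. The UpsertLog class and apply_ops function are hypothetical.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, str, int]  # (operator, field, value); only "+" and "=" are modeled


def apply_ops(record: Dict[str, int], ops: List[Op]) -> Dict[str, int]:
    result = dict(record)
    for operator, field, value in ops:
        if operator == "+":
            result[field] = result.get(field, 0) + value
        elif operator == "=":
            result[field] = value
        else:
            raise ValueError(f"unsupported operator: {operator}")
    return result


class UpsertLog:
    """Per-key history of upserts that is only replayed on read or squash."""

    def __init__(self) -> None:
        self.history: Dict[str, List[Tuple[Dict[str, int], List[Op]]]] = {}

    def upsert(self, key: str, default: Dict[str, int], ops: List[Op]) -> None:
        # No read happens here: the operation is just appended.
        self.history.setdefault(key, []).append((default, ops))

    def get(self, key: str) -> Dict[str, int]:
        entries = self.history[key]
        record = dict(entries[0][0])  # the first upsert inserts its default tuple
        for _, ops in entries[1:]:    # later upserts update the existing value
            record = apply_ops(record, ops)
        return record

    def squash(self, key: str) -> None:
        # Replace the whole history with a single entry holding the final value.
        self.history[key] = [(self.get(key), [])]


log = UpsertLog()
for _ in range(3):
    log.upsert("page:1", {"hits": 1}, [("+", "hits", 1)])
print(log.get("page:1"), len(log.history["page:1"]))
```

Writes stay cheap because they never read, but every read has to replay the pending history, so the cost of a lookup grows with the number of upserts until they are squashed.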

Unfortunately, postponing the operation execution until a compaction doesn’t leave much leeway in terms of error handling. That’s why Tarantool tries to validate upserts as fully as possible before writing them to an LSM tree. However, some checks are only possible with older data on hand, for example when the update operation is trying to add a number to a string or to remove a field that doesn’t exist.

A semantically similar operation exists in many products including PostgreSQL and MongoDB. But anywhere you look, it’s just syntactic sugar that combines the update and replace operations without avoiding hidden reads. Most probably, the reason is that LSM trees as data storage structures are relatively new.

Even though an upsert is a very important optimization and implementing it cost us a lot of blood, sweat, and tears, we must admit that it has limited applicability. If a table contains secondary keys or triggers, hidden reads can’t be avoided. But if you have a scenario where secondary keys are not required and the update following the transaction completion will certainly not cause any errors, then the operation is for you.

I’d like to tell you a short story about an upsert. It takes place back when vinyl was only beginning to “mature” and we were using an upsert in production for the first time. We had what seemed like an ideal environment for it: we had tons of keys, the current time was being used as values; update operations were inserting keys or modifying the current time; and we had few reads. Load tests yielded great results.

Nevertheless, after a couple of days, the Tarantool process started eating up 100% of our CPU, and the system performance dropped close to zero.

We started digging into the issue and found out that the distribution of requests across keys was significantly different from what we had seen in the test environment. It was…well, quite nonuniform. Most keys were updated once or twice a day, so the database was idle for the most part, but there were much hotter keys with tens of thousands of updates per day. Tarantool handled those just fine. But in the case of lookups by key with tens of thousands of upserts, things quickly went downhill. To return the most recent value, Tarantool had to read and “replay” the whole history consisting of all of the upserts. When designing upserts, we had hoped this would happen automatically during a compaction, but the process never even got to that stage: the L0 size was more than enough, so there were no dumps.

We solved the problem by adding a background process that performed readaheads on any keys that had more than a few dozen upserts piled up, so all those upserts were squashed and substituted with the read value.

Secondary keys

Update is not the only operation where optimizing hidden reads is critical. Even the replace operation, given secondary keys, has to read the older value: it needs to be independently deleted from the secondary indexes, and inserting a new element might not do this, leaving some garbage behind.

If secondary indexes are not unique, then collecting “garbage” from them can be put off until a compaction, which is what we do in Tarantool.

The append-only nature of LSM trees allowed us to implement full-blown serializable transactions in vinyl. Read-only requests use older versions of data without blocking any writes. The transaction manager itself is fairly simple for now: in classical terms, it implements the MVTO (multiversion timestamp ordering) class, whereby the winning transaction is the one that finished earlier. There are no locks and associated deadlocks. Strange as it may seem, this is a drawback rather than an advantage: with parallel execution, you can increase the number of successful transactions by simply holding some of them on lock when necessary. We’re planning to improve the transaction manager soon. In the current release, we focused on making the algorithm behave 100% correctly and predictably. For example, our transaction manager is one of the few on the NoSQL market that supports so-called “gap locks”.

Difference between memtx and vinyl storage engines

The primary difference between memtx and vinyl is that memtx is an in-memory engine while vinyl is an on-disk engine. An in-memory storage engine is generally faster (each query usually runs in under 1 ms), and the memtx engine is justifiably the default for Tarantool. But an on-disk engine such as vinyl is preferable when the database is larger than the available memory, and adding more memory is not a realistic option.

Option	memtx	vinyl
Supported index type	TREE, HASH, RTREE or BITSET	TREE
Temporary spaces	Supported	Not supported
random() function	Supported	Not supported
alter() function	Supported	Supported starting from the 1.10.2 release (the primary index cannot be modified)
len() function	Returns the number of tuples in the space	Returns the maximum approximate number of tuples in the space
count() function	Takes a constant amount of time	Takes a variable amount of time depending on a state of a DB
delete() function	Returns the deleted tuple, if any	Always returns nil
yield	Does not yield on the select requests unless the transaction is committed to WAL	Yields on the select requests or on its equivalents: get() or pairs()

Configuration

Tarantool provides the ability to configure the full topology of a cluster and set parameters specific to particular instances, such as connection settings, memory used to store data, logging, and snapshot settings. Each instance uses this configuration during startup to organize the cluster.

There are two approaches to configuring Tarantool:

Since version 3.0: In the YAML format.

YAML configuration allows you to provide the full cluster topology and specify all configuration options. You can use local configuration in a YAML file for each instance or store configuration data in a reliable centralized storage.
In version 2.11 and earlier: In code using the box.cfg API.

In this case, configuration is provided in a Lua initialization script.

Note

Starting with the 3.0 version, configuring Tarantool in code is considered a legacy approach.

Configuration overview

YAML configuration describes the full topology of a Tarantool cluster. A cluster’s topology includes the following elements, starting from the lower level:

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            # ...
          instance002:
            # ...

instances

An instance represents a single running Tarantool instance. It stores data or might act as a router for handling CRUD requests in a sharded cluster.
replicasets

A replica set is a pack of instances that operate on the same data sets. Replication provides redundancy and increases data availability.
groups

A group provides the ability to organize replica sets. For example, in a sharded cluster, one group can contain storage instances and another group can contain routers used to handle CRUD requests.

You can flexibly configure a cluster’s settings on different levels: from global settings applied to all groups to parameters specific to particular instances.

Note

All the available options are documented in the Configuration reference.

Configuration in a file

This section provides an overview of how to configure Tarantool in a YAML file.

Basic instance configuration

The example below shows a sample configuration of a single Tarantool instance:

# yaml-language-server: $schema=https://download.tarantool.org/tarantool/schema/config.schema.json

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

The instances section includes only one instance named instance001. The iproto.listen.uri option sets an address used to listen for incoming requests.
The replicasets section contains one replica set named replicaset001.
The groups section contains one group named group001.

Note

The initial line in this sample contains a link to an annotated Tarantool configuration schema for a YAML language server (e.g. for LSP-Yaml). With this link you can set up your code editor (VS Code, Neovim, Sublime, etc.) to get full-text annotations and completion prompts upon Alt+ESC (Linux) / Option+ESC (macOS) when you work with Tarantool configuration.

Configuration scopes

This section shows how to control the scope to which a configuration option is applied. Most of the configuration options can be applied to a specific instance, replica set, group, or to all instances globally.

Instance

To apply certain configuration options to a specific instance, specify such options for this instance only. In the example below, iproto.listen is applied to instance001 only.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

Replica set

In this example, iproto.listen is in effect for all instances in replicaset001.

groups:
  group001:
    replicasets:
      replicaset001:
        iproto:
          listen:
          - uri: '127.0.0.1:3301'
        instances:
          instance001: { }

Group

In this example, iproto.listen is in effect for all instances in group001.

groups:
  group001:
    iproto:
      listen:
      - uri: '127.0.0.1:3301'
    replicasets:
      replicaset001:
        instances:
          instance001: { }

Global

In this example, iproto.listen is applied to all instances of the cluster.

iproto:
  listen:
  - uri: '127.0.0.1:3301'

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001: { }

Configuration scopes above are listed in the order of their precedence – from highest to lowest. For example, if the same option is defined at the instance and global level, the instance’s value takes precedence over the global one.
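The precedence rule can be modeled in a few lines of Python. This is only an illustration of the ordering, not how Tarantool resolves options internally: the flat dotted option names, the scope labels, and the resolve function are assumptions.

```python
from typing import Any, Dict, Optional

SCOPES = ("instance", "replicaset", "group", "global")  # highest precedence first


def resolve(option: str, values_by_scope: Dict[str, Dict[str, Any]]) -> Optional[Any]:
    """Return the value from the most specific scope that defines the option."""
    for scope in SCOPES:
        scoped = values_by_scope.get(scope, {})
        if option in scoped:
            return scoped[option]
    return None


config = {
    "global": {"iproto.listen": "127.0.0.1:3301"},
    "instance": {"iproto.listen": "127.0.0.1:3302"},
}
print(resolve("iproto.listen", config))  # 127.0.0.1:3302
```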

Note

The Configuration reference contains information about scopes to which each configuration option can be applied.

Configuration scopes: Replica set example

The example below shows how specific configuration options work in different configuration scopes for a replica set with a manual failover. You can learn more about configuring replication from Replication tutorials.

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: manual

groups:
  group001:
    replicasets:
      replicaset001:
        leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

credentials (global)

This section is used to create the replicator user and assign it the specified role. These options are applied globally to all instances.
iproto (global, instance)

The iproto section is specified on both global and instance levels. The iproto.advertise.peer option specifies the parameters used by an instance to connect to another instance as a replica, for example, a URI, a login and password, or SSL parameters. In the example above, the option includes login only. A URI is taken from iproto.listen that is set on the instance level.
replication (global)

The replication.failover global option sets a manual failover for all replica sets.
leader (replica set)

The <replicaset-name>.leader option sets a master instance for replicaset001.

Enabling and configuring roles

An application role is a Lua module that implements specific functions or logic. You can turn on or off a particular role for certain instances in a configuration without restarting these instances.

There can be built-in Tarantool roles, roles provided by third-party Lua modules, or custom roles that are developed as a part of a cluster application. This section describes how to enable and configure roles. To learn how to develop custom roles, see Application roles.

Enabling a role

To turn on or off a role for a specific instance or a set of instances, use the roles configuration option. The example below shows how to enable the roles.crud-router role provided by the CRUD module using the roles option:

roles: [ roles.crud-router ]

Similarly, you can enable the roles.crud-storage role to make instances act as CRUD storages:

roles: [ roles.crud-storage ]

Example on GitHub: sharded_cluster_crud

Configuring a role

The roles_cfg option allows you to specify the configuration for each role. In this option, the role name is the key and the role configuration is the value.

The example below shows how to enable statistics on called operations by providing the roles.crud-router role’s configuration:

roles:
- roles.crud-router
- roles.metrics-export
roles_cfg:
  roles.crud-router:
    stats: true
    stats_driver: metrics
    stats_quantiles: true

Example on GitHub: sharded_cluster_crud_metrics

Roles and configuration scopes

Like most configuration options, roles and their configurations can be defined at different levels. Given that the roles option has the array type and roles_cfg has the map type, there are some specifics of applying the configuration:

For roles, an instance’s role takes precedence over roles defined at another level. In the example below, instance001 has only role3:
```
# ...
replicaset001:
  roles: [ role1, role2 ]
  instances:
    instance001:
      roles: [ role3 ]
```
Learn more about the order of precedence for different configuration scopes in Configuration scopes.

For roles_cfg, the following rules are applied:

If a configuration for the same role is provided at different levels, an instance configuration takes precedence over the configuration defined at another level. In the example below, role1.greeting is 'Hi':

# ...
replicaset001:
  roles_cfg:
    role1:
      greeting: 'Hello'
  instances:
    instance001:
      roles: [ role1 ]
      roles_cfg:
        role1:
          greeting: 'Hi'

If the configurations for different roles are provided at different levels, both configurations are applied at the instance level. In the example below, instance001 has role1.greeting set to 'Hi' and role2.farewell set to 'Bye':

# ...
replicaset001:
  roles_cfg:
    role1:
      greeting: 'Hi'
  instances:
    instance001:
      roles: [ role1, role2 ]
      roles_cfg:
        role2:
          farewell: 'Bye'
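
The rules above can be modeled in a few lines of Python. This is only a sketch of the described behavior, not Tarantool's implementation: the function names are hypothetical, scopes are plain dictionaries ordered from least to most specific, and it assumes that a role's configuration is replaced as a whole when it is redefined at a more specific level.

```python
def effective_roles(scopes):
    """Return the roles array from the most specific scope that defines one.

    Arrays are replaced, not concatenated.
    """
    roles = []
    for scope in scopes:
        if "roles" in scope:
            roles = list(scope["roles"])
    return roles


def effective_roles_cfg(scopes):
    """Merge roles_cfg maps by role name; more specific scopes win.

    Assumption: a redefined role configuration replaces the outer one entirely.
    """
    merged = {}
    for scope in scopes:
        merged.update(scope.get("roles_cfg", {}))
    return merged


# Scopes are listed from the replica set down to the instance.
replicaset = {"roles": ["role1", "role2"],
              "roles_cfg": {"role1": {"greeting": "Hi"}}}
instance = {"roles": ["role3"],
            "roles_cfg": {"role2": {"farewell": "Bye"}}}

print(effective_roles([replicaset, instance]))
print(effective_roles_cfg([replicaset, instance]))
```

The first call prints only role3, and the second one contains the configurations of both role1 and role2.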

Adding labels

Labels allow adding custom attributes to your cluster configuration. A label is an arbitrary key: value pair with a string key and value.

labels:
  dc: 'east'
  production: 'false'

Labels can be defined in any configuration scope. An instance receives labels from all scopes it belongs to. The labels section in a group or a replica set scope applies to all instances of the group or a replica set. To override these labels on the instance level or add instance-specific labels, define another labels section in the instance scope.

groups:
  group001:
    replicasets:
      replicaset001:
        labels:
          dc: 'east'
          production: 'false'
        instances:
          instance001:
            labels:
              rack: '10'
              production: 'true'

Example on GitHub: labels

To access instance labels from the application code, call the config:get() function:

myapp:instance001> require('config'):get('labels')
---
- production: 'true'
  rack: '10'
  dc: east
...

Labels can be used to direct function calls to instances that match certain criteria using the connpool module.
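
To illustrate how labels from several scopes combine and how they can serve as selection criteria, here is a minimal Python sketch. The helper names are hypothetical and do not belong to the connpool API; the sketch only mirrors the override behavior shown in the example above.

```python
def instance_labels(*scopes):
    """Combine labels from outer to inner scope; inner scopes override."""
    labels = {}
    for scope_labels in scopes:
        labels.update(scope_labels)
    return labels


def matches(labels, criteria):
    """Return True if every requested key/value pair is present in labels."""
    return all(labels.get(key) == value for key, value in criteria.items())


replicaset_labels = {"dc": "east", "production": "false"}
instance001 = instance_labels(replicaset_labels,
                              {"rack": "10", "production": "true"})

print(instance001)
print(matches(instance001, {"dc": "east", "production": "true"}))
```

Here instance001 inherits dc from the replica set, overrides production, and adds rack, which matches the config:get() output shown above.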

Predefined variables

In a configuration file, you can use the following predefined variables that are replaced with actual values at runtime:

instance_name
replicaset_name
group_name

To reference these variables in a configuration file, enclose them in double curly braces, with a space between the braces and the variable name. In the example below, {{ instance_name }} is replaced with instance001.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            snapshot:
              dir: ./var/{{ instance_name }}/snapshots
            wal:
              dir: ./var/{{ instance_name }}/wals

As a result, the paths to snapshots and write-ahead logs differ for different instances.
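
The substitution can be pictured with the following Python sketch. It is an illustration only: the render function is hypothetical, and leaving unknown placeholders untouched is an assumption rather than documented behavior.

```python
import re

# Matches {{ name }} with whitespace on both sides of the variable name.
_PLACEHOLDER = re.compile(r"\{\{\s+(\w+)\s+\}\}")


def render(value, variables):
    """Replace each known {{ name }} placeholder; keep unknown ones as is."""
    def substitute(match):
        return variables.get(match.group(1), match.group(0))
    return _PLACEHOLDER.sub(substitute, value)


variables = {
    "instance_name": "instance001",
    "replicaset_name": "replicaset001",
    "group_name": "group001",
}

print(render("./var/{{ instance_name }}/snapshots", variables))
print(render("./var/{{ instance_name }}/wals", variables))
```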

Conditional configuration sections

A YAML configuration can include parts that apply only to instances that meet certain conditions. This is useful for cluster upgrade scenarios: during an upgrade, instances can be running different Tarantool versions and therefore require different configurations.

Conditional parts are defined in the conditional configuration section in the global scope. It includes one or more if subsections. Each if subsection defines conditions and configuration parts that apply to instances that meet these conditions.

The example below shows a conditional section for cluster upgrade from Tarantool 3.0.0 to Tarantool 3.1.0:

The user-defined label upgraded is true on instances that are running Tarantool 3.1.0 or later. On older versions, it is false.
Two compat options that were introduced in 3.1.0 are defined for Tarantool 3.1.0 instances. On older versions, they would cause an error.

conditional:
  - if: tarantool_version < 3.1.0
    labels:
      upgraded: 'false'
  - if: tarantool_version >= 3.1.0
    labels:
      upgraded: 'true'
    compat:
      box_error_serialize_verbose: 'new'
      box_error_unpack_type_and_code: 'new'

Example on GitHub: conditional

if sections can use one variable – tarantool_version. It contains a three-number Tarantool version and can be compared with values of the same format using the comparison operators >, <, >=, <=, ==, and !=. You can write complex conditions using the logical operators || (OR) and && (AND). Parentheses () can be used to define operator precedence.

conditional:
  - if: (tarantool_version > 3.2.0 || tarantool_version == 3.1.3) && tarantool_version <= 3.99.0
    # < ... >

If the same option is set in multiple if sections that are true for an instance, this option receives the value from the section declared last in the configuration.

Example:

conditional:
  - if: tarantool_version >= 3.0.0
    labels:
        version: '3.0' # applies to versions >= 3.0.0 and < 3.1.0
  - if: tarantool_version >= 3.1.0
    labels:
        version: '3.1+' # applies to versions >= 3.1.0
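
The following Python sketch models how such conditions could be evaluated and how the "last matching section wins" rule plays out. It is not Tarantool's parser: the function names are hypothetical, versions are compared as integer triples, and the sketch assumes that && binds tighter than || when no parentheses are given.

```python
import operator
import re

_TOKEN = re.compile(
    r"\s*(\|\||&&|\(|\)|>=|<=|==|!=|>|<|tarantool_version|\d+\.\d+\.\d+)")
_OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
        "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def _version(text):
    return tuple(int(part) for part in text.split("."))


def evaluate(condition, version):
    """Evaluate an `if` expression for the given 'X.Y.Z' version string."""
    tokens, pos = [], 0
    condition = condition.strip()
    while pos < len(condition):
        match = _TOKEN.match(condition, pos)
        if match is None:
            raise ValueError(f"cannot parse: {condition[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    current, index = _version(version), 0

    def peek():
        return tokens[index] if index < len(tokens) else None

    def take(expected=None):
        nonlocal index
        if index >= len(tokens):
            raise ValueError("unexpected end of condition")
        token = tokens[index]
        if expected is not None and token != expected:
            raise ValueError(f"expected {expected!r}, got {token!r}")
        index += 1
        return token

    def comparison():
        if peek() == "(":
            take("(")
            value = disjunction()
            take(")")
            return value
        take("tarantool_version")
        compare = _OPS[take()]
        return compare(current, _version(take()))

    def conjunction():
        value = comparison()
        while peek() == "&&":
            take("&&")
            value = comparison() and value  # always parse the right side
        return value

    def disjunction():
        value = conjunction()
        while peek() == "||":
            take("||")
            value = conjunction() or value  # always parse the right side
        return value

    result = disjunction()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


def apply_conditional(sections, version):
    """Merge labels of all matching sections; sections declared later win."""
    labels = {}
    for section in sections:
        if evaluate(section["if"], version):
            labels.update(section.get("labels", {}))
    return labels


sections = [
    {"if": "tarantool_version >= 3.0.0", "labels": {"version": "3.0"}},
    {"if": "tarantool_version >= 3.1.0", "labels": {"version": "3.1+"}},
]
print(apply_conditional(sections, "3.0.2"))
print(apply_conditional(sections, "3.1.0"))
```

For version 3.1.0 both sections match, so the label takes its value from the second section.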

Environment variables

For each configuration parameter, Tarantool provides two sets of predefined environment variables:

TT_<CONFIG_PARAMETER>. These variables are used to substitute parameters specified in a configuration file. This means that these variables have a higher priority than the options specified in a configuration file.
TT_<CONFIG_PARAMETER>_DEFAULT. These variables are used to specify default values for parameters missing in a configuration file. These variables have a lower priority than the options specified in a configuration file.

For example, TT_IPROTO_LISTEN and TT_IPROTO_LISTEN_DEFAULT correspond to the iproto.listen option. TT_SNAPSHOT_DIR and TT_SNAPSHOT_DIR_DEFAULT correspond to the snapshot.dir option. To see all the supported environment variables, execute the tarantool command with the --help-env-list option.

$ tarantool --help-env-list
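
Judging by the examples above, a variable name is derived from the option path by replacing dots with underscores and converting the result to uppercase. The Python sketch below encodes this inferred rule; the env_names function is hypothetical, and the output of --help-env-list remains the authoritative list.

```python
def env_names(option_path):
    """Return the (override, default) variable names for a dotted option path."""
    base = "TT_" + option_path.replace(".", "_").upper()
    return base, base + "_DEFAULT"


for option in ("iproto.listen", "snapshot.dir"):
    print(option, env_names(option))
```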

Note

There are also special TT_INSTANCE_NAME and TT_CONFIG environment variables that can be used to start the specified Tarantool instance with configuration from the given file.

Below are a few examples that show how to set environment variables of different types, like string, number, array, or map.

String

In this example, TT_LOG_LEVEL is used to set a logging level to CRITICAL:

$ export TT_LOG_LEVEL='crit'

Number

In this example, a logging level is set to CRITICAL using a corresponding numeric value:

$ export TT_LOG_LEVEL=3

Array

The examples below show how to set the TT_SHARDING_ROLES variable that accepts an array value. Arrays can be passed in two ways: using a simple …

$ export TT_SHARDING_ROLES=router,storage

… or JSON format:

$ export TT_SHARDING_ROLES='["router", "storage"]'

The simple format is applicable only to arrays containing scalar values.

Map

To assign map values to environment variables, you can also use simple or JSON formats. In the example below, TT_LOG_MODULES sets different logging levels for different modules using a simple format:

$ export TT_LOG_MODULES=module1=info,module2=error

In the next example, TT_ROLES_CFG is used to specify the value of a custom configuration for a role using a JSON format:

$ export TT_ROLES_CFG='{"greeter":{"greeting":"Hello"}}'

The simple format is applicable only to maps containing scalar values.
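
The difference between the two formats can be sketched in Python as follows. This is a simplified model with hypothetical function names; Tarantool's own parser may treat whitespace and value types differently.

```python
import json


def parse_array(raw):
    """Parse an array given in JSON or in the simple comma-separated format."""
    if raw.lstrip().startswith("["):
        return json.loads(raw)
    return raw.split(",")


def parse_map(raw):
    """Parse a map given in JSON or in the simple key=value,... format."""
    if raw.lstrip().startswith("{"):
        return json.loads(raw)
    return dict(item.split("=", 1) for item in raw.split(","))


print(parse_array("router,storage"))
print(parse_array('["router", "storage"]'))
print(parse_map("module1=info,module2=error"))
print(parse_map('{"greeter":{"greeting":"Hello"}}'))
```

The last call shows why nested values need JSON: the simple format has no way to express a map inside a map.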

Array of maps

In the example below, TT_IPROTO_LISTEN is used to specify a listening host and port values:

$ export TT_IPROTO_LISTEN='[{"uri":"127.0.0.1:3311"}]'

You can also pass several listening addresses:

$ export TT_IPROTO_LISTEN='[{"uri":"127.0.0.1:3311"},{"uri":"127.0.0.1:3312"}]'

Centralized configuration

Enterprise Edition

Centralized configuration storages are supported by the Enterprise Edition only.

Tarantool enables you to store configuration data in one place using a Tarantool or etcd-based storage. To achieve this, you need to:

Set up a centralized configuration storage.
Publish a cluster’s configuration to the storage.
Configure a connection to the storage by providing a local YAML configuration with an endpoint address and key prefix in the config section:
```
config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /myapp
```

Learn more from the following guide: Centralized configuration storages.

Configuration precedence

Tarantool configuration options are applied from multiple sources with the following precedence, from highest to lowest:

TT_* environment variables.
Configuration from a local YAML file.
Centralized configuration.
TT_*_DEFAULT environment variables.

If the same option is defined in two or more locations, the option with the highest precedence is applied.
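
As an illustration of this order, the Python sketch below looks up one option in the four sources and returns the first value it finds. The resolve function and the flat dictionaries keyed by dotted option paths are assumptions made for brevity; they do not reflect how Tarantool stores its configuration internally.

```python
def resolve(option, env, local_file, centralized):
    """Return (value, source) for an option, checking sources by precedence."""
    name = "TT_" + option.replace(".", "_").upper()
    sources = [
        ("TT_* environment variable", env.get(name)),
        ("local YAML file", local_file.get(option)),
        ("centralized configuration", centralized.get(option)),
        ("TT_*_DEFAULT environment variable", env.get(name + "_DEFAULT")),
    ]
    for source, value in sources:
        if value is not None:
            return value, source
    return None, None


env = {"TT_LOG_LEVEL_DEFAULT": "info", "TT_SNAPSHOT_DIR": "/data/snap"}
local_file = {"snapshot.dir": "./var/snapshots"}
centralized = {"snapshot.dir": "/srv/snap", "log.level": "warn"}

print(resolve("snapshot.dir", env, local_file, centralized))
print(resolve("log.level", env, local_file, centralized))
```

In this sketch, snapshot.dir comes from the TT_SNAPSHOT_DIR variable, while log.level comes from the centralized configuration because TT_LOG_LEVEL_DEFAULT has the lowest priority.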

Centralized configuration storages

Enterprise Edition

Centralized configuration storages are supported by the Enterprise Edition only.

Examples on GitHub: centralized_config

Tarantool enables you to store a cluster’s configuration in one reliable place using a Tarantool or etcd-based storage:

A Tarantool-based configuration storage is a replica set that stores a cluster’s configuration in synchronous spaces.
etcd is a distributed key-value storage for any type of critical data used by distributed systems.

With a local YAML configuration, you need to make sure that all cluster instances use identical configuration files.

With a centralized configuration storage, all instances get the current configuration from one place.

This topic describes how to set up a configuration storage, publish a cluster configuration to this storage, and use this configuration for all cluster instances.

Setting up a configuration storage

Tarantool-based storage

To make a replica set act as a configuration storage, use the built-in config.storage role.

Configuring a storage

To configure a Tarantool-based storage, follow the steps below:

Define a replica set topology and specify the following options at the replica set level:

Enable the config.storage role in roles.
Optionally, provide the role configuration in roles_cfg. In the example below, the status_check_interval option sets the interval (in seconds) of status checks.

groups:
  group001:
    replicasets:
      replicaset001:
        roles: [ config.storage ]
        roles_cfg:
          config.storage:
            status_check_interval: 3
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:4401'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:4402'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:4403'

Create a user and grant them the following privileges:

The read and write permissions to the config_storage and config_storage_meta spaces used to store configuration data.
The execute permission to universe to allow interacting with the storage using the tt utility.

credentials:
  users:
    sampleuser:
      password: '123456'
      privileges:
      - permissions: [ read, write ]
        spaces: [ config_storage, config_storage_meta ]
      - permissions: [ execute ]
        universe: true

Set the replication.failover option to election to enable automated failover:
```
replication:
  failover: election
```
Enable the MVCC transaction mode to provide linearizability of read operations:
```
database:
  use_mvcc_engine: true
```

The resulting storage configuration might look as follows:

credentials:
  users:
    sampleuser:
      password: '123456'
      privileges:
      - permissions: [ read, write ]
        spaces: [ config_storage, config_storage_meta ]
      - permissions: [ execute ]
        universe: true
    replicator:
      password: 'topsecret'
      roles: [ replication ]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: election

database:
  use_mvcc_engine: true

groups:
  group001:
    replicasets:
      replicaset001:
        roles: [ config.storage ]
        roles_cfg:
          config.storage:
            status_check_interval: 3
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:4401'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:4402'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:4403'

You can find the full example here: tarantool_config_storage.

Starting a storage

To start instances of the configured storage, use the tt start command, for example:

$ tt start tarantool_config_storage

Learn more from the Starting and stopping instances section.

etcd-based storage

To learn how to set up an etcd-based configuration storage, consult the etcd documentation.

The example script below demonstrates how to use the etcdctl utility to create a user that has read and write access to configurations stored under the /myapp/ prefix:

etcdctl user add root:topsecret
etcdctl role add myapp_config_manager
etcdctl role grant-permission myapp_config_manager --prefix=true readwrite /myapp/
etcdctl user add sampleuser:123456
etcdctl user grant-role sampleuser myapp_config_manager
etcdctl auth enable

The credentials of this user should be specified when configuring a connection to the etcd cluster.

Publishing a cluster’s configuration

Publishing configuration using the tt utility

The tt utility provides the tt cluster command for managing centralized cluster configurations. The tt cluster publish command can be used to publish a cluster’s configuration to both Tarantool and etcd-based storages.

The example below shows how a tt environment and a layout of the application called myapp might look:

├── tt.yaml
├── source.yaml
└── instances.enabled
    └── myapp
        ├── config.yaml
        └── instances.yml

tt.yaml: a tt configuration file.
source.yaml contains a cluster’s configuration to be published.
config.yaml contains a local configuration used to connect to the centralized storage.
instances.yml specifies instances to run in the current environment. The configured instances are used by tt when starting a cluster. tt cluster publish ignores this configuration file.

To publish a cluster’s configuration (source.yaml) to a centralized storage, execute tt cluster publish as follows:

$ tt cluster publish "http://sampleuser:123456@localhost:2379/myapp" source.yaml

Executing this command publishes the cluster configuration to the /myapp/config/all path.
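
To show how the parts of the URL in this command relate to the storage settings, here is a small Python sketch that splits the URL into an endpoint, credentials, and the resulting key. It only illustrates the mapping described above; the publish_target function is hypothetical and is not part of tt.

```python
from urllib.parse import urlsplit


def publish_target(url):
    """Split a publish URL into endpoint, credentials, and the target key."""
    parts = urlsplit(url)
    prefix = parts.path.rstrip("/")
    return {
        "endpoint": f"{parts.scheme}://{parts.hostname}:{parts.port}",
        "username": parts.username,
        "password": parts.password,
        "key": f"{prefix}/config/all",
    }


target = publish_target("http://sampleuser:123456@localhost:2379/myapp")
for field, value in target.items():
    print(field, value)
```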

Note

You can see a cluster’s configuration using the tt cluster show command.

Publishing configuration using the ‘config’ module

The config module provides the API for interacting with a Tarantool-based configuration storage. The example below shows how to read a configuration stored in the source.yaml file using the fio module API and put this configuration at the /myapp/config/all path:

local fio = require('fio')
local cluster_config_handle = fio.open('../../source.yaml')
local cluster_config = cluster_config_handle:read()
local response = config.storage.put('/myapp/config/all', cluster_config)
cluster_config_handle:close()

Learn more from the config.storage API section.

Note

The net.box module provides the ability to monitor configuration updates by watching path or prefix changes. Learn more in conn:watch().

Publishing configuration using etcdctl

To publish a cluster’s configuration to etcd using the etcdctl utility, use the put command:

$ etcdctl put /myapp/config/all < source.yaml

Note

For etcd versions earlier than 3.4, you need to set the ETCDCTL_API environment variable to 3.

Configuring connection to a storage

To use a configuration from a centralized storage for your cluster, you need to provide connection settings in a local configuration file.

Configuring connection to a Tarantool storage

Connection options for a Tarantool-based storage should be specified in the config.storage section of the configuration file. In the example below, the following options are specified:

config:
  storage:
    endpoints:
      - uri: '127.0.0.1:4401'
        login: sampleuser
        password: '123456'
      - uri: '127.0.0.1:4402'
        login: sampleuser
        password: '123456'
      - uri: '127.0.0.1:4403'
        login: sampleuser
        password: '123456'
    prefix: /myapp
    timeout: 3
    reconnect_after: 5

endpoints specifies the list of configuration storage endpoints.
prefix sets a key prefix used to search for a configuration. Tarantool searches keys by the following path: <prefix>/config/*. Note that <prefix> should start with a slash (/).
timeout specifies the interval (in seconds) to perform the status check of a configuration storage.
reconnect_after specifies how much time to wait (in seconds) before reconnecting to a configuration storage.

You can find the full example here: config_storage.

Configuring connection to an etcd storage

Connection options for etcd should be specified in the config.etcd section of the configuration file. In the example below, the following options are specified:

config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /myapp
    username: sampleuser
    password: '123456'
    http:
      request:
        timeout: 3

endpoints specifies the list of etcd endpoints.
prefix sets a key prefix used to search for a configuration. Tarantool searches keys by the following path: <prefix>/config/*. Note that <prefix> should start with a slash (/).
username and password specify credentials used for authentication.
http.request.timeout configures a request timeout for an etcd server.

You can find the full example here: config_etcd.

Starting a cluster

Note

To run instances in production, it is recommended to use Ansible Tarantool Enterprise installer (ATE). ATE is a set of Ansible playbooks that are used to deploy and maintain Tarantool Enterprise products. ATE documentation is available to users logged in on the Tarantool website.

The tt utility is the recommended way to start Tarantool instances. You can learn how to do this from the Starting and stopping instances section.

You can also use the tarantool command to start a Tarantool instance. In this case, you can skip creating a local configuration file and provide connection settings using the following environment variables:

Tarantool-based storage: TT_CONFIG_STORAGE_ENDPOINTS and TT_CONFIG_STORAGE_PREFIX.
etcd-based storage: TT_CONFIG_ETCD_ENDPOINTS and TT_CONFIG_ETCD_PREFIX.

The example below shows how to provide etcd connection settings and start cluster instances using the tarantool command:

$ export TT_CONFIG_ETCD_ENDPOINTS=http://localhost:2379
$ export TT_CONFIG_ETCD_PREFIX=/myapp

$ tarantool --name instance001
$ tarantool --name instance002
$ tarantool --name instance003

Reloading configuration

By default, Tarantool watches keys with the specified prefix for changes in a cluster’s configuration and reloads a changed configuration automatically. If necessary, you can set the config.reload option to manual to turn off configuration reloading:

config:
  reload: 'manual'
  etcd:
    # ...

In this case, you can reload a configuration in an admin console or application code using the reload() function provided by the config module:

require('config'):reload()

Configuration in code

Note

Starting with the 3.0 version, the recommended way of configuring Tarantool is using a configuration file. Configuring Tarantool in code is considered a legacy approach.

This topic covers the specifics of configuring Tarantool in code using the box.cfg API. In this case, a configuration is stored in an initialization file, that is, a Lua script with the specified configuration options. You can find all the available options in the Configuration reference.

Initialization file

If the command to start Tarantool includes an instance file, Tarantool begins by invoking the Lua program in that file, which may be named init.lua. The Lua program may get further arguments from the command line or may use operating-system functions, such as getenv(). It almost always begins by invoking box.cfg() if the database server will be used or if ports need to be opened. For example, suppose init.lua contains the following lines:

#!/usr/bin/env tarantool
box.cfg{
    listen              = os.getenv("LISTEN_URI"),
    memtx_memory        = 33554432,
    pid_file            = "tarantool.pid",
    wal_max_size        = 2500
}
print('Starting ', arg[1])

Suppose also that the environment variable LISTEN_URI contains 3301 and that the command line is tarantool init.lua ARG. Then the screen might look like this:

$ export LISTEN_URI=3301
$ tarantool init.lua ARG
... main/101/init.lua C> Tarantool 2.8.3-0-g01023dbc2
... main/101/init.lua C> log level 5
... main/101/init.lua I> mapping 33554432 bytes for memtx tuple arena...
... main/101/init.lua I> recovery start
... main/101/init.lua I> recovering from './00000000000000000000.snap'
... main/101/init.lua I> set 'listen' configuration option to "3301"
... main/102/leave_local_hot_standby I> ready to accept requests
Starting  ARG
... main C> entering the event loop

If you wish to start an interactive session on the same terminal after initialization is complete, you can pass the -i command-line option.

Environment variables

Starting from version 2.8.1, you can specify configuration parameters via special environment variables. The name of a variable should have the following pattern: TT_<NAME>, where <NAME> is the uppercase name of the corresponding box.cfg parameter.

For example:

TT_LISTEN – corresponds to the box.cfg.listen option.
TT_MEMTX_DIR – corresponds to the box.cfg.memtx_dir option.

In case of an array value, separate the array elements by a comma without space:

export TT_REPLICATION="localhost:3301,localhost:3302"

If you need to pass additional parameters for URI, use the ? and & delimiters:

export TT_LISTEN="localhost:3301?param1=value1&param2=value2"

An empty variable (TT_LISTEN=) has the same effect as an unset one, meaning that the corresponding configuration parameter won’t be set when calling box.cfg{}.
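
The naming and value rules above can be summarized with the following Python sketch, which builds a dictionary of box.cfg options from environment variables. It is a simplified model with a hypothetical function name. In particular, it splits any value that contains a comma, which is an assumption made for brevity; a real implementation would need to know which parameters accept arrays.

```python
def box_cfg_from_env(environ):
    """Collect box.cfg options from TT_<NAME> variables."""
    cfg = {}
    for name, value in environ.items():
        if not name.startswith("TT_"):
            continue
        if value == "":
            continue  # an empty variable behaves like an unset one
        key = name[len("TT_"):].lower()
        # Assumption: a comma means the value is an array.
        cfg[key] = value.split(",") if "," in value else value
    return cfg


environ = {
    "TT_LISTEN": "",
    "TT_MEMTX_DIR": "./data",
    "TT_REPLICATION": "localhost:3301,localhost:3302",
    "HOME": "/home/tarantool",
}
print(box_cfg_from_env(environ))
```

Here listen is absent from the result because TT_LISTEN is empty, and replication becomes a two-element array.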

Configuration parameters

Configuration parameters have the form:

box.cfg{[key = value [, key = value ...]]}

Configuration parameters can be set in a Lua initialization file, which is specified on the Tarantool command line.

Most configuration parameters are for allocating resources, opening ports, and specifying database behavior. All parameters are optional. Most of the parameters are dynamic, that is, they can be changed at runtime by calling box.cfg{} a second time. For example, the command below sets the listen port to 3301.

tarantool> box.cfg{ listen = 3301 }
2023-05-10 13:28:54.667 [31326] main/103/interactive I> tx_binary: stopped
2023-05-10 13:28:54.667 [31326] main/103/interactive I> tx_binary: bound to [::]:3301
2023-05-10 13:28:54.667 [31326] main/103/interactive/box.load_cfg I> set 'listen' configuration option to 3301
---
...

To see all the non-null parameters, execute box.cfg (no parentheses).

tarantool> box.cfg
---
- replication_skip_conflict: false
  wal_queue_max_size: 16777216
  feedback_host: https://feedback.tarantool.io
  memtx_dir: .
  memtx_min_tuple_size: 16
  -- other parameters --
...

To see a particular parameter value, call a corresponding box.cfg option. For example, box.cfg.listen shows the specified listen address.

tarantool> box.cfg.listen
---
- 3301
...

Listen URI

Some configuration parameters and some functions depend on a URI (Uniform Resource Identifier). The URI string format is similar to the generic syntax for a URI schema. It may contain (in order):

user name for login
password
host name or host IP address
port number
query parameters

Only a port number is always mandatory. A password is mandatory if a user name is specified unless the user name is ‘guest’.

Formally, the URI syntax is [host:]port or [username:password@]host:port. If a host is omitted, then “0.0.0.0” or “[::]” is assumed, meaning respectively any IPv4 address or any IPv6 address on the local machine. If username:password is omitted, then the “guest” user is assumed. Some examples:

URI fragment	Example
port	3301
host:port	127.0.0.1:3301
username:password@host:port	notguest:sesame@mail.ru:3301

In code, the URI value can be passed as a number (if only a port is specified) or a string:

box.cfg { listen = 3301 }

box.cfg { listen = "127.0.0.1:3301" }

In certain circumstances, a Unix domain socket may be used where a URI is expected, for example, unix/:/tmp/unix_domain_socket.sock or simply /tmp/unix_domain_socket.sock.

The uri module provides functions that convert URI strings into their components or turn components into URI strings.
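
To make the syntax rules above concrete, the Python sketch below splits a URI of the form [host:]port or [username:password@]host:port into its parts and applies the guest default. It is an illustration with a hypothetical function name, not a port of the uri module, and it deliberately ignores Unix domain sockets and query parameters.

```python
def parse_uri(uri):
    """Split '[username:password@]host:port' or '[host:]port' into parts."""
    text = str(uri)
    user, password = "guest", None
    if "@" in text:
        credentials, text = text.rsplit("@", 1)
        user, _, password = credentials.partition(":")
        password = password or None
    host, separator, port = text.rpartition(":")
    return {
        "user": user,
        "password": password,
        "host": host if separator else None,  # None means "any local address"
        "port": int(port),
    }


for example in (3301, "127.0.0.1:3301", "notguest:sesame@mail.ru:3301"):
    print(parse_uri(example))
```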

Specifying several URIs

Starting from version 2.10.0, a user can open several listening iproto sockets on a Tarantool instance and, consequently, can specify several URIs in the configuration parameters such as box.cfg.listen and box.cfg.replication.

URI values can be set in a number of ways:

As a string with URI values separated by commas.

box.cfg { listen = "127.0.0.1:3301, /unix.sock, 3302" }

As a table that contains URIs in the string format.

box.cfg { listen = {"127.0.0.1:3301", "/unix.sock", "3302"} }

As an array of tables with the uri field.

box.cfg { listen = {
        {uri = "127.0.0.1:3301"},
        {uri = "/unix.sock"},
        {uri = 3302}
    }
}

In a combined way – an array that contains URIs in both the string and the table formats.

box.cfg { listen = {
        "127.0.0.1:3301",
        { uri = "/unix.sock" },
        { uri = 3302 }
    }
}

Also, starting from version 2.10.0, it is possible to specify additional parameters for URIs. You can do this in different ways:

Using the ? delimiter when URIs are specified in a string format.

box.cfg { listen = "127.0.0.1:3301?p1=value1&p2=value2, /unix.sock?p3=value3" }

Using the params table: a URI is passed in a table with additional parameters in the “params” table. Parameters in the “params” table overwrite the ones from a URI string (“value2” overwrites “value1” for p1 in the example below).
```
box.cfg { listen = {
        "127.0.0.1:3301?p1=value1",
        params = {p1 = "value2", p2 = "value3"}
    }
}
```
Using the default_params table for specifying default parameter values.

In the example below, two URIs are passed in a table. The default value for the p3 parameter is defined in the default_params table and used if this parameter is not specified in URIs. Parameters in the default_params table are applicable to all the URIs passed in a table.
```
box.cfg { listen = {
        "127.0.0.1:3301?p1=value1",
        { uri = "/unix.sock", params = { p2 = "value2" } },
        default_params = { p3 = "value3" }
    }
}
```

The recommended way for specifying URI with additional parameters is the following:

box.cfg { listen = {
        {uri = "127.0.0.1:3301", params = {p1 = "value1"}},
        {uri = "/unix.sock", params = {p2 = "value2"}},
        {uri = 3302, params = {p3 = "value3"}}
    }
}

In case of a single URI, the following syntax also works:

box.cfg { listen = {
        uri = "127.0.0.1:3301",
        params = { p1 = "value1", p2 = "value2" }
    }
}
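
The ways of passing URIs and their parameters described above can be reduced to one normalized form. The Python sketch below does this under several assumptions: the function names are hypothetical, Lua tables that mix a URI with a params field are modeled as dictionaries with the uri and params keys, and default_params is passed as a separate argument.

```python
def split_uri(text):
    """Split 'uri?k1=v1&k2=v2' into (uri, params)."""
    uri, _, query = text.strip().partition("?")
    params = {}
    if query:
        params = dict(pair.split("=", 1) for pair in query.split("&"))
    return uri, params


def normalize(entries, default_params=None):
    """Return a list of (uri, params) pairs for any supported input form."""
    if isinstance(entries, str):
        entries = entries.split(",")
    result = []
    for entry in entries:
        if isinstance(entry, dict):
            uri, params = split_uri(str(entry["uri"]))
            # Values from the params table overwrite those in the URI string.
            params.update(entry.get("params", {}))
        else:
            uri, params = split_uri(str(entry))
        # Defaults apply only to parameters that are still missing.
        for key, value in (default_params or {}).items():
            params.setdefault(key, value)
        result.append((uri, params))
    return result


print(normalize("127.0.0.1:3301?p1=value1&p2=value2, /unix.sock?p3=value3"))
print(normalize([{"uri": "127.0.0.1:3301?p1=value1",
                  "params": {"p1": "value2", "p2": "value3"}}]))
print(normalize(["127.0.0.1:3301?p1=value1",
                 {"uri": "/unix.sock", "params": {"p2": "value2"}}],
                default_params={"p3": "value3"}))
```

In the second call, p1 ends up as value2 because the params table takes priority over the URI string. In the third call, both URIs receive p3 from the defaults.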

Traffic encryption

Enterprise Edition

Traffic encryption is supported by the Enterprise Edition only.

Since version 2.10.0, Tarantool Enterprise Edition has built-in support for using SSL to encrypt client-server communications over binary connections, that is, between Tarantool instances in a cluster or when connecting to an instance via connectors using net.box.

Tarantool uses the OpenSSL library that is included in the delivery package. Note that SSL connections use only TLSv1.2.

Configuration

To configure traffic encryption, you need to set the special URI parameters for a particular connection. The parameters can be set for the following box.cfg options and net.box method:

box.cfg.listen – on the server side.
box.cfg.replication – on the client side.
net_box_object.connect() – on the client side.

Below is the list of the parameters. In the next section, you can find details and examples on what should be configured on both the server side and the client side.

transport – enables SSL encryption for a connection if set to ssl. The default value is plain, which means the encryption is off. If the parameter is not set, the encryption is off too. Other encryption-related parameters can be used only if transport = 'ssl' is set.

Example:

local connection = require('net.box').connect({
    uri = 'admin:topsecret@127.0.0.1:3301',
    params = { transport = 'ssl',
               ssl_cert_file = 'certs/instance001/server001.crt',
               ssl_key_file = 'certs/instance001/server001.key',
               ssl_password = 'qwerty' }
})

ssl_key_file – a path to a private SSL key file. Mandatory for a server. For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional. If the private key is encrypted, provide a password for it in the ssl_password or ssl_password_file parameter.
ssl_cert_file – a path to an SSL certificate file. Mandatory for a server. For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional.
ssl_ca_file – a path to a trusted certificate authorities (CA) file. Optional. If not set, the peer won’t be checked for authenticity.

Both a server and a client can use the ssl_ca_file parameter:
- If it’s on the server side, the server verifies the client.
- If it’s on the client side, the client verifies the server.
- If both sides have the CA files, the server and the client verify each other.
ssl_ciphers – a colon-separated (:) list of SSL cipher suites the connection can use. See the Supported ciphers section for details. Optional. Note that the list is not validated: if a cipher suite is unknown, Tarantool ignores it, doesn’t establish the connection, and writes to the log that no shared cipher was found.
ssl_password – a password for an encrypted private SSL key. Optional. Alternatively, the password can be provided in ssl_password_file.
ssl_password_file – a text file with one or more passwords for encrypted private SSL keys (each on a separate line). Optional. Alternatively, the password can be provided in ssl_password.

Tarantool applies the ssl_password and ssl_password_file parameters in the following order:
1. If ssl_password is provided, Tarantool tries to decrypt the private key with it.
2. If ssl_password is incorrect or isn’t provided, Tarantool tries all passwords from ssl_password_file one by one in the order they are written.
3. If ssl_password and all passwords from ssl_password_file are incorrect, or none of them is provided, Tarantool treats the private key as unencrypted.
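
The order above can be expressed as a short Python sketch. The decrypt_key function and the try_decrypt callback are hypothetical: the callback stands in for the actual key decryption and returns True when the given password fits.

```python
def decrypt_key(try_decrypt, ssl_password=None, file_passwords=()):
    """Apply the passwords in the documented order and report the outcome."""
    # Step 1: the password from ssl_password, if provided.
    if ssl_password is not None and try_decrypt(ssl_password):
        return "ssl_password"
    # Step 2: passwords from ssl_password_file, in the order they are written.
    for line_number, password in enumerate(file_passwords, start=1):
        if try_decrypt(password):
            return f"ssl_password_file, line {line_number}"
    # Step 3: nothing matched or nothing was provided.
    return "treated as unencrypted"


def fits(password):
    """Stand-in for real decryption: only 'topsecret' opens the key."""
    return password == "topsecret"


print(decrypt_key(fits, ssl_password="topsecret"))
print(decrypt_key(fits, ssl_password="wrong", file_passwords=["qwerty", "topsecret"]))
print(decrypt_key(fits))
```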

Configuration example:

box.cfg{ listen = {
    uri = 'localhost:3301',
    params = {
        transport = 'ssl',
        ssl_key_file = '/path_to_key_file',
        ssl_cert_file = '/path_to_cert_file',
        ssl_ciphers = 'HIGH:!aNULL',
        ssl_password = 'topsecret'
    }
}}

Supported ciphers

Tarantool Enterprise supports the following cipher suites:

ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-RSA-AES256-GCM-SHA384
DHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-CHACHA20-POLY1305
ECDHE-RSA-CHACHA20-POLY1305
DHE-RSA-CHACHA20-POLY1305
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES256-SHA384
ECDHE-RSA-AES256-SHA384
DHE-RSA-AES256-SHA256
ECDHE-ECDSA-AES128-SHA256
ECDHE-RSA-AES128-SHA256
DHE-RSA-AES128-SHA256
ECDHE-ECDSA-AES256-SHA
ECDHE-RSA-AES256-SHA
DHE-RSA-AES256-SHA
ECDHE-ECDSA-AES128-SHA
ECDHE-RSA-AES128-SHA
DHE-RSA-AES128-SHA
AES256-GCM-SHA384
AES128-GCM-SHA256
AES256-SHA256
AES128-SHA256
AES256-SHA
AES128-SHA
GOST2012-GOST8912-GOST8912
GOST2001-GOST89-GOST89

Tarantool Enterprise static build has the embedded engine to support the GOST cryptographic algorithms. If you use these algorithms for traffic encryption, specify the corresponding cipher suite in the ssl_ciphers parameter, for example:

box.cfg{ listen = {
    uri = 'localhost:3301',
    params = {
        transport = 'ssl',
        ssl_key_file = '/path_to_key_file',
        ssl_cert_file = '/path_to_cert_file',
        ssl_ciphers = 'GOST2012-GOST8912-GOST8912'
    }
}}

For detailed information on SSL ciphers and their syntax, refer to OpenSSL documentation.

Using environment variables

The URI parameters for traffic encryption can also be set via environment variables, for example:

export TT_LISTEN="localhost:3301?transport=ssl&ssl_cert_file=/path_to_cert_file&ssl_key_file=/path_to_key_file"
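
To illustrate how the URI parameters map to the query string in TT_LISTEN, here is a small plain-Lua sketch. The build_uri helper is hypothetical and not part of Tarantool. It assumes that the parameter order in the query string does not matter and that the values need no escaping.

```lua
-- Hypothetical helper: build a listen URI with query parameters
-- from a Lua table. Keys are sorted only to make the output stable.
local function build_uri(address, params)
    local keys = {}
    for key in pairs(params) do
        keys[#keys + 1] = key
    end
    table.sort(keys)
    local parts = {}
    for _, key in ipairs(keys) do
        parts[#parts + 1] = key .. '=' .. params[key]
    end
    return address .. '?' .. table.concat(parts, '&')
end

local uri = build_uri('localhost:3301', {
    transport = 'ssl',
    ssl_cert_file = '/path_to_cert_file',
    ssl_key_file = '/path_to_key_file',
})
print(uri)
```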

Server-client configuration details

When configuring traffic encryption, you need to specify the necessary parameters on both the server side and the client side. Below is a summary of the options and parameters to use, followed by configuration examples.

Server side

Is configured via the box.cfg.listen option.
Mandatory URI parameters: transport, ssl_key_file and ssl_cert_file.
Optional URI parameters: ssl_ca_file, ssl_ciphers, ssl_password, and ssl_password_file.

Client side

Is configured via the box.cfg.replication option (see details) or net_box_object.connect().

Parameters:

If the server side has only the transport, ssl_key_file and ssl_cert_file parameters set, on the client side, you need to specify only transport = ssl as the mandatory parameter. All other URI parameters are optional.
If the server side also has the ssl_ca_file parameter set, on the client side, you need to specify transport, ssl_key_file and ssl_cert_file as the mandatory parameters. Other parameters – ssl_ca_file, ssl_ciphers, ssl_password, and ssl_password_file – are optional.
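
The two rules above can be expressed as a small decision function. This is a sketch in plain Lua with a hypothetical helper name (required_client_params); it only restates the rules and does not call any Tarantool API.

```lua
-- Hypothetical helper: list the URI parameters that are mandatory
-- on the client side, given the server-side listen parameters.
local function required_client_params(server_params)
    if server_params.ssl_ca_file ~= nil then
        -- The server has a CA file set: the client must present
        -- its own key and certificate.
        return {'transport', 'ssl_key_file', 'ssl_cert_file'}
    end
    -- No CA file on the server: only the transport is mandatory.
    return {'transport'}
end

local without_ca = required_client_params({
    transport = 'ssl',
    ssl_key_file = '/path_to_key_file',
    ssl_cert_file = '/path_to_cert_file',
})
local with_ca = required_client_params({
    transport = 'ssl',
    ssl_key_file = '/path_to_key_file',
    ssl_cert_file = '/path_to_cert_file',
    ssl_ca_file = '/path_to_ca_file',
})
print(table.concat(without_ca, ', '))
print(table.concat(with_ca, ', '))
```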

Configuration examples

Suppose there is a master-replica set with two Tarantool instances:

127.0.0.1:3301 – master (server)
127.0.0.1:3302 – replica (client)

The examples below show the configuration related to connection encryption for two cases: when the trusted certificate authorities (CA) file is not set on the server side and when it is. Only mandatory URI parameters are mentioned in these examples.

Without CA

127.0.0.1:3301 – master (server)

box.cfg{
    listen = {
        uri = '127.0.0.1:3301',
        params = {
            transport = 'ssl',
            ssl_key_file = '/path_to_key_file',
            ssl_cert_file = '/path_to_cert_file'
        }
    }
}

127.0.0.1:3302 – replica (client)

box.cfg{
    listen = {
        uri = '127.0.0.1:3302',
        params = {transport = 'ssl'}
    },
    replication = {
        uri = 'username:password@127.0.0.1:3301',
        params = {transport = 'ssl'}
    },
    read_only = true
}

With CA

127.0.0.1:3301 – master (server)

box.cfg{
    listen = {
        uri = '127.0.0.1:3301',
        params = {
            transport = 'ssl',
            ssl_key_file = '/path_to_key_file',
            ssl_cert_file = '/path_to_cert_file',
            ssl_ca_file = '/path_to_ca_file'
        }
    }
}

127.0.0.1:3302 – replica (client)

box.cfg{
    listen = {
        uri = '127.0.0.1:3302',
        params = {
            transport = 'ssl',
            ssl_key_file = '/path_to_key_file',
            ssl_cert_file = '/path_to_cert_file'
        }
    },
    replication = {
        uri = 'username:password@127.0.0.1:3301',
        params = {
            transport = 'ssl',
            ssl_key_file = '/path_to_key_file',
            ssl_cert_file = '/path_to_cert_file'
        }
    },
    read_only = true
}

Starting a Tarantool instance

Below is the syntax for starting a Tarantool instance configured in a Lua initialization script:

$ tarantool LUA_INITIALIZATION_FILE [OPTION ...]

The tarantool command also provides a set of options that might be helpful for development purposes.

The command below starts a Tarantool instance configured in the init.lua file:

$ tarantool init.lua

Storage

This section contains guides on configuring a storage.

In-memory storage

Example on GitHub: memtx

In Tarantool, all data is stored in random-access memory (RAM) by default. For this purpose, the memtx storage engine is used.

This topic describes how to define basic settings related to in-memory storage in the memtx section of a YAML configuration – for example, memory size and maximum tuple size. For the specific settings related to allocator or sorting threads, check the corresponding memtx options in the Configuration reference.

Note

To estimate the required amount of memory, you can use the sizing calculator.

Memory size

In Tarantool, data is stored in spaces. Each space consists of tuples – the database records. To specify the amount of memory that Tarantool allocates to store tuples, use the memtx.memory configuration option.

In the example below, the memory size is set to 1 GB (1073741824 bytes):

memtx:
  memory: 1073741824

The server does not exceed this limit when allocating tuples. Additional memory is used for indexes and connection information.

When the memtx.memory limit is reached, INSERT or UPDATE requests fail with ER_MEMORY_ISSUE.

Tuple size

You can configure the minimum and the maximum tuple sizes in bytes.

If the tuples are small, you can decrease the minimum size.
If the tuples are large, you can increase the maximum size.

To define the tuple size, use the memtx.min_tuple_size and memtx.max_tuple_size configuration options.

In the example, the minimum size is set to 8 bytes and the maximum size is set to 5 MB:

memtx:
  memory: 1073741824
  min_tuple_size: 8
  max_tuple_size: 5242880
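
The byte values used in the memtx examples can be double-checked with a few lines of plain Lua. The constant names below are only illustrative.

```lua
-- Binary size units expressed in bytes.
local MB = 1024 * 1024
local GB = 1024 * MB

local memory = 1 * GB          -- value used for memtx.memory
local max_tuple_size = 5 * MB  -- value used for memtx.max_tuple_size

print(memory)
print(max_tuple_size)
```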

Persistence

To ensure data persistence, Tarantool provides the abilities to:

Record each data change request into a write-ahead log (WAL) file (.xlog files).

When a power outage occurs or the Tarantool instance is killed accidentally, the in-memory database is lost. In such a case, Tarantool restores the data from WAL files by reading them and redoing the requests. This is called the “recovery process”.

Take snapshots that contain an on-disk copy of the entire data set for a given moment (.snap files).

During the recovery process, Tarantool can load the latest snapshot file and then read the requests from the WAL files, produced after this snapshot was made. After creating a new snapshot, the earlier WAL files can be removed to free up space.

This topic describes how to configure:

the snapshot creation in the snapshot section of a YAML configuration.
the recording to the write-ahead log in the wal section of a YAML configuration.

To learn more about the persistence mechanism in Tarantool, see the Persistence section. The formats of WAL and snapshot files are described in detail in the File formats section.

Configure the snapshots

Example on GitHub: snapshot

This section describes how to define snapshot settings in the snapshot section of a YAML configuration.

Note

To force immediate creation of a snapshot file, use the box.snapshot() function.

Set up automatic snapshot creation

In Tarantool, it is possible to automate the snapshot creation. Automatic creation is enabled by default and can be configured in two ways:

A new snapshot is taken once in a given period (see snapshot.by.interval).
A new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit (see snapshot.by.wal_size).

The snapshot.by.interval option sets up the checkpoint daemon that takes a new snapshot every snapshot.by.interval seconds. If the snapshot.by.interval option is set to zero, the checkpoint daemon is disabled.

The snapshot.by.wal_size option defines the maximum size in bytes for all WAL files created since the last snapshot taken. Once this size is exceeded, the checkpoint daemon takes a snapshot. Then, Tarantool garbage collector deletes the old WAL files.

The example shows how to specify the snapshot.by.interval and the snapshot.by.wal_size options:

by:
  interval: 7200
  wal_size: 1000000000000000000

In the example, a new snapshot is created in two cases:

every 2 hours (every 7200 seconds)
when the size for all WAL files created since the last snapshot reaches the size of 1e18 (1000000000000000000) bytes.
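
The two triggers can be pictured as a simple predicate. The following plain-Lua sketch is a model for illustration only, not the checkpoint daemon's actual scheduling logic. The snapshot_due function and its arguments are hypothetical, and the case of a zero interval is deliberately not modeled.

```lua
-- Hypothetical model: decide whether a new snapshot is due.
-- cfg mirrors the snapshot.by.* options from the example.
local function snapshot_due(cfg, seconds_since_last, wal_bytes_since_last)
    -- Trigger 1: the configured period has passed.
    local by_interval = seconds_since_last >= cfg.interval
    -- Trigger 2: WAL files written since the last snapshot
    -- exceed the configured size.
    local by_wal_size = wal_bytes_since_last > cfg.wal_size
    return by_interval or by_wal_size
end

local cfg = { interval = 7200, wal_size = 1e18 }
print(snapshot_due(cfg, 3600, 1024))  -- neither trigger fires
print(snapshot_due(cfg, 7200, 1024))  -- the interval has passed
```

With a WAL size limit as large as in the example, the interval trigger fires first in practice.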

Specify a directory for snapshot files

To configure a directory where the snapshot files are stored, use the snapshot.dir configuration option. The example below shows how to specify a snapshot directory for instance001 explicitly:

instance001:
  snapshot:
    dir: 'var/lib/{{ instance_name }}/snapshots'

By default, WAL files and snapshot files are stored in the same directory var/lib/{{ instance_name }}. However, you can specify different directories for them. For example, you can place snapshots and write-ahead logs on different hard drives for better reliability:

instance001:
  snapshot:
    dir: '/media/drive1/snapshots'
  wal:
    dir: '/media/drive2/wals'

Configure a maximum number of stored snapshots

You can set a limit on the number of snapshots stored in the snapshot.dir directory using the snapshot.count option. Once the number of snapshots reaches the given limit, Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files after the new snapshot is taken.

In the example below, the snapshot is created every two hours (every 7200 seconds) until there are three snapshots in the snapshot.dir directory. After creating a new snapshot (the fourth one), the oldest snapshot and the corresponding WALs are deleted.

count: 3
by:
  interval: 7200
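
The retention rule can be sketched in plain Lua as follows. This is a simplified model with hypothetical names (take_snapshot, the snap1..snap4 labels). It tracks only snapshot names and ignores the associated WAL files.

```lua
-- Hypothetical model of snapshot.count retention.
-- snapshots is a list ordered from oldest to newest.
local function take_snapshot(snapshots, name, count)
    snapshots[#snapshots + 1] = name
    local removed = {}
    -- After the new snapshot is taken, drop the oldest ones
    -- until the limit is respected again.
    while #snapshots > count do
        removed[#removed + 1] = table.remove(snapshots, 1)
    end
    return removed
end

local snapshots = {}
local removed
for i = 1, 4 do
    removed = take_snapshot(snapshots, 'snap' .. i, 3)
end
print(table.concat(snapshots, ', '))  -- the three newest snapshots
print(table.concat(removed, ', '))    -- removed after the fourth one
```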

Configure the write-ahead log

Example on GitHub: wal

This section describes how to define WAL settings in the wal section of a YAML configuration.

Set the WAL mode

The recording to the write-ahead log is enabled by default. It means that if an instance restart occurs, the data will be recovered. The recording to the WAL can be configured using the wal.mode configuration option.

There are two modes that enable writing to the WAL:

write (default) – enable WAL and write the data without waiting for the data to be flushed to the storage device.
fsync – enable WAL and ensure that the record is written to the storage device.

The example below shows how to specify the write WAL mode:

mode: 'write'

To turn the WAL writer off, set the wal.mode option to none.

Specify a directory for WAL files

To configure a directory where the WAL files are stored, use the wal.dir configuration option. The example below shows how to specify a directory for instance001 explicitly:

instance001:
  wal:
    dir: 'var/lib/{{ instance_name }}/wals'

Set an interval between scans

In case of replication or hot standby mode, Tarantool scans for changes in the WAL files every wal.dir_rescan_delay seconds. The example below shows how to specify the interval between scans:

dir_rescan_delay: 3

Set a maximum size for the WAL file

A new WAL file is created when the current one reaches the wal.max_size size. The configuration for this option might look as follows:

max_size: 268435456

Set a delay for the garbage collector

In Tarantool, the checkpoint daemon takes new snapshots at the given interval (see snapshot.by.interval). After an instance restart, the Tarantool garbage collector deletes the old WAL files.

To delay the immediate deletion of WAL files, use the wal.cleanup_delay configuration option. The delay eliminates possible erroneous situations when the master deletes WALs needed by replicas after restart. As a consequence, replicas sync with the master faster after its restart and don’t need to download all the data again.

In the example, the delay is set to 5 hours (18000 seconds):

cleanup_delay: 18000

Specify the WAL extensions

In Tarantool Enterprise, you can store an old and new tuple for each CRUD operation performed. A detailed description and examples of the WAL extensions are provided in the WAL extensions section.

See also: wal.ext.* configuration options.

Checkpoint daemon

The checkpoint daemon (snapshot daemon) is a constantly running fiber. The checkpoint daemon creates a schedule for the periodic snapshot creation based on the configuration options and the speed of file size growth. If enabled, the daemon makes new snapshot (.snap) files according to this schedule.

The work of the checkpoint daemon is based on the following configuration options:

snapshot.by.interval – a new snapshot is taken once in a given period.
snapshot.by.wal_size – a new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit.

If necessary, the checkpoint daemon also activates the Tarantool garbage collector that deletes old snapshots and WAL files.

Note

The memtx engine takes only regular snapshots with the interval set in the checkpoint daemon configuration.

The vinyl engine runs checkpointing in the background at all times.

Tarantool garbage collector

Tarantool garbage collector can be activated by the checkpoint daemon. The garbage collector tracks the snapshots that are to be relayed to a replica or needed by other consumers. When the files are no longer needed, Tarantool garbage collector deletes them.

Note

The garbage collector called by the checkpoint daemon is distinct from the Lua garbage collector which is for Lua objects, and distinct from the Tarantool garbage collector that specializes in handling shard buckets.

This garbage collector is called as follows:

When the number of snapshots reaches the snapshot.count limit. After a new snapshot is taken, Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
When the size of all WAL files created since the last snapshot reaches the limit of snapshot.by.wal_size. Once this size is exceeded, the checkpoint daemon takes a snapshot, then the garbage collector deletes the old WAL files.

If an old snapshot file is deleted, the Tarantool garbage collector also deletes any write-ahead log (.xlog) files that meet the following conditions:

The WAL files are older than the snapshot file.
The WAL files contain information present in the snapshot file.

Tarantool garbage collector also deletes obsolete vinyl .run files.

Tarantool garbage collector doesn’t delete a file in the following cases:

A backup is running, and the file has not been backed up (see Hot backup).
Replication is running, and the file has not been relayed to a replica (see Replication architecture).
A replica is connecting.
A replica has fallen behind. The progress of each replica is tracked; if a replica’s position is far from being up to date, then the server stops to give it a chance to catch up. If an administrator concludes that a replica is permanently down, then the correct procedure is to restart the server, or (preferably) remove the replica from the cluster.

WAL extensions

Enterprise Edition

WAL extensions are available in the Enterprise Edition only.

WAL extensions allow you to add auxiliary information to each write-ahead log record. For example, you can enable storing an old and new tuple for each CRUD operation performed. This information might be helpful for implementing a CDC (Change Data Capture) utility that transforms a data replication stream.

Configuration

WAL extensions are disabled by default. To configure them, use the wal.ext.* configuration options. Inside the wal.ext block, you can enable storing old and new tuples as follows:

To store old and new tuples in a write-ahead log for all spaces, set the wal.ext.old and wal.ext.new options to true:
```
ext:
  new: true
  old: true
```
To adjust these options for specific spaces, specify the wal.ext.spaces option:
```
wal:
  ext:
    old: true
    new: true
    spaces:
      space1:
        old: false
      space2:
        new: false
```
The configuration for specific spaces has priority over the configuration in the wal.ext.new and wal.ext.old options. It means that only new tuples are added to the log for space1 and only old tuples for space2.
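
The priority rule can be illustrated with a plain-Lua sketch that resolves the effective settings for a space. The effective_ext function is hypothetical, and the ext table simply mirrors the YAML above. Tarantool performs this resolution internally.

```lua
-- Hypothetical helper: resolve effective wal.ext settings for a space.
local function effective_ext(ext, space_name)
    local override = (ext.spaces or {})[space_name] or {}
    local function pick(key)
        -- A per-space value, when present, wins over the global one.
        if override[key] ~= nil then
            return override[key]
        end
        return ext[key] or false
    end
    return { old = pick('old'), new = pick('new') }
end

local ext = {
    old = true,
    new = true,
    spaces = {
        space1 = { old = false },
        space2 = { new = false },
    },
}

local s1 = effective_ext(ext, 'space1')
local s2 = effective_ext(ext, 'space2')
print(s1.old, s1.new)
print(s2.old, s2.new)
```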

Note that records with additional fields are replicated as follows:

If a replica doesn’t support the extended format configured on a master, auxiliary fields are skipped.
If a replica and master have different configurations for WAL records, the master’s configuration is ignored.

Example

The table below demonstrates how write-ahead log records might look for the specific CRUD operations if storing old and new tuples is enabled for the bands space.

Operation	Example	WAL information
insert	`bands:insert{4, 'The Beatles', 1960}`	new_tuple: [4, ‘The Beatles’, 1960] tuple: [4, ‘The Beatles’, 1960]
delete	`bands:delete{4}`	key: [4] old_tuple: [4, ‘The Beatles’, 1960]
update	`bands:update({2}, {{'=', 2, 'Pink Floyd'}})`	new_tuple: [2, ‘Pink Floyd’, 1965] old_tuple: [2, ‘Scorpions’, 1965] key: [2] tuple: [[‘=’, 2, ‘Pink Floyd’]]
upsert	`bands:upsert({2, 'Pink Floyd', 1965}, {{'=', 2, 'The Doors'}})`	new_tuple: [2, ‘The Doors’, 1965] old_tuple: [2, ‘Pink Floyd’, 1965] operations: [[‘=’, 2, ‘The Doors’]] tuple: [2, ‘Pink Floyd’, 1965]
replace	`bands:replace{1, 'The Beatles', 1960}`	old_tuple: [1, ‘Roxette’, 1986] new_tuple: [1, ‘The Beatles’, 1960] tuple: [1, ‘The Beatles’, 1960]

Storing both old and new tuples is especially useful for the update operation because a write-ahead log record contains only a key value.

Note

To see the contents of a write-ahead log, use the command described in Printing the contents of .snap and .xlog files.

Defining and manipulating data

Tarantool stores data in spaces, which can be thought of as tables in a relational database. Every record or row in a space is called a tuple. A tuple may have any number of fields, and the fields may be of different types.

String data in fields are compared based on the specified collation rules. The user can provide hard limits for data values through constraints and link related spaces with foreign keys.

Tarantool supports highly customizable indexes of various types. In particular, indexes can be defined with generators like sequences.

There are six basic data operations in Tarantool: SELECT, INSERT, UPDATE, UPSERT, REPLACE, and DELETE. A number of complexity factors affect the resource usage of each operation.

Tarantool allows describing the data schema but does not require it. The user can migrate a schema without migrating the data.

To ensure data persistence and recover quickly in case of failure, Tarantool uses mechanisms like the write-ahead log (WAL) and snapshots.

This section contains guides on performing data operations in Tarantool.

Data storage

Tuples

Tarantool operates on data in the form of tuples.

tuple

A tuple is a group of data values in Tarantool’s memory. Think of it as a “database record” or a “row”. The data values in the tuple are called fields.

When Tarantool returns a tuple value in the console, by default, it uses YAML format, for example: [3, 'Ace of Base', 1993].

Internally, Tarantool stores tuples as MsgPack arrays.

field

Fields are distinct data values, contained in a tuple. They play the same role as “row columns” or “record fields” in relational databases, with a few improvements:

fields can be composite structures, such as arrays or maps,
fields don’t need to have names.

A given tuple may have any number of fields, and the fields may be of different types.

The field’s number is the identifier of the field. Numbers are counted from base 1 in Lua and other 1-based languages, or from base 0 in languages like PHP or C/C++. So, 1 or 0 can be used in some contexts to refer to the first field of a tuple.

Spaces

Tarantool stores tuples in containers called spaces.

space

In Tarantool, a space is a primary container that stores data. It is analogous to tables in relational databases. Spaces contain tuples – the Tarantool name for database records. The number of tuples in a space is unlimited.

At least one space is required to store data with Tarantool. Each space has the following attributes:

a unique name specified by the user,
a unique numeric identifier which can be specified by the user, but usually is assigned automatically by Tarantool,
an engine: memtx (default) — in-memory engine, fast but limited in size, or vinyl — on-disk engine for huge data sets.

To be functional, a space also needs to have a primary index. It can also have secondary indexes.

Data types

Tarantool is both a database manager and an application server. Therefore a developer often deals with two type sets: the types of the programming language (such as Lua) and the types of the Tarantool storage format (MsgPack).

Lua versus MsgPack

Scalar / compound	MsgPack type	Lua type	Example value
scalar	nil	cdata	box.NULL
scalar	boolean	boolean	`true`
scalar	string	string	`'A B C'`
scalar	integer	number	`12345`
scalar	integer	cdata	`12345`
scalar	float64 (double)	number	`1.2345`
scalar	float64 (double)	cdata	`1.2345`
scalar	binary	cdata	`[!!binary 3t7e]`
scalar	ext (for Tarantool `decimal`)	cdata	`1.2`
scalar	ext (for Tarantool `datetime`)	cdata	`'2021-08-20T16:21:25.122999906 Europe/Berlin'`
scalar	ext (for Tarantool `interval`)	cdata	`+1 months, 1 days`
scalar	ext (for Tarantool `uuid`)	cdata	`64d22e4d-ac92-4a23-899a-e59f34af5479`
compound	map	table (with string keys)	`{'a': 5, 'b': 6}`
compound	array	table (with integer keys)	`[1, 2, 3, 4, 5]`
compound	array	tuple (cdata)	`[12345, 'A B C']`

Note

MsgPack values have variable lengths. So, for example, the smallest number requires only one byte, but the largest number requires nine bytes.

Note

The Lua nil type is encoded as MsgPack nil but decoded as msgpack.NULL.

Field type details

nil

In Lua, the nil type has only one possible value, also called nil. Tarantool displays it as null when using the default YAML format. Nil may be compared to values of any types with == (is-equal) or ~= (is-not-equal), but other comparison operations will not work. Nil may not be used in Lua tables; the workaround is to use box.NULL because nil == box.NULL is true. Example: nil.

boolean

A boolean is either true or false.

Example: true.

integer

The Tarantool integer type is for integers between -9223372036854775808 and 18446744073709551615, which is about 18 quintillion. This type corresponds to the number type in Lua and to the integer type in MsgPack.

Example: -2^63.

unsigned

The Tarantool unsigned type is for integers between 0 and 18446744073709551615. So it is a subset of integer.

Example: 123456.

double

The double field type exists mainly to be equivalent to Tarantool/SQL’s DOUBLE data type. In msgpuck.h (Tarantool’s interface to MsgPack), the storage type is MP_DOUBLE and the size of the encoded value is always 9 bytes. In Lua, fields of the double type can only contain non-integer numeric values and cdata values with double floating-point numbers.

Examples: 1.234, -44, 1.447e+44.

To avoid using the wrong kind of values inadvertently, use ffi.cast() when searching or changing double fields. For example, instead of space_object:insert{value} use ffi = require('ffi') ... space_object:insert({ffi.cast('double',value)}).

Example:

s = box.schema.space.create('s', {format = {{'d', 'double'}}})
s:create_index('ii')
s:insert({1.1})
ffi = require('ffi')
s:insert({ffi.cast('double', 1)})
s:insert({ffi.cast('double', tonumber('123'))})
s:select(1.1)
s:select({ffi.cast('double', 1)})

Arithmetic with cdata double will not work reliably, so for Lua, it is better to use the number type. This warning does not apply for Tarantool/SQL because Tarantool/SQL does implicit casting.

number

The Tarantool number field may have both integer and floating-point values, although in Lua a number is a double-precision floating-point.

Tarantool will try to store a Lua number as floating-point if the value contains a decimal point or is very large (greater than 100 trillion = 1e14), otherwise Tarantool will store it as an integer. To ensure that even very large numbers are stored as integers, use the tonumber64 function, or the LL (Long Long) suffix, or the ULL (Unsigned Long Long) suffix. Here are examples of numbers using regular notation, exponential notation, the ULL suffix and the tonumber64 function: -55, -2.7e+20, 100000000000000ULL, tonumber64('18446744073709551615').
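
The rule of thumb above can be approximated in plain Lua. This sketch is only a model of the description, not Tarantool's serializer. The storage_kind function is hypothetical. It treats “contains a decimal point” as “has a fractional part” and applies the 1e14 threshold to the absolute value. Both choices are assumptions.

```lua
-- Hypothetical model: guess how a Lua number would be stored.
local function storage_kind(value)
    local has_fraction = value ~= math.floor(value)
    local very_large = math.abs(value) > 1e14
    if has_fraction or very_large then
        return 'float'
    end
    return 'integer'
end

print(storage_kind(-55))
print(storage_kind(1.2345))
print(storage_kind(-2.7e+20))
```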

You can also use the ffi module to specify a C type to cast the number to. In this case, the number will be stored as cdata.

decimal

The Tarantool decimal type is stored as a MsgPack ext (Extension). Values with the decimal type are not floating-point values although they may contain decimal points. They are exact with up to 38 digits of precision.

Example: a value returned by a function in the decimal module.

datetime

Introduced in v. 2.10.0. The Tarantool datetime type facilitates operations with date and time, accounting for leap years or the varying number of days in a month. It is stored as a MsgPack ext (Extension). Operations with this data type use code from c-dt, a third-party library.

For more information, see Module datetime.

interval

Since: v. 2.10.0

The Tarantool interval type represents periods of time. They can be added to or subtracted from datetime values or each other. Operations with this data type use code from c-dt, a third-party library. The type is stored as a MsgPack ext (Extension). For more information, see Module datetime.

string

A string is a variable-length sequence of bytes, usually represented with alphanumeric characters inside single quotes. In both Lua and MsgPack, strings are treated as binary data, with no attempts to determine a string’s character set or to perform any string conversion – unless there is an optional collation. So, usually, string sorting and comparison are done byte-by-byte, without any special collation rules applied. For example, numbers are ordered by their point on the number line, so 2345 is greater than 500; meanwhile, strings are ordered by the encoding of the first byte, then the encoding of the second byte, and so on, so '2345' is less than '500'.

Example: 'A, B, C'.
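
The difference between numeric and byte-by-byte ordering can be reproduced in plain Lua. The bytes_less helper below is illustrative. It compares two strings by their byte values, which is the behavior described above when no collation is applied.

```lua
-- Compare two strings byte by byte, without any collation rules.
local function bytes_less(a, b)
    for i = 1, math.min(#a, #b) do
        local x, y = a:byte(i), b:byte(i)
        if x ~= y then
            return x < y
        end
    end
    -- All shared bytes are equal: the shorter string goes first.
    return #a < #b
end

print(2345 > 500)                  -- numbers: 2345 is greater
print(bytes_less('2345', '500'))   -- strings: '2' (50) sorts before '5' (53)
```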

bin

A bin (binary) value is not directly supported by Lua but there is a Tarantool type varbinary. See the varbinary module reference for details.

Example: "\65 \66 \67".

uuid

The Tarantool uuid type is used for Universally Unique Identifiers. Since version 2.4.1 Tarantool stores uuid values as a MsgPack ext (Extension).

Example: 64d22e4d-ac92-4a23-899a-e59f34af5479.

array

An array is represented in Lua with {...} (braces).

Examples: lists of numbers representing points in geometric figures: {10, 11}, {3, 5, 9, 10}.

table

Lua tables with string keys are stored as MsgPack maps; Lua tables with integer keys starting with 1 are stored as MsgPack arrays. Nils may not be used in Lua tables; the workaround is to use box.NULL.

Example: a box.space.tester:select() request will return a Lua table.
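
As a rough illustration of the array/map distinction, the following plain-Lua sketch classifies a table by its keys. The msgpack_kind function is hypothetical and much simpler than Tarantool's real serialization rules. In particular, treating an empty table as an array is an arbitrary choice made for this sketch.

```lua
-- Hypothetical classifier: 'array' if the keys are exactly 1..n,
-- 'map' otherwise.
local function msgpack_kind(t)
    local count = 0
    for _ in pairs(t) do
        count = count + 1
    end
    for i = 1, count do
        if t[i] == nil then
            return 'map'
        end
    end
    return 'array'
end

print(msgpack_kind({1, 2, 3, 4, 5}))
print(msgpack_kind({a = 5, b = 6}))
```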

tuple

A tuple is a light reference to a MsgPack array stored in the database. It is a special type (cdata) to avoid conversion to a Lua table on retrieval. A few functions may return tables with multiple tuples. For tuple examples, see box.tuple.

scalar

Values in a scalar field can be boolean, integer, unsigned, double, number, decimal, string, uuid, or varbinary; but not array, map, or tuple.

Examples: true, 1, 'xxx'.

any

Values in a field of this type can be boolean, integer, unsigned, double, number, decimal, string, uuid, varbinary, array, map, or tuple.

Examples: true, 1, 'xxx', {box.NULL, 0}.

Examples

Examples of insert requests with different field types:

tarantool> box.space.K:insert{1,nil,true,'A B C',12345,1.2345}
---
- [1, null, true, 'A B C', 12345, 1.2345]
...
tarantool> box.space.K:insert{2,{['a']=5,['b']=6}}
---
- [2, {'a': 5, 'b': 6}]
...
tarantool> box.space.K:insert{3,{1,2,3,4,5}}
---
- [3, [1, 2, 3, 4, 5]]
...

Indexed field types

To learn more about what values can be stored in indexed fields, read the Indexes section.

Collations

By default, when Tarantool compares strings, it uses the so-called binary collation. It only considers the numeric value of each byte in a string. For example, the encoding of 'A' (what used to be called the “ASCII value”) is 65, the encoding of 'B' is 66, and the encoding of 'a' is 97. Therefore, if the string is encoded with ASCII or UTF-8, then 'A' < 'B' < 'a'.

Binary collation is the best choice for fast deterministic simple maintenance and searching with Tarantool indexes.
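
The byte values behind the binary ordering can be inspected in plain Lua. The snippet sorts three one-character strings by their byte value, which is what binary collation does for single-byte characters.

```lua
-- Byte values of the characters used in the example.
print(string.byte('A'), string.byte('B'), string.byte('a'))

-- Sort by byte value, as binary collation would.
local letters = {'a', 'B', 'A'}
table.sort(letters, function(x, y)
    return x:byte() < y:byte()
end)
print(table.concat(letters, ' < '))  -- A < B < a
```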

But if you want the ordering that you see in phone books and dictionaries, then you need Tarantool’s optional collations, such as unicode and unicode_ci, which allow for 'a' < 'A' < 'B' and 'a' == 'A' < 'B' respectively.

The unicode and unicode_ci optional collations use the ordering according to the Default Unicode Collation Element Table (DUCET) and the rules described in Unicode® Technical Standard #10 Unicode Collation Algorithm (UTS #10 UCA). The only difference between the two collations is about weights:

unicode collation observes L1, L2, and L3 weights (strength = ‘tertiary’);
unicode_ci collation observes only L1 weights (strength = ‘primary’), so for example 'a' == 'A' == 'á' == 'Á'.

As an example, take some Russian words:

'ЕЛЕ'
'елейный'
'ёлка'
'еловый'
'елозить'
'Ёлочка'
'ёлочный'
'ЕЛь'
'ель'

…and show the difference in ordering and selecting by index:

with unicode collation:

tarantool> box.space.T:create_index('I', {parts = {{field = 1, type = 'str', collation='unicode'}}})
...
tarantool> box.space.T.index.I:select()
---
- - ['ЕЛЕ']
  - ['елейный']
  - ['ёлка']
  - ['еловый']
  - ['елозить']
  - ['Ёлочка']
  - ['ёлочный']
  - ['ель']
  - ['ЕЛь']
...
tarantool> box.space.T.index.I:select{'ЁлКа'}
---
- []
...

with unicode_ci collation:

tarantool> box.space.T:create_index('I', {parts = {{field = 1, type ='str', collation='unicode_ci'}}})
...
tarantool> box.space.T.index.I:select()
---
- - ['ЕЛЕ']
  - ['елейный']
  - ['ёлка']
  - ['еловый']
  - ['елозить']
  - ['Ёлочка']
  - ['ёлочный']
  - ['ЕЛь']
...
tarantool> box.space.T.index.I:select{'ЁлКа'}
---
- - ['ёлка']
...

In all, collation involves much more than these simple examples of upper case / lower case and accented / unaccented equivalence in alphabets. We also consider variations of the same character, non-alphabetic writing systems, and special rules that apply for combinations of characters.

For English, Russian, and most other languages and use cases, use the “unicode” and “unicode_ci” collations. If you need Cyrillic letters ‘Е’ and ‘Ё’ to have the same level-1 weights, try the Kyrgyz collation.

The tailored optional collations: for other languages, Tarantool supplies tailored collations for every modern language that has more than a million native speakers, and for specialized situations such as the difference between dictionary order and telephone book order. Run box.space._collation:select() to see the complete list.

The tailored collation names have the form unicode_[language code]_[strength], where language code is a standard 2-character or 3-character language abbreviation, and strength is s1 for “primary strength” (level-1 weights), s2 for “secondary”, s3 for “tertiary”. Tarantool uses the same language codes as the ones in the “list of tailorable locales” on man pages of Ubuntu and Fedora. Charts explaining the precise differences from DUCET order are in the Common Locale Data Repository.
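
The naming scheme can be captured by a small plain-Lua helper. The tailored_collation_name function is hypothetical. It only formats a name, so whether a collation with that name actually exists should be checked against the output of box.space._collation:select().

```lua
-- Hypothetical helper: format a tailored collation name.
local function tailored_collation_name(language_code, strength)
    local levels = { primary = 's1', secondary = 's2', tertiary = 's3' }
    local suffix = assert(levels[strength],
        'unknown strength: ' .. tostring(strength))
    return ('unicode_%s_%s'):format(language_code, suffix)
end

-- 'ky' is the language code for Kyrgyz.
print(tailored_collation_name('ky', 'primary'))
```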

Default values

Default values are assigned to tuple fields automatically if these fields are skipped during the tuple insert or update.

You can specify a default value for a field in the space_object:format() call that defines the space format. Default values apply regardless of the field nullability: any tuple in which the field is skipped or set to nil receives the default value.

Default values can be set in two ways: explicitly or using a function.

Explicit default values

Explicit default values are defined in the default parameter of the field declaration in a space_object:format() call.

local books = box.schema.space.create('books')
books:format({
    { name = 'id', type = 'number' },
    { name = 'name', type = 'string' },
    { name = 'year', type = 'number', default = 2024 },
})
books:create_index('primary', { parts = { 1 } })

To use a default value for a field, skip it or assign nil:

books:insert { 1, 'Thinking in Java' }
books:insert { 2, 'How to code in Go', nil }

Any Lua object that can be evaluated during the space_object:format() call may be used as a default value, for example:

a constant: default = 100
an initialized variable: default = default_size
an expression: default = 10 + default_size
a function return value: default = count_default()

Important

Explicit default values are evaluated only when setting the space format. If you use a variable as a default value, its further assignments do not affect the default value.

To change the default values, call space_object:format() again.
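
The reason later assignments have no effect is ordinary Lua evaluation: the table passed to space_object:format() holds the value the expression had at that moment, not a reference to the variable. A minimal plain-Lua sketch of this (no Tarantool calls; default_size and the field table are illustrative names only):

```lua
-- Plain Lua: a table constructor stores the current value of an expression.
local default_size = 10

-- Same shape as a field declaration passed to space_object:format().
local field = { name = 'size', type = 'number', default = 10 + default_size }

-- Reassigning the variable later does not touch the stored value.
default_size = 500

print(field.default) -- still the value computed at declaration time
```

This is why the format has to be set again for a new default to take effect.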

See also the space_object:format() reference.

Default functions

A default value can be defined as the return value of a stored Lua function. To serve as a default, a function must be created with box.schema.func.create() with its body specified, and it must return one value of the field’s type. It also must not yield.

box.schema.func.create('current_year', {
    language = 'Lua',
    body = "function() return require('datetime').now().year end"
})

Default functions are set in the default_func parameter of the field declaration in a space_object:format() call. To make a function with no arguments the default for a field, specify its name:

local books = box.schema.space.create('books')
books:format({
    { name = 'id', type = 'unsigned' },
    { name = 'isbn', type = 'string' },
    { name = 'title', type = 'string' },
    { name = 'year', type = 'unsigned', default_func = 'current_year' }
})
books:create_index('primary', { parts = { 1 } })

A default function can also have one argument.

box.schema.func.create('randomize', {
    language = 'Lua',
    body = "function(limit) return math.random(limit.min, limit.max) end"
})

To pass the function argument when setting the default, specify it in the default parameter of the space_object:format() call:

books:format({
    { name = 'id', type = 'unsigned', default_func = 'randomize', default = {min = 0, max = 1000} },
    { name = 'isbn', type = 'string' },
    { name = 'title', type = 'string' },
    { name = 'year', type = 'unsigned', default_func = 'current_year' }
})

Note

A key difference between a default function (default_func = 'count_default') and a function return value used as a field default value (default = count_default()) is the following:

A default function is called every time a default value must be produced, that is, a tuple is inserted or updated without specifying the field.
A return value used as a field default value: the function is called once when setting the space format. Then, all tuples receive the result of this exact call if the field is not specified.
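
The difference can be modeled in plain Lua. This is only a sketch: in Tarantool, default_func takes the name of a stored function, while the model below keeps a Lua function value so that it runs without a database. The count_default and produce functions are hypothetical helpers.

```lua
-- Hypothetical stand-in for a user function; it counts its own calls.
local calls = 0
local function count_default()
    calls = calls + 1
    return calls
end

-- default = count_default(): the call happens here, once,
-- and only its result is stored in the declaration.
local by_value = { name = 'n', default = count_default() }

-- default_func-like behavior: the function itself is kept and
-- invoked every time a value is needed.
local by_func = { name = 'n', default_func = count_default }

-- Produces a default the way each declaration style implies.
local function produce(field)
    if field.default_func then
        return field.default_func()
    end
    return field.default
end

local v1, v2 = produce(by_value), produce(by_value) -- 1, 1
local f1, f2 = produce(by_func), produce(by_func)   -- 2, 3
```

Every value produced from by_value repeats the result of the single call, while by_func yields a fresh result each time.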

See also the space_object:format() reference.

Constraints

For better control over stored data, Tarantool supports constraints – user-defined limitations on the values of certain fields or entire tuples. Together with data types, constraints allow limiting the ranges of available field values both syntactically and semantically.

For example, the field age typically has the number type, so it cannot store strings or boolean values. However, it can still have values that don’t make sense, such as negative numbers. This is where constraints come to help.

Constraint types

There are two types of constraints in Tarantool:

Field constraints check that the value being assigned to a field satisfies a given condition. For example, age must be non-negative.
Tuple constraints check complex conditions that can involve all fields of a tuple. For example, a tuple contains a date in three fields: year, month, and day. You can validate day values based on the month value (and even year if you consider leap years).

Field constraints work faster, while tuple constraints allow implementing a wider range of limitations.
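
As an illustration of the date example, here is a sketch of the logic such a tuple constraint could implement. It is written as a plain Lua function so that it can be tried without a database; the name check_date and the field names year, month, and day are assumptions. In Tarantool, the same body would be registered as a stored function.

```lua
-- Hypothetical tuple constraint body: validates a date stored in
-- three fields named year, month, and day.
local function check_date(t, c)
    local days_in_month = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }
    local limit = days_in_month[t.month]
    if limit == nil then
        return false -- month is outside 1..12
    end
    -- Gregorian leap year rule: February has 29 days.
    local leap = (t.year % 4 == 0 and t.year % 100 ~= 0) or t.year % 400 == 0
    if t.month == 2 and leap then
        limit = 29
    end
    return t.day >= 1 and t.day <= limit
end

print(check_date({ year = 2024, month = 2, day = 29 }, 'check_date')) -- true
print(check_date({ year = 2023, month = 2, day = 29 }, 'check_date')) -- false
```

A field constraint on day alone could only check the range 1 to 31, because it does not see month or year.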

Constraint functions

Constraints use stored Lua functions or SQL expressions, which must return true when the constraint is satisfied. Other return values (including nil) and exceptions make the check fail and prevent tuple insertion or modification.

To create a constraint function, call box.schema.func.create() with the function definition specified in the body attribute.

Constraint functions take two parameters:

The tuple and the constraint name for tuple constraints.
```
-- Define a tuple constraint function --
box.schema.func.create('check_person', {
    language = 'LUA',
    is_deterministic = true,
    body = 'function(t, c) return (t.age >= 0 and #(t.name) > 3) end'
})
```
Warning

Tarantool doesn’t check field names used in tuple constraint functions. If a field referenced in a tuple constraint gets renamed, this constraint will break and prevent further insertions and modifications in the space.

The field value and the constraint name for field constraints.

-- Define a field constraint function --
box.schema.func.create('check_age', {
    language = 'LUA',
    is_deterministic = true,
    body = 'function(f, c) return (f >= 0 and f < 150) end'
})
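
The pass/fail rule described above (only true passes; any other return value or an error fails) can be modeled in plain Lua. This is a sketch of the rule, not Tarantool's implementation, and constraint_passes is a hypothetical helper.

```lua
-- Plain-Lua model of the pass/fail rule; not Tarantool's implementation.
local function constraint_passes(fn, value, name)
    local ok, result = pcall(fn, value, name)
    -- Only an error-free call that returns exactly true passes.
    return ok and result == true
end

-- Same logic as the check_age body above.
local function check_age(f, c)
    return (f >= 0 and f < 150)
end

print(constraint_passes(check_age, 42, 'check_age'))  -- true
print(constraint_passes(check_age, -1, 'check_age'))  -- false: returns false
print(constraint_passes(check_age, 'x', 'check_age')) -- false: the comparison raises an error
print(constraint_passes(function() end, 42, 'noop'))  -- false: returns nil
```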

Creating constraints

To create a constraint in a space, specify the corresponding function’s name in the constraint parameter:

Tuple constraints: when creating or altering a space.

-- Create a space with a tuple constraint --
customers = box.schema.space.create('customers', {constraint = 'check_person'})

Field constraints: when setting up the space format.

-- Specify format with a field constraint --
box.space.customers:format({
    {name = 'id', type = 'number'},
    {name = 'name', type = 'string'},
    {name = 'age',  type = 'number', constraint = 'check_age'},
})

In both cases, constraint can contain multiple function names passed as a Lua table. Each constraint can have an optional name:

-- Create one more tuple constraint --
box.schema.func.create('another_constraint',
    {language = 'LUA', is_deterministic = true, body = 'function(t, c) return true end'})

-- Set two constraints with optional names --
box.space.customers:alter{
    constraint = { check1 = 'check_person', check2 = 'another_constraint'}
}

Note

When adding a constraint to an existing space with data, Tarantool checks it against the stored data. If there are fields or tuples that don’t satisfy the constraint, it won’t be applied to the space.

Foreign keys

Foreign keys provide links between related fields, therefore maintaining the referential integrity of the database.

Fields can contain values that exist only in other fields. For example, a shop order always belongs to a customer. Hence, all values of the customer field of the orders space must also exist in the id field of the customers space. In this case, customers is a parent space for orders (its child space). When two spaces are linked with a foreign key, each time a tuple is inserted or modified in the child space, Tarantool checks that a corresponding value is present in the parent space.

Note

A foreign key can link a field to another field in the same space. In this case, the child field must be nullable. Otherwise, it is impossible to insert the first tuple in such a space because there is no parent tuple to which it can link.

Foreign key types

There are two types of foreign keys in Tarantool:

Field foreign keys check that the value being assigned to a field is present in a particular field of another space. For example, the customer value in a tuple from the orders space must match an id stored in the customers space.
Tuple foreign keys check that multiple fields of a tuple have a match in another space. For example, if the orders space has fields customer_id and customer_name, a tuple foreign key can check that the customers space contains a tuple with both these values in the corresponding fields.

Field foreign keys work faster, while tuple foreign keys allow implementing stricter references.

Creating foreign keys

Important

For each foreign key, there must exist a parent space index that includes all its fields.

To create a foreign key in a space, specify the parent space and linked fields in the foreign_key parameter. Parent spaces can be referenced by name or by id. When linking to the same space, the space can be omitted. Fields can be referenced by name or by number:

Field foreign keys: when setting up the space format.

-- Create a space with a field foreign key --
box.schema.space.create('orders')

box.space.orders:format({
    {name = 'id',   type = 'number'},
    {name = 'customer_id', foreign_key = {space = 'customers', field = 'id'}},
    {name = 'price_total', type = 'number'},
})

Tuple foreign keys: when creating or altering a space. Note that for foreign keys with multiple fields there must exist an index that includes all these fields.

-- Create a space with a tuple foreign key --
box.schema.space.create("orders", {
    foreign_key = {
        space = 'customers',
        field = {customer_id = 'id', customer_name = 'name'}
    }
})

box.space.orders:format({
    {name = "id", type = "number"},
    {name = "customer_id" },
    {name = "customer_name"},
    {name = "price_total", type = "number"},
})

Note

Type can be omitted for foreign key fields because it’s defined in the parent space.

Foreign keys can have an optional name.

-- Set a foreign key with an optional name --
box.space.orders:alter{
    foreign_key = {
        customer = {
            space = 'customers',
            field = { customer_id = 'id', customer_name = 'name'}
        }
    }
}

A space can have multiple tuple foreign keys. In this case, they all must have names.

-- Set two foreign keys: names are mandatory --
box.space.orders:alter{
    foreign_key = {
        customer = {
            space = 'customers',
            field = {customer_id = 'id', customer_name = 'name'}
        },
        item = {
            space = 'items',
            field = {item_id = 'id'}
        }
    }
}

Tarantool performs integrity checks upon data modifications in parent spaces. If you try to remove a tuple referenced by a foreign key or an entire parent space, you will get an error.

Important

Renaming parent spaces or referenced fields may break the corresponding foreign keys and prevent further insertions or modifications in the child spaces.

Indexes

Basics

An index is a special data structure that stores a group of key values and pointers. It is used for efficient manipulations with data.

As with spaces, you should specify the index name and let Tarantool come up with a unique numeric identifier (“index id”).

An index always has a type. The default index type is TREE. TREE indexes are provided by all Tarantool engines, can index unique and non-unique values, support partial key searches, comparisons, and ordered results. Additionally, the memtx engine supports HASH, RTREE and BITSET indexes.

An index may be multi-part, that is, you can declare that an index key value is composed of two or more fields in the tuple, in any order. For example, for an ordinary TREE index, the maximum number of parts is 255.

An index may be unique, that is, you can declare that it would be illegal to have the same key value twice.

The first index defined on a space is called the primary key index, and it must be unique. All other indexes are called secondary indexes, and they may be non-unique.

Indexes have certain limitations. See details on page Limitations.

To create a generator for indexes, you can use a sequence object. Learn how to do it in the tutorial.

Indexed field types

Indexed field types are not to be confused with index types, which are the types of the data structure that implements an index. See more about index types below.

Indexes restrict values that Tarantool can store with MsgPack. This is why, for example, 'unsigned' and 'integer' are different field types, although in MsgPack they are both stored as integer values. An 'unsigned' index contains only non-negative integer values, while an 'integer' index contains any integer values.

The default field type is 'unsigned' and the default index type is TREE. Although 'nil' is not a legal indexed field type, indexes may contain nil as a non-default option.

To learn more about field types, check the Field type details section.

Field type name string	Field type	Index type
`'boolean'`	boolean	TREE or HASH
`'integer'` (may also be called `'int'`)	integer, which may include unsigned values	TREE or HASH
`'unsigned'` (may also be called `'uint'` or `'num'`, but `'num'` is deprecated)	unsigned	TREE, BITSET, or HASH
`'double'`	double	TREE or HASH
`'number'`	number, which may include integer, double, or decimal values	TREE or HASH
`'decimal'`	decimal	TREE or HASH
`'string'` (may also be called `'str'`)	string	TREE, BITSET, or HASH
`'varbinary'`	varbinary	TREE, HASH, or BITSET (since version 2.7.1)
`'uuid'`	uuid	TREE or HASH
`'datetime'`	datetime	TREE
`'array'`	array	RTREE
`'map'`	table	Cannot be indexed
`'scalar'`	may include nil, boolean, integer, unsigned, number, decimal, string, varbinary, or uuid values. When a scalar field contains values of different underlying types, the key order is: nils, then booleans, then numbers, then strings, then varbinaries, then uuids.	TREE or HASH

Index types

An index always has a type. Different types are intended for different usage scenarios.

We give an overview of index features in the following table:

Feature	TREE	HASH	RTREE	BITSET
unique	+	+	-	-
non-unique	+	-	+	+
is_nullable	+	-	-	-
can be multi-part	+	+	-	-
multikey	+	-	-	-
partial-key search	+	-	-	-
can be primary key	+	+	-	-
`exclude_null` (version 2.8+)	+	-	-	-
Pagination (the after option)	+	-	-	-
iterator types	ALL, EQ, REQ, GT, GE, LT, LE	ALL, EQ	ALL, EQ, GT, GE, LT, LE, OVERLAPS, NEIGHBOR	ALL, EQ, BITS_ALL_SET, BITS_ANY_SET, BITS_ALL_NOT_SET

Note

In 2.11.0, the GT iterator type is deprecated for HASH indexes.

TREE indexes

The default index type is TREE. TREE indexes are provided by the memtx and vinyl engines, can index unique and non-unique values, and support partial key searches, comparisons, and ordered results.

This is a universal index type; in most cases, it is the best choice.

Additionally, the memtx engine supports HASH, RTREE, and BITSET indexes.

HASH indexes

HASH indexes require unique fields and lose to TREE indexes in almost all respects, so we do not recommend using them in applications. HASH is present in Tarantool mainly for backward compatibility.

Here are some tips. Do not use a HASH index:

just because you want to
if you think that HASH is faster but have not measured performance
if you want to iterate over the data
for a primary key
as the only index

Use a HASH index:

if it is a secondary key
if you are certain that you will never need to make it non-unique
if you have taken measurements on your data and see a noticeable increase in performance
if you need to save every byte on tuples (HASH is a little more compact)

RTREE indexes

RTREE is a multidimensional index supporting up to 20 dimensions. It is used mainly for indexing spatial information, such as geographical objects. The examples below demonstrate spatial searches via an RTREE index.

An RTREE index cannot be primary and cannot be unique. The option list of this index type may contain the dimension and distance options. The parts definition must contain exactly one part of type array. An RTREE index accepts two types of distance functions: euclid and manhattan.

Warning

Currently, the isolation level of RTREE indexes in MVCC transaction mode is read-committed (not serializable, as stated). If a transaction uses these indexes, it can read committed or confirmed data (depending on the isolation level). However, the indexes are subject to different anomalies that can make them unserializable.

Example 1:

my_space = box.schema.create_space("tester")
my_space:format({ { type = 'number', name = 'id' }, { type = 'array', name = 'content' } })
primary_index = my_space:create_index('primary', { type = 'tree', parts = {'id'} })
rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, parts = {'content'} })

The corresponding tuple field must therefore be an array of 2 or 4 numbers: 2 numbers mean a point {x, y}; 4 numbers mean a rectangle {x1, y1, x2, y2}, where (x1, y1) and (x2, y2) are diagonally opposite corners of the rectangle.

my_space:insert{1, {1, 1}}
my_space:insert{2, {2, 2, 3, 3}}

Selection results depend on the chosen iterator. The default EQ iterator searches for an exact rectangle; a point is treated as a rectangle of zero width and height:

tarantool> rtree_index:select{1, 1}
---
- - [1, [1, 1]]
...

tarantool> rtree_index:select{1, 1, 1, 1}
---
- - [1, [1, 1]]
...

tarantool> rtree_index:select{2, 2}
---
- []
...

tarantool> rtree_index:select{2, 2, 3, 3}
---
- - [2, [2, 2, 3, 3]]
...

Iterator ALL, which is the default when no key is specified, selects all tuples in arbitrary order:

tarantool> rtree_index:select{}
---
- - [1, [1, 1]]
  - [2, [2, 2, 3, 3]]
...

Iterator LE (less or equal) searches for tuples with their rectangles within a specified rectangle:

tarantool> rtree_index:select({1, 1, 2, 2}, {iterator='le'})
---
- - [1, [1, 1]]
...

Iterator LT (less than, or strictly less) searches for tuples with their rectangles strictly within a specified rectangle:

tarantool> rtree_index:select({0, 0, 3, 3}, {iterator = 'lt'})
---
- - [1, [1, 1]]
...

Iterator GE searches for tuples with a specified rectangle within their rectangles:

tarantool> rtree_index:select({1, 1}, {iterator = 'ge'})
---
- - [1, [1, 1]]
...

Iterator GT searches for tuples with a specified rectangle strictly within their rectangles:

tarantool> rtree_index:select({2.1, 2.1, 2.9, 2.9}, {iterator = 'gt'})
---
- []
...

Iterator OVERLAPS searches for tuples with their rectangles overlapping the specified rectangle:

tarantool> rtree_index:select({0, 0, 10, 2}, {iterator='overlaps'})
---
- - [1, [1, 1]]
  - [2, [2, 2, 3, 3]]
...

Iterator NEIGHBOR searches for all tuples and orders them by distance to the specified point:

tarantool> for i=1,10 do
         >    for j=1,10 do
         >        my_space:insert{i*10+j, {i, j, i+1, j+1}}
         >    end
         > end
---
...

tarantool> rtree_index:select({1, 1}, {iterator = 'neighbor', limit = 5})
---
- - [11, [1, 1, 2, 2]]
  - [12, [1, 2, 2, 3]]
  - [21, [2, 1, 3, 2]]
  - [22, [2, 2, 3, 3]]
  - [31, [3, 1, 4, 2]]
...

Example 2:

RTREE indexes with three, four, or more dimensions work the same way as 2D indexes, except that you must specify more coordinates in requests. Here’s a short example of using a 4D tree:

tarantool> my_space = box.schema.create_space("tester")
tarantool> my_space:format{ { type = 'number', name = 'id' }, { type = 'array', name = 'content' } }
tarantool> primary_index = my_space:create_index('primary', { type = 'TREE', parts = {'id'} })
tarantool> rtree_index = my_space:create_index('spatial', { type = 'RTREE', unique = false, dimension = 4, parts = {'content'} })
tarantool> my_space:insert{1, {1, 2, 3, 4}} -- insert 4D point
tarantool> my_space:insert{2, {1, 1, 1, 1, 2, 2, 2, 2}} -- insert 4D box

tarantool> rtree_index:select{1, 2, 3, 4} -- find exact point
---
- - [1, [1, 2, 3, 4]]
...

tarantool> rtree_index:select({0, 0, 0, 0, 3, 3, 3, 3}, {iterator = 'LE'}) -- select from 4D box
---
- - [2, [1, 1, 1, 1, 2, 2, 2, 2]]
...

tarantool> rtree_index:select({0, 0, 0, 0}, {iterator = 'neighbor'}) -- select neighbours
---
- - [2, [1, 1, 1, 1, 2, 2, 2, 2]]
  - [1, [1, 2, 3, 4]]
...

Note

Keep in mind that a select with the NEIGHBOR iterator and no limit extracts the entire space in order of increasing distance. If the space holds a lot of data, this can affect performance.

Another frequent mistake is to specify the iterator type without quotes, like this: rtree_index:select(rect, {iterator = LE}). This silently performs an EQ select: LE is an undefined variable and is treated as nil, so the iterator option is unset and the default is used.
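
The unquoted-iterator mistake comes from plain Lua semantics and can be reproduced without Tarantool. A minimal sketch, assuming no global named LE has been defined:

```lua
-- Plain Lua: an undefined global evaluates to nil,
-- and a nil value creates no key in the table.
local wrong = { iterator = LE }   -- LE is an undefined global here
local right = { iterator = 'LE' } -- string name of the iterator

print(wrong.iterator) -- nil: the option is effectively missing
print(next(wrong))    -- nil: the table is empty
print(right.iterator) -- LE
```

Because the options table arrives empty, the call has no way to report that an iterator was intended.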

BITSET indexes

A BITSET index stores bit masks. Use it when you need to search by bit masks, for example, when you store a vector of attributes and search by these attributes.

Warning

Currently, the isolation level of BITSET indexes in MVCC transaction mode is read-committed (not serializable, as stated). If a transaction uses these indexes, it can read committed or confirmed data (depending on the isolation level). However, the indexes are subject to different anomalies that can make them unserializable.

Example 1:

The following script shows creating and searching with a BITSET index. Notice that BITSET cannot be unique, so first a primary-key index is created, and bit values are entered as hexadecimal literals for easier reading.

tarantool> my_space = box.schema.space.create('space_with_bitset')
tarantool> my_space:create_index('primary_index', {
         >   parts = {1, 'string'},
         >   unique = true,
         >   type = 'TREE'
         > })
tarantool> my_space:create_index('bitset_index', {
         >   parts = {2, 'unsigned'},
         >   unique = false,
         >   type = 'BITSET'
         > })
tarantool> my_space:insert{'Tuple with bit value = 01', 0x01}
tarantool> my_space:insert{'Tuple with bit value = 10', 0x02}
tarantool> my_space:insert{'Tuple with bit value = 11', 0x03}
tarantool> my_space.index.bitset_index:select(0x02, {
         >   iterator = box.index.EQ
         > })
---
- - ['Tuple with bit value = 10', 2]
...
tarantool> my_space.index.bitset_index:select(0x02, {
         >   iterator = box.index.BITS_ANY_SET
         > })
---
- - ['Tuple with bit value = 10', 2]
  - ['Tuple with bit value = 11', 3]
...
tarantool> my_space.index.bitset_index:select(0x02, {
         >   iterator = box.index.BITS_ALL_SET
         > })
---
- - ['Tuple with bit value = 10', 2]
  - ['Tuple with bit value = 11', 3]
...
tarantool> my_space.index.bitset_index:select(0x02, {
         >   iterator = box.index.BITS_ALL_NOT_SET
         > })
---
- - ['Tuple with bit value = 01', 1]
...

Example 2:

tarantool> box.schema.space.create('bitset_example')
tarantool> box.space.bitset_example:create_index('primary')
tarantool> box.space.bitset_example:create_index('bitset',{unique = false, type = 'BITSET', parts = {2,'unsigned'}})
tarantool> box.space.bitset_example:insert{1,1}
tarantool> box.space.bitset_example:insert{2,4}
tarantool> box.space.bitset_example:insert{3,7}
tarantool> box.space.bitset_example:insert{4,3}
tarantool> box.space.bitset_example.index.bitset:select(2, {iterator = 'BITS_ANY_SET'})

The result will be:

---
- - [3, 7]
  - [4, 3]
...

because (7 AND 2) is not equal to 0, and (3 AND 2) is not equal to 0.
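
The conditions behind the BITSET iterators can be written out in plain Lua. The sketch below is a model of the matching rules, not Tarantool's implementation; band, matches, and filter are hypothetical helpers, and band is implemented with arithmetic so that it does not depend on a particular bit library.

```lua
-- Portable bitwise AND for non-negative integers (no bit library needed).
local function band(a, b)
    local result, bit_value = 0, 1
    while a > 0 and b > 0 do
        if a % 2 == 1 and b % 2 == 1 then
            result = result + bit_value
        end
        a, b = math.floor(a / 2), math.floor(b / 2)
        bit_value = bit_value * 2
    end
    return result
end

-- Plain-Lua model of the BITSET iterator conditions.
local matches = {
    BITS_ANY_SET     = function(value, mask) return band(value, mask) ~= 0 end,
    BITS_ALL_SET     = function(value, mask) return band(value, mask) == mask end,
    BITS_ALL_NOT_SET = function(value, mask) return band(value, mask) == 0 end,
}

-- Returns the values that satisfy the given iterator condition.
local function filter(iterator, mask, values)
    local found = {}
    for _, value in ipairs(values) do
        if matches[iterator](value, mask) then
            found[#found + 1] = value
        end
    end
    return found
end

-- Field values from Example 2, filtered with the mask 2.
local any_set = filter('BITS_ANY_SET', 2, { 1, 4, 7, 3 })     -- { 7, 3 }
local all_not = filter('BITS_ALL_NOT_SET', 2, { 1, 4, 7, 3 }) -- { 1, 4 }
```

With the mask 2, only 7 and 3 have the second bit set, which agrees with the result of Example 2.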

Additionally, there exist index iterator operations. They can only be used with code in Lua and C/C++. Index iterators are for traversing indexes one key at a time, taking advantage of features that are specific to an index type. For example, they can be used for evaluating Boolean expressions when traversing BITSET indexes, or for going in descending order when traversing TREE indexes.

Using indexes

Creating an index

It is mandatory to create an index for a space before trying to insert tuples into the space, or select tuples from the space.

The simple index-creation operation is:

box.space.space-name:create_index('index-name')

This creates a unique TREE index on the first field of all tuples (often called “Field#1”), which is assumed to be numeric.

A recommended design pattern for a data model is to base primary keys on the first fields of a tuple. This speeds up tuple comparison due to the specifics of data storage and the way comparisons are arranged in Tarantool.

The simple SELECT request is:

box.space.space-name:select(value)

This looks for a single tuple via the first index. Since the first index is always unique, the maximum number of returned tuples is 1. You can call select() without arguments, and it will return all tuples. Be careful: calling select() without arguments on a huge space can hang your instance.

An index definition may also include identifiers of tuple fields and their expected types. See allowed indexed field types in section Details about indexed field types:

box.space.space-name:create_index('index-name', {type = 'tree', parts = {{field = 1, type = 'unsigned'}}})

Space definitions and index definitions are stored permanently in Tarantool’s system spaces _space and _index.

Tip

See full information about creating indexes, such as how to create a multikey index, an index using the path option, or how to create a functional index in our reference for space_object:create_index().

Index operations

Index operations are automatic: if a data manipulation request changes a tuple, then it also changes the index keys defined for the tuple.

Create a sample space named bands:

bands = box.schema.space.create('bands')

Format the created space by specifying field names and types:

box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

Create the primary index (named primary):
```
box.space.bands:create_index('primary', { parts = { 'id' } })
```
This index is based on the id field of each tuple.

Insert some tuples into the space:

box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
box.space.bands:insert { 6, 'The Rolling Stones', 1962 }
box.space.bands:insert { 7, 'The Doors', 1965 }
box.space.bands:insert { 8, 'Nirvana', 1987 }
box.space.bands:insert { 9, 'Led Zeppelin', 1968 }
box.space.bands:insert { 10, 'Queen', 1970 }

Create secondary indexes:

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

Create a multi-part index with two parts:

box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

There are the following SELECT variations:

The search can use comparisons other than equality:
```
-- Select maximum 3 tuples with the key value greater than 1965 --
select_greater = bands.index.year:select({ 1965 }, { iterator = 'GT', limit = 3 })
--[[
---
- - [9, 'Led Zeppelin', 1968]
  - [10, 'Queen', 1970]
  - [1, 'Roxette', 1986]
...
--]]
```
The comparison operators are:
- LT for “less than”
- LE for “less than or equal”
- GT for “greater than”
- GE for “greater than or equal”
- EQ for “equal”
- REQ for “reversed equal”
Value comparisons make sense if and only if the index type is TREE. The iterator types for other types of indexes are slightly different and work differently. See details in section Iterator types.

Note that this example names the index explicitly (bands.index.year). If you call select() on the space object without naming an index, the primary index is used.

This type of search may return more than one tuple. The tuples will be sorted in descending order by key if the comparison operator is LT or LE or REQ. Otherwise they will be sorted in ascending order.

The search can use a secondary index.

-- Select a tuple by the specified secondary key value --
select_secondary = bands.index.band:select { 'The Doors' }
--[[
---
- - [7, 'The Doors', 1965]
...
--]]

Partial key search: The search may be for some key parts starting with the prefix of the key. Note that partial key searches are available only in TREE indexes.

-- Select tuples by the specified partial key value --
select_multipart_partial = bands.index.year_band:select { 1965 }
--[[
---
- - [5, 'Pink Floyd', 1965]
  - [2, 'Scorpions', 1965]
  - [7, 'The Doors', 1965]
...
--]]

The search can be for all fields, using a table as the value:

-- Select a tuple by the specified multi-part secondary key value --
select_multipart = bands.index.year_band:select { 1960, 'The Beatles' }
--[[
---
- - [4, 'The Beatles', 1960]
...
--]]
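
The partial and full key searches above can be pictured with a plain-Lua model of the two-part key (year, band_name). This is an illustration only: a real TREE index locates the key range in the tree instead of scanning every key, and select_by_prefix is a hypothetical helper.

```lua
-- Plain-Lua model of a two-part key (year, band_name); illustrative only.
local keys = {
    { 1986, 'Roxette' }, { 1965, 'Scorpions' }, { 1987, 'Ace of Base' },
    { 1960, 'The Beatles' }, { 1965, 'Pink Floyd' }, { 1965, 'The Doors' },
}

-- Keys are ordered part by part: first by year, then by band name.
table.sort(keys, function(a, b)
    if a[1] ~= b[1] then
        return a[1] < b[1]
    end
    return a[2] < b[2]
end)

-- A search key matches when every part it specifies equals the
-- corresponding leading part of the stored key.
local function select_by_prefix(search)
    local found = {}
    for _, key in ipairs(keys) do
        local match = true
        for i, part in ipairs(search) do
            if key[i] ~= part then
                match = false
                break
            end
        end
        if match then
            found[#found + 1] = key[2]
        end
    end
    return found
end

local partial = select_by_prefix({ 1965 })             -- three bands from 1965
local full = select_by_prefix({ 1960, 'The Beatles' }) -- exactly one band
```

The partial key { 1965 } matches three keys, returned in band name order, while the full key matches exactly one.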

Tip

You can also add, drop, or alter the definitions at runtime, with some restrictions. Read more about index operations in the reference for the box.index submodule.

Tuple compression

Enterprise Edition

Tuple compression is available in the Enterprise Edition only.

Tuple compression, introduced in Tarantool Enterprise Edition 2.10.0, aims to save memory space. Typically, it decreases the volume of stored data by 15%. However, the exact volume saved depends on the type of data.

The following compression algorithms are supported:

lz4
zstd
zlib (since 2.11.0)

To learn about the performance costs of each algorithm, check Tuple compression performance.

Tarantool doesn’t compress tuples themselves, just the fields inside these tuples. You can only compress non-indexed fields. Compression works best when JSON is stored in the field.

Note

The compress module provides the API for compressing and decompressing data.

Enabling compression for a new space

First, create a space:

box.schema.space.create('bands')

Then, create an index for this space, for example:

box.space.bands:create_index('primary', {parts = {{1, 'unsigned'}}})

Create a format to declare field names and types. In the example below, the band_name and year fields have the zstd and lz4 compression formats, respectively. The first field (id) has the index, so it cannot be compressed.

box.space.bands:format({
    {name = 'id', type = 'unsigned'},
    {name = 'band_name', type = 'string', compression = 'zstd'},
    {name = 'year', type = 'unsigned', compression = 'lz4'}
})

Now, the new tuples that you add to the space bands will be compressed. When you read a compressed tuple, you do not need to decompress it back yourself.

Checking which fields are compressed

To check which fields in a space are compressed, run space_object:format() on the space. If a field is compressed, the format includes the compression algorithm, for example:

tarantool> box.space.bands:format()
    ---
    - [{'name': 'id', 'type': 'unsigned'},
       {'type': 'string', 'compression': 'zstd', 'name': 'band_name'},
       {'type': 'unsigned', 'compression': 'lz4', 'name': 'year'}]
    ...

Enabling compression for existing spaces

You can enable compression for existing fields. All the tuples added after that will have this field compressed. However, this doesn’t affect the tuples already stored in the space. You need to make a snapshot and restart Tarantool to compress the existing tuples.

Here’s an example of how to compress existing fields:

Create a space without compression and add several tuples:

box.schema.space.create('bands')

box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

box.space.bands:create_index('primary', { parts = { 'id' } })

box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }

Suppose that you want fields 2 and 3 to be compressed from now on. To enable compression, change the format as follows:
```
local new_format = box.space.bands:format()

new_format[2].compression = 'zstd'
new_format[3].compression = 'lz4'

box.space.bands:format(new_format)
```
From now on, all the tuples that you add to the space have fields 2 and 3 compressed.
To finalize the change, create a snapshot by running box.snapshot() and restart Tarantool. As a result, all old tuples will also be compressed in memory during recovery.

Note

space:upgrade() provides the ability to enable compression and update the existing tuples in the background. To achieve this, you need to pass a new space format in the format argument of space:upgrade().

Tuple compression performance

Below are the results of a synthetic test that illustrate how tuple compression affects performance. The test was carried out on a simple Tarantool space containing 100,000 tuples, each having a field with a sample JSON roughly 600 bytes large. The test compared the speed of running select and replace operations on uncompressed and compressed data as well as the overall data size of the space. Performance is measured in requests per second.

Compression type	`select`, RPS	`replace`, RPS	Space size, bytes
None	4,486k	1,109k	41,168,548
`zstd`	308k	26k	21,368,548
`lz4`	1,765k	672k	25,268,548
`zlib`	325k	107k	20,768,548
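
To read the size column as relative savings, the share of space saved by each algorithm can be computed from the table. The short plain-Lua calculation below uses only the numbers from the table; the variable names are illustrative.

```lua
-- Space sizes in bytes, copied from the table above.
local uncompressed = 41168548
local sizes = { zstd = 21368548, lz4 = 25268548, zlib = 20768548 }

-- Share of the uncompressed size that each algorithm saves,
-- rounded to one decimal place.
local saved_percent = {}
for name, size in pairs(sizes) do
    local ratio = (uncompressed - size) / uncompressed
    saved_percent[name] = math.floor(ratio * 1000 + 0.5) / 10
end

print(string.format('zstd: %.1f%%', saved_percent.zstd))
print(string.format('lz4:  %.1f%%', saved_percent.lz4))
print(string.format('zlib: %.1f%%', saved_percent.zlib))
```

For this JSON-heavy test data, the savings are about 48.1% for zstd, 38.6% for lz4, and 49.6% for zlib, to be weighed against the lower RPS figures in the same rows.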

Data schema description

In Tarantool, the use of a data schema is optional.

When creating a space, you do not have to define a data schema. In this case, the tuples can store arbitrary data. This rule does not apply to indexed fields: such fields must contain data of the same type.

You can define a data schema when creating a space. Read more in the description of the box.schema.space.create() function. If you have already created a space without specifying a data schema, you can do it later using space_object:format().

After the data schema is defined, all the data is validated by type. Any insert or update whose data types do not match the schema fails with an error.
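
The following minimal sketch shows this validation in action. It assumes a Tarantool instance started from a writable directory; the space name schema_demo and the sample values are hypothetical and used only for illustration.

```lua
-- Assumption: this runs inside Tarantool (for example, `tarantool demo.lua`).
box.cfg{}

-- 'schema_demo' is a hypothetical scratch space.
local demo = box.schema.space.create('schema_demo', { if_not_exists = true })
demo:format({
    { name = 'user_id', type = 'number' },
    { name = 'fullname', type = 'string' },
})
demo:create_index('pk', { parts = { 'user_id' }, if_not_exists = true })

-- A tuple that matches the format is accepted.
demo:replace({ 1, 'Ada Lovelace' })

-- A tuple whose second field is not a string is rejected.
-- pcall captures the error instead of stopping the script.
local ok, err = pcall(function()
    return demo:replace({ 2, 42 })
end)
print(ok)             -- false: the tuple was not stored
print(tostring(err))  -- the error text names the mismatched field
```

Wrapping the call in pcall is only a convenience for the demonstration; in application code you would normally let the error propagate or handle it explicitly.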

We recommend using a data schema because it helps avoid mistakes.

In Tarantool, you can define a data schema in two different ways.

Data schema description in a code file

The code file is usually called init.lua and contains the following schema description:

box.cfg()

users = box.schema.create_space('users', { if_not_exists = true })
users:format({{ name = 'user_id', type = 'number'}, { name = 'fullname', type = 'string'}})

users:create_index('pk', { parts = { { field = 'user_id', type = 'number'}}})

This is quite simple: when you run tarantool, it executes this code and creates a data schema. To run this file, use:

tarantool init.lua

However, it may seem complicated if you do not plan to dive deep into the Lua language and its syntax.

Possible difficulty: the snippet above has a function call with a colon: users:format. The colon passes the users variable as the first argument of the format function. This is similar to self in object-oriented languages.
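
The colon syntax can be tried out in plain Lua without Tarantool. In the sketch below, the users table and its describe function are hypothetical stand-ins for a space object and its methods.

```lua
-- A hypothetical object with one method, used only to illustrate the syntax.
local users = { name = 'users' }

function users.describe(self, suffix)
    return self.name .. suffix
end

-- The colon form passes `users` as the first argument implicitly.
local with_colon = users:describe('!')

-- The dot form needs the object passed explicitly.
local with_dot = users.describe(users, '!')

print(with_colon, with_dot, with_colon == with_dot)
```

Both calls are equivalent, so users:format(...) in the schema snippet is shorthand for users.format(users, ...).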

So it might be more convenient for you to describe the data schema with YAML.

Data schema description using the DDL module

The DDL module allows you to describe a data schema in the YAML format in a declarative way.

The schema would look something like this:

spaces:
  users:
    engine: memtx
    is_local: false
    temporary: false
    format:
    - {name: user_id, type: uuid, is_nullable: false}
    - {name: fullname, type: string,  is_nullable: false}
    - {name: bucket_id, type: unsigned, is_nullable: false}
    indexes:
    - name: user_id
      unique: true
      parts: [{path: user_id, type: uuid, is_nullable: false}]
      type: HASH
    - name: bucket_id
      unique: false
      parts: [{path: bucket_id, type: unsigned, is_nullable: false}]
      type: TREE
    sharding_key: [user_id]
    sharding_func: test_module.sharding_func

This alternative is simpler to use, and you do not have to dive deep into Lua.

To use the DDL module, put the following Lua code into the file that you use to run Tarantool. This file is usually called init.lua.

local yaml = require('yaml')
local ddl = require('ddl')

box.cfg{}

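-- Fail early with a readable message if the schema file cannot be opened
local fh = assert(io.open('ddl.yml', 'r'))
local schema = yaml.decode(fh:read('*all'))
fh:close()

-- Validate the schema first and stop if it is invalid
local ok, err = ddl.check_schema(schema)
if not ok then
    error(err)
end

-- Apply the validated schema to the instance
ok, err = ddl.set_schema(schema)
if not ok then
    error(err)
end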

Warning

It is forbidden to modify the data schema in DDL after it has been applied. For migration, there are different scenarios described in the Migrations section.

Operations

Data operations

The basic data operations supported in Tarantool are:

five data-manipulation operations (INSERT, UPDATE, UPSERT, DELETE, REPLACE), and
one data-retrieval operation (SELECT).

All of them are implemented as functions in the box.space submodule.

Examples:

INSERT: Add a new tuple to space ‘tester’.

The first field, field[1], will be 999 (MsgPack type is integer).

The second field, field[2], will be ‘Taranto’ (MsgPack type is string).
```
tarantool> box.space.tester:insert{999, 'Taranto'}
```
UPDATE: Update the tuple, changing field field[2].

The clause “{999}”, which has the value to look up in the index of the tuple’s primary-key field, is mandatory, because update() requests must always have a clause that specifies a unique key, which in this case is field[1].

The clause “{{‘=’, 2, ‘Tarantino’}}” specifies that assignment will happen to field[2] with the new value.
```
tarantool> box.space.tester:update({999}, {{'=', 2, 'Tarantino'}})
```
UPSERT: Upsert the tuple, changing field field[2] again.

The syntax of upsert() is similar to the syntax of update(). However, the execution logic of these two requests is different. UPSERT is either UPDATE or INSERT, depending on the database’s state. Also, UPSERT execution is postponed until after transaction commit, so, unlike update(), upsert() doesn’t return data back.
```
tarantool> box.space.tester:upsert({999, 'Taranted'}, {{'=', 2, 'Tarantism'}})
```
REPLACE: Replace the tuple, adding a new field.

This is also possible with the update() request, but the update() request is usually more complicated.
```
tarantool> box.space.tester:replace{999, 'Tarantella', 'Tarantula'}
```
SELECT: Retrieve the tuple.

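The clause “{999}” specifies the key to search for. Unlike update(), select() does not require a unique key: the key may be partial or belong to a non-unique index, and select() called without a key returns all tuples in the space.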
```
tarantool> box.space.tester:select{999}
```
DELETE: Delete the tuple.

In this example, we identify the primary-key field.
```
tarantool> box.space.tester:delete{999}
```

Summarizing the examples:

Functions insert and replace accept a tuple (where a primary key comes as part of the tuple).
Function upsert accepts a tuple (where a primary key comes as part of the tuple), and also the update operations to execute.
Function delete accepts a full key of any unique index (primary or secondary).
Function update accepts a full key of any unique index (primary or secondary), and also the operations to execute.
Function select accepts any key: primary/secondary, unique/non-unique, full/partial.

See reference on box.space for more details on using data operations.

Note

Besides Lua, you can use connectors for Perl, PHP, Python, or other programming languages. The client-server protocol is open and documented. See this annotated BNF.

Complexity factors

The reference pages for the box.space and box.index submodules contain notes about which complexity factors might affect the resource usage of each function.

Complexity factor	Effect
Index size	The number of index keys is the same as the number of tuples in the data set. For a TREE index, if there are more keys, then the lookup time will be greater, although, of course, the effect is not linear. For a HASH index, if there are more keys, then there is more RAM used, but the number of low-level steps tends to remain constant.
Index type	Typically, a HASH index is faster than a TREE index if the number of tuples in the space is greater than one.
Number of indexes accessed	Ordinarily, only one index is accessed to retrieve one tuple. But to update the tuple, there must be N accesses if the space has N different indexes. Note regarding storage engine: Vinyl optimizes away such accesses if secondary index fields are unchanged by the update. So, this complexity factor applies only to memtx, since it always makes a full-tuple copy on every update.
Number of tuples accessed	A few requests, for example, SELECT, can retrieve multiple tuples. This factor is usually less important than the others.
WAL settings	The important setting for the write-ahead log is wal.mode. If the setting causes no writing or delayed writing, this factor is unimportant. If the setting causes every data-change request to wait for writing to finish on a slow device, this factor is more important than all the others.

CRUD operation examples

Using data operations

This section shows basic usage scenarios and typical errors for each data operation in Tarantool: INSERT, DELETE, UPDATE, UPSERT, REPLACE, and SELECT. Before trying out the examples, you need to bootstrap a Tarantool instance as shown below.

-- Create a space --
bands = box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

INSERT

The space_object.insert method accepts a well-formatted tuple.

-- Insert a tuple with a unique primary key --
tarantool> bands:insert{1, 'Scorpions', 1965}
---
- [1, 'Scorpions', 1965]
...

insert also checks all the keys for duplicates.

-- Try to insert a tuple with a duplicate primary key --
tarantool> bands:insert{1, 'Scorpions', 1965}
---
- error: Duplicate key exists in unique index "primary" in space "bands" with old
    tuple - [1, "Scorpions", 1965] and new tuple - [1, "Scorpions", 1965]
...

-- Try to insert a tuple with a duplicate secondary key --
tarantool> bands:insert{2, 'Scorpions', 1965}
---
- error: Duplicate key exists in unique index "band" in space "bands" with old tuple
    - [1, "Scorpions", 1965] and new tuple - [2, "Scorpions", 1965]
...

-- Insert a second tuple with unique primary and secondary keys --
tarantool> bands:insert{2, 'Pink Floyd', 1965}
---
- [2, 'Pink Floyd', 1965]
...

-- Delete all tuples --
tarantool> bands:truncate()
---
...

DELETE

space_object.delete allows you to delete a tuple identified by the primary key.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}

-- Delete a tuple with an existing key --
tarantool> bands:delete{4}
---
- [4, 'The Beatles', 1960]
...
tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
...

You can also use index_object.delete to delete a tuple by the specified unique index.

-- Delete a tuple by the primary index --
tarantool> bands.index.primary:delete{3}
---
- [3, 'Ace of Base', 1987]
...
tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
...

-- Delete a tuple by a unique secondary index --
tarantool> bands.index.band:delete{'Scorpions'}
---
- [2, 'Scorpions', 1965]
...
tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
...

-- Try to delete a tuple by a non-unique secondary index --
tarantool> bands.index.year:delete(1986)
---
- error: Get() doesn't support partial keys and non-unique indexes
...
tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
...

-- Try to delete a tuple by a partial key --
tarantool> bands.index.year_band:delete('Roxette')
---
- error: Invalid key part count in an exact match (expected 2, got 1)
...

-- Delete a tuple by a full key --
tarantool> bands.index.year_band:delete{1986, 'Roxette'}
---
- [1, 'Roxette', 1986]
...
tarantool> bands:select()
---
- []
...

-- Delete all tuples --
tarantool> bands:truncate()
---
...

UPDATE

space_object.update allows you to update a tuple identified by the primary key. Similarly to delete, the update method accepts a full key and also an operation to execute.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}

-- Update a tuple with an existing key --
tarantool> bands:update({2}, {{'=', 2, 'Pink Floyd'}})
---
- [2, 'Pink Floyd', 1965]
...

tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Pink Floyd', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

index_object.update updates a tuple identified by the specified unique index.

-- Update a tuple by the primary index --
tarantool> bands.index.primary:update({2}, {{'=', 2, 'The Rolling Stones'}})
---
- [2, 'The Rolling Stones', 1965]
...

tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'The Rolling Stones', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

-- Update a tuple by a unique secondary index --
tarantool> bands.index.band:update({'The Rolling Stones'}, {{'=', 2, 'The Doors'}})
---
- [2, 'The Doors', 1965]
...

tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'The Doors', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

-- Try to update a tuple by a non-unique secondary index --
tarantool> bands.index.year:update({1965}, {{'=', 2, 'Scorpions'}})
---
- error: Get() doesn't support partial keys and non-unique indexes
...
tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'The Doors', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

-- Delete all tuples --
tarantool> bands:truncate()
---
...

UPSERT

space_object.upsert updates an existing tuple or inserts a new one:

If the existing tuple is found by the primary key, Tarantool applies the update operation to this tuple and ignores the new tuple.
If no existing tuple is found, Tarantool inserts the new tuple and ignores the update operation.

tarantool> bands:insert{1, 'Scorpions', 1965}
---
- [1, 'Scorpions', 1965]
...
-- As the first argument, upsert accepts a tuple, not a key --
tarantool> bands:upsert({2}, {{'=', 2, 'Pink Floyd'}})
---
- error: Tuple field 2 (band_name) required by space format is missing
...
tarantool> bands:select()
---
- - [1, 'Scorpions', 1965]
...
tarantool> bands:delete(1)
---
- [1, 'Scorpions', 1965]
...

upsert acts as insert when no existing tuple is found by the primary key.

tarantool> bands:upsert({1, 'Scorpions', 1965}, {{'=', 2, 'The Doors'}})
---
...
-- As you can see, {1, 'Scorpions', 1965} is inserted, --
-- and the update operation is not applied. --
tarantool> bands:select()
---
- - [1, 'Scorpions', 1965]
...

-- upsert with the same primary key but different values in other fields --
-- applies the update operation and ignores the new tuple. --
tarantool> bands:upsert({1, 'Scorpions', 1965}, {{'=', 2, 'The Doors'}})
---
...
tarantool> bands:select()
---
- - [1, 'The Doors', 1965]
...

upsert searches for the existing tuple by the primary index, not by the secondary index. This can lead to a duplication error if the tuple violates a secondary index uniqueness.

tarantool> bands:upsert({2, 'The Doors', 1965}, {{'=', 2, 'Pink Floyd'}})
---
- error: Duplicate key exists in unique index "band" in space "bands" with old tuple
    - [1, "The Doors", 1965] and new tuple - [2, "The Doors", 1965]
...
tarantool> bands:select()
---
- - [1, 'The Doors', 1965]
...

-- This works if uniqueness is preserved. --
tarantool> bands:upsert({2, 'The Beatles', 1960}, {{'=', 2, 'Pink Floyd'}})
---
...
tarantool> bands:select()
---
- - [1, 'The Doors', 1965]
  - [2, 'The Beatles', 1960]
...

-- Delete all tuples --
tarantool> bands:truncate()
---
...

REPLACE

space_object.replace accepts a well-formatted tuple and searches for the existing tuple by the primary key of the new tuple:

If the existing tuple is found, Tarantool deletes it and inserts the new tuple.
If no existing tuple is found, Tarantool inserts the new tuple.

tarantool> bands:replace{1, 'Scorpions', 1965}
---
- [1, 'Scorpions', 1965]
...
tarantool> bands:select()
---
- - [1, 'Scorpions', 1965]
...
tarantool> bands:replace{1, 'The Beatles', 1960}
---
- [1, 'The Beatles', 1960]
...
tarantool> bands:select()
---
- - [1, 'The Beatles', 1960]
...
tarantool> bands:truncate()
---
...

replace can violate unique constraints, like upsert does.

tarantool> bands:insert{1, 'Scorpions', 1965}
---
- [1, 'Scorpions', 1965]
...
tarantool> bands:insert{2, 'The Beatles', 1960}
---
- [2, 'The Beatles', 1960]
...
tarantool> bands:replace{2, 'Scorpions', 1965}
---
- error: Duplicate key exists in unique index "band" in space "bands" with old tuple
    - [1, "Scorpions", 1965] and new tuple - [2, "Scorpions", 1965]
...
tarantool> bands:truncate()
---
...

SELECT

The space_object.select request searches for a tuple or a set of tuples in the given space by the primary key. To search by the specified index, use index_object.select. These methods work with any keys, including unique and non-unique, full and partial. If a key is partial, select searches by all keys where the prefix matches the specified key part.

tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'The Doors', 1965}
           bands:insert{4, 'The Beatles', 1960}

tarantool> bands:select(1)
---
- - [1, 'Roxette', 1986]
...

tarantool> bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'The Doors', 1965]
  - [4, 'The Beatles', 1960]
...

tarantool> bands.index.primary:select(2)
---
- - [2, 'Scorpions', 1965]
...

tarantool> bands.index.band:select('The Doors')
---
- - [3, 'The Doors', 1965]
...

tarantool> bands.index.year:select(1965)
---
- - [2, 'Scorpions', 1965]
  - [3, 'The Doors', 1965]
...
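
The examples above use full keys only. The sketch below illustrates a partial-key search on a two-part index. It assumes a Tarantool instance started from a writable directory and uses a hypothetical scratch space, bands_demo, that mirrors the bands space so that the existing data is not touched.

```lua
-- Assumption: this runs inside Tarantool (for example, `tarantool demo.lua`).
box.cfg{}

-- Recreate the hypothetical scratch space so the sketch is repeatable.
if box.space.bands_demo ~= nil then
    box.space.bands_demo:drop()
end
local demo = box.schema.space.create('bands_demo')
demo:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' },
})
demo:create_index('primary', { parts = { 'id' } })
demo:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

demo:insert({ 1, 'Roxette', 1986 })
demo:insert({ 2, 'Scorpions', 1965 })
demo:insert({ 3, 'The Doors', 1965 })
demo:insert({ 4, 'The Beatles', 1960 })

-- Partial key: only the first part (year) of the two-part index is given,
-- so every tuple whose key starts with 1965 matches.
local found = demo.index.year_band:select({ 1965 })
for _, tuple in ipairs(found) do
    print(tuple[1], tuple[2], tuple[3])
end
```

Because year_band is a TREE index, the matching tuples come back ordered by the remaining key part, band_name.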

Using box.space functions to read _space tuples

This example illustrates how to look at all the spaces and, for each one, display approximately how many tuples it contains and the first field of its first tuple. The function uses Tarantool’s box.space function pairs() and reports ‘1 or more’ tuples for every space that has a primary index. The iteration through the spaces is coded as a scan of the _space system space, which contains metadata. The third field in _space contains the space name, so the key instruction space_name = v[3] means space_name is the name field in the tuple of _space that we’ve just fetched with pairs(). The function returns a table:

function example()
  local tuple_count, space_name, line
  local ta = {}
  for k, v in box.space._space:pairs() do
    space_name = v[3]
    if box.space[space_name].index[0] ~= nil then
      tuple_count = '1 or more'
    else
      tuple_count = '0'
    end
    line = space_name .. ' tuple_count =' .. tuple_count
    if tuple_count == '1 or more' then
      for k1, v1 in box.space[space_name]:pairs() do
        line = line .. '. first field in first tuple = ' .. v1[1]
        break
      end
    end
    table.insert(ta, line)
  end
  return ta
end

The output below shows what happens if you invoke this function:

tarantool> example()
---
- - _schema tuple_count =1 or more. first field in first tuple = cluster
  - _space tuple_count =1 or more. first field in first tuple = 272
  - _vspace tuple_count =1 or more. first field in first tuple = 272
  - _index tuple_count =1 or more. first field in first tuple = 272
  - _vindex tuple_count =1 or more. first field in first tuple = 272
  - _func tuple_count =1 or more. first field in first tuple = 1
  - _vfunc tuple_count =1 or more. first field in first tuple = 1
  - _user tuple_count =1 or more. first field in first tuple = 0
  - _vuser tuple_count =1 or more. first field in first tuple = 0
  - _priv tuple_count =1 or more. first field in first tuple = 1
  - _vpriv tuple_count =1 or more. first field in first tuple = 1
  - _cluster tuple_count =1 or more. first field in first tuple = 1
...

Using box.space functions to organize a _space tuple

This example shows how to display field names and field types of a system space, using metadata to find metadata.

To begin: how can one select the _space tuple that describes _space?

A simple way is to look at the constants in box.schema, which shows that there is an item named SPACE_ID == 280, so these statements retrieve the correct tuple:

box.space._space:select{ 280 }
-- or --
box.space._space:select{ box.schema.SPACE_ID }

Another way is to look at the tuples in box.space._index, which shows that there is a secondary index named ‘name’ for space number 280, so this statement also retrieves the correct tuple:

box.space._space.index.name:select{ '_space' }

However, the retrieved tuple is not easy to read:

tarantool> box.space._space.index.name:select{'_space'}
---
- - [280, 1, '_space', 'memtx', 0, {}, [{'name': 'id', 'type': 'num'}, {'name': 'owner',
        'type': 'num'}, {'name': 'name', 'type': 'str'}, {'name': 'engine', 'type': 'str'},
      {'name': 'field_count', 'type': 'num'}, {'name': 'flags', 'type': 'str'}, {
        'name': 'format', 'type': '*'}]]
...

It looks disorganized because field number 7 has been formatted with recommended names and data types. How can one get those specific sub-fields? Since it’s visible that field number 7 is an array of maps, this for loop will do the organizing:

tarantool> do
         >   local tuple_of_space = box.space._space.index.name:get{'_space'}
         >   for _, field in ipairs(tuple_of_space[7]) do
         >     print(field.name .. ', ' .. field.type)
         >   end
         > end
id, num
owner, num
name, str
engine, str
field_count, num
flags, str
format, *
---
...

Using sequences

A sequence is a generator of ordered integer values.

As with spaces and indexes, you should specify the sequence name and let Tarantool generate a unique numeric identifier (sequence ID).

You can also specify several options when creating a new sequence. The options determine the values that are generated whenever the sequence is used.

Options for box.schema.sequence.create()

Option name	Type and meaning	Default	Examples
`start`	Integer. The value to generate the first time a sequence is used	1	`start=0`
`min`	Integer. Values smaller than this cannot be generated	1	`min=-1000`
`max`	Integer. Values larger than this cannot be generated	9223372036854775807	`max=0`
`cycle`	Boolean. Whether to start again when values cannot be generated	false	`cycle=true`
`cache`	Integer. The number of values to store in a cache	0	`cache=0`
`step`	Integer. What to add to the previous generated value, when generating a new value	1	`step=-1`
`if_not_exists`	Boolean. If this is true and a sequence with this name exists already, ignore other options and use the existing values	`false`	`if_not_exists=true`

Once a sequence exists, it can be altered, dropped, reset, forced to generate the next value, or associated with an index.
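
As a minimal sketch of how the options interact, the script below creates a small cycling sequence, reads a few values, and then resets and drops it. It assumes a Tarantool instance started from a writable directory; the sequence name demo_seq is hypothetical.

```lua
-- Assumption: this runs inside Tarantool (for example, `tarantool demo.lua`).
box.cfg{}

-- Drop a leftover sequence from a previous run so the sketch is repeatable.
if box.sequence.demo_seq ~= nil then
    box.sequence.demo_seq:drop()
end

-- A cycling sequence over the values 1..3 (start defaults to 1, step to 1).
local seq = box.schema.sequence.create('demo_seq', { min = 1, max = 3, cycle = true })

-- Four calls: the fourth one cannot go past max, so the sequence starts again.
local values = {}
for i = 1, 4 do
    values[i] = seq:next()
end
for i, v in ipairs(values) do
    print(i, tostring(v))
end

-- reset() returns the sequence to its initial state,
-- so the following next() yields the start value again.
seq:reset()
local after_reset = seq:next()
print(tostring(after_reset))

-- Remove the scratch sequence.
seq:drop()
```

Without cycle = true, the fourth next() call would raise an error instead of wrapping around.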

Associating a sequence with an index

First, create a sequence:

-- Create a sequence --
box.schema.sequence.create('id_seq',{min=1000, start=1000})
--[[
---
- step: 1
  id: 1
  min: 1000
  cache: 0
  uid: 1
  cycle: false
  name: id_seq
  start: 1000
  max: 9223372036854775807
...
--]]

The result shows that the new sequence has all default values, except for the two that were specified, min and start.

Get the next value from the sequence by calling the next() function:

-- Get the next item --
box.sequence.id_seq:next()
--[[
---
- 1000
...
--]]

The result is the same as the start value. The next call increases the value by one (the default sequence step).

Create a space and specify that its primary key should be generated from the sequence:

-- Create a space --
box.schema.space.create('customers')

-- Create an index that uses the sequence --
box.space.customers:create_index('primary',{ sequence = 'id_seq' })
--[[
---
- parts:
  - type: unsigned
    is_nullable: false
    fieldno: 1
  sequence_id: 1
  id: 0
  space_id: 513
  unique: true
  hint: true
  type: TREE
  name: primary
  sequence_fieldno: 1
...
--]]

Insert a tuple without specifying a value for the primary key:

-- Insert a tuple without the primary key value --
box.space.customers:insert{ nil, 'Adams' }
--[[
---
- [1001, 'Adams']
...
--]]

The result is a new tuple where the first field is assigned the next value from the sequence. This arrangement, where the system automatically generates the values for a primary key, is sometimes called “auto-incrementing” or “identity”.

For syntax and implementation details, see the reference for box.schema.sequence.

Migrations

Migration refers to any change in a data schema: adding or removing a field, creating or dropping an index, changing a field format, and so on. Space creation is also a migration. Using migrations, you can track the evolution of your data schema since its initial state. In Tarantool, migrations are presented as Lua code that alters the data schema using the built-in Lua API.

There are two types of migrations:

simple migrations don’t require additional actions on existing data
complex migrations include both schema and data changes

Simple migrations

There are two types of schema migration that do not require data migration:

Creating an index. A new index can be created at any time. To learn more about index creation, see Indexes and the space_object:create_index() reference.
Adding a field to the end of a space. To add a field, update the space format so that it includes all its fields and also the new field. For example:
```
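-- Read the current format of the 'writers' space
local writers = box.space.writers
local fmt = writers:format()

-- Append the new nullable field and apply the extended format
table.insert(fmt, { name = 'age', type = 'number', is_nullable = true })
writers:format(fmt)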
```
The new field must be declared with is_nullable = true. Otherwise, an error occurs if the space contains tuples in the old format.
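
The sketch below contrasts the two cases on a hypothetical scratch space, writers_demo, that already holds a tuple in the old format. It assumes a Tarantool instance started from a writable directory; the sample data is illustrative.

```lua
-- Assumption: this runs inside Tarantool (for example, `tarantool demo.lua`).
box.cfg{}

-- Recreate the hypothetical scratch space so the sketch is repeatable.
if box.space.writers_demo ~= nil then
    box.space.writers_demo:drop()
end
local writers = box.schema.space.create('writers_demo')
writers:format({
    { name = 'id', type = 'unsigned' },
    { name = 'name', type = 'string' },
})
writers:create_index('primary', { parts = { 'id' } })
writers:insert({ 1, 'Haruki Murakami' })

-- Attempt 1: the new field is not nullable, so the existing
-- two-field tuple no longer fits and the format change is rejected.
local strict = writers:format()
table.insert(strict, { name = 'age', type = 'number' })
local ok = pcall(function()
    writers:format(strict)
end)
print(ok)  -- false

-- Attempt 2: the same field declared as nullable is accepted.
local relaxed = writers:format()
table.insert(relaxed, { name = 'age', type = 'number', is_nullable = true })
writers:format(relaxed)

-- The old tuple is unchanged; its new field simply has no value yet.
print(writers:get(1)[3])  -- nil
```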

Note

After creating a new field, you probably want to fill it with data. The tarantool/moonwalker module is useful for this task.

Complex migrations

Other types of migrations are more complex and require additional actions to maintain data consistency.

Migrations are possible in two cases:

When Tarantool starts, and no client uses the database yet
During request processing, when active clients are already using the database

For the first case, it is enough to write and test the migration code. The most difficult task is to migrate data when there are active clients. You should keep it in mind when you initially design the data schema.

We identify the following problems if there are active clients:

Associated data can change atomically.
The system should be able to transfer data using both the new schema and the old one.
When data is being transferred to a new space, data access should consider that the data might be in one space or another.
Write requests must not interfere with the migration. A common approach is to write according to the new data schema.

These issues may or may not be relevant depending on your application and its availability requirements.

Tarantool offers the following features that make migrations easier and safer:

Transaction mechanism. It is useful when writing a migration, because it allows you to work with the data atomically. But before using the transaction mechanism, you should explore its limitations. For details, see the section about transactions.
space:upgrade() function (EE only). With the help of space:upgrade(), you can enable compression and migrate data, including already created tuples. For details, check the Upgrading space schema section.
Centralized migration management mechanism (EE only). Implemented in the Enterprise version of the tt utility and in Tarantool Cluster Manager, this mechanism enables migration execution and tracking in the replication clusters. For details, see Centralized migration management.

Applying migrations

The migration code is executed on a running Tarantool instance. Important: no method guarantees you transactional application of migrations on the whole cluster.

Method 1: include migrations in the application code

This is quite simple: when you reload the code, the data is migrated at the right moment, and the database schema is updated. However, this method may not work for everyone. You may not be able to restart Tarantool or update the code using the hot-reload mechanism.

Method 2: the tt utility

Connect to the necessary instance using tt connect.

$ tt connect admin:password@localhost:3301

If your migration is written in a Lua file, you can execute it using dofile(). Call this function and specify the path to the migration file as the first argument. It looks like this:
```
tarantool> dofile('0001-delete-space.lua')
---
...
```
(or) Copy the migration script code, paste it into the console, and run it.

You can also connect to the instance and execute the migration script in a single call:

$ tt connect admin:password@localhost:3301 -f 0001-delete-space.lua

Centralized migration management

Enterprise Edition

Centralized migration management is available in the Enterprise Edition only.

Tarantool EE offers a mechanism for centralized migration management in replication clusters that use etcd as a configuration storage. The mechanism uses the same etcd storage to store migrations and applies them across the entire Tarantool cluster. This ensures migration consistency in the cluster and enables migration history tracking.

The centralized migration management mechanism is implemented in the Enterprise version of the tt utility and in Tarantool Cluster Manager.

To learn how to manage migrations in Tarantool EE clusters from the command line, see Centralized migrations with tt. To learn how to use the mechanism from the TCM web interface, see the Performing migrations TCM documentation page.

Centralized migrations with tt

Example on GitHub: migrations

In this section, you learn to use the centralized migration management mechanism implemented in the Enterprise Edition of the tt utility.

The section includes the following tutorials:

Basic tt migrations tutorial

In this tutorial, you learn to define the cluster data schema using the centralized migration management mechanism implemented in the Enterprise Edition of the tt utility.

Prerequisites

Before starting this tutorial:

Download and install Tarantool Enterprise SDK.
Install etcd.

Preparing a cluster

The centralized migration mechanism works with Tarantool EE clusters that:

use etcd as a centralized configuration storage
use the CRUD module or its Enterprise version for data distribution

Setting up etcd

First, start up an etcd instance to use as a configuration storage:

$ etcd

etcd runs on the default port 2379.

Optionally, enable etcd authentication by executing the following script:

#!/usr/bin/env bash

etcdctl user add root:topsecret
etcdctl role add app_config_manager
etcdctl role grant-permission app_config_manager --prefix=true readwrite /myapp/
etcdctl user add app_user:config_pass
etcdctl user grant-role app_user app_config_manager
etcdctl auth enable

It creates an etcd user app_user with read and write permissions to the /myapp/ prefix, in which the cluster configuration will be stored. The user’s password is config_pass.

Note

If you don’t enable etcd authentication, make tt migrations calls without the configuration storage credentials.

Creating a cluster

Initialize a tt environment:
```
$ tt init
```
In the instances.enabled directory, create the myapp directory.
Go to the instances.enabled/myapp directory and create application files:

instances.yml:

router-001-a:
storage-001-a:
storage-001-b:
storage-002-a:
storage-002-b:

config.yaml:

config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /myapp/
    username: app_user
    password: config_pass
    http:
      request:
        timeout: 3

myapp-scm-1.rockspec:

package = 'myapp'
version = 'scm-1'

source  = {
    url = '/dev/null',
}

dependencies = {
    'crud == 1.5.2',
}

build = {
    type = 'none';
}

Create the source.yaml with a cluster configuration to publish to etcd:

Note

This configuration describes a typical CRUD-enabled sharded cluster with one router and two storage replica sets, each including one master and one read-only replica.

credentials:
  users:
    client:
      password: 'secret'
      roles: [super]
    replicator:
      password: 'secret'
      roles: [replication]
    storage:
      password: 'secret'
      roles: [sharding]

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

sharding:
  bucket_count: 3000

groups:
  routers:
    sharding:
      roles: [router]
    roles: [roles.crud-router]
    replicasets:
      router-001:
        instances:
          router-001-a:
            iproto:
              listen:
              - uri: localhost:3301
              advertise:
                client: localhost:3301
  storages:
    sharding:
      roles: [storage]
    roles: [roles.crud-storage]
    replication:
      failover: manual
    replicasets:
      storage-001:
        leader: storage-001-a
        instances:
          storage-001-a:
            iproto:
              listen:
              - uri: localhost:3302
              advertise:
                client: localhost:3302
          storage-001-b:
            iproto:
              listen:
              - uri: localhost:3303
              advertise:
                client: localhost:3303
      storage-002:
        leader: storage-002-a
        instances:
          storage-002-a:
            iproto:
              listen:
              - uri: localhost:3304
              advertise:
                client: localhost:3304
          storage-002-b:
            iproto:
              listen:
              - uri: localhost:3305
              advertise:
                client: localhost:3305

Publish the configuration to etcd:

$ tt cluster publish "http://app_user:config_pass@localhost:2379/myapp/" source.yaml

The full cluster code is available on GitHub here: migrations.

Building and starting the cluster

Build the application:
```
$ tt build myapp
```
Start the cluster:
```
$ tt start myapp
```
To check that the cluster is up and running, use tt status:
```
$ tt status myapp
```
Bootstrap vshard in the cluster:
```
$ tt replicaset vshard bootstrap myapp
```

Writing migrations

To perform migrations in the cluster, write them in Lua and publish to the cluster’s etcd configuration storage.

Each migration file must return a Lua table with one object named apply. This object has one field – scenario – that stores the migration function:

local function apply_scenario()
    -- migration code
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

The migration unit is a single file: its scenario is executed as a whole. An error that happens in any step of the scenario causes the entire migration to fail.

Migrations are executed in lexicographical order. Thus, it’s convenient to use filenames that start with ordered, zero-padded numbers to define the migration order, for example:

000001_create_space.lua
000002_create_index.lua
000003_alter_space.lua
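
To see why zero-padded prefixes matter, here is a minimal plain-Lua sketch (not part of tt) that sorts file names with an ordinary string comparison. The file names and the sorted_copy helper are illustrative assumptions; the sketch only demonstrates lexicographical ordering, not how tt itself reads migrations.

```lua
-- Plain Lua sketch: lexicographical ordering of migration file names.
-- Strings are compared character by character, so '10_...' sorts before '1_c...'
-- (the character '0' precedes '_' in ASCII with the default C locale).
local function sorted_copy(names)
    local copy = {}
    for i, name in ipairs(names) do
        copy[i] = name
    end
    table.sort(copy)  -- default '<' comparison on strings
    return copy
end

local unpadded = sorted_copy({
    '2_create_index.lua',
    '10_alter_space.lua',
    '1_create_space.lua',
})
local padded = sorted_copy({
    '000002_create_index.lua',
    '000010_alter_space.lua',
    '000001_create_space.lua',
})

print(table.concat(unpadded, ', '))
print(table.concat(padded, ', '))
```

With unpadded numbers, the tenth migration sorts before the first one, while the zero-padded names keep the intended numeric order.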

The default location where tt searches for migration files is /migrations/scenario. Create this subdirectory inside the tt environment. Then, create two migration files:

000001_create_writers_space.lua: create a space, define its format, and create a primary index.

local helpers = require('tt-migrations.helpers')

local function apply_scenario()
    local space = box.schema.space.create('writers')

    space:format({
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'name', type = 'string'},
        {name = 'age', type = 'number'},
    })

    space:create_index('primary', {parts = {'id'}})
    space:create_index('bucket_id', {parts = {'bucket_id'}})

    helpers.register_sharding_key('writers', {'id'})
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

Note

Note the usage of the tt-migrations.helpers module. In this example, its function register_sharding_key is used to define a sharding key for the space.

000002_create_writers_index.lua: add one more index.

local function apply_scenario()
    local space = box.space['writers']

    space:create_index('age', {parts = {'age'}})
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

Publishing migrations

To publish migrations to the etcd configuration storage, run tt migrations publish:

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp"
   • 000001_create_writers_space.lua: successfully published to key "000001_create_writers_space.lua"
   • 000002_create_writers_index.lua: successfully published to key "000002_create_writers_index.lua"

Applying migrations

To apply published migrations to the cluster, run tt migrations apply providing a cluster user’s credentials:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret

Important

The cluster user must have enough access privileges to execute the migrations code.

The output should look as follows:

• router-001:
•     000001_create_writers_space.lua: successfully applied
•     000002_create_writers_index.lua: successfully applied
• storage-001:
•     000001_create_writers_space.lua: successfully applied
•     000002_create_writers_index.lua: successfully applied
• storage-002:
•     000001_create_writers_space.lua: successfully applied
•     000002_create_writers_index.lua: successfully applied

The migrations are applied on all replica set leaders. Read-only replicas receive the changes from the corresponding replica set leaders.

Check the migrations status with tt migrations status:

$ tt migrations status "http://app_user:config_pass@localhost:2379/myapp" \
                       --tarantool-username=client --tarantool-password=secret
   • migrations centralized storage scenarios:
   •   000001_create_writers_space.lua
   •   000002_create_writers_index.lua
   • migrations apply status on Tarantool cluster:
   •   router-001:
   •     000001_create_writers_space.lua: APPLIED
   •     000002_create_writers_index.lua: APPLIED
   •   storage-001:
   •     000001_create_writers_space.lua: APPLIED
   •     000002_create_writers_index.lua: APPLIED
   •   storage-002:
   •     000001_create_writers_space.lua: APPLIED
   •     000002_create_writers_index.lua: APPLIED

To make sure that the space and indexes are created in the cluster, connect to the router instance and retrieve the space information:

$ tt connect myapp:router-001-a

myapp:router-001-a> require('crud').schema('writers')
---
- indexes:
    0:
      unique: true
      parts:
      - fieldno: 1
        type: number
        exclude_null: false
        is_nullable: false
      id: 0
      type: TREE
      name: primary
    2:
      unique: true
      parts:
      - fieldno: 4
        type: number
        exclude_null: false
        is_nullable: false
      id: 2
      type: TREE
      name: age
  format: [{'name': 'id', 'type': 'number'}, {'type': 'number', 'name': 'bucket_id',
      'is_nullable': true}, {'name': 'name', 'type': 'string'}, {'name': 'age', 'type': 'number'}]
...

Next steps

Learn to write and perform data migration in Data migrations with space.upgrade().

Data migrations with space.upgrade()

Example on GitHub: migrations

In this tutorial, you learn to write migrations that include data migration using the space.upgrade() function.

Prerequisites

Before starting this tutorial, complete the Basic tt migrations tutorial. As a result, you have a sharded Tarantool EE cluster that uses an etcd-based configuration storage. The cluster has a space with two indexes.

Writing a complex migration

Complex migrations require data migration along with schema migration. Connect to the router instance and insert some tuples into the space before proceeding to the next steps.

$ tt connect myapp:router-001-a

myapp:router-001-a> require('crud').insert_object_many('writers', {
    {id = 1, name = 'Haruki Murakami', age = 75},
    {id = 2, name = 'Douglas Adams', age = 49},
    {id = 3, name = 'Eiji Mikage', age = 41},
}, {noreturn = true})

The next migration changes the space format incompatibly: instead of one name field, the new format includes two fields first_name and last_name. To apply this migration, you need to change each tuple’s structure preserving the stored data. The space.upgrade function helps with this task.

Create a new file 000003_alter_writers_space.lua in /migrations/scenario. Prepare its initial structure the same way as in previous migrations:

local function apply_scenario()
    -- migration code
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

Start the migration function with the new format description:

local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()

Note

box.space.writers.index.age:drop() drops an existing index. This is done because indexes rely on field numbers and may break during this format change. If you need the age field indexed, recreate the index after applying the new format.

Next, create a stored function that transforms tuples to fit the new format. In this case, the function extracts the first and the last name from the name field and returns a tuple of the new format:

box.schema.func.create('_writers_split_name', {
    language = 'lua',
    is_deterministic = true,
    body = [[
    function(t)
        local name = t[3]

        local split_data = {}
        local split_regex = '([^%s]+)'
        for v in string.gmatch(name, split_regex) do
            table.insert(split_data, v)
        end

        local first_name = split_data[1]
        assert(first_name ~= nil)

        local last_name = split_data[2]
        assert(last_name ~= nil)

        return {t[1], t[2], first_name, last_name, t[4]}
    end
    ]],
})
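
Before embedding the transformation in a stored function, it can help to try the same logic in a plain Lua interpreter. The sketch below is an assumption-labeled copy of the splitting code: split_name is a hypothetical local name, and it takes an array-like Lua table instead of a Tarantool tuple, so no Tarantool instance is needed.

```lua
-- Plain Lua copy of the splitting logic from the stored function body.
-- It works on an array-like table {id, bucket_id, name, age}.
local function split_name(t)
    local parts = {}
    for word in string.gmatch(t[3], '([^%s]+)') do
        table.insert(parts, word)
    end
    assert(parts[1] ~= nil, 'first name is missing')
    assert(parts[2] ~= nil, 'last name is missing')
    -- Words after the second one are discarded, as in the stored function.
    return {t[1], t[2], parts[1], parts[2], t[4]}
end

local upgraded = split_name({2, 401, 'Douglas Adams', 49})
print(table.concat(upgraded, ' | '))

-- A single-word name triggers the assertion; in a real migration this
-- kind of error would make the upgrade fail.
local ok = pcall(split_name, {4, 12, 'Voltaire', 83})
print(ok)
```

The second call shows a limitation worth checking against your data: tuples whose name field holds only one word do not pass the assertions.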

Finally, call space:upgrade() with the new format and the transformation function as its arguments. Here is the complete migration code:

local function apply_scenario()
    local space = box.space['writers']
    local new_format = {
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'first_name', type = 'string'},
        {name = 'last_name', type = 'string'},
        {name = 'age', type = 'number'},
    }
    box.space.writers.index.age:drop()

    box.schema.func.create('_writers_split_name', {
        language = 'lua',
        is_deterministic = true,
        body = [[
        function(t)
            local name = t[3]

            local split_data = {}
            local split_regex = '([^%s]+)'
            for v in string.gmatch(name, split_regex) do
                table.insert(split_data, v)
            end

            local first_name = split_data[1]
            assert(first_name ~= nil)

            local last_name = split_data[2]
            assert(last_name ~= nil)

            return {t[1], t[2], first_name, last_name, t[4]}
        end
        ]],
    })

    local future = space:upgrade({
        func = '_writers_split_name',
        format = new_format,
    })

    future:wait()
end

return {
    apply = {
        scenario = apply_scenario,
    },
}

Learn more about space.upgrade() in Upgrading space schema.

Publishing the migration

Publish the new migration to etcd.

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp" \
                        migrations/scenario/000003_alter_writers_space.lua

Note

You can also publish all migrations from the default location /migrations/scenario. All other migrations stored in this directory are already published, so tt skips them.

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp"

Applying the migration

Apply the published migrations:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret

Connect to the router instance and check that the space and its tuples have the new format:

$ tt connect myapp:router-001-a

myapp:router-001-a> require('crud').get('writers', 2)
---
- rows: [2, 401, 'Douglas', 'Adams', 49]
  metadata: [{'name': 'id', 'type': 'number'}, {'name': 'bucket_id', 'type': 'number'},
    {'name': 'first_name', 'type': 'string'}, {'name': 'last_name', 'type': 'string'},
    {'name': 'age', 'type': 'number'}]
- null
...

Next steps

Learn to use migrations for data schema definition on new instances added to the cluster in Extending the cluster.

Extending the cluster

Example on GitHub: migrations

In this tutorial, you learn how to consistently define the data schema on newly added cluster instances using the centralized migration management mechanism.

Prerequisites

Before starting this tutorial, complete the Basic tt migrations tutorial and Data migrations with space.upgrade(). As a result, you have a sharded Tarantool EE cluster that uses an etcd-based configuration storage. The cluster has a space with two indexes.

Extending the cluster

Having all migrations in a centralized etcd storage, you can extend the cluster and consistently define the data schema on new instances on the fly.

Add one more storage replica set to the cluster. To do this, edit the cluster files in instances.enabled/myapp:

instances.yml: add the lines below to the end.
```
storage-003-a:
storage-003-b:
```

source.yaml: add the lines below to the end.

storage-003:
  leader: storage-003-a
  instances:
    storage-003-a:
      iproto:
        listen:
        - uri: localhost:3306
        advertise:
          client: localhost:3306
    storage-003-b:
      iproto:
        listen:
        - uri: localhost:3307
        advertise:
          client: localhost:3307

Publish the new cluster configuration to etcd:

$ tt cluster publish "http://app_user:config_pass@localhost:2379/myapp/" source.yaml

Run tt start to start up the new instances:

$ tt start myapp
   • The instance myapp:router-001-a (PID = 61631) is already running.
   • The instance myapp:storage-001-a (PID = 61632) is already running.
   • The instance myapp:storage-001-b (PID = 61634) is already running.
   • The instance myapp:storage-002-a (PID = 61639) is already running.
   • The instance myapp:storage-002-b (PID = 61640) is already running.
   • Starting an instance [myapp:storage-003-a]...
   • Starting an instance [myapp:storage-003-b]...

Now the cluster contains three storage replica sets.

Applying migrations to the new replica set

The new replica set – storage-003 – is just started and has no data schema yet. Apply all stored migrations to the cluster to load the same data schema to the new replica set:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret \
                      --replicaset=storage-003

Note

You can also apply migrations without specifying the replica set. All published migrations are already applied on other replica sets, so tt skips the operation on them.

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret

To make sure that the space exists on the new instances, connect to storage-003-a and check box.space.writers:

$ tt connect myapp:storage-003-a

myapp:storage-003-a> box.space.writers ~= nil
---
- true
...

Troubleshooting migrations

The centralized migrations mechanism allows troubleshooting migration issues using dedicated tt migrations options. When troubleshooting migrations, remember that any unfinished or failed migration can leave the data schema in an inconsistent state. Additional steps may be needed to fix this.

Warning

The options used for migration troubleshooting can cause migration inconsistency in the cluster. Use them only for local development and testing purposes.

Incorrect migration published

If an incorrect migration was published to etcd but wasn’t applied yet, fix the migration file and publish it again with the --overwrite option:

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp" \
                        000001_create_space.lua --overwrite

If the migration that needs a fix isn’t the last in the lexicographical order, add also --ignore-order-violation:

$ tt migrations publish "http://app_user:config_pass@localhost:2379/myapp" \
                        000001_create_space.lua --overwrite --ignore-order-violation

If a migration was published by mistake and wasn’t applied yet, you can delete it from etcd using tt migrations remove:

$ tt migrations remove "http://app_user:config_pass@localhost:2379/myapp" \
                    --migration 000003_not_needed.lua

Incorrect migration applied

Warning

Any schema change that an incorrect migration made before it failed or was canceled must be resolved manually on each replica set before you reapply the migration. --force-reapply and other tt migrations options affect only the internal status of the migration and don’t revert changes that it has made in the cluster.

If the migration is already applied, publish the fixed version and apply it with the --force-reapply option:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret \
                      --force-reapply
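
Because a reapplied migration runs its whole scenario again, it can help to write schema steps so that a repeated run doesn’t fail on objects that already exist. The sketch below is one possible approach, not a tt requirement: it assumes the if_not_exists option of box.schema.space.create() and space:create_index(), and it only covers object creation. It doesn’t undo or repair changes made by an incorrect migration.

```lua
-- Sketch of a scenario that tolerates a repeated run.
-- `box` is the Tarantool global; it is resolved only when the function runs.
local function apply_scenario()
    -- Returns the existing space instead of raising an error on a second run.
    local space = box.schema.space.create('writers', {if_not_exists = true})

    space:format({
        {name = 'id', type = 'number'},
        {name = 'bucket_id', type = 'number'},
        {name = 'name', type = 'string'},
        {name = 'age', type = 'number'},
    })

    -- Existing indexes are left untouched on a second run.
    space:create_index('primary', {parts = {'id'}, if_not_exists = true})
    space:create_index('bucket_id', {parts = {'bucket_id'}, if_not_exists = true})
end
```

In a migration file, this function would be returned in the apply.scenario field, as in the migration template shown earlier.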

If execution of the incorrect migration version has failed, you may also need the --ignore-preceding-status option. When you reapply a migration, tt checks the statuses of preceding migrations to ensure consistency. This option skips that check:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret \
                      --migration=000003_alter_space.lua \
                      --force-reapply --ignore-preceding-status

Migration execution takes too long

To interrupt migration execution on the cluster, use tt migrations stop:

$ tt migrations stop "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret

You can adjust the maximum migration execution time using the --execution-timeout option of tt migrations apply:

$ tt migrations apply "http://app_user:config_pass@localhost:2379/myapp" \
                      --tarantool-username=client --tarantool-password=secret \
                      --execution-timeout=60

Note

If a migration timeout is reached, you may need to call tt migrations stop to cancel requests that were sent when applying migrations.

Upgrading space schema

Enterprise Edition

space:upgrade() is available in the Enterprise Edition only.

In Tarantool, migration refers to any change in a data schema, for example, creating an index, adding a field, or changing a field format. If you need to change a data schema, there are several possible cases:

Schema migration does not require data migration: adding a field with the is_nullable parameter to the end of the space, creating an index.
Schema migration requires data migration. For example, it is necessary when you have to iterate over the entire space to convert columns to a new format or remove the column completely.

To solve the task of migrating the data, you can:

Migrate data to a new space manually.
Use the space:upgrade() feature.

Space upgrade overview

The space:upgrade() feature allows users to upgrade the format of a space and the tuples stored in it without blocking the database.

How to apply space upgrade

First, specify an upgrade function – a function that will convert the tuples in the space to a new format. The requirements for this function are listed below.

The upgrade function takes two arguments. The first argument is the tuple to be upgraded. The second one is optional and contains additional information stored in a plain Lua object. If omitted, the second argument is nil.
The function returns a new tuple or a Lua table. For example, it can add a new field to the tuple. The new tuple must conform to the new space format set by the upgrade operation.
The function should be registered with box.schema.func.create. It should also be stored, deterministic, and written in Lua.
The function should not change the primary key of the tuple.
The function should be idempotent: f(f(t)) = f(t). This is necessary because the function is applied to all tuples returned to the user, and some of them may have already been upgraded in the background.
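
The idempotency requirement can be checked on sample data before an upgrade function is registered. The following is a minimal plain-Lua sketch under stated assumptions: is_idempotent and rows_equal are hypothetical helpers, and rows are array-like Lua tables rather than Tarantool tuples.

```lua
-- Compares two array-like rows field by field.
local function rows_equal(a, b)
    if #a ~= #b then
        return false
    end
    for i = 1, #a do
        if a[i] ~= b[i] then
            return false
        end
    end
    return true
end

-- Hypothetical helper: checks f(f(t)) == f(t) for every sample row.
local function is_idempotent(f, samples)
    for _, row in ipairs(samples) do
        local once = f(row)
        local twice = f(once)
        if not rows_equal(once, twice) then
            return false
        end
    end
    return true
end

-- Idempotent: inserts a string copy of the id only while the row has two fields.
local function add_id_string(row)
    if #row == 2 then
        return {row[1], tostring(row[1]), row[2]}
    end
    return row
end

-- Not idempotent: rebuilds the row on every call, so a second call changes it.
local function always_insert(row)
    return {row[1], tostring(row[1]), row[2]}
end

local samples = {{1, 'data1'}, {2, 'data2'}}
print(is_idempotent(add_id_string, samples))
print(is_idempotent(always_insert, samples))
```

The first function guards on the row length, so it leaves already upgraded rows unchanged. The second one overwrites the last field when applied twice, so the check reports it as not idempotent.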

Then define a new space format. This step is optional. However, it could be useful if, for example, you want to add a new column with data. For details, check the Usage Example section.

The next optional step is to choose an upgrade mode. There are three modes: upgrade, dryrun, and dryrun+upgrade. The default value is upgrade. To check an upgrade function without applying any changes, choose the dryrun mode. To run a space upgrade without testing the function, pick the upgrade mode. If you want to apply both the test and the actual upgrade, use the dryrun+upgrade option. For details, see the Upgrade Modes section.

How the upgrade works

The user defines an upgrade function. Each tuple of the chosen space is passed through the function. The function converts the tuple from the old format to a new one. The function is applied to all tuples stored in the space in the background. Besides, the function is applied to all tuples returned to the user via the box API (for example, select, get). Therefore, it appears that the space upgrades instantly.

Keep in mind that space:upgrade differs from the space_object:format() in the following ways:

| Difference | `space:upgrade()` | `space:format()` |
| --- | --- | --- |
| Non-blocking | Yes. It returns tuples in the new format, whether or not they have already been converted. | Yes. |
| Set a format incompatible with the current one | Yes. Works for non-indexed field types only. | No, only expand the format in a compatible way. |
| Visibility of changes | Immediately. All changes are visible and replicated immediately. New data should conform to the new format immediately after the call. | After data validation. Data validation starts in the background, it does not block the database. Inserting data incompatible with the new format is allowed before validation is completed – in this case `space.format` fails. |
| Cancel (error/restart) | Writes the state to the system table. Restart: the operation continues. Error: the operation should be restarted manually, any other attempt to change the table fails. | Leaves no traces. |
| Set the upgrade function | Yes. The upgrade may take a while to traverse the space and transform tuples. | No. |

Note

At the moment, the feature is not supported for vinyl spaces.

User API

The space:upgrade() method is added to the space object:

space:upgrade({func[, arg, format, mode, is_async]})

Parameters:

func (string/integer) – upgrade function name (string) or ID (integer). For details, see the upgrade function requirements section.
arg – additional information passed to the upgrade function in the second argument. The option accepts any Lua value that can be encoded in MsgPack, which means that the msgpack.encode(arg) should succeed. For example, one can pass a scalar or a Lua table. The default value is nil.
format (map) – new space format. The requirements for this are the same as for any other space:format(). If the field is omitted, the space format will remain the same as before the upgrade.
mode (string) – upgrade mode. Possible values: upgrade, dryrun, dryrun+upgrade. The default value is upgrade.
is_async (boolean) – the flag indicates whether to wait until the upgrade operation is complete before exiting the function. The default value is false – the function is blocked until the upgrade operation is finished.

Return:

object describing the status of the operation (also known as future). The methods of the object are described below.

object future_object

future_object:info(dryrun, status, func, arg, owner, error, progress)

Shows information about the state of the upgrade operation.

Parameters:

dryrun (boolean) – dry run mode flag. Possible values: true for a dry run, nil for an actual upgrade.
status (string) – upgrade status. Possible values: inprogress, waitrw, error, replica, done.
func (string/integer) – name of the upgrade function. It is the same as passed to the space:upgrade method. The field is nil if the status is done.
arg – additional information passed to the upgrade function. It is the same as for the space:upgrade method. The field is nil if it is omitted in the space:upgrade.
owner (string) – UUID of the instance running the upgrade (see box.info.uuid). The field is nil if the status is done.
error (string) – error message if the status is error, otherwise nil.
progress (string) – completion percentage if the status is inprogress/waitrw, otherwise nil.

Return:

a table with information about the state of the upgrade operation

Rtype:

table

The fields can also be accessed directly, without calling the info() method. For example, future.status is the same as future:info().status.

future_object:wait([timeout])

Waits until the upgrade operation is completed or a timeout occurs. An operation is considered completed if its status is done or error.

Parameters:	timeout (`double`) – if the `timeout` argument is omitted, the method waits as long as it takes.
Return:	returns `true` if the operation has been completed, `false` on timeout
Rtype:	boolean

future_object:cancel()

Cancels the upgrade operation if it is currently running. Otherwise, an exception is thrown. A canceled upgrade operation completes with an error.

Return:	none
Rtype:	void

Running space:upgrade() with is_async = false or with the is_async field not set is equivalent to:

local future = space:upgrade({func = 'my_func', is_async = true})
future:wait()
return future

If called without arguments, space:upgrade() returns a future object for the active upgrade operation. If there is none, it returns nil.
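
With is_async = true, the caller decides how to wait. The sketch below shows one possible pattern: waiting in short steps and reporting progress in between. wait_with_progress, its parameters, and the 'timeout' return value are hypothetical names introduced for illustration. The sketch relies only on the wait() method and the status, progress, and error fields described above.

```lua
-- Hypothetical helper: waits for an upgrade future in short steps.
-- `future` is the object returned by space:upgrade(); `report` is any
-- function that accepts the current status and progress.
local function wait_with_progress(future, step, max_steps, report)
    for _ = 1, max_steps do
        -- wait() returns true once the status is done or error
        if future:wait(step) then
            return future.status, future.error
        end
        report(future.status, future.progress)
    end
    -- 'timeout' is a marker of this sketch, not an upgrade status
    return 'timeout', nil
end

-- Possible usage on an instance where space:upgrade() is available:
-- local future = box.space.test:upgrade({func = 'convert', is_async = true})
-- local status, err = wait_with_progress(future, 1, 60, print)
```

Keeping the wait loop outside of space:upgrade() lets the caller log progress or call future:cancel() after its own deadline.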

Upgrade modes

There are three upgrade modes: dryrun, dryrun+upgrade, and upgrade. Regardless of the mode selected, the upgrade does not block execution. Once in a while, the background fiber commits the upgraded tuples and yields.

Calling space:upgrade without arguments always returns the current state of the space upgrade, never the state of a dry run. If there is a dry run working in the background, space:upgrade will still return nil. Unlike an actual space upgrade, the future object returned by a dry run upgrade can’t be recovered if it is lost. So a dry run is aborted if its future object is garbage collected.

Warning

In dryrun+upgrade mode: if the future object is garbage collected by Lua before the end of the dry run and the start of the upgrade, then the dry run will be canceled, and no upgrade will be started.

Upgrade modes:

upgrade mode: the background fiber iterates over the space, applies the upgrade function, checks that obtained tuples fit the new space format, and updates the tuples. This mode prevents the space from being altered. The mode can only be performed on the master instance.
dryrun mode: the dry-run mode is used to check the upgrade function. The mode does not apply any changes to the target space. It starts a background fiber. The fiber:
- Iterates over the target space.
- Attempts to apply the upgrade function to each tuple stored in the space.
- Checks if the returned tuple matches the new format.
- Checks if the function is idempotent.
- Checks that the function does not modify the primary key.
For details, see the upgrade function requirements section.

To start a dry run, pass mode='dryrun' to the space:upgrade method. In this case, the future object has the dryrun field set to true. The possible statuses are inprogress and dryrun. replica and waitrw states are never set for a dry run future object.

The dryrun mode is not persisted. Restarting the instance does not restart a dry run. A dry run only works on the original instance, never on replicas. Unlike a real upgrade, a dry run does not prevent the space from being altered. The space can even be dropped. In this case, the dry run will complete with an error.
dryrun+upgrade mode: it starts a dry run, which, if completed successfully, triggers an actual upgrade. The future object returned by space:upgrade remains valid throughout the process. It starts as the future object of the dry run. Then, under the hood, it is converted into an upgrade future object. Waiting on it would wait for both the dry run and the upgrade to complete. During the dry run, the future object has the dryrun field set to true. When the actual upgrade starts, the dryrun field is set to nil. The mode can only be performed on the master instance.
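
The modes can also be combined by hand, which lets you inspect the dry-run result before the actual upgrade starts. The following is a rough sketch, not a replacement for the dryrun+upgrade mode: checked_upgrade is a hypothetical helper, and it assumes that a dry run that didn't end in the error status found no problems.

```lua
-- Hypothetical helper: runs a dry run first and starts the actual
-- upgrade only if the dry run didn't end with an error.
local function checked_upgrade(space, func_name, new_format)
    -- Keep the dry-run future in a local variable for the whole check:
    -- a dry run is aborted if its future object is garbage collected.
    local dry = space:upgrade({
        func = func_name,
        format = new_format,
        mode = 'dryrun',
        is_async = true,
    })
    dry:wait()
    if dry.status == 'error' then
        return nil, dry.error
    end
    -- Start the actual upgrade; without is_async the call blocks until it finishes.
    return space:upgrade({
        func = func_name,
        format = new_format,
        mode = 'upgrade',
    })
end

-- Possible usage on the master instance:
-- local future, err = checked_upgrade(box.space.writers, '_writers_split_name', new_format)
```

Unlike dryrun+upgrade, this variant returns control between the two phases, so the space can change in the meantime: a dry run doesn't prevent the space from being altered.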

States

An upgrade operation has one of the following upgrade states:

inprogress – the upgrade operation is running in the background. The function is applied to all tuples returned to the user.
waitrw – the instance was switched to the read-only mode (for example, by using box.cfg.read_only), so the upgrade couldn’t proceed. The upgrade process will resume as soon as the instance switches back to read-write mode. Nevertheless, the upgrade function is applied to all tuples returned to the user.
error – the upgrade operation failed with an error. See the error field for the error message. See the log for the tuple that caused the error. No alter operation is allowed, except for another upgrade, supposed to fix the problem. Nevertheless, the upgrade function is applied to all tuples returned to the user. The space is writable.
done – the upgrade operation is successfully completed. The upgrade function is not applied to tuples returned to the user anymore. The function can be deleted.
replica – the upgrade operation is either running or completed with an error on another instance. See the owner field for the UUID of the instance running the upgrade. Nevertheless, the upgrade function is applied to all tuples returned to the user.
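
The state descriptions above can be condensed into two small lookups, for example, for use in monitoring scripts. This is an illustrative sketch: is_finished and can_drop_upgrade_function are hypothetical helper names, and the tables only restate what the list says.

```lua
-- States in which future:wait() considers the operation completed.
local TERMINAL = {done = true, error = true}

-- Whether the upgrade function is still applied to tuples returned to the user.
local FUNC_APPLIED_ON_READ = {
    inprogress = true,
    waitrw = true,
    error = true,
    replica = true,
    done = false,
}

local function is_finished(status)
    return TERMINAL[status] == true
end

local function can_drop_upgrade_function(status)
    -- Only in the done state is the function no longer applied on reads.
    return FUNC_APPLIED_ON_READ[status] == false
end

print(is_finished('waitrw'), can_drop_upgrade_function('error'))
print(is_finished('done'), can_drop_upgrade_function('done'))
```

Note that error is a finished state for wait(), but the upgrade function must stay in place because it is still applied to returned tuples.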

Interaction with alter

While a space upgrade is in progress, the space can’t be altered or dropped. The attempt to do that will throw an exception. Restarting an upgrade is allowed in case the currently running upgrade is canceled or completed with an error. It means the manual restart is possible if the upgrade operation is in the error state.

If a space upgrade was canceled or failed with an error, the space can’t be altered or dropped. The only option is to restart the upgrade using a different upgrade function or format.

Interaction with recovery

The space upgrade state is persisted. It is stored in the _space system table. If an instance with a space upgrade in progress (inprogress state) is shut down, it restarts the space upgrade after recovery. If a space upgrade fails (switches to the error state), it remains in the error state after recovery.

Interaction with replication

The changes made to a space by a space upgrade are replicated. Just as on the instance where the upgrade is performed, the upgrade function is applied to all tuples returned to the user on the replicas. However, the upgrade operation is not performed on the replicas in the background. The replicas wait for the upgrade operation to complete on the master. They can’t alter or drop the space. Normally, they can’t cancel or restart the upgrade operation either.

There is an emergency exception when the master is permanently dead. It is possible to restart a space upgrade that started on another instance. The restart is possible if the upgrade owner UUID (see the owner field) has been deleted from the _cluster system table.

Note

Except for the dryrun mode, the upgrade can only be performed on the master. If the instance is no longer the master, the upgrade is suspended until the instance becomes the master again. Restarting the upgrade on a new master works only if the old one has been removed from the replica set (_cluster system space).

Usage example

Suppose there are two columns in the space test – id (unsigned) and data (string). The example shows how to upgrade the schema and add another column to the space using space:upgrade(). The new column contains the id values converted to string. Each step takes a while.

The test space is generated with the following script:

local log = require('log')
box.cfg{
    checkpoint_count = 1,
    memtx_memory = 5 * 1024 * 1024 * 1024,
}
box.schema.space.create('test')
box.space.test:format{
    {name = 'id', type = 'unsigned'},
    {name = 'data', type = 'string'},
}
box.space.test:create_index('pk')
local count = 20 * 1000 * 1000
local progress = 0
box.begin()
for i = 1, count do
    box.space.test:insert{i, 'data' .. i}

    if i % 1000 == 0 then
        box.commit()
        local p = math.floor(i / count * 100)
        if progress ~= p then
            progress = p
            log.info('Generating test data set... %d%% done', p)
        end
        box.begin()
    end
end
box.commit()
box.snapshot()
os.exit(0)

To upgrade the space, connect to the server and then run the commands below:

localhost:3301> box.schema.func.create('convert', {
              >     language = 'lua',
              >     is_deterministic = true,
              >     body = [[function(t)
              >         if #t == 2 then
              >             return t:update({{'!', 2, tostring(t.id)}})
              >         else
              >             return t
              >         end
              >     end]],
              > })
localhost:3301> box.space.test:upgrade({
              >     func = 'convert',
              >     format = {
              >         {name = 'id', type = 'unsigned'},
              >         {name = 'id_string', type = 'string'},
              >         {name = 'data', type = 'string'},
              >     },
              > })

While the upgrade is in progress, you can track the state of the upgrade. To check the status, connect to Tarantool from another console and run the following commands:

localhost:3311> box.space.test:upgrade()
---
- status: inprogress
  progress: 8%
  owner: 579a9e99-427e-4e99-9e2e-216bbd3098a7
  func: convert
...

Even though the upgrade is only 8% complete, selecting the data from the space returns the converted tuples:

localhost:3311> box.space.test:select({}, {iterator = 'req', limit = 5})
---
- - [20000000, '20000000', 'data20000000']
  - [19999999, '19999999', 'data19999999']
  - [19999998, '19999998', 'data19999998']
  - [19999997, '19999997', 'data19999997']
  - [19999996, '19999996', 'data19999996']
...

Note

The tuples contain the new field even though the space upgrade is still running.

Wait for the space upgrade to complete using the command below:

localhost:3311> box.space.test:upgrade():wait()

Read views

Enterprise Edition

Read views are available in the Enterprise Edition only.

A read view is an in-memory snapshot of the entire database that isn’t affected by future data modifications. Read views provide access to database spaces and their indexes and enable you to retrieve data using the same select and pairs operations.

Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.

To improve memory consumption and performance, Tarantool creates read views using the copy-on-write technique. In this case, duplication of the entire data set is not required: Tarantool duplicates only blocks modified after a read view is created.
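
The copy-on-write idea can be illustrated with a toy model. The sketch below is not Tarantool's implementation: the CowStore class, its block size, and its method names are hypothetical and exist only to show that a snapshot shares unmodified blocks and that a block is duplicated on the first write after the snapshot is taken.

```python
class CowStore:
    """Toy block store that supports one copy-on-write snapshot at a time."""

    def __init__(self, values, block_size=2):
        self.block_size = block_size
        self.blocks = [
            list(values[i:i + block_size])
            for i in range(0, len(values), block_size)
        ]
        self.shared = set()   # indexes of blocks still shared with the snapshot
        self.copied = 0       # how many blocks had to be duplicated

    def open_view(self):
        # The snapshot copies only the list of block references, not the data.
        self.shared = set(range(len(self.blocks)))
        return list(self.blocks)

    def write(self, pos, value):
        index, offset = divmod(pos, self.block_size)
        if index in self.shared:
            # First write to a shared block: duplicate it so that the
            # snapshot keeps the old contents.
            self.blocks[index] = list(self.blocks[index])
            self.shared.discard(index)
            self.copied += 1
        self.blocks[index][offset] = value


store = CowStore([1, 2, 3, 4, 5, 6])
view = store.open_view()
store.write(0, 99)   # duplicates the first block
store.write(1, 98)   # same block, already private: no second copy

print(view[0], store.blocks[0], store.copied)  # [1, 2] [99, 98] 1
```

Only one of the three blocks is duplicated here. The other two are still the same objects in both the store and the snapshot.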

Note

Tarantool Enterprise Edition supports read views starting from v2.11.0. You can work with them using both the Lua and C APIs.

Limitations

Read views have the following limitations:

Only the memtx engine is supported.
Only TREE, HASH and functional indexes are supported.

Working with read views

Creating a read view

To create a read view, call the box.read_view.open() function. The snippet below shows how to create a read view with the read_view1 name.

tarantool> read_view1 = box.read_view.open({name = 'read_view1'})

After creating a read view, you can see the information about it by calling read_view_object:info().

tarantool> read_view1:info()
---
- timestamp: 66.606817935
  signature: 24
  is_system: false
  status: open
  vclock: {1: 24}
  name: read_view1
  id: 1
...

To list all the created read views, call the box.read_view.list() function.

Querying data

After creating a read view, you can access database spaces using the read_view_object.space field. This field provides access to a space object that exposes the select, get, and pairs methods with the same behavior as corresponding box.space methods.

The example below shows how to select 4 records from the bands space:

tarantool> read_view1.space.bands:select({}, {limit = 4})
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

Similarly, you can retrieve data by the specific index.

tarantool> read_view1.space.bands.index.year:select({}, {limit = 4})
---
- - [4, 'The Beatles', 1960]
  - [2, 'Scorpions', 1965]
  - [1, 'Roxette', 1986]
  - [3, 'Ace of Base', 1987]
...

Pagination is supported in read views in the same ways as in select requests to spaces: using the fetch_pos and after arguments. To get the cursor position after executing a request on a read view, set fetch_pos to true:

tarantool> result, position = read_view1.space.bands:select({}, { limit = 3, fetch_pos = true })
---
...

tarantool> result
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
...

tarantool> position
---
- kQM
...

Then, pass this position in the after parameter of a request to get the next data chunk:

tarantool> read_view1.space.bands:select({}, { limit = 3, after = position })
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...

Closing a read view

When a read view is no longer needed, close it using the read_view_object:close() method because a read view may consume a substantial amount of memory.

tarantool> read_view1:close()
---
...

Otherwise, a read view is closed implicitly when the read view object is collected by the Lua garbage collector.

After the read view is closed, its status is set to closed. On an attempt to use it, an error is raised.

Example

A Tarantool session below demonstrates how to open a read view, get data from this view, and close it. To repeat these steps, you need to bootstrap a Tarantool instance as described in Using data operations (you can skip creating secondary indexes).

Insert test data.

tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}

Create a read view by calling the open function. Then, make sure that the read view status is open.

tarantool> read_view1 = box.read_view.open({name = 'read_view1'})

tarantool> read_view1.status
---
- open
...

Change data in a database using the delete and update operations.

tarantool> bands:delete(4)
---
- [4, 'The Beatles', 1960]
...
tarantool> bands:update({2}, {{'=', 2, 'Pink Floyd'}})
---
- [2, 'Pink Floyd', 1965]
...

Query a read view to make sure it contains a snapshot of data before a database is updated.

tarantool> read_view1.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

Close a read view.

tarantool> read_view1:close()
---
...

SQL guides

This section contains hands-on SQL guides. You might also want to read the in-depth SQL reference.

SQL beginners’ guide

The SQL Beginners’ Guide explains how to get started with SQL in Tarantool and introduces the necessary concepts.

The SQL Beginners’ Guide is about databases in general, and about the relationship between Tarantool’s NoSQL and SQL products. Most of the matters in the Beginners’ Guide will already be familiar to people who have used relational databases before.

Prerequisites

Before starting this tutorial:

Install the tt CLI utility.

Start a Tarantool instance in the interactive mode by running tt run -i:

$ tt run -i
Tarantool 3.0.0-0-g6ba34da7f8
type 'help' for interactive help
tarantool>

Initialize the instance and switch the input language to SQL:

tarantool> box.cfg{}
tarantool> \set language sql
tarantool> \set delimiter ;

Now you have a running Tarantool instance that accepts SQL input.

Sample table

In football training camp it is traditional for the trainer to begin by showing a football and saying “this is a football”. In that spirit, this is a table:

TABLE
          [1]              [2]              [3]
       +-----------------+----------------+----------------+
 Row#1 | Row#1,Column#1  | Row#1,Column#2 | Row#1,Column#3 |
       +-----------------+----------------+----------------+
 Row#2 | Row#2,Column#1  | Row#2,Column#2 | Row#2,Column#3 |
       +-----------------+----------------+----------------+
 Row#3 | Row#3,Column#1  | Row#3,Column#2 | Row#3,Column#3 |
       +-----------------+----------------+----------------+

But the labels are misleading – one usually doesn’t identify rows and columns by their ordinal positions, one prefers to pick out specific items by their contents. In that spirit, this is a table:

MODULES

+-----------------+------+---------------------+
| NAME            | SIZE | PURPOSE             |
+-----------------+------+---------------------+
| box             | 1432 | Database Management |
| clock           |  188 | Seconds             |
| crypto          |    4 | Cryptography        |
+-----------------+------+---------------------+

So one does not use longitude/latitude navigation by talking about “Row#2 Column #2”, one uses the contents of the Name column and the name of the Size column by talking about “the size, where the name is ‘clock’”. To be more exact, this is what one says:

SELECT size FROM modules WHERE name = 'clock';

If you’re familiar with Tarantool’s architecture – and ideally you read about that before coming to this chapter – then you know that there is a NoSQL way to get the same thing:

box.space.MODULES:select()[2][2]

Well, you can do that. One of the advantages of Tarantool is that if you can get data via an SQL statement, then you can get the same data via a NoSQL request. But the reverse is not true, because not all NoSQL tuple sets are definable as SQL tables. The following restrictions apply to SQL but not to NoSQL:

1. Every column must have a name.
2. Every column should have a scalar type. Tarantool is relaxed about which particular scalar type you can have, but there is no way to index and search arrays, tables within tables, or what MessagePack calls “maps”.

Tarantool/NoSQL’s “format” clause causes the same restrictions.

So an SQL “table” is a NoSQL “tuple set with format restrictions”, an SQL “row” is a NoSQL “tuple”, an SQL “column” is a NoSQL “list of fields within a tuple set”.

Creating a table

This is how to create the modules table:

CREATE TABLE modules (name STRING, size INTEGER, purpose STRING, PRIMARY KEY (name));

The words that are IN CAPITAL LETTERS are “keywords” (although it is only a convention in this manual that keywords are in capital letters; in practice many programmers prefer to avoid shouting). A keyword has meaning for the SQL parser, so many keywords are reserved: they cannot be used as names unless they are enclosed inside quotation marks.

The word “modules” is a “table name”, and the words “name” and “size” and “purpose” are “column names”. All tables and all columns must have names.

The words “STRING” and “INTEGER” are “data types”. STRING means “the contents should be characters, the length is indefinite, the equivalent NoSQL type is ‘string’”. INTEGER means “the contents should be numbers without decimal points, the equivalent NoSQL type is ‘integer’”. Tarantool supports other data types, but this section’s example table has data types from the two main groups, namely, data types for numbers and data types for strings.

The final clause, PRIMARY KEY (name), means that the name column is the main column used to identify the row.

Nulls

Frequently it is necessary, at least temporarily, that a column value should be NULL. Typical situations are: the value is unknown, or the value is not applicable. For example, you might make a module as a placeholder but you don’t want to say its size or purpose. If such things are possible, the column is “nullable”. The example table’s name column cannot contain nulls, and it could be defined explicitly as “name STRING NOT NULL”, but in this case that’s unnecessary – a column defined as PRIMARY KEY is automatically NOT NULL.

Is a NULL in SQL the same thing as a nil in Lua? No, but it is close enough that there will be confusion. When nil means “unknown” or “inapplicable”, yes. But when nil means “nonexistent” or “type is nil”, no. NULL is a value, it has a data type because it is inside a column which is defined with that data type.

Creating an index

This is how to create indexes for the modules table:

CREATE INDEX size ON modules (size);
CREATE UNIQUE INDEX purpose ON modules (purpose);

There is no need to create an index on the name column, because Tarantool creates an index automatically when it sees a PRIMARY KEY clause in the CREATE TABLE statement. In fact there is no need to create indexes on the size or purpose columns either – if indexes don’t exist, then it is still possible to use the columns for searches. Typically people create non-primary indexes, also called secondary indexes, when it becomes clear that the table will grow large and searches will be frequent, because searching with an index is generally much faster than searching without an index.

Another use for indexes is to enforce uniqueness. When an index is created with CREATE UNIQUE INDEX for the purpose column, it is not possible to have duplicate values in that column.
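
The uniqueness rule can be tried out in a short script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The 'fiber' and 'timer' rows are hypothetical test data, and the error that Tarantool itself reports will differ.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.execute("CREATE INDEX size ON modules (size)")
conn.execute("CREATE UNIQUE INDEX purpose ON modules (purpose)")

conn.execute("INSERT INTO modules VALUES ('clock', 188, 'Seconds')")
# The size index is not unique, so a repeated size is accepted.
conn.execute("INSERT INTO modules VALUES ('fiber', 188, 'Cooperative multitasking')")

try:
    # Hypothetical row that repeats the purpose 'Seconds'.
    conn.execute("INSERT INTO modules VALUES ('timer', 10, 'Seconds')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM modules").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```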

Data change

Putting data into a table is called “inserting”. Changing data is called “updating”. Removing data is called “deleting”. Together, the three SQL statements INSERT plus UPDATE plus DELETE are the three main “data-change” statements.

This is how to insert, update, and delete a row in the modules table:

INSERT INTO modules VALUES ('json', 14, 'format functions for JSON');
UPDATE modules SET size = 15 WHERE name = 'json';
DELETE FROM modules WHERE name = 'json';

The corresponding non-SQL Tarantool requests would be:

box.space.MODULES:insert{'json', 14, 'format functions for JSON'}
box.space.MODULES:update('json', {{'=', 2, 15}})
box.space.MODULES:delete{'json'}

This is how one would populate the table with the values that were shown earlier:

INSERT INTO modules VALUES ('box', 1432, 'Database Management');
INSERT INTO modules VALUES ('clock', 188, 'Seconds');
INSERT INTO modules VALUES ('crypto', 4, 'Cryptography');
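
The data-change statements above can be replayed end to end in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for a Tarantool instance, with TEXT in place of STRING. Against Tarantool, the same SQL would be sent through a connector instead.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)

# Insert, update, and delete one row.
conn.execute("INSERT INTO modules VALUES ('json', 14, 'format functions for JSON')")
conn.execute("UPDATE modules SET size = 15 WHERE name = 'json'")
size_after_update = conn.execute(
    "SELECT size FROM modules WHERE name = 'json'"
).fetchone()[0]

conn.execute("DELETE FROM modules WHERE name = 'json'")
rows_after_delete = conn.execute("SELECT COUNT(*) FROM modules").fetchone()[0]

# Populate the table with the three sample rows.
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
    ],
)
clock_size = conn.execute(
    "SELECT size FROM modules WHERE name = 'clock'"
).fetchone()[0]

print(size_after_update, rows_after_delete, clock_size)  # 15 0 188
```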

Constraints

Some data-change statements are illegal due to something in the table’s definition. This is called “constraining what can be done”. Some types of constraints have already been shown …

NOT NULL – if a column is defined with a NOT NULL clause, it is illegal to put NULL into it. A primary-key column is automatically NOT NULL.

UNIQUE – if a column has a UNIQUE index, it is illegal to put a duplicate into it. A primary-key column automatically has a UNIQUE index.

data domain – if a column is defined as having data type INTEGER, it is illegal to put a non-number into it. More generally, if a value doesn’t correspond to the data type of the definition, it is illegal. Some database management systems (DBMSs) are very forgiving and will try to make allowances for bad values rather than reject them; Tarantool is a bit more strict than those DBMSs.

Now, here are other types of constraints …

CHECK – a table description can have a clause “CHECK (conditional expression)”. For example, if the CREATE TABLE modules statement looked like this:

CREATE TABLE modules (name STRING,
                      size INTEGER,
                      purpose STRING,
                      PRIMARY KEY (name),
                      CHECK (size > 0));

then this INSERT statement would be illegal:

INSERT INTO modules VALUES ('box', 0, 'The Database Kernel');

because there is a CHECK constraint saying that the second column, the size column, cannot contain a value which is less than or equal to zero. Try this instead:

INSERT INTO modules VALUES ('box', 1, 'The Database Kernel');
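
The CHECK behavior can be confirmed with a small script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The try_insert helper is hypothetical, and the exact error raised by Tarantool will differ.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, "
    "PRIMARY KEY (name), CHECK (size > 0))"
)


def try_insert(size):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO modules VALUES ('box', ?, 'The Database Kernel')", (size,)
        )
        return True
    except sqlite3.IntegrityError:
        return False


zero_accepted = try_insert(0)   # violates CHECK (size > 0)
one_accepted = try_insert(1)    # satisfies the constraint

print(zero_accepted, one_accepted)  # False True
```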

FOREIGN KEY – a table description can have a clause “FOREIGN KEY (column-list) REFERENCES table (column-list)”. For example, if there is a new table “submodules” which in a way depends on the modules table, it can be defined like this:

CREATE TABLE submodules (name STRING,
                         module_name STRING,
                         size INTEGER,
                         purpose STRING,
                         PRIMARY KEY (name),
                         FOREIGN KEY (module_name) REFERENCES
                         modules (name));

Now try to insert a new row into this submodules table:

INSERT INTO submodules VALUES
  ('space', 'Box', 10000, 'insert etc.');

The insert will fail because the second column (module_name) refers to the name column in the modules table, and the name column in the modules table does not contain ‘Box’. However, it does contain ‘box’. By default searches in Tarantool’s SQL use a binary collation. This will work:

INSERT INTO submodules
  VALUES ('space', 'box', 10000, 'insert etc.');

Now try to delete the corresponding row from the modules table:

DELETE FROM modules WHERE name = 'box';

The delete will fail because the second column (module_name) in the submodules table refers to the name column in the modules table, and the name column in the modules table would not contain ‘box’ if the delete succeeded. So the FOREIGN KEY constraint affects both the table which contains the FOREIGN KEY clause and the table that the FOREIGN KEY clause refers to.
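
Both directions of the FOREIGN KEY constraint can be exercised in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The PRAGMA line is SQLite-specific, because SQLite does not enforce foreign keys unless asked to. The rejected helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.execute(
    "CREATE TABLE submodules (name TEXT, module_name TEXT, size INTEGER, "
    "purpose TEXT, PRIMARY KEY (name), "
    "FOREIGN KEY (module_name) REFERENCES modules (name))"
)
conn.execute("INSERT INTO modules VALUES ('box', 1432, 'Database Management')")


def rejected(sql):
    """Return True if the statement fails with a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# 'Box' does not match 'box' under a binary comparison.
wrong_case = rejected(
    "INSERT INTO submodules VALUES ('space', 'Box', 10000, 'insert etc.')"
)
right_case = rejected(
    "INSERT INTO submodules VALUES ('space', 'box', 10000, 'insert etc.')"
)
# The referenced row cannot be deleted while a submodule points at it.
parent_delete = rejected("DELETE FROM modules WHERE name = 'box'")

print(wrong_case, right_case, parent_delete)  # True False True
```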

The constraints in a table’s definition – NOT NULL, UNIQUE, data domain, CHECK, and FOREIGN KEY – are guarantors of the database’s integrity. It is important that they are fixed and well-defined parts of the definition, and hard to bypass with SQL. This is often seen as a difference between SQL and NoSQL – SQL emphasizes law and order, NoSQL emphasizes freedom and making your own rules.

Table relationships

Think about the two tables that have been discussed so far:

CREATE TABLE modules (name STRING,
                      size INTEGER,
                      purpose STRING,
                      PRIMARY KEY (name),
                      CHECK (size > 0));

CREATE TABLE submodules (name STRING,
                         module_name STRING,
                         size INTEGER,
                         purpose STRING,
                         PRIMARY KEY (name),
                         FOREIGN KEY (module_name) REFERENCES
                         modules (name));

Because of the FOREIGN KEY clause in the submodules table, there is clearly a many-to-one relationship:

submodules –>> modules

That is, every submodules row must refer to one (and only one) modules row, while every modules row can be referred to in zero or more submodules rows.

Table relationships are important, but beware: do not trust anyone who tells you that databases made with SQL are relational “because there are relationships between tables”. That is wrong, as will be clear in the discussion about what makes a database relational, later.

Selecting with WHERE

Important

By default, Tarantool prohibits SELECT queries that scan table rows instead of using indexes to avoid unwanted heavy load. For the purposes of this tutorial, allow SQL scan queries in Tarantool by running the command:

SET SESSION "sql_seq_scan" = true;

Alternatively, you can allow a specific query to perform a table scan by adding the SEQSCAN keyword before the table name. Learn more about using SEQSCAN in SQL scan queries in the SQL FROM clause description.

We gave a simple example of a SELECT statement earlier:

SELECT size FROM modules WHERE name = 'clock';

The clause “WHERE name = ‘clock’” is legal in other statements – it is in examples with UPDATE and DELETE – but here the only examples will be with SELECT.

The first variation is that the WHERE clause does not have to be specified at all, it is optional. So this statement would return all rows:

SELECT size FROM modules;

The second variation is that the comparison operator does not have to be ‘=’, it can be anything that makes sense: ‘>’ or ‘>=’ or ‘<’ or ‘<=’, or ‘LIKE’ which is an operator that works with strings that may contain wildcard characters ‘_’ meaning ‘match any one character’ or ‘%’ meaning ‘match any zero or one or many characters’. These are legal statements which return all rows:

SELECT size FROM modules WHERE name >= '';
SELECT size FROM modules WHERE name LIKE '%';

The third variation is that IS [NOT] NULL is a special condition. Remembering that the NULL value can mean “it is unknown what the value should be”, and supposing that in some row the size is NULL, then the condition “size > 10” is not certainly true and it is not certainly false, so it is evaluated as “unknown”. Ordinarily the application of a WHERE clause filters out both false and unknown results. So when searching for NULL, say IS NULL; when searching for anything that is not NULL, say IS NOT NULL. This statement will return all rows because (due to the definition) there are no NULLs in the name column:

SELECT size FROM modules WHERE name IS NOT NULL;

The fourth variation is that conditions can be combined with AND / OR, and negated with NOT.

So this statement would return all rows (the first condition is false but the second condition is true, and OR means “return true if either condition is true”):

SELECT size
FROM modules
WHERE name = 'wombat' OR size IS NOT NULL;
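
The WHERE variations, including the treatment of “unknown”, can be observed in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The 'stub' row with NULL size and purpose is a hypothetical addition to the sample data, and names is a hypothetical helper.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
        ("stub", None, None),  # hypothetical placeholder row with NULLs
    ],
)


def names(condition):
    """Return the names of the rows that satisfy the WHERE condition."""
    sql = "SELECT name FROM modules WHERE " + condition + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


bigger = names("size > 10")             # 'stub' is dropped: NULL > 10 is unknown
not_bigger = names("NOT (size > 10)")   # 'stub' is dropped again: NOT unknown is unknown
is_null = names("size IS NULL")
like_c = names("name LIKE 'c%'")
combined = names("name = 'wombat' OR size IS NOT NULL")

print(bigger, not_bigger, is_null, like_c, combined)
```

The 'stub' row appears in neither the “size > 10” result nor its negation. Only IS NULL finds it.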

Selecting with a select list

Yet again, here is a simple example of a SELECT statement:

SELECT size FROM modules WHERE name = 'clock';

The words between SELECT and FROM are the select list. In this case, the select list is just one word: size. Formally it means that the desire is to return the size values, and technically the name for picking a particular column is called “projection”.

The first variation is that one can specify any column in any order:

SELECT name, purpose, size FROM modules;

The second variation is that one can specify an expression, it does not have to be a column name, it does not even have to include a column name. The common expression operators for numbers are the arithmetic operators + - / *; the common expression operator for strings is the concatenation operator ||. For example this statement will return 8, ‘XY’:

SELECT size * 2, 'X' || 'Y' FROM modules WHERE size = 4;

The third variation is that one can add a clause [AS name] after every expression, so that in the return the column titles will make sense. This is especially important when a title might otherwise be ambiguous or meaningless. For example this statement will return 8, ‘XY’ as before

SELECT size * 2 AS double_size, 'X' || 'Y' AS concatenated_literals FROM modules
  WHERE size = 4;

but displayed as a table the result will look like

+----------------+------------------------+
| DOUBLE_SIZE    | CONCATENATED_LITERALS  |
+----------------+------------------------+
|              8 | XY                     |
+----------------+------------------------+
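
The effect of the AS clause on column titles can be inspected programmatically. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. SQLite reports the titles exactly as written, so their letter case may differ from what the Tarantool console shows.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
    ],
)

cursor = conn.execute(
    "SELECT size * 2 AS double_size, 'X' || 'Y' AS concatenated_literals "
    "FROM modules WHERE size = 4"
)
# The first element of each description entry is the column title.
titles = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(titles, rows)  # ['double_size', 'concatenated_literals'] [(8, 'XY')]
```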

Selecting with a select list with asterisk

Instead of listing columns in a select list, one can just say '*'. For example

SELECT * FROM modules;

This is the same thing as

SELECT name, size, purpose FROM modules;

Selecting with "*" saves time for the writer, but it is unclear to a reader who has not memorized what the column names are. Also it is unstable, because there is a way to change a table’s definition (the ALTER statement, which is an advanced topic). Nevertheless, although it might be bad to use it for production, it is handy to use it for introduction, so "*" will appear in some following examples.

Select with subqueries

Remember that there is a modules table and there is a submodules table. Suppose that there is a desire to list the submodules that refer to modules for which the purpose is X. That is, this involves a search of one table using a value in another table. This can be done by enclosing “(SELECT …)” within the WHERE clause. For example:

SELECT name FROM submodules
WHERE module_name =
    (SELECT name FROM modules WHERE purpose LIKE '%Database%');

Subqueries are also useful in the select list, when one wishes to combine information from more than one table. For example this statement will display submodules rows but will include values that come from the modules table:

SELECT name AS submodules_name,
    (SELECT purpose FROM modules
     WHERE modules.name = submodules.module_name)
     AS modules_purpose,
    purpose AS submodules_purpose
FROM submodules;

Whoa. What are “modules.name” and “submodules.module_name”? Whenever you see “x . y” you are looking at a “qualified column name”, and the first part is a table identifier, the second part is a column identifier. It is always legal to use qualified column names, but until now it has not been necessary. Now it is necessary, or at least it is a good idea, because both tables have a column named “name”.

The result will look like this:

+-------------------+------------------------+--------------------+
| SUBMODULES_NAME   | MODULES_PURPOSE        | SUBMODULES_PURPOSE |
+-------------------+------------------------+--------------------+
| space             | Database Management    | insert etc.        |
+-------------------+------------------------+--------------------+
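
Both subquery forms can be run in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING and with the FOREIGN KEY clause omitted because it does not affect the queries.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.execute(
    "CREATE TABLE submodules (name TEXT, module_name TEXT, size INTEGER, "
    "purpose TEXT, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
    ],
)
conn.execute("INSERT INTO submodules VALUES ('space', 'box', 10000, 'insert etc.')")

# Subquery in the WHERE clause.
matching = conn.execute(
    "SELECT name FROM submodules WHERE module_name = "
    "(SELECT name FROM modules WHERE purpose LIKE '%Database%')"
).fetchall()

# Subquery in the select list.
combined = conn.execute(
    "SELECT name AS submodules_name, "
    "(SELECT purpose FROM modules WHERE modules.name = submodules.module_name) "
    "AS modules_purpose, "
    "purpose AS submodules_purpose "
    "FROM submodules"
).fetchall()

print(matching)  # [('space',)]
print(combined)  # [('space', 'Database Management', 'insert etc.')]
```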

Perhaps you have read somewhere that SQL stands for “Structured Query Language”. That is not true any more. But it is true that the query syntax allows for a structural component, namely the subquery, and that was the original idea. However, there is a different way to combine tables – with joins instead of subqueries.

Select with Cartesian join

Until now only “FROM modules” or “FROM submodules” was used in SELECT statements. What if there was more than one table in the FROM clause? For example

SELECT * FROM modules, submodules;

SELECT * FROM modules JOIN submodules;

That is legal. Usually it is not what you want, but it is a learning aid. The result will be:

{ columns from modules table }         { columns from submodules table }
+--------+------+---------------------+-------+-------------+-------+-------------+
| NAME   | SIZE | PURPOSE             | NAME  | MODULE_NAME | SIZE  | PURPOSE     |
+--------+------+---------------------+-------+-------------+-------+-------------+
| box    | 1432 | Database Management | space | box         | 10000 | insert etc. |
| clock  |  188 | Seconds             | space | box         | 10000 | insert etc. |
| crypto |    4 | Cryptography        | space | box         | 10000 | insert etc. |
+--------+------+---------------------+-------+-------------+-------+-------------+

It is not an error. The meaning of this type of join is “combine every row in table-1 with every row in table-2”. It did not specify what the relationship should be, so the result has everything, even when the submodule has nothing to do with the module.

It is handy to look at the above result, called a “Cartesian join” result, to see what would really be desirable. Probably for this case the row that actually makes sense is the one where the modules.name = submodules.module_name, and it’s better to make that clear in both the select list and the WHERE clause, thus:

SELECT modules.name AS modules_name,
       modules.size AS modules_size,
       modules.purpose AS modules_purpose,
       submodules.name,
       module_name,
       submodules.size,
       submodules.purpose
FROM modules, submodules
WHERE modules.name = submodules.module_name;

The result will be:

+----------+-----------+------------+--------+---------+-------+-------------+
| MODULES_ | MODULES_  | MODULES_   | NAME   | MODULE_ | SIZE  | PURPOSE     |
| NAME     | SIZE      | PURPOSE    |        | NAME    |       |             |
+----------+-----------+------------+--------+---------+-------+-------------+
| box      |      1432 | Database   | space  | box     | 10000 | insert etc. |
|          |           | Management |        |         |       |             |
+----------+-----------+------------+--------+---------+-------+-------------+

In other words, you can specify a Cartesian join in the FROM clause, then you can filter out the irrelevant rows in the WHERE clause, and then you can rename columns in the select list. This is fine, and every SQL DBMS supports this. But it is worrisome that the number of rows in a Cartesian join is always (number of rows in first table multiplied by number of rows in second table), which means that conceptually you are often filtering a large set of rows.

It is good to start by looking at Cartesian joins because they show the concept. Many people, though, prefer to use different syntaxes for joins because they look better or clearer. So now those alternatives will be shown.

Select with join with ON clause

The ON clause would have the same comparisons as the WHERE clause that was illustrated for the previous section, but the use of different syntax would be making it clear “this is for the sake of the join”. Readers can see at a glance that it is, in concept at least, an initial step before the result rows are filtered. For example this

SELECT * FROM modules JOIN submodules
  ON (modules.name = submodules.module_name);

is the same as

SELECT * FROM modules, submodules
  WHERE modules.name = submodules.module_name;
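
The row count of a Cartesian join, and the equivalence of the ON and WHERE forms, can be checked in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The second submodule row ('index') is a hypothetical addition that makes the multiplication visible.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.execute(
    "CREATE TABLE submodules (name TEXT, module_name TEXT, size INTEGER, "
    "purpose TEXT, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
    ],
)
conn.executemany(
    "INSERT INTO submodules VALUES (?, ?, ?, ?)",
    [
        ("space", "box", 10000, "insert etc."),
        ("index", "box", 500, "create_index etc."),  # hypothetical extra row
    ],
)

cartesian = conn.execute("SELECT * FROM modules, submodules").fetchall()
with_on = conn.execute(
    "SELECT * FROM modules JOIN submodules "
    "ON (modules.name = submodules.module_name)"
).fetchall()
with_where = conn.execute(
    "SELECT * FROM modules, submodules "
    "WHERE modules.name = submodules.module_name"
).fetchall()

# 3 modules rows x 2 submodules rows = 6 combined rows; 2 survive the filter.
print(len(cartesian), len(with_on), sorted(with_on) == sorted(with_where))  # 6 2 True
```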

Select with join with USING clause

The USING clause would take advantage of names that are held in common between the two tables, with the assumption that the intent is to match those columns with ‘=’ comparisons. For example,

SELECT * FROM modules JOIN submodules USING (name);

has the same effect as

SELECT * FROM modules JOIN submodules WHERE modules.name = submodules.name;

If the table had been created with a plan in advance to use USING clauses, that would save time. But that did not happen. So, although the above example “works”, the results will not be sensible.

Select with natural join

A natural join would take advantage of names that are held in common between the two tables, and would do the filtering automatically based on that knowledge, and throw away duplicate columns.

If the table had been created with a plan in advance to use natural joins, that would be very handy. But that did not happen. So, although the following example “works”, the results won’t be sensible.

SELECT * FROM modules NATURAL JOIN submodules;

Result: nothing, because modules.name does not match submodules.name, and so on. And even if there had been a result, it would only have included four columns: name, module_name, size, purpose.
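
The USING and NATURAL JOIN results on the sample tables can be reproduced in a script. This is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for Tarantool SQL, with TEXT in place of STRING. The order in which the four columns are reported is SQLite's and may differ in Tarantool.

```python
import sqlite3

# SQLite is a stand-in for Tarantool here; TEXT replaces Tarantool's STRING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT, size INTEGER, purpose TEXT, PRIMARY KEY (name))"
)
conn.execute(
    "CREATE TABLE submodules (name TEXT, module_name TEXT, size INTEGER, "
    "purpose TEXT, PRIMARY KEY (name))"
)
conn.executemany(
    "INSERT INTO modules VALUES (?, ?, ?)",
    [
        ("box", 1432, "Database Management"),
        ("clock", 188, "Seconds"),
        ("crypto", 4, "Cryptography"),
    ],
)
conn.execute("INSERT INTO submodules VALUES ('space', 'box', 10000, 'insert etc.')")

# USING (name) compares modules.name with submodules.name: no module is named 'space'.
using_rows = conn.execute(
    "SELECT * FROM modules JOIN submodules USING (name)"
).fetchall()

# NATURAL JOIN compares every column the tables share: name, size, and purpose.
cursor = conn.execute("SELECT * FROM modules NATURAL JOIN submodules")
natural_columns = [column[0] for column in cursor.description]
natural_rows = cursor.fetchall()

print(using_rows, natural_rows, sorted(natural_columns))
# [] [] ['module_name', 'name', 'purpose', 'size']
```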

Select with left join

Now what if there is a desire to join modules to submodules, but it’s necessary to be sure that all the modules are found? In other words, suppose the requirement is to get modules even if the condition submodules.module_name = modules.name is not true, because the module has no submodules.

When that is the requirement, the type of join is an “outer join” (as opposed to the type that has been used so far which is an “inner join”). Specifically the format will be LEFT [OUTER] JOIN because the main table, modules, is on the left. For example:

SELECT *
FROM modules LEFT JOIN submodules
ON modules.name = submodules.module_name;

which returns:

{ columns from modules table }         { columns from submodules table }
+--------+------+---------------------+-------+-------------+-------+-------------+
| NAME   | SIZE | PURPOSE             | NAME  | MODULE_NAME | SIZE  | PURPOSE     |
+--------+------+---------------------+-------+-------------+-------+-------------+
| box    | 1432 | Database Management | space | box         | 10000 | insert etc. |
| clock  |  188 | Seconds             | NULL  | NULL        | NULL  | NULL        |
| crypto |    4 | Cryptography        | NULL  | NULL        | NULL  | NULL        |
+--------+------+---------------------+-------+-------------+-------+-------------+

Thus, for the submodules of the clock module and the submodules of the crypto module – which do not exist – there are NULLs in every column.
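The difference between the inner and the outer join can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption made for runnability), with the same sample rows, and selects only the two name columns to keep the output short. SQL NULL appears as None in Python.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE modules (name TEXT PRIMARY KEY, size INTEGER, purpose TEXT);
CREATE TABLE submodules (name TEXT PRIMARY KEY, module_name TEXT,
                         size INTEGER, purpose TEXT);
INSERT INTO modules VALUES ('box', 1432, 'Database Management'),
                           ('clock', 188, 'Seconds'),
                           ('crypto', 4, 'Cryptography');
INSERT INTO submodules VALUES ('space', 'box', 10000, 'insert etc.');
""")

# Inner join: only modules that have at least one submodule.
inner = conn.execute("""
    SELECT modules.name, submodules.name
    FROM modules JOIN submodules
    ON modules.name = submodules.module_name
    ORDER BY modules.name
""").fetchall()

# Left outer join: every module, padded with NULL when nothing matches.
rows = conn.execute("""
    SELECT modules.name, submodules.name
    FROM modules LEFT JOIN submodules
    ON modules.name = submodules.module_name
    ORDER BY modules.name
""").fetchall()

print(inner)  # [('box', 'space')]
print(rows)   # [('box', 'space'), ('clock', None), ('crypto', None)]
```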

Select with functions

A function can take any expression, including an expression that contains another function, and return a scalar value. There are many such functions. Only one is described here: SUBSTR, which returns a substring of a string.

Format: SUBSTR(input-string, start-with [, length])

Description: SUBSTR takes input-string, eliminates any characters before start-with, eliminates any characters after (start-with plus length), and returns the result.

Example: SUBSTR('abcdef', 2, 3) returns ‘bcd’.
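The description above can be modeled in a few lines of Python. This is only a sketch of the documented behavior for a start position of 1 or more and a non-negative length; the helper name substr is hypothetical, and other argument values are deliberately not modeled.

```python
def substr(input_string, start_with, length=None):
    """Model SUBSTR for start_with >= 1; positions are counted from 1."""
    if start_with < 1:
        raise ValueError("this sketch only models start_with >= 1")
    if length is not None and length < 0:
        raise ValueError("this sketch only models non-negative lengths")
    begin = start_with - 1               # 1-based position -> 0-based index
    if length is None:
        return input_string[begin:]      # no length: keep everything to the end
    return input_string[begin:begin + length]


print(substr("abcdef", 2, 3))  # bcd
print(substr("abcdef", 4))     # def
```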

Select with aggregation, GROUP BY, and HAVING

Remember that the modules table looks like this:

MODULES

+-----------------+------+---------------------+
| NAME            | SIZE | PURPOSE             |
+-----------------+------+---------------------+
| box             | 1432 | Database Management |
| clock           |  188 | Seconds             |
| crypto          |    4 | Cryptography        |
+-----------------+------+---------------------+

Suppose that there is no need to know the individual size values and that only their aggregation matters, that is, attributes of the collection as a whole. SQL provides aggregate functions including AVG (average), SUM, MIN (minimum), MAX (maximum), and COUNT. For example:

SELECT AVG(size), SUM(size), MIN(size), MAX(size), COUNT(size) FROM modules;

The result will look like this:

+-----------+-----------+-----------+-----------+-----------+
| COLUMN_1  | COLUMN_2  | COLUMN_3  | COLUMN_4  | COLUMN_5  |
+-----------+-----------+-----------+-----------+-----------+
|       541 |      1624 |         4 |      1432 |         3 |
+-----------+-----------+-----------+-----------+-----------+

Suppose that the requirement is aggregations, but aggregations of rows that have some common characteristic. Supposing further, the rows should be divided into two groups, the ones whose names begin with ‘b’ and the ones whose names begin with ‘c’. This can be done by adding a clause [GROUP BY expression]. For example,

SELECT SUBSTR(name, 1, 1), AVG(size), SUM(size), MIN(size), MAX(size), COUNT(size)
FROM modules
GROUP BY SUBSTR(name, 1, 1);

The result will look like this:

+------------+--------------+-----------+-----------+-----------+-------------+
| COLUMN_1   | COLUMN_2     | COLUMN_3  | COLUMN_4  | COLUMN_5  | COLUMN_6    |
+------------+--------------+-----------+-----------+-----------+-------------+
| b          |         1432 |      1432 |      1432 |      1432 |           1 |
| c          |           96 |       192 |         4 |       188 |           2 |
+------------+--------------+-----------+-----------+-----------+-------------+
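The grouping can be reproduced, and then narrowed with a HAVING clause, in the following sketch. HAVING filters groups after aggregation in the same way that WHERE filters rows before it. The sketch runs on SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption). AVG is left out because SQLite returns a floating-point average, which would not match the integer values in the tables above.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT PRIMARY KEY, size INTEGER, purpose TEXT)")
conn.executemany("INSERT INTO modules VALUES (?, ?, ?)", [
    ("box", 1432, "Database Management"),
    ("clock", 188, "Seconds"),
    ("crypto", 4, "Cryptography"),
])

# One output row per distinct first letter.
grouped = conn.execute("""
    SELECT SUBSTR(name, 1, 1), SUM(size), MIN(size), MAX(size), COUNT(size)
    FROM modules
    GROUP BY SUBSTR(name, 1, 1)
    ORDER BY 1
""").fetchall()

# HAVING keeps only the groups that contain more than one row.
filtered = conn.execute("""
    SELECT SUBSTR(name, 1, 1), SUM(size), COUNT(size)
    FROM modules
    GROUP BY SUBSTR(name, 1, 1)
    HAVING COUNT(size) > 1
""").fetchall()

print(grouped)   # [('b', 1432, 1432, 1432, 1), ('c', 192, 4, 188, 2)]
print(filtered)  # [('c', 192, 2)]
```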

Select with common table expression

It is possible to define a temporary (viewed) table within a statement, usually within a SELECT statement, using a WITH clause. For example:

WITH tmp_table AS (SELECT x1 FROM t1) SELECT * FROM tmp_table;

Select with order, limit, and offset clauses

So far, for every search in the modules table, the rows have come out in alphabetical order by name: ‘box’, then ‘clock’, then ‘crypto’. However, to really be sure about the order, or to ask for a different order, it is necessary to be explicit and add a clause: ORDER BY column-name [ASC|DESC]. (ASC stands for ASCending, DESC stands for DESCending.) For example:

SELECT * FROM modules ORDER BY name DESC;

The result will be the usual rows, in descending alphabetical order: ‘crypto’ then ‘clock’ then ‘box’.

After the ORDER BY clause there can be a clause LIMIT n, where n is the maximum number of rows to retrieve. For example:

SELECT * FROM modules ORDER BY name DESC LIMIT 2;

The result will be the first two rows, ‘crypto’ and ‘clock’.

After the ORDER BY clause and the LIMIT clause there can be a clause OFFSET n, where n is the row to start with. The first offset is 0. For example:

SELECT * FROM modules ORDER BY name DESC LIMIT 2 OFFSET 2;

The result will be the third row, ‘box’.
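The three clauses can be tried together in a short sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Tarantool; the helper names is hypothetical and only collects the first column of each row.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE modules (name TEXT PRIMARY KEY, size INTEGER, purpose TEXT)")
conn.executemany("INSERT INTO modules VALUES (?, ?, ?)", [
    ("box", 1432, "Database Management"),
    ("clock", 188, "Seconds"),
    ("crypto", 4, "Cryptography"),
])


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


descending = names("SELECT name FROM modules ORDER BY name DESC")
first_two = names("SELECT name FROM modules ORDER BY name DESC LIMIT 2")
third = names("SELECT name FROM modules ORDER BY name DESC LIMIT 2 OFFSET 2")

print(descending)  # ['crypto', 'clock', 'box']
print(first_two)   # ['crypto', 'clock']
print(third)       # ['box']
```

OFFSET 2 skips two rows, so LIMIT 2 has only one row left to return.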

Views

A view is a canned SELECT. If you have a complex SELECT that you want to run frequently, create a view and then do a simple SELECT on the view. For example:

CREATE VIEW v AS SELECT size, (size * 5) AS size_times_5
FROM modules
GROUP BY size, name
ORDER BY size_times_5;
SELECT * FROM v;

Transactions

Tarantool has a “Write Ahead Log” (WAL). Effects of data-change statements are logged before they are permanently stored on disk. This is a reason that, although entire databases can be stored in temporary memory, they are not vulnerable in case of power failure.

Tarantool supports commits and rollbacks. In effect, asking for a commit means asking for all the recent data-change statements, since a transaction began, to become permanent. In effect, asking for a rollback means asking for all the recent data-change statements, since a transaction began, to be cancelled.

For example, consider these statements:

CREATE TABLE things (remark STRING, PRIMARY KEY (remark));
START TRANSACTION;
INSERT INTO things VALUES ('A');
COMMIT;
START TRANSACTION;
INSERT INTO things VALUES ('B');
ROLLBACK;
SELECT * FROM things;

The result will be: one row, containing ‘A’. The ROLLBACK cancelled the second INSERT statement, but did not cancel the first one, because it had already been committed.

Ordinarily every statement is automatically committed.

After START TRANSACTION, statements are not automatically committed – Tarantool considers that a transaction is now “active”, until the transaction ends with a COMMIT statement or a ROLLBACK statement. While a transaction is active, all statements are legal except another START TRANSACTION.
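The commit and rollback behavior can be sketched with SQLite through Python's sqlite3 module. This is a stand-in, not Tarantool, and two differences are assumptions to keep in mind: SQLite spells the statement BEGIN rather than START TRANSACTION, and the Python driver must be told (isolation_level=None) not to open transactions on its own.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
# isolation_level=None: statements are committed automatically unless BEGIN is issued.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE things (remark TEXT PRIMARY KEY)")

conn.execute("BEGIN")      # SQLite spelling of START TRANSACTION
conn.execute("INSERT INTO things VALUES ('A')")
conn.execute("COMMIT")     # 'A' becomes permanent

conn.execute("BEGIN")
conn.execute("INSERT INTO things VALUES ('B')")
conn.execute("ROLLBACK")   # 'B' is cancelled

remaining = conn.execute("SELECT remark FROM things").fetchall()
print(remaining)  # [('A',)]

# Starting a transaction while another one is active is rejected.
conn.execute("BEGIN")
try:
    conn.execute("BEGIN")
    nested_allowed = True
except sqlite3.OperationalError:
    nested_allowed = False
conn.execute("ROLLBACK")
print(nested_allowed)  # False
```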

Implementing Tarantool’s SQL On Top of NoSQL

Tarantool’s SQL data is the same as Tarantool’s NoSQL data. When you create a table or an index with SQL, you are creating a space or an index in NoSQL. For example:

CREATE TABLE things (remark STRING, PRIMARY KEY (remark));
INSERT INTO things VALUES ('X');

is somewhat similar to

box.schema.space.create('THINGS',
{
    format = {
              [1] = {["name"] = "REMARK", ["type"] = "string"}
              }
})
box.space.THINGS:create_index('pk_unnamed_THINGS_1',{unique=true,parts={1,'string'}})
box.space.THINGS:insert{'X'}

Therefore you can take advantage of Tarantool’s NoSQL features even though your primary language is SQL. Here are some possibilities.

(1) NoSQL applications written in one of the connector languages may be slightly faster than SQL applications because SQL statements may require more parsing and may be translated to NoSQL requests.

(2) You can write stored procedures in Lua, combining Lua loop-control and Lua library-access statements with SQL statements. These routines are executed on the server, which is the principal advantage of pure-SQL stored procedures.

(3) There are some options that are implemented in NoSQL that are not (yet) implemented in SQL. For example you can use NoSQL to change an index option, and to deny access to users named ‘guest’.

(4) System spaces such as _space and _index can be accessed with SQL SELECT statements. This is not quite the same as an information_schema, but it does mean that you can use SQL to access the database’s metadata catalog.

Fields in NoSQL spaces can be accessed with SQL if and only if they are scalar and are defined in format clauses. Indexes of NoSQL spaces will be used with SQL if and only if they are TREE indexes.

Relational databases

Edgar F. Codd, the person most responsible for researching and explaining relational database concepts, listed the main criteria in what are known as Codd’s 12 rules.

Although Tarantool is not advertised as “relational”, Tarantool comes with a claim that it complies with these rules, with the following caveats and exceptions …

The rules state that all data must be viewable as relations. A Tarantool SQL table is a relation. However, it is possible to have duplicate values in SQL tables and it is possible to have an implicit ordering. Those characteristics are not allowed for true relations.

The rules state that there must be a dynamic online catalog. Tarantool has one but some metadata is missing from it.

The rules state that the data language must support authorization. Tarantool’s SQL does not. Authorization occurs via NoSQL requests.

The rules require that data must be physically independent (from underlying storage changes) and logically independent (from application program changes). So far there is not enough experience to make this guarantee.

The rules require certain types of updatable views. Tarantool’s views are not updatable.

The rules state that it should be impossible to use a low-level language to bypass integrity as defined in the relational-level language. In Tarantool’s case, this is not true. For example, one can execute a NoSQL request that violates a foreign-key constraint that was defined with Tarantool’s SQL.

To learn more about SQL in Tarantool, check the reference.

SQL tutorial

This tutorial is a demonstration of the support for SQL in Tarantool. It includes the functionality that you’d encounter in an “SQL-101” course.

Prerequisites

Before starting this tutorial:

Install the tt CLI utility.

Start a Tarantool instance in the interactive mode by running tt run -i:

$ tt run -i
Tarantool 3.0.0-0-g6ba34da7f8
type 'help' for interactive help
tarantool>

Initialize the instance and switch the input language to SQL:

tarantool> box.cfg{}
tarantool> \set language sql
tarantool> \set delimiter ;

Now you have a running Tarantool instance that accepts SQL input.

Create a table and execute SQL statements

CREATE, INSERT, UPDATE, SELECT

To get started, enter these SQL statements:

CREATE TABLE table1 (column1 INTEGER PRIMARY KEY, column2 VARCHAR(100));
INSERT INTO table1 VALUES (1, 'A');
UPDATE table1 SET column2 = 'B';
SELECT * FROM table1 WHERE column1 = 1;

The result of the SELECT statement looks like this:

sql_tutorial:instance001> SELECT * FROM table1 WHERE column1 = 1;
---
- metadata:
  - name: COLUMN1
    type: integer
  - name: COLUMN2
    type: string
  rows:
  - [1, 'B']
...

The result includes:

metadata: the names and data types of each column
result rows

For conciseness, metadata is skipped in query results in this tutorial. Only the result rows are shown.

CREATE TABLE

Here is CREATE TABLE with more details:

There are multiple columns, with different data types.
There is a PRIMARY KEY (unique and not-null) for two of the columns.

Create another table:

CREATE TABLE table2 (column1 INTEGER,
                     column2 VARCHAR(100),
                     column3 SCALAR,
                     column4 DOUBLE,
                     PRIMARY KEY (column1, column2));

The result is: row_count: 1.

INSERT

Put four rows in the table (table2):

The INTEGER and DOUBLE columns get numbers
The VARCHAR and SCALAR columns get strings (the SCALAR strings are expressed as hexadecimals)

INSERT INTO table2 VALUES (1, 'AB', X'4142', 5.5);
INSERT INTO table2 VALUES (1, 'CD', X'2020', 1E4);
INSERT INTO table2 VALUES (2, 'AB', X'2020', 12.34567);
INSERT INTO table2 VALUES (-1000, '', X'', 0.0);

Then try to put another row:

INSERT INTO table2 VALUES (1, 'AB', X'A5', -5.5);

This INSERT fails because of a primary-key violation: the row with the primary key 1, 'AB' already exists.

The SEQSCAN keyword

A sequential scan reads all the rows of a table instead of using an index. In Tarantool, SQL SELECT queries that perform sequential scans are prohibited by default. For example, this query leads to the error Scanning is not allowed for 'TABLE2':

SELECT * FROM table2;

To execute a scan query, put the SEQSCAN keyword before the table name:

SELECT * FROM SEQSCAN table2;

Try to execute these queries that use indexed column1 in filters:

SELECT * FROM table2 WHERE column1 = 1;
SELECT * FROM table2 WHERE column1 + 1 = 2;

The result is:

The first query returns rows:

- - [1, 'AB', 'AB', 5.5]
  - [1, 'CD', '  ', 10000]

The second query fails with the error Scanning is not allowed for 'TABLE2'. Although column1 is indexed, the expression column1 + 1 is not calculated from the index, which makes this SELECT a scan query.
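Why the second filter cannot use the index can be illustrated with a query planner. The sketch below uses SQLite through Python's sqlite3 module, which is only a stand-in: SQLite has no SEQSCAN keyword and never refuses a scan, so its EXPLAIN QUERY PLAN output serves here only to show which filter can be answered from the primary-key index. The helper plan is hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table2 (column1 INTEGER,
                         column2 VARCHAR(100),
                         column3 SCALAR,
                         column4 DOUBLE,
                         PRIMARY KEY (column1, column2))
""")


def plan(sql):
    """Return the planner's one-line description of how the query would run."""
    return conn.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[3]


indexed = plan("SELECT * FROM table2 WHERE column1 = 1")
computed = plan("SELECT * FROM table2 WHERE column1 + 1 = 2")

print(indexed)   # begins with SEARCH: the key value is looked up in the index
print(computed)  # begins with SCAN: the expression is evaluated for every row
```

Rewriting the filter so that the indexed column stands alone on one side of the comparison (column1 = 1) lets the index be used again.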

Note

To enable SQL scan queries without SEQSCAN for the current session, run this command:

SET SESSION "sql_seq_scan" = true;

Learn more about using SEQSCAN in the SQL FROM clause description.

SELECT with ORDER BY clause

Retrieve the 4 rows in the table, in descending order by column2, then (where the column2 values are the same) in ascending order by column4.

* is short for “all columns”.

SELECT * FROM SEQSCAN table2 ORDER BY column2 DESC, column4 ASC;

The result is:

- - [1, 'CD', '  ', 10000]
  - [1, 'AB', 'AB', 5.5]
  - [2, 'AB', '  ', 12.34567]
  - [-1000, '', '', 0]

SELECT with WHERE clauses

Retrieve some of what you inserted:

The first statement uses the LIKE comparison operator which is asking for “first character must be ‘A’, the next characters can be anything.”
The second statement uses logical operators and parentheses, so the AND expressions must be true, or the OR expression must be true. Notice the columns don’t have to be indexed.

SELECT column1, column2, column1 * column4 FROM SEQSCAN table2 WHERE column2
LIKE 'A%';
SELECT column1, column2, column3, column4 FROM SEQSCAN table2
    WHERE (column1 < 2 AND column4 < 10)
    OR column3 = X'2020';

The first result is:

- - [1, 'AB', 5.5]
  - [2, 'AB', 24.69134]

The second result is:

- - [-1000, '', '', 0]
  - [1, 'AB', 'AB', 5.5]
  - [1, 'CD', '  ', 10000]
  - [2, 'AB', '  ', 12.34567]

SELECT with GROUP BY and aggregate functions

Retrieve with grouping.

The rows that have the same values for column2 are grouped and are aggregated – summed, counted, averaged – for column4.

SELECT column2, SUM(column4), COUNT(column4), AVG(column4)
FROM SEQSCAN table2
GROUP BY column2;

The result is:

- - ['', 0, 1, 0]
  - ['AB', 17.84567, 2, 8.922835]
  - ['CD', 10000, 1, 10000]

Complications and complex SELECTs

NULLs

Insert rows that contain NULL values.

NULL is not the same as Lua nil; it commonly is used in SQL for unknown or not-applicable.

INSERT INTO table2 VALUES (1, NULL, X'4142', 5.5);
INSERT INTO table2 VALUES (0, '!!@', NULL, NULL);
INSERT INTO table2 VALUES (0, '!!!', X'00', NULL);

The results are:

The first INSERT fails because NULL is not permitted for a column that was defined with a PRIMARY KEY clause.
The other INSERT statements succeed.

Indexes

Create a new index on column4.

There already is an index for the primary key. Indexes are useful for making queries faster. In this case, the index also acts as a constraint, because it prevents two rows from having the same values in column4. However, it is not an error that column4 has multiple occurrences of NULLs.

CREATE UNIQUE INDEX i ON table2 (column4);

The result is: row_count: 1.
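The way a unique index treats NULLs can be sketched as follows. The example runs on SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption), with a hypothetical two-column table t instead of table2.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, column4 DOUBLE)")
conn.execute("CREATE UNIQUE INDEX i ON t (column4)")

conn.execute("INSERT INTO t VALUES (1, 5.5)")
conn.execute("INSERT INTO t VALUES (2, NULL)")
conn.execute("INSERT INTO t VALUES (3, NULL)")  # accepted: NULLs are not duplicates

try:
    conn.execute("INSERT INTO t VALUES (4, 5.5)")  # 5.5 is already present
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM t WHERE column4 IS NULL").fetchone()[0]

print(duplicate_rejected)  # True
print(null_count)          # 2
```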

Create a subset table

Create a table table3, which contains a subset of the table2 columns and a subset of the table2 rows.

You can do this by combining INSERT with SELECT. Then select everything from the result table.

CREATE TABLE table3 (column1 INTEGER, column2 VARCHAR(100), PRIMARY KEY
(column2));
INSERT INTO table3 SELECT column1, column2 FROM SEQSCAN table2 WHERE column1 <> 2;
SELECT * FROM SEQSCAN table3;

The result is:

- - [-1000, '']
  - [0, '!!!']
  - [0, '!!@']
  - [1, 'AB']
  - [1, 'CD']

SELECT with a subquery

A subquery is a query within a query.

Find all the rows in table2 whose (column1, column2) values are not present in table3.

SELECT * FROM SEQSCAN table2 WHERE (column1, column2) NOT IN (SELECT column1,
column2 FROM SEQSCAN table3);

The result is the single row that was excluded when inserting the rows with the INSERT ... SELECT statement:

- - [2, 'AB', '  ', 12.34567]

SELECT with a join

A join is a combination of two tables. There is more than one way to do them in Tarantool, for example, “Cartesian joins” or “left outer joins”.

This example shows the most typical case, where column values from one table match column values from another table.

SELECT * FROM SEQSCAN table2, table3
    WHERE table2.column1 = table3.column1 AND table2.column2 = table3.column2
    ORDER BY table2.column4;

The result is:

- - [0, '!!!', "\0", null, 0, '!!!']
  - [0, '!!@', null, null, 0, '!!@']
  - [-1000, '', '', 0, -1000, '']
  - [1, 'AB', 'AB', 5.5, 1, 'AB']
  - [1, 'CD', '  ', 10000, 1, 'CD']

Constraints and foreign keys

CREATE TABLE with a CHECK clause

Create a table that includes a constraint – there must not be any rows containing 13 in column2. After that, try to insert the following row:

CREATE TABLE table4 (column1 INTEGER PRIMARY KEY, column2 INTEGER, CHECK
(column2 <> 13));
INSERT INTO table4 VALUES (12, 13);

Result: the insert fails, as it should, with the message Check constraint 'ck_unnamed_TABLE4_1' failed for tuple.
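The same constraint can be tried in a self-contained sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption), so the error text differs from the message quoted above; the helper try_insert is hypothetical.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table4 (column1 INTEGER PRIMARY KEY,
                         column2 INTEGER,
                         CHECK (column2 <> 13))
""")


def try_insert(column1, column2):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO table4 VALUES (?, ?)", (column1, column2))
        return True
    except sqlite3.IntegrityError:
        return False


rejected_row = try_insert(12, 13)  # violates CHECK (column2 <> 13)
accepted_row = try_insert(12, 14)  # satisfies the constraint

print(rejected_row)  # False
print(accepted_row)  # True
```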

CREATE TABLE with a FOREIGN KEY clause

Create a table that includes a constraint: there must not be any rows containing values that do not appear in table2.

CREATE TABLE table5 (column1 INTEGER, column2 VARCHAR(100),
    PRIMARY KEY (column1),
    FOREIGN KEY (column1, column2) REFERENCES table2 (column1, column2));
INSERT INTO table5 VALUES (2,'AB');
INSERT INTO table5 VALUES (3,'AB');

Result:

The first INSERT statement succeeds because table2 contains a row with [2, 'AB', '  ', 12.34567].
The second INSERT statement, correctly, fails with the message Foreign key constraint 'fk_unnamed_TABLE5_1' failed: foreign tuple was not found.
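A foreign key works in both directions: it rejects child rows without a parent, and it protects parent rows that are still referenced. The sketch below shows both, using SQLite through Python's sqlite3 module as a stand-in for Tarantool. Two assumptions apply: SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and table2 is reduced to its two key columns. The helper try_sql is hypothetical.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.execute("""
    CREATE TABLE table2 (column1 INTEGER, column2 VARCHAR(100),
                         PRIMARY KEY (column1, column2))
""")
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(1, "AB"), (1, "CD"), (2, "AB")])
conn.execute("""
    CREATE TABLE table5 (column1 INTEGER, column2 VARCHAR(100),
        PRIMARY KEY (column1),
        FOREIGN KEY (column1, column2) REFERENCES table2 (column1, column2))
""")


def try_sql(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


first_insert = try_sql("INSERT INTO table5 VALUES (2, 'AB')")     # parent row exists
second_insert = try_sql("INSERT INTO table5 VALUES (3, 'AB')")    # no parent row
delete_parent = try_sql("DELETE FROM table2 WHERE column1 = 2")   # still referenced

print(first_insert, second_insert, delete_parent)  # True False False
```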

UPDATE

Due to earlier INSERT statements, these values are in column4 of table2: {0, NULL, NULL, 5.5, 10000, 12.34567}. Add 5 to each of these values except 0. Adding 5 to NULL results in NULL, as SQL arithmetic requires. Use SELECT to see what happened to column4.

UPDATE table2 SET column4 = column4 + 5 WHERE column4 <> 0;
SELECT column4 FROM SEQSCAN table2 ORDER BY column4;

The result is: {NULL, NULL, 0, 10.5, 17.34567, 10005}.
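The NULL handling in this UPDATE can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption) and a hypothetical one-value table t that holds only the column4 values listed above. Note that the NULL rows are skipped twice over: NULL <> 0 is not true, so the WHERE clause does not select them, and NULL + 5 would be NULL anyway.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, column4 DOUBLE)")
conn.executemany("INSERT INTO t (column4) VALUES (?)",
                 [(0.0,), (None,), (None,), (5.5,), (10000.0,), (12.34567,)])

# Arithmetic with NULL yields NULL (None in Python).
null_plus_five = conn.execute("SELECT NULL + 5").fetchone()[0]
print(null_plus_five)  # None

conn.execute("UPDATE t SET column4 = column4 + 5 WHERE column4 <> 0")
values = [row[0] for row in
          conn.execute("SELECT column4 FROM t ORDER BY column4")]

# NULLs sort first, the 0 row is untouched, the other rows grew by 5.
print(values)
```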

DELETE

Due to earlier INSERT statements, there are 6 rows in table2:

- - [-1000, '', '', 0]
  - [0, '!!!', "\0", null]
  - [0, '!!@', null, null]
  - [1, 'AB', 'AB', 10.5]
  - [1, 'CD', '  ', 10005]
  - [2, 'AB', '  ', 17.34567]

Try to delete the last and first of these rows:

DELETE FROM table2 WHERE column1 = 2;
DELETE FROM table2 WHERE column1 = -1000;
SELECT COUNT(column1) FROM SEQSCAN table2;

The result is:

The first DELETE statement causes an error because there’s a foreign-key constraint.
The second DELETE statement succeeds.
The SELECT statement shows that there are 5 rows remaining.

ALTER TABLE with a FOREIGN KEY clause

Create another constraint that there must not be any rows in table1 containing values that do not appear in table5. This was impossible during the table1 creation because at that time table5 did not exist. You can add constraints to existing tables with the ALTER TABLE statement.

ALTER TABLE table1 ADD CONSTRAINT c
    FOREIGN KEY (column1) REFERENCES table5 (column1);
DELETE FROM table1;
ALTER TABLE table1 ADD CONSTRAINT c
    FOREIGN KEY (column1) REFERENCES table5 (column1);

Result: the ALTER TABLE statement fails the first time because there is a row in table1, and ADD CONSTRAINT requires that the table be empty. After the row is deleted, the ALTER TABLE statement completes successfully. Now there is a chain of references, from table1 to table5 and from table5 to table2.

Triggers

The idea of a trigger is: if a change (INSERT or UPDATE or DELETE) happens, then a further action – perhaps another INSERT or UPDATE or DELETE – will happen.

Set up the following trigger: when an update to table3 is done, do an update to table2. Specify this as FOR EACH ROW, so that the trigger activates 5 times (since there are 5 rows in table3).

SELECT column4 FROM table2 WHERE column1 = 2;
CREATE TRIGGER tr AFTER UPDATE ON table3 FOR EACH ROW
BEGIN UPDATE table2 SET column4 = column4 + 1 WHERE column1 = 2; END;
UPDATE table3 SET column2 = column2;
SELECT column4 FROM table2 WHERE column1 = 2;

Result:

The first SELECT shows that the original value of column4 in table2 where column1 = 2 was: 17.34567.
The second SELECT returns:

- - [22.34567]
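That the trigger fires once per updated row, even when the new value equals the old one, can be confirmed with the following sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption), reduces table2 to two columns, and starts from 17.5 instead of 17.34567 so that the floating-point result is exact.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (column1 INTEGER PRIMARY KEY, column4 DOUBLE)")
conn.execute("CREATE TABLE table3 (column1 INTEGER, "
             "column2 VARCHAR(100) PRIMARY KEY)")
conn.execute("INSERT INTO table2 VALUES (2, 17.5)")  # simplified starting value
conn.executemany("INSERT INTO table3 VALUES (?, ?)",
                 [(-1000, ""), (0, "!!!"), (0, "!!@"), (1, "AB"), (1, "CD")])

conn.execute("""
    CREATE TRIGGER tr AFTER UPDATE ON table3 FOR EACH ROW
    BEGIN UPDATE table2 SET column4 = column4 + 1 WHERE column1 = 2; END
""")

before = conn.execute(
    "SELECT column4 FROM table2 WHERE column1 = 2").fetchone()[0]
conn.execute("UPDATE table3 SET column2 = column2")  # touches all 5 rows
after = conn.execute(
    "SELECT column4 FROM table2 WHERE column1 = 2").fetchone()[0]

print(before)  # 17.5
print(after)   # 22.5
```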

Operators and functions

String operations

You can manipulate string data (usually defined with CHAR or VARCHAR data types) in many ways. For example:

concatenate strings with the || operator
extract substrings with the SUBSTR function

SELECT column2, column2 || column2, SUBSTR(column2, 2, 1) FROM SEQSCAN table2;

The result is:

- - ['!!!', '!!!!!!', '!']
  - ['!!@', '!!@!!@', '!']
  - ['AB', 'ABAB', 'B']
  - ['CD', 'CDCD', 'D']
  - ['AB', 'ABAB', 'B']

Number operations

You can also manipulate number data (usually defined with INTEGER or DOUBLE data types) in many ways. For example:

shift left with the << operator
get modulo with the % operator

SELECT column1, column1 << 1, column1 << 2, column1 % 2 FROM SEQSCAN table2;

The result is:

- - [0, 0, 0, 0]
  - [0, 0, 0, 0]
  - [1, 2, 4, 1]
  - [1, 2, 4, 1]
  - [2, 4, 8, 0]

Ranges and limits

Tarantool can handle:

integers anywhere in the 4-byte integer range
approximate-numerics anywhere in the 8-byte IEEE floating point range
any Unicode characters, with UTF-8 encoding and a choice of collations

Insert such values in a new table and see what happens when you select them with arithmetic on a number column and ordering by a string column.

CREATE TABLE t6 (column1 INTEGER, column2 VARCHAR(10), column4 DOUBLE,
PRIMARY KEY (column1));
INSERT INTO t6 VALUES (-1234567890, 'АБВГД', 123456.123456);
INSERT INTO t6 VALUES (+1234567890, 'GD', 1e30);
INSERT INTO t6 VALUES (10, 'FADEW?', 0.000001);
INSERT INTO t6 VALUES (5, 'ABCDEFG', NULL);
SELECT column1 + 1, column2, column4 * 2 FROM SEQSCAN t6 ORDER BY column2;

The result is:

- - [6, 'ABCDEFG', null]
  - [11, 'FADEW?', 2e-06]
  - [1234567891, 'GD', 2e+30]
  - [-1234567889, 'АБВГД', 246912.246912]

Views

A view (or viewed table) is virtual: its rows aren’t physically stored in the database, and their values are calculated from other tables.

Create a view v3 based on t6 and select from it:

CREATE VIEW v3 AS SELECT SUBSTR(column2,1,2), column4 FROM SEQSCAN t6
WHERE column4 >= 0;
SELECT * FROM v3;

The result is:

- - ['АБ', 123456.123456]
  - ['FA', 1e-06]
  - ['GD', 1e+30]

Common table expressions

By putting WITH + SELECT in front of a SELECT, you can make a temporary view that lasts for the duration of the statement.

Create such a view and select from it:

WITH cte AS (
             SELECT SUBSTR(column2,1,2), column4 FROM SEQSCAN t6
             WHERE column4 >= 0)
SELECT * FROM cte;

The result is the same as the CREATE VIEW result:

- - ['АБ', 123456.123456]
  - ['FA', 1e-06]
  - ['GD', 1e+30]
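That a view and a common table expression built from the same SELECT return the same rows can be checked in a short sketch. It uses SQLite through Python's sqlite3 module as a stand-in for Tarantool (an assumption) and adds ORDER BY 1 to both queries so that the comparison does not depend on the storage order.

```python
import sqlite3

# SQLite (Python standard library) is a stand-in here; this is not Tarantool.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t6 (column1 INTEGER PRIMARY KEY, "
             "column2 VARCHAR(10), column4 DOUBLE)")
conn.executemany("INSERT INTO t6 VALUES (?, ?, ?)", [
    (-1234567890, "АБВГД", 123456.123456),
    (1234567890, "GD", 1e30),
    (10, "FADEW?", 0.000001),
    (5, "ABCDEFG", None),   # NULL fails column4 >= 0, so this row is filtered out
])

query = "SELECT SUBSTR(column2, 1, 2), column4 FROM t6 WHERE column4 >= 0"

# A view stores the query under a name until it is dropped.
conn.execute("CREATE VIEW v3 AS " + query)
view_rows = conn.execute("SELECT * FROM v3 ORDER BY 1").fetchall()

# A common table expression names the query for one statement only.
cte_rows = conn.execute(
    "WITH cte AS (" + query + ") SELECT * FROM cte ORDER BY 1").fetchall()

print(view_rows == cte_rows)  # True
print(len(cte_rows))          # 3
```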

VALUES

Tarantool can handle statements like SELECT 55; (select without FROM) like some other popular DBMSs. But it also handles the more standard statement VALUES (expression [, expression ...]);.

SELECT 55 * 55, 'The rain in Spain';
VALUES (55 * 55, 'The rain in Spain');

The result of both these statements is:

- - [3025, 'The rain in Spain']

Metadata

To find out the internal structure of the Tarantool database with SQL, select from the Tarantool system tables _space, _index, and _trigger:

SELECT * FROM SEQSCAN "_space";
SELECT * FROM SEQSCAN "_index";
SELECT * FROM SEQSCAN "_trigger";

Actually, these statements select from NoSQL “system spaces”.

Select from _space by a table name:

SELECT "id", "name", "owner", "engine" FROM "_space" WHERE "name"='TABLE3';

The result is:

- - [517, 'TABLE3', 1, 'memtx']

Using SQL from Lua

You can execute SQL statements directly from the Lua code without switching to the SQL input.

Change the settings so that the console accepts statements written in Lua instead of statements written in SQL:

sql_tutorial:instance001> \set language lua

box.execute()

You can invoke SQL statements using the Lua function box.execute(string).

sql_tutorial:instance001> box.execute([[SELECT * FROM SEQSCAN table3;]]);

The result is:

- - [-1000, '']
  - [0, '!!!']
  - [0, '!!@']
  - [1, 'AB']
  - [1, 'CD']
...

Create a million-row table

To see how the SQL in Tarantool scales, create a bigger table.

The following Lua code generates one million rows with random data and inserts them into a table. Copy this code into the Tarantool console and wait a bit:

box.execute("CREATE TABLE tester (s1 INT PRIMARY KEY, s2 VARCHAR(10))");

function string_function()
    local random_number
    local random_string
    random_string = ""
    for x = 1, 10, 1 do
        random_number = math.random(65, 90)
        random_string = random_string .. string.char(random_number)
    end
    return random_string
end;

function main_function()
    local string_value, sql_statement
    for i = 1, 1000000, 1 do
        string_value = string_function()
        sql_statement = "INSERT INTO tester VALUES (" .. i .. ",'" .. string_value .. "')"
        box.execute(sql_statement)
    end
end;
start_time = os.clock();
main_function();
end_time = os.clock();
print('insert done in ' .. end_time - start_time .. ' seconds');

The result is: you now have a table with a million rows, and a message such as insert done in 88.570578 seconds. The exact time depends on your hardware.

Select from a million-row table

Check how SELECT works on the million-row table:

the first query goes by an index because s1 is the primary key
the second query does not go by an index

box.execute([[SELECT * FROM tester WHERE s1 = 73446;]]);
box.execute([[SELECT * FROM SEQSCAN tester WHERE s2 LIKE 'QFML%';]]);

The result is:

the first statement completes instantaneously
the second statement completes noticeably more slowly

Cleanup and exit

To clean up all the objects created in this tutorial, switch to the SQL input language again. Then run the DROP statements for all created tables, views, and triggers.

These statements must be entered separately.

sql_tutorial:instance001> \set language sql
sql_tutorial:instance001> DROP TABLE tester;
sql_tutorial:instance001> DROP TABLE table1;
sql_tutorial:instance001> DROP VIEW v3;
sql_tutorial:instance001> DROP TRIGGER tr;
sql_tutorial:instance001> DROP TABLE table5;
sql_tutorial:instance001> DROP TABLE table4;
sql_tutorial:instance001> DROP TABLE table3;
sql_tutorial:instance001> DROP TABLE table2;
sql_tutorial:instance001> DROP TABLE t6;
sql_tutorial:instance001> \set language lua
sql_tutorial:instance001> os.exit();

Improving MySQL with Tarantool

Replicating MySQL is one of Tarantool’s killer features. It allows you to keep your existing MySQL database while accelerating it and scaling it out horizontally. Even if you aren’t interested in extensive expansion, replacing existing replicas with Tarantool can save you money, because Tarantool is more efficient per core than MySQL. To read a testimonial of a company that implemented Tarantool replication on a large scale, see the following article.

If you run into any trouble with the basics of Tarantool, see the Getting started guide or the Data model description. A helpful log for troubleshooting during this tutorial is replicatord.log in /var/log. You can also have a look at the instance’s log example.log in /var/log/tarantool.

The tutorial is intended for CentOS 7.5 and MySQL 5.7. The tutorial requires that systemd and MySQL are installed.

Setting up MySQL

In this section, you configure MySQL and create a database.

First, install the necessary packages in CentOS:

$ yum -y install git ncurses-devel cmake gcc-c++ boost boost-devel wget unzip nano bzip2 mysql-devel mysql-lib

Clone the Tarantool-MySQL replication package from GitHub:

$ git clone https://github.com/tarantool/mysql-tarantool-replication.git

Build the replicator with cmake:

$ cd mysql-tarantool-replication
$ git submodule update --init --recursive
$ cmake .
$ make

The replicator will run as a systemd daemon called replicatord, so edit its systemd service file (replicatord.service) in the mysql-tarantool-replication repository:
```
$ nano replicatord.service
```
The following line should be changed:
```
ExecStart=/usr/local/sbin/replicatord -c /usr/local/etc/replicatord.cfg
```
To change it, replace the .cfg extension with .yml:
```
ExecStart=/usr/local/sbin/replicatord -c /usr/local/etc/replicatord.yml
```

Next, copy the files from the replicatord repository to other necessary locations:

$ cp replicatord /usr/local/sbin/replicatord
$ cp replicatord.service /etc/systemd/system

Enter the MySQL console and create a sample database (depending on your existing installation, you may be a user other than root):
```
mysql -u root -p
CREATE DATABASE menagerie;
QUIT
```

Get some sample data from MySQL. The data will be pulled into the root directory. After that, install it from the terminal.

cd
wget http://downloads.mysql.com/docs/menagerie-db.zip
unzip menagerie-db.zip
cd menagerie-db
mysql -u root -p menagerie < cr_pet_tbl.sql
mysql -u root -p menagerie < load_pet_tbl.sql
mysql menagerie -u root -p < ins_puff_rec.sql
mysql menagerie -u root -p < cr_event_tbl.sql

Enter the MySQL console and adjust the data for use with the Tarantool replicator. In this step, you:

add an ID
change a field name to avoid conflict
cut down the number of fields

With real data, this is the step that involves the most tweaking.

mysql -u root -p
USE menagerie;
ALTER TABLE pet ADD id INT PRIMARY KEY AUTO_INCREMENT FIRST;
ALTER TABLE pet CHANGE COLUMN `name` `name2` VARCHAR(255);
ALTER TABLE pet DROP sex, DROP birth, DROP death;
QUIT

The sample data is set up. Edit the MySQL configuration file to use it with the replicator:

$ cd
$ nano /etc/my.cnf

Note that your my.cnf for MySQL could be in a slightly different location. Set:

[mysqld]
binlog_format = ROW
server_id = 1
log-bin = mysql-bin
interactive_timeout = 3600
wait_timeout = 3600
max_allowed_packet = 32M
socket = /var/lib/mysql/mysql.sock
bind-address = 127.0.0.1

[client]
socket = /var/lib/mysql/mysql.sock

After exiting nano, restart mysqld:
```
$ systemctl restart mysqld
```

Installing and configuring Tarantool

In this section, you install Tarantool and set up spaces for replication.

Go to the Download page and follow the installation instructions.
Install the tt CLI utility.
Create a new tt environment in the current directory using the tt init command.

In the /etc/tarantool/instances.available/mysql directory, create the tt instance configuration files:

config.yaml – specifies the following configuration

app:
  file: 'myapp.lua'

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

instances.yml – specifies instances to run in the current environment
```
instance001:
```

myapp.lua – contains a Lua script with an application to load

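-- Grant access to the 'guest' user; if_not_exists keeps a restart from failing
-- when the privileges have already been granted.
box.schema.user.grant('guest', 'read,write,execute', 'universe', nil,
        { if_not_exists = true })

-- Create the two target spaces unless they already exist.
local function bootstrap()
    if not box.space.mysqldaemon then
        local s = box.schema.space.create('mysqldaemon')
        s:create_index('primary',
                { type = 'tree', parts = { 1, 'unsigned' }, if_not_exists = true })
    end
    if not box.space.mysqldata then
        local t = box.schema.space.create('mysqldata')
        t:create_index('primary',
                { type = 'tree', parts = { 1, 'unsigned' }, if_not_exists = true })
    end
end
bootstrap()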

For details, see the Configuration section.

Inside the instances.enabled directory of the created tt environment, create a symlink (mysql) to the directory from the previous step:
```
$ ln -s /etc/tarantool/instances.available/mysql mysql
```
Next, start up the Lua program with tt, the Tarantool command-line utility:
```
$ tt start mysql
```
Enter the Tarantool instance:
```
$ tt connect mysql:instance001
```
Check that the target spaces were successfully created:
```
mysql:instance001> box.space._space:select()
```
At the bottom, you will see the mysqldaemon and mysqldata spaces. Then exit with “CTRL+C”.

Setting up the replicator

MySQL and Tarantool are now set up. You can proceed to configure the replicator.

Edit the replicatord.yml file in the main tarantool-mysql-replication directory:
```
nano replicatord.yml
```

Change the entire file as follows. Don’t forget to add your MySQL password and set the appropriate user:

mysql:
    host: 127.0.0.1
    port: 3306
    user: root
    password:
    connect_retry: 15 # seconds

tarantool:
    host: 127.0.0.1:3301
    binlog_pos_space: 512
    binlog_pos_key: 0
    connect_retry: 15 # seconds
    sync_retry: 1000 # milliseconds

mappings:
 - database: menagerie
   table: pet
   columns: [ id, name2, owner, species ]
   space: 513
   key_fields:  [ 0 ]
   # insert_call: function_name
   # update_call: function_name
   # delete_call: function_name

Copy replicatord.yml to the location where systemd looks for it:
```
$ cp replicatord.yml /usr/local/etc/replicatord.yml
```
Next, start up the replicator:
```
$ systemctl start replicatord
```
Enter the Tarantool instance:
```
$ tt connect mysql:instance001
```

Do a select on the mysqldata space. The replicated content from MySQL looks as follows:

mysql:instance001> box.space.mysqldata:select()
---
- - [1, 'Fluffy', 'Harold', 'cat']
  - [2, 'Claws', 'Gwen', 'cat']
  - [3, 'Buffy', 'Harold', 'dog']
  - [4, 'Fang', 'Benny', 'dog']
  - [5, 'Bowser', 'Diane', 'dog']
  - [6, 'Chirpy', 'Gwen', 'bird']
  - [7, 'Whistler', 'Gwen', 'bird']
  - [8, 'Slim', 'Benny', 'snake']
  - [9, 'Puffball', 'Diane', 'hamster']

Testing the replication

In this section, you enter a record into MySQL and check that the record is replicated to Tarantool. To do this:

Exit the Tarantool instance with CTRL-D.

Insert a record into MySQL:

mysql -u root -p
USE menagerie;
INSERT INTO pet(name2, owner, species) VALUES ('Spot', 'Brad', 'dog');
QUIT

In the terminal, enter the Tarantool instance:
```
$ tt connect mysql:instance001
```
To see the replicated data in Tarantool, run the following command:
```
mysql:instance001> box.space.mysqldata:select()
```

Transactions

Transactions allow users to perform multiple operations atomically.

For more information on how transactions work in Tarantool, see the following sections:

Transaction model

Overview

The transaction model of Tarantool conforms to the ACID properties (atomicity, consistency, isolation, durability).

Tarantool has two modes of transaction behavior:

Default – suitable for fast monopolistic atomic transactions
MVCC – designed for long-running concurrent transactions

Each transaction in Tarantool is executed in a single fiber on a single thread, sees a consistent database state and commits all changes atomically.

All transaction changes are written to the WAL (Write Ahead Log) in a single batch in a specific order at the time of the commit. If needed, transaction changes can also be rolled back – completely or to a specified savepoint.

Therefore, every transaction in Tarantool has the highest transaction isolation level – serializable.

Isolation level

By default, the isolation level of Tarantool is serializable. The exception is a failure during writing to the WAL, which can occur, for example, when the disk runs out of space. In this case, the isolation level of the concurrent read transaction would be read-committed.

The MVCC mode provides several options that enable you to tune the visibility behavior during transaction execution.

Read-committed

The read-committed isolation level makes visible all transactions that have started to commit (box.commit() has been called).

Write transactions with reads

Manual usage of read-committed for write transactions with reads is completely safe, as this transaction will eventually result in a commit. If a previous transaction fails, this transaction will inevitably fail as well due to the serializable isolation level.

Read transactions

Manual usage of read-committed for read transactions may be unsafe, as it may lead to phantom reads.

Read-confirmed

The read-confirmed isolation level makes visible all transactions that have finished the commit (box.commit() has returned). This means that new data is already on disk or even on other replicas.

Read transactions

The use of read-confirmed is safe for read transactions given that data is on disk (for asynchronous replication) or even on other replicas (for synchronous replication).

Write transactions

To achieve the serializable isolation level, any write transaction should read all data that has already been committed. Otherwise, it may conflict when it reaches its commit.

Linearizable read

Linearizability of read operations implies that if a response for a write request arrived earlier than a read request was made, this read request should return the results of the write request. When called with linearizable, box.begin() yields until the instance receives enough data from remote peers to be sure that the transaction is linearizable.

Linearizable transactions may only perform requests to the following memtx space types:

synchronous
local (created with is_local = true)
temporary (created with temporary = true)

A linearizable transaction can fail with an error in the following cases:

If the node can’t contact enough remote peers to determine which data is committed.
If the data isn’t received during the timeout specified in box.begin().

Note

To start a linearizable transaction, the node should be the replication source for at least N - Q + 1 remote replicas. Here N is the count of registered nodes in the cluster and Q is replication_synchro_quorum. So, for example, you can’t perform a linearizable transaction on anonymous replicas because they can’t be the source of replication for other nodes.
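
The arithmetic in this note can be checked with a small sketch in plain Lua. The helper name min_downstreams is hypothetical and is not part of Tarantool's API; it only evaluates the N - Q + 1 expression for given values of N and Q.

```lua
-- Hypothetical helper (not a Tarantool API): the minimum number of remote
-- replicas that must replicate from this node before it can start a
-- linearizable transaction, following the N - Q + 1 rule from the note.
local function min_downstreams(n, q)
    assert(q >= 1 and q <= n, 'quorum must be between 1 and N')
    return n - q + 1
end

-- A replica set of 3 registered nodes with replication_synchro_quorum = 2.
print(min_downstreams(3, 2)) -- 2
-- A replica set of 5 registered nodes with a quorum of 3.
print(min_downstreams(5, 3)) -- 3
```

With a majority quorum, the node therefore needs to feed at least a majority of the cluster, counting remote replicas only.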

Best-effort (default)

To minimize the possibility of conflicts, MVCC uses what is called best-effort visibility:

for write transactions, MVCC chooses read-committed
for read transactions, MVCC chooses read-confirmed

This inevitably leads to the serializable isolation level. Since there is no option for MVCC to analyze the whole transaction to make a decision, it makes the choice on the first operation.

Note

If the serializable isolation level becomes unreachable, the transaction is marked as “conflicted” and rolled back.

Thread model

Main threads

The thread model assumes that a query received by Tarantool via network is processed with three operating system threads:

The network thread (or threads) on the server side receives the query, parses the statement, checks if it is correct, and then transforms it into a special structure – a message containing an executable statement and its options.
The network thread sends this message to the instance’s transaction processor thread (TX thread) via a lock-free message bus. Lua programs are executed directly in the transaction processor thread, and do not need to be parsed and prepared.

The TX thread either uses a space index to find and update the tuple, or executes a stored function that performs a data operation.
The execution of the operation results in a message to the write-ahead logging (WAL) thread used to commit the transaction, and the fiber executing the transaction is suspended. When the transaction results in a COMMIT or ROLLBACK, the following actions are taken:
- The WAL thread responds with a message to the TX thread.
- The fiber executing the transaction is resumed to process the result of the transaction.
- The result of the fiber execution is passed to the network thread, and the network thread returns the result to the client.

Note

There is only one TX thread in Tarantool. Some users are used to the idea that there can be multiple threads working on the database. For example, thread #1 reads a row #x while thread #2 writes a row #y. With Tarantool this does not happen. Only the TX thread can access the database, and there is only one TX thread for each Tarantool instance.

The TX thread can handle many fibers. A fiber is a set of computer instructions that can contain “yield” signals. The TX thread executes all computer instructions up to a yield signal, and then switches to execute the instructions of another fiber.

Yields must happen, otherwise the TX thread would be permanently stuck on the same fiber.
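
The switching behavior can be illustrated with plain Lua coroutines. This is only a simplified model under stated assumptions: it does not use Tarantool's fiber module, and the names make_task and run are made up for the example. Each task runs until it yields, and a round-robin loop then resumes the next one.

```lua
-- Simplified model of cooperative multitasking with plain Lua coroutines.
-- This is NOT Tarantool's fiber module; it only shows how a single thread
-- runs each task up to a yield and then switches to the next task.
local log = {}

local function make_task(name, steps)
    return coroutine.create(function()
        for i = 1, steps do
            table.insert(log, name .. ':' .. i)
            coroutine.yield() -- hand control back to the scheduler
        end
    end)
end

-- Round-robin scheduler: resume every live task in turn until all finish.
local function run(tasks)
    local alive = #tasks
    while alive > 0 do
        for _, task in ipairs(tasks) do
            if coroutine.status(task) == 'suspended' then
                assert(coroutine.resume(task))
                if coroutine.status(task) == 'dead' then
                    alive = alive - 1
                end
            end
        end
    end
end

run({ make_task('a', 2), make_task('b', 3) })
print(table.concat(log, ' ')) -- a:1 b:1 a:2 b:2 b:3
```

If a task never reached coroutine.yield(), the loop would never get to resume the other task, which is the situation the sentence above describes.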

Supplementary threads

There are also several supplementary threads that serve additional capabilities:

For replication, Tarantool creates a separate thread for each connected replica. This thread reads a write-ahead log and sends it to the replica, following its position in the log. Separate threads are required because each replica can point to a different position in the log and can run at different speeds.
There is a thread pool for ad hoc asynchronous tasks, such as a DNS resolver or fsync.
There is a thread pool that can be used for parallel sorting (hence, to parallelize building indexes). To configure it, use the memtx.sort_threads configuration option. The option sets the number of threads used to sort keys of secondary indexes on loading a memtx database.

Note

Since 3.0.0, this option replaces the earlier approach in which OpenMP threads were used to parallelize sorting. For backward compatibility, the OMP_NUM_THREADS environment variable is taken into account to set the number of sorting threads.

Transaction mode: default

By default, Tarantool does not allow “yielding” inside a memtx transaction and the transaction manager is disabled. This allows fast atomic transactions without conflicts, but brings some limitations:

You cannot use interactive transactions.
Any fiber yield leads to the abort of a transaction.
All changes are made immediately, but in the event of a yield or error, the transaction is rolled back, including the return of the previous data.

To learn how to enable yielding inside a memtx transaction, see Transaction mode: MVCC.

To switch back to the default mode, disable the transaction manager:

box.cfg { memtx_use_mvcc_engine = false }

Transaction mode: MVCC

Since version 2.6.1, Tarantool has another transaction behavior mode that allows “yielding” inside a memtx transaction. This is controlled by the transaction manager.

This mode allows concurrent transactions but may cause conflicts. You can use this mode on the memtx storage engine. The vinyl storage engine also supports MVCC mode, but has a different implementation.

Note

Currently, you cannot use several different storage engines within one transaction.

Transaction manager

The transaction manager is designed to isolate concurrent transactions and provides a serializable transaction isolation level. It consists of two parts:

MVCC – multiversion concurrency control engine, which stores all change actions of all transactions. It also creates the transaction view of the database state and a read view (a fixed state of the database that is never changed by other transactions) when necessary.
Conflict manager – a manager that tracks changes to transactions and determines their correctness in the serialization order. The conflict manager declares transactions to be in conflict or sends transactions to read views when necessary.

Since version 2.10.1, the conflict manager detects conflicts right after the first one of several conflicting transactions is committed. After this moment, any CRUD operations in the conflicted transaction will result in errors until the transaction is rolled back.

The transaction manager also provides a non-classical snapshot isolation level. Unlike classical snapshot isolation, where a transaction gets a consistent snapshot of the database as of its start, this snapshot is not necessarily tied to the start time of the transaction. The conflict manager decides if and when each transaction gets which snapshot. This avoids some conflicts compared to the classic snapshot isolation approach.

Warning

Currently, the isolation level of BITSET and RTREE indexes in MVCC transaction mode is read-committed (not serializable, as stated). If a transaction uses these indexes, it can read committed or confirmed data (depending on the isolation level). However, the indexes are subject to different anomalies that can make them unserializable.

Enabling the transaction manager

By default, the transaction manager is disabled. Use the memtx_use_mvcc_engine option to enable it via box.cfg.

box.cfg{memtx_use_mvcc_engine = true}

Setting the transaction isolation level

The transaction manager has the following options for the transaction isolation level:

best-effort (default)
read-committed
read-confirmed
linearizable (only for a specific transaction)

Using best-effort as the default option allows MVCC to consider the actions of transactions independently and determine the best isolation level for them. It increases the probability of successful completion of the transaction and helps to avoid possible conflicts.

To set another default isolation level, for example, read-committed, use the following command:

box.cfg { txn_isolation = 'read-committed' }

Note that the linearizable isolation level can’t be set as default and can be used for a specific transaction only. You can set an isolation level for a specific transaction in its box.begin() call:

box.begin({ txn_isolation = 'best-effort' })

In this case, you can also use the default option. It sets the transaction’s isolation level to the one set in box.cfg.

Note

For autocommit transactions (statements executed without explicit box.begin/box.commit calls), the following rule applies:

Read-only transactions (for example, select) are performed with read-confirmed.
All other transactions (for example, replace) are performed with read-committed.

You can also set the isolation level in the net.box stream:begin() method and IPROTO_BEGIN binary protocol request.

Choosing the better option depends on whether you have conflicts or not. If you have many conflicts, you should set a different option or use the default transaction mode.

Examples with MVCC enabled and disabled

Create a file init.lua, containing the following:

fiber = require 'fiber'

box.cfg{ listen = '127.0.0.1:3301', memtx_use_mvcc_engine = false }
box.schema.user.grant('guest', 'super', nil, nil, {if_not_exists = true})

tickets = box.schema.create_space('tickets', { if_not_exists = true })
tickets:format({
    { name = "id", type = "number" },
    { name = "place", type = "number" },
})
tickets:create_index('primary', {
    parts = { 'id' },
    if_not_exists = true
})

Connect to the instance using the tt connect command:

tt connect 127.0.0.1:3301

Then try to execute the transaction with yield inside:

box.atomic(function() tickets:replace{1, 429} fiber.yield() tickets:replace{2, 429} end)

You will receive an error message:

---
- error: Transaction has been aborted by a fiber yield
...

Also, if you leave a transaction open while returning from a request, you will get an error message:

127.0.0.1:3301> box.begin()
    ⨯ Failed to execute command: Transaction is active at return from function

Change memtx_use_mvcc_engine to true, restart Tarantool, and try again:

127.0.0.1:3301> box.atomic(function() tickets:replace{1, 429} fiber.yield() tickets:replace{2, 429} end)
---
...

Now check if this transaction was successful:

127.0.0.1:3301> box.space.tickets:select({}, {limit = 10})
---
- - [1, 429]
  - [2, 429]
...

Streams and interactive transactions

Since version 2.10.0, IPROTO implements streams and interactive transactions that can be used when memtx_use_mvcc_engine is enabled on the server.

Stream

A stream supports multiplexing several transactions over one connection. Each stream has its own identifier, which is unique within the connection. All requests with the same non-zero stream ID belong to the same stream. All requests in a stream are executed strictly sequentially. This allows the implementation of interactive transactions. If the stream ID of a request is 0, it does not belong to any stream and is processed in the old way.

In net.box, a stream is an object above the connection that has the same methods but allows sequential execution of requests. The ID is automatically generated on the client side. If a user writes their own connector and wants to use streams, they must transmit the stream_id over the IPROTO protocol.

Unlike a thread, which involves multitasking and execution within a program, a stream transfers data via the protocol between a client and a server.

Interactive transaction

An interactive transaction is one that does not need to be sent in a single request. There are multiple ways to begin, commit, and roll back a transaction, and they can be mixed. You can use stream:begin(), stream:commit(), stream:rollback() or the appropriate stream methods – call, eval, or execute – using the SQL transaction syntax.

Let’s create a Lua client (client.lua) and run it with Tarantool:

local net_box = require 'net.box'
local conn = net_box.connect('127.0.0.1:3301')
local conn_tickets = conn.space.tickets
local yaml = require 'yaml'

local stream = conn:new_stream()
local stream_tickets = stream.space.tickets

-- Begin transaction over an iproto stream:
stream:begin()
print("Replaced in a stream\n".. yaml.encode(  stream_tickets:replace({1, 768}) ))

-- Empty select, the transaction was not committed.
-- You can't see it from the requests that do not belong to the
-- transaction.
print("Selected from outside of transaction\n".. yaml.encode(conn_tickets:select({}, {limit = 10}) ))

-- Select returns the previously inserted tuple
-- because this select belongs to the transaction:
print("Selected from within transaction\n".. yaml.encode(stream_tickets:select({}, {limit = 10}) ))

-- Commit transaction:
stream:commit()

-- Now this select also returns the tuple because the transaction has been committed:
print("Selected again from outside of transaction\n".. yaml.encode(conn_tickets:select({}, {limit = 10}) ))

os.exit()

Then run it to see the following output:

Replaced in a stream
--- [1, 768]
...

Selected from outside of transaction
---
- [1, 429]
- [2, 429]
...

Selected from within transaction
---
- [1, 768]
- [2, 429]
...

Selected again from outside of transaction
---
- [1, 768]
- [2, 429]
...

Replication

Replication allows multiple Tarantool instances to work on copies of the same databases. The databases are kept in sync because each instance can communicate its changes to all the other instances.

For practical guides to replication, see Replication tutorials. You can learn about bootstrapping a replica set, adding instances to the replica set, or removing them.

Replication architecture

Replication mechanism

Overview

A set of instances that operate on copies of the same databases makes up a replica set. Each instance in a replica set has a role: master or replica.

A replica gets all updates from the master by continuously fetching and applying its write-ahead log (WAL). Each record in the WAL represents a single Tarantool data-change request such as INSERT, UPDATE, or DELETE, and is assigned a monotonically growing log sequence number (LSN). In essence, Tarantool replication is row-based: each data-change request is fully deterministic and operates on a single tuple. However, unlike a classical row-based log, which contains entire copies of the changed rows, Tarantool’s WAL contains copies of the requests. For example, for UPDATE requests, Tarantool only stores the primary key of the row and the update operations to save space.

Note

WAL extensions available in Tarantool Enterprise Edition enable you to add auxiliary information to each write-ahead log record. This information might be helpful for implementing a CDC (Change Data Capture) utility that transforms a data replication stream.

The following are specifics of adding different types of information to the WAL:

Invocations of stored programs are not written to the WAL. Instead, records of the actual data-change requests, performed by the Lua code, are written to the WAL. This ensures that the possible non-determinism of Lua does not cause replication to go out of sync.
Data definition operations on temporary spaces (created with temporary = true), such as creating/dropping, adding indexes, and truncating, are written to the WAL, since information about temporary spaces is stored in non-temporary system spaces, such as box.space._space.
Data change operations on temporary spaces are not written to the WAL and are not replicated.

Data change operations on replication-local spaces (created with is_local = true) are written to the WAL but are not replicated.

To learn how to enable replication, check the Bootstrapping a replica set guide.

Replication stages

To create a valid initial state, to which WAL changes can be applied, every instance of a replica set requires an initial set of checkpoint files, such as .snap files for memtx and .run files for vinyl. A replica goes through the following stages:

Bootstrap (optional)

When an entire replica set is bootstrapped for the first time, there is no master that could provide the initial checkpoint. In such a case, replicas connect to each other and elect a master. The master creates the starting set of checkpoint files and distributes them to all the other replicas. This is called an automatic bootstrap of a replica set.

Join

At this stage, a replica downloads the initial state from the master. The master registers this replica in the box.space._cluster space. If join fails with a non-critical error, for example, ER_READONLY, ER_ACCESS_DENIED, or a network-related issue, an instance tries to find a new master to join.

Note

On subsequent connections, a replica downloads all changes that happened after the latest local LSN (there can be many LSNs – each master has its own LSN).

Follow

At this stage, a replica fetches and applies updates from the master’s write-ahead log.

You can use the box.info.replication[n].upstream.status property to monitor the status of a replica.
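
As an illustration, the following sketch scans a table shaped like box.info.replication and reports the peers whose upstream is not in the follow state. The helper name unhealthy_upstreams and the sample data are hypothetical, and treating every status other than follow as unhealthy is an assumption made for the example. The code is plain Lua so that it runs outside Tarantool.

```lua
-- Hypothetical helper: collect peers whose upstream is not in the 'follow'
-- state. `replication` is expected to have the same shape as
-- box.info.replication; the sample table below is made-up data.
local function unhealthy_upstreams(replication)
    local result = {}
    for id, peer in pairs(replication) do
        -- The entry for the instance itself has no upstream field.
        if peer.upstream ~= nil and peer.upstream.status ~= 'follow' then
            table.insert(result, { id = id, status = peer.upstream.status })
        end
    end
    table.sort(result, function(a, b) return a.id < b.id end)
    return result
end

local sample = {
    [1] = { id = 1 },
    [2] = { id = 2, upstream = { status = 'follow' } },
    [3] = { id = 3, upstream = { status = 'disconnected' } },
}

for _, item in ipairs(unhealthy_upstreams(sample)) do
    print(item.id, item.status)
end
```

On a running instance, the same function could presumably be called with box.info.replication instead of the sample table.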

Replica set and instance UUIDs

Each replica set is identified by a globally unique identifier, called the replica set UUID. The identifier is created by the master that creates the very first checkpoint, and it is part of the checkpoint file. It is stored in the box.space._schema system space, for example:

tarantool> box.space._schema:select{'cluster'}
---
- - ['cluster', '6308acb9-9788-42fa-8101-2e0cb9d3c9a0']
...

Additionally, each instance in a replica set is assigned its own UUID when it joins the replica set. It is called an instance UUID and is a globally unique identifier. The instance UUID is checked to ensure that instances do not join a different replica set, for example, because of a configuration error. A unique instance identifier is also necessary to apply rows originating from different masters only once, that is, to implement multi-master replication. This is why each row in the write-ahead log, in addition to its log sequence number, stores the identifier of the instance on which it was created. However, using a UUID as such an identifier would take too much space in the write-ahead log, so a shorter integer number is assigned to the instance when it joins a replica set. This number is then used to refer to the instance in the write-ahead log. It is called the instance ID. All identifiers are stored in the system space box.space._cluster, for example:

tarantool> box.space._cluster:select{}
---
- - [1, '88580b5c-4474-43ab-bd2b-2409a9af80d2']
...

Here the instance ID is 1 (unique within the replica set), and the instance UUID is 88580b5c-4474-43ab-bd2b-2409a9af80d2 (globally unique).

Using instance IDs is also handy for tracking the state of the entire replica set. For example, box.info.vclock describes the state of replication in regard to each connected peer.

tarantool> box.info.vclock
---
- {1: 827, 2: 584}
...

Here vclock contains log sequence numbers (827 and 584) for instances with instance IDs 1 and 2.
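
To show how such values can be compared, here is a minimal sketch in plain Lua. The helper name vclock_lag is hypothetical, and the replica's vclock is made-up data; the function only subtracts LSNs component by component for two tables shaped like box.info.vclock.

```lua
-- Hypothetical helper: for each instance ID, how many log records `mine`
-- is behind `theirs`. Both arguments are tables shaped like box.info.vclock.
local function vclock_lag(mine, theirs)
    local lag = {}
    for id, lsn in pairs(theirs) do
        local behind = lsn - (mine[id] or 0)
        if behind > 0 then
            lag[id] = behind
        end
    end
    return lag
end

local master = { [1] = 827, [2] = 584 }  -- the vclock from the example above
local replica = { [1] = 820, [2] = 584 } -- made-up vclock of a lagging peer
local lag = vclock_lag(replica, master)
print(lag[1], lag[2]) -- 7 and nil: behind by 7 records from instance 1 only
```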

If required, you can explicitly specify the instance and the replica set UUID values rather than letting Tarantool generate them. To learn more, see the replicaset_uuid configuration parameter description.

Replication roles: master and replica

The replication role (master or replica) is set by the read_only configuration parameter. The recommended role is “read_only” (replica) for all but one instance in the replica set.

In a master-replica configuration, every change that happens on the master will be visible on the replicas, but not vice versa.

A simple two-instance replica set with the master on one machine and the replica on a different machine provides two benefits:

failover, because if the master goes down, then the replica can take over, and
load balancing, because clients can connect to either the master or the replica for read requests.

In a master-master configuration (also called “multi-master”), every change that happens on either instance will be visible on the other one.

The failover benefit in this case is still present, and the load-balancing benefit is enhanced, because any instance can handle both read and write requests. Meanwhile, for multi-master configurations, it is necessary to understand the replication guarantees provided by the asynchronous protocol that Tarantool implements.

Tarantool multi-master replication guarantees that each change on each master is propagated to all instances and is applied only once. Changes from the same instance are applied in the same order as on the originating instance. Changes from different instances, however, can be mixed and applied in a different order on different instances. This may lead to replication going out of sync in certain cases.

For example, assuming the database is only appended to (i.e. it contains only insertions), a multi-master configuration is safe. If there are also deletions, but it is not mission critical that deletion happens in the same order on all replicas (e.g. the DELETE is used to prune expired data), a master-master configuration is also safe.

UPDATE operations, however, can easily go out of sync. For example, assignment and increment are not commutative and may yield different results if applied in a different order on different instances.

More generally, it is only safe to use Tarantool master-master replication if all database changes are commutative: the end result does not depend on the order in which the changes are applied. You can start learning more about conflict-free replicated data types here.
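
The difference between commutative and non-commutative changes can be seen in a small plain-Lua model. It does not use any Tarantool API; the functions assign, increment, and apply_all are made up for the illustration and stand in for update operations that two masters apply in different orders.

```lua
-- Plain Lua model (no Tarantool API): apply the same changes to a value
-- in different orders, as two masters might.
local function assign(value)
    return function(_) return value end
end

local function increment(delta)
    return function(current) return current + delta end
end

local function apply_all(initial, changes)
    local state = initial
    for _, change in ipairs(changes) do
        state = change(state)
    end
    return state
end

-- Assignment and increment do not commute: the order changes the result.
print(apply_all(0, { assign(10), increment(1) })) -- 11
print(apply_all(0, { increment(1), assign(10) })) -- 10

-- Two increments commute: both orders give the same result.
print(apply_all(0, { increment(1), increment(5) })) -- 6
print(apply_all(0, { increment(5), increment(1) })) -- 6
```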

Replication topologies: cascade, ring, and full mesh

Replication topology is set by the replication configuration parameter. The recommended topology is a full mesh because it makes potential failover easy.

Some database products offer cascading replication topologies: creating a replica on a replica. Tarantool does not recommend such a setup.

The problem with a cascading replica set is that some instances have no connection to other instances and may not receive changes from them. One essential change that must be propagated across all instances in a replica set is an entry in box.space._cluster system space with the replica set UUID. Without knowing the replica set UUID, a master refuses to accept connections from such instances when replication topology changes. Here is how this can happen:

We have a chain of three instances. Instance #1 contains entries for instances #1 and #2 in its _cluster space. Instances #2 and #3 contain entries for instances #1, #2, and #3 in their _cluster spaces.

Now instance #2 is faulty. Instance #3 tries connecting to instance #1 as its new master, but the master refuses the connection since it has no entry for instance #3.

Ring replication topology is, however, supported:

So, if you need a cascading topology, you may first create a ring to ensure all instances know each other’s UUID, and then disconnect the chain in the place you desire.

A stock recommendation for a master-master replication topology, however, is a full mesh:

You then can decide where to locate instances of the mesh – within the same data center, or spread across a few data centers. Tarantool will automatically ensure that each row is applied only once on each instance. To remove a degraded instance from a mesh, simply change the replication configuration parameter.

This ensures full cluster availability in case of a local failure, e.g. one of the instances failing in one of the data centers, as well as in case of an entire data center failure.

The maximal number of replicas in a mesh is 32.

Orphan status

During box.cfg(), an instance tries to join all nodes listed in box.cfg.replication. If the instance does not succeed in connecting to the required number of nodes (see bootstrap_strategy), it switches to the orphan status.

Synchronous replication

Overview

By default, replication in Tarantool is asynchronous: if a transaction is committed locally on a master node, it does not mean it is replicated onto any replicas. If a master reports success to a client and then dies, the transaction will appear to be lost from the client’s point of view after failover to a replica.

Synchronous replication exists to solve this problem. Synchronous transactions are not considered committed, and the client does not receive a response, until they are replicated onto some number of replicas.

To enable synchronous replication, use the space_opts.is_sync option when creating or altering a space.

Synchronous and asynchronous transactions

A key feature of Tarantool’s synchronous replication is that it is configured per space. So, if you need it only rarely for some critical data changes, you won’t pay for it in performance terms.

When there is more than one synchronous transaction, they all wait to be replicated. Moreover, if an asynchronous transaction appears, it will also be blocked by the existing synchronous transactions. This behavior is very similar to a regular queue of asynchronous transactions because all the transactions are committed in the same order as they make the box.commit() call. So, here comes the commit rule: transactions are committed in the same order as they make the box.commit() call – regardless of being synchronous or asynchronous.

If one of the waiting synchronous transactions times out and is rolled back, it will first roll back all the newer pending transactions. This is just like how asynchronous transactions are rolled back when a WAL write fails. So, here comes the rollback rule: transactions are always rolled back in the reverse order of their box.commit() calls – regardless of being synchronous or asynchronous.

One more important thing is that if an asynchronous transaction is blocked by a synchronous transaction, it does not become synchronous as well. This just means it will wait for the synchronous transaction to be committed. But once it is done, the asynchronous transaction will be committed immediately – it won’t wait to be replicated itself.
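
The commit and rollback rules can be pictured as operations on a single ordered queue. The following plain-Lua sketch is a simplified model and not Tarantool code: the queue contents and the functions commit_order and rollback_order are made up for the example.

```lua
-- Simplified model of the commit and rollback rules (not Tarantool code).
-- Transactions enter the queue in the order of their box.commit() calls.
local queue = {
    { name = 'sync-1', is_sync = true },
    { name = 'async-2', is_sync = false },
    { name = 'sync-3', is_sync = true },
}

-- Commit rule: transactions are confirmed from the head of the queue,
-- in order, whether they are synchronous or asynchronous.
local function commit_order(q)
    local order = {}
    for _, txn in ipairs(q) do
        table.insert(order, txn.name)
    end
    return order
end

-- Rollback rule: when the transaction at `index` is rolled back, every
-- newer transaction is rolled back first, from newest to oldest.
local function rollback_order(q, index)
    local order = {}
    for i = #q, index, -1 do
        table.insert(order, q[i].name)
    end
    return order
end

print(table.concat(commit_order(queue), ', '))      -- sync-1, async-2, sync-3
print(table.concat(rollback_order(queue, 1), ', ')) -- sync-3, async-2, sync-1
```

In this model, async-2 is never replicated on its own account; it only waits in line behind sync-1.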

Warning

Be careful when using synchronous and asynchronous transactions together. Asynchronous transactions are considered committed even if there is no connection to other nodes. Therefore, an old leader node (synchronous transaction queue owner) might have some committed asynchronous transactions that no other replica set member has.

When the connection to such an old (previous) leader node is restored, it starts receiving data from the new leader. At the same time, other replica set members receive the data from the previous leader that they don’t have yet. The data from the previous leader contains some committed asynchronous transactions. At this time, the integrity protection will throw the ER_SPLIT_BRAIN error, which will force the user to rebootstrap the previous leader.

Limitations and known problems

Until version 2.5.2, there was no way to enable synchronous replication for existing spaces, but since 2.5.2 it can be enabled by space_object:alter({is_sync = true}).

Synchronous transactions work only for master-slave topology. You can have multiple replicas, including anonymous ones, but only one node can make synchronous transactions.

Since Tarantool 2.10.0, anonymous replicas do not participate in the quorum.

Leader election

Starting from version 2.6.1, Tarantool has the built-in functionality managing automated leader election in a replica set. For more information, refer to the corresponding chapter.

Automated leader election

Starting from version 2.6.1, Tarantool has the built-in functionality managing automated leader election in a replica set. This functionality increases the fault tolerance of the systems built on the base of Tarantool and decreases dependency on external tools for replica set management.

To learn how to configure and monitor automated leader elections, check Managing leader elections.

The following topics are described below:

Leader election and synchronous replication
Leader election process
Managing leader elections

Leader election and synchronous replication

Leader election and synchronous replication are implemented in Tarantool as a modification of the Raft algorithm. Raft is an algorithm of synchronous replication and automatic leader election. Its complete description can be found in the corresponding document.

In Tarantool, synchronous replication and leader election are supported as two separate subsystems. So it is possible to get synchronous replication but use an alternative algorithm for leader election. And vice versa – elect a leader in the cluster but don’t use synchronous spaces at all. Synchronous replication has a separate documentation section. Leader election is described below.

Note

The system behavior can be specified exactly according to the Raft algorithm. To do this:

Ensure that the user has only synchronous spaces.
Set the replication.synchro_quorum option to N / 2 + 1.
Set the replication.synchro_timeout option to infinity.
In the replication.election_fencing_mode option, select either the soft mode (the default) or the strict mode, which is more restrictive.

Leader election process

Automated leader election in Tarantool helps guarantee that there is at most one leader at any given moment of time in a replica set. A leader is a writable node, and all other nodes are non-writable – they accept read-only requests exclusively.

When the election is enabled, the life cycle of a replica set is divided into so-called terms. Each term is described by a monotonically growing number. After the first boot, each node has its term equal to 1. When a node sees that it is not a leader and there is no leader available for some time in the replica set, it increases the term and starts a new leader election round.

Leader election happens via votes. The node that started the election votes for itself and sends vote requests to other nodes. Upon receiving vote requests, a node votes for the first of them, and then cannot do anything in the same term but wait for a leader to be elected.

The node that collects a quorum of votes, defined by the replication.synchro_quorum parameter, becomes the leader and notifies the other nodes. A split vote can also happen, when no node receives a quorum of votes. In this case, after a random timeout, each node increases its term and starts a new election round if no new vote request with a greater term arrives during this time. Eventually, a leader is elected.
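The voting step of a single term can be sketched as follows. This is a simplified Python model under stated assumptions, not Tarantool's implementation: the election_winners function and the node names are hypothetical, and each vote request is represented as a (voter, candidate) pair in arrival order, with a candidate's vote for itself listed like any other.

```python
from collections import Counter


def election_winners(vote_requests, quorum):
    """Return the candidates that reach the quorum in one term."""
    first_vote = {}
    for voter, candidate in vote_requests:
        # A node votes for the first request it receives in a term
        # and ignores the rest.
        first_vote.setdefault(voter, candidate)
    tally = Counter(first_vote.values())
    return sorted(c for c, votes in tally.items() if votes >= quorum)


# n1 starts the election, votes for itself, and both peers grant their vote.
elected = election_winners([("n1", "n1"), ("n2", "n1"), ("n3", "n1")], quorum=2)

# All three nodes become candidates at once and vote for themselves:
# nobody reaches the quorum, so the term ends with a split vote.
split = election_winners([("n1", "n1"), ("n2", "n2"), ("n3", "n3")], quorum=2)
```

Here `elected` is `["n1"]` and `split` is empty. An empty result corresponds to the split vote after which each node waits for a random timeout and bumps its term.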

If any unfinalized synchronous transactions are left from the previous leader, the new leader finalizes them automatically.

All the non-leader nodes are called followers. The nodes that start a new election round are called candidates. The elected leader sends heartbeats to the non-leader nodes to let them know it is alive.

In case there are no heartbeats for the period of replication.timeout * 4, a non-leader node starts a new election if the following conditions are met:

The node has a quorum of connections to other cluster members.
None of these cluster members can see the leader node.
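The condition for starting a new election can be written as a single predicate. The sketch below is a hypothetical Python helper: the function and parameter names are assumptions, and how a node counts itself toward the quorum of connections is left to the caller.

```python
def should_start_election(silence, replication_timeout,
                          connected_members, quorum, members_seeing_leader):
    """silence: seconds since the last heartbeat from the leader."""
    heartbeats_lost = silence >= 4 * replication_timeout
    has_quorum_of_connections = connected_members >= quorum
    nobody_sees_leader = members_seeing_leader == 0
    return heartbeats_lost and has_quorum_of_connections and nobody_sees_leader


# Illustrative values: replication.timeout of 1 second, quorum of 2.
should_start_election(5.0, 1.0, connected_members=2, quorum=2,
                      members_seeing_leader=0)   # True
should_start_election(5.0, 1.0, connected_members=2, quorum=2,
                      members_seeing_leader=1)   # False: a peer still sees the leader
```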

Note

A cluster member considers the leader node to be alive if the member has received heartbeats from the leader at least once during the last replication.timeout * 4 seconds and there are no replication errors (the connection is not broken due to a timeout or an error).

Terms and votes are persisted by each instance to preserve certain Raft guarantees.

During the election, the nodes prefer to vote for the ones that have the newest data. So if an old leader managed to send something to a quorum of replicas before its death, that data isn’t lost.

When election is enabled, there must be a connection between each pair of nodes, that is, a full mesh topology. This is needed because election messages for voting and other internal purposes require a direct connection between the nodes.

In the classic Raft algorithm, a leader doesn’t track its connectivity to the rest of the cluster. Once the leader is elected, it considers itself in the leader position until receiving a new term from another cluster node. This can lead to a split situation if the other nodes elect a new leader upon losing the connectivity to the previous one.

The issue is resolved in Tarantool version 2.10.0 by introducing the leader fencing mode. The mode is set by the replication.election_fencing_mode configuration parameter. When fencing is set to soft or strict, the leader resigns its leadership if it has fewer than replication.synchro_quorum alive connections to the cluster nodes. The resigning leader becomes a follower in the current election term and becomes read-only. Leader fencing can be turned off by setting the replication.election_fencing_mode configuration parameter to off.

In soft mode, a connection is considered dead if there are no responses for 4 * replication.timeout seconds both on the current leader and the followers.

In strict mode, a connection is considered dead if there are no responses for 2 * replication.timeout seconds on the current leader and for 4 * replication.timeout seconds on the followers. This improves the chances that there is only one leader at any time.

Fencing applies to the instances that have the replication.election_mode set to “candidate” or “manual”.
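The difference between the fencing modes comes down to two numbers and one comparison. The following Python sketch only restates the rules above. The names dead_after and leader_must_resign are hypothetical, and alive_connections is taken as given, without assuming whether the leader counts itself.

```python
# Multipliers of replication.timeout after which a silent connection
# is considered dead, per fencing mode and node role.
DEAD_AFTER_FACTOR = {
    "soft": {"leader": 4, "follower": 4},
    "strict": {"leader": 2, "follower": 4},
}


def dead_after(mode, role, replication_timeout):
    """Seconds of silence after which a connection counts as dead."""
    return DEAD_AFTER_FACTOR[mode][role] * replication_timeout


def leader_must_resign(mode, alive_connections, synchro_quorum):
    """True if a fenced leader should step down and become read-only."""
    if mode == "off":
        return False
    return alive_connections < synchro_quorum


dead_after("strict", "leader", 1)     # 2
dead_after("strict", "follower", 1)   # 4
leader_must_resign("soft", alive_connections=1, synchro_quorum=2)   # True
```

In strict mode, the leader notices a dead connection earlier than the followers do, so it tends to resign before the followers are ready to elect a replacement.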

There can still be a situation when a replica set has two leaders working independently (so-called split-brain). It can happen, for example, if a user mistakenly lowered the replication.synchro_quorum below N / 2 + 1. In this situation, to preserve the data integrity, if an instance detects the split-brain anomaly in the incoming replication data, it breaks the connection with the instance sending the data and writes the ER_SPLIT_BRAIN error in the log.

Eventually, there will be two sets of nodes with the diverged data, and any node from one set is disconnected from any node from the other set with the ER_SPLIT_BRAIN error.

After noticing the error, a user can choose any representative from each of the sets and inspect the data on them. To reconcile the data, the user should remove it from the nodes of one set and reconnect them to the nodes of the other set that have the correct data.

Also, if election is enabled on the node, it doesn’t replicate from any nodes except the newest leader. This is done to avoid the issue when a new leader is elected, but the old leader has somehow survived and tries to send more changes to the other nodes.

Term numbers also work as a kind of filter. For example, if election is enabled on two nodes and node1 has a lower term number than node2, then node2 doesn’t accept any transactions from node1.

Managing leader elections

Configuration

replication:
  election_mode: <string>
  election_fencing_mode: <string>
  election_timeout: <seconds>
  timeout: <seconds>
  synchro_quorum: <count>

replication.election_mode – specifies the role of a node in the leader election process.
replication.election_fencing_mode – specifies the leader fencing mode.
replication.election_timeout – specifies the timeout between election rounds if the previous round ended up with a split vote.
replication.timeout – a time interval (in seconds) used by a master to send heartbeat requests to a replica when there are no updates to send to this replica.
replication.synchro_quorum – a number of replicas that should confirm the receipt of a synchronous transaction before it can finish its commit.

It is important to know that being a leader is not the only requirement for a node to be writable. The leader should also satisfy the following requirements:

The database.mode option is set to rw.
The leader shouldn’t be in the orphan state.

Nothing prevents you from setting the database.mode option to ro, but the leader won’t be writable then. The option doesn’t affect the election process itself, so a read-only instance can still vote and become a leader.
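The writability requirements combine into a simple conjunction. The helper below is a hypothetical Python illustration of the rule stated above, not a Tarantool API.

```python
def is_writable(is_leader, database_mode, is_orphan):
    """A node accepts writes only if all three requirements hold."""
    return is_leader and database_mode == "rw" and not is_orphan


is_writable(True, "rw", False)   # True
is_writable(True, "ro", False)   # False: elected leader, but database.mode is ro
```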

Monitoring

To monitor the current state of a node regarding the leader election, use the box.info.election function.

Example:

tarantool> box.info.election
---
- state: follower
  vote: 0
  leader: 0
  term: 1
...

The Raft-based election implementation logs all its actions with the RAFT: prefix. The actions are new Raft message handling, node state changing, voting, and term bumping.

Important notes

Leader election doesn’t work correctly if the election quorum is set to less than or equal to <cluster size> / 2. In that case, a split vote can lead to a state when two leaders are elected at once.

For example, suppose there are five nodes. When the quorum is set to 2, node1 and node2 can both vote for node1. node3 and node4 can both vote for node5. In this case, node1 and node5 both win the election. When the quorum is set to the cluster majority, that is (<cluster size> / 2) + 1 or greater, the split vote is impossible.
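The arithmetic behind this example can be checked with a few lines of Python. The function names are hypothetical. The only assumption is the one stated earlier: a node casts at most one vote per term.

```python
def majority(cluster_size):
    """Smallest quorum that makes two simultaneous winners impossible."""
    return cluster_size // 2 + 1


def max_simultaneous_winners(cluster_size, quorum):
    """Upper bound on candidates that can each collect `quorum` votes
    when every node casts at most one vote per term."""
    return cluster_size // quorum


majority(5)                      # 3
max_simultaneous_winners(5, 2)   # 2: node1 and node5 can both win
max_simultaneous_winners(5, 3)   # 1: at most one winner
max_simultaneous_winners(4, 2)   # 2: a quorum equal to half the cluster is not enough
```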

That should be considered when adding new nodes. If the majority value is changing, it’s better to update the quorum on all the existing nodes before adding a new one.

Also, the automated leader election doesn’t bring many benefits in terms of data safety when used without synchronous replication. If the replication is asynchronous and a new leader gets elected, the old leader is still active and considers itself the leader. In such a case, nothing stops it from accepting requests from clients and making transactions. Non-synchronous transactions are successfully committed because they are not checked against the quorum of replicas. Synchronous transactions fail because they are not able to collect the quorum – most of the replicas reject the old leader’s transactions since it is not a leader anymore.

Supervised failover

Enterprise Edition

Supervised failover is supported by the Enterprise Edition only.

Example on GitHub: supervised_failover

Tarantool provides the ability to control leadership in a replica set using an external failover coordinator. A failover coordinator reads a cluster configuration from a file or an etcd-based configuration storage, polls instances for their statuses, and appoints a leader for each replica set depending on the availability and health of instances.

To increase fault tolerance, you can run two or more failover coordinators. In this case, an etcd cluster provides synchronization between coordinators.

Overview

The main steps of using an external failover coordinator for a newly configured cluster might look as follows:

Configure a cluster to work with an external coordinator. The main step is setting the replication.failover option to supervised for all replica sets that should be managed by the external coordinator.
Start a configured cluster. While an external coordinator is not running yet, instances in a replica set start in the following modes:
- If a replica set is already bootstrapped, all instances are started in read-only mode.
- If a replica set is not bootstrapped, one instance is started in read-write mode.
Start a failover coordinator. You can start two or more failover coordinators to increase fault tolerance. In this case, one coordinator is active and others are passive.

Once a cluster and failover coordinators are up and running, a failover coordinator appoints one instance to be a master if there is no master instance in a replica set. Then, the following events may occur:

If a master instance fails, a failover coordinator performs an automated failover.
If an active failover coordinator fails, another coordinator becomes active and performs an automated failover.

Note

Note that a failover coordinator doesn’t work with replica sets with two or more read-write instances. In this case, a coordinator logs a warning to stdout and doesn’t perform any appointments.

Appointing a new master instance

After a master instance has been appointed, a failover coordinator monitors the statuses of all instances in a replica set by sending requests every probe_interval seconds. For the master instance, the coordinator maintains a read-write mode deadline, which is renewed every renew_interval seconds. If all attempts to renew the deadline fail during the specified time interval (lease_interval), the master switches to read-only mode. Then, the coordinator appoints a new instance as the master.
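The relationship between the renewal period and the lease can be sketched as a deadline check. This is a simplified Python model under assumptions, not the coordinator's actual logic: the function names are hypothetical, timestamps are plain seconds, and the exact behavior at the deadline boundary is not specified by the text above.

```python
def master_is_writable(now, last_renewal, lease_interval):
    """The master stays read-write until its deadline passes."""
    return now < last_renewal + lease_interval


def renew_attempts_per_lease(lease_interval, renew_interval):
    """How many renewals fit into one lease before it expires."""
    return int(lease_interval // renew_interval)


# Illustrative values: a 15-second lease renewed every 5 seconds.
renew_attempts_per_lease(15, 5)                                   # 3
master_is_writable(now=100, last_renewal=90, lease_interval=15)   # True
master_is_writable(now=106, last_renewal=90, lease_interval=15)   # False
```

A lease that is several times longer than the renewal period lets the master survive a few failed renewals before it has to give up read-write mode.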

Note

Anonymous replicas are not considered as candidates to be a master.

If a remote etcd-based storage is used to maintain the state of failover coordinators, you can also perform a manual failover.

Active and passive coordinators

To increase fault tolerance, you can run two or more failover coordinators. In this case, only one coordinator is active and used to control leadership in a replica set. Other coordinators are passive and don’t perform any read-write appointments.

To maintain the state of coordinators, Tarantool uses a stateboard – a remote etcd-based storage. This storage uses the same connection settings as a centralized etcd-based configuration storage. If a cluster configuration is stored in the <prefix>/config/* keys in etcd, the failover coordinator looks into <prefix>/failover/* for its state. Here are a few examples of keys used for different purposes:

<prefix>/failover/info/by-uuid/<uuid>: contains a state of a failover coordinator identified by the specified uuid.
<prefix>/failover/active/lock: a unique identifier (UUID) of an active failover coordinator.
<prefix>/failover/active/term: a kind of fencing token that orders coordinators by when they became active (took the lock) over time.
<prefix>/failover/command/<id>: a key used to perform a manual failover.
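The key layout above can be assembled from a prefix with plain string formatting. The helper below is a hypothetical Python sketch: failover_keys is not part of any Tarantool tool, and the prefix, UUID, and command identifier in the example are placeholders.

```python
def failover_keys(prefix, coordinator_uuid, command_id):
    """Build the stateboard key names listed above for a given prefix."""
    base = prefix.rstrip("/") + "/failover"
    return {
        "info": f"{base}/info/by-uuid/{coordinator_uuid}",
        "lock": f"{base}/active/lock",
        "term": f"{base}/active/term",
        "command": f"{base}/command/{command_id}",
    }


keys = failover_keys("/myapp", "placeholder-uuid", "1")
keys["lock"]   # "/myapp/failover/active/lock"
```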

Configuring a cluster

To configure a cluster to work with an external failover coordinator, follow the steps below:

(Optional) If you need to run several failover coordinators to increase fault tolerance, set up an etcd-based configuration storage, as described in Centralized configuration storages.
Set the replication.failover option to supervised:
```
replication:
  failover: supervised
```
Grant the user used for replication permission to execute the failover.execute function:
```
credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [ replication ]
      privileges:
      - permissions: [ execute ]
        lua_call: [ 'failover.execute' ]
```
Note

In Tarantool 3.0 and 3.1, the configuration is different and the function must be created in the application code. See Tarantool 3.0 and 3.1 configuration for details.

(Optional) Configure options that control how a failover coordinator operates in the failover section:

failover:
  probe_interval: 5
  lease_interval: 15
  renew_interval: 5
  stateboard:
    keepalive_interval: 5
    renew_interval: 1

You can find the full example on GitHub: supervised_failover.

Tarantool 3.0 and 3.1 configuration

Before version 3.2, Tarantool used another mechanism to grant execute access to Lua functions. In Tarantool 3.0 and 3.1, the credentials configuration section should look as follows:

# Tarantool 3.0 and 3.1
credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [ replication ]
      privileges:
      - permissions: [ execute ]
        functions: [ 'failover.execute' ]

Additionally, you should create the failover.execute function in the application code. For example, you can create a custom role for this purpose:

-- Tarantool 3.0 and 3.1 --
-- supervised_instance.lua --
return {
    validate = function()
    end,
    apply = function()
        if box.info.ro then
            return
        end
        local func_name = 'failover.execute'
        local opts = { if_not_exists = true }
        box.schema.func.create(func_name, opts)
    end,
    stop = function()
        if box.info.ro then
            return
        end
        local func_name = 'failover.execute'
        if not box.schema.func.exists(func_name) then
            return
        end
        box.schema.func.drop(func_name)
    end,
}

Then, enable this role for all storage instances:

# Tarantool 3.0 and 3.1
roles: [ 'supervised_instance' ]

Starting a failover coordinator

To start a failover coordinator, you need to execute the tarantool command with the failover option. This command accepts the path to a cluster configuration file:

tarantool --failover --config instances.enabled/supervised_failover/config.yaml

If a cluster’s configuration is stored in etcd, the config.yaml file contains connection options for the etcd storage.

You can run two or more failover coordinators to increase fault tolerance. In this case, only one coordinator is active and used to control leadership in a replica set. Learn more from Active and passive coordinators.

Performing manual failover

If an etcd-based storage is used to maintain the state of failover coordinators, you can perform a manual failover. External tools can use the <prefix>/failover/command/<id> key to choose a new master. For example, the tt utility provides the tt cluster failover command for managing a supervised failover.

Replication tutorials

Master-replica: manual failover

Example on GitHub: manual_leader

This tutorial shows how to configure and work with a replica set with manual failover.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Create a tt environment in the current directory by executing the tt init command.
Inside the instances.enabled directory of the created tt environment, create the manual_leader directory.
Inside instances.enabled/manual_leader, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment and should look like this:
```
instance001:
instance002:
```
- The config.yaml file is intended to store a replica set configuration.

Configuring a replica set

This section describes how to configure a replica set in config.yaml.

Step 1: Configuring a failover mode

First, set the replication.failover option to manual:

replication:
  failover: manual

Step 2: Defining a replica set topology

Define a replica set topology inside the groups section:

The leader option sets instance001 as a replica set leader.
The iproto.listen option specifies an address used to listen for incoming requests and allows replicas to communicate with each other.

groups:
  group001:
    replicasets:
      replicaset001:
        leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Step 3: Creating a user for replication

In the credentials section, create the replicator user with the replication role:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

Step 4: Specifying advertise URIs

Set iproto.advertise.peer to advertise the current instance to other replica set members:

iproto:
  advertise:
    peer:
      login: replicator

Resulting configuration

The resulting replica set configuration should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: manual

groups:
  group001:
    replicasets:
      replicaset001:
        leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Working with a replica set

Starting instances

After configuring a replica set, execute the tt start command from the tt environment directory:

$ tt start manual_leader
   • Starting an instance [manual_leader:instance001]...
   • Starting an instance [manual_leader:instance002]...

Check that instances are in the RUNNING status using the tt status command:

$ tt status manual_leader
INSTANCE                   STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
manual_leader:instance001  RUNNING  8841  RW    ready   running  --
manual_leader:instance002  RUNNING  8842  RO    ready   running  --

Checking a replica set status

Connect to instance001 using tt connect:

$ tt connect manual_leader:instance001
   • Connecting to the instance...
   • Connected to manual_leader:instance001

Make sure that the instance is in the running state by executing box.info.status:
```
manual_leader:instance001> box.info.status
---
- running
...
```

Check that the instance is writable using box.info.ro:

manual_leader:instance001> box.info.ro
---
- false
...

Execute box.info.replication to check a replica set status. For instance002, upstream.status and downstream.status should be follow.

manual_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 7
    name: instance001
  2:
    id: 2
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 0
    upstream:
      status: follow
      idle: 0.3893879999996
      peer: replicator@127.0.0.1:3302
      lag: 0.00028800964355469
    name: instance002
    downstream:
      status: follow
      idle: 0.37777199999982
      vclock: {1: 7}
      lag: 0
...

To see the diagrams that illustrate how the upstream and downstream connections look, refer to Monitoring a replica set.
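When the same check has to be repeated for several peers, it can be expressed as a small function. The Python sketch below assumes the box.info.replication output has already been loaded into a dictionary keyed by instance ID. The not_following name and the way the data is obtained are assumptions, and the sample is trimmed to the fields the check uses.

```python
def not_following(replication, own_id):
    """List (name, direction, status) for every peer link that is not 'follow'."""
    problems = []
    for peer_id, peer in replication.items():
        if peer_id == own_id:
            continue  # the local instance has no upstream/downstream of its own
        for direction in ("upstream", "downstream"):
            status = peer.get(direction, {}).get("status")
            if status != "follow":
                problems.append((peer["name"], direction, status))
    return problems


# Trimmed copy of the output shown above, as seen from instance001 (id 1).
info = {
    1: {"id": 1, "name": "instance001", "lsn": 7},
    2: {
        "id": 2,
        "name": "instance002",
        "lsn": 0,
        "upstream": {"status": "follow"},
        "downstream": {"status": "follow"},
    },
}
not_following(info, own_id=1)   # []: both links to instance002 are healthy
```

A missing upstream or downstream section is reported with a status of None, which makes a disconnected peer visible in the result.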

Adding data

To check that a replica (instance002) gets all updates from the master, follow the steps below:

On instance001, create a space and add data as described in CRUD operation examples.
Open the second terminal, connect to instance002 using tt connect, and use the select operation to make sure data is replicated.
Check that box.info.vclock values are the same on both instances:
- instance001:
```
manual_leader:instance001> box.info.vclock
---
- {1: 21}
...
```
- instance002:
```
manual_leader:instance002> box.info.vclock
---
- {1: 21}
...
```
Note

Note that a vclock value might include the 0 component that is related to local space operations and might differ for different instances in a replica set.
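A comparison that follows this note can be sketched in Python as shown below. The vclocks_match helper is hypothetical and treats a vclock as a dictionary that maps a component ID to an LSN.

```python
def vclocks_match(a, b):
    """Compare two vclocks, ignoring component 0 (local space operations)."""
    def replicated(vclock):
        return {k: lsn for k, lsn in vclock.items() if k != 0}
    return replicated(a) == replicated(b)


vclocks_match({1: 21}, {1: 21})                # True
vclocks_match({0: 3, 1: 21}, {0: 5, 1: 21})    # True: only component 0 differs
vclocks_match({1: 21}, {1: 20})                # False: the replica is behind
```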

Adding instances

This section describes how to add a new replica to a replica set.

Adding an instance to the configuration

Add instance003 to the instances.yml file:
```
instance001:
instance002:
instance003:
```

Add instance003 with the specified iproto.listen option to the config.yaml file:

groups:
  group001:
    replicasets:
      replicaset001:
        leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

Starting an instance

Open the third terminal to work with a new instance. Start instance003 using tt start:

$ tt start manual_leader:instance003
   • Starting an instance [manual_leader:instance003]...

Check a replica set status using tt status:

$ tt status manual_leader
INSTANCE                   STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
manual_leader:instance001  RUNNING  8841  RW    ready   running  --
manual_leader:instance002  RUNNING  8842  RO    ready   running  --
manual_leader:instance003  RUNNING  8856  RO    ready   running  --

Reloading configuration

After you have added instance003 to the configuration and started it, you need to reload configurations on all instances. This is required to allow instance001 and instance002 to get data from the new instance in case it becomes a master.

Connect to instance003 using tt connect:

$ tt connect manual_leader:instance003
   • Connecting to the instance...
   • Connected to manual_leader:instance003

Reload configurations on all three instances using the reload() function provided by the config module:

instance001:

manual_leader:instance001> require('config'):reload()
---
...

instance002:

manual_leader:instance002> require('config'):reload()
---
...

instance003:

manual_leader:instance003> require('config'):reload()
---
...

Execute box.info.replication to check a replica set status. Make sure that upstream.status and downstream.status are follow for instance003.

manual_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 21
    name: instance001
  2:
    id: 2
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 0
    upstream:
      status: follow
      idle: 0.052655000000414
      peer: replicator@127.0.0.1:3302
      lag: 0.00010204315185547
    name: instance002
    downstream:
      status: follow
      idle: 0.09503500000028
      vclock: {1: 21}
      lag: 0.00026917457580566
  3:
    id: 3
    uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
    lsn: 0
    upstream:
      status: follow
      idle: 0.77522099999987
      peer: replicator@127.0.0.1:3303
      lag: 0.0001838207244873
    name: instance003
    downstream:
      status: follow
      idle: 0.33186100000012
      vclock: {1: 21}
      lag: 0
...

Performing manual failover

This section shows how to perform manual failover and change a replica set leader.

Switching instances to read-only mode

In the config.yaml file, change the replica set leader from instance001 to null:
```
replicaset001:
  leader: null
```

Reload configurations on all three instances using config:reload() and check that instances are in read-only mode. The example below shows how to do this for instance001:

manual_leader:instance001> require('config'):reload()
---
...
manual_leader:instance001> box.info.ro
---
- true
...
manual_leader:instance001> box.info.ro_reason
---
- config
...

Make sure that box.info.vclock values are the same on all instances:

instance001:

manual_leader:instance001> box.info.vclock
---
- {1: 21}
...

instance002:

manual_leader:instance002> box.info.vclock
---
- {1: 21}
...

instance003:

manual_leader:instance003> box.info.vclock
---
- {1: 21}
...

Configuring a new leader

Change a replica set leader in config.yaml to instance002:
```
replicaset001:
  leader: instance002
```
Reload configuration on all instances using config:reload().

Make sure that instance002 is a new master:

manual_leader:instance002> box.info.ro
---
- false
...

Check replication status using box.info.replication.

Removing instances

This section describes the process of removing an instance from a replica set.

Before removing an instance, make sure it is in read-only mode. If the instance is a master, perform manual failover.

Disconnecting an instance

Clear the iproto option for instance003 by setting its value to {}:
```
instance003:
  iproto: {}
```

Reload configurations on instance001 and instance002:

instance001:

manual_leader:instance001> require('config'):reload()
---
...

instance002:

manual_leader:instance002> require('config'):reload()
---
...

Check that the upstream section is missing for instance003 by executing box.info.replication[3]:

manual_leader:instance001> box.info.replication[3]
---
- id: 3
  uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
  lsn: 0
  downstream:
    status: follow
    idle: 0.4588760000006
    vclock: {1: 21}
    lag: 0
  name: instance003
...

Stopping an instance

Stop instance003 using the tt stop command:

$ tt stop manual_leader:instance003
   • The Instance manual_leader:instance003 (PID = 15551) has been terminated.

Check that downstream.status is stopped for instance003:

manual_leader:instance001> box.info.replication[3]
---
- id: 3
  uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
  lsn: 0
  downstream:
    status: stopped
    message: 'unexpected EOF when reading from socket, called on fd 27, aka 127.0.0.1:3301,
      peer of 127.0.0.1:54185: Broken pipe'
    system_message: Broken pipe
  name: instance003
...

Removing an instance from the configuration

Remove instance003 from the instances.yml file:
```
instance001:
instance002:
```

Remove instance003 from config.yaml:

instances:
  instance001:
    iproto:
      listen:
      - uri: '127.0.0.1:3301'
  instance002:
    iproto:
      listen:
      - uri: '127.0.0.1:3302'

Reload configurations on instance001 and instance002:

instance001:

manual_leader:instance001> require('config'):reload()
---
...

instance002:

manual_leader:instance002> require('config'):reload()
---
...

Removing an instance from the ‘_cluster’ space

To remove an instance from the replica set permanently, it should be removed from the box.space._cluster system space:

Select all the tuples in the box.space._cluster system space:

manual_leader:instance002> box.space._cluster:select{}
---
- - [1, '9bb111c2-3ff5-36a7-00f4-2b9a573ea660', 'instance001']
  - [2, '4cfa6e3c-625e-b027-00a7-29b2f2182f23', 'instance002']
  - [3, '9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6', 'instance003']
...

Delete a tuple corresponding to instance003:

manual_leader:instance002> box.space._cluster:delete(3)
---
- [3, '9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6', 'instance003']
...

Execute box.info.replication to check the health status:

manual_leader:instance002> box.info.replication
---
- 1:
    id: 1
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 21
    upstream:
      status: follow
      idle: 0.73316000000159
      peer: replicator@127.0.0.1:3301
      lag: 0.00016212463378906
    name: instance001
    downstream:
      status: follow
      idle: 0.7269320000014
      vclock: {2: 1, 1: 21}
      lag: 0.00083398818969727
  2:
    id: 2
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 1
    name: instance002
...

Master-replica: automated failover

Example on GitHub: auto_leader

This tutorial shows how to configure and work with a replica set with automated failover.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Create a tt environment in the current directory by executing the tt init command.
Inside the instances.enabled directory of the created tt environment, create the auto_leader directory.
Inside instances.enabled/auto_leader, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment and should look like this:
```
instance001:
instance002:
instance003:
```
- The config.yaml file is intended to store a replica set configuration.

Configuring a replica set

This section describes how to configure a replica set in config.yaml.

Step 1: Configuring a failover mode

First, set the replication.failover option to election:

replication:
  failover: election

Step 2: Defining a replica set topology

Define a replica set topology inside the groups section. The iproto.listen option specifies an address used to listen for incoming requests and allows replicas to communicate with each other.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

Step 3: Creating a user for replication

In the credentials section, create the replicator user with the replication role:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

Step 4: Specifying advertise URIs

Set iproto.advertise.peer to advertise the current instance to other replica set members:

iproto:
  advertise:
    peer:
      login: replicator

Resulting configuration

The resulting replica set configuration should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: election

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

Working with a replica set

Starting instances

After configuring a replica set, execute the tt start command from the tt environment directory:

$ tt start auto_leader
   • Starting an instance [auto_leader:instance001]...
   • Starting an instance [auto_leader:instance002]...
   • Starting an instance [auto_leader:instance003]...

Check that instances are in the RUNNING status using the tt status command:

$ tt status auto_leader
INSTANCE                 STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
auto_leader:instance001  RUNNING  9170  RO    ready   running  --
auto_leader:instance002  RUNNING  9171  RO    ready   running  --
auto_leader:instance003  RUNNING  9172  RW    ready   running  --

Checking a replica set status

Connect to instance001 using tt connect:

$ tt connect auto_leader:instance001
   • Connecting to the instance...
   • Connected to auto_leader:instance001

Check the instance state in regard to leader election using box.info.election. The output below shows that instance001 is a follower while instance002 is a replica set leader.

auto_leader:instance001> box.info.election
---
- leader_idle: 0.77491499999815
  leader_name: instance002
  state: follower
  vote: 0
  term: 2
  leader: 1
...

Check that instance001 is in read-only mode using box.info.ro:
```
auto_leader:instance001> box.info.ro
---
- true
...
```

Execute box.info.replication to check a replica set status. Make sure that upstream.status and downstream.status are follow for instance002 and instance003.

auto_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 9
    upstream:
      status: follow
      idle: 0.8257709999998
      peer: replicator@127.0.0.1:3302
      lag: 0.00012326240539551
    name: instance002
    downstream:
      status: follow
      idle: 0.81174199999805
      vclock: {1: 9}
      lag: 0
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 0
    name: instance001
  3:
    id: 3
    uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
    lsn: 0
    upstream:
      status: follow
      idle: 0.83125499999733
      peer: replicator@127.0.0.1:3303
      lag: 0.00010204315185547
    name: instance003
    downstream:
      status: follow
      idle: 0.83213399999659
      vclock: {1: 9}
      lag: 0
...

To see the diagrams that illustrate how the upstream and downstream connections look, refer to Monitoring a replica set.
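
The check described above can also be scripted. The following is a minimal sketch, assuming that box.info.replication is an ordinary Lua table keyed by instance id, as the output above suggests. The helper name unhealthy_peers is hypothetical and not part of Tarantool.

```lua
-- Hypothetical helper: lists peers whose upstream or downstream status
-- is not 'follow'. `replication` has the shape of box.info.replication;
-- `self_id` is the id of the current instance, which has no
-- upstream/downstream entries of its own.
local function unhealthy_peers(replication, self_id)
    local problems = {}
    for id, peer in pairs(replication) do
        if id ~= self_id then
            local up = peer.upstream and peer.upstream.status or 'missing'
            local down = peer.downstream and peer.downstream.status or 'missing'
            if up ~= 'follow' or down ~= 'follow' then
                table.insert(problems, string.format(
                    '%s: upstream=%s, downstream=%s',
                    peer.name or tostring(id), up, down))
            end
        end
    end
    table.sort(problems)
    return problems
end
```

In a Tarantool console, the helper could be called as unhealthy_peers(box.info.replication, box.info.id). An empty result means that every peer is in the follow status.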

Adding data

To check that replicas (instance001 and instance003) get all updates from the master (instance002), follow the steps below:

Connect to instance002 using tt connect:

$ tt connect auto_leader:instance002
   • Connecting to the instance...
   • Connected to auto_leader:instance002

Create a space and add data as described in CRUD operation examples.
Use the select operation on instance001 and instance003 to make sure data is replicated.

Check that the 1 component of the box.info.vclock value is the same on all instances:

instance001:

auto_leader:instance001> box.info.vclock
---
- {0: 1, 1: 32}
...

instance002:

auto_leader:instance002> box.info.vclock
---
- {0: 1, 1: 32}
...

instance003:

auto_leader:instance003> box.info.vclock
---
- {0: 1, 1: 32}
...

Note

Note that a vclock value might include the 0 component that is related to local space operations and might differ for different instances in a replica set.
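
Comparing vclock values by hand is error-prone when there are several components. The sketch below compares two vclock tables while ignoring the 0 component, as the note above suggests. It assumes that the values are plain Lua tables of numbers. The function name vclocks_match is hypothetical.

```lua
-- Hypothetical helper: returns true if two vclock tables are equal in
-- every component except component 0 (local space operations).
local function vclocks_match(a, b)
    for id, lsn in pairs(a) do
        if id ~= 0 and b[id] ~= lsn then
            return false
        end
    end
    -- Second pass catches components present only in `b`.
    for id, lsn in pairs(b) do
        if id ~= 0 and a[id] ~= lsn then
            return false
        end
    end
    return true
end
```

One way to use it is to copy the box.info.vclock output of each instance into a table and compare the tables pairwise.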

Testing automated failover

To test how automated failover works if the current master is stopped, follow the steps below:

Stop the current master instance (instance002) using the tt stop command:

$ tt stop auto_leader:instance002
   • The Instance auto_leader:instance002 (PID = 24769) has been terminated.

On instance001, check box.info.election. In this example, instance001 is the new replica set leader.

auto_leader:instance001> box.info.election
---
- leader_idle: 0
  leader_name: instance001
  state: leader
  vote: 2
  term: 3
  leader: 2
...

Check replication status using box.info.replication for instance002:

upstream.status is disconnected.
downstream.status is stopped.

auto_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 32
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 0.00032305717468262
      status: disconnected
      idle: 48.352504000002
      message: 'connect, called on fd 20, aka 127.0.0.1:62575: Connection refused'
      system_message: Connection refused
    name: instance002
    downstream:
      status: stopped
      message: 'unexpected EOF when reading from socket, called on fd 32, aka 127.0.0.1:3301,
        peer of 127.0.0.1:62204: Broken pipe'
      system_message: Broken pipe
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 1
    name: instance001
  3:
    id: 3
    uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
    lsn: 0
    upstream:
      status: follow
      idle: 0.18620999999985
      peer: replicator@127.0.0.1:3303
      lag: 0.00012516975402832
    name: instance003
    downstream:
      status: follow
      idle: 0.19718099999955
      vclock: {2: 1, 1: 32}
      lag: 0.00051403045654297
...

The diagram below illustrates what the upstream and downstream connections look like:

Start instance002 again using tt start:

$ tt start auto_leader:instance002
   • Starting an instance [auto_leader:instance002]...

Choosing a leader manually

Make sure that box.info.vclock values (except the 0 components) are the same on all instances:

instance001:

auto_leader:instance001> box.info.vclock
---
- {0: 2, 1: 32, 2: 1}
...

instance002:

auto_leader:instance002> box.info.vclock
---
- {0: 2, 1: 32, 2: 1}
...

instance003:

auto_leader:instance003> box.info.vclock
---
- {0: 3, 1: 32, 2: 1}
...

On instance002, run box.ctl.promote() to choose it as a new replica set leader:
```
auto_leader:instance002> box.ctl.promote()
---
...
```

Check box.info.election to make sure instance002 is a leader now:

auto_leader:instance002> box.info.election
---
- leader_idle: 0
  leader_name: instance002
  state: leader
  vote: 1
  term: 4
  leader: 1
...

Adding and removing instances

The process of adding instances to a replica set and removing them is similar for all failover modes. Learn how to do this from the Master-replica: manual failover tutorial.

Before removing an instance from a replica set with replication.failover set to election, make sure this instance is in read-only mode. If the instance is a master, choose a new leader manually.

Master-master

Example on GitHub: master_master

This tutorial shows how to configure and work with a master-master replica set.

Prerequisites

Before starting this tutorial:

Install the tt utility.
Create a tt environment in the current directory by executing the tt init command.
Inside the instances.enabled directory of the created tt environment, create the master_master directory.
Inside instances.enabled/master_master, create the instances.yml and config.yaml files:
- instances.yml specifies instances to run in the current environment and should look like this:
```
instance001:
instance002:
```
- The config.yaml file is intended to store a replica set configuration.

Configuring a replica set

This section describes how to configure a replica set in config.yaml.

Step 1: Configuring a failover mode

First, set the replication.failover option to off:

replication:
  failover: off

Step 2: Defining a replica set topology

Define a replica set topology inside the groups section:

The database.mode option should be set to rw to make instances work in read-write mode.
The iproto.listen option specifies an address used to listen for incoming requests and allows replicas to communicate with each other.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Step 3: Creating a user for replication

In the credentials section, create the replicator user with the replication role:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

Step 4: Specifying advertise URIs

Set iproto.advertise.peer to advertise the current instance to other replica set members:

iproto:
  advertise:
    peer:
      login: replicator

Resulting configuration

The resulting replica set configuration should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: off

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

Working with a replica set

Starting instances

After configuring a replica set, execute the tt start command from the tt environment directory:

$ tt start master_master
   • Starting an instance [master_master:instance001]...
   • Starting an instance [master_master:instance002]...

Check that instances are in the RUNNING status using the tt status command:

$ tt status master_master
INSTANCE                   STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
master_master:instance001  RUNNING  9263  RW    ready   running  --
master_master:instance002  RUNNING  9264  RW    ready   running  --

Checking a replica set status

Connect to both instances using tt connect. Below is an example for instance001:

$ tt connect master_master:instance001
   • Connecting to the instance...
   • Connected to master_master:instance001

master_master:instance001>

Check that both instances are writable using box.info.ro:

instance001:

master_master:instance001> box.info.ro
---
- false
...

instance002:

master_master:instance002> box.info.ro
---
- false
...

Execute box.info.replication to check a replica set status. For instance002, upstream.status and downstream.status should be follow.

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
    lsn: 7
    name: instance001
  2:
    id: 2
    uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
    lsn: 0
    upstream:
      status: follow
      idle: 0.93246499999987
      peer: replicator@127.0.0.1:3302
      lag: 0.00016188621520996
    name: instance002
    downstream:
      status: follow
      idle: 0.8988360000003
      vclock: {1: 7}
      lag: 0
...

To see the diagrams that illustrate how the upstream and downstream connections look, refer to Monitoring a replica set.

Note

Note that a vclock value might include the 0 component that is related to local space operations and might differ for different instances in a replica set.

Adding data

To check that both instances get updates from each other, follow the steps below:

On instance001, create a space, format it, and create a primary index:

box.schema.space.create('bands')
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})
box.space.bands:create_index('primary', { parts = { 'id' } })

Then, add sample data to this space:

box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }

On instance002, use the select operation to make sure data is replicated:

master_master:instance002> box.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
...

Add more data to the created space on instance002:

box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }

Get back to instance001 and use select to make sure new records are replicated:

master_master:instance001> box.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

Check that box.info.vclock values are the same on both instances:

instance001:

master_master:instance001> box.info.vclock
---
- {2: 2, 1: 12}
...

instance002:

master_master:instance002> box.info.vclock
---
- {2: 2, 1: 12}
...

Resolving replication conflicts

Note

To learn how to fix and prevent replication conflicts using trigger functions, see Resolving replication conflicts.
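
As a rough illustration of the trigger-based approach mentioned in the note, the sketch below shows one possible conflict policy: when a row arriving through replication collides with a tuple that already exists locally, the local tuple wins. The policy, the function name keep_existing, and the idea of passing the session type in as an argument are assumptions made for this example. It is not the procedure used in the rest of this tutorial.

```lua
-- Hypothetical conflict policy for a before_replace trigger.
-- `old` is the existing tuple (or nil), `new` is the incoming tuple
-- (or nil for a delete), `session_type` is the value of box.session.type().
local function keep_existing(old, new, session_type)
    -- Only interfere with rows applied by replication that would
    -- overwrite an existing tuple.
    if session_type == 'applier' and old ~= nil and new ~= nil then
        return old
    end
    return new
end

-- Inside Tarantool, the policy could be attached to the space like this
-- (left as a comment because it needs a running instance):
-- box.space.bands:before_replace(function(old, new)
--     return keep_existing(old, new, box.session.type())
-- end)
```

A trigger like this needs to be set on every master before conflicting writes can happen, and the same policy should be used on all of them so that the replicas converge to the same data.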

Inserting conflicting records

To insert conflicting records into instance001 and instance002, follow the steps below:

Stop instance001 using the tt stop command:
```
$ tt stop master_master:instance001
```

On instance002, insert a new record:

box.space.bands:insert { 5, 'incorrect data', 0 }

Stop instance002 using tt stop:
```
$ tt stop master_master:instance002
```
Start instance001 again:
```
$ tt start master_master:instance001
```
Connect to instance001 and insert a record that should conflict with a record already inserted on instance002:
```
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
```

Start instance002 again:

$ tt start master_master:instance002

Then, check box.info.replication on instance001. upstream.status should be stopped because of the Duplicate key exists error:

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
    lsn: 13
    name: instance001
  2:
    id: 2
    uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
    lsn: 2
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 115.99977827072
      status: stopped
      idle: 2.0342070000006
      message: Duplicate key exists in unique index "primary" in space "bands" with
        old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data",
        0]
    name: instance002
    downstream:
      status: stopped
      message: 'unexpected EOF when reading from socket, called on fd 24, aka 127.0.0.1:3301,
        peer of 127.0.0.1:58478: Broken pipe'
      system_message: Broken pipe
...

The diagram below illustrates what the upstream and downstream connections look like:

Reseeding a replica

To resolve a replication conflict, instance002 should get the correct data from instance001 first. To achieve this, instance002 should be rebootstrapped:

Select all the tuples in the box.space._cluster system space to get the UUID of instance002:

master_master:instance001> box.space._cluster:select()
---
- - [1, 'c3bfd89f-5a1c-4556-aa9f-461377713a2a', 'instance001']
  - [2, 'dccf7485-8bff-47f6-bfc4-b311701e36ef', 'instance002']
...

In the config.yaml file, change the following instance002 settings:
- Set database.mode to ro.
- Set database.instance_uuid to a UUID value obtained in the previous step.
```
instance002:
  database:
    mode: ro
    instance_uuid: 'dccf7485-8bff-47f6-bfc4-b311701e36ef'
```

Reload configurations on both instances using the config:reload() function:

instance001:

master_master:instance001> require('config'):reload()
---
...

instance002:

master_master:instance002> require('config'):reload()
---
...

Delete write-ahead logs and snapshots stored in the var/lib/instance002 directory.

Note

var/lib is the default directory used by tt to store write-ahead logs and snapshots. Learn more from Configuration.

Restart instance002 using the tt restart command:
```
$ tt restart master_master:instance002
```

Connect to instance002 and make sure it received the correct data from instance001:

master_master:instance002> box.space.bands:select()
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
...

Restarting replication

After reseeding a replica, you need to resolve a replication conflict that keeps replication stopped:

Execute box.info.replication on instance001. upstream.status is still stopped:

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
    lsn: 13
    name: instance001
  2:
    id: 2
    uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
    lsn: 2
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 115.99977827072
      status: stopped
      idle: 1013.688243
      message: Duplicate key exists in unique index "primary" in space "bands" with
        old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data",
        0]
    name: instance002
    downstream:
      status: follow
      idle: 0.69694700000036
      vclock: {2: 2, 1: 13}
      lag: 0
...

The diagram below illustrates what the upstream and downstream connections look like:

[Diagram: replication status after reseeding a replica]

In the config.yaml file, clear the iproto option for instance001 by setting its value to {} to disconnect this instance from instance002. Set database.mode to ro:
```
instance001:
  database:
    mode: ro
  iproto: {}
```

Reload configuration on instance001 only:

master_master:instance001> require('config'):reload()
---
...

Change database.mode values back to rw for both instances and restore iproto.listen for instance001. The database.instance_uuid option can be removed for instance002:

instance001:
  database:
    mode: rw
  iproto:
    listen:
    - uri: '127.0.0.1:3301'
instance002:
  database:
    mode: rw
  iproto:
    listen:
    - uri: '127.0.0.1:3302'

Reload configurations on both instances one more time:

instance001:

master_master:instance001> require('config'):reload()
---
...

instance002:

master_master:instance002> require('config'):reload()
---
...

Check box.info.replication. upstream.status should be follow now.

master_master:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
    lsn: 13
    name: instance001
  2:
    id: 2
    uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
    lsn: 2
    upstream:
      status: follow
      idle: 0.86873800000012
      peer: replicator@127.0.0.1:3302
      lag: 0.0001060962677002
    name: instance002
    downstream:
      status: follow
      idle: 0.058662999999797
      vclock: {2: 2, 1: 13}
      lag: 0
...

Adding and removing instances

The process of adding instances to a replica set and removing them is similar for all failover modes. Learn how to do this from the Master-replica: manual failover tutorial.

Before removing an instance from a replica set with replication.failover set to off, make sure this instance is in read-only mode.

Sharding

Scaling databases in a growing project is often considered one of the most challenging issues. Once a single server cannot withstand the load, scaling methods should be applied.

Sharding is a database architecture that allows for horizontal scaling, which implies that a dataset is partitioned and distributed over multiple servers.

With Tarantool’s vshard module, the tuples of a dataset are distributed across multiple nodes, with a Tarantool database server instance on each node. Each instance handles only a subset of the total data, so larger loads can be handled by simply adding more servers. The initial dataset is partitioned into multiple parts, so each part is stored on a separate server.

The vshard module is based on the concept of virtual buckets, where a tuple set is partitioned into a large number of abstract virtual nodes (virtual buckets, hereafter just buckets) rather than into a smaller number of physical nodes.

The dataset is partitioned using sharding keys (bucket id numbers). Hashing a sharding key into a large number of buckets makes it possible to change the number of servers in the cluster seamlessly. The rebalancing mechanism distributes buckets evenly among all shards when servers are added or removed.
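
To make the idea of hashing a key into a bucket more concrete, here is a minimal sketch in plain Lua. The hash function toy_hash is a simple illustrative hash chosen for this example. It is not the hash that vshard uses, and the names toy_hash and bucket_id_for are hypothetical.

```lua
-- Illustrative string hash (djb2 variant), kept within 32 bits.
-- NOT the hash used by vshard; chosen only because it is short.
local function toy_hash(key)
    local h = 5381
    for i = 1, #key do
        h = (h * 33 + key:byte(i)) % 4294967296
    end
    return h
end

-- Maps a sharding key to a bucket id in the range 1..bucket_count.
local function bucket_id_for(key, bucket_count)
    return toy_hash(tostring(key)) % bucket_count + 1
end
```

Because the bucket id depends only on the key and the total number of buckets, adding or removing servers does not change which bucket a key belongs to. Only the placement of buckets on servers changes.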

The buckets have states, so it is easy to monitor the server states. For example, a server instance is active and available for all types of requests, or a failover occurred and the instance accepts only read requests.

The vshard module provides router and storage API (public and internal) for sharding-aware applications.

Check out the quick start guide or learn more about how sharding works in Tarantool.

You can also find out more about sharding administration or dive into the vshard configuration and API reference.

Architecture

Overview

Consider a distributed Tarantool cluster that consists of subclusters called shards, each storing some part of the data. Each shard, in turn, is a replica set consisting of several replicas, one of which serves as a master node that processes all read and write requests.

The whole dataset is logically partitioned into a predefined number of virtual buckets (hereafter, buckets), each assigned a unique number ranging from 1 to N, where N is the total number of buckets. The number of buckets is deliberately chosen to be several orders of magnitude larger than the potential number of cluster nodes, even allowing for future cluster scaling. For example, with M projected nodes the dataset may be split into 100 * M or even 1,000 * M buckets. Care should be taken when picking the number of buckets: if it is too large, storing the routing information may require extra memory; if it is too small, rebalancing becomes less granular.

Each shard stores a unique subset of buckets, which means that a bucket cannot belong to several shards at once, as illustrated below:

This shard-to-bucket mapping is stored in a table in one of Tarantool’s system spaces, with each shard holding only a specific part of the mapping that covers those buckets that were assigned to this shard.

Apart from the mapping table, the bucket id is also stored in a special field of every tuple of every table participating in sharding.

Once a shard receives any request (except for SELECT) from an application, this shard checks the bucket id specified in the request against the table of bucket ids that belong to a given node. If the specified bucket id is invalid, the request gets terminated with the following error: “wrong bucket”. Otherwise the request is executed, and all the data created in the process is assigned the bucket id specified in the request. Note that the request should only modify the data that has the same bucket id as the request itself.

Storing bucket ids both in the data itself and the mapping table ensures data consistency regardless of the application logic and makes rebalancing transparent for the application. Storing the mapping table in a system space ensures sharding is performed consistently in case of a failover, as all the replicas in a shard share a common table state.

Virtual buckets

The sharded dataset is partitioned into a large number of abstract nodes called virtual buckets (hereafter, buckets).

The dataset is partitioned using the sharding key (or bucket id, in Tarantool terminology). Bucket id is a number from 1 to N, where N is the total number of buckets.

Each replica set stores a unique subset of buckets. One bucket cannot belong to multiple replica sets at a time.

The total number of buckets is determined by the administrator who sets up the initial cluster configuration.

Every space you plan to shard must have a numeric field containing bucket ids. You can learn more from Data definition.

Structure

A sharded cluster in Tarantool consists of:

One or more replica sets.
- Each replica set should contain at least two storage instances. For redundancy, it is recommended to have 3 or more storage instances in a replica set.
One or more router instances.
- The number of router instances is not limited and should be increased if the existing router instances become CPU or I/O bound.
A rebalancer.

Storage

A storage is a node that stores a subset of the dataset. Several storages, replicated for redundancy, make up a replica set (also called a shard).

Each storage in a replica set has a role, master or replica. A master processes read and write requests. A replica processes read requests but cannot process write requests.

Router

A router is a standalone software component that routes read and write requests from the client application to shards.

All requests from the application come to the sharded cluster through a router. The router keeps the topology of a sharded cluster transparent for the application, thus keeping the application unaware of:

the number and location of shards,
data rebalancing process,
the fact and the process of a failover that occurred after a replica’s failure.

A router can also calculate a bucket id on its own provided that the application clearly defines rules for calculating a bucket id based on the request data. To do it, a router needs to be aware of the data schema.

The router does not have a persistent state, nor does it store the cluster topology or balance the data. The router is a standalone software component that can run in the storage layer or application layer depending on the application features.

A router maintains a constant pool of connections to all the storages that is created at startup. Creating it this way helps avoid configuration errors. Once a pool is created, a router caches the current state of the _vbucket table to speed up the routing. In case a bucket id is moved to another storage as a result of data rebalancing, or one of the shards fails over to a replica, a router updates the routing table in a way that’s transparent for the application.

Sharding is not integrated into any centralized configuration storage system. It is assumed that the application itself handles all the interactions with such systems and passes sharding parameters. That said, the configuration can be changed dynamically - for example, when adding or deleting one or several shards:

To add a new shard to the cluster, a system administrator first changes the configuration of all the routers and then the configuration of all the storages.
The new shard becomes available to the storage layer for rebalancing.
As a result of rebalancing, one of the vbuckets is moved to the new shard.
When trying to access the vbucket, a router receives a special error code that specifies the new vbucket location.

CRUD (create, read, update, delete) operations

CRUD operations can be:

executed in a stored procedure inside a storage, or
initiated by the application.

In any case, the application must include the operation bucket id in a request. When executing an INSERT request, the operation bucket id is stored in a newly created tuple. In other cases, it is checked if the specified operation bucket id matches the bucket id of a tuple being modified.

SELECT requests

Since a storage is not aware of the mapping between a bucket id and a primary key, all the SELECT requests executed in stored procedures inside a storage are only executed locally. SELECT requests initiated by the application are forwarded to a router. Then, if the application has passed a bucket id, the router uses it to determine the shard.

Calling stored procedures

There are several ways of calling stored procedures in cluster replica sets. Stored procedures can be called:

on a specific vbucket located in a replica set (in this case, it is necessary to differentiate between read and write procedures, as write procedures are not applicable to vbuckets that are being migrated), or
without specifying any particular vbucket.

All the routing validity checks performed for sharded DML operations hold true for vbucket-bound stored procedures as well.

Rebalancer

The rebalancer is a background process that ensures an even distribution of buckets across the shards. During rebalancing, buckets are migrated among replica sets.

The rebalancer “wakes up” periodically and redistributes data from the most loaded nodes to less loaded nodes. Rebalancing starts if the replicaset disbalance exceeds the disbalance threshold specified in the configuration.

The replicaset disbalance is calculated as follows:

|etalon_bucket_number - real_bucket_number| / etalon_bucket_number * 100
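
The formula can be tried out with a few numbers. In the sketch below, etalon_bucket_number is taken to be the reference (ideal) number of buckets for a replica set and real_bucket_number the number it currently holds. The function names and the threshold_percent parameter are hypothetical and only serve this example.

```lua
-- Disbalance of a replica set, in percent. The multiplication is done
-- before the division to keep the arithmetic exact for whole numbers;
-- the result is the same as in the formula above.
local function disbalance_percent(etalon_bucket_number, real_bucket_number)
    return math.abs(etalon_bucket_number - real_bucket_number) * 100
        / etalon_bucket_number
end

-- Rebalancing is needed when the disbalance exceeds the threshold.
local function needs_rebalancing(etalon, real, threshold_percent)
    return disbalance_percent(etalon, real) > threshold_percent
end
```

For example, a replica set that should hold 1000 buckets but holds 1200 has a disbalance of 20 percent, while one that holds 1005 has a disbalance of 0.5 percent and would not trigger rebalancing with a threshold of 1 percent.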

Migration of buckets

A replica set from which the bucket is being migrated is called the source; the replica set to which the bucket is being migrated is called the destination.

A replica set lock makes a replica set invisible to the rebalancer. A locked replica set can neither receive new buckets nor migrate its own buckets.

While a bucket is being migrated, it can have different states:

ACTIVE – the bucket is available for read and write requests.
PINNED – the bucket is locked and cannot be migrated to another replica set. Otherwise, pinned buckets are similar to buckets in the ACTIVE state.
SENDING – the bucket is currently being copied to the destination replica set; read requests to the source replica set are still processed.
RECEIVING – the bucket is currently being filled; all requests to it are rejected.
SENT – the bucket was migrated to the destination replica set. The router uses the SENT state to calculate the new location of the bucket. A bucket in the SENT state goes to the GARBAGE state automatically after 0.5 seconds.
GARBAGE – the bucket was already migrated to the destination replica set during rebalancing; or the bucket was initially in the RECEIVING state, but some error occurred during the migration.

Buckets in the GARBAGE state are deleted by the garbage collector.

Migration is performed as follows:

At the destination replica set, a new bucket is created and assigned the RECEIVING state, the data copying starts, and the bucket rejects all requests.
The source bucket in the source replica set is assigned the SENDING state, and the bucket continues to process read requests.
Once the data is copied, the bucket on the source replica set is assigned the SENT state and starts rejecting all requests.
The bucket on the destination replica set is assigned the ACTIVE state and starts accepting all requests.
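
The sequence of state changes listed above can be modelled with a small simulation. This is only a sketch of the ordering of states: it does not copy any data and does not use vshard. The replica sets are represented by plain tables, and the function name migrate is hypothetical.

```lua
-- Simulates the state changes of one bucket migration and returns
-- the ordered list of transitions as 'replica set:STATE' strings.
local function migrate(source, destination, bucket_id)
    local log = {}
    local function set(rs, state)
        rs.buckets[bucket_id] = state
        table.insert(log, rs.name .. ':' .. state)
    end
    -- In this model, only ACTIVE buckets can be migrated
    -- (PINNED buckets are locked against migration).
    assert(source.buckets[bucket_id] == 'ACTIVE',
        'bucket must be ACTIVE on the source')
    set(destination, 'RECEIVING') -- new bucket rejects all requests
    set(source, 'SENDING')        -- source still serves read requests
    -- ... the data would be copied here ...
    set(source, 'SENT')           -- source rejects all requests
    set(destination, 'ACTIVE')    -- destination accepts all requests
    return log
end
```

After the call, the bucket is in the SENT state on the source and in the ACTIVE state on the destination, which matches the last two steps of the list.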

Note

A specific error, vshard.error.code.TRANSFER_IS_IN_PROGRESS, is returned when a request tries to perform an action that is not applicable to a bucket which is being relocated. You need to retry the request in this case.
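
The retry advice from the note can be wrapped in a small helper. The sketch below is written in plain Lua and does not depend on vshard: the request and the check for the "transfer in progress" error are both passed in as functions. The helper name call_with_retry and the shape of the error object are assumptions for this example.

```lua
-- Calls `request()` until it succeeds, fails with a non-retryable error,
-- or `max_attempts` is reached. `request` is expected to return
-- `result` on success or `nil, err` on failure.
-- Returns: result, nil, attempts  or  nil, err.
local function call_with_retry(request, is_transfer_in_progress, max_attempts)
    local res, err
    for attempt = 1, max_attempts do
        res, err = request()
        if err == nil then
            return res, nil, attempt
        end
        if not is_transfer_in_progress(err) then
            break
        end
        -- A real application would pause here before the next attempt.
    end
    return nil, err
end
```

With vshard, the predicate could compare an error code field with vshard.error.code.TRANSFER_IS_IN_PROGRESS, assuming that the error object exposes such a field.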

The _bucket system space

The _bucket system space of each replica set stores the ids of buckets present in the replica set. The space contains the following fields:

bucket – bucket id
status – state of the bucket
destination – UUID of the destination replica set

An example of _bucket.select{}:

---
- - [1, ACTIVE, abfe2ef6-9d11-4756-b668-7f5bc5108e2a]
  - [2, SENT, 19f83dcb-9a01-45bc-a0cf-b0c5060ff82c]
...

Once the bucket is migrated, the UUID of the destination replica set is filled in the table. While the bucket is still located on the source replica set, the destination UUID is NULL.

The routing table

A routing table on the router stores the map of all bucket ids to replica sets. It ensures the consistency of sharding in case of failover.

The router keeps a persistent pool of connections to all the storages; the pool is created at startup. This helps prevent configuration errors. Once the connection pool is created, the router caches the current state of the routing table in order to speed up routing. If a bucket has migrated to another storage after rebalancing, or a failover has caused one of the shards to switch to another replica, the discovery fiber on the router updates the routing table automatically.

As the bucket id is explicitly indicated both in the data and in the mapping table on the router, the data is consistent regardless of the application logic. It also makes rebalancing transparent for the application.

Processing requests

Requests to the database can be performed by the application or using stored procedures. Either way, the bucket id should be explicitly specified in the request.

All requests are forwarded to the router first. The only operation supported by the router is call. The operation is performed via the vshard.router.call() function:

result = vshard.router.call(<bucket_id>, <mode>, <function_name>, {<argument_list>}, {<opts>})

Requests are processed as follows:

The router uses the bucket id to search for a replica set with the corresponding bucket in the routing table.

If the map of the bucket id to the replica set is not known to the router (the discovery fiber hasn’t filled the table yet), the router makes requests to all storages to find out where the bucket is located.
Once the bucket is located, the shard checks:
- whether the bucket is stored in the _bucket system space of the replica set;
- whether the bucket is ACTIVE or PINNED (for a read request, it can also be SENDING).
If all the checks succeed, the request is executed. Otherwise, it is terminated with the error: “wrong bucket”.
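
The storage-side checks above can be modeled in a few lines of plain Lua. This is an illustrative sketch only, not the vshard implementation: check_bucket is a hypothetical name, and the buckets table is a stand-in for the _bucket system space.

```lua
-- Bucket states in which each kind of request may be served.
local READABLE = { ACTIVE = true, PINNED = true, SENDING = true }
local WRITABLE = { ACTIVE = true, PINNED = true }

-- buckets:   table bucket_id -> status (stand-in for the _bucket space)
-- bucket_id: id of the requested bucket
-- mode:      'read' or 'write'
local function check_bucket(buckets, bucket_id, mode)
    local status = buckets[bucket_id]
    if status == nil then
        return false, 'wrong bucket' -- the bucket is not stored on this replica set
    end
    local allowed = (mode == 'read') and READABLE or WRITABLE
    if not allowed[status] then
        return false, 'wrong bucket' -- the bucket is not in a suitable state
    end
    return true
end

local buckets = { [1] = 'ACTIVE', [2] = 'SENDING', [3] = 'RECEIVING' }
local read_from_sending = check_bucket(buckets, 2, 'read')   -- accepted
local write_to_sending = check_bucket(buckets, 2, 'write')   -- rejected
```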

Glossary

Vertical scaling: Adding more power to a single server: using a more powerful CPU, adding more capacity to RAM, adding more storage space, etc.
Horizontal scaling: Adding more servers to the pool of resources, then partitioning and distributing a dataset across the servers.
Sharding: A database architecture that allows partitioning a dataset using a sharding key and distributing a dataset across multiple servers. Sharding is a special case of horizontal scaling.
Node: A virtual or physical server instance.
Cluster: A set of nodes that make up a single group.
Storage: A node storing a subset of a dataset.
Replica set: A set of storage nodes storing copies of a dataset. Each storage in a replica set has a role, master or replica.
Master: A storage in a replica set processing read and write requests.
Replica: A storage in a replica set processing only read requests.
Read requests: Read-only requests, that is, select requests.
Write requests: Data-change operations, that is, create, update, and delete requests.
Buckets (virtual buckets): The abstract virtual nodes into which the dataset is partitioned by the sharding key (bucket id).
Bucket id: A sharding key defining which bucket belongs to which replica set. A bucket id may be calculated from a hash key.
Router: A proxy server responsible for routing requests from an application to nodes in a cluster.

Sharding with vshard

Sharding in Tarantool is implemented in the vshard module. For a quick start with vshard, refer to Creating a sharded cluster.

Note

Starting with the 3.0 version, the recommended way of configuring Tarantool is using a configuration file. The sharding section defines configuration parameters related to sharding. To learn how to configure vshard in code, see Configuration reference.

Installation

The vshard module is distributed separately from the main Tarantool package. To install the module, execute the following command:

$ tt rocks install vshard

If you are developing a sharded cluster application, add the vshard module dependency to a *.rockspec file:

dependencies = {
    'vshard == 0.1.27'
}

Note

The minimum required version of vshard is 0.1.25.

Configuration overview

Configuring settings related to sharding might include the following steps:

Configure connection settings to allow instances within a sharded cluster to communicate with each other.
Specify which role each replica set plays in a sharded cluster.
Configure how data is partitioned across shards.
Specify settings related to data rebalancing.

Connectivity

This section describes connection options that enable communication between instances within a sharded cluster. For general information about connections, see the Connections topic.

Advertise URI

In a sharded cluster configuration, you need to specify how a router and rebalancer connect to storages using the iproto.advertise.sharding option. In the example below, the storage user is used for this purpose:

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

The storage user should have the sharding role described in the next section.

Credentials

To allow a router and rebalancer to connect to storages, a user with the sharding role should be used. The example below shows how to grant the sharding role to the storage user:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]
    storage:
      password: 'secret'
      roles: [sharding]

The sharding role has different privileges depending on a replica set’s sharding role. For replica sets with the storage sharding role, the sharding credential role has the following privileges:

All privileges provided by the replication role.
Executing vshard.storage.* functions.

If a replica set does not have the storage sharding role, the sharding credential role does not have any privileges.

Sharding roles

Each replica set in a sharded cluster can have one of three roles:

router: a replica set acts as a router.
storage: a replica set acts as a storage.
rebalancer: a replica set acts as a rebalancer.

You can use the sharding.roles option to assign a specific role to a replica set or group of replica sets. In the example below, all replica sets in the storages group have the storage role while replica sets in the routers group have the router role.

groups:
  storages:
    sharding:
      roles: [storage]
    # ...
  routers:
    sharding:
      roles: [router]
    # ...

Note that the rebalancer role is optional. If it is not specified, a rebalancer is selected automatically from the master instances of replica sets. To specify the rebalancer manually or turn it off, use the sharding.rebalancer_mode option.

Data partitioning

This section describes configuration settings related to data partitioning. Learn how to define spaces to be sharded in Data definition.

Bucket count

To define the total number of buckets in a cluster, configure the sharding.bucket_count option at the global level. In the example below, sharding.bucket_count is set to 1000:

sharding:
  bucket_count: 1000

sharding.bucket_count should be several orders of magnitude larger than the potential number of cluster nodes considering potential scaling out in the future.

If the estimated number of nodes in a cluster is N, then the data set should be divided into 100N or even 1000N buckets depending on the planned scaling out. This number is greater than the potential number of cluster nodes in the system being designed.

Keep in mind that too many buckets can cause a need to allocate more memory to store routing information. On the other hand, an insufficient number of buckets can lead to decreased granularity when rebalancing.

Replica set weights

A replica set weight defines the storage capacity of the replica set: the larger the weight, the more buckets the replica set can store. You can configure a replica set weight using the sharding.weight option. Use this option to keep most of the data on a replica set with more memory. You can also assign a zero weight to a replica set to initiate migration of its buckets to the remaining cluster nodes.

In the example below, the storage-a replica set can store twice as much data as storage-b:

# ...
replicasets:
  storage-a:
    sharding:
      weight: 2
    # ...
  storage-b:
    sharding:
      weight: 1
    # ...

Data rebalancing

Rebalancing process

There is an etalon number of buckets for a replica set. (Etalon in this context means “ideal”.) If there is no deviation from this number in the whole replica set, then the buckets are distributed evenly.

The etalon number is calculated automatically considering the number of buckets in the cluster and the weights of the replica sets.

Rebalancing starts if the disbalance of a replica set exceeds the disbalance threshold specified in the configuration (the sharding.rebalancer_disbalance_threshold option).

The disbalance of a replica set, in percent, is calculated as follows:

|etalon_bucket_number - real_bucket_number| / etalon_bucket_number * 100

For example, a cluster is configured as follows:

The number of buckets (sharding.bucket_count) is set to 3000.
Weights of 3 replica sets are 1, 0.5, and 1.5.

In this case, the etalon numbers of buckets for the replica sets are:

1st replica set – 1000.
2nd replica set – 500.
3rd replica set – 1500.
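
The arithmetic behind these numbers can be reproduced with a short plain-Lua sketch. The function names etalon_counts and disbalance are hypothetical, and the sketch assumes that at least one weight is positive and that the etalon passed to disbalance is non-zero.

```lua
-- Etalon bucket numbers: bucket_count split in proportion to the weights.
local function etalon_counts(bucket_count, weights)
    local weight_sum = 0
    for _, w in ipairs(weights) do
        weight_sum = weight_sum + w
    end
    local result = {}
    for i, w in ipairs(weights) do
        result[i] = bucket_count * w / weight_sum
    end
    return result
end

-- Disbalance of one replica set, in percent:
-- |etalon_bucket_number - real_bucket_number| / etalon_bucket_number * 100
local function disbalance(etalon, real)
    return math.abs(etalon - real) / etalon * 100
end

-- The cluster from the example: 3000 buckets, weights 1, 0.5, and 1.5.
local etalons = etalon_counts(3000, { 1, 0.5, 1.5 })
-- A replica set with an etalon of 1000 that actually holds 900 buckets
-- has a disbalance of 10 percent.
local d = disbalance(etalons[1], 900)
```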

You can set a replica set weight to zero to initiate migration of its buckets to the remaining cluster nodes. You can also add a new replica set with a non-zero weight to initiate migration of the buckets from the existing replica sets.

When a new shard is added, the configuration must be reloaded on each instance so that buckets can migrate to the new shard:

If a centralized configuration storage is used, Tarantool reloads a changed configuration automatically.
If a local configuration file is used, you need to reload a configuration on all the routers first and then on all the storages.

Parallel rebalancing

Originally, vshard had quite a simple rebalancer: one process on one node that calculated routes, that is, which replica sets should send buckets, how many, and to whom. The nodes applied these routes one by one sequentially.

Unfortunately, such a simple scheme was not fast enough, especially for Vinyl, where the cost of reading from disk was comparable to network costs. In fact, with Vinyl the rebalancer routes applier was sleeping most of the time.

Now each node can send multiple buckets in parallel in a round-robin manner to multiple destinations, or to just one.

To set the degree of parallelism, use the sharding.rebalancer_max_sending option:

sharding:
  rebalancer_max_sending: 5

Note

Specifying sharding.rebalancer_max_sending = N probably won’t speed up rebalancing N times. The actual gain depends on the network, the disk, and the number of other fibers in the system.

Example 1

You have 10 replica sets and a new one is added. Now all the 10 replica sets will try to send buckets to the new one.

Assume that each replica set can send up to 5 buckets at once. In that case, the new replica set will experience a rather big load of 50 buckets being downloaded at once. If the node needs to do some other work, such a big load may be undesirable. Also, too many parallel buckets can cause timeouts in the rebalancing process itself.

To fix the problem, you can set a lower value for rebalancer_max_sending for old replica sets, or decrease rebalancer_max_receiving for the new one. In the latter case, some workers on old nodes will be throttled, and you will see that in the logs.

rebalancer_max_sending is important if you have restrictions on the maximum number of buckets that can be read-only at the same time in the cluster. Remember that while a bucket is being sent, it does not accept new write requests.

Example 2

You have 100000 buckets and each bucket stores ~0.001% of your data. The cluster has 10 replica sets, and you can never afford to have more than 0.1% of the data locked for writes. In this case, do not set rebalancer_max_sending higher than 10 on these nodes. This guarantees that the rebalancer won’t send more than 100 buckets at once in the whole cluster.

If rebalancer_max_sending is too high and rebalancer_max_receiving is too low, some buckets will attempt to relocate and fail. Such failed attempts consume network resources and time, so configure these parameters so that they do not conflict with each other.
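
The numbers in both examples follow from simple arithmetic, sketched here in plain Lua. The helper names max_incoming and max_sending_limit are hypothetical and are not vshard options; the second helper takes the allowed number of simultaneously locked buckets as an input.

```lua
-- Example 1: worst-case number of buckets a new replica set receives at once
-- when every existing replica set sends at full parallelism.
local function max_incoming(sender_count, max_sending)
    return sender_count * max_sending
end

-- Example 2: the largest per-replica-set rebalancer_max_sending that keeps the
-- number of simultaneously locked buckets within max_locked_buckets.
local function max_sending_limit(max_locked_buckets, replicaset_count)
    return math.floor(max_locked_buckets / replicaset_count)
end

-- 10 senders with rebalancer_max_sending = 5 give 50 incoming buckets.
local incoming = max_incoming(10, 5)

-- 0.1% of the data at ~0.001% per bucket is 100 buckets;
-- spread over 10 replica sets, that is 10 buckets per replica set.
local limit = max_sending_limit(100, 10)
```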

Replica set lock and bucket pin

A replica set lock (sharding.lock) makes a replica set invisible to the rebalancer: a locked replica set can neither receive new buckets nor migrate its own buckets.

A bucket pin (vshard.storage.bucket_pin(bucket_id)) blocks a specific bucket from migrating: a pinned bucket stays on the replica set to which it is pinned until it is unpinned.

Pinning all replica set buckets is not equivalent to locking a replica set. Even if you pin all buckets, a non-locked replica set can still receive new buckets.

A replica set lock is helpful, for example, to separate a replica set from production replica sets for testing, or to preserve some application metadata that must not be sharded for a while. A bucket pin is used for similar cases but in a smaller scope.

By both locking a replica set and pinning all buckets, you can isolate an entire replica set.

Locked replica sets and pinned buckets affect the rebalancing algorithm as the rebalancer must ignore locked replica sets and consider pinned buckets when attempting to reach the best possible balance.

The issue is not trivial as a user can pin too many buckets to a replica set, so a perfect balance becomes unreachable. For example, consider the following cluster (assume all replica set weights are equal to 1).

The initial configuration:

rs1: bucket_count = 150
rs2: bucket_count = 150, pinned_count = 120

Adding a new replica set:

rs1: bucket_count = 150
rs2: bucket_count = 150, pinned_count = 120
rs3: bucket_count = 0

The perfect balance would be 100 - 100 - 100, which is impossible since the rs2 replica set has 120 pinned buckets. The best possible balance here is the following:

rs1: bucket_count = 90
rs2: bucket_count = 120, pinned_count = 120
rs3: bucket_count = 90

The rebalancer moved as many buckets as possible from rs2 to decrease the disbalance. At the same time, it respected equal weights of rs1 and rs3.

The algorithms for implementing locks and pins are completely different, although they look similar in terms of functionality.

Replica set lock and rebalancing

Locked replica sets do not participate in rebalancing. This means that even if the actual total number of buckets is not equal to the etalon number, the disbalance cannot be fixed due to the lock. When the rebalancer detects that one of the replica sets is locked, it recalculates the etalon number of buckets of the non-locked replica sets as if the locked replica set and its buckets did not exist at all.

Bucket pin and rebalancing

Rebalancing replica sets with pinned buckets requires a more complex algorithm. Here pinned_count is the number of pinned buckets, and etalon_count is the etalon number of buckets for a replica set:

The rebalancer calculates the etalon number of buckets as if all buckets were not pinned. Then the rebalancer checks each replica set and compares the etalon number of buckets with the number of pinned buckets in a replica set. If pinned_count < etalon_count, non-locked replica sets (at this point all locked replica sets already are filtered out) with pinned buckets can receive new buckets.
If pinned_count > etalon_count, the disbalance cannot be fixed, as the rebalancer cannot move pinned buckets out of this replica set. In such a case the etalon number is updated and set equal to the number of pinned buckets. The replica sets with pinned_count > etalon_count are not processed by the rebalancer, and the number of pinned buckets is subtracted from the total number of buckets. The rebalancer tries to move out as many buckets as possible from such replica sets.
This procedure is restarted from step 1 for replica sets with pinned_count >= etalon_count until pinned_count <= etalon_count on all replica sets. The procedure is also restarted when the total number of buckets is changed.

Here is the pseudocode for the algorithm:

function cluster_calculate_perfect_balance(replicasets, bucket_count)
        -- rebalance the buckets using weights of the still viable replica sets --
end;

cluster = <all of the non-locked replica sets>;
bucket_count = <the total number of buckets in the cluster>;
can_reach_balance = false
while not can_reach_balance do
        can_reach_balance = true
        cluster_calculate_perfect_balance(cluster, bucket_count);
        foreach replicaset in cluster do
                if replicaset.perfect_bucket_count <
                   replicaset.pinned_bucket_count then
                        can_reach_balance = false
                        bucket_count -= replicaset.pinned_bucket_count;
                        replicaset.perfect_bucket_count =
                                replicaset.pinned_bucket_count;
                end;
        end;
end;
cluster_calculate_perfect_balance(cluster, bucket_count);

The complexity of the algorithm is O(N^2), where N is the number of replica sets. On each step, the algorithm either finishes the calculation, or ignores at least one new replica set overloaded with the pinned buckets, and updates the etalon number of buckets on other replica sets.
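
For illustration, the pseudocode can be turned into a runnable plain-Lua sketch. This is a simplified model under stated assumptions, not the vshard source: calculate_etalons and the name, weight, and pinned fields are hypothetical, the input is expected to contain only non-locked replica sets with positive weights, and etalon numbers are left as fractions instead of being rounded.

```lua
-- replicasets:  array of { name = ..., weight = ..., pinned = ... }
-- bucket_count: total number of buckets on these replica sets
-- Returns a table name -> etalon number of buckets.
local function calculate_etalons(replicasets, bucket_count)
    local active = {}
    for _, rs in ipairs(replicasets) do
        active[#active + 1] = rs
    end
    local etalons = {}
    local remaining = bucket_count
    local can_reach_balance = false
    while not can_reach_balance do
        can_reach_balance = true
        local weight_sum = 0
        for _, rs in ipairs(active) do
            weight_sum = weight_sum + rs.weight
        end
        local still_active, next_remaining = {}, remaining
        for _, rs in ipairs(active) do
            local share = remaining * rs.weight / weight_sum
            if share < rs.pinned then
                -- Overloaded with pinned buckets: fix its etalon and
                -- exclude it and its pinned buckets from the next pass.
                etalons[rs.name] = rs.pinned
                next_remaining = next_remaining - rs.pinned
                can_reach_balance = false
            else
                etalons[rs.name] = share
                still_active[#still_active + 1] = rs
            end
        end
        active, remaining = still_active, next_remaining
    end
    return etalons
end

-- The cluster from the example above: 300 buckets, equal weights,
-- 120 buckets pinned on rs2.
local etalons_with_pins = calculate_etalons({
    { name = 'rs1', weight = 1, pinned = 0 },
    { name = 'rs2', weight = 1, pinned = 120 },
    { name = 'rs3', weight = 1, pinned = 0 },
}, 300)
```

For this input the sketch yields 90, 120, and 90 buckets for rs1, rs2, and rs3. Each pass either finishes or excludes at least one replica set, which matches the O(N^2) bound.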

Bucket ref

Bucket ref is an in-memory counter that is similar to the bucket pin, but has the following differences:

Bucket ref is not persistent. Refs are intended for forbidding bucket transfer during request execution, but on restart all requests are dropped.
There are two types of bucket refs: read-only (RO) and read-write (RW).

If a bucket has RW refs, it cannot be moved. However, when the rebalancer needs it to be sent, it locks the bucket for new write requests, waits until all current requests are finished, and then sends the bucket.

If a bucket has RO refs, it can be sent, but cannot be dropped. Such a bucket can even enter GARBAGE or SENT state, but its data is kept until the last reader is gone.

A single bucket can have both RO and RW refs.
Bucket ref is countable.

The vshard.storage.bucket_ref() and vshard.storage.bucket_unref() methods are called automatically when vshard.router.call() or vshard.storage.call() is used. If you use the raw API, for example r = vshard.router.route() followed by r:callro() or r:callrw(), explicitly call bucket_ref() inside the function. Also, make sure that you call bucket_unref() after bucket_ref(); otherwise, the bucket cannot be moved from the storage until the instance is restarted.
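
One way to make sure that every bucket_ref() is paired with a bucket_unref() is to wrap the work in a helper. The sketch below is an assumption-laden illustration: with_bucket_ref is a hypothetical helper, the storage argument stands for vshard.storage in a real application, and bucket_ref(bucket_id, mode) is assumed to return true on success or nil and an error. Here it is exercised with a stub so that it runs in plain Lua.

```lua
-- storage:   a table with bucket_ref/bucket_unref (vshard.storage in a real app)
-- bucket_id: id of the bucket to protect from being moved or dropped
-- mode:      'read' or 'write'
-- fn:        the function doing the actual work; returns a single value
local function with_bucket_ref(storage, bucket_id, mode, fn, ...)
    local ok, err = storage.bucket_ref(bucket_id, mode)
    if not ok then
        return nil, err
    end
    -- pcall guarantees that bucket_unref() runs even if fn raises an error.
    local success, result = pcall(fn, ...)
    storage.bucket_unref(bucket_id, mode)
    if not success then
        return nil, result
    end
    return result
end

-- A stub that only counts outstanding refs.
local refs = 0
local stub_storage = {
    bucket_ref = function() refs = refs + 1; return true end,
    bucket_unref = function() refs = refs - 1; return true end,
}

local value = with_bucket_ref(stub_storage, 1, 'read',
                              function(x) return x * 2 end, 21)
local failed, failure = with_bucket_ref(stub_storage, 1, 'read',
                                        function() error('boom') end)
```

After both calls the stub counter is back to zero, including the call whose function raised an error.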

To see how many refs there are for a bucket, use vshard.storage.buckets_info([bucket_id]) (the bucket_id parameter is optional).

For example:

vshard.storage.buckets_info(1)
---
- 1:
    status: active
    ref_rw: 1
    ref_ro: 1
    ro_lock: true
    rw_lock: true
    id: 1

Defining and manipulating data

Data definition

Sharded spaces should be defined in a storage application inside box.once() and should have a field with bucket id values. This field should meet the following requirements:

The field’s data type can be unsigned, number, or integer.
The field must be non-nullable.
The field must be indexed by the shard_index. The default name for this index is bucket_id.

In the example below, the bands space has the bucket_id field, which is used to partition a dataset across different storage instances:

box.once('bands', function()
    box.schema.create_space('bands', {
        format = {
            { name = 'id', type = 'unsigned' },
            { name = 'bucket_id', type = 'unsigned' },
            { name = 'band_name', type = 'string' },
            { name = 'year', type = 'unsigned' }
        },
        if_not_exists = true
    })
    box.space.bands:create_index('id', { parts = { 'id' }, if_not_exists = true })
    box.space.bands:create_index('bucket_id', { parts = { 'bucket_id' }, unique = false, if_not_exists = true })
end)

Example on GitHub: sharded_cluster

Note

In a sharded space, uniqueness by secondary index is only guaranteed within a single shard, not across the whole cluster.

Data manipulation

All DML operations with data should be performed via a router using the vshard.router.call functions, such as vshard.router.callrw() or vshard.router.callro(). For example, a storage application has the insert_band function used to insert new tuples:

function insert_band(id, bucket_id, band_name, year)
    box.space.bands:insert({ id, bucket_id, band_name, year })
end

In a router application, you can define the put function that specifies how a router selects the storage to write data:

function put(id, band_name, year)
    local bucket_id = vshard.router.bucket_id_mpcrc32({ id })
    vshard.router.callrw(bucket_id, 'insert_band', { id, bucket_id, band_name, year })
end

Learn more at Processing requests.

Deduplication of non-idempotent requests

Idempotent requests produce the same result every time they are executed. For example, a data read request or a multiplication by one are both idempotent. In contrast, incrementing by one is a non-idempotent operation: when it is applied again, the value of the field increases by 2 instead of just 1.

Note

Any write requests that are intended to be executed repeatedly (for example, retried after an error) should be idempotent. The operations’ idempotency ensures that the change is applied only once.

A request may need to be run again if an error occurs on the server or client side. In this case:

Read requests can be executed repeatedly. For this purpose, vshard.router.call() (with mode=read) uses the request_timeout parameter (since vshard 0.1.28). It is necessary to pass the request_timeout and timeout parameters together, with the following requirement:
```
timeout > request_timeout
```
For example, if timeout = 10 and request_timeout = 2, within 10 seconds the router is able to make 5 attempts (2 seconds each) to send a request to different replicas until the request finally succeeds.
Write requests (vshard.router.callrw()) generally cannot be re-executed without verifying that they have not been applied before. Lack of such a check might lead to duplicate records or unplanned data changes.
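
The relation between timeout and request_timeout for read requests can be checked with a small plain-Lua helper. max_attempts is a hypothetical name used only for this sketch; it validates the requirement stated above and estimates how many attempts fit into the overall timeout.

```lua
-- timeout:         overall time budget for the call, in seconds
-- request_timeout: time budget for a single attempt, in seconds
local function max_attempts(timeout, request_timeout)
    assert(request_timeout > 0, 'request_timeout must be positive')
    assert(timeout > request_timeout,
           'timeout must be greater than request_timeout')
    return math.floor(timeout / request_timeout)
end

-- timeout = 10 and request_timeout = 2 leave room for 5 attempts.
local attempts = max_attempts(10, 2)
```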

For example, a client has sent a request to the server. The client is waiting for a response within a specified timeout. If the server sends a successful response after this time has elapsed, the client won’t see this response due to a timeout, and will consider the request as failed. When re-executing this request without additional check, the operation may be applied twice.

A write request can be executed repeatedly without a check in two cases:
- The request is idempotent.
- It’s known for sure that the previous request raised an error before executing any write operations. For example, ER_READONLY was thrown by the server. In this case, we know that the request couldn’t complete because the server is in read-only mode.

Deduplication examples

To ensure that the write requests (INSERT, UPDATE, UPSERT, and autoincrement) are idempotent, you should implement a check that the request is applied for the first time.

Note

There is no built-in deduplication check in Tarantool. Currently, deduplication can only be implemented by the user in the application code.

For example, when you add a new tuple to a space, you can use a unique insert ID to check the request. In the example below, within a single transaction:

It is checked whether a tuple with the key ID exists in the bands space.
If there is no tuple with this ID in the space, the tuple is inserted.

box.begin()
if box.space.bands:get{key} == nil then
    box.space.bands:insert{key, value}
end
box.commit()

For update and upsert requests, you can create a deduplication space where the request IDs will be saved. A deduplication space is a user space that contains a list of unique identifiers. Each identifier corresponds to one applied request. This space can have any name; in the example, it is called deduplication.

In the example below, within a single transaction:

It is checked whether the deduplication_key request ID exists in the deduplication space.
If there is no such ID, the ID is added to the deduplication space.
If the request hasn’t been applied before, it increments the specified field in the bands space by one.

This approach ensures that each data modification request will be executed only once.

function update_1(deduplication_key, key)
    box.begin()
    if box.space.deduplication:get{deduplication_key} == nil then
        box.space.deduplication:insert{deduplication_key}
        box.space.bands:update(key, {{'+', 'value', 1 }})
    end
    box.commit()
end

Sharded cluster maintenance

Master crash

If a replica set master fails, it is recommended to:

Switch one of the replicas into the master mode. This allows the new master to process all the incoming requests.
Update the configuration of all the cluster members. This forwards all the requests to the new master.

Replica set crash

In case a whole replica set fails, some part of the dataset becomes inaccessible. Meanwhile, the router tries to reconnect to the master of the failed replica set. This way, once the replica set is up and running again, the cluster is automatically restored.

Master scheduled downtime

To perform a scheduled downtime of a replica set master, it is recommended to:

Update the configuration to use another instance as a master.
Reload the configuration on all the instances. All the requests are then forwarded to the new master.
Shut down the old master.

Replica set scheduled downtime

To perform a scheduled downtime of a replica set, it is recommended to:

Migrate all the buckets to the other cluster storages. You can do this by assigning a zero weight to a replica set to initiate migration of its buckets to the remaining cluster nodes.
Update the configuration of all the nodes.
Shut down the replica set.

Fibers

Searches for buckets, buckets recovery, and buckets rebalancing are performed automatically and do not require manual intervention.

Technically, there are multiple fibers responsible for different types of operations:

a discovery fiber on the router searches for buckets in the background
a failover fiber on the router maintains replica connections
a garbage collector fiber on each master storage removes the contents of buckets that were moved
a bucket recovery fiber on each master storage recovers buckets in the SENDING and RECEIVING states in case of reboot
a rebalancer on a single master storage among all replica sets executes the rebalancing process.

See the Rebalancing process and Migration of buckets sections for details.

Garbage collector

A garbage collector fiber runs in the background on the master storages of each replica set. It starts deleting the contents of the bucket in the GARBAGE state part by part. Once the bucket is empty, its record is deleted from the _bucket system space.

Bucket recovery

A bucket recovery fiber runs on the master storages. It helps to recover buckets in the SENDING and RECEIVING states in case of reboot.

Buckets in the SENDING state are recovered as follows:

The system first searches for buckets in the SENDING state.
If such a bucket is found, the system sends a request to the destination replica set.
If the bucket on the destination replica set is ACTIVE, the original bucket is deleted from the source node.

Buckets in the RECEIVING state are deleted without extra checks.

Failover

A failover fiber runs on every router. If a master of a replica set becomes unavailable, the failover fiber redirects read requests to the replicas. Write requests are rejected with an error until the master becomes available.
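
The failover rule can be modeled as a small decision function. This is a simplified plain-Lua model with hypothetical names (pick_instance and the master, replicas, name, and available fields); it is not how the failover fiber is implemented and it ignores replica priorities.

```lua
-- replicaset: { master = { name, available }, replicas = { { name, available }, ... } }
-- mode:       'read' or 'write'
-- Returns the name of the instance that should serve the request, or nil and an error.
local function pick_instance(replicaset, mode)
    if replicaset.master.available then
        return replicaset.master.name
    end
    if mode == 'read' then
        -- The master is down: read requests are redirected to a replica.
        for _, replica in ipairs(replicaset.replicas) do
            if replica.available then
                return replica.name
            end
        end
        return nil, 'no available instance'
    end
    -- Write requests are rejected until the master becomes available.
    return nil, 'master is unavailable'
end

local degraded = {
    master = { name = 'm', available = false },
    replicas = {
        { name = 'r1', available = false },
        { name = 'r2', available = true },
    },
}
local read_target = pick_instance(degraded, 'read')
local write_target, write_error = pick_instance(degraded, 'write')
```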

Connections and authentication

This section contains guides on how to configure connections and authentication features.

Connections

To set up a Tarantool cluster, you need to enable communication between its instances, regardless of whether they are running on the same host or on different hosts. This requires configuring connection settings that include:

One or several URIs used to listen for incoming requests.
A URI used to advertise an instance to other cluster members. This URI lets other cluster members know how to connect to the current Tarantool instance.
(Optional) SSL settings used to secure connections between instances.

Configuring connection settings is also required to enable communication of a Tarantool cluster to external systems. For example, this might be administering cluster members using tt, managing clusters using Tarantool Cluster Manager, or using connectors for different languages.

This topic describes how to define connection settings in the iproto section of a YAML configuration.

Note

iproto is a binary protocol used to communicate between cluster instances and with external systems.

Listen URI

To configure URIs used to listen for incoming requests, use the iproto.listen configuration option.

One listen address

The example below shows how to set a listening IP address for instance001 to 127.0.0.1:3301:

instance001:
  iproto:
    listen:
    - uri: '127.0.0.1:3301'

Multiple listen addresses

In this example, instance001 listens on two IP addresses:

instance001:
  iproto:
    listen:
    - uri: '127.0.0.1:3301'
    - uri: '127.0.0.1:3302'

Listen port

You can pass only a port value to iproto.listen:

instance001:
  iproto:
    listen:
    - uri: '3301'

In this case, this port is used for all IP addresses the server listens on.

SSL parameters

In the Enterprise Edition, you can enable SSL for a connection using the params section of the specified URI:

instance001:
  iproto:
    listen:
    - uri: '127.0.0.1:3301'
      params:
        transport: 'ssl'
        ssl_cert_file: 'certs/server.crt'
        ssl_key_file: 'certs/server.key'

Learn more from Securing connections with SSL.

Unix domain socket

For local development, you can enable communication between cluster members by using Unix domain sockets:

instance001:
  iproto:
    listen:
    - uri: 'unix/:./var/run/{{ instance_name }}/tarantool.iproto'

Advertise URI

An advertise URI (iproto.advertise.*) lets other cluster members or clients know how to connect to the current Tarantool instance:

iproto.advertise.peer specifies how to advertise the instance to other cluster members.
iproto.advertise.sharding specifies how to advertise the instance to a router and rebalancer.
iproto.advertise.client accepts a URI used to advertise the instance to clients.

iproto.advertise.<peer_or_sharding> might include the credentials required to connect to this instance, a URI used to listen for incoming requests, and SSL settings.

If iproto.advertise.<peer_or_sharding>.uri is not specified explicitly, a listen URI of this instance is used. In this case, you need at least to specify credentials for connecting to this instance.

Connection credentials

In the example below, the iproto.advertise.peer option is used to inform other replica set members that the replicator user should be used to connect to the current instance:

iproto:
  advertise:
    peer:
      login: replicator

In a sharded cluster, iproto.advertise.sharding specifies that a router and rebalancer should use the storage user to connect to storages:

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

URI

If required, you can specify an advertise URI explicitly by setting up the iproto.advertise.<peer_or_sharding>.uri option. In the example below, iproto.listen includes two URIs that can be used to connect to instance001 but only the second one is used to advertise this instance to other replica set peers:

instance001:
  iproto:
    listen:
    - uri: '127.0.0.1:3301'
    - uri: '127.0.0.1:4401'
    advertise:
      peer:
        uri: '127.0.0.1:4401'

The iproto.advertise.<peer_or_sharding>.uri option can also accept an FQDN instead of an IP address:

instance001:
  iproto:
    listen:
    - uri: '192.168.0.101:3301'
    advertise:
      peer:
        uri: 'server001.example.com:3301'

To learn about the specifics of configuring an advertise URI’s SSL settings, see Advertise URI specifics.

Securing connections with SSL

Enterprise Edition

SSL is supported by the Enterprise Edition only.

Tarantool supports the use of SSL connections to encrypt client-server communications for increased security. To enable SSL, use the <uri>.params.* options, which can be applied to both listen and advertise URIs.

Without CA

The example below demonstrates how to enable traffic encryption by using a self-signed server certificate. The following parameters are specified for each instance:

ssl_cert_file: a path to an SSL certificate file.
ssl_key_file: a path to a private SSL key file.

instances:
  instance001:
    iproto:
      listen:
      - uri: '127.0.0.1:3301'
        params:
          transport: 'ssl'
          ssl_cert_file: 'certs/server.crt'
          ssl_key_file: 'certs/server.key'
  instance002:
    iproto:
      listen:
      - uri: '127.0.0.1:3302'
        params:
          transport: 'ssl'
          ssl_cert_file: 'certs/server.crt'
          ssl_key_file: 'certs/server.key'
  instance003:
    iproto:
      listen:
      - uri: '127.0.0.1:3303'
        params:
          transport: 'ssl'
          ssl_cert_file: 'certs/server.crt'
          ssl_key_file: 'certs/server.key'

You can find the full example here: ssl_without_ca.

With CA

The example below demonstrates how to enable traffic encryption by using a server certificate signed by a trusted certificate authority. In this case, all replica set peers verify each other’s authenticity.

The following parameters are specified for each instance:

ssl_ca_file: a path to a trusted certificate authorities (CA) file.
ssl_cert_file: a path to an SSL certificate file.
ssl_key_file: a path to a private SSL key file.
ssl_password (instance001): a password for an encrypted private SSL key.
ssl_password_file (instance002 and instance003): a text file containing passwords for encrypted SSL keys.
ssl_ciphers: a colon-separated list of SSL cipher suites the connection can use.

instances:
  instance001:
    iproto:
      listen:
      - uri: '127.0.0.1:3301'
        params:
          transport: 'ssl'
          ssl_ca_file: 'certs/root_ca.crt'
          ssl_cert_file: 'certs/instance001/server001.crt'
          ssl_key_file: 'certs/instance001/server001.key'
          ssl_password: 'qwerty'
          ssl_ciphers: 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'
  instance002:
    iproto:
      listen:
      - uri: '127.0.0.1:3302'
        params:
          transport: 'ssl'
          ssl_ca_file: 'certs/root_ca.crt'
          ssl_cert_file: 'certs/instance002/server002.crt'
          ssl_key_file: 'certs/instance002/server002.key'
          ssl_password_file: 'certs/ssl_passwords.txt'
          ssl_ciphers: 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'
  instance003:
    iproto:
      listen:
      - uri: '127.0.0.1:3303'
        params:
          transport: 'ssl'
          ssl_ca_file: 'certs/root_ca.crt'
          ssl_cert_file: 'certs/instance003/server003.crt'
          ssl_key_file: 'certs/instance003/server003.key'
          ssl_password_file: 'certs/ssl_passwords.txt'
          ssl_ciphers: 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'

You can find the full example here: ssl_with_ca.

Advertise URI specifics

SSL parameters for an advertise URI should be set only if this advertise URI is specified explicitly. Otherwise, SSL parameters of a listen URI are used and no additional configuration is required.

Configuring an advertise URI’s SSL options depends on whether a trusted certificate authorities (CA) file is set or not. Without the CA file, you only need to set iproto.advertise.<peer_or_sharding>.params.transport to ssl as shown below:

instance001:
  iproto:
    listen:
    - uri: '192.168.0.101:3301'
      params:
        transport: 'ssl'
        ssl_cert_file: 'certs/server.crt'
        ssl_key_file: 'certs/server.key'
    advertise:
      peer:
        uri: 'server.example.com:3301'
        params:
          transport: 'ssl'

If the CA file is specified for a listen URI, you also need to configure ssl_cert_file and ssl_key_file for this advertise URI:

instance001:
  iproto:
    listen:
    - uri: '192.168.0.101:3301'
      params:
        transport: 'ssl'
        ssl_ca_file: 'certs/root_ca.crt'
        ssl_cert_file: 'certs/instance001/server001.crt'
        ssl_key_file: 'certs/instance001/server001.key'
    advertise:
      peer:
        uri: 'server001.example.com:3301'
        params:
          transport: 'ssl'
          ssl_cert_file: 'certs/instance001/server001.crt'
          ssl_key_file: 'certs/instance001/server001.key'

Reloading certificates

To reload SSL certificate files specified in the configuration, open an admin console and reload the configuration using config.reload():

require('config'):reload()

New certificates will be used for new connections. Existing connections will continue using old SSL certificates until reconnection is required. For example, reconnection can be caused by certificate expiry or a network issue.

Credentials

Tarantool enables flexible management of access to various database resources by providing specific privileges to users. You can read more about the main concepts of the Tarantool access control system in the Access control section.

This topic describes how to create users and grant them the specified privileges in the credentials section of a YAML configuration. For example, you can define users with the replication and sharding roles to maintain replication and sharding in a Tarantool cluster.

Managing users and roles

Creating a user

You can create new users or configure the credentials of existing users in the credentials.users section.

In the example below, a dbadmin user without a password is created:

credentials:
  users:
    dbadmin: {}

To set a password, use the credentials.users.<username>.password option:

credentials:
  users:
    dbadmin:
      password: 'T0p_Secret_P@$$w0rd'

Granting privileges to a user

To assign a role to a user, use the credentials.users.<username>.roles option. In this example, the dbadmin user gets privileges granted to the super built-in role:

credentials:
  users:
    dbadmin:
      password: 'T0p_Secret_P@$$w0rd'
      roles: [ super ]

To create a new role, define it in the credentials.roles.* section. In the example below, the writers_space_reader role gets privileges to select data in the writers space:

roles:
  writers_space_reader:
    privileges:
    - permissions: [ read ]
      spaces: [ writers ]

Then, you can assign this role to a user using credentials.users.<username>.roles (sampleuser in the example below):

sampleuser:
  password: '123456'
  roles: [ writers_space_reader ]

You can grant specific privileges directly using credentials.users.<username>.privileges. In this example, sampleuser gets privileges to select and modify data in the books space:

sampleuser:
  password: '123456'
  roles: [ writers_space_reader ]
  privileges:
  - permissions: [ read, write ]
    spaces: [ books ]
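
The fragments above omit their parent sections. Assembled into one credentials section, they might look like the sketch below. It only combines the role and user shown earlier and assumes that both are nested under credentials.roles and credentials.users:

```yaml
credentials:
  roles:
    writers_space_reader:
      privileges:
      - permissions: [ read ]
        spaces: [ writers ]
  users:
    sampleuser:
      password: '123456'
      roles: [ writers_space_reader ]   # privileges inherited from the role
      privileges:                       # privileges granted directly
      - permissions: [ read, write ]
        spaces: [ books ]
```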

You can find the full example here: credentials.

Revoking privileges from a user

To revoke a previously granted privilege, remove it from the configuration.

For example, here is how to grant privileges to a space and how to revoke one of the privileges:

# grant privileges:
privileges:
- permissions: [read, write]
  spaces: [books]

# revoke a privilege:
privileges:
- permissions: [read] # !! write permission revoked !!
  spaces: [books]

If you want to revoke the remaining privilege from a space, you can remove it too, which makes permissions an empty array:

# empty permissions array:
privileges:
- permissions: [] # !! read permission revoked !!
  spaces: [books]

You can revoke all privileges by making the privileges an empty array:

# empty privileges array:
privileges: [] # !! no privileges at all !!

Warning

Do not remove a user or a role from configuration in order to revoke that user’s or role’s privileges. If a user or a role is entirely removed from the configuration, it is not tracked by configuration machinery anymore. The user/role is not removed and its privileges are not revoked.
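
Following this warning, a user whose access should be fully revoked stays in the configuration with empty lists. The sketch below assumes that empty roles and privileges arrays are accepted for a user; sampleuser and its password are taken from the earlier examples:

```yaml
credentials:
  users:
    sampleuser:            # the user entry is kept so that it stays tracked
      password: '123456'
      roles: []            # no roles assigned
      privileges: []       # no privileges granted directly
```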

Loading secrets from safe storage

Tarantool enables you to load secrets from safe storage such as external files or environment variables. To do this, you need to define corresponding options in the config.context section. In the examples below, context.dbadmin_password and context.sampleuser_password define how to load user passwords from *.txt files or environment variables.

This example shows how to load passwords from *.txt files:

config:
  context:
    dbadmin_password:
      from: file
      file: secrets/dbadmin_password.txt
      rstrip: true
    sampleuser_password:
      from: file
      file: secrets/sampleuser_password.txt
      rstrip: true

This example shows how to load passwords from environment variables:

config:
  context:
    dbadmin_password:
      from: env
      env: DBADMIN_PASSWORD
    sampleuser_password:
      from: env
      env: SAMPLEUSER_PASSWORD

These environment variables should be set before starting instances.

After configuring how to load passwords, you can set password values using credentials.users.<username>.password as follows:

credentials:
  users:
    dbadmin:
      password: '{{ context.dbadmin_password }}'
    sampleuser:
      password: '{{ context.sampleuser_password }}'

You can find the full examples here: credentials_context_file, credentials_context_env.

Authentication

Enterprise Edition

Authentication features are supported by the Enterprise Edition only.

Authentication restrictions

Tarantool Enterprise Edition provides the ability to apply additional restrictions for user authentication. For example, you can specify the minimum time between authentication attempts or turn off access for guest users.

In the configuration below, security.auth_retries is set to 2, which means that Tarantool lets a client try to authenticate with the same username three times. At the fourth attempt, the authentication delay configured with security.auth_delay is enforced. This means that a client should wait 10 seconds after the first failed attempt.

security:
  auth_delay: 10
  auth_retries: 2
  disable_guest: true

The disable_guest option turns off access over remote connections from unauthenticated or guest users.

Password policy

A password policy allows you to improve database security by enforcing the use of strong passwords, setting up a maximum password age, and so on. When you create a new user with box.schema.user.create or update the password of an existing user with box.schema.user.passwd, the password is checked against the configured password policy settings.

In the example below, the following options are specified:

password_min_length specifies that a password should be at least 16 characters.
password_enforce_lowercase and password_enforce_uppercase specify that a password should contain lowercase and uppercase letters.
password_enforce_digits and password_enforce_specialchars specify that a password should contain digits and at least one special character.
password_lifetime_days sets a maximum password age to 365 days.
password_history_length specifies that a new password should differ from the last three passwords.

security:
  password_min_length: 16
  password_enforce_lowercase: true
  password_enforce_uppercase: true
  password_enforce_digits: true
  password_enforce_specialchars: true
  password_lifetime_days: 365
  password_history_length: 3

Authentication protocol

By default, Tarantool uses the CHAP protocol to authenticate users and applies SHA-1 hashing to passwords. Note that CHAP stores password hashes in the _user space unsalted. If an attacker gains access to the database, they may crack a password, for example, using a rainbow table.

In the Enterprise Edition, you can enable PAP authentication with the SHA256 hashing algorithm. For PAP, a password is salted with a user-unique salt before saving it in the database, which keeps the database protected from cracking using a rainbow table.

To enable PAP, specify the security.auth_type option as follows:

security:
  auth_type: 'pap-sha256'

For new users, the box.schema.user.create method generates authentication data using PAP-SHA256. For existing users, you need to reset a password using box.schema.user.passwd to use the new authentication protocol.

Warning

Given that PAP transmits a password as plain text, Tarantool requires configuring SSL/TLS for a connection.
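
On the server side, this requirement means that PAP is configured together with an SSL listen URI. A minimal sketch, reusing the self-signed certificate paths from the Without CA example as placeholders:

```yaml
security:
  auth_type: 'pap-sha256'
iproto:
  listen:
  - uri: '127.0.0.1:3301'
    params:
      transport: 'ssl'                     # PAP sends the password as plain text
      ssl_cert_file: 'certs/server.crt'    # placeholder path
      ssl_key_file: 'certs/server.key'     # placeholder path
```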

The example below shows how to specify the authentication protocol using the auth_type parameter when connecting to an instance using net.box:

local connection = require('net.box').connect({
    uri = 'admin:topsecret@127.0.0.1:3301',
    params = { auth_type = 'pap-sha256',
               transport = 'ssl',
               ssl_cert_file = 'certs/server.crt',
               ssl_key_file = 'certs/server.key' }
})

If the authentication protocol isn’t specified explicitly on the client side, the client uses the protocol configured on the server via security.auth_type.

Security

This section contains guides related to security features.

Audit module

Enterprise Edition

The audit module is available in the Enterprise Edition only.

Example on GitHub: audit_log

The audit module allows you to record various events that occur in Tarantool. Each event is an action related to authorization and authentication, data manipulation, administrator activity, or system events.

The module provides detailed reports of these activities and helps you find and fix breaches to protect your business. For example, you can see who created a new user and when.

It is up to each company to decide exactly what activities to audit and what actions to take. System administrators, security engineers, and people in charge of the company may want to audit different events for different reasons. Tarantool provides such an option for each of them.

Configure audit log

This section describes how to enable and configure audit logging and write logs to a selected destination – a file, a pipe, or a system logger.

Enable audit logging

To enable audit logging, define the log location using the audit_log.to option in the configuration file. Possible log locations are a file, a pipe, or a system logger.

In the configuration below, the audit_log.to option is set to file. It means that the logs are written to a file. By default, audit logs are saved in the var/log/{{ instance_name }}/audit.log file. To specify the path to an audit log file explicitly, use the audit_log.file option.

audit_log:
  to: file
  file: 'audit_tarantool.log'

If you log to a file, Tarantool reopens the audit log at SIGHUP.

To disable audit logging, set the audit_log.to option to devnull.
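
For a destination other than a file, the configuration might look like the sketch below, which sends audit events to a system logger. It assumes that syslog is an accepted audit_log.to value and that the audit_log.syslog.* options are named as shown; the identity, facility, and server values are placeholders:

```yaml
audit_log:
  to: syslog
  syslog:
    identity: 'tarantool_audit'   # assumed option: application name in syslog
    facility: 'user'              # assumed option: syslog facility
    server: 'unix:/dev/log'       # assumed option: syslog server location
```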

Filter the events

Tarantool’s extensive filtering options help you write only the events you need to the audit log. To select the recorded events, use the audit_log.filter option. Its value can be a list of events and event groups. You can customize the filters and use different combinations of them for your purposes. Possible filtering options:

Filter by event. You can set a list of events to be recorded. For example, select password_change to monitor the users who have changed their passwords:
```
audit_log:
  filter: [ password_change ]
```
Filter by group. You can specify a list of event groups to be recorded. For example, select auth and priv to see the events related to authorization and granted privileges:
```
audit_log:
  filter: [ auth,priv ]
```
Filter by group and event. You can specify a group and a certain event depending on the purpose. In the configuration below, user_create, data_operations, ddl, and custom are selected to see the events related to:
- user creation
- space creation, altering, and dropping
- data modification or selection from spaces
- custom events (any events added manually using the audit module API)
```
filter: [ user_create,data_operations,ddl,custom ]
```

Set the format of audit log events

Use the audit_log.format option to choose the format of audit log events – plain text, CSV, or JSON.

format: json

JSON is used by default. It is more convenient to receive log events, analyze them, and integrate them with other systems if needed. The plain format can be efficiently compressed. The CSV format allows you to view audit log events in tabular form.

Specify the spaces to be logged

The audit_log.spaces option is used to specify a list of space names for which data operation events should be logged.

In the configuration below, only the events from the bands space are logged:

spaces: [ bands ]

Specify the logging mode in DML events

If set to true, the audit_log.extract_key option forces the audit subsystem to log the primary key instead of a full tuple in DML operations.

extract_key: true

Examples of audit log entries

In this example, the following audit log configuration is used:

audit_log:
  to: file
  file: 'audit_tarantool.log'
  filter: [ user_create,data_operations,ddl,custom ]
  format: json
  spaces: [ bands ]
  extract_key: true

Create a space bands and check the logs in the file after the creation:

box.schema.space.create('bands')

The audit log entry for the space_create event might look as follows:

{
  "time": "2024-01-24T11:43:21.566+0300",
  "uuid": "26af0a7d-1052-490a-9946-e19eacc822c9",
  "severity": "INFO",
  "remote": "unix/:(socket)",
  "session_type": "console",
  "module": "tarantool",
  "user": "admin",
  "type": "space_create",
  "tag": "",
  "description": "Create space Bands"
}

Then insert a tuple into the space:

box.space.bands:insert { 1, 'Roxette', 1986 }

If the extract_key option is set to true, the audit system prints the primary key instead of the full tuple:

{
  "time": "2024-01-24T11:45:42.358+0300",
  "uuid": "b437934d-62a7-419a-8d59-e3b33c688d7a",
  "severity": "VERBOSE",
  "remote": "unix/:(socket)",
  "session_type": "console",
  "module": "tarantool",
  "user": "admin",
  "type": "space_insert",
  "tag": "",
  "description": "Insert key [1] into space bands"
}

If the extract_key option is set to false, the audit system prints the full tuple like this:

{
  "time": "2024-01-24T11:45:42.358+0300",
  "uuid": "b437934d-62a7-419a-8d59-e3b33c688d7a",
  "severity": "VERBOSE",
  "remote": "unix/:(socket)",
  "session_type": "console",
  "module": "tarantool",
  "user": "admin",
  "type": "space_insert",
  "tag": "",
  "description": "Insert tuple [1, \"Roxette\", 1986] into space bands"
}

Audit log events

Event types

The Tarantool audit log module can record various events that you can monitor to decide whether you need to take action:

Administrator activity – events related to actions performed by the administrator. For example, such logs record the creation of a user.
Access events – events related to authorization and authentication of users. For example, such logs record failed attempts to access secure data.
Data access and modification – events of data manipulation in the storage.
System events – events related to modification or configuration of resources. For example, such logs record the replacement of a space.
Custom events – any events added manually using the audit module API.

The full list of available audit log events is provided in the table below:

Event	Event type	Severity level	Example
Audit log enabled for events	`audit_enable`	`VERBOSE`
Custom events	`custom`	`INFO` (default)
User authorized successfully	`auth_ok`	`VERBOSE`	`Authenticate user <USER>`
User authorization failed	`auth_fail`	`ALARM`	`Failed to authenticate user <USER>`
User logged out or quit the session	`disconnect`	`VERBOSE`	`Close connection`
User created	`user_create`	`INFO`	`Create user <USER>`
User dropped	`user_drop`	`INFO`	`Drop user <USER>`
Role created	`role_create`	`INFO`	`Create role <ROLE>`
Role dropped	`role_drop`	`INFO`	`Drop role <ROLE>`
User disabled	`user_disable`	`INFO`	`Disable user <USER>`
User enabled	`user_enable`	`INFO`	`Enable user <USER>`
User granted rights	`user_grant_rights`	`INFO`	`Grant <PRIVILEGE> rights for <OBJECT_TYPE> <OBJECT_NAME> to user <USER>`
User revoked rights	`user_revoke_rights`	`INFO`	`Revoke <PRIVILEGE> rights for <OBJECT_TYPE> <OBJECT_NAME> from user <USER>`
Role granted rights	`role_grant_rights`	`INFO`	`Grant <PRIVILEGE> rights for <OBJECT_TYPE> <OBJECT_NAME> to role <ROLE>`
Role revoked rights	`role_revoke_rights`	`INFO`	`Revoke <PRIVILEGE> rights for <OBJECT_TYPE> <OBJECT_NAME> from role <ROLE>`
User password changed	`password_change`	`INFO`	`Change password for user <USER>`
Failed attempt to access secure data (for example, personal records, details, geolocation)	`access_denied`	`ALARM`	`<ACCESS_TYPE> denied to <OBJECT_TYPE> <OBJECT_NAME>`
Expressions with arguments evaluated in a string	`eval`	`INFO`	`Evaluate expression <EXPR>`
Function called with arguments	`call`	`VERBOSE`	`Call function <FUNCTION> with arguments <ARGS>`
Iterator key selected from `space.index`	`space_select`	`VERBOSE`	`Select <ITER_TYPE> <KEY> from <SPACE>.<INDEX>`
Space created	`space_create`	`INFO`	`Create space <SPACE>`
Space altered	`space_alter`	`INFO`	`Alter space <SPACE>`
Space dropped	`space_drop`	`INFO`	`Drop space <SPACE>`
Tuple inserted into space	`space_insert`	`VERBOSE`	`Insert tuple <TUPLE> into space <SPACE>`
Tuple replaced in space	`space_replace`	`VERBOSE`	`Replace tuple <TUPLE> with <NEW_TUPLE> in space <SPACE>`
Tuple deleted from space	`space_delete`	`VERBOSE`	`Delete tuple <TUPLE> from space <SPACE>`

Note

The eval event displays data from the console module and the eval function of the net.box module. For more on how they work, see Module console and Module net.box – eval. To separate the data, filter by the session_type field: console or binary.

Structure of audit log event

Each audit log event contains a number of fields that can be used to filter and aggregate the resulting logs. An example of a Tarantool audit log entry in JSON:

{
    "time": "2024-01-15T13:39:36.046+0300",
    "uuid": "cb44fb2b-5c1f-4c4b-8f93-1dd02a76cec0",
    "severity": "VERBOSE",
    "remote": "unix/:(socket)",
    "session_type": "console",
    "module": "tarantool",
    "user": "admin",
    "type": "auth_ok",
    "tag": "",
    "description": "Authenticate user Admin"
}

Each event consists of the following fields:

Field	Description	Example
`time`	Time of the event	`2024-01-15T16:33:12.368+0300`
`uuid`	Since 3.0.0. A unique identifier of audit log event	`cb44fb2b-5c1f-4c4b-8f93-1dd02a76cec0`
`severity`	Since 3.0.0. A severity level. Each system audit event has a severity level determined by its importance. Custom events have the `INFO` severity level by default.	`VERBOSE`
`remote`	Remote host that triggered the event	`unix/:(socket)`
`session_type`	Session type	`console`
`module`	Audit log module. Set to `tarantool` for system events; can be overwritten for custom events	`tarantool`
`user`	User who triggered the event	`admin`
`type`	Audit event type	`auth_ok`
`tag`	A text field that can be overwritten by the user
`description`	Human-readable event description	`Authenticate user Admin`

Event groups

Built-in event groups are used to filter the event types that you want to audit. For example, you can choose to record only authorization events or only events related to a space.

Tarantool provides the following event groups:

all – all events. The call and eval events are included only in this group.
audit – audit_enable event.
auth – authorization events: auth_ok, auth_fail.
priv – events related to authentication, authorization, users, and roles: user_create, user_drop, role_create, role_drop, user_enable, user_disable, user_grant_rights, user_revoke_rights, role_grant_rights, role_revoke_rights.
ddl – events of space creation, altering, and dropping: space_create, space_alter, space_drop.
dml – events of data modification in spaces: space_insert, space_replace, space_delete.
data_operations – events of data modification or selection from spaces: space_select, space_insert, space_replace, space_delete.
compatibility – events available in Tarantool before version 2.10.0: auth_ok, auth_fail, disconnect, user_create, user_drop, role_create, role_drop, user_enable, user_disable, user_grant_rights, user_revoke_rights, role_grant_rights, role_revoke_rights, password_change, access_denied. This group enables compatibility with earlier Tarantool versions.

Warning

Be careful when recording all and data_operations event groups. The more events you record, the slower the requests are processed over time. It is recommended that you select only those groups whose events your company needs to monitor and analyze.

Custom events

Tarantool provides an API for writing custom audit log events. To enable these events, specify the custom value in the audit_log.filter option:

filter: [ user_create,data_operations,ddl,custom ]

Log a custom event

To log an event, use the audit.log() function that takes one of the following values:

Message string. Printed to the audit log with type message:
```
audit.log('Hello, Alice!')
```
Format string and arguments. Passed to string format and then output to the audit log with type message:
```
audit.log('Hello, %s!', 'Bob')
```

Table with audit log field values. The table must contain at least one field – description.

audit.log({ type = 'custom_hello', description = 'Hello, World!' })
audit.log({ type = 'custom_farewell', user = 'eve', module = 'custom', description = 'Farewell, Eve!' })

Alternatively, you can use audit.new() to create a new log module. This allows you to avoid passing all custom audit log fields each time audit.log() is called. The audit.new() function takes a table of audit log field values (same as audit.log()). The type of the log module for writing custom events must either be message or have the custom_ prefix.

local my_audit = audit.new({ type = 'custom_hello', module = 'my_module' })
my_audit:log('Hello, Alice!')
my_audit:log({ tag = 'admin', description = 'Hello, Bob!' })

Overwrite custom event fields

It is possible to overwrite most of the custom audit log fields using audit.new() or audit.log(). The only audit log field that cannot be overwritten is time.

audit.log({ type = 'custom_hello', description = 'Hello!',
            session_type = 'my_session', remote = 'my_remote' })

If omitted, session_type is set to the current session type, and remote is set to the remote peer address.

Note

To avoid confusion with system events, the value of the type field must either be message (default) or begin with the custom_ prefix. Otherwise, you receive an error message. Custom events are filtered out by default.

Severity level

By default, custom events have the INFO severity level. To override the level, you can:

specify the severity field
use a shortcut function

The following shortcuts are available:

Shortcut	Equivalent
`audit.verbose(...)`	`audit.log({severity = 'VERBOSE', ...})`
`audit.info(...)`	`audit.log({severity = 'INFO', ...})`
`audit.warning(...)`	`audit.log({severity = 'WARNING', ...})`
`audit.alarm(...)`	`audit.log({severity = 'ALARM', ...})`

Example

audit.log({ severity = 'VERBOSE', description = 'Hello!' })

Tips

How many events can be recorded?

If you write to a file, the size of the Tarantool audit log is limited by the disk space. If you write to a system logger, the size of the Tarantool audit log is limited by the system logger. If you write to a pipe, the size of a Tarantool audit message is limited by the system buffer when audit_log.nonblock is false; when audit_log.nonblock is true, there is no limit.

How often should audit logs be reviewed?

Consider setting up a schedule in your company. It is recommended to review audit logs at least every 3 months.

How long should audit logs be stored?

It is recommended to store audit logs for at least one year.

What is the best way to process audit logs?

It is recommended to use SIEM systems for this purpose.

Security audit

This document will help you audit the security of a Tarantool cluster. It explains certain security aspects, their rationale, and the ways to check them. For details on how to configure Tarantool Enterprise Edition and its infrastructure for each aspect, refer to the security hardening guide.

Encryption of external iproto traffic

Tarantool uses the iproto binary protocol for replicating data between instances and also in the connector libraries.

Since version 2.10.0, the Enterprise Edition has built-in support for using SSL to encrypt client-server communications over binary connections. For details on enabling SSL encryption, see the Securing connections with SSL section of this document.

If built-in encryption is not enabled, we recommend using a VPN to secure data exchange between data centers.

Closed iproto ports

When a Tarantool cluster does not use iproto for external requests, connections to the iproto ports should be allowed only between Tarantool instances.

For more details on configuring ports for iproto, see the advertise_uri section in the Cartridge documentation.

HTTPS connection termination

A Tarantool instance can accept HTTP connections from external services or from users accessing the administrative web UI. All such connections must go through an HTTPS-providing web server, such as nginx, running on the same host. This requirement applies to both virtual and physical hosts. Running HTTP traffic through a few separate hosts with HTTPS termination is not sufficiently secure.

Closed HTTP ports

Tarantool accepts HTTP connections on a specific port. This port must be available only on the same host so that nginx can connect to it.

Check that the configured HTTP port is closed and that the HTTPS port (443 by default) is open.

Restricted access to the administrative console

The console module provides a way to connect to a running instance and run custom Lua code. This can be useful for development and administration. The following code examples open connections on a TCP port and on a UNIX socket.

console.listen(<port number>)
console.listen('/var/lib/tarantool/socket_name.sock')

Opening an administrative console through a TCP port is always unsafe. Check that there are no calls like console.listen(<port_number>) in the code.

Connecting through a socket requires having the write permission on the /var/lib/tarantool directory. Check that write permission to this directory is limited to the tarantool user.

Limiting the guest user

Connecting to the instance with tt connect or tarantoolctl connect without user credentials (under the guest user) must be disabled.

There are two ways to check this vulnerability:

Check that the source code doesn’t grant access to the guest user. The corresponding code can look like this:
```
box.schema.user.grant('guest',
    'read,write',
    'universe',
    nil, { if_not_exists = true }
)
```
Besides searching for the whole code pattern, search for any entries of 'universe'.
Try connecting with tt connect to each Tarantool node.

For more details, refer to the documentation on access control.

Authorization in the web UI

Using the web interface must require logging in with a username and password.

Running under the tarantool user

All Tarantool instances should be running under the tarantool user.

Limiting access to the tarantool user

The tarantool user must be a non-privileged user without the sudo permission. It also must not have a password set, so that logging in via SSH or su is not possible.

Keeping two or more snapshots

In order to have a reliable backup, a Tarantool instance must keep two or more latest snapshots. This should be checked on each Tarantool instance.

The checkpoint_count value determines the number of kept snapshots. Configuration values are primarily set in the configuration files but can be overridden with environment variables and command-line arguments. So, it’s best to check both the values in the configuration files and the actual values using the console:

tarantool> box.cfg.checkpoint_count
---
- 2

Enabled write-ahead logging (WAL)

Tarantool records all incoming data in the write-ahead log (WAL). The WAL must be enabled to ensure that data will be recovered in case of a possible instance restart.

Secure values of the wal.mode configuration option are write and fsync:

wal:
  dir: 'var/lib/{{ instance_name }}/wals'
  mode: 'write'

An exception to this requirement is an instance that processes data that can be freely discarded, for example, when Tarantool is used for caching. In this case, the WAL can be disabled to reduce I/O load.

The logging level is INFO or higher

The logging level should be set to 5 (INFO), 6 (VERBOSE), or 7 (DEBUG). Application logs will then have enough information to research a possible security breach.

tarantool> box.cfg.log_level
---
- 5

For a full list of logging levels, see the log_level reference.

Logging with journald

Tarantool should use journald for logging.

Security hardening guide

This guide explains how to enhance security in your Tarantool Enterprise Edition cluster using built-in features and provides general recommendations on security hardening. If you need to perform a security audit of a Tarantool Enterprise cluster, refer to the security checklist.

Tarantool Enterprise Edition does not provide a dedicated API for security control. All the necessary configurations can be done via an administrative console or initialization code.

Tarantool Enterprise Edition has the following built-in security features:

Authentication

Tarantool Enterprise Edition supports password-based authentication and allows for two types of connections:

Via an administrative console.
Over a binary port for read and write operations and procedure invocation.

For more information on authentication and connection types, see the Security section in Administration.

In addition, Tarantool provides the following functionality:

Sessions – states which associate connections with users and make Tarantool API available to them after authentication.
Authentication triggers, which execute actions on authentication events.
Third-party (external) authentication protocols and services such as LDAP or Active Directory – supported in the web interface, but unavailable on the binary-protocol level.

Access control

Tarantool Enterprise Edition provides the means for administrators to prevent unauthorized access to the database and to certain functions.

Tarantool recognizes:

different users (guests and administrators)
privileges associated with users
roles (containers for privileges) granted to users

The following system spaces are used to store users and privileges:

The _user space to store usernames and hashed passwords for authentication.
The _priv space to store privileges for access control.

For more information, see the Access control section.

Users who create objects (spaces, indexes, users, roles, sequences, and functions) in the database become their owners and automatically acquire privileges for what they create. For more information, see the Owners and privileges section.

Audit log

Tarantool Enterprise Edition has a built-in audit log that records events such as:

authentication successes and failures
connection closures
creation, removal, enabling, and disabling of users
changes of passwords, privileges, and roles
denials of access to database objects

The audit log contains:

timestamps
usernames of users who performed actions
event types (for example, user_create, user_enable, disconnect)
descriptions

You can configure the following audit log options:

audit_log.to – enable audit logging and define the log location (file, pipe, or syslog). The option is similar to the log.
audit_log.nonblock – specify the logging behavior if the system is not ready to write. The option is similar to the log_nonblock.

For more information on logging, see the following:

the Logs section
the log section in the configuration reference
the Tarantool audit module topic

Access permissions to audit log files can be set up as for any other Unix file system object – via chmod.

Recommendations on security hardening

This section lists recommendations that can help you harden the cluster’s security.

Encrypting traffic

Since version 2.10.0, Tarantool Enterprise Edition has built-in support for using SSL to encrypt the client-server communications over binary connections, that is, between Tarantool instances in a cluster. For details on enabling SSL encryption, see the Securing connections with SSL section of this guide.

In case the built-in encryption is not set for particular connections, consider the following security recommendations:

setting up connection tunneling, or
encrypting the actual data stored in the database.

For more information on data encryption, see the crypto module reference.

The HTTP server module provided by rocks does not support the HTTPS protocol. To set up a secure connection for a client (e.g., REST service), consider hiding the Tarantool instance (router if it is a cluster of instances) behind an Nginx server and setting up an SSL certificate for it.

To make sure that no information can be intercepted ‘from the wild’, run nginx on the same physical server as the instance and set up their communication over a Unix socket. For more information, see the socket module reference.

Firewall configuration

To protect the cluster from any unwanted network activity ‘from the wild’, configure the firewall on each server to allow traffic on ports listed in Network requirements.

If you are using static IP addresses, whitelist them, again, on each server as the cluster has a full mesh network topology. Consider blacklisting all the other addresses on all servers except the router (running behind the Nginx server).

Tarantool Enterprise does not provide defense against DoS or DDoS attacks. Consider using third-party software instead.

Data integrity

Tarantool Enterprise Edition does not keep checksums or provide the means to control data integrity. However, it ensures data persistence using a write-ahead log, regularly snapshots the entire data set to disk, and checks the data format whenever it reads the data back from the disk. For more information, see the Data persistence section.

Triggers

Triggers, also known as callbacks, are functions which the server executes when certain events happen.

To associate an event with a callback, pass the callback to the corresponding on_event function:

Then the server will store the callback function and call it when the corresponding event happens.

All triggers have the following characteristics:

Triggers are defined only by the ‘admin’ user.
Triggers are stored in the Tarantool instance’s memory, not in the database. Therefore triggers disappear when the instance is shut down. To make them permanent, put function definitions and trigger settings into Tarantool’s initialization script.
Triggers have low overhead. If a trigger is not defined, then the overhead is minimal: merely a pointer dereference and check. If a trigger is defined, then its overhead is equivalent to the overhead of calling a function.
There can be multiple triggers for one event. In this case, triggers are executed in the reverse order of their definition.
Triggers must work within the event context, that is, operate on the variables passed as the trigger function arguments. Triggers should not affect the global state of the program or change things unrelated to the event. If a trigger performs such calls as, for example, os.exit() or box.rollback(), the result of its execution is undefined.
Triggers are replaceable. The request to “redefine a trigger” implies passing a new trigger function and an old trigger function to one of the on_event functions.
The on_event functions all have parameters which are function pointers, and they all return function pointers. Remember that a Lua function definition such as function f() x = x + 1 end is the same as f = function () x = x + 1 end - in both cases f gets a function pointer. And trigger = box.session.on_connect(f) is the same as trigger = box.session.on_connect(function () x = x + 1 end) - in both cases trigger gets the function pointer which was passed.
You can call any on_event function with no arguments to get a list of its triggers. For example, use box.session.on_connect() to return a table of all connect-trigger functions.
Triggers can be useful in solving problems with replication. See details in Resolving replication conflicts.
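
The ordering and replacement rules listed above can be illustrated with a toy model written in plain Lua. This is a sketch for building intuition, not Tarantool’s implementation: new_trigger_list, on_event, and fire are hypothetical names, deleting a trigger is not modelled, and keeping a replaced trigger in its old position is an assumption of the model.

```lua
-- Toy model of a trigger list; NOT Tarantool's implementation.
local function new_trigger_list()
    local triggers = {}  -- kept in definition order

    -- Mimics the calling convention of an on_event function:
    --   on_event(new)       adds a trigger
    --   on_event(new, old)  replaces `old` with `new`
    --   on_event()          returns a list of the defined triggers
    local function on_event(new, old)
        if new == nil and old == nil then
            local copy = {}
            for i, fn in ipairs(triggers) do
                copy[i] = fn
            end
            return copy
        end
        assert(type(new) == 'function', 'a trigger must be a function')
        if old ~= nil then
            for i, fn in ipairs(triggers) do
                if fn == old then
                    triggers[i] = new  -- assumption: the position is kept
                    return new
                end
            end
            error('old trigger not found')
        end
        table.insert(triggers, new)
        return new
    end

    -- Runs the triggers in the reverse order of their definition.
    local function fire(...)
        for i = #triggers, 1, -1 do
            triggers[i](...)
        end
    end

    return on_event, fire
end

local calls = {}
local on_event, fire = new_trigger_list()
local first = on_event(function() table.insert(calls, 'first') end)
on_event(function() table.insert(calls, 'second') end)
-- Redefine the first trigger by passing the new and the old function.
on_event(function() table.insert(calls, 'first-v2') end, first)
fire()
print(table.concat(calls, ', '))  -- second, first-v2
```

The returned function value is what makes replacement possible: the caller has to keep a reference to the old trigger in order to pass it later.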

Example:

Here we log connect, disconnect, and authentication events to the Tarantool server log.

log = require('log')

function on_connect_impl()
 log.info("connected "..box.session.peer()..", sid "..box.session.id())
end

function on_disconnect_impl()
 log.info("disconnected, sid "..box.session.id())
end

function on_auth_impl(user)
 log.info("authenticated sid "..box.session.id().." as "..user)
end

function on_connect() pcall(on_connect_impl) end
function on_disconnect() pcall(on_disconnect_impl) end
function on_auth(user) pcall(on_auth_impl, user) end

box.session.on_connect(on_connect)
box.session.on_disconnect(on_disconnect)
box.session.on_auth(on_auth)

Applications

Using Tarantool as an application server, you can write your own applications. Tarantool’s native language for writing applications is Lua, so a typical application would be a file that contains your Lua script. But you can also write applications in C or C++.

Launching an application

Note

If you’re new to Lua, we recommend going over the interactive Tarantool tutorial before proceeding with this chapter. To launch the tutorial, say tutorial() in Tarantool console:

tarantool> tutorial()
---
- |
 Tutorial -- Screen #1 -- Hello, Moon
 ====================================

 Welcome to the Tarantool tutorial.
 It will introduce you to Tarantool’s Lua application server
 and database server, which is what’s running what you’re seeing.
 This is INTERACTIVE -- you’re expected to enter requests
 based on the suggestions or examples in the screen’s text.
 <...>

Let’s create and launch our first Lua application for Tarantool. Here’s the simplest Lua application, the good old “Hello, world!”:

#!/usr/bin/env tarantool
print('Hello, world!')

We save it in a file. Let it be myapp.lua in the current directory.

Now let’s discuss how we can launch our application with Tarantool.

Launching in Docker

If we run Tarantool in a Docker container, the following command will start Tarantool without any application:

$ # create a temporary container and run it in interactive mode
$ docker run --rm -t -i tarantool/tarantool:latest

To run Tarantool with our application, we can say:

$ # create a temporary container and
$ # launch Tarantool with our application
$ docker run --rm -t -i \
             -v `pwd`/myapp.lua:/opt/tarantool/myapp.lua \
             -v /data/dir/on/host:/var/lib/tarantool \
             tarantool/tarantool:latest tarantool /opt/tarantool/myapp.lua

Here two resources on the host get mounted in the container:

our application file (myapp.lua) and
Tarantool data directory (/data/dir/on/host).

By convention, the directory for Tarantool application code inside a container is /opt/tarantool, and the directory for data is /var/lib/tarantool.

Launching a binary program

If we run Tarantool from a package or from a source build, we can launch our application:

in the script mode,
as a server application, or
as a daemon service.

The simplest way is to pass the filename to Tarantool at start:

$ tarantool myapp.lua
Hello, world!
$

Tarantool starts, executes our script in the script mode and exits.

Now let’s turn this script into a server application. We use box.cfg from Tarantool’s built-in Lua module to:

launch the database (a database has a persistent on-disk state, which needs to be restored after we start an application) and
configure Tarantool as a server that accepts requests over a TCP port.

We also add some simple database logic, using space.create() and create_index() to create a space with a primary index. We use the function box.once() to make sure that our logic will be executed only once when the database is initialized for the first time, so we don’t try to create an existing space or index on each invocation of the script:

#!/usr/bin/env tarantool
print('Hello, world!')
-- Configure database
box.cfg {
   listen = 3301
}
box.once("bootstrap", function()
   box.schema.space.create('tweedledum')
   box.space.tweedledum:create_index('primary',
       { type = 'TREE', parts = {1, 'unsigned'}})
end)

Now we launch our application in the same manner as before:

$ tarantool myapp.lua
Hello, world!
2017-08-11 16:07:14.250 [41436] main/101/myapp.lua C> version 2.1.0-429-g4e5231702
2017-08-11 16:07:14.250 [41436] main/101/myapp.lua C> log level 5
2017-08-11 16:07:14.251 [41436] main/101/myapp.lua I> mapping 1073741824 bytes for tuple arena...
2017-08-11 16:07:14.255 [41436] main/101/myapp.lua I> recovery start
2017-08-11 16:07:14.255 [41436] main/101/myapp.lua I> recovering from `./00000000000000000000.snap'
2017-08-11 16:07:14.271 [41436] main/101/myapp.lua I> recover from `./00000000000000000000.xlog'
2017-08-11 16:07:14.271 [41436] main/101/myapp.lua I> done `./00000000000000000000.xlog'
2017-08-11 16:07:14.272 [41436] main/102/hot_standby I> recover from `./00000000000000000000.xlog'
2017-08-11 16:07:14.274 [41436] iproto/102/iproto I> binary: started
2017-08-11 16:07:14.275 [41436] iproto/102/iproto I> binary: bound to [::]:3301
2017-08-11 16:07:14.275 [41436] main/101/myapp.lua I> done `./00000000000000000000.xlog'
2017-08-11 16:07:14.278 [41436] main/101/myapp.lua I> ready to accept requests

This time, Tarantool executes our script and keeps working as a server, accepting TCP requests on port 3301. We can see Tarantool in the current session’s process list:

$ ps | grep "tarantool"
  PID TTY           TIME CMD
41608 ttys001       0:00.47 tarantool myapp.lua <running>

But the Tarantool instance will stop if we close the current terminal window. To detach Tarantool and our application from the terminal window, we can launch it in the daemon mode. To do so, we add some parameters to box.cfg{}:

background = true that actually tells Tarantool to work as a daemon service,
log = 'file-name' that tells the Tarantool daemon where to store its log file (other log settings are available in Tarantool log module), and
pid_file = 'file-name' that tells the Tarantool daemon where to store its pid file.

For example:

box.cfg {
   listen = 3301,
   background = true,
   log = '1.log',
   pid_file = '1.pid'
}

We launch our application in the same manner as before:

$ tarantool myapp.lua
Hello, world!
$

Tarantool executes our script, gets detached from the current shell session (you won’t see it with ps | grep "tarantool") and continues working in the background as a daemon attached to the global session (with SID = 0):

$ ps -ef | grep "tarantool"
  PID SID     TIME  CMD
42178   0  0:00.72 tarantool myapp.lua <running>

Now that we have discussed how to create and launch a Lua application for Tarantool, let’s dive deeper into programming practices.

Application roles

An application role is a Lua module that implements specific functions or logic. You can turn on or off a particular role for certain instances in a configuration without restarting these instances. A role is run when a configuration is loaded or reloaded.

Roles can be divided into the following groups:

Tarantool’s built-in roles. For example, the config.storage role can be used to make a Tarantool replica set act as a configuration storage.
Roles provided by third-party Lua modules. For example, the CRUD module provides the roles.crud-storage and roles.crud-router roles that enable CRUD operations in a sharded cluster.
Custom roles that are developed as a part of a cluster application. For example, you can create a custom role to define a stored procedure or implement a supplementary service, such as an email notifier or a replicator.

This section describes how to develop custom roles. To learn how to enable and configure roles, see Enabling and configuring roles.

Note

Don’t confuse application roles with other role types:

A role is a container for privileges that can be granted to users. Learn more in Roles.
A role of a replica set in regard to sharding. Learn more in Sharding roles.

Providing a role configuration

A custom role can be configured in the same way as roles provided by Tarantool or third-party Lua modules. You can learn more from Enabling and configuring roles.

This example shows how to enable and configure the greeter role, which is implemented in the next section:

instance001:
  roles: [ greeter ]
  roles_cfg:
    greeter:
      greeting: 'Hi'

The role configuration provided in roles_cfg can be accessed when validating and applying this configuration.

Tarantool includes the experimental.config.utils.schema built-in module that provides tools for managing user-defined configurations of applications (app.cfg) and roles (roles_cfg). The examples below show its basic usage.

Given that a role is a Lua module, a role name is passed to require() to obtain the module. When developing an application, you can place a file with the role code next to the cluster configuration file.

Creating a custom role

Overview

A custom application role is an object that implements custom functions or logic in addition to Tarantool’s built-in roles and roles provided by third-party Lua modules. For example, a logging role can be created to add logging functionality on top of the built-in one.

Creating a custom role includes the following steps:

(Optional) Define the role configuration schema.
Define a function that validates a role configuration.
Define a function that applies a validated configuration.
Define a function that stops a role.
(Optional) Define the roles that this custom role depends on.
(Optional) Define the on_event callback function.

As a result, a role module should return an object that has corresponding functions and fields specified:

return {
    validate = function() --[[ ... ]] end,
    apply = function() --[[ ... ]] end,
    stop = function() --[[ ... ]] end,
    dependencies = { --[[ ... ]] },
    on_event = function(config, key, value)
        local log = require('log')
        -- tostring() keeps the concatenation safe for nil and boolean values
        log.info('roles_cfg.my_role.foo: ' .. tostring(config.foo))
        log.info('on_event is triggered by ' .. key)
        log.info('is_ro: ' .. tostring(value.is_ro))
    end,
}

The examples in this article show how to do this.

You can omit the optional steps and get a simple role as in the example below.

return {
    validate = function() --[[ ... ]] end,
    apply = function() --[[ ... ]] end,
    stop = function() --[[ ... ]] end,
}

You can modify a role, for example, by adding dependencies or specifying the on_event callback. If you modify a role, you need to restart the Tarantool instance with the role in order to apply the changes.

Note

Code snippets shown in this section are included from the following application: application_role_cfg.

Defining the role configuration schema

The experimental.config.utils.schema built-in module provides the schema_object class. An object of this class defines a custom configuration schema of a role or an application.

This example shows how to define a schema that reflects the role configuration shown above:

local greeter_schema = schema.new('greeter', schema.record({
    greeting = schema.scalar({
        type = 'string',
        allowed_values = { 'Hi', 'Hello' }
    })
}))

If you don’t use the module, skip this step. In this case, use the cfg argument of the role’s validate() and apply() functions to refer to its configuration values, for example, cfg.greeting.

Validating a role configuration

To validate a role configuration, you need to define the validate([cfg]) function.

In the example below, the validate() function of the role configuration schema is used to validate the greeting value:

local function validate(cfg)
    greeter_schema:validate(cfg)
end

If the configuration is not valid, validate() reports an unrecoverable error by throwing an error object.
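
If the schema module is not used, the same check can be written by hand against the cfg argument. The following is a minimal sketch under that assumption. The ALLOWED_GREETINGS table and the error messages are illustrative, and rejecting a missing greeting is a choice made for this sketch, not something the schema-based version is known to do.

```lua
-- Hand-written validation for the 'greeter' role configuration.
local ALLOWED_GREETINGS = { Hi = true, Hello = true }

local function validate(cfg)
    -- cfg is nil when roles_cfg has no section for the role.
    if type(cfg) ~= 'table' then
        error("'greeter' role: configuration must be a table", 0)
    end
    -- Sketch-specific choice: a missing greeting is treated as invalid.
    if type(cfg.greeting) ~= 'string' or not ALLOWED_GREETINGS[cfg.greeting] then
        error("'greeter' role: greeting must be 'Hi' or 'Hello'", 0)
    end
end

print(pcall(validate, { greeting = 'Hi' }))   -- true
print(pcall(validate, { greeting = 'Hey' }))  -- false, followed by the error message
```

Throwing from validate() is what lets the configuration process reject the new configuration before apply() is called.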

Applying a role configuration

To apply the validated configuration, define the apply([cfg]) function. Like the validate() function, apply() provides access to a role’s configuration using the cfg argument.

In the example below, the apply() function uses the log module to write a value from the role configuration to the log:

local function apply(cfg)
    log.info("%s from the 'greeter' role!", greeter_schema:get(cfg, 'greeting'))
end

Stopping a role

To stop a role, define the stop() function.

In the example below, the stop() function uses the log module to indicate that a role is stopped:

local function stop()
    log.info("The 'greeter' role is stopped")
end

When you’ve defined all the role functions, you need to return an object that has corresponding functions specified:

return {
    validate = validate,
    apply = apply,
    stop = stop,
}

Role dependencies

To define a role’s dependencies, use the dependencies field. In this example, the byeer role has the greeter role as the dependency:

-- byeer.lua --
local log = require('log').new("byeer")

return {
    dependencies = { 'greeter' },
    validate = function() end,
    apply = function() log.info("Bye from the 'byeer' role!") end,
    stop = function() end,
}

A role cannot be started without its dependencies. This means that all the dependencies of a role should be defined in the roles configuration parameter:

instance001:
  roles: [ greeter, byeer ]

You can find the full example here: application_role_cfg.

On_event callback

Since version 3.3.1, you can define the on_event callback for custom roles. The on_event callback is called every time a box.status system event is broadcast. If multiple custom roles have the on_event callback defined, these callbacks are called one after another in the order defined by role dependencies.

The on_event callback accepts three arguments:

config, which contains the configuration of the role;
key, which reflects the trigger event and is set to:
- config.apply if the callback was triggered by a configuration update;
- box.status if it was triggered by the box.status system event.
value, which shows the information about the instance status as in the trigger box.status system event. If the callback is triggered by a configuration update, the value shows the information of the most recent box.status system event.

Note

All on_event callbacks with the config.apply key are executed as a part of the configuration process. Process statuses ready or check_warnings are reached only after all such on_event callbacks are done.
All on_event callbacks are executed inside a pcall. If a callback raises an error, the error is logged with the error level and the execution of the remaining callbacks continues.

An example of the on_event callback is provided in the Specifics of creating spaces section below.

Adding initialization code

You can add initialization code to a role by defining and calling a function with an arbitrary name at the top level of a module, for example:

local function init()
    -- ... --
end

init()

For example, you can create spaces, define indexes, or grant privileges to specific users or roles.

Specifics of creating spaces

To create a space in a role, you need to make sure that the target instance is in read-write mode (its box.info.ro is false). You can check an instance state by subscribing to the box.status event using box.watch():

box.watch('box.status', function()
    -- creating a space
    -- ...
end)

Note

Given that a role may be enabled when an instance is already in read-write mode, you also need to execute schema initialization code from apply(). To make sure a space is created only once, use the if_not_exists option.

Since version 3.3.1, you can define space creation in a role via the on_event callback function.

See the example of such definition below:

return {
    validate = function() end,
    apply = function() end,
    stop = function() end,
    on_event = function(config, key, value)
        -- Can only create spaces on RW.
        if value.is_ro then
            return
        end
        -- Assume the role config is a table.
        if type(config) ~= 'table' then
            error('Config must be a table')
        end
        local space_name = config.space_name or 'default'
        box.schema.space.create(space_name, {
            if_not_exists = true,
        })
    end
}

Roles life cycle

A role’s life cycle includes the stages described below.

Loading roles

On each run, all roles are loaded in the order they are specified in the configuration. This stage takes effect when a role is enabled or an instance with this role is restarted. At this stage, a role executes the initialization code.

A role cannot be started if it has dependencies that are not specified in a configuration.

Note

Dependencies do not affect the order in which roles are loaded. However, the validate(), apply(), and stop() functions are executed taking dependencies into account. Learn more in Executing functions for dependent roles.

Stopping roles

This stage takes effect during a configuration reload when a role is removed from the configuration for a given instance. Note that all stop() calls are performed before any validate() or apply() calls. This means that old roles are stopped first, and only then new roles are started.

Validating a role’s configurations

At this stage, a configuration for each role is validated using the corresponding validate() function in the same order in which they are specified in the configuration.

Applying a role’s configurations

At this stage, a configuration for each role is applied using the corresponding apply() function in the same order in which they are specified in the configuration.

All of a role’s functions report an unrecoverable error by throwing an error object. If an error is thrown in any phase, applying the configuration stops. If starting or stopping a role throws an error, no roles are stopped or started afterward. The error is caught and shown in the alerts section of config:info().

Executing functions for dependent roles

For roles that depend on each other, their validate(), apply(), and stop() functions are executed taking into account the dependencies. Suppose there are three independent and two dependent roles:

role1
role2
role3
    └─── role4
             └─── role5

role1, role2, and role5 are independent roles.
role3 depends on role4, role4 depends on role5.

The roles are enabled in a configuration as follows:

roles: [ role1, role2, role3, role4, role5 ]

In this case, validate() and apply() for these roles are executed in the following order:

role1 -> role2 -> role5 -> role4 -> role3

Roles removed from a configuration are stopped in the reverse of the order in which they are specified in the configuration, taking into account the dependencies. Suppose all roles except role1 are removed from the configuration above:

roles: [ role1 ]

After reloading a configuration, stop() functions for the removed roles are executed in the following order:

role3 -> role4 -> role5 -> role2
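
Both orders above can be reproduced with a small dependency-first traversal. The sketch below is a model of the described rules in plain Lua, not Tarantool’s own code. The start_order and stop_order helpers are hypothetical, and dependency cycles are not detected.

```lua
-- `roles` is the list from the configuration; `deps` maps a role name
-- to the list of roles it depends on. Cycles are not detected.
local function start_order(roles, deps)
    local order, seen = {}, {}
    local function visit(name)
        if seen[name] then
            return
        end
        seen[name] = true
        -- Dependencies go first, then the role itself.
        for _, dep in ipairs(deps[name] or {}) do
            visit(dep)
        end
        table.insert(order, name)
    end
    for _, name in ipairs(roles) do
        visit(name)
    end
    return order
end

-- Removed roles are stopped in the reverse of the start order.
local function stop_order(old_roles, new_roles, deps)
    local kept = {}
    for _, name in ipairs(new_roles) do
        kept[name] = true
    end
    local started = start_order(old_roles, deps)
    local order = {}
    for i = #started, 1, -1 do
        if not kept[started[i]] then
            table.insert(order, started[i])
        end
    end
    return order
end

local deps = { role3 = { 'role4' }, role4 = { 'role5' } }
local roles = { 'role1', 'role2', 'role3', 'role4', 'role5' }
print(table.concat(start_order(roles, deps), ' -> '))
-- role1 -> role2 -> role5 -> role4 -> role3
print(table.concat(stop_order(roles, { 'role1' }, deps), ' -> '))
-- role3 -> role4 -> role5 -> role2
```

Stopping in the reverse of the start order means a role is always stopped before the roles it depends on.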

Example: Role without a configuration

The example below shows how to enable the custom greeter role for instance001:

instance001:
  roles: [ greeter ]

The implementation of this role looks as follows:

-- greeter.lua --
return {
    validate = function() end,
    apply = function() require('log').info("Hi from the 'greeter' role!") end,
    stop = function() end,
}

Example on GitHub: application_role

Example: Role with a configuration

The example below shows how to enable the custom greeter role for instance001 and specify the configuration for this role:

instance001:
  roles: [ greeter ]
  roles_cfg:
    greeter:
      greeting: 'Hi'

The implementation of this role looks as follows:

-- greeter.lua --
local log = require('log').new("greeter")
local schema = require('experimental.config.utils.schema')

local greeter_schema = schema.new('greeter', schema.record({
    greeting = schema.scalar({
        type = 'string',
        allowed_values = { 'Hi', 'Hello' }
    })
}))

local function validate(cfg)
    greeter_schema:validate(cfg)
end

local function apply(cfg)
    log.info("%s from the 'greeter' role!", greeter_schema:get(cfg, 'greeting'))
end

local function stop()
    log.info("The 'greeter' role is stopped")
end

return {
    validate = validate,
    apply = apply,
    stop = stop,
}

Example on GitHub: application_role_cfg

Example: HTTP API

The example below shows how to enable and configure the http-api custom role:

instance001:
  roles: [ http-api ]
  roles_cfg:
    http-api:
      host: '127.0.0.1'
      port: 8080

The implementation of this role looks as follows:

-- http-api.lua --
local httpd
local json = require('json')
local schema = require('experimental.config.utils.schema')

local function validate_host(host, w)
    local host_pattern = "^(%d+)%.(%d+)%.(%d+)%.(%d+)$"
    if not host:match(host_pattern) then
        w.error("'host' should be a string containing a valid IP address, got %q", host)
    end
end

local function validate_port(port, w)
    if port < 1 or port > 65535 then
        w.error("'port' should be between 1 and 65535, got %d", port)
    end
end

local listen_address_schema = schema.new('listen_address', schema.record({
    host = schema.scalar({
        type = 'string',
        validate = validate_host,
        default = '127.0.0.1',
    }),
    port = schema.scalar({
        type = 'integer',
        validate = validate_port,
        default = 8080,
    }),
}))

local function validate(cfg)
    listen_address_schema:validate(cfg)
end

local function apply(cfg)
    if httpd then
        httpd:stop()
    end
    local cfg_with_defaults = listen_address_schema:apply_default(cfg)
    local host = listen_address_schema:get(cfg_with_defaults, 'host')
    local port = listen_address_schema:get(cfg_with_defaults, 'port')
    httpd = require('http.server').new(host, port)
    local response_headers = { ['content-type'] = 'application/json' }
    httpd:route({ path = '/band/:id', method = 'GET' }, function(req)
        local id = req:stash('id')
        local band_tuple = box.space.bands:get(tonumber(id))
        if not band_tuple then
            return { status = 404, body = 'Band not found' }
        else
            local band = { id = band_tuple['id'],
                           band_name = band_tuple['band_name'],
                           year = band_tuple['year'] }
            return { status = 200, headers = response_headers, body = json.encode(band) }
        end
    end)
    httpd:route({ path = '/band', method = 'GET' }, function(req)
        local limit = req:query_param('limit')
        if not limit then
            limit = 5
        end
        local band_tuples = box.space.bands:select({}, { limit = tonumber(limit) })
        local bands = {}
        for _, tuple in pairs(band_tuples) do
            local band = { id = tuple['id'],
                           band_name = tuple['band_name'],
                           year = tuple['year'] }
            table.insert(bands, band)
        end
        return { status = 200, headers = response_headers, body = json.encode(bands) }
    end)
    httpd:start()
end

local function stop()
    -- httpd is nil if apply() has never completed
    if httpd then
        httpd:stop()
        httpd = nil
    end
end

local function init()
    require('data'):add_sample_data()
end

init()

return {
    validate = validate,
    apply = apply,
    stop = stop,
}

Example on GitHub: application_role_http_api
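
Note that the pattern in validate_host only checks the dotted-quad shape, so a value such as 999.1.1.1 passes. If stricter validation is needed, each octet can be range-checked as well. The helper below is a hedged sketch in plain Lua; is_valid_ipv4 is a hypothetical name and is not part of the example application.

```lua
-- Returns true only for dotted-quad IPv4 addresses with octets in 0..255.
local function is_valid_ipv4(host)
    if type(host) ~= 'string' then
        return false
    end
    local octets = { host:match('^(%d+)%.(%d+)%.(%d+)%.(%d+)$') }
    if #octets ~= 4 then
        return false
    end
    for _, octet in ipairs(octets) do
        -- Reject overlong octets such as '0001' and values above 255.
        if #octet > 3 or tonumber(octet) > 255 then
            return false
        end
    end
    return true
end

print(is_valid_ipv4('127.0.0.1'))  -- true
print(is_valid_ipv4('999.1.1.1'))  -- false
```

Inside validate_host, the pattern check could then be replaced with a call to this helper, keeping the same w.error() message.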

API Reference

Members
validate([cfg])	Validate a role’s configuration.
apply([cfg])	Apply a role’s configuration.
stop()	Stop a role.
dependencies	Define a role’s dependencies.

validate([cfg])

Validate a role’s configuration. This function is called on instance startup or when the configuration is reloaded for the instance with this role. Note that the validate() function is called regardless of whether the role’s configuration or any field in a cluster’s configuration is changed.

validate() should throw an error if the validation fails.

Parameters:	cfg – a role’s configuration to be validated. This parameter provides access to configuration options defined in roles_cfg.<role_name>. To get values of configuration options placed outside roles_cfg.<role_name>, use config:get().
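
A minimal sketch of a validate() function, assuming a hypothetical role whose configuration has a single option named greeting (this option is an assumption made for illustration, not part of the role shown above). It only demonstrates the contract described here: return normally when the configuration is acceptable and throw an error otherwise. A missing cfg is treated as "nothing to validate", since the parameter is optional in the signature.

```lua
-- Hypothetical role: the 'greeting' option is an assumption for illustration.
local function validate(cfg)
    -- The parameter is optional, so accept a missing configuration.
    if cfg == nil then
        return
    end
    if type(cfg) ~= 'table' then
        error('role configuration must be a table')
    end
    if cfg.greeting ~= nil and type(cfg.greeting) ~= 'string' then
        error('greeting must be a string')
    end
end

-- A valid configuration passes silently; an invalid one raises an error.
print(pcall(validate, { greeting = 'Hi' }))  -- first value is true
print(pcall(validate, { greeting = 42 }))    -- first value is false
```

Wrapping the call in pcall() here is only a way to show both outcomes side by side; the role itself simply lets the error propagate.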

apply([cfg])¶

Apply a role’s configuration. apply() is called after validate() is executed for all the enabled roles. Like validate(), apply() is called on instance startup or when the configuration is reloaded for the instance with this role.

apply() should throw an error if the specified configuration can’t be applied.

Note

Note that apply() is not invoked if an instance switches to read-write mode when replication.failover is set to election or supervised. You can check an instance state by subscribing to the box.status event using box.watch().

Parameters:	cfg – a role’s configuration to be applied. This parameter provides access to configuration options defined in roles_cfg.<role_name>. To get values of configuration options placed outside roles_cfg.<role_name>, use config:get().

Fibers, yields, and cooperative multitasking

Creating a fiber is the Tarantool way of making application logic work in the background at all times. A fiber is a set of instructions that are executed with cooperative multitasking: the instructions contain yield signals, upon which control is passed to another fiber.

Fibers

Fibers are similar to threads of execution in computing. The key difference is that threads use preemptive multitasking, while fibers use cooperative multitasking (see below). This gives fibers the following two advantages over threads:

Better controllability. Threads often depend on the kernel’s thread scheduler to preempt a busy thread and resume another thread, so preemption may occur unpredictably. Fibers yield themselves to run another fiber while executing, so yields are controlled by application logic.
Higher performance. Threads require more resources to preempt as they need to address the system kernel. Fibers are lighter and faster as they don’t need to address the kernel to yield.

Yet fibers have some limitations as compared with threads, the main limitation being no multi-core mode. All fibers in an application belong to a single thread, so they all use the same CPU core as the parent thread. Meanwhile, this limitation is not really serious for Tarantool applications, because a typical bottleneck for Tarantool is the HDD, not the CPU.

A fiber has all the features of a Lua coroutine and all programming concepts that apply for Lua coroutines will apply for fibers as well. However, Tarantool has made some enhancements for fibers and has used fibers internally. So, although the use of coroutines is possible and supported, the use of fibers is recommended.

Any live fiber can be in one of three states: running, suspended, and ready. After a fiber dies, its status is reported as dead.

To learn more about fibers, go to the fiber module documentation.

Yields

Yield is an action that occurs in a cooperative environment that transfers control of the thread from the current fiber to another fiber that is ready to execute.

A live fiber is running, suspended, or ready, but from the outside you can only see running (for the current fiber) and suspended (for any other fiber, which is waiting for an event from the event loop (ev) to execute).

After a yield has occurred, the next ready fiber is taken from the queue and executed. When there are no more ready fibers, execution is transferred to the event loop.
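
The queue behavior described above can be imitated with plain Lua coroutines. This is only an analogy under stated assumptions: the ready table and the worker function are hypothetical names, and Tarantool's real scheduler lives in the fiber module, not in this code.

```lua
-- Plain Lua coroutines stand in for fibers; this is not Tarantool's scheduler.
local log = {}

local function worker(name, steps)
    return coroutine.create(function()
        for i = 1, steps do
            table.insert(log, name .. i)
            coroutine.yield() -- hand control back, like an explicit yield
        end
    end)
end

-- Queue of "ready" coroutines, served in FIFO order.
local ready = { worker('a', 2), worker('b', 3) }

while #ready > 0 do
    local co = table.remove(ready, 1)   -- take the next ready coroutine
    coroutine.resume(co)                -- run it until it yields or ends
    if coroutine.status(co) ~= 'dead' then
        table.insert(ready, co)         -- still alive: back of the queue
    end
end

print(table.concat(log, ' '))  --> a1 b1 a2 b2 b3
```

The output interleaves the two workers because each one gives up control after every step; nothing preempts them.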

After a fiber has yielded and regained control, it immediately issues testcancel.

Yields can be explicit or implicit.

Explicit yields

Explicit yields are clearly visible from the invoking code. There are only two explicit yields: fiber.yield() and fiber.sleep(t).

fiber.yield() yields execution to another ready fiber while putting itself in the ready state, meaning that it will be executed again as soon as possible while being polite to other fibers waiting for execution.
fiber.sleep(t) yields execution to another ready fiber and puts itself in the suspended state for time t until time passes and the event loop wakes up this fiber to the ready state.

In general, it is good practice for long-running CPU-intensive tasks to yield periodically so that other waiting fibers get a chance to run.
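
A minimal sketch of periodic yielding in a CPU-bound loop. The function name sum_squares and the batch size are assumptions chosen for illustration. The guarded require is also an assumption of this sketch: it lets the code run under plain Lua, where the yield becomes a no-op, while inside Tarantool the built-in fiber module is used.

```lua
-- Inside Tarantool the fiber module is available; elsewhere yield is a no-op.
local ok, fiber = pcall(require, 'fiber')
local yield = ok and fiber.yield or function() end

-- Hypothetical CPU-bound task: sum of squares from 1 to n.
local function sum_squares(n, batch)
    local total = 0
    for i = 1, n do
        total = total + i * i
        if i % batch == 0 then
            yield() -- let other ready fibers run after each batch
        end
    end
    return total
end

print(sum_squares(10000, 1000))
```

A smaller batch makes the task more polite to other fibers at the cost of more scheduling overhead; the right value depends on how expensive one iteration is.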

Implicit yields

On the other hand, there are many operations, such as operations with sockets, file system, and disk I/O, which imply some waiting for the current fiber while others can be executed. When such an operation occurs, a possible blocking operation would be passed into the event loop and the fiber would be suspended until the resource is ready to continue fiber execution.

Here is the list of implicitly yielding operations:

Connection establishment (socket).
Socket read and write (socket).
Filesystem operations (from fio).
Channel data transfer (fiber.channel).
File input/output (from fio).
Console operations (since console is a socket).
HTTP requests (since HTTP is a socket operation).
Database modifications (if they imply a disk write).
Database reading for the vinyl engine.
Invocation of another process (popen).

Note

Please note that all operations of the os module are non-cooperative and exclusively block the whole tx thread.

For memtx, since all data is in memory, there is no yielding for a read request (like :select, :pairs, :get).

For vinyl, since some data may not be in memory, there may be disk I/O for a read (to fetch data from disk) or write (because a stall may occur while waiting for memory to be freed).

For both memtx and vinyl, since data change requests must be recorded in the WAL, there is normally a box.commit().

With the default autocommit mode the following operations are yielding:

space:alter.
space:drop.
space:create_index.
space:truncate.
space:insert.
space:replace.
space:update.
space:upsert.
space:delete.
index:update.
index:delete.
index:alter.
index:drop.
index:rename.
box.commit (if there were some modifications within the transaction).

To provide atomicity for transactions in transaction mode, some changes are applied to the modification operations for the memtx engine. After executing box.begin or within a box.atomic call, any modification operation will not yield, and yield will occur only on box.commit or upon return from box.atomic. Meanwhile, box.rollback does not yield.

That is why executing separate commands like select(), insert(), update() in the console inside a transaction without MVCC will cause the transaction to abort. This is due to the implicit yield that occurs after each chunk of code is executed in the console.

Example #1

Engine = memtx.

space:get()
space:insert()

The sequence has one yield, at the end of the insert, caused by implicit commit; get() has nothing to write to the WAL and so does not yield.

Engine = memtx.

box.begin()
space1:get()
space1:insert()
space2:get()
space2:insert()
box.commit()

The sequence has one yield, at the end of box.commit(); none of the inserts yield.

Engine = vinyl.

space:get()
space:insert()

The sequence has one to three yields, since get() may yield if the data is not in the cache, insert() may yield if it waits for available memory, and there is an implicit yield at commit.

Engine = vinyl.

box.begin()
space1:get()
space1:insert()
space2:get()
space2:insert()
box.commit()

The sequence may yield from 1 to 5 times.

Example #2

Assume that there are tuples in the memtx space tester where the third field represents a positive dollar amount.

Let’s start a transaction, withdraw from tuple#1, deposit in tuple#2, and end the transaction, making its effects permanent.

tarantool> function txn_example(from, to, amount_of_money)
         >   box.atomic(function()
         >     box.space.tester:update(from, {{'-', 3, amount_of_money}})
         >     box.space.tester:update(to,   {{'+', 3, amount_of_money}})
         >   end)
         >   return "ok"
         > end

Result:
---
...
tarantool> txn_example({999}, {1000}, 1.00)
---
- "ok"
...

If wal_mode = none, then there is no implicit yielding at the commit time because there are no writes to the WAL.

If a request is performed via a network connector such as net.box and implies sending requests to the server and receiving responses, then it involves network I/O and thus implicit yielding, even if the request that is sent to the server has no implicit yield. Therefore, the following sequence yields three times, once per request, while sending each request over the network and awaiting its result.

conn.space.test:get{1}
conn.space.test:get{2}
conn.space.test:get{3}

Cooperative multitasking

Cooperative multitasking means that unless a running fiber deliberately yields control, it is not preempted by some other fiber. But a running fiber will deliberately yield when it encounters a “yield point”: a transaction commit, an operating system call, or an explicit “yield” request. Any system call which can block will be performed asynchronously, and any running fiber which must wait for a system call will be preempted, so that another ready-to-run fiber takes its place and becomes the new running fiber.

This model makes all programmatic locks unnecessary: cooperative multitasking ensures that there will be no concurrency around a resource, no race conditions, and no memory consistency issues. The way to achieve this is simple: use no yields, explicit or implicit, in critical sections, and no one can interfere with code execution.
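
To illustrate why a yield inside a critical section matters, here is a sketch that uses plain Lua coroutines as a stand-in for fibers. The balance variable and the withdraw and run_all functions are hypothetical. The point is only that a read-modify-write sequence is safe without a lock as long as no yield sits between the read and the write.

```lua
local balance = 100

-- Read-modify-write; yield_inside controls whether control is given up midway.
local function withdraw(amount, yield_inside)
    return coroutine.create(function()
        local seen = balance          -- read
        if yield_inside then
            coroutine.yield()         -- another coroutine may run here
        end
        balance = seen - amount       -- write, based on a possibly stale read
    end)
end

-- Resume every coroutine in turn until all of them have finished.
local function run_all(cos)
    local alive = true
    while alive do
        alive = false
        for _, co in ipairs(cos) do
            if coroutine.status(co) ~= 'dead' then
                coroutine.resume(co)
                alive = true
            end
        end
    end
end

run_all({ withdraw(30, false), withdraw(30, false) })
local safe = balance                  -- both withdrawals applied

balance = 100
run_all({ withdraw(30, true), withdraw(30, true) })
local racy = balance                  -- one withdrawal was lost

print(safe, racy)  --> 40  70
```

In the second run both coroutines read 100 before either writes, so the second write overwrites the first. Removing the yield from the critical section restores the expected result.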

For small requests, such as simple UPDATE or INSERT or DELETE or SELECT, fiber scheduling is fair: it takes little time to process the request, schedule a disk write, and yield to a fiber serving the next client.

However, a function may perform complex calculations or be written in such a way that yields take a long time to occur. This can lead to unfair scheduling when a single client throttles the rest of the system, or to apparent stalls in processing requests. It is the responsibility of the function author to avoid this situation. As a protective mechanism, a fiber slice can be used.

Lua cookbook recipes

Here are contributions of Lua programs for some frequent or tricky situations.

You can execute any of these programs by copying the code into a .lua file, and then entering chmod +x ./program-name.lua and ./program-name.lua on the terminal.

The first line is a “hashbang”:

#!/usr/bin/env tarantool

This runs the Tarantool Lua application server, which should be on the execution path.

Use freely.

See more recipes on Tarantool GitHub.

hello_world.lua

The standard example of a simple program.

#!/usr/bin/env tarantool

print('Hello, World!')

console_start.lua

Use box.once() to initialize a database (creating spaces) if this is the first time the server has been run. Then use console.start() to start interactive mode.

#!/usr/bin/env tarantool

-- Configure database
box.cfg {
    listen = 3313
}

box.once("bootstrap", function()
    box.schema.space.create('tweedledum')
    box.space.tweedledum:create_index('primary',
        { type = 'TREE', parts = {1, 'unsigned'}})
end)

require('console').start()

fio_read.lua

Use the fio module to open, read, and close a file.

#!/usr/bin/env tarantool

local fio = require('fio')
local errno = require('errno')
local f = fio.open('/tmp/xxxx.txt', {'O_RDONLY' })
if not f then
    error("Failed to open file: "..errno.strerror())
end
local data = f:read(4096)
f:close()
print(data)

fio_write.lua

Use the fio module to open, write, and close a file.

#!/usr/bin/env tarantool

local fio = require('fio')
local errno = require('errno')
local f = fio.open('/tmp/xxxx.txt', {'O_CREAT', 'O_WRONLY', 'O_APPEND'},
    tonumber('0666', 8))
if not f then
    error("Failed to open file: "..errno.strerror())
end
f:write("Hello\n");
f:close()

ffi_printf.lua

Use the LuaJIT ffi library to call a C built-in function: printf(). (For help understanding ffi, see the FFI tutorial.)

#!/usr/bin/env tarantool

local ffi = require('ffi')
ffi.cdef[[
    int printf(const char *format, ...);
]]

ffi.C.printf("Hello, %s\n", os.getenv("USER"));

ffi_gettimeofday.lua

Use the LuaJIT ffi library to call a C function: gettimeofday(). This delivers time with millisecond precision, unlike the time function in Tarantool’s clock module.

#!/usr/bin/env tarantool

local ffi = require('ffi')
ffi.cdef[[
    typedef long time_t;
    typedef struct timeval {
        time_t tv_sec;
        time_t tv_usec;
    } timeval;
    int gettimeofday(struct timeval *t, void *tzp);
]]

local timeval_buf = ffi.new("timeval")
local now = function()
    ffi.C.gettimeofday(timeval_buf, nil)
    return tonumber(timeval_buf.tv_sec * 1000 + (timeval_buf.tv_usec / 1000))
end

ffi_zlib.lua

Use the LuaJIT ffi library to call a C library function. (For help understanding ffi, see the FFI tutorial.)

#!/usr/bin/env tarantool

local ffi = require("ffi")
ffi.cdef[[
    unsigned long compressBound(unsigned long sourceLen);
    int compress2(uint8_t *dest, unsigned long *destLen,
    const uint8_t *source, unsigned long sourceLen, int level);
    int uncompress(uint8_t *dest, unsigned long *destLen,
    const uint8_t *source, unsigned long sourceLen);
]]
local zlib = ffi.load(ffi.os == "Windows" and "zlib1" or "z")

-- Lua wrapper for compress2()
local function compress(txt)
    local n = zlib.compressBound(#txt)
    local buf = ffi.new("uint8_t[?]", n)
    local buflen = ffi.new("unsigned long[1]", n)
    local res = zlib.compress2(buf, buflen, txt, #txt, 9)
    assert(res == 0)
    return ffi.string(buf, buflen[0])
end

-- Lua wrapper for uncompress
local function uncompress(comp, n)
    local buf = ffi.new("uint8_t[?]", n)
    local buflen = ffi.new("unsigned long[1]", n)
    local res = zlib.uncompress(buf, buflen, comp, #comp)
    assert(res == 0)
    return ffi.string(buf, buflen[0])
end

-- Simple test code.
local txt = string.rep("abcd", 1000)
print("Uncompressed size: ", #txt)
local c = compress(txt)
print("Compressed size: ", #c)
local txt2 = uncompress(c, #txt)
assert(txt2 == txt)

ffi_meta.lua

Use the LuaJIT ffi library to access a C object via a metamethod (a method which is defined with a metatable).

#!/usr/bin/env tarantool

local ffi = require("ffi")
ffi.cdef[[
typedef struct { double x, y; } point_t;
]]

local point
local mt = {
  __add = function(a, b) return point(a.x+b.x, a.y+b.y) end,
  __len = function(a) return math.sqrt(a.x*a.x + a.y*a.y) end,
  __index = {
    area = function(a) return a.x*a.x + a.y*a.y end,
  },
}
point = ffi.metatype("point_t", mt)

local a = point(3, 4)
print(a.x, a.y)  --> 3  4
print(#a)        --> 5
print(a:area())  --> 25
local b = a + point(0.5, 8)
print(#b)        --> 12.5

print_arrays.lua

Create Lua tables, and print them. Notice that for the ‘array’ table the iterator function is ipairs(), while for the ‘map’ table the iterator function is pairs(). (ipairs() is faster than pairs(), but pairs() is recommended for map-like tables or mixed tables.) The display will look like: “1 Apple | 2 Orange | 3 Grapefruit | 4 Banana | k3 v3 | k1 v1 | k2 v2”.

#!/usr/bin/env tarantool

array = { 'Apple', 'Orange', 'Grapefruit', 'Banana'}
for k, v in ipairs(array) do print(k, v) end

map = { k1 = 'v1', k2 = 'v2', k3 = 'v3' }
for k, v in pairs(map) do print(k, v) end

count_array.lua

Use the ‘#’ operator to get the number of items in an array-like Lua table. This operation has O(log(N)) complexity.

#!/usr/bin/env tarantool

array = { 1, 2, 3}
print(#array)

count_array_with_nils.lua

Missing elements in arrays, which Lua treats as “nil”s, cause the simple “#” operator to deliver improper results. The “print(#t)” instruction will print “4”; the “print(counter)” instruction will print “3”; the “print(max)” instruction will print “10”. Other table functions, such as table.sort(), will also misbehave when “nils” are present.

#!/usr/bin/env tarantool

local t = {}
t[1] = 1
t[4] = 4
t[10] = 10
print(#t)
local counter = 0
for k,v in pairs(t) do counter = counter + 1 end
print(counter)
local max = 0
for k,v in pairs(t) do if k > max then max = k end end
print(max)

count_array_with_nulls.lua

Use explicit NULL values to avoid the problems caused by Lua’s nil == missing value behavior. Although json.NULL == nil is true, all the print instructions in this program will print the correct value: 10.

#!/usr/bin/env tarantool

local json = require('json')
local t = {}
t[1] = 1; t[2] = json.NULL; t[3]= json.NULL;
t[4] = 4; t[5] = json.NULL; t[6]= json.NULL;
t[6] = 4; t[7] = json.NULL; t[8]= json.NULL;
t[9] = json.NULL
t[10] = 10
print(#t)
local counter = 0
for k,v in pairs(t) do counter = counter + 1 end
print(counter)
local max = 0
for k,v in pairs(t) do if k > max then max = k end end
print(max)

count_map.lua

Get the number of elements in a map-like table.

#!/usr/bin/env tarantool

local map = { a = 10, b = 15, c = 20 }
local size = 0
for _ in pairs(map) do size = size + 1; end
print(size)

swap.lua

Use a Lua peculiarity to swap two variables without needing a third variable.

#!/usr/bin/env tarantool

local x = 1
local y = 2
x, y = y, x
print(x, y)

class.lua

Create a class, create a metatable for the class, create an instance of the class. Another illustration is at http://lua-users.org/wiki/LuaClassesWithMetatable.

#!/usr/bin/env tarantool

-- define class objects
local myclass_somemethod = function(self)
    print('test 1', self.data)
end

local myclass_someothermethod = function(self)
    print('test 2', self.data)
end

local myclass_tostring = function(self)
    return 'MyClass <'..self.data..'>'
end

local myclass_mt = {
    __tostring = myclass_tostring;
    __index = {
        somemethod = myclass_somemethod;
        someothermethod = myclass_someothermethod;
    }
}

-- create a new object of myclass
local object = setmetatable({ data = 'data'}, myclass_mt)
print(object:somemethod())
print(object.data)

garbage.lua

Activate the Lua garbage collector with the collectgarbage function.

#!/usr/bin/env tarantool

collectgarbage('collect')

fiber_producer_and_consumer.lua

Start one fiber for producer and one fiber for consumer. Use fiber.channel() to exchange data and synchronize. One can tweak the channel size (ch_size in the program code) to control the number of simultaneous tasks waiting for processing.

#!/usr/bin/env tarantool

local fiber = require('fiber')
local function consumer_loop(ch, i)
    -- initialize consumer synchronously or raise an error()
    fiber.sleep(0) -- allow fiber.create() to continue
    while true do
        local data = ch:get()
        if data == nil then
            break
        end
        print('consumed', i, data)
        fiber.sleep(math.random()) -- simulate some work
    end
end

local function producer_loop(ch, i)
    -- initialize producer synchronously or raise an error()
    fiber.sleep(0) -- allow fiber.create() to continue
    while true do
        local data = math.random()
        ch:put(data)
        print('produced', i, data)
    end
end

local function start()
    local consumer_n = 5
    local producer_n = 3

    -- Create a channel
    local ch_size = math.max(consumer_n, producer_n)
    local ch = fiber.channel(ch_size)

    -- Start consumers
    for i=1, consumer_n,1 do
        fiber.create(consumer_loop, ch, i)
    end

    -- Start producers
    for i=1, producer_n,1 do
        fiber.create(producer_loop, ch, i)
    end
end

start()
print('started')

socket_tcpconnect.lua

Use socket.tcp_connect() to connect to a remote host via TCP. Display the connection details and the result of a GET request.

#!/usr/bin/env tarantool

local s = require('socket').tcp_connect('google.com', 80)
print(s:peer().host)
print(s:peer().family)
print(s:peer().type)
print(s:peer().protocol)
print(s:peer().port)
print(s:write("GET / HTTP/1.0\r\n\r\n"))
print(s:read('\r\n'))
print(s:read('\r\n'))

socket_tcp_echo.lua

Use socket.tcp_server() to set up a simple TCP server: create a function that handles requests and echoes them, and pass that function to socket.tcp_server(). This program has been used to test with 100,000 clients, with each client getting a separate fiber.

#!/usr/bin/env tarantool

local function handler(s, peer)
    s:write("Welcome to test server, " .. peer.host .."\n")
    while true do
        local line = s:read('\n')
        if line == nil then
            break -- error or eof
        end
        if not s:write("pong: "..line) then
            break -- error or eof
        end
    end
end

local server, addr = require('socket').tcp_server('localhost', 3311, handler)

getaddrinfo.lua

Use socket.getaddrinfo() to perform non-blocking DNS resolution, getting both the AF_INET6 and AF_INET information for ‘google.com’. This technique is not always necessary for tcp connections because socket.tcp_connect() performs socket.getaddrinfo under the hood, before trying to connect to the first available address.

#!/usr/bin/env tarantool

local s = require('socket').getaddrinfo('google.com', 'http', { type = 'SOCK_STREAM' })
print('host=',s[1].host)
print('family=',s[1].family)
print('type=',s[1].type)
print('protocol=',s[1].protocol)
print('port=',s[1].port)
print('host=',s[2].host)
print('family=',s[2].family)
print('type=',s[2].type)
print('protocol=',s[2].protocol)
print('port=',s[2].port)

socket_udp_echo.lua

Tarantool does not currently have a udp_server function, therefore socket_udp_echo.lua is more complicated than socket_tcp_echo.lua. It can be implemented with sockets and fibers.

#!/usr/bin/env tarantool

local socket = require('socket')
local errno = require('errno')
local fiber = require('fiber')

local function udp_server_loop(s, handler)
    fiber.name("udp_server")
    while true do
        -- try to read a datagram first
        local msg, peer = s:recvfrom()
        if msg == "" then
            -- socket was closed via s:close()
            break
        elseif msg ~= nil then
            -- got a new datagram
            handler(s, peer, msg)
        else
            if s:errno() == errno.EAGAIN or s:errno() == errno.EINTR then
                -- socket is not ready
                s:readable() -- yield, epoll will wake us when new data arrives
            else
                -- socket error
                local msg = s:error()
                s:close() -- save resources and don't wait GC
                error("Socket error: " .. msg)
            end
        end
    end
end

local function udp_server(host, port, handler)
    local s = socket('AF_INET', 'SOCK_DGRAM', 0)
    if not s then
        return nil -- check errno:strerror()
    end
    if not s:bind(host, port) then
        local e = s:errno() -- save errno
        s:close()
        errno(e) -- restore errno
        return nil -- check errno:strerror()
    end

    fiber.create(udp_server_loop, s, handler) -- start a new background fiber
    return s
end

A handler function that answers clients of this server, together with the code that starts the server, could look something like this:

local function handler(s, peer, msg)
    -- You don't have to wait until socket is ready to send UDP
    -- s:writable()
    s:sendto(peer.host, peer.port, "Pong: " .. msg)
end

local server = udp_server('127.0.0.1', 3548, handler)
if not server then
    error('Failed to bind: ' .. errno.strerror())
end

print('Started')

require('console').start()

http_get.lua

Use the http module to get data via HTTP.

#!/usr/bin/env tarantool

local http_client = require('http.client')
local json = require('json')
local r = http_client.get('https://api.frankfurter.app/latest?to=USD%2CRUB')
if r.status ~= 200 then
    print('Failed to get currency ', r.reason)
    return
end
local data = json.decode(r.body)
print(data.base, 'rate of', data.date, 'is', data.rates.RUB, 'RUB or', data.rates.USD, 'USD')

http_send.lua

Use the http module to send data via HTTP.

#!/usr/bin/env tarantool

local http_client = require('http.client')
local json = require('json')
local data = json.encode({ Key = 'Value'})
local headers = { Token = 'xxxx', ['X-Secret-Value'] = '42' }
local r = http_client.post('http://localhost:8081', data, { headers = headers})
if r.status == 200 then
    print 'Success'
end

http_server.lua

Use the http rock (which must first be installed) to turn Tarantool into a web server.

#!/usr/bin/env tarantool

local function handler(self)
    return self:render{ json = { ['Your-IP-Is'] = self.peer.host } }
end

local server = require('http.server').new(nil, 8080, {charset = "utf8"}) -- listen *:8080
server:route({ path = '/' }, handler)
server:start()
-- connect to localhost:8080 and see json

http_generate_html.lua

Use the http rock (which must first be installed) to generate HTML pages from templates. The http rock has a fairly simple template engine which allows execution of regular Lua code inside text blocks (like PHP). Therefore there is no need to learn new languages in order to write templates.

#!/usr/bin/env tarantool

local function handler(self)
    local fruits = {'Apple', 'Orange', 'Grapefruit', 'Banana'}
    return self:render{ fruits = fruits }
end

local server = require('http.server').new(nil, 8080, {charset = "utf8"}) -- nil means '*'
server:route({ path = '/', file = 'index.html.lua' }, handler)
server:start()

An “HTML” file for this server, including Lua, could look like this (it would produce “1 Apple | 2 Orange | 3 Grapefruit | 4 Banana”). Create a templates directory and put this file in it:

<html>
<body>
    <table border="1">
        % for i,v in pairs(fruits) do
        <tr>
            <td><%= i %></td>
            <td><%= v %></td>
        </tr>
        % end
    </table>
</body>
</html>

select_all.go

In Go, there is no one-liner to select all tuples from a Tarantool space. Yet you can use a script like this one. Call it on the instance you want to connect to.

package main

import (
	"fmt"
	"log"

	"github.com/tarantool/go-tarantool"
)

/*
box.cfg{listen = 3301}
box.schema.user.passwd('pass')

s = box.schema.space.create('tester')
s:format({
    {name = 'id', type = 'unsigned'},
    {name = 'band_name', type = 'string'},
    {name = 'year', type = 'unsigned'}
})
s:create_index('primary', { type = 'hash', parts = {'id'} })
s:create_index('scanner', { type = 'tree', parts = {'id', 'band_name'} })

s:insert{1, 'Roxette', 1986}
s:insert{2, 'Scorpions', 2015}
s:insert{3, 'Ace of Base', 1993}
*/

func main() {
	conn, err := tarantool.Connect("127.0.0.1:3301", tarantool.Opts{
		User: "admin",
		Pass: "pass",
	})

	if err != nil {
		log.Fatalf("Connection refused: %s", err)
	}
	defer conn.Close()

	spaceName := "tester"
	indexName := "scanner"
	idFn := conn.Schema.Spaces[spaceName].Fields["id"].Id
	bandNameFn := conn.Schema.Spaces[spaceName].Fields["band_name"].Id

	var tuplesPerRequest uint32 = 2
	cursor := []interface{}{}

	for {
		resp, err := conn.Select(spaceName, indexName, 0, tuplesPerRequest, tarantool.IterGt, cursor)
		if err != nil {
			log.Fatalf("Failed to select: %s", err)
		}

		if resp.Code != tarantool.OkCode {
			log.Fatalf("Select failed: %s", resp.Error)
		}

		if len(resp.Data) == 0 {
			break
		}

		fmt.Println("Iteration")

		tuples := resp.Tuples()
		for _, tuple := range tuples {
			fmt.Printf("\t%v\n", tuple)
		}

		lastTuple := tuples[len(tuples)-1]
		cursor = []interface{}{lastTuple[idFn], lastTuple[bandNameFn]}
	}
}

Lua tutorials

First steps

If you’re new to Lua, we recommend going over the interactive Tarantool tutorial. To launch the tutorial, run the tutorial() command in the Tarantool console:

tarantool> tutorial()
---
- |
 Tutorial -- Screen #1 -- Hello, Moon
 ====================================

 Welcome to the Tarantool tutorial.
 It will introduce you to Tarantool’s Lua application server
 and database server, which is what’s running what you’re seeing.
 This is INTERACTIVE -- you’re expected to enter requests
 based on the suggestions or examples in the screen’s text.
 <...>

Insert one million tuples with a Lua stored procedure

This is an exercise assignment: “Insert one million tuples. Each tuple should have a constantly-increasing numeric primary-key field and a random alphabetic 10-character string field.”

The purpose of the exercise is to show what Lua functions look like inside Tarantool. It will be necessary to employ the Lua math library, the Lua string library, the Tarantool box library, the Tarantool box.tuple library, loops, and concatenations. It should be easy to follow even for a person who has not used either Lua or Tarantool before. The only requirement is a knowledge of how other programming languages work and a memory of the first two chapters of this manual. But for better understanding, follow the comments and the links, which point to the Lua manual or to elsewhere in this Tarantool manual. To further enhance learning, type the statements in with the tarantool client while reading along.

Configure

We are going to use the Tarantool sandbox that was created for our “Getting started” exercises. So there is a single space, and a numeric primary key, and a running Tarantool server instance which also serves as a client.

Delimiter

In earlier versions of Tarantool, multi-line functions had to be enclosed within “delimiters”. They are no longer necessary, and so they will not be used in this tutorial. However, they are still supported. Users who wish to use delimiters, or users of older versions of Tarantool, should check the syntax description for declaring a delimiter before proceeding.

Create a function that returns a string

We will start by making a function that returns a fixed string, “hello world”.

function string_function()
  return "hello world"
end

The word “function” is a Lua keyword – we’re about to go into Lua. The function name is string_function. The function has one executable statement, return "hello world". The string “hello world” is enclosed in double quotes here, although Lua doesn’t care – one could use single quotes instead. The word “end” means “this is the end of the Lua function declaration.” To confirm that the function works, we can say:

string_function()

Sending function-name() means “invoke the Lua function.” The effect is that the string which the function returns will end up on the screen.

For more about Lua strings, see Lua manual chapter 2.4 “Strings”. For more about functions, see Lua manual chapter 5 “Functions”.

The screen now looks like this:

tarantool> function string_function()
         >   return "hello world"
         > end
---
...
tarantool> string_function()
---
- hello world
...
tarantool>

Create a function that calls another function and sets a variable

Now that string_function exists, we can invoke it from another function.

function main_function()
  local string_value
  string_value = string_function()
  return string_value
end

We begin by declaring a variable “string_value”. The word “local” means that string_value appears only in main_function. If we didn’t use “local” then string_value would be visible everywhere - even by other users using other clients connected to this server instance! Sometimes that’s a very desirable feature for inter-client communication, but not this time.

Then we assign a value to string_value, namely, the result of string_function(). Soon we will invoke main_function() to check that it got the value.

For more about Lua variables, see Lua manual chapter 4.2 “Local Variables and Blocks”.
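
The effect of “local” can be seen in a short sketch. The names scope_demo, hidden, and leaked are hypothetical, and the sketch assumes a default, non-strict Lua environment in which reading an undefined global yields nil.

```lua
local function scope_demo()
    local hidden = 'local value'   -- visible only inside scope_demo
    leaked = 'global value'        -- no "local": becomes a global variable
    return hidden
end

print(scope_demo())  --> local value
print(hidden)        --> nil
print(leaked)        --> global value
```

After the call, hidden is unknown outside the function, while leaked remains visible everywhere, which is why the tutorial declares string_value with “local”.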

The screen now looks like this:

tarantool> function main_function()
         >   local string_value
         >   string_value = string_function()
         >   return string_value
         > end
---
...
tarantool> main_function()
---
- hello world
...
tarantool>

Modify the function so it returns a one-letter random string

Now that it’s a bit clearer how to make a variable, we can change string_function() so that, instead of returning a fixed literal “hello world”, it returns a random letter between ‘A’ and ‘Z’.

function string_function()
  local random_number
  local random_string
  random_number = math.random(65, 90)
  random_string = string.char(random_number)
  return random_string
end

It is not necessary to destroy the old string_function() contents; they’re simply overwritten. The first assignment invokes a random-number function in Lua’s math library; the parameters mean “the number must be an integer between 65 and 90.” The second assignment invokes an integer-to-character function in Lua’s string library; the parameter is the code point of the character. Luckily the ASCII value of ‘A’ is 65 and the ASCII value of ‘Z’ is 90, so the result will always be a letter between A and Z.

For more about Lua math-library functions see Lua users “Math Library Tutorial”. For more about Lua string-library functions see Lua users “String Library Tutorial”.
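
To check the mapping between code points and letters, we can try the two library calls on their own. This is a small sketch in plain Lua; the variable all_in_range is introduced here only for the check.

```lua
print(string.char(65))   -- A
print(string.char(90))   -- Z
print(string.byte("A"))  -- 65

-- Draw many random code points and confirm each one maps to an uppercase letter.
local all_in_range = true
for _ = 1, 1000 do
  local letter = string.char(math.random(65, 90))
  if letter:match("^[A-Z]$") == nil then
    all_in_range = false
  end
end
print(all_in_range)      -- true
```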

Once again the string_function() can be invoked from main_function() which can be invoked with main_function().

The screen now looks like this:

tarantool> function string_function()
         >   local random_number
         >   local random_string
         >   random_number = math.random(65, 90)
         >   random_string = string.char(random_number)
         >   return random_string
         > end
---
...
tarantool> main_function()
---
- C
...
tarantool>

… Well, actually it won’t always look like this because math.random() produces random numbers. But for illustration purposes it doesn’t matter what the random string values are.

Modify the function so it returns a ten-letter random string

Now that it’s clear how to produce one-letter random strings, we can reach our goal of producing a ten-letter string by concatenating ten one-letter strings, in a loop.

function string_function()
  local random_number
  local random_string
  random_string = ""
  for x = 1,10,1 do
    random_number = math.random(65, 90)
    random_string = random_string .. string.char(random_number)
  end
  return random_string
end

The words “for x = 1,10,1” mean “start with x equals 1, loop until x equals 10, increment x by 1 for each iteration.” The symbol “..” means “concatenate”, that is, add the string on the right of the “..” sign to the string on the left of the “..” sign. Since we start by saying that random_string is “” (a blank string), the end result is that random_string has 10 random letters. Once again the string_function() can be invoked from main_function() which can be invoked with main_function().

For more about Lua loops see Lua manual chapter 4.3.4 “Numeric for”.
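
As an alternative to repeated “..” concatenation, the letters can be collected in a Lua table and joined once with table.concat(). The following is only a sketch of that variation, not the version used in the rest of this exercise; the name random_letters is hypothetical.

```lua
-- Hypothetical variant: build the string from a table of letters.
local function random_letters(count)
  local letters = {}
  for i = 1, count do             -- the step defaults to 1 when omitted
    letters[i] = string.char(math.random(65, 90))
  end
  return table.concat(letters)    -- join all letters into one string
end

print(random_letters(10))
```

For ten letters the difference does not matter; the table form mainly helps when a string is assembled from many pieces.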

The screen now looks like this:

tarantool> function string_function()
         >   local random_number
         >   local random_string
         >   random_string = ""
         >   for x = 1,10,1 do
         >     random_number = math.random(65, 90)
         >     random_string = random_string .. string.char(random_number)
         >   end
         >   return random_string
         > end
---
...
tarantool> main_function()
---
- 'ZUDJBHKEFM'
...
tarantool>

Make a tuple out of a number and a string

Now that it’s clear how to make a 10-letter random string, it’s possible to make a tuple that contains a number and a 10-letter random string, by invoking a function in Tarantool’s library of Lua functions.

function main_function()
  local string_value, t
  string_value = string_function()
  t = box.tuple.new({1, string_value})
  return t
end

Once this is done, t will be the value of a new tuple which has two fields. The first field is numeric: 1. The second field is a random string. Once again the string_function() can be invoked from main_function() which can be invoked with main_function().

For more about Tarantool tuples see Tarantool manual section Submodule box.tuple.

The screen now looks like this:

tarantool> function main_function()
         > local string_value, t
         > string_value = string_function()
         > t = box.tuple.new({1, string_value})
         > return t
         > end
---
...
tarantool> main_function()
---
- [1, 'PNPZPCOOKA']
...
tarantool>

Modify main_function to insert a tuple into the database

Now that it’s clear how to make a tuple that contains a number and a 10-letter random string, the only trick remaining is putting that tuple into tester. Remember that tester is the first space that was defined in the sandbox, so it’s like a database table.

function main_function()
  local string_value, t
  string_value = string_function()
  t = box.tuple.new({1,string_value})
  box.space.tester:replace(t)
end

The new line here is box.space.tester:replace(t). The name contains ‘tester’ because the insertion is going to be to tester. The tuple value t is passed as the argument (with the colon syntax, the space itself is the implicit first parameter). Strictly speaking, we could have said box.space.tester:insert(t) here, rather than box.space.tester:replace(t), but “replace” means “insert even if there is already a tuple whose primary-key value is a duplicate”, and that makes it easier to re-run the exercise even if the sandbox database isn’t empty.

Once this is done, tester will contain a tuple with two fields. The first field will be 1. The second field will be a random 10-letter string. Once again the string_function() can be invoked from main_function() which can be invoked with main_function(). But main_function() won’t tell the whole story, because it does not return t, it only puts t into the database. To confirm that something got inserted, we’ll use a select request.

main_function()
box.space.tester:select{1}

For more about Tarantool insert and replace calls, see Tarantool manual section Submodule box.space, space_object:insert(), and space_object:replace().
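
The difference between “insert” and “replace” can be modelled with an ordinary Lua table. This is a plain-Lua model for illustration only, not Tarantool’s implementation: the table rows stands in for the space, and model_insert and model_replace are hypothetical names.

```lua
-- A plain Lua table stands in for the space; keys play the role of primary keys.
local rows = {}

local function model_insert(tuple)
  if rows[tuple[1]] ~= nil then
    error("duplicate key " .. tostring(tuple[1]))  -- insert refuses duplicates
  end
  rows[tuple[1]] = tuple
end

local function model_replace(tuple)
  rows[tuple[1]] = tuple                           -- replace overwrites silently
end

model_replace({1, "FIRST"})
model_replace({1, "SECOND"})                    -- same key: the old value is overwritten
local ok = pcall(model_insert, {1, "THIRD"})    -- same key again: this one fails
print(rows[1][2])  -- SECOND
print(ok)          -- false
```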

The screen now looks like this:

tarantool> function main_function()
         >   local string_value, t
         >   string_value = string_function()
         >   t = box.tuple.new({1,string_value})
         >   box.space.tester:replace(t)
         > end
---
...
tarantool> main_function()
---
...
tarantool> box.space.tester:select{1}
---
- - [1, 'EUJYVEECIL']
...
tarantool>

Modify main_function to insert a million tuples into the database

Now that it’s clear how to insert one tuple into the database, it’s no big deal to figure out how to scale up: instead of inserting with the literal value 1 for the primary key, insert with a variable whose value runs from 1 to 1 million, in a loop. Since we already saw how to loop, that’s a simple thing. The only extra wrinkle that we add here is a timing function.

function main_function()
  local string_value, t
  for i = 1,1000000,1 do
    string_value = string_function()
    t = box.tuple.new({i,string_value})
    box.space.tester:replace(t)
  end
end
start_time = os.clock()
main_function()
end_time = os.clock()
'insert done in ' .. end_time - start_time .. ' seconds'

The standard Lua function os.clock() returns the amount of CPU time, in seconds, that the program has used so far. Therefore, by getting start_time = number of seconds just before the inserting, and then getting end_time = number of seconds just after the inserting, we can calculate (end_time - start_time) = time spent in seconds (strictly speaking, CPU time rather than wall-clock time). We will display that value by putting it in a request without any assignments, which causes Tarantool to send the value to the client, which prints it. (Lua’s answer to the C printf() function, which is print(), will also work.)

For more on Lua os.clock() see Lua manual chapter 22.1 “Date and Time”. For more on Lua print() see Lua manual chapter 5 “Functions”.
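
The start/end pattern can be wrapped in a small helper so that any function can be timed the same way. This is a sketch under our own naming: time_call is a hypothetical helper, and the summing loop is only a stand-in workload.

```lua
-- Hypothetical helper: run fn(...) and return the CPU seconds it consumed.
local function time_call(fn, ...)
  local start_time = os.clock()
  fn(...)
  local end_time = os.clock()
  return end_time - start_time
end

-- Stand-in workload: add up the numbers from 1 to one million.
local elapsed = time_call(function()
  local sum = 0
  for i = 1, 1000000 do
    sum = sum + i
  end
end)

print(string.format("loop done in %.2f seconds", elapsed))
```

The printed value depends on the machine, so no particular number should be expected.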

Since this is the grand finale, we will redo the final versions of all the necessary requests: the request that created string_function(), the request that created main_function(), and the request that invokes main_function().

function string_function()
  local random_number
  local random_string
  random_string = ""
  for x = 1,10,1 do
    random_number = math.random(65, 90)
    random_string = random_string .. string.char(random_number)
  end
  return random_string
end

function main_function()
  local string_value, t
  for i = 1,1000000,1 do
    string_value = string_function()
    t = box.tuple.new({i,string_value})
    box.space.tester:replace(t)
  end
end
start_time = os.clock()
main_function()
end_time = os.clock()
'insert done in ' .. end_time - start_time .. ' seconds'

The screen now looks like this:

tarantool> function string_function()
         >   local random_number
         >   local random_string
         >   random_string = ""
         >   for x = 1,10,1 do
         >     random_number = math.random(65, 90)
         >     random_string = random_string .. string.char(random_number)
         >   end
         >   return random_string
         > end
---
...
tarantool> function main_function()
         >   local string_value, t
         >   for i = 1,1000000,1 do
         >     string_value = string_function()
         >     t = box.tuple.new({i,string_value})
         >     box.space.tester:replace(t)
         >   end
         > end
---
...
tarantool> start_time = os.clock()
---
...
tarantool> main_function()
---
...
tarantool> end_time = os.clock()
---
...
tarantool> 'insert done in ' .. end_time - start_time .. ' seconds'
---
- insert done in 37.62 seconds
...
tarantool>

What has been shown is that Lua functions are quite expressive (in fact one can do more with Tarantool’s Lua stored procedures than one can do with stored procedures in some SQL DBMSs), and that it’s straightforward to combine Lua-library functions and Tarantool-library functions.

What has also been shown is that inserting a million tuples took 37 seconds. The host computer was a Linux laptop. By changing wal_mode to ‘none’ before running the test, one can reduce the elapsed time to 4 seconds.

Sum a JSON field for all tuples

This is an exercise assignment: “Assume that inside every tuple there is a string formatted as JSON. Inside that string there is a JSON numeric field. For each tuple, find the numeric field’s value and add it to a ‘sum’ variable. At end, return the ‘sum’ variable.” The purpose of the exercise is to get experience in one way to read and process tuples.

json = require('json')
function sum_json_field(field_name)
  local v, t, sum, field_value, is_valid_json, lua_table
  sum = 0
  for v, t in box.space.tester:pairs() do
    is_valid_json, lua_table = pcall(json.decode, t[2])
    if is_valid_json then
      field_value = lua_table[field_name]
      if type(field_value) == "number" then sum = sum + field_value end
    end
  end
  return sum
end

LINE 3: WHY “LOCAL”. This line declares all the variables that will be used in the function. Actually it’s not necessary to declare all variables at the start, and in a long function it would be better to declare variables just before using them. In fact it’s not even necessary to declare variables at all, but an undeclared variable is “global”. That’s not desirable for any of the variables that are declared in line 3, because all of them are for use only within the function.

LINE 5: WHY “PAIRS()”. Our job is to go through all the rows and there are two ways to do it: with space_object:pairs() or with variable = select(...) followed by for i = 1, #variable, 1 do some_function(variable[i]) end. We preferred pairs() for this example.

LINE 5: START THE MAIN LOOP. Everything inside this “for” loop will be repeated as long as there is another index key. A tuple is fetched and can be referenced with variable t.

LINE 6: WHY “PCALL”. If we simply said lua_table = json.decode(t[2]), then the function would abort with an error if it encountered something wrong with the JSON string - a missing colon, for example. By putting the function inside “pcall” (protected call), we’re saying: we want to intercept that sort of error, so if there’s a problem just set is_valid_json = false and we will know what to do about it later.

LINE 6: MEANING. The function is json.decode which means decode a JSON string, and the parameter is t[2] which is a reference to a JSON string. There’s a bit of hard coding here, we’re assuming that the second field in the tuple is where the JSON string was inserted. For example, we’re assuming a tuple looks like

field[1]: 444
field[2]: '{"Hello": "world", "Quantity": 15}'

meaning that the tuple’s first field, the primary key field, is a number while the tuple’s second field, the JSON string, is a string. Thus the entire statement means “decode t[2] (the tuple’s second field) as a JSON string; if there’s an error set is_valid_json = false; if there’s no error set is_valid_json = true and set lua_table = a Lua table which has the decoded string”.
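
The behaviour of pcall can be tried without any JSON at all. In this sketch, parse_number is a hypothetical stand-in for json.decode: it succeeds on good input and raises an error on bad input.

```lua
-- Hypothetical stand-in for json.decode: raises an error on bad input.
local function parse_number(text)
  local n = tonumber(text)
  if n == nil then
    error("not a number: " .. text)
  end
  return n
end

-- On success, pcall returns true followed by the function's results.
local ok, value = pcall(parse_number, "15")
print(ok, value)

-- On failure, pcall returns false followed by the error message.
local failed_ok, message = pcall(parse_number, "sunshine")
print(failed_ok, message)
```

In both cases the caller keeps running; only the first return value tells whether the call succeeded.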

LINE 8. At last we are ready to get the JSON field value from the Lua table that came from the JSON string. The value in field_name, which is the parameter for the whole function, must be a name of a JSON field. For example, inside the JSON string '{"Hello": "world", "Quantity": 15}', there are two JSON fields: “Hello” and “Quantity”. If the whole function is invoked with sum_json_field("Quantity"), then field_value = lua_table[field_name] is effectively the same as field_value = lua_table["Quantity"] or even field_value = lua_table.Quantity. Those are just three different ways of saying: for the Quantity field in the Lua table, get the value and put it in variable field_value.
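
The three ways of reading the field can be compared directly on a Lua table that has the same shape as the decoded JSON string. The table below is written by hand for this sketch rather than produced by json.decode.

```lua
-- Hand-written table with the same shape as the decoded JSON example.
local lua_table = {Hello = "world", Quantity = 15}
local field_name = "Quantity"

local by_variable = lua_table[field_name]   -- key taken from a variable
local by_literal  = lua_table["Quantity"]   -- key given as a string literal
local by_dot      = lua_table.Quantity      -- shorthand for the line above

print(by_variable, by_literal, by_dot)
```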

LINE 9: WHY “IF”. Suppose that the JSON string is well formed but the JSON field is not a number, or is missing. In that case, the function would be aborted when there was an attempt to add it to the sum. By first checking type(field_value) == "number", we avoid that abort. Anyone who knows that the database is in perfect shape can skip this kind of thing.
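
The effect of the type check can be seen on a few hand-written tables that mimic decoded JSON strings, including one with a non-numeric value and one with a misspelled field name. This is a plain-Lua sketch; sum_field and decoded_rows are hypothetical names and no space is involved.

```lua
-- Hand-written stand-ins for decoded JSON strings.
local decoded_rows = {
  {Item = "widget",      Quantity = 15},
  {Item = "widget",      Quantity = 7},
  {Item = "golf club",   Quantity = "sunshine"},  -- not a number
  {Item = "waffle iron", Quantit  = 3},           -- field name misspelled
}

local function sum_field(rows, field_name)
  local sum = 0
  for _, row in ipairs(rows) do
    local field_value = row[field_name]
    -- Skip missing or non-numeric values instead of raising an error.
    if type(field_value) == "number" then
      sum = sum + field_value
    end
  end
  return sum
end

print(sum_field(decoded_rows, "Quantity"))  -- 22
```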

And the function is complete. Time to test it. Starting with an empty database, defined the same way as the sandbox database in our “Getting started” exercises,

-- if tester is left over from some previous test, destroy it
if box.space.tester ~= nil then box.space.tester:drop() end
box.schema.space.create('tester')
box.space.tester:create_index('primary', {parts = {1, 'unsigned'}})

then add some tuples where the first field is a number and the second field is a string.

box.space.tester:insert{444, '{"Item": "widget", "Quantity": 15}'}
box.space.tester:insert{445, '{"Item": "widget", "Quantity": 7}'}
box.space.tester:insert{446, '{"Item": "golf club", "Quantity": "sunshine"}'}
box.space.tester:insert{447, '{"Item": "waffle iron", "Quantit": 3}'}

Since this is a test, there are deliberate errors. The “golf club” and the “waffle iron” do not have numeric Quantity fields, so must be ignored. Therefore the real sum of the Quantity field in the JSON strings should be: 15 + 7 = 22.

Invoke the function with sum_json_field("Quantity").

tarantool> sum_json_field("Quantity")
---
- 22
...

It works. We’ll just leave, as exercises for future improvement, the possibility that the “hard coding” assumptions could be removed, that there might have to be an overflow check if some field values are huge, and that the function should contain a yield instruction if the count of tuples is huge.

Indexed pattern search

Here is a generic function which takes a field identifier and a search pattern, and returns all tuples that match.
* The field must be the first field of a TREE index.
* The function will use Lua pattern matching, which allows “magic characters” in patterns (Lua patterns resemble regular expressions but are simpler).
* The initial characters in the pattern, as far as the first magic character, will be used as an index search key. For each tuple that is found via the index, there will be a match of the whole pattern.
* To be cooperative, the function should yield after every 10 tuples, unless there is a reason to delay yielding.
With this function, we can take advantage of Tarantool’s indexes for speed, and take advantage of Lua’s pattern matching for flexibility. It does everything that an SQL LIKE search can do, and far more.

Read the following Lua code to see how it works. The comments that begin with “SEE NOTE …” refer to long explanations that follow the code.

function indexed_pattern_search(space_name, field_no, pattern)
  -- SEE NOTE #1 "FIND AN APPROPRIATE INDEX"
  if (box.space[space_name] == nil) then
    print("Error: Failed to find the specified space")
    return nil
  end
  local index_no = -1
  for i=0,box.schema.INDEX_MAX,1 do
    if (box.space[space_name].index[i] == nil) then break end
    if (box.space[space_name].index[i].type == "TREE"
        and box.space[space_name].index[i].parts[1].fieldno == field_no
        and (box.space[space_name].index[i].parts[1].type == "scalar"
        or box.space[space_name].index[i].parts[1].type == "string")) then
      index_no = i
      break
    end
  end
  if (index_no == -1) then
    print("Error: Failed to find an appropriate index")
    return nil
  end
  -- SEE NOTE #2 "DERIVE INDEX SEARCH KEY FROM PATTERN"
  local index_search_key = ""
  local index_search_key_length = 0
  local last_character = ""
  local c = ""
  local c2 = ""
  for i=1,string.len(pattern),1 do
    c = string.sub(pattern, i, i)
    if (last_character ~= "%") then
      if (c == '^' or c == "$" or c == "(" or c == ")" or c == "."
                   or c == "[" or c == "]" or c == "*" or c == "+"
                   or c == "-" or c == "?") then
        break
      end
      if (c == "%") then
        c2 = string.sub(pattern, i + 1, i + 1)
        if (string.match(c2, "%p") == nil) then break end
        index_search_key = index_search_key .. c2
      else
        index_search_key = index_search_key .. c
      end
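      last_character = c
    else
      -- c was escaped by the preceding "%", so it must not act as an escape itself
      last_character = ""
    end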
  end
  index_search_key_length = string.len(index_search_key)
  if (index_search_key_length < 3) then
    print("Error: index search key " .. index_search_key .. " is too short")
    return nil
  end
  -- SEE NOTE #3 "OUTER LOOP: INITIATE"
  local result_set = {}
  local number_of_tuples_in_result_set = 0
  local previous_tuple_field = ""
  while true do
    local number_of_tuples_since_last_yield = 0
    local is_time_for_a_yield = false
    -- SEE NOTE #4 "INNER LOOP: ITERATOR"
    for _,tuple in box.space[space_name].index[index_no]:
    pairs(index_search_key,{iterator = box.index.GE}) do
      -- SEE NOTE #5 "INNER LOOP: BREAK IF INDEX KEY IS TOO GREAT"
      if (string.sub(tuple[field_no], 1, index_search_key_length)
      > index_search_key) then
        break
      end
      -- SEE NOTE #6 "INNER LOOP: BREAK AFTER EVERY 10 TUPLES -- MAYBE"
      number_of_tuples_since_last_yield = number_of_tuples_since_last_yield + 1
      if (number_of_tuples_since_last_yield >= 10
          and tuple[field_no] ~= previous_tuple_field) then
        index_search_key = tuple[field_no]
        is_time_for_a_yield = true
        break
      end
      previous_tuple_field = tuple[field_no]
      -- SEE NOTE #7 "INNER LOOP: ADD TO RESULT SET IF PATTERN MATCHES"
      if (string.match(tuple[field_no], pattern) ~= nil) then
        number_of_tuples_in_result_set = number_of_tuples_in_result_set + 1
        result_set[number_of_tuples_in_result_set] = tuple
      end
    end
    -- SEE NOTE #8 "OUTER LOOP: BREAK, OR YIELD AND CONTINUE"
    if (is_time_for_a_yield ~= true) then
      break
    end
    require('fiber').yield()
  end
  return result_set
end

NOTE #1 “FIND AN APPROPRIATE INDEX”
The caller has passed space_name (a string) and field_no (a number). The requirements are:
(a) index type must be “TREE” because for other index types (HASH, BITSET, RTREE) a search with iterator=GE will not return strings in order by string value;
(b) field_no must be the first index part;
(c) the field must contain strings, because for other data types (such as “unsigned”) pattern searches are not possible;
If these requirements are not met by any index, then print an error message and return nil.

NOTE #2 “DERIVE INDEX SEARCH KEY FROM PATTERN”
The caller has passed pattern (a string). The index search key will be the characters in the pattern as far as the first magic character. Lua’s magic characters are % ^ $ ( ) . [ ] * + - ?. For example, if the pattern is “ABC.E”, the period is a magic character and therefore the index search key will be “ABC”. But there is a complication … If we see “%” followed by a punctuation character, that punctuation character is “escaped” so remove the “%” when making the index search key. For example, if the pattern is “AB%$E”, the dollar sign is escaped and therefore the index search key will be “AB$E”. Finally there is a check that the index search key length must be at least three – this is an arbitrary number, and in fact zero would be okay, but short index search keys will cause long search times.
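
The derivation of the index search key can also be tried on its own, outside the search function. The following is a restructured sketch of the same rule in plain Lua, not a copy of the code above; the names MAGIC_CHARACTERS and derive_index_search_key are hypothetical.

```lua
-- Magic characters other than "%", which is handled separately below.
local MAGIC_CHARACTERS = "^$().[]*+-?"

local function derive_index_search_key(pattern)
  local key = ""
  local i = 1
  while i <= #pattern do
    local c = pattern:sub(i, i)
    if c == "%" then
      local escaped = pattern:sub(i + 1, i + 1)
      -- "%" plus punctuation is an escaped literal; anything else ends the key.
      if escaped:match("%p") == nil then break end
      key = key .. escaped
      i = i + 2
    elseif MAGIC_CHARACTERS:find(c, 1, true) then
      break                       -- first unescaped magic character ends the key
    else
      key = key .. c
      i = i + 1
    end
  end
  return key
end

print(derive_index_search_key("ABC.E"))  -- ABC
print(derive_index_search_key("AB%$E"))  -- AB$E
```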

NOTE #3 “OUTER LOOP: INITIATE”
The function’s job is to return a result set, just as box.space...select would. We will fill it within an outer loop that contains an inner loop. The outer loop’s job is to execute the inner loop, and possibly yield, until the search ends. The inner loop’s job is to find tuples via the index, and put them in the result set if they match the pattern.

NOTE #4 “INNER LOOP: ITERATOR”
The for loop here is using pairs(), see the explanation of what index iterators are. Within the inner loop, there will be a local variable named “tuple” which contains the latest tuple found via the index search key.

NOTE #5 “INNER LOOP: BREAK IF INDEX KEY IS TOO GREAT”
The iterator is GE (Greater or Equal), and we must be more specific: if the search index key has N characters, then the leftmost N characters of the result’s index field must not be greater than the search index key. For example, if the search index key is ‘ABC’, then ‘ABCDE’ is a potential match, but ‘ABD’ is a signal that no more matches are possible.
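
The prefix comparison can be checked in isolation. In this sketch, is_past_prefix is a hypothetical helper that wraps the same string.sub comparison, with the search key fixed to ‘ABC’; it assumes the default byte-wise string ordering.

```lua
local index_search_key = "ABC"
local index_search_key_length = string.len(index_search_key)

-- True when the leading characters of value sort after the search key.
local function is_past_prefix(value)
  return string.sub(value, 1, index_search_key_length) > index_search_key
end

print(is_past_prefix("ABCDE"))  -- false: still a potential match
print(is_past_prefix("ABD"))    -- true: no more matches are possible
```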

NOTE #6 “INNER LOOP: BREAK AFTER EVERY 10 TUPLES – MAYBE”
This chunk of code is for cooperative multitasking. The number 10 is arbitrary, and usually a larger number would be okay. The simple rule would be “after checking 10 tuples, yield, and then resume the search (that is, do the inner loop again) starting after the last value that was found”. However, if the index is non-unique or if there is more than one field in the index, then we might have duplicates (for example {“ABC”, 1}, {“ABC”, 2}, {“ABC”, 3}), and it would be difficult to decide which “ABC” tuple to resume with. Therefore, if the result’s index field is the same as the previous result’s index field, there is no break.

NOTE #7 “INNER LOOP: ADD TO RESULT SET IF PATTERN MATCHES”
Compare the result’s index field to the entire pattern. For example, suppose that the caller passed pattern “ABC.E” and there is an indexed field containing “ABCDE”. The initial index search key is then “ABC”, so the iterator will find a tuple whose indexed field contains “ABCDE”, because “ABCDE” > “ABC”. In that case string.match will return a value which is not nil, and the tuple can be added to the result set.
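
The return values of string.match for matching and non-matching fields can be checked directly in plain Lua. The sample strings below are chosen only for this sketch.

```lua
-- "." matches any single character, so "ABC.E" matches "ABCDE".
local hit = string.match("ABCDE", "ABC.E")
-- "ABCDF" has "F" where the pattern requires "E", so there is no match.
local miss = string.match("ABCDF", "ABC.E")
-- A six-character pattern cannot match a five-character string.
local too_short = string.match("ABCDE", "ABC.E.")

print(hit)        -- ABCDE
print(miss)       -- nil
print(too_short)  -- nil
```

Note that string.match is not anchored: unless the pattern starts with “^”, it may match anywhere inside the string.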

NOTE #8 “OUTER LOOP: BREAK, OR YIELD AND CONTINUE”
There are three conditions which will cause a break from the inner loop: (1) the for loop ends naturally because there are no more index keys which are greater than or equal to the index search key, (2) the index key is too great as described in NOTE #5, (3) it is time for a yield as described in NOTE #6. If condition (1) or condition (2) is true, then there is nothing more to do, the outer loop ends too. If and only if condition (3) is true, the outer loop must yield and then continue. If it does continue, then the inner loop – the iterator search – will happen again with a new value for the index search key.

EXAMPLE:

Start Tarantool, cut and paste the code for function indexed_pattern_search(), and try the following:

if box.space.t ~= nil then box.space.t:drop() end
box.schema.space.create('t')
box.space.t:create_index('primary',{})
box.space.t:create_index('secondary',{unique=false,parts={2,'string',3,'string'}})
box.space.t:insert{1,'A','a'}
box.space.t:insert{2,'AB',''}
box.space.t:insert{3,'ABC','a'}
box.space.t:insert{4,'ABCD',''}
box.space.t:insert{5,'ABCDE','a'}
box.space.t:insert{6,'ABCDE',''}
box.space.t:insert{7,'ABCDEF','a'}
box.space.t:insert{8,'ABCDF',''}
indexed_pattern_search("t", 2, "ABC.E.")

The result will be:

tarantool> indexed_pattern_search("t", 2, "ABC.E.")
---
- - [7, 'ABCDEF', 'a']
...

Tips on Lua syntax

The Lua syntax for data-manipulation functions can vary. Here are examples of the variations with select() requests. The same rules exist for the other data-manipulation functions.

Every one of the examples does the same thing: select a tuple set from a space named ‘tester’ where the primary-key field value equals 1. For these examples, we assume that the numeric id of ‘tester’ is 512, which happens to be the case in our sandbox example only.

Object reference variations

First, there are three object reference variations:

-- #1 module . submodule . name
tarantool> box.space.tester:select{1}
-- #2 replace name with a literal in square brackets
tarantool> box.space['tester']:select{1}
-- #3 use a variable for the entire object reference
tarantool> s = box.space.tester
tarantool> s:select{1}

Examples in this manual usually have the “box.space.tester:” form (#1). However, this is a matter of user preference and all the variations exist in the wild.

Also, descriptions in this manual use the syntax “space_object:” for references to objects which are spaces, and “index_object:” for references to objects which are indexes (for example box.space.tester.index.primary:).

Parameter variations

Then, there are seven parameter variations:

-- #1
tarantool> box.space.tester:select{1}
-- #2
tarantool> box.space.tester:select({1})
-- #3
tarantool> box.space.tester:select(1)
-- #4
tarantool> box.space.tester.select(box.space.tester,1)
-- #5
tarantool> box.space.tester:select({1},{iterator='EQ'})
-- #6
tarantool> variable = 1
tarantool> box.space.tester:select{variable}
-- #7
tarantool> variable = {1}
tarantool> box.space.tester:select(variable)

Lua allows omitting the parentheses () when invoking a function if its only argument is a Lua table, and we sometimes use this in our examples. This is why select{1} is equivalent to select({1}). Literal values such as 1 (a scalar value) or {1} (a Lua table value) may be replaced by variable names, as in examples #6 and #7.

Although there are special cases where braces can be omitted, they are preferable because they signal “Lua table”. Examples and descriptions in this manual have the {1} form. However, this too is a matter of user preference and all the variations exist in the wild.
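
These call forms are ordinary Lua syntax, so they can be compared on any table that has a method. In this sketch, space is a hypothetical plain Lua table standing in for a real space object, and its select method only formats its argument.

```lua
-- Hypothetical stand-in for a space object.
local space = {name = "tester"}

function space:select(key)
  -- Accept either a scalar key or a table key, as in variations #1 to #3.
  if type(key) ~= "table" then key = {key} end
  return self.name .. " key=" .. tostring(key[1])
end

local r1 = space:select{1}          -- braces only: parentheses omitted
local r2 = space:select({1})        -- table argument in parentheses
local r3 = space:select(1)          -- scalar argument
local r4 = space.select(space, 1)   -- dot form: the object is passed explicitly
local variable = {1}
local r5 = space:select(variable)   -- table held in a variable

print(r1, r2, r3, r4, r5)
```

All five calls receive the same object and the same key, so they return the same string.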

Rules for object names

Database objects have loose rules for names: the maximum length is 65000 bytes (not characters), and almost any legal Unicode character is allowed, including spaces, ideograms and punctuation.

When a name contains such characters, object references should have the literal-in-square-brackets form (#2) or the variable form (#3), to prevent confusion with Lua operators and separators. For example:

tarantool> box.space['1*A']:select{1}
tarantool> s = box.space['1*A !@$%^&*()_+12345678901234567890']
tarantool> s:select{1}
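
The reason for the square-bracket form is a general Lua rule: the dot form only works when the key is a valid Lua identifier. This can be seen with a plain Lua table; the table spaces below is a hypothetical stand-in for box.space.

```lua
-- Hypothetical stand-in for box.space, with one ordinary and one unusual name.
spaces = {tester = "space #1", ["1*A"] = "space #2"}

print(spaces.tester)    -- the dot form works for identifier-like names
print(spaces["1*A"])    -- other names need the square-bracket form

-- The dot form with such a name does not even compile.
local compile = loadstring or load   -- loadstring in Lua 5.1, load in later versions
local chunk, syntax_error = compile("return spaces.1*A")
print(chunk)            -- nil: the source text is rejected as a syntax error
```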

Disallowed:

characters which are unassigned code points,
line and paragraph separators,
control characters,
the replacement character (U+FFFD).

Not recommended: characters which cannot be displayed.

Names are “case sensitive”, so ‘A’ and ‘a’ are not the same.

Enterprise modules

This section covers open and closed source Lua modules for Tarantool Enterprise Edition included in the distribution as an offline rocks repository.

Open source modules

avro-schema is an assembly of Apache Avro schema tools;
checks is a type checker for function arguments. This library declares a checks() function and a checkers table that make it possible to check the parameters passed to a Lua function in a fast and unobtrusive way.
http is an on-board HTTP-server, which comes in addition to Tarantool’s out-of-the-box HTTP client, and must be installed as described in the installation section.
icu-date is a date-and-time formatting library for Tarantool based on International Components for Unicode;
kafka is a full-featured high-performance Kafka library for Tarantool based on librdkafka;
luacheck is a static analyzer and linter for Lua, preconfigured for Tarantool.
luarapidxml is a fast XML parser.
luatest is a Tarantool test framework written in Lua.
membership builds a mesh from multiple Tarantool instances based on gossip protocol. The mesh monitors itself, helps members discover everyone else in the group and get notified about their status changes with low latency. It is built upon the ideas from Consul or, more precisely, the SWIM algorithm.
metrics is a collection of useful monitoring metrics.
tracing is a module for debugging performance issues.
vshard is an automatic sharding system that enables horizontal scaling for Tarantool DBMS instances.

Closed source modules

ldap allows you to authenticate against an LDAP server and perform searches.
odbc is an ODBC connector for Tarantool based on unixODBC.
oracle is an Oracle connector for Lua applications through which they can send and receive data to and from Oracle databases. The advantage of the Tarantool-Oracle integration is that anyone can handle all the tasks with Oracle DBMSs (control, manipulation, storage, access) with the same high-level language (Lua) and with minimal delay.
task is a module for managing background tasks in a Tarantool cluster.

Installing and using modules

To use a module, install the following:

All the necessary third-party software packages (if any). See the module’s prerequisites for the list.

The module itself on every Tarantool instance:

$ tt rocks install MODULE_NAME [MODULE_VERSION]

See the tt rocks reference to learn more about managing Lua modules.

Creating an application

Next, we walk you through key programming practices that will give you a good start in writing Lua applications for Tarantool. We will implement a real microservice based on Tarantool! It is a backend for a simplified version of Pokémon Go, a location-based augmented reality game launched in mid-2016.

In this game, players use the GPS capability of a mobile device to locate, catch, battle, and train virtual monsters called “pokémon” that appear on the screen as if they were in the same real-world location as the player.

To stay within the walk-through format, let’s narrow the original gameplay as follows. We have a map with pokémon spawn locations. Next, we have multiple players who can send catch-a-pokémon requests to the server (which runs our Tarantool microservice). The server responds whether the pokémon is caught or not, increases the player’s pokémon counter if yes, and triggers the respawn-a-pokémon method that spawns a new pokémon at the same location in a while.

We leave client-side applications outside the scope of this story. However, we promise a mini-demo in the end to simulate real users and give us some fun.

Follow these topics to implement our application:

Modules, rocks and applications

To make our game logic available to other developers and Lua applications, let’s put it into a Lua module.

A module (called “rock” in Lua) is an optional library which enhances Tarantool functionality. So, we can install our logic as a module in Tarantool and use it from any Tarantool application or module. Like applications, modules in Tarantool can be written in Lua (rocks), C or C++.

Modules are good for two things:

easier code management (reuse, packaging, versioning), and
hot code reload without restarting the Tarantool instance.

Technically, a module is a file with source code that exports its functions in an API. For example, here is a Lua module named mymodule.lua that exports one function named myfun:

local exports = {}
exports.myfun = function(input_string)
   print('Hello', input_string)
end
return exports

To launch the function myfun() (from another module, from a Lua application, or from Tarantool itself), we need to save this module as a file, then load this module with the require() directive and call the exported function.

For example, here’s a Lua application that uses myfun() function from mymodule.lua module:

-- loading the module
local mymodule = require('mymodule')

-- calling myfun() from within test() function
local test = function()
  mymodule.myfun('world')
end

A thing to remember here is that the require() directive takes load paths to Lua modules from the package.path variable. This is a semicolon-separated string, where a question mark is used to interpolate the module name. By default, this variable contains system-wide Lua paths and the working directory. But if we put our modules inside a specific folder (e.g. scripts/), we need to add this folder to package.path before any calls to require():

package.path = 'scripts/?.lua;' .. package.path
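
To see how the question mark is interpolated, here is a small plain-Lua sketch that lists the file names a path string would produce for a given module name. The helper name candidates and the sample path are assumptions made for illustration, and the sketch assumes “/” as the directory separator:

```lua
-- List the file names that a package.path-style string yields for a module.
local function candidates(module_name, path)
    -- dots in a module name stand for directory separators
    local file_part = module_name:gsub('%.', '/')
    local result = {}
    for template in path:gmatch('[^;]+') do
        -- substitute every '?' in the template with the module name
        local file_name = template:gsub('%?', function() return file_part end)
        result[#result + 1] = file_name
    end
    return result
end

-- hypothetical search path with our scripts/ folder in front
local path = 'scripts/?.lua;' .. './?.lua;/usr/share/lua/?/init.lua'
for _, file_name in ipairs(candidates('mymodule', path)) do
    print(file_name)
end
```

The templates are tried in order, which is why the folder we prepend to package.path takes precedence over the default locations.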

For our microservice, a simple and convenient solution would be to put all methods in a Lua module (say pokemon.lua) and to write a Lua application (say game.lua) that initializes the gaming environment and starts the game loop.

Now let’s get down to implementation details. In our game, we need three entities:

map, which is an array of pokémons with coordinates of respawn locations; in this version of the game, let a location be a rectangle identified with two points, upper-left and lower-right;
player, which has an ID, a name, and coordinates of the player’s location point;
pokémon, which has the same fields as the player, plus a status (active/inactive, that is present on the map or not) and a catch probability (well, let’s give our pokémons a chance to escape :-) )

We’ll store these entities as tuples in Tarantool spaces. But to deliver our backend application as a microservice, a good practice is to send and receive our data in the universal JSON format, thus using Tarantool as a document storage.

Avro schemas

To store JSON data as tuples, we will apply a savvy practice which reduces data footprint and ensures all stored documents are valid. We will use Tarantool module avro-schema which checks the schema of a JSON document and converts it to a Tarantool tuple. The tuple will contain only field values, and thus take a lot less space than the original document. In avro-schema terms, converting JSON documents to tuples is “flattening”, and restoring the original documents is “unflattening”.

First you need to install the module with tt rocks install avro-schema.

Further usage is quite straightforward:

For each entity, we need to define a schema in Apache Avro schema syntax, where we list the entity’s fields with their names and Avro data types.
At initialization, we call the module’s create() function, which creates in-memory objects for all schema entities, and compile(), which generates flatten/unflatten methods for each entity.
Further on, we just call flatten/unflatten methods for a respective entity on receiving/sending the entity’s data.
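
To make “flattening” concrete, here is a hand-written sketch in plain Lua of what a flatten/unflatten pair for the player entity might do. It is an illustration only: it does not use the avro-schema module, the function names flatten_player and unflatten_player are made up, and the tuple layout (nested record fields placed inline, in schema order) is an assumption:

```lua
-- Validate a player document and turn it into a flat array of values.
local function flatten_player(doc)
    if type(doc) ~= 'table'
        or type(doc.id) ~= 'number'
        or type(doc.name) ~= 'string'
        or type(doc.location) ~= 'table'
        or type(doc.location.x) ~= 'number'
        or type(doc.location.y) ~= 'number' then
        return false, 'document does not match the player schema'
    end
    -- only values are kept; field names live in the schema
    return true, {doc.id, doc.name, doc.location.x, doc.location.y}
end

-- Restore the original document shape from a flat array.
local function unflatten_player(tuple)
    return true, {
        id = tuple[1],
        name = tuple[2],
        location = {x = tuple[3], y = tuple[4]}
    }
end

local ok, tuple = flatten_player({
    id = 1, name = 'Player1', location = {x = 1.0001, y = 2.0003}
})
print(ok, #tuple)
```

Like the generated methods used later in this walk-through, both functions return a success flag first and the result (or an error message) second.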

Here’s what our schema definitions for the player and pokémon entities look like:

local schema = {
    player = {
        type="record",
        name="player_schema",
        fields={
            {name="id", type="long"},
            {name="name", type="string"},
            {
                name="location",
                type= {
                    type="record",
                    name="player_location",
                    fields={
                        {name="x", type="double"},
                        {name="y", type="double"}
                    }
                }
            }
        }
    },
    pokemon = {
        type="record",
        name="pokemon_schema",
        fields={
            {name="id", type="long"},
            {name="status", type="string"},
            {name="name", type="string"},
            {name="chance", type="double"},
            {
                name="location",
                type= {
                    type="record",
                    name="pokemon_location",
                    fields={
                        {name="x", type="double"},
                        {name="y", type="double"}
                    }
                }
            }
        }
    }
}

And here’s how we create and compile our entities at initialization:

-- load the avro-schema and log modules with require()
local avro = require('avro_schema')
local log = require('log')

-- create models
local ok_m, pokemon = avro.create(schema.pokemon)
local ok_p, player = avro.create(schema.player)
if ok_m and ok_p then
    -- compile models
    local ok_cm, compiled_pokemon = avro.compile(pokemon)
    local ok_cp, compiled_player = avro.compile(player)
    if ok_cm and ok_cp then
        -- start the game
        <...>
    else
        log.error('Schema compilation failed')
    end
else
    log.error('Schema creation failed')
end
return false

As for the map entity, it would be overkill to introduce a schema for it, because we have only one map in the game, it has very few fields, and, most importantly, we use the map only inside our logic, never exposing it to external users.

Next, we need methods to implement the game logic. To simulate object-oriented programming in our Lua code, let’s store all Lua functions and shared variables in a single local variable (let’s name it game). This will allow us to address functions or variables from within our module as self.func_name or self.var_name. Like this:

local game = {
    -- a local variable
    num_players = 0,

    -- a method that prints a local variable
    hello = function(self)
      print('Hello! Your player number is ' .. self.num_players .. '.')
    end,

    -- a method that calls another method and returns a local variable
    sign_in = function(self)
      self.num_players = self.num_players + 1
      self:hello()
      return self.num_players
    end
}

In OOP terms, we can now regard local variables inside game as object fields, and local functions as object methods.
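
The self argument is passed implicitly only when a method is called with a colon. The following plain-Lua sketch shows the difference between the colon and dot call forms; the counter object is a made-up example, not part of the game module:

```lua
local counter = {
    value = 0,
    increment = function(self)
        self.value = self.value + 1
        return self.value
    end
}

-- colon call: Lua passes `counter` as `self` automatically
local a = counter:increment()

-- dot call: the same function, but `self` must be passed by hand
local b = counter.increment(counter)

-- dot call without an argument: `self` is nil, so indexing it fails
local ok = pcall(counter.increment)

print(a, b, ok)
```

This is why methods of the game object are invoked as self:hello() rather than self.hello().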

Note

In this manual, Lua examples use local variables. Use global variables with caution, since the module’s users may be unaware of them.

To enable/disable the use of undeclared global variables in your Lua code, use Tarantool’s strict module.

So, our game module will have the following methods:

catch() to calculate whether the pokémon was caught (besides the coordinates of both the player and pokémon, this method will apply a probability factor, so not every pokémon within the player’s reach will be caught);
respawn() to add missing pokémons to the map, say, every 60 seconds (we assume that a frightened pokémon runs away, so we remove a pokémon from the map on any catch attempt and add it back to the map in a while);
notify() to log information about caught pokémons (like “Player 1 caught pokémon A”);
start() to initialize the game (it will create database spaces, create and compile avro schemas, and launch respawn()).

Besides, it would be convenient to have methods for working with Tarantool storage. For example:

add_pokemon() to add a pokémon to the database, and
map() to populate the map with all pokémons stored in Tarantool.

We’ll need these two methods primarily when initializing our game, but we can also call them later, for example to test our code.

Bootstrapping a database

Let’s discuss game initialization. In the start() method, we need to populate Tarantool spaces with pokémon data. Why not keep all game data in memory? Why use a database? The answer is: persistence. Without a database, we risk losing data on a power outage, for example. But if we store our data in an in-memory database, Tarantool takes care of persisting it on disk whenever it’s changed. This gives us one more benefit: quick startup in case of failure. Tarantool has a smart algorithm that quickly loads all data from disk into memory on startup, so the warm-up takes little time.

We’ll be using functions from Tarantool’s built-in box module:

box.schema.create_space('pokemons') to create a space named pokemons for storing information about pokémons (we don’t create a similar space for players, because we intend to only send/receive player information via API calls, so we needn’t store it);
box.space.pokemons:create_index('primary', {type = 'hash', parts = {1, 'unsigned'}}) to create a primary HASH index by pokémon ID;
box.space.pokemons:create_index('status', {type = 'tree', parts = {2, 'str'}}) to create a secondary TREE index by pokémon status.

Notice the parts = argument in the index specification. The pokémon ID is the first field in a Tarantool tuple since it’s the first member of the respective Avro type. Likewise, the pokémon status is the second field because it’s the second member. The actual JSON document may have the ID or status fields at any position of the JSON map.

The implementation of the start() method looks like this:

-- create game object
start = function(self)
    -- create spaces and indexes
    box.once('init', function()
        box.schema.create_space('pokemons')
        box.space.pokemons:create_index(
            "primary", {type = 'hash', parts = {1, 'unsigned'}}
        )
        box.space.pokemons:create_index(
            "status", {type = "tree", parts = {2, 'str'}}
        )
    end)

    -- create models
    local ok_m, pokemon = avro.create(schema.pokemon)
    local ok_p, player = avro.create(schema.player)
    if ok_m and ok_p then
        -- compile models
        local ok_cm, compiled_pokemon = avro.compile(pokemon)
        local ok_cp, compiled_player = avro.compile(player)
        if ok_cm and ok_cp then
            -- start the game
            <...>
        else
            log.error('Schema compilation failed')
        end
    else
        log.error('Schema creation failed')
    end
    return false
end

GIS

Now let’s discuss catch(), which is the main method in our gaming logic.

Here we receive the player’s coordinates and the target pokémon’s ID number, and we need to answer whether the player has actually caught the pokémon or not (remember that each pokémon has a chance to escape).

First, we validate the received player data against its Avro schema. Then we check whether such a pokémon exists in our database and is displayed on the map (the pokémon must have the active status):

catch = function(self, pokemon_id, player)
    -- check player data
    local ok, tuple = self.player_model.flatten(player)
    if not ok then
        return false
    end
    -- get pokemon data
    local p_tuple = box.space.pokemons:get(pokemon_id)
    if p_tuple == nil then
        return false
    end
    local ok, pokemon = self.pokemon_model.unflatten(p_tuple)
    if not ok then
        return false
    end
    if pokemon.status ~= self.state.ACTIVE then
        return false
    end
    -- more catch logic to follow
    <...>
end

Next, we calculate the answer: caught or not.

To work with geographical coordinates, we use the Tarantool gis module.

To keep things simple, we don’t load any specific map, assuming that we deal with a world map. And we do not validate incoming coordinates, assuming again that all received locations are within the planet Earth.

We use two geo-specific variables:

wgs84, which stands for the latest revision of the World Geodetic System standard, WGS84. Basically, it comprises a standard coordinate system for the Earth and represents the Earth as an ellipsoid.
nationalmap, which stands for the US National Atlas Equal Area. This is a projected coordinate system based on WGS84. It gives us a zero base for location projection and allows positioning our players and pokémons in meters.

Both these systems are listed in the EPSG Geodetic Parameter Registry, where each system has a unique number. In our code, we assign these listing numbers to respective variables:

wgs84 = 4326,
nationalmap = 2163,

For our game logic, we need one more variable, catch_distance, which defines how close a player must get to a pokémon before trying to catch it. Let’s set the distance to 100 meters.

catch_distance = 100,
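
Once both locations are expressed in meters, the distance condition is a simple comparison. Here is a plain-Lua sketch that assumes the points are already projected onto a flat plane; it uses ordinary Euclidean distance instead of the gis module, and the helper name within_reach is made up for illustration:

```lua
local catch_distance = 100  -- meters

-- Return true if the player is close enough to try a catch.
local function within_reach(player_pos, pokemon_pos)
    local dx = player_pos.x - pokemon_pos.x
    local dy = player_pos.y - pokemon_pos.y
    return math.sqrt(dx * dx + dy * dy) <= catch_distance
end

print(within_reach({x = 0, y = 0}, {x = 60, y = 80}))   -- exactly 100 m away
print(within_reach({x = 0, y = 0}, {x = 60, y = 81}))   -- slightly farther
```

A player standing exactly at the catch distance is still allowed to try, because only distances greater than catch_distance are rejected.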

Now we’re ready to calculate the answer. We need to project the current location of both player (p_pos) and pokémon (m_pos) on the map, check whether the player is close enough to the pokémon (using catch_distance), and calculate whether the player has caught the pokémon (here we generate some random value and let the pokémon escape if the random value happens to be less than 100 minus pokémon’s chance value):

-- project locations
local m_pos = gis.Point(
    {pokemon.location.x, pokemon.location.y}, self.wgs84
):transform(self.nationalmap)
local p_pos = gis.Point(
    {player.location.x, player.location.y}, self.wgs84
):transform(self.nationalmap)

-- check catch distance condition
if p_pos:distance(m_pos) > self.catch_distance then
    return false
end
-- try to catch pokemon
local caught = math.random(100) >= 100 - pokemon.chance
if caught then
    -- update and notify on success
    box.space.pokemons:update(
        pokemon_id, {{'=', self.STATUS, self.state.CAUGHT}}
    )
    self:notify(player, pokemon)
end
return caught
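
To get a feel for the probability rule used above, the following plain-Lua sketch applies the same comparison to every possible integer roll from 1 to 100 and counts the successful ones. The helper names is_caught and count_successes are made up for illustration:

```lua
-- The same comparison as in catch(): the roll must reach 100 - chance.
local function is_caught(roll, chance)
    return roll >= 100 - chance
end

-- Count how many of the 100 possible rolls lead to a catch.
local function count_successes(chance)
    local successes = 0
    for roll = 1, 100 do
        if is_caught(roll, chance) then
            successes = successes + 1
        end
    end
    return successes
end

print(count_successes(99.1), count_successes(50), count_successes(0))
```

Because math.random(100) returns whole numbers, a pokémon with a chance of 99.1 is caught on every roll, a chance of 50 succeeds on 51 rolls out of 100, and even a chance of 0 leaves one winning roll.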

Index iterators

In our gameplay, all caught pokémons return to the map. Every 60 seconds, the respawn() method iterates through pokémons by status using the Tarantool index iterator function index_object:pairs() and resets the statuses of all “caught” pokémons back to “active” using box.space.pokemons:update().

respawn = function(self)
    fiber.name('Respawn fiber')
    for _, tuple in box.space.pokemons.index.status:pairs(
           self.state.CAUGHT) do
        box.space.pokemons:update(
            tuple[self.ID],
            {{'=', self.STATUS, self.state.ACTIVE}}
        )
    end
end

For readability, we introduce named fields:

ID = 1, STATUS = 2,

The complete implementation of start() now looks like this:

-- create game object
start = function(self)
    -- create spaces and indexes
    box.once('init', function()
       box.schema.create_space('pokemons')
       box.space.pokemons:create_index(
           "primary", {type = 'hash', parts = {1, 'unsigned'}}
       )
       box.space.pokemons:create_index(
           "status", {type = "tree", parts = {2, 'str'}}
       )
    end)

    -- create models
    local ok_m, pokemon = avro.create(schema.pokemon)
    local ok_p, player = avro.create(schema.player)
    if ok_m and ok_p then
        -- compile models
        local ok_cm, compiled_pokemon = avro.compile(pokemon)
        local ok_cp, compiled_player = avro.compile(player)
        if ok_cm and ok_cp then
            -- start the game
            self.pokemon_model = compiled_pokemon
            self.player_model = compiled_player
            self:respawn()
            log.info('Started')
            return true
        else
            log.error('Schema compilation failed')
        end
    else
        log.error('Schema creation failed')
    end
    return false
end

Fibers, yields and cooperative multitasking

But wait! If we launch it as shown above, with self:respawn(), the function will be executed only once, just like all the other methods. But we need to execute respawn() every 60 seconds. Creating a fiber is the Tarantool way of making application logic work in the background at all times.

A fiber is a set of instructions that are executed with cooperative multitasking: the instructions contain yield signals, upon which control is passed to another fiber.
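
The idea of yielding can be illustrated with standard Lua coroutines. The sketch below is a toy model only: it does not use the Tarantool fiber module, and the worker function and the round-robin loop are made up for illustration:

```lua
local log_lines = {}

-- Create a task that does one step of work and then yields control.
local function worker(name, steps)
    return coroutine.create(function()
        for i = 1, steps do
            log_lines[#log_lines + 1] = name .. ' step ' .. i
            coroutine.yield()
        end
    end)
end

local tasks = {worker('A', 2), worker('B', 2)}

-- A toy round-robin scheduler: resume each live task in turn.
local alive = #tasks
while alive > 0 do
    alive = 0
    for _, task in ipairs(tasks) do
        if coroutine.status(task) ~= 'dead' then
            coroutine.resume(task)
            if coroutine.status(task) ~= 'dead' then
                alive = alive + 1
            end
        end
    end
end

print(table.concat(log_lines, ', '))
```

The steps of the two tasks interleave because each task gives up control voluntarily. Tarantool fibers have their own API and their own scheduler, so application code does not write such a loop by hand.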

Let’s launch respawn() in a fiber to make it work in the background all the time. To do so, we’ll need to amend respawn():

respawn = function(self)
    -- let's give our fiber a name;
    -- this will produce neat output in fiber.info()
    fiber.name('Respawn fiber')
    while true do
        for _, tuple in box.space.pokemons.index.status:pairs(
                self.state.CAUGHT) do
            box.space.pokemons:update(
                tuple[self.ID],
                {{'=', self.STATUS, self.state.ACTIVE}}
            )
        end
        fiber.sleep(self.respawn_time)
    end
end

and call it as a fiber in start():

start = function(self)
    -- create spaces and indexes
        <...>
    -- create models
        <...>
    -- compile models
        <...>
    -- start the game
       self.pokemon_model = compiled_pokemon
       self.player_model = compiled_player
       fiber.create(self.respawn, self)
       log.info('Started')
    -- errors if schema creation or compilation fails
       <...>
end

Logging

One more helpful function that we used in start() was log.info() from the Tarantool log module. We also need this function in notify() to add a record to the log file on every successful catch:

-- event notification
notify = function(self, player, pokemon)
    log.info("Player '%s' caught '%s'", player.name, pokemon.name)
end

We use default Tarantool log settings, so we’ll see the log output in the console when we launch our application in script mode.

Great! We’ve discussed all programming practices used in our Lua module (see pokemon.lua).

Now let’s prepare the test environment. As planned, we write a Lua application (see game.lua) to initialize Tarantool’s database module, initialize our game, call the game loop and simulate a couple of player requests.

To launch our microservice, we put both the pokemon.lua module and the game.lua application in the current directory, install all external modules, and launch the Tarantool instance running our game.lua application (this example is for Ubuntu):

$ ls
game.lua  pokemon.lua
$ sudo apt-get install tarantool-gis
$ sudo apt-get install tarantool-avro-schema
$ tarantool game.lua

Tarantool starts and initializes the database. Then Tarantool executes the demo logic from game.lua: adds a pokémon named Pikachu (its chance to be caught is very high, 99.1), displays the current map (it contains one active pokémon, Pikachu) and processes catch requests from two players. Player1 is located just near the lonely Pikachu pokémon and Player2 is located far away from it. As expected, the catch results in this output are “true” for Player1 and “false” for Player2. Finally, Tarantool displays the current map which is empty, because Pikachu is caught and temporarily inactive:

$ tarantool game.lua
2017-01-09 20:19:24.605 [6282] main/101/game.lua C> version 1.7.3-43-gf5fa1e1
2017-01-09 20:19:24.605 [6282] main/101/game.lua C> log level 5
2017-01-09 20:19:24.605 [6282] main/101/game.lua I> mapping 1073741824 bytes for tuple arena...
2017-01-09 20:19:24.609 [6282] main/101/game.lua I> initializing an empty data directory
2017-01-09 20:19:24.634 [6282] snapshot/101/main I> saving snapshot `./00000000000000000000.snap.inprogress'
2017-01-09 20:19:24.635 [6282] snapshot/101/main I> done
2017-01-09 20:19:24.641 [6282] main/101/game.lua I> ready to accept requests
2017-01-09 20:19:24.786 [6282] main/101/game.lua I> Started
---
- {'id': 1, 'status': 'active', 'location': {'y': 2, 'x': 1}, 'name': 'Pikachu', 'chance': 99.1}
...

2017-01-09 20:19:24.789 [6282] main/101/game.lua I> Player 'Player1' caught 'Pikachu'
true
false
--- []
...

2017-01-09 20:19:24.789 [6282] main C> entering the event loop

nginx

In real life, this microservice would work over HTTP. Let’s add the nginx web server to our environment and make a similar demo. But how do we make Tarantool methods callable via REST API? We use nginx with the Tarantool nginx upstream module and create one more Lua script (app.lua) that exports three of our game methods (add_pokemon(), map() and catch()) as REST endpoints of the nginx upstream module:

local game = require('pokemon')
box.cfg{listen=3301}
game:start()

-- add, map and catch functions exposed to REST API
function add(request, pokemon)
    return {
        result=game:add_pokemon(pokemon)
    }
end

function map(request)
    return {
        map=game:map()
    }
end

function catch(request, pid, player)
    local id = tonumber(pid)
    if id == nil then
        return {result=false}
    end
    return {
        result=game:catch(id, player)
    }
end
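
The catch() wrapper guards against a pokémon ID that is not a number. The plain-Lua sketch below reproduces that guard with a stand-in for the game object, so that it runs without Tarantool. The game_stub table and its behavior (only pokémon 1 can be caught) are assumptions made for illustration:

```lua
-- Stand-in for the real game module: only pokemon 1 can be caught.
local game_stub = {
    catch = function(self, id, player)
        return id == 1
    end
}

-- Same shape as the catch() wrapper in app.lua.
local function catch(request, pid, player)
    local id = tonumber(pid)   -- nil when pid is not a valid number
    if id == nil then
        return {result = false}
    end
    return {result = game_stub:catch(id, player)}
end

print(catch({}, '1', {}).result, catch({}, 'abc', {}).result)
```

Converting with tonumber() lets the handler accept the ID both as a number and as a numeric string, while malformed values are rejected without touching the database.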

An easy way to configure and launch nginx would be to create a Docker container based on a Docker image with nginx and the upstream module already installed (see http/Dockerfile). We take a standard nginx.conf, where we define an upstream with our Tarantool backend running (this is another Docker container, see details below):

upstream tnt {
      server pserver:3301 max_fails=1 fail_timeout=60s;
      keepalive 250000;
}

and add some Tarantool-specific parameters (see descriptions in the upstream module’s README file):

server {
  server_name tnt_test;

  listen 80 default deferred reuseport so_keepalive=on backlog=65535;

  location = / {
      root /usr/local/nginx/html;
  }

  location /api {
    # answers check infinity timeout
    tnt_read_timeout 60m;
    if ( $request_method = GET ) {
       tnt_method "map";
    }
    tnt_http_rest_methods get;
    tnt_http_methods all;
    tnt_multireturn_skip_count 2;
    tnt_pure_result on;
    tnt_pass_http_request on parse_args;
    tnt_pass tnt;
  }
}

Likewise, we put Tarantool server and all our game logic in a second Docker container based on the official Tarantool 1.9 image (see src/Dockerfile) and set the container’s default command to tarantool app.lua. This is the backend.

Non-blocking IO

To test the REST API, we create a new script (client.lua), which is similar to our game.lua application, but makes HTTP POST and GET requests rather than calling Lua functions:

local http = require('curl').http()
local json = require('json')
local URI = os.getenv('SERVER_URI')
local fiber = require('fiber')

local player1 = {
    name="Player1",
    id=1,
    location = {
        x=1.0001,
        y=2.0003
    }
}
local player2 = {
    name="Player2",
    id=2,
    location = {
        x=30.123,
        y=40.456
    }
}

local pokemon = {
    name="Pikachu",
    chance=99.1,
    id=1,
    status="active",
    location = {
        x=1,
        y=2
    }
}

function request(method, body, id)
    local resp = http:request(
        method, URI, body
    )
    if id ~= nil then
        print(string.format('Player %d result: %s',
            id, resp.body))
    else
        print(resp.body)
    end
end

local players = {}
function catch(player)
    fiber.sleep(math.random(5))
    print('Catch pokemon by player ' .. tostring(player.id))
    request(
        'POST',
        '{"method": "catch", "params": [1, ' .. json.encode(player) .. ']}',
        tostring(player.id)
    )
    table.insert(players, player.id)
end

print('Create pokemon')
request('POST',
    '{"method": "add", "params": [' .. json.encode(pokemon) .. ']}')
request('GET', '')

fiber.create(catch, player1)
fiber.create(catch, player2)

-- wait for players
while #players ~= 2 do
    fiber.sleep(0.001)
end

request('GET', '')
os.exit()

When you run this script, you’ll notice that both players have equal chances to make the first attempt at catching the pokémon. In a classical Lua script, a networked call blocks the script until it’s finished, so the first catch attempt can only be done by the player who entered the game first. In Tarantool, both players play concurrently, since all modules are integrated with Tarantool cooperative multitasking and use non-blocking I/O.

Indeed, when Player1 makes its first REST call, the script doesn’t block. The fiber running the catch() function on behalf of Player1 issues a non-blocking call to the operating system and yields control to the next fiber, which happens to be the fiber of Player2. Player2’s fiber does the same. When the network response is received, Player1’s fiber is activated by the Tarantool cooperative scheduler and resumes its work. For module developers, Tarantool provides an API.

For our HTTP test, we create a third container based on the official Tarantool 1.9 image (see client/Dockerfile) and set the container’s default command to tarantool client.lua.

To run this test locally, download our pokemon project from GitHub and say:

$ docker-compose build
$ docker-compose up

Docker Compose builds and runs all three containers: pserver (Tarantool backend), phttp (nginx) and pclient (demo client). You can see log messages from all these containers in the console, pclient saying that it made an HTTP request to create a pokémon, made two catch requests, requested the map (empty since the pokémon is caught and temporarily inactive) and exited:

pclient_1  | Create pokemon
<...>
pclient_1  | {"result":true}
pclient_1  | {"map":[{"id":1,"status":"active","location":{"y":2,"x":1},"name":"Pikachu","chance":99.100000}]}
pclient_1  | Catch pokemon by player 2
pclient_1  | Catch pokemon by player 1
pclient_1  | Player 1 result: {"result":true}
pclient_1  | Player 2 result: {"result":false}
pclient_1  | {"map":[]}
pokemon_pclient_1 exited with code 0

Congratulations! This is the end of our walk-through. As further reading, see more about installing and contributing a module.

See also reference on Tarantool modules and C API, and don’t miss our Lua cookbook recipes.

C tutorial

C stored procedures

Tarantool can call C code with modules, or with ffi, or with C stored procedures. This tutorial is only about the third option, C stored procedures. In fact, the routines are always “C functions”, but the phrase “stored procedure” is commonly used for historical reasons.

In this tutorial, which can be followed by anyone with a Tarantool development package and a C compiler, there are five tasks:

easy.c – prints “hello world”;
harder.c – decodes a passed parameter value;
hardest.c – uses the C API to do a DBMS insert;
read.c – uses the C API to do a DBMS select;
write.c – uses the C API to do a DBMS replace.

After following the instructions, and seeing that the results are what is described here, users should feel confident about writing their own stored procedures.

Preparation

Check that these items exist on the computer:

Tarantool 2.1 or later
A gcc compiler, any modern version should work
module.h and files #included in it
msgpuck.h
libmsgpuck.a (only for some recent msgpuck versions)

The module.h file will exist if Tarantool was installed from source. Otherwise Tarantool’s “developer” package must be installed. For example on Ubuntu say:

$ sudo apt-get install tarantool-dev

or on Fedora say:

$ dnf -y install tarantool-devel

The msgpuck.h file will exist if Tarantool was installed from source. Otherwise the “msgpuck” package must be installed from https://github.com/tarantool/msgpuck.

Both module.h and msgpuck.h must be on the include path for the C compiler to see them. For example, if the path to module.h is /usr/local/include/tarantool/module.h, and the path to msgpuck.h is /usr/local/include/msgpuck/msgpuck.h, and they are not currently on the include path, say:

$ export CPATH=/usr/local/include/tarantool:/usr/local/include/msgpuck

The libmsgpuck.a static library is necessary with msgpuck versions produced after February 2017. If and only if you encounter linking problems when using the gcc statements in the examples for this tutorial, you should put libmsgpuck.a on the path (libmsgpuck.a is produced from both msgpuck and Tarantool source downloads so it should be easy to find). For example, instead of “gcc -shared -o harder.so -fPIC harder.c” for the second example below, you will need to say “gcc -shared -o harder.so -fPIC harder.c libmsgpuck.a”.

Requests will be done using Tarantool as a client. Start Tarantool, and enter these requests.

box.cfg{listen=3306}
box.schema.space.create('capi_test')
box.space.capi_test:create_index('primary')
net_box = require('net.box')
capi_connection = net_box:new(3306)

In plainer language: create a space named capi_test, and make a connection to self named capi_connection.

Leave the client running. It will be necessary to enter more requests later.

easy.c

Start another shell. Change directory (cd) so that it is the same as the directory that the client is running in.

Create a file. Name it easy.c. Put the following code in it:

#include <stdio.h>
#include "module.h"
int easy(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  printf("hello world\n");
  return 0;
}
int easy2(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  printf("hello world -- easy2\n");
  return 0;
}

Compile the program, producing a library file named easy.so:

$ gcc -shared -o easy.so -fPIC easy.c

Now go back to the client and execute these requests:

box.schema.func.create('easy', {language = 'C'})
box.schema.user.grant('guest', 'execute', 'function', 'easy')
capi_connection:call('easy')

If these requests appear unfamiliar, re-read the descriptions of box.schema.func.create(), box.schema.user.grant() and conn:call().

The function that matters is capi_connection:call('easy').

Its first job is to find the ‘easy’ function, which should be easy because by default Tarantool looks in the current directory for a file named easy.so.

Its second job is to call the ‘easy’ function. Since the easy() function in easy.c begins with printf("hello world\n"), the words “hello world” will appear on the screen.

Its third job is to check that the call was successful. Since the easy() function in easy.c ends with return 0, there is no error message to display and the request is over.

The result should look like this:

tarantool> capi_connection:call('easy')
hello world
---
- []
...

Now let’s call the other function in easy.c – easy2(). This is almost the same as the easy() function, but there’s a detail: when the file name is not the same as the function name, then we have to specify file-name.function-name.

box.schema.func.create('easy.easy2', {language = 'C'})
box.schema.user.grant('guest', 'execute', 'function', 'easy.easy2')
capi_connection:call('easy.easy2')

… and this time the result will be “hello world -- easy2”.

Conclusion: calling a C function is easy.

harder.c

Go back to the shell where the easy.c program was created.

Create a file. Name it harder.c. Put these 17 lines in it:

#include "module.h"
#include "msgpuck.h"
int harder(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  uint32_t arg_count = mp_decode_array(&args);
  printf("arg_count = %d\n", arg_count);
  uint32_t field_count = mp_decode_array(&args);
  printf("field_count = %d\n", field_count);
  uint32_t val;
  int i;
  for (i = 0; i < field_count; ++i)
  {
    val = mp_decode_uint(&args);
    printf("val=%d.\n", val);
  }
  return 0;
}

Compile the program, producing a library file named harder.so:

$ gcc -shared -o harder.so -fPIC harder.c

Now go back to the client and execute these requests:

box.schema.func.create('harder', {language = 'C'})
box.schema.user.grant('guest', 'execute', 'function', 'harder')
passable_table = {}
table.insert(passable_table, 1)
table.insert(passable_table, 2)
table.insert(passable_table, 3)
capi_connection:call('harder', {passable_table})

This time the call is passing a Lua table (passable_table) to the harder() function. The harder() function will see it in the char *args parameter.

At this point the harder() function will start using functions defined in msgpuck.h. The routines that begin with “mp” are msgpuck functions that handle data formatted according to the MsgPack specification. Passes and returns are always done with this format so one must become acquainted with msgpuck to become proficient with the C API.

For now, though, it’s enough to know that mp_decode_array() returns the number of elements in an array, and mp_decode_uint() returns an unsigned integer, both reading from args. And there’s a side effect: when the decoding finishes, args has changed and is now pointing to the next element.

Therefore the first displayed line will be “arg_count = 1” because there was only one item passed: passable_table.
The second displayed line will be “field_count = 3” because there are three items in the table.
The next three lines will be “val=1.” and “val=2.” and “val=3.” because those are the values in the items in the table.

And now the screen looks like this:

tarantool> capi_connection:call('harder', {passable_table})
arg_count = 1
field_count = 3
val=1.
val=2.
val=3.
---
- []
...

Conclusion: decoding parameter values passed to a C function is not easy at first, but there are routines to do the job, and they’re documented, and there aren’t very many of them.
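
To make the pointer-advancing behavior of the decoding routines concrete, here is a plain-Lua model of what happens to the arguments of this call. It is not a binding to msgpuck: the functions decode_array and decode_uint are made-up stand-ins that handle only the single-byte “fixarray” and “positive fixint” MsgPack encodings, and a string position plays the role of the args pointer:

```lua
-- Read a fixarray header (0x90..0x9f); return the element count
-- and the position of the next byte.
local function decode_array(bytes, pos)
    local b = bytes:byte(pos)
    assert(b >= 0x90 and b <= 0x9f, 'not a fixarray header')
    return b - 0x90, pos + 1
end

-- Read a positive fixint (0x00..0x7f); return the value
-- and the position of the next byte.
local function decode_uint(bytes, pos)
    local b = bytes:byte(pos)
    assert(b <= 0x7f, 'not a positive fixint')
    return b, pos + 1
end

-- MsgPack bytes for the argument list {{1, 2, 3}}
local args = string.char(0x91, 0x93, 0x01, 0x02, 0x03)
local pos = 1

local arg_count, field_count
arg_count, pos = decode_array(args, pos)
field_count, pos = decode_array(args, pos)

local values = {}
for i = 1, field_count do
    values[i], pos = decode_uint(args, pos)
end

print(arg_count, field_count, table.concat(values, ' '))
```

Each call consumes exactly the bytes of one element and leaves the position at the next one, which is why the decoding calls in harder.c must follow the structure of the data.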

hardest.c

Go back to the shell where the easy.c and the harder.c programs were created.

Create a file. Name it hardest.c. Put these 13 lines in it:

#include "module.h"
#include "msgpuck.h"
int hardest(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  uint32_t space_id = box_space_id_by_name("capi_test", strlen("capi_test"));
  char tuple[1024]; /* Must be big enough for mp_encode results */
  char *tuple_pointer = tuple;
  tuple_pointer = mp_encode_array(tuple_pointer, 2);
  tuple_pointer = mp_encode_uint(tuple_pointer, 10000);
  tuple_pointer = mp_encode_str(tuple_pointer, "String 2", 8);
  int n = box_insert(space_id, tuple, tuple_pointer, NULL);
  return n;
}

Compile the program, producing a library file named hardest.so:

$ gcc -shared -o hardest.so -fPIC hardest.c

Now go back to the client and execute these requests:

box.schema.func.create('hardest', {language = "C"})
box.schema.user.grant('guest', 'execute', 'function', 'hardest')
box.schema.user.grant('guest', 'read,write', 'space', 'capi_test')
capi_connection:call('hardest')

This time the C function is doing three things:

finding the numeric identifier of the capi_test space by calling box_space_id_by_name();
formatting a tuple using more msgpuck.h functions;
inserting a tuple using box_insert().

Warning

char tuple[1024]; is used here as just a quick way of saying “allocate more than enough bytes”. For serious programs the developer must be careful to allow enough space for all the bytes that the mp_encode routines will use up.
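
One way to be careful is to compute the encoded size before writing. msgpuck provides mp_sizeof_array(), mp_sizeof_uint(), and mp_sizeof_str() for this; the sketch below re-implements the same size rules from the MsgPack specification in hypothetical toy_ helpers, so it compiles without msgpuck. It is an illustration of the arithmetic, not a replacement for the library calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers that follow the MsgPack size rules. */
static size_t toy_sizeof_array(uint32_t count)
{
  if (count <= 15) return 1;             /* fixarray */
  if (count <= 0xffff) return 3;         /* array 16 */
  return 5;                              /* array 32 */
}

static size_t toy_sizeof_uint(uint64_t value)
{
  if (value <= 0x7f) return 1;           /* positive fixint */
  if (value <= 0xff) return 2;           /* uint 8 */
  if (value <= 0xffff) return 3;         /* uint 16 */
  if (value <= 0xffffffffULL) return 5;  /* uint 32 */
  return 9;                              /* uint 64 */
}

static size_t toy_sizeof_str(uint32_t len)
{
  if (len <= 31) return 1 + (size_t)len;     /* fixstr */
  if (len <= 0xff) return 2 + (size_t)len;   /* str 8 */
  if (len <= 0xffff) return 3 + (size_t)len; /* str 16 */
  return 5 + (size_t)len;                    /* str 32 */
}

/* Bytes needed for a two-field tuple [key, string of str_len bytes]. */
static size_t toy_tuple_size(uint64_t key, uint32_t str_len)
{
  return toy_sizeof_array(2) + toy_sizeof_uint(key) + toy_sizeof_str(str_len);
}

/* Aborts (in builds with assertions enabled) if the buffer is too small. */
static void toy_assert_fits(size_t buf_size, uint64_t key, uint32_t str_len)
{
  assert(toy_tuple_size(key, str_len) <= buf_size);
}
```

For the tuple that hardest() builds, toy_tuple_size(10000, 8) is 1 + 3 + 9 = 13 bytes, so the 1024-byte buffer is far larger than needed.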

Now, still on the client, execute this request:

box.space.capi_test:select()

The result should look like this:

tarantool> box.space.capi_test:select()
---
- - [10000, 'String 2']
...

This proves that the hardest() function succeeded, but where did box_space_id_by_name() and box_insert() come from? Answer: the C API.

read.c

Go back to the shell where the easy.c and the harder.c and the hardest.c programs were created.

Create a file. Name it read.c. Put these 42 lines in it:

#include "module.h"
#include <msgpuck.h>
int read(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  char tuple_buf[1024];      /* where the raw MsgPack tuple will be stored */
  uint32_t space_id = box_space_id_by_name("capi_test", strlen("capi_test"));
  uint32_t index_id = 0;     /* The number of the space's first index */
  uint32_t key = 10000;      /* The key value that box_insert() used */
  mp_encode_array(tuple_buf, 0); /* clear */
  box_tuple_format_t *fmt = box_tuple_format_default();
  box_tuple_t *tuple = NULL;
  char key_buf[16];          /* Pass key_buf = encoded key = 10000 */
  char *key_end = key_buf;
  key_end = mp_encode_array(key_end, 1);
  key_end = mp_encode_uint(key_end, key);
  assert(key_end <= key_buf + sizeof(key_buf));
  /* Get the tuple. There's no box_select() but there's this. */
  int r = box_index_get(space_id, index_id, key_buf, key_end, &tuple);
  assert(r == 0);
  assert(tuple != NULL);
  /* Get each field of the tuple + display what you get. */
  int field_no;             /* The first field number is 0. */
  for (field_no = 0; field_no < 2; ++field_no)
  {
    const char *field = box_tuple_field(tuple, field_no);
    assert(field != NULL);
    assert(mp_typeof(*field) == MP_STR || mp_typeof(*field) == MP_UINT);
    if (mp_typeof(*field) == MP_UINT)
    {
      uint32_t uint_value = mp_decode_uint(&field);
      printf("uint value=%u.\n", uint_value);
    }
    else /* if (mp_typeof(*field) == MP_STR) */
    {
      const char *str_value;
      uint32_t str_value_length;
      str_value = mp_decode_str(&field, &str_value_length);
      printf("string value=%.*s.\n", (int)str_value_length, str_value);
    }
  }
  return 0;
}

Compile the program, producing a library file named read.so:

$ gcc -shared -o read.so -fPIC read.c

Now go back to the client and execute these requests:

box.schema.func.create('read', {language = "C"})
box.schema.user.grant('guest', 'execute', 'function', 'read')
box.schema.user.grant('guest', 'read,write', 'space', 'capi_test')
capi_connection:call('read')

This time the C function is doing four things:

once again, finding the numeric identifier of the capi_test space by calling box_space_id_by_name();
formatting a search key = 10000 using more msgpuck.h functions;
getting a tuple using box_index_get();
going through the tuple’s fields with box_tuple_field() and then decoding each field depending on its type. In this case, since what we are getting is the tuple that we inserted with hardest.c, we know in advance that the type is either MP_UINT or MP_STR; however, it’s very common to have a case statement here with one option for each possible type.
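
The “case statement with one option for each possible type” might look like the following sketch. It is self-contained: toy_typeof() and enum toy_type are hypothetical, reduced stand-ins for msgpuck’s mp_typeof() and enum mp_type, and they classify a field by its first byte according to the MsgPack specification. In a real stored procedure you would switch on mp_typeof(*field) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-in for msgpuck's enum mp_type. */
enum toy_type { TOY_NIL, TOY_BOOL, TOY_UINT, TOY_STR, TOY_ARRAY, TOY_OTHER };

/* Hypothetical stand-in for mp_typeof(): classifies by the first byte. */
static enum toy_type toy_typeof(char c)
{
  uint8_t b = (uint8_t)c;
  if (b <= 0x7f || (b >= 0xcc && b <= 0xcf)) return TOY_UINT;
  if ((b >= 0xa0 && b <= 0xbf) || (b >= 0xd9 && b <= 0xdb)) return TOY_STR;
  if ((b >= 0x90 && b <= 0x9f) || b == 0xdc || b == 0xdd) return TOY_ARRAY;
  if (b == 0xc0) return TOY_NIL;
  if (b == 0xc2 || b == 0xc3) return TOY_BOOL;
  return TOY_OTHER;
}

/* One branch per type; a real function would decode the value here. */
static const char *toy_describe(const char *field)
{
  assert(field != NULL);
  switch (toy_typeof(*field)) {
  case TOY_UINT:  return "unsigned integer";
  case TOY_STR:   return "string";
  case TOY_ARRAY: return "array";
  case TOY_NIL:   return "nil";
  case TOY_BOOL:  return "boolean";
  default:        return "unsupported";
  }
}
```

The default branch matters: a field of an unexpected type is reported rather than decoded with the wrong routine.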

The result of capi_connection:call('read') should look like this:

tarantool> capi_connection:call('read')
uint value=10000.
string value=String 2.
---
- []
...

This proves that the read() function succeeded. Once again the important functions that start with box – box_index_get() and box_tuple_field() – came from the C API.

write.c

Go back to the shell where the programs easy.c, harder.c, hardest.c and read.c were created.

Create a file. Name it write.c. Put these 24 lines in it:

#include "module.h"
#include <msgpuck.h>
int write(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  static const char *space = "capi_test";
  char tuple_buf[1024]; /* Must be big enough for mp_encode results */
  uint32_t space_id = box_space_id_by_name(space, strlen(space));
  if (space_id == BOX_ID_NIL) {
    return box_error_set(__FILE__, __LINE__, ER_PROC_C,
    "Can't find space %s", "capi_test");
  }
  char *tuple_end = tuple_buf;
  tuple_end = mp_encode_array(tuple_end, 2);
  tuple_end = mp_encode_uint(tuple_end, 1);
  tuple_end = mp_encode_uint(tuple_end, 22);
  box_txn_begin();
  if (box_replace(space_id, tuple_buf, tuple_end, NULL) != 0)
    return -1;
  box_txn_commit();
  fiber_sleep(0.001);
  struct tuple *tuple = box_tuple_new(box_tuple_format_default(),
                                      tuple_buf, tuple_end);
  return box_return_tuple(ctx, tuple);
}

Compile the program, producing a library file named write.so:

$ gcc -shared -o write.so -fPIC write.c

Now go back to the client and execute these requests:

box.schema.func.create('write', {language = "C"})
box.schema.user.grant('guest', 'execute', 'function', 'write')
box.schema.user.grant('guest', 'read,write', 'space', 'capi_test')
capi_connection:call('write')

This time the C function is doing six things:

once again, finding the numeric identifier of the capi_test space by calling box_space_id_by_name();
making a new tuple;
starting a transaction;
replacing a tuple in box.space.capi_test;
ending a transaction;
returning the entire tuple to the caller with box_return_tuple(). This final line replaces the loop in read.c: instead of getting each field and printing it, the function lets the caller display the tuple.

The result of capi_connection:call('write') should look like this:

tarantool> capi_connection:call('write')
---
- [[1, 22]]
...

This proves that the write() function succeeded. Once again the important functions that start with box – box_txn_begin(), box_txn_commit() and box_return_tuple() – came from the C API.

Conclusion: the long description of the whole C API is there for a good reason. All of the functions in it can be called from C functions which are called from Lua. So C “stored procedures” have full access to the database.

Cleaning up

Get rid of each of the function tuples with box.schema.func.drop().
Get rid of the capi_test space with box.space.capi_test:drop().
Remove the .c and .so files that were created for this tutorial.

An example in the test suite

Download the source code of Tarantool. Look in a subdirectory test/box. Notice that there is a file named tuple_bench.test.lua and another file named tuple_bench.c. Examine the Lua file and observe that it is calling a function in the C file, using the same techniques that this tutorial has shown.

Conclusion: parts of the standard test suite use C stored procedures, and they must work, because releases don’t happen if Tarantool doesn’t pass the tests.

Developing with an IDE

You can use IntelliJ IDEA as an IDE to develop and debug Lua applications for Tarantool.

Download and install the IDE from the official website.

JetBrains provides specialized editions for particular languages: IntelliJ IDEA (Java), PHPStorm (PHP), PyCharm (Python), RubyMine (Ruby), CLion (C/C++), WebStorm (Web) and others. So, download a version that suits your primary programming language.

Tarantool integration is supported for all editions.
Configure the IDE:
1. Start IntelliJ IDEA.
2. Click the Configure button and select Plugins.
3. Click Browse repositories.
4. Install the EmmyLua plugin.
  
  Note
  
  Don’t confuse it with the Lua plugin, which is less powerful than EmmyLua.
5. Restart IntelliJ IDEA.
6. Click Configure, select Project Defaults and then Run Configurations.
7. Find Lua Application in the sidebar at the left.
8. In Program, type a path to an installed tarantool binary.
  
  By default, this is tarantool or /usr/bin/tarantool on most platforms.
  
  If you installed tarantool from sources to a custom directory, please specify the proper path here.
  
  Now IntelliJ IDEA is ready to use with Tarantool.
Create a new Lua project.
Add a new Lua file, for example init.lua.
Write your code, save the file.
To run your application, click Run -> Run in the main menu and select your source file in the list.

Or click Run -> Debug to start debugging.

Note

To use the Lua debugger, upgrade Tarantool to version 1.7.5-29-gbb6170e4b or later.

Tooling

This section describes the tools that enable developers and administrators to work with Tarantool.

tt CLI utility

tt is a utility that provides a unified command-line interface for managing Tarantool-based applications. It covers a wide range of tasks – from installing a specific Tarantool version to managing remote instances and developing applications.

tt is developed in its own GitHub repository. Here you can find its source code, changelog, and releases information. For a complete list of releases, see the Releases section on GitHub.

There is also the Enterprise version of tt available in a Tarantool Enterprise Edition’s release package. The Enterprise version provides additional features, for example, importing and exporting data.

This section provides instructions on tt installation and configuration, concept explanation, and the tt command reference.

tt environments

The key aspect of the tt usage is an environment. A tt environment is a directory that includes a tt configuration, Tarantool installations, application files, and other resources. If you’re familiar with Python virtual environments, you can think of tt environments as their analog.

tt environments enable independent management of multiple Tarantool applications, each running on its own Tarantool version and configuration, on a single host in an isolated manner.

To create a tt environment in a directory, run tt init in it.

Multi-instance applications

tt supports Tarantool applications that run on multiple instances. For example, you can write an application that includes different source files for storage and router instances. With tt, you can start and stop them in a single call, or manage each instance independently.

Learn more about working with multi-instance applications in Multi-instance applications.

Replacement for tarantoolctl and Cartridge CLI

A multi-purpose tool for working with Tarantool from the command line, tt replaces the deprecated tarantoolctl and Cartridge CLI utilities. The instructions on migration to tt are provided in Migration from tarantoolctl to tt.

Installation

To install the tt command-line utility, use a package manager – Yum or APT on Linux, or Homebrew on macOS. If you need a specific build, you can build tt from sources.

Note

A Tarantool Enterprise Edition’s release package includes the tt utility extended with additional features like importing and exporting data.

Using Linux package managers

On Linux systems, you can install tt with the yum or apt package manager from the tarantool/modules repository. Learn how to add this repository.

The installation command looks like this:

On Ubuntu:
```
$ sudo apt-get install tt
```
On CentOS:
```
$ sudo yum install tt
```

Using Homebrew on macOS

On macOS, use Homebrew to install tt:

$ brew install tt

Building from sources

To build tt from sources:

Install third-party software required for building tt:

git, the version control system.

Go language, version 1.18 or later.

mage build tool.

Clone the tarantool/tt repository:

git clone https://github.com/tarantool/tt --recursive

Go to the tt directory:
```
cd tt
```
(Optional) Check out a release tag to build a specific version:
```
git checkout tags/v1.0.0
```
Build tt using mage:
```
mage build
```

tt will appear in the current directory.

Enabling shell completion

To enable the completion for tt commands, run the following command specifying the shell (bash or zsh):

. <(tt completion bash)

Configuration

Configuration file

The key artifact that defines the tt environment and various aspects of its execution is its configuration file. You can generate it with a tt init call. In the default launch mode, the file is generated in the current directory, making it the environment root.

Name and location

By default, the configuration file is called tt.yaml and is located in the tt environment root directory. Which directory that is depends on the launch mode.

It is also possible to pass the configuration file name and location explicitly in one of the following ways:

the -c/--cfg global option
the TT_CLI_CFG environment variable

The TT_CLI_CFG variable has a lower priority than the --cfg option.

Structure

The tt configuration file is a YAML file with the following structure:

env:
  instances_enabled: path/to/available/applications
  bin_dir: path/to/bin_dir
  inc_dir: path/to/inc_dir
  restart_on_failure: bool
  tarantoolctl_layout: bool
modules:
  directory: path/to/modules/dir
app:
  run_dir: path/to/run_dir
  log_dir: path/to/log_dir
  wal_dir: path/to/wal_dir
  vinyl_dir: path/to/vinyl_dir
  memtx_dir: path/to/memtx_dir
repo:
  rocks: path/to/rocks
  distfiles: path/to/install
ee:
  credential_path: path/to/file
templates:
  - path: path/to/app/templates1
  - path: path/to/app/templates2

Note

The tt configuration format and application layout have been changed in version 2.0. Learn how to upgrade from earlier versions in Migrating from tt 1.* to 2.0 or later.

env section

Note

The paths specified in env.* parameters are relative to the current tt environment’s root.

instances_enabled – the directory where instances are stored. Default: instances.enabled.
bin_dir – the directory where binary files are stored. Default: bin.
inc_dir – the base directory for storing header files. They will be placed in the include subdirectory inside the specified directory. Default: include.

Note

The header files directory path can also be passed using the TT_CLI_TARANTOOL_PREFIX environment variable. If it is set, tt rocks and tt build commands use the include/tarantool directory inside TT_CLI_TARANTOOL_PREFIX as the header files directory.

restart_on_failure – restart the instance on failure: true or false. Default: false.
tarantoolctl_layout – use a layout compatible with the deprecated tarantoolctl utility for artifact files: control sockets, .pid files, log files. Default: false.

modules section

directory – the directory where external modules are stored.

app section

Note

The paths specified in app.*_dir parameters are relative to the application location inside the instances.enabled directory specified in the env configuration section. For example, the default location of the myapp application’s logs is instances.enabled/myapp/var/log. Inside this location, tt creates separate directories for each application instance that runs in the current environment.

run_dir – the directory for instance runtime artifacts, such as console sockets or PID files. Default: var/run.
log_dir – the directory where log files are stored. Default: var/log.
wal_dir – the directory where write-ahead log (.xlog) files are stored. Default: var/lib.
memtx_dir – the directory where memtx stores snapshot (.snap) files. Default: var/lib.
vinyl_dir – the directory where vinyl files or subdirectories are stored. Default: var/lib.
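
As an illustration of how these relative paths combine, the sketch below joins the instances directory, the application name, an app.*_dir value, and an instance name into one path. The toy_artifact_dir() helper is hypothetical and only demonstrates the layout described in this section; it is not how tt itself computes paths.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds <instances_enabled>/<app>/<app_dir>/<instance>.
 * Returns 0 on success and -1 if the result does not fit into out. */
static int toy_artifact_dir(char *out, size_t out_size,
                            const char *instances_enabled, const char *app,
                            const char *app_dir, const char *instance)
{
  assert(out != NULL && out_size > 0);
  int n = snprintf(out, out_size, "%s/%s/%s/%s",
                   instances_enabled, app, app_dir, instance);
  return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the defaults, the arguments "instances.enabled", "myapp", "var/log", and "instance1" produce instances.enabled/myapp/var/log/instance1, relative to the environment root.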

repo section

rocks – the directory where rocks files are stored.

Note

The rocks directory path can be passed in the TT_CLI_REPO_ROCKS environment variable instead. The variable is also used if the directory specified in repo.rocks does not include a repository manifest.

distfiles – the directory where installation files are stored.

ee section

credential_path – a path to the file with credentials used for downloading Tarantool Enterprise Edition (Tarantool customer zone credentials). The file should contain a username and a password, each on a separate line. Find an example in the tt install command reference.

Note

The customer zone credentials can also be passed in the TT_CLI_EE_USERNAME and TT_CLI_EE_PASSWORD environment variables.

templates section

path – a path to application templates used for creating applications with tt create. May be specified more than once.

Launch modes

tt launch mode defines its working directory and the way it searches for the configuration file. There are three launch modes:

default
system
local

Default launch

Global option: none

Configuration file: Searched from the current directory to the root. Taken from /etc/tarantool if the file is not found.

Working directory: The directory where the configuration file is found.
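
The upward search can be pictured with the following sketch. It is an assumption-laden illustration, not tt’s actual code: toy_find_cfg_dir(), the toy_exists_fn callback type, and toy_exists_demo() are hypothetical names, and the existence check is passed in as a callback so that the example does not touch the real file system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: reports whether a file exists at path. */
typedef bool (*toy_exists_fn)(const char *path);

/* Demo callback: pretends that only one tt.yaml exists. */
static bool toy_exists_demo(const char *path)
{
  return strcmp(path, "/home/user/myapp/tt.yaml") == 0;
}

/* Writes into out the directory where tt.yaml is found, walking from
 * start_dir up to "/", and falls back to /etc/tarantool otherwise. */
static void toy_find_cfg_dir(const char *start_dir, toy_exists_fn exists,
                             char *out, size_t out_size)
{
  char dir[512];
  char candidate[600];
  assert(start_dir != NULL && exists != NULL);
  snprintf(dir, sizeof(dir), "%s", start_dir);
  for (;;) {
    /* For the root itself, avoid producing "//tt.yaml". */
    snprintf(candidate, sizeof(candidate), "%s/tt.yaml",
             strcmp(dir, "/") == 0 ? "" : dir);
    if (exists(candidate)) {
      snprintf(out, out_size, "%s", dir);
      return;
    }
    if (strcmp(dir, "/") == 0)
      break;                 /* reached the root without a match */
    char *slash = strrchr(dir, '/');
    if (slash == NULL)
      break;                 /* relative path: nothing left to strip */
    if (slash == dir)
      dir[1] = '\0';         /* the parent of "/x" is "/" */
    else
      *slash = '\0';         /* the parent of "/a/b" is "/a" */
  }
  snprintf(out, out_size, "%s", "/etc/tarantool");
}
```

Starting from /home/user/myapp/instances.enabled/app, the demo callback makes the search stop at /home/user/myapp, which then becomes the working directory; starting from a directory with no tt.yaml above it yields /etc/tarantool.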

System launch

Global option: --system or -S

Configuration file: Taken from /etc/tarantool.

Working directory: Current directory.

Local launch

Global option: --local=DIRECTORY or -L=DIRECTORY

Configuration file: Searched from the specified directory to the root. Taken from /etc/tarantool if the file is not found.

Working directory: The specified directory. If tarantool or tt executable files are found in the working directory, they will be used.

Migrating from tt 1.* to 2.0 or later

The tt configuration and application layout were changed in version 2.0. If you are using tt 1.*, complete the following steps to migrate to tt 2.0 or later:

Update the tt configuration file. In tt 2.0, the following changes were made to the configuration file:
- The root section tt was removed. Its child sections – app, repo, modules, and others – have been moved to the top level.
- Environment configuration parameters were moved from the app section to the new section env. These parameters are instances_enabled, bin_dir, inc_dir, and restart_on_failure.
- The paths in the app section are now relative to the app directory in instances.enabled instead of the environment root.
You can use tt init to generate a configuration file with the new structure and default parameter values.
Move application artifacts. With tt 1.*, application artifacts (logs, snapshots, pid, and other files) were created in the var directory inside the environment root. Starting from tt 2.0, these artifacts are created in the var directory inside the application directory, which is instances.enabled/<app-name>. This is how an application directory looks:
```
instances.enabled/app/
├── init.lua
├── instances.yml
└── var
    ├── lib
    │   ├── instance1
    │   └── instance2
    ├── log
    │   ├── instance1
    │   └── instance2
    └── run
        ├── instance1
        └── instance2
```
To continue using existing application artifacts after migration from tt 1.*:
1. Create the var directory inside the application directory.
2. Create the lib, log, and run directories inside var.
3. Move directories with instance artifacts from the old var directory to the new var directories in applications’ directories.
Move the files accessed from the application code. The working directory of instance processes was changed from the tt working directory to the application directory inside instances.enabled. If the application accesses files using relative paths, move the files accordingly or adjust the application code.

Global options

Important

Global options of tt must be passed before its commands and other options. For example:

$ tt --cfg tt-conf.yaml start app

tt has the following global options:

-c=file, --cfg=file: Path to the configuration file. Alternatively, this path can be passed in the TT_CLI_CFG environment variable.

-h, --help: Display help.

--integrity-check PUBLIC_KEY: (Enterprise Edition only) Perform an integrity check using the specified public key before executing the operation. Learn more in Integrity check.

-I, --internal: Force the use of an internal module even if there is an external module with the same name.

-L=DIRECTORY, --local=DIRECTORY: Use the tt environment from the specified directory. Learn more about the local launch mode.

-s, --self: Use the current tt version instead of executing the one located in the bin_dir directory.

-S, --system: Use the tt environment installed in the system. Learn more about the system launch mode.

-V, --verbose: Display detailed processing information (verbose mode).

Developing applications

This section describes tt capabilities related to developing cluster applications.

Application environment

This section provides a high-level overview on how to prepare a Tarantool application for deployment and how the application’s environment and layout might look. This information is helpful for understanding how to administer Tarantool instances using tt CLI in both development and production environments.

The main steps of creating and preparing the application for deployment are:

Initializing a local environment.
Creating and developing an application.
Packaging the application.

In this section, a sharded_cluster_crud application is used as an example. This cluster includes 5 instances: one router and 4 storages, which constitute two replica sets.

Initializing a local environment

Before creating an application, you need to set up a local environment for tt:

Create a home directory for the environment.

Run tt init in this directory:

~/myapp$ tt init
   • Environment config is written to 'tt.yaml'

This command creates a default tt configuration file tt.yaml for a local environment and the directories for applications, control sockets, logs, and other artifacts:

~/myapp$ ls
bin  distfiles  include  instances.enabled  modules  templates  tt.yaml

Find detailed information about the tt configuration parameters and launch modes on the tt configuration page.

Creating and developing an application

You can create an application in two ways:

Manually by preparing its layout in a directory inside instances_enabled. The directory name is used as the application identifier.
From a template by using the tt create command.

In this example, the application’s layout is prepared manually and looks as follows:

~/myapp$ tree
.
├── bin
├── distfiles
├── include
├── instances.enabled
│   └── sharded_cluster_crud
│       ├── config.yaml
│       ├── instances.yaml
│       ├── router.lua
│       ├── sharded_cluster_crud-scm-1.rockspec
│       └── storage.lua
├── modules
├── templates
└── tt.yaml

The sharded_cluster_crud directory contains the following files:

config.yaml: contains the configuration of the cluster. This file might include the entire cluster topology or provide connection settings to a centralized configuration storage.
instances.yaml: specifies instances to run in the current environment. For example, on the developer’s machine, this file might include all the instances defined in the cluster configuration. In the production environment, this file includes instances to run on the specific machine.
router.lua: includes code specific for a router.
sharded_cluster_crud-scm-1.rockspec: specifies the required external dependencies (for example, vshard and crud).
storage.lua: includes code specific for storages.

You can find the full example here: sharded_cluster_crud.

Packaging the application

To package the finished application, use the tt pack command. This command can create an installable DEB/RPM package or generate a .tgz archive.

The structure below reflects the content of the packed .tgz archive for the sharded_cluster_crud application:

~/myapp$ tree -a
.
├── bin
│   ├── tarantool
│   └── tt
├── instances.enabled
│   └── sharded_cluster_crud -> ../sharded_cluster_crud
├── sharded_cluster_crud
│   ├── .rocks
│   │   └── share
│   │       └── ...
│   ├── config.yaml
│   ├── instances.yaml
│   ├── router.lua
│   └── storage.lua
└── tt.yaml

The application’s layout looks similar to the one defined when developing the application with some differences:

bin: contains the tarantool and tt binaries packed with the application bundle.
instances.enabled: contains a symlink to the packed sharded_cluster_crud application.
sharded_cluster_crud: a packed application. In addition to files created during the application development, includes the .rocks directory containing application dependencies (for example, vshard and crud).
tt.yaml: a tt configuration file.

Note

In DEB/RPM packages generated by tt pack, there are also .service unit files for each packaged application.

Deploying the application

Instances to run

When deploying a distributed cluster application from a .tar.gz archive, you can define instances to run on each machine by changing the content of the instances.yaml file.

On the developer’s machine, this file might include all the instances defined in the cluster configuration.

instances.yaml:
```
storage-a-001:
storage-a-002:
storage-b-001:
storage-b-002:
router-a-001:
```
In the production environment, this file includes instances to run on the specific machine.

instances.yaml (Server-001):
```
router-a-001:
```
instances.yaml (Server-002):
```
storage-a-001:
storage-b-001:
```
instances.yaml (Server-003):
```
storage-a-002:
storage-b-002:
```

The Starting and stopping instances section describes how to start and stop Tarantool instances.

DEB and RPM packages

Tarantool applications installed from DEB and RPM packages built with tt pack can run as systemd services. They run on behalf of the tarantool system user. It is created automatically during the package installation.

By default, the application artifacts are placed in the following directories:

/var/lib/tarantool/sys_env – application data
/var/log/tarantool/sys_env – logs
/var/run/tarantool/sys_env – runtime artifacts

If you want to change these directories, make sure that the tarantool user has enough permissions on the directories you use.

Starting and stopping instances

Note

This section describes how to manage instances in a Tarantool cluster using the tt utility. A cluster can include multiple instances that run different code. A typical example is a cluster application that includes router and storage instances. Particularly, you can perform the following actions:

start all instances in a cluster or only specific ones
check the status of instances
connect to a specific instance
stop all instances or only specific ones

To get more context on how the application’s environment might look, refer to Application environment.

Note

In this section, a sharded_cluster_crud application is used to demonstrate how to start, stop, and manage instances in a cluster.

Starting Tarantool instances

To start Tarantool instances, use the tt start command:

$ tt start sharded_cluster_crud
   • Starting an instance [sharded_cluster_crud:storage-a-001]...
   • Starting an instance [sharded_cluster_crud:storage-a-002]...
   • Starting an instance [sharded_cluster_crud:storage-b-001]...
   • Starting an instance [sharded_cluster_crud:storage-b-002]...
   • Starting an instance [sharded_cluster_crud:router-a-001]...

After the cluster has started and worked for some time, you can find its artifacts in the directories specified in the tt configuration. These are the default locations in the local launch mode:

sharded_cluster_crud/var/log/<instance_name>/ – instance logs.
sharded_cluster_crud/var/lib/<instance_name>/ – snapshots and write-ahead logs.
sharded_cluster_crud/var/run/<instance_name>/ – control sockets and PID files.

In the system launch mode, artifacts are created in these locations:

/var/log/tarantool/<instance_name>/
/var/lib/tarantool/<instance_name>/
/var/run/tarantool/<instance_name>/

Basic instance management

Most of the commands described in this section can be called with or without an instance name. Without the instance name, they are executed for all instances defined in instances.yaml.

Checking an instance’s status

To check the status of instances, execute tt status:

$ tt status sharded_cluster_crud
 INSTANCE                            STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
 sharded_cluster_crud:router-a-001   RUNNING  8382  RW    ready   running  --
 sharded_cluster_crud:storage-a-001  RUNNING  8386  RW    ready   running  --
 sharded_cluster_crud:storage-a-002  RUNNING  8390  RO    ready   running  --
 sharded_cluster_crud:storage-b-001  RUNNING  8379  RW    ready   running  --
 sharded_cluster_crud:storage-b-002  RUNNING  8380  RO    ready   running  --

To check the status of a specific instance, you need to specify its name:

$ tt status sharded_cluster_crud:storage-a-001
 INSTANCE                            STATUS   PID   MODE  CONFIG  BOX      UPSTREAM
 sharded_cluster_crud:storage-a-001  RUNNING  8386  RW    ready   running  --

Connecting to an instance

To connect to the instance, use the tt connect command:

$ tt connect sharded_cluster_crud:storage-a-001
   • Connecting to the instance...
   • Connected to sharded_cluster_crud:storage-a-001

sharded_cluster_crud:storage-a-001>

In the instance’s console, you can execute commands provided by the box module. For example, box.info can be used to get various information about a running instance:

sharded_cluster_crud:storage-a-001> box.info.ro
---
- false
...

Restarting instances

To restart an instance, use tt restart:

$ tt restart sharded_cluster_crud:storage-a-002

After executing tt restart, you need to confirm this operation:

Confirm restart of 'sharded_cluster_crud:storage-a-002' [y/n]: y
   • The Instance sharded_cluster_crud:storage-a-002 (PID = 2026) has been terminated.
   • Starting an instance [sharded_cluster_crud:storage-a-002]...

Stopping instances

To stop a specific instance, use tt stop as follows:

$ tt stop sharded_cluster_crud:storage-a-002

You can also stop all the instances at once as follows:

$ tt stop sharded_cluster_crud
   • The Instance sharded_cluster_crud:storage-b-001 (PID = 2020) has been terminated.
   • The Instance sharded_cluster_crud:storage-b-002 (PID = 2021) has been terminated.
   • The Instance sharded_cluster_crud:router-a-001 (PID = 2022) has been terminated.
   • The Instance sharded_cluster_crud:storage-a-001 (PID = 2023) has been terminated.
   • can't "stat" the PID file. Error: "stat /home/testuser/myapp/instances.enabled/sharded_cluster_crud/var/run/storage-a-002/tt.pid: no such file or directory"

Note

The error message indicates that storage-a-002 was not running when the command was executed.

Removing instance artifacts

The tt clean command removes instance artifacts (such as logs or snapshots):

$ tt clean sharded_cluster_crud
   • List of files to delete:

   • /home/testuser/myapp/instances.enabled/sharded_cluster_crud/var/log/storage-a-001/tt.log
   • /home/testuser/myapp/instances.enabled/sharded_cluster_crud/var/lib/storage-a-001/00000000000000001062.snap
   • /home/testuser/myapp/instances.enabled/sharded_cluster_crud/var/lib/storage-a-001/00000000000000001062.xlog
   • ...

Confirm [y/n]:

Enter y and press Enter to confirm the removal of artifacts for each instance.

Note

The -f option of the tt clean command can be used to remove the files without confirmation.

Preloading Lua scripts and modules

Tarantool supports loading and running chunks of Lua code before starting instances. To load or run Lua code immediately upon Tarantool startup, specify the TT_PRELOAD environment variable. Its value can be either a path to a Lua script or a Lua module name:

To run the Lua script preload_script.lua from the sharded_cluster_crud directory, set TT_PRELOAD as follows:
```
$ TT_PRELOAD=preload_script.lua tt start sharded_cluster_crud
```
Tarantool runs the preload_script.lua code, waits for it to complete, and then starts instances.
To load the preload_module from the sharded_cluster_crud directory, set TT_PRELOAD as follows:
```
$ TT_PRELOAD=preload_module tt start sharded_cluster_crud
```
Note

TT_PRELOAD values that end with .lua are considered scripts, so avoid module names with this ending.

To load several scripts or modules, pass them in a single quoted string, separated by semicolons:

$ TT_PRELOAD="preload_script.lua;preload_module" tt start sharded_cluster_crud

If an error happens during the execution of the preload script or module, Tarantool reports the problem and exits.
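
The rules above for interpreting a TT_PRELOAD value can be illustrated with a short Python sketch. It only models the documented behavior (entries separated by semicolons, a .lua ending marks a script) and is not how tt or Tarantool implements it; the function name classify_preload is hypothetical, and skipping empty entries is an assumption of the sketch.

```python
def classify_preload(value):
    """Split a TT_PRELOAD value into (kind, name) pairs.

    Entries are separated by semicolons. An entry that ends with ".lua"
    is treated as a script path, anything else as a module name.
    Empty entries are skipped (an assumption of this sketch).
    """
    entries = [item for item in value.split(";") if item]
    return [
        ("script" if item.endswith(".lua") else "module", item)
        for item in entries
    ]


# Example value taken from the command above.
print(classify_preload("preload_script.lua;preload_module"))
```

The sketch also shows why a module should not be named with a .lua ending: such an entry would be classified as a script.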

Commands

Below is a list of tt commands. Run tt COMMAND help to see the detailed help for the given command.

binaries	Show a list of installed binaries and their versions
build	Build an application locally
cartridge	Manage a Cartridge application
cat	Print the contents of `.snap` or `.xlog` files into stdout
cfg	Manage a `tt` environment configuration
check	Check an application file for syntax errors
clean	Clean instance files
cluster	Manage a cluster’s configuration
completion	Generate completion for a specified shell
connect	Connect to a Tarantool instance
coredump	Manipulate Tarantool core dumps
create	Create an application from a template
crud	Interact with the CRUD module (Enterprise only)
download	Download the Tarantool Enterprise SDK
export	Export data to a file (Enterprise only)
help	Display help for `tt` or a specific command
import	Import data from a file (Enterprise only)
init	Create a new `tt` environment in the current directory
install	Install Tarantool or `tt`
instances	List enabled applications
kill	Terminate Tarantool applications or instances
log	Print instance logs
logrotate	Rotate instance logs
migrations	Manage migrations
pack	Package an application
play	Play the contents of `.snap` or `.xlog` files to another Tarantool instance
replicaset	Manage replica sets
restart	Restart Tarantool applications or instances
rocks	Use the LuaRocks package manager
run	Run Lua code in a Tarantool instance
search	Search available Tarantool and `tt` versions
start	Start Tarantool applications or instances
status	Get the current status of applications or instances
stop	Stop Tarantool applications or instances
tdg2	Interact with Tarantool Data Grid 2 clusters
uninstall	Uninstall Tarantool or `tt`
version	Show the `tt` version information

Managing binaries in the current environment

$ tt binaries COMMAND [COMMAND_OPTION ...]

tt binaries manages Tarantool and tt binaries installed in the current environment.

COMMAND is one of the following:

list
switch

list

$ tt binaries list

tt binaries list shows a list of installed binaries and their versions.

To show a list of installed Tarantool versions:

$ tt binaries list
List of installed binaries:
   • tarantool:
        3.1.0 [active]
        2.11.2
   • tt:
        2.3.0
        2.2.1 [active]

switch

$ tt binaries switch [PROGRAM_NAME] [VERSION]

tt binaries switch switches binaries used in the current environment. The possible values of PROGRAM_NAME are:

tarantool: Tarantool Community Edition.
tarantool-ee: Tarantool Enterprise Edition.
tt: the tt command-line utility.

When called without arguments, the command lets you choose the program and version interactively:

$ tt binaries switch
Use the arrow keys to navigate: ↓ ↑ → ←
? Select program:
  ▸ tarantool
    tarantool-ee
    tt

You can also specify the program name and version in the call.

To view tt versions installed in the current environment and switch between them:

$ tt binaries switch tt
Use the arrow keys to navigate: ↓ ↑ → ←
? Select version:
  ▸ 2.2.1
    2.3.0 [active]

To switch to a specific Tarantool EE version installed in the current environment:

$ tt binaries switch tarantool-ee 3.1.0

Building an application

$ tt build [PATH] [--spec SPEC_FILE_PATH]

tt build builds a Tarantool application locally.

Options

--spec SPEC_FILE_PATH: Path to a .rockspec file to use for the current build

Details

The PATH argument should contain the path to the application directory (that is, to the build source). The default path is . (current directory).

The application directory must contain a .rockspec file to use for the build. If there is more than one .rockspec file in the application directory, specify the one to use in the --spec argument.

tt build builds an application with the tt rocks make command. It downloads the application dependencies into the .rocks directory, making the application ready to run locally.

Pre-build and post-build scripts

In addition to building the application with LuaRocks, tt build can execute pre-build and post-build scripts. These scripts should contain steps to execute right before and after building the application. The files must be named tt.pre-build and tt.post-build, respectively, and located in the application directory.

Note

For compatibility with Cartridge applications, the pre-build and post-build scripts can also have names cartridge.pre-build and cartridge.post-build.

tt.pre-build is helpful when your application depends on closed-source rocks, or if the build should contain rocks from a project added as a submodule. You can install these dependencies using the pre-build script before building. Example:

#!/bin/sh

# The main purpose of this script is to build non-standard rocks modules.
# The script will run before `tt rocks make` during application build.

tt rocks make --chdir ./third_party/proj

tt.post-build is a script that runs after tt rocks make. The main purpose of this script is to remove build artifacts from the final package. Example:

#!/bin/sh

# The main purpose of this script is to remove build artifacts from the resulting package.
# The script will run after `tt rocks make` during application build.

rm -rf third_party
rm -rf node_modules
rm -rf doc

Examples

Build the application app1 from its directory:
```
$ tt build
```
Build the application app1 from the simple_app directory inside the current directory:
```
$ tt build simple_app
```
Build the application app1 from its directory explicitly specifying the rockspec file to use:
```
$ tt build --spec app1-scm-1.rockspec
```

Managing a Cartridge application

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This command is added for backward compatibility with earlier versions.

$ tt cartridge COMMAND {[OPTION ...]|SUBCOMMAND}

tt cartridge manages a Cartridge application. COMMAND is one of the following:

admin
bench
failover
repair
replicasets

admin

$ tt cartridge admin ADMIN_FUNC_NAME [ADMIN_OPTION ...]

tt cartridge admin calls admin functions provided by the application.

Options

--name STRING: (Required) An application name.

-l, --list: List the available admin functions.

--instance STRING: A name of the instance to connect to.

--conn STRING: An address to connect to.

--run-dir STRING: A directory where PID and socket files are stored. Defaults to /var/run/tarantool.

Examples

Get a list of the available admin functions:

$ tt cartridge admin --name APPNAME --list

   • Available admin functions:

probe  Probe instance

Get help for a specific function:

$ tt cartridge admin --name APPNAME probe --help

   • Admin function "probe" usage:

Probe instance

Args:
  --uri string  Instance URI

Call a function with an argument:

$ tt cartridge admin --name APPNAME probe --uri localhost:3301

   • Probe "localhost:3301": OK

bench

$ tt cartridge bench [BENCH_OPTION ...]

tt cartridge bench runs benchmarks for Tarantool.

Options

--url STRING: A Tarantool instance address (the default is 127.0.0.1:3301).

--user STRING: A username used to connect to the instance (the default is guest).

--password STRING: A password used to connect to the instance.

--connections INT: A number of concurrent connections (the default is 10).

--requests INT: A number of simultaneous requests per connection (the default is 10).

--duration INT: The duration of a benchmark test in seconds (the default is 10).

--keysize INT: The size of a key part of benchmark data in bytes (the default is 10).

--datasize INT: The size of a value part of benchmark data in bytes (the default is 20).

--insert INT: A percentage of inserts (the default is 100).

--select INT: A percentage of selects.

--update INT: A percentage of updates.

--fill INT: A number of records to pre-fill the space (the default is 1000000).

failover

$ tt cartridge failover COMMAND [COMMAND_OPTION ...]

tt cartridge failover manages an application failover. The following commands are available:

set
setup
status
disable

failover set

$ tt cartridge failover set MODE [FAILOVER_SET_OPTION ...]

Set up failover in the specified mode:

stateful
eventual
disabled

Options:

--state-provider STRING: A failover’s state provider. Can be stateboard or etcd2. Used only in the stateful mode.
--params STRING: Failover parameters specified in a JSON-formatted string, for example, '{"fencing_timeout": 10, "fencing_enabled": true}'.
--provider-params STRING: Failover provider parameters specified in a JSON-formatted string, for example, '{"lock_delay": 14}'.
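
Because --params and --provider-params take JSON-formatted strings, one way to avoid quoting mistakes is to generate them with a JSON serializer. The Python sketch below only assembles and prints a command line; it does not run tt. The parameter names come from the examples above, and the choice of etcd2 as the state provider is just for illustration.

```python
import json
import shlex

# Failover parameters as plain Python data.
params = {"fencing_timeout": 10, "fencing_enabled": True}
provider_params = {"lock_delay": 14}

command = [
    "tt", "cartridge", "failover", "set", "stateful",
    "--state-provider", "etcd2",
    "--params", json.dumps(params),
    "--provider-params", json.dumps(provider_params),
]

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(command))
```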

failover setup

$ tt cartridge failover setup --file STRING

Set up failover with parameters described in a file. The failover configuration file defaults to failover.yml.

The failover.yml file might look as follows:

mode: stateful
state_provider: stateboard
stateboard_params:
    uri: localhost:4401
    password: passwd
failover_timeout: 15

failover status

$ tt cartridge failover status

Get the current failover status.

failover disable

$ tt cartridge failover disable

Disable failover.

Options

--name STRING: An application name. Defaults to the “package” value in the rockspec.

--file STRING: A path to the file containing failover settings. Defaults to failover.yml.

repair

$ tt cartridge repair COMMAND [REPAIR_OPTION ...]

tt cartridge repair repairs a running application. The following commands are available:

list-topology
remove-instance
set-advertise-uri
set-leader

repair list-topology

$ tt cartridge repair list-topology [REPAIR_OPTION ...]

Get a summary of the current cluster topology.

repair remove-instance

$ tt cartridge repair remove-instance UUID [REPAIR_OPTION ...]

Remove the instance with the specified UUID from the cluster. If the instance isn’t found, raise an error.

repair set-advertise-uri

$ tt cartridge repair set-advertise-uri INSTANCE-UUID NEW-URI [REPAIR_OPTION ...]

Change the instance’s advertise URI. Raise an error if the instance isn’t found or is expelled.

repair set-leader

$ tt cartridge repair set-leader REPLICASET-UUID INSTANCE-UUID [REPAIR_OPTION ...]

Set the instance as the leader of the replica set. Raise an error in the following cases:

There is no replica set or instance with that UUID.
The instance doesn’t belong to the replica set.
The instance has been disabled or expelled.

Options

The following options work with any repair subcommand:

--name: (Required) An application name.

--data-dir: The directory containing the instances’ working directories. Defaults to /var/lib/tarantool.

The following options work with any repair command, except list-topology:

--run-dir: The directory where PID and socket files are stored. Defaults to /var/run/tarantool.

--dry-run: Launch in dry-run mode: show changes but do not apply them.

--reload: Enable instance configuration to reload after the patch.

replicasets

$ tt cartridge replicasets COMMAND [COMMAND_OPTION ...]

tt cartridge replicasets manages an application’s replica sets. The following commands are available:

setup
save
list
join
list-roles
list-vshard-groups
add-roles
remove-roles
set-weight
set-failover-priority
bootstrap-vshard
expel

replicasets setup

$ tt cartridge replicasets setup [--file FILEPATH] [--bootstrap-vshard]

Set up replica sets using a file.

Options:

--file: A file with a replica set configuration. Defaults to replicasets.yml.
--bootstrap-vshard: Bootstrap vshard upon setup.

replicasets save

$ tt cartridge replicasets save [--file FILEPATH]

Save the current replica set configuration to a file.

Options:

--file: A file to save the configuration to. Defaults to replicasets.yml.

replicasets list

$ tt cartridge replicasets list [--replicaset STRING]

List the current cluster topology.

Options:

--replicaset STRING: A replica set name.

replicasets join

$ tt cartridge replicasets join INSTANCE_NAME ... [--replicaset STRING]

Join the instance to a cluster. If a replica set with the specified alias isn’t found in the cluster, it is created. Otherwise, instances are joined to an existing replica set.

Options:

--replicaset STRING: A replica set name.

replicasets list-roles

$ tt cartridge replicasets list-roles

List the available roles.

replicasets list-vshard-groups

$ tt cartridge replicasets list-vshard-groups

List the available vshard groups.

replicasets add-roles

$ tt cartridge replicasets add-roles ROLE_NAME ... [--replicaset STRING] [--vshard-group STRING]

Add roles to the replica set.

Options:

--replicaset STRING: A replica set name.
--vshard-group STRING: A vshard group for vshard-storage replica sets.

replicasets remove-roles

$ tt cartridge replicasets remove-roles ROLE_NAME ... [--replicaset STRING]

Remove roles from the replica set.

Options:

--replicaset STRING: A replica set name.

replicasets set-weight

$ tt cartridge replicasets set-weight WEIGHT [--replicaset STRING]

Specify replica set weight.

Options:

--replicaset STRING: A replica set name.

replicasets set-failover-priority

$ tt cartridge replicasets set-failover-priority INSTANCE_NAME ... [--replicaset STRING]

Configure replica set failover priority.

Options:

--replicaset STRING: A replica set name.

replicasets bootstrap-vshard

$ tt cartridge replicasets bootstrap-vshard

Bootstrap vshard.

replicasets expel

$ tt cartridge replicasets expel INSTANCE_NAME ...

Expel one or more instances from the cluster.

Printing the contents of .snap and .xlog files

$ tt cat FILE ... [OPTION ...]

tt cat prints the contents of snapshot (.snap) and WAL (.xlog) files to stdout. A single call of tt cat can print the contents of multiple files.

Options

--format FORMAT: Output format: yaml (default), json, or lua.

--from LSN: Show operations starting from the given LSN.

--to LSN: Show operations up to the given LSN. Default: 18446744073709551615.

--replica ID: Filter the output by replica ID. Can be passed more than once.

When calling tt cat with filters by LSN (--from and --to flags) and replica ID (--replica), remember that LSNs differ across replicas. Thus, if you pass more than one replica ID together with --from or --to, the result may not reflect the actual sequence of operations.

--space ID: Filter the output by space ID. Can be passed more than once.

--show-system: Show the contents of system spaces.

Examples

Output contents of 00000000000000000000.xlog WAL file in the YAML format:
```
$ tt cat 00000000000000000000.xlog
```
Output operations on spaces with space_id 512 and 513 from the 00000000000000000012.snap snapshot file in the JSON format:
```
$ tt cat 00000000000000000012.snap --space 512 --space 513 --format json
```
Output operations on all spaces, including system spaces, from the 00000000000000000000.xlog WAL file:
```
$ tt cat 00000000000000000000.xlog --show-system
```
Output operations with LSNs between 100 and 500 on replica 1 from the 00000000000000000000.xlog WAL file:
```
$ tt cat 00000000000000000000.xlog --from 100 --to 500 --replica 1
```
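
The caveat about combining LSN and replica filters can be illustrated with a small Python sketch. It works on hypothetical in-memory records rather than real .xlog contents, the function name filter_records is made up for the example, and both LSN bounds are treated as inclusive, which is an assumption rather than documented tt cat behavior.

```python
def filter_records(records, lsn_from=0, lsn_to=2**64 - 1, replicas=None):
    """Keep records with an LSN in [lsn_from, lsn_to] written by one of
    the given replicas (all replicas when `replicas` is None)."""
    return [
        rec for rec in records
        if lsn_from <= rec["lsn"] <= lsn_to
        and (replicas is None or rec["replica_id"] in replicas)
    ]


# Hypothetical records: each replica numbers its own operations.
records = [
    {"replica_id": 1, "lsn": 120, "op": "insert"},
    {"replica_id": 2, "lsn": 120, "op": "replace"},
    {"replica_id": 1, "lsn": 700, "op": "delete"},
]

# One replica: the LSN range selects a well-defined slice of its history.
print(filter_records(records, 100, 500, {1}))

# Two replicas: unrelated operations share LSN 120, so the LSN range
# alone says nothing about the order in which they happened.
print(filter_records(records, 100, 500, {1, 2}))
```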

Environment configuration

$ tt cfg COMMAND [OPTION ...]

tt cfg manages a tt environment configuration.

Commands

dump

Print a tt environment configuration.

Options:

-r, --raw: Print the raw contents of the tt.yaml configuration file.

Examples

Print the current tt environment configuration:

$ tt cfg dump

Checking an application file

$ tt check {FILEPATH | APPLICATION[:APP_INSTANCE]}

tt check checks the syntax correctness of Lua files within Tarantool applications or separate Lua scripts. The files must be stored inside the instances_enabled directory specified in the tt configuration file.

Examples

To check all Lua files in an application directory at once, specify the directory name:

$ tt check app

To check a single Lua file from an application directory, add the path to this file:

$ tt check app/router
# or
$ tt check app/router.lua

Note

The .lua extension can be omitted.

Cleaning instance files

$ tt clean APPLICATION[:APP_INSTANCE] [OPTION ...]

tt clean cleans stored files of Tarantool instances: logs, snapshots, and other files. To avoid accidental deletion of files, tt clean shows the files it is going to delete and asks for confirmation.

When called without arguments, tt clean cleans the files of all applications in the current environment.

Options

-f, --force: Clean files without confirmation.

Examples

Clean the files of all instances of the app application:
```
$ tt clean app
```
Clean the files of the master instance of the app application:
```
$ tt clean app:master
```

Managing cluster configurations

$ tt cluster COMMAND [COMMAND_OPTION ...]

tt cluster manages configurations of Tarantool applications. This command works both with local YAML files in application directories and with centralized configuration storages (etcd or Tarantool-based).

COMMAND is one of the following:

publish
show
replicaset
failover

publish

$ tt cluster publish {APPLICATION[:APP_INSTANCE] | CONFIG_URI} [FILE] [OPTION ...]

tt cluster publish publishes a cluster configuration using an arbitrary YAML file as a source.

Publishing local configurations

tt cluster publish can modify local cluster configurations stored in config.yaml files inside application directories.

To write a configuration to a local config.yaml, run tt cluster publish with two arguments:

the application name.
the path to a YAML file from which the configuration should be taken.

$ tt cluster publish myapp source.yaml

Publishing configurations in centralized storages

tt cluster publish can modify centralized cluster configurations in storages of both supported types: etcd or a Tarantool-based configuration storage.

To publish a configuration from a file to a centralized configuration storage, run tt cluster publish with a URI of this storage’s instance as the target. For example, the command below publishes a configuration from source.yaml to a local etcd instance running on the default port 2379:

$ tt cluster publish "http://localhost:2379/myapp" source.yaml

A URI must include a prefix that is unique for the application. It can also include credentials and other connection parameters. For a detailed description, see URI format.

Publishing configurations of specific instances

In addition to whole cluster configurations, tt cluster publish can manage configurations of specific instances within applications: rewrite configurations of existing instances and add new instance configurations.

In this case, it operates with YAML fragments that describe a single instance configuration section. For example, the following YAML file can be a source when publishing an instance configuration:

# instance_source.yaml
iproto:
  listen:
  - uri: 127.0.0.1:3311

To send an instance configuration to a local config.yaml, run tt cluster publish with the application:instance pair as the target argument:

$ tt cluster publish myapp:instance-002 instance_source.yaml

To send an instance configuration to a centralized configuration storage, specify the instance name in the name argument of the storage URI:

$ tt cluster publish "http://localhost:2379/myapp?name=instance-002" instance_source.yaml

If the instance already exists, this call overwrites its configuration with the one from the file.

To add a new instance configuration from a YAML fragment, specify the name to assign to the new instance and its location in the cluster topology – replica set and group – in the --replicaset and --group options.

Note

The --group option can be omitted if the configuration contains only one group.

To add a new instance instance-003 to the replicaset-001 replica set:

$ tt cluster publish "http://localhost:2379/myapp?name=instance-003" instance_source.yaml --replicaset replicaset-001
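
The effect of adding a new instance from a YAML fragment can be pictured as a merge into the cluster configuration tree. The Python sketch below models this on plain dictionaries under the assumption that the configuration follows the groups, replicasets, instances nesting; the function add_instance and the group name group-001 are hypothetical and this is not tt's implementation.

```python
import copy


def add_instance(config, name, fragment, replicaset, group=None):
    """Return a copy of `config` with `fragment` stored as instance `name`."""
    result = copy.deepcopy(config)
    groups = result["groups"]
    if group is None:
        # The group may be omitted only when there is exactly one.
        if len(groups) != 1:
            raise ValueError("several groups found; a group name is required")
        group = next(iter(groups))
    instances = groups[group]["replicasets"][replicaset].setdefault("instances", {})
    # An existing instance with the same name is overwritten.
    instances[name] = copy.deepcopy(fragment)
    return result


config = {
    "groups": {"group-001": {"replicasets": {"replicaset-001": {"instances": {
        "instance-001": {},
        "instance-002": {},
    }}}}}
}
fragment = {"iproto": {"listen": [{"uri": "127.0.0.1:3311"}]}}

updated = add_instance(config, "instance-003", fragment, "replicaset-001")
print(sorted(updated["groups"]["group-001"]["replicasets"]["replicaset-001"]["instances"]))
```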

Configuration validation

tt cluster publish validates configurations against the Tarantool configuration schema and aborts in case of an error. To skip the validation, add the --force option:

$ tt cluster publish myapp source.yaml --force

Publishing configurations with integrity check

Enterprise Edition

The integrity check functionality is supported by the Enterprise Edition only.

When called with the --with-integrity-check option, tt cluster publish generates a checksum of the configurations it publishes. It signs the checksum using the private key passed as the option argument, and writes it into the configuration store.

$ tt cluster publish "http://localhost:2379/myapp" source.yaml --with-integrity-check private.pem

If an application configuration is published this way, it can be checked for integrity using the --integrity-check global option.

$ tt --integrity-check public.pem cluster show myapp
$ tt --integrity-check public.pem start myapp

Learn more about integrity checks upon application startup and in runtime in the tt start reference.

To ensure the configuration integrity when updating it, call tt cluster publish with two options:

--integrity-check PUBLIC_KEY global option checks that the configuration wasn’t changed since it was published
--with-integrity-check PRIVATE_KEY generates new hash and signature for future integrity checks of the updated configuration.

$ tt --integrity-check public.pem cluster publish \
     --with-integrity-check private.pem \
     "http://localhost:2379/myapp" source.yaml

show

$ tt cluster show {APPLICATION[:APP_INSTANCE] | CONFIG_URI} [OPTION ...]

tt cluster show displays a cluster configuration.

Displaying local configurations

tt cluster show can read local cluster configurations stored in config.yaml files inside application directories.

To print a local configuration from an application’s config.yaml, specify the application name as an argument:

$ tt cluster show myapp

Displaying configurations from centralized storages

tt cluster show can display centralized cluster configurations from configuration storages of both supported types: etcd or a Tarantool-based configuration storage.

To print a cluster configuration from a centralized storage, run tt cluster show with a storage URI including the prefix identifying the application. For example, to print myapp’s configuration from a local etcd storage:

$ tt cluster show "http://localhost:2379/myapp"

Displaying configurations of specific instances

In addition to whole cluster configurations, tt cluster show can display configurations of specific instances within applications. In this case, it prints YAML fragments that describe a single instance configuration section.

To print an instance configuration from a local config.yaml, use the application:instance argument:

$ tt cluster show myapp:instance-002

To print an instance configuration from a centralized configuration storage, specify the instance name in the name argument of the URI:

$ tt cluster show "http://localhost:2379/myapp?name=instance-002"

Configuration validation

To validate configurations when printing them with tt cluster show, enable the validation by adding the --validate option:

$ tt cluster show "http://localhost:2379/myapp" --validate

replicaset

$ tt cluster replicaset SUBCOMMAND {APPLICATION[:APP_INSTANCE] | CONFIG_URI} [OPTION ...]

tt cluster replicaset manages instances in a replica set. It supports the following subcommands:

promote
demote
expel
roles

Important

tt cluster replicaset works only with centralized cluster configurations. To manage replica sets in clusters with local YAML configurations, use tt replicaset.

promote

$ tt cluster replicaset promote CONFIG_URI INSTANCE_NAME [OPTION ...]

tt cluster replicaset promote promotes the specified instance, making it a leader of its replica set. This command works on Tarantool clusters with centralized configuration and with failover modes off and manual. It updates the centralized configuration according to the specified arguments and reloads it:

off failover mode: the command sets database.mode to rw on the specified instance.

Important

If failover is off, the command doesn’t consider the modes of other replica set members, so there can be any number of read-write instances in one replica set.

manual failover mode: the command updates the leader option of the replica set configuration. Other instances of this replica set become read-only.

Example:

$ tt cluster replicaset promote "http://localhost:2379/myapp" storage-001-a
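
The difference between the two failover modes can be sketched in Python as a patch applied to a configuration tree. This is an illustrative model on plain dictionaries, assuming the groups, replicasets, instances nesting; the function promote and the group name storages are hypothetical and this is not how tt implements the command.

```python
import copy


def promote(config, failover, group, replicaset, instance):
    """Return a patched copy of `config` for the given failover mode."""
    result = copy.deepcopy(config)
    rs = result["groups"][group]["replicasets"][replicaset]
    if failover == "off":
        # Only the target instance changes; other members keep their modes.
        rs["instances"][instance].setdefault("database", {})["mode"] = "rw"
    elif failover == "manual":
        # The replica set gets a single leader; the rest become read-only.
        rs["leader"] = instance
    else:
        raise ValueError(f"promote does not apply to failover mode {failover!r}")
    return result


config = {
    "groups": {"storages": {"replicasets": {"storage-001": {"instances": {
        "storage-001-a": {},
        "storage-001-b": {"database": {"mode": "rw"}},
    }}}}}
}

off = promote(config, "off", "storages", "storage-001", "storage-001-a")
manual = promote(config, "manual", "storages", "storage-001", "storage-001-a")
print(off["groups"]["storages"]["replicasets"]["storage-001"]["instances"])
print(manual["groups"]["storages"]["replicasets"]["storage-001"]["leader"])
```

In the off case both instances end up with database.mode set to rw, which mirrors the warning above about several read-write instances in one replica set.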

demote

$ tt cluster replicaset demote CONFIG_URI INSTANCE_NAME [OPTION ...]

tt cluster replicaset demote demotes an instance in a replica set. This command works on Tarantool clusters with centralized configuration and with failover mode off.

Note

In clusters with manual failover mode, you can demote a read-write instance by promoting a read-only instance from the same replica set with tt cluster replicaset promote.

The command sets the instance’s database.mode to ro and reloads the configuration.

Important

If failover is off, the command doesn’t consider the modes of other replica set members, so there can be any number of read-write instances in one replica set.

expel

$ tt cluster replicaset expel CONFIG_URI INSTANCE_NAME [OPTION ...]

tt cluster replicaset expel expels an instance from the cluster. Example:

$ tt cluster replicaset expel "http://localhost:2379" storage-b-002

roles

$ tt cluster replicaset roles [add|remove] CONFIG_URI ROLE_NAME [OPTION ...]

tt cluster replicaset roles manages application roles in the configuration scope specified in the command options. It has two subcommands:

add adds a role
remove removes a role

Use the --global, --group, --replicaset, --instance options to specify the configuration scope to add or remove roles. For example, to add a role to all instances in a replica set:

$ tt cluster replicaset roles add "http://localhost:2379" roles.my-role --replicaset storage-a

To remove a role defined in the global configuration scope:

$ tt cluster replicaset roles remove "http://localhost:2379" roles.my-role --global

Implementation details

The changes that tt cluster replicaset makes to the configuration storage occur transactionally. Each call creates a new revision. In case of a revision mismatch, an error is raised.
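
Revision-checked updates of this kind follow a general optimistic concurrency pattern: read the value together with its revision, then write only if the revision has not changed. The Python sketch below shows the pattern on a hypothetical in-memory store; the class and exception names are made up, and it does not reproduce the etcd or Tarantool storage API.

```python
class RevisionMismatch(Exception):
    """Raised when the stored revision differs from the expected one."""


class ConfigStorage:
    """A toy single-key store that tracks a revision number."""

    def __init__(self, value):
        self.value = value
        self.revision = 1

    def read(self):
        return self.value, self.revision

    def write(self, value, expected_revision):
        # Refuse the write if someone else changed the value in between.
        if expected_revision != self.revision:
            raise RevisionMismatch(
                f"expected revision {expected_revision}, found {self.revision}"
            )
        self.value = value
        self.revision += 1
        return self.revision


storage = ConfigStorage({"leader": "storage-001-a"})
_, revision = storage.read()
storage.write({"leader": "storage-001-b"}, revision)  # succeeds, new revision
try:
    storage.write({"leader": "storage-001-a"}, revision)  # stale revision
except RevisionMismatch as err:
    print("rejected:", err)
```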

If the cluster configuration is distributed over multiple keys in the configuration storage (for example, in two paths /myapp/config/k1 and /myapp/config/k2), the affected instance configuration can be present in more than one of them. If it is found under several different keys, the command prompts the user to choose a key for patching. You can skip the selection by adding the -f/--force option:

$ tt cluster replicaset promote "http://localhost:2379/myapp" storage-001-a --force

In this case, the command selects the key for patching automatically. A key’s priority is determined by the detail level of the instance or replica set configuration stored under this key. For example, when failover is off, a key with instance.database options takes precedence over a key with only the instance field. In case of equal priority, the first key in lexicographical order is patched.
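
The automatic selection rule can be sketched in Python as a sort by two criteria. The numeric detail levels below are hypothetical stand-ins for however the detail level is actually measured; only the ordering logic (more detailed first, then lexicographical order) follows the description above.

```python
def pick_key(candidates):
    """Pick a key from a mapping of storage key -> detail level.

    A higher detail level wins; ties are broken by taking the first
    key in lexicographical order.
    """
    return min(candidates, key=lambda key: (-candidates[key], key))


# k2 stores a more detailed instance configuration, so it is chosen.
print(pick_key({"/myapp/config/k1": 1, "/myapp/config/k2": 2}))

# Equal detail: the lexicographically first key is chosen.
print(pick_key({"/myapp/config/k2": 1, "/myapp/config/k1": 1}))
```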

failover

$ tt cluster failover SUBCOMMAND [OPTION ...]

tt cluster failover manages a supervised failover in Tarantool clusters. It supports the following subcommands:

switch
switch-status

Important

tt cluster failover works only with centralized cluster configurations stored in etcd.

switch

$ tt cluster failover switch CONFIG_URI INSTANCE_NAME [OPTION ...]

tt cluster failover switch appoints the specified instance to be a master. This command accepts the following arguments and options:

CONFIG_URI: A URI of the cluster configuration storage.
INSTANCE_NAME: An instance name.
[OPTION ...]: Options to pass to the command.

In the example below, tt cluster failover switch appoints storage-a-002 to be a master:

$ tt cluster failover switch http://localhost:2379/myapp storage-a-002
To check the switching status, run:
tt cluster failover switch-status http://localhost:2379/myapp b1e938dd-2867-46ab-acc4-3232c2ef7ffe

Note that the command output includes an identifier of the task responsible for switching a master. You can use this identifier to see the status of switching a master instance using tt cluster failover switch-status.

switch-status

$ tt cluster failover switch-status CONFIG_URI TASK_ID

tt cluster failover switch-status shows the status of switching a master instance. This command accepts the following arguments:

CONFIG_URI: A URI of the cluster configuration storage.
TASK_ID: An identifier of the task used to switch a master instance. You can find the task identifier in the tt cluster failover switch command output.

Example:

$ tt cluster failover switch-status http://localhost:2379/myapp b1e938dd-2867-46ab-acc4-3232c2ef7ffe

Authentication

There are three ways to pass the credentials for connecting to the centralized configuration storage. They all apply to both etcd and Tarantool-based storages. The following list shows these ways ordered by precedence, from highest to lowest:

Credentials specified in the storage URI: https://username:password@host:port/prefix:
```
$ tt cluster show 'http://myuser:p4$$w0rD@localhost:2379/myapp'
```

tt cluster options -u/--username and -p/--password:

$ tt cluster show "http://localhost:2379/myapp" -u myuser -p 'p4$$w0rD'

Environment variables TT_CLI_ETCD_USERNAME and TT_CLI_ETCD_PASSWORD:

$ export TT_CLI_ETCD_USERNAME=myuser
$ export TT_CLI_ETCD_PASSWORD='p4$$w0rD'
$ tt cluster show "http://localhost:2379/myapp"

If connection encryption is enabled on the configuration storage, pass the required SSL parameters in the URI arguments.
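
The precedence of the three credential sources can be modeled with a short Python sketch. It treats a username and password as a pair taken from a single source, which is a simplifying assumption; the function resolve_credentials is hypothetical and does not reproduce tt's own logic.

```python
import os
from urllib.parse import urlsplit


def resolve_credentials(uri, username=None, password=None, env=None):
    """Return (username, password) from the highest-precedence source."""
    env = os.environ if env is None else env
    parts = urlsplit(uri)
    if parts.username:
        # 1. Credentials embedded in the storage URI.
        return parts.username, parts.password
    if username:
        # 2. The -u/--username and -p/--password options.
        return username, password
    # 3. Environment variables.
    return env.get("TT_CLI_ETCD_USERNAME"), env.get("TT_CLI_ETCD_PASSWORD")


print(resolve_credentials(
    "http://myuser:secret@localhost:2379/myapp",
    username="other", password="ignored", env={},
))
```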

URI format

A URI of the cluster configuration storage has the following format:

http(s)://[username:password@]host:port[/prefix][?arguments]

username and password define credentials for connecting to the configuration storage.
prefix is a base path identifying a specific application in the storage.
arguments defines connection parameters. The following arguments are available:
- name – a name of an instance in the cluster configuration.
- key – a target configuration key in the specified prefix.
- timeout – a request timeout in seconds. Default: 3.0.
- ssl_key_file – a path to a private SSL key file.
- ssl_cert_file – a path to an SSL certificate file.
- ssl_ca_file – a path to a trusted certificate authorities (CA) file.
- ssl_ca_path – a path to a trusted certificate authorities (CA) directory.
- ssl_ciphers – a colon-separated (:) list of SSL cipher suites the connection can use (for Tarantool-based storage only).
- verify_host – verify the certificate’s name against the host. Default: true.
- verify_peer – verify the peer’s SSL certificate. Default: true.
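
To see how the parts of such a URI map onto the format above, the following Python sketch splits one with the standard urllib.parse module. The function parse_storage_uri and the sample URI are illustrative only; tt performs its own parsing, and argument values are left as strings here.

```python
from urllib.parse import parse_qs, urlsplit


def parse_storage_uri(uri):
    """Split a configuration storage URI into its documented parts."""
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Keep the last value if an argument is repeated.
    arguments = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "prefix": parts.path,
        "arguments": arguments,
    }


print(parse_storage_uri(
    "https://myuser:secret@localhost:2379/myapp?name=instance-002&timeout=5"
))
```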

Options

-u, --username STRING¶

A username for connecting to the configuration storage.

See also: Authentication.

-p, --password STRING¶

A password for connecting to the configuration storage.

See also: Authentication.

--force¶

Applicable to: publish, replicaset

publish: skip validation when publishing. Default: false (validation is enabled).
replicaset: skip key selection for patching. Learn more in the tt cluster replicaset details.

-G, --global

Applicable to: replicaset roles

Apply the operation to the global configuration scope, that is, to all instances.

-g, --group

Applicable to: publish, replicaset roles

A name of the configuration group to which the operation applies.

-i, --instance

Applicable to: replicaset roles

A name of the instance to which the operation applies.

-r, --replicaset

Applicable to: publish, replicaset roles

A name of the replica set to which the operation applies.

-t, --timeout UINT

Applicable to: failover

A timeout (in seconds) for executing a command. Default: 30.

--validate

Applicable to: show

Validate the printed configuration. Default: false (validation is disabled).

-w, --wait

Applicable to: failover

Wait while the command completes the execution. Default: false (don’t wait).

--with-integrity-check STRING

Enterprise Edition

This option is supported by the Enterprise Edition only.

Applicable to: publish, replicaset

Generate hashes and signatures for integrity checks.

Generating completion for tt

$ tt completion SHELL

tt completion generates tab-based completion for tt commands in the specified shell: bash or zsh.

Examples

Generate tt completion for the current bash terminal:

$ . <(tt completion bash)

Note

You can add an execution of the completion script to a user’s .bashrc file to make the completion work for this user in all their terminals.

Connecting to a Tarantool instance

$ tt connect {URI|INSTANCE_NAME} [OPTION ...]

tt connect connects to a Tarantool instance by its URI or instance name specified in the current environment.

Options

-u USERNAME, --username USERNAME: A Tarantool user for connecting to the instance.

-p PASSWORD, --password PASSWORD: The user’s password.

-f FILEPATH, --file FILEPATH

Connect and evaluate the script from a file.

Pass - as FILEPATH to read the script from stdin.

-i, --interactive: Enter the interactive mode after evaluating the script passed in -f/--file.

-l LANGUAGE, --language LANGUAGE: The input language of the tt interactive console: lua (default) or sql.

-x FORMAT, --outputformat FORMAT: The output format of the tt interactive console: yaml (default), lua, table, ttable.

--sslcertfile FILEPATH: The path to an SSL certificate file for encrypted connections.

--sslkeyfile FILEPATH: The path to a private SSL key file for encrypted connections.

--sslcafile FILEPATH: The path to a trusted certificate authorities (CA) file for encrypted connections.

--sslciphers STRING: The list of SSL cipher suites used for encrypted connections, separated by colons (:).

Details

To connect to an instance, tt typically needs its URI – the host name or IP address and the port.

You can also connect to instances in the same tt environment (that is, those that use the same configuration file and Tarantool installation) by their instance names.

Authentication

When connecting to an instance by its URI, tt connect establishes a remote connection for which authentication is required. Use one of the following ways to pass the username and the password:

The -u (--username) and -p (--password) options:

$ tt connect 192.168.10.10:3301 -u myuser -p p4$$w0rD

The connection string:

$ tt connect myuser:p4$$w0rD@192.168.10.10:3301

Environment variables TT_CLI_USERNAME and TT_CLI_PASSWORD:

$ export TT_CLI_USERNAME=myuser
$ export TT_CLI_PASSWORD=p4$$w0rD
$ tt connect 192.168.10.10:3301

If no credentials are provided for a remote connection, tt connects as the guest user.

Note

Local connections (by instance name instead of the URI) don’t require authentication.

Encrypted connection

To connect to instances that use SSL encryption, provide the SSL certificate and SSL key files in the --sslcertfile and --sslkeyfile options. If necessary, add other SSL parameters – --sslcafile and --sslciphers.

Script evaluation

By default, tt connect opens an interactive tt console. Alternatively, you can open a connection to evaluate a Lua script from a file or stdin. To do this, pass the file path in the -f (--file) option or use -f - to take the script from stdin.

$ tt connect app -f test.lua

Examples

Connect to the app instance in the same environment:
```
$ tt connect app
```
Connect to the master instance of the app application in the same environment:
```
$ tt connect app:master
```
Connect to the 192.168.10.10 host on port 3301 with authentication:
```
$ tt connect 192.168.10.10:3301 -u myuser -p p4$$w0rD
```
Connect to the app instance and evaluate the code from the test.lua file:
```
$ tt connect app -f test.lua
```

Connect to the app instance and evaluate the code from stdin:

$ echo "function test() return 1 end" | tt connect app -f - # Create the test() function
$ echo "test()" | tt connect app -f -                       # Call this function

Manipulating Tarantool core dumps

$ tt coredump COMMAND [COMMAND_OPTION ...]

tt coredump provides commands for manipulating Tarantool core dumps.

To investigate Tarantool crashes, make sure that core dumps are enabled on the host. See the instructions on enabling core dumps on Unix systems.

COMMAND is one of the following:

pack
unpack
inspect

Important

tt coredump is not supported on macOS.

pack

$ tt coredump pack COREDUMP_FILE

Pack a Tarantool core dump and supporting data into a tar.gz archive. It includes:

the Tarantool executable
Tarantool version information
OS information
shared libraries
the GNU debugger with extensions.

Pack a tar.gz file with a Tarantool core dump and supporting data:

$ tt coredump pack name.core

unpack

$ tt coredump unpack ARCHIVE

Unpack a Tarantool core dump archive created with tt coredump pack into a new directory:

$ tt coredump unpack tarantool-core-dump.tar.gz

inspect

$ tt coredump inspect [ARCHIVE|DIRECTORY] [-s]

Inspect a Tarantool core dump with the GNU debugger (gdb). The command argument can be either an archive file produced with tt coredump pack or a directory where such an archive is extracted.

Inspect the core dump archive with gdb:

$ tt coredump inspect tarantool-core-dump.tar.gz

Inspect the unpacked core dump directory with gdb:

$ tt coredump inspect tarantool-core-dump

Options

-s

Applicable to: inspect

Specify the location of Tarantool sources.

Creating an application from a template

$ tt create TEMPLATE_NAME [OPTION ...]

tt create creates a new Tarantool application from a template.

Application templates speed up the development of Tarantool applications by defining their initial structure and content. A template can include application code, configuration, build scripts, and other resources.

tt comes with built-in templates for popular use cases. You can also create custom templates for specific purposes.

Built-in templates

There are the following built-in templates:

vshard_cluster: a sharded cluster application for Tarantool 3.0 or later.
single_instance: a single-instance application for Tarantool 3.0 or later.
cartridge: a Cartridge cluster application for Tarantool 2.x.

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later.

To create the app1 application in the current tt environment from the built-in vshard_cluster template:

$ tt create vshard_cluster --name app1 --dst /opt/tt/apps/

The command requests cluster topology parameters, such as the number of shards or routers, interactively during the execution.

To create the application in the /opt/tt/apps directory with default cluster topology and force rewrite the application directory if it already exists:

$ tt create vshard_cluster --name app1 -f --non-interactive --dst /opt/tt/apps/

Creating custom application templates

tt searches for custom templates in the directories specified in the templates section of its configuration file.

To create the application app1 from the simple_app custom template in the current directory:

$ tt create simple_app --name app1

Template structure

Application templates are directories with files.

The main file of a template is its manifest. It defines how the applications are instantiated from this template.

A template manifest is a YAML file named MANIFEST.yaml. It can contain the following sections:

description – the template description.
vars – template variables.
pre-hook and post-hook – paths to executables to run before and after the template instantiation.
include – a list of files to keep in the application directory after instantiation. If this section is omitted, the application will contain all template files and directories.

All sections are optional.

Example:

description: Template description
vars:
  - prompt: User name
    name: user_name
    default: admin
    re: ^\w+$

  - prompt: Retry count
    default: "3"
    name: retry_count
    re: ^\d+$
pre-hook: ./hooks/pre-gen.sh
post-hook: ./hooks/post-gen.sh
include:
  - init.lua
  - instances.yml

Files and directories of a template are copied to the application directory according to the include section of the manifest (or its absence).

Note

Don’t include the .rocks directory in application templates. To specify application dependencies, use the .rockspec files.

There is a special file type *.tt.template. The content of such files is adjusted for each application with the help of template variables. During the instantiation, the variables in these files are replaced with provided values and the *.tt.template extension is removed.

Variables

Template variables are replaced with the values provided upon instantiation.

All templates have the name variable. Its value is taken from the --name option.

To add other variables, define them in the vars section of the template manifest. A variable can have the following attributes:

prompt: a line of text inviting to enter the variable value in the interactive mode. Required.
name: the variable name. Required.
default: the default value. Optional.
re: a regular expression that the value must match. Optional.

Example:

vars:
  - prompt: Cluster cookie
    name: cluster_cookie
    default: cookie
    re: ^\w+$

Variables can be used in all file names and in the content of *.tt.template files.

Note

Variables don’t work in directory names.

To use a variable, enclose its name with a period in the beginning in double curly braces: {{.var_name}} (as in the Golang text templates syntax).

Examples:

init.lua.tt.template file:

local app_name = '{{.name}}'
local login = '{{.user_name}}'

A file name {{.user_name}}.txt
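
The substitution rules above can be approximated with a short Python sketch. tt uses the Go text template syntax; the regular expression below is a simplified stand-in that handles only plain {{.var_name}} placeholders, and the names substitute and instantiate_file are hypothetical.

```python
import re

# Matches placeholders such as {{.name}} (simplified; no Go template logic).
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")
TEMPLATE_SUFFIX = ".tt.template"


def substitute(text, variables):
    """Replace every {{.var}} placeholder; unknown names raise KeyError."""
    return PLACEHOLDER.sub(lambda match: variables[match.group(1)], text)


def instantiate_file(file_name, content, variables):
    """Return the (name, content) a template file gets in the application."""
    # Variables work in all file names.
    new_name = substitute(file_name, variables)
    # File content is processed only for *.tt.template files,
    # and the extension is removed afterwards.
    if new_name.endswith(TEMPLATE_SUFFIX):
        new_name = new_name[: -len(TEMPLATE_SUFFIX)]
        content = substitute(content, variables)
    return new_name, content
```

In this model, a file that lacks the .tt.template extension keeps its content untouched, even if it contains placeholders.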

Variables receive their values during the template instantiation. By default, tt create asks you to provide the values interactively. You can use the -s (or --non-interactive) option to disable the interactive input. In this case, the values are searched in the following order:

In the --var option. Pass a string of the var=value format after the --var option. You can pass multiple variables, each after a separate --var option:
```
$ tt create template app --var user_name=admin
```
In a file. Specify var=value pairs in a plain text file, each on a new line, and pass it as the value of the --vars-file option:
```
$ tt create template app --vars-file variables.txt
```
variables.txt can look like this:
```
user_name=admin
password=p4$$w0rd
version=2
```

If a variable isn’t initialized in any of these ways, the default value from the manifest is used.

You can combine different ways of passing variables in a single call of tt create.
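
The lookup order for the non-interactive mode can be modeled as follows. This is a sketch under assumptions, not tt's implementation: resolve_vars is a hypothetical name, manifest_vars mirrors the vars section of MANIFEST.yaml as a list of dictionaries, and values are validated against the optional re attribute.

```python
import re


def _parse_pairs(lines):
    """Turn var=value lines into a dictionary, skipping empty lines."""
    pairs = {}
    for line in lines:
        line = line.strip()
        if line and "=" in line:
            name, value = line.split("=", 1)
            pairs[name] = value
    return pairs


def resolve_vars(manifest_vars, cli_vars=(), vars_file_text=""):
    """Resolve values in the order: --var, --vars-file, manifest default."""
    from_cli = _parse_pairs(cli_vars)
    from_file = _parse_pairs(vars_file_text.splitlines())
    resolved = {}
    for var in manifest_vars:
        name = var["name"]
        if name in from_cli:
            value = from_cli[name]
        elif name in from_file:
            value = from_file[name]
        elif "default" in var:
            value = var["default"]
        else:
            raise ValueError(f"no value for variable {name!r}")
        pattern = var.get("re")
        if pattern and not re.search(pattern, value):
            raise ValueError(f"value {value!r} of {name!r} does not match {pattern!r}")
        resolved[name] = value
    return resolved
```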

Application directory

By default, the application appears in the directory named after the provided application name (--name value).

To change the application location, use the -d (--dst) option.

Options

-d PATH, --dst PATH: Path to the directory where the application will be created.

-f, --force: Force rewrite the application directory if it already exists.

--name NAME: Application name.

-s, --non-interactive: Non-interactive mode.

--var [VAR=VALUE ...]: Variable definition. Usage: --var var_name=value.

--vars-file FILEPATH: Path to the file with variable definitions.

Interacting with the CRUD module

Enterprise Edition

This command is supported by the Enterprise Edition only.

$ tt crud COMMAND [COMMAND_OPTION ...]

tt crud enables the interaction with a cluster using the CRUD module. COMMAND is one of the following:

export: export a cluster’s data to a file. Learn more at Exporting data.
import: import data from a file. Learn more at Importing data.

Downloading Tarantool Enterprise SDK

$ tt download VERSION [OPTION ...]

tt download downloads Tarantool Enterprise SDK from the customer zone.

The VERSION is a part of the SDK archive name between tarantool-enterprise-sdk- and the platform identifier. For example, to download tarantool-enterprise-sdk-gc64-3.0.0-0-gf58f7d82a-r23.linux.x86_64.tar.gz, run:

$ tt download gc64-3.0.0-0-gf58f7d82a-r23

tt automatically chooses the bundle for the current platform.
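
As an illustration of the naming rule above, the following Python sketch extracts VERSION from an SDK archive name. The function name sdk_version is hypothetical, and the pattern assumes that the platform identifier always has the .<os>.<arch>.tar.gz shape seen in the single example above.

```python
import re

# Assumed archive layout: tarantool-enterprise-sdk-<VERSION>.<os>.<arch>.tar.gz
ARCHIVE_RE = re.compile(
    r"^tarantool-enterprise-sdk-(?P<version>.+)"
    r"\.(?P<os>[a-z]+)\.(?P<arch>\w+)\.tar\.gz$"
)


def sdk_version(archive_name):
    """Return the VERSION argument that corresponds to an archive name."""
    match = ARCHIVE_RE.match(archive_name)
    if match is None:
        raise ValueError(f"unexpected archive name: {archive_name!r}")
    return match.group("version")
```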

Authentication

To download the Tarantool Enterprise SDK using tt download, you need to provide access credentials for the Tarantool customer zone. Use one of the following ways to pass the username and the password:

A text file specified in the ee.credential_path parameter of the tt environment configuration:
```
# tt.yaml
# ...
ee:
  credential_path: cred.txt
```
cred.txt should contain a username and a password on separate lines:
```
myuser@tarantool.io
p4$$w0rD
```

Environment variables TT_CLI_EE_USERNAME and TT_CLI_EE_PASSWORD:

$ export TT_CLI_EE_USERNAME=myuser@tarantool.io
$ export TT_CLI_EE_PASSWORD=p4$$w0rD
$ tt download gc64-3.0.0-0-gf58f7d82a-r23

Options

--dev: Download a development build.

--directory-prefix STRING: The downloaded SDK location. Default: . (current directory).

Adding external applications to environments

$ tt enable {APPLICATION|SCRIPT}

tt enable adds an external Tarantool application to the current environment by creating a symlink to it in the instances.enabled directory.

To add the application located in /home/tt-user/external_app to the current tt environment:

$ tt enable /home/tt-user/external_app

Once the application is added, you can work with it the same way as with applications created in this environment.

Exporting data

Enterprise Edition

This command is supported by the Enterprise Edition only.

$ tt [crud|tdg2] export URI SPACE:FILE ... [EXPORT_OPTION ...]

tt [crud|tdg2] export exports a space’s data to a file. Three export commands cover the following cases:

tt export exports data from a replica set using the box.space API.
tt crud export exports data from a sharded cluster through a router using the CRUD module.
tt tdg2 export exports data from a Tarantool Data Grid 2 cluster through its connector using TDG2 Repository API.

tt [crud|tdg2] export takes the following arguments:

URI: The URI of a router instance if crud is used. Otherwise, it should specify the URI of a storage.
FILE: The name of a file for storing exported data.
SPACE: The name of a space from which data is exported.

Note

Read access to the space is required to export its data.

Output format

tt export exports data in the following formats:

tt export and tt crud export: CSV
tt tdg2 export: JSON lines

Limitations

Exporting isn’t supported for the interval field type.

Exporting with default settings

The command below exports data of the customers space to the customers.csv file:

$ tt crud export localhost:3301 customers:customers.csv

If the customers space has five fields (id, bucket_id, firstname, lastname, and age), the file with exported data might look like this:

1,477,Andrew,Fuller,38
2,401,Michael,Suyama,46
3,2804,Robert,King,33
# ...

If a tuple contains a null value, for example, [1, 477, 'Andrew', null, 38], it is exported as an empty value:

1,477,Andrew,,38

Exporting headers

To export data with a space’s field names in the first row of the CSV file, use the --header option:

$ tt crud export localhost:3301 customers:customers.csv  \
                 --header

In this case, field values start from the second row, for example:

id,bucket_id,firstname,lastname,age
1,477,Andrew,Fuller,38
2,401,Michael,Suyama,46
3,2804,Robert,King,33
# ...
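
The CSV layout described above, including empty values for null fields and the optional header row, can be reproduced with Python's standard csv module. This is only a model of the output format; tuples_to_csv is a hypothetical helper and does not connect to Tarantool.

```python
import csv
import io


def tuples_to_csv(tuples, header=None):
    """Serialize tuples to CSV text; None (null) becomes an empty value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        # Field names go to the first row, as with the --header option.
        writer.writerow(header)
    # csv.writer writes None as an empty field.
    writer.writerows(tuples)
    return buffer.getvalue()
```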

Exporting compound data

In the CSV format, tt exports empty values by default for fields containing compound data such as arrays or maps. To export compound values in a specific format, use the --compound-value-format option. For example, the command below exports compound values to CSV serialized in JSON:

$ tt crud export localhost:3301 customers:customers.csv  \
                 --compound-value-format json
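
The effect of --compound-value-format can be sketched as a per-field encoding step. The helper name encode_field is hypothetical, and the compact JSON layout is an assumption; the exact serialization produced by tt may differ.

```python
import json


def encode_field(value, compound_value_format=None):
    """Encode one field value before it is written to a CSV row."""
    if isinstance(value, (list, tuple, dict)):
        if compound_value_format == "json":
            # Arrays and maps are serialized to a JSON string.
            return json.dumps(value, separators=(",", ":"))
        # Default behavior: compound values are exported as empty values.
        return ""
    # Scalar values are passed through unchanged.
    return value
```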

Exporting from Tarantool Data Grid 2

Note

In the TDG2 data model, a type represents a Tarantool space, and an object of a type represents a tuple in the type’s underlying space.

The command below exports data of the customers type from a TDG2 cluster to the customers.jsonl file:

$ tt tdg2 export localhost:3301 customers:customers.jsonl

If token authentication is enabled in TDG2, pass the application token in the --token option:

$ tt tdg2 export localhost:3301 customers:customers.jsonl \
                 --token=2fc136cf-8cae-4655-a431-7c318967263d

If the customers type has four fields (id, first_name, second_name, and age), the file with exported data might look like this:

{"age":30,"first_name":"Samantha","id":1,"second_name":"Carter"}
{"age":41,"first_name":"Fay","id":2,"second_name":"Rivers"}
{"age":74,"first_name":"Milo","id":4,"second_name":"Walters"}

null field values are skipped:

{"age":13,"first_name":"Zachariah","id":3}

Object fields that contain maps with non-string keys are converted to maps with string keys.
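
The two rules above (null fields are skipped, non-string map keys become strings) can be modeled with Python's json module. The helper to_json_line is hypothetical; sorting the keys is an assumption made to match the sample output above.

```python
import json


def to_json_line(obj):
    """Encode one object as a JSON line, skipping null field values."""
    present = {key: value for key, value in obj.items() if value is not None}
    # json.dumps converts non-string map keys (for example, integers)
    # to strings, which mirrors the conversion described above.
    return json.dumps(present, sort_keys=True, separators=(",", ":"))
```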

TDG2 limits the number of objects transferred from each storage during a query execution (the hard-limits.returned TDG2 configuration parameter). If the export batch size (the --batch-size option) is greater than this limit, a single storage may be asked for more than hard-limits.returned objects, in which case the export fails. To make sure that hard-limits.returned is never exceeded during an export operation, set the export batch size to a value less than or equal to this limit.

For example, if hard-limits.returned is set to 1000 objects in your TDG2 cluster:

# tdg2 config.yaml
# ...
hard-limits.returned: 1000

Set the tt tdg2 export batch size to a value less than or equal to 1000:

$ tt tdg2 export localhost:3301 customers:customers.jsonl --batch-size=1000

Authentication

When connecting to the cluster with enabled authentication, specify access credentials in the --username and --password command options:

$ tt crud export localhost:3301 customers:customers.csv \
                 --username myuser --password p4$$w0rD

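Encrypted connection

To connect to instances that use SSL encryption, provide the SSL certificate and SSL key files in the --sslcertfile and --sslkeyfile options: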

$ tt crud export localhost:3301 customers:customers.csv \
                 --username myuser --password p4$$w0rD   \
                 --auth pap-sha256 --sslcertfile certs/server.crt \
                 --sslkeyfile certs/server.key

For connections that use SSL but don’t require additional parameters, add the --use-ssl option:

$ tt crud export localhost:3301 customers:customers.csv \
                 --username myuser --password p4$$w0rD   \
                 --use-ssl

Options

--auth STRING

Applicable to: tt crud export, tt tdg2 export

Authentication type: chap-sha1, pap-sha256, or auto.

--batch-queue-size INT

The maximum number of tuple batches in the queue between the fetch and write threads (the default is 32).

tt exports data using two threads:

A fetch thread makes requests and receives data from a Tarantool instance.
A write thread encodes received data and writes it to the output.

The fetch thread uses a queue to pass received tuple batches to the write thread. If a queue is full, the fetch thread waits until the write thread takes a batch from the queue.
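
The two-thread scheme above is a classic bounded producer-consumer pipeline. The Python sketch below illustrates the idea only; it is not tt's code, and export, fetch_batches, and write_batch are hypothetical names.

```python
import queue
import threading


def export(fetch_batches, write_batch, batch_queue_size=32):
    """Pass batches from a fetch thread to a write thread via a bounded queue."""
    batches = queue.Queue(maxsize=batch_queue_size)
    done = object()  # sentinel that tells the writer to stop

    def fetch():
        try:
            for batch in fetch_batches():
                # put() blocks while the queue is full, so the fetch thread
                # waits until the write thread takes a batch.
                batches.put(batch)
        finally:
            batches.put(done)

    def write():
        while True:
            batch = batches.get()
            if batch is done:
                break
            write_batch(batch)

    threads = [threading.Thread(target=fetch), threading.Thread(target=write)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

A smaller queue bounds the memory held by fetched but unwritten batches, while a larger one lets the fetch thread run further ahead of a slow writer.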

--batch-size INT

The number of tuples to transfer per request. The default is:

10000 for tt export and tt crud export.

100 for tt tdg2 export.

Important

When using tt tdg2 export, make sure that the batch size does not exceed the hard-limits.returned TDG2 parameter value set on the cluster.

--compound-value-format STRING

Applicable to: tt export, tt crud export

A format used to export compound values like arrays or maps. By default, tt exports empty values for fields containing such values.

Supported formats: json.

--sslcertfile STRING

Applicable to: tt crud export, tt tdg2 export

The path to an SSL certificate file for encrypted connections.

See also Encrypted connection.

--sslciphersfile STRING

Applicable to: tt crud export, tt tdg2 export

The list of SSL cipher suites used for encrypted connections, separated by colons (:).

See also Encrypted connection.

--sslkeyfile STRING

Applicable to: tt crud export, tt tdg2 export

The path to a private SSL key file for encrypted connections.

See also Encrypted connection.

--sslpassword STRING

Applicable to: tt crud export, tt tdg2 export

The password for the SSL key file for encrypted connections.

See also Encrypted connection.

--sslpasswordfile STRING

Applicable to: tt crud export, tt tdg2 export

A file with a list of passwords to the SSL key file for encrypted connections.

See also Encrypted connection.

--username STRING: A username for connecting to the instance.

Displaying help for tt and its commands

$ tt help [COMMAND]

tt help displays help:

for the tt utility when called without a COMMAND.
for a specified tt command.

Examples

Display tt help:
```
$ tt help
```
Display help for the start command:
```
$ tt help start
```

Importing data

Enterprise Edition

This command is supported by the Enterprise Edition only.

$ tt [crud|tdg2] import URI FILE:SPACE [IMPORT_OPTION ...]
# or
$ tt [crud|tdg2] import URI :SPACE < FILE [IMPORT_OPTION ...]

tt [crud|tdg2] import imports data from a file to a space. Three import commands cover the following cases:

tt import imports data into a replica set through its master instance using the box.space API.
tt crud import imports data into a sharded cluster through a router using the CRUD module.
tt tdg2 import imports data into a Tarantool Data Grid 2 cluster through its router using the repository.put function of the TDG2 Repository API.

tt [crud|tdg2] import takes the following arguments:

URI: The URI of a router instance if crud is used. Otherwise, it should specify the URI of a storage.
FILE: The name of a file containing data to be imported.
SPACE: The name of a space to which data is imported.

Note

Write access to the space and execute access to universe are required to import data.

Input file format

tt import imports data from the following formats:

tt import and tt crud import: CSV
tt tdg2 import: JSON lines

Limitations

Importing isn’t supported for the interval field type.

Matching of input and space fields

Automatic matching

Suppose that you have the customers.csv file with a header containing field names in the first row:

id,firstname,lastname,age
1,Andrew,Fuller,38
2,Michael,Suyama,46
3,Robert,King,33
# ...

If the target customers space has fields with the same names, you can import data using the --header and --match options specified as follows:

$ tt crud import localhost:3301 customers.csv:customers \
                 --header \
                 --match=header

In this case, fields in the input file and the target space are matched automatically. You can also match fields manually if field names in the input file and the target space differ. Note that if you’re importing data into a cluster, you don’t need to specify the bucket_id field. The CRUD module generates bucket_id values automatically.

Manual matching

The --match option enables importing data by matching field names in the input file and the target space manually. Suppose that you have the following customers.csv file with four fields:

customer_id,name,surname,customer_age
1,Andrew,Fuller,38
2,Michael,Suyama,46
3,Robert,King,33
# ...

If the target customers space has the id, firstname, lastname, and age fields, you can configure mapping as follows:

$ tt crud import localhost:3301 customers.csv:customers \
                 --header \
                 --match "id=customer_id;firstname=name;lastname=surname;age=customer_age"

Similarly, you can configure mapping using numeric field positions in the input file:

$ tt crud import localhost:3301 customers.csv:customers \
                 --header \
                 --match "id=1;firstname=2;lastname=3;age=4"

The following rules apply when some fields are missing in the input data or in the target space:

If a space has fields that are not specified in input data, tt [crud] import tries to insert null values.
If input data contains fields missing in a target space, these fields are ignored.
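
The matching rules above can be modeled in a few lines of Python. This is a sketch under assumptions: parse_match and map_row are hypothetical names, numeric positions are treated as 1-based (as the id=1;firstname=2 example suggests), and values stay strings because type conversion is out of scope here.

```python
def parse_match(match):
    """Parse a --match value such as "id=customer_id;firstname=name"."""
    mapping = {}
    for pair in match.split(";"):
        if pair:
            space_field, source = pair.split("=", 1)
            mapping[space_field] = source
    return mapping


def map_row(space_fields, header, row, mapping):
    """Build a tuple for the target space from one input row."""
    by_name = dict(zip(header, row))
    result = []
    for field in space_fields:
        source = mapping.get(field)
        if source is None:
            # The space field is not specified in the input: insert null.
            result.append(None)
        elif source.isdigit():
            # Numeric source: a 1-based position in the input row (assumed).
            position = int(source)
            result.append(row[position - 1] if 0 < position <= len(row) else None)
        else:
            result.append(by_name.get(source))
    # Input fields that are not mapped to a space field are ignored.
    return result
```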

Importing bucket_id into sharded clusters

When importing data into a CRUD-enabled sharded cluster, tt crud import ignores the bucket_id field values from the input file. This allows CRUD to automatically manage data distribution in the cluster by generating new bucket_id for tuples during import.

If you need to preserve the original bucket_id values, use the --keep-bucket-id option:

$ tt crud import localhost:3301 customers.csv:customers \
                 --keep-bucket-id \
                 --header \
                 --match=header

Handling duplicate primary key errors

The --on-exist option enables you to control data import when a duplicate primary key error occurs. In the example below, values already existing in the space are replaced with new ones:

$ tt crud import localhost:3301 customers.csv:customers \
                 --on-exist replace

Handling parsing errors

To skip rows whose data cannot be parsed correctly, use the --on-error option as follows:

$ tt crud import localhost:3301 customers.csv:customers \
                 --on-error skip

Importing into Tarantool Data Grid 2

Note

In the TDG2 data model, a type represents a Tarantool space, and an object of a type represents a tuple in the type’s underlying space.

The command below imports objects of the customers type into a TDG2 cluster. The objects are described in the customers.jsonl file.

$ tt tdg2 import localhost:3301 customers.jsonl:customers

If token authentication is enabled in TDG2, pass the application token in the --token option:

$ tt tdg2 import localhost:3301 customers.jsonl:customers \
                 --token=2fc136cf-8cae-4655-a431-7c318967263d

The input file can look like this:

{"age":30,"first_name":"Samantha","id":1,"second_name":"Carter"}
{"age":41,"first_name":"Fay","id":2,"second_name":"Rivers"}
{"age":74,"first_name":"Milo","id":4,"second_name":"Walters"}

Note

Since JSON describes objects in maps with string keys, there is no way to import a field value that is a map with a non-string key.

In case of an error during TDG2 import, tt tdg2 import rolls back the changes made within the current batch on the storage where the error has happened (per-storage rollback) and reports an error. On other storages, objects from the same batch can be successfully imported. So, the rollback process of tt tdg2 import is the same as the one of tt crud import with the --rollback-on-error option.

Since object batches can be imported partially (per-storage rollback), the absence of error matching complicates the debugging in case of errors. To minimize this effect, the default batch size (--batch-size) for tt tdg2 import is 1. This makes the debugging straightforward: you always know which object caused the error. On the other hand, this decreases the performance in comparison to import in larger batches.

If you increase the batch size, tt informs you about the possible issues and asks for an explicit confirmation to proceed. To automatically confirm a batch import operation, add the --force option:

$ tt tdg2 import localhost:3301 customers.jsonl:customers \
                 --batch-size=100 \
                 --force

Authentication

When connecting to the cluster with enabled authentication, specify access credentials in the --username and --password command options:

$ tt crud import localhost:3301 customers.csv:customers \
                 --header --match=header \
                 --username myuser --password p4$$w0rD

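Encrypted connection

To connect to instances that use SSL encryption, provide the SSL certificate and SSL key files in the --sslcertfile and --sslkeyfile options: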

$ tt crud import localhost:3301 customers.csv:customers \
                 --header --match=header \
                 --username myuser --password p4$$w0rD   \
                 --auth pap-sha256 --sslcertfile certs/server.crt \
                 --sslkeyfile certs/server.key

For connections that use SSL but don’t require additional parameters, add the --use-ssl option:

$ tt crud import localhost:3301 customers.csv:customers \
                 --header --match=header \
                 --username myuser --password p4$$w0rD   \
                 --use-ssl

Options

--auth STRING

Applicable to: tt crud import, tt tdg2 import

Authentication type: chap-sha1, pap-sha256, or auto.

--batch-size INT

Applicable to: tt crud import, tt tdg2 import

The number of tuples to transfer per request. The default is:

100 for tt crud import.

1 for tt tdg2 import. See Importing into Tarantool Data Grid 2 for details.

--dec-sep STRING

Applicable to: tt import, tt crud import

The string of symbols that defines decimal separators for numeric data (the default is .,).

Note

Symbols specified in this option cannot intersect with --th-sep.

--delimiter STRING

Applicable to: tt import, tt crud import

A symbol that defines a field value delimiter. For CSV, the default delimiter is a comma (,). To use a tab character as a delimiter, set this value as tab:

$ tt crud import localhost:3301 customers.csv:customers \
                 --delimiter tab

Note

A delimiter cannot be \r, \n, or the Unicode replacement character (U+FFFD).

--error STRING

The name of a file containing rows that are not imported (the default is error).

--sslcertfile STRING

Applicable to: tt crud import, tt tdg2 import

The path to an SSL certificate file for encrypted connections.

See also Encrypted connection.

--sslciphersfile STRING

Applicable to: tt crud import, tt tdg2 import

The list of SSL cipher suites used for encrypted connections, separated by colons (:).

See also Encrypted connection.

--sslkeyfile STRING

Applicable to: tt crud import, tt tdg2 import

The path to a private SSL key file for encrypted connections.

See also Encrypted connection.

--sslpassword STRING

Applicable to: tt crud import, tt tdg2 import

The password for the SSL key file for encrypted connections.

See also Encrypted connection.

--sslpasswordfile STRING

Applicable to: tt crud import, tt tdg2 import

A file with a list of passwords to the SSL key file for encrypted connections.

See also Encrypted connection.

--username STRING: A username for connecting to the instance.

Creating a tt environment

$ tt init

tt init creates a tt environment in the current directory. This includes:

Setting up directories for working files: binaries, templates, and so on.
Creating a corresponding tt.yaml configuration file.

Details

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This command is added for backward compatibility with earlier versions.

tt init checks the existence of configuration files for Cartridge (cartridge.yml) or the tarantoolctl utility (.tarantoolctl) in the current directory. If such files are found, tt generates an environment that uses the same directories:

cartridge.yml – the directories specified in the file.
.tarantoolctl – the directories specified in the default_cfg table.

Note

init is the only tt command that invokes .tarantoolctl files. Thus, variables defined in this script will not be available in applications launched by a tt start call.

If there are no cartridge.yml or .tarantoolctl files in the current directory, tt init creates a default environment in it. This includes creating the following directories and files:

bin – the directory for storing binary files.
include – the directory for storing header files.
distfiles – the directory for storing installation files.
instances.enabled – the directory for storing running applications or symlinks.
modules – the directory for storing external modules.
tt.yaml – the configuration file.
templates – the directory for storing application templates.

Example

Create a tt environment in the current directory:

$ tt init

Installing Tarantool software

$ tt install PROGRAM_NAME [VERSION|COMMIT_HASH|PR_ID] [OPTION ...]

tt install installs the latest or an explicitly specified version of Tarantool or tt. The possible values of PROGRAM_NAME are:

tarantool: Install Tarantool Community Edition.
tarantool-dev: Install Tarantool from a local build directory.
tarantool-ee: Install Tarantool Enterprise Edition.
tt: Install the tt command-line utility.

Note

For tarantool-ee, account credentials are required. Specify them in a file (see the ee section of the configuration file) or provide them interactively.

Additionally, tt install can build the open-source programs tarantool and tt from a specific commit or a pull request in their GitHub repositories.

To uninstall a Tarantool or tt version, use tt uninstall.

Options

--dynamic¶

Applicable to: tarantool, tarantool-ee

Use dynamic linking for building Tarantool.

-f, --force¶: Skip dependency check before installation.

--local-repo¶: Install a program from the local repository, which is specified in the repo section of the tt configuration file.

--no-clean¶: Don’t delete temporary files.

--reinstall¶: Reinstall a previously installed program.

--use-docker¶

Applicable to: tarantool, tarantool-ee

Build Tarantool in an Ubuntu 18.04 Docker container.

Details

When called without an explicitly specified version, tt install installs the latest available version. If the version is specified in the incomplete format <MAJOR>.<MINOR>, the command installs the latest available patch version in the series. To check versions available for installation, use tt search.
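The version selection rule above can be sketched in Python as follows. This is a minimal illustration, not tt's implementation: the resolve_version helper and the available list are hypothetical, and versions are assumed to be plain MAJOR.MINOR.PATCH strings.

```python
def parse(version):
    """Turn '3.2.1' into a tuple of ints: (3, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def resolve_version(requested, available):
    """Pick a version following the rule described in the text.

    requested: None (latest), 'MAJOR.MINOR' (latest patch in the series),
               or a full 'MAJOR.MINOR.PATCH' version.
    available: hypothetical list of full version strings.
    """
    candidates = available
    if requested is not None:
        prefix = parse(requested)
        # Keep only versions whose leading components match the request.
        candidates = [v for v in available if parse(v)[:len(prefix)] == prefix]
    if not candidates:
        raise ValueError(f"no available version matches {requested!r}")
    # Compare numerically so that 2.11.x sorts after 2.2.x.
    return max(candidates, key=parse)


available = ["2.10.8", "2.11.1", "3.2.0", "3.2.1", "3.3.0"]
print(resolve_version(None, available))      # latest overall
print(resolve_version("3.2", available))     # latest patch in the 3.2 series
print(resolve_version("2.11.1", available))  # exact version
```

Comparing versions as integer tuples rather than strings matters here: as strings, "2.9.0" would sort after "2.11.1".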

By default, available versions of Tarantool Community Edition and tt are taken from their git repositories. Their installation includes building from sources, which requires some tools and dependencies, such as a C compiler. Make sure they are available in the system.

Tarantool Enterprise Edition is installed from prebuilt packages.

Authentication

To install Tarantool EE using tt install, you need to provide access credentials for the Tarantool customer zone. Use one of the following ways to pass the username and the password:

A text file specified in the ee.credential_path parameter of the tt environment configuration:
```
# tt.yaml
# ...
ee:
  credential_path: cred.txt
```
cred.txt should contain a username and a password on separate lines:
```
myuser@tarantool.io
p4$$w0rD
```

Environment variables TT_CLI_EE_USERNAME and TT_CLI_EE_PASSWORD:

$ export TT_CLI_EE_USERNAME=myuser@tarantool.io
$ export TT_CLI_EE_PASSWORD='p4$$w0rD'
$ tt install tarantool-ee
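To make the credentials file format concrete, here is a minimal Python sketch that reads a username and a password from the first two lines of a file. The read_credentials helper is hypothetical and only illustrates the layout of cred.txt; it does not reproduce how tt itself reads the file.

```python
import os
import tempfile


def read_credentials(path):
    """Read a username and a password from the first two lines of a file."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    if len(lines) < 2:
        raise ValueError("expected a username and a password on separate lines")
    return lines[0], lines[1]


# Demo with a temporary file standing in for cred.txt.
with tempfile.TemporaryDirectory() as tmp:
    cred_path = os.path.join(tmp, "cred.txt")
    with open(cred_path, "w", encoding="utf-8") as f:
        f.write("myuser@tarantool.io\np4$$w0rD\n")
    username, password = read_credentials(cred_path)

print(username)
```

Unlike a shell command line, the file stores the password literally, so characters such as $ need no quoting there.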

Development versions

tt install can be used to build custom Tarantool and tt versions for development purposes from commits and pull requests on their GitHub repositories.

To build Tarantool or tt from a specific commit on their GitHub repository, pass the commit hash (7 or more characters) after the program name. If you want to use a PR as a source, provide a pr/<PR_ID> argument:

$ tt install tarantool 03c184d
$ tt install tt pr/50
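The three forms of the optional argument (a version, a commit hash, or pr/<PR_ID>) can be told apart roughly as shown below. This is a hedged sketch: classify_source is a hypothetical helper, and the exact rules tt applies may differ, for example in the maximum hash length it accepts.

```python
import re


def classify_source(arg):
    """Classify the optional argument that follows PROGRAM_NAME.

    Returns a (kind, value) pair, where kind is 'pr', 'commit', or 'version'.
    """
    if arg.startswith("pr/") and arg[3:].isdigit():
        return "pr", arg[3:]
    # A commit hash: 7 or more hexadecimal characters (40 is the SHA-1 length).
    if re.fullmatch(r"[0-9a-fA-F]{7,40}", arg):
        return "commit", arg
    # Anything else is treated as a version, such as 3.2 or 2.11.1.
    return "version", arg


for arg in ("03c184d", "pr/50", "3.2"):
    print(arg, "->", classify_source(arg))
```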

If you build Tarantool from sources, you can install local builds to the current tt environment by running tt install with the tarantool-dev program name and the path to the build:

$ tt install tarantool-dev ~/src/tarantool/build

Local repositories

You can also set up a local repository with installation files you need. To use it, specify its location in the repo section of the tt configuration file and run tt install with the --local-repo flag.

Example

Install the latest available version of Tarantool CE:
```
$ tt install tarantool
```
Install the latest available patch version of the Tarantool CE 3.2 release series:
```
$ tt install tarantool 3.2
```
Install Tarantool 2.11.1 from the local repository:
```
$ tt install tarantool 2.11.1 --local-repo
```

Reinstall Tarantool 2.10.8:

$ tt install tarantool 2.10.8 --reinstall

Install Tarantool from PR #1234 on the tarantool/tarantool GitHub repository:
```
$ tt install tarantool pr/1234
```
Install tt from the commit with the hash 40e696e on the tarantool/tt GitHub repository:
```
$ tt install tt 40e696e
```

Install Tarantool built from sources:

$ tt install tarantool-dev ~/src/tarantool/build

Listing enabled applications

$ tt instances

tt instances shows the list of enabled applications and their instances in the current environment.

Note

Enabled applications are applications that are stored inside the instances_enabled directory specified in the tt configuration file. They can be either running or not. To check if an application is running, use tt status.

Example

Show the list of enabled applications and their instances:
```
$ tt instances
```

Terminating Tarantool instances

$ tt kill APPLICATION[:APP_INSTANCE]

tt kill terminates instances with SIGQUIT and SIGKILL signals.

To terminate all instances of the app application:

$ tt kill app

To terminate the storage-001-r instance of the app application without confirmation:

$ tt kill app:storage-001-r --force

To terminate the storage-001-r instance of the app application and generate its core dump:

$ tt kill app:storage-001-r --dump
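The APPLICATION[:APP_INSTANCE] argument used by tt kill and other commands in this reference can be read as an application name with an optional instance name after a colon. A minimal Python sketch of that convention, with a hypothetical parse_target helper:

```python
def parse_target(target):
    """Split 'APPLICATION[:APP_INSTANCE]' into (application, instance or None)."""
    application, sep, instance = target.partition(":")
    if not application or (sep and not instance):
        raise ValueError(f"malformed target: {target!r}")
    return application, instance or None


print(parse_target("app"))                # all instances of the application
print(parse_target("app:storage-001-r"))  # a single instance
```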

Options

-d, --dump¶: Generate core dumps of terminated processes.

-f, --force¶: Kill instances without confirmation.

Printing Tarantool logs

$ tt log [APPLICATION[:APP_INSTANCE]]

tt log prints the last lines of instance logs.

To print the last 10 log lines of all the app application instances:

$ tt log app

To print the last 50 log lines of the router instance of the app application:

$ tt log -n 50 app:router

To keep printing logs of the app application instances as they grow:

$ tt log -f app

Options

-f, --follow¶: Keep printing new lines added to the log file.

-n, --lines¶: The number of last lines to output. Default: 10.
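The effect of -n can be pictured as keeping a bounded window over the log while reading it. The Python sketch below illustrates that idea only; last_lines is a hypothetical helper, and the -f mode is not modeled.

```python
import io
from collections import deque


def last_lines(stream, n=10):
    """Return the last n lines of a line iterable (10 by default)."""
    # A deque with maxlen drops the oldest line once the window is full.
    return list(deque(stream, maxlen=n))


# An in-memory log with 25 lines stands in for an instance log file.
log = io.StringIO("".join(f"line {i}\n" for i in range(1, 26)))
tail = last_lines(log)
print("".join(tail), end="")
```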

Rotating instance logs

$ tt logrotate APPLICATION[:APP_INSTANCE]

tt logrotate rotates logs of a Tarantool application or specific instances, and the tt log. For example, you need to run this command to continue logging after a log rotation program renames or moves instances’ logs. Learn more about rotating logs.

Calling tt logrotate on an application has the same effect as executing the built-in log.rotate() function on all its instances.

Examples

Rotate logs of the app application’s instances:

$ tt logrotate app

Managing centralized migrations

Enterprise Edition

This command is supported by the Enterprise Edition only.

$ tt migrations COMMAND [COMMAND_OPTION ...]

tt migrations manages centralized migrations in a Tarantool EE cluster. See Centralized migrations with tt for a detailed guide on using the centralized migrations mechanism.

Important

Only Tarantool EE clusters with etcd centralized configuration storage are supported.

COMMAND is one of the following:

publish
apply
status
stop
remove

publish

$ tt migrations publish ETCD_URI [MIGRATIONS_DIR | MIGRATION_FILE] [OPTION ...]

tt migrations publish sends the migration files to the cluster’s centralized configuration storage for future execution.

By default, the command sends all files stored in migrations/ inside the current directory.

$ tt migrations publish "https://user:pass@localhost:2379/myapp"

To select another directory with migration files, provide a path to it as the command argument:

$ tt migrations publish "https://user:pass@localhost:2379/myapp" my_migrations

To publish a single migration from a file, use its name or path as the command argument:

$ tt migrations publish "https://user:pass@localhost:2379/myapp" migrations/000001_create_space.lua

Optionally, you can provide a key to use as a migration identifier instead of the filename:

$ tt migrations publish "https://user:pass@localhost:2379/myapp" file.lua  \
                        --key=000001_create_space.lua

When publishing migrations, tt performs checks for:

Syntax errors in migration files. To skip syntax check, add the --skip-syntax-check option.
Existence of migrations with the same names. To overwrite an existing migration with the same name, add the --overwrite option.
Migration names order. By default, tt migrations only adds new migrations to the end of the migrations list ordered lexicographically. For example, if migrations 001.lua and 003.lua are already published, an attempt to publish 002.lua will fail. To force publishing migrations disregarding the order, add the --ignore-order-violation option.
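The name and order checks can be modeled roughly as follows. This Python sketch is an assumption-laden illustration rather than tt's code: check_publish and its messages are hypothetical, and the syntax check is left out.

```python
def check_publish(published, new_names, overwrite=False,
                  ignore_order_violation=False):
    """Return the problems that would block publishing new_names.

    published: set of migration names already in the storage.
    new_names: names of the migrations being published.
    """
    problems = []
    # The lexicographically greatest published name marks the end of the list.
    last = max(published) if published else None
    for name in sorted(new_names):
        if name in published:
            if not overwrite:
                problems.append(f"{name}: already published")
            continue
        if last is not None and name < last and not ignore_order_violation:
            problems.append(f"{name}: must sort after {last}")
    return problems


published = {"001.lua", "003.lua"}
print(check_publish(published, ["002.lua"]))  # order violation
print(check_publish(published, ["004.lua"]))  # no problems
```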

Warning

Using the options that ignore checks when publishing migrations may cause migration inconsistency in the cluster.

apply

$ tt migrations apply ETCD_URI [OPTION ...]

tt migrations apply applies published migrations to the cluster. It executes all migrations from the cluster’s centralized configuration storage on all its read-write instances (replica set leaders).

$ tt migrations apply "https://user:pass@localhost:2379/myapp"  \
                    --tarantool-username=admin --tarantool-password=pass

To apply a single published migration, pass its name in the --migration option:

$ tt migrations apply "https://user:pass@localhost:2379/myapp"  \
                    --tarantool-username=admin --tarantool-password=pass  \
                    --migration=000001_create_space.lua

To apply migrations on a single replica set, specify the --replicaset option:

$ tt migrations apply "https://user:pass@localhost:2379/myapp"  \
                    --tarantool-username=admin --tarantool-password=pass  \
                    --replicaset=storage-001

The command also provides options for migration troubleshooting: --ignore-order-violation, --force-reapply, and --ignore-preceding-status. Learn to use them in Troubleshooting migrations.

Warning

The use of migration troubleshooting options may lead to migration inconsistency in the cluster. Use them only for local development and testing purposes.

status

$ tt migrations status ETCD_URI [OPTION ...]

tt migrations status prints the list of migrations published to the centralized storage and the result of their execution on the cluster instances.

Possible migration statuses are:

APPLY_STARTED – the migration execution has started but not completed yet or has been interrupted with tt migrations stop
APPLIED – the migration is successfully applied on the instance
FAILED – there were errors during the migration execution on the instance

To get the list of migrations stored in the given etcd storage and information about their execution on the cluster, run:

$ tt migrations status "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass

If the cluster uses SSL encryption, add SSL options. Learn more in Authentication.

Use the --migration and --replicaset options to get information about specific migrations or replica sets:

$ tt migrations status "https://user:pass@localhost:2379/myapp"  \
                     --tarantool-username=admin --tarantool-password=pass \
                     --replicaset=storage-001 --migration=000001_create_writers_space.lua

The --display-mode option lets you tailor the command output:

with --display-mode config-storage, the command prints only the list of migrations published to the centralized storage.
with --display-mode cluster, the command prints only the migration statuses on the cluster instances.

To find out the results of a migration execution on a specific replica set in the cluster, run:

$ tt migrations status "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass  \
                       --replicaset=storage-001 --display-mode=cluster

stop

$ tt migrations stop ETCD_URI [OPTION ...]

tt migrations stop stops the execution of migrations in the cluster.

Warning

Calling tt migrations stop may cause migration inconsistency in the cluster.

To stop the execution of a migration currently running in the cluster:

$ tt migrations stop "https://user:pass@localhost:2379/myapp"  \
                     --tarantool-username=admin --tarantool-password=pass

tt migrations stop interrupts a single migration. If you call it to interrupt a process that applies multiple migrations, the ones completed before the call receive the APPLIED status. The migration interrupted by the call remains in the APPLY_STARTED status.
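The resulting statuses can be modeled with a short Python sketch. It is an illustration under one stated assumption: migrations that follow the interrupted one are taken not to have started, so they get no status here. The statuses_after_stop helper is hypothetical.

```python
def statuses_after_stop(migrations, interrupted):
    """Model migration statuses after an interrupted apply.

    migrations: migration names in execution order.
    interrupted: the migration that was running when stop was called.
    """
    index = migrations.index(interrupted)
    statuses = {}
    for name in migrations[:index]:
        statuses[name] = "APPLIED"            # completed before the call
    statuses[interrupted] = "APPLY_STARTED"   # interrupted by the call
    # Assumption: later migrations never started, so they have no status.
    return statuses


print(statuses_after_stop(["001.lua", "002.lua", "003.lua"], "002.lua"))
```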

remove

$ tt migrations remove ETCD_URI [OPTION ...]

tt migrations remove removes published migrations from the centralized storage. With additional options, it can also remove the information about the migration execution on the cluster instances.

To remove all migrations from a specified centralized storage:

$ tt migrations remove "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass

To remove a specific migration, pass its name in the --migration option:

$ tt migrations remove "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass  \
                       --migration=000001_create_writers_space.lua

Before removing migrations, the command checks their status on the cluster. To ignore the status and remove migrations anyway, add the --force-remove-on=config-storage option:

$ tt migrations remove "https://user:pass@localhost:2379/myapp"  \
                        --force-remove-on=config-storage

Note

In this case, cluster credentials are not required.

To remove migration execution information from the cluster (clear the migration status), use the --force-remove-on=cluster option:

$ tt migrations remove "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass  \
                       --force-remove-on=cluster

To clear all migration information from the centralized storage and cluster, use the --force-remove-on=all option:

$ tt migrations remove "https://user:pass@localhost:2379/myapp"  \
                       --tarantool-username=admin --tarantool-password=pass  \
                       --force-remove-on=all

Authentication

Since tt migrations manages migrations through a centralized etcd storage, it needs credentials to access this storage. There are two ways to pass etcd credentials:

command-line options --config-storage-username and --config-storage-password
the etcd URI, for example, https://user:pass@localhost:2379/myapp

Credentials specified in the URI have a higher priority.
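The priority rule can be sketched in Python with the standard urllib.parse module. The etcd_credentials helper and its parameter names are hypothetical; the sketch only shows the precedence of URI credentials over the command-line options.

```python
from urllib.parse import urlsplit


def etcd_credentials(uri, option_username=None, option_password=None):
    """Return (username, password), preferring credentials from the URI."""
    parts = urlsplit(uri)
    if parts.username is not None:
        # Credentials embedded in the URI win over the options.
        return parts.username, parts.password
    return option_username, option_password


print(etcd_credentials("https://user:pass@localhost:2379/myapp", "other", "secret"))
print(etcd_credentials("https://localhost:2379/myapp", "other", "secret"))
```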

For commands that connect to the cluster (that is, all except publish), Tarantool credentials are also required. They are passed in the --tarantool-username and --tarantool-password options.

If the cluster uses SSL traffic encryption, provide the necessary connection parameters in the --tarantool-ssl* options: --tarantool-sslcertfile, --tarantool-sslkeyfile, and others. All options are listed in Options.

Options

--acquire-lock-timeout INT¶

Applicable to: apply

The timeout for acquiring the migrations fiber lock, in seconds. Default: 60. The fiber lock prevents migrations from running concurrently.

--config-storage-password STRING¶

A password for connecting to the centralized migrations storage (etcd).

See also: Authentication.

--config-storage-username STRING¶

A username for connecting to the centralized migrations storage (etcd).

See also: Authentication.

--display-mode STRING¶

Applicable to: status

Display only specific information. Possible values:

config-storage – information about migrations published to the centralized storage.
cluster – information about migrations applied on the cluster.

Packaging the application

$ tt pack TYPE [OPTION ...]

tt pack packages an application into a distributable bundle of the specified TYPE:

tgz: create a .tgz archive.
deb: create a DEB package.
rpm: create an RPM package.

Example: a DEB package

The command below creates a DEB package with all applications from the current tt environment:

$ tt pack deb

This command generates a .deb file whose name depends on the environment directory name and the operating system architecture, for example, test-env_0.1.0.0-1_x86_64.deb. The package contains the following files:

The content of the application directories: source files, resources, dependencies.
tt environment files: tarantool and tt executables, tt.yaml configuration file, external modules, headers.
.service unit files that allow running applications as systemd services (a separate file for each application).

You can also pass various options to the tt pack command to adjust generation properties, for example, customize a bundle name, choose which artifacts should be included, or specify the required application dependencies.

systemd unit parameters

You can customize your application’s systemd unit file generated by tt pack. To add parameters to the unit file, define them in a YAML file named systemd-unit-params.yml in the application directory.

$ tt pack rpm # unit file with parameters from systemd-unit-params.yml if it exists

You can also pass unit parameters from an arbitrary file by adding the --unit-params-file option to the tt pack call:

$ tt pack rpm --unit-params-file my-params.yml # unit file with parameters from my-params.yml

Important

The systemd-unit-params.yml file has a higher priority than the --unit-params-file option. If this file exists, it overrides parameters from the file passed in the option.

tt pack supports the following systemd unit parameters:

FdLimit – the number of open file descriptors (LimitNOFILE in the unit file).
instance-env – a list of environment variables in the <VAR_NAME>: <VALUE> format. Each list item adds an Environment=<VAR_NAME>=<VALUE> line to the unit file.

An example of the systemd-unit-params.yml file:

FdLimit: 200
instance-env:
  INSTANCE: "inst:%i"
  TARANTOOL_WORKDIR: "/tmp"
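To illustrate how these parameters map to unit file lines, here is a minimal Python sketch. It assumes the parameters are already loaded into a dictionary, and the unit_lines helper is hypothetical. systemd spells the limit directive LimitNOFILE; the quoting of values in the file that tt pack actually generates may differ.

```python
def unit_lines(params):
    """Render supported unit parameters as systemd unit file lines."""
    lines = []
    if "FdLimit" in params:
        lines.append(f"LimitNOFILE={params['FdLimit']}")
    # Each instance-env item becomes a separate Environment= line.
    for name, value in params.get("instance-env", {}).items():
        lines.append(f"Environment={name}={value}")
    return lines


# The same values as in the systemd-unit-params.yml example.
params = {
    "FdLimit": 200,
    "instance-env": {"INSTANCE": "inst:%i", "TARANTOOL_WORKDIR": "/tmp"},
}
print("\n".join(unit_lines(params)))
```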

Generating files for integrity checks

Enterprise Edition

The integrity check functionality is supported by the Enterprise Edition only.

tt pack can generate checksums and signatures to use for integrity checks when running the application. These files are:

hashes.json and hashes.json.sig in each application directory. hashes.json contains SHA256 checksums of executable files that the application uses and its configuration file. hashes.json.sig contains a digital signature for hashes.json.
env_hashes.json and env_hashes.json.sig in the environment root are similar files for the tt environment. They contain checksums for Tarantool and tt executables, and for the tt.yaml configuration file.

To generate checksums and signatures for integrity check, use the --with-integrity-check option. Its argument must be an RSA private key.
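The checksum half of this process can be sketched in Python with the standard hashlib module. The manifest layout below (file name mapped to a hex digest) is an assumption made for illustration and is not the documented hashes.json format; signing with the RSA key is left out.

```python
import hashlib
import json
import os
import tempfile


def sha256_of(path):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(paths):
    """Map each file name to its SHA256 digest (hypothetical layout)."""
    return {os.path.basename(p): sha256_of(p) for p in paths}


# A temporary file stands in for an application source file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "init.lua")
    with open(path, "wb") as f:
        f.write(b"print('hello')\n")
    manifest = checksum_manifest([path])

print(json.dumps(manifest, indent=2))
```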

Note

You can generate a key pair using OpenSSL 3 as follows:

$ openssl genrsa -traditional -out private.pem 2048
$ openssl rsa -in private.pem -pubout > public.pem

To create a tar.gz archive with integrity check artifacts:

$ tt pack tgz --with-integrity-check private.pem

Learn how to perform integrity checks at the application startup and in runtime in the tt start reference.

Options

--all¶: Include all artifacts in a bundle. In this case, a bundle might include snapshots, WAL files, and logs.

--app-list APPLICATIONS¶

Specify the applications included in a bundle.

Example

$ tt pack tgz --app-list app1,app3

--cartridge-compat¶

Applicable to: tgz

Package a Cartridge CLI-compatible archive.

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This option is added for backward compatibility with earlier versions.

--deps STRINGS¶

Applicable to: deb, rpm

Specify dependencies included in RPM and DEB packages.

Example

$ tt pack deb --deps 'wget,make>0.1.0,unzip>1,unzip<=7'
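The structure of the --deps value in this example (comma-separated items, each a package name with an optional comparison and version) can be unpacked as follows. This Python sketch is inferred from the example only: parse_deps is a hypothetical helper, and the operator set tt accepts is assumed, not confirmed.

```python
import re

# A name, optionally followed by a comparison operator and a version.
DEP = re.compile(r"([^<>=\s]+)\s*(?:(==|>=|<=|>|<|=)\s*(\S+))?")


def parse_deps(value):
    """Parse a comma-separated dependency string into (name, op, version)."""
    deps = []
    for item in value.split(","):
        match = DEP.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"malformed dependency: {item!r}")
        deps.append(match.groups())
    return deps


for dep in parse_deps("wget,make>0.1.0,unzip>1,unzip<=7"):
    print(dep)
```

Repeating a package name, as with unzip here, expresses a version range.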

--deps-file STRING¶

Applicable to: deb, rpm

Specify the path to a file containing dependencies included in RPM and DEB packages. For example, the package-deps.txt file below contains several dependencies and their versions:

unzip==6.0
neofetch>=6,<7
gcc>8

If this file is placed in the current directory, a tt pack command might look like this:

$ tt pack deb --deps-file package-deps.txt

--filename¶

Specify a bundle name.

Example

$ tt pack tgz --filename sample-app.tar.gz

--name PACKAGE_NAME¶

Specify a package name.

Example

$ tt pack tgz --name sample-app --version 1.0.1

--preinst¶

Applicable to: deb, rpm

Specify the path to a pre-install script for RPM and DEB packages.

Example

$ tt pack deb --preinst pre.sh

--postinst¶

Applicable to: deb, rpm

Specify the path to a post-install script for RPM and DEB packages.

Example

$ tt pack deb --postinst post.sh

--tarantool-version¶: Specify a Tarantool version for packaging in a Docker container. For use with --use-docker only.

--unit-params-file¶: The path to a file with custom systemd unit parameters.

--use-docker¶

Build a package in an Ubuntu 18.04 Docker container. To specify a Tarantool version to use in the container, add the --tarantool-version option.

Before executing tt pack with this option, make sure Docker is running.

--version PACKAGE_VERSION¶

Specify a package version.

Example

$ tt pack tgz --name sample-app --version 1.0.1

--with-binaries¶: Include Tarantool and tt binaries in a bundle.

--with-integrity-check PRIVATE_KEY¶

Generate checksums and signatures for integrity checks at the application startup.

--with-tarantool-deps¶: Add Tarantool and tt as package dependencies.

--without-binaries¶: Don’t include Tarantool and tt binaries in a bundle.

--without-modules¶: Don’t include external modules in a bundle.

Playing the contents of .snap and .xlog files to a Tarantool instance

$ tt play URI FILE ... [OPTION ...]

tt play plays the contents of snapshot (.snap) and WAL (.xlog) files to another Tarantool instance. A single call of tt play can play multiple files.

Options

-u USERNAME, --username USERNAME¶: A Tarantool user for connecting to the instance.

-p PASSWORD, --password PASSWORD¶: The user’s password.

--from LSN¶: Play operations starting from the given LSN.

--to LSN¶: Play operations up to the given LSN. Default: 18446744073709551615.

--replica ID¶

Filter the operations by replica ID. Can be passed more than once.

--space ID¶: Filter the operations by space ID. Can be passed more than once.

--show-system¶: Show the operations on system spaces.

Details

tt play plays operations from .xlog and .snap files to the destination instance one by one. All data changes happen the same way as if they were performed on this instance. This means that:

All affected spaces must exist on the destination instance. They must have the same structure and space_id as on the instance that created the snapshot or WAL file.

To play a snapshot or a WAL to a clean instance, include the operations on system spaces by adding the --show-system flag. With this flag, tt plays the operations that create and configure user-defined spaces.
The operations’ LSNs change unless you play all operations that took place since the instance startup.
Replica IDs change in accordance with the destination instance configuration.
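The combined effect of --from, --to, --replica, and --space can be pictured as a filter over the operations read from the files. The Python sketch below is a simplified model: select_operations and the operation dictionaries are hypothetical, both LSN bounds are assumed to be inclusive, and --show-system is not modeled.

```python
MAX_LSN = 2**64 - 1  # 18446744073709551615, the documented default of --to


def select_operations(operations, from_lsn=0, to_lsn=MAX_LSN,
                      replicas=(), spaces=()):
    """Keep operations that pass the LSN, replica ID, and space ID filters."""
    selected = []
    for op in operations:
        if not from_lsn <= op["lsn"] <= to_lsn:
            continue
        # An empty filter means "no restriction".
        if replicas and op["replica_id"] not in replicas:
            continue
        if spaces and op["space_id"] not in spaces:
            continue
        selected.append(op)
    return selected


operations = [
    {"lsn": 1, "replica_id": 1, "space_id": 512},
    {"lsn": 2, "replica_id": 2, "space_id": 513},
    {"lsn": 3, "replica_id": 1, "space_id": 514},
]
print(select_operations(operations, spaces=(512, 513)))
```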

Authentication

Use one of the following ways to pass the username and the password when connecting to the instance:

The -u (--username) and -p (--password) options:

$ tt play 192.168.10.10:3301 00000000000000000000.xlog -u myuser -p 'p4$$w0rD'

The connection string:

$ tt play 'myuser:p4$$w0rD@192.168.10.10:3301' 00000000000000000000.xlog

Environment variables TT_CLI_USERNAME and TT_CLI_PASSWORD:

$ export TT_CLI_USERNAME=myuser
$ export TT_CLI_PASSWORD='p4$$w0rD'
$ tt play 192.168.10.10:3301 00000000000000000000.xlog

Examples

Play the contents of 00000000000000000000.xlog to the instance on 192.168.10.10:3301:
```
$ tt play 192.168.10.10:3301 00000000000000000000.xlog
```
Play operations on spaces with space_id 512 and 513 from the 00000000000000000012.snap snapshot file:
```
$ tt play 192.168.10.10:3301 00000000000000000012.snap --space 512 --space 513
```
Play the contents of 00000000000000000000.xlog including operations on system spaces:
```
$ tt play 192.168.10.10:3301 00000000000000000000.xlog --show-system
```

Managing replica sets

$ tt replicaset COMMAND {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs COMMAND {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]

tt replicaset (or tt rs) manages a Tarantool replica set.

COMMAND is one of the following:

status
promote
demote
expel
vshard
bootstrap
rebootstrap
roles

status

$ tt replicaset status {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs status {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]

tt replicaset status (tt rs status) shows the current status of a replica set.

Displaying status of all replica sets

To view the status of all replica sets of an application in the current tt environment, run tt replicaset status with the application name:

$ tt replicaset status myapp

Displaying status of a single replica set

To view the status of a single replica set of an application, run tt replicaset status with a name or a URI of an instance from this replica set:

$ tt replicaset status myapp:storage-001-a

For a replica outside the current tt environment, specify its URI and access credentials:

$ tt replicaset status 192.168.10.10:3301 -u myuser -p 'p4$$w0rD'

Learn about other ways to provide user credentials in Authentication.

promote

$ tt replicaset promote {APPLICATION:APP_INSTANCE | URI} [OPTIONS ...]
# or
$ tt rs promote {APPLICATION:APP_INSTANCE | URI} [OPTIONS ...]

tt replicaset promote (tt rs promote) promotes the specified instance, making it a leader of its replica set. This command works on Tarantool clusters with a local YAML configuration and Cartridge clusters.

Note

To promote an instance in a Tarantool cluster with a centralized configuration, use tt cluster replicaset promote.

Promoting in clusters with local YAML configurations

tt replicaset promote works on Tarantool clusters with local YAML configurations with failover modes off, manual, and election.

In failover modes off or manual, this command updates the cluster configuration file according to the specified arguments and reloads it:

off failover mode: the command sets database.mode to rw on the specified instance.

Important

If failover is off, the command doesn’t consider the modes of other replica set members, so there can be any number of read-write instances in one replica set.

manual failover mode: the command updates the leader option of the replica set configuration. Other instances of this replica set become read-only.

Example:

$ tt replicaset promote my-app:storage-001-a
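The configuration changes made in the off and manual modes can be modeled on a simplified dictionary. This Python sketch is illustrative only: the flat config layout and the promote helper are hypothetical and do not mirror the real YAML configuration structure or tt's code.

```python
import copy


def promote(config, replicaset, instance):
    """Return a copy of config with the given instance promoted."""
    updated = copy.deepcopy(config)
    rs = updated["replicasets"][replicaset]
    if instance not in rs["instances"]:
        raise KeyError(instance)
    failover = updated["failover"]
    if failover == "off":
        # Only the promoted instance changes; other members keep their modes.
        rs["instances"][instance].setdefault("database", {})["mode"] = "rw"
    elif failover == "manual":
        # A single leader option; the other members become read-only.
        rs["leader"] = instance
    else:
        raise ValueError(f"not handled in this sketch: {failover}")
    return updated


config = {
    "failover": "manual",
    "replicasets": {
        "storage-001": {
            "leader": "storage-001-b",
            "instances": {"storage-001-a": {}, "storage-001-b": {}},
        }
    },
}
promoted = promote(config, "storage-001", "storage-001-a")
print(promoted["replicasets"]["storage-001"]["leader"])
```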

If some members of the affected replica set are running outside the current tt environment, tt replicaset promote can’t ensure the configuration reload on them and reports an error. You can skip this check by adding the -f/--force option:

$ tt replicaset promote my-app:storage-001-a --force

In the election failover mode, tt replicaset promote initiates the new leader election by calling box.ctl.promote() on the specified instance. The --timeout option can be used to specify the election completion timeout:

$ tt replicaset promote my-app:storage-001-a --timeout=10

Promoting in Cartridge clusters

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This command is added for backward compatibility with earlier versions.

tt replicaset promote promotes instances in Cartridge clusters as follows:

disabled or eventual failover mode: the command changes the instance failover priority.

Important

In these cases, consistency is not guaranteed and replication conflicts may occur.

stateful or raft failover mode: the command calls cartridge.failover_promote() and waits until the instance transitions to the read-write mode. If the -f/--force option is specified, the force_inconsistency option of cartridge.failover_promote() is set to true.

$ tt replicaset promote my-cartridge-app:storage-001-a --force

Learn more about Cartridge failover modes.

demote

$ tt replicaset demote APPLICATION:APP_INSTANCE [OPTIONS ...]
# or
$ tt rs demote APPLICATION:APP_INSTANCE [OPTIONS ...]

tt replicaset demote (tt rs demote) demotes an instance in a Tarantool cluster with a local YAML configuration.

Note

To demote an instance in a Tarantool cluster with a centralized configuration, use tt cluster replicaset demote.

Demoting in clusters with local YAML configurations

tt replicaset demote can demote instances in Tarantool clusters with local YAML configurations with failover modes off and election.

Note

In clusters with manual failover mode, you can demote a read-write instance by promoting a read-only instance from the same replica set with tt replicaset promote.

In the off failover mode, tt replicaset demote sets the instance’s database.mode to ro and reloads the configuration.

Important

If failover is off, the command doesn’t consider the modes of other replica set members, so there can be any number of read-write instances in one replica set.

If some members of the affected replica set are running outside the current tt environment, tt replicaset demote can’t ensure the configuration reload on them and reports an error. You can skip this check by adding the -f/--force option:

$ tt replicaset demote my-app:storage-001-a --force

In the election failover mode, tt replicaset demote initiates a leader election in the replica set. The specified instance’s replication.election_mode is changed to voter for this election, which guarantees that another instance is elected as a new replica set leader.

The --timeout option can be used to specify the election completion timeout:

$ tt replicaset demote my-app:storage-001-a --timeout=10

expel

$ tt replicaset expel APPLICATION:APP_INSTANCE [OPTIONS ...]
# or
$ tt rs expel APPLICATION:APP_INSTANCE [OPTIONS ...]

tt replicaset expel (tt rs expel) expels an instance from the cluster.

$ tt replicaset expel myapp:storage-001-b

The command supports the --config, --cartridge, and --custom options that force the use of a specific orchestrator.

To expel an instance from a Cartridge cluster:

$ tt replicaset expel my-cartridge-app:storage-001-b --cartridge

vshard

$ tt replicaset vshard COMMAND {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs vshard COMMAND {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs vs COMMAND {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]

tt replicaset vshard (tt rs vs) manages vshard in the cluster.

It has the following subcommands:

bootstrap

vshard bootstrap

$ tt replicaset vshard bootstrap {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs vshard bootstrap {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]
# or
$ tt rs vs bootstrap {APPLICATION[:APP_INSTANCE] | URI} [OPTIONS ...]

tt replicaset vshard bootstrap (tt rs vs bootstrap) bootstraps vshard in the cluster.

$ tt replicaset vshard bootstrap myapp

With a URI and credentials:

$ tt replicaset vshard bootstrap 192.168.10.10:3301 -u myuser -p 'p4$$w0rD'

You can specify the application name or the name of any cluster instance. The command automatically finds a vshard router in the cluster and calls vshard.router.bootstrap() on it.

The command supports the --config, --cartridge, and --custom options that force the use of a specific orchestrator.

To bootstrap vshard in a Cartridge cluster:

$ tt replicaset vshard bootstrap my-cartridge-app --cartridge

bootstrap

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This command is added for backward compatibility with earlier versions.

$ tt replicaset bootstrap APPLICATION[:APP_INSTANCE] [OPTIONS ...]
# or
$ tt rs bootstrap APPLICATION[:APP_INSTANCE] [OPTIONS ...]

tt replicaset bootstrap (tt rs bootstrap) bootstraps a Cartridge cluster or an instance. The command works within the current tt environment and uses application and instance names.

Note

tt replicaset bootstrap effectively duplicates two other commands:

When called with an application name: tt cartridge replicasets setup
When called with an instance name: tt cartridge replicasets join

Bootstrapping a Cartridge cluster

To bootstrap the cartridge_app application using its default replica sets file replicasets.yml:

$ tt replicaset bootstrap cartridge_app

To use another file with replica set configuration, provide a path to it in the --file option:

$ tt replicaset bootstrap cartridge_app --file replicasets1.yml

To additionally bootstrap vshard after the cluster bootstrap, add --bootstrap-vshard:

$ tt replicaset bootstrap --bootstrap-vshard cartridge_app

Bootstrapping an instance

When called with the instance name, tt replicaset bootstrap joins the instance to the replica set specified in the --replicaset option:

$ tt replicaset bootstrap --replicaset replicaset cartridge_app:instance1

rebootstrap

$ tt replicaset rebootstrap APPLICATION:APP_INSTANCE [-y | --yes]
# or
$ tt rs rebootstrap APPLICATION:APP_INSTANCE [-y | --yes]

tt replicaset rebootstrap (tt rs rebootstrap) rebootstraps an instance: stops it, removes instance artifacts, starts it again.

To rebootstrap the storage-001 instance of the myapp application:

$ tt replicaset rebootstrap myapp:storage-001

To automatically confirm the rebootstrap, add the -y/--yes option:

$ tt replicaset rebootstrap myapp:storage-001 -y

roles

$ tt replicaset roles [add|remove] APPLICATION[:APP_INSTANCE] ROLE_NAME [OPTIONS ...]
# or
$ tt rs roles [add|remove] APPLICATION[:APP_INSTANCE] ROLE_NAME [OPTIONS ...]

tt replicaset roles (tt rs roles) manages application roles in the cluster. This command works on Tarantool clusters with a local YAML configuration and Cartridge clusters. It has two subcommands:

add adds a role
remove removes a role

Note

To manage roles in a Tarantool cluster with a centralized configuration, use tt cluster replicaset roles.

Managing roles in clusters with local YAML configurations

When called on clusters with local YAML configurations, tt replicaset roles subcommands add or remove the corresponding lines from the configuration file and reload the configuration.

Use the --global, --group, --replicaset, --instance options to specify the configuration scope to add or remove roles. For example, to add a role to all instances in a replica set:

$ tt replicaset roles add my-app roles.my-role --replicaset storage-a

You can also manage roles of a specific instance by specifying its name after the application name:

$ tt replicaset roles add my-app:router-001 roles.my-role

To remove a role defined in the global configuration scope:

$ tt replicaset roles remove my-app roles.my-role --global

If some instances of the affected scope are running outside the current tt environment, tt replicaset roles can’t ensure the configuration reload on them and reports an error. You can skip this check by adding the -f/--force option:

$ tt replicaset roles add my-app roles.my-role --replicaset storage-a --force

Managing roles in Cartridge clusters

Important

The Tarantool Cartridge framework is deprecated and is not compatible with Tarantool 3.0 and later. This command is added for backward compatibility with earlier versions.

When called on Cartridge clusters, tt replicaset roles subcommands add or remove Cartridge cluster roles.

Cartridge cluster roles are defined per replica set. Thus, you can use the --replicaset and --group options to define a role’s scope. In this case, a group is a vshard group.

To add a role to a Cartridge cluster replica set:

$ tt replicaset roles add my-cartridge-app my-role --replicaset storage-001

To remove a role from a vshard group:

$ tt replicaset roles remove my-cartridge-app my-role --group cold-data

Learn more about Cartridge cluster roles.

Selecting the application orchestrator manually

You can specify the orchestrator to use for the application when calling tt replicaset commands. The following options are available:

--config for applications that use YAML cluster configuration (Tarantool 3.x or later).
--cartridge for Cartridge applications (Tarantool 2.x).
--custom for any other orchestrators used on Tarantool 2.x clusters.

$ tt replicaset status myapp --config
$ tt replicaset promote my-cartridge-app:storage-001-a --cartridge

If an actual orchestrator that the application uses does not match the specified option, an error is raised.

Authentication

Use one of the following ways to pass the credentials of a Tarantool user when connecting to the instance by its URI:

The -u (--username) and -p (--password) options:

$ tt replicaset status 192.168.10.10:3301 -u myuser -p 'p4$$w0rD'

The connection string:

$ tt replicaset status 'myuser:p4$$w0rD@192.168.10.10:3301'

Environment variables TT_CLI_USERNAME and TT_CLI_PASSWORD:

$ export TT_CLI_USERNAME=myuser
$ export TT_CLI_PASSWORD='p4$$w0rD'
$ tt replicaset status 192.168.10.10:3301
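
The sample password contains `$$`, which a POSIX shell replaces with its own process ID unless the value is wrapped in single quotes. The following sketch only illustrates the quoting difference. It does not call tt, and the variable names are hypothetical:

```shell
# Single quotes keep the password literal.
literal='p4$$w0rD'
# Double quotes still let the shell expand $$ to its process ID.
expanded="p4$$w0rD"

printf 'literal:  %s\n' "$literal"
printf 'expanded: %s\n' "$expanded"
```

The same quoting applies to the -p value, the connection string, and the TT_CLI_PASSWORD assignment.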

Options

--bootstrap-vshard

Applicable to: bootstrap

Additionally bootstrap vshard when bootstrapping a Cartridge application.

--cartridge: Force the Cartridge orchestrator for Tarantool 2.x clusters.

--config: Force the YAML configuration orchestrator for Tarantool 3.0 or later clusters.

--custom: Force a custom orchestrator for Tarantool 2.x clusters.

--file STRING

Applicable to: bootstrap

A file with Cartridge replica sets configuration. Default: replicasets.yml in the application directory.

-f, --force

Applicable to: promote, demote, roles

Skip operation on instances not running in the same environment.

-G, --global

Applicable to: roles on Tarantool 3.x and later

Apply the operation to the global configuration scope, that is, to all instances.

-g, --group STRING

Applicable to: roles

A name of the configuration group to which the operation applies.

-i, --instance STRING

Applicable to: roles

A name of the instance to which the operation applies. Not applicable to Cartridge clusters. Learn more in Managing roles in Cartridge clusters.

-r, --replicaset STRING

Applicable to: bootstrap, roles

A name of the replica set to which the operation applies.

Restarting a Tarantool instance

$ tt restart APPLICATION[:APP_INSTANCE] [OPTION ...]

tt restart restarts the specified running Tarantool instance. A tt restart call is equivalent to consecutive calls of tt stop and tt start.

When called without arguments, restarts all running applications in the current environment.
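
As a rough illustration of the stop-then-start equivalence, the hypothetical helper below spells out a restart of one application as two explicit calls. It is a sketch, not how tt restart is implemented, and it assumes that tt is available on the PATH:

```shell
# Hypothetical helper: restart an application by hand.
# -y skips the confirmation prompt of tt stop.
restart_by_hand() {
    tt stop "$1" -y && tt start "$1"
}

# Usage (requires a tt environment):
#   restart_by_hand app
```

Because the second step is a plain tt start, it brings up every instance listed for the application, not only those that were running before.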

Options

-y, --yes: Automatically answer “Yes” to the confirmation prompt.

Examples

Restart all instances of the application stored in the app directory inside instances_enabled in accordance with the instances configuration:
```
$ tt restart app
```
Note

This call starts all application instances specified in its instances.yml, even those that were not running before the call.

Restart only the master instance of the app application with automatic confirmation:
```
$ tt restart app:master -y
```

Using the LuaRocks package manager

$ tt rocks [OPTION ...] [VAR=VALUE] COMMAND [ARGUMENT]

tt rocks provides means to manage Lua modules (rocks) via the LuaRocks package manager. tt uses its own LuaRocks installation connected to the Tarantool rocks repository.

Below are lists of supported LuaRocks flags and commands. For detailed information on their usage, refer to LuaRocks documentation.

Options

--dev: Enable the sub-repositories in rocks servers for rockspecs of in-development versions.

--server=SERVER: Fetch rocks/rockspecs from this server (takes priority over config file).

--only-server=SERVER: Fetch rocks/rockspecs from this server only (overrides any entries in the config file).

--only-sources=URL: Restrict downloads to paths matching the given URL.

--lua-dir=PREFIX: Specify which Lua installation to use.

--lua-version=VERSION: Specify which Lua version to use.

--tree=TREE: Specify which tree to operate on.

--local: Use the tree in the user’s home directory. Call tt rocks help path to learn how to enable it.

--global: Use the system tree when local_by_default is true.

--verbose: Display verbose output for the command executed.

--timeout=SECONDS: Timeout on network operations, in seconds. 0 means no timeout (wait forever). Default: 30.

Commands

`admin`	Use the luarocks-admin tool
`build`	Build and compile a rock
`config`	Query information about the LuaRocks configuration
`doc`	Show documentation for an installed rock
`download`	Download a specific rock file from a rocks server
`help`	Help on commands. Type `tt rocks help <command>` for more
`init`	Initialize a directory for a Lua project using LuaRocks
`install`	Install a rock
`lint`	Check syntax of a rockspec
`list`	List the currently installed rocks
`make`	Compile package in the current directory using a rockspec
`make_manifest`	Compile a manifest file for a repository
`new_version`	Auto-write a rockspec for a new version of a rock
`pack`	Create a rock, packing sources or binaries
`purge`	Remove all installed rocks from a tree
`remove`	Uninstall a rock
`search`	Query the LuaRocks servers
`show`	Show information about an installed rock
`test`	Run the test suite in the current directory
`unpack`	Unpack the contents of a rock
`which`	Tell which file corresponds to a given module name
`write_rockspec`	Write a template for a rockspec file

Examples

Install the rock queue from the Tarantool rocks repository:
```
$ tt rocks install queue
```
Search for the rock queue in both the Tarantool rocks repository and the default LuaRocks repository:
```
$ tt rocks search queue --server='https://luarocks.org'
```
List the documentation files for the installed rock queue:
```
$ tt rocks doc queue --list
```
Without the --list flag, this command displays documentation in the user’s default browser.

Create a *.rock file from the installed rock queue:
```
$ tt rocks pack queue
```
Unpack a *.rock file:
```
$ tt rocks unpack queue-scm-1.all.rock
```
Remove the installed rock queue:
```
$ tt rocks remove queue
```

Running code in a Tarantool instance

$ tt run [SCRIPT|-e EXPR] [OPTION ...]

tt run executes Lua code in a new Tarantool instance.

Options

-e EXPR, --evaluate EXPR: Execute the specified expression in a Tarantool instance.

-l LIB_NAME, --library LIB_NAME: Require the specified library.

-i, --interactive: Enter the interactive mode after the script execution.

-v, --version: Print the Tarantool version that is used for script execution.

Details

tt run executes arbitrary Lua code in a Tarantool instance. The code can be provided either in a Lua file, or in a string passed after the -e/--evaluate flag. When called without arguments or flags, tt run opens the Tarantool console.

If libraries are required for execution, pass their names after the -l/--library flag.

By default, a Tarantool instance started by tt run shuts down after code execution completes. To leave this instance running and continue working in its console, add the -i/--interactive flag.

Examples

Execute the app.lua file in a Tarantool instance:
```
$ tt run app.lua
```
Execute an expression in a Tarantool instance:
```
$ tt run -e "print('hi there')"
```
Execute the app.lua file in a Tarantool instance and leave it running:
```
$ tt run -i app.lua
```

Listing available Tarantool versions

$ tt search PROGRAM_NAME [OPTION ...]

tt search lists versions of Tarantool and tt that are available for installation. The possible values of PROGRAM_NAME are:

tarantool
tarantool-ee
tt

Note

For tarantool-ee, account credentials are required. Specify them in a file (see the ee section of the configuration file) or enter them interactively.

Options

--debug

Applicable to: tarantool-ee

Search for debug builds of Tarantool Enterprise Edition’s SDK.

--local-repo: Search in the local repository, which is specified in the repo section of the tt configuration file.

--version VERSION

Applicable to: tarantool-ee

Tarantool Enterprise version.

Example

List available Tarantool versions:
```
$ tt search tarantool
```
List available tt versions from the local repository:
```
$ tt search --local-repo tt
```

Starting Tarantool applications

$ tt start [APPLICATION[:APP_INSTANCE]] [OPTION ...]

tt start starts Tarantool applications. The application files must be stored inside the instances_enabled directory specified in the tt configuration file. For detailed instructions on preparing and running Tarantool applications, see Application environment and Starting and stopping instances.

To start all instances of the application stored in the app directory inside instances_enabled in accordance with its instances.yml:

$ tt start app

To start all instances of the app application appending their logs to stdout (in the interactive mode):

$ tt start -i app

To start the router instance of the app application:

$ tt start app:router

When called without arguments, starts all enabled applications in the current environment:

$ tt start

Application layout

tt start can start entire Tarantool clusters based on their YAML configurations. A cluster application directory inside instances_enabled must contain the following files:

config.yaml – a YAML configuration that defines the cluster topology and settings. It can either contain an explicit configuration in the YAML format or point to a centralized configuration storage (for Enterprise Edition).
instances.yml – a file that defines the list of cluster instances to run in the current environment.
(Optionally) *.lua files with code to load and run in the cluster.

For more information about Tarantool application layout, see Application environment.

Note

tt also supports Tarantool applications with configuration in code, which is considered a legacy approach since Tarantool 3.0. For information about using tt with such applications, refer to the Tarantool 2.11 documentation.

Running in the background

tt start runs Tarantool applications in the background and uses its own watchdog process for status checks (tt status) and application stopping (tt stop).

Important

Do not switch on the background mode using the cluster configuration (process.background: true in the YAML configuration) or code (box.cfg.background = true) in applications that you run with tt. If you start such an application with tt start, tt won’t be able to check the application status or stop it using the corresponding commands.

Integrity check

Enterprise Edition

The integrity check functionality is supported by the Enterprise Edition only.

tt start can perform initial and periodic integrity checks of the environment, application, and centralized configuration.

To enable integrity checks of environment and application files, you need to pack the application using tt pack with the --with-integrity-check option. This option generates and signs checksums of executables and configuration files in the current tt environment. Learn more in Generating files for integrity checks.

To enable integrity check of the configuration at the centralized storage, publish the configuration to this storage using tt cluster publish with the --with-integrity-check option. This option generates and signs configuration checksums and saves them to the storage. Learn more in Publishing configurations with integrity check.

To perform the integrity checks when running the application, start it with the --integrity-check global option. Its argument must be a public key matching the private key that was used for generating checksums.

$ tt --integrity-check public.pem start myapp

After such a call, tt checks the environment, application, and configuration integrity using the checksums and starts the application if the checks succeed. Then, integrity checks are performed periodically while the application is running. By default, they are performed once every 24 hours. You can adjust the integrity check period by adding the --integrity-check-interval option:

$ tt --integrity-check public.pem start myapp --integrity-check-interval 60

Additionally, Tarantool checks the integrity of the modules that the application uses at the load time, that is, when require('module') is called.

If an integrity check fails, tt stops the application.

Options

-i, --interactive

Start the application or instance in the interactive mode. In this mode, instance logs are printed to the standard output in real time.

You can use the SIGINT signal (CTRL+C) to stop tt and its child Tarantool processes in the interactive mode. No watchdog processes are created.

--integrity-check-interval NUMBER

Integrity check interval in seconds. Default: 86400 (24 hours). Set this option to 0 to disable periodic checks.
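
Because the interval is given in seconds, it can help to compute it from a more readable unit. A small sketch with a hypothetical six-hour period follows. The command is only printed, not executed, and the key and application names are the sample ones used in this section:

```shell
# Convert a period in hours to the seconds that the option expects.
hours=6
interval=$((hours * 60 * 60))

# Print the resulting call instead of running it.
echo "tt --integrity-check public.pem start myapp --integrity-check-interval $interval"
```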

Checking instance status

$ tt status [APPLICATION[:APP_INSTANCE]] [OPTION ...]

tt status prints the information about Tarantool applications and instances in the current environment. This includes:

INSTANCE – application and instance names
STATUS – instance status: running, not running, or terminated with an error
PID – process IDs
MODE – instance modes: read-write or read-only
CONFIG – the instances’ states in regard to configuration for Tarantool 3.0 or later (see config.info())
BOX – the instances’ box.info() statuses
UPSTREAM – the instances’ box.info.replication[*].upstream statuses

When called without arguments, prints the status of all enabled applications in the current environment.

Examples

Print the status of all instances of the app application:
```
$ tt status app
```
Print the status of the replica instance of the app application:
```
$ tt status app:replica
```
Pretty-print the status of the replica instance of the app application:
```
$ tt status app:replica --pretty
```

Options

-d, --details: Print detailed alerts.

-p, --pretty: Print the status as a pretty-formatted table.

Stopping a Tarantool instance

$ tt stop [APPLICATION[:APP_INSTANCE]] [OPTION ...]

tt stop stops the specified running Tarantool applications or instances. Before stopping the instances, the command prompts the user for confirmation.

When called without arguments, tt stop stops all running applications in the current environment.

Examples

Stop all instances of the app application:
```
$ tt stop app
```
Stop all instances of the app application without confirmation:
```
$ tt stop app -y
```
Stop the replica instance of the app application:
```
$ tt stop app:replica
```

Options

-y, --yes: Stop instances without confirmation.

Interacting with the Tarantool Data Grid 2

Enterprise Edition

This command is supported by the Enterprise Edition only.

$ tt tdg2 COMMAND [COMMAND_OPTION ...]

tt tdg2 enables the interaction with Tarantool Data Grid 2 clusters. COMMAND is one of the following:

export: export a TDG2 cluster’s data to a file. Learn more at Exporting data.
import: import data to a TDG2 cluster from a file. Learn more at Importing data.

Uninstalling Tarantool software

$ tt uninstall PROGRAM_NAME [VERSION]

tt uninstall uninstalls a previously installed Tarantool version.

Example

Uninstall Tarantool 2.10.4:

$ tt uninstall tarantool 2.10.4

Displaying the tt version

$ tt version

tt version shows the version of the tt utility being used.

Extending the tt functionality

The tt utility implements a modular architecture: its commands are, in fact, separate modules. When you run tt with a command, the corresponding module is executed with the given arguments.

The modular architecture makes it possible to extend the tt functionality with external modules (as opposed to internal modules that implement built-in commands). Simply put, you can write any code you want to execute from tt, pack it into an executable, and run it with a tt command:

tt my-module-name my-args

The name of the command that executes a module is the same as the name of the module’s executable.

Module description and help

Executables that implement external tt modules must have two flags:

--description – print a short description of the module. The description is shown alongside the command in the tt help.
--help – display help. The help message is shown when tt help <module_name> is called.
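
A minimal sketch of such an executable, written as a POSIX shell script, is shown below. The module name hello, its messages, and its greeting behavior are made up for illustration. Only the two required flags come from the description above:

```shell
#!/bin/sh
# Hypothetical external tt module named "hello".

hello_main() {
    case "${1:-}" in
        --description)
            # Short text for the list of commands in the tt help.
            echo "Print a greeting (example external module)"
            ;;
        --help)
            # Longer text for `tt help hello`.
            echo "Usage: tt hello [NAME]"
            echo "Prints a greeting for NAME (default: world)."
            ;;
        *)
            # The module's actual work.
            echo "Hello, ${1:-world}!"
            ;;
    esac
}

hello_main "$@"
```

Any language works as long as the result is an executable that handles both flags. A shell script is used here only because it needs no build step.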

Location

External modules must be located in the modules directory specified in the configuration file:

tt:
  modules:
    directory: path/to/modules/dir

To check if a module is available in tt, call tt help. It will show the available external modules in the EXTERNAL COMMANDS section together with their descriptions.

Overloading built-in commands

External modules can overload built-in tt commands. If you want to change the behavior of a built-in command, create an external module with the same name and your own implementation.

When tt sees two modules – an external and an internal one – with the same name, it will use the external module by default.

For example, if you want tt to show the information about your Tarantool application, write the external module version that outputs the information you need. The tt version call will execute this module instead of the built-in one:

tt version # Calls the external module if it's available

You can force the use of the internal module by running tt with the --internal or -I option. The following call will execute the built-in version even if there is an external module with the same name:

tt version -I # Calls the internal module

tt interactive console

The tt utility features a command-line console that allows executing requests and Lua code interactively on the connected Tarantool instances. It is similar to the Tarantool interactive console with one key difference: the tt console allows connecting to any available instance, both local and remote. Additionally, it offers more flexible output formatting capabilities.

Entering the console

To connect to a Tarantool instance using the tt console, run tt connect.

Specify the instance URI and the user credentials in the corresponding options:

$ tt connect 192.168.10.10:3301 -u myuser -p 'p4$$w0rD'
   • Connecting to the instance...
   • Connected to 192.168.10.10:3301

192.168.10.10:3301>

If a user is not specified, the connection is established on behalf of the guest user.

If the instance runs in the same tt environment, you can establish a local connection with it by specifying the <application>:<instance> string instead of the URI:

$ tt connect app:storage001
    • Connecting to the instance...
    • Connected to app:storage001

 app:storage001>

Local connections are established on behalf of the admin user.

To get the list of supported console commands, enter \help or ?. To quit the console, enter \quit or \q.

Console input

Similarly to the Tarantool interactive console, the tt console can handle Lua or SQL input. The default is Lua. For Lua input, the tab-based autocompletion works automatically for loaded modules.

To change the input language to SQL, run \set language sql:

app:storage001> \set language sql
app:storage001> select * from bands where id = 1
---
- metadata:
  - name: id
    type: unsigned
  - name: band_name
    type: string
  - name: year
    type: unsigned
  rows:
  - [1, 'Roxette', 1986]
...

To change the input language back to Lua, run \set language lua:

app:storage001> \set language lua
app:storage001> box.space.bands:select { 1 }
---
- - [1, 'Roxette', 1986]
...

Note

You can also specify the input language in the tt connect call using the -l/--language option:

$ tt connect app:storage001 -l sql

Console output

By default, the tt console prints the output data in the YAML format, each tuple on a new line:

app:storage001> box.space.bands:select { }
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
...

You can switch to alternative output formats – Lua or ASCII (pseudographics) tables – using the \set output console command:

app:storage001> \set output lua
app:storage001> box.space.bands:select { }
{{1, "Roxette", 1986}, {2, "Scorpions", 1965}, {3, "Ace of Base", 1987}};
app:storage001> \set output table
app:storage001> box.space.bands:select { }
+------+-------------+------+
| id   | band_name   | year |
+------+-------------+------+
| 1    | Roxette     | 1986 |
+------+-------------+------+
| 2    | Scorpions   | 1965 |
+------+-------------+------+
| 3    | Ace of Base | 1987 |
+------+-------------+------+

Note

Field names are printed since Tarantool 3.2. On earlier versions, actual names are replaced by numbered placeholders col1, col2, and so on.

The table output can be printed in the transposed format, where each tuple occupies a column instead of a row and each field occupies a row:

app:storage001> \set output ttable
app:storage001> box.space.bands:select { }
+-----------+---------+-----------+-------------+
| id        | 1       | 2         | 3           |
+-----------+---------+-----------+-------------+
| band_name | Roxette | Scorpions | Ace of Base |
+-----------+---------+-----------+-------------+
| year      | 1986    | 1965      | 1987        |
+-----------+---------+-----------+-------------+

Note

You can also specify the output format in the tt connect call using the -x/--outputformat option:

$ tt connect app:storage001 -x table

For table and ttable output, more customizations are possible with the following commands:

\set table_format – table format: default (pseudographics, or ASCII table), Markdown, or Jira-compatible format:

app:storage001> \set table_format jira
app:storage001> box.space.bands:select {}
| id | 1 | 2 | 3 |
| band_name | Roxette | Scorpions | Ace of Base |
| year | 1986 | 1965 | 1987 |

\set graphics – enable or disable graphics for table cells in the default format:

app:storage001> \set table_format default
app:storage001> \set graphics false
app:storage001> box.space.bands:select {}
 id         1        2          3
 band_name  Roxette  Scorpions  Ace of Base
 year       1986     1965       1987

\set table_column_width – maximum column width.

app:storage001> \set table_column_width 6
app:storage001> box.space.bands:select {}
 id      1       2       3
 band_n  Roxett  Scorpi  Ace of
 +ame    +e      +ons    + Base
 year    1986    1965    1987

Commands

\help, ?

Show help on the tt console.

\quit, \q

Quit the tt console.

\shortcuts

Show available keyboard shortcuts.

\set language {lua|sql}

Set the input language. Possible values:

lua (default)
sql

An analog of the tt connect option -l/--language.

\set output FORMAT, \x{l|t|T|y}, \x

Set the output format. Possible FORMAT values:

yaml (default) – each output item is a YAML object. Example: [1, 'Roxette', 1986]. Shorthand: \xy.
lua – each output tuple is a separate Lua table. Example: {{1, "Roxette", 1986}};. Shorthand: \xl.
table – the output is a table where tuples are rows. Shorthand: \xt.
ttable – the output is a transposed table where tuples are columns. Shorthand: \xT.

Note

The \x command switches the output format cyclically in the order yaml > lua > table > ttable.

The format of table and ttable output can be adjusted using the \set table_format, \set graphics, and \set table_column_width commands.

An analog of the tt connect option -x/--outputformat.

\set table_format

Set the table format if the output format is table or ttable. Possible values:

default – a pseudographics (ASCII) table.
markdown – a table in the Markdown format.
jira – a Jira-compatible table.

\set graphics {true|false}, \x{g|G}

Whether to print pseudographics for table cells if the output format is table or ttable. Possible values: true (default) and false.

The shorthands are:

\xG for true
\xg for false

\set table_column_width WIDTH, \xw WIDTH

Set the maximum printed width of a table cell content. If the length exceeds this value, it continues on the next line starting from the + (plus) sign.

Shorthand: \xw
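
The wrapping rule can be imitated with a short shell function. This is only an illustration that reproduces the sample output shown in the Console output section. It is not tt's implementation, and it assumes that each continuation line holds the + sign followed by up to WIDTH - 1 characters of ASCII text:

```shell
# Hypothetical re-creation of the cell wrapping, for illustration only.
# Usage: wrap_cell TEXT WIDTH   (WIDTH must be at least 2)
wrap_cell() {
    text=$1
    width=$2
    [ "$width" -ge 2 ] || return 1

    # The first line holds up to WIDTH characters.
    printf '%s\n' "$(printf '%s' "$text" | cut -c1-"$width")"
    rest=$(printf '%s' "$text" | cut -c"$((width + 1))"-)

    # Each continuation line starts with "+" and holds the next WIDTH - 1 characters.
    while [ -n "$rest" ]; do
        printf '+%s\n' "$(printf '%s' "$rest" | cut -c1-"$((width - 1))")"
        rest=$(printf '%s' "$rest" | cut -c"$width"-)
    done
}

wrap_cell "Ace of Base" 6
```

With a width of 6, the call prints Ace of on the first line and + Base on the second, matching the cell in the example.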

Migration from tarantoolctl to tt

tt is a command-line utility for managing Tarantool applications that replaces tarantoolctl. Starting from version 3.0, tarantoolctl is no longer shipped as a part of Tarantool distribution; tt is the only recommended tool for managing Tarantool applications from the command line.

tarantoolctl remains fully compatible with Tarantool 2.* versions. However, it doesn’t receive major updates anymore.

We recommend that you migrate from tarantoolctl to tt to ensure full support and timely updates and fixes.

System-wide configuration

tt supports system-wide environment configuration by default. If you have Tarantool instances managed by tarantoolctl in such an environment, you can switch to tt without additional migration steps or use tt along with tarantoolctl.

Example:

$ sudo tt instances
List of enabled applications:
• example

$ tarantoolctl start example
Starting instance example...
Forwarding to 'systemctl start tarantool@example'

$ tarantoolctl status example
Forwarding to 'systemctl status tarantool@example'
● tarantool@example.service - Tarantool Database Server
    Loaded: loaded (/lib/systemd/system/tarantool@.service; enabled; vendor preset: enabled)
    Active: active (running)
    Docs: man:tarantool(1)
    Main PID: 6698 (tarantool)
. . .

$ sudo tt status
• example: RUNNING. PID: 6698.

$ sudo tt connect example
• Connecting to the instance...
• Connected to /var/run/tarantool/example.control

/var/run/tarantool/example.control>

$ sudo tt stop example
• The Instance example (PID = 6698) has been terminated.

$ tarantoolctl status example
Forwarding to 'systemctl status tarantool@example'
○ tarantool@example.service - Tarantool Database Server
    Loaded: loaded (/lib/systemd/system/tarantool@.service; enabled; vendor preset: enabled)
    Active: inactive (dead)

Local configuration

If you have a local tarantoolctl configuration, create a tt environment based on the existing .tarantoolctl configuration file. To do this, run tt init in the directory where the file is located.

Example:

$ cat .tarantoolctl
default_cfg = {
    pid_file  = "./run/tarantool",
    wal_dir   = "./lib/tarantool",
    memtx_dir = "./lib/tarantool",
    vinyl_dir = "./lib/tarantool",
    log       = "./log/tarantool",
    language  = "Lua",
}
instance_dir = "./instances.enabled"

$ tt init
• Found existing config '.tarantoolctl'
• Environment config is written to 'tt.yaml'

After that, you can start managing Tarantool instances in this environment with tt:

$ tt start app1
• Starting an instance [app1]...

$ tt status app1
• app1: RUNNING. PID: 33837.

$ tt stop app1
• The Instance app1 (PID = 33837) has been terminated.

$ tt check app1
• Result of check: syntax of file '/home/user/instances.enabled/app1.lua' is OK

Commands difference

Most tarantoolctl commands look the same in tt: tarantoolctl start and tt start, tarantoolctl play and tt play, and so on. To migrate such calls, it is usually enough to replace the utility name. There can be slight differences in command flags and format. For details on tt commands, see the tt commands reference.

The following commands are different in tt:

`tarantoolctl` command	`tt` command
`tarantoolctl enter`	`tt connect`
`tarantoolctl eval`	`tt connect` with `-f` flag

Note

tt connect also covers tarantoolctl connect with the same syntax.

Example:

 # tarantoolctl enter > tt connect
 $ tarantoolctl enter app1
 connected to unix/:./run/tarantool/app1.control
 unix/:./run/tarantool/app1.control>

 $ tt connect app1
 • Connecting to the instance...
 • Connected to /home/user/run/tarantool/app1/app1.control

 # tarantoolctl eval > tt connect -f
 $ tarantoolctl eval app1 eval.lua
 connected to unix/:./run/tarantool/app1.control
 ---
 - 42
 ...

 $ tt connect app1 -f eval.lua
 ---
 - 42
 ...

 # tarantoolctl connect > tt connect
 $ tarantoolctl connect localhost:3301
 connected to localhost:3301
 localhost:3301>

 $ tt connect localhost:3301
 • Connecting to the instance...
 • Connected to localhost:3301
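
When migrating scripts, the mapping from the table can be captured in a small translation helper. The function below is a hypothetical sketch. It only prints the corresponding tt call, rewrites the two commands that differ, and passes every other command through unchanged, so any flag differences still need a manual check:

```shell
# Hypothetical helper: print the tt equivalent of a tarantoolctl call.
# Usage: to_tt COMMAND [ARGUMENT ...]
to_tt() {
    cmd=$1
    shift
    case "$cmd" in
        enter) echo "tt connect $*" ;;          # tarantoolctl enter APP
        eval)  echo "tt connect $1 -f $2" ;;    # tarantoolctl eval APP FILE
        *)     echo "tt $cmd $*" ;;             # same command name in tt
    esac
}

to_tt eval app1 eval.lua
```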

Tarantool Cluster Manager

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

Tarantool Cluster Manager (TCM) is a web-based visual tool for configuring, managing, and monitoring Tarantool EE clusters. It provides a GUI for working with clusters and individual instances, from monitoring their state to executing commands interactively in an instance’s console.

TCM is a standalone application included in the Tarantool Enterprise Edition distribution package. It is shipped as a ready-to-run executable for Linux platforms.

TCM works only with Tarantool EE clusters that use centralized configuration in etcd or a Tarantool-based configuration storage. When you create or edit a cluster’s configuration in TCM, it publishes the saved configuration to the storage. This ensures consistent and reliable configuration storage. A single TCM installation can connect to multiple Tarantool EE clusters and switch between them in one click.

To provide enterprise-grade security, TCM features its own role-based access control. You can create users and assign them roles that include required permissions. For example, a user can be an administrator of a specific cluster or only have the right to read data. LDAP authorization is supported as well.

Web interface overview

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

The Tarantool Cluster Manager web interface is available on the hostname and port defined by the http.host and http.port configuration options. If TLS is enabled, it uses the https protocol, otherwise the protocol is http. When started locally with the default configuration, TCM is available at http://127.0.0.1:8080.
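
The rule above can be expressed as a small helper. This is an illustrative sketch: the function name tcm_url and its parameters are hypothetical, and the defaults mirror the local address mentioned above. IPv6 literals are not handled.

```python
def tcm_url(host="127.0.0.1", port=8080, tls_enabled=False):
    """Build the web interface address from the http.host and http.port values."""
    # With TLS enabled the interface is served over https, otherwise over http.
    scheme = "https" if tls_enabled else "http"
    return f"{scheme}://{host}:{port}"


print(tcm_url())  # prints: http://127.0.0.1:8080
```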

Logging into TCM

To log into TCM after bootstrap, use the following credentials:

Username: admin

Password: the initial password is shown in the TCM boot log in a message like this:

Jun 11 11:24:08.900 WRN Generated super admin credentials login=admin password=jS9PsdkEJBYNhdMtSswMlxDR1vdbfc1N
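
For automated setups, the generated password can be extracted from the boot log. The sketch below assumes that the message format matches the sample line above; the function name find_initial_credentials is hypothetical, and the pattern may need adjusting if the log format differs in your TCM version.

```python
import re

# Pattern derived from the sample log line; treat it as an assumption.
_CREDENTIALS = re.compile(
    r"Generated super admin credentials login=(\S+) password=(\S+)"
)


def find_initial_credentials(log_text):
    """Return (login, password) from the first matching log line, or None."""
    for line in log_text.splitlines():
        match = _CREDENTIALS.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


sample = (
    "Jun 11 11:24:08.900 WRN Generated super admin credentials "
    "login=admin password=jS9PsdkEJBYNhdMtSswMlxDR1vdbfc1N"
)
print(find_initial_credentials(sample))
```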

After logging in with the default password:

Adjust the password policy in accordance with the security requirements that apply in your organization.
Change the admin user’s password on the User settings page.

To log out of TCM, click the user’s name in the header and click Log out.

Page structure

The TCM web interface consists of three parts:

Navigation pane on the left shows the list of pages available to the user. The navigation pane can be collapsed by clicking the cross icon at its top.
Header at the top provides access to notifications and user settings.
Working area displays the contents of the selected page.

TCM UI parts: navigation pane, header, working area

Onboarding

The Onboarding item of the navigation pane starts the interactive onboarding tutorial. Use it to get familiar with the main TCM features directly in the web interface.

Page visibility

This overview describes most TCM pages. The exact set of pages and controls available to a particular user is determined by the user’s permissions.

Some features, such as data schema editing, are available only in the development mode. You can switch to it in the user settings of the Default Admin user. To learn more about the development mode, see Development mode.

Page groups

For easier navigation, TCM pages are grouped in the navigation pane by their content. There are the following page groups:

Cluster: interaction with the selected cluster.
Clusters: interaction with all connected clusters in general.
Users: access management.
Tools: TCM administration.
Settings: runtime management of TCM settings.

Read on to learn what you can do on the pages of these groups.

Cluster

The Cluster group includes pages used for interaction with a particular cluster. To switch between clusters, click the Cluster group name and select a connected cluster from the drop-down list.

Stateboard

The cluster Stateboard is the main page for monitoring the cluster state and interacting with its instances.

On this page, you can:

view and edit the cluster topology
group and filter instances based on various criteria
view memory statistics and Tarantool versions running on instances
navigate to instance pages by clicking instance names in the cluster topology list
start and stop instances (in the development mode)
manage TDB workers, including their visibility, monitoring, and diagnostics.

Support for TDB workers in TCM is available starting with TDB 3.1.0 and TCM 1.6.0. This feature allows monitoring and management of workers within the cluster, providing full visibility into their status and metrics. Workers are automatically discovered through etcd and continuously monitored using dedicated health check endpoints. Their metrics are proxied through TCM and exposed individually, enabling comprehensive insight into the system’s state.

On the Stateboard interface, workers are displayed with clear status indicators and a details panel. Possible worker statuses include:

healthy — the worker is functioning correctly
degraded — the worker is experiencing issues but remains available
unhealthy — the worker is malfunctioning or unavailable
no connection — it is not possible to establish a connection to the worker.

Learn more about using the cluster stateboard in Viewing cluster state.

Instance page

The instance page opens when you click an instance name on the Stateboard.

It provides a set of tabs for performing actions on the selected Tarantool instance:

Details and State tabs: view instance details as a human-readable table or as a console output of box.cfg, box.info, and other built-in functions
SQL and Terminal tabs: run SQL and Lua commands on the instance. TCM provides two ways to interact with Tarantool instances:
- direct – a terminal that connects directly to a Tarantool instance using the go-tarantool library, bypassing the tt connect utility
- tt-connect – a terminal that uses the tt CLI utility to connect to a Tarantool instance
Logs tab: view instance logs
Slabs tab: view slab allocator statistics
Users tab: manage Tarantool users and roles on the instance
Funcs tab: manage and call stored functions
Metrics tab: view instance metrics

The instance page has an Actions menu at the top that allows you to:

navigate to the instance explorer
edit the instance configuration
remove the instance

Slabs tab overview

The Slabs tab in the TCM Web UI visualizes memory allocation within each Tarantool instance using the slab allocator.

This tab is useful for:

identifying memory fragmentation
analyzing slab saturation by object size
debugging excessive memory use in real time

Data source

This visualization is based on the output of:

box.slab.stats()

This function returns a Lua table with per-class (per object size) memory allocation statistics from the slab allocator. For details, see the box.slab.stats() reference.

Each entry in the output contains:

item_size: object size class
slab_count: number of slab blocks
slab_size: memory size of each slab
item_count: number of allocated objects
mem_used: bytes used
mem_free: bytes free

These values are parsed and rendered as visual elements in the UI.

Slab visualization

Each block represents a single slab (a fixed-size memory region). The color indicates how full the slab is:

Green – the slab is less than 30% full
Red – the slab is full (100% usage)
Gradient colors between green and red – intermediate fill levels (for example, 30%, 50%, 75%)

The color transitions smoothly, providing a quick visual way to understand which slabs are:

actively used
partially utilized
potentially underused or contributing to memory fragmentation

In the example screenshot:

Slab #17 (168 KB) — 75% full (dark red)
Slab #18 (320 KB) — 53% full (brownish-red)
Slab #16 (40 KB) — only 1% used (bright green)
Slab #2 (56 B) — 60% used (intermediate gradient)

Each slab block’s size in the visualization reflects the total memory allocated for its item_size class – the more memory allocated, the larger the visual representation.

Calculating fill percentage

The overall fill percentage for a slab is calculated using:

fill % = (item_count * item_size) / (slab_count * slab_size) * 100

However, each slab is visualized individually, so different fill levels across slabs will result in various colors within the same row.
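
As a worked example, the overall fill level of one size class can be computed from a single box.slab.stats() entry. This is a minimal sketch: the function name fill_percent is hypothetical, the entry is modeled as a plain dictionary with the fields listed earlier, and the numbers are invented for illustration rather than taken from a real instance.

```python
def fill_percent(entry):
    """Overall fill level of one size class, expressed as a percentage."""
    capacity = entry["slab_count"] * entry["slab_size"]
    if capacity == 0:
        # No slabs are allocated for this class yet.
        return 0.0
    return 100.0 * entry["item_count"] * entry["item_size"] / capacity


# Hypothetical values: 256 objects of 64 bytes in two 16 KB slabs.
entry = {"item_size": 64, "slab_count": 2, "slab_size": 16384, "item_count": 256}
print(fill_percent(entry))  # prints: 50.0
```

The result is an average over all slabs of the class, so individual slabs in the same row can still be shown nearly empty or completely full.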

Behavior across Tarantool instances

Slab allocation may vary between instances in the same replica set due to differences in configuration, data loading order, and use of local memory:

Each instance can use its own values for slab_alloc_factor and slab_alloc_granularity. These parameters control how memory is divided into size classes and slabs, affecting memory layout and potential fragmentation.
Differences also appear during replica join or restart. A replica allocates memory for tuples in primary index order, while on the master, allocation follows the order of incoming requests. This results in different slab structures and usually lower fragmentation on replicas after a restart.
Local and temporary spaces exist only on specific instances and are not replicated. They consume memory independently and contribute to differences in slab allocation across nodes.

Slab allocator tuning

You can fine-tune the allocator behavior with two configuration options:

slab_alloc_factor – multiplier for calculating object size classes. Default value: 1.05
slab_alloc_granularity – minimum allocation step (in bytes) for the small allocator. Default value: 8

These parameters affect how memory is allocated per object size class and can help:

reduce internal fragmentation
optimize memory usage
improve slab locality and performance
better understand memory consumption via the Slabs tab

Use cases and recommendations table:

Scenario / Goal	Parameters (`slab_alloc_factor` / `slab_alloc_granularity`)	Effect on memory	Effect on performance	Visualization in Slabs tab
Reduce memory waste (small, uniform tuples)	`1.05` / `4`	Many size classes – minimal internal memory waste	Higher overhead for managing slab pools	Many rows, partially filled blocks, gradient from green to red
Optimize performance (mixed-size tuples)	`1.3` / `16`	Fewer size classes – slightly more memory waste	Lower overhead – faster memory allocation	Fewer rows, larger blocks, color contrast: partially or filled
Control fragmentation and slab count	Task-dependent: lower values – more classes; higher values – fewer classes	Balance between internal memory waste and the number of blocks	Balance between overhead and allocator speed	Balance between number of rows and block sizes; colors indicate fill level
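
To get a feel for why lower values produce more size classes, consider the following simplified model. It is an assumption made for illustration and not the actual algorithm of the Tarantool small allocator: each class is the previous one multiplied by the factor and rounded up to the granularity step. The function name size_classes and the min_size and max_size bounds are hypothetical.

```python
import math


def size_classes(factor, granularity, min_size=16, max_size=1024):
    """Return object size classes under a simplified growth model."""
    # Align the first class to the granularity step.
    size = math.ceil(min_size / granularity) * granularity
    classes = []
    while size <= max_size:
        classes.append(size)
        # Grow by the factor, then round up to the next multiple of the step.
        grown = math.ceil(size * factor / granularity) * granularity
        # Always advance by at least one step so that the loop terminates.
        size = max(grown, size + granularity)
    return classes


fine = size_classes(1.05, 4)
coarse = size_classes(1.3, 16)
print(len(fine), len(coarse))
```

In this model, the first pair of values yields several times more classes over the same size range than the second pair, which corresponds to the "many rows" and "fewer rows" cases in the table.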

Configuration

The cluster Configuration page provides an interactive editor for the cluster configuration. It is connected to the centralized configuration storage that the cluster uses. All changes you make and apply on this page are sent to this centralized storage.

Learn more in Configuring clusters.

Security

The Security page provides controls for managing the cluster security settings.

Learn more in Security settings.

Migrations

The Migrations page provides centralized migration management tools for the selected cluster.

Learn more in Performing migrations.

Tuples

Important

The cluster-wide access to stored data on the Tuples page is supported only for sharded clusters that use the CRUD module. Starting with TCM 1.6.0, the Tuples tab is disabled by default. You can enable the tab in the TCM configuration file (tcm.yaml) using the option below:

feature:
  tuples: True

The Tuples page provides access to data stored in the user spaces of the selected cluster.

On this page, you can:

view the list of user spaces, their size, and engines
view and edit tuples stored in user spaces
search for tuples by entering a search condition in the Search bar

Search by condition

TCM supports the following comparison operators:

== – equal to
> – greater than
< – less than
>= – greater than or equal to
<= – less than or equal to

A search condition has the following structure:

index_name comparator value

where:

index_name – the name of the index. This is the left-hand side of the expression.
comparator – a comparison operator (>, >=, ==, <=, <). It must be separated from the index name and the value by spaces.
value – a string, numeric, or boolean value. This is the right-hand side of the expression. String values must be enclosed in double quotes ("").

Note

TCM does not support text search without a search condition. For example, to search for customers named Ivan in a space, use the index name and a comparison operator to specify the expression:

correct: typing name == "Ivan" in the Search bar
incorrect: typing Ivan in the Search bar

Examples

The search expression below returns tuples with IDs greater than 9990:

id > 9990

In the example below, the search returns tuples with the name index equal to Ivan:

name == "Ivan"

The example below combines multiple search conditions, separated by semicolons. The search returns all people with an ID greater than 2 who were born in 1980 or earlier.

id > 2; year <= 1980;
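
The structure of search expressions can be illustrated with a small parser. This is a sketch written for this page, not the parser that TCM uses: the function names are hypothetical, boolean literals are assumed to be written as true and false, and string values that contain semicolons are not supported.

```python
import re

# index_name, one space, comparator, one space, value
_CONDITION = re.compile(r"^(\w+) (==|>=|<=|>|<) (.+)$")


def parse_value(text):
    """Convert the right-hand side to a string, boolean, or number."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text[1:-1]
    if text in ("true", "false"):  # assumed spelling of boolean literals
        return text == "true"
    try:
        return int(text)
    except ValueError:
        return float(text)  # raises ValueError for unquoted text


def parse_search(expression):
    """Split an expression into (index_name, comparator, value) tuples."""
    conditions = []
    for part in expression.split(";"):
        part = part.strip()
        if not part:
            continue
        match = _CONDITION.match(part)
        if match is None:
            raise ValueError(f"not a valid search condition: {part!r}")
        index_name, comparator, raw = match.groups()
        conditions.append((index_name, comparator, parse_value(raw)))
    return conditions


print(parse_search("id > 2; year <= 1980;"))
# prints: [('id', '>', 2), ('year', '<=', 1980)]
```

Passing a bare word such as Ivan raises ValueError, which mirrors the rule that text search without a condition is not supported.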

TCF

The TCF tab provides an interface for clusters that run within Tarantool Clusters Federation.

The TCF tab can be added via the TCM configuration file:

# tcm.yaml
feature:
    tcf: True

You can also enable it using the environment variable or the feature command-line option. For more details, refer to the configuration reference.

On this page, you can:

view information about TCF clusters
toggle the state of clusters
promote or demote clusters
change key cluster parameters.

To open the settings, click Actions (the three dots next to the cluster status) and select Settings. Available parameters:

dml_users: list of DML users

cluster1, cluster2: cluster settings

replication_user: replication username

replication_password: password associated with the replication user

failover_timeout: time period (in seconds) to wait before initiating failover to another cluster. Default value: 20

initial_status: initial service state

max_suspect_counts: maximum suspect counts for failover. Default value: 3

health_check_delay: delay (in seconds) between health checks. Default value: 2

enable_system_check: enables or disables system-level health checks. Default value: true

status_ttl: time-to-live for service status. Default value: 4

Learn more in TCF integration.

TQE

TCM provides built-in support for monitoring and inspecting Tarantool Queue Enterprise through the web interface.

The TQE tab can be added via the TCM configuration file:

# tcm.yaml
feature:
    tqe: True

You can also enable it using the environment variable or the feature command-line option. For more details, refer to the configuration reference.

After enabling the feature, the TQE page appears in the TCM UI and provides access to Metrics and Queues pages.

Metrics can be viewed in two formats:

Chart view
Table view

The Queues page displays runtime information for each queue, including:

Latency – the time delay (ms) between a message being added to the queue and being processed.
Poll max batch – the number of messages retrieved in a single request for processing.
Deduplication mode – specifies how duplicate messages are handled. Deduplication is always enabled. Available modes: basic (default), extended, keep_latest, keep_first.

Cluster metrics

The Cluster metrics page provides access to the selected cluster’s metrics.

Learn more in Viewing cluster metrics.

Instance explorer

The instance Explorer provides access to all spaces of a specific instance, including system spaces.

On this page, you can:

view and edit instance spaces, their size, and engines
view and edit tuples stored in all spaces of the instance

Clusters

The Clusters group includes pages used for managing TCM’s cluster connections.

Clusters

The Clusters page lists Tarantool clusters that are connected to TCM.

On this page, you can:

connect Tarantool clusters to TCM
edit cluster connections
disconnect clusters

Learn more in Connecting clusters.

When managing Tarantool clusters via TCM, you can configure individual cluster settings by clicking the three-dot menu (⋯) next to the cluster name on the Clusters page and selecting Edit. This opens a dedicated configuration panel for the selected cluster.

The Config storage tab contains settings for the cluster’s configuration storage:

Provider – the type of configuration storage used by the cluster. Values: etcd, tarantool.
Prefix – the key prefix in the configuration storage under which the Tarantool cluster configuration is stored. This helps isolate multiple clusters using the same backend. Must start with a forward slash /.
Workers prefix – the key prefix used to locate TDB worker configurations in the storage backend. Must start with a forward slash /.
Endpoints – a list of URLs for the configuration storage nodes, each on a new line. These endpoints are used by TCM to connect to the storage backend and retrieve or update cluster data.
Username – the username for authenticating with the configuration storage. Used if the storage backend is secured with user authentication.
Password – the password for the specified username. Provides secure access to the configuration storage.
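
The constraints on these fields can be checked before the form is submitted. The sketch below is illustrative only: the function name validate_config_storage is hypothetical, and the requirement of at least one endpoint is an assumption that is not stated above.

```python
def validate_config_storage(provider, prefix, workers_prefix, endpoints_text):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    if provider not in ("etcd", "tarantool"):
        problems.append("Provider must be etcd or tarantool")
    if not prefix.startswith("/"):
        problems.append("Prefix must start with a forward slash")
    if not workers_prefix.startswith("/"):
        problems.append("Workers prefix must start with a forward slash")
    # Endpoints are entered one per line; blank lines are ignored.
    endpoints = [line.strip() for line in endpoints_text.splitlines() if line.strip()]
    if not endpoints:
        # Assumption: a connection needs at least one endpoint.
        problems.append("At least one endpoint is required")
    return problems


# Hypothetical values for a local etcd node.
print(validate_config_storage("etcd", "/myapp", "/workers", "http://localhost:2379"))
# prints: []
```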

ACL

The ACL page displays the TCM access control list.

On this page, you can add and delete ACL entries. Learn more in Access control list.

Users

The Users group includes pages related to user access to TCM.

Users

The Users page lists TCM users.

On this page, you can:

add, edit, and delete users
manage user secrets (passwords and API tokens)
revoke user sessions

Learn more in Users.

Roles

The Roles page lists TCM user roles.

On this page, you can add, edit, and delete roles. Learn more in Roles.

Sessions

The Sessions page lists active sessions of TCM users.

On this page, you can view and revoke sessions. Learn more in Sessions.

Tools

The Tools group includes service pages used for TCM maintenance and monitoring.

Audit log

The Audit log tab displays the TCM audit log.

TCS

The TT Column Store tab provides an interface for querying data stored in Tarantool Column Store directly from TCM.

Interface elements:

Endpoint — TCS HTTP endpoint
Query — SQL query editor for entering SELECT statements
Query button — executes the query
Clear button — clears the editor
Result table — displays returned rows and columns

The TCS page can be added via the TCM configuration file:

# tcm.yaml
feature:
    column-store: True

When enabled, the TT Column Store section appears in the left navigation panel. You can also enable it using the environment variable or the feature command-line option. For more details, refer to the configuration reference.

TT Graph DB

TCM provides built-in support for interacting with Tarantool Graph DB through the web interface.

The TT Graph DB tab can be added via the TCM configuration file:

# tcm.yaml
feature:
    ttgraph: True

You can also enable it using the environment variable or the feature command-line option. For more details, refer to the configuration reference.

After enabling the feature, the TT Graph DB page appears in the TCM UI and provides the following capabilities:

Specify the Endpoint and Query fields to send a request to a TT Graph DB instance. Click Execute to run the query. The result appears in a table below the input form.
Press Query to view the results as a graph, which lets you analyze relationships between entities.

Each row in the result table can be expanded to view detailed information about a specific record.

The graph can be opened in full size.

TCM metrics

The TCM metrics tab provides access to the TCM metrics.

Settings

The Settings group includes service pages where you can configure various TCM features.

Password policy

On the Password policy page, you can configure the requirements for user passwords, such as minimum length, required symbols, expiration, and other settings. Learn more in Password policy.

Audit settings

On the Audit settings page, you can configure how TCM records events to its audit log: whether the audit log is enabled, which events are recorded, and so on. Learn more in Audit log.

LDAP

On the LDAP page, you can manage TCM LDAP configurations.

User settings

The user settings dialog opens when you click Settings under the user’s name in the header.

This dialog includes the following tabs:

General tab: switch the color theme
Change password tab: change your password
API tokens tab: generate and delete API tokens
Sessions tab: view and revoke your user sessions
About tab: view TCM information and switch between development and production modes

Connecting clusters

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

Tarantool Cluster Manager works with clusters that:

run on Tarantool EE 3.0 or later
use centralized configuration storage: etcd or Tarantool-based.

A single TCM installation can have multiple connected clusters. A connection to TCM doesn’t affect the cluster’s functioning. You can connect clusters to TCM and disconnect them on the fly.

There are two scenarios of cluster connection to TCM:

Connect an existing cluster.
Add a new cluster and write its configuration from scratch in the TCM web interface.

In both cases, you need to deploy Tarantool and start the cluster instances using the tt CLI utility or another suitable way.

There are two ways to add a cluster to TCM:

Use the TCM web interface as described on this page.
Specify the initial-settings.clusters section of the TCM configuration. To learn more, see Initial settings.

Connection parameters

When connecting a cluster to TCM, you need to provide two sets of connection parameters: for the cluster instances and for the centralized configuration storage.

Configuration storage connection

The cluster configuration can be stored in either an etcd cluster or a separate Tarantool-based storage. In both cases, the following connection parameters are required:

A key prefix used to identify the cluster in the configuration storage. A prefix must be unique for each cluster in the storage.
URIs of all instances of the configuration storage.
The credentials for accessing the configuration storage: an etcd user or a Tarantool user.

Additionally, if SSL or TLS encryption is enabled for the configuration storage, provide the corresponding encryption configuration: keys, certificates, and other parameters. For the complete list of parameters, consult the etcd documentation or the Securing connections with SSL section of the Tarantool documentation.

Cluster connection

For interaction with the cluster instances, TCM needs the following access parameters:

A Tarantool user that exists in the cluster and their password. TCM connects to the cluster on behalf of this user.
An SSL configuration if the traffic encryption is enabled on the cluster.

Managing connected clusters

Administrators can add new clusters to TCM and edit or remove existing ones.

Connected clusters are listed on the Clusters page.

Connecting a pre-configured cluster

If you already have a cluster and want to connect it to TCM, follow these steps:

Go to Clusters and click Add.
Fill in the general cluster information:
- Specify an arbitrary name.
- Optionally, provide a description and select a color to mark this cluster in TCM.
- Optionally, enter the URLs of additional services for the cluster. For example, a Grafana dashboard that monitors the cluster metrics, or a syslog server for viewing the cluster logs. TCM provides quick access to these URLs on the cluster Stateboard page.

Provide the details of the cluster configuration storage:
- Storage type: etcd or tarantool.
- The Prefix specified in the cluster configuration.
- The URIs of the configuration storage instances.
- The credentials for accessing the configuration storage.
- The SSL/TLS parameters if the connection encryption is enabled on the storage.
Provide the credentials for accessing the cluster: a Tarantool user’s name, their password, and SSL parameters in case traffic encryption is enabled on the cluster.

Adding a new cluster

If you don’t have a cluster yet, you can add one in TCM and write its configuration from scratch using the built-in configuration editor.

Important

When adding a new cluster, you need to have a storage for its configuration up and running so that TCM can connect to it. Cluster instances can be deployed later.

To add a new cluster:

Go to Clusters and click Add.
Fill in the general cluster information:
- Specify an arbitrary name.
- Optionally, provide a description and select a color to mark this cluster in TCM.
- Optionally, enter the URLs of additional services for the cluster. For example, a Grafana dashboard that monitors the cluster metrics, or a syslog server for viewing the cluster logs. TCM provides quick access to these URLs on the cluster Stateboard page.
Select the type of the cluster configuration storage: etcd or tarantool.
Define a unique Prefix for identifying this cluster in the configuration storage.
Provide the connection details for the cluster configuration storage:
- The URIs of configuration storage instances.
- The credentials for accessing the configuration storage.
- The SSL/TLS parameters if the connection encryption is enabled on the storage.
Provide the cluster credentials: a username, a password, and SSL parameters in case traffic encryption is enabled on the cluster.

Once you add the cluster:

Set up the cluster configuration using the TCM configuration editor.
Deploy Tarantool on the cluster nodes using the tt CLI utility or other suitable tools.
Start the cluster using the tt CLI utility or other suitable tools.

Editing a connected cluster

To edit a connected cluster, go to Clusters and click Edit in the Actions menu of the corresponding table row.

Disconnecting a cluster

To disconnect a cluster from TCM, go to Clusters and click Disconnect in the Actions menu of the corresponding table row.

Note

Disconnecting a cluster does not affect its functioning. The only thing that changes is that it’s no longer shown in TCM. You can connect this cluster again at any time.

Cluster management

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

The main goal of Tarantool Cluster Manager is to provide visual tools for managing various aspects of Tarantool clusters from the browser. See the pages of this section to learn how to perform various management operations on Tarantool clusters from TCM.

Viewing cluster state

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

Tarantool Cluster Manager provides a visual interface for checking various aspects of connected clusters, such as:

topology
instance state
memory usage
data distribution
Tarantool versions

Cluster state information is available on the Cluster > Stateboard page.

Cluster topology

The cluster topology is displayed on the Stateboard page in one of two forms: a list or a graph.

List view

The list view of the cluster topology is used by default. In this view, each row contains the general information about an instance: its current state, memory usage and limit, and other parameters.

In the list view, TCM additionally displays the Tarantool version information and instance states on circle diagrams. You can click the sectors of these diagrams to filter the instances with the selected versions and states.

To switch to the list view, click the list button on the right of the search bar on the Stateboard page.

Graph view

The graph view of the cluster topology is shown in a tree-like structure where leaves are the cluster’s instances. Each instance’s state is shown by its color. You can move the graph vertices to arrange them as you like, and zoom in and out, which is helpful for larger clusters.

To switch to the graph view, click the graph button on the right of the search bar on the Stateboard page.

Instance grouping

By default, the cluster topology is shown hierarchically as it’s defined in the configuration: instances are grouped by their replica set, and replica sets are grouped by their configuration group.

For better navigation across the cluster, you can adjust the instance grouping. For example, you can group instances by their roles or custom tags defined in the configuration. A typical use of such tags is adding geographical markers to instances. In this case, you can see whether issues occur in a specific data center or server.

To change the instance grouping, click Group by in the Actions menu on the Stateboard page. Then add or remove grouping criteria.
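
Conceptually, grouping by several criteria builds a composite key for each instance. The sketch below illustrates the idea and does not show how TCM implements grouping: the function name group_instances, the dc tag, and the instance names are hypothetical.

```python
from collections import defaultdict


def group_instances(instances, criteria):
    """Group instance names by the values of the given criteria, in order."""
    groups = defaultdict(list)
    for instance in instances:
        # An instance without a value for a criterion lands in "unknown".
        key = tuple(instance.get(criterion, "unknown") for criterion in criteria)
        groups[key].append(instance["name"])
    return dict(groups)


# Hypothetical topology with a geographical tag named dc.
instances = [
    {"name": "storage-a-001", "replicaset": "storage-a", "dc": "east"},
    {"name": "storage-a-002", "replicaset": "storage-a", "dc": "west"},
    {"name": "storage-b-001", "replicaset": "storage-b", "dc": "east"},
]
print(group_instances(instances, ["dc"]))
```

Grouping by the dc tag alone puts storage-a-001 and storage-b-001 together, which makes it easy to spot issues that affect a single data center.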

Filtering

You can filter the instances shown on the Stateboard page using the search bar at the top. It has predefined filters that select:

instances with errors or warnings
leader or read-only instances
instances with no issues
stale instances

To display all instances, delete the filter applied in the search bar.

Instance details

The general information about the state of cluster instances is shown in the list view of the cluster topology. Each row contains the information about the instance status, used and available memory, read-only status, and virtual buckets for sharded clusters.

To view the detailed information about an instance or connect to it, click the corresponding row in the instances list or a vertex of the graph. On the instance page, you can find:

the instance configuration overview
current state (with warning and error messages if any)
the detailed Tarantool information returned by the instance introspection functions from box.info, box.stat, and other built-in modules
memory usage by the slab allocator
instance users and roles
stored functions
instance metrics

The page also provides Lua and SQL terminals to execute built-in functions and requests on the instance. You can choose between two Lua terminals: the tt interactive console with code completion and highlighting or the default Tarantool console.

Linked external services

When you connect a cluster to TCM, you can specify URLs of external services linked to this cluster. For example, this can be a Grafana server that monitors the cluster metrics.

All the URLs added for a cluster are available for quick access in the Actions menu on the Stateboard page.

Configuring clusters

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

Tarantool Cluster Manager features a built-in text editor for Tarantool EE cluster configurations.

When you connect a cluster to TCM, it gains access to the cluster’s centralized configuration storage: an etcd or a Tarantool cluster. TCM has both read and write access to the cluster configuration. This enables the configuration editor to work in two ways:

If a configuration already exists, the editor shows its current state.
When you change the configuration in the editor and apply changes, they are sent to the configuration storage.

To learn how to write Tarantool cluster configurations, see Configuration.

Managing a cluster’s configuration

The configuration editor is available on the Cluster > Configuration page.

To start managing a cluster’s configuration, select this cluster in the Cluster drop-down and go to the Configuration page.

A cluster configuration in TCM can consist of one or multiple YAML files. When there are multiple files, they are all considered parts of a single cluster configuration. You can use this for structuring big cluster configurations. All files that form the configuration of a cluster are listed on the left side of the Cluster configuration page.

To add a cluster configuration file, click the plus icon (+) below the page title.

To open a configuration file in the editor, click its name in the file list.

To delete a cluster configuration file, click the Delete button beside the filename.

To download a cluster configuration file, click the Download button beside the filename.

Warning

All configuration changes are discarded when you leave the Cluster configuration page. Save the configuration if you want to continue editing it later or apply it to start using it on the cluster.

Saving a configuration draft

TCM can store configuration drafts. If you want to leave an unfinished configuration and return to it later, save it in TCM. Saving applies to whole cluster configurations: it records the edits of all files, file additions, and file deletions.

To save a cluster configuration draft after editing, click Save on the Cluster configuration page.

All unsaved changes are discarded when you leave the Cluster configuration page.

If you have a saved configuration draft, you can reset the changes for each of its files individually. A reset returns the file to the state that is currently used by the cluster (that is, saved in the configuration storage). If you reset a newly added file, it is deleted.

To reset a saved configuration file, click the Reset button beside the filename.

Applying a configuration

When you finish editing a configuration and it’s ready to use, apply the updated configuration to the cluster. To apply a cluster configuration, click Apply on the Cluster configuration page. This sends the new configuration to the cluster configuration storage, and it comes into effect upon the cluster configuration reload.

Managing cluster users and roles

Tarantool Cluster Manager provides a visual interface for managing Tarantool users and roles on connected clusters.

Note

This page describes management of Tarantool users and roles on instances of connected clusters. To learn to manage TCM users, see Access control.

The Tarantool access model defines user access to entities inside a single instance. Thus, to create or alter a cluster-wide user or role, you need to do this on all cluster instances. In replication clusters, changes to the access model are possible only on read-write instances (replica set leaders). Changes made on a leader instance are propagated to all instances of its replica set automatically.

Operations on the cluster access model are possible only if the user that TCM uses to connect to the cluster has the privileges to manage users and roles.

You can also manage Tarantool users and roles from TCM using the Lua API as described in Access control. To do this, connect to instance consoles from the Terminal tab of the instance page.
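
As a rough illustration of the Lua API route, the following sketch creates a role and a user from an instance console. It assumes that the console is opened on a replica set leader and that a space named customers exists. The names app_reader and app_user and the password are hypothetical placeholders.

```lua
-- Run on a read-write instance (replica set leader).
-- 'app_reader', 'app_user', 'customers', and the password are hypothetical.

-- Create a role and allow it to read one space.
box.schema.role.create('app_reader', { if_not_exists = true })
box.schema.role.grant('app_reader', 'read', 'space', 'customers', { if_not_exists = true })

-- Create a user and assign the role to it.
box.schema.user.create('app_user', { password = 'change-me', if_not_exists = true })
box.schema.user.grant('app_user', 'app_reader', nil, nil, { if_not_exists = true })

-- Inspect the privileges the user has now.
box.schema.user.info('app_user')
```

The if_not_exists option makes the calls safe to repeat, which helps when the same statements are run on every replica set leader.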

Managing cluster users

The tools for managing cluster users are located on the Users tab of the instance page.

Important

To ensure the access model consistency across the cluster, repeat all user management operations on all read-write instances of the cluster.

To create a user on a cluster:

Go to Stateboard.
Find a replica set leader in the instances list and click it to open the instance page.
Go to the Users tab and click Add user.

To edit or delete a user, click the Edit or Delete button next to the username in the Users table.

To edit a user’s privileges:

Click the lock icon next to the username in the Users table.
In the privileges dialog:
- Click Add to grant privileges
- Click Revoke (the trash bin icon) to revoke a privilege

Managing cluster roles

The tools for managing cluster roles are located on the Users tab of the instance page.

Important

To ensure the access model consistency across the cluster, repeat all role management operations on all read-write instances of the cluster.

To create a role on a cluster:

Go to Stateboard.
Find a replica set leader in the instances list and click it to open the instance page.
Go to the Users tab and click Add role.

To delete a role, click the Delete button next to the role name in the Roles table.

To edit a role’s privileges:

Click the lock icon next to the role name in the Roles table.
In the privileges dialog:
- Click Add to grant privileges
- Click Revoke (the trash bin icon) to revoke a privilege

Security settings

Tarantool Cluster Manager includes a web interface for managing security settings of connected clusters. It is available on the Cluster > Security page. On this page, you can manage the following security features in the cluster:

Authentication settings: protocol (CHAP or PAP), number of retries, and the delay after a failed authentication attempt (security.auth_* configuration options). To learn more about Tarantool authentication settings, see Authentication.
Password policy: minimal password length, required characters, expiration period, and other settings (security.password_* configuration options). To learn more about Tarantool password policy, see Password policy.
Guest access: whether unauthenticated or guest users can connect to cluster (security.disable_guest configuration option).
Secure erasing: whether to delete data files securely so that they cannot be restored (security.secure_erasing configuration option).
Audit log: configure audit logging in the cluster (audit_log.* configuration options). To learn how to manage audit logging in the cluster, see Audit module.

Viewing cluster metrics

In Tarantool Cluster Manager, you can view metrics of connected clusters in real time on the Cluster > Cluster metrics page. The list of metrics that Tarantool exposes is provided in the Metrics reference.

Metrics are displayed one by one. To view a metric, select it in the drop-down list at the top of the page. Then, choose a way to visualize it:

Chart: a time series chart with the metric values displayed as lines.
Table: a table where the metric values are displayed as numbers in table cells.

Once you select a metric, TCM starts visualizing its current values, updating them once per second. To pause the visualization, click the button to the left of the metrics selector. To stop the visualization, clear the metric selection.

Viewing instance metrics

To view metrics of a specific instance, find this instance on the Stateboard, click its name, and go to the Metrics tab of the instance page.

Monitoring metrics with Prometheus

To allow collecting cluster metrics with external systems, such as Prometheus, TCM provides HTTP endpoints at /api/metrics/<clusterId>.

Note

Cluster IDs are shown in the cluster selection dialog that opens when you click Cluster at the top of the left navigation pane.

To access such an endpoint, a request must be authorized with an API token that has a cluster.metrics permission on the target cluster.

Below is an example of a Prometheus scrape configuration that collects metrics of a Tarantool cluster from TCM:

  - job_name: "tarantool"
    static_configs:
      - targets: ["127.0.0.1:8080"]
    metrics_path: "/api/metrics/00000000-0000-0000-0000-000000000000"
    bearer_token: QgMPZ22JZ3uw7n0QTbqYGAQDmNDs1JnTkhaC1OlQzWM3utmpV78b23GG97zp8YE3
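
For a manual check of the same endpoint outside Prometheus, the request can be assembled as in the sketch below. The base URL, cluster ID, and token are placeholders, and build_metrics_request is a hypothetical helper, not a part of TCM. The sketch assumes that the token is passed in the Authorization header with the Bearer scheme, which is what the bearer_token option makes Prometheus do.

```lua
-- Hypothetical helper: builds the URL and request options for the
-- TCM metrics endpoint of one cluster.
local function build_metrics_request(base_url, cluster_id, token)
    local url = ('%s/api/metrics/%s'):format(base_url, cluster_id)
    local opts = {
        headers = { ['Authorization'] = 'Bearer ' .. token },
        timeout = 5, -- seconds
    }
    return url, opts
end

-- Placeholder values: replace them with your TCM address, cluster ID, and API token.
local url, opts = build_metrics_request(
    'http://127.0.0.1:8080',
    '00000000-0000-0000-0000-000000000000',
    '<api-token>'
)
print(url)
```

In Tarantool, the resulting pair can be passed to require('http.client').get(url, opts), which returns a response table with the status and body fields.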

Using supervised failover

For Tarantool clusters that use supervised failover, Tarantool Cluster Manager offers tools for interaction with external failover coordinators from its web interface.

The tools for using supervised failover are located on the Failovers page available from the Actions menu on the cluster stateboard.

Note

TCM can interact with failover coordinators that are already running. There is no way to start or stop coordinators from TCM.

Viewing failover coordinators

To view failover coordinators running on the cluster, go to the Failovers tab. On this tab, you can see the information about all Tarantool instances that the cluster uses as failover coordinators. The information includes:

Current coordinator status – Active or Not active
PID – process ID
Hostname – the host on which the coordinator is running
UUID – the coordinator ID
Term – a value that defines the order in which coordinators become active (take the lock) over time.

Executing failover commands

To send a failover command to a coordinator, go to the Commands tab and click Add. Then, provide the command description in the YAML format. It can include the following fields:

command – the command name. Possible value: switch – switch master in a replica set.
new_master – the name of the instance to make the new master.
timeout – the command execution timeout.

Example:

command: switch
new_master: instance-002
timeout: 30

After entering the command, click Save to send the command for execution.

Tarantool assigns an id to the command and waits for the active coordinator to process the command.

All failover commands executed on the cluster are shown on the Commands tab with their ids and statuses. A command can have the following statuses:

taken – a failover coordinator has started the command execution.
success – the command has completed successfully.
failed – an error occurred during the command execution. A short error description is shown in the Reason field.

To see the command execution details, click this command in the list.

Performing migrations

Tarantool Cluster Manager provides a web interface for managing and performing migrations in connected clusters. To learn more about migrations in Tarantool, see Migrations.

Migrations are named Lua files with code that alters the cluster data schema, for example, creates a space, changes its format, or adds indexes. In TCM, there is a dedicated page where you can organize migrations, edit their code, and apply them to the cluster.

Important

Migrations created between Tarantool versions 1.5.3 and 1.7.3 are not compatible with the tt CLI utility, so TCM reverted to the old behavior of handling them. However, these migrations cannot be applied in TCM version 1.8.0: apply them on the corresponding Tarantool versions before upgrading to TCM 1.8.0.

Managing migrations

The tools for managing migrations from TCM are located on the Cluster > Migrations page.

To create a migration:

Click Add (the + icon) on the Migrations page.
Enter the migration name.

Important

When naming migrations, remember that they are applied in the lexicographical order. Use ordered numbers as filename prefixes to define the migrations order. For example, 001_create_table, 002_add_column, 003_create_index.
Write the migration code in the editor window. Use the box.schema module reference to learn how to work with Tarantool data schema.

Once you complete writing the migration, save it by clicking Save. This saves the migration that is currently opened in the editor.
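
The effect of the naming rule can be checked with a plain string sort. The sketch below uses hypothetical migration names and Lua's table.sort, which compares strings character by character. It assumes that TCM orders migration names in the same way.

```lua
-- Hypothetical migration names without and with zero-padded prefixes.
local unpadded = { '2_add_column', '10_create_index', '3_drop_column' }
local padded = { '002_add_column', '010_create_index', '003_drop_column' }

-- The default comparison for strings is lexicographical.
table.sort(unpadded)
table.sort(padded)

-- Without padding, "10" sorts before "2" because '1' < '2'.
print(table.concat(unpadded, ', ')) --> 10_create_index, 2_add_column, 3_drop_column

-- With padding, the lexicographical order matches the numeric order.
print(table.concat(padded, ', '))   --> 002_add_column, 003_drop_column, 010_create_index
```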

Applying migrations

After you prepare a set of migrations, apply it to the cluster. To apply all saved migrations to the cluster at once, click Apply.

Important

Applying all saved migrations at once, in the lexicographical order, is the only way to apply migrations in TCM. There is no way to select one or several migrations to apply. The migrations that are already applied are skipped. To learn how to check a migration status, see Checking migrations status.

Migrations that were created but not saved yet are not applied when you click Apply.

Checking migrations status

To check the migration results on the cluster, use the Migrated widget on the cluster stateboard. It reflects the general result of the last applied migration set:

If all saved migrations are applied successfully, the widget is green.
If any migration from this set fails on certain instances, the widget color changes to yellow.
If there are saved migrations that are not applied yet, the widget becomes gray.

Hovering a cursor over the widget shows the number of instances on which the currently saved migration set is successfully applied.

You can also check the status of each particular migration on the Migrations page. The migrations that are successfully applied are marked with green check marks. Failed migrations are marked with exclamation mark icons (!). Hover the cursor over the icon to see the information about the error. To reapply a failed migration, click Force apply in the pop-up with the error information.

Migration file example

The following migration code creates a formatted space with two indexes in a sharded cluster:

local function apply_scenario()
    local space = box.schema.space.create('customers')

    space:format {
        { name = 'id',        type = 'number' },
        { name = 'bucket_id', type = 'number' },
        { name = 'name',      type = 'string' },
    }

    space:create_index('primary', { parts = { 'id' } })
    space:create_index('bucket_id', { parts = { 'bucket_id' }, unique = false })
end

return {
    apply = {
        scenario = apply_scenario,
    },
}
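
A follow-up migration can rely on the objects created by an earlier one. The sketch below is a hypothetical second file, for example 002_add_name_index, that adds a secondary index to the customers space created above. It assumes the same return structure as the example and that the first migration has already been applied.

```lua
local function apply_scenario()
    -- The space is expected to exist after the previous migration.
    local space = box.space.customers

    -- if_not_exists keeps the migration safe to run again,
    -- for example, after a partial failure.
    space:create_index('name', {
        parts = { 'name' },
        unique = false,
        if_not_exists = true,
    })
end

return {
    apply = {
        scenario = apply_scenario,
    },
}
```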

TCF integration

Tarantool Cluster Manager provides a web interface for clusters that run within Tarantool Clusters Federation. It is available on the Cluster > TCF page. If a connected cluster is configured to run in a TCF installation, this page shows information about both clusters in this installation: their IDs, names, and statuses. To switch cluster states in TCF, click Toggle on the TCF page.

To learn more about Tarantool Clusters Federation, see its documentation.

Note

For individual clusters, the TCF page is empty.

Accessing cluster data

Tarantool Cluster Manager provides access to data stored in connected clusters through its web interface. You can view, add, edit, and delete tuples from spaces.

Note

A TCM user’s access to specific clusters and spaces is determined by their cluster permissions and access control list.

Data access is implemented in TCM on a per-instance basis: you can access data stored on one cluster instance at a time. For sharded clusters that use the CRUD module, it’s also possible to access data throughout the whole cluster.

Instance data

There are the following ways to access data stored on a cluster instance from TCM:

Instance explorer displays the instance’s spaces as tables in the web interface
SQL terminal allows executing SQL statements on the instance
Tarantool and tt consoles allow accessing the data using the Lua API

Important

Data modification is possible only on instances in the read-write mode (replica set leaders). Changes are applied to read-only replicas in accordance with the cluster topology.

Instance explorer

The instance explorer provides access through the web interface to all spaces that exist on the instance. This includes both system and user spaces.

To open the instance explorer:

Go to Stateboard.
Click the instance row in the instances list or its graph vertex in the graph view.
Click Explorer in the Actions menu of the instance details page.

To view tuples of a space, click its row in the spaces list.

To add a new tuple, click + on the space page and provide tuple field values in the Lua format, for example, [ 1, 1000, true, "test"].

To edit a tuple, click it in the table and then click Edit.

To delete a tuple, select it in the table and click Delete (the trash bin button).

In the development mode, you can also create, edit, truncate, and delete spaces in the instance explorer. To create a space, click Add and follow the wizard steps. To edit, truncate, or remove a space, click the corresponding button in the Actions menu of the space row in the table.

SQL terminal

TCM features an SQL terminal that you can use to access stored data. It is located on the SQL tab of the instance details page. In the SQL terminal, you can execute any supported SQL expressions on the selected instance.

For SELECT queries, you can also download the query result set in the CSV format.

To learn more about using SQL in Tarantool, see the SQL tutorial.

Lua API: Tarantool and tt consoles

TCM provides interactive access to instances’ consoles on the Terminal tab of the instance details page. You can choose between the tt console (TT Connect tab) and Tarantool interactive console (Direct tab).

In these consoles, you can access the stored data using the Tarantool Lua API.
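
The sketch below shows a few typical data operations that can be entered in either console. It assumes a space named customers with the fields id, bucket_id, and name. The space name and the tuple values are illustrative only.

```lua
-- Assumed schema: space 'customers' with fields (id, bucket_id, name).

-- Reads work on any instance.
box.space.customers:select({}, { limit = 10 }) -- first 10 tuples in primary key order
box.space.customers:get(1)                     -- one tuple by primary key, or nil
box.space.customers:count()                    -- number of tuples on this instance

-- Writes work only on read-write instances (replica set leaders).
if not box.info.ro then
    -- insert raises an error if a tuple with the same primary key exists.
    box.space.customers:insert({ 1, 1000, 'Alice' })
    box.space.customers:update(1, { { '=', 3, 'Alicia' } }) -- set the third field
    box.space.customers:delete(1)
end
```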

Sharded cluster data

For sharded clusters that use the CRUD module, it’s possible to access stored data throughout the cluster on the Cluster > Tuples page. This page displays only user spaces.

To view all tuples of a space in a sharded cluster, click the space row in the list.

To add a new tuple, click + on the space page and provide tuple field values in the Lua format, for example [ 1, 1000, true, "test"]. When you add a tuple in a sharded cluster, it is distributed to a replica set based on the sharding key (the bucket_id field) value.

To edit a tuple, click it in the table and then click Edit.

To delete a tuple, select it in the table and click Delete (the trash bin button).

Creating spaces in sharded clusters

To create a space in a sharded cluster, create it on all read-write cluster instances on their Instance explorer pages.

Important

Sharded spaces must include the bucket_id field of the unsigned type and a non-unique index by this field with the same name.

To edit, truncate, or delete spaces in a sharded cluster, perform the corresponding action on all read-write cluster instances.
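
If you prefer the Lua API to the wizard, a space that meets the requirement above can be created from an instance console as in the following sketch. The space name orders and its fields other than bucket_id are hypothetical. Run it on every read-write instance.

```lua
-- The do ... end block keeps the local variable visible
-- for all statements when the code is pasted into a console.
do
    local space = box.schema.space.create('orders', { if_not_exists = true })

    space:format({
        { name = 'id',        type = 'unsigned' },
        { name = 'bucket_id', type = 'unsigned' }, -- required sharding field
        { name = 'total',     type = 'number' },
    })

    space:create_index('primary', { parts = { 'id' }, if_not_exists = true })

    -- A non-unique index named after the field it covers.
    space:create_index('bucket_id', {
        parts = { 'bucket_id' },
        unique = false,
        if_not_exists = true,
    })
end
```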

Access control

Tarantool Cluster Manager provides means for managing the access of users and client applications to its own functions and connected clusters:

Local role-based access model allows flexible access management with user accounts created inside TCM.
LDAP authentication enables authentication with an external directory server.
Access control list enables fine-grained access to entities stored on connected clusters.
API tokens enable integration with third-party applications.
Sessions management allows administrators to view and revoke user sessions.

Role-based access control

Tarantool Cluster Manager features a role-based access control system. It enables flexible management of access to TCM functions, connected clusters, and stored data. The TCM access system uses three main entities: permissions, roles, and users (or user accounts). They work as follows:

Permissions correspond to specific functions or objects in TCM (administrative permissions) or operations on clusters (cluster permissions).
Roles are predefined sets of administrative permissions to assign to users.
Users have roles that define their access rights to TCM functions and objects, and cluster permissions that are assigned for each cluster individually.

Note

TCM users, roles, and permissions are not to be confused with similar subjects of the Tarantool access control system. To access Tarantool instances directly, Tarantool users with corresponding roles are required.

Permissions

Permissions define access to specific actions that users can do in TCM. For example, there are permissions to view connected clusters or to manage users.

There are two types of permissions in TCM: administrative and cluster permissions.

Administrative permissions provide access to TCM functions. They define which pages and controls are available to users in the web UI. Typically, read permissions define pages shown in the left menu. Write permissions define the availability of controls for managing objects on the pages. For example, users with read permission to clusters can view the Clusters page but they don’t see Add, Edit, or Remove buttons unless they have the write permission.

Administrative permissions are assigned to users through roles.
Cluster permissions enable actions with connected Tarantool clusters. These permissions are granted to users on a per-cluster level: each user has a separate set of permissions for each cluster.

Cluster permissions define which pages of the Cluster menu section users see and what actions they can take on these pages. For example, users with the read configuration permission to a cluster configuration see the Configuration page when this cluster is selected.

Cluster permissions are assigned to users individually when creating or editing them.

For a fine-grained control over user access to particular spaces and functions stored in clusters, there is the access control list.

Permissions are predefined in TCM; there is no way to change, add, or delete them. The complete lists of administrative and cluster permissions in TCM are provided in the Permissions reference.

Roles

Roles are groups of administrative permissions that are assigned to users together.

The assigned roles define pages that users see in TCM and actions available on these pages.

Note

Roles don’t include cluster permissions. Access to connected clusters is configured for each user individually.
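
The relationship between roles, administrative permissions, and per-cluster permissions can be modeled as in the sketch below. It is a conceptual illustration, not the TCM implementation: the role name, the user record, and the cluster IDs are hypothetical, and only the permission names come from the Permissions reference.

```lua
-- Roles group administrative permissions only.
local roles = {
    ['Cluster viewer'] = { ['admin.clusters.read'] = true },
}

-- A user has roles plus a separate permission set for each cluster.
local user = {
    roles = { 'Cluster viewer' },
    clusters = {
        ['cluster-a'] = {
            ['cluster.stateboard.read'] = true,
            ['cluster.config.read'] = true,
        },
    },
}

-- Administrative permissions come from any of the user's roles.
local function has_admin_permission(u, permission)
    for _, role_name in ipairs(u.roles) do
        local role = roles[role_name]
        if role ~= nil and role[permission] then
            return true
        end
    end
    return false
end

-- Cluster permissions are looked up for one specific cluster.
local function has_cluster_permission(u, cluster_id, permission)
    local granted = u.clusters[cluster_id]
    return granted ~= nil and granted[permission] == true
end

print(has_admin_permission(user, 'admin.clusters.read'))                  --> true
print(has_admin_permission(user, 'admin.clusters.write'))                 --> false
print(has_cluster_permission(user, 'cluster-a', 'cluster.config.read'))   --> true
print(has_cluster_permission(user, 'cluster-b', 'cluster.config.read'))   --> false
```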

Default roles

TCM comes with default roles that cover three common usage scenarios:

Super Admin Role is a default role with all available administrative permissions. Additionally, the users with this role automatically gain all cluster permissions to all clusters.
Cluster Admin Role is a default role for cluster administration. It includes administrative permissions for cluster management.
Default User Role is a default role for working with clusters. It includes basic administrative read permissions that are required to log in to TCM and navigate to a cluster.

Managing roles

Administrators can create new roles and edit or delete existing ones.

Roles are listed on the Roles page.

To create a new role, click Add, enter the role name, and select the permissions to include in the role.

To edit an existing role, click Edit in the Actions menu of the corresponding table row.

To delete a role, click Delete in the Actions menu of the corresponding table row.

Note

You can delete a role only if there are no users with this role.

Users

TCM users gain access to objects and actions through assigned roles and cluster permissions.

A user can have any number of roles or none of them. Users without roles have access only to clusters that are assigned to them.

TCM uses password authentication for users. For information on password management, see the Passwords section below.

Default admin

There is one default user: Default Admin. It has all the available permissions, both administrative and cluster ones. When new clusters are added in TCM, Default Admin automatically receives all cluster permissions for them as well.

Managing users

Administrators can create new users and edit or delete existing ones.

The tools for managing users are located on the Users page.

To create a user:

Click Add.
Fill in the user information: username, full name, and description.
Generate or type in a password.
Select roles to assign to the user.
Add clusters to give the user access to, and select cluster permissions for each of them.

To edit a user, click Edit in the Actions menu of the corresponding table row.

To delete a user, click Delete in the Actions menu of the corresponding table row.

Passwords

TCM uses the general term secret for user authentication keys. A secret is any pair of a public and a private key that can be used for authentication. A password combined with a username is a secret type used for TCM user authentication. In this case, the public key is a username, and the private key is a password.

Users receive their first passwords during their account creation.

All passwords are governed by the password policy. It can be flexibly configured to follow the security requirements of your organization.

Changing your password

To change your own password, click your name in the top-right corner and go to Settings > Change password.

Changing users’ passwords

Administrators can manage a user’s password on this user’s Secrets page. To open it, click Secrets in the Actions menu of the corresponding Users table row.

To change a user’s password, click Edit in the Actions menu of the corresponding Secrets table row and enter the new password in the New secret key field.

Password expiry

Passwords expire automatically after the expiration period defined in the password policy. When a user logs in to TCM with an expired password, the only action available to them is a password change. All other TCM functions and objects are unavailable until the new password is set.

Administrators can also set users’ passwords to expired manually. To set a user’s password to expired, click Expire in the Actions menu of the corresponding Secrets table row.

Important

Password expiration can’t be reverted.

Blocking passwords

To forbid users’ access to TCM, administrators can temporarily block their passwords. A blocked password can’t be used to log in to TCM until it’s unblocked manually or the blocking period expires.

To block a user’s password, click Block in the Actions menu of the corresponding Secrets table row. Then provide a blocking reason and enter the blocking period.

To unblock a blocked password, click Unblock in the Actions menu of the corresponding Secrets table row.

Password policy

Password policy helps improve security and comply with security requirements that can apply to your organization.

You can edit the TCM password policy on the Password policy page. There are the following password policy settings:

Minimal password length.
Do not use last N passwords.
Password expiration in days. Users’ passwords expire after this number of days since they were set. Users with expired passwords lose access to any objects and functions except password change until they set a new password.
Password expiration warning in days. After this number of days, the user sees a warning that their password expires soon.
Block after N login attempts. Temporarily block users if they enter their username or password incorrectly this number of times consecutively.
User lockout time in seconds. The time interval during which users can’t log in after using up all allowed login attempts.
Password must include. Characters and symbols that must be present in passwords:
- Lowercase characters (a-z)
- Uppercase characters (A-Z)
- Digits (0-9)
- Symbols (such as !@#$%^&*()_+№”’:,.;=][{}`?>/.)
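
To make the length and character requirements concrete, here is a minimal sketch of a checker for such rules. It is an illustration under stated assumptions, not the TCM implementation: the policy table, its field names, and the check_password function are hypothetical.

```lua
-- Hypothetical policy settings mirroring the options listed above.
local policy = {
    min_length = 8,
    require_lower = true,
    require_upper = true,
    require_digit = true,
    require_symbol = true,
}

-- Returns true if the password satisfies the policy,
-- plus a list of the violated rules.
-- Note: #password counts bytes, and %p matches ASCII punctuation only,
-- so multi-byte symbols such as № need separate handling.
local function check_password(password, p)
    local problems = {}
    if #password < p.min_length then
        table.insert(problems, 'too short')
    end
    if p.require_lower and not password:find('%l') then
        table.insert(problems, 'no lowercase character')
    end
    if p.require_upper and not password:find('%u') then
        table.insert(problems, 'no uppercase character')
    end
    if p.require_digit and not password:find('%d') then
        table.insert(problems, 'no digit')
    end
    if p.require_symbol and not password:find('%p') then
        table.insert(problems, 'no symbol')
    end
    return #problems == 0, problems
end

print(check_password('Str0ng!pass', policy)) -- first value: true
print(check_password('weak', policy))        -- first value: false
```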

Permissions reference

Administrative permissions

The following administrative permissions are available in TCM:

Permission	Description
`admin.clusters.read`	View connected clusters’ details
`admin.clusters.write`	Edit cluster details and add new clusters
`admin.users.read`	View users’ details
`admin.users.write`	Edit user details and add new users
`admin.roles.read`	View roles’ details
`admin.roles.write`	Edit roles and add new roles
`admin.addons.read`	View add-ons
`admin.addons.write`	Edit add-on flags
`admin.addons.upload`	Upload new add-ons
`admin.auditlog.read`	View audit log configuration and read audit log in TCM
`admin.auditlog.write`	Edit audit log configuration
`admin.sessions.read`	View users’ sessions
`admin.sessions.write`	Revoke users’ sessions
`admin.ldap.read`	View LDAP configurations
`admin.ldap.write`	Manage LDAP configurations
`admin.passwordpolicy.read`	View password policy
`admin.passwordpolicy.write`	Manage password policy
`admin.secrets.read`	View information about users’ secrets
`admin.secrets.write`	Manage users’ secrets: add, edit, expire, block, delete
`user.password.change`	User’s permission to change their own password
`user.api-token.read`	User’s permission to read their own API tokens information
`user.api-token.write`	User’s permission to modify their own API tokens
`admin.metrics`	Read TCM metrics
`admin.acl.read`	View the access control list (ACL)
`admin.acl.write`	Add and delete ACL entries

Cluster permissions

The following cluster permissions are available in TCM:

Permission	Description
`cluster.config.read`	View cluster configuration
`cluster.config.write`	Manage cluster configuration
`cluster.stateboard.read`	View cluster stateboard
`cluster.func.read`	View cluster’s stored functions
`cluster.func.write`	Edit cluster’s stored functions
`cluster.func.call`	Execute stored functions on cluster instances
`cluster.space.read`	Read cluster data schema
`cluster.space.write`	Modify cluster data schema
`cluster.space.data.read`	Read stored data from cluster
`cluster.space.data.write`	Edit stored data on cluster
`cluster.failover.read`	Read cluster failover information
`cluster.failover.write`	Write cluster failover commands
`cluster.terminal`	Connect to cluster instances with `tt` terminal from TCM
`cluster.sql`	Execute SQL queries
`cluster.metrics`	View cluster metrics

LDAP authentication

In addition to its internal role-based access control model, Tarantool Cluster Manager can use an external LDAP (Lightweight Directory Access Protocol) directory server for user authentication and authorization.

When LDAP authentication is enabled, TCM uses a connected LDAP directory server to authenticate users who submit the login form. TCM constructs requests to the server according to configuration parameters described on this page. Permissions of LDAP users in TCM are defined by LDAP group mapping.

Both LDAP and secure LDAPS (LDAP over TLS) protocols are supported.

Enabling LDAP authentication

LDAP authentication can be enabled using either of two configuration methods:

Enabling via CLI – set the security.auth option to include ldap in the TCM YAML config or as a CLI flag.
Enabling via web interface – starting from version 1.4.0, you can enable LDAP authentication interactively in the TCM UI.

Via CLI

To allow LDAP user authentication in TCM, enable the ldap authentication method in the security.auth configuration option before startup:

In the YAML TCM configuration:
```
security:
  auth:
    - ldap
```
In the command line:
```
$ tcm --security.auth="ldap"
```

Note

If both authentication methods – LDAP and local – are enabled, TCM tries them for each login attempt in the order they are specified in the configuration.

Via web interface

To enable LDAP authentication using the TCM web interface:

Click the user icon in the top-right corner of the screen.
Select Settings from the dropdown menu.
Navigate to the Authentication methods tab.
Check the box next to LDAP.
Save the changes.

LDAP configuration

To enable LDAP user access to TCM, create an LDAP configuration that connects TCM to the LDAP server that stores the users. An LDAP configuration defines how TCM connects to the server and queries user data. To create an LDAP configuration, go to the LDAP page in the Settings group and click Add.

To edit an LDAP configuration, click Edit in the Actions menu of the corresponding row.

To delete an LDAP configuration, click Delete in the Actions menu of the corresponding row.

General settings

Define the general configuration settings:

Enabled. Defines if the configuration is used. Turn the toggle off to stop using the configuration.

Note

If there are several enabled LDAP configurations, TCM attempts to use them for user authentication in the order they are created.
Automatically add non-existent users. By default, TCM automatically saves LDAP user information to its backend store upon their first login. Turn the toggle off if you don’t want to save users from this LDAP server.

LDAP server connection

Enter the LDAP server connection parameters:

Endpoints. URLs of the LDAP server. Example: 127.0.0.1:5056.
Request timeout. The timeout for TCM requests to the LDAP server, in seconds.
Enabled TLS. If the server uses LDAPS, turn this toggle on and specify TLS connection parameters, such as a certificate and a key file.

LDAP queries

To define how TCM queries the LDAP server for user authentication and authorization, fill in the fields of the Queries step:

Query user and Query password. Credentials of the LDAP user on behalf of which all LDAP queries are executed: a distinguished name (DN) and a password. Example DN:
```
cn=admin,cn=users,dc=tarantool,dc=io
```
Base DN. The DN of a directory that serves as a root for making all LDAP requests. Example: dc=tarantool,dc=io.
Username regex. A regular expression that defines a username template for this LDAP configuration. When a user enters their username on the login page, TCM matches it against username regular expressions of all enabled LDAP configurations and selects the one to use for this user authentication.

Example: a regex to match employee email addresses within the specified domain.
```
^([\w\-\.]+)@tarantool\.io$
```
(Optional) Template DN. A template for building a DN to send in an authentication bind request. Use the numbers in curly braces as placeholders to replace with username regex parts: {0}, {1}, and so on.

Example:
```
cn={0},cn=users,dc=tarantool,dc=io
```
When used with the Username regex shown above, it substitutes {0} with the username part of the email address (before @) entered into the login form. For example, the username user1@tarantool.io forms the following DN for bind request:
```
cn=user1,cn=users,dc=tarantool,dc=io
```
(Optional) Template query. A template for querying the LDAP server for the DN. This query is used if Template DN is not provided.
Group query template. A template for querying groups to which a user belongs for authorization purposes. Learn more in LDAP user permissions. Example:
```
(&(objectCategory=person)(objectClass=user)(cn={0}))
```
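
The interplay of Username regex and Template DN can be illustrated with a short Python sketch. It is not TCM's implementation: the helper name build_bind_dn is hypothetical, the dot in the domain is escaped, and the sketch assumes that the placeholders {0}, {1}, and so on map to the regex capture groups in order, as the Template DN example suggests.

```python
import re

# Values taken from the examples above (with the domain dot escaped).
USERNAME_REGEX = r"^([\w\-\.]+)@tarantool\.io$"
TEMPLATE_DN = "cn={0},cn=users,dc=tarantool,dc=io"


def build_bind_dn(username, regex=USERNAME_REGEX, template=TEMPLATE_DN):
    """Return the bind DN for a login name, or None if the regex does not match.

    Hypothetical helper: assumes {0}, {1}, ... correspond to capture groups in order.
    """
    match = re.match(regex, username)
    if match is None:
        return None  # this LDAP configuration would not be selected for the user
    return template.format(*match.groups())


print(build_bind_dn("user1@tarantool.io"))  # cn=user1,cn=users,dc=tarantool,dc=io
print(build_bind_dn("user1@example.com"))   # None
```

A username that does not match the regex yields no DN, which mirrors how a configuration is skipped for users outside its username template.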

LDAP user permissions

Permissions of LDAP users in TCM are defined by the groups to which they belong. You can map TCM administrative and cluster permissions to LDAP groups on the Groups step of the configuration creation.

To assign permissions to an LDAP group, click Add group. In the dialog that opens, enter the group name, for example, CN=Admins,CN=Builtin,DC=tarantool,DC=io. Then, select the administrative permissions to grant to this group in the Permissions list.

To grant cluster permissions, click Add cluster. Select a cluster and the cluster permissions to grant to the group. Save the group.

Each user has permissions of all LDAP groups to which they belong.
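
The rule that a user receives the permissions of all their groups amounts to a set union. The sketch below only illustrates that rule: the group DNs other than the Admins example and the permission name admin.users.read are assumptions made for the example, and the function name is hypothetical.

```python
# Hypothetical mapping of LDAP group DNs to TCM permissions (illustrative values).
GROUP_PERMISSIONS = {
    "CN=Admins,CN=Builtin,DC=tarantool,DC=io": {"admin.users.read", "admin.users.write"},
    "CN=Operators,CN=Builtin,DC=tarantool,DC=io": {"admin.users.read"},
}


def effective_permissions(user_groups, mapping=GROUP_PERMISSIONS):
    """Union of the permissions of every mapped group the user belongs to."""
    permissions = set()
    for group in user_groups:
        # Groups without a mapping contribute nothing.
        permissions |= mapping.get(group, set())
    return permissions


groups = [
    "CN=Operators,CN=Builtin,DC=tarantool,DC=io",
    "CN=Unmapped,CN=Builtin,DC=tarantool,DC=io",
]
print(sorted(effective_permissions(groups)))
```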

Disabling LDAP configurations

To stop using an LDAP configuration, open its Edit page and turn off the Enabled toggle.

Access control list

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

The Tarantool Cluster Manager access control list (ACL) determines user access to particular data and functions stored in clusters. You can use it to allow or deny access to specific stored objects one by one.

Each ACL entry specifies privileges that a TCM user has on a particular space or a function. There are three access privileges that can be granted in the ACL: read, write, and execute (for stored functions only). The privileges work as follows:

Spaces:
- Read: the user sees the space and its tuples on the Tuples and Explorer pages.
- Write: the user can add new and edit existing tuples of the space.
Functions:
- Read: the user sees the function on the Functions tab of the instance details page.
- Write: the user can edit or delete the function.
- Execute: the user can call the function.

Important

User access to space data and stored functions is primarily defined by the cluster permissions cluster.space.data.* and cluster.func.*. ACL only increases the access control granularity to particular objects. Make sure that users have these permissions before enabling ACL for them.

Enabling ACL for a user

To granularly manage a user’s access to particular objects in a cluster, enable the use of ACL in the user profile:

Go to Users and click Edit in the Actions menu of the corresponding table row.
In the user’s Clusters list, add a cluster on which you want to use ACL or click the pencil icon if the cluster is already on the list.
Select the Use Access Control List (ACL) checkbox and save changes.
Repeat the two previous steps for each cluster on which you want to use ACL for this user.
Click Update to save the user account.

If the user doesn’t exist yet, you can do the same when creating it.

Important

When ACL use is enabled for a user, this user loses access to all spaces and functions of the selected cluster except the ones explicitly specified in the ACL.
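
The combined effect of cluster permissions and ACL can be modeled with a small Python sketch. It is a simplified model of the rules described above, not TCM's actual code: the function name, the argument names, and the space and function names are hypothetical.

```python
def can_access(has_cluster_permission, acl_enabled, acl_entries, obj, privilege):
    """Simplified model of an access decision for one object.

    acl_entries maps (object type, object name) to a set of granted privileges.
    """
    if not has_cluster_permission:
        return False  # cluster permissions are the primary gate
    if not acl_enabled:
        return True   # without ACL, cluster permissions alone decide
    # With ACL enabled, only explicitly listed objects remain accessible.
    return privilege in acl_entries.get(obj, set())


acl = {("space", "customers"): {"read"}, ("function", "get_report"): {"read", "execute"}}

print(can_access(True, True, acl, ("space", "customers"), "read"))   # True
print(can_access(True, True, acl, ("space", "customers"), "write"))  # False
print(can_access(True, True, acl, ("space", "orders"), "read"))      # False
print(can_access(True, False, acl, ("space", "orders"), "read"))     # True
```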

Managing ACL

The tools for managing ACL are located on the ACL page.

To add an ACL entry:

Click Add.
Select a user to whom you want to grant access.
Select a cluster that stores the target object: a space or a function.
Select the target object type and enter its name.
Select the privileges you want to grant.

To delete an ACL entry, click Delete in the Actions menu of the corresponding table row.

API tokens

Tarantool Cluster Manager uses the Bearer HTTP authentication scheme with API tokens to authenticate external applications’ requests to TCM. For example, these can be Prometheus jobs that retrieve metrics of connected Tarantool clusters.

The API tokens functionality is disabled by default. To enable it, set the feature.api-token configuration option to true.

feature:
  api-token: true

Each TCM API token belongs to the user that created it and has the same access permissions. Thus, if a user has a permission to view a cluster’s metrics in TCM, this user’s API tokens can be used to read this cluster’s metrics with Prometheus.
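
A client passes the token in the Authorization header of each request. The Python sketch below only builds such a request and does not send it; the URL path and the token value are placeholders, not documented TCM endpoints or real credentials.

```python
import urllib.request


def build_tcm_request(url, token):
    """Build an HTTP request that carries a TCM API token as a Bearer credential."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder URL and token: substitute the address of your TCM instance,
# the path you need to query, and a token created in the TCM web interface.
request = build_tcm_request("http://127.0.0.1:8080/hypothetical/path", "<your-api-token>")
print(request.get_header("Authorization"))
```

To send the request, pass the object to urllib.request.urlopen().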

API tokens have expiration dates that are set during the token creation and cannot be changed.

Managing API tokens

Note

Each user, including Default Admin and other administrators, can create only their own tokens. There is no way to create a token for another user.

To create a TCM API token:

Open the user settings by clicking the user’s name in the top-right corner.
Go to the API tokens tab and click Add.
Specify the token expiration date and an optional description and click Add.

The created token is shown in a dialog.

Important

An API token is shown only once after its creation. There is no way to view it again after you close the dialog. Make sure to copy the token to a safe place.

To delete an API token, click Delete in the Actions menu of the corresponding API tokens table row.

Administrators can also view information about users’ API tokens and delete them on the Secrets page. To open a user’s secrets, click Secrets in the Actions menu of the corresponding Users table row.

Sessions

Tarantool Cluster Manager administrators can view and revoke user sessions in the web interface. All active sessions are listed on the Sessions page. To revoke a session, click Revoke in the Actions menu of the corresponding table row.

To revoke all sessions of a TCM user, go to Users and click Revoke all sessions in the Actions menu of the corresponding table row.

Audit log

Tarantool Cluster Manager provides audit logging for tracking user activity and security-related events, such as:

Successful and failed login attempts.
Access to clusters, their configurations, data models, and stored data.
Changes in the access control system: users, roles, passwords, LDAP configurations.

The complete list of TCM audit events is provided in Event types.

Note

TCM audit log records only events that happen in TCM itself. For information about Tarantool audit logging, see Audit module.

Audit logging is disabled in TCM by default. To start recording events, you need to enable and configure it.

The audit log stores event details in the JSON format. Each log entry contains the event type, description, time, impacted objects, and other information that may be used for incident investigation. The complete list of fields is provided in Structure of audit log events.

TCM also provides a built-in interface for reading and searching the audit log. For details, see Viewing audit log.

Enabling audit logging

To enable audit logging in TCM, go to Audit settings and click Enable.

To additionally send audit log events to the standard output, click Send to stdout.

Audit log configuration

TCM audit events can be logged to a local file or sent to a syslog server. To configure audit logging, go to Audit settings.

Writing to a file

To write TCM audit logs to a file:

Go to Audit settings and select the file protocol.
Specify the name of the audit log file. The file appears in the TCM working directory.
Configure the log file rotation: the maximum file size and age, and the number of files to store simultaneously.
(Optional) Enable compression of audit log files.

Configuration parameters:

Output file name. The name of the audit log file. Default: audit.log.
Max size (in MB). The maximum size of the log file before it gets rotated, in megabytes. Default: 100.
Max backups. The maximum number of stored audit log files. Default: 10.
Max age (in days). The maximum age of audit log files in days. Default: 30.
Compress. Compress audit log files into gzip archives when rotating.

Sending to syslog

If you use a centralized log management system based on syslog, you can configure TCM to send its audit log to your syslog server:

Go to Audit settings and select the syslog protocol.
Enter the syslog server URI and select the network protocol. Typically, syslogd listens on port 514 and uses the UDP protocol.
Specify the syslog logging parameters: timeout, priority, and facility.

Configuration parameters:

Protocol. The network protocol used for connecting to the syslog server. Default: udp.
Output. The syslog server URI. Default: 127.0.0.1:514 (localhost).
Timeout. The syslog write timeout in the ISO 8601 duration format. Default: PT2S (two seconds).
Priority. The syslog severity level. Default: info.
Facility. The syslog facility. Default: local0.

Selecting audit events to record

When the audit log is enabled, TCM records all audit events listed in Event types. To decrease load and make the audit log comply with specific security requirements, you can record only selected events. For example, these can be events of user account management or events of cluster data access.

To select events to record into the audit log, go to Audit settings and enter their types into the Filters field one by one, pressing the Enter key after each type.

To remove an event type from the filter list, click the cross icon beside it.

Viewing audit log

If the audit log is written to a file, you can view it in TCM on the Audit log page. On this page, you can view or search for events.

To view the details of a logged audit event, click the corresponding line in the table.

To search for an event, use the search bar at the top of the page. Note that the search is case-sensitive. For example, to find events with the ALARM severity, enter ALARM, not alarm.

Structure of audit log events

All entries of the TCM audit log include the mandatory fields listed in the table below.

Field	Description	Example
`time`	Time of the event	2023-11-23T12:05:27.099+07:00
`severity`	Event severity: `VERBOSE`, `INFO`, `WARNING`, or `ALARM`	INFO
`type`	Audit event type	user.update
`description`	Human-readable event description	Update user
`uuid`	Event UUID	f8744f51-5760-40c3-ae2d-0b4d6b44836f
`user`	UUID of the user who triggered the event	942a4f54-cf7f-4f46-80ce-3511dbbb57b7
`remote`	Remote host that triggered the event	100.96.163.226:48722
`host`	The TCM host on which the event happened	100.96.163.226:8080
`userAgent`	Information about the client application and platform that was used to trigger the event	Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
`permission`	The permission that was used to trigger the event	["admin.users.write"]
`result`	Event result: `ok` or `nok`	ok
`err`	Human-readable error description for events with `nok` result	failed to login
`fields`	Additional fields for specific event types in the key-value format	Key examples: `clusterId` in cluster-related events; `payload` in events that include sending data to the server; `username` in `current.` or `auth.` events

This is an example of an audit log entry on a successful login attempt:

{
    "time": "2023-11-23T12:01:27.247+07:00",
    "severity": "INFO",
    "description": "Login user",
    "type": "current.login",
    "uuid": "4b9c2dd1-d9a1-4b40-a448-6bef4a0e5c79",
    "user": "",
    "remote": "127.0.0.1:63370",
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "host": "127.0.0.1:8080",
    "permissions": [],
    "result": "ok",
    "fields": [
        {
            "Key": "username",
            "Value": "admin"
        },
        {
            "Key": "method",
            "Value": "null"
        },
        {
            "Key": "output",
            "Value": "true"
        }
    ]
}
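
Because entries are JSON, they are easy to process with standard tools. The Python sketch below parses an abridged copy of the entry above and turns its fields list of Key/Value pairs into a dictionary. The helper name fields_to_dict is hypothetical, and how entries are delimited in the log file is not covered here.

```python
import json

# Abridged copy of the sample entry above.
SAMPLE_ENTRY = """
{
    "time": "2023-11-23T12:01:27.247+07:00",
    "severity": "INFO",
    "type": "current.login",
    "result": "ok",
    "fields": [
        {"Key": "username", "Value": "admin"},
        {"Key": "method", "Value": "null"},
        {"Key": "output", "Value": "true"}
    ]
}
"""


def fields_to_dict(entry):
    """Convert the list of Key/Value objects into a plain dictionary."""
    return {item["Key"]: item["Value"] for item in entry.get("fields", [])}


entry = json.loads(SAMPLE_ENTRY)
fields = fields_to_dict(entry)
print(entry["type"], entry["result"], fields["username"])  # current.login ok admin
```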

Event types

The following table lists all possible values of the type field of TCM audit log events.

Event type	Description
`auth.fail`	Authentication failed
`auth.ok`	Authentication successful
`access.denied`	An attempt to access an object without the required permission
`crud.insert`	Data inserted via CRUD operations
`crud.delete`	Data deleted via CRUD operations
`user.add`	User added
`user.update`	User updated
`user.delete`	User deleted
`secret.add`	User secret added
`secret.update`	User secret updated
`secret.block`	User secret blocked
`secret.unblock`	User secret unblocked
`secret.delete`	User secret deleted
`secret.expire`	User secret expired
`session.revoke`	Session revoked
`session.revokeuser`	All user’s sessions revoked
`explorer.insert`	Data inserted in a cluster
`explorer.delete`	Data deleted from a cluster
`test.devmode`	Switched to development mode
`auditlog.config`	Audit log configuration changed
`passwordpolicy.save`	Password policy changed
`passwordpolicy.resetpasswords`	All passwords are expired by an administrator
`ddl.save`	Cluster data model saved
`ddl.apply`	Cluster data model applied
`cluster.config.save`	Cluster configuration saved
`cluster.config.reset`	Saved cluster configuration reset
`cluster.config.apply`	Cluster configuration applied
`current.logout`	User logged out of their own session
`current.revoke`	User revoked their own session
`current.revokeall`	User revoked all their active sessions
`current.changepassword`	User changed their password
`role.add`	Role added
`role.update`	Role updated
`role.delete`	Role deleted
`cluster.add`	Cluster added
`cluster.update`	Cluster updated
`cluster.delete`	Cluster removed
`ldap.testlogin`	Login test executed for an LDAP configuration
`ldap.testconnection`	Connection test executed for an LDAP configuration
`ldap.add`	LDAP configuration added
`ldap.update`	LDAP configuration updated
`ldap.delete`	LDAP configuration deleted
`addon.enable`	Add-on enabled
`addon.disable`	Add-on disabled
`addon.delete`	Add-on removed
`tcmstate.save`	Low-level information saved in the TCM storage (for debug purposes)
`tcmstate.delete`	Low-level information deleted from the TCM storage (for debug purposes)

Configuration

This topic describes how to configure Tarantool Cluster Manager. For the complete list of TCM configuration parameters, see the TCM configuration reference.

Note

To learn about Tarantool cluster configuration, see Configuration.

Configuration structure

Tarantool Cluster Manager configuration is a set of parameters that define various aspects of TCM functioning. Parameters are grouped by the particular aspect that they affect. There are the following groups:

HTTP
logging
configuration storage
security
add-ons
limits
TCM running mode

Parameter groups can be nested. For example, in the http group there are tls and websession-cookie groups, which define TLS encryption and cookie settings.

Parameter names are the full paths from the top-level group to the specific parameter. For example:

http.host is the host parameter that is defined directly in the http group.
http.tls.enabled is the enabled parameter that is defined in the tls nested group within http.
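
The relation between nested groups and full parameter names can be shown by flattening a nested structure into dotted paths. This Python sketch is only an illustration of the naming scheme; the flatten function is hypothetical and the values are examples.

```python
def flatten(config, prefix=""):
    """Map a nested configuration to {full parameter name: value}."""
    params = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            params.update(flatten(value, path))  # descend into a nested group
        else:
            params[path] = value
    return params


config = {"http": {"host": "127.0.0.1", "tls": {"enabled": False}}}
print(flatten(config))  # {'http.host': '127.0.0.1', 'http.tls.enabled': False}
```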

Ways to pass configuration parameters

There are three ways to pass TCM configuration parameters:

a YAML file
environment variables
command-line options of the TCM executable

YAML file

TCM configuration can be stored in a YAML file. Its structure must reflect the configuration parameters hierarchy.

The example below shows a fragment of a TCM configuration file:

# a fragment of a YAML configuration file
cluster: # top-level group
    on-air-limit: 4096
    connection-rate-limit: 512
    tarantool-timeout: 10s
    tarantool-ping-timeout: 5s
http: # top-level group
    basic-auth: # nested group
        enabled: false
    network: tcp
    host: 127.0.0.1
    port: 8080
    request-size: 1572864
    websocket: # nested group
        read-buffer-size: 16384
        write-buffer-size: 16384
        keepalive-ping-interval: 20s
        handshake-timeout: 10s
        init-timeout: 15s

To start TCM with a YAML configuration, pass the location of the configuration file in the -c command-line option:

$ tcm -c=config.yml

Environment variables

TCM can take values of its configuration parameters from environment variables. A variable name consists of the TCM_ prefix followed by the full path to the parameter in upper case, with all delimiters replaced by underscores (_). Examples:

TCM_HTTP_HOST is a variable for the http.host parameter.
TCM_HTTP_WEBSESSION_COOKIE_NAME is a variable for the http.websession-cookie.name parameter.

The example below shows how to start TCM with configuration parameters passed in environment variables:

$ export TCM_HTTP_HOST=0.0.0.0
$ export TCM_HTTP_PORT=8888
$ tcm
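
The naming rule for environment variables can be expressed as a small Python helper. This is a sketch, not part of TCM: the function name is hypothetical, and it assumes that the delimiters in parameter paths are periods and hyphens, as in the examples above.

```python
def env_var_name(param_path):
    """Derive the environment variable name for a TCM configuration parameter."""
    # Assumption: periods and hyphens are the only delimiters in parameter paths.
    return "TCM_" + param_path.upper().replace(".", "_").replace("-", "_")


print(env_var_name("http.host"))                    # TCM_HTTP_HOST
print(env_var_name("http.websession-cookie.name"))  # TCM_HTTP_WEBSESSION_COOKIE_NAME
```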

Command-line arguments

The TCM executable has a command-line option with the -- prefix for each configuration parameter. Option names reflect the full path to the parameter, with configuration levels separated by periods (.). Examples:

--http.host is an option for http.host.
--http.websession-cookie.name is an option for http.websession-cookie.name.

The example below shows how to start TCM with configuration parameters passed in command-line options:

$ tcm --storage.etcd.embed.enabled --addon.enabled --http.host=0.0.0.0 --http.port=8888
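
When TCM is launched from a script, the command line can be assembled from a dictionary of parameters. The Python sketch below reproduces the command above; the helper name is hypothetical, and it covers only enabled boolean flags and options with values, because those are the forms shown in this section.

```python
def to_cli_args(params):
    """Convert {full parameter name: value} into tcm command-line options."""
    args = []
    for path, value in params.items():
        if value is True:
            args.append(f"--{path}")  # enabled boolean flag, as in the example above
        elif value is False:
            # Disabled flags are outside the scope of this sketch.
            raise ValueError(f"disabled flag is not covered by this sketch: {path}")
        else:
            args.append(f"--{path}={value}")
    return args


params = {
    "storage.etcd.embed.enabled": True,
    "addon.enabled": True,
    "http.host": "0.0.0.0",
    "http.port": 8888,
}
print(" ".join(["tcm"] + to_cli_args(params)))
```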

Configuration precedence

TCM configuration options are applied from multiple sources with the following precedence, from highest to lowest:

tcm executable arguments.
TCM_* environment variables.
Configuration from a YAML file.

If the same option is defined in two or more locations, the option with the highest precedence is applied. For options that aren’t defined in any location, the default values are used.
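
The precedence rule behaves like a layered lookup in which the first source that defines an option wins. The following Python sketch models it with collections.ChainMap. It is an illustration only: the function name is hypothetical, and the values, including the ones labeled as defaults, are made up for the example.

```python
from collections import ChainMap


def resolve(cli, env, yaml_file, defaults):
    """Layer configuration sources from highest to lowest precedence."""
    return ChainMap(cli, env, yaml_file, defaults)


settings = resolve(
    cli={"http.port": 8888},                             # command-line options
    env={"http.port": 9999, "http.host": "0.0.0.0"},     # TCM_* variables
    yaml_file={"http.host": "127.0.0.1", "http.network": "tcp"},
    defaults={"http.port": 8080, "http.request-size": 1572864},  # illustrative values
)

print(settings["http.port"])          # 8888, the command line wins over the environment
print(settings["http.host"])          # 0.0.0.0, the environment wins over the YAML file
print(settings["http.network"])       # tcp, defined only in the YAML file
print(settings["http.request-size"])  # 1572864, falls back to the default
```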

You can combine different ways of TCM configuration for efficient management of multiple TCM installations:

A single YAML file for all installations can contain the common configuration parts. For example, a single configuration storage that is used for all installations, or TLS settings.
Environment variables that set specific parameters for each server, such as local directories and paths.
Command-line options for parameters that must be unique for different TCM instances running on a single server. For example, http.port.

Configuration parameter types

TCM configuration parameters have Go language types. Note that this is different from the Tarantool configuration parameters, which have Lua types.

Most options have Go's basic types: int and other numeric types, bool, and string.

http:
    basic-auth:
        enabled: false # bool
    network: tcp # string
    host: 127.0.0.1 # string
    port: 8080 # int
    request-size: 1572864 # int64

Parameters that can take multiple values are arrays. In YAML, they are passed as YAML arrays: each item on a new line, starting with a dash.

storage:
    provider: etcd
    etcd:
        endpoints: # array
            - https://192.168.0.1:2379 # item 1
            - https://192.168.0.2:2379 # item 2

Note

In environment variables and command line options, such arrays are passed as semicolon-separated strings of items.
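
For example, a list of endpoints can be converted to the semicolon-separated form as sketched below in Python. The helper names are hypothetical; the variable name TCM_STORAGE_ETCD_ENDPOINTS follows from the environment variable naming rule applied to storage.etcd.endpoints.

```python
def encode_array(items):
    """Join array items into the semicolon-separated form."""
    return ";".join(items)


def decode_array(value):
    """Split a semicolon-separated string back into a list of items."""
    return value.split(";") if value else []


endpoints = ["https://192.168.0.1:2379", "https://192.168.0.2:2379"]
print(f"TCM_STORAGE_ETCD_ENDPOINTS={encode_array(endpoints)}")
```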

Parameters that set timeouts, TTLs, and other duration values have Go's time.Duration type. Their values can be passed as time-formatted strings such as 4h30m25s.

cluster:
    tarantool-timeout: 10s # duration
    tarantool-ping-timeout: 5s # duration
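
To check what a duration string means, you can convert it to a number of seconds. The Python sketch below handles only a subset of Go's duration syntax (the h, m, s, and ms units) and is not a full reimplementation of Go's parser; the function name is hypothetical.

```python
import re
from datetime import timedelta

# Seconds per unit; only a subset of the units accepted by Go is covered.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
# "ms" comes first in the alternation so that it is not read as "m" followed by "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")
_WHOLE = re.compile(r"(?:\d+(?:\.\d+)?(?:ms|h|m|s))+")


def parse_duration(text):
    """Parse strings such as '4h30m25s' or '10s' into a timedelta."""
    if not _WHOLE.fullmatch(text):
        raise ValueError(f"unsupported duration: {text!r}")
    seconds = sum(float(number) * _UNITS[unit] for number, unit in _TOKEN.findall(text))
    return timedelta(seconds=seconds)


print(parse_duration("4h30m25s"))  # 4:30:25
print(parse_duration("10s"))       # 0:00:10
```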

Finally, there are parameters whose values are constants defined in Go packages. For example, http.websession-cookie.same-site values are constants of Go's http.SameSite type. To find out the exact values available for such parameters, refer to the Go packages documentation.

http:
    websession-cookie:
        same-site: SameSiteStrictMode

Creating a configuration template

You can create a YAML configuration template for TCM with all parameters and their default values using the generate-config command of the tcm executable.

To write a default TCM configuration to the tcm.example.yml file, run:

$ tcm generate-config > tcm.example.yml

Initial settings

You can use YAML configuration files to create entities in TCM automatically upon the first start. These entities are defined in the initial-settings section of the configuration file.

Important

The initial settings are applied only once upon the first TCM start. Further changes are not applied upon TCM restarts.

Clusters

To add clusters to TCM upon the first start, specify their settings in the initial-settings.clusters configuration section.

The initial-settings.clusters section is an array whose items describe separate clusters, for example:

initial-settings:
  clusters:
    - name: Cluster 1
      description: First cluster
      # cluster settings
    - name: Cluster 2
      description: Second cluster
      # cluster settings

In this configuration, you can specify all cluster settings that you define when connecting clusters through the TCM web interface. This includes:

the cluster name
description
additional URLs
configuration storage connection
Tarantool instances connection
and other settings.

For the full list of cluster configuration parameters, see the initial-settings.clusters reference. For example, this is how you add a cluster that uses an etcd configuration storage:

initial-settings:
  clusters:
    - name: My cluster
      description: Cluster description
      urls:
      - label: Test
        url: http://example.com
      storage-connection:
        provider: etcd
        etcd-connection:
          endpoints:
            - http://127.0.0.1:2379
          username: ""
          password: ""
          prefix: /cluster1
        tarantool-connection:
          username: guest
          password: ""

By default, TCM contains a cluster named Default cluster with ID 00000000-0000-0000-0000-000000000000. You can use this ID to modify the default cluster settings upon the first TCM start. For example, rename it and add its connection settings:

initial-settings:
  clusters:
    - id: 00000000-0000-0000-0000-000000000000
      name: My cluster
      storage-connection:
        provider: etcd
        etcd-connection:
          endpoints:
            - http://127.0.0.1:2379
          username: etcd-user
          password: secret
          prefix: /cluster1
        tarantool-connection:
          username: guest
          password: ""

Backend store

Tarantool Cluster Manager uses an underlying data store (backend store) for its entities: users, roles, cluster connections, settings, and other objects that you manipulate in TCM. The backend store can be either an etcd or a Tarantool cluster.

For better reliability and scalability, the backend store works independently from TCM. For example, it can be the same etcd or Tarantool cluster that you use as a centralized configuration storage. This makes TCM stateless: all objects created or modified in its web UI are saved to the backend store, and nothing is stored inside the TCM instances themselves. Any number of TCM instances connected to the same backend store can duplicate each other. If you stop all instances, the store still contains their objects. You can continue working with them right after starting a new instance.

In addition to using an external backend store, you can run TCM with an embedded etcd or Tarantool instance to use as the backend store.

On this page, you will learn how to connect TCM to backend stores of both types and how to start TCM with an embedded backend store.

Setting up a backend store

The TCM backend store requires the same configuration as Tarantool centralized configuration storage. Follow the instructions in Setting up a configuration storage to set up a backend store.

Note

If you already have the centralized configuration store for your Tarantool clusters, you can use it as a TCM backend store as well.

Configuring backend store connection

TCM's connection to its backend store is configured using the storage.* configuration options. The storage.provider option selects the store type. It can be either etcd or tarantool.

External etcd store

To use an etcd cluster as a TCM backend store, set the storage.provider option to etcd and specify connection parameters in storage.etcd.* options. A minimal etcd configuration includes the storage endpoints:

storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379

If authentication is enabled in etcd, specify storage.etcd.username and storage.etcd.password:

storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379
    username: etcduser
    password: secret

The TCM data is stored in etcd under the prefix specified in storage.etcd.prefix. By default, the prefix is /tcm. If you want to change it or store data of different TCM instances separately in one etcd cluster, set the prefix explicitly:

storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379
    prefix: /tcm2

Other storage.etcd.* options configure various aspects of the etcd store connection, such as network timeouts and limits or TLS parameters. For the full list of the etcd TCM backend store options, see the TCM configuration reference.

External Tarantool-based store

To use a Tarantool cluster as a TCM backend store, set the storage.provider option to tarantool and specify connection parameters in storage.tarantool.* options. A minimal configuration includes one or more addresses of the backend store instances:

storage:
  provider: tarantool
  tarantool:
    addr: http://127.0.0.1:3301

Or:

storage:
  provider: tarantool
  tarantool:
    addrs:
      - http://127.0.0.1:3301
      - http://127.0.0.1:3302
      - http://127.0.0.1:3303

If authentication is enabled in the backend store, specify storage.tarantool.username and storage.tarantool.password:

storage:
  provider: tarantool
  tarantool:
    addr: http://127.0.0.1:3301
    username: tarantooluser
    password: secret

The TCM data is stored in the Tarantool-based backend store under the prefix specified in storage.tarantool.prefix. By default, the prefix is /tcm. If you want to change it or store data of different TCM instances separately in one Tarantool cluster, set the prefix explicitly:

storage:
  provider: tarantool
  tarantool:
    addr: http://127.0.0.1:3301
    username: tarantooluser
    password: secret
    prefix: /tcm2

Other storage.tarantool.* options configure various aspects of TCM connection to the Tarantool-based backend store, such as network timeouts and limits or TLS parameters. For the full list of the Tarantool-based TCM backend store options, see the TCM configuration reference.

Embedded backend store

For development purposes, you can start TCM with an embedded backend store. This is useful for local runs when you don’t have or don’t need an external backend store.

Important

Do not use the embedded backend stores in production environments.

An embedded TCM backend store is a single instance of etcd or Tarantool that is started automatically on the same host during the TCM startup. It runs in the background until TCM is stopped. The embedded backend store is persistent: if you start TCM again with the same backend store configuration, it restores the TCM data from the previous runs.

Note

To start a clean instance of TCM, remove the working directory of the embedded backend store specified in the storage.etcd.embed.workdir or storage.tarantool.embed.workdir option.

The embedded backend store parameters are configured using the storage.etcd.embed.* options for etcd or storage.tarantool.embed.* options for a Tarantool-based store.

To start TCM with an embedded etcd with default settings, set storage.etcd.embed.enabled to true and leave other storage.* options default:

storage.etcd.embed.enabled: true

You can use the following call to get TCM running with embedded etcd without a configuration file:

$ tcm --storage.etcd.embed.enabled

To start TCM with an embedded Tarantool storage with default settings:

set storage.provider to tarantool
set storage.tarantool.embed.enabled to true

storage:
  provider: tarantool
  tarantool.embed.enabled: true

With command-line arguments:

$ tcm --storage.provider=tarantool --storage.tarantool.embed.enabled

You can tune the embedded backend store, for example, enable and configure TLS on it or change its working directories or startup arguments. To set specific parameters, specify the corresponding storage.etcd.embed.* or storage.tarantool.embed.* options. For the full list of configuration options of embedded backend stores, see the TCM configuration reference.

Setting up a cluster of embedded backend stores

To simulate the production environment, you can form a distributed multi-instance cluster from embedded stores of multiple TCM instances. To do this, configure the embedded stores of all TCM instances to join one another.

For etcd, provide the embedded store clustering parameters storage.etcd.embed.* and specify the endpoints in storage.etcd.endpoints. The options that configure embedded etcd mostly match the etcd configuration options. For more information about these options, see the etcd documentation.

Below are example configurations of three TCM instances that start with embedded etcd instances and form an etcd cluster from them:

First instance:

http:
  port: 8080
storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379
      - http://127.0.0.1:22379
      - http://127.0.0.1:32379
    embed:
      enabled: true
      name: infra1
      endpoints:
        - http://127.0.0.1:2379
      advertises:
        - http://127.0.0.1:2379
      initial-cluster-state: new
      initial-cluster: "infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380"
      initial-cluster-token: etcd-cluster-1
      peer-endpoints:
        - http://127.0.0.1:12380
      peer-advertises:
        - http://127.0.0.1:12380
      workdir: node1.etcd

Second instance:

http:
  port: 8081
storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379
      - http://127.0.0.1:22379
      - http://127.0.0.1:32379
    embed:
      enabled: true
      name: infra2
      endpoints:
        - http://127.0.0.1:22379
      advertises:
        - http://127.0.0.1:22379
      initial-cluster-state: new
      initial-cluster: "infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380"
      initial-cluster-token: etcd-cluster-1
      peer-endpoints:
        - http://127.0.0.1:22380
      peer-advertises:
        - http://127.0.0.1:22380
      workdir: node2.etcd

Third instance:

http:
  port: 8082
storage:
  provider: etcd
  etcd:
    endpoints:
      - http://127.0.0.1:2379
      - http://127.0.0.1:22379
      - http://127.0.0.1:32379
    embed:
      enabled: true
      name: infra3
      endpoints:
        - http://127.0.0.1:32379
      advertises:
        - http://127.0.0.1:32379
      initial-cluster-state: new
      initial-cluster: "infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380"
      initial-cluster-token: etcd-cluster-1
      peer-endpoints:
        - http://127.0.0.1:32380
      peer-advertises:
        - http://127.0.0.1:32380
      workdir: node3.etcd
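
The three configurations above differ only in the node name, the client and peer URLs, and the working directory, while initial-cluster and storage.etcd.endpoints are identical on every node. The following Python sketch illustrates that relationship by deriving each node's storage.etcd section from a single node list. The helper name etcd_section and the NODES table are hypothetical and are not part of TCM; the values are copied from the examples above.

```python
# Hypothetical helper, not part of TCM: derives the storage.etcd section
# of each TCM instance from one shared node list.
NODES = [
    # (name, client URL, peer URL), copied from the example configurations
    ("infra1", "http://127.0.0.1:2379", "http://127.0.0.1:12380"),
    ("infra2", "http://127.0.0.1:22379", "http://127.0.0.1:22380"),
    ("infra3", "http://127.0.0.1:32379", "http://127.0.0.1:32380"),
]


def etcd_section(index, nodes=NODES, token="etcd-cluster-1"):
    """Return the storage.etcd mapping for the node at position `index`."""
    name, client_url, peer_url = nodes[index]
    return {
        # Every TCM instance lists the client URLs of all etcd nodes.
        "endpoints": [client for _, client, _ in nodes],
        "embed": {
            "enabled": True,
            "name": name,
            "endpoints": [client_url],
            "advertises": [client_url],
            "initial-cluster-state": "new",
            # The same name=peer-URL list is used on every node.
            "initial-cluster": ",".join(f"{n}={peer}" for n, _, peer in nodes),
            "initial-cluster-token": token,
            "peer-endpoints": [peer_url],
            "peer-advertises": [peer_url],
            "workdir": f"node{index + 1}.etcd",
        },
    }


sections = [etcd_section(i) for i in range(len(NODES))]
```

Each returned mapping can be serialized to YAML under the storage.etcd key of the corresponding instance configuration.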

To set up a cluster from embedded Tarantool-based backend stores:

Specify the Tarantool cluster configuration in storage.tarantool.embed.config (as plain text) or storage.tarantool.embed.config-filename (as a YAML file).
Assign an instance name from this configuration to each embedded store using storage.tarantool.embed.args.

Below are example configurations of three TCM instances that start with embedded Tarantool-based backend stores and form a cluster from them:

First instance:

http:
  port: 8080
storage:
  provider: tarantool
  tarantool:
    addrs:
      - http://127.0.0.1:3301
      - http://127.0.0.1:3302
      - http://127.0.0.1:3303
    embed:
      enabled: true
      executable: /path/to/execfile/tarantool-enterprise/tarantool
      config-filename: config.yml
      workdir: node1.tarantool
      args:
        - --name
        - instance-001
        - --config
        - config.yml

Second instance:

http:
  port: 8081
storage:
  provider: tarantool
  tarantool:
    addrs:
      - http://127.0.0.1:3301
      - http://127.0.0.1:3302
      - http://127.0.0.1:3303
    embed:
      enabled: true
      executable: /path/to/execfile/tarantool-enterprise/tarantool
      config-filename: config.yml
      workdir: node2.tarantool
      args:
        - --name
        - instance-002
        - --config
        - config.yml

Third instance:

http:
  port: 8082
storage:
  provider: tarantool
  tarantool:
    addrs:
      - http://127.0.0.1:3301
      - http://127.0.0.1:3302
      - http://127.0.0.1:3303
    embed:
      enabled: true
      executable: /path/to/execfile/tarantool-enterprise/tarantool
      config-filename: config.yml
      workdir: node3.tarantool
      args:
        - --name
        - instance-003
        - --config
        - config.yml

Development mode

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

Tarantool Cluster Manager provides a special mode intended for use during development. This mode extends the web interface with capabilities that are helpful in development or testing environments, such as starting and stopping instances or promoting an instance.

Enabling development mode

You can enable TCM development mode in different ways: in its web interface, in the configuration file, using an environment variable, or using a command-line option.

Web interface

To enable development mode on a running TCM instance, use its web interface:

Open user settings: click Settings under the user name in the header.
Go to the About tab.
Click the toggle button beside tcm/mode.

Configuration file

To start TCM in development mode, specify the mode: development option in its configuration file:

# tcm_config.yaml
mode: development

Command-line option

To start TCM in development mode, specify the --mode=development command-line option:

$ tcm --mode=development

Environment variable

To make new TCM instances start in development mode by default, set the TCM_MODE environment variable to development:

$ export TCM_MODE=development
$ tcm

Configuration reference

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

This topic describes configuration parameters of Tarantool Cluster Manager.

TCM configuration parameters are divided into the following groups:

cluster
http
log
storage
addon
limits
security
mode
feature
initial-settings

cluster

The cluster group defines parameters of TCM interaction with connected Tarantool clusters.

connection-rate-limit
tarantool-timeout
tarantool-ping-timeout
tt-command
refresh-state-period
refresh-state-timeout
discovery-period
sharding-index
skew-time
fragmentation-threshold

cluster.connection-rate-limit¶: A rate limit for connections to Tarantool instances.

Type: uint

Default: 512

Environment variable: TCM_CLUSTER_CONNECTION_RATE_LIMIT

Command-line option: --cluster.connection-rate-limit
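
Most entries in this reference appear to follow one naming pattern: the environment variable is the parameter name in upper case with dots and hyphens replaced by underscores and a TCM_ prefix, and the command-line option is the parameter name prefixed with two hyphens. The Python sketch below expresses that observed pattern. The function names are hypothetical, and the pattern is an inference from the listed entries rather than a documented rule; some entries deviate from it (for example, log.outputs is listed with --log-outputs), so the value given in each entry takes precedence.

```python
def env_var_name(param):
    """Derive the environment variable name from a parameter name.

    Example: cluster.tarantool-timeout -> TCM_CLUSTER_TARANTOOL_TIMEOUT
    """
    return "TCM_" + param.replace(".", "_").replace("-", "_").upper()


def cli_option(param):
    """Derive the command-line option from a parameter name."""
    return "--" + param


name = "cluster.connection-rate-limit"
derived = (env_var_name(name), cli_option(name))
```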

cluster.tarantool-timeout¶: A timeout for receiving a response from Tarantool instances.

Type: time.Duration

Default: 10s

Environment variable: TCM_CLUSTER_TARANTOOL_TIMEOUT

Command-line option: --cluster.tarantool-timeout
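
Parameters of the time.Duration type take Go-style duration strings such as 10s or 2h0m0s: a sequence of decimal numbers, each followed by a unit (ns, us, ms, s, m, or h). The Python sketch below converts such a string to seconds, which can help when checking values outside of TCM. The function parse_duration is a hypothetical helper, not a TCM API, and it deliberately ignores signed durations.

```python
import re

# Seconds per unit for the units accepted in Go duration strings.
_UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}
# One "<number><unit>" token; two-letter units are tried before "s" and "m".
_TOKEN = re.compile(r"(\d+(?:\.\d*)?|\.\d+)(ns|us|µs|ms|s|m|h)")


def parse_duration(text):
    """Convert a Go-style duration string such as "2h0m0s" to seconds."""
    if text == "0":
        return 0.0
    pos, total = 0, 0.0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty duration")
    return total
```

For example, parse_duration("10s") returns 10.0 and parse_duration("2h0m0s") returns 7200.0.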

cluster.tarantool-ping-timeout¶: A timeout for receiving a ping response from Tarantool instances.

Type: time.Duration

Default: 5s

Environment variable: TCM_CLUSTER_TARANTOOL_PING_TIMEOUT

Command-line option: --cluster.tarantool-ping-timeout

cluster.tt-command¶: The command that runs the tt utility on hosts with cluster instances.

Type: string

Default: tt

Environment variable: TCM_CLUSTER_TT_COMMAND

Command-line option: --cluster.tt-command

cluster.refresh-state-period¶: The time interval for refreshing the cluster instances state on the Stateboard.

Type: time.Duration

Default: 5s

Environment variable: TCM_CLUSTER_REFRESH_STATE_PERIOD

Command-line option: --cluster.refresh-state-period

cluster.refresh-state-timeout¶: The time limit for refreshing an instance state. If this limit is reached, an error is shown.

Type: time.Duration

Default: 4s

Environment variable: TCM_CLUSTER_REFRESH_STATE_TIMEOUT

Command-line option: --cluster.refresh-state-timeout

cluster.discovery-period¶: The time interval for checking the leadership in replica sets.

Type: time.Duration

Default: 4s

Environment variable: TCM_CLUSTER_DISCOVERY_PERIOD

Command-line option: --cluster.discovery-period

cluster.sharding-index¶: The name of the space field that is used as a sharding key.

Type: string

Default: bucket_id

Environment variable: TCM_CLUSTER_SHARDING_INDEX

Command-line option: --cluster.sharding-index

cluster.skew-time¶: The maximum time skew between any two cluster instances. If this limit is reached, a warning is shown.

Type: time.Duration

Default: 30s

Environment variable: TCM_CLUSTER_SKEW_TIME

Command-line option: --cluster.skew-time

cluster.fragmentation-threshold¶

The count of allocated slabs that reflects high memory fragmentation. When this number is reached, a warning is shown.

http

The http group defines parameters of HTTP connections between TCM and clients.

http.network
http.host
http.port
http.request-size
http.websocket.read-buffer-size
http.websocket.write-buffer-size
http.websocket.keepalive-ping-interval
http.websocket.handshake-timeout
http.websocket.init-timeout
http.websession-cookie.name
http.websession-cookie.path
http.websession-cookie.domain
http.websession-cookie.ttl
http.websession-cookie.secure
http.websession-cookie.http-only
http.websession-cookie.same-site
http.cors.enabled
http.cors.allowed-origins
http.cors.allowed-methods
http.cors.allowed-headers
http.cors.exposed-headers
http.cors.allow-credentials
http.cors.debug
http.tls.enabled
http.tls.cert-file
http.tls.key-file
http.tls.server
http.tls.min-version
http.tls.max-version
http.tls.curve-preferences
http.tls.cipher-suites
http.read-timeout
http.read-header-timeout
http.write-timeout
http.idle-timeout
http.disable-general-options-handler
http.max-header-bytes
http.api-timeout
http.api-update-interval
http.frontend-dir
http.show-stack-trace
http.trace
http.max-static-size
http.graphql.complexity

http.network¶

An addressing scheme that TCM uses.

Possible values:

tcp: IPv4 address
tcp6: IPv6 address
unix: Unix domain socket

Type: string
Default: tcp
Environment variable: TCM_HTTP_NETWORK
Command-line option: --http.network

http.host¶: A host name on which TCM serves.

Type: string

Default: 127.0.0.1

Environment variable: TCM_HTTP_HOST

Command-line option: --http.host

http.port¶: A port on which TCM serves.

Type: int

Default: 8080

Environment variable: TCM_HTTP_PORT

Command-line option: --http.port

http.request-size¶: The maximum size (in bytes) of a client HTTP request to TCM.

Type: int64

Default: 1572864

Environment variable: TCM_HTTP_REQUEST_SIZE

Command-line option: --http.request-size

http.websocket.read-buffer-size¶: The size (in bytes) of the read buffer for WebSocket connections.

Type: int

Default: 16384

Environment variable: TCM_HTTP_WEBSOCKET_READ_BUFFER_SIZE

Command-line option: --http.websocket.read-buffer-size

http.websocket.write-buffer-size¶: The size (in bytes) of the write buffer for WebSocket connections.

Type: int

Default: 16384

Environment variable: TCM_HTTP_WEBSOCKET_WRITE_BUFFER_SIZE

Command-line option: --http.websocket.write-buffer-size

http.websocket.keepalive-ping-interval¶: The time interval for sending WebSocket keepalive pings.

Type: time.Duration

Default: 20s

Environment variable: TCM_HTTP_WEBSOCKET_KEEPALIVE_PING_INTERVAL

Command-line option: --http.websocket.keepalive-ping-interval

http.websocket.handshake-timeout¶: The time limit for completing a WebSocket opening handshake with a client.

Type: time.Duration

Default: 10s

Environment variable: TCM_HTTP_WEBSOCKET_HANDSHAKE_TIMEOUT

Command-line option: --http.websocket.handshake-timeout

http.websocket.init-timeout¶: The time limit for establishing a WebSocket connection with a client.

Type: time.Duration

Default: 15s

Environment variable: TCM_HTTP_WEBSOCKET_INIT_TIMEOUT

Command-line option: --http.websocket.init-timeout

http.websession-cookie.name¶

The name of the cookie that TCM sends to clients.

This value is used as the cookie name in the Set-Cookie HTTP response header.

Type: string
Default: tcm
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_NAME
Command-line option: --http.websession-cookie.name

http.websession-cookie.path¶

The URL path that must be present in the requested URL in order to send the cookie.

This value is used in the Path attribute of the Set-Cookie HTTP response header.

Type: string
Default: “”
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_PATH
Command-line option: --http.websession-cookie.path

http.websession-cookie.domain¶

The domain to which the cookie can be sent.

This value is used in the Domain attribute of the Set-Cookie HTTP response header.

Type: string
Default: “”
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_DOMAIN
Command-line option: --http.websession-cookie.domain

http.websession-cookie.ttl¶

The maximum lifetime of the TCM cookie.

This value is used in the Max-Age attribute of the Set-Cookie HTTP response header.

Type: time.Duration
Default: 2h0m0s
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_TTL
Command-line option: --http.websession-cookie.ttl

http.websession-cookie.secure¶

Indicates whether the cookie can be sent only over the HTTPS protocol. When enabled, the cookie is never sent over unencrypted HTTP, which helps protect it from man-in-the-middle attacks.

When true, the Secure attribute is added to the Set-Cookie HTTP response header.

Type: bool
Default: false
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_SECURE
Command-line option: --http.websession-cookie.secure

http.websession-cookie.http-only¶

Indicates that the cookie can’t be accessed from the JavaScript Document.cookie API. This helps mitigate cross-site scripting attacks.

When true, the HttpOnly attribute is added to the Set-Cookie HTTP response header.

Type: bool
Default: true
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_HTTP_ONLY
Command-line option: --http.websession-cookie.http-only

http.websession-cookie.same-site¶

Indicates whether the TCM cookie can be sent along with cross-site requests. Possible values are Go’s http.SameSite constants:

SameSiteDefaultMode
SameSiteLaxMode
SameSiteStrictMode
SameSiteNoneMode

For details on SameSite modes, see the Set-Cookie header documentation in the MDN web docs.

This value is used in the SameSite attribute of the Set-Cookie HTTP response header.

Type: http.SameSite
Default: SameSiteDefaultMode
Environment variable: TCM_HTTP_WEBSESSION_COOKIE_SAME_SITE
Command-line option: --http.websession-cookie.same-site
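
The http.websession-cookie.* parameters each map to one attribute of the Set-Cookie response header. The Python sketch below shows how such a header could be assembled with the standard http.cookies module, using the documented defaults (name tcm, a 2h0m0s lifetime expressed as 7200 seconds, HttpOnly enabled). It only illustrates the header format: the function build_set_cookie and the sample cookie value are hypothetical, and TCM's own implementation may differ.

```python
from http.cookies import SimpleCookie


def build_set_cookie(value, name="tcm", path="", domain="", ttl_seconds=7200,
                     secure=False, http_only=True, same_site=""):
    """Return the value of a Set-Cookie header for the given settings."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    if path:
        morsel["path"] = path            # Path attribute
    if domain:
        morsel["domain"] = domain        # Domain attribute
    morsel["max-age"] = ttl_seconds      # Max-Age attribute, in seconds
    if secure:
        morsel["secure"] = True          # Secure flag
    if http_only:
        morsel["httponly"] = True        # HttpOnly flag
    if same_site:
        morsel["samesite"] = same_site   # SameSite attribute, e.g. "Lax"
    return morsel.OutputString()


default_header = build_set_cookie("abc123")
strict_header = build_set_cookie("abc123", path="/", secure=True, same_site="Strict")
```

With the defaults, the resulting header contains the cookie pair, Max-Age=7200, and the HttpOnly flag, but no Secure flag.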

http.cors.enabled¶: Indicates whether to use Cross-Origin Resource Sharing (CORS).

Type: bool

Default: false

Environment variable: TCM_HTTP_CORS_ENABLED

Command-line option: --http.cors.enabled

http.cors.allowed-origins¶

The origins with which the HTTP response can be shared, separated by semicolons.

The specified values are sent in the Access-Control-Allow-Origin HTTP response headers.

Type: []string
Default: []
Environment variable: TCM_HTTP_CORS_ALLOWED_ORIGINS
Command-line option: --http.cors.allowed-origins

http.cors.allowed-methods¶

HTTP request methods that are allowed when accessing a resource, separated by semicolons.

The specified values are sent in the Access-Control-Allow-Methods HTTP header of a response to a CORS preflight request.

Type: []string
Default: []
Environment variable: TCM_HTTP_CORS_ALLOWED_METHODS
Command-line option: --http.cors.allowed-methods

http.cors.allowed-headers¶

HTTP headers that are allowed during the actual request, separated by semicolons.

The specified values are sent in the Access-Control-Allow-Headers HTTP header of a response to a CORS preflight request.

Type: []string
Default: []
Environment variable: TCM_HTTP_CORS_ALLOWED_HEADERS
Command-line option: --http.cors.allowed-headers

http.cors.exposed-headers¶

Response headers that should be made available to scripts running in the browser, in response to a cross-origin request, separated by semicolons.

The specified values are sent in the Access-Control-Expose-Headers HTTP response headers.

Type: []string
Default: []
Environment variable: TCM_HTTP_CORS_EXPOSED_HEADERS
Command-line option: --http.cors.exposed-headers

http.cors.allow-credentials¶

Whether to expose the response to the frontend JavaScript code when the request’s credentials mode is include.

When true, the Access-Control-Allow-Credentials HTTP response header is sent.

Type: bool
Default: false
Environment variable: TCM_HTTP_CORS_ALLOW_CREDENTIALS
Command-line option: --http.cors.allow-credentials
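
The http.cors.* parameters correspond to the standard CORS response headers. The Python sketch below shows how a generic server could turn such settings into the headers of a preflight response. It describes general CORS behavior under simplified assumptions (exact origin matching plus a * wildcard) and is not TCM's implementation; the function preflight_headers and the example origins are hypothetical.

```python
def preflight_headers(origin, allowed_origins, allowed_methods,
                      allowed_headers, allow_credentials=False):
    """Build CORS preflight response headers for a request from `origin`.

    Returns an empty dict when the origin is not allowed.
    """
    if origin not in allowed_origins and "*" not in allowed_origins:
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(allowed_methods),
        "Access-Control-Allow-Headers": ", ".join(allowed_headers),
    }
    if allow_credentials:
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers


allowed = preflight_headers(
    "https://ui.example",            # hypothetical frontend origin
    ["https://ui.example"],          # http.cors.allowed-origins
    ["GET", "POST"],                 # http.cors.allowed-methods
    ["Content-Type"],                # http.cors.allowed-headers
    allow_credentials=True,          # http.cors.allow-credentials
)
rejected = preflight_headers("https://other.example", ["https://ui.example"], ["GET"], [])
```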

http.cors.debug¶: For debug purposes.

Type: bool

Default: false

http.tls.enabled¶: Indicates whether TLS is enabled for client connections to TCM.

Type: bool

Default: false

Environment variable: TCM_HTTP_TLS_ENABLED

Command-line option: --http.tls.enabled

http.tls.cert-file¶: A path to a TLS certificate file. Mandatory when TLS is enabled.

Type: string

Default: “”

Environment variable: TCM_HTTP_TLS_CERT_FILE

Command-line option: --http.tls.cert-file

http.tls.key-file¶: A path to a TLS private key file. Mandatory when TLS is enabled.

Type: string

Default: “”

Environment variable: TCM_HTTP_TLS_KEY_FILE

Command-line option: --http.tls.key-file

http.tls.server¶: The TLS server.

Type: string

Default: “”

Environment variable: TCM_HTTP_TLS_SERVER

Command-line option: --http.tls.server

http.tls.min-version¶: The minimum version of the TLS protocol.

Type: uint16

Default: 0

Environment variable: TCM_HTTP_TLS_MIN_VERSION

Command-line option: --http.tls.min-version

http.tls.max-version¶: The maximum version of the TLS protocol.

Type: uint16

Default: 0

Environment variable: TCM_HTTP_TLS_MAX_VERSION

Command-line option: --http.tls.max-version
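
The uint16 type of http.tls.min-version and http.tls.max-version suggests that these values correspond to the numeric TLS version constants of Go's crypto/tls package, although this reference doesn't state it explicitly. Under that assumption, the Python sketch below lists the numeric values of those Go constants. The table reproduces Go's constants; the helper name tls_version_number is hypothetical.

```python
# Numeric values of the version constants in Go's crypto/tls package.
TLS_VERSIONS = {
    "VersionTLS10": 0x0301,
    "VersionTLS11": 0x0302,
    "VersionTLS12": 0x0303,
    "VersionTLS13": 0x0304,
}


def tls_version_number(name):
    """Return the uint16 value of a Go TLS version constant."""
    return TLS_VERSIONS[name]


tls12 = tls_version_number("VersionTLS12")
tls13 = tls_version_number("VersionTLS13")
```

Under this assumption, restricting connections to TLS 1.2 and later would mean setting the minimum version to 771, and 772 corresponds to TLS 1.3.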

http.tls.curve-preferences¶

Elliptic curves that are used for TLS connections. Possible values are Go’s tls.CurveID constants:

CurveP256
CurveP384
CurveP521
X25519

Type: []tls.CurveID
Default: []
Environment variable: TCM_HTTP_TLS_CURVE_PREFERENCES
Command-line option: --http.tls.curve-preferences

http.tls.cipher-suites¶

Enabled TLS cipher suites. The supported ciphers are:

TLS 1.0 - 1.2 cipher suites:

TLS_RSA_WITH_RC4_128_SHA
TLS_RSA_WITH_3DES_EDE_CBC_SHA
TLS_RSA_WITH_AES_128_CBC_SHA
TLS_RSA_WITH_AES_256_CBC_SHA
TLS_RSA_WITH_AES_128_CBC_SHA256
TLS_RSA_WITH_AES_128_GCM_SHA256
TLS_RSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
TLS_ECDHE_RSA_WITH_RC4_128_SHA
TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256

TLS 1.3 cipher suites:

TLS_AES_128_GCM_SHA256
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256

TLS_FALLBACK_SCSV (0x5600) isn’t a standard cipher suite but an indicator that the client is doing version fallback.

Legacy names of the ChaCha20-Poly1305 cipher suites, which lack the _SHA256 suffix:

TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 = TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 = TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256

For detailed information on ciphers, refer to the Golang tls.TLS_* constants.

The example below shows how to configure cipher suites:

http:
  tls:
    cipher-suites:
      - TLS_AES_256_GCM_SHA384
      - TLS_AES_128_GCM_SHA256
      - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
      - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
      - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
      - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      - TLS_DHE_RSA_WITH_AES_256_GCM_SHA384
      - TLS_DHE_RSA_WITH_AES_128_GCM_SHA256

Type: []uint16
Default: []
Environment variable: TCM_HTTP_TLS_CIPHER_SUITES
Command-line option: --http.tls.cipher-suites

http.read-timeout¶: A timeout for reading an incoming request.

Type: time.Duration

Default: 30s

Environment variable: TCM_HTTP_READ_TIMEOUT

Command-line option: --http.read-timeout

http.read-header-timeout¶: A timeout for reading headers of an incoming request.

Type: time.Duration

Default: 30s

Environment variable: TCM_HTTP_READ_HEADER_TIMEOUT

Command-line option: --http.read-header-timeout

http.write-timeout¶: A timeout for writing a response.

Type: time.Duration

Default: 30s

Environment variable: TCM_HTTP_WRITE_TIMEOUT

Command-line option: --http.write-timeout

http.idle-timeout¶: The timeout for idle connections.

Type: time.Duration

Default: 30s

Environment variable: TCM_HTTP_IDLE_TIMEOUT

Command-line option: --http.idle-timeout

http.disable-general-options-handler¶: Whether the client requests with the OPTIONS HTTP method are allowed.

Type: bool

Default: false

Environment variable: TCM_HTTP_DISABLE_GENERAL_OPTIONS_HANDLER

Command-line option: --http.disable-general-options-handler

http.max-header-bytes¶: The maximum size (in bytes) of a header in a client’s request to TCM.

Type: int

Default: 0

Environment variable: TCM_HTTP_MAX_HEADER_BYTES

Command-line option: --http.max-header-bytes

http.api-timeout¶: The stateboard update timeout.

Type: time.Duration

Default: 8s

Environment variable: TCM_HTTP_API_TIMEOUT

Command-line option: --http.api-timeout

http.api-update-interval¶: The stateboard update interval.

Type: time.Duration

Default: 5s

Environment variable: TCM_HTTP_API_UPDATE_INTERVAL

Command-line option: --http.api-update-interval

http.frontend-dir¶: The directory with custom TCM frontend files (for development purposes).

Type: string

Default: “”

Environment variable: TCM_HTTP_FRONTEND_DIR

Command-line option: --http.frontend-dir

http.show-stack-trace¶: Whether error stack traces are shown in the web UI.

Type: bool

Default: true

Environment variable: TCM_HTTP_SHOW_STACK_TRACE

Command-line option: --http.show-stack-trace

http.trace¶: Whether all query tracing information is written in logs.

Type: bool

Default: false

Environment variable: TCM_HTTP_TRACE

Command-line option: --http.trace

http.max-static-size¶: The maximum size (in bytes) of static content sent to TCM.

Type: int

Default: 104857600

Environment variable: TCM_HTTP_MAX_STATIC_SIZE

Command-line option: --http.max-static-size

http.graphql.complexity¶: The maximum complexity of GraphQL queries that TCM processes. If this value is exceeded, TCM returns an error.

Type: int

Default: 40

Environment variable: TCM_HTTP_GRAPHQL_COMPLEXITY

Command-line option: --http.graphql.complexity

log

The log section defines the TCM logging parameters.

log.default.add-source
log.default.show-stack-trace
log.default.level
log.default.format
log.default.output
log.default.no-colorized
log.default.file.name
log.default.file.maxsize
log.default.file.maxage
log.default.file.maxbackups
log.default.file.compress
log.default.syslog.protocol
log.default.syslog.output
log.default.syslog.priority
log.default.syslog.facility
log.default.syslog.tag
log.default.syslog.timeout
log.outputs

log.default.add-source¶: Whether sources are added to the TCM log.

Type: bool

Default: false

Environment variable: TCM_LOG_DEFAULT_ADD_SOURCE

Command-line option: --log.default.add-source

log.default.show-stack-trace¶: Whether stack traces are added to the TCM log.

Type: bool

Default: false

Environment variable: TCM_LOG_DEFAULT_SHOW_STACK_TRACE

Command-line option: --log.default.show-stack-trace

log.default.level¶

The default TCM logging level.

Possible values:

VERBOSE
INFO
WARN
ALARM

Type: string
Default: INFO
Environment variable: TCM_LOG_DEFAULT_LEVEL
Command-line option: --log.default.level

log.default.format¶

The format of TCM log entries.

Possible values:

struct
json

Type: string
Default: struct
Environment variable: TCM_LOG_DEFAULT_FORMAT
Command-line option: --log.default.format

log.default.output¶

The output used for TCM log.

Possible values:

stdout
stderr
file
syslog

Type: string
Default: stdout
Environment variable: TCM_LOG_DEFAULT_OUTPUT
Command-line option: --log.default.output

log.default.no-colorized¶: Whether colorized output is disabled for the stdout log.

Type: bool

Default: false

Environment variable: TCM_LOG_DEFAULT_NO_COLORIZED

Command-line option: --log.default.no-colorized

log.default.file.name¶: The name of the TCM log file.

Type: string

Default: “”

Environment variable: TCM_LOG_DEFAULT_FILE_NAME

Command-line option: --log.default.file.name

log.default.file.maxsize¶: The maximum size (in bytes) of the TCM log file.

Type: int

Default: 0

Environment variable: TCM_LOG_DEFAULT_FILE_MAXSIZE

Command-line option: --log.default.file.maxsize

log.default.file.maxage¶: The maximum age of a TCM log file, in days.

Type: int

Default: 0

Environment variable: TCM_LOG_DEFAULT_FILE_MAXAGE

Command-line option: --log.default.file.maxage

log.default.file.maxbackups¶: The maximum number of old (rotated) TCM log files to keep.

Type: int

Default: 0

Environment variable: TCM_LOG_DEFAULT_FILE_MAXBACKUPS

Command-line option: --log.default.file.maxbackups

log.default.file.compress¶: Indicates whether TCM compresses log files upon rotation.

Type: bool

Default: false

Environment variable: TCM_LOG_DEFAULT_FILE_COMPRESS

Command-line option: --log.default.file.compress

log.default.syslog.protocol¶: The network protocol used for connecting to the syslog server. Typically, it’s tcp, udp, or unix. All possible values are listed in Go’s net.Dial documentation.

Type: string

Default: tcp

Environment variable: TCM_LOG_DEFAULT_SYSLOG_PROTOCOL

Command-line option: --log.default.syslog.protocol

log.default.syslog.output¶: The syslog server URI.

Type: string

Default: 127.0.0.1:5514

Environment variable: TCM_LOG_DEFAULT_SYSLOG_OUTPUT

Command-line option: --log.default.syslog.output

log.default.syslog.priority¶: The syslog severity level.

Type: string

Default: “”

Environment variable: TCM_LOG_DEFAULT_SYSLOG_PRIORITY

Command-line option: --log.default.syslog.priority

log.default.syslog.facility¶: The syslog facility.

Type: string

Default: “”

Environment variable: TCM_LOG_DEFAULT_SYSLOG_FACILITY

Command-line option: --log.default.syslog.facility

log.default.syslog.tag¶: The syslog tag.

Type: string

Default: “”

Environment variable: TCM_LOG_DEFAULT_SYSLOG_TAG

Command-line option: --log.default.syslog.tag

log.default.syslog.timeout¶: The timeout for connecting to the syslog server.

Type: time.Duration

Default: 10s

Environment variable: TCM_LOG_DEFAULT_SYSLOG_TIMEOUT

Command-line option: --log.default.syslog.timeout

log.outputs¶: An array of log outputs that TCM uses in addition to the default one that is defined by the log.default.* parameters. Each array item can include the parameters of the log.default group. If a parameter is skipped, its value is taken from log.default.

Type: []LogOuputConfig

Default: []

Environment variable: TCM_LOG_OUTPUTS

Command-line option: --log-outputs
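
The fallback rule for log.outputs (a parameter skipped in an array item takes its value from log.default) can be pictured as a per-item merge over the default settings. The Python sketch below illustrates this with flat dotted keys. The function resolve_outputs and the log file path are hypothetical, and the sketch models only the rule described above, not TCM's internal data structures.

```python
# Documented defaults of a few log.default.* parameters.
LOG_DEFAULT = {"level": "INFO", "format": "struct", "output": "stdout"}


def resolve_outputs(default, outputs):
    """Fill the parameters missing in each output item from the defaults."""
    return [{**default, **item} for item in outputs]


resolved = resolve_outputs(LOG_DEFAULT, [
    # Hypothetical file output: level and format come from log.default.
    {"output": "file", "file.name": "/var/log/tcm.log"},
    # Hypothetical JSON output: level and output come from log.default.
    {"format": "json"},
])
```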

storage

The storage section defines the parameters of the TCM backend store.

storage.provider

etcd backend store parameters:

storage.etcd.prefix
storage.etcd.endpoints
storage.etcd.dial-timeout
storage.etcd.auto-sync-interval
storage.etcd.dial-keep-alive-time
storage.etcd.dial-keep-alive-timeout
storage.etcd.bootstrap-timeout
storage.etcd.max-call-send-msg-size
storage.etcd.username
storage.etcd.password
storage.etcd.password-file
storage.etcd.tls.enabled
storage.etcd.tls.auto
storage.etcd.tls.cert-file
storage.etcd.tls.key-file
storage.etcd.tls.trusted-ca-file
storage.etcd.tls.client-cert-auth
storage.etcd.tls.crl-file
storage.etcd.tls.insecure-skip-verify
storage.etcd.tls.skip-client-san-verify
storage.etcd.tls.server-name
storage.etcd.tls.cipher-suites
storage.etcd.tls.allowed-cn
storage.etcd.tls.allowed-hostname
storage.etcd.tls.empty-cn
storage.etcd.permit-without-stream
storage.etcd.embed.enabled
storage.etcd.embed.endpoints
storage.etcd.embed.advertises
storage.etcd.embed.tls.enabled
storage.etcd.embed.tls.auto
storage.etcd.embed.tls.cert-file
storage.etcd.embed.tls.key-file
storage.etcd.embed.tls.trusted-ca-file
storage.etcd.embed.tls.client-cert-auth
storage.etcd.embed.tls.crl-file
storage.etcd.embed.tls.insecure-skip-verify
storage.etcd.embed.tls.skip-client-san-verify
storage.etcd.embed.tls.server-name
storage.etcd.embed.tls.cipher-suites
storage.etcd.embed.tls.allowed-cn
storage.etcd.embed.tls.allowed-hostname
storage.etcd.embed.tls.empty-cn
storage.etcd.embed.peer-endpoints
storage.etcd.embed.peer-advertises
storage.etcd.embed.peer-tls.enabled
storage.etcd.embed.peer-tls.auto
storage.etcd.embed.peer-tls.cert-file
storage.etcd.embed.peer-tls.key-file
storage.etcd.embed.peer-tls.trusted-ca-file
storage.etcd.embed.peer-tls.client-cert-auth
storage.etcd.embed.peer-tls.crl-file
storage.etcd.embed.peer-tls.insecure-skip-verify
storage.etcd.embed.peer-tls.skip-client-san-verify
storage.etcd.embed.peer-tls.server-name
storage.etcd.embed.peer-tls.cipher-suites
storage.etcd.embed.peer-tls.allowed-cn
storage.etcd.embed.peer-tls.allowed-hostname
storage.etcd.embed.peer-tls.empty-cn
storage.etcd.embed.grpc-keep-alive-timeout
storage.etcd.embed.grpc-keep-alive-interval
storage.etcd.embed.grpc-keep-alive-min-time
storage.etcd.embed.workdir
storage.etcd.embed.waldir
storage.etcd.embed.max-request-bytes
storage.etcd.embed.debug
storage.etcd.embed.start-timeout
storage.etcd.embed.log-level
storage.etcd.embed.initial-cluster
storage.etcd.embed.initial-cluster-token
storage.etcd.embed.name
storage.etcd.embed.initial-cluster-state
storage.etcd.embed.self-signed-cert-validity

Tarantool backend store parameters:

storage.tarantool.prefix
storage.tarantool.addr
storage.tarantool.addrs
storage.tarantool.auth
storage.tarantool.timeout
storage.tarantool.reconnect
storage.tarantool.max-reconnects
storage.tarantool.username
storage.tarantool.password
storage.tarantool.password-file
storage.tarantool.rate-limit
storage.tarantool.rate-limit-action
storage.tarantool.concurrency
storage.tarantool.skip-schema
storage.tarantool.transport
storage.tarantool.ssl.key-file
storage.tarantool.ssl.cert-file
storage.tarantool.ssl.ca-file
storage.tarantool.ssl.ciphers
storage.tarantool.ssl.password
storage.tarantool.ssl.password-file
storage.tarantool.required-protocol-info.auth
storage.tarantool.required-protocol-info.version
storage.tarantool.required-protocol-info.features
storage.tarantool.embed.enabled
storage.tarantool.embed.workdir
storage.tarantool.embed.executable
storage.tarantool.embed.config-filename
storage.tarantool.embed.config
storage.tarantool.embed.args
storage.tarantool.embed.env

storage.provider¶

The type of the storage used for storing TCM configuration.

Possible values:

etcd
tarantool

Type: string
Default: etcd
Environment variable: TCM_STORAGE_PROVIDER
Command-line option: --storage.provider

storage.etcd.prefix¶: A prefix for the TCM configuration parameters in etcd.

Type: string

Default: “/tcm”

Environment variable: TCM_STORAGE_ETCD_PREFIX

Command-line option: --storage.etcd.prefix

storage.etcd.endpoints¶: An array of node URIs of the etcd cluster where the TCM configuration is stored, separated by semicolons (;).

Type: []string

Default: [“http://127.0.0.1:2379”]

Environment variable: TCM_STORAGE_ETCD_ENDPOINTS

Command-line option: --storage.etcd.endpoints
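
When the endpoints are passed as a single string, for example through TCM_STORAGE_ETCD_ENDPOINTS, the URIs are separated by semicolons. The Python sketch below splits such a string and performs a basic sanity check of each URI before it is handed to TCM. The function parse_endpoints is a hypothetical helper, and the check (an http or https scheme plus an explicit port) is an assumption made for illustration, not a rule enforced by TCM.

```python
from urllib.parse import urlsplit


def parse_endpoints(value):
    """Split a semicolon-separated endpoint string into a list of URIs."""
    endpoints = [item.strip() for item in value.split(";") if item.strip()]
    for endpoint in endpoints:
        parts = urlsplit(endpoint)
        # Illustrative check only: expect a scheme and an explicit port.
        if parts.scheme not in ("http", "https") or parts.port is None:
            raise ValueError(f"unexpected etcd endpoint: {endpoint!r}")
    return endpoints


endpoints = parse_endpoints(
    "http://127.0.0.1:2379;http://127.0.0.1:22379;http://127.0.0.1:32379"
)
```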

storage.etcd.dial-timeout¶: An etcd dial timeout.

Type: time.Duration

Default: 10s

Environment variable: TCM_STORAGE_ETCD_DIAL_TIMEOUT

Command-line option: --storage.etcd.dial-timeout

storage.etcd.auto-sync-interval¶: An automated sync interval.

Type: time.Duration

Default: 0 (disabled)

Environment variable: TCM_STORAGE_ETCD_AUTO_SYNC_INTERVAL

Command-line option: --storage.etcd.auto-sync-interval

storage.etcd.dial-keep-alive-time¶: A dial keep-alive time.

Type: time.Duration

Default: 30s

Environment variable: TCM_STORAGE_ETCD_DIAL_KEEP_ALIVE_TIME

Command-line option: --storage.etcd.dial-keep-alive-time

storage.etcd.dial-keep-alive-timeout¶: A dial keep-alive timeout.

Type: time.Duration

Default: 30s

Environment variable: TCM_STORAGE_ETCD_DIAL_KEEP_ALIVE_TIMEOUT

Command-line option: --storage.etcd.dial-keep-alive-timeout

storage.etcd.bootstrap-timeout¶: A bootstrap timeout.

Type: time.Duration

Default: 30s

Environment variable: TCM_STORAGE_ETCD_BOOTSTRAP_TIMEOUT

Command-line option: --storage.etcd.bootstrap-timeout

storage.etcd.max-call-send-msg-size¶: The maximum size (in bytes) of a transaction between TCM and etcd.

Type: int

Default: 2097152

Environment variable: TCM_STORAGE_ETCD_MAX_CALL_SEND_MSG_SIZE

Command-line option: --storage.etcd.max-call-send-msg-size

storage.etcd.username¶: A username for accessing the etcd storage.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_USERNAME

Command-line option: --storage.etcd.username

storage.etcd.password¶: A password for accessing the etcd storage.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_PASSWORD

Command-line option: --storage.etcd.password

storage.etcd.password-file¶: A path to the file with a password for accessing the etcd storage.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_PASSWORD_FILE

Command-line option: --storage.etcd.password-file

storage.etcd.tls.enabled¶: Indicates whether TLS is enabled for etcd connections.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_ENABLED

Command-line option: --storage.etcd.tls.enabled

storage.etcd.tls.auto¶: Use generated certificates for etcd connections.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_AUTO

Command-line option: --storage.etcd.tls.auto

storage.etcd.tls.cert-file¶: A path to a TLS certificate file to use for etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_CERT_FILE

Command-line option: --storage.etcd.tls.cert-file

storage.etcd.tls.key-file¶: A path to a TLS private key file to use for etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_KEY_FILE

Command-line option: --storage.etcd.tls.key-file

storage.etcd.tls.trusted-ca-file¶: A path to a trusted CA certificate file to use for etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_TRUSTED_CA_FILE

Command-line option: --storage.etcd.tls.trusted-ca-file

storage.etcd.tls.client-cert-auth¶: Indicates whether client certificate authentication is enabled.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_CLIENT_CERT_AUTH

Command-line option: --storage.etcd.tls.client-cert-auth

storage.etcd.tls.crl-file¶: A path to the client certificate revocation list file.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_CRL_FILE

Command-line option: --storage.etcd.tls.crl-file

storage.etcd.tls.insecure-skip-verify¶: Skip checking client certificate in etcd connections.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_INSECURE_SKIP_VERIFY

Command-line option: --storage.etcd.tls.insecure-skip-verify

storage.etcd.tls.skip-client-san-verify¶: Skip verification of the SAN field in the client certificate for etcd connections.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_SKIP_CLIENT_SAN_VERIFY

Command-line option: --storage.etcd.tls.skip-client-san-verify

storage.etcd.tls.server-name¶: Name of the TLS server for etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_SERVER_NAME

Command-line option: --storage.etcd.tls.server-name

storage.etcd.tls.cipher-suites¶: TLS cipher suites for etcd connections. Possible values are the Golang tls.TLS_* constants.

Type: []uint16

Default: []

Environment variable: TCM_STORAGE_ETCD_TLS_CIPHER_SUITES

Command-line option: --storage.etcd.tls.cipher-suites

storage.etcd.tls.allowed-cn¶: An allowed common name for authentication in etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_ALLOWED_CN

Command-line option: --storage.etcd.tls.allowed-cn

storage.etcd.tls.allowed-hostname¶: An allowed TLS certificate name for authentication in etcd connections.

Type: string

Default: “”

Environment variable: TCM_STORAGE_ETCD_TLS_ALLOWED_HOSTNAME

Command-line option: --storage.etcd.tls.allowed-hostname

storage.etcd.tls.empty-cn¶: Whether the empty common name is allowed in etcd connections.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_TLS_EMPTY_CN

Command-line option: --storage.etcd.tls.empty-cn
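
The storage.etcd.tls.* parameters above can be combined in one configuration fragment. The following is a sketch only: it assumes that dotted parameter names map to nested YAML keys, and all file paths are hypothetical.

```yaml
# tcm.yaml (illustrative fragment; paths are placeholders)
storage:
    etcd:
        tls:
            enabled: true
            cert-file: /etc/tcm/tls/client.crt     # hypothetical path
            key-file: /etc/tcm/tls/client.key      # hypothetical path
            trusted-ca-file: /etc/tcm/tls/ca.crt   # hypothetical path
```

The same settings can be passed through the listed environment variables (for example, TCM_STORAGE_ETCD_TLS_ENABLED) or command-line options (for example, --storage.etcd.tls.enabled).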

storage.etcd.permit-without-stream¶: Whether keepalive pings can be sent to the etcd server without active streams.

Type: bool

Default: false

Environment variable: TCM_STORAGE_ETCD_PERMIT_WITHOUT_STREAM

Command-line option: --storage.etcd.permit-without-stream

storage.etcd.embed.*

The storage.etcd.embed group defines the configuration of the embedded etcd cluster to use as a TCM backend store. This cluster can be used for development purposes when the production or testing etcd cluster is not available or not needed.

storage.tarantool.timeout¶

A request timeout for the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: time.Duration
Default: 0s
Environment variable: TCM_STORAGE_TARANTOOL_TIMEOUT
Command-line option: --storage.tarantool.timeout

storage.tarantool.reconnect¶

A timeout between reconnect attempts for the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: time.Duration
Default: 0s
Environment variable: TCM_STORAGE_TARANTOOL_RECONNECT
Command-line option: --storage.tarantool.reconnect

storage.tarantool.max-reconnects¶

The maximum number of reconnect attempts for the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: int
Default: 0
Environment variable: TCM_STORAGE_TARANTOOL_MAX_RECONNECTS
Command-line option: --storage.tarantool.max-reconnects
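
The timeout and reconnect parameters above might be set together as in the sketch below. It assumes that dotted parameter names map to nested YAML keys and that time.Duration values are written as Go duration strings, in the same form as the 0s default. The specific values are arbitrary examples.

```yaml
# tcm.yaml (illustrative fragment; values are arbitrary examples)
storage:
    tarantool:
        timeout: 5s          # time.Duration, assumed Go duration string
        reconnect: 1s        # pause between reconnect attempts
        max-reconnects: 3    # maximum number of reconnect attempts
```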

storage.tarantool.username¶

A username for connecting to the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_USERNAME
Command-line option: --storage.tarantool.username

storage.tarantool.password¶

A password for connecting to the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_PASSWORD
Command-line option: --storage.tarantool.password

storage.tarantool.password-file¶: A path to the file with a password for connecting to the Tarantool-based configuration storage.

Type: string

Default: “”

Environment variable: TCM_STORAGE_TARANTOOL_PASSWORD_FILE

Command-line option: --storage.tarantool.password-file

storage.tarantool.rate-limit¶

A rate limit for connecting to the Tarantool-based configuration storage.

See also go-tarantool.Opts.

Type: int
Default: 0
Environment variable: TCM_STORAGE_TARANTOOL_RATE_LIMIT
Command-line option: --storage.tarantool.rate-limit

storage.tarantool.rate-limit-action¶

An action to perform when the storage.tarantool.rate-limit is reached.

See also go-tarantool.Opts.

Type: int
Default: 0
Environment variable: TCM_STORAGE_TARANTOOL_RATE_LIMIT_ACTION
Command-line option: --storage.tarantool.rate-limit-action

storage.tarantool.concurrency¶

The number of separate mutexes for request queues and buffers inside a connection to the Tarantool TCM configuration storage.

See also go-tarantool.Opts.

Type: int
Default: 0
Environment variable: TCM_STORAGE_TARANTOOL_CONCURRENCY
Command-line option: --storage.tarantool.concurrency

storage.tarantool.skip-schema¶

Whether to skip loading the schema from the Tarantool TCM configuration storage.

See also go-tarantool.Opts.

Type: bool
Default: true
Environment variable: TCM_STORAGE_TARANTOOL_SKIP_SCHEMA
Command-line option: --storage.tarantool.skip-schema

storage.tarantool.transport¶

The connection type for the Tarantool TCM configuration storage.

See also go-tarantool.Opts.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_TRANSPORT
Command-line option: --storage.tarantool.transport

storage.tarantool.ssl.key-file¶

A path to a TLS private key file to use for connecting to the Tarantool TCM configuration storage.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_KEY_FILE
Command-line option: --storage.tarantool.ssl.key-file

storage.tarantool.ssl.cert-file¶

A path to an SSL certificate to use for connecting to the Tarantool TCM configuration storage.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_CERT_FILE
Command-line option: --storage.tarantool.ssl.cert-file

storage.tarantool.ssl.ca-file¶

A path to a trusted CA certificate to use for connecting to the Tarantool TCM configuration storage.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_CA_FILE
Command-line option: --storage.tarantool.ssl.ca-file

storage.tarantool.ssl.ciphers¶

A list of SSL cipher suites that can be used for connecting to the Tarantool TCM configuration storage. Possible values are listed in <uri>.params.ssl_ciphers.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_CIPHERS
Command-line option: --storage.tarantool.ssl.ciphers

storage.tarantool.ssl.password¶

A password for an encrypted private SSL key to use for connecting to the Tarantool TCM configuration storage.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_PASSWORD
Command-line option: --storage.tarantool.ssl.password

storage.tarantool.ssl.password-file¶

A text file with passwords for encrypted private SSL keys to use for connecting to the Tarantool TCM configuration storage.

Type: string
Default: “”
Environment variable: TCM_STORAGE_TARANTOOL_SSL_PASSWORD_FILE
Command-line option: --storage.tarantool.ssl.password-file
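
A sketch of an encrypted connection to the Tarantool-based configuration storage is shown below. It assumes that dotted parameter names map to nested YAML keys and that ssl is an accepted transport value, by analogy with Tarantool's own transport connection parameter. This reference does not list the possible values, so check them before use. All file paths are hypothetical.

```yaml
# tcm.yaml (illustrative fragment; paths are placeholders)
storage:
    tarantool:
        transport: ssl                             # assumption: value not confirmed by this reference
        ssl:
            key-file: /etc/tcm/ssl/client.key      # hypothetical path
            cert-file: /etc/tcm/ssl/client.crt     # hypothetical path
            ca-file: /etc/tcm/ssl/ca.crt           # hypothetical path
```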

storage.tarantool.required-protocol-info.auth¶

An authentication method for the Tarantool TCM configuration storage.

Possible values are the go-tarantool Auth constants:

AutoAuth (0)
ChapSha1Auth
PapSha256Auth

storage.tarantool.embed.*

The storage.tarantool.embed group parameters define the configuration of the embedded Tarantool cluster to use as a TCM backend store. This cluster can be used for development purposes when the production or testing cluster is not available or not needed.

addon

The addon section defines settings related to TCM add-ons.

addon.enabled
addon.addons-dir
addon.max-upload-size
addon.dev-addons-dir

addon.enabled¶: Whether to enable the add-on functionality in TCM.

Type: bool

Default: false

Environment variable: TCM_ADDON_ENABLED

Command-line option: --addon.enabled

addon.addons-dir¶: The directory from which TCM takes add-ons.

Type: string

Default: addons

Environment variable: TCM_ADDON_ADDONS_DIR

Command-line option: --addon.addons-dir

addon.max-upload-size¶: The maximum size (in bytes) of an add-on that can be uploaded to TCM.

Type: int64

Default: 104857600

Environment variable: TCM_ADDON_MAX_UPLOAD_SIZE

Command-line option: --addon.max-upload-size

addon.dev-addons-dir¶: Additional add-on directories for development purposes, separated by semicolons (;).

Type: []string

Default: []

Environment variable: TCM_ADDON_DEV_ADDONS_DIR

Command-line option: --addon.dev-addons-dir
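
For illustration, the add-on parameters above could be combined as follows. This sketch assumes that dotted parameter names map to nested YAML keys. The upload limit is an arbitrary example value.

```yaml
# tcm.yaml (illustrative fragment)
addon:
    enabled: true
    addons-dir: addons           # the documented default
    max-upload-size: 52428800    # 50 MiB; the default is 104857600 (100 MiB)
```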

limits

The limits section defines limits on various TCM objects and relations between them.

limits.users-count
limits.clusters-count
limits.roles-count
limits.webhooks-count
limits.user-secrets-count
limits.user-websessions-count
limits.linked-cluster-users

limits.users-count¶: The maximum number of users in TCM.

Type: int

Default: 1000

Environment variable: TCM_LIMITS_USERS_COUNT

Command-line option: --limits.users-count

limits.clusters-count¶: The maximum number of clusters in TCM.

Type: int

Default: 10

Environment variable: TCM_LIMITS_CLUSTERS_COUNT

Command-line option: --limits.clusters-count

limits.roles-count¶: The maximum number of roles in TCM.

Type: int

Default: 100

Environment variable: TCM_LIMITS_ROLES_COUNT

Command-line option: --limits.roles-count

limits.webhooks-count¶: The maximum number of webhooks in TCM.

Type: int

Default: 200

Environment variable: TCM_LIMITS_WEBHOOKS_COUNT

Command-line option: --limits.webhooks-count

limits.user-secrets-count¶: The maximum number of secrets that a TCM user can have.

Type: int

Default: 10

Environment variable: TCM_LIMITS_USER_SECRETS_COUNT

Command-line option: --limits.user-secrets-count

limits.user-websessions-count¶: The maximum number of open sessions that a TCM user can have.

Type: int

Default: 10

Environment variable: TCM_LIMITS_USER_WEBSESSIONS_COUNT

Command-line option: --limits.user-websessions-count

limits.linked-cluster-users¶: The maximum number of clusters to which a single user can have access.

Type: int

Default: 10

Environment variable: TCM_LIMITS_LINKED_CLUSTER_USERS

Command-line option: --limits.linked-cluster-users
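
A sketch of tightening some of the limits above for a small installation is shown below. It assumes that dotted parameter names map to nested YAML keys. The values are arbitrary examples, not recommendations.

```yaml
# tcm.yaml (illustrative fragment; values are arbitrary examples)
limits:
    users-count: 200               # default: 1000
    clusters-count: 5              # default: 10
    user-websessions-count: 3      # default: 10
```

Parameters that are not listed keep their default values.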

security

The security section defines the security parameters of TCM.

security.auth
security.hash-cost
security.encryption-key
security.encryption-key-file
security.bootstrap-password
security.bootstrap-api-token
security.integrity-check
security.signature-private-key-file

security.auth¶

Ways to log into TCM.

Possible values:

local
ldap

Type: []string
Default: [local]
Environment variable: TCM_SECURITY_AUTH
Command-line option: --security.auth

security.hash-cost¶: A hash cost for hashing users’ passwords.

Type: int

Default: 12

Environment variable: TCM_SECURITY_HASH_COST

Command-line option: --security.hash-cost

security.encryption-key¶: An encryption key for passwords used by TCM for accessing Tarantool and etcd clusters.

Type: string

Default: “”

Environment variable: TCM_SECURITY_ENCRYPTION_KEY

Command-line option: --security.encryption-key

security.encryption-key-file¶: A path to the file with the encryption key for passwords used by TCM for accessing Tarantool and etcd clusters.

Type: string

Default: “”

Environment variable: TCM_SECURITY_ENCRYPTION_KEY_FILE

Command-line option: --security.encryption-key-file

security.bootstrap-password¶: A password for the first login of the admin user. Only for testing purposes.

Type: string

Default: “”

Environment variable: TCM_SECURITY_BOOTSTRAP_PASSWORD

Command-line option: --security.bootstrap-password

security.bootstrap-api-token¶: A default API token for the admin user. Only for testing purposes.

Type: string

Default: “”

Environment variable: TCM_SECURITY_BOOTSTRAP_API_TOKEN

Command-line option: --security.bootstrap-api-token

security.integrity-check¶: Whether to check the digital signature. If true, an error is raised when an incorrect signature is detected.

Type: bool

Default: false

Environment variable: TCM_SECURITY_INTEGRITY_CHECK

Command-line option: --security.integrity-check

security.signature-private-key-file¶: A path to a file with the private key to sign TCM data.

Type: string

Default: “”

Environment variable: TCM_SECURITY_SIGNATURE_PRIVATE_KEY_FILE

Command-line option: --security.signature-private-key-file
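
The following sketch combines several of the security parameters above. It assumes that dotted parameter names map to nested YAML keys and that a []string parameter can be written as a YAML list. The key file path is hypothetical.

```yaml
# tcm.yaml (illustrative fragment; the path is a placeholder)
security:
    auth: [local, ldap]                              # allow both login methods
    hash-cost: 12                                    # the documented default
    encryption-key-file: /etc/tcm/encryption.key     # hypothetical path
```

The bootstrap-password and bootstrap-api-token parameters are left out on purpose because they are intended only for testing.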

mode

mode¶: The TCM mode: production, development, or test.

Type: string

Default: production

Environment variable: TCM_MODE

Command-line option: --mode

feature

The feature section enables or disables optional TCM features.

feature.ttgraph
feature.column-store
feature.tqe
feature.api-token
feature.tuples

feature.ttgraph¶: Whether Tarantool Graph DB integration is enabled.

Type: bool

Default: false

Environment variable: TCM_FEATURE_TTGRAPH

Command-line option: --feature.ttgraph

feature.column-store¶: Whether Tarantool Column Store integration is enabled.

Type: bool

Default: false

Environment variable: TCM_FEATURE_COLUMN_STORE

Command-line option: --feature.column-store

feature.tqe¶: Whether Tarantool Queue Enterprise integration is enabled.

Type: bool

Default: false

Environment variable: TCM_FEATURE_TQE

Command-line option: --feature.tqe

feature.api-token¶: Whether the use of API tokens is enabled.

Type: bool

Default: false

Environment variable: TCM_FEATURE_API_TOKEN

Command-line option: --feature.api-token

feature.tuples¶: Whether the use of Tuples is enabled.

Type: bool

Default: false

Environment variable: TCM_FEATURE_TUPLES

Command-line option: --feature.tuples
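
As a sketch, several feature flags can be switched in one fragment. It assumes that dotted parameter names map to nested YAML keys. The choice of flags is an arbitrary example.

```yaml
# tcm.yaml (illustrative fragment)
feature:
    api-token: true    # enable API tokens
    tqe: true          # enable Tarantool Queue Enterprise integration
    ttgraph: false     # the documented default, shown for clarity
```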

initial-settings

The initial-settings group defines entities that are created automatically upon the first TCM startup.

Integrity check

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

TCM supports an integrity check mechanism that verifies the digital signature of centralized configuration files. It ensures that TCM applies only configurations that are signed with a trusted private key.

This mechanism allows TCM to:

Update the configuration with integrity check support
Detect unauthorized changes in centralized configuration

Configure integrity check

Configuration parameters

Parameter	Description	Type	Default
security.integrity-check	Enables signature validation	`bool`	`false`
security.signature-private-key-file	Path to the private key for signing configuration	`string`	`""`

Example configuration

Integrity check can be enabled directly in the TCM configuration file:

# tcm.yaml
security:
    integrity-check: true
    signature-private-key-file: /etc/tcm/private_key.pem

Note

The integrity-check-period option works only in the tt + Tarantool setup, where tt periodically verifies the integrity of the running instance. TCM does not use this option because it only uploads and verifies configuration signatures and does not interact directly with the database. Moreover, TCM cannot stop Tarantool execution when an integrity check fails. This behavior is specific to tt when Tarantool is started with the --integrity-check and --integrity-check-period options. For details about the tt integrity check, see the tt documentation.

Terminals

Tarantool Cluster Manager (TCM) provides two ways to interact with Tarantool instances:

direct: a terminal that connects directly to a Tarantool instance using the go-tarantool library, bypassing the tt connect utility
tt-connect: a terminal that uses the tt CLI utility to connect to a Tarantool instance

Both terminals allow executing SQL queries, managing cluster state, viewing metrics, and more.

Terminal direct

Authentication credentials are taken from the cluster configuration:

credentials:
  users:
    tcm_tarantool:
      password: tcm_tarantool_password
      roles: [super]

Terminal tt-connect

Specify the path to the tt connect utility in the tcm.yaml configuration file:

mode: production
cluster:
  tt-command: .tarantool/tt

Tarantool Cluster Manager releases

Enterprise Edition

Tarantool Cluster Manager is a part of the Enterprise Edition.

This section contains the list of Tarantool Cluster Manager releases along with descriptions of their key changes.

For information about Tarantool releases, see Releases.

Supported versions

Tarantool Cluster Manager 1.9

Release date: April 17, 2026

Latest release in series: 1.9.1

TCM 1.9.0 expands cluster connectivity options, simplifies first-time setup by allowing initial user and role provisioning from configuration, and improves observability in Stateboard when vshard issues occur. The release also includes fixes for migration storage connectivity in split-configuration setups and corrects password handling during cluster Test connection.

Connect to instances via iproto.listen.uri

TCM can now connect to a cluster instance using iproto.listen.uri. This provides more flexibility in environments where instance endpoints are defined through iproto.listen.uri rather than alternative connection parameters, and helps align TCM connectivity with the instance’s actual listen configuration.

Bootstrap users and roles on first launch

TCM 1.9.0 introduces the ability to create users and roles on the very first TCM launch using the initial-settings field in the configuration. Roles can be created with an explicit ID or without one, allowing you to either keep stable identifiers across environments or let the system generate them as needed.

Stateboard router indicators for vshard errors

The Stateboard tab now provides a router indicator response when vshard errors occur. This change improves diagnostics by making routing-related problems more visible and easier to interpret during incident investigation.

Fixes

Fixed a migration storage connection error that could occur when cluster configuration and TCM configuration are stored in different configuration storages.
Fixed incorrect password handling during Test connection when creating or updating a cluster, ensuring the test uses the intended credentials.
Fixed a bug where, after a failed migration, the resulting migration status was not properly recorded or displayed, ensuring accurate state tracking and more reliable migration workflows.

Tarantool Cluster Manager 1.8

Release date: March 13, 2026

Latest release in series: 1.8.1

TCM 1.8.0 improves LDAP reliability, expands the CLI with full user and role administration capabilities, and adds safer behavior when creating clusters. This release also includes fixes that improve authentication stability after failed login attempts and correct data rendering in the Tuples tab.

LDAP improvements

TCM 1.8.0 introduces support for cascading LDAP connections in environments where multiple LDAP domains share the same domain name. Instead of failing on the first unreachable domain, TCM now tries each configured domain sequentially until a successful connection is established.

New CLI commands for user and role management

This release adds dedicated CLI commands to manage users and roles without relying on the UI. You can now create and delete users using tcm user add and tcm user delete, and create, update, or delete roles using tcm role add, tcm role update, and tcm role delete.

When creating users, the CLI supports assigning multiple roles and granting access to multiple clusters (including per-cluster permissions and ACL on/off). Passwords are validated against the configured password policy, and operations provide detailed logging to support auditing and traceability.

Example of user creation:

tcm user add \
    --fullname 'John Doe' \
    --description 'System administrator' \
    --role admin-role-id \
    --role operator-role-id \
    --clusters cluster-id-1:cluster.config.read,cluster.config.write:true \
    --clusters cluster-id-2:cluster.config.read:false \
    --secret-type password \
    --public-key john-public-key \
    --secret-key john-secret-key

Example of role creation:

tcm role add \
    --name 'Cluster Administrator' \
    --description 'Full access to cluster configuration and management' \
    --permission admin.clusters.read \
    --permission admin.clusters.write \
    --permission admin.users.read \
    --permission admin.users.write

Safer cluster creation

To prevent configuration mistakes, TCM now shows a warning when attempting to create a new cluster using an ID that already exists. This helps catch ID conflicts early and reduces the chance of accidental misconfiguration.

Fixes

Fixed an issue where, after several failed login attempts followed by a successful login, an error could occur during logout.
Fixed an issue in the Tuples tab where scrolling could trigger a “data could not be found” error when connected to a cluster with multiple storages.
Improved the error message returned when the WebSession limit is reached. Previously, the message could be uninformative.

Tarantool Cluster Manager 1.7

Release date: February 11, 2026

Latest release in series: 1.7.3

The TCM 1.7.x release includes a range of improvements and bug fixes aimed at enhancing the stability, usability, and configurability of the system. Key updates include a fix for the Metrics tab and export functionality, improvements to session management, better handling of audit log paths, enhanced visibility into migration status, and refinements to LDAP authentication.

Metrics fix

Release 1.7.3 fixes a bug that prevented the Metrics tab from displaying data and disabled the ability to export metrics.

WebSession improvements

In 1.7.2 the websession component has been updated to correctly handle user session creation. Users can now create the number of sessions specified in the configuration.

AuditLog path handling

In version 1.7.2, the auditlog component has improved support for absolute paths in its configuration. When a relative path is provided, the audit log file is now saved relative to the application’s working directory.

Migration handling

In 1.7.1, migration handling has been updated to improve visibility into the state of applied migrations. If a migration is reapplied, the system now logs a warning.

Connection Test Fix

A bug in the “test connection” functionality for Tarantool and etcd services has been fixed. Previously, the function did not correctly report the connection status for these services.

Default cluster management

You can now control automatic creation of the default cluster using one of the following options:

TCM_DEFAULT_CLUSTER environment variable
default-cluster configuration parameter
--default-cluster command-line flag

This allows administrators to explicitly enable or disable default cluster auto-creation depending on deployment requirements.
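
A sketch of disabling default cluster auto-creation in the configuration file is shown below. The parameter name comes from the list above. Its top-level placement and boolean value are assumptions, since this release note does not specify the parameter type.

```yaml
# tcm.yaml (illustrative fragment)
default-cluster: false   # assumption: boolean toggle at the top level
```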

LDAP authentication improvements

Error handling has been improved for LDAP authentication when the Automatically add non-existent users option is disabled.

It is also now possible to create a user via the UI with LDAP authentication enabled, simplifying user management in LDAP-based environments.

Tarantool Cluster Manager 1.6

Release date: February 6, 2026

Latest release in series: 1.6.0

This release introduces support for Tarantool DataBase (TDB) workers in the cluster dashboard with integrated health monitoring, adds TLS configuration guides for secure connections, improves audit log configuration and validation, and introduces a feature flag for managing the Tuples tab. It also includes important fixes for LDAP authentication, TLS configuration parsing, and a memory leak in SSL cluster connections.

TDB workers monitoring in cluster dashboard

TCM adds support for TDB workers in the cluster Stateboard tab with integrated health monitoring and visibility.

TDB workers are supported starting from TDB 3.1.0. Workers are automatically discovered from etcd and continuously monitored via dedicated health check endpoints. Their metrics are proxied through TCM and exposed individually, allowing detailed operational insight.

The interface displays workers directly in the stateboard with clear status indicators and a details panel. Each worker can be in one of four statuses: healthy, degraded, unhealthy, or no connection, helping administrators quickly detect and diagnose issues.

To learn more, see TDB documentation.

Audit log configuration improvements

The audit log configuration is now safer and more predictable.

Protocol values are validated during startup. If an invalid protocol is specified, the system automatically falls back to default settings and emits a warning. Audit log parameters can be set in advance at the system bootstrap stage by specifying them in the auditlog field of the initial-settings section in the configuration file. These settings will be applied automatically if the audit log has not been configured yet.

Explorer enhancements

A feature flag has been introduced to control the visibility of the Tuples tab in the Explorer interface.

The tab is displayed only when the corresponding feature flag is enabled and the CRUD module is available. The flag can be configured either in the TCM configuration file or via command-line arguments at startup.

To enable the Tuples tab in the TCM configuration file:

# tcm.yaml
feature:
    tuples: true

Stability and reliability fixes

This release also includes several fixes that improve system stability and security.

LDAP authentication behavior has been adjusted, including logout handling, anonymous binding to Active Directory, and the preservation of authorization method settings after a restart. TLS configuration parsing has been fixed to ensure cipher suites and curve preferences are correctly recognized in both configuration files and command-line arguments. Missing schema attributes for cluster configuration have been added, and configuration validation feedback in the editor has been improved.

Additionally, a memory leak that could occur when SSL-enabled cluster connections became unavailable has been resolved, resulting in more stable cluster operation.

Tarantool Cluster Manager 1.5

Release date: August 28, 2025

Latest release in series: 1.5.3

Tarantool Cluster Manager 1.5 introduces a new UI page for configuring TCF clusters and includes important fixes that enhance reliability, compliance, and user experience.

TCF cluster configuration in UI

TCM 1.5.0 adds a dedicated settings page for managing TCF cluster parameters directly through the web interface. You can now retrieve and modify key fields that define cluster behavior and failover logic without editing configuration files manually.

The new page allows configuring the following parameters:

dml_users – a list of users with DML access
cluster1, cluster2 – settings for connected clusters
replication_user, replication_password – replication credentials
failover_timeout – delay before switching to a failover node
initial_status – default service state on startup
max_suspect_counts – the threshold for marking a node as failed
health_check_delay – interval between health checks
enable_system_check – toggles system-level health monitoring
status_ttl – time-to-live for service status data

Testing improvements

To make tests more efficient and predictable, all occurrences of time.Sleep were replaced with require.Eventually. This change improves test speed and reliability. Additionally, HTTP checks and tuple insertion operations in tests were updated for better performance and accuracy.

Migrations section

Since version 1.5.1, TCM includes a new migrations section with a duration field. This field allows specifying the maximum execution time for long-running migrations, preventing them from being interrupted by the default timeout.

Cluster reliability

TCM 1.5.3 improves overall cluster stability, fault tolerance, and configuration handling. The cluster now automatically reconnects after transient failures and continuously monitors node health to detect degraded or unavailable instances faster. Configuration changes are applied correctly without disrupting cluster operation. Quorum and health check logic were reworked to better tolerate partial failures. Unavailable nodes are now excluded from quorum calculations, preventing cluster-wide outages when only a minority of nodes becomes unavailable.

The following issues were fixed:

incorrect quorum calculation when some nodes were down.
unstable health check behavior under partial failures.
cluster not found error when adding or editing cluster settings.

Migration management improvements

TCM 1.5.3 makes migration handling safer and more predictable. Applied migrations are now automatically locked from editing. Executed migrations are clearly marked as read-only in the interface, and the UI displays an explanatory message to indicate that modifications are not allowed. This prevents accidental changes to already executed migrations and ensures migration history consistency.

Fixes and compliance updates

This release includes multiple fixes across different modules:

CRUD and Explorer – data types used during operations have been corrected to comply with FSTEC security requirements, ensuring strict typing and better protection of sensitive data.
Authentication – the system no longer relies on etcd for storing authentication parameters. Instead, it uses local configuration to improve startup reliability and simplify setup.
Logging – fixed issues with log output by switching to the slog logging system.
UI – resolved display issues in the OperationStatus component.
Tuples – fixed an error that caused tab refresh failures in clusters with a large number of spaces.
utils – fixed the FilterSlices function so that it filters slices correctly. Since version 1.5.1.
Audit log documentation now contains only necessary event types. Since version 1.5.1.
Fixed an issue where the Permissions drop-down list had multiple empty lines at the bottom when adding a new role. Since version 1.5.2.

Tarantool Cluster Manager 1.4

Release date: June 9, 2025

Latest release in series: 1.4.0

Tarantool Cluster Manager 1.4.0 improves LDAP support and includes several enhancements and fixes aimed at improving authentication flexibility and system stability.

LDAP support improvements

TCM 1.4.0 significantly enhances the experience of working with LDAP authentication. The web interface now includes a visual confirmation pop-up when a connection to an LDAP server is successfully established. This helps administrators quickly verify the correctness of LDAP settings without checking logs or reloading the page.

The authentication settings now support switching between local and LDAP methods directly in the interface, making it easier to configure hybrid or alternative access scenarios.

The LDAP configuration form has been simplified:

The groupQueryTemplate field is now optional, allowing LDAP authentication without querying for user groups.
The queryUser and queryPassword fields are also optional, which enables anonymous binding to the LDAP server.
You now only need to provide either templateDN or templateQuery, instead of both – reducing configuration complexity.

More about LDAP authentication.

etcd client and audit log fixes

In version 1.4.0, TCM improves the behavior of the etcd client. Previously, if one of the etcd nodes became unresponsive while keeping its port open, the client could hang indefinitely. This issue has been fixed to ensure better resilience of etcd-based components.

Additionally, the audit log mechanism now correctly creates log files in the directory of the running application binary. To learn more, see Audit log configuration.

Tarantool Cluster Manager 1.3

Release date: March 14, 2025

Latest release in series: 1.3.1

Tarantool Cluster Manager 1.3.0 enhances the TCF integration page with minor bug fixes and functional improvements. Below is an overview of key updates.

TCF page improvements

Starting from version 1.3.0, TCM provides additional actions for managing TCF clusters through the web interface. You can now use promote and demote operations directly on the TCF page without switching to external tools. Also, the TCF page is now disabled by default and must be explicitly enabled if needed. In addition, TCM now supports connections to multiple gRPC servers, which improves integration with distributed cluster infrastructures.

Explorer improvements

TCM 1.3.0 introduces a new approach to pagination in the Explorer. Instead of using a tuple, the interface now relies on pointers for navigating result pages. When sending data to the frontend, binary values (varbinary) are now automatically encoded in base64.

Additionally, TCM fixes an issue where queries using a datetime key could result in type mismatch errors due to incorrect index part handling.

etcd integration fixes

In this version, TCM improves its interaction with etcd-based data sources. Tabs that use etcd for updating can now be refreshed even if some of the etcd endpoints are temporarily unavailable. To improve stability, a check was added to detect and correctly handle empty tuple arrays, preventing unexpected errors when processing empty data.

CRUD and query parsing

TCM 1.3.0 includes improvements to how search expressions are parsed in CRUD explorer queries. The CRUD explorer is located on the Tuples page. This release also introduces dedicated tests for the relevant components to ensure consistent behavior in future versions.

Since version 1.3.1, TCM includes changes that were missing from the earlier release and have now been properly delivered. In addition, several minor issues flagged by the Svacer linter were fixed to improve overall code quality and maintainability.

Tarantool Cluster Manager 1.2

Release date: July 30, 2024

Latest release in series: 1.2.1

Tarantool Cluster Manager 1.2 introduces new features that extend its cluster management capabilities. Below is an overview of its key updates.

Managing Tarantool users

TCM 1.2 introduces the ability to manage Tarantool users on connected clusters. Previously, you could manage Tarantool users only through the Lua API (box.schema submodule) or cluster configuration. Now you can create, edit, and delete users and roles on each instance of a Tarantool cluster through the TCM web interface.

The tools for managing Tarantool users on a cluster instance are located on the Users tab of the instance page.

Learn more about managing Tarantool users from TCM in Managing cluster users and roles.

Migrations

Since version 1.2.0, TCM includes a page for editing and executing migrations on connected clusters. The new page Migrations in the Cluster page group provides a text editor where you can write migration scripts in Lua and apply them to the cluster.

Learn more about migrations in Tarantool Migrations.

Cluster security settings

Since version 1.2.2, TCM provides a web interface for managing cluster security settings on the Security page in the Cluster group.

Learn more about managing cluster security from TCM in Security settings.

TCF integration

Since version 1.2.2, TCM includes a page for managing clusters that run within Tarantool Clusters Federation.

Learn more about working with TCF in TCM in TCF integration.

Tarantool Cluster Manager 1.1

Release date: May 16, 2024

Latest release in series: 1.1.0

Tarantool Cluster Manager 1.1 introduces a number of new features that extend and improve its cluster management capabilities. Below is an overview of its key updates.

Data access

An important update of TCM 1.1.0 is a set of features that enable access to clusters’ stored data.

The instance space explorer shows all spaces that exist on an instance, including system spaces. On its pages, you can view and edit the stored data. To open the instance explorer, find the instance on the cluster stateboard and click its name to open its details page. Then click Explorer in the Actions menu in the top right corner.

In the development mode, the instance explorer also includes the schema editor. It allows you to add new and edit existing spaces.

For clusters that use the CRUD module, there is also the CRUD explorer that enables access to data in user spaces across the entire cluster. The CRUD explorer is located on the Tuples page.

Access control list

TCM’s access control list (ACL) enables control over user access to particular spaces and stored functions in the web interface.

For each user that has access to a cluster, you can enable the use of ACL on this cluster. This restricts this user’s access to the cluster’s spaces and functions unless they are explicitly specified in the ACL. The ACL must contain an entry for each such space and function.

Users with ACL off have access to all spaces and functions on clusters according to their cluster permissions.

The tools for managing ACL are located on the new ACL page.

API tokens

TCM 1.1 supports token authentication of external requests. Users can generate API tokens in their user settings dialog. An API token has the same permissions as its creator.

Stateboard improvements

TCM 1.1 extends the functionality of the cluster stateboard to improve the cluster management experience. Here are the key updates of the stateboard:

More flexible instance grouping.
Stateful failover and switchover controls.
Runtime issues on the stateboard.

Instance interaction

The instance management dialog has been extended with new functions:

A new terminal that uses the tt interactive console.
SQL query execution terminal.
Stored functions editor.
Slab visualization.

Cluster metrics

Starting from version 1.1.0, TCM displays metrics of connected clusters. You can view metrics in TCM one by one, visualizing them as charts or tables. The cluster metrics are shown on the new Cluster metrics page.

For more complex monitoring, you can use dedicated solutions, for example, Prometheus. It can integrate with TCM using the API tokens.

Configuration validation

The cluster configuration editor now validates the configuration semantically. Previously, TCM was able to highlight the syntax errors in configurations, for example, incorrect spelling of option names or hierarchy. In TCM 1.1.0, the editor checks and highlights possible semantic issues, such as:

Users without passwords.
Users with the super role.
Absence of leader instances in replica sets.

Onboarding tutorial

TCM 1.1.0 includes an interactive tutorial that takes new users through its main features and pages. It opens automatically after the first start.

Tarantool Cluster Manager 1.0

Release date: December 26, 2023

Latest release in series: 1.0.4

1.0 is the first public release series of Tarantool Cluster Manager. It was introduced as a part of the Tarantool EE 3.0 release. Below is an overview of key features of TCM 1.0.

Multiple connected clusters

TCM works as a standalone application. You can connect any number of Tarantool EE 3.0+ clusters to a single TCM instance and switch between them on the fly.

To connect a cluster to TCM, you need to provide the endpoint URLs and connection parameters of its centralized configuration storage (for example, etcd). To learn more, see Connecting clusters.

Cluster stateboard

The cluster stateboard is the main TCM page. It visualizes the information about the selected cluster:

Cluster topology visualized as a table or a graph
Tarantool versions running on instances
Memory statistics
Errors and warnings that happen on instances

From the stateboard, you can navigate to specific instances to view their details or connect to their interactive consoles.

To learn more, see Viewing cluster state.

Cluster configuration management

TCM includes a visual editor for cluster configuration. It allows editing cluster configurations as a YAML file in the browser. Once you’re done editing the configuration, you can send the changes to the configuration storage in one click or save them locally to continue editing them later.

To learn more, see Configuring clusters.

Role-based access control

TCM features its own role-based access control system. It defines users that can log into TCM and their permissions to perform various actions or access clusters in its web interface.

You can use built-in roles or create new ones with permissions you need. Users’ access can be limited to specific clusters and operations on them, for example, editing the configuration or calling stored functions. To learn more, see Access control.

TCM also supports LDAP authentication.

Audit logging

TCM has a built-in audit logging mechanism. When enabled, it records information about events that occur in TCM and users’ actions to dedicated audit log files. You can define events to write to the audit log and adjust logging parameters, such as filename, log rotation, or compression.

To learn more, see Audit log.

Interactive console

The interactive console is Tarantool’s basic command-line interface for entering requests and seeing results. It is what users see when they start the server without an instance file. The interactive console is often called the Lua console to distinguish it from the administrative console, but in fact it can handle both Lua and SQL input.

The majority of examples in this manual show what users see with the interactive console. It includes:

tarantool> prompt
instruction (a Lua request or an SQL statement)
response (a display in either YAML or Lua format)

-- Interactive console example with Lua input and YAML output --
tarantool> box.info().replication
---
- 1:
    id: 1
    uuid: a5d22f66-2d28-4a35-b78f-5bf73baf6c8a
    lsn: 0
...

Interactive console input and output

The input language can be either Lua (default) or SQL. To change the input language, run \set language <language>, for example:

-- Set input language to SQL --
tarantool> \set language sql
---
- true
...

The delimiter can be changed to any character with \set delimiter <character>. By default, the delimiter is empty, which means the input does not need to end with a delimiter. For example, a common recommendation for SQL input is to use the semicolon delimiter:

-- Set ';' delimiter --
tarantool> \set delimiter ;
---
...

The output format can be either YAML (default) or Lua. To change the output format, run \set output <format>, for example:

-- Set output format Lua --
tarantool> \set output lua
true

The default YAML output format is the following:

The output starts from a document-start line "---".
Each item begins on a separate line starting with "- ".
Each sub-item in a nested structure is indented.
The output ends with a document-end line "...".

The alternative Lua format for console output is the following:

There are no lines for document-start or document-end.
Items are separated by commas.
Each sub-item in a nested structure is placed inside “{}” braces.

So, when an input is a Lua object description, the output in the Lua format equals it.

For the Lua output format, you can specify an end of statement symbol. It is added to the end of each output statement in the current session and can be used for parsing the output by scripts. By default, the end of statement symbol is empty. You can change it to any character or character sequence. To set an end of statement symbol for the current session, run \set output lua,local_eos=<symbol>, for example:

-- Set output format Lua and '#' end of statement symbol --
tarantool> \set output lua,local_eos=#
true#

To switch back to the empty end of statement symbol:

-- Set output format Lua and empty end of statement symbol --
tarantool> \set output lua,local_eos=
true

The YAML output has better readability. The Lua output can be reused in requests. The table below shows output examples in these formats compared with the MsgPack format, which is good for database storage.

Type	Lua input	Lua output	YAML output	MsgPack storage
scalar	`1`	`1`	`---` `- 1` `...`	`\x01`
scalar sequence	`1, 2, 3`	`1, 2, 3`	`---` `- 1` `- 2` `- 3` `...`	`\x01 \x02 \x03`
2-element table	`{1, 2}`	`{1, 2}`	`---` `- - 1` `- 2` `...`	`\x92 \x01 \x02`
map	`{key = 1}`	`{key = 1}`	`---` `- key: 1` `...`	`\x81 \xa3 \x6b \x65 \x79 \x01`
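The MsgPack column of the table can be reproduced by hand. Below is a minimal sketch in plain Lua, not Tarantool's built-in msgpack module: it covers only the cases from the table (small non-negative integers, short strings, and small tables), and the function names encode and hex are chosen here for illustration.

```lua
-- Illustrative encoder: supports only integers 0..127 (positive fixint),
-- strings up to 31 bytes (fixstr), and tables with up to 15 entries
-- (fixarray / fixmap). Everything else raises an error.
local function encode(v)
  local t = type(v)
  if t == 'number' then
    assert(v >= 0 and v <= 127 and v % 1 == 0, 'only 0..127 integers')
    return string.char(v)
  elseif t == 'string' then
    assert(#v <= 31, 'only strings up to 31 bytes')
    return string.char(0xa0 + #v) .. v
  elseif t == 'table' then
    local n = #v
    if n > 0 then
      -- A table with a non-empty array part is encoded as an array.
      assert(n <= 15, 'only arrays up to 15 items')
      local parts = { string.char(0x90 + n) }
      for i = 1, n do
        parts[#parts + 1] = encode(v[i])
      end
      return table.concat(parts)
    end
    -- Otherwise the table is encoded as a map (an empty table too).
    local parts, count = {}, 0
    for key, value in pairs(v) do
      count = count + 1
      parts[#parts + 1] = encode(key) .. encode(value)
    end
    assert(count <= 15, 'only maps up to 15 pairs')
    return string.char(0x80 + count) .. table.concat(parts)
  end
  error('unsupported type: ' .. t)
end

-- Render a byte string in the \xNN notation used in the table.
local function hex(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format('\\x%02x', s:byte(i))
  end
  return table.concat(out, ' ')
end

print(hex(encode(1)))           -- \x01
print(hex(encode({1, 2})))      -- \x92 \x01 \x02
print(hex(encode({key = 1})))   -- \x81 \xa3 \x6b \x65 \x79 \x01
```

A scalar sequence such as 1, 2, 3 is three separate values, so each of them is encoded on its own.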

The console parameters of a Tarantool instance can also be changed from another instance using the console built-in module functions.

Keyboard shortcuts

Since 2.10.0.

Keyboard shortcut	Effect
`CTRL+C`	Discard current input with the `SIGINT` signal in the console mode and jump to a new line with a default prompt.
`CTRL+D`	Quit Tarantool interactive console.

Important

Keep in mind that the CTRL+C shortcut shuts Tarantool down if a command is currently running in the console. The SIGINT signal stops an instance running in daemon mode.

LuaJIT memory profiler

Since version 2.7.1, Tarantool has a built‑in module called misc.memprof that implements a LuaJIT memory profiler (further in this section we call it the profiler for short). The profiler provides a memory allocation report that helps analyze Lua code and find the places that put the most pressure on the Lua garbage collector (GC).

Inside this section:

Working with the profiler
- Collecting a binary profile
- Parsing a binary profile and generating a profiling report
FAQ
Profiling a report analysis example
The heap summary and the --leak-only option

Working with the profiler

The profiler usage involves two steps:

Collecting a binary profile of allocations, reallocations, and deallocations in memory related to Lua (further, binary memory profile or binary profile for short).
Parsing the collected binary profile to get a human-readable profiling report.

Collecting a binary profile

To collect a binary profile for a particular part of the Lua code, you need to place this part between two misc.memprof functions, namely, misc.memprof.start() and misc.memprof.stop(), and then execute the code in Tarantool.

Below is a chunk of Lua code named test.lua to illustrate this.

  -- Prevent allocations on traces.
  jit.off()
  local str, err = misc.memprof.start("memprof_new.bin")
  -- Lua doesn't create a new frame to call string.rep, and all allocations
  -- are attributed not to the append() function but to the parent scope.
  local function append(str, rep)
      return string.rep(str, rep)
  end

  local t = {}
  for i = 1, 1e4 do
      -- table.insert is the built-in function and all corresponding
      -- allocations are reported in the scope of the main chunk.
      table.insert(t,
          append('q', i)
      )
  end
  local str, err = misc.memprof.stop()

The Lua code for starting the profiler – as in line 3 in the test.lua example above – is:

local str, err = misc.memprof.start(FILENAME)

where FILENAME is the name of the binary file where profiling events are written.

If the operation fails, for example if it is not possible to open a file for writing or if the profiler is already running, misc.memprof.start() returns nil as the first result, an error-message string as the second result, and a system-dependent error code number as the third result.

If the operation succeeds, misc.memprof.start() returns true.

The Lua code for stopping the profiler – as in line 18 in the test.lua example above – is:

local str, err = misc.memprof.stop()

If the operation fails, for example if there is an error when the file descriptor is being closed or if there is a failure during reporting, misc.memprof.stop() returns nil as the first result, an error-message string as the second result, and a system-dependent error code number as the third result.

If the operation succeeds, misc.memprof.stop() returns true.
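The return conventions described above can be wrapped in a small helper. The sketch below is only an illustration: profile_call is a hypothetical name, not a Tarantool function, and the optional third argument exists so that the helper can be tried outside Tarantool with a stand-in object. Under Tarantool it falls back to the built-in misc.memprof.

```lua
-- Hypothetical helper (not part of Tarantool): profile one function call.
-- `profiler` is any table with start(path) and stop(); when it is omitted,
-- the built-in misc.memprof module is used if it exists.
local function profile_call(path, fn, profiler)
  local misc_mod = rawget(_G, 'misc')
  profiler = profiler or (misc_mod and misc_mod.memprof)
  if not profiler then
    return nil, 'misc.memprof is not available in this runtime'
  end

  -- On failure, start() returns nil, an error message, and an error code.
  local ok, err, code = profiler.start(path)
  if not ok then
    return nil, string.format('start failed: %s (code %s)',
                              tostring(err), tostring(code))
  end

  -- pcall() makes sure the profiler is stopped even if fn() raises.
  local call_ok, call_err = pcall(fn)

  ok, err, code = profiler.stop()
  if not ok then
    return nil, string.format('stop failed: %s (code %s)',
                              tostring(err), tostring(code))
  end
  if not call_ok then
    return nil, 'profiled function failed: ' .. tostring(call_err)
  end
  return true
end

-- Usage: under Tarantool this writes memprof_new.bin; in a runtime without
-- misc.memprof it only reports that the profiler is unavailable.
local ok, msg = profile_call('memprof_new.bin', function()
  local t = {}
  for i = 1, 100 do
    t[i] = string.rep('q', i)
  end
end)
print(ok, msg)
```

Returning nil plus a message mirrors the convention of misc.memprof itself, so the caller can handle both kinds of failure in the same way.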

To generate the file with memory profile in binary format (in the test.lua code example above the file name is memprof_new.bin), execute the code in Tarantool:

$ tarantool test.lua

Tarantool collects the allocation events in memprof_new.bin, puts the file in its working directory, and closes the session.

The test.lua code example above also illustrates the memory allocation logic in some cases that are important to understand for reading and analyzing a profiling report:

Line 2: It is recommended to switch the JIT compilation off by calling jit.off() before the profiler start. Refer to the note about jit.off() later in this section for more details.
Lines 6-8: Tail call optimization doesn’t create a new call frame, so all allocations inside the function called via the CALLT/CALLMT bytecodes are attributed to the function’s caller. See also the comments preceding these lines.
Lines 14-16: Usually the information about allocations inside Lua built‑ins is not really useful for developers. That’s why if a Lua built‑in function is called from a Lua function, the profiler attributes all allocations to the Lua function. Otherwise, this event is attributed to a C function. See also the comments preceding these lines.

Parsing a binary profile and generating a profiling report

After getting the memory profile in binary format, the next step is to parse it to get a human-readable profiling report. You can do this via Tarantool by using the following command (mind the hyphen - before the filename):

$ tarantool -e 'require("memprof")(arg)' - memprof_new.bin

where memprof_new.bin is the binary profile generated earlier by tarantool test.lua.

Note

There is a slight behavior change here: the tarantool -e ... command was slightly different in Tarantool versions prior to Tarantool 2.8.1.

Tarantool generates a profiling report and displays it on the console before closing the session:

ALLOCATIONS
@test.lua:14: 10000 events  +50240518 bytes -0 bytes
@test.lua:9: 1 events       +32 bytes       -0 bytes
@test.lua:8: 1 events       +20 bytes       -0 bytes
@test.lua:13: 1 events      +24 bytes       -0 bytes

REALLOCATIONS
@test.lua:13: 13 events     +262216 bytes   -131160 bytes
    Overrides:
        @test.lua:13

@test.lua:14: 11 events     +49536 bytes    -24768 bytes
    Overrides:
        @test.lua:14
        INTERNAL

INTERNAL: 3 events          +8448 bytes     -16896 bytes
    Overrides:
        @test.lua:14

DEALLOCATIONS
INTERNAL: 1723 events       +0 bytes        -483515 bytes
@test.lua:14: 1 events      +0 bytes        -32768 bytes

HEAP SUMMARY:
@test.lua:14 holds 50248326 bytes: 10010 allocs, 10 frees
@test.lua:13 holds 131080 bytes: 14 allocs, 13 frees
INTERNAL holds 8448 bytes: 3 allocs, 3 frees
@test.lua:9 holds 32 bytes: 1 allocs, 0 frees
@test.lua:8 holds 20 bytes: 1 allocs, 0 frees

Note

On macOS, a report will be different for the same chunk of code because Tarantool and LuaJIT are built with the GC64 mode enabled for macOS.

Let’s examine the report structure. A report has four sections:

ALLOCATIONS
REALLOCATIONS
DEALLOCATIONS
HEAP SUMMARY (described later in The heap summary and the --leak-only option)

Each section contains event records that are sorted from the most frequent to the least frequent.

An event record has the following format:

@<filename>:<line_number>: <number_of_events> events +<allocated> bytes -<freed> bytes

where:

<filename> – the name of the file containing Lua code.
<line_number> – the line number where the event is detected.
<number_of_events> – the number of events for this code line.
+<allocated> bytes – the amount of memory allocated during all the events on this line.
-<freed> bytes – the amount of memory freed during all the events on this line.

The Overrides label shows what allocation has been overridden.

See the test.lua chunk above with the explanation in the comments for some examples.

The INTERNAL label indicates that this event is caused by internal LuaJIT structures.
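If a report has to be processed by a script, the record format above maps onto a single Lua pattern. The following sketch is an illustration only: parse_event is a hypothetical helper, and it assumes that each event record occupies one line, as in the sample report.

```lua
-- Hypothetical helper: turn one event record into a table.
-- Returns nil for lines that are not event records (section titles,
-- "Overrides:" labels, and the entries listed under them).
local function parse_event(line)
  local source, events, allocated, freed = line:match(
    '^(.-):%s*(%d+) events%s+%+(%d+) bytes%s+%-(%d+) bytes%s*$')
  if not source then
    return nil
  end
  local record = {
    events = tonumber(events),
    allocated = tonumber(allocated),
    freed = tonumber(freed),
  }
  -- "@<filename>:<line_number>" for Lua code, a bare label otherwise.
  local file, lineno = source:match('^@(.+):(%d+)$')
  if file then
    record.file, record.line = file, tonumber(lineno)
  else
    record.label = source -- for example, INTERNAL
  end
  return record
end

local rec = parse_event('@test.lua:13: 13 events     +262216 bytes   -131160 bytes')
print(rec.file, rec.line, rec.events, rec.allocated - rec.freed)
-- test.lua  13  13  131056
```

Records with the INTERNAL label have no file and line, so the sketch stores the label in a separate field instead.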

Note

Important note regarding the INTERNAL label and the recommendation of switching the JIT compilation off (jit.off()): this version of the profiler doesn’t support verbose reporting for allocations on traces. If memory allocations are made on a trace, the profiler can’t associate the allocations with the part of Lua code that generated the trace. In this case, the profiler labels such allocations as INTERNAL.

So, if the JIT compilation is on, new traces will be generated and there will be a mixture of events labeled INTERNAL in the profiling report: some of them are really caused by internal LuaJIT structures, but some of them are caused by allocations on traces.

If you want to have a more definite report without JIT compiler allocations, call jit.off() before starting the profiling. And if you want to completely exclude the trace allocations from the report, remove also the old traces by additionally calling jit.flush() after jit.off().

Nevertheless, switching the JIT compilation off before the profiling is not “a must”. It is rather a recommendation, and in some cases, for example in a production environment, you may need to keep JIT compilation on to see the full picture of all the memory allocations. In this case, the majority of the INTERNAL events are most probably caused by traces.

As for investigating the Lua code with the help of profiling reports, it is always code-dependent, and there can’t be one-hundred-percent definite recommendations in this regard. Nevertheless, some typical cases are examined in the Profiling a report analysis example later.

Also, below is the FAQ section with the questions that most probably can arise while using the profiler.

FAQ

In this section, some profiler-related points are discussed in a Q&A format.

Question (Q): Is the profiler suitable for C allocations or allocations inside C code?

Answer (A): The profiler reports only allocation events caused by the Lua allocator. All Lua-related allocations, like table or string creation are reported. But the profiler doesn’t report allocations made by malloc() or other non-Lua allocators. You can use valgrind to debug them.

Q: Why are there so many INTERNAL allocations in my profiling report? What does it mean?

A: INTERNAL means that these allocations/reallocations/deallocations are related to the internal LuaJIT structures or are made on traces. Currently, the profiler doesn’t verbosely report allocations of objects that are made during trace execution. Try adding jit.off() before the profiler start.

Q: Why are there some reallocations/deallocations without an Overrides section?

A: These objects may have been created before the profiler started. Adding collectgarbage() before the profiler’s start collects all previously allocated objects that are already dead when the profiler starts.

Q: Why are some objects not collected during profiling? Is it a memory leak?

A: LuaJIT uses incremental Garbage Collector (GC). A GC cycle may not be finished at the moment the profiler stops. Add collectgarbage() before stopping the profiler to collect all the dead objects for sure.

Q: Can I profile not just a current chunk but the entire running application? Can I start the profiler when the application is already running?

A: Yes. Here is an example of code that can be inserted in the Tarantool console for a running instance.

  local fiber = require "fiber"
  local log = require "log"

  fiber.create(function()
    fiber.name("memprof")

    collectgarbage() -- Collect all objects already dead
    log.warn("start of profile")

    local st, err = misc.memprof.start(FILENAME)
    if not st then
      log.error("failed to start profiler: %s", err)
      return -- Nothing to stop if the profiler did not start
    end

    fiber.sleep(TIME)

    collectgarbage()
    st, err = misc.memprof.stop()

    if not st then
      log.error("profiler on stop error: %s", err)
    end

    log.warn("end of profile")
  end)

where:

FILENAME – the name of the binary file where profiling events are written.
TIME – the duration of profiling, in seconds.

Also, you can directly call misc.memprof.start() and misc.memprof.stop() from a console.

Profiling a report analysis example

In the example below, the following Lua code named format_concat.lua is investigated with the help of the memory profiler reports.

  -- Prevent allocations on new traces.
  jit.off()

  local function concat(a)
    local nstr = a.."a"
    return nstr
  end

  local function format(a)
    local nstr = string.format("%sa", a)
    return nstr
  end

  collectgarbage()

  local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*)%.lua$")..".bin"

  local st, err = misc.memprof.start(binfile)
  assert(st, err)

  -- Payload.
  for i = 1, 10000 do
    local f = format(i)
    local c = concat(i)
  end
  collectgarbage()

  local st, err = misc.memprof.stop()
  assert(st, err)

  os.exit()

When you run this code in Tarantool and then parse the binary memory profile in /tmp/memprof_format_concat.bin, you will get the following profiling report:

ALLOCATIONS
@format_concat.lua:10: 19996 events +624284 bytes   -0 bytes
INTERNAL: 1 events                  +65536 bytes    -0 bytes

REALLOCATIONS

DEALLOCATIONS
INTERNAL: 19996 events              +0 bytes        -558778 bytes
    Overrides:
        @format_concat.lua:10

@format_concat.lua:10: 2 events     +0 bytes        -98304 bytes
    Overrides:
        @format_concat.lua:10

HEAP SUMMARY:
INTERNAL holds 65536 bytes: 1 allocs, 0 frees

Reasonable questions regarding the report can be:

Why are there no allocations related to the concat() function?
Why is the number of allocations not a round number?
Why are there about 20K allocations instead of 10K?

First of all, LuaJIT doesn’t create a new string if the string with the same payload exists (see details on lua-users.org/wiki). This is called string interning. So, when a string is created via the format() function, there is no need to create the same string via the concat() function, and LuaJIT just uses the previous one.

That is also the reason why the number of allocations is not a round number, as could be expected from the loop for i = 1, 10000...: Tarantool creates some strings for internal needs and built‑in modules, so some strings already exist.

But why are there so many allocations? The number is almost twice the expected amount. This is because the string.format() built‑in function creates another string necessary for the %s specifier, so there are two allocations for each iteration: for tostring(i) and for string.format("%sa", string_i_value). You can see the difference in behavior by adding the line local _ = tostring(i) between lines 22 and 23.

To profile only the concat() function, comment out line 23 (which is local f = format(i)) and run the profiler. Now the output looks like this:

ALLOCATIONS
@format_concat.lua:5: 10000 events  +284411 bytes    -0 bytes

REALLOCATIONS

DEALLOCATIONS
INTERNAL: 10000 events              +0 bytes         -218905 bytes
    Overrides:
        @format_concat.lua:5

@format_concat.lua:5: 1 events      +0 bytes         -32768 bytes

HEAP SUMMARY:
@format_concat.lua:5 holds 65536 bytes: 10000 allocs, 9999 frees

Q: But what will change if JIT compilation is enabled?

A: In the code, comment out line 2 (which is jit.off()) and run the profiler. Now there are only 56 allocations in the report, and all the other allocations are JIT-related (see also the related dev issue):

ALLOCATIONS
@format_concat.lua:5: 56 events +1112 bytes -0 bytes
@format_concat.lua:0: 4 events  +640 bytes  -0 bytes
INTERNAL: 2 events              +382 bytes  -0 bytes

REALLOCATIONS

DEALLOCATIONS
INTERNAL: 58 events             +0 bytes    -1164 bytes
    Overrides:
        @format_concat.lua:5
        INTERNAL


HEAP SUMMARY:
@format_concat.lua:0 holds 640 bytes: 4 allocs, 0 frees
INTERNAL holds 360 bytes: 2 allocs, 1 frees

This happens because a trace has been compiled after 56 iterations (the default value of the hotloop compiler parameter). Then, the JIT-compiler removed the unused variable c from the trace, and, therefore, the dead code of the concat() function is eliminated.

Next, let’s profile only the format() function with JIT enabled. For that, comment out lines 2 and 24 (jit.off() and local c = concat(i)), do not comment out line 23 (local f = format(i)), and run the profiler. Now the output will look like this:

ALLOCATIONS
@format_concat.lua:10: 19996 events +624284 bytes  -0 bytes
INTERNAL: 4 events                  +66928 bytes   -0 bytes
@format_concat.lua:0: 4 events      +640 bytes     -0 bytes

REALLOCATIONS

DEALLOCATIONS
INTERNAL: 19997 events              +0 bytes       -559034 bytes
    Overrides:
        @format_concat.lua:0
        @format_concat.lua:10

@format_concat.lua:10: 2 events     +0 bytes       -98304 bytes
    Overrides:
        @format_concat.lua:10


HEAP SUMMARY:
INTERNAL holds 66928 bytes: 4 allocs, 0 frees
@format_concat.lua:0 holds 384 bytes: 4 allocs, 1 frees

Q: Why are there so many allocations in comparison to the concat() function?

A: The answer is simple: the string.format() function with the %s specifier is not yet compiled by LuaJIT. So, a trace can’t be recorded and the compiler doesn’t perform the corresponding optimizations.

If we change the format() function in lines 9-12 of the Profiling a report analysis example in the following way

local function format(a)
  local nstr = string.format("%sa", tostring(a))
  return nstr
end

the profiling report becomes much prettier:

ALLOCATIONS
@format_concat.lua:10: 109 events   +2112 bytes -0 bytes
@format_concat.lua:0: 4 events      +640 bytes  -0 bytes
INTERNAL: 3 events                  +1206 bytes -0 bytes

REALLOCATIONS

DEALLOCATIONS
INTERNAL: 112 events                +0 bytes    -2460 bytes
    Overrides:
        @format_concat.lua:0
        @format_concat.lua:10
        INTERNAL


HEAP SUMMARY:
INTERNAL holds 1144 bytes: 3 allocs, 1 frees
@format_concat.lua:0 holds 384 bytes: 4 allocs, 1 frees

The heap summary and the --leak-only option

This feature was added in version 2.8.1.

The end of each display is a HEAP SUMMARY section which looks like this:

@<filename>:<line number> holds <number of still reachable bytes> bytes:
<number of allocation events> allocs, <number of deallocation events> frees

Sometimes a program causes many deallocations, so the DEALLOCATIONS section becomes large and the display is hard to read. To minimize the output, start the parsing with an extra flag, --leak-only, for example:

$ tarantool -e 'require("memprof")(arg)' - --leak-only memprof_new.bin

When --leak-only is used, only the HEAP SUMMARY section is displayed.
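When the HEAP SUMMARY section is long, it can be ranked by a script. The sketch below is an illustration under two assumptions: each record occupies one line, as in the sample reports in this section, and the helper name parse_heap_summary is hypothetical. It reads the text of a report and returns the records sorted by the number of bytes still held.

```lua
-- Hypothetical helper: extract HEAP SUMMARY records from report text
-- and sort them so that the biggest holders come first.
local function parse_heap_summary(text)
  local entries = {}
  for line in text:gmatch('[^\n]+') do
    local source, bytes, allocs, frees = line:match(
      '^(.-) holds (%d+) bytes: (%d+) allocs, (%d+) frees%s*$')
    if source then
      entries[#entries + 1] = {
        source = source,
        bytes = tonumber(bytes),
        allocs = tonumber(allocs),
        frees = tonumber(frees),
      }
    end
  end
  table.sort(entries, function(a, b) return a.bytes > b.bytes end)
  return entries
end

local report = [[
HEAP SUMMARY:
@test.lua:13 holds 131080 bytes: 14 allocs, 13 frees
@test.lua:14 holds 50248326 bytes: 10010 allocs, 10 frees
INTERNAL holds 8448 bytes: 3 allocs, 3 frees
]]

for _, e in ipairs(parse_heap_summary(report)) do
  print(e.source, e.bytes, e.allocs - e.frees)
end
```

The difference between allocs and frees printed in the last column gives a rough count of objects that were still alive when the profiler stopped.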

LuaJIT platform profiler

The default profiling options for LuaJIT are not fine-grained enough to give an understanding of performance. For example, perf is only able to show the host stack, so all the Lua calls are displayed as a single pcall(). Conversely, the jit.p module provided with LuaJIT is not able to give any information about the host stack.

Since version 2.10.0, Tarantool has a built‑in module called misc.sysprof that implements a LuaJIT sampling profiler (further in this section we call it the profiler for short). The profiler can capture both guest and host stacks simultaneously, along with virtual machine states, so it can show the whole picture.

Three profiling modes are available:

Default: shows only virtual machine state counters.
Leaf: shows the last frame on the stack.
Callchain: performs a complete stack dump.

The profiler comes with a default parser, which produces output in a flamegraph.pl-suitable format.

Inside this section:

Working with the profiler
- Collecting a binary profile
- Parsing a binary profile and generating a profiling report
Profiler Lua API
Profiler C API

Working with the profiler

The profiler usage involves two steps:

Collecting a binary profile of stacks (further referred to as the binary sampling profile or binary profile for short).
Parsing the collected binary profile to get a human-readable profiling report.

Collecting a binary profile

To collect a binary profile for a particular part of the Lua and C code, you need to place this part between two misc.sysprof functions – namely, misc.sysprof.start() and misc.sysprof.stop() – and then execute the code in Tarantool.

Below is a chunk of Lua code named test.lua to illustrate this.

  local function payload()
    local function fib(n)
      if n <= 1 then
        return n
      end
      return fib(n - 1) + fib(n - 2)
    end
    return fib(32)
  end

  payload()

  local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})
  assert(res, err)

  payload()

  res, err = misc.sysprof.stop()
  assert(res, err)

The Lua code for starting the profiler – as in line 13 in the test.lua example above – is:

local res, err = misc.sysprof.start({mode = 'C', interval = 1, path = 'sysprof.bin'})

where:

mode is a profiling mode;
interval is a sampling interval;
path is the name of the binary file where profiling events are written (sysprof.bin in this example).

If the operation fails, for example if it is not possible to open a file for writing or if the profiler is already running, misc.sysprof.start() returns nil as the first result, an error-message string as the second result, and a system-dependent error code number as the third result.

If the operation succeeds, misc.sysprof.start() returns true.

The Lua code for stopping the profiler – as in the misc.sysprof.stop() call in the test.lua example above – is:

local res, err = misc.sysprof.stop()

If the operation fails, for example if there is an error when the file descriptor is being closed or if there is a failure during reporting, misc.sysprof.stop() returns nil as the first result, an error-message string as the second result, and a system-dependent error code number as the third result.

If the operation succeeds, misc.sysprof.stop() returns true.
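The start/stop protocol described above can be wrapped in a small helper. The following is a minimal sketch, not part of the profiler API: the name profile_call is hypothetical, and the profiler table is passed in as an argument (misc.sysprof in Tarantool) so that the helper can also be tried with a stub outside Tarantool.

```lua
-- Sketch: run fn between sysprof.start() and sysprof.stop().
-- `sysprof` is expected to behave like misc.sysprof: start(opts) and
-- stop() return true on success, or nil plus an error message on failure.
local function profile_call(sysprof, opts, fn, ...)
  local started, start_err = sysprof.start(opts)
  if not started then
    return nil, 'sysprof.start failed: ' .. tostring(start_err)
  end

  -- pcall makes sure the profiler is stopped even if fn raises an error.
  local results = {pcall(fn, ...)}

  local stopped, stop_err = sysprof.stop()
  if not stopped then
    return nil, 'sysprof.stop failed: ' .. tostring(stop_err)
  end
  if not results[1] then
    return nil, 'profiled function failed: ' .. tostring(results[2])
  end
  -- Only the first return value of fn is passed through.
  return true, results[2]
end

-- In Tarantool, with payload() from test.lua:
-- local ok, res = profile_call(misc.sysprof,
--   {mode = 'C', interval = 1, path = 'sysprof.bin'}, payload)
```

Using pcall here is a design choice: without it, an error raised inside the profiled code would leave the profiler running, and the next misc.sysprof.start() call would fail.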

To generate a file with the sampling profile in the binary format (in the test.lua code example above the file name is sysprof.bin), execute the code in Tarantool:

$ tarantool test.lua

Tarantool collects profiling samples in sysprof.bin, puts the file in its working directory, and closes the session.

Parsing a binary profile and generating a profiling report

After getting the platform profile in the binary format, the next step is to parse it to get a human-readable profiling report. You can do this via Tarantool with the following command (mind the hyphen - before the filename):

$ tarantool -e 'require("sysprof")(arg)' - sysprof.bin > tmp
$ curl -O https://raw.githubusercontent.com/brendangregg/FlameGraph/refs/heads/master/flamegraph.pl
$ perl flamegraph.pl tmp > sysprof.svg

where sysprof.bin is the binary profile generated earlier by tarantool test.lua.

Note

There is a slight behavior change here: the tarantool -e ... command was slightly different in Tarantool versions prior to Tarantool 2.8.1.

The resulting SVG image contains a flamegraph with collected stacks and can be opened in a modern web browser for analysis.

How to investigate Lua code with the help of profiling reports always depends on the code, so there are no definite recommendations in this regard. Nevertheless, the Profiling report analysis example shows some of the things to look for.

Profiler Lua API

The platform profiler provides a Lua interface:

misc.sysprof.start(opts)
misc.sysprof.stop()
misc.sysprof.report()

The first two functions return two values, res and err: on success, res is true; on failure, res is nil and err contains an error message.

misc.sysprof.report returns a Lua table containing the following counters:

{
  "samples" = int,
  "INTERP" = int,
  "LFUNC" = int,
  "FFUNC" = int,
  "CFUNC" = int,
  "GC" = int,
  "EXIT" = int,
  "RECORD" = int,
  "OPT" = int,
  "ASM" = int,
  "TRACE" = int
}

The opts argument of misc.sysprof.start can contain the following parameters:

mode (required) – one of the supported profiling modes:
- 'D' = DEFAULT
- 'L' = LEAF
- 'C' = CALLGRAPH
interval (optional) – sampling interval in msec (default is 10 msec).
path (optional) – path to a file to store profile data (default is sysprof.bin).
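The counters returned by misc.sysprof.report() are easier to read as shares of the total. The sketch below is only an illustration: the helper name vmstate_shares is hypothetical, and it assumes that the samples field is the total number of collected samples.

```lua
-- Sketch: convert a misc.sysprof.report() table into percentages.
local VMSTATES = {
  'INTERP', 'LFUNC', 'FFUNC', 'CFUNC', 'GC',
  'EXIT', 'RECORD', 'OPT', 'ASM', 'TRACE',
}

local function vmstate_shares(report)
  local total = report.samples or 0
  local shares = {}
  for _, name in ipairs(VMSTATES) do
    local count = report[name] or 0
    -- Avoid dividing by zero when no samples were collected.
    shares[name] = total > 0 and (count / total * 100) or 0
  end
  return shares
end

-- In Tarantool, after profiling in the 'D' mode:
-- local shares = vmstate_shares(misc.sysprof.report())
-- print(('GC: %.1f%%'):format(shares.GC))
```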

Profiler C API

The platform profiler provides a low-level C interface:

int luaM_sysprof_set_writer(sp_writer writer) – sets a writer function for sysprof.
int luaM_sysprof_set_on_stop(sp_on_stop on_stop) – sets an on-stop callback for sysprof to clear resources.
int luaM_sysprof_set_backtracer(sp_backtracer backtracer) – sets a backtracing function. If the backtracer argument is NULL, the default backtracer is set.

Note

There is no need to call the configuration functions multiple times if you are starting and stopping the profiler several times in a single program.

Also, it is not necessary to configure sysprof for the Default mode. However, you MUST configure it for other modes.

int luaM_sysprof_start(lua_State *L, const struct luam_Sysprof_Options *opt) – see Profiler options.
int luaM_sysprof_stop(lua_State *L)
int luaM_sysprof_report(struct luam_Sysprof_Counters *counters) – writes profiling counters for each vmstate.

All of the functions return 0 on success and an error code on failure.

Configuration C types

Profiler configuration settings include:

typedef size_t (*sp_writer)(const void **data, size_t len, void *ctx) – a writer function for profile events.

Must be async-safe, see also man 7 signal-safety.

Should return the amount of written bytes on success, or zero in case of error.

Setting *data to NULL means end of profiling. For details see lj_wbuf.h.
typedef int (*sp_on_stop)(void *ctx, uint8_t *buf) – a callback on profiler stopping. Required for a correct cleanup at VM finalization when the profiler is still running.

Returns zero on success.
typedef void (*sp_backtracer)(void *(*frame_writer)(int frame_no, void *addr)) – a backtracing function for the host stack. Should call frame_writer on each frame in the stack, in the order from the stack top to the stack bottom.

The frame_writer function is implemented inside sysprof and will be passed to the backtracer function.

If frame_writer returns NULL, backtracing should stop. If frame_writer returns a non-NULL value, backtracing should continue while there are frames left.

Profiler options

The options structure for luaM_sysprof_start is as follows:

struct luam_Sysprof_Options {
  /* Profiling mode. */
  uint8_t mode;

  /* Sampling interval in msec. */
  uint64_t interval;

  /* Custom buffer to write data. */
  uint8_t *buf;

  /* The buffer's size. */
  size_t len;

  /* Context for the profile writer and final callback. */
  void *ctx;
};

Profiling modes

The platform profiler supports three profiling modes:

DEFAULT mode collects only data for luam_Sysprof_Counters, which is stored in memory and can be collected with luaM_sysprof_report after the profiler stops.
LEAF mode = DEFAULT + streams samples with only top frames of the host and guest stacks in the format described in lj_sysprof.h.
CALLGRAPH mode = DEFAULT + streams samples with full callchains of the host and guest stacks in the format described in lj_sysprof.h.

#define LUAM_SYSPROF_DEFAULT 0
#define LUAM_SYSPROF_LEAF 1
#define LUAM_SYSPROF_CALLGRAPH 2

Profiling counters

The counters structure for luaM_sysprof_report is as follows:

struct luam_Sysprof_Counters {
  uint64_t vmst_interp;
  uint64_t vmst_lfunc;
  uint64_t vmst_ffunc;
  uint64_t vmst_cfunc;
  uint64_t vmst_gc;
  uint64_t vmst_exit;
  uint64_t vmst_record;
  uint64_t vmst_opt;
  uint64_t vmst_asm;
  uint64_t vmst_trace;

  uint64_t samples;
};

Note

The order of vmst_* counters is important: it should be the same as the order of the vmstates.

Caveats

Providing writers, backtracers, and other settings in the Default mode is pointless, since it only collects counters.
There is NO default configuration for sysprof in the other modes, so the configuration functions (luaM_sysprof_set_writer and the other luaM_sysprof_set_* functions listed above) must be called before the first run of sysprof. Mind the async safety.

LuaJIT getmetrics

Tarantool can return metrics of a current instance via the Lua API or the C API.

misc.getmetrics()
getmetrics table values
getmetrics C API
Example with gc_strnum, strhash_miss, and strhash_hit
Example with gc_allocated and gc_freed
Example with gc_allocated and a space optimization
gc_steps_atomic and gc_steps_propagate
Example with jit_trace_num and jit_trace_abort
Example with jit_snap_restore and a performance unoptimization

misc.getmetrics()

getmetrics()

Get the metrics values into a table.

Parameters: none

Return:	table

Example: metrics_table = misc.getmetrics()

getmetrics table values

The metrics table contains 19 values. All values have type = 'number' and are the result of a cast to double, so there may be a very slight precision loss. Values whose names begin with gc_ are associated with the LuaJIT garbage collector; a fuller study of the garbage collector can be found at a Lua-users wiki page and a slide from the creator of Lua. Values whose names begin with jit_ are associated with the “phases” of the just-in-time compilation process; a fuller study of JIT phases can be found in a master's thesis from cern.ch.

Values described as “monotonic” are cumulative, that is, they are “totals since all operations began”, rather than “since the last getmetrics() call”. Overflow is possible.

Because many values are monotonic, a typical analysis involves calling getmetrics(), saving the table, calling getmetrics() again and comparing the table to what was saved. The difference is a “slope curve”. An interesting slope curve is one that shows acceleration, for example the difference between the latest value and the previous value keeps increasing. Some of the table members shown here are used in the examples that come later in this section.
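The comparison described above can be sketched with two small helpers. Both names, metrics_diff and is_accelerating, are hypothetical and not part of Tarantool. They work on plain tables, so the input can come from misc.getmetrics() or from saved data.

```lua
-- Sketch: per-key difference between two metrics tables.
local function metrics_diff(old, new)
  local diff = {}
  for name, value in pairs(new) do
    diff[name] = value - (old[name] or 0)
  end
  return diff
end

-- Sketch: true if the step between consecutive samples keeps increasing,
-- which is the "acceleration" case mentioned above.
local function is_accelerating(samples)
  local prev_delta
  for i = 2, #samples do
    local delta = samples[i] - samples[i - 1]
    if prev_delta ~= nil and delta <= prev_delta then
      return false
    end
    prev_delta = delta
  end
  -- At least three samples are needed to compare two steps.
  return #samples >= 3
end

-- In Tarantool:
-- local oldm = misc.getmetrics()
-- ... run the code under study ...
-- local diff = metrics_diff(oldm, misc.getmetrics())
-- print(diff.gc_allocated)
```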

Name	Content	Monotonic?
gc_allocated	number of bytes of allocated memory	yes
gc_cdatanum	number of allocated cdata objects	no
gc_freed	number of bytes of freed memory	yes
gc_steps_atomic	number of steps of garbage collector, atomic phases, incremental	yes
gc_steps_finalize	number of steps of garbage collector, finalize	yes
gc_steps_pause	number of steps of garbage collector, pauses	yes
gc_steps_propagate	number of steps of garbage collector, propagate	yes
gc_steps_sweep	number of steps of garbage collector, sweep phases (see the Sweep phase description)	yes
gc_steps_sweepstring	number of steps of garbage collector, sweep phases for strings	yes
gc_strnum	number of allocated string objects	no
gc_tabnum	number of allocated table objects	no
gc_total	number of bytes of currently allocated memory (normally equals gc_allocated minus gc_freed)	no
gc_udatanum	number of allocated udata objects	no
jit_mcode_size	total size of all allocated machine code areas	no
jit_snap_restore	overall number of snap restores, based on the number of guard assertions leading to stopping trace executions (see external Snap tutorial)	yes
jit_trace_abort	overall number of aborted traces	yes
jit_trace_num	number of JIT traces	no
strhash_hit	number of times a string was interned without a new allocation because a string with the same value was found via the hash	yes
strhash_miss	total number of string allocations during the platform lifetime	yes

Note: Although value names are similar to value names in ujit.getmetrics() the values are not the same, primarily because many ujit numbers are not monotonic.

Note: Although value names are similar to value names in LuaJIT metrics, and the values are exactly the same, misc.getmetrics() is slightly easier because there is no need to ‘require’ the misc module.

getmetrics C API

The Lua getmetrics() function is a wrapper for the C function luaM_metrics().

C programs may include a header named lmisclib.h. The definitions in lmisclib.h include the following lines:

struct luam_Metrics { /* the names described earlier for Lua */ };

LUAMISC_API void luaM_metrics(lua_State *L, struct luam_Metrics *metrics);

The names of struct luam_Metrics members are the same as Lua’s getmetrics table values names. The data types of struct luam_Metrics members are all size_t. The luaM_metrics() function will fill the *metrics structure with the metrics related to the Lua state anchored to the L coroutine.

Example with a C program

Go through the C stored procedures tutorial. Replace the easy.c example with

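#include <stdio.h>

#include "module.h"
#include <lmisclib.h>

int easy(box_function_ctx_t *ctx, const char *args, const char *args_end)
{
  /* Get the Lua state of the running Tarantool instance. */
  lua_State *ls = luaT_state();
  struct luam_Metrics m;
  luaM_metrics(ls, &m);
  /* The members of struct luam_Metrics are size_t, hence %zu. */
  printf("allocated memory = %zu\n", m.gc_allocated);
  return 0;
}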

Now when you go back to the client and execute the requests up to and including the line capi_connection:call('easy') you will see that the display is something like “allocated memory = 4431950” although the number will vary.

Example with gc_strnum, strhash_miss, and strhash_hit

To track new string object allocations:

function f()
  collectgarbage("collect")
  local oldm = misc.getmetrics()
  local table_of_strings = {}
  for i = 3000, 4000 do table.insert(table_of_strings, tostring(i)) end
  for i = 3900, 4100 do table.insert(table_of_strings, tostring(i)) end
  local newm = misc.getmetrics()
  print("gc_strnum diff = " .. newm.gc_strnum - oldm.gc_strnum)
  print("strhash_miss diff = " .. newm.strhash_miss - oldm.strhash_miss)
  print("strhash_hit diff = " .. newm.strhash_hit - oldm.strhash_hit)
end
f()

The result will probably be close to: “gc_strnum diff = 1100”, because the loops insert 1202 strings, of which 101 (3900 through 4000) are duplicates; “strhash_miss diff = 1100” for the same reason; and “strhash_hit diff = 101” plus some overhead, also for the same reason. (There is always a slight overhead amount for strhash_hit, which can be ignored.)

We say “probably” because there is a chance that the strings were already allocated somewhere. It is a good thing if the slope curve of strhash_miss is less than the slope curve of strhash_hit.

The other gc_*num values – gc_cdatanum, gc_tabnum, gc_udatanum – can be accessed in a similar way. Any of the gc_*num values can be useful when looking for memory leaks – the total number of these objects should not grow nonstop. A more general way to look for memory leaks is to watch gc_total. Also jit_mcode_size can be used to watch the amount of allocated memory for machine code traces.
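Watching gc_total for a leak can be sketched as follows. The helper names gc_total_series and grows_nonstop are hypothetical. The metrics source is passed in as a function (misc.getmetrics in Tarantool), which keeps the sketch usable with a stub.

```lua
-- Sketch: run a workload several times, force a full collection after
-- each round, and record gc_total.
local function gc_total_series(get_metrics, workload, rounds)
  local series = {}
  for i = 1, rounds do
    workload()
    collectgarbage('collect')
    series[i] = get_metrics().gc_total
  end
  return series
end

-- Sketch: true if every sample is strictly larger than the previous one.
local function grows_nonstop(series)
  for i = 2, #series do
    if series[i] <= series[i - 1] then
      return false
    end
  end
  return #series >= 2
end

-- In Tarantool, with a hypothetical handle_request() workload:
-- local series = gc_total_series(misc.getmetrics, handle_request, 20)
-- if grows_nonstop(series) then print('possible leak') end
```

Strict growth across every round is only a hint, not a proof of a leak: caches and lazily initialized data also make gc_total grow for a while.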

Example with gc_allocated and gc_freed

To track an application’s effect on the garbage collector (less is better):

function f()
  for i = 1, 10 do collectgarbage("collect") end
  local oldm = misc.getmetrics()
  local newm = misc.getmetrics()
  oldm = misc.getmetrics()
  collectgarbage("collect")
  newm = misc.getmetrics()
  print("gc_allocated diff = " .. newm.gc_allocated - oldm.gc_allocated)
  print("gc_freed diff = " .. newm.gc_freed - oldm.gc_freed)
end
f()

The result will be: gc_allocated diff = 800, gc_freed diff = 800. This shows that local ... = getmetrics() itself causes memory allocation (because it is creating a table and assigning to it), and shows that when the name of a variable (in this case the oldm variable) is used again, that causes freeing. Ordinarily the freeing would not occur immediately, but collectgarbage("collect") forces it to happen so we can see the effect.

Example with gc_allocated and a space optimization

To test whether optimizing for space is possible with tables:

function f()
  collectgarbage("collect")
  local oldm = misc.getmetrics()
  local t = {}
  for i = 1, 513 do
    t[i] = i
  end
  local newm = misc.getmetrics()
  local diff = newm.gc_allocated - oldm.gc_allocated
  print("diff = " .. diff)
end
f()

The result will show that diff equals approximately 18000.

Now see what happens if the table initialization is different:

function f()
  local table_new = require "table.new"
  local oldm = misc.getmetrics()
  local t = table_new(513, 0)
  for i = 1, 513 do
    t[i] = i
  end
  local newm = misc.getmetrics()
  local diff = newm.gc_allocated - oldm.gc_allocated
  print("diff = " .. diff)
end
f()

The result will show that diff equals approximately 6000.

gc_steps_atomic and gc_steps_propagate

The slope curves of gc_steps_* items can be used for tracking pressure on the garbage collector too. During long-running routines, gc_steps_* values will increase, but long times between gc_steps_atomic increases are a good sign. And since gc_steps_atomic increases only once per garbage-collector cycle, it shows how many garbage-collector cycles have occurred.

Also, increases in the gc_steps_propagate number can be used to estimate indirectly how many objects there are. These values also correlate with the garbage collector’s step multiplier. For example, the number of incremental steps can grow, but according to the step multiplier configuration, one step can process only a small number of objects. So these metrics should be considered when configuring the garbage collector.
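Since gc_steps_atomic increases once per garbage-collector cycle, the number of cycles caused by a piece of code can be estimated from two snapshots. This is a minimal sketch with a hypothetical helper name, gc_pressure. The metrics source is injected (misc.getmetrics in Tarantool).

```lua
-- Sketch: count GC cycles and propagate steps that happen while fn runs.
local function gc_pressure(get_metrics, fn, ...)
  local before = get_metrics()
  fn(...)
  local after = get_metrics()
  local cycles = after.gc_steps_atomic - before.gc_steps_atomic
  local propagate = after.gc_steps_propagate - before.gc_steps_propagate
  return cycles, propagate
end

-- In Tarantool, with a hypothetical long_routine() function:
-- local cycles, propagate = gc_pressure(misc.getmetrics, long_routine)
-- print('GC cycles = ' .. cycles .. ', propagate steps = ' .. propagate)
```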

The following function takes a casual look at whether an SQL statement causes much pressure:

function f()
  collectgarbage("collect")
  local oldm = misc.getmetrics()
  collectgarbage("collect")
  box.execute([[DROP TABLE _vindex;]])
  local newm = misc.getmetrics()
  print("gc_steps_atomic = " .. newm.gc_steps_atomic - oldm.gc_steps_atomic)
  print("gc_steps_finalize = " .. newm.gc_steps_finalize - oldm.gc_steps_finalize)
  print("gc_steps_pause = " .. newm.gc_steps_pause - oldm.gc_steps_pause)
  print("gc_steps_propagate = " .. newm.gc_steps_propagate - oldm.gc_steps_propagate)
  print("gc_steps_sweep = " .. newm.gc_steps_sweep - oldm.gc_steps_sweep)
end
f()

And the display will show that the gc_steps_* metrics are not significantly different from what they would be if the box.execute() was absent.

Example with jit_trace_num and jit_trace_abort

Just-in-time compilers will “trace” code looking for opportunities to compile. jit_trace_abort can show how often there was a failed attempt (less is better), and jit_trace_num can show how many traces were generated since the last flush (usually more is better).

The following function does not contain code that can cause trouble for LuaJIT:

function f()
  jit.flush()
  for i = 1, 10 do collectgarbage("collect") end
  local oldm = misc.getmetrics()
  collectgarbage("collect")
  local sum = 0
  for i = 1, 57 do
    sum = sum + 57
  end
  for i = 1, 10 do collectgarbage("collect") end
  local newm = misc.getmetrics()
  print("trace_num = " .. newm.jit_trace_num - oldm.jit_trace_num)
  print("trace_abort = " .. newm.jit_trace_abort - oldm.jit_trace_abort)
end
f()

The result is: trace_num = 1, trace_abort = 0. Fine.

The following function seemingly does contain code that can cause trouble for LuaJIT:

jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
_G.globalthing = 5
function f()
  jit.flush()
  collectgarbage("collect")
  local oldm = misc.getmetrics()
  collectgarbage("collect")
  local sum = 0
  for i = 1, box.space._vindex:count()+ _G.globalthing do
    box.execute([[SELECT RANDOMBLOB(0);]])
    require('buffer').ibuf()
    _G.globalthing = _G.globalthing - 1
  end
  local newm = misc.getmetrics()
  print("trace_num = " .. newm.jit_trace_num - oldm.jit_trace_num)
  print("trace_abort = " .. newm.jit_trace_abort - oldm.jit_trace_abort)
end
f()

The result is: trace_num = between 2 and 4, trace_abort = 1. This means that up to four traces needed to be generated instead of one, and this means that something made LuaJIT give up in despair. Tracing more will reveal that the problem is not the suspicious-looking statements within the function, it is the jit.opt.start call. (A look at a jit.dump file might help in examining the trace compilation process.)

Example with jit_snap_restore and a performance unoptimization

If the slope curves of the jit_snap_restore metric grow after changes to old code, that can mean LuaJIT is stopping trace execution more frequently, and that can mean performance is degraded.

Start with this code:

function f()
  local function foo(i)
    return i <= 5 and i or tostring(i)
  end
  -- minstitch option needs to emulate nonstitching behaviour
  jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
  local sum = 0
  local oldm = misc.getmetrics()
  for i = 1, 10 do
    sum = sum + foo(i)
  end
  local newm = misc.getmetrics()
  local diff = newm.jit_snap_restore - oldm.jit_snap_restore
  print("diff = " .. diff)
end
f()

The result will be: diff = 3, because there is one side exit when the loop ends, and there are two side exits to the interpreter before LuaJIT may decide that the chunk of code is “hot” (the default value of the hotloop parameter is 56 according to Running LuaJIT).

And now change only one line within the local function foo, so now the code is:

function f()
  local function foo(i)
    -- math.fmod is not yet compiled!
    return i <= 5 and i or math.fmod(i, 11)
  end
  -- minstitch option needs to emulate nonstitching behaviour
  jit.opt.start(0, "hotloop=2", "hotexit=2", "minstitch=15")
  local sum = 0
  local oldm = misc.getmetrics()
  for i = 1, 10 do
    sum = sum + foo(i)
  end
  local newm = misc.getmetrics()
  local diff = newm.jit_snap_restore - oldm.jit_snap_restore
  print("diff = " .. diff)
end
f()

The result will be: diff is larger, because there are more side exits. So this test indicates that changing the code affected the performance.

Administration

Tarantool is designed to have multiple running instances on the same host.

Here we show how to administer Tarantool instances using any of the following utilities:

systemd native utilities, or
tt, a command-line utility for managing Tarantool-based applications.

Note

Unlike the rest of this manual, here we use system-wide paths.
Console examples here are for Fedora.

This chapter includes the following sections:

Managing modules

This section covers the installation and reloading of Tarantool modules. To learn about writing your own module and contributing it, check the Contributing a module section.

Installing a module

Modules in Lua and C that come from Tarantool developers and community contributors are available in the following locations:

Tarantool modules repository (see below)
Tarantool deb/rpm repositories (see below)

Installing a module from a repository

See README in tarantool/rocks repository for detailed instructions.

Installing a module from deb/rpm

Follow these steps:

Install Tarantool as recommended on the download page.
Install the module you need. Look up the module’s name on Tarantool rocks page and put the prefix “tarantool-” before the module name to avoid ambiguity:
```
$ # for Ubuntu/Debian:
$ sudo apt-get install tarantool-<module-name>

$ # for RHEL/CentOS/Amazon:
$ sudo yum install tarantool-<module-name>
```
For example, to install the module vshard on Ubuntu, say:
```
$ sudo apt-get install tarantool-vshard
```

Once these steps are complete, you can:

load any module with

tarantool> name = require('module-name')

for example:

tarantool> vshard = require('vshard')

search locally for installed modules using package.path (Lua) or package.cpath (C):

tarantool> package.path
---
- ./?.lua;./?/init.lua; /usr/local/share/tarantool/?.lua;/usr/local/share/
tarantool/?/init.lua;/usr/share/tarantool/?.lua;/usr/share/tarantool/?/ini
t.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua;/
usr/share/lua/5.1/?.lua;/usr/share/lua/5.1/?/init.lua;
...

tarantool> package.cpath
---
- ./?.so;/usr/local/lib/x86_64-linux-gnu/tarantool/?.so;/usr/lib/x86_64-li
nux-gnu/tarantool/?.so;/usr/local/lib/tarantool/?.so;/usr/local/lib/x86_64
-linux-gnu/lua/5.1/?.so;/usr/lib/x86_64-linux-gnu/lua/5.1/?.so;/usr/local/
lib/lua/5.1/?.so;
...

Note

Question-marks stand for the module name that was specified earlier when saying require('module-name').
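To see which files a require() call would look for, each template in the search path can be expanded by hand. The sketch below uses a hypothetical helper, candidate_paths. It follows the standard Lua rule that dots in a module name are treated as directory separators, and it assumes '/' as the separator.

```lua
-- Sketch: expand every template of a package.path-style string for a
-- given module name.
local function candidate_paths(module_name, search_path)
  -- Dots in the module name become directory separators.
  local name = module_name:gsub('%.', '/')
  local candidates = {}
  for template in search_path:gmatch('[^;]+') do
    -- A function replacement avoids special handling of '%' in the name.
    local expanded = template:gsub('%?', function() return name end)
    candidates[#candidates + 1] = expanded
  end
  return candidates
end

-- Print the Lua files that require('vshard') would be looked up in.
for _, path in ipairs(candidate_paths('vshard', package.path)) do
  print(path)
end
```

The same helper works for package.cpath, where the templates end in .so instead of .lua.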

Reloading a module

You can reload any Tarantool application or module with zero downtime.

Reloading a module in Lua

Here’s an example that illustrates the most typical case – “update and reload”.

Note

In this example, we use recommended administration practices based on instance files and the tt utility.

Update the application file.

For example, a module in /usr/share/tarantool/app.lua:

local function start()
  -- initial version
  box.once("myapp:v1.0", function()
    box.schema.space.create("somedata")
    box.space.somedata:create_index("primary")
    ...
  end)

  -- migration code from 1.0 to 1.1
  box.once("myapp:v1.1", function()
    box.space.somedata.index.primary:alter(...)
    ...
  end)

  -- migration code from 1.1 to 1.2
  box.once("myapp:v1.2", function()
    box.space.somedata.index.primary:alter(...)
    box.space.somedata:insert(...)
    ...
  end)
end

-- start some background fibers if you need

local function stop()
  -- stop all background fibers and clean up resources
end

local function api_for_call(xxx)
  -- do some business
end

return {
  start = start,
  stop = stop,
  api_for_call = api_for_call
}

Update the instance file.

For example, /etc/tarantool/instances.enabled/my_app.lua:

#!/usr/bin/env tarantool
--
-- hot code reload example
--

-- the log module must be loaded before log.info() is used below
local log = require('log')

box.cfg({listen = 3302})

-- ATTENTION: unload it all properly!
local app = package.loaded['app']
if app ~= nil then
  -- stop the old application version
  app.stop()
  -- unload the application
  package.loaded['app'] = nil
  -- unload all dependencies
  package.loaded['somedep'] = nil
end

-- load the application
log.info('require app')
app = require('app')

-- start the application
app.start({ --[[ some app options controlled by sysadmins ]] })

The important thing here is to properly unload the application and its dependencies.
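When an application has several dependencies, the unloading step can be factored into a helper. The sketch below is an illustration only: the function name unload is hypothetical, and the second argument exists just to make the helper easy to try on a plain table instead of package.loaded.

```lua
-- Sketch: remove the given modules from the loaded-modules table so that
-- the next require() call loads them from disk again.
local function unload(names, loaded)
  loaded = loaded or package.loaded
  local removed = {}
  for _, name in ipairs(names) do
    if loaded[name] ~= nil then
      loaded[name] = nil
      removed[#removed + 1] = name
    end
  end
  -- Return the names that were actually unloaded, for logging.
  return removed
end

-- In the instance file, after app.stop():
-- unload({'app', 'somedep'})
```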

Manually reload the application file.

For example, using tt:

$ tt connect my_app -f /etc/tarantool/instances.enabled/my_app.lua

Reloading a module in C

After you compile a new version of a C module (*.so shared library), call box.schema.func.reload('module-name') from your Lua script to reload the module.

Logs

Each Tarantool instance logs important events to its own log file. For instances started with tt, the log location is defined by the log_dir parameter in the tt configuration. By default, it’s /var/log/tarantool in the tt system mode, and the var/log subdirectory of the tt working directory in the local mode. In the specified location, tt creates separate directories for each instance’s logs.

To check how logging works, write something to the log using the log module:

$ tt connect application
   • Connecting to the instance...
   • Connected to application

application> require('log').info("Hello for the manual readers")
---
...

Then check the logs:

$ tail instances.enabled/application/var/log/instance001/tt.log
2024-04-09 17:34:29.489 [49502] main/106/gc I> wal/engine cleanup is resumed
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'instance_name' configuration option to "instance001"
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'custom_proc_title' configuration option to "tarantool - instance001"
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'log_nonblock' configuration option to false
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'replicaset_name' configuration option to "replicaset001"
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'listen' configuration option to [{"uri":"127.0.0.1:3301"}]
2024-04-09 17:34:29.489 [49502] main/107/checkpoint_daemon I> scheduled next checkpoint for Tue Apr  9 19:08:04 2024
2024-04-09 17:34:29.489 [49502] main/104/interactive/box.load_cfg I> set 'metrics' configuration option to {"labels":{"alias":"instance001"},"include":["all"],"exclude":[]}
2024-04-09 17:34:29.489 [49502] main I> entering the event loop
2024-04-09 17:34:38.905 [49502] main/116/console/unix/:/tarantool I> Hello for the manual readers

Log rotation

When logging to a file, the system administrator must ensure that logs are rotated in a timely manner and do not take up all the available disk space. The recommended way to prevent log files from growing indefinitely is to use an external log rotation program, for example, logrotate, which is pre-installed on most mainstream Linux distributions.

A Tarantool log rotation configuration for logrotate can look like this:

# /var/log/tarantool/<env>/<app>/<instance>/*.log
/var/log/tarantool/*/*/*/*.log {
    daily
    size 512k
    missingok
    rotate 10
    compress
    delaycompress
    # Run tt logrotate only once after all logs are rotated.
    sharedscripts
    postrotate
        /usr/bin/tt -S logrotate
    endscript
}

In this configuration, tt logrotate is called after each log rotation to reopen the instance log files after they are moved by the logrotate program.

There is also the built-in function log.rotate(), which you can call on an instance to reopen its log file after rotation.

Log destination

Tarantool can write its logs to a log file, to syslog, or to a specified program through a pipe. For example, to send logs to syslog, specify the log.to parameter as follows:

log:
  to: syslog
  syslog:
    server: '127.0.0.1:514'

Security

Tarantool allows for two types of connections:

With the console.listen() function from the console module, you can set up a port which can be used to open an administrative console to the server. This is for administrators to connect to a running instance and make requests. tt invokes console.listen() to create a control socket for each started instance.
With the box.cfg{listen=…} parameter from the box module, you can set up a binary port for connections which read and write to the database or invoke stored procedures.

When you connect to an admin console:

The client-server protocol is plain text.
No password is necessary.
The user is automatically ‘admin’.
Each command is fed directly to the built-in Lua interpreter.

Therefore you must set up ports for the admin console very cautiously. If it is a TCP port, it should only be opened for a specific IP. Ideally, it should not be a TCP port at all but a Unix domain socket, so that access to the server machine is required. Thus a typical port setup for the admin console is:

console.listen('/var/lib/tarantool/socket_name.sock')

and a typical connection URI is:

/var/lib/tarantool/socket_name.sock

if the listener has the privilege to write on /var/lib/tarantool and the connector has the privilege to read on /var/lib/tarantool. Alternatively, to connect to an admin console of an instance started with tt, use tt connect.

To find out whether a TCP port is a port for admin console, use telnet. For example:

$ telnet 0 3303
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
Tarantool 2.1.0 (Lua console)
type 'help' for interactive help

In this example, the response does not include the word “binary” and does include the words “Lua console”. Therefore it is clear that this is a successful connection to a port for admin console, and you can now enter admin requests on this terminal.
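The same check can be expressed in code. The sketch below classifies a greeting string by the words it contains. The helper name port_kind is hypothetical, and the binary greeting used in the usage comment is an assumed sample rather than captured output.

```lua
-- Sketch: decide what kind of port sent the greeting, based on the
-- words "Lua console" and "binary" discussed above.
local function port_kind(greeting)
  local text = greeting:lower()
  if text:find('lua console', 1, true) then
    return 'console'
  elseif text:find('binary', 1, true) then
    return 'binary'
  end
  return 'unknown'
end

print(port_kind("Tarantool 2.1.0 (Lua console)\ntype 'help' for interactive help"))
-- Assumed sample of a binary-port greeting:
-- print(port_kind('Tarantool 2.1.0 (Binary) ...'))
```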

When you connect to a binary port:

The client-server protocol is binary.
The user is automatically ‘guest’.
To change the user, it’s necessary to authenticate.

For ease of use, the tt connect command automatically detects the type of connection during the handshake and uses the EVAL binary protocol command when it’s necessary to execute Lua commands over a binary connection. To execute EVAL, the authenticated user must have the global “EXECUTE” privilege.

Therefore, when ssh access to the machine is not available, creating a Tarantool user with global “EXECUTE” privilege and non-empty password can be used to provide a system administrator remote access to an instance.

Access control

Tarantool enables flexible management of access to various database resources. The main concepts of Tarantool access control system are as follows:

A user is a person or program that interacts with a Tarantool instance.
An object is an entity to which access can be granted, for example, a space, an index, or a function.
A privilege allows a user to perform certain operations on specific objects, for example, creating spaces, reading or updating data.
A role is a named collection of privileges that can be granted to a user.

Note

The full list of object types and permissions is available in the All object types and permissions section.

Overview

Users

A user identifies a person or program that interacts with a Tarantool instance. There might be different types of users, for example:

A database administrator responsible for the overall management and administration of a database. An administrator can create other users and grant them specified privileges.
A user with limited access to certain data and stored functions. Such users can get their privileges from the database administrator.
Users used in communications between Tarantool instances. For example, such users can be created to maintain replication and sharding in a Tarantool cluster.

There are two built-in users in Tarantool:

admin is a user with all available administrative privileges. If the connection uses an admin-console port, the current user is admin. For example, admin is used when connecting to a local instance with tt connect by its instance name:
```
$ tt connect app:instance001
```
To allow remote binary port connections using the admin user, you need to set a password.
guest is a user with minimum privileges, used by default for remote binary port connections. For example, guest is used when you connect to an instance with tt connect by specifying the IP address and port without a user name:
```
$ tt connect 192.168.10.10:3301
```
Warning

Because the guest user allows unauthenticated access to Tarantool instances, granting additional privileges to this user is not recommended. For example, granting execute access to universe allows remote code execution on instances.

Note

Information about users is stored in the _user space.
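
To see what this space contains, you can iterate over it from the admin console. The sketch below assumes a configured instance and relies on the layout of _user tuples, where the third field is the name and the fourth is the entry type.

```lua
-- List every entry in the _user system space.
-- Field 3 holds the name; field 4 holds the type ('user' or 'role').
for _, t in box.space._user:pairs() do
    print(t[3], t[4])
end
```

On a fresh instance, the output is expected to include the built-in guest and admin users.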

Passwords

Any user (except guest) may have a password. If a password is not set, a user cannot connect to Tarantool instances.

Tarantool password hashes are stored in the _user system space. By default, Tarantool uses the CHAP protocol to authenticate users and applies SHA-1 hashing to passwords. So, if the password is ‘123456’, the stored hash is a string like ‘a7SDfrdDKRBe5FaN2n3GftLKKtk=’. In the Enterprise Edition, you can enable PAP authentication with the SHA-256 hashing algorithm.

Tarantool Enterprise Edition allows you to improve database security by enforcing the use of strong passwords, setting up a maximum password age, and so on. Learn more from the Authentication topic.

Objects

An object is a securable entity to which access can be granted. Tarantool has a number of objects that enable flexible management of access to data, stored functions, specific actions, and so on.

Below are a few examples of objects:

universe represents a database (box.schema) that contains database objects, including spaces, indexes, users, roles, sequences, and functions. Granting privileges to universe gives a user access to any object in a database.
space enables granting privileges to user-created or system spaces.
function enables granting privileges to functions.

Note

The full list of object types is available in the Object types section.

Privileges

The privileges granted to a user determine which operations the user can perform, for example:

The read and write permissions granted to the space object allow a user to read or modify data in the specified space.
The create permission granted to the space object allows a user to create new spaces.
The execute permission granted to the function object allows a user to execute the specified function.
The session permission granted to the universe object allows a user to connect to an instance over IPROTO.
The usage permission granted to the universe object allows a user to use their privileges on database objects (for example, read, write, and alter spaces).
The alter permission granted to a user allows modifying the user’s own settings, for example, a password.
The drop permission granted to a user allows dropping users.

Note

The full lists of object types and the permissions supported for them are available in the Permissions and Object types and permissions sections.

Note that some privileges might require read and write access to certain system spaces. For example, the create permission granted to the space object requires read and write permissions to the _space system space. Similarly, granting the ability to create functions requires read and write access to the _func space.

Note

Information about privileges is stored in the _priv space.

Roles

A role is a container for privileges that can be granted to users. Roles can also be assigned to other roles, creating a role hierarchy.

There are the following built-in roles in Tarantool:

super has all available administrative permissions.
public has certain read permissions. This role is automatically granted to new users when they are created.
replication can be granted to a user used to maintain replication in a cluster.
sharding can be granted to a user used to maintain sharding in a cluster.

Note

The sharding role is created only if an instance is managed using YAML configuration.

Below are a few diagrams that demonstrate how privileges can be granted to a user without and with using roles.

In this example, a user gets privileges directly without using roles.

user1 ── privilege1
    ├─── privilege2
    └─── privilege3

In this example, a user gets all privileges provided by role1 and specific privileges assigned directly.

user1 ── role1 ── privilege1
    │        └─── privilege2
    ├─── privilege3
    └─── privilege4

In this example, role2 is granted to role1. This means that a user with role1 subsequently gets all privileges from both roles role1 and role2.

user1 ── role1 ── privilege1
    │        ├─── privilege2
    │        └─── role2
    │                 ├─── privilege3
    │                 └─── privilege4
    ├─── privilege5
    └─── privilege6

Note

Information about roles is stored in the _user space.

Object owners

An owner of a database object is the user who created it. The owner of the database and the owner of objects that are created initially (the system spaces and the default users) is the admin user.

Owners automatically have privileges for objects they create. They can share these privileges with other users or roles using box.schema.user.grant() and box.schema.role.grant().

Note

Information about the users who granted the specified privileges (the grantors) is stored in the _priv space.
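
The following sketch shows one way to inspect these records for a single user. It assumes that a user named testuser exists (a hypothetical name) and that the primary key of _priv begins with the grantee ID, so a partial key selects all records for that grantee. The field positions in the comments describe the _priv tuple layout as understood here and should be checked against your Tarantool version.

```lua
-- Look up the numeric ID of the user (field 1 of the _user tuple).
local user = box.space._user.index.name:get('testuser')
-- Iterate over the privileges granted to this user.
for _, p in box.space._priv:pairs({ user[1] }) do
    -- Fields: 1 = grantor ID, 2 = grantee ID, 3 = object type,
    -- 4 = object ID, 5 = privilege bitmask.
    print(p[1], p[2], p[3], p[4], p[5])
end
```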

Sessions

A session is the state of a connection to Tarantool. The session contains:

An integer ID identifying the connection.
The current user associated with the connection.
The text description of the connected peer.
A session’s local state, such as Lua variables and functions.

In Tarantool, a single session can execute multiple concurrent transactions. Each transaction is identified by a unique integer ID, which can be queried at the start of the transaction using box.session.sync().
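
The session attributes listed above can be read with functions of the box.session module. The snippet below is a small sketch meant for a console session on a configured instance; the greeting key is a hypothetical name used only for illustration.

```lua
-- Inspect the current session.
local id = box.session.id()      -- integer ID of the connection
local user = box.session.user()  -- name of the current user
local peer = box.session.peer()  -- peer description; may be nil for a local console
-- box.session.storage is a Lua table that holds session-local state.
box.session.storage.greeting = 'hello'
print(id, user, tostring(peer), box.session.storage.greeting)
```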

Note

To track all connects and disconnects, you can use connection and authentication triggers.
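
As an illustration, the sketch below registers three such triggers that write to the instance log. The log messages are arbitrary, and the snippet assumes it is run on a configured instance.

```lua
local log = require('log')

-- Called when a new session is established.
box.session.on_connect(function()
    log.info('session %d connected from %s',
             box.session.id(), tostring(box.session.peer()))
end)

-- Called when a session is closed.
box.session.on_disconnect(function()
    log.info('session %d disconnected', box.session.id())
end)

-- Called on every authentication attempt; receives the user name
-- and a boolean that tells whether the attempt succeeded.
box.session.on_auth(function(user_name, is_success)
    log.info('authentication of %s: %s',
             user_name, is_success and 'ok' or 'failed')
end)
```

Triggers run inside the session they describe, so keeping them short avoids slowing down connection handling.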

Managing users

Creating a user

To create a new user, call box.schema.user.create(). In the example below, a user is created without a password:

box.schema.user.create('testuser')

In this example, the password is specified in the options parameter:

box.schema.user.create('testuser', { password = 'foobar' })

Changing passwords

To set or change a user’s password, use box.schema.user.passwd(). In the example below, the password is set for the currently logged-in user:

box.schema.user.passwd('foobar')

To set the password for the specified user, pass a username and password as shown below:

box.schema.user.passwd('testuser', 'foobar')

Note

box.schema.user.password() returns a hash of the specified password.
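
For instance, the call below computes the hash for a password without changing any user. This is a minimal sketch that assumes a configured instance; the exact hash format depends on the authentication settings in use.

```lua
-- Compute the hash for a password; no user record is modified.
local hash = box.schema.user.password('foobar')
print(hash)
```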

Granting privileges to a user

To grant the specified privileges to a user, use the box.schema.user.grant() function. In the example below, testuser gets read permissions to the writers space and read/write permissions to the books space:

box.schema.user.grant('testuser', 'read', 'space', 'writers')
box.schema.user.grant('testuser', 'read,write', 'space', 'books')

Learn more about granting privileges to different types of objects from Granting privileges.

Getting a user’s information

To check whether the specified user exists, call box.schema.user.exists():

box.schema.user.exists('testuser')
--[[
- true
--]]

To get information about privileges granted to a user, call box.schema.user.info():

box.schema.user.info('testuser')
--[[
- - - execute
    - role
    - public
  - - read
    - space
    - writers
  - - read,write
    - space
    - books
  - - session,usage
    - universe
    -
  - - alter
    - user
    - testuser
--]]

In the example above, testuser has the following privileges:

The execute permission to the public role means that this role is assigned to the user.
The read permission to the writers space means that the user can read data from this space.
The read,write permissions to the books space mean that the user can read and modify data in this space.
The session,usage permissions to universe mean the following:
- session: the user can authenticate over an IPROTO connection.
- usage: lets the user use their privileges on database objects (for example, read and modify data in a space).
The alter permission lets testuser modify its own settings, for example, a password.

Revoking a user’s privileges

To revoke the specified privileges, use the box.schema.user.revoke() function. In the example below, write access to the books space is revoked:

box.schema.user.revoke('testuser', 'write', 'space', 'books')

Revoking the session permission to universe prevents a user from connecting to a Tarantool instance:

box.schema.user.revoke('testuser', 'session', 'universe')

Changing the current user

The current user name can be found using box.session.user().

box.session.user()
--[[
- admin
--]]

The current user can be changed:

For an admin-console connection: using box.session.su():

box.session.su('testuser')
box.session.user()
--[[
- testuser
--]]

For a binary port connection: using the AUTH protocol command, supported by most clients.
For a binary-port connection invoking a stored function with the CALL command: if the SETUID property is enabled for the function, Tarantool temporarily replaces the current user with the function’s creator, with all the creator’s privileges, during function execution.

Dropping users

To drop the specified user, call box.schema.user.drop():

box.schema.user.drop('testuser')

Managing roles

Creating a role

To create a new role, call box.schema.role.create(). In the example below, two roles are created:

box.schema.role.create('books_space_manager')
box.schema.role.create('writers_space_reader')

Granting privileges to a role

To grant the specified privileges to a role, use the box.schema.role.grant() function. In the example below, the books_space_manager role gets read and write permissions to the books space:

box.schema.role.grant('books_space_manager', 'read,write', 'space', 'books')

The writers_space_reader role gets read permissions to the writers space:

box.schema.role.grant('writers_space_reader', 'read', 'space', 'writers')

Learn more about granting privileges to different types of objects from Granting privileges.

Note

Not all privileges can be granted to roles. Learn more from Permissions.

Granting a role to a role

Roles can be assigned to other roles. In the example below, the newly created all_spaces_manager role gets all privileges granted to books_space_manager and writers_space_reader:

box.schema.role.create('all_spaces_manager')
box.schema.role.grant('all_spaces_manager', 'books_space_manager')
box.schema.role.grant('all_spaces_manager', 'writers_space_reader')

Granting a role to a user

To grant the specified role to a user, use the box.schema.user.grant() function. In the example below, testuser gets privileges granted to the books_space_manager and writers_space_reader roles:

box.schema.user.grant('testuser', 'books_space_manager')
box.schema.user.grant('testuser', 'writers_space_reader')

Getting a role’s information

To check whether the specified role exists, call box.schema.role.exists():

box.schema.role.exists('books_space_manager')
--[[
- true
--]]

To get information about privileges granted to a role, call box.schema.role.info():

box.schema.role.info('books_space_manager')
--[[
- - - read,write
    - space
    - books
--]]

If a role has the execute permission to other roles, this means that these roles are granted to this parent role:

box.schema.role.info('all_spaces_manager')
--[[
- - - execute
    - role
    - books_space_manager
  - - execute
    - role
    - writers_space_reader
--]]

Revoking a role from a user

To revoke the specified role from a user, revoke the execute privilege for this role using the box.schema.user.revoke() function. In the example below, the writers_space_reader role is revoked from testuser:

box.schema.user.revoke('testuser', 'execute', 'role', 'writers_space_reader')

To revoke a role’s privileges, use box.schema.role.revoke().
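
A short sketch of such a call is shown below. It assumes that the books_space_manager role exists and currently has read and write access to the books space, as in the earlier examples.

```lua
-- Remove write access to the books space from the role.
-- Users who hold the role lose this access as well.
box.schema.role.revoke('books_space_manager', 'write', 'space', 'books')
```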

Dropping roles

To drop the specified role, call box.schema.role.drop():

box.schema.role.drop('writers_space_reader')

Granting privileges

To grant the specified privileges to a user or role, use the box.schema.user.grant() and box.schema.role.grant() functions, which have similar signatures and accept the same set of arguments. For example, the box.schema.user.grant() signature looks as follows:

box.schema.user.grant(username, permissions, object-type, object-name[, {options}])

username: the name of the user that gets the specified privileges.
permissions: a string value that represents permissions granted to the user. If there are several permissions, separate them with commas without spaces.
object-type: a type of object to which permissions are granted.
object-name: the name of the object to which permissions are granted. An empty string ("") or nil provided instead of object-name grants the specified permissions to all objects of the specified type.

Note

object-name is ignored for the following combinations of permissions and object types:
- Any permission granted to universe.
- The create and drop permissions for the following object types: user, role, space, function, sequence.
- The execute permission for the following object types: lua_eval, lua_call, sql.
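
The sketch below illustrates two of these arguments: an empty object name and the trailing options table. It assumes that testuser and the books space exist; if_not_exists is the only option used here.

```lua
-- An empty object name applies the permission to all objects of the type:
-- here, read access to every space.
box.schema.user.grant('testuser', 'read', 'space', '')
-- The optional last argument is a table of options. With if_not_exists,
-- the call does not raise an error if the privilege is already granted.
box.schema.user.grant('testuser', 'read,write', 'space', 'books', { if_not_exists = true })
```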

Any object

In the example below, testuser gets privileges allowing them to create any object of any type:

box.schema.user.grant('testuser','read,write,create','universe')

In this example, testuser can grant access to objects that testuser created:

box.schema.user.grant('testuser','write','space','_priv')

Spaces

Creating and altering spaces

In the example below, testuser gets privileges allowing them to create spaces:

box.schema.user.grant('testuser','create','space')
box.schema.user.grant('testuser','write', 'space', '_schema')
box.schema.user.grant('testuser','write', 'space', '_space')

As you can see, the ability to create spaces also requires write access to certain system spaces.

To allow testuser to drop a space that has associated objects, add the following privileges:

box.schema.user.grant('testuser','create,drop','space')
box.schema.user.grant('testuser','write','space','_schema')
box.schema.user.grant('testuser','write','space','_space')
box.schema.user.grant('testuser','write','space','_space_sequence')
box.schema.user.grant('testuser','read','space','_trigger')
box.schema.user.grant('testuser','read','space','_fk_constraint')
box.schema.user.grant('testuser','read','space','_ck_constraint')
box.schema.user.grant('testuser','read','space','_func_index')

Creating and altering indexes

In the example below, testuser gets privileges allowing them to create indexes in the ‘writers’ space:

box.schema.user.grant('testuser','create,read','space','writers')
box.schema.user.grant('testuser','read,write','space','_space_sequence')
box.schema.user.grant('testuser','write', 'space', '_index')

To allow testuser to alter indexes in the writers space, grant the privileges below. This example assumes that indexes in the writers space are not created by testuser.

box.schema.user.grant('testuser','alter','space','writers')
box.schema.user.grant('testuser','read','space','_space')
box.schema.user.grant('testuser','read','space','_index')
box.schema.user.grant('testuser','read','space','_space_sequence')
box.schema.user.grant('testuser','write','space','_index')

If testuser created indexes in the writers space, granting the following privileges is enough to alter indexes:

box.schema.user.grant('testuser','read','space','_space_sequence')
box.schema.user.grant('testuser','read,write','space','_index')

CRUD operations

In this example, testuser gets privileges allowing them to select data from the ‘writers’ space:

box.schema.user.grant('testuser','read','space','writers')

In this example, testuser is allowed to read and modify data in the ‘books’ space:

box.schema.user.grant('testuser','read,write','space','books')

Sequences

Creating and dropping sequences

In this example, testuser gets privileges to create sequence generators:

box.schema.user.grant('testuser','create','sequence')
box.schema.user.grant('testuser', 'read,write', 'space', '_sequence')

To let testuser drop a sequence, grant them the following privileges:

box.schema.user.grant('testuser','drop','sequence')
box.schema.user.grant('testuser','write','space','_sequence_data')
box.schema.user.grant('testuser','write','space','_sequence')

Using sequence functions

In this example, testuser is allowed to use the id_seq:next() function with a sequence named ‘id_seq’:

box.schema.user.grant('testuser','read,write','sequence','id_seq')

In the next example, testuser is allowed to use the id_seq:set() or id_seq:reset() functions with a sequence named ‘id_seq’:

box.schema.user.grant('testuser','write','sequence','id_seq')

Functions

Creating and dropping functions

In this example, testuser gets privileges to create functions:

box.schema.user.grant('testuser','create','function')
box.schema.user.grant('testuser','read,write','space','_func')

To let testuser drop a function, grant them the following privileges:

box.schema.user.grant('testuser','drop','function')
box.schema.user.grant('testuser','write','space','_func')

Executing functions

To give the ability to execute a function named ‘sum’, grant the following privileges:

box.schema.user.grant('testuser','execute','function','sum')

Executing Lua functions

Granting the ‘execute’ privilege on lua_call permits the user to call any global (accessible via the _G Lua table) user-defined Lua function with the IPROTO_CALL request. To grant permission to a specific non-persistent function only, specify its name when granting the lua_call privilege.

Note

The function doesn’t need to be defined at the time privileges are granted, meaning that the access to the function will be provided for the user once this function is defined.

-- Two global Lua functions; neither is registered in the _func space
function my_func_1() end
function my_func_2() end
box.cfg({listen = 3301})
box.schema.user.create('alice', {password = 'secret'})
conn = require('net.box').connect(box.cfg.listen, {user = 'alice', password = 'secret'})
-- Grant access to my_func_1 only
box.schema.user.grant('alice', 'execute', 'lua_call', 'my_func_1')
conn:call('my_func_1') -- ok
conn:call('my_func_2') -- access denied
-- A built-in function can be granted by name in the same way
box.schema.user.grant('alice', 'execute', 'lua_call', 'box.session.su')
conn:call('box.session.su', {'admin'}) -- ok

Users

In this example, testuser gets privileges to create other users:

box.schema.user.grant('testuser','create','user')
box.schema.user.grant('testuser', 'read,write', 'space', '_user')
box.schema.user.grant('testuser', 'write', 'space', '_priv')

Roles

To let testuser create new roles, grant the following privileges:

box.schema.user.grant('testuser','create','role')
box.schema.user.grant('testuser', 'read,write', 'space', '_user')
box.schema.user.grant('testuser', 'write', 'space', '_priv')

Executing code

To let testuser execute Lua code, grant the execute privilege to the lua_eval object:

box.schema.user.grant('testuser','execute','lua_eval')

Similarly, executing an arbitrary SQL expression requires the execute privilege to the sql object:

box.schema.user.grant('testuser','execute','sql')
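
To illustrate what these two privileges enable, the sketch below connects as testuser over net.box and sends one request of each kind. The URI localhost:3301 and the password foobar are assumptions; adjust them to your setup.

```lua
local net_box = require('net.box')
-- Assumed URI and credentials.
local conn = net_box.connect('localhost:3301', { user = 'testuser', password = 'foobar' })
-- Requires the execute privilege on lua_eval.
local sum = conn:eval('return 1 + 1')
-- Requires the execute privilege on sql.
local result = conn:execute('SELECT 1')
conn:close()
```

Without the corresponding privilege, each call is expected to fail with an access-denied error.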

Example

In the example below, the created Lua function is executed on behalf of its creator, even if called by another user.

First, two spaces (space1 and space2) are created, and a no-password user (private_user) is granted full access to them. Then read_and_modify is defined and private_user becomes this function’s creator. Finally, another user (public_user) is granted access to execute Lua functions created by private_user.

box.schema.space.create('space1')
box.schema.space.create('space2')
box.space.space1:create_index('pk')
box.space.space2:create_index('pk')

box.schema.user.create('private_user')

box.schema.user.grant('private_user', 'read,write', 'space', 'space1')
box.schema.user.grant('private_user', 'read,write', 'space', 'space2')
box.schema.user.grant('private_user', 'create', 'universe')
box.schema.user.grant('private_user', 'read,write', 'space', '_func')

function read_and_modify(key)
  local space1 = box.space.space1
  local space2 = box.space.space2
  local fiber = require('fiber')
  local t = space1:get{key}
  if t ~= nil then
    space1:put{key, box.session.uid()}
    space2:put{key, fiber.time()}
  end
end

box.session.su('private_user')
box.schema.func.create('read_and_modify', {setuid = true})
box.session.su('admin')
box.schema.user.create('public_user', {password = 'secret'})
box.schema.user.grant('public_user', 'execute', 'function', 'read_and_modify')

Whenever public_user calls the function, it is executed on behalf of its creator, private_user.
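
A call from the public_user side might look like the sketch below. It assumes that the instance listens on localhost:3301 (an assumption not stated in the example above) and uses the key 1 only for illustration.

```lua
local net_box = require('net.box')
-- Assumed URI; the credentials are those of public_user created above.
local conn = net_box.connect('localhost:3301', { user = 'public_user', password = 'secret' })
-- The function body runs with the privileges of private_user, so it can
-- access space1 and space2 even though public_user has no access to them.
conn:call('read_and_modify', { 1 })
conn:close()
```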

All object types and permissions

Object types

Object type	Description
`universe`	A database (box.schema) that contains database objects, including spaces, indexes, users, roles, sequences, and functions. Granting privileges to `universe` gives a user access to any object in the database.
`user`	A user.
`role`	A role.
`space`	A space.
`function`	A function.
`sequence`	A sequence.
`lua_eval`	Executing arbitrary Lua code.
`lua_call`	Calling any global user-defined Lua function.
`sql`	Executing an arbitrary SQL expression.

Permissions

Permission	Object type	Granted to roles	Description
`read`	All	Yes	Allows reading data of the specified object. For example, this permission can be used to allow a user to select data from the specified space.
`write`	All	Yes	Allows updating data of the specified object. For example, this permission can be used to allow a user to modify data in the specified space.
`create`	All	Yes	Allows creating objects of the specified type. For example, this permission can be used to allow a user to create new spaces. Note that this permission requires read and write access to certain system spaces.
`alter`	All	Yes	Allows altering objects of the specified type. Note that this permission requires read and write access to certain system spaces.
`drop`	All	Yes	Allows dropping objects of the specified type. Note that this permission requires read and write access to certain system spaces.
`execute`	`role`, `universe`, `function`, `lua_eval`, `lua_call`, `sql`	Yes	For `role`, allows using the specified role. For other object types, allows calling a function.
`session`	`universe`	No	Allows a user to connect to an instance over IPROTO.
`usage`	`universe`	No	Allows a user to use their privileges on database objects (for example, read, write, and alter spaces).

Object types and permissions

Object type	Details
`universe`	`read`: Allows reading any object types, including all spaces or sequence objects. `write`: Allows modifying any object types, including all spaces or sequence objects. `execute`: Allows executing functions, Lua code, or SQL expressions, including IPROTO calls. `session`: Allows a user to connect to an instance over IPROTO. `usage`: Allows a user to use their privileges on database objects (for example, read, write, and alter space). `create`: Allows creating users, roles, functions, spaces, and sequences. This permission requires read and write access to certain system spaces. `drop`: Allows deleting users, roles, functions, spaces, and sequences. This permission requires read and write access to certain system spaces. `alter`: Allows altering user settings or space objects.
`user`	`alter`: Allows modifying a user description, for example, changing the password. `create`: Allows creating new users. This permission requires read and write access to the `_user` system space. `drop`: Allows dropping users. This permission requires read and write access to the `_user` system space.
`role`	`execute`: Indicates that a role is assigned to the user or another role. `create`: Allows creating new roles. This permission requires read and write access to the `_user` system space. `drop`: Allows dropping roles. This permission requires read and write access to the `_user` system space.
`space`	`read`: Allows selecting data from a space. `write`: Allows modifying data in a space. `create`: Allows creating new spaces. This permission requires read and write access to the `_space` system space. `drop`: Allows dropping spaces. This permission requires read and write access to the `_space` system space. `alter`: Allows modifying spaces. This permission requires read and write access to the `_space` system space. If a space is created by a user, they can read and write it without granting explicit permission.
`function`	`execute`: Allows calling a function. `create`: Allows creating a function. This permission requires read and write access to the `_func` system space. If a function is created by a user, they can execute it without granting explicit permission. `drop`: Allows dropping a function. This permission requires read and write access to the `_func` system space.
`sequence`	`read`: Allows using sequences in `space_obj:create_index()`. `write`: Allows all operations for a sequence object. `seq_obj:drop()` requires a write permission to the `_priv` system space. `create`: Allows creating sequences. This permission requires read and write access to the `_sequence` system space. If a sequence is created by a user, they can read/write it without explicit permission. `drop`: Allows dropping sequences. This permission requires read and write access to the `_sequence` system space. `alter`: Has no effect. `seq_obj:alter()` and other methods require the `write` permission.
`lua_eval`	`execute`: Allows executing arbitrary Lua code using the IPROTO_EVAL request.
`lua_call`	`execute`: Allows executing any user-defined function using the IPROTO_CALL request. This permission doesn’t allow a user to call built-in Lua functions (for example, `loadstring()` or `box.session.su()`) and functions defined in the `_func` system space.
`sql`	`execute`: Allows executing an arbitrary SQL expression using the IPROTO_PREPARE and IPROTO_EXECUTE requests.

Replication administration

Monitoring a replica set

To learn which instances belong to the replica set and obtain statistics for all of them, execute a box.info.replication request. The output below shows the replication status for a replica set containing one master and two replicas:

manual_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 21
    name: instance001
  2:
    id: 2
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 0
    upstream:
      status: follow
      idle: 0.052655000000414
      peer: replicator@127.0.0.1:3302
      lag: 0.00010204315185547
    name: instance002
    downstream:
      status: follow
      idle: 0.09503500000028
      vclock: {1: 21}
      lag: 0.00026917457580566
  3:
    id: 3
    uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
    lsn: 0
    upstream:
      status: follow
      idle: 0.77522099999987
      peer: replicator@127.0.0.1:3303
      lag: 0.0001838207244873
    name: instance003
    downstream:
      status: follow
      idle: 0.33186100000012
      vclock: {1: 21}
      lag: 0
...

The following diagram illustrates the upstream and downstream connections if box.info.replication is executed on the master instance (instance001):

If box.info.replication is executed on instance002, the upstream and downstream connections look as follows:

This means that statistics for replicas are given in regard to the instance on which box.info.replication is executed.

The primary indicators of replication health are:

idle: the time (in seconds) since the instance received the last event from a master.

If the master has no updates to send to the replicas, it sends heartbeat messages every replication_timeout seconds. The master is programmed to disconnect if it does not see acknowledgments of the heartbeat messages within replication_timeout * 4 seconds.

Therefore, in a healthy replication setup, idle should never exceed replication_timeout: if it does, either the replication is lagging seriously behind, because the master is running ahead of the replica, or the network link between the instances is down.
lag: the time difference between the local time at the instance, recorded when the event was received, and the local time at another master recorded when the event was written to the write-ahead log on that master.

Since the lag calculation uses the operating system clocks from two different machines, do not be surprised if it’s negative: a time drift may lead to the remote master clock being consistently behind the local instance’s clock.
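
The checks described above can be automated. The helper below is a minimal sketch: check_replication is a hypothetical function, not part of Tarantool, that walks a table shaped like box.info.replication and reports upstreams that are not in the follow state or whose idle time exceeds the given timeout.

```lua
-- Collect warnings from a table shaped like box.info.replication.
-- `timeout` stands for the configured replication_timeout value.
local function check_replication(replication, timeout)
    local problems = {}
    for id, replica in pairs(replication) do
        local upstream = replica.upstream
        -- The entry of the local instance has no upstream section.
        if upstream ~= nil then
            if upstream.status ~= 'follow' then
                table.insert(problems,
                    ('replica %d: upstream status is %s'):format(id, upstream.status))
            elseif upstream.idle > timeout then
                table.insert(problems,
                    ('replica %d: idle %.2f s exceeds %.2f s'):format(id, upstream.idle, timeout))
            end
        end
    end
    return problems
end

-- On a running instance, the helper could be called like this:
-- check_replication(box.info.replication, box.cfg.replication_timeout)
```

The lag value is left out of the check on purpose: as noted above, it depends on the clocks of two machines and can be negative.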

Recovering from a degraded state

A “degraded state” is a situation in which the master becomes unavailable because of a hardware or network failure or a programming bug.

In a master-replica set with manual failover, if a master disappears, error messages appear on the replicas stating that the connection is lost:

2023-12-04 13:19:04.724 [16755] main/110/applier/replicator@127.0.0.1:3301 I> can't read row
2023-12-04 13:19:04.724 [16755] main/110/applier/replicator@127.0.0.1:3301 coio.c:349 E> SocketError: unexpected EOF when reading from socket, called on fd 19, aka 127.0.0.1:55932, peer of 127.0.0.1:3301: Broken pipe
2023-12-04 13:19:04.724 [16755] main/110/applier/replicator@127.0.0.1:3301 I> will retry every 1.00 second
2023-12-04 13:19:04.724 [16755] relay/127.0.0.1:55940/101/main coio.c:349 E> SocketError: unexpected EOF when reading from socket, called on fd 23, aka 127.0.0.1:3302, peer of 127.0.0.1:55940: Broken pipe
2023-12-04 13:19:04.724 [16755] relay/127.0.0.1:55940/101/main I> exiting the relay loop

In a master-replica set with automated failover, the log also includes Raft messages showing the process of a new master’s election:

2023-12-04 13:16:56.340 [16615] main/111/applier/replicator@127.0.0.1:3302 I> can't read row
2023-12-04 13:16:56.340 [16615] main/111/applier/replicator@127.0.0.1:3302 coio.c:349 E> SocketError: unexpected EOF when reading from socket, called on fd 24, aka 127.0.0.1:55687, peer of 127.0.0.1:3302: Broken pipe
2023-12-04 13:16:56.340 [16615] main/111/applier/replicator@127.0.0.1:3302 I> will retry every 1.00 second
2023-12-04 13:16:56.340 [16615] relay/127.0.0.1:55695/101/main coio.c:349 E> SocketError: unexpected EOF when reading from socket, called on fd 25, aka 127.0.0.1:3301, peer of 127.0.0.1:55695: Broken pipe
2023-12-04 13:16:56.340 [16615] relay/127.0.0.1:55695/101/main I> exiting the relay loop
2023-12-04 13:16:59.690 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: message {term: 3, vote: 2, state: candidate, vclock: {1: 9}} from 2
2023-12-04 13:16:59.690 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: received a newer term from 2
2023-12-04 13:16:59.690 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: bump term to 3, follow
2023-12-04 13:16:59.690 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: vote for 2, follow
2023-12-04 13:16:59.691 [16615] main/119/raft_worker I> RAFT: persisted state {term: 3}
2023-12-04 13:16:59.691 [16615] main/119/raft_worker I> RAFT: persisted state {term: 3, vote: 2}
2023-12-04 13:16:59.691 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: message {term: 3, vote: 2, leader: 2, state: leader} from 2
2023-12-04 13:16:59.691 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: vote request is skipped - this is a notification about a vote for a third node, not a request
2023-12-04 13:16:59.691 [16615] main/112/applier/replicator@127.0.0.1:3303 I> RAFT: leader is 2, follow

The master’s upstream status is reported as disconnected when executing box.info.replication on a replica:

auto_leader:instance001> box.info.replication
---
- 1:
    id: 1
    uuid: 4cfa6e3c-625e-b027-00a7-29b2f2182f23
    lsn: 32
    upstream:
      peer: replicator@127.0.0.1:3302
      lag: 0.00032305717468262
      status: disconnected
      idle: 48.352504000002
      message: 'connect, called on fd 20, aka 127.0.0.1:62575: Connection refused'
      system_message: Connection refused
    name: instance002
    downstream:
      status: stopped
      message: 'unexpected EOF when reading from socket, called on fd 32, aka 127.0.0.1:3301,
        peer of 127.0.0.1:62204: Broken pipe'
      system_message: Broken pipe
  2:
    id: 2
    uuid: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660
    lsn: 1
    name: instance001
  3:
    id: 3
    uuid: 9a3a1b9b-8a18-baf6-00b3-a6e5e11fd8b6
    lsn: 0
    upstream:
      status: follow
      idle: 0.18620999999985
      peer: replicator@127.0.0.1:3303
      lag: 0.00012516975402832
    name: instance003
    downstream:
      status: follow
      idle: 0.19718099999955
      vclock: {2: 1, 1: 32}
      lag: 0.00051403045654297
...

To learn how to perform manual failover in a master-replica set, see the Performing manual failover section.

In a master-replica configuration with automated failover, a new master should be elected automatically.
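The same check can be scripted. The following is a minimal sketch in Python that works on a plain dictionary shaped like the box.info.replication report above. How the dictionary is obtained (for example, through a connector) is left out, and the helper name unhealthy_upstreams is hypothetical. Entries without an upstream section are skipped; in the report above that is the local instance.

```python
def unhealthy_upstreams(replication):
    """Return (id, name, status, message) for peers whose upstream is not 'follow'."""
    problems = []
    for peer_id, peer in sorted(replication.items()):
        upstream = peer.get("upstream")
        if upstream is None:
            # No upstream section: nothing to check for this entry.
            continue
        if upstream.get("status") != "follow":
            problems.append((peer_id, peer.get("name"),
                             upstream.get("status"),
                             upstream.get("system_message")))
    return problems


# Values copied from the report above, trimmed to the relevant fields.
replication = {
    1: {"name": "instance002",
        "upstream": {"status": "disconnected",
                     "system_message": "Connection refused"}},
    2: {"name": "instance001"},
    3: {"name": "instance003", "upstream": {"status": "follow"}},
}

for peer_id, name, status, message in unhealthy_upstreams(replication):
    print(f"{name} (id {peer_id}): upstream is {status}: {message}")
```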

Reseeding a replica

If any of a replica’s write-ahead log or snapshot files are corrupted or deleted, you can reseed the replica. This procedure works only if the master’s write-ahead logs are present.

Stop the replica using the tt stop command.
Delete write-ahead logs and snapshots stored in the var/lib/<instance_name> directory.

Note

var/lib is the default directory used by tt to store write-ahead logs and snapshots. Learn more from Configuration.
Start the replica using the tt start command. The replica should catch up with the master by retrieving all the master’s tuples.
(Optional) If you’re reseeding a replica after a replication conflict, you also need to restart replication.
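The deletion step can be previewed before anything is removed. The sketch below is in Python and assumes that the instance directory holds snapshot files with the .snap extension and write-ahead logs with the .xlog extension directly inside it; other files (for example, vinyl data) are not covered. Both function names are hypothetical.

```python
from pathlib import Path


def reseed_candidates(data_dir):
    """List .snap and .xlog files in data_dir without deleting them."""
    directory = Path(data_dir)
    return sorted(path for path in directory.iterdir()
                  if path.is_file() and path.suffix in (".snap", ".xlog"))


def delete_files(paths):
    """Delete the given files; call only after the replica has been stopped."""
    for path in paths:
        path.unlink()
```

A cautious way to use it is to print the result of reseed_candidates() for the replica's directory first, review the list, and only then pass it to delete_files().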

Resolving replication conflicts

Tarantool guarantees that every update is applied only once on every replica. However, due to the asynchronous nature of replication, the order of updates is not guaranteed. This topic describes how to solve problems in master-master replication.

Replacing the same primary key

Case 1: You have two Tarantool instances in a master-master replica set, and a replace operation with the same primary key is performed on both instances at the same time. This causes a conflict over which tuple to save and which one to discard.

Tarantool trigger functions let you implement conflict resolution rules based on a condition of your choice. For example, if each tuple carries a timestamp, you can keep the tuple with the larger timestamp.

First, you need a before_replace() trigger on the space which may have conflicts. In this trigger, you can compare the old and new replica records and choose which one to use (or skip the update entirely, or merge two records together).

Then you need to set this trigger before the space starts to receive any updates. The usual moment to set a before_replace trigger is right when the space is created. To catch that moment, set another trigger on the system space _space and install before_replace from there. This outer trigger can be an on_replace() trigger.

The difference between before_replace and on_replace is that on_replace is called after a row is inserted into the space, and before_replace is called before that.

To set a _space:on_replace() trigger correctly, you also need the right timing. The best timing to use it is when _space is just created, which is the box.ctl.on_schema_init() trigger.

You also need to utilize box.on_commit to get access to the space being created. The resulting snippet would be the following:

local my_space_name = 'my_space'
local my_trigger = function(old, new) ... end -- your function resolving a conflict
box.ctl.on_schema_init(function()
    box.space._space:on_replace(function(old_space, new_space)
        if not old_space and new_space and new_space.name == my_space_name then
            box.on_commit(function()
                box.space[my_space_name]:before_replace(my_trigger)
            end)
        end
    end)
end)

Preventing duplicate insert

Case 2: In a replica set of two masters, both of them try to insert data by the same unique key:

tarantool> box.space.tester:insert{1, 'data'}

This would cause an error saying Duplicate key exists in unique index 'primary' in space 'tester' and the replication would be stopped. (This is the behavior when the replication_skip_conflict configuration parameter has its default recommended value, false.)

$ # error messages from master #1
2017-06-26 21:17:03.233 [30444] main/104/applier/rep_user@100.96.166.1 I> can't read row
2017-06-26 21:17:03.233 [30444] main/104/applier/rep_user@100.96.166.1 memtx_hash.cc:226 E> ER_TUPLE_FOUND:
Duplicate key exists in unique index 'primary' in space 'tester'
2017-06-26 21:17:03.233 [30444] relay/[::ffff:100.96.166.178]/101/main I> the replica has closed its socket, exiting
2017-06-26 21:17:03.233 [30444] relay/[::ffff:100.96.166.178]/101/main C> exiting the relay loop

$ # error messages from master #2
2017-06-26 21:17:03.233 [30445] main/104/applier/rep_user@100.96.166.1 I> can't read row
2017-06-26 21:17:03.233 [30445] main/104/applier/rep_user@100.96.166.1 memtx_hash.cc:226 E> ER_TUPLE_FOUND:
Duplicate key exists in unique index 'primary' in space 'tester'
2017-06-26 21:17:03.234 [30445] relay/[::ffff:100.96.166.178]/101/main I> the replica has closed its socket, exiting
2017-06-26 21:17:03.234 [30445] relay/[::ffff:100.96.166.178]/101/main C> exiting the relay loop

If we check replication statuses with box.info, we will see that replication at master #1 is stopped (1.upstream.status = stopped). Additionally, no data is replicated from that master (section 1.downstream is missing in the report), because the downstream has encountered the same error:

# replication statuses (report from master #3)
tarantool> box.info
---
- version: 1.7.4-52-g980d30092
  id: 3
  ro: false
  vclock: {1: 9, 2: 1000000, 3: 3}
  uptime: 557
  lsn: 3
  vinyl: []
  cluster:
    uuid: 34d13b1a-f851-45bb-8f57-57489d3b3c8b
  pid: 30445
  status: running
  signature: 1000012
  replication:
    1:
      id: 1
      uuid: 7ab6dee7-dc0f-4477-af2b-0e63452573cf
      lsn: 9
      upstream:
        peer: replicator@192.168.0.101:3301
        lag: 0.00050592422485352
        status: stopped
        idle: 445.8626639843
        message: Duplicate key exists in unique index 'primary' in space 'tester'
    2:
      id: 2
      uuid: 9afbe2d9-db84-4d05-9a7b-e0cbbf861e28
      lsn: 1000000
      upstream:
        status: follow
        idle: 201.99915885925
        peer: replicator@192.168.0.102:3301
        lag: 0.0015020370483398
      downstream:
        vclock: {1: 8, 2: 1000000, 3: 3}
    3:
      id: 3
      uuid: e826a667-eed7-48d5-a290-64299b159571
      lsn: 3
  uuid: e826a667-eed7-48d5-a290-64299b159571
...

To learn how to resolve a replication conflict by reseeding a replica, see Resolving replication conflicts.

Replication runs out of sync

In a master-master cluster of two instances, suppose we make the following operation:

tarantool> box.space.tester:upsert({1}, {{'=', 2, box.info.uuid}})

When this operation is applied on both instances in the replica set:

# at master #1
tarantool> box.space.tester:upsert({1}, {{'=', 2, box.info.uuid}})
# at master #2
tarantool> box.space.tester:upsert({1}, {{'=', 2, box.info.uuid}})

… we can have the following results, depending on the order of execution:

each master’s row contains the UUID from master #1,
each master’s row contains the UUID from master #2,
master #1 has the UUID of master #2, and vice versa.

Commutative changes

The cases described in the previous paragraphs are examples of non-commutative operations, that is, operations whose result depends on the execution order. In contrast, for commutative operations the execution order does not matter.

Consider for example the following command:

tarantool> box.space.tester:upsert({1, 0}, {{'+', 2, 1}})

This operation is commutative: we get the same result no matter in which order the update is applied on the other masters.
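The difference can be checked with a small model. The following Python sketch is not Tarantool code: it represents only the update part of an upsert as a function applied to a single field and tries every possible order of application. The helper names assign, add, and final_values are hypothetical.

```python
from itertools import permutations


def assign(value):
    """Model of an '=' operation: overwrite the field."""
    return lambda current: value


def add(delta):
    """Model of a '+' operation: add to the field."""
    return lambda current: current + delta


def final_values(initial, operations):
    """Apply the operations in every possible order and collect the results."""
    results = set()
    for order in permutations(operations):
        value = initial
        for operation in order:
            value = operation(value)
        results.add(value)
    return results


# Two masters assign different values: the result depends on the order.
print(final_values(None, [assign("uuid-1"), assign("uuid-2")]))
# Two masters each add 1: every order gives the same result.
print(final_values(0, [add(1), add(1)]))
```

A result set with more than one element means that replicas applying the same operations in different orders can end up with different data.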

Trigger usage

The logic and the snippet that sets the trigger are the same here as in case 1, but the trigger function differs. Note that the trigger below assumes that the tuple has a timestamp in the second field.

local my_space_name = 'test'
local my_trigger = function(old, new, sp, op)
    -- op: 'INSERT', 'DELETE', 'UPDATE', or 'REPLACE'
    if new == nil then
        print("No new during "..op, old)
        return -- deletes are ok
    end
    if old == nil then
        print("Insert new, no old", new)
        return new  -- insert without old value: ok
    end
    print(op.." duplicate", old, new)
    if op == 'INSERT' then
        if new[2] > old[2] then
            -- Creating new tuple will change op to 'REPLACE'
            return box.tuple.new(new)
        end
        return old
    end
    if new[2] > old[2] then
        return new
    else
        return old
    end
    return
end

box.ctl.on_schema_init(function()
    box.space._space:on_replace(function(old_space, new_space)
        if not old_space and new_space and new_space.name == my_space_name then
            box.on_commit(function()
                box.space[my_space_name]:before_replace(my_trigger)
            end)
        end
    end)
end)
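To see why this rule makes the masters converge, the decision made by the trigger above can be modeled outside Tarantool. The Python sketch below is an illustration under simplifying assumptions: tuples are plain Python tuples, None stands for a missing tuple, and the timestamp is at index 1 (the second field, since Python indexes from 0). The names resolve and apply_all are hypothetical.

```python
def resolve(old, new):
    """Keep the tuple with the larger timestamp in the second field."""
    if new is None:
        return None  # deletes pass through
    if old is None:
        return new   # nothing to compare with
    return new if new[1] > old[1] else old


def apply_all(writes):
    """Apply writes to one key in the given order and return the stored tuple."""
    stored = None
    for write in writes:
        stored = resolve(stored, write)
    return stored


write_a = (1, 100, "from master #1")
write_b = (1, 200, "from master #2")

# Master #1 applies its own write first; master #2 sees the opposite order.
print(apply_all([write_a, write_b]))
print(apply_all([write_b, write_a]))
```

Both orders keep the tuple with timestamp 200. With equal timestamps the comparison keeps the existing tuple, so in this model each master would keep its own write; a tie-breaking rule is needed if equal timestamps are possible.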

Server introspection

Using Tarantool as a client

Tarantool enters the interactive mode if:

you start Tarantool without an instance file, or
the instance file contains console.start().

Tarantool displays a prompt (e.g. “tarantool>”) and you can enter requests. When used this way, Tarantool can be a client for a remote server. See basic examples in Getting started.

The interactive mode is used in the tt utility’s connect command.

Executing code on an instance

You can attach to an instance’s admin console and execute some Lua code using tt:

$ # for local instances:
$ tt connect my_app
   • Connecting to the instance...
   • Connected to /var/run/tarantool/my_app.control

/var/run/tarantool/my_app.control> 1 + 1
---
- 2
...
/var/run/tarantool/my_app.control>

$ # for local and remote instances:
$ tt connect username:password@127.0.0.1:3306

You can also use tt to execute Lua code on an instance without attaching to its admin console. For example:

$ # executing commands directly from the command line
$ <command> | tt connect my_app -f -
<...>

$ # - OR -

$ # executing commands from a script file
$ tt connect my_app -f script.lua
<...>

Note

Alternatively, you can use the console module or the net.box module from a Tarantool server. Also, you can write your client programs with any of the connectors. However, most of the examples in this manual illustrate usage with either tt connect or using the Tarantool server as a client.

Health checks

To check the instance status, run:

$ tt status my_app

$ # - OR -

$ systemctl status tarantool@my_app

To check the boot log, on systems with systemd, run:

$ journalctl -u tarantool@my_app -n 5

For more specific checks, use the reports provided by functions in the following submodules:

Submodule box.cfg (check and specify all configuration parameters for the Tarantool server)
Submodule box.slab (monitor the total use and fragmentation of memory allocated for storing data in Tarantool)
Submodule box.info (introspect Tarantool server variables, primarily those related to replication)
Submodule box.stat (introspect Tarantool request and network statistics)

Finally, there is the metrics library, which enables collecting metrics (such as memory usage or the number of requests) from Tarantool applications and exposing them via various protocols, including Prometheus. Check Monitoring for more details.

Example

A very popular administrator request is box.slab.info(), which displays detailed memory usage statistics for a Tarantool instance.

tarantool> box.slab.info()
---
- items_size: 228128
  items_used_ratio: 1.8%
  quota_size: 1073741824
  quota_used_ratio: 0.8%
  arena_used_ratio: 43.2%
  items_used: 4208
  quota_used: 8388608
  arena_size: 2325176
  arena_used: 1003632
...
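In this report, each *_used_ratio field equals the matching *_used value divided by the matching *_size value. The short Python sketch below recomputes the three ratios from the numbers shown above; the helper name ratio_percent is hypothetical.

```python
def ratio_percent(used, total):
    """Return used/total as a percentage rounded to one decimal place."""
    return round(100.0 * used / total, 1)


# Values copied from the box.slab.info() report above.
slab = {
    "items_size": 228128, "items_used": 4208,
    "quota_size": 1073741824, "quota_used": 8388608,
    "arena_size": 2325176, "arena_used": 1003632,
}

for name in ("items", "quota", "arena"):
    percent = ratio_percent(slab[name + "_used"], slab[name + "_size"])
    print(f"{name}_used_ratio: {percent}%")
```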

Tarantool takes memory from the operating system, for example when a user does many insertions. You can see how much it has taken by saying (on Linux):

ps -eo args,%mem | grep "tarantool"

Tarantool almost never releases this memory, even if the user deletes everything that was inserted, or reduces fragmentation by calling the Lua garbage collector via the collectgarbage function.

Ordinarily this does not affect performance. But, to force Tarantool to release memory, you can call box.snapshot(), stop the server instance, and restart it.

Inspect traffic

Inspecting binary traffic by hand is tedious. We offer a Wireshark plugin to simplify the analysis of Tarantool’s traffic.

To enable the plugin, follow the steps below.

Clone the tarantool-dissector repository:

git clone https://github.com/tarantool/tarantool-dissector.git

Copy or symlink the plugin files into the Wireshark plugin directory:

mkdir -p ~/.local/lib/wireshark/plugins
cd ~/.local/lib/wireshark/plugins
ln -s /path/to/tarantool-dissector/MessagePack.lua ./
ln -s /path/to/tarantool-dissector/tarantool.dissector.lua ./

(For the location of the plugin directory on macOS and Windows, please refer to the Plugin folders chapter in the Wireshark documentation.)

Run the Wireshark GUI and ensure that the plugins are loaded:

Open Help > About Wireshark > Plugins.
Find MessagePack.lua and tarantool.dissector.lua in the list.

Now you can inspect incoming and outgoing Tarantool packets with user-friendly annotations.

Visit the project page for details: https://github.com/tarantool/tarantool-dissector.

Profiling performance issues

Tarantool can at times work slower than usual. There can be multiple reasons, such as disk issues, CPU-intensive Lua scripts or misconfiguration. Tarantool’s log may lack details in such cases, so the only indications that something goes wrong are log entries like this: W> too long DELETE: 8.546 sec. Here are tools and techniques that can help you collect Tarantool’s performance profile, which is helpful in troubleshooting slowdowns.

Note

Most of these tools (except fiber.info()) are intended for generic GNU/Linux distributions, not FreeBSD or macOS.

fiber.info()

The simplest profiling method is to take advantage of Tarantool’s built-in functionality. fiber.info() returns information about all running fibers with their corresponding C stack traces. You can use this data to see how many fibers are running and which C functions are executed more often than others.

First, enter your instance’s interactive administrator console:

$ tt connect NAME|URI

Once there, load the fiber module:

tarantool> fiber = require('fiber')

After that you can get the required information with fiber.info().

At this point, your console output should look something like this:

tarantool> fiber = require('fiber')
---
...
tarantool> fiber.info()
---
- 360:
    csw: 2098165
    backtrace:
    - '#0 0x4d1b77 in wal_write(journal*, journal_entry*)+487'
    - '#1 0x4bbf68 in txn_commit(txn*)+152'
    - '#2 0x4bd5d8 in process_rw(request*, space*, tuple**)+136'
    - '#3 0x4bed48 in box_process1+104'
    - '#4 0x4d72f8 in lbox_replace+120'
    - '#5 0x50f317 in lj_BC_FUNCC+52'
    fid: 360
    memory:
      total: 61744
      used: 480
    name: main
  129:
    csw: 113
    backtrace: []
    fid: 129
    memory:
      total: 57648
      used: 0
    name: 'console/unix/:'
...

We highly recommend assigning meaningful names to the fibers you create so that you can find them in the fiber.info() list. In the example below, we create a fiber named myworker:

tarantool> fiber = require('fiber')
---
...
tarantool> f = fiber.create(function() while true do fiber.sleep(0.5) end end)
---
...
tarantool> f:name('myworker') -- assigning the name to a fiber
---
...
tarantool> fiber.info()
---
- 102:
    csw: 14
    backtrace:
    - '#0 0x501a1a in fiber_yield_timeout+90'
    - '#1 0x4f2008 in lbox_fiber_sleep+72'
    - '#2 0x5112a7 in lj_BC_FUNCC+52'
    fid: 102
    memory:
      total: 57656
      used: 0
    name: myworker # newly created background fiber
  101:
    csw: 284
    backtrace: []
    fid: 101
    memory:
      total: 57656
      used: 0
    name: interactive
...

You can kill any fiber with fiber.kill(fid):

tarantool> fiber.kill(102)
---
...
tarantool> fiber.info()
---
- 101:
    csw: 324
    backtrace: []
    fid: 101
    memory:
      total: 57656
      used: 0
    name: interactive
...

To get a table of all alive fibers you can use fiber.top().

If you want to dynamically obtain information with fiber.info(), the shell script below may come in handy. It connects to a Tarantool instance specified by NAME every 0.5 seconds, grabs the fiber.info() output and writes it to the fiber-info.txt file:

$ rm -f fiber-info.txt
$ watch -n 0.5 "echo 'require(\"fiber\").info()' | tt connect NAME -f - | tee -a fiber-info.txt"

If you can’t understand which fiber causes performance issues, collect the metrics of the fiber.info() output for 10-15 seconds using the script above and contact the Tarantool team at support@tarantool.org.
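When you have several fiber.info() snapshots, one rough way to spot busy fibers is to compare the csw (context switch) counters between two of them. The Python sketch below assumes that the snapshots have already been parsed into dictionaries keyed by fiber ID; the parsing itself is not shown, and the name csw_delta is hypothetical. A fast-growing counter only shows that a fiber is switched often, so treat it as a hint and not as a CPU measurement.

```python
def csw_delta(before, after):
    """Return {fid: (name, context switches between snapshots)}.

    Only fibers present in both snapshots are compared.
    """
    delta = {}
    for fid, info in after.items():
        if fid in before:
            delta[fid] = (info["name"], info["csw"] - before[fid]["csw"])
    return delta


# Trimmed snapshots based on the two reports above
# (before and after fiber.kill(102)).
before = {102: {"name": "myworker", "csw": 14},
          101: {"name": "interactive", "csw": 284}}
after = {101: {"name": "interactive", "csw": 324}}

busiest = sorted(csw_delta(before, after).items(),
                 key=lambda item: item[1][1], reverse=True)
for fid, (name, switches) in busiest:
    print(fid, name, switches)
```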

Poor man’s profilers

pstack <pid>

To use this tool, first install it with a package manager that comes with your Linux distribution. This command prints an execution stack trace of a running process specified by the PID. You might want to run this command several times in a row to pinpoint the bottleneck that causes the slowdown.

Once installed, say:

$ pstack $(pidof tarantool INSTANCENAME.lua)

Next, say:

$ echo $(pidof tarantool INSTANCENAME.lua)

to show the PID of the Tarantool instance that runs the INSTANCENAME.lua file.

You should get similar output:

Thread 19 (Thread 0x7f09d1bff700 (LWP 24173)):
#0 0x00007f0a1a5423f2 in ?? () from /lib64/libgomp.so.1
#1 0x00007f0a1a53fdc0 in ?? () from /lib64/libgomp.so.1
#2 0x00007f0a1ad5adc5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f0a1a050ced in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x7f09d13fe700 (LWP 24174)):
#0 0x00007f0a1a5423f2 in ?? () from /lib64/libgomp.so.1
#1 0x00007f0a1a53fdc0 in ?? () from /lib64/libgomp.so.1
#2 0x00007f0a1ad5adc5 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f0a1a050ced in clone () from /lib64/libc.so.6
<...>
Thread 2 (Thread 0x7f09c8bfe700 (LWP 24191)):
#0 0x00007f0a1ad5e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000045d901 in wal_writer_pop(wal_writer*) ()
#2 0x000000000045db01 in wal_writer_f(__va_list_tag*) ()
#3 0x0000000000429abc in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*) ()
#4 0x00000000004b52a0 in fiber_loop ()
#5 0x00000000006099cf in coro_init ()
Thread 1 (Thread 0x7f0a1c47fd80 (LWP 24172)):
#0 0x00007f0a1a0512c3 in epoll_wait () from /lib64/libc.so.6
#1 0x00000000006051c8 in epoll_poll ()
#2 0x0000000000607533 in ev_run ()
#3 0x0000000000428e13 in main ()

gdb -ex "bt" -p <pid>

As with pstack, the GNU debugger (also known as gdb) needs to be installed before you can start using it. Your Linux package manager can help you with that.

Once the debugger is installed, say:

$ gdb -ex "set pagination 0" -ex "thread apply all bt" --batch -p $(pidof tarantool INSTANCENAME.lua)

Next, say:

$ echo $(pidof tarantool INSTANCENAME.lua)

to show the PID of the Tarantool instance that runs the INSTANCENAME.lua file.

After using the debugger, your console output should look like this:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

[CUT]

Thread 1 (Thread 0x7f72289ba940 (LWP 20535)):
#0 _int_malloc (av=av@entry=0x7f7226e0eb20 <main_arena>, bytes=bytes@entry=504) at malloc.c:3697
#1 0x00007f7226acf21a in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3234
#2 0x00000000004631f8 in vy_merge_iterator_reserve (capacity=3, itr=0x7f72264af9e0) at /usr/src/tarantool/src/box/vinyl.c:7629
#3 vy_merge_iterator_add (itr=itr@entry=0x7f72264af9e0, is_mutable=is_mutable@entry=true, belong_range=belong_range@entry=false) at /usr/src/tarantool/src/box/vinyl.c:7660
#4 0x00000000004703df in vy_read_iterator_add_mem (itr=0x7f72264af990) at /usr/src/tarantool/src/box/vinyl.c:8387
#5 vy_read_iterator_use_range (itr=0x7f72264af990) at /usr/src/tarantool/src/box/vinyl.c:8453
#6 0x000000000047657d in vy_read_iterator_start (itr=<optimized out>) at /usr/src/tarantool/src/box/vinyl.c:8501
#7 0x00000000004766b5 in vy_read_iterator_next (itr=itr@entry=0x7f72264af990, result=result@entry=0x7f72264afad8) at /usr/src/tarantool/src/box/vinyl.c:8592
#8 0x000000000047689d in vy_index_get (tx=tx@entry=0x7f7226468158, index=index@entry=0x2563860, key=<optimized out>, part_count=<optimized out>, result=result@entry=0x7f72264afad8) at /usr/src/tarantool/src/box/vinyl.c:5705
#9 0x0000000000477601 in vy_replace_impl (request=<optimized out>, request=<optimized out>, stmt=0x7f72265a7150, space=0x2567ea0, tx=0x7f7226468158) at /usr/src/tarantool/src/box/vinyl.c:5920
#10 vy_replace (tx=0x7f7226468158, stmt=stmt@entry=0x7f72265a7150, space=0x2567ea0, request=<optimized out>) at /usr/src/tarantool/src/box/vinyl.c:6608
#11 0x00000000004615a9 in VinylSpace::executeReplace (this=<optimized out>, txn=<optimized out>, space=<optimized out>, request=<optimized out>) at /usr/src/tarantool/src/box/vinyl_space.cc:108
#12 0x00000000004bd723 in process_rw (request=request@entry=0x7f72265a70f8, space=space@entry=0x2567ea0, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:182
#13 0x00000000004bed48 in box_process1 (request=0x7f72265a70f8, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:700
#14 0x00000000004bf389 in box_replace (space_id=space_id@entry=513, tuple=<optimized out>, tuple_end=<optimized out>, result=result@entry=0x7f72264afbc8) at /usr/src/tarantool/src/box/box.cc:754
#15 0x00000000004d72f8 in lbox_replace (L=0x413c5780) at /usr/src/tarantool/src/box/lua/index.c:72
#16 0x000000000050f317 in lj_BC_FUNCC ()
#17 0x00000000004d37c7 in execute_lua_call (L=0x413c5780) at /usr/src/tarantool/src/box/lua/call.c:282
#18 0x000000000050f317 in lj_BC_FUNCC ()
#19 0x0000000000529c7b in lua_cpcall ()
#20 0x00000000004f6aa3 in luaT_cpcall (L=L@entry=0x413c5780, func=func@entry=0x4d36d0 <execute_lua_call>, ud=ud@entry=0x7f72264afde0) at /usr/src/tarantool/src/lua/utils.c:962
#21 0x00000000004d3fe7 in box_process_lua (handler=0x4d36d0 <execute_lua_call>, out=out@entry=0x7f7213020600, request=request@entry=0x413c5780) at /usr/src/tarantool/src/box/lua/call.c:382
#22 box_lua_call (request=request@entry=0x7f72130401d8, out=out@entry=0x7f7213020600) at /usr/src/tarantool/src/box/lua/call.c:405
#23 0x00000000004c0f27 in box_process_call (request=request@entry=0x7f72130401d8, out=out@entry=0x7f7213020600) at /usr/src/tarantool/src/box/box.cc:1074
#24 0x000000000041326c in tx_process_misc (m=0x7f7213040170) at /usr/src/tarantool/src/box/iproto.cc:942
#25 0x0000000000504554 in cmsg_deliver (msg=0x7f7213040170) at /usr/src/tarantool/src/cbus.c:302
#26 0x0000000000504c2e in fiber_pool_f (ap=<error reading variable: value has been optimized out>) at /usr/src/tarantool/src/fiber_pool.c:64
#27 0x000000000041122c in fiber_cxx_invoke(fiber_func, typedef __va_list_tag __va_list_tag *) (f=<optimized out>, ap=<optimized out>) at /usr/src/tarantool/src/fiber.h:645
#28 0x00000000005011a0 in fiber_loop (data=<optimized out>) at /usr/src/tarantool/src/fiber.c:641
#29 0x0000000000688fbf in coro_init () at /usr/src/tarantool/third_party/coro/coro.c:110

Run the debugger in a loop a few times to collect enough samples for making conclusions about why Tarantool demonstrates suboptimal performance. Use the following script:

$ rm -f stack-trace.txt
$ watch -n 0.5 "gdb -ex 'set pagination 0' -ex 'thread apply all bt' --batch -p $(pidof tarantool INSTANCENAME.lua) | tee -a stack-trace.txt"

Structurally and functionally, this script is very similar to the one used with fiber.info() above.

If you have any difficulties troubleshooting, let the script run for 10-15 seconds and then send the resulting stack-trace.txt file to the Tarantool team at support@tarantool.org.
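Before sending the file, you can get a first impression by counting which function appears most often at the top of the stack (frame #0) across all collected samples. The Python sketch below is one way to do that for text in the pstack or gdb format shown above. The name top_frames is hypothetical, and the regular expression covers only the two frame layouts that appear in this section.

```python
import re
from collections import Counter

# Matches frame #0 with or without a leading address, e.g.
#   "#0 0x00007f0a1a0512c3 in epoll_wait () from /lib64/libc.so.6"
#   "#0 _int_malloc (av=...) at malloc.c:3697"
TOP_FRAME = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+ in )?(\S+)")


def top_frames(text):
    """Count how often each function appears as frame #0."""
    counts = Counter()
    for line in text.splitlines():
        match = TOP_FRAME.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
Thread 2 (Thread 0x7f09c8bfe700 (LWP 24191)):
#0 0x00007f0a1ad5e6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000045d901 in wal_writer_pop(wal_writer*) ()
Thread 1 (Thread 0x7f0a1c47fd80 (LWP 24172)):
#0 0x00007f0a1a0512c3 in epoll_wait () from /lib64/libc.so.6
#1 0x00000000006051c8 in epoll_poll ()
Thread 1 (Thread 0x7f0a1c47fd80 (LWP 24172)):
#0 0x00007f0a1a0512c3 in epoll_wait () from /lib64/libc.so.6
"""

for function, count in top_frames(sample).most_common():
    print(count, function)
```

To analyze real data, read stack-trace.txt into a string and pass it to top_frames(). Functions such as epoll_wait at the top usually mean that the thread was idle when sampled.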

Warning

Use the poor man’s profilers with caution: each time they attach to a running process, this stops the process execution for about a second, which may leave a serious footprint in high-load services.

gperftools

To use the CPU profiler from the Google Performance Tools suite with Tarantool, first take care of the prerequisites:

For Debian/Ubuntu, run:

$ apt-get install libgoogle-perftools4

For RHEL/CentOS/Fedora, run:

$ yum install gperftools-libs

Once you do this, install Lua bindings:

$ tt rocks install gperftools

Now you’re ready to go. Enter your instance’s interactive administrator console:

$ tt connect NAME|URI

To start profiling, say:

tarantool> cpuprof = require('gperftools.cpu')
tarantool> cpuprof.start('/home/<username>/tarantool-on-production.prof')

It takes at least a couple of minutes for the profiler to gather performance metrics. After that, save the results to disk (you can do that as many times as you need):

tarantool> cpuprof.flush()

To stop profiling, say:

tarantool> cpuprof.stop()

You can now analyze the output with the pprof utility that comes with the gperftools package:

$ pprof --text /usr/bin/tarantool /home/<username>/tarantool-on-production.prof

Note

On Debian/Ubuntu, the pprof utility is called google-pprof.

Your output should look similar to this:

Total: 598 samples
      83 13.9% 13.9% 83 13.9% epoll_wait
      54 9.0% 22.9% 102 17.1% vy_mem_tree_insert.constprop.35
      32 5.4% 28.3% 34 5.7% __write_nocancel
      28 4.7% 32.9% 42 7.0% vy_mem_iterator_start_from
      26 4.3% 37.3% 26 4.3% _IO_str_seekoff
      21 3.5% 40.8% 21 3.5% tuple_compare_field
      19 3.2% 44.0% 19 3.2% ::TupleCompareWithKey::compare
      19 3.2% 47.2% 38 6.4% tuple_compare_slowpath
      12 2.0% 49.2% 23 3.8% __libc_calloc
       9 1.5% 50.7% 9 1.5% ::TupleCompare::compare@42efc0
       9 1.5% 52.2% 9 1.5% vy_cache_on_write
       9 1.5% 53.7% 57 9.5% vy_merge_iterator_next_key
       8 1.3% 55.0% 8 1.3% __nss_passwd_lookup
       6 1.0% 56.0% 25 4.2% gc_onestep
       6 1.0% 57.0% 6 1.0% lj_tab_next
       5 0.8% 57.9% 5 0.8% lj_alloc_malloc
       5 0.8% 58.7% 131 21.9% vy_prepare

perf

This tool for performance monitoring and analysis is installed separately via your package manager. Try running the perf command in the terminal and follow the prompts to install the necessary package(s).

Note

By default, some perf commands are restricted to root, so, to be on the safe side, either run all commands as root or prepend them with sudo.

To start gathering performance statistics, say:

$ perf record -g -p $(pidof tarantool INSTANCENAME.lua)

This command saves the gathered data to a file named perf.data inside the current working directory. To stop this process (usually, after 10-15 seconds), press Ctrl+C. In your console, you’ll see:

^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.225 MB perf.data (1573 samples) ]

Now run the following command:

$ perf report -n -g --stdio | tee perf-report.txt

It formats the statistical data in the perf.data file into a performance report and writes it to the perf-report.txt file.

The resulting output should look similar to this:

# Samples: 14K of event 'cycles'
# Event count (approx.): 9927346847
#
# Children Self Samples Command Shared Object Symbol
# ........ ........ ............ ......... .................. .......................................
#
    35.50% 0.55% 79 tarantool tarantool [.] lj_gc_step
            |
             --34.95%--lj_gc_step
                       |
                       |--29.26%--gc_onestep
                       | |
                       | |--13.85%--gc_sweep
                       | | |
                       | | |--5.59%--lj_alloc_free
                       | | |
                       | | |--1.33%--lj_tab_free
                       | | | |
                       | | | --1.01%--lj_alloc_free
                       | | |
                       | | --1.17%--lj_cdata_free
                       | |
                       | |--5.41%--gc_finalize
                       | | |
                       | | |--1.06%--lj_obj_equal
                       | | |
                       | | --0.95%--lj_tab_set
                       | |
                       | |--4.97%--rehashtab
                       | | |
                       | | --3.65%--lj_tab_resize
                       | | |
                       | | |--0.74%--lj_tab_set
                       | | |
                       | | --0.72%--lj_tab_newkey
                       | |
                       | |--0.91%--propagatemark
                       | |
                       | --0.67%--lj_cdata_free
                       |
                        --5.43%--propagatemark
                                  |
                                   --0.73%--gc_mark

Unlike the poor man’s profilers, gperftools and perf have low overhead (almost negligible as compared with pstack and gdb): they don’t result in long delays when attaching to a process and therefore can be used without serious consequences.

jit.p

The jit.p profiler comes with the Tarantool application server. To load it, say require('jit.p') or require('jit.profile'). There are many options for sampling and display. They are described in the documentation for the LuaJIT Profiler, available from the 2.1 branch of the git repository in the file doc/ext_profiler.html.

Example

Make a function named f3 that calls a function named f1, which does 500,000 inserts and deletes in a Tarantool space. Start the profiler, execute f3, stop the profiler, and show what the profiler sampled.

box.space.t:drop()
box.schema.space.create('t')
box.space.t:create_index('i')
function f1()
  for i = 1, 500000 do
    box.space.t:insert{i}
    box.space.t:delete{i}
  end
  return 1
end
function f3() f1() end
jit_p = require("jit.profile")
sampletable = {}
jit_p.start("f", function(thread, samples, vmstate)
  local dump=jit_p.dumpstack(thread, "f", 1)
  sampletable[dump] = (sampletable[dump] or 0) + samples
end)
f3()
jit_p.stop()
for d,v in pairs(sampletable) do print(v, d) end

Typically the result will show that the sampling happened within f1() many times, but also within internal Tarantool functions, whose names may change with each new version.

Daemon supervision

Server signals

Tarantool processes these signals during the event loop in the transaction processor thread:

Signal	Effect
SIGHUP	May cause log file rotation. See the example in reference on Tarantool logging parameters.
SIGUSR1	May cause a database checkpoint. See Function box.snapshot.
SIGTERM	May cause graceful shutdown (information will be saved first).
SIGINT (also known as keyboard interrupt)	May cause graceful shutdown.
SIGKILL	Causes an immediate shutdown.

Other signals will result in behavior defined by the operating system. Signals other than SIGKILL may be ignored, especially if Tarantool is executing a long-running procedure which prevents return to the event loop in the transaction processor thread.
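Signals from the table can also be sent from a script. The Python sketch below is a thin wrapper around the standard os.kill call that accepts only the signal names listed above. The function name send_signal and the EFFECTS dictionary are hypothetical, and the process ID must be supplied by the caller.

```python
import os
import signal

# Signal names and effects as listed in the table above.
EFFECTS = {
    "SIGHUP": "may cause log file rotation",
    "SIGUSR1": "may cause a database checkpoint",
    "SIGTERM": "may cause graceful shutdown",
    "SIGINT": "may cause graceful shutdown",
    "SIGKILL": "causes an immediate shutdown",
}


def send_signal(pid, name):
    """Send one of the signals from the table to the process with this PID."""
    if name not in EFFECTS:
        raise ValueError(f"unsupported signal name: {name}")
    os.kill(pid, getattr(signal, name))
    return EFFECTS[name]
```

For example, send_signal(pid, "SIGUSR1") asks the instance with the given PID to make a checkpoint, subject to the caveats described above.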

Automatic instance restart

On systemd-enabled platforms, systemd automatically restarts all Tarantool instances in case of failure. To demonstrate it, let’s try to destroy an instance:

$ systemctl status tarantool@my_app|grep PID
Main PID: 5885 (tarantool)
$ tt connect my_app
   • Connecting to the instance...
   • Connected to /var/run/tarantool/my_app.control
/var/run/tarantool/my_app.control> os.exit(-1)
   ⨯ Connection was closed. Probably instance process isn't running anymore

Now let’s make sure that systemd has restarted the instance:

$ systemctl status tarantool@my_app|grep PID
Main PID: 5914 (tarantool)

Additionally, you can find the information about the instance restart in the boot logs:

$ journalctl -u tarantool@my_app -n 8

Core dumps

Tarantool makes a core dump if it receives any of the following signals: SIGSEGV, SIGFPE, SIGABRT or SIGQUIT. This is automatic if Tarantool crashes.

On systemd-enabled platforms, coredumpctl automatically saves core dumps and stack traces in case of a crash. Here is a general procedure for enabling core dumps on a Unix system:

Ensure session limits are configured to enable core dumps, for example, by running ulimit -c unlimited. Check “man 5 core” for other reasons why a core dump may not be produced.
Set a directory for writing core dumps to, and make sure that the directory is writable. On Linux, the directory path is set in a kernel parameter configurable via /proc/sys/kernel/core_pattern.
Make sure that core dumps include stack trace information. If you use a binary Tarantool distribution, this is automatic. If you build Tarantool from source, you will not get detailed information if you pass -DCMAKE_BUILD_TYPE=Release to CMake.

To simulate a crash, you can execute an illegal command against a Tarantool instance:

$ # !!! please never do this on a production system !!!
$ tt connect my_app
   • Connecting to the instance...
   • Connected to /var/run/tarantool/my_app.control
/var/run/tarantool/my_app.control> require('ffi').cast('char *', 0)[0] = 48
   ⨯ Connection was closed. Probably instance process isn't running anymore

Alternatively, if you know the process ID of the instance (here we refer to it as $PID), you can get a core dump of a Tarantool instance by running the gdb debugger:

$ gdb -batch -ex "generate-core-file" -p $PID

or manually sending a SIGABRT signal:

$ kill -SIGABRT $PID

Note

To find out the process ID of the instance ($PID), you can:

look it up in the instance’s box.info.pid,
find it with ps -A | grep tarantool, or
say systemctl status tarantool@my_app|grep PID.

On a systemd-enabled system, to see the latest crashes of the Tarantool daemon, say:

$ coredumpctl list /usr/bin/tarantool
MTIME                            PID   UID   GID SIG PRESENT EXE
Sat 2016-01-23 15:21:24 MSK   20681  1000  1000   6   /usr/bin/tarantool
Sat 2016-01-23 15:51:56 MSK   21035   995   992   6   /usr/bin/tarantool

To save a core dump into a file, say:

$ coredumpctl -o filename.core dump <pid>

Stack traces

Since Tarantool stores tuples in memory, core files may be large. For investigation, you normally don’t need the whole file, but only a “stack trace” or “backtrace”.

To save a stack trace into a file, say:

$ gdb -se "tarantool" -ex "bt full" -ex "thread apply all bt" --batch -c core > /tmp/tarantool_trace.txt

where:

“tarantool” is the path to the Tarantool executable,
“core” is the path to the core file, and
“/tmp/tarantool_trace.txt” is a sample path to a file for saving the stack trace.

Note

Occasionally, you may find that the trace file contains output without debug symbols: the lines will contain “??” instead of names. If this happens, check the instructions on these Tarantool wiki pages: How to debug core dump of stripped tarantool and How to debug core from different OS.

To see the stack trace and other useful information in console, say:

$ coredumpctl info 21035
          PID: 21035 (tarantool)
          UID: 995 (tarantool)
          GID: 992 (tarantool)
       Signal: 6 (ABRT)
    Timestamp: Sat 2016-01-23 15:51:42 MSK (4h 36min ago)
 Command Line: tarantool my_app.lua <running>
   Executable: /usr/bin/tarantool
Control Group: /system.slice/system-tarantool.slice/tarantool@my_app.service
         Unit: tarantool@my_app.service
        Slice: system-tarantool.slice
      Boot ID: 7c686e2ef4dc4e3ea59122757e3067e2
   Machine ID: a4a878729c654c7093dc6693f6a8e5ee
     Hostname: localhost.localdomain
      Message: Process 21035 (tarantool) of user 995 dumped core.

               Stack trace of thread 21035:
               #0  0x00007f84993aa618 raise (libc.so.6)
               #1  0x00007f84993ac21a abort (libc.so.6)
               #2  0x0000560d0a9e9233 _ZL12sig_fatal_cbi (tarantool)
               #3  0x00007f849a211220 __restore_rt (libpthread.so.0)
               #4  0x0000560d0aaa5d9d lj_cconv_ct_ct (tarantool)
               #5  0x0000560d0aaa687f lj_cconv_ct_tv (tarantool)
               #6  0x0000560d0aaabe33 lj_cf_ffi_meta___newindex (tarantool)
               #7  0x0000560d0aaae2f7 lj_BC_FUNCC (tarantool)
               #8  0x0000560d0aa9aabd lua_pcall (tarantool)
               #9  0x0000560d0aa71400 lbox_call (tarantool)
               #10 0x0000560d0aa6ce36 lua_fiber_run_f (tarantool)
               #11 0x0000560d0a9e8d0c _ZL16fiber_cxx_invokePFiP13__va_list_tagES0_ (tarantool)
               #12 0x0000560d0aa7b255 fiber_loop (tarantool)
               #13 0x0000560d0ab38ed1 coro_init (tarantool)
               ...

Debugger

To start the gdb debugger on the core dump, say:

$ coredumpctl gdb <pid>

It is highly recommended to install the tarantool-debuginfo package to improve the gdb experience, for example:

$ dnf debuginfo-install tarantool

gdb also provides information about the debuginfo packages you need to install:

$ gdb -p <pid>
...
Missing separate debuginfos, use: dnf debuginfo-install
glibc-2.22.90-26.fc24.x86_64 krb5-libs-1.14-12.fc24.x86_64
libgcc-5.3.1-3.fc24.x86_64 libgomp-5.3.1-3.fc24.x86_64
libselinux-2.4-6.fc24.x86_64 libstdc++-5.3.1-3.fc24.x86_64
libyaml-0.1.6-7.fc23.x86_64 ncurses-libs-6.0-1.20150810.fc24.x86_64
openssl-libs-1.0.2e-3.fc24.x86_64

Symbolic names are present in stack traces even if you don’t have the tarantool-debuginfo package installed.

Disaster recovery

The minimal fault-tolerant Tarantool configuration would be a replica set that includes a master and a replica, or two masters. The basic recommendation is to configure all Tarantool instances in a replica set to create snapshot files on a regular basis.

Here are action plans for typical crash scenarios.

Master-replica

Master crash: manual failover

Configuration: master-replica (manual failover).

Problem: The master has crashed.

Actions:

Ensure the master is stopped. For example, log in to the master machine and use tt stop.
Configure a new replica set leader using the <replicaset_name>.leader option.
Reload configuration on all instances using config:reload().
Make sure that a new replica set leader is a master using box.info.ro.
On a new master, remove a crashed instance from the ‘_cluster’ space.
Set up a replacement for the crashed master on a spare host.

Master crash: automated failover

Configuration: master-replica (automated failover).

Problem: The master has crashed.

Actions:

Use box.info.election to make sure a new master is elected automatically.
On a new master, remove a crashed instance from the ‘_cluster’ space.
Set up a replacement for the crashed master on a spare host.

Data loss

Configuration: master-replica.

Problem: Some transactions are missing on a replica after the master has crashed.

Actions:

You lose the few transactions in the master write-ahead log file that may not have been transferred to the replica before the crash. If you were able to salvage the master .xlog file, you may be able to recover them.

Find out instance UUID from the crashed master xlog:

$ head -5 var/lib/instance001/*.xlog | grep Instance
Instance: 9bb111c2-3ff5-36a7-00f4-2b9a573ea660

On the new master, use the UUID to find the position:

app:instance002> box.info.vclock[box.space._cluster.index.uuid:select{'9bb111c2-3ff5-36a7-00f4-2b9a573ea660'}[1][1]]
---
- 999
...

Play the records from the crashed .xlog to the new master, starting from the new master position:

$ tt play 127.0.0.1:3302 var/lib/instance001/00000000000000000000.xlog \
          --from 1000 \
          --replica 1 \
          --username admin --password secret
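
In the example above, the new master reports position 999 for the crashed master, so the replay starts from 1000. A minimal Lua sketch of that arithmetic follows. The function name is hypothetical, and treating a missing vclock component as 0 is an assumption made for illustration.

```lua
-- Hypothetical helper: computes the value to pass to `tt play --from`.
-- vclock:     a table shaped like box.info.vclock on the new master.
-- replica_id: numeric id of the crashed master in the _cluster space.
local function first_lsn_to_play(vclock, replica_id)
    -- Assumption: if the new master has seen nothing from this instance,
    -- treat the last applied LSN as 0.
    local last_applied = vclock[replica_id] or 0
    return last_applied + 1
end

-- The new master has applied LSN 999 from replica id 1.
print(first_lsn_to_play({[1] = 999, [2] = 120}, 1)) --> 1000
```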

Master-master

Configuration: master-master.

Problem: one master has crashed.

Actions:

Let the load be handled by another master alone.
Remove a crashed master from a replica set.
Set up a replacement for the crashed master on a spare host. Learn more from Adding and removing instances.

Master-replica/master-master: data loss

Configuration: master-replica or master-master.

Problem: Data was deleted at one master and this data loss was propagated to the other node (master or replica).

Actions:

Put all nodes in read-only mode. Depending on the replication.failover mode, this can be done as follows:
- manual: change a replica set leader to null.
- election: set replication.election_mode to voter or off at the replica set level.
- off: set database.mode to ro.
Reload configurations on all instances using the reload() function provided by the config module.
Turn off deletion of expired checkpoints with box.backup.start(). This prevents the Tarantool garbage collector from removing files made with older checkpoints until box.backup.stop() is called.
Get the latest valid .snap file and use tt cat command to calculate at which LSN the data loss occurred.
Start a new instance and use tt play command to play to it the contents of .snap and .xlog files up to the calculated LSN.
Bootstrap a new replica from the recovered master.

Note

The steps above are applicable only to data in the memtx storage engine.

Backups

Tarantool has an append-only storage architecture: it appends data to files but it never overwrites earlier data. The Tarantool garbage collector removes old files after a checkpoint. You can prevent or delay the garbage collector’s action by configuring the checkpoint daemon. Backups can be taken at any time, with minimal overhead on database performance.

Two functions are helpful for backups in certain situations:

box.backup.start() informs the server that activities related to the removal of outdated backups must be suspended and returns a table with the names of snapshot and vinyl files that should be copied.
box.backup.stop() later informs the server that normal operations may resume.

Hot backup (memtx)

This is a special case when there are only in-memory tables.

The last snapshot file is a backup of the entire database; and the WAL files that are made after the last snapshot are incremental backups. Therefore taking a backup is a matter of copying the snapshot and WAL files.

Use tar to make a (possibly compressed) copy of the latest .snap and .xlog files in the snapshot.dir and wal.dir directories.
If there is a security policy, encrypt the .tar file.
Copy the .tar file to a safe place.

Later, restoring the database is a matter of taking the .tar file and putting its contents back in the snapshot.dir and wal.dir directories.

Hot backup (vinyl/memtx)

Vinyl stores its files in vinyl_dir, and creates a folder for each database space. Dump and compaction processes are append-only and create new files. The Tarantool garbage collector may remove old files after each checkpoint.

To take a mixed backup:

Issue box.backup.start() on the administrative console. This will return a list of files to back up and suspend garbage collection for them till the next box.backup.stop().
Copy the files from the list to a safe location. This will include memtx snapshot files, vinyl run and index files, at a state consistent with the last checkpoint.
Issue box.backup.stop() so the garbage collector can continue as usual.
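
The three steps above can be sketched in Lua as follows. This is an illustration rather than a ready-made tool: take_backup and copy_file are hypothetical names, and the backup argument is assumed to behave like box.backup (start() returns a list of file paths, stop() resumes garbage collection). The wrapper makes sure stop() is called even if copying fails.

```lua
-- Sketch: copy every file reported by backup.start(), then always call
-- backup.stop() so that garbage collection is not left suspended.
-- backup:    an object that behaves like box.backup (assumption).
-- copy_file: hypothetical callback that copies one file to a safe location.
local function take_backup(backup, copy_file)
    local files = backup.start()
    local ok, err = pcall(function()
        for _, path in ipairs(files) do
            copy_file(path)
        end
    end)
    -- Resume normal garbage collection whether or not copying succeeded.
    backup.stop()
    if not ok then
        error(err, 0)
    end
    return #files
end
```

On a Tarantool console this could be called as take_backup(box.backup, my_copy), where my_copy is your own function that copies a file, for example with the fio module.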

Continuous remote backup (memtx)

The replication feature is useful for backup as well as for load balancing.

Therefore taking a backup is a matter of ensuring that any given replica is up to date, and doing a cold backup on it. Since all the other replicas continue to operate, this is not a cold backup from the end user’s point of view. This could be done on a regular basis, with a cron job or with a Tarantool fiber.

Continuous backup (memtx)

The logged changes done since the last cold backup must be secured, while the system is running.

For this purpose, you need a file copy utility that will do the copying remotely and continuously, copying only the parts of a write-ahead log file that are changing. One such utility is rsync.

Alternatively, you need an ordinary file copy utility, but there should be frequent production of new snapshot files or new WAL files as changes occur, so that only the new files need to be copied.

Upgrades

Important

This section contains instructions for upgrading Tarantool clusters to versions up to 2.11.x.

This section describes the general upgrade process for Tarantool. There are two main upgrade scenarios for different use cases:

Live upgrade (without downtime) for replication clusters.
Upgrade with downtime for standalone instances.

You can also downgrade to an earlier version using a similar procedure.

For information about backwards compatibility, see the compatibility guarantees description.

Upgrading from or to certain versions can involve specific steps or slightly differ from the general upgrade procedure. Such version-specific cases are described on the dedicated pages inside this section.

Standalone instance upgrade

This page describes the process of upgrading a standalone Tarantool instance in production. Note that this always implies a downtime because the application needs to be stopped and restarted on the target version.

To upgrade without downtime, you need multiple Tarantool servers running in a replication cluster. Find detailed instructions in Replication cluster upgrade.

Checking your application

Before upgrading, make sure your application is compatible with the target Tarantool version:

Set up a development environment with the target Tarantool version installed. See the installation instructions at the Tarantool download page and in the tt install reference.
Deploy the application in this environment and check how it works. In case of any issues, adjust the application code to ensure compatibility with the target version.

When your application is ready to run on the target Tarantool version, you can start upgrading the production environment.

Upgrading a standalone instance

Stop the Tarantool instance.
Make a copy of all data and the package from which the current (old) version was installed. You may need it for rollback purposes. Find the backup instruction in the appropriate hot backup procedure in Backups.
Install the target Tarantool version on the host. You can do this using a package manager or the tt utility. See the installation instructions at Tarantool download page and in the tt install reference. To check that the target Tarantool version is installed, run tarantool -v.
Start your application on the target version.
Run box.schema.upgrade(). This will update the Tarantool system spaces to match the currently installed version of Tarantool.

Note

To undo the schema upgrade in case of a failed upgrade, you can use box.schema.downgrade().

Rollback

The rollback procedure for a standalone instance is almost the same as the upgrade. The only difference is in the last step: you should call box.schema.downgrade() to return the schema to the original version.

Replication cluster upgrade

Below are the general instructions for upgrading a Tarantool cluster with replication. Upgrading from some versions can involve certain specifics. To find out if this applies to you, check the version-specific topics of the Upgrades section.

A replication cluster can be upgraded without downtime due to its redundancy. When you disconnect a single instance for an upgrade, there is always another instance that takes over its functionality: being a master storage for the same data buckets or working as a router. This way, you can upgrade all the instances one by one.

The high-level steps of cluster upgrade are the following:

Ensure the application compatibility with the target Tarantool version.
Check the cluster health.
Install the target Tarantool version on the cluster nodes.
Upgrade router nodes one by one.
Upgrade storage replica sets one by one.

Important

The only way to upgrade Tarantool from version 1.6, 1.7, or 1.9 to 2.x without downtime is to take an intermediate step by upgrading to 1.10 and then to 2.x.

Before upgrading Tarantool from 1.6 to 2.x, please read about the associated caveats.

Note

Some upgrade steps are moved to the separate section Procedures and checks to avoid overloading the general instruction with details. Typically, these are checks you should repeat during the upgrade to ensure it goes well.

If you experience issues during upgrade, you can roll back to the original version. The rollback instructions are provided in the Rollback section.

Checking your application

Before upgrading, make sure your application is compatible with the target Tarantool version:

Set up a development environment with the target Tarantool version installed. See the installation instructions at the Tarantool download page and in the tt install reference.
Deploy the application in this environment and check how it works. In case of any issues, adjust the application code to ensure compatibility with the target version.

When your application is ready to run on the target Tarantool version, you can start upgrading the production environment.

Pre-upgrade checks

Perform these steps before the upgrade to ensure that your cluster is working correctly:

On each router instance, perform the vshard.router check:

tarantool> vshard.router.info()
-- no issues in the output
-- sum of 'bucket.available_rw' == total number of buckets

On each storage instance, perform the replication check:

tarantool> box.info
-- box.info.status == 'running'
-- box.info.ro == false on one instance in each replica set.
-- box.info.replication[*].upstream.status == 'follow'
-- box.info.replication[*].downstream.status == 'follow'
-- box.info.replication[*].upstream.lag <= box.cfg.replication_timeout
-- can also be moderately larger under a write load

On each storage instance, perform the vshard.storage check:

tarantool> vshard.storage.info()
-- no issues in the output
-- replication.status == 'follow'

Check all instances’ logs for application errors.

Note

If you’re running Cartridge, you can check the health of the cluster instances on the Cluster tab of its web interface.

In case of any issues, make sure to fix them before starting the upgrade procedure.

Installing the target version

Install the target Tarantool version on all hosts of the cluster. You can do this using a package manager or the tt utility. See the installation instructions at the Tarantool download page and in the tt install reference.

Check that the target Tarantool version is installed by running tarantool -v on all hosts.

Upgrading a Tarantool cluster with no downtime

Upgrading routers

Upgrade router instances one by one:

Stop one router instance.
Start this instance on the target Tarantool version.
Repeat the previous steps for each router instance.

After completing the router instances upgrade, perform the vshard.router check on each of them.

Upgrading storages

Before upgrading storage instances:

Disable Cartridge failover: run
```
tt cartridge failover disable
```
or use the Cartridge web interface (Cluster tab, Failover: <Mode> button).

Disable the rebalancer: run

tarantool> vshard.storage.rebalancer_disable()

Make sure that the Cartridge upgrade_schema option is false.

Upgrade storage instances by performing the following steps for each replica set:

Note

To detect possible upgrade issues early, we recommend that you perform a replication check on all instances of the replica set after each step.

Pick a replica (a read-only instance) from the replica set. Stop this replica and start it again on the target Tarantool version. Wait until it reaches the running status (box.info.status == 'running').
Restart all other read-only instances of the replica set on the target version one by one.
Make one of the updated replicas the new master using the applicable instruction from Switching the master.
Restart the last instance of the replica set (the former master, now a replica) on the target version.

Run box.schema.upgrade() on the new master. This will update the Tarantool system spaces to match the currently installed version of Tarantool. The changes will be propagated to other nodes via the replication mechanism later.

Warning

This is the point of no return for upgrading from versions earlier than 2.8.2: once you complete it, the schema is no longer compatible with the initial version.

When upgrading from version 2.8.2 or newer, you can undo the schema upgrade using box.schema.downgrade().

Run box.snapshot() on every node in the replica set to make sure that the replicas immediately see the upgraded database state in case of restart.

Once you complete the steps, re-enable failover and the rebalancer:

Enable Cartridge failover: run
```
tt cartridge failover set [mode]
```
or use the Cartridge web interface (Cluster tab, Failover: Disabled button).

Enable the rebalancer: run

tarantool> vshard.storage.rebalancer_enable()

Post-upgrade checks

Perform these steps after the upgrade to ensure that your cluster is working correctly:

On each router instance, perform the vshard.router check:

tarantool> vshard.router.info()
-- no issues in the output
-- sum of 'bucket.available_rw' == total number of buckets

On each storage instance, perform the replication check:

tarantool> box.info
-- box.info.status == 'running'
-- box.info.ro == false on one instance in each replica set.
-- box.info.replication[*].upstream.status == 'follow'
-- box.info.replication[*].downstream.status == 'follow'
-- box.info.replication[*].upstream.lag <= box.cfg.replication_timeout
-- can also be moderately larger under a write load

On each storage instance, perform the vshard.storage check:

tarantool> vshard.storage.info()
-- no issues in the output
-- replication.status == 'follow'

Check all instances’ logs for application errors.

Note

If you’re running Cartridge, you can check the health of the cluster instances on the Cluster tab of its web interface.

Rollback

Rollback before the point of no return

If you decide to roll back before reaching the point of no return, your data is fully compatible with the version you had before the upgrade. In this case, you can roll back the same way: restart the nodes you’ve already upgraded on the original version.

Rollback after the point of no return

If you’ve passed the point of no return (that is, executed box.schema.upgrade()) during the upgrade, then a rollback requires downgrading the schema to the original version.

To check if an automatic downgrade is available for your original version, use box.schema.downgrade_versions(). If the version you need is on the list, execute the following steps on each upgraded replica set to roll back:

Run box.schema.downgrade(<version>) on master specifying the original version.
Run box.snapshot() on every instance in the replica set to make sure that the replicas immediately see the downgraded database state after restart.
Restart all read-only instances of the replica set on the initial version one by one.
Make one of the updated replicas the new master using the applicable instruction from Switching the master.
Restart the last instance of the replica set (the former master, now a replica) on the original version.

Then re-enable failover and the rebalancer as described in Upgrading storages.

Recovering from a failed upgrade

Warning

This section applies to cases when the upgrade procedure has failed and the cluster is not functioning properly anymore. Thus, it implies a downtime and a full cluster restart.

In case of an upgrade failure after passing the point of no return, follow these steps to roll back to the original version:

Stop all cluster instances.
Save snapshot and xlog files from all instances whose data was modified after the last backup procedure. These files will help apply these modifications later.
Save the latest backups from all instances.
Restore the original Tarantool version on all hosts of the cluster.
Launch the cluster on the original Tarantool version.

Note

At this point, the application becomes fully functional and contains data from the backups. However, the data modifications made after the backups were taken must be restored manually.

Manually apply the latest data modifications from the xlog files you saved in step 2 using the xlog module. On instances where such changes happened, do the following:
1. Find out the vclock value of the latest operation in the original WAL.
2. Play the operations from the newer xlog starting from this vclock on the instance.

Important

If the upgrade has failed after calling box.schema.upgrade(), don’t apply the modifications of system spaces done by this call. This can make the schema incompatible with the original Tarantool version.

Find more information about the Tarantool recovery in Disaster recovery.

Procedures and checks

Replication check

Run box.info:

tarantool> box.info

Check that the following conditions are satisfied:

box.info.status is running
box.info.replication[*].upstream.status and box.info.replication[*].downstream.status are follow
box.info.replication[*].upstream.lag is less than or equal to box.cfg.replication_timeout, but it can also be moderately larger under a write load.
box.info.ro is false at least on one instance in each replica set. If all instances have box.info.ro = true, this means there are no writable nodes. On Tarantool v. 2.10.0 or later, you can find out why this happened by running box.info.ro_reason. If box.info.ro_reason or box.info.status has the value orphan, the instance doesn’t see the rest of the replica set.

Then run box.info once more and check that box.info.replication[*].upstream.lag values are updated.
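
The per-instance conditions above can also be checked programmatically. The following Lua sketch takes a table shaped like box.info and a timeout value and returns a list of problems. The function name and message texts are hypothetical. It assumes that an instance's own entry in replication has neither upstream nor downstream, and it reports a lag above the timeout as a problem even though a moderately larger lag can be acceptable under a write load. The box.info.ro condition is not covered because it has to be evaluated across all instances of a replica set.

```lua
-- Sketch: collect problems found in a box.info-like table.
-- info:    a table shaped like box.info (assumption).
-- timeout: the value of box.cfg.replication_timeout.
local function replication_problems(info, timeout)
    local problems = {}
    if info.status ~= 'running' then
        table.insert(problems, 'status is ' .. tostring(info.status))
    end
    for id, peer in pairs(info.replication or {}) do
        -- The instance's own entry is assumed to have no upstream/downstream.
        local up, down = peer.upstream, peer.downstream
        if up and up.status ~= 'follow' then
            table.insert(problems, 'upstream ' .. id .. ' is ' .. tostring(up.status))
        end
        if down and down.status ~= 'follow' then
            table.insert(problems, 'downstream ' .. id .. ' is ' .. tostring(down.status))
        end
        if up and up.lag and up.lag > timeout then
            table.insert(problems, 'upstream ' .. id .. ' lag exceeds the timeout')
        end
    end
    return problems
end

-- Sample data for illustration only.
local sample = {
    status = 'running',
    replication = {
        [1] = { id = 1 },
        [2] = { id = 2,
                upstream = { status = 'follow', lag = 0.01 },
                downstream = { status = 'follow' } },
    },
}
print(#replication_problems(sample, 1)) --> 0
```

On a live instance, the call would be replication_problems(box.info, box.cfg.replication_timeout).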

vshard.storage check

Run vshard.storage.info():

tarantool> vshard.storage.info()

Check that the following conditions are satisfied:

there are no issues or alerts
replication.status is follow

vshard.router check

Run vshard.router.info():

tarantool> vshard.router.info()

Check that the following conditions are satisfied:

there are no issues or alerts
all buckets are available (the sum of bucket.available_rw on all replica sets equals the total number of buckets)
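
The bucket condition can be sketched in Lua as below. The function name is hypothetical, and the layout info.replicasets[<key>].bucket.available_rw is an assumption about the shape of the vshard.router.info() result. The total number of buckets is passed in explicitly.

```lua
-- Sketch: sum bucket.available_rw over all replica sets and compare the
-- result with the expected total number of buckets.
-- router_info:   a table shaped like the result of vshard.router.info()
--                (the field layout used here is an assumption).
-- total_buckets: the total number of buckets configured for the cluster.
local function all_buckets_available(router_info, total_buckets)
    local sum = 0
    for _, rs in pairs(router_info.replicasets or {}) do
        sum = sum + ((rs.bucket and rs.bucket.available_rw) or 0)
    end
    return sum == total_buckets, sum
end

-- Sample data for illustration only: two replica sets, 3000 buckets in total.
local sample = {
    replicasets = {
        rs1 = { bucket = { available_rw = 1500 } },
        rs2 = { bucket = { available_rw = 1500 } },
    },
}
print(all_buckets_available(sample, 3000)) --> true    3000
```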

Switching the master

Cartridge. If your cluster runs on Cartridge, you can switch the master in the web interface. To do this, go to the Cluster tab, click Edit replica set, and drag an instance to the top of Failover priority list to make it the master.
Raft. If your cluster uses automated leader election, switch the master by following these steps:
1. Pick a candidate – a read-only instance to become the new master.
2. Run box.ctl.promote() on the candidate. The operation will start and wait for the election to happen.
3. Run box.cfg{ election_mode = "voter" } on the current master.
4. Check that the candidate became the new master: its box.info.ro must be false.
Legacy. If your cluster neither works on Cartridge nor has automated leader election, switch the master by following these steps:
1. Pick a candidate – a read-only instance to become the new master.
2. Run box.cfg{ read_only = true } on the current master.
3. Check that the candidate’s vclock value matches the master’s: The value of box.info.vclock[<master_id>] on the candidate must be equal to box.info.lsn on the master. <master_id> here is the value of box.info.id on the master.
  
  If the vclock values don’t match, stop the switch procedure and restore the replica set state by calling box.cfg{ read_only = false } on the master. Then pick another candidate and restart the procedure.

After switching the master, perform the replication check on each instance of the replica set.
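
The vclock comparison from step 3 of the legacy procedure can be expressed as a small Lua predicate. This is a sketch with a hypothetical function name. The three arguments are values you collect yourself: box.info.vclock from the candidate, and box.info.id and box.info.lsn from the master.

```lua
-- Sketch: true if the candidate has applied everything the master wrote.
-- candidate_vclock: box.info.vclock taken on the candidate.
-- master_id:        box.info.id taken on the master.
-- master_lsn:       box.info.lsn taken on the master.
local function candidate_caught_up(candidate_vclock, master_id, master_lsn)
    return candidate_vclock[master_id] == master_lsn
end

print(candidate_caught_up({[1] = 999, [2] = 5}, 1, 999)) --> true
print(candidate_caught_up({[1] = 998, [2] = 5}, 1, 999)) --> false
```

If the predicate returns false, the candidate has not caught up and the switch should be stopped as described in step 3.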

Live upgrade from Tarantool 1.6 to 1.10

This page includes explanations and solutions to some common issues when upgrading a replica set from Tarantool 1.6 to 1.10.

Versions later than 1.6 have incompatible .snap and .xlog file formats: 1.6 files are supported during upgrade, but you won’t be able to return to 1.6 after running under 1.10 or 2.x for a while. A few configuration parameters are also renamed.

To perform a live upgrade from Tarantool 1.6 to a more recent version, like 2.8.4, 2.10.1 and such, it is necessary to take an intermediate step by upgrading 1.6 -> 1.10 -> 2.x. This is the only way to perform the upgrade without downtime.

However, a direct upgrade of a replica set from 1.6 to 2.x is also possible, but only with downtime.

The procedure of live upgrade from 1.6 to 1.10 is similar to the general cluster upgrade procedure, but with slight differences in the Upgrading storages step. Find below the general storage upgrade procedure and the 1.6-specific notes for its steps.

General storage upgrade

Upgrade storage instances by performing the following steps for each replica set:

Note

To detect possible upgrade issues early, we recommend that you perform a replication check on all instances of the replica set after each step.

Pick a replica (a read-only instance) from the replica set. Stop this replica and start it again on the target Tarantool version. Wait until it reaches the running status (box.info.status == 'running').
Restart all other read-only instances of the replica set on the target version one by one.
Make one of the updated replicas the new master using the applicable instruction from Switching the master.
Restart the last instance of the replica set (the former master, now a replica) on the target version.

Run box.schema.upgrade() on the new master. This will update the Tarantool system spaces to match the currently installed version of Tarantool. The changes will be propagated to other nodes via the replication mechanism later.
Run box.snapshot() on every node in the replica set to make sure that the replicas immediately see the upgraded database state in case of restart.

1.6 storage upgrade specifics

Replication check: New Tarantool nodes follow 1.6 nodes just fine, but some 1.6 nodes might disconnect from new nodes with an ER_LOADING error. This is not critical, the error goes away when replication on 1.6 is restarted:
```
old_repl = box.cfg.replication
box.cfg{replication = ""}
box.cfg{replication = old_repl}
```
Point of no return: When upgrading from Tarantool 1.6, step 3 (switching the master) is the point of no return. Once you complete it, the schema is no longer compatible with the initial version.
Restarting on the target version (steps 1, 2, and 4): Tarantool 1.10+ fails to recover from 1.6 xlogs, unless box.cfg{force_recovery = true} is set. There is a slight difference between 1.6 and 1.10 xlogs, which makes 1.6 xlogs appear erroneous to 1.10+ instances. In order to work around this, start the instance in force_recovery mode. To do so, add the line force_recovery = true to the file where the instance is initialized – for example, to init.lua.
Running box.schema.upgrade() (step 5): There was a breaking change between 1.6 and 1.10 – in 1.6, the field type num was an alias to number, and in 1.10, num is converted to unsigned. This means that after box.schema.upgrade() is performed on the master, the user might have some spaces with unsigned fields containing non-unsigned values: double, int, and so on. This will make the snapshot inconsistent, unless an extra action is performed after box.schema.upgrade(). Run this code in the Tarantool console on the new master:
```
-- First find all spaces containing unsigned fields with non-unsigned values in them.
-- Say, we have one such space denoted problematic_space and the problem is in field problematic_field_no.
a = box.space.problematic_space:format()
a[problematic_field_no].type = 'number'
box.space.problematic_space:format(a)
```
Taking snapshots (step 6): The user might be concerned with snapshot size in 1.10: it’s drastically smaller than the one created by 1.6 (for example, ~300 MB vs. 6 GB in some corner cases). There is nothing to worry about. Tarantool 1.6 didn’t compress snapshots, while Tarantool 1.10 and above do.

Upgrade from 1.6 directly to 2.x with downtime

A direct upgrade of a replica set from 1.6 to 2.x is also possible, but only with downtime.

Here is how to upgrade from Tarantool 1.6 directly to 2.x:

Stop all instances in the replica set.
Upgrade Tarantool version to 2.x on every instance.
Upgrade the corresponding instance files and applications, if needed.
Start all the instances with Tarantool 2.x.
Execute box.schema.upgrade() on the master.
Execute box.snapshot() on every node in the replica set.
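
The last two steps can be sketched as a small helper to run in the console of each instance. The function name finish_upgrade is hypothetical, and the sketch assumes that the master is the only writable instance, that is, the only one where box.info.ro is false:

```lua
-- Run on every instance after it has started on Tarantool 2.x.
-- Assumption: the master is the only instance with box.info.ro == false.
local function finish_upgrade()
    local is_master = not box.info.ro
    if is_master then
        -- Step 5: upgrade the schema on the master only.
        box.schema.upgrade()
    end
    -- Step 6: take a snapshot on every node.
    box.snapshot()
    return is_master
end
```

Call finish_upgrade() on the master first, so that the replicas receive the upgraded schema through replication before they take their own snapshots.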

Fix decimal values in vinyl spaces when upgrading to 2.10.1

This is an upgrade guide for fixing one specific problem which could happen with decimal values in vinyl spaces. It’s only relevant when you’re upgrading from Tarantool version <= 2.10.0 to anything >= 2.10.1.

Before gh-6377 was fixed, decimal and double values in a scalar or number index could end up in the wrong order after the update. If such an index has been built for a space that uses the vinyl storage engine, the index is persisted and is not rebuilt even after the upgrade. If this is the case, the user has to rebuild the affected indexes manually.

Here are the rules to determine whether your installation was affected. If all of the statements listed below are true, you have to rebuild indexes for the affected vinyl spaces manually.

You were running Tarantool version 2.10.0 or earlier.
You have spaces with the vinyl storage engine.
The vinyl spaces have number or scalar indexes.
The tuples in these spaces may contain both decimal and double Inf or NaN values.

If this is the case for you, you can run the following script, which will find all the affected indices:

local fiber = require('fiber')
local decimal = require('decimal')

local function isnan(val)
    return type(val) == 'number' and val ~= val
end

local function isinf(val)
    return val == math.huge or val == -math.huge
end

local function vinyl(id)
    return box.space[id].engine == 'vinyl'
end

require_rebuild = {}
local iters = 0
for _, v in box.space._index:pairs({512, 0}, {iterator='GE'}) do
    local id = v[1]
    iters = iters + 1
    if iters % 1000 == 0 then
        fiber.yield()
    end
    if vinyl(id) then
        -- v[6] holds the index parts as {field_no (0-based), type} pairs.
        local parts = v[6]
        local check_fields = {}
        for _, fmt in pairs(parts) do
            if fmt[2] == 'number' or fmt[2] == 'scalar' then
                table.insert(check_fields,  fmt[1] + 1)
            end
        end
        local have_decimal = {}
        local have_nan = {}
        if #check_fields > 0 then
            for _, tuple in box.space[id]:pairs() do
                for _, i in pairs(check_fields) do
                    iters = iters + 1
                    if iters % 1000 == 0 then
                        fiber.yield()
                    end
                    have_decimal[i] = have_decimal[i] or
                                    decimal.is_decimal(tuple[i])
                    have_nan[i] = have_nan[i] or isnan(tuple[i]) or
                                isinf(tuple[i])
                    if have_decimal[i] and have_nan[i] then
                        table.insert(require_rebuild, v)
                        goto out
                    end
                end
            end
        end
    end
    ::out::
end

The indices requiring a rebuild will be stored in the require_rebuild table. If the table is empty, you’re safe and can continue using Tarantool as before.

If the require_rebuild table contains some entries, you can rebuild the affected indices with the following script.

Note

Please run the script below only on the master node and only after all the nodes are upgraded to the new Tarantool version.

local log = require('log')

local function rebuild_index(idx)
    local index_name = idx[3]
    local space_name = box.space[idx[1]].name
    log.info("Rebuilding index %s on space %s", index_name, space_name)
    if (idx[2] == 0) then
        log.error("Cannot rebuild primary index %s on space %s. Please, "..
                "recreate the space manually", index_name, space_name)
        return
    end
    log.info("Deleting index %s on space %s", index_name, space_name)
    local v = box.space._index:delete{idx[1], idx[2]}
    if v == nil then
        log.error("Couldn't find index %s on space %s", index_name, space_name)
        return
    end
    log.info("Done")
    log.info("Creating index %s on space %s", index_name, space_name)
    box.space._index:insert(v)
end

for _, idx in pairs(require_rebuild) do
    rebuild_index(idx)
end

The script might fail on some of the indices with the following error: “Cannot rebuild primary index index_name on space space_name. Please, recreate the space manually”. If this happens, automatic index rebuild is impossible, and you have to manually re-create the space to ensure data integrity:

Create a new space with the same format as the existing one.
Define the same indices on the freshly created space.
Iterate over the old space’s primary key and insert all the data into the new space.
Drop the old space.
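
The four steps above can be sketched as follows. This is an illustration under several assumptions rather than a complete migration tool: the function name recreate_space and the temporary space name are hypothetical, only the engine and the format of the old space are carried over, and index definitions are copied through the _index system space in the same way as the rebuild script above does:

```lua
-- Sketch: re-create space `old_name` and copy its data.
-- Run on the master while no other writes go to the space.
local function recreate_space(old_name, tmp_name)
    local old = box.space[old_name]
    -- 1. Create a new space with the same engine and format.
    local new = box.schema.space.create(tmp_name, {
        engine = old.engine,
        format = old:format(),
    })
    -- 2. Define the same indexes (primary key first, ordered by index id).
    for _, def in box.space._index:pairs({old.id}) do
        box.space._index:insert(def:update({{'=', 1, new.id}}))
    end
    -- 3. Iterate over the old primary key and copy every tuple.
    local copied = 0
    for _, tuple in old:pairs() do
        new:insert(tuple)
        copied = copied + 1
    end
    -- 4. Drop the old space and give its name to the new one.
    old:drop()
    new:rename(old_name)
    return copied
end
```

For example, recreate_space('orders', 'orders_rebuilt') returns the number of copied tuples; both space names here are made up. Grants, triggers, sequences, and other space options are not transferred by this sketch and have to be restored separately.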

Fix illegal type names when upgrading to 2.10.4

This is an upgrade guide for fixing one specific problem which could happen with field type names. It’s only relevant when you’re upgrading from a Tarantool version <=2.10.3 to >=2.10.4.

Before gh-5940 was fixed, the empty string, n, nu, s, and st (that is, leading parts of num and str) were accepted as valid field types. Since 2.10.4, Tarantool doesn’t accept these strings and they must be replaced with correct values num and str.

This instruction is also available on GitHub.

Check if your snapshots contain illegal type names

A snapshot can be validated against the issue using the following script:

#!/usr/bin/env tarantool

local xlog = require('xlog')
local json = require('json')

if arg[1] == nil then
    print(('Usage: %s xxxxxxxxxxxxxxxxxxxx.snap'):format(arg[0]))
    os.exit(1)
end

local illegal_types = {
    [''] = true,
    ['n'] = true,
    ['nu'] = true,
    ['s'] = true,
    ['st'] = true,
}

local function report_field_def(name, field_def)
    local msg = 'A field def in a _space entry %q contains an illegal type: %s'
    print(msg:format(name, json.encode(field_def)))
end

local has_broken_format = false

for _, record in xlog.pairs(arg[1]) do
    -- Filter inserts.
    if record.HEADER == nil or record.HEADER.type ~= 'INSERT' then
        goto continue
    end
    -- Filter _space records.
    if record.BODY == nil or record.BODY.space_id ~= 280 then
        goto continue
    end

    local tuple = record.BODY.tuple
    local name = tuple[3]
    local format = tuple[7]

    local is_format_broken = false
    for _, field_def in ipairs(format) do
        if illegal_types[field_def.type] ~= nil then
            report_field_def(name, field_def)
            is_format_broken = true
        end

        if illegal_types[field_def[2]] ~= nil then
            report_field_def(name, field_def)
            is_format_broken = true
        end

    end

    if is_format_broken then
        has_broken_format = true
        local msg = 'The following _space entry contains illegal type(s): %s'
        print(msg:format(json.encode(record)))
    end
    ::continue::
end

if has_broken_format then
    print('')
    print(('%s has an illegal type in a space format'):format(arg[1]))
    print('It is recommended to proceed with the upgrade instruction:')
    print('https://github.com/tarantool/tarantool/wiki/Fix-illegal-field-type-in-a-space-format-when-upgrading-to-2.10.4')
else
    print('Everything looks nice!')
end

os.exit(has_broken_format and 2 or 0)

If the snapshot contains values that aren’t valid in 2.10.4, the script prints each affected field definition and _space entry and exits with code 2.

Fix an application file

To fix the application file that contains illegal type names, add the following code in it before the box.cfg()/vshard.cfg()/cartridge.cfg() call.

Note

In Cartridge applications, the instance file is called init.lua.

-- Convert illegal type names in a space format that were
-- allowed before tarantool 2.10.4.

local log = require('log')
local json = require('json')

local transforms = {
    [''] = 'num',
    ['n'] = 'num',
    ['nu'] = 'num',
    ['s'] = 'str',
    ['st'] = 'str',
}

-- The helper for before_replace().
local function transform_field_def(name, field_def, field, new_type)
    local field_def_old_str = json.encode(field_def)
    field_def[field] = new_type
    local field_def_new_str = json.encode(field_def)

    local msg = 'Transform a field def in a _space entry %q: %s -> %s'
    log.info(msg:format(name, field_def_old_str, field_def_new_str))
end

-- _space trigger.
local function before_replace(_, tuple)
    if tuple == nil then return tuple end

    local name = tuple[3]
    local format = tuple[7]

    -- Update format if necessary.
    local is_format_changed = false
    for i, field_def in ipairs(format) do
        local new_type = transforms[field_def.type]
        if new_type ~= nil then
            transform_field_def(name, field_def, 'type', new_type)
            is_format_changed = true
        end

        local new_type = transforms[field_def[2]]
        if new_type ~= nil then
            transform_field_def(name, field_def, 2, new_type)
            is_format_changed = true
        end
    end

    -- Not changed: skip.
    if not is_format_changed then return tuple end

    -- Rebuild the tuple.
    local new_tuple = tuple:transform(7, 1, format)
    log.info(('Transformed _space entry %s to %s'):format(
        json.encode(tuple), json.encode(new_tuple)))
    return new_tuple
end

-- on_schema_init trigger to set before_replace().
local function on_schema_init()
    box.space._space:before_replace(before_replace)
end

-- Set the trigger on _space.
box.ctl.on_schema_init(on_schema_init)

You can delete these triggers after the box.cfg()/vshard.cfg()/cartridge.cfg() call.

An example for a Cartridge application:

The triggers report the changes they make in the log at the info level: one message for each transformed field definition and one for each rewritten _space entry.

Recover from WALs with mixed transactions when upgrading to 2.11.0

This is a guide on fixing a specific problem that could happen when upgrading from a Tarantool version between 2.1.2 and 2.2.0 to 2.8.1 or later. The described solution is applicable since version 2.11.0.

The problem is described in the issue gh-7932. If two or more transactions happened simultaneously in Tarantool 2.1.2-2.2.0, their operations could be written to the write-ahead log mixed with each other. Starting from version 2.8.1, Tarantool recovers transactions atomically and expects all WAL entries between a transaction’s begin and commit operations to belong to one transaction. If there is an operation belonging to another transaction, Tarantool fails to recover from such a WAL.

Starting from version 2.11.0, Tarantool can recover from WALs with mixed transactions in the force_recovery mode.

Instances fail to start

If all instances or some of them fail to start after upgrading to 2.11 or a newer version due to a recovery error:

Start these instances with the force_recovery option set to true.
Make new snapshots on the instances so that the old WALs with mixed transactions aren’t used for recovery anymore. To do this, call box.snapshot().
Set force_recovery back to false.
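
For an instance that is configured through box.cfg() in a Lua instance file, the three steps can be sketched as shown below. The function name snapshot_with_force_recovery is hypothetical, and the sketch assumes that force_recovery can be changed at runtime on your version:

```lua
-- Sketch for an instance file; `cfg` is the usual box.cfg() options table.
local function snapshot_with_force_recovery(cfg)
    -- Copy the options so that the caller's table is left untouched.
    local startup_cfg = {}
    for k, v in pairs(cfg or {}) do
        startup_cfg[k] = v
    end
    startup_cfg.force_recovery = true
    box.cfg(startup_cfg)             -- step 1: recover in force_recovery mode
    box.snapshot()                   -- step 2: old WALs are no longer needed
    box.cfg{force_recovery = false}  -- step 3: return to strict recovery
end
```

If the option cannot be changed at runtime in your setup, remove it from the instance file after the snapshot is taken and restart the instance instead.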

Replication doesn’t work

After all the instances start successfully, WALs with mixed transactions may still lead to replication issues. Some instances may fail to replicate from other instances because they are sending incorrect WALs. To fix the replication issues, rebootstrap the instances that fail to replicate.

Bug reports

If you found a bug in Tarantool, you’re doing us a favor by taking the time to tell us about it.

Please create an issue in the Tarantool repository on GitHub. We encourage you to include the following information:

Steps needed to reproduce the bug, and an explanation why this differs from the expected behavior according to our manual. Please provide specific unique information. For example, instead of “I can’t get certain information”, say “box.space.x:delete() didn’t report what was deleted”.
Your operating system name and version, the Tarantool name and version, and any unusual details about your machine and its configuration.
Related files like a stack trace or a Tarantool log file.

If this is a feature request or if it affects a special category of users, be sure to mention that.

Usually within one or two workdays a Tarantool team member will write an acknowledgment, or some questions, or suggestions for a workaround.

Flight recorder

Enterprise Edition

The flight recorder is available in the Enterprise Edition only.

Example on GitHub: flightrec

The flight recorder is an event collection tool that gathers various information about a working Tarantool instance, such as:

logs
metrics
requests and responses

This information helps you investigate incidents in which a Tarantool instance crashes.

Enable the flight recorder

The flight recorder is disabled by default and can be enabled and configured for a specific Tarantool instance. To enable the flight recorder, set the flightrec.enabled configuration option to true.

flightrec:
  enabled: true

After flightrec.enabled is set to true, the flight recorder starts collecting data in the flight recording file current.ttfr. This file is stored in the snapshot.dir directory. By default, the file path is var/lib/{{ instance_name }}/<file_name>.ttfr.

If the instance crashes and reboots, Tarantool rotates the flight recording: current.ttfr is renamed to <timestamp>.ttfr (for example, 20230411T050721.ttfr) and a new current.ttfr file is created for collecting data. If the instance was shut down correctly (for example, using os.exit()), Tarantool continues writing to the existing current.ttfr file after the restart.

Note

Old flight recordings are not deleted automatically. Remove them manually.

Configure the flight recorder

When the flight recorder is enabled, you can set the options related to logging, metrics, and storing the request and response data.

The flightrec configuration might look as follows:

flightrec:
  enabled: true
  logs_size: 10485800
  logs_log_level: 5
  metrics_period: 240
  metrics_interval: 0.5
  requests_size: 10485780

In the example, the following options are set:

flightrec.logs_size – the size of the log storage in bytes.
flightrec.logs_log_level – the level of log messages to record; it uses the same values as the log_level option (5 corresponds to INFO).
flightrec.metrics_period – the number of seconds to store metrics after the dump.
flightrec.metrics_interval – the interval between metrics dumps in seconds.
flightrec.requests_size – the size of the storage for the request and response data in bytes.

Monitoring

Monitoring is the process of capturing runtime information about the instances of a Tarantool cluster using metrics. Metrics can indicate various characteristics, such as memory usage, the number of records in spaces, replication status, and so on. Typically, metrics are monitored in real time, allowing for the identification of current issues or the prediction of potential ones.

Getting started with monitoring

Example on GitHub: sharded_cluster_crud_metrics

Tarantool allows you to configure and expose its metrics using a YAML configuration. You can also use the built-in metrics module to create and collect custom metrics.

Configuring metrics

To configure metrics, use the metrics section in a cluster configuration. The configuration below enables all metrics excluding vinyl-specific ones:

metrics:
  include: [ all ]
  exclude: [ vinyl ]
  labels:
    alias: '{{ instance_name }}'

The metrics.labels option accepts the predefined {{ instance_name }} variable. This adds an instance name as a label to every observation.

Third-party Lua modules, like crud or expirationd, offer their own metrics. You can enable these metrics by configuring the corresponding role. The example below shows how to enable statistics on called operations by providing the roles.crud-router role’s configuration:

roles:
- roles.crud-router
- roles.metrics-export
roles_cfg:
  roles.crud-router:
    stats: true
    stats_driver: metrics
    stats_quantiles: true

expirationd metrics can be enabled as follows:

expirationd:
  cfg:
    metrics: true

Exposing metrics

To expose metrics in different formats, you can use a third-party metrics-export-role role. In the following example, the metrics of storage-a-001 are provided on two endpoints:

/metrics/prometheus: exposes metrics in the Prometheus format.
/metrics/json: exposes metrics in the JSON format.

storage-a-001:
  roles_cfg:
    roles.metrics-export:
      http:
      - listen: '127.0.0.1:8082'
        endpoints:
        - path: /metrics/prometheus/
          format: prometheus
        - path: /metrics/json
          format: json

Example on GitHub: sharded_cluster_crud_metrics

Note

The metrics module provides a set of plugins that can be used to collect and expose metrics in different formats. Learn more in Collecting metrics using plugins.

Creating custom metrics

The metrics module allows you to create and collect custom metrics. The example below shows how to collect the number of data operations performed on the specified space by increasing a counter value inside the on_replace() trigger function:

local metrics = require('metrics')
local bands_replace_count = metrics.counter('bands_replace_count', 'The number of data operations')
local trigger = require('trigger')
trigger.set(
        'box.space.bands.on_replace',
        'update_bands_replace_count_metric',
        function(_, _, _, request_type)
            bands_replace_count:inc(1, { request_type = request_type })
        end
)

Learn more in Custom metrics.

Collecting metrics

When metrics are configured and exposed, you can use the desired third-party tool to collect them. Below is an example of a Prometheus scrape configuration that collects metrics of multiple Tarantool instances:

global:
  scrape_interval:     5s
  evaluation_interval: 5s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
          - 127.0.0.1:8081
          - 127.0.0.1:8082
          - 127.0.0.1:8083
          - 127.0.0.1:8084
          - 127.0.0.1:8085
    metrics_path: "/metrics/prometheus"

For more information on collecting and visualizing metrics, refer to Grafana dashboard.

Note

Tarantool Cluster Manager allows you to view metrics of connected clusters in real time. Learn more in Viewing cluster metrics.

Grafana dashboard

After enabling and configuring metrics, you can visualize them using Tarantool Grafana dashboards. These dashboards are available among the official and community-built Grafana dashboards:

Tarantool 3: Prometheus, InfluxDB
Tarantool Cartridge and Tarantool 1.10–2.x: Prometheus, InfluxDB
Tarantool Data Grid 2: Prometheus, InfluxDB

The Tarantool Grafana dashboard is a ready-to-import template with basic memory, space operations, and HTTP load panels, based on the default functionality of the metrics package.

Prepare a monitoring stack

Because dashboards are available for both the Prometheus and InfluxDB data sources, you can use one of the following stacks:

Telegraf as a server agent for collecting metrics, InfluxDB as a time series database for storing metrics, and Grafana as a visualization platform.
Prometheus as both a server agent for collecting metrics and a time series database for storing metrics, and Grafana as a visualization platform.

For issues related to setting up Prometheus, Telegraf, InfluxDB, or Grafana instances, refer to the corresponding project’s documentation.

Collect metrics with server agents

Prometheus

To collect metrics for Prometheus, first set up metrics output in the prometheus format. You can use the roles.metrics-export configuration or set up the Prometheus plugin manually. To start collecting metrics, add a job to the Prometheus configuration with each Tarantool instance URI as a target and the metrics path as it is configured on the Tarantool instances:

global:
  scrape_interval:     5s
  evaluation_interval: 5s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
          - 127.0.0.1:8081
          - 127.0.0.1:8082
          - 127.0.0.1:8083
          - 127.0.0.1:8084
          - 127.0.0.1:8085
    metrics_path: "/metrics/prometheus"

InfluxDB

To collect metrics for InfluxDB, use the Telegraf agent. First, configure Tarantool metrics output in the json format using the roles.metrics-export configuration or the corresponding JSON plugin. To start collecting metrics, add an http input to the Telegraf configuration that includes the metrics URL of each Tarantool instance:

[[inputs.http]]
    urls = [
        "http://example_project:8081/metrics/json",
        "http://example_project:8082/metrics/json",
        "http://example_project:8083/metrics/json",
        "http://example_project:8084/metrics/json",
        "http://example_project:8085/metrics/json"
    ]
    timeout = "30s"
    tag_keys = [
        "metric_name",
        "label_pairs_alias",
        "label_pairs_quantile",
        "label_pairs_path",
        "label_pairs_method",
        "label_pairs_status",
        "label_pairs_operation",
        "label_pairs_level",
        "label_pairs_id",
        "label_pairs_engine",
        "label_pairs_name",
        "label_pairs_index_name",
        "label_pairs_delta",
        "label_pairs_stream",
        "label_pairs_thread",
        "label_pairs_kind"
    ]
    insecure_skip_verify = true
    interval = "10s"
    data_format = "json"
    name_prefix = "tarantool_"
    fieldpass = ["value"]

Be sure to include each label key as label_pairs_<key> so that the plugin can extract it. For example, if you use { state = 'ready' } labels somewhere in metric collectors, add the label_pairs_state tag key.
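
To keep the tag_keys list in sync with the labels used in your collectors, you can generate it. The following is a minimal sketch in plain Lua; the function name telegraf_tag_keys is hypothetical, and the label names passed to it are examples only:

```lua
-- Build the Telegraf tag_keys entries for the given metric label names.
local function telegraf_tag_keys(label_names)
    local sorted = {}
    for _, name in ipairs(label_names) do
        table.insert(sorted, name)
    end
    table.sort(sorted)
    -- metric_name is always present in the JSON output shown above.
    local keys = {'metric_name'}
    for _, name in ipairs(sorted) do
        table.insert(keys, 'label_pairs_' .. name)
    end
    return keys
end

-- Print the entries in the form expected inside tag_keys = [ ... ].
for _, key in ipairs(telegraf_tag_keys({'state', 'alias'})) do
    print(('"%s",'):format(key))
end
```

The printed lines can be pasted into the tag_keys array of the Telegraf http input.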

Import the dashboard

Open Grafana import menu.

To import a specific dashboard, choose one of the following options:

paste the dashboard id (21474 for Prometheus dashboard, 21484 for InfluxDB dashboard)
paste a link to the dashboard (https://grafana.com/grafana/dashboards/21474 for Prometheus dashboard, https://grafana.com/grafana/dashboards/21484 for InfluxDB dashboard)
paste the dashboard JSON file contents
upload the dashboard JSON file

Set dashboard name, folder and uid (if needed).

You can choose the data source and data source variables after import.

Troubleshooting

If there is no data on the graphs, make sure that you picked the data source and job/measurement correctly.
If there is no data on the graphs, make sure that the info group of Tarantool metrics is enabled (in particular, tnt_info_uptime).
If some Prometheus graphs show no data because of parse error: missing unit character in duration, ensure that you use Grafana 7.2 or newer.
If some Prometheus graphs display parse error: bad duration syntax "1m0" or a similar error, you need to update your Prometheus version. See grafana/grafana#44542 for more details.

Alerting

You can set up alerts on metrics to get a notification when something goes wrong. We will use Prometheus alert rules as an example here. You can get the full alerts.yml file in the tarantool/grafana-dashboard GitHub repository.

Tarantool metrics

You can use internal Tarantool metrics to monitor detailed RAM consumption, replication state, database engine status, track business logic issues (like HTTP 4xx and 5xx responses or low request rate) and external modules statistics (like CRUD errors). Evaluation timeouts, severity levels and thresholds (especially ones for business logic) are placed here for the sake of example: you may want to increase or decrease them for your application. Also, don’t forget to set sane rate time ranges based on your Prometheus configuration.

Lua memory

Monitoring the tnt_info_memory_lua metric helps prevent memory overflow and detect bad Lua coding practices.

Note

The Lua memory is limited to 2 GB per instance if Tarantool doesn’t have the GC64 mode enabled for LuaJIT.

- alert: HighLuaMemoryWarning
  expr: tnt_info_memory_lua >= (512 * 1024 * 1024)
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') Lua runtime warning"
    description: "'{{ $labels.alias }}' instance of job '{{ $labels.job }}' uses too much Lua memory
      and may hit threshold soon."

- alert: HighLuaMemoryAlert
  expr: tnt_info_memory_lua >= (1024 * 1024 * 1024)
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') Lua runtime alert"
    description: "'{{ $labels.alias }}' instance of job '{{ $labels.job }}' uses too much Lua memory
      and likely to hit threshold soon."
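
The same thresholds can be checked locally, for example, from an admin console. The sketch below is plain Lua; the function name lua_memory_severity is hypothetical, and it assumes that the metric corresponds to the lua field of box.info.memory():

```lua
local MB = 1024 * 1024

-- Map Lua memory usage in bytes to the severity used in the rules above.
local function lua_memory_severity(bytes)
    if bytes >= 1024 * MB then
        return 'page'
    end
    if bytes >= 512 * MB then
        return 'warning'
    end
    return nil
end

-- Inside Tarantool (assumption: box.info.memory().lua is the source value):
-- lua_memory_severity(box.info.memory().lua)
```

A nil result means that neither rule would fire for the given value.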

Memtx arena memory

By monitoring slab allocation statistics, you can see how much free RAM remains to store memtx tuples and indexes for an instance. If Tarantool hits the limit, the instance becomes unavailable for write operations, so this alert may help you see when it’s time to increase your memtx_memory limit or to add a new storage to a vshard cluster.

- alert: LowMemtxArenaRemainingWarning
  expr: (tnt_slab_quota_used_ratio >= 80) and (tnt_slab_arena_used_ratio >= 80)
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') low arena memory remaining"
    description: "Low arena memory (tuples and indexes) remaining for '{{ $labels.alias }}' instance of job '{{ $labels.job }}'.
      Consider increasing memtx_memory or number of storages in case of sharded data."

- alert: LowMemtxArenaRemaining
  expr: (tnt_slab_quota_used_ratio >= 90) and (tnt_slab_arena_used_ratio >= 90)
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') low arena memory remaining"
    description: "Low arena memory (tuples and indexes) remaining for '{{ $labels.alias }}' instance of job '{{ $labels.job }}'.
      You are likely to hit limit soon.
      It is strongly recommended to increase memtx_memory or number of storages in case of sharded data."

- alert: LowMemtxItemsRemainingWarning
  expr: (tnt_slab_quota_used_ratio >= 80) and (tnt_slab_items_used_ratio >= 80)
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') low items memory remaining"
    description: "Low items memory (tuples) remaining for '{{ $labels.alias }}' instance of job '{{ $labels.job }}'.
      Consider increasing memtx_memory or number of storages in case of sharded data."

- alert: LowMemtxItemsRemaining
  expr: (tnt_slab_quota_used_ratio >= 90) and (tnt_slab_items_used_ratio >= 90)
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') low items memory remaining"
    description: "Low items memory (tuples) remaining for '{{ $labels.alias }}' instance of job '{{ $labels.job }}'.
      You are likely to hit limit soon.
      It is strongly recommended to increase memtx_memory or number of storages in case of sharded data."
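
Each rule above fires only when the quota ratio and either the arena or the items ratio exceed the threshold at the same time. The sketch below reproduces this logic in plain Lua. The function names are hypothetical, and the sketch assumes that the ratios come from box.slab.info() as strings with a trailing percent sign:

```lua
-- Parse a ratio such as '87.50%' (or a plain number) into a number.
local function parse_ratio(value)
    return tonumber(tostring(value):match('[%d%.]+'))
end

-- Map slab statistics to the severity used in the rules above.
local function memtx_severity(info)
    local quota = parse_ratio(info.quota_used_ratio)
    local arena = parse_ratio(info.arena_used_ratio)
    local items = parse_ratio(info.items_used_ratio)
    local function exceeded(threshold)
        return quota >= threshold and
            (arena >= threshold or items >= threshold)
    end
    if exceeded(90) then
        return 'page'
    end
    if exceeded(80) then
        return 'warning'
    end
    return nil
end

-- Inside Tarantool: memtx_severity(box.slab.info())
```

A nil result means that none of the four rules would fire.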

Vinyl engine status

You can monitor vinyl regulator performance to track possible scheduler or disk issues.

- alert: LowVinylRegulatorRateLimit
  expr: tnt_vinyl_regulator_rate_limit < 100000
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') have low vinyl regulator rate limit"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' have low vinyl engine regulator rate limit.
      This indicates issues with the disk or the scheduler."

Vinyl transaction errors are likely to lead to user request errors.

- alert: HighVinylTxConflictRate
  expr: rate(tnt_vinyl_tx_conflict[5m]) / rate(tnt_vinyl_tx_commit[5m]) > 0.05
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') have high vinyl tx conflict rate"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' have
      high vinyl transactions conflict rate. It indicates that vinyl is not healthy."

Failed vinyl scheduler tasks are a good signal of disk issues and may be the reason for increasing RAM consumption.

- alert: HighVinylSchedulerFailedTasksRate
  expr: rate(tnt_vinyl_scheduler_tasks{status="failed"}[5m]) > 0.1
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') have high vinyl scheduler failed tasks rate"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' have
      high vinyl scheduler failed tasks rate."

Replication state

If tnt_replication_status is equal to 0, the instance replication status is not "follow": replication is either not ready yet or has been stopped for some reason.

- alert: ReplicationNotRunning
  expr: tnt_replication_status == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') {{ $labels.stream }} (id {{ $labels.id }})
      replication is not running"
    description: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') {{ $labels.stream }} (id {{ $labels.id }})
      replication is not running."

Even if asynchronous replication is in the "follow" state, it can be considered malfunctioning if the lag is too high. High lag may also affect the Tarantool garbage collector, see box.info.gc().

- alert: HighReplicationLag
  expr: tnt_replication_lag > 1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') have high replication lag (id {{ $labels.id }})"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' have high replication lag
      (id {{ $labels.id }}), check up your network and cluster state."
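
Both replication rules can be approximated locally by inspecting box.info.replication. The sketch below is plain Lua that takes this table as an argument. The function name replication_problems is hypothetical, only upstream connections are checked, and the healthy upstream status string is assumed to be 'follow':

```lua
-- List peers whose upstream is not healthy or lags more than `max_lag` seconds.
local function replication_problems(replication, max_lag)
    local problems = {}
    for id, peer in pairs(replication) do
        local upstream = peer.upstream
        -- The instance's own entry has no upstream and is skipped.
        if upstream ~= nil then
            if upstream.status ~= 'follow' then
                table.insert(problems, {
                    id = id,
                    reason = 'status: ' .. tostring(upstream.status),
                })
            elseif (upstream.lag or 0) > max_lag then
                table.insert(problems, {
                    id = id,
                    reason = 'lag: ' .. upstream.lag,
                })
            end
        end
    end
    table.sort(problems, function(a, b) return a.id < b.id end)
    return problems
end

-- Inside Tarantool: replication_problems(box.info.replication, 1)
```

An empty result means that every upstream is followed with a lag at or below the limit, which matches the threshold of 1 second used in the HighReplicationLag rule.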

Event loop

High fiber event loop time leads to bad application performance, timeouts and various warnings. The reason could be a high quantity of working fibers or fibers that spend too much time without any yields or sleeps.

- alert: HighEVLoopTime
  expr: tnt_ev_loop_time > 0.1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') event loop has high cycle duration"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' event loop has high cycle duration.
      Some high loaded fiber has too little yields. It may be the reason of 'Too long WAL write' warnings."

Configuration status

The configuration status metric shows whether the Tarantool 3 configuration has been applied. Additional metrics count the warnings and errors raised while applying it.

- alert: ConfigWarningAlerts
  expr: tnt_config_alerts{level="warn"} > 0
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') has configuration 'warn' alerts"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' has configuration 'warn' alerts.
                  Please, check config:info() for detailed info."

- alert: ConfigErrorAlerts
  expr: tnt_config_alerts{level="error"} > 0
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') has configuration 'error' alerts"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' has configuration 'error' alerts.
                  Latest configuration has not been applied.
                  Please, check config:info() for detailed info."

- alert: ConfigStatusNotReady
  expr: tnt_config_status{status="ready"} == 0
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') configuration is not ready"
    description: "Instance '{{ $labels.alias }}' of job '{{ $labels.job }}' configuration is not ready.
                  Please, check config:info() for detailed info."
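
The alert descriptions refer to config:info(). The sketch below summarizes its result in the same terms as the rules above. It is plain Lua that takes the info table as an argument; the function name summarize_config is hypothetical, and the field names status, alerts, and type are assumptions about the shape of the config:info() result:

```lua
-- Count configuration alerts by type and report the ready state.
local function summarize_config(info)
    local counts = {warn = 0, error = 0}
    for _, alert in ipairs(info.alerts or {}) do
        if counts[alert.type] ~= nil then
            counts[alert.type] = counts[alert.type] + 1
        end
    end
    return {
        ready = info.status == 'ready',
        warn = counts.warn,
        error = counts.error,
    }
end

-- Inside Tarantool 3: summarize_config(require('config'):info())
```

A result with ready set to false or a non-zero error count corresponds to the conditions that the ConfigStatusNotReady and ConfigErrorAlerts rules watch for.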

HTTP server statistics

The metrics module lets you monitor tarantool/http handlers, see “Collecting HTTP request latency statistics”. Here we use a summary collector with the default name and 0.99 quantile computation.

Too many responses with error codes are usually a sign of API issues or application malfunction.

- alert: HighInstanceHTTPClientErrorRate
  expr: sum by (job, instance, method, path, alias) (rate(http_server_request_latency_count{ job="tarantool", status=~"^4\\d{2}$" }[5m])) > 10
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') high rate of client error responses"
    description: "Too many {{ $labels.method }} requests to {{ $labels.path }} path
      on '{{ $labels.alias }}' instance of job '{{ $labels.job }}' get client error (4xx) responses."

- alert: HighHTTPClientErrorRate
  expr: sum by (job, method, path) (rate(http_server_request_latency_count{ job="tarantool", status=~"^4\\d{2}$" }[5m])) > 20
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Job '{{ $labels.job }}' high rate of client error responses"
    description: "Too many {{ $labels.method }} requests to {{ $labels.path }} path
      on instances of job '{{ $labels.job }}' get client error (4xx) responses."

- alert: HighHTTPServerErrorRate
  expr: sum by (job, instance, method, path, alias) (rate(http_server_request_latency_count{ job="tarantool", status=~"^5\\d{2}$" }[5m])) > 0
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') server error responses"
    description: "Some {{ $labels.method }} requests to {{ $labels.path }} path
      on '{{ $labels.alias }}' instance of job '{{ $labels.job }}' get server error (5xx) responses."

High response latency means insufficient performance. It may be a sign of an application malfunction, or you may need to add more routers to your cluster.

- alert: HighHTTPLatency
  expr: http_server_request_latency{ job="tarantool", quantile="0.99" } > 0.1
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') high HTTP latency"
    description: "Some {{ $labels.method }} requests to {{ $labels.path }} path with {{ $labels.status }} response status
      on '{{ $labels.alias }}' instance of job '{{ $labels.job }}' take too long to process."

Receiving too few requests when you expect them may indicate a balancer, external client, or network malfunction.

- alert: LowRouterHTTPRequestRate
  expr: sum by (job, instance, alias) (rate(http_server_request_latency_count{ job="tarantool", alias=~"^.*router.*$" }[5m])) < 10
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Router '{{ $labels.alias }}' ('{{ $labels.job }}') low activity"
    description: "Router '{{ $labels.alias }}' instance of job '{{ $labels.job }}' gets too few requests.
      Please check your balancer middleware."

CRUD module statistics

If your application uses CRUD module requests, monitoring the module statistics can reveal internal errors caused by invalid processing of input and internal parameters.

- alert: HighCRUDErrorRate
  expr: rate(tnt_crud_stats_count{ job="tarantool", status="error" }[5m]) > 0.1
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') too many CRUD {{ $labels.operation }} errors."
    description: "Too many {{ $labels.operation }} CRUD requests for '{{ $labels.name }}' space on
      '{{ $labels.alias }}' instance of job '{{ $labels.job }}' get module error responses."

Statistics can also be used to monitor request performance. High request latency leads to high latency of client responses. It may be caused by network or disk issues. Read requests with poorly chosen conditions (with respect to space indexes and the sharding schema) may lead to full scans or map-reduces, which can also cause high latency.

- alert: HighCRUDLatency
  expr: tnt_crud_stats{ job="tarantool", quantile="0.99" } > 0.1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') too high CRUD {{ $labels.operation }} latency."
    description: "Some {{ $labels.operation }} {{ $labels.status }} CRUD requests for '{{ $labels.name }}' space on
      '{{ $labels.alias }}' instance of job '{{ $labels.job }}' take too long to process."

You can also directly monitor the map-reduce and scan rates.

- alert: HighCRUDMapReduceRate
  expr: rate(tnt_crud_map_reduces{ job="tarantool" }[5m]) > 0.1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: "Instance '{{ $labels.alias }}' ('{{ $labels.job }}') too many CRUD {{ $labels.operation }} map reduces."
    description: "There are too many {{ $labels.operation }} CRUD map reduce requests for '{{ $labels.name }}' space on
      '{{ $labels.alias }}' instance of job '{{ $labels.job }}'.
      Check your request conditions or consider changing sharding schema."

Server-side monitoring

If there are no Tarantool metrics, you may miss critical conditions. Prometheus provides the up metric to monitor the health of its targets.

- alert: InstanceDown
  expr: up == 0
  for: 1m
  labels:
    severity: page
  annotations:
    summary: "Instance '{{ $labels.instance }}' ('{{ $labels.job }}') down"
    description: "'{{ $labels.instance }}' of job '{{ $labels.job }}' has been down for more than a minute."

Do not forget to monitor your server’s CPU, disk, and RAM from the server side with your favorite tools. For example, in some cases of high CPU consumption, a Tarantool instance may stop sending metrics, so you can track such breakdowns only from the outside.

Metrics reference

This page provides a detailed description of metrics from the metrics module.

General metrics

General instance information:

`tnt_cfg_current_time`	Instance system time in the Unix timestamp format
`tnt_info_uptime`	Time in seconds since the instance has started
`tnt_read_only`	Indicates if the instance is in read-only mode (`1` if true, `0` if false)

Memory general

The following metrics provide a picture of memory usage by the Tarantool process.

`tnt_info_memory_cache`	Number of bytes in the cache used to store tuples with the vinyl storage engine.
`tnt_info_memory_data`	Number of bytes used to store user data (tuples) with the memtx engine and with level 0 of the vinyl engine, without regard for memory fragmentation.
`tnt_info_memory_index`	Number of bytes used for indexing user data. Includes memtx and vinyl memory tree extents, the vinyl page index, and the vinyl bloom filters.
`tnt_info_memory_lua`	Number of bytes used for the Lua runtime. Monitoring this metric can prevent memory overflow.
`tnt_info_memory_net`	Number of bytes used for network input/output buffers.
`tnt_info_memory_tx`	Number of bytes in use by active transactions. For the vinyl storage engine, this is the total size of all allocated objects (struct `txv`, struct `vy_tx`, struct `vy_read_interval`) and tuples pinned for those objects.

Memory allocation

Provides a memory usage report for the slab allocator. The slab allocator is the main allocator used to store tuples. The following metrics help monitor the total memory usage and memory fragmentation. To learn more about use cases, refer to the box.slab submodule documentation.

Available memory, bytes:

`tnt_slab_quota_size`	Amount of memory available to store tuples and indexes. Is equal to `memtx_memory`.
`tnt_slab_arena_size`	Total memory available to store both tuples and indexes. Includes allocated but currently free slabs.
`tnt_slab_items_size`	Total amount of memory available to store only tuples and not indexes. Includes allocated but currently free slabs.

Memory usage, bytes:

`tnt_slab_quota_used`	The amount of memory that is already reserved by the slab allocator.
`tnt_slab_arena_used`	The effective memory used to store both tuples and indexes. Disregards allocated but currently free slabs.
`tnt_slab_items_used`	The effective memory used to store only tuples and not indexes. Disregards allocated but currently free slabs.

Memory utilization, %:

`tnt_slab_quota_used_ratio`	`tnt_slab_quota_used / tnt_slab_quota_size`
`tnt_slab_arena_used_ratio`	`tnt_slab_arena_used / tnt_slab_arena_size`
`tnt_slab_items_used_ratio`	`tnt_slab_items_used / tnt_slab_items_size`
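
As a rough illustration of how the utilization metrics relate to the size and usage gauges above, the following Python sketch derives the three ratios from raw byte counts. The function name `slab_ratios` and the sample numbers are hypothetical, and the result is assumed to be a percentage (based on the “%” in the heading). The metrics module already exports these ratios, so this only shows the arithmetic.

```python
def slab_ratios(sizes, used):
    """Return utilization percentages keyed by slab area.

    `sizes` and `used` map an area name ("quota", "arena", "items")
    to a byte count, mirroring tnt_slab_<area>_size and
    tnt_slab_<area>_used.
    """
    ratios = {}
    for area, size in sizes.items():
        # Guard against a zero size to avoid ZeroDivisionError.
        ratios[area] = 100.0 * used[area] / size if size else 0.0
    return ratios


# Hypothetical sample values in bytes.
sizes = {"quota": 1000, "arena": 800, "items": 400}
used = {"quota": 900, "arena": 600, "items": 100}
ratios = slab_ratios(sizes, used)
```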

Spaces

The following metrics provide specific information about each individual space in a Tarantool instance.

`tnt_space_len`	Number of records in the space. This metric always has 2 labels: `{name="test", engine="memtx"}`, where `name` is the name of the space and `engine` is the engine of the space.
`tnt_space_bsize`	Total number of bytes in all tuples. This metric always has 2 labels: `{name="test", engine="memtx"}`, where `name` is the name of the space and `engine` is the engine of the space.
`tnt_space_index_bsize`	Total number of bytes taken by the index. This metric always has 2 labels: `{name="test", index_name="pk"}`, where `name` is the name of the space and `index_name` is the name of the index.
`tnt_space_total_bsize`	Total size of tuples and all indexes in the space. This metric always has 2 labels: `{name="test", engine="memtx"}`, where `name` is the name of the space and `engine` is the engine of the space.
`tnt_vinyl_tuples`	Total tuple count for vinyl. This metric always has 2 labels: `{name="test", engine="vinyl"}`, where `name` is the name of the space and `engine` is the engine of the space. For vinyl this metric is disabled by default and can be enabled only with global variable setup: `rawset(_G, 'include_vinyl_count', true)`.

Network

Network activity stats. These metrics can be used to monitor network load, usage peaks, and traffic drops.

Sent bytes:

`tnt_net_sent_total`	Bytes sent from the instance over the network since the instance’s start time

Received bytes:

`tnt_net_received_total`	Bytes received by the instance since the instance’s start time

Connections:

`tnt_net_connections_total`	Number of incoming network connections since the instance’s start time
`tnt_net_connections_current`	Number of active network connections

Requests:

`tnt_net_requests_total`	Number of network requests the instance has handled since its start time
`tnt_net_requests_current`	Number of pending network requests

Requests in progress:

`tnt_net_requests_in_progress_total`	Total count of requests processed by tx thread
`tnt_net_requests_in_progress_current`	Count of requests currently being processed in the tx thread

Requests placed in queues of streams:

`tnt_net_requests_in_stream_total`	Total count of requests that were placed in stream queues over all time
`tnt_net_requests_in_stream_current`	Count of requests currently waiting in queues of streams

Since Tarantool 2.10, each network metric has the label thread, showing per-thread network statistics.

Fibers

Provides the statistics for fibers. If your application creates a lot of fibers, you can use the metrics below to monitor fiber count and memory usage.

`tnt_fiber_amount`	Number of fibers
`tnt_fiber_csw`	Overall number of fiber context switches
`tnt_fiber_memalloc`	Amount of memory reserved for fibers
`tnt_fiber_memused`	Amount of memory used by fibers

Operations

You can collect iproto requests an instance has processed and aggregate them by request type. This may help you find out what operations your clients perform most often.

`tnt_stats_op_total`	Total number of calls since server start

To distinguish between request types, this metric has the operation label. For example, it can look as follows: {operation="select"}. For the possible request types, check the table below.

`auth`	Authentication requests
`call`	Requests to execute stored procedures
`delete`	Delete calls
`error`	Requests that resulted in an error
`eval`	Calls to evaluate Lua code
`execute`	Execute SQL calls
`insert`	Insert calls
`prepare`	SQL prepare calls
`replace`	Replace calls
`select`	Select calls
`update`	Update calls
`upsert`	Upsert calls
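
To find out which operations clients perform most often, two snapshots of the per-operation counter can be compared. The Python sketch below is a simplified stand-in for what a monitoring system's rate function does. The function name `op_rates`, the snapshot dictionaries, and the 60-second interval are hypothetical.

```python
def op_rates(prev, curr, interval_s):
    """Per-second rate for each operation between two counter snapshots."""
    rates = {}
    for op, value in curr.items():
        delta = value - prev.get(op, 0)
        # A negative delta means the counter was reset (for example,
        # the instance restarted); counters start again from zero,
        # so fall back to the current value.
        if delta < 0:
            delta = value
        rates[op] = delta / interval_s
    return rates


# Hypothetical tnt_stats_op_total values keyed by the operation label.
prev = {"select": 1000, "insert": 200, "call": 50}
curr = {"select": 4000, "insert": 500, "call": 20}
rates = op_rates(prev, curr, 60)
busiest = max(rates, key=rates.get)
```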

Replication

Provides the current replication status. Learn more about replication in Tarantool.

`tnt_info_lsn`	LSN of the instance.
`tnt_info_vclock`	LSN number in vclock. This metric always has the label `{id="id"}`, where `id` is the instance’s number in the replica set.
`tnt_replication_lsn`	LSN of the tarantool instance. This metric always has labels `{id="id", type="type"}`, where `id` is the instance’s number in the replica set, `type` is `master` or `replica`.
`tnt_replication_lag`	Replication lag value in seconds. This metric always has labels `{id="id", stream="stream"}`, where `id` is the instance’s number in the replica set, `stream` is `downstream` or `upstream`.
`tnt_replication_status`	This metric equals 1 when replication status is “follow” and 0 otherwise. This metric always has labels `{id="id", stream="stream"}`, where `id` is the instance’s number in the replica set, `stream` is `downstream` or `upstream`.

Runtime

`tnt_runtime_lua`	Lua garbage collector size in bytes
`tnt_runtime_used`	Number of bytes used for the Lua runtime
`tnt_runtime_tuple`	Number of bytes used for the tuples (except tuples owned by memtx and vinyl)

LuaJIT metrics

LuaJIT metrics provide an insight into the work of the Lua garbage collector. These metrics are available in Tarantool 2.6 and later.

General JIT metrics:

`lj_jit_snap_restore_total`	Overall number of snap restores
`lj_jit_trace_num`	Number of JIT traces
`lj_jit_trace_abort_total`	Overall number of abort traces
`lj_jit_mcode_size`	Total size of allocated machine code areas

JIT strings:

`lj_strhash_hit_total`	Number of strings being interned
`lj_strhash_miss_total`	Total number of string allocations

GC steps:

`lj_gc_steps_atomic_total`	Count of incremental GC steps (atomic state)
`lj_gc_steps_sweepstring_total`	Count of incremental GC steps (sweepstring state)
`lj_gc_steps_finalize_total`	Count of incremental GC steps (finalize state)
`lj_gc_steps_sweep_total`	Count of incremental GC steps (sweep state)
`lj_gc_steps_propagate_total`	Count of incremental GC steps (propagate state)
`lj_gc_steps_pause_total`	Count of incremental GC steps (pause state)

Allocations:

`lj_gc_strnum`	Number of allocated `string` objects
`lj_gc_tabnum`	Number of allocated `table` objects
`lj_gc_cdatanum`	Number of allocated `cdata` objects
`lj_gc_udatanum`	Number of allocated `udata` objects
`lj_gc_freed_total`	Total amount of freed memory
`lj_gc_memory`	Current allocated Lua memory
`lj_gc_allocated_total`	Total amount of allocated memory

CPU metrics

The following metrics provide CPU usage statistics. They are only available on Linux.

`tnt_cpu_number`	Total number of processors configured by the operating system
`tnt_cpu_time`	Host CPU time
`tnt_cpu_thread`	Tarantool thread CPU time. This metric always has the labels `{kind="user", thread_name="tarantool", thread_pid="pid", file_name="init.lua"}`, where: `kind` can be either `user` or `system`; `thread_name` is `tarantool`, `wal`, `iproto`, or `coio`; `file_name` is the entrypoint file name, for example, `init.lua`.

There are also two cross-platform metrics, which can be obtained with a getrusage() call.

`tnt_cpu_user_time`	Tarantool CPU user time
`tnt_cpu_system_time`	Tarantool CPU system time

Vinyl

Vinyl metrics provide vinyl engine statistics.

Disk

The disk metrics are used to monitor overall data size on disk.

`tnt_vinyl_disk_data_size`	Amount of data in bytes stored in the `.run` files located in vinyl_dir
`tnt_vinyl_disk_index_size`	Amount of data in bytes stored in the `.index` files located in vinyl_dir

Regulator

The vinyl regulator decides when to commence disk IO actions. It groups activities in batches so that they are more consistent and efficient.

`tnt_vinyl_regulator_dump_bandwidth`	Estimated average dumping rate, bytes per second. The rate value is initially 10485760 (10 megabytes per second). It is recalculated depending on the actual rate. Only significant dumps that are larger than 1 MB are used for estimating.
`tnt_vinyl_regulator_write_rate`	Actual average rate of performing write operations, bytes per second. The rate is calculated as a 5-second moving average. If the metric value is gradually going down, this can indicate disk issues.
`tnt_vinyl_regulator_rate_limit`	Write rate limit, bytes per second. The regulator imposes the limit on transactions based on the observed dump/compaction performance. If the metric value is down to approximately `10^5`, this indicates issues with the disk or the scheduler.
`tnt_vinyl_regulator_dump_watermark`	Maximum amount of memory in bytes used for in-memory storing of a vinyl LSM tree. When this maximum is reached, a dump must occur. For details, see Filling an LSM tree. The value is slightly smaller than the amount of memory allocated for vinyl trees, reflected in the vinyl_memory parameter.
`tnt_vinyl_regulator_blocked_writers`	The number of fibers that are blocked waiting for Vinyl level0 memory quota.

Transactional activity

`tnt_vinyl_tx_commit`	Counter of commits (successful transaction ends). Includes implicit commits: for example, any insert operation causes a commit unless it is within a box.begin()–box.commit() block.
`tnt_vinyl_tx_rollback`	Counter of rollbacks (unsuccessful transaction ends). This is not merely a count of explicit box.rollback() requests – it includes requests that ended with errors.
`tnt_vinyl_tx_conflict`	Counter of conflicts that caused transactions to roll back. The ratio `tnt_vinyl_tx_conflict / tnt_vinyl_tx_commit` above 5% indicates that vinyl is not healthy. At that moment, you’ll probably see a lot of other problems with vinyl.
`tnt_vinyl_tx_read_views`	Current number of read views – that is, transactions that entered the read-only state to avoid conflict temporarily. Usually the value is `0`. If it stays non-zero for a long time, it is indicative of a memory leak.
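
The 5% conflict-to-commit guideline above can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and the inputs are assumed to be the values of `tnt_vinyl_tx_conflict` and `tnt_vinyl_tx_commit` taken over the same period.

```python
def vinyl_conflict_ratio(conflicts, commits):
    """Return conflicts / commits, or None when there are no commits."""
    if commits == 0:
        return None
    return conflicts / commits


def vinyl_is_unhealthy(conflicts, commits, threshold=0.05):
    """True when the conflict ratio exceeds the 5% guideline."""
    ratio = vinyl_conflict_ratio(conflicts, commits)
    return ratio is not None and ratio > threshold
```

For example, 60 conflicts against 1000 commits give a ratio of 6%, which is above the guideline, while 40 conflicts against 1000 commits (4%) are not.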

Memory

The following metrics show the state of memory areas used by vinyl for caches and write buffers.

`tnt_vinyl_memory_tuple_cache`	Amount of memory in bytes currently used to store tuples (data)
`tnt_vinyl_memory_level0`	“Level 0” (L0) memory area, bytes. L0 is the area that vinyl can use for in-memory storage of an LSM tree. By monitoring this metric, you can see when L0 is getting close to its maximum (`tnt_vinyl_regulator_dump_watermark`), at which time a dump will occur. You can expect L0 = 0 immediately after the dump operation is completed.
`tnt_vinyl_memory_page_index`	Amount of memory in bytes currently used to store indexes. If the metric value is close to vinyl_memory, this indicates that vinyl_page_size was chosen incorrectly.
`tnt_vinyl_memory_bloom_filter`	Amount of memory in bytes used by bloom filters.
`tnt_vinyl_memory_tuple`	Total size of memory in bytes occupied by Vinyl tuples. It includes cached tuples and tuples pinned by the Lua world.

Scheduler

The vinyl scheduler invokes the regulator and updates the related variables. This happens once per second.

`tnt_vinyl_scheduler_tasks`	Number of scheduler dump/compaction tasks. The metric always has the label `{status = <status_value>}`, where `<status_value>` can be one of the following: `inprogress` for currently running tasks; `completed` for successfully completed tasks; `failed` for tasks aborted due to errors.
`tnt_vinyl_scheduler_dump_time`	Total time in seconds spent by all worker threads performing dumps.
`tnt_vinyl_scheduler_dump_total`	Counter of dumps completed.

Event loop metrics

Event loop tx thread information:

`tnt_ev_loop_time`	Event loop time (ms)
`tnt_ev_loop_prolog_time`	Event loop prolog time (ms)
`tnt_ev_loop_epilog_time`	Event loop epilog time (ms)

Synchro

Shows the current state of synchronous replication.

`tnt_synchro_queue_owner`	Instance ID of the current synchronous replication master.
`tnt_synchro_queue_term`	Current queue term.
`tnt_synchro_queue_len`	How many transactions are collecting confirmations now.
`tnt_synchro_queue_busy`	Whether the queue is processing any system entry (CONFIRM/ROLLBACK/PROMOTE/DEMOTE).

Election

Shows the current state of a replica set node in regards to leader election.

`tnt_election_state`	Election state (mode) of the node. When election is enabled, the node is writable only in the leader state. Possible values: 0 (`follower`): all the non-leader nodes are called followers; 1 (`candidate`): the nodes that start a new election round are called candidates; 2 (`leader`): the node that collected a quorum of votes becomes the leader.
`tnt_election_vote`	ID of a node the current node votes for. If the value is 0, it means the node hasn’t voted in the current term yet.
`tnt_election_leader`	Leader node ID in the current term. If the value is 0, it means the node doesn’t know which node is the leader in the current term.
`tnt_election_term`	Current election term.
`tnt_election_leader_idle`	Time in seconds since the last interaction with the known leader.
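
When reading these metrics in a script, the numeric values have to be mapped back to their meaning. The Python sketch below shows one way to do that. The function name `describe_election` and the shape of the returned dictionary are hypothetical; the state codes and the meaning of a zero leader ID come from the table above.

```python
ELECTION_STATES = {0: "follower", 1: "candidate", 2: "leader"}


def describe_election(state, leader_id):
    """Decode tnt_election_state and tnt_election_leader values."""
    name = ELECTION_STATES.get(state, "unknown")
    return {
        "state": name,
        # Holds only when election is enabled: the node is then
        # writable only in the leader state.
        "writable": name == "leader",
        # A leader ID of 0 means the node does not know the leader
        # in the current term.
        "leader_known": leader_id != 0,
    }
```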

Memtx

Memtx MVCC memory statistics. The transaction manager consists of two parts:

the transactions themselves (TXN section)
MVCC

TXN

tnt_memtx_tnx_statements are the transaction statements.

For example, the user started a transaction and performed an action in it, such as space:replace{0, 1}. Under the hood, this operation turns into a statement for the current transaction. This metric always has the label {kind="..."}, which has the following possible values:

total: the number of bytes that are allocated for the statements of all current transactions.
average: average bytes used by transactions for statements (txn.statements.total bytes / number of open transactions).
max: the maximum number of bytes used by one of the current transactions for statements.

tnt_memtx_tnx_user

The Tarantool C API provides the box_txn_alloc() function. With this function, the user can allocate memory for the current transaction. This metric always has the label {kind="..."}, which has the following possible values:

total: memory allocated by the box_txn_alloc() function on all current transactions.
average: transaction average (total allocated bytes / number of all current transactions).
max: the maximum number of bytes allocated by box_txn_alloc() function per transaction.

tnt_memtx_tnx_system

Memory used by transaction internals, such as logs and savepoints. This metric always has the label {kind="..."}, which has the following possible values:

total: memory allocated by internals on all current transactions.
average: average allocated memory by internals (total memory / number of all current transactions).
max: the maximum number of bytes allocated by internals per transaction.

MVCC

MVCC is responsible for transaction isolation. It detects conflicts and makes sure that tuples that are no longer in the space but are read (or can be read) by some transaction are not deleted.

tnt_memtx_mvcc_trackers

Trackers that keep track of transaction reads. This metric always has the label {kind="..."}, which has the following possible values:

total: memory allocated for the trackers of all current transactions (in bytes).
average: average for all current transactions (total memory bytes / number of transactions).
max: maximum trackers allocated per transaction (in bytes).

tnt_memtx_mvcc_conflicts

Allocated in case of transaction conflicts. This metric always has the label {kind="..."}, which has the following possible values:

total: bytes allocated for conflicts in total.
average: average for all current transactions (total memory bytes / number of transactions).
max: maximum bytes allocated for conflicts per transaction.

Tuples

Saved tuples are divided into 3 categories: used, read_view, tracking.

Each category has two metrics:

retained tuples - they are no longer in the index, but MVCC does not allow them to be removed.
stories - MVCC is based on the story mechanism, almost every tuple has a story. This is a separate metric because even the tuples that are in the index can have a story. So stories and retained need to be measured separately.

`tnt_memtx_mvcc_tuples_used_stories`	Tuples that are used by active read-write transactions. This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of `used` tuples / number of stories. `total`: amount of bytes used by stories `used` tuples.
`tnt_memtx_mvcc_tuples_used_retained`	Tuples that are used by active read-write transactions. They are no longer in the index, but MVCC does not allow them to be removed. This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of retained `used` tuples / number of stories. `total`: amount of bytes used by retained `used` tuples.
`tnt_memtx_mvcc_tuples_read_view_stories`	Tuples that are not used by active read-write transactions, but are used by read-only transactions (i.e. in read view). This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of `read_view` tuples / number of stories. `total`: amount of bytes used by stories `read_view` tuples.
`tnt_memtx_mvcc_tuples_read_view_retained`	Tuples that are not used by active read-write transactions, but are used by read-only transactions (i.e. in read view). These tuples are no longer in the index, but MVCC does not allow them to be removed. This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of retained `read_view` tuples / number of stories. `total`: amount of bytes used by retained `read_view` tuples.
`tnt_memtx_mvcc_tuples_tracking_stories`	Tuples that are not directly used by any transactions, but are used by MVCC to track reads. This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of `tracking` tuples / number of tracking stories. `total`: amount of bytes used by stories `tracking` tuples.
`tnt_memtx_mvcc_tuples_tracking_retained`	Tuples that are not directly used by any transactions, but are used by MVCC to track reads. These tuples are no longer in the index, but MVCC does not allow them to be removed. This metric always has the label `{kind="..."}`, which has the following possible values: `count`: number of retained `tracking` tuples / number of stories. `total`: amount of bytes used by retained `tracking` tuples.

Read view statistics

`tnt_memtx_tuples_data_total`	Total amount of memory (in bytes) allocated for data tuples. This includes `tnt_memtx_tuples_data_read_view` and `tnt_memtx_tuples_data_garbage` metric values plus tuples that are actually stored in memtx spaces.
`tnt_memtx_tuples_data_read_view`	Memory (in bytes) held for read views.
`tnt_memtx_tuples_data_garbage`	Memory (in bytes) that is unused and scheduled to be freed (freed lazily on memory allocation).
`tnt_memtx_index_total`	Total amount of memory (in bytes) allocated for indexing data. This includes `tnt_memtx_index_read_view` metric value plus memory used for indexing tuples that are actually stored in memtx spaces.
`tnt_memtx_index_read_view`	Memory (in bytes) held for read views.

Tarantool configuration

Since: 3.0.0.

tnt_config_alerts Count of alerts raised while applying the current instance configuration. The {level="warn"} label covers warnings and {level="error"} covers errors.

tnt_config_status

The status of applying the current instance configuration. The status label contains the status name. The current status has the metric value 1; inactive statuses have the metric value 0.

# HELP tnt_config_status Tarantool 3 configuration status
# TYPE tnt_config_status gauge
tnt_config_status{status="reload_in_progress",alias="router-001-a"} 0
tnt_config_status{status="uninitialized",alias="router-001-a"} 0
tnt_config_status{status="check_warnings",alias="router-001-a"} 0
tnt_config_status{status="ready",alias="router-001-a"} 1
tnt_config_status{status="check_errors",alias="router-001-a"} 0
tnt_config_status{status="startup_in_progress",alias="router-001-a"} 0

For example, this set of metrics means that the current configuration status for router-001-a is ready.
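
The same reading can be done programmatically. The Python sketch below scans metrics text in the format shown above and reports, for each alias, the status whose value is 1. The function name `active_statuses` is hypothetical, and the parser is deliberately naive: it assumes label values contain no escaped quotes or closing braces.

```python
import re

SAMPLE = '''\
# HELP tnt_config_status Tarantool 3 configuration status
# TYPE tnt_config_status gauge
tnt_config_status{status="reload_in_progress",alias="router-001-a"} 0
tnt_config_status{status="uninitialized",alias="router-001-a"} 0
tnt_config_status{status="check_warnings",alias="router-001-a"} 0
tnt_config_status{status="ready",alias="router-001-a"} 1
tnt_config_status{status="check_errors",alias="router-001-a"} 0
tnt_config_status{status="startup_in_progress",alias="router-001-a"} 0
'''

LINE_RE = re.compile(r'^tnt_config_status\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def active_statuses(text):
    """Map each alias to the status whose metric value is 1."""
    active = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip comments and unrelated metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if float(match.group("value")) == 1:
            active[labels.get("alias")] = labels.get("status")
    return active


result = active_statuses(SAMPLE)
```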

Notes for operating systems

macOS

On macOS, no native system tools for administering Tarantool are supported. The recommended way to administer Tarantool instances is using tt CLI.

Gentoo Linux

The section below is about a dev-db/tarantool package installed from the official layman overlay (named tarantool).

The default instance directory is /etc/tarantool/instances.available; it can be redefined in /etc/default/tarantool.

Tarantool instances can be managed (start/stop/reload/status/…) using OpenRC. The following example shows how to create an OpenRC-managed instance:

$ cd /etc/init.d
$ ln -s tarantool your_service_name
$ ln -s /usr/share/tarantool/your_service_name.lua /etc/tarantool/instances.available/your_service_name.lua

Checking that it works:

$ /etc/init.d/your_service_name start
$ tail -f -n 100 /var/log/tarantool/your_service_name.log

Troubleshooting guide

Problem: INSERT/UPDATE-requests result in ER_MEMORY_ISSUE error

Possible reasons

Lack of RAM (parameters arena_used_ratio and quota_used_ratio in box.slab.info() report are getting close to 100%).

To check these parameters, run:

$ # attaching to a Tarantool instance
$ tt connect <instance_name|URI>

-- requesting arena_used_ratio value
tarantool> box.slab.info().arena_used_ratio

-- requesting quota_used_ratio value
tarantool> box.slab.info().quota_used_ratio

Solution

Try either of the following measures:

- In Tarantool’s instance file, increase the value of box.cfg{memtx_memory} (if memory resources are available).

  In versions of Tarantool before 1.10, the server needs to be restarted to change this parameter. The Tarantool server will be unavailable while restarting from .xlog files, unless you restart it using hot standby mode. In the latter case, nearly 100% server availability is guaranteed.
- Clean up the database.
- Check the indicators of memory fragmentation:
```
-- requesting quota_used_ratio value
tarantool> box.slab.info().quota_used_ratio

-- requesting items_used_ratio value
tarantool> box.slab.info().items_used_ratio
```
In case of heavy memory fragmentation (quota_used_ratio is getting close to 100%, items_used_ratio is about 50%), we recommend restarting Tarantool in the hot standby mode.
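
The fragmentation check can be scripted once the two ratios are collected. The Python sketch below makes several assumptions that are not part of the guidance above: the ratios are assumed to arrive as strings such as "98.50%", “close to 100%” is read as 90% or more, and “about 50%” is read as 60% or less. The function names and both thresholds are hypothetical and should be tuned.

```python
def parse_percent(value):
    """Convert a ratio such as '98.50%' (or a bare number) to a float."""
    return float(str(value).strip().rstrip("%"))


def looks_fragmented(quota_used_ratio, items_used_ratio,
                     quota_high=90.0, items_low=60.0):
    """Heuristic: quota almost exhausted while items use is low."""
    quota = parse_percent(quota_used_ratio)
    items = parse_percent(items_used_ratio)
    return quota >= quota_high and items <= items_low
```

With these thresholds, a quota ratio of "98.50%" combined with an items ratio of "51.20%" is flagged, while the same quota ratio with an items ratio of "95.00%" is not: the memory is simply full rather than fragmented.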

Problem: Tarantool generates too heavy CPU load

Possible reasons

The transaction processor thread consumes over 60% CPU.

Solution

Attach to the Tarantool instance with the tt utility, analyze the query statistics with box.stat(), and spot the CPU consumption leader. The following commands can help:

$ # attaching to a Tarantool instance
$ tt connect <instance_name|URI>

-- checking the RPS of calling stored procedures
tarantool> box.stat().CALL.rps

The critical RPS value is 75 000, dropping to 10 000 - 20 000 for a rich Lua application (a Lua module of 200+ lines).

-- checking RPS per query type
tarantool> box.stat().<query_type>.rps

The critical RPS value for SELECT/INSERT/UPDATE/DELETE requests is 100 000.

If the load is mostly generated by SELECT requests, we recommend adding a replica and letting it process part of the queries.

If the load is mostly generated by INSERT/UPDATE/DELETE requests, we recommend sharding the database.
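
The two recommendations above can be turned into a small decision helper. The Python sketch below is an illustration only. The function name `scaling_hint` is hypothetical, the input is assumed to be a dictionary of per-type RPS values collected from box.stat(), and “mostly” is interpreted as a simple majority between reads and writes.

```python
READ_OPS = {"SELECT"}
# The text names INSERT/UPDATE/DELETE; other write request types
# (for example REPLACE or UPSERT) could be added here.
WRITE_OPS = {"INSERT", "UPDATE", "DELETE"}


def scaling_hint(rps_by_type):
    """Suggest a scaling direction from per-type RPS values."""
    reads = sum(rps for op, rps in rps_by_type.items() if op in READ_OPS)
    writes = sum(rps for op, rps in rps_by_type.items() if op in WRITE_OPS)
    if reads == 0 and writes == 0:
        return "no load"
    # Assumption: "mostly" means the larger share of combined RPS.
    if reads > writes:
        return "add a replica for reads"
    return "shard the database"
```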

Problem: Query processing times out

Possible reasons

Note

All reasons that we discuss here can be identified by messages in Tarantool’s log file, all starting with the words 'Too long...'.

Both fast and slow queries are processed within a single connection, so the readahead buffer is cluttered with slow queries.

Solution

Try either of the following measures:
- Increase the readahead buffer size (box.cfg{readahead} parameter).
  
  This parameter can be changed on the fly, so you don’t need to restart Tarantool. Attach to the Tarantool instance with the tt utility and call box.cfg{} with a new readahead value:
```
$ # attaching to a Tarantool instance
$ tt connect <instance_name|URI>
```
```
-- changing the readahead value
tarantool> box.cfg{readahead = 10 * 1024 * 1024}
```
  Example: Given 1000 RPS, a query size of 1 Kbyte, and a maximum query processing time of 10 seconds, the readahead buffer must be at least 10 Mbytes.
- On the business logic level, split fast and slow queries processing by different connections.
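
The sizing example in the list above is a product of three numbers: request rate, query size, and the longest processing time. The Python sketch below reproduces that arithmetic. The function name `min_readahead_bytes` is hypothetical, and 1 Kbyte is taken as 1024 bytes.

```python
def min_readahead_bytes(rps, query_size_bytes, max_processing_s):
    """Bytes that can pile up while the slowest query is processed."""
    return rps * query_size_bytes * max_processing_s


# Values from the example: 1000 RPS, 1 Kbyte queries, 10 seconds.
needed = min_readahead_bytes(1000, 1024, 10)
configured = 10 * 1024 * 1024  # the readahead value set earlier in this solution
```

Here `needed` is 10 240 000 bytes, which fits within the configured 10 * 1024 * 1024 bytes.
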
Slow disks.

Solution

Check disk performance (use the iostat, iotop, or strace utility to check the iowait parameter) and try to put .xlog files and snapshot files on different physical disks (i.e. use different locations for wal_dir and memtx_dir).

Problem: Replication “lag” and “idle” contain negative values

This is about box.info.replication.(upstream.)lag and box.info.replication.(upstream.)idle values in box.info.replication section.

Possible reasons

Operating system clock on the hosts is not synchronized, or the NTP server is faulty.

Solution

Check NTP server settings.

If you find no problems with the NTP server, no further action is needed. Lag calculation uses the operating system clocks of two different machines. If they get out of sync, the remote master clock can get consistently behind the local instance’s clock.

Problem: Replication statistics differ on replicas within a replica set

This is about a replica set that consists of one master and several replicas. In a replica set of this type, values in the box.info.replication section, like box.info.replication.lsn, come from the master and must be the same on all replicas within the replica set. The problem is that they differ.

Possible reasons

Replication is broken.

Solution

Restart replication.

Problem: Master-master replication is stopped

This is about box.info.replication(.upstream).status = stopped.

Possible reasons

In a master-master replica set of two Tarantool instances, one of the masters has tried to perform an action already performed by the other server, for example re-insert a tuple with the same unique key. This would cause an error message like 'Duplicate key exists in unique index 'primary' in space <space_name>'.

Solution

This issue can be fixed in two ways:

Manually: reseed one master from another by removing write-ahead logs and snapshots.
Programmatically: set up a conflict resolution trigger.

Note

If one of the instances must be isolated during troubleshooting, it can be put into isolated mode.

Then, restart replication as described in Restarting replication.

Problem: Tarantool works much slower than before

Possible reasons

Inefficient memory usage (RAM is cluttered with a huge amount of unused objects).

Solution

Call the Lua garbage collector with the collectgarbage('count') function and measure its execution time with the Tarantool functions clock.bench() or clock.proc().

Example of calculating memory usage statistics:

$ # attaching to a Tarantool instance
$ tt connect <instance_name|URI>

-- loading Tarantool's "clock" module with time-related routines
tarantool> clock = require 'clock'
-- starting the timer
tarantool> b = clock.proc()
-- launching garbage collection
tarantool> c = collectgarbage('count')
-- stopping the timer after garbage collection is completed
tarantool> return c, clock.proc() - b

If the returned clock.proc() difference is greater than 0.001, this may be an indicator of inefficient memory usage (no active measures are required, but we recommend optimizing your Tarantool application code).

If the value is greater than 0.01, your application definitely needs thorough code analysis aimed at optimizing memory usage.

Problem: Fiber switch is forbidden in ‘__gc’ metamethod

Problem description

A fiber switch is forbidden in the __gc metamethod since this change, to avoid an unexpected Lua OOM. However, one may need to use a yielding function to finalize resources, for example, to close a socket.

Below are examples of how to implement such a procedure properly.

Solution

Two simple examples first illustrate the logic of the solution:

Example 1
Example 2

Example 3 then illustrates the usage of the sched.lua module, which is the recommended method.

All the explanations are given in the comments in the code listing. -- > indicates the output in console.

Example 1

Implementing a valid finalizer for a particular FFI type (custom_t).

local ffi = require('ffi')
local fiber = require('fiber')

ffi.cdef('struct custom { int a; };')

local function __custom_gc(self)
  print(("Entered custom GC finalizer for %s... (before yield)"):format(self.a))
  fiber.yield()
  print(("Leaving custom GC finalizer for %s... (after yield)"):format(self.a))
end

local custom_t = ffi.metatype('struct custom', {
  __gc = function(self)
    -- XXX: Do not invoke yielding functions in __gc metamethod.
    -- Create a new fiber to run after the execution leaves
    -- this routine.
    fiber.new(__custom_gc, self)
    print(("Finalization is scheduled for %s..."):format(self.a))
  end
})

-- Create a cdata object of <custom_t> type.
local c = custom_t(42)

-- Remove a single reference to that object to make it subject
-- for GC.
c = nil

-- Run full GC cycle to purge the unreferenced object.
collectgarbage('collect')
-- > Finalization is scheduled for 42...

-- XXX: There is no finalization made until the running fiber
-- yields its execution. Let's do it now.
fiber.yield()
-- > Entered custom GC finalizer for 42... (before yield)
-- > Leaving custom GC finalizer for 42... (after yield)

Example 2

Implementing a valid finalizer for a particular user type (struct custom).

custom.c

#include <lauxlib.h>
#include <lua.h>
#include <module.h>
#include <stdio.h>

struct custom {
  int a;
};

const char *CUSTOM_MTNAME = "CUSTOM_MTNAME";

/*
 * XXX: Do not invoke yielding functions in __gc metamethod.
 * Create a new fiber to be run after the execution leaves
 * this routine. Unfortunately we can't pass the parameters to the
 * routine to be executed by the created fiber via <fiber_new_ex>.
 * So there is a workaround to load the Lua code below to create
 * __gc metamethod passing the object for finalization via Lua
 * stack to the spawned fiber.
 */
const char *gc_wrapper_constructor = " local fiber = require('fiber')         "
             " print('constructor is initialized')    "
             " return function(__custom_gc)           "
             "   print('constructor is called')       "
             "   return function(self)                "
             "     print('__gc is called')            "
             "     fiber.new(__custom_gc, self)       "
             "     print('Finalization is scheduled') "
             "   end                                  "
             " end                                    "
        ;

int custom_gc(lua_State *L) {
  struct custom *self = luaL_checkudata(L, 1, CUSTOM_MTNAME);
  printf("Entered custom_gc for %d... (before yield)\n", self->a);
  fiber_sleep(0);
  printf("Leaving custom_gc for %d... (after yield)\n", self->a);
  return 0;
}

int custom_new(lua_State *L) {
  struct custom *self = lua_newuserdata(L, sizeof(struct custom));
  luaL_getmetatable(L, CUSTOM_MTNAME);
  lua_setmetatable(L, -2);
  self->a = lua_tonumber(L, 1);
  return 1;
}

static const struct luaL_Reg libcustom_methods [] = {
  { "new", custom_new },
  { NULL, NULL }
};

int luaopen_custom(lua_State *L) {
  int rc;

  /* Create metatable for struct custom type */
  luaL_newmetatable(L, CUSTOM_MTNAME);
  /*
   * Run the constructor initializer for GC finalizer:
   * - load fiber module as an upvalue for GC finalizer
   *   constructor
   * - return GC finalizer constructor on the top of the
   *   Lua stack
   */
  rc = luaL_dostring(L, gc_wrapper_constructor);
  /*
   * Check whether constructor is initialized (i.e. neither
   * syntax nor runtime error is raised).
   */
  if (rc != LUA_OK)
    luaL_error(L, "test module loading failed: constructor init");
  /*
   * Create GC object for <custom_gc> function to be called
   * in scope of the GC finalizer and push it on top of the
   * constructor returned before.
   */
  lua_pushcfunction(L, custom_gc);
  /*
   * Run the constructor with <custom_gc> GCfunc object as
   * a single argument. As a result GC finalizer is returned
   * on the top of the Lua stack.
   */
  rc = lua_pcall(L, 1, 1, 0);
  /*
   * Check whether GC finalizer is created (i.e. neither
   * syntax nor runtime error is raised).
   */
  if (rc != LUA_OK)
    luaL_error(L, "test module loading failed: __gc init");
  /*
   * Assign the returned function as a __gc metamethod to
   * custom type metatable.
   */
  lua_setfield(L, -2, "__gc");

  /*
   * Initialize Lua table for custom module and fill it
   * with the custom methods.
   */
  lua_newtable(L);
  luaL_register(L, NULL, libcustom_methods);
  return 1;
}

custom_c.lua

-- Load custom Lua C extension.
local custom = require('custom')
-- > constructor is initialized
-- > constructor is called

-- Create a userdata object of <struct custom> type.
local c = custom.new(9)

-- Remove a single reference to that object to make it subject
-- for GC.
c = nil

-- Run full GC cycle to purge the unreferenced object.
collectgarbage('collect')
-- > __gc is called
-- > Finalization is scheduled

-- XXX: There is no finalization made until the running fiber
-- yields its execution. Let's do it now.
require('fiber').yield()
-- > Entered custom_gc for 9... (before yield)

-- XXX: Finalizer yields the execution, so now we are here.
print('We are here')
-- > We are here

-- XXX: This fiber finishes its execution, so yield to the
-- remaining fiber to finish the postponed finalization.
-- > Leaving custom_gc for 9... (after yield)

Example 3

It is important to note that the finalizer implementations in the examples above put extra pressure on platform performance by creating a new fiber on each __gc call. To prevent such excessive fiber spawning, it’s better to start a single “scheduler” fiber and provide an interface to postpone the required asynchronous action.

For this purpose, the module called sched.lua is implemented (see the listing below). It is a part of Tarantool and should be required in your custom code. The usage example is given in the init.lua file below.

sched.lua

local fiber = require('fiber')

local worker_next_task = nil
local worker_last_task
local worker_fiber
local worker_cv = fiber.cond()

-- XXX: the module is not ready for reloading, so worker_fiber is
-- respawned when sched.lua is purged from package.loaded.

--
-- Worker is a singleton fiber for not urgent delayed execution of
-- functions. Main purpose - schedule execution of a function,
-- which is going to yield, from a context, where a yield is not
-- allowed. Such as an FFI object's GC callback.
--
local function worker_f()
  while true do
    local task
    while true do
      task = worker_next_task
      if task then break end
      -- XXX: Make the fiber wait until the task is added.
      worker_cv:wait()
    end
    worker_next_task = task.next
    task.f(task.arg)
    fiber.yield()
  end
end

local function worker_safe_f()
  pcall(worker_f)
  -- The function <worker_f> never returns. If the execution is
  -- here, this fiber is probably canceled and now is not able to
  -- sleep. Create a new one.
  worker_fiber = fiber.new(worker_safe_f)
end

worker_fiber = fiber.new(worker_safe_f)

local function worker_schedule_task(f, arg)
  local task = { f = f, arg = arg }
  if not worker_next_task then
    worker_next_task = task
  else
    worker_last_task.next = task
  end
  worker_last_task = task
  worker_cv:signal()
end

return {
  postpone = worker_schedule_task
}

init.lua

local ffi = require('ffi')
local fiber = require('fiber')
local sched = require('sched')

local function __custom_gc(self)
  print(("Entered custom GC finalizer for %s... (before yield)"):format(self.a))
  fiber.yield()
  print(("Leaving custom GC finalizer for %s... (after yield)"):format(self.a))
end

ffi.cdef('struct custom { int a; };')
local custom_t = ffi.metatype('struct custom', {
  __gc = function(self)
    -- XXX: Do not invoke yielding functions in __gc metamethod.
    -- Schedule __custom_gc call via sched.postpone to be run
    -- after the execution leaves this routine.
    sched.postpone(__custom_gc, self)
    print(("Finalization is scheduled for %s..."):format(self.a))
  end
})

-- Create several <custom_t> objects to be finalized later.
local t = { }
for i = 1, 10 do t[i] = custom_t(i) end

-- Run full GC cycle to collect the existing garbage. Nothing is
-- going to be printed, since the table <t> is still "alive".
collectgarbage('collect')

-- Remove the reference to the table and, ergo, all references to
-- the objects.
t = nil

-- Run full GC cycle to collect the table and objects inside it.
-- As a result all <custom_t> objects are scheduled for further
-- finalization, but the finalizer itself (i.e. __custom_gc
-- functions) is not called.
collectgarbage('collect')
-- > Finalization is scheduled for 10...
-- > Finalization is scheduled for 9...
-- > ...
-- > Finalization is scheduled for 2...
-- > Finalization is scheduled for 1...

-- XXX: There is no finalization made until the running fiber
-- yields its execution. Let's do it now.
fiber.yield()
-- > Entered custom GC finalizer for 10... (before yield)

-- XXX: Oops, we are here now, since the scheduler fiber yielded
-- the execution to this one. Check this out.
print("We're here now. Let's continue the scheduled finalization.")
-- > We're here now. Let's continue the scheduled finalization.

-- OK, wait a second to allow the scheduler to cleanup the
-- remaining garbage.
fiber.sleep(1)
-- > Leaving custom GC finalizer for 10... (after yield)
-- > Entered custom GC finalizer for 9... (before yield)
-- > Leaving custom GC finalizer for 9... (after yield)
-- > ...
-- > Entered custom GC finalizer for 1... (before yield)
-- > Leaving custom GC finalizer for 1... (after yield)

print("Did we finish? I guess so.")
-- > Did we finish? I guess so.

-- Stop the instance.
os.exit(0)

Connectors

Connectors are APIs that allow using Tarantool with various programming languages.

Connectors can be divided into two groups – those maintained by the Tarantool team and those supported by the community. The Tarantool team maintains the following connectors:

All other connectors are community-supported, which means that support for new Tarantool features may be delayed. Find all the available connectors on the Connectors page.

Protocol

Tarantool’s binary protocol was designed with a focus on asynchronous I/O and easy integration with proxies. Each client request starts with a variable-length binary header, containing request id, request type, instance id, log sequence number, and so on.

The mandatory length field in the request header simplifies client or proxy I/O. A response to a request is sent to the client as soon as it is ready. It always carries in its header the same type and id as in the request. The id makes it possible to match a request to a response, even if the latter arrived out of order.
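
The id-based matching described above can be sketched in Go without any Tarantool library. This is a simplified illustration, not how a real connector is implemented: the response and pending types are hypothetical, the body is a plain string rather than decoded MsgPack, and the map is not protected against concurrent access.

```go
package main

import "fmt"

// response is a simplified stand-in for a decoded reply header and body.
type response struct {
	id   uint64 // request id echoed back by the server
	body string
}

// pending maps the id of each in-flight request to the channel on which
// the caller waits for the matching response.
type pending map[uint64]chan response

// send registers a request id and returns the channel for its response.
func (p pending) send(id uint64) <-chan response {
	ch := make(chan response, 1) // buffered so dispatch never blocks
	p[id] = ch
	return ch
}

// dispatch routes a response to its waiter and reports false for an unknown id.
func (p pending) dispatch(r response) bool {
	ch, ok := p[r.id]
	if !ok {
		return false
	}
	delete(p, r.id)
	ch <- r
	return true
}

func main() {
	p := pending{}
	first, second := p.send(1), p.send(2)
	// The responses arrive in the opposite order to the requests.
	p.dispatch(response{id: 2, body: "reply to request 2"})
	p.dispatch(response{id: 1, body: "reply to request 1"})
	fmt.Println((<-first).body)  // prints "reply to request 1"
	fmt.Println((<-second).body) // prints "reply to request 2"
}
```

Because each waiter is looked up by id, the order in which responses arrive does not matter to the callers.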

Unless implementing a client driver, you needn’t concern yourself with the complications of the binary protocol. Language-specific drivers provide a friendly way to store domain language data structures in Tarantool. A complete description of the binary protocol is maintained in annotated Backus-Naur form in the source tree. For detailed examples and diagrams of all binary-protocol requests and responses, see Tarantool’s binary protocol.

Packet example

The Tarantool API exists so that a client program can send a request packet to a server instance, and receive a response. Here is an example of what the client would send for box.space[513]:insert{'A', 'BB'}. The BNF description of the components is on the page about Tarantool’s binary protocol.

Component	Byte #0	Byte #1	Byte #2	Byte #3
code for insert	02
rest of header	…	…	…	…
2-digit number: space id	cd	02	01
code for tuple	21
1-digit number: field count = 2	92
1-character string: field[1]	a1	41
2-character string: field[2]	a2	42	42
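
The space id and tuple bytes in the table follow the MsgPack encoding rules, and a short Go sketch can reproduce them by hand. The helper names below are hypothetical and cover only the three MsgPack formats that appear in the table (uint 16, fixstr, fixarray); the request header and the key codes 02 and 21 are not produced here.

```go
package main

import "fmt"

// encodeUint16 returns the MsgPack "uint 16" encoding:
// the 0xcd marker followed by the value in big-endian order.
func encodeUint16(v uint16) []byte {
	return []byte{0xcd, byte(v >> 8), byte(v)}
}

// encodeFixStr returns the MsgPack "fixstr" encoding:
// 0xa0 plus the length (at most 31 bytes), followed by the string bytes.
func encodeFixStr(s string) []byte {
	if len(s) > 31 {
		panic("fixstr holds at most 31 bytes")
	}
	return append([]byte{0xa0 | byte(len(s))}, s...)
}

// encodeFixArrayOfStrings returns the MsgPack "fixarray" encoding:
// 0x90 plus the element count (at most 15), followed by each element.
func encodeFixArrayOfStrings(fields ...string) []byte {
	if len(fields) > 15 {
		panic("fixarray holds at most 15 elements")
	}
	out := []byte{0x90 | byte(len(fields))}
	for _, f := range fields {
		out = append(out, encodeFixStr(f)...)
	}
	return out
}

func main() {
	fmt.Printf("space id: % x\n", encodeUint16(513))                    // cd 02 01
	fmt.Printf("tuple:    % x\n", encodeFixArrayOfStrings("A", "BB")) // 92 a1 41 a2 42 42
}
```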

Now, you could send that packet to the Tarantool instance, and interpret the response (the page about Tarantool’s binary protocol has a description of the packet format for responses as well as requests). But it would be easier, and less error-prone, if you could invoke a routine that formats the packet according to typed parameters. Something like response = tarantool_routine("insert", 513, "A", "BB");. And that is why APIs exist for drivers for Perl, Python, PHP, and so on.

Setting up the server for connector examples

This chapter has examples that show how to connect to a Tarantool instance via the Perl, PHP, Python, node.js, and C connectors. The examples contain hard-coded values that will work if and only if the following conditions are met:

the Tarantool instance (tarantool) is running on localhost (127.0.0.1) and is listening on port 3301 (box.cfg.listen = '3301'),
space examples has id = 999 (box.space.examples.id = 999) and has a primary-key index for a numeric field (box.space[999].index[0].parts[1].type = "unsigned"),
user ‘guest’ has privileges for reading and writing.

It is easy to meet all the conditions by starting the instance and executing this script:

box.cfg{listen=3301}
box.schema.space.create('examples',{id=999})
box.space.examples:create_index('primary', {type = 'hash', parts = {1, 'unsigned'}})
box.schema.user.grant('guest','read,write','space','examples')
box.schema.user.grant('guest','read','space','_space')

Interpreting function return values

For all connectors, calling a function via Tarantool causes a return in the MsgPack format. If the function is called using the connector’s API, some conversions may occur. All scalar values are returned as tuples (with a MsgPack type-identifier followed by a value); all non-scalar values are returned as a group of tuples (with a MsgPack array-identifier followed by the scalar values). If the function is called via the binary protocol command layer – “eval” – rather than via the connector’s API, no conversions occur.

In the following example, a Lua function will be created. Since it will be accessed externally by a ‘guest’ user, a grant of an execute privilege will be necessary. The function returns an empty array, a scalar string, two booleans, and a short integer. The values are the ones described in the table Common Types and MsgPack Encodings.

tarantool> box.cfg{listen=3301}
2016-03-03 18:45:52.802 [27381] main/101/interactive I> ready to accept requests
---
...
tarantool> function f() return {},'a',false,true,127; end
---
...
tarantool> box.schema.func.create('f')
---
...
tarantool> box.schema.user.grant('guest','execute','function','f')
---
...

Here is a C program which calls the function. Although C is being used for the example, the result would be precisely the same if the calling program was written in Perl, PHP, Python, Go, or Java.

#include <stdio.h>
#include <stdlib.h>
#include <tarantool/tarantool.h>
#include <tarantool/tnt_net.h>
#include <tarantool/tnt_opt.h>
int main() {
  struct tnt_stream *tnt = tnt_net(NULL);              /* SETUP */
  tnt_set(tnt, TNT_OPT_URI, "localhost:3301");
   if (tnt_connect(tnt) < 0) {                         /* CONNECT */
       printf("Connection refused\n");
       exit(-1);
   }
   struct tnt_stream *arg; arg = tnt_object(NULL);     /* MAKE REQUEST */
   tnt_object_add_array(arg, 0);
   struct tnt_request *req1 = tnt_request_call(NULL);  /* CALL function f() */
   tnt_request_set_funcz(req1, "f");
   uint64_t sync1 = tnt_request_compile(tnt, req1);
   tnt_flush(tnt);                                     /* SEND REQUEST */
   struct tnt_reply reply;  tnt_reply_init(&reply);    /* GET REPLY */
   tnt->read_reply(tnt, &reply);
   if (reply.code != 0) {
     printf("Call failed %lu.\n", reply.code);
     exit(-1);
   }
   const unsigned char *p= (unsigned char*)reply.data; /* PRINT REPLY */
   while (p < (unsigned char *) reply.data_end)
   {
     printf("%x ", *p);
     ++p;
   }
   printf("\n");
   tnt_close(tnt);                                     /* TEARDOWN */
   tnt_stream_free(arg);
   tnt_stream_free(tnt);
}

When this program is executed, it will print:

dd 0 0 0 5 90 91 a1 61 91 c2 91 c3 91 7f

The first five bytes – dd 0 0 0 5 – are the MsgPack encoding for “32-bit array header with value 5” (see MsgPack specification). The rest are as described in the table Common Types and MsgPack Encodings.

Go

Examples on GitHub: sample_db, go

go-tarantool is the official Go connector for Tarantool. It is not supplied as part of the Tarantool repository and should be installed separately.

This tutorial shows how to use the go-tarantool 2.x library to create a Go application that connects to a remote Tarantool instance, performs CRUD operations, and executes a stored procedure. You can find the full package documentation here: Client in Go for Tarantool.

Note

This tutorial shows how to make CRUD requests to a single-instance Tarantool database. To make requests to a sharded Tarantool cluster with the CRUD module, use the crud package’s API.

Sample database configuration

This section describes the configuration of a sample database that allows remote connections:

credentials:
  users:
    sampleuser:
      password: '123456'
      privileges:
      - permissions: [ read, write ]
        spaces: [ bands ]
      - permissions: [ execute ]
        functions: [ get_bands_older_than ]

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

app:
  file: 'myapp.lua'

The configuration contains one instance that listens for incoming requests on the 127.0.0.1:3301 address.
sampleuser has privileges to select and modify data in the bands space and execute the get_bands_older_than stored function. This user can be used to connect to the instance remotely.
myapp.lua defines the data model and a stored function.

The myapp.lua file looks as follows:

-- Create a space --
box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create indexes --
box.space.bands:create_index('primary', { parts = { 'id' } })
box.space.bands:create_index('band', { parts = { 'band_name' } })
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

-- Create a stored function --
box.schema.func.create('get_bands_older_than', {
    body = [[
    function(year)
        return box.space.bands.index.year_band:select({ year }, { iterator = 'LT', limit = 10 })
    end
    ]]
})

You can find the full example on GitHub: sample_db.

Starting a sample database application

Before creating and starting a client Go application, you need to run the sample_db application using tt start:

$ tt start sample_db

Now you can create a client Go application that makes requests to this database.

Developing a client application

Before you start, make sure you have Go installed on your computer.

Creating an application

Create the hello directory for your application and go to this directory:
```
$ mkdir hello
$ cd hello
```
Initialize a new Go module:
```
$ go mod init example/hello
```
Inside the hello directory, create the hello.go file for application code.

Importing ‘go-tarantool’ packages

In the hello.go file, declare a main package and import the following packages:

package main

import (
	"context"
	"fmt"
	"github.com/tarantool/go-tarantool/v2"
	_ "github.com/tarantool/go-tarantool/v2/datetime"
	_ "github.com/tarantool/go-tarantool/v2/decimal"
	_ "github.com/tarantool/go-tarantool/v2/uuid"
	"time"
)

The packages for external MsgPack types, such as datetime, decimal, or uuid, are required to parse these types in a response.

Connecting to the database

Declare the main() function:
```
func main() {

}
```

Inside the main() function, add the following code:

// Connect to the database
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
dialer := tarantool.NetDialer{
	Address:  "127.0.0.1:3301",
	User:     "sampleuser",
	Password: "123456",
}
opts := tarantool.Opts{
	Timeout: time.Second,
}

conn, err := tarantool.Connect(ctx, dialer, opts)
if err != nil {
	fmt.Println("Connection refused:", err)
	return
}

// Interact with the database
// ...

This code establishes a connection to a running Tarantool instance on behalf of sampleuser. The conn object can be used to make CRUD requests and execute stored procedures.

Manipulating data

Inserting data

Add the following code to insert four tuples into the bands space:

// Insert data
tuples := [][]interface{}{
	{1, "Roxette", 1986},
	{2, "Scorpions", 1965},
	{3, "Ace of Base", 1987},
	{4, "The Beatles", 1960},
}
var futures []*tarantool.Future
for _, tuple := range tuples {
	request := tarantool.NewInsertRequest("bands").Tuple(tuple)
	futures = append(futures, conn.Do(request))
}
fmt.Println("Inserted tuples:")
for _, future := range futures {
	result, err := future.Get()
	if err != nil {
		fmt.Println("Got an error:", err)
	} else {
		fmt.Println(result)
	}
}

This code makes insert requests asynchronously:

The Future structure is used as a handle for asynchronous requests.
The NewInsertRequest() function creates an insert request object that is executed by the connection.

Note

Making requests asynchronously is the recommended way to perform data operations. Further requests in this tutorial are made synchronously.

Querying data

To get a tuple by the specified primary key value, use NewSelectRequest() to create a select request object:

// Select by primary key
data, err := conn.Do(
	tarantool.NewSelectRequest("bands").
		Limit(10).
		Iterator(tarantool.IterEq).
		Key([]interface{}{uint(1)}),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Tuple selected by the primary key value:", data)

You can also get a tuple by the value of the specified index by using Index():

// Select by secondary key
data, err = conn.Do(
	tarantool.NewSelectRequest("bands").
		Index("band").
		Limit(10).
		Iterator(tarantool.IterEq).
		Key([]interface{}{"The Beatles"}),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Tuple selected by the secondary key value:", data)

Updating data

NewUpdateRequest() can be used to update a tuple identified by the primary key as follows:

// Update
data, err = conn.Do(
	tarantool.NewUpdateRequest("bands").
		Key(tarantool.IntKey{2}).
		Operations(tarantool.NewOperations().Assign(1, "Pink Floyd")),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Updated tuple:", data)

NewUpsertRequest() can be used to update an existing tuple or insert a new one. In the example below, a new tuple is inserted:

// Upsert
data, err = conn.Do(
	tarantool.NewUpsertRequest("bands").
		Tuple([]interface{}{uint(5), "The Rolling Stones", 1962}).
		Operations(tarantool.NewOperations().Assign(1, "The Doors")),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}

In this example, NewReplaceRequest() is used to delete the existing tuple and insert a new one:

// Replace
data, err = conn.Do(
	tarantool.NewReplaceRequest("bands").
		Tuple([]interface{}{1, "Queen", 1970}),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Replaced tuple:", data)

Deleting data

NewDeleteRequest() in the example below is used to delete a tuple whose primary key value is 5:

// Delete
data, err = conn.Do(
	tarantool.NewDeleteRequest("bands").
		Key([]interface{}{uint(5)}),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Deleted tuple:", data)

Executing stored procedures

To execute a stored procedure, use NewCallRequest():

// Call
data, err = conn.Do(
	tarantool.NewCallRequest("get_bands_older_than").Args([]interface{}{1966}),
).Get()
if err != nil {
	fmt.Println("Got an error:", err)
}
fmt.Println("Stored procedure result:", data)

Closing the connection

The CloseGraceful() method can be used to close the connection when it is no longer needed:

// Close connection
conn.CloseGraceful()
fmt.Println("Connection is closed")

Note

You can find the example with all the requests above on GitHub: go.

Starting a client application

Execute the following go get commands to update dependencies in the go.mod file:

$ go get github.com/tarantool/go-tarantool/v2
$ go get github.com/tarantool/go-tarantool/v2/decimal
$ go get github.com/tarantool/go-tarantool/v2/uuid

To run the resulting application, execute the go run command in the application directory:

$ go run .
Inserted tuples:
[[1 Roxette 1986]]
[[2 Scorpions 1965]]
[[3 Ace of Base 1987]]
[[4 The Beatles 1960]]
Tuple selected by the primary key value: [[1 Roxette 1986]]
Tuple selected by the secondary key value: [[4 The Beatles 1960]]
Updated tuple: [[2 Pink Floyd 1965]]
Replaced tuple: [[1 Queen 1970]]
Deleted tuple: [[5 The Rolling Stones 1962]]
Stored procedure result: [[[2 Pink Floyd 1965] [4 The Beatles 1960]]]
Connection is closed

Community Go connectors

Last update: January 2023

There are also the following community-driven Go connectors:

viciious/go-tarantool
FZambia/tarantool

The table below contains a feature comparison for the connectors mentioned above.

	tarantool/go-tarantool	viciious/go-tarantool	FZambia/tarantool
License	BSD 2-Clause	MIT	BSD 2-Clause
Last update	2023	2022	2022
Documentation	README with examples and up-to-date GoDoc	README with examples, code comments	README with examples
Testing / CI / CD	GitHub Actions	Travis CI	GitHub Actions
GitHub Stars	147	45	14
Static analysis	golangci-lint, luacheck	golint	golangci-lint
Packaging	go get	go get	go get
Code coverage	Yes	No	No
msgpack driver	vmihailenco/msgpack/v2 or vmihailenco/msgpack/v5	tinylib/msgp	vmihailenco/msgpack/v5
Async work	Yes	Yes	Yes
Schema reload	Yes (manual pull)	Yes (manual pull)	Yes (manual pull)
Space / index names	Yes	Yes	Yes
Tuples as structures	Yes (structure and marshall functions must be predefined in Go code)	No	Yes (structure and marshall functions must be predefined in Go code)
Access tuple fields by names	Only if marshalled to structure	No	Only if marshalled to structure
SQL support	Yes	No (#18, closed)	No
Interactive transactions	Yes	No	No
Varbinary support	Yes (with in-built language tools)	Yes (with in-built language tools)	Yes (decodes to string by default, see #6)
UUID support	Yes	No	No
Decimal support	Yes	No	No
EXT_ERROR support	Yes	No	No
Datetime support	Yes	No	No
box.session.push() responses	Yes	No (#21)	Yes
Session settings	Yes	No	No
Graceful shutdown	Yes	No	No
IPROTO_ID (feature discovering)	Yes	No	No
tarantool/crud support	No	No	No
Connection pool	Yes (round-robin failover, no balancing)	No	No
Transparent reconnecting	Yes (see comments in #129)	No (handle reconnects explicitly, refer to #11)	Yes (see comments in #7)
Transparent request retrying	No	No	No
Watchers	Yes	No	No
Pagination	Yes	No	No
Language features	context	context	context
Miscellaneous	Supports tarantool/queue API	Can mimic a Tarantool instance (also as replica). Provides instrumentation for reading snapshot and xlog files via snapio module. Implements unpacking of query structs if you want to implement your own iproto proxy	API is experimental and breaking changes may happen

Java

The following Java connectors are available:

tarantool-java-SDK is the primary connector that is actively being developed and maintained. This connector supports current Tarantool versions and features. The Spring Data and Testcontainers integrations are implemented as modules within the Tarantool Java SDK project.

Separate documentation is available for each major and minor Tarantool Java SDK version, for example: https://tarantool.github.io/tarantool-java-sdk/1.5.*/.

For the most up-to-date documentation (including the latest changes and features), please refer to the dev version: https://tarantool.github.io/tarantool-java-sdk/dev/.

Note

The connectors below are either deprecated or are planned for deprecation.

cartridge-java supports both single Tarantool nodes and clusters, as well as applications built using the Cartridge framework and its modules. The Tarantool team actively updates this module with the newest Tarantool features.
tarantool-java works with early Tarantool versions (1.6 and later) and offers JDBC interface support for single Tarantool nodes. This module isn’t currently maintained and does not support the newest 2.x Tarantool features or Tarantool clusters.

C

tarantool-c is the official C connector for Tarantool. You can find the full library documentation here: Documentation for tarantool-c.

Here follow two examples of using Tarantool’s high-level C API.

Example 1

Here is a complete C program that inserts [99999,'B'] into space examples via the high-level C API.

#include <stdio.h>
#include <stdlib.h>

#include <tarantool/tarantool.h>
#include <tarantool/tnt_net.h>
#include <tarantool/tnt_opt.h>

int main(void) {
   struct tnt_stream *tnt = tnt_net(NULL);          /* See note = SETUP */
   tnt_set(tnt, TNT_OPT_URI, "localhost:3301");
   if (tnt_connect(tnt) < 0) {                      /* See note = CONNECT */
       printf("Connection refused\n");
       exit(-1);
   }
   struct tnt_stream *tuple = tnt_object(NULL);     /* See note = MAKE REQUEST */
   tnt_object_format(tuple, "[%d%s]", 99999, "B");
   tnt_insert(tnt, 999, tuple);                     /* See note = SEND REQUEST */
   tnt_flush(tnt);
   struct tnt_reply reply;  tnt_reply_init(&reply); /* See note = GET REPLY */
   tnt->read_reply(tnt, &reply);
   if (reply.code != 0) {
       printf("Insert failed %lu.\n", reply.code);
   }
   tnt_close(tnt);                                  /* See note = TEARDOWN */
   tnt_stream_free(tuple);
   tnt_stream_free(tnt);
}

Paste the code into a file named example.c and install tarantool-c. One way to install tarantool-c (using Ubuntu) is:

$ git clone https://github.com/tarantool/tarantool-c.git ~/tarantool-c
$ cd ~/tarantool-c
$ git submodule init
$ git submodule update
$ cmake .
$ make
$ make install

To compile and link the program, run:

$ # sometimes this is necessary:
$ export LD_LIBRARY_PATH=/usr/local/lib
$ gcc -o example example.c -ltarantool

Before running the program, check that a server instance is listening at localhost:3301 and that the space examples exists, as described earlier. To run the program, enter ./example. The program connects to the Tarantool instance and sends the request. If Tarantool is not listening on localhost:3301, the program prints “Connection refused”. If the insert fails, the program prints “Insert failed” and an error number (see all error codes in the source file /src/box/errcode.h).

Here are notes corresponding to comments in the example program.

SETUP

The setup begins by creating a stream.

struct tnt_stream *tnt = tnt_net(NULL);
tnt_set(tnt, TNT_OPT_URI, "localhost:3301");

In this program, the stream will be named tnt. Before connecting on the tnt stream, some options may have to be set. The most important option is TNT_OPT_URI. In this program, the URI is localhost:3301, since that is where the Tarantool instance is supposed to be listening.

Function description:

struct tnt_stream *tnt_net(struct tnt_stream *s)
int tnt_set(struct tnt_stream *s, int option, variant option-value)

CONNECT

Now that the stream named tnt exists and is associated with a URI, this example program can connect to a server instance.

if (tnt_connect(tnt) < 0)
   { printf("Connection refused\n"); exit(-1); }

Function description:

int tnt_connect(struct tnt_stream *s)

The connection might fail for a variety of reasons, such as: the server is not running, or the URI contains an invalid password. If the connection fails, the return value will be -1.

MAKE REQUEST

Most requests require passing a structured value, such as the contents of a tuple.

struct tnt_stream *tuple = tnt_object(NULL);
tnt_object_format(tuple, "[%d%s]", 99999, "B");

In this program, the request will be an INSERT, and the tuple contents will be an integer and a string. This is a simple serial set of values, that is, there are no sub-structures or arrays. Therefore it is easy in this case to format what will be passed using the same sort of arguments that one would use with a C printf() function: %d for the integer, %s for the string, then the integer value, then a pointer to the string value.

Function description:

ssize_t tnt_object_format(struct tnt_stream *s, const char *fmt, ...)
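
To make the format string more concrete, the following Python sketch hand-encodes the same two-field tuple into MessagePack, the format that requests and tuples are serialized in. It is an illustration only, not part of tarantool-c: the helper names (pack_uint, pack_str, pack_array) are hypothetical, only unsigned integers and short strings are covered, and it assumes that a non-negative %d argument is stored in the smallest unsigned MessagePack encoding.

```python
import struct


def pack_uint(value):
    """Encode a non-negative integer in the smallest MessagePack form."""
    if value < 0:
        raise ValueError("only unsigned values are handled in this sketch")
    if value <= 0x7F:
        return bytes([value])                      # positive fixint
    if value <= 0xFF:
        return b"\xcc" + struct.pack(">B", value)  # uint 8
    if value <= 0xFFFF:
        return b"\xcd" + struct.pack(">H", value)  # uint 16
    if value <= 0xFFFFFFFF:
        return b"\xce" + struct.pack(">I", value)  # uint 32
    return b"\xcf" + struct.pack(">Q", value)      # uint 64


def pack_str(text):
    """Encode a short string (fixstr, at most 31 bytes)."""
    raw = text.encode("utf-8")
    if len(raw) > 31:
        raise ValueError("only fixstr is handled in this sketch")
    return bytes([0xA0 | len(raw)]) + raw


def pack_array(encoded_items):
    """Wrap already encoded items in a fixarray header (at most 15 items)."""
    if len(encoded_items) > 15:
        raise ValueError("only fixarray is handled in this sketch")
    return bytes([0x90 | len(encoded_items)]) + b"".join(encoded_items)


# The tuple built by tnt_object_format(tuple, "[%d%s]", 99999, "B")
encoded = pack_array([pack_uint(99999), pack_str("B")])
print(encoded.hex())  # 92ce0001869fa142
```

The first byte (0x92) is the array header for two elements, the next five bytes hold 99999 as a 32-bit unsigned integer, and the last two bytes hold the one-character string.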

SEND REQUEST

The database-manipulation requests are analogous to the requests in the box library.

tnt_insert(tnt, 999, tuple);
tnt_flush(tnt);

In this program, the choice is to do an INSERT request, so the program passes the tnt_stream that was used for connection (tnt) and the tnt_stream that was set up with tnt_object_format() (tuple).

Function description:

ssize_t tnt_insert(struct tnt_stream *s, uint32_t space, struct tnt_stream *tuple)
ssize_t tnt_replace(struct tnt_stream *s, uint32_t space, struct tnt_stream *tuple)
ssize_t tnt_select(struct tnt_stream *s, uint32_t space, uint32_t index,
                   uint32_t limit, uint32_t offset, uint8_t iterator,
                   struct tnt_stream *key)
ssize_t tnt_update(struct tnt_stream *s, uint32_t space, uint32_t index,
                   struct tnt_stream *key, struct tnt_stream *ops)

GET REPLY

For most requests, the client will receive a reply containing some indication whether the result was successful, and a set of tuples.

struct tnt_reply reply;  tnt_reply_init(&reply);
tnt->read_reply(tnt, &reply);
if (reply.code != 0)
   { printf("Insert failed %lu.\n", reply.code); }

This program checks for success but does not decode the rest of the reply.

Function description:

struct tnt_reply *tnt_reply_init(struct tnt_reply *r)
tnt->read_reply(struct tnt_stream *s, struct tnt_reply *r)
void tnt_reply_free(struct tnt_reply *r)

TEARDOWN

When a session ends, the connection that was made with tnt_connect() should be closed, and the objects that were made in the setup should be destroyed.

tnt_close(tnt);
tnt_stream_free(tuple);
tnt_stream_free(tnt);

Function description:

void tnt_close(struct tnt_stream *s)
void tnt_stream_free(struct tnt_stream *s)

Example 2

Here is a complete C program that selects, using index key [99999], from space examples via the high-level C API. To display the results, the program uses functions in the MsgPuck library which allow decoding of MessagePack arrays.

#include <stdio.h>
#include <stdlib.h>
#include <tarantool/tarantool.h>
#include <tarantool/tnt_net.h>
#include <tarantool/tnt_opt.h>

#define MP_SOURCE 1
#include <msgpuck.h>

int main(void) {
    struct tnt_stream *tnt = tnt_net(NULL);
    tnt_set(tnt, TNT_OPT_URI, "localhost:3301");
    if (tnt_connect(tnt) < 0) {
        printf("Connection refused\n");
        exit(1);
    }
    struct tnt_stream *tuple = tnt_object(NULL);
    tnt_object_format(tuple, "[%d]", 99999); /* tuple = search key */
    tnt_select(tnt, 999, 0, UINT32_MAX, 0, 0, tuple);
    tnt_flush(tnt);
    struct tnt_reply reply; tnt_reply_init(&reply);
    tnt->read_reply(tnt, &reply);
    if (reply.code != 0) {
        printf("Select failed.\n");
        exit(1);
    }
    char field_type;
    field_type = mp_typeof(*reply.data);
    if (field_type != MP_ARRAY) {
        printf("no tuple array\n");
        exit(1);
    }
    uint32_t tuple_count = mp_decode_array(&reply.data);
    printf("tuple count=%u\n", tuple_count);
    unsigned int i, j;
    for (i = 0; i < tuple_count; ++i) {
        field_type = mp_typeof(*reply.data);
        if (field_type != MP_ARRAY) {
            printf("no field array\n");
            exit(1);
        }
        uint32_t field_count = mp_decode_array(&reply.data);
        printf("  field count=%u\n", field_count);
        for (j = 0; j < field_count; ++j) {
            field_type = mp_typeof(*reply.data);
            if (field_type == MP_UINT) {
                uint64_t num_value = mp_decode_uint(&reply.data);
                printf("    value=%lu.\n", num_value);
            } else if (field_type == MP_STR) {
                const char *str_value;
                uint32_t str_value_length;
                str_value = mp_decode_str(&reply.data, &str_value_length);
                /* The precision argument of %.*s must be an int. */
                printf("    value=%.*s.\n", (int)str_value_length, str_value);
            } else {
                printf("wrong field type\n");
                exit(1);
            }
        }
    }
    tnt_close(tnt);
    tnt_stream_free(tuple);
    tnt_stream_free(tnt);
}

Similarly to the first example, paste the code into a file named example2.c.

To compile and link the program, run:

$ gcc -o example2 example2.c -ltarantool

To run the program, enter ./example2.
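
The nested loops in Example 2 may be easier to follow in a compact form. The Python sketch below mirrors the same walk over the reply body: an outer array of tuples, each tuple an array of fields. It is a hand-written illustration that handles only the MessagePack types the C program accepts (arrays, unsigned integers, short strings). The function names are hypothetical, and the sample bytes are constructed by hand rather than captured from a server.

```python
import struct

UINT_FORMATS = {0xCC: ">B", 0xCD: ">H", 0xCE: ">I", 0xCF: ">Q"}


def decode_array_header(buf, pos):
    """Return (element_count, next_position) for a fixarray or array 16."""
    tag = buf[pos]
    if 0x90 <= tag <= 0x9F:
        return tag & 0x0F, pos + 1
    if tag == 0xDC:
        return struct.unpack_from(">H", buf, pos + 1)[0], pos + 3
    raise ValueError("not an array")


def decode_field(buf, pos):
    """Return (value, next_position) for an unsigned integer or a fixstr."""
    tag = buf[pos]
    if tag <= 0x7F:                      # positive fixint
        return tag, pos + 1
    if tag in UINT_FORMATS:              # uint 8 / 16 / 32 / 64
        fmt = UINT_FORMATS[tag]
        value = struct.unpack_from(fmt, buf, pos + 1)[0]
        return value, pos + 1 + struct.calcsize(fmt)
    if 0xA0 <= tag <= 0xBF:              # fixstr
        start = pos + 1
        end = start + (tag & 0x1F)
        return buf[start:end].decode("utf-8"), end
    raise ValueError("wrong field type")


def decode_tuples(buf):
    """Decode an array of tuples, like the two nested loops in Example 2."""
    tuple_count, pos = decode_array_header(buf, 0)
    rows = []
    for _ in range(tuple_count):
        field_count, pos = decode_array_header(buf, pos)
        row = []
        for _ in range(field_count):
            value, pos = decode_field(buf, pos)
            row.append(value)
        rows.append(row)
    return rows


# One tuple [99999, 'B'] wrapped in an outer one-element array.
reply_data = bytes.fromhex("91" "92ce0001869fa142")
print(decode_tuples(reply_data))  # [[99999, 'B']]
```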

The two example programs only show a few requests and do not show all that’s necessary for good practice. See more in the tarantool-c documentation at GitHub.

Python

Examples on GitHub: sample_db, python

tarantool-python is the official Python connector for Tarantool. It is not supplied as part of the Tarantool repository and must be installed separately.

The tutorial shows how to use the tarantool-python library to create a Python script that connects to a remote Tarantool instance, performs CRUD operations, and executes a stored procedure. You can find the full package documentation here: Python client library for Tarantool.

Note

This tutorial shows how to make CRUD requests to a single-instance Tarantool database. To make requests to a sharded Tarantool cluster with the CRUD module, use the tarantool.crud module’s API.

Sample database configuration

This section describes the configuration of a sample database that allows remote connections:

credentials:
  users:
    sampleuser:
      password: '123456'
      privileges:
      - permissions: [ read, write ]
        spaces: [ bands ]
      - permissions: [ execute ]
        functions: [ get_bands_older_than ]

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

app:
  file: 'myapp.lua'

The configuration contains one instance that listens for incoming requests on the 127.0.0.1:3301 address.
sampleuser has privileges to select and modify data in the bands space and execute the get_bands_older_than stored function. This user can be used to connect to the instance remotely.
myapp.lua defines the data model and a stored function.

The myapp.lua file looks as follows:

-- Create a space --
box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create indexes --
box.space.bands:create_index('primary', { parts = { 'id' } })
box.space.bands:create_index('band', { parts = { 'band_name' } })
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

-- Create a stored function --
box.schema.func.create('get_bands_older_than', {
    body = [[
    function(year)
        return box.space.bands.index.year_band:select({ year }, { iterator = 'LT', limit = 10 })
    end
    ]]
})

You can find the full example on GitHub: sample_db.
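
To see what the stored function returns without starting a server, the following pure-Python sketch imitates its select call. It assumes that the LT iterator with a partial key walks the year_band index in descending order and returns only tuples whose year is strictly less than the argument. The sample tuples and the Python function are illustrative stand-ins, not part of the sample_db application.

```python
def get_bands_older_than(bands, year, limit=10):
    """Imitate index.year_band:select({year}, {iterator='LT', limit=10})."""
    # The year_band index orders tuples by (year, band_name), ascending.
    ordered = sorted(bands, key=lambda band: (band[2], band[1]))
    # LT: walk the index backwards, keeping keys strictly less than `year`.
    older = [band for band in reversed(ordered) if band[2] < year]
    return older[:limit]


# Hypothetical sample tuples in the (id, band_name, year) format.
bands = [
    (1, 'Roxette', 1986),
    (2, 'Scorpions', 1965),
    (3, 'Ace of Base', 1987),
    (4, 'The Beatles', 1960),
]

print(get_bands_older_than(bands, 1966))
# [(2, 'Scorpions', 1965), (4, 'The Beatles', 1960)]
```

Note that the result is ordered from the newest matching band to the oldest, because the iterator moves backwards through the index.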

Starting a sample database application

Before creating and starting a client Python application, you need to run the sample_db application using tt start:

$ tt start sample_db

Now you can create a client Python application that makes requests to this database.

Developing a client application

Before you start, make sure you have Python installed on your computer.

Creating an application

Create the hello directory for your application and go to this directory:

$ mkdir hello
$ cd hello

Create and activate a Python virtual environment:

$ python -m venv .venv
$ source .venv/bin/activate

Install the tarantool module:

$ pip install tarantool

Inside the hello directory, create the hello.py file for application code.

Importing ‘tarantool’

In the hello.py file, import the tarantool package:

import tarantool

Connecting to the database

Add the following code:

# Connect to the database
conn = tarantool.Connection(host='127.0.0.1',
                            port=3301,
                            user='sampleuser',
                            password='123456')

This code establishes a connection to a running Tarantool instance on behalf of sampleuser. The conn object can be used to make CRUD requests and execute stored procedures.

Manipulating data

Inserting data

Add the following code to insert four tuples into the bands space:

# Insert data
tuples = [(1, 'Roxette', 1986),
          (2, 'Scorpions', 1965),
          (3, 'Ace of Base', 1987),
          (4, 'The Beatles', 1960)]
print("Inserted tuples:")
for band in tuples:  # avoid shadowing the built-in tuple type
    response = conn.insert(space_name='bands', values=band)
    print(response[0])

Connection.insert() is used to insert a tuple into the space.

Querying data

To get a tuple by the specified primary key value, use Connection.select():

# Select by primary key
response = conn.select(space_name='bands', key=1)
print('Tuple selected by the primary key value:', response[0])

You can also get a tuple by the value of the specified index using the index argument:

# Select by secondary key
response = conn.select(space_name='bands', key='The Beatles', index='band')
print('Tuple selected by the secondary key value:', response[0])

Updating data

Connection.update() can be used to update a tuple identified by the primary key as follows:

# Update
response = conn.update(space_name='bands',
                       key=2,
                       op_list=[('=', 'band_name', 'Pink Floyd')])
print('Updated tuple:', response[0])

Connection.upsert() updates an existing tuple or inserts a new one. In the example below, a new tuple is inserted:

# Upsert
conn.upsert(space_name='bands',
            tuple_value=(5, 'The Rolling Stones', 1962),
            op_list=[('=', 'band_name', 'The Doors')])
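
The two branches of an upsert can be modeled without a database. The sketch below is a simplified, in-memory imitation and not the connector's implementation: the dictionary stands in for the bands space keyed by primary key, only the '=' operation is supported, and the helper names are hypothetical. It shows that op_list is ignored when the tuple is new and applied when a tuple with the same primary key already exists.

```python
FIELD_NAMES = ('id', 'band_name', 'year')


def apply_ops(row, op_list):
    """Apply ('=', field_name, value) assignments to a copy of the tuple."""
    updated = list(row)
    for op, field, value in op_list:
        if op != '=':
            raise NotImplementedError("only '=' is modeled in this sketch")
        updated[FIELD_NAMES.index(field)] = value
    return tuple(updated)


def upsert(space, tuple_value, op_list):
    """Insert tuple_value if its key is new, otherwise apply op_list."""
    key = tuple_value[0]
    if key in space:
        space[key] = apply_ops(space[key], op_list)
    else:
        space[key] = tuple(tuple_value)


space = {}  # stand-in for the bands space
ops = [('=', 'band_name', 'The Doors')]

upsert(space, (5, 'The Rolling Stones', 1962), ops)
after_first = space[5]
print(after_first)   # (5, 'The Rolling Stones', 1962)

upsert(space, (5, 'The Rolling Stones', 1962), ops)
after_second = space[5]
print(after_second)  # (5, 'The Doors', 1962)
```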

In this example, Connection.replace() deletes the existing tuple and inserts a new one:

# Replace
response = conn.replace(space_name='bands', values=(1, 'Queen', 1970))
print('Replaced tuple:', response[0])

Deleting data

Connection.delete() in the example below deletes a tuple whose primary key value is 5:

# Delete
response = conn.delete(space_name='bands', key=5)
print('Deleted tuple:', response[0])

Executing stored procedures

To execute a stored procedure, use Connection.call():

# Call
response = conn.call('get_bands_older_than', 1966)
print('Stored procedure result:', response[0])

Closing the connection

The Connection.close() method can be used to close the connection when it is no longer needed:

# Close connection
conn.close()
print('Connection is closed')

Note

You can find the example with all the requests above on GitHub: python.

Starting a client application

To run the resulting application, pass the script name to the python command:

$ python hello.py
Inserted tuples:
[1, 'Roxette', 1986]
[2, 'Scorpions', 1965]
[3, 'Ace of Base', 1987]
[4, 'The Beatles', 1960]
Tuple selected by the primary key value: [1, 'Roxette', 1986]
Tuple selected by the secondary key value: [4, 'The Beatles', 1960]
Updated tuple: [2, 'Pink Floyd', 1965]
Replaced tuple: [1, 'Queen', 1970]
Deleted tuple: [5, 'The Rolling Stones', 1962]
Stored procedure result: [[2, 'Pink Floyd', 1965], [4, 'The Beatles', 1960]]
Connection is closed

Community Python connectors

Last update: September 2023

There are also several community-driven Python connectors:

asynctnt with asyncio support
aiotarantool also with asyncio support, no active maintenance
gtarantool with gevent support, no active maintenance

The table below contains a feature comparison for asynctnt and tarantool-python.

Parameter	igorcoding/asynctnt	tarantool/tarantool-python
License	Apache License 2.0	BSD-2
Is maintained	Yes	Yes
Known Issues	None	None
Documentation	Yes (github.io)	Yes (readthedocs and tarantool.io)
Testing / CI / CD	GitHub Actions	GitHub Actions
GitHub Stars	73	92
Static Analysis	Yes (Flake8)	Yes (Flake8, Pylint)
Packaging	pip	pip, deb, rpm
Code coverage	Yes	Yes
Support asynchronous mode	Yes, asyncio	No
Batching support	No	Yes (with CRUD API)
Schema reload	Yes (automatically, see auto_refetch_schema)	Yes (automatically)
Space / index names	Yes	Yes
Access tuple fields by names	Yes	No
SQL support	Yes	Yes
Interactive transactions	Yes	No (issue #163)
Varbinary support	Yes (in `MP_BIN` fields)	Yes
Decimal support	Yes	Yes
UUID support	Yes	Yes
EXT_ERROR support	Yes	Yes
Datetime support	Yes	Yes
Interval support	No (issue #30)	Yes
box.session.push() responses	Yes	Yes
Session settings	No	No
Graceful shutdown	No	No
IPROTO_ID (feature discovery)	Yes	Yes
CRUD support	No	Yes
Transparent request retrying	No	No
Transparent reconnecting	Autoreconnect	Yes (reconnect_max_attempts, reconnect_delay), checking of connection liveness
Connection pool	No	Yes (with master discovery)
Support of PEP 249 – Python Database API Specification v2.0	No	Yes
Encrypted connection (Enterprise Edition)	No (issue #22)	Yes

C++

tntcxx is the official C++ connector for Tarantool.

Connecting to Tarantool from C++

To help you get started with the Tarantool C++ connector, we will use the example application from the connector repository. We will go step by step through the application code and explain what each part does.

The following main topics are discussed in this manual:

Pre-requisites
Connecting to Tarantool
Working with requests
Building and launching C++ application
Decoding and reading the data

Pre-requisites

To go through this Getting Started exercise, complete the following pre-requisites:

clone the connector source code and make sure that Tarantool and the required third-party software are installed
start Tarantool and create a database
set up access rights.

Installation

The Tarantool C++ connector is currently supported for Linux only.

The connector itself is a header-only library, so, it doesn’t require installation and building as such. All you need is to clone the connector source code and embed it in your C++ project.

Also, make sure you have other necessary software and Tarantool installed.

Make sure you have the following third-party software. If any of the items are missing, install them:
- Git, a version control system
- unzip utility
- a gcc compiler that supports the C++17 standard
- cmake and make tools.
If you don’t have Tarantool on your OS, install it in one of the following ways:
- from a package (refer to the OS-specific instructions)
- from the source.

Clone the Tarantool C++ connector repository.

git clone git@github.com:tarantool/tntcxx.git

Starting Tarantool and creating a database

Start Tarantool locally or in Docker and create a space with the following schema and index:

box.cfg{listen = 3301}
t = box.schema.space.create('t')
t:format({
         {name = 'id', type = 'unsigned'},
         {name = 'a', type = 'string'},
         {name = 'b', type = 'number'}
         })
t:create_index('primary', {
         type = 'hash',
         parts = {'id'}
         })

Important

Do not close the terminal window where Tarantool is running. You will need it later to connect to Tarantool from your C++ application.

Setting up access rights

To be able to execute the necessary operations in Tarantool, you need to grant read and write rights to the guest user. The simplest way is to grant the user the super role:

box.schema.user.grant('guest', 'super')

Connecting to Tarantool

There are three main parts of the C++ connector: the IO-zero-copy buffer, the msgpack encoder/decoder, and the client that handles requests.

To set up a connection to a Tarantool instance from a C++ application, you need to do the following:

embed the connector into the application
instantiate a connector client and a connection object
define connection parameters and invoke the method to connect
define error handling behavior.

Embedding connector

Embed the connector in your C++ application by including the main header:

#include "../src/Client/Connector.hpp"

Instantiating objects

First, we should create a connector client. It can handle many connections to Tarantool instances asynchronously. To instantiate a client, you should specify the buffer and the network provider implementations as template parameters. The connector’s main class has the following signature:

template<class BUFFER, class NetProvider = EpollNetProvider<BUFFER>>
class Connector;

The buffer is parameterized by an allocator, which means that users can choose which allocator provides memory for the buffer’s blocks. Data is organized into a linked list of fixed-size blocks; the block size is specified as a template parameter of the buffer.
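
The idea of a buffer built from fixed-size blocks can be sketched in a few lines of Python. This is a conceptual model only and not the tntcxx implementation: the class name BlockBuffer is hypothetical, a Python list stands in for the linked list, and no custom allocator or zero-copy access is modeled. It shows how a write that does not fit into the current block spills over into newly allocated blocks.

```python
class BlockBuffer:
    """Toy buffer that stores data in a chain of fixed-size blocks."""

    def __init__(self, block_size=16 * 1024):
        self.block_size = block_size
        self.blocks = []        # stands in for the linked list of blocks
        self.used_in_last = 0   # bytes occupied in the last block

    def write(self, data):
        offset = 0
        while offset < len(data):
            # Allocate a new block when there is none or the last one is full.
            if not self.blocks or self.used_in_last == self.block_size:
                self.blocks.append(bytearray(self.block_size))
                self.used_in_last = 0
            room = self.block_size - self.used_in_last
            chunk = data[offset:offset + room]
            end = self.used_in_last + len(chunk)
            self.blocks[-1][self.used_in_last:end] = chunk
            self.used_in_last = end
            offset += len(chunk)

    def size(self):
        if not self.blocks:
            return 0
        return (len(self.blocks) - 1) * self.block_size + self.used_in_last

    def read_all(self):
        if not self.blocks:
            return b""
        full = b"".join(bytes(block) for block in self.blocks[:-1])
        return full + bytes(self.blocks[-1][:self.used_in_last])


buf = BlockBuffer(block_size=4)   # tiny blocks to make the chaining visible
buf.write(b"hello world")
print(len(buf.blocks), buf.size())  # 3 11
```

With a fixed block size, growing the buffer never requires copying the data that is already stored, which is one common reason to prefer a block chain over a single contiguous array.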

You can either implement your own buffer or network provider or use the default ones as we do in our example. So, the default connector instantiation looks as follows:

using Buf_t = tnt::Buffer<16 * 1024>;
#include "../src/Client/LibevNetProvider.hpp"
using Net_t = LibevNetProvider<Buf_t, DefaultStream>;

Connector<Buf_t, Net_t> client;

To use the BUFFER class, the buffer header should also be included:

#include "../src/Buffer/Buffer.hpp"

A client itself is not enough to work with Tarantool instances: we also need to create connection objects. A connection also takes the buffer and the network provider as template parameters. Note that they must be the same as those of the client:

Connection<Buf_t, Net_t> conn(client);

Connecting

Our Tarantool instance is listening on port 3301 on localhost. Let’s define the corresponding variables as well as the WAIT_TIMEOUT variable for connection timeout.

const char *address = "127.0.0.1";
int port = 3301;
int WAIT_TIMEOUT = 1000; //milliseconds

To connect to the Tarantool instance, we should invoke the Connector::connect() method of the client object and pass the connection instance together with the connection options, here the address and the port.

int rc = client.connect(conn, {.address = address,
			       .service = std::to_string(port),
			       /*.user = ...,*/
			       /*.passwd = ...,*/
			       /* .transport = STREAM_SSL, */});

Error handling

The connector implementation is exception-free, so we rely on return codes: on failure, the connect() method returns rc < 0. To get the error message corresponding to the last error that occurred during communication with the instance, we can invoke the Connection::getError() method.

if (rc != 0) {
	//assert(conn.getError().saved_errno != 0);
	std::cerr << conn.getError().msg << std::endl;
	return -1;
}

To reset connection after errors, that is, to clean up the error message and connection status, the Connection::reset() method is used.

Working with requests

In this section, we will show how to:

prepare different types of requests
send the requests
receive and handle responses.

We will also go through the case of having several connections and executing a number of requests from different connections simultaneously.

In our example C++ application, we execute the following types of requests:

ping
replace
select.

Note

Examples of other request types, namely, insert, delete, upsert, and update, will be added to this manual later.

Each request method returns a request ID that is a sort of future. This ID can be used to get the response message when it is ready. Requests are queued in the output buffer of connection until the Connector::wait() method is called.

Preparing requests

At this step, requests are encoded in the MessagePack format and saved in the output connection buffer. They are ready to be sent but the network communication itself will be done later.

Recall that the requests that manipulate data work with the Tarantool space t created earlier, and that the space has the following format:

t:format({
         {name = 'id', type = 'unsigned'},
         {name = 'a', type = 'string'},
         {name = 'b', type = 'number'}
         })

ping

rid_t ping = conn.ping();

replace

Equivalent to the Lua request <space_name>:replace({pk_value, "111", 1.01}).

uint32_t space_id = 512;
int pk_value = 666;
std::tuple data = std::make_tuple(pk_value /* field 1*/, "111" /* field 2*/, 1.01 /* field 3*/);
rid_t replace = conn.space[space_id].replace(data);

select

Equivalent to the Lua request <space_name>.index[0]:select({pk_value}, {limit = 1}).

uint32_t index_id = 0;
uint32_t limit = 1;
uint32_t offset = 0;
IteratorType iter = IteratorType::EQ;
auto i = conn.space[space_id].index[index_id];
rid_t select = i.select(std::make_tuple(pk_value), limit, offset, iter);

Sending requests

To send requests to the server side, invoke the client.wait() method.

client.wait(conn, ping, WAIT_TIMEOUT);

The wait() method takes the connection to poll, the request ID, and, optionally, the timeout as parameters. Once a response for the specified request is ready, wait() returns. It returns a negative code on system-related failures, for example, a broken or timed-out connection. If wait() returns 0, a response has been received and is ready to be parsed.

Now let’s send our requests to the Tarantool instance. The futureIsReady() function checks availability of a future and returns true or false.

while (! conn.futureIsReady(ping)) {
	/*
	 * wait() is the main function responsible for sending/receiving
	 * requests and implements event-loop under the hood. It may
	 * fail due to several reasons:
	 *  - connection is timed out;
	 *  - connection is broken (e.g. closed);
	 *  - epoll is failed.
	 */
	if (client.wait(conn, ping, WAIT_TIMEOUT) != 0) {
		std::cerr << conn.getError().msg << std::endl;
		conn.reset();
	}
}

Receiving responses

To get the response when it is ready, use the Connection::getResponse() method. It takes the request ID and returns an optional object containing the response. If the response is not ready yet, the method returns std::nullopt. Note that on each future, getResponse() can be called only once: it erases the request ID from the internal map once it is returned to a user.
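
The life cycle of a request ID described so far (queue the request, let wait() do the network exchange, take the response exactly once) can be modeled with a small in-memory sketch. The Python classes below are hypothetical stand-ins written for illustration and do not reproduce the tntcxx API: there is no network, and the toy wait() simply answers every queued request immediately.

```python
import itertools


class ToyConnection:
    """Keeps queued requests and ready responses, keyed by request ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.outbox = []   # requests encoded but not yet sent
        self.ready = {}    # request ID -> response

    def ping(self):
        rid = next(self._ids)
        self.outbox.append((rid, 'ping'))  # queued only, nothing is sent yet
        return rid

    def future_is_ready(self, rid):
        return rid in self.ready

    def get_response(self, rid):
        # The entry is erased once returned, so a second call yields None.
        return self.ready.pop(rid, None)


class ToyClient:
    def wait(self, conn, rid):
        """Flush the queue; the stand-in server replies at once."""
        for sent_rid, request in conn.outbox:
            conn.ready[sent_rid] = {'request': request, 'code': 0}
        conn.outbox.clear()
        return 0 if conn.future_is_ready(rid) else -1


client = ToyClient()
conn = ToyConnection()
f1 = conn.ping()
ready_before_wait = conn.future_is_ready(f1)   # False: still in the outbox
rc = client.wait(conn, f1)
first_take = conn.get_response(f1)             # the response
second_take = conn.get_response(f1)            # None: already consumed
print(ready_before_wait, rc, first_take, second_take)
```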

A response consists of a header and a body (response.header and response.body). Depending on whether the request succeeded on the server side, body may contain either runtime errors, accessible through response.body.error_stack, or data (tuples), accessible through response.body.data. In turn, data is a vector of tuples. However, tuples are not decoded and come in the form of pointers to the start and the end of msgpacks. See the “Decoding and reading the data” section to understand how to decode tuples.

With a single connection, there are two ways to receive responses: we can either wait for one specific future or for all of them at once. We’ll try both options in our example. For the ping request, let’s use the first option.

std::optional<Response<Buf_t>> response = conn.getResponse(ping);
/*
 * Since conn.futureIsReady(ping) returned <true>, then response
 * must be ready.
 */
assert(response != std::nullopt);
/*
 * If request is successfully executed on server side, response
 * will contain data (i.e. tuple being replaced in case of :replace()
 * request or tuples satisfying search conditions in case of :select();
 * responses for pings contain nothing - empty map).
 * To tell responses containing data from error responses, one can
 * rely on response code storing in the header or check
 * Response->body.data and Response->body.error_stack members.
 */
printResponse<Buf_t>(*response);

For the replace and select requests, let’s examine the option of waiting for both futures at once.

/* Let's wait for both futures at once. */
std::vector<rid_t> futures(2);
futures[0] = replace;
futures[1] = select;
/* No specified timeout means that we poll futures until they are ready.*/
client.waitAll(conn, futures);
for (size_t i = 0; i < futures.size(); ++i) {
	assert(conn.futureIsReady(futures[i]));
	response = conn.getResponse(futures[i]);
	assert(response != std::nullopt);
	printResponse<Buf_t>(*response);
}

Several connections at once

Now, let’s have a look at the case when we establish two connections to a Tarantool instance simultaneously.

/* Let's create another connection. */
Connection<Buf_t, Net_t> another(client);
if (client.connect(another, {.address = address,
			     .service = std::to_string(port),
			     /* .transport = STREAM_SSL, */}) != 0) {
	/* Report the error of the connection that failed to connect. */
	std::cerr << another.getError().msg << std::endl;
	return -1;
}
/* Simultaneously execute two requests from different connections. */
rid_t f1 = conn.ping();
rid_t f2 = another.ping();
/*
 * waitAny() returns the first connection received response.
 * All connections registered via :connect() call are participating.
 */
std::optional<Connection<Buf_t, Net_t>> conn_opt = client.waitAny(WAIT_TIMEOUT);
Connection<Buf_t, Net_t> first = *conn_opt;
if (first == conn) {
	assert(conn.futureIsReady(f1));
	(void) f1;
} else {
	assert(another.futureIsReady(f2));
	(void) f2;
}

Closing connections

Finally, a user is responsible for closing connections.

client.close(conn);
client.close(another);

Building and launching C++ application

Now, we are going to build our example C++ application, launch it to connect to the Tarantool instance and execute all the requests defined.

Make sure you are in the root directory of the cloned C++ connector repository. To build the example application:

cd examples
cmake .
make

Make sure the Tarantool session you started earlier is running. Launch the application:

./Simple

As you can see from the execution log, all the connections to Tarantool defined in our application have been established and all the requests have been executed successfully.

Decoding and reading the data

Responses from a Tarantool instance contain raw data, that is, data encoded as MessagePack tuples. To decode this data, you have to write your own decoders (readers) based on the database schema and include them in your application:

#include "Reader.hpp"

To show the logic of decoding a response, we will use the reader from our example.

First, the structure corresponding to our example space format is defined:

/**
 * Corresponds to tuples stored in user's space:
 * box.execute("CREATE TABLE t (id UNSIGNED PRIMARY KEY, a TEXT, d DOUBLE);")
 */
struct UserTuple {
	uint64_t field1;
	std::string field2;
	double field3;

	static constexpr auto mpp = std::make_tuple(
		&UserTuple::field1, &UserTuple::field2, &UserTuple::field3);
};

Base reader prototype

Prototype of the base reader is given in src/mpp/Dec.hpp:

template <class BUFFER, Type TYPE>
struct SimpleReaderBase : DefaultErrorHandler {
   using BufferIterator_t = typename BUFFER::iterator;
   /* Allowed type of values to be parsed. */
   static constexpr Type VALID_TYPES = TYPE;
   BufferIterator_t* StoreEndIterator() { return nullptr; }
};

Every new reader should inherit from it or directly from the DefaultErrorHandler.

Parsing values

To parse a particular value, we should define the Value() method. The first two arguments of the method are common and, as a rule, unused, but the third one defines the parsed value. In the case of POD (Plain Old Data) structures, it’s enough to provide a byte-to-byte copy. Since there are fields of three different types in our schema, the example reader defines a corresponding Value() function for each of them (see Reader.hpp in the example application).

Parsing array

It’s also important to understand that a tuple itself is wrapped in an array, so, in fact, we should parse the array first. Let’s define another reader for that purpose.

Setting reader

The SetReader() method sets the reader that is invoked while each of the array’s entries is parsed. To make the two readers defined above work, we should create a decoder, set its iterator to the position of the encoded tuple, and invoke the Read() method (see the example application for the complete code).

C++ connector API

The official C++ connector for Tarantool is located in the tarantool/tntcxx repository.

It is not supplied as part of the Tarantool repository and requires additional actions for usage. The connector itself is a header-only library and, as such, doesn’t require installation and building. All you need is to clone the connector source code and embed it in your C++ project. See the C++ connector Getting started document for details and examples.

Below is the description of the connector public API.

Connector class
- Public methods
Connection class

Connector class

template<class BUFFER, class NetProvider = EpollNetProvider<BUFFER>> class Connector

The Connector class is a template class that defines a connector client which can handle many connections to Tarantool instances asynchronously.

To instantiate a client, you should specify the buffer and the network provider implementations as template parameters. You can either implement your own buffer or network provider or use the default ones.

The default connector instantiation looks as follows:

using Buf_t = tnt::Buffer<16 * 1024>;
using Net_t = EpollNetProvider<Buf_t >;
Connector<Buf_t, Net_t> client;

int connect(Connection<BUFFER, NetProvider> &conn, const std::string_view &addr, unsigned port, size_t timeout = DEFAULT_CONNECT_TIMEOUT)

Connects to a Tarantool instance that is listening on addr:port. On successful connection, the method returns 0. If the host doesn’t reply within the timeout period or another error occurs, it returns -1. Then, Connection.getError() gives the error message.

Parameters:	conn – object of the Connection class. addr – address of the host where a Tarantool instance is running. port – port that a Tarantool instance is listening on. timeout – connection timeout, seconds. Optional. Defaults to `2`.
Returns:	`0` on success, or `-1` otherwise.
Rtype:	int

Possible errors:

connection timeout
refused to connect (due to incorrect address or/and port)
system errors: a socket can’t be created; failure of any of the system calls (fcntl, select, send, receive).

Example:

using Buf_t = tnt::Buffer<16 * 1024>;
using Net_t = EpollNetProvider<Buf_t >;

Connector<Buf_t, Net_t> client;
Connection<Buf_t, Net_t> conn(client);

int rc = client.connect(conn, "127.0.0.1", 3301);

int wait(Connection<BUFFER, NetProvider> &conn, rid_t future, int timeout = 0)

The main method responsible for sending a request and checking the response readiness.

You should prepare a request beforehand by using the necessary method of the Connection class, such as ping(), which encodes the request in the MessagePack format and saves it in the output connection buffer.

wait() sends the request and polls the future for response readiness. Once the response is ready, wait() returns 0. If the response isn’t ready when the timeout expires, or if another error occurs, it returns -1. Then, Connection.getError() gives the error message. timeout = 0 means the method polls the future until the response is ready.

Parameters:	conn – object of the Connection class. future – request ID returned by a request method of the Connection class, such as ping(). timeout – waiting timeout, milliseconds. Optional. Defaults to `0`.
Returns:	`0` on receiving a response, or `-1` otherwise.
Rtype:	int

Possible errors:

timeout exceeded
other possible errors depend on the network provider used. If the EpollNetProvider is used, a failure of the poll, read, or write system call leads to a system error, such as EBADF, ENOTSOCK, EFAULT, EINVAL, EPIPE, or ENOTCONN (EWOULDBLOCK and EAGAIN don’t occur in this case).

Example:

client.wait(conn, ping, WAIT_TIMEOUT);
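
The timeout convention described above (0 means keep polling, any other value is a deadline in milliseconds) can be sketched without the connector. The following is a simplified model, not the tntcxx implementation: poll_until_ready and its ready callback are hypothetical names introduced only for illustration, and the loop spins instead of blocking in a system call as a real network provider would.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical stand-in for the wait() contract: returns 0 once `ready`
// reports true, or -1 if `timeout_ms` elapses first.
// timeout_ms == 0 means "no deadline", mirroring the documented default.
int poll_until_ready(const std::function<bool()> &ready, int timeout_ms = 0)
{
    assert(timeout_ms >= 0);
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + std::chrono::milliseconds(timeout_ms);
    while (true) {
        if (ready())
            return 0;
        if (timeout_ms != 0 && clock::now() >= deadline)
            return -1;
    }
}
```

A caller would check the return value in the same way as with wait(): 0 means the response can be fetched, -1 means the error message should be inspected.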

void waitAll(Connection<BUFFER, NetProvider> &conn, rid_t *futures, size_t future_count, int timeout = 0)

Similar to wait(), the method sends the prepared requests and checks the response readiness, but it can handle several different requests whose IDs are stored in the futures array. Exceeding the timeout leads to an error; Connection.getError() gives the error message. timeout = 0 means the method polls the futures until all the responses are ready.

Parameters:	conn – object of the Connection class. futures – array with the request IDs returned by request methods of the Connection class, such as ping(). future_count – size of the `futures` array. timeout – waiting timeout, milliseconds. Optional. Defaults to `0`.
Returns:	none
Rtype:	none

Possible errors:

timeout exceeded
other possible errors depend on the network provider used. If the EpollNetProvider is used, a failure of the poll, read, or write system call leads to a system error, such as EBADF, ENOTSOCK, EFAULT, EINVAL, EPIPE, or ENOTCONN (EWOULDBLOCK and EAGAIN don’t occur in this case).

Example:

rid_t futures[2];
futures[0] = replace;
futures[1] = select;

client.waitAll(conn, futures, 2);

Connection<BUFFER, NetProvider> *waitAny(int timeout = 0)

Sends all requests that are prepared at the moment and waits for the first response to be ready. Once a response is ready, waitAny() returns a pointer to the corresponding connection object. If no response is ready when the timeout expires, or if another error occurs, it returns nullptr. Then, Connection.getError() gives the error message. timeout = 0 means there is no time limit on waiting for a response.

Parameters:	timeout – waiting timeout, milliseconds. Optional. Defaults to `0`.
Returns:	pointer to an object of the Connection class on success, or `nullptr` on error.
Rtype:	Connection<BUFFER, NetProvider>*

Possible errors:

timeout exceeded
other possible errors depend on the network provider used. If the EpollNetProvider is used, a failure of the poll, read, or write system call leads to a system error, such as EBADF, ENOTSOCK, EFAULT, EINVAL, EPIPE, or ENOTCONN (EWOULDBLOCK and EAGAIN don’t occur in this case).

Example:

rid_t f1 = conn.ping();
rid_t f2 = another_conn.ping();

Connection<Buf_t, Net_t> *first = client.waitAny(WAIT_TIMEOUT);
if (first == &conn) {
    assert(conn.futureIsReady(f1));
} else {
    assert(another_conn.futureIsReady(f2));
}

void close(Connection<BUFFER, NetProvider> &conn)

Closes the connection established earlier by the connect() method.

Parameters:	conn – connection object of the Connection class.
Returns:	none
Rtype:	none

Possible errors: none.

Example:

client.close(conn);

Connection class

template<class BUFFER, class NetProvider> class Connection

The Connection class is a template class that defines a connection object, which is required to interact with a Tarantool instance. Each connection object is bound to a single socket.

Similar to a connector client, a connection object also takes the buffer and the network provider as template parameters, and they must be the same as those of the client. For example:

//Instantiating a connector client
using Buf_t = tnt::Buffer<16 * 1024>;
using Net_t = EpollNetProvider<Buf_t >;
Connector<Buf_t, Net_t> client;

//Instantiating connection objects
Connection<Buf_t, Net_t> conn01(client);
Connection<Buf_t, Net_t> conn02(client);

The Connection class has two nested classes, Space and Index, which implement the data-manipulation methods such as select() and replace().

Public types
Public methods
Nested classes and their methods

Public types

typedef size_t rid_t: An alias of the built-in size_t type. rid_t is used for entities that return or contain a request ID.

Public methods

call()
futureIsReady()
getResponse()
getError()
reset()
ping()

template<class T> rid_t call(const std::string &func, const T &args)

Calls a remote stored procedure, similar to conn:call(). The method returns the request ID that is used to get the response by getResponse().

Parameters:	func – a remote stored-procedure name. args – procedure’s arguments.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

The following function is defined on the Tarantool instance you are connected to:

box.execute("DROP TABLE IF EXISTS t;")
box.execute("CREATE TABLE t(id INT PRIMARY KEY, a TEXT, b DOUBLE);")

function remote_replace(arg1, arg2, arg3)
    return box.space.T:replace({arg1, arg2, arg3})
end

The function call can look as follows:

rid_t f1 = conn.call("remote_replace", std::make_tuple(5, "some_string", 5.55));

bool futureIsReady(rid_t future)

Checks availability of a request ID (future) returned by any of the request methods, such as ping().

futureIsReady() returns true if the future is available or false otherwise.

Parameters:	future – a request ID.
Returns:	`true` or `false`
Rtype:	bool

Possible errors: none.

Example:

rid_t ping = conn.ping();
conn.futureIsReady(ping);

std::optional<Response<BUFFER>> getResponse(rid_t future)

The method takes a request ID (future) as an argument and returns an optional object containing a response. If the response is not ready, the method returns std::nullopt. Note that for each future the method can be called only once because it erases the request ID from the internal map as soon as the response is returned to a user.

A response consists of a header (response.header) and a body (response.body). Depending on success of the request execution on the server side, body may contain either runtime errors accessible by response.body.error_stack or data (tuples) accessible by response.body.data. Data is a vector of tuples. However, tuples are not decoded and come in the form of pointers to the start and the end of MessagePacks. For details on decoding the data received, refer to “Decoding and reading the data”.

Parameters:	future – a request ID
Returns:	a response object or `std::nullopt`
Rtype:	std::optional<Response<BUFFER>>

Possible errors: none.

Example:

rid_t ping = conn.ping();
std::optional<Response<Buf_t>> response = conn.getResponse(ping);
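
The one-shot behaviour described above (a future can be redeemed only once) can be modelled with a plain map. This is a sketch under stated assumptions, not the connector’s code: FutureTable, store(), is_ready(), and take() are hypothetical names, and a std::string stands in for the Response<BUFFER> object.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

using rid_t = std::size_t;

// Hypothetical model of a connection's table of ready responses.
class FutureTable {
public:
    // Records that the response for `future` has arrived.
    void store(rid_t future, std::string response)
    {
        ready_[future] = std::move(response);
    }
    // Models futureIsReady(): true while the response is still held.
    bool is_ready(rid_t future) const
    {
        return ready_.count(future) != 0;
    }
    // Models getResponse(): hands the response over and forgets the future,
    // so a second call with the same ID yields std::nullopt.
    std::optional<std::string> take(rid_t future)
    {
        auto it = ready_.find(future);
        if (it == ready_.end())
            return std::nullopt;
        std::string response = std::move(it->second);
        ready_.erase(it);
        return response;
    }
private:
    std::unordered_map<rid_t, std::string> ready_;
};
```

In practice this means the returned optional should be stored in a variable and checked once, rather than calling the getter repeatedly for the same request ID.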

std::string &getError()

Returns an error message for the last error that occurred during the execution of methods of the Connector and Connection classes.

Returns:	an error message
Rtype:	std::string&

Possible errors: none.

Example:

int rc = client.connect(conn, address, port);

if (rc != 0) {
    std::cerr << conn.getError() << std::endl;
    return -1;
}

void reset()

Resets a connection after errors, that is, cleans up the error message and the connection status.

Returns:	none
Rtype:	none

Possible errors: none.

Example:

if (client.wait(conn, ping, WAIT_TIMEOUT) != 0) {
    assert(conn.status.is_failed);
    std::cerr << conn.getError() << std::endl;
    conn.reset();
}

rid_t ping()

Prepares a request to ping a Tarantool instance.

The method encodes the request in the MessagePack format and queues it in the output connection buffer to be sent later by one of Connector’s methods, namely, wait(), waitAll(), or waitAny().

Returns the request ID that is used to get the response by the getResponse() method.

Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

rid_t ping = conn.ping();

Nested classes and their methods

Space
Index

Space class

class Space : Connection

Space is a nested class of the Connection class. It is a public wrapper that gives access to the data-manipulation methods in a way similar to the Tarantool submodule box.space, for example, space[space_id].select(), space[space_id].replace(), and so on.

All the Space class methods listed below work in the following way:

A method encodes the corresponding request in the MessagePack format and queues it in the output connection buffer to be sent later by one of Connector’s methods, namely, wait(), waitAll(), or waitAny().
A method returns the request ID. To get and read the actual data requested, first you need to get the response object by using the getResponse() method and then decode the data.

Public methods:

select()
replace()
insert()
update()
upsert()
delete_()

template<class T> rid_t select(const T &key, uint32_t index_id = 0, uint32_t limit = UINT32_MAX, uint32_t offset = 0, IteratorType iterator = EQ)

Searches for a tuple or a set of tuples in the given space. The method works similarly to space_object:select() and performs the search against the primary index (index_id = 0) by default. In other words, space[space_id].select() is equivalent to space[space_id].index[0].select().

Parameters:	key – value to be matched against the index key. index_id – index ID. Optional. Defaults to `0`. limit – maximum number of tuples to select. Optional. Defaults to `UINT32_MAX`. offset – number of tuples to skip. Optional. Defaults to `0`. iterator – the type of iterator. Optional. Defaults to `EQ`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equivalent to space_object:select({key_value}, {limit = 1}) in Tarantool */
uint32_t space_id = 512;
int key_value = 5;
uint32_t index_id = 0; // primary index
uint32_t limit = 1;
uint32_t offset = 0;
IteratorType iter = IteratorType::EQ;
auto s = conn.space[space_id];
rid_t select = s.select(std::make_tuple(key_value), index_id, limit, offset, iter);
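
How limit and offset interact with an EQ search can be shown on an in-memory list. The sketch below is only a model under stated assumptions: select_eq and Row are hypothetical names, the rows stand for a non-unique index (so one key may match several tuples), and nothing here touches the connector or the server.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using Row = std::pair<int, std::string>; // {indexed key, payload}

// Hypothetical model of an EQ select: keep rows whose key matches, skip the
// first `offset` matches, and return at most `limit` of the rest.
std::vector<Row> select_eq(const std::vector<Row> &rows, int key,
                           std::uint32_t limit = UINT32_MAX,
                           std::uint32_t offset = 0)
{
    std::vector<Row> result;
    std::uint32_t skipped = 0;
    for (const Row &row : rows) {
        if (row.first != key)
            continue;
        if (skipped < offset) {
            ++skipped;
            continue;
        }
        if (result.size() >= limit)
            break;
        result.push_back(row);
    }
    return result;
}
```

With the defaults, every matching row is returned; passing limit = 1 keeps only the first match, which corresponds to the {limit = 1} option in the example above.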

template<class T> rid_t replace(const T &tuple)

Inserts a tuple into the given space. If a tuple with the same primary key already exists, replace() replaces the existing tuple with a new one. The method works similar to space_object:replace() / put().

Parameters:	tuple – a tuple to insert.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equivalent to space_object:replace({key_value, "111", 1.01}) in Tarantool */
uint32_t space_id = 512;
int key_value = 5;
std::tuple data = std::make_tuple(key_value, "111", 1.01);
rid_t replace = conn.space[space_id].replace(data);

template<class T> rid_t insert(const T &tuple)

Inserts a tuple into the given space. The method works similar to space_object:insert().

Parameters:	tuple – a tuple to insert.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equivalent to space_object:insert({key_value, "112", 2.22}) in Tarantool */
uint32_t space_id = 512;
int key_value = 6;
std::tuple data = std::make_tuple(key_value, "112", 2.22);
rid_t insert = conn.space[space_id].insert(data);
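
The difference between insert() and replace() comes down to what happens when the primary key is already taken. The following model uses a std::map in place of a space; model_insert, model_replace, and Space are hypothetical names for illustration, and the real requests are executed by the server, not by the client.

```cpp
#include <map>
#include <string>

using Space = std::map<int, std::string>; // primary key -> second field

// Models insert(): fails (returns false) if the primary key already exists.
bool model_insert(Space &space, int key, const std::string &value)
{
    return space.emplace(key, value).second;
}

// Models replace(): stores the tuple whether or not the key already exists.
void model_replace(Space &space, int key, const std::string &value)
{
    space[key] = value;
}
```

In the connector, the failure of an insert is not reported by the method itself; it arrives later in the response, as described for getResponse().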

template<class K, class T> rid_t update(const K &key, const T &tuple, uint32_t index_id = 0)

Updates a tuple in the given space. The method works similarly to space_object:update() and searches for the tuple to update against the primary index (index_id = 0) by default. In other words, space[space_id].update() is equivalent to space[space_id].index[0].update().

The tuple parameter specifies an update operation, an identifier of the field to update, and a new field value. The set of available operations and the format for specifying an operation and a field identifier are the same as in Tarantool. Refer to the description of space_object:update() and the example below for details.

Parameters:	key – value to be matched against the index key. tuple – parameters for the update operation, namely, `operator, field_identifier, value`. index_id – index ID. Optional. Defaults to `0`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equals to space_object:update(key, {{'=', 1, 'update' }, {'+', 2, 12}}) in Tarantool*/
uint32_t space_id = 512;
std::tuple key = std::make_tuple(5);
std::tuple op1 = std::make_tuple("=", 1, "update");
std::tuple op2 = std::make_tuple("+", 2, 12);
rid_t f1 = conn.space[space_id].update(key, std::make_tuple(op1, op2));

template<class T, class O> rid_t upsert(const T &tuple, const O &ops, uint32_t index_base = 0)

Updates or inserts a tuple in the given space. The method works similar to space_object:upsert().

If there is an existing tuple that matches the key fields of tuple, the request has the same effect as update() and the ops parameter is used. If there is no existing tuple that matches the key fields of tuple, the request has the same effect as insert() and the tuple parameter is used.

Parameters:	tuple – a tuple to insert. ops – parameters for the update operation, namely, `operator, field_identifier, value`. index_base – starting number to count fields in a tuple: `0` or `1`. Optional. Defaults to `0`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equals to space_object:upsert({333, "upsert-insert", 0.0}, {{'=', 1, 'upsert-update'}}) in Tarantool*/
uint32_t space_id = 512;
std::tuple tuple = std::make_tuple(333, "upsert-insert", 0.0);
std::tuple op1 = std::make_tuple("=", 1, "upsert-update");
rid_t f1 = conn.space[space_id].upsert(tuple, std::make_tuple(op1));
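
The two branches of upsert() can be modelled on an in-memory map. This is a minimal sketch, assuming a space with an integer primary key and one string field; model_upsert and Space are hypothetical names, and the ops argument is reduced to a callback that edits the stored field.

```cpp
#include <functional>
#include <map>
#include <string>

using Space = std::map<int, std::string>; // primary key -> second field

// Hypothetical model of upsert(): if the key exists, apply `ops` to the
// stored tuple and ignore `value`; otherwise insert `value` and ignore `ops`.
void model_upsert(Space &space, int key, const std::string &value,
                  const std::function<void(std::string &)> &ops)
{
    auto it = space.find(key);
    if (it != space.end())
        ops(it->second);
    else
        space.emplace(key, value);
}
```

Calling the function twice with the same key first inserts the value and then applies the operation, which matches the "upsert-insert" and "upsert-update" strings used in the example above.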

template<class T> rid_t delete_(const T &key, uint32_t index_id = 0)

Deletes a tuple in the given space. The method works similarly to space_object:delete() and searches for the tuple to delete against the primary index (index_id = 0) by default. In other words, space[space_id].delete_() is equivalent to space[space_id].index[0].delete_().

Parameters:	key – value to be matched against the index key. index_id – index ID. Optional. Defaults to `0`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equals to space_object:delete(123) in Tarantool*/
uint32_t space_id = 512;
std::tuple key = std::make_tuple(123);
rid_t f1 = conn.space[space_id].delete_(key);

Index class

class Index : Space

Index is a nested class of the Space class. It is a public wrapper that gives access to the data-manipulation methods in a way similar to the Tarantool submodule box.index, for example, space[space_id].index[index_id].select() and so on.

All the Index class methods listed below work in the following way:

A method encodes the corresponding request in the MessagePack format and queues it in the output connection buffer to be sent later by one of Connector’s methods, namely, wait(), waitAll(), or waitAny().
A method returns the request ID that is used to get the response by the getResponse() method. Refer to the getResponse() description to understand the response structure and how to read the requested data.

Public methods:

select()
update()
delete_()

template<class T> rid_t select(const T &key, uint32_t limit = UINT32_MAX, uint32_t offset = 0, IteratorType iterator = EQ)

This is an alternative to space.select(). The method searches for a tuple or a set of tuples in the given space against a particular index and works similar to index_object:select().

Parameters:	key – value to be matched against the index key. limit – maximum number of tuples to select. Optional. Defaults to `UINT32_MAX`. offset – number of tuples to skip. Optional. Defaults to `0`. iterator – the type of iterator. Optional. Defaults to `EQ`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equivalent to index_object:select({key}, {limit = 1}) in Tarantool */
uint32_t space_id = 512;
uint32_t index_id = 1;
int key = 10;
uint32_t limit = 1;
uint32_t offset = 0;
IteratorType iter = IteratorType::EQ;
auto i = conn.space[space_id].index[index_id];
rid_t select = i.select(std::make_tuple(key), limit, offset, iter);

template<class K, class T> rid_t update(const K &key, const T &tuple)

This is an alternative to space.update(). The method updates a tuple in the given space but searches for the tuple against a particular index. The method works similar to index_object:update().

Parameters:	key – value to be matched against the index key. tuple – parameters for the update operation, namely, `operator, field_identifier, value`.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equals to index_object:update(key, {{'=', 1, 'update' }, {'+', 2, 12}}) in Tarantool*/
uint32_t space_id = 512;
uint32_t index_id = 1;
std::tuple key = std::make_tuple(10);
std::tuple op1 = std::make_tuple("=", 1, "update");
std::tuple op2 = std::make_tuple("+", 2, 12);
rid_t f1 = conn.space[space_id].index[index_id].update(key, std::make_tuple(op1, op2));

template<class T> rid_t delete_(const T &key)

This is an alternative to space.delete_(). The method deletes a tuple in the given space but searches for the tuple against a particular index. The method works similar to index_object:delete().

Parameters:	key – value to be matched against the index key.
Returns:	a request ID
Rtype:	rid_t

Possible errors: none.

Example:

/* Equals to index_object:delete(123) in Tarantool*/
uint32_t space_id = 512;
uint32_t index_id = 1;
std::tuple key = std::make_tuple(123);
rid_t f1 = conn.space[space_id].index[index_id].delete_(key);

Community-supported connectors

This section provides information on several community-supported connectors. Note that they may have limited support for new Tarantool features.

For Erlang, use the Erlang tarantool driver.

For R, use the tarantoolr connector.

C#

The most commonly used C# driver is progaudi.tarantool, previously named tarantool-csharp. It is not supplied as part of the Tarantool repository; it must be installed separately. The makers recommend cross-platform installation using NuGet.

To be consistent with the other instructions in this chapter, here is a way to install the driver directly on Ubuntu 16.04.

Install .NET Core from Microsoft. Follow the .NET Core installation instructions.

Note

Mono will not work, nor will .NET from xbuild. Only .NET Core is supported on Linux and Mac.
Read the Microsoft End User License Agreement first, because it is not an ordinary open-source agreement and there will be a message during installation saying “This software may collect information about you and your use of the software, and send that to Microsoft.” You can still set environment variables to opt out of telemetry.

Create a new console project.

$ cd ~
$ mkdir progaudi.tarantool.test
$ cd progaudi.tarantool.test
$ dotnet new console

Add progaudi.tarantool reference.

$ dotnet add package progaudi.tarantool

Change code in Program.cs.

$ cat <<EOT > Program.cs
using System;
using System.Threading.Tasks;
using ProGaudi.Tarantool.Client;

public class HelloWorld
{
  static public void Main ()
  {
    Test().GetAwaiter().GetResult();
  }
  static async Task Test()
  {
    var box = await Box.Connect("127.0.0.1:3301");
    var schema = box.GetSchema();
    var space = await schema.GetSpace("examples");
    await space.Insert((99999, "BB"));
  }
}
EOT

Build and run your application.

Before trying to run, check that the server is listening at localhost:3301 and that the space examples exists, as described earlier.
```
$ dotnet restore
$ dotnet run
```
The program will:
- connect using an application-specific definition of the space,
- open a socket connection with the Tarantool server at localhost:3301,
- send an INSERT request, and — if all is well — end without saying anything.
If Tarantool is not running on localhost with listen port = 3301, or if user ‘guest’ does not have authorization to connect, or if the INSERT request fails for any reason, the program will print an error message, among other things (stacktrace, etc).

The example program only shows one request and does not show all that’s necessary for good practice. For that, please see the progaudi.tarantool driver repository.

Node.js

The most commonly used node.js driver is the Node Tarantool driver. It is not supplied as part of the Tarantool repository; it must be installed separately. The most common way to install it is with npm. For example, on Ubuntu, the installation could look like this after npm has been installed:

$ npm install tarantool-driver --global

Here is a complete node.js program that inserts [99999,'BB'] into space[999] via the node.js API. Before trying to run, check that the server instance is listening at localhost:3301 and that the space examples exists, as described earlier. To run, paste the code into a file named example.js and say node example.js. The program will connect using an application-specific definition of the space. The program will open a socket connection with the Tarantool instance at localhost:3301, then send an INSERT request, then, if all is well, end after saying “Insert succeeded”. If Tarantool is not running on localhost with listen port = 3301, the program will print “Connect failed”. If the ‘guest’ user does not have authorization to connect, the program will print “Auth failed”. If the insert request fails for any reason, for example because the tuple already exists, the program will print “Insert failed”.

var TarantoolConnection = require('tarantool-driver');
var conn = new TarantoolConnection({port: 3301});
var insertTuple = [99999, "BB"];
conn.connect().then(function() {
    conn.auth("guest", "").then(function() {
        conn.insert(999, insertTuple).then(function() {
            console.log("Insert succeeded");
            process.exit(0);
    }, function(e) { console.log("Insert failed");  process.exit(1); });
    }, function(e) { console.log("Auth failed");    process.exit(1); });
    }, function(e) { console.log("Connect failed"); process.exit(1); });

The example program only shows one request and does not show all that’s necessary for good practice. For that, please see the node.js driver repository.

Perl

The most commonly used Perl driver is tarantool-perl. It is not supplied as part of the Tarantool repository; it must be installed separately. The most common way to install it is by cloning from GitHub.

To avoid minor warnings that may appear the first time tarantool-perl is installed, start with installing some other modules that tarantool-perl uses, with CPAN, the Comprehensive Perl Archive Network:

$ sudo cpan install AnyEvent
$ sudo cpan install Devel::GlobalDestruction

Then, to install tarantool-perl itself, say:

$ git clone https://github.com/tarantool/tarantool-perl.git tarantool-perl
$ cd tarantool-perl
$ git submodule init
$ git submodule update --recursive
$ perl Makefile.PL
$ make
$ sudo make install

Here is a complete Perl program that inserts [99999,'BB'] into space[999] via the Perl API. Before trying to run, check that the server instance is listening at localhost:3301 and that the space examples exists, as described earlier. To run, paste the code into a file named example.pl and say perl example.pl. The program will connect using an application-specific definition of the space. The program will open a socket connection with the Tarantool instance at localhost:3301, then send an INSERT request, then, if all is well, end without displaying any messages. If Tarantool is not running on localhost with listen port = 3301, the program will print “Connection refused”.

#!/usr/bin/perl
use DR::Tarantool ':constant', 'tarantool';
use DR::Tarantool ':all';
use DR::Tarantool::MsgPack::SyncClient;

my $tnt = DR::Tarantool::MsgPack::SyncClient->connect(
  host    => '127.0.0.1',                      # look for tarantool on localhost
  port    => 3301,                             # on port 3301
  user    => 'guest',                          # username. for 'guest' we do not also say 'password=>...'

  spaces  => {
    999 => {                                   # definition of space[999] ...
      name => 'examples',                      #   space[999] name = 'examples'
      default_type => 'STR',                   #   space[999] field type is 'STR' if undefined
      fields => [ {                            #   definition of space[999].fields ...
          name => 'field1', type => 'NUM' } ], #     space[999].field[1] name='field1',type='NUM'
      indexes => {                             #   definition of space[999] indexes ...
        0 => {
          name => 'primary', fields => [ 'field1' ] } } } } );

$tnt->insert('examples' => [ 99999, 'BB' ]);

The example program uses field type names ‘STR’ and ‘NUM’ instead of ‘string’ and ‘unsigned’, due to a temporary Perl limitation.

The example program only shows one request and does not show all that’s necessary for good practice. For that, please see the tarantool-perl repository.

PHP

tarantool-php is the official PHP connector for Tarantool. It is not supplied as part of the Tarantool repository and must be installed separately (see installation instructions in the connector’s README file).

Here is a complete PHP program that inserts [99999,'BB'] into a space named examples via the PHP API.

Before trying to run, check that the server instance is listening at localhost:3301 and that the space examples exists, as described earlier.

To run, paste the code into a file named example.php and say:

$ php -d extension=~/tarantool-php/modules/tarantool.so example.php

The program will open a socket connection with the Tarantool instance at localhost:3301, then send an INSERT request, then – if all is well – print “Insert succeeded”.

If the tuple already exists, the program will print “Duplicate key exists in unique index ‘primary’ in space ‘examples’”.

<?php
$tarantool = new Tarantool('localhost', 3301);

try {
    $tarantool->insert('examples', [99999, 'BB']);
    echo "Insert succeeded\n";
} catch (Exception $e) {
    echo $e->getMessage(), "\n";
}

The example program only shows one request and does not show all that’s necessary for good practice. For that, please see the tarantool/tarantool-php project on GitHub.

Besides, there is another community-driven tarantool-php GitHub project which includes an alternative connector written in pure PHP, an object mapper, a queue and other packages.

Reference

Configuration reference

This topic describes all configuration parameters provided by Tarantool.

Most of the configuration options described in this reference can be applied to a specific instance, replica set, group, or to all instances globally. To do so, you need to define the required option at the specified level.

app

Using Tarantool as an application server, you can run your own Lua applications. In the app section, you can load the application and provide an application configuration in the app.cfg section.

Note

app can be defined in any scope.

Note

Note that an application specified using app is loaded after application roles specified using the roles option.

app.cfg
app.file
app.module

app.cfg

A configuration of the application loaded using app.file or app.module.

Example

In the example below, the application is loaded from the myapp.lua file placed next to the YAML configuration file:

app:
  file: 'myapp.lua'
  cfg:
    greeting: 'Hello'

Example on GitHub: application

Tip

The experimental.config.utils.schema built-in module provides an API for managing user-defined configurations of applications (app.cfg) and roles (roles_cfg).

Type: map
Default: nil
Environment variable: TT_APP_CFG

app.file: A path to a Lua file to load an application from.

Type: string

Default: nil

Environment variable: TT_APP_FILE

app.module

A Lua module to load an application from.

Example

The app section can be placed in any configuration scope. As an example use case, you can provide different applications for storages and routers in a sharded cluster:

groups:
  storages:
    app:
      module: storage
      # ...
  routers:
    app:
      module: router
      # ...

Type: string
Default: nil
Environment variable: TT_APP_MODULE
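
The environment variable names listed for the options in this reference follow a visible pattern: the option path is upper-cased, dots become underscores, and TT_ is prepended. The helper below encodes that pattern. It is an assumption inferred from the pairs shown here (for example, app.module and TT_APP_MODULE), not a rule taken from Tarantool’s source, and env_name is a hypothetical function name.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Derives an environment variable name from a configuration option path,
// e.g. "app.module" -> "TT_APP_MODULE". The rule is inferred from the
// examples in this reference and may not hold for every option.
std::string env_name(std::string_view option_path)
{
    std::string name = "TT_";
    for (char c : option_path) {
        if (c == '.')
            name += '_';
        else
            name += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return name;
}
```

Such a helper can be handy in deployment scripts that need to map a YAML option to the variable that overrides it, provided the result is checked against the name given for that option.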

audit_log

Enterprise Edition

Configuring audit_log parameters is available in the Enterprise Edition only.

The audit_log section defines configuration parameters related to audit logging.

Note

audit_log can be defined in any scope.

audit_log.extract_key
audit_log.file
audit_log.filter
audit_log.format
audit_log.nonblock
audit_log.pipe
audit_log.spaces
audit_log.to
audit_log.syslog.*

audit_log.extract_key: If set to true, the audit subsystem extracts and prints only the primary key instead of full tuples in DML events (space_insert, space_replace, space_delete). Otherwise, full tuples are logged. The option may be useful in case tuples are big.

Type: boolean

Default: false

Environment variable: TT_AUDIT_LOG_EXTRACT_KEY

audit_log.file: Specify a file for the audit log destination. You can set the file type using the audit_log.to option. If you write logs to a file, Tarantool reopens the audit log at SIGHUP.

Type: string

Default: ‘var/log/{{ instance_name }}/audit.log’

Environment variable: TT_AUDIT_LOG_FILE

audit_log.filter

Enable logging for a specified subset of audit events. This option accepts the following values:

Event names (for example, password_change). For details, see Audit log events.
Event groups (for example, audit). For details, see Event groups.

The option contains either one value from the Possible values section (see below) or a combination of them.

To enable custom audit log events, specify the custom value in this option.

Example

filter: [ user_create,data_operations,ddl,custom ]

Type: array
Possible values: ‘all’, ‘audit’, ‘auth’, ‘priv’, ‘ddl’, ‘dml’, ‘data_operations’, ‘compatibility’,
‘audit_enable’, ‘auth_ok’, ‘auth_fail’, ‘disconnect’, ‘user_create’, ‘user_drop’, ‘role_create’, ‘role_drop’,
‘user_disable’, ‘user_enable’, ‘user_grant_rights’, ‘role_grant_rights’, ‘role_revoke_rights’, ‘password_change’,
‘access_denied’, ‘eval’, ‘call’, ‘space_select’, ‘space_create’, ‘space_alter’, ‘space_drop’, ‘space_insert’,
‘space_replace’, ‘space_delete’, ‘custom’
Default: ‘nil’
Environment variable: TT_AUDIT_LOG_FILTER

audit_log.format¶

Specify a format that is used for the audit log.

Example

If you set the option to plain,

audit_log:
  to: file
  format: plain

the output in the file might look as follows:

2024-01-17T00:12:27.155+0300
4b5a2624-28e5-4b08-83c7-035a0c5a1db9
INFO remote:unix/:(socket)
session_type:console
module:tarantool
user:admin
type:space_create
tag:
description:Create space Bands

Type: string
Possible values: ‘json’, ‘csv’, ‘plain’
Default: ‘json’
Environment variable: TT_AUDIT_LOG_FORMAT

audit_log.nonblock¶: Specify the logging behavior if the system is not ready to write. If set to true, Tarantool does not block during logging if the system is non-writable and writes a message instead. Using this value may improve logging performance at the cost of losing some log messages.

Note

The option has an effect only if audit_log.to is set to syslog or pipe.

Type: boolean

Default: false

Environment variable: TT_AUDIT_LOG_NONBLOCK
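
A minimal sketch of a non-blocking audit log configuration, assuming that audit logs are sent to a system logger with its default settings. Only the to and nonblock options described in this section are used:

```yaml
audit_log:
  to: syslog       # nonblock affects only the syslog and pipe destinations
  nonblock: true   # do not block if the logger is not ready; some messages may be lost
```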

audit_log.pipe¶

Specify a pipe for the audit log destination. To write audit logs to a pipe, set audit_log.to to pipe. If the log destination is a program, its PID is stored in the audit.pid field. To rotate logs, send this program a signal.

Example

This starts the cronolog program when the server starts and sends all audit_log messages to cronolog standard input (stdin).

audit_log:
  to: pipe
  pipe: 'cronolog audit_tarantool.log'

Type: string
Default: box.NULL
Environment variable: TT_AUDIT_LOG_PIPE

audit_log.spaces¶

The array of space names for which data operation events (space_select, space_insert, space_replace, space_delete) should be logged. The array accepts string values. If set to box.NULL, the data operation events are logged for all spaces.

Example

In the example, only the events of bands and singers spaces are logged:

audit_log:
  spaces: [bands, singers]

Type: array
Default: box.NULL
Environment variable: TT_AUDIT_LOG_SPACES

audit_log.to¶

Enable audit logging and define the log location. This option accepts the following values:

devnull: disable audit logging.
file: write audit logs to a file (see audit_log.file).
pipe: start a program and write audit logs to it (see audit_log.pipe).
syslog: write audit logs to a system logger (see audit_log.syslog.*).

By default, audit logging is disabled.

Example

The basic audit log configuration might look as follows:

audit_log:
  to: file
  file: 'audit_tarantool.log'
  filter: [ user_create,data_operations,ddl,custom ]
  format: json
  spaces: [ bands ]
  extract_key: true

Type: string
Possible values: ‘devnull’, ‘file’, ‘pipe’, ‘syslog’
Default: ‘devnull’
Environment variable: TT_AUDIT_LOG_TO

audit_log.syslog.*

audit_log.syslog.facility
audit_log.syslog.identity
audit_log.syslog.server

audit_log.syslog.facility¶

Specify a system logger keyword that tells syslogd where to send the message. You can enable logging to a system logger using the audit_log.to option.

compat

The compat section defines values of the compat module options.

Note

compat can be defined in any scope.

compat.binary_data_decoding
compat.box_cfg_replication_sync_timeout
compat.box_error_serialize_verbose
compat.box_error_unpack_type_and_code
compat.box_info_cluster_meaning
compat.box_session_push_deprecation
compat.box_space_execute_priv
compat.box_space_max
compat.box_tuple_extension
compat.box_tuple_new_vararg
compat.c_func_iproto_multireturn
compat.fiber_channel_close_mode
compat.fiber_slice_default
compat.json_escape_forward_slash
compat.sql_priv
compat.sql_seq_scan_default
compat.yaml_pretty_multiline

compat.binary_data_decoding¶

Define how to store binary data fields in Lua after decoding:

new: as varbinary objects
old: as plain strings

conditional

The conditional section defines the configuration parts that apply to instances that meet certain conditions.

Note

conditional can be defined in the global scope only.

conditional.if

conditional.if¶

Specify a conditional section of the configuration. The configuration options defined inside a conditional.if section apply only to instances on which the specified condition is true.

Conditions can include one variable, tarantool_version: the three-number Tarantool version running on the instance, for example, 3.1.0. It is compared with version literals that consist of three numbers separated by periods (x.y.z).

The following operators are available in conditions:

comparison: >, <, >=, <=, ==, !=
logical operators || (OR) and && (AND)
parentheses ()

Example:

In this example, different configuration parts apply to instances running Tarantool versions above and below 3.1.0:

On versions less than 3.1.0, the upgraded label is set to false.
On versions 3.1.0 or newer, the upgraded label is set to true. Additionally, new compat options are defined. These options were introduced in version 3.1.0, so on older versions they would cause an error.

conditional:
  - if: tarantool_version < 3.1.0
    labels:
      upgraded: 'false'
  - if: tarantool_version >= 3.1.0
    labels:
      upgraded: 'true'
    compat:
      box_error_serialize_verbose: 'new'
      box_error_unpack_type_and_code: 'new'
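
The logical operators listed above can be combined in one condition. The sketch below assumes a hypothetical release_line label and applies it only to instances running a 3.1.x version:

```yaml
conditional:
  # Both comparisons must be true for the section to apply.
  - if: tarantool_version >= 3.1.0 && tarantool_version < 3.2.0
    labels:
      release_line: '3.1'   # hypothetical label, used here for illustration only
```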

config

The config section defines various parameters related to centralized configuration.

Note

config can be defined in the global scope only.

config.reload
config.context.*
config.etcd.*
config.storage.*

config.reload¶

Specify how the configuration is reloaded. This option accepts the following values:

auto: configuration is reloaded automatically when it is changed.
manual: configuration should be reloaded manually. In this case, you can reload the configuration in the application code using config:reload().
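
As a minimal sketch, the following fragment switches configuration reloading to the manual mode:

```yaml
config:
  # The configuration is applied only after an explicit config:reload() call.
  reload: manual
```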

config.context.*

This section describes options related to loading configuration settings from external storage such as external files or environment variables.

config.context
- config.context.<name>

config.context¶: Specify how to load settings from external storage. For example, this option can be used to load passwords from safe storage. You can find examples in the Loading secrets from safe storage section.

Type: map

Default: nil

Environment variable: TT_CONFIG_CONTEXT

config.context.<name>¶: The name of an entity that identifies a configuration value to load.

config.context.<name>.env¶

The name of an environment variable to load a configuration value from. To load a configuration value from an environment variable, set config.context.<name>.from to env.

Example

In this example, passwords are loaded from the DBADMIN_PASSWORD and SAMPLEUSER_PASSWORD environment variables:

config:
  context:
    dbadmin_password:
      from: env
      env: DBADMIN_PASSWORD
    sampleuser_password:
      from: env
      env: SAMPLEUSER_PASSWORD

config.context.<name>.from¶

The type of storage to load a configuration value from. There are the following storage types:

file: load a configuration value from a file. In this case, you need to specify the path to the file using config.context.<name>.file.
env: load a configuration value from an environment variable. In this case, specify the environment variable name using config.context.<name>.env.

config.context.<name>.file¶

The path to a file to load a configuration value from. To load a configuration value from a file, set config.context.<name>.from to file.

Example

In this example, passwords are loaded from the dbadmin_password.txt and sampleuser_password.txt files:

config:
  context:
    dbadmin_password:
      from: file
      file: secrets/dbadmin_password.txt
      rstrip: true
    sampleuser_password:
      from: file
      file: secrets/sampleuser_password.txt
      rstrip: true

config.context.<name>.rstrip¶: (Optional) Whether to strip whitespace characters and newlines from the end of data.
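
A value loaded this way can then be referenced in other configuration options. The sketch below assumes that context values are substituted using the {{ context.<name> }} template syntax, as in the Loading secrets from safe storage section. The dbadmin user and the DBADMIN_PASSWORD variable come from the examples above:

```yaml
config:
  context:
    dbadmin_password:
      from: env
      env: DBADMIN_PASSWORD

credentials:
  users:
    dbadmin:
      # Replaced with the value of the DBADMIN_PASSWORD environment variable.
      password: '{{ context.dbadmin_password }}'
```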

config.etcd.*

Enterprise Edition

Centralized configuration storages are supported by the Enterprise Edition only.

This section describes options related to providing connection settings to a centralized etcd-based storage. If replication.failover is set to supervised, Tarantool also uses etcd to maintain the state of failover coordinators.

Note

A centralized cluster configuration cannot contain the config.etcd section.

config.etcd.endpoints
config.etcd.prefix
config.etcd.username
config.etcd.password
config.etcd.ssl.ca_file
config.etcd.ssl.ca_path
config.etcd.ssl.ssl_cert
config.etcd.ssl.ssl_key
config.etcd.ssl.verify_host
config.etcd.ssl.verify_peer
config.etcd.http.request.timeout
config.etcd.http.request.unix_socket
config.etcd.watchers.reconnect_max_attempts
config.etcd.watchers.reconnect_timeout

config.etcd.endpoints¶

The list of endpoints used to access an etcd server.

Type: array
Default: nil
Environment variable: TT_CONFIG_ETCD_ENDPOINTS

config.etcd.prefix¶

A key prefix used to search for a configuration on an etcd server. Tarantool searches keys by the following path: <prefix>/config/*. Note that <prefix> should start with a slash (/).

Type: string
Default: nil
Environment variable: TT_CONFIG_ETCD_PREFIX

config.etcd.username¶

A username used for authentication.

Type: string
Default: nil
Environment variable: TT_CONFIG_ETCD_USERNAME

config.etcd.password¶

A password used for authentication.

Type: string
Default: nil
Environment variable: TT_CONFIG_ETCD_PASSWORD
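
A minimal sketch of basic etcd connection settings in a local configuration file. The endpoint address, the /myapp prefix, and the credentials are illustrative assumptions:

```yaml
config:
  etcd:
    endpoints:
    - http://localhost:2379   # assumed address of a local etcd server
    prefix: /myapp            # the configuration is searched under /myapp/config/*
    username: sampleuser
    password: '123456'
```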

config.etcd.ssl.ca_file¶: A path to a trusted certificate authorities (CA) file.

Type: string

Default: nil

Environment variable: TT_CONFIG_ETCD_SSL_CA_FILE

config.etcd.ssl.ca_path¶: A path to a directory holding certificates to verify the peer with.

Type: string

Default: nil

Environment variable: TT_CONFIG_ETCD_SSL_CA_PATH

config.etcd.ssl.ssl_cert¶

Since: 3.2.0

A path to an SSL certificate file.

Type: string
Default: nil
Environment variable: TT_CONFIG_ETCD_SSL_SSL_CERT

config.etcd.ssl.ssl_key¶: A path to a private SSL key file.

Type: string

Default: nil

Environment variable: TT_CONFIG_ETCD_SSL_SSL_KEY

config.etcd.ssl.verify_host¶: Enable verification of the certificate’s name (CN) against the specified host.

Type: boolean

Default: nil

Environment variable: TT_CONFIG_ETCD_SSL_VERIFY_HOST

config.etcd.ssl.verify_peer¶: Enable verification of the peer’s SSL certificate.

Type: boolean

Default: nil

Environment variable: TT_CONFIG_ETCD_SSL_VERIFY_PEER
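
The SSL options above might be combined as in the sketch below. The endpoint address and the certs/ca.crt path are hypothetical and should be replaced with the values used in your environment:

```yaml
config:
  etcd:
    endpoints:
    - https://localhost:2379
    ssl:
      ca_file: certs/ca.crt   # hypothetical path to a trusted CA file
      verify_peer: true       # verify the etcd server's certificate
      verify_host: true       # check the certificate's name against the host
```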

config.etcd.http.request.timeout¶

The time limit for processing an HTTP request to an etcd server, measured from sending the request to receiving the response.

Type: number
Default: nil
Environment variable: TT_CONFIG_ETCD_HTTP_REQUEST_TIMEOUT

config.etcd.http.request.unix_socket¶: A Unix domain socket used to connect to an etcd server.

Type: string

Default: nil

Environment variable: TT_CONFIG_ETCD_HTTP_REQUEST_UNIX_SOCKET

config.etcd.watchers.reconnect_max_attempts¶

Since: 3.1.0

The maximum number of attempts to reconnect to an etcd server in case of connection failure.

Type: integer
Default: nil
Environment variable: TT_CONFIG_ETCD_WATCHERS_RECONNECT_MAX_ATTEMPTS

config.etcd.watchers.reconnect_timeout¶

Since: 3.1.0

The timeout (in seconds) between attempts to reconnect to an etcd server in case of connection failure.

Type: number
Default: nil
Environment variable: TT_CONFIG_ETCD_WATCHERS_RECONNECT_TIMEOUT
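
A sketch of request timeout and reconnection settings for an etcd connection. All values are illustrative assumptions rather than recommendations:

```yaml
config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /myapp
    http:
      request:
        timeout: 3                   # limit for a single HTTP request
    watchers:
      reconnect_timeout: 1           # wait 1 second between reconnection attempts
      reconnect_max_attempts: 10     # give up after 10 failed attempts
```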

config.storage.*

Enterprise Edition

Centralized configuration storages are supported by the Enterprise Edition only.

This section describes options related to providing connection settings to a centralized Tarantool-based storage.

Note

A centralized cluster configuration cannot contain the config.storage section.

config.storage.endpoints
config.storage.prefix
config.storage.reconnect_after
config.storage.timeout

config.storage.endpoints¶

An array of endpoints used to access a configuration storage. Each endpoint can include the following fields:

uri: a URI of the configuration storage’s instance.
login: a username used to connect to the instance.
password: a password used for authentication.
params: SSL parameters required for encrypted connections (<uri>.params.*).

Type: array
Default: nil
Environment variable: TT_CONFIG_STORAGE_ENDPOINTS

config.storage.prefix¶

A key prefix used to search for a configuration in a centralized configuration storage. Tarantool searches keys by the following path: <prefix>/config/*. Note that <prefix> should start with a slash (/).

Type: string
Default: nil
Environment variable: TT_CONFIG_STORAGE_PREFIX

config.storage.reconnect_after¶: A number of seconds to wait before reconnecting to a configuration storage.

Type: number

Default: 3

Environment variable: TT_CONFIG_STORAGE_RECONNECT_AFTER

config.storage.timeout¶

The interval (in seconds) to perform the status check of a configuration storage.

Type: number
Default: 3
Environment variable: TT_CONFIG_STORAGE_TIMEOUT
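
A minimal sketch of connection settings for a Tarantool-based configuration storage. The URIs, the credentials, and the /myapp prefix are illustrative assumptions:

```yaml
config:
  storage:
    endpoints:
    - uri: '127.0.0.1:4401'
      login: sampleuser
      password: '123456'
    - uri: '127.0.0.1:4402'
      login: sampleuser
      password: '123456'
    prefix: /myapp        # the configuration is searched under /myapp/config/*
    timeout: 3            # status check interval, in seconds
    reconnect_after: 5    # wait 5 seconds before reconnecting
```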

console

Configure the administrative console. To connect to the console, use tt connect.

Note

console can be defined in any scope.

console.enabled
console.socket

console.enabled¶

Whether to listen on the Unix socket provided in the console.socket option.

If the option is set to false, the administrative console is disabled.

Type: boolean
Default: true
Environment variable: TT_CONSOLE_ENABLED

console.socket¶

The Unix socket for the administrative console.

Mind the following nuances:

Only a Unix domain socket is allowed. A TCP socket can’t be configured this way.
console.socket is a file path, without any unix: or unix/: prefixes.
If the file path is a relative path, it is interpreted relative to process.work_dir.

Type: string
Default: ‘var/run/{{ instance_name }}/tarantool.control’
Environment variable: TT_CONSOLE_SOCKET
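
As a sketch, the console socket can be moved to another path. The admin.control file name is a hypothetical choice. Since the path is relative, it is interpreted relative to process.work_dir:

```yaml
console:
  enabled: true
  # A plain file path without the unix: or unix/: prefix.
  socket: 'var/run/{{ instance_name }}/admin.control'
```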

credentials

The credentials section allows you to create users and grant them the specified privileges. Learn more in Credentials.

Note

credentials can be defined in any scope.

credentials.roles.*
credentials.users.*
<user_or_role_name>.privileges.*

credentials.roles.*

credentials.roles
- credentials.roles.<role_name>.roles
- credentials.roles.<role_name>.privileges

credentials.roles¶

A map of roles that can be granted to users or other roles. Each key is a role name.

Example

In the example below, the writers_space_reader role gets privileges to select data in the writers space:

roles:
  writers_space_reader:
    privileges:
    - permissions: [ read ]
      spaces: [ writers ]

See also: Managing users and roles

Type: map
Default: nil
Environment variable: TT_CREDENTIALS_ROLES

credentials.roles.<role_name>.roles¶: An array of roles granted to this role.

credentials.roles.<role_name>.privileges¶

An array of privileges granted to this role.

See <user_or_role_name>.privileges.*.
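
The sketch below shows how one role can be granted to another. The writers_space_editor role is a hypothetical name introduced for illustration. It receives the privileges of writers_space_reader and, additionally, the permission to modify data in the writers space:

```yaml
credentials:
  roles:
    writers_space_reader:
      privileges:
      - permissions: [ read ]
        spaces: [ writers ]
    writers_space_editor:               # hypothetical role
      roles: [ writers_space_reader ]   # inherit read access
      privileges:
      - permissions: [ write ]
        spaces: [ writers ]
```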

credentials.users.*

credentials.users

credentials.users¶

A map of users. Each key is a username.

Example

In this example, sampleuser gets the following privileges:

Privileges granted to the writers_space_reader role.
Privileges to select and modify data in the books space.

sampleuser:
  password: '123456'
  roles: [ writers_space_reader ]
  privileges:
  - permissions: [ read, write ]
    spaces: [ books ]

See also: Managing users and roles

Type: map
Default: nil
Environment variable: TT_CREDENTIALS_USERS

credentials.users.<username>.password¶

A user’s password.

Example

In the example below, a password for the dbadmin user is set:

credentials:
  users:
    dbadmin:
      password: 'T0p_Secret_P@$$w0rd'

credentials.users.<username>.roles¶: An array of roles granted to this user.

credentials.users.<username>.privileges¶

An array of privileges granted to this user.

See <user_or_role_name>.privileges.*.

<user_or_role_name>.privileges.*

<user_or_role_name>.privileges

<user_or_role_name>.privileges¶

Privileges that can be granted to a user or role using the following options:

credentials.users.<username>.privileges
credentials.roles.<role_name>.privileges

<user_or_role_name>.privileges.permissions¶

Permissions assigned to this user or a user with this role.

Example

In this example, sampleuser gets privileges to select and modify data in the books space:

sampleuser:
  password: '123456'
  roles: [ writers_space_reader ]
  privileges:
  - permissions: [ read, write ]
    spaces: [ books ]

See also: Managing users and roles

<user_or_role_name>.privileges.spaces¶

Spaces to which this user or a user with this role gets the specified permissions.

Example

In this example, sampleuser gets privileges to select and modify data in the books space:

sampleuser:
  password: '123456'
  roles: [ writers_space_reader ]
  privileges:
  - permissions: [ read, write ]
    spaces: [ books ]

See also: Managing users and roles

<user_or_role_name>.privileges.functions¶: Functions to which this user or a user with this role gets the specified permissions.

<user_or_role_name>.privileges.sequences¶: Sequences to which this user or a user with this role gets the specified permissions.

<user_or_role_name>.privileges.lua_eval¶: Whether this user or a user with this role can execute arbitrary Lua code.

<user_or_role_name>.privileges.lua_call¶

A list of global user-defined Lua functions that this user or a user with this role can call. To allow calling a specific function, specify its name as the value. To allow calling all global Lua functions except built-in ones, specify the all value.

This option should be configured together with the execute permission.

Since version 3.3.0, the lua_call option allows granting users privileges to call the specified Lua functions on the instance at runtime, so granting these privileges doesn’t require the ability to write to the database.

In the example below, the alice user is granted privileges to call the custom my_func and my_func2 functions:

credentials:
  users:
    alice:
      privileges:
        - permissions: [execute]
          lua_call: [my_func, my_func2]
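
As described above, the all value can be used instead of listing function names. A sketch for the same alice user:

```yaml
credentials:
  users:
    alice:
      privileges:
      - permissions: [ execute ]
        # Allow calling all global user-defined Lua functions.
        lua_call: [ all ]
```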

<user_or_role_name>.privileges.sql¶: Whether this user or a user with this role can execute an arbitrary SQL expression.

database

The database section defines database-specific configuration parameters, such as an instance’s read-write mode or transaction isolation level.

Note

database can be defined in any scope.

database.hot_standby
database.instance_uuid
database.mode
database.replicaset_uuid
database.txn_isolation
database.txn_timeout
database.use_mvcc_engine

database.hot_standby¶

Whether to start the server in the hot standby mode. This mode can be used to provide failover without replication.

Suppose there are two cluster applications. Each cluster has one instance with the same configuration:

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              hot_standby: true
            wal:
              dir: /tmp/wals
            snapshot:
              dir: /tmp/snapshots
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

In particular, both instances use the same directory for storing write-ahead logs and snapshots.

When you start both cluster applications on the same machine, the instance from the first one will be the primary instance and the second will be the standby instance. In the logs of the second cluster instance, you should see a notification:

main/104/interactive I> Entering hot standby mode

This means that the standby instance is ready to take over if the primary instance goes down. The standby instance initializes and tries to take a lock on the directory for storing write-ahead logs but fails because the primary instance holds a lock on this directory.

If the primary instance goes down for any reason, the lock is released. In this case, the standby instance succeeds in taking the lock and becomes the primary instance.

database.hot_standby has no effect:

If wal.mode is set to none.
If wal.dir_rescan_delay is set to a large value on macOS or FreeBSD. On these platforms, the hot standby mode is designed so that the loop repeats every wal.dir_rescan_delay seconds.
For spaces created with engine set to vinyl.

Examples on GitHub: hot_standby_1, hot_standby_2

Type: boolean
Default: false
Environment variable: TT_DATABASE_HOT_STANDBY

database.instance_uuid¶

An instance UUID.

By default, instance UUIDs are generated automatically. database.instance_uuid can be used to specify an instance identifier manually.

UUIDs should follow these rules:

The values must be truly unique identifiers, not shared by other instances or replica sets within the common infrastructure.
The values must be used consistently, not changed after the initial setup. The initial values are stored in snapshot files and are checked whenever the system is restarted.
The values must comply with RFC 4122. The nil UUID is not allowed.

failover

The failover section defines parameters related to a supervised failover.

Note

failover can be defined in the global scope only.

failover.log.to
failover.log.file
failover.call_timeout
failover.connect_timeout
failover.lease_interval
failover.probe_interval
failover.renew_interval
failover.stateboard.*

failover.log.to¶

Since: 3.3.0

Enterprise Edition

Configuring failover.log.to and failover.log.file parameters is available in the Enterprise Edition only.

Define a location Tarantool sends failover logs to. This option accepts the following values:

stderr: write logs to the standard error stream.
file: write logs to a file (see failover.log.file).

Type: string
Default: ‘stderr’
Environment variable: TT_FAILOVER_LOG_TO

failover.log.file¶

Since: 3.3.0

Specify a file as the destination for failover logs. To write logs to a file, set failover.log.to to file. Otherwise, failover.log.file is ignored.

Example

The example below shows how to write failover logs to a file placed in the specified directory:

failover:
  log:
    to: file
    file: var/log/failover.log

Type: string
Default: nil
Environment variable: TT_FAILOVER_LOG_FILE

failover.call_timeout¶

Since: 3.1.0

A call timeout (in seconds) for monitoring and failover requests to an instance.

Type: number
Default: 1
Environment variable: TT_FAILOVER_CALL_TIMEOUT

failover.connect_timeout¶

Since: 3.1.0

A connection timeout (in seconds) for monitoring and failover requests to an instance.

Type: number
Default: 1
Environment variable: TT_FAILOVER_CONNECT_TIMEOUT

failover.lease_interval¶

Since: 3.1.0

A time interval (in seconds) that specifies how long an instance should be a leader without renew requests from a coordinator. When this interval expires, the leader switches to read-only mode. This action is performed by the instance itself and works even if there is no connectivity between the instance and the coordinator.

Type: number
Default: 30
Environment variable: TT_FAILOVER_LEASE_INTERVAL

failover.probe_interval¶

Since: 3.1.0

A time interval (in seconds) that specifies how often a monitoring service of the failover coordinator polls an instance for its status.

Type: number
Default: 10
Environment variable: TT_FAILOVER_PROBE_INTERVAL

failover.renew_interval¶

Since: 3.1.0

A time interval (in seconds) that specifies how often a failover coordinator sends read-write deadline renewals.

Type: number
Default: 10
Environment variable: TT_FAILOVER_RENEW_INTERVAL

failover.stateboard.*

failover.stateboard.* options define configuration parameters related to maintaining the state of failover coordinators in a remote etcd-based storage.

failover.stateboard.keepalive_interval
failover.stateboard.renew_interval

failover.stateboard.keepalive_interval¶

Since: 3.1.0

A time interval (in seconds) that specifies how long transient state information is stored and how quickly a lock expires.

Note

failover.stateboard.keepalive_interval should be smaller than failover.lease_interval. Otherwise, switching of a coordinator causes a replica set leader to go to read-only mode for some time.

Type: number
Default: 10
Environment variable: TT_FAILOVER_STATEBOARD_KEEPALIVE_INTERVAL

failover.stateboard.renew_interval¶

Since: 3.1.0

A time interval (in seconds) that specifies how often a failover coordinator writes its state information to etcd. This option also determines the frequency at which an active coordinator reads new commands from etcd.

Type: number
Default: 2
Environment variable: TT_FAILOVER_STATEBOARD_RENEW_INTERVAL
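
A sketch of a supervised failover configuration that tunes the intervals described above. The values are illustrative assumptions. Note that failover.stateboard.keepalive_interval (5) is kept smaller than failover.lease_interval (10):

```yaml
replication:
  failover: supervised

failover:
  call_timeout: 1
  connect_timeout: 1
  lease_interval: 10          # the leader becomes read-only after 10 seconds without renewals
  renew_interval: 5           # the coordinator renews the deadline every 5 seconds
  probe_interval: 5           # poll instance status every 5 seconds
  stateboard:
    keepalive_interval: 5     # must be smaller than lease_interval
    renew_interval: 2
```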

feedback

The feedback section describes configuration parameters for sending information about a running Tarantool instance to the specified feedback server.

Note

feedback can be defined in any scope.

feedback.crashinfo
feedback.enabled
feedback.host
feedback.interval
feedback.metrics_collect_interval
feedback.metrics_limit
feedback.send_metrics

feedback.crashinfo¶

Whether to send crash information in the case of an instance failure. This information includes:

General information from the uname output.
Build information.
The crash reason.
The stack trace.

To turn off sending crash information, set this option to false.

Type: boolean
Default: true
Environment variable: TT_FEEDBACK_CRASHINFO

feedback.enabled¶: Whether to send information about a running instance to the feedback server. To turn off sending feedback, set this option to false.

Type: boolean

Default: true

Environment variable: TT_FEEDBACK_ENABLED

feedback.host¶: The address to which information is sent.

Type: string

Default: https://feedback.tarantool.io

Environment variable: TT_FEEDBACK_HOST

feedback.interval¶: The interval (in seconds) of sending information.

Type: number

Default: 3600

Environment variable: TT_FEEDBACK_INTERVAL

feedback.metrics_collect_interval¶: The interval (in seconds) for collecting metrics.

Type: number

Default: 60

Environment variable: TT_FEEDBACK_METRICS_COLLECT_INTERVAL

feedback.metrics_limit¶: The maximum size of memory (in bytes) used to store metrics before sending them to the feedback server. If the size of collected metrics exceeds this value, earlier metrics are dropped.

Type: integer

Default: 1024 * 1024 (1048576)

Environment variable: TT_FEEDBACK_METRICS_LIMIT

feedback.send_metrics¶: Whether to send metrics to the feedback server. Note that all collected metrics are dropped after sending them to the feedback server.

Type: boolean

Default: true

Environment variable: TT_FEEDBACK_SEND_METRICS
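
As a minimal sketch, the following fragment turns off sending any information to the feedback server:

```yaml
feedback:
  enabled: false
```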

fiber

The fiber section describes options related to configuring fibers, yields, and cooperative multitasking.

Note

fiber can be defined in any scope.

fiber.io_collect_interval
fiber.too_long_threshold
fiber.worker_pool_threads
fiber.slice.*
fiber.top.*

fiber.io_collect_interval¶

The time period (in seconds) a fiber sleeps between iterations of the event loop.

fiber.io_collect_interval can be used to reduce CPU load in deployments where the number of client connections is large, but requests are not so frequent (for example, each connection issues just a handful of requests per second).

Type: number
Default: box.NULL
Environment variable: TT_FIBER_IO_COLLECT_INTERVAL

fiber.too_long_threshold¶

If processing a request takes longer than the given period (in seconds), the fiber warns about it in the log.

fiber.too_long_threshold has an effect only if log.level is greater than or equal to 4 (warn).

Type: number
Default: 0.5
Environment variable: TT_FIBER_TOO_LONG_THRESHOLD

fiber.worker_pool_threads¶: The maximum number of threads to use during execution of certain internal processes (for example, socket.getaddrinfo() and coio_call()).

Type: number

Default: 4

Environment variable: TT_FIBER_WORKER_POOL_THREADS
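
The sketch below combines the three options above. The values are illustrative assumptions for a deployment with many mostly idle connections and should be tuned for a real workload:

```yaml
fiber:
  io_collect_interval: 0.001   # sleep 1 ms between event loop iterations
  too_long_threshold: 1        # warn about requests that take longer than 1 second
  worker_pool_threads: 8       # allow up to 8 threads for internal blocking calls
```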

fiber.slice.*

This section describes options related to configuring time periods for fiber slices. See fiber.set_max_slice for details and examples.

fiber.slice.warn
fiber.slice.err

fiber.slice.warn¶: Set a time period (in seconds) that specifies the warning slice.

Type: number

Default: 0.5

Environment variable: TT_FIBER_SLICE_WARN

fiber.slice.err¶: Set a time period (in seconds) that specifies the error slice.

Type: number

Default: 1

Environment variable: TT_FIBER_SLICE_ERR
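
A sketch that raises both slice limits above their defaults. The values are illustrative assumptions:

```yaml
fiber:
  slice:
    warn: 1.0   # warning slice, in seconds
    err: 2.0    # error slice, in seconds
```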

fiber.top.*

This section describes options related to configuring the fiber.top() function, normally used for debug purposes. fiber.top() shows all alive fibers and their CPU consumption.

fiber.top.enabled

fiber.top.enabled¶

Enable or disable the fiber.top() function.

Enabling fiber.top() slows down fiber switching by about 15%, so it is disabled by default.

Type: boolean
Default: false
Environment variable: TT_FIBER_TOP_ENABLED

flightrec

Enterprise Edition

Configuring flightrec parameters is available in the Enterprise Edition only.

The flightrec section describes options related to the flight recorder configuration.

Note

flightrec can be defined in any scope.

flightrec.enabled
flightrec.logs_size
flightrec.logs_max_msg_size
flightrec.logs_log_level
flightrec.metrics_period
flightrec.metrics_interval
flightrec.requests_size
flightrec.requests_max_req_size
flightrec.requests_max_res_size

flightrec.enabled¶: Enable the flight recorder.

Type: boolean

Default: false

Environment variable: TT_FLIGHTREC_ENABLED

flightrec.logs_size¶: Specify the size (in bytes) of the log storage. You can set this option to 0 to disable the log storage.

Type: integer

Default: 10485760

Environment variable: TT_FLIGHTREC_LOGS_SIZE

flightrec.logs_max_msg_size¶: Specify the maximum size (in bytes) of the log message. The log message is truncated if its size exceeds this limit.

Type: integer

Default: 4096

Maximum: 16384

Environment variable: TT_FLIGHTREC_LOGS_MAX_MSG_SIZE

flightrec.logs_log_level¶: Specify the level of detail the log has. The default value is 6 (VERBOSE). You can learn more about log levels from the log.level option description. Note that the flightrec.logs_log_level value might differ from log.level.

Type: integer

Default: 6

Environment variable: TT_FLIGHTREC_LOGS_LOG_LEVEL

flightrec.metrics_period¶: Specify the time period (in seconds) that defines how long metrics are stored from the moment of dump. In other words, this value defines how much historical metrics data is available at the moment of a crash. The frequency of metric dumps is defined by flightrec.metrics_interval.

Type: integer

Default: 180

Environment variable: TT_FLIGHTREC_METRICS_PERIOD

flightrec.metrics_interval¶

Specify the time interval (in seconds) that defines the frequency of dumping metrics. This value shouldn’t exceed flightrec.metrics_period.

Type: number
Default: 1.0
Minimum: 0.001
Environment variable: TT_FLIGHTREC_METRICS_INTERVAL

Note

Given that the average size of a metrics entry is 2 kB, you can estimate the size of the metrics storage as follows:

(flightrec.metrics_period / flightrec.metrics_interval) * 2 kB

For example, with the default values, the estimate is (180 / 1.0) * 2 kB = 360 kB.

flightrec.requests_size¶: Specify the size (in bytes) of the storage for request and response data. You can set this parameter to 0 to disable the storage of requests and responses.

Type: integer

Default: 10485760

Environment variable: TT_FLIGHTREC_REQUESTS_SIZE

flightrec.requests_max_req_size¶: Specify the maximum size (in bytes) of a request entry. A request entry is truncated if this size is exceeded.

Type: integer

Default: 16384

Environment variable: TT_FLIGHTREC_REQUESTS_MAX_REQ_SIZE

flightrec.requests_max_res_size¶: Specify the maximum size (in bytes) of a response entry. A response entry is truncated if this size is exceeded.

Type: integer

Default: 16384

Environment variable: TT_FLIGHTREC_REQUESTS_MAX_RES_SIZE
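
A sketch of a flight recorder configuration that enables the recorder and sets the options described above explicitly. Most values repeat the defaults. The metrics settings are illustrative assumptions:

```yaml
flightrec:
  enabled: true
  logs_size: 10485760          # 10 MB of log storage
  logs_max_msg_size: 4096      # longer log messages are truncated
  logs_log_level: 6            # VERBOSE
  metrics_period: 300          # keep the last 5 minutes of metrics
  metrics_interval: 1.0        # dump metrics every second, must not exceed metrics_period
  requests_size: 10485760      # 10 MB for request and response data
  requests_max_req_size: 16384
  requests_max_res_size: 16384
```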

iproto

The iproto section is used to configure parameters related to communicating to and between cluster instances.

Note

iproto can be defined in any scope.

iproto.listen
iproto.net_msg_max
iproto.readahead
iproto.threads
iproto.advertise.*
<uri>.params.*

iproto.listen¶

An array of URIs used to listen for incoming requests. If required, you can enable SSL for specific URIs by providing additional parameters (<uri>.params.*).

Note that a URI value can’t contain parameters, a login, or a password.

Example

In the example below, iproto.listen is set explicitly for each instance in a cluster:

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

iproto.advertise.*

iproto.advertise.client
iproto.advertise.peer
iproto.advertise.sharding
iproto.advertise.<peer_or_sharding>.*

iproto.advertise.client¶

A URI used to advertise the current instance to clients.

The iproto.advertise.client option accepts a URI in the following formats:

An address: host:port.
A Unix domain socket: unix/:.

Note that this option doesn’t allow setting a username or password. If a remote client needs this information, it should be delivered outside of the cluster configuration.

Note

The host value cannot be 0.0.0.0/[::] and the port value cannot be 0.

Type: string
Default: box.NULL
Environment variable: TT_IPROTO_ADVERTISE_CLIENT
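
A minimal sketch of how iproto.advertise.client might be used when an instance listens on all interfaces. The host name storage001.example.com is a hypothetical placeholder, and settings for advertising the instance to peers are left out of this fragment:

```yaml
iproto:
  listen:
  - uri: '0.0.0.0:3301'   # accept connections on all interfaces
  advertise:
    # Hypothetical host name; 0.0.0.0 and port 0 are not allowed here
    client: 'storage001.example.com:3301'
```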

iproto.advertise.peer¶

Settings used to advertise the current instance to other cluster members. The format of these settings is described in iproto.advertise.<peer_or_sharding>.*.

Example

In the example below, the following configuration options are specified:

In the credentials section, the replicator user with the replication role is created.
iproto.advertise.peer specifies that other instances should connect to an address defined in iproto.listen using the replicator user.

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: election

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

Type: map
Environment variable: see iproto.advertise.<peer_or_sharding>.*

iproto.advertise.sharding¶

Settings used to advertise the current instance to a router and rebalancer. The format of these settings is described in iproto.advertise.<peer_or_sharding>.*.

Note

If iproto.advertise.sharding is not specified, advertise settings from iproto.advertise.peer are used.

Example

In the example below, the following configuration options are specified:

In the credentials section, the replicator and storage users are created.
iproto.advertise.peer specifies that other instances should connect to an address defined in iproto.listen with the replicator user.
iproto.advertise.sharding specifies that a router should connect to storages using an address defined in iproto.listen with the storage user.

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]
    storage:
      password: 'secret'
      roles: [sharding]

iproto:
  advertise:
    peer:
      login: replicator
    sharding:
      login: storage

Type: map
Environment variable: see iproto.advertise.<peer_or_sharding>.*

iproto.advertise.<peer_or_sharding>.*

iproto.advertise.<peer_or_sharding>.uri
iproto.advertise.<peer_or_sharding>.login
iproto.advertise.<peer_or_sharding>.password
iproto.advertise.<peer_or_sharding>.params

iproto.advertise.<peer_or_sharding>.uri¶: (Optional) A URI used to advertise the current instance. By default, the URI defined in iproto.listen is used to advertise the current instance.

Note

The host value cannot be 0.0.0.0/[::] and the port value cannot be 0.

Type: string

Default: nil

Environment variable: TT_IPROTO_ADVERTISE_PEER_URI, TT_IPROTO_ADVERTISE_SHARDING_URI

iproto.advertise.<peer_or_sharding>.login¶: (Optional) A username used to connect to the current instance. If a username is not set, the guest user is used.

Type: string

Default: nil

Environment variable: TT_IPROTO_ADVERTISE_PEER_LOGIN, TT_IPROTO_ADVERTISE_SHARDING_LOGIN

iproto.advertise.<peer_or_sharding>.password¶: (Optional) A password for the specified user. If a login is specified but a password is missing, it is taken from the user’s credentials.

Type: string

Default: nil

Environment variable: TT_IPROTO_ADVERTISE_PEER_PASSWORD, TT_IPROTO_ADVERTISE_SHARDING_PASSWORD

iproto.advertise.<peer_or_sharding>.params¶: (Optional) URI parameters (<uri>.params.*) required for connecting to the current instance.
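
The options above can be combined as in the following sketch. It assumes a replicator user is defined in the credentials section (as in the iproto.advertise.peer example), and the address 192.168.0.11 is a hypothetical placeholder:

```yaml
iproto:
  listen:
  - uri: '0.0.0.0:3301'
  advertise:
    peer:
      # Hypothetical address that other cluster members should connect to
      uri: '192.168.0.11:3301'
      # The password is omitted, so it is taken from the user's credentials
      login: replicator
```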

<uri>.params.*

Enterprise Edition

TLS traffic encryption is supported by the Enterprise Edition only.

URI parameters that can be used in the iproto.listen.<uri>.params and iproto.advertise.<peer_or_sharding>.params options.

<uri>.params.transport
<uri>.params.ssl_ca_file
<uri>.params.ssl_cert_file
<uri>.params.ssl_ciphers
<uri>.params.ssl_key_file
<uri>.params.ssl_password
<uri>.params.ssl_password_file

Note

<uri>.params.* options don’t have corresponding environment variables for URIs specified in iproto.listen.

<uri>.params.transport¶

Allows you to enable traffic encryption for client-server communications over binary connections. In a Tarantool cluster, one instance might act both as a server that accepts connections from other instances and as a client that connects to other instances.

<uri>.params.transport accepts one of the following values:

plain (default): turn off traffic encryption.
ssl: encrypt traffic by using the TLS 1.2 protocol (Enterprise Edition only).

Example

The example below demonstrates how to enable traffic encryption by using a self-signed server certificate. The following parameters are specified for each instance:

ssl_cert_file: a path to an SSL certificate file.
ssl_key_file: a path to a private SSL key file.

replicaset001:
  replication:
    failover: manual
  leader: instance001
  iproto:
    advertise:
      peer:
        login: replicator
  instances:
    instance001:
      iproto:
        listen:
        - uri: '127.0.0.1:3301'
          params:
            transport: 'ssl'
            ssl_cert_file: 'certs/server.crt'
            ssl_key_file: 'certs/server.key'
    instance002:
      iproto:
        listen:
        - uri: '127.0.0.1:3302'
          params:
            transport: 'ssl'
            ssl_cert_file: 'certs/server.crt'
            ssl_key_file: 'certs/server.key'
    instance003:
      iproto:
        listen:
        - uri: '127.0.0.1:3303'
          params:
            transport: 'ssl'
            ssl_cert_file: 'certs/server.crt'
            ssl_key_file: 'certs/server.key'

Example on GitHub: ssl_without_ca

Type: string
Default: ‘plain’
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_TRANSPORT, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_TRANSPORT

<uri>.params.ssl_ca_file¶

(Optional) A path to a trusted certificate authorities (CA) file. If not set, the peer won’t be checked for authenticity.

Both a server and a client can use the ssl_ca_file parameter:

If it’s on the server side, the server verifies the client.
If it’s on the client side, the client verifies the server.
If both sides have the CA files, the server and the client verify each other.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_CA_FILE, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_CA_FILE
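
The fragment below is a sketch of the server side of such a setup: the listening URI is given a CA file, so the server verifies connecting clients. All file paths are hypothetical placeholders:

```yaml
iproto:
  listen:
  - uri: '127.0.0.1:3301'
    params:
      transport: 'ssl'
      # Hypothetical paths
      ssl_ca_file: 'certs/root_ca.crt'   # used to verify the connecting client
      ssl_cert_file: 'certs/server.crt'
      ssl_key_file: 'certs/server.key'
```

For mutual verification, the client side would set its own ssl_ca_file as well, for example, in iproto.advertise.peer.params.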

<uri>.params.ssl_cert_file¶

A path to an SSL certificate file:

For a server, it’s mandatory.
For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_CERT_FILE, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_CERT_FILE

<uri>.params.ssl_ciphers¶

(Optional) A colon-separated (:) list of SSL cipher suites the connection can use. Note that the list is not validated: if a cipher suite is unknown, Tarantool ignores it, doesn’t establish the connection, and writes to the log that no shared cipher was found.

The supported cipher suites are:

ECDHE-ECDSA-AES256-GCM-SHA384
ECDHE-RSA-AES256-GCM-SHA384
DHE-RSA-AES256-GCM-SHA384
ECDHE-ECDSA-CHACHA20-POLY1305
ECDHE-RSA-CHACHA20-POLY1305
DHE-RSA-CHACHA20-POLY1305
ECDHE-ECDSA-AES128-GCM-SHA256
ECDHE-RSA-AES128-GCM-SHA256
DHE-RSA-AES128-GCM-SHA256
ECDHE-ECDSA-AES256-SHA384
ECDHE-RSA-AES256-SHA384
DHE-RSA-AES256-SHA256
ECDHE-ECDSA-AES128-SHA256
ECDHE-RSA-AES128-SHA256
DHE-RSA-AES128-SHA256
ECDHE-ECDSA-AES256-SHA
ECDHE-RSA-AES256-SHA
DHE-RSA-AES256-SHA
ECDHE-ECDSA-AES128-SHA
ECDHE-RSA-AES128-SHA
DHE-RSA-AES128-SHA
AES256-GCM-SHA384
AES128-GCM-SHA256
AES256-SHA256
AES128-SHA256
AES256-SHA
AES128-SHA
GOST2012-GOST8912-GOST8912
GOST2001-GOST89-GOST89

For detailed information on SSL ciphers and their syntax, refer to OpenSSL documentation.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_CIPHERS, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_CIPHERS
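
As an illustration of the colon-separated format, the sketch below restricts a listening URI to two cipher suites from the list above. The certificate and key paths are hypothetical:

```yaml
iproto:
  listen:
  - uri: '127.0.0.1:3301'
    params:
      transport: 'ssl'
      ssl_cert_file: 'certs/server.crt'   # hypothetical path
      ssl_key_file: 'certs/server.key'    # hypothetical path
      # Two suites from the supported list, separated by a colon
      ssl_ciphers: 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'
```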

<uri>.params.ssl_key_file¶

A path to a private SSL key file:

For a server, it’s mandatory.
For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional.

If the private key is encrypted, provide a password for it in the ssl_password or ssl_password_file parameter.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_KEY_FILE, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_KEY_FILE

<uri>.params.ssl_password¶

(Optional) A password for an encrypted private SSL key provided using ssl_key_file. Alternatively, the password can be provided in ssl_password_file.

Tarantool applies the ssl_password and ssl_password_file parameters in the following order:

If ssl_password is provided, Tarantool tries to decrypt the private key with it.
If ssl_password is incorrect or isn’t provided, Tarantool tries all passwords from ssl_password_file one by one in the order they are written.
If ssl_password and all passwords from ssl_password_file are incorrect, or none of them is provided, Tarantool treats the private key as unencrypted.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_PASSWORD, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_PASSWORD

<uri>.params.ssl_password_file¶

(Optional) A text file with one or more passwords for encrypted private SSL keys provided using ssl_key_file (each on a separate line). Alternatively, the password can be provided in ssl_password.

See also: <uri>.params.transport

Type: string
Default: nil
Environment variable: TT_IPROTO_ADVERTISE_PEER_PARAMS_SSL_PASSWORD_FILE, TT_IPROTO_ADVERTISE_SHARDING_PARAMS_SSL_PASSWORD_FILE
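
A minimal sketch of using an encrypted private key together with a password file. The file names are hypothetical; certs/passwords.txt is assumed to contain one candidate password per line:

```yaml
iproto:
  listen:
  - uri: '127.0.0.1:3301'
    params:
      transport: 'ssl'
      ssl_cert_file: 'certs/server.crt'
      ssl_key_file: 'certs/server.key'          # encrypted private key
      ssl_password_file: 'certs/passwords.txt'  # passwords are tried in the order they are written
```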

groups

The groups section provides the ability to define the full topology of a Tarantool cluster.

Note

groups can be defined in the global scope only.

groups.<group_name>
groups.<group_name>.replicasets
groups.<group_name>.<config_parameter>

groups.<group_name>¶

A group name.

The following rules are applied to group names:

The maximum number of symbols is 63.
Should start with a letter.
Can contain lowercase letters (a-z).
Can contain digits (0-9).
Can contain the following characters: -, _.

groups.<group_name>.replicasets¶: Replica sets that belong to this group. See replicasets.

groups.<group_name>.<config_parameter>¶: Any configuration parameter that can be defined in the group scope. For example, iproto and database configuration parameters defined at the group level are applied to all instances in this group.
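
The sketch below illustrates how parameters set at different levels might be laid out. It uses memtx.memory and log.level only as examples of options that can be defined in any scope, and the values are hypothetical:

```yaml
groups:
  group001:
    # Group scope: applied to all instances in group001
    memtx:
      memory: 536870912
    replicasets:
      replicaset001:
        # Replica set scope: applied to all instances in replicaset001
        log:
          level: 'verbose'
        instances:
          instance001:
            # Instance scope: applied to instance001 only
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
```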

replicasets

Note

replicasets can be defined in the group scope only.

replicasets.<replicaset_name>
replicasets.<replicaset_name>.leader
replicasets.<replicaset_name>.bootstrap_leader
replicasets.<replicaset_name>.instances
replicasets.<replicaset_name>.<config_parameter>

replicasets.<replicaset_name>¶

A replica set name.

Note that the rules applied to a replica set name are the same as for groups. Learn more in groups.<group_name>.

replicasets.<replicaset_name>.leader¶

A replica set leader. This option can be used to set a replica set leader when replication.failover is set to manual.

To perform controlled failover, <replicaset_name>.leader can be temporarily removed or set to null.

Example

replication:
  failover: manual

groups:
  group001:
    replicasets:
      replicaset001:
        leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

replicasets.<replicaset_name>.bootstrap_leader¶

A bootstrap leader for a replica set. To specify a bootstrap leader manually, you need to set replication.bootstrap_strategy to config.

Example

groups:
  group001:
    replicasets:
      replicaset001:
        replication:
          bootstrap_strategy: config
        bootstrap_leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

replicasets.<replicaset_name>.instances¶: Instances that belong to this replica set. See instances.

replicasets.<replicaset_name>.<config_parameter>¶: Any configuration parameter that can be defined in the replica set scope. For example, iproto and database configuration parameters defined at the replica set level are applied to all instances in this replica set.

instances

Note

instances can be defined in the replica set scope only.

instances.<instance_name>
instances.<instance_name>.<config_parameter>

instances.<instance_name>¶

An instance name.

Note that the rules applied to an instance name are the same as for groups. Learn more in groups.<group_name>.

instances.<instance_name>.<config_parameter>¶: Any configuration parameter that can be defined in the instance scope. For example, iproto and database configuration parameters defined at the instance level are applied to this instance only.

isolated mode

Since version 3.3.0, a new isolated option is added to instance configuration.

The option takes a boolean value and defaults to false. Setting isolated: true moves the instance to which it is applied into the isolated mode.

The isolated mode allows the user to temporarily isolate an instance and perform maintenance activities on it.

In the isolated mode:

The instance is moved to the read-only state
iproto stops listening for new connections
iproto drops all the current connections
The instance is disconnected from all the replication upstreams
Other replicaset members exclude the isolated instance from the replication upstreams

Note

An isolated instance can’t be bootstrapped (a local snapshot is required to start).

Example

The example below shows how to isolate an instance:

groups:
  g:
    replicasets:
      r:
        instances:
          i-001: {}
          i-002: {}
          i-003: {}
          i-004:
            isolated: true

labels

The labels section allows adding custom attributes to the configuration. Attributes must be key: value pairs with string keys and values.

Note

labels can be defined in any scope.

labels.<label_name>

labels.<label_name>¶

A value of the label with the specified name.

Example

The example below shows how to define labels on the replica set and instance levels:

groups:
  group001:
    replicasets:
      replicaset001:
        labels:
          dc: 'east'
          production: 'false'
        instances:
          instance001:
            labels:
              rack: '10'
              production: 'true'

log

The log section defines configuration parameters related to logging. To handle logging in your application, use the log module.

Note

log can be defined in any scope.

log.to
log.file
log.format
log.level
log.modules
log.nonblock
log.pipe
log.syslog.*

log.to¶

Define a location Tarantool sends logs to. This option accepts the following values:

stderr: write logs to the standard error stream.
file: write logs to a file (see log.file).
pipe: start a program and write logs to its standard input (see log.pipe).
syslog: write logs to a system logger (see log.syslog.*).

Type: string
Default: ‘stderr’
Environment variable: TT_LOG_TO

log.file¶

Specify a file to write logs to. To write logs to a file, you need to set log.to to file. Otherwise, log.file is ignored.

Example

The example below shows how to write logs to a file placed in the specified directory:

log:
  to: file
  file: var/log/{{ instance_name }}/instance.log

Example on GitHub: log_file

Type: string
Default: ‘var/log/{{ instance_name }}/tarantool.log’
Environment variable: TT_LOG_FILE

log.format¶

Specify a format that is used for a log entry. The following formats are supported:

plain: a log entry is formatted as plain text. Example:

2024-04-09 11:00:10.369 [12089] main/104/interactive I> log level 5 (INFO)

json: a log entry is formatted as JSON and includes additional fields. Example:

{
  "time": "2024-04-09T11:00:57.174+0300",
  "level": "INFO",
  "message": "log level 5 (INFO)",
  "pid": 12160,
  "cord_name": "main",
  "fiber_id": 104,
  "fiber_name": "interactive",
  "file": "src/main.cc",
  "line": 498
}

Type: string
Default: ‘plain’
Environment variable: TT_LOG_FORMAT
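
For instance, a configuration along these lines would switch log entries to the JSON format while writing them to a file. The file path is a hypothetical example:

```yaml
log:
  to: file
  file: 'var/log/{{ instance_name }}/instance.log'  # hypothetical path
  format: json
```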

log.level¶

Specify the level of detail in logs. The following levels are available:

0 – fatal
1 – syserror
2 – error
3 – crit
4 – warn
5 – info
6 – verbose
7 – debug

By setting log.level, you can enable logging of all events with severities above or equal to the given level.

Example

The example below shows how to log all events with severities above or equal to the VERBOSE level.

log:
  level: 'verbose'

Example on GitHub: log_level

Type: number, string
Default: 5
Environment variable: TT_LOG_LEVEL

log.modules¶

Configure the specified log levels (log.level) for different modules.

You can specify a logging level for the following module types:

Modules (files) that use the default logger. Example: Set log levels for files that use the default logger.
Modules that use custom loggers created using the log.new() function. Example: Set log levels for modules that use custom loggers.
The tarantool module that enables you to configure the logging level for Tarantool core messages. Specifically, it configures the logging level for messages logged from non-Lua code, including C modules. Example: Set a log level for C modules.

Example 1: Set log levels for files that use the default logger

Suppose you have two identical modules located at the following paths: test/module1.lua and test/module2.lua. These modules use the default logger and look as follows:

return {
    say_hello = function()
        local log = require('log')
        log.info('Info message from module1')
    end
}

To configure logging levels, you need to provide module names corresponding to paths to these modules:

log:
  modules:
    test.module1: 'verbose'
    test.module2: 'error'
app:
  file: 'app.lua'

To load these modules in your application (app.lua), you need to add the corresponding require directives:

module1 = require('test.module1')
module2 = require('test.module2')

Given that module1 has the verbose logging level and module2 has the error level, calling module1.say_hello() shows a message but module2.say_hello() is swallowed:

-- Prints 'info' messages --
module1.say_hello()
--[[
[92617] main/103/interactive/test.logging.module1 I> Info message from module1
---
...
--]]

-- Swallows 'info' messages --
module2.say_hello()
--[[
---
...
--]]

Example on GitHub: log_existing_modules

Example 2: Set log levels for modules that use custom loggers

This example shows how to set the verbose level for module1 and the error level for module2:

log:
  modules:
    module1: 'verbose'
    module2: 'error'
app:
  file: 'app.lua'

To create custom loggers in your application (app.lua), call the log.new() function:

-- Creates new loggers --
module1_log = require('log').new('module1')
module2_log = require('log').new('module2')

Given that module1 has the verbose logging level and module2 has the error level, calling module1_log.info() shows a message but module2_log.info() is swallowed:

-- Prints 'info' messages --
module1_log.info('Info message from module1')
--[[
[16300] main/103/interactive/module1 I> Info message from module1
---
...
--]]

-- Swallows 'debug' messages --
module1_log.debug('Debug message from module1')
--[[
---
...
--]]

-- Swallows 'info' messages --
module2_log.info('Info message from module2')
--[[
---
...
--]]

Example on GitHub: log_new_modules

Example 3: Set a log level for C modules

This example shows how to set the info level for the tarantool module:

log:
  modules:
    tarantool: 'info'
app:
  file: 'app.lua'

The specified level affects messages logged from C modules:

ffi = require('ffi')

-- Prints 'info' messages --
ffi.C._say(ffi.C.S_INFO, nil, 0, nil, 'Info message from C module')
--[[
[6024] main/103/interactive I> Info message from C module
---
...
--]]

-- Swallows 'debug' messages --
ffi.C._say(ffi.C.S_DEBUG, nil, 0, nil, 'Debug message from C module')
--[[
---
...
--]]

The example above uses the LuaJIT ffi library to call C functions provided by the say module.

Example on GitHub: log_existing_c_modules

Type: map
Default: box.NULL
Environment variable: TT_LOG_MODULES

log.nonblock¶: Specify the logging behavior if the system is not ready to write. If set to true, Tarantool does not block during logging if the system is non-writable and drops the message instead. Using this value may improve logging performance at the cost of losing some log messages.

Note

The option only has an effect if log.to is set to syslog or pipe.

Type: boolean

Default: false

Environment variable: TT_LOG_NONBLOCK
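
Since log.nonblock matters only for the syslog and pipe destinations, a sketch of its use might look as follows. The socket path is the Linux example mentioned for log.syslog.server:

```yaml
log:
  to: syslog
  nonblock: true   # don't block if syslog is not ready; some messages may be lost
  syslog:
    server: 'unix:/dev/log'
```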

log.pipe¶

Start a program and write logs to its standard input (stdin). To send logs to a program’s standard input, you need to set log.to to pipe.

Example

In the example below, Tarantool writes logs to the standard input of cronolog:

log:
  to: pipe
  pipe: 'cronolog tarantool.log'

Example on GitHub: log_pipe

Type: string
Default: box.NULL
Environment variable: TT_LOG_PIPE

log.syslog.*

log.syslog.facility
log.syslog.identity
log.syslog.server

log.syslog.facility¶: Specify the syslog facility to be used when syslog is enabled. To write logs to syslog, you need to set log.to to syslog.

Type: string

Possible values: ‘auth’, ‘authpriv’, ‘cron’, ‘daemon’, ‘ftp’, ‘kern’, ‘lpr’, ‘mail’, ‘news’, ‘security’, ‘syslog’, ‘user’, ‘uucp’, ‘local0’, ‘local1’, ‘local2’, ‘local3’, ‘local4’, ‘local5’, ‘local6’, ‘local7’

Default: ‘local7’

Environment variable: TT_LOG_SYSLOG_FACILITY

log.syslog.identity¶: Specify an application name used to identify Tarantool messages in syslog logs. To write logs to syslog, you need to set log.to to syslog.

Type: string

Default: ‘tarantool’

Environment variable: TT_LOG_SYSLOG_IDENTITY
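
The facility and identity options can be set together as in the sketch below. The identity value myapp is a hypothetical application name:

```yaml
log:
  to: syslog
  syslog:
    facility: 'local0'   # one of the possible values listed above
    identity: 'myapp'    # hypothetical name used to tag Tarantool messages
```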

log.syslog.server¶

Set a location of a syslog server. This option accepts one of the following values:

An IPv4 address. Example: 127.0.0.1:514.
A Unix socket path starting with unix:. Examples: unix:/dev/log on Linux or unix:/var/run/syslog on macOS.

To write logs to syslog, you need to set log.to to syslog.

Example

In the example below, Tarantool writes logs to a syslog server that listens for logging messages on the 127.0.0.1:514 address:

log:
  to: syslog
  syslog:
    server: '127.0.0.1:514'

Example on GitHub: log_syslog

Type: string
Default: box.NULL
Environment variable: TT_LOG_SYSLOG_SERVER

lua

The lua section outlines the configuration parameters related to the Lua environment within Tarantool.

Note

lua can be defined in any scope.

lua.memory

lua.memory¶

Specifies the maximum memory amount available to Lua scripts, measured in bytes.

When the specified value exceeds the current memory usage, the new limit takes effect immediately without a restart. However, when the specified value is lower than the current memory usage, a restart of the instance is required for the change to take effect.

The example below sets the Lua memory limit to 4 GB:

lua:
    memory: 4294967296

Type: integer
Default: 2147483648 (2 GB)
Environment variable: TT_LUA_MEMORY

memtx

The memtx section is used to configure parameters related to the memtx engine.

Note

memtx can be defined in any scope.

memtx.allocator
memtx.max_tuple_size
memtx.memory
memtx.min_tuple_size
memtx.slab_alloc_factor
memtx.slab_alloc_granularity
memtx.sort_threads

memtx.allocator¶

Specify the allocator that manages memory for memtx tuples. Possible values:

system – the memory is allocated as needed, checking that the quota is not exceeded. The allocator is based on the malloc function.
small – a slab allocator. The allocator repeatedly uses a memory block to allocate objects of the same type. Note that this allocator is prone to unresolvable fragmentation on specific workloads, so you can switch to system in such cases.

Type: string
Default: ‘small’
Environment variable: TT_MEMTX_ALLOCATOR
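
For a workload that suffers from fragmentation with the default allocator, switching to system might look like this:

```yaml
memtx:
  allocator: 'system'   # malloc-based allocator instead of the default 'small'
```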

memtx.max_tuple_size¶: Size of the largest allocation unit for the memtx storage engine in bytes. It can be increased if it is necessary to store large tuples.

Type: integer

Default: 1048576

Environment variable: TT_MEMTX_MAX_TUPLE_SIZE

memtx.memory¶

The amount of memory in bytes that Tarantool allocates to store tuples. When the limit is reached, INSERT and UPDATE requests fail with the ER_MEMORY_ISSUE error. The server does not go beyond the memtx.memory limit to allocate tuples, but there is additional memory used to store indexes and connection information.

Example

In the example below, the memory size is set to 1 GB (1073741824 bytes).

memtx:
  memory: 1073741824

Type: integer
Default: 268435456
Environment variable: TT_MEMTX_MEMORY

memtx.min_tuple_size¶: Size of the smallest allocation unit in bytes. It can be decreased if most of the tuples are very small.

Type: integer

Default: 16

Possible values: between 8 and 1048280 inclusive

Environment variable: TT_MEMTX_MIN_TUPLE_SIZE

memtx.slab_alloc_factor¶

The multiplier for computing the sizes of memory chunks that tuples are stored in. A lower value may result in less wasted memory depending on the total amount of memory available and the distribution of item sizes.

metrics

The metrics section defines configuration parameters for metrics.

Note

metrics can be defined in any scope.

metrics.exclude
metrics.include
metrics.labels

metrics.exclude¶

An array containing the metrics to turn off. The array can contain the same values as the exclude configuration parameter passed to metrics.cfg().

Example

metrics:
  include: [ all ]
  exclude: [ vinyl ]
  labels:
    alias: '{{ instance_name }}'

Type: array
Default: []
Environment variable: TT_METRICS_EXCLUDE

metrics.include¶: An array containing the metrics to turn on. The array can contain the same values as the include configuration parameter passed to metrics.cfg().

Type: array

Default: [ all ]

Environment variable: TT_METRICS_INCLUDE

metrics.labels¶: Global labels to be added to every observation.

Type: map

Default: { alias = names.instance_name }

Environment variable: TT_METRICS_LABELS

process

The process section defines configuration parameters of the Tarantool process in the system.

Note

process can be defined in any scope.

process.background
process.coredump
process.title
process.pid_file
process.strip_core
process.username
process.work_dir

process.background¶

Run the server as a daemon process.

If this option is set to true, the log location defined by the log.to option should be set to file, pipe, or syslog (anything other than the default stderr), because a daemon process is detached from the terminal and can’t write to the terminal’s stderr.

Important

Do not enable the background mode for applications intended to be run by the tt utility. For more information, see the tt start reference.

Type: boolean
Default: false
Environment variable: TT_PROCESS_BACKGROUND
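
A minimal sketch of a daemon configuration that follows the requirement above by sending logs to a file instead of stderr. The log file path is a hypothetical example, and the fragment is meant for instances that are not started by the tt utility:

```yaml
process:
  background: true
log:
  to: file   # stderr is not available to a daemon process
  file: 'var/log/{{ instance_name }}/instance.log'  # hypothetical path
```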

process.coredump¶

Create coredump files.

Usually, an administrator needs to call ulimit -c unlimited (or set corresponding options in systemd’s unit file) before running a Tarantool process to get core dumps. If process.coredump is enabled, Tarantool sets the corresponding resource limit by itself and the administrator doesn’t need to call ulimit -c unlimited (see man 3 setrlimit).

This option also sets the state of the dumpable attribute, which is enabled by default, but may be dropped in some circumstances (according to man 2 prctl, see PR_SET_DUMPABLE).

Type: boolean
Default: false
Environment variable: TT_PROCESS_COREDUMP

process.title¶

Add the given string to the server’s process title (it is shown in the COMMAND column for the Linux commands ps -ef and top -c).

For example, if you set the option to myservice - {{ instance_name }}:

process:
  title: myservice - {{ instance_name }}

ps -ef might show the Tarantool server process like this:

$ ps -ef | grep tarantool
503      68100 68098  0 10:33 pts/2    00:00.10 tarantool <running>: myservice instance1

Type: string
Default: ‘tarantool - {{ instance_name }}’
Environment variable: TT_PROCESS_TITLE

process.pid_file¶

Store the process id in this file.

This option may contain a relative file path. In this case, it is interpreted as relative to process.work_dir.

Type: string
Default: ‘var/run/{{ instance_name }}/tarantool.pid’
Environment variable: TT_PROCESS_PID_FILE

process.strip_core¶: Whether to exclude memory allocated for tuples from coredump files. This memory can be large if Tarantool runs under heavy load. Setting this option to true means the tuple memory is not included.

Type: boolean

Default: true

Environment variable: TT_PROCESS_STRIP_CORE
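
The two coredump-related options might be combined as follows to get core dumps without tuple memory:

```yaml
process:
  coredump: true     # Tarantool sets the core resource limit by itself
  strip_core: true   # leave memory allocated for tuples out of the dump
```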

process.username¶: The name of the system user to switch to after start.

Type: string

Default: box.NULL

Environment variable: TT_PROCESS_USERNAME

process.work_dir¶

A directory where Tarantool working files will be stored (database files, logs, a PID file, a console Unix socket, and other files if an application generates them in the current directory). The server instance switches to process.work_dir with chdir(2) after start.

If set as a relative file path, it is relative to the current working directory, from where Tarantool is started. If not specified, defaults to the current working directory.

Other directory and file parameters, if set as relative paths, are interpreted as relative to process.work_dir, for example, directories for storing snapshots and write-ahead logs.

Type: string
Default: box.NULL
Environment variable: TT_PROCESS_WORK_DIR
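
The sketch below illustrates how a relative path is resolved against process.work_dir. Both paths are hypothetical:

```yaml
process:
  # Hypothetical absolute working directory
  work_dir: '/srv/tarantool/instance001'
  # Relative path, resolved to /srv/tarantool/instance001/run/tarantool.pid
  pid_file: 'run/tarantool.pid'
```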

replication

The replication section defines configuration parameters related to replication.

replication.anon
replication.autoexpel
replication.bootstrap_strategy
replication.connect_timeout
replication.election_mode
replication.election_timeout
replication.election_fencing_mode
replication.failover
replication.peers
replication.skip_conflict
replication.sync_lag
replication.sync_timeout
replication.synchro_queue_max_size
replication.synchro_quorum
replication.synchro_timeout
replication.threads
replication.timeout

replication.anon¶

Whether to make the current instance act as an anonymous replica. Anonymous replicas are read-only and can be used, for example, for backups.

To make the specified instance act as an anonymous replica, set replication.anon to true:

instance003:
  replication:
    anon: true

You can find the full example on GitHub: anonymous_replica.

Anonymous replicas are not displayed in the box.info.replication section. You can check their status using box.info.replication_anon().

While anonymous replicas are read-only, you can write data to replication-local and temporary spaces (created with is_local = true and temporary = true, respectively). Given that changes to replication-local spaces are allowed, an anonymous replica might increase the 0 component of the vclock value.

Here are the limitations of having anonymous replicas in a replica set:

A replica set must contain at least one non-anonymous instance.
An anonymous replica can’t be configured as a writable instance by setting database.mode to rw or making it a leader using <replicaset_name>.leader.
If replication.failover is set to election, an anonymous replica can have replication.election_mode set to off only.
If replication.failover is set to supervised, an external failover coordinator doesn’t consider anonymous replicas when selecting a bootstrap or replica set leader.

Note

Anonymous replicas are not registered in the _cluster space. This means that there is no limitation on the number of anonymous replicas in a replica set.

Type: boolean
Default: false
Environment variable: TT_REPLICATION_ANON

replication.autoexpel¶

Since: 3.3.0

The replication.autoexpel option is designed for managing dynamic clusters using YAML-based configurations. It enables the automatic expulsion of instances that are removed from the YAML configuration.

Only instances with names that match the specified prefix are considered for expulsion; all others are excluded. Additionally, instances without a persistent name are ignored.

If an instance is in read-write mode and has the latest database schema, it initiates the expulsion of instances that:

Match the specified prefix

Are absent from the updated YAML configuration

The expulsion process follows the standard procedure, involving the removal of the instance from the _cluster system space.

The autoexpel logic is activated during specific events:

Startup. When the cluster starts, autoexpel checks and removes instances not matching the updated configuration.

Reconfiguration. When the YAML configuration is reloaded, autoexpel compares the current state to the updated configuration and performs necessary expulsions.

box.status watcher event. Changes detected by the box.status watcher also trigger the autoexpel mechanism.

autoexpel does not take any actions on newly joined instances unless one of the triggering events occurs. This means that an instance meeting the autoexpel criterion can still join the cluster, but it may be removed later during reconfiguration or on subsequent triggering events.

Note

The replication.autoexpel option governs the expelling process and is configurable at the replicaset, group, and global levels. It is not applicable at the instance level.

Configuration fields

by (string, default: nil): specifies the autoexpel criterion. Currently, only prefix is supported and must be explicitly set.
enabled (boolean, default: false): enables or disables the autoexpel logic.
prefix (string, default: nil): defines the pattern for instance names that are considered part of the cluster.

replication.autoexpel_by.*

The purpose of replication.autoexpel_by is to define the criterion that determines which instances in a cluster are subject to the autoexpel process.

The by field helps differentiate between:

Instances that are part of the cluster and should adhere to the YAML configuration.

Instances or tools (e.g., CDC tools) that use the replication channel but are not part of the cluster configuration.

The default value of by is nil, meaning no autoexpel criterion is applied unless explicitly set.

Currently, the only supported value for by is prefix. The prefix value instructs the system to identify instances based on their names, matching them against a prefix pattern defined in the configuration.

If the autoexpel feature is enabled (enabled: true), the by field must be explicitly set to prefix.

The absence of this field or an unsupported value will result in configuration errors.

replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'

Type: string

Default: nil

Environment variable: TT_REPLICATION_AUTOEXPEL_BY

replication.autoexpel_enabled.*

The replication.autoexpel_enabled field is a boolean configuration option that determines whether the autoexpel logic is active for the cluster. This feature is designed to automatically manage dynamic cluster configurations by removing instances that are no longer present in the YAML configuration.

Note

By default, the enabled field is set to false, meaning the autoexpel logic is turned off. This ensures that no instances are automatically removed unless explicitly configured.

Enabling autoexpel logic

To enable autoexpel, you should set enabled to true in the replication.autoexpel section of your YAML configuration:

replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'

To disable autoexpel, set enabled to false.

Dependencies

If enabled is set to true, the following fields are required:

by: specifies the criterion for autoexpel (e.g., prefix).

prefix: defines the pattern used to match instance names for expulsion.

Failure to configure these fields when enabled is true will result in a configuration error.

Type: boolean

Default: false

Environment variable: TT_REPLICATION_AUTOEXPEL_ENABLED

replication.autoexpel_prefix.*

The prefix field filters instances for expulsion by differentiating cluster instances (from the YAML configuration) from external services (e.g., CDC tools). Only instances matching the prefix are considered.

A consistent naming pattern ensures the _cluster system space automatically aligns with the YAML configuration.

If the prefix field is not set (nil), the autoexpel logic cannot identify instances for expulsion, and the feature will not function. This field is mandatory when replication.autoexpel_enabled is set to true.

How it works:

The prefix filters instance names (e.g., {{ replicaset_name }} for replicaset-specific names or i- for names starting with i-).

Instances matching the prefix and removed from the YAML configuration are expelled.

Unnamed instances or those not matching the prefix are ignored.

Dynamic prefix based on replicaset name:

replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'

In this setup:

Instances are grouped by replicaset names (e.g., r-001-i-001 for replicaset r-001).

The prefix ensures that only instances with names matching the replicaset name are automatically expelled when removed from the configuration.

Static prefix for matching patterns:

replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: 'i-'

In this setup:

All instances with names starting with i- (e.g., i-001, i-002) are considered for expulsion.

This is useful when instances follow a uniform naming convention.

Type: string

Default: nil

Environment variable: TT_REPLICATION_AUTOEXPEL_PREFIX

autoexpel full example

Create a config.yaml file with the following content:

credentials:
  users:
    guest:
      roles: [super]

replication:
  failover: manual
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'

iproto:
  listen:
    - uri: 'unix/:./var/run/{{ instance_name }}.iproto'

groups:
  g-001:
    replicasets:
      r-001:
        leader: r-001-i-001
        instances:
          r-001-i-001: {}
          r-001-i-002: {}
          r-001-i-003: {}

This configuration:

Sets up authentication with a guest user assigned the super role.

Enables the autoexpel option to automatically expel instances not present in the YAML file.

Defines instance names based on a prefix pattern: {{ replicaset_name }}.

Lists three instances: r-001-i-001, r-001-i-002, and r-001-i-003.

Open a terminal window for each instance and start the three instances using the following commands:

tarantool --name r-001-i-001 --config config.yaml -i
tarantool --name r-001-i-002 --config config.yaml -i
tarantool --name r-001-i-003 --config config.yaml -i

Edit config.yaml and remove the following entry for r-001-i-003:

          r-001-i-003: {}

The groups section of the updated config.yaml should look like this:

groups:
  g-001:
    replicasets:
      r-001:
        leader: r-001-i-001
        instances:
          r-001-i-001: {}
          r-001-i-002: {}

Save the file.

On the leader instance (r-001-i-001), check the _cluster space using box.space._cluster:fselect().

Hint

The _cluster system space in Tarantool stores metadata about all instances currently recognized as part of the cluster. It shows which instances are registered and active.

You should see r-001-i-003 still listed in the _cluster system space.

Reload the configuration:

config = require('config')
config:reload()

Verify the changes:

box.space._cluster:fselect()

After the reload, r-001-i-003 should no longer appear in the _cluster system space.

replication.bootstrap_strategy¶

Specifies a strategy used to bootstrap a replica set. The following strategies are available:

auto: a node doesn’t boot if half or more of the other nodes in a replica set are not connected. For example, if a replica set contains 2 or 3 nodes, a node requires 2 connected instances. In the case of 4 or 5 nodes, at least 3 connected instances are required. Moreover, a bootstrap leader fails to boot unless every connected node has chosen it as a bootstrap leader.
config: use the specified node to bootstrap a replica set. To specify the bootstrap leader, use the <replicaset_name>.bootstrap_leader option.
supervised: a bootstrap leader isn’t chosen automatically but should be appointed using box.ctl.make_bootstrap_leader() on the desired node.
legacy (deprecated since 2.11.0): a node requires the replication_connect_quorum number of other nodes to be connected. This option is added to keep the compatibility with the current versions of Cartridge and might be removed in the future.

Type: string
Default: auto
Environment variable: TT_REPLICATION_BOOTSTRAP_STRATEGY
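
The config strategy described above might look like the following fragment. This is only a sketch: the group, replica set, and instance names are hypothetical, connection settings are omitted, and instance001 is assumed to be made writable with database.mode.

```yaml
replication:
  # Bootstrap the replica set from an explicitly chosen node
  bootstrap_strategy: config

groups:
  group001:
    replicasets:
      replicaset001:
        # <replicaset_name>.bootstrap_leader points at the chosen node
        bootstrap_leader: instance001
        instances:
          instance001:
            database:
              mode: rw
          instance002: {}
          instance003: {}
```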

replication.connect_timeout¶

A timeout (in seconds) a replica waits when trying to connect to a master in a cluster. See orphan status for details.

This parameter is different from replication.timeout, which a master uses to disconnect a replica when the master receives no acknowledgments of heartbeat messages.

Type: number
Default: 30
Environment variable: TT_REPLICATION_CONNECT_TIMEOUT

replication.election_mode¶

A role of a replica set node in the leader election process.

The possible values are:

off: a node doesn’t participate in the election activities.
voter: a node can participate in the election process but can’t be a leader.
candidate: a node should be able to become a leader.
manual: lets you explicitly control which instance is the leader instead of relying on automated leader election. By default, the instance acts like a voter: it is read-only and may vote for other candidate instances. Once box.ctl.promote() is called, the instance becomes a candidate and starts a new election round. If the instance wins the election, it becomes a leader but won’t participate in any new elections.

Note

You can set replication.election_mode to a value other than off only if the replication.failover mode is election.

Type: string
Default: box.NULL (the actual default value depends on replication.failover)
Environment variable: TT_REPLICATION_ELECTION_MODE
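
A possible way to combine these values is sketched below, assuming a replica set of three instances with hypothetical names. Two instances may become leaders, while the third one only votes.

```yaml
replication:
  # election_mode values other than off require this failover mode
  failover: election

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            replication:
              election_mode: candidate
          instance002:
            replication:
              election_mode: candidate
          instance003:
            replication:
              # Votes in elections but can't become a leader
              election_mode: voter
```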

replication.election_timeout¶

Specifies the timeout (in seconds) between election rounds in the leader election process if the previous round ended up with a split vote.

The default value is quite large, and in most cases it can be lowered to 300-400 ms (0.3-0.4 in this option’s units).

To avoid a repeated split vote, the timeout is randomized on each node during every new election, from 100% to 110% of the original timeout value. For example, if the timeout is 300 ms and 3 nodes start the election simultaneously in the same term, they can set their election timeouts to 300, 310, and 320 ms respectively, or to 305, 302, and 324 ms, and so on. This makes another split vote unlikely because the election on different nodes won’t be restarted simultaneously.

Type: number
Default: 5
Environment variable: TT_REPLICATION_ELECTION_TIMEOUT
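
As an illustration of lowering the timeout, the sketch below sets it to 300 ms. The value is an example only, not a recommendation for any particular cluster.

```yaml
replication:
  failover: election
  # 0.3 s instead of the default 5 s. With the 100%-110% randomization,
  # each node uses a value between 0.3 s and 0.33 s in every new election.
  election_timeout: 0.3
```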

replication.election_fencing_mode¶

Specifies the leader fencing mode that affects the leader election process. When the parameter is set to soft or strict, the leader resigns its leadership if it has fewer than replication.synchro_quorum alive connections to the cluster nodes. The resigning leader receives the status of a follower in the current election term and becomes read-only.

In soft mode, a connection is considered dead if there are no responses for 4 * replication.timeout seconds both on the current leader and the followers.
In strict mode, a connection is considered dead if there are no responses for 2 * replication.timeout seconds on the current leader and 4 * replication.timeout seconds on the followers. This improves the chances that there is only one leader at any time.

Fencing applies to the instances that have the replication.election_mode set to candidate or manual. To turn off leader fencing, set election_fencing_mode to off.

Type: string
Default: soft
Possible values: off, soft, strict
Environment variable: TT_REPLICATION_ELECTION_FENCING_MODE
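
The timing rules above can be made concrete with a small sketch. The replication.timeout value is set explicitly here only to keep the arithmetic visible.

```yaml
replication:
  failover: election
  timeout: 1
  # strict: a connection is considered dead after
  #   2 * 1 = 2 s without responses on the current leader and
  #   4 * 1 = 4 s without responses on the followers.
  # soft would use 4 * 1 = 4 s on both sides.
  election_fencing_mode: strict
```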

replication.failover¶

A failover mode used to take over a master role when the current master instance fails. The following modes are available:

off

Leadership in a replica set is controlled using the database.mode option. In this case, you can set the database.mode option to rw on all instances in a replica set to make a master-master configuration.

The default database.mode is determined as follows: rw if there is one instance in a replica set; ro if there are several instances.

manual

Leadership in a replica set is controlled using the <replicaset_name>.leader option. In this case, a master-master configuration is forbidden.

In the manual mode, the database.mode option cannot be set explicitly. The leader is configured in the read-write mode, all the other instances are read-only.

election

Automated leader election is used to control leadership in a replica set.

In the election mode, database.mode and <replicaset_name>.leader shouldn’t be set explicitly.

supervised (Enterprise Edition only)

Leadership in a replica set is controlled using an external failover coordinator.

In the supervised mode, database.mode and <replicaset_name>.leader shouldn’t be set explicitly.
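
For the off mode, a master-master replica set could be sketched as follows. The names are hypothetical, connection settings are omitted, and the value off is quoted only as a precaution so that it is always read as a string.

```yaml
replication:
  failover: 'off'

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          # Both instances are writable, which gives a master-master setup
          instance001:
            database:
              mode: rw
          instance002:
            database:
              mode: rw
```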

roles

This section describes configuration parameters related to application roles.

Note

Configuration parameters related to roles can be defined in any scope.

roles
roles_cfg

roles¶

Specify the roles of an instance. To specify a role’s configuration, use the roles_cfg option.

Type: array
Default: nil
Environment variable: TT_ROLES

roles_cfg¶

Specify a role’s configuration. This option accepts a role name as the key and a role’s configuration as the value. To specify the roles of an instance, use the roles option.

Tip

The experimental.config.utils.schema built-in module provides an API for managing user-defined configurations of applications (app.cfg) and roles (roles_cfg).

Type: map
Default: nil
Environment variable: TT_ROLES_CFG
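
The relationship between the two options can be sketched like this. The greeter role and its greeting setting are hypothetical and stand in for any custom role and its configuration.

```yaml
# Enable a (hypothetical) role for the instances in this scope
roles: [greeter]

# Pass a configuration to that role: the key is the role name,
# the value is whatever configuration the role expects
roles_cfg:
  greeter:
    greeting: 'Hi'
```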

security

Enterprise Edition

Configuring security parameters is available in the Enterprise Edition only.

The security section defines configuration parameters related to various security settings.

Note

security can be defined in any scope.

security.auth_delay
security.auth_retries
security.auth_type
security.disable_guest
security.password_enforce_digits
security.password_enforce_lowercase
security.password_enforce_specialchars
security.password_enforce_uppercase
security.password_history_length
security.password_lifetime_days
security.password_min_length
security.secure_erasing

security.auth_delay¶

Specify a period of time (in seconds) that a specific user should wait for the next attempt after failed authentication.

The security.auth_retries option lets a client try to authenticate the specified number of times before security.auth_delay is enforced.

In the configuration below, Tarantool lets a client try to authenticate with the same username three times (the first attempt plus two retries). On the fourth attempt, the authentication delay configured with security.auth_delay is enforced. This means that a client should wait 10 seconds after the first failed attempt.

security:
  auth_delay: 10
  auth_retries: 2

Type: number
Default: 0
Environment variable: TT_SECURITY_AUTH_DELAY

security.auth_retries¶

Specify the maximum number of authentication retries allowed before security.auth_delay is enforced. The default value is 0, which means security.auth_delay is enforced after the first failed authentication attempt.

The retry counter is reset after security.auth_delay seconds since the first failed attempt. For example, if a client tries to authenticate fewer than security.auth_retries times within security.auth_delay seconds, no authentication delay is enforced. The retry counter is also reset after any successful authentication attempt.

Type: integer
Default: 0
Environment variable: TT_SECURITY_AUTH_RETRIES

security.auth_type¶

Specify a protocol used to authenticate users. The possible values are:

chap-sha1: use the CHAP protocol with SHA-1 hashing applied to passwords.
pap-sha256: use PAP authentication with the SHA256 hashing algorithm.

Note that CHAP stores password hashes in the _user space unsalted. If an attacker gains access to the database, they may crack a password, for example, using a rainbow table. For PAP, a password is salted with a user-unique salt before saving it in the database, which keeps the database protected from cracking using a rainbow table.

To enable PAP, specify the security.auth_type option as follows:

security:
  auth_type: 'pap-sha256'

Type: string
Default: ‘chap-sha1’
Environment variable: TT_SECURITY_AUTH_TYPE

security.disable_guest¶: If true, turn off access over remote connections from unauthenticated or guest users. This option affects connections between cluster members and net.box connections.

Type: boolean

Default: false

Environment variable: TT_SECURITY_DISABLE_GUEST

security.password_enforce_digits¶: If true, a password should contain digits (0-9).

Type: boolean

Default: false

Environment variable: TT_SECURITY_PASSWORD_ENFORCE_DIGITS

security.password_enforce_lowercase¶: If true, a password should contain lowercase letters (a-z).

Type: boolean

Default: false

Environment variable: TT_SECURITY_PASSWORD_ENFORCE_LOWERCASE

security.password_enforce_specialchars¶: If true, a password should contain at least one special character (such as &|?!@$).

Type: boolean

Default: false

Environment variable: TT_SECURITY_PASSWORD_ENFORCE_SPECIALCHARS

security.password_enforce_uppercase¶: If true, a password should contain uppercase letters (A-Z).

Type: boolean

Default: false

Environment variable: TT_SECURITY_PASSWORD_ENFORCE_UPPERCASE

security.password_history_length¶: Specify the number of unique new user passwords before an old password can be reused.

Note

Tarantool uses the auth_history field in the box.space._user system space to store user passwords.

Type: integer

Default: 0

Environment variable: TT_SECURITY_PASSWORD_HISTORY_LENGTH

security.password_lifetime_days¶: Specify the maximum period of time (in days) a user can use the same password. When this period ends, a user gets the “Password expired” error on a login attempt. To restore access for such users, use box.schema.user.passwd.

Note

The default 0 value means that a password never expires.

Type: integer

Default: 0

Environment variable: TT_SECURITY_PASSWORD_LIFETIME_DAYS

security.password_min_length¶: Specify the minimum number of characters for a password.

Type: integer

Default: 0

Environment variable: TT_SECURITY_PASSWORD_MIN_LENGTH
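
The password options above can be combined into a single policy. The following sketch uses arbitrary example values and applies to the Enterprise Edition only.

```yaml
security:
  # At least 12 characters, with digits, lowercase and uppercase
  # letters, and at least one special character
  password_min_length: 12
  password_enforce_digits: true
  password_enforce_lowercase: true
  password_enforce_uppercase: true
  password_enforce_specialchars: true
  # A password expires after 90 days
  password_lifetime_days: 90
  # An old password can be reused only after 3 unique new passwords
  password_history_length: 3
```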

security.secure_erasing¶: If true, forces Tarantool to overwrite a data file a few times before deletion to render recovery of a deleted file impossible. The option applies to both .xlog and .snap files as well as Vinyl data files.

Type: boolean

Default: false

Environment variable: TT_SECURITY_SECURE_ERASING

sharding

The sharding section defines configuration parameters related to sharding.

Note

Sharding support requires installing the vshard module. The minimum required version of vshard is 0.1.25.

sharding.bucket_count
sharding.discovery_mode
sharding.failover_ping_timeout
sharding.lock
sharding.rebalancer_disbalance_threshold
sharding.rebalancer_max_receiving
sharding.rebalancer_max_sending
sharding.rebalancer_mode
sharding.roles
sharding.sched_move_quota
sharding.sched_ref_quota
sharding.shard_index
sharding.sync_timeout
sharding.weight
sharding.zone

sharding.bucket_count¶

The total number of buckets in a cluster. Learn more in Bucket count.

Note

This option should be defined at the global level.

Example

sharding:
  bucket_count: 1000

Type: integer
Default: 3000
Environment variable: TT_SHARDING_BUCKET_COUNT

sharding.discovery_mode¶: A mode of the background discovery fiber used by the router to find buckets. Learn more in vshard.router.discovery_set().

Note

This option should be defined at the global level.

Type: string

Default: ‘on’

Possible values: ‘on’, ‘off’, ‘once’

Environment variable: TT_SHARDING_DISCOVERY_MODE

sharding.failover_ping_timeout¶: The timeout (in seconds) after which a node is considered unavailable if there are no responses during this period. The failover fiber is used to detect if a node is down.

Note

This option should be defined at the global level.

Type: number

Default: 5

Environment variable: TT_SHARDING_FAILOVER_PING_TIMEOUT

sharding.lock¶: Whether a replica set is locked. A locked replica set can neither receive new buckets nor migrate its own buckets.

Note

sharding.lock can be specified at the replica set level or higher.

Type: boolean

Default: nil

Environment variable: TT_SHARDING_LOCK

sharding.rebalancer_disbalance_threshold¶

The maximum bucket disbalance threshold (in percent). The disbalance is calculated for each replica set using the following formula:

|etalon_bucket_count - real_bucket_count| / etalon_bucket_count * 100

Note

This option should be defined at the global level.

Type: number
Default: 1
Environment variable: TT_SHARDING_REBALANCER_DISBALANCE_THRESHOLD
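
To make the formula concrete, the sketch below works through it in comments for a threshold of 5 percent. The bucket numbers are made up for illustration.

```yaml
sharding:
  # Worked example with etalon_bucket_count = 250:
  #   real_bucket_count = 240  ->  |250 - 240| / 250 * 100 = 4  (within the threshold)
  #   real_bucket_count = 230  ->  |250 - 230| / 250 * 100 = 8  (above the threshold)
  rebalancer_disbalance_threshold: 5
```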

sharding.rebalancer_max_receiving¶

The maximum number of buckets that can be received in parallel by a single replica set. This number must be limited because the rebalancer sends a large number of buckets from the existing replica sets to the newly added one. This produces a heavy load on the new replica set.

Note

This option should be defined at the global level.

Example

Suppose rebalancer_max_receiving is 100 and bucket_count is 1000. There are 3 replica sets with 333, 333, and 334 buckets respectively. When a new replica set is added, each replica set’s etalon_bucket_count becomes equal to 250. Rather than receiving all 250 buckets at once, the new replica set receives 100, 100, and 50 buckets sequentially.

Type: integer
Default: 100
Environment variable: TT_SHARDING_REBALANCER_MAX_RECEIVING
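
The scenario from the example corresponds to a configuration along these lines, with both options defined at the global level:

```yaml
sharding:
  bucket_count: 1000
  # A replica set receives at most 100 buckets in parallel, so the
  # 250 buckets for a new replica set arrive as 100 + 100 + 50
  rebalancer_max_receiving: 100
```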

sharding.rebalancer_max_sending¶: The degree of parallelism for parallel rebalancing.

Note

This option should be defined at the global level.

Type: integer

Default: 1

Maximum: 15

Environment variable: TT_SHARDING_REBALANCER_MAX_SENDING

sharding.rebalancer_mode¶

Since: 3.1.0

Configure how a rebalancer is selected:

auto (default): if there are no replica sets with the rebalancer sharding role (sharding.roles), a replica set with the rebalancer is selected automatically among all replica sets.
manual: one of the replica sets should have the rebalancer sharding role. The rebalancer is in this replica set.
off: rebalancing is turned off regardless of whether a replica set with the rebalancer sharding role exists or not.

Note

This option should be defined at the global level.

Type: string
Default: ‘auto’
Environment variable: TT_SHARDING_REBALANCER_MODE
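
A manual selection might be sketched as shown below. The group and replica set names are hypothetical, and instances are omitted for brevity.

```yaml
sharding:
  # Defined at the global level
  rebalancer_mode: manual

groups:
  storages:
    replicasets:
      storage-a:
        sharding:
          # The rebalancer runs in this replica set
          roles: [storage, rebalancer]
      storage-b:
        sharding:
          roles: [storage]
```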

sharding.roles¶

Roles of a replica set in regard to sharding. A replica set can have the following roles:

router: a replica set acts as a router.
storage: a replica set acts as a storage.
rebalancer: a replica set acts as a rebalancer.

The rebalancer role is optional. If it is not specified, a rebalancer is selected automatically from the master instances of replica sets.

There can be at most one replica set with the rebalancer role. Additionally, this replica set should have a storage role.

Example

replicasets:
  storage-a:
    sharding:
      roles: [storage, rebalancer]

snapshot

The snapshot section defines configuration parameters related to the snapshot files. To learn more about the snapshots’ configuration, check the Persistence page.

Note

snapshot can be defined in any scope.

snapshot.dir
snapshot.snap_io_rate_limit
snapshot.count
snapshot.by.*

snapshot.dir¶

A directory where memtx stores snapshot (.snap) files. A relative path in this option is interpreted as relative to process.work_dir.

By default, snapshots and WAL files are stored in the same directory. However, you can set different values for the snapshot.dir and wal.dir options to store them on different physical disks for better performance.

Type: string
Default: ‘var/lib/{{ instance_name }}’
Environment variable: TT_SNAPSHOT_DIR
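
Storing snapshots and WAL files separately could look like the sketch below. The mount points are hypothetical and assumed to belong to different physical disks.

```yaml
snapshot:
  # Hypothetical mount point of the first disk
  dir: '/mnt/disk1/snapshots/{{ instance_name }}'
wal:
  # Hypothetical mount point of the second disk
  dir: '/mnt/disk2/wal/{{ instance_name }}'
```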

snapshot.snap_io_rate_limit¶: Reduce the throttling effect of box.snapshot() on INSERT/UPDATE/DELETE performance by setting a limit on how many megabytes per second it can write to disk. The same can be achieved by splitting wal.dir and snapshot.dir locations and moving snapshots to a separate disk. The limit also affects what box.stat.vinyl().regulator may show for the write rate of dumps to .run and .index files.

Type: number

Default: box.NULL

Environment variable: TT_SNAPSHOT_SNAP_IO_RATE_LIMIT

snapshot.count¶

The maximum number of snapshots that are stored in the snapshot.dir directory. If the number of snapshots after creating a new one exceeds this value, the Tarantool garbage collector deletes old snapshots. If snapshot.count is set to zero, the garbage collector does not delete old snapshots.

Example

In the example, the checkpoint daemon creates a snapshot every two hours until it has created three snapshots. After creating a new snapshot (the fourth one), the oldest snapshot and any associated write-ahead-log files are deleted.

snapshot:
  by:
    interval: 7200
  count: 3

Note

Snapshots will not be deleted if replication is ongoing and the file has not been relayed to a replica. Therefore, snapshot.count has no effect unless all replicas are alive.

Type: integer
Default: 2
Environment variable: TT_SNAPSHOT_COUNT

snapshot.by.*

snapshot.by.interval
snapshot.by.wal_size

snapshot.by.interval¶

The interval in seconds between actions by the checkpoint daemon. If the option is set to a value greater than zero, and there is activity that causes change to a database, then the checkpoint daemon calls box.snapshot() every snapshot.by.interval seconds, creating a new snapshot file each time. If the option is set to zero, the checkpoint daemon is disabled.

Example

In the example, the checkpoint daemon creates a new database snapshot every two hours, if there is activity.

snapshot:
  by:
    interval: 7200

Type: number
Default: 3600
Environment variable: TT_SNAPSHOT_BY_INTERVAL

snapshot.by.wal_size¶: The threshold for the total size in bytes of all WAL files created since the last snapshot was taken. Once the configured threshold is exceeded, the WAL thread notifies the checkpoint daemon that it must make a new snapshot and delete old WAL files.

Type: integer

Default: 10^18

Environment variable: TT_SNAPSHOT_BY_WAL_SIZE
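
As an example of a size-based trigger, the fragment below requests a new snapshot once the WAL files written since the previous snapshot exceed 1 GiB. The threshold is an arbitrary illustration.

```yaml
snapshot:
  by:
    # 1024 * 1024 * 1024 bytes = 1 GiB
    wal_size: 1073741824
```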

sql

The sql section defines configuration parameters related to SQL.

Note

sql can be defined in any scope.

sql.cache_size

sql.cache_size¶: The maximum cache size (in bytes) for all SQL prepared statements. To see the actual cache size, use box.info.sql().cache.size.

Type: integer

Default: 5242880

Environment variable: TT_SQL_CACHE_SIZE

vinyl

The vinyl section defines configuration parameters related to the vinyl storage engine.

Note

vinyl can be defined in any scope.

vinyl.bloom_fpr
vinyl.cache
vinyl.defer_deletes
vinyl.dir
vinyl.max_tuple_size
vinyl.memory
vinyl.page_size
vinyl.range_size
vinyl.read_threads
vinyl.run_count_per_level
vinyl.run_size_ratio
vinyl.timeout
vinyl.write_threads

vinyl.bloom_fpr¶: A bloom filter’s false positive rate, that is, the acceptable probability that the bloom filter gives a wrong result. The vinyl.bloom_fpr setting is a default value for the bloom_fpr option passed to space_object:create_index().

Type: number

Default: 0.05

Environment variable: TT_VINYL_BLOOM_FPR

vinyl.cache¶: The cache size (in bytes) for the vinyl storage engine. The cache can be resized dynamically.

Type: integer

Default: 128 * 1024 * 1024

Environment variable: TT_VINYL_CACHE

vinyl.defer_deletes¶: Enable the deferred DELETE optimization in vinyl. It has been disabled by default since Tarantool version 2.10 to avoid possible performance degradation of secondary index reads.

Type: boolean

Default: false

Environment variable: TT_VINYL_DEFER_DELETES

vinyl.dir¶

A directory where vinyl files or subdirectories will be stored.

This option may contain a relative file path. In this case, it is interpreted as relative to process.work_dir.

Type: string
Default: ‘var/lib/{{ instance_name }}’
Environment variable: TT_VINYL_DIR

vinyl.max_tuple_size¶: The size of the largest allocation unit for the vinyl storage engine. It can be increased if it is necessary to store large tuples.

Type: integer

Default: 1024 * 1024

Environment variable: TT_VINYL_MAX_TUPLE_SIZE

vinyl.memory¶: The maximum number of in-memory bytes that vinyl uses.

Type: integer

Default: 128 * 1024 * 1024

Environment variable: TT_VINYL_MEMORY

vinyl.page_size¶: The page size. A page is a read/write unit for vinyl disk operations. The vinyl.page_size setting is a default value for the page_size option passed to space_object:create_index().

Type: integer

Default: 8 * 1024

Environment variable: TT_VINYL_PAGE_SIZE

vinyl.range_size¶

The default maximum range size for a vinyl index, in bytes. The maximum range size affects the decision of whether to split a range.

If vinyl.range_size is specified (and the value is not null or 0), then it is used as the default value for the range_size option passed to space_object:create_index().

If vinyl.range_size is not specified (or is explicitly set to null or 0), and range_size is not specified when the index is created, then Tarantool sets a value later depending on performance considerations. To see the actual value, use index_object:stat().range_size.

Type: integer
Default: box.NULL (means that the effective default is determined at runtime)
Environment variable: TT_VINYL_RANGE_SIZE

vinyl.read_threads¶: The maximum number of read threads that vinyl can use for concurrent operations, such as I/O and compression.

Type: integer

Default: 1

Environment variable: TT_VINYL_READ_THREADS

vinyl.run_count_per_level¶: The maximum number of runs per level in the vinyl LSM tree. If this number is exceeded, a new level is created. The vinyl.run_count_per_level setting is a default value for the run_count_per_level option passed to space_object:create_index().

Type: integer

Default: 2

Environment variable: TT_VINYL_RUN_COUNT_PER_LEVEL

vinyl.run_size_ratio¶: The ratio between the sizes of different levels in the LSM tree. The vinyl.run_size_ratio setting is a default value for the run_size_ratio option passed to space_object:create_index().

Type: number

Default: 3.5

Environment variable: TT_VINYL_RUN_SIZE_RATIO

vinyl.timeout¶: The vinyl storage engine has a scheduler that performs compaction. When vinyl is low on available memory, the compaction scheduler may be unable to keep up with incoming update requests. In that situation, queries may time out after vinyl.timeout seconds. This should rarely occur, since normally vinyl throttles inserts when it is running low on compaction bandwidth. Compaction can also be initiated manually with index_object:compact().

Type: integer

Default: 60

Environment variable: TT_VINYL_TIMEOUT

vinyl.write_threads¶: The maximum number of write threads that vinyl can use for some concurrent operations, such as I/O and compression.

Type: integer

Default: 4

Environment variable: TT_VINYL_WRITE_THREADS

wal

The wal section defines configuration parameters related to the write-ahead log (WAL). To learn more about the WAL configuration, check the Persistence page.

Note

wal can be defined in any scope.

wal.cleanup_delay
wal.dir
wal.dir_rescan_delay
wal.max_size
wal.mode
wal.queue_max_size
wal.retention_period
wal.ext.*

wal.cleanup_delay¶

The delay in seconds used to prevent the Tarantool garbage collector from immediately removing write-ahead log files after a node restart. This delay eliminates possible erroneous situations when the master deletes WALs needed by replicas after restart. As a consequence, replicas sync with the master faster after its restart and don’t need to download all the data again. Once all the nodes in the replica set are up and running, a scheduled garbage collection is started again even if wal.cleanup_delay has not expired.

Note

The option has no effect on nodes running as anonymous replicas.

wal.ext.*

Enterprise Edition

Configuring wal.ext.* parameters is available in the Enterprise Edition only.

This section describes options related to WAL extensions.

wal.ext.new
wal.ext.old
wal.ext.spaces

wal.ext.new¶: Enable storing a new tuple for each CRUD operation performed. The option is in effect for all spaces. To adjust the option for specific spaces, use the wal.ext.spaces option.

Type: boolean

Default: false

Environment variable: TT_WAL_EXT_NEW

wal.ext.old¶: Enable storing an old tuple for each CRUD operation performed. The option is in effect for all spaces. To adjust the option for specific spaces, use the wal.ext.spaces option.

Type: boolean

Default: false

Environment variable: TT_WAL_EXT_OLD

wal.ext.spaces¶

Enable or disable storing an old and new tuple in the WAL record for a given space explicitly. The configuration for specific spaces has priority over the configuration in the wal.ext.new and wal.ext.old options.

The option is a key-value pair:

The key is a space name (string).
The value is a table that includes two optional boolean options: old and new. The format and the default value of these options are described in wal.ext.old and wal.ext.new.

Example

In the example, only new tuples are added to the log for the bands space.

wal:
  ext:
    new: true
    old: true
    spaces:
      bands:
        old: false

Type: map
Default: nil
Environment variable: TT_WAL_EXT_SPACES

Configuration reference (box.cfg)

Note

Starting with the 3.0 version, the recommended way of configuring Tarantool is using a configuration file. Configuring Tarantool in code is considered a legacy approach.

This topic describes all configuration parameters that can be specified in code using the box.cfg API.

Basic parameters
Configuring the storage
Checkpoint daemon
Binary logging and snapshots
Hot standby
Replication
Networking
Logging
Audit log
Authentication
Flight recorder
Feedback
Deprecated parameters

Basic parameters

background
coredump
custom_proc_title
listen
memtx_dir
pid_file
read_only
sql_cache_size
vinyl_dir
vinyl_timeout
username
wal_dir
work_dir
worker_pool_threads
strip_core
memtx_use_mvcc_engine

background¶

Since version 1.6.2.

Run the server as a background task. The log and pid_file parameters must be non-null for this to work.

Important

Do not enable the background mode for applications intended to be run by the tt utility. For more information, see the tt start reference.

Type: boolean
Default: false
Environment variable: TT_BACKGROUND
Dynamic: no
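
A minimal sketch of a background-mode configuration is shown below. The file names are hypothetical and only illustrate that both log and pid_file are set.

```lua
-- Sketch only: the file names below are hypothetical.
box.cfg{
    background = true,
    log = 'tarantool.log',       -- must be non-null in background mode
    pid_file = 'tarantool.pid',  -- must be non-null in background mode
}
```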

coredump¶

Create coredump files.

Usually, an administrator needs to call ulimit -c unlimited (or set corresponding options in systemd’s unit file) before running a Tarantool process to get core dumps. If coredump is enabled, Tarantool sets the corresponding resource limit by itself and the administrator doesn’t need to call ulimit -c unlimited (see man 3 setrlimit).

This option also sets the state of the dumpable attribute, which is enabled by default, but may be dropped in some circumstances (according to man 2 prctl, see PR_SET_DUMPABLE).

Type: boolean
Default: false
Environment variable: TT_COREDUMP
Dynamic: no

custom_proc_title¶

Since version 1.6.7.

Add the given string to the server’s process title (what’s shown in the COMMAND column for ps -ef and top -c commands).

For example, ordinarily ps -ef shows the Tarantool server process thus:

$ ps -ef | grep tarantool
1000     14939 14188  1 10:53 pts/2    00:00:13 tarantool <running>

But if the configuration parameters include custom_proc_title='sessions' then the output looks like:

$ ps -ef | grep tarantool
1000     14939 14188  1 10:53 pts/2    00:00:16 tarantool <running>: sessions

Type: string
Default: null
Environment variable: TT_CUSTOM_PROC_TITLE
Dynamic: yes

listen¶

Since version 1.6.4.

The read/write data port number or URI (Uniform Resource Identifier) string. It has no default value, so it must be specified if remote clients connect without using the “admin port”. Connections made with listen = URI are called “binary port” or “binary protocol” connections.

A typical value is 3301.

box.cfg { listen = 3301 }

box.cfg { listen = "127.0.0.1:3301" }

Note

A replica also binds to this port, and accepts connections, but these connections can only serve reads until the replica becomes a master.

Starting from version 2.10.0, you can specify several URIs, and the port number is always stored as an integer value.

Type: integer or string
Default: null
Environment variable: TT_LISTEN
Dynamic: yes
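
As a rough illustration of the multi-URI form available since 2.10.0, the sketch below passes a table of URI strings. The addresses are examples only.

```lua
-- Sketch only: the addresses are examples, not recommendations.
-- Listen on two URIs at once (available since 2.10.0).
box.cfg{ listen = { '127.0.0.1:3301', '127.0.0.1:3302' } }

-- listen is dynamic, so it can be changed later without a restart.
box.cfg{ listen = '127.0.0.1:3303' }
```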

memtx_dir¶

Since version 1.7.4.

A directory where memtx stores snapshot (.snap) files. A relative path in this option is interpreted as relative to work_dir.

By default, snapshots and WAL files are stored in the same directory. However, you can set different values for the memtx_dir and wal_dir options to store them on different physical disks and improve performance.

Type: string
Default: “.”
Environment variable: TT_MEMTX_DIR
Dynamic: no

pid_file¶

Since version 1.4.9.

Store the process id in this file. Can be relative to work_dir. A typical value is “tarantool.pid”.

Type: string
Default: null
Environment variable: TT_PID_FILE
Dynamic: no

read_only¶

Since version 1.7.1.

Set box.cfg{read_only = true} to put the server instance in read-only mode. After this, any request that tries to change persistent data fails with the ER_READONLY error. Read-only mode should be used for master-replica replication. Read-only mode does not affect data-change requests for spaces defined as temporary. Although read-only mode prevents the server from writing to the WAL, it does not prevent writing diagnostics with the log module.

Type: boolean
Default: false
Environment variable: TT_READ_ONLY
Dynamic: yes

Setting read_only == true affects spaces differently depending on the options that were used during box.schema.space.create, as summarized by this chart:

Option	Can be created?	Can be written to?	Is replicated?	Is persistent?
(default)	no	no	yes	yes
temporary	no	yes	no	no
is_local	no	yes	no	yes
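
The behavior summarized in the chart can be tried with a sketch like the following. It assumes that box.cfg{} has already been called and that a persistent space named demo (a hypothetical name) exists with a primary index.

```lua
-- Sketch only: assumes box.cfg{} has already been called and that a
-- persistent space named 'demo' (hypothetical) exists with a primary index.
box.cfg{ read_only = true }
print(box.info.ro)  -- reports whether the instance is currently read-only

-- A write to a persistent space is expected to fail with ER_READONLY.
local ok, err = pcall(function()
    return box.space.demo:insert{ 1 }
end)
if not ok then
    print('write rejected: ' .. tostring(err))
end

box.cfg{ read_only = false }  -- dynamic: switch back without a restart
```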

sql_cache_size¶

Since version 2.3.1.

The maximum number of bytes in the cache for SQL prepared statements. (The number of bytes that are actually used can be seen with box.info.sql().cache.size.)

Type: number
Default: 5242880
Environment variable: TT_SQL_CACHE_SIZE
Dynamic: yes
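
A short sketch of changing the limit and reading the current usage, assuming an already running instance (the 10 MB value is arbitrary):

```lua
-- Sketch only: the 10 MB limit is an arbitrary illustrative value.
box.cfg{ sql_cache_size = 10 * 1024 * 1024 }

-- Number of bytes currently used by the prepared statement cache.
print(box.info.sql().cache.size)
```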

vinyl_dir¶

Since version 1.7.1.

A directory where vinyl files or subdirectories will be stored. Can be relative to work_dir. If not specified, defaults to work_dir.

Type: string
Default: “.”
Environment variable: TT_VINYL_DIR
Dynamic: no

vinyl_timeout¶

Since version 1.7.5.

The vinyl storage engine has a scheduler that performs compaction. When vinyl is low on available memory, the compaction scheduler may be unable to keep up with incoming update requests. In that situation, queries may time out after vinyl_timeout seconds. This should rarely occur, since normally vinyl throttles inserts when it is running low on compaction bandwidth. Compaction can also be initiated manually with index_object:compact().

Type: float
Default: 60
Environment variable: TT_VINYL_TIMEOUT
Dynamic: yes
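
The two knobs mentioned above can be sketched as follows. The space name demo_vinyl and index name pk are hypothetical, and the timeout value is arbitrary.

```lua
-- Sketch only: assumes box.cfg{} has already been called and that a vinyl
-- space named 'demo_vinyl' (hypothetical) with a primary index 'pk' exists.
box.cfg{ vinyl_timeout = 120 }  -- dynamic: takes effect without a restart

-- Ask vinyl to compact the index now instead of waiting for the scheduler.
box.space.demo_vinyl.index.pk:compact()
```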

username¶

Since version 1.4.9.

UNIX user name to switch to after start.

Type: string
Default: null
Environment variable: TT_USERNAME
Dynamic: no

wal_dir¶

Since version 1.6.2.

A directory where write-ahead log (.xlog) files are stored. A relative path in this option is interpreted as relative to work_dir.

By default, WAL files and snapshots are stored in the same directory. However, you can set different values for the wal_dir and memtx_dir options to store them on different physical disks and improve performance.

Type: string
Default: “.”
Environment variable: TT_WAL_DIR
Dynamic: no

work_dir¶

Since version 1.4.9.

A directory where database working files will be stored. The server instance switches to work_dir with chdir(2) after start. Can be relative to the current directory. If not specified, defaults to the current directory. Other directory parameters may be relative to work_dir, for example:

box.cfg{
    work_dir = '/home/user/A',
    wal_dir = 'B',
    memtx_dir = 'C'
}

will put xlog files in /home/user/A/B, snapshot files in /home/user/A/C, and all other files or subdirectories in /home/user/A.

Type: string
Default: null
Environment variable: TT_WORK_DIR
Dynamic: no

worker_pool_threads¶

Since version 1.7.5.

The maximum number of threads to use during execution of certain internal processes (currently socket.getaddrinfo() and coio_call()).

Type: integer
Default: 4
Environment variable: TT_WORKER_POOL_THREADS
Dynamic: yes

strip_core¶

Since version 2.2.2.

Whether coredump files should include memory allocated for tuples. (This can be large if Tarantool runs under heavy load.) Setting to true means “do not include”. In an older version of Tarantool the default value of this parameter was false.

Type: boolean
Default: true
Environment variable: TT_STRIP_CORE
Dynamic: no

memtx_use_mvcc_engine¶

Since version 2.6.1.

Enable the transactional manager if set to true.

Type: boolean
Default: false
Environment variable: TT_MEMTX_USE_MVCC_ENGINE
Dynamic: no

Configuring the storage

memtx_memory
memtx_max_tuple_size
memtx_min_tuple_size
memtx_allocator
memtx_sort_threads
slab_alloc_factor
slab_alloc_granularity
vinyl_bloom_fpr
vinyl_cache
vinyl_max_tuple_size
vinyl_memory
vinyl_page_size
vinyl_range_size
vinyl_run_count_per_level
vinyl_run_size_ratio
vinyl_read_threads
vinyl_write_threads

memtx_memory¶

Since version 1.7.4.

How much memory Tarantool allocates to store tuples. When the limit is reached, INSERT or UPDATE requests begin failing with error ER_MEMORY_ISSUE. The server does not go beyond the memtx_memory limit to allocate tuples, but there is additional memory used to store indexes and connection information.

Type: float
Default: 256 * 1024 * 1024 = 268435456 bytes
Minimum: 33554432 bytes (32 MB)
Environment variable: TT_MEMTX_MEMORY
Dynamic: yes but it cannot be decreased
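
The dynamic behavior can be illustrated with the sketch below. It assumes an instance that was started with the default 256 MB limit; the new size is arbitrary.

```lua
-- Sketch only: assumes box.cfg{} has already been called with the default
-- memtx_memory of 256 MB. The sizes are illustrative.
local MB = 1024 * 1024

-- Increasing the limit at runtime is allowed.
box.cfg{ memtx_memory = 512 * MB }

-- Decreasing it is expected to be rejected, so the call is wrapped in pcall.
local ok, err = pcall(function()
    box.cfg{ memtx_memory = 256 * MB }
end)
print(ok, err)
```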

memtx_max_tuple_size¶

Since version 1.7.4.

Size of the largest allocation unit, for the memtx storage engine. It can be increased if it is necessary to store large tuples.

Type: integer
Default: 1024 * 1024 = 1048576 bytes
Environment variable: TT_MEMTX_MAX_TUPLE_SIZE
Dynamic: yes

memtx_min_tuple_size¶

Since version 1.7.4.

Size of the smallest allocation unit. It can be decreased if most of the tuples are very small.

Type: integer
Default: 16 bytes
Possible values: between 8 and 1048280 inclusive
Environment variable: TT_MEMTX_MIN_TUPLE_SIZE
Dynamic: no

memtx_allocator¶

Since version 2.10.0.

Specify the allocator that manages memory for memtx tuples. Possible values:

system – the memory is allocated as needed, checking that the quota is not exceeded. The allocator is based on the malloc function.
small – a slab allocator. The allocator repeatedly uses a memory block to allocate objects of the same type. Note that this allocator is prone to unresolvable fragmentation on specific workloads, so you can switch to system in such cases.

Type: string
Default: ‘small’
Environment variable: TT_MEMTX_ALLOCATOR
Dynamic: no

memtx_sort_threads¶

Since: 3.0.0.

The number of threads from the thread pool used to sort keys of secondary indexes on loading a memtx database. The minimum value is 1, the maximum value is 256. The default is to use all available cores.

Note

Since 3.0.0, this option replaces the approach when OpenMP threads are used to parallelize sorting. For backward compatibility, the OMP_NUM_THREADS environment variable is taken into account to set the number of sorting threads.

Type: integer
Default: box.NULL
Environment variable: TT_MEMTX_SORT_THREADS
Dynamic: no

slab_alloc_factor¶

Checkpoint daemon

checkpoint_count
checkpoint_interval
checkpoint_wal_threshold

The work of the checkpoint daemon is based on the following configuration options:

checkpoint_interval – a new snapshot is taken once in a given period.
checkpoint_wal_threshold – a new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit.

If necessary, the checkpoint daemon also activates the Tarantool garbage collector that deletes old snapshots and WAL files.

Tarantool garbage collector

The Tarantool garbage collector is called in the following cases:

When the number of snapshots reaches the checkpoint_count limit. After a new snapshot is taken, the Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
When the size of all WAL files created since the last snapshot reaches the limit of checkpoint_wal_threshold. Once this size is exceeded, the checkpoint daemon takes a snapshot, then the garbage collector deletes the old WAL files.

If an old snapshot file is deleted, the Tarantool garbage collector also deletes any write-ahead log (.xlog) files that meet the following conditions:

The WAL files are older than the snapshot file.
The WAL files contain information present in the snapshot file.

The Tarantool garbage collector also deletes obsolete vinyl .run files.

The Tarantool garbage collector doesn’t delete a file in the following cases:

A backup is running, and the file has not been backed up (see Hot backup).
Replication is running, and the file has not been relayed to a replica (see Replication architecture).
A replica is connecting.
A replica has fallen behind. The progress of each replica is tracked; if a replica’s position is far from being up to date, then the server stops to give it a chance to catch up. If an administrator concludes that a replica is permanently down, then the correct procedure is to restart the server, or (preferably) remove the replica from the cluster.
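
To see what the garbage collector currently keeps and which consumers hold files back, the state can be inspected with box.info.gc(). The sketch below is illustrative only and assumes the checkpoints and consumers fields reported by that call; the exact set of fields may differ between versions.

```lua
-- Sketch only: prints the checkpoints known to the garbage collector and
-- the consumers (for example, replicas) that may prevent file deletion.
local gc = box.info.gc()

for _, checkpoint in ipairs(gc.checkpoints) do
    print('checkpoint signature:', checkpoint.signature)
end

for _, consumer in ipairs(gc.consumers) do
    print('consumer:', consumer.name, 'signature:', consumer.signature)
end
```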

checkpoint_interval¶

Since version 1.7.4.

The interval in seconds between actions by the checkpoint daemon. If the option is set to a value greater than zero, and there is activity that causes change to a database, then the checkpoint daemon calls box.snapshot() every checkpoint_interval seconds, creating a new snapshot file each time. If the option is set to zero, the checkpoint daemon is disabled.

Example

box.cfg{ checkpoint_interval = 7200 }

In the example, the checkpoint daemon creates a new database snapshot every two hours, if there is activity.

Type: integer
Default: 3600 (one hour)
Environment variable: TT_CHECKPOINT_INTERVAL
Dynamic: yes

checkpoint_count¶

Since version 1.7.4.

The maximum number of snapshots that are stored in the memtx_dir directory. If the number of snapshots after creating a new one exceeds this value, the Tarantool garbage collector deletes old snapshots. If the option is set to zero, the garbage collector does not delete old snapshots.

Example

box.cfg{
    checkpoint_interval = 7200,
    checkpoint_count  = 3
}

In the example, the checkpoint daemon creates a new snapshot every two hours until it has created three snapshots. After creating a new snapshot (the fourth one), the oldest snapshot and any associated write-ahead-log files are deleted.

Note

Snapshots will not be deleted if replication is ongoing and the file has not been relayed to a replica. Therefore, checkpoint_count has no effect unless all replicas are alive.

Type: integer
Default: 2
Environment variable: TT_CHECKPOINT_COUNT
Dynamic: yes

checkpoint_wal_threshold¶

Since version 2.1.2.

The threshold for the total size in bytes for all WAL files created since the last checkpoint. Once the configured threshold is exceeded, the WAL thread notifies the checkpoint daemon that it must make a new checkpoint and delete old WAL files.

This parameter helps administrators who need to estimate how much disk space to allocate for a partition containing WAL files.

Type: integer
Default: 10^18 (a large number so in effect there is no limit by default)
Environment variable: TT_CHECKPOINT_WAL_THRESHOLD
Dynamic: yes
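
A sketch of combining the three checkpoint options is shown below. The 2 GB threshold is an arbitrary illustrative value, not a recommendation.

```lua
-- Sketch only: the threshold value is arbitrary.
local GB = 1024 * 1024 * 1024

box.cfg{
    checkpoint_interval = 3600,         -- take a snapshot at least hourly
    checkpoint_count = 2,               -- keep the two latest snapshots
    checkpoint_wal_threshold = 2 * GB,  -- or sooner, once new WALs exceed 2 GB
}
```

With such settings, a checkpoint is triggered by whichever condition is met first.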

Binary logging and snapshots

force_recovery
wal_max_size
snap_io_rate_limit
wal_mode
wal_dir_rescan_delay
wal_queue_max_size
wal_cleanup_delay
wal_ext
secure_erasing

force_recovery¶

Since version 1.7.4.

If force_recovery equals true, Tarantool tries to continue if there is an error while reading a snapshot file (at server instance start) or a write-ahead log file (at server instance start or when applying an update at a replica): skips invalid records, reads as much data as possible and lets the process finish with a warning. Users can prevent the error from recurring by writing to the database and executing box.snapshot().

Otherwise, Tarantool aborts recovery if there is an error while reading.

Type: boolean
Default: false
Environment variable: TT_FORCE_RECOVERY
Dynamic: no
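
The recovery flow described above might look like the following sketch. Because force_recovery is not dynamic, it is passed to the first box.cfg call.

```lua
-- Sketch only: start the instance while tolerating damaged records.
box.cfg{ force_recovery = true }

-- After the instance has started and data has been written, take a fresh
-- snapshot so that later restarts do not need to read the damaged files.
box.snapshot()
```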

wal_max_size¶

Since version 1.7.4.

The maximum number of bytes in a single write-ahead log file. When a request would cause an .xlog file to become larger than wal_max_size, Tarantool creates a new WAL file.

Type: integer
Default: 268435456 (256 * 1024 * 1024) bytes
Environment variable: TT_WAL_MAX_SIZE
Dynamic: no

snap_io_rate_limit¶

Since version 1.4.9.

Reduce the throttling effect of box.snapshot() on INSERT/UPDATE/DELETE performance by setting a limit on how many megabytes per second it can write to disk. The same can be achieved by splitting wal_dir and memtx_dir locations and moving snapshots to a separate disk. The limit also affects what box.stat.vinyl().regulator may show for the write rate of dumps to .run and .index files.

Type: float
Default: null
Environment variable: TT_SNAP_IO_RATE_LIMIT
Dynamic: yes

wal_mode¶

Since version 1.6.2.

Specify fiber-WAL-disk synchronization mode as:

none: write-ahead log is not maintained. A node with wal_mode set to none can’t be a replication master.
write: fibers wait for their data to be written to the write-ahead log (no fsync(2)).
fsync: fibers wait for their data, fsync(2) follows each write(2).

Type: string
Default: “write”
Environment variable: TT_WAL_MODE
Dynamic: no
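
As an illustration, the strictest mode can be selected as in the sketch below. Since wal_mode is not dynamic, it is set in the first box.cfg call.

```lua
-- Sketch only: wal_mode is not dynamic, so it is set at startup.
box.cfg{
    wal_mode = 'fsync',  -- each write(2) to the WAL is followed by fsync(2)
}
```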

wal_dir_rescan_delay¶

Since version 1.6.2.

The time interval in seconds between periodic scans of the write-ahead-log file directory, when checking for changes to write-ahead-log files for the sake of replication or hot standby.

Type: float
Default: 2
Environment variable: TT_WAL_DIR_RESCAN_DELAY
Dynamic: no

wal_queue_max_size¶

Since version 2.8.1.

The size of the queue (in bytes) used by a replica to submit new transactions to a write-ahead log (WAL). This option helps limit the rate at which a replica submits transactions to the WAL. Limiting the queue size might be useful when a replica is trying to sync with a master and reads new transactions faster than writing them to the WAL.

Note

You might consider increasing the wal_queue_max_size value in case of large tuples (approximately one megabyte or larger).

Type: number
Default: 16777216 bytes
Environment variable: TT_WAL_QUEUE_MAX_SIZE
Dynamic: yes
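
For workloads with large tuples, the queue can be enlarged at runtime. The value in this sketch is arbitrary and only shows the syntax.

```lua
-- Sketch only: 64 MB is an illustrative value for a large-tuple workload.
box.cfg{ wal_queue_max_size = 64 * 1024 * 1024 }
```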

wal_cleanup_delay¶

Since version 2.6.3.

The delay in seconds used to prevent the Tarantool garbage collector from immediately removing write-ahead log files after a node restart. This delay eliminates possible erroneous situations when the master deletes WALs needed by replicas after restart. As a consequence, replicas sync with the master faster after its restart and don’t need to download all the data again. Once all the nodes in the replica set are up and running, a scheduled garbage collection is started again even if wal_cleanup_delay has not expired.

Note

The wal_cleanup_delay option has no effect on nodes running as anonymous replicas.

Type: number
Default: 14400 seconds
Environment variable: TT_WAL_CLEANUP_DELAY
Dynamic: yes
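
A sketch of shortening the delay from the default four hours to one hour (the value is arbitrary):

```lua
-- Sketch only: keep WAL files for at most one hour after a restart
-- before the garbage collector is allowed to remove them.
box.cfg{ wal_cleanup_delay = 3600 }
```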

wal_ext¶

Since version 2.11.0.

(Enterprise Edition only) Allows you to add auxiliary information to each write-ahead log record. For example, you can enable storing an old and new tuple for each CRUD operation performed. This information might be helpful for implementing a CDC (Change Data Capture) utility that transforms a data replication stream.

You can enable storing old and new tuples as follows:

Set the old and new options to true to store old and new tuples in a write-ahead log for all spaces.
```
box.cfg {
    wal_ext = { old = true, new = true }
}
```

To adjust these options for specific spaces, use the spaces option.

box.cfg {
    wal_ext = {
        old = true, new = true,
        spaces = {
            space1 = { old = false },
            space2 = { new = false }
        }
    }
}

The configuration for specific spaces has priority over the global configuration, so only new tuples are added to the log for space1 and only old tuples for space2.

Note that records with additional fields are replicated as follows:

If a replica doesn’t support the extended format configured on a master, auxiliary fields are skipped.
If a replica and master have different configurations for WAL records, the master’s configuration is ignored.

Type: map
Default: nil
Environment variable: TT_WAL_EXT
Dynamic: yes

secure_erasing¶

Since version 3.0.0.

(Enterprise Edition only) If true, forces Tarantool to overwrite a data file a few times before deletion to render recovery of a deleted file impossible. The option applies to both .xlog and .snap files as well as Vinyl data files.

Type: boolean
Default: false
Environment variable: TT_SECURE_ERASING
Dynamic: yes

Hot standby

hot_standby¶

Since version 1.7.4.

Whether to start the server in hot standby mode.

Hot standby is a feature which provides a simple form of failover without replication.

The expectation is that there will be two instances of the server using the same configuration. The first one to start will be the “primary” instance. The second one to start will be the “standby” instance.

To initiate the standby instance, start a second instance of the Tarantool server on the same computer with the same box.cfg configuration settings – including the same directories and same non-null URIs – and with the additional configuration setting hot_standby = true. Expect to see a notification ending with the words I> Entering hot standby mode. This is fine. It means that the standby instance is ready to take over if the primary instance goes down.

The standby instance will initialize and will try to take a lock on wal_dir, but will fail because the primary instance has made a lock on wal_dir. So the standby instance goes into a loop, reading the write ahead log which the primary instance is writing (so the two instances are always in sync), and trying to take the lock. If the primary instance goes down for any reason, the lock will be released. In this case, the standby instance will succeed in taking the lock, will connect on the listen address and will become the primary instance. Expect to see a notification ending with the words I> ready to accept requests.

Thus there is no noticeable downtime if the primary instance goes down.

The hot standby feature has no effect:

if wal_dir_rescan_delay is set to a large number (on macOS and FreeBSD); on these platforms, the loop repeats every wal_dir_rescan_delay seconds.
if wal_mode = 'none'; it is designed to work with wal_mode = 'write' or wal_mode = 'fsync'.
for spaces created with engine = 'vinyl'; it is designed to work for spaces created with engine = 'memtx'.

Type: boolean
Default: false
Environment variable: TT_HOT_STANDBY
Dynamic: no
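
The setup described above can be sketched as two configurations that differ only in hot_standby. The directory and port are hypothetical, and each box.cfg call belongs to a separate Tarantool process on the same machine.

```lua
-- Sketch only: the directory and port are hypothetical.

-- Process 1 (started first, becomes the primary instance):
box.cfg{
    listen = 3301,
    work_dir = '/var/lib/tarantool/demo',
    wal_mode = 'write',
}

-- Process 2 (started second, waits as the standby instance).
-- It is shown as a comment because it runs in a different process:
-- box.cfg{
--     listen = 3301,
--     work_dir = '/var/lib/tarantool/demo',
--     wal_mode = 'write',
--     hot_standby = true,
-- }
```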

Replication

replication¶

Since version 1.7.4.

If replication is not an empty string, the instance is considered to be a Tarantool replica. The replica will try to connect to the master specified in replication with a URI (Uniform Resource Identifier), for example:

konstantin:secret_password@tarantool.org:3301

If there is more than one replication source in a replica set, specify an array of URIs, for example (replace 'uri1' and 'uri2' in this example with valid URIs):

box.cfg{ replication = { 'uri1', 'uri2' } }

Note

Starting from version 2.10.0, there are a number of other ways to specify several URIs. See syntax examples.

If one of the URIs is “self” – that is, if one of the URIs is for the instance where box.cfg{} is being executed – then it is ignored. Thus, it is possible to use the same replication specification on multiple server instances, as shown in these examples.

The default user name is ‘guest’.

A read-only replica does not accept data-change requests on the listen port.

The replication parameter is dynamic, that is, to enter master mode, simply set replication to an empty string and issue:

box.cfg{ replication = new-value }

Type: string
Default: null
Environment variable: TT_REPLICATION
Dynamic: yes
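
Switching between replica and master mode at runtime can be sketched as follows. The user name, password, and address are hypothetical.

```lua
-- Sketch only: credentials and address are hypothetical.
-- Follow a master:
box.cfg{ replication = 'replicator:password@192.168.0.101:3301' }

-- Later, enter master mode by clearing the replication source:
box.cfg{ replication = '' }
```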

replication_anon¶

Since version 2.3.1.

A Tarantool replica can be anonymous. This type of replica is read-only (but you still can write to temporary and replica-local spaces), and it isn’t present in the _cluster space.

Since an anonymous replica isn’t registered in the _cluster space, there is no limit on the number of anonymous replicas in a replica set: you can have as many of them as you want.

In order to make a replica anonymous, pass the option replication_anon=true to box.cfg and set read_only to true.

Let’s walk through bootstrapping an anonymous replica. Suppose we have a master configured with

box.cfg{listen=3301}

and created a local space called “loc”:

box.schema.space.create('loc', {is_local=true})
box.space.loc:create_index("pk")

Now, to configure an anonymous replica, we need to issue box.cfg, as usual.

box.cfg{replication_anon=true, read_only=true, replication=3301}

As mentioned above, replication_anon may be set to true only together with read_only. The instance will fetch the master’s snapshot and start following its changes. It will receive no id, so its id value will remain zero.

tarantool> box.info.id
---
- 0
...
tarantool> box.info.replication
---
- 1:
    id: 1
    uuid: 3c84f8d9-e34d-4651-969c-3d0ed214c60f
    lsn: 4
    upstream:
      status: follow
      idle: 0.6912029999985
      peer:
      lag: 0.00014615058898926
...

Now we can use the replica. For example, we can do inserts into the local space:

tarantool> for i = 1,10 do
    > box.space.loc:insert{i}
    > end
---
...

Note that while the instance is anonymous, it will increase the 0-th component of its vclock:

tarantool> box.info.vclock
---
- {0: 10, 1: 4}
...

Let’s now promote the anonymous replica to a regular one:

tarantool> box.cfg{replication_anon=false}
2019-12-13 20:34:37.423 [71329] main I> assigned id 2 to replica 6a9c2ed2-b9e1-4c57-a0e8-51a46def7661
2019-12-13 20:34:37.424 [71329] main/102/interactive I> set 'replication_anon' configuration option to false
---
...

tarantool> 2019-12-13 20:34:37.424 [71329] main/117/applier/ I> subscribed
2019-12-13 20:34:37.424 [71329] main/117/applier/ I> remote vclock {1: 5} local vclock {0: 10, 1: 5}
2019-12-13 20:34:37.425 [71329] main/118/applierw/ C> leaving orphan mode

The replica has just received an id equal to 2. We can make it read-write now.

tarantool> box.cfg{read_only=false}
2019-12-13 20:35:46.392 [71329] main/102/interactive I> set 'read_only' configuration option to false
---
...

tarantool> box.schema.space.create('test')
---
- engine: memtx
  before_replace: 'function: 0x01109f9dc8'
  on_replace: 'function: 0x01109f9d90'
  ck_constraint: []
  field_count: 0
  temporary: false
  index: []
  is_local: false
  enabled: false
  name: test
  id: 513
- created
...

tarantool> box.info.vclock
---
- {0: 10, 1: 5, 2: 2}
...

Now the replica tracks its changes in the 2nd vclock component, as expected. It can also become a replication master from now on.

Notes:

You cannot replicate from an anonymous instance.
To promote an anonymous instance to a regular one, first start it as anonymous, and only then issue box.cfg{replication_anon=false}.
For the deanonymization to succeed, the instance must replicate from some read-write instance; otherwise, it cannot be added to the _cluster space.

Type: boolean
Default: false
Environment variable: TT_REPLICATION_ANON
Dynamic: yes

bootstrap_leader¶

Since 3.0.0.

A bootstrap leader for a replica set. You can pass a bootstrap leader’s URI, UUID, or name.

To specify a bootstrap leader manually, you need to set bootstrap_strategy to config, for example:

box.cfg{
    bootstrap_strategy = 'config',
    bootstrap_leader = '127.0.0.1:3301',
    replication = {'127.0.0.1:3301'},
}

Type: string
Default: null
Environment variable: TT_BOOTSTRAP_LEADER
Dynamic: yes

bootstrap_strategy¶

Since 2.11.0.

Specify a strategy used to bootstrap a replica set. The following strategies are available:

auto: a node doesn’t boot if half or more of the other nodes in a replica set are not connected. For example, if the replication parameter contains 2 or 3 nodes, a node requires 2 connected instances. In the case of 4 or 5 nodes, at least 3 connected instances are required. Moreover, a bootstrap leader fails to boot unless every connected node has chosen it as a bootstrap leader.
config: use the specified node to bootstrap a replica set. To specify the bootstrap leader, use the bootstrap_leader option.
supervised: a bootstrap leader isn’t chosen automatically but should be appointed using box.ctl.make_bootstrap_leader() on the desired node.
legacy (deprecated since 2.11.0): a node requires the replication_connect_quorum number of other nodes to be connected. This option is added to keep the compatibility with the current versions of Cartridge and might be removed in the future.

Type: string
Default: auto
Environment variable: TT_BOOTSTRAP_STRATEGY
Dynamic: yes

replication_connect_timeout¶

Since version 1.9.0.

The number of seconds that a replica will wait when trying to connect to a master in a cluster. See orphan status for details.

This parameter is different from replication_timeout, which a master uses to disconnect a replica when the master receives no acknowledgments of heartbeat messages.

Type: float
Default: 30
Environment variable: TT_REPLICATION_CONNECT_TIMEOUT
Dynamic: yes

replication_connect_quorum¶

Deprecated since 2.11.0.

This option is in effect if bootstrap_strategy is set to legacy.

Specify the number of nodes that must be up and running to start a replica set. This parameter takes effect during bootstrap or a configuration update. If replication_connect_quorum is set to 0, Tarantool does not require an immediate reconnect, but only in case of recovery. See Orphan status for details.

Example:

box.cfg { replication_connect_quorum = 2 }

Type: integer
Default: null
Environment variable: TT_REPLICATION_CONNECT_QUORUM
Dynamic: yes

replication_skip_conflict¶

Since version 1.10.1.

By default, if a replica adds a unique key that another replica has added, replication stops with error = ER_TUPLE_FOUND.

However, by specifying replication_skip_conflict = true, users can state that such errors may be ignored. So instead of saving the broken transaction to the xlog, it will be written there as NOP (No operation).

Example:

box.cfg{replication_skip_conflict=true}

Type: boolean
Default: false
Environment variable: TT_REPLICATION_SKIP_CONFLICT
Dynamic: yes

Note

Use replication_skip_conflict = true only for manual replication recovery.

replication_sync_lag¶

Since version 1.9.0.

The maximum lag allowed for a replica. When a replica syncs (gets updates from a master), it may not catch up completely. The number of seconds that the replica is behind the master is called the “lag”. Syncing is considered to be complete when the replica’s lag is less than or equal to replication_sync_lag.

If a user sets replication_sync_lag to nil or to 365 * 100 * 86400 (TIMEOUT_INFINITY), then lag does not matter – the replica is always considered to be “synced”. Also, the lag is ignored (assumed to be infinite) in case the master is running Tarantool older than 1.7.7, which does not send heartbeat messages.

This parameter is ignored during bootstrap. See orphan status for details.

Type: float
Default: 10
Environment variable: TT_REPLICATION_SYNC_LAG
Dynamic: yes

replication_sync_timeout¶

Since version 1.10.2.

The number of seconds that a node waits when trying to sync with other nodes in a replica set (see bootstrap_strategy), after connecting or during configuration update. This could fail indefinitely if replication_sync_lag is smaller than network latency, or if the replica cannot keep pace with master updates. If replication_sync_timeout expires, the replica enters orphan status.

Type: float
Default: 0
Environment variable: TT_REPLICATION_SYNC_TIMEOUT
Dynamic: yes

Note

The default replication_sync_timeout value can be changed to the old default value (300) by using the compat module. For more information on changing the default value via the compat module, see Default value for replication_sync_timeout.
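The two sync parameters above are usually tuned together. The following is a minimal sketch; the values are illustrative assumptions, not recommendations:

```lua
-- Illustrative values only (assumptions, not recommendations).
box.cfg{
    -- Consider the replica synced once it is at most 5 seconds behind the master.
    replication_sync_lag = 5,
    -- Wait up to 300 seconds for syncing before the replica enters orphan status.
    replication_sync_timeout = 300,
}
```

Both parameters are dynamic, so they can be changed on a running instance with another box.cfg call.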

replication_timeout¶

Since version 1.7.5.

If the master has no updates to send to the replicas, it sends heartbeat messages every replication_timeout seconds, and each replica sends an ACK packet back.

Both master and replicas are programmed to drop the connection if they get no response in four replication_timeout periods. If the connection is dropped, a replica tries to reconnect to the master.

See more in Monitoring a replica set.

Type: integer
Default: 1
Environment variable: TT_REPLICATION_TIMEOUT
Dynamic: yes
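As a rough illustration of the rule of four periods described above, the sketch below sets a hypothetical heartbeat interval and notes the resulting disconnect threshold:

```lua
-- Hypothetical value: send heartbeats every 2 seconds.
box.cfg{replication_timeout = 2}

-- Per the description above, a peer that stays silent for
-- 4 * replication_timeout = 8 seconds has its connection dropped.
```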

replicaset_uuid¶

Since version 1.9.0.

As described in section “Replication architecture”, each replica set is identified by a universally unique identifier called replica set UUID, and each instance is identified by an instance UUID.

Ordinarily it is sufficient to let the system generate and format the UUID strings which will be permanently stored.

However, some administrators may prefer to store Tarantool configuration information in a central repository, for example Apache ZooKeeper. Such administrators can assign their own UUID values for either – or both – instances (instance_uuid) and replica set (replicaset_uuid), when starting up for the first time.

General rules:

The values must be true unique identifiers, not shared by other instances or replica sets within the common infrastructure.
The values must be used consistently, not changed after initial setup (the initial values are stored in snapshot files and are checked whenever the system is restarted).
The values must comply with RFC 4122. The nil UUID is not allowed.

The UUID format includes sixteen octets represented as 32 hexadecimal (base 16) digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12 for a total of 36 characters (32 alphanumeric characters and four hyphens).

Example:

box.cfg{replicaset_uuid='7b853d13-508b-4b8e-82e6-806f088ea6e9'}

Type: string
Default: null
Environment variable: TT_REPLICASET_UUID
Dynamic: no
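Before passing a hand-assigned value to replicaset_uuid or instance_uuid, it may help to check its layout. The sketch below uses a hypothetical helper, is_valid_uuid, that checks only the 8-4-4-4-12 hexadecimal form and rejects the nil UUID. It does not verify the RFC 4122 version or variant bits, and it cannot check uniqueness.

```lua
-- Hypothetical helper: checks the 8-4-4-4-12 hexadecimal layout
-- and rejects the nil UUID. It does not check RFC 4122 version bits.
local function is_valid_uuid(s)
    if type(s) ~= 'string' or #s ~= 36 then
        return false
    end
    local hex = '%x'
    local pattern = '^' .. hex:rep(8) .. '%-' .. hex:rep(4) .. '%-' .. hex:rep(4)
        .. '%-' .. hex:rep(4) .. '%-' .. hex:rep(12) .. '$'
    if not s:match(pattern) then
        return false
    end
    return s ~= '00000000-0000-0000-0000-000000000000'
end

print(is_valid_uuid('7b853d13-508b-4b8e-82e6-806f088ea6e9')) -- true
print(is_valid_uuid('00000000-0000-0000-0000-000000000000')) -- false
print(is_valid_uuid('7b853d13508b4b8e82e6806f088ea6e9'))     -- false
```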

instance_uuid¶

Since version 1.9.0.

For replication administration purposes, it is possible to set the universally unique identifiers of the instance (instance_uuid) and the replica set (replicaset_uuid), instead of having the system generate the values.

See the description of replicaset_uuid parameter for details.

Example:

box.cfg{instance_uuid='037fec43-18a9-4e12-a684-a42b716fcd02'}

Type: string
Default: null
Environment variable: TT_INSTANCE_UUID
Dynamic: no

replication_synchro_quorum¶

Since version 2.5.1.

For synchronous replication only. This option tells how many replicas should confirm the receipt of a synchronous transaction before it can finish its commit.

Since version 2.5.3, the option supports dynamic evaluation of the quorum number. That is, the quorum can be specified not as a constant number but as a formula. In this case, the option returns the formula evaluated. The result is treated as an integer. Whenever replicas are added or removed, the expression is re-evaluated automatically.

For example,

box.cfg{replication_synchro_quorum = "N / 2 + 1"}

Where N is the current number of registered replicas in a cluster.

Keep in mind that the example above represents a canonical quorum definition. The formula (at least 50% of the cluster size, plus 1) guarantees data reliability. Using a value lower than the canonical one might lead to unexpected results, including split-brain.

Since version 2.10.0, this option does not account for anonymous replicas.

The default value for this parameter is N / 2 + 1.

It is not used on replicas, so if the master dies, the pending synchronous transactions will be kept waiting on the replicas until a new master is elected.

If the value for this option is set to 1, synchronous transactions work like asynchronous ones. A value of 1 means that a successful WAL write on the master is enough to commit.

Type: number
Default: N / 2 + 1 (before version 2.10.0, the default value was 1)
Environment variable: TT_REPLICATION_SYNCHRO_QUORUM
Dynamic: yes
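To see what the canonical formula N / 2 + 1 yields for small clusters, here is a plain Lua sketch. The helper name canonical_quorum is hypothetical, and rounding down with math.floor is an assumption about how a fractional result is turned into an integer:

```lua
-- Canonical majority quorum for a cluster of n registered replicas.
-- Rounding down is an assumption about how the fractional part is handled.
local function canonical_quorum(n)
    return math.floor(n / 2) + 1
end

for _, n in ipairs({1, 2, 3, 4, 5}) do
    print(n, canonical_quorum(n))
end
```

For 3 and 5 replicas this gives quorums of 2 and 3 respectively, that is, a strict majority in each case.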

replication_synchro_timeout¶

Since version 2.5.1.

For synchronous replication only. Tells how many seconds to wait for a synchronous transaction quorum replication until it is declared failed and is rolled back.

It is not used on replicas, so if the master dies, the pending synchronous transactions will be kept waiting on the replicas until a new master is elected.

Type: number
Default: 5
Environment variable: TT_REPLICATION_SYNCHRO_TIMEOUT
Dynamic: yes
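A minimal sketch of setting both synchronous replication options on the master; the timeout value is a hypothetical choice:

```lua
box.cfg{
    -- Canonical majority quorum, re-evaluated when replicas are added or removed.
    replication_synchro_quorum = 'N / 2 + 1',
    -- Hypothetical value: wait up to 10 seconds for the quorum before rollback.
    replication_synchro_timeout = 10,
}
```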

replication_threads¶

Since version 2.10.0.

The number of threads spawned to decode the incoming replication data.

The default value is 1. It means that a single separate thread handles all the incoming replication streams. In most cases, one thread is enough for all incoming data. Therefore, it is likely that the user will not need to set this configuration option.

Possible values range from 1 to 1000. If there are multiple replication threads, connections to serve are distributed evenly between the threads.

Type: number
Default: 1
Possible values: from 1 to 1000
Environment variable: TT_REPLICATION_THREADS
Dynamic: no

election_mode¶

Since version 2.6.1.

Specify the role of a replica set node in the leader election process.

Possible values:

off
voter
candidate
manual

Participation of a replica set node in the automated leader election can be turned on and off by this option.

The default value is off. All nodes that have values other than off run the Raft state machine internally talking to other nodes according to the Raft leader election protocol. When the option is off, the node accepts Raft messages from other nodes, but it doesn’t participate in the election activities, and this doesn’t affect the node’s state. So, for example, if a node is not a leader but it has election_mode = 'off', it is writable anyway.

You can control which nodes can become a leader. If you want a node to participate in the election process but don’t want it to become a leader, set the election_mode option to voter. In this case, the election works as usual: this particular node votes for other nodes but won’t become a leader.

If the node should be able to become a leader, use election_mode = 'candidate'.

Since version 2.8.2, the manual election mode is introduced. It may be used when a user wants to control which instance is the leader explicitly instead of relying on the Raft election algorithm.

When an instance is configured with the election_mode='manual', it behaves as follows:

By default, the instance acts like a voter – it is read-only and may vote for other instances that are candidates.
Once box.ctl.promote() is called, the instance becomes a candidate and starts a new election round. If the instance wins the elections, it becomes a leader, but won’t participate in any new elections.

Type: string
Default: ‘off’
Environment variable: TT_ELECTION_MODE
Dynamic: yes
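The modes described above can be sketched as follows. Each box.cfg call is meant for a different instance; they are alternatives, not a sequence to run on one node:

```lua
-- On a node that may become a leader:
box.cfg{election_mode = 'candidate'}

-- On a node that should vote but never lead:
box.cfg{election_mode = 'voter'}

-- On a node where the leader is chosen explicitly:
box.cfg{election_mode = 'manual'}
-- The instance acts like a voter until it is promoted by hand.
box.ctl.promote() -- becomes a candidate and starts a new election round
```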

election_timeout¶

Since version 2.6.1.

Specify the timeout between election rounds in the leader election process if the previous round ended up with a split-vote.

In the leader election process, there can be an election timeout for the case of a split-vote. The timeout can be configured using this option; the default value is 5 seconds.

This value is quite large, and in most cases it can be safely lowered to 300-400 ms. It can be a floating-point value (300 ms would be box.cfg{election_timeout = 0.3}).

Type: number
Default: 5
Environment variable: TT_ELECTION_TIMEOUT
Dynamic: yes

election_fencing_mode¶

Since version 2.11.0.

In earlier Tarantool versions, use election_fencing_enabled instead.

Specify the leader fencing mode that affects the leader election process. When the parameter is set to soft or strict, the leader resigns its leadership if it has fewer than replication_synchro_quorum alive connections to the cluster nodes. The resigning leader receives the status of a follower in the current election term and becomes read-only.

In soft mode, a connection is considered dead if there are no responses for 4*replication_timeout seconds both on the current leader and the followers.
In strict mode, a connection is considered dead if there are no responses for 2*replication_timeout seconds on the current leader and 4*replication_timeout seconds on the followers. This improves chances that there is only one leader at any time.

Fencing applies to the instances that have the election_mode set to candidate or manual. To turn off leader fencing, set election_fencing_mode to off.

Type: string
Default: ‘soft’
Environment variable: TT_ELECTION_FENCING_MODE
Dynamic: yes
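The difference between the soft and strict modes comes down to simple arithmetic over replication_timeout. The sketch below encodes the rules stated above; the helper name dead_connection_thresholds is hypothetical and is not part of Tarantool:

```lua
-- Returns the number of seconds without responses after which a connection
-- is considered dead: first on the leader, then on the followers.
local function dead_connection_thresholds(mode, replication_timeout)
    if mode == 'soft' then
        return 4 * replication_timeout, 4 * replication_timeout
    elseif mode == 'strict' then
        return 2 * replication_timeout, 4 * replication_timeout
    end
    return nil, nil -- 'off': leader fencing is disabled
end

local on_leader, on_follower = dead_connection_thresholds('strict', 1)
print(on_leader, on_follower) -- 2    4
```

In strict mode the leader notices a dead connection sooner than the followers do, which is why it is more likely to resign before a new leader appears.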

instance_name¶

Since version 3.0.0.

Specify the instance name. This value must be unique in a replica set.

The following rules are applied to instance names:

The maximum number of symbols is 63.
Should start with a letter.
Can contain lowercase letters (a-z). If uppercase letters are used, they are converted to lowercase.
Can contain digits (0-9).
Can contain the following characters: -, _.

To change or remove the specified name, you should temporarily set the box.cfg.force_recovery configuration option to true. When all the names are updated and all the instances synced, box.cfg.force_recovery can be set back to false.

Note

The instance name is persisted in the box.space._cluster system space.
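The naming rules above can be checked in plain Lua before a name is used. This is a sketch under the stated rules only; the helper normalize_instance_name is hypothetical, and Tarantool's own validation may differ in details:

```lua
-- Returns the lowercased name if it satisfies the rules above, or nil otherwise.
local function normalize_instance_name(name)
    if type(name) ~= 'string' then
        return nil
    end
    local lowered = name:lower() -- uppercase letters are converted to lowercase
    if #lowered > 63 then
        return nil
    end
    -- Must start with a letter; may contain letters, digits, '-' and '_'.
    if not lowered:match('^[a-z][a-z0-9_%-]*$') then
        return nil
    end
    return lowered
end

print(normalize_instance_name('Instance-001')) -- instance-001
print(normalize_instance_name('1node'))        -- nil
```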

Networking

io_collect_interval
net_msg_max
readahead
iproto_threads

io_collect_interval¶

Since version 1.4.9.

The instance will sleep for io_collect_interval seconds between iterations of the event loop. Can be used to reduce CPU load in deployments in which the number of client connections is large, but requests are not so frequent (for example, each connection issues just a handful of requests per second).

Type: float
Default: null
Environment variable: TT_IO_COLLECT_INTERVAL
Dynamic: yes

net_msg_max¶

Since version 1.10.1.

On powerful systems, increase net_msg_max and the scheduler will immediately start processing pending requests.

On weaker systems, decrease net_msg_max and the overhead may decrease although this may take some time because the scheduler must wait until already-running requests finish.

When net_msg_max is reached, Tarantool suspends processing of incoming packets until it has processed earlier messages. This is not a direct restriction of the number of fibers that handle network messages, rather it is a system-wide restriction of channel bandwidth. This in turn causes restriction of the number of incoming network messages that the transaction processor thread handles, and therefore indirectly affects the fibers that handle network messages. (The number of fibers is smaller than the number of messages because messages can be released as soon as they are delivered, while incoming requests might not be processed until some time after delivery.)

On typical systems, the default value (768) is correct.

Type: integer
Default: 768
Environment variable: TT_NET_MSG_MAX
Dynamic: yes
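Because net_msg_max is dynamic, it can be adjusted on a running instance. A minimal sketch with a hypothetical value for a more powerful system:

```lua
-- Hypothetical value; the default is 768.
box.cfg{net_msg_max = 1024}
```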

readahead¶

Since version 1.6.2.

The size of the read-ahead buffer associated with a client connection. The larger the buffer, the more memory an active connection consumes and the more requests can be read from the operating system buffer in a single system call. The rule of thumb is to make sure the buffer can contain at least a few dozen requests. Therefore, if a typical tuple in a request is large, e.g. a few kilobytes or even megabytes, the read-ahead buffer size should be increased. If batched request processing is not used, it’s prudent to leave this setting at its default.

Type: integer
Default: 16320
Environment variable: TT_READAHEAD
Dynamic: yes
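The rule of thumb above (room for at least a few dozen requests) can be turned into a quick calculation. The workload figures below are hypothetical assumptions:

```lua
-- Hypothetical workload: requests of about 4 KB each, with room for 40 of them.
local typical_request_size = 4 * 1024
local requests_in_buffer = 40

box.cfg{readahead = typical_request_size * requests_in_buffer} -- 163840
```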

iproto_threads¶

Since version 2.8.1.

The number of network threads. There can be unusual workloads where the network thread is 100% loaded and the transaction processor thread is not, so the network thread is a bottleneck. In that case set iproto_threads to 2 or more. The operating system kernel will determine which connection goes to which thread.

On typical systems, the default value (1) is correct.

Type: integer
Default: 1
Environment variable: TT_IPROTO_THREADS
Dynamic: no

Logging

This section provides information on how to configure options related to logging. You can also use the log module to configure logging in your application.

log_level
log
log_nonblock
too_long_threshold
log_format
log_modules

log_level¶

Since version 1.6.2.

Specify the level of detail the log has. There are the following levels:

0 – fatal
1 – syserror
2 – error
3 – crit
4 – warn
5 – info
6 – verbose
7 – debug

By setting log_level, you can enable logging of all events with severities above or equal to the given level. Tarantool prints logs to the standard error stream by default. This can be changed with the log configuration parameter.

Type: integer, string
Default: 5
Environment variable: TT_LOG_LEVEL
Dynamic: yes

Note

Prior to Tarantool 1.7.5 there were only six levels and DEBUG was level 6. Starting with Tarantool 1.7.5, VERBOSE is level 6 and DEBUG is level 7. VERBOSE is a new level for monitoring repetitive events which would cause too much log writing if INFO were used instead.
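Since the parameter accepts both an integer and a string, the level can be given in either form. A short sketch, assuming the string form uses the level names from the list above:

```lua
-- Numeric form: 6 corresponds to the verbose level.
box.cfg{log_level = 6}
-- or, by name
box.cfg{log_level = 'verbose'}
```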

log¶

Since version 1.7.4.

By default, Tarantool sends the log to the standard error stream (stderr).

box.cfg{}
-- or
box.cfg{log = ''}

If log is specified and not empty, Tarantool can send the log to a:

file
pipe
system logger

Example 1: sending the log to the tarantool.log file.

box.cfg{log = 'tarantool.log'}
-- or
box.cfg{log = 'file:tarantool.log'}

This opens the file tarantool.log for output in the server’s default directory. If the log string has no prefix or has the prefix “file:”, then the string is interpreted as a file path.

Example 2: sending the log to a pipe.

box.cfg{log = '| cronolog tarantool.log'}
-- or
box.cfg{log = 'pipe: cronolog tarantool.log'}

This starts the program cronolog when the server starts, and sends all log messages to the standard input (stdin) of cronolog. If the log string begins with ‘|’ or has the prefix “pipe:”, then the string is interpreted as a Unix pipeline.

Example 3: sending the log to syslog.

box.cfg{log = 'syslog:identity=tarantool'}
-- or
box.cfg{log = 'syslog:facility=user'}
-- or
box.cfg{log = 'syslog:identity=tarantool,facility=user'}
-- or
box.cfg{log = 'syslog:server=unix:/dev/log'}

If the log string begins with “syslog:”, then it is interpreted as a message for the syslogd program, which normally is running in the background on any Unix-like platform. The setting can be syslog:, syslog:facility=..., syslog:identity=..., syslog:server=..., or a combination.

The syslog:identity setting is an arbitrary string, which is placed at the beginning of all messages. The default value is “tarantool”.
The syslog:facility setting is currently ignored but will be used in the future. The value must be one of the syslog keywords, which tell syslogd where the message should go. The possible values are: auth, authpriv, cron, daemon, ftp, kern, lpr, mail, news, security, syslog, user, uucp, local0, local1, local2, local3, local4, local5, local6, local7. The default value is: local7.
The syslog:server setting is the locator for the syslog server. It can be a Unix socket path beginning with “unix:”, or an IPv4 port number. The default socket value is: /dev/log (on Linux) or /var/run/syslog (on macOS). The default port value is: 514, the UDP port.

When logging to a file, Tarantool reopens the log on SIGHUP. When log is a program, its PID is saved in the log.pid variable. You need to send it a signal to rotate logs.

Type: string
Default: null
Environment variable: TT_LOG
Dynamic: no

log_nonblock¶

Since version 1.7.4.

If log_nonblock equals true, Tarantool does not block during logging when the system is not ready for writing, and drops the message instead. If log_level is high, and many messages go to the log, setting log_nonblock to true may improve logging performance at the cost of some log messages getting lost.

This parameter has effect only if log is configured to send logs to a pipe or system logger. The default log_nonblock value is nil, which means that blocking behavior corresponds to the logger type:

false for stderr and file loggers.
true for a pipe and system logger.

This is a behavior change: in earlier versions of the Tarantool server, the default value was true.

Type: boolean
Default: nil
Environment variable: TT_LOG_NONBLOCK
Dynamic: no

too_long_threshold¶

Since version 1.6.2.

If processing a request takes longer than the given value (in seconds), warn about it in the log. Has effect only if log_level is greater than or equal to 4 (WARNING).

Type: float
Default: 0.5
Environment variable: TT_TOO_LONG_THRESHOLD
Dynamic: yes

log_format¶

Since version 1.7.6.

Log entries have two possible formats:

‘plain’ (the default), or
‘json’ (with more detail and with JSON labels).

Here is what a log entry looks like if box.cfg{log_format='plain'}:

2017-10-16 11:36:01.508 [18081] main/101/interactive I> set 'log_format' configuration option to "plain"

Here is what a log entry looks like if box.cfg{log_format='json'}:

{"time": "2017-10-16T11:36:17.996-0600",
"level": "INFO",
"message": "set 'log_format' configuration option to \"json\"",
"pid": 18081,
"cord_name": "main",
"fiber_id": 101,
"fiber_name": "interactive",
"file": "builtin\/box\/load_cfg.lua",
"line": 317}

The log_format='plain' entry has a time value, process ID, cord name, fiber_id, fiber_name, log level, and message.

The log_format='json' entry has the same fields along with their labels, and in addition has the file name and line number of the Tarantool source.

Type: string
Default: ‘plain’
Environment variable: TT_LOG_FORMAT
Dynamic: yes
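One practical benefit of the JSON format is that entries can be parsed programmatically. The sketch below decodes a shortened, hand-written entry using Tarantool's built-in json module; the line variable is illustrative and is not read from a real log:

```lua
local json = require('json') -- Tarantool's built-in JSON module

-- A shortened sample entry with a subset of the fields shown above.
local line = '{"time": "2017-10-16T11:36:17.996-0600", "level": "INFO", '
    .. '"pid": 18081, "fiber_name": "interactive"}'

local entry = json.decode(line)
print(entry.level, entry.pid, entry.fiber_name)
```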

log_modules¶

Since version 2.11.0.

Configure the specified log levels (log_level) for different modules.

You can specify a logging level for the following module types:

Modules (files) that use the default logger. Example: Set log levels for files that use the default logger.
Modules that use custom loggers created using the log.new() function. Example: Set log levels for modules that use custom loggers.
The tarantool module that enables you to configure the logging level for Tarantool core messages. Specifically, it configures the logging level for messages logged from non-Lua code, including C modules. Example: Set a log level for C modules.

Type: table
Default: blank
Environment variable: TT_LOG_MODULES
Dynamic: yes

Example 1: Set log levels for files that use the default logger

Suppose you have two identical modules placed by the following paths: test/logging/module1.lua and test/logging/module2.lua. These modules use the default logger and look as follows:

return {
    say_hello = function()
        local log = require('log')
        log.info('Info message from module1')
    end
}

To load these modules in your application, you need to add the corresponding require directives:

module1 = require('test.logging.module1')
module2 = require('test.logging.module2')

To configure logging levels, you need to provide module names corresponding to paths to these modules. In the example below, the box_cfg variable contains logging settings that can be passed to the box.cfg() function:

box_cfg = { log_modules = {
    ['test.logging.module1'] = 'verbose',
    ['test.logging.module2'] = 'error' }
}

Given that module1 has the verbose logging level and module2 has the error level, calling module1.say_hello() shows a message but module2.say_hello() is swallowed:

-- Prints 'info' messages --
module1.say_hello()
--[[
[92617] main/103/interactive/test.logging.module1 I> Info message from module1
---
...
--]]

-- Swallows 'info' messages --
module2.say_hello()
--[[
---
...
--]]

Example 2: Set log levels for modules that use custom loggers

In the example below, the box_cfg variable contains logging settings that can be passed to the box.cfg() function. This example shows how to set the verbose level for module1 and the error level for module2:

box_cfg = { log_level = 'warn',
            log_modules = {
                module1 = 'verbose',
                module2 = 'error' }
}

To create custom loggers, call the log.new() function:

-- Creates new loggers --
module1_log = require('log').new('module1')
module2_log = require('log').new('module2')

Given that module1 has the verbose logging level and module2 has the error level, calling module1_log.info() shows a message but module2_log.info() is swallowed:

-- Prints 'info' messages --
module1_log.info('Info message from module1')
--[[
[16300] main/103/interactive/module1 I> Info message from module1
---
...
--]]

-- Swallows 'debug' messages --
module1_log.debug('Debug message from module1')
--[[
---
...
--]]

-- Swallows 'info' messages --
module2_log.info('Info message from module2')
--[[
---
...
--]]

Example 3: Set a log level for C modules

In the example below, the box_cfg variable contains logging settings that can be passed to the box.cfg() function. This example shows how to set the info level for the tarantool module:

box_cfg = { log_level = 'warn',
            log_modules = { tarantool = 'info' } }

The specified level affects messages logged from C modules:

ffi = require('ffi')

-- Prints 'info' messages --
ffi.C._say(ffi.C.S_INFO, nil, 0, nil, 'Info message from C module')
--[[
[6024] main/103/interactive I> Info message from C module
---
...
--]]

-- Swallows 'debug' messages --
ffi.C._say(ffi.C.S_DEBUG, nil, 0, nil, 'Debug message from C module')
--[[
---
...
--]]

The example above uses the LuaJIT ffi library to call C functions provided by the say module.

Logging example

This example illustrates how “rotation” works, that is, what happens when the server instance is writing to a log and signals are used when archiving it.

Start with two terminal shells: Terminal #1 and Terminal #2.
In Terminal #1, start an interactive Tarantool session. Then, use the log property to send logs to Log_file and call log.info to put a message in the log file.
```
box.cfg{log='Log_file'}
log = require('log')
log.info('Log Line #1')
```
In Terminal #2, use the mv command to rename the log file to Log_file.bak.
```
mv Log_file Log_file.bak
```
As a result, the next log message will go to Log_file.bak.
Go back to Terminal #1 and put a message “Log Line #2” in the log file.
```
log.info('Log Line #2')
```
In Terminal #2, use ps to find the process ID of the Tarantool instance.
```
ps -A | grep tarantool
```
In Terminal #2, execute kill -HUP to send a SIGHUP signal to the Tarantool instance. Tarantool will open Log_file again, and the next log message will go to Log_file.
```
kill -HUP process_id
```
The same effect could be accomplished by calling log.rotate.
In Terminal #1, put a message “Log Line #3” in the log file.
```
log.info('Log Line #3')
```

In Terminal #2, use less to examine files. Log_file.bak will have the following lines …

2015-11-30 15:13:06.373 [27469] main/101/interactive I> Log Line #1
2015-11-30 15:14:25.973 [27469] main/101/interactive I> Log Line #2

… and Log_file will look like this:

log file has been reopened
2015-11-30 15:15:32.629 [27469] main/101/interactive I> Log Line #3
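As noted in the steps above, log.rotate has the same effect as sending SIGHUP. A minimal sketch of that alternative, run in the Terminal #1 session instead of using kill -HUP in Terminal #2:

```lua
log = require('log')
-- Reopen the log file, as the SIGHUP signal would.
log.rotate()
log.info('Log Line #3')
```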

Audit log

Enterprise Edition

Audit log features are available in the Enterprise Edition only.

The audit_* parameters define configuration related to audit logging.

audit_extract_key
audit_filter
audit_format
audit_log
audit_nonblock
audit_spaces

audit_extract_key¶

Since: 3.0.0.

If set to true, the audit subsystem extracts and prints only the primary key instead of full tuples in DML events (space_insert, space_replace, space_delete). Otherwise, full tuples are logged. The option may be useful in case tuples are big.

Type: boolean
Default: false
Environment variable: TT_AUDIT_EXTRACT_KEY

audit_filter¶

Enable logging for a specified subset of audit events. This option accepts the following values:

Event names (for example, password_change). For details, see Audit log events.
Event groups (for example, audit). For details, see Event groups.

The option contains either one value from Possible values section (see below) or a combination of them.

To enable custom audit log events, specify the custom value in this option.

The default value is compatibility, which enables logging of all events available before 2.10.0.

Example

box.cfg{
    audit_log = 'audit.log',
    audit_filter = 'audit,auth,priv,password_change,access_denied'
}

Type: array
Possible values: ‘all’, ‘audit’, ‘auth’, ‘priv’, ‘ddl’, ‘dml’, ‘data_operations’, ‘compatibility’,
‘audit_enable’, ‘auth_ok’, ‘auth_fail’, ‘disconnect’, ‘user_create’, ‘user_drop’, ‘role_create’, ‘role_drop’,
‘user_disable’, ‘user_enable’, ‘user_grant_rights’, ‘role_grant_rights’, ‘role_revoke_rights’, ‘password_change’,
‘access_denied’, ‘eval’, ‘call’, ‘space_select’, ‘space_create’, ‘space_alter’, ‘space_drop’, ‘space_insert’,
‘space_replace’, ‘space_delete’, ‘custom’
Default: ‘compatibility’
Environment variable: TT_AUDIT_FILTER

audit_format¶

Specify the format that is used for the audit log events – plain text, CSV or JSON format.

The plain text format is human-readable and can be efficiently compressed.

box.cfg{audit_log = 'audit.log', audit_format = 'plain'}

Example

remote: session_type:background module:common.admin.auth user: type:custom_tdg_audit tag:tdg_severity_INFO description:[5e35b406-4274-4903-857b-c80115275940] subj: "anonymous", msg: "Access granted to anonymous user"

The JSON format is more convenient to receive log events, analyze them and integrate them with other systems if needed.

box.cfg{audit_log = 'audit.log', audit_format = 'json'}

Example

{"time": "2022-11-17T21:55:49.880+0300", "remote": "", "session_type": "background", "module": "common.admin.auth", "user": "", "type": "custom_tdg_audit", "tag": "tdg_severity_INFO", "description": "[c26cd11a-3342-4ce6-8f0b-a4b222268b9d] subj: \"anonymous\", msg: \"Access granted to anonymous user\""}

Using the CSV format allows you to view audit log events in tabular form.

box.cfg{audit_log = 'audit.log', audit_format = 'csv'}

Example

2022-11-17T21:58:03.131+0300,,background,common.admin.auth,,,custom_tdg_audit,tdg_severity_INFO,"[b3dfe2a3-ec29-4e61-b747-eb2332c83b2e] subj: ""anonymous"", msg: ""Access granted to anonymous user"""

Type: string
Possible values: ‘json’, ‘csv’, ‘plain’
Default: ‘json’
Environment variable: TT_AUDIT_FORMAT

audit_log¶

Enable audit logging and define the log location.

This option accepts a string value that allows you to define the log location. The following locations are supported:

File: to write audit logs to a file, specify a path to a file (with an optional file prefix)
Pipeline: to start a program and write audit logs to it, specify a program name (with | or pipe prefix)
System log: to write audit logs to a system log, specify a message for syslogd (with syslog prefix)

See the examples below.

By default, audit logging is disabled.

Example: Writing to a file

box.cfg{audit_log = 'audit_tarantool.log'}
-- or
box.cfg{audit_log = 'file:audit_tarantool.log'}

This opens the audit_tarantool.log file for output in the server’s default directory. If the audit_log string has no prefix or the prefix file:, the string is interpreted as a file path.

If you log to a file, Tarantool will reopen the audit log at SIGHUP.

Example: Sending to a pipeline

box.cfg{audit_log = '| cronolog audit_tarantool.log'}
-- or
box.cfg{audit_log = 'pipe: cronolog audit_tarantool.log'}

This starts the cronolog program when the server starts and sends all audit_log messages to cronolog’s standard input (stdin). If the audit_log string starts with ‘|’ or contains the prefix pipe:, the string is interpreted as a Unix pipeline.

If the audit log is sent to a program, find its PID and send it a signal to rotate logs.

Example: Writing to a system log

Warning

Below is an example of writing audit logs to a directory shared with the system logs. Tarantool allows this option, but it is not recommended to do this to avoid difficulties when working with audit logs. System and audit logs should be written separately. To do this, create separate paths and specify them.

This sample configuration sends the audit log to syslog:

box.cfg{audit_log = 'syslog:identity=tarantool'}
-- or
box.cfg{audit_log = 'syslog:facility=user'}
-- or
box.cfg{audit_log = 'syslog:identity=tarantool,facility=user'}
-- or
box.cfg{audit_log = 'syslog:server=unix:/dev/log'}

If the audit_log string starts with “syslog:”, it is interpreted as a message for the syslogd program, which normally runs in the background of any Unix-like platform. The setting can be ‘syslog:’, ‘syslog:facility=…’, ‘syslog:identity=…’, ‘syslog:server=…’ or a combination.

The syslog:identity setting is an arbitrary string that is placed at the beginning of all messages. The default value is tarantool.

The syslog:facility setting is currently ignored, but will be used in the future. The value must be one of the syslog keywords that tell syslogd where to send the message. The possible values are auth, authpriv, cron, daemon, ftp, kern, lpr, mail, news, security, syslog, user, uucp, local0, local1, local2, local3, local4, local5, local6, local7. The default value is local7.

The syslog:server setting is the locator for the syslog server. It can be a Unix socket path starting with “unix:” or an IPv4 port number. The default socket value is /dev/log (on Linux) or /var/run/syslog (on macOS). The default port value is 514, which is the UDP port.

An example of a Tarantool audit log entry in the syslog:

09:32:52 tarantool_audit: {"time": "2024-02-08T09:32:52.190+0300", "uuid": "94454e46-9a0e-493a-bb9f-d59e44a43581", "severity": "INFO", "remote": "unix/:(socket)", "session_type": "console", "module": "tarantool", "user": "admin", "type": "space_create", "tag": "", "description": "Create space bands"}

Type: string
Possible values: see the string format above
Default: ‘nil’
Environment variable: TT_AUDIT_LOG

audit_nonblock¶

Specify the logging behavior if the system is not ready to write. If set to true, Tarantool does not block during logging if the system is non-writable and drops the message instead. Using this value may improve logging performance at the cost of losing some log messages.

Note

The option only has an effect if the audit_log is set to syslog or pipe.

Setting audit_nonblock to true is not allowed if the output is to a file. In this case, set audit_nonblock to false.

Type: boolean

Default: true

Environment variable: TT_AUDIT_NONBLOCK
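If losing audit messages is not acceptable, blocking mode can be requested explicitly for a pipe or syslog destination. A minimal sketch, reusing the cronolog pipeline from the audit_log examples:

```lua
box.cfg{
    audit_log = '| cronolog audit_tarantool.log',
    -- Block during logging instead of dropping messages.
    audit_nonblock = false,
}
```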

audit_spaces¶

Since: 3.0.0.

Specify the spaces whose events are written to the audit log.

Example

In the example, only the events of bands and singers spaces are logged:

box.cfg{
    audit_spaces = 'bands,singers'
}

Type: array
Default: box.NULL
Environment variable: TT_AUDIT_SPACES

Authentication

Enterprise Edition

Authentication features are supported by the Enterprise Edition only.

auth_delay
auth_retries
auth_type
disable_guest
password_min_length
password_enforce_uppercase
password_enforce_lowercase
password_enforce_digits
password_enforce_specialchars
password_lifetime_days
password_history_length

auth_delay¶

Since 2.11.0.

Specify a period of time (in seconds) that a specific user should wait for the next attempt after failed authentication.

With the configuration below, Tarantool refuses the authentication attempt if the previous attempt was less than 5 seconds ago.

box.cfg{ auth_delay = 5 }

Type: number
Default: 0
Environment variable: TT_AUTH_DELAY
Dynamic: yes

auth_retries¶

Since 3.0.0.

Specify the maximum number of authentication retries allowed before auth_delay is enforced. The default value is 0, which means auth_delay is enforced after the first failed authentication attempt.

The retry counter is reset after auth_delay seconds since the first failed attempt. For example, if a client tries to authenticate fewer than auth_retries times within auth_delay seconds, no authentication delay is enforced. The retry counter is also reset after any successful authentication attempt.

Type: number
Default: 0
Environment variable: TT_AUTH_RETRIES
Dynamic: yes
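
The two options are usually set together. The sketch below reuses the 5-second delay from the auth_delay example; the retry count of 2 is an illustrative value, not a recommendation.

```lua
-- Enterprise Edition only.
-- A client may retry up to auth_retries times; after that,
-- the auth_delay pause (in seconds) is enforced.
box.cfg{
    auth_delay = 5,
    auth_retries = 2,
}
```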

auth_type¶

Since 2.11.0.

Specify an authentication protocol:

‘chap-sha1’: use the CHAP protocol to authenticate users with SHA-1 hashing applied to passwords.
‘pap-sha256’: use PAP authentication with the SHA256 hashing algorithm.

For new users, the box.schema.user.create method will generate authentication data using PAP-SHA256. For existing users, you need to reset a password using box.schema.user.passwd to use the new authentication protocol.

Type: string
Default value: ‘chap-sha1’
Environment variable: TT_AUTH_TYPE
Dynamic: yes
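
A possible way to switch an existing user to the new protocol is sketched below. The user name and password are placeholders chosen for illustration.

```lua
-- Enterprise Edition only.
box.cfg{ auth_type = 'pap-sha256' }

-- Existing users keep their old authentication data until the password
-- is reset. 'sampleuser' and the password below are placeholders.
box.schema.user.passwd('sampleuser', 'new-secret-password')
```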

disable_guest¶

Since 2.11.0.

If true, disables access over remote connections from unauthenticated or guest access users. This option affects both net.box and replication connections.

Type: boolean
Default: false
Environment variable: TT_DISABLE_GUEST
Dynamic: yes

password_min_length¶

Since 2.11.0.

Specify the minimum number of characters for a password.

The following example shows how to set the minimum password length to 10.

box.cfg{ password_min_length = 10 }

Type: integer
Default: 0
Environment variable: TT_PASSWORD_MIN_LENGTH
Dynamic: yes

password_enforce_uppercase¶

Since 2.11.0.

If true, a password should contain uppercase letters (A-Z).

Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_UPPERCASE
Dynamic: yes

password_enforce_lowercase¶

Since 2.11.0.

If true, a password should contain lowercase letters (a-z).

Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_LOWERCASE
Dynamic: yes

password_enforce_digits¶

Since 2.11.0.

If true, a password should contain digits (0-9).

Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_DIGITS
Dynamic: yes

password_enforce_specialchars¶

Since 2.11.0.

If true, a password should contain at least one special character (such as &|?!@$).

Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_SPECIALCHARS
Dynamic: yes

password_lifetime_days¶

Since 2.11.0.

Specify the maximum period of time (in days) a user can use the same password. When this period ends, a user gets the “Password expired” error on a login attempt. To restore access for such users, use box.schema.user.passwd.

Note

The default 0 value means that a password never expires.

The example below shows how to set a maximum password age to 365 days.

box.cfg{ password_lifetime_days = 365 }

Type: integer
Default: 0
Environment variable: TT_PASSWORD_LIFETIME_DAYS
Dynamic: yes

password_history_length¶

Since 2.11.0.

Specify the number of unique new user passwords before an old password can be reused.

In the example below, a new password should differ from the last three passwords.

box.cfg{ password_history_length = 3 }

Type: integer
Default: 0
Environment variable: TT_PASSWORD_HISTORY_LENGTH
Dynamic: yes

Note

Tarantool uses the auth_history field in the box.space._user system space to store user passwords.
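
The password options above can be combined into one policy. The following sketch reuses the values from the individual examples (minimum length 10, lifetime 365 days, history length 3); whether all character classes should be enforced is an assumption made for illustration.

```lua
-- Enterprise Edition only.
box.cfg{
    password_min_length = 10,
    password_enforce_uppercase = true,
    password_enforce_lowercase = true,
    password_enforce_digits = true,
    password_enforce_specialchars = true,
    password_lifetime_days = 365,
    password_history_length = 3,
}
```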

Flight recorder

Enterprise Edition

The flight recorder is available in the Enterprise Edition only.

flightrec_enabled
flightrec_logs_size
flightrec_logs_max_msg_size
flightrec_logs_log_level
flightrec_metrics_period
flightrec_metrics_interval
flightrec_requests_size
flightrec_requests_max_req_size
flightrec_requests_max_res_size

flightrec_enabled¶

Since 2.11.0.

Enable the flight recorder.

Type: boolean
Default: false
Environment variable: TT_FLIGHTREC_ENABLED
Dynamic: yes

flightrec_logs_size¶

Since 2.11.0.

Specify the size (in bytes) of the log storage. You can set this option to 0 to disable the log storage.

Type: integer
Default: 10485760
Environment variable: TT_FLIGHTREC_LOGS_SIZE
Dynamic: yes

flightrec_logs_max_msg_size¶

Since 2.11.0.

Specify the maximum size (in bytes) of the log message. The log message is truncated if its size exceeds this limit.

Type: integer
Default: 4096
Maximum: 16384
Environment variable: TT_FLIGHTREC_LOGS_MAX_MSG_SIZE
Dynamic: yes

flightrec_logs_log_level¶

Since 2.11.0.

Specify the level of detail the log has. You can learn more about log levels from the log_level option description. Note that the flightrec_logs_log_level value might differ from log_level.

Type: integer
Default: 6
Environment variable: TT_FLIGHTREC_LOGS_LOG_LEVEL
Dynamic: yes

flightrec_metrics_period¶

Since 2.11.0.

Specify the time period (in seconds) that defines how long metrics are stored from the moment of dump. So, this value defines how much historical metrics data is collected up to the moment of crash. The frequency of metric dumps is defined by flightrec_metrics_interval.

Type: integer
Default: 180
Environment variable: TT_FLIGHTREC_METRICS_PERIOD
Dynamic: yes

flightrec_metrics_interval¶

Since 2.11.0.

Specify the time interval (in seconds) that defines the frequency of dumping metrics. This value shouldn’t exceed flightrec_metrics_period.

Note

Given that the average size of a metrics entry is 2 kB, you can estimate the size of the metrics storage as follows:

(flightrec_metrics_period / flightrec_metrics_interval) * 2 kB

Type: number
Default: 1.0
Minimum: 0.001
Environment variable: TT_FLIGHTREC_METRICS_INTERVAL
Dynamic: yes
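
The estimate from the note above can be worked through in a few lines of Lua. The helper name is hypothetical, and the 2 kB entry size is the average quoted in the note, so the result is only a rough figure.

```lua
-- Average size of one metrics entry in kB (the figure quoted in the note).
local ENTRY_SIZE_KB = 2

-- Hypothetical helper: estimate the metrics storage size in kB.
local function estimate_metrics_storage_kb(period, interval)
    return (period / interval) * ENTRY_SIZE_KB
end

-- Default settings: flightrec_metrics_period = 180, flightrec_metrics_interval = 1.0
print(estimate_metrics_storage_kb(180, 1.0))
```

With the default values, the estimate is about 360 kB.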

flightrec_requests_size¶

Since 2.11.0.

Specify the size (in bytes) of storage for the request and response data. You can set this parameter to 0 to disable a storage of requests and responses.

Type: integer
Default: 10485760
Environment variable: TT_FLIGHTREC_REQUESTS_SIZE
Dynamic: yes

flightrec_requests_max_req_size¶

Since 2.11.0.

Specify the maximum size (in bytes) of a request entry. A request entry is truncated if this size is exceeded.

Type: integer
Default: 16384
Environment variable: TT_FLIGHTREC_REQUESTS_MAX_REQ_SIZE
Dynamic: yes

flightrec_requests_max_res_size¶

Since 2.11.0.

Specify the maximum size (in bytes) of a response entry. A response entry is truncated if this size is exceeded.

Type: integer
Default: 16384
Environment variable: TT_FLIGHTREC_REQUESTS_MAX_RES_SIZE
Dynamic: yes

Feedback

feedback_enabled
feedback_host
feedback_interval

By default, a Tarantool daemon sends a small packet once per hour, to https://feedback.tarantool.io. The packet contains three values from box.info: box.info.version, box.info.uuid, and box.info.cluster_uuid. By changing the feedback configuration parameters, users can adjust or turn off this feature.

feedback_enabled¶

Since version 1.10.1.

Whether to send feedback.

If this is set to true, feedback will be sent as described above. If this is set to false, no feedback will be sent.

Type: boolean
Default: true
Environment variable: TT_FEEDBACK_ENABLED
Dynamic: yes

feedback_host¶

Since version 1.10.1.

The address to which the packet is sent. Usually the recipient is Tarantool, but it can be any URL.

Type: string
Default: https://feedback.tarantool.io
Environment variable: TT_FEEDBACK_HOST
Dynamic: yes

feedback_interval¶

Since version 1.10.1.

The interval (in seconds) between feedback packets. The default is 3600 (1 hour).

Type: float
Default: 3600
Environment variable: TT_FEEDBACK_INTERVAL
Dynamic: yes
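
For example, the feature can be turned off, or the interval changed, with a configuration like the one sketched below. The 24-hour interval is an arbitrary illustrative value.

```lua
-- Turn the feedback daemon off entirely:
box.cfg{ feedback_enabled = false }

-- Or keep it on and send a packet once per day instead of once per hour:
box.cfg{ feedback_enabled = true, feedback_interval = 86400 }
```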

Deprecated parameters

These parameters are deprecated since Tarantool version 1.7.4:

logger
logger_nonblock
panic_on_snap_error
panic_on_wal_error
replication_source
slab_alloc_arena
slab_alloc_maximal
slab_alloc_minimal
snap_dir
snapshot_count
snapshot_period
rows_per_wal
election_fencing_enabled

logger¶: Deprecated in favor of log. The parameter was only renamed, while the type, values and semantics remained intact.

logger_nonblock¶: Deprecated in favor of log_nonblock. The parameter was only renamed, while the type, values and semantics remained intact.

panic_on_snap_error¶

Deprecated in favor of force_recovery.

If there is an error while reading a snapshot file (at server instance start), abort.

Type: boolean
Default: true
Dynamic: no

panic_on_wal_error¶: Deprecated in favor of force_recovery.

Type: boolean

Default: true

Dynamic: yes

replication_source¶: Deprecated in favor of replication. The parameter was only renamed, while the type, values and semantics remained intact.

slab_alloc_arena¶

Deprecated in favor of memtx_memory.

How much memory Tarantool allocates to actually store tuples, in gigabytes. When the limit is reached, INSERT or UPDATE requests begin failing with error ER_MEMORY_ISSUE. While the server does not go beyond the defined limit to allocate tuples, there is additional memory used to store indexes and connection information. Depending on actual configuration and workload, Tarantool can consume up to 20% more than the limit set here.

Type: float
Default: 1.0
Dynamic: no

slab_alloc_maximal¶: Deprecated in favor of memtx_max_tuple_size. The parameter was only renamed, while the type, values and semantics remained intact.

slab_alloc_minimal¶: Deprecated in favor of memtx_min_tuple_size. The parameter was only renamed, while the type, values and semantics remained intact.

snap_dir¶: Deprecated in favor of memtx_dir. The parameter was only renamed, while the type, values and semantics remained intact.

snapshot_period¶: Deprecated in favor of checkpoint_interval. The parameter was only renamed, while the type, values and semantics remained intact.

snapshot_count¶: Deprecated in favor of checkpoint_count. The parameter was only renamed, while the type, values and semantics remained intact.

rows_per_wal¶: Deprecated in favor of wal_max_size. The parameter does not allow properly limiting the size of WAL files.

election_fencing_enabled¶

Deprecated in Tarantool v2.11 in favor of election_fencing_mode.

The parameter does not allow using the strict fencing mode. Setting to true is equivalent to setting the soft election_fencing_mode. Setting to false is equivalent to setting the off election_fencing_mode.

Type: boolean
Default: true
Environment variable: TT_ELECTION_FENCING_ENABLED
Dynamic: yes
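
When migrating an old configuration, the pure renames listed above can be applied mechanically. The helper below is a hypothetical sketch, not part of Tarantool: it renames only the parameters described as "only renamed", maps election_fencing_enabled to election_fencing_mode as described, and passes every other key through unchanged. Parameters whose semantics changed (for example, panic_on_snap_error or slab_alloc_arena) still need manual review.

```lua
-- Renames taken from the list above: deprecated name -> replacement.
local RENAMED = {
    logger = 'log',
    logger_nonblock = 'log_nonblock',
    replication_source = 'replication',
    slab_alloc_maximal = 'memtx_max_tuple_size',
    slab_alloc_minimal = 'memtx_min_tuple_size',
    snap_dir = 'memtx_dir',
    snapshot_period = 'checkpoint_interval',
    snapshot_count = 'checkpoint_count',
}

-- Hypothetical helper: return a copy of cfg with deprecated keys replaced.
local function migrate_cfg(cfg)
    local result = {}
    for key, value in pairs(cfg) do
        if key == 'election_fencing_enabled' then
            -- true corresponds to the soft mode, false to the off mode.
            result.election_fencing_mode = value and 'soft' or 'off'
        else
            result[RENAMED[key] or key] = value
        end
    end
    return result
end

local new_cfg = migrate_cfg({
    snap_dir = './snapshots',
    snapshot_count = 2,
    election_fencing_enabled = false,
})
print(new_cfg.memtx_dir, new_cfg.checkpoint_count, new_cfg.election_fencing_mode)
```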

tarantool command-line options

tarantool is the Tarantool database and application server. This command can be used for different purposes, for example, running a single Tarantool instance or starting an external coordinator used for a supervised failover. The tarantool command also provides additional options that might be helpful for development purposes.

Note

The tt utility is the recommended way to start Tarantool instances. Learn more from Starting and stopping instances.

Starting instances using the tarantool command

Below is the syntax for starting a Tarantool instance configured in a file:

$ tarantool [OPTION ...] --name INSTANCE_NAME --config CONFIG_FILE_PATH

The command below starts router-a-001 configured in the config.yaml file:

$ tarantool --name router-a-001 --config config.yaml

Options

-h, --help¶: Print an annotated list of all available options and exit.

--help-env-list¶

Since: 3.0.0.

Show a list of environment variables that can be used to configure Tarantool.

--failover¶

Since: 3.1.0.

Enterprise Edition

This option is supported by the Enterprise Edition only.

Start an external coordinator used for a supervised failover.

--force-recovery¶

Since: 3.0.0.

Try to start an instance if there is an error while reading a corrupted snapshot or write-ahead log file during the recovery process:

For a corrupted snapshot file – at the instance start.
For a corrupted write-ahead log file – at the instance start or when applying an update at a replica.

With this option enabled, Tarantool skips invalid records, reads as much data as possible, and lets the process finish with a warning. When the instance has started, call box.snapshot() to make a new snapshot so that the corrupted snapshots or write-ahead logs aren’t used for recovery anymore.

You can also enable force recovery using the TT_FORCE_RECOVERY environment variable. TT_FORCE_RECOVERY has a lower priority than the --force-recovery option.

Example on GitHub: force_recovery

-v, -V, --version¶

Print the product name and version.

Example

$ tarantool --version
Tarantool Enterprise 3.0.0-beta1-2-gcbb569b4c-r607-gc64
Target: Linux-x86_64-RelWithDebInfo
...

In this example:

3.0.0 is a Tarantool version. Tarantool follows semantic versioning, which is described in the Tarantool release policy section.
Target is the platform Tarantool is built on. Platform-specific details may follow this line.

-c, --config PATH¶

Since: 3.0.0.

Set a path to a YAML configuration file. You can also configure this value using the TT_CONFIG environment variable.

-n, --name INSTANCE¶

Since: 3.0.0.

Set the name of an instance to run. You can also configure this value using the TT_INSTANCE_NAME environment variable.

-i¶

Enter interactive mode.

Example

$ tarantool -i

-e EXPR¶

Execute the ‘EXPR’ string. See also: lua man page.

Example

$ tarantool -e 'print("Hello, world!")'
Hello, world!

-l NAME¶

Require the ‘NAME’ library. See also: lua man page.

Example

$ tarantool -l luatest.coverage script.lua

-j cmd¶

Perform a LuaJIT control command. See also: Command Line Options.

Example

$ tarantool -j off app.lua

-b ...¶

Save or list bytecode. See also: Command Line Options.

Example

$ tarantool -b test.lua test.out

-d SCRIPT¶

Activate a debugging session for ‘SCRIPT’. See also: luadebug.lua.

Example

$ tarantool -d app.lua

--¶: Stop handling options. See also: lua man page.

-¶: Stop handling options and execute the standard input as a file. See also: lua man page.

SQL reference

This reference covers all the SQL statements and clauses supported by Tarantool.

What Tarantool’s SQL product delivers

Tarantool’s SQL is a major new feature that was first introduced with Tarantool version 2.1.
The primary advantages are:
- a high level of SQL compatibility
- an easy way to switch from NoSQL to SQL and back
- the Tarantool brand.

The “high level of SQL compatibility” includes support for joins, subqueries, triggers, indexes, groupings, transactions in a multi-user environment, and conformance with the majority of the mandatory requirements of the SQL:2016 standard.

The “easy way to switch” consists of the fact that the same tables can be operated on with SQL and with the long-established Tarantool-NoSQL product, meaning that when you want standard Relational-DBMS jobs you can do them, and when you want NoSQL capability you can have it (Tarantool-NoSQL outperforms other NoSQL products in public benchmarks).

The “Tarantool brand” comes from the support of a multi-billion-dollar internet / mail / social-network provider, a dozens-of-professionals staff of programmers and support people, a community who believes in open-source BSD licensing, and hundreds of corporations / government bodies using Tarantool products in production already.

The status of Tarantool’s SQL feature is “release”. So, it is working now and you can verify that by downloading it and trying all the features, which will be explained in the rest of this document. There is also a tutorial.

Differences from other products

Differences from other SQL products: The Tarantool design requirement is that Tarantool’s SQL conforms to the majority of the listed mandatory requirements of the core SQL:2016 standard, and this will be shown in the specific conformance statements in the feature list in a section about “compliance with the official SQL standard”. Possibly the deviations which most people will find notable are: type checking is less strict, and some data definition options must be done with NoSQL syntax.

Differences from other NoSQL products: By examining attempts by others to paste relatively smaller subsets of SQL onto NoSQL products, it should be possible to conclude that Tarantool’s SQL has demonstrably more features and capabilities. The reason is that the Tarantool developers started with a complete code base of a working SQL DBMS and made it work with Tarantool-NoSQL underneath, rather than starting with a NoSQL DBMS and adding syntax to it.

What Tarantool’s SQL manual delivers

The following parts of this document are:
The SQL User Guide explains “How to get Started” and explains the terms and the syntax elements that apply for all SQL statements.
The SQL Statements and Clauses guide explains, for each SQL statement, the format and the rules and the exceptions and the examples and the limitations.
The SQL Plus Lua guide has the details about calling Lua from SQL, calling SQL from Lua, and using the same database objects in both SQL and Lua.
The SQL Features list shows how the product conforms with the mandatory features of the SQL standard.

Users are expected to know what databases are, and experience with other SQL DBMSs would be an advantage. To learn about the basics of relational database management and SQL in particular, check the SQL Beginners’ Guide in the How-to guides section.

SQL user guide

The User Guide describes how to get started with SQL in Tarantool and introduces the necessary concepts.

Heading	Summary
Getting Started	Typing SQL statements on a console
Supported Syntax	For example what characters are allowed
Concepts	tokens, literals, identifiers, operands, operators, expressions, statements
Data type conversion	Casting, implicit or explicit

Getting Started

The explanations for installing and starting the Tarantool server are in earlier chapters of the Tarantool manual.

To get started specifically with the SQL features, using Tarantool as a client, execute these requests:

box.cfg{}
box.execute([[VALUES ('hello');]])

The bottom of the screen should now look like this:

tarantool> box.execute([[VALUES ('hello');]])
---
- metadata:
  - name: COLUMN_1
    type: string
  rows:
  - ['hello']
...

That’s an SQL statement done with Tarantool.

Now you are ready to execute any SQL statements via the connection. For example:

box.execute([[CREATE TABLE things (id INTEGER PRIMARY KEY,
                                   remark STRING);]])
box.execute([[INSERT INTO things VALUES (55, 'Hello SQL world!');]])
box.execute([[SELECT * FROM things WHERE id > 0;]])

And you will see the results of the SQL query.

For the rest of this chapter, the box.execute([[…]]) enclosure will not be shown. Examples will simply say what a piece of syntax looks like, such as SELECT 'hello';
and users should know that it must be entered as
box.execute([[SELECT 'hello';]])
It is also legal to enclose SQL statements inside single or double quote marks instead of [[ … ]].
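
The three quoting styles are ordinary Lua string syntax, so they produce the same statement text. The short sketch below shows this in plain Lua, without calling box.execute. Note that with single quote marks, the SQL string literal's own quotes must be escaped.

```lua
local with_brackets = [[SELECT 'hello';]]
local with_double   = "SELECT 'hello';"
local with_single   = 'SELECT \'hello\';'

-- All three variables hold the same statement text.
print(with_brackets == with_double, with_double == with_single)
```

Long brackets are usually the most convenient choice because SQL string literals need no escaping inside them.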

Supported Syntax

Keywords, for example CREATE or INSERT or VALUES, may be entered in either upper case or lower case.

Literal values, for example 55 or 'Hello SQL world!', should be entered without single quote marks if they are numeric, and should be entered with single quote marks if they are strings.

Object names, for example table1 or column1, should usually be entered without double quote marks and are subject to some restrictions. They may be enclosed in double quote marks and in that case they are subject to fewer restrictions.

Almost all keywords are reserved, which means that they cannot be used as object names unless they are enclosed in double quote marks.

Comments may be between /* and */ (bracketed) or between -- and the end of a line (simple).

INSERT /* This is a bracketed comment */ INTO t VALUES (5);
INSERT INTO t VALUES (5); -- this is a simple comment

Expressions, for example a + b or a > b AND NOT a <= b, may have arithmetic operators + - / *, may have comparison operators = > < <= >= LIKE, and may be combined with AND OR NOT, with optional parentheses.

Concepts

The SQL beginners’ guide discusses:
What are: relational databases, tables, views, rows, and columns?
What are: transactions, write-ahead logs, commits and rollbacks?
What are: security considerations?
How to: add, delete, or update rows in tables?
How to: work inside transactions with commits and/or rollbacks?
How to: select, join, filter, group, and sort rows?

Tarantool has a “schema”. A schema is a container for all database objects. A schema may be called a “database” in other DBMS implementations.

Tarantool allows four types of “database objects” to be created within the schema: tables, triggers, indexes, and constraints. Within tables, there are “columns”.

Almost all Tarantool SQL statements begin with a reserved-word “verb” such as INSERT, and end optionally with a semicolon. For example: INSERT INTO t VALUES (1);

A Tarantool SQL database and a Tarantool NoSQL database are the same thing. However, some operations are only possible with SQL, and others are only possible with NoSQL. Mixing SQL statements with NoSQL requests is allowed.

Tokens

The token is the minimum SQL-syntax unit that Tarantool understands. These are the types of tokens:

Keywords – official words in the language, for example SELECT
Literals – constants for numerics or strings, for example 15.7 or 'Taranto'
Identifiers – for example column55 or table_of_accounts
Operators (strictly speaking “non-alphabetic operators”) – for example * / + - ( ) , ; < = >=

Tokens can be separated from each other by one or more separators:
* White space characters: tab (U+0009), line feed (U+000A), vertical tab (U+000B), form feed (U+000C), carriage return (U+000D), space (U+0020), next line (U+0085), and all the rare characters in Unicode classes Zl and Zp and Zs. For a full list see https://github.com/tarantool/tarantool/issues/2371.
* Bracketed comments (beginning with /* and ending with */)
* Simple comments (beginning with -- and ending with line feed)
Separators are not necessary before or after operators.
Separators are necessary after keywords or numerics or ordinary identifiers, unless the following token is an operator.
Thus Tarantool can understand this series of six tokens:
SELECT'a'FROM/**/t;
but for readability one would usually use spaces to separate tokens:
SELECT 'a' FROM /**/ t;

Literals

There are eight kinds of literals: BOOLEAN, INTEGER, DOUBLE, DECIMAL, STRING, VARBINARY, MAP, and ARRAY.

BOOLEAN literals:
TRUE | FALSE | UNKNOWN
A literal has data type = BOOLEAN if it is the keyword TRUE or FALSE. UNKNOWN is a synonym for NULL. A literal may have type = BOOLEAN if it is the keyword NULL and there is no context to indicate a different data type.

INTEGER literals:
[plus-sign | minus-sign] digit [digit …]
or, for a hexadecimal integer literal,
[plus-sign | minus-sign] 0X | 0x hexadecimal-digit [hexadecimal-digit …]
Examples: 5, -5, +5, 55555, 0X55, 0x55
Hexadecimal 0X55 is equal to decimal 85. A literal has data type = INTEGER if it contains only digits and is in the range -9223372036854775808 to +18446744073709551615; integers outside that range are illegal.

DOUBLE literals:
[E|e [plus-sign | minus-sign] digit …]
Examples: 1E5, 1.1E5.
A literal has data type = DOUBLE if it contains “E”. DOUBLE literals are also known as floating-point literals or approximate-numeric literals. To represent “Inf” (infinity), write a real numeric outside the double-precision numeric range, for example 1E309. To represent “nan” (not a number), write an expression that does not result in a real numeric, for example 0/0, using Tarantool/NoSQL. This will appear as NULL in Tarantool/SQL; in a future version “nan” may not appear as NULL. In an earlier version, literals containing periods were considered to be NUMBER literals. Prior to Tarantool v. 2.10.0, digits with periods such as .0 were considered to be DOUBLE literals, but now they are considered to be DECIMAL literals.

DECIMAL literals:
[plus-sign | minus-sign] [digit [digit …]] period [digit [digit …]]
Examples: .0, 1.0, 12345678901234567890.123456789012345678
A literal has data type = DECIMAL if it contains a period, and does not contain “E”. DECIMAL literals may contain up to 38 digits; if there are more, then post-decimal digits may be subject to rounding. In earlier Tarantool versions literals containing periods were considered to be NUMBER or DECIMAL literals.

STRING literals:
[quote] [character …] [quote]
Examples: 'ABC', 'AB''C'
A literal has data type = STRING if it is a sequence of zero or more characters enclosed in single quotes. The sequence '' (two single quotes in a row) is treated as ' (a single quote) when enclosed in quotes, that is, 'A''B' is interpreted as A'B.
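
The quote-doubling rule can be illustrated with a small Lua helper that turns an arbitrary Lua string into a STRING literal. The function name is hypothetical and the helper is only a sketch of the rule described above, not a Tarantool API.

```lua
-- Hypothetical helper: wrap a Lua string in single quotes and
-- double every embedded single quote.
local function sql_string_literal(s)
    local escaped = s:gsub("'", "''")
    return "'" .. escaped .. "'"
end

print(sql_string_literal("A'B"))
print(sql_string_literal("Hello SQL world!"))
```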

VARBINARY literals:
X|x [quote] [hexadecimal-digit-pair …] [quote]
Example: X'414243', which will be displayed as 'ABC'.
A literal has data type = VARBINARY (“variable-length binary”) if it is the letter X followed by quotes containing pairs of hexadecimal digits, representing byte values.

MAP literals:
[left curly bracket] key [colon] value [right curly bracket]
Examples: {'a':1}, {1:'a'}
A map literal is a pair of curly brackets (also called “braces”) enclosing a STRING or INTEGER or UUID literal (called the map “key”) followed by a colon followed by any type of literal (called the map “value”). This is a minimal form of a MAP expression.

ARRAY literals:
[left square bracket] [literal] [right square bracket]
Examples: [1], ['a']
An ARRAY literal is a literal value which is enclosed inside square brackets. This is a minimal form of an ARRAY expression.

Here are four ways to put non-ASCII characters, such as the Greek letter α (alpha), in string literals.
First, make sure that your shell program is set to accept characters as UTF-8. A simple way to check is
SELECT hex(cast('α' as VARBINARY)); If the result is CEB1 – which is the hexadecimal value for the UTF-8 representation of α – it is good.

(1) Simply enclose the character inside '...',
'α'

(2) Find out what is the hexadecimal code for the UTF-8 representation of α, and enclose that inside X'...', then cast to STRING because X'...' literals are data type VARBINARY not STRING,
CAST(X'CEB1' AS STRING)

(3) Find out what is the Unicode code point for α, and pass that to the CHAR function.
CHAR(945) /* remember that this is α as data type STRING not VARBINARY */

(4) Enclose statements inside double quotes and include Lua escapes, for example box.execute("SELECT '\206\177';")

One can use the concatenation operator || to combine characters made with any of these methods.
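
The relationship between the code point 945, the bytes 206 and 177, and the hexadecimal value CEB1 can be checked in plain Lua. Both helper names below are hypothetical, and the encoder is a sketch that handles only two-byte UTF-8 sequences (code points 128 to 2047).

```lua
-- Hypothetical helper: hexadecimal dump of a byte string.
local function to_hex(s)
    return (s:gsub('.', function(c)
        return string.format('%02X', c:byte())
    end))
end

-- Hypothetical helper: UTF-8 encoding for code points 128..2047 only.
local function utf8_two_byte(code_point)
    local lead = 192 + math.floor(code_point / 64)  -- 110xxxxx
    local trail = 128 + code_point % 64             -- 10xxxxxx
    return string.char(lead, trail)
end

local alpha = utf8_two_byte(945)
print(to_hex(alpha))          -- hexadecimal form used in X'...'
print(alpha == '\206\177')    -- same bytes as the Lua escape in method (4)
```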

Limitations: (Issue#2344)
* LENGTH('A''B') = 3 which is correct, but on the Tarantool console the display from SELECT 'A''B'; is A''B, which is misleading.
* It is unfortunate that X'41' is a byte sequence which looks the same as 'A', but it is not the same. box.execute("select 'A' < X'41';") is not legal at the moment. This happens because TYPEOF(X'41') yields 'varbinary'. Also, it is illegal to say UPDATE ... SET string_column = X'41'; one must say UPDATE ... SET string_column = CAST(X'41' AS STRING);.

Identifiers

All database objects – tables, triggers, indexes, columns, constraints, functions, collations – have identifiers. An identifier should begin with a letter or underscore ('_') and should contain only letters, digits, dollar signs ('$'), or underscores. The maximum number of bytes in an identifier is between 64982 and 65000. For compatibility reasons, Tarantool recommends that an identifier should not have more than 30 characters.

Letters in identifiers do not have to come from the Latin alphabet; for example, the Japanese syllabic ひ and the Cyrillic letter д are legal. Be aware, however, that a Latin letter needs only one byte while a Cyrillic letter needs two, so Cyrillic identifiers consume slightly more space.

Reserved words

Certain words are reserved and should not be used for identifiers. The simple rule is: if a word means something in Tarantool SQL syntax, do not try to use it for an identifier. The current list of reserved words is:

ALL ALTER ANALYZE AND ANY ARRAY AS ASC ASENSITIVE AUTOINCREMENT BEGIN BETWEEN BINARY BLOB BOOL BOOLEAN BOTH BY CALL CASE CAST CHAR CHARACTER CHECK COLLATE COLUMN COMMIT CONDITION CONNECT CONSTRAINT CREATE CROSS CURRENT CURRENT_DATE CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR DATE DATETIME DEC DECIMAL DECLARE DEFAULT DEFERRABLE DELETE DENSE_RANK DESC DESCRIBE DETERMINISTIC DISTINCT DOUBLE DROP EACH ELSE ELSEIF END ESCAPE EXCEPT EXISTS EXPLAIN FALSE FETCH FLOAT FOR FOREIGN FROM FULL FUNCTION GET GRANT GROUP HAVING IF IMMEDIATE IN INDEX INNER INOUT INSENSITIVE INSERT INT INTEGER INTERSECT INTO IS ITERATE JOIN LEADING LEAVE LEFT LIKE LIMIT LOCALTIME LOCALTIMESTAMP LOOP MAP MATCH NATURAL NOT NULL NUM NUMBER NUMERIC OF ON OR ORDER OUT OUTER OVER PARTIAL PARTITION PRAGMA PRECISION PRIMARY PROCEDURE RANGE RANK READS REAL RECURSIVE REFERENCES REGEXP RELEASE RENAME REPEAT REPLACE RESIGNAL RETURN REVOKE RIGHT ROLLBACK ROW ROWS ROW_NUMBER SAVEPOINT SCALAR SELECT SENSITIVE SEQSCAN SESSION SET SIGNAL SIMPLE SMALLINT SPECIFIC SQL START STRING SYSTEM TABLE TEXT THEN TO TRAILING TRANSACTION TRIGGER TRIM TRUE TRUNCATE UNION UNIQUE UNKNOWN UNSIGNED UPDATE USER USING UUID VALUES VARBINARY VARCHAR VIEW WHEN WHENEVER WHERE WHILE WITH

Identifiers may be enclosed in double quotes. These are called quoted identifiers or “delimited identifiers” (unquoted identifiers may be called “regular identifiers”). The double quotes are not part of the identifier. A delimited identifier may be a reserved word and may contain any printable character. Tarantool converts letters in regular identifiers to upper case before it accesses the database, so for statements like CREATE TABLE a (a INTEGER PRIMARY KEY); or SELECT a FROM a; the table name is A and the column name is A. However, Tarantool does not convert delimited identifiers to upper case, so for statements like CREATE TABLE "a" ("a" INTEGER PRIMARY KEY); or SELECT "a" FROM "a"; the table name is a and the column name is a. The sequence "" is treated as " when enclosed in double quotes, that is, "A""B" is interpreted as A"B.

Examples: things, t45, journal_entries_for_2017, ддд, "into"

Inside certain statements, identifiers may have “qualifiers” to prevent ambiguity. A qualifier is an identifier of a higher-level object, followed by a period. For example column1 within table1 may be referred to as table1.column1. The “name” of an object is the same as its identifier, or its qualified identifier. For example, inside SELECT table1.column1, table2.column1 FROM table1, table2; the qualifiers make it clear that the first column is column1 from table1 and the second column is column1 from table2.

The rules are sometimes relaxed for compatibility reasons, for example some non-letter characters such as $ and « are legal in regular identifiers. However, it is better to assume that rules are never relaxed.

The following are examples of legal and illegal identifiers.

_A1   -- legal, begins with underscore and contains underscore | letter | digit
1_A   -- illegal, begins with digit
A$« -- legal, but not recommended, try to stick with digits and letters and underscores
+ -- illegal, operator token
grant -- illegal, GRANT is a reserved word
"grant" -- legal, delimited identifiers may be reserved words
"_space" -- legal, but Tarantool already uses this name for a system space
"A"."X" -- legal, for columns only, inside statements where qualifiers may be necessary
'a' -- illegal, single quotes are for literals not identifiers
A123456789012345678901234567890 -- legal, identifiers can be long
ддд -- legal, and will be converted to upper case in identifiers

The following example shows that conversion to upper case affects regular identifiers but not delimited identifiers.

CREATE TABLE "q" ("q" INTEGER PRIMARY KEY);
SELECT * FROM q;
-- Result = "error: 'no such table: Q'".
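The case-folding rule can be modelled outside the database. The following Python sketch is only an illustration, not Tarantool's parser: the function name normalize_identifier is hypothetical, and it assumes that Python's str.upper() is close enough to the upper-case conversion that Tarantool applies to regular identifiers.

```python
def normalize_identifier(token: str) -> str:
    """Return the name that would be looked up for an SQL identifier token."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        # Delimited identifier: drop the outer quotes, collapse "" to ",
        # and keep the letter case exactly as written.
        return token[1:-1].replace('""', '"')
    # Regular identifier: letters are converted to upper case.
    return token.upper()


print(normalize_identifier('a'))        # A
print(normalize_identifier('"a"'))      # a
print(normalize_identifier('"A""B"'))   # A"B
# The regular identifier q and the delimited identifier "q" name different objects.
print(normalize_identifier('q') == normalize_identifier('"q"'))  # False
```

The last line mirrors the CREATE TABLE "q" example: the lookup for q uses the name Q, which does not match the stored name q.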

Operands

An operand is something that can be operated on. Literals and column identifiers are operands. So are NULL and DEFAULT.

NULL and DEFAULT are keywords which represent values whose data types are not known until they are assigned or compared, so they are known by the technical term “contextually typed value specifications”. (Exception: for the non-standard statement “SELECT NULL FROM table-name;” NULL has data type BOOLEAN.)

Operand data types

Every operand has a data type.

For literals, as seen earlier, the data type is usually determined by the format.

For identifiers, the data type is usually determined by the definition.

The usual determination may change because of context or because of explicit casting.

For some SQL data type names there are aliases. An alias may be used for data definition. For example VARCHAR(5) and TEXT are aliases of STRING and may appear in CREATE TABLE table_name (column_name VARCHAR(5) PRIMARY KEY); but Tarantool, if asked, will report that the data type of column_name is STRING.

For every SQL data type there is a corresponding NoSQL type, for example an SQL STRING is stored in a NoSQL space as type = ‘string’.

To avoid confusion in this manual, all references to SQL data type names are in upper case and all similar words which refer to NoSQL types or to other kinds of object are in lower case, for example:

STRING is a data type name, but string is a general term;
NUMBER is a data type name, but numeric is a general term.

Although it is common to say that a VARBINARY value is a “binary string”, this manual will not use that term and will instead say “byte sequence”.

Here are all the SQL data types, their corresponding NoSQL types, their aliases, and minimum / maximum literal examples.

SQL type	NoSQL type	Aliases	Minimum	Maximum
BOOLEAN	boolean	BOOL	FALSE	TRUE
INTEGER	integer	INT	-9223372036854775808	18446744073709551615
UNSIGNED	unsigned	(none)	0	18446744073709551615
DOUBLE	double	(none)	-1.79769e308	1.79769e308
NUMBER	number	(none)	-1.79769e308	1.79769e308
DECIMAL	decimal	DEC	-99999999999999999999999999999999999999	99999999999999999999999999999999999999
STRING	string	TEXT, VARCHAR(n)	`''`	`'many-characters'`
VARBINARY	varbinary	(none)	`X''`	`X'many-hex-digits'`
UUID	uuid	(none)	00000000-0000-0000-0000-000000000000	ffffffff-ffff-ffff-dfff-ffffffffffff
DATETIME	datetime	(none)
INTERVAL	interval	(none)
SCALAR	(varies)	(none)	FALSE	maximum UUID value
MAP	map	(none)	`{}`	`{big-key:big-value}`
ARRAY	array	(none)	[]	`[many values]`
ANY	any	(none)	FALSE	`[many values]`
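The alias column of the table can be read as a simple lookup. The Python sketch below is an illustration only: ALIASES and canonical_type are hypothetical names, and the sketch assumes that an optional length such as (5) can be dropped, as the text states for VARCHAR.

```python
import re

# Alias -> SQL type name, copied from the table above.
ALIASES = {
    "BOOL": "BOOLEAN",
    "INT": "INTEGER",
    "DEC": "DECIMAL",
    "TEXT": "STRING",
    "VARCHAR": "STRING",
}


def canonical_type(declared: str) -> str:
    """Map a declared type such as 'VARCHAR(5)' to the type name reported back."""
    # Drop an optional trailing length such as "(5)".
    name = re.sub(r"\s*\(\s*\d+\s*\)\s*$", "", declared.strip()).upper()
    return ALIASES.get(name, name)


print(canonical_type("VARCHAR(5)"))  # STRING
print(canonical_type("INT"))         # INTEGER
```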

BOOLEAN values are FALSE, TRUE, and UNKNOWN (which is the same as NULL). FALSE is less than TRUE.

INTEGER values are numerics that do not contain decimal points and are not expressed with exponential notation. The range of possible values is between -2^63 and 2^64 - 1 (the minimum and maximum shown in the table above), or NULL.

UNSIGNED values are numerics that do not contain decimal points and are not expressed with exponential notation. The range of possible values is between 0 and 2^64 - 1 (the maximum shown in the table above), or NULL.
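As a quick check of the INTEGER and UNSIGNED bounds, the following Python sketch takes the minimum and maximum literals from the table above. The helper names fits_integer and fits_unsigned are hypothetical, and the sketch says nothing about how Tarantool itself reports an out-of-range value.

```python
INTEGER_MIN = -9223372036854775808   # -2^63, from the table
INTEGER_MAX = 18446744073709551615   # 2^64 - 1, from the table
UNSIGNED_MIN = 0
UNSIGNED_MAX = 18446744073709551615  # 2^64 - 1, from the table


def fits_integer(value: int) -> bool:
    """True if value lies inside the INTEGER range shown in the table."""
    return INTEGER_MIN <= value <= INTEGER_MAX


def fits_unsigned(value: int) -> bool:
    """True if value lies inside the UNSIGNED range shown in the table."""
    return UNSIGNED_MIN <= value <= UNSIGNED_MAX


print(fits_integer(-1), fits_unsigned(-1))  # True False
print(fits_integer(2**64))                  # False
```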

DOUBLE values are numerics that do contain decimal points (for example 0.5) or are expressed with exponential notation (for example 5E-1). The range of possible values is the same as for the IEEE 754 floating-point standard, or NULL. Numerics outside the range of DOUBLE literals may be displayed as -inf or inf.

NUMBER values have the same range as DOUBLE values. But NUMBER values may also be integers. There is no literal format for NUMBER (literals like 1.5 or 1E555 are considered to be DOUBLEs), so use CAST to insist that a numeric has data type NUMBER, but that is rarely necessary. See the description of NoSQL type ‘number’. Support for arithmetic and built-in arithmetic functions with NUMBERs was removed in Tarantool version 2.10.1.

DECIMAL values can contain up to 38 digits on either side of a decimal point, and any arithmetic with DECIMAL values has exact results (arithmetic with DOUBLE values could have approximate results instead of exact results). Before Tarantool v. 2.10.0 there was no literal format for DECIMAL, so it was necessary to use CAST to insist that a numeric has data type DECIMAL, for example CAST(1.1 AS DECIMAL) or CAST('99999999999999999999999999999999999999' AS DECIMAL). See the description of NoSQL type ‘decimal’. DECIMAL support in SQL was added in Tarantool version 2.10.1.

STRING values are any sequence of zero or more characters encoded with UTF-8, or NULL. The possible character values are the same as for the Unicode standard. Byte sequences which are not valid UTF-8 characters are allowed but not recommended. STRING literal values are enclosed within single quotes, for example 'literal'. If the VARCHAR alias is used for column definition, it must include a maximum length, for example column_1 VARCHAR(40). However, the maximum length is ignored. The data-type may be followed by [COLLATE collation-name].

VARBINARY values are any sequence of zero or more octets (bytes), or NULL. VARBINARY literal values are expressed as X followed by pairs of hexadecimal digits enclosed within single quotes, for example X'0044'. VARBINARY’s NoSQL equivalent is 'varbinary', which is not a character string: the MessagePack storage is MP_BIN (MsgPack binary).

UUID (Universally unique identifier) values are 32 hexadecimal digits, or NULL. The usual format is a string with five fields separated by hyphens, 8-4-4-4-12, for example '000024ac-7ca6-4ab2-bd75-34742ac91213'. The MessagePack storage is MP_EXT (MsgPack extension) with 16 bytes. UUID values may be created with Tarantool/NoSQL Module uuid, or with the UUID() function, or with the CAST() function. UUID support in SQL was added in Tarantool version 2.9.1.

DATETIME. Introduced in v. 2.10.0. A datetime table field can be created by using this type, which is semantically equivalent to the standard TIMESTAMP WITH TIME ZONE type.

tarantool> create table T2(d datetime primary key);
---
- row_count: 1
...

tarantool> insert into t2 values ('2022-01-01');
---
- null
- 'Type mismatch: can not convert string(''2022-01-01'') to datetime'
...

tarantool> insert into t2 values (cast('2022-01-01' as datetime));
---
- row_count: 1
...

tarantool> select * from t2;
---
- metadata:
  - name: D
    type: datetime
  rows:
  - ['2022-01-01T00:00:00Z']
...

There is no implicit cast available from a string expression to a datetime expression (unlike the convention used by the majority of SQL vendors). In such cases, you need to use an explicit cast from a string value to a datetime value (see the example above).

You can subtract a datetime from a datetime, subtract an interval from a datetime, or add a datetime and an interval in any order (see examples of such arithmetic in the description of the INTERVAL type).

The built-in functions related to the DATETIME type are DATE_PART() and NOW().

INTERVAL. Introduced in v. 2.10.0. Similarly to the DATETIME type, you can define a column of the INTERVAL type.

tarantool> create table T(d datetime primary key, i interval);
---
- row_count: 1
...

tarantool> insert into T values (cast('2022-02-02T01:01' as datetime), cast({'year': 1, 'month': 1} as interval));
---
- row_count: 1
...

tarantool> select * from t;
---
- metadata:
  - name: D
    type: datetime
  - name: I
    type: interval
  rows:
  - ['2022-02-02T01:01:00Z', '+1 years, 1 months']
...

Unlike DATETIME, INTERVAL cannot be a part of an index.

There is no implicit cast available for conversions to an interval from a string or any other type. However, an explicit cast from a map is allowed (see examples below).

Intervals can be used in arithmetic operations like + or - only with a datetime expression or another interval:

tarantool> select * from t
---
- metadata:
  - name: D
    type: datetime
  - name: I
    type: interval
  rows:
  - ['2022-02-02T01:01:00Z', '+1 years, 1 months']
...

tarantool> select d, d + i, d + cast({'year': 1, 'month': 2} as interval) from t
---
- metadata:
  - name: D
    type: datetime
  - name: COLUMN_1
    type: datetime
  - name: COLUMN_2
    type: datetime
  rows:
  - ['2022-02-02T01:01:00Z', '2023-03-02T01:01:00Z', '2023-04-02T01:01:00Z']
...

tarantool> select i + cast({'year': 1, 'month': 2} as interval) from t
---
- metadata:
  - name: COLUMN_1
    type: interval
  rows:
  - ['+2 years, 3 months']
...
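The year and month arithmetic in the results above can be reproduced by hand. The Python sketch below is an illustration under stated assumptions, not Tarantool's implementation: add_year_month is a hypothetical helper, it covers only the year and month attributes, and it assumes the day of month exists in the target month (how Tarantool adjusts an overflowing day is not modelled).

```python
from datetime import datetime


def add_year_month(value: datetime, years: int = 0, months: int = 0) -> datetime:
    """Add a years/months interval to a datetime using calendar arithmetic."""
    # Work with a zero-based month count so that carries into the year are simple.
    total = value.year * 12 + (value.month - 1) + years * 12 + months
    year, month0 = divmod(total, 12)
    # Assumption: the day of month is valid in the target month;
    # otherwise replace() raises ValueError.
    return value.replace(year=year, month=month0 + 1)


d = datetime(2022, 2, 2, 1, 1)
print(add_year_month(d, years=1, months=1).isoformat())  # 2023-03-02T01:01:00
print(add_year_month(d, years=1, months=2).isoformat())  # 2023-04-02T01:01:00
```

The two printed values correspond to the d + i and d + cast({'year': 1, 'month': 2} as interval) columns of the query result.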

If you want to convert a map to an INTERVAL expression, use keys from the following predefined list of known attributes:

year
month
week
day
hour
min
sec
nsec

tarantool> select cast({'year': 1, 'month': 1, 'week': 1, 'day': 1, 'hour': 1, 'min': 1, 'sec': 1} as interval)
---
- metadata:
  - name: COLUMN_1
    type: interval
  rows:
  - ['+1 years, 1 months, 1 weeks, 1 days, 1 hours, 1 minutes, 1 seconds']
...

tarantool> \set language lua


tarantool> v = {year = 1, month = 1, week = 1, day = 1, hour = 1,
         >      min = 1, sec = 1, nsec = 1, adjust = 'none'}
---
...

tarantool> box.execute('select cast(#v as interval);', {{['#v'] = v}})

---
- metadata:
  - name: COLUMN_1
    type: interval
  rows:
  - ['+1 years, 1 months, 1 weeks, 1 days, 1 hours, 1 minutes, 1.000000001 seconds']
...

SCALAR can be used for column definitions and the individual column values have type SCALAR. See Column definition – the rules for the SCALAR data type. The data-type may be followed by [COLLATE collation-name]. Prior to Tarantool version 2.10.1, individual column values had one of the preceding types – BOOLEAN, INTEGER, DOUBLE, DECIMAL, STRING, VARBINARY, or UUID. Starting in Tarantool version 2.10.1, all values have type SCALAR.

MAP values are key:value combinations which can be produced with MAP expressions. Maps cannot be used in arithmetic or comparison (except IS [NOT] NULL), and the only functions where they are allowed are CAST, QUOTE, TYPEOF, and functions involving NULL comparisons.

ARRAY values are lists which can be produced with ARRAY expressions. Arrays cannot be used in arithmetic or comparison (except IS [NOT] NULL), and the only functions where they are allowed are CAST, QUOTE, TYPEOF, and functions involving NULL comparisons.

ANY can be used for column definitions and the individual column values have type ANY. The difference between SCALAR and ANY is:

SCALAR columns may not contain MAP or ARRAY values, but ANY columns may contain them.
SCALAR values are comparable, while ANY values are not comparable.

Any value of any data type may be NULL. Ordinarily NULL will be cast to the data type of any operand it is being compared to or to the data type of the column it is in. If the data type of NULL cannot be determined from context, it is BOOLEAN.

Most of the SQL data types correspond to Tarantool/NoSQL types with the same name. In Tarantool versions before v. 2.10.0, there were also some Tarantool/NoSQL data types which had no corresponding SQL data types. In those versions, if Tarantool/SQL read a Tarantool/NoSQL value of a type that had no SQL equivalent, Tarantool/SQL could treat it as NULL or INTEGER or VARBINARY. For example, SELECT "flags" FROM "_vspace"; would return a column whose type is 'map'. Such columns can only be manipulated in SQL by invoking Lua functions.

Operators

An operator signifies what operation can be performed on operands.

Almost all operators are easy to recognize because they consist of one-character or two-character non-alphabetic tokens, except for six keyword operators (AND IN IS LIKE NOT OR).

Almost all operators are “dyadic”, that is, they are performed on a pair of operands – the only operators that are performed on a single operand are NOT and ~ and (sometimes) -.

The result of an operation is a new operand. If the operator is a comparison operator then the result has data type BOOLEAN (TRUE or FALSE or UNKNOWN). Otherwise the result has the same data type as the original operands, except that promotion to a broader type may occur to avoid overflow. Arithmetic with NULL operands will result in a NULL operand.

In the following list of operators, the tag “(arithmetic)” indicates that all operands are expected to be numerics (other than NUMBER) and should result in a numeric; the tag “(comparison)” indicates that operands are expected to have similar data types and should result in a BOOLEAN; the tag “(logic)” indicates that operands are expected to be BOOLEAN and should result in a BOOLEAN. Exceptions may occur where operations are not possible, but see the “special situations” which are described after this list. Although all examples show literals, they could just as easily show column identifiers.

Starting with Tarantool version 2.10.1, arithmetic operands cannot be NUMBERs.

+ addition (arithmetic)

Add two numerics according to standard arithmetic rules.

Example: 1 + 5, result = 6.

- subtraction (arithmetic)

Subtract second numeric from first numeric according to standard arithmetic rules.

Example: 1 - 5, result = -4.

* multiplication (arithmetic)

Multiply two numerics according to standard arithmetic rules.

Example: 2 * 5, result = 10.

/ division (arithmetic)

Divide second numeric into first numeric according to standard arithmetic rules. Division by zero is not legal. Division of integers always results in rounding toward zero; use CAST to DOUBLE or to DECIMAL to get non-integer results.

Example: 5 / 2, result = 2.

% modulus (arithmetic)

Divide second numeric into first numeric according to standard arithmetic rules. The result is the remainder. Starting with Tarantool version 2.10.1, operands must be INTEGER or UNSIGNED.

Examples: 17 % 5, result = 2; -123 % 4, result = -3.
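The division and modulus examples above follow from rounding toward zero. Because Python's own // and % operators round toward negative infinity, the Python sketch below spells the rule out explicitly. The names sql_int_div and sql_int_mod are hypothetical, and the sketch is a model of the stated rule rather than Tarantool's code.

```python
def sql_int_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero."""
    if b == 0:
        raise ZeroDivisionError("division by zero is not legal")
    quotient = abs(a) // abs(b)
    # The result is negative only when the operands have different signs.
    return quotient if (a >= 0) == (b >= 0) else -quotient


def sql_int_mod(a: int, b: int) -> int:
    """Remainder consistent with sql_int_div: a == b * div + mod."""
    return a - b * sql_int_div(a, b)


print(sql_int_div(5, 2))     # 2
print(sql_int_mod(17, 5))    # 2
print(sql_int_mod(-123, 4))  # -3
print(-123 % 4)              # 1, Python's own operator gives a different answer
```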
<< shift left (arithmetic)

Shift the first numeric to the left N times, where N = the second numeric. For positive numerics, each 1-bit shift to the left is equivalent to multiplying by 2.

Example: 5 << 1, result = 10.

Note

Starting with Tarantool version 2.10.1, operands must be non-negative INTEGER or UNSIGNED.

>> shift right (arithmetic)

Shift the first numeric to the right N times, where N = the second numeric. For positive numerics, each 1-bit shift to the right is equivalent to dividing by 2.

Example: 5 >> 1, result = 2.

Note

Starting with Tarantool version 2.10.1, operands must be non-negative INTEGER or UNSIGNED.

& and (arithmetic)

Combine the two numerics, with 1 bits in the result if and only if both original numerics have 1 bits.

Example: 5 & 4, result = 4.

Note

Starting with Tarantool version 2.10.1, operands must be non-negative INTEGER or UNSIGNED.

| or (arithmetic)

Combine the two numerics, with 1 bits in the result if either original numeric has a 1 bit.

Example: 5 | 2, result = 7.

Note

Starting with Tarantool version 2.10.1, operands must be non-negative INTEGER or UNSIGNED.

~ negate (arithmetic), sometimes called bit inversion

Change 0 bits to 1 bits, change 1 bits to 0 bits.

Example: ~5, result = -6.

Note

Starting with Tarantool version 2.10.1, the operand must be non-negative INTEGER or UNSIGNED.

< less than (comparison)

Return TRUE if the first operand is less than the second by arithmetic or collation rules.

Example for numerics: 5 < 2, result = FALSE

Example for strings: 'C' < ' ', result = FALSE

<= less than or equal (comparison)

Return TRUE if the first operand is less than or equal to the second by arithmetic or collation rules.

Example for numerics: 5 <= 5, result = TRUE

Example for strings: 'C' <= 'B', result = FALSE

> greater than (comparison)

Return TRUE if the first operand is greater than the second by arithmetic or collation rules.

Example for numerics: 5 > -5, result = TRUE

Example for strings: 'C' > '!', result = TRUE

>= greater than or equal (comparison)

Return TRUE if the first operand is greater than or equal to the second by arithmetic or collation rules.

Example for numerics: 0 >= 0, result = TRUE

Example for strings: 'Z' >= 'Γ', result = FALSE

= equal (assignment or comparison)

After the word SET, “=” means the first operand gets the value from the second operand. In other contexts, “=” returns TRUE if operands are equal.

Example for assignment: ... SET column1 = 'a';

Example for numerics: 0 = 0, result = TRUE

Example for strings: '1' = '2 ', result = FALSE

== equal (assignment), or equal (comparison)

This is a non-standard equivalent of “= equal (assignment or comparison)”.

<> not equal (comparison)

Return TRUE if the first operand is not equal to the second by arithmetic or collation rules.

Example for strings: 'A' <> 'A ' is TRUE.

!= not equal (comparison)

This is a non-standard equivalent of “<> not equal (comparison)”.

[ , ] (indexed access operator)

Array example: ['a', 'b', 'c'] [2] (returns 'b')

Map example: {'a' : 123, 7: 'asd'}['a'] (returns 123)

See also: ARRAY index expression and MAP index expression.

IS NULL and IS NOT NULL (comparison)

For IS NULL: Return TRUE if the first operand is NULL, otherwise return FALSE. Example: column1 IS NULL, result = TRUE if column1 contains NULL.

For IS NOT NULL: Return FALSE if the first operand is NULL, otherwise return TRUE. Example: column1 IS NOT NULL, result = FALSE if column1 contains NULL.

LIKE (comparison)

Perform a comparison of two string operands. If the second operand contains '_', the '_' matches any single character in the first operand. If the second operand contains '%', the '%' matches 0 or more characters in the first operand. If it is necessary to search for either '_' or '%' within a string without treating it specially, an optional clause can be added, ESCAPE single-character-operand. For example, 'abc_' LIKE 'abcX_' ESCAPE 'X' is TRUE because 'X' means “following character is not special”. Matching is also affected by the string’s collation.
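The wildcard and ESCAPE rules can be illustrated by translating a LIKE pattern into a regular expression. The Python sketch below is a simplified model, not Tarantool's matcher: the function name like is hypothetical, collation is ignored, and an escape character at the very end of the pattern is treated as an ordinary character, which is an assumption.

```python
import re
from typing import Optional


def like(value: str, pattern: str, escape: Optional[str] = None) -> bool:
    """Evaluate value LIKE pattern [ESCAPE escape], ignoring collation."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape and i + 1 < len(pattern):
            # The character after the escape is matched literally.
            i += 1
            parts.append(re.escape(pattern[i]))
        elif ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


print(like("abc_", "abcX_", escape="X"))  # True
print(like("abcd", "abcX_", escape="X"))  # False
print(like("abcd", "abc_"))               # True
```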

BETWEEN (comparison)

x BETWEEN y AND z is shorthand for x >= y AND x <= z.

NOT negation (logic)

Return TRUE if operand is FALSE, return FALSE if operand is TRUE, else return UNKNOWN.

Example: NOT (1 > 1), result = TRUE.

IN is equal to one of a list of operands (comparison)

Return TRUE if first operand equals any of the operands in a parenthesized list.

Example: 1 IN (2,3,4,1,7), result = TRUE.

AND and (logic)

Return TRUE if both operands are TRUE. Return UNKNOWN if both operands are UNKNOWN. Return UNKNOWN if one operand is TRUE and the other operand is UNKNOWN. Return FALSE if one operand is FALSE and the other operand is (UNKNOWN or TRUE or FALSE).

OR or (logic)

Return TRUE if either operand is TRUE. Return FALSE if both operands are FALSE. Return UNKNOWN if one operand is UNKNOWN and the other operand is (UNKNOWN or FALSE).
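The truth tables for NOT, AND, and OR can be written compactly by modelling UNKNOWN as Python's None. This is a sketch of the rules as stated above; the names sql_not, sql_and, and sql_or are hypothetical.

```python
from typing import Optional

# UNKNOWN is modelled as None, following the statement that UNKNOWN is the same as NULL.
Bool3 = Optional[bool]


def sql_not(a: Bool3) -> Bool3:
    return None if a is None else (not a)


def sql_and(a: Bool3, b: Bool3) -> Bool3:
    if a is False or b is False:
        return False   # FALSE wins over anything, including UNKNOWN
    if a is None or b is None:
        return None
    return True


def sql_or(a: Bool3, b: Bool3) -> Bool3:
    if a is True or b is True:
        return True    # TRUE wins over anything, including UNKNOWN
    if a is None or b is None:
        return None
    return False


print(sql_and(True, None))   # None (UNKNOWN)
print(sql_and(False, None))  # False
print(sql_or(False, None))   # None (UNKNOWN)
print(sql_or(True, None))    # True
```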

|| concatenate (string manipulation)

Return the value of the first operand concatenated with the value of the second operand.

Example: 'A' || 'B', result = 'AB'.

The precedence of dyadic operators is:

||
* / %
+ -
<< >> & |
<  <= > >=
=  == != <> IS IS NOT IN LIKE
AND
OR

To ensure a desired precedence, use () parentheses.

Special situations

If one of the operands has data type DOUBLE, Tarantool uses floating-point arithmetic. This means that exact results are not guaranteed and rounding may occur without warning. For example, 4.7777777777777778 = 4.7777777777777777 is TRUE.

The floating-point values inf and -inf are possible. For example, SELECT 1e318, -1e318; will return “inf, -inf”. Arithmetic on infinite values may cause NULL results, for example SELECT 1e318 - 1e318; is NULL and SELECT 1e318 * 0; is NULL.

SQL operations never return the floating-point value -nan, although it may exist in data created by Tarantool’s NoSQL. In SQL, -nan is treated as NULL.
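The behaviour of infinite values can be reproduced with ordinary IEEE 754 doubles. In the Python sketch below, sql_double is a hypothetical helper that maps NaN to None to stand in for the rule that SQL shows such results as NULL; it does not call Tarantool.

```python
import math
from typing import Optional


def sql_double(value: float) -> Optional[float]:
    """Map a float result to what SQL would show: NaN becomes NULL (None)."""
    return None if math.isnan(value) else value


big = float("1e318")          # too large for a double, so it becomes inf
print(big, -big)              # inf -inf
print(sql_double(big - big))  # None
print(sql_double(big * 0))    # None
```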

In older Tarantool versions, a string would be converted to a numeric if it was used with an arithmetic operator and conversion was possible, for example '7' + '7' = 14. And for comparison, '7' = 7. This is called implicit casting. It was applicable for STRINGs and all numeric data types. Starting with Tarantool version 2.10, it is no longer supported.

Limitations: (Issue#2346)
* Some words, for example MATCH and REGEXP, are reserved but are not necessary for current or planned Tarantool versions
* 999999999999999 << 210 yields 0.

Expressions

An expression is a chunk of syntax that causes return of a value. Expressions may contain literals, column-names, operators, and parentheses.

Therefore these are examples of expressions: 1, 1 + 1 << 1, (1 = 2) OR 4 > 3, 'x' || 'y' || 'z'.

Also there are two expressions that involve keywords:

value IS [NOT] NULL: determine whether value is (not) NULL.
CASE ... WHEN ... THEN ... ELSE ... END: set a series of conditions.

ARRAY expressions

Usage: [ value ... ]

Examples: [1,2,3,4], [1,[2,3],4], ['a', "column_1", uuid()]

An expression has data type = ARRAY if it is a sequence of zero or more values enclosed in square brackets ([ and ]). Often the values in the sequence are called “elements”. The element data type may be anything, including ARRAY – that is, ARRAYs may be nested. Different elements may have different types. The Lua equivalent type is ‘array’.

MAP expressions

Usage: { key : value }

Literal examples: {'a':1}, { "column_1" : X'1234' }

Non-literal examples: {"a":"a"}, {UUID(): (SELECT 1) + 1}, {1:'a123', 'two':uuid()}

An expression has data type = MAP if it is enclosed in curly brackets (also called braces) { and } and contains a key for identification, then a colon :, then a value for what the key identifies. The key data type must be INTEGER or STRING or UUID. The value data type may be anything, including MAP – that is, MAPs may be nested. The Lua equivalent type is ‘map’ but the syntax is slightly different, for example the SQL value {'a': 1} is represented in Lua as {a = 1}.

ARRAY index expression

Usage: array-value [square bracket] index [square bracket]

Example: ['a', 'b', 'c'] [2] (this returns ‘b’)

As in other languages, an element of an array can be referenced with an integer inside square brackets. The returned value is of type ANY.

The SELECT query below retrieves all score values stored in the second position of the scores array field:

CREATE TABLE plays (user_id INTEGER PRIMARY KEY, scores ARRAY);
INSERT INTO plays VALUES (1, [23, 17, 55, 48]);
INSERT INTO plays VALUES (2, [12, 8, 20, 33]);
SELECT scores[2] FROM plays;
/* ---
  rows:
  - [17]
  - [8]
... */
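Both examples above return the second element for index 2, which implies that positions are counted from 1. The Python sketch below models that; array_index is a hypothetical helper, and because the text does not say what happens for an out-of-range index, the sketch raises an error instead of guessing.

```python
from typing import Any, List


def array_index(values: List[Any], index: int) -> Any:
    """Return the element at a 1-based position, as in scores[2]."""
    if not 1 <= index <= len(values):
        # Assumption: out-of-range behaviour is not covered by the text above.
        raise IndexError("index outside 1..len(values)")
    return values[index - 1]


plays = {1: [23, 17, 55, 48], 2: [12, 8, 20, 33]}
print(array_index(["a", "b", "c"], 2))                        # b
print([array_index(scores, 2) for scores in plays.values()])  # [17, 8]
```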

MAP index expression

Usage: map-value [square bracket] index [square bracket]

Example: {'a' : 123, 7: 'asd'}['a'] (this returns 123). The returned value is of type ANY.

The SELECT query below retrieves all values stored in the name attribute of the info map field:

CREATE TABLE bands (id INTEGER PRIMARY KEY, info MAP);
INSERT INTO bands VALUES (1, {'name': 'The Beatles', 'year': 1960});
INSERT INTO bands VALUES (2, {'name': 'The Doors', 'year': 1965});
SELECT info['name'] FROM bands;
/* ---
  rows:
  - ['The Beatles']
  - ['The Doors']
... */

Comparing and ordering

There are rules for determining whether value-1 is “less than”, “equal to”, or “greater than” value-2. These rules are applied for searches, for sorting results in order by column values, and for determining whether a column is unique. The result of a comparison of two values can be TRUE, FALSE, or UNKNOWN (the three BOOLEAN values). For any comparisons where neither operand is NULL, the operands are “distinct” if the comparison result is FALSE. For any set of operands where all operands are distinct from each other, the set is considered to be “unique”.

When comparing a numeric to a numeric:
* infinity = infinity is true
* regular numerics are compared according to usual arithmetic rules

When comparing any value to NULL:
(for examples in this paragraph assume that column1 in table T contains {NULL, NULL, 1, 2})
* value comparison-operator NULL is UNKNOWN (not TRUE and not FALSE), which affects “WHERE condition” because the condition must be TRUE, and does not affect “CHECK (condition)” because the condition must be either TRUE or UNKNOWN. Therefore SELECT * FROM T WHERE column1 > 0 OR column1 < 0 OR column1 = 0; returns only {1,2}, and the table can have been created with CREATE TABLE T (… column1 INTEGER, CHECK (column1 >= 0));
* for any operations that contain the keyword DISTINCT, NULLs are not distinct. Therefore SELECT DISTINCT column1 FROM T; will return {NULL,1,2}.
* for grouping, NULL values sort together. Therefore SELECT column1, COUNT(*) FROM T GROUP BY column1; will include a row {NULL, 2}.
* for ordering, NULL values sort together and are less than non-NULL values. Therefore SELECT column1 FROM T ORDER BY column1; returns {NULL, NULL, 1,2}.
* for evaluating a UNIQUE constraint or UNIQUE index, any number of NULLs is okay. Therefore CREATE UNIQUE INDEX i ON T (column1); will succeed.
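The NULL rules in the list above can be traced on the sample column {NULL, NULL, 1, 2}. The Python sketch below models each rule with None standing for NULL; the names are hypothetical and the code mimics the described outcomes rather than running any SQL.

```python
import operator
from collections import Counter

column1 = [None, None, 1, 2]   # None stands for NULL


def compare(a, b, op):
    """Comparison that yields None (UNKNOWN) when either side is NULL."""
    return None if a is None or b is None else op(a, b)


# WHERE keeps a row only when the condition is TRUE.
where_rows = [
    v for v in column1
    if compare(v, 0, operator.gt) is True
    or compare(v, 0, operator.lt) is True
    or compare(v, 0, operator.eq) is True
]

# CHECK rejects a row only when the condition is FALSE.
check_passes = all(compare(v, 0, operator.ge) is not False for v in column1)

# DISTINCT treats NULLs as not distinct from each other.
distinct_rows = list(dict.fromkeys(column1))

# GROUP BY puts all NULLs into one group.
group_counts = Counter(column1)

# ORDER BY sorts NULLs together, before non-NULL values.
ordered_rows = sorted(column1, key=lambda v: (v is not None, 0 if v is None else v))

# UNIQUE ignores NULLs: only non-NULL values must differ.
non_null = [v for v in column1 if v is not None]
unique_ok = len(set(non_null)) == len(non_null)

print(where_rows, check_passes, distinct_rows, group_counts[None], ordered_rows, unique_ok)
# [1, 2] True [None, 1, 2] 2 [None, None, 1, 2] True
```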

When comparing any value (except an ARRAY or MAP or ANY) to a SCALAR:
* This is always legal, and the result depends on the underlying type of the value. For example, if COLUMN1 is defined as SCALAR, and a value in the column is ‘a’, then COLUMN1 < 5 is a legal comparison and the result is FALSE because numeric is less than STRING.

When comparing a numeric to a STRING:
* Comparison is legal if the STRING value can be converted to a numeric with an explicit cast.

When comparing a BOOLEAN to a BOOLEAN:
* TRUE is greater than FALSE.

When comparing a VARBINARY to a VARBINARY:
* The numeric value of each pair of bytes is compared until the end of the byte sequences or until inequality. If two byte sequences are otherwise equal but one is longer, then the longer one is greater.

When comparing for the sake of eliminating duplicates:
* This is usually signalled by the word DISTINCT, so it applies to SELECT DISTINCT, to set operators such as UNION (where DISTINCT is implied), and to aggregate functions such as AVG(DISTINCT).
* Two operands are “not distinct” if they are equal to each other, or are both NULL.
* If two values are equal but not identical, for example 1.0 and 1.00, they are non-distinct and there is no way to specify which one will be eliminated.
* Values in primary-key or unique columns are distinct due to definition.

When comparing a STRING to a STRING:
* Ordinarily collation is “binary”, that is, comparison is done according to the numeric values of the bytes. This can be cancelled by adding a COLLATE clause at the end of either expression. So 'A' < 'a' and 'a' < 'Ä', but 'A' COLLATE "unicode_ci" = 'a' and 'a' COLLATE "unicode_ci" = 'Ä'.
* When comparing a column with a string literal, the column’s defined collation is used.
* Ordinarily trailing spaces matter. So 'a' = 'a ' is not TRUE. This can be cancelled by using the TRIM(TRAILING …) function.
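The string rules above can be tried out with plain byte comparison. In the Python sketch below, binary_less models the “binary” collation, while rough_ci_key is only a crude stand-in for "unicode_ci" (it strips accents and case); it is an assumption made for illustration and is not the collation algorithm Tarantool uses. Both names are hypothetical.

```python
import unicodedata


def binary_less(a: str, b: str) -> bool:
    """Compare by the numeric values of the UTF-8 bytes ("binary" collation)."""
    return a.encode("utf-8") < b.encode("utf-8")


def rough_ci_key(s: str) -> str:
    """Crude stand-in for a case- and accent-insensitive collation key."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


print(binary_less("A", "a"), binary_less("a", "Ä"))  # True True
print(rough_ci_key("A") == rough_ci_key("a"))        # True
print(rough_ci_key("a") == rough_ci_key("Ä"))        # True
print("a" == "a ", "a" == "a ".rstrip(" "))          # False True
```

The last line mirrors the trailing-space rule: the strings differ until the trailing space is trimmed.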

When comparing any value to an ARRAY or MAP or ANY:
* The result is an error.

Limitations:
* LIKE is not expected to work with VARBINARY.

Statements

A statement consists of SQL-language keywords and expressions that direct Tarantool to do something with a database. Statements begin with one of the words ALTER ANALYZE COMMIT CREATE DELETE DROP EXPLAIN INSERT PRAGMA RELEASE REPLACE ROLLBACK SAVEPOINT SELECT SET START TRUNCATE UPDATE VALUES WITH. Statements should end with ; semicolon although this is not mandatory.

A client sends a statement to the Tarantool server. The Tarantool server parses the statement and executes it. If there is an error, Tarantool returns an error message.

List of legal statements

In alphabetical order, the following statements are legal.

ALTER TABLE table-name [RENAME or ADD CONSTRAINT or DROP CONSTRAINT clauses];
ANALYZE [table-name]; – temporarily disabled in current version
COMMIT;
CREATE [UNIQUE] INDEX [IF NOT EXISTS] index-name
ON table-name (column-name [, column-name …]);
CREATE TABLE [IF NOT EXISTS] table-name
(column-or-constraint-definition
[, column-or-constraint-definition …])
[WITH ENGINE = engine-name];
CREATE TRIGGER [IF NOT EXISTS] trigger-name
BEFORE|AFTER INSERT|UPDATE|DELETE ON table-name
FOR EACH ROW
BEGIN dml-statement [, dml-statement …] END;
CREATE VIEW [IF NOT EXISTS] view-name
[(column-name [, column-name …])]
AS select-statement | values-statement;
DROP INDEX [IF EXISTS] index-name ON table-name;
DROP TABLE [IF EXISTS] table-name;
DROP TRIGGER [IF EXISTS] trigger-name;
DROP VIEW [IF EXISTS] view-name;
EXPLAIN explainable-statement;
INSERT INTO table-name
[(column-name [, column-name …])]
values-statement | select-statement;
PRAGMA pragma-name[(value)];
RELEASE SAVEPOINT savepoint-name;
REPLACE INTO table-name VALUES (expression [, expression …]);
ROLLBACK [TO [SAVEPOINT] savepoint-name];
SAVEPOINT savepoint-name;
SELECT [DISTINCT|ALL] expression [, expression …]
FROM [SEQSCAN] table-name | joined-table-names [AS alias]
[WHERE expression]
[GROUP BY expression [, expression …]]
[HAVING expression]
[ORDER BY expression]
[LIMIT expression [OFFSET expression]];
SET SESSION session-name = session-value;
START TRANSACTION;
TRUNCATE TABLE table-name;
UPDATE table-name
SET column-name=expression [,column-name=expression…]
[WHERE expression];
VALUES (expression [, expression …]);
WITH [RECURSIVE] common-table-expression;

Data Type Conversion

Data type conversion, also called casting, is necessary for any operation involving two operands X and Y, when X and Y have different data types.
Or, casting is necessary for assignment operations (when INSERT or UPDATE is putting a value of type X into a column defined as type Y).
Casting can be “explicit” when a user uses the CAST function, or “implicit” when Tarantool does a conversion automatically.

The general rules are fairly simple:
Assignments and operations involving NULL cause NULL or UNKNOWN results.
For arithmetic, convert to the data type which can contain both operands and the result.
For explicit casts, if a meaningful result is possible, the operation is allowed.
For implicit casts, if a meaningful result is possible and the data types on both sides are either STRINGs or most numeric types (that is, are STRING or INTEGER or UNSIGNED or DOUBLE or DECIMAL but not NUMBER), the operation is sometimes allowed.

The specific situations in this chart follow the general rules:

~                To BOOLEAN | To numeric | To STRING | To VARBINARY | To UUID
---------------  ----------   ----------   ---------   ------------   -------
From BOOLEAN   | AAA        | ---        | A--       | ---          | ---
From numeric   | ---        | SSA        | A--       | ---          | ---
From STRING    | S--        | S--        | AAA       | A--          | S--
From VARBINARY | ---        | ---        | A--       | AAA          | S--
From UUID      | ---        | ---        | A--       | A--          | AAA

Each entry in the chart has 3 characters, where A = Always allowed, S = Sometimes allowed, - = Never allowed.
The first character of an entry is for explicit casts,
the second character is for implicit casts for assignment,
the third character is for implicit cast for comparison.
So AAA = Always for explicit, Always for Implicit (assignment), Always for Implicit (comparison).

The S “Sometimes allowed” character applies for these special situations:
From STRING To BOOLEAN is allowed if UPPER(string-value) = 'TRUE' or 'FALSE'.
From numeric to INTEGER or UNSIGNED is allowed for cast and assignment only if the result is not out of range, and the numeric has no post-decimal digits.
From STRING to INTEGER or UNSIGNED or DECIMAL is allowed only if the string has a representation of a numeric, and the result is not out of range, and the numeric has no post-decimal digits.
From STRING to DOUBLE or NUMBER is allowed only if the string has a representation of a numeric.
From STRING to UUID is allowed only if the value is (8 hexadecimal digits) hyphen (4 hexadecimal digits) hyphen (4 hexadecimal digits) hyphen (4 hexadecimal digits) hyphen (12 hexadecimal digits), such as '8e3b281b-78ad-4410-bfe9-54806a586a90'.
From VARBINARY to UUID is allowed only if the value is 16 bytes long, as in X'8e3b281b78ad4410bfe954806a586a90'.
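Some of the “Sometimes allowed” conditions above are pure shape checks and can be sketched directly. The Python functions below are hypothetical helpers that test only the conditions stated in the list (the 8-4-4-4-12 text layout, the 16-byte length, and the TRUE/FALSE spelling); any further validation that Tarantool may perform is not modelled.

```python
import re

UUID_TEXT = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def string_castable_to_uuid(text: str) -> bool:
    """True if text has the 8-4-4-4-12 hexadecimal layout."""
    return UUID_TEXT.fullmatch(text) is not None


def varbinary_castable_to_uuid(data: bytes) -> bool:
    """True if the byte sequence is exactly 16 bytes long."""
    return len(data) == 16


def string_castable_to_boolean(text: str) -> bool:
    """True if UPPER(text) is 'TRUE' or 'FALSE'."""
    return text.upper() in ("TRUE", "FALSE")


print(string_castable_to_uuid("8e3b281b-78ad-4410-bfe9-54806a586a90"))                # True
print(varbinary_castable_to_uuid(bytes.fromhex("8e3b281b78ad4410bfe954806a586a90")))  # True
print(string_castable_to_boolean("true"), string_castable_to_boolean("yes"))          # True False
```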

The chart does not show To|From SCALAR because the conversions depend on the type of the value, not the type of the column definition. Explicit cast to SCALAR is always allowed.

The chart does not show To|From ARRAY or MAP or ANY because almost no conversions are possible. Explicit cast to ANY, or casting any value to its original data type, is legal, but that is all. This is a slight change: before Tarantool v. 2.10.0, it was legal to cast such values as VARBINARY. It is still possible to use arguments with these types in QUOTE functions, which is a way to convert them to STRINGs.

Note

Since version 2.4.1, the NUMBER type is processed in the same way as the number type in NoSQL Tarantool.

Starting with Tarantool 2.10.1, the following conversions, which used to be legal, are illegal:
Explicit cast from numeric to BOOLEAN.
Explicit cast from BOOLEAN to numeric.
Implicit cast from NUMBER to other numeric types for arithmetic or built-in functions.
Implicit cast from numeric to STRING.
Implicit cast from STRING to numeric.

Examples of casts, illustrating the situations in the chart:

CAST(TRUE AS STRING) is legal. The intersection of the “From BOOLEAN” row with the “To STRING” column is A-- and the first letter of A-- is for explicit cast and A means Always Allowed. The result is ‘TRUE’.

UPDATE ... SET varbinary_column = 'A' is illegal. The intersection of the “From STRING” row with the “To VARBINARY” column is A-- and the second letter of A-- is for implicit cast (assignment) and - means not allowed. The result is an error message.

1.7E-1 > 0 is legal. The intersection of the “From numeric” row with the “To numeric” column is SSA, and the third letter of SSA is for implicit cast (comparison) and A means Always Allowed. The result is TRUE.

11 > '2' is illegal. The intersection of the “From numeric” row with the “To STRING” column is A-- and the third letter of A-- is for implicit cast (comparison) and - means not allowed. The result is an error message. For a detailed explanation, see the following section.

CAST('5' AS INTEGER) is legal. The intersection of the “From STRING” row with the “To numeric” column is S-- and the first letter of S-- is for explicit cast and S means Sometimes Allowed. However, CAST('5.5' AS INTEGER) is illegal because 5.5 is not an integer: if the string contains post-decimal digits and the target is INTEGER or UNSIGNED, the cast fails.

Implicit string/numeric cast

The examples in this section are true only for Tarantool versions before Tarantool 2.10. Starting with Tarantool 2.10, implicit string/numeric cast is no longer allowed.

Special considerations may apply for casting STRINGs to/from INTEGERs/DOUBLEs/NUMBERs/UNSIGNEDs (numerics) for comparison or assignment.

1 = '1' /* compare a STRING with a numeric */
UPDATE ... SET string_column = 1 /* assign a numeric to a STRING */

For comparisons, the cast is always from STRING to numeric.
Therefore 1e2 = '100' is TRUE, and 11 > '2' is TRUE.
If the cast fails, then the numeric is less than the STRING.
Therefore 1e400 < '' is TRUE.
Exception: for BETWEEN the cast is to the data type of the first and last operands.
Therefore '66' BETWEEN 5 AND '7' is TRUE.

For assignments, due to a change in behavior starting with Tarantool 2.5.1, implicit casts from strings to numerics are not legal. Therefore INSERT INTO t (integer_column) VALUES ('5'); is an error.

Implicit cast does happen if STRINGs are used in arithmetic.
Therefore '5' / '5' = 1. If the cast fails, then the result is an error.
Therefore '5' / '' is an error.

Implicit cast does NOT happen if numerics are used in concatenation, or in LIKE.
Therefore 5 || '5' is illegal.

In the following examples, implicit cast does not happen for values in SCALAR columns:
DROP TABLE scalars;
CREATE TABLE scalars (scalar_column SCALAR PRIMARY KEY);
INSERT INTO scalars VALUES (11), ('2');
SELECT * FROM scalars WHERE scalar_column > 11; /* 0 rows. So 11 > '2'. */
SELECT * FROM scalars WHERE scalar_column < '2'; /* 1 row. So 11 < '2'. */
SELECT max(scalar_column) FROM scalars; /* 1 row: '2'. So 11 < '2'. */
SELECT sum(scalar_column) FROM scalars; /* 1 row: 13. So cast happened. */
These results are not affected by indexing, or by reversing the operands.

Implicit cast does NOT happen for GREATEST() or LEAST(). Therefore LEAST('5',6) is 6.

For function arguments:
If the function description says that a parameter has a specific data type, and implicit assignment casts are allowed, then arguments which are not passed with that data type will be converted before the function is applied.
For example, the LENGTH() function expects a STRING or VARBINARY, and INTEGER can be converted to STRING, therefore LENGTH(15) will return the length of '15', that is, 2.
But implicit cast sometimes does NOT happen for parameters. Therefore ABS('5') will cause an error message after Issue#4159 is fixed. However, TRIM(5) will still be legal.

Although it is not a requirement of the SQL standard, implicit cast is supposed to help compatibility with other DBMSs. However, other DBMSs have different rules about what can be converted (for example they may allow assignment of 'inf' but disallow comparison with '1e5'). And, of course, it is not possible to be compatible with other DBMSs and at the same time support SCALAR, which other DBMSs do not have.

SQL statements and clauses

The Statements and Clauses guide shows all Tarantool/SQL statements’ syntax and use.

Heading	Summary
Statements that change data definition	ALTER TABLE, CREATE TABLE, DROP TABLE, CREATE VIEW, DROP VIEW, CREATE INDEX, DROP INDEX, CREATE TRIGGER, DROP TRIGGER
Statements that change data	INSERT, UPDATE, DELETE, REPLACE, TRUNCATE, SET
Statements that retrieve data	SELECT, VALUES, PRAGMA, EXPLAIN
Statements for transactions	START TRANSACTION, COMMIT, SAVEPOINT, RELEASE SAVEPOINT, ROLLBACK
Functions	For example CAST(…), LENGTH(…), VERSION()

Statements that change data definition

ALTER TABLE

Syntax:

ALTER TABLE table-name RENAME TO new-table-name;
ALTER TABLE table-name ADD COLUMN column-name column-definition;
ALTER TABLE table-name ADD CONSTRAINT constraint-name constraint-definition;
ALTER TABLE table-name DROP CONSTRAINT constraint-name;
ALTER TABLE table-name ENABLE|DISABLE CHECK CONSTRAINT constraint-name;

ALTER TABLE is used to change a table’s name or a table’s elements.

Examples:

For renaming a table with ALTER ... RENAME, the old table must exist and the new table must not exist. Example:
-- renaming a table:
ALTER TABLE t1 RENAME TO t2;

For adding a column with ADD COLUMN, the table must exist, the table must be empty, the column name must be unique within the table. Example with a STRING column that must start with X:

ALTER TABLE t1 ADD COLUMN s4 STRING CHECK (s4 LIKE 'X%');

ALTER TABLE ... ADD COLUMN support was added in version 2.7.1.

For adding a table constraint with ADD CONSTRAINT, the table must exist, the table must be empty, the constraint name must be unique within the table. Example with a foreign-key constraint definition:
ALTER TABLE t1 ADD CONSTRAINT fk_s1_t1_1 FOREIGN KEY (s1) REFERENCES t1;

It is not possible to say CREATE TABLE table_a ... REFERENCES table_b ... if table_b does not exist yet. This is a situation where ALTER TABLE is handy: users can CREATE TABLE table_a without the foreign key, then CREATE TABLE table_b, then ALTER TABLE table_a ... REFERENCES table_b ....

-- adding a primary-key constraint definition:
-- This is unusual because primary keys are created automatically
-- and it is illegal to have two primary keys for the same table.
-- However, it is possible to drop a primary-key index, and this
-- is a way to restore the primary key if that happens.
ALTER TABLE t1 ADD CONSTRAINT "pk_unnamed_T1_1" PRIMARY KEY (s1);

-- adding a unique-constraint definition:
-- Alternatively, you can say CREATE UNIQUE INDEX unique_key ON t1 (s1);
ALTER TABLE t1 ADD CONSTRAINT "unique_unnamed_T1_2" UNIQUE (s1);

-- Adding a check-constraint definition:
ALTER TABLE t1 ADD CONSTRAINT "ck_unnamed_T1_1" CHECK (s1 > 0);

For ALTER ... DROP CONSTRAINT, it is only legal to drop a named constraint. (Tarantool generates the constraint names automatically if the user does not provide them.) Since version 2.4.1, it is possible to drop any of the named table constraints, namely, PRIMARY KEY, UNIQUE, FOREIGN KEY, and CHECK.

To remove a unique constraint, use either ALTER ... DROP CONSTRAINT or DROP INDEX, which will drop the constraint as well.

-- dropping a constraint:
ALTER TABLE t1 DROP CONSTRAINT "fk_unnamed_JJ2_1";

For ALTER ... ENABLE|DISABLE CHECK CONSTRAINT, it is only legal to enable or disable a named constraint, and Tarantool only looks for names of check constraints. By default a constraint is enabled. If a constraint is disabled, then the check will not be performed.

-- disabling and re-enabling a constraint:
ALTER TABLE t1 DISABLE CHECK CONSTRAINT c;
ALTER TABLE t1 ENABLE CHECK CONSTRAINT c;

Limitations:

It is not possible to drop a column.
It is not possible to modify NOT NULL constraints or column properties DEFAULT and data type. However, it is possible to modify them with Tarantool/NoSQL, for example by calling space_object:format() with a different is_nullable value.

CREATE TABLE

Syntax:

CREATE TABLE [IF NOT EXISTS] table-name (column-definition or table-constraint list) [WITH ENGINE = string];

Create a new base table, usually called a “table”.

Note

A table is a base table if it is created with CREATE TABLE and contains data in persistent storage.

A table is a viewed table, or just “view”, if it is created with CREATE VIEW and gets its data from other views or from base tables.

The table-name must be an identifier which is valid according to the rules for identifiers, and must not be the name of an already existing base table or view.

The column-definition or table-constraint list is a comma-separated list of column definitions or table constraint definitions. Column definitions and table constraint definitions are sometimes called table elements.

Rules:

A primary key is necessary; it can be specified with a table constraint PRIMARY KEY.
There must be at least one column.
When IF NOT EXISTS is specified, and there is already a table with the same name, the statement is ignored.
When WITH ENGINE = string is specified, where string must be either ‘memtx’ or ‘vinyl’, the table is created with that storage engine. When this clause is not specified, the table is created with the default engine, which is ordinarily ‘memtx’ but may be changed by updating the box.space._session_settings system table.

Actions:

Tarantool evaluates each column definition and table-constraint, and returns an error if any of the rules is violated.
Tarantool makes a new definition in the schema.
Tarantool makes new indexes for PRIMARY KEY or UNIQUE constraints. A unique index name is created automatically.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simplest form, with one column and one constraint:
CREATE TABLE t1 (s1 INTEGER, PRIMARY KEY (s1));

-- you can see the effect of the statement by querying
-- Tarantool system spaces:
SELECT * FROM "_space" WHERE "name" = 'T1';
SELECT * FROM "_index" JOIN "_space" ON "_index"."id" = "_space"."id"
         WHERE "_space"."name" = 'T1';

-- variation of the simplest form, with delimited identifiers
-- and a bracketed comment:
CREATE TABLE "T1" ("S1" INT /* synonym of INTEGER */, PRIMARY KEY ("S1"));

-- two columns, one named constraint
CREATE TABLE t1 (s1 INTEGER, s2 STRING, CONSTRAINT pk_s1s2_t1_1 PRIMARY KEY (s1, s2));

Limitations:

The maximum number of columns is 2000.
The maximum length of a row depends on the memtx_max_tuple_size or vinyl_max_tuple_size configuration option.

Column definition

Syntax:

column-name data-type [, column-constraint]

Define a column, which is a table element used in a CREATE TABLE statement.

The column-name must be an identifier which is valid according to the rules for identifiers.

Each column-name must be unique within a table.

Column definition – data type

Every column has a data type: ANY or ARRAY or BOOLEAN or DECIMAL or DOUBLE or INTEGER or MAP or NUMBER or SCALAR or STRING or UNSIGNED or UUID or VARBINARY. The detailed description of data types is in the section Operands.

Column definition – the rules for the SCALAR data type

The rules for the SCALAR data type were significantly changed in Tarantool v. 2.10.0.

SCALAR is a “complex” data type, unlike all the other data types which are “primitive”. Two column values in a SCALAR column can have two different primitive data types.

Any item defined as SCALAR has an underlying primitive type. For example, here:
```
CREATE TABLE t (s1 SCALAR PRIMARY KEY);
INSERT INTO t VALUES (55), ('41');
```
the underlying primitive type of the item in the first row is INTEGER because literal 55 has data type INTEGER, and the underlying primitive type in the second row is STRING (the data type of a literal is always clear from its format).

An item’s primitive type is far less important than its defined type. Incidentally Tarantool might find the primitive type by looking at the way MsgPack stores it, but that is an implementation detail.
A SCALAR definition may not include a maximum length, as there is no suggested restriction.
A SCALAR definition may include a COLLATE clause, which affects any items whose primitive data type is STRING. The default collation is “binary”.
Some assignments are illegal when data types differ, but legal when the target is a SCALAR item. For example UPDATE ... SET column1 = 'a' is illegal if column1 is defined as INTEGER, but is legal if column1 is defined as SCALAR – values which happen to be INTEGER will be changed so their data type is SCALAR.
There is no literal syntax which implies data type SCALAR.
TYPEOF(x) is always ‘scalar’ or ‘NULL’; it is never the underlying data type. In fact there is no function that is guaranteed to return the underlying data type. For example, TYPEOF(CAST(1 AS SCALAR)); returns ‘scalar’, not ‘integer’.
For any operation that requires implicit casting from an item defined as SCALAR, the operation will fail at runtime. For example, if a definition is:
```
CREATE TABLE t (s1 SCALAR PRIMARY KEY, s2 INTEGER);
```
and the only row in table T has s1 = 1, that is, its underlying primitive type is INTEGER, then UPDATE t SET s2 = s1; is illegal.
For any dyadic operation that requires implicit casting for comparison, the syntax is legal and the operation will not fail at runtime. Take this situation: comparison with a primitive type VARBINARY and a primitive type STRING.
```
CREATE TABLE t (s1 SCALAR PRIMARY KEY);
INSERT INTO t VALUES (X'41');
SELECT * FROM t WHERE s1 > 'a';
```
The comparison is valid, because Tarantool knows the ordering of X'41' and 'a' in Tarantool/NoSQL ‘scalar’. This is a case where the primitive type matters.
The result data type of min/max operation on a column defined as SCALAR is SCALAR. Users will need to know the underlying primitive type of the result in advance. For example:
```
CREATE TABLE t (s1 INTEGER, s2 SCALAR PRIMARY KEY);
INSERT INTO t VALUES (1, X'44'), (2, 11), (3, 1E4), (4, 'a');
SELECT cast(min(s2) AS INTEGER), hex(cast(max(s2) as VARBINARY)) FROM t;
```
The result is: - - [11, '44',]

That is only possible with Tarantool/NoSQL scalar rules, but SELECT SUM(s2) would not be legal because addition would in this case require implicit casting from VARBINARY to a numeric, which is not sensible.
The result data type of a primitive combination is sometimes SCALAR although Tarantool in effect uses the primitive data type not the defined data type. (Here the word “combination” is used in the way that the standard document uses it for section “Result of data type combinations”.) Therefore for greatest(1E308, 'a', 0, X'00') the result is X'00' but typeof(greatest(1E308, 'a', 0, X'00')) is ‘scalar’.
The union of two SCALARs is sometimes the primitive type. For example, SELECT TYPEOF((SELECT CAST('a' AS SCALAR) UNION SELECT CAST('a' AS SCALAR))); returns ‘string’.
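
The min/max example above depends on how values of different primitive types are ordered within a SCALAR column. The following Python sketch is a rough model, not Tarantool code. It assumes only what that example implies: numbers sort before strings, and strings sort before VARBINARY values. Other primitive types are deliberately left out, and the names TYPE_RANK and scalar_key are hypothetical.

```python
# Sketch: order mixed values the way the min/max example above behaves.
# Assumption: numbers < strings < byte strings (VARBINARY).
TYPE_RANK = {int: 0, float: 0, str: 1, bytes: 2}


def scalar_key(value):
    """Sort key: compare by type class first, then by value within the class."""
    return (TYPE_RANK[type(value)], value)


values = [b"\x44", 11, 1e4, "a"]  # X'44', 11, 1E4, 'a'
smallest = min(values, key=scalar_key)
largest = max(values, key=scalar_key)
print(smallest, largest.hex())
```

Under these assumptions the smallest value is 11 and the largest is X'44', which agrees with the result shown for the SQL query.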

Column definition – relation to NoSQL

All of the SQL data types except SCALAR correspond to Tarantool/NoSQL types with the same name. For example an SQL STRING is stored in a NoSQL space as type = ‘string’.

Therefore specifying an SQL data type X determines that the storage will be in a space with a format column saying that the NoSQL type is ‘x’.

The rules for that NoSQL type are applicable to the SQL data type.

If two items have SQL data types that have the same underlying type, then they are compatible for all assignment or comparison purposes.

If two items have SQL data types that have different underlying types, then the rules for explicit casts, or implicit (assignment) casts, or implicit (comparison) casts, apply.

There is one floating-point value which is not handled by SQL: -nan is seen as NULL although its data type is ‘double’.

Before Tarantool v. 2.10.0, there were also some Tarantool/NoSQL data types which had no corresponding SQL data types. For example, SELECT "flags" FROM "_vspace"; would return a column whose SQL data type is VARBINARY rather than MAP. Such columns can only be manipulated in SQL by invoking Lua functions.

Column definition – column-constraint or default clause

The column-constraint or default clause may be as follows:

Type	Comment
NOT NULL	means “it is illegal to assign a NULL to this column”
PRIMARY KEY	explained in the Table constraint definition section
UNIQUE	explained in the Table constraint definition section
CHECK (expression)	explained in the Table constraint definition section
foreign-key-clause	explained in the Table constraint definition for foreign keys section
DEFAULT expression	means “if INSERT does not assign to this column then assign expression result to this column” – if there is no DEFAULT clause then DEFAULT NULL is assumed

If column-constraint is PRIMARY KEY, this is a shorthand for a separate table-constraint definition: “PRIMARY KEY (column-name)”.

If column-constraint is UNIQUE, this is a shorthand for a separate table-constraint definition: “UNIQUE (column-name)”.

If column-constraint is CHECK, this is a shorthand for a separate table-constraint definition: “CHECK (expression)”.

Columns defined with PRIMARY KEY are automatically NOT NULL.

To enforce some restrictions that Tarantool does not enforce automatically, add CHECK clauses, like these:

CREATE TABLE t ("smallint" INTEGER PRIMARY KEY CHECK ("smallint" <= 32767 AND "smallint" >= -32768));
CREATE TABLE t ("shorttext" STRING PRIMARY KEY CHECK (length("shorttext") <= 10));

but this may cause inserts or updates to be slow.

Column definition – examples

These are shown within CREATE TABLE statements. Data types may also appear in CAST functions.

-- the simple form with column-name and data-type
CREATE TABLE t (column1 INTEGER ...);
-- with column-name and data-type and column-constraint
CREATE TABLE t (column1 STRING PRIMARY KEY ...);
-- with column-name and data-type and collate-clause
CREATE TABLE t (column1 SCALAR COLLATE "unicode" ...);

-- with all possible data types and aliases
CREATE TABLE t
(column1 BOOLEAN, column2 BOOL,
 column3 INT PRIMARY KEY, column4 INTEGER,
 column5 DOUBLE,
 column6 NUMBER,
 column7 STRING, column8 STRING COLLATE "unicode",
 column9 TEXT, columna TEXT COLLATE "unicode_sv_s1",
 columnb VARCHAR(0), columnc VARCHAR(100000) COLLATE "binary",
 columnd UUID,
 columne VARBINARY,
 columnf SCALAR, columng SCALAR COLLATE "unicode_uk_s2",
 columnh DECIMAL,
 columni ARRAY,
 columnj MAP,
 columnk ANY);

-- with all possible column constraints and a default clause
CREATE TABLE t
(column1 INTEGER NOT NULL,
 column2 INTEGER PRIMARY KEY,
 column3 INTEGER UNIQUE,
 column4 INTEGER CHECK (column3 > column2),
 column5 INTEGER REFERENCES t,
 column6 INTEGER DEFAULT NULL);

Table constraint definition

A table constraint restricts the data you can add to the table. If you try to insert invalid data into a column, Tarantool throws an error.

A table constraint has the following syntax:

[CONSTRAINT [name]] constraint_expression

constraint_expression:
  | PRIMARY KEY (column_name, ...)
  | UNIQUE (column_name, ...)
  | CHECK (expression)
  | FOREIGN KEY (column_name, ...) foreign_key_clause

Define a constraint, which is a table element used in a CREATE TABLE statement.

A constraint name must be an identifier that is valid according to the rules for identifiers. A constraint name must be unique within the table for a specific constraint type. For example, the CHECK and FOREIGN KEY constraints can have the same name.

PRIMARY KEY constraints

PRIMARY KEY constraints look like this:

PRIMARY KEY (column_name, ...)

There is a shorthand: specifying PRIMARY KEY in a column definition.

Every table must have one and only one primary key.
Primary-key columns are automatically NOT NULL.
Primary-key columns are automatically indexed.
Primary-key columns are unique. That means it is illegal to have two rows with the same values for the columns specified in the constraint.

Example 1: one-column primary key

Create an author table with the id primary key column:

CREATE TABLE author (
    id INTEGER PRIMARY KEY,
    name STRING NOT NULL
);

Insert data into this table:

INSERT INTO author VALUES (1, 'Leo Tolstoy'),
                          (2, 'Fyodor Dostoevsky');

On an attempt to add an author with an existing id, the following error is raised:

INSERT INTO author VALUES (2, 'Alexander Pushkin');
/*
- Duplicate key exists in unique index "pk_unnamed_author_1" in space "author" with
  old tuple - [2, "Fyodor Dostoevsky"] and new tuple - [2, "Alexander Pushkin"]
*/

Example 2: two-column primary key

Create a book table with the primary key defined on two columns:

CREATE TABLE book (
    id INTEGER,
    title STRING NOT NULL,
    PRIMARY KEY (id, title)
);

Insert data into this table:

INSERT INTO book VALUES (1, 'War and Peace'),
                        (2, 'Crime and Punishment');

On an attempt to add an existing book, the following error is raised:

INSERT INTO book VALUES (2, 'Crime and Punishment');
/*
- Duplicate key exists in unique index "pk_unnamed_book_1" in space "BOOK" with old
  tuple - [2, "Crime and Punishment"] and new tuple - [2, "Crime and Punishment"]
*/
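
With a multi-column primary key, uniqueness applies to the combination of key columns, not to each column separately. The following Python sketch is a toy model of that rule, not Tarantool code. The class PrimaryKeyTable and its methods are hypothetical, and a dictionary stands in for the automatically created index.

```python
class PrimaryKeyTable:
    """Toy table that enforces NOT NULL and uniqueness on its key columns."""

    def __init__(self, columns, key_columns):
        self.columns = list(columns)
        self.key_positions = [self.columns.index(c) for c in key_columns]
        self.rows = {}  # key tuple -> row; stands in for the automatic index

    def insert(self, row):
        key = tuple(row[i] for i in self.key_positions)
        if any(part is None for part in key):
            raise ValueError("primary-key columns are NOT NULL")
        if key in self.rows:
            raise ValueError(f"duplicate key {key}")
        self.rows[key] = tuple(row)


book = PrimaryKeyTable(["id", "title"], ["id", "title"])
book.insert((1, "War and Peace"))
book.insert((2, "Crime and Punishment"))
book.insert((2, "War and Peace"))  # accepted: this (id, title) pair is new
try:
    book.insert((2, "Crime and Punishment"))  # rejected: the pair already exists
except ValueError as error:
    print(error)
```

In this model a repeated id is accepted as long as the title differs, and only a repeated (id, title) pair is rejected.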

PRIMARY KEY with the AUTOINCREMENT modifier may be specified in one of two ways:

In a column definition after the words PRIMARY KEY, as in CREATE TABLE t (c INTEGER PRIMARY KEY AUTOINCREMENT);
In a PRIMARY KEY (column-list) after a column name, as in CREATE TABLE t (c INTEGER, PRIMARY KEY (c AUTOINCREMENT));

When AUTOINCREMENT is specified, the column must be a primary-key column and it must be INTEGER or UNSIGNED.

Only one column in the table may be autoincrement. However, it is legal to say PRIMARY KEY (a, b, c AUTOINCREMENT) – in that case, there are three columns in the primary key but only the third column (c) is AUTOINCREMENT.

As the name suggests, values in an autoincrement column are automatically incremented. That is: if a user inserts NULL in the column, then the stored value will be the smallest non-negative integer that has not already been used. This occurs because autoincrement columns are associated with sequences.

UNIQUE constraints

UNIQUE constraints look like this:

UNIQUE (column_name, ...)

There is a shorthand: specifying UNIQUE in a column definition.

Unique constraints are similar to primary-key constraints, except that:

A table may have any number of unique keys, and unique keys are not automatically NOT NULL.
Unique columns are automatically indexed.
Unique columns are unique. That means it is illegal to have two rows with the same values in the unique-key columns.

Example 1: one-column unique constraint

Create an author table with the unique name column:

CREATE TABLE author (
    id INTEGER PRIMARY KEY,
    name STRING UNIQUE
);

Insert data into this table:

INSERT INTO author VALUES (1, 'Leo Tolstoy'),
                          (2, 'Fyodor Dostoevsky');

On an attempt to add an author with the same name, the following error is raised:

INSERT INTO author VALUES (3, 'Leo Tolstoy');
/*
- Duplicate key exists in unique index "unique_unnamed_author_2" in space "author"
  with old tuple - [1, "Leo Tolstoy"] and new tuple - [3, "Leo Tolstoy"]
*/

Example 2: two-column unique constraint

Create a book table with the unique constraint defined on two columns:

CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title STRING NOT NULL,
    author_id INTEGER UNIQUE,
    UNIQUE (title, author_id)
);

Insert data into this table:

INSERT INTO book VALUES (1, 'War and Peace', 1),
                        (2, 'Crime and Punishment', 2);

On an attempt to add a book with duplicated values, the following error is raised:

INSERT INTO book VALUES (3, 'War and Peace', 1);
/*
- Duplicate key exists in unique index "unique_unnamed_book_2" in space "book" with
  old tuple - [1, "War and Peace", 1] and new tuple - [3, "War and Peace", 1]
*/

CHECK constraints

The CHECK constraint is used to limit the value range that a column can store. CHECK constraints look like this:

CHECK (expression)

There is a shorthand: specifying CHECK in a column definition.

The expression may be anything that returns a BOOLEAN result = TRUE or FALSE or UNKNOWN.
The expression may not contain a subquery.
If the expression contains a column name, the column must exist in the table.
If a CHECK constraint is specified, the table must not contain rows where the expression is FALSE. (The table may contain rows where the expression is either TRUE or UNKNOWN.)
Constraint checking may be stopped with ALTER TABLE … DISABLE CHECK CONSTRAINT and restarted with ALTER TABLE … ENABLE CHECK CONSTRAINT.
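
The rule that only FALSE rejects a row, while TRUE and UNKNOWN both pass, can be modelled in a few lines. The following Python sketch is illustrative, not Tarantool code. None stands in for SQL UNKNOWN, the check is a length test of the form CHAR_LENGTH(name) > 4, and both function names are hypothetical.

```python
def check_name_length(name):
    """Model of CHECK (CHAR_LENGTH(name) > 4): True, False, or None (UNKNOWN)."""
    if name is None:
        return None  # an expression involving NULL evaluates to UNKNOWN
    return len(name) > 4


def row_accepted(check_result):
    """Only an explicit FALSE rejects the row; TRUE and UNKNOWN both pass."""
    return check_result is not False


for name in ("Leo Tolstoy", "Alex", None):
    print(repr(name), row_accepted(check_name_length(name)))
```

In this model a NULL name passes the check, because the expression evaluates to UNKNOWN rather than FALSE. Add NOT NULL to the column definition if NULL values must also be rejected.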

Example

Create an author table with the name column that should contain values longer than 4 characters:

CREATE TABLE author (
    id INTEGER PRIMARY KEY,
    name STRING,
    CONSTRAINT check_name_length CHECK (CHAR_LENGTH(name) > 4)
);

Insert data into this table:

INSERT INTO author VALUES (1, 'Leo Tolstoy'),
                          (2, 'Fyodor Dostoevsky');

On an attempt to add an author with a name shorter than 5 characters, the following error is raised:

INSERT INTO author VALUES (3, 'Alex');
/*
- Check constraint 'check_name_length' failed for a tuple
*/

Table constraint definition for foreign keys

A foreign key is a constraint that can be used to enforce data integrity across related tables. A foreign key constraint is defined on the child table that references the parent table’s column values.

Foreign key constraints look like this:

FOREIGN KEY (referencing_column_name, ...)
    REFERENCES referenced_table_name (referenced_column_name, ...)

You can also add a reference in a column definition:

referencing_column_name column_definition
    REFERENCES referenced_table_name(referenced_column_name)

Note

Since 2.11.0, the following referencing options aren’t supported anymore:

The ON UPDATE and ON DELETE triggers. The RESTRICT trigger action is used implicitly.
The MATCH subclause. MATCH FULL is used implicitly.
DEFERRABLE constraints. The INITIALLY IMMEDIATE constraint check time rule is used implicitly.

Note that a referenced column should meet one of the following requirements:

A referenced column is a PRIMARY KEY column.
A referenced column has a UNIQUE constraint.
A referenced column has a UNIQUE index.

Note that before version 2.11.0, the existence of an index for the referenced columns was checked when creating a constraint (for example, using CREATE TABLE or ALTER TABLE). Starting with 2.11.0, this check is weakened and the existence of an index is checked during data insertion.

Example

This example shows how to create a relation between the parent and child tables through a single-column foreign key:

First, create a parent author table:

CREATE TABLE author (
    id INTEGER PRIMARY KEY,
    name STRING NOT NULL
);

Insert data into this table:

INSERT INTO author VALUES (1, 'Leo Tolstoy'),
                          (2, 'Fyodor Dostoevsky');

Create a child book table whose author_id column references the id column from the author table:

CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title STRING NOT NULL,
    author_id INTEGER NOT NULL UNIQUE,
    FOREIGN KEY (author_id)
        REFERENCES author (id)
);

Alternatively, you can add a reference in a column definition:

CREATE TABLE book (
    id INTEGER PRIMARY KEY,
    title STRING NOT NULL,
    author_id INTEGER NOT NULL UNIQUE REFERENCES author(id)
);

Insert data into the book table:

INSERT INTO book VALUES (1, 'War and Peace', 1),
                        (2, 'Crime and Punishment', 2);

Check how the created foreign key constraint enforces data integrity. The following error is raised on an attempt to insert a new book with an author_id value that doesn’t exist in the parent author table:
```
INSERT INTO book VALUES (3, 'Eugene Onegin', 3);
/*
- 'Foreign key constraint ''fk_unnamed_book_1'' failed: foreign tuple was not found'
*/
```
On an attempt to delete an author that already has books in the book table, the following error is raised:
```
DELETE FROM author WHERE id = 2;
/*
- 'Foreign key ''fk_unnamed_book_1'' integrity check failed: tuple is referenced'
*/
```
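
The two failures above follow from the implicit RESTRICT behavior: a child row needs an existing parent, and a parent that is still referenced cannot be deleted. The following Python sketch is a toy model of those two checks, not Tarantool code. The dictionaries, the function names, and the ForeignKeyError class are hypothetical.

```python
class ForeignKeyError(Exception):
    """Raised when an operation would break referential integrity."""


author = {1: "Leo Tolstoy", 2: "Fyodor Dostoevsky"}  # id -> name
book = {}  # id -> (title, author_id)


def insert_book(book_id, title, author_id):
    # The child row must reference an existing parent row.
    if author_id not in author:
        raise ForeignKeyError("foreign tuple was not found")
    book[book_id] = (title, author_id)


def delete_author(author_id):
    # RESTRICT: a parent row that is still referenced cannot be deleted.
    if any(ref == author_id for _, ref in book.values()):
        raise ForeignKeyError("tuple is referenced")
    author.pop(author_id, None)


insert_book(1, "War and Peace", 1)
insert_book(2, "Crime and Punishment", 2)

for action in (lambda: insert_book(3, "Eugene Onegin", 3),
               lambda: delete_author(2)):
    try:
        action()
    except ForeignKeyError as error:
        print("rejected:", error)
```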

DROP TABLE

Syntax:

DROP TABLE [IF EXISTS] table-name;

Drop a table.

The table-name must identify a table that was created earlier with the CREATE TABLE statement.

Rules:

If there is a view that references the table, the drop will fail. Please drop the referencing view with DROP VIEW first.
If there is a foreign key that references the table, the drop will fail. Please drop the referencing constraint with ALTER TABLE … DROP CONSTRAINT first.

Actions:

Tarantool returns an error if the table does not exist and there is no IF EXISTS clause.
The table and all its data are dropped.
All indexes for the table are dropped.
All triggers for the table are dropped.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP TABLE t31;
-- with an IF EXISTS clause:
DROP TABLE IF EXISTS t31;

CREATE VIEW

Syntax:

CREATE VIEW [IF NOT EXISTS] view-name [(column-list)] AS subquery;

Create a new viewed table, usually called a “view”.

The view-name must be valid according to the rules for identifiers.

The optional column-list must be a comma-separated list of names of columns in the view.

The syntax of the subquery must be the same as the syntax of a SELECT statement, or of a VALUES clause.

Rules:

There must not already be a base table or view with the same name as view-name.
If column-list is specified, the number of columns in column-list must be the same as the number of columns in the select list of the subquery.

Actions:

Tarantool will throw an error if a rule is violated.
Tarantool will create a new persistent object with column-names equal to the names in the column-list or the names in the subquery’s select list.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
CREATE VIEW v AS SELECT column1, column2 FROM t;
-- with a column-list:
CREATE VIEW v (a,b) AS SELECT column1, column2 FROM t;

Limitations:

It is not possible to insert or update or delete from a view, although sometimes a possible substitution is to create an INSTEAD OF trigger.

DROP VIEW

Syntax:

DROP VIEW [IF EXISTS] view-name;

Drop a view.

The view-name must identify a view that was created earlier with the CREATE VIEW statement.

Rules: none

Actions:

Tarantool returns an error if the view does not exist and there is no IF EXISTS clause.
The view is dropped.
All triggers for the view are dropped.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP VIEW v31;
-- with an IF EXISTS clause:
DROP VIEW IF EXISTS v31;

CREATE INDEX

Syntax:

CREATE [UNIQUE] INDEX [IF NOT EXISTS] index-name ON table-name (column-list);

Create an index.

The index-name must be valid according to the rules for identifiers.

The table-name must refer to an existing table.

The column-list must be a comma-separated list of names of columns in the table.

Rules:

There must not already be, for the same table, an index with the same name as index-name. But there may already be, for a different table, an index with the same name as index-name.
The maximum number of indexes per table is 128.

Actions:

Tarantool will throw an error if a rule is violated.
If the new index is UNIQUE, Tarantool will throw an error if any row exists with columns that have duplicate values.
Tarantool will create a new index.
Usually Tarantool effectively executes a COMMIT statement.
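
The UNIQUE check in the actions above can be sketched as follows. This is only an illustration: the table t_dup, its columns, and the index name idx_code_t_dup are hypothetical and do not appear elsewhere in this reference.

```sql
-- Hypothetical table with two rows that share the same value in "code".
CREATE TABLE t_dup (id INTEGER PRIMARY KEY, code STRING);
INSERT INTO t_dup VALUES (1, 'x'), (2, 'x');
-- Expected to fail: rows 1 and 2 hold duplicate values in the indexed column.
CREATE UNIQUE INDEX idx_code_t_dup ON t_dup (code);
-- A non-unique index on the same column is not subject to that check.
CREATE INDEX idx_code_t_dup ON t_dup (code);
```

Because the first CREATE INDEX statement is expected to fail, the index name should still be free for the second statement.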

Automatic indexes:

Indexes may be created automatically for columns mentioned in the PRIMARY KEY or UNIQUE clauses of a CREATE TABLE statement. If an index was created automatically, then the index-name has four parts:

pk if this is for a PRIMARY KEY clause, unique if this is for a UNIQUE clause;
_unnamed_;
the name of the table;
_ and an ordinal number; the first index is 1, the second index is 2, and so on.

For example, after CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER, UNIQUE (s2)); there are two indexes named pk_unnamed_T_1 and unique_unnamed_T_2. You can confirm this by saying SELECT * FROM "_index"; which will list all indexes on all tables. There is no need to say CREATE INDEX for columns that already have automatic indexes.

Examples:

-- the simple case
CREATE INDEX idx_column1_t_1 ON t (column1);
-- with IF NOT EXISTS clause
CREATE INDEX IF NOT EXISTS idx_column1_t_1 ON t (column1);
-- with UNIQUE specifier and more than one column
CREATE UNIQUE INDEX idx_unnamed_t_1 ON t (column1, column2);

Dropping an automatic index created for a unique constraint will drop the unique constraint as well.

DROP INDEX

Syntax:

DROP INDEX [IF EXISTS] index-name ON table-name;

The index-name must be the name of an existing index, which was created with CREATE INDEX. Or, the index-name must be the name of an index that was created automatically due to a PRIMARY KEY or UNIQUE clause in the CREATE TABLE statement. To see what a table’s indexes are, use PRAGMA index_list(table-name);.

Rules: none

Actions:

Tarantool throws an error if the index does not exist, or is an automatically created index.
Tarantool will drop the index.
Usually Tarantool effectively executes a COMMIT statement.

Example:

-- the simplest form:
DROP INDEX idx_unnamed_t_1 ON t;

CREATE TRIGGER

Syntax:

The trigger-name must be valid according to the rules for identifiers.

If the trigger action time is BEFORE or AFTER, then the table-name must refer to an existing base table.

If the trigger action time is INSTEAD OF, then the table-name must refer to an existing view.

Rules:

There must not already be a trigger with the same name as trigger-name.
Triggers on different tables or views share the same namespace.
The statements between BEGIN and END should not refer to the table-name mentioned in the ON clause.
The statements between BEGIN and END should not contain an INDEXED BY clause.

SQL triggers are not activated by Tarantool/NoSQL requests. This will change in a future version.

On a replica, effects of trigger execution are applied, and the SQL triggers themselves are not activated upon replication events.

NoSQL triggers are activated both on replica and master, thus if you have a NoSQL trigger on a replica, it is activated when applying effects of an SQL trigger.

Actions:

Tarantool will throw an error if a rule is violated.
Tarantool will create a new trigger.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; END;
-- with IF NOT EXISTS clause:
CREATE TRIGGER IF NOT EXISTS stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; END;
-- with FOR EACH ROW and WHEN clauses:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW WHEN a=5
  BEGIN DELETE FROM warehouses; END;
-- with multiple statements between BEGIN and END:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; INSERT INTO inventories VALUES (1); END;

Trigger extra clauses

UPDATE OF column-list

After BEFORE|AFTER UPDATE it is optional to add OF column-list. If any of the columns in column-list is affected at the time the row is processed, then the trigger will be activated for that row. For example:

CREATE TRIGGER table1_before_update
 BEFORE UPDATE  OF column1, column2 ON table1
 FOR EACH ROW
 BEGIN UPDATE table2 SET column1 = column1 + 1; END;
UPDATE table1 SET column3 = column3 + 1; -- Trigger will not be activated
UPDATE table1 SET column2 = column2 + 0; -- Trigger will be activated

WHEN

After table-name FOR EACH ROW it is optional to add [WHEN expression]. The trigger is activated for a row only if the expression is true at the time the row is processed. For example:
```
CREATE TRIGGER table1_before_update BEFORE UPDATE ON table1 FOR EACH ROW
 WHEN (SELECT COUNT(*) FROM table1) > 1
 BEGIN UPDATE table2 SET column1 = column1 + 1; END;
```
This trigger will not be activated unless there is more than one row in table1.

OLD and NEW

The keywords OLD and NEW have special meaning in the context of trigger action:
- OLD.column-name refers to the value of column-name before the change.
- NEW.column-name refers to the value of column-name after the change.
For example:
```
CREATE TABLE table1 (column1 STRING, column2 INTEGER PRIMARY KEY);
CREATE TABLE table2 (column1 STRING, column2 STRING, column3 INTEGER PRIMARY KEY);
INSERT INTO table1 VALUES ('old value', 1);
INSERT INTO table2 VALUES ('', '', 1);
CREATE TRIGGER table1_before_update BEFORE UPDATE ON table1 FOR EACH ROW
 BEGIN UPDATE table2 SET column1 = old.column1, column2 = new.column1; END;
UPDATE table1 SET column1 = 'new value';
SELECT * FROM table2;
```
At the beginning of the UPDATE for the single row of table1, the value in column1 is ‘old value’ – so that is what is seen as old.column1.

At the end of the UPDATE for the single row of table1, the value in column1 is ‘new value’ – so that is what is seen as new.column1. (OLD and NEW are qualifiers for table1, not table2.)

Therefore, SELECT * FROM table2; returns ['old value', 'new value'].

OLD.column-name does not exist for an INSERT trigger.

NEW.column-name does not exist for a DELETE trigger.

OLD and NEW are read-only; you cannot change their values.

Deprecated or illegal statements:

It is illegal for the trigger action to include a qualified column reference other than OLD.column-name or NEW.column-name. For example, CREATE TRIGGER ... BEGIN UPDATE table1 SET table1.column1 = 5; END; is illegal.

It is illegal for the trigger action to include statements that include a WITH clause, a DEFAULT VALUES clause, or an INDEXED BY clause.

It is usually not a good idea to have a trigger on table1 which causes a change on table2, and at the same time have a trigger on table2 which causes a change on table1. For example:
```
CREATE TRIGGER table1_before_update
 BEFORE UPDATE ON table1
 FOR EACH ROW
 BEGIN UPDATE table2 SET column1 = column1 + 1; END;
CREATE TRIGGER table2_before_update
 BEFORE UPDATE ON table2
 FOR EACH ROW
 BEGIN UPDATE table1 SET column1 = column1 + 1; END;
```
Luckily UPDATE table1 ... will not cause an infinite loop, because Tarantool recognizes when it has already updated so it will stop. However, not every DBMS acts this way.

Trigger activation

These are remarks concerning trigger activation.

Standard terminology:

“trigger action time” = BEFORE or AFTER or INSTEAD OF
“trigger event” = INSERT or DELETE or UPDATE
“triggered statement” = BEGIN … DELETE|INSERT|REPLACE|SELECT|UPDATE … END
“triggered when clause” = WHEN search-condition
“activate” = execute a triggered statement
some vendors use the word “fire” instead of “activate”

If there is more than one trigger for the same trigger event, Tarantool may execute the triggers in any order.

It is possible for a triggered statement to cause activation of another triggered statement. For example, this is legal:

CREATE TRIGGER t1_before_delete BEFORE DELETE ON t1 FOR EACH ROW BEGIN DELETE FROM t2; END;
CREATE TRIGGER t2_before_delete BEFORE DELETE ON t2 FOR EACH ROW BEGIN DELETE FROM t3; END;

Activation occurs FOR EACH ROW, not FOR EACH STATEMENT. Therefore, if no rows are candidates for insert or update or delete, then no triggers are activated.

The BEFORE trigger is activated even if the trigger event fails.

If an UPDATE trigger event does not make a change, the trigger is activated anyway. For example, if row 1 column1 contains 'a', and the trigger event is UPDATE ... SET column1 = 'a';, the trigger is activated.
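
The two activation rules above (no candidate rows means no activation; an UPDATE that changes nothing still activates) might be observed with a sketch like this one. The tables src and change_log and the trigger name are hypothetical, and the sketch assumes that an omitted AUTOINCREMENT column is filled in automatically.

```sql
CREATE TABLE src (id INTEGER PRIMARY KEY, column1 STRING);
CREATE TABLE change_log (id INTEGER PRIMARY KEY AUTOINCREMENT, note STRING);
-- The triggered statement writes one row to change_log per activation.
CREATE TRIGGER src_before_update BEFORE UPDATE ON src FOR EACH ROW
  BEGIN INSERT INTO change_log (note) VALUES ('update seen'); END;
INSERT INTO src VALUES (1, 'a');
-- Row 1 already contains 'a'; the trigger should be activated anyway.
UPDATE src SET column1 = 'a' WHERE id = 1;
-- No row has id = 100, so there is no candidate row and no activation.
UPDATE src SET column1 = 'b' WHERE id = 100;
-- If both rules hold, change_log contains exactly one row.
-- SEQSCAN is described in the FROM clause section; omit it on versions before 2.11.
SELECT COUNT(*) FROM SEQSCAN change_log;
```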

The triggered statement may refer to a function: RAISE(FAIL, error-message). If a triggered statement invokes a RAISE(FAIL, error-message) function, or if a triggered statement causes an error, then statement execution stops immediately.
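
A minimal sketch of RAISE(FAIL, error-message) inside a triggered statement follows. The table accounts, the trigger name, and the message text are hypothetical; the sketch assumes that RAISE is invoked from a SELECT inside the trigger body, as is usual in SQL dialects that provide this function.

```sql
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER);
-- Reject any update that would leave a negative balance.
CREATE TRIGGER accounts_before_update BEFORE UPDATE ON accounts FOR EACH ROW
  WHEN new.balance < 0
  BEGIN SELECT RAISE(FAIL, 'balance must not be negative'); END;
INSERT INTO accounts VALUES (1, 10);
-- The WHEN condition is false here, so the trigger is not activated.
UPDATE accounts SET balance = balance - 5 WHERE id = 1;
-- The new balance would be -15, so this statement is expected to stop with the error message.
UPDATE accounts SET balance = balance - 20 WHERE id = 1;
```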

The triggered statement may refer to column values within the rows being changed. In this case:

The row “as of before” the change is called the “old” row (which makes sense only for UPDATE and DELETE statements).
The row “as of after” the change is called the “new” row (which makes sense only for UPDATE and INSERT statements).

This example shows how an INSERT can be done to a view by referring to the “new” row:

CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER);
CREATE VIEW v AS SELECT s1, s2 FROM t;
CREATE TRIGGER v_instead_of INSTEAD OF INSERT ON v
  FOR EACH ROW
  BEGIN INSERT INTO t VALUES (new.s1, new.s2); END;
INSERT INTO v VALUES (1, 2);

Ordinarily saying INSERT INTO view_name ... is illegal in Tarantool, so this is a workaround.

It is possible to generalize this so that all data-change statements on views will change the base tables, provided that the view contains all the columns of the base table, and provided that the triggers refer to those columns when necessary, as in this example:

CREATE TABLE base_table (primary_key_column INTEGER PRIMARY KEY, value_column INTEGER);
CREATE VIEW viewed_table AS SELECT primary_key_column, value_column FROM base_table;
CREATE TRIGGER viewed_table_instead_of_insert INSTEAD OF INSERT ON viewed_table FOR EACH ROW
  BEGIN
    INSERT INTO base_table VALUES (new.primary_key_column, new.value_column); END;
CREATE TRIGGER viewed_table_instead_of_update INSTEAD OF UPDATE ON viewed_table FOR EACH ROW
  BEGIN
    UPDATE base_table
    SET primary_key_column = new.primary_key_column, value_column = new.value_column
    WHERE primary_key_column = old.primary_key_column; END;
CREATE TRIGGER viewed_table_instead_of_delete INSTEAD OF DELETE ON viewed_table FOR EACH ROW
  BEGIN
    DELETE FROM base_table WHERE primary_key_column = old.primary_key_column; END;

When INSERT or UPDATE or DELETE occurs for table X, Tarantool usually operates in this order (a basic scheme):

For each row
  Perform constraint checks
  For each BEFORE trigger that refers to table X
    Check that the trigger's WHEN condition is true.
    Execute what is in the triggered statement.
  Insert or update or delete the row in table X.
  Perform more constraint checks
  For each AFTER trigger that refers to table X
    Check that the trigger's WHEN condition is true.
    Execute what is in the triggered statement.

However, Tarantool does not guarantee execution order when there are multiple constraints, or multiple triggers for the same event (including NoSQL on_replace triggers or SQL INSTEAD OF triggers that affect a view of table X).

The maximum number of trigger activations per statement is 32.

INSTEAD OF triggers

A trigger which is created with the clause
INSTEAD OF INSERT|UPDATE|DELETE ON view-name
is an INSTEAD OF trigger. For each affected row, the trigger action is performed “instead of” the INSERT or UPDATE or DELETE statement that causes trigger activation.

For example, ordinarily it is illegal to INSERT rows in a view, but it is legal to create a trigger which intercepts attempts to INSERT, and puts rows in the underlying base table:

CREATE TABLE t1 (column1 INTEGER PRIMARY KEY, column2 INTEGER);
CREATE VIEW v1 AS SELECT column1, column2 FROM t1;
CREATE TRIGGER v1_instead_of INSTEAD OF INSERT ON v1 FOR EACH ROW BEGIN
 INSERT INTO t1 VALUES (NEW.column1, NEW.column2); END;
INSERT INTO v1 VALUES (1, 1);
-- ... The result will be: table t1 will contain a new row.

INSTEAD OF triggers are only legal for views, while BEFORE or AFTER triggers are only legal for base tables.

It is legal to create INSTEAD OF triggers with triggered WHEN clauses.

Limitations:

It is legal to create INSTEAD OF triggers with UPDATE OF column-list clauses, but they are not standard SQL.

Example:

CREATE TRIGGER ev1_instead_of_update
  INSTEAD OF UPDATE OF column2,column1 ON ev1
  FOR EACH ROW BEGIN
  INSERT INTO et2 VALUES (NEW.column1, NEW.column2); END;

DROP TRIGGER

Syntax:

DROP TRIGGER [IF EXISTS] trigger-name;

Drop a trigger.

The trigger-name must identify a trigger that was created earlier with the CREATE TRIGGER statement.

Rules: none

Actions:

Tarantool returns an error if the trigger does not exist and there is no IF EXISTS clause.
The trigger is dropped.
Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP TRIGGER table1_before_insert;
-- with an IF EXISTS clause:
DROP TRIGGER IF EXISTS table1_before_insert;

Statements that change data

INSERT

Syntax:

INSERT INTO table-name [(column-list)] VALUES (expression-list) [, (expression-list)];
INSERT INTO table-name [(column-list)] select-statement;
INSERT INTO table-name DEFAULT VALUES;

Insert one or more new rows into a table.

The table-name must be a name of a table defined earlier with CREATE TABLE.

The optional column-list must be a comma-separated list of names of columns in the table.

The expression-list must be a comma-separated list of expressions; each expression may contain literals and operators and subqueries and function invocations.

Rules:

The values in the expression-list are evaluated from left to right.
The order of the values in the expression-list must correspond to the order of the columns in the table, or (if a column-list is specified) to the order of the columns in the column-list.
The data type of the value should correspond to the data type of the column, that is, the data type that was specified with CREATE TABLE.
If a column-list is not specified, then the number of expressions must be the same as the number of columns in the table.
If a column-list is specified, then some columns may be omitted; omitted columns will get default values.
The parenthesized expression-list may be repeated – (expression-list),(expression-list),... – for multiple rows.

Actions:

Tarantool evaluates each expression in expression-list, and returns an error if any of the rules is violated.
Tarantool creates zero or more new rows containing values based on the values in the VALUES list or based on the results of the select-statement or based on the default values.
Tarantool executes constraint checks and trigger actions and the actual insertion.

Examples:

-- the simplest form:
INSERT INTO table1 VALUES (1, 'A');
-- with a column list:
INSERT INTO table1 (column1, column2) VALUES (2, 'B');
-- with an arithmetic operator in the first expression:
INSERT INTO table1 VALUES (2 + 1, 'C');
-- put two rows in the table:
INSERT INTO table1 VALUES (4, 'D'), (5, 'E');
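
The rules about column-list and omitted columns can be illustrated with a short sketch. The table items and its DEFAULT clause are hypothetical and chosen only to make the default values visible.

```sql
CREATE TABLE items (id INTEGER PRIMARY KEY, name STRING DEFAULT 'unnamed', qty INTEGER);
-- name and qty are omitted: name gets its default 'unnamed', qty gets NULL.
INSERT INTO items (id) VALUES (1);
-- The values follow the order of the column-list, not the order of the table's columns.
INSERT INTO items (qty, id) VALUES (5, 2);
-- Expected to fail: without a column-list, three expressions are required.
INSERT INTO items VALUES (3, 'C');
-- Look at the row that was filled in from default values.
SELECT * FROM items WHERE id = 1;
```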

UPDATE

Syntax:

UPDATE table-name SET column-name = expression [, column-name = expression ...] [WHERE search-condition];

Update zero or more existing rows in a table.

The table-name must be a name of a table defined earlier with CREATE TABLE or CREATE VIEW.

The column-name must be an updatable column in the table.

The expression may contain literals and operators and subqueries and function invocations and column names.

Rules:

The values in the SET clause are evaluated from left to right.
The data type of the value should correspond to the data type of the column, that is, the data type that was specified with CREATE TABLE.
If a search-condition is not specified, then all rows in the table will be updated; otherwise only those rows which match the search-condition will be updated.

Actions:

Tarantool evaluates each expression in the SET clause, and returns an error if any of the rules is violated. For each row that is found by the WHERE clause, a temporary new row is formed based on the original contents and the modifications caused by the SET clause.
Tarantool executes constraint checks and trigger actions and the actual update.

Examples:

-- the simplest form:
UPDATE t SET column1 = 1;
-- with more than one assignment in the SET clause:
UPDATE t SET column1 = 1, column2 = 2;
-- with a WHERE clause:
UPDATE t SET column1 = 5 WHERE column2 = 6;

Special cases:

It is legal to say SET (list of columns) = (list of values). For example:

UPDATE t SET (column1, column2, column3) = (1, 2, 3);

It is not legal to assign to a column more than once. For example:

INSERT INTO t (column1) VALUES (0);
UPDATE t SET column1 = column1 + 1, column1 = column1 + 1;

The result is an error: “duplicate column name”.

It is not legal to assign to a primary-key column.

DELETE

Syntax:

DELETE FROM table-name [WHERE search-condition];

Delete zero or more existing rows in a table.

The table-name must be a name of a table defined earlier with CREATE TABLE or CREATE VIEW.

The search-condition may contain literals and operators and subqueries and function invocations and column names.

Rules:

If a search-condition is not specified, then all rows in the table will be deleted; otherwise only those rows which match the search-condition will be deleted.

Actions:

Tarantool evaluates each expression in the search-condition, and returns an error if any of the rules is violated.
Tarantool finds the set of rows that are to be deleted.
Tarantool executes constraint checks and trigger actions and the actual deletion.

Examples:

-- the simplest form:
DELETE FROM t;
-- with a WHERE clause:
DELETE FROM t WHERE column2 = 6;

REPLACE

Syntax:

REPLACE INTO table-name [(column-list)] VALUES (expression-list) [, (expression-list)];
REPLACE INTO table-name [(column-list)] select-statement;
REPLACE INTO table-name DEFAULT VALUES;

Insert one or more new rows into a table, or update existing rows.

If a row already exists (as determined by the primary key or any unique key), then the action is delete + insert, and the rules are the same as for a DELETE statement followed by an INSERT statement. Otherwise the action is insert, and the rules are the same as for the INSERT statement.

Examples:

-- the simplest form:
REPLACE INTO table1 VALUES (1, 'A');
-- with a column list:
REPLACE INTO table1 (column1, column2) VALUES (2, 'B');
-- with an arithmetic operator in the first expression:
REPLACE INTO table1 VALUES (2 + 1, 'C');
-- put two rows in the table:
REPLACE INTO table1 VALUES (4, 'D'), (5, 'E');
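
The difference between the two cases (row exists, row does not exist) can be sketched with a hypothetical table kv:

```sql
CREATE TABLE kv (k INTEGER PRIMARY KEY, v STRING);
INSERT INTO kv VALUES (1, 'first');
-- A row with k = 1 exists: the action is delete + insert.
REPLACE INTO kv VALUES (1, 'second');
-- No row with k = 2 exists: the action is a plain insert.
REPLACE INTO kv VALUES (2, 'third');
-- The row with k = 1 now contains 'second'.
SELECT * FROM kv WHERE k = 1;
```

A plain INSERT INTO kv VALUES (1, 'second'); would instead fail with a duplicate-key error.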

TRUNCATE

Syntax:

TRUNCATE TABLE table-name;

Remove all rows in the table.

TRUNCATE is considered to be a schema-change rather than a data-change statement, so it does not work within transactions (it cannot be rolled back).

Rules:

It is illegal to truncate a table which is referenced by a foreign key.
It is illegal to truncate a table which is also a system space, such as _space.
The table must be a base table rather than a view.

Actions:

All rows in the table are removed. Usually this is faster than DELETE FROM table-name;.
If the table has an autoincrement primary key, its sequence is not reset to zero, but that may occur in a future Tarantool version.
There is no effect for any triggers associated with the table.
There is no effect on the counts for the ROW_COUNT() function.
Only one action is written to the write-ahead log (with DELETE FROM table-name; there would be one action for each deleted row).

Example:

TRUNCATE TABLE t;

SET

Syntax:

SET SESSION setting-name = setting-value;

SET SESSION is a shorthand way to update the box.space._session_settings temporary system space.

setting-name can have the following values:

"sql_default_engine"
"sql_full_column_names"
"sql_full_metadata"
"sql_parser_debug"
"sql_recursive_triggers"
"sql_reverse_unordered_selects"
"sql_select_debug"
"sql_vdbe_debug"
"sql_defer_foreign_keys" (removed in 2.11.0)
"error_marshaling_enabled" (removed in 2.10.0)

The quote marks are necessary.

If setting-name is "sql_default_engine", then setting-value can be either ‘vinyl’ or ‘memtx’. Otherwise, setting-value can be either TRUE or FALSE.

Example: SET SESSION "sql_default_engine" = 'vinyl'; changes the default engine to ‘vinyl’ instead of ‘memtx’, and returns:

---
- row_count: 1
...

It is functionally the same thing as an UPDATE Statement:

UPDATE "_session_settings"
SET "value" = 'vinyl'
WHERE "name" = 'sql_default_engine';
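
For the settings that take TRUE or FALSE, the same equivalence should apply. The following sketch uses "sql_recursive_triggers" from the list above; the UPDATE form is assumed to mirror the 'vinyl' example.

```sql
-- Shorthand form:
SET SESSION "sql_recursive_triggers" = FALSE;
-- Assumed equivalent form, following the pattern of the UPDATE statement above:
UPDATE "_session_settings"
SET "value" = FALSE
WHERE "name" = 'sql_recursive_triggers';
```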

Statements that retrieve data

SELECT

Syntax:

SELECT [ALL|DISTINCT] select list [from clause] [where clause] [group-by clause] [having clause] [order-by clause];

Select zero or more rows.

The clauses of the SELECT statement are discussed in the following sections.

Select list

Syntax:

select-list-column [, select-list-column ...]

select-list-column:

Define what will be in a result set; this is a clause in a SELECT statement.

The select list is a comma-delimited list of expressions, or * (asterisk). An expression can have an alias provided with an [[AS] column-name] clause.

The * “asterisk” shorthand is valid if and only if the SELECT statement also contains a FROM clause which specifies the table or tables (details about the FROM clause are in the next section). The simple form is * which means “all columns” – for example, if the select is done for a table which contains three columns s1 s2 s3, then SELECT * ... is equivalent to SELECT s1, s2, s3 .... The qualified form is table-name.* which means “all columns in the specified table”, which again must be a result of the FROM clause – for example, if the table is named table1, then table1.* is equivalent to a list of the columns of table1.

The [[AS] column-name] clause determines the column name. The column name is useful for two reasons:

in a tabular display, the column names are the headings
if the results of the SELECT are used when creating a new table (such as a view), then the column names in the new table will be the column names in the select list.

If [[AS] column-name] is missing, and the expression is not simply the name of a column in the table, then Tarantool makes a name COLUMN_n where n is the number of the non-simple expression within the select list, for example SELECT 5.88, table1.x, 'b' COLLATE "unicode_ci" FROM table1; will cause the column names to be COLUMN_1, X, COLUMN_2. This is a behavior change since version 2.5.1. In earlier versions, the name would be equal to the expression; see Issue#3962. It is still legal to define tables with column names like COLUMN_1 but not recommended.

Examples:

-- the simple form:
SELECT 5;
-- with multiple expressions including operators:
SELECT 1, 2 * 2, 'Three' || 'Four';
-- with [[AS] column-name] clause:
SELECT 5 AS column1;
-- * which must be eventually followed by a FROM clause:
SELECT * FROM table1;
-- as a list:
SELECT 1 AS a, 2 AS b, table1.* FROM table1;

FROM clause

Syntax:

FROM [SEQSCAN] table-reference [, table-reference ...]

Specify the table or tables for the source of a SELECT statement.

The table-reference must be a name of an existing table, or a subquery, or a joined table.

A joined table looks like this:

table-reference-or-joined-table join-operator table-reference-or-joined-table [join-specification]

A join-operator must be any of the standard types:

[NATURAL] LEFT [OUTER] JOIN,
[NATURAL] INNER JOIN, or
CROSS JOIN

A join-specification must be any of:

ON expression, or
USING (column-name [, column-name …])

Parentheses are allowed, and [[AS] correlation-name] is allowed.

The maximum number of joins in a FROM clause is 64.

The SEQSCAN keyword (since 2.11) marks the queries that perform sequential scans during the execution. It happens if the query can’t use indexes, and goes through all the table rows one by one, sometimes causing a heavy load. Such queries are called scan queries. If a scan query doesn’t have the SEQSCAN keyword, Tarantool raises an error. SEQSCAN must precede all names of the tables that the query scans.

To find out if a query performs a sequential scan, use EXPLAIN QUERY PLAN. For scan queries, the result contains SCAN TABLE table_name.

Note

For backward compatibility, the scan queries without the SEQSCAN keyword are allowed in Tarantool 2.11. The errors on scan queries are the default behavior starting from 3.0. You can change the default behavior of scan queries using the compat option sql_seq_scan.

Examples:

-- the simplest form:
SELECT * FROM SEQSCAN t;
-- with two tables, making a Cartesian join:
SELECT * FROM SEQSCAN t1, SEQSCAN t2;
-- with one table joined to itself, requiring correlation names:
SELECT a.*, b.* FROM SEQSCAN t1 AS a, SEQSCAN t1 AS b;
-- with a left outer join:
SELECT * FROM SEQSCAN t1 LEFT JOIN SEQSCAN t2;
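
The examples above do not show a join-specification. A sketch with ON and USING, and with EXPLAIN QUERY PLAN for checking whether a query is a scan query, might look like this. It assumes that both t1 and t2 have a column named column1.

```sql
-- an inner join with an ON join-specification:
SELECT * FROM SEQSCAN t1 INNER JOIN SEQSCAN t2 ON t1.column1 = t2.column1;
-- a left outer join with a USING join-specification:
SELECT * FROM SEQSCAN t1 LEFT OUTER JOIN SEQSCAN t2 USING (column1);
-- check the plan; for a scan query the result contains SCAN TABLE:
EXPLAIN QUERY PLAN SELECT * FROM SEQSCAN t1;
```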

WHERE clause

Syntax:

WHERE condition;

Specify the condition for filtering rows from a table; this is a clause in a SELECT or UPDATE or DELETE statement.

The condition may contain any expression that returns a BOOLEAN (TRUE or FALSE or UNKNOWN) value.

For each row in the table:

if the condition is true, then the row is kept;
if the condition is false or unknown, then the row is ignored.

In effect, WHERE condition takes a table with n rows and returns a table with n or fewer rows.

Examples:

-- with a simple condition:
SELECT 1 FROM t WHERE column1 = 5;
-- with a condition that contains AND and OR and parentheses:
SELECT 1 FROM t WHERE column1 = 5 AND (x > 1 OR y < 1);
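
The "false or unknown" rule matters mostly for NULL values. A sketch with a hypothetical table w:

```sql
CREATE TABLE w (id INTEGER PRIMARY KEY, x INTEGER);
INSERT INTO w VALUES (1, 5), (2, NULL);
-- SEQSCAN is described in the FROM clause section; omit it on versions before 2.11.
-- Keeps row 1 only; for row 2 the comparison with NULL is unknown.
SELECT id FROM SEQSCAN w WHERE x = 5;
-- Keeps no rows: the condition is false for row 1 and unknown for row 2.
SELECT id FROM SEQSCAN w WHERE x <> 5;
-- Keeps row 2 only; IS NULL returns TRUE or FALSE, never UNKNOWN.
SELECT id FROM SEQSCAN w WHERE x IS NULL;
```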

GROUP BY clause

Syntax:

GROUP BY expression [, expression ...]

Make a grouped table; this is a clause in a SELECT statement.

The expressions should be column names in the table, and each column should be specified only once.

In effect, the GROUP BY clause takes a table with rows that may have matching values, combines rows that have matching values into single rows, and returns a table which, because it is the result of GROUP BY, is called a grouped table.

Thus, if the input is a table:

a    b      c
-    -      -
1    'a'   'b'
1    'b'   'b'
2    'a'   'b'
3    'a'   'b'
1    'b'   'b'

then GROUP BY a, b will produce a grouped table:

a    b      c
-    -      -
1    'a'   'b'
1    'b'   'b'
2    'a'   'b'
3    'a'   'b'

The rows where column a and column b have the same value have been merged; column c has been preserved but its value should not be depended on – if the rows were not all ‘b’, Tarantool could pick any value.

It is useful to envisage a grouped table as having hidden extra columns for the aggregation of the values, for example:

a    b      c    COUNT(a) SUM(a) MIN(c)
-    -      -    -------- ------ ------
1    'a'    'b'         1      1    'b'
1    'b'    'b'         2      2    'b'
2    'a'    'b'         1      2    'b'
3    'a'    'b'         1      3    'b'

These extra columns are what aggregate functions are for.

Examples:

-- with a single column:
SELECT 1 FROM t GROUP BY column1;
-- with two columns:
SELECT 1 FROM t GROUP BY column1, column2;
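
The "hidden extra columns" become visible when aggregate functions appear in the select list. A sketch with a hypothetical table sales:

```sql
CREATE TABLE sales (id INTEGER PRIMARY KEY, region STRING, amount INTEGER);
INSERT INTO sales VALUES (1, 'north', 10), (2, 'south', 5), (3, 'north', 7);
-- SEQSCAN is described in the FROM clause section; omit it on versions before 2.11.
SELECT region, COUNT(amount), SUM(amount), MIN(amount)
FROM SEQSCAN sales
GROUP BY region;
-- The grouped table has two rows:
--   'north' with COUNT 2, SUM 17, MIN 7
--   'south' with COUNT 1, SUM 5,  MIN 5
-- The order of the two rows is not guaranteed without ORDER BY.
```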

Limitations:

SELECT s1, s2 FROM t GROUP BY s1; is legal.
SELECT s1 AS q FROM t GROUP BY q; is legal.
SELECT s1 FROM t GROUP BY 1; is legal.

Aggregate functions

Syntax:

function-name (one or more expressions)

Apply a built-in aggregate function to one or more expressions and return a scalar value.

Aggregate functions are only legal in certain clauses of a SELECT statement for grouped tables. (A table is a grouped table if a GROUP BY clause is present.) Also, if an aggregate function is used in a select list and the GROUP BY clause is omitted, then Tarantool assumes SELECT ... GROUP BY [all columns];.

NULLs are ignored for all aggregate functions except COUNT(*).

AVG([DISTINCT] expression)

Return the average value of expression.

Example: AVG(column1)

COUNT([DISTINCT] expression)

Return the number of occurrences of expression.

Example: COUNT(column1)

COUNT(*)

Return the number of occurrences of a row.

Example: COUNT(*)

GROUP_CONCAT(expression-1 [, expression-2]) or GROUP_CONCAT(DISTINCT expression-1)

Return a list of expression-1 values, separated by commas if expression-2 is omitted, or separated by the expression-2 value if expression-2 is not omitted.

Example: GROUP_CONCAT(column1)

MAX([DISTINCT] expression)

Return the maximum value of expression.

Example: MAX(column1)

MIN([DISTINCT] expression)

Return the minimum value of expression.

Example: MIN(column1)

SUM([DISTINCT] expression)

Return the sum of values of expression, or NULL if there are no rows.

Example: SUM(column1)

TOTAL([DISTINCT] expression)

Return the sum of values of expression, or zero if there are no rows.

Example: TOTAL(column1)
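
The differences between SUM and TOTAL, and between COUNT(*) and COUNT(expression), can be sketched with a hypothetical table agg:

```sql
CREATE TABLE agg (id INTEGER PRIMARY KEY, v INTEGER, s STRING);
-- SEQSCAN is described in the FROM clause section; omit it on versions before 2.11.
-- On an empty table: SUM returns NULL, TOTAL returns zero, COUNT(*) returns 0.
SELECT SUM(v), TOTAL(v), COUNT(*) FROM SEQSCAN agg;
INSERT INTO agg VALUES (1, 10, 'x'), (2, NULL, NULL), (3, 20, 'y');
-- COUNT(*) counts all 3 rows; COUNT(v) ignores the NULL and returns 2; SUM(v) is 30.
SELECT COUNT(*), COUNT(v), SUM(v) FROM SEQSCAN agg;
-- Default comma separator, then an explicit separator; the NULL in s is ignored.
SELECT GROUP_CONCAT(s), GROUP_CONCAT(s, '; ') FROM SEQSCAN agg;
```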

HAVING clause

Syntax:

HAVING condition;

Specify the condition for filtering rows from a grouped table; this is a clause in a SELECT statement.

The clause preceding the HAVING clause may be a GROUP BY clause. HAVING operates on the table that the GROUP BY produces, which may contain grouped columns and aggregates.

If the preceding clause is not a GROUP BY clause, then there is only one group and the HAVING clause may only contain aggregate functions or literals.

For each row in the table:

if the condition is true, then the row is kept;
if the condition is false or unknown, then the row is ignored.

In effect, HAVING condition takes a table with n rows and returns a table with n or fewer rows.

Examples:

-- with a simple condition:
SELECT 1 FROM t GROUP BY column1 HAVING column2 > 5;
-- with a more complicated condition:
SELECT 1 FROM t GROUP BY column1 HAVING column2 > 5 OR column2 < 5;
-- with an aggregate:
SELECT x, SUM(y) FROM t GROUP BY x HAVING SUM(y) > 0;
-- with no GROUP BY and an aggregate:
SELECT SUM(y) FROM t HAVING MIN(y) < MAX(y);

Limitations:

HAVING without GROUP BY is not supported for multiple tables.

ORDER BY clause

Syntax:

ORDER BY expression [ASC|DESC] [, expression [ASC|DESC] ...]

Put rows in order; this is a clause in a SELECT statement.

An ORDER BY expression has one of three types which are checked in order:

1. Expression is a positive integer, representing the ordinal position of the column in the select list. For example, in the statement
SELECT x, y, z FROM t ORDER BY 2;
ORDER BY 2 means “order by the second column in the select list”, which is y.
2. Expression is a name of a column in the select list, which is determined by an AS clause. For example, in the statement
SELECT x, y AS x, z FROM t ORDER BY x;
ORDER BY x means “order by the column explicitly named x in the select list”, which is the second column.
3. Expression contains a name of a column in a table of the FROM clause. For example, in the statement
SELECT x, y FROM t1 JOIN t2 ORDER BY z;
ORDER BY z means “order by a column named z which is expected to be in table t1 or table t2”.

If both tables contain a column named z, then Tarantool will choose the first column that it finds.

The expression may also contain operators and function names and literals. For example, in the statement
SELECT x, y FROM t ORDER BY UPPER(z);
ORDER BY UPPER(z) means “order by the uppercase form of column t.z”, which may be similar to doing ordering with one of Tarantool’s case-insensitive collations.

Type 3 is illegal if the SELECT statement contains UNION or EXCEPT or INTERSECT.

If an ORDER BY clause contains multiple expressions, then expressions on the left are processed first and expressions on the right are processed only if necessary for tie-breaking. For example, in the statement
SELECT x, y FROM t ORDER BY x, y; if there are two rows which both have the same values for column x, then an additional check is made to see which row has a greater value for column y.

In effect, ORDER BY clause takes a table with rows that may be out of order, and returns a table with rows in order.

Sorting order:

The default order is ASC (ascending), the optional order is DESC (descending).
NULLs come first, then BOOLEANs, then numerics, then STRINGs, then VARBINARYs, then UUIDs.
Ordering does not matter for ARRAYs or MAPs or ANYs because they are not legal for comparisons.
Within STRINGs, ordering is according to collation.
Collation may be specified with a COLLATE clause within the ORDER BY column-list, or may be default.

Examples:

-- with a single column:
SELECT 1 FROM t ORDER BY column1;
-- with two columns:
SELECT 1 FROM t ORDER BY column1, column2;
-- with a variety of data:
CREATE TABLE h (s1 NUMBER PRIMARY KEY, s2 SCALAR);
INSERT INTO h VALUES (7, 'A'), (4, 'a'), (-4, 'AZ'), (17, 17), (23, NULL);
INSERT INTO h VALUES (17.5, 'Д'), (1e+300, 'A'), (0, ''), (-1, '');
SELECT * FROM h ORDER BY s2 COLLATE "unicode_ci", s1;
-- The result of the above SELECT will be:
- - [23, null]
  - [17, 17]
  - [-1, '']
  - [0, '']
  - [4, 'a']
  - [7, 'A']
  - [1e+300, 'A']
  - [-4, 'AZ']
  - [17.5, 'Д']
...

Limitations:

ORDER BY 1 is legal. This is common but is not standard SQL nowadays.

LIMIT clause

Syntax:

LIMIT limit-expression [OFFSET offset-expression]
LIMIT offset-expression, limit-expression

Note

The above is not a typo: offset-expression and limit-expression are in reverse order if a comma is used.

Specify a maximum number of rows and a start row; this is a clause in a SELECT statement.

Expressions may contain integers and arithmetic operators or functions, for example ABS(-3 / 1). However, the result must be an integer value greater than or equal to zero.

Usually the LIMIT clause follows an ORDER BY clause, because otherwise Tarantool does not guarantee that rows are in order.
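
Since the two syntax forms take their arguments in opposite order, a small model may help. The sketch below is Python, not Tarantool code; the function names and the sample rows are invented for illustration, and the rows are assumed to be already ordered.

```python
def apply_limit(rows, limit, offset=0):
    """Model of LIMIT limit OFFSET offset on an already ordered list of rows."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be integers >= 0")
    return rows[offset:offset + limit]

def apply_limit_comma(rows, offset, limit):
    """Model of LIMIT offset, limit: the same operation, arguments reversed."""
    return apply_limit(rows, limit, offset)

rows = [10, 20, 30, 40, 50]            # made-up rows, already sorted
print(apply_limit(rows, 3))            # take the first three rows
print(apply_limit(rows, 3, 1))         # skip one row, then take three
print(apply_limit_comma(rows, 1, 3))   # same result as the previous line
```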

Examples:

-- simple case:
SELECT * FROM t LIMIT 3;
-- both limit and offset:
SELECT * FROM t LIMIT 3 OFFSET 1;
-- applied to a UNIONed result (LIMIT clause must be the final clause):
SELECT column1 FROM table1 UNION SELECT column1 FROM table2 ORDER BY 1 LIMIT 1;

Limitations:

If ORDER BY … LIMIT is used, then all order-by columns must be ASC or all must be DESC.

Subquery

Syntax:

SELECT-statement syntax
VALUES-statement syntax

A subquery has the same syntax as a SELECT statement or VALUES statement embedded inside a main statement.

Note

The SELECT and VALUES statements are called “queries” because they return answers, in the form of result sets.

Subqueries may be the second part of INSERT statements. For example:

INSERT INTO t2 SELECT a, b, c FROM t1;

Subqueries may be in the FROM clause of SELECT statements.

Subqueries may be expressions, or be inside expressions. In this case they must be parenthesized, and usually the number of rows must be 1. For example:

SELECT 1, (SELECT 5), 3 FROM t WHERE c1 * (SELECT COUNT(*) FROM t2) > 5;

Subqueries may be expressions on the right side of certain comparison operators, and in this unusual case the number of rows may be greater than 1. The comparison operators are: [NOT] EXISTS and [NOT] IN. For example:

DELETE FROM t WHERE s1 NOT IN (SELECT s2 FROM t);

Subqueries may refer to values in the outer query. In this case, the subquery is called a “correlated subquery”.

Subqueries may refer to rows which are being updated or deleted by the main query. In that case, the subquery finds the matching rows first, before starting to update or delete. For example, after:

CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER);
INSERT INTO t VALUES (1, 3), (2, 1);
DELETE FROM t WHERE s2 NOT IN (SELECT s1 FROM t);

only one of the rows is deleted, not both rows.
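
The effect of finding the matching rows first can be modelled in a few lines. This is a Python sketch of the behavior described above, not of Tarantool's internals; the second function is a hypothetical counter-example that re-reads the table after every deletion, included only to show why the order of evaluation matters.

```python
# Rows of t as (s1, s2), matching the INSERT statement above.
t = [(1, 3), (2, 1)]

def delete_with_snapshot(rows):
    """Evaluate SELECT s1 FROM t once, before any row is deleted."""
    s1_snapshot = {s1 for s1, _ in rows}
    return [(s1, s2) for s1, s2 in rows if s2 in s1_snapshot]

def delete_rereading(rows):
    """Hypothetical alternative: re-evaluate the subquery after each deletion."""
    rows = list(rows)
    for row in list(rows):
        if row[1] not in {s1 for s1, _ in rows}:
            rows.remove(row)
    return rows

print(delete_with_snapshot(t))   # row (2, 1) survives
print(delete_rereading(t))       # both rows would be gone
```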

WITH clause

WITH clause (common table expression)

Syntax:

WITH temporary-table-name AS (subquery)
[, temporary-table-name AS (subquery)]
SELECT statement | INSERT statement | DELETE statement | UPDATE statement | REPLACE statement;

WITH v AS (SELECT * FROM t) SELECT * FROM v;

is equivalent to creating a view and selecting from it:

CREATE VIEW v AS SELECT * FROM t;
SELECT * FROM v;

The difference is that a WITH-clause “view” is temporary and only useful within the same statement. No CREATE privilege is required.

The WITH-clause can also be thought of as a subquery that has a name. This is useful when the same subquery is being repeated. For example:

SELECT * FROM t WHERE a < (SELECT s1 FROM x) AND b < (SELECT s1 FROM x);

can be replaced with:

WITH s AS (SELECT s1 FROM x) SELECT * FROM t,s WHERE a < s.s1 AND b < s.s1;

This “factoring out” of a repeated expression is regarded as good practice.

Examples:

WITH cte AS (VALUES (7, '')) INSERT INTO j SELECT * FROM cte;
WITH cte AS (SELECT s1 AS x FROM k) SELECT * FROM cte;
WITH cte AS (SELECT COUNT(*) AS s1 FROM k WHERE s2 < 'x' GROUP BY s3)
  UPDATE j SET s2 = 5
  WHERE s1 = (SELECT s1 FROM cte) OR s3 = (SELECT s1 FROM cte);

WITH can only be used at the beginning of a statement, therefore it cannot be used at the beginning of a subquery or after a set operator or inside a CREATE statement.

A WITH-clause “view” is read-only because Tarantool does not support updatable views.

WITH RECURSIVE

WITH RECURSIVE clause (iterative common table expression)

The real power of WITH lies in the WITH RECURSIVE clause, which is useful when it is combined with UNION or UNION ALL:

WITH RECURSIVE recursive-table-name AS
(SELECT ... FROM non-recursive-table-name ...
UNION [ALL]
SELECT ... FROM recursive-table-name ...)
statement-that-uses-recursive-table-name;

In non-SQL this can be read as: starting with a seed value from a non-recursive table, produce a recursive viewed table, UNION that with itself, UNION that with itself, UNION that with itself … forever, or until a condition in the WHERE clause says “stop”.

For example:

CREATE TABLE ts (s1 INTEGER PRIMARY KEY);
INSERT INTO ts VALUES (1);
WITH RECURSIVE w AS (
  SELECT s1 FROM ts
  UNION ALL
  SELECT s1 + 1 FROM w WHERE s1 < 4)
SELECT * FROM w;

First, table w is seeded from ts, so it has one row: [1].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [1] – adds 1 because the select list says “s1+1”, and so it has one row: [2].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [2] – adds 1 because the select list says “s1+1”, and so it has one row: [3].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [3] – adds 1 because the select list says “s1+1”, and so it has one row: [4].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [4] – and now the importance of the WHERE clause becomes evident, because “s1 < 4” is false for this row, and therefore the “stop” condition has been reached.

So, before the “stop”, table w got 4 rows – [1], [2], [3], [4] – and the result of the statement looks like:

tarantool> WITH RECURSIVE w AS (
         >   SELECT s1 FROM ts
         >   UNION ALL
         >   SELECT s1 + 1 FROM w WHERE s1 < 4)
         > SELECT * FROM w;
---
- - [1]
  - [2]
  - [3]
  - [4]
...

In other words, this WITH RECURSIVE ... SELECT produces a table of auto-incrementing values.
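
The iteration described step by step above can be sketched as a loop. This is a Python model with invented names (recursive_cte, step, keep_going), not a description of how Tarantool executes the statement; it assumes UNION ALL semantics, so no duplicate elimination is done.

```python
def recursive_cte(seed_rows, step, keep_going):
    """Seed the result, then keep applying the recursive SELECT to the newest rows."""
    result = list(seed_rows)
    frontier = list(seed_rows)
    while frontier:
        # The recursive SELECT sees only the rows produced by the previous step;
        # the WHERE condition (keep_going) filters them before the select list runs.
        frontier = [step(row) for row in frontier if keep_going(row)]
        result.extend(frontier)
    return result

# SELECT s1 FROM ts  ->  seed [1]
# SELECT s1 + 1 FROM w WHERE s1 < 4  ->  step and stop condition
w = recursive_cte([1], step=lambda s1: s1 + 1, keep_going=lambda s1: s1 < 4)
print(w)
```

If keep_going never becomes false, the loop never ends, which corresponds to the "forever" case mentioned earlier.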

UNION, EXCEPT, and INTERSECT clauses

Syntax:

select-statement UNION [ALL] select-statement [ORDER BY clause] [LIMIT clause];
select-statement EXCEPT select-statement [ORDER BY clause] [LIMIT clause];
select-statement INTERSECT select-statement [ORDER BY clause] [LIMIT clause];

UNION, EXCEPT, and INTERSECT are collectively called “set operators” or “table operators”. In particular:

a UNION b means “take rows which occur in a OR b”.
a EXCEPT b means “take rows which occur in a AND NOT b”.
a INTERSECT b means “take rows which occur in a AND b”.

Duplicate rows are eliminated unless ALL is specified.

The select-statements may be chained, with a set operator between each pair: SELECT ... UNION SELECT ... EXCEPT SELECT ...;

Each select-statement must result in the same number of columns.

The select-statements may be replaced with VALUES statements.

The maximum number of set operations is 50.

Example:

CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t2 (s1 INTEGER PRIMARY KEY, s2 STRING);
INSERT INTO t1 VALUES (1, 'A'), (2, 'B'), (3, NULL);
INSERT INTO t2 VALUES (1, 'A'), (2, 'C'), (3, NULL);
SELECT s2 FROM t1 UNION SELECT s2 FROM t2;
SELECT s2 FROM t1 UNION ALL SELECT s2 FROM t2 ORDER BY s2;
SELECT s2 FROM t1 EXCEPT SELECT s2 FROM t2;
SELECT s2 FROM t1 INTERSECT SELECT s2 FROM t2;

In this example:

The UNION query returns 4 rows: NULL, ‘A’, ‘B’, ‘C’.
The UNION ALL query returns 6 rows: NULL, NULL, ‘A’, ‘A’, ‘B’, ‘C’.
The EXCEPT query returns 1 row: ‘B’.
The INTERSECT query returns 2 rows: NULL, ‘A’.
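
The four results can be reproduced with a small model. The sketch below is Python, not Tarantool code: lists stand in for the s2 columns of t1 and t2, None stands in for NULL, and the helper names are invented. As the results above show, set operators treat two NULLs as duplicates of each other, and the model does the same.

```python
def distinct(rows):
    """Remove duplicate rows, keeping the first occurrence of each."""
    seen = []
    for row in rows:
        if row not in seen:
            seen.append(row)
    return seen

def union(a, b):
    return distinct(a + b)

def union_all(a, b):
    return a + b

def except_(a, b):
    return [row for row in distinct(a) if row not in b]

def intersect(a, b):
    return [row for row in distinct(a) if row in b]

t1 = ['A', 'B', None]   # s2 values of t1
t2 = ['A', 'C', None]   # s2 values of t2

print(len(union(t1, t2)), len(union_all(t1, t2)))
print(except_(t1, t2), intersect(t1, t2))
```

The model does not sort its output; in SQL the row order is only defined when an ORDER BY clause is present.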

Limitations:

Parentheses are not allowed.
Evaluation is left to right, INTERSECT does not have precedence.

Example:

CREATE TABLE t01 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t02 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t03 (s1 INTEGER PRIMARY KEY, s2 STRING);
INSERT INTO t01 VALUES (1, 'A');
INSERT INTO t02 VALUES (1, 'B');
INSERT INTO t03 VALUES (1, 'A');
SELECT s2 FROM t01 INTERSECT SELECT s2 FROM t03 UNION SELECT s2 FROM t02;
SELECT s2 FROM t03 UNION SELECT s2 FROM t02 INTERSECT SELECT s2 FROM t03;
-- ... results are different.
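
The left-to-right rule can be checked with Python sets. This is only a model: the sets hold the s2 values of t01, t02 and t03 from the example, and the evaluate helper is a hypothetical name for "apply each operator to the result so far".

```python
from functools import reduce

def union(a, b):
    return a | b

def intersect(a, b):
    return a & b

def evaluate(first, *steps):
    """Apply (operator, operand) pairs strictly from left to right."""
    return reduce(lambda acc, step: step[0](acc, step[1]), steps, first)

t01, t02, t03 = {'A'}, {'B'}, {'A'}

# t01 INTERSECT t03 UNION t02
first = evaluate(t01, (intersect, t03), (union, t02))
# t03 UNION t02 INTERSECT t03
second = evaluate(t03, (union, t02), (intersect, t03))

print(sorted(first), sorted(second))
```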

INDEXED BY clause

Syntax:

INDEXED BY index-name

The INDEXED BY clause may be used in a SELECT, DELETE, or UPDATE statement, immediately after the table-name. For example:

DELETE FROM table7 INDEXED BY index7 WHERE column1 = 'a';

In this case the search for ‘a’ will take place within index7. The opposite request is NOT INDEXED. For example:

SELECT * FROM table7 NOT INDEXED WHERE column1 = 'a';

In this case the search for ‘a’ will be done via a search of the whole table, what is sometimes called a “full table scan”, even if there is an index for column1.

Ordinarily Tarantool chooses the appropriate index or lookup method depending on a complex set of “optimizer” rules; the INDEXED BY clause overrides the optimizer choice. If the index was defined with the exclude_null parts option, it will only be used if the user specifies it.

Example:

Suppose a table has two columns:

The first column is the primary key and therefore it has an automatic index named pk_unnamed_T_1.
The second column has an index created by the user.

The user selects with INDEXED BY the-index-on-column1, then selects with INDEXED BY the-index-on-column2.

CREATE TABLE t (column1 INTEGER PRIMARY KEY, column2 INTEGER);
CREATE INDEX idx_column2_t_1 ON t (column2);
INSERT INTO t VALUES (1, 2), (2, 1);
SELECT * FROM t INDEXED BY "pk_unnamed_T_1";
SELECT * FROM t INDEXED BY idx_column2_t_1;
-- Result for the first select: (1, 2), (2, 1)
-- Result for the second select: (2, 1), (1, 2).

Limitations:
Often INDEXED BY has no effect.
Often INDEXED BY affects a choice of covering index, but not a WHERE clause.

VALUES

Syntax:

VALUES (expression [, expression ...]) [, (expression [, expression ...])]

Select one or more rows.

VALUES has the same effect as SELECT, that is, it returns a result set, but VALUES statements may not have FROM or GROUP or ORDER BY or LIMIT clauses.

VALUES may be used wherever SELECT may be used, for example in subqueries.

Examples:

-- simple case:
VALUES (1);
-- equivalent to SELECT 1, 2, 3:
VALUES (1, 2, 3);
-- two rows:
VALUES (1, 2, 3), (4, 5, 6);

PRAGMA

Syntax:

PRAGMA pragma-name (pragma-value);
or PRAGMA pragma-name;

PRAGMA statements will give rudimentary information about database ‘metadata’ or server performance, although it is better to get metadata via system tables.

For PRAGMA statements that include (pragma-value), pragma values are strings and can be specified inside "" double quotes, or without quotes. When a string is used for searching, results must match according to a binary collation. If the object being searched has a lower-case name, use double quotes.

In an earlier version, there were some PRAGMA statements that determined behavior. Now that does not happen. Behavior change is done by updating the box.space._session_settings system table.

Pragma	Parameter	Effect
foreign_key_list	string table-name	Return a result set with one row for each foreign key of “table-name”. Each row contains: (INTEGER) id – identification number; (INTEGER) seq – sequential number; (STRING) table – name of table; (STRING) from – referencing key; (STRING) to – referenced key; (STRING) on_update – ON UPDATE clause; (STRING) on_delete – ON DELETE clause; (STRING) match – MATCH clause. The system table is "_fk_constraint".
collation_list		Return a result set with one row for each supported collation. The first four collations are 'none', 'unicode', 'unicode_ci', and 'binary'; then come about 270 predefined collations. The exact count may vary because users can add their own collations. The system table is "_collation".
index_info	string table-name . index-name	Return a result set with one row for each column in “table-name.index-name”. Each row contains: (INTEGER) seqno – the column’s ordinal position in the index (first column is 0); (INTEGER) cid – the column’s ordinal position in the table (first column is 0); (STRING) name – name of the column; (INTEGER) desc – 0 is ASC, 1 is DESC; (STRING) collation name; (STRING) type – data type.
index_list	string table-name	Return a result set with one row for each index of “table-name”. Each row contains: (INTEGER) seq – sequential number; (STRING) name – index name; (INTEGER) unique – whether the index is unique, 0 is false, 1 is true. The system table is "_index".
stats		Return a result set with one row for each index of each table. Each row contains: (STRING) table – name of the table; (STRING) index – name of the index; (INTEGER) width – arbitrary information; (INTEGER) height – arbitrary information.
table_info	string table-name	Return a result set with one row for each column in “table-name”. Each row contains: (INTEGER) cid – ordinal position in the table (first column number is 0); (STRING) name – column name; (STRING) type; (INTEGER) notnull – whether the column is NOT NULL, 0 is false, 1 is true; (STRING) dflt_value – default value; (INTEGER) pk – whether the column is a PRIMARY KEY column, 0 is false, 1 is true.

Example: (not showing result set metadata)

PRAGMA table_info(T);
---
- - [0, 's1', 'integer', 1, null, 1]
  - [1, 's2', 'integer', 0, null, 0]
...

EXPLAIN

Syntax:

EXPLAIN explainable-statement;

EXPLAIN will show what steps Tarantool would take if it executed explainable-statement. This is primarily a debugging and optimization aid for the Tarantool team.

Example: EXPLAIN DELETE FROM m; returns:

- - [0, 'Init', 0, 3, 0, '', '00', 'Start at 3']
  - [1, 'Clear', 16416, 0, 0, '', '00', '']
  - [2, 'Halt', 0, 0, 0, '', '00', '']
  - [3, 'Transaction', 0, 1, 1, '0', '01', 'usesStmtJournal=0']
  - [4, 'Goto', 0, 1, 0, '', '00', '']

Variation: EXPLAIN QUERY PLAN statement; shows the steps of a search.

Statements for transactions

START TRANSACTION

Syntax:

START TRANSACTION;

Start a transaction. After START TRANSACTION;, a transaction is “active”. If a transaction is already active, then START TRANSACTION; is illegal.

Transactions should be active for fairly short periods of time, to avoid concurrency issues. To end a transaction, say COMMIT; or ROLLBACK;.

Just as in NoSQL, transaction control statements are subject to limitations set by the storage engine involved:
* For the memtx storage engine, if a yield happens within an active transaction, the transaction is rolled back.
* For the vinyl engine, yields are allowed.
Also, although CREATE and DROP and ALTER statements are legal in transactions, there are a few exceptions. For example, CREATE INDEX ON table_name ... will fail within a multi-statement transaction if the table is not empty.

However, transaction control statements still may not work as you expect when run over a network connection: a transaction is associated with a fiber, not a network connection, and different transaction control statements sent via the same network connection may be executed by different fibers from the fiber pool.

In order to ensure that all statements are part of the intended transaction, put all of them between START TRANSACTION; and COMMIT; or ROLLBACK; then send as a single batch. For example:

Enclose each separate SQL statement in a box.execute() function.
Pass all the box.execute() functions to the server in a single message.

If you are using a console, you can do this by writing everything on a single line.

If you are using net.box, you can do this by putting all the function calls in a single string and calling eval(string).

Example:

START TRANSACTION;

Example of a whole transaction sent to a server on localhost:3301 with eval(string):

net_box = require('net.box')
conn = net_box.new('localhost', 3301)
s = 'box.execute([[START TRANSACTION;]]) '
s = s .. 'box.execute([[INSERT INTO t VALUES (1);]]) '
s = s .. 'box.execute([[ROLLBACK;]]) '
conn:eval(s)

COMMIT

Syntax:

COMMIT;

Commit an active transaction, so all changes are made permanent and the transaction ends.

COMMIT is illegal unless a transaction is active. If a transaction is not active then SQL statements are committed automatically.

Example:

COMMIT;

SAVEPOINT

Syntax:

SAVEPOINT savepoint-name;

Set a savepoint, so that ROLLBACK TO savepoint-name is possible.

SAVEPOINT is illegal unless a transaction is active.

If a savepoint with the same name already exists, it is released before the new savepoint is set.

Example:

SAVEPOINT x;

RELEASE SAVEPOINT

Syntax:

RELEASE SAVEPOINT savepoint-name;

Release (destroy) a savepoint created by a SAVEPOINT statement.

RELEASE is illegal unless a transaction is active.

Savepoints are released automatically when a transaction ends.

Example:

RELEASE SAVEPOINT x;

ROLLBACK

Syntax:

ROLLBACK [TO [SAVEPOINT] savepoint-name];

If ROLLBACK does not specify a savepoint-name, roll back an active transaction, so all changes since START TRANSACTION are cancelled, and the transaction ends.

If ROLLBACK does specify a savepoint-name, roll back an active transaction, so all changes since SAVEPOINT savepoint-name are cancelled, and the transaction does not end.

ROLLBACK is illegal unless a transaction is active.

Examples:

-- the simple form:
ROLLBACK;
-- the form so changes before a savepoint are not cancelled:
ROLLBACK TO SAVEPOINT x;

-- An example of a Lua function that will do a transaction
-- containing savepoint and rollback to savepoint.
function f()
  box.execute([[DROP TABLE IF EXISTS t;]]) -- commits automatically
  box.execute([[CREATE TABLE t (s1 STRING PRIMARY KEY);]]) -- commits automatically
  box.execute([[START TRANSACTION;]]) -- after this succeeds, a transaction is active
  box.execute([[INSERT INTO t VALUES ('Data change #1');]])
  box.execute([[SAVEPOINT "1";]])
  box.execute([[INSERT INTO t VALUES ('Data change #2');]])
  box.execute([[ROLLBACK TO SAVEPOINT "1";]]) -- roll back Data change #2
  box.execute([[ROLLBACK TO SAVEPOINT "1";]]) -- this is legal but does nothing
  box.execute([[COMMIT;]]) -- make Data change #1 permanent, end the transaction
end
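
To make the savepoint bookkeeping in this function easier to follow, here is a toy model in Python. It is a sketch under simplifying assumptions, not Tarantool's implementation: the Transaction class and its method names are invented, a transaction is just an ordered list of changes, and a savepoint is a remembered position in that list.

```python
class Transaction:
    """Toy model: an ordered list of changes plus named positions in it."""

    def __init__(self):
        self.changes = []
        self.savepoints = {}   # name -> number of changes when SAVEPOINT ran

    def insert(self, value):
        self.changes.append(value)

    def savepoint(self, name):
        # Setting a savepoint with an existing name replaces the old one.
        self.savepoints[name] = len(self.changes)

    def rollback_to(self, name):
        # Cancel the changes made after the savepoint; the transaction stays
        # active and the savepoint itself remains usable.
        del self.changes[self.savepoints[name]:]

    def commit(self):
        # Savepoints are released when the transaction ends.
        self.savepoints.clear()
        return list(self.changes)

tx = Transaction()
tx.insert('Data change #1')
tx.savepoint('1')
tx.insert('Data change #2')
tx.rollback_to('1')      # cancels Data change #2
tx.rollback_to('1')      # legal, nothing left to cancel
committed = tx.commit()
print(committed)
```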

Functions

Explanation of functions

Syntax:

function-name (one or more expressions)

Apply a built-in function to one or more expressions and return a scalar value.

Tarantool supports 33 built-in functions.

The maximum number of operands for any function is 127.

The required privileges for built-in functions will likely change in a future version.

List of functions

These are Tarantool/SQL’s built-in functions. Starting with Tarantool 2.10, for functions that require numeric arguments, function arguments with NUMBER data type are illegal.

ABS

Syntax:

ABS(numeric-expression)

Return the absolute value of numeric-expression, which can be any numeric type.

Example: ABS(-1) is 1.

CAST

Syntax:

CAST(expression AS data-type)

Return the expression value after casting to the specified data type.

CAST to/from UUID may change some components to/from little-endian.

Examples: CAST('AB' AS VARBINARY), CAST(X'4142' AS STRING)

CHAR

Syntax:

CHAR([numeric-expression [, numeric-expression ...]])

Return the characters whose Unicode code point values are equal to the numeric expressions.

Short example:

The first 128 Unicode characters are the “ASCII” characters, so CHAR(65, 66, 67) is ‘ABC’.

Long example:

For the current list of Unicode characters, in order by code point, see www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt. In that list, there is a line for a Linear B ideogram

100CC;LINEAR B IDEOGRAM B240 WHEELED CHARIOT ...

Therefore, for a string with a chariot in the middle, use the concatenation operator || and the CHAR function

'start of string ' || CHAR(0X100CC) || ' end of string'.

COALESCE

Syntax:

COALESCE(expression, expression [, expression ...])

Return the value of the first non-NULL expression, or, if all expression values are NULL, return NULL.

Example: COALESCE(NULL, 17, 32) is 17.

DATE_PART

Syntax:

DATE_PART(value_requested, datetime)

Since 2.10.0.

The DATE_PART() function returns the requested information from a DATETIME value. It takes two arguments: the first one tells us what information is requested, the second is a DATETIME value.

Below is a list of supported values of the first argument and what information is returned:

millennium – millennium
century – century
decade – decade
year – year
quarter – quarter of year
month – month of year
week – week of year
day – day of month
dow – day of week
doy – day of year
hour – hour of day
minute – minute of hour
second – second of minute
millisecond – millisecond of second
microsecond – microsecond of second
nanosecond – nanosecond of second
epoch – epoch
timezone_offset – time zone offset from the UTC, in minutes.

Examples:

tarantool> select date_part('millennium', cast({'year': 2000, 'month': 4, 'day': 5, 'hour': 6, 'min': 33, 'sec': 22, 'nsec': 523999111} as datetime));
---
- metadata:
  - name: COLUMN_1
    type: integer
  rows:
  - [2]
...

tarantool> select date_part('day', cast({'year': 2000, 'month': 4, 'day': 5, 'hour': 6, 'min': 33, 'sec': 22, 'nsec': 523999111} as datetime));
---
- metadata:
  - name: COLUMN_1
    type: integer
  rows:
  - [5]
...

tarantool> select date_part('nanosecond', cast({'year': 2000, 'month': 4, 'day': 5, 'hour': 6, 'min': 33, 'sec': 22, 'nsec': 523999111} as datetime));
---
- metadata:
  - name: COLUMN_1
    type: integer
  rows:
  - [523999111]
...

GREATEST

Syntax:

GREATEST(expression-1, expression-2, [expression-3 ...])

Return the greatest value of the supplied expressions, or, if any expression is NULL, return NULL. The reverse of GREATEST is LEAST.

Examples: GREATEST(7, 44, -1) is 44; GREATEST(1E308, 'a', 0, X'00') is ‘\0’ = the nul character; GREATEST(3, NULL, 2) is NULL.

HEX

Syntax:

HEX(expression)

Return the hexadecimal code for each byte in expression.

Starting with Tarantool version 2.10.0, the expression must be a byte sequence (data type VARBINARY).

In earlier versions of Tarantool, the expression could be either a string or a byte sequence. For ASCII characters, this was straightforward because the encoding is the same as the code point value. For non-ASCII characters, since character strings are usually encoded in UTF-8, each character will require two or more bytes.

Examples:

HEX(X'41') will return 41.
HEX(CAST('Д' AS VARBINARY)) will return D094.

IFNULL

Syntax:

IFNULL(expression, expression)

Return the value of the first non-NULL expression, or, if both expression values are NULL, return NULL. Thus IFNULL(expression, expression) is the same as COALESCE(expression, expression).

Example: IFNULL(NULL, 17) is 17.

LEAST

Syntax:

LEAST(expression-1, expression-2, [expression-3 ...])

Return the least value of the supplied expressions, or, if any expression is NULL, return NULL. The reverse of LEAST is GREATEST.

Examples: LEAST(7, 44, -1) is -1; LEAST(1E308, 'a', 0, X'00') is 0; LEAST(3, NULL, 2) is NULL.

LENGTH

Syntax:

LENGTH(expression)

Return the number of characters in the expression, or the number of bytes in the expression. It depends on the data type: strings with data type STRING are counted in characters, byte sequences with data type VARBINARY are counted in bytes and are not ended by the nul character. There are two aliases for LENGTH(expression) – CHAR_LENGTH(expression) and CHARACTER_LENGTH(expression) do the same thing.

Examples:

LENGTH('ДД') is 2, the string has 2 characters.
LENGTH(CAST('ДД' AS VARBINARY)) is 4, the byte sequence has 4 bytes.
LENGTH(CHAR(0, 65)) is 2, ‘\0’ does not mean ‘end of string’.
LENGTH(X'410041') is 3, X’…’ byte sequences have type VARBINARY.

LIKELIHOOD

Syntax:

LIKELIHOOD(expression, DOUBLE literal)

Return the expression without change, provided that the numeric literal is between 0.0 and 1.0.

Example: LIKELIHOOD('a' = 'b', .0) is FALSE

LIKELY

Syntax:

LIKELY(expression)

Return TRUE if the expression is probably true.

Example: LIKELY('a' <= 'b') is TRUE

LOWER

Syntax:

LOWER(string-expression)

Return the expression, with upper-case characters converted to lower case. The reverse of LOWER is UPPER.

Example: LOWER('ДA') is ‘дa’

NOW

Syntax:

NOW()

Since 2.10.0.

The NOW() function returns the current date and time as a DATETIME value.

If the function is called more than once in a query, it returns the same result until the query completes, unless a yield has occurred. After a yield, the value returned by NOW() changes.

Examples:

tarantool> select now(), now(), now()
---
- metadata:
  - name: COLUMN_1
    type: datetime
  - name: COLUMN_2
    type: datetime
  - name: COLUMN_3
    type: datetime
  rows:
  - ['2022-07-20T19:02:02.010812282+0300', '2022-07-20T19:02:02.010812282+0300', '2022-07-20T19:02:02.010812282+0300']
...

NULLIF

Syntax:

NULLIF(expression-1, expression-2)

Return expression-1 if expression-1 <> expression-2, otherwise return NULL.

Examples:

NULLIF('a', 'A') is ‘a’.
NULLIF(1.00, 1) is NULL.

Note

Before Tarantool 2.10.4, the type of the result was always SCALAR. Since Tarantool 2.10.4, the result of NULLIF matches the type of the first argument. If the first argument is the NULL literal, then the result has the SCALAR type.

POSITION

Syntax:

POSITION(expression-1, expression-2)

Return the position of expression-1 within expression-2, or return 0 if expression-1 does not appear within expression-2. The data types of the expressions must be either STRING or VARBINARY. If the expressions have data type STRING, then the result is the character position. If the expressions have data type VARBINARY, then the result is the byte position.

Short example: POSITION('C', 'ABC') is 3

Long example: The UTF-8 encoding for the Latin letter A is hexadecimal 41; the UTF-8 encoding for the Cyrillic letter Д is hexadecimal D094 – you can confirm this by saying SELECT HEX(CAST('ДA' AS VARBINARY)); and seeing that the result is ‘D09441’. If you now execute SELECT POSITION('A', 'ДA'); the result will be 2, because ‘A’ is the second character in the string. However, if you now execute SELECT POSITION(X'41', X'D09441'); the result will be 3, because X’41’ is the third byte in the byte sequence.
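
The character-versus-byte distinction can be reproduced outside Tarantool. In the Python sketch below, str plays the role of STRING and bytes the role of VARBINARY; sql_position is an invented helper that only mimics the 1-based result and the 0 returned for "not found".

```python
def sql_position(needle, haystack):
    """1-based position of needle in haystack, or 0 if it does not occur."""
    # str.find and bytes.find return -1 when nothing is found, so adding 1 gives 0.
    return haystack.find(needle) + 1

text = 'ДA'
data = text.encode('utf-8')            # the same value as a byte sequence

print(data.hex().upper())              # hexadecimal form of the UTF-8 bytes
print(sql_position('A', text))         # counted in characters
print(sql_position(b'\x41', data))     # counted in bytes
```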

PRINTF

Syntax:

PRINTF(string-expression [, expression ...])

Return a string formatted according to the rules of the C sprintf() function, where %d%s means the next two arguments are a numeric and a string, and so on.

If an argument is missing or is NULL, it becomes:

‘0’ if the format requires an integer,
‘0.0’ if the format requires a numeric with a decimal point,
‘’ if the format requires a string.

Example: PRINTF('%da', 5) is ‘5a’.

QUOTE

Syntax:

QUOTE(string-argument)

Return a string with enclosing quotes if necessary, and with quotes inside the enclosing quotes if necessary. This function is useful for creating strings which are part of SQL statements, because of SQL’s rules that string literals are enclosed by single quotes, and single quotes inside such strings are shown as two single quotes in a row.

Starting with Tarantool version 2.10, arguments with numeric data types are returned without change.

Example: QUOTE('a') is 'a'. QUOTE(5) is 5.

RAISE

Syntax:

RAISE(FAIL, error-message)

This may only be used within a triggered statement. See also Trigger Activation.

RANDOM

Syntax:

RANDOM()

Return a 19-digit integer which is generated by a pseudo-random number generator.

Example: RANDOM() is 6832175749978026034, or it is any other integer

RANDOMBLOB

Syntax:

RANDOMBLOB(n)

Return a byte sequence, n bytes long, data type = VARBINARY, containing bytes generated by a pseudo-random byte generator. The result can be translated to hexadecimal. If n is less than 1 or is NULL or is infinity, then NULL is returned.

Example: HEX(RANDOMBLOB(3)) is ‘9EAAA8’, or it is the hex value for any other three-byte string

REPLACE

Syntax:

REPLACE(expression-1, expression-2, expression-3)

Return expression-1, except that wherever expression-1 contains expression-2, replace expression-2 with expression-3. The expressions should all have data type STRING or VARBINARY.

Example: REPLACE('AAABCCCBD', 'B', '!') is ‘AAA!CCC!D’

ROUND

Syntax:

ROUND(numeric-expression-1 [, numeric-expression-2])

Return the rounded value of numeric-expression-1, always rounding .5 upward for positive numerics or downward for negative numerics. If numeric-expression-2 is supplied then rounding is to the nearest numeric-expression-2 digits after the decimal point; if numeric-expression-2 is not supplied then rounding is to the nearest integer.

Example: ROUND(-1.5) is -2, ROUND(1.7766E1,2) is 17.77.
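
The rounding rule described above (ties go away from zero) differs from the default in some languages, so a model may be useful. The sketch below uses Python's decimal module with ROUND_HALF_UP, which rounds ties away from zero; sql_round is an invented name, and the function models only the rule stated here, not Tarantool's handling of floating-point edge cases.

```python
from decimal import Decimal, ROUND_HALF_UP

def sql_round(value, digits=0):
    """Round to `digits` places after the decimal point, ties away from zero."""
    quantum = Decimal(1).scaleb(-digits)      # digits=2 gives Decimal('0.01')
    # Going through str() avoids binary floating-point noise in the input.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

print(sql_round(-1.5))          # tie, rounded downward for a negative number
print(sql_round(1.7766E1, 2))   # 17.766 rounded to two digits
print(sql_round(2.5))           # tie, rounded upward for a positive number
```

Python's built-in round(2.5) returns 2 because it rounds ties to the nearest even number, which is why the sketch does not use it.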

ROW_COUNT

Syntax:

ROW_COUNT()

Return the number of rows that were inserted / updated / deleted by the last INSERT or UPDATE or DELETE or REPLACE statement. Rows which were updated by an UPDATE statement are counted even if there was no change. Rows which were inserted / updated / deleted due to foreign-key action are not counted. Rows which were inserted / updated / deleted due to a view’s INSTEAD OF triggers are not counted. After a CREATE or DROP statement, ROW_COUNT() is 1. After other statements, ROW_COUNT() is 0.

Example: ROW_COUNT() is 1 after a successful INSERT of a single row.

Special rule if there are BEFORE or AFTER triggers: In effect the ROW_COUNT() counter is pushed at the beginning of a series of triggered statements, and popped at the end. Therefore, after the following statements:

CREATE TABLE t1 (s1 INTEGER PRIMARY KEY);
CREATE TABLE t2 (s1 INTEGER, s2 STRING, s3 INTEGER, PRIMARY KEY (s1, s2, s3));
CREATE TRIGGER tt1 BEFORE DELETE ON t1 FOR EACH ROW BEGIN
  INSERT INTO t2 VALUES (old.s1, '#2 Triggered', ROW_COUNT());
  INSERT INTO t2 VALUES (old.s1, '#3 Triggered', ROW_COUNT());
  END;
INSERT INTO t1 VALUES (1),(2),(3);
DELETE FROM t1;
INSERT INTO t2 VALUES (4, '#4 Untriggered', ROW_COUNT());
SELECT * FROM t2;

The result is:

---
- - [1, '#2 Triggered', 3]
  - [1, '#3 Triggered', 1]
  - [2, '#2 Triggered', 3]
  - [2, '#3 Triggered', 1]
  - [3, '#2 Triggered', 3]
  - [3, '#3 Triggered', 1]
  - [4, '#4 Untriggered', 3]
...

SOUNDEX

Syntax:

SOUNDEX(string-expression)

Return a four-character string which represents the sound of string-expression. Often words and names which have different spellings will have the same Soundex representation if they are pronounced similarly, so it is possible to search by what they sound like. The algorithm works with characters in the Latin alphabet and works best with English words.

Example: SOUNDEX('Crater') and SOUNDEX('Creature') both return C636.
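
To show why the two words in the example collapse to the same code, here is a sketch of the classic American Soundex algorithm in Python. It is an assumption that this matches Tarantool's SOUNDEX for ordinary English words; edge cases (empty input, non-Latin letters, the treatment of H and W) may differ, and the empty-string return for input without Latin letters is an arbitrary choice of this sketch.

```python
# Letter groups of classic Soundex; vowels and H, W, Y have no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit

def soundex(word):
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""                      # arbitrary choice for this sketch
    result = letters[0]                # the first letter is kept as is
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != previous:  # skip vowels and repeated codes
            result += code
        if c not in "HW":              # H and W do not separate equal codes
            previous = code
    return (result + "000")[:4]        # pad or cut to four characters

print(soundex('Crater'), soundex('Creature'))
```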

SUBSTR

Syntax:

SUBSTR(string-or-varbinary-value, numeric-start-position [, numeric-length])

If string-or-varbinary-value has data type STRING, then return the substring which begins at character position numeric-start-position and continues for numeric-length characters (if numeric-length is supplied), or continues till the end of string-or-varbinary-value (if numeric-length is not supplied).

If numeric-start-position is less than 1, or if numeric-start-position + numeric-length is greater than the length of string-or-varbinary-value, then the result is not an error, anything which would be before the start or after the end is ignored. There are no symbols with index <= 0 or with index greater than the length of the first argument.

If numeric-length is less than 0, then the result is an error.

If string-or-varbinary-value has data type VARBINARY rather than data type STRING, then positioning and counting is by bytes rather than by characters.

Examples: SUBSTR('ABCDEF', 3, 2) is ‘CD’, SUBSTR('абвгде', -1, 4) is ‘аб’
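
The clipping rule for start positions below 1 is easy to get wrong, so here is a Python model of the rules stated above. The name sql_substr is invented; str stands in for STRING (counted in characters) and bytes for VARBINARY (counted in bytes).

```python
def sql_substr(value, start, length=None):
    """Model of SUBSTR with 1-based positions and out-of-range parts ignored."""
    if length is not None and length < 0:
        raise ValueError("numeric-length must not be negative")
    # Requested range in 1-based positions, end exclusive.
    end = len(value) + 1 if length is None else start + length
    # Positions before 1 or after the last symbol do not exist, so clip to them.
    first = max(start, 1)
    last = min(end, len(value) + 1)
    if first >= last:
        return value[:0]               # empty result of the same type
    return value[first - 1:last - 1]

print(sql_substr('ABCDEF', 3, 2))
print(sql_substr('абвгде', -1, 4))     # positions -1 and 0 are ignored
print(sql_substr(b'\xd0\x94A', 3))     # byte positions for a byte sequence
```

In the second call the requested positions are -1, 0, 1 and 2; only positions 1 and 2 exist, so two characters are returned rather than four.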

TRIM

Syntax:

TRIM([[LEADING|TRAILING|BOTH] [expression-1] FROM] expression-2)

Return expression-2 after removing all leading and/or trailing characters or bytes. The expressions should have data type STRING or VARBINARY. If LEADING|TRAILING|BOTH is omitted, the default is BOTH. If expression-1 is omitted, the default is ‘ ‘ (space) for data type STRING or X’00’ (nul) for data type VARBINARY.

Examples:

TRIM('a' FROM 'abaaaaa') is ‘b’ – all repetitions of ‘a’ are removed on both sides;
TRIM(TRAILING 'ב' FROM 'אב') is ‘א’ – if all characters are Hebrew, TRAILING means “left”;
TRIM(X'004400') is X'44' – the default byte sequence to trim is X'00' when data type is VARBINARY;
TRIM(LEADING 'abc' FROM 'abcd') is ‘d’ – expression-1 can have more than 1 character.

TYPEOF

Syntax:

TYPEOF(expression)

Return ‘NULL’ if the expression is NULL, or return ‘scalar’ if the expression is the name of a column defined as SCALAR, or return the data type of the expression.

Examples:

TYPEOF('A') returns ‘string’; TYPEOF(RANDOMBLOB(1)) returns ‘varbinary’; TYPEOF(1e44) returns ‘double’ or ‘number’; TYPEOF(-44) returns ‘integer’; TYPEOF(NULL) returns ‘NULL’

Prior to Tarantool version 2.10, TYPEOF(expression) simply returned the data type of the expression for all cases.

UNICODE

Syntax:

UNICODE(string-expression)

Return the Unicode code point value of the first character of string-expression. If string-expression is empty, the return is NULL. This is the reverse of CHAR(integer).

Example: UNICODE('Щ') is 1065 (hexadecimal 0429).

UNLIKELY

Syntax:

UNLIKELY(expression)

Return TRUE if the expression is probably false. Limitation: in fact UNLIKELY may return the same thing as LIKELY.

Example: UNLIKELY('a' <= 'b') is TRUE.

UPPER

Syntax:

UPPER(string-expression)

Return the expression, with lower-case characters converted to upper case. The reverse of UPPER is LOWER.

Example: UPPER('-4щl') is ‘-4ЩL’.

UUID

Syntax:

UUID([integer])

Return a Universal Unique Identifier, data type UUID. Optionally one can specify a version number; however, at this time the only allowed version is 4, which is the default. UUID support in SQL was added in Tarantool version 2.9.1.

Example: UUID() or UUID(4)

VERSION

Syntax:

VERSION()

Return the Tarantool version.

Example: for a February 2020 build VERSION() is '2.4.0-35-g57f6fc932'.

ZEROBLOB

Syntax:

ZEROBLOB(n)

Return a byte sequence, data type = VARBINARY, n bytes long.

COLLATE clause

COLLATE collation-name

The collation-name must identify an existing collation.

The COLLATE clause is allowed for STRING or SCALAR items:
* in CREATE INDEX
* in CREATE TABLE as part of column definition
* in CREATE TABLE as part of UNIQUE definition
* in string expressions

Examples:

-- In CREATE INDEX
CREATE INDEX idx_unicode_mb_1 ON mb (s1 COLLATE "unicode");
-- In CREATE TABLE
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 STRING COLLATE "unicode_ci");
-- In CREATE TABLE ... UNIQUE
CREATE TABLE mb (a STRING, b STRING, PRIMARY KEY(a), UNIQUE(b COLLATE "unicode_ci" DESC));
-- In string expressions
SELECT 'a' = 'b' COLLATE "unicode"
    FROM t
    WHERE s1 = 'b' COLLATE "unicode"
    ORDER BY s1 COLLATE "unicode";

The list of collations can be seen with: PRAGMA collation_list;

The collation rules comply completely with the Unicode Technical Standard #10 (“Unicode Collation Algorithm”) and the default character order is as in the Default Unicode Collation Element Table (DUCET). There are many permanent collations; the commonly used ones include:
"none" (not applicable)
"unicode" (characters are in DUCET order with strength = ‘tertiary’)
"unicode_ci" (characters are in DUCET order with strength = ‘primary’)
"binary" (characters are in code point order)
These identifiers must be quoted and in lower case because they are in lower case in Tarantool/NoSQL collations.

If one says COLLATE "binary", this is equivalent to asking for what is sometimes called “code point order” because, if the contents are in the UTF-8 character set, characters with larger code points will appear after characters with lower code points.

In an expression, COLLATE is an operator with higher precedence than anything except ~. This is fine because there are no other useful operators except || and comparison. After ||, collation is preserved.

In an expression with more than one COLLATE clause, if the collation names differ, there is an error: “Illegal mix of collations”. In an expression with no COLLATE clauses, literals have collation "binary", columns have the collation specified by CREATE TABLE.

In other words, to pick a collation, Tarantool uses:
the first COLLATE clause in an expression if it was specified,
else the column’s COLLATE clause if it was specified,
else "binary".

However, for searches and sometimes for sorting, the collation may be an index’s collation, so all non-index COLLATE clauses are ignored.

EXPLAIN will not show the name of what collation was used, but will show the collation’s characteristics.
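
As a small illustration of the rules for picking a collation, the following sketch compares the same two literals with and without a COLLATE clause. It assumes a running Tarantool instance on which box.cfg{} has already been called; the variable names are arbitrary.

```lua
-- With no COLLATE clause, literals have collation "binary",
-- so 'a' and 'A' are different.
local binary_result = box.execute([[SELECT 'a' = 'A';]])

-- An explicit COLLATE clause takes priority. "unicode_ci" has
-- strength = 'primary', so differences of case are ignored.
local ci_result = box.execute([[SELECT 'a' = 'A' COLLATE "unicode_ci";]])

-- Each result has a rows field; the value is in the first column
-- of the first row.
print(binary_result.rows[1][1], ci_result.rows[1][1])
```

The first comparison should be false and the second should be true.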

Example with Swedish collation:
Knowing that “sv” is the two-letter code for Swedish,
and knowing that “s1” means strength = 1,
and seeing with PRAGMA collation_list; that there is a collation named unicode_sv_s1,
check whether two strings are equal according to Swedish rules (yes they are):
SELECT 'ÄÄ' = 'ĘĘ' COLLATE "unicode_sv_s1";

Example with Russian and Ukrainian and Kyrgyz collations:
Knowing that Russian collation is practically the same as Unicode default,
and knowing that the two-letter codes for Ukrainian and Kyrgyz are ‘uk’ and ‘ky’,
and knowing that in Russian (but not Ukrainian) ‘Г’ = ‘Ґ’ with strength=primary,
and knowing that in Russian (but not Kyrgyz) ‘Е’ = ‘Ё’ with strength=primary,
the three SELECT statements here will return results in three different orders:
CREATE TABLE things (remark STRING PRIMARY KEY);
INSERT INTO things VALUES ('Е2'), ('Ё1');
INSERT INTO things VALUES ('Г2'), ('Ґ1');
SELECT remark FROM things ORDER BY remark COLLATE "unicode";
SELECT remark FROM things ORDER BY remark COLLATE "unicode_uk_s1";
SELECT remark FROM things ORDER BY remark COLLATE "unicode_ky_s1";

Default function parameters

Starting in Tarantool 2.10, if a parameter for an aggregate function or a built-in scalar SQL function is one of the extra-parameters that can appear in box.execute(…[,extra-parameters]) requests, the default data type is determined as follows:
* When there is only one possible data type, it is default.
Example: box.execute([[SELECT TYPEOF(LOWER(?));]],{x}) is ‘string’.
* When possible data types are INTEGER or DOUBLE or DECIMAL, DECIMAL is default.
Example: box.execute([[SELECT TYPEOF(AVG(?));]],{x}) is ‘decimal’.
* When possible data types are STRING or VARBINARY, STRING is default.
Example: box.execute([[SELECT TYPEOF(LENGTH(?));]],{x}) is ‘string’.
* When possible data types are any other scalar data type, SCALAR is default.
Example: box.execute([[SELECT TYPEOF(GREATEST(?,5));]],{x}) is ‘scalar’.
* When possible data type is a non-scalar data type, such as ARRAY, result is undefined.
* Otherwise, there is no default.
Example: box.execute([[SELECT TYPEOF(LIKELY(?));]],{x}) is the name of one of the primitive data types.

SQL PLUS LUA – Adding Tarantool/NoSQL to Tarantool/SQL

The Adding Tarantool/NoSQL To Tarantool/SQL Guide contains descriptions of NoSQL database objects that can be accessed from SQL, of SQL database objects that can be accessed from NoSQL, of the way to call SQL from Lua, and of the way to call Lua from SQL.

Heading	Summary
Lua requests	Some Lua requests that are especially useful for SQL, such as requests to grant privileges
System tables	Looking at Lua sysview spaces such as _space
Calling Lua routines from SQL	Tarantool’s implementation of SQL stored procedures
Executing Lua chunks	The LUA(…) function
Example sessions	Million-row insert, etc.
Lua functions to make views of metadata	Making equivalents to standard-SQL information_schema tables

Lua Requests

A great deal of functionality is not specifically part of Tarantool’s SQL feature, but is part of the Tarantool Lua application server and DBMS. Here are some examples so it is clear where to look in other sections of the Tarantool manual.

NoSQL “spaces” can be accessed as SQL "tables", and vice versa. For example, suppose a table has been created with
CREATE TABLE things (id INTEGER PRIMARY KEY, remark SCALAR);

This is viewable from Tarantool’s NoSQL feature as a memtx space named THINGS with a primary-key TREE index …

tarantool> box.space.THINGS
---
- engine: memtx
  before_replace: 'function: 0x40bb4608'
  on_replace: 'function: 0x40bb45e0'
  ck_constraint: []
  field_count: 2
  temporary: false
  index:
    0: &0
      unique: true
      parts:
      - type: integer
        is_nullable: false
        fieldno: 1
      id: 0
      space_id: 520
      type: TREE
      name: pk_unnamed_THINGS_1
    pk_unnamed_THINGS_1: *0
  is_local: false
  enabled: true
  name: THINGS
  id: 520

The NoSQL basic data operation requests select, insert, replace, upsert, update, delete will all work. Particularly interesting are the requests that come only via NoSQL.

To create an index on things (remark) with a non-default option, for example a special id, say:
box.space.THINGS:create_index('idx_100_things_2', {id=100, parts={2, 'scalar'}})

(If the SQL data type name is SCALAR, then the NoSQL type is ‘scalar’, as described earlier. See the chart in section Operands.)

To grant database-access privileges to user ‘guest’, say
box.schema.user.grant('guest', 'execute', 'universe')
To grant SELECT privileges on table things to user ‘guest’, say
box.schema.user.grant('guest', 'read', 'space', 'THINGS')
To grant UPDATE privileges on table things to user ‘guest’, say:
box.schema.user.grant('guest', 'read,write', 'space', 'THINGS')
To grant DELETE or INSERT privileges on table things if no reading is involved, say:
box.schema.user.grant('guest', 'write', 'space', 'THINGS')
To grant DELETE or INSERT privileges on table things if reading is involved, say:
box.schema.user.grant('guest', 'read,write', 'space', 'THINGS')
To grant CREATE TABLE privilege to user ‘guest’, say
box.schema.user.grant('guest', 'read,write', 'space', '_schema')
box.schema.user.grant('guest', 'read,write', 'space', '_space')
box.schema.user.grant('guest', 'read,write', 'space', '_index')
box.schema.user.grant('guest', 'create', 'space')
To grant CREATE TRIGGER privilege to user ‘guest’, say
box.schema.user.grant('guest', 'read', 'space', '_space')
box.schema.user.grant('guest', 'read,write', 'space', '_trigger')
To grant CREATE INDEX privilege to user ‘guest’, say
box.schema.user.grant('guest', 'read,write', 'space', '_index')
box.schema.user.grant('guest', 'create', 'space')
To grant CREATE TABLE … INTEGER PRIMARY KEY AUTOINCREMENT privilege to user ‘guest’, say
box.schema.user.grant('guest', 'read,write', 'space', '_schema')
box.schema.user.grant('guest', 'read,write', 'space', '_space')
box.schema.user.grant('guest', 'read,write', 'space', '_index')
box.schema.user.grant('guest', 'create', 'space')
box.schema.user.grant('guest', 'read,write', 'space', '_space_sequence')
box.schema.user.grant('guest', 'read,write', 'space', '_sequence')
box.schema.user.grant('guest', 'create', 'sequence')

To write a stored procedure that inserts 5 rows in things, say
function f() for i = 3, 7 do box.space.THINGS:insert{i, i} end end
For client-side API functions, see section “Connectors”.

To make spaces with field names that SQL can understand, use space_object:format(). (Exception: in Tarantool/NoSQL it is legal for tuples to have more fields than are described by a format clause, but in Tarantool/SQL such fields will be ignored.)
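
Here is a minimal sketch of that approach. It assumes box.cfg{} has already been called. The space name T2 and its field names are hypothetical; they are in upper case so that SQL can refer to them without "quotes".

```lua
-- Create a NoSQL space and describe its fields with format().
local s = box.schema.space.create('T2', {if_not_exists = true})
s:format({
    {name = 'ID', type = 'integer'},
    {name = 'REMARK', type = 'string'},
})
-- A primary-key index is required before any tuple can be inserted.
s:create_index('primary', {parts = {1, 'integer'}, if_not_exists = true})
s:replace{1, 'first'}

-- The field names from format() are now usable as SQL column names.
local result = box.execute([[SELECT id, remark FROM t2;]])
print(result.rows[1][2])
```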

To handle replication and sharding of SQL data, see section Sharding.

To enhance performance of SQL statements by preparing them in advance, see section box.prepare().

To call SQL from Lua, see section box.execute([[…]]).
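
As a rough illustration of the last two points, this sketch runs the same query once with box.execute() and a bound parameter, and once through a statement prepared with box.prepare(). It assumes a running instance; the CREATE TABLE IF NOT EXISTS statement and the sample row are there only to make the sketch self-contained.

```lua
-- Make sure the table from the earlier example exists and has a row.
box.execute([[CREATE TABLE IF NOT EXISTS things
              (id INTEGER PRIMARY KEY, remark SCALAR);]])
box.execute([[REPLACE INTO things VALUES (?, ?);]], {3, 'three'})

-- One-off call: the ? placeholder takes its value from the second argument.
local once = box.execute([[SELECT * FROM things WHERE id = ?;]], {3})

-- Prepared call: parse once, execute with parameters, then release.
local stmt = box.prepare([[SELECT * FROM things WHERE id = ?;]])
local prepared = stmt:execute({3})
stmt:unprepare()

print(once.rows[1][2], prepared.rows[1][2])
```

Preparing is mainly worthwhile when the same statement is executed many times with different parameter values.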

Limitations: (Issue#2368)
* after box.schema.user.grant('guest','read,write,execute','universe'), user 'guest' can create tables. But this is a powerful set of privileges.

Limitations: (Issue#4659, Issue#4757, Issue#4758)
SELECT with * or ORDER BY or GROUP BY from spaces that have map fields or array fields may cause errors. Any access to spaces that have hash indexes may cause severe errors in Tarantool version 2.3 or earlier.

System Tables

There is a way to get some information about the database objects, for example the names of all the tables and their indexes, using SELECT statements. This is done by looking at special read-only tables which Tarantool updates automatically whenever objects are created or dropped. See the submodule box.space overview section. Names of system tables are in lower case so always enclose them in "quotes".

For example, the _space system table has these fields which are seen in SQL as columns:
id = numeric identifier
owner = for example, 1 if the object was made by the 'admin' user
name = the name that was used with CREATE TABLE
engine = usually 'memtx' (the 'vinyl' engine can be used but is not default)
field_count = sometimes 0, but usually a count of the table’s columns
flags = usually empty
format = what a Lua format() function or an SQL CREATE statement produced
Example selection:
SELECT "id", "name" FROM "_space";

Calling Lua routines from SQL

SQL statements can invoke functions that are written in Lua. This is Tarantool’s equivalent for the “stored procedure” feature found in other SQL DBMSs. Tarantool server-side stored procedures are written in Lua rather than SQL/PSM dialect.

Functions can be invoked anywhere that the SQL syntax allows a literal or a column name for reading. Function parameters can include any number of SQL values. If a SELECT statement’s result set has a million rows, and the select list invokes a non-deterministic function, then the function is called a million times.

To create a Lua function that you can call from SQL, use box.schema.func.create(func-name, {options-with-body}) with these additional options:

exports = {'LUA', 'SQL'} – This indicates what languages can call the function. The default is 'LUA'. Specify both: 'LUA', 'SQL'.

param_list = {list} – This is the list of parameters. Specify the Lua type names for each parameter of the function. Remember that a Lua type name is the same as an SQL data type name, in lower case. The Lua type should not be an array.

Also it is good to specify {is_deterministic = true} if possible, because that may allow Tarantool to generate more efficient SQL byte code.

For a useful example, here is a general function for decoding a single Lua 'map' field:

box.schema.func.create('_DECODE',
   {language = 'LUA',
    returns = 'string',
    body = [[function (field, key)
             -- If Tarantool version < 2.10.1, replace next line with
             -- return require('msgpack').decode(field)[key]
             return field[key]
             end]],
    is_sandboxed = false,
    -- If Tarantool version < 2.10.1, replace next line with
    -- param_list = {'string', 'string'},
    param_list = {'map', 'string'},
    exports = {'LUA', 'SQL'},
    is_deterministic = true})

See it work with, say, the _trigger space. That space has a 'map' field named opts which has a key named sql. By selecting from the space and passing the field and the key name to _DECODE, you can get a list of all the trigger bodies.

box.execute([[SELECT _decode("opts", 'sql') FROM "_trigger";]])

Remember that SQL converts regular identifiers to upper case, so this example works with a function named _DECODE. If the function had been named _decode, then the SELECT statement would have to be:
box.execute([[SELECT "_decode"("opts", 'sql') FROM "_trigger";]])

Here is another example, which illustrates the way that Tarantool creates a view which includes the table_name and table_type columns in the same way that the standard-SQL information_schema.tables view contains them. The difficulty is that, in order to discover whether table_type should be 'BASE TABLE' or should be 'VIEW', it is necessary to know the value of the "flags" field in the Tarantool/NoSQL “_space” or "_vspace" space. The "flags" field type is "map", which SQL does not understand well. If there were no Lua functions, it would be necessary to treat the field as a VARBINARY and look for POSITION(X'A476696577C3',"flags") > 0 (A4 is a MsgPack signal that a 4-byte string follows, 76696577 is UTF8 encoding for ‘view’, C3 is a MsgPack code meaning true). In any case, starting with Tarantool version 2.10, POSITION() does not work on VARBINARY operands. But there is a more sophisticated way, namely, creating a function that returns true if "flags".view is true. So for this case the way to make the function looks like this:

box.schema.func.create('TABLES_IS_VIEW',
     {language = 'LUA',
      returns = 'boolean',
      body = [[function (flags)
          local view
          -- If Tarantool version < 2.10.1, replace next line with
          -- view = require('msgpack').decode(flags).view
          view = flags.view
          if view == nil then return false end
          return view
          end]],
     is_sandboxed = false,
     -- If Tarantool version < 2.10.1, replace next line with
     -- param_list = {'string'},
     param_list = {'map'},
     exports = {'LUA', 'SQL'},
     is_deterministic = true})

And this creates the view:

box.execute([[
CREATE VIEW vtables AS SELECT
"name" AS table_name,
CASE WHEN tables_is_view("flags") == TRUE THEN 'VIEW'
     ELSE 'BASE TABLE' END AS table_type,
"id" AS id,
"engine" AS engine,
(SELECT "name" FROM "_vuser" x
 WHERE x."id" = y."owner") AS owner,
"field_count" AS field_count
FROM "_vspace" y;
]])

Remember that these Lua functions are persistent, so if the server has to be restarted then they do not have to be re-declared.

Executing Lua chunks

To execute Lua code without creating a function, use:
LUA(Lua-code-string)
where Lua-code-string is any amount of Lua code. The string should begin with 'return '.

For example this will show the number of seconds since the epoch:
box.execute([[SELECT lua('return os.time()');]])
For example this will show a database configuration member:
box.execute([[SELECT lua('return box.cfg.memtx_memory');]])
For example this will return FALSE because Lua nil and box.NULL are the same as SQL NULL:
box.execute([[SELECT lua('return box.NULL') IS NOT NULL;]])

Warning: the SQL statement must not invoke a Lua function, or execute a Lua chunk, that accesses a space that underlies any SQL table that the SQL statement accesses. For example, if function f() contains a request "box.space.TEST:insert{0}", then the SQL statement "SELECT f() FROM test;" will try to access the same space in two ways. The results of such conflict may include a hang or an infinite loop.

Example Sessions

Example Session – Create, Insert, Select

Assume that the task is to create two tables, put some rows in each table, create a view that is based on a join of the tables, then select from the view all rows where the second column values are not null, ordered by the first column.

That is, the way to populate the table is
CREATE TABLE t1 (c1 INTEGER PRIMARY KEY, c2 STRING);
CREATE TABLE t2 (c1 INTEGER PRIMARY KEY, x2 STRING);
INSERT INTO t1 VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO t1 VALUES (4, 'D'), (5, 'E'), (6, 'F');
INSERT INTO t2 VALUES (1, 'C'), (4, 'A'), (6, NULL);
CREATE VIEW v AS SELECT * FROM t1 NATURAL JOIN t2;
SELECT * FROM v WHERE c2 IS NOT NULL ORDER BY c1;

So the session looks like this:
box.cfg{}
box.execute([[CREATE TABLE t1 (c1 INTEGER PRIMARY KEY, c2 STRING);]])
box.execute([[CREATE TABLE t2 (c1 INTEGER PRIMARY KEY, x2 STRING);]])
box.execute([[INSERT INTO t1 VALUES (1, 'A'), (2, 'B'), (3, 'C');]])
box.execute([[INSERT INTO t1 VALUES (4, 'D'), (5, 'E'), (6, 'F');]])
box.execute([[INSERT INTO t2 VALUES (1, 'C'), (4, 'A'), (6, NULL);]])
box.execute([[CREATE VIEW v AS SELECT * FROM t1 NATURAL JOIN t2;]])
box.execute([[SELECT * FROM v WHERE c2 IS NOT NULL ORDER BY c1;]])

If one executes the above requests with Tarantool as a client, provided the database objects do not already exist, the execution will be successful and the final display will be

tarantool> box.execute([[SELECT * FROM v WHERE c2 IS NOT NULL ORDER BY c1;]])
---
- - [1, 'A', 'C']
  - [4, 'D', 'A']
  - [6, 'F', null]

Example Session – Get a List of Columns

Here is a function which will create a table that contains a list of all the columns and their Lua types, for all tables. It is not a necessary function because one can create a _COLUMNS view instead. It merely shows, with simpler Lua code, how to make a base table instead of a view.

function create_information_schema_columns()
  box.execute([[DROP TABLE IF EXISTS information_schema_columns;]])
  box.execute([[CREATE TABLE information_schema_columns (
                    table_name STRING,
                    column_name STRING,
                    ordinal_position INTEGER,
                    data_type STRING,
                    PRIMARY KEY (table_name, column_name));]]);
  local space = box.space._vspace:select()
  local sqlstring = ''
  for i = 1, #space do
      for j = 1, #space[i][7] do
          sqlstring = "INSERT INTO information_schema_columns VALUES ("
                  .. "'" .. space[i][3] .. "'"
                  .. ","
                  .. "'" .. space[i][7][j].name .. "'"
                  .. ","
                  .. j
                  .. ","
                  .. "'" .. space[i][7][j].type .. "'"
                  .. ");"
          box.execute(sqlstring)
      end
  end
  return
end

If you now execute the function by saying
create_information_schema_columns()
you will see that there is a table named information_schema_columns containing table_name and column_name and ordinal_position and data_type for everything that was accessible.
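
The function above builds each INSERT statement by concatenating strings, which would fail if a space name or field name contained an apostrophe. A possible alternative, shown here only as a sketch, is to pass the values as bound parameters. The helper name insert_column_row is hypothetical, and the table information_schema_columns is assumed to exist already.

```lua
-- Hypothetical replacement for the string-building INSERT in the loop.
-- The four ? placeholders are filled from the parameter list, so no
-- quoting of the values is needed.
function insert_column_row(table_name, column_name, position, data_type)
    return box.execute(
        [[INSERT INTO information_schema_columns VALUES (?, ?, ?, ?);]],
        {table_name, column_name, position, data_type})
end
```

Inside the loop, the call would be insert_column_row(space[i][3], space[i][7][j].name, j, space[i][7][j].type).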

Example Session – Million-Row Insert

Here is a variation of the Lua tutorial “Insert one million tuples with a Lua stored procedure”. The differences are: the creation is done with an SQL CREATE TABLE statement, and the inserting is done with an SQL INSERT statement. Otherwise, it is the same. It is the same because Lua and SQL are compatible, just as Lua and NoSQL are compatible.

box.execute([[CREATE TABLE tester (s1 INTEGER PRIMARY KEY, s2 STRING);]])

function string_function()
  local random_number
  local random_string
  random_string = ""
  for x = 1,10,1 do
    random_number = math.random(65, 90)
    random_string = random_string .. string.char(random_number)
  end
  return random_string
end

function main_function()
    local string_value, sql_statement
    for i = 1, 1000000, 1 do
        string_value = string_function()
        sql_statement = "INSERT INTO tester VALUES (" .. i .. ",'" .. string_value .. "');"
        box.execute(sql_statement)
    end
end
start_time = os.clock()
main_function()
end_time = os.clock()
'insert done in ' .. end_time - start_time .. ' seconds'

Limitations: The function takes more time than the original (Tarantool/NoSQL).
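
One way that might reduce the overhead is to prepare the INSERT statement once and execute it with bound parameters, so that the SQL text is not parsed again for every row. The following is a sketch under that assumption, not a measured result. The name main_function_prepared and its row_count parameter are hypothetical; string_function() and the tester table are the ones defined above.

```lua
-- Hypothetical variant of main_function() that reuses one prepared
-- statement for all rows.
function main_function_prepared(row_count)
    local stmt = box.prepare([[INSERT INTO tester VALUES (?, ?);]])
    for i = 1, row_count do
        -- Bind the key and the random string to the two placeholders.
        stmt:execute({i, string_function()})
    end
    -- Release the prepared statement when the loop is done.
    stmt:unprepare()
end
```

It can be timed in the same way as the original, by calling os.clock() before and after main_function_prepared(1000000).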

Lua functions to make views of metadata

Tarantool does not include all the standard-SQL information_schema views, which are for looking at metadata, that is, “data about the data”. But here is the Lua code and SQL code for creating equivalents:
_TABLES nearly equivalent to INFORMATION_SCHEMA.TABLES
_COLUMNS nearly equivalent to INFORMATION_SCHEMA.COLUMNS
_VIEWS nearly equivalent to INFORMATION_SCHEMA.VIEWS
_TRIGGERS nearly equivalent to INFORMATION_SCHEMA.TRIGGERS
_REFERENTIAL_CONSTRAINTS nearly equivalent to INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
_CHECK_CONSTRAINTS nearly equivalent to INFORMATION_SCHEMA.CHECK_CONSTRAINTS
_TABLE_CONSTRAINTS nearly equivalent to INFORMATION_SCHEMA.TABLE_CONSTRAINTS.
For each view there will be an example of a SELECT from the view, and the code. Users who want metadata can simply copy the code. Use this code only with Tarantool version 2.3.0 or later. With an earlier Tarantool version, a PRAGMA statement may be useful.

_TABLES view

Example:

tarantool>SELECT * FROM _tables WHERE id > 340 LIMIT 5;
OK 5 rows selected (0.0 seconds)
+---------------+--------------+----------------+------------+-----+--------+-------+-------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME     | TABLE_TYPE | ID  | ENGINE | OWNER | FIELD_COUNT |
+---------------+--------------+----------------+------------+-----+--------+-------+-------------+
| NULL          | NULL         | _fk_constraint | BASE TABLE | 356 | memtx  | admin |        0    |
| NULL          | NULL         | _ck_constraint | BASE TABLE | 364 | memtx  | admin |        0    |
| NULL          | NULL         | _func_index    | BASE TABLE | 372 | memtx  | admin |        0    |
| NULL          | NULL         | _COLUMNS       | VIEW       | 513 | memtx  | admin |        8    |
| NULL          | NULL         | _VIEWS         | VIEW       | 514 | memtx  | admin |        7    |
+---------------+--------------+----------------+------------+-----+--------+-------+-------------+

Definition of the function and the CREATE VIEW statement:

box.schema.func.drop('_TABLES_IS_VIEW',{if_exists = true})
box.schema.func.create('_TABLES_IS_VIEW',
     {language = 'LUA',
      returns = 'boolean',
      body = [[function (flags)
          local view
          -- If Tarantool version < 2.10.1, replace next line with
          -- view = require('msgpack').decode(flags).view
          view = flags.view
          if view == nil then return false end
          return view
          end]],
     is_sandboxed = false,
     -- If Tarantool version < 2.10.1, replace next line with
     -- param_list = {'string'},
     param_list = {'map'},
     exports = {'LUA', 'SQL'},
     is_deterministic = true})
box.schema.role.grant('public', 'execute', 'function', '_TABLES_IS_VIEW')
pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_TABLES', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _tables;]])
box.execute([[
CREATE VIEW _tables AS SELECT
    CAST(NULL AS STRING) AS table_catalog,
    CAST(NULL AS STRING) AS table_schema,
    "name" AS table_name,
    CASE
        WHEN _tables_is_view("flags") = TRUE THEN 'VIEW'
        ELSE 'BASE TABLE' END
        AS table_type,
    "id" AS id,
    "engine" AS engine,
    (SELECT "name" FROM "_vuser" x WHERE x."id" = y."owner") AS owner,
    "field_count" AS field_count
FROM "_vspace" y;
]])
box.schema.role.grant('public', 'read', 'space', '_TABLES')

_COLUMNS view

This is also an example of how one can use recursive views to make temporary tables with multiple rows for each tuple in the original "_vspace" space. It requires a global variable, _G.box.FORMATS, as a temporary static variable.

Warning: Use this code only with Tarantool version 2.3.2 or later. Use with earlier versions will cause an assertion. See Issue#4504.

Example:

tarantool>SELECT * FROM _columns WHERE ordinal_position = 9;
OK 6 rows selected (0.0 seconds)
+--------------+-------------+--------------------------+--------------+------------------+-------------+-----------+-----+
| CATALOG_NAME | SCHEMA_NAME | TABLE_NAME               | COLUMN_NAME  | ORDINAL_POSITION | IS_NULLABLE | DATA_TYPE | ID  |
+--------------+-------------+--------------------------+--------------+------------------+-------------+-----------+-----+
| NULL         | NULL        | _sequence                | cycle        |                9 | YES         | boolean   | 284 |
| NULL         | NULL        | _vsequence               | cycle        |                9 | YES         | boolean   | 286 |
| NULL         | NULL        | _func                    | returns      |                9 | YES         | string    | 296 |
| NULL         | NULL        | _fk_constraint           | parent_cols  |                9 | YES         | array     | 356 |
| NULL         | NULL        | _REFERENTIAL_CONSTRAINTS | MATCH_OPTION |                9 | YES         | string    | 518 |
+--------------+-------------+--------------------------+--------------+------------------+-------------+-----------+-----+

Definition of the function and the CREATE VIEW statement:

box.schema.func.drop('_COLUMNS_FORMATS', {if_exists = true})
box.schema.func.create('_COLUMNS_FORMATS',
    {language = 'LUA',
     returns = 'scalar',
     body = [[
     function (row_number_, ordinal_position)
         if row_number_ == 0 then
             _G.box.FORMATS = {}
             local vspace = box.space._vspace:select()
             for i = 1, #vspace do
                 local format = vspace[i]["format"]
                 for j = 1, #format do
                     local is_nullable = 'YES'
                     if format[j].is_nullable == false then
                         is_nullable = 'NO'
                     end
                     table.insert(_G.box.FORMATS,
                                  {vspace[i].name, format[j].name, j,
                                   is_nullable, format[j].type, vspace[i].id})
                 end
             end
             return ''
         end
         if row_number_ > #_G.box.FORMATS then
             _G.box.FORMATS = {}
             return ''
         end
         return _G.box.FORMATS[row_number_][ordinal_position]
     end
     ]],
    param_list = {'integer', 'integer'},
    exports = {'LUA', 'SQL'},
    is_sandboxed = false,
    setuid = false,
    is_deterministic = false})
box.schema.role.grant('public', 'execute', 'function', '_COLUMNS_FORMATS')

pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_COLUMNS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _columns;]])
box.execute([[
CREATE VIEW _columns AS
WITH RECURSIVE r_columns AS
(
SELECT 0 AS row_number_,
      '' AS table_name,
      '' AS column_name,
      0 AS ordinal_position,
      '' AS is_nullable,
      '' AS data_type,
      0 AS id
UNION ALL
SELECT row_number_ + 1 AS row_number_,
       _columns_formats(row_number_, 1) AS table_name,
       _columns_formats(row_number_, 2) AS column_name,
       _columns_formats(row_number_, 3) AS ordinal_position,
       _columns_formats(row_number_, 4) AS is_nullable,
       _columns_formats(row_number_, 5) AS data_type,
       _columns_formats(row_number_, 6) AS id
    FROM r_columns
    WHERE row_number_ == 0 OR row_number_ <= lua('return #_G.box.FORMATS + 1')
)
SELECT CAST(NULL AS STRING) AS catalog_name,
       CAST(NULL AS STRING) AS schema_name,
       table_name,
       column_name,
       ordinal_position,
       is_nullable,
       data_type,
       id
    FROM r_columns
    WHERE data_type <> '';
]])
box.schema.role.grant('public', 'read', 'space', '_COLUMNS')

_VIEWS view

Example:

tarantool>SELECT table_name, substr(view_definition,1,20), id, owner, field_count FROM _views LIMIT 5;
OK 5 rows selected (0.0 seconds)
+--------------------------+------------------------------+-----+-------+-------------+
| TABLE_NAME               | SUBSTR(VIEW_DEFINITION,1,20) | ID  | OWNER | FIELD_COUNT |
+--------------------------+------------------------------+-----+-------+-------------+
| _COLUMNS                 | CREATE VIEW _columns         | 513 | admin |           8 |
| _TRIGGERS                | CREATE VIEW _trigger         | 515 | admin |           4 |
| _CHECK_CONSTRAINTS       | CREATE VIEW _check_c         | 517 | admin |           8 |
| _REFERENTIAL_CONSTRAINTS | CREATE VIEW _referen         | 518 | admin |          12 |
| _TABLE_CONSTRAINTS       | CREATE VIEW _table_c         | 519 | admin |          11 |
+--------------------------+------------------------------+-----+-------+-------------+

Definition of the function and the CREATE VIEW statement:

box.schema.func.drop('_VIEWS_DEFINITION',{if_exists = true})
box.schema.func.create('_VIEWS_DEFINITION',
    {language = 'LUA',
     returns = 'string',
     body = [[function (flags)
              -- If Tarantool version < 2.10.1, replace next line with
              -- return require('msgpack').decode(flags).sql
              return flags.sql
              end]],
     -- If Tarantool version < 2.10.1, replace next line with
     -- param_list = {'string'},
     param_list = {'map'},
     exports = {'LUA', 'SQL'},
     is_sandboxed = false,
     setuid = false,
     is_deterministic = false})
box.schema.role.grant('public', 'execute', 'function', '_VIEWS_DEFINITION')
pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_VIEWS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _views;]])
box.execute([[
CREATE VIEW _views AS SELECT
    CAST(NULL AS STRING) AS table_catalog,
    CAST(NULL AS STRING) AS table_schema,
    "name" AS table_name,
    CAST(_views_definition("flags") AS STRING) AS VIEW_DEFINITION,
    "id" AS id,
    (SELECT "name" FROM "_vuser" x WHERE x."id" = y."owner") AS owner,
    "field_count" AS field_count
    FROM "_vspace" y
    WHERE _tables_is_view("flags") = TRUE;
]])
box.schema.role.grant('public', 'read', 'space', '_VIEWS')

_TABLES_IS_VIEW() was described earlier, see _TABLES view.

_TRIGGERS view

Example:

tarantool>SELECT trigger_name, opts_sql FROM _triggers;
OK 2 rows selected (0.0 seconds)
+--------------+-------------------------------------------------------------------------------------------------+
| TRIGGER_NAME | OPTS_SQL                                                                                        |
+--------------+-------------------------------------------------------------------------------------------------+
| THINGS1_AD   | CREATE TRIGGER things1_ad AFTER DELETE ON things1 FOR EACH ROW BEGIN DELETE FROM things2; END;  |
| THINGS1_BI   | CREATE TRIGGER things1_bi BEFORE INSERT ON things1 FOR EACH ROW BEGIN DELETE FROM things2; END; |
+--------------+-------------------------------------------------------------------------------------------------+

Definition of the function and the CREATE VIEW statement:

box.schema.func.drop('_TRIGGERS_OPTS_SQL',{if_exists = true})
box.schema.func.create('_TRIGGERS_OPTS_SQL',
    {language = 'LUA',
     returns = 'string',
     body = [[function (opts)
              -- If Tarantool version < 2.10.1, replace next line with
              -- return require('msgpack').decode(opts).sql
              return opts.sql
              end]],
     -- If Tarantool version < 2.10.1, replace next line with
     -- param_list = {'string'},
     param_list = {'map'},
     exports = {'LUA', 'SQL'},
     is_sandboxed = false,
     setuid = false,
     is_deterministic = false})
box.schema.role.grant('public', 'execute', 'function', '_TRIGGERS_OPTS_SQL')
pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_TRIGGERS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _triggers;]])
box.execute([[
CREATE VIEW _triggers AS SELECT
    CAST(NULL AS STRING) AS trigger_catalog,
    CAST(NULL AS STRING) AS trigger_schema,
    "name" AS trigger_name,
    CAST(_triggers_opts_sql("opts") AS STRING) AS opts_sql,
    "space_id" AS space_id
    FROM "_trigger";
]])
box.schema.role.grant('public', 'read', 'space', '_TRIGGERS')

Users who select from this view will need ‘read’ privilege on the _trigger space.

_REFERENTIAL_CONSTRAINTS view

Example:

tarantool>SELECT constraint_name, update_rule, delete_rule, match_option,
> referencing, referenced
> FROM _referential_constraints;
OK 2 rows selected (0.0 seconds)
+----------------------+-------------+-------------+--------------+-------------+------------+
| CONSTRAINT_NAME      | UPDATE_RULE | DELETE_RULE | MATCH_OPTION | REFERENCING | REFERENCED |
+----------------------+-------------+-------------+--------------+-------------+------------+
| fk_unnamed_THINGS2_1 | no_action   | no_action   | simple       | THINGS2     | THINGS1    |
| fk_unnamed_THINGS3_1 | no_action   | no_action   | simple       | THINGS3     | THINGS1    |
+----------------------+-------------+-------------+--------------+-------------+------------+

Definition of the CREATE VIEW statement:

pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_REFERENTIAL_CONSTRAINTS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _referential_constraints;]])
box.execute([[
CREATE VIEW _referential_constraints AS SELECT
    CAST(NULL AS STRING) AS constraint_catalog,
    CAST(NULL AS STRING) AS constraint_schema,
    "name" AS constraint_name,
    CAST(NULL AS STRING) AS unique_constraint_catalog,
    CAST(NULL AS STRING) AS unique_constraint_schema,
    '' AS unique_constraint_name,
    "on_update" AS update_rule,
    "on_delete" AS delete_rule,
    "match" AS match_option,
    (SELECT "name" FROM "_vspace" x WHERE x."id" = y."child_id") AS referencing,
    (SELECT "name" FROM "_vspace" x WHERE x."id" = y."parent_id") AS referenced,
    "is_deferred" AS is_deferred,
    "child_id" AS child_id,
    "parent_id" AS parent_id
    FROM "_fk_constraint" y;
]])
box.schema.role.grant('public', 'read', 'space', '_REFERENTIAL_CONSTRAINTS')

In this example, child_cols and parent_cols are not taken from the _fk_constraint space because in standard SQL they are in a separate table.

Users who select from this view will need ‘read’ privilege on the _fk_constraint space.

_CHECK_CONSTRAINTS view

Example:

tarantool>SELECT constraint_name, check_clause, space_name, language
> FROM _check_constraints;
OK 3 rows selected (0.0 seconds)
+------------------------+-------------------------+------------+----------+
| CONSTRAINT_NAME        | CHECK_CLAUSE            | SPACE_NAME | LANGUAGE |
+------------------------+-------------------------+------------+----------+
| ck_unnamed_Employees_1 | first_name LIKE 'Влад%' | Employees  | SQL      |
| ck_unnamed_Critics_1   | first_name LIKE 'Vlad%' | Critics    | SQL      |
| ck_unnamed_ACTORS_1    | salary > 0              | ACTORS     | SQL      |
+------------------------+-------------------------+------------+----------+

Definition of the CREATE VIEW statement:

pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_CHECK_CONSTRAINTS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _check_constraints;]])
box.execute([[
CREATE VIEW _check_constraints AS SELECT
    CAST(NULL AS STRING) AS constraint_catalog,
    CAST(NULL AS STRING) AS constraint_schema,
    "name" AS constraint_name,
    "code" AS check_clause,
    (SELECT "name" FROM "_vspace" x WHERE x."id" = y."space_id") AS space_name,
    "language" AS language,
    "is_deferred" AS is_deferred,
    "space_id" AS space_id
    FROM "_ck_constraint" y;
]])
box.schema.role.grant('public', 'read', 'space', '_CHECK_CONSTRAINTS')

Users who select from this view will need ‘read’ privilege on the _ck_constraint space.

_TABLE_CONSTRAINTS view

This view contains only the constraints (primary-key and unique-key) that can be found by looking at the _index space. It is not a list of indexes, that is, it is not equivalent to INFORMATION_SCHEMA.STATISTICS. The columns of the index are not included because in standard SQL they would be in a different table.

Example:

tarantool>SELECT constraint_name, constraint_type, table_name, id, iid, index_type
> FROM _table_constraints
> LIMIT 5;
OK 5 rows selected (0.0 seconds)
+-----------------+-----------------+-------------+-----+-----+------------+
| CONSTRAINT_NAME | CONSTRAINT_TYPE | TABLE_NAME  | ID  | IID | INDEX_TYPE |
+-----------------+-----------------+-------------+-----+-----+------------+
| primary         | PRIMARY         | _schema     | 272 |   0 | tree       |
| primary         | PRIMARY         | _collation  | 276 |   0 | tree       |
| name            | UNIQUE          | _collation  | 276 |   1 | tree       |
| primary         | PRIMARY         | _vcollation | 277 |   0 | tree       |
| name            | UNIQUE          | _vcollation | 277 |   1 | tree       |
+-----------------+-----------------+-------------+-----+-----+------------+

Definition of the function and the CREATE VIEW statement:

box.schema.func.drop('_TABLE_CONSTRAINTS_OPTS_UNIQUE',{if_exists = true})
function _TABLE_CONSTRAINTS_OPTS_UNIQUE (opts) return require('msgpack').decode(opts).unique end
box.schema.func.create('_TABLE_CONSTRAINTS_OPTS_UNIQUE',
    {language = 'LUA',
     returns = 'boolean',
     body = [[function (opts) return require('msgpack').decode(opts).unique end]],
     param_list = {'string'},
     exports = {'LUA', 'SQL'},
     is_sandboxed = false,
     setuid = false,
     is_deterministic = false})
box.schema.role.grant('public', 'execute', 'function', '_TABLE_CONSTRAINTS_OPTS_UNIQUE')
pcall(function ()
    box.schema.role.revoke('public', 'read', 'space', '_TABLE_CONSTRAINTS', {if_exists = true})
    end)
box.execute([[DROP VIEW IF EXISTS _table_constraints;]])
box.execute([[
CREATE VIEW _table_constraints AS SELECT
    CAST(NULL AS STRING) AS constraint_catalog,
    CAST(NULL AS STRING) AS constraint_schema,
    "name" AS constraint_name,
    (SELECT "name" FROM "_vspace" x WHERE x."id" = y."id") AS table_name,
    CASE WHEN "iid" = 0 THEN 'PRIMARY' ELSE 'UNIQUE' END AS constraint_type,
    CAST(NULL AS STRING) AS initially_deferrable,
    CAST(NULL AS STRING) AS deferred,
    CAST(NULL AS STRING) AS enforced,
    "id" AS id,
    "iid" AS iid,
    "type" AS index_type
    FROM "_vindex" y
    WHERE _table_constraints_opts_unique("opts") = TRUE;
]])
box.schema.role.grant('public', 'read', 'space', '_TABLE_CONSTRAINTS')

SQL features

This section compares Tarantool’s features with SQL:2016’s “Feature taxonomy and definition for mandatory features”.

For each feature in that list, there is a simple example SQL statement. If Tarantool appears to handle the example, it is marked “OK”, otherwise it is marked “No”.
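The marking procedure can be approximated with a small helper. This is only a sketch: `check_feature` and its `execute` parameter are hypothetical names, and the helper assumes that the function passed in behaves like box.execute(), which may either raise an error or return nil plus an error object when a statement fails.

```lua
-- Hypothetical helper: run one example statement and classify the outcome.
-- `execute` is any function with box.execute()-like behavior (an assumption).
local function check_feature(execute, statement)
    -- pcall covers the case where the call raises a Lua error.
    local ok, result, err = pcall(execute, statement)
    if ok and result ~= nil then
        return 'OK'
    end
    -- On a raised error the message is in `result`; otherwise it is in `err`.
    local reason
    if ok then
        reason = err
    else
        reason = result
    end
    return 'No', tostring(reason)
end
```

On a running instance this could be called as `check_feature(box.execute, 'SELECT * FROM t WHERE 1 < 2;')`. The helper only detects whether a statement runs; verdicts that depend on semantics, such as the missing warning noted for E091-01, still need a manual check.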

E011, Numeric data types

Feature ID	Feature	Example	Tests
E011-01	INTEGER and SMALLINT	`CREATE TABLE t (s1 INT PRIMARY KEY);`	OK.
E011-02	REAL, DOUBLE PRECISION, and FLOAT data types	`CREATE TABLE tr (s1 FLOAT PRIMARY KEY);`	No. Tarantool’s floating point data type is DOUBLE. Note: Floating point SQL types are not planned to be compatible between 2.1 and 2.2 releases. The reason is that in 2.1 we set ‘number’ format for columns of these types, but will restrict it to ‘float32’ and ‘float64’ in 2.2. The format change requires data migration and cannot be done automatically, because in 2.1 we have no information to distinguish ‘number’ columns (created from Lua) from FLOAT/DOUBLE/REAL ones (created from SQL).
E011-03	DECIMAL and NUMERIC data types	`CREATE TABLE td (s1 NUMERIC PRIMARY KEY);`	No, NUMERIC data types are not supported, although the DECIMAL data type is supported.
E011-04	Arithmetic operators	`SELECT 10+1, 9-2, 8*3, 7/2 FROM t;`	OK.
E011-05	Numeric comparisons	`SELECT * FROM t WHERE 1 < 2;`	OK.
E011-06	Implicit casting among the numeric data types	`SELECT * FROM t WHERE s1 = 1.00;`	OK, because Tarantool allows comparison of 1.00 with an INTEGER column.

E021, Character string types

Feature ID	Feature	Example	Tests
E021-01	Character data type (including all its spellings)	`CREATE TABLE t44 (s1 CHAR PRIMARY KEY);`	No, CHAR is not supported. This kind of unsupported feature is counted only once.
E021-02	CHARACTER VARYING data type (including all its spellings)	`CREATE TABLE t45 (s1 VARCHAR PRIMARY KEY);`	No, Tarantool only allows VARCHAR(n), which is a synonym for STRING.
E021-03	Character literals	`INSERT INTO t45 VALUES ('');`	OK, and the bad practice of accepting `""` for character literals is avoided.
E021-04	CHARACTER_LENGTH function	`SELECT character_length(s1) FROM t;`	OK. Tarantool treats this as a synonym of LENGTH().
E021-05	OCTET_LENGTH	`SELECT octet_length(s1) FROM t;`	No. There is no such function.
E021-06	SUBSTRING function	`SELECT substring(s1 FROM 1 FOR 1) FROM t;`	No. There is no such function. There is a function SUBSTR(x,n,n), which is OK.
E021-07	Character concatenation	`SELECT 'a' || 'b' FROM t;`	OK.
E021-08	UPPER and LOWER functions	`SELECT upper('a'),lower('B') FROM t;`	OK. Tarantool supports both UPPER() and LOWER().
E021-09	TRIM function	`SELECT trim('a ') FROM t;`	OK.
E021-10	Implicit casting among the fixed-length and variable-length character string types	`SELECT * FROM tm WHERE char_column > varchar_column;`	No, there is no fixed-length character string type.
E021-11	POSITION function	`SELECT position(x IN y) FROM z;`	No. Tarantool’s POSITION function requires ‘`,`’ rather than ‘`IN`’.
E021-12	Character comparison	`SELECT * FROM t WHERE s1 > 'a';`	OK. Note that comparisons use a binary collation by default, but a different collation is easy to specify with a COLLATE clause.

E031, Identifiers

Feature ID	Feature	Example	Tests
E031	Identifiers	`CREATE TABLE rank (ceil INT PRIMARY KEY);`	No. Tarantool’s list of reserved words differs from the standard’s list of reserved words.
E031-01	Delimited identifiers	`CREATE TABLE "t47" (s1 INT PRIMARY KEY);`	OK. Also, identifiers enclosed in double quotes are not converted to upper case or lower case, a behavior that some other DBMSs lack.
E031-02	Lower case identifiers	`CREATE TABLE t48 (s1 INT PRIMARY KEY);`	OK.
E031-03	Trailing underscore	`CREATE TABLE t49_ (s1 INT PRIMARY KEY);`	OK.

E051, Basic query specification

Feature ID	Feature	Example	Tests
E051-01	SELECT DISTINCT	`SELECT DISTINCT s1 FROM t;`	OK.
E051-02	GROUP BY clause	`SELECT DISTINCT s1 FROM t GROUP BY s1;`	OK.
E051-04	GROUP BY can contain columns not in select list	`SELECT s1 FROM t GROUP BY lower(s1);`	OK.
E051-05	Select list items can be renamed	`SELECT s1 AS K FROM t ORDER BY K;`	OK.
E051-06	HAVING clause	`SELECT count(*) FROM t HAVING count(*) > 0;`	OK. Tarantool supports HAVING, and GROUP BY is not mandatory before HAVING.
E051-07	Qualified * in SELECT list	`SELECT t.* FROM t;`	OK.
E051-08	Correlation names in the FROM clause	`SELECT * FROM t AS K;`	OK.
E051-09	Rename columns in the FROM clause	`SELECT * FROM t AS x(q,c);`	No.

E061, Basic predicates and search conditions

Feature ID	Feature	Example	Tests
E061-01	Comparison predicate	`SELECT * FROM t WHERE 0 = 0;`	OK.
E061-02	BETWEEN predicate	`SELECT * FROM t WHERE ' ' BETWEEN '' AND ' ';`	OK.
E061-03	IN predicate with list of values	`SELECT * FROM t WHERE s1 IN ('a', upper('a'));`	OK.
E061-04	LIKE predicate	`SELECT * FROM t WHERE s1 LIKE '_';`	OK.
E061-05	LIKE predicate: ESCAPE clause	`VALUES ('abc_' LIKE 'abcX_' ESCAPE 'X');`	OK.
E061-06	NULL predicate	`SELECT * FROM t WHERE s1 IS NOT NULL;`	OK.
E061-07	Quantified comparison predicate	`SELECT * FROM t WHERE s1 = ANY (SELECT s1 FROM t);`	No. Syntax error.
E061-08	EXISTS predicate	`SELECT * FROM t WHERE NOT EXISTS (SELECT * FROM t);`	OK.
E061-09	Subqueries in comparison predicate	`SELECT * FROM t WHERE s1 > (SELECT s1 FROM t);`	OK.
E061-11	Subqueries in IN predicate	`SELECT * FROM t WHERE s1 IN (SELECT s1 FROM t);`	OK.
E061-12	Subqueries in quantified comparison predicate	`SELECT * FROM t WHERE s1 >= ALL (SELECT s1 FROM t);`	No. Syntax error.
E061-13	Correlated subqueries	`SELECT * FROM t WHERE s1 = (SELECT s1 FROM t2 WHERE t2.s2 = t.s1);`	OK.
E061-14	Search condition	`SELECT * FROM t WHERE 0 <> 0 OR 'a' < 'b' AND s1 IS NULL;`	OK.

E071, Basic query expressions

Feature ID	Feature	Example	Tests
E071-01	UNION DISTINCT table operator	`SELECT * FROM t UNION DISTINCT SELECT * FROM t;`	No. However, `SELECT * FROM t UNION SELECT * FROM t;` is OK.
E071-02	UNION ALL table operator	`SELECT * FROM t UNION ALL SELECT * FROM t;`	OK.
E071-03	EXCEPT DISTINCT table operator	`SELECT * FROM t EXCEPT DISTINCT SELECT * FROM t;`	No. However, `SELECT * FROM t EXCEPT SELECT * FROM t;` is OK.
E071-05	Columns combined via table operators need not have exactly the same data type	`SELECT s1 FROM t UNION SELECT 5 FROM t;`	OK.
E071-06	Table operators in subqueries	`SELECT * FROM t WHERE 'a' IN (SELECT * FROM t UNION SELECT * FROM t);`	OK.

E081, Basic privileges

Tarantool doesn’t support privileges except via NoSQL.

E091, Set functions

Feature ID	Feature	Example	Tests
E091-01	AVG	`SELECT avg(s1) FROM t7;`	No. Tarantool supports AVG but there is no warning that NULLs are eliminated.
E091-02	COUNT	`SELECT count(*) FROM t7 WHERE s1 > 0;`	OK.
E091-03	MAX	`SELECT max(s1) FROM t7 WHERE s1 > 0;`	OK.
E091-04	MIN	`SELECT min(s1) FROM t7 WHERE s1 > 0;`	OK.
E091-05	SUM	`SELECT sum(1) FROM t7 WHERE s1 > 0;`	OK.
E091-06	ALL quantifier	`SELECT sum(ALL s1) FROM t7 WHERE s1 > 0;`	OK.
E091-07	DISTINCT quantifier	`SELECT sum(DISTINCT s1) FROM t7 WHERE s1 > 0;`	OK.

E101, Basic data manipulation

Feature ID	Feature	Example	Tests
E101-01	INSERT statement	`INSERT INTO t (s1,s2) VALUES (1,''), (2,NULL), (3,55);`	OK.
E101-03	Searched UPDATE statement	`UPDATE t SET s1 = NULL WHERE s1 IN (SELECT s1 FROM t2);`	OK.
E101-04	Searched DELETE statement	`DELETE FROM t WHERE s1 IN (SELECT s1 FROM t);`	OK.

E111, Single row SELECT statement

Feature ID	Feature	Example	Tests
E111	Single row SELECT statement	`SELECT count(*) FROM t;`	OK.

E121, Basic cursor support

Feature ID	Feature	Example	Tests
E121-01	DECLARE CURSOR		No. Tarantool doesn’t support cursors.
E121-02	ORDER BY columns need not be in select list	`SELECT s1 FROM t ORDER BY s2;`	OK.
E121-03	Value expressions in ORDER BY clause	`SELECT s1 FROM t7 ORDER BY -s1;`	OK.
E121-04	OPEN statement		No. Tarantool doesn’t support cursors.
E121-06	Positioned UPDATE statement		No. Tarantool doesn’t support cursors.
E121-07	Positioned DELETE statement		No. Tarantool doesn’t support cursors.
E121-08	CLOSE statement		No. Tarantool doesn’t support cursors.
E121-10	FETCH statement implicit next		No. Tarantool doesn’t support cursors.
E121-17	WITH HOLD cursors		No. Tarantool doesn’t support cursors.

E131, Null value support

Feature ID	Feature	Example	Tests
E131	Null value support (nulls in lieu of values)	`SELECT s1 FROM t7 WHERE s1 IS NULL;`	OK.

E141, Basic integrity constraints

Feature ID	Feature	Example	Tests
E141-01	NOT NULL constraints	`CREATE TABLE t8 (s1 INT PRIMARY KEY, s2 INT NOT NULL);`	OK.
E141-02	UNIQUE constraints of NOT NULL columns	`CREATE TABLE t9 (s1 INT PRIMARY KEY , s2 INT NOT NULL UNIQUE);`	OK.
E141-03	PRIMARY KEY constraints	`CREATE TABLE t10 (s1 INT PRIMARY KEY);`	OK, although Tarantool shouldn’t always insist on having a primary key.
E141-04	Basic FOREIGN KEY constraint with the NO ACTION default for both referential delete and referential update actions	`CREATE TABLE t11 (s0 INT PRIMARY KEY, s1 INT REFERENCES t10);`	OK.
E141-06	CHECK constraints	`CREATE TABLE t12 (s1 INT PRIMARY KEY, s2 INT, CHECK (s1 = s2));`	OK.
E141-07	Column defaults	`CREATE TABLE t13 (s1 INT PRIMARY KEY, s2 INT DEFAULT -1);`	OK.
E141-08	NOT NULL inferred on primary key	`CREATE TABLE t14 (s1 INT PRIMARY KEY);`	OK. We are unable to insert NULL although we don’t explicitly say the column is NOT NULL.
E141-10	Names in a foreign key can be specified in any order	`CREATE TABLE t15 (s1 INT, s2 INT, PRIMARY KEY (s1,s2));` `CREATE TABLE t16 (s1 INT PRIMARY KEY, s2 INT, FOREIGN KEY (s2,s1) REFERENCES t15 (s1,s2));`	OK.

E151, Transaction support

Feature ID	Feature	Example	Tests
E151-01	COMMIT statement	`COMMIT;`	No. Tarantool supports COMMIT but it is necessary to say START TRANSACTION first.
E151-02	ROLLBACK statement	`ROLLBACK;`	OK.

E152, Basic SET TRANSACTION statement

Feature ID	Feature	Example	Tests
E152-01	SET TRANSACTION statement: ISOLATION SERIALIZABLE clause	`SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;`	No. Syntax error.
E152-02	SET TRANSACTION statement: READ ONLY and READ WRITE clauses	`SET TRANSACTION READ ONLY;`	No. Syntax error.

E*, Other

Feature ID	Feature	Example	Tests
E153	Updatable queries with subqueries	`UPDATE "view_containing_subquery" SET column1=0;`	No.
E161	SQL comments using leading double minus	`--comment;`	OK.
E171	SQLSTATE support	`DROP TABLE no_such_table;`	No. Tarantool returns an error message but not an SQLSTATE string.
E182	Host language binding		OK. Any of the Tarantool connectors should be able to call box.execute().

F021, Basic information schema

Feature ID	Feature	Example	Tests
F021	Basic information schema	`SELECT * from information_schema.tables;`	No. Tarantool’s metadata is not in a schema with that name (not counted in the final score).

F031, Basic schema manipulation

Feature ID	Feature	Example	Tests
F031-01	CREATE TABLE statement to create persistent base tables	`CREATE TABLE t20 (t20_1 INT NOT NULL);`	No. We always have to specify PRIMARY KEY (we only count this flaw once).
F031-02	CREATE VIEW statement	`CREATE VIEW t21 AS SELECT * FROM t20;`	OK.
F031-03	GRANT statement		No. Tarantool doesn’t support privileges except via NoSQL.
F031-04	ALTER TABLE statement: add column	`ALTER TABLE t7 ADD COLUMN t7_2 VARCHAR(1) DEFAULT 'q';`	OK. Tarantool supports ALTER TABLE, and support for ADD COLUMN was added in Tarantool 2.7.
F031-13	DROP TABLE statement: RESTRICT clause	`DROP TABLE t20 RESTRICT;`	No. Tarantool supports DROP TABLE but not this clause.
F031-16	DROP VIEW statement: RESTRICT clause	`DROP VIEW v2 RESTRICT;`	No. Tarantool supports DROP VIEW but not this clause.
F031-19	REVOKE statement: RESTRICT clause		No. Tarantool does not support privileges except via NoSQL.

F041, Basic joined table

Feature ID	Feature	Example	Tests
F041-01	Inner join but not necessarily the INNER keyword	`SELECT a.s1 FROM t7 a JOIN t7 b;`	OK.
F041-02	INNER keyword	`SELECT a.s1 FROM t7 a INNER JOIN t7 b;`	OK.
F041-03	LEFT OUTER JOIN	`SELECT t7.*,t22.* FROM t22 LEFT OUTER JOIN t7 ON (t22_1 = s1);`	OK.
F041-04	RIGHT OUTER JOIN	`SELECT t7.*,t22.* FROM t22 RIGHT OUTER JOIN t7 ON (t22_1 = s1);`	No. Syntax error.
F041-05	Outer joins can be nested	`SELECT t7.*,t22.* FROM t22 LEFT OUTER JOIN t7 ON (t22_1 = s1) LEFT OUTER JOIN t23;`	OK.
F041-07	The inner table in a left or right outer join can also be used in an inner join	`SELECT t7.* FROM (t22 LEFT OUTER JOIN t7 ON (t22_1 = s1)) j INNER JOIN t22 ON (j.t22_4 = t7.s1);`	OK.
F041-08	All comparison operators are supported	`SELECT * FROM t WHERE 0 = 1 OR 0 > 1 OR 0 < 1 OR 0 <> 1;`	OK.

F051, Basic date and time

Feature ID	Feature	Example	Tests
F051-01	DATE data type (including support of DATE literal)	`CREATE TABLE dates (s1 DATE);`	No. Tarantool does not support the DATE data type.
F051-02	TIME data type (including support of TIME literal)	`CREATE TABLE times (s1 TIME DEFAULT TIME '1:2:3');`	No. Syntax error.
F051-03	TIMESTAMP data type (including support of TIMESTAMP literal)	`CREATE TABLE timestamps (s1 TIMESTAMP);`	No. Syntax error.
F051-04	Comparison predicate on DATE, TIME and TIMESTAMP data types	`SELECT * FROM dates WHERE s1 = s1;`	No. Date and time data types are not supported.
F051-05	Explicit CAST between date-time types and character string types	`SELECT cast(s1 AS VARCHAR(10)) FROM dates;`	No. Date and time data types are not supported.
F051-06	CURRENT_DATE	`SELECT current_date FROM t;`	No. Syntax error.
F051-07	LOCALTIME	`SELECT localtime FROM t;`	No. Syntax error.
F051-08	LOCALTIMESTAMP	`SELECT localtimestamp FROM t;`	No. Syntax error.

F081, UNION and EXCEPT in views

Feature ID	Feature	Example	Tests
F081	UNION and EXCEPT in views	`CREATE VIEW vv AS SELECT * FROM t7 EXCEPT SELECT * FROM t15;`	OK.

F131, Grouped operations

Feature ID	Feature	Example	Tests
F131-01	WHERE, GROUP BY, and HAVING clauses supported in queries with grouped views	`CREATE VIEW vv2 AS SELECT * FROM vv GROUP BY s1;`	OK.
F131-02	Multiple tables supported in queries with grouped views	`CREATE VIEW vv3 AS SELECT * FROM vv2,t30;`	OK.
F131-03	Set functions supported in queries with grouped views	`CREATE VIEW vv4 AS SELECT count(*) FROM vv2;`	OK.
F131-04	Subqueries with GROUP BY and HAVING clauses and grouped views	`CREATE VIEW vv5 AS SELECT count(*) FROM vv2 GROUP BY s1 HAVING count(*) > 0;`	OK.
F131-05	Single row SELECT with GROUP BY and HAVING clauses and grouped views	`SELECT count(*) FROM vv2 GROUP BY s1 HAVING count(*) > 0;`	OK.

F181, Multiple module support

No. Tarantool doesn’t have modules.

F201, CAST function

Feature ID	Feature	Example	Tests
F201	CAST function	`SELECT cast(s1 AS INT) FROM t;`	OK.

F221, Explicit defaults

Feature ID	Feature	Example	Tests
F221	Explicit defaults	`UPDATE t SET s1 = DEFAULT;`	No. Syntax error.

F261, CASE expression

Feature ID	Feature	Example	Tests
F261-01	Simple CASE	`SELECT CASE 1 WHEN 0 THEN 5 ELSE 7 END FROM t;`	OK.
F261-02	Searched CASE	`SELECT CASE WHEN 1 = 0 THEN 5 ELSE 7 END FROM t;`	OK.
F261-03	NULLIF	`SELECT nullif(s1,7) FROM t;`	OK.
F261-04	COALESCE	`SELECT coalesce(s1,7) FROM t;`	OK.

F311, Schema definition statement

Feature ID	Feature	Tests
F311-01	CREATE SCHEMA	No. Tarantool doesn’t have schemas or databases.
F311-02	CREATE TABLE for persistent base tables	No. Tarantool doesn’t have CREATE TABLE inside CREATE SCHEMA.
F311-03	CREATE VIEW	No. Tarantool doesn’t have CREATE VIEW inside CREATE SCHEMA.
F311-04	CREATE VIEW: WITH CHECK OPTION	No. Tarantool doesn’t have CREATE VIEW inside CREATE SCHEMA.
F311-05	GRANT statement	No. Tarantool doesn’t have GRANT inside CREATE SCHEMA.

F*, Other

Feature ID	Feature	Example	Tests
F471	Scalar subquery values	`SELECT s1 FROM t WHERE s1 = (SELECT count(*) FROM t);`	OK.
F481	Expanded NULL predicate	`SELECT * FROM t WHERE row(s1,s1) IS NOT NULL;`	No. Syntax error.
F812	Basic flagging		No. Tarantool doesn’t support any flagging.

S011, Distinct types

Feature ID	Feature	Example	Tests
S011	Distinct types	`CREATE TYPE x AS FLOAT;`	No. Tarantool doesn’t support distinct types.

T321, Basic SQL-invoked routines

Feature ID	Feature	Example	Tests
T321-01	User-defined functions with no overloading	`CREATE FUNCTION f() RETURNS INT RETURN 5;`	No. User-defined functions for SQL are created in Lua with a different syntax.
T321-02	User-defined procedures with no overloading	`CREATE PROCEDURE p() BEGIN END;`	No. User-defined functions for SQL are created in Lua with a different syntax.
T321-03	Function invocation	`SELECT f(1) FROM t;`	OK. Tarantool can invoke Lua user-defined functions.
T321-04	CALL statement	`CALL p();`	No. Tarantool doesn’t support CALL statements.
T321-05	RETURN statement	`CREATE FUNCTION f() RETURNS INT RETURN 5;`	No. Tarantool doesn’t support RETURN statements.
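
As a sketch of the “different syntax” mentioned for T321-01, a Lua function can be registered with box.schema.func.create() and then invoked from SQL, following the same pattern as the helper functions defined for the views earlier in this section. The names `F5`, `f5_options`, and `install_f5` are made up for illustration, and the snippet assumes a running instance on which box.cfg{} has already been called.

```lua
-- Options for a Lua function that SQL is allowed to call.
local f5_options = {
    language = 'LUA',
    returns = 'integer',
    body = [[function () return 5 end]],
    param_list = {},
    exports = {'LUA', 'SQL'},
    is_deterministic = true,
    if_not_exists = true,
}

-- Hypothetical installer: registers the function, then calls it from SQL.
local function install_f5()
    box.schema.func.create('F5', f5_options)
    -- Assumption: unquoted SQL identifiers are converted to upper case,
    -- as in the view definitions above, so f5() resolves to 'F5'.
    return box.execute([[SELECT f5();]])
end
```

Calling `install_f5()` once on the instance should make `SELECT f5();` available to later SQL statements as well.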

T*, Other

Feature ID	Feature	Example	Tests
T631	IN predicate with one list element	`SELECT * FROM t WHERE 1 IN (1);`	OK.

Total number of items marked “No”: 67

Total number of items marked “OK”: 79

Built-in modules reference

This reference covers Tarantool’s built-in Lua modules.

Note

Some functions in these modules are analogs to functions from standard Lua libraries. For better results, we recommend using functions from Tarantool’s built-in modules.

Module box

As well as executing Lua chunks or defining your own functions, you can exploit Tarantool’s storage functionality with the box module and its submodules.

Every submodule contains one or more Lua functions. A few submodules contain members as well as functions. The functions allow data definition (create, alter, drop), data manipulation (insert, delete, update, upsert, select, replace), and introspection (inspecting contents of spaces, accessing server configuration).

To catch errors that functions in box submodules may throw, use pcall.
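
A minimal sketch of this pattern follows. The wrapper name `safe_call` is hypothetical, and the `tester` space in the usage comment is assumed to exist already.

```lua
-- Hypothetical wrapper: returns the function's first result,
-- or nil plus the error text if the call raised an error.
local function safe_call(fn, ...)
    local ok, result = pcall(fn, ...)
    if ok then
        return result
    end
    return nil, tostring(result)
end

-- Usage on a running instance ('tester' is an assumed space):
-- local tuple, err = safe_call(function()
--     return box.space.tester:insert({1, 'first'})
-- end)
-- if tuple == nil then print('insert failed: ' .. err) end
```

Converting the error with tostring() keeps the caller independent of whether the failure was raised as a string or as an error object.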

The contents of the box module can be inspected at runtime with box, with no arguments. The box module contains:

Submodule box.backup

The box.backup submodule contains two functions that are helpful for backup in certain situations.

Below is a list of all box.backup functions.

Name	Use
box.backup.start()	Ask server to suspend activities before the removal of outdated backups
box.backup.stop()	Inform server that normal operations may resume

box.backup.start()

backup.start([n])

Informs the server that activities related to the removal of outdated backups must be suspended.

To guarantee that the backup files can be copied, Tarantool will not delete them. However, the server does not switch to read-only mode, and checkpoints continue on schedule as usual.

Parameters:	n (`number`) – optional argument starting with Tarantool 1.10.1 that indicates the checkpoint to use relative to the latest checkpoint. For example `n = 0` means “backup will be based on the latest checkpoint”, `n = 1` means “backup will be based on the first checkpoint before the latest checkpoint (counting backwards)”, and so on. The default value for n is zero.

Return: a table with the names of snapshot and vinyl files that should be copied

Example:

tarantool> box.backup.start()
---
- - ./00000000000000000015.snap
  - ./00000000000000000000.vylog
  - ./513/0/00000000000000000002.index
  - ./513/0/00000000000000000002.run
...

box.backup.stop()

backup.stop()

Informs the server that normal operations may resume.
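
The two functions are typically used as a pair. The sketch below shows one way to do that, assuming a running instance; `backup_to` and its `copy_file` callback are hypothetical names, and how a file is actually copied is left to the caller.

```lua
-- Hypothetical helper: copy every file named by box.backup.start(),
-- and always call box.backup.stop(), even if copying fails.
local function backup_to(copy_file, n)
    -- n defaults to 0, that is, the latest checkpoint.
    local files = box.backup.start(n or 0)
    local ok, err = pcall(function()
        for _, path in ipairs(files) do
            copy_file(path)
        end
    end)
    -- Let the server resume removing outdated files.
    box.backup.stop()
    if not ok then
        error(err, 0)
    end
    return #files
end
```

For a dry run, `backup_to(print)` lists the files without copying anything. Wrapping the copy loop in pcall means a failed copy does not leave the server with backup mode still active.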

Submodule box.cfg

The box.cfg submodule is used for specifying server configuration parameters.

To view the current configuration, say box.cfg without braces:

tarantool> box.cfg
---
- checkpoint_count: 2
  too_long_threshold: 0.5
  slab_alloc_factor: 1.05
  memtx_max_tuple_size: 1048576
  background: false
  <...>
...

To set particular parameters, use the following syntax: box.cfg{key = value [, key = value ...]} (further referred to as box.cfg{...} for short). For example:

tarantool> box.cfg{listen = 3301}

Parameters that are not specified in the box.cfg{...} call explicitly will be set to the default values.
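
Several parameters can be set in one call. The values in this sketch are illustrative only, not recommendations; the keys themselves are taken from the list of defaults in this section.

```lua
-- Illustrative values; parameters that are omitted keep their defaults.
box.cfg{
    listen = 3301,
    memtx_memory = 512 * 1024 * 1024, -- bytes
    checkpoint_interval = 7200,       -- seconds
    log_level = 5,
}
```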

If you say box.cfg{} with no parameters, Tarantool applies the following default settings to all the parameters:

tarantool> box.cfg{}
tarantool> box.cfg -- sorted in alphabetical order
---
- background                   = false
  checkpoint_count             = 2
  checkpoint_interval          = 3600
  checkpoint_wal_threshold     = 1000000000000000000
  coredump                     = false
  custom_proc_title            = nil
  feedback_enabled             = true
  feedback_host                = 'https://feedback.tarantool.io'
  feedback_interval            = 3600
  force_recovery               = false
  hot_standby                  = false
  instance_uuid                = nil -- generated automatically
  io_collect_interval          = nil
  iproto_threads               = 1
  listen                       = nil
  log                          = nil
  log_format                   = plain
  log_level                    = 5
  log_nonblock                 = true
  memtx_dir                    = '.'
  memtx_max_tuple_size         = 1024 * 1024
  memtx_memory                 = 256 * 1024 * 1024
  memtx_min_tuple_size         = 16
  net_msg_max                  = 768
  pid_file                     = nil
  readahead                    = 16320
  read_only                    = false
  replicaset_uuid              = nil -- generated automatically
  replication                  = nil
  replication_anon             = false
  replication_connect_timeout  = 30
  replication_skip_conflict    = false
  replication_sync_lag         = 10
  replication_sync_timeout     = 0
  replication_timeout          = 1
  slab_alloc_factor            = 1.05
  snap_io_rate_limit           = nil
  sql_cache_size               = 5242880
  strip_core                   = true
  too_long_threshold           = 0.5
  username                     = nil
  vinyl_bloom_fpr              = 0.05
  vinyl_cache                  = 128 * 1024 * 1024
  vinyl_dir                    = '.'
  vinyl_max_tuple_size         = 1024 * 1024 * 1024 * 1024
  vinyl_memory                 = 128 * 1024 * 1024
  vinyl_page_size              = 8 * 1024
  vinyl_range_size             = nil
  vinyl_read_threads           = 1
  vinyl_run_count_per_level    = 2
  vinyl_run_size_ratio         = 3.5
  vinyl_timeout                = 60
  vinyl_write_threads          = 4
  wal_dir                      = '.'
  wal_dir_rescan_delay         = 2
  wal_max_size                 = 256 * 1024 * 1024
  wal_mode                     = 'write'
  worker_pool_threads          = 4
  work_dir                     = nil

The first call to box.cfg{...} (with or without parameters) initiates Tarantool’s database module box.

box.cfg{...} is also the command that reloads persistent data files into RAM upon restart once we have data.

Submodule box.ctl

The wait_ro (wait until read-only) and wait_rw (wait until read-write) functions are useful during server initialization. To see whether an instance is already in read-only or read-write mode, check box.info.ro.

A particular use is for box.once(). For example, when a replica is initializing, it may call a box.once() function while the server is still in read-only mode, and fail to make changes that are necessary only once before the replica is fully initialized. This could cause conflicts between a master and a replica if the master is in read-write mode and the replica is in read-only mode. Waiting until “read only mode = false” solves this problem.
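
The scenario above can be sketched as follows. The helper name `bootstrap_once`, the key `example_bootstrap`, and the space name `example_space` are all hypothetical, and the code assumes it runs on an instance after box.cfg{} has been called.

```lua
-- Hypothetical bootstrap helper; timeout is a number of seconds.
local function bootstrap_once(timeout)
    -- Block until box.info.ro is false; raises an error on timeout.
    box.ctl.wait_rw(timeout)
    -- The function passed to box.once() runs only the first time
    -- this key is used, so the schema is created at most once.
    box.once('example_bootstrap', function()
        local space = box.schema.space.create('example_space')
        space:create_index('primary')
    end)
end
```

Because wait_rw() may raise an error when the timeout expires, a caller that wants to retry would wrap `bootstrap_once(10)` in pcall.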

Below is a list of all box.ctl functions.

Name	Use
box.ctl.wait_ro()	Wait until `box.info.ro` is true
box.ctl.wait_rw()	Wait until `box.info.ro` is false
box.ctl.on_schema_init()	Create a “schema_init trigger”
box.ctl.on_shutdown()	Create a “shutdown trigger”
box.ctl.on_recovery_state()	Create a trigger executed on different stages of a node recovery or initial configuration
box.ctl.on_election()	Create a trigger executed every time the current state of a replica set node in regard to leader election changes
box.ctl.set_on_shutdown_timeout()	Set a timeout in seconds for the `on_shutdown` trigger
box.ctl.is_recovery_finished()	Check if recovery has finished
box.ctl.promote()	Wait, then choose replication leader
box.ctl.demote()	Revoke the leader role from the instance
box.ctl.make_bootstrap_leader()	Make the instance a bootstrap leader of a replica set

box.ctl.wait_ro()

box.ctl.wait_ro([timeout])

Wait until box.info.ro is true.

Parameters:	timeout (`number`) – maximum number of seconds to wait
Return:	nil, or error may be thrown due to timeout or fiber cancellation

Example:

tarantool> box.info().ro
---
- false
...

tarantool> n = box.ctl.wait_ro(0.1)
---
- error: timed out
...

box.ctl.wait_rw()

box.ctl.wait_rw([timeout])

Wait until box.info.ro is false.

Parameters:	timeout (`number`) – maximum number of seconds to wait
Return:	nil, or error may be thrown due to timeout or fiber cancellation

Example:

tarantool> box.ctl.wait_rw(0.1)
---
...

box.ctl.on_schema_init()

box.ctl.on_schema_init(trigger-function[, old-trigger-function])

Create a “schema_init trigger”. The trigger-function is executed the first time box.cfg{} is called. The schema_init trigger runs before the server’s configuration and recovery begin, so box.ctl.on_schema_init must be called before box.cfg.

Parameters:
trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

A common use is: make a schema_init trigger function which creates a before_replace trigger function on a system space. Thus, since system spaces are created when the server starts, the before_replace triggers will be activated for each tuple in each system space. For example, such a trigger could change the storage engine of a given space, or make a given space replica-local while a replica is being bootstrapped. Making such a change after box.cfg is not reliable because other connections might use the database before the change is made.

Details about trigger characteristics are in the triggers section.

Example:

Suppose that, before the server is fully up and ready for connections, you want to make sure that the engine of space space_name is vinyl. So you want to make a trigger that will be activated when a tuple is inserted in the _space system space. In this case you could end up with a master that has space_name with engine='memtx' and a replica that has space_name with engine='vinyl', with the same contents.

function function_for_before_replace(old, new)
  if old == nil and new ~= nil and new[3] == 'space_name' and new[4] ~= 'vinyl' then
    return new:update{{'=', 4, 'vinyl'}}
  end
end

box.ctl.on_schema_init(function()
  box.space._space:before_replace(function_for_before_replace)
end)

box.cfg{replication='master_uri', ...}

box.ctl.on_shutdown()

box.ctl.on_shutdown(trigger-function[, old-trigger-function])

Create a “shutdown trigger”. The trigger-function will be executed whenever os.exit() happens, or when the server is shut down after receiving a SIGTERM, SIGINT, or SIGHUP signal (but not after SIGSEGV, SIGABRT, or any other signal that causes immediate program termination).

Parameters:
trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If you want to set a timeout for this trigger, use the set_on_shutdown_timeout function.
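As a rough sketch of how the two calls fit together, the snippet below registers a shutdown trigger and raises the timeout. The handler on_stop and the 5-second value are hypothetical and chosen only for illustration.

```lua
local log = require('log')

-- Hypothetical cleanup handler; replace the body with real shutdown work.
local function on_stop()
    log.info('shutdown requested, flushing application state')
end

-- Give the triggers up to 5 seconds instead of the default 3.
box.ctl.set_on_shutdown_timeout(5)
box.ctl.on_shutdown(on_stop)

-- To remove the trigger later, pass nil as the new function:
-- box.ctl.on_shutdown(nil, on_stop)
```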

box.ctl.on_recovery_state()

box.ctl.on_recovery_state(trigger-function)

Since: 2.11.0

Create a trigger executed on different stages of a node recovery or initial configuration. Note that you need to set the box.ctl.on_recovery_state trigger before the initial box.cfg call.

Parameters:	trigger-function (`function`) – a trigger function
Return:	`nil` or a function pointer

A registered trigger function is run on each of the supported recovery states and receives the state name as a parameter:

snapshot_recovered: the node has recovered the snapshot files.
wal_recovered: the node has recovered the WAL files.
indexes_built: the node has built secondary indexes for memtx spaces. This stage might come before any actual data is recovered. This means that the indexes are available right after the first tuple is recovered.
synced: the node has synced with enough remote peers. This means that the node changes the state from orphan to running.

All these states are passed during the initial box.cfg call when recovering from the snapshot and WAL files. Note that the synced state might be reached after the initial box.cfg call finishes. For example, if replication_sync_timeout is set to 0, the node finishes box.cfg without reaching synced and stays orphan. Once the node is synced with enough remote peers, the synced state is reached.

Note

When bootstrapping a fresh cluster with no data, all the instances in this cluster execute triggers on the same stages for consistency. For example, snapshot_recovered and wal_recovered run when the node finishes bootstrapping a cluster or finishes joining an existing cluster.

Example:

The example below shows how to log a specified message when each state is reached.

local log = require('log')
local log_recovery_state = function(state)
    log.info(state .. ' state reached')
end
box.ctl.on_recovery_state(log_recovery_state)

box.ctl.on_election()

box.ctl.on_election(trigger-function)

Since: 2.10.0

Create a trigger executed every time the current state of a replica set node in regard to leader election changes. The current state is available in the box.info.election table.

The trigger doesn’t accept any parameters. You can see the changes in box.info.election and box.info.synchro.

Parameters:	trigger-function (`function`) – a trigger function
Return:	`nil` or a function pointer
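Since the trigger receives no arguments, it has to read the state itself. The sketch below logs a few fields of box.info.election on every change; it assumes box.cfg has already been called, and the function name log_election_state is hypothetical.

```lua
local log = require('log')

-- The trigger takes no parameters, so it reads box.info.election directly.
local function log_election_state()
    local e = box.info.election
    log.info('election: state=%s term=%s leader=%s',
             tostring(e.state), tostring(e.term), tostring(e.leader))
end

box.ctl.on_election(log_election_state)
```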

box.ctl.set_on_shutdown_timeout()

box.ctl.set_on_shutdown_timeout([timeout])

Set a timeout for the on_shutdown trigger. If the timeout has expired, the server stops immediately regardless of whether any on_shutdown triggers are left unexecuted.

Parameters:	timeout (`double`) – time to wait for the trigger to be completed. The default value is 3 seconds.
Return:	nil

box.ctl.is_recovery_finished()

box.ctl.is_recovery_finished()

Since version 2.5.3.

Check whether the recovery process has finished. Until it has finished, space changes such as insert or update are not possible.

Return:	`true` if recovery has finished, otherwise `false`
Rtype:	boolean
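One possible use is a polling helper that blocks a fiber until recovery is over. This is a sketch only: wait_for_recovery is a hypothetical name, and the 0.1-second polling interval is an arbitrary choice.

```lua
local fiber = require('fiber')

-- Hypothetical helper: poll until recovery finishes or the deadline passes.
-- Returns true if recovery finished in time, false otherwise.
local function wait_for_recovery(timeout)
    local deadline = fiber.clock() + timeout
    while not box.ctl.is_recovery_finished() do
        if fiber.clock() >= deadline then
            return false
        end
        fiber.sleep(0.1)
    end
    return true
end
```

A background fiber could call wait_for_recovery() before it starts inserting or updating tuples.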

box.ctl.promote()

promote()

Since version 2.6.2. Renamed in release 2.6.3.

Wait, then choose a new replication leader.

For synchronous transactions it is possible that a new leader will be chosen but the transactions of the old leader have not been completed. Therefore to finalize the transaction, the function box.ctl.promote() should be called, as mentioned in the notes for leader election. The old name for this function is box.ctl.clear_synchro_queue().

The election state should change to leader.

Parameters: none

Return:	nil or function pointer

box.ctl.demote()

demote()

Since version 2.10.0.

Revoke the leader role from the instance.

On synchronous transaction queue owner, the function works in the following way:

If box.cfg.election_mode is off, the function writes a DEMOTE request to WAL. The DEMOTE request clears the ownership of the synchronous transaction queue, while the PROMOTE request assigns it to a new instance.
If box.cfg.election_mode is enabled in any mode, then the function makes the instance start a new term and give up the leader role.

On instances that are not queue owners, the function does nothing and returns immediately.

Parameters: none

Return:	nil

box.ctl.make_bootstrap_leader()

make_bootstrap_leader()

Since: 3.0.0.

Make the instance a bootstrap leader of a replica set.

To make the instance a bootstrap leader manually, set the replication.bootstrap_strategy configuration option to supervised. In this case, the instances do not choose a bootstrap leader automatically but wait for it to be appointed manually. Configuration fails if no bootstrap leader is appointed within replication.connect_timeout.

Note

When a new instance joins a replica set configured with the supervised bootstrap strategy, this instance doesn’t choose the bootstrap leader automatically but joins the instance on which box.ctl.make_bootstrap_leader() was executed most recently.

Submodule box.error

The box.error submodule can be used to work with errors in your application. For example, you can get the information about the last error raised by Tarantool or raise custom errors manually.

The difference between raising an error using box.error and Lua’s built-in error function is that when the error reaches the client, its error code is preserved. In contrast, a Lua error would always be presented to the client as ER_PROC_LUA.
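The same distinction is visible locally when both kinds of error are caught with pcall. The snippet below is a small illustration of that, not of the client-side behavior itself; the code 500 and the reason text are arbitrary.

```lua
-- Raise a box.error and catch it locally.
local ok, err = pcall(function()
    box.error({code = 500, reason = 'Internal server error'})
end)
-- err is an error object, so its code and message can be inspected.
print(ok, err.code, err.message)

-- Raise a plain Lua error the same way.
local ok_lua, err_lua = pcall(error, 'Internal server error')
-- err_lua is only a string; there is no error code to inspect.
print(ok_lua, type(err_lua))
```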

Note

To learn how to handle errors in your application, see the Handling errors section.

Creating an error

You can create an error object using the box.error.new() function. The created object can be passed to box.error() to raise the error. You can also raise the error using error_object:raise().

The example below shows how to create and raise the error with the specified code and reason.

local custom_error = box.error.new({ code = 500,
                                     reason = 'Internal server error' })

box.error(custom_error)
--[[
---
- error: Internal server error
...
--]]

box.error.new() provides different overloads for creating an error object with different parameters. These overloads are similar to the box.error() overloads described in the next section.

Raising an error

To raise an error, call the box.error() function. This function can accept the specified error parameters or an error object created using box.error.new(). In both cases, you can use box.error() to raise the following error types:

A custom error with the specified reason, code, and type.
A predefined Tarantool error.

Custom error

The following box.error() overloads are available for raising a custom error:

box.error({ reason = string[, code = number, type = string] }) accepts a Lua table containing the error reason, code, and type.
box.error(type, reason[, args]) accepts the error type, its reason, and optional arguments passed to a reason’s string.

Note

The same overloads are available for box.error.new().

box.error({ reason = string[, …] })

In the example below, box.error() accepts a Lua table with the specified error code and reason:

box.error { code = 500,
            reason = 'Custom server error' }
--[[
---
- error: Custom server error
...
--]]

The next example shows how to specify a custom error type:

box.error { code = 500,
            reason = 'Internal server error',
            type = 'CustomInternalError' }
--[[
---
- error: Internal server error
...
--]]

When a custom type is specified, it is returned in the error_object.type attribute. When it is not specified, error_object.type returns one of the built-in error types, such as ClientError or OutOfMemory.

box.error(type, reason[, …])

This example shows how to raise an error with the type and reason specified in the box.error() arguments:

box.error('CustomConnectionError', 'cannot connect to the given port')
--[[
---
- error: cannot connect to the given port
...
--]]

You can also use a format string to compose an error reason:

box.error('CustomConnectionError', '%s cannot connect to the port %u', 'client', 8080)
--[[
---
- error: client cannot connect to the port 8080
...
--]]

Tarantool error

The box.error(code[, …]) overload raises a predefined Tarantool error specified by its identifier. The error code defines the error message format and the number of required arguments. In the example below, no arguments are passed for the box.error.READONLY error code:

box.error(box.error.READONLY)
--[[
---
- error: Can't modify data on a read-only instance
...
--]]

For the box.error.NO_SUCH_USER error code, you need to pass one argument:

box.error(box.error.NO_SUCH_USER, 'John')
--[[
---
- error: User 'John' is not found
...
--]]

box.error.CREATE_SPACE requires two arguments:

box.error(box.error.CREATE_SPACE, 'my_space', 'the space already exists')
--[[
---
- error: 'Failed to create space ''my_space'': the space already exists'
...
--]]

Getting the last error

To get the last raised error, call box.error.last():

box.error.last()
--[[
---
- error: Internal server error
...
--]]

Obtaining error details

To get error details, call error_object:unpack(). Error details may include an error code, type, message, and trace.

box.error.last():unpack()
--[[
---
- code: 500
  base_type: CustomError
  type: CustomInternalError
  custom_type: CustomInternalError
  message: Internal server error
  trace:
  - file: '[string "custom_error = box.error.new({ code = 500,..."]'
    line: 1
...
--]]

Setting the last error

You can set the last error explicitly by calling box.error.set():

-- Create two errors --
local error1 = box.error.new({ code = 500, reason = 'Custom error 1' })
local error2 = box.error.new({ code = 505, reason = 'Custom error 2' })

-- Raise the first error --
box.error(error1)
--[[
---
- error: Custom error 1
...
--]]

-- Get the last error --
box.error.last()
--[[
---
- Custom error 1
...
--]]

-- Set the second error as the last error --
box.error.set(error2)
--[[
---
...
--]]

-- Get the last error --
box.error.last()
--[[
---
- Custom error 2
...
--]]

Error lists

error_object provides the API for organizing errors into lists. To set and get the previous error, use the error_object:set_prev() method and error_object.prev attribute.

local base_server_error = box.error.new({ code = 500,
                                          reason = 'Base server error',
                                          type = 'BaseServerError' })
local storage_server_error = box.error.new({ code = 507,
                                             reason = 'Not enough storage',
                                             type = 'StorageServerError' })

base_server_error:set_prev(storage_server_error)
--[[
---
...
--]]

box.error(base_server_error)
--[[
---
- error: Base server error
...
--]]

box.error.last().prev:unpack()
--[[
---
- code: 507
  base_type: CustomError
  type: StorageServerError
  custom_type: StorageServerError
  message: Not enough storage
  trace:
  - file: '[string "storage_server_error = box.error.new({ code =..."]'
    line: 1
...
--]]

Cycles are not allowed for error lists:

storage_server_error:set_prev(base_server_error)
--[[
---
- error: 'builtin/error.lua:120: Cycles are not allowed'
...
--]]

Setting the previous error does not erase its own previous members:

-- e1 -> e2 -> e3 -> e4
e1:set_prev(e2)
e2:set_prev(e3)
e3:set_prev(e4)
e2:set_prev(e5)
-- Now there are two lists: e1 -> e2 -> e5 and e3 -> e4
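A list built this way can be traversed by following the prev attribute until it returns nil. The sketch below collects the messages of a three-element list; the error codes and reasons are made up for illustration.

```lua
-- Build a list: top -> middle -> root
local top = box.error.new({code = 1, reason = 'Request failed'})
local middle = box.error.new({code = 2, reason = 'Storage call failed'})
local root = box.error.new({code = 3, reason = 'Disk is full'})
top:set_prev(middle)
middle:set_prev(root)

-- Walk the list from the newest error to the oldest one.
local messages = {}
local e = top
while e ~= nil do
    table.insert(messages, e.message)
    e = e.prev
end

print(table.concat(messages, ' <- '))
```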

IPROTO also supports stacked diagnostics. See details in MessagePack extensions – The ERROR type.

Clearing errors

To clear the errors, call box.error.clear().

box.error.clear()
--[[
---
...
--]]
box.error.last()
--[[
---
- null
...
--]]

API Reference

Below is a list of box.error functions and related objects.

Name	Use
box.error()	Raise the last error or the error defined by the specified parameters
box.error.last()	Get the last raised error
box.error.clear()	Clear the errors
box.error.new()	Create the error but do not raise it
box.error.set()	Set the specified error as the last system error explicitly
box.error.is()	Verify whether the specified argument is an error cdata object
error_object	An object that defines an error

box.error()

box.error()

Raise the last error.

See also: box.error.last()

box.error(error_object)

Raise the error defined by error_object.

Parameters:	error_object (`error_object`) – an error object

Example

local custom_error = box.error.new({ code = 500,
                                     reason = 'Internal server error' })

box.error(custom_error)
--[[
---
- error: Internal server error
...
--]]

box.error({ reason = string[, code = number, type = string] })

Raise the error defined by the specified parameters.

Parameters:
reason (`string`) – an error description
code (`integer`) – (optional) a numeric code for this error
type (`string`) – (optional) an error type

Example 1

box.error { code = 500,
            reason = 'Custom server error' }
--[[
---
- error: Custom server error
...
--]]

Example 2: custom type

box.error { code = 500,
            reason = 'Internal server error',
            type = 'CustomInternalError' }
--[[
---
- error: Internal server error
...
--]]

box.error(type, reason[, ...])

Raise the error defined by the specified type and description.

Parameters:
type (`string`) – an error type
reason (`string`) – an error description
... – description arguments

Example 1: without arguments

box.error('CustomConnectionError', 'cannot connect to the given port')
--[[
---
- error: cannot connect to the given port
...
--]]

Example 2: with arguments

box.error('CustomConnectionError', '%s cannot connect to the port %u', 'client', 8080)
--[[
---
- error: client cannot connect to the port 8080
...
--]]

box.error(code[, ...])

Raise a predefined Tarantool error specified by its identifier. You can see all Tarantool errors in the errcode.h file.

Parameters:
code (`number`) – a predefined error identifier; Lua constants that correspond to those Tarantool errors are defined as members of `box.error`, for example, `box.error.NO_SUCH_USER == 45`
... – description arguments

Example 1: no arguments

box.error(box.error.READONLY)
--[[
---
- error: Can't modify data on a read-only instance
...
--]]

Example 2: one argument

box.error(box.error.NO_SUCH_USER, 'John')
--[[
---
- error: User 'John' is not found
...
--]]

Example 3: two arguments

box.error(box.error.CREATE_SPACE, 'my_space', 'the space already exists')
--[[
---
- error: 'Failed to create space ''my_space'': the space already exists'
...
--]]

box.error.last()

box.error.last()

Get the last raised error.

Return:	an error_object representing the last error

Example

box.error.last()
--[[
---
- error: Internal server error
...
--]]

See also: error_object:unpack()

box.error.clear()

box.error.clear()

Clear the errors.

Example

box.error.clear()
--[[
---
...
--]]
box.error.last()
--[[
---
- null
...
--]]

box.error.new()

box.error.new({ reason = string[, code = number, type = string] })

Create an error object with the specified parameters.

Parameters:
reason (`string`) – an error description
code (`integer`) – (optional) a numeric code for this error
type (`string`) – (optional) an error type

Example 1

local custom_error = box.error.new({ code = 500,
                                     reason = 'Internal server error' })

box.error(custom_error)
--[[
---
- error: Internal server error
...
--]]

Example 2: custom type

local custom_error = box.error.new({ code = 500,
                                     reason = 'Internal server error',
                                     type = 'CustomInternalError' })

box.error(custom_error)
--[[
---
- error: Internal server error
...
--]]

box.error.new(type, reason[, ...])

Create an error object with the specified type and description.

Parameters:
type (`string`) – an error type
reason (`string`) – an error description
... – description arguments

Example

local custom_error = box.error.new('CustomInternalError', 'Internal server error')

box.error(custom_error)
--[[
---
- error: Internal server error
...
--]]

box.error.new(code[, ...])

Create a predefined Tarantool error specified by its identifier. You can see all Tarantool errors in the errcode.h file.

Parameters:
code (`number`) – a predefined error identifier; Lua constants that correspond to those Tarantool errors are defined as members of `box.error`, for example, `box.error.NO_SUCH_USER == 45`
... – description arguments

Example 1: one argument

local custom_error = box.error.new(box.error.NO_SUCH_USER, 'John')

box.error(custom_error)
--[[
---
- error: User 'John' is not found
...
--]]

Example 2: two arguments

local custom_error = box.error.new(box.error.CREATE_SPACE, 'my_space', 'the space already exists')

box.error(custom_error)
--[[
---
- error: 'Failed to create space ''my_space'': the space already exists'
...
--]]

box.error.set()

box.error.set(error_object)

Since: 2.4.1

Set the specified error as the last system error explicitly. This error is returned by box.error.last().

Parameters:	error_object (`error_object`) – an error object

Example

-- Create two errors --
local error1 = box.error.new({ code = 500, reason = 'Custom error 1' })
local error2 = box.error.new({ code = 505, reason = 'Custom error 2' })

-- Raise the first error --
box.error(error1)
--[[
---
- error: Custom error 1
...
--]]

-- Get the last error --
box.error.last()
--[[
---
- Custom error 1
...
--]]

-- Set the second error as the last error --
box.error.set(error2)
--[[
---
...
--]]

-- Get the last error --
box.error.last()
--[[
---
- Custom error 2
...
--]]

box.error.is()

box.error.is(object)

Since: 3.2.0

The box.error.is function checks whether the specified argument is an error cdata object.

Parameters:	object (`object`) – the object to be verified.

Return type: boolean

Example

tarantool> box.error.is(box.error.new(box.error.UNKNOWN))
---
- true
...
tarantool> box.error.is('foo')
---
- false
...

error_object

object error_object

An object that defines an error. error_object is returned by the following functions: box.error.new() and box.error.last().

error_object:unpack()

Get error details that may include an error code, type, message, and trace.

Example

box.error.last():unpack()
--[[
---
- code: 500
  base_type: CustomError
  type: CustomInternalError
  custom_type: CustomInternalError
  message: Internal server error
  trace:
  - file: '[string "custom_error = box.error.new({ code = 500,..."]'
    line: 1
...
--]]

Note

Depending on the error type, error details may include other attributes, such as errno or reason.

error_object:raise()

Raise the current error.

See also: Raising an error

error_object:set_prev(error_object)

Since: 2.4.1

Set the previous error for the current one.

Parameters:	error_object (`error_object`) – an error object

See also: Error lists

error_object.prev

Since: 2.4.1

Get the previous error for the current one.

Rtype:	error_object

See also: Error lists

error_object.code

The error code. This attribute may return a custom error code or a Tarantool error code.

Rtype:	number

error_object.type

The error type.

Rtype:	string

See also: Custom error

error_object.message

The error message.

Rtype:	string

error_object.trace

The error trace.

Rtype:	table

error_object.errno

If the error is a system error (for example, a socket or file IO failure), returns a C standard error number.

Rtype:	number

error_object.reason

Since: 2.10.0

Returns the box.info.ro_reason value at the moment of throwing the box.error.READONLY error.

The following values may be returned:

election if the instance has box.cfg.election_mode set to a value other than off and this instance is not a leader. In this case, error_object may include the following attributes: state, leader_id, leader_uuid, and term.
synchro if the synchronous queue has an owner that is not the given instance. This error usually happens if synchronous replication is used and another instance has called box.ctl.promote(). In this case, error_object may include the queue_owner_id, queue_owner_uuid, and term attributes.
config if box.cfg.read_only is set to true.
orphan if the instance is in the orphan state.

Note

If multiple reasons are true at the same time, then only one is returned in the following order of preference: election, synchro, config, orphan.

Rtype:	string
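A handler might branch on this attribute to produce a more specific diagnostic. The helper below is a sketch under assumptions: describe_write_failure is a hypothetical name, and the error is raised manually with box.error(box.error.READONLY) only to keep the snippet self-contained, so its reason may be absent and is reported as unknown.

```lua
-- Hypothetical helper: turn a caught error into a short diagnostic string.
local function describe_write_failure(err)
    if type(err) ~= 'cdata' or err.code ~= box.error.READONLY then
        return 'not a read-only error: ' .. tostring(err)
    end
    local reason = err.reason or 'unknown'
    if reason == 'election' then
        return ('read-only: not the leader (state=%s, term=%s)'):format(
            tostring(err.state), tostring(err.term))
    elseif reason == 'synchro' then
        return ('read-only: queue owned by instance %s'):format(
            tostring(err.queue_owner_id))
    end
    return 'read-only: ' .. reason
end

-- Raised manually here; on a real read-only instance the error would
-- come from a failed write such as an insert.
local ok, err = pcall(function() box.error(box.error.READONLY) end)
print(describe_write_failure(err))
```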

error_object.state

Since: 2.10.0

For the box.error.READONLY error, returns the current state of a replica set node with regard to leader election (see box.info.election.state). This attribute is present if the error reason is election.

Rtype:	string

error_object.leader_id

Since: 2.10.0

For the box.error.READONLY error, returns a numeric identifier (box.info.id) of the replica set leader. This attribute may be present if the error reason is election.

Rtype:	number

error_object.leader_uuid

Since: 2.10.0

For the box.error.READONLY error, returns a globally unique identifier (box.info.uuid) of the replica set leader. This attribute may be present if the error reason is election.

error_object.queue_owner_id

Since: 2.10.0

For the box.error.READONLY error, returns a numeric identifier (box.info.id) of the synchronous queue owner. This attribute may be present if the error reason is synchro.

Rtype:	number

error_object.queue_owner_uuid

Since: 2.10.0

For the box.error.READONLY error, returns a globally unique identifier (box.info.uuid) of the synchronous queue owner. This attribute may be present if the error reason is synchro.

error_object.term

Since: 2.10.0

For the box.error.READONLY error, returns the current election term (see box.info.election.term). This attribute may be present if the error reason is election or synchro.

error_object.name

Since: 3.1.0

Returns the name of the error.

Submodule box.index

The box.index submodule provides read-only access to index definitions and index keys. Indexes are contained in the box.space.space-name.index array within each space object. They provide an API for ordered iteration over tuples. This API is a direct binding to the corresponding methods of index objects of type box.index in the storage engine.

Below is a list of all box.index functions and members.

Name	Use
Examples for box.index	Some useful examples
space_object:create_index()	Create an index
index_object.unique	Flag, true if an index is unique
index_object.type	Index type
index_object.parts	Array of index key fields
index_object:pairs()	Prepare for iterating
index_object:select()	Select one or more tuples via index
index_object:get()	Select a tuple via index
index_object:min()	Find the minimum value in index
index_object:max()	Find the maximum value in index
index_object:random()	Find a random value in index
index_object:count()	Count tuples matching key value
index_object:update()	Update a tuple
index_object:delete()	Delete a tuple by key
index_object:alter()	Alter an index
index_object:drop()	Drop an index
index_object:rename()	Rename an index
index_object:bsize()	Get count of bytes for an index
index_object:stat()	Get statistics for an index
index_object:compact()	Remove unused index space
index_object:tuple_pos()	Return a tuple’s position for an index
index_object extensions	Any function / method that any user wants to add

Examples for `box.index`

Example showing use of the box functions

This example will work with the sandbox configuration described in the preface. That is, there is a space named tester with a numeric primary key. The example function will:

Select a tuple whose key value is 1000;
Raise an error if the tuple already exists and already has 3 fields;
Insert or replace the tuple with:
- field[1] = 1000
- field[2] = a uuid
- field[3] = number of seconds since 1970-01-01;
Get field[3] from what was replaced;
Format the value from field[3] as yyyy-mm-dd hh:mm:ss.ffff;
Return the formatted value.

The function uses Tarantool box functions box.space…select, box.space…replace, fiber.time, uuid.str. The function uses Lua functions os.date() and string.sub().

function example()
  local fiber = require('fiber')
  local uuid = require('uuid')

  -- Raise an error if tuple 1000 already exists and has 3 fields
  local selected_tuples = box.space.tester:select{1000}
  if selected_tuples[1] ~= nil and #selected_tuples[1] == 3 then
    box.error({code = 1, reason = 'This tuple already has 3 fields'})
  end

  -- Insert or replace the tuple; the time is stored as a string
  local replaced_tuple = box.space.tester:replace{
    1000, uuid.str(), tostring(fiber.time())
  }

  -- Format the stored time as yyyy-mm-dd hh:mm:ss.ffff
  local time_field = tonumber(replaced_tuple[3])
  local formatted_time_field = os.date('%Y-%m-%d %H:%M:%S', time_field)
  -- Take four digits of the fractional part, skipping the leading "0."
  local fraction = string.sub(tostring(time_field % 1), 3, 6)
  return formatted_time_field .. '.' .. fraction
end

… And here is what happens when one invokes the function:

tarantool> box.space.tester:delete(1000)
---
- [1000, '264ee2da03634f24972be76c43808254', '1391037015.6809']
...
tarantool> example(1000)
---
- 2014-01-29 16:11:51.1582
...
tarantool> example(1000)
---
- error: 'This tuple already has 3 fields'
...

Example showing a user-defined iterator

Here is an example that shows how to build one’s own iterator. The paged_iter function is an “iterator function”, a concept explained in the Lua manual section Iterators and Closures. It does paginated retrievals, that is, it returns one page of tuples at a time (10 tuples per page in the loop shown later) from a space named “t”, whose primary key was defined with create_index('primary',{parts={1,'string'}}).

function paged_iter(search_key, tuples_per_page)
  local iterator_string = "GE"
  return function ()
    local page = box.space.t.index[0]:select(search_key,
      {iterator = iterator_string, limit = tuples_per_page})
    if #page == 0 then return nil end
    search_key = page[#page][1]
    iterator_string = "GT"
    return page
  end
end

Programmers who use paged_iter do not need to know why it works; they only need to know that, if they call it within a loop, they will get 10 tuples at a time until there are no more tuples.

In this example the tuples are merely printed, a page at a time. But it should be simple to change the functionality, for example by yielding after each retrieval, or by breaking when the tuples fail to match some additional criteria.

for page in paged_iter("X", 10) do
  print("New Page. Number Of Tuples = " .. #page)
  for i = 1, #page, 1 do
    print(page[i])
  end
end
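To see why the GE-then-GT switch pages through the keys without repeating any, the same closure can be exercised outside the database. The sketch below swaps the index select for a hypothetical stand-in, select_sorted, that scans a sorted Lua array of string keys; it illustrates the paging logic only and does not touch box.space.

```lua
-- Hypothetical stand-in for index:select(key, {iterator = ..., limit = ...})
-- over a sorted array of string keys.
local function select_sorted(keys, search_key, iterator, limit)
    local page = {}
    for _, k in ipairs(keys) do
        local match = (iterator == 'GE' and k >= search_key)
            or (iterator == 'GT' and k > search_key)
        if match then
            page[#page + 1] = k
            if #page == limit then break end
        end
    end
    return page
end

-- Same structure as paged_iter: the first page uses GE, later pages use GT
-- starting from the last key already returned.
local function paged_iter_demo(keys, search_key, per_page)
    local iterator = 'GE'
    return function()
        local page = select_sorted(keys, search_key, iterator, per_page)
        if #page == 0 then return nil end
        search_key = page[#page]
        iterator = 'GT'
        return page
    end
end

local keys = {'A', 'X', 'Y', 'Z', 'ZA', 'ZB'}
local sizes = {}
for page in paged_iter_demo(keys, 'X', 2) do
    sizes[#sizes + 1] = #page
end
print(table.concat(sizes, ','))  -- 2,2,1
```

The key 'A' sorts before 'X' and is skipped; the remaining five keys arrive as pages of 2, 2, and 1.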

Example showing submodule `box.index` with index type = RTREE for spatial searches

This submodule may be used for spatial searches if the index type is RTREE. There are operations for searching rectangles (geometric objects with 4 corners and 4 sides) and boxes (geometric objects with more than 4 corners and more than 4 sides, sometimes called hyperrectangles). This manual uses the term rectangle-or-box for the whole class of objects that includes both rectangles and boxes. Only rectangles will be illustrated.

Rectangles are described according to their X-axis (horizontal axis) and Y-axis (vertical axis) coordinates in a grid of arbitrary size. Here is a picture of four rectangles on a grid with 11 horizontal points and 11 vertical points:

           X AXIS
           1   2   3   4   5   6   7   8   9   10  11
        1
        2  #-------+                                           <-Rectangle#1
Y AXIS  3  |       |
        4  +-------#
        5          #-----------------------+                   <-Rectangle#2
        6          |                       |
        7          |   #---+               |                   <-Rectangle#3
        8          |   |   |               |
        9          |   +---#               |
        10         +-----------------------#
        11                                     #               <-Rectangle#4

The rectangles are defined according to this scheme: {X-axis coordinate of top left, Y-axis coordinate of top left, X-axis coordinate of bottom right, Y-axis coordinate of bottom right} – or more succinctly: {x1,y1,x2,y2}. So in the picture … Rectangle#1 starts at position 1 on the X axis and position 2 on the Y axis, and ends at position 3 on the X axis and position 4 on the Y axis, so its coordinates are {1,2,3,4}. Rectangle#2’s coordinates are {3,5,9,10}. Rectangle#3’s coordinates are {4,7,5,9}. And finally Rectangle#4’s coordinates are {10,11,10,11}. Rectangle#4 is actually a “point” since it has zero width and zero height, so it could have been described with only two numbers: {10,11}.

Some relationships between the rectangles are: “Rectangle#1’s nearest neighbor is Rectangle#2”, and “Rectangle#3 is entirely inside Rectangle#2”.
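These two relationships can be checked by hand with a few lines of plain Lua. The sketch below is only an illustration of the geometry in the picture: the helper names are hypothetical, and measuring the distance between the closest points of two rectangles is an assumption made here, not a description of how the RTREE index works internally.

```lua
-- Rectangles from the picture, written as {x1, y1, x2, y2}.
local rectangles = {
  ['Rectangle#1'] = {1, 2, 3, 4},
  ['Rectangle#2'] = {3, 5, 9, 10},
  ['Rectangle#3'] = {4, 7, 5, 9},
  ['Rectangle#4'] = {10, 11, 10, 11},
}

-- True if `inner` lies entirely inside `outer` (shared sides are allowed).
local function is_inside(inner, outer)
  return inner[1] >= outer[1] and inner[2] >= outer[2]
     and inner[3] <= outer[3] and inner[4] <= outer[4]
end

-- Straight-line distance between the closest points of two rectangles.
local function gap(a, b)
  local dx = math.max(0, b[1] - a[3], a[1] - b[3])
  local dy = math.max(0, b[2] - a[4], a[2] - b[4])
  return math.sqrt(dx * dx + dy * dy)
end

-- Name of the rectangle closest to the one called `name`.
local function nearest_neighbor(name)
  local best_name, best_gap
  for other, rect in pairs(rectangles) do
    if other ~= name then
      local d = gap(rectangles[name], rect)
      if best_gap == nil or d < best_gap then
        best_name, best_gap = other, d
      end
    end
  end
  return best_name
end

print(is_inside(rectangles['Rectangle#3'], rectangles['Rectangle#2']))  -- true
print(nearest_neighbor('Rectangle#1'))  -- Rectangle#2
```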

Now let us create a space and add an RTREE index.

tarantool> s = box.schema.space.create('rectangles')
tarantool> i = s:create_index('primary', {
         >   type = 'HASH',
         >   parts = {1, 'unsigned'}
         > })
tarantool> r = s:create_index('rtree', {
         >   type = 'RTREE',
         >   unique = false,
         >   parts = {2, 'ARRAY'}
         > })

Field#1 doesn’t matter; we create it only because we need a primary-key index. (RTREE indexes cannot be unique and therefore cannot be primary-key indexes.) The second field must be an “array”, which means its values must represent {x,y} points or {x1,y1,x2,y2} rectangles. Now let us populate the space by inserting two tuples, containing the coordinates of Rectangle#2 and Rectangle#4.

tarantool> s:insert{1, {3, 5, 9, 10}}
tarantool> s:insert{2, {10, 11}}

And now, following the description of RTREE iterator types, we can search the rectangles with these requests:

tarantool> r:select({10, 11, 10, 11}, {iterator = 'EQ'})
---
- - [2, [10, 11]]
...
tarantool> r:select({4, 7, 5, 9}, {iterator = 'GT'})
---
- - [1, [3, 5, 9, 10]]
...
tarantool> r:select({1, 2, 3, 4}, {iterator = 'NEIGHBOR'})
---
- - [1, [3, 5, 9, 10]]
  - [2, [10, 11]]
...

Request#1 returns 1 tuple because the point {10,11} is the same as the rectangle {10,11,10,11} (“Rectangle#4” in the picture). Request#2 returns 1 tuple because the rectangle {4,7,5,9}, which was “Rectangle#3” in the picture, is entirely within {3,5,9,10}, which was Rectangle#2. Request#3 returns 2 tuples, because the NEIGHBOR iterator always returns all tuples, and the first returned tuple will be {3,5,9,10} (“Rectangle#2” in the picture) because it is the closest neighbor of {1,2,3,4} (“Rectangle#1” in the picture).

Now let us create a space and index for cuboids, which are three-dimensional rectangle-or-boxes that have 8 corners and 6 faces.

tarantool> s = box.schema.space.create('R')
tarantool> i = s:create_index('primary', {parts = {1, 'unsigned'}})
tarantool> r = s:create_index('S', {
         >   type = 'RTREE',
         >   unique = false,
         >   dimension = 3,
         >   parts = {2, 'ARRAY'}
         > })

The additional option here is dimension=3. The default dimension is 2, which is why it didn’t need to be specified for the rectangle examples. The maximum dimension is 20. Now for insertions and selections there will usually be 6 coordinates. For example:

tarantool> s:insert{1, {0, 3, 0, 3, 0, 3}}
tarantool> r:select({1, 2, 1, 2, 1, 2}, {iterator = box.index.GT})

Now let us create a space and index for Manhattan-style spatial objects, which are rectangle-or-boxes that have a different way to calculate neighbors.

tarantool> s = box.schema.space.create('R')
tarantool> i = s:create_index('primary', {parts = {1, 'unsigned'}})
tarantool> r = s:create_index('S', {
         >   type = 'RTREE',
         >   unique = false,
         >   distance = 'manhattan',
         >   parts = {2, 'ARRAY'}
         > })

The additional option here is distance='manhattan'. The default distance calculator is 'euclid', which is the straightforward as-the-crow-flies method. The alternative distance calculator is 'manhattan', which can be a more appropriate method if one is following the lines of a grid rather than traveling in a straight line.

tarantool> s:insert{1, {0, 3, 0, 3}}
tarantool> r:select({1, 2, 1, 2}, {iterator = box.index.NEIGHBOR})
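The difference between the two distance calculators can be seen in plain Lua. The functions below are an illustrative sketch for points, not Tarantool's implementation; their names are simply borrowed from the option values.

```lua
-- Straight-line ("as the crow flies") distance between two points.
local function euclid(a, b)
  local sum = 0
  for i = 1, #a do sum = sum + (a[i] - b[i]) ^ 2 end
  return math.sqrt(sum)
end

-- Grid distance: the sum of the absolute differences along each axis.
local function manhattan(a, b)
  local sum = 0
  for i = 1, #a do sum = sum + math.abs(a[i] - b[i]) end
  return sum
end

local from, to = {0, 0}, {3, 4}
print(euclid(from, to) == 5)     -- true
print(manhattan(from, to) == 7)  -- true
```

For the same pair of points the grid distance is never smaller than the straight-line distance, so the two calculators can rank neighbors differently.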

More examples of spatial searching are online in the file R tree index quick start and usage.

space_object:create_index()

object space_object

space_object:create_index(index_name[, index_opts])

Create an index.

It is mandatory to create an index for a space before trying to insert tuples into it or select tuples from it. The first created index will be used as the primary-key index, so it must be unique.

Parameters:

space_object (space_object) – an object reference
index_name (string) – name of index, which should conform to the rules for object names
index_opts (table) – index options (see index_opts)

Return:	index object
Rtype:	index_object

Possible errors:

too many parts
index ‘…’ already exists
primary key must be unique

Building or rebuilding a large index will cause occasional yields so that other requests will not be blocked. If the other requests cause an illegal situation such as a duplicate key in a unique index, building or rebuilding such index will fail.

Example:

-- Create a space --
bands = box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

index_opts

object index_opts

Index options that include the index name, type, identifiers of key fields, and so on. These options are passed to the space_object:create_index() method.

Note

These options are also passed to index_object:alter().

index_opts.type: The index type.

Type: string

Default: TREE

Possible values: TREE, HASH, RTREE, BITSET

index_opts.id: A unique numeric identifier of the index, which is generated automatically.

Type: number

Default: last index’s ID + 1

index_opts.unique

Specify whether an index may be unique. When true, the index cannot contain the same key value twice.

Type: boolean
Default: true

Example:

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

index_opts.if_not_exists: Specify whether to swallow an error on an attempt to create an index with a duplicated name.

Type: boolean

Default: false

index_opts.parts

Specify the index’s key parts.

Type: a table of key_part values
Default: {1, 'unsigned'}

Example:

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Note

Alternative way to declare index parts

Before version 2.7.1, if an index consisted of a single part and had some options like is_nullable or collation and its definition was written as

my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})

(with a single pair of braces around the part), the options were ignored by Tarantool.

Since version 2.7.1, the extra braces can be omitted in an index definition, so both forms work:

-- with extra braces
my_space:create_index('one_part_idx', {parts = {{1, 'unsigned', is_nullable=true}}})

-- without extra braces
my_space:create_index('one_part_idx', {parts = {1, 'unsigned', is_nullable=true}})

index_opts.dimension: The RTREE index dimension.

Type: number

Default: 2

index_opts.distance: The RTREE index distance type.

Type: string

Default: euclid

Possible values: euclid, manhattan

index_opts.sequence: Create a generator for indexes using a sequence object. Learn more from specifying a sequence in create_index().

Type: string or number

index_opts.func: Specify the identifier of the functional index function.

Type: string

index_opts.hint

Since: 2.6.1

Specify whether hint optimization is enabled for the TREE index:

If true, the index works faster.
If false, the index size is reduced by half.

Type: boolean
Default: true

index_opts.bloom_fpr

Vinyl only

Specify the bloom filter’s false positive rate.

Type: number
Default: vinyl.bloom_fpr

index_opts.page_size

Vinyl only

Specify the size of a page used for read and write disk operations.

Type: number
Default: vinyl.page_size

index_opts.range_size

Vinyl only

Specify the default maximum range size (in bytes) for a vinyl index.

Type: number
Default: vinyl.range_size

index_opts.run_count_per_level

Vinyl only

Specify the maximum number of runs per level in the LSM tree.

Type: number
Default: vinyl.run_count_per_level

index_opts.run_size_ratio

Vinyl only

Specify the ratio between the sizes of different levels in the LSM tree.

Type: number
Default: vinyl.run_size_ratio

index_opts.layout

MemCS only

Specify how a column within the index is physically stored.

Possible values:

If not set (or set to plain), the default plain layout is used.
If set to null_rle, run-length encoding of NULL values is used. Applies to nullable columns that are not listed in index parts.

For example:

local format = {
    { 'c1', 'unsigned' },
    { 'c2', 'unsigned', is_nullable = true },
    { 'c3', 'unsigned', is_nullable = true },
    { 'c4', 'unsigned' },
    { 'c5', 'unsigned', is_nullable = true },
}

box.schema.create_space('test', {
    engine = 'memcs', format = format, field_count = #format
})

box.space.test:create_index('primary', {
    parts = { 'c1' }, layout = 'null_rle'
})

box.space.test:create_index('secondary', {
    parts = { 'c1', 'c2' }, covers = { 'c3', 'c4' }, layout = 'null_rle'
})

In this example, the null_rle layout is applied to c2, c3, c5 in the primary index, and to c3 in the secondary index.

Type: string
Default: not set

key_part

object key_part

A descriptor of a single part in a multipart key. A table of parts is passed to the index_opts.parts option.

key_part.field

Specify the field number or name.

Note

To create a key part by a field name, you need to specify space_object:format() first.

Type: string or number

Examples: Creating an index using field names and numbers

key_part.type: Specify the field type. If the field type is specified in space_object:format(), key_part.type inherits this value.

Type: string

Default: scalar

Possible values: listed in Indexed field types

key_part.collation

Specify the collation used to compare field values. If the field collation is specified in space_object:format(), key_part.collation inherits this value.

Type: string
Possible values: listed in the box.space._collation system space

Example:

-- Create a space --
box.schema.space.create('tester')

-- Use the 'unicode' collation --
box.space.tester:create_index('unicode', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode' } } })

-- Use the 'unicode_ci' collation --
box.space.tester:create_index('unicode_ci', { parts = { { field = 1,
                                                        type = 'string',
                                                        collation = 'unicode_ci' } } })

-- Insert test data --
box.space.tester:insert { 'ЕЛЕ' }
box.space.tester:insert { 'елейный' }
box.space.tester:insert { 'ёлка' }

-- Returns nil --
select_unicode = box.space.tester.index.unicode:select({ 'ЁлКа' })
-- Returns 'ёлка' --
select_unicode_ci = box.space.tester.index.unicode_ci:select({ 'ЁлКа' })

key_part.is_nullable

Specify whether nil (or its equivalent such as msgpack.NULL) can be used as a field value. If the is_nullable option is specified in space_object:format(), key_part.is_nullable inherits this value.

You can set this option to true if:

the index type is TREE
the index is not the primary index

It is also legal to insert nothing at all when using trailing nullable fields. Within indexes, such null values are always treated as equal to other null values and are always treated as less than non-null values. Nulls may appear multiple times even in a unique index.

Type: boolean
Default: false

Example:

box.space.tester:create_index('I', {unique = true, parts = {{field = 2, type = 'number', is_nullable = true}}})
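The ordering rule described above (null values compare equal to each other and sort before every non-null value) can be modeled in plain Lua. This is only an illustrative sketch: NULL is a hypothetical sentinel standing in for a null field value, and the comparator is not Tarantool's own code.

```lua
-- Hypothetical sentinel standing in for a null field value.
local NULL = {}

-- Comparator: a null sorts before any non-null value; two nulls are equal,
-- so neither of them is "less than" the other.
local function null_first_less(a, b)
  if a == NULL then return b ~= NULL end
  if b == NULL then return false end
  return a < b
end

local keys = {7, NULL, 3, NULL, 5}
table.sort(keys, null_first_less)
-- keys is now {NULL, NULL, 3, 5, 7}
```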

Warning

It is legal to create multiple indexes for the same field with different is_nullable values or to call space_object:format() with a different is_nullable value from what is used for an index. When there is a contradiction, the rule is: null is illegal unless is_nullable=true for every index and for the space format.

key_part.exclude_null

Since: 2.8.2

Specify whether an index can skip tuples with null at this key part. You can set this option to true if:

the index type is TREE
the index is not the primary index

If exclude_null is set to true, is_nullable is set to true automatically. Note that this option can be changed dynamically. In this case, the index is rebuilt.

Such indexes do not store filtered tuples at all, so indexing can be done faster.

Type: boolean
Default: false

key_part.path

Specify the path string for a map field.

Type: string

See the examples below:

Creating an index using the path option for map fields
Creating a multikey index using the path option with [*]

Examples

Creating an index using field names and numbers

create_index() can use field names or field numbers to define key parts.

Example 1 (field names):

To create a key part by a field name, you need to specify space_object:format() first.

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 'id' } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 'band_name' } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 'year' } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

Example 2 (field numbers):

-- Create a primary index --
box.space.bands:create_index('primary', { parts = { 1 } })

-- Create a unique secondary index --
box.space.bands:create_index('band', { parts = { 2 } })

-- Create a non-unique secondary index --
box.space.bands:create_index('year', { parts = { { 3 } }, unique = false })

-- Create a multi-part index --
box.space.bands:create_index('year_band', { parts = { 3, 2 } })

Creating an index using the path option for map fields (JSON-path indexes)

To create an index for a field that is a map (a path string and a scalar value), specify the path string during index creation, like this:

parts = {field-number, 'data-type', path = 'path-name'}

The index type must be TREE or HASH and the contents of the field must always be maps with the same path.

Example 1 – The simplest use of path:

box.schema.space.create('space1')
box.space.space1:create_index('primary', { parts = { { field = 1,
                                                       type = 'scalar',
                                                       path = 'age' } } })
box.space.space1:insert({ { age = 44 } })
box.space.space1:select(44)

Example 2 – path plus format() plus JSON syntax to add clarity:

box.schema.space.create('space2')
box.space.space2:format({ { 'id', 'unsigned' }, { 'data', 'map' } })
box.space.space2:create_index('info', { parts = { { 'data.full_name["firstname"]', 'str' },
                                                  { 'data.full_name["surname"]', 'str' } } })
box.space.space2:insert({ 1, { full_name = { firstname = 'John', surname = 'Doe' } } })
box.space.space2:select { 'John' }

Creating a multikey index using the path option with [*]

The string in a path option can contain [*] which is called an array index placeholder. Indexes defined with this are useful for JSON documents that all have the same structure.

For example, when creating an index on field#2 for a string document that will start with {'data': [{'name': '...'}, {'name': '...'}], the parts section in the create_index request could look like:

parts = {{field = 2, type = 'str', path = 'data[*].name'}}

Then tuples containing names can be retrieved quickly with index_object:select({key-value}).

A single field can have multiple keys, as in this example which retrieves the same tuple twice because there are two keys ‘A’ and ‘B’ which both match the request:

my_space = box.schema.space.create('json_documents')
my_space:create_index('primary')
multikey_index = my_space:create_index('multikey', {parts = {{field = 2, type = 'str', path = 'data[*].name'}}})
my_space:insert({1,
         {data = {{name = 'A'},
                  {name = 'B'}},
          extra_field = 1}})
multikey_index:select({''}, {iterator = 'GE'})

The result of the select request looks like this:

tarantool> multikey_index:select({''},{iterator='GE'})
---
- - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
  - [1, {'data': [{'name': 'A'}, {'name': 'B'}], 'extra_field': 1}]
...
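Why one tuple yields two index keys can be shown with a plain-Lua sketch. The hypothetical multikey_keys function below mimics what a path such as data[*].name selects from a document; it is an illustration, not the code Tarantool runs.

```lua
-- Collect one key per array element for a path of the form
-- <array_field>[*].<key_field>, for example data[*].name.
local function multikey_keys(document, array_field, key_field)
  local keys = {}
  for _, element in ipairs(document[array_field] or {}) do
    keys[#keys + 1] = element[key_field]
  end
  return keys
end

local document = {data = {{name = 'A'}, {name = 'B'}}, extra_field = 1}
local keys = multikey_keys(document, 'data', 'name')
print(#keys, keys[1], keys[2])  -- prints 2, A and B
```

Each extracted key points at the same tuple, which is why a range scan over the index visits that tuple once per key.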

The following restrictions exist:

[*] must be alone or must be at the end of a name in the path.
[*] must not appear twice in the path.
If an index has a path with x[*], then no other index can have a path with x.component.
[*] must not appear in the path of a primary key.
If an index has unique=true and has a path with [*], then duplicate keys from different tuples are disallowed, but duplicate keys for the same tuple are allowed.
The field’s value must have the same structure as in the path definition, or be nil (nil is not indexed).
In a space with multikey indexes, no tuple can contain more than ~8,000 elements indexed that way.

Creating a functional index

Functional indexes are indexes that call a user-defined function for forming the index key, rather than depending entirely on the Tarantool default formation. Functional indexes are useful for condensing, truncating, reversing, or otherwise customizing index keys.

There are several requirements and restrictions for building functional indexes:

The function definition must expect a tuple, which has the contents of fields at the time a data-change request happens, and must return a tuple, which has the contents that will be put in the index.
The create_index definition must include the specification of all key parts, and the custom function must return a table that has the same number of key parts with the same types.
The space must have a memtx engine.
The function must be persistent and deterministic (see Creating a function with body).
The key parts must not depend on JSON paths.
The function must access key-part values by index, not by field name.
Functional indexes must not be primary-key indexes.
Functional indexes cannot be altered and the function cannot be changed if it is used for an index, so the only way to change them is to drop the index and create it again.
Only sandboxed functions are suitable for functional indexes.

Example:

A function could make a key using only the first letter of a string field.

Create a space. The space needs a primary-key field, which is not the field that we will use for the functional index:

box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })

Create a function. The function expects a tuple. In this example, it will work on tuple[2] because the key source is field number 2 in what we will insert. Use string.sub() from the string module to get the first character:
```
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
```

Make the function persistent using the box.schema.func.create function:

box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })

Create a functional index. Specify the fields whose values will be passed to the function. Specify the function:

box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })

Insert a few tuples. Select using only the first letter; this works because that is the key. Or, select using the same function as was used for insertion:

box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))

The results of the two select requests will look like this:

tarantool> box.space.tester.index.func_index:select('w')
---
- - ['a', 'wombat']
...
tarantool> box.space.tester.index.func_index:select(box.func.my_func:call({{'tester','wombat'}}));
---
- - ['a', 'wombat']
...

Here is the full code of the example:

box.schema.space.create('tester')
box.space.tester:create_index('i', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]
box.schema.func.create('my_func',
        { body = function_code, is_deterministic = true, is_sandboxed = true })
box.space.tester:create_index('func_index', { parts = { { field = 1, type = 'string' } },
                                              func = 'my_func' })
box.space.tester:insert({ 'a', 'wombat' })
box.space.tester:insert({ 'b', 'rabbit' })
box.space.tester.index.func_index:select('w')
box.space.tester.index.func_index:select(box.func.my_func:call({ { 'tester', 'wombat' } }))
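To see which key the function produces for a given tuple, its body can be compiled and called in plain Lua, outside Tarantool. This is only a sketch for experimenting: Tarantool compiles and sandboxes the persistent function itself, and the compile helper below is a hypothetical name.

```lua
-- Same function body as in the example above.
local function_code = [[function(tuple) return {string.sub(tuple[2],1,1)} end]]

-- Compile the body: loadstring exists in Lua 5.1/LuaJIT, load in Lua 5.2+.
local compile = loadstring or load
local my_func = compile('return ' .. function_code)()

-- The key is a one-element table holding the first letter of field 2.
local key = my_func({'a', 'wombat'})
print(key[1])  -- w
```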

Functions for functional indexes can return multiple keys. Such functions are called “multikey” functions.

To create a multikey function, the options of box.schema.func.create() must include is_multikey = true. The return value must be a table of tuples. If a multikey function returns N tuples, then N keys will be added to the index.

Example:

tester = box.schema.space.create('withdata')
tester:format({ { name = 'name', type = 'string' },
                { name = 'address', type = 'string' } })
name_index = tester:create_index('name', { parts = { { field = 1, type = 'string' } } })
function_code = [[function(tuple)
       local address = string.split(tuple[2])
       local ret = {}
       for _, v in pairs(address) do
         table.insert(ret, {utf8.upper(v)})
       end
       return ret
     end]]
box.schema.func.create('address',
        { body = function_code,
          is_deterministic = true,
          is_sandboxed = true,
          is_multikey = true })
addr_index = tester:create_index('addr', { unique = false,
                                           func = 'address',
                                           parts = { { field = 1, type = 'string',
                                                  collation = 'unicode_ci' } } })
tester:insert({ "James", "SIS Building Lambeth London UK" })
tester:insert({ "Sherlock", "221B Baker St Marylebone London NW1 6XE UK" })
addr_index:select('Uk')
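The keys generated by a multikey function can be inspected with a plain-Lua stand-in. The sketch below assumes whitespace splitting via string.gmatch and ASCII upper-casing via string.upper in place of the string.split and utf8.upper calls used above; address_keys is a hypothetical name.

```lua
-- Plain-Lua stand-in for the 'address' function: one key per word.
-- string.upper handles ASCII only, which is enough for these samples.
local function address_keys(tuple)
  local ret = {}
  for word in string.gmatch(tuple[2], '%S+') do
    table.insert(ret, {string.upper(word)})
  end
  return ret
end

local keys = address_keys({'James', 'SIS Building Lambeth London UK'})
print(#keys)       -- 5
print(keys[5][1])  -- UK
```

Because the index part uses the case-insensitive unicode_ci collation, the search value 'Uk' should match the stored key 'UK'.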

index_object.unique

object index_object

index_object.unique

true if the index is unique, false if the index is not unique.

Rtype:	boolean

See also: index_opts.unique

index_object.type

object index_object

index_object.type

The index type.

Rtype:	string

See also: index_opts.type

index_object.parts

object index_object

index_object.parts

The index’s key parts. Since version 3.0.0, index_object.parts provides the methods extract_key(), compare(), compare_with_key(), and merge().

Since version 3.1.0, index_object.parts also provides the methods validate_key(), validate_full_key(), validate_tuple(), and compare_keys().

index_object.parts example

box.schema.space.create('T')
i = box.space.T:create_index('I', {parts={3, 'string', 1, 'unsigned'}})
box.space.T:insert{1, 99.5, 'X', nil, 99.5}
i.parts:extract_key(box.space.T:get({'X', 1}))

key_def equivalent

key_def = require('key_def')
box.schema.space.create('T')
i = box.space.T:create_index('I', {parts={3, 'string', 1, 'unsigned'}})
box.space.T:insert{1, 99.5, 'X', nil, 99.5}
k = key_def.new(i.parts)
k:extract_key(box.space.T:get({'X', 1}))

The results of calling these methods are described in key_def_object.

Rtype:	table

See also: index_opts.parts

index_object:pairs()

object index_object

index_object:pairs([key[, {iterator = iterator-type}]])

Search for a tuple or a set of tuples via the given index, and allow iterating over one tuple at a time. To search by the primary index in the specified space, use the space_object:pairs() method.

The key parameter specifies what must match within the index.

Note

key is only used to find the first match. Do not assume all matched tuples will contain the key.

The iterator parameter specifies the rule for matching and ordering. Different index types support different iterators. For example, a TREE index maintains a strict order of keys and can return all tuples in ascending or descending order, starting from the specified key. Other index types, however, do not support ordering.

To understand consistency of tuples returned by an iterator, it’s essential to know the principles of the Tarantool transaction processing subsystem. An iterator in Tarantool does not own a consistent read view. Instead, each procedure is granted exclusive access to all tuples and spaces until there is a “context switch”, which may happen due to the implicit yield rules or an explicit call to fiber.yield. When the execution flow returns to the yielded procedure, the data set could have changed significantly. Iteration resumed after a yield point does not preserve the read view but continues with the new content of the database. The tutorial Indexed pattern search shows one way that iterators and yields can be used together.

For information about iterators’ internal structures, see the “Lua Functional library” documentation.

Parameters:

index_object (index_object) – an object reference.
key (scalar/table) – value to be matched against the index key, which may be multi-part.
iterator – as defined in tables below. The default iterator type is ‘EQ’.
after – a tuple or the position of a tuple (tuple_pos) after which pairs starts the search. You can pass an empty string or box.NULL to this option to start the search from the first tuple.

Return:

The iterator, which can be used in a for/end loop or with totable().

Possible errors:

no such space
wrong type
selected iteration type is not supported for the index type
key is not supported for the iteration type
iterator position is invalid

Complexity factors: Index size, Index type; Number of tuples accessed.

A search-key-value can be a number (for example 1234), a string (for example 'abcd'), or a table of numbers and strings (for example {1234, 'abcd'}). Each part of a key will be compared to each part of an index key.

The returned tuples will be in order by index key value, or by the hash of the index key value if index type = ‘hash’. If the index is non-unique, then duplicates will be secondarily in order by primary key value. The order will be reversed if the iterator type is ‘LT’ or ‘LE’ or ‘REQ’.

Iterator types for TREE indexes

Iterator type	Arguments	Description
box.index.EQ or ‘EQ’	search value	The comparison operator is ‘==’ (equal to). If an index key is equal to a search value, it matches. Tuples are returned in ascending order by index key. This is the default.
box.index.REQ or ‘REQ’	search value	Matching is the same as for box.index.EQ. Tuples are returned in descending order by index key.
box.index.GT or ‘GT’	search value	The comparison operator is ‘>’ (greater than). If an index key is greater than a search value, it matches. Tuples are returned in ascending order by index key.
box.index.GE or ‘GE’	search value	The comparison operator is ‘>=’ (greater than or equal to). If an index key is greater than or equal to a search value, it matches. Tuples are returned in ascending order by index key.
box.index.ALL or ‘ALL’	search value	Same as box.index.GE.
box.index.LT or ‘LT’	search value	The comparison operator is ‘<’ (less than). If an index key is less than a search value, it matches. Tuples are returned in descending order by index key.
box.index.LE or ‘LE’	search value	The comparison operator is ‘<=’ (less than or equal to). If an index key is less than or equal to a search value, it matches. Tuples are returned in descending order by index key.

Informally, searches with TREE indexes generally behave the way users find intuitive, provided that there are no nils and no missing parts. Formally, the logic is as follows. A search key has zero or more parts, for example {}, {1,2,3}, {1,nil,3}. An index key has one or more parts, for example {1}, {1,2,3}. A search key may contain nil (but not msgpack.NULL, which is the wrong type). An index key may not contain nil or msgpack.NULL, although a later version of Tarantool will have different rules – the behavior of searches with nil is subject to change. Possible iterators are LT, LE, EQ, REQ, GE, GT. A search key is said to “match” an index key if the following statements, which are pseudocode for the comparison operation, return TRUE.

If (number-of-search-key-parts > number-of-index-key-parts) return ERROR
If (number-of-search-key-parts == 0) return TRUE
for (i = 1; ; ++i)
{
  if (i > number-of-search-key-parts) OR (search-key-part[i] is nil)
  {
    if (iterator is LT or GT) return FALSE
    return TRUE
  }
  if (type of search-key-part[i] is not compatible with type of index-key-part[i])
  {
    return ERROR
  }
  if (search-key-part[i] == index-key-part[i])
  {
    continue
  }
  if (search-key-part[i] > index-key-part[i])
  {
    if (iterator is EQ or REQ or GE or GT) return FALSE
    return TRUE
  }
  if (search-key-part[i] < index-key-part[i])
  {
    if (iterator is EQ or REQ or LE or LT) return FALSE
    return TRUE
  }
}
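The matching rule can also be written as a small plain-Lua function for experimenting. The sketch follows the definitions in the iterator table for TREE indexes: an index key matches GT or GE when it is greater than the search key, and LT or LE when it is less. The name key_matches is hypothetical, keys are assumed to contain no nil parts, and the type-compatibility check is not modeled.

```lua
-- Returns true if `index_key` matches `search_key` under `iterator`.
-- Keys are arrays of numbers or strings; a shorter search key is a
-- partial key that is compared part by part from the left.
local function key_matches(search_key, index_key, iterator)
  if #search_key > #index_key then
    error('search key has more parts than the index key')
  end
  if #search_key == 0 then return true end
  for i = 1, #index_key do
    local s, k = search_key[i], index_key[i]
    if s == nil then
      -- Search key exhausted: every compared part was equal.
      return iterator ~= 'LT' and iterator ~= 'GT'
    end
    if k < s then
      return iterator == 'LT' or iterator == 'LE'
    elseif k > s then
      return iterator == 'GT' or iterator == 'GE'
    end
  end
  -- All parts are equal: strict comparisons fail, the others match.
  return iterator ~= 'LT' and iterator ~= 'GT'
end

print(key_matches({5}, {3}, 'GT'))     -- false: 3 is not greater than 5
print(key_matches({5}, {7}, 'GT'))     -- true
print(key_matches({1}, {1, 2}, 'EQ'))  -- true: partial key, first part equal
print(key_matches({1}, {1, 2}, 'GT'))  -- false
```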

Iterator types for HASH indexes

Type	Arguments	Description
box.index.ALL	none	All index keys match. Tuples are returned in ascending order by hash of index key, which will appear to be random.
box.index.EQ or ‘EQ’	search value	The comparison operator is ‘==’ (equal to). If an index key is equal to a search value, it matches. The number of returned tuples will be 0 or 1. This is the default.

Iterator types for BITSET indexes

Type	Arguments	Description
box.index.ALL or ‘ALL’	none	All index keys match. Tuples are returned in their order within the space.
box.index.EQ or ‘EQ’	bitset value	If an index key is equal to a bitset value, it matches. Tuples are returned in their order within the space. This is the default.
box.index.BITS_ALL_SET	bitset value	If all of the bits which are 1 in the bitset value are 1 in the index key, it matches. Tuples are returned in their order within the space.
box.index.BITS_ANY_SET	bitset value	If any of the bits which are 1 in the bitset value are 1 in the index key, it matches. Tuples are returned in their order within the space.
box.index.BITS_ALL_NOT_SET	bitset value	If all of the bits which are 1 in the bitset value are 0 in the index key, it matches. Tuples are returned in their order within the space.
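The three bit-oriented conditions above can be expressed as plain-Lua predicates. This is a sketch for checking the definitions by hand, not the BITSET index implementation; the function names are hypothetical, and band is a portable bitwise AND written out so that the example does not depend on a particular Lua version.

```lua
-- Portable bitwise AND for non-negative integers
-- (under LuaJIT, bit.band could be used instead).
local function band(a, b)
  local result, bit_value = 0, 1
  while a > 0 and b > 0 do
    if a % 2 == 1 and b % 2 == 1 then result = result + bit_value end
    a, b = math.floor(a / 2), math.floor(b / 2)
    bit_value = bit_value * 2
  end
  return result
end

-- Every 1 bit of the bitset value is also 1 in the index key.
local function bits_all_set(index_key, bitset)
  return band(index_key, bitset) == bitset
end

-- At least one 1 bit of the bitset value is 1 in the index key.
local function bits_any_set(index_key, bitset)
  return band(index_key, bitset) ~= 0
end

-- Every 1 bit of the bitset value is 0 in the index key.
local function bits_all_not_set(index_key, bitset)
  return band(index_key, bitset) == 0
end

-- Index key 0x06 is binary 0110.
print(bits_all_set(0x06, 0x02))      -- true:  0010 is set in the key
print(bits_all_set(0x06, 0x03))      -- false: bit 0001 is missing
print(bits_any_set(0x06, 0x03))      -- true:  bit 0010 is shared
print(bits_all_not_set(0x06, 0x09))  -- true:  1001 shares no bit with 0110
```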

Iterator types for RTREE indexes

Type	Arguments	Description
box.index.ALL or ‘ALL’	none	All keys match. Tuples are returned in their order within the space.
box.index.EQ or ‘EQ’	search value	If all points of the rectangle-or-box defined by the search value are the same as the rectangle-or-box defined by the index key, it matches. Tuples are returned in their order within the space. “Rectangle-or-box” means “rectangle-or-box as explained in section about RTREE”. This is the default.
box.index.GT or ‘GT’	search value	If all points of the rectangle-or-box defined by the search value are within the rectangle-or-box defined by the index key, it matches. Tuples are returned in their order within the space.
box.index.GE or ‘GE’	search value	If all points of the rectangle-or-box defined by the search value are within, or at the side of, the rectangle-or-box defined by the index key, it matches. Tuples are returned in their order within the space.
box.index.LT or ‘LT’	search value	If all points of the rectangle-or-box defined by the index key are within the rectangle-or-box defined by the search key, it matches. Tuples are returned in their order within the space.
box.index.LE or ‘LE’	search value	If all points of the rectangle-or-box defined by the index key are within, or at the side of, the rectangle-or-box defined by the search key, it matches. Tuples are returned in their order within the space.
box.index.OVERLAPS or ‘OVERLAPS’	search value	If some points of the rectangle-or-box defined by the search value are within the rectangle-or-box defined by the index key, it matches. Tuples are returned in their order within the space.
box.index.NEIGHBOR or ‘NEIGHBOR’	search value	If some points of the rectangle-or-box defined by the search value are within, or at the side of, the rectangle-or-box defined by the index key, it matches. Tuples are returned in order: nearest neighbor first.
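
To make the containment and overlap rules above concrete, here is a small sketch that stores one point and two rectangles in an RTREE index and queries them. It assumes a local memtx instance; the space name rtree_example, the index name spatial, and the helper ids_matching are hypothetical.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.rtree_example then box.space.rtree_example:drop() end

local s = box.schema.space.create('rtree_example')
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
-- An RTREE index is a non-unique secondary index over an array field.
s:create_index('spatial', {
    type = 'RTREE',
    unique = false,
    parts = {{field = 2, type = 'array'}},
})

s:replace{1, {1, 1}}           -- a point
s:replace{2, {0, 0, 10, 10}}   -- a rectangle given by two opposite corners
s:replace{3, {20, 20, 30, 30}} -- another rectangle

-- Collect the primary keys of tuples matched by an iterator and a search rectangle.
local function ids_matching(iterator, rect)
    local ids = {}
    for _, tuple in s.index.spatial:pairs(rect, {iterator = iterator}) do
        table.insert(ids, tuple[1])
    end
    table.sort(ids)
    return ids
end

-- Index keys that share at least some points with the search rectangle.
local overlapping = ids_matching('OVERLAPS', {5, 5, 25, 25}) -- ids 2, 3
-- Index keys that lie within, or at the side of, the search rectangle.
local inside = ids_matching('LE', {0, 0, 10, 10})            -- ids 1, 2
```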

Examples:

Below are a few examples of using pairs with different parameters. To try out these examples, you need to bootstrap a Tarantool instance as described in Using data operations.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}
           bands:insert{5, 'Pink Floyd', 1965}
           bands:insert{6, 'The Rolling Stones', 1962}
           bands:insert{7, 'The Doors', 1965}
           bands:insert{8, 'Nirvana', 1987}
           bands:insert{9, 'Led Zeppelin', 1968}
           bands:insert{10, 'Queen', 1970}
---
...

-- Select all tuples by the primary index --
tarantool> for _, tuple in bands.index.primary:pairs() do
               print(tuple)
           end
[1, 'Roxette', 1986]
[2, 'Scorpions', 1965]
[3, 'Ace of Base', 1987]
[4, 'The Beatles', 1960]
[5, 'Pink Floyd', 1965]
[6, 'The Rolling Stones', 1962]
[7, 'The Doors', 1965]
[8, 'Nirvana', 1987]
[9, 'Led Zeppelin', 1968]
[10, 'Queen', 1970]
---
...

-- Select all tuples whose secondary key values start with the specified string --
tarantool> for _, tuple in bands.index.band:pairs("The", {iterator = "GE"}) do
             if (string.sub(tuple[2], 1, 3) ~= "The") then break end
             print(tuple)
           end
[4, 'The Beatles', 1960]
[7, 'The Doors', 1965]
[6, 'The Rolling Stones', 1962]
---
...

-- Select all tuples whose secondary key values are between 1965 and 1970 --
tarantool> for _, tuple in bands.index.year:pairs(1965, {iterator = "GE"}) do
             if (tuple[3] > 1970) then break end
             print(tuple)
           end
[2, 'Scorpions', 1965]
[5, 'Pink Floyd', 1965]
[7, 'The Doors', 1965]
[9, 'Led Zeppelin', 1968]
[10, 'Queen', 1970]
---
...

-- Select all tuples after the specified tuple --
tarantool> for _, tuple in bands.index.primary:pairs({}, {after={7, 'The Doors', 1965}}) do
               print(tuple)
           end
[8, 'Nirvana', 1987]
[9, 'Led Zeppelin', 1968]
[10, 'Queen', 1970]
---
...

index_object:select()

object index_object¶

index_object:select(search-key, options)¶

Search for a tuple or a set of tuples by the current index. To search by the primary index in the specified space, use the space_object:select() method.

Parameters:

index_object (index_object) – an object reference.
search-key (scalar/table) – a value to be matched against the index key, which may be multi-part.
options (table/nil) –
none, any, or all of the following parameters:
- iterator – the iterator type. The default iterator type is ‘EQ’.
- limit – the maximum number of tuples.
- offset – the number of tuples to skip (use this parameter carefully when scanning large data sets).
- options.after – a tuple or the position of a tuple (tuple_pos) after which select starts the search. You can pass an empty string or box.NULL to this option to start the search from the first tuple.
- options.fetch_pos – if true, the select method returns the position of the last selected tuple as the second value.
  
  Note
  
  The after and fetch_pos options are supported for the TREE index only.

Return:

This function might return one or two values:

The tuples whose fields are equal to the fields of the passed key. If the number of passed fields is less than the number of fields in the current key, then only the passed fields are compared, so select{1,2} matches a tuple whose primary key is {1,2,3}.
(Optionally) If options.fetch_pos is set to true, returns a base64-encoded string representing the position of the last selected tuple as the second value. If no tuples are fetched, returns nil.

Rtype:

array of tuples
(Optionally) string

Warning

Use the offset option carefully when scanning large data sets as it linearly increases the number of scanned tuples and leads to a full space scan. Instead, you can use the after and fetch_pos options.
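
The alternative to offset mentioned in the warning can be sketched as a paging loop: each call fetches one page and the position of its last tuple, and the next call resumes after that position. This is an illustration only, assuming a local instance with a TREE primary index and a Tarantool version that supports the after and fetch_pos options; the space name paging_example and the variable names are hypothetical.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.paging_example then box.space.paging_example:drop() end

local s = box.schema.space.create('paging_example')
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
for i = 1, 7 do
    s:replace{i, 'item ' .. i}
end

local page_size = 3
local pages = {}
local position = nil -- nil means the scan starts from the first tuple

while true do
    -- Fetch one page and the position of its last tuple.
    local batch, next_position = s.index.primary:select({}, {
        limit = page_size,
        after = position,
        fetch_pos = true,
    })
    if #batch == 0 then
        break -- nothing left to read
    end
    table.insert(pages, batch)
    position = next_position
end
```

With seven tuples and a page size of 3, the loop collects pages of 3, 3, and 1 tuples. Each call resumes from the saved position instead of skipping tuples the way offset does.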

Examples:

Below are a few examples of using select with different parameters. To try out these examples, you need to bootstrap a Tarantool database as described in Using data operations.

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
box.space.bands:insert { 6, 'The Rolling Stones', 1962 }
box.space.bands:insert { 7, 'The Doors', 1965 }
box.space.bands:insert { 8, 'Nirvana', 1987 }
box.space.bands:insert { 9, 'Led Zeppelin', 1968 }
box.space.bands:insert { 10, 'Queen', 1970 }

-- Select a tuple by the specified primary key value --
select_primary = bands.index.primary:select { 1 }
--[[
---
- - [1, 'Roxette', 1986]
...
--]]

-- Select a tuple by the specified secondary key value --
select_secondary = bands.index.band:select { 'The Doors' }
--[[
---
- - [7, 'The Doors', 1965]
...
--]]

-- Select a tuple by the specified multi-part secondary key value --
select_multipart = bands.index.year_band:select { 1960, 'The Beatles' }
--[[
---
- - [4, 'The Beatles', 1960]
...
--]]

-- Select tuples by the specified partial key value --
select_multipart_partial = bands.index.year_band:select { 1965 }
--[[
---
- - [5, 'Pink Floyd', 1965]
  - [2, 'Scorpions', 1965]
  - [7, 'The Doors', 1965]
...
--]]

-- Select maximum 3 tuples by the specified secondary index --
select_limit = bands.index.band:select({}, { limit = 3 })
--[[
---
- - [3, 'Ace of Base', 1987]
  - [9, 'Led Zeppelin', 1968]
  - [8, 'Nirvana', 1987]
...
--]]

-- Select maximum 3 tuples with the key value greater than 1965 --
select_greater = bands.index.year:select({ 1965 }, { iterator = 'GT', limit = 3 })
--[[
---
- - [9, 'Led Zeppelin', 1968]
  - [10, 'Queen', 1970]
  - [1, 'Roxette', 1986]
...
--]]

-- Select maximum 3 tuples after the specified tuple --
select_after_tuple = bands.index.primary:select({}, { after = { 4, 'The Beatles', 1960 }, limit = 3 })
--[[
---
- - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
  - [7, 'The Doors', 1965]
...
--]]

-- Select first 3 tuples and fetch a last tuple's position --
result, position = bands.index.primary:select({}, { limit = 3, fetch_pos = true })
-- Then, pass this position as the 'after' parameter --
select_after_position = bands.index.primary:select({}, { limit = 3, after = position })
--[[
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...
--]]

Note

box.space.space-name.index.index-name:select(...)[1] can be replaced by box.space.space-name.index.index-name:get(...). That is, get can be used as a convenient shorthand to get the first tuple in the tuple set that would be returned by select. However, if there is more than one tuple in the tuple set, then get throws an error.

index_object:get()

object index_object¶

index_object:get(key)¶

Search for a tuple via the given index, as described in the select topic.

Parameters:	index_object (`index_object`) – an object reference. key (`scalar/table`) – values to be matched against the index key
Return:	the tuple whose index-key fields are equal to the passed key values.
Rtype:	tuple

Possible errors:

no such index;
wrong type;
more than one tuple matches.

Complexity factors: Index size, Index type. See also space_object:get().

Example:

tarantool> box.space.tester.index.primary:get(2)
---
- [2, 'Music']
...

index_object:min()

object index_object¶

index_object:min([key])¶

Find the minimum value in the specified index.

Parameters:	index_object (`index_object`) – an object reference. key (`scalar/table`) – values to be matched against the index key
Return:	the tuple for the first key in the index. If the optional `key` value is supplied, returns the first key that is greater than or equal to `key`. Starting with Tarantool 2.0.4, `index_object:min(key)` returns nothing if `key` doesn’t match any value in the index.
Rtype:	tuple

Possible errors:

Index is not of type ‘TREE’.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type.

Example:

Below are a few examples of using min. To try out these examples, you need to bootstrap a Tarantool database as described in Using data operations.

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
box.space.bands:insert { 6, 'The Rolling Stones', 1962 }
box.space.bands:insert { 7, 'The Doors', 1965 }
box.space.bands:insert { 8, 'Nirvana', 1987 }
box.space.bands:insert { 9, 'Led Zeppelin', 1968 }
box.space.bands:insert { 10, 'Queen', 1970 }

-- Find the minimum value in the specified index
min = box.space.bands.index.year:min()
--[[
---
- [4, 'The Beatles', 1960]
...
--]]

-- Find the minimum value that matches the partial key value
min_partial = box.space.bands.index.year_band:min(1965)
--[[
---
- [5, 'Pink Floyd', 1965]
...
--]]

index_object:max()

object index_object¶

index_object:max([key])¶

Find the maximum value in the specified index.

Parameters:	index_object (`index_object`) – an object reference. key (`scalar/table`) – values to be matched against the index key
Return:	the tuple for the last key in the index. If the optional `key` value is supplied, returns the last key that is less than or equal to `key`. Starting with Tarantool 2.0.4, `index_object:max(key)` returns nothing if `key` doesn’t match any value in the index.
Rtype:	tuple

Possible errors:

Index is not of type ‘TREE’.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: index size, index type.

Example:

Below are a few examples of using max. To try out these examples, you need to bootstrap a Tarantool database as described in Using data operations.

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
box.space.bands:insert { 6, 'The Rolling Stones', 1962 }
box.space.bands:insert { 7, 'The Doors', 1965 }
box.space.bands:insert { 8, 'Nirvana', 1987 }
box.space.bands:insert { 9, 'Led Zeppelin', 1968 }
box.space.bands:insert { 10, 'Queen', 1970 }

-- Find the maximum value in the specified index
max = box.space.bands.index.year:max()
--[[
---
- [8, 'Nirvana', 1987]
...
--]]

-- Find the maximum value that matches the partial key value
max_partial = box.space.bands.index.year_band:max(1965)
--[[
---
- [7, 'The Doors', 1965]
...
--]]

index_object:random()

object index_object¶

index_object:random(seed)¶

Find a random value in the specified index. This method is useful when it’s important to get insight into data distribution in an index without having to iterate over the entire data set.

Parameters:	index_object (`index_object`) – an object reference. seed (`number`) – an arbitrary non-negative integer
Return:	the tuple for the random key in the index.
Rtype:	tuple

Complexity factors: Index size, Index type.

Note regarding storage engine: vinyl does not support random().

Example:

tarantool> box.space.tester.index.secondary:random(1)
---
- ['Beta!', 66, 'This is the second tuple!']
...

index_object:count()

object index_object¶

index_object:count([key][, iterator])¶

Iterate over an index, counting the number of tuples which match the key-value.

Parameters:	index_object (`index_object`) – an object reference. key (`scalar/table`) – values to be matched against the index key. iterator – comparison method
Return:	the number of matching tuples.
Rtype:	number

Example:

Below are a few examples of using count. To try out these examples, you need to bootstrap a Tarantool database as described in Using data operations.

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:insert { 5, 'Pink Floyd', 1965 }
box.space.bands:insert { 6, 'The Rolling Stones', 1962 }
box.space.bands:insert { 7, 'The Doors', 1965 }
box.space.bands:insert { 8, 'Nirvana', 1987 }
box.space.bands:insert { 9, 'Led Zeppelin', 1968 }
box.space.bands:insert { 10, 'Queen', 1970 }

-- Count the number of tuples that match the full key value
count = box.space.bands.index.year:count(1965)
--[[
---
- 3
...
--]]

-- Count the number of tuples that match the partial key value
count_partial = box.space.bands.index.year_band:count(1965)
--[[
---
- 3
...
--]]
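
The examples above count exact matches only. As a further sketch, the comparison method can be changed by passing an options table with the iterator field, in the same way as for select. The space name count_example is hypothetical, and a local instance is assumed.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.count_example then box.space.count_example:drop() end

local s = box.schema.space.create('count_example')
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
s:create_index('year', {parts = {{field = 3, type = 'unsigned'}}, unique = false})

s:replace{1, 'Roxette', 1986}
s:replace{2, 'Scorpions', 1965}
s:replace{3, 'Ace of Base', 1987}
s:replace{4, 'The Beatles', 1960}
s:replace{5, 'Pink Floyd', 1965}

local equal = s.index.year:count(1965)                      -- 2 (default EQ)
local greater = s.index.year:count(1965, {iterator = 'GT'}) -- 2 (1986, 1987)
local at_least = s.index.year:count(1965, {iterator = 'GE'}) -- 4
local total = s.index.year:count()                          -- 5 (no key: all tuples)
```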

index_object:update()

object index_object¶

index_object:update(key, {{operator, field_identifier, value}, ...})¶

Update a tuple.

Same as box.space…update(), but key is searched in this index instead of primary key. This index should be unique.

Parameters:

index_object (`index_object`) – an object reference.
key (`scalar/table`) – values to be matched against the index key.
operator (`string`) – the operation type, represented as a string.
field_identifier (`field-or-string`) – the field that the operation applies to. The field number can be negative, meaning the position from the end of the tuple (#tuple + negative field number + 1).
value (`lua_value`) – the value to apply.

Return:	the updated tuple, or nil if the key is not found
Rtype:	tuple or nil

Since Tarantool 2.3, a tuple can also be updated via JSON paths.
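
As a minimal sketch of updating through a unique secondary index, the snippet below looks up a tuple by name rather than by primary key. The space name update_example and the index name name are hypothetical, and a local memtx instance is assumed.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.update_example then box.space.update_example:drop() end

local s = box.schema.space.create('update_example')
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
-- The index used for update() should be unique.
s:create_index('name', {parts = {{field = 2, type = 'string'}}, unique = true})

s:replace{1, 'Roxette', 1986}

-- Assign a new value to field 3 of the tuple found by its name.
local updated = s.index.name:update('Roxette', {{'=', 3, 1987}})
-- No tuple has this key, so nothing is updated and nil is returned.
local missing = s.index.name:update('Unknown', {{'=', 3, 2000}})
```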

index_object:delete()

object index_object¶

index_object:delete(key)¶

Delete a tuple identified by a key.

Same as box.space…delete(), but key is searched in this index instead of in the primary-key index. This index ought to be unique.

Parameters:	index_object (`index_object`) – an object reference. key (`scalar/table`) – values to be matched against the index key
Return:	the deleted tuple.
Rtype:	tuple

Note regarding storage engine: vinyl will return nil, rather than the deleted tuple.
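
The following sketch deletes a tuple through a unique secondary index on a memtx space, where the deleted tuple is returned. The space name delete_example and the index name name are hypothetical, and a local instance is assumed.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.delete_example then box.space.delete_example:drop() end

local s = box.schema.space.create('delete_example') -- memtx by default
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
s:create_index('name', {parts = {{field = 2, type = 'string'}}, unique = true})

s:replace{1, 'Roxette', 1986}
s:replace{2, 'Scorpions', 1965}

-- Delete by the secondary key; memtx returns the deleted tuple.
local deleted = s.index.name:delete('Roxette')
-- Deleting the same key again finds nothing.
local again = s.index.name:delete('Roxette')
local remaining = s:count()
```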

index_object:alter()

object index_object¶

index_object:alter({options})¶

Alter an index. It is legal in some circumstances to change one or more of the index characteristics, for example its type, its sequence options, its parts, and whether it is unique. Usually this causes rebuilding of the space, except for the simple case where a part’s is_nullable flag is changed from false to true.

Parameters:	index_object (`index_object`) – an object reference. options (`table`) – index options (see index_opts)
Return:	nil

Possible errors:

index does not exist
the primary-key index cannot be changed to {unique = false}

Note

Vinyl does not support alter() of a primary-key index unless the space is empty.

Example 1:

You can add and remove fields that make up a primary index:

tarantool> s = box.schema.create_space('test')
---
...
tarantool> i = s:create_index('i', {parts = {{field = 1, type = 'unsigned'}}})
---
...
tarantool> s:insert({1, 2})
---
- [1, 2]
...
tarantool> i:select()
---
- - [1, 2]
...
tarantool> i:alter({parts = {{field = 1, type = 'unsigned'}, {field = 2, type = 'unsigned'}}})
---
...
tarantool> s:insert({1, 't'})
---
- error: 'Tuple field 2 type does not match one required by operation: expected unsigned'
...

Example 2:

You can change index options for both memtx and vinyl spaces:

tarantool> box.space.space55.index.primary:alter({type = 'HASH'})
---
...

tarantool> box.space.vinyl_space.index.i:alter({page_size=4096})
---
...

index_object:drop()

object index_object¶

index_object:drop()¶

Drop an index. Dropping a primary-key index has a side effect: all tuples are deleted.

Parameters:	index_object (`index_object`) – an object reference.
Return:	nil.

Possible errors:

index does not exist,
a primary-key index cannot be dropped while a secondary-key index exists.

Example:

tarantool> box.space.space55.index.primary:drop()
---
...

index_object:rename()

object index_object¶

index_object:rename(index-name)¶

Rename an index.

Parameters:	index_object (`index_object`) – an object reference. index-name (`string`) – new name for index
Return:	nil

Possible errors: index_object does not exist.

Example:

tarantool> box.space.space55.index.primary:rename('secondary')
---
...

Complexity factors: Index size, Index type, Number of tuples accessed.

index_object:bsize()

object index_object¶

index_object:bsize()¶

Return the total number of bytes taken by the index.

Parameters:	index_object (`index_object`) – an object reference.
Return:	number of bytes
Rtype:	number

index_object:stat()

object index_object¶

index_object:stat()¶

Return statistics about actions taken that affect the index.

This is for use with the vinyl engine.

Some detail items in the output from index_object:stat() are:

index_object:stat().latency – timings subdivided by percentages;
index_object:stat().bytes – the number of bytes total;
index_object:stat().disk.rows – the approximate number of tuples in each range;
index_object:stat().disk.statement – counts of inserts|updates|upserts|deletes;
index_object:stat().disk.compaction – counts of compactions and their amounts;
index_object:stat().disk.dump – counts of dumps and their amounts;
index_object:stat().disk.iterator.bloom – counts of bloom filter hits|misses;
index_object:stat().disk.pages – the size in pages;
index_object:stat().disk.last_level – size of data in the last LSM tree level;
index_object:stat().cache.evict – number of evictions from the cache;
index_object:stat().range_size – maximum number of bytes in a range;
index_object:stat().dumps_per_compaction – average number of dumps required to trigger major compaction in any range of the LSM tree.

Summary index statistics are also available via box.stat.vinyl().

Parameters:	index_object (`index_object`) – an object reference.
Return:	statistics
Rtype:	table
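
A brief sketch of reading a few of the items listed above from a vinyl index. The space name stat_example is hypothetical, a local instance is assumed, and the printed values depend on the instance, so none are shown here.

```lua
-- Illustrative names only; assumes a local instance started from this script.
box.cfg{}
if box.space.stat_example then box.space.stat_example:drop() end

local s = box.schema.space.create('stat_example', {engine = 'vinyl'})
s:create_index('primary', {parts = {{field = 1, type = 'unsigned'}}})
for i = 1, 100 do
    s:replace{i, 'value ' .. i}
end

-- stat() returns a nested table; pick a few of the documented items.
local stat = s.index.primary:stat()
print('bytes:', stat.bytes)
print('range_size:', stat.range_size)
print('disk pages:', stat.disk.pages)
```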

index_object:compact()

object index_object¶

index_object:compact()¶

Remove unused index space. For the memtx storage engine this method does nothing; index_object:compact() is only for the vinyl storage engine. For example, with vinyl, if a tuple is deleted, the space is not immediately reclaimed. There is a scheduler for reclaiming space automatically based on factors such as LSM shape and amplification as discussed in the section Storing data with vinyl, so calling index_object:compact() manually is not always necessary.

Return:	nil (Tarantool returns without waiting for compaction to complete)

index_object:tuple_pos()

object index_object¶

index_object:tuple_pos(tuple)¶

Return a tuple’s position for an index. This value can be passed to the after option of the select and pairs methods:

index_object:select and space_object:select

index_object:pairs and space_object:pairs

Note that tuple_pos does not work with functional and multikey indexes.

Parameters:	index_object (`index_object`) – an object reference. tuple (`scalar/table`) – a tuple whose position should be found
Return:	a tuple’s position in a space
Rtype:	base64-encoded string

Example:

To try out this example, you need to bootstrap a Tarantool instance as described in Using data operations.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}
           bands:insert{5, 'Pink Floyd', 1965}
           bands:insert{6, 'The Rolling Stones', 1962}
---
...

-- Get a tuple's position --
tarantool> position = bands.index.primary:tuple_pos({3, 'Ace of Base', 1987})
---
...
-- Pass the tuple's position as the 'after' parameter --
tarantool> bands:select({}, {limit = 3, after = position})
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...

index_object extensions

You can extend index_object with custom functions as follows:

Create a Lua function.
Add the function name to a predefined global variable, which has the table type.
Call the function on the index_object: index_object:function-name([parameters]).

There are three predefined global variables:

Adding to box.schema.index_mt makes the function available for all indexes.
Adding to box.schema.memtx_index_mt makes the function available for all memtx indexes.
Adding to box.schema.vinyl_index_mt makes the function available for all vinyl indexes.

Alternatively, you can make a user-defined function available for only one index by calling getmetatable(index_object) and then adding the function name to the meta table.

Example 1:

The example below shows how to extend all memtx indexes with the custom function:

box.schema.space.create('tester1', { engine = 'memtx' })
box.space.tester1:create_index('index1')
global_counter = 5

-- Create a custom function.
function increase_global_counter()
    global_counter = global_counter + 1
end

-- Extend all memtx indexes with the created function.
box.schema.memtx_index_mt.increase_global_counter = increase_global_counter

-- Call the 'increase_global_counter' function on 'index1'
-- to change the 'global_counter' value from 5 to 6.
box.space.tester1.index.index1:increase_global_counter()

Example 2:

The example below shows how to extend the specified index with the custom function with parameters:

box.schema.space.create('tester2', { engine = 'memtx', id = 1000 })
box.space.tester2:create_index('index2')
local_counter = 0

-- Create a custom function.
function increase_local_counter(i_arg, param)
    local_counter = local_counter + param + i_arg.space_id
end

-- Extend only the 'index2' index with the created function.
box.schema.memtx_index_mt.increase_local_counter = increase_local_counter
meta = getmetatable(box.space.tester2.index.index2)
meta.increase_local_counter = increase_local_counter

-- Call the 'increase_local_counter' function on 'index2'
-- to change the 'local_counter' value from 0 to 1005.
box.space.tester2.index.index2:increase_local_counter(5)

Submodule box.info

The box.info submodule provides access to information about a running Tarantool instance. Below is a list of all box.info functions and members.

Name	Use
box.info()	Get all keys and values provided by the `box.info` submodule
box.info.cluster	Information about the cluster to which the current instance belongs
box.info.config	The instance’s state in regard to configuration
box.info.election	The current state of this replica set node in regard to leader election
box.info.gc()	Get information about the Tarantool garbage collector
box.info.hostname	The hostname that identifies a machine the current instance is running on
box.info.id	A numeric identifier of the current instance within the replica set
box.info.listen	A real address to which an instance is bound
box.info.lsn	A log sequence number (LSN) for the latest entry in the instance’s write-ahead log (WAL)
box.info.memory()	Get information about memory usage for the current instance
box.info.name	The name of the current instance
box.info.package	The package name
box.info.pid	Get a process ID of the current instance
box.info.replicaset	Information about the replica set to which the current instance belongs
box.info.replication	Statistics for all instances in the replica set
box.info.replication_anon()	List all the anonymous replicas following the instance
box.info.ro	The current mode of the instance (writable or read-only)
box.info.ro_reason	The reason why the current instance is read-only
box.info.schema_version	The database schema version
box.info.signature	The sum of all `lsn` values from each vector clock for all instances in the replica set
box.info.sql	Get information about the cache for all SQL prepared statements
box.info.status	The current state of the instance
box.info.synchro	The current state of synchronous replication
box.info.uptime	The number of seconds since the instance started
box.info.uuid	A globally unique identifier of the current instance
box.info.vclock	A table with the vclock values of all instances in a replica set which have made data changes
box.info.version	The Tarantool version
box.info.vinyl()	(Deprecated) Get runtime statistics for the vinyl storage engine

box.info()

box.info()¶

Get all keys and values provided by the box.info submodule. Since box.info contents are dynamic, it’s not possible to iterate over keys with the Lua pairs() function. For this purpose, box.info() builds and returns a Lua table with all keys and values provided in the submodule.

Return:	keys and values in the submodule
Rtype:	table

Since version 3.3.0, the returned table also includes the hierarchy table, which shows the names of the group, the replica set, and the instance itself. These names are taken directly from the --name CLI option (or the TT_INSTANCE_NAME environment variable) and the cluster configuration. This means they are always present if the YAML configuration flow is in use, regardless of the database status (whether upgraded, writable or not).

Example

This example is for a master-replica set that contains one master instance and one replica instance. The request was issued at the replica instance.

sharded_cluster_crud:storage-a-002> box.info()
---
- version: 3.2.0-entrypoint-218-gf8d77dbec
  id: 2
  ro: true
  uuid: 5a879b0e-9b53-4053-980a-be1a39ad1166
  pid: 12059
  replicaset:
    uuid: 90dc4d6c-3f7d-45e5-aa5a-55903b0a79c9
    name: storage-a
  schema_version: 87
  listen: 127.0.0.1:3303
  replication_anon:
    count: 0
  replication:
    1:
      id: 1
      uuid: ffb1b8bb-d59f-4eee-ad3e-91058e6f5486
      lsn: 1092
      upstream:
        status: follow
        idle: 0.99728900000082
        peer: replicator@127.0.0.1:3302
        lag: 0.00025296211242676
      name: storage-a-001
      downstream:
        status: follow
        idle: 0.18575600000077
        vclock: {1: 1092}
        lag: 0
    2:
      id: 2
      uuid: 5a879b0e-9b53-4053-980a-be1a39ad1166
      lsn: 0
      name: storage-a-002
  package: Tarantool
  hostname: demo.example.com
  election:
    state: follower
    vote: 0
    leader: 0
    term: 1
  signature: 1092
  synchro:
    queue:
      owner: 0
      confirm_lag: 0
      term: 0
      age: 0
      len: 0
      busy: false
    quorum: 2
  status: running
  sql: []
  vclock: {1: 1092}
  uptime: 229
  lsn: 0
  vinyl: []
  ro_reason: config
  memory: []
  gc: []
  cluster:
    name: null
  name: storage-a-002
  config:
    status: ready
    meta:
      last: &0 []
      active: *0
    alerts: []
    hierarchy:
      group: storages
      replicaset: storage-a
      instance: storage-a-002
...
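
Because box.info itself cannot be iterated with pairs(), a script that needs the list of available keys can iterate over the table returned by box.info() instead. The following is a minimal sketch, assuming a local instance; the variable names are illustrative.

```lua
-- Assumes a local instance started from this script.
box.cfg{}

-- box.info() builds a plain Lua table that can be iterated.
local info = box.info()
local keys = {}
for key in pairs(info) do
    table.insert(keys, key)
end
table.sort(keys)

print(table.concat(keys, ', '))
```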

box.info.cluster

box.info.cluster¶

Information about the cluster to which the current instance belongs. The returned table contains the following fields:

name – the cluster name

Rtype:	table

box.info.config

box.info.config¶

Since: 3.2.0

The instance’s state in regard to configuration. Note that box.info.config returns the instance’s state obtained using config:info('v2').

Rtype:	table

Example

sharded_cluster_crud:storage-a-002> box.info.config
---
- status: ready
  meta:
    last: &0 []
    active: *0
  alerts: []
  hierarchy:
    group: storages
    replicaset: storage-a
    instance: storage-a-002
...

box.info.election

box.info.election¶

Since: 2.6.1

The current state of this replica set node in regard to leader election. The following information is provided:

state – the election state (mode) of the node. Possible values are leader, follower, or candidate. For more details, refer to the description of the leader election process. When replication.failover is set to election, the node is writable only in the leader state.
term – the current election term.
vote – the ID of a node the current node votes for. If the value is 0, it means the node hasn’t voted in the current term yet.
leader – a leader node ID in the current term. If the value is 0, it means the node doesn’t know which node is the leader in the current term.
leader_name – a leader name. Returns nil if there is no leader in a cluster or box.NULL if a leader does not have a name. Since version 3.0.0.
leader_idle – time in seconds since the last interaction with the known leader. Since version 2.10.0.

Note

IDs in the box.info.election output are the replica IDs visible in the box.info.id output on each node and in the _cluster space.

Example:

auto_leader:instance001> box.info.election
---
- leader_idle: 0
  leader_name: instance001
  state: leader
  vote: 2
  term: 3
  leader: 2
...
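
The meaning of the zero values for vote and leader can be sketched with a small helper that turns an election table into a readable line. The function election_summary is hypothetical and works on any table with the fields described above, so it is shown here with a literal table instead of a live instance.

```lua
-- Hypothetical helper: format the fields of a box.info.election table.
local function election_summary(election)
    -- 0 means the leader is unknown in the current term.
    local leader = election.leader == 0 and 'unknown' or tostring(election.leader)
    -- 0 means the node has not voted in the current term yet.
    local vote = election.vote == 0 and 'none' or tostring(election.vote)
    return string.format('state=%s term=%d leader=%s vote=%s',
        election.state, election.term, leader, vote)
end

local summary = election_summary({state = 'follower', term = 1, leader = 0, vote = 0})
print(summary) -- state=follower term=1 leader=unknown vote=none
```

On a running instance, the same helper could be called as election_summary(box.info.election).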

box.info.gc()

box.info.gc()¶

Get information about the Tarantool garbage collector. The garbage collector compares vclock (vector clock) values of users and checkpoints, so a look at box.info.gc() may show why the garbage collector has not removed old WAL files, or show what it may soon remove.

consumers – a list of users whose requests might affect the garbage collector.
checkpoints – a list of preserved checkpoints.
checkpoints[n].references – a list of references to a checkpoint.
checkpoints[n].vclock – a checkpoint’s vclock value.
checkpoints[n].signature – a sum of a checkpoint’s vclock’s components.
checkpoint_is_in_progress – true if a checkpoint is in progress, otherwise false
vclock – the garbage collector’s vclock.
signature – the sum of the garbage collector’s checkpoint’s components.
wal_retention_vclock – the vclock value of the oldest write-ahead log file that the wal.retention_period option protects from removal by the garbage collector.
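
As an illustration of reading the fields listed above, the sketch below prints the signature and the number of references for each preserved checkpoint. It assumes a local instance; the output depends on the instance state, so none is shown.

```lua
-- Assumes a local instance started from this script.
box.cfg{}

local gc = box.info.gc()

-- Each preserved checkpoint has a signature and a list of references.
for _, checkpoint in ipairs(gc.checkpoints) do
    print('checkpoint signature:', checkpoint.signature,
          'references:', #checkpoint.references)
end
print('consumers:', #gc.consumers)
print('checkpoint in progress:', gc.checkpoint_is_in_progress)
```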

box.info.hostname

box.info.hostname¶

Since: 3.2.0

The hostname that identifies a machine the current instance is running on.

Rtype:	string

box.info.id

box.info.id¶

A numeric identifier of the current instance within the replica set. This value corresponds to replication[{n}].id. Learn more in box.info.replication.

Rtype:	number

box.info.listen

box.info.listen¶

Since: 2.4.1

A real address to which an instance is bound. If multiple URIs are configured, returns an array of strings. If an instance does not listen to anything, box.info.listen is nil.

To learn how to configure URIs used to listen for incoming requests, see iproto.listen.

Rtype:	string | string[]

Example

sharded_cluster_crud:storage-a-002> box.info.listen
---
- 127.0.0.1:3303
...
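
Since box.info.listen can be nil, a single string, or an array of strings, code that consumes it may want to normalize the value first. The helper listen_uris below is a hypothetical sketch of one way to do that; it is shown with literal values so it does not depend on a running instance.

```lua
-- Hypothetical helper: always return an array of URI strings.
local function listen_uris(listen)
    if listen == nil then
        return {}         -- the instance does not listen to anything
    end
    if type(listen) == 'string' then
        return {listen}   -- a single URI is configured
    end
    return listen         -- multiple URIs: already an array of strings
end

local none = listen_uris(nil)
local single = listen_uris('127.0.0.1:3303')
local several = listen_uris({'127.0.0.1:3303', '127.0.0.1:3304'})
```

On a running instance, the call would be listen_uris(box.info.listen).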

box.info.lsn

box.info.lsn¶

A log sequence number (LSN) for the latest entry in the instance’s write-ahead log (WAL). This value corresponds to replication[{n}].lsn. Learn more in box.info.replication.

Rtype:	number
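
The correspondence between box.info.id, box.info.lsn, and the instance's own entry in box.info.replication can be checked with a short sketch. It assumes a local instance that has been bootstrapped, so that it has an entry in the replication table; the variable names are illustrative.

```lua
-- Assumes a local, bootstrapped instance started from this script.
box.cfg{}

local id = box.info.id
-- The instance's own entry in the replication table is keyed by its id.
local own = box.info.replication[id]

print('id matches:', own.id == id)
print('lsn matches:', own.lsn == box.info.lsn)
```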

box.info.memory()

box.info.memory()¶

Get information about memory usage for the current instance.

Note

To get a picture of the vinyl subsystem, use box.stat.vinyl() instead.

cache – the number of bytes used for caching user data. The memtx storage engine does not require a cache, so in fact this is the number of bytes in the cache for the tuples stored for the vinyl storage engine.
data – the number of bytes used for storing user data (the tuples) with the memtx engine and with level 0 of the vinyl engine, without taking memory fragmentation into account.
index – the number of bytes used for indexing user data, including memtx and vinyl memory tree extents, the vinyl page index, and the vinyl bloom filters.
lua – the number of bytes used for Lua runtime.
net – the number of bytes used for network input/output buffers.
tx – the number of bytes in use by active transactions. For the vinyl storage engine, this is the total size of all allocated objects (struct txv, struct vy_tx, struct vy_read_interval) and tuples pinned for those objects.

Example

sharded_cluster_crud:storage-a-002> box.info.memory()
---
- cache: 0
  data: 43848
  index: 1130496
  lua: 10516849
  net: 1572864
  tx: 0
...
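
For monitoring, the categories reported by box.info.memory() can be added up into a single figure. The sketch below assumes a hypothetical helper named memory_total that takes a table shaped like the output above. The result is only the sum of the reported categories, not the total memory used by the process.

```lua
-- Hypothetical helper, not part of Tarantool: sum the categories of a table
-- shaped like the result of box.info.memory(). Missing fields count as 0.
local function memory_total(mem)
    local fields = { 'cache', 'data', 'index', 'lua', 'net', 'tx' }
    local total = 0
    for _, name in ipairs(fields) do
        total = total + (mem[name] or 0)
    end
    return total
end

-- Values copied from the example output above.
local sample_memory = {
    cache = 0,
    data = 43848,
    index = 1130496,
    lua = 10516849,
    net = 1572864,
    tx = 0,
}
print(memory_total(sample_memory)) -- 13264057
```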

box.info.name

box.info.name¶

Since: 3.0.0

The name of the current instance. You can specify the names of instances when configuring a cluster topology.

Rtype:	string

Example

sharded_cluster_crud:storage-a-002> box.info.name
---
- storage-a-002
...

box.info.package

box.info.package¶

The package name. It can be:

Tarantool
Tarantool Enterprise

Rtype:	string

box.info.pid

box.info.pid¶

A process ID of the current instance. You can also get the process ID as follows:

Using tarantool.pid().
Using the ps -A Linux command.

Rtype:	number

box.info.replicaset

box.info.replicaset¶

Since: 3.0.0

Information about the replica set to which the current instance belongs. The returned table contains the following fields:

name – the replica set name
uuid – the replica set UUID

You can specify the names of replica sets when configuring a cluster topology.

Rtype:	table

box.info.replication

box.info.replication¶

The replication section of box.info() is a table with statistics for all instances in the replica set that the current instance belongs to. To see the example, refer to Monitoring a replica set.

In the following, n is the index number of one table item, for example, replication[1], which has data about server instance number 1, which may or may not be the same as the current instance (the “current instance” is what is responding to box.info).

replication[n].id is a short numeric identifier of instance n within the replica set. This value is stored in the box.space._cluster system space.
replication[n].uuid is a globally unique identifier of instance n. This value is stored in the box.space._cluster system space.
replication[n].lsn is the log sequence number (LSN) for the latest entry in instance n’s write-ahead log (WAL).
replication[n].name is the instance name. See also: box.info.name.
replication[n].upstream appears (is not nil) if the current instance is following or intending to follow instance n. This ordinarily means that replication[n].upstream.status = follow, replication[n].upstream.peer = the URI of instance n, which is being followed, and replication[n].upstream.lag and replication[n].upstream.idle = the instance’s speed, described later. Another way to say this is: replication[n].upstream appears when replication[n].upstream.peer is not the current instance, is not read-only, and was specified in box.cfg{replication={...}}, so it is shown in box.cfg.replication.
replication[n].upstream.status is the replication status of the connection with the instance n:
- connect: an instance is connecting to the master.
- auth: authentication is being performed.
- wait_snapshot: an instance is receiving metadata from the master. If join fails with a non-critical error at this stage (for example, ER_READONLY, ER_ACCESS_DENIED, or a network-related issue), an instance tries to find a new master to join.
- fetch_snapshot: an instance is receiving data from the master’s .snap files.
- final_join: an instance is receiving new data added during fetch_snapshot.
- sync: the master and replica are synchronizing to have the same data.
- follow: the current instance’s role is replica. This means that the instance is read-only or acts as a replica for this remote peer in master-master configuration. The instance is receiving or able to receive data from the instance n’s (upstream) master.
- stopped: replication is stopped due to a replication error (for example, duplicate key).
- disconnected: an instance is not connected to the replica set (for example, due to network issues, not replication errors).
Learn more from Replication stages.

replication[n].upstream.idle is the time (in seconds) since the last event was received. This is the primary indicator of replication health. Learn more from Monitoring a replica set.

replication[n].upstream.peer contains instance n’s URI, for example, 127.0.0.1:3302. Learn more from Monitoring a replica set.

replication[n].upstream.lag is the time difference between the local time of instance n, recorded when the event was received, and the local time at another master recorded when the event was written to the write-ahead log on that master. Learn more from Monitoring a replica set.
replication[n].upstream.message contains an error message in case of a degraded state; otherwise, it is nil.
replication[n].downstream appears (is not nil) with data about an instance that is following instance n or is intending to follow it, which ordinarily means replication[n].downstream.status = follow.
replication[n].downstream.vclock contains the vector clock, which is a table of ‘id, lsn’ pairs, for example, vclock: {1: 3054773, 4: 8938827, 3: 285902018}. (Notice that the table may have multiple pairs although vclock is a singular name).

Even if instance n is removed, its values will still appear here; however, its values will be overridden if an instance joins later with the same UUID. Vector clock pairs will only appear if lsn > 0.

replication[n].downstream.vclock may be the same as the current instance’s vclock (box.info.vclock) because this is for all known vclock values of the cluster. A master will know what is in a replica’s copy of vclock because, when the master makes a data change, it sends the change information to the replica (including the master’s vector clock), and the replica replies with what is in its entire vector clock table.

A replica also sends its entire vector clock table in response to a master’s heartbeat message, see the heartbeat-message examples in the section Binary protocol – replication.
replication[n].downstream.idle is the time (in seconds) since the last time that instance n sent events through the downstream replication.
replication[n].downstream.status is the replication status for downstream replications:
- stopped means that downstream replication has stopped,
- follow means that downstream replication is in progress (instance n is ready to accept data from the master or is currently doing so).
replication[n].downstream.lag is the time difference between the local time at the master node, recorded when a particular transaction was written to the write-ahead log, and the local time recorded when it receives an acknowledgment for this transaction from a replica. Since version 2.10.0. See more in Monitoring a replica set.
replication[n].downstream.message and replication[n].downstream.system_message will be nil unless a problem occurs with the connection. For example, if instance n goes down, then one may see status = 'stopped', message = 'unexpected EOF when reading from socket', and system_message = 'Broken pipe'. See also degraded state.
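
The fields above can be combined into a simple health check. The following is a sketch only: replication_problems is a hypothetical helper that takes a table shaped like box.info.replication and a hypothetical max_idle threshold in seconds. It reports any upstream status other than follow, which also flags transient states such as sync, so a real check may want to be more lenient.

```lua
-- Hypothetical helper, not part of Tarantool: scan a table shaped like
-- box.info.replication and collect human-readable problem descriptions.
local function replication_problems(replication, max_idle)
    local problems = {}
    -- Instance ids may have gaps, so iterate with pairs() rather than ipairs().
    for id, peer in pairs(replication) do
        local up = peer.upstream
        if up ~= nil then
            if up.status ~= 'follow' then
                table.insert(problems, string.format(
                    'instance %s: upstream status is %s',
                    tostring(id), tostring(up.status)))
            elseif up.idle ~= nil and up.idle > max_idle then
                table.insert(problems, string.format(
                    'instance %s: upstream idle for %.1f s',
                    tostring(id), up.idle))
            end
        end
        local down = peer.downstream
        if down ~= nil and down.status == 'stopped' then
            table.insert(problems, string.format(
                'instance %s: downstream status is stopped', tostring(id)))
        end
    end
    -- pairs() has no defined order, so sort for a stable result.
    table.sort(problems)
    return problems
end

-- Sample data standing in for box.info.replication.
local sample_replication = {
    [1] = { id = 1 },
    [2] = { id = 2, upstream = { status = 'follow', idle = 0.2 },
            downstream = { status = 'follow' } },
    [3] = { id = 3, upstream = { status = 'disconnected' },
            downstream = { status = 'stopped' } },
}
for _, problem in ipairs(replication_problems(sample_replication, 1)) do
    print(problem)
end
```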

box.info.replication_anon()

box.info.replication_anon()¶

List all the anonymous replicas following the instance.

The output is similar to the one produced by box.info.replication with an exception that anonymous replicas are indexed by their uuid strings rather than server ids, since server ids have no meaning for anonymous replicas.

Notice that when you issue a plain box.info.replication_anon, the only info returned is the number of anonymous replicas following the current instance. In order to see the full stats, you have to call box.info.replication_anon(). This is done to not overload the box.info output with excess info, since there may be lots of anonymous replicas.

Example:

anonymous_replica:instance001> box.info.replication_anon
---
- count: 1
...

anonymous_replica:instance001> box.info.replication_anon()
---
- 44237cb4-de83-4347-b6db-46274b940acf:
    id: 0
    uuid: 44237cb4-de83-4347-b6db-46274b940acf
    lsn: 0
    downstream:
      status: follow
      idle: 0.81613899999866
      vclock: {1: 7}
      lag: 0
    name: null
...

Notice that anonymous replicas hide their lsn from the others, so an anonymous replica lsn will always be reported as zero, even if an anonymous replica performs some local space operations. To find out the lsn of a specific anonymous replica, you have to issue box.info.lsn on it.

box.info.ro

box.info.ro¶

The current mode of the instance (writable or read-only). Learn more in box.info.ro_reason.

Rtype:	boolean

Example

sharded_cluster_crud:storage-a-002> box.info.ro
---
- true
...

box.info.ro_reason

box.info.ro_reason¶

Since: 2.10.0

The reason why the current instance is read-only. To get whether the current instance is writable or read-only, use box.info.ro. If the instance is in writable mode, box.info.ro_reason returns nil.

The possible values returned by ro_reason:

election – the instance is not the leader. See box.info.election for details.
synchro – the instance is not the owner of the synchronous transaction queue. For details, see box.info.synchro.
config – the instance is configured to be read-only.
orphan – the instance is in orphan state. For details, see the orphan status page.

Rtype:	string

Example

sharded_cluster_crud:storage-a-002> box.info.ro
---
- true
...
sharded_cluster_crud:storage-a-002> box.info.ro_reason
---
- config
...
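
The two values can be combined into a single status message. The sketch below uses a hypothetical helper named describe_mode, which is not part of Tarantool. The descriptions restate the list of possible ro_reason values above.

```lua
-- Descriptions of the documented ro_reason values.
local RO_REASONS = {
    election = 'the instance is not the leader',
    synchro = 'the instance does not own the synchronous transaction queue',
    config = 'the instance is configured to be read-only',
    orphan = 'the instance is in orphan state',
}

-- Hypothetical helper, not part of Tarantool: turn values shaped like
-- box.info.ro and box.info.ro_reason into a short message.
local function describe_mode(ro, ro_reason)
    if not ro then
        return 'writable'
    end
    return 'read-only: ' .. (RO_REASONS[ro_reason] or 'unknown reason')
end

-- In a Tarantool console: describe_mode(box.info.ro, box.info.ro_reason)
print(describe_mode(true, 'config'))
```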

box.info.schema_version

box.info.schema_version¶

Since: 2.11.0

The database schema version. A schema version is a number that indicates whether the database schema is changed. For example, the schema_version value grows if a space or index is added or deleted, or a space, index, or field name is changed.

Rtype:	number

Example

sharded_cluster_crud:storage-a-002> box.info.schema_version
---
- 87
...

box.info.signature

box.info.signature¶

The sum of all lsn values from each vector clock (vclock) for all instances in the replica set the current instance belongs to.

Rtype:	number
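
To illustrate the definition, the sum can be computed by hand from a vclock table. This is a sketch with a hypothetical helper named vclock_signature. The sample vclock is the one used in the box.info.replication description.

```lua
-- Hypothetical helper, not part of Tarantool: sum the lsn values of a table
-- shaped like box.info.vclock (a map from instance id to lsn).
local function vclock_signature(vclock)
    local sum = 0
    for _, lsn in pairs(vclock) do
        sum = sum + lsn
    end
    return sum
end

local sample_vclock = { [1] = 3054773, [4] = 8938827, [3] = 285902018 }
print(vclock_signature(sample_vclock)) -- 297895618
```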

box.info.sql

box.info.sql¶

Get information about the cache for all SQL prepared statements. The returned table contains the following fields:

cache.size – the actual cache size (in bytes) for all SQL prepared statements. To configure the maximum cache size, use the sql.cache_size option.
cache.stmt_count – the number of statements in the SQL prepared statement cache.

Rtype:	table

box.info.status

box.info.status¶

The current state of the instance. It can be:

running – the instance is loaded
loading – the instance is either recovering xlogs/snapshots or bootstrapping
orphan – the instance has not (yet) succeeded in joining the required number of masters (see orphan status)
hot_standby – the instance is standing by another instance

Rtype:	string

box.info.synchro

box.info.synchro¶

Since version 2.8.1.

The current state of synchronous replication.

In synchronous replication, a transaction is considered committed only after it collects the required quorum of confirmations. While transactions are collecting confirmations from remote nodes, these transactions are waiting in the queue.

The following information is provided:

queue:
- owner (since version 2.10.0) – ID of the replica that owns the synchronous transaction queue. Once an owner instance appears, all other instances become read-only. If the owner field is 0, then every instance may be writable, but they can’t create any synchronous transactions. To claim or reclaim the queue, use box.ctl.promote() on the instance that you want to promote. To clear the ownership, call box.ctl.demote() on the synchronous queue owner.
  
  When Raft election is enabled and replication.election_mode is set to candidate, the new Raft leader claims the queue automatically after winning the elections. It means that the value of box.info.synchro.queue.owner becomes equal to box.info.election.leader. When Raft is enabled, no manual intervention with box.ctl.promote() or box.ctl.demote() is required.
- term (since version 2.10.0) – current queue term. It contains the term of the last PROMOTE request. Usually, it is equal to box.info.election.term. However, the queue term value may be less than the election term. It can happen when a new round of elections has started, but no instance has been promoted yet.
- len – the number of entries that are currently waiting in the queue.
- busy (since version 2.10.0) – the boolean value is true when the instance is processing or writing some system request that modifies the queue (for example, PROMOTE, CONFIRM, or ROLLBACK). Until the request is complete, any other incoming synchronous transactions and system requests will be delayed.
- age (since version 3.2.0) – the time in seconds that the oldest entry currently present in the queue has spent waiting for the quorum to collect.
- confirm_lag (since version 3.2.0) – the time in seconds that the latest successfully confirmed entry waited for the quorum to collect.
quorum – the resulting value of the replication.synchro_quorum configuration option. Since version 2.5.3, the option can be set as a dynamic formula. In this case, the value of the quorum member depends on the current number of replicas.

Example 1:

In this example, the quorum field is equal to 1. That is, synchronous transactions work like asynchronous ones: a quorum of 1 means that a successful WAL write on the master is enough to commit.

box_info_synchro:instance001> box.info.synchro
---
- queue:
    owner: 1
    confirm_lag: 0
    term: 2
    age: 0
    len: 0
    busy: false
  quorum: 1
...

Example 2:

Example on GitHub: box_info_synchro

In this example, there are two instances:

instance001 is going to be the leader.
instance002 is a follower instance.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            replication:
              synchro_quorum: 2
              synchro_timeout: 1000
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: ro
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

On the first instance, grant the guest user the super role:

box_info_synchro:instance001> box.schema.user.grant('guest', 'super')

After that, use the box.ctl.promote() function to claim the queue:

box_info_synchro:instance001> box.ctl.promote()

Create a space named sync and enable synchronous replication on this space:

box_info_synchro:instance001> s = box.schema.space.create("sync", {is_sync=true})

Then, create an index:

box_info_synchro:instance001> _ = s:create_index('pk')

Check the current state of synchronous replication:

box_info_synchro:instance001> box.info.synchro
---
- queue:
    owner: 1
    confirm_lag: 0
    term: 2
    age: 0
    len: 0
    busy: false
  quorum: 2
...

On the second instance, simulate a failure, as if the instance crashed or lost its network connection:

box_info_synchro:instance002> os.exit(0)
   ⨯ Connection was closed. Probably instance process isn't running anymore

On the first instance, try to perform some synchronous transactions. The transactions hang because the replication.synchro_quorum option is set to 2 and the second instance is not available:

box_info_synchro:instance001> fiber = require('fiber')
---
...
box_info_synchro:instance001> for i = 1, 3 do fiber.new(function() box.space.sync:replace{i} end) end
---
...

Call the box.info.synchro command on the first instance again:

box_info_synchro:instance001> box.info.synchro
---
- queue:
    owner: 1
    confirm_lag: 0
    term: 2
    age: 5.2658250015229
    len: 3
    busy: false
  quorum: 2
...

The len field is now equal to 3. It means that there are 3 transactions waiting in the queue.
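
A monitoring script could classify the queue state from the len and age fields. The following is a sketch under stated assumptions: synchro_queue_state is a hypothetical helper, max_age is a hypothetical threshold in seconds, and the age field is treated as optional because it is available only since version 3.2.0.

```lua
-- Hypothetical helper, not part of Tarantool: classify a table shaped like
-- box.info.synchro as 'idle', 'pending', or 'stalled'.
local function synchro_queue_state(synchro, max_age)
    local queue = synchro.queue
    if queue.len == 0 then
        return 'idle'
    end
    -- `age` may be absent on versions older than 3.2.0.
    if queue.age ~= nil and queue.age > max_age then
        return 'stalled'
    end
    return 'pending'
end

-- Values copied from the last output above.
local sample_synchro = {
    queue = { owner = 1, confirm_lag = 0, term = 2,
              age = 5.2658250015229, len = 3, busy = false },
    quorum = 2,
}
print(synchro_queue_state(sample_synchro, 1))
```

With the sample values and a threshold of 1 second, the queue is reported as stalled, which matches the hanging transactions in the example.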

box.info.uptime

box.info.uptime¶

The number of seconds since the instance started. This value can also be retrieved with tarantool.uptime().

Rtype:	number

box.info.uuid

box.info.uuid¶

A globally unique identifier of the current instance. This value corresponds to replication[{n}].uuid. Learn more in box.info.replication.

Rtype:	string

box.info.vclock

box.info.vclock¶

A table with the vclock values of all instances in a replica set which have made data changes.

Rtype:	table

box.info.version

box.info.version¶

The Tarantool version.

Rtype:	string

box.info.vinyl()

box.info.vinyl()¶: Get runtime statistics for the vinyl storage engine. This function is deprecated, use box.stat.vinyl() instead.

Submodule box.iproto

Since 2.11.0.

The box.iproto submodule provides the ability to work with the network subsystem of Tarantool. It allows you to extend the IPROTO functionality from Lua. With this submodule, you can:

parse unknown IPROTO request types
send arbitrary IPROTO packets
override the behavior of the existing and unknown request types in the binary protocol

The submodule exports all IPROTO constants and features to Lua.

IPROTO constants

IPROTO constants in the box.iproto namespace are written in uppercase letters without the IPROTO_ prefix. The constants are divided into several groups:

key. Example: IPROTO_SYNC.
request type. Example: IPROTO_OK.
flag. Example: IPROTO_FLAG_COMMIT.
ballot key. Example: IPROTO_BALLOT_VCLOCK.
metadata key. Example: IPROTO_FIELD_IS_NULLABLE.
RAFT key. Example: IPROTO_RAFT_TERM.

Each group is located in the corresponding subnamespace without the prefix. For example:

box.iproto.key.SYNC = 0x01
-- ...
box.iproto.type.SELECT = 1
-- ...
box.iproto.flag.COMMIT = 1
-- ...
box.iproto.ballot_key.VCLOCK = 2
-- ...
box.iproto.metadata_key.IS_NULLABLE = 3
-- ...
box.iproto.raft_key.TERM = 0
-- ...

IPROTO features

The submodule exports:

the current IPROTO protocol version (box.iproto.protocol_version)
the set of IPROTO protocol features supported by the server (box.iproto.protocol_features)
IPROTO protocol features with the corresponding code (box.iproto.feature)

Example

The example converts the feature names from box.iproto.protocol_features set into codes:

-- Features supported by the server
box.iproto.protocol_features = {
    streams = true,
    transactions = true,
    error_extension = true,
    watchers = true,
    pagination = true,
}

-- Convert the feature names into codes
local features = {}
for name in pairs(box.iproto.protocol_features) do
    table.insert(features, box.iproto.feature[name])
end
-- pairs() iterates in no particular order, so sort the codes
table.sort(features)
return features -- [0, 1, 2, 3, 4]

Handling the unknown IPROTO request types

Every IPROTO request has a static handler, so before version 2.11.0, any unknown request raised an error. Since 2.11.0, a new request type is available: IPROTO_UNKNOWN. This type is used to override the handlers of unknown IPROTO request types. For details, see the box.iproto.override() and box_iproto_override functions.

API reference

The table lists all available functions and data of the submodule:

Name	Use
box.iproto.key	Request keys
box.iproto.type	Request types
box.iproto.flag	Flags from the IPROTO_FLAGS key
box.iproto.ballot_key	Keys from the IPROTO_BALLOT requests
box.iproto.metadata_key	Keys nested in the IPROTO_METADATA key
box.iproto.raft_key	Keys from the IPROTO_RAFT_* requests
box.iproto.protocol_version	The current IPROTO protocol version
box.iproto.protocol_features	The set of supported IPROTO protocol features
box.iproto.feature	IPROTO protocol features
box.iproto.override()	Set a new IPROTO request handler callback for the given request type
box.iproto.send()	Send an IPROTO packet over the session’s socket

box.iproto.key

box.iproto.key¶

Contains all available request keys, except raft, metadata, and ballot keys. Learn more: Keys used in requests and responses.

Example

tarantool> box.iproto.key.SYNC
---
- 1
...

box.iproto.type

box.iproto.type¶

Contains all available request types. Learn more about the requests: Client-server requests and responses.

Example

tarantool> box.iproto.type.UNKNOWN
---
- -1
...
tarantool> box.iproto.type.CHUNK
---
- 128
...

box.iproto.flag

box.iproto.flag¶

Contains the flags from the IPROTO_FLAGS key. Learn more: IPROTO_FLAGS key.

Example

tarantool> box.iproto.flag.COMMIT
---
- 1
...
tarantool> box.iproto.flag.WAIT_SYNC
---
- 2
...

box.iproto.ballot_key

box.iproto.ballot_key¶

Contains the keys from the IPROTO_BALLOT requests. Learn more: IPROTO_BALLOT keys.

Example

tarantool> box.iproto.ballot_key.IS_RO_CFG
---
- 1
...
tarantool> box.iproto.ballot_key.VCLOCK
---
- 2
...

box.iproto.metadata_key

box.iproto.metadata_key¶

Contains the IPROTO_FIELD_* keys, which are nested in the IPROTO_METADATA key.

Example

tarantool> box.iproto.metadata_key.NAME
---
- 0
...
tarantool> box.iproto.metadata_key.TYPE
---
- 1
...

box.iproto.raft_key

box.iproto.raft_key¶

Contains the keys from the IPROTO_RAFT_* requests. Learn more: Synchronous replication keys.

Example

tarantool> box.iproto.raft_key.TERM
---
- 0
...
tarantool> box.iproto.raft_key.VOTE
---
- 1
...

box.iproto.protocol_version

box.iproto.protocol_version¶

The current IPROTO protocol version of the server. Learn more: IPROTO_ID.

Example

tarantool> box.iproto.protocol_version
---
- 4
...

box.iproto.protocol_features

box.iproto.protocol_features¶

The set of IPROTO protocol features supported by the server. Learn more: net.box features, src/box/iproto_features.h, and iproto_features_resolve().

Example

tarantool> box.iproto.protocol_features
---
- transactions: true
  watchers: true
  error_extension: true
  streams: true
  pagination: true
...

box.iproto.feature

box.iproto.feature¶

Contains the IPROTO protocol features that are supported by the server. Each feature is mapped to its corresponding code. Learn more: IPROTO_FEATURES.

The features in the namespace are written

in lowercase letters
without the IPROTO_FEATURE_ prefix

Example

tarantool> box.iproto.feature.streams
---
- 0
...
tarantool> box.iproto.feature.transactions
---
- 1
...

box.iproto.override()

box.iproto.override(request_type, handler)¶

Since version 2.11.0. Set a new IPROTO request handler callback for the given request type.

Parameters:

request_type (number) –
a request type code. Possible values:
- a type code from box.iproto.type (except box.iproto.type.UNKNOWN) – override the existing request type handler.
- box.iproto.type.UNKNOWN – override the handler of unknown request types.
handler (function) –
IPROTO request handler. The signature of a handler function: function(sid, header, body), where
- header (userdata): a request header encoded as a msgpack_object
- body (userdata): a request body encoded as a msgpack_object
Returns true on success, otherwise false. On false, there is a fallback to the default handler. Also, you can indicate an error by throwing an exception. In this case, the return value is false, but this does not always mean a failure.

To reset the request handler, set the handler parameter to nil.

Return:

none

Possible errors:

If a Lua handler throws an exception, the behavior is similar to that of a remote procedure call. The following errors are returned to the client over IPROTO (see src/lua/utils.h):

ER_PROC_LUA – an exception is thrown from a Lua handler, diagnostic is not set.
diagnostics from src/box/errcode.h – an exception is thrown, diagnostic is set.

For details, see src/box/errcode.h.

Warning

When using box.iproto.override(), it is important that you follow the wire protocol. That is, the server response should match the return value types of the corresponding request type. Otherwise, it could lead to peer breakdown or undefined behavior.

Example:

Define a handler function for the box.iproto.type.SELECT request type:

local function iproto_select_handler_lua(header, body)
    if body.space_id == 512 then
        box.iproto.send(box.session.id(),
                { request_type = box.iproto.type.OK,
                  sync = header.SYNC,
                  schema_version = box.info.schema_version },
                { data = { 1, 2, 3 } })
        return true
    end
    return false
end

Override box.iproto.type.SELECT handler:

box.iproto.override(box.iproto.type.SELECT, iproto_select_handler_lua)

Reset box.iproto.type.SELECT handler:

box.iproto.override(box.iproto.type.SELECT, nil)

Override a handler function for the unknown request type:

box.iproto.override(box.iproto.type.UNKNOWN, iproto_unknown_request_handler_lua)
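
The iproto_unknown_request_handler_lua function used above is not defined in this section. A minimal sketch of what such a handler might look like follows. It assumes the same (header, body) arguments as the SELECT handler example, and the counter and its accessor are hypothetical names. The handler only counts unknown requests and returns false, so the default handler still runs and no custom response has to follow the wire protocol.

```lua
-- Hypothetical counter of unknown requests seen by this instance.
local unknown_request_count = 0

-- A sketch of a handler for unknown request types. The arguments follow the
-- SELECT handler example and are not inspected here.
local function iproto_unknown_request_handler_lua(header, body)
    unknown_request_count = unknown_request_count + 1
    -- Returning false falls back to the default handler.
    return false
end

-- Hypothetical accessor for monitoring code.
local function get_unknown_request_count()
    return unknown_request_count
end

-- In Tarantool, the handler would then be registered with:
-- box.iproto.override(box.iproto.type.UNKNOWN, iproto_unknown_request_handler_lua)
```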

box.iproto.send()

box.iproto.send(sid, header[, body])¶

Since version 2.11.0. Send an IPROTO packet over the session’s socket with the given MsgPack header and body. The header and body contain exported IPROTO constants from the box.iproto submodule. Possible IPROTO constant formats:

a lowercase constant without the IPROTO_ prefix (schema_version, request_type)
a constant from the corresponding box.iproto subnamespace (box.iproto.key.SCHEMA_VERSION, box.iproto.key.REQUEST_TYPE)

The function works for binary sessions only. For details, see box.session.type().

Parameters:

sid (number) – the IPROTO session identifier (see box.session.id())
header (table|string) – a request header encoded as MsgPack
body (table|string|nil) – a request body encoded as MsgPack

Return:	0 on success, otherwise an error is raised
Rtype:	number

Possible errors:

ER_SESSION_CLOSED – the session is closed.
ER_NO_SUCH_SESSION – the session does not exist.
ER_MEMORY_ISSUE – out-of-memory limit has been reached.
ER_WRONG_SESSION_TYPE – the session type is not binary.

For details, see src/box/errcode.h.

Examples:

Send a packet using Lua tables and string IPROTO constants as keys:

box.iproto.send(box.session.id(),
        { request_type = box.iproto.type.OK,
          sync = 10,
          schema_version = box.info.schema_version },
        { data = 1 })

Send a packet using Lua tables and numeric IPROTO constants:

box.iproto.send(box.session.id(),
        { [box.iproto.key.REQUEST_TYPE] = box.iproto.type.OK,
          [box.iproto.key.SYNC] = 10,
          [box.iproto.key.SCHEMA_VERSION] = box.info.schema_version },
        { [box.iproto.key.DATA] = 1 })

Send a packet that contains only the header:

box.iproto.send(box.session.id(),
        { request_type = box.iproto.type.OK,
          sync = 10,
          schema_version = box.info.schema_version })

Submodule box.read_view

The box.read_view submodule contains functions related to read views.

Name	Use
box.read_view.list()	Return an array of all active database read views.
box.read_view.open()	Create a new read view.

box.read_view.list()

box.read_view.list()¶

Return an array of all active database read views. This array might include the following read view types:

read views created by application code (Enterprise Edition only)
system read views (used, for example, to make a checkpoint or join a new replica)

Read views created by application code also have the space field. The field lists all spaces available in a read view, and may be used like a read view object returned by box.read_view.open().

Note

box.read_view.list() also contains read views created using the C API (box_raw_read_view_new()). Note that you cannot access database spaces included in such views from Lua.

Example:

tarantool> box.read_view.list()
---
- - timestamp: 1138.98706933
    signature: 47
    is_system: false
    status: open
    vclock: &0 {1: 47}
    name: read_view1
    id: 1
  - timestamp: 1172.202995842
    signature: 49
    is_system: false
    status: open
    vclock: &1 {1: 49}
    name: read_view2
    id: 2
...
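
The list can be filtered by the fields shown above, for example, to find the read views opened by application code. A minimal sketch, assuming a hypothetical helper named open_user_read_views that takes an array shaped like the result of box.read_view.list():

```lua
-- Hypothetical helper, not part of Tarantool: return the names of open,
-- non-system read views from an array shaped like box.read_view.list().
local function open_user_read_views(views)
    local names = {}
    for _, view in ipairs(views) do
        if not view.is_system and view.status == 'open' then
            table.insert(names, view.name)
        end
    end
    return names
end

-- Sample data: the two views from the example plus a made-up system view.
local sample_views = {
    { id = 1, name = 'read_view1', is_system = false, status = 'open' },
    { id = 2, name = 'read_view2', is_system = false, status = 'open' },
    { id = 3, name = 'system_view', is_system = true, status = 'open' },
}
print(table.concat(open_user_read_views(sample_views), ', '))
```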

box.read_view.open()

Enterprise Edition

This API is available in the Enterprise Edition only.

box.read_view.open({opts})¶

Create a new read view.

Parameters:	opts (table) – (optional) configuration options for a read view. For example, the name option specifies a read view name. If name is not specified, the read view name is set to unknown.
Return:	a created read view object
Rtype:	read_view_object

Example:

tarantool> read_view1 = box.read_view.open({name = 'read_view1'})

object read_view_object¶

An object that represents a read view.

read_view_object:info()¶

Get information about a read view such as a name, status, or ID. All the available fields are listed below in the object options.

Return:	information about a read view
Rtype:	table

read_view_object:close()¶: Close a read view. After the read view is closed, its status is set to closed. On an attempt to use it, an error is raised.

read_view_object.status¶

A read view status. The possible values are open and closed.

Rtype:	string

read_view_object.id¶

A unique numeric identifier of a read view.

Rtype:	number

read_view_object.name¶

A read view name. You can specify a read view name in the box.read_view.open() arguments.

Rtype:	string

read_view_object.is_system¶

Determine whether a read view is system. For example, system read views can be created to make a checkpoint or join a new replica.

Rtype:	boolean

read_view_object.timestamp¶

The fiber.clock() value at the moment of opening a read view.

Rtype:	number

read_view_object.vclock¶

The box.info.vclock value at the moment of opening a read view.

Rtype:	table

read_view_object.signature¶

The box.info.signature value at the moment of opening a read view.

Rtype:	number

read_view_object.space¶

Get access to database spaces included in a read view. You can use this field to query space data.

Rtype:	space object

Submodule box.schema

The box.schema submodule has data-definition functions for spaces, users, roles, function tuples, and sequences.

Below is a list of all box.schema functions.

Name	Use
box.schema.space.create()	Create a space
box.schema.upgrade()	Upgrade a database
box.schema.downgrade()	Downgrade a database
box.schema.downgrade_issues()	List downgrade issues for the specified Tarantool version
box.schema.downgrade_versions()	List Tarantool versions available for downgrade
box.schema.user.create()	Create a user
box.schema.user.drop()	Drop a user
box.schema.user.exists()	Check if a user exists
box.schema.user.grant()	Grant privileges to a user or a role
box.schema.user.revoke()	Revoke privileges from a user or a role
box.schema.user.enable()	Grant `usage` and `session` permissions
box.schema.user.disable()	Revoke `usage` and `session` permissions
box.schema.user.password()	Get a hash of a user’s password
box.schema.user.passwd()	Associate a password with a user
box.schema.user.info()	Get a description of a user’s privileges
box.schema.role.create()	Create a role
box.schema.role.drop()	Drop a role
box.schema.role.exists()	Check if a role exists
box.schema.role.grant()	Grant privileges to a role
box.schema.role.revoke()	Revoke privileges from a role
box.schema.role.info()	Get a description of a role’s privileges
box.schema.func.create()	Create a function tuple
box.schema.func.drop()	Drop a function tuple
box.schema.func.exists()	Check if a function tuple exists
box.schema.func.reload()	Reload a C module with all its functions, no restart

box.schema.space.create()

box.schema.space.create(space-name[, {space_opts}])¶

box.schema.create_space(space-name[, {space_opts}])¶

Create a space. You can use either syntax. For example, s = box.schema.space.create('tester') has the same effect as s = box.schema.create_space('tester').

There are three syntax variations for object references targeting space objects. For example, box.schema.space.drop(space-id) drops a space. However, the common approach is to use functions attached to the space objects, for example, space_object:drop().

After a space is created, usually the next step is to create an index for it, and then it is available for insert, select, and all the other box.space functions.

Parameters:

space-name (string) – name of space, which should conform to the rules for object names
options (table) – space options (see space_opts)

Return:	space object
Rtype:	userdata

space_opts

object space_opts

Space options that include the space id, format, field count, constraints and foreign keys, and so on. These options are passed to the box.schema.space.create() method.

Note

These options are also passed to space_object:alter().

space_opts.if_not_exists: Create a space only if a space with the same name does not exist already. Otherwise, do nothing but do not cause an error.

Type: boolean

Default: false

space_opts.engine: A storage engine.

Type: string

Default: memtx

Possible values: memtx, vinyl

space_opts.id: A unique numeric identifier of the space: users can refer to spaces with this id instead of the name.

Type: number

Default: last space’s ID + 1

space_opts.field_count: A fixed count of fields. For example, if field_count=5, it is illegal to insert a tuple with fewer than or more than 5 fields.

Type: number

Default: 0 (not fixed)

space_opts.user: The name of the user who is considered to be the space’s owner for authorization purposes.

Type: string

Default: current user’s name

space_opts.format: Field names and types. See the illustrations of format clauses in the space_object:format() description and in the box.space._space example. Optional and usually not specified.

Type: table

Default: blank

space_opts.is_local: Space contents are replication-local: changes are stored in the write-ahead log of the local node but there is no replication.

Type: boolean

Default: false

space_opts.temporary: Space contents are temporary: changes are not stored in the write-ahead log and there is no replication.

Note

Vinyl does not support temporary spaces.

Type: boolean

Default: false

space_opts.is_sync

Any transaction doing a DML request on this space becomes synchronous.

Example:

box.schema.space.create('bands', { is_sync = true })

Type: boolean

Default: false

space_opts.constraint

The constraints that space tuples must satisfy.

Type: table

Default: blank

Example:

-- Define a tuple constraint function --
box.schema.func.create('check_person', {
    language = 'LUA',
    is_deterministic = true,
    body = 'function(t, c) return (t.age >= 0 and #(t.name) > 3) end'
})

-- Create a space with a tuple constraint --
customers = box.schema.space.create('customers', {constraint = 'check_person'})

space_opts.foreign_key

The foreign keys for space fields.

Type: table

Default: blank

Example:

-- Create a space with a tuple foreign key --
box.schema.space.create("orders", {
    foreign_key = {
        space = 'customers',
        field = {customer_id = 'id', customer_name = 'name'}
    }
})

box.space.orders:format({
    {name = "id", type = "number"},
    {name = "customer_id" },
    {name = "customer_name"},
    {name = "price_total", type = "number"},
})

Saying box.cfg{read_only=true...} during configuration affects spaces differently depending on the options that were used during box.schema.space.create, as summarized by this chart:

Option	Can be created?	Can be written to?	Is replicated?	Is persistent?
(default)	no	no	yes	yes
temporary	no	yes	no	no
is_local	no	yes	no	yes

Example:

tarantool> s = box.schema.space.create('space55')
---
...
tarantool> s = box.schema.space.create('space55', {
         >   id = 555,
         >   temporary = false
         > })
---
- error: Space 'space55' already exists
...
tarantool> s = box.schema.space.create('space55', {
         >   if_not_exists = true
         > })
---
...

box.schema.upgrade()

box.schema.upgrade()

If you created a database with an older Tarantool version and have now installed a newer version, make the request box.schema.upgrade(). This updates Tarantool system spaces to match the currently installed version of Tarantool. You can learn about the general upgrade process from the Upgrades topic.

For example, here is what happens when you run box.schema.upgrade() on a database that was created with Tarantool version 1.6.4 after installing version 1.7.2 (only a small part of the output is shown):

tarantool> box.schema.upgrade()
alter index primary on _space set options to {"unique":true}, parts to [[0,"unsigned"]]
alter space _schema set options to {}
create view _vindex...
grant read access to 'public' role for _vindex view
set schema version to 1.7.0
---
...

You can also put the request box.schema.upgrade() inside a box.once() function in your Tarantool initialization file. On startup, this will create new system spaces, update data type names (for example, num -> unsigned, str -> string) and options in Tarantool system spaces.
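A minimal sketch of such an initialization file is shown below. The key 'schema_upgrade_demo' is a hypothetical name chosen for illustration; box.once() runs the function only the first time it sees a given key.

```lua
-- Sketch of an initialization file; the key name is an illustrative assumption.
box.cfg{}

-- The wrapped function runs once per key, so the upgrade request
-- is not repeated on every restart.
box.once('schema_upgrade_demo', function()
    box.schema.upgrade()
end)
```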

box.schema.downgrade()

box.schema.downgrade(version)

Allows you to downgrade a database to the specified Tarantool version. This might be useful if you need to run a database on older Tarantool versions.

To prepare a database for using it on an older Tarantool instance, call box.schema.downgrade and pass the desired Tarantool version:

tarantool> box.schema.downgrade('2.8.4')

Note

Tarantool’s downgrade procedure is similar to the upgrade process described in the Upgrades topic. You need to run box.schema.downgrade() only on the master, and execute box.snapshot() on every instance in a replica set before restarting it on an older version.

To see Tarantool versions available for downgrade, call box.schema.downgrade_versions(). The oldest release available for downgrade is 2.8.2.

Note that the downgrade process might fail if the database enables specific features not supported in the target Tarantool version. You can see all such issues using the box.schema.downgrade_issues() method, which accepts the target version. For example, downgrade to the 2.8.4 version fails if you use tuple compression or field constraints in your database:

tarantool> box.schema.downgrade_issues('2.8.4')
---
- - Tuple compression is found in space 'bands', field 'band_name'. It is supported
    starting from version 2.10.0.
  - Field constraint is found in space 'bands', field 'year'. It is supported starting
    from version 2.10.0.
...
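One way to combine these calls is to check for issues first and downgrade only when none are reported. The helper downgrade_if_clean below is a hypothetical name, not part of Tarantool. It takes the schema API as an argument (box.schema on a real instance) so that the logic can be tried against a stub. Treat it as a sketch rather than a complete downgrade procedure.

```lua
-- Hypothetical helper: downgrade only if no issues are reported.
-- 'schema' is expected to expose downgrade_issues() and downgrade(),
-- as box.schema does.
local function downgrade_if_clean(schema, version)
    local issues = schema.downgrade_issues(version)
    if #issues > 0 then
        return false, issues   -- nothing was changed
    end
    schema.downgrade(version)
    return true, issues
end

-- On a real master instance this could be called as:
--   local ok, issues = downgrade_if_clean(box.schema, '2.8.4')
--   if ok then box.snapshot() end
```

If the helper returns false, the second return value holds the list of issues to resolve before retrying.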

box.schema.downgrade_versions()

box.schema.downgrade_versions()

Return a list of Tarantool versions available for downgrade. To learn how to downgrade a database to the specified Tarantool version, see box.schema.downgrade().

Return:	a list of Tarantool versions
Rtype:	table

box.schema.downgrade_issues()

box.schema.downgrade_issues(version)

Return a list of downgrade issues for the specified Tarantool version. To learn how to downgrade a database to the specified Tarantool version, see box.schema.downgrade().

Return:	a list of downgrade issues
Rtype:	table

box.schema.user.create()

box.schema.user.create(name[, {options}])

Create a user. For an explanation of how Tarantool maintains user data, see the Users section and the reference on the _user space.

The possible options are:

if_not_exists = true|false (default = false) - boolean; true means there should be no error if the user already exists,
password (default = ‘’) - string; specifying a password is recommended because in a URI (Uniform Resource Identifier) it is usually illegal to include a username without a password.

Note

The maximum number of users is 32.

Parameters:

name (string) – a user name, which should conform to the rules for object names
options (table) – if_not_exists, password

Return:	nil

Examples:

box.schema.user.create('testuser', { password = 'foobar' })

See also: Managing users.

box.schema.user.drop()

box.schema.user.drop(username[, {options}])

Drop a user. For an explanation of how Tarantool maintains user data, see the Users section and the reference on the _user space.

Parameters:

username (string) – the name of the user
options (table) – if_exists = true|false (default = false) - boolean; true means there should be no error if the user does not exist.

Examples:

box.schema.user.drop('testuser')

See also: Managing users.

box.schema.user.exists()

box.schema.user.exists(username)

Return true if a user exists; return false if a user does not exist. For an explanation of how Tarantool maintains user data, see the Users section and the reference on the _user space.

Parameters:	username (`string`) – the name of the user
Rtype:	bool

See also: Getting a user’s information.

box.schema.user.grant()

box.schema.user.grant(username, permissions, object-type, object-name[, {options}])

box.schema.user.grant(username, permissions, 'universe'[, nil, {options}])

box.schema.user.grant(username, role-name[, nil, nil, {options}])

Grant privileges to a user or to another role.

Parameters:

username (string) – the name of a user to grant privileges to
permissions (string) – one or more permissions to grant to the user (for example, read or read,write)
object-type (string) – a database object type to grant permissions to (for example, space, role, or function)
object-name (string) – the name of a database object to grant permissions to
role-name (string) – the name of a role to grant to the user
options (table) – grantor, if_not_exists

If 'function','object-name' is specified, then a _func tuple with that object-name must exist.

Variation: instead of object-type, object-name say universe which means ‘all object-types and all objects’. In this case, object name is omitted.

Variation: instead of permissions, object-type, object-name say role-name (see section Roles).

Variation: instead of box.schema.user.grant('username','usage,session','universe',nil, {if_not_exists=true}) say box.schema.user.enable('username') (see section box.schema.user.enable).

The possible options are:

grantor = grantor_name_or_id – string or number, for custom grantor,
if_not_exists = true|false (default = false) - boolean; true means there should be no error if the user already has the privilege.

Example:

box.schema.user.grant('testuser', 'read', 'space', 'writers')
box.schema.user.grant('testuser', 'read,write', 'space', 'books')
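The role-name variation can be sketched in the same way: a privilege is granted to a role, and the role is then granted to a user. All names below (books_demo, books_reader_demo, reader_demo) are hypothetical and used only for illustration.

```lua
box.cfg{}

-- Hypothetical objects used only for this sketch.
box.schema.space.create('books_demo', { if_not_exists = true })
box.schema.role.create('books_reader_demo', { if_not_exists = true })
box.schema.user.create('reader_demo', { password = 'secret', if_not_exists = true })

-- Grant a permission on a space to the role ...
box.schema.role.grant('books_reader_demo', 'read', 'space', 'books_demo',
                      { if_not_exists = true })
-- ... then grant the role itself to the user
-- (object-type and object-name are passed as nil).
box.schema.user.grant('reader_demo', 'books_reader_demo', nil, nil,
                      { if_not_exists = true })
```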

See also: Managing users.

box.schema.user.revoke()

box.schema.user.revoke(username, permissions, object-type, object-name[, {options}])

box.schema.user.revoke(username, permissions, 'universe'[, nil, {options}])

box.schema.user.revoke(username, role-name[, nil, nil, {options}])

Revoke privileges from a user or from another role.

Parameters:

username (string) – the name of the user
permissions (string) – one or more permissions to revoke from the user (for example, read or read,write)
object-type (string) – a database object type to revoke permissions from (for example, space, role, or function)
object-name (string) – the name of a database object to revoke permissions from
options (table) – if_exists

The user must exist, and the object must exist, but if the option setting is {if_exists=true} then it is not an error if the user does not have the privilege.

Variation: instead of object-type, object-name say ‘universe’ which means ‘all object-types and all objects’.

Variation: instead of permissions, object-type, object-name say role-name (see section Roles).

Variation: instead of box.schema.user.revoke('username','usage,session','universe',nil, {if_exists=true}) say box.schema.user.disable('username') (see section box.schema.user.disable).

Example:

box.schema.user.revoke('testuser', 'write', 'space', 'books')

See also: Managing users.

box.schema.user.enable()

box.schema.user.enable(username)

Grants usage and session permissions to the subject user. Equivalent to the following call:

box.schema.user.grant(username, 'usage,session', 'universe', nil, {if_not_exists = true})

Note

session - allows the binary protocol layer (iproto) to authenticate the user
usage - lets the user use their privileges on database objects (such as read, write and alter space)

For more information about granting permissions see section box.schema.user.grant.

Parameters:	username (`string`) – the name of the subject user
Return:	(if success) nothing

Possible errors:

NO_SUCH_USER - in case the subject user is not found.

box.schema.user.disable()

box.schema.user.disable(username)

Revokes usage and session permissions from the subject user. Equivalent to the following call:

box.schema.user.revoke(username, 'usage,session', 'universe', nil, {if_exists = true})

Note

session - allows the binary protocol layer (iproto) to authenticate the user
usage - lets the user use their privileges on database objects (such as read, write and alter space)

For more information about revoking permissions see section box.schema.user.revoke.

Parameters:	username (`string`) – the name of the subject user
Return:	(if success) nothing

Possible errors:

NO_SUCH_USER - in case the subject user is not found.

box.schema.user.password()

box.schema.user.password(password)

Return a hash of a user’s password. For an explanation of how Tarantool maintains passwords, see the Passwords section and the reference on the _user space.

Note

If a non-‘guest’ user has no password, it’s impossible to connect to Tarantool using this user. The user is regarded as “internal” only, not usable from a remote connection. Such users can be useful if they have defined some procedures with the SETUID option, on which privileges are granted to externally-connectable users. This way, external users cannot create/drop objects, they can only invoke procedures.
For the ‘guest’ user, it’s impossible to set a password: that would be misleading, since ‘guest’ is the default user on a newly-established connection over a binary port, and Tarantool does not require a password to establish a binary connection. It is, however, possible to change the current user to ‘guest’ by providing the AUTH packet with no password at all or an empty password. This feature is useful for connection pools, which want to reuse a connection for a different user without re-establishing it.

Parameters:	password (`string`) – password to be hashed
Rtype:	string

Example:

box.schema.user.password('foobar')

box.schema.user.passwd()

box.schema.user.passwd([username, ]password)

Set a password for the currently logged-in user or for a specified user:

A currently logged-in user can change their password using box.schema.user.passwd(password).
An administrator can change the password of another user with box.schema.user.passwd(username, password).

Parameters:

username (string) – a username
password (string) – a new password

Example:

box.schema.user.passwd('testuser', 'foobar')

See also: Managing users.

box.schema.user.info()

box.schema.user.info([username])

Return a description of a user’s privileges.

Parameters:	username (`string`) – the name of the user. This is optional; if it is not supplied, then the information will be for the user who is currently logged in.

See also: Getting a user’s information.

box.schema.role.create()

box.schema.role.create(role-name[, {options}])

Create a role. For an explanation of how Tarantool maintains role data, see the Roles section.

Parameters:

role-name (string) – name of role, which should conform to the rules for object names
options (table) – if_not_exists = true|false (default = false) - boolean; true means there should be no error if the role already exists

Return:	nil

Example:

box.schema.role.create('books_space_manager')
box.schema.role.create('writers_space_reader')

See also: Managing roles.

box.schema.role.drop()

box.schema.role.drop(role-name[, {options}])

Drop a role. For an explanation of how Tarantool maintains role data, see the Roles section.

Parameters:

role-name (string) – the name of the role
options (table) – if_exists = true|false (default = false) - boolean; true means there should be no error if the role does not exist.

Example:

box.schema.role.drop('writers_space_reader')

See also: Managing roles.

box.schema.role.exists()

box.schema.role.exists(role-name)

Return true if a role exists; return false if a role does not exist.

Parameters:	role-name (`string`) – the name of the role
Rtype:	bool

See also: Getting a role’s information.

box.schema.role.grant()

box.schema.role.grant(role-name, permissions, object-type, object-name[, option])

box.schema.role.grant(role-name, permissions, 'universe'[, nil, option])

box.schema.role.grant(role-name, role-name[, nil, nil, option])

Grant privileges to a role.

Parameters:

role-name (string) – the name of the role
permissions (string) – one or more permissions to grant to the role (for example, read or read,write)
object-type (string) – a database object type to grant permissions to (for example, space, role, or function)
object-name (string) – the name of a database object to grant permissions to
option (table) – if_not_exists = true|false (default = false) - boolean; true means there should be no error if the role already has the privilege

The role must exist, and the object must exist.

Variation: instead of object-type, object-name say universe which means ‘all object-types and all objects’. In this case, object name is omitted.

Variation: instead of permissions, object-type, object-name say role-name – to grant a role to a role.

Example:

box.schema.role.grant('books_space_manager', 'read,write', 'space', 'books')

See also: Managing roles.

box.schema.role.revoke()

box.schema.role.revoke(role-name, permissions, object-type, object-name)

Revoke privileges from a role.

Parameters:

role-name (string) – the name of the role
permissions (string) – one or more permissions to revoke from the role (for example, read or read,write)
object-type (string) – a database object type to revoke permissions from (for example, space, role, or function)
object-name (string) – the name of a database object to revoke permissions from

The role must exist, and the object must exist, but it is not an error if the role does not have the privilege.

Variation: instead of object-type, object-name say universe which means ‘all object-types and all objects’.

Variation: instead of permissions, object-type, object-name say role-name.

See also: Managing roles.

box.schema.role.info()

box.schema.role.info(role-name)

Return a description of a role’s privileges.

Parameters:	role-name (`string`) – the name of the role.

See also: Getting a role’s information.

box.schema.func.create()

box.schema.func.create(func_name[, function_options])

Create a function. The created function can be used in different usage scenarios, for example, in field or tuple constraints or functional indexes.

Using the body option, you can make a function persistent. In this case, the function is “persistent” because its definition is stored in a snapshot (the box.space._func system space) and can be recovered if the server restarts.

Parameters:

func_name (string) – a name of the function, which should conform to the rules for object names
function_options (table) – see function_options

Return:	nil

Note

box.schema.user.grant() can be used to allow the specified user or role to execute the created function.

Example 1: a non-persistent Lua function

The example below shows how to create a non-persistent Lua function:

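-- Each call below is an alternative way to register the same function
-- (the options shown are the defaults). Run only one of them:
-- a repeated call fails because 'calculate' already exists.
box.schema.func.create('calculate')
box.schema.func.create('calculate', {if_not_exists = false})
box.schema.func.create('calculate', {setuid = false})
box.schema.func.create('calculate', {language = 'LUA'})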

Example 2: a persistent Lua function

The example below shows how to create a persistent Lua function, show its definition using box.func.{func-name}, and call this function using box.func.{func-name}:call([parameters]):

tarantool> lua_code = [[function(a, b) return a + b end]]
tarantool> box.schema.func.create('sum', {body = lua_code})

tarantool> box.func.sum
---
- is_sandboxed: false
  is_deterministic: false
  id: 2
  setuid: false
  body: function(a, b) return a + b end
  name: sum
  language: LUA
...

tarantool> box.func.sum:call({1, 2})
---
- 3
...

To call functions using net.box, use net_box:call().

Example 3: a persistent SQL expression used in a tuple constraint

The code snippet below defines a function that checks a tuple’s data using the SQL expression:

box.schema.func.create('check_person', {
    language = 'SQL_EXPR',
    is_deterministic = true,
    body = [["age" > 21 AND "name" != 'Admin']]
})

Then, this function is used to create a tuple constraint:

local customers = box.schema.space.create('customers', { constraint = 'check_person' })
customers:format({
    { name = 'id', type = 'number' },
    { name = 'name', type = 'string' },
    { name = 'age', type = 'number' },
})
customers:create_index('primary', { parts = { 1 } })

On an attempt to insert a tuple that doesn’t meet the required criteria, an error is raised:

customers:insert { 2, "Bob", 18 }
-- error: Check constraint 'check_person' failed for a tuple

function_options

object function_options

A table containing options passed to the box.schema.func.create(func-name [, function_options]) function.

function_options.if_not_exists: Specify whether there should be no error if the function already exists.

Type: boolean

Default: false

function_options.setuid: Make Tarantool treat the function’s caller as the function’s creator, with full privileges. Note that setuid works only over binary ports. setuid doesn’t work if you invoke a function using the admin console or inside a Lua script.

Type: boolean

Default: false

function_options.language

Specify the function language. The possible values are:

LUA: define a Lua function in the body attribute.
SQL_EXPR: define an SQL expression in the body attribute. An SQL expression can only be used as a field or tuple constraint.
C: import a C function using its name from a .so file. Learn how to call C code from Lua in the C tutorial.

Note

To reload a C module with all its functions without restarting the server, call box.schema.func.reload().

Type: string

Default: LUA

function_options.is_sandboxed

Whether the function should be executed in an isolated environment. This means that any operation that accesses the world outside the sandbox is forbidden or has no effect. Therefore, a sandboxed function can only use modules and functions that cannot affect isolation:

assert, error, ipairs, math.*, next, pairs, pcall, print, select, string.*, table.*, tonumber, tostring, type, unpack, xpcall, utf8.*.

Also, a sandboxed function cannot refer to global variables – they are treated as local variables because the sandbox is established with setfenv. So, a sandboxed function is stateless and deterministic.

Type: boolean

Default: false
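As a minimal sketch, a sandboxed persistent function can be created and called like this. The function name double_demo is hypothetical, and the body uses only arithmetic, so it does not need anything outside the sandbox.

```lua
box.cfg{}

-- 'double_demo' is an illustrative name, not an existing function.
box.schema.func.create('double_demo', {
    language = 'LUA',
    body = [[function(x) return x * 2 end]],
    is_sandboxed = true,
    is_deterministic = true,
    if_not_exists = true,
})

-- Call the persistent function through box.func.
local result = box.func.double_demo:call({ 21 })
print(result)
```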

function_options.is_deterministic: Specify whether a function should be deterministic.

Type: boolean

Default: false

function_options.is_multikey: If true is set in the function definition for a functional index, the function returns multiple keys. For details, see the example.

Type: boolean

Default: false
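The following sketch shows one possible multikey function: it returns one key per word of the second tuple field, and a non-unique functional index is built on it. The names words_demo, docs_demo, and by_word are hypothetical, and the example assumes the default memtx engine.

```lua
box.cfg{}

-- Multikey function: returns a list of keys, each key wrapped in a table.
local body = [[function(tuple)
    local keys = {}
    for word in string.gmatch(tuple[2], '%S+') do
        table.insert(keys, { word })
    end
    return keys
end]]

box.schema.func.create('words_demo', {
    body = body,
    is_deterministic = true,
    is_sandboxed = true,
    is_multikey = true,
    if_not_exists = true,
})

local s = box.schema.space.create('docs_demo', { if_not_exists = true })
s:create_index('primary', {
    parts = { { field = 1, type = 'unsigned' } },
    if_not_exists = true,
})
-- Functional index: every word of field 2 becomes a separate key.
s:create_index('by_word', {
    unique = false,
    func = 'words_demo',
    parts = { { field = 1, type = 'string' } },
    if_not_exists = true,
})

s:replace({ 1, 'hello world' })
local found = s.index.by_word:select('world')
```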

function_options.body

Specify a function body. You can set a function’s language using the language attribute.

The code snippet below defines a constraint function that checks a tuple’s data using a Lua function:

box.schema.func.create('check_person', {
    language = 'LUA',
    is_deterministic = true,
    body = 'function(t, c) return (t.age >= 0 and #(t.name) > 3) end'
})

In the following example, an SQL expression is used to check a tuple’s data:

box.schema.func.create('check_person', {
    language = 'SQL_EXPR',
    is_deterministic = true,
    body = [["age" > 21 AND "name" != 'Admin']]
})

See also: Example 3: a persistent SQL expression used in a tuple constraint

Type: string

Default: nil

function_options.takes_raw_args

Since: 2.10.0

If set to true for a Lua function and the function is called via net.box (conn:call()) or by box.func.<func-name>:call(), the function arguments are passed wrapped in a MsgPack object:

local msgpack = require('msgpack')
box.schema.func.create('my_func', {takes_raw_args = true})
-- A non-persistent function is looked up by name, so it is defined as a global
function my_func(mp)
    assert(msgpack.is_object(mp))
    local args = mp:decode() -- array of arguments
end

If a function forwards most of its arguments to another Tarantool instance or writes them to a database, the usage of this option can improve performance because it skips the MsgPack data decoding in Lua.

Type: boolean

Default: false

function_options.exports

Specify the languages that can call the function.

Example: exports = {'LUA', 'SQL'}

box.schema.func.drop()

box.schema.func.drop(func-name[, {options}])

Drop a function tuple. For an explanation of how Tarantool maintains function data, see the reference on the _func space.

Parameters:

func-name (string) – the name of the function
options (table) – if_exists = true|false (default = false) - boolean; true means there should be no error if the _func tuple does not exist.

Example:

box.schema.func.drop('calculate')

box.schema.func.exists()

box.schema.func.exists(func-name)

Return true if a function tuple exists; return false if a function tuple does not exist.

Parameters:	func-name (`string`) – the name of the function
Rtype:	bool

Example:

box.schema.func.exists('calculate')

box.schema.func.reload()

box.schema.func.reload([name])

Reload a C module with all its functions without restarting the server.

Under the hood, Tarantool loads a new copy of the module (*.so shared library) and starts routing all new requests to the new version. The previous version remains active until all started calls are finished. All shared libraries are loaded with RTLD_LOCAL (see “man 3 dlopen”), therefore multiple copies can co-exist without any problems.

Note

Reload will fail if a module was loaded from a Lua script with ffi.load().

Parameters:	name (`string`) – the name of the module to reload

Example:

-- reload the entire module contents
box.schema.func.reload('module')

Sequences

An introduction to sequences is in the Sequences section of the “Data model” chapter. Here are the details for each function and option.

All functions related to sequences require appropriate privileges.

Below is a list of all box.schema.sequence functions.

Name	Use
box.schema.sequence.create()	Create a new sequence generator
sequence_object:next()	Generate and return the next value
sequence_object:alter()	Change sequence options
sequence_object:reset()	Reset sequence state
sequence_object:set()	Set the new value
sequence_object:current()	Return the last retrieved value
sequence_object:drop()	Drop the sequence
specifying a sequence in create_index()	Create an index with a sequence option

Example:

Here is an example showing all sequence options and operations:

s = box.schema.sequence.create(
               'S2',
               {start=100,
               min=100,
               max=200,
               cache=100000,
               cycle=false,
               step=100
               })
s:alter({step=6})
s:next()
s:reset()
s:set(150)
s:drop()

box.schema.sequence.create()

box.schema.sequence.create(name[, options])

Create a new sequence generator.

Parameters:

name (string) – the name of the sequence
options (table) – see a quick overview in the “Options for box.schema.sequence.create()” chart (in the Sequences section of the “Data model” chapter), and see more details below.

Return:	a reference to a new sequence object.

Options:

start – the STARTS WITH value. Type = integer, Default = 1.
min – the MINIMUM value. Type = integer, Default = 1.
max – the MAXIMUM value. Type = integer, Default = 9223372036854775807.

There is a rule: min <= start <= max. For example, it is illegal to say {start=0} because then the specified start value (0) would be less than the default min value (1).

There is a rule: min <= next-value <= max. For example, if the next generated value would be 1000, but the maximum value is 999, then that would be considered “overflow”.

There is a rule: start and min and max must all be <= 9223372036854775807 which is 2^63 - 1 (not 2^64).
cycle – the CYCLE value. Type = bool. Default = false.

If the sequence generator’s next value is an overflow number, it causes an error return – unless cycle == true.

But if cycle == true, the count is started again, at the MINIMUM value or at the MAXIMUM value (not the STARTS WITH value).
cache – the CACHE value. Type = unsigned integer. Default = 0.

Currently, Tarantool ignores this value; it is reserved for future use.
step – the INCREMENT BY value. Type = integer. Default = 1.

Ordinarily this is what is added to the previous value.
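The min <= start <= max rule can be checked with a short sketch such as the one below. The sequence names bad_seq_demo and zero_seq_demo are hypothetical; pcall() is used so that the expected failure does not stop the script.

```lua
box.cfg{}

-- start = 0 is below the default min (1), so creation is expected to fail.
local ok, err = pcall(box.schema.sequence.create, 'bad_seq_demo', { start = 0 })
print(ok, err)

-- Lowering min as well satisfies min <= start <= max.
local seq = box.schema.sequence.create('zero_seq_demo', { start = 0, min = 0 })
local first = seq:next()  -- the first call returns the start value
seq:drop()
```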

sequence_object:next()

object sequence_object

sequence_object:next()

Generate the next value and return it.

The generation algorithm is simple:

If this is the first time, then return the STARTS WITH value.
If the previous value plus the INCREMENT value is less than the MINIMUM value or greater than the MAXIMUM value, that is “overflow”, so either raise an error (if cycle = false) or return the MAXIMUM value (if cycle = true and step < 0) or return the MINIMUM value (if cycle = true and step > 0).

If there was no error, then save the returned result; it is now the “previous value”.

For example, suppose sequence ‘S’ has:

min == -6,
max == -1,
step == -3,
start = -2,
cycle = true,
previous value = -2.

Then box.sequence.S:next() returns -5 because -2 + (-3) == -5.

Then box.sequence.S:next() again returns -1 because -5 + (-3) < -6, which is overflow, causing cycle, and max == -1.
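The generation rules and the example above can be traced with a small pure-Lua model. This is only an illustration of the described algorithm, not Tarantool's implementation; new_sequence_model is a hypothetical helper and the code does not call the Tarantool API.

```lua
-- Pure-Lua model of the rules above; new_sequence_model is a hypothetical name.
local function new_sequence_model(opts)
    local seq = {
        start = opts.start, min = opts.min, max = opts.max,
        step = opts.step, cycle = opts.cycle,
        previous = nil,  -- no value has been generated yet
    }

    function seq:next()
        local value
        if self.previous == nil then
            value = self.start                  -- first call: STARTS WITH value
        else
            value = self.previous + self.step
            if value < self.min or value > self.max then  -- overflow
                if not self.cycle then
                    error('sequence overflow')
                end
                -- wrap around: MAXIMUM for a negative step, MINIMUM otherwise
                value = (self.step < 0) and self.max or self.min
            end
        end
        self.previous = value                   -- remember the "previous value"
        return value
    end

    return seq
end

local S = new_sequence_model({ min = -6, max = -1, step = -3, start = -2, cycle = true })
local v1 = S:next()  -- -2: the start value
local v2 = S:next()  -- -5: -2 + (-3)
local v3 = S:next()  -- -1: -5 + (-3) is below min, so the sequence cycles to max
print(v1, v2, v3)
```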

This function requires a ‘write’ privilege on the sequence.

Note

This function should not be used in “cross-engine” transactions (transactions which use both the memtx and the vinyl storage engines).

To see what the previous value was, without changing it, you can select from the _sequence_data system space.

sequence_object:alter()

object sequence_object

sequence_object:alter(options)

The alter() function can be used to change any of the sequence’s options. Requirements and restrictions are the same as for box.schema.sequence.create().

Options:

start – the STARTS WITH value. Type = integer, Default = 1.
min – the MINIMUM value. Type = integer, Default = 1.
max – the MAXIMUM value. Type = integer, Default = 9223372036854775807.

There is a rule: min <= start <= max. For example, it is illegal to say {start=0} because then the specified start value (0) would be less than the default min value (1).

There is a rule: min <= next-value <= max. For example, if the next generated value would be 1000, but the maximum value is 999, then that would be considered “overflow”.
cycle – the CYCLE value. Type = bool. Default = false.

If the sequence generator’s next value is an overflow number, it causes an error return – unless cycle == true.

But if cycle == true, the count is started again, at the MINIMUM value or at the MAXIMUM value (not the STARTS WITH value).
cache – the CACHE value. Type = unsigned integer. Default = 0.

Currently, Tarantool ignores this value; it is reserved for future use.
step – the INCREMENT BY value. Type = integer. Default = 1.

Ordinarily this is what is added to the previous value.

sequence_object:reset()

object sequence_object

sequence_object:reset(): Set the sequence back to its original state. The effect is that a subsequent next() will return the start value. This function requires a ‘write’ privilege on the sequence.

sequence_object:set()

object sequence_object

sequence_object:set(new-previous-value): Set the “previous value” to new-previous-value. This function requires a ‘write’ privilege on the sequence.

sequence_object:current()

object sequence_object

sequence_object:current()

Since version 2.4.1. Return the last retrieved value of the specified sequence or throw an error if no value has been generated yet (next() has not been called yet, or current() is called right after reset() is called).

Example:

tarantool> sq = box.schema.sequence.create('test')
---
...
tarantool> sq:current()
---
- error: Sequence 'test' is not started
...
tarantool> sq:next()
---
- 1
...
tarantool> sq:current()
---
- 1
...
tarantool> sq:set(42)
---
...
tarantool> sq:current()
---
- 42
...
tarantool> sq:reset()
---
...
tarantool> sq:current()  -- error
---
- error: Sequence 'test' is not started
...

sequence_object:drop()

object sequence_object¶

sequence_object:drop()¶: Drop an existing sequence.

specifying a sequence in create_index()

object space_object¶

space_object:create_index(... [sequence='...' option] ...)¶

You can use the sequence=sequence-name (or sequence=sequence-id or sequence=true) option when creating or altering a primary-key index. The sequence becomes associated with the index, so that the next insert() will put the next generated number into the primary-key field, if the field would otherwise be nil.

The syntax may be any of:
sequence = sequence identifier
or sequence = {id = sequence identifier }
or sequence = {field = field number }
or sequence = {id = sequence identifier , field = field number }
or sequence = true
or sequence = {}.
The sequence identifier may be either a number (the sequence id) or a string (the sequence name). The field number may be the ordinal number of any field in the index; default = 1.

Examples of all possibilities: sequence = 1 or sequence = 'sequence_name' or sequence = {id = 1} or sequence = {id = 'sequence_name'} or sequence = {id = 1, field = 1} or sequence = {id = 'sequence_name', field = 1} or sequence = {field = 1} or sequence = true or sequence = {}.

Notice that the sequence identifier can be omitted; in that case a new sequence is created automatically with default name = space-name_seq. Notice also that the field number does not have to be 1, that is, the sequence can be associated with any field in the primary-key index.
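
One way to read the list of accepted forms is that each of them collapses into a pair (sequence identifier, field number). The plain-Lua sketch below illustrates that reading. normalize_sequence_option is a hypothetical helper, not a Tarantool function, and treating false the same as an absent option is an assumption based on the sequence=false form used with index_object:alter().

```lua
-- Illustrative only; normalize_sequence_option is a hypothetical helper.
-- Returns {id = <number|string|nil>, field = <number>}, where id == nil
-- means "create a new sequence automatically", or returns nil when the
-- option is false or absent (no sequence).
function normalize_sequence_option(opt)
    if opt == nil or opt == false then
        return nil
    end
    if opt == true then
        return {id = nil, field = 1}
    end
    if type(opt) == 'number' or type(opt) == 'string' then
        return {id = opt, field = 1}
    end
    if type(opt) == 'table' then
        -- Covers {}, {id = ...}, {field = ...} and {id = ..., field = ...}.
        return {id = opt.id, field = opt.field or 1}
    end
    error('unsupported sequence option: ' .. type(opt))
end

local n = normalize_sequence_option({id = 'sequence_name'})
print(n.id, n.field)   -- sequence_name and 1
```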

For example, if ‘Q’ is a sequence and ‘T’ is a new space, then this will work:

tarantool> box.space.T:create_index('Q',{sequence='Q'})
---
- unique: true
  parts:
  - type: unsigned
    is_nullable: false
    fieldno: 1
  sequence_id: 8
  id: 0
  space_id: 514
  name: Q
  type: TREE
...

(Notice that the index now has a sequence_id field.)

And this will work:

tarantool> box.space.T:insert{box.NULL,0}
---
- [1, 0]
...

Note

The index key type may be either ‘integer’ or ‘unsigned’. If any of the sequence options is a negative number, then the index key type should be ‘integer’.

Users should not insert a value greater than 9223372036854775807, which is 2^63 - 1, in the indexed field. The sequence generator will ignore it.

A sequence cannot be dropped if it is associated with an index. However, index_object:alter() can be used to say that a sequence is not associated with an index, for example box.space.T.index.I:alter({sequence=false}).

If a sequence was created automatically because the sequence identifier was omitted, then it will be dropped automatically if the index is altered so that sequence=false, or if the index is dropped.

index_object:alter() can also be used to associate a sequence with an existing index, with the same syntax for options.

When a sequence is used with an index based on a JSON path, inserted tuples must have all components of the path preceding the autoincrement field, and the autoincrement field. To achieve that use box.NULL rather than nil. Example:

s = box.schema.space.create('test')
s:create_index('pk', {parts = {{'[1].a.b[1]', 'unsigned'}}, sequence = true})
s:replace{} -- error
s:replace{{c = {}}} -- error
s:replace{{a = {c = {}}}} -- error
s:replace{{a = {b = {}}}} -- error
s:replace{{a = {b = {nil}}}} -- error
s:replace{{a = {b = {box.NULL}}}} -- ok

Submodule box.session

The box.session submodule allows querying the session state, writing to a session-specific temporary Lua table, sending out-of-band messages, and setting up triggers that fire when a session starts or ends.

A session is an object associated with each client connection.

Below is a list of all box.session functions and members.

Name	Use
box.session.id()	Get the current session’s ID
box.session.exists()	Check if a session exists
box.session.peer()	Get the session peer’s host address and port
box.session.sync()	Get the sync integer constant
box.session.user()	Get the current user’s name
box.session.effective_user()	Get the current effective user’s name
box.session.type()	Get the connection type or cause of action
box.session.su()	Change the current user
box.session.uid()	Get the current user’s ID
box.session.euid()	Get the current effective user’s ID
box.session.storage	Table with session-specific names and values
box.session.on_connect()	Define a connect trigger
box.session.on_disconnect()	Define a disconnect trigger
box.session.on_auth()	Define an authentication trigger
box.session.on_access_denied()	Define a trigger to report restricted actions
box.session.push()	(Deprecated) Send an out-of-band message

box.session.id()

box.session.id()¶

Return the unique identifier (ID) for the current session.

Return:	the session identifier; 0 or -1 if there is no session
Rtype:	number

box.session.exists()

box.session.exists(id)¶

Return:	true if the session exists, false if the session does not exist.
Rtype:	boolean

box.session.peer()

box.session.peer(id)¶

This function works only if there is a peer, that is, if a connection has been made to a separate Tarantool instance.

Return:	The host address and port of the session peer, for example “127.0.0.1:55457”. If the session exists but there is no connection to a separate instance, the return is null. The command is executed on the server instance, so the “local name” is the server instance’s host and port, and the “peer name” is the client’s host and port.
Rtype:	string

Possible errors: ‘session.peer(): session does not exist’

box.session.sync()

box.session.sync()¶

Return:	the value of the `sync` integer constant used in the binary protocol. This value becomes invalid when the session is disconnected.
Rtype:	number

This function is local for the request, i.e. not global for the session. If the connection behind the session is multiplexed, this function can be safely used inside the request processor.

box.session.user()

box.session.user()¶

Return the name of the current Tarantool user. If the current user is changed temporarily using the box.session.su() method, box.session.user() ignores this change. In this case, box.session.user() returns the initial current user (the user who calls the box.session.su() function).

box.session.type()

box.session.type()¶

Return:	the type of connection or cause of action.
Rtype:	string

Possible return values are:

‘binary’ if the connection was done via the binary protocol, for example to a target made with box.cfg{listen=…};
‘console’ if the connection was done via the administrative console, for example to a target made with console.listen;
‘repl’ if the connection was done directly, for example when using Tarantool as a client;
‘applier’ if the action is due to replication, regardless of how the connection was done;
‘background’ if the action is in a background fiber, regardless of whether the Tarantool server was started in the background.

box.session.type() is useful for an on_replace() trigger on a replica – the value will be ‘applier’ if and only if the trigger was activated because of a request that was done on the master.
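
A minimal sketch of that idea follows. skip_replicated is a hypothetical helper that takes the session API as a parameter, so the same code can be exercised outside Tarantool with a stub; inside Tarantool one would pass box.session.

```lua
-- Hypothetical helper: wraps a handler so that it runs only for changes
-- that did not arrive through replication.
-- session: a table with a type() function (box.session in Tarantool).
function skip_replicated(session, handler)
    return function(old_tuple, new_tuple)
        if session.type() == 'applier' then
            return  -- the request was done on the master; do nothing here
        end
        handler(old_tuple, new_tuple)
    end
end

-- Intended use inside Tarantool (not executed here; the space name T and
-- the function audit are placeholders):
-- box.space.T:on_replace(skip_replicated(box.session, audit))
```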

box.session.su()

box.session.su(user-name[, function-to-execute])¶

Change Tarantool’s current user – this is analogous to the Unix command su.

Or, if the function-to-execute option is specified, change Tarantool’s current user temporarily while executing the function – this is analogous to the Unix command sudo. If the user is changed temporarily:

box.session.user() ignores this change.
box.session.effective_user() shows this change.

Parameters:

user-name (`string`) – name of a target user
function-to-execute – a function object. Additional parameters may be passed to `box.session.su()`; they will be interpreted as parameters of `function-to-execute`.

Example 1

Change Tarantool’s current user to guest:

app:instance001> box.session.su('guest')
---
...

Example 2

Change Tarantool’s current user to temporary_user temporarily:

app:instance001> function get_current_user() return box.session.user() end
---
...

app:instance001> function get_effective_user() return box.session.effective_user() end
---
...

app:instance001> get_current_user()
---
- admin
...

app:instance001> box.session.su('temporary_user', get_current_user)
---
- admin
...

app:instance001> box.session.su('temporary_user', get_effective_user)
---
- temporary_user
...

app:instance001> box.session.su('temporary_user', function(suffix) return box.session.effective_user() .. suffix end, '-xxx')
---
- temporary_user-xxx
...

app:instance001> box.session.su('temporary_user', function(...) return box.session.user() end)
---
- admin
...

box.session.uid()

box.session.uid()¶

Return:	the user ID of the current user.
Rtype:	number

Every user has a unique name (seen with box.session.user()) and a unique ID (seen with box.session.uid()). The values are stored together in the _user space.

box.session.euid()

box.session.euid()¶

Return:	the effective user ID of the current user.

The system uses the effective user ID to determine the process’s permissions at any given moment. This is the same as box.session.uid(), except in two cases:

box.session.euid() is called within a function invoked by box.session.su(). In this case:
- box.session.euid() returns the ID of the changed user (the user who is specified by the user-name parameter of the box.session.su() function).
- box.session.uid() returns the ID of the original user (the user who calls the box.session.su() function).
box.session.euid() is called within a function specified with box.schema.func.create(function-name, {setuid= true}) and the binary protocol is in use. In this case:
- box.session.euid() returns the ID of the user who created function-name.
- box.session.uid() returns the ID of the user who calls function-name.

Rtype:	number

Example:

tarantool> box.session.su('admin')
---
...
tarantool> box.session.uid(), box.session.euid()
---
- 1
- 1
...
tarantool> function f() return {box.session.uid(),box.session.euid()} end
---
...
tarantool> box.session.su('guest', f)
---
- - 1
  - 0
...

box.session.storage

box.session.storage¶

A Lua table that can hold arbitrary unordered session-specific names and values, which will last until the session ends. For example, this table could be useful to store current tasks when working with a Tarantool queue manager.

Example:

tarantool> box.session.peer(box.session.id())
---
- 127.0.0.1:45129
...
tarantool> box.session.storage.random_memorandum = "Don't forget the eggs"
---
...
tarantool> box.session.storage.radius_of_mars = 3396
---
...
tarantool> m = ''
---
...
tarantool> for k, v in pairs(box.session.storage) do
         >   m = m .. k .. '='.. v .. ' '
         > end
---
...
tarantool> m
---
- 'radius_of_mars=3396 random_memorandum=Don''t forget the eggs '
...

box.session.on_connect()

box.session.on_connect([trigger-function[, old-trigger-function]])¶

Define a trigger for execution when a new session is created due to an event such as console.connect. The trigger function will be the first thing executed after a new session is created. If the trigger execution fails and raises an error, the error is sent to the client and the connection is closed.

Parameters:

trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

Details about trigger characteristics are in the triggers section.

Example:

tarantool> x = 0
tarantool> function f ()
         >   x = x + 1
         > end
tarantool> box.session.on_connect(f)

Warning

If a trigger always results in an error, it may become impossible to connect to a server to reset it.
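
One defensive pattern, sketched here under the assumption that the trigger is not meant to reject connections, is to wrap the trigger body in pcall so that a bug in it cannot lock everyone out. guarded_trigger is a hypothetical helper, not part of Tarantool.

```lua
-- Hypothetical helper: wraps a trigger body in pcall so that an error in it
-- is passed to on_error instead of being raised to the caller.
function guarded_trigger(body, on_error)
    return function(...)
        local ok, err = pcall(body, ...)
        if not ok then
            on_error(err)
        end
    end
end

-- Intended use inside Tarantool (not executed here):
-- local log = require('log')
-- box.session.on_connect(guarded_trigger(f, function(err)
--     log.error('%s', tostring(err))
-- end))
```

Because the error is swallowed, the client never sees it and the connection stays open, so this wrapper is unsuitable for triggers that raise errors on purpose to refuse a session.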

box.session.on_disconnect()

box.session.on_disconnect([trigger-function[, old-trigger-function]])¶

Define a trigger for execution after a client has disconnected. If the trigger function causes an error, the error is logged but otherwise is ignored. The trigger is invoked while the session associated with the client still exists and can access session properties, such as box.session.id().

Since version 1.10, the trigger function is invoked immediately after the disconnect, even if requests that were made during the session have not finished.

Parameters:

trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

Details about trigger characteristics are in the triggers section.

Example #1

tarantool> x = 0
tarantool> function f ()
         >   x = x + 1
         > end
tarantool> box.session.on_disconnect(f)

Example #2

After the following series of requests, a Tarantool instance will write a message using the log module whenever any user connects or disconnects.

function log_connect ()
  local log = require('log')
  local m = 'Connection. user=' .. box.session.user() .. ' id=' .. box.session.id()
  log.info(m)
end

function log_disconnect ()
  local log = require('log')
  local m = 'Disconnection. user=' .. box.session.user() .. ' id=' .. box.session.id()
  log.info(m)
end

box.session.on_connect(log_connect)
box.session.on_disconnect(log_disconnect)

Here is what might appear in the log file in a typical installation:

2014-12-15 13:21:34.444 [11360] main/103/iproto I>
    Connection. user=guest id=3
2014-12-15 13:22:19.289 [11360] main/103/iproto I>
    Disconnection. user=guest id=3

box.session.on_auth()

box.session.on_auth([trigger-function[, old-trigger-function]])¶

Define a trigger for execution during authentication.

The on_auth trigger function is invoked in these circumstances:

The console.connect function includes an authentication check for all users except ‘guest’. For this case, the on_auth trigger function is invoked after the on_connect trigger function, if and only if the connection has succeeded so far.
The binary protocol has a separate authentication packet. For this case, connection and authentication are considered to be separate steps.

Unlike other trigger types, on_auth trigger functions are invoked before the event. Therefore a trigger function like function auth_function () v = box.session.user(); end will set v to “guest”, the user name before the authentication is done. To get the user name after the authentication is done, use the special syntax: function auth_function (user_name) v = user_name; end

If the trigger fails by raising an error, the error is sent to the client and the connection is closed.

Parameters:

trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

Details about trigger characteristics are in the triggers section.

Example 1

tarantool> x = 0
tarantool> function f ()
         >   x = x + 1
         > end
tarantool> box.session.on_auth(f)

Example 2

This is a more complex example, with two server instances.

The first server instance listens on port 3301; its default user name is ‘admin’. There are three on_auth triggers:

The first trigger has a function with no arguments; it can only look at box.session.user().
The second trigger has a function with a user_name argument; it can look at both box.session.user() and user_name.
The third trigger has a function with a user_name argument and a status argument; it can look at all three: box.session.user(), user_name, and status.

The second server instance will connect with console.connect, and then will cause a display of the variables that were set by the trigger functions.

-- On the first server instance, which listens on port 3301
box.cfg{listen=3301}
function function1()
  print('function 1, box.session.user()='..box.session.user())
end
function function2(user_name)
  print('function 2, box.session.user()='..box.session.user())
  print('function 2, user_name='..user_name)
end
function function3(user_name, status)
  print('function 3, box.session.user()='..box.session.user())
  print('function 3, user_name='..user_name)
  if status == true then
    print('function 3, status = true, authorization succeeded')
  end
end
box.session.on_auth(function1)
box.session.on_auth(function2)
box.session.on_auth(function3)
box.schema.user.passwd('admin')

-- On the second server instance, that connects to port 3301
console = require('console')
console.connect('admin:admin@localhost:3301')

The result looks like this:

function 3, box.session.user()=guest
function 3, user_name=admin
function 3, status = true, authorization succeeded
function 2, box.session.user()=guest
function 2, user_name=admin
function 1, box.session.user()=guest

box.session.on_access_denied()

box.session.on_access_denied([trigger-function[, old-trigger-function]])¶

Define a trigger for reacting to user’s attempts to execute actions that are not within the user’s privileges.

Parameters:

trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

Details about trigger characteristics are in the triggers section.

Example:

For example, a server administrator can log restricted actions like this:

tarantool> log = require('log')
---
...
tarantool> function on_access_denied(op, type, name)
         > log.warn('User %s tried to %s %s %s without required privileges', box.session.user(), op, type, name)
         > end
---
...
tarantool> box.session.on_access_denied(on_access_denied)
---
- 'function: 0x011b41af38'
...
tarantool> function test() print('you shall not pass') end
---
...
tarantool> box.schema.func.create('test')
---
...

Then, when a user without the required privileges tries to call test() and gets the error, the server will execute this trigger and write to the log: “User *user_name* tried to Execute function test without required privileges”.
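
The same trigger signature (op, type, name) can feed something other than a log line. The sketch below counts refused operations per user; make_denied_counter is a hypothetical helper, and the session API is passed in as a parameter so that the code also runs outside Tarantool with a stub.

```lua
-- Hypothetical helper: builds an on_access_denied trigger that counts
-- refused operations in the given table, keyed by user and object.
-- session: a table with a user() function (box.session in Tarantool).
function make_denied_counter(session, counts)
    return function(op, object_type, object_name)
        local key = string.format('%s:%s:%s:%s',
            session.user(), op, object_type, object_name)
        counts[key] = (counts[key] or 0) + 1
    end
end

-- Intended use inside Tarantool (not executed here):
-- denied = {}
-- box.session.on_access_denied(make_denied_counter(box.session, denied))
```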

box.session.push()

box.session.push(message[, sync])¶

Deprecated since 3.0.0.

Generate an out-of-band message. By “out-of-band” we mean an extra message which supplements what is passed in a network via the usual channels. Although box.session.push() can be called at any time, in practice it is used with networks that are set up with module net.box, and it is invoked by the server (on the “remote database system” to use our terminology for net.box), and the client has options for getting such messages.

This function returns an error if the session is disconnected.

Parameters:

message (any-Lua-type) – what to send
sync (int) – an optional argument to indicate what the session is, as taken from an earlier call to box.session.sync(). If it is omitted, the default is the current box.session.sync() value. In Tarantool version 2.4.2, sync is deprecated and its use will cause a warning. Since version 2.5.1, its use will cause an error.

Rtype:

{nil, error} or true:

If the result is an error, then the first part of the return is nil and the second part is the error object.
If the result is not an error, then the return is the boolean value true.
When the return is true, the message has gone to the network buffer as a packet with a different header code so the client can distinguish from an ordinary Okay response.

The server’s sole job is to call box.session.push(); there is no automatic mechanism for showing that the message was received.

The client’s job is to check for such messages after it sends something to the server. The major client methods – conn:call, conn:eval, conn:select, conn:insert, conn:replace, conn:update, conn:upsert, conn:delete – may cause the server to send a message.

Situation 1: when the client calls synchronously with the default {is_async=false} option. There are two optional additional options: on_push=function-name, and on_push_ctx=function-argument. When the client receives an out-of-band message for the session, it invokes “function-name(function-argument)”. For example, with options {on_push=table.insert, on_push_ctx=messages}, the client will insert whatever it receives into a table named ‘messages’.

Situation 2: when the client calls asynchronously with the non-default {is_async=true} option. Here on_push and on_push_ctx are not allowed, but the messages can be seen by calling pairs() in a loop.

Situation 2 complication: pairs() is subject to timeout. So there is an optional argument = timeout per iteration. If timeout occurs before there is a new message or a final response, there is an error return. To check for an error one can use the first loop parameter (if the loop starts with “for i, message in future:pairs()” then the first loop parameter is i). If it is box.NULL then the second parameter (in our example, “message”) is the error object.
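
The error check described above might be written as follows. collect_pushes is a hypothetical helper: it takes the future, the per-iteration timeout, and the null sentinel as parameters (inside Tarantool the sentinel would be box.NULL), which also lets the loop be exercised with a stub future.

```lua
-- Hypothetical helper: gathers everything delivered by future:pairs().
-- Returns the collected messages and, if an iteration reported an error,
-- the error object as a second value.
function collect_pushes(future, timeout, null)
    local messages = {}
    for i, message in future:pairs(timeout) do
        if i == null then
            -- The first loop parameter is the null sentinel:
            -- the second one is the error object.
            return messages, message
        end
        messages[#messages + 1] = message
    end
    return messages, nil
end

-- Intended use inside Tarantool (not executed here):
-- future = conn:call('server_function', {}, {is_async = true})
-- messages, err = collect_pushes(future, 5, box.NULL)
```

Anything that pairs() yields with a non-null first parameter is appended to the list, so whatever the loop delivers besides pushed messages ends up there as well.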

Example:

-- Make two shells. On Shell#1 set up a "server", and
-- in it have a function that includes box.session.push:
box.cfg{listen=3301}
box.schema.user.grant('guest','read,write,execute','universe')
x = 0
fiber = require('fiber')
function server_function() x=x+1; fiber.sleep(1); box.session.push(x); end

-- On Shell#2 connect to this server as a "client" that
-- can handle Lua (such as another Tarantool server operating
-- as a client), and initialize a table where we'll get messages:
net_box = require('net.box')
conn = net_box.connect(3301)
messages_from_server = {}

-- On Shell#2 remotely call the server function and receive
-- a SYNCHRONOUS out-of-band message:
conn:call('server_function', {},
          {is_async = false,
           on_push = table.insert,
           on_push_ctx = messages_from_server})
messages_from_server
-- After a 1-second pause that is caused by the fiber.sleep()
-- request inside server_function, the result in the
--  messages_from_server table will be: 1. Like this:
-- tarantool> messages_from_server
-- ---
-- - - 1
-- ...
-- Good. That shows that box.session.push(x) worked,
-- because we know that x was 1.

-- On Shell#2 remotely call the same server function and
-- get an ASYNCHRONOUS out-of-band message. For this we cannot
-- use on_push and on_push_ctx options, but we can use pairs():
future = conn:call('server_function', {}, {is_async = true})
messages = {}
keys = {}
for i, message in future:pairs() do
    table.insert(messages, message) table.insert(keys, i) end
messages
future:wait_result(1000)
for i, message in future:pairs() do
    table.insert(messages, message) table.insert(keys, i) end
messages
-- There is no pause because conn:call does not wait for
-- server_function to finish. The first time that we go through
-- the pairs() loop, we see the messages table is empty. Like this:
-- tarantool> messages
-- ---
-- - - 2
--   - []
-- ...
-- That is okay because the server hasn't yet called
-- box.session.push(). The second time that we go through
-- the pairs() loop, we see the value of x at the time of
-- the second call to box.session.push(). Like this:
-- tarantool> messages
-- ---
-- - - 2
--   - &0 []
--   - 2
--   - *0
-- ...
-- Good. That shows that the message was asynchronous, and
-- that box.session.push() did its job.

Submodule box.slab

The box.slab submodule provides access to slab allocator statistics. The slab allocator is the main allocator used to store tuples. This can be used to monitor the total memory usage and memory fragmentation.

Below is a list of all box.slab functions.

Name	Use
box.runtime.info()	Show a memory usage report for Lua runtime
box.slab.info()	Show an aggregated memory usage report for slab allocator
box.slab.stats()	Show a detailed memory usage report for slab allocator

box.runtime.info()

box.runtime.info()¶

Show runtime memory usage report in bytes.

The runtime memory encompasses internal Lua memory as well as the runtime arena. The Lua memory stores Lua objects. The runtime arena stores Tarantool-specific objects – for example, runtime tuples, network buffers and other objects associated with the application server subsystem.

Return:	`lua` is the size of the Lua heap that is controlled by the Lua garbage collector. `maxalloc` is the maximum size of the runtime memory. `used` is the current number of bytes used by the runtime memory.
Rtype:	table

Example:

tarantool> box.runtime.info()
---
- lua: 913710
  maxalloc: 4398046510080
  used: 12582912
...
tarantool> box.runtime.info().used
---
- 12582912
...

box.slab.info()

box.slab.info()¶

Show an aggregated memory usage report in bytes for the slab allocator. This report is useful for assessing out-of-memory risks.

box.slab.info gives a few ratios:

items_used_ratio
arena_used_ratio
quota_used_ratio

Here are two possible cases for monitoring memtx memory usage:

Case 1: 0.5 < items_used_ratio < 0.9

Apparently your memory is highly fragmented. Check how many slab classes you have by looking at box.slab.stats() and counting the number of different classes. If there are many slab classes (more than a few dozen), you may run out of memory even though memory utilization is not high. While each slab may have few items used, whenever a tuple of a size different from any existing slab class size is allocated, Tarantool may need to get a new slab from the slab arena, and since the arena has few empty slabs left, it will attempt to increase its quota usage, which, in turn, may end up with an out-of-memory error due to the low remaining quota.

Case 2: items_used_ratio > 0.9

You are running out of memory. All memory utilization indicators are high. Your memory is not fragmented, but there are few reserves left on each slab allocator level. You should consider increasing Tarantool’s memory limit (box.cfg.memtx_memory).

To sum up: your main out-of-memory indicator is quota_used_ratio. However, there are lots of perfectly stable setups with a high quota_used_ratio, so you only need to pay attention to it when both arena_used_ratio and items_used_ratio are also high.

Return:

quota_size - memory limit for slab allocator (as configured in the memtx_memory parameter, the default is 2^28 bytes = 268,435,456 bytes)
quota_used - used by slab allocator
items_size - allocated only for tuples
items_used - used only for tuples
arena_size - allocated for both tuples and indexes
arena_used - used for both tuples and indexes
items_used_ratio = items_used / items_size
quota_used_ratio = quota_used / quota_size
arena_used_ratio = arena_used / arena_size

Rtype:

table

Example:

tarantool> box.slab.info()
---
- items_size: 228128
  items_used_ratio: 1.8%
  quota_size: 1073741824
  quota_used_ratio: 0.8%
  arena_used_ratio: 43.2%
  items_used: 4208
  quota_used: 8388608
  arena_size: 2325176
  arena_used: 1003632
...

tarantool> box.slab.info().arena_used
---
- 1003632
...
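
Because the ratio fields in the sample report are formatted as percentage strings, a monitoring script may prefer to recompute them from the raw byte counts. The sketch below does that and applies the rule of thumb from the summary above. memtx_pressure is a hypothetical helper and the 0.9 threshold is an assumption, not an official limit.

```lua
-- Hypothetical helper: recomputes the three ratios as plain numbers and
-- raises an alarm only when all of them exceed the threshold.
-- info: a table shaped like the result of box.slab.info().
function memtx_pressure(info, threshold)
    threshold = threshold or 0.9  -- assumed cut-off for "high"
    local ratios = {
        items = info.items_used / info.items_size,
        arena = info.arena_used / info.arena_size,
        quota = info.quota_used / info.quota_size,
    }
    ratios.alarm = ratios.quota > threshold
        and ratios.arena > threshold
        and ratios.items > threshold
    return ratios
end

-- Values copied from the sample report above.
local sample = {
    items_size = 228128, items_used = 4208,
    quota_size = 1073741824, quota_used = 8388608,
    arena_size = 2325176, arena_used = 1003632,
}
local r = memtx_pressure(sample)
print(string.format('items=%.1f%% arena=%.1f%% quota=%.1f%% alarm=%s',
    r.items * 100, r.arena * 100, r.quota * 100, tostring(r.alarm)))
-- prints: items=1.8% arena=43.2% quota=0.8% alarm=false
```

Inside Tarantool the call would be memtx_pressure(box.slab.info()).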

box.slab.stats()

box.slab.stats()¶

Show a detailed memory usage report (in bytes) for the slab allocator. The report is broken down into groups by data item size as well as by slab size (64-byte, 136-byte, etc). The report includes the memory allocated for storing both tuples and indexes.

Return:

mem_free is the allocated, but currently unused memory;

mem_used is the memory used for storing data items (tuples and indexes);

item_count is the number of stored items;

item_size is the size of each data item;

slab_count is the number of slabs allocated;

slab_size is the size of each allocated slab.

Rtype:
table

Example:

Here is a sample report for the first group:
tarantool> box.slab.stats()[1]
---
- mem_free: 16232
  mem_used: 48
  item_count: 2
  item_size: 24
  slab_count: 1
  slab_size: 16384
...
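This report is saying that there are 2 data items (item_count = 2) of 24 bytes each (item_size = 24) stored in one slab (slab_count = 1), so mem_used = 2 * 24 = 48 bytes. Also, slab_size is 16384 bytes, of which 16232 bytes are free (mem_free). The remaining 16384 - 48 - 16232 = 104 bytes appear as neither used nor free in this report; presumably they are slab bookkeeping overhead.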

A complete report would show memory usage statistics for all groups:
tarantool> box.slab.stats()
---
- - mem_free: 16232
    mem_used: 48
    item_count: 2
    item_size: 24
    slab_count: 1
    slab_size: 16384
  - mem_free: 15720
    mem_used: 560
    item_count: 14
    item_size: 40
    slab_count: 1
    slab_size: 16384
  <...>
  - mem_free: 32472
    mem_used: 192
    item_count: 1
    item_size: 192
    slab_count: 1
    slab_size: 32768
  - mem_free: 1097624
    mem_used: 999424
    item_count: 61
    item_size: 16384
    slab_count: 1
    slab_size: 2097152
  ...

The total mem_used for all groups in this report equals arena_used in box.slab.info() report.
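
The arithmetic behind the report can be checked with a short plain-Lua sketch. summarize_slab_stats is a hypothetical helper; it takes an array shaped like the result of box.slab.stats(), derives item_count * item_size for each group, and totals the mem_used column. Only the four groups visible in the sample are used here, so the total is not expected to match arena_used from the other example.

```lua
-- Hypothetical helper: derives per-group figures and the mem_used total.
function summarize_slab_stats(groups)
    local rows, total_used = {}, 0
    for i, g in ipairs(groups) do
        rows[i] = {
            item_size = g.item_size,
            expected_used = g.item_count * g.item_size,
            -- bytes of the slabs reported as neither mem_used nor mem_free
            unaccounted = g.slab_count * g.slab_size - g.mem_used - g.mem_free,
        }
        total_used = total_used + g.mem_used
    end
    return rows, total_used
end

-- The four groups visible in the sample report (elided groups are omitted).
local visible = {
    {mem_free = 16232, mem_used = 48, item_count = 2, item_size = 24,
     slab_count = 1, slab_size = 16384},
    {mem_free = 15720, mem_used = 560, item_count = 14, item_size = 40,
     slab_count = 1, slab_size = 16384},
    {mem_free = 32472, mem_used = 192, item_count = 1, item_size = 192,
     slab_count = 1, slab_size = 32768},
    {mem_free = 1097624, mem_used = 999424, item_count = 61, item_size = 16384,
     slab_count = 1, slab_size = 2097152},
}
local rows, total = summarize_slab_stats(visible)
for _, row in ipairs(rows) do
    print(row.item_size, row.expected_used, row.unaccounted)
end
print('total mem_used of visible groups:', total)   -- 1000224
```

For every visible group, expected_used equals the reported mem_used, and 104 bytes per slab are left unaccounted for.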

Submodule box.space

CRUD operations in Tarantool are implemented by the box.space submodule. It has the data-manipulation functions select, insert, replace, update, upsert, delete, get, put. It also has members, such as id, and whether or not a space is enabled.

Below is a list of all box.space functions and members.

Name	Use
box.schema.space.create()	Create a space
space_object:alter()	Alter a space
space_object:auto_increment()	Generate key + Insert a tuple
space_object:bsize()	Get count of bytes
space_object:count()	Get count of tuples
space_object:create_index()	Create an index
space_object:delete()	Delete a tuple
space_object:drop()	Destroy a space
space_object:format()	Declare field names and types
space_object:frommap()	Convert from map to tuple or table
space_object:get()	Select a tuple
space_object:insert()	Insert a tuple
space_object:len()	Get count of tuples
space_object:on_replace()	Create a replace trigger with a function that cannot change the tuple
space_object:before_replace()	Create a replace trigger with a function that can change the tuple
space_object:pairs()	Prepare for iterating
space_object:put()	Insert or replace a tuple
space_object:rename()	Rename a space
space_object:replace() / put()	Insert or replace a tuple
space_object:run_triggers()	Enable/disable a replace trigger
space_object:select()	Select one or more tuples
space_object:stat()	Get statistics on memory usage
space_object:truncate()	Delete all tuples
space_object:update()	Update a tuple
Upgrading space schema	Upgrade the space format and tuples
space_object:upsert()	Update or insert a tuple
space_object extensions	Any function / method that any user wants to add
box.space.create_check_constraint()	Create a check constraint
space_object.enabled	Flag, true if space is enabled
space_object.field_count	Required number of fields
space_object.id	Numeric identifier of space
space_object.index	Container of space’s indexes
box.space._cluster	(Metadata) List of replica sets
box.space._func	(Metadata) List of function tuples
box.space._index	(Metadata) List of indexes
box.space._vindex	(Metadata) List of indexes accessible for the current user
box.space._priv	(Metadata) List of privileges
box.space._vpriv	(Metadata) List of privileges accessible for the current user
box.space._schema	(Metadata) List of schemas
box.space._sequence	(Metadata) List of sequences
box.space._sequence_data	(Metadata) List of sequences
box.space._space	(Metadata) List of spaces
box.space._vspace	(Metadata) List of spaces accessible for the current user
box.space._space_sequence	(Metadata) List of connections between spaces and sequences
box.space._vspace_sequence	(Metadata) List of connections between spaces and sequences accessible for the current user
box.space._user	(Metadata) List of users
box.space._vuser	(Metadata) List of users accessible for the current user
box.space._ck_constraint	(Metadata) List of check constraints
box.space._collation	(Metadata) List of collations
box.space._vcollation	(Metadata) List of collations accessible for the current user
System space views	(Metadata) Spaces whose names begin with _v
box.space._session_settings	(Metadata) List of settings affecting behavior of the current session

To see examples, visit the how-to guide on CRUD operations.

space_object:alter()

object space_object

space_object:alter(options)

Since version 2.5.2. Alter an existing space. This method changes certain space parameters.

Parameters:	options (`table`) – the space options such as `field_count`, `user`, `format`, `name`, and others. The full list of these options with descriptions is provided in box.schema.space.create()
Return:	nothing on success; an error on failure

Example:

tarantool> s = box.schema.create_space('tester')
---
...
tarantool> format = {{name = 'field1', type = 'unsigned'}}
---
...
tarantool> s:alter({name = 'tester1', format = format})
---
...
tarantool> s.name
---
- tester1
...
tarantool> s:format()
---
- [{'name': 'field1', 'type': 'unsigned'}]
...

space_object:auto_increment()

object space_object

space_object:auto_increment(tuple)

Insert a new tuple using an auto-increment primary key. The space specified by space_object must have an ‘unsigned’ or ‘integer’ or ‘number’ primary key index of type TREE. The primary-key field will be incremented before the insert.

Since version 1.7.5 this method is deprecated – it is better to use a sequence.

Parameters:

space_object (`space_object`) – an object reference
tuple (`table/tuple`) – tuple’s fields, other than the primary-key field

Return:	the inserted tuple.
Rtype:	tuple

Complexity factors: Index size, Index type, Number of indexes accessed, WAL settings.

Possible errors:

index has wrong type;
primary-key indexed field is not a number.

Example:

tarantool> box.space.tester:auto_increment{'Fld#1', 'Fld#2'}
---
- [1, 'Fld#1', 'Fld#2']
...
tarantool> box.space.tester:auto_increment{'Fld#3'}
---
- [2, 'Fld#3']
...
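
Because auto_increment() is deprecated in favor of sequences, the same effect can be sketched with a sequence attached to the primary index. This is a minimal sketch, not a drop-in migration: the space name seq_demo and the sequence name seq_demo_id are hypothetical, and the script assumes it runs inside Tarantool with enough privileges to create spaces and sequences.

```lua
-- Sketch: a sequence-backed primary key instead of auto_increment().
-- 'seq_demo' and 'seq_demo_id' are hypothetical names used for illustration.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('seq_demo', {if_not_exists = true})
box.schema.sequence.create('seq_demo_id', {if_not_exists = true})
-- Attach the sequence to the primary index (default part: field 1, unsigned).
s:create_index('pk', {sequence = 'seq_demo_id', if_not_exists = true})

-- Passing box.NULL in the key position asks the sequence for the next value.
local t1 = s:insert{box.NULL, 'Fld#1', 'Fld#2'}
local t2 = s:insert{box.NULL, 'Fld#3'}
print(t1[1], t2[1])  -- two consecutive key values
```

The starting value depends on the sequence's current state, so only the fact that the keys are consecutive should be relied on here.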

space_object:bsize()

object space_object

space_object:bsize()

Parameters:	space_object (`space_object`) – an object reference
Return:	Number of bytes in the space. This number, which is stored in Tarantool’s internal memory, represents the total number of bytes in all tuples, not including index keys. For a measure of index size, see index_object:bsize().

Example:

tarantool> box.space.tester:bsize()
---
- 22
...

space_object:count()

object space_object

space_object:count([key][, iterator])

Return the number of tuples. Compared with len(), this method is slower because count() scans the entire space to count the tuples.

Parameters:

space_object (`space_object`) – an object reference
key (`scalar/table`) – primary-key field values, must be passed as a Lua table if key is multi-part
iterator – comparison method

Return:	Number of tuples.

Possible errors:

ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Example:

tarantool> box.space.tester:count(2, {iterator='GE'})
---
- 1
...
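
To make the difference between count() with and without a key concrete, here is a small sketch. It assumes a running Tarantool instance; the space name count_demo is hypothetical and the space is truncated first so the numbers are predictable.

```lua
-- Sketch: count() with and without a key, next to len().
-- 'count_demo' is a hypothetical space name.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('count_demo', {if_not_exists = true})
s:create_index('pk', {if_not_exists = true})
s:truncate()
for id = 1, 3 do
    s:insert{id}
end

local total_len   = s:len()                        -- all tuples: 3
local total_count = s:count()                      -- all tuples: 3
local ge_two      = s:count(2, {iterator = 'GE'})  -- keys 2 and 3: 2
print(total_len, total_count, ge_two)
```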

space_object:delete()

object space_object

space_object:delete(key)

Delete a tuple identified by the primary key.

Parameters:

space_object (`space_object`) – an object reference
key (`scalar/table`) – primary-key field values, must be passed as a Lua table if key is multi-part

Return:	the deleted tuple
Rtype:	tuple

Possible errors:

ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type

Note regarding storage engine: vinyl will return nil, rather than the deleted tuple.

Example:

tarantool> box.space.tester:delete(1)
---
- [1, 'My first tuple']
...
tarantool> box.space.tester:delete(1)
---
...
tarantool> box.space.tester:delete('a')
---
- error: 'Supplied key type of part 0 does not match index part type:
  expected unsigned'
...

For more usage scenarios and typical errors see Example: using data operations further in this section.

space_object:drop()

object space_object

space_object:drop()

Drop a space. The method is performed in the background and doesn’t block subsequent requests.

Parameters:	space_object (`space_object`) – an object reference
Return:	nil

Possible errors: space_object does not exist.

Complexity factors: Index size, Index type, Number of indexes accessed, WAL settings.

Example:

box.space.space_that_does_not_exist:drop()

space_object:format()

object space_object

space_object:format([format-clause])

Declare field names and types.

Parameters:

space_object (`space_object`) – an object reference
format-clause (`table`) – a list of field names and types

Return:	`nil`, unless format-clause is omitted

Possible errors:

space_object does not exist
field names are duplicated
type is not legal

Note

If you need to make a schema migration, see section Migrations.

Ordinarily Tarantool allows unnamed untyped fields. But with format users can, for example, document that the Nth field is the surname field and must contain strings. It is also possible to specify a format clause in box.schema.space.create().

The format clause contains, for each field, a definition within braces: {name='...',type='...'[,is_nullable=...]}, where:

The name value may be any string, provided that two fields do not have the same name.
The type value may be any of the allowed types: any | unsigned | string | integer | number | varbinary | boolean | double | decimal | uuid | array | map | scalar. To create an index on a field, use only the types that can be indexed.
(Optional) The is_nullable boolean value specifies whether nil can be used as a field value. See also: key_part.is_nullable.
(Optional) The collation string value specifies the collation used to compare field values. See also: key_part.collation.
(Optional) The constraint table specifies the constraints that the field value must satisfy.
(Optional) The foreign_key table specifies the foreign keys for the field.
(Optional) The default value specifies the explicit default value for the field or the argument of the default function if default_func is specified.
(Optional) The default_func string value specifies the name of the field’s default function. To pass the default function’s argument, add the default parameter.

It is not legal for tuples to contain values that have the wrong type. The example below will cause an error:

--This example will cause an error.
box.space.tester:format({{' ',type='number'}})
box.space.tester:insert{'string-which-is-not-a-number'}

It is not legal for tuples to contain null values if is_nullable=false, which is the default. The example below will cause an error:

--This example will cause an error.
box.space.tester:format({{' ',type='number',is_nullable=false}})
box.space.tester:insert{nil,2}

It is legal for tuples to have more fields than are described by a format clause. The way to constrain the number of fields is to specify a space’s field_count member.

It is legal for tuples to have fewer fields than are described by a format clause, if the omitted trailing fields are described with is_nullable=true. For example, the request below will not cause a format-related error:

box.space.tester:format({{'a',type='number'},{'b',type='number',is_nullable=true}})
box.space.tester:insert{2}

It is legal to use format on a space that already has a format, thus replacing any previous definitions, provided that there is no conflict with existing data or index definitions.

It is legal to use format to change the is_nullable flag. The example below will not cause an error – and will not cause rebuilding of the space.

box.space.tester:format({{' ',type='scalar',is_nullable=false}})
box.space.tester:format({{' ',type='scalar',is_nullable=true}})

But going the other way and changing is_nullable from true to false might cause rebuilding and might cause an error if there are existing tuples with nulls.

Example:

box.space.tester:format({{name='surname',type='string'},{name='IDX',type='array'}})
box.space.tester:format({{name='surname',type='string',is_nullable=true}})

There are legal variations of the format clause:

omitting both ‘name=’ and ‘type=’,
omitting ‘type=’ alone,
adding extra braces.

The following examples show all the variations, first for one field named ‘x’, second for two fields named ‘x’ and ‘y’.

box.space.tester:format({{name='x',type='scalar'}})
box.space.tester:format({{name='x',type='scalar'},{name='y',type='unsigned'}})

box.space.tester:format({{'x'}})
box.space.tester:format({{'x'},{'y'}})

-- Omitting 'type=' alone
box.space.tester:format({{name='x'}})
box.space.tester:format({{name='x'},{name='y'}})

box.space.tester:format({{'x',type='scalar'}})
box.space.tester:format({{'x',type='scalar'},{'y',type='unsigned'}})

box.space.tester:format({{'x','scalar'}})
box.space.tester:format({{'x','scalar'},{'y','unsigned'}})

The following example shows how to create a space, format it with all possible types, and insert into it.

tarantool> box.schema.space.create('t')
---
- engine: memtx
  before_replace: 'function: 0x4019c488'
  on_replace: 'function: 0x4019c460'
  ck_constraint: []
  field_count: 0
  temporary: false
  index: []
  is_local: false
  enabled: false
  name: t
  id: 534
- created
...
tarantool> ffi = require('ffi')
---
...
tarantool> decimal = require('decimal')
---
...
tarantool> uuid = require('uuid')
---
...
tarantool> box.space.t:format({{name = '1', type = 'any'},
         >                     {name = '2', type = 'unsigned'},
         >                     {name = '3', type = 'string'},
         >                     {name = '4', type = 'number'},
         >                     {name = '5', type = 'double'},
         >                     {name = '6', type = 'integer'},
         >                     {name = '7', type = 'boolean'},
         >                     {name = '8', type = 'decimal'},
         >                     {name = '9', type = 'uuid'},
         >                     {name = 'a', type = 'scalar'},
         >                     {name = 'b', type = 'array'},
         >                     {name = 'c', type = 'map'}})
---
...
tarantool> box.space.t:create_index('i',{parts={2, type = 'unsigned'}})
---
- unique: true
  parts:
  - type: unsigned
    is_nullable: false
    fieldno: 2
  id: 0
  space_id: 534
  type: TREE
  name: i
...
tarantool> box.space.t:insert{{'a'}, -- any
         >                    1, -- unsigned
         >                    'W?', -- string
         >                    5.5, -- number
         >                    ffi.cast('double', 1), -- double
         >                    -0, -- integer
         >                    true, -- boolean
         >                    decimal.new(1.2), -- decimal
         >                    uuid.new(), -- uuid
         >                    true, -- scalar
         >                    {{'a'}}, -- array
         >                    {val=1}} -- map
---
- [['a'], 1, 'W?', 5.5, 1, 0, true, 1.2, 1f41e7b8-3191-483d-b46e-1aa6a4b14557, true, [['a']], {'val': 1}]
...

Names specified with the format clause can be used in space_object:get(), in space_object:create_index(), in tuple_object[field-name], and in tuple_object[field-path].

If the format clause is omitted, then the returned value is the table that was used in a previous space_object:format(format-clause) invocation. For example, after box.space.tester:format({{'x','scalar'}}), box.space.tester:format() will return [{'name': 'x', 'type': 'scalar'}].

Formatting or reformatting a large space will cause occasional yields so that other requests will not be blocked. If the other requests cause an illegal situation such as a field value of the wrong type, the formatting or reformatting will fail.

Note regarding storage engine: vinyl supports formatting of non-empty spaces. Primary index definition cannot be formatted.

space_object:frommap()

object space_object

space_object:frommap(map[, option])

Convert a map to a tuple instance or to a table. The map must consist of “field name = value” pairs. The field names and the value types must match names and types stated previously for the space, via space_object:format().

Parameters:

space_object (`space_object`) – an object reference
map (`field-value-pairs`) – a series of “field = value” pairs, in any order
option (`boolean`) – the only legal option is `{table = true|false}`; if the option is omitted or if `{table = false}`, then the return type will be ‘cdata’ (i.e. tuple); if `{table = true}`, then the return type will be ‘table’

Return:	a tuple instance or table.
Rtype:	tuple or table

Possible errors: space_object does not exist or has no format; “unknown field”.

Example:

-- Create a format with two fields named 'a' and 'b'.
-- Create a space with that format.
-- Create a tuple based on a map consistent with that space.
-- Create a table based on a map consistent with that space.
tarantool> format1 = {{name='a',type='unsigned'},{name='b',type='scalar'}}
---
...
tarantool> s = box.schema.create_space('test', {format = format1})
---
...
tarantool> s:frommap({b = 'x', a = 123456})
---
- [123456, 'x']
...
tarantool> s:frommap({b = 'x', a = 123456}, {table = true})
---
- - 123456
  - x
...

space_object:get()

object space_object

space_object:get(key)

Search for a tuple in the given space.

Parameters:

space_object (`space_object`) – an object reference
key (`scalar/table`) – value to be matched against the index key, which may be multi-part

Return:	the tuple whose index key matches `key`, or `nil`.
Rtype:	tuple

Possible errors:

space_object does not exist.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type, Number of indexes accessed, WAL settings.

The box.space...select function returns a set of tuples as a Lua table; the box.space...get function returns at most a single tuple. The first tuple of a select result can be taken by appending [1]. Therefore box.space.tester:get{1} has the same effect as box.space.tester:select{1}[1], if exactly one tuple is found.

Example:

box.space.tester:get{1}

Using field names instead of field numbers: get() can use field names described by the optional space_object:format() clause. This is true because the object returned by get() can be used with most of the features described in the Submodule box.tuple description, including tuple_object[field-name].

For example, we can format the tester space with a field named x and use the name x in the index definition:

box.space.tester:format({{name='x',type='scalar'}})
box.space.tester:create_index('I',{parts={'x'}})

Then, if get or select retrieves a single tuple, we can reference the field ‘x’ in the tuple by its name:

box.space.tester:get{1}['x']
box.space.tester:select{1}[1]['x']

space_object:insert()

object space_object

space_object:insert(tuple)

Insert a tuple into a space.

Parameters:

space_object (`space_object`) – an object reference
tuple (`tuple/table`) – tuple to be inserted

Return:	the inserted tuple
Rtype:	tuple

Possible errors:

ER_TUPLE_FOUND if a tuple with the same unique-key value already exists.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Example:

tarantool> box.space.tester:insert{5000,'tuple number five thousand'}
---
- [5000, 'tuple number five thousand']
...

For more usage scenarios and typical errors see Example: using data operations further in this section.

space_object:len()

object space_object

space_object:len()

Return the number of tuples in the space. Compared with count(), this method is faster because len() does not scan the entire space to count the tuples.

Parameters:	space_object (`space_object`) – an object reference
Return:	Number of tuples in the space.

Possible errors:

ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Example:

tarantool> box.space.tester:len()
---
- 2
...

Note regarding storage engine: vinyl supports len() but the result may be approximate. If an exact result is necessary then use count() or pairs():length().

space_object:on_replace()

object space_object

space_object:on_replace([trigger-function[, old-trigger-function]])

Create a “replace trigger”. The trigger-function will be executed whenever a replace() or insert() or update() or upsert() or delete() happens to a tuple in <space-name>.

Parameters:

trigger-function (`function`) – function which will become the trigger function. It can accept up to four parameters: the old tuple, the new tuple, the space name, and the request type
old-trigger-function (`function`) – existing trigger function which will be replaced by `trigger-function`

Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

If it is necessary to know whether the trigger activation happened due to replication or on a specific connection type, the function can refer to box.session.type().

Details about trigger characteristics are in the triggers section.
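
As an illustration of creating and removing a replace trigger, here is a minimal sketch. It assumes the trigger function is called with the old tuple, the new tuple, the space name, and the request type, in that order; the space name on_replace_demo and the function name audit are hypothetical.

```lua
-- Sketch: record every change made to a space with on_replace().
-- 'on_replace_demo' and 'audit' are hypothetical names.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('on_replace_demo', {if_not_exists = true})
s:create_index('pk', {if_not_exists = true})
s:truncate()

local events = {}
local function audit(old_tuple, new_tuple, space_name, request_type)
    -- The trigger only observes the change; it cannot alter the tuple.
    table.insert(events, {
        space = space_name,
        request = request_type,
        had_old = old_tuple ~= nil,
        has_new = new_tuple ~= nil,
    })
end

s:on_replace(audit)        -- create the trigger
s:insert{1, 'first'}       -- recorded: no old tuple, a new tuple
s:delete{1}                -- recorded: an old tuple, no new tuple
s:on_replace(nil, audit)   -- delete the trigger
s:insert{2, 'second'}      -- not recorded, the trigger is gone
print(#events)
```

Keeping a reference to the function (here audit) matters, because the same function value is needed later to delete the trigger.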

space_object:before_replace()

object space_object

space_object:before_replace([trigger-function[, old-trigger-function]])

Create a “replace trigger”. The trigger-function will be executed whenever a replace() or insert() or update() or upsert() or delete() happens to a tuple in <space-name>.

Parameters:

trigger-function (`function`) – function which will become the trigger function; for the trigger function’s optional parameters see the description of on_replace
old-trigger-function (`function`) – existing trigger function which will be replaced by `trigger-function`

Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

If it is necessary to know whether the trigger activation happened due to replication or on a specific connection type, the function can refer to box.session.type().

Details about trigger characteristics are in the triggers section.
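
Unlike on_replace(), a before_replace() trigger can change the tuple that ends up in the space. The sketch below assumes that the tuple returned by the trigger function is the one that gets stored; the space name before_replace_demo and the function name upper_name are hypothetical.

```lua
-- Sketch: normalize a field before it is stored, using before_replace().
-- 'before_replace_demo' and 'upper_name' are hypothetical names.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('before_replace_demo', {if_not_exists = true})
s:create_index('pk', {if_not_exists = true})
s:truncate()

local function upper_name(old_tuple, new_tuple)
    if new_tuple == nil then
        -- A delete request has no new tuple; pass it through unchanged.
        return new_tuple
    end
    -- Return a modified copy of the new tuple with field 2 upper-cased.
    return new_tuple:update{{'=', 2, string.upper(new_tuple[2])}}
end

s:before_replace(upper_name)
s:insert{1, 'roxette'}
local stored = s:get{1}
s:before_replace(nil, upper_name)  -- delete the trigger again
print(stored[2])
```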

space_object:pairs()

object space_object

space_object:pairs([key[, iterator]])

Search for a tuple or a set of tuples in the given space, and allow iterating over one tuple at a time. To search by the specific index, use the index_object:pairs() method.

Parameters:

space_object (space_object) – an object reference
key (scalar/table) – value to be matched against the index key, which may be multi-part
iterator – the iterator type. The default iterator type is ‘EQ’
after – a tuple or the position of a tuple (tuple_pos) after which pairs starts the search. You can pass an empty string or box.NULL to this option to start the search from the first tuple.

Return:

The iterator, which can be used in a for/end loop or with totable().

Possible errors:

no such space
wrong type
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode
iterator position is invalid

Complexity factors: Index size, Index type.

For information about iterators’ internal structures, see the “Lua Functional library” documentation.

Examples:

Below are a few examples of using pairs with different parameters. To try out these examples, you need to bootstrap a Tarantool instance as described in Using data operations.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}
           bands:insert{5, 'Pink Floyd', 1965}
           bands:insert{6, 'The Rolling Stones', 1962}
           bands:insert{7, 'The Doors', 1965}
           bands:insert{8, 'Nirvana', 1987}
           bands:insert{9, 'Led Zeppelin', 1968}
           bands:insert{10, 'Queen', 1970}
---
...

-- Select all tuples by the primary index --
tarantool> for _, tuple in bands:pairs() do
               print(tuple)
           end
[1, 'Roxette', 1986]
[2, 'Scorpions', 1965]
[3, 'Ace of Base', 1987]
[4, 'The Beatles', 1960]
[5, 'Pink Floyd', 1965]
[6, 'The Rolling Stones', 1962]
[7, 'The Doors', 1965]
[8, 'Nirvana', 1987]
[9, 'Led Zeppelin', 1968]
[10, 'Queen', 1970]
---
...

-- Select all tuples whose primary key values are between 3 and 6 --
tarantool> for _, tuple in bands:pairs(3, {iterator = "GE"}) do
             if (tuple[1] > 6) then break end
             print(tuple)
           end
[3, 'Ace of Base', 1987]
[4, 'The Beatles', 1960]
[5, 'Pink Floyd', 1965]
[6, 'The Rolling Stones', 1962]
---
...

-- Select all tuples after the specified tuple --
tarantool> for _, tuple in bands:pairs({}, {after={7, 'The Doors', 1965}}) do
               print(tuple)
           end
[8, 'Nirvana', 1987]
[9, 'Led Zeppelin', 1968]
[10, 'Queen', 1970]
---
...

space_object:put()

See space_object:replace() / put().

space_object:rename()

object space_object

space_object:rename(space-name)

Rename a space.

Parameters:

space_object (`space_object`) – an object reference
space-name (`string`) – new name for space

Return:	nil

Possible errors: space_object does not exist.

Example:

tarantool> box.space.space55:rename('space56')
---
...
tarantool> box.space.space56:rename('space55')
---
...

space_object:replace() / put()

object space_object

space_object:replace(tuple)

space_object:put(tuple)

Insert a tuple into a space. If a tuple with the same primary key already exists, box.space...:replace() replaces the existing tuple with a new one. The syntax variants box.space...:replace() and box.space...:put() have the same effect; the latter is sometimes used to show that the effect is the converse of box.space...:get().

Parameters:

space_object (`space_object`) – an object reference
tuple (`table/tuple`) – tuple to be inserted

Return:	the inserted tuple.
Rtype:	tuple

Possible errors:

ER_TUPLE_FOUND if a different tuple with the same unique-key value already exists. (This will only happen if there is a unique secondary index.)
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type, Number of indexes accessed, WAL settings.

Example:

box.space.tester:replace{5000, 'tuple number five thousand'}

For more usage scenarios and typical errors see Example: using data operations further in this section.

space_object:run_triggers()

object space_object

space_object:run_triggers(true|false)

At the time that a trigger is defined, it is automatically enabled - that is, it will be executed. Replace triggers can be disabled with box.space.space-name:run_triggers(false) and re-enabled with box.space.space-name:run_triggers(true).

Return:	nil

Example:

The following series of requests will associate an existing function named F with an existing space named T, associate the function a second time with the same space (so it will be called twice), disable all triggers of T, and delete each trigger by replacing with nil.

tarantool> box.space.T:on_replace(F)
tarantool> box.space.T:on_replace(F)
tarantool> box.space.T:run_triggers(false)
tarantool> box.space.T:on_replace(nil, F)
tarantool> box.space.T:on_replace(nil, F)

space_object:select()

object space_object

space_object:select([key[, options]])

Search for a tuple or a set of tuples in the given space by the primary key. To search by the specific index, use the index_object:select() method.

Note

Note that this method doesn’t yield. For details, see Cooperative multitasking.

Parameters:

space_object (space_object) – an object reference.
key (scalar/table) – a value to be matched against the index key, which may be multi-part.
options (table/nil) –
none, any, or all of the same options that index_object:select() allows:
- options.iterator – the iterator type. The default iterator type is ‘EQ’.
- options.limit – the maximum number of tuples.
- options.offset – the number of tuples to skip.
- options.after – a tuple or the position of a tuple (tuple_pos) after which select starts the search. You can pass an empty string or box.NULL to this option to start the search from the first tuple.
- options.fetch_pos – if true, the select method returns the position of the last selected tuple as the second value.
  
  Note
  
  The after and fetch_pos options are supported for the TREE index only.

Return:

This function might return one or two values:

The tuples whose primary-key fields are equal to the fields of the passed key. If the number of passed fields is less than the number of fields in the primary key, then only the passed fields are compared, so select{1,2} matches a tuple whose primary key is {1,2,3}.
(Optionally) If options.fetch_pos is set to true, returns a base64-encoded string representing the position of the last selected tuple as the second value. If no tuples are fetched, returns nil.

Rtype:

array of tuples
(Optionally) string

Possible errors:

no such space
wrong type
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode
iterator position is invalid

Complexity factors: Index size, Index type.

Examples:

Below are a few examples of using select with different parameters. To try out these examples, you need to bootstrap a Tarantool instance as described in Using data operations.

-- Insert test data --
tarantool> bands:insert{1, 'Roxette', 1986}
           bands:insert{2, 'Scorpions', 1965}
           bands:insert{3, 'Ace of Base', 1987}
           bands:insert{4, 'The Beatles', 1960}
           bands:insert{5, 'Pink Floyd', 1965}
           bands:insert{6, 'The Rolling Stones', 1962}
           bands:insert{7, 'The Doors', 1965}
           bands:insert{8, 'Nirvana', 1987}
           bands:insert{9, 'Led Zeppelin', 1968}
           bands:insert{10, 'Queen', 1970}
---
...

-- Select a tuple by the specified primary key --
tarantool> bands:select(4)
---
- - [4, 'The Beatles', 1960]
...

-- Select maximum 3 tuples with the primary key value greater than 3 --
tarantool> bands:select({3}, {iterator='GT', limit = 3})
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...

-- Select maximum 3 tuples after the specified tuple --
tarantool> bands:select({}, {after = {4, 'The Beatles', 1960}, limit = 3})
---
- - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
  - [7, 'The Doors', 1965]
...

-- Select the first 3 tuples and fetch the last tuple's position --
tarantool> result, position = bands:select({}, {limit = 3, fetch_pos = true})
---
...
-- Then, pass this position as the 'after' parameter --
tarantool> bands:select({}, {limit = 3, after = position})
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...
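
The partial-key matching described under Return can be illustrated with a multi-part primary key. This is a sketch under stated assumptions: the space name partial_key_demo is hypothetical, and the primary index is a TREE index with three unsigned parts.

```lua
-- Sketch: select() with fewer key fields than the primary key has parts.
-- 'partial_key_demo' is a hypothetical space name.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('partial_key_demo', {if_not_exists = true})
s:create_index('pk', {
    parts = {{1, 'unsigned'}, {2, 'unsigned'}, {3, 'unsigned'}},
    if_not_exists = true,
})
s:truncate()
s:insert{1, 2, 3}
s:insert{1, 2, 4}
s:insert{1, 5, 6}

local by_two = s:select{1, 2}     -- compares two parts: {1,2,3} and {1,2,4}
local by_one = s:select{1}        -- compares one part: all three tuples
local exact  = s:select{1, 2, 3}  -- full key: exactly one tuple
print(#by_two, #by_one, #exact)
```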

Note

You can get a field from a tuple both by the field number and field name. See example: using field names instead of field numbers.

space_object:stat()

object space_object

space_object:stat()

Get statistics on memory usage by the space.

Returns a table with the cumulative statistics on the memory usage by tuples in the space. Statistics are grouped by arena types: memtx or malloc. For each arena type, the return table includes tuple memory usage statistics listed in the tuple_object.info() reference.

Note

Memory usage statistics are shown only for the memtx storage engine. For other types of spaces, an empty table is returned.

Parameters:	space_object (`space_object`) – an object reference
Return:	space memory usage statistics
Rtype:	table

Possible errors: space_object does not exist.

Example:

tarantool> box.space.tester:stat()
---
- tuple:
    memtx:
      waste_size: 145
      data_size: 235
      header_size: 36
      field_map_size: 24
    malloc:
      waste_size: 0
      data_size: 0
      header_size: 0
      field_map_size: 0
...

space_object:truncate()

object space_object

space_object:truncate()

Deletes all tuples. The method is performed in the background and doesn’t block subsequent requests.

Parameters:	space_object (`space_object`) – an object reference

Complexity factors: Index size, Index type, Number of tuples accessed.

Return:	nil

The truncate method can only be called by the user who created the space, or from within a setuid function created by the user who created the space. Read more about setuid functions in the reference for box.schema.func.create().

Note

Do not call this method within a transaction in Tarantool older than v. 2.10.0. See gh-6123 for details.

Example:

tarantool> box.space.tester:truncate()
---
...
tarantool> box.space.tester:len()
---
- 0
...

space_object:update()

object space_object

space_object:update(key, {{operator, field_identifier, value}, ...})

Update a tuple.

The update function supports operations on fields — assignment, arithmetic (if the field is numeric), cutting and pasting fragments of a field, deleting or inserting a field. Multiple operations can be combined in a single update request, and in this case they are performed atomically and sequentially. Each operation requires specification of a field identifier, which is usually a number. When multiple operations are present, the field number for each operation is assumed to be relative to the most recent state of the tuple, that is, as if all previous operations in a multi-operation update have already been applied. In other words, it is always safe to merge multiple update invocations into a single invocation, with no change in semantics.
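
The claim that several update invocations can be merged into one can be checked with a short sketch. The space name update_merge_demo is hypothetical; the tuple values are chosen only so that the field renumbering after a deletion is visible.

```lua
-- Sketch: one multi-operation update versus the same operations one by one.
-- 'update_merge_demo' is a hypothetical space name.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('update_merge_demo', {if_not_exists = true})
s:create_index('pk', {if_not_exists = true})

-- Merged: delete field 2, then subtract 3 from what is now field 2.
s:replace{999, 'C', 3}
local merged = s:update(999, {{'#', 2, 1}, {'-', 2, 3}})

-- Stepwise: the same two operations as separate update calls.
s:replace{999, 'C', 3}
s:update(999, {{'#', 2, 1}})
local stepwise = s:update(999, {{'-', 2, 3}})

print(merged, stepwise)  -- both tuples hold 999 and 0
```

In both variants the second operation addresses field 2, because field numbers refer to the state of the tuple after the preceding operations.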

Possible operators are:

+ for addition. values must be numeric, e.g. unsigned or decimal
- for subtraction. values must be numeric
& for bitwise AND. values must be unsigned numeric
| for bitwise OR. values must be unsigned numeric
^ for bitwise XOR. values must be unsigned numeric
: for string splice.
! for insertion of a new field.
# for deletion.
= for assignment.

Possible field_identifiers are:

Positive field number. The first field is 1, the second field is 2, and so on.

Negative field number. The last field is -1, the second-last field is -2, and so on. In other words: (#tuple + negative field number + 1).

Name. If the space was formatted with space_object:format(), then this can be a string for the field ‘name’.
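
The three kinds of field identifiers listed above can be compared side by side in a minimal sketch. The space name field_id_demo and its field names are hypothetical, and addressing a field by name assumes a Tarantool version that supports names in update operations.

```lua
-- Sketch: positive number, negative number, and name as field identifiers.
-- 'field_id_demo' and its field names are hypothetical.
box.cfg{}  -- harmless if the instance is already configured

local s = box.schema.space.create('field_id_demo', {
    if_not_exists = true,
    format = {
        {name = 'id',   type = 'unsigned'},
        {name = 'band', type = 'string'},
        {name = 'year', type = 'unsigned'},
    },
})
s:create_index('pk', {if_not_exists = true})
s:replace{1, 'Roxette', 1986}

s:update(1, {{'=', 3, 1987}})   -- positive number: third field
s:update(1, {{'+', -1, 1}})     -- negative number: last field, 3 + (-1) + 1 = 3
local t = s:update(1, {{'=', 'band', 'ROXETTE'}})  -- name from the format
print(t)
```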

Parameters:

space_object (`space_object`) – an object reference
key (`scalar/table`) – primary-key field values, must be passed as a Lua table if key is multi-part
operator (`string`) – operation type represented as a string
field_identifier (`number-or-string`) – what field the operation will apply to
value (`lua_value`) – what value will be applied

Return:	the updated tuple, or nil if the key is not found
Rtype:	tuple or nil

Possible errors:

It is illegal to modify a primary key field.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type, number of indexes accessed, WAL settings.

Thus, in the instruction:

s:update(44, {{'+', 1, 55 }, {'=', 3, 'x'}})

the primary-key value is 44 and the operators are '+' and '=', meaning: add a value to a field, then assign a value to a field. The first affected field is field 1, and the value added to it is 55. The second affected field is field 3, and the value assigned to it is 'x'.

Example:

Assume that initially there is a space named tester with a primary-key index whose type is unsigned. There is one tuple, with field[1] = 999 and field[2] = 'A'.

In the update:
box.space.tester:update(999, {{'=', 2, 'B'}})
The first argument is tester, that is, the affected space is tester. The second argument is 999, that is, the affected tuple is identified by primary key value = 999. The third argument is =, that is, there is one operation — assignment to a field. The fourth argument is 2, that is, the affected field is field[2]. The fifth argument is 'B', that is, field[2] contents change to 'B'. Therefore, after this update, field[1] = 999 and field[2] = 'B'.

In the update:
box.space.tester:update({999}, {{'=', 2, 'B'}})
the arguments are the same, except that the key is passed as a Lua table (inside braces). This is unnecessary when the primary key has only one field, but would be necessary if the primary key had more than one field. Therefore, after this update, field[1] = 999 and field[2] = 'B' (no change).

In the update:
box.space.tester:update({999}, {{'=', 3, 1}})
the arguments are the same, except that the fourth argument is 3, that is, the affected field is field[3]. It is okay that, until now, field[3] has not existed. It gets added. Therefore, after this update, field[1] = 999, field[2] = 'B', field[3] = 1.

In the update:
box.space.tester:update({999}, {{'+', 3, 1}})
the arguments are the same, except that the third argument is '+', that is, the operation is addition rather than assignment. Since field[3] previously contained 1, this means we’re adding 1 to 1. Therefore, after this update, field[1] = 999, field[2] = 'B', field[3] = 2.

In the update:
box.space.tester:update({999}, {{'|', 3, 1}, {'=', 2, 'C'}})
the idea is to modify two fields at once. The operators are '|' and '=', that is, there are two operations, OR and assignment. The fourth and fifth arguments mean that field[3] gets OR’ed with 1. The seventh and eighth arguments mean that field[2] gets assigned 'C'. Therefore, after this update, field[1] = 999, field[2] = 'C', field[3] = 3.

In the update:
box.space.tester:update({999}, {{'#', 2, 1}, {'-', 2, 3}})
The idea is to delete field[2], then subtract 3 from field[3]. But after the delete, there is a renumbering, so field[3] becomes field[2] before we subtract 3 from it, and that’s why the seventh argument is 2, not 3. Therefore, after this update, field[1] = 999, field[2] = 0.

In the update:
box.space.tester:update({999}, {{'=', 2, 'XYZ'}})
we’re making a long string so that splice will work in the next example. Therefore, after this update, field[1] = 999, field[2] = 'XYZ'.

In the update:
box.space.tester:update({999}, {{':', 2, 2, 1, '!!'}})
The third argument is ':', that is, this is the example of splice. The fourth argument is 2 because the change will occur in field[2]. The fifth argument is 2 because deletion will begin with the second byte. The sixth argument is 1 because the number of bytes to delete is 1. The seventh argument is '!!', because '!!' is to be added at this position. Therefore, after this update, field[1] = 999, field[2] = 'X!!Z'.
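
The whole sequence of updates above can be replayed with a small plain-Lua model. This is only a sketch of the semantics described in this section, not Tarantool's implementation: the names bor and apply_update are hypothetical, the tuple is modeled as an ordinary Lua table, and only positive field numbers and positive splice offsets are handled.

```lua
-- Portable bitwise OR for non-negative integers (no bit library needed).
local function bor(a, b)
  local result, bit = 0, 1
  while a > 0 or b > 0 do
    if a % 2 == 1 or b % 2 == 1 then result = result + bit end
    a, b, bit = math.floor(a / 2), math.floor(b / 2), bit * 2
  end
  return result
end

-- Hypothetical model: applies update operations, in order, to a copy of `fields`.
local function apply_update(fields, ops)
  local t = {}
  for i, v in ipairs(fields) do t[i] = v end
  for _, op in ipairs(ops) do
    local kind, no, arg = op[1], op[2], op[3]
    if kind == '=' then
      t[no] = arg
    elseif kind == '+' then
      t[no] = t[no] + arg
    elseif kind == '-' then
      t[no] = t[no] - arg
    elseif kind == '|' then
      t[no] = bor(t[no], arg)
    elseif kind == '#' then
      -- Delete `arg` fields starting at `no`; later fields are renumbered.
      for _ = 1, arg do table.remove(t, no) end
    elseif kind == ':' then
      -- Splice: delete op[4] bytes starting at byte op[3], then insert op[5].
      local offset, count, str = op[3], op[4], op[5]
      t[no] = t[no]:sub(1, offset - 1) .. str .. t[no]:sub(offset + count)
    else
      error('unsupported operator: ' .. tostring(kind))
    end
  end
  return t
end

-- Replay the examples, starting from field[1] = 999 and field[2] = 'A'.
local tuple = {999, 'A'}
local steps = {
  {{'=', 2, 'B'}},
  {{'=', 3, 1}},
  {{'+', 3, 1}},
  {{'|', 3, 1}, {'=', 2, 'C'}},
  {{'#', 2, 1}, {'-', 2, 3}},
  {{'=', 2, 'XYZ'}},
  {{':', 2, 2, 1, '!!'}},
}
for _, ops in ipairs(steps) do
  tuple = apply_update(tuple, ops)
end
print(tuple[1], tuple[2]) -- 999 and X!!Z
```

Because operations are applied in order, the '-' operation in the fifth step sees the renumbered field, which is why it addresses field 2.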

For more usage scenarios and typical errors see Example: using data operations further in this section.

Since Tarantool 2.3 a tuple can also be updated via JSON paths.

space_object:upsert()

object space_object¶

space_object:upsert({tuple}, {{operator, field_identifier, value}, ...})¶

Update or insert a tuple.

If there is an existing tuple which matches the key fields of tuple, then the request has the same effect as space_object:update() and the {{operator, field_identifier, value}, ...} parameter is used. If there is no existing tuple which matches the key fields of tuple, then the request has the same effect as space_object:insert() and the {tuple} parameter is used. However, unlike insert or update, upsert will not read a tuple and perform error checks before returning – this is a design feature which enhances throughput but requires more caution on the part of the user.

Parameters:
space_object (`space_object`) – an object reference
tuple (`table/tuple`) – the default tuple to insert if no matching tuple is found
operator (`string`) – operation type, represented as a string
field_identifier (`number`) – the field that the operation applies to
value (`lua_value`) – the value to apply
Return:	null

Possible errors:

It is illegal to modify a primary-key field.
It is illegal to use upsert with a space that has a unique secondary index.
ER_TRANSACTION_CONFLICT if a transaction conflict is detected in the MVCC transaction mode.

Complexity factors: Index size, Index type, number of indexes accessed, WAL settings.

Example:

box.space.tester:upsert({12,'c'}, {{'=', 3, 'a'}, {'=', 4, 'b'}})
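
The insert-or-update decision can be illustrated with a plain-Lua model. This is a sketch under simplifying assumptions, not Tarantool code: the name model_upsert is hypothetical, the space is an ordinary Lua table keyed by the first field, and only the '=' operator is modeled.

```lua
-- Hypothetical in-memory model: `space` maps a primary-key value
-- (the first field) to a Lua table of fields.
local function model_upsert(space, tuple, ops)
  local existing = space[tuple[1]]
  if existing == nil then
    -- No tuple with this key: behave like insert and ignore the operations.
    local copy = {}
    for i, v in ipairs(tuple) do copy[i] = v end
    space[tuple[1]] = copy
    return
  end
  -- A tuple with this key exists: behave like update and ignore `tuple`.
  for _, op in ipairs(ops) do
    assert(op[1] == '=', 'only assignment is modeled in this sketch')
    existing[op[2]] = op[3]
  end
end

local tester = {}
model_upsert(tester, {12, 'c'}, {{'=', 3, 'a'}, {'=', 4, 'b'}})
print(#tester[12]) -- 2: the default tuple was inserted as is
model_upsert(tester, {12, 'c'}, {{'=', 3, 'a'}, {'=', 4, 'b'}})
print(table.concat(tester[12], ' ')) -- 12 c a b
```

Running the same request twice shows both branches: the first call inserts the default tuple, the second applies the operations to it.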

For more usage scenarios and typical errors see Example: using data operations further in this section.

space_object extensions

You can extend space_object with custom functions as follows:

Create a Lua function.
Add the function name to a predefined global variable box.schema.space_mt, which has the table type. Adding to box.schema.space_mt makes the function available for all spaces.
Call the function on the space_object: space_object:function-name([parameters]).

Alternatively, you can make a user-defined function available for only one space by calling getmetatable(space_object) and then adding the function name to the meta table.
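
The three steps above might look as follows. This is a minimal sketch meant to be run inside Tarantool; the space name 't', the index name 'pk', and the function name tuple_count are hypothetical.

```lua
-- Run inside Tarantool.
box.cfg{}
local s = box.schema.space.create('t', {if_not_exists = true})
s:create_index('pk', {if_not_exists = true})

-- Step 1: create a Lua function. It receives the space as its first argument.
local function tuple_count(space)
  return space:count()
end

-- Step 2: add it to box.schema.space_mt to make it available for all spaces.
box.schema.space_mt.tuple_count = tuple_count

-- Step 3: call it on a space object.
print(box.space.t:tuple_count())
```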

box.space.create_check_constraint()

Warning

This function was removed in 2.11.0. The check constraint mechanism is replaced with the new tuple constraints. Learn more about tuple constraints in Constraints.

object space_object¶

space_object:create_check_constraint(check_constraint_name, expression)¶

Create a check constraint. A check constraint is a requirement that must be met when a tuple is inserted or updated in a space. Check constraints created with space_object:create_check_constraint have the same effect as check constraints created with an SQL CHECK() clause in a CREATE TABLE statement.

Parameters:	space_object (`space_object`) – an object reference check_constraint_name (`string`) – name of check constraint, which should conform to the rules for object names expression (`string`) – SQL code of an expression which must return a boolean result
Return:	check constraint object
Rtype:	check_constraint_object

The space must be formatted with space_object:format() so that the expression can contain field names. The space must be empty. The space must not be a system space.

The expression must return true or false and should be deterministic. The expression may be any SQL (not Lua) expression containing field names, built-in function names, literals, and operators. Subqueries are not allowed. If a field name contains lowercase characters, it must be enclosed in “double quotes”.

Check constraints are checked before the request is performed, at the same time as Lua before_replace triggers. If there is more than one check constraint or before_replace trigger, then they are ordered according to time of creation. (This is a change from the earlier behavior of check constraints, which caused checking before the tuple was formed.)

Check constraints can be dropped with space_object.ck_constraint.check_constraint_name:drop().

Check constraints can be disabled with space_object.ck_constraint.check_constraint_name:enable(false) or check_constraint_object:enable(false). Check constraints can be enabled with space_object.ck_constraint.check_constraint_name:enable(true) or check_constraint_object:enable(true). By default a check constraint is ‘enabled’ which means that the check is performed whenever the request is performed, but can be changed to ‘disabled’ which means that the check is not performed.

During the recovery process, for example when the Tarantool server is starting, the check is not performed unless force_recovery is specified.

Example:

box.schema.space.create('t')
box.space.t:format({{name = 'f1', type = 'unsigned'},
                    {name = 'f2', type = 'string'},
                    {name = 'f3', type = 'string'}})
box.space.t:create_index('i')
box.space.t:create_check_constraint('c1', [["f2" > 'A']])
box.space.t:create_check_constraint('c2',
                        [["f2"=UPPER("f3") AND NOT "f2" LIKE '__']])
-- This insert will fail, constraint c1 expression returns false
box.space.t:insert{1, 'A', 'A'}
-- This insert will fail, constraint c2 expression returns false
box.space.t:insert{1, 'B', 'c'}
-- This insert will succeed, both constraint expressions return true
box.space.t:insert{1, 'B', 'b'}
-- This update will fail, constraint c2 expression returns false
box.space.t:update(1, {{'=', 2, 'xx'}, {'=', 3, 'xx'}})

A list of check constraints is in box.space._ck_constraint.

space_object.enabled

object space_object¶

space_object.enabled¶: Whether or not this space is enabled. The value is false if the space has no index.

space_object.field_count

object space_object¶

space_object.field_count¶

The required field count for all tuples in this space. The field_count can be set initially with:

box.schema.space.create(..., {
    ... ,
    field_count = *field_count_value* ,
    ...
})

The default value is 0, which means there is no required field count.

Example:

tarantool> box.space.tester.field_count
---
- 0
...

space_object.id

object space_object¶

space_object.id¶

Ordinal space number. Spaces can be referenced by either name or number. Thus, if space tester has id = 800, then box.space.tester:insert{0} and box.space[800]:insert{0} are equivalent requests.

Example:

tarantool> box.space.tester.id
---
- 512
...

space_object.index

object space_object¶

index¶

A container for all defined indexes. There is a Lua object of type box.index with methods to search tuples and iterate over them in predefined order.

To reset, use box.stat.reset().

Rtype:	table

Example:

-- checking the number of indexes for space 'tester'
tarantool> local counter=0; for i=0,#box.space.tester.index do
  if box.space.tester.index[i]~=nil then counter=counter+1 end
  end; print(counter)
1
---
...
-- checking the type of index 'primary'
tarantool> box.space.tester.index.primary.type
---
- TREE
...

box.space._cluster

box.space._cluster¶: _cluster is a system space for support of the replication feature.

box.space._func

box.space._func¶: A system space containing functions created using box.schema.func.create(). If a function’s definition is specified in the body option, this function is persistent. In this case, its definition is stored in a snapshot and can be recovered if the server restarts.

Note

The system space view for _func is _vfunc.

box.space._index

box.space._index¶

_index is a system space.

Tuples in this space contain the following fields:

id (= id of space),
iid (= index number within space),
name,
type,
opts (e.g. unique option),
[tuple-field-no, tuple-field-type …].

Here is what _index contains in a typical installation:

tarantool> box.space._index:select{}
---
- - [272, 0, 'primary', 'tree', {'unique': true}, [[0, 'string']]]
  - [280, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned']]]
  - [280, 1, 'owner', 'tree', {'unique': false}, [[1, 'unsigned']]]
  - [280, 2, 'name', 'tree', {'unique': true}, [[2, 'string']]]
  - [281, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned']]]
  - [281, 1, 'owner', 'tree', {'unique': false}, [[1, 'unsigned']]]
  - [281, 2, 'name', 'tree', {'unique': true}, [[2, 'string']]]
  - [288, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned'], [1, 'unsigned']]]
  - [288, 2, 'name', 'tree', {'unique': true}, [[0, 'unsigned'], [2, 'string']]]
  - [289, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned'], [1, 'unsigned']]]
  - [289, 2, 'name', 'tree', {'unique': true}, [[0, 'unsigned'], [2, 'string']]]
  - [296, 0, 'primary', 'tree', {'unique': true}, [[0, 'unsigned']]]
  - [296, 1, 'owner', 'tree', {'unique': false}, [[1, 'unsigned']]]
  - [296, 2, 'name', 'tree', {'unique': true}, [[2, 'string']]]
...

The system space view for _index is _vindex.

box.space._vindex

box.space._vindex¶: _vindex is the system space view for _index.

box.space._priv

box.space._priv¶

_priv is a system space where privileges are stored.

Tuples in this space contain the following fields:

the numeric id of the user who gave the privilege (“grantor_id”),
the numeric id of the user who received the privilege (“grantee_id”),
the type of object: ‘space’, ‘index’, ‘function’, ‘sequence’, ‘user’, ‘role’, or ‘universe’,
the numeric id of the object,
the type of operation: “read” = 1, “write” = 2, “execute” = 4, “create” = 32, “drop” = 64, “alter” = 128, or a combination such as “read,write,execute”.
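
Because the operation types are bit values, a combined privilege is the sum of its parts (for example, “read,write,execute” is 1 + 2 + 4 = 7). A plain-Lua sketch of decoding such a value follows; the helper name decode_privileges is hypothetical, and only the bit values listed above are covered.

```lua
-- Bit values taken from the list above.
local PRIVILEGE_BITS = {
  {1, 'read'}, {2, 'write'}, {4, 'execute'},
  {32, 'create'}, {64, 'drop'}, {128, 'alter'},
}

-- Hypothetical helper: returns the names of the operations set in `mask`.
local function decode_privileges(mask)
  local names = {}
  for _, entry in ipairs(PRIVILEGE_BITS) do
    local bit_value, name = entry[1], entry[2]
    -- Test a single bit with plain arithmetic, so no bit library is needed.
    if math.floor(mask / bit_value) % 2 == 1 then
      table.insert(names, name)
    end
  end
  return table.concat(names, ',')
end

print(decode_privileges(7)) -- read,write,execute
```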

See Access control for details about user privileges.

The system space view for _priv is _vpriv.

box.space._vpriv

box.space._vpriv¶: _vpriv is the system space view for _priv.

box.space._schema

box.space._schema¶

_schema is a system space.

This space contains the following tuples:

version: version information for this Tarantool instance.
replicaset_name (since 3.0.0): the name of the replica set to which this instance belongs.
replicaset_uuid (since 3.0.0): the instance’s replica set UUID. In version 3.0.0, the field was renamed from cluster to replicaset_uuid.
max_id (deprecated since 2.11.1): the maximal space ID. Use the box.space._space.index[0]:max() function instead.
once...: tuples that correspond to specific box.once() blocks from the instance’s initialization file. The first field in these tuples contains the key value from the corresponding box.once() block prefixed with ‘once’ (for example, oncehello), so you can easily find a tuple that corresponds to a specific box.once() block.

Example:

In the example, the _schema space contains two box.once objects – oncebye and oncehello.

app:instance001> box.space._schema:select{}
---
- - ['oncebye']
  - ['oncehello']
  - ['replicaset_name', 'replicaset001']
  - ['replicaset_uuid', '72d2d9bf-5d9f-48c4-ba80-9d657e128fee']
  - ['version', 3, 1, 0]
...
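
As a sketch of how the ‘once’ prefix can be used, the following plain-Lua helper picks the box.once() keys out of a list of _schema key values. The name once_keys is hypothetical, and the helper assumes that no other _schema key starts with ‘once’.

```lua
-- Hypothetical helper: given the first fields of _schema tuples,
-- returns the box.once() keys with the 'once' prefix stripped.
local function once_keys(schema_keys)
  local keys = {}
  for _, key in ipairs(schema_keys) do
    local suffix = key:match('^once(.+)$')
    if suffix then table.insert(keys, suffix) end
  end
  return keys
end

-- Key values copied from the example output above.
local keys = once_keys({'oncebye', 'oncehello', 'replicaset_name',
                        'replicaset_uuid', 'version'})
print(table.concat(keys, ', ')) -- bye, hello
```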

box.space._sequence

box.space._sequence¶

_sequence is a system space for support of the sequence feature. It contains persistent information that was established by box.schema.sequence.create() or sequence_object:alter().

The system space view for _sequence is _vsequence.

box.space._sequence_data

box.space._sequence_data¶

_sequence_data is a system space for support of the sequence feature.

Each tuple in _sequence_data contains two fields:

the id of the sequence, and
the last value that the sequence generator returned (non-persistent information).

There is no guarantee that this space will be updated immediately after every data-change request.

box.space._space

box.space._space¶

_space is a system space. It contains all spaces hosted on the current Tarantool instance, both system spaces and spaces created by users.

Tuples in this space contain the following fields:

id,
owner (= id of user who owns the space),
name,
engine,
field_count,
flags (e.g. temporary),
format (as made by a format clause).

These fields are established by box.schema.space.create().

The system space view for _space is _vspace.

Example #1:

The following function will display every simple field in all tuples of _space.

function example()
  -- Collect one output line per tuple in _space.
  local ta = {}
  for _, v in box.space._space:pairs() do
    local line = ''
    -- Walk the tuple's fields, skipping tables (such as flags and format).
    for i = 1, #v do
      if type(v[i]) ~= 'table' then
        line = line .. v[i] .. ' '
      end
    end
    table.insert(ta, line)
  end
  return ta
end

Here is what example() returns in a typical installation:

tarantool> example()
---
- - '272 1 _schema memtx 0  '
  - '280 1 _space memtx 0  '
  - '281 1 _vspace sysview 0  '
  - '288 1 _index memtx 0  '
  - '296 1 _func memtx 0  '
  - '304 1 _user memtx 0  '
  - '305 1 _vuser sysview 0  '
  - '312 1 _priv memtx 0  '
  - '313 1 _vpriv sysview 0  '
  - '320 1 _cluster memtx 0  '
  - '512 1 tester memtx 0  '
  - '513 1 origin vinyl 0  '
  - '514 1 archive memtx 0  '
...

Example #2:

The following requests will create a space using box.schema.space.create() with a format clause, then retrieve the _space tuple for the new space. This illustrates the typical use of the format clause and shows the recommended names and data types for the fields.

tarantool> box.schema.space.create('TM', {
         >   id = 12345,
         >   format = {
         >     [1] = {["name"] = "field_1"},
         >     [2] = {["type"] = "unsigned"}
         >   }
         > })
---
- index: []
  on_replace: 'function: 0x41c67338'
  temporary: false
  id: 12345
  engine: memtx
  enabled: false
  name: TM
  field_count: 0
- created
...
tarantool> box.space._space:select(12345)
---
- - [12345, 1, 'TM', 'memtx', 0, {}, [{'name': 'field_1'}, {'type': 'unsigned'}]]
...

box.space._vspace

box.space._vspace¶: _vspace is the system space view for _space.

box.space._space_sequence

box.space._space_sequence¶

_space_sequence is a system space. It contains connections between spaces and sequences.

Tuples in this space contain the following fields:

id (unsigned) – space id
sequence_id (unsigned) – id of the attached sequence
is_generated (boolean) – true if the sequence was created automatically via a space:create_index('pk', {sequence = true}) call
field (unsigned) – id of the space field to which the sequence is attached.
path (string) – path to the data within the field that is set using the attached sequence.

The system space view for _space_sequence is _vspace_sequence.

Example

-- Create a sequence --
box.schema.sequence.create('id_seq',{min=1000, start=1000})
-- Create a space --
box.schema.space.create('customers')

-- Create an index that uses the sequence --
box.space.customers:create_index('primary',{ sequence = 'id_seq' })

-- Create a space --
box.schema.space.create('orders')

-- Create an index that uses an auto sequence --
box.space.orders:create_index( 'primary', { sequence = true })

-- Check the connections between spaces and sequences
box.space._space_sequence:select{}
--[[
---
- - [512, 1, false, 0, '']
  - [513, 2, true, 0, '']
...
--]]

box.space._vspace_sequence

box.space._vspace_sequence¶: _vspace_sequence is the system space view for _space_sequence.

box.space._user

box.space._user¶

_user is a system space where user names and password hashes are stored. Learn more about Tarantool’s access control system from the Access control topic.

Tuples in this space contain the following fields:

a numeric id of the tuple (“id”)
a numeric id of the tuple’s creator
a name
a type: ‘user’ or ‘role’
(optional) a password hash
(optional) an array of previous authentication data
(optional) a timestamp of the last password update

There are five special tuples in the _user space: ‘guest’, ‘admin’, ‘public’, ‘replication’, and ‘super’.

Name	ID	Type	Description
guest	0	user	Default user when connecting remotely. Usually, an untrusted user with few privileges.
admin	1	user	Default user when using Tarantool as a console. Usually, an administrative user with all privileges.
public	2	role	Pre-defined role, automatically granted to new users when they are created with `box.schema.user.create(user-name)`. Therefore a convenient way to grant ‘read’ on space ‘t’ to every user that will ever exist is with `box.schema.role.grant('public','read','space','t')`.
replication	3	role	Pre-defined role, which the ‘admin’ user can grant to users who need to use replication features.
super	31	role	Pre-defined role, which the ‘admin’ user can grant to users who need all privileges on all objects. The ‘super’ role has these privileges on ‘universe’: read, write, execute, create, drop, alter.

To select a tuple from the _user space, use box.space._user:select(). In the example below, select is executed for a user with id = 0. This is the ‘guest’ user that has no password.

tarantool> box.space._user:select{0}
---
- - [0, 1, 'guest', 'user']
...

Warning

To change tuples in the _user space, do not use ordinary box.space functions for insert, update, or delete. Learn more from Managing users.

The system space view for _user is _vuser.

box.space._vuser

box.space._vuser¶: _vuser is the system space view for _user.

box.space._ck_constraint

box.space._ck_constraint¶

_ck_constraint is a system space where check constraints are stored.

Tuples in this space contain the following fields:

the numeric id of the space (“space_id”),
the name,
whether the check is deferred (“is_deferred”),
the language of the expression, such as ‘SQL’,
the expression (“code”)

Example:

tarantool> box.space._ck_constraint:select()
---
- - [527, 'c1', false, 'SQL', '"f2" > ''A''']
  - [527, 'c2', false, 'SQL', '"f2" == UPPER("f3") AND NOT "f2" LIKE ''__''']
...

box.space._collation

box.space._collation¶

_collation is a system space with a list of collations. There are over 270 built-in collations and users may add more. Here is one example:

tarantool> box.space._collation:select(239)
---
- - [239, 'unicode_uk_s2', 1, 'ICU', 'uk', {'strength': 'secondary'}]
...

Explanation of the fields in the example:

id = 239, that is, Tarantool’s primary key is 239,
name = ‘unicode_uk_s2’, that is, according to Tarantool’s naming convention this is a Unicode collation for the uk locale with secondary strength,
owner = 1, that is, the admin user,
type = ‘ICU’, that is, the rules follow International Components for Unicode,
locale = ‘uk’, that is, Ukrainian,
opts = ‘strength:secondary’, that is, comparisons with this collation use both primary and secondary weights.

The system space view for _collation is _vcollation.

box.space._vcollation

box.space._vcollation¶: _vcollation is the system space view for _collation.

System space views

A system space view, also called a ‘sysview’, is a restricted read-only copy of a system space.

The system space views and the system spaces that they are associated with are:
_vcollation, a view of _collation,
_vfunc, a view of _func,
_vindex, a view of _index,
_vpriv, a view of _priv,
_vsequence, a view of _sequence,
_vspace, a view of _space,
_vspace_sequence, a view of _space_sequence,
_vuser, a view of _user.

The structure of a system space view’s tuples is identical to the structure of the associated space’s tuples. However, the privileges for a system space view are usually different. By default, ordinary users do not have any privileges for most system spaces, but have a ‘read’ privilege for system space views.

Typically this is the default situation:
* The ‘public’ role has ‘read’ privilege on all system space views because that is the situation when the database is first created.
* All users have the ‘public’ role, because it is granted to them automatically during box.schema.user.create().
* The system space view will contain the tuples in the associated system space, if and only if the user has a privilege for the object named in the tuple.
Unless administrators change the privileges, the effect is that non-administrator users cannot access the system space, but they can access the system space view, which shows only the objects that they can access.

For example, typically, the ‘admin’ user can do anything with _space and _vspace looks the same as _space. But the ‘guest’ user can only read _vspace, and _vspace contains fewer tuples than _space. Therefore in most installations the ‘guest’ user should select from _vspace to get a list of spaces.

Example:

This example shows the difference between _vuser and _user. As explained above, if the user has the full set of privileges (like ‘admin’), the contents of _vuser match the contents of _user. If the user has limited access, _vuser contains only the tuples accessible to this user.

To see how _vuser works, connect to a Tarantool database remotely via net.box and select all tuples from the _user space, both when the ‘guest’ user is and is not allowed to read from the database.

First, start Tarantool and grant read, write and execute privileges to the guest user:

tarantool> box.cfg{listen = 3301}
---
...
tarantool> box.schema.user.grant('guest', 'read,write,execute', 'universe')
---
...

Switch to the other terminal, connect to the Tarantool instance and select all tuples from the _user space:

tarantool> conn = require('net.box').connect(3301)
---
...
tarantool> conn.space._user:select{}
---
- - [0, 1, 'guest', 'user', {}]
  - [1, 1, 'admin', 'user', {}]
  - [2, 1, 'public', 'role', {}]
  - [3, 1, 'replication', 'role', {}]
  - [31, 1, 'super', 'role', {}]
...

This result contains the same set of users as if you made the request from your Tarantool instance as ‘admin’.

Switch to the first terminal and revoke the read privileges from the ‘guest’ user:

tarantool> box.schema.user.revoke('guest', 'read', 'universe')
---
...

Switch to the other terminal, stop the session (to stop tarantool type Ctrl+C or Ctrl+D), start again, connect again, and repeat the conn.space._user:select{} request. The access is denied:

tarantool> conn.space._user:select{}
---
- error: Read access to space '_user' is denied for user 'guest'
...

However, if you select from _vuser instead, the users’ data available for the ‘guest’ user is displayed:

tarantool> conn.space._vuser:select{}
---
- - [0, 1, 'guest', 'user', {}]
...

box.space._session_settings

box.space._session_settings¶

A temporary system space with settings that affect behavior, particularly SQL behavior, for the current session. It uses a special engine named ‘service’. Every ‘service’ tuple is created on the fly, that is, new tuples are made every time _session_settings is accessed. Every settings tuple has two fields: name (the primary key) and value. The tuples’ names and default values are:

sql_default_engine: default storage engine for new SQL tables. Default: memtx.
sql_full_column_names: use full column names in SQL result set metadata. Default: false.
sql_full_metadata: whether SQL result set metadata includes more than just name and type. Default: false.
sql_parser_debug: show parser steps for following statements. Default: false.
sql_recursive_triggers: whether a triggered statement can activate a trigger. Default: true.
sql_reverse_unordered_selects: return result rows in reverse order if there is no ORDER BY clause. Default: false.
sql_select_debug: show execution steps during SELECT. Default: false.
sql_seq_scan: allow sequential scans in SQL SELECT. Default: true.
sql_vdbe_debug: for internal use. Default: false.
sql_defer_foreign_keys (removed in 2.11.0): whether foreign-key checks can wait till commit. Default: false.
error_marshaling_enabled (removed in 2.10.0): whether error objects have a special structure. Default: false.

Three requests are possible: select, get and update. For example, after s = box.space._session_settings, s:select('sql_default_engine') probably returns {'sql_default_engine', 'memtx'}, and s:update('sql_default_engine', {{'=', 'value', 'vinyl'}}) changes the default engine to ‘vinyl’.
Updating sql_parser_debug or sql_select_debug or sql_vdbe_debug has no effect unless Tarantool was built with -DCMAKE_BUILD_TYPE=Debug. To check if this is so, look at require('tarantool').build.target.
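
The get and update requests described above can be combined as follows. This is a minimal sketch meant to be run inside Tarantool; it assumes the two-field tuple layout (name, value) stated above and restores the original value at the end.

```lua
-- Run inside Tarantool.
box.cfg{}
local s = box.space._session_settings

-- Read the current value (field 1 is the name, field 2 is the value).
local before = s:get('sql_default_engine')[2]

-- Change the setting for the current session, then restore it.
s:update('sql_default_engine', {{'=', 'value', 'vinyl'}})
print(s:get('sql_default_engine')[2])
s:update('sql_default_engine', {{'=', 'value', before}})
```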

Submodule box.stat

The box.stat submodule provides access to request and network statistics.

Below is a list of all box.stat functions.

Name	Use
box.stat()	Show request statistics
box.stat.net()	Show network activity
box.stat.memtx()	Show `memtx` storage engine activity
box.stat.vinyl()	Show `vinyl` storage engine activity
box.stat.reset()	Reset the statistics

box.stat()

box.stat()¶

Shows the total number of requests since startup and the average number of requests per second, broken down by request type.

Return:

in the tables that box.stat() returns:

total: total number of requests processed since the server started
rps: average number of requests per second in the last 5 seconds.

ERROR is the count of requests that resulted in an error.

Example:

tarantool> box.stat() -- return 15 tables
---
- DELETE:
    total: 0
    rps: 0
  COMMIT:
    total: 0
    rps: 0
  SELECT:
    total: 12
    rps: 0
  ROLLBACK:
    total: 0
    rps: 0
  INSERT:
    total: 6
    rps: 0
  EVAL:
    total: 0
    rps: 0
  ERROR:
    total: 0
    rps: 0
  CALL:
    total: 0
    rps: 0
  BEGIN:
    total: 0
    rps: 0
  PREPARE:
    total: 0
    rps: 0
  REPLACE:
    total: 0
    rps: 0
  UPSERT:
    total: 0
    rps: 0
  AUTH:
    total: 0
    rps: 0
  EXECUTE:
    total: 0
    rps: 0
  UPDATE:
    total: 2
    rps: 0
...

tarantool> box.stat().DELETE -- total + requests per second from one table
---
- total: 0
  rps: 0
...
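
Since the result is an ordinary Lua table keyed by request type, it can be post-processed directly. The sketch below lists the request types with a non-zero total; the helper name active_request_types is hypothetical, and the sample data is copied (abridged) from the example output above so that the snippet also runs outside Tarantool.

```lua
-- Hypothetical helper: takes a table shaped like the result of box.stat()
-- and returns the request types whose total is non-zero, sorted by name.
local function active_request_types(stat)
  local names = {}
  for name, counters in pairs(stat) do
    if counters.total > 0 then table.insert(names, name) end
  end
  table.sort(names)
  return names
end

-- Sample data copied from the example output above (abridged).
local sample = {
  DELETE = {total = 0, rps = 0},
  SELECT = {total = 12, rps = 0},
  INSERT = {total = 6, rps = 0},
  UPDATE = {total = 2, rps = 0},
}
print(table.concat(active_request_types(sample), ' ')) -- INSERT SELECT UPDATE
```

Inside Tarantool, the live statistics could be passed instead: active_request_types(box.stat()).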

box.stat.net()

box.stat.net()¶

Shows network activity: the number of bytes sent and received, the number of connections, streams, and requests (current, average, and total).

Return:

in the tables that box.stat.net() returns:

SENT.rps and RECEIVED.rps – average number of bytes sent/received per second in the last 5 seconds
SENT.total and RECEIVED.total – total number of bytes sent/received since the server started
CONNECTIONS.current – number of open connections
CONNECTIONS.rps – average number of connections opened per second in the last 5 seconds
CONNECTIONS.total – total number of connections opened since the server started
REQUESTS.current – number of requests in progress, which can be limited by box.cfg.net_msg_max
REQUESTS.rps – average number of requests processed per second in the last 5 seconds
REQUESTS.total – total number of requests processed since the server started
REQUESTS_IN_PROGRESS.current – number of requests being currently processed by the TX thread
REQUESTS_IN_PROGRESS.rps – average number of requests processed by the TX thread per second in the last 5 seconds
REQUESTS_IN_PROGRESS.total – total number of requests processed by the TX thread since the server started
STREAMS.current – number of active streams
STREAMS.rps – average number of streams opened per second in the last 5 seconds
STREAMS.total – total number of streams opened since the server started
REQUESTS_IN_STREAM_QUEUE.current – number of requests waiting in stream queues
REQUESTS_IN_STREAM_QUEUE.rps – average number of requests in stream queues per second in the last 5 seconds
REQUESTS_IN_STREAM_QUEUE.total – total number of requests placed in stream queues since the server started

Example:

tarantool> box.stat.net() -- 7 tables
---
- CONNECTIONS:
    current: 1
    rps: 0
    total: 1
  REQUESTS:
    current: 0
    rps: 0
    total: 8
  REQUESTS_IN_PROGRESS:
    current: 0
    rps: 0
    total: 7
  SENT:
    total: 19579
    rps: 0
  REQUESTS_IN_STREAM_QUEUE:
    current: 0
    rps: 0
    total: 0
  STREAMS:
    current: 0
    rps: 0
    total: 0
  RECEIVED:
    total: 197
    rps: 0
...

box.stat.net.thread()¶

Shows network activity per network thread: the number of bytes sent and received, the number of connections, streams, and requests (current, average, and total).

When called with an index (box.stat.net.thread[1]), shows network statistics for a single network thread.

Return:	Same network activity metrics as box.stat.net() for each network thread

Example:

tarantool> box.stat.net.thread() -- iproto_threads = 2
---
- - CONNECTIONS:
      current: 0
      rps: 0
      total: 0
    REQUESTS:
      current: 0
      rps: 0
      total: 0
    REQUESTS_IN_PROGRESS:
      current: 0
      rps: 0
      total: 0
    SENT:
      total: 0
      rps: 0
    REQUESTS_IN_STREAM_QUEUE:
      current: 0
      rps: 0
      total: 0
    STREAMS:
      current: 0
      rps: 0
      total: 0
    RECEIVED:
      total: 0
      rps: 0
  - CONNECTIONS:
      current: 1
      rps: 0
      total: 1
    REQUESTS:
      current: 0
      rps: 0
      total: 8
    REQUESTS_IN_PROGRESS:
      current: 0
      rps: 0
      total: 7
    SENT:
      total: 19579
      rps: 0
    REQUESTS_IN_STREAM_QUEUE:
      current: 0
      rps: 0
      total: 0
    STREAMS:
      current: 0
      rps: 0
      total: 0
    RECEIVED:
      total: 197
      rps: 0
...

tarantool> box.stat.net.thread[1] -- first network thread
---
- - CONNECTIONS:
      current: 1
      rps: 0
      total: 1
    REQUESTS:
      current: 0
      rps: 0
      total: 8
    REQUESTS_IN_PROGRESS:
      current: 0
      rps: 0
      total: 7
    SENT:
      total: 19579
      rps: 0
    REQUESTS_IN_STREAM_QUEUE:
      current: 0
      rps: 0
      total: 0
    STREAMS:
      current: 0
      rps: 0
      total: 0
    RECEIVED:
      total: 197
      rps: 0
...
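
In the sample outputs above, the per-thread counters add up to the aggregate values shown for box.stat.net(). A plain-Lua sketch of that summation follows; the helper name sum_counter is hypothetical, and the assumption that per-thread totals always add up to the aggregate is taken from the sample data only.

```lua
-- Hypothetical helper: sums one counter over a list of per-thread tables
-- shaped like the result of box.stat.net.thread().
local function sum_counter(threads, metric, field)
  local sum = 0
  for _, thread_stat in ipairs(threads) do
    sum = sum + thread_stat[metric][field]
  end
  return sum
end

-- Sample data copied from the example output above (abridged).
local threads = {
  {REQUESTS = {current = 0, rps = 0, total = 0}, SENT = {total = 0, rps = 0}},
  {REQUESTS = {current = 0, rps = 0, total = 8}, SENT = {total = 19579, rps = 0}},
}
print(sum_counter(threads, 'REQUESTS', 'total')) -- 8
print(sum_counter(threads, 'SENT', 'total'))     -- 19579
```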

box.stat.memtx()

box.stat.memtx()¶: Shows memtx storage engine activity.

box.stat.memtx().data

data shows how much memory (in bytes) is allocated for memtx tuples:

data.garbage is the amount of memory that is unused and scheduled to be freed (freed lazily on memory allocation).
data.total is the total amount of memory allocated for data tuples. This includes data.read_view and data.garbage plus tuples that are actually stored in memtx spaces.
data.read_view is the amount of memory held for read views. This includes memory allocated both for system read views (snapshot, replication) and user read views (EE-only). This should be non-zero only if there are open read views.

To list all open read views, use box.read_view.list().

Example:

tarantool> box.stat.memtx().data
---
- garbage: 0
  total: 25334
  read_view: 0
...
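Since data.total includes data.garbage and data.read_view, the memory taken by tuples that are actually stored in memtx spaces can be estimated by subtraction. The sketch below is plain Lua; the helper name live_tuple_bytes is hypothetical, and the sample values are copied from the example above.

```lua
-- Estimate the bytes used by tuples that are actually stored in spaces.
-- `data` is expected to have the shape of box.stat.memtx().data.
local function live_tuple_bytes(data)
  return data.total - data.garbage - data.read_view
end

-- Values copied from the example output above.
local sample = { garbage = 0, total = 25334, read_view = 0 }

print(live_tuple_bytes(sample)) -- 25334
```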

box.stat.memtx().index

index shows how much memory (in bytes) is allocated for indexing memtx tuples:

index.read_view is the amount of memory held for read views. This includes memory allocated both for system read views (snapshot, replication) and user read views (EE-only). This should be non-zero only if there are open read views.
index.total is the total amount of memory allocated for indexing data. This includes index.read_view plus memory used for indexing tuples that are actually stored in memtx spaces.

To list all open read views, use box.read_view.list().

Example:

tarantool> box.stat.memtx().index
---
- read_view: 0
  total: 1032192
...

box.stat.memtx().tx

tx shows the statistics of the memtx transactional manager, which is responsible for transactions (box.stat.memtx().tx.txn) and multiversion concurrency control (box.stat.memtx().tx.mvcc).

box.stat.memtx().tx.txn shows memory allocation related to transactions.

It consists of the following sections:
- statements are transaction statements. For example, when a user starts a transaction and calls space:replace{0, 1} within it, this operation becomes a statement of that transaction under the hood.
- user is the memory that a user allocated within the current transaction using the Tarantool C API function box_txn_alloc().
- system is the memory allocated for internal needs (for example, logs) and savepoints.
  
  For each section, Tarantool reports the following statistics:
  - total is the number of bytes that are currently allocated in memtx for all transactions within the section scope.
  - avg is the average number of bytes that a single transaction uses (equals total / number of open transactions).
  - max is the maximal number of bytes that a single transaction uses.

box.stat.memtx().tx.mvcc shows memory allocation related to multiversion concurrency control (MVCC). MVCC is responsible for isolating transactions. It detects conflicts and makes sure that tuples that do not belong to a particular space but were (or could be) read by some transaction are not deleted.

It consists of the following sections:
- trackers is the memory allocated for trackers of transaction reads. Like in the previous sections, Tarantool reports the total, average, and maximal number of bytes allocated for trackers per a single transaction.
- conflicts is the memory allocated for conflicts which are entities created when transactional conflicts occur. Like in the previous sections, Tarantool reports the total, average, and maximal number of allocated bytes.
- tuples is the memory allocated for storing tuples. With MVCC, tuples are stored using the stories mechanism. Nearly every tuple has its story. Even tuples in an index may have their stories, so it may be useful to differentiate memory allocated for tuples and memory allocated for stories.
  
  All stored tuples fall into three categories, with memory statistics reported for each category:
  - tracking is for tuples that are not used by any transactions directly, but MVCC uses them for tracking transaction reads.
  - used is for tuples that are used by active read-write transactions. See a detailed example below.
  - read_view is for tuples that are not used by active read-write transactions, but are used by read-only transactions.
    
    For each of the three categories, Tarantool reports two statistical blocks:
    - stories is for stories.
    - retained is for retained tuples that do not belong to any index but that MVCC does not allow deleting yet.

    For each block, Tarantool reports the following statistics:
    - count is the number of stories or retained tuples.
    - total is the number of bytes allocated for stories or retained tuples.

Example

This example illustrates memory statistics for used tuples in a transaction.

The cluster must be started with the database.use_mvcc_engine parameter set to true. This enables MVCC so that box.stat.memtx.tx().mvcc contains non-zero values.

The next step is to create a space with a primary index and to begin a transaction:

box.schema.space.create('test')
box.space.test:create_index('pk')

box.begin()
box.space.test:replace{0, 0}
box.space.test:replace{0, string.rep('a', 100)}
box.space.test:replace{0, 1}
box.space.test:replace{1, 1}
box.space.test:replace{2, 1}

In the transaction above, three tuples are written one after another under the key 0:

{0, 0}
{0, 'aa...aa'}
{0, 1}

MVCC considers all these tuples as used since they belong to the current transaction. Also, MVCC considers tuples {0, 0} and {0, 'aa...aa'} as retained because they don’t belong to any index (unlike {0, 1}) but cannot be deleted yet.

Calling box.stat.memtx.tx() now returns the following result:

tarantool> box.stat.memtx.tx()
---
- txn:
    statements:
      max: 720
      avg: 720
      total: 720
    user:
      max: 0
      avg: 0
      total: 0
    system:
      max: 916
      avg: 916
      total: 916
  mvcc:
    trackers:
      max: 0
      avg: 0
      total: 0
    conflicts:
      max: 0
      avg: 0
      total: 0
    tuples:
      tracking:
        stories:
          count: 0
          total: 0
        retained:
          count: 0
          total: 0
      used:
        stories:
          count: 6
          total: 944
        retained:
          count: 2
          total: 119
      read_view:
        stories:
          count: 0
          total: 0
        retained:
          count: 0
          total: 0
...

Pay attention to the mvcc.tuples.used block in the output: it shows the memory allocated for used tuples.
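The per-category figures can also be added up programmatically. The following is a minimal sketch in plain Lua: the helper name mvcc_tuple_bytes is hypothetical, the sample table is copied from the mvcc.tuples part of the output above, and adding stories to retained assumes that the two blocks do not overlap.

```lua
-- For each tuple category (tracking, used, read_view), add the bytes
-- reported for stories and for retained tuples.
-- `tuples` is expected to have the shape of the mvcc.tuples table above.
local function mvcc_tuple_bytes(tuples)
  local result = {}
  for category, blocks in pairs(tuples) do
    result[category] = blocks.stories.total + blocks.retained.total
  end
  return result
end

-- Values copied from the example output above.
local sample = {
  tracking  = { stories = { count = 0, total = 0 },   retained = { count = 0, total = 0 } },
  used      = { stories = { count = 6, total = 944 }, retained = { count = 2, total = 119 } },
  read_view = { stories = { count = 0, total = 0 },   retained = { count = 0, total = 0 } },
}

local bytes = mvcc_tuple_bytes(sample)
print(bytes.used) -- 1063
```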

box.stat.vinyl()

box.stat.vinyl()

Shows vinyl storage engine activity. For example, box.stat.vinyl().tx contains the number of commits and rollbacks.

Example:

tarantool> box.stat.vinyl().tx.commit -- one item of the vinyl table
---
- 1047632
...

box.stat.vinyl().disk

Since vinyl is an on-disk storage engine (unlike memtx which is an in-memory storage engine), it can handle large databases – but if a database is larger than the amount of memory that is allocated for vinyl, then there will be more disk activity.

box.stat.vinyl().disk.data and box.stat.vinyl().disk.index are the amount of data that has gone into files in a subdirectory of vinyl_dir, with names like {lsn}.run and {lsn}.index. The size of the run will be related to the output of scheduler.dump_*.
box.stat.vinyl().disk.data_compacted is the total size of data stored at the last LSM tree level, in bytes, without taking disk compression into account. It can be thought of as the size of disk space that the user data would occupy if there were no compression, indexing, or space increase caused by the LSM tree design.

box.stat.vinyl().memory

Although the vinyl storage engine is not “in-memory”, Tarantool does need to have memory for write buffers and for caches:

box.stat.vinyl().memory.tuple_cache is the size of memory (in bytes) occupied by tuples stored in the cache.
box.stat.vinyl().memory.tuple is the size of memory (in bytes) occupied by all allocated tuples. This includes cached tuples and tuples that are referenced in Lua.
box.stat.vinyl().memory.tx is transactional memory. This will usually be 0.
box.stat.vinyl().memory.level0 is the “level0” memory area, sometimes abbreviated “L0”, which is the area that vinyl can use for in-memory storage of an LSM tree.

Therefore we can say that “L0 is becoming full” when the amount in memory.level0 is close to the maximum, which is regulator.dump_watermark. We can expect that “L0 = 0” immediately after a dump.

box.stat.vinyl().memory.page_index and box.stat.vinyl().memory.bloom_filter show the amount of memory currently used for index-related structures. The size is a function of the number and size of keys, plus vinyl_page_size, plus vinyl_bloom_fpr. This is not a count of bloom filter “hits” (the number of reads that could be avoided because the bloom filter reports that the key is absent from a run file); that statistic can be found with index_object:stat().
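The notion of “L0 is becoming full” can be expressed as a ratio of memory.level0 to regulator.dump_watermark. The sketch below is plain Lua; the helper name l0_fill_ratio and the sample byte values are hypothetical, chosen only to make the arithmetic easy to follow.

```lua
-- Return how full L0 is, as a fraction of the dump watermark.
-- `vinyl_stat` is expected to have the shape of box.stat.vinyl().
local function l0_fill_ratio(vinyl_stat)
  local watermark = vinyl_stat.regulator.dump_watermark
  if watermark == 0 then
    return 0
  end
  return vinyl_stat.memory.level0 / watermark
end

-- Hypothetical values: 96 MB in L0, 128 MB watermark.
local sample = {
  memory    = { level0 = 96 * 1024 * 1024 },
  regulator = { dump_watermark = 128 * 1024 * 1024 },
}

print(l0_fill_ratio(sample)) -- 0.75
```

A value close to 1 would suggest that a dump is about to be scheduled; a value of 0 is what one would expect right after a dump.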

box.stat.vinyl().regulator

The vinyl regulator decides when to take or delay actions for disk IO, grouping activity in batches so that it is consistent and efficient. The regulator is invoked by the vinyl scheduler, once per second, and updates related variables whenever it is invoked.

box.stat.vinyl().regulator.dump_bandwidth is the estimated average rate at which dumps are done. Initially this will appear as 10485760 (10 megabytes per second). Only significant dumps (larger than one megabyte) are used for estimating.
box.stat.vinyl().regulator.dump_watermark is the point when dumping must occur. The value is slightly smaller than the amount of memory that is allocated for vinyl trees, which is the vinyl_memory parameter.
box.stat.vinyl().regulator.write_rate is the actual average rate at which recent writes to disk are done. Averaging is done over a 5-second time window, so if there has been no activity for 5 seconds then regulator.write_rate = 0. The write_rate may be slowed when a dump is in progress or when the user has set snap_io_rate_limit.
box.stat.vinyl().regulator.rate_limit is the write rate limit, in bytes per second, imposed on transactions by the regulator based on the observed dump/compaction performance.
box.stat.vinyl().regulator.blocked_writers is the number of fibers currently blocked waiting for vinyl L0 memory quota.
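As a rough illustration of how these values relate, dividing the current L0 size by regulator.dump_bandwidth gives an estimate of how long a dump of L0 might take. This is only a sketch in plain Lua: the helper name estimated_dump_seconds is hypothetical, the 50 MB L0 size is made up, and the bandwidth is the initial value mentioned above.

```lua
-- Estimate the duration of dumping the current L0, in seconds.
-- `vinyl_stat` is expected to have the shape of box.stat.vinyl().
local function estimated_dump_seconds(vinyl_stat)
  local bandwidth = vinyl_stat.regulator.dump_bandwidth
  if bandwidth <= 0 then
    return nil -- no estimate available
  end
  return vinyl_stat.memory.level0 / bandwidth
end

-- Hypothetical L0 size of 50 MB; bandwidth is the initial 10 MB/s estimate.
local sample = {
  memory    = { level0 = 52428800 },
  regulator = { dump_bandwidth = 10485760 },
}

print(estimated_dump_seconds(sample)) -- 5 (printed as 5.0 on Lua 5.3+)
```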

box.stat.vinyl().scheduler

This primarily has counters related to tasks that the scheduler has arranged for dumping or compaction. Most of these items are reset to 0 when the server restarts or when box.stat.reset() is called:

box.stat.vinyl().scheduler.compaction_* is the amount of data from recent changes that has been compacted. This is divided into scheduler.compaction_input (the amount that is being compacted), scheduler.compaction_queue (the amount that is waiting to be compacted), scheduler.compaction_time (total time spent by all worker threads performing compaction, in seconds), and scheduler.compaction_output (the amount that has been compacted, which is presumably smaller than scheduler.compaction_input).
box.stat.vinyl().scheduler.tasks_* is about dump/compaction tasks, in three categories: scheduler.tasks_inprogress (currently running), scheduler.tasks_completed (successfully completed), and scheduler.tasks_failed (aborted due to errors).
box.stat.vinyl().scheduler.dump_* has the amount of data from recent changes that has been dumped, including dump_time (total time spent by all worker threads performing dumps, in seconds), dump_count (the count of completed dumps), dump_input, and dump_output.
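Two derived figures are often easier to read than the raw counters: the average time per dump (dump_time / dump_count) and the ratio of compaction output to compaction input. The following plain-Lua sketch computes both; the helper name scheduler_summary and all sample values are hypothetical.

```lua
-- Derive summary figures from scheduler counters.
-- `s` is expected to have the shape of box.stat.vinyl().scheduler.
local function scheduler_summary(s)
  local summary = {}
  if s.dump_count > 0 then
    summary.avg_dump_time = s.dump_time / s.dump_count
  end
  if s.compaction_input > 0 then
    summary.compaction_ratio = s.compaction_output / s.compaction_input
  end
  return summary
end

-- Hypothetical counters: 4 dumps taking 2 seconds in total,
-- 1000 bytes compacted into 750 bytes.
local sample = {
  dump_count = 4, dump_time = 2,
  compaction_input = 1000, compaction_output = 750,
}

local summary = scheduler_summary(sample)
print(summary.avg_dump_time)    -- 0.5
print(summary.compaction_ratio) -- 0.75
```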

A “dump” is explained in section Storing data with vinyl:

Sooner or later the number of elements in an LSM tree exceeds the L0 size and that is when L0 gets written to a file on disk (called a ‘run’) and then cleared for storing new elements. This operation is called a ‘dump’.

Thus it can be predicted that a dump will occur if the size of L0 (which is memory.level0) is approaching the maximum (which is regulator.dump_watermark) and a dump is not already in progress. In fact Tarantool will try to arrange a dump before this hard limit is reached.

A dump will also occur during a snapshot operation.

box.stat.vinyl().tx

This is about requests that affect transactional activity (“tx” is used here as an abbreviation for “transaction”):

box.stat.vinyl().tx.conflict counts conflicts that caused a transaction to roll back.
box.stat.vinyl().tx.commit is the count of commits (successful transaction ends). It includes implicit commits: for example, any insert causes a commit unless it is within a begin-end block.
box.stat.vinyl().tx.rollback is the count of rollbacks (unsuccessful transaction ends). This is not merely a count of explicit box.rollback() requests – it includes requests that ended in errors. For example, after an attempted insert request that causes a “Duplicate key exists in unique index” error, tx.rollback is incremented.
box.stat.vinyl().tx.statements will usually be 0.
box.stat.vinyl().tx.transactions is the number of transactions that are currently running.
box.stat.vinyl().tx.gap_locks is the number of gap locks that are outstanding during execution of a request. For a low-level description of Tarantool’s implementation of gap locking, see Gap locks in Vinyl transaction manager.
box.stat.vinyl().tx.read_views shows whether a transaction has entered a read-only state to avoid conflict temporarily. This will usually be 0.
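Since tx.commit counts successful transaction ends and tx.rollback counts unsuccessful ones, their sum is the number of finished transactions, and the share of rollbacks follows directly. A minimal plain-Lua sketch, with a hypothetical helper name and made-up counter values:

```lua
-- Return the share of finished transactions that ended in a rollback.
-- `tx` is expected to have the shape of box.stat.vinyl().tx.
local function rollback_ratio(tx)
  local finished = tx.commit + tx.rollback
  if finished == 0 then
    return 0
  end
  return tx.rollback / finished
end

-- Hypothetical counters.
local sample = { commit = 750, rollback = 250 }

print(rollback_ratio(sample)) -- 0.25
```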

box.stat.reset()

box.stat.reset(): Resets the statistics of box.stat(), box.stat.net(), box.stat.memtx(), box.stat.vinyl(), and box.space.index.

Submodule box.tuple

The box.tuple submodule provides read-only access for the tuple userdata type. It allows, for a single tuple: selective retrieval of the field contents, retrieval of information about size, iteration over all the fields, and conversion to a Lua table.

Below is a list of all box.tuple functions.

Name	Use
box.tuple.new()	Create a tuple
box.tuple.is()	Check whether a given object is a tuple
#tuple_object	Count tuple fields
tuple_object:bsize()	Get count of bytes in a tuple
tuple_object[field-number]	Get a tuple’s field by specifying a number
tuple_object[field-name]	Get a tuple’s field by specifying a name
tuple_object[field-path]	Get a tuple’s fields or parts by specifying a path
tuple_object:find(), tuple_object:findall()	Get the number of the first field/all fields matching the search value
tuple_object:format()	Get the format of a tuple
tuple_object:info()	Get information about the tuple
tuple_object:next()	Get the next field value from tuple
tuple_object:pairs(), tuple_object:ipairs()	Prepare for iterating
tuple_object:totable()	Get a tuple’s fields as a table
tuple_object:tomap()	Get a tuple’s fields as a table along with key:value pairs
tuple_object:transform()	Remove (and replace) a tuple’s fields
tuple_object:unpack()	Get a tuple’s fields
tuple_object:update()	Update a tuple
tuple_object:upsert()	Update a tuple ignoring errors

How to convert tuples to/from Lua tables

The following statements illustrate how to convert tuples to/from Lua tables and lists of scalars:

tuple = box.tuple.new({scalar1, scalar2, ... scalar_n}) -- scalars to tuple
lua_table = {tuple:unpack()}                            -- tuple to Lua table
lua_table = tuple:totable()                             -- tuple to Lua table
scalar1, scalar2, ... scalar_n = tuple:unpack()         -- tuple to scalars
tuple = box.tuple.new(lua_table)                        -- Lua table to tuple

The function below performs these conversions, then finds the field that contains ‘b’, removes that field from the tuple, and displays how many bytes remain in the tuple. The function uses the Tarantool box.tuple functions new(), totable(), unpack(), find(), transform(), and bsize().

function example()
  local tuple1, tuple2, luatable1, scalar1, scalar2, scalar3, field_number
  tuple1 = box.tuple.new({'a', 'b', 'c'})      -- Lua table to tuple
  luatable1 = tuple1:totable()                 -- tuple to Lua table
  scalar1, scalar2, scalar3 = tuple1:unpack()  -- tuple to scalars
  tuple2 = box.tuple.new(luatable1)            -- Lua table back to tuple
  field_number = tuple2:find('b')              -- number of the field containing 'b'
  tuple2 = tuple2:transform(field_number, 1)   -- remove that one field
  return 'tuple2 = ' , tuple2 , ' # of bytes = ' , tuple2:bsize()
end

… And here is what happens when one invokes the function:

tarantool> example()
---
- tuple2 =
- ['a', 'c']
- ' # of bytes = '
- 5
...

box.tuple.new()

box.tuple.new(value)

Construct a new tuple from either a scalar or a Lua table. Alternatively, one can get new tuples from Tarantool’s select or insert or replace or update requests, which can be regarded as statements that do new() implicitly.

Parameters:	value (`lua-value`) – the value that will become the tuple contents.
Return:	a new tuple
Rtype:	tuple

In the following example, x will be a new Lua table holding the fields of the inserted tuple, and t will be a new tuple object. Entering t at the console returns the entire tuple t.

Example:

tarantool> x = box.space.tester:insert{
         >   33,
         >   tonumber('1'),
         >   tonumber64('2')
         > }:totable()
---
...
tarantool> t = box.tuple.new{'abc', 'def', 'ghi', 'abc'}
---
...
tarantool> t
---
- ['abc', 'def', 'ghi', 'abc']
...

box.tuple.is()

box.tuple.is(object)

Since versions 2.2.3, 2.3.2, and 2.4.1. A function to check whether a given object is a tuple cdata object. Never raises nor returns an error.

Return:	true or false
Rtype:	boolean

#tuple_object

object tuple_object

#<tuple_object>

The # operator in Lua means “return count of components”. So, if t is a tuple instance, #t will return the number of fields.

Rtype:	number

In the following example, a tuple named t is created and then the number of fields in t is returned.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4'}
---
...
tarantool> #t
---
- 4
...

tuple_object:bsize()

object tuple_object

tuple_object:bsize()

If t is a tuple instance, t:bsize() will return the number of bytes in the tuple. With both the memtx storage engine and the vinyl storage engine the default maximum is one megabyte (memtx_max_tuple_size or vinyl_max_tuple_size). Every field has one or more “length” bytes preceding the actual contents, so bsize() returns a value which is slightly greater than the sum of the lengths of the contents.

The value does not include the size of “struct tuple” (for the current size of this structure look in the tuple.h file in Tarantool’s source code).

Return:	number of bytes
Rtype:	number

In the following example, a tuple named t is created with three fields. Each field takes one byte to store the length and three bytes to store the contents, and one more byte stores the count of fields, so bsize() returns 3*(1+3)+1. This is the same as the size of the string that msgpack.encode({'aaa', 'bbb', 'ccc'}) would return.

tarantool> t = box.tuple.new{'aaa', 'bbb', 'ccc'}
---
...
tarantool> t:bsize()
---
- 13
...
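The arithmetic behind this result can be reproduced in plain Lua. The sketch below uses a hypothetical helper, expected_bsize, and covers only the simplest MessagePack case: an array of fewer than 16 fields where every field is a string shorter than 32 bytes, so that the array header and each length prefix take exactly one byte. Other field types and longer values need larger headers and are not handled.

```lua
-- Predict bsize() for a tuple that holds only short strings.
-- Assumes fewer than 16 fields (one-byte array header) and strings
-- shorter than 32 bytes (one-byte length prefix per string).
local function expected_bsize(strings)
  assert(#strings < 16, 'sketch covers fewer than 16 fields only')
  local size = 1 -- array header with the field count
  for _, s in ipairs(strings) do
    assert(#s < 32, 'sketch covers strings shorter than 32 bytes only')
    size = size + 1 + #s -- length prefix plus contents
  end
  return size
end

print(expected_bsize({'aaa', 'bbb', 'ccc'})) -- 13
print(expected_bsize({'a', 'c'}))            -- 5
```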

tuple_object[field-number]

object tuple_object

<tuple_object>[field-number]

If t is a tuple instance, t[field-number] will return the field numbered field-number in the tuple. The first field is t[1].

Return:	field value.
Rtype:	lua-value

In the following example, a tuple named t is created and then the second field in t is returned.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4'}
---
...
tarantool> t[2]
---
- Fld#2
...

tuple_object[field-name]

object tuple_object

<tuple_object>[field-name]

If t is a tuple instance, t['field-name'] will return the field named ‘field-name’ in the tuple. Fields have names if the tuple has been retrieved from a space that has an associated format. t[lua-variable-name] will do the same thing if lua-variable-name contains 'field-name'.

There is a variation which the Lua manual calls “syntactic sugar”: use t.field-name as an equivalent of t['field-name'].

Return:	field value.
Rtype:	lua-value

In the following example, a tuple named t is returned from replace and then the second field in t named ‘field2’ is returned.

tarantool> format = {}
---
...
tarantool> format[1] = {name = 'field1', type = 'unsigned'}
---
...
tarantool> format[2] = {name = 'field2', type = 'string'}
---
...
tarantool> s = box.schema.space.create('test', {format = format})
---
...
tarantool> pk = s:create_index('pk')
---
...
tarantool> t = s:replace{1, 'Я'}
---
...
tarantool> t['field2']
---
- Я
...

tuple_object[field-path]

object tuple_object

<tuple_object>[field-path]

If t is a tuple instance, t['path'] will return the field or subset of fields that are in path. path must be a well-formed JSON specification. path may contain field names if the tuple has been retrieved from a space that has an associated format.

To prevent ambiguity, Tarantool first tries to interpret the request as tuple_object[field-number] or tuple_object[field-name]. If and only if that fails, Tarantool tries to interpret the request as tuple_object[field-path].

The path may be preceded by ‘.’. The ‘.’ is a signal that the path acts as a suffix for the tuple.

The advantage of specifying a path is that Tarantool will use it to search through a tuple body and get only the tuple part, or parts, that are actually necessary.

In the following example, a tuple named t is returned from replace and then only the relevant part (in this case, matching a name) of a relevant field is returned. Namely: the second field, its third item, the value following ‘key=’.

tarantool> format = {}
---
...
tarantool> format[1] = {name = 'field1', type = 'unsigned'}
---
...
tarantool> format[2] = {name = 'field2', type = 'array'}
---
...
tarantool> s = box.schema.space.create('test', {format = format})
---
...
tarantool> pk = s:create_index('pk')
---
...
tarantool> field2_value = {1, "ABC", {key="Hello", value="world"}}
---
...
tarantool> t = s:replace{1, field2_value}
---
...
tarantool> t["[2][3]['key']"]
---
- Hello
...

tuple_object:find(), tuple_object:findall()

object tuple_object

tuple_object:find([field-number, ]search-value)

tuple_object:findall([field-number, ]search-value)

If t is a tuple instance, t:find(search-value) will return the number of the first field in t that matches the search value, and t:findall(search-value [, search-value ...]) will return numbers of all fields in t that match the search value. Optionally one can put a numeric argument field-number before the search-value to indicate “start searching at field number field-number.”

Return:	the number of the field in the tuple.
Rtype:	number

In the following example, a tuple named t is created and then: the number of the first field in t which matches ‘a’ is returned, then the numbers of all the fields in t which match ‘a’ are returned, then the numbers of all the fields in t which match ‘a’ and are at or after the second field are returned.

tarantool> t = box.tuple.new{'a', 'b', 'c', 'a'}
---
...
tarantool> t:find('a')
---
- 1
...
tarantool> t:findall('a')
---
- 1
- 4
...
tarantool> t:findall(2, 'a')
---
- 4
...
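The role of the optional field-number argument may be easier to see in a small model. The following plain-Lua sketch mimics the documented behavior on an ordinary Lua table; it is not Tarantool's implementation, and the names model_find and model_findall are hypothetical.

```lua
local unpack = table.unpack or unpack -- Lua 5.2+ vs. Lua 5.1/LuaJIT

-- Model of findall(): return the numbers of all matching fields,
-- optionally starting the search at a given field number.
local function model_findall(fields, ...)
  local start, value = 1, nil
  if select('#', ...) >= 2 then
    start, value = ... -- (field-number, search-value)
  else
    value = ...        -- (search-value) only
  end
  local found = {}
  for i = start, #fields do
    if fields[i] == value then
      found[#found + 1] = i
    end
  end
  return unpack(found)
end

-- Model of find(): return only the first match, or nil if there is none.
local function model_find(fields, ...)
  return (model_findall(fields, ...))
end

local fields = {'a', 'b', 'c', 'a'}
print(model_find(fields, 'a'))       -- 1
print(model_findall(fields, 'a'))    -- 1    4
print(model_findall(fields, 2, 'a')) -- 4
```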

tuple_object:format()

object tuple_object

tuple_object:format()

Get the format of a tuple. The resulting table lists the fields of a tuple (their names and types) if the format option was specified during the tuple creation. Otherwise, the return value is empty.

Return:	the tuple format.
Rtype:	table

Note

tuple_object:format() is equivalent to box.tuple.format(tuple_object).

Example:

A formatted tuple:

tarantool> f = box.tuple.format.new({{'id', 'number'}, {'name', 'string'}})
---
...

tarantool> ftuple = box.tuple.new({1, 'Bob'}, {format = f})
---
...

tarantool> ftuple:format()
---
- [{'name': 'id', 'type': 'number'}, {'name': 'name', 'type': 'string'}]
...

tarantool> box.tuple.format(ftuple)
---
- [{'name': 'id', 'type': 'number'}, {'name': 'name', 'type': 'string'}]
...

A tuple without a format:

tarantool> tuple1 = box.tuple.new({1, 'Bob'}) -- no format
---
...

tarantool> tuple1:format()
---
- []
...

tarantool> box.tuple.format(tuple1)
---
- []
...

tuple_object:info()

object tuple_object

tuple_object:info()

Get information about the tuple memory usage.

Returns a table with the following fields:

data_size – size of MessagePack data in the tuple. This number equals the number returned by tuple_object:bsize().
header_size – size of the internal tuple header.
field_map_size – size of the field map. The field map is used to speed up access to indexed fields of the tuple.
waste_size – amount of excess memory wasted due to internal fragmentation in the slab allocator.
arena – type of the arena where the tuple is allocated. Possible values are: memtx, malloc, runtime.

Return:	tuple memory usage statistics
Rtype:	table

Example

tarantool> box.space.tester:get('222200000'):info()
---
- data_size: 55
  waste_size: 95
  arena: memtx
  field_map_size: 4
  header_size: 6
...

tuple_object:next()

object tuple_object

tuple_object:next([pos])

An analogue of the Lua next() function, but for a tuple object. When called without arguments, tuple:next() returns the first field from a tuple. Otherwise, it returns the field next to the indicated position.

However, tuple:next() is not really efficient, so it is better to use tuple:pairs() or tuple:ipairs() for iterating over fields.

Return:	field number and field value
Rtype:	number and field type

tarantool> tuple = box.tuple.new({5, 4, 3, 2, 0})
---
...

tarantool> tuple:next()
---
- 1
- 5
...

tarantool> tuple:next(1)
---
- 2
- 4
...

tarantool> ctx, field = tuple:next()
---
...

tarantool> while field do
         > print(field)
         > ctx, field = tuple:next(ctx)
         > end
5
4
3
2
0
---
...

tuple_object:pairs(), tuple_object:ipairs()

object tuple_object

tuple_object:pairs()

tuple_object:ipairs()

In Lua, pairs(lua-table-value) is a function which returns: function, lua-table-value, nil. Tarantool has extended this so that tuple-value:pairs() returns: function, tuple-value, nil. It is useful for Lua iterators, because Lua iterators traverse a value’s components until an end marker is reached.

tuple_object:ipairs() is the same as pairs(), because tuple fields are always indexed by integers.

Return:	function, tuple-value, nil
Rtype:	function, lua-value, nil

In the following example, a tuple named t is created and then all its fields are selected using a Lua for-end loop.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5'}
---
...
tarantool> tmp = ''
---
...
tarantool> for k, v in t:pairs() do
         >   tmp = tmp .. v
         > end
---
...
tarantool> tmp
---
- Fld#1Fld#2Fld#3Fld#4Fld#5
...

tuple_object:totable()

object tuple_object

tuple_object:totable([start-field-number[, end-field-number]])

If t is a tuple instance, t:totable() will return all fields, t:totable(1) will return all fields starting with field number 1, t:totable(1,5) will return all fields between field number 1 and field number 5.

It is preferable to use t:totable() rather than t:unpack().

Return:	field(s) from the tuple
Rtype:	lua-table

In the following example, a tuple named t is created, then all its fields are selected, then the result is returned.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5'}
---
...
tarantool> t:totable()
---
- ['Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5']
...

tuple_object:tomap()

object tuple_object

tuple_object:tomap([options])

A Lua table can have indexed values, also called key:value pairs. For example, here:

a = {}; a['field1'] = 10; a['field2'] = 20

a is a table with “field1: 10” and “field2: 20”.

The tuple_object:totable() function only returns a table containing the values. But the tuple_object:tomap() function returns a table containing not only the values, but also the key:value pairs.

This only works if the tuple comes from a space that has been formatted with a format clause.

Parameters:	options (`table`) – the only possible option is `names_only`. If `names_only` is false or omitted (default), then all the fields will appear twice, first with numeric headings and second with name headings. If `names_only` is true, then all the fields will appear only once, with name headings.
Return:	field-number:value pair(s) and key:value pair(s) from the tuple
Rtype:	lua-table

In the following example, a tuple named t1 is returned from a space that has been formatted, then tables named t1map and t1map_names_only are produced from t1.

format = {{'field1', 'unsigned'}, {'field2', 'unsigned'}}
s = box.schema.space.create('test', {format = format})
s:create_index('pk',{parts={1,'unsigned',2,'unsigned'}})
t1 = s:insert{10, 20}
t1map = t1:tomap()
t1map_names_only = t1:tomap({names_only=true})

t1map will contain “1: 10”, “2: 20”, “field1: 10”, “field2: 20”.

t1map_names_only will contain “field1: 10”, “field2: 20”.
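The effect of names_only can be modeled in plain Lua. The sketch below is only an illustration of the documented result, not Tarantool's implementation; the function model_tomap is hypothetical and takes the field names and field values as two ordinary Lua tables.

```lua
-- Model of tomap(): build a table with name keys and, unless
-- names_only is true, numeric keys as well.
local function model_tomap(names, values, options)
  local names_only = options ~= nil and options.names_only == true
  local map = {}
  for i, value in ipairs(values) do
    if not names_only then
      map[i] = value       -- numeric heading
    end
    if names[i] ~= nil then
      map[names[i]] = value -- name heading
    end
  end
  return map
end

local names = {'field1', 'field2'}
local values = {10, 20}

local both = model_tomap(names, values)
print(both[1], both[2], both.field1, both.field2) -- 10  20  10  20

local named = model_tomap(names, values, {names_only = true})
print(named[1], named.field1, named.field2)       -- nil  10  20
```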

tuple_object:transform()

object tuple_object

tuple_object:transform(start-field-number, fields-to-remove[, field-value, ...])

If t is a tuple instance, t:transform(start-field-number,fields-to-remove) will return a tuple where, starting from field start-field-number, a number of fields (fields-to-remove) are removed. Optionally one can add more arguments after fields-to-remove to indicate new values that will replace what was removed.

If the original tuple comes from a space that has been formatted with a format clause, the formatting will not be preserved for the result tuple.

Parameters:
- start-field-number (`integer`) – base 1, may be negative
- fields-to-remove (`integer`) – the number of fields to remove
- field-value(s) (`lua-value`) – optional new values that replace what was removed
Return:	tuple
Rtype:	tuple

In the following example, a tuple named t is created and then, starting from the second field, two fields are removed but one new one is added, then the result is returned.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5'}
---
...
tarantool> t:transform(2, 2, 'x')
---
- ['Fld#1', 'x', 'Fld#4', 'Fld#5']
...

tuple_object:unpack()

object tuple_object

tuple_object:unpack([start-field-number[, end-field-number]])

If t is a tuple instance, t:unpack() will return all fields, t:unpack(1) will return all fields starting with field number 1, t:unpack(1,5) will return all fields between field number 1 and field number 5.

Return:	field(s) from the tuple.
Rtype:	lua-value(s)

In the following example, a tuple named t is created and then all its fields are selected, then the result is returned.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5'}
---
...
tarantool> t:unpack()
---
- Fld#1
- Fld#2
- Fld#3
- Fld#4
- Fld#5
...

tuple_object:update()

object tuple_object

tuple_object:update({{operator, field_no, value}, ...})

Update a tuple.

This function updates a tuple which is not in a space. Compare the function box.space.space-name:update(key, {{operator, field_no, value}, ...}), which updates a tuple in a space.

For details, see the description of operator, field_no, and value in the section on box.space.space-name:update().

If the original tuple comes from a space that has been formatted with a format clause, the formatting will be preserved for the result tuple.

Parameters:
- operator (`string`) – operation type represented as a string (e.g. ‘`=`’ for ‘assign new value’)
- field_no (`number`) – the field to which the operation will be applied. The field number can be negative, meaning the position from the end of the tuple (#tuple + negative field number + 1).
- value (`lua_value`) – the value that will be applied
Return:	new tuple
Rtype:	tuple

In the following example, a tuple named t is created and then its second field is updated to equal ‘B’.

tarantool> t = box.tuple.new{'Fld#1', 'Fld#2', 'Fld#3', 'Fld#4', 'Fld#5'}
---
...
tarantool> t:update({{'=', 2, 'B'}})
---
- ['Fld#1', 'B', 'Fld#3', 'Fld#4', 'Fld#5']
...
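The rule for negative field numbers (#tuple + negative field number + 1) can be checked with a few lines of plain Lua. The helper name resolve_field_no is hypothetical and exists only to illustrate the formula.

```lua
-- Translate a possibly negative field number into a position
-- counted from the start of the tuple.
local function resolve_field_no(tuple_len, field_no)
  if field_no < 0 then
    return tuple_len + field_no + 1
  end
  return field_no
end

-- For the five-field tuple from the example above:
print(resolve_field_no(5, -1)) -- 5 (the last field)
print(resolve_field_no(5, -5)) -- 1 (the first field)
print(resolve_field_no(5, 2))  -- 2 (positive numbers are unchanged)
```

Under this rule, an operation such as {'=', -1, 'Z'} would presumably address the last field, 'Fld#5', of the tuple above.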

Since Tarantool 2.3 a tuple can also be updated via JSON paths.

tuple_object:upsert()

object tuple_object

tuple_object:upsert({{operator, field_no, value}, ...})

The same as tuple_object:update(), but ignores errors. In case of an error the tuple is left intact, but an error message is printed. Only client errors are ignored, such as a bad field type or a wrong field index/name. System errors, such as OOM, are not ignored and are raised just like with a normal update(). Note that only bad operations are ignored. All correct operations are applied.

Parameters:
- operator (`string`) – operation type represented as a string (e.g. ‘`=`’ for ‘assign new value’)
- field_no (`number`) – the field to which the operation will be applied. The field number can be negative, meaning the position from the end of the tuple (#tuple + negative field number + 1).
- value (`lua_value`) – the value which will be applied
Return:	new tuple
Rtype:	tuple

See the following example where one operation is applied, and one is not.

tarantool> t = box.tuple.new({1, 2, 3})
tarantool> t2 = t:upsert({{'=', 5, 100}})
UPSERT operation failed:
ER_NO_SUCH_FIELD_NO: Field 5 was not found in the tuple
---
...

tarantool> t
---
- [1, 2, 3]
...

tarantool> t2
---
- [1, 2, 3]
...

tarantool> t2 = t:upsert({{'=', 5, 100}, {'+', 1, 3}})
UPSERT operation failed:
ER_NO_SUCH_FIELD_NO: Field 5 was not found in the tuple
---
...

tarantool> t
---
- [1, 2, 3]
...

tarantool> t2
---
- [4, 2, 3]
...

Functions for transaction management

For general information and examples, see section Transactions.

Observe the following rules when working with transactions:

Rule #1

The requests in a transaction must be sent to a server as a single block. It is not enough to enclose them between begin and commit or rollback. To ensure they are sent as a single block: put them in a function, or put them all on one line, or use a delimiter so that multi-line requests are handled together.

Rule #2

All database operations in a transaction should use the same storage engine. It is not safe to access tuple sets that are defined with {engine='vinyl'} and also access tuple sets that are defined with {engine='memtx'}, in the same transaction.

Rule #3

Requests which cause changes to the data definition – create, alter, drop, truncate – are only allowed with Tarantool version 2.1 or later. Data-definition requests which change an index or change a format, such as space_object:create_index() and space_object:format(), are not allowed inside transactions except as the first request after box.begin().

Below is a list of all functions for transaction management.

Name	Use
box.begin()	Begin the transaction
box.commit()	End the transaction and save all changes
box.rollback()	End the transaction and discard all changes
box.savepoint()	Get a savepoint descriptor
box.rollback_to_savepoint()	Do not end the transaction and discard all changes made after a savepoint
box.atomic()	Execute a function, treating it as a transaction
box.on_commit()	Define a trigger that will be activated by `box.commit`
box.on_rollback()	Define a trigger that will be activated by `box.rollback`
box.is_in_txn()	State whether a transaction is in progress

box.begin()

box.begin([opts])

Begin the transaction. Disable implicit yields until the transaction ends. Signal that writes to the write-ahead log will be deferred until the transaction ends. In effect the fiber which executes box.begin() is starting an “active multi-request transaction”, blocking all other fibers.

Parameters:	opts (`table`) – (optional) transaction options:
	`txn_isolation` – the transaction isolation level
	`timeout` – a timeout (in seconds), after which the transaction is rolled back

Possible errors:

error if this operation is not permitted because there is already an active transaction.
error if for some reason memory cannot be allocated.
error and abort the transaction if the timeout is exceeded.

Example

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Begin and commit the transaction explicitly --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:replace { 1, 'Pink Floyd', 1965 }
box.commit()

-- Begin the transaction with the specified isolation level --
box.begin({ txn_isolation = 'read-committed' })
box.space.bands:insert { 5, 'The Rolling Stones', 1962 }
box.space.bands:replace { 1, 'The Doors', 1965 }
box.commit()

box.commit()

box.commit()

End the transaction, and make all its data-change operations permanent.

Possible errors:

error and abort the transaction in case of a conflict.
error if the operation fails to write to disk.
error if for some reason memory cannot be allocated.

Example

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Begin and commit the transaction explicitly --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:replace { 1, 'Pink Floyd', 1965 }
box.commit()

-- Begin the transaction with the specified isolation level --
box.begin({ txn_isolation = 'read-committed' })
box.space.bands:insert { 5, 'The Rolling Stones', 1962 }
box.space.bands:replace { 1, 'The Doors', 1965 }
box.commit()

box.rollback()

box.rollback()

End the transaction, but cancel all its data-change operations. An explicit call to functions outside box.space that always yield, such as fiber.sleep() or fiber.yield(), will have the same effect.

Example

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Rollback the transaction --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.space.bands:replace { 1, 'Pink Floyd', 1965 }
box.rollback()
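
A common way to combine box.rollback() with Lua error handling is to wrap the body of the transaction in pcall() and roll back if anything fails. The sketch below is one possible shape of that pattern, not a prescribed one. The space name bands_rollback_demo and the helper insert_all() are hypothetical, and the code is meant to be run as a script so that the whole transaction is sent as one block.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

-- Hypothetical demo space, emptied so that the example can be re-run
local s = box.schema.space.create('bands_rollback_demo', { if_not_exists = true })
s:create_index('primary', { if_not_exists = true })
s:truncate()

-- Insert all rows or none of them
local function insert_all(rows)
    box.begin()
    local ok, err = pcall(function()
        for _, row in ipairs(rows) do
            s:insert(row)
        end
    end)
    if not ok then
        box.rollback()  -- discard everything done since box.begin()
        return false, err
    end
    box.commit()
    return true
end

-- The second row repeats the primary key, so the whole batch is discarded
local ok, err = insert_all({
    { 1, 'Roxette', 1986 },
    { 1, 'Scorpions', 1965 },
})
print(ok, err, s:count())
```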

box.savepoint()

box.savepoint()

Return a descriptor of a savepoint (type = table), which can be used later by box.rollback_to_savepoint(savepoint). Savepoints can only be created while a transaction is active, and they are destroyed when a transaction ends.

Return:	savepoint table
Rtype:	Lua object

Possible errors:

error if the savepoint cannot be set because there is no active transaction.
error if for some reason memory cannot be allocated.

Example

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Rollback the transaction to a savepoint --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
save1 = box.savepoint()
box.space.bands:replace { 1, 'Pink Floyd', 1965 }
box.rollback_to_savepoint(save1)
box.commit()

box.rollback_to_savepoint()

box.rollback_to_savepoint(savepoint)

Do not end the transaction, but cancel all its data-change and box.savepoint() operations that were done after the specified savepoint.

Possible errors:

error if there is no active transaction.
error if the savepoint does not exist.

Example

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Rollback the transaction to a savepoint --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
save1 = box.savepoint()
box.space.bands:replace { 1, 'Pink Floyd', 1965 }
box.rollback_to_savepoint(save1)
box.commit()
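
Savepoints are often used to make one step of a transaction optional. The sketch below, under the assumption of a hypothetical space named savepoint_demo, undoes only the optional step when it fails and still commits the work done before the savepoint.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

-- Hypothetical demo space, emptied so that the example can be re-run
local s = box.schema.space.create('savepoint_demo', { if_not_exists = true })
s:create_index('primary', { if_not_exists = true })
s:truncate()

box.begin()
s:insert({ 1, 'kept' })
local sp = box.savepoint()

-- Optional step: the second insert repeats key 1 and raises an error
local ok = pcall(function()
    s:insert({ 2, 'optional' })
    s:insert({ 1, 'duplicate key' })
end)
if not ok then
    -- Undo only what was done after the savepoint (the insert of key 2)
    box.rollback_to_savepoint(sp)
end
box.commit()

print(s:count())
```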

box.atomic()

box.atomic([opts, ]tx-function[, function-arguments])

Execute a function, acting as if the function starts with an implicit box.begin() and ends with an implicit box.commit() if successful, or ends with an implicit box.rollback() if there is an error.

Parameters:	opts (`table`) – (optional) transaction options:
	`txn_isolation` – the transaction isolation level
	`timeout` – a timeout (in seconds), after which the transaction is rolled back
	tx-function (`function`) – the function to execute
	function-arguments – (optional) arguments passed to the function
Return:	the result of the function passed to `atomic()` as an argument

Possible errors:

error and abort the transaction in case of a conflict.
error and abort the transaction if the timeout is exceeded.
error if the operation fails to write to disk.
error if for some reason memory cannot be allocated.

Example

-- Create an index with the specified sequence --
box.schema.sequence.create('id_sequence', { min = 1 })
box.space.bands:create_index('primary', { parts = { 'id' }, sequence = 'id_sequence' })

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Define a function --
local function insert_band(band_name, year)
    box.space.bands:insert { nil, band_name, year }
end

-- Begin and commit the transaction implicitly --
box.atomic(insert_band, 'The Beatles', 1960)

-- Begin the transaction with the specified isolation level --
box.atomic({ txn_isolation = 'read-committed' },
        insert_band, 'The Rolling Stones', 1962)
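
To see the implicit rollback in action, the sketch below calls box.atomic() through pcall(). It assumes that box.atomic() re-raises the error after rolling back. The space name atomic_demo and the function insert_pair() are hypothetical.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

-- Hypothetical demo space, emptied so that the example can be re-run
local s = box.schema.space.create('atomic_demo', { if_not_exists = true })
s:create_index('primary', { if_not_exists = true })
s:truncate()

local function insert_pair(a, b)
    s:insert({ a })
    s:insert({ b })
    return 'done'
end

-- Both inserts succeed, so the implicit transaction is committed
local ok1, res = pcall(box.atomic, insert_pair, 1, 2)

-- The second insert repeats key 1, so the insert of key 3 is rolled back too
local ok2, err = pcall(box.atomic, insert_pair, 3, 1)

print(ok1, res, ok2, err, s:count())
```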

box.on_commit()

box.on_commit(trigger-function[, old-trigger-function])

Define a trigger for execution when a transaction ends due to an event such as box.commit().

The trigger function may take an iterator parameter, as described in an example for this section.

The trigger function should not access any database spaces.

If the trigger execution fails and raises an error, the effect is severe and should be avoided – use Lua’s pcall() mechanism around code that might fail.

box.on_commit() must be invoked within a transaction, and the trigger ceases to exist when the transaction ends.

Parameters:	trigger-function (`function`) – function which will become the trigger function
	old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

Details about trigger characteristics are in the triggers section.

Example 1

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Define a function called on commit --
function print_commit_result()
    print('Commit happened')
end

-- Commit the transaction --
box.begin()
box.space.bands:insert { 4, 'The Beatles', 1960 }
box.on_commit(print_commit_result)
box.commit()

Example 2

The function parameter can be an iterator. The iterator goes through the effects of every request that changed a space during the transaction.

The iterator has:

an ordinal request number
the old value of the tuple before the request (nil for an insert request)
the new value of the tuple after the request (nil for a delete request)
the ID of the space

The example below displays the effects of two replace requests:

-- Insert test data --
box.space.bands:insert { 1, 'Roxette', 1986 }
box.space.bands:insert { 2, 'Scorpions', 1965 }
box.space.bands:insert { 3, 'Ace of Base', 1987 }

-- Define a function called on commit --
function print_replace_details(iterator)
    for request_number, old_tuple, new_tuple, space_id in iterator() do
        print('request_number: ' .. tostring(request_number))
        print('old_tuple: ' .. tostring(old_tuple))
        print('new_tuple: ' .. tostring(new_tuple))
        print('space_id: ' .. tostring(space_id))
    end
end

-- Commit the transaction --
box.begin()
box.space.bands:replace { 1, 'The Beatles', 1960 }
box.space.bands:replace { 2, 'The Rolling Stones', 1965 }
box.on_commit(print_replace_details)
box.commit()

The output might look like this:

request_number: 1
old_tuple: [1, 'Roxette', 1986]
new_tuple: [1, 'The Beatles', 1960]
space_id: 512
request_number: 2
old_tuple: [2, 'Scorpions', 1965]
new_tuple: [2, 'The Rolling Stones', 1965]
space_id: 512

box.on_rollback()

box.on_rollback(trigger-function[, old-trigger-function])

Define a trigger for execution when a transaction ends due to an event such as box.rollback().

The parameters and warnings are exactly the same as for box.on_commit().
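
A minimal sketch of a rollback trigger, mirroring Example 1 for box.on_commit(). The space name on_rollback_demo is hypothetical, and the trigger only sets a flag because trigger functions should not access spaces.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

-- Hypothetical demo space, emptied so that the example can be re-run
local s = box.schema.space.create('on_rollback_demo', { if_not_exists = true })
s:create_index('primary', { if_not_exists = true })
s:truncate()

local rolled_back = false

box.begin()
s:insert({ 1, 'temporary' })
-- The trigger must be set inside the transaction and ceases to exist when it ends
box.on_rollback(function()
    rolled_back = true
end)
box.rollback()

print(rolled_back, s:count())
```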

box.is_in_txn()

box.is_in_txn()

If a transaction is in progress (for example, the user has called box.begin() and has not yet called either box.commit() or box.rollback()), return true. Otherwise return false.
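
One possible use of box.is_in_txn() is a helper that joins the caller's transaction when there is one and starts its own otherwise. The helper in_txn() and the function probe() below are hypothetical names used only for this sketch.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

-- Hypothetical helper: run fn inside a transaction, reusing the
-- caller's transaction if one is already active.
local function in_txn(fn, ...)
    if box.is_in_txn() then
        return fn(...)
    end
    return box.atomic(fn, ...)
end

local function probe()
    return box.is_in_txn()
end

local standalone = in_txn(probe)     -- runs inside box.atomic()

box.begin()
local nested = in_txn(probe)         -- runs inside the caller's transaction
local still_open = box.is_in_txn()   -- the helper did not end it
box.rollback()

local after = box.is_in_txn()
print(standalone, nested, still_open, after)
```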

Functions for SQL

The box module contains some functions related to SQL:

box.schema.func.create – for making Lua functions callable from SQL statements. See Calling Lua routines from SQL in the SQL Plus Lua section.
box.execute – for making SQL statements callable from Lua functions. See the SQL user guide.
box.prepare and box.unprepare.

Some SQL statements are illustrated in the SQL tutorial.

Below is a list of all SQL functions and members.

Name	Use
box.execute()	Execute an SQL statement
box.prepare()	Prepare an SQL statement
object prepared_table	Methods for prepared SQL statement

box.execute()

box.execute(sql-statement[, extra-parameters])

Execute the SQL statement contained in the sql-statement parameter.

Parameters:	sql-statement (`string`) – statement, which should conform to the rules for SQL grammar
	extra-parameters (`table`) – optional table for placeholders in the statement
Return:	depends on statement

There are two ways to pass extra parameters to box.execute():

The first way, which is the preferred way, is to put placeholders in the string, and pass a second argument, an extra-parameters table. A placeholder is either a question mark “?”, or a colon “:” followed by a name. An extra parameter is any Lua expression.

If placeholders are question marks, then they are replaced by extra-parameters values in corresponding positions. That is, the first ? is replaced by the first extra parameter, the second ? is replaced by the second extra parameter, and so on.

If placeholders are :names, then they are replaced by extra-parameters values with corresponding names.

For example, this request that contains literal values 1 and 'x':
```
box.execute([[INSERT INTO tt VALUES (1, 'x');]]);
```
… is the same as the request below containing two question-mark placeholders (? and ?) and a two-element extra-parameters table:
```
x = {1,'x'}
box.execute([[INSERT INTO tt VALUES (?, ?);]], x);
```
… and is the same as this request containing two :name placeholders (:a and :b) and a two-element extra-parameters table with elements named “a” and “b”:
```
box.execute([[INSERT INTO tt VALUES (:a, :b);]], {{[':a']=1},{[':b']='x'}})
```
The second way is to concatenate strings. For example, the Lua script below inserts 10 rows with different primary-key values into table t:
```
for i=1,10,1 do
    box.execute("insert into t values (" .. i .. ")")
end
```
When creating SQL statements based on user input, application developers should beware of SQL injection.
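
The sketch below shows why placeholders are the safer choice: a string that looks like an injection attempt is stored as plain data instead of being parsed as SQL. The table name users_demo and the input string are made up for the example, and the code assumes a Tarantool version in which box.execute() returns nil plus an error object on failure.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

box.execute([[DROP TABLE IF EXISTS users_demo;]])
box.execute([[CREATE TABLE users_demo (id INTEGER PRIMARY KEY, name STRING);]])

-- Untrusted input that would break a concatenated statement
local user_input = "x'); DROP TABLE users_demo; --"

-- The value is bound to the second placeholder and never parsed as SQL
local res, err = box.execute([[INSERT INTO users_demo VALUES (?, ?);]],
                             { 1, user_input })
assert(res, tostring(err))

local sel = box.execute([[SELECT name FROM users_demo WHERE id = ?;]], { 1 })
print(sel.rows[1][1])
```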

Since box.execute() is an invocation of a Lua function, it either causes an error message or returns a value.

For some statements the returned value contains a field named rowcount, for example:

tarantool> box.execute([[CREATE TABLE table1 (column1 INT PRIMARY KEY, column2 VARCHAR(10));]])
---
- rowcount: 1
...
tarantool> box.execute([[INSERT INTO table1 VALUES (55,'Hello SQL world!');]])
---
- rowcount: 1
...

For statements that cause generation of values for PRIMARY KEY AUTOINCREMENT columns, there is a field named autoincrement_id.

For SELECT or PRAGMA statements, the returned value is a result set, containing a field named metadata (a table with column names and Tarantool/NoSQL type names) and a field named rows (a table with the contents of each row).

For example, for a statement SELECT "x" FROM t WHERE "x"=5; where "x" is an INTEGER column and there is one row, a display on the Tarantool client might look like this:

tarantool> box.execute([[SELECT "x" FROM t WHERE "x"=5;]])
---
- metadata:
  - name: x
    type: integer
  rows:
  - [5]
...

For a look at raw format of SELECT results, see Binary protocol – responses for SQL.

The order of components within a map is not guaranteed.

If sql_full_metadata in the _session_settings system table is TRUE, then result set metadata may include these things in addition to name and type:

collation (present only if COLLATE clause is specified for a STRING) = “Collation”.
is_nullable (present only if the select list specified a base table column and nothing else) = false if column was defined as NOT NULL, otherwise true. If this is not present, that implies that nullability is unknown.
is_autoincrement (present only if the select list specified a base table column and nothing else) = true if column was defined as PRIMARY KEY AUTOINCREMENT, otherwise false.
span (always present) = the original expression in a select list, which often is the same as name if the select list specifies a column name and nothing else, but otherwise differs, for example, after SELECT x+55 AS x FROM t; the name is X and the span is x+55. If span and name are the same then the content is MP_NIL.

Alternative: if you are using the Tarantool server as a client, you can switch languages as follows:

\set language sql
\set delimiter ;

Afterwards, you can enter any SQL statement directly without needing box.execute().

There is also an execute() function available in module net.box. For example, you can execute conn:execute(sql-statement[, extra-parameters]) after conn = net_box.connect(url-string).

box.prepare()

box.prepare(sql-statement)

Prepare the SQL statement contained in the sql-statement parameter. The syntax and requirements for box.prepare are the same as for box.execute().

Parameters:	sql-statement (`string`) – statement, which should conform to the rules for SQL grammar
Return:	prepared_table, with id and methods and metadata
Rtype:	table

box.prepare compiles an SQL statement into byte code and saves the byte code in a cache. Since compiling takes a significant amount of time, preparing a statement will enhance performance if the statement is executed many times.

If box.prepare succeeds, prepared_table contains:

stmt_id: integer – an identifier generated by a hash of the statement string
execute: function
params: map [name : string, type : string] – parameter descriptions
unprepare: function
metadata: map [name : string, type : string] (This is present only for SELECT or PRAGMA statements and has the same contents as the result set metadata for box.execute)
param_count: integer – number of parameters

This can be used by prepared_table:execute() and by prepared_table:unprepare().

The prepared statement cache (which is also called the prepared statement holder) is “shared”, that is, there is one cache for all sessions. However, session X cannot execute a statement prepared by session Y.
For monitoring the cache, see box.info().sql.
For changing the cache size, use sql.cache_size.

Prepared statements will “expire” (become invalid) if any database object is dropped or created or altered – even if the object is not mentioned in the SQL statement, even if the create or drop or alter is rolled back, even if the create or drop or alter is done in a different session.

object prepared_table

object prepared_table

prepared_table:execute([extra-parameters])

Execute a statement that has been prepared with box.prepare().

Parameter prepared_table should be the result from box.prepare().

Parameter extra-parameters should be an optional table to match placeholders or named parameters in the statement.

There are two ways to execute: with the method or with the statement id. That is, prepared_table:execute() and box.execute(prepared_table.stmt_id) do the same thing.

Example: here is a test. This function inserts a million rows in a table using a prepared INSERT statement.

function f()
  local p, start_time, end_time
  box.execute([[DROP TABLE IF EXISTS t;]])
  box.execute([[CREATE TABLE t (s1 INTEGER PRIMARY KEY);]])
  start_time = os.time()
  -- Compile the statement once, then execute it with a new parameter each time
  p = box.prepare([[INSERT INTO t VALUES (?);]])
  for i=1,1000000 do p:execute({i}) end
  p:unprepare()
  end_time = os.time()
  box.execute([[COMMIT;]])
  print(end_time - start_time) -- elapsed time in seconds
end
f()

Take note of the elapsed time. Now change the line with the loop to:
for i=1,1000000 do box.execute([[INSERT INTO t VALUES (?);]], {i}) end
Run the function again, and take note of the elapsed time again. The function which executes the prepared statement will be about 15% faster, though of course this will vary depending on Tarantool version and environment.

prepared_table:unprepare()

Undo the result of an earlier box.prepare() request. This is equivalent to standard-SQL DEALLOCATE PREPARE.

Parameter prepared_table should be the result from box.prepare().

There are two ways to unprepare: with the method or with the statement id. That is, prepared_table:unprepare() and box.unprepare(prepared_table.stmt_id) do the same thing.

Tarantool strongly recommends using unprepare as soon as the immediate objective (executing a prepared statement multiple times) is done, or whenever a prepared statement expires. There is no automatic eviction policy, although automatic unprepare will happen when the session disconnects (the session’s prepared statements will be removed from the prepared-statement cache).
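
The lifecycle described above (prepare, execute several times, unprepare as soon as the work is done) can be sketched as follows. The table name prep_demo is hypothetical, and the code assumes that box.prepare() returns nil plus an error object on failure, like box.execute().

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

box.execute([[DROP TABLE IF EXISTS prep_demo;]])
box.execute([[CREATE TABLE prep_demo (id INTEGER PRIMARY KEY);]])

-- Prepare after the DDL above: creating or dropping objects expires prepared statements
local stmt, err = box.prepare([[INSERT INTO prep_demo VALUES (?);]])
assert(stmt, tostring(err))

for i = 1, 3 do
    stmt:execute({ i })
end

-- Release the cache entry once the statement is no longer needed
stmt:unprepare()

local count = box.execute([[SELECT COUNT(*) FROM prep_demo;]]).rows[1][1]
print(count)
```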

Event watchers

Since 2.10.0.

The box module contains some features related to event subscriptions, also known as watchers. The subscriptions are used to inform the client about server-side events. Each event subscription is defined by a certain key.

Event: An event is a state change or a system update that triggers the action of other systems. To read more about built-in events in Tarantool, check the system events section.
State: A state is an internally stored key-value pair. The key is a string. The value is an arbitrary type that can be encoded as MsgPack. To update a state, use the box.broadcast() function.
Watcher: A watcher is a callback that is invoked when a state change occurs. To register a local watcher, use the box.watch() function. To create a remote watcher, use the watch() function from the net.box module. Note that it is possible to register more than one watcher for the same key.

How a watcher works

First, you register a watcher. After that, the watcher callback is invoked for the first time. In this case, the callback is triggered whether or not the key has already been broadcast. All subsequent invocations are triggered with box.broadcast() called on the remote host. If a watcher is subscribed for a key that has not been broadcast yet, the callback is triggered only once, after the registration of the watcher.

The watcher callback takes two arguments. The first argument is the name of the key for which it was registered. The second one contains current key data. The callback is always invoked in a new fiber. It means that it is allowed to yield in it. A watcher callback is never executed in parallel with itself. If the key is updated while the watcher callback is running, the callback will be invoked again with the new value as soon as it returns.
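
The sequence described above can be observed with a small local sketch. The key name demo.key is hypothetical, and a fiber channel is used only to hand the values received by the callback back to the calling fiber.

```lua
local fiber = require('fiber')

-- Buffered channel that collects the values seen by the watcher
local seen = fiber.channel(10)

-- The key is broadcast before any watcher exists
box.broadcast('demo.key', 1)

local w = box.watch('demo.key', function(key, value)
    seen:put(value)
end)

-- First invocation: triggered by the registration itself
local first = seen:get(1)

-- Subsequent invocations: triggered by box.broadcast()
box.broadcast('demo.key', 2)
local second = seen:get(1)

w:unregister()
print(first, second)
```

The calls to seen:get() yield, which gives the callback fiber a chance to run; the one-second timeout is an arbitrary safety margin.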

box.watch and box.broadcast functions can be used before box.cfg.

Below is a list of all functions and pages related to watchers or events.

Name	Use
box.watch()	Create a local watcher.
box.watch_once()	Get the current key value.
conn:watch()	Create a watcher for the remote host.
box.broadcast()	Update a state.
Built-in events	Predefined events in Tarantool

box.watch()

box.watch(key, func)

Subscribe to events broadcast by a local host.

Parameters:	key (`string`) – key name of the event to subscribe to
	func (`function`) – callback to invoke when the key value is updated
Return:	a watcher handle. The handle consists of one method – `unregister()`, which unregisters the watcher.

To read more about watchers, see the Functions for watchers section.

Note

Keep in mind that garbage collection of a watcher handle doesn’t lead to the watcher’s destruction. In this case, the watcher remains registered. It is okay to discard the result of the watch function if the watcher will never be unregistered.

Example:

-- Broadcast value 42 for the 'foo' key.
box.broadcast('foo', 42)

local log = require('log')
-- Subscribe to updates of the 'foo' key.
local w = box.watch('foo', function(key, value)
    assert(key == 'foo')
    log.info("The foo value is '%d'", value)
end)

If you don’t need the watcher anymore, you can unregister it using the command below:

w:unregister()

box.watch_once()

box.watch_once(key)

Returns the current value of a given notification key. The function can be used as an alternative to box.watch() when the caller only needs the current value without subscribing to future changes.

Parameters:	key (`string`) – key name
Return:	the key value

To read more about watchers, see the Event watchers section.

Example:

-- Broadcast value 42 for the 'foo' key.
box.broadcast('foo', 42)

-- Get the value of this key
tarantool> box.watch_once('foo')
---
- 42
...

-- Non-existent keys' values are empty
tarantool> box.watch_once('none')
---
...

box.broadcast()

box.broadcast(key, value)

Update the value of a particular key and notify all key watchers of the update.

Parameters:	key (`string`) – key name of the event to update
	value – any data that can be encoded in MsgPack
Return:	none

Possible errors:

The value can’t be encoded as MsgPack.
The key refers to a system event, that is, its name starts with the reserved box. prefix.

Example:

-- Broadcast value 42 for the 'foo' key.
box.broadcast('foo', 42)

System events

Since 2.10.0.

Predefined events have a special naming schema – their names always start with the reserved box. prefix. It means that you cannot create new events with it.

The system processes the following events:

box.id
box.status
box.election
box.schema
box.shutdown

In response to each event, the server sends back certain IPROTO fields.

The events are available from the beginning as non-MP_NIL. If a watcher subscribes to a system event before it has been broadcast, it receives an empty table for the event value.

The event is generated when there is a change in any of the values listed in the event. For example, see the parameters in the box.id event below – id, instance_uuid, and replicaset_uuid. Suppose the id value (box.info.id) has changed. This triggers the box.id event, which states that the value of box.info.id has changed, while box.info.uuid and box.info.cluster.uuid remain the same.

box.id

Contains identification of the instance. Value changes are rare.

id: the numeric instance ID is unknown before the registration. For anonymous replicas, the value is 0 until they are officially registered.
instance_uuid: the UUID of the instance never changes after the first box.cfg. The value is unknown before the box.cfg call.
replicaset_uuid: the value is unknown until the instance joins a replicaset or boots a new one.

-- box.id value
{
MP_STR “id”: MP_UINT box.info.id,
MP_STR “instance_uuid”: MP_UUID box.info.uuid,
MP_STR “replicaset_uuid”: MP_UUID box.info.cluster.uuid,
}

box.status

Contains generic information about the instance status.

is_ro: indicates the read-only mode or the orphan status.
is_ro_cfg: indicates the read_only mode for the instance.
status: shows the status of an instance.

{
MP_STR “is_ro”: MP_BOOL box.info.ro,
MP_STR “is_ro_cfg”: MP_BOOL box.cfg.read_only,
MP_STR “status”: MP_STR box.info.status,
}

box.election

Contains fields of box.info.election that are necessary to find out the most recent writable leader.

term: shows the current election term.
role: indicates the election state of the node – leader, follower, or candidate.
is_ro: indicates the read-only mode or the orphan status.
leader: shows the leader node ID in the current term.

{
MP_STR “term”: MP_UINT box.info.election.term,
MP_STR “role”: MP_STR box.info.election.state,
MP_STR “is_ro”: MP_BOOL box.info.ro,
MP_STR “leader”: MP_UINT box.info.election.leader,
}

box.schema

Contains schema-related data.

version: shows the schema version.

{
MP_STR “version”: MP_UINT schema_version,
}

box.shutdown

Contains a boolean value which indicates whether there is an active shutdown request.

The event is generated when the server receives a shutdown request (os.exit() command or SIGTERM signal).

The box.shutdown event is applied for the graceful shutdown protocol. It is a feature which is available since 2.10.0. This protocol is supposed to be used with connectors to signal a client about the upcoming server shutdown and close active connections without broken requests. For more information, refer to the graceful shutdown protocol section.

Usage example

local net_box = require('net.box')
local conn = net_box.connect(URI)
local log = require('log')
-- Subscribe to updates of key 'box.id'
local w = conn:watch('box.id', function(key, value)
    assert(key == 'box.id')
    log.info("The box.id value is '%s'", value)
end)

If you want to unregister the watcher when it’s no longer needed, use the following command:

w:unregister()
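
System events can also be observed on the local instance with box.watch(). The sketch below reads the current value of box.status once and then unregisters the watcher. It assumes the instance has already been configured, so that the fields listed above are filled in.

```lua
local fiber = require('fiber')

box.cfg{}  -- assumption: a scratch instance with default settings

-- One-slot channel: keep the first delivered value, drop later ones
local ch = fiber.channel(1)

local w = box.watch('box.status', function(key, value)
    ch:put(value, 0)  -- zero timeout: do not block the callback if the slot is taken
end)

-- Yield until the callback has delivered the current value (or one second passes)
local status = ch:get(1)
w:unregister()

print(status.is_ro, status.is_ro_cfg, status.status)
```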

Function box.once

box.once(key, function[, ...])

Execute a function, provided it has not been executed before. A passed value is checked to see whether the function has already been executed. If it has been executed before, nothing happens. If it has not been executed before, the function is invoked.

See an example of using box.once() in Adding storage code.

Warning: If an error occurs inside box.once() when initializing a database, you can re-execute the failed box.once() block without stopping the database. The solution is to delete the once object from the system space _schema. Run box.space._schema:select{}, find your once object there, and delete it.

When box.once() is used for initialization, it may be useful to wait until the database is in an appropriate state (read-only or read-write). In that case, see the functions in the Submodule box.ctl.

Parameters:	key (`string`) – a value that will be checked
	function (`function`) – a function
	... – arguments that must be passed to the function

Note

The parameter key will be stored in the _schema system space after box.once() is called in order to prevent a double run. These keys are global per replica set. So a simultaneous call of box.once() with the same key on two instances of the same replica set may succeed on both of them, but it’ll lead to a transaction conflict.

Example

The example shows how to re-execute the box.once() block that contains the hello key.

First, check the _schema system space. The _schema space in the example contains two box.once objects – oncebye and oncehello:

app:instance001> box.space._schema:select{}
---
- - ['oncebye']
  - ['oncehello']
  - ['replicaset_name', 'replicaset001']
  - ['replicaset_uuid', '72d2d9bf-5d9f-48c4-ba80-9d657e128fee']
  - ['version', 3, 1, 0]
...

Delete the oncehello object:

app:instance001> box.space._schema:delete('oncehello')
---
- ['oncehello']
...

After that, check the _schema space again:

app:instance001> box.space._schema:select{}
---
- - ['oncebye']
  - ['replicaset_name', 'replicaset001']
  - ['replicaset_uuid', '72d2d9bf-5d9f-48c4-ba80-9d657e128fee']
  - ['version', 3, 1, 0]
...

To re-execute the function, call the box.once() method again:

app:instance001> box.once('hello', function() end)
---
...

app:instance001> box.space._schema:select{}
---
- - ['oncebye']
  - ['oncehello']
  - ['replicaset_name', 'replicaset001']
  - ['replicaset_uuid', '72d2d9bf-5d9f-48c4-ba80-9d657e128fee']
  - ['version', 3, 1, 0]
...
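
For completeness, here is a minimal sketch of the typical use of box.once(): one-time schema initialization. The key once_demo_schema_v1, the space name once_demo, and the function init_schema() are hypothetical.

```lua
box.cfg{}  -- assumption: a scratch instance with default settings

local runs = 0

-- Hypothetical initialization function
local function init_schema()
    runs = runs + 1
    local s = box.schema.space.create('once_demo', { if_not_exists = true })
    s:create_index('primary', { if_not_exists = true })
end

-- The first call runs init_schema() unless the key is already stored in _schema
box.once('once_demo_schema_v1', init_schema)
-- The second call finds the key and does nothing
box.once('once_demo_schema_v1', init_schema)

print(runs, box.space._schema:get('onceonce_demo_schema_v1'))
```

The key is stored in _schema with the once prefix, which is why the lookup uses 'onceonce_demo_schema_v1'.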

Function box.snapshot

box.snapshot()

Memtx

Take a snapshot of all data and store it in snapshot.dir/<latest-lsn>.snap. To take a snapshot, Tarantool first enters the delayed garbage collection mode for all data. In this mode, the Tarantool garbage collector does not remove files that were created before the snapshot started until the snapshot has finished. To preserve consistency of the primary key, used to iterate over tuples, a copy-on-write technique is employed. If the master process changes part of a primary key, the corresponding process page is split, and the snapshot process obtains an old copy of the page. In effect, the snapshot process uses multi-version concurrency control in order to avoid copying changes which are superseded while it is running.

Since a snapshot is written sequentially, you can expect a very high write performance (averaging 80 MB/second on modern disks), which means an average database instance gets saved in a matter of minutes. You may restrict the speed by changing snapshot.snap_io_rate_limit.

Note

As long as there are any changes to the parent index memory through concurrent updates, there are going to be page splits, and therefore you need to have some extra free memory to run this command. 10% of memtx_memory is, on average, sufficient. This statement waits until a snapshot is taken and returns operation result.

Note

Change notice: Prior to Tarantool version 1.6.6, the snapshot process caused a fork, which could cause occasional latency spikes. Starting with Tarantool version 1.6.6, the snapshot process creates a consistent read view and this view is written to the snapshot file by a separate thread (the “Write Ahead Log” thread).

Although box.snapshot() does not cause a fork, there is a separate fiber which may produce snapshots at regular intervals – see the discussion of the checkpoint daemon.

Example:

tarantool> box.info.version
---
- 1.7.0-1216-g73f7154
...
tarantool> box.snapshot()
---
- ok
...
tarantool> box.snapshot()
---
- error: can't save snapshot, errno 17 (File exists)
...

Taking a snapshot does not cause the server to start a new write-ahead log. Once a snapshot is taken, old WALs can be deleted as long as all replicated data is up to date. But the WAL which was current at the time box.snapshot() started must be kept for recovery, since it still contains log records written after the start of box.snapshot().

An alternative way to save a snapshot is to send a SIGUSR1 signal to the instance. While this approach could be handy, it is not recommended for use in automation: a signal provides no way to find out whether the snapshot was taken successfully or not.
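
By contrast, calling box.snapshot() from Lua reports the outcome to the caller. The following is a minimal sketch, assuming that the instance is already configured and that a failed snapshot is raised as a Lua error (as the console example above suggests). The helper name take_snapshot is hypothetical.

```lua
-- Hypothetical helper: take a snapshot and report the outcome to the caller.
-- Assumes the instance has already been configured (box.cfg{} or YAML config).
local log = require('log')

local function take_snapshot()
    -- pcall() catches the error raised when the snapshot cannot be saved.
    local ok, result = pcall(box.snapshot)
    if not ok then
        log.error('snapshot failed: %s', tostring(result))
        return false, result
    end
    log.info('snapshot result: %s', tostring(result))
    return true
end
```

An automation script can then branch on the first return value instead of guessing whether a signal was handled.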

Vinyl

In vinyl, inserted data is accumulated in memory until the limit set in the vinyl_memory parameter is reached. Then vinyl automatically dumps it to disk. box.snapshot() forces this dump so that the instance can recover from this checkpoint. The snapshot files are stored in space_id/index_id/*.run. Thus, all the data written up to the LSN of the checkpoint is in the *.run files on disk, and all operations that happened after the checkpoint are written to the *.xlog files. All dump files created by box.snapshot() are consistent and have the same LSN as the checkpoint.

At the checkpoint, vinyl also rotates the metadata log *.vylog, which contains data manipulation operations like “create file” and “delete file”. It goes through the log, removes duplicate operations from memory, and creates a new *.vylog file that contains “create” operations only and is named after the vclock of the new checkpoint. This procedure cleans up *.vylog and is useful for recovery because the name of the log is the same as the checkpoint signature.

Constant box.NULL

There are some major problems with using Lua nil values in tables. For example: you can’t correctly assess the length of a table that is not a sequence. (Learn more about data types in Lua and LuaJIT.)

Example:

tarantool> t = {0, nil, 1, 2, nil}
---
...

tarantool> t
---
- - 0
  - null
  - 1
  - 2
...

tarantool> #t
---
- 4
...

The console output of t processes nil values in the middle and at the end of the table differently. This is due to undefined behavior.

Note

Trying to find the length for sparse arrays in LuaJIT leads to another scenario of undefined behavior.

To avoid this problem, use Tarantool’s box.NULL constant instead of nil. box.NULL is a placeholder for a nil value in tables to preserve a key without a value.

Using box.NULL

box.NULL is a value of the cdata type representing a NULL pointer. It is similar to msgpack.NULL, json.NULL and yaml.NULL. So it is a non-nil value, even though it is a pointer to NULL.

Use box.NULL only with capitalized NULL (box.null is incorrect).

Note

Technically speaking, box.NULL is equal to ffi.cast('void *', 0).

Example:

tarantool> t = {0, box.NULL, 1, 2, box.NULL}
---
...

tarantool> t
---
- - 0
  - null # cdata
  - 1
  - 2
  - null # cdata
...

tarantool> #t
---
- 5
...

Note

Notice that t[2] shows the same null output in both examples. However, in this example t[2] and t[5] are of the cdata type, while in the previous example their type was nil.

Important

Avoid using implicit comparisons with nullable values when using box.NULL. In Lua, any value except false or nil is considered true in a condition expression. And, as mentioned earlier, box.NULL is a pointer by design.

That is why the expression box.NULL is always considered true when it is used as a condition. This means that the code

if box.NULL then func() end

will always execute the function func() (because the condition box.NULL will always be neither false nor nil).
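
The difference between the implicit and the explicit check can be sketched as follows. This is an illustration only; the variable names are hypothetical, and the code is meant to run inside Tarantool, where box.NULL is defined.

```lua
-- Illustration only: run inside Tarantool, where box.NULL is defined.
local value = box.NULL

-- Implicit check: box.NULL is cdata, so Lua treats it as true.
local implicit = value and 'set' or 'unset'

-- Explicit check: box.NULL compares equal to nil.
local explicit = (value ~= nil) and 'set' or 'unset'

print(implicit, explicit)  -- prints "set" and "unset"
```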

Distinction of nil and box.NULL

Use the expression if x == nil to check whether x is either nil or box.NULL.

To check whether x is a nil but not a box.NULL, use the following condition expression:

type(x) == 'nil'

If it’s true, then x is a nil, but not a box.NULL.

You can use the following for box.NULL:

x == nil and type(x) == 'cdata'

If the expression above is true, then x is a box.NULL.
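
The three checks can be wrapped into small helpers. This is a sketch; the function names is_nil, is_box_null, and classify are hypothetical and not part of Tarantool.

```lua
-- Hypothetical helpers built from the conditions described above.
local function is_nil(x)
    return type(x) == 'nil'
end

local function is_box_null(x)
    return x == nil and type(x) == 'cdata'
end

-- Classify a value as 'nil', 'box.NULL', or an ordinary 'value'.
local function classify(x)
    if is_nil(x) then
        return 'nil'
    elseif is_box_null(x) then
        return 'box.NULL'
    end
    return 'value'
end
```

Note that false and 0 are classified as ordinary values, because neither compares equal to nil.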

Note

When data is converted to different formats (JSON, YAML, msgpack), nil values in sparse arrays may be converted to box.NULL. Such a conversion might be unexpected, for example, when sending data via net.box or obtaining data from spaces.

tarantool> type(({1, nil, 2})[2])
---
- nil
...

tarantool> type(json.decode(json.encode({1, nil, 2}))[2])
---
- cdata
...

You must anticipate such behavior and use a proper condition expression. Use the explicit comparison x == nil for checking for NULL in nullable values. It will detect both nil and box.NULL.

Module buffer

The buffer module returns a dynamically resizable buffer which is solely for optional use by methods of the net.box module or the msgpack module.

Ordinarily the net.box methods return a Lua table. If a buffer option is used, then the net.box methods return a raw MsgPack string. This saves time on the server, if the client application has its own routine for decoding raw MsgPack strings.

The buffer uses four pointers to manage its capacity:

buf – a pointer to the beginning of the buffer
rpos – a pointer to the beginning of the range available for reading data (“read position”)
wpos – a pointer to the end of the range available for reading data, and to the beginning of the range for writing new data (“write position”)
epos – a pointer to the end of the range available for writing new data (“end position”)

buffer.ibuf()¶

Create a new buffer.

Example:

This example shows that using a buffer lets you keep the data in the format that you get from the server. So if you get the data only to send it somewhere else, a buffer speeds this up considerably.

box.cfg{listen = 3301}
buffer = require('buffer')
net_box = require('net.box')
msgpack = require('msgpack')

box.schema.space.create('tester')
box.space.tester:create_index('primary')
box.space.tester:insert({1, 'ABCDE', 12345})

box.schema.user.create('usr1', {password = 'pwd1'})
box.schema.user.grant('usr1', 'read,write,execute', 'space', 'tester')

ibuf = buffer.ibuf()

conn = net_box.connect('usr1:pwd1@localhost:3301')
conn.space.tester:select({}, {buffer=ibuf})

msgpack.decode_unchecked(ibuf.rpos)

The result of the final request looks like this:

tarantool> msgpack.decode_unchecked(ibuf.rpos)
---
- {48: [['ABCDE', 12345]]}
- 'cdata<char *>: 0x7f97ba10c041'
...

Note

Before Tarantool version 1.7.7, the function to use for this case is msgpack.ibuf_decode(ibuf.rpos). Starting with Tarantool version 1.7.7, ibuf_decode is deprecated.

object buffer_object¶

buffer_object:alloc(size)¶

Allocate size bytes for buffer_object.

Parameters:	size (`number`) – memory in bytes to allocate
Return:	`wpos`

buffer_object:capacity()¶

Return the capacity of the buffer_object.

Return:	`epos - buf`

buffer_object:checksize(size)¶

Check if size bytes are available for reading in buffer_object.

Parameters:	size (`number`) – memory in bytes to check
Return:	`rpos`

buffer_object:pos()¶

Return the size of the range occupied by data.

Return:	`rpos - buf`

buffer_object:read(size)¶

Read size bytes from buffer.

buffer_object:recycle()¶

Clear the memory slots allocated by buffer_object.

tarantool> ibuf:recycle()
---
...
tarantool> ibuf.buf, ibuf.rpos, ibuf.wpos, ibuf.epos
---
- 'cdata<char *>: NULL'
- 'cdata<char *>: NULL'
- 'cdata<char *>: NULL'
- 'cdata<char *>: NULL'
...

buffer_object:reset()¶

Clear the memory slots used by buffer_object. This method lets you keep the buffer but remove data from it. It is useful when you want to reuse the buffer.

tarantool> ibuf:reset()
---
...
tarantool> ibuf.buf, ibuf.rpos, ibuf.wpos, ibuf.epos
---
- 'cdata<char *>: 0x010cc28030'
- 'cdata<char *>: 0x010cc28030'
- 'cdata<char *>: 0x010cc28030'
- 'cdata<char *>: 0x010cc2c000'
...

buffer_object:reserve(size)¶

Reserve memory for buffer_object. Check if there is enough memory to write size bytes after wpos. If not, epos shifts until size bytes are available.

buffer_object:size()¶

Return the size of the range available for reading data.

Return:	`wpos - rpos`

buffer_object:unused()¶

Return the size of the range available for writing data.

Return:	`epos - wpos`
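
The methods above can be combined into a short walk-through of how the pointers move. This is a sketch under assumptions: it uses the LuaJIT ffi module to copy bytes, and it assumes that alloc() and read() return the positions held before they advance wpos and rpos.

```lua
local buffer = require('buffer')
local ffi = require('ffi')

local ibuf = buffer.ibuf()

-- alloc() returns a write pointer and advances wpos by 5 bytes.
local wptr = ibuf:alloc(5)
ffi.copy(wptr, 'hello', 5)
local size_after_write = ibuf:size()   -- 5 bytes are readable

-- read() returns a read pointer and advances rpos by 2 bytes.
local rptr = ibuf:read(2)
local head = ffi.string(rptr, 2)       -- 'he'
local size_after_read = ibuf:size()    -- 3 bytes are still readable

-- The four pointers partition the allocated memory:
-- (rpos - buf) + (wpos - rpos) + (epos - wpos) == epos - buf
local consistent =
    ibuf:capacity() == ibuf:pos() + ibuf:size() + ibuf:unused()

-- Release the memory once the buffer is no longer needed.
ibuf:recycle()
```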

Module buffer and skip_header

The example in the previous section

tarantool> msgpack.decode_unchecked(ibuf.rpos)
---
- {48: [['ABCDE', 12345]]}
- 'cdata<char *>: 0x7f97ba10c041'
...

showed that, ordinarily, the response from net.box includes a header: 48 (hexadecimal 30), which is the key for IPROTO_DATA. But in some situations, for example when passing the buffer to a C function that expects a MsgPack byte array without a header, the header can be skipped. This is done by specifying skip_header=true as an option to conn.space.space-name:select{…}, conn.space.space-name:insert{…}, conn.space.space-name:replace{…}, conn.space.space-name:update{…}, conn.space.space-name:upsert{…}, or conn.space.space-name:delete{…}. The default is skip_header=false.

Now here is the end of the same example, except that skip_header=true is used.

ibuf = buffer.ibuf()

conn = net_box.connect('usr1:pwd1@localhost:3301')
conn.space.tester:select({}, {buffer=ibuf, skip_header=true})

msgpack.decode_unchecked(ibuf.rpos)

The result of the final request looks like this:

tarantool> msgpack.decode_unchecked(ibuf.rpos)
---
- [['ABCDE', 12345]]
- 'cdata<char *>: 0x7f8fd102803f'
...

Notice that the IPROTO_DATA header (48) is gone.

The result is still inside an array, as is clear from the fact that it is shown inside square brackets. It is possible to skip the array header too, with msgpack.decode_array_header().
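
A minimal sketch of skipping the array header follows. To stay self-contained, it does not use a network connection: it encodes an array of one tuple directly into a buffer with msgpack.encode(), which stands in for a net.box response received with skip_header=true. The tuple contents are illustrative.

```lua
local buffer = require('buffer')
local msgpack = require('msgpack')

-- Stand-in for a net.box response received with skip_header=true:
-- an array that contains one tuple.
local ibuf = buffer.ibuf()
msgpack.encode({{1, 'ABCDE', 12345}}, ibuf)

-- Consume the array header. The call returns the number of elements
-- and a pointer to the first element.
local count, ptr = msgpack.decode_array_header(ibuf.rpos, ibuf:size())

-- Decode the first (and only) tuple starting at that pointer.
local tuple = msgpack.decode_unchecked(ptr)
```

After the header is consumed, each element can be decoded one by one, which is convenient when the consumer processes tuples as a stream.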

Module checks

Since: 2.11.0

The checks module provides the ability to check the types of arguments passed to a Lua function. You need to call the checks(type_1, …) function inside the target Lua function and pass one or more type qualifiers to check the corresponding argument types. There are two types of type qualifiers:

A string type qualifier checks whether a function’s argument conforms to the specified type. Example: 'string'.
A table type qualifier checks whether the values of a table passed as an argument conform to the specified types. Example: { 'string', 'number' }.

Note

For earlier versions, you can install the checks module from the Tarantool rocks repository.

Loading checks

In Tarantool 2.11.0 and later versions, the checks API is available in a script without loading the module.

For earlier versions, you need to install the checks module from the Tarantool rocks repository and load the module using the require() directive:

local checks = require('checks')

Number of arguments to check

For each argument to check, you need to specify its own type qualifier in the checks(type_1, …) function.

One argument

In the example below, the checks function accepts a string type qualifier to verify that only a string value can be passed to the greet function. Otherwise, an error is raised.

function greet(name)
    checks('string')
    return 'Hello, ' .. name
end
--[[
greet('John')
-- returns 'Hello, John'

greet(123)
-- raises an error: bad argument #1 to nil (string expected, got number)
--]]

Multiple arguments

To check the types of several arguments, you need to pass the corresponding type qualifiers to the checks function. In the example below, both arguments should be string values.

function greet_fullname(firstname, lastname)
    checks('string', 'string')
    return 'Hello, ' .. firstname .. ' ' .. lastname
end
--[[
greet_fullname('John', 'Smith')
-- returns 'Hello, John Smith'

greet_fullname('John', 1)
-- raises an error: bad argument #2 to nil (string expected, got number)
--]]

To skip checking specific arguments, use the ? placeholder.

Variable number of arguments

You can check the types of explicitly specified arguments for functions that accept a variable number of arguments.

function extra_arguments_num(a, b, ...)
    checks('string', 'number')
    return select('#', ...)
end
--[[
extra_arguments_num('a', 2, 'c')
-- returns 1

extra_arguments_num('a', 'b', 'c')
-- raises an error: bad argument #2 to nil (number expected, got string)
--]]

String type qualifier

This section describes how to check a specific argument type using a string type qualifier:

The Supported types section describes all the types supported by the checks module.
If required, you can make a union type to allow an argument to accept several types.
You can make any of the supported types optional.
To skip checking specific arguments, use the ? placeholder.

Supported types

Lua types

A string type qualifier can accept any of the Lua types, for example, string, number, table, or nil. In the example below, the checks function accepts string to validate that only a string value can be passed to the greet function.

function greet(name)
    checks('string')
    return 'Hello, ' .. name
end
--[[
greet('John')
-- returns 'Hello, John'

greet(123)
-- raises an error: bad argument #1 to nil (string expected, got number)
--]]

Tarantool types

You can use Tarantool-specific types in a string qualifier. The example below shows how to check that a function argument is a decimal value.

local decimal = require('decimal')
function sqrt(value)
    checks('decimal')
    return decimal.sqrt(value)
end
--[[
sqrt(decimal.new(16))
-- returns 4

sqrt(16)
-- raises an error: bad argument #1 to nil (decimal expected, got number)
--]]

This table lists all the checks available for Tarantool types:

Check	Description	See also
`checks('datetime')`	Check whether the specified value is datetime_object	checkers.datetime(value)
`checks('decimal')`	Check whether the specified value has the decimal type	checkers.decimal(value)
`checks('error')`	Check whether the specified value is error_object	checkers.error(value)
`checks('int64')`	Check whether the specified value is an `int64` value	checkers.int64(value)
`checks('interval')`	Check whether the specified value is interval_object	checkers.interval(value)
`checks('tuple')`	Check whether the specified value is a tuple	checkers.tuple(value)
`checks('uint64')`	Check whether the specified value is a `uint64` value	checkers.uint64(value)
`checks('uuid')`	Check whether the specified value is uuid_object	checkers.uuid(value)
`checks('uuid_bin')`	Check whether the specified value is uuid represented by a 16-byte binary string	checkers.uuid_bin(value)
`checks('uuid_str')`	Check whether the specified value is uuid represented by a 36-byte hexadecimal string	checkers.uuid_str(value)

Custom function

A string type qualifier can accept the name of a custom function that performs arbitrary validations. To achieve this, create a function returning true if the value is valid and add this function to the checkers table.

The example below shows how to use the positive function to check that an argument value is a positive number.

function checkers.positive(value)
    return (type(value) == 'number') and (value > 0)
end

function get_doubled_number(value)
    checks('positive')
    return value * 2
end
--[[
get_doubled_number(10)
-- returns 20

get_doubled_number(-5)
-- raises an error: bad argument #1 to nil (positive expected, got number)
--]]

Metatable type

A string qualifier can accept a value stored in the __type field of the argument metatable.

local blue = setmetatable({ 0, 0, 255 }, { __type = 'color' })
function get_blue_value(color)
    checks('color')
    return color[3]
end
--[[
get_blue_value(blue)
-- returns 255

get_blue_value({0, 0, 255})
-- raises an error: bad argument #1 to nil (color expected, got table)
--]]

Union types

To allow an argument to accept several types (a union type), concatenate type names with a pipe (|). In the example below, the argument can be either a number or a string value.

function get_argument_type(value)
    checks('number|string')
    return type(value)
end
--[[
get_argument_type(1)
-- returns 'number'

get_argument_type('key1')
-- returns 'string'

get_argument_type(true)
-- raises an error: bad argument #1 to nil (number|string expected, got boolean)
--]]

Optional types

To make any of the supported types optional, prefix its name with a question mark (?). In the example below, the name argument is optional. This means that the greet function can accept string and nil values.

function greet(name)
    checks('?string')
    if name ~= nil then
        return 'Hello, ' .. name
    else
        return 'Hello from Tarantool'
    end
end
--[[
greet('John')
-- returns 'Hello, John'

greet()
-- returns 'Hello from Tarantool'

greet(123)
-- raises an error: bad argument #1 to nil (string expected, got number)
--]]

As with a single type, you can make a union type optional: ?number|string.
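
A short sketch of an optional union type follows. The function name describe is hypothetical, and the require() call is included so that the sketch also works with the rock version of the module.

```lua
local checks = require('checks')

-- Hypothetical function: accepts a number, a string, or no argument at all.
local function describe(value)
    checks('?number|string')
    if value == nil then
        return 'nothing'
    end
    return type(value)
end
--[[
describe()
-- returns 'nothing'

describe(1)
-- returns 'number'

describe('key1')
-- returns 'string'

describe(true)
-- raises an error
--]]
```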

Skipping argument checking

You can skip checking of the specified arguments using the question mark (?) placeholder. In this case, the argument can be any type.

function greet_fullname_any(firstname, lastname)
    checks('string', '?')
    return 'Hello, ' .. firstname .. ' ' .. tostring(lastname)
end
--[[
greet_fullname_any('John', 'Doe')
-- returns 'Hello, John Doe'

greet_fullname_any('John', 1)
-- returns 'Hello, John 1'
--]]

Table type qualifier

A table type qualifier checks whether the values of a table passed as an argument conform to the specified types. In this case, the following checks are made:

The argument is checked to conform to the ?table type, and its content is validated.
Table values are validated against the specified string type qualifiers.
Table keys missing in checks are validated against the nil type.

The code below checks that the first and second table values have the string and number types.

function configure_connection(options)
    checks({ 'string', 'number' })
    local ip_address = options[1] or '127.0.0.1'
    local port = options[2] or 3301
    return ip_address .. ':' .. port
end
--[[
configure_connection({'0.0.0.0', 3303})
-- returns '0.0.0.0:3303'

configure_connection({'0.0.0.0', '3303'})
-- raises an error: bad argument options[2] to nil (number expected, got string)
--]]

In the next example, the same checks are made for the specified keys.

function configure_connection_opts(options)
    checks({ ip_address = 'string', port = 'number' })
    local ip_address = options.ip_address or '127.0.0.1'
    local port = options.port or 3301
    return ip_address .. ':' .. port
end
--[[
configure_connection_opts({ip_address = '0.0.0.0', port = 3303})
-- returns '0.0.0.0:3303'

configure_connection_opts({ip_address = '0.0.0.0', port = '3303'})
-- raises an error: bad argument options.port to nil (number expected, got string)

configure_connection_opts({login = 'testuser', ip_address = '0.0.0.0', port = 3303})
-- raises an error: unexpected argument options.login to nil
--]]

Note

Table qualifiers can be nested and use tables, too.
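
A sketch of a nested table qualifier follows. The function name connect and the option names uri and timeouts are hypothetical. The example always passes the nested table explicitly, because how a missing nested table is handled is not covered here.

```lua
local checks = require('checks')

-- Hypothetical function with a nested options table.
local function connect(options)
    checks({
        uri = 'string',
        timeouts = { connect = '?number', read = '?number' },
    })
    return options.uri, options.timeouts.connect
end
--[[
connect({ uri = 'localhost:3301', timeouts = { connect = 1 } })
-- returns 'localhost:3301', 1

connect({ uri = 'localhost:3301', timeouts = { connect = 'fast' } })
-- raises an error
--]]
```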

API Reference

Members
checks()	When called inside a function, checks that the function’s arguments conform to the specified types
checkers	A global variable that provides access to checkers for different types

checks()

checks(type_1, ...)¶

When called inside a function, checks that the function’s arguments conform to the specified types.

Parameters:	type_1 (`string/table`) – a string or table type qualifier used to check the argument type
	... – optional type qualifiers used to check the types of other arguments

checkers

The checkers global variable provides access to checkers for different types. You can use this variable to add a custom checker that performs arbitrary validations.

Note

The checkers variable also provides access to checkers for Tarantool-specific types. These checkers can be used in a custom checker.
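
For example, a custom checker can build on checkers.decimal. This is a sketch; the names positive_decimal and decimal_sqrt are hypothetical.

```lua
local checks = require('checks')
local decimal = require('decimal')

-- Hypothetical custom checker that reuses the built-in decimal checker.
function checkers.positive_decimal(value)
    return checkers.decimal(value) and value > decimal.new(0)
end

local function decimal_sqrt(value)
    checks('positive_decimal')
    return decimal.sqrt(value)
end
--[[
decimal_sqrt(decimal.new(16))
-- returns 4

decimal_sqrt(decimal.new(-1))
-- raises an error

decimal_sqrt(16)
-- raises an error
--]]
```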

checkers.datetime(value)¶

Check whether the specified value is datetime_object.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `datetime_object`; otherwise, `false`
Rtype:	boolean

Example

local datetime = require('datetime')
local is_datetime = checkers.datetime(datetime.new { day = 1, month = 6, year = 2023 })
local is_interval = checkers.interval(datetime.interval.new { day = 1 })

checkers.decimal(value)¶

Check whether the specified value has the decimal type.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value has the `decimal` type; otherwise, `false`
Rtype:	boolean

Example

local decimal = require('decimal')
local is_decimal = checkers.decimal(decimal.new(16))

checkers.error(value)¶

Check whether the specified value is error_object.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `error_object`; otherwise, `false`
Rtype:	boolean

Example

local server_error = box.error.new({ code = 500, reason = 'Server error' })
local is_error = checkers.error(server_error)

checkers.int64(value)¶

Check whether the specified value is one of the following int64 values:

a Lua number in a range from -2^53+1 to 2^53-1 (inclusive)
Lua cdata ctype<uint64_t> in a range from 0 to LLONG_MAX
Lua cdata ctype<int64_t>

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is an `int64` value; otherwise, `false`
Rtype:	boolean

Example

local is_int64 = checkers.int64(-1024)
local is_uint64 = checkers.uint64(2048)

checkers.interval(value)¶

Check whether the specified value is interval_object.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `interval_object`; otherwise, `false`
Rtype:	boolean

Example

local datetime = require('datetime')
local is_datetime = checkers.datetime(datetime.new { day = 1, month = 6, year = 2023 })
local is_interval = checkers.interval(datetime.interval.new { day = 1 })

checkers.tuple(value)¶

Check whether the specified value is a tuple.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is a tuple; otherwise, `false`
Rtype:	boolean

Example

local is_tuple = checkers.tuple(box.tuple.new{1, 'The Beatles', 1960})

checkers.uint64(value)¶

Check whether the specified value is one of the following uint64 values:

a Lua number in a range from 0 to 2^53-1 (inclusive)
Lua cdata ctype<uint64_t>
Lua cdata ctype<int64_t> in range from 0 to LLONG_MAX

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is an `uint64` value; otherwise, `false`
Rtype:	boolean

Example

local is_int64 = checkers.int64(-1024)
local is_uint64 = checkers.uint64(2048)
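
The ranges listed for checkers.int64 and checkers.uint64 can be probed at their boundaries. This sketch follows the documented ranges; the LL and ULL suffixes are LuaJIT literals for int64_t and uint64_t cdata values.

```lua
require('checks')  -- needed only with the rock version of the module

local in_range     = checkers.int64(2^53 - 1)  -- true: upper bound is inclusive
local out_of_range = checkers.int64(2^53)      -- false: outside the number range
local signed_cdata = checkers.int64(-1LL)      -- true: int64_t cdata

local negative_number = checkers.uint64(-1)    -- false: negative Lua number
local unsigned_cdata  = checkers.uint64(1ULL)  -- true: uint64_t cdata
local negative_cdata  = checkers.uint64(-1LL)  -- false: negative int64_t cdata
```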

checkers.uuid(value)¶

Check whether the specified value is uuid_object.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `uuid_object`; otherwise, `false`
Rtype:	boolean

Example

local uuid = require('uuid')
local is_uuid = checkers.uuid(uuid())
local is_uuid_bin = checkers.uuid_bin(uuid.bin())
local is_uuid_str = checkers.uuid_str(uuid.str())

checkers.uuid_bin(value)¶

Check whether the specified value is uuid represented by a 16-byte binary string.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `uuid` represented by a 16-byte binary string; otherwise, `false`
Rtype:	boolean

See also: uuid(value)

checkers.uuid_str(value)¶

Check whether the specified value is uuid represented by a 36-byte hexadecimal string.

Parameters:	value (`any`) – the value to check the type for
Return:	`true` if the specified value is `uuid` represented by a 36-byte hexadecimal string; otherwise, `false`
Rtype:	boolean

See also: uuid(value)

Module clock

Overview

The clock module returns time values derived from the POSIX / C clock_gettime function or equivalent. Most functions in the module return a number of seconds; functions whose names end in “64” return a 64-bit number of nanoseconds.

Index

Below is a list of all clock functions.

Name	Use
clock.time() clock.realtime()	Get the wall clock time in seconds
clock.time64() clock.realtime64()	Get the wall clock time in nanoseconds
clock.monotonic()	Get the monotonic time in seconds
clock.monotonic64()	Get the monotonic time in nanoseconds
clock.proc()	Get the processor time in seconds
clock.proc64()	Get the processor time in nanoseconds
clock.thread()	Get the thread time in seconds
clock.thread64()	Get the thread time in nanoseconds
clock.bench()	Measure the time a function takes within a processor

clock.time()¶

clock.time64()¶

clock.realtime()¶

clock.realtime64()¶

The wall clock time. Derived from C function clock_gettime(CLOCK_REALTIME).

Return:	seconds or nanoseconds since epoch (1970-01-01 00:00:00), adjusted.
Rtype:	number or cdata (ctype<int64_t>)

Example:

-- This will print an approximate number of years since 1970.
clock = require('clock')
print(clock.time() / (365*24*60*60))

See also fiber.time64 and the standard Lua function os.clock.
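
The two return types can be related as follows. This is a sketch; note that tonumber() converts the 64-bit value to a Lua number, which cannot represent every nanosecond count exactly at this magnitude.

```lua
local clock = require('clock')

local seconds = clock.time()    -- Lua number, fractional seconds
local nanos = clock.time64()    -- 64-bit cdata, nanoseconds

-- Convert the cdata value to a Lua number before dividing.
local seconds_from_nanos = tonumber(nanos) / 1e9
```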

clock.monotonic()¶

clock.monotonic64()¶

The monotonic time. Derived from C function clock_gettime(CLOCK_MONOTONIC). Monotonic time is similar to wall clock time but is not affected by changes to or from daylight saving time, or by changes done by a user. This is the best function to use with benchmarks that need to calculate elapsed time.

Return:	seconds or nanoseconds since the last time that the computer was booted.
Rtype:	number or cdata (ctype<int64_t>)

Example:

-- This will print nanoseconds since the start.
clock = require('clock')
print(clock.monotonic64())
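
Elapsed time is usually computed as the difference between two readings. A minimal sketch, in which fiber.sleep() stands in for the work being timed:

```lua
local clock = require('clock')
local fiber = require('fiber')

local started = clock.monotonic()
fiber.sleep(0.05)   -- stand-in for the work being timed
local elapsed = clock.monotonic() - started

print(('elapsed: %.3f seconds'):format(elapsed))
```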

clock.proc()¶

clock.proc64()¶

The processor time. Derived from C function clock_gettime(CLOCK_PROCESS_CPUTIME_ID). This is the best function to use with benchmarks that need to calculate how much time has been spent within a CPU.

Return:	seconds or nanoseconds since processor start.
Rtype:	number or cdata (ctype<int64_t>)

Example:

-- This will print nanoseconds in the CPU since the start.
clock = require('clock')
print(clock.proc64())

clock.thread()¶

clock.thread64()¶

The thread time. Derived from C function clock_gettime(CLOCK_THREAD_CPUTIME_ID). This is the best function to use with benchmarks that need to calculate how much time has been spent within a thread within a CPU.

Return:	seconds or nanoseconds since the transaction processor thread started.
Rtype:	number or cdata (ctype<int64_t>)

Example:

-- This will print nanoseconds in the thread since the start.
clock = require('clock')
print(clock.thread64())

clock.bench(function[, ...])¶

The time that a function takes within a processor. This function uses clock.proc(), so it measures elapsed CPU time and is not useful for showing actual elapsed (wall clock) time.

Parameters:	function (`function`) – function or function reference
	... – whatever values are required by the function.
Return:	table. first element – seconds of CPU time, second element – whatever the function returns.

Example:

-- Benchmark a function which sleeps 10 seconds.
-- NB: bench() will not calculate sleep time.
-- So the returned value will be {a number less than 10, 88}.
clock = require('clock')
fiber = require('fiber')
function f(param)
  fiber.sleep(param)
  return 88
end
clock.bench(f, 10)
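
The returned table can be unpacked and compared with the wall clock time. This is a sketch; the function name work and the shorter delay are illustrative.

```lua
local clock = require('clock')
local fiber = require('fiber')

-- Illustrative function: sleeps, then returns a value.
local function work(delay)
    fiber.sleep(delay)
    return 88
end

local wall_start = clock.monotonic()
local result = clock.bench(work, 0.1)
local wall_elapsed = clock.monotonic() - wall_start

local cpu_seconds = result[1]   -- CPU time only; the sleep is not counted
local returned = result[2]      -- the value returned by work()

print(('cpu: %.6f s, wall: %.3f s'):format(cpu_seconds, wall_elapsed))
```

Measuring both values side by side shows why clock.bench() suits CPU-bound code, while clock.monotonic() suits code that waits.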

Module compat

The usual way to handle compatibility problems is to introduce an option for a new behavior and leave the old one by default. It is not always the perfect way.

Sometimes developers want to keep the old behavior for existing applications and offer the new behavior by default for new ones, for example, when the old behavior is known to be problematic, less safe, or contrary to user expectations. At the same time, users don’t always read all the documentation and often assume good defaults. The compat module was introduced to provide a direct way to deprecate unwanted behavior.

The compat module is basically a global table of options with additional verbose interface and helper functions. There are three stages of changing behavior:

Old behavior by default.
New behavior by default.
New behavior is frozen and the old behavior is removed.

During the first two stages, a user can toggle options via the interface and change the behavior as needed. At the last stage, the old behavior is removed from the codebase, and the option is marked as obsolete. Because compat is a global instance, options can be hardcoded into it or added at runtime, for example, by an external module.

Options are switched to the next stage in major releases. In this way, developers are able to adapt to the new standard behavior and test it before switching to the next release. If something is broken by a new Tarantool version, a developer can still have a way to fix it by a simple config change, that is, explicitly select the old behavior.

Consider the example below:

The option json_esc_slash is introduced in the 2.11 minor release. Default is set to ‘old’, but a developer can utilize the new behavior or test the updated behavior by switching it manually to ‘new’.
In release 3.0, the next major release, the json_esc_slash default is switched to ‘new’. Now, developers who haven’t managed to adapt to the new behavior can switch the option to ‘old’ and fix their module in the future.
In release 4.0, json_esc_slash is marked as obsolete, and the old behavior is no longer accessible. Developers are forced to use the new behavior.

Basic usage

If you want to explicitly pin every behavior in compat, you can set the options manually and then call compat.dump() to get a Lua command that sets up compat with all the options selected. Place this command at the beginning of the code in your init.lua file. In this way, you are guaranteed to get the same behavior on any other Tarantool version. See a tutorial on using compat for more examples.
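
A minimal sketch of this workflow follows. It assumes a Tarantool version in which the json_escape_forward_slash option (described below) can still be toggled; on a version where the option is obsolete, the assignment may fail.

```lua
local compat = require('compat')

-- Pin one option explicitly.
compat.json_escape_forward_slash = 'new'

-- dump() returns Lua code as a string that reproduces the selected options.
local snippet = compat.dump()
print(snippet)
```

The printed code is what goes at the top of init.lua.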

Configuration options

Another way to handle compatibility issues is setting the compat.* configuration options. Similarly to the compat Lua module options, the configuration options can have values new and old. The set of configuration options matches the set of options available in the compat module.

Below is an example fragment of a YAML configuration file:

compat:
  box_space_max: 'new'
  sql_seq_scan_default: 'old'
  fiber_slice_default: 'old'
  binary_data_decoding: 'new'

Learn more in the Configuration reference.

Options

Below are the available compat options:

JSON encode escape forward slash

Option: json_escape_forward_slash

For some reason, in the upstream lua_cjson, the ‘/’ sign is escaped. But according to the RFC 4627 standard, this is unnecessary and of questionable compatibility with other implementations.

Old and new behavior

By toggling the json_escape_forward_slash compat option, you can choose whether the JSON encoder escapes the ‘/’ sign or not:

tarantool> require('compat').json_escape_forward_slash = 'old'
---
...

tarantool> require('json').encode('foo/bar')
---
- '"foo\/bar"'
...

tarantool> require('compat').json_escape_forward_slash = 'new'
---
...

tarantool> require('json').encode('foo/bar')
---
- '"foo/bar"'
...

The option affects both the global serializer instance and serializers created with json.new(). It also affects the way log messages are encoded when written to the log in the json format (the box.cfg.log_format option is set to ‘json’).

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

Both encoding styles are correct from the JSON standard standpoint, but if your module relies on byte-for-byte encoding results, it may break with this change. Be cautious if you do the following:

Hash results of json.encode().
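One way to avoid depending on the exact bytes is to compare decoded values instead of encoded strings. The sketch below uses a hypothetical helper, same_json_scalar(), and only handles scalar values such as strings and numbers; comparing decoded tables would need a deep comparison.

```lua
local json = require('json')

-- Hypothetical helper: compare two JSON documents holding scalar values
-- by their decoded content rather than by their encoded bytes.
local function same_json_scalar(a, b)
    return json.decode(a) == json.decode(b)
end

local escaped   = '"foo\\/bar"'  -- what the 'old' behavior produces
local unescaped = '"foo/bar"'    -- what the 'new' behavior produces

print(escaped == unescaped)                  -- false: the bytes differ
print(same_json_scalar(escaped, unescaped))  -- true: the value is the same
```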

Lua-YAML prettier multiline output

Option: yaml_pretty_multiline

The lua-yaml encoder selects the string style automatically, but in the Tarantool context it can be beneficial to enforce a particular style, for example, for better readability. The yaml_pretty_multiline compat option allows you to encode multiline strings in the block style.

Old and new behavior

The compat module allows you to choose whether lua-yaml encodes multiline strings as usual or in the enforced block scalar style:

tarantool> compat.yaml_pretty_multiline = 'old'
---
...

tarantool> return "Title: xxx\n- Item 1\n- Item 2\n"
---
- 'Title: xxx

  - Item 1

  - Item 2

  '
...

tarantool> compat.yaml_pretty_multiline = 'new'
---
...

tarantool> return "Title: xxx\n- Item 1\n- Item 2\n"
---
- |
  Title: xxx
  - Item 1
  - Item 2
...

You can select the new/old behavior in compat. It affects the global YAML encoder.

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

Both encoding styles are correct from the YAML standard standpoint, but if your module relies on byte-for-byte encoding results, it may break with this change. Be cautious if you do the following:

Compare results of YAML encoding as strings.
Hash results of yaml encoding.
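A comparison that does not depend on the chosen string style can decode both documents first. This is a minimal sketch for string values only; same_yaml_string() is a hypothetical helper name.

```lua
local yaml = require('yaml')

-- Hypothetical helper: true when two YAML documents decode to the same
-- string, whatever quoting or block style each of them uses.
local function same_yaml_string(a, b)
    return yaml.decode(a) == yaml.decode(b)
end

-- The same value written in two different YAML styles.
print(same_yaml_string("'a b'", '"a b"'))

-- Decoding what the encoder produced recovers the original value,
-- so compare decoded values instead of encoded text.
local text = "Title: xxx\n- Item 1\n- Item 2\n"
print(yaml.decode(yaml.encode(text)) == text)
```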

Fiber channel close mode

Option: fiber_channel_close_mode

Before the change, channel:close() behaved unexpectedly: it closed the channel entirely and discarded all unread events.

Old and new behavior

The compat module allows you to choose between forced and graceful channel closing. The latter is the new behavior.

tarantool> compat = require('compat')
---
...

tarantool> compat
---
- - yaml_pretty_multiline: default (old)
  - json_escape_forward_slash: default (old)
  - fiber_channel_close_mode: default (old)
...

tarantool> fiber = require('fiber')
---
...

tarantool> ch = fiber.channel(10)
---
...

tarantool> ch:put('one')
---
- true
...

tarantool> ch:put('two')
---
- true
...

tarantool> ch:get()
---
- one
...

tarantool> ch:close()
---
...

tarantool> ch:get()
---
- null
...

tarantool> compat.fiber_channel_close_mode = 'new'
---
...

tarantool> ch = fiber.channel(10)
---
...

tarantool> ch:put('one')
---
- true
...

tarantool> ch:put('two')
---
- true
...

tarantool> ch:get()
---
- one
...

tarantool> ch:close()
---
...

tarantool> ch:get()
---
- two
...

tarantool> ch:get()
---
- null
...

You can select new/old behavior in compat. It will affect all existing channels and the future ones.

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

The new behavior is mostly backward compatible. The only known problem can appear when the code relies on the channel being entirely closed after ch:close(), that is, on ch:get() returning nil right away.
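Code that should work under both modes can read until get() returns nil instead of assuming that nothing is readable after close(). The following sketch uses a hypothetical drain() helper.

```lua
local fiber = require('fiber')

-- Hypothetical helper: read everything that is currently available.
-- A zero timeout makes get() return nil at once when nothing is left,
-- so the loop ends the same way for open and closed channels.
local function drain(ch)
    local items = {}
    while true do
        local item = ch:get(0)
        if item == nil then
            break
        end
        table.insert(items, item)
    end
    return items
end

local ch = fiber.channel(10)
ch:put('one')
ch:put('two')
ch:close()

-- 'old': nothing is left to read after close();
-- 'new': both buffered messages are still delivered.
print(#drain(ch))
```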

Default value for replication_sync_timeout

Option: box_cfg_replication_sync_timeout

Having a non-zero replication_sync_timeout gives the user the false impression that the box.cfg{replication = ...} call returns only when the configured node is synced with all the other nodes. This is mostly true for large replication_sync_timeout values, but it is not guaranteed. In other words, the user still has to check whether the node is synced or the sync just timed out. Besides, while replication_sync_timeout is ticking, you cannot reconfigure box with another box.cfg call, which makes reconfiguration harder.

For these reasons, it was decided to set replication_sync_timeout to zero by default.

Old and new behavior

The compat module allows you to choose between:

the old behavior: box.cfg.replication_sync_timeout is 300 seconds by default
the new behavior: box.cfg.replication_sync_timeout is 0 by default

The option takes effect only if it is set before the initial box.cfg{} call.

tarantool> compat.box_cfg_replication_sync_timeout = 'new'
---
...
tarantool> box.cfg{}
---
...
tarantool> box.cfg.replication_sync_timeout
---
- 0
...
tarantool> compat.box_cfg_replication_sync_timeout = 'old'
---
- error: 'builtin/box/load_cfg.lua:253: The compat  option ''box_cfg_replication_sync_timeout''
    takes effect only before the initial box.cfg() call'
...

A fresh Tarantool run:

tarantool> compat.box_cfg_replication_sync_timeout = 'old'
---
...
tarantool> box.cfg{}
---
...
tarantool> box.cfg.replication_sync_timeout
---
- 300
...

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

We expect issues when the user assumes that the node is not in the orphan state (box.info.status ~= "orphan") after the box.cfg{replication=...} call returns. This is not true with the new behavior. To simulate the old behavior, add a box.ctl.wait_rw() call after the box.cfg{} call. box.ctl.wait_rw() returns only when the node becomes writable, and hence is not an orphan.
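A minimal sketch of this workaround is shown below. The listen port, the replication URIs, and the 300-second timeout are placeholder assumptions. Note that on an instance that is meant to stay read-only, waiting for the writable state is not a suitable check.

```lua
-- Sketch: the port, URIs, and credentials are placeholder assumptions.
box.cfg{
    listen = 3301,
    replication = {
        'replicator:secret@127.0.0.1:3301',
        'replicator:secret@127.0.0.1:3302',
    },
}

-- With the new default (replication_sync_timeout = 0), box.cfg{} may
-- return while the node is still an orphan. Block until the node becomes
-- writable; the optional argument is a timeout in seconds.
box.ctl.wait_rw(300)

-- A writable node is not an orphan.
assert(box.info.status ~= 'orphan')
```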

Default value for sql_seq_scan session setting

Option: sql_seq_scan_default

The default value for the sql_seq_scan session setting will be set to false starting with Tarantool 3.0. To be able to return the behavior to the old default, a new compat option is introduced.

Old and new behavior

Old behavior: SELECT scan queries are always allowed.

New behavior: SELECT scan queries are only allowed if the SEQSCAN keyword is used correctly.

Note that the sql_seq_scan_default compat option only affects sessions during initialization. It means that you should set sql_seq_scan_default before running box.cfg{} or creating a new session. Also, a new session created before executing box.cfg{} will not be affected by the value of the compat option.

Examples of setting the option before execution of box.cfg{}:

tarantool> require('compat').sql_seq_scan_default = 'new'
---
...

tarantool> box.cfg{log_level = 1}
---
...

tarantool> box.space._session_settings:get{'sql_seq_scan'}
---
- ['sql_seq_scan', false]
...

tarantool> require('compat').sql_seq_scan_default = 'old'
---
...

tarantool> box.cfg{log_level = 1}
---
...

tarantool> box.space._session_settings:get{'sql_seq_scan'}
---
- ['sql_seq_scan', true]
...

Examples of setting the option before creation of a new session after execution of box.cfg{}:

tarantool> box.cfg{log_level = 1, listen = 3301}
---
...

tarantool> require('compat').sql_seq_scan_default = 'new'
---
...

tarantool> cn = require('net.box').connect(box.cfg.listen)
---
...

tarantool> cn.space._session_settings:get{'sql_seq_scan'}
---
- ['sql_seq_scan', false]
...

tarantool> require('tarantool').compat.sql_seq_scan_default = 'old'
---
...

tarantool> cn = require('net.box').connect(box.cfg.listen)
---
...

tarantool> cn.space._session_settings:get{'sql_seq_scan'}
---
- ['sql_seq_scan', true]
...

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

We expect most SELECTs that do not use indexes to fail after the sql_seq_scan session setting is set to false. The best way to avoid this is to refactor the query to use indexes. To understand if SELECT uses indexes, you can use EXPLAIN QUERY PLAN. If SEARCH TABLE is specified, the index is used. If it says SCAN TABLE, the index is not used.

You can use the SEQSCAN keyword to manually allow scanning queries. Or you can set the sql_seq_scan session setting to true to allow all scanning queries.
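The options above can be sketched with box.execute(). The example assumes that box.cfg{} has already been called; the table name bands and its columns are hypothetical.

```lua
-- Sketch: assumes box.cfg{} has been called; the table is hypothetical.
box.execute([[CREATE TABLE bands (id INT PRIMARY KEY, name STRING);]])

-- Inspect the plan: SEARCH TABLE means an index is used,
-- SCAN TABLE means it is not.
local plan = box.execute(
    [[EXPLAIN QUERY PLAN SELECT * FROM bands WHERE id = 1;]])

-- Allow a full scan for a single query with the SEQSCAN keyword.
local rows, err = box.execute([[SELECT * FROM SEQSCAN bands;]])

-- Or allow all scanning queries for the current session.
box.execute([[SET SESSION "sql_seq_scan" = true;]])
```

Using SEQSCAN per query keeps the protection against accidental full scans everywhere else, while the session setting switches it off entirely.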

Default value for max fiber slice

Option: fiber_slice_default

The max fiber slice specifies the max fiber execution time without yield before a warning is logged or an error is raised. It is set with the fiber.set_max_slice() function. The new compat option – fiber_slice_default – controls the default value of the max fiber slice.

Old and new behavior

The old default value for the max fiber slice is infinity (no warnings or errors). The new default value is {warn = 0.5, err = 1.0}. To use the new behavior, set fiber_slice_default to new as follows:

compat = require('compat')
compat.fiber_slice_default = 'new'

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

If you see a warning like this in the log:

fiber has not yielded for more than 0.500 seconds,

or the following error is raised unexpectedly by a box function

error: fiber slice is exceeded,

then your application has a fiber that may exceed its slice and fail.

First, make sure that fiber.yield() is used for this fiber to transfer control to another fiber. You can also extend the fiber slice with the fiber.extend_slice(slice) function.
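For illustration, a long-running loop can yield at regular intervals so that it never holds the processor for a whole slice. The function name and the yield interval below are hypothetical; a suitable interval depends on how expensive each iteration is.

```lua
local fiber = require('fiber')

-- Hypothetical example: sum the numbers 1..n, yielding every
-- `yield_every` iterations to give other fibers a chance to run.
local function sum_with_yields(n, yield_every)
    local total = 0
    for i = 1, n do
        total = total + i
        if i % yield_every == 0 then
            fiber.yield()
        end
    end
    return total
end

print(sum_with_yields(1000, 100))  -- 500500
```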

Decoding binary objects

Option: binary_data_decoding

Starting from version 3.0, Tarantool has the varbinary module for handling binary objects of arbitrary length. The binary_data_decoding compat option lets you define the format in which varbinary field values are returned for handling in Lua: plain strings or varbinary objects.

Old and new behavior

New behavior: varbinary field values are returned as varbinary objects.

tarantool> compat.binary_data_decoding = 'new'
---
...

tarantool> varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
---
- true
...

Old behavior: varbinary field values are returned as plain strings.

tarantool> compat.binary_data_decoding = 'old'
---
...

tarantool> varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
---
- false
...

tarantool> varbinary.is(yaml.decode('!!binary //4='))
---
- false
...

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

String manipulation methods, such as string.sub() or string.match(), are not defined for varbinary objects. Thus, if you use such methods on the results of binary data decoding from MsgPack or YAML, convert them to strings explicitly using the tostring() function.
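A small wrapper can make such code independent of the selected behavior. The sketch below assumes Tarantool 3.0 or later; as_string() is a hypothetical helper name.

```lua
local varbinary = require('varbinary')
local msgpack = require('msgpack')

-- Hypothetical helper: return a plain Lua string whether the decoder
-- produced a varbinary object ('new') or a string ('old').
local function as_string(value)
    if varbinary.is(value) then
        return tostring(value)
    end
    return value
end

local s = as_string(msgpack.decode('\xC4\x02\xFF\xFE'))

-- String methods are safe to use on the converted value.
print(type(s), #s, s:byte(1))
```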

box.session.push() deprecation

Option: box_session_push_deprecation

Starting from version 3.0, Lua API function box.session.push() and C API function box_session_push() are deprecated.

Old and new behavior

New behavior: calling box.session.push() raises an error.

tarantool> box.session.push({1})
---
- error: box.session.push is deprecated
...

Old behavior: box.session.push() is available to use. When it’s called for the first time, a deprecation warning is printed to the log.

tarantool> box.session.push({1})
2024-12-18 15:42:51.537 [50750] main/104/interactive session.c:569 W> box.session.push is deprecated. Consider using box.broadcast instead.
%TAG !push! tag:tarantool.io/push,2018
---
- 1
...
---
- true
...

Known compatibility issues

At this point, no incompatible modules are known.

Detecting issues in your codebase

If your application uses box.session.push(), consider rewriting it using box.broadcast().
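The sketch below shows the general shape of such a rewrite. The key name app.progress and the payload are hypothetical. Keep in mind that box.broadcast() publishes a value for every subscriber of the key rather than sending a message to one session, so the application logic may need to change accordingly.

```lua
-- Sketch: the key name and the payload are hypothetical.
-- Publish the latest state under a key instead of pushing a message
-- into the current session.
box.broadcast('app.progress', {done = 10, total = 100})

-- A local subscriber. Remote clients can subscribe to the same key
-- through a connector that supports watchers, for example with
-- conn:watch() in net.box.
local watcher = box.watch('app.progress', function(key, value)
    print(key, value and value.done)
end)

-- Stop receiving notifications when they are no longer needed.
watcher:unregister()
```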

Tutorial: Module compat

This tutorial covers the following compat module API and its usage:

Listing options
Listing options details
Changing option value
Restoring defaults
Getting compat setup with compat.dump()
Setting all options to a specific value with compat.dump()
Adding an option during runtime

Listing options

The options list is serialized in the interactive console with additional details for user convenience:

All non-obsolete options are listed in the order new > old > default.
Serialization returns an array-like table of tables {<option> = <value>}.
The result of compat serialization can still be indexed as a normal key-value table.

tarantool> compat = require('compat')
---
...

tarantool> compat
---
- - json_escape_forward_slash: new
  - option_2: old
  - option_default_old: default (old)
  - option_default_new: default (new)
...

Listing options details

current – the state of the option.
default – the default state of the option.
brief – a text description of the option with a link to a more detailed description.

tarantool> compat.option_default_new
---
- current: old
  default: new
  brief: <...>
...

Changing option value

You can change an option value by assigning to it directly or by passing a table of option-value pairs to compat. Possible values to assign are ‘new’, ‘old’, and ‘default’.

tarantool> compat.json_escape_forward_slash = 'old'
---
...

tarantool> compat{json_escape_forward_slash = 'new', option_2 = 'default'}
---
...

Restoring defaults

To restore the default, assign the ‘default’ value to an option:

tarantool> compat.option_2 = 'default'
---
...

tarantool> compat.option_2
---
- current: default
  default: new
  brief: <...>
...

Getting compat setup with compat.dump()

tarantool> compat({
         >     obsolete_set_explicitly = 'new',
         >     option_set_old = 'old',
         >     option_set_new = 'new'
         > })
---
...

tarantool> compat
---
- - option_set_old: old
  - option_set_new: new
  - option_default_old: default (old)
  - option_default_new: default (new)
...

# Obsolete options are not returned in serialization, but have the following values:
# - obsolete_option_default: default (new)
# - obsolete_set_explicitly: new

# nil does output obsolete unset options as 'default'
tarantool> compat.dump()
---
- require('compat')({
            option_set_old          = 'old',
            option_set_new          = 'new',
            option_default_old      = 'default',
            option_default_new      = 'default',
            obsolete_option_default = 'default', -- obsolete since X.Y
            obsolete_set_explicitly = 'new',     -- obsolete since X.Y
    })
...

# 'current' is the same as nil with default set to current values
tarantool> compat.dump('current')
---
- require('compat')({
            option_set_old          = 'old',
            option_set_new          = 'new',
            option_default_old      = 'old',
            option_default_new      = 'new',
            obsolete_option_default = 'new',     -- obsolete since X.Y
            obsolete_set_explicitly = 'new',     -- obsolete since X.Y
    })
...

# 'new' outputs obsolete as 'new'.
tarantool> compat.dump('new')
---
- require('compat')({
            option_set_old          = 'new',
            option_set_new          = 'new',
            option_default_old      = 'new',
            option_default_new      = 'new',
            obsolete_option_default = 'new',     -- obsolete since X.Y
            obsolete_set_explicitly = 'new',     -- obsolete since X.Y
    })
...

# 'old' outputs obsolete options as 'new'.
tarantool> compat.dump('old')
---
- require('compat')({
            option_set_old          = 'old',
            option_set_new          = 'old',
            option_default_old      = 'old',
            option_default_new      = 'old',
            obsolete_option_default = 'new',     -- obsolete since X.Y
            obsolete_set_explicitly = 'new',     -- obsolete since X.Y
    })
...

# 'default' does output obsolete options as default.
tarantool> compat.dump('default')
---
- require('compat')({
            option_set_old          = 'default',
            option_set_new          = 'default',
            option_default_old      = 'default',
            option_default_new      = 'default',
            obsolete_option_default = 'default', -- obsoleted since X.Y
            obsolete_set_explicitly = 'default', -- obsoleted since X.Y
    })
...

Setting all options to a specific value with compat.dump()

Use compat.dump() to get a specific configuration.
Copy and paste it into the console (or use loadstring()).

tarantool> compat.dump('new')
---
- require('compat')({
      option_2 = 'new',
      json_escape_forward_slash = 'new',
  })
...
tarantool> require('compat')({
      option_2 = 'new',
      json_escape_forward_slash = 'new',
  })
---
...

tarantool> compat
---
- - json_escape_forward_slash: new
  - option_2: new
...
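The loadstring() variant can be sketched as follows. It is best run before the initial box.cfg{} call, because some options can only be set at that point.

```lua
local compat = require('compat')

-- compat.dump() returns a string that contains a Lua command.
local command = compat.dump('current')

-- Compile the string into a function and run it to apply the setup.
-- assert() reports a compilation error instead of calling nil.
local apply = assert(loadstring(command))
apply()
```

Passing 'current' keeps the values that are in effect now; use 'new', 'old', or 'default' to produce a command with a uniform setup instead.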

Adding an option during runtime

The user must provide a table with the following fields:

name (string)
default (’new’ / ’old’)
brief (an explanation of the option; can be a multiline string)
obsolete (’X.Y’ / nil): the Tarantool version that marked the option as obsolete. When nil, the option is treated as non-obsolete
action (function): takes a boolean is_new argument and changes the behavior accordingly
run_action_now (true / false / nil): whether add_option() should run the action afterwards; false by default

Option hot reload:

You can change an existing option at runtime using add_option(). It updates all the fields but keeps the currently selected behavior, if any. The new action is called afterwards if run_action_now is true.

tarantool> compat.add_option{
                 name = 'option_4',
                 default = 'new',
                 brief = "<...>",
                 obsolete = nil,          -- you can explicitly mark the option as non-obsolete
                 action = function(is_new)
                      print(("option_4 action was called with is_new = %s!"):format(is_new))
                 end,
                 run_action_now = true
           }
option_4 action was called with is_new = true!
---
...

tarantool> compat.add_option{             -- hot reload of option_4
                 name = 'option_4',
                 default = 'old',         -- different default
                 brief = "<...>",
                 action = function(is_new)
                      print(("new option_4 action was called with is_new = %s!"):format(is_new))
                 end
           }
---
...         -- action is not called by default

Module compress

Enterprise Edition

This module is a part of the Enterprise Edition.

Since: 2.11.0

The compress module provides a set of submodules for compressing and decompressing data using different algorithms: zlib (compress.zlib), zstd (compress.zstd), and lz4 (compress.lz4).

Submodule compress.zlib

Enterprise Edition

This submodule is a part of the Enterprise Edition.

Overview

The compress.zlib submodule provides the ability to compress and decompress data using the zlib algorithm. You can use the zlib compressor as follows:

Create a compressor instance using the compress.zlib.new() function:

local zlib_compressor = require('compress.zlib').new()
-- or --
local zlib_compressor = require('compress').zlib.new()

Optionally, you can pass compression options (zlib_opts) specific for zlib:

local zlib_compressor = require('compress.zlib').new({
    level = 5,
    mem_level = 5,
    strategy = 'filtered'
})

To compress the specified data, use the compress() method:

compressed_data = zlib_compressor:compress('Hello world!')

To decompress data, call decompress():

decompressed_data = zlib_compressor:decompress(compressed_data)

API Reference

Functions
compress.zlib.new()	Create a `zlib` compressor instance.
Objects
zlib_compressor	A `zlib` compressor instance.
zlib_opts	Configuration options of the `zlib` compressor.

compress.zlib.new()

compress.zlib.new([zlib_opts])

Create a zlib compressor instance.

Parameters:	options (`table`) – `zlib` compression options (see zlib_opts)
Return:	a new `zlib` compressor instance (see zlib_compressor)
Rtype:	userdata

Example

local zlib_compressor = require('compress.zlib').new({
    level = 5,
    mem_level = 5,
    strategy = 'filtered'
})

zlib_compressor

object zlib_compressor

A compressor instance that exposes the API for compressing and decompressing data using the zlib algorithm. To create the zlib compressor, call compress.zlib.new().

zlib_compressor:compress(data)

Compress the specified data.

Parameters:	data (`string`) – data to be compressed
Return:	compressed data
Rtype:	string

Example

compressed_data = zlib_compressor:compress('Hello world!')

zlib_compressor:decompress(data)

Decompress the specified data.

Parameters:	data (`string`) – data to be decompressed
Return:	decompressed data
Rtype:	string

Example

decompressed_data = zlib_compressor:decompress(compressed_data)

zlib_opts

object zlib_opts

Configuration options of the zlib_compressor. These options can be passed to the compress.zlib.new() function.

Example

local zlib_compressor = require('compress.zlib').new({
    level = 5,
    mem_level = 5,
    strategy = 'filtered'
})

zlib_opts.level: Specifies the zlib compression level that enables you to adjust the compression ratio and speed. A lower level improves the compression speed at the cost of the compression ratio.

Default: 6

Minimum: 0 (no compression)

Maximum: 9

zlib_opts.mem_level: Specifies how much memory is allocated for the zlib compressor. A larger value improves the compression speed and ratio.

Default: 8

Minimum: 1

Maximum: 9

zlib_opts.strategy

Specifies the compression strategy. The possible values:

default - for normal data.
huffman_only - forces Huffman encoding only (no string match). This is the fastest strategy, but it compresses most data poorly.
filtered - for data produced by a filter or predictor. Filtered data consists mostly of small values with a somewhat random distribution. This compression algorithm is tuned to compress them better.
rle - limits match distances to one (run-length encoding). rle is designed to be almost as fast as huffman_only but gives better compression for PNG image data.
fixed - prevents the use of dynamic Huffman codes and provides a simpler decoder for special applications.
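To see how the strategies behave on your own data, you can compress the same payload with each of them and compare the sizes. The sketch below requires the Enterprise Edition; the sample payload is arbitrary, and the resulting sizes depend entirely on the data.

```lua
-- Sketch for the Enterprise Edition; the sample payload is arbitrary.
local zlib = require('compress.zlib')

local data = string.rep('Tarantool compress example. ', 200)
local strategies = {'default', 'huffman_only', 'filtered', 'rle', 'fixed'}

for _, strategy in ipairs(strategies) do
    local compressor = zlib.new({strategy = strategy})
    local packed = compressor:compress(data)
    -- Every strategy must round-trip to the original data.
    assert(compressor:decompress(packed) == data)
    print(('%-12s %d -> %d bytes'):format(strategy, #data, #packed))
end
```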

Submodule compress.zstd

Enterprise Edition

This submodule is a part of the Enterprise Edition.

Overview

The compress.zstd submodule provides the ability to compress and decompress data using the zstd algorithm. You can use the zstd compressor as follows:

Create a compressor instance using the compress.zstd.new() function:

local zstd_compressor = require('compress.zstd').new()
-- or --
local zstd_compressor = require('compress').zstd.new()

Optionally, you can pass compression options (zstd_opts) specific for zstd:

local zstd_compressor = require('compress.zstd').new({
    level = 5
})

To compress the specified data, use the compress() method:

compressed_data = zstd_compressor:compress('Hello world!')

To decompress data, call decompress():

decompressed_data = zstd_compressor:decompress(compressed_data)

API Reference

Functions
compress.zstd.new()	Create a `zstd` compressor instance.
Objects
zstd_compressor	A `zstd` compressor instance.
zstd_opts	Configuration options of the `zstd` compressor.

compress.zstd.new()

compress.zstd.new([zstd_opts])

Create a zstd compressor instance.

Parameters:	options (`table`) – `zstd` compression options (see zstd_opts)
Return:	a new `zstd` compressor instance (see zstd_compressor)
Rtype:	userdata

Example

local zstd_compressor = require('compress.zstd').new({
    level = 5
})

zstd_compressor

object zstd_compressor

A compressor instance that exposes the API for compressing and decompressing data using the zstd algorithm. To create the zstd compressor, call compress.zstd.new().

zstd_compressor:compress(data)

Compress the specified data.

Parameters:	data (`string`) – data to be compressed
Return:	compressed data
Rtype:	string

Example

compressed_data = zstd_compressor:compress('Hello world!')

zstd_compressor:decompress(data)

Decompress the specified data.

Parameters:	data (`string`) – data to be decompressed
Return:	decompressed data
Rtype:	string

Example

decompressed_data = zstd_compressor:decompress(compressed_data)

zstd_opts

object zstd_opts

Configuration options of the zstd_compressor. These options can be passed to the compress.zstd.new() function.

Example

local zstd_compressor = require('compress.zstd').new({
    level = 5
})

zstd_opts.level: Specifies the zstd compression level that enables you to adjust the compression ratio and speed. A lower level improves the compression speed at the cost of the compression ratio. For example, you can use level 1 if speed is most important and level 22 if size is most important.

Default: 3

Minimum: -131072

Maximum: 22

Note

Assigning 0 to level resets its value to the default (3).
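The trade-off between levels can be checked on a sample of your own data. The sketch below requires the Enterprise Edition and uses an arbitrary payload; the chosen levels (1, 3, and 22) are the ones mentioned above.

```lua
-- Sketch for the Enterprise Edition; the sample payload is arbitrary.
local zstd = require('compress.zstd')
local clock = require('clock')

local data = string.rep('Tarantool compress example. ', 200)

for _, level in ipairs({1, 3, 22}) do
    local compressor = zstd.new({level = level})
    local started = clock.monotonic()
    local packed = compressor:compress(data)
    local elapsed = clock.monotonic() - started
    -- The data must survive the round trip at every level.
    assert(compressor:decompress(packed) == data)
    print(('level %2d: %d -> %d bytes in %.6f s'):format(
        level, #data, #packed, elapsed))
end
```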

Submodule compress.lz4

Enterprise Edition

This submodule is a part of the Enterprise Edition.

Overview

The compress.lz4 submodule provides the ability to compress and decompress data using the lz4 algorithm. You can use the lz4 compressor as follows:

Create a compressor instance using the compress.lz4.new() function:

local lz4_compressor = require('compress.lz4').new()
-- or --
local lz4_compressor = require('compress').lz4.new()

Optionally, you can pass compression options (lz4_opts) specific for lz4:

local lz4_compressor = require('compress.lz4').new({
    acceleration = 1000,
    decompress_buffer_size = 2097152
})

To compress the specified data, use the compress() method:

compressed_data = lz4_compressor:compress('Hello world!')

To decompress data, call decompress():

decompressed_data = lz4_compressor:decompress(compressed_data)

API Reference

Functions
compress.lz4.new()	Create a `lz4` compressor instance.
Objects
lz4_compressor	A `lz4` compressor instance.
lz4_opts	Configuration options of the `lz4` compressor.

compress.lz4.new()

compress.lz4.new([lz4_opts])

Create an lz4 compressor instance.

Parameters:	options (`table`) – `lz4` compression options (see lz4_opts)
Return:	a new `lz4` compressor instance (see lz4_compressor)
Rtype:	userdata

Example

local lz4_compressor = require('compress.lz4').new({
    acceleration = 1000,
    decompress_buffer_size = 2097152
})

lz4_compressor

object lz4_compressor

A compressor instance that exposes the API for compressing and decompressing data using the lz4 algorithm. To create the lz4 compressor, call compress.lz4.new().

lz4_compressor:compress(data)

Compress the specified data.

Parameters:	data (`string`) – data to be compressed
Return:	compressed data
Rtype:	string

Example

compressed_data = lz4_compressor:compress('Hello world!')

lz4_compressor:decompress(data)

Decompress the specified data.

Parameters:	data (`string`) – data to be decompressed
Return:	decompressed data
Rtype:	string

Example

decompressed_data = lz4_compressor:decompress(compressed_data)

lz4_opts

object lz4_opts

Configuration options of the lz4_compressor. These options can be passed to the compress.lz4.new() function.

Example

local lz4_compressor = require('compress.lz4').new({
    acceleration = 1000,
    decompress_buffer_size = 2097152
})

lz4_opts.acceleration: Specifies the acceleration factor that enables you to adjust the compression ratio and speed. A higher acceleration factor increases the compression speed but decreases the compression ratio.

Default: 1

Minimum: 1

Maximum: 65537

lz4_opts.decompress_buffer_size: Specifies the decompress buffer size (in bytes). If the size of decompressed data is larger than this value, the compressor returns an error on decompression.

Default: 1048576

Module config

Since: 3.0.0

The config module provides the ability to work with an instance’s configuration. For example, you can determine whether the current instance is up and running without errors after applying the cluster’s configuration.

By using the config.storage role, you can set up a Tarantool-based centralized configuration storage and interact with this storage using the config module API.

Loading config

To load the config module, use the require() directive:

local config = require('config')

Then, you can access its API:

config:reload()

API Reference

config API
config:get()	Get a configuration applied to the current or remote instance
config:info()	Get the current instance’s state in regard to configuration
config:instance_uri()	Get a URI of the current or remote instance
config:instances()	List all instances of the cluster
config:reload()	Reload the current instance’s configuration
config.storage API
config.storage.put()	Put a value by the specified path
config.storage.get()	Get a value stored by the specified path
config.storage.delete()	Delete a value stored by the specified path
config.storage.info()	Get information about an instance’s connection state
config.storage.txn()	Make an atomic request

config API

object config

config:get([param, opts])

Get a configuration applied to the current or remote instance. Note the following differences between getting a configuration for the current and remote instance:

For the current instance, get() returns its configuration considering environment variables.
For a remote instance, get() only considers a cluster configuration and ignores environment variables.

Parameters:

param (string) –
a configuration option name
opts (table) –
options to pass. The following options are available:
- instance (since 3.1.0) – the name of a remote instance whose configuration should be obtained
Return:	an instance configuration

Examples:

The example below shows how to get the full instance configuration:

app:instance001> require('config'):get()
---
- fiber:
    io_collect_interval: null
    too_long_threshold: 0.5
    top:
      enabled: false
  # Other configuration values
  # ...

This example shows how to get an iproto.listen option value:

app:instance001> require('config'):get('iproto.listen')
---
- - uri: 127.0.0.1:3301
...

config:get() can also be used in application code to get the value of a custom configuration option.
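The calls below sketch how this might look in application code. The instance name instance002 and the option name greeting are hypothetical, and the sketch assumes that custom application options are placed under app.cfg in the cluster configuration.

```lua
local config = require('config')

-- A built-in option addressed by its full path.
local listen = config:get('iproto.listen')

-- The same option of a remote instance (since 3.1.0).
-- 'instance002' is a hypothetical instance name.
local remote_listen = config:get('iproto.listen', {instance = 'instance002'})

-- Assumption: custom application options live under app.cfg;
-- 'greeting' is a hypothetical option name.
local greeting = config:get('app.cfg.greeting')
```

Remember that for a remote instance only the cluster configuration is considered, so a value overridden by an environment variable on that instance is not reflected in the result.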

config:info([version])

Get the current instance’s state in regard to configuration.

Parameters:

version (string) –
(since 3.1.0) the version of the information that should be returned. The version argument can be one of the following values:
- v1 (default): the meta field returned by info() includes information about the last loaded configuration
- v2: the meta field returned by info() includes two fields:
  - the last field includes information about the last loaded configuration
  - the active field includes information for the last successfully applied configuration

Return:

a table containing an instance’s state. The returned state includes the following sections:

status – one of the following statuses:

ready – the configuration is applied successfully

check_warnings – the configuration is applied with warnings

check_errors – the configuration cannot be applied due to configuration errors

meta – additional configuration information

alerts – warnings or errors raised on an attempt to apply the configuration

Since version 3.3.0

hierarchy – a table showing the names of the group, the replica set, and the instance itself.

These names are taken directly from the --name CLI option (or the TT_INSTANCE_NAME environment variable) and the cluster configuration. This means they are always present if the YAML configuration flow is in use, regardless of the database status (whether upgraded, writable or not).

Below are a few examples demonstrating how the info() output might look.

Example: no configuration warnings or errors

In the example below, an instance’s state is ready and no warnings are shown:

app:instance001> require('config'):info('v2')
---
- status: ready
  meta:
    last: &0 []
    active: *0
  alerts: []
  hierarchy:
    group: group-001
    replicaset: replicaset-001
    instance: instance-001
...

Example: configuration warnings

In the example below, the instance’s state is check_warnings. The alerts section informs that privileges to the bands space for sampleuser cannot be granted because the bands space has not been created yet:

app:instance001> require('config'):info('v2')
---
- status: check_warnings
  meta:
    last: &0 []
    active: *0
  alerts:
  - type: warn
    message: box.schema.user.grant("sampleuser", "read,write", "space", "bands") has
      failed because either the object has not been created yet, a database schema
      upgrade has not been performed, or the privilege write has failed (separate
      alert reported)
    timestamp: 2024-07-03T18:09:18.826138+0300
  hierarchy:
    group: group-001
    replicaset: replicaset-001
    instance: instance-001
...

This warning is cleared when the bands space is created.

Example: configuration errors

In the example below, the instance’s state is check_errors. The alerts section informs that the log.level configuration option has an incorrect value:

app:instance001> require('config'):info('v2')
---
- status: check_errors
  meta:
    last: []
    active: []
  alerts:
  - type: error
    message: '[cluster_config] log.level: Got 8, but only the following values are
      allowed: 0, fatal, 1, syserror, 2, error, 3, crit, 4, warn, 5, info, 6, verbose,
      7, debug'
    timestamp: 2024-07-03T18:13:19.755454+0300
  hierarchy:
    group: group-001
    replicaset: replicaset-001
    instance: instance-001
...

Example: configuration errors (centralized configuration storage)

In this example, the meta field includes information about a centralized storage the instance takes a configuration from:

app:instance001> require('config'):info('v2')
---
- status: check_errors
  meta:
    last:
      etcd:
        mod_revision:
          /myapp/config/all: 5
        revision: 5
    active:
      etcd:
        mod_revision:
          /myapp/config/all: 2
        revision: 4
  alerts:
  - type: error
    message: 'etcd source: invalid config at key "/myapp/config/all": [cluster_config]
      groups.group001.replicasets.replicaset001.instances.instance001.log.level: Got
      8, but only the following values are allowed: 0, fatal, 1, syserror, 2, error,
      3, crit, 4, warn, 5, info, 6, verbose, 7, debug'
    timestamp: 2024-07-03T15:22:06.438275Z
  hierarchy:
    group: group001
    replicaset: replicaset001
    instance: instance001
...
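
The status and alerts fields shown in these examples lend themselves to simple health checks. The following is a minimal sketch, assuming the output shape shown above. summarize_alerts is a hypothetical helper, not part of the config module, and the sample table (with a shortened message) stands in for the result of require('config'):info('v2'):

```lua
-- Hypothetical helper: count alerts by type in a table shaped like
-- the config:info('v2') output shown above.
local function summarize_alerts(info)
    local counts = { warn = 0, error = 0 }
    for _, alert in ipairs(info.alerts or {}) do
        counts[alert.type] = (counts[alert.type] or 0) + 1
    end
    return {
        status = info.status,
        warnings = counts.warn,
        errors = counts.error,
        -- Treat the instance as healthy only if it is ready and has no errors.
        healthy = info.status == 'ready' and counts.error == 0,
    }
end

-- Sample data mirroring the "configuration warnings" example.
-- On a running instance, use: local info = require('config'):info('v2')
local sample_info = {
    status = 'check_warnings',
    alerts = {
        { type = 'warn', message = 'grant has failed' },
    },
}

local summary = summarize_alerts(sample_info)
print(summary.status, summary.warnings, summary.errors, summary.healthy)
```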

config:instance_uri([uri_type, opts])

Since: 3.1.0

Get a URI of the current or remote instance.

Parameters:

uri_type (string) –
a URI type. The following URI types are supported:
- peer – a URI used to advertise the instance to other cluster members. See also: iproto.advertise.peer.
- sharding – a URI used to advertise the current instance to a router and rebalancer. See also: iproto.advertise.sharding.
opts (table) –
options to pass. The following options are available:
- instance – the name of a remote instance whose URI should be obtained

Return:

a table representing an instance URI. This table might include the following fields:

uri – an instance URI
login – a username used to connect to this instance
password – a user’s password
params – URI parameters used to connect to this instance

Note

The resulting URI object can be passed to the connect() function of the net.box module.

Example

The example below shows how to get a URI used to advertise storage-b-003 to other cluster members:

local config = require('config')
config:instance_uri('peer', { instance = 'storage-b-003' })

config:instances()

Since: 3.1.0

List all instances of the cluster.

Return:

a table containing information about instances. The returned table uses instance names as the keys and contains the following information for each instance:

instance_name – an instance name
replicaset_name – the name of a replica set the instance belongs to
group_name – the name of a group the instance belongs to

Example

The example below shows how to use instances() to get the names of all instances in the cluster, create a connection to each instance using the connpool module, and log connection URIs using the log module:

local config = require('config')
local connpool = require('experimental.connpool')
local log = require('log')

for instance_name in pairs(config:instances()) do
    local conn = connpool.connect(instance_name)
    log.info("Connection URI for %q: %s:%s", instance_name, conn.host, conn.port)
end

In this example, the same actions are performed for instances from the specified replica set:

local config = require('config')
local connpool = require('experimental.connpool')
local log = require('log')

for instance_name, def in pairs(config:instances()) do
    if def.replicaset_name == 'storage-b' then
        local conn = connpool.connect(instance_name)
        log.info("Connection URI for %q: %s:%s", instance_name, conn.host, conn.port)
    end
end
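
Because the returned table is keyed by instance name and each entry carries replicaset_name and group_name, it can also be regrouped with plain Lua. The sketch below is an illustration only: group_by_replicaset is a hypothetical helper, and the sample table with its instance, replica set, and group names stands in for the result of config:instances():

```lua
-- Hypothetical helper: group instance names by replica set.
-- `instances` is expected to have the shape returned by config:instances().
local function group_by_replicaset(instances)
    local groups = {}
    for instance_name, def in pairs(instances) do
        local rs = def.replicaset_name
        groups[rs] = groups[rs] or {}
        table.insert(groups[rs], instance_name)
    end
    for _, names in pairs(groups) do
        -- pairs() iteration order is unspecified, so sort for stable output.
        table.sort(names)
    end
    return groups
end

-- Sample data standing in for require('config'):instances().
local sample_instances = {
    ['storage-a-001'] = { instance_name = 'storage-a-001', replicaset_name = 'storage-a', group_name = 'storages' },
    ['storage-b-001'] = { instance_name = 'storage-b-001', replicaset_name = 'storage-b', group_name = 'storages' },
    ['storage-b-002'] = { instance_name = 'storage-b-002', replicaset_name = 'storage-b', group_name = 'storages' },
}

local groups = group_by_replicaset(sample_instances)
print(table.concat(groups['storage-b'], ', '))
```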

config:reload()

Reload the current instance’s configuration. Below are a few use cases when this function can be used:

A configuration option value specific to this instance is changed in a cluster’s configuration.
A new instance is added to a replica set.
A centralized configuration with turned-off configuration reloading is updated. Learn more at Reloading configuration.

config.storage API

The config.storage API allows you to interact with a Tarantool-based centralized configuration storage.

config.storage.put(path, value)

Put a value by the specified path.

Parameters:

path (string) – a path to put the value by
value (string) – a value to put

Return:

a table containing the following fields:

revision: a revision after performing the operation

Rtype:

table

Example:

The example below shows how to read a configuration stored in the source.yaml file using the fio module API and put this configuration by the /myapp/config/all path:

local fio = require('fio')
local cluster_config_handle = fio.open('../../source.yaml')
local cluster_config = cluster_config_handle:read()
local response = config.storage.put('/myapp/config/all', cluster_config)
cluster_config_handle:close()

Example on GitHub: tarantool_config_storage

config.storage.get(path)

Get a value stored by the specified path or prefix.

Parameters:

path (string) – a path or prefix to get a value by; prefixes end with /

Return:

a table containing the following fields:

data: a table containing the information about the value:
- path: a path
- mod_revision: the last revision at which this value was modified
- value: a value
revision: a revision after performing the operation

Rtype:

table

Examples:

The example below shows how to get a configuration stored by the /myapp/config/all path:

local response = config.storage.get('/myapp/config/all')

This example shows how to get all configurations stored by the /myapp/ prefix:

local response = config.storage.get('/myapp/')

Example on GitHub: tarantool_config_storage
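
A response from get() can be post-processed with ordinary Lua table code. The sketch below assumes that response.data is an array-like table of entries with the path, mod_revision, and value fields listed above. values_by_path and the sample response (including the /myapp/config/extra path) are hypothetical and only illustrate that shape:

```lua
-- Hypothetical helper: index the returned values by their paths.
-- Assumes response.data is an array-like table of { path, mod_revision, value } entries.
local function values_by_path(response)
    local result = {}
    for _, entry in ipairs(response.data or {}) do
        result[entry.path] = entry.value
    end
    return result
end

-- Sample data standing in for config.storage.get('/myapp/').
local sample_response = {
    data = {
        { path = '/myapp/config/all', mod_revision = 2, value = 'v1' },
        { path = '/myapp/config/extra', mod_revision = 3, value = 'v2' },
    },
    revision = 3,
}

local values = values_by_path(sample_response)
print(values['/myapp/config/all'])
```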

config.storage.delete(path)

Delete a value stored by the specified path or prefix.

Parameters:

path (string) – a path or prefix to delete the value by; prefixes end with /

Return:

a table containing the following fields:

data: a table containing the information about the value:
- path: a path
- mod_revision: the last revision at which this value was modified
- value: a value
revision: a revision after performing the operation

Rtype:

table

Examples:

The example below shows how to delete a configuration stored by the /myapp/config/all path:

local response = config.storage.delete('/myapp/config/all')

In this example, all configurations are deleted:

local response = config.storage.delete('/')

Example on GitHub: tarantool_config_storage

config.storage.info()

Get information about an instance’s connection state.

Return:

a table containing the following fields:

status: a connection status, which can be one of the following:
- connected: if any instance from the quorum is available to the current instance
- disconnected: if the current instance doesn’t have a connection with the quorum

Rtype:

table

config.storage.txn(request)

Make an atomic request.

Parameters:

request (table) –
a table containing the following optional fields:
- predicates: a list of predicates to check. Each predicate is a list that contains:
```
{target, operator, value[, path]}
```
  - target – one of the following string values: revision, mod_revision, value, count
  - operator – a string value: eq, ne, gt, lt, ge, le, or one of their symbolic equivalents, for example, ==, !=, >
  - value – an unsigned or string value to compare
  - path (optional) – a string value: can be a path with the mod_revision and value target or a path/prefix with the count target
- on_success: a list with operations to execute if all predicates in the list evaluate to true
- on_failure: a list with operations to execute if any of the predicates evaluates to false

Return:

a table containing the following fields:

data: a table containing response data:
- responses: the list of responses for all operations
- is_success: a boolean value indicating whether all predicates evaluated to true
revision: a revision after performing the operation

Rtype:

table

Example:

local response = config.storage.txn({
    predicates = { { 'value', '==', 'v0', '/myapp/config/all' } },
    on_success = { { 'put', '/myapp/config/all', 'v1' } },
    on_failure = { { 'get', '/myapp/config/all' } }
})

Example on GitHub: tarantool_config_storage
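
The request above follows a compare-and-swap pattern: write the new value only if the stored value still equals the expected one, otherwise read the current value. A minimal sketch of a request builder for this pattern follows. make_cas_request is a hypothetical helper that only builds the request table in the documented format and does not call the storage itself:

```lua
-- Hypothetical helper: build a compare-and-swap request for config.storage.txn().
local function make_cas_request(path, expected, new_value)
    return {
        -- Succeeds only if the value stored by `path` equals `expected`.
        predicates = { { 'value', '==', expected, path } },
        on_success = { { 'put', path, new_value } },
        on_failure = { { 'get', path } },
    }
end

local request = make_cas_request('/myapp/config/all', 'v0', 'v1')
-- With access to the config.storage API:
--   local response = config.storage.txn(request)
--   response.data.is_success tells whether the predicate held.
print(request.predicates[1][1], request.on_success[1][1], request.on_failure[1][1])
```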

Submodule experimental.config.utils.schema


Since: 3.2.0

The experimental.config.utils.schema module is used to validate and process parts of cluster configurations that have arbitrary user-defined structures:

app.cfg for applications loaded using the app option
roles_cfg for custom roles developed as a part of a cluster application

The module provides an API to get and set configuration values, filter and transform configuration data, and so on.

Important

experimental.config.utils.schema is an experimental submodule and is subject to changes.

Getting started with config.utils.schema

As an example, consider an application role that has a single configuration option – an HTTP endpoint address.

roles: [ http_api ]
roles_cfg:
  http_api: 'http://127.0.0.1:8080'

This is how you can use the experimental.config.utils.schema module to process the role configuration:

Load the module:

local schema = require('experimental.config.utils.schema')

Define a schema – the root object that stores information about the role’s configuration – using schema.new(). The example below shows a schema that includes a single string option:
```
local http_api_schema = schema.new('http_api', schema.scalar({ type = 'string' }))
```
Learn more in Defining a schema.
Use the validate() method of the schema object to validate configuration values against the schema. In case of a role, call this method inside the role’s validate() function:
```
local function validate(cfg)
    http_api_schema:validate(cfg)
end
```
Learn more in Validating configuration.
Refer to values of configuration options using the get() method inside the role’s apply() function. Learn more in Getting configuration values.

Defining a schema

A configuration schema stores information about a user-defined configuration structure that can be passed inside an app.cfg or a roles_cfg section. It includes option names, types, hierarchy, and other aspects of a configuration.

To create a schema, use the schema.new() function. It has the following arguments:

Schema name – an arbitrary string to use as an identifier.
Root schema node – a table describing the hierarchical schema structure starting from the root.
(Optional) methods – user-defined functions that can be called on this schema object.

Schema nodes

Schema nodes describe the hierarchy of options within a schema. There are two types of schema nodes:

Scalar nodes hold a single value of a supported primitive type. For example, a string configuration option of a role is a scalar node in its schema.
Composite nodes include multiple values in different forms: records, arrays, or maps.

A node can have annotations – named attributes that enable customization of its behavior, for example, setting a default value.

Scalar nodes

Scalar nodes hold a single value of a primitive type, for example, a string or a number. For the full list of supported scalar types, see Data types.

This configuration has one scalar node of the string type:

roles: [ http_api ]
roles_cfg:
  http_api: 'http://127.0.0.1:8080'

To define a scalar node in a schema, use schema.scalar(). The following schema can be used to process the configuration shown above:

local http_api_schema = schema.new('http_api', schema.scalar({ type = 'string' }))

If a scalar node has a limited set of allowed values, you can also define it with schema.enum(). Pass the list of allowed values as its argument:

scheme = schema.enum({ 'http', 'https' }),

Note

Another way to restrict possible option values is the allowed_values built-in annotation.

Data types

Scalar nodes can have the following data types:

Scalar type	Lua type	Comment
`string`	`string`
`number`	`number`
`integer`	`number`	Only integer numbers
`boolean`	`boolean`	`true` or `false`
`string, number` or `number, string`	`string` or `number`
`any`	Arbitrary Lua value	May be used to declare an arbitrary value that doesn’t need validation.
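
To make the mapping between scalar types and Lua types concrete, here is an illustrative model of the table above in plain Lua. It is not the module's implementation: matches_scalar_type is a hypothetical function, and it deliberately ignores details such as 64-bit integer cdata values:

```lua
-- Illustrative model of the scalar type table; not the module's implementation.
local function matches_scalar_type(value, scalar_type)
    if scalar_type == 'any' then
        return true
    elseif scalar_type == 'integer' then
        -- Only numbers without a fractional part are accepted.
        return type(value) == 'number' and value == math.floor(value)
    elseif scalar_type == 'string, number' or scalar_type == 'number, string' then
        return type(value) == 'string' or type(value) == 'number'
    elseif scalar_type == 'string' or scalar_type == 'number' or scalar_type == 'boolean' then
        -- These scalar types map directly to Lua types.
        return type(value) == scalar_type
    end
    error('unknown scalar type: ' .. tostring(scalar_type))
end

print(matches_scalar_type(8080, 'integer'))
print(matches_scalar_type(80.5, 'integer'))
print(matches_scalar_type('8080', 'string, number'))
```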

Records

A record is a composite node that includes a predefined set of other nodes, scalar or composite. In YAML, a record is represented as a node with nested fields. For example, the following configuration has a record node http_api with three scalar fields:

roles: [ http_api ]
roles_cfg:
  http_api:
    host: '127.0.0.1'
    port: 8080
    scheme: 'http'

To define a record node in a schema, use schema.record(). The following schema describes the configuration above:

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.enum({ 'http', 'https' }),
    host = schema.scalar({ type = 'string' }),
    port = schema.scalar({ type = 'integer' })
}))

Records are also used to define nested schema nodes of non-primitive types. In the example below, the http_api node includes another record listen_address.

roles: [ http_api ]
roles_cfg:
  http_api:
    listen_address:
      host: '127.0.0.1'
      port: 8080
      scheme: 'http'

The following schema describes this configuration:

local listen_address_schema = schema.new('listen_address', schema.record({
    listen_address = schema.record({
        scheme = schema.enum({ 'http', 'https' }),
        host = schema.scalar({ type = 'string' }),
        port = schema.scalar({ type = 'integer' })
    })
}))

Arrays

An array is a composite node type that includes a collection of items of the same type. The items can be either scalar or composite nodes.

In YAML, array items start with hyphens. For example, the following configuration includes an array named http-api. Each of its items is a record with three fields: host, port, and scheme:

roles: [ http-api ]
roles_cfg:
  http-api:
  - host: '127.0.0.1'
    port: 8080
    scheme: 'http'
  - host: '127.0.0.1'
    port: 8443
    scheme: 'https'

To create an array node in a schema, use schema.array(). The following schema describes this configuration:

local listen_address_schema = schema.new('listen_address', schema.array({
    items = schema.record({
        scheme = schema.enum({ 'http', 'https' }),
        host = schema.scalar({ type = 'string' }),
        port = schema.scalar({ type = 'integer' })
    })
}))

There is also the schema.set() function that enables creating arrays with a limited set of allowed items.

Maps

A map is a composite node type that holds an arbitrary number of key-value pairs of predefined types.

In YAML, a map is represented as a node with nested fields. For example, the following configuration has the endpoints node:

roles: [ http_api ]
roles_cfg:
  http_api:
    host: '127.0.0.1'
    port: 8080
    scheme: 'http'
    endpoints:
      user: true
      order: true
      customer: false

To create a map node in a schema, use schema.map(). If this node is declared as a map as shown below, the endpoints section can include any number of options with arbitrary names and boolean values.

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.enum({ 'http', 'https' }),
    host = schema.scalar({ type = 'string' }),
    port = schema.scalar({ type = 'integer' }),
    endpoints = schema.map({ key = schema.scalar({ type = 'string' }),
                             value = schema.scalar({ type = 'boolean' }) })
}))

Annotations

Node annotations are named attributes that define various aspects of a node. For example, scalar nodes have a required annotation type that defines the node value type. Other annotations can, for example, set a node’s default value and a validation function, or store arbitrary user-provided data.

Annotations are passed in a table to the node creation function:

scheme = schema.scalar({
    type = 'string',
    allowed_values = { 'http', 'https' },
    default = 'http',
}),

Node annotations fall into three groups:

Built-in annotations are handled by the module. These are: type, validate, allowed_values, default and apply_default_if. Note that validate and allowed_values are used for validation only. default and apply_default_if can transform the configuration.
User-defined annotations add named node attributes that can be used in the application or role code.
Computed annotations allow access to annotations of other nodes throughout the schema.

Built-in annotations

Built-in annotations are interpreted by the module itself. There are the following built-in annotations:

type – the node value type. The type must be explicitly specified for scalar nodes, except for those created with schema.enum(). For composite nodes and scalar enums, the corresponding constructors schema.record(), schema.map(), schema.array(), schema.set(), and schema.enum() set the type automatically.
allowed_values – (optional) a list of possible node values.
validate – (optional) a validation function for the provided node value.
default – (optional) a value to use if the option is not specified in the configuration.
apply_default_if – (optional) a function that defines when to apply the default value.

Consider the following role configuration:

roles: [ http_api ]
roles_cfg:
  http_api:
    host: '127.0.0.1'
    port: 8080
    scheme: 'http'

The following schema uses built-in annotations default, allowed_values, and validate to define default and allowed option values and validation functions:

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.scalar({
        type = 'string',
        allowed_values = { 'http', 'https' },
        default = 'http',
    }),
    host = schema.scalar({
        type = 'string',
        validate = validate_host,
        default = '127.0.0.1',
    }),
    port = schema.scalar({
        type = 'integer',
        validate = validate_port,
        default = 8080,
    }),
}))

Validation functions can look as follows:

local function validate_host(host, w)
    local host_pattern = "^(%d+)%.(%d+)%.(%d+)%.(%d+)$"
    if not host:match(host_pattern) then
        w.error("'host' should be a string containing a valid IP address, got %q", host)
    end
end

local function validate_port(port, w)
    if port < 1 or port > 65535 then
        w.error("'port' should be between 1 and 65535, got %d", port)
    end
end
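
The host pattern above only checks for four dot-separated groups of digits, so a value such as 999.0.0.1 passes it. If stricter checking is needed, each octet can be range-checked as well. The sketch below is one possible variant, not part of the module: validate_host_strict is a hypothetical name, and w_stub is a stand-in that only mimics w.error() by raising a formatted Lua error so that the function can be tried outside a schema:

```lua
-- Hypothetical stricter variant: also checks that each octet is within 0..255.
local function validate_host_strict(host, w)
    local a, b, c, d = host:match('^(%d+)%.(%d+)%.(%d+)%.(%d+)$')
    if a == nil then
        w.error("'host' should be a string containing a valid IP address, got %q", host)
    end
    for _, octet in ipairs({ a, b, c, d }) do
        if tonumber(octet) > 255 then
            w.error("'host' has an octet out of range: %q", host)
        end
    end
end

-- Stand-in for the object passed to validators as the second argument.
local w_stub = {
    error = function(fmt, ...)
        error(string.format(fmt, ...), 0)
    end,
}

local ok_valid = pcall(validate_host_strict, '127.0.0.1', w_stub)
local ok_invalid, err = pcall(validate_host_strict, '999.0.0.1', w_stub)
print(ok_valid, ok_invalid, err)
```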

User-defined annotations

A schema node can have user-defined annotations with arbitrary names. Such annotations are used to implement custom behavior. You can get their names and values from the schema and use them in the role or application code.

Example: the env user-defined annotation is used to provide names of environment variables from which the configuration values can be taken.

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.enum({ 'http', 'https' }, { env = 'HTTP_SCHEME' }),
    host = schema.scalar({ type = 'string', env = 'HTTP_HOST' }),
    port = schema.scalar({ type = 'integer', env = 'HTTP_PORT' })
}))

See the full sample here: Parsing environment variables.

Computed annotations

Computed annotations enable access from a node to annotations of its ancestor nodes.

In the example below, the listen_address record validation function refers to the protocol annotation of its ancestor node:

local listen_address = schema.record({
    scheme = schema.enum({ 'http', 'https' }),
    host = schema.scalar({ type = 'string' }),
    port = schema.scalar({ type = 'integer' })
}, {
    validate = function(data, w)
        local protocol = w.schema.computed.annotations.protocol
        if protocol == 'iproto' and data.scheme ~= nil then
            w.error("iproto doesn't support 'scheme'")
        end
    end,
})

Note

If there are several ancestor nodes with this annotation, its value is taken from the closest one to the current node.

The following schema with listen_address passes the validation:

local http_listen_address_schema = schema.new('http_listen_address', schema.record({
    name = schema.scalar({ type = 'string' }),
    listen_address = listen_address,
}, {
    protocol = 'http',
}))

If this record is added to a schema with protocol = 'iproto', the listen_address validation fails with an error:

local iproto_listen_address_schema = schema.new('iproto_listen_address', schema.record({
    name = schema.scalar({ type = 'string' }),
    listen_address = listen_address,
}, {
    protocol = 'iproto',
}))

User-defined methods

A schema can implement custom logic with methods – user-defined functions that can be called on this schema.

For example, this schema has the format method that returns its fields merged in a URI string:

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.enum({ 'http', 'https' }),
    host = schema.scalar({ type = 'string' }),
    port = schema.scalar({ type = 'integer' })
}), {
    methods = {
        format = function(_self, url)
            return string.format('%s://%s:%d', url.scheme, url.host, url.port)
        end
    }
})

Processing configuration data

Validating configuration

The schema object’s validate() method performs all the necessary checks on the provided configuration. It validates the configuration structure, node types, allowed values, and other aspects of the schema.

When writing roles, call this function inside the role validation function:

local function validate(cfg)
    listen_address_schema:validate(cfg)
end

Getting configuration values

To get configuration values, use the schema object’s get() method. It takes the configuration and the full path to the node as arguments:

local function apply(cfg)
    local scheme = listen_address_schema:get(cfg, 'listen_address.scheme')
    local host = listen_address_schema:get(cfg, 'listen_address.host')
    local port = listen_address_schema:get(cfg, 'listen_address.port')
    log.info("HTTP API endpoint: %s://%s:%d", scheme, host, port)
end

Transforming configuration

The schema object has methods that transform configuration data based on the schema, for example, apply_default(), merge(), set().

The following sample shows how to apply default values from the schema to fill missing configuration fields:

local function apply(cfg)
    local cfg_with_defaults = listen_address_schema:apply_default(cfg)
    local scheme = listen_address_schema:get(cfg_with_defaults, 'scheme')
    local host = listen_address_schema:get(cfg_with_defaults, 'host')
    local port = listen_address_schema:get(cfg_with_defaults, 'port')
    log.info("HTTP API endpoint: %s://%s:%d", scheme, host, port)
end

Parsing environment variables

The schema.fromenv() function allows getting configuration values from environment variables. The example below shows how to do this by adding a user-defined annotation env:

local listen_address_schema = schema.new('listen_address', schema.record({
    scheme = schema.enum({ 'http', 'https' }, { env = 'HTTP_SCHEME' }),
    host = schema.scalar({ type = 'string', env = 'HTTP_HOST' }),
    port = schema.scalar({ type = 'integer', env = 'HTTP_PORT' })
}))

local function collect_env_cfg()
    local res = {}
    for _, w in listen_address_schema:pairs() do
        local env_var = w.schema.env
        if env_var ~= nil then
            local value = schema.fromenv(env_var, os.getenv(env_var), w.schema)
            listen_address_schema:set(res, w.path, value)
        end
    end
    return res
end

The function also uses schema object methods:

pairs() to iterate over the schema nodes.
set() to assign configuration values.

API Reference

Functions
schema.array()	Create an array node
schema.enum()	Create an enum scalar node
schema.fromenv()	Parse a value from an environment variable
schema.map()	Create a map node
schema.new()	Create a schema
schema.record()	Create a record node
schema.scalar()	Create a scalar node
schema.set()	Create a set array node
schema_object
schema_object:apply_default()	Apply default values
schema_object:filter()	Filter schema nodes
schema_object:get()	Get specified configuration data
schema_object:map()	Transform configuration data
schema_object:merge()	Merge two configurations
schema_object:pairs()	Walk over a configuration
schema_object:set()	Set a configuration value
schema_object:validate()	Validate a configuration against a schema
schema_object.methods	User-defined methods
schema_object.name	Schema name
schema_object.schema	Schema nodes hierarchy
schema_node_annotation
allowed_values	Allowed node values
apply_default_if	Condition to apply defaults
default	Default node value
type	Value type
validate	Validation function
schema_node_object
schema_node_object.allowed_values	Allowed node values
schema_node_object.apply_default_if	Condition to apply defaults
schema_node_object.computed	Computed annotations
schema_node_object.default	Default value
schema_node_object.fields	Record node fields
schema_node_object.items	Array node items
schema_node_object.type	Scalar node type
schema_node_object.validate	Validation function

Functions

schema.array(array_def)

Create an array node of a configuration schema.

Parameters:

array_def (table) –
a table in the following format:
```
{
    items = <schema_node_object>,
    <..annotations..>
}
```
See also: schema_node_object, schema_node_annotation.

Return:

the created array node as a table with the following fields:

type: array
items: a table describing an array item as a schema node
annotations, if provided in array_def

Rtype:

table

See also: Arrays

schema.enum(allowed_values, annotations)

A shortcut for creating a string scalar node with a limited set of allowed values.

Parameters:

allowed_values (table) – a list of enum members – values allowed for the node
annotations (table) – annotations (see schema_node_annotation)

Return:

the created scalar node as a table with the following fields:

type: string
allowed_values: allowed node values
annotations, if annotations is provided

Rtype:

table

See also: Scalar nodes

schema.fromenv(env_var_name, raw_value, schema_node)

Parse an environment variable as a value of the given schema node. The env_var_name parameter is used only for error messages. The value (raw_value) should be received using os.getenv() or os.environ().

How the raw value is parsed depends on the schema_node type:

Scalar:
- string: return the value as is
- number or integer: parse the value as a number or an integer
- string, number: attempt to parse as a number; in case of a failure return the value as is
- boolean: accept true and false (case-insensitively), or 1 and 0 for true and false values respectively
- any: parse the value as JSON
Map: parse either as JSON (if the raw value starts with {) or as a comma-separated string of key=value pairs: key1=value1,key2=value2
Array: parse either as JSON (if the raw value starts with [) or as a comma-separated string of items: item1,item2,item3
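
The rules above can be modeled in plain Lua to show what each type does with a raw string. This is an illustrative sketch, not the implementation of schema.fromenv(): parse_scalar and parse_array are hypothetical names, JSON parsing (the any type and the JSON forms of maps and arrays) is left out, and array items are kept as strings:

```lua
-- Illustrative model of the scalar parsing rules; not schema.fromenv() itself.
local function parse_scalar(raw_value, scalar_type)
    if scalar_type == 'string' then
        return raw_value
    elseif scalar_type == 'number' or scalar_type == 'integer' then
        local n = tonumber(raw_value)
        if n == nil or (scalar_type == 'integer' and n ~= math.floor(n)) then
            error(('cannot parse %q as %s'):format(raw_value, scalar_type), 0)
        end
        return n
    elseif scalar_type == 'string, number' then
        -- Try a number first; fall back to the raw string.
        return tonumber(raw_value) or raw_value
    elseif scalar_type == 'boolean' then
        local lowered = raw_value:lower()
        if lowered == 'true' or lowered == '1' then return true end
        if lowered == 'false' or lowered == '0' then return false end
        error(('cannot parse %q as boolean'):format(raw_value), 0)
    end
    error('unsupported scalar type: ' .. tostring(scalar_type), 0)
end

-- Comma-separated array form: item1,item2,item3.
local function parse_array(raw_value)
    local items = {}
    for item in raw_value:gmatch('[^,]+') do
        table.insert(items, item)
    end
    return items
end

print(parse_scalar('8080', 'integer'), parse_scalar('TRUE', 'boolean'))
print(table.concat(parse_array('item1,item2,item3'), ' '))
```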

Note

Parsing records from environment variables is not supported.

Parameters:

env_var_name (string) – environment variable name to use for error messages
raw_value (string) – environment variable value
schema_node (schema_node_object) – a schema node (see schema_node_object)

Return:

the parsed value

Rtype:

table

See also: Parsing environment variables

schema.map(map_def)

Create a map node of a configuration schema.

Parameters:

map_def (table) –
a table in the following format:
{ key = <schema_node_object>, value = <schema_node_object>, <..annotations..> }
See also: schema_node_object, schema_node_annotation.

Return:

the created map node as a table with the following fields:

type: map
key: map key type
value: map value type
annotations, if provided in map_def

Rtype:

table

See also: Maps

schema.new(schema_name, schema_node[, { methods = <...> }])

Create a schema object.

Parameters:

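schema_name (string) – a schema name
schema_node (table) – a root schema node
methods (table) – (optional) a table of user-defined methods, passed as { methods = <...> } (see User-defined methods)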

Return:

a new schema object (see schema_object) as a table with the following fields:

name: the schema name

schema: a table with schema nodes

methods: a table with user-provided methods

Rtype:

table

schema.record(fields[, annotations])

Create a record node of a configuration schema.

Parameters:

fields (table) –
a table of fields in the following format:
```
{
    [<field_name>] = <schema_node_object>,
    <...>
}
```
See also: schema_node_object.
annotations (table) – annotations (see Annotations)

Return:

the created record node as a table with the following fields:

type: record
fields: a table describing the record fields
annotations, if provided

Rtype:

table

See also: Records

schema.scalar(scalar_def)

Create a scalar node of a configuration schema.

Parameters:

scalar_def (table) –
a table in the following format:
{ type = <scalar_type>, <..annotations..> }
See also: schema_node_object, schema_node_annotation.
type (string) – data type (see Data types)
annotations (table) – annotations (see Annotations)

Return:

the created scalar node as a table with the following fields:

type: the node type (see Data types)
annotations, if provided

Rtype:

table

See also: Scalar nodes

schema.set(allowed_values, annotations)

Shortcut for creating an array node of unique string values from the given list of allowed values.

Parameters:

allowed_values (table) – allowed values of array items
annotations (table) – annotations (see Annotations)

Return:

the created array node as a table with the following fields:

type: array
items: a table describing an array item as a schema node
validate: an auto-generated validation function that checks that the values don’t repeat
annotations, if provided

Rtype:

table

See also: Arrays

schema_object

object schema_object

schema_object:apply_default(data)

Important

data is assumed to be validated against the given schema.

Apply default values to scalar nodes. The function takes the default built-in annotation values of the scalar nodes and applies them based on the apply_default_if annotation. If there is no apply_default_if annotation on a node, the default value is also applied.

Note

The method works for static defaults. To define a dynamic default value, use the map() method.

Parameters:

data (any) – configuration data

Return:

configuration data with applied schema defaults

See also: default, apply_default_if

schema_object:filter(data, f)

Important

data is assumed to be validated against the given schema.

Filter data based on the schema annotations. The method returns an iterator over the configuration nodes for which the given filter function f returns true.

The filter function f receives the following table as the argument:

w = {
    path = <array-like table>,
    schema = <schema node>,
    data = <data at the given path>,
}

The filter function returns a boolean value that is interpreted as “accepted” or “not accepted”.

Example:

Calling a function on all schema nodes that have the my_annotation annotation defined:

s:filter(data, function(w)
    return w.schema.my_annotation ~= nil
end):each(function(w)
    do_something(w.data)
end)

Parameters:

data (any) – configuration data
f (function) – filter function

Return:

a luafun iterator

schema_object:get(data, path)

Important

data is assumed to be validated against the given schema.

Get nested configuration values at the given path. The path can be either a dot-separated string (http.scheme) or an array-like table ({ 'http', 'scheme'}).

Example:

local scheme = listen_address_schema:get(cfg, 'listen_address.scheme')

Parameters:

data (any) – configuration data
path (string/table) – path to the target node as:
- a string in the dot notation
- an array-like table

Return:

data at the given path

See also: Getting configuration values
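
The two path forms accepted by get() address the same node. The following plain-Lua sketch models that lookup to show the equivalence. get_by_path is a hypothetical stand-in that walks nested tables directly, whereas the real method also consults the schema:

```lua
-- Illustrative model of a path lookup; not the module's get() implementation.
local function get_by_path(data, path)
    -- Convert the dot notation into an array-like table of keys.
    if type(path) == 'string' then
        local parts = {}
        for part in path:gmatch('[^.]+') do
            table.insert(parts, part)
        end
        path = parts
    end
    local current = data
    for _, key in ipairs(path) do
        if type(current) ~= 'table' then
            return nil
        end
        current = current[key]
    end
    return current
end

local cfg = { listen_address = { scheme = 'http', host = '127.0.0.1', port = 8080 } }
print(get_by_path(cfg, 'listen_address.scheme'))
print(get_by_path(cfg, { 'listen_address', 'port' }))
```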

schema_object:map(data, f, f_ctx)

Important

data is assumed to be validated against the given schema.

Transform data by the given function. The data fields are transformed by the function passed in the second argument (f), while the structure of the data remains unchanged.

The transformation function takes three arguments:

data – the configuration data
w – walkthrough node with the following fields:
- w.schema – schema node
- w.path – the path to the schema node
- w.error() – a function for printing human-readable error messages
ctx – additional context for the transformation function. Can be used to provide values for a specific call.

An example of the transformation function:

local function f(data, w, ctx)
    if w.schema.type == 'string' and data ~= nil then
        return data:gsub('{{ *foo *}}', ctx.foo)
    end
    return data
end

The map() method traverses all fields of the schema records, even if they are nil or box.NULL in the provided configuration. This allows using this method to set computed default values for missing fields. Note that this is not the case for maps and arrays since the schema doesn’t define their fields to traverse.

Parameters:
- data (`any`) – configuration data
- f (`function`) – transformation function
- f_ctx (`any`) – user-provided context for the transformation function
Return:	transformed configuration data
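
A transformation function of this shape can be tried on its own before it is passed to map(). The sketch below calls it directly with a stub walkthrough node. The stub table and the sample strings are assumptions made for illustration, and the extra parentheses around gsub are a small deviation from the example above that drops the second return value (the substitution count).

```lua
local function f(data, w, ctx)
    if w.schema.type == 'string' and data ~= nil then
        -- Parentheses keep only the resulting string.
        return (data:gsub('{{ *foo *}}', ctx.foo))
    end
    return data
end

-- Stub walkthrough nodes with only the field that f() reads.
local string_node = {schema = {type = 'string'}}
local number_node = {schema = {type = 'number'}}
local ctx = {foo = 'world'}

print(f('Hello, {{ foo }}!', string_node, ctx))  -- Hello, world!
print(f(42, number_node, ctx))                   -- 42
print(f(nil, string_node, ctx))                  -- nil
```

The last call shows the case mentioned above: for a missing record field the function receives nil and can return a computed default instead.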

schema_object:merge(data_a, data_b)¶

Important

data_a and data_b are assumed to be validated against the given schema.

Merge two configurations. The method merges configurations in a single node hierarchy, preferring the latter in case of a collision.

The following merge rules are used:

any present value is preferred over nil and box.NULL
box.NULL is preferred over nil
for scalar and array nodes, the right-hand value is used
records and maps are deeply merged, that is, the merge is performed recursively for their nested nodes

Note
- Scalars of the any type are merged the same way as other scalars. They are not deeply merged even if they are tables.
- Arrays are not concatenated. Left-hand array items are discarded.

Parameters:
- data_a (`any`) – configuration data
- data_b (`any`) – configuration data
Return:	merged configuration data
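
The merge rules can be traced with a plain-Lua model. This is a sketch under stated assumptions, not the module's code: NULL is a local stand-in for box.NULL, and the mini-schema tables (type, fields, value) are a simplified, hypothetical representation of schema nodes.

```lua
-- Stand-in for box.NULL so that the sketch runs in plain Lua.
local NULL = {}

local function is_missing(value)
    return value == nil or value == NULL
end

-- Hypothetical mini-schema: {type = 'record', fields = {...}},
-- {type = 'map', value = <node>}, {type = 'array'}, or a scalar node.
local function merge(node, a, b)
    if is_missing(a) and is_missing(b) then
        -- NULL is preferred over nil.
        if a == NULL or b == NULL then
            return NULL
        end
        return nil
    end
    -- Any present value is preferred over nil and NULL.
    if is_missing(a) then return b end
    if is_missing(b) then return a end

    if node.type == 'record' or node.type == 'map' then
        -- Deep merge: recurse into every key present on either side.
        local keys, result = {}, {}
        for key in pairs(a) do keys[key] = true end
        for key in pairs(b) do keys[key] = true end
        for key in pairs(keys) do
            local child = node.type == 'record' and node.fields[key]
                or node.value
            result[key] = merge(child, a[key], b[key])
        end
        return result
    end
    -- Scalars and arrays: the right-hand value wins, no concatenation.
    return b
end

local schema = {
    type = 'record',
    fields = {
        name = {type = 'string'},
        tags = {type = 'array'},
        limits = {type = 'map', value = {type = 'number'}},
    },
}
local merged = merge(schema,
    {name = 'a', tags = {'x', 'y'}, limits = {cpu = 1}},
    {tags = {'z'}, limits = {mem = 2}})
print(merged.name, #merged.tags, merged.limits.cpu, merged.limits.mem)
```

In this run name survives from the left side, tags is replaced by the right-hand array, and limits contains the keys of both maps.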

schema_object:pairs()¶

Walk over the schema and return scalar, array, and map schema nodes.

Important

The method doesn’t return record nodes.

Return:	a luafun iterator

Example:

for _, w in schema:pairs() do
    local path = w.path
    local schema = w.schema
    -- <...>
end

schema_object:set(data, path, value)¶

Important

data is assumed to be validated against the given schema. value is validated by the method before the assignment.

Set a given value at the given path in a configuration. The path can be either a dot-separated string (http.scheme) or an array-like table ({ 'http', 'scheme'}).

Parameters:
- data (`any`) – configuration data
- path (`string/table`) – path to the target node, either a string in the dot notation or an array-like table
- value (`any`) – new value
Return:	updated configuration data

Example: see Parsing environment variables

schema_object:validate(data)¶

Validate data against the schema. If the data doesn’t adhere to the schema, an error is raised.

The method performs the following checks:

field type checks: field values are checked against the schema node types
allowed values: if a node has the allowed_values annotation, the corresponding data field is checked against the allowed values list
validation functions: if a validation function is defined for a node (the validate annotation), it is executed to check that the provided value is valid.

Parameters:	data (`any`) – data

Example: see Annotations and Validating configuration

See also: allowed_values, validate

schema_object.methods¶

User-defined methods in the schema.

See also: User-defined methods

schema_object.name¶: Schema name.

schema_object.schema¶

Schema nodes hierarchy.

See also: schema_node_object

schema_node_annotation

The following elements of tables passed as node constructor arguments are parsed by the module as built-in annotations:

allowed_values

A list of allowed values for a node.

See also: schema_object:validate()

apply_default_if

A boolean function that defines whether to apply the default value specified using default. If this function returns true for the provided configuration data, the node receives the default value upon the schema_object:apply_default() method call.

The function takes two arguments:
- data – the configuration data
- w – walkthrough node with the following fields:
  - w.schema – schema node
  - w.path – the path to the schema node
  - w.error() – a function for printing human-readable error messages
See also: schema_object:apply_default()

default

A default value to use for a scalar node if it’s not specified explicitly.

Example: see Transforming configuration

See also: schema_object:apply_default()

type

A schema node type.

See also: Data types

validate

A function used to validate node data. The function must raise an error to fail the check. The function is called by the schema_object:validate() method.

The function takes two arguments:
- data – the configuration data
- w – walkthrough node with the following fields:
  - w.schema – schema node
  - w.path – the path to the schema node
  - w.error() – a function for printing human-readable error messages
Example:

A function that checks that a string has the dotted-quad form of an IPv4 address (four groups of digits separated by dots):
```
local function validate_host(host, w)
    local host_pattern = "^(%d+)%.(%d+)%.(%d+)%.(%d+)$"
    if not host:match(host_pattern) then
        w.error("'host' should be a string containing a valid IP address, got %q", host)
    end
end
```
See also: schema_object:validate()
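
The pattern in the example above accepts any four groups of digits, so a value such as 999.1.1.1 passes. A stricter variant could also check the octet range. The sketch below is one possible way to do it. The name validate_ipv4 and the stub w table are hypothetical, and w.error() is assumed to raise an error built from a format string, as the example above suggests.

```lua
local function validate_ipv4(host, w)
    if type(host) ~= 'string' then
        w.error('expected a string, got %s', type(host))
    end
    -- Capture the four octets; the table is empty if the pattern fails.
    local octets = {host:match('^(%d+)%.(%d+)%.(%d+)%.(%d+)$')}
    if #octets ~= 4 then
        w.error('%q is not in the dotted-quad form', host)
    end
    for _, octet in ipairs(octets) do
        if tonumber(octet) > 255 then
            w.error('octet %s in %q is out of the range 0-255', octet, host)
        end
    end
end

-- Stub of the walkthrough node for trying the function without a schema.
local w = {
    path = {'host'},
    error = function(fmt, ...) error(fmt:format(...), 0) end,
}
print(pcall(validate_ipv4, '192.168.0.1', w))    -- true
print((pcall(validate_ipv4, '256.1.1.1', w)))    -- false
print((pcall(validate_ipv4, 'localhost', w)))    -- false
```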

schema_node_object

object schema_node_object¶

schema_node_object.allowed_values¶: A list of values allowed for the node. The values are taken from the allowed_values node annotation.

schema_node_object.apply_default_if¶: A function to define when to apply the default node value. The value is taken from the apply_default_if annotation.

schema_node_object.computed¶: computed.annotations stores the node’s computed annotations.

schema_node_object.default¶: Node’s default value. The value is taken from the default annotation.

schema_node_object.fields¶: Child nodes for record nodes. See also Records.

schema_node_object.items¶: Node items for array nodes. See also Arrays.

schema_node_object.type¶: Node type for scalar nodes. See Data types.

schema_node_object.validate¶: Node value validation function. The value is taken from the validate annotation.

Module console

Overview

The console module allows one Tarantool instance to access another Tarantool instance, and allows one Tarantool instance to start listening on an admin port.

Index

Below is a list of all console functions.

Name	Use
console.connect()	Connect to an instance
console.listen()	Listen for incoming requests
console.start()	Start the console
console.ac()	Set the auto-completion flag
console.delimiter()	Set a delimiter
console.get_default_output()	Get default output format
console.set_default_output()	Set default output format
console.eos()	Set or get end-of-output string

console.connect(uri)¶

Connect to the instance at URI, change the prompt from ‘tarantool>’ to ‘uri>’, and act henceforth as a client until the user ends the session or types control-D.

The console.connect function allows one Tarantool instance, in interactive mode, to access another Tarantool instance. Once the connection is established, the prompt changes and the local instance acts as a client: subsequent requests appear to be handled locally, but they are sent to, and executed on, the remote instance, and the results are displayed on the local instance. To return to local mode, enter control-D.

If the Tarantool instance at uri requires authentication, the connection might look something like: console.connect('admin:secretpassword@distanthost.com:3301').

There are no restrictions on the types of requests that can be entered, except those which are due to privilege restrictions – by default the login to the remote instance is done with user name = ‘guest’. The remote instance could allow for this by granting at least one privilege: box.schema.user.grant('guest','execute','universe').

Parameters:	uri (`string`) – the URI of the remote instance
Return:	nil

Possible errors: the connection will fail if the target Tarantool instance was not initiated with box.cfg{listen=...}.

Example:

tarantool> console = require('console')
---
...
tarantool> console.connect('198.18.44.44:3301')
---
...
198.18.44.44:3301> -- prompt is telling us that instance is remote

console.listen(uri)¶

Listen on URI. The primary way of listening for incoming requests is via the connection-information string, or URI, specified in box.cfg{listen=...}. The alternative way of listening is via the URI specified in console.listen(...). This alternative way is called “administrative” or simply “admin port”. The listening is usually over a local host with a Unix domain socket.

Parameters:	uri (`string`) – the URI of the local instance

The “admin” address is the URI to listen on. It has no default value, so it must be specified if connections are to be made via an admin port. The parameter is expressed in the URI (Uniform Resource Identifier) format, for example “/tmpdir/unix_domain_socket.sock”, or as a numeric TCP port. Connections are often made with telnet. A typical port value is 3313.

Example:

tarantool> console = require('console')
---
...
tarantool> console.listen('unix/:/tmp/X.sock')
... main/103/console/unix/:/tmp/X I> started
---
- fd: 6
  name:
    host: unix/
    family: AF_UNIX
    type: SOCK_STREAM
    protocol: 0
    port: /tmp/X.sock
...

console.start()¶

Start the console on the current interactive terminal.

Example:

A special use of console.start() is with initialization files. Normally, if the Tarantool instance is started with an initialization file (tarantool <initialization file>), there is no console. This can be remedied by adding these lines at the end of the initialization file:

local console = require('console')
console.start()

console.ac([true|false])¶: Set the auto-completion flag. If auto-completion is true, and the user is using Tarantool as a client or the user is using Tarantool via console.connect(), then hitting the TAB key may cause tarantool to complete a word automatically. The default auto-completion value is true.

console.delimiter(marker)¶

Set a custom end-of-request marker for Tarantool console.

The default end-of-request marker is a newline (line feed). Custom markers are not necessary because Tarantool can tell when a multi-line request has not ended (for example, if it sees that a function declaration does not have an end keyword). Nonetheless for special needs, or for entering multi-line requests in older Tarantool versions, you can change the end-of-request marker. As a result, newline alone is not treated as end of request.

To go back to normal mode, say: console.delimiter('')<marker>

Parameters:	marker (`string`) – a custom end-of-request marker for Tarantool console

Example:

tarantool> console = require('console'); console.delimiter('!')
---
...
tarantool> function f ()
         > statement_1 = 'a'
         > statement_2 = 'b'
         > end!
---
...
tarantool> console.delimiter('')!
---
...

console.get_default_output()¶: Return the current default output format. The result will be fmt="yaml", or it will be fmt="lua" if the last set_default_output call was console.set_default_output('lua').

console.set_default_output('yaml'|'lua')¶: Set the default output format. The possible values are ‘yaml’ (the default) or ‘lua’. The output format can be changed within a session by executing console.eval('\set output yaml|lua'); see the description of output format in the Interactive console section.

console.eos([string])¶: Set or access the end-of-output string if default output is ‘lua’. This is the string that appears at the end of output in a response to any Lua request. The default value is ; semicolon. Saying eos() will return the current value. For example, after require('console').eos('!!') responses will end with ‘!!’.

Module crypto

Overview

“Crypto” is short for “Cryptography”, which here refers to encrypting strings with a cipher and to producing a digest value by applying a function (usually a cryptographic hash function) to a string. Tarantool’s crypto module supports the following algorithm families: AES, DES, DSS, MD4, MD5, MDC2, RIPEMD, SHA-1, and SHA-2. Some of the crypto functionality is also present in the digest module.

Index

Below is a list of all crypto functions.

Name	Use
crypto.cipher.{algorithm}.{cipher_mode}.encrypt()	Encrypt a string
crypto.cipher.{algorithm}.{cipher_mode}.decrypt()	Decrypt a string
crypto.digest.{algorithm}()	Get a digest
crypto.hmac.{algorithm}()	Get a hash key

crypto.cipher.{aes128|aes192|aes256|des}.{cbc|cfb|ecb|ofb}.encrypt(string, key, initialization_vector)¶

crypto.cipher.{aes128|aes192|aes256|des}.{cbc|cfb|ecb|ofb}.decrypt(string, key, initialization_vector)¶

Pass or return a cipher derived from the string, key, and (optionally, sometimes) initialization vector. The four choices of algorithms:

aes128 - aes-128 (with 128-bit binary strings using AES)
aes192 - aes-192 (with 192-bit binary strings using AES)
aes256 - aes-256 (with 256-bit binary strings using AES)
des - des (with 56-bit binary strings using DES, though DES is not recommended)

Four choices of block cipher modes are also available:

cbc - Cipher Block Chaining
cfb - Cipher Feedback
ecb - Electronic Codebook
ofb - Output Feedback

For more information, read the article about Encryption Modes.

Example:

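crypto = require('crypto')
_16byte_iv = '1234567890123456'
_16byte_pass = '1234567890123456'
-- encrypt, then decrypt with the same key and initialization vector
e = crypto.cipher.aes128.cbc.encrypt('string', _16byte_pass, _16byte_iv)
crypto.cipher.aes128.cbc.decrypt(e, _16byte_pass, _16byte_iv)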

crypto.digest.{dss|dss1|md4|md5|mdc2|ripemd160}(string)¶

crypto.digest.{sha1|sha224|sha256|sha384|sha512}(string)¶

Pass or return a digest derived from the string. The eleven algorithm choices:

dss - dss (using DSS)
dss1 - dss (using DSS-1)
md4 - md4 (with 128-bit binary strings using MD4)
md5 - md5 (with 128-bit binary strings using MD5)
mdc2 - mdc2 (using MDC2)
ripemd160 - ripemd (with 160-bit binary strings using RIPEMD-160)
sha1 - sha-1 (with 160-bit binary strings using SHA-1)
sha224 - sha-224 (with 224-bit binary strings using SHA-2)
sha256 - sha-256 (with 256-bit binary strings using SHA-2)
sha384 - sha-384 (with 384-bit binary strings using SHA-2)
sha512 - sha-512 (with 512-bit binary strings using SHA-2)

Example:

crypto.digest.md4('string')
crypto.digest.sha512('string')

crypto.hmac.{md4|md5|ripemd160}(key, string)¶

crypto.hmac.{sha1|sha224|sha256|sha384|sha512}(key, string)¶

Pass a key and a string. The result is an HMAC message authentication code. The eight algorithm choices:

md4 or md4_hex - md4 (with 128-bit binary strings using MD4)
md5 or md5_hex - md5 (with 128-bit binary strings using MD5)
ripemd160 or ripemd160_hex - ripemd (with 160-bit binary strings using RIPEMD-160)
sha1 or sha1_hex - sha-1 (with 160-bit binary strings using SHA-1)
sha224 or sha224_hex - sha-224 (with 224-bit binary strings using SHA-2)
sha256 or sha256_hex - sha-256 (with 256-bit binary strings using SHA-2)
sha384 or sha384_hex - sha-384 (with 384-bit binary strings using SHA-2)
sha512 or sha512_hex - sha-512 (with 512-bit binary strings using SHA-2)

Example:

crypto.hmac.md4('key', 'string')
crypto.hmac.md4_hex('key', 'string')

Incremental methods in the crypto module

Suppose that a digest is done for a string ‘A’, then a new part ‘B’ is appended to the string, then a new digest is required. The new digest could be recomputed for the whole string ‘AB’, but it is faster to take what was computed before for ‘A’ and apply changes based on the new part ‘B’. This is called multi-step or “incremental” digesting, which Tarantool supports for all crypto functions.

crypto = require('crypto')

-- print aes-192 encryption of 'AB', with one step, then incrementally
key = 'key/key/key/key/key/key/'
iv =  'iviviviviviviviv'
print(crypto.cipher.aes192.cbc.encrypt('AB', key, iv))
c = crypto.cipher.aes192.cbc.encrypt.new(key)
c:init(nil, iv)
c:update('A')
c:update('B')
print(c:result())
c:free()

-- print sha-256 digest of 'AB', with one step, then incrementally
print(crypto.digest.sha256('AB'))
c = crypto.digest.sha256.new()
c:init()
c:update('A')
c:update('B')
print(c:result())
c:free()
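
Why the incremental result matches the one-step result can be seen with a toy model in plain Lua. The sketch below uses djb2, a simple non-cryptographic checksum, purely as a stand-in. It is not one of the crypto module's algorithms, and new_checksum is a hypothetical name. The point is only that the running state after 'A' is all that is needed to continue with 'B'.

```lua
-- Toy, non-cryptographic checksum (djb2) with an incremental interface.
local function new_checksum()
    local state = 5381
    local self = {}
    function self:update(chunk)
        for i = 1, #chunk do
            -- Keep the state within 32 bits.
            state = (state * 33 + chunk:byte(i)) % 4294967296
        end
    end
    function self:result()
        return state
    end
    return self
end

local one_step = new_checksum()
one_step:update('AB')

local incremental = new_checksum()
incremental:update('A')
incremental:update('B')

print(one_step:result() == incremental:result())  -- true
```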

Getting the same results from digest and crypto modules

The following functions are equivalent. For example, the digest function and the crypto function will both produce the same result.

crypto.cipher.aes256.cbc.encrypt('x',b32,b16)==digest.aes256cbc.encrypt('x',b32,b16)
crypto.digest.md4('string') == digest.md4('string')
crypto.digest.md5('string') == digest.md5('string')
crypto.digest.sha1('string') == digest.sha1('string')
crypto.digest.sha224('string') == digest.sha224('string')
crypto.digest.sha256('string') == digest.sha256('string')
crypto.digest.sha384('string') == digest.sha384('string')
crypto.digest.sha512('string') == digest.sha512('string')

Module csv

Overview

The csv module handles records formatted according to Comma-Separated-Values (CSV) rules.

The default formatting rules are:

Lua escape sequences such as \n or \10 are legal within strings but not within files,
Commas designate end-of-field,
Line feeds, or line feeds plus carriage returns, designate end-of-record,
Leading or trailing spaces are ignored,
Quote marks may enclose fields or parts of fields,
When enclosed by quote marks, commas and line feeds and spaces are treated as ordinary characters, and a pair of quote marks “” is treated as a single quote mark.

The possible options which can be passed to csv functions are:

delimiter = string (default: comma) – single-byte character to designate end-of-field
quote_char = string (default: quote mark) – single-byte character to designate encloser of string
chunk_size = number (default: 4096) – number of characters to read at once (usually for file-IO efficiency)
skip_head_lines = number (default: 0) – number of lines to skip at the start (usually for a header)
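
The delimiter and quoting rules can be illustrated with a small plain-Lua sketch. The helper split_record below is hypothetical and is not the csv module's parser. It handles a single record only and deliberately leaves out space trimming, multi-line records, and chunked reading.

```lua
-- Hypothetical helper: split one CSV record into fields.
local function split_record(line, delimiter, quote_char)
    delimiter = delimiter or ','
    quote_char = quote_char or '"'
    local fields, buf, in_quotes = {}, {}, false
    local i = 1
    while i <= #line do
        local ch = line:sub(i, i)
        if in_quotes then
            if ch == quote_char then
                if line:sub(i + 1, i + 1) == quote_char then
                    -- A doubled quote mark stands for one literal quote mark.
                    buf[#buf + 1] = quote_char
                    i = i + 1
                else
                    in_quotes = false
                end
            else
                -- Inside quotes, delimiters are ordinary characters.
                buf[#buf + 1] = ch
            end
        elseif ch == quote_char then
            in_quotes = true
        elseif ch == delimiter then
            fields[#fields + 1] = table.concat(buf)
            buf = {}
        else
            buf[#buf + 1] = ch
        end
        i = i + 1
    end
    fields[#fields + 1] = table.concat(buf)
    return fields
end

local fields = split_record('a,"b,c ",d')
print(#fields, fields[2])                  -- 3 and the field 'b,c '
print(split_record('a,b;c,d', ';')[1])     -- a,b
```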

Index

Below is a list of all csv functions.

Name	Use
csv.load()	Load a CSV file
csv.dump()	Transform input into a CSV-formatted string
csv.iterate()	Iterate over CSV records

csv.load(readable[, {options}])¶

Get CSV-formatted input from readable and return a table as output. Usually readable is either a string or a file opened for reading. Usually options is not specified.

Parameters:
- readable (`object`) – a string, or any object which has a read() method, formatted according to the CSV rules
- options (`table`) – see above
Return:	loaded_value
Rtype:	table

Example:

Readable string has 3 fields, field#2 has comma and space so use quote marks:

tarantool> csv = require('csv')
---
...
tarantool> csv.load('a,"b,c ",d')
---
- - - a
    - 'b,c '
    - d
...

Readable string contains 2-byte character = Cyrillic Letter Palochka: (This displays a palochka if and only if character set = UTF-8.)

tarantool> csv.load('a\\211\\128b')
---
- - - a\211\128b
...

Semicolon instead of comma for the delimiter:

tarantool> csv.load('a,b;c,d', {delimiter = ';'})
---
- - - a,b
    - c,d
...

Readable file ./file.csv contains two CSV records. Explanation of fio is in section fio. Source CSV file and example respectively:

tarantool> -- input in file.csv is:
tarantool> -- a,"b,c ",d
tarantool> -- a\\211\\128b
tarantool> fio = require('fio')
---
...
tarantool> f = fio.open('./file.csv', {'O_RDONLY'})
---
...
tarantool> csv.load(f, {chunk_size = 4096})
---
- - - a
    - 'b,c '
    - d
  - - a\\211\\128b
...
tarantool> f:close()
---
- true
...

csv.dump(csv-table[, options, writable])¶

Get table input from csv-table and return a CSV-formatted string as output. Or, get table input from csv-table and put the output in writable. Usually options is not specified. Usually writable, if specified, is a file opened for writing. csv.dump() is the reverse of csv.load().

Parameters:
- csv-table (`table`) – a table which can be formatted according to the CSV rules
- options (`table`) – optional, see above
- writable (`object`) – any object which has a `write()` method
Return:	dumped_value
Rtype:	string, which is written to `writable` if specified

Example:

CSV-table has 3 fields, field#2 has “,” so result has quote marks

tarantool> csv = require('csv')
---
...
tarantool> csv.dump({'a','b,c ','d'})
---
- 'a,"b,c ",d

'
...

Round Trip: from string to table and back to string

tarantool> csv_table = csv.load('a,b,c')
---
...
tarantool> csv.dump(csv_table)
---
- 'a,b,c

'
...

csv.iterate(input, {options})¶

Form a Lua iterator function for going through CSV records one record at a time. Use of an iterator is strongly recommended if the amount of data is large (ten or more megabytes).

Parameters:
- input (`object`) – a string, or any object which has a read() method, formatted according to the CSV rules
- options (`table`) – see above
Return:	Lua iterator function
Rtype:	iterator function

Example:

csv.iterate() is the low-level function underlying csv.load() and csv.dump(). To illustrate that, here is a function which is the same as the csv.load() function, as seen in the Tarantool source code.

tarantool> load = function(readable, opts)
         >   opts = opts or {}
         >   local result = {}
         >   for i, tup in csv.iterate(readable, opts) do
         >     result[i] = tup
         >   end
         >   return result
         > end
---
...
tarantool> load('a,b,c')
---
- - - a
    - b
    - c
...

Module datetime

Since: 2.10.0

The datetime module provides support for the datetime data types. It allows creating the date and time values either via the object interface or via parsing string values conforming to the ISO-8601 standard.

API Reference

Below is a list of datetime functions, properties, and related objects.

Functions
datetime.new()	Create an object of the `datetime` type from a table of time units
datetime.now()	Create an object of the `datetime` type with the current date and time
datetime.is_datetime()	Check whether the specified value is a `datetime` object
datetime.parse()	Convert an input string with the date and time information into a `datetime` object
datetime.interval.is_interval()	Check whether the specified value is an `interval` object
datetime.interval.new()	Create an object of the `interval` type from a table of time units
Properties
datetime.TZ	A Lua table that maps timezone names and abbreviations to their indexes and vice versa.
Methods
datetime_object:add()	Modify an existing `datetime` object by adding values of the input argument
datetime_object:format()	Convert the standard `datetime` object presentation into a formatted string
datetime_object:set()	Update the field values in the existing `datetime` object
datetime_object:sub()	Modify an existing `datetime` object by subtracting values of the input argument
datetime_object:totable()	Convert the information from a `datetime` object into the table format
interval_object:totable()	Convert the information from an `interval` object into the table format

Functions

datetime.new([{ units }])¶

Create an object of the datetime type from a table of time units. See the description of units and examples below.

Parameters:	units (`table`) – Table of time units. If an empty table or no arguments are passed, the `datetime` object with the default values corresponding to Unix Epoch is created: `1970-01-01T00:00:00Z`.
Return:	datetime object
Rtype:	cdata

Possible input time units for datetime.new()

Name	Description	Type	Default
nsec (usec, msec)	Fractional part of the last second. You can specify either nanoseconds (`nsec`), or microseconds (`usec`), or milliseconds (`msec`). Specifying two or all three of these units simultaneously leads to an error.	number	0
sec	Seconds. Value range: 0 - 60. A leap second is supported at the most basic level, see the section leap second.	number	0
min	Minutes. Value range: 0 - 59.	number	0
hour	Hours. Value range: 0 - 23.	number	0
day	Day number. Value range: 1 - 31. The special value `-1` generates the last day of a particular month (see example below).	number	1
month	Month number. Value range: 1 - 12.	number	1
year	Year.	number	1970
timestamp	Timestamp, in seconds. Similar to the Unix timestamp, but can have a fractional part that is converted to nanoseconds in the resulting `datetime` object. If the fractional part for the last second is set via the `nsec`, `usec`, or `msec` units, the timestamp value must be an integer; otherwise, an error occurs. The timestamp is not allowed if you already set time and/or date via specific units, namely, `sec`, `min`, `hour`, `day`, `month`, and `year`.	number	0
tzoffset	A time zone offset from UTC, in minutes. Value range: -720 - 840 inclusive. If both `tzoffset` and `tz` are specified, `tz` has the preference and the `tzoffset` value is ignored. See the Time zones section.	number	0
tz	A time zone name according to the Time Zone Database. See the Time zones section.	string

Examples:

tarantool> datetime.new {
            nsec = 123456789,

            sec = 20,
            min = 25,
            hour = 18,

            day = 20,
            month = 8,
            year = 2021,

            tzoffset  = 180
            }
---
- 2021-08-20T18:25:20.123456789+0300
...

tarantool> datetime.new {
            nsec = 123456789,

            sec = 20,
            min = 25,
            hour = 18,

            day = 20,
            month = 8,
            year = 2021,

            tzoffset = 60,
            tz = 'Europe/Moscow'
            }
---
- 2021-08-20T18:25:20.123456789 Europe/Moscow
...

tarantool> datetime.new {
            day = -1,
            month = 2,
            year = 2021,
            }
---
- 2021-02-28T00:00:00Z
...

tarantool> datetime.new {
            timestamp = 1656664205.123,
            tz = 'Europe/Moscow'
            }
---
- 2022-07-01T08:30:05.122999906 Europe/Moscow
...

tarantool> datetime.new {
            nsec = 123,
            timestamp = 1656664205,
            tz = 'Europe/Moscow'
            }
---
- 2022-07-01T08:30:05.000000123 Europe/Moscow
...
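
In the example with timestamp = 1656664205.123, the result shows .122999906 rather than .123. A likely explanation is double-precision rounding: a Lua number of this magnitude cannot hold the fraction 0.123 exactly. The plain-Lua arithmetic below reproduces the same digits. That Tarantool derives the nanoseconds in exactly this way is an assumption. To get an exact fractional part, pass an integer timestamp together with nsec, as in the last example above.

```lua
local timestamp = 1656664205.123
local whole = math.floor(timestamp)
-- The subtraction is exact; the error is already in the stored double.
local nsec = math.floor((timestamp - whole) * 1e9)
print(nsec)  -- 122999906
```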

datetime.now()¶

Create an object of the datetime type with the current date and time.

Return:	datetime object
Rtype:	cdata

datetime.is_datetime([value])¶

Check whether the specified value is a datetime object.

Parameters:	value (`any`) – the value to check
Return:	`true` if the specified value is a `datetime` object; otherwise, `false`
Rtype:	boolean

datetime.parse('input_string'[, {format, tzoffset}])¶

Convert an input string with the date and time information into a datetime object. The input string should be formatted according to one of the following standards:

ISO 8601
RFC 3339
extended strftime – see description of the format() for details.

By default, fields that are not specified take the corresponding values of the Unix epoch (1970-01-01T00:00:00Z).

Leap second is supported at the most basic level, see the section leap second.

Parameters:
- input_string (`string`) – string with the date and time information.
- format (`string`) – indicator of the `input_string` format. Possible values: ‘iso8601’, ‘rfc3339’, or a `strptime`-like format string. If no value is set, the default formatting is used (`"%F %T %Z"`). Note that only some of the possible ISO 8601 and RFC 3339 formats are supported. To parse unsupported formats, you can specify a format string manually using conversion specifications and ordinary characters.
- tzoffset (`number`) – time zone offset from UTC, in minutes.
Return:	a datetime_object
Rtype:	cdata
Return:	a number of parsed characters
Rtype:	number

Implementation details:

For formats with a decimal fraction of the second ([1], 5.3.1.4, a) the tail beyond 9 fractional digits is truncated.

tarantool> datetime.parse('2024-07-31T17:30:00.123456789999', {format = 'iso8601'})
---
- 2024-07-31T17:30:00.123456789Z
- 32
...

For formats with a decimal fraction of the hour ([1], 5.3.1.4, c) or minute ([1], 5.3.1.4, b) fractions are truncated to seconds precision. If second fractions are desired, explicit representation (format a) must be used.

tarantool> datetime.parse('2024-07-31T17,333333333', {format = 'iso8601'})
---
- 2024-07-31T17:19:59Z
- 23
...

tarantool> datetime.parse('2024-07-31T17:30.333333333', {format = 'iso8601'})
---
- 2024-07-31T17:30:19Z
- 26
...
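
The two results above follow from converting the fraction to seconds and dropping everything below one second. The plain-Lua sketch below reproduces the arithmetic. The helper fraction_to_min_sec is hypothetical and is not part of the datetime module.

```lua
-- Convert a decimal fraction of an hour or a minute into whole
-- minutes and seconds, truncating anything below one second.
local function fraction_to_min_sec(fraction, seconds_per_unit)
    local seconds = math.floor(fraction * seconds_per_unit)
    return math.floor(seconds / 60), seconds % 60
end

-- '17,333333333': 0.333333333 h = 1199.9999988 s, truncated to 1199 s.
local min, sec = fraction_to_min_sec(0.333333333, 3600)
print(string.format('17:%02d:%02d', min, sec))  -- 17:19:59

-- '17:30.333333333': 0.333333333 min = 19.99999998 s, truncated to 19 s.
local _, sec_only = fraction_to_min_sec(0.333333333, 60)
print(string.format('17:30:%02d', sec_only))    -- 17:30:19
```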

Example:

tarantool> datetime.parse('1970-01-01T00:00:00Z')
---
- 1970-01-01T00:00:00Z
- 20
...

tarantool> t = datetime.parse('1970-01-01T00:00:00', {format = 'iso8601', tzoffset = 180})

tarantool> t
---
- 1970-01-01T00:00:00+0300
...

tarantool> t = datetime.parse('2017-12-27T18:45:32.999999-05:00', {format = 'rfc3339'})

tarantool> t
---
- 2017-12-27T18:45:32.999999-0500
...

tarantool> T = datetime.parse('Thu Jan  1 03:00:00 1970', {format = '%c'})

tarantool> T
---
- 1970-01-01T03:00:00Z
...

tarantool> T = datetime.parse('12/31/2020', {format = '%m/%d/%y'})

tarantool> T
---
- 2020-12-31T00:00:00Z
...

tarantool> T = datetime.parse('1970-01-01T03:00:00.125000000+0300', {format = '%FT%T.%f%z'})

tarantool> T
---
- 1970-01-01T03:00:00.125+0300
...

tarantool> dt = datetime.parse('01:01:01 MSK', {format ='%H:%M:%S %Z'})

---
...

tarantool> dt.year
---
- 1970
...

tarantool> dt.month
---
- 1
...

tarantool> dt.wday
---
- 5
...

tarantool> dt.tz
---
- MSK
...

datetime.interval.is_interval([value])¶

Since: 3.2.0

Check whether the specified value is an interval object.

Parameters:	value (`any`) – the value to check
Return:	`true` if the specified value is an `interval` object; otherwise, `false`
Rtype:	boolean

Examples:

If a numeric value is passed to is_interval(), it returns false:

tarantool> datetime = require('datetime')
---
...
tarantool> datetime.interval.is_interval(123)
---
- false
...

If an interval object is passed to is_interval(), it returns true:

tarantool> datetime.interval.is_interval(datetime.interval.new())
---
- true
...

datetime.interval.new([{ input }])¶

Create an object of the interval type from a table of time units. See the description of units and examples below.

Parameters:	input (`table`) – Table with time units and parameters. For all possible time units, the values are not restricted. If an empty table or no arguments are passed, the `interval` object with the default value `0 seconds` is created.
Return:	interval_object
Rtype:	cdata

Possible input time units and parameters for datetime.interval.new()

Name	Description	Type	Default
nsec (usec, msec)	Fractional part of the last second. You can specify either nanoseconds (`nsec`), or microseconds (`usec`), or milliseconds (`msec`). Specifying two or all three of these units simultaneously leads to an error.	number	0
sec	Seconds	number	0
min	Minutes	number	0
hour	Hours	number	0
day	Day number	number	0
week	Week number	number	0
month	Month number	number	0
year	Year	number	0
adjust	Defines how to round days in a month after an arithmetic operation.	string	‘none’

Examples:

tarantool> datetime.interval.new()

---
- 0 seconds
...

tarantool> datetime.interval.new {
            month = 6,
            year = 1
            }
---
- +1 years, 6 months
...

tarantool> datetime.interval.new {
            day = -1
            }
---
- -1 days
...

Properties

TZ¶

Since: 2.11.0

A Lua table that maps timezone names (like Europe/Moscow) and timezone abbreviations (like MSK) to their indexes and vice versa. See the Time zones section.

tarantool> datetime.TZ['Europe/Moscow']
---
- 947
...

tarantool> datetime.TZ[947]
---
- Europe/Moscow
...

Related objects

datetime_object

object datetime_object¶

A datetime object.

datetime_object:add(input[, { adjust }])¶

Modify an existing datetime object by adding values of the input argument. See also: Datetime and interval arithmetic. When the tzoffset or tz fields are set, the addition takes tzdata into account; see the Time zones section.

Parameters:
- input (`table`) – an interval object or an equivalent table (see Example #1)
- adjust (`string`) – defines how to round days in a month after an arithmetic operation. Possible values: `none`, `last`, `excess` (see Example #2). Defaults to `none`.
Return:	datetime_object
Rtype:	cdata

Example #1:

tarantool> dt = datetime.new {
            day = 26,
            month = 8,
            year = 2021,
            tzoffset  = 180
            }
---
...

tarantool> iv = datetime.interval.new {day = 7}
---
...

tarantool> dt, iv
---
- 2021-08-26T00:00:00+0300
- +7 days
...

tarantool> dt:add(iv)
---
- 2021-09-02T00:00:00+0300
...

tarantool> dt:add{ day = 7 }
---
- 2021-09-09T00:00:00+0300
...

Example #2:

tarantool> dt = datetime.new {
            day = 29,
            month = 2,
            year = 2020
            }
---
...

tarantool> dt:add{month = 1, adjust = 'none'}
---
- 2020-03-29T00:00:00Z
...

tarantool> dt = datetime.new {
            day = 29,
            month = 2,
            year = 2020
            }
---
...

tarantool> dt:add{month = 1, adjust = 'last'}
---
- 2020-03-31T00:00:00Z
...

tarantool> dt = datetime.new {
            day = 31,
            month = 1,
            year = 2020
            }
---
...

tarantool> dt:add{month = 1, adjust = 'excess'}
---
- 2020-03-02T00:00:00Z
...

datetime_object:format(['input_string'])

Convert the standard datetime object presentation into a formatted string. The conversion specifications are the same as in the strftime function. An additional specification, %f, outputs nanoseconds and accepts a modifier that controls the precision of the fractional part, for example %5f. If the method is called without arguments, the default format '%FT%T.%f%z' is used. Both cases are shown in the example below.

Parameters:	input_string (`string`) – string consisting of zero or more conversion specifications and ordinary characters
Return:	string with the formatted date and time information
Rtype:	string

Example:

tarantool> dt = datetime.new {
            nsec = 123456789,

            sec = 20,
            min = 25,
            hour = 18,

            day = 20,
            month = 8,
            year = 2021,

            tzoffset  = 180
            }
---
...

tarantool> dt:format('%d.%m.%y %H:%M:%S.%5f')
---
- 20.08.21 18:25:20.12345
...

tarantool> dt:format()
---
- 2021-08-20T18:25:20.123456789+0300
...

tarantool> dt:format('%FT%T.%f%z')
---
- 2021-08-20T18:25:20.123456789+0300
...

datetime_object:set([{ units }])

Update the field values in the existing datetime object.

Parameters:	units (`table`) – Table of time units. The time units are the same as for the `datetime.new()` function.
Return:	updated datetime_object
Rtype:	cdata

Example:

tarantool> dt = datetime.new {
            nsec = 123456789,

            sec = 20,
            min = 25,
            hour = 18,

            day = 20,
            month = 8,
            year = 2021,

            tzoffset  = 180
            }
---
...

tarantool> dt:set {msec = 567}
---
- 2021-08-20T18:25:20.567+0300
...

tarantool> dt:set {tzoffset = 60}
---
- 2021-08-20T18:25:20.567+0100
...

datetime_object:sub({ input[, adjust ] })

Modify an existing datetime object by subtracting values of the input argument. See also: Datetime and interval arithmetic. The subtraction takes tzdata into account when the tzoffset or tz field is set; see the Time zones section.

Parameters:
input (`table`) – an interval object or an equivalent table (see Example)
adjust (`string`) – defines how to round days in a month after an arithmetic operation. Possible values: `none`, `last`, `excess`. Defaults to `none`. The logic is the same as for the `:add()` method; see Example #2 of `datetime_object:add()`.
Return:	datetime_object
Rtype:	cdata

Example:

tarantool> dt = datetime.new {
            day = 26,
            month = 8,
            year = 2021,
            tzoffset  = 180
            }
---
...

tarantool> iv = datetime.interval.new {day = 5}
---
...

tarantool> dt, iv
---
- 2021-08-26T00:00:00+0300
- +5 days
...

tarantool> dt:sub(iv)
---
- 2021-08-21T00:00:00+0300
...

tarantool> dt:sub{ day = 1 }
---
- 2021-08-20T00:00:00+0300
...

datetime_object:totable()

Convert the information from a datetime object into the table format. The resulting table has the following fields:

Field name	Description
nsec	Nanoseconds. Number.
sec	Seconds. Number.
min	Minutes. Number.
hour	Hours. Number.
day	Day of the month. Number.
month	Month number. Number.
year	Year. Number.
wday	Day of the week. Number. Sunday is 1, as in `os.date('*t')`.
yday	Day of the year. Number. January 1 is 1.
timestamp	Timestamp, in seconds. Number.
isdst	Whether DST (Daylight Saving Time) applies to the date; see the Time zones section. Boolean.
tzoffset	Time zone offset from UTC, in minutes; see the Time zones section. Number.
tz	Time zone name or abbreviation; see the Time zones section. String.

Return:	table with the date and time parameters
Rtype:	table

Example:

tarantool> dt = datetime.new {
            sec = 20,
            min = 25,
            hour = 18,

            day = 20,
            month = 8,
            year = 2021,
            tz = 'MAGT',
            }
---
...

tarantool> dt:totable()
---
- tz: 'MAGT'
  sec: 20
  min: 25
  yday: 232
  day: 20
  nsec: 0
  isdst: false
  wday: 6
  tzoffset: 600
  month: 8
  year: 2021
  hour: 18
...

interval_object

object interval_object

An interval object.

interval_object:totable()

Convert the information from an interval object into the table format. The resulting table has the following fields:

Field name	Description
nsec	Nanoseconds
sec	Seconds
min	Minutes
hour	Hours
day	Number of days
month	Number of months
year	Number of years
week	Number of weeks
adjust	Defines how to round days in a month after an arithmetic operation.

Return:	table with the date and time parameters
Rtype:	table

Example:

tarantool> iv = datetime.interval.new{month = 1, adjust = 'last'}
---
...

tarantool> iv:totable()
---
- adjust: last
  sec: 0
  nsec: 0
  day: 0
  week: 0
  hour: 0
  month: 1
  year: 0
  min: 0
...

Datetime and interval arithmetic

The datetime module lets you create objects of two types: datetime and interval.

If you need to shift the datetime object values, you can either use the modifier methods datetime_object:add() and datetime_object:sub(), or apply interval arithmetic using the overloaded + (__add) and - (__sub) operators.

datetime_object:add() and datetime_object:sub() modify the current object, whereas + and - return a new object as the result of the operation and leave the operands unchanged.
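
The difference between the modifier methods and the operators can be sketched as follows. The snippet assumes a Tarantool runtime; the variable names are illustrative, and the dates reuse the arithmetic from Example #1 of `datetime_object:add()`.

```lua
local datetime = require('datetime')

local dt = datetime.new({ year = 2021, month = 8, day = 26 })

-- The + operator returns a new datetime object and leaves dt untouched.
local shifted = dt + { day = 7 }
local day_before_add = dt:totable().day   -- dt still holds August 26

-- The :add() method changes dt itself.
dt:add({ day = 7 })
local day_after_add = dt:totable().day    -- dt now holds September 2
```

After the `:add()` call, `dt` and `shifted` describe the same moment, so `dt == shifted` holds.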

In the interval operation, each of the interval subcomponents is sequentially calculated starting from the largest (year) to the smallest (nsec):

year – years
month – months
week – weeks
day – days
hour – hours
min – minutes
sec – seconds
nsec – nanoseconds

If the results of the operation exceed the allowed range for any of the components, an exception is raised.

The datetime and interval objects can participate in arithmetic operations:

The sum of two intervals is an interval object, whose fields are the sum of each particular component of operands.
The result of subtraction of two intervals is similar: it’s an interval object where each subcomponent is the result of subtraction of particular fields in the original operands.
If you add datetime and interval objects, the result is a datetime object. The addition is performed in a determined order from the largest component (year) to the smallest (nsec).
Subtraction of two datetime objects produces an interval object. The difference of two time values is computed not as the difference of the epoch seconds, but as the difference of all the subcomponents, that is, years, months, days, hours, minutes, and seconds.
An untyped table object can be used in each context where the typed datetime or interval objects are used if the left operand is a typed object with an overloaded operation of + or -.

The matrix of eligible addition operands and their result types (rows are the left operand, columns are the right operand):

	datetime	interval	table
datetime	unsupported	datetime	datetime
interval	datetime	interval	interval

The matrix of eligible subtraction operands and their result types (rows are the left operand, columns are the right operand):

	datetime	interval	table
datetime	interval	datetime	datetime
interval	unsupported	interval	interval

The subtraction and addition of datetime objects take tzdata into account when the tzoffset or tz fields are set:

tarantool> datetime.new({tz='MSK'}) - datetime.new({tz='UTC'})
---
- -180 minutes
...
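
The component-wise rule for intervals and one of the "unsupported" cells of the addition matrix can be tried out as below. This is a sketch under the assumption that an unsupported combination raises a Lua error, which is why it is wrapped in `pcall`; the variable names are illustrative.

```lua
local datetime = require('datetime')

local a = datetime.interval.new({ month = 1, day = 2 })
local b = datetime.interval.new({ day = 3, hour = 4 })

-- interval + interval: each component is summed separately.
local total = (a + b):totable()

-- datetime + datetime is marked "unsupported" in the addition matrix,
-- so the attempt is guarded with pcall to observe the failure.
local d1 = datetime.new({ year = 2021 })
local d2 = datetime.new({ year = 2022 })
local ok = pcall(function() return d1 + d2 end)
```

In `total`, the `month` field comes only from `a`, the `hour` field only from `b`, and the `day` field is the sum of both operands.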

Datetime and interval comparison

If you need to compare the datetime and interval object values, you can use standard Lua relational operators: ==, ~=, >, <, >=, and <=. These operators use the overloaded __eq, __lt, and __le metamethods to compare values.

Relational operators for interval objects are supported since 2.11.0.

Example 1:

tarantool> dt1 = datetime.new({ year = 2010 })
---
...

tarantool> dt2 = datetime.new({ year = 2024 })
---
...

tarantool> dt1 == dt2
---
- false
...

tarantool> dt1 < dt2
---
- true
...

Example 2:

tarantool> iv1 = datetime.interval.new({month = 1})
---
...

tarantool> iv2 = datetime.interval.new({month = 2})
---
...

tarantool> iv1 < iv2
---
- true
...
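
Because the relational operators are overloaded, datetime objects can be ordered with `table.sort`. The following is a small sketch assuming a Tarantool runtime; the sample years are arbitrary.

```lua
local datetime = require('datetime')

local dates = {
    datetime.new({ year = 2024 }),
    datetime.new({ year = 2010 }),
    datetime.new({ year = 2017 }),
}

-- The comparator relies on the overloaded < operator.
table.sort(dates, function(a, b) return a < b end)

local earliest = dates[1]
local latest = dates[#dates]
```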

Leap second

Leap seconds are a periodic one-second adjustment of Coordinated Universal Time (UTC) in order to keep a system’s time of day close to the mean solar time. However, the Earth’s rotation speed varies in response to climatic and geological events, and due to this, UTC leap seconds are irregularly spaced and unpredictable.

Tarantool includes the Time Zone Database, which contains a leapseconds file in addition to the time zone description files. You can use the Lua module tarantool to get the version of tzdata in use.

The datetime module supports leap seconds at the most basic level:

The function datetime.parse() correctly parses an input string with 60 seconds:

tarantool> datetime.parse('23:12:60', {format ='%H:%M:%S'})
---
- 1970-01-01T23:13:00Z
- 8
...

The datetime.new() function and the datetime_object:set() method accept a table with the sec key set to 60 seconds:
```
tarantool> datetime.new({ sec = 60 })
---
- 1970-01-01T00:01:00Z
...
```

Meanwhile the following cases are NOT supported by the datetime module:

With the datetime.new() function, the 60 leap seconds in the sec key give an extra minute like regular seconds, and the result is represented in a regular manner, without leap seconds:
```
tarantool> datetime.new({ year = 1998, month = 12, day = 31, hour = 23, min = 59, sec = 60})
---
- 1999-01-01T00:00:00Z
...
```
The function datetime.parse() produces an error when parsing a leap second input string with 60 seconds and a format that supports leap seconds (‘rfc3339’, ‘iso8601’):
```
tarantool> datetime.parse('1998-12-31T23:59:60Z', {format='rfc3339'})
---
- error: 'builtin/datetime.lua:885: could not parse ''1998-12-31T23:59:60Z'''
...
```

Time zones

Full support has been added since 2.11.0.

Tarantool uses the Time Zone Database (also known as the Olson database, maintained by IANA) for timezone support. You can use the Lua module tarantool to get the version of tzdata in use.

Every datetime object has three fields that represent timezone support: tz, tzoffset, and isdst:

The field isdst is calculated using tzindex and the attributes of the selected timezone in the Olson database.
```
tarantool> require('datetime').parse('2004-06-01T00:00 Europe/Moscow').isdst
---
- true
...
```
The field tz can be set to a timezone name or abbreviation. A timezone name is a human-readable name based on the Time Zone Database, for example, “Europe/Moscow”. Timezone abbreviations represent time zones by alphabetic abbreviations such as “EST”, “WST”, and “F”. Both timezone names and abbreviations are available via the bidirectional array datetime.TZ.
The field tzoffset is calculated automatically using the current Olson rule. This means that it takes into account summer time, leap year, and leap seconds information when a timezone name is set. However, the tzoffset field can be set manually when an appropriate timezone is not available.

The fields tz and tzoffset can be set in datetime.new(), datetime.parse(), and datetime_object:set(). Arithmetic on datetime objects takes tzdata into account when the tzoffset or tz field is set; see the Datetime and interval arithmetic section.
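
The relationship between the three fields can be checked with a short script. This is a sketch assuming Tarantool 2.11 or later; the date is arbitrary, and the offset it produces depends on the tzdata shipped with the build.

```lua
local datetime = require('datetime')

-- datetime.TZ works in both directions: name -> index and index -> name.
local index = datetime.TZ['Europe/Moscow']
local name = datetime.TZ[index]

-- With a timezone name set, tzoffset is derived from the Olson rules.
local dt = datetime.new({ year = 2021, month = 8, day = 26, tz = 'Europe/Moscow' })
local fields = dt:totable()
print(name, fields.tz, fields.tzoffset, fields.isdst)
```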

Limitations

The supported date range is from -5879610-06-22 to +5879611-07-11.
There were moments in history when the local mean time in a particular zone used a timezone offset that cannot be expressed in whole minutes, only in seconds. For example, before 1918 Moscow used an offset of +2 hours 31 minutes 19 seconds. See an Olson dump for this period:
```
$ zdump -c1880,1918 -i Europe/Moscow

TZ="Europe/Moscow"
-       -       +023017 MMT
1916-07-03      00:01:02        +023119 MMT
1917-07-02      00      +033119 MST     1
1917-12-27      23      +023119 MMT
```
Modern tzdata rules do not use such tiny fractions, and all timezones differ from UTC by a whole number of minutes, not seconds. The Tarantool datetime module uses minutes internally as the unit for tzoffset, so some precision might be lost if you operate on such ancient timestamps.

References

Module decimal

The decimal module has functions for working with exact numbers. This is important when numbers are large or even the slightest inaccuracy is unacceptable. For example, Lua calculates 0.16666666666667 * 6 in floating point, so the result is 1. But with the decimal module (using decimal.new to convert the number to the decimal type), decimal.new('0.16666666666667') * 6 is 1.00000000000002.

To construct a decimal number, bring in the module with require('decimal') and then use decimal.new(n) or any function in the decimal module:

abs(n)
exp(n)
is_decimal(n)
ln(n)
log10(n)
new(n)
precision(n)
rescale(decimal-number, new-scale)
scale(n)
sqrt(n)
trim(decimal-number),

where n can be a string or a non-decimal number or a decimal number. If it is a string or a non-decimal number, Tarantool converts it to a decimal number before working with it. It is best to construct from strings, and to convert back to strings after calculations, because Lua numbers have only 15 digits of precision.

Decimal numbers have N digits of precision, that is, the total number of digits before and after the decimal point can be equal to N. In Tarantool 3.5 the precision was increased from N = 38 to N = 76.

decimal = require('decimal')
e = decimal.exp(1)
-- In Tarantool version 3.5 and above:
decimal.precision(e)
---
- 76
...
#tostring(e)
---
- 77
...
-- In Tarantool versions before 3.5:
decimal.precision(e)
---
- 38
...
#tostring(e)
---
- 39
...

Tarantool supports the usual arithmetic and comparison operators + - * / % ^ < > <= >= ~= ==. If an operation has both decimal and non-decimal operands, then the non-decimal operand is converted to decimal before the operation happens.

Use tostring(decimal-number) to convert back to a string.

A decimal operation will fail if overflow happens (when a number is greater than 10^N - 1 or less than -(10^N - 1)). A decimal operation will fail if arithmetic is impossible (such as division by zero or square root of minus 1). A decimal operation will not fail if rounding of post-decimal digits is necessary to get N-digit precision.
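
The points above (constructing from strings, mixing operand types, converting back with `tostring`) can be combined into one short script. This is a sketch assuming a Tarantool runtime; the values are taken from the examples in this section.

```lua
local decimal = require('decimal')

-- Construct from a string to avoid losing digits in a Lua number.
local a = decimal.new('0.16666666666667')

-- The non-decimal operand 6 is converted to decimal before multiplying.
local product = a * 6
local as_text = tostring(product)

-- Scale handling on a value with a trailing zero.
local b = decimal.new('123.4560')
local scale = decimal.scale(b)
local trimmed = tostring(decimal.trim(b))
local rounded = tostring(decimal.rescale(decimal.new('-123.4550'), 2))
```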

decimal.abs(string-or-number-or-decimal-number): Returns absolute value of a decimal number. For example if a is -1 then decimal.abs(a) returns 1.

decimal.exp(string-or-number-or-decimal-number): Returns e raised to the power of a decimal number. For example if a is 1 then decimal.exp(a) returns 2.7182818284590452353602874713526624978 for N = 38 or 2.718281828459045235360287471352662497757247093699959574966967627724076630354 for N = 76. Compare math.exp(1) from the Lua math library, which returns 2.718281828459.

decimal.is_decimal(string-or-number-or-decimal-number): Returns true if the specified value is a decimal, and false otherwise. For example if a is 123 then decimal.is_decimal(a) returns false. If a is decimal.new(123) then decimal.is_decimal(a) returns true.

decimal.ln(string-or-number-or-decimal-number): Returns natural logarithm of a decimal number. For example if a is 1 then decimal.ln(a) returns 0.

decimal.log10(string-or-number-or-decimal-number): Returns base-10 logarithm of a decimal number. For example if a is 100 then decimal.log10(a) returns 2.

decimal.new(string-or-number-or-decimal-number): Returns the value of the input as a decimal number. For example if a is 1E-1 then decimal.new(a) returns 0.1.

decimal.precision(string-or-number-or-decimal-number): Returns the number of digits in a decimal number. For example if a is 123.4560 then decimal.precision(a) returns 7.

decimal.rescale(decimal-number, new-scale): Returns the number after possible rounding or padding. If the number of post-decimal digits is greater than new-scale, then rounding occurs. The rounding rule is: round half away from zero. If the number of post-decimal digits is less than new-scale, then padding of zeros occurs. For example if a is -123.4550 then decimal.rescale(a, 2) returns -123.46, and decimal.rescale(a, 5) returns -123.45500.

decimal.scale(string-or-number-or-decimal-number): Returns the number of post-decimal digits in a decimal number. For example if a is 123.4560 then decimal.scale(a) returns 4.

decimal.sqrt(string-or-number-or-decimal-number): Returns the square root of a decimal number. For example if a is 2 then decimal.sqrt(a) returns 1.4142135623730950488016887242096980786 for N = 38 or 1.414213562373095048801688724209698078569671875376948073176679737990732478462 for N = 76.

decimal.trim(decimal-number): Returns a decimal number after possible removing of trailing post-decimal zeros. For example if a is 2.20200 then decimal.trim(a) returns 2.202.

Module digest

Overview

A “digest” is a value returned by a function (usually a cryptographic hash function) applied to a string. Tarantool’s digest module supports several cryptographic functions (AES, MD4, MD5, SHA-1, SHA-2, PBKDF2) as well as a checksum function (CRC32), two functions for base64, and two non-cryptographic hash functions (guava, murmur). Some of the digest functionality is also available in the crypto module.

Index

Below is a list of all digest functions.

Name	Use
digest.aes256cbc.encrypt()	Encrypt a string using AES
digest.aes256cbc.decrypt()	Decrypt a string using AES
digest.md4()	Get a digest made with MD4
digest.md4_hex()	Get a hexadecimal digest made with MD4
digest.md5()	Get a digest made with MD5
digest.md5_hex()	Get a hexadecimal digest made with MD5
digest.pbkdf2()	Get a digest made with PBKDF2
digest.sha1()	Get a digest made with SHA-1
digest.sha1_hex()	Get a hexadecimal digest made with SHA-1
digest.sha224()	Get a 224-bit digest made with SHA-2
digest.sha224_hex()	Get a 56-byte hexadecimal digest made with SHA-2
digest.sha256()	Get a 256-bit digest made with SHA-2
digest.sha256_hex()	Get a 64-byte hexadecimal digest made with SHA-2
digest.sha384()	Get a 384-bit digest made with SHA-2
digest.sha384_hex()	Get a 96-byte hexadecimal digest made with SHA-2
digest.sha512()	Get a 512-bit digest made with SHA-2
digest.sha512_hex()	Get a 128-byte hexadecimal digest made with SHA-2
digest.base64_encode()	Encode a string to Base64
digest.base64_decode()	Decode a Base64-encoded string
digest.urandom()	Get an array of random bytes
digest.crc32()	Get a 32-bit checksum made with CRC32
digest.crc32.new()	Initiate incremental CRC32
digest.guava()	Get a number made with a consistent hash
digest.murmur()	Get a digest made with MurmurHash
digest.murmur.new()	Initiate incremental MurmurHash

digest.aes256cbc.encrypt(string, key, iv)
digest.aes256cbc.decrypt(string, key, iv): Returns 256-bit binary string = digest made with AES.
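
A round trip through the two AES functions might look like the sketch below. It assumes a 32-byte key and a 16-byte initialization vector, which is what AES-256 in CBC mode requires; the sample plaintext and variable names are made up for illustration.

```lua
local digest = require('digest')

-- Assumption: a 32-byte key and a 16-byte IV, generated randomly here.
local key = digest.urandom(32)
local iv = digest.urandom(16)

local secret = 'card number 1234'
local encrypted = digest.aes256cbc.encrypt(secret, key, iv)

-- Decryption needs the same key and IV that were used for encryption.
local decrypted = digest.aes256cbc.decrypt(encrypted, key, iv)
```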

digest.md4(string): Returns 128-bit binary string = digest made with MD4.

digest.md4_hex(string): Returns 32-byte string = hexadecimal of a digest calculated with md4.

digest.md5(string): Returns 128-bit binary string = digest made with MD5.

digest.md5_hex(string): Returns 32-byte string = hexadecimal of a digest calculated with md5.

digest.pbkdf2(string, salt[, iterations[, digest-length]]): Returns binary string = digest made with PBKDF2.
For effective encryption the iterations value should be at least several thousand. The digest-length value determines the length of the resulting binary string.

Note

digest.pbkdf2() yields and should not be used in a transaction (between box.begin() and box.commit()/box.rollback()). PBKDF2 is a time-consuming hash algorithm. It runs in a separate coio thread. While calculations are performed, the fiber that calls digest.pbkdf2() yields and another fiber continues working in the tx thread.
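
A call with explicit iterations and digest-length values can be sketched as follows. The password, the salt size, and the iteration count are illustrative assumptions, not recommendations from the module.

```lua
local digest = require('digest')

-- Hypothetical inputs: a sample password and a random 16-byte salt.
local password = 'correct horse battery staple'
local salt = digest.urandom(16)

-- 10000 iterations, 32-byte result. Call this outside of a transaction,
-- because pbkdf2() yields while the calculation runs.
local key = digest.pbkdf2(password, salt, 10000, 32)

-- The same inputs produce the same derived key.
local again = digest.pbkdf2(password, salt, 10000, 32)
```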

digest.sha1(string): Returns 160-bit binary string = digest made with SHA-1.

digest.sha1_hex(string): Returns 40-byte string = hexadecimal of a digest calculated with sha1.

digest.sha224(string): Returns 224-bit binary string = digest made with SHA-2.

digest.sha224_hex(string): Returns 56-byte string = hexadecimal of a digest calculated with sha224.

digest.sha256(string): Returns 256-bit binary string = digest made with SHA-2.

digest.sha256_hex(string): Returns 64-byte string = hexadecimal of a digest calculated with sha256.

digest.sha384(string): Returns 384-bit binary string = digest made with SHA-2.

digest.sha384_hex(string): Returns 96-byte string = hexadecimal of a digest calculated with sha384.

digest.sha512(string): Returns 512-bit binary string = digest made with SHA-2.

digest.sha512_hex(string): Returns 128-byte string = hexadecimal of a digest calculated with sha512.

digest.base64_encode()

Returns base64 encoding from a regular string.

The possible options are:

nopad – result must not include ‘=’ for padding at the end,
nowrap – result must not include line feed for splitting lines after 72 characters,
urlsafe – result must not include ‘=’ or line feed, and may contain ‘-’ or ‘_’ instead of ‘+’ or ‘/’ for positions 62 and 63 in the index table.

Options may be true or false, the default value is false.

For example:

digest.base64_encode(string_variable,{nopad=true})

digest.base64_decode(string): Returns a regular string from a base64 encoding.
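
The effect of the nopad option is easiest to see on a one-byte input, which normally needs two padding characters. This is a minimal sketch assuming a Tarantool runtime; the input string is arbitrary.

```lua
local digest = require('digest')

-- Default options: the result is padded with '=' to a multiple of four.
local padded = digest.base64_encode('a')

-- nopad = true: the trailing '=' characters are omitted.
local unpadded = digest.base64_encode('a', { nopad = true })

-- Decoding the padded form restores the original string.
local restored = digest.base64_decode(padded)
```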

digest.urandom(integer): Returns array of random bytes with length = integer.

digest.crc32(string)

Returns 32-bit checksum made with CRC32.

The crc32 and crc32_update functions use the Cyclic Redundancy Check polynomial value 0x1EDC6F41 (written as 4812730177, that is, 0x11EDC6F41, when the leading bit is included explicitly). Other settings are: input = reflected, output = reflected, initial value = 0xFFFFFFFF, final xor value = 0x0. If it is necessary to be compatible with other checksum functions in other programming languages, ensure that the other functions use the same polynomial value and settings.

For example, in Python, install the crcmod package and say:

>>> import crcmod
>>> fun = crcmod.mkCrcFun(4812730177)
>>> fun(b'string')
3304160206

In Perl, install the Digest::CRC module and run the following code:

use Digest::CRC;
$d = Digest::CRC->new(width => 32, poly => 0x1EDC6F41, init => 0xFFFFFFFF, refin => 1, refout => 1);
$d->add('string');
print $d->digest;

(the expected output is 3304160206).

digest.crc32.new(): Initiates incremental crc32. See incremental methods notes.

digest.guava(state, bucket)

Returns a number made with consistent hash.

The guava function uses the Consistent Hashing algorithm of the Google guava library. The first parameter should be a hash code; the second parameter should be the number of buckets; the returned value will be an integer between 0 and the number of buckets. For example,

tarantool> digest.guava(10863919174838991, 11)
---
- 8
...
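
One way to use guava() is to pick a shard for a string key. The sketch below first turns the key into an integer hash code with digest.crc32() and then maps it to a bucket; the bucket count, the key format, and the helper name `bucket_for` are hypothetical.

```lua
local digest = require('digest')

local BUCKETS = 11  -- hypothetical number of shards

-- Map a string key to a bucket: hash the key to an integer first,
-- then let guava() choose the bucket for that hash code.
local function bucket_for(key)
    return digest.guava(digest.crc32(key), BUCKETS)
end

local first = bucket_for('user:42')
local second = bucket_for('user:42')
```

The same key always lands in the same bucket, which is the property a sharding scheme relies on.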

digest.murmur(string): Returns 32-bit binary string = digest made with MurmurHash.

digest.murmur.new(opts)

Initiates incremental MurmurHash. See incremental methods notes. For example:

digest.murmur.new({seed=0})

Incremental methods in the digest module

digest = require('digest')

-- print crc32 of 'AB', with one step, then incrementally
print(digest.crc32('AB'))
c = digest.crc32.new()
c:update('A')
c:update('B')
print(c:result())

-- print murmur hash of 'AB', with one step, then incrementally
print(digest.murmur('AB'))
m = digest.murmur.new()
m:update('A')
m:update('B')
print(m:result())

Example

In the following example, the user creates two functions, password_insert() which inserts a SHA-1 digest of the word “^S^e^c^ret Wordpass” into a tuple set, and password_check() which requires input of a password.

tarantool> digest = require('digest')
---
...
tarantool> function password_insert()
         >   box.space.tester:insert{1234, digest.sha1('^S^e^c^ret Wordpass')}
         >   return 'OK'
         > end
---
...
tarantool> function password_check(password)
         >   local t = box.space.tester:select{1234}
         >   if t[1] ~= nil and digest.sha1(password) == t[1][2] then
         >     return 'Password is valid'
         >   else
         >     return 'Password is not valid'
         >   end
         > end
---
...
tarantool> password_insert()
---
- 'OK'
...

If a later user calls the password_check() function and enters the wrong password, the function reports that the password is not valid.

tarantool> password_check('Secret Password')
---
- 'Password is not valid'
...

Module errno

Overview

The errno module is typically used within a function or within a Lua program, in association with a module whose functions can return operating-system errors, such as fio.

Index

Below is a list of all errno functions.

Name	Use
errno()	Get an error number for the last OS-related function
errno.strerror()	Get an error message for the corresponding error number

errno()

Return an error number for the last operating-system-related function, or 0. To invoke it, simply say errno(), without the module name.

Rtype:	integer

errno.strerror([code])

Return a string, given an error number. The string will contain the text of the conventional error message for the current operating system. If code is not supplied, the error message will be for the last operating-system-related function, or 0.

Parameters:	code (`integer`) – number of an operating-system error
Rtype:	string

Example:

This function displays the result of a call to fio.open() which causes error 2 (errno.ENOENT). The display includes the error number, the associated error string, and the error name.

tarantool> function f()
         >   local fio = require('fio')
         >   local errno = require('errno')
         >   fio.open('no_such_file')
         >   print('errno() = ' .. errno())
         >   print('errno.strerror() = ' .. errno.strerror())
         >   local t = getmetatable(errno).__index
         >   for k, v in pairs(t) do
         >     if v == errno() then
         >       print('errno() constant = ' .. k)
         >     end
         >   end
         > end
---
...

tarantool> f()
errno() = 2
errno.strerror() = No such file or directory
errno() constant = ENOENT
---
...

To see all possible error names stored in the errno metatable, say getmetatable(errno) (output abridged):

tarantool> getmetatable(errno)
---
- __newindex: 'function: 0x41666a38'
  __call: 'function: 0x41666890'
  __index:
    ENOLINK: 67
    EMSGSIZE: 90
    EOVERFLOW: 75
    ENOTCONN: 107
    EFAULT: 14
    EOPNOTSUPP: 95
    EEXIST: 17
    ENOSR: 63
    ENOTSOCK: 88
    EDESTADDRREQ: 89
    <...>
...
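
In application code, the error names are usually compared with the value of errno() directly instead of looping over the metatable. The sketch below assumes that no file called no_such_file exists in the working directory; the `message` variable is illustrative.

```lua
local fio = require('fio')
local errno = require('errno')

-- fio.open() returns nil when the file cannot be opened.
local fh = fio.open('no_such_file')

-- Read the error code right away, before another system call overwrites it.
local code = errno()

local message
if fh == nil and code == errno.ENOENT then
    message = 'missing file: ' .. errno.strerror(code)
end
```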

Module experimental.connpool

Since: 3.1.0

Important

experimental.connpool is an experimental module and is subject to changes.

The experimental.connpool module provides a set of features for connecting to remote cluster instances and for executing remote procedure calls on an instance that satisfies the specified criteria.

Note

Note that the execution time for experimental.connpool functions depends on the number of instances and the time required to connect to each instance.

Loading a module

To load the experimental.connpool module, use the require() directive:

local connpool = require('experimental.connpool')

API Reference

Functions
connpool.call()	Execute the specified function on a remote instance
connpool.connect()	Create a connection to the specified instance
connpool.filter()	Get names of instances that match the specified conditions

Functions

connpool.call(func_name, args, opts)

Execute the specified function on a remote instance.

Note

The function is executed on behalf of the user that maintains replication in the cluster. Ensure that this user has the execute permission for the function to execute.

Parameters:

func_name (string) – a name of the function to execute.
args (table/nil) – function arguments.
opts (table/nil) –
options used to select candidates on which the function should be executed:
- labels – the labels an instance has.
- roles – the roles of an instance.
- prefer_local – whether to prefer a local or remote instance to execute call() on:
  - if true (default), call() tries to execute the specified function on a local instance.
  - if false, call() tries to connect to a random candidate until a connection is established.
- mode – a mode that allows filtering candidates based on their read-only status. This option accepts the following values:
  - nil (default) – don’t check the read-only status of instances.
  - ro – consider only read-only instances.
  - rw – consider only read-write instances.
  - prefer_ro – consider read-only instances, then read-write instances.
  - prefer_rw – consider read-write instances, then read-only instances.
- instances – the names of instances to consider as candidates.
- replicasets – the names of replica sets whose instances are considered as candidates.
- groups – the names of groups whose instances are considered as candidates.
- timeout – a connection timeout (in seconds).
- buffer – a buffer used to read a returned value.
- on_push – a function to execute when the client receives an out-of-band message. Learn more from box.session.push().
- on_push_ctx – an argument of the function executed when the client receives an out-of-band message. Learn more from box.session.push().
- is_async – whether to wait for the result of the call.

Return:

a function’s return value.

Example

In the example below, the following conditions are specified to choose an instance to execute the vshard.storage.buckets_count function:

An instance has the roles.crud-storage role.
An instance has the dc label set to east.
An instance is read-only.

local connpool = require('experimental.connpool')
local buckets_count = connpool.call('vshard.storage.buckets_count',
        nil,
        { roles = { 'roles.crud-storage' },
          labels = { dc = 'east' },
          mode = 'ro' }
)

connpool.connect(instance_name, opts)

Create a connection to the specified instance.

Parameters:

instance_name (string) – an instance name.
opts (table/nil) –
none, any, or all of the following parameters:
- connect_timeout – a connection timeout (in seconds).
- wait_connected – whether to block the call until the connection is established:
  - if true (default), connect() returns only after the connection is established.
  - if false, the connection is returned immediately.
- fetch_schema – whether to fetch schema changes from a remote instance.

Return:

a net.box connection.

Example

In the example below, connect() is used to create the active connection to storage-b-002:

local connpool = require('experimental.connpool')
local conn = connpool.connect("storage-b-002", { fetch_schema = true })

Once you have a connection, you can execute requests on the remote instance, for example, select data from a space using conn.space.<space-name>:select().

connpool.filter(opts)

Get names of instances that match the specified conditions.

Parameters:

opts (table/nil) –
none, any, or all of the following parameters:
- labels – the labels an instance has.
- roles – the roles of an instance.
- mode – a mode that allows filtering candidates based on their read-only status. This option accepts the following values:
  - nil (default) – don’t check the read-only status of instances.
  - ro – consider only read-only instances.
  - rw – consider only read-write instances.
- instances – the names of instances to consider as candidates.
- replicasets – the names of replica sets whose instances are considered as candidates.
- groups – the names of groups whose instances are considered as candidates.

Return:

an array of instance names.

Example

In the example below, filter() returns a list of instances that have the roles.crud-storage role and the dc label set to east:

local connpool = require('experimental.connpool')
local instance_names = connpool.filter({ roles = { 'roles.crud-storage' },
                                         labels = { dc = 'east' } })

Module fiber

Overview

With the fiber module, you can:

Create, run, and manage fibers.
Send and receive messages between different processes (i.e. different connections, sessions, or fibers) via channels.
Use a synchronization mechanism for fibers that works like condition variables, similar to operating-system functions such as pthread_cond_wait() plus pthread_cond_signal().

Index

Below is a list of all fiber functions and members.

Name	Use
Fibers
fiber.create()	Create and start a fiber
fiber.new()	Create but do not start a fiber
fiber.self()	Get a fiber object
fiber.find()	Get a fiber object by ID
fiber.sleep()	Make a fiber go to sleep
fiber.yield()	Yield control
fiber.status()	Get the current fiber’s status
fiber.info()	Get information about all fibers
fiber.top()	Return a table of alive fibers and show their CPU consumption
fiber.kill()	Cancel a fiber
fiber.testcancel()	Check if the current fiber has been cancelled
fiber.set_max_slice()	Set the default maximum slice for all fibers
fiber.set_slice()	Set a slice for the current fiber execution
fiber.extend_slice()	Extend a slice for the current fiber execution
fiber.check_slice()	Check whether a slice for the current fiber is over
fiber.time()	Get the system time in seconds
fiber.time64()	Get the system time in microseconds
fiber.clock()	Get the monotonic time in seconds
fiber.clock64()	Get the monotonic time in microseconds
Fiber object
fiber_object:id()	Get a fiber’s ID
fiber_object:name()	Get a fiber’s name
fiber_object:name(name)	Set a fiber’s name
fiber_object:status()	Get a fiber’s status
fiber_object:cancel()	Cancel a fiber
fiber_object:set_max_slice()	Set a fiber’s maximum slice
fiber_object.storage	Local storage within the fiber
fiber_object:set_joinable()	Make it possible for a new fiber to join
fiber_object:join()	Wait for a fiber’s state to become ‘dead’
Channels
fiber.channel()	Create a communication channel
channel_object:put()	Send a message via a channel
channel_object:close()	Close a channel
channel_object:get()	Fetch a message from a channel
channel_object:is_empty()	Check if a channel is empty
channel_object:count()	Count messages in a channel
channel_object:is_full()	Check if a channel is full
channel_object:has_readers()	Check if an empty channel has any readers waiting
channel_object:has_writers()	Check if a full channel has any writers waiting
channel_object:is_closed()	Check if a channel is closed
Example	A useful example about channels
Condition variables
fiber.cond()	Create a condition variable
cond_object:wait()	Make a fiber go to sleep until woken by another fiber
cond_object:signal()	Wake up a single fiber
cond_object:broadcast()	Wake up all fibers
Example	A useful example about condition variables

Fibers

A fiber is a set of instructions that are executed with cooperative multitasking. The fiber module enables you to create a fiber and associate it with a user-supplied function called a fiber function.

A fiber has the following possible states: running, suspended, ready, or dead. A program with fibers is, at any given time, running only one of its fibers. This running fiber only suspends its execution when it explicitly yields control to another fiber that is ready to execute.

When the fiber function ends, the fiber ends and becomes dead. If required, you can cancel a running or suspended fiber. Another useful capability is limiting a fiber execution time for long-running operations.

Note

By default, each transaction in Tarantool is executed in a single fiber on a single thread, sees a consistent database state, and commits all changes atomically.

Create a fiber

To create a fiber, call one of the following functions:

fiber.create() creates a fiber and runs it immediately. The initial fiber state is running.
fiber.new() creates a fiber but does not start it. The initial fiber state is ready. You can join such fibers by calling the fiber_object:join() function and get the result returned by the fiber’s function.
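
The difference between the two constructors matters most when you need the fiber function's result. The following is a minimal sketch, assuming it is run as a Tarantool script; the add function and the variable names are hypothetical and used only for illustration:

```lua
local fiber = require('fiber')

-- Hypothetical fiber function that returns a value.
local function add(a, b)
    return a + b
end

-- fiber.new() does not start the fiber, so there is time
-- to mark it as joinable before it runs.
local f = fiber.new(add, 2, 3)
f:set_joinable(true)

-- join() yields until the fiber is dead and returns a success flag
-- followed by the values returned by the fiber function.
local ok, result = f:join()
print(ok, result) -- true 5
```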

Yield control

Yield is an action that occurs in a cooperative environment that transfers control of the thread from the current fiber to another fiber that is ready to execute. The fiber module provides the following functions that yield control to another fiber explicitly:

fiber.yield() yields control to the scheduler.
fiber.sleep() yields control to the scheduler and sleeps for the specified number of seconds.
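
As a rough illustration of cooperative scheduling, the sketch below starts two fibers that each record a step and then yield, so neither can finish before the other has had a chance to run. The worker function and the steps table are hypothetical names:

```lua
local fiber = require('fiber')

local steps = {}

-- Each worker records two steps and yields after each of them,
-- giving the other ready fiber a chance to run.
local function worker(name)
    for i = 1, 2 do
        table.insert(steps, name .. i)
        fiber.yield()
    end
end

local workers = { fiber.new(worker, 'a'), fiber.new(worker, 'b') }
for _, f in ipairs(workers) do
    f:set_joinable(true)
end
for _, f in ipairs(workers) do
    f:join()
end

-- Print the recorded steps in the order the scheduler ran them.
print(table.concat(steps, ' '))
```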

Cancel a fiber

To cancel a fiber, use the fiber_object:cancel() function. You can also call fiber.kill() to locate a fiber by its numeric ID and cancel it.
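
Because cancellation is only a request, callers often want to wait until the fiber has actually stopped. A minimal sketch of that pattern, assuming a hypothetical worker function that sleeps in a loop (the sleep call is where the cancellation is noticed):

```lua
local fiber = require('fiber')

-- Hypothetical worker: loops forever, yielding on every iteration.
local function worker()
    while true do
        fiber.sleep(0.01)
    end
end

local f = fiber.create(worker)
f:set_joinable(true)

f:cancel()                 -- request cancellation
local ok, err = f:join()   -- wait until the fiber is dead

-- The worker ended with a cancellation error, so join() reports failure.
print(ok, f:status())
```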

Limit execution time

If a fiber works too long without yielding control, you can use a fiber slice to limit its execution time. The fiber_slice_default compat option controls the default value of the maximum fiber slice.

There are two slice types: a warning and an error slice.

When a warning slice is over, a warning message is logged, for example:
```
fiber has not yielded for more than 0.500 seconds
```
When an error slice is over, the fiber is cancelled and the FiberSliceIsExceeded error is thrown:
```
FiberSliceIsExceeded: fiber slice is exceeded
```
Control is passed to another fiber that is ready to execute.

The fiber slice is checked by all functions operating on spaces and indexes, such as index_object:select(), space_object:replace(), and so on. You can also use the fiber.check_slice() function in application code to check whether the slice for the current fiber is over.

The following functions override the default value of the maximum fiber slice:

fiber.set_max_slice(slice) sets the default maximum slice for all fibers.
fiber_object:set_max_slice(slice) sets the maximum slice for a particular fiber.

The maximum slice is set when a fiber wakes up. This might be its first run or a wake-up after fiber.yield().

You can change or increase the slice for the current fiber’s execution using the following functions:

fiber.set_slice(slice) sets the slice for the current fiber execution.
fiber.extend_slice(slice) extends the slice for the current fiber execution.

Note that the specified values don’t affect a fiber’s execution after fiber.yield().

Information about fibers

To get information about all fibers or a specific fiber, use the following functions:

fiber.info() returns information about all fibers.
fiber.status() gets the current fiber’s status. To get the status of the specified fiber, call fiber_object:status().
fiber.top() shows all alive fibers and their CPU consumption.
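
For a quick overview, the table returned by fiber.info() can be walked with pairs(). The sketch below collects fiber names by ID; the names table is hypothetical, and backtraces are switched off only to keep the call cheap:

```lua
local fiber = require('fiber')

local names = {}

-- fiber.info() returns a table keyed by fiber ID.
for fid, info in pairs(fiber.info({ backtrace = false })) do
    names[fid] = info.name
    print(fid, info.name, info.csw)
end

-- The status of the current fiber is available separately.
print(fiber.status())
```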

Garbage collection

Like all Lua objects, dead fibers are garbage collected. The Lua garbage collector frees pool allocator memory owned by the fiber, resets all fiber data, and returns the fiber (now called a fiber carcass) to the fiber pool. The carcass can be reused when another fiber is created.

A fiber has all the features of a Lua coroutine, and all the programming concepts that apply to Lua coroutines apply to fibers as well. However, Tarantool has made some enhancements for fibers and uses fibers internally. So, although the use of coroutines is possible and supported, the use of fibers is recommended.

API reference

fiber.create(function[, function-arguments])¶

Create and start a fiber. The fiber is created and begins to run immediately.

Parameters:	function – the function to be associated with the fiber; function-arguments – arguments to be passed to the function
Return:	created fiber object
Rtype:	userdata

Example:

The script below shows how to create a fiber using fiber.create:

-- app.lua --
fiber = require('fiber')

function greet(name)
    print('Hello, '..name)
end

greet_fiber = fiber.create(greet, 'John')
print('Fiber already started')

The following output should be displayed after running app.lua:

$ tarantool app.lua
Hello, John
Fiber already started

fiber.new(function[, function-arguments])¶

Create a fiber but do not start it. The created fiber starts after the fiber creator (that is, the job that is calling fiber.new()) yields. The initial fiber state is ready.

Note

Note that fiber.status() returns the suspended state for ready fibers because the ready state is not observable using the fiber module API.

You can join fibers created using fiber.new by calling the fiber_object:join() function and get the result returned by the fiber’s function. To join the fiber, you need to make it joinable using fiber_object:set_joinable().

Parameters:	function – the function to be associated with the fiber; function-arguments – arguments to be passed to the function
Return:	created fiber object
Rtype:	userdata

Example:

The script below shows how to create a fiber using fiber.new:

-- app.lua --
fiber = require('fiber')

function greet(name)
    print('Hello, '..name)
end

greet_fiber = fiber.new(greet, 'John')
print('Fiber not started yet')

The following output should be displayed after running app.lua:

$ tarantool app.lua
Fiber not started yet
Hello, John

fiber.self()¶

Return:	fiber object for the currently scheduled fiber.
Rtype:	userdata

Example:

tarantool> fiber.self()
---
- status: running
  name: interactive
  id: 101
...

fiber.find(id)¶

Parameters:	id – numeric identifier of the fiber.
Return:	fiber object for the specified fiber.
Rtype:	userdata

Example:

tarantool> fiber.find(101)
---
- status: running
  name: interactive
  id: 101
...

fiber.sleep(time)¶

Yield control to the scheduler and sleep for the specified number of seconds. Only the current fiber can be made to sleep.

Parameters:	time – number of seconds to sleep.
Exception:	see the Example of yield failure.

Example:

The increment function below contains an infinite loop that adds 1 to the counter global variable. Then, the current fiber goes to sleep for period seconds. sleep causes an implicit fiber.yield().

-- app.lua --
fiber = require('fiber')

counter = 0
function increment(period)
    while true do
        counter = counter + 1
        fiber.sleep(period)
    end
end

increment_fiber = fiber.create(increment, 2)
require('console').start()

After running the script above, print the information about the fiber: a fiber ID, its status, and the counter value.

tarantool> print('ID: ' .. increment_fiber:id() .. '\nStatus: ' .. increment_fiber:status() .. '\nCounter: ' .. counter)
ID: 104
Status: suspended
Counter: 8
---
...

Then, cancel the fiber and print the information about the fiber one more time. This time the fiber status is dead.

tarantool> increment_fiber:cancel()
---
...

tarantool> print('ID: ' .. increment_fiber:id() .. '\nStatus: ' .. increment_fiber:status() .. '\nCounter: ' .. counter)
ID: 104
Status: dead
Counter: 12
---
...

fiber.yield()¶

Yield control to the scheduler. Equivalent to fiber.sleep(0).

Exception:	see the Example of yield failure.

Example:

In the example below, two fibers are associated with the same function. Each fiber yields control after printing a greeting.

-- app.lua --
fiber = require('fiber')

function greet()
    while true do
        print('Enter a name:')
        name = io.read()
        print('Hello, '..name..'. I am fiber '..fiber.id())
        fiber.yield()
    end
end

for i = 1, 2 do
    fiber_object = fiber.create(greet)
    fiber_object:cancel()
end

The output might look as follows:

$ tarantool app.lua
Enter a name:
John
Hello, John. I am fiber 104
Enter a name:
Jane
Hello, Jane. I am fiber 105

fiber.status([fiber_object])¶

Return the status of the current fiber. If the fiber_object is passed, return the status of the specified fiber.

Parameters:	fiber_object – (optional) the fiber object
Return:	the status of `fiber`. One of: `dead`, `suspended`, or `running`.
Rtype:	string

Example:

tarantool> fiber.status()
---
- running
...

fiber.info({[backtrace/bt]})¶

Return information about all fibers.

Parameters:	backtrace (`boolean`) – show backtrace. Default: `true`. Set to `false` to show less information (symbol resolving can be expensive); bt (`boolean`) – same as `backtrace`, but with lower priority.
Return:	number of context switches (`csw`), backtrace, total memory, used memory, fiber ID (`fid`), fiber name. If fiber.top is enabled or Tarantool was built with `ENABLE_FIBER_TOP`, processor time (`time`) is also returned.
Rtype:	table

Return values explained

csw – number of context switches.
backtrace, bt – each fiber’s stack trace, showing where it originated and what functions were called.
memory:
- total – total memory occupied by the fiber as a C structure, its stack, etc.
- used – actual memory used by the fiber.
time – duplicates the “time” entry from fiber.top().cpu for each fiber.
Only shown if fiber.top is enabled.

Example:

tarantool> fiber.info({ bt = true })
---
- 101:
    csw: 1
    backtrace:
    - C: '#0  0x5dd130 in lbox_fiber_id+96'
    - C: '#1  0x5dd13d in lbox_fiber_stall+13'
    - L: stall in =[C] at line -1
    - L: (unnamed) in @builtin/fiber.lua at line 59
    - C: '#2  0x66371b in lj_BC_FUNCC+52'
    - C: '#3  0x628f28 in lua_pcall+120'
    - C: '#4  0x5e22a8 in luaT_call+24'
    - C: '#5  0x5dd1a9 in lua_fiber_run_f+89'
    - C: '#6  0x45b011 in fiber_cxx_invoke(int (*)(__va_list_tag*), __va_list_tag*)+17'
    - C: '#7  0x5ff3c0 in fiber_loop+48'
    - C: '#8  0x81ecf4 in coro_init+68'
    memory:
      total: 516472
      used: 0
    time: 0
    name: lua
    fid: 101
  102:
    csw: 0
    backtrace:
    - C: '#0  (nil) in +63'
    - C: '#1  (nil) in +63'
    memory:
      total: 516472
      used: 0
    time: 0
    name: on_shutdown
    fid: 102

...

fiber.top()¶

Show all alive fibers and their CPU consumption.

Return:	a table with two entries: `cpu` and `cpu_misses`

cpu itself is a table whose keys are strings containing fiber ids and names. The three metrics available for each fiber are:

instant (in percent), which indicates the share of time the fiber was executing during the previous event loop iteration.
average (in percent), which is calculated as an exponential moving average of instant values over all the previous event loop iterations.
time (in seconds), which estimates how much CPU time each fiber spent processing during its lifetime.

The time entry is also added to each fiber’s output in fiber.info() (it duplicates the time entry from fiber.top().cpu per fiber).

Note that time is only counted while fiber.top() is enabled.

cpu_misses indicates the number of times the TX thread detected it was rescheduled on a different CPU core during the last event loop iteration. fiber.top() uses the CPU timestamp counter to measure each fiber’s execution time. However, each CPU core may have its own counter value: you can only rely on counter deltas if both measurements were taken on the same core, otherwise the delta may even be negative. When the TX thread is rescheduled to a different CPU core, Tarantool assumes the CPU delta was zero for the latest measurement. This lowers the precision of the computations, so the larger the cpu_misses value, the lower the precision of fiber.top() results.

Note

With 2.11.0, cpu_misses is deprecated and always returns 0.

Example:

tarantool> fiber.top()
---
- cpu:
    107/lua:
      instant: 30.967324490456
      time: 0.351821993
      average: 25.582738345233
    104/lua:
      instant: 9.6473633128437
      time: 0.110869897
      average: 7.9693406131877
    101/on_shutdown:
      instant: 0
      time: 0
      average: 0
    103/lua:
      instant: 9.8026528631511
      time: 0.112641118
      average: 18.138387232255
    106/lua:
      instant: 20.071174377224
      time: 0.226901357
      average: 17.077908441831
    102/interactive:
      instant: 0
      time: 9.6858e-05
      average: 0
    105/lua:
      instant: 9.2461986412164
      time: 0.10657528
      average: 7.7068458630827
    1/sched:
      instant: 20.265286315108
      time: 0.237095335
      average: 23.141537169257
  cpu_misses: 0
...

Note that new fibers created with fiber.create() are named ‘lua’ by default, so it is better to set their names explicitly using fiber_object:name('name').

There are several system fibers in fiber.top() output that might be useful:

sched is a special system fiber. It schedules tasks to other fibers, if any, and also handles some libev events.

It can have high instant and average values in fiber.top() output in two cases:
- The instance has almost no load. In this case, practically only sched is executing while the other fibers are sleeping, so relative to the other fibers, sched may have almost 100% load.
- sched handles a large number of system events. This should not cause performance problems.

main fibers process requests that come over the network (iproto requests). There are several such fibers, and new ones are created if needed. When a new request comes in, a free fiber takes it and executes it. The request can be a typical select/replace/delete/insert or a function call, for example, conn:eval() or conn:call().

Note

Enabling fiber.top() slows down fiber switching by about 15%, so it is disabled by default. To enable it, use fiber.top_enable(). To disable it after you finished debugging, use fiber.top_disable().
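
A typical debugging session enables the statistics, lets the event loop run for a while, reads the table, and disables the statistics again. The sketch below assumes a Tarantool build where fiber.top is available; the 0.1-second pause is an arbitrary value chosen for illustration:

```lua
local fiber = require('fiber')

fiber.top_enable()

-- Let the event loop run for a while so there is something to measure.
fiber.sleep(0.1)

local top = fiber.top()
for name, stats in pairs(top.cpu) do
    -- name is a string of the form '<fiber id>/<fiber name>'
    print(name, stats.instant, stats.average, stats.time)
end

-- Turn the statistics off again to avoid slowing down fiber switching.
fiber.top_disable()
```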

fiber.kill(id)¶

Locate a fiber by its numeric ID and cancel it. In other words, fiber.kill() combines fiber.find() and fiber_object:cancel().

Parameters:	id – the ID of the fiber to be cancelled.
Exception:	the specified fiber does not exist or cancel is not permitted.

Example:

tarantool> fiber.kill(fiber.id()) -- kill self, may make program end
---
- error: fiber is cancelled
...

fiber.testcancel()¶

Check if the current fiber has been cancelled and throw an exception if this is the case.

Note

Even if you catch the exception, the fiber will remain cancelled. Most types of calls will check fiber.testcancel(). However, some functions (id, status, join etc.) will return no error. We recommend that application developers implement occasional checks with fiber.testcancel() and end the fiber’s execution as soon as possible if it has been cancelled.

Example:

tarantool> fiber.testcancel()
---
- error: fiber is cancelled
...

fiber.set_max_slice(slice)¶

Set the default maximum slice for all fibers. A fiber slice limits the time period of executing a fiber without yielding control.

Parameters:	slice (`number/table`) – a fiber slice, which can be one of the following: a time period (in seconds) that specifies the error slice, for example, `fiber.set_max_slice(3)`; or a table that specifies the warning and error slices (in seconds), for example, `fiber.set_max_slice({warn = 1.5, err = 3})`.

Example:

The example below shows how to use set_max_slice to limit the slice for all fibers. fiber.check_slice() is called inside a long-running operation to determine whether a slice for the current fiber is over.

-- app.lua --
fiber = require('fiber')
clock = require('clock')

fiber.set_max_slice({warn = 1.5, err = 3})
time = clock.monotonic()
function long_operation()
    while clock.monotonic() - time < 5 do
        fiber.check_slice()
        -- Long-running operation ⌛⌛⌛ --
    end
end

long_operation_fiber = fiber.create(long_operation)

The output should look as follows:

$ tarantool app.lua
fiber has not yielded for more than 1.500 seconds
FiberSliceIsExceeded: fiber slice is exceeded

fiber.set_slice(slice)¶

Set a slice for the current fiber execution. A fiber slice limits the time period of executing a fiber without yielding control.

Parameters:	slice (`number/table`) – a fiber slice, which can be one of the following: a time period (in seconds) that specifies the error slice, for example, `fiber.set_slice(3)`; or a table that specifies the warning and error slices (in seconds), for example, `fiber.set_slice({warn = 1.5, err = 3})`.

Example:

The example below shows how to use set_slice to limit the slice for the current fiber execution. fiber.check_slice() is called inside a long-running operation to determine whether a slice for the current fiber is over.

-- app.lua --
fiber = require('fiber')
clock = require('clock')

time = clock.monotonic()
function long_operation()
    fiber.set_slice({warn = 1.5, err = 3})
    while clock.monotonic() - time < 5 do
        fiber.check_slice()
        -- Long-running operation ⌛⌛⌛ --
    end
end

long_operation_fiber = fiber.create(long_operation)

The output should look as follows.

$ tarantool app.lua
fiber has not yielded for more than 1.500 seconds
FiberSliceIsExceeded: fiber slice is exceeded

fiber.extend_slice(slice)¶

Extend a slice for the current fiber execution. For example, if the default error slice is set using fiber.set_max_slice() to 3 seconds, extend_slice(1) extends the error slice to 4 seconds.

Parameters:	slice (`number/table`) – a fiber slice, which can be one of the following: a time period (in seconds) that specifies the error slice, for example, `fiber.extend_slice(1)`; or a table that specifies the warning and error slices (in seconds), for example, `fiber.extend_slice({warn = 0.5, err = 1})`.

Example:

The example below shows how to use extend_slice to extend the slice for the current fiber execution. The default fiber slice is set using set_max_slice.

-- app.lua --
fiber = require('fiber')
clock = require('clock')

fiber.set_max_slice({warn = 1.5, err = 3})
time = clock.monotonic()
function long_operation()
    fiber.extend_slice({warn = 0.5, err = 1})
    while clock.monotonic() - time < 5 do
        fiber.check_slice()
        -- Long-running operation ⌛⌛⌛ --
    end
end

long_operation_fiber = fiber.create(long_operation)

The output should look as follows.

$ tarantool app.lua
fiber has not yielded for more than 2.000 seconds
FiberSliceIsExceeded: fiber slice is exceeded

FiberSliceIsExceeded is thrown after 4 seconds.

fiber.check_slice()¶

Check whether a slice for the current fiber is over. A fiber slice limits the time period of executing a fiber without yielding control.

Example:

See the examples for the following functions:

fiber.set_max_slice()
fiber.set_slice()
fiber.extend_slice()

fiber.time()¶

Return:	current system time (in seconds since the epoch) as a Lua number. The time is taken from the event loop clock, which makes this call very cheap, but still useful for constructing artificial tuple keys.
Rtype:	number

Example:

tarantool> fiber.time(), fiber.time()
---
- 1448466279.2415
- 1448466279.2415
...

fiber.time64()¶

Return:	current system time (in microseconds since the epoch) as a 64-bit integer. The time is taken from the event loop clock.
Rtype:	cdata (ctype<int64_t>)

Example:

tarantool> fiber.time(), fiber.time64()
---
- 1448466351.2708
- 1448466351270762
...

fiber.clock()¶

Get the monotonic time in seconds. It is better to use fiber.clock() for calculating timeouts instead of fiber.time() because fiber.time() reports real time so it is affected by system time changes.

Return:	a floating-point number of seconds, representing elapsed wall-clock time since some time in the past that is guaranteed not to change during the life of the process
Rtype:	number

Example:

tarantool> start = fiber.clock()
---
...
tarantool> print(start)
248700.58805
---
...
tarantool> print(fiber.time(), fiber.time()-start)
1600785979.8291 1600537279.241
---
...
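
Since fiber.clock() is not affected by system time changes, it is a natural basis for deadlines. The following sketch shows one way to poll a condition with a timeout; the wait_until helper, the ready flag, and the polling interval are hypothetical and not part of the fiber module:

```lua
local fiber = require('fiber')

-- Hypothetical helper: polls predicate() until it returns true
-- or until `timeout` seconds of monotonic time have passed.
local function wait_until(predicate, timeout)
    local deadline = fiber.clock() + timeout
    while not predicate() do
        if fiber.clock() >= deadline then
            return false
        end
        fiber.sleep(0.01)
    end
    return true
end

local ready = false
fiber.create(function()
    fiber.sleep(0.05)
    ready = true
end)

local in_time = wait_until(function() return ready end, 1)
local too_late = wait_until(function() return false end, 0.05)
print(in_time, too_late) -- true false
```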

fiber.clock64()¶

Same as fiber.clock() but in microseconds.

Return:	a number of microseconds as a 64-bit integer, representing elapsed wall-clock time since some time in the past that is guaranteed not to change during the life of the process
Rtype:	cdata (ctype<int64_t>)

object fiber_object¶

fiber_object:id()¶

Parameters:	fiber_object – generally this is an object referenced in the return from fiber.create or fiber.self or fiber.find
Return:	ID of the fiber.
Rtype:	number

fiber.self():id() can also be expressed as fiber.id().

Example:

tarantool> fiber_object = fiber.self()
---
...
tarantool> fiber_object:id()
---
- 101
...

fiber_object:name()¶

Parameters:	fiber_object – generally this is an object referenced in the return from fiber.create or fiber.self or fiber.find
Return:	name of the fiber.
Rtype:	string

fiber.self():name() can also be expressed as fiber.name().

Example:

tarantool> fiber.self():name()
---
- interactive
...

fiber_object:name(name[, options])

Change the fiber name. By default, a Tarantool server’s interactive-mode fiber is named ‘interactive’ and new fibers created with fiber.create() are named ‘lua’. Giving fibers distinct names makes it easier to distinguish them when using fiber.info() and fiber.top(). The maximum name length is 255.

Parameters:

fiber_object – generally this is an object referenced in the return from fiber.create or fiber.self or fiber.find
name (string) – the new name of the fiber.
options –
- truncate=true – truncates the name to the max length if it is too long. If this option is false (the default), fiber.name(new_name) fails with an exception if a new name is too long. The name length limit is 255 (since version 2.4.1).

Return:

nil

Example:

tarantool> fiber.self():name('non-interactive')
---
...

fiber_object:status()¶

Return the status of the specified fiber.

Parameters:	fiber_object – generally this is an object referenced in the return from fiber.create or fiber.self or fiber.find
Return:	the status of fiber. One of: “dead”, “suspended”, or “running”.
Rtype:	string

fiber.self():status() can also be expressed as fiber.status().

Example:

tarantool> fiber.self():status()
---
- running
...

fiber_object:cancel()¶

Send a cancellation request to the fiber. Running and suspended fibers can be cancelled. After a fiber has been cancelled, attempts to operate on it cause errors, for example, fiber_object:name() causes error: the fiber is dead. But a dead fiber can still report its ID and status.

Cancellation is asynchronous. Use fiber_object:join() to wait for the cancellation to complete. After fiber_object:cancel() is called, the fiber may or may not check whether it was cancelled. If the fiber does not check it, it cannot ever be cancelled.

Parameters:	fiber_object – generally this is an object referenced in the return from fiber.create or fiber.self or fiber.find
Return:	nil

Possible errors: cancel is not permitted for the specified fiber object.

Example:

See the fiber.sleep() example.

fiber_object:set_max_slice(slice)¶

Set a fiber’s maximum slice. A fiber slice limits the time period of executing a fiber without yielding control.

Parameters:	slice (`number/table`) – a fiber slice, which can be one of the following: a time period (in seconds) that specifies the error slice, for example, `long_operation_fiber:set_max_slice(3)`; or a table that specifies the warning and error slices (in seconds), for example, `long_operation_fiber:set_max_slice({warn = 1.5, err = 3})`.

Example:

The example below shows how to use set_max_slice to limit the fiber slice. fiber.check_slice() is called inside a long-running operation to determine whether a slice for the fiber is over.

-- app.lua --
fiber = require('fiber')
clock = require('clock')

time = clock.monotonic()
function long_operation()
    while clock.monotonic() - time < 5 do
        fiber.check_slice()
        -- Long-running operation ⌛⌛⌛ --
    end
end

long_operation_fiber = fiber.new(long_operation)
long_operation_fiber:set_max_slice({warn = 1.5, err = 3})

The output should look as follows.

$ tarantool app.lua
fiber has not yielded for more than 1.500 seconds
FiberSliceIsExceeded: fiber slice is exceeded

fiber_object.storage¶

A local storage within the fiber. It is a Lua table created when it is first accessed. The storage can contain any number of named values, subject to memory limitations. Naming may be done with fiber_object.storage.name, with fiber_object.storage['name'], or with a number: fiber_object.storage[number]. Values may be either numbers or strings.

fiber.storage is destroyed when the fiber is finished, regardless of how it is finished: via fiber_object:cancel(), or because the fiber’s function returned. Moreover, the storage is cleaned up even for pooled fibers used to serve IProto requests. Pooled fibers never really die, but nonetheless their storage is cleaned up after each request. That makes it possible to use fiber.storage as a full-featured request-local storage. This behavior is implemented in versions 2.2.3, 2.3.2, 2.4.1, and all later versions.

This storage may be created for a fiber, no matter how the fiber itself is created – from C or from Lua. For example, a fiber can be created in C using fiber_new(), then it can insert into a space, which has Lua on_replace triggers, and one of the triggers can create fiber.storage. That storage is deleted when the fiber is stopped.

Example:

The example below shows how to save the last entered name in a fiber storage and get this value before cancelling a fiber.

-- app.lua --
fiber = require('fiber')

function greet()
    while true do
        print('Enter a name:')
        name = io.read()
        if name ~= 'bye' then
            fiber.self().storage.name = name
            print('Hello, ' .. name)
        else
            print('Goodbye, ' .. fiber.self().storage['name'])
            fiber.self():cancel()
        end
    end
end

fiber_object = fiber.create(greet)

The output might look as follows:

$ tarantool app.lua
Enter a name:
John
Hello, John
Enter a name:
Jane
Hello, Jane
Enter a name:
bye
Goodbye, Jane
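
To show that each fiber gets its own storage table, the sketch below lets two fibers write different values under the same key and read them back after yielding. The worker function, the tag key, and the seen table are hypothetical names:

```lua
local fiber = require('fiber')

local seen = {}

local function worker(tag)
    fiber.self().storage.tag = tag
    -- Yield so that the other fiber runs and writes to its own storage.
    fiber.yield()
    seen[tag] = fiber.self().storage.tag
end

local workers = {}
for _, tag in ipairs({ 'first', 'second' }) do
    local f = fiber.new(worker, tag)
    f:set_joinable(true)
    table.insert(workers, f)
end
for _, f in ipairs(workers) do
    f:join()
end

-- Each fiber read back its own value, not the other fiber's one.
print(seen.first, seen.second) -- first second
```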

Example of yield failure

Warning: yield() and any function which implicitly yields (such as sleep()) can fail (raise an exception).

For example, this function has a loop that repeats until cancel() happens. The last thing that it prints is ‘before yield’, which demonstrates that yield() failed: the loop did not continue to the next iteration, where testcancel() would have failed.

fiber = require('fiber')
function function_name()
  while true do
    print('before testcancel')
    fiber.testcancel()
    print('before yield')
    fiber.yield()
  end
end
fiber_object = fiber.create(function_name)
fiber.sleep(.1)
fiber_object:cancel()
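
If a fiber needs to release resources when a yielding call fails, one option is to wrap the call in pcall(). This is only a sketch of that idea; the worker function and the cleaned_up flag are hypothetical, and the fiber stays cancelled even though the error is caught:

```lua
local fiber = require('fiber')

local cleaned_up = false

local function worker()
    -- fiber.sleep() raises an error when the fiber is cancelled;
    -- pcall() catches it so that cleanup code can run.
    local ok = pcall(fiber.sleep, 60)
    if not ok then
        cleaned_up = true
        return -- end the fiber as soon as possible after cancellation
    end
end

local f = fiber.create(worker)
f:set_joinable(true)
f:cancel()
f:join()
print(cleaned_up) -- true
```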

Channels

Call fiber.channel() to create and get a new channel object.

Call the other routines, via channel, to send messages, receive messages, or check channel status.

Message exchange is synchronous. The Lua garbage collector will mark or free the channel when no one is using it, as with any other Lua object. Use object-oriented syntax, for example, channel:put(message) rather than fiber.channel.put(message).

fiber.channel([capacity])¶

Create a new communication channel.

Parameters:	capacity (`int`) – the maximum number of slots (spaces for `channel:put` messages) that can be in use at once. The default is 0.
Return:	new channel object.
Rtype:	userdata. In the console output, it is serialized as `channel: [number]`, where `[number]` is the return of channel_object:count().

object channel_object¶

channel_object:put(message[, timeout])¶

Send a message using a channel. If the channel is full, channel:put() waits until there is a free slot in the channel.

Note

The default channel capacity is 0. With this default value, channel:put() waits infinitely until channel:get() is called.

Parameters:	message (`lua-value`) – what will be sent, usually a string or number or table; timeout (`number`) – maximum number of seconds to wait for a slot to become free. Default: infinity.
Return:	If timeout is specified, and there is no free slot in the channel for the duration of the timeout, then the return value is `false`. If the channel is closed, then the return value is `false`. Otherwise, the return value is `true`, indicating success.
Rtype:	boolean

channel_object:close()¶

Close the channel. All waiters in the channel will stop waiting. All following channel:get() operations will return nil, and all following channel:put() operations will return false.

channel_object:get([timeout])¶

Fetch and remove a message from a channel. If the channel is empty, channel:get() waits for a message.

Parameters:	timeout (`number`) – maximum number of seconds to wait for a message. Default: infinity.
Return:	If timeout is specified, and there is no message in the channel for the duration of the timeout, then the return value is `nil`. If the channel is closed, then the return value is `nil`. Otherwise, the return value is the message placed on the channel by `channel:put()`.
Rtype:	usually string or number or table, as determined by `channel:put`

channel_object:is_empty()¶

Check whether the channel is empty (has no messages).

Return:	`true` if the channel is empty. Otherwise `false`.
Rtype:	boolean

channel_object:count()¶

Find out how many messages are in the channel.

Return:	the number of messages.
Rtype:	number

channel_object:is_full()¶

Check whether the channel is full.

Return:	`true` if the channel is full (the number of messages in the channel equals the number of slots so there is no room for a new message). Otherwise `false`.
Rtype:	boolean

channel_object:has_readers()¶

Check whether readers are waiting for a message because they have issued channel:get() and the channel is empty.

Return:	`true` if readers are waiting. Otherwise `false`.
Rtype:	boolean

channel_object:has_writers()¶

Check whether writers are waiting because they have issued channel:put() and the channel is full.

Return:	`true` if writers are waiting. Otherwise `false`.
Rtype:	boolean

channel_object:is_closed()¶

Check whether the channel is closed.

Return:	`true` if the channel is already closed. Otherwise `false`.
Rtype:	boolean
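
The return values described above can be checked in a few lines. The following is a minimal sketch rather than a definitive recipe; it assumes it runs inside a Tarantool instance (in the console or in a script), and the channel capacity, messages, and variable names are illustrative only.

```lua
local fiber = require('fiber')

local ch = fiber.channel(1)            -- a channel with a single slot

local put_ok = ch:put('first')         -- a slot is free, so put() returns true
print(ch:count(), ch:is_full(), ch:is_empty())

-- The only slot is taken and no fiber is reading,
-- so this put() gives up after 0.1 seconds and returns false.
local put_again = ch:put('second', 0.1)

local msg = ch:get()                   -- fetches and removes 'first'

ch:close()
local closed = ch:is_closed()
local put_after_close = ch:put('third')   -- false: the channel is closed
local get_after_close = ch:get()          -- nil: the channel is closed
print(put_again, msg, closed, put_after_close, get_after_close)
```

Because a timed-out put() and a put() on a closed channel both return false, code that needs to tell the two cases apart can call is_closed() afterwards.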

Example

This example should give a rough idea of what some functions for fibers should look like. It’s assumed that the functions would be referenced in fiber.create().

fiber = require('fiber')
channel = fiber.channel(10)
function consumer_fiber()
    while true do
        local task = channel:get()
        ...
    end
end

function consumer2_fiber()
    while true do
        -- 10 seconds
        local task = channel:get(10)
        if task ~= nil then
            ...
        else
            -- timeout
        end
    end
end

function producer_fiber()
    while true do
        task = box.space...:select{...}
        ...
        if channel:is_empty() then
            -- channel is empty
        end

        if channel:is_full() then
            -- channel is full
        end

        ...
        if channel:has_readers() then
            -- there are some fibers
            -- that are waiting for data
        end
        ...

        if channel:has_writers() then
            -- there are some fibers
            -- that are waiting for readers
        end
        channel:put(task)
    end
end

function producer2_fiber()
    while true do
        task = box.space...:select{...}
        -- 10 seconds
        if channel:put(task, 10) then
            ...
        else
            -- timeout
        end
    end
end
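
The outline above leaves the task source and the processing step elided. As a complement, here is a small self-contained sketch of the same producer/consumer pattern that can be run as is inside Tarantool. The task strings, the 'stop' sentinel, and the received table are assumptions made for this illustration, not part of the fiber API.

```lua
local fiber = require('fiber')

-- A channel with room for two pending messages.
local channel = fiber.channel(2)
local received = {}

-- Consumer: collect messages until the hypothetical 'stop' sentinel arrives.
-- get() would return nil if the channel were closed, so that case ends the loop too.
local consumer = fiber.new(function()
    while true do
        local task = channel:get()
        if task == nil or task == 'stop' then
            break
        end
        table.insert(received, task)
    end
end)
consumer:set_joinable(true)

-- Producer: put() returns false if no slot becomes free within one second.
for i = 1, 5 do
    if not channel:put('task ' .. i, 1) then
        print('put timed out or channel closed')
        break
    end
end
channel:put('stop')

-- Wait for the consumer to finish before looking at the result.
consumer:join()
print(#received)
```

A sentinel message is used here instead of channel:close() so that the consumer is guaranteed to see every buffered task before it stops.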

Condition variables

Call fiber.cond() to create a named condition variable, which will be called ‘cond’ for examples in this section.

Call cond:wait() to make a fiber wait for a signal via a condition variable.

Call cond:signal() to send a signal to wake up a single fiber that has executed cond:wait().

Call cond:broadcast() to send a signal to all fibers that have executed cond:wait().

fiber.cond()¶

Create a new condition variable.

Return:	new condition variable.
Rtype:	Lua object

object cond_object¶

cond_object:wait([timeout])¶

Make the current fiber go to sleep, waiting until another fiber invokes the signal() or broadcast() method on the cond object. The sleep causes an implicit fiber.yield().

Parameters:	timeout – number of seconds to wait, default = forever.
Return:	If timeout is provided, and a signal doesn’t happen for the duration of the timeout, `wait()` returns false. If a signal or broadcast happens, `wait()` returns true.
Rtype:	boolean

cond_object:signal()¶

Wake up a single fiber that has executed wait() for the same variable. Does not yield.

Rtype:	nil

cond_object:broadcast()¶

Wake up all fibers that have executed wait() for the same variable. Does not yield.

Rtype:	nil

Example

Assume that a Tarantool instance is running and listening for connections on localhost port 3301. Assume that guest users have privileges to connect. We will use the tt utility to start two clients.

On terminal #1, say

$ tt connect localhost:3301
tarantool> fiber = require('fiber')
tarantool> cond = fiber.cond()
tarantool> cond:wait()

The job will hang because cond:wait() – without an optional timeout argument – will go to sleep until the condition variable changes.

On terminal #2, say

$ tt connect localhost:3301
tarantool> cond:signal()

Now look again at terminal #1. It will show that the waiting stopped, and the cond:wait() function returned true.

This example depended on the use of a global condition variable with the arbitrary name cond. In real life, programmers would make sure to use different condition variable names for different applications.
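
The same behavior can also be observed inside a single instance, without a second terminal. The sketch below is only an illustration under assumed timings (0.1, 0.3, and 5 seconds are arbitrary): one fiber waits twice on a local condition variable, first running into the timeout and then being woken by signal().

```lua
local fiber = require('fiber')

local cond = fiber.cond()
local results = {}

local waiter = fiber.new(function()
    -- Nobody signals during the first 0.1 seconds, so wait() returns false.
    results.timed_out = cond:wait(0.1)
    -- The signal sent below arrives well within 5 seconds, so wait() returns true.
    results.signaled = cond:wait(5)
end)
waiter:set_joinable(true)

fiber.sleep(0.3)   -- let the first wait() time out; the waiter is now in the second wait()
cond:signal()      -- wakes up the single waiting fiber; does not yield
waiter:join()

print(results.timed_out, results.signaled)
```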

Module fio

Overview

Tarantool supports file input/output with an API that is similar to POSIX syscalls. All operations are performed asynchronously. Multiple fibers can access the same file simultaneously.

The fio module contains:

functions for common pathname manipulations,
functions for directory or file existence and type checks,
functions for common file manipulations, and
constants which are the same as POSIX flag values (for example fio.c.flag.O_RDONLY = POSIX O_RDONLY).

Index

Below is a list of all fio functions and members.

Name	Use
fio.pathjoin()	Form a path name from one or more partial strings
fio.basename()	Get a file name
fio.dirname()	Get a directory name
fio.abspath()	Get a directory and file name
fio.path.exists()	Check if file or directory exists
fio.path.is_dir()	Check if file or directory is a directory
fio.path.is_file()	Check if file or directory is a file
fio.path.is_link()	Check if file or directory is a link
fio.path.lexists()	Check if file or directory exists
fio.umask()	Set mask bits
fio.lstat() fio.stat()	Get information about a file object
fio.mkdir() fio.rmdir()	Create or delete a directory
fio.chdir()	Change working directory
fio.listdir()	List files in a directory
fio.glob()	Get files whose names match a given string
fio.tempdir()	Get the name of a directory for storing temporary files
fio.cwd()	Get the name of the current working directory
fio.copytree() fio.mktree() fio.rmtree()	Create and delete directories
fio.link() fio.symlink() fio.readlink() fio.unlink()	Create and delete links
fio.rename()	Rename a file or directory
fio.utime()	Change file update time
fio.copyfile()	Copy a file
fio.chown() fio.chmod()	Manage rights to and ownership of file objects
fio.truncate()	Reduce the file size
fio.sync()	Ensure that changes are written to disk
fio.open()	Open a file
file-handle:close()	Close a file
file-handle:pread() file-handle:pwrite()	Perform random-access read or write on a file
file-handle:read() file-handle:write()	Perform non-random-access read or write on a file
file-handle:truncate()	Change the size of an open file
file-handle:seek()	Change position in a file
file-handle:stat()	Get statistics about an open file
file-handle:fsync() file-handle:fdatasync()	Ensure that changes made to an open file are written to disk
fio.c	Table of constants similar to POSIX flag values

Common pathname manipulations

fio.pathjoin(partial-string[, partial-string ...])¶

Concatenate partial strings, separated by ‘/’, to form a path name.

Parameters:	partial-string (`string`) – one or more strings to be concatenated.
Return:	path name
Rtype:	string

Example:

tarantool> fio.pathjoin('/etc', 'default', 'myfile')
---
- /etc/default/myfile
...

fio.basename(path-name[, suffix])¶

Given a full path name, remove all but the final part (the file name). Also remove the suffix, if it is passed.

Note that the basename of a path with a trailing slash is an empty string. It is different from how the Unix basename program interprets such a path.

Parameters:	path-name (`string`) – path name
	suffix (`string`) – suffix
Return:	file name
Rtype:	string

Example:

tarantool> fio.basename('/path/to/my.lua', '.lua')
---
- my
...

Example with a trailing slash:

tarantool> fio.basename('/path/to/')
---
-
...

fio.dirname(path-name)¶

Given a full path name, remove the final part (the file name).

Parameters:	path-name (`string`) – path name
Return:	directory name, that is, path name except for file name.
Rtype:	string

Example:

tarantool> fio.dirname('/path/to/my.lua')
---
- '/path/to/'
...

fio.abspath(file-name)¶

Given a final part (the file name), return the full path name.

Parameters:	file-name (`string`) – file name
Return:	directory name, that is, path name including file name.
Rtype:	string

Example:

tarantool> fio.abspath('my.lua')
---
- '/path/to/my.lua'
...
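
The pathname functions are often used together. The following sketch assumes a Tarantool console or script; the file name myfile.conf is hypothetical and does not need to exist, because these functions only manipulate strings (fio.abspath additionally consults the current working directory).

```lua
local fio = require('fio')

-- Build a path from its parts.
local path = fio.pathjoin('/etc', 'default', 'myfile.conf')
print(path)                          -- /etc/default/myfile.conf

-- Take it apart again.
local name = fio.basename(path)              -- file name with its suffix
local stem = fio.basename(path, '.conf')     -- file name without the suffix
local dir = fio.dirname(path)                -- everything before the file name
print(name, stem, dir)

-- A relative name is resolved against the current working directory.
local absolute = fio.abspath('myfile.conf')
print(absolute)
```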

Directory or file existence and type checks

Functions in this section are similar to some Python os.path functions.

fio.path.exists(path-name)¶

Parameters:	path-name (`string`) – path to directory or file.
Return:	true if path-name refers to a directory or file that exists and is not a broken symbolic link; otherwise false
Rtype:	boolean

fio.path.is_dir(path-name)¶

Parameters:	path-name (`string`) – path to directory or file.
Return:	true if path-name refers to a directory; otherwise false
Rtype:	boolean

fio.path.is_file(path-name)¶

Parameters:	path-name (`string`) – path to directory or file.
Return:	true if path-name refers to a file; otherwise false
Rtype:	boolean

fio.path.is_link(path-name)¶

Parameters:	path-name (`string`) – path to directory or file.
Return:	true if path-name refers to a symbolic link; otherwise false
Rtype:	boolean

fio.path.lexists(path-name)¶

Parameters:	path-name (`string`) – path to directory or file.
Return:	true if path-name refers to a directory or file that exists or is a broken symbolic link; otherwise false
Rtype:	boolean
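
The difference between exists() and lexists() shows up only for broken symbolic links. The sketch below illustrates this under the assumption that the process may create files in the temporary directory; the names data.txt and data.lnk are made up for the example.

```lua
local fio = require('fio')

local dir = fio.tempdir()
local file = fio.pathjoin(dir, 'data.txt')
local link = fio.pathjoin(dir, 'data.lnk')

-- Create an empty file and a symbolic link that points to it.
local fh = assert(fio.open(file, {'O_WRONLY', 'O_CREAT'}, tonumber('644', 8)))
fh:close()
assert(fio.symlink(file, link))

local dir_is_dir = fio.path.is_dir(dir)        -- true
local file_is_file = fio.path.is_file(file)    -- true
local link_is_link = fio.path.is_link(link)    -- true

-- Remove the target: the link still exists but is now broken.
fio.unlink(file)
local exists = fio.path.exists(link)           -- false: a broken link does not count
local lexists = fio.path.lexists(link)         -- true: the link itself is still there
print(dir_is_dir, file_is_file, link_is_link, exists, lexists)

fio.rmtree(dir)                                -- clean up the scratch directory
```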

Common file manipulations

fio.umask(mask-bits)¶

Set the mask bits used when creating files or directories. For a detailed description type man 2 umask.

Parameters:	mask-bits (`number`) – mask bits.
Return:	previous mask bits.
Rtype:	number

Example:

tarantool> fio.umask(tonumber('755', 8))
---
- 493
...

fio.lstat(path-name)¶

fio.stat(path-name)¶

Returns information about a file object. For details type man 2 lstat or man 2 stat.

Parameters:	path-name (`string`) – path name of file.
Return:	(If no error) table of fields which describe the file’s block size, creation time, size, and other attributes. (If error) two return values: null, error message.
Rtype:	table.

Additionally, the result of fio.stat('file-name') will include methods equivalent to POSIX macros:

is_blk() = POSIX macro S_ISBLK,
is_chr() = POSIX macro S_ISCHR,
is_dir() = POSIX macro S_ISDIR,
is_fifo() = POSIX macro S_ISFIFO,
is_link() = POSIX macro S_ISLNK,
is_reg() = POSIX macro S_ISREG,
is_sock() = POSIX macro S_ISSOCK.

For example, fio.stat('/'):is_dir() will return true.

Example:

tarantool> fio.lstat('/etc')
---
- inode: 1048577
  rdev: 0
  size: 12288
  atime: 1421340698
  mode: 16877
  mtime: 1424615337
  nlink: 160
  uid: 0
  blksize: 4096
  gid: 0
  ctime: 1424615337
  dev: 2049
  blocks: 24
...
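
Since fio.stat() reports failure through a second return value rather than by raising an error, callers usually check the first value before using it. A minimal sketch, assuming a POSIX system where '/' exists; the path /no/such/path is hypothetical and is expected to be absent.

```lua
local fio = require('fio')

local st, err = fio.stat('/')
if st == nil then
    print('stat failed: ' .. tostring(err))
else
    -- Fields are plain table members; type checks are methods.
    print(st:is_dir(), st:is_reg(), st.size, st.mode)
end

-- For a path that does not exist, the first value is nil
-- and the second one is an error message.
local missing, missing_err = fio.stat('/no/such/path')
print(missing, missing_err)
```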

fio.mkdir(path-name[, mode])¶

fio.rmdir(path-name)¶

Create or delete a directory. For details type man 2 mkdir or man 2 rmdir.

Parameters:	path-name (`string`) – path of directory.
	mode (`number`) – Mode bits can be passed as a number or as string constants, for example `S_IWUSR`. Mode bits can be combined by enclosing them in braces.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.mkdir('/etc')
---
- false
...

fio.chdir(path-name)¶

Change working directory. For details type man 2 chdir.

Parameters:	path-name (`string`) – path of directory.
Return:	(If success) true. (If failure) false.
Rtype:	boolean

Example:

tarantool> fio.chdir('/etc')
---
- true
...

fio.listdir(path-name)¶

List files in a directory. The result is similar to the ls shell command.

Parameters:	path-name (`string`) – path of directory.
Return:	(If no error) a list of files. (If error) two return values: null, error message.
Rtype:	table

Example:

tarantool> fio.listdir('/usr/lib/tarantool')
---
- - mysql
...

fio.glob(path-name)¶

Return a list of files that match an input string. The list is constructed with a single flag that controls the behavior of the function: GLOB_NOESCAPE. For details type man 3 glob.

Parameters:	path-name (`string`) – path-name, which may contain wildcard characters.
Return:	list of files whose names match the input string
Rtype:	table

Possible errors: nil.

Example:

tarantool> fio.glob('/etc/x*')
---
- - /etc/xdg
  - /etc/xml
  - /etc/xul-ext
...
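
To compare fio.listdir() and fio.glob() on predictable input, the following sketch creates three empty files in a fresh temporary directory. The file names are arbitrary; the only assumption is that the process can write to the temporary directory.

```lua
local fio = require('fio')

local dir = fio.tempdir()
for _, name in ipairs({'a.log', 'b.log', 'c.txt'}) do
    local fh = assert(fio.open(fio.pathjoin(dir, name),
                               {'O_WRONLY', 'O_CREAT'}, tonumber('644', 8)))
    fh:close()
end

-- listdir() returns bare file names; sort them because the order is not guaranteed.
local all, err = fio.listdir(dir)
if all == nil then
    error(err)
end
table.sort(all)

-- glob() returns full paths of the files matching the wildcard pattern.
local logs = fio.glob(fio.pathjoin(dir, '*.log'))
print(#all, #logs)

fio.rmtree(dir)
```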

fio.tempdir()¶

Return the name of a directory that can be used to store temporary files.

Example:

tarantool> fio.tempdir()
---
- /tmp/lG31e7
...

fio.tempdir() stores the created temporary directory into /tmp by default. Since version 2.4.1, this can be changed by setting the TMPDIR environment variable, either before starting Tarantool or at runtime with os.setenv().

Example:

tarantool> fio.tempdir()
---
- /tmp/lG31e7
...
tarantool> fio.mkdir('./mytmp')
---
- true
...

tarantool> os.setenv('TMPDIR', './mytmp')
---
...

tarantool> fio.tempdir()
---
- ./mytmp/506Z0b
...

fio.cwd()¶

Return the name of the current working directory.

Example:

tarantool> fio.cwd()
---
- /home/username/tarantool_sandbox
...

fio.copytree(from-path, to-path)¶

Copy everything in the from-path, including subdirectory contents, to the to-path. The result is similar to the cp -r shell command. The to-path should not be empty.

Parameters:	from-path (`string`) – path-name.
	to-path (`string`) – path-name.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.copytree('/home/original','/home/archives')
---
- true
...

fio.mktree(path-name)¶

Create the path, including parent directories, but without file contents. The result is similar to the mkdir -p shell command.

Parameters:	path-name (`string`) – path-name.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.mktree('/home/archives')
---
- true
...

fio.rmtree(path-name)¶

Remove the directory indicated by path-name, including subdirectories. The result is similar to the rm -rf shell command.

Parameters:	path-name (`string`) – path-name.
Return:	(If no error) true. (If error) two return values: null, error message.
Rtype:	boolean

Example:

tarantool> fio.rmtree('/home/archives')
---
- true
...
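
The three tree functions can be tried safely inside a scratch directory. This is a sketch only: the directory names original, nested, and archives and the file note.txt are invented for the example, and everything is created under fio.tempdir() so that nothing outside it is touched.

```lua
local fio = require('fio')

local base = fio.tempdir()
local original = fio.pathjoin(base, 'original')
local archives = fio.pathjoin(base, 'archives')

-- Like mkdir -p: creates 'original' and 'original/nested' in one call.
assert(fio.mktree(fio.pathjoin(original, 'nested')))

local fh = assert(fio.open(fio.pathjoin(original, 'nested', 'note.txt'),
                           {'O_WRONLY', 'O_CREAT'}, tonumber('644', 8)))
fh:write('hello')
fh:close()

-- Like cp -r: on failure the call returns false plus an error message.
local ok, err = fio.copytree(original, archives)
if not ok then
    print('copy failed: ' .. tostring(err))
end
local copied = fio.path.is_file(fio.pathjoin(archives, 'nested', 'note.txt'))

-- Like rm -rf: removes the scratch directory and everything under it.
fio.rmtree(base)
local gone = not fio.path.exists(base)
print(ok, copied, gone)
```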

fio.link(src, dst)¶

fio.symlink(src, dst)¶

fio.readlink(src)¶

fio.unlink(src)¶

Functions to create and delete links. For details type man readlink, man 2 link, man 2 symlink, man 2 unlink.

Parameters:	src (`string`) – existing file name.
	dst (`string`) – linked name.
Return:	(If no error) `fio.link` and `fio.symlink` and `fio.unlink` return true, `fio.readlink` returns the link value. (If error) two return values: false or null, error message.

Example:

tarantool> fio.link('/home/username/tmp.txt', '/home/username/tmp.txt2')
---
- true
...
tarantool> fio.unlink('/home/username/tmp.txt2')
---
- true
...

fio.rename(path-name, new-path-name)¶

Rename a file or directory. For details type man 2 rename.

Parameters:	path-name (`string`) – original name.
	new-path-name (`string`) – new name.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.rename('/home/username/tmp.txt', '/home/username/tmp.txt2')
---
- true
...

fio.utime(file-name[, accesstime[, updatetime]])¶

Change the access time and possibly also change the update time of a file. For details type man 2 utime. Times should be expressed as number of seconds since the epoch.

Parameters:	file-name (`string`) – name.
	accesstime (`number`) – time of last access. Default: current time.
	updatetime (`number`) – time of last update. Default: access time.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.utime('/home/username/tmp.txt')
---
- true
...

fio.copyfile(path-name, new-path-name)¶

Copy a file. The result is similar to the cp shell command.

Parameters:	path-name (`string`) – path to original file.
	new-path-name (`string`) – path to new file.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.copyfile('/home/username/tmp.txt', '/home/username/tmp.txt2')
---
- true
...

fio.chown(path-name, owner-user, owner-group)¶

fio.chmod(path-name, new-rights)¶

Manage the rights to file objects, or ownership of file objects. For details type man 2 chown or man 2 chmod.

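Parameters:	path-name (`string`) – path of the file object.
	owner-user (`string`) – new user uid.
	owner-group (`string`) – new group uid.
	new-rights (`number`) – new permissions
Return:	(If no error) true. (If error) two return values: false, error message.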

Example:

tarantool> fio.chmod('/home/username/tmp.txt', tonumber('0755', 8))
---
- true
...
tarantool> fio.chown('/home/username/tmp.txt', 'username', 'username')
---
- true
...

fio.truncate(path-name, new-size)¶

Reduce file size to a specified value. For details type man 2 truncate.

Parameters:	path-name (`string`) – path of the file.
	new-size (`number`) – new size of the file, in bytes.
Return:	(If no error) true. (If error) two return values: false, error message.
Rtype:	boolean

Example:

tarantool> fio.truncate('/home/username/tmp.txt', 99999)
---
- true
...

fio.sync()¶

Ensure that changes are written to disk. For details type man 2 sync.

Return:	true if success, false if failure.
Rtype:	boolean

Example:

tarantool> fio.sync()
---
- true
...

fio.open(path-name[, flags[, mode]])¶

Open a file in preparation for reading or writing or seeking.

Parameters:	path-name (`string`) – Full path to the file to open.
	flags (`number`) – Flags can be passed as a number or as string constants, for example ‘`O_RDONLY`’, ‘`O_WRONLY`’, ‘`O_RDWR`’. Flags can be combined by enclosing them in braces. On Linux the full set of flags as described on the Linux man page is: O_APPEND (start at end of file), O_ASYNC (signal when IO is possible), O_CLOEXEC (enable a flag related to closing), O_CREAT (create file if it doesn’t exist), O_DIRECT (do less caching or no caching), O_DIRECTORY (fail if it’s not a directory), O_EXCL (fail if file cannot be created), O_LARGEFILE (allow 64-bit file offsets), O_NOATIME (no access-time updating), O_NOCTTY (no console tty), O_NOFOLLOW (no following symbolic links), O_NONBLOCK (no blocking), O_PATH (get a path for low-level use), O_SYNC (force writing if it’s possible), O_TMPFILE (the file will be temporary and nameless), O_TRUNC (truncate) … and, always, one of: O_RDONLY (read only), O_WRONLY (write only), or O_RDWR (read and write).
	mode (`number`) – Mode bits can be passed as a number or as string constants, for example `S_IWUSR`. Mode bits are significant if flags include `O_CREAT` or `O_TMPFILE`. Mode bits can be combined by enclosing them in braces.
Return:	(If no error) file handle (abbreviated as ‘fh’ in later description). (If error) two return values: null, error message.
Rtype:	userdata

Possible errors: nil.

Note that since version 2.4.1 fio.open() returns a descriptor which can be closed manually by calling the :close() method, or it will be closed automatically when it has no references, and the garbage collector deletes it.

Keep in mind that the number of file descriptors is limited, and they can become exhausted before the garbage collector is triggered to collect unused descriptors. It is always good practice to close them manually as soon as possible.

Example 1:

tarantool> fh = fio.open('/home/username/tmp.txt', {'O_RDWR', 'O_APPEND'})
---
...
tarantool> fh -- display file handle returned by fio.open
---
- fh: 11
...

Example 2:

Using fio.open() with tonumber('N', 8) to set permissions as an octal number:

tarantool> fio.open('x.txt', {'O_WRONLY', 'O_CREAT'}, tonumber('644',8))
---
- fh: 12
...
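
Because fio.open() reports errors through its return values, and because descriptors are a limited resource, a typical call site checks the result and closes the handle explicitly. The sketch below assumes a writable temporary directory; the file names x.txt and absent.txt are placeholders.

```lua
local fio = require('fio')

-- Hypothetical scratch file inside a fresh temporary directory.
local dir = fio.tempdir()
local path = fio.pathjoin(dir, 'x.txt')

-- On failure fio.open() returns nil plus an error message.
local fh, err = fio.open(path, {'O_WRONLY', 'O_CREAT', 'O_TRUNC'}, tonumber('644', 8))
if fh == nil then
    error('cannot open ' .. path .. ': ' .. tostring(err))
end
fh:write('new data')
fh:close()    -- release the descriptor right away instead of waiting for the GC

-- Reopen the file read-only and read it back in full.
local reader = assert(fio.open(path, {'O_RDONLY'}))
local content = reader:read()
reader:close()

-- Opening a file that does not exist, without O_CREAT, fails.
local missing, open_err = fio.open(fio.pathjoin(dir, 'absent.txt'), {'O_RDONLY'})
print(content, missing, open_err)

fio.rmtree(dir)
```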

object file-handle¶

file-handle:close()¶

Close a file that was opened with fio.open. For details type man 2 close.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`.
Return:	true if success, false if failure.
Rtype:	boolean

Example:

tarantool> fh:close() -- where fh = file-handle
---
- true
...

file-handle:pread(count, offset)¶

file-handle:pread(buffer, count, offset)

Perform random-access read operation on a file, without affecting the current seek position of the file. For details type man 2 pread.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`
	buffer – where to read into (if the format is `pread(buffer, count, offset)`)
	count (`number`) – number of bytes to read
	offset (`number`) – offset within file where reading begins

If the format is pread(count, offset) then return a string containing the data that was read from the file, or empty string if failure.

If the format is pread(buffer, count, offset) then return the data to the buffer. Buffers can be acquired with buffer.ibuf.

Example:

tarantool> fh:pread(25, 25)
---
- |
  elete from t8//
  insert in
...

file-handle:pwrite(new-string, offset)¶

file-handle:pwrite(buffer, count, offset)

Perform random-access write operation on a file, without affecting the current seek position of the file. For details type man 2 pwrite.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`
	new-string (`string`) – value to write (if the format is `pwrite(new-string, offset)`)
	buffer (`cdata`) – value to write (if the format is `pwrite(buffer, count, offset)`)
	count (`number`) – number of bytes to write
	offset (`number`) – offset within file where writing begins
Return:	true if success, false if failure.
Rtype:	boolean

If the format is pwrite(new-string, offset) then the new-string value is written to the file, as far as the end of the string.

If the format is pwrite(buffer, count, offset) then the buffer contents are written to the file, for count bytes. Buffers can be acquired with buffer.ibuf.

tarantool> ibuf = require('buffer').ibuf()
---
...

tarantool> fh:pwrite(ibuf, 1, 0)
---
- true
...
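
A round trip with string arguments makes the offsets easier to follow. The following is an illustrative sketch: the file random.bin and its ten-byte content are assumptions chosen so that the result is easy to verify by eye.

```lua
local fio = require('fio')

local dir = fio.tempdir()
local path = fio.pathjoin(dir, 'random.bin')
local fh = assert(fio.open(path, {'O_RDWR', 'O_CREAT'}, tonumber('644', 8)))

fh:write('0123456789')            -- sequential write; the position is now 10

-- Overwrite the two bytes at offsets 4 and 5: the file becomes '0123AB6789'.
local written = fh:pwrite('AB', 4)

-- Read 4 bytes starting at offset 3.
local middle = fh:pread(4, 3)

-- pread() and pwrite() leave the seek position where write() left it.
local position = fh:seek(0, 'SEEK_CUR')
print(written, middle, position)

fh:close()
fio.rmtree(dir)
```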

file-handle:read([count])¶

file-handle:read(buffer, count)

Perform non-random-access read on a file. For details type man 2 read or man 2 write.

Note

fh:read and fh:write affect the seek position within the file, and this must be taken into account when working on the same file from multiple fibers. It is possible to limit or prevent file access from other fibers with fiber.cond() or fiber.channel().

Parameters:

fh (userdata) – file-handle as returned by fio.open().
buffer – where to read into (if the format is read(buffer, count))
count (number) – number of bytes to read

Return:

If the format is read() – omitting count – then read all bytes in the file.
If the format is read() or read([count]) then return a string containing the data that was read from the file, or empty string if failure.
If the format is read(buffer, count) then return the data to the buffer. Buffers can be acquired with buffer.ibuf.
In case of an error the method returns nil, err and sets the error to errno.

tarantool> ibuf = require('buffer').ibuf()
---
...

tarantool> fh:read(ibuf:reserve(5), 5)
---
- 5
...

tarantool> require('ffi').string(ibuf:alloc(5),5)
---
- abcde
...
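
The note above suggests fiber.cond() or fiber.channel() for coordinating access from several fibers. One possible approach, shown here only as a sketch, is to use a channel with one slot as a lock: a fiber puts a token before touching the file and takes it back afterwards. The helper append_line, the file shared.log, and the number of fibers are all hypothetical.

```lua
local fiber = require('fiber')
local fio = require('fio')

local dir = fio.tempdir()
local path = fio.pathjoin(dir, 'shared.log')
local fh = assert(fio.open(path, {'O_RDWR', 'O_CREAT', 'O_APPEND'}, tonumber('644', 8)))

-- A one-slot channel used as a lock: put() acquires it, get() releases it.
local lock = fiber.channel(1)

local function append_line(line)
    lock:put(true)             -- waits while another fiber holds the lock
    fh:write(line .. '\n')
    lock:get()                 -- frees the slot for the next fiber
end

local workers = {}
for i = 1, 3 do
    local f = fiber.new(append_line, 'line from fiber ' .. i)
    f:set_joinable(true)
    workers[i] = f
end
for _, f in ipairs(workers) do
    f:join()
end

-- Rewind and read back what the fibers wrote.
fh:seek(0, 'SEEK_SET')
local text = fh:read()
fh:close()
fio.rmtree(dir)
print(text)
```

If the guarded section can raise an error, wrapping it in pcall() and releasing the lock afterwards keeps other fibers from waiting forever.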

file-handle:write(new-string)¶

file-handle:write(buffer, count)

Perform non-random-access write on a file. For details type man 2 write.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`
	new-string (`string`) – value to write (if the format is `write(new-string)`)
	buffer (`cdata`) – value to write (if the format is `write(buffer, count)`)
	count (`number`) – number of bytes to write
Return:	true if success, false if failure.
Rtype:	boolean

If the format is write(new-string) then the new-string value is written to the file, as far as the end of the string.

If the format is write(buffer, count) then the buffer contents are written to the file, for count bytes. Buffers can be acquired with buffer.ibuf.

Example:

tarantool> fh:write("new data")
---
- true
...
tarantool> ibuf = require('buffer').ibuf()
---
...
tarantool> fh:write(ibuf, 1)
---
- true
...

file-handle:truncate(new-size)¶

Change the size of an open file. Differs from fio.truncate, which changes the size of a closed file.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`.
Return:	true if success, false if failure.
Rtype:	boolean

Example:

tarantool> fh:truncate(0)
---
- true
...

file-handle:seek(position[, offset-from])¶

Shift position in the file to the specified position. For details type man 2 lseek.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`.
	position (`number`) – position to seek to
	offset-from (`string`) – ‘`SEEK_END`’ = end of file, ‘`SEEK_CUR`’ = current position, ‘`SEEK_SET`’ = start of file.
Return:	the new position if success
Rtype:	number

Possible errors: nil.

Example:

tarantool> fh:seek(20, 'SEEK_SET')
---
- 20
...
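
The three offset-from values can be compared on a small file. This sketch assumes a writable temporary directory; the file name and the 11-byte string 'header:body' are made up so that each position can be checked by counting characters.

```lua
local fio = require('fio')

local dir = fio.tempdir()
local fh = assert(fio.open(fio.pathjoin(dir, 'seek.txt'),
                           {'O_RDWR', 'O_CREAT', 'O_TRUNC'}, tonumber('644', 8)))
fh:write('header:body')

-- SEEK_SET: absolute position from the start of the file.
fh:seek(0, 'SEEK_SET')
local head = fh:read(6)

-- SEEK_END: position relative to the end; a negative value steps backwards.
fh:seek(-4, 'SEEK_END')
local tail = fh:read(4)

-- Seeking to offset 0 from the end returns the file size.
local size = fh:seek(0, 'SEEK_END')
print(head, tail, size)

fh:close()
fio.rmtree(dir)
```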

file-handle:stat()¶

Return statistics about an open file. This differs from fio.stat, which returns statistics about a closed file. For details type man 2 stat.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`.
Return:	details about the file.
Rtype:	table

Example:

tarantool> fh:stat()
---
- inode: 729866
  rdev: 0
  size: 100
  atime: 140942855
  mode: 33261
  mtime: 1409430660
  nlink: 1
  uid: 1000
  blksize: 4096
  gid: 1000
  ctime: 1409430660
  dev: 2049
  blocks: 8
...

file-handle:fsync()¶

file-handle:fdatasync()¶

Ensure that file changes are written to disk, for an open file. Compare fio.sync, which is for all files. For details type man 2 fsync or man 2 fdatasync.

Parameters:	fh (`userdata`) – file-handle as returned by `fio.open()`.
Return:	true if success, false if failure.

Example:

tarantool> fh:fsync()
---
- true
...

FIO constants

fio.c¶

Table with constants which are the same as POSIX flag values on the target platform (see man 2 stat).

Example:

tarantool> fio.c
---
- seek:
    SEEK_SET: 0
    SEEK_END: 2
    SEEK_CUR: 1
  mode:
    S_IWGRP: 16
    S_IXGRP: 8
    S_IROTH: 4
    S_IXOTH: 1
    S_IRUSR: 256
    S_IXUSR: 64
    S_IRWXU: 448
    S_IRWXG: 56
    S_IWOTH: 2
    S_IRWXO: 7
    S_IWUSR: 128
    S_IRGRP: 32
  flag:
    O_EXCL: 2048
    O_NONBLOCK: 4
    O_RDONLY: 0
    <...>
...

Module fun

Luafun, also known as the Lua Functional Library, takes advantage of the features of LuaJIT to help users create complex functions. Inside the module are “sequence processors” such as map, filter, reduce, zip – they take a user-written function as an argument and run it against every element in a sequence, which can be faster or more convenient than a user-written loop. Inside the module are “generators” such as range, tabulate, and rands – they return a bounded or boundless series of values. Within the module are “reducers”, “filters”, “composers” … or, in short, all the important features found in languages like Standard ML, Haskell, or Erlang.

The full documentation is in the luafun section of GitHub. However, the first chapter can be skipped because installation is already done: Luafun is built into Tarantool. All that is needed is the usual require request. After that, all the operations described in the Luafun manual will work, provided they are preceded by the name returned by the require request. For example:

tarantool> fun = require('fun')
---
...
tarantool> for _k, a in fun.range(3) do
         >   print(a)
         > end
1
2
3
---
...
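
The sequence processors mentioned above can be chained. The following is a small sketch of a filter, map, and reduce pipeline; the range bounds and the helper functions are arbitrary choices for the illustration.

```lua
local fun = require('fun')

-- Squares of the even numbers from 1 to 10, collected into a Lua table.
local squares = fun.range(1, 10)
    :filter(function(x) return x % 2 == 0 end)
    :map(function(x) return x * x end)
    :totable()

-- Fold the table into a single value, starting from 0.
local total = fun.iter(squares)
    :reduce(function(acc, x) return acc + x end, 0)

print(table.concat(squares, ', '))   -- 4, 16, 36, 64, 100
print(total)                         -- 220
```

The chain is lazy: nothing is computed until totable() or reduce() consumes the iterator.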

Module http

The http module, specifically the http.client submodule, provides the functionality of an HTTP client with support for HTTPS and keepalive. The HTTP client uses the libcurl library under the hood and takes into account the environment variables libcurl understands.

HTTP client instance

Default client

The http.client submodule provides the default HTTP client instance:

local http_client = require('http.client')

In this case, you need to make requests using the dot syntax, for example:

local response = http_client.get('https://httpbin.org/get')

Creating a client

If you need to configure specific HTTP client options, use the http.client.new() function to create the client instance:

local http_client = require('http.client').new()

In this case, you need to make requests using the colon syntax, for example:

local response = http_client:get('https://httpbin.org/get')

All the examples in this section use the HTTP client created using http.client.new().

Making requests

The client instance enables you to make HTTP requests.

HTTP method

The main way of making HTTP requests is the request method, which accepts the following arguments:

An HTTP method, such as GET, POST, PUT, and so on.
A request URL. You can use the uri module to construct a URL from its components.
(Optional) a request body for the POST, PUT, and PATCH methods.
(Optional) request options, such as request headers, SSL settings, and so on.

The example below shows how to make the GET request to the https://httpbin.org/get URL:

local http_client = require('http.client').new()
local response = http_client:request('GET', 'https://httpbin.org/get')

In addition to request, the HTTP client provides the API for particular HTTP methods: get, post, put, and so on. For example, you can replace the request above by calling get as follows:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/get')

Query parameters

To add query string parameters, use the params option exposed by the request_options object:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/get', {
    params = { page = 1 },
})
print('URL: '..response.url)

In the example above, the requested URL is https://httpbin.org/get?page=1.

Note

If a parameter name or value contains a reserved character (for example, & or =), the HTTP client encodes a query string.

Headers

To add headers to the request, use the headers option:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/headers', {
    headers = {
        ['User-Agent'] = 'Tarantool HTTP client',
        ['Authorization'] = 'Bearer abc123'
    }
})
print('Authorization: '..response:decode()['headers']['Authorization'])

Cookies

You can add cookies to the request using the headers option:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/cookies', {
    headers = {
        ['Cookie'] = 'session_id=abc123; csrftoken=u32t4o;',
    }
})
print(response.body)

To learn how to obtain cookies passed in the Set-Cookie response header, see Response cookies.

Body

Serialization

The HTTP client automatically serializes the content in a specific format when sending a request based on the specified Content-Type header. By default, the client uses the application/json content type and sends data serialized as JSON:

local http_client = require('http.client').new()
local response = http_client:post('https://httpbin.org/anything', {
    user_id = 123,
    user_name = 'John Smith'
})
print('Posted data: '..response:decode()['data'])

The body for the request above might look like this:

{
    "user_id": 123,
    "user_name": "John Smith"
}

To send data in the YAML or MsgPack format, set the Content-Type header explicitly to application/yaml or application/msgpack, for example:

local http_client = require('http.client').new()
local response = http_client:post('https://httpbin.org/anything', {
    user_id = 123,
    user_name = 'John Smith'
}, {
    headers = {
        ['Content-Type'] = 'application/yaml',
    }
})
print('Posted data:\n'..response:decode()['data'])

In this case, the request body is serialized to YAML:

user_id: 123
user_name: John Smith

Form parameters

To send form parameters using the application/x-www-form-urlencoded type, use the params option:

local http_client = require('http.client').new()
local response = http_client:post('https://httpbin.org/anything', nil, {
    params = { user_id = 123, user_name = 'John Smith' },
})
print('User ID: '..response:decode()['form']['user_id'])

Streaming upload

The HTTP client supports chunked writing of request data. This can be achieved as follows:

Set the chunked option to true. In this case, a request method returns io_object instead of response_object.
Use the io_object.write() method to write a chunk of data.
Call the io_object.finish() method to finish writing data and make a request.

The example below shows how to upload data in two chunks:

local http_client = require('http.client').new()
local json = require('json')

local io = http_client:post('https://httpbin.org/anything', nil, {chunked = true})
io:write('Data part 1')
io:write('Data part 2')
io:finish()
local response = io:read('\r\n')
local decoded_data = json.decode(response)
print('Posted data: '..decoded_data['data'])

Receiving responses

All methods used to make an HTTP request (request, get, post, etc.) return response_object. response_object exposes the API required to get a response body and obtain response parameters, such as a status code, headers, and so on.

Status code

To get a response’s status code and text, use the response_object.status and response_object.reason fields, respectively:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/get')
print('Status: '..response.status..' '.. response.reason)

Headers

The response_object.headers option returns a set of response headers. The example below shows how to obtain the ETag header value:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/etag/7c876b7e')
print('ETag header value: '..response.headers['etag'])

Cookies

To obtain response cookies, use response_object.cookies. This option returns a Lua table where a cookie name is the key. The value is an array of two elements where the first one is the cookie value and the second one is an array with the cookie’s options.

The example below shows how to obtain the session_id cookie value:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/cookies/set?session_id=abc123&csrftoken=u32t4o&', {follow_location = false})
print("'session_id' cookie value: "..response.cookies['session_id'][1])
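
The table shape described above can be explored without a network call. In the sketch below, `sample_cookies` is a hand-written stand-in for response.cookies (the option strings are assumptions chosen for illustration), and `cookie_values` is a hypothetical helper that keeps only the cookie values:

```lua
-- Hand-written stand-in for response.cookies:
-- cookie name -> { value, { options... } }
local sample_cookies = {
    session_id = { 'abc123', { 'Path=/' } },
    csrftoken  = { 'u32t4o', { 'Path=/' } },
}

-- Collect a plain name -> value map, dropping the options.
local function cookie_values(cookies)
    local values = {}
    for name, entry in pairs(cookies) do
        values[name] = entry[1]
    end
    return values
end

local values = cookie_values(sample_cookies)
print(values.session_id, values.csrftoken)
```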

Response body

Deserialization

The HTTP client can deserialize response data to a Lua object based on the Content-Type response header value. To deserialize data, call the response_object.decode() method. In the example below, the JSON response is deserialized into a Lua object:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/json')
local document = response:decode()
print("'title' value: "..document['slideshow']['title'])

The following content types are supported out of the box:

application/json
application/msgpack
application/yaml

If the response doesn’t have the Content-Type header, the client uses application/json.

To deserialize other content types, you need to provide a custom deserializer using the client_object.decoders property. In the example below, application/xml responses are decoded using the luarapidxml library:

local http_client = require('http.client').new()
local xml = require("luarapidxml")

http_client.decoders = {
    ['application/xml'] = function(body, _content_type)
        return xml.decode(body)
    end,
}

local response = http_client:get('https://httpbin.org/xml')
local document = response:decode()
print("'title' value: "..document['attr']['title'])

The output for the code sample above should look as follows:

'title' value: Sample Slide Show
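
A decoder is an ordinary Lua function that takes the response body and the content type, as in the application/xml example above. As a further sketch, the function below splits a comma-separated body into rows. The text/csv content type and the `decode_csv` name are assumptions made for illustration, and the parser handles neither quoted fields nor escaped commas. The `decoders` table here is a plain Lua table so that the snippet runs on its own; in Tarantool, the same function would be assigned to http_client.decoders['text/csv'].

```lua
-- Split a simple CSV body into an array of rows (arrays of strings).
-- Quoted fields and embedded commas are not supported.
local function decode_csv(body, _content_type)
    local rows = {}
    for line in body:gmatch('[^\r\n]+') do
        local row = {}
        -- Append a trailing comma so that every field, including an empty
        -- last one, is terminated by a comma.
        for field in (line .. ','):gmatch('([^,]*),') do
            row[#row + 1] = field
        end
        rows[#rows + 1] = row
    end
    return rows
end

-- Same shape as client_object.decoders: content type -> decoder function.
local decoders = {
    ['text/csv'] = decode_csv,
}

local rows = decoders['text/csv']('id,name\n1,John Smith\n', 'text/csv')
print(rows[2][2])
```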

Decompressing

The HTTP client can automatically decompress a response body based on the Content-Encoding header value. To enable this capability, pass the required formats using the request_options.accept_encoding option:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/gzip', {accept_encoding = "br, gzip, deflate"})
print('Is response gzipped: '..tostring(response:decode()['gzipped']))

Streaming download

The HTTP client supports chunked reading of response data. This can be achieved as follows:

Set the chunked option to true. In this case, a request method returns io_object instead of response_object.
Use the io_object.read() method to read data in chunks of a specified length or up to a specific delimiter.
Call the io_object.finish() method to finish reading data.

The example below shows how to get chunks of a JSON response sequentially instead of waiting for the entire response:

local http_client = require('http.client').new()
local json = require('json')

local io = http_client:get('https://httpbin.org/stream/5', {chunked = true})
local chunk_ids = ''
while true do
    local data = io:read('\n')
    -- An empty string means there is nothing more to read
    if data == '' then break end
    local decoded_data = json.decode(data)
    chunk_ids = chunk_ids..decoded_data['id']..' '
end
print('IDs of received chunks: '..chunk_ids)
io:finish()
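
The read loop relies on read() returning an empty string once the stream is exhausted. To try the loop shape without a network connection, the sketch below uses `fake_io`, a hypothetical stand-in that is not the real io_object: it only mimics read(delimiter) and finish() over a fixed in-memory string. As an assumption of this stub, each returned chunk keeps its trailing delimiter.

```lua
-- Hypothetical stand-in for io_object, backed by an in-memory string.
local function fake_io(payload)
    local position = 1
    local io_stub = {}
    -- Return data up to and including the delimiter,
    -- or '' when nothing is left to read.
    function io_stub:read(delimiter)
        if position > #payload then
            return ''
        end
        local _, stop = payload:find(delimiter, position, true)
        stop = stop or #payload
        local chunk = payload:sub(position, stop)
        position = stop + 1
        return chunk
    end
    -- Nothing to release in the stub.
    function io_stub:finish() end
    return io_stub
end

local stream = fake_io('{"id": 0}\n{"id": 1}\n{"id": 2}\n')
local lines = {}
while true do
    local data = stream:read('\n')
    if data == '' then break end
    lines[#lines + 1] = data
end
stream:finish()
print(#lines)
```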

Redirects

By default, the HTTP client follows redirects to the URL provided in the Location header of a 3xx response. If required, you can disable this behavior using the follow_location option:

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/cookies/set?session_id=abc123&csrftoken=u32t4o&', {follow_location = false})

API Reference

Functions
http.client.new()	Create an HTTP client instance
Objects
client_options	Configuration options of the client
client_object	An HTTP client instance
request_options	Options passed to a request
response_object	A response object
io_object	An IO object used to read/write data in chunks

http.client.new()

http.client.new([options])¶

Create an HTTP client instance.

Parameters:	options (`table`) – configuration options of the client (see client_options)
Return:	a new HTTP client instance (see client_object)
Rtype:	userdata

Example

local http_client = require('http.client').new()

client_options

object client_options¶

Configuration options of the client. These options can be passed to the http.client.new() function.

client_options.max_connections¶

Specifies the maximum number of entries in the connection cache. This option affects libcurl CURLMOPT_MAXCONNECTS. The default is -1.

Example

local http_client = require('http.client').new({max_connections = 5})

Note

Do not set max_connections to less than max_total_connections unless you are confident about your actions. If max_connections is less than max_total_connections, libcurl doesn’t reuse sockets in some cases for requests that go to the same host. If the limit is reached and a new request occurs, then libcurl creates a new socket first, sends the request, waits for the first connection to be free, and closes it to avoid exceeding the max_connections cache size. In the worst case, libcurl creates a new socket for every request, even if all requests go to the same host.

client_options.max_total_connections¶

Specifies the maximum number of active connections. This option affects libcurl CURLMOPT_MAX_TOTAL_CONNECTIONS.

Note

You may want to control the maximum number of sockets that a particular HTTP client uses simultaneously. If a system passes many requests to distinct hosts, then libcurl cannot reuse sockets. In this case, setting max_total_connections may be useful since it causes curl to avoid creating too many sockets, which would not be used anyway.
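
The relationship described in the note for max_connections can be checked before a client is created. The `check_pool_options` helper below is hypothetical and not part of the module; it only compares the two numbers and, as an assumption, skips the check when either option is missing or non-positive (for example, the default max_connections of -1).

```lua
-- Return true if the options are consistent with the note for
-- max_connections, or false and a message otherwise.
local function check_pool_options(options)
    local cache_size = options.max_connections
    local active_limit = options.max_total_connections
    if type(cache_size) ~= 'number' or type(active_limit) ~= 'number'
            or cache_size <= 0 or active_limit <= 0 then
        return true
    end
    if cache_size < active_limit then
        return false, 'max_connections is less than max_total_connections'
    end
    return true
end

print(check_pool_options({ max_connections = 5, max_total_connections = 10 }))
```

A table that passes the check can then be given to http.client.new().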

client_object

object client_object¶

An HTTP client instance that exposes the API for making requests. To create the client, call http.client.new().

client_object:request(method, url, body, opts)¶

Make an HTTP request and receive a response.

Parameters:

method (string) – a request HTTP method. Possible values: GET, POST, PUT, PATCH, OPTIONS, HEAD, DELETE, TRACE, CONNECT.
url (string) – a request URL, for example, https://httpbin.org/get
body (string) – a request body (see Body)
opts (table) – request options (see request_options)

Return:

This method returns one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

Example

local http_client = require('http.client').new()
local response = http_client:request('GET', 'https://httpbin.org/get')

See also: Making requests, Receiving responses

client_object:get(url, opts)¶

Make a GET request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/get
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

Example

local http_client = require('http.client').new()
local response = http_client:get('https://httpbin.org/get')

See also: Making requests, Receiving responses

client_object:post(url, body, opts)¶

Make a POST request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/post
body (string) – a request body (see Body)
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

Example

local http_client = require('http.client').new()
local response = http_client:post('https://httpbin.org/anything', {
    user_id = 123,
    user_name = 'John Smith'
})
print('Posted data: '..response:decode()['data'])

See also: Making requests, Receiving responses

client_object:put(url, body, opts)¶

Make a PUT request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/put
body (string) – a request body (see Body)
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:patch(url, body, opts)¶

Make a PATCH request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/patch
body (string) – a request body (see Body)
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:delete(url, opts)¶

Make a DELETE request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/delete
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:head(url, opts)¶

Make a HEAD request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/get
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:options(url, opts)¶

Make an OPTIONS request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/get
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:trace(url, opts)¶

Make a TRACE request and receive a response.

Parameters:

url (string) – a request URL, for example, https://httpbin.org/get
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:connect(url, opts)¶

Make a CONNECT request and receive a response.

Parameters:

url (string) – a request URL, for example, server.example.com:80
opts (table) – request options (see request_options)

Return:

This method might return one of the following objects:

response_object
io_object if request_options.chunked is set to true

Rtype:

table

See also: Making requests, Receiving responses

client_object:stat()¶

Get a table with statistics for the HTTP client:

active_requests – the number of currently executing requests
sockets_added – the total number of sockets added into an event loop
sockets_deleted – the total number of sockets deleted from an event loop
total_requests – the total number of requests
http_200_responses – the total number of requests that returned HTTP 200 OK responses
http_other_responses – the total number of requests that returned non-200 OK responses
failed_requests – the total number of failed requests, including system, curl, and HTTP errors
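
These counters are plain numbers, so derived metrics are easy to compute. The sketch below uses a hypothetical `failure_ratio` helper and a hand-written sample table with arbitrary values; in Tarantool, the table would come from http_client:stat().

```lua
-- Share of failed requests among all requests, in the range 0..1.
local function failure_ratio(stat)
    if stat.total_requests == 0 then
        return 0
    end
    return stat.failed_requests / stat.total_requests
end

-- Hand-written sample using the field names listed above.
local sample_stat = {
    active_requests = 0,
    sockets_added = 4,
    sockets_deleted = 4,
    total_requests = 8,
    http_200_responses = 6,
    http_other_responses = 2,
    failed_requests = 2,
}

print(failure_ratio(sample_stat))
```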

client_object.decoders¶

Since: 2.11.0

Decoders used to deserialize response data based on the Content-Type header value. Learn more from Deserialization.

request_options

object request_options¶

Options passed to a request method (request, get, post, and so on).

See also: Making requests

request_options.ca_file¶

The path to an SSL certificate file to verify the peer with.

Rtype:	string

request_options.ca_path¶

The path to a directory holding one or more certificates to verify the peer with.

Rtype:	string

request_options.chunked¶

Since: 2.11.0

Specifies whether an HTTP client should return the full response (response_object) or an IO object (io_object) used for streaming download/upload.

Rtype:	boolean

See also: Streaming download, Streaming upload

request_options.headers¶

A table of HTTP headers passed to a request.

Rtype:	table

request_options.params¶

Since: 2.11.0

A table of parameters passed to a request. The behavior of this option depends on the request type, for example:

For a GET request, this option specifies query string parameters.
For a POST request, this option specifies form parameters to be sent using the application/x-www-form-urlencoded type.

Rtype:	table

request_options.keepalive_idle¶

A delay (in seconds) the operating system waits while the connection is idle before sending keepalive probes.

Rtype:	integer

request_options.keepalive_interval¶

The interval (in seconds) the operating system waits between sending keepalive probes. If both keepalive_idle and keepalive_interval are set, then Tarantool also sets the HTTP keepalive headers: Connection:Keep-Alive and Keep-Alive:timeout=<keepalive_idle>. Otherwise, Tarantool sends Connection:close.

Rtype:	integer

See also: CURLOPT_TCP_KEEPINTVL
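
The header rule for keepalive_idle and keepalive_interval can be modeled in a few lines. The `keepalive_headers` function below is a hypothetical model written for illustration, not the client's implementation; it only restates which headers follow from which options.

```lua
-- Model of the rule above: which connection headers follow from the options.
local function keepalive_headers(opts)
    if opts.keepalive_idle and opts.keepalive_interval then
        return {
            ['Connection'] = 'Keep-Alive',
            ['Keep-Alive'] = 'timeout=' .. opts.keepalive_idle,
        }
    end
    -- At least one of the two options is missing.
    return { ['Connection'] = 'close' }
end

local headers = keepalive_headers({ keepalive_idle = 60, keepalive_interval = 10 })
print(headers['Connection'], headers['Keep-Alive'])
```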

request_options.low_speed_limit¶

The transfer speed threshold, in bytes per second. If the average transfer speed stays below this value for “low speed time” seconds, the library considers the transfer too slow and aborts it.

Rtype:	integer

See also: CURLOPT_LOW_SPEED_LIMIT

request_options.low_speed_time¶

The time, in seconds, that the transfer speed must stay below the “low speed limit” for the library to consider the transfer too slow and abort it.

Rtype:	integer

See also: CURLOPT_LOW_SPEED_TIME

request_options.max_header_name_len¶

The maximum length of a header name. If a header name length exceeds this value, it is truncated to this length. The default value is 32.

Rtype:	integer

request_options.follow_location¶

Specify whether the HTTP client follows redirect URLs provided in the Location header for 3xx responses. When a non-3xx response is received, the client returns it as a result. If you set this option to false, the client returns the first 3xx response.

Rtype:	boolean

See also: Redirects

request_options.no_proxy¶

A comma-separated list of hosts that do not require proxies, or *, or ''.

Set no_proxy = host [, host ...] to specify hosts that can be reached without requiring a proxy, even if proxy is set to a non-blank value and/or if a proxy-related environment variable has been set.
Set no_proxy = '*' to specify that all hosts can be reached without requiring a proxy, which is equivalent to setting proxy=''.
Set no_proxy = '' to specify that no hosts can be reached without requiring a proxy, even if a proxy-related environment variable (HTTP_PROXY) is used.

If no_proxy is not set, then a proxy-related environment variable (HTTP_PROXY) may be used.

Rtype:	string

See also: CURLOPT_NOPROXY

request_options.proxy¶

A proxy server host or IP address, or ''.

If proxy is a host or IP address, then it may begin with a scheme, for example, https:// for an HTTPS proxy or http:// for an HTTP proxy.
If proxy is set to '' (an empty string), then proxy use is disabled, and no proxy-related environment variable is used.
If proxy is not set, then a proxy-related environment variable may be used, such as HTTP_PROXY or HTTPS_PROXY or FTP_PROXY, or ALL_PROXY if the protocol can be any protocol.

Rtype:	string

See also: CURLOPT_PROXY

request_options.proxy_port¶

A proxy server port. The default is 443 for an HTTPS proxy and 1080 for a non-HTTPS proxy.

Rtype:	integer

See also: CURLOPT_PROXYPORT

request_options.proxy_user_pwd¶

A proxy server username and password. This option might have one of the following formats:

proxy_user_pwd = user_name:
proxy_user_pwd = :password
proxy_user_pwd = user_name:password

Rtype:	string

See also: CURLOPT_USERPWD

request_options.ssl_cert¶

A path to an SSL client certificate file.

Rtype:	string

See also: CURLOPT_SSLCERT

request_options.ssl_key¶

A path to a private key file for a TLS and SSL client certificate.

Rtype:	string

See also: CURLOPT_SSLKEY

request_options.timeout¶

The number of seconds to wait for a curl API read request before timing out. The default timeout is set to infinity (365 * 86400 * 100 seconds).

Rtype:	integer

request_options.unix_socket¶

A socket name to use instead of an Internet address for a local connection.

Rtype:	string

Example: /tmp/unix_domain_socket.sock

request_options.verbose¶

Turn on/off a verbose mode.

Rtype:	boolean

request_options.verify_host¶

Enable verification of the certificate’s name (CN) against the specified host.

Rtype:	integer

See also: CURLOPT_SSL_VERIFYHOST

request_options.verify_peer¶

Set on/off verification of the peer’s SSL certificate.

Rtype:	integer

See also: CURLOPT_SSL_VERIFYPEER

request_options.accept_encoding¶

Enable decompression of an HTTP response data based on the specified Accept-Encoding request header. You can pass the following values to this option:

'' – if an empty string is passed, the Accept-Encoding contains all the supported encodings (identity, deflate, gzip, and br).
br, gzip, deflate – a comma-separated list of encodings passed in Accept-Encoding.

Rtype:	string

See also: CURLOPT_ACCEPT_ENCODING

response_object

object response_object¶

A response object returned by a request method (request, get, post, and so on).

See also: io_object

response_object.status¶

A response status code.

Rtype:	integer

See also: Status code

response_object.reason¶

A response status text.

Rtype:	string

See also: Status code

response_object.headers¶

Response headers.

Rtype:	table

See also: Headers

response_object.cookies¶

Response cookies. This is a Lua table where a cookie name is the key. The value is an array of two elements where the first one is the cookie value and the second one is an array with the cookie’s options.

Rtype:	table

See also: Cookies

response_object.body¶

A response body. Use decode to decode the response body.

Rtype:	string

See also: Response body

response_object.proto¶

An HTTP protocol version.

Rtype:	string

response_object:decode()¶

Since: 2.11.0

Decode the response body to a Lua object based on the content type.

Return:	a decoded body
Rtype:	table

See also: Deserialization

io_object

object io_object¶

Since: 2.11.0

An IO object used to read or write data in chunks. To get an IO object instead of the full response (response_object), you need to set the chunked request option to true.

io_object:read(chunk[, timeout])¶

io_object:read(delimiter[, timeout])

io_object:read({chunk = chunk, delimiter = delimiter}[, timeout])

Read response data in chunks of a specified length or up to a specific delimiter.

Parameters:

chunk (integer) – the maximum number of bytes to read
delimiter (string) – the delimiter used to stop reading data
timeout (integer) – the number of seconds to wait. The default is 10.

Return:	A chunk of read data. Returns an empty string if there is nothing more to read.
Rtype:	string

See also: Streaming download

io_object:write(data[, timeout])¶

Write the specified chunk of data.

Parameters:

data (table) – data to be written
timeout (integer) – the number of seconds to wait. The default is 10.

See also: Streaming upload

io_object:finish([timeout])¶

Finish reading or writing data.

Parameters:	timeout (`integer`) – the number of seconds to wait. The default is `10`.

See also: Streaming download, Streaming upload

Module iconv

Overview

The iconv module provides a way to convert a string with one encoding to a string with another encoding, for example from ASCII to UTF-8. It is based on the POSIX iconv routines.

The exact list of available encodings may depend on the environment. Typically the list includes ASCII, BIG5, KOI8R, LATIN8, MS-GREEK, SJIS, and about 100 others. For a complete list, type iconv --list on a terminal.

Index

Below is a list of all iconv functions.

Name	Use
iconv.new()	Create an iconv instance
iconv.converter()	Perform conversion on a string

iconv.new(to, from)¶

Construct a new iconv instance.

Parameters:

to (string) – the name of the encoding to convert to
from (string) – the name of the encoding to convert from

Return:	a new iconv instance – in effect, a callable function
Rtype:	userdata

If either parameter is not a valid name, there will be an error message.

Example:

tarantool> converter = require('iconv').new('UTF8', 'ASCII')
---
...

iconv.converter(input-string)¶

Convert.

Parameters:	input-string (`string`) – the string to be converted (the “from” string)
Return:	the string that results from the conversion (the “to” string)

If anything in input-string cannot be converted, there will be an error message and the result string will be unchanged.

Example:

We know that the Unicode code point for “Д” (CYRILLIC CAPITAL LETTER DE) is hexadecimal 0414 according to the character database of Unicode. Therefore that is what it will look like in UTF-16. We know that Tarantool typically uses the UTF-8 character set. So make a from-UTF-8-to-UTF-16 converter, use string.hex(‘Д’) to show what Д’s encoding looks like in the UTF-8 source, and use string.hex(‘Д’-after-conversion) to show what it looks like in the UTF-16 target. Since the result is 0414, we see that iconv conversion works. (Different iconv implementations might use different names, for example UTF-16BE instead of UTF16BE.)

tarantool> string.hex('Д')
---
- d094
...

tarantool> converter = require('iconv').new('UTF16BE', 'UTF8')
---
...

tarantool> utf16_string = converter('Д')
---
...

tarantool> string.hex(utf16_string)
---
- '0414'
...
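
The expected value can also be cross-checked by hand, without iconv. The sketch below decodes the two UTF-8 bytes of “Д” (d0 94) into a code point using plain arithmetic; `decode_two_byte_utf8` is a hypothetical helper that handles only two-byte sequences and does no validation.

```lua
-- Decode a two-byte UTF-8 sequence (110xxxxx 10xxxxxx) into a code point.
local function decode_two_byte_utf8(text)
    local lead, continuation = text:byte(1, 2)
    -- Strip the 110 and 10 prefixes, then join the 5 + 6 payload bits.
    return (lead - 0xC0) * 64 + (continuation - 0x80)
end

-- '\208\148' is the UTF-8 encoding of 'Д' (bytes d0 94).
local code_point = decode_two_byte_utf8('\208\148')
print(string.format('%04X', code_point))
```

The printed value is 0414, which agrees with the UTF-16BE bytes produced by the converter above.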

Module jit

Overview

The jit module has functions for tracing the LuaJIT Just-In-Time compiler’s progress, showing the byte-code or assembler output that the compiler produces, and in general providing information about what LuaJIT does with Lua code.

Index

Below is a list of all jit functions.

Note

In this document, we will use:

jit_bc for require('jit.bc'),
jit_dis_x86 for require('jit.dis_x86'),
jit_dis_x64 for require('jit.dis_x64'),
jit_v for require('jit.v'),
jit_dump for require('jit.dump').

Name	Use
jit_bc.dump()	Print the byte code of a function
jit_dis_x86.disass()	Print the i386 assembler code of a string of bytes
jit_dis_x64.disass()	Print the x86-64 assembler code of a string of bytes
jit_dump.on(), jit_dump.off()	Print the intermediate or machine code of the following Lua code
jit_v.on(), jit_v.off()	Print a trace of LuaJIT’s progress compiling and interpreting code

jit_bc.dump(function)¶

Prints the byte code of a function.

Example:

tarantool> jit_bc = require('jit.bc')
---
...

tarantool> function f()
         > print("D")
         > end
---
...

tarantool> jit_bc.dump(f)
-- BYTECODE -- 0x01113163c8:1-3
0001    GGET     0   0      ; "print"
0002    KSTR     2   1      ; "D"
0003    CALL     0   1   2
0004    RET0     0   1

---
...

function f()
  print("D")
end
require('jit.bc').dump(f)

For a list of available options, read the source code of bc.lua.

jit_dis_x86.disass(string)¶

Prints the i386 assembler code of a string of bytes.

Example:

tarantool> -- Disassemble hexadecimal 97 which is the x86 code for xchg eax, edi
---
...

tarantool> jit_dis_x86 = require('jit.dis_x86')
---
...

tarantool> jit_dis_x86.disass('\x97')
00000000  97                xchg eax, edi
---
...

For a list of available options, read the source code of dis_x86.lua.

jit_dis_x64.disass(string)¶

Prints the x86-64 assembler code of a string of bytes.

Example:

tarantool> -- Disassemble hexadecimal 97 which is the x86-64 code for xchg eax, edi
---
...

tarantool> jit_dis_x64 = require('jit.dis_x64')
---
...

tarantool> jit_dis_x64.disass('\x97')
00000000  97                xchg eax, edi
---
...

For a list of available options, read the source code of dis_x64.lua.

jit_dump.on(option[, output file])¶

jit_dump.off()¶

Prints the intermediate or machine code of the following Lua code.

Example:

tarantool> -- Show the machine code of a Lua "for" loop
tarantool> jit_dump = require('jit.dump')
tarantool> jit_dump.on('m')
tarantool> x = 0;
tarantool> for i = 1, 1e6 do
         > x = x + i
         > end
---- TRACE 1 start 0x01047fbc38:1
---- TRACE 1 mcode 148
104c29f6b  mov dword [r14-0xed0], 0x1
104c29f76  cvttsd2si ebp, [rdx]
104c29f7a  rorx rbx, [rdx-0x10], 0x2f
104c29f81  shr rbx, 0x11
104c29f85  mov rdx, [rbx+0x10]
104c29f89  cmp dword [rdx+0x34], +0x3f
104c29f8d  jnz 0x104c1a010  ->0
104c29f93  mov rcx, [rdx+0x28]
104c29f97  mov rdi, 0xfffd8001046b3d58
104c29fa1  cmp rdi, [rcx+0x320]
104c29fa8  jnz 0x104c1a010  ->0
104c29fae  lea rax, [rcx+0x318]
104c29fb5  cmp dword [rax+0x4], 0xfff90000
104c29fbc  jnb 0x104c1a010  ->0
104c29fc2  xorps xmm7, xmm7
104c29fc5  cvtsi2sd xmm7, ebp
104c29fc9  addsd xmm7, [rax]
104c29fcd  movsd [rax], xmm7
104c29fd1  add ebp, +0x01
104c29fd4  cmp ebp, 0x000f4240
104c29fda  jg 0x104c1a014   ->1
->LOOP:
104c29fe0  xorps xmm6, xmm6
104c29fe3  cvtsi2sd xmm6, ebp
104c29fe7  addsd xmm7, xmm6
104c29feb  movsd [rax], xmm7
104c29fef  add ebp, +0x01
104c29ff2  cmp ebp, 0x000f4240
104c29ff8  jle 0x104c29fe0  ->LOOP
104c29ffa  jmp 0x104c1a01c  ->3
---- TRACE 1 stop -> loop

---
...

tarantool> print(x)
500000500000
---
...

tarantool> jit_dump.off()
---
...

For a list of available options, read the source code of dump.lua.

jit_v.on(option[, output file])¶

jit_v.off()¶

Prints a trace of LuaJIT’s progress compiling and interpreting code.

Example:

tarantool> -- Show what LuaJIT is doing for a Lua "for" loop
tarantool> jit_v = require('jit.v')
tarantool> jit_v.on()
tarantool> l = 0
tarantool> for i = 1, 1e6 do
         >     l = l + i
         > end
[TRACE   3 "for i = 1, 1e6 do
    l = l + i
end":1 loop]
---
...

tarantool> print(l)
500000500000
---
...

tarantool> jit_v.off()
---
...

For a list of available options, read the source code of v.lua.

Module json

Overview

The json module provides JSON manipulation routines. It is based on the Lua-CJSON module by Mark Pulford. For a complete manual on Lua-CJSON please read the official documentation.

Index

Below is a list of all json functions and members.

Name	Use
json.encode()	Convert a Lua object to a JSON string
json.decode()	Convert a JSON string to a Lua object
__serialize parameter	Output structure specification
json.cfg()	Change configuration
json.NULL	Analog of Lua’s “nil”

json.encode(lua-value[, configuration])¶

Convert a Lua object to a JSON string.

Parameters:

lua-value – either a scalar value or a Lua table value
configuration – see json.cfg

Return:	the original value reformatted as a JSON string.
Rtype:	string

Example:

tarantool> json=require('json')
---
...
tarantool> json.encode(123)
---
- '123'
...
tarantool> json.encode({123})
---
- '[123]'
...
tarantool> json.encode({123, 234, 345})
---
- '[123,234,345]'
...
tarantool> json.encode({abc = 234, cde = 345})
---
- '{"cde":345,"abc":234}'
...
tarantool> json.encode({hello = {'world'}})
---
- '{"hello":["world"]}'
...

json.decode(string[, configuration])¶

Convert a JSON string to a Lua object.

Parameters:

string (string) – a string formatted as JSON
configuration – see json.cfg

Return:	the original contents formatted as a Lua table.
Rtype:	table

Example:

tarantool> json = require('json')
---
...
tarantool> json.decode('123')
---
- 123
...
tarantool> json.decode('[123, "hello"]')
---
- [123, 'hello']
...
tarantool> json.decode('{"hello": "world"}').hello
---
- world
...

See the tutorial Sum a JSON field for all tuples to see how json.decode() can fit in an application.

__serialize parameter:

The JSON output structure can be specified with __serialize:

‘seq’, ‘sequence’, ‘array’ - table encoded as an array
‘map’, ‘mapping’ - table encoded as a map
function - the meta-method called to unpack serializable representation of table, cdata or userdata objects

Serializing ‘A’ and ‘B’ with different __serialize values brings different results:

tarantool> json.encode(setmetatable({'A', 'B'}, { __serialize="seq"}))
---
- '["A","B"]'
...
tarantool> json.encode(setmetatable({'A', 'B'}, { __serialize="map"}))
---
- '{"1":"A","2":"B"}'
...
tarantool> json.encode({setmetatable({f1 = 'A', f2 = 'B'}, { __serialize="map"})})
---
- '[{"f2":"B","f1":"A"}]'
...
tarantool> json.encode({setmetatable({f1 = 'A', f2 = 'B'}, { __serialize="seq"})})
---
- '[[]]'
...

json.cfg(table)¶

Set values that affect the behavior of json.encode and json.decode.

The values are all either integers or boolean true/false.

Option	Default	Use
`cfg.encode_max_depth`	128	Max recursion depth for encoding
`cfg.encode_deep_as_nil`	false	A flag saying whether to crop tables with nesting level deeper than `cfg.encode_max_depth`. Not-encoded fields are replaced with one null. If not set, too deep nesting is considered an error.
`cfg.encode_invalid_numbers`	true	A flag saying whether to enable encoding of NaN and Inf numbers
`cfg.encode_number_precision`	14	Precision of floating point numbers
`cfg.encode_load_metatables`	true	A flag saying whether the serializer will follow __serialize metatable field
`cfg.encode_use_tostring`	false	A flag saying whether to use `tostring()` for unknown types
`cfg.encode_invalid_as_nil`	false	A flag saying whether to use NULL for non-recognized types
`cfg.encode_sparse_convert`	true	A flag saying whether to handle excessively sparse arrays as maps. See detailed description below.
`cfg.encode_sparse_ratio`	2	1/`encode_sparse_ratio` is the permissible percentage of missing values in a sparse array.
`cfg.encode_sparse_safe`	10	A limit ensuring that small Lua arrays are always encoded as sparse arrays (instead of generating an error or encoding as a map)
`cfg.decode_invalid_numbers`	true	A flag saying whether to enable decoding of NaN and Inf numbers
`cfg.decode_save_metatables`	true	A flag saying whether to set metatables for all arrays and maps
`cfg.decode_max_depth`	128	Max recursion depth for decoding

Important

cfg.decode_save_metatables. Decoder uses globally defined tables as metatables for arrays and maps. You must not change entries of decode() result’s table metatable, because it affects all results and may lead to undefined behavior of other code. See Yaml for a detailed example.

Sparse arrays features:

During encoding, the JSON encoder tries to classify a table into one of four kinds:

map - at least one table index is not unsigned integer
regular array - all array indexes are available
sparse array - at least one array index is missing
excessively sparse array - the number of values missing exceeds the configured ratio

An array is excessively sparse when all the following conditions are met:

encode_sparse_ratio > 0
max(table) > encode_sparse_safe
max(table) > count(table) * encode_sparse_ratio

The JSON encoder will never consider an array to be excessively sparse when encode_sparse_ratio = 0. The encode_sparse_safe limit ensures that small Lua arrays are always encoded as sparse arrays. If encode_sparse_convert is set to true (the default listed in the table above), excessively sparse arrays are handled as maps. If it is set to false, attempting to encode an excessively sparse array generates an error.
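
The three conditions above can be restated as a small predicate. The sketch below is plain Lua and does not call the json module: `is_excessively_sparse` is a hypothetical helper, and it assumes that max(table) means the largest integer index and count(table) means the number of stored values.

```lua
-- Hypothetical helper (not part of the json module): mirrors the three
-- documented conditions for an "excessively sparse" array.
local function is_excessively_sparse(t, ratio, safe)
    local max, count = 0, 0
    for k in pairs(t) do
        -- Any key that is not a positive integer makes the table a map
        if type(k) ~= 'number' or k < 1 or k % 1 ~= 0 then
            return false
        end
        if k > max then max = k end
        count = count + 1
    end
    return ratio > 0 and max > safe and max > count * ratio
end

-- With the documented defaults (ratio = 2, safe = 10):
print(is_excessively_sparse({[1] = 'a', [5] = 'b'}, 2, 10))   -- false: max (5) is not above safe
print(is_excessively_sparse({[1] = 'a', [100] = 'b'}, 2, 10)) -- true: 100 > 10 and 100 > 2 * 2
print(is_excessively_sparse({[1] = 'a', [100] = 'b'}, 0, 10)) -- false: ratio = 0 disables the check
```

The second table is the kind of input that, depending on encode_sparse_convert, is either encoded as a map or rejected.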

json.cfg() example 1:

The following code will encode 0/0 as NaN (“not a number”) and 1/0 as Inf (“infinity”), rather than returning nil or an error message:

json = require('json')
json.cfg{encode_invalid_numbers = true}
x = 0/0
y = 1/0
json.encode({1, x, y, 2})

The result of the json.encode() request will look like this:

tarantool> json.encode({1, x, y, 2})
---
- '[1,nan,inf,2]'
...

json.cfg example 2:

To avoid generating errors on attempts to encode unknown data types such as userdata/cdata, you can use this code:

tarantool> httpc = require('http.client').new()
---
...

tarantool> json.encode(httpc.curl)
---
- error: unsupported Lua type 'userdata'
...

tarantool> json.encode(httpc.curl, {encode_use_tostring=true})
---
- '"userdata: 0x010a4ef2a0"'
...

Note

To achieve the same effect for only one call to json.encode() (i.e. without changing the configuration permanently), you can use json.encode({1, x, y, 2}, {encode_invalid_numbers = true}).

Similar configuration settings exist for MsgPack and YAML.
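
Per-call options can also be used for the depth settings from the table above. The following is a minimal sketch that assumes it runs inside Tarantool; the sample table `nested` is made up for illustration, and the exact text of the encoded string and of the error message is not shown because it may differ between versions.

```lua
local json = require('json')

local nested = {1, {2, {3, {4}}}}

-- Crop anything nested deeper than encode_max_depth instead of failing.
local cropped = json.encode(nested, {
    encode_max_depth = 2,
    encode_deep_as_nil = true,
})
print(cropped)

-- With encode_deep_as_nil = false, too deep nesting is an error,
-- so pcall() is used here to capture it.
local ok, err = pcall(json.encode, nested, {
    encode_max_depth = 2,
    encode_deep_as_nil = false,
})
print(ok, err)
```

Neither call changes the values set through json.cfg().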

json.NULL¶

A value comparable to Lua “nil” which may be useful as a placeholder in a tuple.

Example:

-- When nil is assigned to a Lua-table field, the field is null
tarantool> {nil, 'a', 'b'}
---
- - null
  - a
  - b
...
-- When json.NULL is assigned to a Lua-table field, the field is json.NULL
tarantool> {json.NULL, 'a', 'b'}
---
- - null
  - a
  - b
...
-- When json.NULL is assigned to a JSON field, the field is null
tarantool> json.encode({field2 = json.NULL, field1 = 'a', field3 = 'c'})
---
- '{"field2":null,"field1":"a","field3":"c"}'
...

Module key_def

The key_def module has a function for defining the field numbers and types of a tuple. The definition is usually used with an index definition to extract or compare the index key values.

key_def.new(parts)¶

Create a new key_def instance.

Parameters:	parts (`table`) – field numbers and types. There must be at least one part. Every part must contain the attributes `type` and `fieldno`/`field`. Other attributes are optional.
Returns:	a key_def object

The parts table has components which are the same as the parts option in Options for space_object:create_index().

fieldno (integer), for example, fieldno = 1. It is legal to use field instead of fieldno.

type (string), for example, type = 'string'.

Other components are optional.

Example: key_def.new({{type = 'unsigned', fieldno = 1}})

Example: key_def.new({{type = 'string', collation = 'unicode', field = 2}})

Since version 3.2.0, you can use the standard Lua operator # (__len metamethod) to check the key_def length (parts count).

Example

function is_full_pkey(space, key)
    return #space.index[0].parts == #key
end

object key_def_object¶

A key_def object is an object returned by key_def.new(). It has methods extract_key(), compare(), compare_with_key(), merge(), totable().

key_def_object:extract_key(tuple)¶

Return a tuple containing only the fields of the key_def object.

Parameters:	tuple (`table`) – tuple or Lua table with field contents
Return:	the fields defined for the `key_def` object

Example #1:

-- Suppose an item has five fields
-- 1, 99.5, 'X', nil, 99.5
-- and the fields that we care about are
-- #3 (a string) and #1 (an integer).
-- We can define those fields with k = key_def.new
-- and extract the values with k:extract_key.

tarantool> key_def = require('key_def')
---
...

tarantool> k = key_def.new({{type = 'string', fieldno = 3},
>                           {type = 'unsigned', fieldno = 1}})
---
...

tarantool> k:extract_key({1, 99.5, 'X', nil, 99.5})
---
- ['X', 1]
...

Example #2

-- Now suppose the item is a tuple in a space with
-- an index on field #3 plus field #1.
-- We can use key_def.new with the index definition
-- instead of filling it out (Example #1).
-- The result will be the same.
key_def = require('key_def')
box.schema.space.create('T')
i = box.space.T:create_index('I', {parts={3, 'string', 1, 'unsigned'}})
box.space.T:insert{1, 99.5, 'X', nil, 99.5}
k = key_def.new(i.parts)
k:extract_key(box.space.T:get({'X', 1}))

Example #3

-- Iterate through the tuples in a secondary non-unique index
-- extracting the tuples' primary-key values, so they could be deleted
-- using a unique index. This code should be a part of a Lua function.
local key_def_lib = require('key_def')
local s = box.schema.space.create('test')
local pk = s:create_index('pk')
local sk = s:create_index('test', {unique = false, parts = {
    {2, 'number', path = 'a'}, {2, 'number', path = 'b'}}})
s:insert{1, {a = 1, b = 1}}
s:insert{2, {a = 1, b = 2}}
local key_def = key_def_lib.new(pk.parts)
for _, tuple in sk:pairs({1}) do
    local key = key_def:extract_key(tuple)
    pk:delete(key)
end

key_def_object:compare(tuple_1, tuple_2)¶

Compare the key fields of tuple_1 with the key fields of tuple_2. It is a tuple-by-tuple comparison so users do not have to write code that compares one field at a time. Each field’s type and collation will be taken into account. In effect it is a comparison of extract_key(tuple_1) with extract_key(tuple_2).

Parameters:	tuple_1 (`table`) – tuple or Lua table with field contents; tuple_2 (`table`) – tuple or Lua table with field contents
Return:	> 0 if tuple_1 key fields > tuple_2 key fields, = 0 if tuple_1 key fields = tuple_2 key fields, < 0 if tuple_1 key fields < tuple_2 key fields

Example:

-- This will return 0
key_def = require('key_def')
k = key_def.new({{type = 'string', fieldno = 3, collation = 'unicode_ci'},
                 {type = 'unsigned', fieldno = 1}})
k:compare({1, 99.5, 'X', nil, 99.5}, {1, 99.5, 'x', nil, 99.5})
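
Because compare() returns a negative, zero, or positive number, it can be adapted to a sort comparator. The sketch below assumes it runs inside Tarantool; the `rows` data is invented for illustration.

```lua
local key_def_lib = require('key_def')

-- Order by field 3 (case-insensitive), then by field 1
local k = key_def_lib.new({
    {type = 'string', fieldno = 3, collation = 'unicode_ci'},
    {type = 'unsigned', fieldno = 1},
})

local rows = {
    {2, 99.5, 'b'},
    {1, 99.5, 'B'},
    {3, 99.5, 'a'},
}

-- table.sort() expects a "less than" predicate, that is, a negative result
table.sort(rows, function(a, b) return k:compare(a, b) < 0 end)

for _, row in ipairs(rows) do
    print(row[1], row[3])
end
```

With the case-insensitive collation, 'B' and 'b' compare as equal, so the second key part (field 1) decides the order of those two rows.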

key_def_object:compare_with_key(tuple_1, tuple_2)¶

Compare the key fields of tuple_1 with all the fields of tuple_2. This is the same as key_def_object:compare() except that tuple_2 contains only the key fields. In effect it is a comparison of extract_key(tuple_1) with tuple_2.

Parameters:	tuple_1 (`table`) – tuple or Lua table with field contents; tuple_2 (`table`) – tuple or Lua table containing only the key fields
Return:	> 0 if tuple_1 key fields > tuple_2 fields, = 0 if tuple_1 key fields = tuple_2 fields, < 0 if tuple_1 key fields < tuple_2 fields

Example:

-- Returns 0
key_def = require('key_def')
k = key_def.new({{type = 'string', fieldno = 3, collation = 'unicode_ci'},
                 {type = 'unsigned', fieldno = 1}})
k:compare_with_key({1, 99.5, 'X', nil, 99.5}, {'x', 1})

key_def_object:merge(other_key_def_object)¶

Combine the main key_def_object with other_key_def_object. The return value is a new key_def_object containing all the fields of the main key_def_object, then all the fields of other_key_def_object which are not in the main key_def_object.

Parameters:	other_key_def_object (`key_def_object`) – definition of fields to add
Return:	key_def_object

Example:

-- Returns a key definition with fieldno = 3 and fieldno = 1.
key_def = require('key_def')
k = key_def.new({{type = 'string', fieldno = 3}})
k2 = key_def.new({{type = 'unsigned', fieldno = 1},
                  {type = 'string', fieldno = 3}})
k:merge(k2)

key_def_object:totable()¶

Returns a table containing the fields of the key_def_object. This is the reverse of key_def.new():

key_def.new() takes a table and returns a key_def object,
key_def_object:totable() takes a key_def object and returns a table.

This is useful for input to __serialize methods.

Return:	table

Example:

-- Returns a table with type = 'string', fieldno = 3
key_def = require('key_def')
k = key_def.new({{type = 'string', fieldno = 3}})
k:totable()

key_def_object:validate_key(key)¶

Since version 3.1.0

Validates that all parts of the specified key match the key definition. Partial keys are considered valid. Returns nothing on success.

If the key fails the validation, a box.error type exception is raised.

Example:

-- Create a rule: key = {1 ('unsigned'), 2 (string)}
-- Validate key {1001} (only id data type). Returns nothing
-- Validate key {'x'}. ER_KEY_PART_TYPE is raised
-- Validate key ({1000, 2000}). ER_KEY_PART_TYPE is raised
-- Validate key ({1000, 'abc', 'xyz'}). ER_KEY_PART_COUNT is raised

tarantool> key_def = require('key_def').new({{fieldno = 1, type = 'unsigned'},
>                           {fieldno = 2, type = 'string'}})
---
...

tarantool> key_def:validate_key({1001})
---
...

tarantool> key_def:validate_key({'x'})
---
- error: 'Supplied key type of part 0 does not match index part type: expected unsigned'
...

tarantool> key_def:validate_key({1000, 2000})
---
- error: 'Supplied key type of part 1 does not match index part type: expected string'
...

tarantool> key_def:validate_key({1000, 'abc', 'xyz'})
---
- error: 'Invalid key part count: (expected [0..2], got 3)'
...
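
Since a failed validation raises an exception, application code may prefer to turn it into a return value. The following is a sketch for Tarantool 3.1.0 or later; `check_key` is a hypothetical wrapper, not part of the key_def module.

```lua
local key_def_lib = require('key_def')

local kd = key_def_lib.new({
    {fieldno = 1, type = 'unsigned'},
    {fieldno = 2, type = 'string'},
})

-- Hypothetical helper: returns true, or false plus the raised error
local function check_key(key)
    local ok, err = pcall(function() kd:validate_key(key) end)
    if ok then
        return true
    end
    return false, err
end

print(check_key({1001}))  -- a partial key is valid
print(check_key({'x'}))   -- wrong type for the first part
```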

key_def_object:validate_full_key(key)¶

Since version 3.1.0

Validates whether the input key contains all fields and matches the rules of the key definition object. Returns nothing on success.

If the key fails the validation, a box.error type exception is raised.

Example:

-- Create a rule: key = {1 ('unsigned'), 2 (string)}
-- Validate key {100, "Testuser"}. Returns nothing
-- Validate key ({100}). ER_EXACT_MATCH is raised

tarantool> key_def = require('key_def').new({{fieldno = 1, type = 'unsigned'},
>                           {fieldno = 2, type = 'string'}})
---
...

tarantool> key_def:validate_full_key({100, "Testuser"})
---
...

tarantool> key_def:validate_full_key({100})
---
- error: 'Invalid key part count in an exact match: (expected 2, got 1)'
...

key_def_object:validate_tuple(tuple)¶

Since version 3.1.0

Validates whether the tuple matches the rules of the key definition object. Returns nothing on success.

If the tuple fails the validation, a box.error type exception is raised.

Example:

-- Create a rule: tuple = {id (number), name (string), age (number)}
-- Validate tuple {1001, "Testuser", 28}. Returns nothing

tarantool> key_def = require('key_def').new({
>                           {fieldno = 1, type = 'number'},
>                           {fieldno = 2, type = 'string'},
>                           {fieldno = 3, type = 'number'}})
---
...

tarantool> key_def:validate_tuple({1001, "Testuser", 28})
---
...

key_def_object:compare_keys(key_a, key_b)¶

Since version 3.1.0

Compares two keys against each other according to the key definition object. On success, returns:

<0 if key_a parts are less than key_b parts
0 if key_a parts are equal to key_b parts
>0 if key_a parts are greater than key_b parts

If any key does not match the key definition rules, a box.error type exception is raised.

Example:

-- Create a rule: key = {1 ('unsigned'), 2 (string)}
-- Compare keys ({1000, 'x'}, {1000, 'y'}). Returns -1
-- Compare keys ({1000, 'x'}, {1000, 'x'}). Returns 0
-- Compare keys ({1000, 'x'}, {1000}). Returns 0
-- Compare keys ({2000, 'x'}, {1000, 'x'}). Returns 1

tarantool> key_def = require('key_def').new({{fieldno = 1, type = 'unsigned'},
>                           {fieldno = 2, type = 'string'}})
---
...

tarantool> key_def:compare_keys({1000, 'x'}, {1000, 'y'})
---
- -1
...

tarantool> key_def:compare_keys({1000, 'x'}, {1000, 'x'})
---
- 0
...

tarantool> key_def:compare_keys({1000, 'x'}, {1000})
---
- 0
...

tarantool> key_def:compare_keys({2000, 'x'}, {1000, 'x'})
---
- 1
...

Module log

Overview

Tarantool provides a set of options used to configure logging in various ways: you can set a level of logging, specify where to send the log’s output, configure a log format, and so on. The log module allows you to configure logging in your application and provides additional capabilities, for example, logging custom messages and rotating log files.

Index

Below is a list of all log functions.

Name	Use
log.cfg({})	Configure a logger
log.error() log.warn() log.info() log.verbose() log.debug()	Log a message with the specified level
log.pid()	Get the PID of a logger
log.rotate()	Rotate a log file
log.new()	Create a new logger with the specified name

log.cfg({})¶

Configure logging options. The following options are available:

level: Specify the level of detail the log has.

The example below shows how to set the log level to verbose:
```
local log = require('log')
log.cfg { level = 'verbose' }
```
See also: log.level.
log: Specify where to send the log’s output, for example, to a file, pipe, or system logger.

Example 1: sending the log to the tarantool.log file
```
log.cfg { log = 'tarantool.log' }
```
Example 2: sending the log to a pipe
```
log.cfg { log = '| cronolog tarantool.log' }
```
Example 3: sending the log to syslog
```
log.cfg { log = 'syslog:server=unix:/dev/log' }
```
See also: log.to.
nonblock: If true, Tarantool does not block during logging when the system is not ready for writing, and drops the message instead.

See also: log.nonblock.
format: Specify the log format: ‘plain’ or ‘json’.

See also: log.format.
modules: Configure the specified log levels for different modules.

See also: log.modules.

log.error(message)¶

log.warn(message)¶

log.info(message)¶

log.verbose(message)¶

log.debug(message)¶

Log a message with the specified logging level. You can learn more about the available levels from the log.level option description.

Example

The example below shows how to log a message with the warn level:

log.warn('Warning message')

Parameters:	message (`any`) – A log message. A message can be a string. A message may contain C-style format specifiers `%d` or `%s`. Example: log.info('Tarantool version: %s', box.info.version) A message may be a scalar data type or a table. Example: log.error({ 500, 'Internal error' })
Return:	nil

The actual output will be a line in the log, containing:

the current timestamp
a module name
‘E’, ‘W’, ‘I’, ‘V’ or ‘D’ depending on the called function
message

Note that the message will not be logged if the severity level corresponding to the called function is less than log.level.
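
The filtering rule can be modelled in a few lines of plain Lua. This sketch does not call the log module: `is_logged` is a hypothetical helper, and the numeric scale (lower number means more severe) is assumed from the log.level option description.

```lua
-- Numeric severities as assumed from the log.level option description
local LEVELS = {fatal = 0, syserror = 1, error = 2, crit = 3,
                warn = 4, info = 5, verbose = 6, debug = 7}

-- Hypothetical helper: would log.<fn_name>() produce a line when the
-- logger is configured with the given level?
local function is_logged(fn_name, configured_level)
    return LEVELS[fn_name] <= LEVELS[configured_level]
end

print(is_logged('warn', 'info'))   -- true: a warning is more severe than info
print(is_logged('debug', 'info'))  -- false: debug is less severe than info
```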

log.pid()¶

Return:	A PID of a logger. You can use this PID to send a signal to a log rotation program, so it can rotate logs.

log.rotate()¶

Rotate the log. For example, you need to call this function to continue logging after a log rotation program renames or moves a file with the latest logs.

Return:	nil

log.new(name)¶

Since: 2.11.0

Create a new logger with the specified name. You can configure a specific log level for a new logger using the log.modules configuration property.

Parameters:	name (`string`) – a logger name
Return:	a logger instance

Example

This example shows how to set the verbose level for module1 and the error level for module2 in a configuration file:

log:
  modules:
    module1: 'verbose'
    module2: 'error'
app:
  file: 'app.lua'

To create the module1 and module2 loggers in your application (app.lua), call the new() function:

-- Creates new loggers --
module1_log = require('log').new('module1')
module2_log = require('log').new('module2')

Then, you can call functions corresponding to different logging levels to make sure that events with severities above or equal to the given levels are shown:

-- Prints 'info' messages --
module1_log.info('Info message from module1')
--[[
[16300] main/103/interactive/module1 I> Info message from module1
---
...
--]]

-- Swallows 'debug' messages --
module1_log.debug('Debug message from module1')
--[[
---
...
--]]

-- Swallows 'info' messages --
module2_log.info('Info message from module2')
--[[
---
...
--]]

At the same time, the events with severities below the specified levels are swallowed.

Example on GitHub: log_new_modules.

Module merger

Overview

The merger module takes a stream of tuples and provides access to them as tables.

Index

The four functions for creating a merger object instance are:

merger.new_tuple_source(),
merger.new_buffer_source(),
merger.new_table_source(),
merger.new(merger_source…).

The methods for using a merger object are:

merger_object:select(),
merger_object:pairs().

merger.new_tuple_source(gen, param, state)¶

Create a new merger instance from a tuple source.

A tuple source just returns one tuple.

The generator function gen() allows creation of multiple tuples via an iterator.

The gen() function should return:

state, tuple each time it is called and a new tuple is available,
nil when no more tuples are available.

Parameters:	gen – function for iteratively returning tuples; param – parameter for the gen function
Return:	merger-object a merger object

Example: see merger_object:pairs() method.
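
A hand-written generator may make the gen() contract easier to see. The sketch below assumes it runs inside Tarantool and that gen() is called as gen(param, state), with the first returned value passed back as the next state; the `tuples` list is invented for illustration.

```lua
local merger = require('merger')

local tuples = {box.tuple.new({100}), box.tuple.new({200})}

-- state is the index of the last tuple that was returned
local function gen(param, state)
    local i = state + 1
    if param[i] == nil then
        return nil -- no more tuples
    end
    return i, param[i]
end

local source = merger.new_tuple_source(gen, tuples, 0)
local result = source:pairs():totable()
print(#result)
```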

merger.new_buffer_source(gen, param, state)¶

Create a new merger instance from a buffer source.

Parameters and return: same as for merger.new_tuple_source.

To set up a buffer, or a series of buffers, use the buffer module.

merger.new_table_source(gen, param, state)¶

Create a new merger instance from a table source.

Parameters and return: same as for merger.new_tuple_source.

Example: see merger_object:select() method.

merger.new(key_def, sources, options)¶

Create a new merger instance from a merger source.

A merger source is created from a key_def object and a set of (tuple or buffer or table or merger) sources. It performs a kind of merge sort. It chooses a source with a minimal / maximal tuple on each step, consumes a tuple from this source, and repeats.

Parameters:	key_def – object created with `key_def`; source – parameter for the `gen()` function; options – `reverse=true` if descending, false or nil if ascending
Return:	merger-object a merger object

A key_def can be cached across requests with the same ordering rules (typically these would be requests accessing the same space).

Example: see merger_object:pairs() method.
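
The merge step can also be tried with two small in-memory sources. This is a sketch that assumes it runs inside Tarantool; it relies on `merger.new_source_fromtable()`, which this manual uses in the merger_object:pairs() example, and the sample data is invented.

```lua
local merger = require('merger')
local key_def_lib = require('key_def')

local kd = key_def_lib.new({{fieldno = 1, type = 'unsigned'}})

-- Each source is expected to be already ordered by the key
local odd = merger.new_source_fromtable({{1}, {3}, {5}})
local even = merger.new_source_fromtable({{2}, {4}, {6}})

-- Ascending merge: options are omitted, so reverse is nil
local merged = merger.new(kd, {odd, even}):select()
for _, tuple in ipairs(merged) do
    print(tuple[1])
end
```

The merger only interleaves its inputs, so a source that is not ordered by the key would produce an unordered result.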

object merger_object¶

A merger object is an object returned by:

merger.new_tuple_source() or
merger.new_buffer_source() or
merger.new_table_source() or
merger.new(merger_source…).

It has methods:

merger_object:select() or
merger_object:pairs().

merger_object:select([buffer[, limit]])¶

Access the contents of a merger object with familiar select syntax.

Parameters:	buffer – as in `net.box` client conn:select method; limit – as in `net.box` client conn:select method
Return:	a table of tuples, similar to what `select` would return

Example with new_table_source():

-- Source via new_table_source, simple generator function
-- tarantool> s:select()
-- ---
-- - - [100]
--   - [200]
-- ...
merger=require('merger')
k=0
function merger_function(param)
  k = k + 1
  if param[k] == nil then return nil end
  return box.NULL, param[k]
end
chunks = {}
chunks[1] = {{100}}
chunks[2] = {{200}}
chunks[3] = nil
s = merger.new_table_source(merger_function, chunks)
s:select()

merger_object:pairs()¶

The pairs() method (or the equivalent ipairs() alias method) returns a luafun iterator. It is a Lua iterator, but also provides a set of handy methods to operate in functional style.

Parameters:	tuple (`table`) – tuple or Lua table with field contents
Return:	the tuples that can be found with a standard `pairs()` function

Example with new_tuple_source():

-- Source via new_tuple_source, from a space of tables
-- The result will look like this:
-- tarantool> so:pairs():totable()
-- ---
-- - - [100]
--   - [200]
-- ...
merger = require('merger')
box.schema.space.create('s')
box.space.s:create_index('i')
box.space.s:insert({100})
box.space.s:insert({200})
so = merger.new_tuple_source(box.space.s:pairs())
so:pairs():totable()

Example with two mergers:

-- Source via key_def, and table data

-- Create the key_def object
merger = require('merger')
key_def_lib = require('key_def')
key_def = key_def_lib.new({{
    fieldno = 1,
    type = 'string',
}})
-- Create the table source
data = {{'a'}, {'b'}, {'c'}}
source = merger.new_source_fromtable(data)
i1 = merger.new(key_def, {source}):pairs()
i2 = merger.new(key_def, {source}):pairs()
-- t1 will be 'a' (tuple 1 from merger 1)
t1 = i1:head():totable()
-- t3 will be 'c' (tuple 3 from merger 2)
t3 = i2:head():totable()
-- t2 will be 'b' (tuple 2 from merger 1)
t2 = i1:head():totable()
-- i1:is_null() will be true (merger 1 ends)
i1:is_null()
-- i2:is_null() will be true (merger 2 ends)
i2:is_null()

More examples:

See https://github.com/Totktonada/tarantool-merger-examples which, in addition to discussing the merger API in detail, shows Lua code for handling many more situations than are in this manual’s brief examples.

Module metrics

Since: 2.11.1

The metrics module provides the ability to collect and expose Tarantool metrics.

Note

If you use a Tarantool version below 2.11.1, it is necessary to install the latest version of metrics first. For Tarantool 2.11.1 and above, you can also use the external metrics module. In this case, the external metrics module takes priority over the built-in one.

Overview

Collectors

Tarantool provides the following metric collectors:

counter
gauge
histogram
summary

A collector is a representation of one or more observations that change over time.

counter

A counter is a cumulative metric that denotes a single monotonically increasing counter. Its value might only increase or be reset to zero on restart. For example, you can use the counter to represent the number of requests served, tasks completed, or errors.

The design is based on the Prometheus counter.

gauge

A gauge is a metric that denotes a single numerical value that can arbitrarily increase and decrease.

The gauge type is typically used for measured values like temperature or current memory usage. It could also be used for values that can increase or decrease, such as the number of concurrent requests.

The design is based on the Prometheus gauge.

histogram

A histogram metric is used to collect and analyze statistical data about the distribution of values within the application. Unlike metrics that track the average value or quantity of events, a histogram provides detailed visibility into the distribution of values and can uncover hidden dependencies.

The design is based on the Prometheus histogram.
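
In the Prometheus model, histogram buckets are cumulative: each bucket counts the observations that are less than or equal to its upper bound. The plain-Lua sketch below illustrates only that idea; `bucketize` is a hypothetical helper and not part of the metrics API.

```lua
-- Hypothetical helper: counts values into cumulative buckets and also
-- returns the total count and sum, as a histogram would expose them
local function bucketize(bounds, values)
    local counts = {}
    for i = 1, #bounds do
        counts[i] = 0
    end
    local sum = 0
    for _, v in ipairs(values) do
        sum = sum + v
        for i, upper in ipairs(bounds) do
            if v <= upper then
                counts[i] = counts[i] + 1
            end
        end
    end
    return counts, #values, sum
end

local counts, count, sum = bucketize({0.1, 0.5, 1}, {0.05, 0.3, 0.7, 2})
print(table.concat(counts, ','), count)  -- buckets 1,2,3 and 4 observations
```

The value 2 exceeds every bound, so it appears only in the total count and sum.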

summary

A summary metric is used to collect statistical data about the distribution of values within the application.

Each summary provides several measurements:

total count of measurements
sum of measured values
values at specific quantiles

Similar to histograms, the summary also operates with value ranges. However, unlike histograms, it uses quantiles (defined by a number between 0 and 1) for this purpose. In this case, it is not required to define fixed boundaries. For summary type, the ranges depend on the measured values and the number of measurements.

The design is based on the Prometheus summary.

Labels

A label is a piece of metainfo that you associate with a metric in the key-value format. For details, see labels in Prometheus and tags in Graphite.

Labels are used to differentiate between the characteristics of a thing being measured. For example, in a metric associated with the total number of HTTP requests, you can represent methods and statuses as label pairs:

http_requests_total_counter:inc(1, { method = 'POST', status = '200' })

The example above allows extracting the following time series:

The total number of requests over time with method = "POST" (and any status).
The total number of requests over time with status = 500 (and any method).

Configuring metrics

To configure metrics, use metrics.cfg(). This function can be used to turn on or off the specified metrics or to configure labels applied to all collectors. Moreover, you can use the following shortcut functions to set up metrics or labels:

metrics.enable_default_metrics()
metrics.set_global_labels()

Note

Starting from version 3.0, metrics can be configured using a configuration file in the metrics section.

Custom metrics

Creating custom metrics

To create a custom metric, follow the steps below:

Create a metric

To create a new metric, you need to call a function corresponding to the desired collector type. For example, call metrics.counter() or metrics.gauge() to create a new counter or gauge, respectively. In the example below, a new counter is created:
```
local metrics = require('metrics')
local bands_replace_count = metrics.counter('bands_replace_count', 'The number of data operations')
```
This counter is intended to collect the number of data operations performed on the specified space.

In the next example, a gauge is created:
```
local metrics = require('metrics')
local bands_waste_size = metrics.gauge('bands_waste_size', 'The size of memory wasted due to internal fragmentation')
```

Observe a value

You can observe a value in two ways:

At the appropriate place, for example, in an API request handler or trigger. In the example below, the counter value is increased any time a data operation is performed on the bands space. To increase a counter value, counter_obj:inc() is called.

local metrics = require('metrics')
local bands_replace_count = metrics.counter('bands_replace_count', 'The number of data operations')
local trigger = require('trigger')
trigger.set(
        'box.space.bands.on_replace',
        'update_bands_replace_count_metric',
        function(_, _, _, request_type)
            bands_replace_count:inc(1, { request_type = request_type })
        end
)

At the time of requesting the data collected by metrics. In this case, you need to collect the required metric inside metrics.register_callback(). The example below shows how to use a gauge collector to measure the size of memory wasted due to internal fragmentation:
```
local metrics = require('metrics')
local bands_waste_size = metrics.gauge('bands_waste_size', 'The size of memory wasted due to internal fragmentation')
metrics.register_callback(function()
    bands_waste_size:set(box.space.bands:stat()['tuple']['memtx']['waste_size'])
end)
```
To set a gauge value, gauge_obj:set() is called.

You can find the full example on GitHub: metrics_collect_custom.

Possible limitations

The module allows you to add your own metrics, but there are some subtleties when working with specific tools.

When adding your custom metric, it’s important to ensure that the number of label value combinations is kept to a minimum. Otherwise, combinatorial explosion may happen in the timeseries database with metrics values stored. Examples of data labels:

Labels in Prometheus
Tags in InfluxDB

For example, if your company uses InfluxDB for metric collection, you can potentially disrupt the entire monitoring setup, both for your application and for all other systems within the company. As a result, monitoring data is likely to be lost.

Example:

local some_metric = metrics.counter('some', 'Some metric')

-- THIS IS POSSIBLE
local function on_value_update(instance_alias)
   some_metric:inc(1, { alias = instance_alias })
end

-- THIS IS NOT ALLOWED
local function on_value_update(customer_id)
   some_metric:inc(1, { customer_id = customer_id })
end

In the example, there are two versions of the function on_value_update. The top version labels the data with the cluster instance’s alias. Since there’s a relatively small number of nodes, using them as labels is feasible. In the second case, an identifier of a record is used. If there are many records, it’s recommended to avoid such situations.

The same principle applies to URLs. Using the entire URL with parameters is not recommended. Use a URL template or the name of the command instead.

In essence, when designing custom metrics and selecting labels or tags, it’s crucial to opt for a minimal set of values that can uniquely identify the data without introducing unnecessary complexity or potential conflicts with existing metrics and systems.
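
A rough estimate shows why label choice matters. The plain-Lua sketch below uses two hypothetical helpers that are not part of the metrics module: one computes an upper bound on the number of time series from the number of distinct values per label, the other turns a concrete URL into a template. The numbers are invented for illustration.

```lua
-- Hypothetical helper: upper bound on the number of time series one metric
-- can produce, given how many distinct values each label may take
local function series_upper_bound(distinct_values)
    local total = 1
    for _, n in pairs(distinct_values) do
        total = total * n
    end
    return total
end

-- Hypothetical helper: drop the query string and replace numeric path
-- segments with a placeholder, so a URL label stays low-cardinality
local function url_template(path)
    local without_query = path:gsub('%?.*$', '')
    return (without_query:gsub('/%d+', '/:id'))
end

print(series_upper_bound({alias = 5, method = 4, status = 6}))  -- 120
print(series_upper_bound({customer_id = 1000000, status = 6}))  -- 6000000
print(url_template('/customers/42/orders/7?page=2'))            -- /customers/:id/orders/:id
```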

Collecting HTTP metrics

The metrics module provides middleware for monitoring HTTP latency statistics for endpoints that are created using the http module. The latency collector observes both latency information and the number of invocations. The metrics collected by HTTP middleware are separated by a set of labels:

a route (path)
a method (method)
an HTTP status code (status)

For each route that you want to track, you must specify the middleware explicitly. The example below shows how to collect statistics for requests made to the /metrics/hello endpoint.

httpd = require('http.server').new('127.0.0.1', 8080)
local metrics = require('metrics')
metrics.http_middleware.configure_default_collector('summary')
httpd:route({
    method = 'GET',
    path = '/metrics/hello'
}, metrics.http_middleware.v1(
        function()
            return { status = 200,
                     headers = { ['content-type'] = 'text/plain' },
                     body = 'Hello from http_middleware!' }
        end))

httpd:start()

Note

The middleware does not cover the 404 errors.

Collecting metrics using plugins

The metrics module provides a set of plugins that let you collect metrics through a unified interface.

For example, you can obtain an HTTP response object containing metrics in the Prometheus format by calling the metrics.plugins.prometheus.collect_http() function:

local prometheus_plugin = require('metrics.plugins.prometheus')
local prometheus_metrics = prometheus_plugin.collect_http()

To expose the collected metrics, you can use the http module:

httpd = require('http.server').new('127.0.0.1', 8080)
httpd:route({
    method = 'GET',
    path = '/metrics/prometheus'
}, function()
    local prometheus_plugin = require('metrics.plugins.prometheus')
    local prometheus_metrics = prometheus_plugin.collect_http()
    return prometheus_metrics
end)
httpd:start()

Example on GitHub: metrics_plugins

Creating custom plugins

Use the following API to create custom plugins:

metrics.invoke_callbacks()
metrics.collectors()
collector_object

To create a plugin, you need to include the following in your main export function:

-- Invoke all callbacks registered via `metrics.register_callback(<callback-function>)`
metrics.invoke_callbacks()

-- Loop over collectors
for _, c in pairs(metrics.collectors()) do
    ...

    -- Loop over instant observations in the collector
    for _, obs in pairs(c:collect()) do
        -- Export observation `obs`
        ...
    end
end

See the source code of built-in plugins in the metrics GitHub repository.

API Reference

metrics API
metrics.cfg()	Entrypoint to set up the module
metrics.collect()	Collect observations from each collector
metrics.collectors()	List all collectors in the registry
metrics.counter()	Register a new counter
metrics.enable_default_metrics()	Same as `metrics.cfg{ include = include, exclude = exclude }`
metrics.gauge()	Register a new gauge
metrics.histogram()	Register a new histogram
metrics.invoke_callbacks()	Invoke all registered callbacks
metrics.register_callback()	Register a function named `callback`
metrics.set_global_labels()	Same as `metrics.cfg{ labels = label_pairs }`
metrics.summary()	Register a new summary
metrics.unregister_callback()	Unregister a function named `callback`
metrics.http_middleware API
metrics.http_middleware.build_default_collector()	Register and return a collector for the middleware
metrics.http_middleware.configure_default_collector()	Register a collector for the middleware and set it as default
metrics.http_middleware.get_default_collector()	Get the default collector
metrics.http_middleware.set_default_collector()	Set the default collector
metrics.http_middleware.v1()	Latency measuring wrap-up
Related objects
collector_object	A collector object
counter_obj	A counter object
gauge_obj	A gauge object
histogram_obj	A histogram object
registry	A metrics registry
summary_obj	A summary object

metrics API

metrics.cfg([config])

Entry point to set up the module.

Parameters:

config (table) –
module configuration options:
- cfg.include (string/table, default 'all'): 'all' enables all supported default metrics, 'none' disables all default metrics, and a table with the names of default metrics enables that specific set of metrics.
- cfg.exclude (table, default {}): a table containing the names of the default metrics that you want to disable. Has higher priority than cfg.include.
- cfg.labels (table, default {}): a table containing label names as string keys, label values as values. See also: Labels.

You can work with metrics.cfg as a table to read values, but you must call metrics.cfg{} as a function to update them.

Supported default metric names (for cfg.include and cfg.exclude tables):

all (metasection including all metrics)
network
operations
system
replicas
info
slab
runtime
memory
spaces
fibers
cpu
vinyl
memtx
luajit
clock
event_loop
config

See metrics reference for details. All metric collectors from the collection have metainfo.default = true.

cfg.labels are the global labels to be added to every observation.

Global labels are applied only to metric collection. They have no effect on how observations are stored.

Global labels can be changed on the fly.

label_pairs from observation objects have priority over global labels. If you pass label_pairs to an observation method with the same key as some global label, the method argument value will be used.

Note that both label names and values in label_pairs are treated as strings.
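The options above might be combined as in the following sketch. The label name `alias` and its value are placeholders, and the chosen metric groups are arbitrary.

```lua
local metrics = require('metrics')

-- Enable a subset of default metrics and attach a global label.
metrics.cfg{
    include = {'network', 'operations', 'memory', 'fibers'},
    exclude = {'fibers'},            -- exclude has priority over include
    labels = {alias = 'storage-1'},  -- placeholder label
}

-- Current values are read through plain table access.
print(metrics.cfg.labels.alias)
```

Updating a value later requires calling `metrics.cfg{...}` again; assigning to a field of `metrics.cfg` is not the supported way to change it.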

metrics.collect([opts])

Collect observations from each collector.

Parameters:	opts (`table`) – table of collect options:
- `invoke_callbacks` – if `true`, invoke_callbacks() is called before the actual collection.
- `default_only` – if `true`, observations contain only default metrics (`metainfo.default = true`).

metrics.collectors()

List all collectors in the registry. Designed to be used in exporters.

Return:	A list of created collectors (see collector_object).

See also: Creating custom metrics

metrics.enable_default_metrics([include, exclude]): Same as metrics.cfg{include=include, exclude=exclude}, but include={} is treated as include='all' for backward compatibility.

metrics.gauge(name[, help, metainfo])

Register a new gauge.

Parameters:	name (`string`) – collector name. Must be unique. help (`string`) – collector description. metainfo (`table`) – collector metainfo.
Return:	A gauge object (see gauge_obj).
Rtype:	gauge_obj

See also: Creating custom metrics

metrics.histogram(name[, help, buckets, metainfo])

Register a new histogram.

Parameters:	name (`string`) – collector name. Must be unique. help (`string`) – collector description. buckets (`table`) – histogram buckets (an array of sorted positive numbers). The infinity bucket (`INF`) is appended automatically. Default: `{.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF}`. metainfo (`table`) – collector metainfo.
Return:	A histogram object (see histogram_obj).
Rtype:	histogram_obj

See also: Creating custom metrics

Note

A histogram is basically a set of collectors:

name .. "_sum" – a counter holding the sum of added observations.
name .. "_count" – a counter holding the number of added observations.
name .. "_bucket" – a counter holding all bucket sizes under the label le (less or equal). To access a specific bucket – x (where x is a number), specify the value x for the label le.
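The following sketch shows how the three derived collectors might appear when a histogram is collected. The collector name and bucket bounds are made up for illustration, and the observation fields `metric_name`, `label_pairs`, and `value` are assumed from the exporter loop shown earlier.

```lua
local metrics = require('metrics')

-- Hypothetical collector name and bucket bounds, chosen for illustration.
local latency = metrics.histogram(
    'request_latency_seconds', 'Request latency', {0.1, 0.5, 1})

latency:observe(0.3)
latency:observe(0.7)

-- Each observation carries the derived collector name; bucket
-- observations additionally carry the `le` label.
for _, obs in pairs(latency:collect()) do
    print(obs.metric_name, obs.label_pairs.le, obs.value)
end
```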

metrics.invoke_callbacks()

Invoke all registered callbacks. Has to be called before each collect(). You can also use collect{invoke_callbacks = true} instead. If you’re using one of the default exporters, invoke_callbacks() will be called by the exporter.

See also: Creating custom metrics

Note

A summary (registered with metrics.summary()) represents a set of collectors:

name .. "_sum" – a counter holding the sum of added observations.
name .. "_count" – a counter holding the number of added observations.
name – holds all the observed quantiles under the label quantile. To access quantile x (where x is a number), specify the value x for the label quantile.

metrics.unregister_callback(callback)

Unregister a function named callback that is called right before metric collection on plugin export.

Parameters:	callback (`function`) – a function that takes no parameters.

Example:

local metrics = require('metrics')

local cpu_callback = function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end

metrics.register_callback(cpu_callback)

-- after a while, we don't need that callback function anymore

metrics.unregister_callback(cpu_callback)

metrics.http_middleware API

metrics.http_middleware.build_default_collector(type_name, name[, help])

Parameters:	type_name (`string`) – collector type: `histogram` or `summary`. The default is `histogram`. name (`string`) – collector name. The default is `http_server_request_latency`. help (`string`) – collector description. The default is `HTTP Server Request Latency`.
Return:	A collector object

Possible errors:

A collector with the same type and name already exists in the registry.

metrics.http_middleware.configure_default_collector(type_name, name, help)

Parameters:	type_name (`string`) – collector type: `histogram` or `summary`. The default is `histogram`. name (`string`) – collector name. The default is `http_server_request_latency`. help (`string`) – collector description. The default is `HTTP Server Request Latency`.

Possible errors:

A collector with the same type and name already exists in the registry.

metrics.http_middleware.get_default_collector()

Return the default collector. If the default collector hasn’t been set yet, register it (with default http_middleware.build_default_collector() parameters) and set it as default.

Return:	A collector object

metrics.http_middleware.set_default_collector(collector)

Set the default collector.

Parameters:	collector – middleware collector object

metrics.http_middleware.v1(handler, collector)

Latency-measuring wrapper for the HTTP ver. 1.x.x handler. Returns a wrapped handler.

Learn more in Collecting HTTP metrics.

Parameters:	handler (`function`) – handler function. collector – middleware collector object. If not set, the default collector is used (like in http_middleware.get_default_collector()).

Usage:

httpd:route(route, http_middleware.v1(request_handler, collector))

Related objects

object collector_object

A collector object.

metrics.plugins.prometheus

metrics.plugins.prometheus.collect_http()

Get an HTTP response object containing metrics in the Prometheus format.

Return:

a table containing the following fields:

status: set to 200
headers: response headers
body: metrics in the Prometheus format

Rtype:

table

Example

local prometheus_plugin = require('metrics.plugins.prometheus')
local prometheus_metrics = prometheus_plugin.collect_http()

Example on GitHub: metrics_plugins

metrics.plugins.graphite

metrics.plugins.graphite.init(options)

Send all metrics to a remote Graphite server. Exported metric names are formatted as follows: <prefix>.<metric_name>.

Parameters:	options (`table`) – possible options:
- `prefix` (string): metrics prefix (`'tarantool'` by default)
- `host` (string): Graphite server host (`'127.0.0.1'` by default)
- `port` (number): Graphite server port (`2003` by default)
- `send_interval` (number): metrics collection interval in seconds (`2` by default)

Example

local graphite_plugin = require('metrics.plugins.graphite')
graphite_plugin.init {
    prefix = 'tarantool',
    host = '127.0.0.1',
    port = 2003,
    send_interval = 1,
}

metrics.plugins.json

metrics.plugins.json.export()

Export metrics in the JSON format.

Return:	a string containing metrics in the JSON format
Rtype:	string

Important

The values can also be +-math.huge and math.huge * 0. In such case:

math.huge is serialized to "inf"
-math.huge is serialized to "-inf"
math.huge * 0 is serialized to "nan".

Example

local json_plugin = require('metrics.plugins.json')
local json_metrics = json_plugin.export()

Example on GitHub: metrics_plugins

Module msgpack

Overview

The msgpack module decodes raw MsgPack strings by converting them to Lua objects, and encodes Lua objects by converting them to raw MsgPack strings. Tarantool makes heavy internal use of MsgPack because tuples in Tarantool are stored as MsgPack arrays.

Besides, starting from version 2.10.0, the msgpack module enables creating a specific userdata Lua object – MsgPack object. The MsgPack object stores arbitrary MsgPack data, and can be created from any Lua object including another MsgPack object and from a raw MsgPack string. The MsgPack object has its own set of methods and iterators.

Note

MsgPack is short for MessagePack.
A “raw MsgPack string” is a byte array formatted according to the MsgPack specification including type bytes and sizes. The type bytes and sizes can be made displayable with string.hex(), or the raw MsgPack strings can be converted to Lua objects by using the msgpack module methods.

API Reference

Below is a list of msgpack members and related objects.

Members
msgpack.encode(lua_value)	Convert a Lua object to a raw MsgPack string
msgpack.encode(lua_value,ibuf)	Convert a Lua object to a raw MsgPack string in an ibuf
msgpack.decode(msgpack_string)	Convert a raw MsgPack string to a Lua object
msgpack.decode(C_style_string_pointer)	Convert a raw MsgPack string in an ibuf to a Lua object
msgpack.decode_unchecked(msgpack_string)	Convert a raw MsgPack string to a Lua object
msgpack.decode_unchecked(C_style_string_pointer)	Convert a raw MsgPack string to a Lua object
msgpack.decode_array_header(byte-array, size)	Call the MsgPuck’s `mp_decode_array` function and return the array size and a pointer to the first array component
msgpack.decode_map_header(byte-array, size)	Call the MsgPuck’s `mp_decode_map` function and return the map size and a pointer to the first map component
__serialize parameter	Output structure specification
msgpack.cfg()	Change MsgPack configuration settings
msgpack.NULL	Analog of Lua’s `nil`
msgpack.object(lua_value)	Create a MsgPack object from a Lua object
msgpack.object_from_raw(msgpack_string)	Create a MsgPack object from a raw MsgPack string
msgpack.object_from_raw(C_style_string_pointer, size)	Create a MsgPack object from a raw MsgPack string
msgpack.is_object(some_argument)	Check if an argument is a MsgPack object
Related objects
msgpack_object	A MsgPack object
iterator_object	A MsgPack iterator object

Members

msgpack.encode(lua_value)

Convert a Lua object to a raw MsgPack string.

Parameters:	lua_value – either a scalar value or a Lua table value.
Return:	the original contents formatted as a raw MsgPack string
Rtype:	raw MsgPack string
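A short sketch of the call, using `string.hex()` (the Tarantool string extension mentioned in the overview) to make the raw bytes readable:

```lua
local msgpack = require('msgpack')

-- Encode a small Lua table and show the raw bytes in hexadecimal.
local raw = msgpack.encode({1, 'a', true})
print(#raw)             -- 5
print(string.hex(raw))  -- 9301a161c3
```

Here 93 is "fixarray size 3", 01 is the integer 1, a1 61 is the one-character string 'a', and c3 is `true`.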

msgpack.encode(lua_value, ibuf)

Convert a Lua object to a raw MsgPack string in an ibuf, which is a buffer such as buffer.ibuf() creates. As with encode(lua_value), the result is a raw MsgPack string, but it goes to the ibuf output instead of being returned.

Parameters:	lua_value (`lua-object`) – either a scalar value or a Lua table value. ibuf (`buffer`) – (output parameter) where the result raw MsgPack string goes
Return:	number of bytes in the output
Rtype:	number

Example using buffer.ibuf() and ffi.string() and string.hex(): The result will be ‘91a161’ because 91 is the MessagePack encoding of “fixarray size 1”, a1 is the MessagePack encoding of “fixstr size 1”, and 61 is the UTF-8 encoding of ‘a’:

ibuf = require('buffer').ibuf()
msgpack_string_size = require('msgpack').encode({'a'}, ibuf)
msgpack_string = require('ffi').string(ibuf.rpos, msgpack_string_size)
string.hex(msgpack_string)

msgpack.decode(msgpack_string[, start_position])

Convert a raw MsgPack string to a Lua object.

Parameters:	msgpack_string (`string`) – a raw MsgPack string. start_position (`integer`) – where to start, minimum = 1, maximum = string length, default = 1.
Return:	two values. The first is the decoded contents of `msgpack_string` (if it is a valid raw MsgPack string) as a Lua object: usually a Lua table, otherwise a scalar value such as a string or a number. The second is “next_start_position”: if `decode` stops after parsing as far as byte N in `msgpack_string`, then “next_start_position” equals N + 1, and `decode(msgpack_string, next_start_position)` continues parsing from the byte that follows the one where the previous `decode` stopped. Normally `decode` parses all of `msgpack_string`, so “next_start_position” equals `string.len(msgpack_string)` + 1.
Rtype:	Lua object and number

Example: The result will be [‘a’] and 4:

msgpack_string = require('msgpack').encode({'a'})
require('msgpack').decode(msgpack_string, 1)
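The second return value makes it possible to walk a string that holds several MsgPack values back to back. A minimal sketch (the variable names are illustrative):

```lua
local msgpack = require('msgpack')

-- Two MsgPack values stored one after another in a single string.
local raw = msgpack.encode({'a'}) .. msgpack.encode(42)

local decoded = {}
local pos = 1
while pos <= #raw do
    local value
    -- decode() returns the value and the position of the next unread byte.
    value, pos = msgpack.decode(raw, pos)
    table.insert(decoded, value)
end

print(#decoded, decoded[1][1], decoded[2], pos)  -- 2  a  42  5
```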

msgpack.decode(C_style_string_pointer, size)

Convert a raw MsgPack string, whose address is supplied as a C-style string pointer such as the rpos pointer which is inside an ibuf such as buffer.ibuf() creates, to a Lua object. A C-style string pointer may be described as cdata<char *> or cdata<const char *>.

Parameters:	C_style_string_pointer (`buffer`) – a pointer to a raw MsgPack string. size (`integer`) – number of bytes in the raw MsgPack string
Return:	two values. The first is the decoded contents of the raw MsgPack string that C_style_string_pointer points to (if it is a valid raw MsgPack string) as a Lua object: usually a Lua table, otherwise a scalar value such as a string or a number. The second is returned_pointer, a C-style pointer to the byte after what was passed, so that C_style_string_pointer + size = returned_pointer
Rtype:	Lua object and C-style pointer to the byte after what was passed

Example using buffer.ibuf and pointer arithmetic: The result will be [‘a’] and 3 and true:

ibuf = require('buffer').ibuf()
msgpack_string_size = require('msgpack').encode({'a'}, ibuf)
a, b = require('msgpack').decode(ibuf.rpos, msgpack_string_size)
a, b - ibuf.rpos, msgpack_string_size == b - ibuf.rpos

msgpack.decode_unchecked(msgpack_string[, start_position]): Input and output are the same as for decode(string).

msgpack.decode_unchecked(C_style_string_pointer): Input and output are the same as for decode(C_style_string_pointer), except that size is not needed. Some checking is skipped, and decode_unchecked(C_style_string_pointer) can operate with string pointers to buffers which decode(C_style_string_pointer) cannot handle. For an example see the buffer module.

msgpack.decode_array_header(byte-array, size)

Call MsgPuck’s mp_decode_array function and return the array size and a pointer to the first array component. A subsequent call to msgpack.decode can decode the component instead of the whole array.

Parameters:	byte-array – a pointer to a raw MsgPack string. size – a number greater than or equal to the string’s length
Return:	the size of the array; a pointer to after the array header.

Example:

-- Example of decode_array_header
-- Suppose we have the raw data '\x93\x01\x02\x03'.
-- \x93 is MsgPack encoding for a header of a three-item array.
-- We want to skip it and decode the next three items.
msgpack = require('msgpack');
ffi = require('ffi');
x, y = msgpack.decode_array_header(ffi.cast('char*', '\x93\x01\x02\x03'), 4)
a = msgpack.decode(y, 1);
b = msgpack.decode(y + 1, 1);
c = msgpack.decode(y + 2, 1);
a, b, c
-- The result is: 1,2,3.

msgpack.decode_map_header(byte-array, size)

Call MsgPuck’s mp_decode_map function and return the map size and a pointer to the first map component. A subsequent call to msgpack.decode can decode the component instead of the whole map.

Parameters:	byte-array – a pointer to a raw MsgPack string. size – a number greater than or equal to the raw MsgPack string’s length
Return:	the size of the map; a pointer to after the map header.

Example:

-- Example of decode_map_header
-- Suppose we have the raw data '\x81\xa2\x41\x41\xc3'.
-- '\x81' is MsgPack encoding for a header of a one-item map.
-- We want to skip it and decode the next map item.
msgpack = require('msgpack');
ffi = require('ffi')
x, y = msgpack.decode_map_header(ffi.cast('char*', '\x81\xa2\x41\x41\xc3'), 5)
a = msgpack.decode(y, 3);
b = msgpack.decode(y + 3, 1)
x, a, b
-- The result is: 1,"AA", true.

__serialize parameter

The MsgPack output structure can be specified with the __serialize parameter:

‘seq’, ‘sequence’, ‘array’ – table encoded as an array
‘map’, ‘mapping’ – table encoded as a map
function – the meta-method called to unpack the serializable representation of table, cdata, or userdata objects

Serializing ‘A’ and ‘B’ with different __serialize values brings different results. To show this, here is a routine which encodes {'A','B'} both as an array and as a map, then displays each result in hexadecimal.

function hexdump(bytes)
    local result = ''
    for i = 1, #bytes do
        result = result .. string.format("%02x", string.byte(bytes, i)) .. ' '
    end
    return result
end

msgpack = require('msgpack')
m1 = msgpack.encode(setmetatable({'A', 'B'}, {
                             __serialize = "seq"
                          }))
m2 = msgpack.encode(setmetatable({'A', 'B'}, {
                             __serialize = "map"
                          }))
print('array encoding: ', hexdump(m1))
print('map encoding: ', hexdump(m2))

Result:

array encoding: 92 a1 41 a1 42
map encoding:   82 01 a1 41 02 a1 42

The MsgPack Specification page explains that the first encoding means:

fixarray(2), fixstr(1), "A", fixstr(1), "B"

and the second encoding means:

fixmap(2), key(1), fixstr(1), "A", key(2), fixstr(1), "B"
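The third form of __serialize, a function, can be sketched as follows. The `point` table and its fields are hypothetical; the function receives the object and returns the value that is encoded in its place.

```lua
local msgpack = require('msgpack')

-- A table whose __serialize function returns the representation to encode.
local point = setmetatable({x = 1, y = 2}, {
    __serialize = function(self)
        return {self.x, self.y}  -- encode as a two-element array
    end,
})

print(string.hex(msgpack.encode(point)))  -- 920102
```

Without the metatable, the same table would be encoded as a map with the string keys 'x' and 'y'.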

Here are examples for all the common types, with the Lua-table representation on the left, with the MsgPack format name and encoding on the right.

Common Types and MsgPack Encodings

{}	‘fixmap’ if metatable is ‘map’ = 80 otherwise ‘fixarray’ = 90
‘a’	‘fixstr’ = a1 61
false	‘false’ = c2
true	‘true’ = c3
127	‘positive fixint’ = 7f
65535	‘uint 16’ = cd ff ff
4294967295	‘uint 32’ = ce ff ff ff ff
nil	‘nil’ = c0
msgpack.NULL	same as nil
[0] = 5	‘fixmap(1)’ + ‘positive fixint’ (for the key) + ‘positive fixint’ (for the value) = 81 00 05
[0] = nil	‘fixmap(0)’ = 80 – nil is not stored when it is a missing map value
1.5	‘float 64’ = cb 3f f8 00 00 00 00 00 00
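Several rows of the table can be checked directly. The sketch below encodes each sample value and compares the hexadecimal output with the expected bytes from the table; the `samples` list is only an illustration.

```lua
local msgpack = require('msgpack')

-- {Lua value, expected encoding in hex}, taken from the table above.
local samples = {
    {'a',        'a161'},
    {false,      'c2'},
    {true,       'c3'},
    {127,        '7f'},
    {65535,      'cdffff'},
    {4294967295, 'ceffffffff'},
    {1.5,        'cb3ff8000000000000'},
}

for _, s in ipairs(samples) do
    local hex = string.hex(msgpack.encode(s[1]))
    print(tostring(s[1]), hex, hex == s[2])
end
```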

msgpack.cfg(table)

Change MsgPack configuration settings.

The values are all either integers or boolean true/false.

Option	Default	Use
`cfg.encode_max_depth`	128	The maximum recursion depth for encoding
`cfg.encode_deep_as_nil`	false	Specify whether to crop tables with nesting level deeper than `cfg.encode_max_depth`. Not-encoded fields are replaced with one null. If not set, too high nesting is considered an error.
`cfg.encode_invalid_numbers`	true	Specify whether to enable encoding of NaN and Inf numbers
`cfg.encode_load_metatables`	true	Specify whether the serializer will follow __serialize metatable field
`cfg.encode_use_tostring`	false	Specify whether to use `tostring()` for unknown types
`cfg.encode_invalid_as_nil`	false	Specify whether to use NULL for non-recognized types
`cfg.encode_sparse_convert`	true	Specify whether to handle excessively sparse arrays as maps. See detailed description below
`cfg.encode_sparse_ratio`	2	1/`encode_sparse_ratio` is the permissible percentage of missing values in a sparse array
`cfg.encode_sparse_safe`	10	A limit ensuring that small Lua arrays are always encoded as sparse arrays (instead of generating an error or encoding as a map)
`cfg.encode_error_as_ext`	true	Specify how error objects (box.error.new()) are encoded in the MsgPack format: if `true`, errors are encoded as the MP_ERROR MsgPack extension. If `false`, the encoding format depends on other configuration options (`encode_load_metatables`, `encode_use_tostring`, `encode_invalid_as_nil`).
`cfg.decode_invalid_numbers`	true	Specify whether to enable decoding of NaN and Inf numbers
`cfg.decode_save_metatables`	true	Specify whether to set metatables for all arrays and maps

Important

Sparse arrays features

During encoding, the MsgPack encoder tries to classify tables into one of four kinds:

map - at least one table index is not an unsigned integer
regular array - all array indexes are available
sparse array - at least one array index is missing
excessively sparse array - the number of values missing exceeds the configured ratio

An array is excessively sparse when all the following conditions are met:

encode_sparse_ratio > 0
max(table) > encode_sparse_safe
max(table) > count(table) * encode_sparse_ratio

The MsgPack encoder never considers an array to be excessively sparse when encode_sparse_ratio = 0. The encode_sparse_safe limit ensures that small Lua arrays are always encoded as sparse arrays. If encode_sparse_convert is set to true (the default listed in the table above), excessively sparse arrays are handled as maps. If it is set to false, attempting to encode an excessively sparse array generates an error.
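The three conditions can be written as a small predicate. This is a plain-Lua sketch of the rule as stated above, not the encoder's actual code; the function name and its arguments are hypothetical.

```lua
-- max   - the largest integer index in the table (max(table) above)
-- count - the number of values actually present (count(table) above)
local function is_excessively_sparse(max, count, ratio, safe)
    ratio = ratio or 2  -- default encode_sparse_ratio
    safe = safe or 10   -- default encode_sparse_safe
    return ratio > 0 and max > safe and max > count * ratio
end

print(is_excessively_sparse(5, 2))       -- false: 5 does not exceed encode_sparse_safe
print(is_excessively_sparse(100, 2))     -- true: 100 > 10 and 100 > 2 * 2
print(is_excessively_sparse(100, 60))    -- false: 100 does not exceed 60 * 2
print(is_excessively_sparse(100, 2, 0))  -- false: a ratio of 0 disables the check
```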

msgpack.cfg() example 1:

If msgpack.cfg.encode_invalid_numbers = true (the default), then NaN and Inf are legal values. If that is not desirable, then ensure that msgpack.encode() does not accept them, by saying msgpack.cfg{encode_invalid_numbers = false}, thus:

tarantool> msgpack = require('msgpack'); msgpack.cfg{encode_invalid_numbers = true}
---
...
tarantool> msgpack.decode(msgpack.encode{1, 0 / 0, 1 / 0, false})
---
- [1, -nan, inf, false]
- 22
...
tarantool> msgpack.cfg{encode_invalid_numbers = false}
---
...
tarantool> msgpack.decode(msgpack.encode{1, 0 / 0, 1 / 0, false})
---
- error: ... number must not be NaN or Inf'
...

msgpack.cfg() example 2:

To avoid generating errors on attempts to encode unknown data types such as userdata/cdata, you can use this code:

tarantool> httpc = require('http.client').new()
---
...

tarantool> msgpack.encode(httpc.curl)
---
- error: unsupported Lua type 'userdata'
...

tarantool> msgpack.cfg{encode_use_tostring = true}
---
...

tarantool> msgpack.encode(httpc.curl)
---
- !!binary tnVzZXJkYXRhOiAweDAxMDU5NDQ2Mzg=
...

Note

To achieve the same effect for only one call to msgpack.encode() (that is without changing the configuration permanently), you can use msgpack.new({encode_invalid_numbers = true}).encode({1, 2}).
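A sketch of that approach, assuming the global configuration still has its default values. The serializer returned by `msgpack.new()` carries its own settings, so the strict variant below rejects NaN while the global `msgpack.encode()` keeps accepting it.

```lua
local msgpack = require('msgpack')

-- A separate serializer with its own settings; the global one is unchanged.
local strict = msgpack.new({encode_invalid_numbers = false})

local ok_strict = pcall(strict.encode, {0 / 0})
local ok_global = pcall(msgpack.encode, {0 / 0})
print(ok_strict, ok_global)
```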

Similar configuration settings exist for JSON and YAML.

msgpack.NULL

A value comparable to Lua “nil” which may be useful as a placeholder in a tuple.

Example

tarantool> msgpack = require('msgpack')
---
...
tarantool> y = msgpack.encode({'a',1,'b',2})
---
...
tarantool> z = msgpack.decode(y)
---
...
tarantool> z[1], z[2], z[3], z[4]
---
- a
- 1
- b
- 2
...
tarantool> box.space.tester:insert{20, msgpack.NULL, 20}
---
- [20, null, 20]
...
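The difference from a plain `nil` shows up at the end of a table, where `nil` leaves no slot at all. A minimal sketch:

```lua
local msgpack = require('msgpack')

-- msgpack.NULL occupies a slot; a trailing nil does not.
local with_null = msgpack.encode({1, 2, msgpack.NULL})
local with_nil = msgpack.encode({1, 2, nil})

print(string.hex(with_null))  -- 930102c0
print(string.hex(with_nil))   -- 920102
```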

msgpack.object(lua_value)

Since: 2.10.0

Encode an arbitrary Lua object into the MsgPack format.

Parameters:	lua_value (`lua-object`) – a Lua object of any type.
Return:	encoded MsgPack data encapsulated in a MsgPack object.
Rtype:	userdata

Example:

local msgpack = require('msgpack')

-- Create a MsgPack object from a Lua object of any type
local mp_from_number = msgpack.object(123)
local mp_from_string = msgpack.object('hello world')
local mp_from_array = msgpack.object({ 10, 20, 30 })
local mp_from_table = msgpack.object({ band_name = 'The Beatles', year = 1960 })
local mp_from_tuple = msgpack.object(box.tuple.new{1, 'The Beatles', 1960})

msgpack.object_from_raw(msgpack_string)

Since: 2.10.0

Create a MsgPack object from a raw MsgPack string.

Parameters:	msgpack_string (`string`) – a raw MsgPack string.
Return:	a MsgPack object
Rtype:	userdata

Example:

local msgpack = require('msgpack')

-- Create a MsgPack object from a raw MsgPack string
local raw_mp_string = msgpack.encode({ 10, 20, 30 })
local mp_from_mp_string = msgpack.object_from_raw(raw_mp_string)

msgpack.object_from_raw(C_style_string_pointer, size)

Since: 2.10.0

Create a MsgPack object from a raw MsgPack string. The address of the MsgPack string is supplied as a C-style string pointer such as the rpos pointer inside an ibuf that the buffer.ibuf() creates. A C-style string pointer may be described as cdata<char *> or cdata<const char *>.

Parameters:	C_style_string_pointer (`buffer`) – a pointer to a raw MsgPack string. size (`integer`) – number of bytes in the raw MsgPack string.
Return:	a MsgPack object
Rtype:	userdata

Example:

local msgpack = require('msgpack')

-- Create a MsgPack object from a raw MsgPack string using buffer
local buffer = require('buffer')
local ibuf = buffer.ibuf()
msgpack.encode({ 10, 20, 30 }, ibuf)
local mp_from_mp_string_pt = msgpack.object_from_raw(ibuf.rpos, ibuf:size())

msgpack.is_object(some_argument)

Since: 2.10.0

Check if the given argument is a MsgPack object.

Parameters:	some_argument – any argument.
Return:	`true` if the argument is a MsgPack object; otherwise, `false`
Rtype:	boolean

Example:

local msgpack = require('msgpack')

local mp_from_string = msgpack.object('hello world')

-- Check if the given argument is a MsgPack object
local mp_is_object = msgpack.is_object(mp_from_string) -- Returns true
local string_is_object = msgpack.is_object('hello world') -- Returns false

Related objects

msgpack_object

object msgpack_object

A MsgPack object that stores arbitrary MsgPack data. To create a MsgPack object from a Lua object or string, use the following methods:

msgpack.object
msgpack.object_from_raw

If a MsgPack object stores an array, it can be inserted into a database space:

box.space.bands:insert(msgpack.object({1, 'The Beatles', 1960}))

msgpack_object:decode()

Since: 2.10.0

Decode MsgPack data in the MsgPack object.

Return:	a Lua object
Rtype:	Lua object

Example

local msgpack = require('msgpack')

local mp_from_number = msgpack.object(123)
local mp_from_string = msgpack.object('hello world')

-- Decode MsgPack data
local mp_number_decoded = mp_from_number:decode() -- Returns 123
local mp_string_decoded = mp_from_string:decode() -- Returns 'hello world'

msgpack_object:iterator()

Since: 2.10.0

Create an iterator over the MsgPack data.

Return:	an iterator object over the MsgPack data
Rtype:	userdata
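A typical pattern is to read the array header first and then decode exactly that many elements. The sketch below uses the iterator methods described under iterator_object; the variable names are illustrative.

```lua
local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30 })
local it = mp_array:iterator()

-- Read the header first, then decode exactly that many elements.
local sum = 0
for _ = 1, it:decode_array_header() do
    sum = sum + it:decode()
end
print(sum)  -- 60
```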

msgpack_object[key]

Since: 2.11.0

Get an element of the MsgPack array by the specified index key. You can also use the get(key) method to get an array element.

The index key used to get the array element might be one of the following:

if a MsgPack object is an array, the key is an integer value (starting with 1) that specifies the element index.
if a MsgPack object is an associative array, key is the string value that specifies the element key. In this case, you can also access the array element using dot notation (msgpack_object.<key>).

If the specified key is missing in the array, msgpack_object[key] returns nil.

Example

local msgpack = require('msgpack')

local mp_from_array = msgpack.object({ 10, 20, 30 })
local mp_from_table = msgpack.object({ band_name = 'The Beatles', year = 1960 })
local mp_from_tuple = msgpack.object(box.tuple.new{1, 'The Beatles', 1960})

-- Get MsgPack data by the specified index or key
local mp_array_get_by_index = mp_from_array[1] -- Returns 10
local mp_table_get_by_key = mp_from_table['band_name'] -- Returns 'The Beatles'
local mp_table_get_by_nonexistent_key = mp_from_table['rating'] -- Returns nil
local mp_tuple_get_by_index = mp_from_tuple[3] -- Returns 1960

Note

Note that if the key for an associative array coincides with any msgpack_object’s method name, for example, ‘iterator’, mp_from_table['iterator'] returns the iterator method function instead of a value corresponding to the ‘iterator’ key.

msgpack_object:get(key)

Since: 2.11.0

Get an element of the MsgPack array by the specified index key. You can also use the indexed notation (msgpack_object[key]) to get an array element.

Parameters:	key (`number/string`) – the index key used to get the array element, which might be one of the following:
- if a MsgPack object is an array, `key` is an integer value (starting with 1) that specifies the element index.
- if a MsgPack object is an associative array, `key` is the string value that specifies the element key.
Return:	an element of the MsgPack array. If the specified key is missing in the array, `get` returns `nil`.
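A brief sketch of `get()`, mirroring the indexed-notation example for `msgpack_object[key]` and reusing its sample data:

```lua
local msgpack = require('msgpack')

local mp_from_array = msgpack.object({ 10, 20, 30 })
local mp_from_table = msgpack.object({ band_name = 'The Beatles', year = 1960 })

print(mp_from_array:get(2))            -- 20
print(mp_from_table:get('band_name'))  -- The Beatles
print(mp_from_table:get('rating'))     -- nil
```

Unlike the indexed notation, `get()` is an explicit method call, which may be the clearer choice when a map key could coincide with a method name such as 'iterator'.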

iterator_object

object iterator_object

An iterator over a MsgPack array.

iterator_object:decode_array_header()

Since: 2.10.0

Decode a MsgPack array header under the iterator cursor and advance the cursor. After calling this function, the iterator points to the first element of the array or to the value following the array if the array is empty.

Return:	number of elements in the array
Rtype:	number

Possible errors: raise an error if the type of the value under the iterator cursor is not MP_ARRAY.

Example

local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30, 40 })
local mp_array_iterator = mp_array:iterator()

local size = mp_array_iterator:decode_array_header()  -- returns 4
local first = mp_array_iterator:decode()              -- returns 10
local second = mp_array_iterator:decode()             -- returns 20
mp_array_iterator:skip()                              -- returns none, skips 30
local fourth = mp_array_iterator:decode()             -- returns 40

iterator_object:decode_map_header()

Since: 2.10.0

Decode a MsgPack map header under the iterator cursor and advance the cursor. After calling this function, the iterator points to the first key stored in the map or to the value following the map if the map is empty.

Return:	number of key-value pairs in the map
Rtype:	number

Possible errors: raise an error if the type of the value under the iterator cursor is not MP_MAP.

Example

local msgpack = require('msgpack')

local mp_map = msgpack.object({ foo = 123 })
local mp_map_iterator = mp_map:iterator()

local size = mp_map_iterator:decode_map_header() -- returns 1
local first = mp_map_iterator:decode()           -- returns 'foo'
local second = mp_map_iterator:decode()          -- returns 123

iterator_object:decode()

Since: 2.10.0

Decode a MsgPack value under the iterator cursor and advance the cursor.

Return:	a Lua object corresponding to the MsgPack value
Rtype:	Lua object

Possible errors: raise a Lua error if there’s no data to decode.

Example

local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30, 40 })
local mp_array_iterator = mp_array:iterator()

local size = mp_array_iterator:decode_array_header()  -- returns 4
local first = mp_array_iterator:decode()              -- returns 10
local second = mp_array_iterator:decode()             -- returns 20
mp_array_iterator:skip()                              -- returns none, skips 30
local fourth = mp_array_iterator:decode()             -- returns 40

iterator_object:take()

Since: 2.10.0

Return a MsgPack value under the iterator cursor as a MsgPack object without decoding and advance the cursor. The method doesn’t copy MsgPack data. Instead, it takes a reference to the original object.

Possible errors: raise a Lua error if there’s no data to decode.

Example

local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30 })
local mp_array_iterator = mp_array:iterator()

local size = mp_array_iterator:decode_array_header()  -- returns 3
local first = mp_array_iterator:decode()              -- returns 10
mp_array_iterator:skip()                              -- returns none, skips 20
local mp_value_under_cursor = mp_array_iterator:take()
local third = mp_value_under_cursor:decode()          -- returns 30

iterator_object:take_array(count)

Since: 2.10.0

Copy the specified number of MsgPack values starting from the iterator’s cursor position to a new MsgPack array object and advance the cursor.

Parameters:	count (`number`) – the number of MsgPack values to copy
Return:	a new MsgPack object

Possible errors: raise a Lua error if there aren’t enough values to decode. In this case, the iterator’s cursor position doesn’t change.

Example

local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30, 40 })
local mp_array_iterator = mp_array:iterator()

local size = mp_array_iterator:decode_array_header()  -- returns 4
local first = mp_array_iterator:decode()              -- returns 10
local mp_array_new = mp_array_iterator:take_array(2)
local mp_array_new_decoded = mp_array_new:decode()    -- returns {20, 30}
local fourth = mp_array_iterator:decode()             -- returns 40

iterator_object:skip()¶

Since: 2.10.0

Advance the iterator cursor by skipping one MsgPack value under the cursor. Returns nothing.

Possible errors: raise a Lua error if there’s no data to skip.

Example

local msgpack = require('msgpack')

local mp_array = msgpack.object({ 10, 20, 30, 40 })
local mp_array_iterator = mp_array:iterator()

local size = mp_array_iterator:decode_array_header()  -- returns 4
local first = mp_array_iterator:decode()              -- returns 10
local second = mp_array_iterator:decode()             -- returns 20
mp_array_iterator:skip()                              -- returns none, skips 30
local fourth = mp_array_iterator:decode()             -- returns 40

Module net.box

The net.box module contains connectors to remote database systems. One variant is for connecting to MySQL or MariaDB or PostgreSQL (see SQL DBMS modules reference). The other variant, which is discussed in this section, is for connecting to Tarantool server instances via a network.

Connecting to a database using net.box

Examples on GitHub: sample_db, net_box

The tutorial shows how to use net.box to connect to a remote Tarantool instance, perform CRUD operations, and execute stored procedures. For more information about the net.box module API, check Module net.box.

Note

This tutorial shows how to make CRUD requests to a single-instance Tarantool database. To make requests to a sharded Tarantool cluster with the CRUD module, use its API for CRUD operations.

Sample database configuration

This section describes the configuration of a sample database that allows remote connections:

credentials:
  users:
    sampleuser:
      password: '123456'
      privileges:
      - permissions: [ read, write ]
        spaces: [ bands ]
      - permissions: [ execute ]
        functions: [ get_bands_older_than ]

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

app:
  file: 'myapp.lua'

The configuration contains one instance that listens for incoming requests on the 127.0.0.1:3301 address.
sampleuser has privileges to select and modify data in the bands space and execute the get_bands_older_than stored function. This user can be used to connect to the instance remotely.
myapp.lua defines the data model and a stored function.

The myapp.lua file looks as follows:

-- Create a space --
box.schema.space.create('bands')

-- Specify field names and types --
box.space.bands:format({
    { name = 'id', type = 'unsigned' },
    { name = 'band_name', type = 'string' },
    { name = 'year', type = 'unsigned' }
})

-- Create indexes --
box.space.bands:create_index('primary', { parts = { 'id' } })
box.space.bands:create_index('band', { parts = { 'band_name' } })
box.space.bands:create_index('year_band', { parts = { { 'year' }, { 'band_name' } } })

-- Create a stored function --
box.schema.func.create('get_bands_older_than', {
    body = [[
    function(year)
        return box.space.bands.index.year_band:select({ year }, { iterator = 'LT', limit = 10 })
    end
    ]]
})

You can find the full example on GitHub: sample_db.

Making net.box requests interactively

To try out net.box requests in the interactive console, start the sample_db application using tt start:

$ tt start sample_db

Then, use the tt run -i command to start an interactive console:

$ tt run -i
Tarantool 3.0.0-entrypoint-1144-geaff238d9
type 'help' for interactive help
tarantool>

In the console, you can create a net.box connection and try out data operations.

Creating a net.box connection

To load the net.box module, use the require() directive:

net_box = require('net.box')
--[[
---
...
]]

To create a connection, pass a database URI to the net_box.connect() method:

conn = net_box.connect('sampleuser:123456@127.0.0.1:3301')
--[[
---
...
]]

connection:ping() can be used to check the connection status:

conn:ping()
--[[
---
- true
...
]]

Then, you can get a space object and perform CRUD operations on it using conn.space.<space_name>.

Note

Learn more about performing data operations from the CRUD operation examples section.

Inserting data

In the example below, four tuples are inserted into the bands space:

conn.space.bands:insert({ 1, 'Roxette', 1986 })
--[[
---
- [1, 'Roxette', 1986]
...
]]
conn.space.bands:insert({ 2, 'Scorpions', 1965 })
--[[
---
- [2, 'Scorpions', 1965]
...
]]
conn.space.bands:insert({ 3, 'Ace of Base', 1987 })
--[[
---
- [3, 'Ace of Base', 1987]
...
]]
conn.space.bands:insert({ 4, 'The Beatles', 1960 })
--[[
---
- [4, 'The Beatles', 1960]
...
]]

Querying data

The example below shows how to get a tuple by the specified primary key value:

conn.space.bands:select({ 1 })
--[[
---
- - [1, 'Roxette', 1986]
...
]]

You can also get a tuple by the value of the specified index as follows:

conn.space.bands.index.band:select({ 'The Beatles' })
--[[
---
- - [4, 'The Beatles', 1960]
...
]]

Updating data

space_object.update() updates a tuple identified by the primary key. This method accepts a full key and an operation to execute:

conn.space.bands:update({ 2 }, { { '=', 'band_name', 'Pink Floyd' } })
--[[
---
- [2, 'Pink Floyd', 1965]
...
]]

space_object.upsert() updates an existing tuple or inserts a new one. In the example below, a new tuple is inserted:

conn.space.bands:upsert({ 5, 'The Rolling Stones', 1962 }, { { '=', 'band_name', 'The Doors' } })
--[[
---
...
]]

In this example, space_object.replace() is used to delete the existing tuple and insert a new one:

conn.space.bands:replace({ 1, 'Queen', 1970 })
--[[
---
- [1, 'Queen', 1970]
...
]]

Deleting data

The space_object.delete() call in the example below deletes a tuple whose primary key value is 5:

conn.space.bands:delete({ 5 })
--[[
---
- [5, 'The Rolling Stones', 1962]
...
]]

Executing stored procedures

To execute a stored procedure, use the connection:call() method:

conn:call('get_bands_older_than', { 1966 })
-- ---
-- - [[2, 'Pink Floyd', 1965], [4, 'The Beatles', 1960]]
-- ...

Closing the connection

The connection:close() method can be used to close the connection when it is no longer needed:

conn:close()
--[[
---
...
]]

Note

You can find the example with all the requests above on GitHub: net_box.

Overview

You can call the following methods:

require('net.box') – to get a net.box object (named net_box for examples in this section)
net_box.connect() – to connect and get a connection object (named conn for examples in this section)
other net.box routines, called as methods of conn (for example, conn:call()) – to execute requests on the remote database system
conn:close() – to disconnect

All net.box methods are fiber-safe, that is, it is safe to share and use the same connection object across multiple concurrent fibers. In fact, that is perhaps the best programming practice with Tarantool. When multiple fibers use the same connection, all requests are pipelined through the same network socket, but each fiber gets back a correct response. Reducing the number of active sockets lowers the overhead of system calls and increases the overall server performance. However, in some cases a single connection is not enough, for example, when it is necessary to prioritize requests or to use different authentication IDs.

Most net.box methods accept the last {options} argument, which can be:

{timeout=...}. For example, a method whose last argument is {timeout=1.5} will stop after 1.5 seconds on the local node, although this does not guarantee that execution will stop on the remote server node.
{buffer=...}. For an example, see the buffer module.
{is_async=...}. For example, a method whose last argument is {is_async=true} will not wait for the result of a request. See the is_async description.
{on_push=..., on_push_ctx=...}. For receiving out-of-band messages. See the box.session.push() description.
{return_raw=...} (since version 2.10.0). If set to true, net.box returns response data wrapped in a MsgPack object instead of decoding it to Lua. The default value is false. For an example, see option description below.
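
As a rough illustration of the options argument, the sketch below wraps conn:call() with a per-request timeout. The helper name call_with_timeout is hypothetical and not part of net.box. The sketch assumes that a failed or timed-out request raises a Lua error, which pcall() turns into a return value, and it keeps only the first value returned by the remote function.

```lua
-- Hypothetical helper (not part of net.box): call a remote function with a
-- per-request timeout and return nil plus the error instead of raising.
local function call_with_timeout(conn, func_name, args, seconds)
    -- The options table is always the last argument of a net.box request.
    local ok, res = pcall(conn.call, conn, func_name, args, {timeout = seconds})
    if not ok then
        return nil, res  -- res holds the error raised by the request
    end
    return res           -- first value returned by the remote function
end

-- Possible usage with a connection created by net_box.connect():
-- local bands, err = call_with_timeout(conn, 'get_bands_older_than', {1966}, 1.5)
```

As noted above, the timeout only limits how long the local node waits. It does not guarantee that execution stops on the remote server.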

The diagram below shows possible connection states and transitions:

On this diagram:

net_box.connect() method spawns a worker fiber, which will establish the connection and start the state machine.
The state machine goes to the initial state.
Authentication and schema upload. It is possible later on to re-enter the fetch_schema state from active to trigger schema reload.
The state changes to the graceful_shutdown state when the state machine receives a box.shutdown event from the remote host (see conn:on_shutdown()). Once all pending requests are completed, the state machine switches to the error (error_reconnect) state.
The transport goes to the error state in case of an error. It can happen, for example, if the server closed the connection. If the reconnect_after option is set, instead of the ‘error’ state, the transport goes to the error_reconnect state.
The conn:close() method sets the state to closed and kills the worker. If the transport is already in the error state, close() does nothing.

Index

Below is a list of all net.box functions.

Name	Use
net_box.connect() net_box.new() net_box.self	Create a connection
conn:ping()	Execute a PING command
conn:wait_connected()	Wait for a connection to be active or closed
conn:is_connected()	Check if a connection is active or closed
conn:wait_state()	Wait for a target state
conn:close()	Close a connection
conn.space.space-name:select{field-value}	Select one or more tuples
conn.space.space-name:get{field-value}	Select a tuple
conn.space.space-name:insert{field-value}	Insert a tuple
conn.space.space-name:replace{field-value}	Insert or replace a tuple
conn.space.space-name:update{field-value}	Update a tuple
conn.space.space-name:upsert{field-value}	Update or insert a tuple
conn.space.space-name:delete{field-value}	Delete a tuple
conn:eval()	Evaluate the expression in a string and execute it
conn:call()	Call a stored procedure
conn:watch()	Subscribe to events broadcast by a remote host
conn:on_connect()	Define a connect trigger
conn:on_disconnect()	Define a disconnect trigger
conn:on_shutdown()	Define a shutdown trigger
conn:on_schema_reload()	Define a trigger when schema is modified
conn:new_stream()	Create a stream
stream:begin()	Begin a stream transaction
stream:commit()	Commit a stream transaction
stream:rollback()	Rollback a stream transaction

net_box.connect(URI[, {option[s]}])¶

Create a new connection. The connection is established on demand, at the time of the first request. It can be re-established automatically after a disconnect (see reconnect_after option below). The returned conn object supports methods for making remote requests, such as select, update or delete.

Parameters:

URI – the URI of the target for the connection. The URI type may be string or table as for the uri.parse() function. The table form is used to set up connection parameters, see the URI page for details.
options –
the supported options are shown below:
- user/password: two options to connect to a remote host other than through URI. For example, instead of connect('username:userpassword@localhost:3301') you can write connect('localhost:3301', {user = 'username', password='userpassword'}).
- wait_connected: a connection timeout. By default, connect() blocks until the connection is established. If you specify wait_connected=false, connect() returns immediately. If you specify a number, connect() waits at most that many seconds before returning (for example, wait_connected=1.5 makes it wait at most 1.5 seconds).
  
  Note
  
  If reconnect_after is greater than zero, then wait_connected ignores transient failures. The wait completes once the connection is established or is closed explicitly.
- reconnect_after: the number of seconds to wait before reconnecting. The default value, as with the other connect options, is nil. If reconnect_after is greater than zero, then a net.box instance will attempt to reconnect if a connection is lost or a connection attempt fails. This makes transient network failures transparent to the application. Reconnection happens automatically in the background, so requests that initially fail because the connection dropped are transparently retried. The number of retries is unlimited, and reconnection attempts are made at the specified interval (for example, reconnect_after=5 means that reconnect attempts are made every 5 seconds). When a connection is explicitly closed or when the Lua garbage collector removes it, reconnect attempts stop.
- connect_timeout: a number of seconds to wait before returning “error: Connection timed out”.
- fetch_schema: a boolean option that controls fetching schema changes from the server. Default: true. If you don’t operate with remote spaces, for example, run only call or eval, set fetch_schema to false to avoid fetching schema changes which is not needed in this case.
  
  Important
  
  In connections with fetch_schema == false, remote spaces are unavailable and the on_schema_reload triggers don’t work.
- required_protocol_version: a minimum version of the IPROTO protocol supported by the server. If the version of the IPROTO protocol supported by the server is lower than specified, the connection will fail with an error message. With required_protocol_version = 1, all connections fail where the IPROTO protocol version is lower than 1.
- required_protocol_features: specified IPROTO protocol features supported by the server. You can specify one or more net.box features from the table below. If the server does not support the specified features, the connection will fail with an error message. With required_protocol_features = {'transactions'}, all connections fail where the server has transactions: false.

net.box feature	Use	IPROTO feature ID	IPROTO versions supporting the feature
`streams`	Requires streams support on the server	IPROTO_FEATURE_STREAMS	1 and newer
`transactions`	Requires transactions support on the server	IPROTO_FEATURE_TRANSACTIONS	1 and newer
`error_extension`	Requires support for MP_ERROR MsgPack extension on the server	IPROTO_FEATURE_ERROR_EXTENSION	2 and newer
`watchers`	Requires remote watchers support on the server	IPROTO_FEATURE_WATCHERS	3 and newer

To learn more about IPROTO features, see IPROTO_ID and the IPROTO_FEATURES key.

Return:	conn object
Rtype:	userdata

Examples:

net_box = require('net.box')

conn = net_box.connect('localhost:3301')
conn = net_box.connect('127.0.0.1:3302', {wait_connected = false})
conn = net_box.connect('127.0.0.1:3304', {required_protocol_version = 4, required_protocol_features = {'transactions', 'streams'}, })
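
The connection options can also be combined. The sketch below is one possible setup for a client that only runs call or eval and should survive transient network failures. The helper name connect_resilient and the chosen values are illustrative assumptions. The module is passed in as a parameter only to keep the sketch free of side effects.

```lua
-- Hypothetical helper: open a connection that reconnects in the background.
-- `net_box` is the module returned by require('net.box').
local function connect_resilient(net_box, uri)
    return net_box.connect(uri, {
        reconnect_after = 5,   -- retry every 5 seconds after a failure
        wait_connected = 1.5,  -- block for at most 1.5 seconds
        fetch_schema = false,  -- remote spaces are not needed for call/eval
    })
end

-- Possible usage inside Tarantool:
-- local conn = connect_resilient(require('net.box'), 'sampleuser:123456@127.0.0.1:3301')
```

With fetch_schema set to false, conn.space is unavailable, so this setup suits only connections that use conn:call() or conn:eval().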

net_box.new(URI[, {option[s]}])¶

new() is a synonym for connect(). It is retained for backward compatibility. For more information, see the description of net_box.connect().

object self¶

For a local Tarantool server, there is a pre-created always-established connection object named net_box.self. Its purpose is to make polymorphic use of the net_box API easier. Therefore conn = net_box.connect('localhost:3301') can be replaced by conn = net_box.self.

However, there are important differences between the embedded connection and a remote one:

With the embedded connection, requests which do not modify data do not yield. When using a remote connection, due to the implicit rules any request can yield, and the database state may have changed by the time it regains control.
All the options passed to a request (such as is_async, on_push, timeout) will be ignored.

object conn¶

conn:ping([options])¶

Execute a PING command.

Parameters:	options (`table`) – the supported option is `timeout=seconds`
Return:	true on success, false on error
Rtype:	boolean

Example:

net_box.self:ping({timeout = 0.5})

conn:wait_connected([timeout])¶

Wait for connection to be active or closed.

Parameters:	timeout (`number`) – in seconds
Return:	true when connected, false on failure.
Rtype:	boolean

Example:

net_box.self:wait_connected()

conn:is_connected()¶

Show whether connection is active or closed.

Return:	true if connected, false on failure.
Rtype:	boolean

Example:

net_box.self:is_connected()
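
The two calls above can be combined into a small guard that runs before a batch of requests. This is a minimal sketch. The helper name ensure_connected is hypothetical.

```lua
-- Hypothetical helper: return true if the connection is usable, waiting for
-- at most `timeout` seconds if it is not active yet.
local function ensure_connected(conn, timeout)
    if conn:is_connected() then
        return true
    end
    -- wait_connected() returns true when connected and false on failure.
    return conn:wait_connected(timeout)
end

-- Possible usage:
-- if not ensure_connected(conn, 1.5) then error('no connection') end
```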

conn:wait_state(state[s][, timeout])¶

Since: 1.7.2

Wait for a target state.

Parameters:
states (`string` or `table`) – a target state, or a table of target states
timeout (`number`) – in seconds
Return:	true when a target state is reached, false on timeout or connection closure
Rtype:	boolean

Examples:

-- wait infinitely for 'active' state:
conn:wait_state('active')

-- wait for 1.5 secs at most:
conn:wait_state('active', 1.5)

-- wait infinitely for either `active` or `fetch_schema` state:
conn:wait_state({active=true, fetch_schema=true})

conn:close()¶

Close a connection.

Connection objects are destroyed by the Lua garbage collector, just like any other objects in Lua, so an explicit destruction is not mandatory. However, since close() is a system call, it is good programming practice to close a connection explicitly when it is no longer needed, to avoid lengthy stalls of the garbage collector.

Example:

conn:close()

conn.space.<space-name>:select({field-value, ...} [, {options}])

conn.space.space-name:select({...}) is the remote-call equivalent of the local call box.space.space-name:select{...} (see details). For an additional option see Module buffer and skip-header.

Example:

conn.space.testspace:select({1,'B'}, {timeout=1})

Note

Due to the implicit yield rules a local box.space.space-name:select{...} does not yield, but a remote conn.space.space-name:select{...} call does yield, so global variables or database tuples data may change when a remote conn.space.space-name:select{...} occurs.

conn.space.<space-name>:get({field-value, ...} [, {options}])

conn.space.space-name:get(...) is the remote-call equivalent of the local call box.space.space-name:get(...) (see details).

Example:

conn.space.testspace:get({1})

conn.space.<space-name>:insert({field-value, ...} [, {options}])

conn.space.space-name:insert(...) is the remote-call equivalent of the local call box.space.space-name:insert(...) (see details). For an additional option see Module buffer and skip-header.

Example:

conn.space.testspace:insert({2,3,4,5}, {timeout=1.1})

conn.space.<space-name>:replace({field-value, ...} [, {options}])

conn.space.space-name:replace(...) is the remote-call equivalent of the local call box.space.space-name:replace(...) (see details). For an additional option see Module buffer and skip-header.

Example:

conn.space.testspace:replace({5,6,7,8})

conn.space.<space-name>:update({field-value, ...} [, {options}])

conn.space.space-name:update(...) is the remote-call equivalent of the local call box.space.space-name:update(...) (see details). For an additional option see Module buffer and skip-header.

Example:

conn.space.Q:update({1},{{'=',2,5}}, {timeout=0})

conn.space.<space-name>:upsert({field-value, ...} [, {options}])

conn.space.space-name:upsert(...) is the remote-call equivalent of the local call box.space.space-name:upsert(...) (see details). For an additional option see Module buffer and skip-header.

conn.space.<space-name>:delete({field-value, ...} [, {options}])

conn.space.space-name:delete(...) is the remote-call equivalent of the local call box.space.space-name:delete(...) (see details). For an additional option see Module buffer and skip-header.

conn:eval(Lua-string[, {arguments}[, {options}]])¶

conn:eval(Lua-string) evaluates and executes the expression in Lua-string, which may be any statement or series of statements. An execute privilege is required; if the user does not have it, an administrator may grant it with box.schema.user.grant(username, 'execute', 'universe').

To ensure that the return from conn:eval is whatever the Lua expression returns, begin the Lua-string with the word “return”.

Examples:

tarantool> --Lua-string
tarantool> conn:eval('function f5() return 5+5 end; return f5();')
---
- 10
...
tarantool> --Lua-string, {arguments}
tarantool> conn:eval('return ...', {1,2,{3,'x'}})
---
- 1
- 2
- [3, 'x']
...
tarantool> --Lua-string, {arguments}, {options}
tarantool> conn:eval('return {nil,5}', {}, {timeout=0.1})
---
- [null, 5]
...

conn:call(function-name[, {arguments}[, {options}]])¶

conn:call('func', {'1', '2', '3'}) is the remote-call equivalent of func('1', '2', '3'). That is, conn:call is a remote stored-procedure call. The return from conn:call is whatever the function returns.

Limitation: the called function cannot return a function. For example, if func2 is defined as function func2 () return func end, then conn:call('func2') will return “error: unsupported Lua type ‘function’”.

Examples:

tarantool> -- create 2 functions with conn:eval()
tarantool> conn:eval('function f1() return 5+5 end;')
tarantool> conn:eval('function f2(x,y) return x,y end;')
tarantool> -- call first function with no parameters and no options
tarantool> conn:call('f1')
---
- 10
...
tarantool> -- call second function with two parameters and one option
tarantool> conn:call('f2',{1,'B'},{timeout=99})
---
- 1
- B
...

Note

The results incoming from call() or eval() are decoded via msgpack.decode. When msgpack.cfg.decode_save_metatables is true, you may not change the result’s table metatable entries. See MsgPack for details.

conn:watch(key, func)¶

Subscribe to events broadcast by a remote host.

Parameters:
key (`string`) – a key name of an event to subscribe to
func (`function`) – a callback to invoke when the key value is updated
Return:	a watcher handle. The handle consists of one method – `unregister()`, which unregisters the watcher.

To read more about watchers, see the Functions for watchers section.

The method has the same syntax as the box.watch() function, which is used for subscribing to events locally.

Watchers survive reconnection (see the reconnect_after connection option). All registered watchers are automatically resubscribed when the connection is reestablished.

If a remote host supports watchers, the watchers key will be set in the connection peer_protocol_features. For details, check the net.box features table.
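
A client can check this key before subscribing. The sketch below assumes that peer_protocol_features is a table keyed by feature name, as the sentence above suggests. The helper name supports_watchers is hypothetical.

```lua
-- Hypothetical helper: report whether the remote host advertises watchers.
local function supports_watchers(conn)
    local features = conn.peer_protocol_features
    -- The field may be missing before the connection is established.
    return features ~= nil and features.watchers == true
end

-- Possible usage:
-- if supports_watchers(conn) then conn:watch('foo', on_foo_changed) end
```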

Example 1:

Server:

-- Broadcast value 42 for the 'foo' key.
box.broadcast('foo', 42)

Client:

local net_box = require('net.box')
local log = require('log')
conn = net_box.connect(URI)
-- Subscribe to updates of the 'foo' key.
w = conn:watch('foo', function(key, value)
    assert(key == 'foo')
    log.info("The 'foo' value is '%d'", value)
end)

If you don’t need the watcher anymore, you can unregister it using the command below:

w:unregister()

Example 2:

The net.box module provides the ability to monitor updates of a configuration stored in a Tarantool-based configuration storage by watching path or prefix changes. In the example below, conn:watch() is used to monitor updates of a configuration stored by the /myapp/config/all path:

net_box = require('net.box')
local conn = net_box.connect('127.0.0.1:4401')
local log = require('log')
conn:watch('config.storage:/myapp/config/all', function(key, value)
    log.info("Configuration stored by the '/myapp/config/all' key is changed")
end)

You can find the full example here: config_storage.

conn:request(... {is_async=...})¶

{is_async=true|false} is an option which is applicable for all net_box requests including conn:call, conn:eval, and the conn.space.space-name requests.

The default is is_async=false, meaning requests are synchronous for the fiber. The fiber is blocked, waiting until there is a reply to the request or until timeout expires. Before Tarantool version 1.10, the only way to make asynchronous requests was to put them in separate fibers.

The non-default is is_async=true, meaning requests are asynchronous for the fiber. The request causes a yield but there is no waiting. The immediate return is not the result of the request, instead it is an object that the calling program can use later to get the result of the request.

This immediately-returned object, which we’ll call “future”, has its own methods:

future:is_ready() which will return true when the result of the request is available,
future:result() to get the result of the request (returns the response or nil in case it’s not ready yet or there has been an error),
future:wait_result(timeout) to wait until the result of the request is available and then get it, or throw an error if there is no result after the timeout is exceeded,
future:discard() to abandon the object.

Typically a user would say future=request-name(...{is_async=true}), then either loop checking future:is_ready() until it is true and then say request_result=future:result(), or say request_result=future:wait_result(...). Alternatively the client could check for “out-of-band” messages from the server by calling pairs() in a loop – see box.session.push().

A user would say future:discard() to make a connection forget about the response – if a response for a discarded object is received then it will be ignored, so that the size of the requests table will be reduced and other requests will be faster.

Examples:

-- Insert a tuple asynchronously --
tarantool> future = conn.space.bands:insert({10, 'Queen', 1970}, {is_async=true})
---
...
tarantool> future:is_ready()
---
- true
...
tarantool> future:result()
---
- [10, 'Queen', 1970]
...

-- Iterate through a space with 10 records to get data in chunks of 3 records --
tarantool> while true do
               future = conn.space.bands:select({}, {limit=3, after=position, fetch_pos=true, is_async=true})
               result = future:wait_result()
               tuples = result[1]
               position = result[2]
               if position == nil then
                   break
               end
               print('Chunk size: '..#tuples)
           end
Chunk size: 3
Chunk size: 3
Chunk size: 3
Chunk size: 1
---
...

Typically {is_async=true} is used only if the load is large (more than 100,000 requests per second) and latency is large (more than 1 second), or when it is necessary to send multiple requests in parallel then collect responses (sometimes called a “map-reduce” scenario).
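
The parallel scenario mentioned above can be sketched as follows: all requests are sent first, and the responses are collected afterwards. The helper name call_all is hypothetical. Because pcall() is used around wait_result(), the sketch does not depend on whether a timeout is reported as a raised error or as a nil result. In both cases the slot is filled with false.

```lua
-- Hypothetical helper: send one asynchronous call per argument set, then
-- collect the responses in the same order.
local function call_all(conn, func_name, arg_sets, timeout)
    local futures = {}
    for i, args in ipairs(arg_sets) do
        -- Returns immediately with a future instead of the result.
        futures[i] = conn:call(func_name, args, {is_async = true})
    end
    local results = {}
    for i, future in ipairs(futures) do
        local ok, res = pcall(future.wait_result, future, timeout)
        -- A response is a table of returned values; false marks a failure.
        results[i] = (ok and res) or false
    end
    return results
end

-- Possible usage:
-- local results = call_all(conn, 'get_bands_older_than', {{1966}, {1990}}, 1)
```

All requests share one connection, so they are pipelined through the same socket instead of waiting for each other.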

Note

Although the final result of an async request is the same as the result of a sync request, it is structured differently: as a table, instead of as the unpacked values.

conn:request(... {return_raw=...})

{return_raw=true} is ignored for:

Methods that return nil: begin, commit, rollback, upsert, prepare.
index.count (returns number).

For execute, the option is applied only to data (rows). Metadata is decoded even if {return_raw=true}.

Example:

local c = require('net.box').connect(uri)
-- With return_raw = true, the response is a MsgPack object, not a Lua value
local mp = c:eval('return ...', {1, 2, 3}, {return_raw = true})
mp:decode() -- {1, 2, 3}

The option can be useful if you want to pass a response through without decoding or with partial decoding. The usage of MsgPack object can reduce pressure on the Lua garbage collector.
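
Partial decoding could look like the sketch below, which reads only the first returned value through the MsgPack object iterator. It assumes, as in the example above, that the raw response of eval is a MsgPack array of the returned values. The helper name first_returned_value is hypothetical.

```lua
-- Hypothetical helper: evaluate an expression remotely and decode only the
-- first returned value, leaving the rest of the response undecoded.
local function first_returned_value(conn, expr, args)
    local mp = conn:eval(expr, args, {return_raw = true})
    local it = mp:iterator()
    local count = it:decode_array_header()  -- number of returned values
    if count == 0 then
        return nil
    end
    return it:decode()
end

-- Possible usage:
-- local first = first_returned_value(conn, 'return ...', {10, 20, 30})
```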

conn:new_stream()¶

Create a stream.

Example:

-- Connect to a server and create a new stream
local conn = net_box.connect('localhost:3301')
local conn_space = conn.space.test
local stream = conn:new_stream()
local stream_space = stream.space.test

object stream¶

stream:begin([txn_isolation])¶

Begin a stream transaction. Instead of the direct method, you can also use the call, eval or execute methods with SQL transaction.

Parameters:	txn_isolation – transaction isolation level

stream:commit()¶

Commit a stream transaction. Instead of the direct method, you can also use the call, eval or execute methods with SQL transaction.

Examples:

-- Begin stream transaction
stream:begin()
-- In the previously created accounts space, modify fields 2 and 3 of the tuple identified by the primary key value test_1
stream.space.accounts:update(test_1, {{'-', 2, 370}, {'+', 3, 100}})
-- Commit stream transaction
stream:commit()

stream:rollback()¶

Rollback a stream transaction. Instead of the direct method, you can also use the call, eval or execute methods with SQL transaction.

Example:

-- Test rollback for memtx space
space:replace({1})
-- Select returns the tuple inserted above, because this select belongs to the stream transaction
space:select({})
stream:rollback()
-- Select returns an empty result: the stream transaction has been rolled back
space:select({})
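
The three stream methods are typically used together. The sketch below commits when every statement succeeds and rolls back otherwise. The helper name transfer, the accounts space, and the layout with a balance in field 2 are assumptions made for illustration.

```lua
-- Hypothetical helper: move `amount` between two tuples of an assumed
-- `accounts` space inside one stream transaction.
local function transfer(conn, from_id, to_id, amount)
    local stream = conn:new_stream()
    stream:begin()
    local ok, err = pcall(function()
        stream.space.accounts:update({from_id}, {{'-', 2, amount}})
        stream.space.accounts:update({to_id}, {{'+', 2, amount}})
    end)
    if ok then
        stream:commit()
    else
        stream:rollback()  -- undo the partial changes
    end
    return ok, err
end

-- Possible usage:
-- local ok, err = transfer(conn, 1, 2, 100)
```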

Triggers

With the net.box module, you can use the following triggers:

conn:on_connect([trigger-function[, old-trigger-function]])¶

Define a trigger for execution when a new connection is established, and authentication and schema fetch are completed due to an event such as net_box.connect.

If a trigger function issues net_box requests, they must be asynchronous ({is_async = true}). An attempt to wait for request completion with future:pairs() or future:wait_result() in the trigger function will result in an error.

If the trigger execution fails and an exception happens, the connection’s state changes to ‘error’. In this case, the connection is terminated, regardless of the reconnect_after option’s value. If reconnect_after is greater than zero, the trigger can be called as many times as reconnection happens.

Parameters:
trigger-function (`function`) – the trigger function. Takes the `conn` object as the first argument.
old-trigger-function (`function`) – an existing trigger function to replace with `trigger-function`
Return:	nil or function pointer

conn:on_disconnect([trigger-function[, old-trigger-function]])¶

Define a trigger for execution after a connection is closed. If the trigger function causes an error, the error is logged but otherwise is ignored. Execution stops after a connection is explicitly closed, or once the Lua garbage collector removes it.

Parameters:
trigger-function (`function`) – the trigger function. Takes the `conn` object as the first argument.
old-trigger-function (`function`) – an existing trigger function to replace with `trigger-function`
Return:	nil or function pointer
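
As an illustration, the connect and disconnect triggers can be paired to count how often a connection is re-established. This is a minimal sketch with a hypothetical helper name. The trigger functions issue no requests, so the asynchronous-request restriction does not apply to them.

```lua
-- Hypothetical helper: install triggers that count connection events.
local function count_connection_events(conn)
    local stats = { connects = 0, disconnects = 0 }
    conn:on_connect(function()
        stats.connects = stats.connects + 1
    end)
    conn:on_disconnect(function()
        stats.disconnects = stats.disconnects + 1
    end)
    return stats
end

-- Possible usage:
-- local stats = count_connection_events(conn)
```

The counters change only for events that happen after the triggers are installed.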

conn:on_shutdown([trigger-function[, old-trigger-function]])¶

Define a trigger for shutdown when a box.shutdown event is received.

The trigger starts in a new fiber. While the on_shutdown() trigger is running, the connection stays active. It means that the trigger callback is allowed to send new requests.

After the trigger returns, the net.box connection goes to the graceful_shutdown state (check the state diagram for details). In this state, no new requests are allowed. The connection waits for all pending requests to be completed.

Once all in-progress requests have been processed, the connection is closed. The state changes to error or error_reconnect (if the reconnect_after option is defined).

Servers that do not support the box.shutdown event or IPROTO_WATCH just close the connection abruptly. In this case, the on_shutdown() trigger is not executed.

Parameters:
trigger-function (`function`) – the trigger function. Takes the `conn` object as the first argument.
old-trigger-function (`function`) – an existing trigger function to replace with `trigger-function`
Return:	nil or function pointer

conn:on_schema_reload([trigger-function[, old-trigger-function]])¶

Define a trigger executed when some operation has been performed on the remote server after schema has been updated. So, if a server request fails due to a schema version mismatch error, schema reload is triggered.

Parameters:
trigger-function (`function`) – the trigger function. Takes the `conn` object as the first argument.
old-trigger-function (`function`) – an existing trigger function to replace with `trigger-function`
Return:	nil or function pointer

Note

If the parameters are (nil, old-trigger-function), then the old trigger is deleted.

If both parameters are omitted, then the response is a list of existing trigger functions.

Find the detailed information about triggers in the triggers section.

Module os

Overview

The os module contains the functions execute(), rename(), getenv(), remove(), date(), exit(), time(), clock(), tmpname(), environ(), setenv(), setlocale(), difftime(). Most of these functions are described in the Lua manual Chapter 22 The Operating System Library.

Index

Below is a list of all os functions.

Name	Use
os.execute()	Execute by passing to the shell
os.rename()	Rename a file or directory
os.getenv()	Get an environment variable
os.remove()	Remove a file or directory
os.date()	Get a formatted date
os.exit()	Exit the program
os.time()	Get the number of seconds since the epoch
os.clock()	Get the number of CPU seconds since the program start
os.tmpname()	Get the name of a temporary file
os.environ()	Get a table with all environment variables
os.setenv()	Set an environment variable
os.setlocale()	Change the locale
os.difftime()	Get the number of seconds between two times

os.execute(shell-command)¶

Execute by passing to the shell.

Parameters:	shell-command (`string`) – what to execute.

Example:

tarantool> os.execute('ls -l /usr')
total 200
drwxr-xr-x   2 root root 65536 Apr 22 15:49 bin
drwxr-xr-x  59 root root 20480 Apr 18 07:58 include
drwxr-xr-x 210 root root 65536 Apr 18 07:59 lib
drwxr-xr-x  12 root root  4096 Apr 22 15:49 local
drwxr-xr-x   2 root root 12288 Jan 31 09:50 sbin
---
...

os.rename(old-name, new-name)¶

Rename a file or directory.

Parameters:

old-name (string) – name of the existing file or directory
new-name (string) – changed name of the file or directory

Example:

tarantool> os.rename('local','foreign')
---
- null
- 'local: No such file or directory'
- 2
...

os.getenv(variable-name)¶

Get an environment variable.

Parameters:	variable-name (`string`) – environment variable name.

Example:

tarantool> os.getenv('PATH')
---
- /usr/local/sbin:/usr/local/bin:/usr/sbin
...

os.remove(name)¶

Remove a file or directory.

Parameters:	name (`string`) – name of the file or directory to remove.

Example:

tarantool> os.remove('file')
---
- true
...

os.date(format-string[, time-since-epoch])¶

Return a formatted date.

Parameters:

format-string (string) – formatting instructions
time-since-epoch (number) – number of seconds since 1970-01-01. If time-since-epoch is omitted, it is assumed to be the current time.

Example:

tarantool> os.date("%A %B %d")
---
- Sunday April 24
...

os.exit()¶

Exit the program. If this is done on a server instance, then the instance stops.

Example:

tarantool> os.exit()
user@user-shell:~/tarantool_sandbox$

os.time()¶

Return the number of seconds since the epoch.

Example:

tarantool> os.time()
---
- 1461516945
...

os.clock()¶

Return the number of CPU seconds since the program start.

Example:

tarantool> os.clock()
---
- 0.05
...

os.tmpname()¶

Return a name for a temporary file.

Example:

tarantool> os.tmpname()
---
- /tmp/lua_7SW1m2
...
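The file-related functions above can be combined into a short lifecycle: get a temporary name, create the file, rename it, and remove it. The sketch below is illustrative only; the '.renamed' suffix is an arbitrary choice for this example, and the failure branch relies on the nil, message, code return shape shown in the os.rename() example.

```lua
-- Get a temporary file name and create the file.
local old_name = os.tmpname()
local new_name = old_name .. '.renamed'
local f = assert(io.open(old_name, 'w'))
f:write('hello')
f:close()

-- Rename the file; on failure, nil plus a message and a code are returned.
local renamed, rename_err = os.rename(old_name, new_name)
if not renamed then
    print('rename failed: ' .. rename_err)
end

-- Remove the file, then try again to see the failure shape.
local removed = os.remove(new_name)
local again, msg, code = os.remove(new_name)
print(removed, again, msg, code)
```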

os.environ()¶

Return a table containing all environment variables.

Example:

tarantool> os.environ()['TERM']..os.environ()['SHELL']
---
- xterm/bin/bash
...

os.setenv(variable-name, variable-value)¶

Set an environment variable.

Example:

tarantool> os.setenv('VERSION','99')
---
-
...

os.setlocale([new-locale-string])¶

Change the locale. If new-locale-string is not specified, return the current locale.

Example:

tarantool> string.sub(os.setlocale(),1,20)
---
- LC_CTYPE=en_US.UTF-8
...

os.difftime(time1, time2)¶

Return the number of seconds between two times.

Example:

tarantool> os.difftime(os.time(), 0)
---
- 1486594859
...
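As a small illustration of how os.time(), os.difftime(), and os.date() fit together, the sketch below measures an interval between two timestamps and formats a timestamp. The 90-second offset is an arbitrary value chosen for the example; the leading '!' in the format string is the standard Lua way to request UTC.

```lua
-- Two timestamps, 90 seconds apart.
local started = os.time()
local finished = started + 90

-- difftime(time1, time2) returns time1 - time2 in seconds.
local elapsed = os.difftime(finished, started)
print(elapsed) -- 90

-- Format the epoch itself in UTC, and the start time in local time.
local epoch = os.date('!%Y-%m-%d %H:%M:%S', 0)
print(epoch) -- 1970-01-01 00:00:00
print(os.date('%Y-%m-%d %H:%M:%S', started))
```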

Module pickle

Index

Below is a list of all pickle functions.

Name	Use
pickle.pack()	Convert Lua variables to binary format
pickle.unpack()	Convert Lua variables back from binary format

pickle.pack(format, argument[, argument ...])¶

To use Tarantool binary protocol primitives from Lua, it’s necessary to convert Lua variables to binary format. The pickle.pack() helper function is prototyped after Perl pack.

Format specifiers

b, B	converts Lua scalar value to a 1-byte integer, and stores the integer in the resulting string
s, S	converts Lua scalar value to a 2-byte integer, and stores the integer in the resulting string, low byte first
i, I	converts Lua scalar value to a 4-byte integer, and stores the integer in the resulting string, low byte first
l, L	converts Lua scalar value to an 8-byte integer, and stores the integer in the resulting string, low byte first
n	converts Lua scalar value to a 2-byte integer, and stores the integer in the resulting string, big endian
N	converts Lua scalar value to a 4-byte integer, and stores the integer in the resulting string, big endian
q, Q	converts Lua scalar value to an 8-byte integer, and stores the integer in the resulting string, big endian
f	converts Lua scalar value to a 4-byte float, and stores the float in the resulting string
d	converts Lua scalar value to an 8-byte double, and stores the double in the resulting string
a, A	converts Lua scalar value to a sequence of bytes, and stores the sequence in the resulting string

Parameters:

format (string) – string containing format specifiers
argument(s) (scalar-value) – scalar values to be formatted

Return:	a binary string containing all arguments, packed according to the format specifiers.
Rtype:	string

A scalar value can be either a variable or a literal. Remember that large integers should be entered with tonumber64() or LL or ULL suffixes.

Possible errors:

Argument count does not match the format. Note: excess values are simply ignored.
Expected 8/16/32/64-bit int.
Unsupported pack format specifier.

Example:

tarantool> pickle = require('pickle')
---
...
tarantool> box.space.tester:insert{0, 'hello world'}
---
- [0, 'hello world']
...
tarantool> box.space.tester:update({0}, {{'=', 2, 'bye world'}})
---
- [0, 'bye world']
...
tarantool> box.space.tester:update({0}, {
         >   {'=', 2, pickle.pack('iiA', 0, 3, 'hello')}
         > })
---
- [0, "\0\0\0\0\x03\0\0\0hello"]
...
tarantool> box.space.tester:update({0}, {{'=', 2, 4}})
---
- [0, 4]
...
tarantool> box.space.tester:update({0}, {{'+', 2, 4}})
---
- [0, 8]
...
tarantool> box.space.tester:update({0}, {{'^', 2, 4}})
---
- [0, 12]
...

pickle.unpack(format, binary-string)¶

Counterpart to pickle.pack(). Warning: if format specifier ‘A’ is used, it must be the last item.

Parameters:

format (string) – string containing format specifiers
binary-string (string) – binary string to unpack, for example one produced by pickle.pack()

Return:	A list of strings or numbers.
Rtype:	table

Possible errors:

Too many bytes: unpacked X, total Y. X < Y.
Got X bytes (expected: Y+). X < Y.
Unsupported format specifier.

Example:

tarantool> pickle = require('pickle')
---
...
tarantool> tuple = box.space.tester:replace{0}
---
...
tarantool> string.len(tuple[1])
---
- 1
...
tarantool> pickle.unpack('b', tuple[1])
---
- 48
...
tarantool> pickle.unpack('bsi', pickle.pack('bsi', 255, 65535, 4294967295))
---
- 255
- 65535
- 4294967295
...
tarantool> pickle.unpack('ls', pickle.pack('ls', tonumber64('18446744073709551615'), 65535))
---
...
tarantool> num, num64, str = pickle.unpack('slA', pickle.pack('slA', 666,
         > tonumber64('666666666666666'), 'string'))
---
...
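To make the byte order of the specifiers concrete, the following sketch packs three values and inspects the result. It assumes a Tarantool runtime (the pickle module is built in); the values 513 and 1 are arbitrary and were chosen so that the low and high bytes differ. As in the example above, the unpacked values are received with a multiple assignment.

```lua
local pickle = require('pickle')

-- 's' stores 513 (0x0201) in 2 bytes, low byte first: 0x01 0x02.
-- 'N' stores 1 in 4 bytes, big endian: 0x00 0x00 0x00 0x01.
-- 'A' stores the string bytes as they are and must come last.
local packed = pickle.pack('sNA', 513, 1, 'abc')
print(#packed) -- 9 bytes: 2 + 4 + 3

-- Unpack with the same format string to get the values back.
local short, long, str = pickle.unpack('sNA', packed)
print(short, long, str)
```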

Module popen

Overview

Since version 2.4.1, Tarantool has the popen built-in module that supports execution of external programs. It is similar to Python’s subprocess module or Ruby’s Open3. However, Tarantool’s popen module does not have all the helpers that those languages provide; it provides only basic functions. Creating a popen object uses the vfork() system call, so the caller thread is blocked until execution of the child process begins.

The popen module provides two functions to create the popen object:

popen.shell which is similar to the libc popen() function
popen.new to create a popen object with more specific options

Either function returns a handle which we will call popen_handle or ph. With the handle one can execute methods.

Index

Below is a list of all popen functions and handle methods.

Name	Use
popen.shell()	Execute a shell command
popen.new()	Execute a child program in a new process
popen_handle:read()	Read data from a child peer
popen_handle:write()	Write a string to stdin stream of a child process
popen_handle:shutdown()	Close parent’s ends of std* fds
popen_handle:terminate()	Send SIGTERM signal to a child process
popen_handle:kill()	Send SIGKILL signal to a child process
popen_handle:signal()	Send signal to a child process
popen_handle:info()	Return information about the popen handle
popen_handle:wait()	Wait until a child process gets exited or signaled
popen_handle:close()	Close a popen handle
Module constants	Module constants
Handle fields	Handle fields

popen.shell(command[, mode])¶

Execute a shell command.

Parameters:

command (string) – a command to run, mandatory
mode (string) – communication mode, optional

Return:

(if success) a popen handle, which we will call popen_handle or ph

(if failure) nil, err

Possible errors: if a parameter is incorrect, the result is IllegalParams: incorrect type or value of a parameter. For other possible errors, see popen.new().

The possible mode values are:

'w' which enables popen_handle:write()
'r' which enables popen_handle:read()
'R' which enables popen_handle:read({stderr = true})
nil (the mode is omitted) which means inherit parent’s std* file descriptors

Several mode characters can be set together, for example 'rw', 'rRw'.

The shell function is just a shortcut for popen.new({command}, opts) with opts.shell, opts.setsid and opts.group_signal all set to true, and with opts.stdin, opts.stdout and opts.stderr set based on the mode parameter.

All std* streams are inherited from the parent by default unless it is changed using mode: 'r' for stdout, 'R' for stderr, or 'w' for stdin.

Example:

This is the equivalent of the sh -c date command. It starts a process, runs 'date', reads the output, and closes the popen object (ph).

local popen = require('popen')
-- Run the program and save its handle.
local ph = popen.shell('date', 'r')
-- Read program's output, strip trailing newline.
local date = ph:read():rstrip()
-- Free resources. The process is killed (but 'date'
-- exits itself anyway).
ph:close()
print(date)

Unix defines a text file as a sequence of lines. Each line is terminated by a newline (\n) symbol. The same convention is usually applied for text output of a command. So, when it is redirected to a file, the file will be correct.

However, internally an application usually operates on strings, which are not terminated by newline (for example literals for error messages). The newline is usually added just before a string is written for the outside world (stdout, console or log). That is why the example above contains rstrip().
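The mode characters can be combined to capture both output streams, and the nil, err return can be checked before the handle is used. The sketch below is a minimal illustration that assumes a POSIX sh is available; the command line itself is made up for the example.

```lua
local popen = require('popen')

-- 'r' pipes stdout, 'R' pipes stderr; stdin is inherited.
local ph, err = popen.shell('echo out; echo err >&2; exit 3', 'rR')
if ph == nil then
    error(err)
end

-- Read each stream separately and strip the trailing newline.
local out = ph:read():rstrip()
local errout = ph:read({stderr = true}):rstrip()

-- Wait for the process to finish and look at its status.
local status = ph:wait()
ph:close()
print(out, errout, status.state, status.exit_code)
```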

popen.new(argv[, opts])¶

Execute a child program in a new process.

Parameters:

argv (array) – an array containing the program to run and its command-line options, mandatory; an absolute path to the program is required when opts.shell is false (default)
opts (table) – table of options, optional

Return:

(if success) a popen handle, which we will call popen_handle or ph

(if failure) nil, err

Possible raised errors:

IllegalParams: incorrect type or value of a parameter
IllegalParams: group signal is set, while setsid is not

Possible error reasons when nil, err is returned:

SystemError: dup(), fcntl(), pipe(), vfork() or close() fails in the parent process
SystemError: (temporary restriction) the parent process has closed stdin, stdout or stderr

Possible opts items:

opts.stdin (action on STDIN_FILENO)
opts.stdout (action on STDOUT_FILENO)
opts.stderr (action on STDERR_FILENO)

The opts table file descriptor actions may be:

popen.opts.INHERIT (== 'inherit') [default] inherit the fd from the parent
popen.opts.DEVNULL (== 'devnull') open /dev/null on the fd
popen.opts.CLOSE (== 'close') close the fd
popen.opts.PIPE (== 'pipe') feed data from fd to parent, or from parent to fd, using a pipe

The opts table may contain an env table of environment variables to be used inside a process. Each opts.env item may be a key-value pair (key is a variable name, value is a variable value).

If opts.env is not set then the current environment is inherited.
If opts.env is an empty table, then the environment will be dropped.
If opts.env is set to a non-empty table, then the environment will be replaced.

The opts table may contain these boolean items:

Name	Default	Use
opts.shell	false	If true, then run a child process via `sh -c "${opts.argv}"`. If false, then call the executable directly.
opts.setsid	false	If true, then run the program in a new session. If false, then run the program in the Tarantool instance’s session and process group.
opts.close_fds	true	If true, then close all inherited fds from the parent. If false, then do not close all inherited fds from the parent.
opts.restore_signals	true	If true, then reset all signal actions modified in the parent’s process. If false, then inherit all signal actions modified in the parent’s process.
opts.group_signal	false	If true, then send signal to a child process group, if and only if `opts.setsid` is enabled. If false, then send signal to a child process only.
opts.keep_child	false	If true, then do not send SIGKILL to a child process (or to a process group if `opts.group_signal` is true). If false, then do send SIGKILL to a child process (or to a process group if `opts.group_signal` is true) at popen_handle:close() or when Lua GC collects the handle.

The returned ph handle provides a popen_handle:close() method for explicitly releasing all occupied resources (including the child process itself if opts.keep_child is not set). However, if the close() method is not called for a handle during its lifetime, the Lua GC will trigger the same freeing actions.

Since version 3.2.0, the inherit_fds option is added to the opts table. The option allows defining file descriptor numbers that should be left open in the child process if the close_fds flag is set to true.

Tarantool recommends using opts.setsid plus opts.group_signal if a child process may spawn its own children and if they should all be killed together.

A signal will not be sent if the child process is already dead. Otherwise we might kill another process that occupies the same PID later. This means that if the child process dies before its own children die, then the function will not send a signal to the process group even when opts.setsid and opts.group_signal are set.
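The recommendation about opts.setsid plus opts.group_signal can be illustrated with a shell command that spawns its own child. This is a sketch under assumptions: the command line is invented for the example, and the reported state is what one would typically expect when the shell keeps the default SIGTERM action.

```lua
local popen = require('popen')

-- The shell starts a background sleep and then a foreground sleep.
-- With setsid and group_signal, signals go to the whole process group.
local ph, err = popen.new({'sleep 100 & sleep 100'}, {
    shell = true,
    setsid = true,
    group_signal = true,
})
if ph == nil then
    error(err)
end

-- SIGTERM is delivered to the group, not only to the shell.
ph:terminate()
local status = ph:wait()
ph:close()
print(status.state) -- typically 'signaled'
```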

Use os.environ() to pass a copy of the current environment with several replacements (see example 2 below).

Example 1

This is the equivalent of the sh -c date command. It starts a process, runs ‘date’, reads the output, and closes the popen object (ph).

local popen = require('popen')

local ph = popen.new({'/bin/date'}, {
    stdout = popen.opts.PIPE,
})
local date = ph:read():rstrip()
ph:close()
print(date) -- e.g. Thu 16 Apr 2020 01:40:56 AM MSK

Example 2

Example 2 is quite similar to Example 1, but sets an environment variable and uses the shell builtin 'echo' to show it.

local popen = require('popen')
local env = os.environ()
env['FOO'] = 'bar'
local ph = popen.new({'echo "${FOO}"'}, {
    stdout = popen.opts.PIPE,
    shell = true,
    env = env,
})
local res = ph:read():rstrip()
ph:close()
print(res) -- bar

Example 3

Example 3 demonstrates how to capture a child’s stderr.

local popen = require('popen')
local ph = popen.new({'echo hello >&2'}, { -- !!
    stderr = popen.opts.PIPE,              -- !!
    shell = true,
})
local res = ph:read({stderr = true}):rstrip()
ph:close()
print(res) -- hello

Example 4

Example 4 demonstrates how to run a stream program (like grep, sed and so on), write to its stdin and read from its stdout.

The example assumes that input data are small enough to fit in a pipe buffer (typically 64 KiB, but this depends on the platform and its configuration).

If a process writes lengthy data, it will get stuck in popen_handle:write(). To handle this case: call popen_handle:read() in a loop in another fiber (start it before the first :write()).

If a process writes lengthy text to stderr, it may get stuck in write() because the stderr pipe buffer becomes full. To handle this case: read stderr in a separate fiber.

local popen = require('popen')

local function call_jq(input, filter)
    -- Start jq process, connect to stdin, stdout and stderr.
    local jq_argv = {'/usr/bin/jq', '-M', '--unbuffered', filter}
    local ph, err = popen.new(jq_argv, {
        stdin = popen.opts.PIPE,
        stdout = popen.opts.PIPE,
        stderr = popen.opts.PIPE,
    })
    if ph == nil then return nil, err end
    -- Write input data to child's stdin and send EOF.
    local ok, err = ph:write(input)
    if not ok then
        ph:close()
        return nil, err
    end
    ph:shutdown({stdin = true})
    -- Read everything until EOF.
    local chunks = {}
    while true do
        local chunk, err = ph:read()
        if chunk == nil then
            ph:close()
            return nil, err
        end
        if chunk == '' then break end -- EOF
        table.insert(chunks, chunk)
    end
    -- Read diagnostics from stderr if any.
    local err = ph:read({stderr = true})
    if err ~= '' then
        ph:close()
        return nil, err
    end
    -- Free resources, then glue all chunks and strip the trailing newline.
    ph:close()
    return table.concat(chunks):rstrip()
end
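For input that may exceed the pipe buffer, the advice above (read in another fiber that is started before the first write) could look roughly like this. The function name filter_large is hypothetical, and the usage line assumes that /bin/cat exists on the system; stderr is left inherited to keep the sketch short.

```lua
local fiber = require('fiber')
local popen = require('popen')

-- Hypothetical helper: feed `input` to a stream program and collect
-- its stdout while the write is still in progress.
local function filter_large(argv, input)
    local ph, err = popen.new(argv, {
        stdin = popen.opts.PIPE,
        stdout = popen.opts.PIPE,
    })
    if ph == nil then return nil, err end

    -- Start the reader before writing, so the child never blocks
    -- on a full stdout pipe.
    local chunks = {}
    local reader = fiber.new(function()
        while true do
            local chunk = ph:read()
            if chunk == nil or chunk == '' then break end -- error or EOF
            table.insert(chunks, chunk)
        end
    end)
    reader:set_joinable(true)

    -- write() yields until all data is written; then send EOF.
    local ok, write_err = ph:write(input)
    ph:shutdown({stdin = true})
    reader:join()
    ph:close()
    if not ok then return nil, write_err end
    return table.concat(chunks)
end

-- Usage, assuming /bin/cat is available:
local out = filter_large({'/bin/cat'}, string.rep('x', 1024 * 1024))
print(out and #out)
```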

popen handle methods

object popen_handle¶

popen_handle:read([opts])¶

Read data from a child peer.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()
opts (table) – options

Possible opts items:

opts.stdout (boolean, default true, if true then read from stdout)
opts.stderr (boolean, default false, if true then read from stderr)
opts.timeout (number, default 100 years, time quota in seconds)

In other words: by default read() reads from stdout, but reads from stderr if one sets opts.stderr to true. It is not legal to set both opts.stdout and opts.stderr to true.

Return:

(if success) string with read value, empty string if EOF

(if failure) nil, err

Possible errors

These errors are raised on incorrect parameters or when the fiber is cancelled:

IllegalParams: incorrect type or value of a parameter
IllegalParams: called on a closed handle
IllegalParams: opts.stdout and opts.stderr are both set
IllegalParams: a requested IO operation is not supported by the handle (stdout / stderr is not piped)
IllegalParams: attempt to operate on a closed file descriptor
FiberIsCancelled: cancelled by external code

nil, err is returned on the following failures:

SocketError: an IO error occurs at read()
TimedOut: exceeded the opts.timeout quota
LuajitError: (“not enough memory”): no memory space for the Lua string
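A read with a time quota might be handled as in the sketch below. The command and the 0.1 second quota are arbitrary values for the example, and checking err.type follows the convention used in the popen_handle:wait() timeout example in this section.

```lua
local popen = require('popen')

-- The child prints nothing for 5 seconds, so a short read must time out.
local ph = assert(popen.shell('sleep 5; echo done', 'r'))
local data, err = ph:read({timeout = 0.1})
if data == nil then
    -- A TimedOut error is expected here; other errors are possible too.
    print(err.type)
end

-- close() kills the child, so the script does not wait for the sleep.
ph:close()
```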

popen_handle:write(str[, opts])¶

Write string str to stdin stream of a child process.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()
str (string) – string to write
opts (table) – options

Return:

true on success; nil, err on failure

Rtype:

(if success) boolean = true

(if failure) nil, err

Possible opts items are: opts.timeout (number, default 100 years, time quota in seconds).

Possible raised errors are:

IllegalParams: incorrect type or value of a parameter
IllegalParams: called on a closed handle
IllegalParams: string length is greater than SSIZE_MAX
IllegalParams: a requested IO operation is not supported by the handle (stdin is not piped)
IllegalParams: attempt to operate on a closed file descriptor
FiberIsCancelled: cancelled by external code

Possible error reasons when nil, err is returned are:

SocketError: an IO error occurs at write()
TimedOut: exceeded opts.timeout quota

write() may yield forever if the child process does not read data from stdin and a pipe buffer becomes full. The size of this pipe buffer depends on the platform. Set opts.timeout when unsure.

When opts.timeout is not set, the write() blocks (yields the fiber) until all data is written or an error happens.

popen_handle:shutdown([opts])¶

Close parent’s ends of std* fds.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()
opts (table) – options

Return:	`true` on success, `false` on error
Rtype:	(if success) boolean = true

Possible opts items are:

opts.stdin (boolean) close parent’s end of stdin
opts.stdout (boolean) close parent’s end of stdout
opts.stderr (boolean) close parent’s end of stderr

We may use the term std* to mean any one of these items.

Possible raised errors are:

IllegalParams: an incorrect handle parameter
IllegalParams: called on a closed handle
IllegalParams: neither stdin, stdout nor stderr is chosen
IllegalParams: a requested IO operation is not supported by the handle (one of std* is not piped)

The main reason to use shutdown() is to send EOF to a child’s stdin. However the parent’s end of stdout / stderr may be closed too.

shutdown() does not fail on already closed fds (idempotence). However, it fails on an attempt to close the end of a pipe that never existed. In other words, only those std* options that were set to popen.opts.PIPE during handle creation may be used here (for popen.shell(): 'r' corresponds to stdout, 'R' to stderr and 'w' to stdin).

shutdown() does not close any fds on a failure: either all requested fds are closed or none of them.

Example:

local popen = require('popen')
local ph = popen.shell('sed s/foo/bar/', 'rw')
ph:write('lorem foo ipsum')
ph:shutdown({stdin = true})
local res = ph:read()
ph:close()
print(res) -- lorem bar ipsum

popen_handle:terminate()¶

Send SIGTERM signal to a child process.

Parameters:	ph (`handle`) – handle of a child process created with popen.new() or popen.shell()
Return:	see popen_handle:signal() for errors and return values

terminate() only sends a SIGTERM signal. It does not free any resources (such as popen handle memory and file descriptors).

popen_handle:kill()¶

Send SIGKILL signal to a child process.

Parameters:	ph (`handle`) – handle of a child process created with popen.new() or popen.shell()
Return:	see popen_handle:signal() for errors and return values

kill() only sends a SIGKILL signal. It does not free any resources (such as popen handle memory and file descriptors).

popen_handle:signal(signo)¶

Send signal to a child process.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()
signo (number) – signal to send

Return:

(if success) true (signal is sent)

(if failure) nil, err

Possible raised errors:

IllegalParams: an incorrect handle parameter
IllegalParams: called on a closed handle

Possible error values for nil, err:

SystemError: a process does not exist any more (this may also be returned for a zombie process or when all processes in a group are zombies, but see the note regarding Mac OS below)
SystemError: invalid signal number
SystemError: no permission to send a signal to a process or a process group (this is returned on Mac OS when a signal is sent to a process group whose group leader is a zombie, or when all processes in it are zombies; the details are uncertain. It may also appear for other reasons.)

If opts.setsid and opts.group_signal are set for the handle, the signal is sent to the process group rather than to the process. See popen.new() for details about group signaling. Warning: On Mac OS it is possible that a process in the group will not receive the signal, particularly if the process has just been forked (this may be due to a race condition).

Note: The module offers popen.signal.SIG* constants, because some signals have different numbers on different platforms.
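A short sketch of sending a signal by constant rather than by number is shown below. It assumes that /bin/sleep exists at that path and that the child keeps the default SIGTERM action.

```lua
local popen = require('popen')

-- Start a long-running child directly (no shell), so the absolute
-- path to the program is required.
local ph, err = popen.new({'/bin/sleep', '100'})
if ph == nil then
    error(err)
end

-- Use the module constant instead of a hard-coded signal number.
local sent, signal_err = ph:signal(popen.signal.SIGTERM)
if not sent then
    print('signal was not sent: ' .. tostring(signal_err))
end

local status = ph:wait()
ph:close()
print(status.state, status.signame)
```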

popen_handle:info()¶

Return information about the popen handle.

Parameters:	ph (`handle`) – handle of a child process created with popen.new() or popen.shell()
Return:	(if success) formatted result
Rtype:	res

Possible raised errors are:

IllegalParams: an incorrect handle parameter
IllegalParams: called on a closed handle

The result format is:

{
    pid = <number> or <nil>,
    command = <string>,
    opts = <table>,
    status = <table>,
    stdin = one-of(
        popen.stream.OPEN   (== 'open'),
        popen.stream.CLOSED (== 'closed'),
        nil,
    ),
    stdout = one-of(
        popen.stream.OPEN   (== 'open'),
        popen.stream.CLOSED (== 'closed'),
        nil,
    ),
    stderr = one-of(
        popen.stream.OPEN   (== 'open'),
        popen.stream.CLOSED (== 'closed'),
        nil,
    ),
}

pid is a process id of the process when it is alive, otherwise pid is nil.

command is a concatenation of space-separated arguments that were passed to execve(). Multiword arguments are quoted. Quotes inside arguments are not escaped.

opts is a table of handle options as in the popen.new() opts parameter. opts.env is not shown here, because the environment variables map is not stored in a handle.

status is a table that represents a process status in the following format:

{
    state = one-of(
        popen.state.ALIVE    (== 'alive'),
        popen.state.EXITED   (== 'exited'),
        popen.state.SIGNALED (== 'signaled'),
    ),
    -- Present when `state` is 'exited'.
    exit_code = <number>,
    -- Present when `state` is 'signaled'.
    signo = <number>,
    signame = <string>,
}

stdin, stdout, and stderr reflect the status of the parent’s end of a piped stream. If a stream is not piped, the field is not present (nil). If it is piped, the status may be either popen.stream.OPEN (== 'open') or popen.stream.CLOSED (== 'closed'). The status may be changed from 'open' to 'closed' by a popen_handle:shutdown({std… = true}) call.

Example 1

(on Tarantool console)

tarantool> require('popen').new({'/usr/bin/touch', '/tmp/foo'})
---
- command: /usr/bin/touch /tmp/foo
  status:
    state: alive
  opts:
    stdout: inherit
    stdin: inherit
    group_signal: false
    keep_child: false
    close_fds: true
    restore_signals: true
    shell: false
    setsid: false
    stderr: inherit
  pid: 9499
...

Example 2

(on Tarantool console)

tarantool> require('popen').shell('grep foo', 'wrR')
---
- stdout: open
  command: sh -c 'grep foo'
  stderr: open
  status:
    state: alive
  stdin: open
  opts:
    stdout: pipe
    stdin: pipe
    group_signal: true
    keep_child: false
    close_fds: true
    restore_signals: true
    shell: true
    setsid: true
    stderr: pipe
  pid: 10497
...
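The transition of a piped stream from 'open' to 'closed' can be observed through info(). The sketch below assumes that a cat program is available through the shell; it is an illustration, not a required usage pattern.

```lua
local popen = require('popen')

-- 'rw' pipes stdin and stdout; stderr is inherited, so it is not piped.
local ph = assert(popen.shell('cat', 'rw'))

local before = ph:info()
print(before.stdin, before.stdout, before.stderr) -- open, open, nil

-- Closing the parent's end of stdin changes only that stream's status.
ph:shutdown({stdin = true})
local after = ph:info()
print(after.stdin, after.stdout)

ph:close()
```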

popen_handle:wait()¶

Wait until a child process gets exited or signaled.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()
timeout (number) – since version 3.2.0. The parameter defines the period in seconds for the method to wait for a resolution. The default value is “infinity”.

Return:

(if success) formatted result

(if failure) nil, err

Rtype:

res

Possible raised errors:

IllegalParams: an incorrect handle parameter
IllegalParams: called on a closed handle
FiberIsCancelled: cancelled by external code

Possible error reasons when nil, err is returned are:

TimedOut: since version 3.2.0. The error means that the child process has neither exited nor been signaled before the defined timeout elapsed.
ChannelIsClosed: since version 3.2.0. The error is returned when the target popen handle is closed during the :wait() operation.

The formatted result is a process status table (the same as the status component of the table returned by popen_handle:info()).

Timeout parameter example

local ph = popen.new(<...>)
local res, err = ph:wait({timeout = 1})
if res == nil then
    -- Timeout is reached.
    assert(err.type == 'TimedOut')
    <...>
end

popen_handle:close()¶

Close a popen handle.

Parameters:

ph (handle) – handle of a child process created with popen.new() or popen.shell()

Return:

(if success) true

(if failure) nil, err

Possible raised errors are:

IllegalParams: an incorrect handle parameter

Possible diagnostics when nil, err is returned (do not consider them as errors):

SystemError: no permission to send a signal to a process or a process group (This diagnostic may appear due to Mac OS behavior on zombies when opts.group_signal is set, see popen_handle:signal(). It may appear for other reasons, details are unclear.)

The return is always true when a process is known to be dead (for example, after popen_handle:wait() no signal will be sent, so no ‘failure’ may appear).

close() kills a process using SIGKILL and releases all resources associated with the popen handle.

Details about signaling:

The signal is sent only when opts.keep_child is not set.
The signal is sent only when a process is alive according to the information available on current event loop iteration. (There is a gap here: a zombie may be signaled; it is harmless.)
The signal is sent to a process or a process group depending on opts.group_signal. (See popen.new() for details of group signaling).

Resources are released regardless of whether sending the signal succeeds: fds are closed, memory is released, the handle is marked as closed.

No operation is possible on a closed handle except close(), which is always successful on a closed handle (idempotence).

close() may return true or nil, err, but it always frees the handle resources. So any return value usually means success for a caller. The return values are purely informational: they are for logging or some kind of reporting.
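Since the return values of close() are informational, a caller might only log them, as in this sketch. The /bin/sleep path and the log message text are assumptions for the example.

```lua
local popen = require('popen')
local log = require('log')

local ph = assert(popen.new({'/bin/sleep', '100'}))

-- close() sends SIGKILL (keep_child is not set) and frees resources.
local ok, err = ph:close()
if not ok then
    -- Not an error for the caller: the handle is released anyway.
    log.warn('popen close diagnostic: %s', tostring(err))
end

-- Closing an already closed handle succeeds again.
local second = ph:close()
print(second) -- true
```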

Handle fields

popen_handle.pid
popen_handle.command
popen_handle.opts
popen_handle.status
popen_handle.stdin
popen_handle.stdout
popen_handle.stderr

See popen_handle:info() for details.

Module constants

- popen.opts
  - INHERIT (== 'inherit')
  - DEVNULL (== 'devnull')
  - CLOSE   (== 'close')
  - PIPE    (== 'pipe')

- popen.signal
  - SIGTERM (== 15)
  - SIGKILL (== 9)
  - ...

- popen.state
  - ALIVE    (== 'alive')
  - EXITED   (== 'exited')
  - SIGNALED (== 'signaled')

- popen.stream
  - OPEN    (== 'open')
  - CLOSED  (== 'closed')

Module socket

Overview

The socket module allows exchanging data via BSD sockets with a local or remote host in connection-oriented (TCP) or datagram-oriented (UDP) mode. Semantics of the calls in the socket API closely follow semantics of the corresponding POSIX calls.

The functions for setting up and connecting are socket, sysconnect, tcp_connect. The functions for sending data are send, sendto, write, syswrite. The functions for receiving data are recv, recvfrom, read. The functions for waiting before sending/receiving data are wait, readable, writable. The functions for setting flags are nonblock, setsockopt. The functions for stopping and disconnecting are shutdown, close. The functions for error checking are errno, error.

Index

Below is a list of all socket functions.

Name	Use
socket()	Create a socket
socket.tcp_connect()	Connect a socket to a remote host
socket.getaddrinfo()	Get information about a remote site
socket.tcp_server()	Make Tarantool act as a TCP server
socket.bind()	Bind a socket to the given host/port
socket_object:sysconnect()	Connect a socket to a remote host
socket_object:send() socket_object:write()	Send data over a connected socket
socket_object:syswrite()	Write data to the socket buffer if non-blocking
socket_object:recv()	Read from a connected socket
socket_object:sysread()	Read data from the socket buffer if non-blocking
socket_object:bind()	Bind a socket to the given host/port
socket_object:listen()	Start listening for incoming connections
socket_object:accept()	Accept a client connection + create a connected socket
socket_object:sendto()	Send a message on a UDP socket to a specified host
socket_object:recvfrom()	Receive a message on a UDP socket
socket_object:shutdown()	Shut down a reading end, a writing end, or both
socket_object:close()	Close a socket
socket_object:error() socket_object:errno()	Get information about the last error on a socket
socket_object:setsockopt()	Set socket flags
socket_object:getsockopt()	Get socket flags
socket_object:linger()	Set/clear the SO_LINGER flag
socket_object:nonblock()	Set/get the flag value
socket_object:readable()	Wait until something is readable
socket_object:writable()	Wait until something is writable
socket_object:wait()	Wait until something is either readable or writable
socket_object:name()	Get information about the connection’s near side
socket_object:peer()	Get information about the connection’s far side
socket.iowait()	Wait for read/write activity
LuaSocket wrapper functions	Several methods for emulating the LuaSocket API

Typically, a socket session begins with the setup functions, sets one or more flags, runs a loop with sending and receiving functions, and ends with the teardown functions, as an example at the end of this section will show. Throughout, there may be error-checking and waiting functions for synchronization. To prevent a fiber containing socket functions from blocking other fibers, the implicit yield rules cause a yield so that other fibers may take over, as is the norm for cooperative multitasking.

For all examples in this section the socket name will be sock and the function invocations will look like sock:function_name(...).
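The session outline above can be sketched as a short script. This is a minimal illustration rather than a recommended server design: it assumes it is run inside Tarantool, and the echo handler, the loopback address, and the 1-second timeouts are choices made only for this sketch.

```lua
local socket = require('socket')

-- Setup (server side): echo one line back to each client.
-- Port 0 asks the OS for a free port, so the sketch does not collide
-- with anything that is already listening.
local server = socket.tcp_server('127.0.0.1', 0, function(client)
    local line = client:read('\n', 1)
    if line ~= nil and line ~= '' then
        client:write(line)
    end
end)
assert(server ~= nil, 'tcp_server() failed')
local port = server:name().port

-- Setup (client side): connect with a 1-second timeout.
local sock, err = socket.tcp_connect('127.0.0.1', port, 1)
assert(sock ~= nil, err)

-- Send, then check the result, because failures do not raise exceptions.
local sent = sock:send('ping\n')
if sent == nil then
    print('send failed: ' .. tostring(sock:error()))
end

-- Receive: read up to and including the line feed, waiting at most 1 second.
local reply = sock:read('\n', 1)
print(reply)

-- Teardown.
sock:close()
server:close()
```

Each call that touches the network (connect, read) may yield, so other fibers, including the one running the echo handler, get a chance to run in between.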

socket.__call(domain, type, protocol)¶

Create a new TCP or UDP socket. The argument values are the same as in the Linux socket(2) man page.

Return:	an unconnected socket, or nil.
Rtype:	userdata

Example:

socket('AF_INET', 'SOCK_STREAM', 'tcp')

socket.tcp_connect(host[, port[, timeout]])¶

Connect a socket to a remote host.

Parameters:	host (`string`) – URL or IP address port (`number`) – port number timeout (`number`) – number of seconds to wait
Return:	(if error) {nil, error-message-string}. (if no error) a new socket object.
Rtype:	socket object, which may be viewed as a table

Example:

sock, e = socket.tcp_connect('127.0.0.1', 3301)
if sock == nil then print(e) end

socket.getaddrinfo(host, port[, timeout[, {option-list}]])¶

socket.getaddrinfo(host, port[, {option-list}])

The socket.getaddrinfo() function is useful for finding information about a remote site so that the correct arguments for sock:sysconnect() can be passed. This function may use the worker_pool_threads configuration parameter.

Parameters:

- host (`string`) – URL or IP address;
- port (`number/string`) – port number, as a number or as a string;
- timeout (`number`) – maximum number of seconds to wait;
- options (`table`) – any of the following fields:
  - `type` – preferred socket type;
  - `family` – desired address family for the returned addresses;
  - `protocol`;
  - `flags` – additional options (for the available values, see socket.internal.AI_FLAGS in the example below).

Return:	(if error) {nil, error-message-string}. (if no error) A table containing these fields: “host”, “family”, “type”, “protocol”, “port”.
Rtype:	table

Example:

tarantool> socket.getaddrinfo('tarantool.org', 'http')
---
- - host: 188.93.56.70
    family: AF_INET
    type: SOCK_STREAM
    protocol: tcp
    port: 80
  - host: 188.93.56.70
    family: AF_INET
    type: SOCK_DGRAM
    protocol: udp
    port: 80
...
-- To find the available values for the options use the following:
tarantool> socket.internal.AI_FLAGS -- or SO_TYPE, or DOMAIN
---
- AI_ALL: 256
  AI_PASSIVE: 1
  AI_NUMERICSERV: 4096
  AI_NUMERICHOST: 4
  AI_V4MAPPED: 2048
  AI_ADDRCONFIG: 1024
  AI_CANONNAME: 2
...
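As a sketch of how the result of socket.getaddrinfo() can feed sock:sysconnect(), the script below resolves a loopback address, creates a socket from the first returned entry, and connects it. It assumes it is run inside Tarantool; the throwaway listener and its greeting are made up for the illustration.

```lua
local socket = require('socket')

-- A throwaway listener on a free loopback port, so there is something to connect to.
local server = socket.tcp_server('127.0.0.1', 0, function(client)
    client:write('hello\n')
end)
assert(server ~= nil, 'tcp_server() failed')
local port = server:name().port

-- Ask only for stream (TCP) results; wait at most 1 second.
local addrs, err = socket.getaddrinfo('127.0.0.1', port, 1, {type = 'SOCK_STREAM'})
assert(addrs ~= nil and #addrs > 0, err)

-- Use the first entry to create a matching socket and connect it.
local ai = addrs[1]
local sock = socket(ai.family, ai.type, ai.protocol)
assert(sock ~= nil, 'socket() failed')
local connected = sock:sysconnect(ai.host, ai.port)
if not connected then
    print('sysconnect failed: ' .. tostring(sock:error()))
end

local greeting = sock:read('\n', 1)
print(greeting)

sock:close()
server:close()
```

The entry's host field is already an IP address, which is what sysconnect() requires.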

socket.tcp_server(host, port, handler-function-or-table[, timeout])¶

The socket.tcp_server() function makes Tarantool act as a server that can accept connections. Usually the same objective is accomplished with box.cfg{listen=…}.

Parameters:	host (`string`) – host name or IP port (`number`) – host port, may be 0 handler-function-or-table (`function/table`) – what to execute when a connection occurs timeout (`number`) – host resolving timeout in seconds
Return:	(if error) {nil, error-message-string}. (if no error) a new socket object.
Rtype:	socket object, which may be viewed as a table

The handler-function-or-table parameter may be simply a function name or function declaration: handler_function. Or it may be a table: {handler = handler_function [, prepare = prepare_function] [, name = name] }.

handler_function is mandatory. It may have parameters = the connected socket and a table with client information. It is executed once after accept() happens (once per connection), and is for continuous operation after the connection is made.

prepare_function is optional. It may have a parameter = the listening socket. It should return either a backlog value or nothing. It is executed only once, before bind() on the listening socket (not once per connection).

Examples:

socket.tcp_server('localhost', 3302, function (s) loop_loop() end)
socket.tcp_server('localhost', 3302, {handler=hfunc, name='name'})
socket.tcp_server('localhost', 3302, {handler=hfunc, prepare=pfunc})

For fuller examples see Use tcp_server to accept file contents sent with socat and Use tcp_server with handler and prepare.

socket.bind(host, port)¶

Bind a socket to the given host/port. This is equivalent to socket_object:bind(), but is done on the result of require('socket'), rather than on the socket object.

Parameters:	host (`string`) – URL or IP address port (`number`) – port number
Return:	(if error) {nil, error-message-string}. (if no error) A table which may have information about the bind result.
Rtype:	table

object socket_object¶

socket_object:sysconnect(host, port)¶

Connect an existing socket to a remote host. The argument values are the same as in tcp_connect(). The host must be an IP address.

Parameters:

Either:
- host - a string representation of an IPv4 address or an IPv6 address;
- port - a number.
Or:
- host - a string containing “unix/”;
- port - a string containing a path to a unix socket.
Or:
- host - a number, 0 (zero), meaning “all local interfaces”;
- port - a number. If a port number is 0 (zero), the socket will be bound to a random local port.

Return:	true on success, false on error. The socket object value may change if sysconnect() succeeds.
Rtype:	boolean

Example:

socket = require('socket')
sock = socket('AF_INET', 'SOCK_STREAM', 'tcp')
sock:sysconnect(0, 3301)

socket_object:send(data)¶

socket_object:write(data)¶

Send data over a connected socket.

Parameters:	data (`string`) – what is to be sent
Return:	the number of bytes sent.
Rtype:	number

Possible errors: nil on error.

socket_object:syswrite(size)¶: Write as much data as possible to the socket buffer if non-blocking. Rarely used. For details see this description.

socket_object:recv(size)¶

Read size bytes from a connected socket. An internal read-ahead buffer is used to reduce the cost of this call.

Parameters:	size (`integer`) – maximum number of bytes to receive. See Recommended size.
Return:	a string of the requested length on success.
Rtype:	string

Possible errors: On error, returns an empty string, followed by status, errno, errstr. In case the writing side has closed its end, returns the remainder read from the socket (possibly an empty string), followed by “eof” status.

socket_object:read(limit[, timeout])¶

socket_object:read(delimiter[, timeout])

socket_object:read({options}[, timeout])

Read from a connected socket until some condition is true, and return the bytes that were read. Reading goes on until limit bytes have been read, or a delimiter has been read, or a timeout has expired. Unlike socket_object:recv (which uses an internal read-ahead buffer), socket_object:read depends on the socket’s buffer.

Parameters:

- limit (`integer`) – maximum number of bytes to read, for example 50 means “stop after 50 bytes”;
- delimiter (`string`) – separator, for example `?` means “stop after a question mark”; this parameter can also accept a table of separators, for example `delimiter = {"\n", "\r"}`;
- timeout (`number`) – maximum number of seconds to wait, for example 50 means “stop after 50 seconds”;
- options (`table`) – `chunk=limit` and/or `delimiter=delimiter`, for example `{chunk=5,delimiter='x'}`.

Return:	an empty string if there is nothing more to read, or a nil value if error, or a string up to `limit` bytes long, which may include the bytes that matched the `delimiter` expression.
Rtype:	string
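The three forms of read() can be compared side by side. The following sketch assumes it is run inside Tarantool; the local test server and the message it sends are hypothetical and exist only to give read() something to consume.

```lua
local socket = require('socket')

-- Hypothetical test server: sends one fixed message to each client.
local server = socket.tcp_server('127.0.0.1', 0, function(client)
    client:write('HELLO world\nabcxdef')
end)
assert(server ~= nil, 'tcp_server() failed')

local sock, err = socket.tcp_connect('127.0.0.1', server:name().port, 1)
assert(sock ~= nil, err)

-- read(limit, timeout): stop after 5 bytes.
local head = sock:read(5, 1)

-- read(delimiter, timeout): stop after the next line feed;
-- the delimiter is part of the returned string.
local line = sock:read('\n', 1)

-- read({options}, timeout): stop after 10 bytes or after an 'x',
-- whichever comes first.
local tail = sock:read({chunk = 10, delimiter = 'x'}, 1)

print(head, line, tail)

sock:close()
server:close()
```

Passing a timeout to every call keeps the fiber from waiting indefinitely if the peer stops sending.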

socket_object:sysread(size)¶

Return data from the socket buffer if non-blocking. In case the socket is blocking, sysread() can block the calling process. Rarely used. For details, see also this description.

Parameters:	size (`integer`) – maximum number of bytes to read, for example 50 means “stop after 50 bytes”
Return:	an empty string if there is nothing more to read, or a nil value if error, or a string up to `size` bytes long.
Rtype:	string

socket_object:bind(host[, port])¶

Bind a socket to the given host/port. After binding, a UDP socket can be used to receive data (see socket_object:recvfrom()). A TCP socket can be used to accept new connections after it has been put in listen mode.

Parameters:	host (`string`) – URL or IP address port (`number`) – port number
Return:	true for success, false for error. If return is false, use socket_object:errno() or socket_object:error() to see details.
Rtype:	boolean

socket_object:listen(backlog)¶

Start listening for incoming connections.

Parameters:	backlog – on Linux the listen `backlog` may be taken from `/proc/sys/net/core/somaxconn`, on BSD the backlog may be `SOMAXCONN`.
Return:	true for success, false for error.
Rtype:	boolean

socket_object:accept()¶

Accept a new client connection and create a new connected socket. It is good practice to set the socket’s blocking mode explicitly after accepting.

Return:	a new socket on success.
Rtype:	userdata

Possible errors: returns nil on error.

socket_object:sendto(host, port, data)¶

Send a message on a UDP socket to a specified host.

Parameters:	host (`string`) – URL or IP address port (`number`) – port number data (`string`) – what is to be sent
Return:	the number of bytes sent.
Rtype:	number

Possible errors: on error, returns nil and may return status, errno, errstr.

socket_object:recvfrom(size)¶

Receive a message on a UDP socket.

Parameters:	size (`integer`) – maximum number of bytes to receive. See Recommended size.
Return:	message, a table containing “host”, “family” and “port” fields.
Rtype:	string, table

Possible errors: on error, returns status, errno, errstr.

Example:

After message_content, message_sender = sock:recvfrom(1), the value of message_content might be a string containing ‘X’ and the value of message_sender might be a table containing:

message_sender.host = '18.44.0.1'
message_sender.family = 'AF_INET'
message_sender.port = 43065

socket_object:shutdown(how)¶

Shut down a reading end, a writing end, or both ends of a socket.

Parameters:	how – socket.SHUT_RD, socket.SHUT_WR, or socket.SHUT_RDWR.
Return:	true or false.
Rtype:	boolean

socket_object:close()¶

Close (destroy) a socket. A closed socket should not be used any more. A socket is closed automatically when the Lua garbage collector removes its user data.

Return:	true on success, false on error. For example, if sock is already closed, sock:close() returns false.
Rtype:	boolean

socket_object:error()¶

socket_object:errno()¶

Retrieve information about the last error that occurred on a socket, if any. Socket errors do not throw exceptions, so these functions are usually necessary for finding out what went wrong.

Return:	result for `sock:errno()`, result for `sock:error()`. If there is no error, then `sock:errno()` will return 0 and `sock:error()` will return nil.
Rtype:	number, string
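Because failures are reported through return values, a typical pattern is to test the result of a call and then ask the socket what went wrong. The sketch below assumes it is run inside Tarantool and provokes an error on purpose by binding two UDP sockets to the same address and port; the variable names are illustrative.

```lua
local socket = require('socket')

local first = socket('AF_INET', 'SOCK_DGRAM', 'udp')
local second = socket('AF_INET', 'SOCK_DGRAM', 'udp')

-- No error has happened on this socket yet.
local errno_before = second:errno()

-- Port 0 lets the OS pick a free port for the first socket.
assert(first:bind('127.0.0.1', 0))
local port = first:name().port

-- Binding a second socket to the same address and port is expected to fail
-- (address already in use), which gives an error to inspect.
local ok = second:bind('127.0.0.1', port)
local code, message
if not ok then
    code, message = second:errno(), second:error()
    print('bind failed: errno=' .. code .. ', message=' .. message)
end

first:close()
second:close()
```

The error details are copied into local variables before close(), since a closed socket should not be used any more.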

socket_object:setsockopt(level, name, value)¶

Set socket flags. The argument values are the same as in the Linux getsockopt(2) man page. The ones that Tarantool accepts are:

SO_ACCEPTCONN
SO_BINDTODEVICE
SO_BROADCAST
SO_DEBUG
SO_DOMAIN
SO_ERROR
SO_DONTROUTE
SO_KEEPALIVE
SO_MARK
SO_OOBINLINE
SO_PASSCRED
SO_PEERCRED
SO_PRIORITY
SO_PROTOCOL
SO_RCVBUF
SO_RCVBUFFORCE
SO_RCVLOWAT
SO_SNDLOWAT
SO_RCVTIMEO
SO_SNDTIMEO
SO_REUSEADDR
SO_SNDBUF
SO_SNDBUFFORCE
SO_TIMESTAMP
SO_TYPE

Setting SO_LINGER is done with sock:linger(active).

socket_object:getsockopt(level, name)¶: Get socket flags. For a list of possible flags see sock:setsockopt().

socket_object:linger([active])¶

Set or clear the SO_LINGER flag. For a description of the flag, see the Linux man page.

Parameters:	active (`boolean`) –
Return:	new active and timeout values.

socket_object:nonblock([flag])¶

sock:nonblock() returns the current flag value.
sock:nonblock(false) sets the flag to false and returns false.
sock:nonblock(true) sets the flag to true and returns true.

This function may be useful before invoking a function which might otherwise block indefinitely.

socket_object:readable([timeout])¶

Wait until something is readable, or until a timeout value expires.

Return:	true if the socket is now readable, false if timeout expired.

socket_object:writable([timeout])¶

Wait until something is writable, or until a timeout value expires.

Return:	true if the socket is now writable, false if timeout expired.

socket_object:wait([timeout])¶

Wait until something is either readable or writable, or until a timeout value expires.

Return:	‘R’ if the socket is now readable, ‘W’ if the socket is now writable, ‘RW’ if the socket is now both readable and writable, ‘’ (empty string) if timeout expired.
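The waiting functions are usually combined with a timeout so that a fiber never waits forever. The following sketch, assuming it is run inside Tarantool, uses two local UDP sockets to show readable() timing out when nothing has arrived and succeeding once a datagram is queued; the socket names are illustrative.

```lua
local socket = require('socket')

local receiver = socket('AF_INET', 'SOCK_DGRAM', 'udp')
assert(receiver:bind('127.0.0.1', 0))
local port = receiver:name().port
local sender = socket('AF_INET', 'SOCK_DGRAM', 'udp')

-- Nothing has been sent yet, so this gives up after 0.1 seconds.
local ready_before = receiver:readable(0.1)

sender:sendto('127.0.0.1', port, 'X')

-- A datagram is now queued, so the wait ends before the timeout.
local ready_after = receiver:readable(1)

-- wait() reports which kinds of activity are possible right now.
local state = receiver:wait(1)

local message
if ready_after then
    message = receiver:recvfrom(512)
end
print(ready_before, ready_after, state, message)

receiver:close()
sender:close()
```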

socket_object:name()¶

The sock:name() function is used to get information about the near side of the connection. If a socket was bound to xyz.com:45, then sock:name() will return information about [host:xyz.com, port:45]. The equivalent POSIX function is getsockname().

Return:	A table containing these fields: “host”, “family”, “type”, “protocol”, “port”.
Rtype:	table

socket_object:peer()¶

The sock:peer() function is used to get information about the far side of a connection. If a TCP connection has been made to a distant host tarantool.org:80, sock:peer() will return information about [host:tarantool.org, port:80]. The equivalent POSIX function is getpeername().

Return:	A table containing these fields: “host”, “family”, “type”, “protocol”, “port”.
Rtype:	table

socket.iowait(fd, read-or-write-flags[, timeout])¶

The socket.iowait() function is used to wait until read-or-write activity occurs for a file descriptor.

Parameters:	fd – file descriptor; read-or-write-flags – ‘R’ or 1 = read, ‘W’ or 2 = write, ‘RW’ or 3 = read|write; timeout – number of seconds to wait

If the fd parameter is nil, then there will be a sleep until the timeout. If the timeout parameter is nil or unspecified, then timeout is infinite.

Ordinarily the return value is the activity that occurred (‘R’ or ‘W’ or ‘RW’ or 1 or 2 or 3). If the timeout period goes by without any reading or writing, the return is an error = ETIMEDOUT.

Example: socket.iowait(sock:fd(), 'r', 1.11)

LuaSocket wrapper functions

The LuaSocket API has functions that are equivalent to the ones described above, with different names and parameters, for example connect() rather than tcp_connect(). Tarantool supports these functions so that third-party packages which depend on them will work.

The LuaSocket project is on GitHub. The API description is in the LuaSocket manual (click the “introduction” and “reference” links at the bottom of the manual’s main page).

A Tarantool example is Use of a socket with LuaSocket wrapper functions.

Recommended size

For recv and recvfrom: use the optional size parameter to limit the number of bytes to receive. A fixed size such as 512 is often reasonable; a pre-calculated size that depends on context – such as the message format or the state of the network – is often better. For recvfrom, be aware that a size greater than the Maximum Transmission Unit can cause inefficient transport. For Mac OS X, be aware that the size can be tuned by changing sysctl net.inet.udp.maxdgram.

If size is not stated: Tarantool will make an extra call to calculate how many bytes are necessary. This extra call takes time, therefore not stating size may be inefficient.

If size is stated: on a UDP socket, excess bytes are discarded. On a TCP socket, excess bytes are not discarded and can be received by the next call.
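The UDP case can be observed with two sockets on localhost. This sketch assumes it is run inside Tarantool; the 8-byte message and the size of 4 are arbitrary values picked for the demonstration.

```lua
local socket = require('socket')

local receiver = socket('AF_INET', 'SOCK_DGRAM', 'udp')
assert(receiver:bind('127.0.0.1', 0))
local sender = socket('AF_INET', 'SOCK_DGRAM', 'udp')

-- Send one 8-byte datagram.
sender:sendto('127.0.0.1', receiver:name().port, 'ABCDEFGH')

-- Ask for at most 4 bytes of it.
assert(receiver:readable(1))
local part, from = receiver:recvfrom(4)

-- On UDP the excess bytes are discarded, so nothing is left to read.
local more = receiver:readable(0.1)
print(part, from.host, more)

receiver:close()
sender:close()
```

With a TCP socket the same recv(4) would leave the remaining bytes available for the next call.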

Examples

Use of a TCP socket over the Internet

In this example a connection is made over the internet between a Tarantool instance and tarantool.org, then an HTTP “head” message is sent, and a response is received: “HTTP/1.1 200 OK” or something else if the site has moved. This is not a useful way to communicate with this particular site, but shows that the system works.

tarantool> socket = require('socket')
---
...
tarantool> sock = socket.tcp_connect('tarantool.org', 80)
---
...
tarantool> type(sock)
---
- table
...
tarantool> sock:error()
---
- null
...
tarantool> sock:send("HEAD / HTTP/1.0\r\nHost: tarantool.org\r\n\r\n")
---
- 40
...
tarantool> sock:read(17)
---
- HTTP/1.1 302 Move
...
tarantool> sock:close()
---
- true
...

Use of a socket with LuaSocket wrapper functions

This is a variation of the earlier example “Use of a TCP socket over the Internet”. It uses LuaSocket wrapper functions, with a too-short timeout so that a “Connection timed out” error is likely. The more common way to specify timeout is with an option of tcp_connect().

tarantool> socket = require('socket')
---
...
tarantool> sock = socket.connect('tarantool.org', 80)
---
...
tarantool> sock:settimeout(0.001)
---
- 1
...
tarantool> sock:send("HEAD / HTTP/1.0\r\nHost: tarantool.org\r\n\r\n")
---
- 40
...
tarantool> sock:receive(17)
---
- null
- Connection timed out
...
tarantool> sock:close()
---
- 1
...

Use of a UDP socket on localhost

Here is an example with datagrams. Set up two connections on 127.0.0.1 (localhost): sock_1 and sock_2. Using sock_2, send a message to sock_1. Using sock_1, receive a message. Display the received message. Close both connections.
This is not a useful way for a computer to communicate with itself, but shows that the system works.

tarantool> socket = require('socket')
---
...
tarantool> sock_1 = socket('AF_INET', 'SOCK_DGRAM', 'udp')
---
...
tarantool> sock_1:bind('127.0.0.1')
---
- true
...
tarantool> sock_2 = socket('AF_INET', 'SOCK_DGRAM', 'udp')
---
...
tarantool> sock_2:sendto('127.0.0.1', sock_1:name().port,'X')
---
- 1
...
tarantool> message = sock_1:recvfrom(512)
---
...
tarantool> message
---
- X
...
tarantool> sock_1:close()
---
- true
...
tarantool> sock_2:close()
---
- true
...

Use tcp_server to accept file contents sent with socat

Here is an example of the tcp_server function, reading strings from the client and printing them. On the client side, the Linux socat utility will be used to ship a whole file for the tcp_server function to read.

Start two shells. The first shell will be a server instance. The second shell will be the client.

On the first shell, start Tarantool and say:

box.cfg{}
socket = require('socket')
socket.tcp_server('0.0.0.0', 3302,
{
  handler = function(s)
    while true do
      local request
      request = s:read("\n");
      if request == "" or request == nil then
        break
      end
      print(request)
    end
  end,
  prepare = function()
    print('Initialized')
  end
}
)

The above code means:

Use tcp_server() to wait for a connection from any host on port 3302.
When it happens, enter a loop that reads on the socket and prints what it reads. The “delimiter” for the read function is “\n” so each read() will read a string as far as the next line feed, including the line feed.

On the second shell, create a file that contains a few lines. The contents don’t matter. Suppose the first line contains A, the second line contains B, the third line contains C. Call this file “tmp.txt”.

On the second shell, use the socat utility to ship the tmp.txt file to the server instance’s host and port:

$ socat TCP:localhost:3302 ./tmp.txt

Now watch what happens on the first shell. The strings “A”, “B”, “C” are printed.

Use tcp_server with handler and prepare

Here is an example of the tcp_server function using handler and prepare.

Start two shells. The first shell will be a server instance. The second shell will be the client.

On the first shell, start Tarantool and say:

box.cfg{}
socket = require('socket')
sock = socket.tcp_server(
  '0.0.0.0',
  3302,
  {prepare =
     function(sock)
       print('listening on socket ' .. sock:fd())
       sock:setsockopt('SOL_SOCKET','SO_REUSEADDR',true)
       return 5
     end,
   handler =
    function(sock, from)
      print('accepted connection from: ')
      print('  host: ' .. from.host)
      print('  family: ' .. from.family)
      print('  port: ' .. from.port)
    end
  }
)

The above code means:

Use tcp_server() to wait for a connection from any host on port 3302.
Specify that there will be an initial call to prepare which displays something about the server, then calls setsockopt(...'SO_REUSEADDR'...) (this is the same option that Tarantool would set if there was no prepare), and then returns 5 (this is a rather low backlog queue size).
Specify that there will be per-connection calls to handler which display something about the client.

Now watch what happens on the first shell. The display will include something like ‘listening on socket 12’.

On the second shell, start Tarantool and say:

box.cfg{}
require('socket').tcp_connect('127.0.0.1', 3302)

Now watch what happens on the first shell. The display will include something like ‘accepted connection from host: 127.0.0.1 family: AF_INET port: 37186’.

Module strict

The strict module has functions for turning “strict mode” on or off. When strict mode is on, an attempt to use an undeclared global variable will cause an error. A global variable is considered “undeclared” if it has never had a value assigned to it. Often this is an indication of a programming error.

By default strict mode is off, unless Tarantool was built with the -DCMAKE_BUILD_TYPE=Debug option – see the description of build options in section building-from-source.

Example:

tarantool> strict = require('strict')
---
...
tarantool> strict.on()
---
...
tarantool> a = b -- strict mode is on so this will cause an error
---
- error: ... variable ''b'' is not declared'
...
tarantool> strict.off()
---
...
tarantool> a = b -- strict mode is off so this will not cause an error
---
...
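In a script, as opposed to the console, the error raised by strict mode can be caught with pcall(). The sketch below assumes it is run inside Tarantool; undeclared_name is a hypothetical global that is never assigned.

```lua
local strict = require('strict')

strict.on()
-- Reading a global that was never assigned raises an error in strict mode;
-- pcall() turns that error into return values.
local ok, err = pcall(function() return undeclared_name end)
strict.off()

-- With strict mode off, the same read just yields nil.
local value = undeclared_name

print(ok, err, value)
```

Turning strict mode on only around the code being checked, as here, is one option; enabling it once at the start of an application is another.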

Module string

Overview

The string module has everything in the standard Lua string library, and some Tarantool extensions.

In this section we only discuss the additional functions that the Tarantool developers have added.

Index

Below is a list of all additional string functions.

Name	Use
string.ljust()	Left-justify a string
string.rjust()	Right-justify a string
string.hex()	Given a string, return hexadecimal values
string.fromhex()	Given hexadecimal values, return a string
string.startswith()	Check if a string starts with a given substring
string.endswith()	Check if a string ends with a given substring
string.lstrip()	Remove characters from the left of a string
string.rstrip()	Remove characters from the right of a string
string.split()	Split a string into a table of strings
string.strip()	Remove characters from the left and right of a string

string.ljust(input-string, width[, pad-character])¶

Return the string left-justified in a string of length width.

Parameters:	input-string (`string`) – the string to left-justify width (`integer`) – the width of the string after left-justifying pad-character (`string`) – a single character, default = 1 space
Return:	left-justified string (unchanged if width <= string length)
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.ljust(' A', 5)
---
- ' A   '
...

string.rjust(input-string, width[, pad-character])¶

Return the string right-justified in a string of length width.

Parameters:	input-string (`string`) – the string to right-justify width (`integer`) – the width of the string after right-justifying pad-character (`string`) – a single character, default = 1 space
Return:	right-justified string (unchanged if width <= string length)
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.rjust('', 5, 'X')
---
- 'XXXXX'
...

string.hex(input-string)¶

Return the hexadecimal value of the input string.

Parameters:	input-string (`string`) – the string to process
Return:	hexadecimal, 2 hex-digit characters for each input character
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.hex('ABC ')
---
- '41424320'
...

string.fromhex(hexadecimal-input-string)¶

Given a string containing pairs of hexadecimal digits, return a string with one byte for each pair. This is the reverse of string.hex(). The hexadecimal-input-string must contain an even number of hexadecimal digits.

Parameters:	hexadecimal-input-string (`string`) – string with pairs of hexadecimal digits
Return:	string with one byte for each pair of hexadecimal digits
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.fromhex('41424320')
---
- 'ABC '
...

string.startswith(input-string, start-string[, start-pos[, end-pos]])¶

Return true if input-string starts with start-string, otherwise return false.

Parameters:	input-string (`string`) – the string where `start-string` should be looked for start-string (`string`) – the string to look for start-pos (`integer`) – position: where to start looking within `input-string` end-pos (`integer`) – position: where to end looking within `input-string`
Return:	true or false
Rtype:	boolean

start-pos and end-pos may be negative, meaning the position should be calculated from the end of the string.

Example:

tarantool> string = require('string')
---
...
tarantool> string.startswith(' A', 'A', 2, 5)
---
- true
...

string.endswith(input-string, end-string[, start-pos[, end-pos]])¶

Return true if input-string ends with end-string, otherwise return false.

Parameters:	input-string (`string`) – the string where `end-string` should be looked for end-string (`string`) – the string to look for start-pos (`integer`) – position: where to start looking within `input-string` end-pos (`integer`) – position: where to end looking within `input-string`
Return:	true or false
Rtype:	boolean

start-pos and end-pos may be negative, meaning the position should be calculated from the end of the string.

Example:

tarantool> string = require('string')
---
...
tarantool> string.endswith('Baa', 'aa')
---
- true
...

string.lstrip(input-string[, list-of-characters])¶

Return the value of the input string, after removing characters on the left. The optional list-of-characters parameter is a set not a sequence, so string.lstrip(...,'ABC') does not mean strip 'ABC', it means strip 'A' or 'B' or 'C'.

Parameters:	input-string (`string`) – the string to process list-of-characters (`string`) – what characters can be stripped. Default = space.
Return:	result after stripping characters from input string
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.lstrip(' ABC ')
---
- 'ABC '
...

string.rstrip(input-string[, list-of-characters])¶

Return the value of the input string, after removing characters on the right. The optional list-of-characters parameter is a set not a sequence, so string.rstrip(...,'ABC') does not mean strip 'ABC', it means strip 'A' or 'B' or 'C'.

Parameters:	input-string (`string`) – the string to process list-of-characters (`string`) – what characters can be stripped. Default = space.
Return:	result after stripping characters from input string
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.rstrip(' ABC ')
---
- ' ABC'
...

string.split(input-string[, split-string[, max]])¶

Split input-string into one or more output strings in a table. The places to split are the places where split-string occurs.

Parameters:	input-string (`string`) – the string to split split-string (`string`) – the string to find within `input-string`. Default = space. max (`integer`) – maximum number of delimiters to process counting from the beginning of the input string. Result will contain max + 1 parts maximum.
Return:	table of strings that were split from `input-string`
Rtype:	table

Example:

tarantool> string = require('string')
---
...
tarantool> string.split("A:B:C:D:F", ":", 2)
---
- - A
  - B
  - C:D:F
...

string.strip(input-string[, list-of-characters])¶

Return the value of the input string, after removing characters on the left and the right. The optional list-of-characters parameter is a set not a sequence, so string.strip(...,'ABC') does not mean strip 'ABC', it means strip 'A' or 'B' or 'C'.

Parameters:	input-string (`string`) – the string to process list-of-characters (`string`) – what characters can be stripped. Default = space.
Return:	result after stripping characters from input string
Rtype:	string

Example:

tarantool> string = require('string')
---
...
tarantool> string.strip(' ABC ')
---
- ABC
...
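Several of these functions are often used together. As a small sketch, assuming it is run inside Tarantool, the script below parses a made-up “key = value” line, formats it in fixed-width columns, and round-trips the key through hex() and fromhex(); the input line and the column widths are invented for the illustration.

```lua
local string = require('string')

-- Hypothetical input line in "key = value" form.
local line = '  listen = 3301  '

-- Split on the first '=' only, then remove the surrounding spaces.
local parts = string.split(line, '=', 1)
local key = string.strip(parts[1])
local value = string.strip(parts[2])

-- Fixed-width output: key padded on the right with dots,
-- value padded on the left with spaces.
local row = string.ljust(key, 10, '.') .. string.rjust(value, 6)

-- hex() and fromhex() are inverses of each other.
local encoded = string.hex(key)
local decoded = string.fromhex(encoded)

print(row)
print(encoded, decoded)
```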

Module swim

Overview

The swim module contains Tarantool’s implementation of SWIM – Scalable Weakly-consistent Infection-style Process Group Membership Protocol. It is recommended for any type of Tarantool cluster where the number of nodes can be large. Its job is to discover and monitor the other members in the cluster and keep their information in a “member table”. It works by periodically sending and receiving messages via UDP in a background event loop.

Each message has several parts, including:

the ping such as “I am checking whether you are alive”,
the event such as “I am joining”,
the anti-entropy such as “I know that another member exists”,
the payload such as “I or another member could have user-generated data”.

The maximum message size is about 1500 bytes.

SWIM sends messages periodically to a random subset of the member table. SWIM processes replies from those members asynchronously.

Each entry in the member table has:

a UUID,
a status (“alive”, “suspected”, “dead”, or “left”).

When a member fails to acknowledge a certain number of pings, its status is changed from “alive” to “suspected”, that is, suspected of being dead. But SWIM tries to avoid false positives (misidentifying members as dead), which could happen when a member is overloaded and responds to pings too slowly, or when there is network trouble and packets cannot go through some channels. When a member is suspected, SWIM randomly chooses other members and sends requests to them: “please ping this suspected member”. This is called an indirect ping. Thus, via different routes and additional hops, the suspected member gets additional chances to reply and so “refute” the suspicion.

Because selection is random, there is an even network load of about one message per member per protocol step, regardless of the cluster size. This is a major feature of SWIM. Because the protocol depends on members passing information on, also known as “gossiping”, members do not need to broadcast messages to every member, which would cause a network load of N messages per member per protocol step, where N is the number of members in the cluster. However, selection is not entirely random: there is a preference for selecting least-recently-pinged members, as in a round-robin.

Regarding the anti-entropy part of a message: this is necessary for maintaining the status in entries of the member table. Consider an example where two members, #1 and #2, are both alive. No events happen so only pings are being sent periodically. Then a third member, #3 appears. It knows about one of the existing members, #2. How can it discover the other member? Certainly #1 could notify #2 and #2 could notify #3, but messages go via UDP, so any notification event can be lost. However, regular messages containing “ping” and/or “event” also can contain an “anti-entropy” section, which is taken from a randomly-chosen part of the member table. So for this example, #2 will eventually randomly add to a regular message the anti-entropy note that #1 is alive, and thus #3 will discover #1 even though it did not receive a direct “I am alive” event message from #1.

Regarding the UUID part of an entry in the member table: this is necessary for stable identification, because UUID changes more rarely than URI (a combination of IP and port number). But if the UUID does change, SWIM will include both the new and old UUID in messages, so all other members will eventually learn about the new UUID and change the member table accordingly.

Regarding the payload part of a message: this is not always necessary, it is a feature which allows passing user-generated information via SWIM instead of via node-to-node communication. The swim module has methods for specifying a “payload”, which is arbitrary user data with a maximum size of about 1.2 KB. The payload can be anything, and it will be eventually disseminated over the cluster and available at other members. Each member can have its own payload.

Messages can be encrypted. Encryption may not be necessary in a closed network but is necessary for safety if the cluster is on the public Internet. Users can specify an encryption algorithm, an encryption mode, and a private key. All parts of all messages (including ping, acknowledgment, event, payload, URI, and UUID) will be encrypted with that private key, as well as a random public key generated for each message to prevent pattern attacks.

In theory the event dissemination speed (the number of hops to pass information throughout the cluster) is O(log(cluster_size)). For that and other theoretical information see the Cornell University paper which originally described SWIM.

swim.new([cfg])

Create a new SWIM instance. A SWIM instance maintains a member table and interacts with other members. Multiple SWIM instances can be created in a single Tarantool process.

Parameters:

cfg (table) –
an optional configuration parameter.

If cfg is not specified or is nil, then the new SWIM instance is not bound to a socket and has nil attributes, so it cannot interact with other members and only a few methods are valid until swim_object:cfg() is called.

If cfg is specified, then the effect is the same as calling s = swim.new() and then s:cfg(cfg), except for generation. For a description of the configuration, see swim_object:cfg().

The generation part of cfg can only be specified during new(); it cannot be specified later during cfg(). Generation is part of incarnation. Usually generation is not specified because the default value (a timestamp) is sufficient, but if there is reason to mistrust timestamps (because the time is changed or because the instance is started on a different machine), then users may call swim.new({generation = <number>}). In that case the latest value should be persisted somehow (for example in a file, or in a space, or in a global service), and the new value must be greater than any previous value of generation.
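
One possible way to persist generation in a file is sketched below. The helper next_generation and the file path are hypothetical and not part of the swim module. The sketch only guarantees that each returned value is greater than the value stored by the previous call.

```lua
-- Hypothetical helper: keep the last used generation in a plain text file.
-- `now` is optional and defaults to the current timestamp in seconds.
local function next_generation(path, now)
    local prev = 0
    local f = io.open(path, 'r')
    if f then
        prev = tonumber(f:read('*l')) or 0
        f:close()
    end
    -- Never go backwards, even if the clock did.
    local gen = math.max(prev + 1, now or os.time())
    f = assert(io.open(path, 'w'))
    f:write(string.format('%d\n', gen))
    f:close()
    return gen
end

-- Inside Tarantool the result could then be passed on, for example:
-- s = swim.new({generation = next_generation('swim.generation')})
```

A file is the simplest of the three storage options mentioned above. A space or a global service would be needed if the instance can be restarted on a different machine.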

Return:	swim-object a swim object

Example:

swim_object = swim.new({uri = 3333, uuid = '00000000-0000-1000-8000-000000000001', heartbeat_rate = 0.1})

object swim_object

A swim object is an object returned by swim.new(). It has methods: cfg(), delete(), is_configured(), size(), quit(), add_member(), remove_member(), probe_member(), broadcast(), set_payload(), set_payload_raw(), set_codec(), self(), member_by_uuid(), pairs().

swim_object:cfg(cfg)

Configure or reconfigure a SWIM instance.

Parameters:	cfg (`table`) – the options to describe instance behavior

The cfg table may have these components:

heartbeat_rate (double) – rate of sending round messages, in seconds. Setting heartbeat_rate to X does not mean that every member will be checked every X seconds; instead, X is the protocol speed. The protocol period depends on member count and heartbeat_rate. Default = 1.
ack_timeout (double) – time in seconds after which a ping is considered to be unacknowledged. Default = 30.
gc_mode (enum) – dead member collection mode.

If gc_mode == 'off' then SWIM never removes dead members from the member table (though users may remove them with swim_object:remove_member()), and SWIM will continue to ping them as if they were alive.

If gc_mode == 'on' then SWIM removes dead members from the member table after one round.

Default = 'on'.
uri (string or number) – either an 'ip:port' address, or just a port number (if ip is omitted then 127.0.0.1 is assumed). If port == 0, then the kernel will select any free port for the IP address.
uuid (string or cdata struct tt_uuid) – a value which should be unique among SWIM instances. Users may choose any value but the recommendation is: use box.cfg.instance_uuid, the Tarantool instance’s UUID.

All the cfg components are dynamic – swim_object:cfg() may be called more than once. If it is not being called for the first time and a component is not specified, then the component retains its previous value. If it is being called for the first time then uri and uuid are mandatory, since a SWIM instance cannot operate without URI and UUID.

swim_object:cfg() is atomic – if there is an error, then nothing changes.

Return:	true if configuration succeeds
Return:	nil, `err` if an error occurred. `err` is an error object

Example:

swim_object:cfg({heartbeat_rate = 0.5})

After swim_object:cfg(), all other swim_object methods are callable.
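
The first-call and later-call rules can be combined with the nil, err return convention as in the following sketch. The function start_member is hypothetical. It receives the swim module as an argument, so the same code can be tried with a stub outside Tarantool.

```lua
-- Hypothetical helper. `swim` is the module returned by require('swim').
local function start_member(swim, uri, uuid)
    local s = swim.new()   -- created without cfg, so not configured yet
    -- First call: uri and uuid are mandatory.
    local ok, err = s:cfg({uri = uri, uuid = uuid})
    if not ok then
        return nil, err
    end
    -- Later call: only heartbeat_rate is given, so uri and uuid
    -- retain their previous values.
    ok, err = s:cfg({heartbeat_rate = 0.5})
    if not ok then
        -- cfg() is atomic, so the first configuration is still in effect.
        return s, err
    end
    return s
end

-- Possible use inside Tarantool:
-- local s, err = start_member(require('swim'), 3333,
--                             '00000000-0000-1000-8000-000000000001')
```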

.cfg

Expose all non-nil components of the read-only table which was set up or changed by swim_object:cfg().

Example:

tarantool> swim_object.cfg
---
- gc_mode: off
  uri: 3333
  uuid: 00000000-0000-1000-8000-000000000001
...

swim_object:delete()

Delete a SWIM instance immediately. Its memory is freed, its member table entry is deleted, and it can no longer be used. Other members will treat this member as ‘dead’.

After swim_object:delete() any attempt to use the deleted instance will cause an exception to be thrown.

Return:	none, this method does not fail

Example: swim_object:delete()

swim_object:is_configured()

Return false if a SWIM instance was created via swim.new() without an optional cfg argument, and was not configured with swim_object:cfg(). Otherwise return true.

Return:	boolean result, true if configured, otherwise false

Example: swim_object:is_configured()

swim_object:size()

Return the size of the member table. It will be at least 1 because the “self” member is included.

Return:	integer size

Example: swim_object:size()

swim_object:quit()

Leave the cluster.

This is a graceful equivalent of swim_object:delete() – the instance is deleted, but before deletion it sends a message to each member in its member table, saying that this instance has left the cluster and should not be considered dead.

Other instances will mark such a member in their tables as ‘left’, and drop it after one round of dissemination.

Consequences to the caller are the same as after swim_object:delete() – the instance is no longer usable, and an error will be thrown if there is an attempt to use it.

Return:	none, the method does not fail

Example: swim_object:quit()

swim_object:add_member(cfg)

Explicitly add a member into the member table.

This method is useful when a new member is joining the cluster and does not yet know what members already exist. In that case it can start interaction explicitly by adding the details about an already-existing member into its member table. Subsequently SWIM will discover other members automatically via messages from the already-existing member.

Parameters:	cfg (`table`) – description of the member

The cfg table has two mandatory components, uuid and uri, which have the same format as uuid and uri in the table for swim_object:cfg().

Return:	true if member is added
Return:	nil, `err` if an error occurred. `err` is an error object

Example:

ok, err = swim_object:add_member({uuid = ..., uri = ...})

swim_object:remove_member(uuid)

Explicitly and immediately remove a member from the member table.

Parameters:	uuid (`string-or-cdata-struct-tt_uuid`) – UUID
Return:	true if member is removed
Return:	nil, `err` if an error occurred. `err` is an error object.

Example: swim_object:remove_member('00000000-0000-1000-8000-000000000001')

swim_object:probe_member(uri)

Send a ping request to the specified uri address. If another member is listening at that address, it will receive the ping, and respond with an ACK (acknowledgment) message containing information such as UUID. That information will be added to the member table.

swim_object:probe_member() is similar to swim_object:add_member(), but it does not require UUID, and it is not reliable because it uses UDP.

Parameters:	uri (`string-or-number`) – URI. Format is the same as for `uri` in swim_object:cfg().
Return:	true if member is pinged
Return:	nil, `err` if an error occurred. `err` is an error object.

Example: swim_object:probe_member(3333)
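
Because a probe can be lost, a caller may want to repeat it. The following is a minimal sketch under stated assumptions: probe_until_joined is a hypothetical helper, the sleep function is passed in (inside Tarantool it could be fiber.sleep), and growth of the member table is used as a rough sign that the ACK has arrived.

```lua
-- Hypothetical helper: repeat an unreliable probe a few times.
-- s        - a configured swim object
-- uri      - same format as uri in swim_object:cfg()
-- attempts - maximum number of probes
-- pause    - seconds to wait after each probe
-- sleep    - function(seconds), for example require('fiber').sleep
local function probe_until_joined(s, uri, attempts, pause, sleep)
    local before = s:size()
    for _ = 1, attempts do
        local ok, err = s:probe_member(uri)
        if not ok then
            return nil, err
        end
        sleep(pause)
        -- Heuristic: a larger member table means somebody answered.
        if s:size() > before then
            return true
        end
    end
    return false
end

-- Possible use inside Tarantool:
-- probe_until_joined(s1, 3334, 5, 0.1, require('fiber').sleep)
```

The size check is only a heuristic, since another member may be discovered in the same interval. If the UUID is known in advance, swim_object:member_by_uuid() gives a more precise check.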

swim_object:broadcast([port])

Broadcast a ping request to all the network interfaces in the system.

swim_object:broadcast() is like swim_object:probe_member() to many members at once.

Parameters:	port (`number`) – All the sent ping requests have this port as destination port in their UDP headers. By default a currently bound port is used.
Return:	true if broadcast is sent
Return:	nil, `err` if an error occurred. `err` is an error object.

Example:

tarantool> fiber = require('fiber')
---
...
tarantool> swim = require('swim')
---
...
tarantool> s1 = swim.new({uri = 3333, uuid = '00000000-0000-1000-8000-000000000001', heartbeat_rate = 0.1})
---
...
tarantool> s2 = swim.new({uri = 3334, uuid = '00000000-0000-1000-8000-000000000002', heartbeat_rate = 0.1})
---
...
tarantool> s1:size()
---
- 1
...
tarantool> s1:add_member({uri = s2:self():uri(), uuid = s2:self():uuid()})
---
- true
...
tarantool> s1:size()
---
- 2
...
tarantool> s2:size()
---
- 1
...

tarantool> fiber.sleep(0.2)
---
...
tarantool> s1:size()
---
- 2
...
tarantool> s2:size()
---
- 2
...
tarantool> s1:remove_member(s2:self():uuid()) s2:remove_member(s1:self():uuid())
---
...
tarantool> s1:size()
---
- 1
...
tarantool> s2:size()
---
- 1
...

tarantool> s1:probe_member(s2:self():uri())
---
- true
...
tarantool> fiber.sleep(0.1)
---
...
tarantool> s1:size()
---
- 2
...
tarantool> s2:size()
---
- 2
...
tarantool> s1:remove_member(s2:self():uuid()) s2:remove_member(s1:self():uuid())
---
...
tarantool> s1:size()
---
- 1
...
tarantool> s2:size()
---
- 1
...
tarantool> s1:broadcast(3334)
---
- true
...
tarantool> fiber.sleep(0.1)
---
...
tarantool> s1:size()
---
- 2
...

tarantool> s2:size()
---
- 2
...

swim_object:set_payload(payload)

Set a payload, as formatted data.

Payload is arbitrary user-defined data, up to 1200 bytes in size, that is disseminated over the cluster. Each cluster member will eventually learn the payload of the other members in the cluster, because it is stored in the member table and can be queried with swim_member_object:payload().

Different members may have different payloads.

Parameters:	payload (`object`) – Arbitrary Lua object to disseminate. Set to nil to remove the payload, in which case it will be eventually removed on other instances. The object is serialized in MessagePack.
Return:	true if payload is set
Return:	nil, `err` if an error occurred. `err` is an error object

Example:

swim_object:set_payload({field1 = 100, field2 = 200})

swim_object:set_payload_raw(payload[, size])

Set a payload, as raw data.

Sometimes a payload does not need to be a Lua object. For example, a user may already have a well formatted MessagePack object and just wants to set it as a payload. Or cdata needs to be exposed.

set_payload_raw allows setting a payload as is, without MessagePack serialization.

Parameters:
payload (`string-or-cdata`) – any value
size (`number`) – payload size in bytes. If `payload` is string then `size` is optional, and if specified, then should not be larger than actual `payload` size. If `size` is less than actual `payload` size, then only the first `size` bytes of `payload` are used. If `payload` is cdata then `size` is mandatory.
Return:	true if payload is set
Return:	nil, `err` if an error occurred. `err` is an error object

Example:

tarantool> ffi = require('ffi')
---
...
tarantool> fiber = require('fiber')
---
...
tarantool> swim = require('swim')
---
...
tarantool> s1 = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000001', heartbeat_rate = 0.1})
---
...
tarantool> s2 = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000002', heartbeat_rate = 0.1})
---
...
tarantool> s1:add_member({uri = s2:self():uri(), uuid = s2:self():uuid()})
---
- true
...
tarantool> s1:set_payload({a = 100, b = 200})
---
- true
...
tarantool> s2:set_payload('any payload')
---
- true
...
tarantool> fiber.sleep(0.2)
---
...
tarantool> s1_view = s2:member_by_uuid(s1:self():uuid())
---
...
tarantool> s2_view = s1:member_by_uuid(s2:self():uuid())
---
...
tarantool> s1_view:payload()
---
- {'a': 100, 'b': 200}
...
tarantool> s2_view:payload()
---
- any payload
...
tarantool> cdata = ffi.new('char[?]', 2)
---
...
tarantool> cdata[0] = 1
---
...
tarantool> cdata[1] = 2
---
...
tarantool> s1:set_payload_raw(cdata, 2)
---
- true
...
tarantool> fiber.sleep(0.2)
---
...
tarantool> cdata, size = s1_view:payload_cdata()
---
...
tarantool> cdata[0]
---
- 1
...
tarantool> cdata[1]
---
- 2
...
tarantool> size
---
- 2
...

swim_object:set_codec(codec_cfg)

Enable encryption for all following messages.

For a brief description of encryption algorithms see “enum crypto_algo” and “enum crypto_mode” in the Tarantool source code file crypto.h.

When encryption is enabled, all the messages are encrypted with a chosen private key, and a randomly generated and updated public key.

Parameters:	codec_cfg (`table`) – description of the encryption

The components of the codec_cfg table may be:

algo (string) – encryption algorithm name. All the names in module crypto are supported: ‘aes128’, ‘aes192’, ‘aes256’, ‘des’. Specify ‘none’ to disable encryption.
mode (string) – encryption algorithm mode. All the modes in module crypto are supported: ‘ecb’, ‘cbc’, ‘cfb’, ‘ofb’. Default = ‘cbc’.
key (cdata or string) – a private secret key which is kept secret and should never be stored hard-coded in source code.
key_size (integer) – size of the key in bytes.

key_size is mandatory if key is cdata.

key_size is optional if key is string, and if key_size is shorter than the actual key size then the key is truncated.

All of algo, mode, key, and key_size should be the same for all SWIM instances, so that members can understand each others’ messages.

Example:

tarantool> swim = require('swim')
---
...
tarantool> s1 = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000001'})
---
...
tarantool> s1:set_codec({algo = 'aes128', mode = 'cbc', key = '1234567812345678'})
---
- true
...
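
The example above uses a literal key for brevity. Since the key should not be hard-coded, one option is to read it from the environment, as in this sketch. The helper codec_from_env and the variable names SWIM_KEY and SWIM_ALGO are assumptions made for illustration. The length check is an extra precaution based on the standard AES key sizes, not a documented requirement of set_codec().

```lua
-- Standard AES key sizes in bytes.
local KEY_BYTES = {aes128 = 16, aes192 = 24, aes256 = 32}

-- Hypothetical helper: build a codec_cfg table from environment variables.
-- `getenv` is optional and defaults to os.getenv.
local function codec_from_env(getenv)
    getenv = getenv or os.getenv
    local key = getenv('SWIM_KEY')
    if key == nil then
        return nil, 'SWIM_KEY is not set'
    end
    local algo = getenv('SWIM_ALGO') or 'aes128'
    local want = KEY_BYTES[algo]
    if want == nil then
        return nil, 'unsupported algo: ' .. algo
    end
    if #key ~= want then
        return nil, string.format('%s needs a %d-byte key, got %d',
                                  algo, want, #key)
    end
    return {algo = algo, mode = 'cbc', key = key, key_size = want}
end

-- Possible use inside Tarantool:
-- s1:set_codec(assert(codec_from_env()))
```

Every member of the cluster would need the same environment values, because algo, mode, key, and key_size have to match on all SWIM instances.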

swim_object:self()

Return a swim member object (of self) from the member table, or from a cache containing earlier results of swim_object:self() or swim_object:member_by_uuid() or swim_object:pairs().

Return:	swim member object, not nil because self() will not fail

Example: swim_member_object = swim_object:self()

swim_object:member_by_uuid(uuid)

Return a swim member object (given UUID) from the member table, or from a cache containing earlier results of swim_object:self() or swim_object:member_by_uuid() or swim_object:pairs().

Parameters:	uuid (`string-or-cdata-struct-tt-uuid`) – UUID
Return:	swim member object, or nil if not found

Example:

swim_member_object = swim_object:member_by_uuid('00000000-0000-1000-8000-000000000001')

swim_object:pairs()

Set up an iterator for returning swim member objects from the member table, or from a cache containing earlier results of swim_object:self() or swim_object:member_by_uuid() or swim_object:pairs().

swim_object:pairs() should be in a ‘for’ loop, and there should only be one iterator in operation at one time. (The iterator is implemented in an extra light fashion so only one iterator object is available per SWIM instance.)

Return:	generator+object+key (`varies`) – as for any Lua pairs() iterator: generator function, iterator object (a swim member object), and initial key (a UUID).

Example:

tarantool> fiber = require('fiber')
---
...
tarantool> swim = require('swim')
---
...
tarantool> s1 = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000001', heartbeat_rate = 0.1})
---
...
tarantool> s2 = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000002', heartbeat_rate = 0.1})
---
...
tarantool> s1:add_member({uri = s2:self():uri(), uuid = s2:self():uuid()})
---
- true
...
tarantool> fiber.sleep(0.2)
---
...
tarantool> s1:self()
---
- uri: 127.0.0.1:55845
  status: alive
  incarnation: cdata {generation = 1569353431853325ULL, version = 1ULL}
  uuid: 00000000-0000-1000-8000-000000000001
  payload_size: 0
...
tarantool> s1:member_by_uuid(s1:self():uuid())
---
- uri: 127.0.0.1:55845
  status: alive
  incarnation: cdata {generation = 1569353431853325ULL, version = 1ULL}
  uuid: 00000000-0000-1000-8000-000000000001
  payload_size: 0
...
tarantool> s1:member_by_uuid(s2:self():uuid())
---
- uri: 127.0.0.1:53666
  status: alive
  incarnation: cdata {generation = 1569353431865138ULL, version = 1ULL}
  uuid: 00000000-0000-1000-8000-000000000002
  payload_size: 0
...
tarantool> t = {}
---
...
tarantool> for k, v in s1:pairs() do table.insert(t, {k, v}) end
---
...
tarantool> t
---
- - - 00000000-0000-1000-8000-000000000002
    - uri: 127.0.0.1:53666
      status: alive
      incarnation: cdata {generation = 1569353431865138ULL, version = 1ULL}
      uuid: 00000000-0000-1000-8000-000000000002
      payload_size: 0
  - - 00000000-0000-1000-8000-000000000001
    - uri: 127.0.0.1:55845
      status: alive
      incarnation: cdata {generation = 1569353431853325ULL, version = 1ULL}
      uuid: 00000000-0000-1000-8000-000000000001
      payload_size: 0
...
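
A typical use of the iterator is to filter the member table. The sketch below collects the URIs of all members whose status is ‘alive’. The function alive_uris is a hypothetical helper, and it finishes its single loop before returning, so only one iterator is in operation at a time.

```lua
-- Hypothetical helper: sorted list of URIs of alive members.
-- s is a configured swim object.
local function alive_uris(s)
    local uris = {}
    for _, member in s:pairs() do
        if member:status() == 'alive' then
            uris[#uris + 1] = member:uri()
        end
    end
    table.sort(uris)
    return uris
end

-- Possible use inside Tarantool:
-- alive_uris(s1)
```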

object swim_member_object

Methods swim_object:member_by_uuid(), swim_object:self(), and swim_object:pairs() return swim member objects.

A swim member object has methods for reading its attributes: status(), uuid(), uri(), incarnation(), payload_cdata(), payload_str(), payload(), is_dropped().

swim_member_object:status()

Return the status, which may be ‘alive’, ‘suspected’, ‘left’, or ‘dead’.

Return:	string ‘alive’ | ‘suspected’ | ‘left’ | ‘dead’

swim_member_object:uuid()

Return the UUID as cdata struct tt_uuid.

Return:	cdata-struct-tt-uuid UUID

swim_member_object:uri()

Return the URI as a string ‘ip:port’. Via this method a user can learn a real assigned port, if port = 0 was specified in swim_object:cfg().

Return:	string ip:port

swim_member_object:incarnation()

Return a cdata object with the incarnation. The cdata object has two attributes: incarnation().generation and incarnation().version.

Incarnations can be compared to each other with any comparison operator (==, <, >, <=, >=, ~=).

Incarnations, when printed, will appear as strings with both generation and version.

Return:	cdata incarnation
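
To illustrate what comparing incarnations means, here is a model in plain Lua tables. It assumes that generation is compared first and version second. That ordering is an assumption made for this sketch, and incarnation_cmp is a hypothetical function. With real incarnation cdata objects the built-in comparison operators should be used instead.

```lua
-- Model of incarnation ordering on plain tables (illustrative only).
-- Returns -1 if a is older than b, 1 if newer, 0 if equal.
local function incarnation_cmp(a, b)
    if a.generation ~= b.generation then
        return a.generation < b.generation and -1 or 1
    end
    if a.version ~= b.version then
        return a.version < b.version and -1 or 1
    end
    return 0
end

local before_restart = {generation = 100, version = 7}
local after_restart = {generation = 101, version = 1}
-- Under the stated assumption, a restarted instance (greater generation)
-- is newer even though its version counter is smaller.
print(incarnation_cmp(before_restart, after_restart))
```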

swim_member_object:payload_cdata()

Return member’s payload.

Return:	pointer-to-cdata payload and size in bytes

swim_member_object:payload_str()

Return payload as a string object. Payload is not decoded. It is just returned as a string instead of cdata. If payload was not specified by swim_object:set_payload() or by swim_object:set_payload_raw(), then its size is 0 and nil is returned.

Return:	string-object payload, or nil if there is no payload

swim_member_object:payload()

Since the swim module is a Lua module, a user is likely to use Lua objects as a payload – tables, numbers, strings etc. And it is natural to expect that swim_member_object:payload() should return the same object which was passed into swim_object:set_payload() by another instance. swim_member_object:payload() tries to interpret payload as MessagePack, and if that fails then it returns the payload as a string.

swim_member_object:payload() caches its result. Therefore only the first call actually decodes cdata payload. All following calls return a pointer to the same result, unless payload is changed with a new incarnation. If payload was not specified (its size is 0), then nil is returned.

swim_member_object:is_dropped()

Returns true if this member object is a stray reference to a member which has already been dropped from the member table.

Return:	boolean true if member is dropped, otherwise false

Example:

tarantool> swim = require('swim')
---
...
tarantool> s = swim.new({uri = 0, uuid = '00000000-0000-1000-8000-000000000001'})
---
...
tarantool> self = s:self()
---
...
tarantool> self:status()
---
- alive
...
tarantool> self:uuid()
---
- 00000000-0000-1000-8000-000000000001
...
tarantool> self:uri()
---
- 127.0.0.1:56367
...
tarantool> self:incarnation()
---
- cdata {generation = 1569354463981551ULL, version = 1ULL}
...
tarantool> self:is_dropped()
---
- false
...
tarantool> s:set_payload_raw('123')
---
- true
...
tarantool> self:payload_cdata()
---
- 'cdata<const char *>: 0x0103500050'
- 3
...
tarantool> self:payload_str()
---
- '123'
...
tarantool> s:set_payload({a = 100})
---
- true
...
tarantool> self:payload_cdata()
---
- 'cdata<const char *>: 0x0103500050'
- 4
...
tarantool> self:payload_str()
---
- !!binary gaFhZA==
...
tarantool> self:payload()
---
- {'a': 100}
...

swim_object:on_member_event(trigger-function[, ctx])

Create an “on_member trigger”. The trigger-function will be executed when a member in the member table is updated.

Parameters:
trigger-function (`function`) – this will become the trigger function
ctx (`cdata`) – (optional) this will be passed to trigger-function
Return:	nil or function pointer.

The trigger-function should have three parameter declarations (Tarantool will pass values for them when it invokes the function):

the member which is having the member event,
the event object,
the ctx which will be the same value as what is passed to swim_object:on_member_event.

A member event is any of:

appearance of a new member,
drop of an existing member, or
update of an existing member.

An event object is an object which the trigger-function can use for determining what type of member event has happened. The object’s methods – such as is_new_status(), is_new_uri(), is_new_incarnation(), is_new_payload(), is_drop() – return boolean values.

A member event may have more than one associated trigger. Triggers are executed sequentially. Therefore if a trigger function causes yields or sleeps, other triggers may be forced to wait. However, since trigger execution is done in a separate fiber, SWIM itself is not forced to wait.

Example of an on-member trigger function:

tarantool> swim = require('swim')

local function on_event(member, event, ctx)
    if event:is_new() then
        ...
    elseif event:is_drop() then
        ...
    end

    if event:is_update() then
        -- All next conditions can be
        -- true simultaneously.
        if event:is_new_status() then
            ...
        end
        if event:is_new_uri() then
            ...
        end
        if event:is_new_incarnation() then
            ...
        end
        if event:is_new_payload() then
            ...
        end
    end
end

Notice in the above example that the function is ready for the possibility that multiple events can happen simultaneously for a single trigger activation. is_new() and is_drop() cannot both be true, but is_new() and is_update() can both be true, or is_drop() and is_update() can both be true. Multiple simultaneous events are especially likely if there are many events and trigger functions are slow. In that case, for example, a member might be added and then updated after a while, and then after a while there will be a single trigger activation.

Also: is_new() and is_new_payload() can both be true. This case is not due to trigger functions that are slow. It occurs because “omitted payload” and “size-zero payload” are not the same thing. For example: when a ping is received, a new member might be added, but ping messages do not include payload. The payload will appear later in a different message. If that is important for the application, then the function should not assume when is_new() is true that the member already has a payload, and should not assume that payload size says something about the payload’s presence or absence.

Also: functions should not assume that is_new() and is_drop() will always be seen. If a new member appears but then is dropped before its appearance has caused a trigger activation, then there will be no trigger activation.
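
With these caveats in mind, a trigger that keeps a local view of alive members can rely on the member’s current state rather than on seeing every is_new() and is_drop(). The following is a sketch only: make_tracker is a hypothetical helper, and converting the UUID with tostring() is assumed to give a usable table key.

```lua
-- Hypothetical helper: returns a trigger function and the table it maintains.
-- The table maps UUID strings to URIs of members currently seen as alive.
local function make_tracker()
    local alive = {}
    local function on_event(member, event)
        local key = tostring(member:uuid())
        if event:is_drop() then
            alive[key] = nil
        elseif member:status() == 'alive' then
            -- Covers new members as well as updates of uri or status.
            alive[key] = member:uri()
        else
            -- 'suspected', 'dead', or 'left'
            alive[key] = nil
        end
    end
    return on_event, alive
end

-- Possible use inside Tarantool:
-- local on_event, alive = make_tracker()
-- s:on_member_event(on_event)
```

Because the function reads the current status on every activation, a combined activation (for example new and updated at once) leaves the table in the same state as two separate activations would.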

is_new_generation() will be true if the generation part of incarnation changes. is_new_version() will be true if the version part of incarnation changes. is_new_incarnation() will be true if either the generation part or the version part of incarnation changes. For example, a combination of these methods can be used within a user-defined trigger to check whether a process has restarted, or a member has changed:

swim = require('swim')
s = swim.new()
s:on_member_event(function(m, e)
    ...
    if e:is_new_incarnation() then
        if e:is_new_generation() then
            -- Process restart.
        end
        if e:is_new_version() then
            -- Process version update. It means
            -- the member is somehow changed.
        end
    end
end)

swim_object:on_member_event(nil, old-trigger)

Delete an on-member trigger.

Parameters:	old-trigger (`function`) – old-trigger

The old-trigger value should be the value returned by on_member_event(trigger-function[, ctx]).

swim_object:on_member_event(new-trigger, old-trigger[, ctx])

This is a variation of on_member_event(new-trigger[, ctx]).

The additional parameter is old-trigger. Instead of adding the new-trigger at the end of a list of triggers, this function will replace the entry in the list of triggers that matches old-trigger. The position within a list may be important because triggers are activated sequentially starting with the first trigger in the list.

The old-trigger value should be the value returned by on_member_event(trigger-function[, ctx]).

swim_object:on_member_event(): Return the list of on-member triggers.

SWIM internals

The SWIM internals section is not necessary for programmers who wish to use the SWIM module; it is for programmers who wish to change or replace the SWIM module.

The SWIM wire protocol is open, will be backward compatible in case of any changes, and can be implemented by users who wish to simulate their own SWIM cluster members because they use a language other than Lua, or an environment unrelated to Tarantool. The protocol is encoded as MessagePack.

SWIM packet structure:

+-----------------Public data, not encrypted------------------+
|                                                             |
|      Initial vector, size depends on chosen algorithm.      |
|                   Next data is encrypted.                   |
|                                                             |
+----------Meta section, handled by transport level-----------+
| map {                                                       |
|     0 = SWIM_META_TARANTOOL_VERSION: uint, Tarantool        |
|                                      version ID,            |
|     1 = SWIM_META_SRC_ADDRESS: uint, ip,                    |
|     2 = SWIM_META_SRC_PORT: uint, port,                     |
|     3 = SWIM_META_ROUTING: map {                            |
|         0 = SWIM_ROUTE_SRC_ADDRESS: uint, ip,               |
|         1 = SWIM_ROUTE_SRC_PORT: uint, port,                |
|         2 = SWIM_ROUTE_DST_ADDRESS: uint, ip,               |
|         3 = SWIM_ROUTE_DST_PORT: uint, port                 |
|     }                                                       |
| }                                                           |
+-------------------Protocol logic section--------------------+
| map {                                                       |
|     0 = SWIM_SRC_UUID: 16 byte UUID,                        |
|                                                             |
|                 AND                                         |
|                                                             |
|     2 = SWIM_FAILURE_DETECTION: map {                       |
|         0 = SWIM_FD_MSG_TYPE: uint, enum swim_fd_msg_type,  |
|         1 = SWIM_FD_GENERATION: uint,                       |
|         2 = SWIM_FD_VERSION: uint                           |
|     },                                                      |
|                                                             |
|               OR/AND                                        |
|                                                             |
|     3 = SWIM_DISSEMINATION: array [                         |
|         map {                                               |
|             0 = SWIM_MEMBER_STATUS: uint,                   |
|                                     enum member_status,     |
|             1 = SWIM_MEMBER_ADDRESS: uint, ip,              |
|             2 = SWIM_MEMBER_PORT: uint, port,               |
|             3 = SWIM_MEMBER_UUID: 16 byte UUID,             |
|             4 = SWIM_MEMBER_GENERATION: uint,               |
|             5 = SWIM_MEMBER_VERSION: uint,                  |
|             6 = SWIM_MEMBER_PAYLOAD: bin                    |
|         },                                                  |
|         ...                                                 |
|     ],                                                      |
|                                                             |
|               OR/AND                                        |
|                                                             |
|     1 = SWIM_ANTI_ENTROPY: array [                          |
|         map {                                               |
|             0 = SWIM_MEMBER_STATUS: uint,                   |
|                                     enum member_status,     |
|             1 = SWIM_MEMBER_ADDRESS: uint, ip,              |
|             2 = SWIM_MEMBER_PORT: uint, port,               |
|             3 = SWIM_MEMBER_UUID: 16 byte UUID,             |
|             4 = SWIM_MEMBER_GENERATION: uint,               |
|             5 = SWIM_MEMBER_VERSION: uint,                  |
|             6 = SWIM_MEMBER_PAYLOAD: bin                    |
|         },                                                  |
|         ...                                                 |
|     ],                                                      |
|                                                             |
|               OR/AND                                        |
|                                                             |
|     4 = SWIM_QUIT: map {                                    |
|         0 = SWIM_QUIT_GENERATION: uint,                     |
|         1 = SWIM_QUIT_VERSION: uint                         |
|     }                                                       |
| }                                                           |
+-------------------------------------------------------------+

The Initial vector section appears only when encryption is enabled. This section contains a public key. For example, for AES algorithms it is a 16-byte initial vector stored as is. When no encryption is used, the section size is 0.

The later sections (Meta and Protocol Logic) are encrypted as one big data chunk if encryption is enabled.

The Meta section handles routing and protocol versions compatibility. It works at the ‘transport’ level of the SWIM protocol, and is always present. Keys in the meta section are:

SWIM_META_TARANTOOL_VERSION – mandatory field. Tarantool sets here its version as a 3 byte integer:
- 1 byte for major,
- 1 byte for minor,
- 1 byte for patch.
For example, Tarantool version 2.1.3 would be encoded like this: (((2 << 8) | 1) << 8) | 3;. This field will be used to support multiple versions of the protocol.
SWIM_META_SRC_ADDRESS and SWIM_META_SRC_PORT – mandatory. Source IP address and port. The IP address is encoded as 4 bytes, one byte for each ‘xxx’ part of “xxx.xxx.xxx.xxx”. The port is encoded as an integer. Example of how to encode “127.0.0.1:3313”:
```
/* Needs <arpa/inet.h>; pos points into the output buffer. */
struct in_addr addr;
inet_aton("127.0.0.1", &addr);
pos = mp_encode_uint(pos, SWIM_META_SRC_ADDRESS);
pos = mp_encode_uint(pos, addr.s_addr);
pos = mp_encode_uint(pos, SWIM_META_SRC_PORT);
pos = mp_encode_uint(pos, 3313);
```
SWIM_META_ROUTING subsection – not mandatory. Responsible for packet forwarding. Used by SWIM suspicion mechanism. Read about suspicion in the SWIM paper.

If this subsection is present then the following fields are mandatory:
- SWIM_ROUTE_SRC_ADDRESS and SWIM_ROUTE_SRC_PORT (source IP address and port): the address of the message originator, which can differ from SWIM_META_SRC_ADDRESS and SWIM_META_SRC_PORT;
- SWIM_ROUTE_DST_ADDRESS and SWIM_ROUTE_DST_PORT (destination IP address and port): the message’s final destination.
If a message was sent indirectly with the help of SWIM_META_ROUTING, then the reply should be sent back by the same route.
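
The SWIM_META_TARANTOOL_VERSION encoding described above can be checked with a few lines of plain Lua. The function names are made up for this sketch, and ordinary arithmetic is used instead of bit operations so that the code does not depend on a particular Lua version.

```lua
-- Pack major.minor.patch into one integer, one byte per component.
-- Equivalent to (((major << 8) | minor) << 8) | patch.
local function encode_version(major, minor, patch)
    return (major * 256 + minor) * 256 + patch
end

-- Unpack the integer back into its three components.
local function decode_version(id)
    local patch = id % 256
    local minor = math.floor(id / 256) % 256
    local major = math.floor(id / 65536)
    return major, minor, patch
end

local id = encode_version(2, 1, 3)   -- Tarantool 2.1.3
print(id, decode_version(id))
```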

As an example of how SWIM uses routing for indirect pings, assume there are 3 nodes: S1, S2, S3. S1 sends a message to S3 via S2. The following steps are executed in order to deliver the message:
```
S1 -> S2
{ src: S1, routing: {src: S1, dst: S3}, body: ... }
```
S2 receives the message and sees that routing.dst is not equal to S2, so it is a foreign packet. S2 forwards the packet to S3 preserving all the data including body and routing sections.
```
S2 -> S3
```
S3 receives the message and sees that routing.dst is equal to S3, so the message is delivered. If S3 wants to answer, it sends a response via the same proxy. It knows that the message was delivered from S2, so it sends an answer via S2.
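
The forwarding decision in the steps above can be sketched in Lua as follows. This is an illustration only: the packet is modelled as a plain table with the same src, routing, and body names used in the diagram, and the function name next_hop is hypothetical. The sketch also assumes that a forwarding node puts its own identity into src, which is how S3 would know that the packet arrived from S2.

```lua
-- Decide what a node does with an incoming packet.
-- Returns 'deliver' plus the packet, or 'forward' plus the packet to
-- resend and the node to send it to.
local function next_hop(self_id, packet)
    local routing = packet.routing
    if routing ~= nil and routing.dst ~= self_id then
        -- Foreign packet: routing and body are preserved; only the
        -- transport-level src changes (an assumption of this sketch).
        local forwarded = {src = self_id, routing = routing, body = packet.body}
        return 'forward', forwarded, routing.dst
    end
    -- No routing section, or routing.dst is this node.
    return 'deliver', packet
end

local from_s1 = {src = 'S1', routing = {src = 'S1', dst = 'S3'}, body = 'ping'}
local action, forwarded, target = next_hop('S2', from_s1)
print(action, target, forwarded.src)  -- forward S3 S2
print((next_hop('S3', forwarded)))    -- deliver
```

When S3 replies, it would send the answer to forwarded.src (S2) with the routing source and destination swapped, so that the reply travels back by the same route.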

The Protocol logic section handles SWIM logical protocol steps and actions.

SWIM_SRC_UUID – mandatory field. SWIM uses UUID as a unique identifier of a member, not IP/port. This field stores UUID of sender. Its type is MP_BIN. Size is always 16 bytes. UUID is encoded in host byte order, no bswaps are needed.

Following SWIM_SRC_UUID there are four possible subsections: SWIM_FAILURE_DETECTION, SWIM_DISSEMINATION, SWIM_ANTI_ENTROPY, SWIM_QUIT. Any or all of these subsections may be present. A connector should be ready to handle any combination.

SWIM_FAILURE_DETECTION subsection – describes a ping or ACK. In the SWIM_FAILURE_DETECTION subsection are:
- SWIM_FD_MSG_TYPE (0 is ping, 1 is ack);
- SWIM_FD_GENERATION + SWIM_FD_VERSION (the incarnation).
SWIM_DISSEMINATION subsection – a list of changed cluster members. It may include only a subset of changed cluster members if there are too many changes to fit into one UDP packet.

In the SWIM_DISSEMINATION subsection are:
- SWIM_MEMBER_STATUS (mandatory) (0 = alive, 1 = suspected, 2 = dead, 3 = left);
- SWIM_MEMBER_ADDRESS and SWIM_MEMBER_PORT (mandatory) member IP and port;
- SWIM_MEMBER_UUID (mandatory) (member UUID);
- SWIM_MEMBER_GENERATION + SWIM_MEMBER_VERSION (mandatory) (the member incarnation);
- SWIM_MEMBER_PAYLOAD (not mandatory) (member payload) (MessagePack type is MP_BIN).
Note that the absence of SWIM_MEMBER_PAYLOAD says nothing about the payload: it is not the same as a payload with zero size.
SWIM_ANTI_ENTROPY subsection – a helper for the dissemination. It contains all the same fields as the dissemination subsection, but all of them are mandatory, including the payload even when the payload size is 0. Anti-entropy eventually spreads changes which, for any reason, were not spread by the dissemination.
SWIM_QUIT subsection – statement that the sender has left the cluster gracefully, for example via swim_object:quit(), and should not be considered dead. Sender status should be changed to ‘left’.

In the SWIM_QUIT subsection are:
- SWIM_QUIT_GENERATION + SWIM_QUIT_VERSION (the sender incarnation).

The incarnation is a 128-bit cdata value which is part of each member’s configuration and is present in most messages. It has two parts: generation and version.

Generation is persistent. By default it is set to the number of microseconds since the epoch (compare the value returned by clock_realtime64()). Optionally, a user can set the generation during new().

Version is volatile. It is initially 0. It is incremented automatically every time that a change occurs.

The incarnation, or sometimes the version alone, is useful for deciding to ignore obsolete messages, for updating a member’s attributes on remote nodes, and for refuting messages that say a member is dead.

If the member’s incarnation in a message is less than the locally stored incarnation, then the message is obsolete. This can happen because UDP allows reordering and duplication.

If the member’s incarnation in a message is greater than the locally stored incarnation, then most of its attributes (IP, port, status) should be updated with the values received in the message. However, the payload attribute should not be updated unless it is present in the message. Because of its relatively large size, payload is not always included in every message.
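
The two rules above can be sketched in Lua. This is an illustration under stated assumptions, not the Tarantool implementation: incarnations are modelled as tables with generation and version fields, they are assumed to be compared by generation first and by version second, and the names incarnation_cmp and apply_message are hypothetical.

```lua
-- Compare two incarnations: -1 if a is older, 0 if equal, 1 if a is newer.
-- Assumption: generation is compared first, version second.
local function incarnation_cmp(a, b)
    if a.generation ~= b.generation then
        return a.generation < b.generation and -1 or 1
    end
    if a.version ~= b.version then
        return a.version < b.version and -1 or 1
    end
    return 0
end

-- Apply a received member record to the locally stored one.
-- Returns true if the stored record was updated.
local function apply_message(stored, received)
    local cmp = incarnation_cmp(received.incarnation, stored.incarnation)
    if cmp <= 0 then
        -- cmp < 0: obsolete message (UDP reordering or duplication).
        -- cmp == 0: not covered by the text above; this sketch simply
        -- leaves the stored record unchanged.
        return false
    end
    stored.incarnation = received.incarnation
    stored.address = received.address
    stored.port = received.port
    stored.status = received.status
    -- The payload is updated only when the message carries one.
    if received.payload ~= nil then
        stored.payload = received.payload
    end
    return true
end

local stored = {incarnation = {generation = 10, version = 2},
                address = '127.0.0.1', port = 3301, status = 'alive',
                payload = 'old payload'}
local message = {incarnation = {generation = 10, version = 3},
                 address = '127.0.0.1', port = 3302, status = 'suspected'}
print(apply_message(stored, message), stored.port, stored.payload)
-- true 3302 old payload
```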

Refutation usually happens when a false-positive failure detection has happened. In such a case the member thought to be dead receives that information from other members, increases its own incarnation, and spreads a message saying the member is alive (a “refutation”).

Note: in the original version of Tarantool SWIM, and in the original SWIM specification, there is no generation and the incarnation consists of only the version. Generation was added because it is useful for detecting obsolete messages left over from a previous life of an instance that has restarted.

Module table

The table module has everything in the standard Lua table library, and some Tarantool extensions.

Type table in the console to see the list of functions:

clear (LuaJIT extension = erase all elements)
concat (concatenate)
copy (make a copy of an array)
deepcopy (see the description below)
foreach
foreachi
getn (get the number of elements in an array)
insert (insert an element into an array)
maxn (get the largest index)
move (move elements between tables)
new (LuaJIT extension = return a new table with pre-allocated elements)
remove (remove an element from an array)
sort (sort the elements of an array)

In this section we only discuss the additional function that the Tarantool developers have added, deepcopy, and a Tarantool-specific alternative to sort.

table.deepcopy(input-table)¶

Return a “deep” copy of the table – a copy which follows nested structures to any depth and does not depend on pointers: it copies the contents.

Parameters:	input-table – (table) the table to copy
Return:	the copy of the table
Rtype:	table

Example:

tarantool> input_table = {1,{'a','b'}}
---
...

tarantool> output_table = table.deepcopy(input_table)
---
...

tarantool> output_table
---
- - 1
  - - a
    - b
...
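
To see what “deep” means in practice, the result can be compared with a shallow copy made by table.copy() from the list above. A small sketch, assuming it runs inside Tarantool; the variable names are arbitrary.

```lua
local original = {1, {'a', 'b'}}
local shallow = table.copy(original)   -- the nested table is shared
local deep = table.deepcopy(original)  -- the nested table is duplicated

-- Change the nested table through the original.
original[2][1] = 'changed'

print(shallow[2][1])  -- changed
print(deep[2][1])     -- a
```

The shallow copy sees the change because its second element is the same table as in the original, while the deep copy keeps its own contents.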

table.sort(input-table[, comparison-function])¶

Put the input-table contents in sorted order.

The basic Lua table.sort has a default comparison function: function (a, b) return a < b end.

That is efficient and standard. However, sometimes Tarantool users will want an equivalent to table.sort which has any of these features:

If the table contains nils, except nils at the end, the results must still be correct. That is not the case with the default table.sort, and it cannot be fixed by making a comparison that checks whether a and b are nil. (Before trying certain Internet suggestions, test with {1, nil, 2, -1, 44, 1e308, nil, 2, nil, nil, 0}.)
If strings are to be sorted in a language-aware way, there must be a parameter for collation.
If the table has a mix of types, then they must be sorted as booleans, then numbers, then strings, then byte arrays.

Since all those features are available in Tarantool spaces, the solution for Tarantool is simple: make a temporary Tarantool space, put the table contents into it, retrieve the tuples from it in order, and overwrite the table.

Here then is tarantool_sort() which does the same thing as table.sort but has those extra features. It is not fast and it requires a database privilege, so it should only be used if the extra features are necessary.

Example:

function tarantool_sort(input_table, collation)
    local c = collation or 'binary'
    local tmp_name = 'Temporary_for_tarantool_sort'
    pcall(function() box.space[tmp_name]:drop() end)
    box.schema.space.create(tmp_name, {temporary = true})
    box.space[tmp_name]:create_index('I')
    box.space[tmp_name]:create_index('I2',
                                     {unique = false,
                                      type='tree',
                                      parts={{2, 'scalar',
                                              collation = c,
                                              is_nullable = true}}})
    for i = 1, table.maxn(input_table) do
        box.space[tmp_name]:insert{i, input_table[i]}
    end
    local t = box.space[tmp_name].index.I2:select()
    for i = 1, table.maxn(input_table) do
        input_table[i] = t[i][2]
    end
    box.space[tmp_name]:drop()
end

For example, suppose table t = {1, 'A', -88.3, nil, true, 'b', 'B', nil, 'À'}.

After tarantool_sort(t, 'unicode_ci') t contains {nil, nil, true, -88.3, 1, 'A', 'À', 'b', 'B'}.

Module tap

Overview

The tap module streamlines the testing of other modules. It allows writing of tests in the TAP protocol. The results from the tests can be parsed by standard TAP-analyzers so they can be passed to utilities such as prove. Thus, one can run tests and then use the results for statistics, decision-making, and so on.

API Reference

Name	Use
tap.test()	Initialize
taptest:test()	Create a subtest and print the results
taptest:plan()	Indicate how many tests to perform
taptest:check()	Check the number of tests performed
taptest:diag()	Display a diagnostic message
taptest:ok()	Evaluate the condition and display the message
taptest:fail()	Always fail the test and display the message
taptest:skip()	Mark the test as skipped and display the message
taptest:is()	Check if the two arguments are equal
taptest:isnt()	Check if the two arguments are different
taptest:is_deeply()	Recursively check if the two arguments are equal
taptest:like()	Check if the argument matches a pattern
taptest:unlike()	Check if the argument does not match a pattern
taptest:isnil() taptest:isstring() taptest:isnumber() taptest:istable() taptest:isboolean() taptest:isudata() taptest:iscdata()	Check if a value has a particular type
taptest.strict	Flag, true if comparisons with `nil` should be strict

tap.test(test-name)¶

Initialize.

The result of tap.test is an object, called taptest in the rest of this discussion, which is required for taptest:plan() and all the other methods.

Parameters:	test-name (`string`) – an arbitrary name to give for the test outputs.
Return:	taptest
Rtype:	table

tap = require('tap')
taptest = tap.test('test-name')

object taptest¶

taptest:test(test-name, func)¶

Create a subtest (if no func argument specified), or (if all arguments are specified) create a subtest, run the test function and print the result.

See the example.

Parameters:	test-name (`string`) – an arbitrary name to give for the test outputs; func (`function`) – the test logic to run.
Return:	taptest
Rtype:	userdata or string

taptest:plan(count)¶

Indicate how many tests will be performed.

Parameters:	count (`number`) – the number of tests that will be performed
Return:	nil

taptest:check()¶

Checks the number of tests performed.

The result will be a display saying # bad plan: ... if the number of completed tests is not equal to the number of tests specified by taptest:plan(...). (This is a purely Tarantool feature: “bad plan” messages are out of the TAP13 standard.)

This check should only be done after all planned tests are complete, so ordinarily taptest:check() will only appear at the end of a script. However, as a Tarantool extension, taptest:check() may appear at the end of any subtest. Therefore there are three ways to cause the check:

by calling taptest:check() at the end of a script,
by calling a function which ends with a call to taptest:check(),
or by calling taptest:test(‘…’, subtest-function-name) where subtest-function-name does not need to end with taptest:check() because it can be called after the subtest is complete.

Return:	true or false.
Rtype:	boolean

taptest:diag(message)¶

Display a diagnostic message.

Parameters:	message (`string`) – the message to be displayed.
Return:	nil

taptest:ok(condition, test-name)¶

This is a basic function which is used by other functions. Depending on the value of condition, print ‘ok’ or ‘not ok’ along with debugging information. Displays the message.

Parameters:	condition (`boolean`) – an expression which is true or false; test-name (`string`) – name of the test
Return:	true or false.
Rtype:	boolean

Example:

tarantool> tap = require('tap')
---
...
tarantool> taptest = tap.test('test-name')
TAP version 13
---
...
tarantool> taptest:ok(true, 'x')
ok - x
---
- true
...
tarantool> taptest:ok(1 + 1 == 2, 'X')
ok - X
---
- true
...

taptest:fail(test-name)¶

taptest:fail('x') is equivalent to taptest:ok(false, 'x'). Displays the message.

Parameters:	test-name (`string`) – name of the test
Return:	true or false.
Rtype:	boolean

taptest:skip(message)¶

taptest:skip('x') is equivalent to taptest:ok(true, 'x' .. ' # skip'). Displays the message.

Parameters:	message (`string`) – the message to be displayed
Return:	nil

Example:

tarantool> taptest:skip('message')
ok - message # skip
---
- true
...

taptest:is(got, expected, test-name)¶

Check whether the first argument equals the second argument. Displays extensive message if the result is false.

Parameters:	got (`number`) – actual result; expected (`number`) – expected result; test-name (`string`) – name of the test
Return:	true or false.
Rtype:	boolean

taptest:isnt(got, expected, test-name)¶

This is the negation of taptest:is().

Parameters:	got (`number`) – actual result; expected (`number`) – expected result; test-name (`string`) – name of the test
Return:	true or false.
Rtype:	boolean

taptest:is_deeply(got, expected, test-name)¶

Recursive version of taptest:is(...), which can be used to compare tables as well as scalar values.

Return:	true or false.
Rtype:	boolean
Parameters:	got (`lua-value`) – actual result; expected (`lua-value`) – expected result; test-name (`string`) – name of the test

taptest:like(got, expected, test-name)¶

Verify a string against a pattern. Ok if match is found.

Return:	true or false.
Rtype:	boolean
Parameters:	got (`lua-value`) – actual result; expected (`lua-value`) – pattern; test-name (`string`) – name of the test

test:like(tarantool.version, '^[1-9]', "version")

taptest:unlike(got, expected, test-name)¶

This is the negation of taptest:like().

Parameters:	got (`lua-value`) – actual result; expected (`lua-value`) – pattern; test-name (`string`) – name of the test
Return:	true or false.
Rtype:	boolean

taptest:isnil(value, message, extra)¶

taptest:isstring(value, message, extra)¶

taptest:isnumber(value, message, extra)¶

taptest:istable(value, message, extra)¶

taptest:isboolean(value, message, extra)¶

taptest:isudata(value, utype, message, extra)¶

taptest:iscdata(value, ctype, message, extra)¶

Test whether a value has a particular type. Displays a long message if the value is not of the specified type.

Parameters:	value (`lua-value`) – value the type of which is to be checked; utype (`string`) – type of data that a passed value should have; ctype (`string`) – type of data that a passed value should have; message (`string`) – text that will be shown to the user in case of failure
Return:	true or false.
Rtype:	boolean

test:iscdata(slab_info.quota_size, ffi.typeof('uint64_t'), 'memcached.slab.info().quota_size returns a cdata')
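
The comparison and type-checking methods described above can be combined in one script. The following sketch assumes it is run by Tarantool; the test names and values are arbitrary.

```lua
local tap = require('tap')
local test = tap.test('comparison and type checks')
test:plan(5)

test:is(2 + 2, 4, 'scalars are equal')
test:is_deeply({a = 1, b = {2, 3}}, {a = 1, b = {2, 3}}, 'nested tables are equal')
test:like('tarantool', '^tar', 'string matches the pattern')
test:isstring('abc', 'value is a string')
test:isnil(nil, 'value is nil')

-- check() reports a bad plan if the number of completed tests is not 5.
local plan_ok = test:check()
```

The value returned by check() can be used to choose the exit code of a test script, for example with os.exit(plan_ok and 0 or 1).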

taptest.strict¶

Set taptest.strict=true if taptest:is(), taptest:isnt(), and taptest:is_deeply() must compare strictly with nil. Set taptest.strict=false if nil and box.NULL should be treated as equal.

The default is false. For example, if and only if taptest.strict=true has been set, then taptest:is_deeply({a = box.NULL}, {}) will return false.

Since v. 2.8.3, taptest.strict is inherited in all subtests:

t = require('tap').test('123')
t.strict = true

t:is_deeply({a = box.NULL}, {}) -- false

t:test('subtest', function(t)
    t:is_deeply({a = box.NULL}, {}) -- also false
end)

Example

To run this example: put the script in a file named ./tap.lua, then make tap.lua executable by saying chmod a+x ./tap.lua, then execute using Tarantool as a script processor by saying ./tap.lua.

#!/usr/bin/tarantool
local tap = require('tap')
test = tap.test("my test name")
test:plan(2)
test:ok(2 * 2 == 4, "2 * 2 is 4")
test:test("some subtests for test2", function(test)
    test:plan(2)
    test:is(2 + 2, 4, "2 + 2 is 4")
    test:isnt(2 + 3, 4, "2 + 3 is not 4")
end)
test:check()

The output from the above script will look approximately like this:

TAP version 13
1..2
ok - 2 * 2 is 4
    # some subtests for test2
    1..2
    ok - 2 + 2 is 4
    ok - 2 + 3 is not 4
    # some subtests for test2: end
ok - some subtests for test2

Module tarantool

By saying require('tarantool'), one can answer some questions about how the tarantool server was built, such as “what flags were used”, or “what was the version of the compiler”.

Additionally one can see the uptime and the server version and the process id. Those information items can also be accessed with box.info() but use of the tarantool module is recommended.

Example:

tarantool> tarantool = require('tarantool')
---
...
tarantool> tarantool
---
- version: 2.10.4-0-g816000e
  build:
    target: Darwin-x86_64-Release
    options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/tarantool/2.10.4 -DENABLE_BACKTRACE=ON
    linking: dynamic
    mod_format: dylib
    flags: ' -fexceptions -funwind-tables -fno-common -fopenmp -msse2 -Wformat -Wformat-security
      -Werror=format-security -fstack-protector-strong -fPIC -fmacro-prefix-map=/tmp/tarantool-20221113-6655-1clb1lj/tarantool-2.10.4=.
      -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation
      -Wno-gnu-alignof-expression -Wno-cast-function-type'
    compiler: Clang-14.0.0.14000029
  pid: 'function: 0x0102df34f8'
  package: Tarantool
  uptime: 'function: 0x0102df34c0'
...
tarantool> tarantool.pid()
---
- 30155
...
tarantool> tarantool.uptime()
---
- 108.64641499519
...

Tarantool includes parts of the tzdata package and uses its database for correct time zone support. Since 3.2.0, you can get the version of tzdata in use:

tarantool> tarantool.build.tzdata_version
---
- 2022a
...

Module uri

Overview

The URI module provides functions that convert URI strings into their components, or turn components into URI strings, for example:

local uri = require('uri')

parsed_uri = uri.parse('https://www.tarantool.io/doc/latest/reference/reference_lua/http/#api-reference')
--[[
---
- host: www.tarantool.io
  fragment: api-reference
  scheme: https
  path: /doc/latest/reference/reference_lua/http/
...
--]]

formatted_uri = uri.format({ scheme = 'https',
                             host = 'www.tarantool.io',
                             path = '/doc/latest/reference/reference_lua/http/',
                             fragment = 'api-reference' })
--[[
---
- https://www.tarantool.io/doc/latest/reference/reference_lua/http/#api-reference
...
--]]

To escape and unescape special characters, corresponding functions must be used:

formatted_uri3_e = uri.format({
    login = uri.escape('replic@ator'),
    password = uri.escape(':::'),
    host = 'foo.bar',
    params = {x = uri.escape('sec ret?', uri.FORM_URLENCODED)}
    },
    true
)
--[[
---
- replic%40ator:%3A%3A%3A@foo.bar?x=sec+ret%3F
...
]]--

parsed_uri3_e = uri.parse(formatted_uri3_e)
--[[
---
- password: '%3A%3A%3A'
  login: replic%40ator
  query: x=sec+ret%3F
  params:
    x:
    - sec+ret%3F
  host: foo.bar
...
]]--

parsed_uri3 = {
    login = uri.unescape(parsed_uri3_e.login),
    password = uri.unescape(parsed_uri3_e.password),
    host = parsed_uri3_e.host,
    params = {x = {uri.unescape(parsed_uri3_e.params.x[1], uri.FORM_URLENCODED)}},
}
--[[
---
- password: ':::'
  params:
    x:
    - sec ret?
  host: foo.bar
  login: replic@ator
...
]]--

You can also use this module to encode and decode arbitrary strings using the specified encoding options.

API Reference

Below is a list of uri functions, properties, and related objects.

Functions
uri.parse()	Get a table of URI components
uri.format()	Construct a URI from the specified components
uri.escape()	Encode a string using the specified encoding options
uri.unescape()	Decode a string using the specified encoding options
Properties
uri.RFC3986	Encoding options that use unreserved symbols defined in RFC 3986
uri.PATH	Options used to encode the `path` URI component
uri.PATH_PART	Options used to encode specific `path` parts
uri.QUERY	Options used to encode the `query` URI component
uri.QUERY_PART	Options used to encode specific `query` parts
uri.FRAGMENT	Options used to encode the `fragment` URI component
uri.FORM_URLENCODED	Options used to encode `application/x-www-form-urlencoded` form parameters
Related objects
uri_components	URI components
uri_encoding_opts	URI encoding options

Functions

uri.parse(uri-string | uri-table)¶

Parse a URI string into components.

See also: uri.format()

Parameters:	uri-string (`string`) – a URI string; uri-table (`table`) – a URI table with a URI string and an optional override of URI query parameters. The URI string table key must be `'uri'` or `1` (the first array-like element). The override of URI query parameters must be given in the `'params'` element of the table.
Return:	a URI components table (see uri_components)
Rtype:	table

Example:

local uri = require('uri')

parsed_uri = uri.parse('https://www.tarantool.io/doc/latest/reference/reference_lua/http/#api-reference')
--[[
---
- host: www.tarantool.io
  fragment: api-reference
  scheme: https
  path: /doc/latest/reference/reference_lua/http/
...
--]]

parsed_uri11 = uri.parse({'foo.bar?x=1&x=2'})
parsed_uri12 = uri.parse({uri = 'foo.bar?x=1&x=2'})
--[[
---
- host: foo.bar
  params:
    x:
    - '1'
    - '2'
  query: x=1&x=2
...
--]]

parsed_uri21 = uri.parse({'foo.bar?x=1', params = {x = 2, y = 3}})
parsed_uri22 = uri.parse({uri = 'foo.bar?x=1', params = {x = 2, y = 3}})
--[[
---
- host: foo.bar
  params:
    y:
    - '3'
    x:
    - '2'
  query: x=1
...
--]]

uri.format(uri_components[, include_password])¶

Construct a URI from the specified components.

See also: uri.parse()

Parameters:	uri_components (`table`) – a series of `name=value` pairs, one for each component (see uri_components); include_password (`boolean`) – if true, the password component is rendered in clear text, otherwise it is omitted
Return:	URI string
Rtype:	string

Example:

local uri = require('uri')

formatted_uri = uri.format({ scheme = 'https',
                             host = 'www.tarantool.io',
                             path = '/doc/latest/reference/reference_lua/http/',
                             fragment = 'api-reference' })
--[[
---
- https://www.tarantool.io/doc/latest/reference/reference_lua/http/#api-reference
...
--]]

parsed_uri12 = uri.parse({uri = 'foo.bar?x=1&x=2'})
formatted_uri1 = uri.format(parsed_uri12)
--[[
---
- foo.bar?x=1&x=2
...
--]]

parsed_uri21 = uri.parse({'foo.bar?x=1', params = {x = 2, y = 3}})
formatted_uri2 = uri.format(parsed_uri21)
--[[
---
- foo.bar?y=3&x=2
...
--]]

uri.escape(string[, uri_encoding_opts])¶

Since: 2.11.0

Encode a string using the specified encoding options.

By default, uri.escape() uses encoding options defined by the uri.RFC3986 table. If required, you can customize encoding options using the uri_encoding_opts optional parameter, for example:

Pass the predefined set of options targeted for encoding a specific URI part (for example, uri.PATH or uri.QUERY).
Pass custom encoding options using the uri_encoding_opts object.

Parameters:	string – a string to encode; uri_encoding_opts (`table`) – encoding options (optional, see uri_encoding_opts)
Return:	an encoded string
Rtype:	string

Example 1:

This example shows how to encode a string using the default encoding options.

local uri = require('uri')

escaped_string = uri.escape('C++')
--[[
---
- C%2B%2B
...
--]]

Example 2:

This example shows how to encode a string using the uri.FORM_URLENCODED encoding options.

local uri = require('uri')

escaped_string_url_enc = uri.escape('John Smith', uri.FORM_URLENCODED)
--[[
---
- John+Smith
...
--]]

Example 3:

This example shows how to encode a string using custom encoding options.

local uri = require('uri')

local escape_opts = {
    plus = true,
    unreserved = uri.unreserved("a-z")
}
escaped_string_custom = uri.escape('Hello World', escape_opts)
--[[
---
- '%48ello+%57orld'
...
--]]

uri.unescape(string[, uri_encoding_opts])¶

Since: 2.11.0

Decode a string using the specified encoding options.

By default, uri.unescape() uses encoding options defined by the uri.RFC3986 table. If required, you can customize encoding options using the uri_encoding_opts optional parameter, for example:

Pass the predefined set of options targeted for decoding a specific URI part (for example, uri.PATH or uri.QUERY).
Pass custom encoding options using the uri_encoding_opts object.

Parameters:	string – a string to decode; uri_encoding_opts (`table`) – encoding options (optional, see uri_encoding_opts)
Return:	a decoded string
Rtype:	string

Example 1:

This example shows how to decode a string using the default encoding options.

local uri = require('uri')

unescaped_string = uri.unescape('C%2B%2B')
--[[
---
- C++
...
--]]

Example 2:

This example shows how to decode a string using the uri.FORM_URLENCODED encoding options.

local uri = require('uri')

unescaped_string_url_enc = uri.unescape('John+Smith', uri.FORM_URLENCODED)
--[[
---
- John Smith
...
--]]

Example 3:

This example shows how to decode a string using custom encoding options.

local uri = require('uri')

local escape_opts = {
    plus = true,
    unreserved = uri.unreserved("a-z")
}
unescaped_string_custom = uri.unescape('%48ello+%57orld', escape_opts)
--[[
---
- Hello World
...
--]]

Properties

uri.RFC3986¶

Encoding options that use unreserved symbols defined in RFC 3986. These are default options used to encode and decode using the uri.escape() and uri.unescape() functions, respectively.

See also: uri_encoding_opts

Rtype:	table

uri.PATH¶

Options used to encode the path URI component.

See also: uri_encoding_opts

Rtype:	table

uri.PATH_PART¶

Options used to encode specific path parts.

See also: uri_encoding_opts

Rtype:	table

uri.QUERY¶

Options used to encode the query URI component.

See also: uri_encoding_opts

Rtype:	table

uri.QUERY_PART¶

Options used to encode specific query parts.

See also: uri_encoding_opts

Rtype:	table

uri.FRAGMENT¶

Options used to encode the fragment URI component.

See also: uri_encoding_opts

Rtype:	table

uri.FORM_URLENCODED¶

Options used to encode application/x-www-form-urlencoded form parameters.

See also: uri_encoding_opts

Rtype:	table

Related objects

uri_components

object uri_components¶

URI components. The uri_components object is used in the following functions:

The uri.parse() function returns the uri_components object.
The uri.format() function accepts the uri_components object as an argument.

uri_components.scheme¶

A URI scheme.

Examples: https, http

uri_components.login¶

A user name, which is a part of the userinfo subcomponent.

uri_components.password¶

A password, which is a part of the userinfo subcomponent.

uri_components.host¶

A host subcomponent.

Example: www.tarantool.io

uri_components.service¶

A service subcomponent. This property might return different values depending on the used URI scheme, for example:

If the https or http scheme is used, service returns the port value.
If the Unix domain socket is used, service returns the socket path.

Examples: 3301, /tmp/unix.sock

uri_components.path¶

A path component.

Example: /doc/latest/reference/reference_lua/http/

uri_components.query¶

A query component.

Example: key1=value1&key2=value2

uri_components.params¶

Parameters of a query component. Overrides query. The table elements may be strings or arrays of strings.

Example: {key1 = 'value1', key2 = 'value2', key3 = {'1', '2'}}

uri_components.fragment¶

A fragment component.

Example: api-reference

uri_components.ipv4¶

An IPv4 address.

Example: 127.0.0.1

uri_components.ipv6¶

An IPv6 address.

Example: 2a00:1148:b0ba:2016:12bf:48ff:fe78:fd10

uri_components.unix¶

A Unix domain socket.

Example: /tmp/unix.sock
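
The components above can be inspected by parsing a URI that contains most of them. A short sketch, assuming it runs inside Tarantool; the URI itself is made up for illustration.

```lua
local uri = require('uri')

local u = uri.parse('http://admin:secret@localhost:3301/status?verbose=1#top')

print(u.scheme)    -- http
print(u.login)     -- admin
print(u.password)  -- secret
print(u.host)      -- localhost
print(u.service)   -- 3301
print(u.path)      -- /status
print(u.query)     -- verbose=1
print(u.fragment)  -- top

-- Without include_password, uri.format() leaves the password out.
local without_password = uri.format(u)
print(without_password)
```

Passing true as the second argument of uri.format() keeps the password in the resulting string.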

uri_encoding_opts

object uri_encoding_opts¶

Since: 2.11.0

URI encoding options. These options can be passed to the uri.escape() and uri.unescape() functions.

Example:

The example below shows how to encode a string using custom encoding options.

local uri = require('uri')

local escape_opts = {
    plus = true,
    unreserved = uri.unreserved("a-z")
}
escaped_string_custom = uri.escape('Hello World', escape_opts)
--[[
---
- '%48ello+%57orld'
...
--]]

Note

The uri module also provides several sets of predefined options targeted for encoding a specific URI part (for example, uri.PATH or uri.QUERY).

uri_encoding_opts.plus¶

Encode the space character as + (and decode + as the space character). By default, this property is set to false.

Rtype:	boolean

uri_encoding_opts.unreserved¶

Specify a Lua pattern defining unreserved symbols that are not encoded.

Rtype:	table

Example: 'a-zA-Z0-9%-._~'

Module utf8

Overview

utf8 is Tarantool’s module for handling UTF-8 strings. It includes some functions which are compatible with ones in Lua 5.3 but Tarantool has much more. For example, because internally Tarantool contains a complete copy of the “International Components For Unicode” library, there are comparison functions which understand the default ordering for Cyrillic (Capital Letter Zhe Ж = Small Letter Zhe ж) and Japanese (Hiragana A = Katakana A).

Name	Use
casecmp and cmp	Comparisons
lower and upper	Case conversions
isalpha, isdigit, islower and isupper	Determine character types
sub	Substrings
len	Length in characters
next	Character-at-a-time iterations

utf8.casecmp(UTF8-string, utf8-string)¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
Return:	-1 meaning “less”, 0 meaning “equal”, +1 meaning “greater”
Rtype:	number

Compare two strings with the Default Unicode Collation Element Table (DUCET) for the Unicode Collation Algorithm. Thus ‘å’ is less than ‘B’, even though the code-point value of å (229) is greater than the code-point value of B (66), because the algorithm depends on the values in the Collation Element Table, not the code-point values.

The comparison is done with primary weights. Therefore the elements which affect secondary or later weights (such as “case” in Latin or Cyrillic alphabets, or “kana differentiation” in Japanese) are ignored. If asked “is this like a Microsoft case-insensitive accent-insensitive collation” we tend to answer “yes”, though the Unicode Collation Algorithm is far more sophisticated than those terms imply.

Example:

tarantool> utf8.casecmp('é','e'),utf8.casecmp('E','e')
---
- 0
- 0
...

utf8.char(code-point[, code-point ...])¶

Parameters:	number (`code-point`) – a Unicode code point value, repeatable
Return:	a UTF-8 string
Rtype:	string

The code-point number is the value that corresponds to a character in the Unicode Character Database. This is not the same as the byte values of the encoded character, because the UTF-8 encoding scheme is more complex than a simple copy of the code-point number.

Another way to construct a string with Unicode characters is with the \u{hex-digits} escape mechanism, for example ‘\u{41}\u{42}’ and utf8.char(65,66) both produce the string ‘AB’.

Example:

tarantool> utf8.char(229)
---
- å
...

utf8.cmp(UTF8-string, utf8-string)¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
Return:	-1 meaning “less”, 0 meaning “equal”, +1 meaning “greater”
Rtype:	number

The comparison is done with at least three weights. Therefore the elements which affect secondary or later weights (such as “case” in Latin or Cyrillic alphabets, or “kana differentiation” in Japanese) are not ignored, and upper case comes after lower case.

Example:

tarantool> utf8.cmp('é','e'),utf8.cmp('E','e')
---
- 1
- 1
...
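
Because utf8.cmp() returns -1, 0, or +1, it can be wrapped in a comparison function for table.sort to get Unicode-aware ordering. A sketch, assuming it runs inside Tarantool; the list of letters is arbitrary.

```lua
local utf8 = require('utf8')

local letters = {'b', 'B', 'å', 'a'}

-- table.sort expects a function that returns true when x goes before y.
table.sort(letters, function(x, y) return utf8.cmp(x, y) < 0 end)

print(table.concat(letters, ' '))  -- a å b B
```

Replacing utf8.cmp with utf8.casecmp in the comparison function would treat letters that differ only in case or accent as equal.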

utf8.isalpha(UTF8-character)¶

Parameters:	string-or-number (`UTF8-character`) – a single UTF8 character, expressed as a one-byte string or a code point value
Return:	true or false
Rtype:	boolean

Return true if the input character is an “alphabetic-like” character, otherwise return false. Generally speaking a character will be considered alphabetic-like provided it is typically used within a word, as opposed to a digit or punctuation. It does not have to be a character in an alphabet.

Example:

tarantool> utf8.isalpha('Ж'),utf8.isalpha('å'),utf8.isalpha('9')
---
- true
- true
- false
...

utf8.isdigit(UTF8-character)¶

Parameters:	string-or-number (`UTF8-character`) – a single UTF8 character, expressed as a one-byte string or a code point value
Return:	true or false
Rtype:	boolean

Return true if the input character is a digit, otherwise return false.

Example:

tarantool> utf8.isdigit('Ж'),utf8.isdigit('å'),utf8.isdigit('9')
---
- false
- false
- true
...

utf8.islower(UTF8-character)¶

Parameters:	string-or-number (`UTF8-character`) – a single UTF8 character, expressed as a one-byte string or a code point value
Return:	true or false
Rtype:	boolean

Return true if the input character is lower case, otherwise return false.

Example:

tarantool> utf8.islower('Ж'),utf8.islower('å'),utf8.islower('9')
---
- false
- true
- false
...

utf8.isupper(UTF8-character)¶

Parameters:	string-or-number (`UTF8-character`) – a single UTF8 character, expressed as a one-byte string or a code point value
Return:	true or false
Rtype:	boolean

Return true if the input character is upper case, otherwise return false.

Example:

tarantool> utf8.isupper('Ж'),utf8.isupper('å'),utf8.isupper('9')
---
- true
- false
- false
...

utf8.len(UTF8-string[, start-byte[, end-byte]])¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
	integer (`start-byte`) – byte position of the first character
	integer (`end-byte`) – byte position where to stop
Return:	the number of characters in the string, or between start and end
Rtype:	number

Byte positions for start and end can be negative, which indicates “calculate from end of string” rather than “calculate from start of string”.

If the string contains a byte sequence which is not valid in UTF-8, each byte in the invalid byte sequence will be counted as one character.

UTF-8 is a variable-size encoding scheme. Typically a simple Latin letter takes one byte, a Cyrillic letter takes two bytes, a Chinese/Japanese character takes three bytes, and the maximum is four bytes.

Example:

tarantool> utf8.len('G'),utf8.len('ж')
---
- 1
- 1
...

tarantool> string.len('G'),string.len('ж')
---
- 1
- 2
...

utf8.lower(UTF8-string)¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
Return:	the same string, lower case
Rtype:	string

Example:

tarantool> utf8.lower('ÅΓÞЖABCDEFG')
---
- åγþжabcdefg
...

utf8.next(UTF8-string[, start-byte])¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
	integer (`start-byte`) – byte position where to start within the string, default is 1
Return:	byte position of the next character and the code point value of the next character
Rtype:	table

The next function is often used in a loop to get one character at a time from a UTF-8 string.

Example:

In the string ‘åa’, the first character is ‘å’. It starts at position 1 and takes two bytes to store, so the character after it is at position 3. Its Unicode code point value is (decimal) 229.

tarantool> -- show next-character position + first-character codepoint
tarantool> utf8.next('åa', 1)
---
- 3
- 229
...
tarantool> -- (loop) show codepoint of every character
tarantool> for position,codepoint in utf8.next,'åa' do print(codepoint) end
229
97
...
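
To make the "next position plus code point" contract concrete, here is a minimal pure-Lua sketch of a decoder with the same calling shape. The name next_char is hypothetical, and the code assumes well-formed UTF-8 input; it does not reproduce the utf8 module's handling of invalid byte sequences.

```lua
-- Illustrative decoder: returns the byte position after the character at `pos`
-- and that character's code point, or nil at the end of the string.
local function next_char(s, pos)
    pos = pos or 1
    local b1 = s:byte(pos)
    if b1 == nil then
        return nil
    end
    local len, cp
    if b1 < 0x80 then
        len, cp = 1, b1             -- single-byte (ASCII) character
    elseif b1 >= 0xF0 then
        len, cp = 4, b1 - 0xF0      -- lead byte of a 4-byte sequence
    elseif b1 >= 0xE0 then
        len, cp = 3, b1 - 0xE0      -- lead byte of a 3-byte sequence
    else
        len, cp = 2, b1 - 0xC0      -- lead byte of a 2-byte sequence
    end
    -- Each continuation byte contributes its low 6 bits.
    for i = pos + 1, pos + len - 1 do
        cp = cp * 0x40 + (s:byte(i) - 0x80)
    end
    return pos + len, cp
end

-- Same loop shape as the example above; '\195\165a' is the string 'åa'.
for position, codepoint in next_char, '\195\165a' do
    print(position, codepoint)      -- 3 229, then 4 97
end
```

Because the function returns nil once the position runs past the end of the string, it can be used directly as a generic-for iterator.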

utf8.sub(UTF8-string, start-character[, end-character])¶

Parameters:	string (`UTF8-string`) – a string encoded as UTF-8
	number (`start-character`) – the position of the first character
	number (`end-character`) – the position of the last character
Return:	a UTF-8 string, the “substring” of the input value
Rtype:	string

Character positions for start and end can be negative, which indicates “calculate from end of string” rather than “calculate from start of string”.

The default value for end-character is the length of the input string. Therefore, saying utf8.sub('abc', 1) will return ‘abc’, the same as the input string.

Example:

tarantool> utf8.sub('åγþжabcdefg', 5, 8)
---
- abcd
...

utf8.upper(UTF8-string)¶

Parameters:	string (`UTF8-string`) – a string encoded with UTF-8
Return:	the same string, upper case
Rtype:	string

Note

In rare cases the upper-case result may be longer than the lower-case input, for example utf8.upper('ß') is ‘SS’.

Example:

tarantool> utf8.upper('åγþжabcdefg')
---
- ÅΓÞЖABCDEFG
...

Module ulid

Since version 3.6.0.

Overview

The ulid module implements ULID (Universally Unique Lexicographically Sortable Identifier) support in Tarantool. A ULID is a 128-bit identifier consisting of:

a 48-bit timestamp in milliseconds since the Unix epoch
an 80-bit random (entropy) component

ULID strings are encoded using Crockford Base32, a compact and human-friendly alphabet that excludes visually ambiguous characters.

In binary form, ULIDs are represented as 16-byte values in big-endian byte order. This ensures that the lexicographical order of binary ULIDs matches their chronological order, in accordance with the ULID specification.
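
The link between big-endian byte order and sort order can be checked with a small pure-Lua sketch. The helpers be48 and le48 are hypothetical names that pack a 48-bit millisecond timestamp into a 6-byte string; they only illustrate the ordering argument and do not reproduce Tarantool's internal layout.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ vs Lua 5.1 / LuaJIT

-- Split a non-negative integer into 6 bytes, least significant byte first.
local function to_bytes(ms)
    local bytes = {}
    for i = 1, 6 do
        bytes[i] = ms % 256
        ms = math.floor(ms / 256)
    end
    return bytes
end

-- Little-endian: least significant byte comes first in the string.
local function le48(ms)
    return string.char(unpack(to_bytes(ms)))
end

-- Big-endian: most significant byte comes first in the string.
local function be48(ms)
    return le48(ms):reverse()
end

print(be48(255) < be48(256))   -- true: byte-wise order matches numeric order
print(le48(255) < le48(256))   -- false: little-endian breaks the ordering
```

With the most significant byte first, a plain byte-by-byte comparison agrees with the numeric order of the timestamps, which is the property the binary form relies on.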

ULIDs have several useful properties:

They are lexicographically sortable by creation time.
They fit entirely into 26 ASCII characters.
They avoid visually ambiguous symbols (I, L, O, U).
They have 128 bits of total uniqueness, the same as UUID v4.

Tarantool uses a monotonic ULID generator. This ensures that multiple ULIDs created within the same millisecond are strictly increasing and preserve sort order: for any two ULIDs generated in the same millisecond, the one created later is greater than the earlier one.

Internally, the monotonic generator keeps the last generated ULID for the current millisecond and increments the 80-bit random part for each subsequent ULID. A real overflow of the random part can happen only after generating 2^80 ULIDs within the same millisecond, which is practically impossible on real hardware. However, for strict correctness of the ULID specification and to avoid silent wrap-around, the implementation detects this overflow and fails the next generation attempt with a Lua error (ULID random component overflow).
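
The behavior described above can be modeled in a few lines of pure Lua. This is a simplified sketch, not Tarantool's actual code: new_generator, increment_random, clock_ms, and random_bytes are hypothetical names, the random part is held as an array of 10 bytes, and a clock that moves backwards is not handled.

```lua
-- Add 1 to a big-endian byte array, failing instead of wrapping around.
local function increment_random(bytes)
    local i = #bytes
    while i >= 1 and bytes[i] == 255 do
        i = i - 1
    end
    if i == 0 then
        error('ULID random component overflow')
    end
    bytes[i] = bytes[i] + 1
    for j = i + 1, #bytes do
        bytes[j] = 0
    end
end

-- clock_ms() returns the current time in milliseconds;
-- random_bytes() returns a fresh array of 10 random bytes (80 bits).
local function new_generator(clock_ms, random_bytes)
    local last_ms, last_random
    return function()
        local now = clock_ms()
        if now == last_ms then
            -- Same millisecond: reuse the previous random part, plus one.
            increment_random(last_random)
        else
            -- New millisecond: start from fresh randomness.
            last_ms, last_random = now, random_bytes()
        end
        local copy = {}
        for i = 1, #last_random do
            copy[i] = last_random[i]
        end
        return now, copy
    end
end

-- Usage with a frozen clock and a fixed seed, to make the increment visible.
local gen = new_generator(
    function() return 1000 end,
    function() return {0, 0, 0, 0, 0, 0, 0, 0, 0, 255} end
)
local _, first = gen()
local _, second = gen()
print(first[9], first[10])     -- 0 255
print(second[9], second[10])   -- 1 0
```

The second value is exactly one greater than the first when both are read as 80-bit big-endian numbers, so the pair sorts in creation order even though the timestamp is identical.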

To use this module, run the following command:

ulid = require('ulid')

Comparison

ULID objects support the full set of Lua comparison operators:

== and ~= - equality and inequality.
< and <= - lexicographical comparison.
> and >= - lexicographical comparison.

The comparison is based on the internal 16-byte representation in big-endian order and is consistent with the ULID specification: for ULIDs created by the monotonic generator, later ULIDs are greater than earlier ones, including ULIDs generated within the same millisecond.

Comparison works both between ULID objects and between a ULID object and a ULID string:

u1 == u2 compares two ULID objects directly.
u1 == "01..." converts the string to ULID and compares values.
u1 < "01..." or "01..." < u1 convert the string argument to ULID and perform lexicographical comparison.

Examples:

tarantool> u1 = ulid.new()
tarantool> u2 = ulid.new()
tarantool> u1 < u2, u1 <= u2, u1 == u2, u1 ~= u2, u1 > u2, u1 >= u2
---
- true
- true
- false
- true
- false
- false
...

tarantool> u = ulid.new()
tarantool> s = u:str()
tarantool> u == s, u < s, u > s
---
- true
- false
- false
...

tarantool> u == "not-a-valid-ulid"
---
- false
...

tarantool> u < "not-a-valid-ulid"
---
- error: '[string "return u < "not-a-valid-ulid""]:1: incorrect value to convert to
    ulid as 2 argument'
...

API Reference

Below is a list of all ulid functions and members.

Name	Use
ulid.NULL	A nil ULID object
ulid.ulid() ulid.bin() ulid.str()	Shortcuts to create a new ULID value
ulid.new()	Create a new ULID object
ulid.fromstr() ulid.frombin() ulid_object:bin() ulid_object:str()	Convert between string, binary, and ULID object forms
ulid.is_ulid() ulid_object:isnil()	Check ULID type or nil value

ulid.NULL¶

A nil ULID object - a ULID that contains 16 zero bytes.

Return:	the nil ULID value
Rtype:	cdata

Example:

tarantool> ulid.NULL
---
- 00000000000000000000000000
...

ulid.new()¶

Create a new ULID object.

This function uses the monotonic generator described in the Overview. Multiple ULIDs created within the same millisecond are strictly increasing.

Return:	a new ULID object
Rtype:	cdata

Example:

tarantool> ulid.new()
---
- 06DGE3YNDCM2PPWJT3SKTTRNZR
...

ulid.ulid()¶

Calling the module directly is the same as calling ulid.new().

In other words, ulid() is a shortcut for ulid.new().

Return:	a new ULID object (same as `ulid.new()`)
Rtype:	cdata

Example:

tarantool> ulid()
---
- 06DGE41G63GAZ6F0TV4WRSVCCW
...

ulid.str()¶

Create a new ULID and return its string representation.

This is a shortcut for ulid.new():str().

The result is always 26 characters, encoded using Crockford Base32.

Return:	a ULID string
Rtype:	26-byte string

Example:

tarantool> ulid.str()
---
- 06DGE480BWZ6H5BKX0KS3Q8S2G
...

ulid.bin()¶

Create a new ULID and return its binary representation as a 16-byte string.

This is a shortcut for ulid.new():bin().

Return:	a ULID in binary form
Rtype:	16-byte string

Example:

tarantool> #ulid.bin()
---
- 16
...

ulid.fromstr(ulid_string)¶

Create a ULID object from a 26-character string.

The input must be a valid ULID string encoded using Crockford Base32. If the string is invalid (wrong length or invalid symbols), nil is returned.

Parameters:	ulid_string (`string`) – ULID in 26-character string form
Return:	converted ULID or `nil`
Rtype:	cdata or `nil`

Example:

tarantool> u = ulid.fromstr('06DGE4FH80PHA28YZVV5Z473T4')
tarantool> u
---
- 06DGE4FH80PHA28YZVV5Z473T4
...

ulid.frombin(ulid_bin)¶

Create a ULID object from a 16-byte binary string.

Parameters:	ulid_bin (`string`) – ULID in 16-byte binary string form
Return:	converted ULID
Rtype:	cdata

Example:

tarantool> u1 = ulid.new()
tarantool> b = u1:bin()
tarantool> u2 = ulid.frombin(b)
tarantool> u1 == u2
---
- true
...

ulid.is_ulid(value)¶

Check if the given value is a ULID cdata object.

Parameters:	value – a value of any type
Return:	`true` if the value is a ULID, otherwise `false`
Rtype:	boolean

Example:

tarantool> ulid.is_ulid(ulid.new())
---
- true
...

tarantool> ulid.is_ulid("string")
---
- false
...

ULID object

A ULID object returned by ulid.new(), ulid.fromstr(), or ulid.frombin() provides the following methods:

ulid_object:bin()¶

Return the ULID as a 16-byte binary string.

Return:	ULID in binary form
Rtype:	16-byte string

Example:

tarantool> u = ulid.new()
tarantool> b = u:bin()
tarantool> #b, b
---
- 16
- "\x01\x9B\a\xAD==\x81u۶-\x93hPa\xAE"
...

ulid_object:str()¶

Return the ULID as a 26-character string.

Return:	ULID in string form
Rtype:	26-byte string

ULID objects also implement the standard Lua __tostring metamethod. This means that calling tostring(u) for a ULID object u returns the same value as u:str(), and ULID objects are automatically converted to their 26-character string representation when needed in string context.

Example:

tarantool> u = ulid.new()
tarantool> u:str(), tostring(u)
---
- 06DGFBE3J07B7DB5A3JP4WQ9CM
- 06DGFBE3J07B7DB5A3JP4WQ9CM
...

ulid_object:isnil()¶

Check if the ULID is the nil ULID (all 16 bytes are zero).

Return:	`true` for `ulid.NULL`, otherwise `false`
Rtype:	boolean

Example:

tarantool> ulid.NULL:isnil()
---
- true
...

tarantool> ulid.new():isnil()
---
- false
...

Examples

Basic usage:

tarantool> u_obj = ulid.new()
tarantool> u_str = ulid.str()
tarantool> u_bin = ulid.bin()

tarantool> u_obj, u_str, u_bin
---
- 06DGE6SEZDCFFSFEAJFMC9YQAR
- 06DGE6T0DW0N2N6KMPVZ8SGE4W
- "\x01\x9B\a\eSz8\xBA\xB3\xC5\xCC\xFE\x18\xBB1\xD6"
...

Creating a ULID object and inspecting it:

tarantool> u = ulid()
---
...

tarantool> #u:bin(), #u:str(), type(u), u:isnil()
---
- 16
- 26
- cdata
- false
...

tarantool> tostring(u) == u:str()
---
- true
...

Converting between string and binary formats:

tarantool> s = ulid.str()
tarantool> s
---
- 06DGE70CPFDV344XX43687N1SM
...

tarantool> u = ulid.fromstr(s)
tarantool> u:str() == s
---
- true
...

tarantool> b = u:bin()
tarantool> u2 = ulid.frombin(b)
tarantool> u2 == u
---
- true
...

Working with ulid.NULL:

tarantool> ulid.NULL
---
- 00000000000000000000000000
...

tarantool> ulid.NULL:isnil()
---
- true
...

tarantool> ulid.new():isnil()
---
- false
...

Comparison operators:

tarantool> u1 = ulid.new()
tarantool> u2 = ulid.new()
tarantool> u1 < u2
---
- true
...

Checking types:

tarantool> u = ulid.new()
tarantool> ulid.is_ulid(u)
---
- true
...

tarantool> ulid.is_ulid("06DGE7RJK8QWJE27X5VVCC5VDW")
---
- false
...

Generating many ULIDs:

tarantool> for i = 1,5 do print(ulid.str()) end
---
06DGE7T8AJD6EDNJ3VQ2ZYB3R4
06DGE7T8AJD6EDNJ3VQ2ZYB3R8
06DGE7T8AJD6EDNJ3VQ2ZYB3RC
06DGE7T8AJD6EDNJ3VQ2ZYB3RG
06DGE7T8AJD6EDNJ3VQ2ZYB3RM
...

Module uuid

Overview

A “UUID” is a universally unique identifier. If an application requires that a value be unique only within a single computer or on a single database, then a simple counter is better than a UUID, because getting a UUID is time-consuming (it requires a syscall). For clusters of computers, or widely distributed applications, UUIDs are better. Tarantool generates UUIDs following the rules for RFC 4122 version 4 variant 1.
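
In the 36-character text form, "version 4 variant 1" is visible in two fixed positions: the third group starts with 4, and the fourth group starts with 8, 9, a, or b. A minimal pure-Lua check of that shape might look as follows; looks_like_uuid_v4 is a hypothetical helper, not part of the uuid module.

```lua
local HEX = '%x'

-- xxxxxxxx-xxxx-4xxx-[89ab]xxx-xxxxxxxxxxxx
local V4_PATTERN = '^' .. HEX:rep(8) .. '%-' .. HEX:rep(4)
    .. '%-4' .. HEX:rep(3)
    .. '%-[89abAB]' .. HEX:rep(3)
    .. '%-' .. HEX:rep(12) .. '$'

-- Returns true when `s` has the textual shape of a version 4, variant 1 UUID.
local function looks_like_uuid_v4(s)
    return type(s) == 'string' and s:match(V4_PATTERN) ~= nil
end

print(looks_like_uuid_v4('e631fdcc-0e8a-4d2f-83fd-b0ce6762b13f'))   -- true
print(looks_like_uuid_v4('00000000-0000-0000-0000-000000000000'))   -- false
```

The all-zero UUID fails the check because it carries neither the version nor the variant bits.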

API Reference

Below is a list of all uuid functions and members.

Name	Use
uuid.NULL	A nil UUID object
uuid() uuid.bin() uuid.str()	Get a UUID
uuid.new()	Create a UUID
uuid.fromstr() uuid.frombin() uuid_object:bin() uuid_object:str()	Get a converted UUID
uuid.is_uuid()	Check if the specified value has UUID type
uuid_object:isnil()	Check if a UUID is an all-zero value

uuid.NULL¶: A nil UUID object. Contains the all-zero UUID value – 00000000-0000-0000-0000-000000000000.

uuid.new()¶

Since version 2.4.1. Create a UUID sequence. You can use it in an index over a UUID field. For example, to create such an index for a space named test, say:

tarantool> box.space.test:create_index("pk", {parts={{field = 1, type = 'uuid'}}})

Now you can insert UUIDs into the space:

tarantool> box.space.test:insert{uuid.new()}
---
- [e631fdcc-0e8a-4d2f-83fd-b0ce6762b13f]
...

tarantool> box.space.test:insert{uuid.fromstr('64d22e4d-ac92-4a23-899a-e59f34af5479')}
---
- [64d22e4d-ac92-4a23-899a-e59f34af5479]
...

tarantool> box.space.test:select{}
---
- - [64d22e4d-ac92-4a23-899a-e59f34af5479]
  - [e631fdcc-0e8a-4d2f-83fd-b0ce6762b13f]
...

Return:	a UUID
Rtype:	cdata

uuid.__call()¶

Return:	a UUID
Rtype:	cdata

uuid.bin([byte-order])¶

Parameters:	byte-order (`string`) – Byte order of the resulting UUID: `'l'` – little-endian, `'b'` – big-endian, `'h'`, `'host'` – endianness depends on host (default), `'n'`, `'network'` – endianness depends on network.
Return:	a UUID
Rtype:	16-byte string

uuid.str()¶

Return:	a UUID
Rtype:	36-byte hexadecimal string

uuid.fromstr(uuid-str)¶

Parameters:	uuid-str (`string`) – UUID in 36-byte hexadecimal string
Return:	converted UUID
Rtype:	cdata

uuid.frombin(uuid-bin[, byte-order])¶

Parameters:	uuid-bin (`string`) – UUID in 16-byte binary string
	byte-order (`string`) – Byte order of the given string: `'l'` – little-endian, `'b'` – big-endian, `'h'`, `'host'` – endianness depends on host (default), `'n'`, `'network'` – endianness depends on network.
Return:	converted UUID
Rtype:	cdata

uuid.is_uuid(value)¶

Since version 2.6.1.

Parameters:	value – a value to check
Return:	`true` if the specified value is a UUID, and `false` otherwise
Rtype:	bool

object uuid_object¶

uuid_object:bin([byte-order])¶

Parameters:	byte-order (`string`) – Byte order of the resulting UUID: `'l'` – little-endian, `'b'` – big-endian, `'h'`, `'host'` – endianness depends on host (default), `'n'`, `'network'` – endianness depends on network.
Return:	UUID converted from cdata input value.
Rtype:	16-byte binary string

uuid_object:str()¶

Return:	UUID converted from cdata input value.
Rtype:	36-byte hexadecimal string

uuid_object:isnil()¶

The all-zero UUID value can be expressed as uuid.NULL, or as uuid.fromstr('00000000-0000-0000-0000-000000000000'). The comparison with an all-zero value can also be expressed as uuid_with_type_cdata == uuid.NULL.

Return:	true if the value is all zero, otherwise false.
Rtype:	bool

Example

tarantool> uuid = require('uuid')
---
...
tarantool> uuid(), uuid.bin(), uuid.str()
---
- 16ffedc8-cbae-4f93-a05e-349f3ab70baa
- !!binary FvG+Vy1MfUC6kIyeM81DYw==
- 67c999d2-5dce-4e58-be16-ac1bcb93160f
...
tarantool> uu = uuid()
---
...
tarantool> #uu:bin(), #uu:str(), type(uu), uu:isnil()
---
- 16
- 36
- cdata
- false
...

Module varbinary

Since: 3.0.0

Overview

The varbinary module provides functions for operating on variable-length binary objects in Lua. It provides functions for creating varbinary objects, checking their type, and also defines basic operators on such objects.

For example:

local varbinary = require('varbinary')

-- Create a varbinary object
local bin = varbinary.new('data')
local bin_hex = varbinary.new('\xFF\xFE')

-- Check whether a value is a varbinary object
varbinary.is(bin) -- true
varbinary.is(bin_hex) -- true
varbinary.is(100) -- false
varbinary.is('data') -- false

-- Check varbinary objects equality
print(bin == varbinary.new('data')) -- true
print(bin == 'data') -- true
print(bin ~= 'data1') -- true
print(bin_hex ~= '\xFF\xFE') -- false

-- Check varbinary objects length
print(#bin) -- 4
print(#bin_hex) -- 2

-- Print string representation
print(tostring(bin)) -- data

Encoding varbinary objects

varbinary objects preserve their binary type when encoded by the built-in MsgPack and YAML encoders. See the difference with strings:

String to MsgPack:

tarantool> msgpack.encode('\xFF\xFE')
---
- "\xA2\xFF\xFE"
...

varbinary to MsgPack:

tarantool> msgpack.encode(varbinary.new('\xFF\xFE'))
---
- "\xC4\x02\xFF\xFE"
...
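
The two outputs differ only in their MsgPack headers. For short payloads, the MsgPack format stores a string as one "fixstr" header byte (0xA0 plus the length) and binary data as a "bin 8" marker byte 0xC4 followed by a one-byte length. The pure-Lua sketch below rebuilds both results by hand; encode_short_str and encode_short_bin are hypothetical helpers that cover only these short cases.

```lua
-- fixstr: header byte 0xA0 + length, valid for lengths 0..31.
local function encode_short_str(s)
    assert(#s <= 31, 'fixstr holds at most 31 bytes')
    return string.char(0xA0 + #s) .. s
end

-- bin 8: marker byte 0xC4, then a one-byte length, valid for lengths 0..255.
local function encode_short_bin(s)
    assert(#s <= 255, 'bin 8 holds at most 255 bytes')
    return string.char(0xC4, #s) .. s
end

local payload = '\255\254'                   -- the bytes 0xFF 0xFE
print(encode_short_str(payload):byte(1, -1)) -- 162 255 254     (0xA2 0xFF 0xFE)
print(encode_short_bin(payload):byte(1, -1)) -- 196 2 255 254   (0xC4 0x02 0xFF 0xFE)
```

A decoder can therefore tell the two apart from the first byte alone, which is what allows the binary type to survive a round trip.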

String to YAML:

tarantool> '\xFF\xFE'
---
- "\xFF\xFE"
...

varbinary to YAML:

tarantool> varbinary.new('\xFF\xFE')
---
- !!binary //4=
...
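
The value after the !!binary tag is the Base64 form of the raw bytes. As a quick check, the following pure-Lua sketch encodes the same two bytes; base64 here is a hypothetical stand-alone helper written for illustration, not a Tarantool function.

```lua
local ALPHABET =
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

-- Standard Base64 with '=' padding, processing 3 input bytes at a time.
local function base64(data)
    local out = {}
    for i = 1, #data, 3 do
        local b1, b2, b3 = data:byte(i, i + 2)
        -- Pack up to 3 bytes into one 24-bit number (missing bytes count as 0).
        local n = b1 * 65536 + (b2 or 0) * 256 + (b3 or 0)
        local c1 = math.floor(n / 262144) % 64
        local c2 = math.floor(n / 4096) % 64
        local c3 = math.floor(n / 64) % 64
        local c4 = n % 64
        out[#out + 1] = ALPHABET:sub(c1 + 1, c1 + 1)
        out[#out + 1] = ALPHABET:sub(c2 + 1, c2 + 1)
        out[#out + 1] = b2 and ALPHABET:sub(c3 + 1, c3 + 1) or '='
        out[#out + 1] = b3 and ALPHABET:sub(c4 + 1, c4 + 1) or '='
    end
    return table.concat(out)
end

print(base64('\255\254'))   -- //4=
```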

Note

The JSON format doesn’t support the binary type so varbinary objects are encoded as plain strings in JSON:

tarantool> json.encode('\xFF\xFE')
---
- "\"\xFF\xFE\""
...

tarantool> json.encode(varbinary.new('\xFF\xFE'))
---
- "\"\xFF\xFE\""
...

Decoding binary data to varbinary objects

The built-in decoders also decode binary data fields (fields with the binary tag in YAML and the MP_BIN type in MsgPack) to varbinary objects by default:

tarantool> varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
---
- true
...

tarantool> varbinary.is(yaml.decode('!!binary //4='))
---
- true
...

Important

This behavior is different from what it was before Tarantool 3.0. In earlier versions, such fields were decoded to plain strings. To return to this behavior, use the compat option binary_data_decoding.

API Reference

Below is a list of varbinary functions, properties, and related objects.

Functions
varbinary.is()	Check that the argument is a `varbinary` object
varbinary.new()	Create a `varbinary` object
Metamethods
varbinary_object.__eq	Checks the equality of two `varbinary` objects
varbinary_object.__len	Returns the length of the binary data in bytes
varbinary_object.__tostring	Returns the binary data in a plain string

Functions

varbinary.is(object)¶

Check that the given object is a varbinary object.

Parameters:	object (`object`) – an object to check
Return:	Whether the given object is of `varbinary` type
Rtype:	boolean

Example:

local bin = varbinary.new('data')
local bin_hex = varbinary.new('\xFF\xFE')

-- Check whether a value is a varbinary object
varbinary.is(bin) -- true
varbinary.is(bin_hex) -- true
varbinary.is(100) -- false
varbinary.is('data') -- false

varbinary.new(string)¶

Create a new varbinary object from a given string.

Parameters:	string (`string`) – a string object
Return:	A `varbinary` object containing the string data
Rtype:	cdata

Example:

local bin = varbinary.new('data')
local bin_hex = varbinary.new('\xFF\xFE')

varbinary.new(ptr, size)

Create a new varbinary object from a cdata pointer and size.

Parameters:	ptr (`cdata`) – a `cdata` pointer size (`number`) – object size in bytes
Return:	A `varbinary` object containing the data
Rtype:	cdata

Example:

local bin2 = varbinary.new(ffi.cast('const char *', 'data'), 4)
varbinary.is(bin2) -- true
print(bin2) -- data

Metamethods

object varbinary_object¶

varbinary_object:__eq(object)¶

Checks the equality of two varbinary objects or a varbinary object and a string. A varbinary object is equal to another varbinary object or a string if it contains the same data.

Defines the == and ~= operators for varbinary objects.

Rtype:	boolean

Example:

print(bin == varbinary.new('data')) -- true
print(bin == 'data') -- true
print(bin ~= 'data1') -- true
print(bin_hex ~= '\xFF\xFE') -- false

varbinary_object:__len()¶

Returns the length of the binary data in bytes.

Defines the # operator for varbinary objects.

Rtype:	number

Example:

print(#bin) -- 4
print(#bin_hex) -- 2

varbinary_object:__tostring()¶

Returns the binary data in a plain string.

Defines the tostring() function for varbinary objects.

Rtype:	string

Module xlog

The xlog module contains one function: pairs(). It can be used to read Tarantool’s snapshot files or write-ahead-log (WAL) files. A description of the file format is in section Data persistence and the WAL file format.

xlog.pairs([file-name])¶

Open a file, and allow iterating over one file entry at a time.

Returns:	iterator which can be used in a for/end loop.
Rtype:	iterator

Possible errors: File does not contain properly formatted snapshot or write-ahead-log information.

Example:

This will read the first write-ahead-log (WAL) file that was created in the wal_dir directory in our “Getting started” exercises.

Each result from pairs() is formatted with MsgPack so its structure can be specified with __serialize.

xlog = require('xlog')
t = {}
for k, v in xlog.pairs('00000000000000000000.xlog') do
  table.insert(t, setmetatable(v, { __serialize = "map"}))
end
return t

The first lines of the result will look like:

(...)
---
- - {'BODY':   {'space_id': 272, 'index_base': 1, 'key': ['max_id'],
                'tuple': [['+', 2, 1]]},
     'HEADER': {'type': 'UPDATE', 'timestamp': 1477846870.8541,
                'lsn': 1, 'server_id': 1}}
  - {'BODY':   {'space_id': 280,
                 'tuple': [512, 1, 'tester', 'memtx', 0, {}, []]},
     'HEADER': {'type': 'INSERT', 'timestamp': 1477846870.8597,
                'lsn': 2, 'server_id': 1}}

Module yaml

Overview

The yaml module takes strings in YAML format and decodes them, or takes a series of non-YAML values and encodes them.

Index

Below is a list of all yaml functions and members.

Name	Use
yaml.encode()	Convert a Lua object to a YAML string
yaml.decode()	Convert a YAML string to a Lua object
__serialize parameter	Output structure specification
yaml.cfg()	Change configuration
yaml.NULL	Analog of Lua’s “nil”

yaml.encode(lua_value)¶

Convert a Lua object to a YAML string.

Parameters:	lua_value – either a scalar value or a Lua table value.
Return:	the original value reformatted as a YAML string.
Rtype:	string

yaml.decode(string)¶

Convert a YAML string to a Lua object.

Parameters:	string – a string formatted as YAML.
Return:	the original contents formatted as a Lua table.
Rtype:	table

__serialize parameter:

The YAML output structure can be specified with __serialize:

'seq', 'sequence', 'array': table encoded as an array
'map', 'mapping': table encoded as a map
function: the meta-method called to unpack serializable representation of table, cdata, or userdata objects

tarantool> yaml.encode(setmetatable({'A', 'B'}, {__serialize='seq'}))
---
- |
  --- ['A', 'B']
  ...
...
tarantool> yaml.encode(setmetatable({'A', 'B'}, {__serialize='map'}))
---
- |
  --- {1: 'A', 2: 'B'}
  ...
...

'seq' or 'map' also enable the flow (compact) mode for the YAML serializer (flow="[1,2,3]" vs block=" - 1\n - 2\n - 3\n"). See the full example in the ‘Example’ section below.

yaml.cfg(table)¶

Set values affecting the behavior of encode and decode functions.

The values are all either integers or boolean true/false.

Option	Default	Use
`cfg.encode_invalid_numbers`	true	A flag saying whether to enable encoding of NaN and Inf numbers
`cfg.encode_number_precision`	14	Precision of floating point numbers
`cfg.encode_load_metatables`	true	A flag saying whether the serializer will follow __serialize metatable field
`cfg.encode_use_tostring`	false	A flag saying whether to use `tostring()` for unknown types
`cfg.encode_invalid_as_nil`	false	A flag saying whether to use NULL for non-recognized types
`cfg.encode_sparse_convert`	true	A flag saying whether to handle excessively sparse arrays as maps. See detailed description below
`cfg.encode_sparse_ratio`	2	1/`encode_sparse_ratio` is the permissible percentage of missing values in a sparse array
`cfg.encode_sparse_safe`	10	A limit ensuring that small Lua arrays are always encoded as sparse arrays (instead of generating an error or encoding as map)
`cfg.decode_invalid_numbers`	true	A flag saying whether to enable decoding of NaN and Inf numbers
`cfg.decode_save_metatables`	true	A flag saying whether to set metatables for all arrays and maps

Note on `decode_save_metatables`

You may want to change the result’s metatable to get block-formatted encode() for better readability, but be careful to do it correctly.

Important

Decoder uses globally defined tables as metatables for arrays and maps. You must not change entries of decode() result’s table metatable, because it affects all results and may lead to undefined behavior of other code.

The correct way is to assign a new metatable.

tarantool> t1 = yaml.decode(yaml.encode({[1] = 'a', x = 'b'}))
tarantool> yaml.encode(t1)
---
- |
  --- {'x': 'b', 1: 'a'}
  ...
...
tarantool> my_mt = {__serialize = 'mapping'}
tarantool> setmetatable(t1, my_mt)
tarantool> yaml.encode(t1)
---
- |
  ---
  x: b
  1: a
  ...
...

Do not change the metatable like this.

tarantool> t1 = yaml.decode(yaml.encode({[1] = 'a', x = 'b'}))
tarantool> getmetatable(t1).__serialize
---
- map
...
tarantool> getmetatable(t1).__serialize = 'mapping' -- (!) bad
tarantool> t2 = yaml.decode(yaml.encode({[1] = 'a', x = 'b'}))
tarantool> yaml.encode(t2) -- (!) got 'block' maps for all results
---
- |
  ---
  x: b
  1: a
  ...
...

Sparse arrays features:

During encoding, the YAML encoder tries to classify a table as one of four kinds:

Map: at least one table index is not unsigned integer.
Regular array: all array indexes are available.
Sparse array: at least one array index is missing.
Excessively sparse array: the number of values missing exceeds the configured ratio.

An array is excessively sparse when all the following conditions are met:

encode_sparse_ratio > 0
max(table) > encode_sparse_safe
max(table) > count(table) * encode_sparse_ratio

The YAML encoder will never consider an array to be excessively sparse when encode_sparse_ratio = 0. The encode_sparse_safe limit ensures that small Lua arrays are always encoded as sparse arrays. If encode_sparse_convert is set to true (the default, per the options table above), excessively sparse arrays are handled as maps; if it is set to false, attempting to encode an excessively sparse array generates an error.
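
The classification rules can be expressed as a short pure-Lua function. This is a sketch that follows the conditions as stated in this section, with max(table) taken as the largest index and count(table) as the number of stored values. The name classify and its return strings are hypothetical, and the real encoder may differ in details.

```lua
-- ratio and safe default to the documented defaults of
-- encode_sparse_ratio (2) and encode_sparse_safe (10).
local function classify(t, ratio, safe)
    ratio = ratio or 2
    safe = safe or 10
    local max, count = 0, 0
    for k in pairs(t) do
        -- Any key that is not a positive integer makes the table a map.
        if type(k) ~= 'number' or k < 1 or k % 1 ~= 0 then
            return 'map'
        end
        if k > max then
            max = k
        end
        count = count + 1
    end
    if max == count then
        return 'regular array'              -- no index is missing
    end
    if ratio > 0 and max > safe and max > count * ratio then
        return 'excessively sparse array'   -- all three conditions hold
    end
    return 'sparse array'
end

print(classify({1, 2, 3}))                   -- regular array
print(classify({[1] = 'a', [3] = 'c'}))      -- sparse array
print(classify({[1] = 'a', [100] = 'b'}))    -- excessively sparse array
print(classify({[1] = 'a', x = 'b'}))        -- map
print(classify({[1] = 'a', [100] = 'b'}, 0)) -- sparse array (ratio 0 disables the check)
```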

yaml.cfg() example 1:

The following code will encode 0/0 as NaN (“not a number”) and 1/0 as Inf (“infinity”), rather than returning nil or an error message:

yaml = require('yaml')
yaml.cfg{encode_invalid_numbers = true}
x = 0/0
y = 1/0
yaml.encode({1, x, y, 2})

The result of the yaml.encode() request will look like this:

tarantool> yaml.encode({1, x, y, 2})
---
- '[1,nan,inf,2]
...

yaml.cfg() example 2:

To avoid generating errors on attempts to encode unknown data types as userdata/cdata, you can use this code:

tarantool> httpc = require('http.client').new()
---
...

tarantool> yaml.encode(httpc.curl)
---
- error: unsupported Lua type 'userdata'
...

tarantool> yaml.encode(httpc.curl, {encode_use_tostring=true})
---
- '"userdata: 0x010a4ef2a0"'
...

Note

To achieve the same effect for only one call to yaml.encode() (i.e. without changing the configuration permanently), you can use yaml.encode({1, x, y, 2}, {encode_invalid_numbers = true}).

Similar configuration settings exist for JSON and MsgPack.

yaml.NULL¶: A value comparable to Lua “nil” which may be useful as a placeholder in a tuple.

Example

tarantool> yaml = require('yaml')
---
...
tarantool> y = yaml.encode({'a', 1, 'b', 2})
---
...
tarantool> z = yaml.decode(y)
---
...
tarantool> z[1], z[2], z[3], z[4]
---
- a
- 1
- b
- 2
...
tarantool> if yaml.NULL == nil then print('hi') end
hi
---
...

The YAML collection style can be specified with __serialize:

__serialize="sequence" or __serialize="array" for a Block Sequence array,
__serialize="seq" for a Flow Sequence array,
__serialize="mapping" for a Block Mapping map,
__serialize="map" for a Flow Mapping map.

Serializing array- or map-like tables containing 'A' and 'B' with different __serialize values brings different results:

tarantool> yaml = require('yaml')
---
...
tarantool> yaml.encode(setmetatable({'A', 'B'}, {__serialize='seq'}))
---
- |
  --- ['A', 'B']
  ...
...
tarantool> yaml.encode(setmetatable({'A', 'B'}, {__serialize='map'}))
---
- |
  --- {1: 'A', 2: 'B'}
  ...
...
tarantool> array_like_table = {'A', 'B'}
tarantool> yaml.encode(setmetatable(array_like_table, {__serialize='seq'}))
---
- |
  --- ['A', 'B']
  ...
...
tarantool> yaml.encode(setmetatable(array_like_table, {__serialize='sequence'}))
tarantool> yaml.encode(setmetatable(array_like_table, {__serialize='array'}))
---
- |
  ---
  - A
  - B
  ...
...
tarantool> yaml.encode(setmetatable(array_like_table, {__serialize='map'}))
---
- |
  --- {1: 'A', 2: 'B'}
  ...
...
tarantool> yaml.encode(setmetatable(array_like_table, {__serialize='mapping'}))
---
- |
  ---
  1: A
  2: B
  ...
...
tarantool> map_like_table = {f1 = 'A', f2 = 'B'}
tarantool> yaml.encode(setmetatable(map_like_table, {__serialize='seq'}))
tarantool> yaml.encode(setmetatable(map_like_table, {__serialize='sequence'}))
tarantool> yaml.encode(setmetatable(map_like_table, {__serialize='array'}))
---
- |
  --- []
  ...
...
tarantool> yaml.encode(setmetatable(map_like_table, {__serialize='map'}))
---
- |
  --- {'f2': 'B', 'f1': 'A'}
  ...
...
tarantool> yaml.encode(setmetatable(map_like_table, {__serialize='mapping'}))
---
- |
  ---
  f2: B
  f1: A
  ...
...

Other package components

All the Tarantool modules are, at some level, inside a package which, appropriately, is named package. There are also miscellaneous functions and variables which are outside all modules.

Name	Use
tonumber64()	Convert a string or a Lua number to a 64-bit integer
dostring()	Parse and execute an arbitrary chunk of Lua code
package.path	Get file paths used to search for Lua modules
package.cpath	Get file paths used to search for C modules
package.loaded	Show Lua or C modules loaded by Tarantool
package.searchroot	Get the root path for a directory search
package.setsearchroot	Set the root path for a directory search

tonumber64(value)¶

Convert a string or a Lua number to a 64-bit integer. The input value can be expressed in decimal, binary (for example 0b1010), or hexadecimal (for example -0xffff). The result can be used in arithmetic, and the arithmetic will be 64-bit integer arithmetic rather than floating-point arithmetic. (Operations on an unconverted Lua number use floating-point arithmetic.) The tonumber64() function is added by Tarantool; the name is global.

Example:

tarantool> type(123456789012345), type(tonumber64(123456789012345))
---
- number
- number
...
tarantool> i = tonumber64('1000000000')
---
...
tarantool> type(i), i / 2, i - 2, i * 2, i + 2, i % 2, i ^ 2
---
- number
- 500000000
- 999999998
- 2000000000
- 1000000002
- 0
- 1000000000000000000
...
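
The reason 64-bit integer arithmetic matters is that an ordinary Lua number is a double, which represents integers exactly only up to 2^53. The plain-Lua snippet below shows the rounding that floating-point arithmetic introduces beyond that point. It deliberately avoids tonumber64() so that it runs in any Lua interpreter.

```lua
-- 2^53 = 9007199254740992 is the limit of contiguous exact integers in a double.
local big = 2^53

print(big + 1 == big)   -- true: the result is rounded back to 2^53
print(big - 1 == big)   -- false: values below 2^53 are still exact
```

Values converted to 64-bit integers are not subject to this rounding, which is why they are preferable for large counters and identifiers.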

Warning: There is an underlying LuaJIT library that operates with C rules. Therefore you should expect odd results if you compare unsigned and signed values (for example 0ULL > -1LL is false), or if you use numbers outside the 64-bit integer range (for example 9223372036854775808LL is negative). Also be aware that type(number-literal-ending-in-ULL) is cdata, not a Lua arithmetic type, which prevents direct use with some functions in Lua libraries such as math. See the LuaJIT reference and look for the phrases “64 bit integer arithmetic” and “64 bit integer comparison”, or see the comments on Issue #4089.

dostring(lua-chunk-string[, lua-chunk-string-argument ...])¶

Parse and execute an arbitrary chunk of Lua code. This function is mainly useful to define and run Lua code without having to introduce changes to the global Lua environment.

Parameters:	lua-chunk-string (`string`) – Lua code
	lua-chunk-string-argument (`lua-value`) – zero or more scalar values which will be appended to, or substitute for, items in the Lua chunk.
Return:	whatever is returned by the Lua code chunk.

Possible errors: If there is a compilation error, it is raised as a Lua error.

Example:

tarantool> dostring('abc')
---
error: '[string "abc"]:1: ''='' expected near ''<eof>'''
...
tarantool> dostring('return 1')
---
- 1
...
tarantool> dostring('return ...', 'hello', 'world')
---
- hello
- world
...
tarantool> dostring([[
         >   local f = function(key)
         >     local t = box.space.tester:select{key}
         >     if t ~= nil then
         >       return t[1]
         >     else
         >       return nil
         >     end
         >   end
         >   return f(...)]], 1)
---
- null
...

package.path¶

Get file paths used to search for Lua modules. For example, these paths are used to find modules loaded using the require() directive.
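package.path is a single string of search templates separated by semicolons, where ? stands for the module name. A short sketch that splits the string for inspection (the templates table is only for illustration):

```lua
-- Collect every search template from package.path.
local templates = {}
for template in package.path:gmatch('[^;]+') do
    table.insert(templates, template)
end

-- Print the templates in the order in which require() tries them.
for i, template in ipairs(templates) do
    print(i, template)
end
```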

Database error codes

The table below lists some popular errors that can be raised by Tarantool in case of various issues. You can find a complete list of errors in the errcode.h file.

Note

The box.error module provides the ability to get the information about the last error raised by Tarantool or raise custom errors manually.

Code	box.error value	Description
ER_NONMASTER	box.error.NONMASTER	(In replication) A server instance cannot modify data unless it is a master.
ER_ILLEGAL_PARAMS	box.error.ILLEGAL_PARAMS	Illegal parameters. Malformed protocol message.
ER_MEMORY_ISSUE	box.error.MEMORY_ISSUE	Out of memory: memtx_memory limit has been reached.
ER_WAL_IO	box.error.WAL_IO	Failed to write to disk. May mean: failed to record a change in the write-ahead log.
ER_READONLY	box.error.READONLY	Can’t modify data on a read-only instance.
ER_KEY_PART_COUNT	box.error.KEY_PART_COUNT	Key part count is not the same as index part count.
ER_NO_SUCH_SPACE	box.error.NO_SUCH_SPACE	The specified space does not exist.
ER_NO_SUCH_INDEX	box.error.NO_SUCH_INDEX	The specified index in the specified space does not exist.
ER_PROC_LUA	box.error.PROC_LUA	An error occurred inside a Lua procedure.
ER_FIBER_STACK	box.error.FIBER_STACK	The recursion limit was reached when creating a new fiber. This usually indicates that a stored procedure is recursively invoking itself too often.
ER_UPDATE_FIELD	box.error.UPDATE_FIELD	An error occurred during update of a field.
ER_TUPLE_FOUND	box.error.TUPLE_FOUND	A duplicate key exists in a unique index.
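The box.error values in the table can be compared with the code field of a caught error object. The sketch below is one possible pattern, not a prescribed one: insert_once is a hypothetical helper, and it assumes that the space passed in belongs to a configured instance.

```lua
-- Hypothetical helper: insert a tuple, treating a duplicate key as a
-- normal outcome and re-raising every other error.
local function insert_once(space, tuple)
    local ok, err = pcall(space.insert, space, tuple)
    if ok then
        return true
    end
    -- Errors raised by box are cdata objects that carry a numeric code.
    if type(err) == 'cdata' and err.code == box.error.TUPLE_FOUND then
        return false
    end
    error(err)
end
```

For example, insert_once(box.space.tester, {1, 'a'}) would return false instead of raising an error if a tuple with key 1 already exists, where tester stands for any existing space.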

Handling errors

Here are some procedures that can make Lua functions more robust when there are errors, particularly database errors.

Invoke a function using pcall.

Take advantage of Lua’s mechanisms for Error handling and exceptions, particularly pcall. That is, instead of invoking with …
```
box.space.{space-name}:{function-name}()
```
… call the function as follows:
```
if pcall(box.space.{space-name}.{function-name}, box.space.{space-name}) ...
```
For some Tarantool box functions, pcall also returns error details, including a file-name and line-number within Tarantool’s source code. This can be seen by unpacking, for example:
```
status, err = pcall(function() box.schema.space.create('') end)
err:unpack()
```
See the tutorial Sum a JSON field for all tuples to see how pcall can fit in an application.

Examine errors and raise new errors using box.error.

To make a new error and pass it on, the box.error module provides box.error().

To find the last error, the box.error submodule provides box.error.last(). There is also a way to find the text of the last operating-system error for certain functions – errno.strerror([code]).

Log.

Put messages in a log using the log module.

Filter automatically generated messages using the log configuration parameter.

Generally, for Tarantool built-in functions which are designed to return objects: the result is an object, or nil, or a Lua error. For example consider the fio_read.lua program in a cookbook:

#!/usr/bin/env tarantool

local fio = require('fio')
local errno = require('errno')
local f = fio.open('/tmp/xxxx.txt', {'O_RDONLY' })
if not f then
    error("Failed to open file: "..errno.strerror())
end
local data = f:read(4096)
f:close()
print(data)

After a function call that might fail, like fio.open() above, it is common to see syntax like if not f then ... or if f == nil then ..., which check for common failures. But if there had been a mistake in the code itself, for example a call to the nonexistent fio.opex instead of fio.open, then there would have been a Lua error and f would not have been assigned. If checking for such an obvious error had been a concern, the programmer would probably have used pcall().

All functions in Tarantool modules should work this way, unless the manual explicitly says otherwise.
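The three possible outcomes (an object, nil, or a Lua error) can be told apart with pcall. The sketch below uses the standard io.open instead of fio.open so that it runs in any Lua interpreter; classify is a hypothetical helper, and io.opex is a deliberately nonexistent function.

```lua
-- Hypothetical helper: run fn and report which of the three outcomes occurred.
local function classify(fn, ...)
    local ok, res = pcall(fn, ...)
    if not ok then
        return 'lua error', res      -- res holds the error message
    elseif res == nil then
        return 'nil result'          -- an ordinary failure reported via nil
    end
    return 'object', res
end

-- A missing file is an ordinary failure: io.open returns nil.
print(classify(io.open, '/nonexistent-dir/missing.txt'))

-- A misspelled function name raises a Lua error, which pcall catches.
print(classify(function() return io.opex('/tmp/xxxx.txt') end))
```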

Debug facilities

Overview

Tarantool users can benefit from built-in debug facilities that are part of:

Lua (debug library, see details below) and
LuaJIT (debug.* functions).

The debug library provides an interface for debugging Lua programs. All functions in this library reside in the debug table. Those functions that operate on a thread have an optional first parameter that specifies the thread to operate on. The default is always the current thread.

Note

This library should be used only for debugging and profiling and not as a regular programming tool, as the functions provided here can take too long to run. Besides, several of these functions can compromise otherwise secure code.

Index

Below is a list of all debug functions.

Name	Use
debug.debug()	Enter an interactive mode
debug.getfenv()	Get an object’s environment
debug.gethook()	Get a thread’s current hook settings
debug.getinfo()	Get information about a function
debug.getlocal()	Get a local variable’s name and value
debug.getmetatable()	Get an object’s metatable
debug.getregistry()	Get the registry table
debug.getupvalue()	Get an upvalue’s name and value
debug.setfenv()	Set an object’s environment
debug.sethook()	Set a given function as a hook
debug.setlocal()	Assign a value to a local variable
debug.setmetatable()	Set an object’s metatable
debug.setupvalue()	Assign a value to an upvalue
debug.sourcedir()	Get the source directory name
debug.sourcefile()	Get the source file name
debug.traceback()	Get a traceback of the call stack

debug.debug()¶

Enters an interactive mode and runs each string that the user types in. The user can, among other things, inspect global and local variables, change their values and evaluate expressions.

Enter cont to exit this function, so that the caller can continue its execution.

Note

Commands for debug.debug() are not lexically nested within any function and so have no direct access to local variables.

debug.getfenv(object)¶

Parameters:	object – object to get the environment of
Return:	the environment of the `object`

debug.gethook([thread])¶

Return:

the current hook settings of the thread as three values:

the current hook function
the current hook mask
the current hook count as set by the debug.sethook() function

debug.getinfo([thread, ]function[, what])¶

Parameters:

function – function to get information on
what (`string`) – what information on the `function` to return
Return:	a table with information about the `function`

You can pass in a function directly, or you can give a number that specifies a function running at level function of the call stack of the given thread: level 0 is the current function (getinfo() itself), level 1 is the function that called getinfo(), and so on. If function is a number larger than the number of active functions, getinfo() returns nil.

The default for what is to get all information available, except the table of valid lines. If present, the option f adds a field named func with the function itself. If present, the option L adds a field named activelines with the table of valid lines.
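A short sketch of both calling styles, by stack level and by function value. The helper name where is an assumption for illustration; the option letters S and l come from the standard Lua debug library.

```lua
-- Hypothetical helper: reports the location it was called from.
local function where()
    -- Level 1 is where() itself, level 2 is its caller.
    -- 'S' requests the source fields, 'l' requests the current line.
    local info = debug.getinfo(2, 'Sl')
    return info.short_src .. ':' .. info.currentline
end

local location = where()
print(location)

-- A function value can be passed instead of a stack level.
local print_info = debug.getinfo(print, 'S')
print(print_info.what)            -- C

-- A level beyond the active call stack yields nil.
local too_deep = debug.getinfo(1000)
print(too_deep)                   -- nil
```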

debug.getlocal([thread, ]level, local)¶

Parameters:

level (`number`) – level of the stack
local (`number`) – index of the local variable
Return:	the name and the value of the local variable with the index `local` of the function at level `level` of the stack or `nil` if there is no local variable with the given index; raises an error if `level` is out of range

Note

You can call debug.getinfo() to check whether the level is valid.
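Local variables are usually enumerated by increasing the index until debug.getlocal() returns nil. A minimal sketch, with a hypothetical dump_locals function and made-up variables:

```lua
-- Hypothetical example: collect the named locals of the running function.
local function dump_locals()
    local x, y = 10, 'hello'
    local found = {}
    local i = 1
    while true do
        -- Level 1 is the function that calls debug.getlocal().
        local name, value = debug.getlocal(1, i)
        if not name then
            break
        end
        -- Skip internal slots, whose names start with '('.
        if name:sub(1, 1) ~= '(' then
            found[name] = value
        end
        i = i + 1
    end
    return found
end

local locals = dump_locals()
print(locals.x, locals.y)         -- 10    hello
```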

debug.getmetatable(object)¶

Parameters:	object – object to get the metatable of
Return:	a metatable of the `object` or `nil` if it does not have a metatable

debug.getregistry()¶

Return:	the registry table

debug.getupvalue(func, up)¶

Parameters:

func (`function`) – function to get the upvalue of
up (`number`) – index of the function upvalue
Return:	the name and the value of the upvalue with the index `up` of the function `func` or `nil` if there is no upvalue with the given index

debug.setfenv(object, table)¶

Sets the environment of the object to the table.

Parameters:

object – object to change the environment of
table (`table`) – table to set the object environment to
Return:	the `object`

debug.sethook([thread, ]hook, mask[, count])¶

Sets the given function as a hook. When called without arguments, turns the hook off.

Parameters:

hook (function) – function to set as a hook
mask (string) –
describes when the hook will be called; may have the following values:
- c - the hook is called every time Lua calls a function
- r - the hook is called every time Lua returns from a function
- l - the hook is called every time Lua enters a new line of code
count (number) – describes when the hook will be called; when different from zero, the hook is called after every count instructions.
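As an illustration, the sketch below installs a call hook, runs a small function a few times, and then removes the hook. The names on_call and work are made up for the example. The exact number of hook invocations depends on the interpreter (for instance, code compiled by the JIT may bypass hooks), so no specific count is claimed.

```lua
local calls = 0

-- The hook itself: count every function call that Lua reports.
local function on_call()
    calls = calls + 1
end

local function work()
    return 1
end

debug.sethook(on_call, 'c')       -- 'c' = call the hook on every function call
for _ = 1, 3 do
    work()
end
debug.sethook()                   -- no arguments: turn the hook off

print(calls)
```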

debug.setlocal([thread, ]level, local, value)¶

Assigns the value value to the local variable with the index local of the function at level level of the stack.

Parameters:

level (`number`) – level of the stack
local (`number`) – index of the local variable
value – value to assign to the local variable
Return:	the name of the local variable or `nil` if there is no local variable with the given index; raises an error if `level` is out of range

Note

You can call debug.getinfo() to check whether the level is valid.

debug.setmetatable(object, table)¶

Sets the metatable of the object to the table.

Parameters:

object – object to change the metatable of
table (`table`) – table to set the object metatable to

debug.setupvalue(func, up, value)¶

Assigns the value value to the upvalue with the index up of the function func.

Parameters:

func (`function`) – function to set the upvalue of
up (`number`) – index of the function upvalue
value – value to assign to the function upvalue
Return:	the name of the upvalue or `nil` if there is no upvalue with the given index
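A minimal sketch of reading and overwriting an upvalue with debug.getupvalue() and debug.setupvalue(). The closure make_counter is a made-up example.

```lua
-- A closure whose only upvalue is the local variable `count`.
local function make_counter()
    local count = 0
    return function()
        count = count + 1
        return count
    end
end

local counter = make_counter()
counter()
counter()

-- Read the first upvalue: its name and its current value.
local name, value = debug.getupvalue(counter, 1)
print(name, value)                -- count   2

-- Overwrite the upvalue; the closure continues from the new value.
debug.setupvalue(counter, 1, 10)
local next_value = counter()
print(next_value)                 -- 11
```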

debug.sourcedir([level])¶

Parameters:	level (`number`) – the level of the call stack which should contain the path (default is 2)
Return:	a string with the relative path to the source file directory

Instead of debug.sourcedir() one can say debug.__dir__ which means the same thing.

Determining the real path to a directory is only possible if the function was defined in a Lua file (this restriction may not apply for loadstring() since Lua will store the entire string in debug info).

If debug.sourcedir() is part of a return argument, then it should be inside parentheses: return (debug.sourcedir()).

debug.sourcefile([level])¶

Parameters:	level (`number`) – the level of the call stack which should contain the path (default is 2)
Return:	a string with the relative path to the source file

Instead of debug.sourcefile() one can say debug.__file__ which means the same thing.

Determining the real path to a file is only possible if the function was defined in a Lua file (this restriction may not apply to loadstring() since Lua will store the entire string in debug info).

If debug.sourcefile() is part of a return argument, then it should be inside parentheses: return (debug.sourcefile()).

debug.traceback([thread, ][message][, level])¶

Parameters:

message (`string`) – an optional message prepended to the traceback
level (`number`) – specifies at which level to start the traceback (default is 1)
Return:	a string with a traceback of the call stack
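A common use, sketched here with a made-up risky function, is to pass debug.traceback as the message handler of xpcall, so that a caught error carries the call stack from the point of failure:

```lua
local function risky()
    error('something went wrong')
end

-- xpcall calls debug.traceback with the error message before the stack
-- unwinds, so the report contains both the message and the call stack.
local ok, report = xpcall(risky, debug.traceback)
print(ok)                         -- false
print(report)
```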

Debug example:

Make a file in the /tmp directory named example.lua, containing:

function w()
  print(debug.sourcedir())
  print(debug.sourcefile())
  print(debug.traceback())
  print(debug.getinfo(1)['currentline'])
end
w()

Execute tarantool /tmp/example.lua. Expect to see this:

/tmp
/tmp/example.lua
stack traceback:
    /tmp/example.lua:4: in function 'w'
    /tmp/example.lua:7: in main chunk
5

JSON paths

Overview

Since version 2.3, Tarantool supports JSON path updates. You can update or upsert formatted tuple / space / index fields by name (not only by field number). Updates of nested structures are also supported.

Example:

tarantool> box.cfg{};
         > format = {};
         > format[1] = {'field1', 'unsigned'};
         > format[2] = {'field2', 'map'};
         > format[3] = {'field3', 'array'};
         > format[4] = {'field4', 'string', is_nullable = true}
---
...
tarantool> s = box.schema.create_space('test', {format = format});
         > _ = s:create_index('pk')
---
...
tarantool> t = {
         >     1,
         >     {
         >         key1 = 'value',
         >         key2 = 10
         >     },
         >     {
         >         2,
         >         3,
         >         {key3 = 20}
         >     }
         > }
---
...
tarantool> t = s:replace(t)
---
...
tarantool> t:update({{'=', 'field2.key1', 'new_value'}})
---
- [1, {'key1': 'new_value', 'key2': 10}, [2, 3, {'key3': 20}]]
...
tarantool> t:update({{'+', 'field3[2]', 1}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 4, {'key3': 20}]]
...
tarantool> s:update({1}, {{'!', 'field4', 'inserted value'}})
---
- [1, {'key1': 'value', 'key2': 10}, [2, 3, {'key3': 20}], 'inserted value']
...
tarantool> s:update({1}, {{'#', '[2].key2', 1}, {'=', '[3][3].key4', 'value4'}})
---
- [1, {'key1': 'value'}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...
tarantool> s:upsert({1, {k = 'v'}, {}}, {{'#', '[2].key1', 1}})
---
...
tarantool> s:select{}
---
- - [1, {}, [2, 3, {'key3': 20, 'key4': 'value4'}], 'inserted value']
...

Notice that field names that look like JSON paths are processed similarly to accessing tuple fields by JSON: first, the whole path is interpreted as a field name; if such a name does not exist, then it is treated as a path.

For example, for a field name field.name.like.json, this update

object-name:update(..., 'field.name.like.json', ...)

will update this field instead of keys field -> name -> like -> json. If you need such a name as part of a bigger path, then you should wrap it in quotes "" and brackets []:

object-name:update(..., '["field.name.like.json"].next.fields', ...)

There are some rules for JSON updates:

Operation '!' can’t be used to create all intermediate nodes of a path. For example, {'!', 'field1[1].field3', ...} can’t create fields 'field1' and '[1]', they should exist.
Operation '#', when applied to maps, can’t delete more than one key at once. That is, its argument should be always 1 for maps.

{'#', 'field1.field2', 1} is allowed;

{'#', 'field1.field2', 10} is not.

This limitation exists because keys in a map have no defined order, so '#' with more than one key would lead to undefined behavior.
Operation '!' on maps can’t create a key, if it exists already.
If a map contains non-string keys (booleans, numbers, maps, arrays - anything), then these keys can’t be updated via JSON paths. But it is still allowed to update string keys in such a map.
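The first two rules can be tried out with pcall. This is a sketch under assumptions: it needs a Tarantool instance of version 2.3 or later, and the space name json_rules_demo and its format are made up for the example. The expected outcomes in the comments follow from the rules above, not from a recorded run.

```lua
box.cfg{}

-- A hypothetical temporary space with one map field.
local s = box.schema.space.create('json_rules_demo', {
    temporary = true,
    if_not_exists = true,
    format = {{'id', 'unsigned'}, {'data', 'map'}},
})
s:create_index('pk', {if_not_exists = true})
s:replace({1, {a = 1, b = 2}})

-- Apply the operations to the tuple with key 1 and report the outcome.
local function try(ops)
    local ok, err = pcall(s.update, s, {1}, ops)
    print(ok, ok and '' or tostring(err))
    return ok
end

local ok_single = try({{'#', 'data.a', 1}})     -- one map key: allowed
local ok_many = try({{'#', 'data.b', 10}})      -- more than one key: rejected
local ok_missing = try({{'!', 'data.x.y', 1}})  -- 'x' does not exist: rejected
```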

JSON updates should be preferred when only a part of a tuple needs to be updated, for the following reasons:

They consume less space in WAL, because for an update only its keys, operations, and arguments are stored. It is cheaper to store an update of one deep field than of the whole tuple.
They are faster. Firstly, this is because they are implemented in C, and have no problems with Lua GC and dynamic typing. Secondly, some cases of JSON paths are highly optimized. For example, an update with a single JSON path costs O(1) memory regardless of how deep that path goes (not counting update arguments).
They are available from remote clients, as well as any other DML. Before JSON updates became available in Tarantool, to update one deep part of a tuple, it was necessary to download that tuple, update it in memory, and send it back – 2 network hops. With JSON paths, it can be 1 hop when the update can be described in paths.

Rocks reference

This reference covers third-party Lua modules for Tarantool.

For Tarantool Enterprise modules, see the Tarantool EE documentation.

Module membership

This module is a membership library for Tarantool based on a gossip protocol.

This library builds a mesh from multiple Tarantool instances. The mesh monitors itself, helps members discover everyone else in the group and get notified about their status changes with low latency. It is built upon the ideas from Consul or, more precisely, the SWIM algorithm.

The membership module works over UDP protocol and can operate even before the box.cfg initialization.

Member data structure

A member is represented by the table with the following fields:

uri (string) is a Uniform Resource Identifier.
status (string) is a string that takes one of the values below.
- alive: a member that replies to ping-messages is alive and well.
- suspect: if any member in the group cannot get a reply from any other member, the first member asks three other alive members to send a ping-message to the member in question. If there is no response, the latter becomes a suspect.
- dead: a suspect becomes dead after a timeout.
- left: a member gets the left status after executing the leave() function.
  
  Note
  
  The gossip protocol guarantees that every member in the group becomes aware of any status change in two communication cycles.
incarnation (number) is a value incremented every time the instance becomes suspect or dead, or updates its payload.
payload (table) is auxiliary data that can be used by various modules.
timestamp (number) is a value of fiber.time64() which:
- corresponds to the last update of status or incarnation;
- is always local;
- does not depend on other members’ clock setting.

Below is an example of the table:

tarantool> membership.myself()
---
uri: localhost:33001
status: alive
incarnation: 1
payload:
    uuid: 2d00c500-2570-4019-bfcc-ab25e5096b73
timestamp: 1522427330993752
...

API reference

Below is a list of membership’s common, encryption, subscription functions, and options.

Name	Use
Common functions
init(advertise_host, port)	Initialize the `membership` module.
myself()	Get the member data structure of the current instance.
get_member(uri)	Get the member data structure for a given URI.
members()	Obtain a table with all members known to the current instance.
pairs()	Shorthand for `pairs(membership.members())`.
add_member(uri)	Add a member to the group.
probe_uri(uri)	Check if the member is in the group.
broadcast()	Discover members in LAN by sending a UDP broadcast message.
set_payload(key, value)	Update `myself().payload` and disseminate it.
leave()	Gracefully leave the group.
is_encrypted()	Check if encryption is enabled.
Encryption functions
set_encryption_key(key)	Set the key for low-level message encryption.
get_encryption_key()	Retrieve the encryption key in use.
Subscription functions
subscribe()	Subscribe for the members table updates.
unsubscribe()	Remove the subscription.
Options
PROTOCOL_PERIOD_SECONDS	Direct ping period.
ACK_TIMEOUT_SECONDS	ACK message wait time.
ANTI_ENTROPY_PERIOD_SECONDS	Anti-entropy synchronization period.
SUSPECT_TIMEOUT_SECONDS	Timeout to mark a `suspect` `dead`.
NUM_FAILURE_DETECTION_SUBGROUPS	Number of members to ping a `suspect` indirectly.

Common functions:

membership.init(advertise_host, port)¶

Initialize the membership module. This binds a UDP socket to 0.0.0.0:<port>, sets the advertise_uri parameter to <advertise_host>:<port>, and incarnation to 1.

The init() function can be called several times; each call closes the old socket and opens a new one.

If the advertise_uri changes during the next init(), the old URI is considered DEAD. In order to leave the group gracefully, use the leave() function.

Parameters:

advertise_host (`string`) – a hostname or IP address to advertise to other members
port (`number`) – a UDP port to bind
Return:	`true`
Rtype:	boolean
Raises:	socket bind error

membership.myself()¶

Return:	the member data structure of the current instance.
Rtype:	table

membership.get_member(uri)¶

Parameters:	uri (`string`) – the given member’s `advertise_uri`
Return:	the member data structure of the instance with the given URI.
Rtype:	table

membership.members()¶

Obtain all members known to the current instance.

Editing this table has no effect.

Return:	a table with URIs as keys and corresponding member data structures as values.
Rtype:	table

membership.pairs()¶

A shorthand for pairs(membership.members()).

Return:	Lua iterator

It can be used in the following way:

for uri, member in membership.pairs() do
  -- do something
end

membership.add_member(uri)¶

Add a member with the given URI to the group and propagate this event to other members. Adding a member to a single instance is enough as everybody else in the group will receive the update with time. It does not matter who adds whom.

Parameters:	uri (`string`) – the `advertise_uri` of the member to add
Return:	`true` or `nil` in case of an error
Rtype:	boolean
Raises:	parse error if the URI cannot be parsed

membership.probe_uri(uri)¶

Send a message to a member to make sure it is in the group. If the member is alive but not in the group, it is added. If it already is in the group, nothing happens.

Parameters:	uri (`string`) – the `advertise_uri` of the member to ping
Return:	`true` if the member responds within 0.2 seconds, otherwise `no response`
Rtype:	boolean
Raises:	`ping was not sent` if the hostname could not be resolved

membership.broadcast()¶

Discover members in local network by sending a UDP broadcast message to all networks discovered by a getifaddrs() C call.

Return:	`true` if broadcast was sent, `false` if `getaddrinfo()` fails.
Rtype:	boolean

membership.set_payload(key, value)¶

Update myself().payload and disseminate it along with the member status.

Increments incarnation.

Parameters:

key (`string`) – a key to set in payload table
value – auxiliary data
Return:	`true`
Rtype:	boolean

membership.leave()¶

Gracefully leave the membership group. The node will be marked with the left status and no other member will ever try to reconnect to it.

Return:	`true`
Rtype:	boolean

membership.is_encrypted()¶

Return:	`true` if encryption is enabled, `false` otherwise.
Rtype:	boolean
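The common functions might be combined as follows. This is only a sketch: it assumes that the membership rock is installed, and the host name, ports, and payload key are hypothetical.

```lua
local membership = require('membership')

-- Bind UDP port 33001 and advertise this instance as localhost:33001.
membership.init('localhost', 33001)

-- Adding one known member is enough; the rest of the group is
-- discovered through gossip over time.
membership.add_member('localhost:33002')

-- Attach auxiliary data to this instance; this increments incarnation.
membership.set_payload('role', 'storage')

-- List every member known so far.
for uri, member in membership.pairs() do
    print(uri, member.status, member.incarnation)
end
```

Calling membership.leave() before shutdown marks the instance as left rather than letting it time out as dead.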

Encryption functions:

membership.set_encryption_key(key)¶

Set the key used for low-level message encryption. The key is either trimmed or padded automatically to be exactly 32 bytes. If the key value is nil, the encryption is disabled.

The encryption is handled by the crypto.cipher.aes256.cbc Tarantool module.

For proper communication, all members must be configured to use the same encryption key. Otherwise, members report either dead or non-decryptable in their status.

Parameters:	key (`string`) – encryption key
Return:	`nil`.

membership.get_encryption_key()¶

Retrieve the encryption key that is currently in use.

Return:	encryption key or `nil` if the encryption is disabled.
Rtype:	string

Subscription functions:

membership.subscribe()¶

Subscribe for updates in the members table.

Return:	a `fiber.cond` object that is broadcast whenever the members table changes.
Rtype:	object

membership.unsubscribe(cond)¶

Remove subscription on cond obtained by the subscribe() function.

The cond’s validity is not checked.

Parameters:	cond – the `fiber.cond` object obtained from subscribe()
Return:	`nil`.
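A sketch of how a subscription might be consumed in a background fiber. It assumes that membership.init() has already been called; the one-second timeout and the limit of three rounds are arbitrary choices for the example.

```lua
local fiber = require('fiber')
local membership = require('membership')

local cond = membership.subscribe()

local watcher = fiber.create(function()
    for _ = 1, 3 do
        -- wait() returns true if the condition was broadcast
        -- and false if the timeout expired first.
        if cond:wait(1) then
            for uri, member in membership.pairs() do
                print(uri, member.status)
            end
        end
    end
    -- Release the subscription once it is no longer needed.
    membership.unsubscribe(cond)
end)
```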

Below is a list of membership options. They can be set as follows:

options = require('membership.options')
options.<option> = <value>

options.PROTOCOL_PERIOD_SECONDS¶: Period of sending direct pings. Denoted as T' in the SWIM protocol.

options.ACK_TIMEOUT_SECONDS¶: Time to wait for ACK message after a ping. If a member is late to reply, the indirect ping algorithm is invoked.

options.ANTI_ENTROPY_PERIOD_SECONDS¶: Period to perform the anti-entropy synchronization algorithm of the SWIM protocol.

options.SUSPECT_TIMEOUT_SECONDS¶: Timeout to mark suspect members as dead.

options.NUM_FAILURE_DETECTION_SUBGROUPS¶: Number of members to try pinging a suspect indirectly. Denoted as k in the SWIM protocol.

Luatest

For more about the Luatest API, see below.

Overview

A tool for testing Tarantool applications.

Highlights:

executable to run tests in directory or specific files,
before/after suite hooks,
before/after test group hooks,
output capturing,
helpers for testing tarantool applications,
luacov integration.

Requirements

Tarantool (it requires tarantool-specific fio module and ffi from LuaJIT).

Installation

tt rocks install luatest
.rocks/bin/luatest --help # list available options

Usage

Define tests.

-- test/feature_test.lua
local t = require('luatest')
local g = t.group('feature')
-- Default name is inferred from caller filename when possible.
-- For `test/a/b/c_d_test.lua` it will be `a.b.c_d`.
-- So `local g = t.group()` works the same way.

-- Tests. All properties with names starting with `test` are treated as test cases.
g.test_example_1 = function() ... end
g.test_example_n = function() ... end

-- Define suite hooks
t.before_suite(function() ... end)
t.before_suite(function() ... end)

-- Hooks to run once for tests group
g.before_all(function() ... end)
g.after_all(function() ... end)

-- Hooks to run for each test in group
g.before_each(function() ... end)
g.after_each(function() ... end)

-- Hooks to run for a specified test in group
g.before_test('test_example_1', function() ... end)
g.after_test('test_example_2', function() ... end)
-- before_test runs after before_each
-- after_test runs before after_each

-- test/other_test.lua
local t = require('luatest')
local g = t.group('other')
-- ...
g.test_example_2 = function() ... end
g.test_example_m = function() ... end

-- Define parametrized groups
local pg = t.group('pgroup', {{engine = 'memtx'}, {engine = 'vinyl'}})
pg.test_example_3 = function(cg)
    -- Use cg.params here
    box.schema.space.create('test', {
        engine = cg.params.engine,
    })
end

-- Hooks can be specified for one parameter
pg.before_all({engine = 'memtx'}, function() ... end)
pg.before_each({engine = 'memtx'}, function() ... end)
pg.before_test('test_example_3', {engine = 'vinyl'}, function() ... end)

Run tests from a path.

luatest                               # run all tests from the ./test directory
luatest test/integration              # run all tests from the specified directory
luatest test/feature_test.lua         # run all tests from the specified file

Run tests from a group.

luatest feature                       # run all tests from the specified group
luatest other.test_example_2          # run one test from the specified group
luatest feature other.test_example_2  # run tests by group and test name

Note that luatest recognizes an input parameter as a path only if it contains /; otherwise, it is treated as a group name.

luatest feature                       # considered as a group name
luatest ./feature                     # considered as a path
luatest feature/                      # considered as a path

You can also use the -p option in combination with the examples above to run tests matching a name pattern.

luatest feature -p test_example       # run all tests from the specified group that match the specified pattern

Luatest automatically requires test/helper.lua file if it’s present. You can configure luatest or run any bootstrap code there.

See the getting-started example in cartridge-cli repo.

Tests order

Use the --shuffle option to tell luatest how to order the tests. The available ordering schemes are group, all and none.

group shuffles tests within the groups.

all randomizes execution order across all available tests. Be careful: before_all/after_all hooks run every time the test group changes, so they may run multiple times.

none is the default, which executes tests within a group in the order they are defined (in effect, by the line numbers of the test functions).

With group and all you can also specify a seed to reproduce a specific order.

--shuffle none
--shuffle group
--shuffle all --seed 123
--shuffle all:123 # same as above

To change the default order, use:

-- test/helper.lua
local t = require('luatest')
t.configure({shuffle = 'group'})

Preloaded hooks

Preloaded hooks extend base hooks. They behave like the pytest fixture with the autouse parameter.

If you run the following test:

Then the hooks are executed in the following sequence:

List of luatest functions

Assertions
`assert (value[, message])`	Check that value is truthy.
`assert_almost_equals (actual, expected, margin[, message])`	Check that two floats are close by margin.
`assert_covers (actual, expected[, message])`	Checks that actual map includes expected one.
`assert_lt (left, right[, message])`	Compare numbers.
`assert_le (left, right[, message])`
`assert_gt (left, right[, message])`
`assert_ge (left, right[, message])`
`assert_equals (actual, expected[, message[, deep_analysis]])`	Check that two values are equal.
`assert_error (fn, ...)`	Check that calling fn raises an error.
`assert_error_msg_contains (expected_partial, fn, ...)`
`assert_error_msg_content_equals (expected, fn, ...)`	Strips location info from message text.
`assert_error_msg_equals (expected, fn, ...)`	Checks full error: location and text.
`assert_error_msg_matches (pattern, fn, ...)`
`assert_error_covers (expected, fn, ...)`	Checks that actual error map includes expected one.
`assert_eval_to_false (value[, message])`	Alias for assert_not.
`assert_eval_to_true (value[, message])`	Alias for assert.
`assert_items_exclude (actual, expected[, message])`	Checks that one table does not include any items of another, irrespective of their keys.
`assert_items_include (actual, expected[, message])`	Checks that one table includes all items of another, irrespective of their keys.
`assert_is (actual, expected[, message])`	Check that values are the same.
`assert_is_not (actual, expected[, message])`	Check that values are not the same.
`assert_items_equals (actual, expected[, message])`	Checks that two tables contain the same items, irrespective of their keys.
`assert_nan (value[, message])`
`assert_not (value[, message])`	Check that value is falsy.
`assert_not_almost_equals (actual, expected, margin[, message])`	Check that two floats are not close by margin
`assert_not_covers (actual, expected[, message])`	Checks that map does not contain the other one.
`assert_not_equals (actual, expected[, message])`	Check that two values are not equal.
`assert_not_nan (value[, message])`
`assert_not_str_contains (actual, expected[, is_pattern[, message]])`	Case-sensitive strings comparison.
`assert_not_str_icontains (value, expected[, message])`	Case-insensitive strings comparison.
`assert_str_contains (value, expected[, is_pattern[, message]])`	Case-sensitive strings comparison.
`assert_str_icontains (value, expected[, message])`	Case-insensitive strings comparison.
`assert_str_matches (value, pattern[, start=1[, final=value:len() [, message]]])`	Verify a full match for the string.
`assert_type (value, expected_type[, message])`	Check value’s type.
Flow control
`fail (message)`	Stops a test due to a failure.
`fail_if (condition, message)`	Stops a test due to a failure if condition is met.
`xfail (message)`	Mark test as xfail.
`xfail_if (condition, message)`	Mark test as xfail if condition is met.
`skip (message)`	Skip a running test.
`skip_if (condition, message)`	Skip a running test if condition is met.
`success ()`	Stops a test with a success.
`success_if (condition)`	Stops a test with a success if condition is met.
Suite and groups
`after_suite (fn)`	Add after suite hook.
`before_suite (fn)`	Add before suite hook.
`group (name)`	Create group of tests.

XFail

The xfail mark inverts the interpretation of test results: a test is treated as passed when an assertion fails, and it fails if no errors are raised. It allows one to mark a test as temporarily broken due to a bug in some other component which can’t be fixed immediately. It’s also a good practice to keep xfail tests in sync with an issue tracker.

local g = t.group()
g.test_fail = function()
    t.xfail('Must fail no matter what')
    t.assert_equals(3, 4)
end

XFail only applies to the errors raised by the luatest assertions. Regular Lua errors still cause the test failure.
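
The conditional variant from the function list, `xfail_if`, can be sketched in the same way. The `bug_is_open` flag and the message are placeholders; in a real suite the condition would come from whatever identifies the affected environment.

```lua
local t = require('luatest')

-- Hypothetical group name for this sketch.
local g = t.group('xfail_example')

g.test_known_bug = function()
    -- Placeholder condition: set to false once the tracked bug is fixed.
    local bug_is_open = true
    t.xfail_if(bug_is_open, 'tracked in the issue tracker (placeholder)')
    -- Fails while the bug is open, so the test is reported as xfail.
    t.assert_equals(1 + 1, 3)
end
```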

Capturing output

By default, the runner captures all stdout/stderr output and shows it only for failed tests. Capturing can be disabled with the -c flag.

Tests repeating

Runners can repeat tests with flags -r / --repeat (to repeat all the tests) or -R / --repeat-group (to repeat all the tests within the group).

Parametrization

A test group can be parametrized.

local g = t.group('pgroup', {{a = 1, b = 4}, {a = 2, b = 3}})

g.test_params = function(cg)
    ...
    log.info('a = %s', cg.params.a)
    log.info('b = %s', cg.params.b)
    ...
end

A group can be parametrized with a matrix of parameters using luatest.helpers:

local g = t.group('pgroup', t.helpers.matrix({a = {1, 2}, b = {3, 4}}))
-- Will run:
-- * a = 1, b = 3
-- * a = 1, b = 4
-- * a = 2, b = 3
-- * a = 2, b = 4

Each test runs once for every combination of params. Hooks work as usual unless params are specified for them, in which case they run only for the matching combinations. The order of execution in the hook group is determined by the order of declaration.

-- called before every test
g.before_each(function(cg) ... end)

-- called before tests when a == 1
g.before_each({a = 1}, function(cg) ... end)

-- called only before the test when a == 1 and b == 3
g.before_each({a = 1, b = 3}, function(cg) ... end)

-- called before test named 'test_something' when a == 1
g.before_test('test_something', {a = 1}, function(cg) ... end)

--etc

A test from a parametrized group can be run from the command line as follows:

luatest pgroup.a:1.b:4.test_params
luatest pgroup.a:2.b:3.test_params

Note that the values for a and b have to match the params defined for the group. The command below produces an error because no such params are defined for the group.

luatest pgroup.a:2.b:2.test_params  # will raise an error

Test helpers

There are helpers to run Tarantool applications and perform basic interaction with them. If an application follows configuration conventions, the same options can configure both the server instance and the helpers. For example, http_port is used to perform HTTP requests in tests and is passed to the server process as TARANTOOL_HTTP_PORT.

local server = luatest.Server:new({
    command = '/path/to/executable.lua',
    -- arguments for process
    args = {'--no-bugs', '--fast'},
    -- additional envars to pass to process
    env = {SOME_FIELD = 'value'},
    -- passed as TARANTOOL_WORKDIR
    workdir = '/path/to/test/workdir',
    -- passed as TARANTOOL_HTTP_PORT, used in http_request
    http_port = 8080,
    -- passed as TARANTOOL_LISTEN, used in connect_net_box
    net_box_port = 3030,
    -- passed to net_box.connect in connect_net_box
    net_box_credentials = {user = 'username', password = 'secret'},
})
server:start()
-- Wait until server is ready to accept connections.
-- This may vary from app to app: for one server:connect_net_box() is enough,
-- for another more complex checks are required.
luatest.helpers.retrying({}, function() server:http_request('get', '/ping') end)

-- http requests
server:http_request('get', '/path')
server:http_request('post', '/path', {body = 'text'})
server:http_request('post', '/path', {json = {field = value}, http = {
    -- http client options
    headers = {Authorization = 'Basic ' .. credentials},
    timeout = 1,
}})

-- This method throws an error when the response status is outside of the range 200..299.
-- To change this behaviour, pass `raise = false`:
t.assert_equals(server:http_request('get', '/not_found', {raise = false}).status, 404)
t.assert_error(function() server:http_request('get', '/not_found') end)

-- using net_box
server:connect_net_box()
server:eval('return do_something(...)', {arg1, arg2})
server:call('function_name', {arg1, arg2})
server:exec(function() return box.info() end)
server:stop()

luatest.Process:start(path, args, env) provides a low-level interface to run any other application.

luatest.cluster runs a declarative configuration as a set of Tarantool instances. By default clusters are registered inside the current test group and cleaned up with preloaded hooks (auto_cleanup = true). When you need to reuse the same cluster across multiple tests or keep several clusters running at once, pass auto_cleanup = false and manage the lifecycle manually:

local t = require('luatest')
local cluster = require('luatest.cluster')
local cbuilder = require('luatest.cbuilder')

local g = t.group('shared')

g.before_all(function()
    local config = cbuilder:new()
        :use_group('g-1')
        :use_replicaset('rs-1')
        :add_instance('instance-1', {})
        :config()

    g.cluster = cluster:new(config, {}, {auto_cleanup = false})
    g.cluster:start()
end)

g.after_all(function()
    g.cluster:drop()
end)

g.test_reuses_cluster_between_cases = function()
    t.assert_not_equals(g.cluster['instance-1'].process, nil)
end

There are several small helpers for common actions:

luatest.helpers.uuid('ab', 2, 1) == 'abababab-0002-0000-0000-000000000001'

luatest.helpers.retrying({timeout = 1, delay = 0.1}, failing_function, arg1, arg2)
-- wait until server is up
luatest.helpers.retrying({}, function() server:http_request('get', '/status') end)

luacov integration

Install luacov with tt rocks install luacov
Configure it with .luacov file
Clean old reports rm -f luacov.*.out*
Run luatest with --coverage option
Generate report with .rocks/bin/luacov .
Show summary with grep -A999 '^Summary' luacov.report.out

When running integration tests with coverage collector enabled, luatest automatically starts new tarantool instances with luacov enabled. So coverage is collected from all the instances. However this has some limitations:

It works only for instances started with Server helper.
Process command should be executable lua file or tarantool with script argument.
Instance must be stopped with server:stop(), because this is the point where stats are saved.
Don’t save stats concurrently to prevent corruption.

Development

Check out the repo.
Prepare makefile with cmake ..
Install dependencies with make bootstrap.
Run it with make lint before committing changes.
Run tests with bin/luatest.

Contributing

Bug reports and pull requests are welcome at https://github.com/tarantool/luatest.

License

MIT

API

Module luatest.helpers

Collection of test helpers.

Functions

matrix (parameters_values)

Return all combinations of parameters. Accepts params’ names and all of their possible values.

helpers.matrix({a = {1, 2}, b = {3, 4}})

{
  {a = 1, b = 3},
  {a = 2, b = 3},
  {a = 1, b = 4},
  {a = 2, b = 4},
}

Parameters:

parameters_values: (tab)
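
To make the notion of "all combinations" concrete, here is a plain-Lua sketch of a Cartesian product over named value lists. It is not luatest's implementation: `matrix_sketch` is a hypothetical stand-in, and it sorts the parameter names so that its output order is predictable, which the real helper may not do.

```lua
-- Hypothetical stand-in for helpers.matrix, written in plain Lua.
local function matrix_sketch(parameters_values)
    -- Sort names so the result does not depend on pairs() iteration order.
    local names = {}
    for name in pairs(parameters_values) do
        names[#names + 1] = name
    end
    table.sort(names)

    -- Start with one empty combination and extend it name by name.
    local combinations = {{}}
    for _, name in ipairs(names) do
        local extended = {}
        for _, combination in ipairs(combinations) do
            for _, value in ipairs(parameters_values[name]) do
                local copy = {}
                for k, v in pairs(combination) do
                    copy[k] = v
                end
                copy[name] = value
                extended[#extended + 1] = copy
            end
        end
        combinations = extended
    end
    return combinations
end

local result = matrix_sketch({a = {1, 2}, b = {3, 4}})
print(#result)
```

The number of combinations is the product of the list lengths, so two parameters with two values each give four parameter sets.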

retrying (config, fn, …)

Keep calling fn until it returns without error. Throws last error if config.timeout is elapsed. Default options are taken from helpers.RETRYING_TIMEOUT and helpers.RETRYING_DELAY.

helpers.retrying({}, fn, arg1, arg2)
helpers.retrying({timeout = 2, delay = 0.5}, fn, arg1, arg2)

Parameters:

config:
- timeout: (number)
- delay: (number)
fn: (func)
…: args
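
A small sketch of how extra arguments are forwarded to fn, assuming a function that fails on its first two calls. The group name and the attempt counter are made up for the illustration.

```lua
local t = require('luatest')

-- Hypothetical group name for this sketch.
local g = t.group('retrying_example')

g.test_eventually_succeeds = function()
    local attempts = 0
    -- `3` is passed to the callback as `required` on every call.
    t.helpers.retrying({timeout = 1, delay = 0.01}, function(required)
        attempts = attempts + 1
        if attempts < required then
            error('not ready yet')
        end
    end, 3)
    -- Two failed calls followed by one successful call.
    t.assert_equals(attempts, 3)
end
```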

uuid (a, …)

Generates uuids from its 5 parts. Strings are repeated and numbers are padded to match required part length. If number of arguments is less than 5 then first and last arguments are used for corresponding parts, missing parts are set to 0.

'aaaaaaaa-0000-0000-0000-000000000000' == uuid('a')
'abababab-0000-0000-0000-000000000001' == uuid('ab', 1)
'00000001-0002-0000-0000-000000000003' == uuid(1, 2, 3)
'11111111-2222-0000-0000-333333333333' == uuid('1', '2', '3')
'12121212-3434-5656-7878-909090909090' == uuid('12', '34', '56', '78', '90')

Parameters:

a: first part
…: parts
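
The padding and repetition rules can be sketched in plain Lua. The function below is a hypothetical re-implementation written only from the description and examples above, not the helper's actual source; `uuid_sketch` and `format_part` are names invented for this illustration.

```lua
-- Lengths of the five UUID groups: 8-4-4-4-12.
local PART_LENGTHS = {8, 4, 4, 4, 12}

local function format_part(value, length)
    if type(value) == 'number' then
        -- Numbers are left-padded with zeros.
        return string.format('%0' .. length .. 'd', value)
    end
    -- Strings are repeated until the part is full.
    return value:rep(math.ceil(length / #value)):sub(1, length)
end

-- Hypothetical stand-in for helpers.uuid.
local function uuid_sketch(...)
    local count = select('#', ...)
    local args = {...}
    local parts = {0, 0, 0, 0, 0}
    if count >= 5 then
        for i = 1, 5 do
            parts[i] = args[i]
        end
    elseif count > 0 then
        -- Leading arguments fill the leading parts.
        for i = 1, count - 1 do
            parts[i] = args[i]
        end
        -- A single argument is the first part; otherwise the last
        -- argument becomes the last part.
        parts[count == 1 and 1 or 5] = args[count]
    end
    local formatted = {}
    for i, length in ipairs(PART_LENGTHS) do
        formatted[i] = format_part(parts[i], length)
    end
    return table.concat(formatted, '-')
end

print(uuid_sketch('ab', 1))
```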

Module luatest.justrun

Simple Tarantool runner and output catcher.

Sometimes it is necessary to run tarantool with particular arguments and verify its output. luatest.server provides a supervisor-like interface: an instance is started, calls box.cfg(), and we can communicate with it using net.box. Another helper in tarantool/tarantool, test.interactive_tarantool, aims to solve all the problems around the readline console and also provides the ability to communicate with the instance interactively.

However, there is nothing like ‘just run tarantool with given args and give me its output’.

Functions

tarantool (dir, env, args[, opts])

Run tarantool in given directory with given environment and command line arguments and catch its output.

Expects JSON lines as the output and parses it into an array (it can be disabled using nojson option).

Options:

nojson (boolean, default: false)

Don’t attempt to decode stdout as a stream of JSON lines, return as is.

stderr (boolean, default: false)

Collect stderr and place it into the stderr field of the return value.

quote_args (boolean, default: false)

Quote CLI arguments before concatenating them into a shell command.

setsearchroot (boolean, default: true)

Set package.searchroot to be the same as in the Tarantool that starts the server.

Parameters:

dir: (string) Directory where the process will run.
env: (table) Environment variables for the process.
args: (table) Options that will be passed when the process starts.
opts: (table) Custom options: nojson, stderr and quote_args. (optional)

Returns:

(table)
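
A minimal usage sketch, assuming the current directory is acceptable as the working directory and that `--version` is a convenient argument whose output is not JSON (hence `nojson = true`). The reference above only says that a table is returned and that stderr goes into its stderr field, so the example checks nothing beyond the result's type.

```lua
local t = require('luatest')
local justrun = require('luatest.justrun')

-- Hypothetical group name for this sketch.
local g = t.group('justrun_example')

g.test_version_output = function()
    -- Empty environment table; output is returned as is, not parsed as JSON.
    local res = justrun.tarantool('.', {}, {'--version'}, {
        nojson = true,
        stderr = true,
    })
    t.assert_type(res, 'table')
end
```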

Module luatest.hooks

Provide extra methods for hooks.

Preloaded hooks extend base hooks. They behave like the pytest fixture with the autouse parameter.

Usage:

local hooks = require('luatest.hooks')

hooks.before_suite_preloaded(...)
hooks.after_suite_preloaded(...)

hooks.before_all_preloaded(...)
hooks.after_all_preloaded(...)

hooks.before_each_preloaded(...)
hooks.after_each_preloaded(...)

Functions

export.after_all_preloaded (fn)

Parameters:

fn: (func) The function where you will be cleaning up for the test.

export.after_each_preloaded (fn)

Parameters:

fn: (func) The function where you will be cleaning up for the test.

export.after_suite_preloaded (fn)

Parameters:

fn: (func) The function where you will be cleaning up for the test.

export.before_all_preloaded (fn)

Parameters:

fn: (func) The function where you will be preparing for the test.

export.before_each_preloaded (fn)

Parameters:

fn: (func) The function where you will be preparing for the test.

export.before_suite_preloaded (fn)

Register preloaded before hook in the suite scope. It will be done before the classic before_suite() hook in the tests.

Parameters:

fn: (func) The function where you will be preparing for the test.
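
As a hedged illustration, preloaded hooks could be registered from a shared helper module that is loaded before the test files. Where exactly that module is required from is an assumption of this sketch; `log` is Tarantool's built-in logging module.

```lua
local hooks = require('luatest.hooks')
local log = require('log')

-- Runs before the regular before_suite() hooks declared in tests.
hooks.before_suite_preloaded(function()
    log.info('preloaded hook: the suite is about to start')
end)

-- Registered once and applied to every test, with no per-group wiring.
hooks.after_each_preloaded(function()
    log.info('preloaded hook: a test has finished')
end)
```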

Module luatest.treegen

Working tree generator.

Generates a tree of Lua files using provided templates and filenames.

Usage:

local t = require('luatest')
local treegen = require('luatest.treegen')

local g = t.group()

g.test_foo = function(g)
    treegen.add_template('^.*$', 'test_script')
    local dir = treegen.prepare_directory({'foo/bar.lua', 'main.lua'})
    ...
end

Functions

add_template (pattern, template)

Save the template with the given pattern.

Parameters:

pattern: (string) File name template
template: (string) A content template for creating a file.

prepare_directory (contents[, replacements])

Create a temporary directory with given contents.

The contents are generated using templates added by treegen.add_template().

Parameters:

contents: (tab) List of bodies of the content to write.
replacements: (tab) List of replacement templates. (optional)

Returns:

string

Usage:

Example for {'foo/bar.lua', 'baz.lua'}:

/
+ tmp/
  + rfbWOJ/
    + foo/
    | + bar.lua
    + baz.lua

The return value is '/tmp/rfbWOJ' for this example.

remove_template (pattern)

Remove the template by pattern.

Parameters:

pattern: (string) File name template

write_file (directory, filename, content)

Write provided content into the given directory.

Parameters:

directory: (string) Directory where the content will be created.
filename: (string) File to write (possible nested path: /foo/bar/main.lua).
content: (string) The body to write.

Returns:

string
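
The functions above can be combined roughly as follows. The template body, file names, and group name are invented for the example, and the sketch assumes that write_file returns a string (as documented) without relying on its exact value. `fio` is Tarantool's built-in file I/O module.

```lua
local t = require('luatest')
local fio = require('fio')
local treegen = require('luatest.treegen')

-- Hypothetical group name for this sketch.
local g = t.group('treegen_example')

g.test_generated_tree = function()
    -- Every *.lua file gets the same trivial body.
    treegen.add_template('^.*%.lua$', 'return 42')
    local dir = treegen.prepare_directory({'foo/bar.lua', 'main.lua'})
    t.assert(fio.path.exists(fio.pathjoin(dir, 'foo', 'bar.lua')))
    t.assert(fio.path.exists(fio.pathjoin(dir, 'main.lua')))

    -- Write one more file directly, bypassing the templates.
    local written = treegen.write_file(dir, 'extra/notes.txt', 'hello')
    t.assert_type(written, 'string')
    t.assert(fio.path.exists(fio.pathjoin(dir, 'extra', 'notes.txt')))

    -- Drop the template so that it does not leak into other tests.
    treegen.remove_template('^.*%.lua$')
end
```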

Module luatest.cluster

Tarantool 3.0+ cluster management utils.

The helper is used to automatically collect a set of instances from the provided configuration and automatically set up servers per each configured instance.

Usage:

local cluster = Cluster:new(config)
cluster:start()
cluster['instance-001']:exec(<...>)
cluster:each(function(server)
    server:exec(<...>)
end)

After setting up a cluster object the following methods could
be used to interact with it:

* :start() Startup the cluster.
* :start_instance() Startup a specific instance.
* :stop() Stop the cluster.
* :each() Execute a function on each instance.
* :size() Get the number of instances.
* :drop() Drop the cluster.
* :sync() Sync the configuration and collect a new set of
  instances
* :reload() Reload the configuration.
* :config() Return the last applied configuration.
* :modify_config() Initialize a configuration builder based on
  the current config and store it inside the cluster object.
* :apply_config_changes() Apply the configuration built via
  :modify_config() by passing it to :sync().

The module can also be used for testing failure startup
cases:

Cluster:startup_error(config, error_message)

Functions

Cluster:apply_config_changes ([opts])

Apply configuration changes built via :modify_config().

Uses the internal configuration builder created by :modify_config(), converts it to a config table and calls :sync() with it. After the call the stored builder is cleared.

Parameters:

opts: Options.
- start_stop: (bool) Start/stop added/removed servers (default: false). (optional)
- wait_until_ready: (bool) Wait until servers are ready (default: true; used only if start_stop is set). (optional)
- wait_until_running: (bool) Wait until servers are running (default: wait_until_ready; used only if start_stop is set). (optional)

Cluster:config ()

Return the last applied configuration.

Cluster:drop ()

Drop the cluster’s servers.

Cluster:each (f)

Execute a function for each server in the cluster.

Parameters:

f: (func) Function to execute with a server as the first param.

Cluster:modify_config ()

Initialize a configuration builder based on the current config.

The returned builder is stored inside the cluster object and later consumed by :apply_config_changes(), which turns it into a config table and passes it to :sync().

Cluster:new (config[, server_opts[, opts]])

Create a new Tarantool cluster.

Parameters:

config: (tab) Cluster configuration.
server_opts: (tab) Extra options passed to server:new(). (optional)
opts: Cluster options.
- dir: (string) Specific directory for the cluster. (optional)
- auto_cleanup: (bool) Register the cluster in a test group and automatically drop it using hooks (default: true). (optional)

Returns:

table

Cluster:reload ([config])

Reload configuration on all the instances.

Parameters:

config: (tab) New config. (optional)

Cluster:size ()

Get cluster size.

Returns:

number.

Cluster:start ([opts])

Start all the instances.

Parameters:

opts: Cluster startup options.
- wait_until_ready: (bool) Wait until servers are ready (default: true). (optional)
- wait_until_running: (bool) Wait until servers are running (default: wait_until_ready). (optional)

Cluster:start_instance (instance_name)

Start the given instance.

Parameters:

instance_name: (string) Instance name.

Cluster:startup_error (config, exp_err)

Ensure cluster startup error. Starts all instances of a cluster from the given config and ensures that all the instances fail to start and report the given error message.

Parameters:

config: (tab) Cluster configuration.
exp_err: (string) Expected error message.

Cluster:stop ()

Stop the whole cluster.

Cluster:sync (config[, opts])

Sync the cluster object with the new config.

It performs the following actions.

Write the new config into the config file.
Update the internal list of instances.
Optionally starts instances added to the config and stops instances removed from the config.

Parameters:

config: (tab) New config.
opts: Options.
- start_stop: (bool) Start/stop added/removed servers (default: false). (optional)
- wait_until_ready: (bool) Wait until servers are ready (default: true; used only if start_stop is set). (optional)
- wait_until_running: (bool) Wait until servers are running (default: wait_until_ready; used only if start_stop is set). (optional)
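
The configuration workflow described in this module might look as follows in a test. This is a sketch under assumptions: the instance names and group name are invented, `replication.timeout` is used as an example option path, and the cluster is created inside a test so that the default auto_cleanup can register it in the current group.

```lua
local t = require('luatest')
local cluster = require('luatest.cluster')
local cbuilder = require('luatest.cbuilder')

-- Hypothetical group name for this sketch.
local g = t.group('cluster_config_example')

g.test_change_option = function()
    local config = cbuilder:new()
        :add_instance('instance-001', {database = {mode = 'rw'}})
        :add_instance('instance-002', {})
        :config()

    local c = cluster:new(config)
    c:start()
    t.assert_equals(c:size(), 2)

    -- Build a modified config on top of the current one...
    c:modify_config():set_global_option('replication.timeout', 0.5)
    -- ...write it out through :sync()...
    c:apply_config_changes()
    -- ...and ask the running instances to reload it.
    c:reload()

    t.assert_equals(c:config().replication.timeout, 0.5)
end
```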

Class luatest.group

Tests group.

To add a new test, assign a function to a key starting with test.

Group hooks always run when the test group changes, so they may run multiple times when the --shuffle option is used.

Instance methods

Group.mt.after_all (fn)

Add callback to run once after all tests in the group.

Parameters:

fn:

Group.mt.after_each (fn)

Add callback to run after each test in the group.

Parameters:

fn:

Group.mt.before_all (fn)

Add callback to run once before all tests in the group.

Parameters:

fn:

Group.mt.before_each (fn)

Add callback to run before each test in the group.

Parameters:

fn:

Group.mt:initialize ([name])

Parameters:

name: (string) Default name is inferred from caller filename when possible. For test/a/b/c_d_test.lua it will be a.b.c_d. (optional)

Returns:

Group instance
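
The hook methods above can be illustrated by recording the order of calls. The sketch assumes a single test in the group that is run once (no repeat or shuffle options); the group name and the `events` table are made up for the example.

```lua
local t = require('luatest')

-- Hypothetical group name for this sketch.
local g = t.group('hooks_order_example')

-- Collects hook names in the order they are called.
local events = {}

g.before_all(function() table.insert(events, 'before_all') end)
g.before_each(function() table.insert(events, 'before_each') end)
g.after_each(function() table.insert(events, 'after_each') end)
g.after_all(function() table.insert(events, 'after_all') end)

g.test_hooks_ran_before_test = function()
    -- Only the "before" hooks have run by the time the test body starts.
    t.assert_equals(events, {'before_all', 'before_each'})
end
```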

Class luatest.http_response

Class to provide helper methods for HTTP responses

Instance getter methods

HTTPResponse.getters

For backward compatibility, these methods should be accessed as the object’s fields (e.g., response.json.id).

They are not assigned to object’s fields on initialization to be evaluated lazily and to be able to throw errors.

HTTPResponse.getters:json ()

Parse json from body.

Usage:

response.json.id

HTTPResponse.mt:is_successful ()

Check that status code is 2xx.
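
A sketch of how the response helpers might be used together. The executable path, the port, and the `/status` endpoint are placeholders for an application that serves JSON over HTTP; none of them come from luatest itself.

```lua
local t = require('luatest')

-- Hypothetical group name for this sketch.
local g = t.group('http_response_example')

g.before_all(function()
    g.server = t.Server:new({
        -- Placeholder: an application that honours TARANTOOL_HTTP_PORT.
        command = '/path/to/executable.lua',
        http_port = 8080,
    })
    g.server:start()
    -- Wait until the (assumed) endpoint starts responding.
    t.helpers.retrying({}, function()
        g.server:http_request('get', '/status')
    end)
end)

g.after_all(function()
    g.server:stop()
end)

g.test_status_endpoint = function()
    -- raise = false returns the response even for non-2xx statuses.
    local response = g.server:http_request('get', '/status', {raise = false})
    t.assert(response:is_successful())
    -- The body is parsed lazily, on the first access to the json field.
    t.assert_type(response.json, 'table')
end
```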

Class luatest.runner

Class to run test suite.

Functions

Runner.is_test_name (s)

Check that string matches the name of a test method. The default rule is that it starts with ‘test’.

Parameters:

Runner.run ([args=_G.args[, options]])

Main entrypoint to run test suite.

Parameters:

args: (tab) List of CLI arguments (default: _G.args)
options:
- verbosity: (int) (optional)
- fail_fast: (bool) (optional)
- output_file_name: (string) Filename for JUnit report (optional)
- exe_repeat: (int) Times to repeat each test (optional)
- exe_repeat_group: (int) Times to repeat each group of tests (optional)
- tests_pattern: (tab) Patterns to filter tests (optional)
- tests_names: (tab) List of test names or groups to run (optional)
- paths: (tab) List of directories to load tests from. (optional)
- load_tests: (func) Function to load tests. Called once for every item in paths. (optional)
- shuffle: (string) Shuffle method (none, all, group) (optional)
- seed: (int) Random seed for shuffle (optional)
- output: (string) Output formatter (text, tap, junit, nil) (optional)

Runner.split_test_method_name (someName)

Split some.group.name.method into some.group.name and method. Returns nil, input if the input value does not have a dot.

Parameters:

someName:
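
The splitting rule can be sketched in plain Lua with a single pattern match. This is a hypothetical re-implementation based only on the description above; `split_sketch` is not part of luatest.

```lua
-- Hypothetical stand-in for Runner.split_test_method_name.
local function split_sketch(name)
    -- Greedy match: everything up to the last dot is the group name.
    local group, method = name:match('^(.*)%.([^.]+)$')
    if group == nil then
        -- No dot in the input: there is no group part.
        return nil, name
    end
    return group, method
end

print(split_sketch('some.group.name.method'))
```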

Runner:expand_group (group)

Extract all test methods from group.

Parameters:

group:

Class luatest.server

Class to manage Tarantool instances.

Functions

Server.build_listen_uri (server_alias[, extra_path])

Build a listen URI based on the given server alias and extra path. The resulting URI: <Server.vardir>/[<extra_path>/]<server_alias>.sock. Provide a unique alias or extra path to avoid collisions with other sockets. For now, only UNIX sockets are supported.

Parameters:

server_alias: (string) Server alias.
extra_path: (string) Extra path relative to the Server.vardir directory. (optional)

Returns:

string

Server:assert_follows_upstream (server_id)

Assert that the server follows the source node with the given ID. Meaning that it replicates from the remote node normally, and has already joined and subscribed.

Parameters:

server_id: (number) Server ID.

Server:call (fn_name[, args[, options]])

Call remote function on the server by name.

This is a shortcut for server.net_box:call().

Parameters:

fn_name: (string)
args: (tab) (optional)
options: (tab) (optional)

Server:connect_net_box ()

Establish net.box connection. It’s available in net_box field.

Server:copy_datadir ()

Copy contents of the data directory into the server’s working directory. Invoked on the server’s start.

Server:eval (code[, args[, options]])

Evaluate Lua code on the server.

This is a shortcut for server.net_box:eval().

Parameters:

code: (string)
args: (tab) (optional)
options: (tab) (optional)

Server:exec (fn[, args[, options]])

Run given function on the server.

Much like Server:eval, but takes a function instead of a string. The executed function must have no upvalues (closures), though it may use global functions and modules (like box, os, etc.).

Parameters:

fn: (function)
args: (tab) (optional)
options: (tab) (optional)

Usage:

local vclock = server:exec(function()
    return box.info.vclock
end)

local sum = server:exec(function(a, b)
    return a + b
end, {1, 2})
-- sum == 3

local t = require('luatest')
server:exec(function()
    -- luatest is available via `t` upvalue
    t.assert_equals(math.pi, 3)
end)
-- mytest.lua:12: expected: 3, actual: 3.1415926535898

Server:get_box_cfg ()

A simple wrapper around the Server:exec() method to get the box.cfg value from the server.

Returns:

table

Server:get_downstream_vclock (server_id)

Get vclock acknowledged by another node to the current server.

Parameters:

server_id: (number) Server ID.

Returns:

table

Server:get_election_term ()

Get the election term as seen by the server.

Returns:

number

Server:get_instance_id ()

Get ID of the server instance.

Returns:

number

Server:get_instance_uuid ()

Get UUID of the server instance.

Returns:

string

Server:get_synchro_queue_term ()

Get the synchro term as seen by the server.

Returns:

number

Server:get_vclock ()

Get the server’s own vclock, including the local component.

Returns:

table

Server:grep_log (pattern[, bytes_num[, opts]])

Search a string pattern in the server’s log file.

Parameters:

pattern: (string) String pattern to search in the server’s log file.
bytes_num: (number) Number of bytes to read from the server’s log file. (optional)
opts:
- reset: (bool) Reset the result when the Tarantool %d+.%d+.%d+-.*%d+-g.* pattern is found, which means that the server was restarted. Defaults to true. (optional)
- filename: (string) Path to the server’s log file. Defaults to <workdir>/<alias>.log. (optional)

Returns:

string|nil
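
A hedged usage sketch: write a recognizable line to the server's log and look for it. The marker text and group name are invented, the server is assumed to run with the default instance script and log to the default file, and the lookup is wrapped in retrying in case the line is not flushed immediately.

```lua
local t = require('luatest')
local Server = require('luatest.server')

-- Hypothetical group name for this sketch.
local g = t.group('grep_log_example')

g.before_all(function()
    g.server = Server:new({alias = 'example'})
    g.server:start()
end)

g.after_all(function()
    g.server:stop()
end)

g.test_marker_is_logged = function()
    g.server:exec(function()
        require('log').info('example-marker')
    end)
    t.helpers.retrying({timeout = 5}, function()
        -- '-' is a magic character in Lua patterns, so it is escaped.
        t.assert(g.server:grep_log('example%-marker'))
    end)
end
```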

Server:http_request (method, path[, options])

Perform HTTP request.

Parameters:

method: (string)
path: (string)
options:
- body: (string) request body (optional)
- json: data to encode as JSON into request body (optional)
- http: (tab) other options for HTTP-client (optional)
- raise: (bool) raise error when status is not in 200..299. Defaults to true. (optional)

Returns:

response object from HTTP client with helper methods.

Raises:

HTTPRequest error when response status is outside the 200..299 range (unless raise is set to false).

See also:

luatest.http_response

Server:make_socketdir ()

Make directory for the server’s Unix socket. Invoked on the server’s start.

Server:make_workdir ()

Make the server’s working directory. Invoked on the server’s start.

Server:new ([object[, extra]])

Build a server object.

Parameters:

object: Table with the entries listed below. (optional)
- command: (string) Executable path to run a server process with. Defaults to the internal server_instance.lua script. If a custom path is provided, it should correctly process all env variables listed below to make constructor parameters work. (optional)
- args: (tab) Arbitrary args to run object.command with. (optional)
- env: (tab) Pass the given env variables into the server process. (optional)
- chdir: (string) Change to the given directory before running the server process. (optional)
- alias: (string) Alias for the new server and the value of the TARANTOOL_ALIAS env variable which is passed into the server process. Defaults to ‘server’. (optional)
- workdir: (string) Working directory for the new server and the value of the TARANTOOL_WORKDIR env variable which is passed into the server process. The directory path will be created on the server start. Defaults to <vardir>/<alias>-<random id>. (optional)
- datadir: (string) Directory path whose contents will be recursively copied into object.workdir on the server start. (optional)
- http_port: (number) Port for HTTP connection to the new server and the value of the TARANTOOL_HTTP_PORT env variable which is passed into the server process. Not supported in the default server_instance.lua script. (optional)
- net_box_port: (number) Port for the net.box connection to the new server and the value of the TARANTOOL_LISTEN env variable which is passed into the server process. (optional)
- net_box_uri: (string) URI for the net.box connection to the new server and the value of the TARANTOOL_LISTEN env variable which is passed into the server process. If it is a Unix socket, the corresponding socket directory path will be created on the server start. (optional)
- net_box_credentials: (tab) Override the default credentials for the net.box connection to the new server and the value of the TARANTOOL_CREDENTIALS env variable which is passed into the server process. (optional)
- box_cfg: (tab) Extra options for box.cfg() and the value of the TARANTOOL_BOX_CFG env variable which is passed into the server process. (optional)
- config_file: (string) Declarative YAML configuration for a server instance. Used to deduce advertise URI to connect net.box to the instance. The special value ‘’ means running without the --config <...> CLI option (but still passes --name <alias>). (optional)
- remote_config: (tab) If config_file is not passed, this config value is used to deduce advertise URI to connect net.box to the instance. (optional)
- setsearchroot: (tab) Set package.searchroot to be the same as in the Tarantool that starts the server. (optional)
extra: (tab) Table with extra properties for the server object. (optional)

Returns:

table

Server:play_wal_until_synchro_queue_is_busy ()

Play WAL until the synchro queue becomes busy. WAL records go one by one. The function is needed, because during box.ctl.promote() it is not known for sure which WAL record is PROMOTE - first, second, third? Even if known, it might change in the future. WAL delay should already be started before the function is called.

Server:restart ([params[, opts]])

Restart the server with the given parameters. Optionally waits until the server is ready.

Parameters:

params: (tab) Parameters to restart the server with, like command, args, env, etc. (optional)
opts:
- wait_until_ready: (bool) Wait until the server is ready. Defaults to true unless a custom executable path was provided while building the server object. (optional)

See also:

luatest.server.Server:new

Server:start ([opts])

Start a server. Optionally waits until the server is ready.

Parameters:

opts:
- wait_until_ready: (bool) Wait until the server is ready. Defaults to true unless a custom executable was provided while building the server object. (optional)

Server:stop ()

Stop the server. Waits until the server process is terminated.

Server:update_box_cfg (cfg)

A simple wrapper around the Server:exec() method to update the box.cfg value on the server.

Parameters:

cfg: (tab) Box configuration settings.
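
The two box.cfg wrappers can be used as a pair. The sketch below assumes a server started with the default instance script and picks `replication_timeout` only as an example of an option that can be changed at runtime; the alias and group name are invented.

```lua
local t = require('luatest')
local Server = require('luatest.server')

-- Hypothetical group name for this sketch.
local g = t.group('box_cfg_example')

g.before_all(function()
    g.server = Server:new({alias = 'example'})
    g.server:start()
end)

g.after_all(function()
    g.server:stop()
end)

g.test_update_box_cfg = function()
    -- Apply the new setting on the remote instance...
    g.server:update_box_cfg({replication_timeout = 0.5})
    -- ...and read the effective configuration back.
    t.assert_equals(g.server:get_box_cfg().replication_timeout, 0.5)
end
```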

Server:wait_for_downstream_to (server)

Wait for the given server to reach at least the same vclock as the local server. Not including the local component, of course.

Parameters:

server: (tab) Server’s object.

Server:wait_for_election_leader ()

Wait for the server to become a writable election leader.

Server:wait_for_election_state (state)

Wait for the server to enter the given election state. Note that if it becomes a leader, it does not mean it is already writable.

Parameters:

state: (string) Election state to wait for.

Server:wait_for_election_term (term)

Wait for the server to reach at least the given election term.

Parameters:

term: (number) Election term to wait for.

Server:wait_for_synchro_queue_term (term)

Wait for the server to reach at least the given synchro term.

Parameters:

term: (number) Synchro queue term to wait for.

Server:wait_for_vclock (vclock)

Wait until the server’s own vclock reaches at least the given value. Including the local component.

Parameters:

vclock: (tab) Server’s own vclock to reach.

Server:wait_for_vclock_of (other_server)

Wait for the server to reach at least the same vclock as the other server. Not including the local component, of course.

Parameters:

other_server: (tab) Other server’s object.
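
A single-server sketch of the vclock helpers, assuming the default instance script and that creating a space advances the server's vclock. The space name, alias, and group name are made up. With a replica set, the analogous call would be replica:wait_for_vclock_of(master) after writing on the master.

```lua
local t = require('luatest')
local Server = require('luatest.server')

-- Hypothetical group name for this sketch.
local g = t.group('vclock_example')

g.before_all(function()
    g.server = Server:new({alias = 'example'})
    g.server:start()
end)

g.after_all(function()
    g.server:stop()
end)

g.test_vclock_advances = function()
    local before = g.server:get_vclock()
    g.server:exec(function()
        -- A DDL operation is written to the WAL and bumps the LSN.
        box.schema.space.create('example_space')
    end)
    local after = g.server:get_vclock()
    t.assert_not_equals(after, before)
    -- Returns right away here, since the vclock is already reached.
    g.server:wait_for_vclock(after)
end
```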

Server:wait_until_election_leader_found ()

Wait for the server to discover an election leader.

Server:wait_until_ready ()

Wait until the server is ready after the start. A server is considered ready when its _G.ready variable becomes true.

Class luatest.replica_set

Class to manage groups of Tarantool instances with the same data set.

Functions

ReplicaSet:add_server (server)

Add the server object to the replica set. The added server object should be built via the ReplicaSet:build_server function.

Parameters:

server: (tab) Server object to be added to the replica set.

ReplicaSet:build_and_add_server ([config])

Build a server object and add it to the replica set.

Parameters:

config: (tab) Configuration for the new server. (optional)

Returns:

table

See also:

luatest.server.Server:new

ReplicaSet:build_server ([config])

Build a server object for the replica set.

Parameters:

config: (tab) Configuration for the new server. (optional)

Returns:

table

See also:

luatest.server.Server:new

ReplicaSet:delete_server (alias)

Delete the server object from the replica set by the given server alias.

Parameters:

alias: (string) Server alias.

ReplicaSet:drop ()

Stop all servers in the replica set. This function should be used only at the end of the test (after_test, after_each, after_all hooks) to terminate all server processes in the replica set.

ReplicaSet:get_leader ()

Get a server which is a writable node in the replica set.

Returns:

table

ReplicaSet:get_server (alias)

Get the server object from the replica set by the given server alias.

Parameters:

alias: (string) Server alias.

Returns:

table|nil

ReplicaSet:new ([object])

Build a replica set object.

Parameters:

object: Table with the entries listed below. (optional)
- servers: (tab) List of server configurations to build server objects from and add them to the new replica set. See an example below. (optional)

Returns:

table

See also:

luatest.server.Server:new

Usage:

local ReplicaSet = require('luatest.replica_set')
local Server = require('luatest.server')
local box_cfg = {
    replication_timeout = 0.1,
    replication_connect_timeout = 10,
    replication_sync_lag = 0.01,
    replication_connect_quorum = 3,
    replication = {
        Server.build_listen_uri('replica1'),
        Server.build_listen_uri('replica2'),
        Server.build_listen_uri('replica3'),
    },
}
local replica_set = ReplicaSet:new({
    servers = {
        {alias = 'replica1', box_cfg = box_cfg},
        {alias = 'replica2', box_cfg = box_cfg},
        {alias = 'replica3', box_cfg = box_cfg},
    }
})
replica_set:start()
replica_set:wait_for_fullmesh()

ReplicaSet:start ([opts])

Start all servers in the replica set. Optionally waits until all servers are ready.

Parameters:

opts: Table with the entries listed below. (optional)
- wait_until_ready: (bool) Wait until all servers are ready. Defaults to true. (optional)

ReplicaSet:stop ()

Stop all servers in the replica set.

ReplicaSet:wait_for_fullmesh ([opts])

Wait until every node is connected to every other node in the replica set.

Parameters:

opts: Table with the entries listed below. (optional)
- timeout: (number) Timeout in seconds to wait for full mesh. Defaults to 60. (optional)
- delay: (number) Delay in seconds between attempts to check full mesh. Defaults to 0.1. (optional)
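
Besides passing a servers list to ReplicaSet:new, a replica set can be assembled step by step. The sketch below uses a single server with an invented alias and assumes that an instance started with the default settings is writable, so that get_leader finds it.

```lua
local t = require('luatest')
local ReplicaSet = require('luatest.replica_set')

-- Hypothetical group name for this sketch.
local g = t.group('replica_set_example')

g.before_all(function()
    g.replica_set = ReplicaSet:new({})
    -- Builds the server object and registers it in one step.
    g.master = g.replica_set:build_and_add_server({alias = 'master'})
    g.replica_set:start()
end)

g.after_all(function()
    -- Terminates every server process in the replica set.
    g.replica_set:drop()
end)

g.test_lookup = function()
    t.assert_not_equals(g.replica_set:get_server('master'), nil)
    t.assert_equals(g.replica_set:get_server('missing'), nil)
    t.assert_not_equals(g.replica_set:get_leader(), nil)
end
```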

Class luatest.cbuilder

Configuration builder.

It allows constructing a declarative configuration for a test case with less boilerplate code/options, especially when a replicaset is to be tested rather than a single instance. All the methods support chaining (they return the builder object).

Usage:

local config = Builder:new()
    :add_instance('instance-001', {
        database = {
            mode = 'rw',
        },
    })
    :add_instance('instance-002', {})
    :add_instance('instance-003', {})
    :config()

By default, all instances are added to replicaset-001 in group-001,
but it's possible to select a different replicaset and/or group:

local config = Builder:new()
    :use_group('group-001')
    :use_replicaset('replicaset-001')
    :add_instance(<...>)
    :add_instance(<...>)
    :add_instance(<...>)

    :use_group('group-002')
    :use_replicaset('replicaset-002')
    :add_instance(<...>)
    :add_instance(<...>)
    :add_instance(<...>)

    :config()

The default credentials and iproto options are added to
setup replication and to allow a test to connect to the
instances.

There are a few other methods:

* :set_replicaset_option('foo.bar', value)
* :set_instance_option('instance-001', 'foo.bar', value)

Functions

Builder:add_instance (instance_name, iconfig)

Add an instance with the given options to the selected replicaset.

Parameters:

instance_name: (string) Instance where the config will be saved.
iconfig: (tab) Declarative config for the instance.

Builder:config ()

Return the resulting configuration.

Builder:new ([config])

Build a config builder object.

Parameters:

config: (tab) Table with declarative configuration. (optional)

Builder:set_global_option (path, value)

Set option to the cluster config.

Parameters:

path: (string) Option path.
value: Option value (int, string, table).

Builder:set_group_option (path, value)

Set an option for the selected group.

Parameters:

path: (string) Option path.
value: Option value (int, string, table).

Builder:set_replicaset_option (path, value)

Set an option for the selected replicaset.

Parameters:

path: (string) Option path.
value: Option value (int, string, table).

Builder:use_group (group_name)

Select a group for following calls.

Parameters:

group_name: (string) Group of replicas.

Builder:use_replicaset (replicaset_name)

Select a replicaset for following calls.

Parameters:

replicaset_name: (string) Replica set name.

Module vshard

The vshard module introduces an advanced sharding feature based on the concept of virtual buckets and enables horizontal scaling in Tarantool.

To learn how sharding works in Tarantool, refer to the Sharding page.

You can also check out the Quick start guide – or dive into the vshard reference:

Configuration reference

Note

Starting with the 3.0 version, the recommended way of configuring Tarantool is using a configuration file. Configuring Tarantool in code is considered a legacy approach.

Basic parameters

sharding
weights
shard_index
bucket_count
collect_bucket_garbage_interval
collect_lua_garbage
sync_timeout
rebalancer_disbalance_threshold
rebalancer_max_receiving
rebalancer_max_sending
discovery_mode
sched_move_quota
sched_ref_quota

sharding¶: A field defining the logical topology of the sharded Tarantool cluster.

Type: table

Default: false

Dynamic: yes

weights¶: A field defining the configuration of relative weights for each zone pair in a replica set.

Type: table

Default: false

Dynamic: yes

shard_index¶: Name or id of a TREE index over the bucket id. Spaces without this index do not participate in a sharded Tarantool cluster and can be used as regular spaces if needed. It is necessary to specify the first part of the index; other parts are optional.

Type: non-empty string or non-negative integer

Default: “bucket_id”

Dynamic: no

bucket_count¶

The total number of buckets in a cluster.

This number should be several orders of magnitude larger than the potential number of cluster nodes, considering potential scaling out in the foreseeable future.

Example:

If the estimated number of nodes is M, then the data set should be divided into 100 * M or even 1000 * M buckets, depending on the planned scaling out. This number is certainly greater than the potential number of cluster nodes in the system being designed.

Type: number
Default: 3000
Dynamic: no
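
The sizing guideline above can be sketched as a small calculation. The helper name suggest_bucket_count and its factor argument are hypothetical and are not part of vshard; the sketch only multiplies the estimated node count by a factor in the 100 to 1000 range.

```lua
-- Hypothetical helper, not part of vshard: applies the "100 * M to 1000 * M"
-- guideline to an estimated number of cluster nodes.
local function suggest_bucket_count(estimated_nodes, factor)
    factor = factor or 100
    assert(estimated_nodes > 0, 'estimated_nodes must be positive')
    assert(factor >= 100 and factor <= 1000,
           'factor is expected to be within 100..1000')
    return estimated_nodes * factor
end

local low = suggest_bucket_count(30)         -- lower bound of the guideline
local high = suggest_bucket_count(30, 1000)  -- upper bound of the guideline
print(low, high)
```

Because bucket_count is not dynamic, it is probably worth choosing the factor with future growth in mind rather than the current cluster size.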

collect_bucket_garbage_interval¶

Deprecated since: 0.1.17.

The interval between garbage collector actions, in seconds.

Type: number
Default: 0.5
Dynamic: yes

collect_lua_garbage¶

Deprecated since: 0.1.20.

If set to true, the Lua collectgarbage() function is called periodically.

Type: boolean
Default: no
Dynamic: yes

sync_timeout¶: Timeout to wait for synchronization of the old master with replicas before demotion. Used when switching a master or when manually calling the sync() function.

Type: number

Default: 1

Dynamic: yes

rebalancer_disbalance_threshold¶

A maximum bucket disbalance threshold, in percent. The disbalance is calculated for each replica set using the following formula:

|etalon_bucket_count - real_bucket_count| / etalon_bucket_count * 100

Type: number
Default: 1
Dynamic: yes
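
The formula above can be written as a short Lua function. The helper names are illustrative only, and treating a value strictly greater than the threshold as disbalanced is an assumption made for this sketch.

```lua
-- Disbalance of one replica set, in percent, following the formula above.
local function disbalance_percent(etalon_bucket_count, real_bucket_count)
    return math.abs(etalon_bucket_count - real_bucket_count)
        / etalon_bucket_count * 100
end

-- Assumption: a replica set is considered disbalanced when its disbalance
-- exceeds the configured threshold.
local function is_disbalanced(etalon_bucket_count, real_bucket_count, threshold)
    return disbalance_percent(etalon_bucket_count, real_bucket_count) > threshold
end

print(disbalance_percent(200, 150))  -- 50 of 200 buckets differ
print(is_disbalanced(200, 199, 1))   -- 0.5 percent, below a threshold of 1
```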

rebalancer_max_receiving¶

The maximum number of buckets that can be received in parallel by a single replica set. This number must be limited, because when a new replica set is added to a cluster, the rebalancer sends a very large amount of buckets from the existing replica sets to the new replica set. This produces a heavy load on the new replica set.

Example:

Suppose rebalancer_max_receiving is equal to 100, bucket_count is equal to 1000. There are 3 replica sets with 333, 333 and 334 buckets on each respectively. When a new replica set is added, each replica set’s etalon_bucket_count becomes equal to 250. Rather than receiving all 250 buckets at once, the new replica set receives 100, 100 and 50 buckets sequentially.

Type: number
Default: 100
Dynamic: yes
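
The batching in the example above can be modeled in a few lines of plain Lua. This is a simplified illustration of the arithmetic only, not the rebalancer's actual algorithm; the function name receive_batches is hypothetical.

```lua
-- Split the number of buckets a replica set has to receive into sequential
-- batches that never exceed max_receiving.
local function receive_batches(total_to_receive, max_receiving)
    local batches = {}
    local remaining = total_to_receive
    while remaining > 0 do
        local batch = math.min(remaining, max_receiving)
        table.insert(batches, batch)
        remaining = remaining - batch
    end
    return batches
end

-- The new replica set from the example needs 250 buckets with a limit of 100.
local batches = receive_batches(250, 100)
print(table.concat(batches, ', '))
```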

rebalancer_max_sending¶

The degree of parallelism for parallel rebalancing.

Works for storages only, ignored for routers.

The maximum value is 15.

Type: number
Default: 1
Dynamic: yes

discovery_mode¶: A mode of a bucket discovery fiber: on/off/once. See details below.

Type: string

Default: ‘on’

Dynamic: yes

sched_move_quota¶

A scheduler’s bucket move quota used by the rebalancer.

sched_move_quota defines how many bucket moves can be done in a row if there are pending storage refs. Then, bucket moves are blocked and a router continues making map-reduce requests.

Replica set parameters

uuid
weight
master

uuid¶: A unique identifier of a replica set.

Type:

Default:

Dynamic:

weight¶: A weight of a replica set. See the Replica set weights section for details.

Type:

Default: 1

Dynamic:

master¶

Turns on automated master discovery in a replica set if set to auto. Applicable only to the configuration of a router; the storage configuration ignores this parameter.

The parameter should be specified per replica set. The configuration is not compatible with a manual master selection.

Examples

Correct configuration:

config = {
    sharding = {
        <replicaset uuid> = {
            master = 'auto',
            replicas = {...},
        },
        ...
    },
    ...
}

Incorrect configuration:

config = {
    sharding = {
        <replicaset uuid> = {
            master = 'auto',
            replicas = {
                <replica uuid1> = {
                    master = true,
                    ...
                },
                <replica uuid2> = {
                    master = false,
                    ...
                },
            },
        },
        ...
    },
    ...
}

If the configuration is incorrect, it is not applied, and the vshard.router.cfg() call throws an error.

If the master parameter is set to auto for some replica sets, the router goes to these replica sets, discovers the master in each of them, and periodically checks if the master instance still has its master status. When the master in the replica set stops being a master, the router goes around all the nodes of the replica set to find out which one is the new master.

Without this setting, the router cannot detect master nodes in the configured replica sets on its own. It relies only on how they are specified in the configuration. This becomes a problem when the master changes and the change is not delivered to the router’s configuration: for instance, when the router doesn’t rely on a central configuration provider, or the provider cannot deliver a new configuration for some reason.

Type: string
Default: nil
Dynamic: yes
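
The incompatibility between master = 'auto' and manual master selection can be checked before the configuration is passed to vshard.router.cfg(). The function below is a hypothetical pre-flight check, not vshard's own validation; it assumes that any explicit master flag on a replica conflicts with 'auto', which mirrors the incorrect example above.

```lua
-- Hypothetical pre-flight check; vshard performs its own validation in
-- vshard.router.cfg(), and this sketch does not reproduce it exactly.
local function check_master_auto(sharding)
    for replicaset_uuid, replicaset in pairs(sharding) do
        if replicaset.master == 'auto' then
            for replica_uuid, replica in pairs(replicaset.replicas or {}) do
                -- Assumption: any explicit master flag conflicts with 'auto'.
                if replica.master ~= nil then
                    return false, string.format(
                        'replica %s in replica set %s sets master explicitly',
                        replica_uuid, replicaset_uuid)
                end
            end
        end
    end
    return true
end

-- Placeholder identifiers are used instead of real UUIDs.
local ok = check_master_auto({
    ['rs-1'] = {master = 'auto', replicas = {['r-1'] = {}, ['r-2'] = {}}},
})
local bad, err = check_master_auto({
    ['rs-1'] = {master = 'auto', replicas = {['r-1'] = {master = true}}},
})
print(ok, bad, err)
```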

API Reference

This section describes the public and internal API for the router and the storage.

Router API

Subsection

Methods

Router public API

vshard.router.bootstrap()
vshard.router.cfg(cfg)
vshard.router.new(name, cfg)
vshard.router.call(bucket_id, mode, function_name, {argument_list}, {options})
vshard.router.callro(bucket_id, function_name, {argument_list}, {options})
vshard.router.callrw(bucket_id, function_name, {argument_list}, {options})
vshard.router.callre(bucket_id, function_name, {argument_list}, {options})
vshard.router.callbro(bucket_id, function_name, {argument_list}, {options})
vshard.router.callbre(bucket_id, function_name, {argument_list}, {options})
vshard.router.map_callrw(function_name, {argument_list}, {options})
vshard.router.route(bucket_id)
vshard.router.routeall()
vshard.router.bucket_id_strcrc32(key)
vshard.router.bucket_id_mpcrc32(key)
vshard.router.bucket_count()
vshard.router.sync(timeout)
vshard.router.discovery_wakeup()
vshard.router.discovery_set(mode)
vshard.router.info({options})
vshard.router.buckets_info()
vshard.router.enable()
vshard.router.disable()
replicaset_object:call()
replicaset_object:callro()
replicaset_object:callrw()
replicaset_object:callre()

Router internal API

vshard.router.bucket_discovery(bucket_id)

Router public API

vshard.router.bootstrap()¶

Perform the initial cluster bootstrap and distribute all buckets across the replica sets.

Parameters:

timeout – the number of seconds before a bootstrap attempt is considered unsuccessful. Recreate the cluster in case of a bootstrap timeout.
if_not_bootstrapped – defaults to `false`, which means that an error is raised when the cluster is already bootstrapped. `true` means that an already bootstrapped cluster is considered a success.

Example:

vshard.router.bootstrap({timeout = 4, if_not_bootstrapped = true})

Note

To detect whether a cluster is bootstrapped, vshard looks for at least one bucket in the whole cluster. If the cluster was bootstrapped only partially (for example, due to an error during the first bootstrap), it is still considered a bootstrapped cluster on the next bootstrap call with if_not_bootstrapped. For this reason, calling bootstrap() multiple times remains a bad practice and should be avoided.

vshard.router.cfg(cfg)¶

Configure the database and start sharding for the specified router instance.

Parameters:	cfg – a configuration table

vshard.router.new(name, cfg)¶

Create a new router instance. vshard supports multiple routers in a single Tarantool instance. Each router can be connected to any vshard cluster, and multiple routers can be connected to the same cluster.

A router created via vshard.router.new() works in the same way as a static router, but the method name is preceded by a colon (vshard.router:method_name(...)), while for a static router the method name is preceded by a period (vshard.router.method_name(...)).

A static router can be obtained via the vshard.router.static() method and then used like a router created via the vshard.router.new() method.

Note

box.cfg is shared among all the routers of a single instance.

Parameters:

name – a router instance name. This name is used as a prefix in the router’s logs and must be unique within the instance
cfg – a configuration table

Return:	a router instance, if created successfully; otherwise, nil and an error object

vshard.router.call(bucket_id, mode, function_name, {argument_list}, {options})¶

Call the function identified by function_name on the shard storing the bucket identified by bucket_id. See the Processing requests section for details on function operation.

Parameters:

bucket_id – a bucket identifier
mode – either a string = 'read'|'write', or a map with mode='read'|'write' and/or prefer_replica=true|false and/or balance=true|false.
function_name – a function to execute
argument_list – an array of the function’s arguments
options –
- timeout — a request timeout, in seconds. If the router cannot identify a shard with the specified bucket_id, it will retry until the timeout is reached.
- request_timeout (since vshard 0.1.28) — timeout in seconds that serves as a protection against hung replicas. The parameter is used in read requests only (mode=read). It is necessary to pass the request_timeout and timeout parameters together, with the following requirement: timeout > request_timeout.
  The request_timeout parameter controls how much time a single request attempt may take. When this time is over (the TimedOut error is raised), the router retries this request on the next replica as long as the timeout value is not elapsed.
- other net.box options, such as is_async, buffer, on_push are also supported.

The mode parameter has two possible forms: a string or a map. Examples of the string form are: 'read', 'write'. Examples of the map form are: {mode='read'}, {mode='write'}, {mode='read', prefer_replica=true}, {mode='read', balance=true}, {mode='read', prefer_replica=true, balance=true}.

If 'write' is specified then the target is the master.

If prefer_replica=true is specified then the preferred target is one of the replicas, but the target is the master if there is no conveniently available replica.

It may be good to specify prefer_replica=true for functions which are expensive in terms of resource use, to avoid slowing down the master.

If balance=true then there is load balancing: reads are distributed over all the nodes in the replica set in round-robin fashion, with a preference for replicas if prefer_replica=true is also set.

Return:

The original return value of the executed function, or nil and error object. The error object has a type attribute equal to ShardingError or one of the regular Tarantool errors (ClientError, OutOfMemory, SocketError, etc.).

ShardingError is returned on errors specific to sharding: the master is missing, wrong bucket id, etc. It has an attribute code containing one of the values from the vshard.error.code.* Lua table, an optional attribute containing a message with the human-readable error description, and other attributes specific to the error code.

Note

Examples:

To call customer_add function from vshard/example, say:

vshard.router.call(100,
                   'write',
                   'customer_add',
                   {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}},
                   {timeout = 5})
-- or, the same thing but with a map for the second argument
vshard.router.call(100,
                   {mode='write'},
                   'customer_add',
                   {{customer_id = 2, bucket_id = 100, name = 'name2', accounts = {}}},
                   {timeout = 5})

vshard.router.callro(bucket_id, function_name, {argument_list}, {options})¶

Call the function identified by function_name on the shard storing the bucket identified by bucket_id, in read-only mode (similar to calling vshard.router.call with mode='read'). See the Processing requests section for details on function operation.

Parameters:

bucket_id – a bucket identifier
function_name – a function to execute
argument_list – an array of the function’s arguments
options –
- timeout – a request timeout, in seconds. If the router cannot identify a shard with the specified bucket_id, it will retry until the timeout is reached.
- request_timeout (since vshard 0.1.28) — timeout in seconds that serves as a protection against hung replicas. It is necessary to pass the request_timeout and timeout parameters together, with the following requirement: timeout > request_timeout. The request_timeout parameter controls how much time a single request attempt may take. When this time is over (the TimedOut error is raised), the router retries this request on the next replica as long as the timeout value is not elapsed.
- other net.box options, such as is_async, buffer, on_push are also supported.

Return:

ShardingError is returned on errors specific to sharding: the replica set is not available, the master is missing, wrong bucket id, etc. It has an attribute code containing one of the values from the vshard.error.code.* Lua table, an optional attribute containing a message with the human-readable error description, and other attributes specific to this error code.
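
The requirement that request_timeout is passed together with timeout, and that timeout > request_timeout, can be checked before a read request is issued. The helper below is a hypothetical sketch and is not part of vshard; it only builds an options table that satisfies the rule described above.

```lua
-- Hypothetical helper: builds an options table for read calls and enforces
-- the documented rule "timeout > request_timeout".
local function read_call_options(timeout, request_timeout)
    if request_timeout ~= nil then
        if timeout == nil then
            return nil, 'request_timeout requires timeout to be set as well'
        end
        if timeout <= request_timeout then
            return nil, 'timeout must be greater than request_timeout'
        end
    end
    return {timeout = timeout, request_timeout = request_timeout}
end

local opts = read_call_options(5, 1)              -- valid combination
local bad_opts, opts_err = read_call_options(1, 5)  -- rejected combination
print(opts.timeout, opts.request_timeout, bad_opts, opts_err)
```

The resulting table is meant to be passed as the options argument, for example vshard.router.callro(bucket_id, function_name, argument_list, opts).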

vshard.router.callrw(bucket_id, function_name, {argument_list}, {options})¶

Call the function identified by function_name on the shard storing the bucket identified by bucket_id, in read-write mode (similar to calling vshard.router.call with mode='write'). See the Processing requests section for details on function operation.

Parameters:

bucket_id – a bucket identifier
function_name – a function to execute
argument_list – an array of the function’s arguments
options –
- timeout — a request timeout, in seconds. If the router cannot identify a shard with the specified bucket_id, it will retry until the timeout is reached.
- other net.box options, such as is_async, buffer, on_push are also supported.

Return:

Note

Any write requests that are intended to be executed repeatedly (for example, retried after an error) should be idempotent. The operations’ idempotency ensures that the change is applied only once. Read more: Deduplication of non-idempotent requests.
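
A retry loop around callrw() illustrates why idempotency matters: the same request may reach the storage more than once. The wrapper below is a minimal sketch, not a vshard facility. It takes the router as an argument (for example, require('vshard').router), and the attempt limit and the decision to retry on every error are assumptions; which errors are worth retrying is application-specific.

```lua
-- Minimal sketch: retry a write request a limited number of times.
-- The called function must be idempotent, because a failed attempt may
-- still have been applied on the storage.
local function callrw_with_retry(router, max_attempts, bucket_id, func_name,
                                 args, opts)
    local last_err
    for _ = 1, max_attempts do
        local res, err = router.callrw(bucket_id, func_name, args, opts)
        if err == nil then
            return res
        end
        last_err = err
    end
    return nil, last_err
end

-- Stand-in router used only to make the sketch runnable without a cluster:
-- it fails twice and then succeeds.
local calls = 0
local fake_router = {
    callrw = function()
        calls = calls + 1
        if calls < 3 then
            return nil, {type = 'ShardingError'}
        end
        return 'ok'
    end,
}

local res = callrw_with_retry(fake_router, 5, 100, 'customer_add', {},
                              {timeout = 5})
print(res, calls)
```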

vshard.router.callre(bucket_id, function_name, {argument_list}, {options})¶

Call the function identified by function_name on the shard storing the bucket identified by bucket_id, in read-only mode (similar to calling vshard.router.call with mode='read'), with preference for a replica rather than a master (similar to calling vshard.router.call with prefer_replica = true). See the Processing requests section for details on function operation.

Parameters:

bucket_id – a bucket identifier
function_name – a function to execute
argument_list – an array of the function’s arguments
options –
- timeout — a request timeout, in seconds. If the router cannot identify a shard with the specified bucket_id, it will retry until the timeout is reached.
- request_timeout (since vshard 0.1.28) — timeout in seconds that serves as a protection against hung replicas. It is necessary to pass the request_timeout and timeout parameters together, with the following requirement: timeout > request_timeout. The request_timeout parameter controls how much time a single request attempt may take. When this time is over (the TimedOut error is raised), the router retries this request on the next replica as long as the timeout value is not elapsed.
- other net.box options, such as is_async, buffer, on_push are also supported.

Return:

vshard.router.callbro(bucket_id, function_name, {argument_list}, {options})¶: This has the same effect as vshard.router.call() with mode parameter = {mode='read', balance=true}.

vshard.router.callbre(bucket_id, function_name, {argument_list}, {options})¶: This has the same effect as vshard.router.call() with mode parameter = {mode='read', balance=true, prefer_replica=true}.
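
The relationship between the mode argument of vshard.router.call() and the shortcut functions described above can be summarized in a lookup table. The code below only models the documented equivalences in plain Lua; the names SHORTCUTS, normalize_mode, and shortcut_for are illustrative and do not exist in vshard.

```lua
-- Documented equivalences between shortcut functions and call() modes.
local SHORTCUTS = {
    callrw  = {mode = 'write'},
    callro  = {mode = 'read'},
    callre  = {mode = 'read', prefer_replica = true},
    callbro = {mode = 'read', balance = true},
    callbre = {mode = 'read', balance = true, prefer_replica = true},
}

-- Convert the string form ('read', 'write') to the map form and make the
-- optional flags explicit.
local function normalize_mode(mode)
    if type(mode) == 'string' then
        mode = {mode = mode}
    end
    return {
        mode = mode.mode,
        prefer_replica = mode.prefer_replica == true,
        balance = mode.balance == true,
    }
end

-- Return the name of the shortcut that matches the given mode, or nil.
local function shortcut_for(mode)
    local wanted = normalize_mode(mode)
    for name, candidate in pairs(SHORTCUTS) do
        local c = normalize_mode(candidate)
        if c.mode == wanted.mode
            and c.prefer_replica == wanted.prefer_replica
            and c.balance == wanted.balance then
            return name
        end
    end
    return nil
end

print(shortcut_for('read'))
print(shortcut_for({mode = 'read', balance = true, prefer_replica = true}))
```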

vshard.router.map_callrw(function_name, {argument_list}, {options})¶

The function implements consistent map-reduce over the entire cluster. Consistency means:

All the data was accessible.
The data was not migrated between physical storages during the map requests execution.

The function can be helpful if you need to access:

all the data in the cluster
a vast number of buckets scattered over the instances in case their individual vshard.router.call() takes up too much time.

The function is called on the master node of each replica set with the given arguments.

Parameters:

function_name – a function to call on the storages (masters of all replica sets)
argument_list – an array of the function’s arguments
options –
- timeout – a request timeout, in seconds. The timeout is for the entire map_callrw(), including all its stages.
- return_raw – the net.box option implemented in Tarantool since version 2.10.0. If set to true, net.box returns the response data wrapped in a MessagePack object instead of decoding it to Lua. For more details, see the Return section below.

Important

Do not use a big timeout (longer than 1 minute, for instance). The router tries to block the bucket moves to another storage for the given timeout on all storages. On failure, the block remains for the entire timeout.

Return:

On success: a map with replica set UUIDs (keys) and results of the function_name (values).
```
{uuid1 = {res1}, uuid2 = {res2}, ...}
```
If the function returns nil or box.NULL from one of the storages, it will not be present in the resulting map.

If the return_raw option is used, the result is a map of the following format: {[replicaset_uuid] = msgpack.object} where msgpack.object is an object that stores a MessagePack array with the results returned from the storage map function.

The option use case is the same as in using net.box: to avoid decoding of the call results into Lua. The option can be helpful if a router is used as a proxy and results received from a storage are big.

Example:
```
local res = vshard.router.map_callrw('my_func', args, {..., return_raw = true})

for replicaset_uuid, msgpack_value in pairs(res) do
    log.info('Replicaset %s returned %s', replicaset_uuid,
             msgpack_value:decode())
end
```
This is an illustration of the option usage. Normally, you don’t need to use return_raw if you call the decode() function.
On failure: nil, error object, and optional replica set UUID where the error occurred. UUID will not be returned if the error is not related to a particular replica set. For instance, the method fails if not all buckets were found, even if all replica sets were scanned successfully. Handling the result looks like this:
```
res, err, uuid = vshard.router.map_callrw(...)
if not res then
    -- Error.
    -- 'err' - error object. 'uuid' - optional UUID of replica set
    -- where the error happened.
    ...
else
    -- Success.
    for uuid, value in pairs(res) do
        ...
    end
end
```
If the return_raw option is used, the result on failure is the same as described above.

Map-Reduce in vshard can be divided into three stages: Ref, Map, and Reduce.

Ref and Map. map_callrw() combines both the Ref and the Map stages. The Ref stage ensures data consistency while executing the user’s function (function_name) on all nodes. Keep in mind that consistency is incompatible with rebalancing (it breaks data consistency). Map-reduce and rebalancing are mutually exclusive: they compete for cluster time. Any bucket move makes the sender and receiver nodes inconsistent, so it is impossible to call a function on them to access all the data without vshard.storage.bucket_ref(). This makes the Ref stage intricate, as it should work together with the rebalancer to ensure they do not block each other.

For this, the storage has a special scheduler for bucket moves and storage refs. Storage ref is a volatile counter defined on each instance. It is incremented when a map-reduce request comes and decremented when it ends. Storage ref pins the entire instance with all its buckets, not just a single bucket (like bucket ref).

The scheduler shares storage time between bucket moves and storage refs fairly. The distribution depends on how long and frequent the moves and refs are. It can be configured using the storage options sched_move_quota and sched_ref_quota. Keep in mind that the scheduler configuration may affect map-reduce requests if used during rebalancing.

During the Map stage, map_callrw() sends map requests one by one to many servers. On success, the function returns a map. The map is a set of key-value pairs. The keys are replica set UUIDs, and the values are the results of the user’s function (function_name).

Reduce. The Reduce stage is not performed by vshard. It is what the user’s code does with the results of map_callrw().
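
A Reduce step can be as simple as folding the per-replica-set results into one value. The sketch below assumes a hypothetical storage function that returns a single number (for example, a row count), so each value in the map has the shape {number}; the function name reduce_sum is illustrative.

```lua
-- Sum the first result returned by each replica set.
-- map_result has the shape {uuid1 = {res1}, uuid2 = {res2}, ...}.
local function reduce_sum(map_result)
    local total = 0
    for _, results in pairs(map_result) do
        total = total + (results[1] or 0)
    end
    return total
end

-- Sample data in the documented shape; the keys are placeholders, not
-- real replica set UUIDs.
local total = reduce_sum({
    ['uuid-1'] = {10},
    ['uuid-2'] = {5},
})
print(total)
```

Replica sets whose function returned nil or box.NULL are absent from the map, so they simply do not contribute to the sum.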

Note

map_callrw() works only on masters. Therefore, you can’t use it if at least one replica set has its master node down.

vshard.router.route(bucket_id)¶

Return the replica set object for the bucket with the specified bucket id value.

Parameters:	bucket_id – a bucket identifier
Return:	a replica set object

Example:

replicaset = vshard.router.route(123)

vshard.router.routeall()¶

Return all available replica set objects.

Return:	a map of the following type: `{UUID = replicaset}`
Rtype:	a map of replica set objects

Example:

function selectall()
    local resultset = {}
    local shards, err = vshard.router.routeall()
    if err ~= nil then
        error(err)
    end
    for _, replicaset in pairs(shards) do
        -- Replace *space-name* with the name of an existing space.
        local set, call_err = replicaset:callro('box.space.*space-name*:select', {{}, {limit=10}}, {timeout=5})
        if set == nil then
            error(call_err)
        end
        for _, item in ipairs(set) do
            table.insert(resultset, item)
        end
    end
    -- Order the merged tuples by their first field.
    table.sort(resultset, function(a, b) return a[1] < b[1] end)
    return resultset
end

vshard.router.bucket_id(key)¶

Deprecated. Logs a warning when used because it is not consistent for cdata numbers.

In particular, it returns three different values for a normal Lua number such as 123, for unsigned long long cdata (such as 123ULL or ffi.cast('unsigned long long', 123)), and for signed long long cdata (such as 123LL or ffi.cast('long long', 123)). This matters because the same logical key can then be mapped to different buckets.

vshard.router.bucket_id(123)
vshard.router.bucket_id(123LL)
vshard.router.bucket_id(123ULL)

For float and double cdata (ffi.cast('float', number), ffi.cast('double', number)) these functions return different values even for the same numbers of the same floating point type. This is because tostring() on a floating point cdata number returns not the number but a pointer to it, and the pointer is different on each call.

If you need that behavior, use vshard.router.bucket_id_strcrc32(): it behaves exactly the same but does not log a warning.

vshard.router.bucket_id_strcrc32(key)¶

Calculate the bucket id using a simple built-in hash function.

Parameters:	key – a hash key. This can be any Lua object (number, table, string).
Return:	a bucket identifier
Rtype:	number

Example:

tarantool> vshard.router.bucket_count()
---
- 3000
...

tarantool> vshard.router.bucket_id_strcrc32("18374927634039")
---
- 2032
...

tarantool> vshard.router.bucket_id_strcrc32(18374927634039)
---
- 2032
...

tarantool> vshard.router.bucket_id_strcrc32("test")
---
- 1216
...

tarantool> vshard.router.bucket_id_strcrc32("other")
---
- 2284
...

Note

Remember that this function is not safe. See details in bucket_id().

vshard.router.bucket_id_mpcrc32(key)¶

This function is safer than bucket_id_strcrc32. It takes a CRC32 from a MessagePack encoded value. That is, bucket id of integers does not depend on their Lua type. In case of a string key, it does not encode it into MessagePack, but takes a hash right from the string.

Parameters:	key – a hash key. This can be any Lua object (number, table, string).
Return:	a bucket identifier
Rtype:	number

However, it may still return different values for different floating point types. That is, ffi.cast('float', number) may be mapped to a bucket id that is not equal to the one for ffi.cast('double', number). This can’t be fixed, because a float value, even when cast to double, may have a garbage tail in its fraction.

Usually, floating point keys should not be used to calculate a bucket id.

Be very careful if you store floating point types in a space. When data is returned from a space, it is cast to a Lua number. If that value has an empty fraction part, it is treated as an integer by bucket_id_mpcrc32(). So you need to do explicit casts in such cases. Here is an example of the problem:

tarantool> s = box.schema.create_space('test', {format = {{'id', 'double'}}}); _ = s:create_index('pk')
---
...

tarantool> inserted = ffi.cast('double', 1)
---
...

-- Value is stored as double
tarantool> s:replace({inserted})
---
- [1]
...

-- But when returned to Lua, stored as Lua number, not cdata.
tarantool> returned = s:get({inserted}).id
---
...

tarantool> type(returned), returned
---
- number
- 1
...

tarantool> vshard.router.bucket_id_mpcrc32(inserted)
---
- 1411
...
tarantool> vshard.router.bucket_id_mpcrc32(returned)
---
- 1614
...

vshard.router.bucket_count()¶

Return the total number of buckets specified in vshard.router.cfg().

Return:	the total number of buckets
Rtype:	number

tarantool> vshard.router.bucket_count()
---
- 10000
...

vshard.router.sync(timeout)¶

Wait until the dataset is synchronized on replicas.

Parameters:	timeout – a timeout, in seconds
Return:	`true` if the dataset was synchronized successfully; or `nil` and `err` explaining why the dataset cannot be synchronized.

vshard.router.discovery_wakeup()¶: Force wakeup of the bucket discovery fiber.

vshard.router.discovery_set(mode)¶

Turn on/off the background discovery fiber used by the router to find buckets.

Parameters:	mode – working mode of a discovery fiber. There are three modes: `on`, `off` and `once`

When the mode is on (default), the discovery fiber works throughout the router’s lifetime. Even after all buckets are discovered, it still visits the storages and downloads their buckets at a long interval (DISCOVERY_IDLE_INTERVAL). This is useful if the bucket topology changes often and the number of buckets is not big. The router keeps its route table up to date even when no requests are processed.

When the mode is off, discovery is disabled completely.

When the mode is once, discovery starts and finds the locations of all buckets, and then the discovery fiber is terminated. This is good for a large bucket count and for clusters where rebalancing is rare.

This method is suitable for enabling or disabling discovery after the router has already started, but note that discovery is enabled by default. If you never want it enabled, even for a short time, specify the discovery_mode option in the configuration instead. It takes the same values as vshard.router.discovery_set(mode).

You may decide to turn off discovery or set it to once if you have many routers or a very large number of buckets (hundreds of thousands or more), and you see that the discovery process consumes a notable share of CPU on routers and storages. In that case, it may be wise to turn off discovery when there is no rebalancing in the cluster, and to turn it on for new routers, as well as for all routers when rebalancing starts.

vshard.router.info({options})¶

Return information about each instance. Since vshard 0.1.22, the function also accepts options, which can be used to get additional information.

Parameters:	options – `with_services` — a bool value. If set to `true`, the function returns information about the background services (such as discovery, master search, or failover) that are working on the current instance.
Return:

Replica set parameters:

replica set uuid
master instance parameters
replica instance parameters

Instance parameters:

uri – URI of the instance
uuid – UUID of the instance
status – status of the instance (available, unreachable, missing)
network_timeout – a timeout for the request. The value is updated automatically on each 10th successful request and each 2nd failed request.

Bucket parameters:

available_ro – the number of buckets known to the router and available for read requests
available_rw – the number of buckets known to the router and available for read and write requests
unreachable – the number of buckets known to the router but unavailable for any requests
unknown – the number of buckets whose replica sets are not known to the router

Service parameters:

name – service name. Possible values: discovery, failover, master_search.
status – service status. Possible values: ok, error.
error – error message that appears on the error status.
activity – service state. It shows what the service is currently doing (for example, updating replicas).
status_idx – incrementing counter of the status changes. The ok status is updated on every successful iteration of the service. The error status is updated only when it is fixed.

Example:

tarantool> vshard.router.info()
---
- replicasets:
    ac522f65-aa94-4134-9f64-51ee384f1a54:
      replica: &0
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3303
        uuid: 1e02ae8a-afc0-4e91-ba34-843a356b8ed7
      uuid: ac522f65-aa94-4134-9f64-51ee384f1a54
      master: *0
    cbf06940-0790-498b-948d-042b62cf3d29:
      replica: &1
        network_timeout: 0.5
        status: available
        uri: storage@127.0.0.1:3301
        uuid: 8a274925-a26d-47fc-9e1b-af88ce939412
      uuid: cbf06940-0790-498b-948d-042b62cf3d29
      master: *1
  bucket:
    unreachable: 0
    available_ro: 0
    unknown: 0
    available_rw: 3000
  status: 0
  alerts: []
...

tarantool> vshard.router.info({with_services = true})
---
<all info from vshard.router.info()>
  services:
    failover:
      status_idx: 2
      error:
      activity: idling
      name: failover
      status: ok
    discovery:
      status_idx: 2
      error: Error during discovery: TimedOut
      activity: idling
      name: discovery
      status: error
...
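The service parameters can be checked programmatically. The following is a minimal sketch, assuming the result table has the shape shown in the output above; the helper name failed_services is hypothetical, and the sample table is copied from that output so that the snippet runs outside Tarantool.

```lua
-- Collect descriptions of background services whose status is not 'ok'.
-- `info` is expected to look like the result of
-- vshard.router.info({with_services = true}).
local function failed_services(info)
    local failed = {}
    for name, service in pairs(info.services or {}) do
        if service.status ~= 'ok' then
            table.insert(failed, name .. ': ' .. tostring(service.error))
        end
    end
    -- pairs() iterates in an unspecified order; sort for stable output.
    table.sort(failed)
    return failed
end

-- Sample data copied from the console output above.
local sample = {
    services = {
        failover = {status_idx = 2, activity = 'idling',
                    name = 'failover', status = 'ok'},
        discovery = {status_idx = 2, activity = 'idling',
                     error = 'Error during discovery: TimedOut',
                     name = 'discovery', status = 'error'},
    },
}

for _, line in ipairs(failed_services(sample)) do
    print(line)
end
```

On a router, the same helper could be given the real result of vshard.router.info({with_services = true}).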

vshard.router.buckets_info()

Return information about each bucket. Since a bucket map can be huge, you can request only the required range of buckets.

Parameters:
offset – the offset in a bucket map of the first bucket to show
limit – the maximum number of buckets to show
Return:	a map of the following type: `{bucket_id = 'unknown'/replicaset_uuid}`

tarantool> vshard.router.buckets_info()
---
- - uuid: aaaaaaaa-0000-4000-a000-000000000000
    status: available_rw
  - uuid: aaaaaaaa-0000-4000-a000-000000000000
    status: available_rw
  - uuid: aaaaaaaa-0000-4000-a000-000000000000
    status: available_rw
  - uuid: bbbbbbbb-0000-4000-a000-000000000000
    status: available_rw
  - uuid: bbbbbbbb-0000-4000-a000-000000000000
    status: available_rw
  - uuid: bbbbbbbb-0000-4000-a000-000000000000
    status: available_rw
  - uuid: bbbbbbbb-0000-4000-a000-000000000000
    status: available_rw
...
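A result like this can be summarized per replica set. The sketch below assumes the entries look like the console output above (tables with uuid and status fields) and treats anything else as unknown; the helper name buckets_per_replicaset is hypothetical.

```lua
-- Count how many buckets the router attributes to each replica set.
local function buckets_per_replicaset(buckets)
    local counts = {}
    for _, bucket in pairs(buckets) do
        -- Entries without a uuid are counted under 'unknown'.
        local key = type(bucket) == 'table' and bucket.uuid or 'unknown'
        counts[key] = (counts[key] or 0) + 1
    end
    return counts
end

-- Sample data shaped like the console output above.
local A = 'aaaaaaaa-0000-4000-a000-000000000000'
local B = 'bbbbbbbb-0000-4000-a000-000000000000'
local sample = {}
for i = 1, 3 do sample[i] = {uuid = A, status = 'available_rw'} end
for i = 4, 7 do sample[i] = {uuid = B, status = 'available_rw'} end

for uuid, count in pairs(buckets_per_replicaset(sample)) do
    print(uuid, count)
end
```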

vshard.router.enable(): Since vshard v.0.1.21. Manually allow access to the router API, reverting the effect of vshard.router.disable().

Note

vshard.router.enable() cannot be used for enabling a router API that was automatically disabled due to a running configuration process.

vshard.router.disable()

Since vshard v.0.1.21. Manually restrict access to the router API. When the API is disabled, all its methods throw a Lua error, except vshard.router.cfg(), vshard.router.new(), vshard.router.enable() and vshard.router.disable(). The error object’s name attribute is ROUTER_IS_DISABLED.

The router is enabled by default. However, it is automatically and forcefully disabled until the configuration is finished, as accessing the router’s methods at that time is not safe.

Manual disabling can be used, for example, if some preparatory work needs to be done after calling vshard.router.cfg() but before the router’s methods are available. It will look like this:

vshard.router.disable()
vshard.router.cfg(...)
-- Some preparatory work here ...
vshard.router.enable()
-- vshard.router's methods are available now
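Code that may run while the router is disabled can recognize the error by its name attribute. The following is a minimal sketch under two assumptions: the thrown error object is a Lua table, and call_unless_disabled is a hypothetical wrapper, not a vshard function. A stand-in function replaces a real router method so that the snippet runs outside Tarantool.

```lua
-- Hypothetical wrapper: run fn and report a disabled router as nil, err
-- instead of letting the Lua error propagate. Only the first return value
-- of fn is passed through.
local function call_unless_disabled(fn, ...)
    local ok, res = pcall(fn, ...)
    if ok then
        return res
    end
    -- Assumption: the thrown error is a Lua table with a name attribute.
    if type(res) == 'table' and res.name == 'ROUTER_IS_DISABLED' then
        return nil, 'router API is disabled'
    end
    error(res, 0) -- re-raise any other error unchanged
end

-- Stand-in for a router method called while the API is disabled.
local function disabled_method()
    error({name = 'ROUTER_IS_DISABLED'})
end

print(call_unless_disabled(disabled_method))
```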

object replicaset_object

replicaset_object:call(function_name, {argument_list}, {options})

Call a function on the nearest available master (distances are defined using replica.zone and the cfg.weights matrix) with the specified arguments.

Note

The replicaset_object:call method is similar to replicaset_object:callrw.

Parameters:
function_name – function to execute
argument_list – array of the function’s arguments
options – `timeout`: a request timeout, in seconds. If the `router` cannot identify a shard with the specified `bucket_id`, it will retry until the timeout is reached. Other net.box options, such as `is_async`, `buffer`, and `on_push`, are also supported.
Return:	result of `function_name` on success; `nil, err` otherwise

replicaset_object:callrw(function_name, {argument_list}, {options})

Call a function on the nearest available master (distances are defined using replica.zone and the cfg.weights matrix) with the specified arguments.

Note

The replicaset_object:callrw method is similar to replicaset_object:call.

Parameters:
function_name – function to execute
argument_list – array of the function’s arguments
options – `timeout`: a request timeout, in seconds. If the `router` cannot identify a shard with the specified `bucket_id`, it will retry until the timeout is reached. Other net.box options, such as `is_async`, `buffer`, and `on_push`, are also supported.
Return:	result of `function_name` on success; `nil, err` otherwise

tarantool> local bucket = 1; return vshard.router.callrw(
         >     bucket,
         >     'box.space.actors:insert',
         >     {{
         >         1, bucket, 'Renata Litvinova',
         >         {theatre="Moscow Art Theatre"}
         >     }},
         >     {timeout=5}
         > )

replicaset_object:callro(function_name, {argument_list}, {options})

Call a function on the nearest available replica (distances are defined using replica.zone and the cfg.weights matrix) with the specified arguments. It is recommended to use replicaset_object:callro() only for calling read-only functions, as the called functions can be executed not only on a master, but also on replicas.

Parameters:
function_name – function to execute
argument_list – array of the function’s arguments
options – `timeout`: a request timeout, in seconds. If the `router` cannot identify a shard with the specified `bucket_id`, it will retry until the timeout is reached. Other net.box options, such as `is_async`, `buffer`, and `on_push`, are also supported.
Return:	result of `function_name` on success; `nil, err` otherwise

replicaset_object:callre(function_name, {argument_list}, {options})

Call a function on the nearest available replica (distances are defined using replica.zone and the cfg.weights matrix) with the specified arguments, preferring a replica over a master (similar to calling vshard.router.call with prefer_replica = true). It is recommended to use replicaset_object:callre() only for calling read-only functions, as the called function can be executed not only on a master, but also on replicas.

Parameters:
function_name – function to execute
argument_list – array of the function’s arguments
options – `timeout`: a request timeout, in seconds. If the `router` cannot identify a shard with the specified `bucket_id`, it will retry until the timeout is reached. Other net.box options, such as `is_async`, `buffer`, and `on_push`, are also supported.
Return:	result of `function_name` on success; `nil, err` otherwise

vshard.router.master_search_wakeup()

Automated master discovery works in its own fiber on a router, which is activated only if at least one replica set is configured to look for the master (the master parameter is set to auto). The fiber wakes up within a certain period. But it is possible to wake it up on demand by using this function.

Manual fiber wakeup can help speed up tests for master change. Another use case is performing some actions with a router in the router console.

The function does nothing if master search is not configured for any replica set.

Return:	none

Router internal API

vshard.router.bucket_discovery(bucket_id)

Search for the bucket in the whole cluster. If the bucket is not found, it is likely that it does not exist. The bucket might also have been moved during rebalancing and currently be in the RECEIVING state.

Parameters:	bucket_id – a bucket identifier

Storage API

Storage public API

Storage internal API

vshard.storage.bucket_stat(bucket_id)
vshard.storage.bucket_recv(bucket_id, from, data)
vshard.storage.bucket_delete_garbage(bucket_id)
vshard.storage.bucket_collect(bucket_id)
vshard.storage.bucket_force_create(first_bucket_id, count)
vshard.storage.bucket_force_drop(bucket_id)
vshard.storage.bucket_send(bucket_id, to)
vshard.storage.buckets_discovery()
vshard.storage.rebalancer_request_state()

Storage public API

vshard.storage.cfg(cfg, instance_uuid)

Configure the database and start sharding for the specified storage instance.

Parameters:
cfg – a `storage` configuration
instance_uuid – UUID of the instance

vshard.storage.info({options})

Return information about the storage instance. Since vshard v.0.1.22, the function also accepts options, which can be used to get additional information.

Parameters:	options – `with_services` — a bool value. If set to `true`, the function returns information about the background services (such as garbage collector, rebalancer, recovery, or applier of the routes) that are working on the current instance. See vshard.router.info for detailed reference.

Example:

tarantool> vshard.storage.info()
---
- replicasets:
    c862545d-d966-45ff-93ad-763dce4a9723:
      uuid: c862545d-d966-45ff-93ad-763dce4a9723
      master:
        uri: admin@localhost:3302
    1990be71-f06e-4d9a-bcf9-4514c4e0c889:
      uuid: 1990be71-f06e-4d9a-bcf9-4514c4e0c889
      master:
        uri: admin@localhost:3304
  bucket:
    receiving: 0
    active: 15000
    total: 15000
    garbage: 0
    pinned: 0
    sending: 0
  status: 0
  replication:
    status: master
  alerts: []
...

vshard.storage.call(bucket_id, mode, function_name, {argument_list})

Call the specified function on the current storage instance.

Parameters:
bucket_id – a bucket identifier
mode – a type of the function: ‘read’ or ‘write’
function_name – function to execute
argument_list – array of the function’s arguments
Return:

The original return value of the executed function, or nil and error object.

vshard.storage.sync(timeout)

Wait until the dataset is synchronized on replicas.

Parameters:	timeout – a timeout, in seconds
Return:	`true` if the dataset was synchronized successfully; or `nil` and `err` explaining why the dataset cannot be synchronized.

vshard.storage.bucket_pin(bucket_id)

Pin a bucket to a replica set. A pinned bucket cannot be moved even if it breaks the cluster balance.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket is pinned successfully; or `nil` and `err` explaining why the bucket cannot be pinned

vshard.storage.bucket_unpin(bucket_id)

Return a pinned bucket back into the active state.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket is unpinned successfully; or `nil` and `err` explaining why the bucket cannot be unpinned

vshard.storage.bucket_ref(bucket_id, mode)

Create an RO or RW ref.

Parameters:
bucket_id – a bucket identifier
mode – ‘read’ or ‘write’
Return:	`true` if the bucket ref is created successfully; or `nil` and `err` explaining why the ref cannot be created

vshard.storage.bucket_refro(bucket_id)

An alias for vshard.storage.bucket_ref in the RO mode.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket ref is created successfully; or `nil` and `err` explaining why the ref cannot be created

vshard.storage.bucket_refrw(bucket_id)

An alias for vshard.storage.bucket_ref in the RW mode.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket ref is created successfully; or `nil` and `err` explaining why the ref cannot be created

vshard.storage.bucket_unref(bucket_id, mode)

Remove an RO or RW ref.

Parameters:
bucket_id – a bucket identifier
mode – ‘read’ or ‘write’
Return:	`true` if the bucket ref is removed successfully; or `nil` and `err` explaining why the ref cannot be removed

vshard.storage.bucket_unrefro(bucket_id)

An alias for vshard.storage.bucket_unref in the RO mode.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket ref is removed successfully; or `nil` and `err` explaining why the ref cannot be removed

vshard.storage.bucket_unrefrw(bucket_id)

An alias for vshard.storage.bucket_unref in the RW mode.

Parameters:	bucket_id – a bucket identifier
Return:	`true` if the bucket ref is removed successfully; or `nil` and `err` explaining why the ref cannot be removed
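The ref and unref functions above are meant to be used in pairs. The sketch below shows one way to guarantee that a ref is always released. The helper with_bucket_ref is hypothetical; it receives the API table as an argument (vshard.storage on a real storage instance) and relies only on the documented return convention of `true` or `nil, err`. An in-memory stand-in makes the snippet runnable outside Tarantool.

```lua
-- Hypothetical helper: take a ref on a bucket, run fn, release the ref.
-- `storage` is any table exposing bucket_ref and bucket_unref with the
-- signatures documented above.
local function with_bucket_ref(storage, bucket_id, mode, fn)
    local ok, err = storage.bucket_ref(bucket_id, mode)
    if not ok then
        return nil, err
    end
    -- pcall ensures that the ref is released even if fn raises an error.
    local call_ok, result = pcall(fn)
    storage.bucket_unref(bucket_id, mode)
    if not call_ok then
        return nil, result
    end
    return result
end

-- In-memory stand-in used only to make the sketch runnable.
local fake_storage = {refs = 0}
function fake_storage.bucket_ref(bucket_id, mode)
    fake_storage.refs = fake_storage.refs + 1
    return true
end
function fake_storage.bucket_unref(bucket_id, mode)
    fake_storage.refs = fake_storage.refs - 1
    return true
end

print(with_bucket_ref(fake_storage, 1, 'read', function() return 'done' end))
```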

vshard.storage.find_garbage_bucket(bucket_index, control)

Find a bucket that has data in a space but is not stored in the _bucket space, or that is in the GARBAGE state.

Parameters:
bucket_index – index of a space with the part of a bucket id
control – a garbage collector controller. If there is an increased buckets generation, then the search should be interrupted.
Return:	an identifier of the bucket in the garbage state, if found; otherwise, nil

vshard.storage.buckets_info()

Return information about each bucket located in storage. For example:

tarantool> vshard.storage.buckets_info(1)
---
- 1:
    status: active
    ref_rw: 1
    ref_ro: 1
    ro_lock: true
    rw_lock: true
    id: 1

vshard.storage.buckets_count(): Return the number of buckets located in storage.

vshard.storage.recovery_wakeup(): Immediately wake up a recovery fiber, if it exists.

vshard.storage.rebalancing_is_in_progress(): Return a flag indicating whether rebalancing is in progress. The result is true if the node is currently applying routes received from a rebalancer node in the special fiber.

vshard.storage.is_locked(): Return a flag indicating whether storage is invisible to the rebalancer.

vshard.storage.rebalancer_disable(): Disable rebalancing. A disabled rebalancer sleeps until it is enabled again with vshard.storage.rebalancer_enable().

vshard.storage.rebalancer_enable(): Enable rebalancing.

vshard.storage.sharded_spaces()

Show the spaces that are visible to rebalancer and garbage collector fibers.

tarantool> vshard.storage.sharded_spaces()
---
- 513:
    engine: memtx
    before_replace: 'function: 0x010e50e738'
    field_count: 0
    id: 513
    on_replace: 'function: 0x010e50e700'
    temporary: false
    index:
      0: &0
        unique: true
        parts:
        - type: number
          fieldno: 1
          is_nullable: false
        id: 0
        type: TREE
        name: primary
        space_id: 513
      1: &1
        unique: false
        parts:
        - type: number
          fieldno: 2
          is_nullable: false
        id: 1
        type: TREE
        name: bucket_id
        space_id: 513
      primary: *0
      bucket_id: *1
    is_local: false
    enabled: true
    name: actors
    ck_constraint: []
...

vshard.storage.on_bucket_event([trigger-function[, old-trigger-function]])

Since vshard v.0.1.22. Define a trigger for execution when the data from the user spaces is changed (deleted or inserted) due to the rebalancing process. The trigger is invoked each time the data batch changes.

Parameters:
trigger-function (`function`) – function which will become the trigger function
old-trigger-function (`function`) – existing trigger function which will be replaced by trigger-function
Return:	nil or function pointer

The trigger-function can have up to three parameters:

event_type (string) – the type of the event. To distinguish events, compare this argument with the supported event types: bucket_data_recv_txn and bucket_data_gc_txn.

bucket_id (unsigned) – bucket id.

data (table) – additional information about the data change transaction. Currently it only includes an array of all spaces (data.spaces) affected by the transaction in which trigger-function is executed.

Example:

vshard.storage.on_bucket_event(function(event, bucket_id, data)
    if event == 'bucket_data_recv_txn' then
        -- Handle it.
        for idx, space in ipairs(data.spaces) do
            ...
        end
    elseif event == 'bucket_data_gc_txn' then
        -- Handle it.
        ...
    end
end)

Note

As everything executed inside triggers is already in a transaction, you shouldn’t use transactions, yielding operations (explicit or implicit), or changes to spaces of different engines (see rule #2).

If the parameters are (nil, old-trigger-function), then the old trigger is deleted. If both parameters are omitted, then the response is a list of existing trigger functions.

Details about trigger characteristics are in the triggers section.

Storage internal API

vshard.storage.bucket_recv(bucket_id, from, data)

Receive a bucket identified by bucket_id from a remote replica set.

Parameters:
bucket_id – a bucket identifier
from – UUID of the source replica set
data – data logically stored in the bucket identified by bucket_id, in the same format as the return value of vshard.storage.bucket_collect()

vshard.storage.bucket_stat(bucket_id)

Return information about the bucket identified by bucket_id:

tarantool> vshard.storage.bucket_stat(1)
---
- 0
- status: active
  id: 1
...

Parameters:	bucket_id – a bucket identifier

vshard.storage.bucket_delete_garbage(bucket_id)

Force garbage collection for the bucket identified by bucket_id in case the bucket was transferred to a different replica set.

Parameters:	bucket_id – a bucket identifier

vshard.storage.bucket_collect(bucket_id)

Collect all the data that is logically stored in the bucket identified by bucket_id:

tarantool> vshard.storage.bucket_collect(1)
---
- 0
- - - 514
    - - [10, 1, 1, 100, 'Account 10']
      - [11, 1, 1, 100, 'Account 11']
      - [12, 1, 1, 100, 'Account 12']
      - [50, 5, 1, 100, 'Account 50']
      - [51, 5, 1, 100, 'Account 51']
      - [52, 5, 1, 100, 'Account 52']
  - - 513
    - - [1, 1, 'Customer 1']
      - [5, 1, 'Customer 5']
...

Parameters:	bucket_id – a bucket identifier
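The collected data can be inspected with plain Lua. The sketch below assumes the layout seen in the console output above, where the data is an array of {space_id, tuples} pairs; the helper name tuples_per_space is hypothetical, and the sample table repeats that output so that the snippet runs outside Tarantool.

```lua
-- Count the tuples that a bucket holds in each space.
-- `collected` is an array of {space_id, {tuple, tuple, ...}} pairs.
local function tuples_per_space(collected)
    local counts = {}
    for _, entry in ipairs(collected) do
        local space_id, tuples = entry[1], entry[2]
        counts[space_id] = #tuples
    end
    return counts
end

-- Sample data copied from the console output above.
local sample = {
    {514, {
        {10, 1, 1, 100, 'Account 10'},
        {11, 1, 1, 100, 'Account 11'},
        {12, 1, 1, 100, 'Account 12'},
        {50, 5, 1, 100, 'Account 50'},
        {51, 5, 1, 100, 'Account 51'},
        {52, 5, 1, 100, 'Account 52'},
    }},
    {513, {
        {1, 1, 'Customer 1'},
        {5, 1, 'Customer 5'},
    }},
}

for space_id, count in pairs(tuples_per_space(sample)) do
    print(space_id, count)
end
```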

vshard.storage.bucket_force_create(first_bucket_id, count)

Force creation of the buckets (single or multiple) on the current replica set. Use only for manual emergency recovery or for initial bootstrap.

Parameters:
first_bucket_id – an identifier of the first bucket in a range
count – the number of buckets to insert (default = 1)

vshard.storage.bucket_force_drop(bucket_id)

Drop a bucket manually for tests or emergency cases.

Parameters:	bucket_id – a bucket identifier

vshard.storage.bucket_send(bucket_id, to)

Send a specified bucket from the current replica set to a remote replica set.

Parameters:
bucket_id – bucket identifier
to – UUID of a remote replica set

vshard.storage.rebalancer_request_state()

Check all buckets of the host storage that have the SENT or ACTIVE state, and return the number of active buckets.

Return:	the number of buckets in the active state, if found; otherwise, nil

vshard.storage.buckets_discovery(): Collect an array of active bucket identifiers for discovery.

SQL DBMS Modules

This part of the reference discusses installing and using two modules that have already been created: the “SQL DBMS rocks” for MySQL and PostgreSQL.

To call another DBMS from Tarantool, the essential requirements are: another DBMS, and Tarantool. The module which connects Tarantool to another DBMS may be called a “connector”. Within the module there is a shared library which may be called a “driver”.

Tarantool supplies DBMS connector modules with the module manager for Lua, LuaRocks. So the connector modules may be called “rocks”.

The Tarantool rocks allow for connecting to SQL servers and executing SQL statements the same way that a MySQL or PostgreSQL client does. The SQL statements are visible as Lua methods. Thus Tarantool can serve as a “MySQL Lua Connector” or “PostgreSQL Lua Connector”, which would be useful even if that was all Tarantool could do. But of course Tarantool is also a DBMS, so the module also is useful for any operations, such as database copying and accelerating, which work best when the application can work on both SQL and Tarantool inside the same Lua routine. The methods for connect/select/insert/etc. are similar to the ones in the net.box module.

From a user’s point of view the MySQL and PostgreSQL rocks are very similar, so the following sections – “MySQL Example” and “PostgreSQL Example” – contain some redundancy.

MySQL Example

This example assumes that MySQL 5.5, 5.6, or 5.7 has been installed. Recent MariaDB versions will also work; the MariaDB C connector is used. The package that matters most is the MySQL client developer package, typically named something like libmysqlclient-dev. The file that matters most from this package is libmysqlclient.so or a similar name. One can use find or whereis to see what directories these files are installed in.

It will be necessary to install Tarantool’s MySQL driver shared library, load it, and use it to connect to a MySQL server instance. After that, one can pass any MySQL statement to the server instance and receive results, including multiple result sets.

Installation

Check the instructions for downloading and installing a binary package that apply for the environment where Tarantool was installed. In addition to installing tarantool, install tarantool-dev. For example, on Ubuntu, run:

$ sudo apt-get install tarantool-dev

Now, for the MySQL driver shared library, there are two ways to install:

With LuaRocks

Begin by installing luarocks and making sure that tarantool is among the upstream servers, as in the instructions on rocks.tarantool.org, the Tarantool luarocks page. Now execute this:

luarocks install mysql [MYSQL_LIBDIR=path]
                       [MYSQL_INCDIR=path]
                       [--local]

For example:

$ luarocks install mysql MYSQL_LIBDIR=/usr/local/mysql/lib

With GitHub

Go to the site github.com/tarantool/mysql and follow the instructions there:

$ git clone https://github.com/tarantool/mysql.git
$ cd mysql && cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo
$ make
$ make install

At this point it is a good idea to check that the installation produced a file named driver.so, and to check that this file is in a directory that is searched by the require request.

Connecting

Begin by making a require request for the mysql driver. We will assume that the name is mysql in further examples.

mysql = require('mysql')

Now, say:

connection_name = mysql.connect(connection-options)

The connection-options parameter is a table. Possible options are:

host = host-name - string, default value = ‘localhost’
port = port-number - number, default value = 3306
user = user-name - string, default value is operating-system user name
password = password - string, default value is blank
db = database-name - string, default value is blank
raise = true|false - boolean, default value is false

The option names, except for raise, are similar to the names that MySQL’s mysql client uses. For details, see the MySQL manual at dev.mysql.com/doc/refman/5.6/en/connecting.html. Set the raise option to true if errors should be raised when encountered. To connect with a Unix socket rather than with TCP, specify host = 'unix/' and port = socket-name.

Example, using a table literal enclosed in {braces}:

conn = mysql.connect({
    host = '127.0.0.1',
    port = 3306,
    user = 'p',
    password = 'p',
    db = 'test',
    raise = true
})
-- OR
conn = mysql.connect({
    host = 'unix/',
    port = '/var/run/mysqld/mysqld.sock'
})

Example, creating a function which sets each option in a separate line:

tarantool> -- Connection function. Usage: conn = mysql_connect()
tarantool> function mysql_connect()
         >   local p = {}
         >   p.host = 'widgets.com'
         >   p.db = 'test'
         >   local conn = mysql.connect(p)
         >   return conn
         > end
---
...
tarantool> conn = mysql_connect()
---
...

We will assume that the name is ‘conn’ in further examples.

How to ping

To ensure that a connection is working, the request is:

connection-name:ping()

Example:

tarantool> conn:ping()
---
- true
...

Executing a statement

For all MySQL statements, the request is:

connection-name:execute(sql-statement [, parameters])

where sql-statement is a string, and the optional parameters are extra values that can be plugged in to replace any question marks (“?”s) in the SQL statement.

Example:

tarantool> conn:execute('select table_name from information_schema.tables')
---
- - table_name: ALL_PLUGINS
  - table_name: APPLICABLE_ROLES
  - table_name: CHARACTER_SETS
  <...>
- 78
...
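The optional parameters can be illustrated with a short sketch. The helper select_by_s1 is hypothetical, and it queries the test table with the s1 column that the worked example in this section creates. A stand-in connection object is used so that the snippet runs without a MySQL server; with a real connection, conn would come from mysql.connect().

```lua
-- Hypothetical helper: select rows from the test table by the value of s1.
-- The value is passed separately and replaces the "?" placeholder, so it
-- is never concatenated into the SQL text.
local function select_by_s1(conn, s1)
    return conn:execute('SELECT * FROM test WHERE s1 = ?', s1)
end

-- Stand-in connection that only records what it was asked to execute.
local fake_conn = {}
function fake_conn:execute(statement, ...)
    self.last_statement = statement
    self.last_params = {...}
    return {}
end

select_by_s1(fake_conn, 1)
print(fake_conn.last_statement, fake_conn.last_params[1])
```

Passing values as parameters instead of building the statement by string concatenation keeps quoting out of the application code.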

Closing connection

To end a session that began with mysql.connect, the request is:

connection-name:close()

Example:

tarantool> conn:close()
---
...

For further information, including examples of rarely-used requests, see the README.md file at github.com/tarantool/mysql.

Example

The example was run on an Ubuntu 12.04 (“precise”) machine where tarantool had been installed in a /usr subdirectory, and a copy of MySQL had been installed in ~/mysql-5.5. The mysqld server instance is already running on the local host 127.0.0.1.

$ export TMDIR=~/mysql-5.5
$ # Check that the include subdirectory exists by looking
$ # for .../include/mysql.h. (If this fails, there's a chance
$ # that it's in .../include/mysql/mysql.h instead.)
$ [ -f $TMDIR/include/mysql.h ] && echo "OK" || echo "Error"
OK

$ # Check that the library subdirectory exists and has the
$ # necessary .so file.
$ [ -f $TMDIR/lib/libmysqlclient.so ] && echo "OK" || echo "Error"
OK

$ # Check that the mysql client can connect using some factory
$ # defaults: port = 3306, user = 'root', user password = '',
$ # database = 'test'. These can be changed, provided one uses
$ # the changed values in all places.
$ $TMDIR/bin/mysql --port=3306 -h 127.0.0.1 --user=root \
    --password= --database=test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 25
Server version: 5.5.35 MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear ...

$ # Insert a row in database test, and quit.
mysql> CREATE TABLE IF NOT EXISTS test (s1 INT, s2 VARCHAR(50));
Query OK, 0 rows affected (0.13 sec)
mysql> INSERT INTO test.test VALUES (1,'MySQL row');
Query OK, 1 row affected (0.02 sec)
mysql> QUIT
Bye

$ # Install luarocks
$ sudo apt-get -y install luarocks | grep -E "Setting up|already"
Setting up luarocks (2.0.8-2) ...

$ # Set up the Tarantool rock list in ~/.luarocks,
$ # following instructions at rocks.tarantool.org
$ mkdir ~/.luarocks
$ echo "rocks_servers = {[[http://rocks.tarantool.org/]]}" >> \
    ~/.luarocks/config.lua

$ # Ensure that the next "install" will get files from Tarantool
$ # master repository. The resultant display is normal for Ubuntu
$ # 12.04 precise
$ cat /etc/apt/sources.list.d/tarantool.list
deb http://tarantool.org/dist/2.1/ubuntu/ precise main
deb-src http://tarantool.org/dist/2.1/ubuntu/ precise main

$ # Install tarantool-dev. The displayed line should show version = 2.1
$ sudo apt-get -y install tarantool-dev | grep -E "Setting up|already"
Setting up tarantool-dev (2.1.0.222.g48b98bb~precise-1) ...
$

$ # Use luarocks to install locally, that is, relative to $HOME
$ luarocks install mysql MYSQL_LIBDIR=/usr/local/mysql/lib --local
Installing http://rocks.tarantool.org/mysql-scm-1.rockspec...
... (more info about building the Tarantool/MySQL driver appears here)
mysql scm-1 is now built and installed in ~/.luarocks/

$ # Ensure driver.so now has been created in a place
$ # tarantool will look at
$ find ~/.luarocks -name "driver.so"
~/.luarocks/lib/lua/5.1/mysql/driver.so

$ # Change directory to a directory which can be used for
$ # temporary tests. For this example we assume that the name
$ # of this directory is /home/pgulutzan/tarantool_sandbox.
$ # (Change "/home/pgulutzan" to whatever is the user's actual
$ # home directory for the machine that's used for this test.)
$ cd /home/pgulutzan/tarantool_sandbox

$ # Start the Tarantool server instance. Do not use a Lua initialization file.

$ tarantool
tarantool: version 2.1.0-222-g48b98bb
type 'help' for interactive help
tarantool>

Configure tarantool and load mysql module. Make sure that tarantool doesn’t reply “error” for the call to “require()”.

tarantool> box.cfg{}
...
tarantool> mysql = require('mysql')
---
...

Create a Lua function that will connect to the MySQL server instance (using some factory default values for the port, user, and password), retrieve one row, and display the row. For explanations of the statement types used here, read the Lua tutorial earlier in the Tarantool user manual.

tarantool> function mysql_select()
         >   local conn = mysql.connect({
         >     host = '127.0.0.1',
         >     port = 3306,
         >     user = 'root',
         >     db = 'test'
         >   })
         >   local test = conn:execute('SELECT * FROM test WHERE s1 = 1')
         >   local row = ''
         >   for i, card in pairs(test) do
         >     row = row .. card.s2 .. ' '
         >   end
         >   conn:close()
         >   return row
         > end
---
...
tarantool> mysql_select()
---
- 'MySQL row '
...

Observe the result. It contains “MySQL row”, the row that was inserted into the MySQL database and has now been selected with the Tarantool client.

PostgreSQL Example

This example assumes that PostgreSQL 8 or PostgreSQL 9 has been installed. More recent versions should also work. The package that matters most is the PostgreSQL developer package, typically named something like libpq-dev. On Ubuntu this can be installed with:

$ sudo apt-get install libpq-dev

However, because not all platforms are alike, for this example the assumption is that the user must check that the appropriate PostgreSQL files are present and must explicitly state where they are when building the Tarantool/PostgreSQL driver. One can use find or whereis to see what directories PostgreSQL files are installed in.

It will be necessary to install Tarantool’s PostgreSQL driver shared library, load it, and use it to connect to a PostgreSQL server instance. After that, one can pass any PostgreSQL statement to the server instance and receive results.

Installation

$ sudo apt-get install tarantool-dev

Now, for the PostgreSQL driver shared library, there are two ways to install:

With LuaRocks

Begin by installing luarocks and making sure that tarantool is among the upstream servers, as in the instructions on rocks.tarantool.org, the Tarantool luarocks page. Now execute this:

luarocks install pg [POSTGRESQL_LIBDIR=path]
                    [POSTGRESQL_INCDIR=path]
                    [--local]

For example:

$ luarocks install pg POSTGRESQL_LIBDIR=/usr/local/postgresql/lib

With GitHub

Go to the site github.com/tarantool/pg and follow the instructions there:

$ git clone https://github.com/tarantool/pg.git
$ cd pg && cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo
$ make
$ make install

At this point it is a good idea to check that the installation produced a file named driver.so, and to check that this file is in a directory that is searched by the require request.

Connecting

Begin by making a require request for the pg driver. We will assume that the name is pg in further examples.

pg = require('pg')

Now, say:

connection_name = pg.connect(connection-options)

The connection-options parameter is a table. Possible options are:

host = host-name - string, default value = ‘localhost’
port = port-number - number, default value = 5432
user = user-name - string, default value is operating-system user name
pass = password or password = password - string, default value is blank
db = database-name - string, default value is blank

The names are similar to the names that PostgreSQL itself uses.

Example, using a table literal enclosed in {braces}:

conn = pg.connect({
    host = '127.0.0.1',
    port = 5432,
    user = 'p',
    password = 'p',
    db = 'test'
})

Example, creating a function which sets each option in a separate line:

tarantool> function pg_connect()
         >   local p = {}
         >   p.host = 'widgets.com'
         >   p.db = 'test'
         >   p.user = 'postgres'
         >   p.password = 'postgres'
         >   local conn = pg.connect(p)
         >   return conn
         > end
---
...
tarantool> conn = pg_connect()
---
...

We will assume that the name is ‘conn’ in further examples.

How to ping

To ensure that a connection is working, the request is:

connection-name:ping()

Example:

tarantool> conn:ping()
---
- true
...

Executing a statement

For all PostgreSQL statements, the request is:

connection-name:execute(sql-statement [, parameters])

where sql-statement is a string, and the optional parameters are extra values that can be plugged in to replace any placeholders ($1 $2 $3 etc.) in the SQL statement.

Example:

tarantool> conn:execute('select tablename from pg_tables')
---
- - tablename: pg_statistic
  - tablename: pg_type
  - tablename: pg_authid
  <...>
...

Closing connection

To end a session that began with pg.connect, the request is:

connection-name:close()

Example:

tarantool> conn:close()
---
...

For further information, including examples of rarely-used requests, see the README.md file at github.com/tarantool/pg.

Example

The example was run on an Ubuntu 12.04 (“precise”) machine where tarantool had been installed in a /usr subdirectory, and a copy of PostgreSQL had been installed on /usr. The PostgreSQL server instance is already running on the local host 127.0.0.1.

$ # Check that the include subdirectory exists
$ # by looking for /usr/include/postgresql/libpq-fe.h.
$ [ -f /usr/include/postgresql/libpq-fe.h ] && echo "OK" || echo "Error"
OK

$ # Check that the library subdirectory exists and has the necessary .so file.
$ [ -f /usr/lib/x86_64-linux-gnu/libpq.so ] && echo "OK" || echo "Error"
OK

$ # Check that the psql client can connect using some factory defaults:
$ # port = 5432, user = 'postgres', user password = 'postgres',
$ # database = 'postgres'. These can be changed, provided one changes
$ # them in all places. Insert a row in database postgres, and quit.
$ psql -h 127.0.0.1 -p 5432 -U postgres -d postgres
Password for user postgres:
psql (9.3.10)
SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Type "help" for help.

postgres=# CREATE TABLE test (s1 INT, s2 VARCHAR(50));
CREATE TABLE
postgres=# INSERT INTO test VALUES (1,'PostgreSQL row');
INSERT 0 1
postgres=# \q
$

$ # Install luarocks
$ sudo apt-get -y install luarocks | grep -E "Setting up|already"
Setting up luarocks (2.0.8-2) ...

$ # Set up the Tarantool rock list in ~/.luarocks,
$ # following instructions at rocks.tarantool.org
$ mkdir ~/.luarocks
$ echo "rocks_servers = {[[http://rocks.tarantool.org/]]}" >> \
        ~/.luarocks/config.lua

$ # Ensure that the next "install" will get files from Tarantool master
$ # repository. The resultant display is normal for Ubuntu 12.04 precise
$ cat /etc/apt/sources.list.d/tarantool.list
deb http://tarantool.org/dist/2.0/ubuntu/ precise main
deb-src http://tarantool.org/dist/2.0/ubuntu/ precise main

$ # Install tarantool-dev. The displayed line should show version = 2.0
$ sudo apt-get -y install tarantool-dev | grep -E "Setting up|already"
Setting up tarantool-dev (2.0.4.222.g48b98bb~precise-1) ...
$

$ # Use luarocks to install locally, that is, relative to $HOME
$ luarocks install pg POSTGRESQL_LIBDIR=/usr/lib/x86_64-linux-gnu --local
Installing http://rocks.tarantool.org/pg-scm-1.rockspec...
... (more info about building the Tarantool/PostgreSQL driver appears here)
pg scm-1 is now built and installed in ~/.luarocks/

$ # Ensure driver.so now has been created in a place
$ # tarantool will look at
$ find ~/.luarocks -name "driver.so"
~/.luarocks/lib/lua/5.1/pg/driver.so

$ # Change directory to a directory which can be used for
$ # temporary tests. For this example we assume that the
$ # name of this directory is $HOME/tarantool_sandbox.
$ # (Change "$HOME" to whatever is the user's actual
$ # home directory for the machine that's used for this test.)
$ cd $HOME/tarantool_sandbox

$ # Start the Tarantool server instance. Do not use a Lua initialization file.

$ tarantool
tarantool: version 2.0.4-412-g803b15c
type 'help' for interactive help
tarantool>

Configure tarantool and load pg module. Make sure that tarantool doesn’t reply “error” for the call to “require()”.

tarantool> box.cfg{}
...
tarantool> pg = require('pg')
---
...

Create a Lua function that will connect to a PostgreSQL server (using some factory default values for the port, user, and password), retrieve one row, and display the row. For explanations of the statement types used here, read the Lua tutorial earlier in the Tarantool user manual.

tarantool> function pg_select ()
         >   local conn = pg.connect({
         >     host = '127.0.0.1',
         >     port = 5432,
         >     user = 'postgres',
         >     password = 'postgres',
         >     db = 'postgres'
         >   })
         >   local test = conn:execute('SELECT * FROM test WHERE s1 = 1')
         >   local row = ''
         >   for i, card in pairs(test) do
         >     row = row .. card.s2 .. ' '
         >   end
         >   conn:close()
         >   return row
         > end
---
...
tarantool> pg_select()
---
- 'PostgreSQL row '
...

Observe the result. It contains “PostgreSQL row”. So this is the row that was inserted into the PostgreSQL database. And now it’s been selected with the Tarantool client.

Other rocks

This page features a list of links to third-party Tarantool module documentation that is hosted externally – mostly on GitHub pages or in READMEs:

For Tarantool Enterprise modules, see the Tarantool EE documentation.

C API reference

List of C API headers

Module box

type box_function_ctx_t: Opaque structure passed to a C stored procedure

int box_return_tuple(box_function_ctx_t *ctx, box_tuple_t *tuple)

Return a tuple from a C stored procedure.

The returned tuple is automatically reference-counted by Tarantool. An example program that uses box_return_tuple() is write.c.

Parameters:
ctx (box_function_ctx_t*) – an opaque structure passed to the C stored procedure by Tarantool
tuple (box_tuple_t*) – a tuple to return
Returns:	-1 on error (perhaps, out of memory; check box_error_last())
Returns:	0 otherwise

int box_return_mp(box_function_ctx_t *ctx, const char *mp, const char *mp_end)

Return a pointer to a series of bytes in MessagePack format.

This can be used instead of box_return_tuple() – it can send the same value, but as MessagePack instead of as a tuple object. It may be simpler than box_return_tuple() when the result is small, for example a number or a boolean or a short string. It can also be faster than box_return_tuple(), because no tuple has to be created each time a C function returns a value.

On the other hand, if an already-existing tuple was obtained from an iterator, then it would be faster to return the tuple via box_return_tuple() rather than extracting its parts and sending them via box_return_mp().

Parameters:
ctx (box_function_ctx_t*) – an opaque structure passed to the C stored procedure by Tarantool
mp (const char*) – the first MessagePack byte
mp_end (const char*) – after the last MessagePack byte
Returns:	-1 on error (perhaps, out of memory; check box_error_last())
Returns:	0 otherwise

For example, if mp is a buffer, and mp_end is a return value produced by encoding a single MP_UINT scalar value with mp_end=mp_encode_uint(mp,1);, then box_return_mp(ctx,mp,mp_end); should return 0.
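The pointer arithmetic behind mp and mp_end can be tried outside Tarantool. The sketch below is a simplified, hypothetical stand-in for mp_encode_uint() from the msgpuck library; the name demo_encode_uint is an assumption made for illustration and is not part of any Tarantool API. It only shows how a MessagePack unsigned integer is laid out and why the returned pointer lands just past the last encoded byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for msgpuck's mp_encode_uint(): writes `value`
 * at `data` as a MessagePack unsigned integer and returns a pointer to
 * the byte after the last one written. The caller must provide at
 * least 9 bytes of space.
 */
static char *
demo_encode_uint(char *data, uint64_t value)
{
	unsigned char *p = (unsigned char *)data;
	int payload; /* number of big-endian payload bytes */

	assert(data != NULL);
	if (value <= 0x7f) {
		/* Positive fixint: the value itself is the only byte. */
		*p++ = (unsigned char)value;
		return (char *)p;
	} else if (value <= 0xff) {
		*p++ = 0xcc;
		payload = 1;
	} else if (value <= 0xffff) {
		*p++ = 0xcd;
		payload = 2;
	} else if (value <= 0xffffffff) {
		*p++ = 0xce;
		payload = 4;
	} else {
		*p++ = 0xcf;
		payload = 8;
	}
	/* Store the payload most significant byte first. */
	for (int shift = (payload - 1) * 8; shift >= 0; shift -= 8)
		*p++ = (unsigned char)(value >> shift);
	return (char *)p;
}
```

For the value 1 the encoding is the single byte 0x01, so the returned pointer is one byte past the start of the buffer. In a stored procedure, a (mp, mp_end) pair of that kind is what would be passed to box_return_mp().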

uint32_t box_space_id_by_name(const char *name, uint32_t len)

Find space id by name.

This function performs a SELECT request on the _vspace system space.

Parameters:
name (const char*) – space name
len (uint32_t) – length of `name`
Returns:	`BOX_ID_NIL` on error or if not found (check box_error_last())
Returns:	space_id otherwise

Module clock

double clock_realtime(void)
double clock_monotonic(void)
double clock_process(void)
double clock_thread(void)

int64_t clock_realtime64(void)
int64_t clock_monotonic64(void)
int64_t clock_process64(void)
int64_t clock_thread64(void)

Module coio

enum COIO_EVENT

enumerator COIO_READ: READ event

enumerator COIO_WRITE: WRITE event

int coio_wait(int fd, int event, double timeout)

Wait until a READ or WRITE event occurs on a socket (fd). Yields.

Parameters:
fd (int) – non-blocking socket file descriptor
event (int) – requested events to wait for; a combination of `COIO_READ | COIO_WRITE` bit flags
timeout (double) – timeout in seconds
Returns:	0 - timeout
Returns:	>0 - returned events. Combination of `TNT_IO_READ | TNT_IO_WRITE` bit flags.

ssize_t coio_call(ssize_t (*func)(va_list), ...)

Create a new eio task with the specified function and arguments. Yield and wait until the task is complete. This function may use the worker_pool_threads configuration parameter.

To avoid double error checking, this function does not throw exceptions. In most cases it is also necessary to check the return value of the called function and perform necessary actions. If func sets errno, the errno is preserved across the call.

Returns:	-1 and `errno` = ENOMEM if failed to create a task
Returns:	the function’s return (`errno` is preserved).

Example:

static ssize_t openfile_cb(va_list ap)
{
        /* va_arg() needs the type of each argument it fetches. */
        const char *filename = va_arg(ap, const char *);
        int flags = va_arg(ap, int);
        return open(filename, flags);
}

if (coio_call(openfile_cb, "/tmp/file", 0) == -1)
    // handle errors.
...

int coio_getaddrinfo(const char *host, const char *port, const struct addrinfo *hints, struct addrinfo **res, double timeout): Fiber-friendly version of getaddrinfo(3).

int coio_close(int fd)

Close the fd and wake any fiber blocked in a coio_wait() call on this fd.

Parameters:	fd (int) – non-blocking socket file descriptor
Returns:	the result of `close(fd)`, see close(2)

Module error

enum box_error_code: For a complete list of errors, refer to the Tarantool error code header file.

type box_error_t: Error – contains information about an error.

const char *box_error_type(const box_error_t *error)

Return the error type, e.g. “ClientError”, “SocketError”, etc.

Parameters:	error (box_error_t*) – error
Returns:	not-null string

uint32_t box_error_code(const box_error_t *error)

Return IPROTO error code

Parameters:	error (box_error_t*) – error
Returns:	box_error_code

const char *box_error_message(const box_error_t *error)

Return the error message

Parameters:	error (box_error_t*) – error
Returns:	not-null string

box_error_t *box_error_last(void)

Get the information about the last API call error.

Tarantool error handling works much like libc’s errno. All API calls return -1 or NULL in the event of an error. An internal pointer to the box_error_t type is set by API functions to indicate what went wrong. This value is only significant if the API call failed (returned -1 or NULL).

A successful function can also touch the last error in some cases. You don’t have to clear the last error before calling API functions. The returned object is valid only until the next call to any API function.

You must set the last error using box_error_set() in your stored C procedures if you want to return a custom error message. You can re-throw the last API error to IPROTO client by keeping the current value and returning -1 to Tarantool from your stored procedure.

Returns:	last error

void box_error_clear(void): Clear the last error.

int box_error_set(const char *file, unsigned line, uint32_t code, const char *format, ...)

Set the last error.

Parameters:
file (const char*)
line (unsigned)
code (uint32_t) – IPROTO error code
format (const char*)
... – format arguments

Module fiber

type fiber: Fiber - contains information about a fiber.

typedef int (*fiber_func)(va_list): Function to run inside a fiber.

fiber *fiber_new(const char *name, fiber_func f)

Create a new fiber.

Takes a fiber from the fiber cache, if it’s not empty. Can fail only if there is not enough memory for the fiber structure or fiber stack.

The created fiber automatically returns itself to the fiber cache when its “main” function completes.

Parameters:
name (const char*) – string with the fiber name
f (fiber_func) – function to run inside the fiber

Module index

type box_iterator_t: A space iterator

enum iterator_type

Controls how to iterate over tuples in an index. Different index types support different iterator types. For example, one can start iteration from a particular value (request key) and then retrieve all tuples where keys are greater than or equal to (GE) this key.

If an iterator type is not supported by the selected index type, the iterator constructor must fail with ER_UNSUPPORTED. To be selectable for a primary key, an index must support at least the ITER_EQ and ITER_GE types.

A NULL value of the request key corresponds to the first or last key in the index, depending on the iteration direction (the first key for GE and GT types, and the last key for LE and LT types). Therefore, to iterate over all tuples in an index, one can use the ITER_GE or ITER_LE iteration types with the start key equal to NULL. For ITER_EQ, the key must not be NULL.

enumerator ITER_EQ: key == x ASC order

enumerator ITER_REQ: key == x DESC order

enumerator ITER_ALL: all tuples

enumerator ITER_LT: key < x

enumerator ITER_LE: key <= x

enumerator ITER_GE: key >= x

enumerator ITER_GT: key > x

enumerator ITER_BITS_ALL_SET: all bits from x are set in key

enumerator ITER_BITS_ANY_SET: at least one x’s bit is set

enumerator ITER_BITS_ALL_NOT_SET: all bits are not set

enumerator ITER_OVERLAPS: key overlaps x

enumerator ITER_NEIGHBOR: tuples in distance ascending order from specified point

box_iterator_t *box_index_iterator(uint32_t space_id, uint32_t index_id, int type, const char *key, const char *key_end)

Allocate and initialize iterator for space_id, index_id.

The returned iterator must be destroyed by box_iterator_free.

Parameters:
space_id (uint32_t) – space identifier
index_id (uint32_t) – index identifier
type (int) – iterator_type
key (const char*) – encoded key in MsgPack Array format ([part1, part2, …])
key_end (const char*) – the end of encoded `key`
Returns:	NULL on error (check box_error_last)
Returns:	iterator otherwise

int box_iterator_next(box_iterator_t *iterator, box_tuple_t **result)

Retrieve the next item from the iterator.

Parameters:
iterator (box_iterator_t*) – an iterator returned by box_index_iterator
result (box_tuple_t**) – output argument: a tuple, or NULL if there is no more data
Returns:	-1 on error (check box_error_last)
Returns:	0 on success. The end of data is not an error.

void box_iterator_free(box_iterator_t *iterator)

Destroy and deallocate iterator.

Parameters:	iterator (box_iterator_t*) – an iterator returned by box_index_iterator

int iterator_direction(enum iterator_type type): Determine the direction of the given iterator type: -1 for REQ, LT, LE, and +1 for all others.
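The direction rule is small enough to sketch in plain C. The enum and function below are hypothetical local copies, marked with a demo_ prefix, and are not taken from the Tarantool headers. The numeric values of the enumerators are an assumption; only the mapping to -1 and +1 comes from the description above.

```c
#include <assert.h>

/* Hypothetical local mirror of a few iterator types (not Tarantool's header). */
enum demo_iterator_type {
	DEMO_ITER_EQ,
	DEMO_ITER_REQ,
	DEMO_ITER_ALL,
	DEMO_ITER_LT,
	DEMO_ITER_LE,
	DEMO_ITER_GE,
	DEMO_ITER_GT,
	DEMO_ITER_MAX
};

/* Returns -1 for REQ, LT and LE (descending), +1 for every other type. */
static int
demo_iterator_direction(enum demo_iterator_type type)
{
	assert(type < DEMO_ITER_MAX);
	switch (type) {
	case DEMO_ITER_REQ:
	case DEMO_ITER_LT:
	case DEMO_ITER_LE:
		return -1;
	default:
		return 1;
	}
}
```

A negative direction matches the types for which a NULL request key means the last key in the index, as described for LE and LT earlier in this module.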

ssize_t box_index_len(uint32_t space_id, uint32_t index_id)

Return the number of elements in the index.

Parameters:
space_id (uint32_t) – space identifier
index_id (uint32_t) – index identifier
Returns:	-1 on error (check box_error_last)
Returns:	>= 0 otherwise

ssize_t box_index_bsize(uint32_t space_id, uint32_t index_id)

Return the number of bytes used in memory by the index.

Parameters:
space_id (uint32_t) – space identifier
index_id (uint32_t) – index identifier
Returns:	-1 on error (check box_error_last)
Returns:	>= 0 otherwise

int box_index_random(uint32_t space_id, uint32_t index_id, uint32_t rnd, box_tuple_t **result)

Return a random tuple from the index (useful for statistical analysis).

Parameters:
space_id (uint32_t) – space identifier
index_id (uint32_t) – index identifier
rnd (uint32_t) – random seed
result (box_tuple_t**) – output argument: a tuple, or NULL if there are no tuples in the space

Module latch

type box_latch_t: A lock for cooperative multitasking environment

box_latch_t *box_latch_new(void)

Allocate and initialize the new latch.

Returns:	allocated latch object
Return type:	box_latch_t *

void box_latch_delete(box_latch_t *latch)

Destroy and free the latch.

Parameters:	latch (box_latch_t*) – latch to destroy

void box_latch_lock(box_latch_t *latch)

Lock a latch. Waits indefinitely until the current fiber can gain access to the latch. Since version 2.11.0, locks are acquired exactly in the order in which they were requested.

Parameters:	latch (box_latch_t*) – latch to lock

int box_latch_trylock(box_latch_t *latch)

Try to lock a latch. Return immediately if the latch is locked.

Parameters:	latch (box_latch_t*) – latch to lock
Returns:	status of operation. 0 - success, 1 - latch is locked
Return type:	int

void box_latch_unlock(box_latch_t *latch)

Unlock a latch. The fiber calling this function must own the latch.

Parameters:	latch (box_latch_t*) – latch to unlock

Function on_shutdown

int box_on_shutdown(void *arg, int (*new_handler)(void*), int (*old_handler)(void*))

Parameters:
arg (void*) – Pointer to an area that the new handler can use
new_handler (function*) – Pointer to a function which will be registered, or NULL
old_handler (function*) – Pointer to a function which will be deregistered, or NULL
Returns:	status of operation. 0 - success, -1 - failure
Return type:	int

A function which is registered will be called when the Tarantool instance shuts down. This is functionally similar to what box.ctl.on_shutdown does.

If there are several on_shutdown functions, the Tarantool instance will call them in reverse order of registration, that is, it will call the last-registered function first.

Typically a module developer will register an on_shutdown function that does whatever cleanup work the module requires, and then returns control to the Tarantool instance. Such an on_shutdown function should be fast, or should use an asynchronous waiting mechanism (for example coio_wait).

Possible errors: old_handler does not exist (errno = EINVAL), new_handler and old_handler are both NULL (errno = EINVAL), memory allocation fails (errno = ENOMEM).

Example: if the C API .c program contains a function int on_shutdown_function(void *arg) {printf("Bye!\n");return 0; } and later, in the function which the instance calls, contains a line box_on_shutdown(NULL, on_shutdown_function, NULL); then, if all goes well, when the instance shuts down, it will display “Bye!”.

Added in version 2.8.1.
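The reverse-order rule for several on_shutdown functions can be illustrated with a small stand-alone registry. The sketch below is not Tarantool's implementation and does not call box_on_shutdown(); every name in it (demo_register, demo_run_shutdown, demo_record, and the demo_ variables) is hypothetical. It only shows handlers, each with its own arg pointer, being called last-registered-first.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_HANDLERS 8

typedef int (*demo_handler_f)(void *arg);

/* Registered handlers and their arguments, in registration order. */
static demo_handler_f demo_handlers[DEMO_MAX_HANDLERS];
static void *demo_args[DEMO_MAX_HANDLERS];
static size_t demo_count;

/* Letters recorded by demo_record(), used to observe the call order. */
static char demo_log[DEMO_MAX_HANDLERS + 1];
static size_t demo_log_len;

/* Register a handler. Returns 0 on success, -1 if the table is full. */
static int
demo_register(void *arg, demo_handler_f handler)
{
	assert(handler != NULL);
	if (demo_count == DEMO_MAX_HANDLERS)
		return -1;
	demo_handlers[demo_count] = handler;
	demo_args[demo_count] = arg;
	demo_count++;
	return 0;
}

/* Call the handlers last-registered-first, then empty the table. */
static void
demo_run_shutdown(void)
{
	while (demo_count > 0) {
		demo_count--;
		demo_handlers[demo_count](demo_args[demo_count]);
	}
}

/* Sample handler: appends the first character of the string in arg. */
static int
demo_record(void *arg)
{
	const char *tag = arg;
	if (demo_log_len < DEMO_MAX_HANDLERS)
		demo_log[demo_log_len++] = tag[0];
	return 0;
}
```

Registering demo_record first with "A" and then with "B", and then calling demo_run_shutdown(), records B before A.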

Module lua/utils

void *luaL_pushcdata(struct lua_State *L, uint32_t ctypeid)

Push cdata of given ctypeid onto the stack.

CTypeID must be used from FFI at least once. The allocated memory is returned uninitialized. Only numbers and pointers are supported.

Parameters:
L (lua_State*) – Lua State
ctypeid (uint32_t) – FFI’s CTypeID of this cdata
Returns:	memory associated with this cdata

Module say (logging)

enum say_level

enumerator S_FATAL: do not use this value directly

enumerator S_SYSERROR

enumerator S_ERROR

enumerator S_CRIT

enumerator S_WARN

enumerator S_INFO

enumerator S_VERBOSE

enumerator S_DEBUG

say(level, format, ...)

Format and print a message to the Tarantool log file.

Parameters:
level (int) – log level
format (const char*) – `printf()`-like format string
... – format arguments

Module schema

enum SCHEMA

enumerator BOX_SYSTEM_ID_MIN: Start of the reserved range of system spaces.

enumerator BOX_SCHEMA_ID: Space id of _schema.

enumerator BOX_SPACE_ID: Space id of _space.

enumerator BOX_VSPACE_ID: Space id of _vspace view.

enumerator BOX_INDEX_ID: Space id of _index.

enumerator BOX_VINDEX_ID: Space id of _vindex view.

enumerator BOX_FUNC_ID: Space id of _func.

enumerator BOX_VFUNC_ID: Space id of _vfunc view.

enumerator BOX_USER_ID: Space id of _user.

enumerator BOX_VUSER_ID: Space id of _vuser view.

enumerator BOX_PRIV_ID: Space id of _priv.

enumerator BOX_VPRIV_ID: Space id of _vpriv view.

enumerator BOX_CLUSTER_ID: Space id of _cluster.

enumerator BOX_TRIGGER_ID: Space id of _trigger.

enumerator BOX_TRUNCATE_ID: Space id of _truncate.

enumerator BOX_SYSTEM_ID_MAX: End of the reserved range of system spaces.

enumerator BOX_ID_NIL: NULL value, returned on error.

Module trivia/config

API_EXPORT: Extern modifier for all public functions.

PACKAGE_VERSION_MAJOR: Package major version - 2 for 2.0.5.

PACKAGE_VERSION_MINOR: Package minor version - 0 for 2.0.5.

PACKAGE_VERSION_PATCH: Package patch version - 5 for 2.0.5.

PACKAGE_VERSION: A string with major-minor-patch-commit-id identifier of the release, e.g. 2.0.5-75-gdd8e14ffb.

SYSCONF_DIR: System configuration dir (e.g. /etc)

INSTALL_PREFIX: Install prefix (e.g. /usr)

BUILD_TYPE: Build type, e.g. Debug or Release

BUILD_INFO: CMake build type signature, e.g. Linux-x86_64-Debug

BUILD_OPTIONS: Command line used to run CMake.

COMPILER_INFO: Paths to C and CXX compilers.

TARANTOOL_C_FLAGS: C compile flags used to build Tarantool.

TARANTOOL_CXX_FLAGS: CXX compile flags used to build Tarantool.

MODULE_LIBDIR: A path to install *.so/*.dylib module files.

MODULE_LUADIR: A path to install *.lua module files.

MODULE_INCLUDEDIR: A path to Lua includes (the same directory where this file is contained)

MODULE_LUAPATH: A constant added to package.path in Lua to find *.lua module files.

MODULE_LIBPATH: A constant added to package.cpath in Lua to find *.so module files.
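The relation between the PACKAGE_VERSION string and the three numeric version macros can be sketched with a small parser. The helper below is hypothetical (demo_parse_version is not part of the Tarantool API) and assumes the string starts with major.minor.patch, as in the example value 2.0.5-75-gdd8e14ffb given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: extracts the leading "major.minor.patch" numbers
 * from a version string. Returns 0 on success, -1 if the string does
 * not start with three dot-separated integers.
 */
static int
demo_parse_version(const char *version, int *major, int *minor, int *patch)
{
	assert(version != NULL);
	assert(major != NULL && minor != NULL && patch != NULL);
	if (sscanf(version, "%d.%d.%d", major, minor, patch) != 3)
		return -1;
	return 0;
}
```

For the example string, the three numbers obtained this way are the values the document lists for PACKAGE_VERSION_MAJOR, PACKAGE_VERSION_MINOR, and PACKAGE_VERSION_PATCH.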

Module tuple

type box_tuple_format_t

box_tuple_format_t *box_tuple_format_default(void)

Tuple format.

Each tuple has an associated format (class). The default format is used to create tuples which are not attached to any particular space.

type box_tuple_t: Tuple

box_tuple_t *box_tuple_new(box_tuple_format_t *format, const char *tuple, const char *tuple_end)

Allocate and initialize a new tuple from raw MsgPack Array data.

Parameters:
format (box_tuple_format_t*) – tuple format. Use box_tuple_format_default() to create a space-independent tuple.
tuple (const char*) – tuple data in MsgPack Array format ([field1, field2, …])
tuple_end (const char*) – the end of the tuple data
Returns:	NULL on out of memory
Returns:	tuple otherwise

Module txn

bool box_txn(void): Return true if there is an active transaction.

int box_txn_begin(void)

Begin a transaction in the current fiber.

A transaction is attached to the caller fiber, therefore one fiber can have only one active transaction. See also box.begin().

Returns:	0 on success
Returns:	-1 on error. Perhaps a transaction has already been started.

int box_txn_commit(void)

Commit the current transaction. See also box.commit().

Returns:	0 on success
Returns:	-1 on error. Perhaps a disk write failure

void box_txn_rollback(void): Roll back the current transaction. See also box.rollback().

box_txn_savepoint_t *box_txn_savepoint(void): Return a descriptor of a savepoint.

void box_txn_rollback_to_savepoint(box_txn_savepoint_t *savepoint): Roll back the current transaction as far as the specified savepoint.

void *box_txn_alloc(size_t size)

Allocate memory on the txn memory pool.

The memory is automatically deallocated when the transaction is committed or rolled back.

Returns:	NULL on out of memory

Read views: C API

Enterprise Edition

This API is available in the Enterprise Edition only.

This topic describes the C API for working with read views. The C API is MT-safe and provides the ability to use a read view from any thread, not only from the main (TX) thread.

The C API has the following specifics:

The space.upgrade function is not applied to retrieved tuples even if a space upgrade is in progress.
Tuples stored in compressed spaces are not decompressed - they are returned as raw MessagePack (MP_EXT/MP_COMPRESSION).

Note

You can learn how to call C code using stored procedures in the C tutorial.

Data types

The opaque data types below represent raw read views and an iterator over data in a raw read view. Note that there is no special data type for tuples retrieved from a read view. Tuples are returned as raw MessagePack data (const char *).

typedef box_raw_read_view box_raw_read_view_t: A raw database read view.

typedef box_raw_read_view_space box_raw_read_view_space_t: A space in a raw read view.

typedef box_raw_read_view_index box_raw_read_view_index_t: An index in a raw read view.

typedef box_raw_read_view_iterator box_raw_read_view_iterator_t: An iterator over data in a raw read view.

Creating and destroying read views

To create or destroy a read view, use the functions below.

box_raw_read_view_t *box_raw_read_view_new(const char *name)

Open a raw read view with the specified name and get a pointer to this read view. In the case of error, returns NULL and sets box_error_last(). This function may be called from the main (TX) thread only.

Parameters:
name (const char*) – (optional) a read view name; if `name` is not specified, the read view name is set to `unknown`
Returns:	a pointer to a read view

void box_raw_read_view_delete(box_raw_read_view_t *rv)

Close a raw read view and release all resources associated with it. This function may be called from the main (TX) thread only.

Parameters:
rv (box_raw_read_view_t*) – a pointer to a read view

Note

Read views created using box_raw_read_view_new are displayed in box.read_view.list() along with read views created in Lua.

Spaces and indexes

To fetch data from a read view, you need to specify an index to fetch the data from. The following functions are available for looking up spaces and indexes in a read view object.

box_raw_read_view_space_t *box_raw_read_view_space_by_id(const box_raw_read_view_t *rv, uint32_t space_id)

Find a space by ID in a raw read view. If not found, returns NULL and sets box_error_last().

Parameters:
rv (const box_raw_read_view_t*) – a pointer to a read view
space_id (uint32_t) – a space identifier
Returns:	a pointer to a space

box_raw_read_view_space_t *box_raw_read_view_space_by_name(const box_raw_read_view_t *rv, const char *space_name, uint32_t space_name_len)

Find a space by name in a raw read view. If not found, returns NULL and sets box_error_last().

Parameters:
rv (const box_raw_read_view_t*) – a pointer to a read view
space_name (const char*) – a space name
space_name_len (uint32_t) – a space name length
Returns:	a pointer to a space

box_raw_read_view_index_t *box_raw_read_view_index_by_id(const box_raw_read_view_space_t *space, uint32_t index_id)

Find an index by ID in a read view’s space. If not found, returns NULL and sets box_error_last().

Parameters:
space (const box_raw_read_view_space_t*) – a pointer to a read view’s space
index_id (uint32_t) – an index identifier
Returns:	a pointer to an index

box_raw_read_view_index_t *box_raw_read_view_index_by_name(const box_raw_read_view_space_t *space, const char *index_name, uint32_t index_name_len)

Find an index by name in a read view’s space. If not found, returns NULL and sets box_error_last().

Parameters:
space (const box_raw_read_view_space_t*) – a pointer to a space
index_name (const char*) – an index name
index_name_len (uint32_t) – an index name length
Returns:	a pointer to an index

Iteration and lookup

The functions below provide the ability to look up a tuple by the key or create an iterator over a read view index.

Note

Methods of the read view iterator are safe to call from any thread, but an iterator may be used by only one thread at a time. This means that an iterator should be thread-local.

int box_raw_read_view_get(const box_raw_read_view_index_t *index, const char *key, const char *key_end, const char **data, uint32_t *size)

Look up a tuple in a read view’s index. If found, the data and size out arguments return a pointer to and the size of tuple data. If not found, *data is set to NULL and *size is set to 0.

Parameters:
index (const box_raw_read_view_index_t*) – a pointer to a read view’s index
key (const char*) – a pointer to the first byte of the MsgPack data that represents the search key
key_end (const char*) – a pointer to the byte following the last byte of the MsgPack data that represents the search key
data (const char**) – a pointer to the tuple data
size (uint32_t*) – the size of tuple data
Returns:	0 on success; in the case of error, returns -1 and sets box_error_last()

int box_raw_read_view_iterator_create(box_raw_read_view_iterator_t *it, const box_raw_read_view_index_t *index, int type, const char *key, const char *key_end)

Create an iterator over a raw read view index. The initialized iterator object returned by this function remains valid and may be safely used until it’s destroyed or the read view is closed. When the iterator object is no longer needed, it should be destroyed using box_raw_read_view_iterator_destroy().

Parameters:
it (box_raw_read_view_iterator_t*) – an iterator over a raw read view index
index (const box_raw_read_view_index_t*) – a pointer to a read view index
type (int) – an iteration direction represented by the iterator_type
key (const char*) – a pointer to the first byte of the MsgPack data that represents the search key
key_end (const char*) – a pointer to the byte following the last byte of the MsgPack data that represents the search key
Returns:	0 on success; in the case of error, returns -1 and sets box_error_last()

int box_raw_read_view_iterator_next(box_raw_read_view_iterator_t *it, const char **data, uint32_t *size)

Retrieve the current tuple and advance the given iterator over a raw read view index. The pointer to and the size of tuple data are returned in the data and the size out arguments. The data returned by this function remains valid and may be safely used until the read view is closed.

Parameters:
it (box_raw_read_view_iterator_t*) – an iterator over a read view index
data (const char**) – a pointer to the tuple data; at the end of iteration, `data` is set to `NULL`
size (uint32_t*) – the size of tuple data; at the end of iteration, `size` is set to `0`
Returns:	`0` on success; in the case of error, returns `-1` and sets box_error_last()

void box_raw_read_view_iterator_destroy(box_raw_read_view_iterator_t *it)

Destroy an iterator over a raw read view index. The iterator object should not be used after calling this function, but the data returned by the iterator may be safely dereferenced until the read view is closed.

Parameters:
it (box_raw_read_view_iterator_t*) – an iterator over a read view index

Space format

The functions below provide the ability to get the names and types of the fields of a read view space.

uint32_t box_raw_read_view_space_field_count(const box_raw_read_view_space_t *space)

Get the number of fields defined in the format of a read view space.

Parameters:
space (const box_raw_read_view_space_t*) – a pointer to a read view space
Returns:	the number of fields

const char *box_raw_read_view_space_field_name(const box_raw_read_view_space_t *space, uint32_t field_no)

Get the name of a field defined in the format of a read view space. If the field number is greater than the total number of fields defined in the format, NULL is returned. The string returned by this function is guaranteed to remain valid until the read view is closed.

Parameters:
space (const box_raw_read_view_space_t*) – a pointer to a read view space
field_no (uint32_t) – the field number (starts with `0`)
Returns:	the name of a field

const char *box_raw_read_view_space_field_type(const box_raw_read_view_space_t *space, uint32_t field_no)

Get the type of a field defined in the format of a read view space. If the field number is greater than the total number of fields defined in the format, NULL is returned. The string returned by this function is guaranteed to remain valid until the read view is closed.

Parameters:
space (const box_raw_read_view_space_t*) – a pointer to a read view space
field_no (uint32_t) – the field number (starts with `0`)
Returns:	the type of a field

Binary protocol

This section provides information on the Tarantool binary protocol, iproto. The protocol is called “binary” because the database is most frequently accessed via binary code instead of Lua request text. Tarantool experts use it:

to write their own connectors
to understand network messages
to support new features that their favorite connector doesn’t support yet
to avoid repetitive parsing by the server

The binary protocol provides complete access to Tarantool functionality, including:

request multiplexing, that is, the ability to issue multiple requests asynchronously via the same connection
response format that supports zero-copy writes

Note

Since version 2.11.0, you can use the box.iproto submodule to access IPROTO constants and features from Lua. The submodule lets you send arbitrary IPROTO packets over the session’s socket and override the behavior for all IPROTO request types. The IPROTO_UNKNOWN constant was also introduced; it is used with the box.iproto.override() API to set a handler for incoming requests with an unknown type.

Understanding the binary protocol

Overview

To communicate with each other, Tarantool instances use a binary protocol called iproto.

In this set of examples, the user will be looking at binary code transferred via iproto. The code is intercepted with tcpdump, a monitoring utility.

Examples

To follow the examples in this section, get a single Linux computer and start three command-line shells (“terminals”).

On terminal #1, start monitoring port 3302 with tcpdump:

sudo tcpdump -i lo 'port 3302' -X

On terminal #2, start a server with:

box.cfg{listen=3302}
box.schema.space.create('tspace')
box.space.tspace:create_index('I')
box.space.tspace:insert{280}
box.schema.user.grant('guest','read,write,execute,create,drop','universe')

On terminal #3, start another server, which will act as a client, with:

box.cfg{}
net_box = require('net.box')
conn = net_box.connect('localhost:3302')

IPROTO_SELECT

On terminal #3, run the following:

conn.space.tspace:select(280)

Now look at what tcpdump shows for the job connecting to 3302 – the “request”. After the words “length 32” is a packet that ends with these 32 bytes (we have added indented comments):

ce 00 00 00 1b   MP_UINT = decimal 27 = number of bytes after this
82               MP_MAP, size 2 (we'll call this "Main-Map")
01                 IPROTO_SYNC (Main-Map Item#1)
04                 MP_INT = 4 = number that gets incremented with each request
00                 IPROTO_REQUEST_TYPE (Main-Map Item#2)
01                 IPROTO_SELECT
86                 MP_MAP, size 6 (we'll call this "Select-Map")
10                   IPROTO_SPACE_ID (Select-Map Item#1)
cd 02 00             MP_UINT = decimal 512 = id of tspace (could be larger)
11                   IPROTO_INDEX_ID (Select-Map Item#2)
00                   MP_INT = 0 = id of index within tspace
14                   IPROTO_ITERATOR (Select-Map Item#3)
00                   MP_INT = 0 = Tarantool iterator_type.h constant ITER_EQ
13                   IPROTO_OFFSET (Select-Map Item#4)
00                   MP_INT = 0 = amount to offset
12                   IPROTO_LIMIT (Select-Map Item#5)
ce ff ff ff ff       MP_UINT = 4294967295 = biggest possible limit
20                   IPROTO_KEY (Select-Map Item#6)
91                   MP_ARRAY, size 1 (we'll call this "Key-Array")
cd 01 18               MP_UINT = 280 (Select-Map Item#6, Key-Array Item#1)
                       -- 280 is the key value that we are searching for

Now read the source code file net_box.c and skip to the line netbox_encode_select(lua_State *L). From the comments and from simple function calls like mpstream_encode_uint(&stream, IPROTO_SPACE_ID); you will be able to see how net_box put together the packet contents that you have just observed with tcpdump.

There are libraries for reading and writing MessagePack objects. C programmers sometimes include msgpuck.h.

Now you know how Tarantool itself makes requests with the binary protocol. When in doubt about a detail, consult net_box.c – it has routines for each request. Some connectors have similar code.
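The 32-byte request above can also be assembled by hand. The following Python sketch is only an illustration: the helper names (`fixint`, `uint16`, `uint32`, `build_select`) are hypothetical, the key codes come from the tables later in this document, and the space id and key are assumed to fit the 16-bit MP_UINT form, as they do in the example.

```python
import struct

# iproto codes, taken from the key and request tables in this document
IPROTO_REQUEST_TYPE = 0x00
IPROTO_SYNC = 0x01
IPROTO_SELECT = 0x01
IPROTO_SPACE_ID = 0x10
IPROTO_INDEX_ID = 0x11
IPROTO_LIMIT = 0x12
IPROTO_OFFSET = 0x13
IPROTO_ITERATOR = 0x14
IPROTO_KEY = 0x20


def fixint(n: int) -> bytes:
    """MessagePack positive fixint: a single byte for values 0..127."""
    if not 0 <= n <= 0x7F:
        raise ValueError("fixint holds 0..127 only")
    return bytes([n])


def uint16(n: int) -> bytes:
    """MessagePack uint 16: tag 0xcd followed by a big-endian 16-bit value."""
    return b"\xcd" + struct.pack(">H", n)


def uint32(n: int) -> bytes:
    """MessagePack uint 32: tag 0xce followed by a big-endian 32-bit value."""
    return b"\xce" + struct.pack(">I", n)


def build_select(sync: int, space_id: int, key: int) -> bytes:
    """Build size + header + body for a select by one 16-bit integer key."""
    header = (
        b"\x82"  # MP_MAP, size 2 (Main-Map)
        + fixint(IPROTO_SYNC) + fixint(sync)
        + fixint(IPROTO_REQUEST_TYPE) + fixint(IPROTO_SELECT)
    )
    body = (
        b"\x86"  # MP_MAP, size 6 (Select-Map)
        + fixint(IPROTO_SPACE_ID) + uint16(space_id)
        + fixint(IPROTO_INDEX_ID) + fixint(0)
        + fixint(IPROTO_ITERATOR) + fixint(0)       # 0 = EQ
        + fixint(IPROTO_OFFSET) + fixint(0)
        + fixint(IPROTO_LIMIT) + uint32(0xFFFFFFFF)
        + fixint(IPROTO_KEY) + b"\x91" + uint16(key)  # Key-Array with 1 item
    )
    # The size prefix counts the header and body, not itself.
    return uint32(len(header) + len(body)) + header + body


packet = build_select(sync=4, space_id=512, key=280)
print(packet.hex(" "))
```

Comparing the printed bytes with the tcpdump output is a quick way to check that a connector encodes requests the same way net_box does.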

IPROTO_UPDATE

For an IPROTO_UPDATE example, suppose a user changes field #2 of the tuple with primary-key value 2 in space #256 to 'BBBB'. The body will look like this. Notice the extra map item IPROTO_INDEX_BASE, which emphasizes that field numbers start with 1; it is optional and can be omitted:

04               IPROTO_UPDATE
85               MP_MAP, size 5
10                 IPROTO_SPACE_ID, Map Item#1
cd 01 00           MP_UINT 256
11                 IPROTO_INDEX_ID, Map Item#2
00                 MP_INT 0 = primary-key index number
15                 IPROTO_INDEX_BASE, Map Item#3
01                 MP_INT = 1 i.e. field numbers start at 1
21                 IPROTO_TUPLE, Map Item#4
91                 MP_ARRAY, size 1, for array of operations
93                   MP_ARRAY, size 3
a1 3d                  MP_STR = OPERATOR = '='
02                     MP_INT = FIELD_NO = 2
a4 42 42 42 42         MP_STR = VALUE = 'BBBB'
20                 IPROTO_KEY, Map Item#5
91                 MP_ARRAY, size 1, for array of key values
02                   MP_UINT = primary-key value = 2
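The update body described above (space 256, field 2, value 'BBBB', key 2) can be produced with a very small MessagePack encoder. This is a minimal sketch, not a replacement for a real MessagePack library: `pack` is a hypothetical helper that handles only non-negative integers, short strings, and short lists and maps.

```python
import struct


def pack(obj) -> bytes:
    """Encode a small MessagePack subset: non-negative int, str, list, dict."""
    if isinstance(obj, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(obj, int):
        if obj < 0:
            raise ValueError("negative integers are outside this sketch")
        if obj <= 0x7F:
            return bytes([obj])                      # positive fixint
        if obj <= 0xFF:
            return b"\xcc" + bytes([obj])            # uint 8
        if obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)  # uint 16
        if obj <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", obj)  # uint 32
        return b"\xcf" + struct.pack(">Q", obj)      # uint 64
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) > 31:
            raise ValueError("only fixstr (up to 31 bytes) is supported")
        return bytes([0xA0 | len(raw)]) + raw
    if isinstance(obj, list):
        if len(obj) > 15:
            raise ValueError("only fixarray (up to 15 items) is supported")
        return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("only fixmap (up to 15 items) is supported")
        out = bytes([0x80 | len(obj)])
        for key, value in obj.items():  # dicts keep insertion order
            out += pack(key) + pack(value)
        return out
    raise TypeError(f"unsupported type: {type(obj).__name__}")


update_body = pack({
    0x10: 256,                  # IPROTO_SPACE_ID
    0x11: 0,                    # IPROTO_INDEX_ID (primary key)
    0x15: 1,                    # IPROTO_INDEX_BASE (field numbers start at 1)
    0x21: [["=", 2, "BBBB"]],   # IPROTO_TUPLE: array of update operations
    0x20: [2],                  # IPROTO_KEY: primary-key value
})
print(update_body.hex(" "))
```

Each operation is itself a three-item array (operator, field number, value), which is why IPROTO_TUPLE holds an array of arrays.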

IPROTO_EXECUTE

Byte codes for the IPROTO_EXECUTE example:

0b               IPROTO_EXECUTE
83               MP_MAP, size 3
43                 IPROTO_STMT_ID Map Item#1
ce d7 aa 74 1b     MP_UINT value of n.stmt_id
41                 IPROTO_SQL_BIND Map Item#2
92                 MP_ARRAY, size 2
01                   MP_INT = 1 = value for first parameter
a1 61                MP_STR = 'a' = value for second parameter
2b                 IPROTO_OPTIONS Map Item#3
90                 MP_ARRAY, size 0 (there are no options)

IPROTO_INSERT

Byte codes for the response to the box.space.space-name:insert{6} example:

ce 00 00 00 20                MP_UINT = HEADER AND BODY SIZE
83                            MP_MAP, size 3
00                              IPROTO_REQUEST_TYPE
ce 00 00 00 00                  MP_UINT = IPROTO_OK
01                              IPROTO_SYNC
cf 00 00 00 00 00 00 00 53      MP_UINT = sync value
05                              IPROTO_SCHEMA_VERSION
ce 00 00 00 68                  MP_UINT = schema version
81                            MP_MAP, size 1
30                              IPROTO_DATA
dd 00 00 00 01                  MP_ARRAY, size 1 (row count)
91                              MP_ARRAY, size 1 (field count)
06                              MP_INT = 6 = the value that was inserted
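To see how the three sections of this response fit together, the bytes can be walked with a small decoder. The sketch below is an illustration only; `unpack` is a hypothetical helper that understands just the MessagePack formats that occur in this example (fixint, unsigned integers, fixstr, fixmap, fixarray, and array 32).

```python
import struct

# Tag byte -> (struct format, payload size) for the unsigned integer formats
_UINT_FORMATS = {0xCC: (">B", 1), 0xCD: (">H", 2), 0xCE: (">I", 4), 0xCF: (">Q", 8)}


def _unpack_array(buf, pos, count):
    items = []
    for _ in range(count):
        item, pos = unpack(buf, pos)
        items.append(item)
    return items, pos


def _unpack_map(buf, pos, count):
    result = {}
    for _ in range(count):
        key, pos = unpack(buf, pos)
        value, pos = unpack(buf, pos)
        result[key] = value
    return result, pos


def unpack(buf: bytes, pos: int = 0):
    """Decode one object starting at pos; return (value, next_pos)."""
    tag = buf[pos]
    pos += 1
    if tag <= 0x7F:                      # positive fixint
        return tag, pos
    if tag <= 0x8F:                      # fixmap
        return _unpack_map(buf, pos, tag & 0x0F)
    if tag <= 0x9F:                      # fixarray
        return _unpack_array(buf, pos, tag & 0x0F)
    if tag <= 0xBF:                      # fixstr
        length = tag & 0x1F
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if tag in _UINT_FORMATS:             # uint 8/16/32/64
        fmt, size = _UINT_FORMATS[tag]
        return struct.unpack_from(fmt, buf, pos)[0], pos + size
    if tag == 0xDD:                      # array 32
        (count,) = struct.unpack_from(">I", buf, pos)
        return _unpack_array(buf, pos + 4, count)
    raise ValueError(f"unsupported MessagePack tag 0x{tag:02x}")


response = bytes.fromhex(
    "ce 00 00 00 20 "
    "83 00 ce 00 00 00 00 01 cf 00 00 00 00 00 00 00 53 05 ce 00 00 00 68 "
    "81 30 dd 00 00 00 01 91 06"
)
size, pos = unpack(response)          # size of header + body
header, pos = unpack(response, pos)   # IPROTO_REQUEST_TYPE, IPROTO_SYNC, IPROTO_SCHEMA_VERSION
body, pos = unpack(response, pos)     # IPROTO_DATA -> rows -> fields
print(size, header, body)
```

The decoded size equals the number of bytes that follow the size prefix, which is the check suggested in the Size section of this document.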

IPROTO_EVAL

Byte codes for the response to the conn:eval([[box.schema.space.create('_space');]]) example:

ce 00 00 00 3b                  MP_UINT = HEADER AND BODY SIZE
83                              MP_MAP, size 3 (i.e. 3 items in header)
   00                              IPROTO_REQUEST_TYPE
   ce 00 00 80 0a                  MP_UINT = hexadecimal 800a
   01                              IPROTO_SYNC
   cf 00 00 00 00 00 00 00 26      MP_UINT = sync value
   05                              IPROTO_SCHEMA_VERSION
   ce 00 00 00 78                  MP_UINT = schema version value
   81                              MP_MAP, size 1
     31                              IPROTO_ERROR_24
     db 00 00 00 1d 53 70 61 63 etc. MP_STR = "Space '_space' already exists"

Creating a table with IPROTO_EXECUTE

Suppose we use the same net.box connection that we used in the beginning and run:
conn:execute([[CREATE TABLE t1 (dd INT PRIMARY KEY AUTOINCREMENT, дд STRING COLLATE "unicode");]])
conn:execute([[INSERT INTO t1 VALUES (NULL, 'a'), (NULL, 'b');]])
If we watch what tcpdump displays, we will see two noticeable things: (1) the CREATE statement caused a schema change, so the response has a new IPROTO_SCHEMA_VERSION value and the body includes the new contents of some system tables (caused by requests from net.box which users will not see); (2) the final bytes of the response to the INSERT will be:

 MP_MAP, size 1
   IPROTO_SQL_INFO
   MP_MAP, size 2
     Tarantool constant (not in iproto_constants.h) = SQL_INFO_ROW_COUNT
     2 = row count
     Tarantool constant (not in iproto_constants.h) = SQL_INFO_AUTOINCREMENT_IDS
     MP_ARRAY, size 2
       first autoincrement number
       second autoincrement number

SELECT with SQL

Byte codes for the SQL SELECT example. Suppose we ask for full metadata by saying
conn.space._session_settings:update('sql_full_metadata', {{'=', 'value', true}})
and then select the two rows from the table that we just created:
conn:execute([[SELECT dd, дд AS д FROM t1;]])
Then tcpdump will show this response, after the header:

                     MP_MAP, size 2 (i.e. metadata and rows)
                       IPROTO_METADATA
                       MP_ARRAY, size 2 (i.e. 2 columns)
                         MP_MAP, size 5 (i.e. 5 items for column#1)
a2 44 44                    IPROTO_FIELD_NAME and 'DD'
a7 69 6e 74 65 67 65 72     IPROTO_FIELD_TYPE and 'integer'
c2                          IPROTO_FIELD_IS_NULLABLE and false
c3                          IPROTO_FIELD_IS_AUTOINCREMENT and true
c0                          IPROTO_FIELD_SPAN and nil
                         MP_MAP, size 5 (i.e. 5 items for column#2)
a2 d0 94                    IPROTO_FIELD_NAME and 'Д' upper case
a6 73 74 72 69 6e 67        IPROTO_FIELD_TYPE and 'string'
a7 75 6e 69 63 6f 64 65     IPROTO_FIELD_COLL and 'unicode'
c3                          IPROTO_FIELD_IS_NULLABLE and true
a4 d0 b4 d0 b4              IPROTO_FIELD_SPAN and 'дд' lower case
                       IPROTO_DATA
                       MP_ARRAY, size 2
                         MP_ARRAY, size 2
                           MP_INT = 1 i.e. contents of row#1 column#1
a1 61                          MP_STR = 'a' i.e. contents of row#1 column#2
                         MP_ARRAY, size 2
                           MP_INT = 2 i.e. contents of row#2 column#1
a1 62                          MP_STR = 'b' i.e. contents of row#2 column#2

IPROTO_PREPARE

Byte code for the SQL PREPARE example. If we said
conn:prepare([[SELECT dd, дд AS д FROM t1;]])
then tcpdump would show almost the same response, but there would be no IPROTO_DATA. Instead, additional items will appear:

                     IPROTO_BIND_COUNT
                     MP_UINT = 0

                     IPROTO_BIND_METADATA
                     MP_ARRAY, size 0

MP_UINT = 0 and MP_ARRAY has size 0 because there are no parameters to bind. Full output:

                     MP_MAP, size 4
                       IPROTO_STMT_ID
ce c2 3c 2c 1e             MP_UINT = statement id
                       IPROTO_BIND_COUNT
                       MP_INT = 0 = number of parameters to bind
                       IPROTO_BIND_METADATA
                       MP_ARRAY, size 0 = there are no parameters to bind
                       IPROTO_METADATA
                       MP_ARRAY, size 2 (i.e. 2 columns)
                         MP_MAP, size 5 (i.e. 5 items for column#1)
a2 44 44                    IPROTO_FIELD_NAME and 'DD'
a7 69 6e 74 65 67 65 72     IPROTO_FIELD_TYPE and 'integer'
c2                          IPROTO_FIELD_IS_NULLABLE and false
c3                          IPROTO_FIELD_IS_AUTOINCREMENT and true
c0                          IPROTO_FIELD_SPAN and nil
                         MP_MAP, size 5 (i.e. 5 items for column#2)
a2 d0 94                    IPROTO_FIELD_NAME and 'Д' upper case
a6 73 74 72 69 6e 67        IPROTO_FIELD_TYPE and 'string'
a7 75 6e 69 63 6f 64 65     IPROTO_FIELD_COLL and 'unicode'
c3                          IPROTO_FIELD_IS_NULLABLE and true
a4 d0 b4 d0 b4              IPROTO_FIELD_SPAN and 'дд' lower case

Heartbeat

Byte codes for the heartbeat example. The master might send this message (header and body):

                    MP_MAP, size 3
                      Main-Map Item #1 IPROTO_REQUEST_TYPE
                        MP_UINT = 0
                      Main-Map Item #2 IPROTO_REPLICA_ID
                        MP_UINT = 2 = id
                      Main-Map Item #3 IPROTO_TIMESTAMP
cb                          MP_DOUBLE (MessagePack "Float 64")
d7 ba 06 7b 3a 03 21     8-byte timestamp
                    MP_MAP (body), size 1
5a                      Body Map Item #1 IPROTO_VCLOCK_SYNC
                      MP_UINT = 20 (vclock sync value)

Byte codes for the heartbeat example. The replica might send back this message (header and body):

                     MP_MAP, size 1
                       Main-Map Item #1 IPROTO_REQUEST_TYPE
                       MP_UINT = 0 = IPROTO_OK
                       MP_MAP (body), size 3
                         Body Map Item #1 IPROTO_VCLOCK
                           MP_MAP, size 1 (vclock of 1 component)
                             MP_UINT = 1 = id (part 1 of vclock)
                             MP_UINT = 6 = lsn (part 2 of vclock)
5a                           Body Map Item #2 IPROTO_VCLOCK_SYNC
                           MP_UINT = 20 (vclock sync value)
                         Body Map Item #3 IPROTO_TERM
                           MP_UINT = 49 (term value)

MP_* MessagePack types

The binary protocol handles data in the MessagePack format. Short descriptions of the basic MessagePack data types are on MessagePack’s specification page. Tarantool also introduces several MessagePack type extensions.

In this document, MessagePack types are described by words that start with MP_. See this table:

MP_NIL	nil
MP_UINT	unsigned integer
MP_INT	either integer or unsigned integer
MP_STR	string
MP_BIN	binary string
MP_ARRAY	array
MP_MAP	map
MP_BOOL	boolean
MP_FLOAT	float
MP_DOUBLE	double
MP_EXT	extension
MP_OBJECT	any MessagePack object
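When reading a tcpdump capture, the first byte of each MessagePack object is enough to tell which MP_* type follows. The sketch below maps a first byte to a name from the table above, using the byte ranges of the MessagePack specification. The function name `mp_type` is hypothetical, and the labels follow the wire format, so a non-negative value written in an unsigned format is reported as MP_UINT even where this document calls the field MP_INT.

```python
def mp_type(first_byte: int) -> str:
    """Return the MP_* family for the first byte of a MessagePack object."""
    b = first_byte
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte value")
    if b <= 0x7F:
        return "MP_UINT"    # positive fixint
    if b <= 0x8F:
        return "MP_MAP"     # fixmap
    if b <= 0x9F:
        return "MP_ARRAY"   # fixarray
    if b <= 0xBF:
        return "MP_STR"     # fixstr
    if b == 0xC0:
        return "MP_NIL"
    if b == 0xC1:
        raise ValueError("0xc1 is never used in MessagePack")
    if b <= 0xC3:
        return "MP_BOOL"    # 0xc2 = false, 0xc3 = true
    if b <= 0xC6:
        return "MP_BIN"     # bin 8/16/32
    if b <= 0xC9:
        return "MP_EXT"     # ext 8/16/32
    if b == 0xCA:
        return "MP_FLOAT"   # float 32
    if b == 0xCB:
        return "MP_DOUBLE"  # float 64
    if b <= 0xCF:
        return "MP_UINT"    # uint 8/16/32/64
    if b <= 0xD3:
        return "MP_INT"     # int 8/16/32/64
    if b <= 0xD8:
        return "MP_EXT"     # fixext 1/2/4/8/16
    if b <= 0xDB:
        return "MP_STR"     # str 8/16/32
    if b <= 0xDD:
        return "MP_ARRAY"   # array 16/32
    if b <= 0xDF:
        return "MP_MAP"     # map 16/32
    return "MP_INT"         # negative fixint, 0xe0..0xff


for byte in (0xCE, 0x82, 0x91, 0xA1, 0xC3, 0xCB):
    print(f"0x{byte:02x} -> {mp_type(byte)}")
```

MP_OBJECT has no byte range of its own: it stands for any of the types above.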

Request and response format

The types referred to in this document are MessagePack types. For their definitions, see the MP_* MessagePack types section.

Packet structure

Requests and responses have similar structure. They contain three sections: size, header, and body.

It is legal to put more than one request in a packet.

Size

The size is an MP_UINT – unsigned integer, usually 32-bit. It is the size of the header plus the size of the body. It may be useful to compare it with the number of bytes remaining in the packet.
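Because several requests may share one packet, a reader typically uses the size prefix to cut a byte buffer into messages and keeps any incomplete tail for the next read. The following is a minimal sketch under that assumption; `read_size` and `split_messages` are hypothetical names, and the sample bytes are hand-made IPROTO_PING requests (type 0x40) with sync values 7 and 8.

```python
import struct

# Tag byte -> struct format for the MessagePack unsigned integer formats
_SIZE_FORMATS = {0xCC: ">B", 0xCD: ">H", 0xCE: ">I", 0xCF: ">Q"}


def read_size(buf: bytes, pos: int):
    """Read the MP_UINT size prefix at pos; return (size, payload_start)."""
    tag = buf[pos]
    if tag <= 0x7F:  # positive fixint
        return tag, pos + 1
    fmt = _SIZE_FORMATS.get(tag)
    if fmt is None:
        raise ValueError(f"size prefix is not an MP_UINT: 0x{tag:02x}")
    (size,) = struct.unpack_from(fmt, buf, pos + 1)
    return size, pos + 1 + struct.calcsize(fmt)


def split_messages(buf: bytes):
    """Return (messages, leftover).

    Each message is the header + body that follows one size prefix.
    leftover holds trailing bytes of a message that is not complete yet.
    """
    messages = []
    pos = 0
    while pos < len(buf):
        try:
            size, start = read_size(buf, pos)
        except struct.error:
            break  # the size prefix itself is truncated
        if start + size > len(buf):
            break  # fewer bytes remain than the size announces
        messages.append(buf[start:start + size])
        pos = start + size
    return messages, buf[pos:]


ping_a = bytes.fromhex("ce 00 00 00 05 82 00 40 01 07")
ping_b = bytes.fromhex("ce 00 00 00 05 82 00 40 01 08")
messages, leftover = split_messages(ping_a + ping_b + ping_b[:6])
print(len(messages), leftover.hex(" "))
```

Comparing the announced size with the bytes that remain, as the text suggests, is what lets the loop stop cleanly on a partial message.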

Header

The header is an MP_MAP. It may contain, in any order:

Both the request and response use the IPROTO_REQUEST_TYPE key. It denotes the type of the packet.
The request and the matching response have the same sync number (IPROTO_SYNC).
IPROTO_SCHEMA_VERSION is an optional key that indicates whether there was a major change in the schema.
In interactive transactions, every stream is identified by a unique IPROTO_STREAM_ID.

In case of replicating synchronous transactions, the header also contains the IPROTO_FLAGS key.

Encoding and decoding

To see how Tarantool encodes the header, have a look at file xrow.c, function xrow_header_encode.

To see how Tarantool decodes the header, have a look at file net_box.c, function netbox_decode_data.

For example, in a successful response to box.space:select(), the IPROTO_REQUEST_TYPE value is 0 = IPROTO_OK and the array has all the tuples of the result.

Read the source code file net_box.c where the function decode_metadata_optional is an example of how Tarantool itself decodes extra items.

Body

The body is an MP_MAP. The maximum iproto packet body length is 2 GiB.

The body has the details of the request or response. In a request, it can also be absent or be an empty map; both cases are interpreted the same way. Responses always contain a body, even for an IPROTO_PING request, where it is an empty MP_MAP.

Many responses contain the IPROTO_DATA key:

For most data-access requests (IPROTO_SELECT, IPROTO_INSERT, IPROTO_DELETE, etc.) the body is an IPROTO_DATA map with an array of tuples that contain an array of fields.

IPROTO_DATA is what we get with net_box and the buffer module. If we were using net_box, we could decode it with msgpack.decode_unchecked(), or convert it to a string with ffi.string(pointer,length). The pickle.unpack() function might also be helpful.

Note

For SQL-specific requests and responses, the body is a bit different. Learn more about this type of packet.

Error responses

Instead of IPROTO_OK, an error response header has IPROTO_REQUEST_TYPE = IPROTO_TYPE_ERROR. Its code is 0x8XXX, where XXX is the error code – a value in src/box/errcode.h. src/box/errcode.h also has some convenience macros which define hexadecimal constants for return codes.

The error response body is a map that contains two keys: IPROTO_ERROR and IPROTO_ERROR_24. While IPROTO_ERROR contains an MP_MAP value, IPROTO_ERROR_24 contains a string. The two keys are provided to accommodate clients with older and newer Tarantool versions.

Error responses before 2.4.1

Before Tarantool v. 2.4.1, the key IPROTO_ERROR contained a string and was identical to the current IPROTO_ERROR_24 key.

Let’s consider an example. This is the fifth message, and the request was to create a duplicate space with conn:eval([[box.schema.space.create('_space');]]). The actual byte codes of the unsuccessful response are shown in the IPROTO_EVAL example of the tutorial Understanding the binary protocol.

Looking in errcode.h, we find that the error code 0x0a (decimal 10) is ER_SPACE_EXISTS, and the string associated with ER_SPACE_EXISTS is “Space ‘%s’ already exists”.
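The check described above can be sketched in a few lines. This is an illustration under two assumptions that are marked in the code: the error marker is taken to be the single bit 0x8000 (the "8" in 0x8XXX), and the error code is taken to be everything below that bit. The function name and the one-entry name table are hypothetical.

```python
IPROTO_OK = 0x00
IPROTO_TYPE_ERROR = 0x8000  # assumption: the error marker is the "8" in 0x8XXX

# Only the one code discussed in this section; the full list is in errcode.h.
ERROR_NAMES = {0x0A: "ER_SPACE_EXISTS"}


def split_response_type(response_type: int):
    """Return (is_error, error_code) for a response's IPROTO_REQUEST_TYPE."""
    if response_type & IPROTO_TYPE_ERROR:
        # assumption: the error code occupies the bits below the marker
        return True, response_type & (IPROTO_TYPE_ERROR - 1)
    return False, None


is_error, code = split_response_type(0x800A)
print(is_error, code, ERROR_NAMES.get(code, "unknown"))
```

With the value 0x800a from the IPROTO_EVAL example, the sketch yields error code 10, which is the ER_SPACE_EXISTS code discussed above.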

Since version 2.4.1, responses for errors have extra information following what was described above. This extra information is given via the MP_ERROR extension type. See details in the MessagePack extensions section.

Keys used in requests and responses

This section describes iproto keys contained in requests and responses. The keys are Tarantool constants that are either defined or mentioned in the iproto_constants.h file.

While the keys themselves are unsigned 8-bit integers, their values can have different types.

Basic description

General

Name	Code and value type	Description
IPROTO_VERSION	0x54 MP_UINT	Binary protocol version supported by the client
IPROTO_FEATURES	0x55 MP_ARRAY	Supported binary protocol features
IPROTO_SYNC	0x01 MP_UINT	Unique request identifier
IPROTO_SCHEMA_VERSION	0x05 MP_UINT	Version of the database schema
IPROTO_TIMESTAMP	0x04 MP_DOUBLE	Time in seconds since the Unix epoch
IPROTO_REQUEST_TYPE	0x00 MP_UINT	Request type or response type
IPROTO_ERROR	0x52 MP_ERROR	Error response
IPROTO_ERROR_24	0x31 MP_STR	Error as a string
IPROTO_DATA	0x30 MP_OBJECT	Data passed in the transaction. Can be empty. Used in all requests and responses
IPROTO_SPACE_ID	0x10 MP_UINT	Space identifier
IPROTO_INDEX_ID	0x11 MP_UINT	Index identifier
IPROTO_TUPLE	0x21 MP_ARRAY	Tuple, arguments, operations, or authentication pair. See details
IPROTO_KEY	0x20 MP_ARRAY	Array of index keys in the request. See space_object:select()
IPROTO_LIMIT	0x12 MP_UINT	Maximum number of tuples to return
IPROTO_OFFSET	0x13 MP_UINT	Number of tuples to skip in the select
IPROTO_ITERATOR	0x14 MP_UINT	Iterator type
IPROTO_INDEX_BASE	0x15 MP_UINT	Indicates whether the first field number is 1 or 0
IPROTO_FUNCTION_NAME	0x22 MP_STR	Name of the called function. Used in IPROTO_CALL
IPROTO_USER_NAME	0x23 MP_STR	User name. Used in IPROTO_AUTH
IPROTO_OPS	0x28 MP_ARRAY	Array of operations. Used in IPROTO_UPSERT
IPROTO_EXPR	0x27 MP_STR	Command argument. Used in IPROTO_EVAL
IPROTO_AUTH_TYPE	0x5b MP_STR	A protocol used to generate user authentication data
IPROTO_AFTER_POSITION	0x2e MP_STR	The position of a tuple after which space_object:select() starts the search
IPROTO_AFTER_TUPLE	0x2f MP_ARRAY	A tuple after which space_object:select() starts the search
IPROTO_FETCH_POSITION	0x1f MP_BOOL	If true, space_object:select() returns the position of the last selected tuple
IPROTO_POSITION	0x35 MP_STR	If `IPROTO_FETCH_POSITION` is true, returns a base64-encoded string representing the position of the last selected tuple

Streams

Name	Code and value type	Description
IPROTO_STREAM_ID	0x0a MP_UINT	Unique stream identifier
IPROTO_TIMEOUT	0x56 MP_DOUBLE	Timeout in seconds, after which the transactions are rolled back
IPROTO_TXN_ISOLATION	0x59 MP_UINT	Transaction isolation level

General replication

Name	Code and value type	Description
IPROTO_REPLICA_ID	0x02 MP_INT	Replica ID
IPROTO_INSTANCE_UUID	0x24 MP_STR	Instance UUID
IPROTO_VCLOCK	0x26 MP_MAP	The instance’s vclock
IPROTO_VCLOCK_SYNC	0x5a MP_UINT	ID of the vclock synchronization request. Since 2.11
IPROTO_REPLICASET_UUID	0x25 MP_STR	Replica set UUID. Before Tarantool version 2.11, IPROTO_REPLICASET_UUID was called IPROTO_CLUSTER_UUID.
IPROTO_LSN	0x03 MP_UINT	Log sequence number of the transaction
IPROTO_TSN	0x08 MP_UINT	Transaction sequence number
IPROTO_BALLOT_IS_RO_CFG	0x01 MP_BOOL	True if the instance is configured as read_only. Since 2.6.1
IPROTO_BALLOT_VCLOCK	0x02 MP_MAP	Current vclock of the instance
IPROTO_BALLOT_GC_VCLOCK	0x03 MP_MAP	Vclock of the instance’s oldest WAL entry
IPROTO_BALLOT_IS_RO	0x04 MP_BOOL	True if the instance is not writable: configured as read_only, has orphan status, or is a Raft follower. Since 2.6.1
IPROTO_BALLOT_IS_ANON	0x05 MP_BOOL	True if the replica is anonymous. Corresponds to replication.anon. Since 2.7.1
IPROTO_BALLOT_IS_BOOTED	0x06 MP_BOOL	True if the instance has finished its bootstrap or recovery process. Since 2.7.3, 2.8.2, 2.10.0
IPROTO_BALLOT_CAN_LEAD	0x07 MP_BOOL	True if box.cfg.election_mode is `candidate` or `manual`. Since v. 2.7.3 and 2.8.2
IPROTO_BALLOT_BOOTSTRAP_LEADER_UUID	0x08 MP_STR	UUID of the bootstrap leader. The UUID is encoded as a 36-byte string. Since v. 2.11
IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS	0x09 MP_ARRAY	An array of MP_STR elements that contains the UUIDs of members registered in the replica set. Each UUID is encoded as a 36-byte string. Since v. 2.11
IPROTO_BALLOT_INSTANCE_NAME	0x0a MP_STR	The name of the instance. Since v. 3.0
IPROTO_FLAGS	0x09 MP_UINT	Auxiliary data to indicate the last transaction message state. Included in the header of any DML request that is recorded in the WAL.
IPROTO_SERVER_VERSION	0x06 MP_UINT	Tarantool version of the subscribing node, in a compact representation
IPROTO_REPLICA_ANON	0x50 MP_BOOL	Optional key used in SUBSCRIBE request. True if the subscribing replica is anonymous
IPROTO_ID_FILTER	0x51 MP_ARRAY	Optional key used in SUBSCRIBE request, followed by an array of ids of instances whose rows won’t be relayed to the replica. Since v. 2.10.0
IPROTO_REPLICASET_NAME	0x5c MP_STR	Optional key used to pass the instance’s replica set name in SUBSCRIBE requests.
IPROTO_INSTANCE_NAME	0x5d MP_STR	Optional key used to pass the initiator instance name in JOIN, SUBSCRIBE, and REGISTER requests.

Synchronous replication

Name	Code and value type	Description
IPROTO_TERM	0x53 MP_UINT	RAFT term on an instance
IPROTO_RAFT_TERM	0x00 MP_UINT	RAFT term on an instance. The key is only used for requests with the IPROTO_RAFT type.
IPROTO_RAFT_VOTE	0x01 MP_UINT	Instance vote in the current term (if any)
IPROTO_RAFT_STATE	0x02 MP_UINT	RAFT state. Possible values: `1` – follower, `2` – candidate, `3` – leader
IPROTO_RAFT_VCLOCK	0x03 MP_MAP	Current vclock of the instance
IPROTO_RAFT_LEADER_ID	0x04 MP_UINT	Current leader node ID as seen by the node that issues the request. Since version 2.10.0
IPROTO_RAFT_IS_LEADER_SEEN	0x05 MP_BOOL	True if the node has a direct connection to the leader node. Since version 2.10.0

All IPROTO_RAFT_* keys are used only in IPROTO_RAFT* requests.

Events and subscriptions

Name	Code and value type	Description
IPROTO_EVENT_KEY	0x57 MP_STR	Event key name
IPROTO_EVENT_DATA	0x58 MP_OBJECT	Event data sent to a remote watcher

Learn more about events and subscriptions in iproto.

SQL-specific

These keys are used with SQL within SQL-specific requests and responses like IPROTO_EXECUTE and IPROTO_PREPARE.

Name	Code and value type	Description
IPROTO_SQL_TEXT	0x40 MP_STR	SQL statement text
IPROTO_STMT_ID	0x43 MP_INT	Identifier of the prepared statement
IPROTO_OPTIONS	0x2b MP_ARRAY	SQL transaction options. Usually empty
IPROTO_METADATA	0x32 MP_ARRAY of MP_MAP items	SQL transaction metadata
IPROTO_FIELD_NAME	0x00 MP_STR	Field name. Nested in IPROTO_METADATA
IPROTO_FIELD_TYPE	0x01 MP_STR	Field type. Nested in IPROTO_METADATA
IPROTO_FIELD_COLL	0x02 MP_STR	Field collation. Nested in IPROTO_METADATA
IPROTO_FIELD_IS_NULLABLE	0x03 MP_BOOL	True if the field is nullable. Nested in IPROTO_METADATA.
IPROTO_FIELD_IS_AUTOINCREMENT	0x04 MP_BOOL	True if the field is auto-incremented. Nested in IPROTO_METADATA.
IPROTO_FIELD_SPAN	0x05 MP_STR or MP_NIL	Original expression under SELECT. Nested in IPROTO_METADATA. See box.execute()
IPROTO_BIND_METADATA	0x33 MP_ARRAY	Bind variable names and types
IPROTO_BIND_COUNT	0x34 MP_INT	Number of parameters to bind
IPROTO_SQL_BIND	0x41 MP_ARRAY	Parameter values to match ? placeholders or :name placeholders
IPROTO_SQL_INFO	0x42 MP_MAP	Additional SQL-related parameters
SQL_INFO_ROW_COUNT	0x00 MP_UINT	Number of changed rows. Is `0` for statements that do not change rows. Nested in IPROTO_SQL_INFO
SQL_INFO_AUTOINCREMENT_IDS	0x01 MP_ARRAY of MP_UINT items	New primary key value (or values) for an INSERT in a table defined with PRIMARY KEY AUTOINCREMENT. Nested in IPROTO_SQL_INFO

Details on individual keys

IPROTO_VERSION

Code: 0x54.

IPROTO_VERSION is an integer number reflecting the version of protocol that the client supports. The latest IPROTO_VERSION is 10.

IPROTO_FEATURES

Code: 0x55.

Available IPROTO_FEATURES are the following:

IPROTO_FEATURE_STREAMS = 0 – streams support: IPROTO_STREAM_ID in the request header.
IPROTO_FEATURE_TRANSACTIONS = 1 – transaction support: IPROTO_BEGIN, IPROTO_COMMIT, and IPROTO_ROLLBACK commands (with IPROTO_STREAM_ID in the request header). Learn more about sending transaction commands.
IPROTO_FEATURE_ERROR_EXTENSION = 2 – MP_ERROR MsgPack extension support. Clients that don’t support this feature receive error responses for IPROTO_EVAL and IPROTO_CALL encoded to string error messages.
IPROTO_FEATURE_WATCHERS = 3 – remote watchers support: IPROTO_WATCH, IPROTO_UNWATCH, and IPROTO_EVENT commands.
IPROTO_FEATURE_INSERT_ARROW = 12 – support of data insertion in the Arrow format. Learn more about the feature. Available since version 3.3.0.

IPROTO_SYNC

Code: 0x01.

This is an unsigned integer that should be incremented so that it is unique in every request. This integer is also returned from box.session.sync().

The IPROTO_SYNC value of a response is the same as the IPROTO_SYNC value of the request that it answers.
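Because responses carry the sync value of their request, a client that multiplexes requests over one connection can keep a table of pending requests keyed by IPROTO_SYNC. The class below is a hypothetical sketch of that bookkeeping only; it does no network I/O, and the response header is represented as an already decoded map.

```python
import itertools

IPROTO_SYNC = 0x01


class PendingRequests:
    """Track in-flight requests by their IPROTO_SYNC value."""

    def __init__(self):
        self._next_sync = itertools.count(1)  # incremented for every request
        self._pending = {}

    def register(self, description: str) -> int:
        """Reserve a unique sync value for a request about to be sent."""
        sync = next(self._next_sync)
        self._pending[sync] = description
        return sync

    def resolve(self, response_header: dict) -> str:
        """Find and forget the request that a decoded response header answers."""
        return self._pending.pop(response_header[IPROTO_SYNC])


pending = PendingRequests()
first = pending.register("select from tspace")
second = pending.register("ping")

# Responses may arrive in a different order than the requests were sent.
answered = pending.resolve({IPROTO_SYNC: second})
print(answered)
```

A response whose sync value is not in the table raises KeyError here; a real connector would decide how to report such a case.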

IPROTO_SCHEMA_VERSION

Code: 0x05.

Version of the database schema – an unsigned number that goes up when there is a major change in the schema.

In a request header, IPROTO_SCHEMA_VERSION is optional, so the version will not be checked if it is absent.

In a response header, IPROTO_SCHEMA_VERSION is always present, and it is up to the client to check if it has changed.

IPROTO_ITERATOR

Code: 0x14.

Possible values (see iterator_type.h):

`0`	EQ
`1`	REQ
`2`	ALL, all tuples
`3`	LT, less than
`4`	LE, less than or equal
`5`	GE, greater than or equal
`6`	GT, greater than
`7`	BITS_ALL_SET, all bits of the value are set in the key
`8`	BITS_ANY_SET, at least one bit of the value is set
`9`	BITS_ALL_NOT_SET, no bits are set
`10`	OVERLAPS, overlaps the rectangle or box
`11`	NEIGHBOR, neighbors the rectangle or box

IPROTO_STREAM_ID

Code: 0x0a.

Only used in streams. This is an unsigned number that should be unique in every stream.

In requests, IPROTO_STREAM_ID is useful for two things: ensuring that requests within transactions are done in separate groups, and ensuring strictly consistent execution of requests (whether or not they are within transactions).

In responses, IPROTO_STREAM_ID does not appear.

See Binary protocol – streams.

IPROTO_TXN_ISOLATION

IPROTO_TXN_ISOLATION is the transaction isolation level. It can take the following values:

TXN_ISOLATION_DEFAULT = 0 – use the default level from box.cfg (default value)
TXN_ISOLATION_READ_COMMITTED = 1 – read changes that are committed but not confirmed yet
TXN_ISOLATION_READ_CONFIRMED = 2 – read confirmed changes
TXN_ISOLATION_BEST_EFFORT = 3 – determine isolation level automatically

See Binary protocol – streams to learn more about stream transactions in the binary protocol.

IPROTO_REQUEST_TYPE

Code: 0x00.

The key is used both in requests and responses. It indicates the request or response type and has any request or response name for the value (example: IPROTO_AUTH). See requests and responses for client-server communication, replication, events and subscriptions, streams and interactive transactions.

IPROTO_ERROR

Code: 0x52.

In case of error, the response body contains IPROTO_ERROR and IPROTO_ERROR_24 instead of IPROTO_DATA.

To learn more about error responses, check the section Request and response format.

IPROTO_ERROR_24

Code: 0x31.

IPROTO_ERROR_24 is used in Tarantool versions before 2.4.1. The key contains the error in the string format.

Since Tarantool 2.4.1, Tarantool packs errors as the MP_ERROR MessagePack extension, which includes extra information. Two keys are passed in the error response body: IPROTO_ERROR and IPROTO_ERROR_24.

To learn more about error responses, check the section Request and response format.

IPROTO_TUPLE

Code: 0x21.

Multiple operations make use of this key in different ways:

IPROTO_INSERT, IPROTO_REPLACE, IPROTO_UPSERT	Tuple to be inserted
IPROTO_UPDATE	Operations to perform
IPROTO_AUTH	Array of 2 fields: authentication mechanism and scramble, encrypted according to the specified mechanism. See more on the authentication sequence.
IPROTO_CALL, IPROTO_EVAL	Array of arguments

IPROTO_FLAGS

Code: 0x09.

When it comes to replicating synchronous transactions, the IPROTO_FLAGS key is included in the header. The key contains an MP_UINT value of one or more bits:

IPROTO_FLAG_COMMIT (0x01) is set if this is the last message for a transaction.
IPROTO_FLAG_WAIT_SYNC (0x02) is set if this is the last message for a transaction which cannot be completed immediately.
IPROTO_FLAG_WAIT_ACK (0x04) is set if this is the last message for a synchronous transaction.
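Since the flags are individual bits, one MP_UINT value can carry several of them at once. The sketch below models the three bits listed above with Python's `IntFlag`; the class and function names are hypothetical.

```python
from enum import IntFlag


class IprotoFlag(IntFlag):
    """Bits of the IPROTO_FLAGS header key, as listed in this section."""
    COMMIT = 0x01
    WAIT_SYNC = 0x02
    WAIT_ACK = 0x04


def describe_flags(value: int):
    """Return the names of all flag bits set in an IPROTO_FLAGS value."""
    return [flag.name for flag in IprotoFlag if value & flag]


# 0x05 = COMMIT | WAIT_ACK
print(describe_flags(0x05))
```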

IPROTO_TERM

Code: 0x53.

The key is used in the IPROTO_RAFT_PROMOTE and IPROTO_RAFT_DEMOTE requests.
Since version 2.11, the key is included in response to a heartbeat message. The term corresponds to the value of box.info.synchro.queue.term on the sender instance.

Vclock keys

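The vclock (vector clock) is a log sequence number map that defines the version of the dataset stored on the node. In fact, it represents the number of logical operations executed on a specific node. In the binary protocol, a vclock is an MP_MAP in which each key is a replica ID and each value is the LSN for that replica, as in the heartbeat example in Understanding the binary protocol.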

There are five keys that correspond to vector clocks in different contexts of replication. They all have the MP_MAP type, except IPROTO_VCLOCK_SYNC, which is an MP_UINT:

IPROTO_VCLOCK (0x26) is passed to a new instance joining the replica set.
IPROTO_VCLOCK_SYNC (0x5a) is used by replication heartbeats. The master sends its heartbeats, including this monotonically growing key, to a replica. Once the replica receives a heartbeat with a non-zero IPROTO_VCLOCK_SYNC value, it starts responding with the same value in all its acknowledgements. This key was introduced in version 2.11.
IPROTO_BALLOT_VCLOCK (0x02) is included in the IPROTO_BALLOT message. IPROTO_BALLOT is sent in response to the IPROTO_VOTE request. This key was introduced in version 2.6.1.
IPROTO_BALLOT_GC_VCLOCK (0x03) is also included in the IPROTO_BALLOT message. IPROTO_BALLOT is sent in response to the IPROTO_VOTE request. It is the vclock of the oldest WAL entry on the instance. Corresponds to box.info.gc().vclock. This key was introduced in version 2.6.1.
IPROTO_RAFT_VCLOCK (0x03) is included in the IPROTO_RAFT message. It is present only on the instances in the “candidate” state (IPROTO_RAFT_STATE == 2).

IPROTO_BALLOT keys

All IPROTO_BALLOT_* keys are only used in the IPROTO_BALLOT requests. There have been the following name changes starting with versions 2.7.3, 2.8.2, and 2.10.0:

IPROTO_BALLOT_IS_RO_CFG (0x01) was formerly called IPROTO_BALLOT_IS_RO.
IPROTO_BALLOT_IS_RO (0x04) was formerly called IPROTO_BALLOT_IS_LOADING.

IPROTO_METADATA

Code: 0x32.

Used with SQL within IPROTO_EXECUTE.

The key contains an array of column maps. Each column map contains at least IPROTO_FIELD_NAME (0x00) with an MP_STR value and IPROTO_FIELD_TYPE (0x01) with an MP_STR value.

Additionally, if sql_full_metadata in the _session_settings system space is TRUE, then each column map has additional items, which correspond to components described in the box.execute() section.

IPROTO_SQL_BIND

Code: 0x41.

Used with SQL within IPROTO_EXECUTE.

IPROTO_SQL_BIND is an array of parameter values to match ? placeholders or :name placeholders. It can contain values of any type, including MP_MAP.

Values that are not MP_MAP replace the ? placeholders in the request.

MP_MAP values must have the format {[name] = value}, where name is the named parameter in the request. Here is an example of such a request:

tarantool> conn:execute('SELECT ?, ?, :name1, ?, :name2, :name1', {1, 2, {[':name1'] = 5}, 'str', {[':name2'] = true}})
---
- metadata:
  - name: COLUMN_1
    type: integer
  - name: COLUMN_2
    type: integer
  - name: COLUMN_3
    type: integer
  - name: COLUMN_4
    type: text
  - name: COLUMN_5
    type: boolean
  - name: COLUMN_6
    type: integer
  rows:
  - [1, 2, 5, 'str', true, 5]

Client-server requests and responses

This section describes client requests, their arguments, and the values returned by the server.

Some requests are described on separate pages. Those are the requests related to:

Overview

Name	Code	Description
IPROTO_OK	0x00 MP_UINT	Successful response
IPROTO_CHUNK	0x80 MP_UINT	Out-of-band response
IPROTO_TYPE_ERROR	0x8XXX MP_INT	Error response
IPROTO_UNKNOWN	-1 MP_UINT	An unknown request type
IPROTO_SELECT	0x01	Select request
IPROTO_INSERT	0x02	Insert request
IPROTO_REPLACE	0x03	Replace request
IPROTO_UPDATE	0x04	Update request
IPROTO_UPSERT	0x09	Upsert request
IPROTO_DELETE	0x05	Delete request
IPROTO_CALL	0x0a	Function remote call (conn:call())
IPROTO_AUTH	0x07	Authentication request
IPROTO_EVAL	0x08	Evaluate a Lua expression (conn:eval())
IPROTO_NOP	0x0c	Increment the LSN and do nothing else
IPROTO_INSERT_ARROW	0x11	Iproto Insert Arrow data request. Available since version 3.3.0.
IPROTO_PING	0x40	Ping (conn:ping())
IPROTO_ID	0x49	Share iproto version and supported features

IPROTO_OK

Code: 0x00.

This request/response type is contained in the header and signifies success. Here is an example:

IPROTO_CHUNK

Code: 0x80.

If the response is out-of-band, due to use of box.session.push(), then IPROTO_REQUEST_TYPE is IPROTO_CHUNK instead of IPROTO_OK.

IPROTO_TYPE_ERROR

Code: 0x8XXX (see below).

Instead of IPROTO_OK, an error response header has 0x8XXX for IPROTO_REQUEST_TYPE. XXX is the error code – a value in src/box/errcode.h. src/box/errcode.h also has some convenience macros which define hexadecimal constants for return codes.

To learn more about error responses, check the section Request and response format.
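The three response types above can be told apart from the IPROTO_REQUEST_TYPE value alone. The following Python sketch assumes the header has already been decoded and the value is available as an integer. The function name is hypothetical, and masking with 0x7fff to recover the error code is an assumption based on the 0x8XXX layout described above.

```python
IPROTO_OK = 0x00
IPROTO_CHUNK = 0x80
IPROTO_TYPE_ERROR = 0x8000  # base value of the 0x8XXX range


def classify_response(request_type: int):
    """Return (kind, error_code) for a decoded IPROTO_REQUEST_TYPE value."""
    if request_type == IPROTO_OK:
        return ("ok", None)
    if request_type == IPROTO_CHUNK:
        return ("chunk", None)
    if request_type & IPROTO_TYPE_ERROR:
        # Assumption: the low 15 bits carry the code from src/box/errcode.h.
        return ("error", request_type & 0x7FFF)
    return ("unknown", None)
```

A client reading pushed messages would keep reading while the kind is "chunk" and stop at the first "ok" or "error".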

IPROTO_UNKNOWN

Since 2.11.0.

Code: -1.

An unknown request type. The constant is used to override the handler of unknown IPROTO request types. Learn more: box.iproto.override() and box_iproto_override.

IPROTO_INSERT

Code: 0x02.

See space_object:insert(). The body is a 2-item map:

For example, if the request is INSERT INTO table-name VALUES (1), (2), (3), then the response body contains an IPROTO_SQL_INFO map with SQL_INFO_ROW_COUNT = 3. SQL_INFO_ROW_COUNT can be 0 for statements that do not change rows, but can be 1 for statements that create new objects.

Example

If the ID of tspace is 512 and this is the fifth message, conn.space.tspace:insert{1, 'AAA'} will produce the following request and response packets:

The tutorial Understanding the binary protocol shows actual byte codes of the response to the IPROTO_INSERT message.

IPROTO_REPLACE

Code: 0x03.

See space_object:replace(). The body is a 2-item map, the same as for IPROTO_INSERT:

IPROTO_UPDATE

Code: 0x04.

See space_object:update().

The body is usually a 4-item map:

Examples

If the operation specifies no values, then IPROTO_TUPLE is a 2-item array:

Normally field numbers start with 1.

If the operation specifies one value, then IPROTO_TUPLE is a 3-item array:

Otherwise IPROTO_TUPLE is a 5-item array:

If the ID of tspace is 512 and this is the fifth message, conn.space.tspace:update(999, {{'=', 2, 'B'}}) will cause the following request packet:

The map item IPROTO_INDEX_BASE is optional.

The tutorial Understanding the binary protocol shows the actual byte codes of an IPROTO_UPDATE message.

IPROTO_UPSERT

Code: 0x09.

See space_object:upsert().

The body is usually a 4-item map:

IPROTO_OPS is the array of operations. It is the same as the IPROTO_TUPLE of IPROTO_UPDATE.

IPROTO_TUPLE is an array of primary-key field values.

IPROTO_DELETE

Code: 0x05.

See space_object:delete(). The body is a 3-item map:

IPROTO_EVAL

Code: 0x08.

See conn:eval(). Since the argument is a Lua expression, this is Tarantool’s way to handle non-binary requests with the binary protocol. Any request that does not have its own code, for example box.space.space-name:drop(), will be handled either with IPROTO_CALL or IPROTO_EVAL.

The tt administrative utility makes extensive use of eval.

The body is a 2-item map:

For IPROTO_EVAL and IPROTO_CALL the response body will usually be an array but, since Lua requests can result in a wide variety of structures, bodies can have a wide variety of structures.

Note

For SQL-specific responses, the body is a bit different. Learn more in the section SQL-specific requests and responses.

Example

If this is the fifth message, conn:eval('return 5;') will cause:

IPROTO_CALL

Code: 0x0a.

See conn:call(). This is a remote stored-procedure call. Tarantool 1.6 and earlier used the IPROTO_CALL_16 request (code: 0x06). It is now deprecated and superseded by IPROTO_CALL.

The body is a 2-item map. The response will be a list of values, similar to the IPROTO_EVAL response. The return from conn:call is whatever the function returns.

Note

For SQL-specific responses, the body is a bit different. Learn more in the section SQL-specific requests and responses.

IPROTO_AUTH

Code: 0x07.

For general information, see the Access control section in the administrator’s guide.

For more on how authentication is handled in the binary protocol, see the Authentication section of this document.

The client sends an authentication packet as an IPROTO_AUTH message:

IPROTO_USERNAME holds the user name. IPROTO_TUPLE must be an array of 2 fields: authentication mechanism and scramble, encrypted according to the specified mechanism.

The server instance responds to an authentication packet with a standard response with 0 tuples.

To see how Tarantool handles this, look at the netbox_encode_auth function in net_box.c.

IPROTO_NOP

Code: 0x0c.

There is no Lua request exactly equivalent to IPROTO_NOP. It causes the LSN to be incremented. It can sometimes be used for updates where the old and new values are the same, but the LSN must be increased because a data change must be recorded. The request has no body.

IPROTO_INSERT_ARROW

Since version 3.3.0.

Code: 0x11.

The body is a 2-item map:

IPROTO_PING

Code: 0x40.

See conn:ping(). The body will be an empty map because IPROTO_PING in the header contains all the information that the server instance needs.

IPROTO_ID

Code: 0x49.

Clients send this message to inform the server about the protocol version and features they support. Based on this information, the server can enable or disable certain features in interacting with these clients.

The body is a 2-item map:

The response body has the same structure as the request body. It informs the client about the protocol version, features supported by the server, and a protocol used to generate user authentication data.

IPROTO_ID requests can be processed without authentication.

Session start and authentication

Every iproto session begins with a greeting and optional authentication.

Greeting message

When a client connects to the server instance, the instance responds with a 128-byte text greeting message, not in MsgPack format:

Tarantool <version> (<protocol>) <instance-uuid>
<salt>

For example:

Tarantool 2.10.0 (Binary) 29b74bed-fdc5-454c-a828-1d4bf42c639a
QK2HoFZGXTXBq2vFj7soCsHqTo6PGTF575ssUBAJLAI=

The greeting contains two 64-byte lines of ASCII text. Each line ends with a newline character (\n). If the line content is less than 64 bytes long, the rest of the line is padded with characters that have an ASCII code of 0, which are not displayed in the console.

The first line contains the instance version and protocol type. The second line contains the session salt – a base64-encoded random string, which is usually 44 bytes long. The salt is used in the authentication packet – the IPROTO_AUTH message.
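The greeting layout described above can be sketched in Python as follows. The function name parse_greeting is hypothetical, and the sketch strips both NUL bytes and spaces as padding, which is an assumption made to stay tolerant of either padding style.

```python
import base64

GREETING_SIZE = 128
LINE_SIZE = 64


def parse_greeting(raw: bytes):
    """Split a 128-byte greeting into its first line and the decoded salt."""
    if len(raw) != GREETING_SIZE:
        raise ValueError("greeting must be exactly 128 bytes")
    first, second = raw[:LINE_SIZE], raw[LINE_SIZE:]
    # Drop the newline and any padding (NUL bytes or spaces) at the end of each line.
    banner = first.rstrip(b"\n\x00 ").decode("ascii")
    encoded_salt = second.rstrip(b"\n\x00 ")
    salt = base64.b64decode(encoded_salt)
    return banner, salt
```

The returned salt is the decoded form; a client that follows the scramble procedure uses only part of it.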

Authentication

If authentication is skipped, then the session user is 'guest' (the 'guest' user does not need a password).

If authentication is not skipped, then at any time an authentication packet can be prepared using the greeting, the user’s name and password, and SHA-1 functions, as follows.

PREPARE SCRAMBLE:

    size_of_encoded_salt_in_greeting = 44;
    size_of_salt_after_base64_decode = 32;
    /* only the first 20 bytes of the decoded salt are passed to sha1() */
    size_of_any_sha1_digest = 20;
    size_of_scramble = 20;

prepare 'chap-sha1' scramble:

    salt = base64_decode(encoded_salt);
    step_1 = sha1(password);
    step_2 = sha1(step_1);
    step_3 = sha1(first_20_bytes_of_salt, step_2);
    scramble = xor(step_1, step_3);
    return scramble;
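The pseudocode above translates to Python roughly as follows. This is a minimal sketch: the function name is hypothetical, the password is assumed to be UTF-8 encoded, and the two-argument sha1() call is interpreted as hashing the concatenation of its inputs.

```python
import base64
import hashlib


def chap_sha1_scramble(encoded_salt: bytes, password: str) -> bytes:
    """Compute the 20-byte 'chap-sha1' scramble from the greeting salt."""
    salt = base64.b64decode(encoded_salt)
    step_1 = hashlib.sha1(password.encode("utf-8")).digest()
    step_2 = hashlib.sha1(step_1).digest()
    # sha1(first_20_bytes_of_salt, step_2): hash the concatenation of both inputs.
    step_3 = hashlib.sha1(salt[:20] + step_2).digest()
    # XOR the two 20-byte digests byte by byte.
    return bytes(a ^ b for a, b in zip(step_1, step_3))
```

A party that knows step_2 can recompute step_3, XOR it with the scramble to recover step_1, and check that its SHA-1 digest equals step_2.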

Streams

Overview

The Streams and interactive transactions feature, which was added in Tarantool version 2.10.0, allows two things: sequential processing and interleaving.

Sequential processing: With streams there is a guarantee that the server instance will not handle the next request in a stream until it has completed the previous one.

Interleaving: For example, a series of requests can include “begin for stream #1”, “begin for stream #2”, “insert for stream #1”, “insert for stream #2”, “delete for stream #1”, “commit for stream #1”, “rollback for stream #2”.

To work with stream transactions using iproto, the following is required:

The engine should be vinyl, or memtx with MVCC enabled.
The client is responsible for ensuring that the stream identifier, unsigned integer IPROTO_STREAM_ID, is in the request header. IPROTO_STREAM_ID can be any positive 64-bit number, and should be unique for the connection. If IPROTO_STREAM_ID equals zero, the server instance will ignore it.

Basic request description

Name	Code	Description
IPROTO_BEGIN	0x0e	Begin a transaction in the specified stream
IPROTO_COMMIT	0x0f	Commit the transaction in the specified stream
IPROTO_ROLLBACK	0x10	Roll back the transaction in the specified stream

IPROTO_BEGIN

Code: 0x0e.

Begin a transaction in the specified stream. See stream:begin(). The body is optional and can contain two items:

IPROTO_TIMEOUT is an optional timeout (in seconds). After it expires, the transaction will be rolled back automatically.

IPROTO_COMMIT

Code: 0x0f.

Commit the transaction in the specified stream. See stream:commit().

IPROTO_ROLLBACK

Code: 0x10.

Roll back the transaction in the specified stream. See stream:rollback().

Example

Suppose that the client has started a stream with the net.box module:

net_box = require('net.box')
conn = net_box.connect('localhost:3302')
stream = conn:new_stream()

At this point the stream object will look like a duplicate of the conn object, with just one additional member: stream_id. Now, using stream instead of conn, the client sends two requests:

stream.space.T:insert{1}
stream.space.T:insert{2}

The header and body of these requests will be the same as in non-stream IPROTO_INSERT requests, except that the header will contain an additional item: IPROTO_STREAM_ID=0x0a with MP_UINT=0x01. It happens to equal 1 for this example because each call to conn:new_stream() assigns a new number, starting with 1.

The client makes stream transactions by sending, in order:

IPROTO_BEGIN with an optional transaction timeout in the IPROTO_TIMEOUT field of the request body.
The transaction data-change and query requests.
IPROTO_COMMIT or IPROTO_ROLLBACK.

All these requests must contain the same IPROTO_STREAM_ID value.

A rollback will happen automatically if a disconnect occurs or the transaction timeout expires before the commit is possible.
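The request order for a stream transaction can be sketched as a list of header maps. This Python sketch is illustrative only: the helper name is hypothetical, the header keys IPROTO_REQUEST_TYPE (0x00) and IPROTO_SYNC (0x01) are not listed in this section and are taken from the protocol's general key list, and the maps are plain dicts rather than encoded MsgPack.

```python
IPROTO_REQUEST_TYPE = 0x00  # header key, not listed in this section
IPROTO_SYNC = 0x01          # header key, not listed in this section
IPROTO_STREAM_ID = 0x0A

IPROTO_INSERT = 0x02
IPROTO_BEGIN = 0x0E
IPROTO_COMMIT = 0x0F
IPROTO_ROLLBACK = 0x10


def stream_transaction_headers(stream_id, data_requests, first_sync=1, commit=True):
    """Return header maps for BEGIN, the data requests, then COMMIT or ROLLBACK."""
    if stream_id <= 0:
        raise ValueError("a zero stream ID is ignored by the server")
    final = IPROTO_COMMIT if commit else IPROTO_ROLLBACK
    types = [IPROTO_BEGIN, *data_requests, final]
    # Every request in the transaction carries the same IPROTO_STREAM_ID.
    return [
        {IPROTO_REQUEST_TYPE: t, IPROTO_SYNC: first_sync + i, IPROTO_STREAM_ID: stream_id}
        for i, t in enumerate(types)
    ]
```

For the two inserts in the example above, stream_transaction_headers(1, [IPROTO_INSERT, IPROTO_INSERT]) yields four headers, all with the stream ID set to 1.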

Thus there are now multiple ways to do transactions: with net_box stream:begin() and stream:commit() or stream:rollback() which cause IPROTO_BEGIN and IPROTO_COMMIT or IPROTO_ROLLBACK with the current value of stream.stream_id; with box.begin() and box.commit() or box.rollback(); with SQL and START TRANSACTION and COMMIT or ROLLBACK. An application can use any or all of these ways.

Events and subscriptions

The commands below support asynchronous server-client notifications signalled with box.broadcast(). Servers that support the new feature set the IPROTO_FEATURE_WATCHERS feature in reply to the IPROTO_ID command. When the connection is closed, all watchers registered for it are unregistered.

The remote watcher (event subscription) protocol works in the following way:

The client sends an IPROTO_WATCH packet to subscribe to the updates of a specified key defined on the server.
The server sends an IPROTO_EVENT packet to the subscribed client after registration. The packet contains the key name and its current value. After that, the packet is sent every time the key value is updated with box.broadcast(), provided that the last notification was acknowledged (see below).
After receiving the notification, the client sends an IPROTO_WATCH packet to acknowledge the notification.
If the client doesn’t want to receive any more notifications, it unsubscribes by sending an IPROTO_UNWATCH packet.

All three request types are asynchronous – the receiving end doesn’t send a packet in reply to any of them. Therefore, none of them has a sync number.
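The client side of the subscription steps above can be modelled as a small state holder. This is a Python sketch under simplifying assumptions: packets are represented as (request type, key) tuples collected in an outbox instead of being encoded and sent, and the class and method names are hypothetical.

```python
IPROTO_WATCH = 0x4A
IPROTO_UNWATCH = 0x4B
IPROTO_EVENT = 0x4C


class WatcherClient:
    """Tracks subscriptions and queues the packets a client would send."""

    def __init__(self):
        self.values = {}   # key -> last value received
        self.outbox = []   # (request_type, key) tuples to send

    def watch(self, key):
        """Subscribe to a key by queueing IPROTO_WATCH."""
        self.values.setdefault(key, None)
        self.outbox.append((IPROTO_WATCH, key))

    def on_event(self, key, data=None):
        """Handle IPROTO_EVENT: store the value, then acknowledge it."""
        if key not in self.values:
            return  # not subscribed (for example, already unsubscribed)
        self.values[key] = data
        self.outbox.append((IPROTO_WATCH, key))  # acknowledgement

    def unwatch(self, key):
        """Unsubscribe from a key by queueing IPROTO_UNWATCH."""
        self.values.pop(key, None)
        self.outbox.append((IPROTO_UNWATCH, key))
```

Note that the acknowledgement reuses IPROTO_WATCH, so the same packet type both subscribes and confirms.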

IPROTO_WATCH

Code: 0x4a.

Register a new watcher for the given notification key or confirm a notification if the watcher is already subscribed. The watcher is notified after registration. After that, the notification is sent every time the key is updated. The server doesn’t reply to the request unless it fails to parse the packet.

IPROTO_UNWATCH

Code: 0x4b.

Unregister a watcher subscribed to the given notification key. The server doesn’t reply to the request unless it fails to parse the packet.

IPROTO_EVENT

Code: 0x4c.

Sent by the server to notify a client about an update of a key.

IPROTO_EVENT_DATA contains data sent to a remote watcher. The parameter is optional; the default value is MP_NIL.

Graceful shutdown protocol

Overview

Since 2.10.0.

The graceful shutdown protocol is a mechanism that helps to prevent data loss in requests in case of a shutdown command. According to the protocol, when a server receives an os.exit() command or a SIGTERM signal, it does not exit immediately. Instead, the server first stops listening for new connections. Then it sends the shutdown packets to all connections that support the graceful shutdown protocol. When a client is notified about the upcoming server exit, it stops serving any new requests and waits for active requests to complete before closing the connections. Once all connections are terminated, the server shuts down.

The protocol uses the event subscription system. That is, the feature is available if the server supports the box.shutdown event and IPROTO_WATCH. For more information, see the reference for event watchers and the corresponding page in the Binary Protocol section.

How the graceful shutdown works

The shutdown protocol works in the following way:

First, the server receives a shutdown request. It can be either an os.exit() command or a SIGTERM signal.
Then the box.shutdown event is generated. The server broadcasts it to all subscribed remote watchers (see IPROTO_WATCH). That is, the server calls box.broadcast('box.shutdown', true) from the box.ctl.on_shutdown() trigger callback. Once this is done, the server stops listening for new connections.
From now on, the server waits until all subscribed connections are terminated.
At the same time, the client gets the box.shutdown event and shuts the connection down gracefully.
After all connections are closed, the server stops. If the connections are not closed before the timeout expires, Tarantool exits immediately. You can set the required timeout with the set_on_shutdown_timeout() function.

SQL-specific requests and responses

This section covers the IPROTO_EXECUTE and IPROTO_PREPARE requests, followed by a description of the responses.

Basic request description

Name	Code	Description
IPROTO_EXECUTE	0x0b	Execute an SQL statement (box.execute())
IPROTO_PREPARE	0x0d	Prepare an SQL statement (box.prepare())

IPROTO_EXECUTE

Code: 0x0b.

The body is a 3-item map:

Use IPROTO_STMT_ID (0x43) and statement-id (MP_INT) if executing a prepared statement. Use IPROTO_SQL_TEXT (0x40) and statement-text (MP_STR) if executing an SQL string.
IPROTO_SQL_BIND (0x41) corresponds to the array of parameter values to match ? placeholders or :name placeholders.
IPROTO_OPTIONS (0x2b) corresponds to the array of options. It is usually empty.

Example 1

Suppose we prepare a statement with two ? placeholders, and execute with two parameters, thus:

n = conn:prepare([[VALUES (?, ?);]])
conn:execute(n.stmt_id, {1,'a'})

Then the body will look like this:

The tutorial Understanding the binary protocol shows actual byte codes of the IPROTO_EXECUTE message.

To call a prepared statement with named parameters from a connector, pass the parameters within an array of maps. A client should wrap each element into a map, where the key holds the name of the parameter (with a colon) and the value holds the actual value. So, to bind :foo and :bar to 42 and 43, a client should send IPROTO_SQL_TEXT: <...>, IPROTO_SQL_BIND: [{":foo": 42}, {":bar": 43}].

If a statement has both named and non-named parameters, wrap only named ones into a map. The rest of the parameters are positional and will be substituted in order.
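The mixing rule for positional and named parameters can be sketched in Python before any MsgPack encoding takes place. The helper name and the convention of passing a named parameter as a (name, value) tuple are hypothetical and used for illustration only.

```python
def build_sql_bind(*params):
    """Build the IPROTO_SQL_BIND array from positional and named parameters.

    Positional values are passed as-is; a named parameter is passed as a
    (name, value) tuple and is wrapped into a single-item map.
    """
    bind = []
    for p in params:
        if isinstance(p, tuple):
            name, value = p
            bind.append({name: value})
        else:
            bind.append(p)
    return bind
```

For example, build_sql_bind(1, 2, (':name1', 5), 'str', (':name2', True)) produces an array with three positional values and two single-item maps, in the order given.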

Example 2

Let’s ask for full metadata and then select the two rows from a table named t1 that has columns named DD and Д:

conn.space._session_settings:update('sql_full_metadata', {{'=', 'value', true}})
conn:prepare([[SELECT dd, дд AS д FROM t1;]])

In the iproto response to this request, there would be no IPROTO_DATA and there would be two additional items:

34 00 = IPROTO_BIND_COUNT and MP_UINT = 0 (there are no parameters to bind).
33 90 = IPROTO_BIND_METADATA and MP_ARRAY, size 0 (there are no parameters to bind).

Here is what the request body looks like:

IPROTO_PREPARE

Code: 0x0d.

The body is a 1-item map:

The IPROTO_PREPARE map item is the same as the first item of the IPROTO_EXECUTE body for an SQL string.

Responses for SQL

After the header, for a response to an SQL statement, there will be a body that is slightly different from the body for non-SQL requests/responses.

Responses to SELECT, VALUES, or PRAGMA

If the SQL statement is SELECT or VALUES or PRAGMA, the response contains:

Example

Let’s ask for full metadata and then select the two rows from a table named t1 that has columns named DD and Д:

conn.space._session_settings:update('sql_full_metadata', {{'=', 'value', true}})
conn:execute([[SELECT dd, дд AS д FROM t1;]])

The response body might look like this:

The tutorial Understanding the binary protocol shows actual byte codes of responses to the above SQL messages.

Responses to other requests

If the SQL request is not SELECT or VALUES or PRAGMA, then the response body contains only IPROTO_SQL_INFO (0x42). Usually IPROTO_SQL_INFO is a map with only one item – SQL_INFO_ROW_COUNT (0x00) – which is the number of changed rows.

For example, if the request is INSERT INTO table-name VALUES (1), (2), (3), then the response body contains an IPROTO_SQL_INFO map with SQL_INFO_ROW_COUNT = 3.

The IPROTO_SQL_INFO map may contain a second item – SQL_INFO_AUTO_INCREMENT_IDS (0x01) – which is the new primary-key value (or values) for an INSERT in a table defined with PRIMARY KEY AUTOINCREMENT. In this case the MP_MAP will have two keys, and one of the two keys will be 0x01: SQL_INFO_AUTO_INCREMENT_IDS, which is an array of unsigned integers.
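Reading these two items from a decoded response body can be sketched in Python as follows. The sketch assumes the body has already been decoded from MsgPack into a dict keyed by the numeric codes; the function name is hypothetical.

```python
IPROTO_SQL_INFO = 0x42
SQL_INFO_ROW_COUNT = 0x00
SQL_INFO_AUTO_INCREMENT_IDS = 0x01


def read_sql_info(body: dict):
    """Return (row_count, auto_increment_ids) from a decoded response body."""
    info = body[IPROTO_SQL_INFO]
    row_count = info[SQL_INFO_ROW_COUNT]
    # The second item is present only for INSERTs into AUTOINCREMENT tables.
    ids = info.get(SQL_INFO_AUTO_INCREMENT_IDS, [])
    return row_count, list(ids)
```

When the second item is absent, the helper returns an empty list rather than failing.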

Replication requests and responses

This section describes internal requests and responses that happen during replication. Each of them is distinguished by the header, containing a unique IPROTO_REQUEST_TYPE value. These values and the corresponding packet body structures are considered below.

Connectors and clients do not need to send replication packets.

General

Name	Code	Description
IPROTO_JOIN	0x41	Request to join a replica set
IPROTO_SUBSCRIBE	0x42	Request to subscribe to a specific node in a replica set
IPROTO_VOTE	0x44	Request for replication
IPROTO_BALLOT	0x29	Response to IPROTO_VOTE. Used during replica set bootstrap
IPROTO_FETCH_SNAPSHOT	0x45	Fetch the master’s snapshot and start anonymous replication.
IPROTO_REGISTER	0x46	Register an anonymous replica so it is not anonymous anymore
IPROTO_JOIN_META	0x47	A request sent in response to IPROTO_JOIN or IPROTO_FETCH_SNAPSHOT before the instance initialization information
IPROTO_JOIN_SNAPSHOT	0x48	A request sent in response to IPROTO_JOIN or IPROTO_FETCH_SNAPSHOT after the instance initialization information

The master also sends heartbeat messages to the replicas. The heartbeat message’s IPROTO_REQUEST_TYPE is 0.

Below are details on individual replication requests. For synchronous replication requests, see Synchronous.

Heartbeats

Once in replication_timeout seconds, a master sends a heartbeat message to a replica, and the replica sends a response. Both messages’ IPROTO_REQUEST_TYPE is IPROTO_OK. IPROTO_TIMESTAMP is a float-64 MP_DOUBLE 8-byte timestamp.

Since version 2.11, both messages have an optional field in the body that contains the IPROTO_VCLOCK_SYNC key. The master’s heartbeat has no body if the IPROTO_VCLOCK_SYNC key is omitted.

The message from master to a replica:

The response from the replica:

The tutorial Understanding the binary protocol shows actual byte codes of the above heartbeat examples.

IPROTO_JOIN

Code: 0x41.

To join a replica set, an instance must send an initial IPROTO_JOIN request to the master instance of the replica set:

The instance that receives the request sends the following messages in response:

Its vclock:
(Optional) A sequence of requests with information required for instance initialization:
- an IPROTO_JOIN_META request
- an IPROTO_RAFT request with IPROTO_RAFT_TERM and IPROTO_RAFT_VOTE fields
- an IPROTO_RAFT_PROMOTE request
- an IPROTO_JOIN_SNAPSHOT request
This step applies if the IPROTO_SERVER_VERSION specified in the request is 2.10 or later.
A number of INSERT requests (with additional LSN and ServerID). This way, the data is updated on the instance that sent the IPROTO_JOIN request. The instance should not reply to these INSERT requests.
The new vclock’s MP_MAP in a response similar to the one above.

A number of INSERT, REPLACE, UPDATE, UPSERT, and DELETE requests. This way, the instance that is joining the replica set receives data updates that happened during the join stage.
The new vclock’s MP_MAP in a response similar to the one above.

IPROTO_FETCH_SNAPSHOT

Code: 0x45.

To join a replica set as an anonymous replica, an instance must send an initial IPROTO_FETCH_SNAPSHOT request to the master instance of the replica set:

To learn about anonymous replicas, see replication.anon.

The instance that receives the request sends the following messages in response:

Its vclock:
(Optional) A sequence of requests with information required for instance initialization:
- an IPROTO_JOIN_META request
- an IPROTO_RAFT request with IPROTO_RAFT_TERM and IPROTO_RAFT_VOTE fields
- an IPROTO_RAFT_PROMOTE request
- an IPROTO_JOIN_SNAPSHOT request
This step applies if the IPROTO_SERVER_VERSION specified in the request is 2.10 or later.
A number of INSERT requests (with additional LSN and ServerID). This way, the data is updated on the instance that sent the IPROTO_FETCH_SNAPSHOT request. The instance should not reply to these INSERT requests.
The new vclock’s MP_MAP in a response similar to the one above.

IPROTO_REGISTER

Code: 0x46.

To register an anonymous replica in a replica set so that it’s not anonymous anymore, it must send an IPROTO_REGISTER request to a master node of the replica set:

The instance that receives the request sends the following messages in response:

A number of INSERT, REPLACE, UPDATE, UPSERT, and DELETE requests. This way, the instance that is registering in the replica set receives data updates that happened since the time it fetched the snapshot.
The new vclock’s MP_MAP.

Technically, an IPROTO_FETCH_SNAPSHOT request followed by an IPROTO_REGISTER request is equivalent to IPROTO_JOIN.

IPROTO_JOIN_META

Code: 0x47.

When an instance receives an IPROTO_JOIN or IPROTO_FETCH_SNAPSHOT request, its responses include the information required for instance initialization: the current Raft term and the current state of the synchronous transaction queue. Before sending this information, the instance sends an IPROTO_JOIN_META request with an empty body:

Learn more in IPROTO_JOIN.

IPROTO_JOIN_SNAPSHOT

Code: 0x48.

An instance that has received an IPROTO_JOIN or IPROTO_FETCH_SNAPSHOT request sends an IPROTO_JOIN_SNAPSHOT request with an empty body after it completes sending the instance initialization information.

Learn more in IPROTO_JOIN.

IPROTO_VOTE

Code: 0x44.

When connecting for replication, an instance sends an IPROTO_VOTE request. It has no body:

IPROTO_VOTE is critical during replica set bootstrap. The response to this request is IPROTO_BALLOT.

IPROTO_BALLOT

Code: 0x29.

This value of IPROTO_REQUEST_TYPE indicates a message sent in response to IPROTO_VOTE (not to be confused with the key IPROTO_RAFT_VOTE).

IPROTO_BALLOT and IPROTO_VOTE are critical during replica set bootstrap. IPROTO_BALLOT corresponds to a map containing the following fields:

IPROTO_BALLOT_REGISTERED_REPLICA_UUIDS has the MP_ARRAY type. The array contains MP_STR elements.

Synchronous

Name	Code	Description
IPROTO_RAFT	0x1e	Inform that the node changed its RAFT status
IPROTO_RAFT_PROMOTE	0x1f	Wait, then choose new replication leader
IPROTO_RAFT_DEMOTE	0x20	Revoke the leader role from the instance
IPROTO_RAFT_CONFIRM	0x28	Confirm that the RAFT transactions have achieved quorum and can be committed
IPROTO_RAFT_ROLLBACK	0x29	Roll back the RAFT transactions because they haven’t achieved quorum

IPROTO_RAFT

Code: 0x1e.

A node broadcasts the IPROTO_RAFT request to all the replicas connected to it when the RAFT state of the node changes. This can be any action that changes the state, like starting a new election, bumping the term, voting for another node, becoming the leader, and so on.

If there should be a response, for example, in case of a vote request to other nodes, the response will also be an IPROTO_RAFT message. In this case, the node should be connected as a replica to another node from which the response is expected because the response is sent via the replication channel. In other words, there should be a full-mesh connection between the nodes.

IPROTO_REPLICA_ID is the ID of the replica from which the request came.

IPROTO_RAFT_PROMOTE

Code: 0x1f.

See box.ctl.promote().

In the header:

IPROTO_REPLICA_ID is the replica ID of the node that sent the request.
IPROTO_LSN is the actual LSN of the promote operation as recorded in the WAL.

In the body:

IPROTO_REPLICA_ID is the replica ID of the previous synchronous queue owner.
IPROTO_LSN is the LSN of the last operation on the previous synchronous queue owner.
IPROTO_TERM is the term in which the node that sent the request becomes the synchronous queue owner. This term corresponds to the value of box.info.synchro.queue.term on the instance.

IPROTO_RAFT_DEMOTE

Code: 0x20.

See box.ctl.demote().

In the header:

IPROTO_REPLICA_ID is the replica ID of the node that sent the request.
IPROTO_LSN is the actual LSN of the demote operation as recorded in the WAL.

In the body:

IPROTO_REPLICA_ID is the replica ID of the node that sent the request (same as the value in the header).
IPROTO_LSN is the LSN of the last synchronous transaction recorded in the node’s WAL.
IPROTO_TERM is the term in which the queue becomes empty.

IPROTO_RAFT_CONFIRM

Code: 0x28.

This message is used in replication connections between Tarantool nodes in synchronous replication. It is not supposed to be used by any client applications in their regular connections.

This message confirms that the transactions that originated from the instance with id = IPROTO_REPLICA_ID (body) have achieved quorum and can be committed, up to and including LSN = IPROTO_LSN (body).

The body is a 2-item map:

In the header:

IPROTO_REPLICA_ID is the ID of the replica that sends the confirm message.
IPROTO_LSN is the LSN of the confirmation action.

In the body:

IPROTO_REPLICA_ID is the ID of the instance from which the transactions originated.
IPROTO_LSN is the LSN up to which the transactions should be confirmed.

Prior to Tarantool v. 2.10.0, IPROTO_RAFT_CONFIRM was called IPROTO_CONFIRM.

IPROTO_RAFT_ROLLBACK

Code: 0x29.

This message is used in replication connections between Tarantool nodes in synchronous replication. It is not supposed to be used by any client applications in their regular connections.

This message says that the transactions that originated from the instance with id = IPROTO_REPLICA_ID (body) couldn’t achieve quorum for some reason and should be rolled back, down to LSN = IPROTO_LSN (body) and including it.

The body is a 2-item map:

In the header:

IPROTO_REPLICA_ID is the ID of the replica that sends the rollback message.
IPROTO_LSN is the LSN of the rollback action.

In the body:

IPROTO_REPLICA_ID is the ID of the instance from which the transactions originated.
IPROTO_LSN is the LSN starting with which all pending synchronous transactions should be rolled back.

Prior to Tarantool v. 2.10.0, IPROTO_RAFT_ROLLBACK was called IPROTO_ROLLBACK.
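The LSN boundaries used by IPROTO_RAFT_CONFIRM and IPROTO_RAFT_ROLLBACK can be illustrated with a toy model in Python. This is not how a Tarantool node stores its synchronous queue; the pending transactions are represented as (replica_id, lsn) tuples and both function names are hypothetical.

```python
def apply_confirm(pending, replica_id, lsn):
    """Split pending (replica_id, lsn) entries into (committed, still_pending).

    Entries from replica_id are committed up to and including lsn.
    """
    committed = [t for t in pending if t[0] == replica_id and t[1] <= lsn]
    rest = [t for t in pending if t not in committed]
    return committed, rest


def apply_rollback(pending, replica_id, lsn):
    """Split pending entries into (rolled_back, still_pending).

    Entries from replica_id are rolled back starting with lsn, inclusive.
    """
    rolled_back = [t for t in pending if t[0] == replica_id and t[1] >= lsn]
    rest = [t for t in pending if t not in rolled_back]
    return rolled_back, rest
```

In both cases the replica ID and LSN come from the message body, not from the header.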

MessagePack extensions

Tarantool uses predefined MessagePack extension types to represent some of the special values. Extension types include MP_DECIMAL, MP_UUID, MP_ERROR, MP_DATETIME, and MP_INTERVAL. These types require special attention from the connector developers, as they must be treated separately from the default MessagePack types, and correctly mapped to programming language types.

The DECIMAL type

The MessagePack EXT type MP_EXT together with the extension type MP_DECIMAL is a header for values of the DECIMAL type.

MP_DECIMAL type is 1.

The MessagePack specification defines two kinds of extension types:

fixext 1/2/4/8/16 types have fixed length so the length is not encoded explicitly.
ext 8/16/32 types require the data length to be encoded.

MP_EXT + optional length imply using one of these types.

The decimal MessagePack representation looks like this:

+--------+-------------------+------------+===============+
| MP_EXT | length (optional) | MP_DECIMAL | PackedDecimal |
+--------+-------------------+------------+===============+

Here length is the length of the PackedDecimal field. When encoded explicitly (that is, when the type is ext 8/16/32), it is of type MP_UINT.
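Choosing between the fixext and ext forms depends only on the payload length. The Python sketch below follows the MessagePack format bytes (0xd4 to 0xd8 for fixext 1/2/4/8/16, 0xc7 to 0xc9 for ext 8/16/32); the function name is hypothetical and only positive extension type codes are handled.

```python
import struct

FIXEXT_HEADERS = {1: 0xD4, 2: 0xD5, 4: 0xD6, 8: 0xD7, 16: 0xD8}


def ext_header(ext_type: int, length: int) -> bytes:
    """Return the MP_EXT marker, the length (if explicit), and the type byte."""
    if length in FIXEXT_HEADERS:
        # Fixed-length forms: the length is implied by the marker byte.
        return bytes([FIXEXT_HEADERS[length], ext_type])
    if length <= 0xFF:
        return struct.pack(">BBB", 0xC7, length, ext_type)   # ext 8
    if length <= 0xFFFF:
        return struct.pack(">BHB", 0xC8, length, ext_type)   # ext 16
    return struct.pack(">BIB", 0xC9, length, ext_type)       # ext 32
```

For MP_DECIMAL (type 1), a 4-byte payload gets the two-byte header d6 01, while a 3-byte payload needs the explicit form c7 03 01.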

PackedDecimal has the following structure:

 <--- length bytes -->
+-------+=============+
| scale |     BCD     |
+-------+=============+

Here scale is either MP_INT or MP_UINT.
scale = number of digits after the decimal point

BCD is a sequence of bytes representing decimal digits of the encoded number (each byte has two decimal digits each encoded using 4-bit nibbles), so byte >> 4 is the first digit and byte & 0x0f is the second digit. The leftmost digit in the array is the most significant. The rightmost digit in the array is the least significant.

The first byte of the BCD array contains the first digit of the number, represented as follows:

|  4 bits           |  4 bits           |
   = 0x                = the 1st digit

(The first nibble contains 0 if the decimal number has an even number of digits.) The last byte of the BCD array contains the last digit of the number and the final nibble, represented as follows:

|  4 bits           |  4 bits           |
   = the last digit    = nibble

The final nibble represents the number’s sign:

0x0a, 0x0c, 0x0e, 0x0f stand for plus,
0x0b and 0x0d stand for minus.

Examples

The decimal -12.34 will be encoded as 0xd6,0x01,0x02,0x01,0x23,0x4d:

|MP_EXT (fixext 4) | MP_DECIMAL | scale |  1   |  2,3 |  4 (minus) |
|       0xd6       |    0x01    | 0x02  | 0x01 | 0x23 | 0x4d       |

The decimal 0.000000000000000000000000000000000010 will be encoded as 0xc7,0x03,0x01,0x24,0x01,0x0c:

| MP_EXT (ext 8) | length | MP_DECIMAL | scale |  1   | 0 (plus) |
|      0xc7      |  0x03  |    0x01    | 0x24  | 0x01 | 0x0c     |
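The two examples above can be checked with a short decoder. The following Python sketch is illustrative only and is not taken from any Tarantool connector: the function names are hypothetical, it handles only the fixext and ext 8 headers, and it assumes the scale is encoded as a one-byte positive fixint, which is the case in both examples.

```python
from decimal import Decimal

MP_DECIMAL = 1  # extension type id given in the text

# fixext header byte -> payload length; ext 8 (0xc7) carries an explicit length byte
FIXEXT_SIZES = {0xd4: 1, 0xd5: 2, 0xd6: 4, 0xd7: 8, 0xd8: 16}
PLUS_NIBBLES = {0x0a, 0x0c, 0x0e, 0x0f}
MINUS_NIBBLES = {0x0b, 0x0d}


def decode_packed_decimal(payload: bytes) -> Decimal:
    """Decode PackedDecimal (scale followed by BCD digits and a sign nibble)."""
    if len(payload) < 2:
        raise ValueError("PackedDecimal needs a scale and at least one BCD byte")
    scale = payload[0]
    if scale > 0x7f:
        # Assumption of this sketch: scale is a one-byte positive fixint.
        raise ValueError("only positive fixint scales are handled here")
    nibbles = []
    for byte in payload[1:]:
        nibbles.append(byte >> 4)    # first digit of the byte
        nibbles.append(byte & 0x0f)  # second digit of the byte
    sign_nibble = nibbles.pop()      # the final nibble holds the sign
    if sign_nibble in PLUS_NIBBLES:
        sign = 0
    elif sign_nibble in MINUS_NIBBLES:
        sign = 1
    else:
        raise ValueError("invalid sign nibble")
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid BCD digit")
    # scale = number of digits after the decimal point
    return Decimal((sign, tuple(nibbles), -scale))


def decode_decimal_ext(data: bytes) -> Decimal:
    """Decode a complete MP_EXT value that carries MP_DECIMAL."""
    head = data[0]
    if head in FIXEXT_SIZES:
        length, pos = FIXEXT_SIZES[head], 1
    elif head == 0xc7:  # ext 8: one explicit length byte
        length, pos = data[1], 2
    else:
        raise ValueError("header not handled by this sketch")
    if data[pos] != MP_DECIMAL:
        raise ValueError("not an MP_DECIMAL value")
    return decode_packed_decimal(data[pos + 1:pos + 1 + length])


first = decode_decimal_ext(bytes([0xd6, 0x01, 0x02, 0x01, 0x23, 0x4d]))
second = decode_decimal_ext(bytes([0xc7, 0x03, 0x01, 0x24, 0x01, 0x0c]))
```

Here first compares equal to Decimal("-12.34"), and second carries the digits 010 with a scale of 36.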

The UUID type

The MessagePack EXT type MP_EXT together with the extension type MP_UUID is a header for values of the UUID type. Since version 2.4.1.

MP_UUID type is 2.

The MessagePack specification defines d8 to mean fixext with size 16, and a UUID’s size is always 16. So the UUID MessagePack representation looks like this:

+--------+------------+-----------------+
| MP_EXT | MP_UUID    | UuidValue       |
| = d8   | = 2        | = 16-byte value |
+--------+------------+-----------------+

The 16-byte value has 2 hexadecimal digits per byte. Typically, it consists of 11 fields, which are encoded as big-endian unsigned integers in the following order:

time_low (4 bytes)
time_mid (2 bytes)
time_hi_and_version (2 bytes)
clock_seq_hi_and_reserved (1 byte)
clock_seq_low (1 byte)
node[0], …, node[5] (1 byte each)

Some of the functions in Module uuid can produce values which are compatible with the UUID data type. For example, after

uuid = require('uuid')
box.schema.space.create('t')
box.space.t:create_index('i', {parts={1,'uuid'}})
box.space.t:insert{uuid.fromstr('f6423bdf-b49e-4913-b361-0740c9702e4b')}
box.space.t:select()

a peek at the server response packet will show that it contains

d8 02 f6 42 3b df b4 9e 49 13 b3 61 07 40 c9 70 2e 4b
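On the connector side, this layout maps directly onto a language's native UUID type. The Python sketch below is illustrative only (the function names are hypothetical). It relies on the standard uuid module, whose bytes attribute already stores the fields in big-endian order.

```python
import uuid

MP_UUID = 2        # extension type id given in the text
FIXEXT_16 = 0xd8   # MessagePack fixext with a 16-byte payload


def encode_uuid_ext(value: uuid.UUID) -> bytes:
    """Build the 18-byte MP_EXT representation of a UUID."""
    return bytes([FIXEXT_16, MP_UUID]) + value.bytes


def decode_uuid_ext(data: bytes) -> uuid.UUID:
    """Parse an 18-byte MP_EXT value that carries MP_UUID."""
    if len(data) != 18 or data[0] != FIXEXT_16 or data[1] != MP_UUID:
        raise ValueError("not an MP_UUID value")
    return uuid.UUID(bytes=data[2:])


sample = uuid.UUID("f6423bdf-b49e-4913-b361-0740c9702e4b")
packed = encode_uuid_ext(sample)
restored = decode_uuid_ext(packed)
```

With the UUID from the example above, packed contains the same 18 bytes as the response packet.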

The ERROR type

Since version 2.4.1, responses for errors have extra information following what was described in Box protocol – responses for errors. This is a “compatible” enhancement, because clients that expect old-style server responses should ignore map components that they do not recognize. Notice, however, that there has been a renaming of a constant: formerly IPROTO_ERROR in ./box/iproto_constants.h was 0x31, now IPROTO_ERROR is 0x52 and IPROTO_ERROR_24 is 0x31.

MP_ERROR type is 3.

++=========================+============================+
||                         |                            |
||   0x31: IPROTO_ERROR_24 |   0x52: IPROTO_ERROR       |
|| MP_INT: MP_STRING       | MP_MAP: extra information  |
||                         |                            |
++=========================+============================+
                        MP_MAP

The extra information, most of which is also in error object fields, is:

MP_ERROR_TYPE (0x00) (MP_STR) Type that implies source, as in error_object.base_type, for example “ClientError”.

MP_ERROR_FILE (0x01) (MP_STR) Source code file where error was caught, as in error_object.trace.

MP_ERROR_LINE (0x02) (MP_UINT) Line number in source code file, as in error_object.trace.

MP_ERROR_MESSAGE (0x03) (MP_STR) Text of reason, as in error_object.message. The value here will be the same as in the IPROTO_ERROR_24 value.

MP_ERROR_ERRNO (0x04) (MP_UINT) Ordinal number of the error, as in error_object.errno. Not to be confused with MP_ERROR_ERRCODE.

MP_ERROR_ERRCODE (0x05) (MP_UINT) Number of the error as defined in errcode.h, as in error_object.code, which can also be retrieved with the C function box_error_code(). The value here will be the same as the lower part of the Response-Code-Indicator value.

MP_ERROR_FIELDS (0x06) (MP_MAPs) Additional fields depending on error type. For example, if MP_ERROR_TYPE is “AccessDeniedError”, then MP_ERROR_FIELDS will include “object_type”, “object_name”, “access_type”. This field will be omitted from the response body if there are no additional fields available.

Client and connector programmers should ensure that unknown map keys are ignored, and should check for addition of new keys in the Tarantool source code file where error object creation is defined. In version 2.4.1 the name of this source code file is mp_error.cc.

For example, in version 2.4.1 or later, if we try to create a duplicate space with
conn:eval([[box.schema.space.create('_space');]])
the server response will look like this:

ce 00 00 00 88                  MP_UINT = HEADER + BODY SIZE
83                              MP_MAP, size 3 (i.e. 3 items in header)
  00                              Response-Code-Indicator
  ce 00 00 80 0a                  MP_UINT = hexadecimal 800a
  01                              IPROTO_SYNC
  cf 00 00 00 00 00 00 00 05      MP_UINT = sync value
  05                              IPROTO_SCHEMA_VERSION
  ce 00 00 00 4e                  MP_UINT = schema version value
82                              MP_MAP, size 2
  31                              IPROTO_ERROR_24
  bd 53 70 61 63 etc.             MP_STR = "Space '_space' already exists"
  52                              IPROTO_ERROR
  81                              MP_MAP, size 1
    00                              MP_ERROR_STACK
    91                              MP_ARRAY, size 1
      86                              MP_MAP, size 6
        00                              MP_ERROR_TYPE
        ab 43 6c 69 65 6e 74 etc.       MP_STR = "ClientError"
        02                              MP_ERROR_LINE
        cd 01 aa                        MP_UINT = line number
        01                              MP_ERROR_FILE
        b6 62 75 69 6c etc.             MP_STR = "builtin/box/schema.lua"
        03                              MP_ERROR_MESSAGE
        bd 53 70 61 63 65 20 etc.       MP_STR = "Space '_space' already exists"
        04                              MP_ERROR_ERRNO
        00                              MP_UINT = error number
        05                              MP_ERROR_ERRCODE
        0a                              MP_UINT = error code ER_SPACE_EXISTS
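The advice to ignore unknown map keys can be sketched as follows. This Python example is illustrative only. It assumes the IPROTO_ERROR map has already been decoded into plain dictionaries and lists by some MessagePack library. The names in FIELD_NAMES and the function name are hypothetical, and the sample values, including the line number and the unknown key 0x63, are made up for the demonstration.

```python
MP_ERROR_STACK = 0x00

# Numeric keys listed in the text, mapped to hypothetical readable names.
FIELD_NAMES = {
    0x00: "type",
    0x01: "file",
    0x02: "line",
    0x03: "message",
    0x04: "errno",
    0x05: "errcode",
    0x06: "fields",
}


def describe_error_stack(extra: dict) -> list:
    """Turn the decoded extra-information map into a list of named errors.

    Keys that are not listed in FIELD_NAMES are skipped, so keys added by
    newer server versions do not break the client.
    """
    described = []
    for error in extra.get(MP_ERROR_STACK, []):
        named = {FIELD_NAMES[key]: value
                 for key, value in error.items()
                 if key in FIELD_NAMES}
        described.append(named)
    return described


decoded = {
    0x00: [
        {
            0x00: "ClientError",
            0x01: "builtin/box/schema.lua",
            0x02: 426,                        # illustrative line number
            0x03: "Space '_space' already exists",
            0x04: 0,
            0x05: 10,                         # ER_SPACE_EXISTS, 0x0a
            0x63: "value under an unknown key",
        }
    ]
}
errors = describe_error_stack(decoded)
```

The unknown key is dropped without an error, and the remaining six fields are available by name.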

The DATETIME type

Since version 2.10.0. The MessagePack EXT type MP_EXT together with the extension type MP_DATETIME is a header for values of the DATETIME type. It creates a container with a payload of 8 or 16 bytes.

MP_DATETIME type is 4.

The MessagePack specification defines d7 to mean fixext with size 8 or d8 to mean fixext with size 16.

So the datetime MessagePack representation looks like this:

+---------+----------------+==========+-----------------+
| MP_EXT  | MP_DATETIME    | seconds  | nsec; tzoffset; |
| = d7/d8 | = 4            |          | tzindex;        |
+---------+----------------+==========+-----------------+

MessagePack data contains:

Seconds (8 bytes) as an unencoded 64-bit signed integer stored in the little-endian order.
The optional fields (8 bytes), if any of them have a non-zero value. The fields include nsec, tzoffset, and tzindex packed in the little-endian order.
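The layout can be unpacked with fixed-width little-endian reads. The Python sketch below is illustrative only. The text above does not state the widths of the optional fields, so the split into a 4-byte nsec, a 2-byte tzoffset, and a 2-byte tzindex (all signed) is an assumption of this example, and the function name is hypothetical.

```python
import struct

MP_DATETIME = 4    # extension type id given in the text
FIXEXT_8 = 0xd7    # seconds only
FIXEXT_16 = 0xd8   # seconds plus the optional fields


def decode_datetime_ext(data: bytes) -> dict:
    """Decode an MP_EXT value that carries MP_DATETIME."""
    head, ext_type = data[0], data[1]
    if ext_type != MP_DATETIME:
        raise ValueError("not an MP_DATETIME value")
    if head == FIXEXT_8:
        (seconds,) = struct.unpack_from("<q", data, 2)
        nsec = tzoffset = tzindex = 0   # omitted fields are zero
    elif head == FIXEXT_16:
        # Assumed widths: int32 nsec, int16 tzoffset, int16 tzindex.
        seconds, nsec, tzoffset, tzindex = struct.unpack_from("<qihh", data, 2)
    else:
        raise ValueError("unexpected MP_EXT header for a datetime")
    return {"seconds": seconds, "nsec": nsec,
            "tzoffset": tzoffset, "tzindex": tzindex}


short_form = decode_datetime_ext(bytes([FIXEXT_8, MP_DATETIME]) + struct.pack("<q", -1))
long_form = decode_datetime_ext(
    bytes([FIXEXT_16, MP_DATETIME]) + struct.pack("<qihh", 1, 500, 180, 0))
```

The 8-byte form yields only the seconds, with the remaining fields reported as zero.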

For more information about the datetime type, see datetime field type details and reference for the datetime module.

The INTERVAL type

Since version 2.10.0. The MessagePack EXT type MP_EXT together with the extension type MP_INTERVAL is a header for values of the INTERVAL type.

MP_INTERVAL type is 6.

The interval is saved as a variant of a map with a predefined number of known attribute names. If some attributes are undefined, they are omitted from the generated payload.

The interval MessagePack representation looks like this:

+--------+-------------------------+-------------+----------------+
| MP_EXT | Size of packed interval | MP_INTERVAL | PackedInterval |
+--------+-------------------------+-------------+----------------+

Packed interval consists of:

Packed number of non-zero fields.
Packed non-null fields.

Each packed field has the following structure:

+----------+=====================+
| field ID |     field value     |
+----------+=====================+

The number of defined (non-null) fields can be zero. In this case, the packed interval will be encoded as integer 0.

List of the field IDs:

0 – year
1 – month
2 – week
3 – day
4 – hour
5 – minute
6 – second
7 – nanosecond
8 – adjust

Example

Interval value 1 years, 200 months, -77 days is encoded in the following way:

tarantool> I = datetime.interval.new{year = 1, month = 200, day = -77}
---
...

tarantool> I
---
- +1 years, 200 months, -77 days
...

tarantool> M = msgpack.encode(I)
---
...

tarantool> M
---
- !!binary xwsGBAABAczIA9CzCAE=
...

tarantool> tohex = function(s) return (s:gsub('.', function(c) return string.format('%02X ', string.byte(c)) end)) end
---
...

tarantool> tohex(M)
---
- 'C7 0B 06 04 00 01 01 CC C8 03 D0 B3 08 01 '
...

Where:

C7 – MP_EXT
0B – size of a packed interval value (11 bytes)
06 – MP_INTERVAL type
04 – number of defined fields
00 – field ID (year)
01 – packed value 1
01 – field ID (month)
CCC8 – packed value 200
03 – field ID (day)
D0B3 – packed value -77
08 – field ID (adjust)
01 – packed value 1 (DT_LIMIT)
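The byte-by-byte breakdown above can be reproduced with a small decoder. The Python sketch below is illustrative only: the function names are hypothetical, it handles only the ext 8 header used in the example, and it assumes that the field count and the field IDs each occupy a single byte, as they do here.

```python
import struct

MP_INTERVAL = 6  # extension type id given in the text
FIELD_NAMES = ["year", "month", "week", "day", "hour",
               "minute", "second", "nanosecond", "adjust"]

# MessagePack integer markers -> struct format (big-endian)
INT_FORMATS = {0xcc: ">B", 0xcd: ">H", 0xce: ">I", 0xcf: ">Q",
               0xd0: ">b", 0xd1: ">h", 0xd2: ">i", 0xd3: ">q"}


def read_int(buf: bytes, pos: int):
    """Read one MessagePack integer, return (value, next position)."""
    head = buf[pos]
    if head <= 0x7f:                 # positive fixint
        return head, pos + 1
    if head >= 0xe0:                 # negative fixint
        return head - 0x100, pos + 1
    fmt = INT_FORMATS.get(head)
    if fmt is None:
        raise ValueError("not a MessagePack integer")
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    return value, pos + 1 + struct.calcsize(fmt)


def decode_interval_ext(data: bytes) -> dict:
    """Decode an ext 8 value that carries MP_INTERVAL."""
    if data[0] != 0xc7:
        raise ValueError("this sketch handles only the ext 8 header")
    length, ext_type = data[1], data[2]
    if ext_type != MP_INTERVAL:
        raise ValueError("not an MP_INTERVAL value")
    payload = data[3:3 + length]
    count, pos = payload[0], 1       # assumed: one byte for the field count
    fields = {}
    for _ in range(count):
        field_id = payload[pos]      # assumed: one byte for the field ID
        value, pos = read_int(payload, pos + 1)
        fields[FIELD_NAMES[field_id]] = value
    return fields


interval = decode_interval_ext(bytes.fromhex("C7 0B 06 04 00 01 01 CC C8 03 D0 B3 08 01"))
```

For the bytes from the example, the result contains year = 1, month = 200, day = -77, and adjust = 1.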

For more information about the interval type, see interval field type details and description of the datetime module.

File formats

The WAL file format

To maintain data persistence, Tarantool writes each data change request (insert, update, delete, replace, upsert) to a write-ahead log (WAL) file in the wal.dir directory. Each data change request is assigned a continuously growing 64-bit log sequence number. The name of the WAL file is based on the log sequence number of the first record in the file, plus an extension .xlog. A new WAL file is created when the current one reaches the wal_max_size size.

Each WAL record contains:

a log sequence number
a data change request (formatted as in Tarantool’s binary protocol)
a header
some metadata
the data formatted according to msgpack rules.

To see the hexadecimal bytes of the given WAL file, use the hexdump command:

$ hexdump 00000000000000000000.xlog

For example, the WAL file after the first INSERT request might look as follows:

Hex dump of WAL file       Comment
--------------------       -------
58 4c 4f 47 0a             "XLOG\n"
30 2e 31 33 0a             "0.13\n" = version
53 65 72 76 65 72 3a 20    "Server: "
38 62 66 32 32 33 65 30 2d [Server UUID]\n
36 39 31 34 2d 34 62 35 35
2d 39 34 64 32 2d 64 32 62
36 64 30 39 62 30 31 39 36
0a
56 43 6c 6f 63 6b 3a 20    "VClock: "
7b 7d                      "{}" = vclock value, initially blank
...                        (not shown = tuples for system spaces)
d5 ba 0b ab                Magic row marker always = 0xab0bbad5
19                         Length, not including length of header, = 25 bytes
00                           Record header: previous crc32
ce 8c 3e d6 70               Record header: current crc32
a7 cc 73 7f 00 00 66 39      Record header: padding
84                         msgpack code meaning "Map of 4 elements" follows
00 02                         element#1: tag=request type, value=0x02=IPROTO_INSERT
02 01                         element#2: tag=server id, value=0x01
03 04                         element#3: tag=lsn, value=0x04
04 cb 41 d4 e2 2f 62 fd d5 d4 element#4: tag=timestamp, value=an 8-byte "Float64"
82                         msgpack code meaning "map of 2 elements" follows
10 cd 02 00                   element#1: tag=space id, value=512, big byte first
21 91 01                      element#2: tag=tuple, value=1-element fixed array={1}
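The last two maps of the dump (the row header and the row body) can be decoded with a minimal MessagePack reader. The Python sketch below is illustrative only: the function name is hypothetical, and the reader supports only the few MessagePack types that occur in this particular record.

```python
import struct


def unpack(buf: bytes, pos: int = 0):
    """Decode one MessagePack value, return (value, next position).

    Covers only positive fixint, uint 8/16/32/64, float 64, fixmap
    and fixarray, which is enough for the record shown above.
    """
    head = buf[pos]
    if head <= 0x7f:                                   # positive fixint
        return head, pos + 1
    if 0x80 <= head <= 0x8f:                           # fixmap
        result, pos = {}, pos + 1
        for _ in range(head & 0x0f):
            key, pos = unpack(buf, pos)
            value, pos = unpack(buf, pos)
            result[key] = value
        return result, pos
    if 0x90 <= head <= 0x9f:                           # fixarray
        items, pos = [], pos + 1
        for _ in range(head & 0x0f):
            item, pos = unpack(buf, pos)
            items.append(item)
        return items, pos
    formats = {0xcc: ">B", 0xcd: ">H", 0xce: ">I", 0xcf: ">Q", 0xcb: ">d"}
    if head in formats:
        fmt = formats[head]
        (value,) = struct.unpack_from(fmt, buf, pos + 1)
        return value, pos + 1 + struct.calcsize(fmt)
    raise ValueError("type not handled by this sketch")


row = bytes.fromhex(
    "84 00 02 02 01 03 04 04 cb 41 d4 e2 2f 62 fd d5 d4"   # header map
    "82 10 cd 02 00 21 91 01"                              # body map
)
header, body_start = unpack(row)
body, row_end = unpack(row, body_start)
```

The header decodes to request type 2 (IPROTO_INSERT), server id 1, LSN 4, and a floating-point timestamp. The body decodes to space id 512 and the tuple {1}. Together the two maps occupy 25 bytes, which matches the length field 0x19.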

Tarantool processes requests atomically: a change is either accepted and recorded in the WAL, or discarded completely. To clarify how this happens, see the example with the REPLACE request below:

The server instance attempts to locate the original tuple by primary key. If found, a reference to the tuple is retained for later use.
The new tuple is validated. If for example it does not contain an indexed field, or it has an indexed field whose type does not match the type according to the index definition, the change is aborted.
The new tuple replaces the old tuple in all existing indexes.
A message is sent to the WAL writer running in a separate thread, requesting that the change be recorded in the WAL. The instance switches to work on the next request until the write is acknowledged.
On success, a confirmation is sent to the client. On failure, a rollback procedure begins. During the rollback procedure, the transaction processor rolls back all changes to the database which occurred after the first failed change, from latest to oldest, up to the first failed change. All rolled back requests are aborted with ER_WAL_IO error. No new change is applied while rollback is in progress. When the rollback procedure is finished, the server restarts the processing pipeline.

One advantage of the described algorithm is that complete request pipelining is achieved, even for requests on the same value of the primary key. As a result, database performance doesn’t degrade even if all requests refer to the same key in the same space.

The transaction processor thread communicates with the WAL writer thread using asynchronous (yet reliable) messaging. The transaction processor thread, not being blocked on WAL tasks, continues to handle requests quickly even at high volumes of disk I/O. A response to a request is sent as soon as it is ready, even if there were earlier incomplete requests on the same connection. In particular, SELECT performance, even for SELECTs running on a connection packed with UPDATEs and DELETEs, remains unaffected by disk load.

The WAL writer employs a number of durability modes, as defined in the wal.mode configuration option. It is possible to turn the write-ahead log off completely by setting wal.mode to none. Even without the write-ahead log, it’s still possible to take a persistent copy of the entire data set with the box.snapshot() request.

An .xlog file always contains changes based on the primary key. Even if the client requested an update or delete using a secondary key, the record in the .xlog file contains the primary key.

The snapshot file format

The format of a snapshot (.snap) file is the following:

The snapshot header contains the instance’s global unique identifier and the snapshot file’s position in history, relative to earlier snapshot files.
The snapshot content contains the records of inserts to memtx spaces. That differs from the content of an .xlog file that may contain records for any data-change requests (inserts, updates, upserts, and deletes).

Primarily, the records in the snapshot file have the following order:

System spaces (id >= 256 && id <= 511), ordered by ID.
Non-system spaces, ordered by ID.

Secondarily, the .snap file’s records are ordered by primary key within space ID.

Example

The header of a .snap or .xlog file might look as follows:

<type>\n                  SNAP\n or XLOG\n
<version>\n               currently 0.13\n
Server: <server_uuid>\n   where UUID is a 36-byte string
VClock: <vclock_map>\n    e.g. {1: 0}\n
\n
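Because this header is plain text terminated by an empty line, it can be separated from the binary rows with a few string operations. The Python sketch below is illustrative only. The function name and the shape of the returned dictionary are hypothetical, and the sample input is assembled from the WAL hex dump shown earlier.

```python
def parse_file_header(data: bytes):
    """Split a .snap/.xlog image into its text header and the remaining rows.

    Returns (header, rest): header holds the file type, the format version
    and the "Key: value" lines; rest is everything after the empty line.
    """
    head, separator, rest = data.partition(b"\n\n")
    if not separator:
        raise ValueError("header terminator (empty line) not found")
    lines = head.decode("ascii").split("\n")
    if len(lines) < 2:
        raise ValueError("header is too short")
    meta = {}
    for line in lines[2:]:
        key, _, value = line.partition(": ")
        meta[key] = value
    return {"type": lines[0], "version": lines[1], "meta": meta}, rest


sample = (b"XLOG\n"
          b"0.13\n"
          b"Server: 8bf223e0-6914-4b55-94d2-d2b6d09b0196\n"
          b"VClock: {}\n"
          b"\n"
          b"\xd5\xba\x0b\xab")          # first bytes of the first row
header, rest = parse_file_header(sample)
```

After the call, rest starts with the row marker, so a row reader can continue from there.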

After the file header come the data tuples. Tuples begin with a row marker 0xd5ba0bab and the last tuple may be followed by an EOF marker 0xd510aded. Thus, between the file header and the EOF marker, there may be data tuples that have this form:

0            3 4                                         17
+-------------+========+============+===========+=========+
|             |        |            |           |         |
| 0xd5ba0bab  | LENGTH | CRC32 PREV | CRC32 CUR | PADDING |
|             |        |            |           |         |
+-------------+========+============+===========+=========+
   MP_FIXEXT2    MP_INT     MP_INT       MP_INT      ---

+============+ +===================================+
|            | |                                   |
|   HEADER   | |                BODY               |
|            | |                                   |
+============+ +===================================+
     MP_MAP                     MP_MAP

The recovery process

The recovery process begins when box.cfg{} happens for the first time after the Tarantool server instance starts.

The recovery process must recover the databases as of the moment when the instance was last shut down. For this it may use the latest snapshot file and any WAL files that were written after the snapshot. One complicating factor is that Tarantool has two engines – the memtx data must be reconstructed entirely from the snapshot and the WAL files, while the vinyl data will be on disk but might require updating around the time of a checkpoint. (When a snapshot happens, Tarantool tells the vinyl engine to make a checkpoint, and the snapshot operation is rolled back if anything goes wrong, so vinyl’s checkpoint is at least as fresh as the snapshot file.)

Step 1

Read the configuration parameters in the box.cfg{} request. Parameters which affect recovery may include work_dir, wal_dir, memtx_dir, vinyl_dir and force_recovery.

Step 2

Find the latest snapshot file. Use its data to reconstruct the in-memory databases. Instruct the vinyl engine to recover to the latest checkpoint.

There are actually two variations of the reconstruction procedure for memtx databases, depending on whether the recovery process is “default”.

If the recovery process is default (force_recovery is false), memtx can read data in the snapshot with all indexes disabled. First, all tuples are read into memory. Then, primary keys are built in bulk, taking advantage of the fact that the data is already sorted by primary key within each space.

If the recovery process is non-default (force_recovery is true), Tarantool performs additional checking. Indexes are enabled at the start, and tuples are added one by one. This means that any unique-key constraint violations will be caught, and any duplicates will be skipped. Normally there will be no constraint violations or duplicates, so these checks are only made if an error has occurred.

Step 3

Find the WAL file that was made at the time of, or after, the snapshot file. Read its log entries until the log-entry LSN is greater than the LSN of the snapshot, or greater than the LSN of the vinyl checkpoint. This is the recovery process’s “start position”; it matches the current state of the engines.

Step 4

Redo the log entries, from the start position to the end of the WAL. The engine skips a redo instruction if it is older than the engine’s checkpoint.

Step 5

For the memtx engine, re-create all secondary indexes.

Replication internals

Server startup with replication

In addition to the recovery process described in the section Recovery process, the server must take additional steps and precautions if replication is enabled.

Once again, the startup procedure is initiated by the box.cfg{} request. One of the box.cfg parameters may be replication, which specifies the replication source(s). We will refer to this replica, which is starting up due to box.cfg, as the “local” replica to distinguish it from the other replicas in a replica set, which we will refer to as “distant” replicas.

If there is no snapshot .snap file and the replication parameter is empty and cfg.read_only=false:
then the local replica assumes it is an unreplicated “standalone” instance, or is the first replica of a new replica set. It will generate new UUIDs for itself and for the replica set. The replica UUID is stored in the _cluster space; the replica set UUID is stored in the _schema space. Since a snapshot contains all the data in all the spaces, that means the local replica’s snapshot will contain the replica UUID and the replica set UUID. Therefore, when the local replica restarts on later occasions, it will be able to recover these UUIDs when it reads the .snap file.

If there is no snapshot .snap file and the replication parameter is empty and cfg.read_only=true:
it cannot be the first replica of a new replica set because the first replica must be a master. Therefore an error message will occur: ER_BOOTSTRAP_READONLY. To avoid this, change the setting for this (local) instance to read_only = false, or ensure that another (distant) instance starts first and has the local instance’s UUID in its _cluster space. In the latter case, if ER_BOOTSTRAP_READONLY still occurs, set the local instance’s box.replication_connect_timeout to a larger value.

If there is no snapshot .snap file and the replication parameter is not empty and the _cluster space contains no other replica UUIDs:
then the local replica assumes it is not a standalone instance, but is not yet part of a replica set. It must now join the replica set. It will send its replica UUID to the first distant replica which is listed in replication and which will act as a master. This is called the “join request”. When a distant replica receives a join request, it will send back:

the distant replica’s replica set UUID,
the contents of the distant replica’s .snap file.

When the local replica receives this information, it puts the replica set UUID in its _schema space, puts the distant replica’s UUID and connection information in its _cluster space, and makes a snapshot containing all the data sent by the distant replica. Then, if the local replica has data in its WAL .xlog files, it sends that data to the distant replica. The distant replica will receive this and update its own copy of the data, and add the local replica’s UUID to its _cluster space.

If there is no snapshot .snap file and the replication parameter is not empty and the _cluster space contains other replica UUIDs:
then the local replica assumes it is not a standalone instance, and is already part of a replica set. It will send its replica UUID and replica set UUID to all the distant replicas which are listed in replication. This is called the “on-connect handshake”. When a distant replica receives an on-connect handshake:

the distant replica compares its own copy of the replica set UUID to the one in the on-connect handshake. If there is no match, then the handshake fails and the local replica will display an error.
the distant replica looks for a record of the connecting instance in its _cluster space. If there is none, then the handshake fails.

Otherwise the handshake is successful. The distant replica will read any new information from its own .snap and .xlog files, and send the new requests to the local replica.

In the end, the local replica knows what replica set it belongs to, the distant replica knows that the local replica is a member of the replica set, and both replicas have the same database contents.

If there is a snapshot file and replication source is not empty:
first the local replica goes through the recovery process described in the previous section, using its own .snap and .xlog files. Then it sends a “subscribe” request to all the other replicas of the replica set. The subscribe request contains the server vector clock. The vector clock has a collection of pairs ‘server id, lsn’ for every replica in the _cluster system space. Each distant replica, upon receiving a subscribe request, will read its .xlog files’ requests and send them to the local replica if (lsn of .xlog file request) is greater than (lsn of the vector clock in the subscribe request). After all the other replicas of the replica set have responded to the local replica’s subscribe request, the replica startup is complete.
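The comparison that a distant replica performs on a subscribe request can be sketched as a filter over its log. The Python example below is illustrative only. The function name and the row representation (dictionaries with server_id and lsn keys) are hypothetical, and treating a server id that is missing from the vector clock as LSN 0 is an assumption of this sketch.

```python
def rows_to_send(xlog_rows, subscriber_vclock):
    """Select the requests that a subscriber has not seen yet.

    xlog_rows: iterable of dicts with "server_id" and "lsn" keys.
    subscriber_vclock: dict mapping server id -> last LSN known to the
    subscriber. A missing server id is treated as LSN 0 (assumption).
    """
    return [row for row in xlog_rows
            if row["lsn"] > subscriber_vclock.get(row["server_id"], 0)]


log = [
    {"server_id": 1, "lsn": 4},
    {"server_id": 1, "lsn": 5},
    {"server_id": 2, "lsn": 1},
    {"server_id": 1, "lsn": 6},
]
vclock = {1: 5}                    # the subscriber has server 1 up to LSN 5
pending = rows_to_send(log, vclock)
```

Here the subscriber receives the request with LSN 6 from server 1 and the request with LSN 1 from server 2, in log order.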

The following temporary limitations applied for Tarantool versions earlier than 1.7.7:

The URIs in the replication parameter should all be in the same order on all replicas. This is not mandatory but is an aid to consistency.
The replicas of a replica set should be started up at slightly different times. This is not mandatory but prevents a situation where each replica is waiting for the other replica to be ready.

The following limitation still applies for the current Tarantool version:

The maximum number of entries in the _cluster space is 32. Tuples for out-of-date replicas are not automatically re-used, so if this 32-replica limit is reached, users may have to reorganize the _cluster space manually.

Orphan status

Starting with Tarantool version 1.9, there is a change to the procedure when an instance joins a replica set. During box.cfg() the instance tries to join all nodes listed in box.cfg.replication. If the instance does not succeed with connecting to the required number of nodes (see bootstrap_strategy), it switches to the orphan status. While an instance is in orphan status, it is read-only.

To “join” a master, a replica instance must “connect” to the master node and then “sync”.

“Connect” means contact the master over the physical network and receive acknowledgment. If there is no acknowledgment after box.replication_connect_timeout seconds (usually 4 seconds), and retries fail, then the connect step fails.

“Sync” means receive updates from the master in order to make a local database copy. Syncing is complete when the replica has received all the updates, or at least has received enough updates that the replica’s lag (see replication.upstream.lag in box.info()) is less than or equal to the number of seconds specified in box.cfg.replication_sync_lag. If replication_sync_lag is unset (nil) or set to TIMEOUT_INFINITY, then the replica skips the “sync” state and switches to “follow” immediately.

In order to leave orphan mode, you need to sync with a sufficient number of instances (bootstrap_strategy). To do so, you may either:

Reset box.cfg.replication to exclude instances that cannot be reached or synced with.
Set box.cfg.replication to "" (empty string).

The following situations are possible.

Situation 1: bootstrap

Here box.cfg{} is being called for the first time. A replica is joining but no replica set exists yet.

Set the status to ‘orphan’.

Try to connect to all nodes from box.cfg.replication. The replica tries to connect for the replication_connect_timeout number of seconds and retries each replication_timeout seconds if needed.

Abort and throw an error if a replica is not connected to the majority of nodes in box.cfg.replication.

This instance might be elected as the replica set ‘leader’. Criteria for electing a leader include vclock value (largest is best), and whether it is read-only or read-write (read-write is best unless there is no other choice). The leader is the master that other instances must join, and it is the instance that executes box.once() functions.

If this instance is elected as the replica set leader, then perform an “automatic bootstrap”:

Set status to ‘running’.

Return from box.cfg{}.

Otherwise this instance will be a replica joining an existing replica set, so:

Bootstrap from the leader. See examples in section Bootstrapping a replica set.

In the background, sync with all the other nodes in the replica set.

Situation 2: recovery

Here box.cfg{} is not being called for the first time. It is being called again in order to perform recovery.

Perform recovery from the last local snapshot and the WAL files.

Try to establish connections to all other nodes for the replication_connect_timeout number of seconds. Once replication_connect_timeout is expired or all the connections are established, proceed to the “sync” state with all the established connections.

If connected, sync with all connected nodes, until the difference is not more than replication_sync_lag seconds.

Situation 3: configuration update

Here box.cfg{} is not being called for the first time. It is being called again because some replication parameter or something in the replica set has changed.

Try to connect to all nodes from box.cfg.replication, within the time period specified in replication_connect_timeout.

Try to sync with the connected nodes, within the time period specified in replication_sync_timeout.

If earlier steps fail, change status to ‘orphan’. (Attempts to sync will continue in the background and when/if they succeed then ‘orphan’ status will end.)

If earlier steps succeed, set status to ‘running’ (master) or ‘follow’ (replica).

Situation 4: rebootstrap

Here box.cfg{} is not being called. The replica connected successfully at some point in the past, and is now ready for an update from the master. But the master cannot provide an update. This can happen by accident, or more likely can happen because the replica is slow (its lag is large), and the WAL (.xlog) files containing the updates have been deleted. This is not crippling. The replica can discard what it received earlier, and then ask for the master’s latest snapshot (.snap) file contents. Since it is effectively going through the bootstrap process a second time, this is called “rebootstrapping”. However, there has to be one difference from an ordinary bootstrap – the replica’s replica id will remain the same. If it changed, then the master would think that the replica is a new addition to the cluster, and would maintain a record of an instance ID of a replica that has ceased to exist. Rebootstrapping was introduced in Tarantool version 1.10.2 and is completely automatic.

Limitations

Number of parts in an index

For TREE or HASH indexes, the maximum is 255 (box.schema.INDEX_PART_MAX). For RTREE indexes, the maximum is 1 but the field is an ARRAY of up to 20 dimensions. For BITSET indexes, the maximum is 1.

Number of tuples in a hash index

4,294,967,288 (2³²-8).

Number of indexes in a space

128 (box.schema.INDEX_MAX).

Number of fields in a tuple

The theoretical maximum is 2,147,483,647 (box.schema.FIELD_MAX). The practical maximum is whatever is specified by the space’s field_count member, or the maximal tuple length.

Number of bytes in a tuple

The maximal number of bytes in a tuple is roughly equal to memtx.max_tuple_size or vinyl.max_tuple_size (with a metadata overhead of about 20 bytes per tuple, which is added on top of useful bytes). By default, the value of either memtx.max_tuple_size or vinyl.max_tuple_size is 1,048,576.

Number of bytes in an index key

If a field in a tuple can contain a million bytes, then the index key can contain a million bytes, so the maximum is determined by factors such as Number of bytes in a tuple, not by the index support.

Number of elements in array fields in a space with a multikey index

In a Tarantool space that has multikey indexes, no tuple can contain more than ~8,000 elements in a field indexed with a multikey index. This is because every element has 4 bytes of metadata, and the tuple’s metadata, which includes multikey metadata, cannot exceed 2^16 bytes.

Number of spaces

The theoretical maximum is 2,147,483,646 (box.schema.SPACE_MAX) but the practical maximum is around 65,000.

Number of connections

The practical limit is the number of file descriptors that one can set with the operating system.

Space size

The total maximum size for all spaces is in effect set by memtx.memory, which in turn is limited by the total available memory.

Update operations count

The maximum number of operations per tuple that can be in a single update is 4,000 (BOX_UPDATE_OP_CNT_MAX).

Number of users and roles

32 (BOX_USER_MAX).

Length of an index name or space name or user name

65,000 (box.schema.NAME_MAX).

Number of replicas in a replica set

32 (vclock.VCLOCK_MAX).

Releases

This section contains information about 3.x Tarantool releases: release notes, lifecycle information, release policy, and other documents. To download Tarantool releases, check the Download page. To see information about earlier Tarantool versions, see the Releases page of the corresponding documentation.

All currently supported versions are listed on this page below. Information about earlier versions is provided in EOS versions.

The Enterprise Edition of Tarantool is distributed in the form of an SDK that has its own versioning. See the Enterprise SDK changelog to learn about SDK version numbering and changes.

The detailed information about Tarantool version numbering and release lifecycle is available in Tarantool release policy.

Backward compatibility is guaranteed between all versions in the same release series. It is also appreciated but not guaranteed between different release series (major number changes). To learn more, read the Compatibility guarantees article.

Supported versions

Every Tarantool release series has the same lifecycle defined by the release policy. The following diagram visualizes the lifecycle of currently supported Tarantool versions:

The table below provides information about supported versions with links to their What’s new pages in the documentation and detailed changelogs on GitHub. For information about earlier versions, see EOS versions.

Note

End of life (EOL) means the release series will no longer receive any patches, updates, or feature improvements after the specified date.

End of support (EOS) means that we won’t provide technical support to product versions after the specified date. Versions that haven’t reached their end of life yet are shown in bold.

Series	First release date	End of life	End of support	Versions
3.6	December 12, 2025	Not planned yet	Not planned yet	v. 3.6.0
3.5	August 27, 2025	Not planned yet	Not planned yet	v. 3.5.0
3.4	April 14, 2025	April 14, 2027	Not planned yet	v. 3.4.0
3.3	November 29, 2024	November 29, 2026	Not planned yet	v. 3.3.1 v. 3.3.0
3.2	August 26, 2024	August 26, 2026	Not planned yet	v. 3.2.1 v. 3.2.0

Tarantool release policy

Summary

The Tarantool release policy is continually refined to become clearer and more intuitive. The current policy uses a SemVer-like versioning format and maintains a version lifecycle with more long-term support series. This document explains the Tarantool release policy, versioning rules, and release series lifecycle.

Versioning policy

Release series and versions

The Tarantool release policy is based on having several release series, each with its own lifecycle, pre-release and release versions.

Release series

Release series is a sequence of development and production-ready versions with linear evolution toward a defined roadmap. A series has a distinct lifecycle and certain compatibility guarantees within itself and with other series. The intended support time for each series is at least two years since the first release.

Release version

Release version is a Tarantool distribution which is thoroughly tested and ready for production usage. It is bound to a certain commit. Release version label consists of three numbers:

MAJOR.MINOR.PATCH

These numbers correspond to the three types of release versions:

Major release

Major release is the first release version of its own release series. It introduces new features and can have a few backward-incompatible changes. Such release changes the first version number:

MAJOR.0.0

3.0.0

Minor release

Minor release introduces a few new features, but guarantees backward compatibility. There can be a few bugs fixed as well. Such release changes the second version number:

MAJOR.MINOR.0

3.1.0
3.2.0

Patch release

Patch release fixes bugs from an earlier release, but doesn’t introduce new features. Such release changes the third version number:

MAJOR.MINOR.PATCH

3.0.1
3.0.2
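The mapping between a release version label and its release type can be sketched in a few lines of Lua. This is only an illustration: `release_kind` is a hypothetical helper, not a Tarantool API, and it assumes a plain MAJOR.MINOR.PATCH label without a suffix.

```lua
-- Classify a release version label by its lowest non-zero component.
-- `release_kind` is a hypothetical helper written for this example.
local function release_kind(version)
    local major, minor, patch = version:match('^(%d+)%.(%d+)%.(%d+)$')
    if not major then
        -- Labels with a suffix (pre-releases) are not handled here.
        return nil, 'not a MAJOR.MINOR.PATCH label: ' .. tostring(version)
    end
    if tonumber(patch) > 0 then
        return 'patch'
    elseif tonumber(minor) > 0 then
        return 'minor'
    end
    return 'major'
end

print(release_kind('3.0.0'))
print(release_kind('3.2.0'))
print(release_kind('3.0.2'))
```

The helper returns `nil` plus a message for any label that carries a suffix, so pre-release versions are never mistaken for releases.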

Release versions conform to a set of requirements:

The release has gone through pre-release testing and adoption in the internal projects until there were no doubts regarding its stability.

There are no known bugs in the typical usage scenarios.

There are no degradations from the previous release or release series, in case of a major release.

Backwards compatibility is guaranteed between all versions in the same release series. It is also appreciated, but not guaranteed between different release series (major number changes). See compatibility guarantees page for details.

Pre-release versions

Pre-release version

Pre-release versions are the ones published for testing and evaluation, and not intended for production use. Such versions use the same pattern with an additional suffix:

MAJOR.MINOR.PATCH-suffix

There are a few types of pre-release versions:

Development build

Development builds reflect the state of current development process. They’re used entirely for development and testing, and not intended for any external use.

Development builds have suffixes made with $(git describe --always --long)-dev:

MAJOR.MINOR.PATCH-describe-dev

2.10.2-149-g1575f3c07-dev
3.0.0-alpha1-14-gxxxxxxxxx-dev
3.0.0-entrypoint-17-gxxxxxxxxx-dev
3.1.2-5-gxxxxxxxxx-dev

Alpha version

Alpha version has some of the features planned in the release series. It can be incomplete or unstable, and can break the backwards compatibility with the previous release series.

Alpha versions are published for early adopters and developers of dependent components, such as connectors and modules.

MAJOR.MINOR.PATCH-alphaN

3.0.0-alpha1
3.0.0-alpha2

Beta version

Beta version has all the features which are planned for the release series. It is a good choice to start developing a new application.

Readiness of a feature can be checked in a beta version to decide whether to remove the feature, finish it later, or replace it with something else. A beta version can still have a known bug in the new functionality, or a known degradation since the previous release series that affects a common use case.

MAJOR.MINOR.PATCH-betaN

3.0.0-beta1
3.0.0-beta2

Note that the development of 2.10.0, the first release under the new policy, starts with version 2.10.0-beta1.

Release candidate

Release candidate is used to fix bugs, mature the functionality, and collect feedback before an upcoming release. Release candidate has the same feature set as the preceding beta version and doesn’t have known bugs in typical usage scenarios or degradations from the previous release series.

Release candidate is a good choice to set up a staging server.

MAJOR.MINOR.PATCH-rcN

3.0.0-rc1
3.0.0-rc2
3.0.1-rc1
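The pre-release suffixes described above can be told apart mechanically. The following sketch assumes the MAJOR.MINOR.PATCH-suffix pattern from this section; `parse_version` is a hypothetical helper written for illustration and is not part of Tarantool.

```lua
-- Split a version label into numbers and a version kind.
-- `parse_version` is a hypothetical helper, not a Tarantool API.
local function parse_version(label)
    local major, minor, patch, suffix =
        label:match('^(%d+)%.(%d+)%.(%d+)%-?(.*)$')
    if not major then
        return nil
    end
    local kind
    if suffix == '' then
        kind = 'release'
    elseif suffix == 'dev' or suffix:match('%-dev$') then
        -- Development builds end with "-dev", whatever precedes it.
        kind = 'development build'
    elseif suffix:match('^alpha%d+$') then
        kind = 'alpha'
    elseif suffix:match('^beta%d+$') then
        kind = 'beta'
    elseif suffix:match('^rc%d+$') then
        kind = 'release candidate'
    else
        kind = 'unknown'
    end
    return {
        major = tonumber(major),
        minor = tonumber(minor),
        patch = tonumber(patch),
        kind = kind,
    }
end

print(parse_version('3.0.0-rc1').kind)
print(parse_version('3.0.0-alpha1-14-gxxxxxxxxx-dev').kind)
```

The `-dev` check comes first because a development build label may embed an alpha or entrypoint tag in the middle of its suffix.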

Release series lifecycle

Every release series goes through the following stages:

Early development
Support
End of life
End of support

Early development

The early development stage goes on until the first major release. Alpha, beta, and release candidate versions are published at this stage.

The stage splits into two phases:

Development of a new functionality through alpha and beta versions. Features can be added and, sometimes, removed in this phase.
Stabilization starts with the first release candidate version. Feature set doesn’t change in this phase.

Support

The stage starts when the first release is published. From this point, the release series receives only backward-compatible changes.

At this stage, all known security problems and all found degradations since the previous series are being fixed.

The series receives degradation fixes and other bugfixes during the support stage and until the series transitions into the end of life (EOL) stage.

The decision of whether to fix a particular problem in a particular release series depends on the impact of the problem, risks around backward compatibility, and the complexity of backporting a fix.

The release series might receive new features at this stage, but only in a backward compatible manner. Also, a release candidate may be published to collect feedback before the release version.

During the support period, the build infrastructure of a release series is extended with new versions of supported Linux distributions.

The intended duration of the support period for each series is at least two years.

End of life

A series reaches the end of life (EOL) when the last release in the series is published. The series will not receive updates anymore.

In modules, connectors and tools, we don’t guarantee support of any release series that reaches EOL.

A release series cannot reach EOL until the vast majority of production environments, for which we have commitments and SLAs, is updated to a newer series.

End of support

The end of support (EOS) date comes later than the EOL date. When the series reaches EOS, the Tarantool team stops providing technical support for it and no longer handles support-related inquiries.

It is recommended that the customers follow the calendar and plan for updates of their Tarantool version before the EOS date.

Versions per lifecycle stage

Stage	Version types	Examples
Early development	Alpha, beta, release candidate	3.0.0-alpha1 3.0.0-beta1 3.0.0-rc1 3.0.0-dev
Support	Release candidate, release	3.0.0 3.0.1-rc1 3.0.1-dev
End of life	None	N/A

Example of a release series

A release series in an early development stage can have the following version sequence:

3.0.0-alpha1
3.0.0-alpha2
...
3.0.0-alpha7

3.0.0-beta1
...
3.0.0-beta5

3.0.0-rc1
...
3.0.0-rc4

3.0.0 (release)

Since the first release version, the series comes into a support stage. Then it can proceed with a version sequence like the following:

3.0.0 (release of a new major version)

3.0.1-rc1
...
3.0.1-rc4
3.0.1 (release with some bugs fixed but no new features)

3.1.0-rc1
...
3.1.0-rc6
3.1.0 (release with new features and, possibly, extra fixed bugs)

Eventually, the support stage ends and the release series reaches the end of life (EOL) stage. No new versions are released after that.

Note

See all currently supported Tarantool versions in Releases.

Tarantool 3.6

Release date: 2025-12-12

Releases on GitHub: v. 3.6.0

The 3.6 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- Memtx: significantly faster snapshot recovery.
- New privileges: grant and metagrant.
Enterprise Edition (EE)
- Failover coordinator: synchronous replication for 2 DC topology.
- MemCS: multiple improvements.

[CE] Memtx: significantly faster snapshot recovery

This release speeds up memtx snapshot recovery by up to 70% by offloading MsgPack decoding to a separate thread. Space and transactional triggers (space.on_replace, space.before_replace) are now deprecated during recovery.

[CE] New privileges: grant and metagrant

This release introduces new privileges, grant and metagrant. They allow Tarantool users to create new users with a complete range of privileges, like the built-in admin user does.

The grant privilege allows granting any privilege (except grant and metagrant) on an object, object class or universe.

The metagrant privilege allows granting the grant and metagrant privileges.

Both grant and metagrant are only grantable on the universe and allow granting privileges only to other users, not to the current user.

[EE] Failover coordinator: synchronous replication for 2 DC topology

In previous releases, the failover coordinator supported synchronous replication for topologies with data storages located in 3 or more data centers. With N instances within a replica set, the minimal quorum size could be N/2 + 1. Topologies with 2 data centers were not supported because the quorum size could not be dynamically decreased if the connection to the second data center was lost. For example, in a minimal replica set of 2 instances the default quorum size was 2 (calculated as 2/2 + 1 = 2). If one of the instances failed, no new transactions could be committed, because it was impossible to get an ack from the second instance. So, new transactions were lost.

Now, for supervised failover, Tarantool supports synchronous replication for topologies with data storages located in 2 data centers. The failover coordinator automatically decreases the quorum size if the connection to the second data center is lost, and restores the cluster’s operability within the available data center. When the connection to the second data center is up again, the failover coordinator automatically restores the normal quorum size. So, no transactions are lost.
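The quorum arithmetic behind the 2 DC problem can be checked with a short sketch. It assumes that N/2 in the formula above means integer division; the function names are hypothetical and are not part of the failover coordinator.

```lua
-- Majority quorum for a replica set of n instances: floor(n / 2) + 1.
-- Both helpers are illustrative, not Tarantool APIs.
local function majority_quorum(n)
    return math.floor(n / 2) + 1
end

-- Number of instances that may fail while a quorum can still be collected.
local function tolerated_failures(n)
    return n - majority_quorum(n)
end

for _, n in ipairs({2, 3, 4, 5}) do
    print(n, majority_quorum(n), tolerated_failures(n))
end
```

With 2 instances the quorum equals the replica set size, so a fixed quorum tolerates no failures. This is the situation that the automatic quorum decrease addresses.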

However, there are some limitations. Although data storages can be deployed in just 2 DCs, the system requires a third DC (a so-called “quorum DC”) for an extra configuration storage (based on etcd or Tarantool). So, the effective topology (also known as a “2.5 DC topology”) involves 3 DCs in total: two DCs with a complete set of components (data storages, routers, configuration storages) and one DC with a configuration storage only.

Here are the cases when a topology with data storages deployed in 2 DCs may fail.

Case #1: “interrupted data enrichment after DC failure”

DC #1 goes down, while DC #2 is up. The failover coordinator decreases the quorum size to 1. The cluster keeps serving the clients.
DC #1 goes up and starts obtaining missed data from DC #2. The quorum size is still 1.
The data enrichment process for DC #1 is still in progress, but DC #2 goes down now. In this case, the cluster becomes unavailable.

Case #2: “no multiple quorum decrease”

DC #1 (2 instances) goes down, while DC #2 (2 instances) is up.
One of the 2 instances in DC #2 goes down. In this case, the cluster becomes unavailable as well.

Case #3: Both DCs go down – the ultimate case when the cluster becomes unavailable.

In other cases, the cluster will keep working.

[EE] MemCS: multiple improvements

This release brings multiple minor enhancements to the MemCS engine, most of them focusing on indexes and performance.

MemCS now supports:

Aggregates over decimal fields,
Inserting into the middle of MemCS primary index with aggregates,
Specifying per-column layout in space format and secondary index definitions,
The index:quantile() function,
Bloom aggregates,
Statistics of index aggregates and scanner,
Index aggregates exported to their Lua objects,
Some C API improvements related to inserting and scanning data in the Apache Arrow format.

Furthermore, the performance of MemCS skip index writes is increased. Now this index doesn’t reallocate blocks if they are not used by any read view.

Tarantool 3.5

Release date: 2025-08-27

Releases on GitHub: v. 3.5.0

The 3.5 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- Fixed-point decimal types decimal32, decimal64 are supported.
- Memtx: new O(n) sorting algorithm for sorting secondary keys on startup.
- New fail_if tag for roles and scripts.
- Faster large (500+) clusters reload.
Enterprise Edition (EE)
- Quorum synchronous replication setup in three availability zones is supported by failover = 'supervised'.
- Supervised failover coordinator skips RW switch in presence of dead instances.
- Fixed-point decimal types are supported in MemCS.
- BRIN indexes are supported in MemCS.
- Performance is improved in MemCS.
- Secondary index batch insertion performance is improved in MemCS.
- Storage format and string scanning performance improvements in MemCS.

[CE] Support for fixed-point decimal types `decimal32`, `decimal64`

Fixed-point decimal types are now supported: decimal32, decimal64, decimal128, and decimal256. They differ in their number of significant decimal digits:

decimal32 - 9
decimal64 - 18
decimal128 - 38
decimal256 - 76

These types also have an additional parameter, scale, which defines the position of the decimal point. This can also be interpreted as an implied decimal exponent with a value of -scale.

For example:

A decimal32 type with scale = 4 can represent values from -99999.9999 to 99999.9999.
A decimal32 type with scale = -2 can represent values from -999999999 × 10² to 999999999 × 10².
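The two ranges above follow from the number of significant digits and the scale. A minimal sketch of that arithmetic is shown below; `max_fixed_decimal` is a hypothetical helper that works on strings to avoid floating-point rounding, and it does not use the Tarantool decimal module.

```lua
-- Largest value representable with `digits` significant decimal digits
-- and the given `scale`, returned as a string.
-- `max_fixed_decimal` is a hypothetical helper written for this example.
local function max_fixed_decimal(digits, scale)
    local nines = string.rep('9', digits)
    if scale <= 0 then
        -- A negative scale moves the decimal point to the right.
        return nines .. string.rep('0', -scale)
    end
    if scale >= digits then
        -- All significant digits are fractional.
        return '0.' .. string.rep('0', scale - digits) .. nines
    end
    local int_len = digits - scale
    return nines:sub(1, int_len) .. '.' .. nines:sub(int_len + 1)
end

-- decimal32 has 9 significant digits.
print(max_fixed_decimal(9, 4))
print(max_fixed_decimal(9, -2))
```

The minimum value is the same number with a minus sign, since the ranges are symmetric.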

Example 1. Creating a space with a field of a fixed decimal type:

s = box.schema.create_space('test', {format = {
    {'a', 'unsigned'}, {'b', 'decimal32', scale = 4},
}})

Decimal values can be created using the decimal module. It has been updated the following way to support representing values of the new types:

The limitation on exponent has been removed.
Precision has been increased to 76 decimal digits.
Printing is now done in scientific notation.

local decimal = require('decimal')
s = box.schema.create_space('test', {format = {
    {'a', 'unsigned'}, {'b', 'decimal32', scale = 4},
}})
s:create_index('pk')
s:insert({1, decimal.new(13333)})
s:insert({2, decimal.new(0.0017)})

[CE] Memtx: new O(n) sorting algorithm for sorting secondary keys

Now it is possible to sort secondary keys using a new O(n) sorting algorithm that uses additional data written into the snapshot. The feature can be enabled with a new memtx_use_sort_data option in box.cfg or memtx.use_sort_data in the instance configuration.

With the option set to true, additional data is saved into a separate .sortdata file during snapshot creation and is used during recovery. The default value is false, so the behavior must be explicitly enabled by the user if required.

The option can be changed at runtime. For example, you can enable it during recovery to use the new secondary key sorting approach, but disable it before creating a new snapshot, so only the .snap file will be created and the O(n) secondary key sort will not be available for that snapshot.

The performance impact of the option depends on the persistent storage read/write speed and the number of tuples and secondary keys in spaces (the more tuples and keys, the more beneficial the new approach is). Additionally, the approach only utilizes a single CPU core.

As a downside, it adds memory overhead during recovery (up to ~45 bytes per tuple) and extra work during snapshot creation (due to writing the sort data to persistent storage).

[CE] `fail_if` tag for roles and scripts

A fail_if tag has been added for roles and scripts. If the tag fail_if is set to an expression string, loading a role/script will raise an error if the fail_if expression evaluates to true.

[CE] Faster large (500+) clusters reload

The new release processes an instance’s configuration only when it is explicitly accessed, such as through a config:get(<...>, {instance = <...>}) call. This significantly speeds up startup and configuration reloads, especially for large clusters.

[EE] Support for quorum synchronous replication setup in three availability zones

This release makes the replication.failover = supervised mode support quorum synchronous replication setup in three availability zones.

This release includes several preliminary patches, which make the appoint_commit logic safer with regard to various possible situations, and the main patch, which adds the box.ctl.promote() call to appoint_commit if failover.replicasets.<replicaset_name>.synchro_mode is set to true.

[EE] Supervised failover coordinator skips RW switch in presence of dead instances

When applying centralized configuration updates, Tarantool supervised failover coordinator no longer performs RW switch within the replicaset if there are dead instances.

This release introduces a smarter configuration update process for the Tarantool supervised failover coordinator. Previously, any change would trigger a full restart of all services, causing unnecessary downtime. Now, the system intelligently differentiates between option types: most settings can be applied dynamically without any restart. A restart will only occur if you modify critical core parameters, specifically any options under failover.* or failover.stateboard.* (with the exception of failover.replicasets). This targeted approach minimizes disruptive restarts and significantly improves the coordinator’s availability.

[EE] Fixed-point decimal types are supported in MemCS

It is now possible to define a field with the decimal32/64/128/256 type (fixed-point decimal):

Insert/Replace/Get/Select operations for fixed point decimal values are represented as standard decimals (MP_DECIMAL) in all engines (MemTX, Vinyl, MemCS, Quiver (if applicable)).
The internal representation of fixed point decimal values remains the same as the existing decimal type (decNumber) in all engines except MemCS and Quiver.
It is now possible to batch insert and scan fixed point decimal values in the Apache Arrow format in MemCS and Quiver engines.

[EE] BRIN indexes are supported in MemCS

This release introduces a new family of ArrowStream filters named logical combinators, which allow combining other filters using logical AND and OR operations. Validation occurs recursively from the root. Note that complex filters are not optimized; each child filter is evaluated sequentially until the overall condition is satisfied.

[EE] Performance is improved in MemCS

This release brings performance enhancements to the MemCS engine, focusing on interoperability and insertion speed. A fix for Arrow array null_count calculation resolves an issue for users of the Rust Arrow C API. Furthermore, batch insertion into RLE-formatted columns is now dramatically faster, showing performance improvements of 2.5x for 10% filled columns and 11x for 1% filled columns.

[EE] Secondary index batch insertion performance is improved in MemCS

This release introduces a new next_row method that significantly accelerates batch insertion and secondary index building. The method replaces the old tuple-based iterator, using a column mask to process only necessary data and returning raw row information to eliminate costly MsgPack conversions. Performance for secondary index insertion has improved by over 550%, with rates jumping from ~10k to 62k rows per second.

[EE] Storage format and string scanning performance improvements in MemCS

This release combines a new optimized storage format for short strings with a powerful Arrow view layout to deliver huge performance improvements for string scanning.

Storage format updates:

Strings shorter than 13 characters are now stored inline within a 16-byte vector, significantly improving access speed for common short-string data.
New force_view_types parameter for Arrow streams (disabled by default) allows switching to a “variable-size binary view layout,” which unlocks major performance gains when used with a read-view.
The memcs_column_data_size() API function has been split into memcs_column_int_data_size() and memcs_column_ext_data_size() to account for the new dual storage format.

We ran scan performance tests for 3 different sets of strings (the lengths were randomly distributed in the ranges 1-12, 1-100, 1-1000) and 2 different modes, notouch and touch. In the notouch mode strings were only scanned as a batch, without accessing them; in the touch mode, strings were scanned and the first external character was checked:

Strings 1-12 chars: Up to 4.8x faster in the notouch mode, and 3.2x faster in the touch mode.
Strings 1-100 chars: Up to 3.1x faster in the notouch mode, and 1.8x faster in the touch mode.
Strings 1-1000 chars: Up to 10x faster in the notouch mode, and 2.9x faster in the touch mode.

Tarantool 3.4

Release date: April 14, 2025

Releases on GitHub: v. 3.4.0

The 3.4 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- Memtx-vinyl cross-engine transactions.
- New index:quantile() function for finding a quantile key in an indexed data range.
- Functional indexes in the MVCC transaction manager.
- Vinyl now supports np (next prefix) and pp (previous prefix) iterators.
- Fixed incorrect number comparisons and duplicates in unique indexes.
- Runtime privileges for lua_call are now granted before box.cfg().
- The stop callbacks for the roles are now called during graceful shutdown, in the reverse order of roles startup.
- New has_role, is_router, and is_storage methods in the config module to check if a role is enabled on an instance.
- LuaJIT profilers are now more user-friendly.
- Built-in logger now encodes table arguments in the JSON format.
- Multiple bugfixes for MVCC, vinyl, WAL, and snapshotting.
- Fixed memory overgrowing for cdata-intensive workloads.
Enterprise Edition (EE)
- New in-memory columnar storage engine: memcs.
- New bootstrap strategy in failover: native.
- New public API for accessing remote config.storage clusters as key-value storages.
- Two-phase appointment process to avoid incorrect behavior of the failover coordinator.

[EE] New in-memory columnar storage engine: ‘memcs’

The engine stores data in the memtx arena but in contrast to memtx it doesn’t organize data in tuples. Instead, it stores data in columns. Each format field is assigned its own BPS tree-like structure (BPS vector), which stores values only of that field. If the field type fits in 8 bytes, raw field values are stored directly in tree leaves without any encoding. For values larger than 8 bytes, like decimal, uuid or strings, the leaves store pointers to MsgPack-encoded data.

The main benefit of such data organization is a significant performance boost of columnar data sequential scans compared to memtx thanks to CPU cache locality. That’s why memcs supports a special C API for such columnar scans: see box_index_arrow_stream() and box_raw_read_view_arrow_stream(). Peak performance is achieved when scanning embedded field types.

Querying full tuples, like in memtx, is also supported, but the performance is worse compared to memtx, because a tuple has to be constructed on the runtime arena from individual field values gathered from each column tree.

Other features include:

Point lookup.
Stable iterators.
Insert / replace / delete / update.
Batch insertion in the Arrow format.
Transactions, including cross-engine transactions with memtx (with memtx_use_mvcc_engine = false).
Read view support.
Secondary indexes with an ability to specify covered columns and sequentially scan indexed + covered columns.

Embedded field types include only fixed-width types:

Integer: (u)int8/16/32/64.
Floating point: float32/64.

Types with external storage include:

Strings.
All the other types supported by Tarantool: UUID, Decimal, Datetime, etc.

By default, NULL values are stored explicitly and use up the same space as any other valid column value (1, 2, 4 or 8 bytes depending on an exact field type), however RLE encoding of NULLs is also supported. For reference, RLE-encoding of a column with 90% evenly distributed NULL values reduces memory consumption of that column by around 5 times.
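The idea of run-length encoding NULLs can be illustrated with a toy model. The sketch below is generic RLE and makes no claim about the internal MemCS layout; the `NULL` sentinel and the `rle_encode_nulls` function are hypothetical names used only for this example.

```lua
-- Sentinel standing in for NULL, since Lua arrays cannot hold nil reliably.
local NULL = {}

-- Encode a column as a list of entries: consecutive NULLs collapse into
-- one {null = true, count = n} entry, other values are kept as {value = v}.
local function rle_encode_nulls(column)
    local runs = {}
    for _, v in ipairs(column) do
        local last = runs[#runs]
        if v == NULL then
            if last and last.null then
                last.count = last.count + 1
            else
                runs[#runs + 1] = {null = true, count = 1}
            end
        else
            runs[#runs + 1] = {value = v}
        end
    end
    return runs
end

-- A 100-value column where every 10th value is non-NULL (90% NULLs).
local column = {}
for i = 1, 100 do
    column[i] = (i % 10 == 0) and i or NULL
end

local runs = rle_encode_nulls(column)
print(#column, #runs)
```

In this toy model, 100 explicit slots collapse into 20 entries. The entry count is only a rough proxy for memory: the actual savings depend on how MemCS stores runs and values.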

[CE] Memtx-vinyl cross-engine transactions

Tarantool now supports mixing statements for memtx and vinyl in the same transaction, for example:

local memtx = box.schema.space.create('memtx', {engine = 'memtx'})
memtx:create_index('primary')
local vinyl = box.schema.space.create('vinyl', {engine = 'vinyl'})
vinyl:create_index('primary')

memtx:insert({1, 'a'})
vinyl:insert({2, 'b'})

box.begin()
memtx:replace(vinyl:get(2))
vinyl:replace(memtx:get(1))
box.commit()

Note

Accessing a vinyl space may trigger a fiber yield (to read a file from the disk), so MVCC must be enabled in memtx to make use of the new feature:
```
box.cfg{memtx_use_mvcc_engine = true}
```
Vinyl operations may yield implicitly, so a transaction may be aborted with TRANSACTION_CONFLICT in case of concurrent transactions.

[EE] New bootstrap strategy in failover: ‘native’

The supervised failover coordinator now supports three bootstrap strategies: native, supervised, auto.

The new native strategy relaxes the limitations of the auto strategy, but has a different under-the-hood implementation (based on the supervised strategy). Otherwise, it acts similarly to the auto strategy.

In effect, it helps resolve these two problems:

Avoid the error Some replica set members were not specified in box.cfg.replication in the following cases:
- several replicas join at the same time,
- the replica set includes non-anonymous CDC instances,
- _cluster contains old unneeded replicas.
Make the database get bootstrapped upon the coordinator’s command rather than let the instances bootstrap it on their own.

This strategy is the recommended choice for highly dynamic clusters with automatic scaling, as well as in most other cases.

To enable the native bootstrap strategy, set it in the replication section of the cluster’s configuration, together with a proper failover strategy (for native, you can choose any failover strategy you like, for example supervised):

replication:
  failover: supervised
  bootstrap_strategy: native

[CE] Runtime privileges for ‘lua_call’ granted before ‘box.cfg()’

It is now possible to grant execution privileges for Lua functions through the declarative configuration, even when the database is in read-only mode or has an outdated schema version. You might also permit guest to execute Lua functions before the initial bootstrap.

You can specify function permissions using the lua_call option in the configuration, for example:

credentials:
  users:
    alice:
      privileges:
        - permissions: [execute]
          lua_call: [my_func]

This grants the alice user permission to execute the my_func Lua function, regardless of the database’s mode or status. The special option lua_call: [all] is also supported, granting access to all global Lua functions except built-in ones, bypassing database restrictions.

Privileges will still be written to the database when possible to maintain compatibility and consistency with other privilege types.

[CE] New methods in the ‘config’ module to check instance roles

Three new methods are now available in the config module:

config:has_role('myrole') tells whether the current instance has the role myrole, and config:has_role('myrole', {instance = 'i-001'}) does the same for the specified instance (i-001).
config:is_router() tells whether the current instance is a vshard router, and config:is_router({instance = 'i-002'}) does the same for the specified instance (i-002).
config:is_storage() tells whether the current instance is a vshard storage, and config:is_storage({instance = 'i-003'}) does the same for the specified instance (i-003).

[EE] New public API: ‘config.storage_client’

Remote config.storage clusters can now be accessed by using the config.storage_client.connect(endpoints[, {options}]) method. The returned object represents a connection to a remote key-value storage accessed through the :get(), :put(), :info(), :txn() methods with the same signature as in the server config.storage API.

The config.storage_client API also has several specific methods: :is_connected(), :watch(), :reconnect(), :close().

Here are some usage examples:

-- Connect to a config.storage cluster using the endpoints
-- configured in the `config.storage` section.
--
-- You can provide endpoints as a Lua table:
--
-- local endpoints = {
--     {
--         uri = '127.0.0.1:4401',
--         login = 'sampleuser',
--         password = '123456',
--     }
-- }

local config = require('config')
local endpoints = config:get('config.storage.endpoints')
local client = config.storage_client.connect(endpoints)

-- Put a value to the connected client.
client:put('/v', 'a')

-- Get all stored values.
local values = client:get('/')

-- Clean the storage.
local response = client:delete('/')

-- Watch for key changes.
local log = require('log')
local w = client:watch('/config/main', function()
    log.info('config has been updated')
end)

-- Unregister a watcher.
w:unregister()

Tarantool 3.3

Release date: November 29, 2024

Releases on GitHub: v. 3.3.1, v. 3.3.0

The 3.3 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- Improvements around queries with offsets.
- Improvement in Raft implementation.
- Persistent replication state.
- New C API for sending work to the TX thread from user threads.
- JSON cluster configuration schema.
- New on_event callback in application roles.
- API for user-defined alerts.
- Isolated instance mode.
- Automatic instance expulsion.
- New configuration option for Lua memory size.
Enterprise Edition (EE)
- Offset-related improvements in read views.
- Supervised failover improvements.

Developing applications

Improved offset processing

Tarantool 3.3 brings a number of improvements around queries with offsets.

The performance of tree index select() with offset and count() methods was improved. Previously, the algorithm complexity had a linear dependency on the provided offset size (O(offset)) or the number of tuples to count. Now, the new algorithm complexity is O(log(size)) where size is the number of tuples in the index. This change also eliminates the dependency on the offset value or the number of tuples to count.

The index and space entities get a new offset_of method that returns the position relative to the given iterator direction of the tuple that matches the given key.

-- index: {{1}, {3}}
index:offset_of({3}, {iterator = 'eq'})  -- returns 1: [1, <3>]
index:offset_of({3}, {iterator = 'req'}) -- returns 0: [<3>, 1]
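The two results above can be reproduced with a small pure-Lua model that counts how many keys are visited before the requested one in each direction. This is only a sketch: model_offset_of is a hypothetical helper, it covers just the eq and req iterators, and it assumes that the key is present in the list.

```lua
-- Hypothetical pure-Lua model; not the Tarantool implementation.
-- sorted_keys must be in ascending order and contain the key.
local function model_offset_of(sorted_keys, key, iterator)
    local count = 0
    for _, k in ipairs(sorted_keys) do
        if iterator == 'eq' and k < key then
            count = count + 1 -- keys visited before the key, ascending
        elseif iterator == 'req' and k > key then
            count = count + 1 -- keys visited before the key, descending
        end
    end
    return count
end

local keys = { 1, 3 }
local forward = model_offset_of(keys, 3, 'eq')   -- ascending order: [1, <3>]
local backward = model_offset_of(keys, 3, 'req') -- descending order: [<3>, 1]
```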

The offset parameter has been added to the index:pairs() method, allowing you to skip the first tuples in the iterator.

The same improvements are also introduced to read views in the Enterprise Edition:

- Improved performance of the tree index read view select() with offset.
- A new offset_of() method of index read views.
- A new offset parameter in the index_read_view:pairs() method.

No rollback on timeout for synchronous transactions

To better match the canonical Raft algorithm design, Tarantool no longer rolls back synchronous transactions on timeout (upon reaching replication.synchro_timeout). In the new implementation, transactions can only be rolled back by a new leader after it is elected. Otherwise, they can wait for a quorum indefinitely.

Given this change in behavior, a new replication_synchro_timeout compat option is introduced. To try the new behavior, set this option to new:

In YAML configuration:

compat:
  replication_synchro_timeout: new

In Lua code:

tarantool> require('compat').replication_synchro_timeout = 'new'
---
...

There is also a new replication.synchro_queue_max_size configuration option that limits the total size of transactions in the master synchronous queue. The default value is 16 megabytes.

C API for sending work to TX thread

New public C API functions tnt_tx_push() and tnt_tx_flush() allow sending work to the TX thread from any other thread:

tnt_tx_push() schedules the given callback to be executed with the provided arguments.
tnt_tx_flush() sends all pending callbacks for execution in the TX thread. Execution is started in the same order as the callbacks were pushed.

JSON schema of the cluster configuration

Tarantool cluster configuration schema is now available in the JSON format. A schema lists configuration options of a certain Tarantool version with descriptions. As of Tarantool 3.3 release date, the following versions are available:

Additionally, there is the latest schema that reflects the latest configuration schema in development (master branch).

Use these schemas to add code completion for YAML configuration files and get hints with option descriptions in your IDE, or validate your configurations, for example, with check-jsonschema:

$ check-jsonschema --schemafile https://download.tarantool.org/tarantool/schema/config.schema.3.3.0.json config.yaml

There is also a new API for generating the JSON configuration schema as a Lua table – the config:jsonschema() function.

on_event callbacks in roles

Now application roles can have on_event callbacks. They are executed every time a box.status system event is broadcast or the configuration is updated. The callback has three arguments:

config – the current configuration.
key – an event that has triggered the callback: config.apply or box.status.
value – the value of the box.status system event.

Example:

return {
    name = 'my_role',
    validate = function() end,
    apply = function() end,
    stop = function() end,
    on_event = function(config, key, value)
        local log = require('log')

        log.info('on_event is triggered by ' .. key)
        log.info('is_ro: ' .. tostring(value.is_ro))
        log.info('roles_cfg.my_role.foo: ' .. config.foo)
    end,
}

API for raising alerts

Now developers can raise their own alerts from their application or application roles. For this purpose, a new API is introduced into the config module.

The config:new_alerts_namespace() function creates a new alerts namespace – a named container for user-defined alerts:

local config = require('config')
local alerts = config:new_alerts_namespace('my_alerts')

Alerts namespaces provide methods for managing alerts within them. All user-defined alerts raised in all namespaces are shown in box.info.config.alerts.

To raise an alert, use the namespace methods add() or set(). The difference between them is that set() accepts a key that lets you refer to the alert later to overwrite or discard it. An alert is a table with one mandatory field, message (its value is logged), and arbitrary user-defined fields.

-- Raise a new alert.
alerts:add({
    message = 'Test alert',
    my_field = 'my_value',
})

-- Raise a new alert with a key.
alerts:set("my_alert", {
    message = 'Test alert',
    my_field = 'my_value',
})

You can discard alerts individually by keys using the unset() method, or all at once using clear():

alerts:unset("my_alert")
alerts:clear()

Administration and maintenance

DDL before upgrade

Since version 3.3, Tarantool allows DDL operations before calling box.schema.upgrade() during an upgrade if the source schema version is 2.11.1 or later. This allows, for example, granting execute access to user-defined functions in the cluster configuration before the schema is upgraded.

Isolated instances

A new instance-level configuration option isolated puts an instance into the isolated mode. In this mode, an instance doesn’t accept updates from other members of its replica set and other iproto requests. It also performs no background data modifications and remains in read-only mode.

groups:
  group-001:
    replicasets:
      replicaset-001:
        instances:
          instance-001: {}
          instance-002: {}
          instance-003:
            isolated: true

Use the isolated mode to temporarily isolate instances for maintenance, debugging, or other actions that should not affect other cluster instances.

Automatic expulsion of removed instances

A new configuration section replication.autoexpel lets you automatically expel instances after they are removed from the YAML configuration.

replication:
  autoexpel:
    enabled: true
    by: prefix
    prefix: '{{ replicaset_name }}'

The section includes three options:

enabled: whether automatic expulsion logic is enabled in the cluster.
by: a criterion for selecting instances that can be expelled automatically. In version 3.3, the only available criterion is prefix.
prefix: a prefix with which an instance name should start to make automatic expulsion possible.
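The prefix criterion described above can be pictured with the following pure-Lua sketch. It is a simplified model and not the actual expulsion logic: expel_candidates and the instance names are hypothetical, and the prefix is written out as the already expanded replica set name.

```lua
-- Plain prefix check. string.find with a pattern is avoided on purpose
-- because '-' is a magic character in Lua patterns.
local function has_prefix(name, prefix)
    return name:sub(1, #prefix) == prefix
end

-- registered: names of instances currently registered in the replica set.
-- configured: set of instance names still present in the YAML configuration.
local function expel_candidates(registered, configured, prefix)
    local result = {}
    for _, name in ipairs(registered) do
        if not configured[name] and has_prefix(name, prefix) then
            result[#result + 1] = name
        end
    end
    return result
end

local registered = { 'replicaset-001-a', 'replicaset-001-b', 'legacy-node' }
local configured = { ['replicaset-001-a'] = true }
local candidates = expel_candidates(registered, configured, 'replicaset-001')
```

In this model, legacy-node is left alone even though it is missing from the configuration, because its name does not start with the prefix.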

Lua memory size

A new configuration option lua.memory specifies the maximum amount of memory for Lua script execution, in bytes. For example, this configuration sets the Lua memory limit to 4 GB:

lua:
  memory: 4294967296

The default limit is 2 GB.
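Because the option takes a value in bytes, it can help to compute the number instead of typing it by hand. The sketch below assumes binary gigabytes (1 GB = 1024 * 1024 * 1024 bytes), which matches the value 4294967296 used above; gib_to_bytes is a hypothetical helper.

```lua
-- Convert a size in binary gigabytes to bytes.
local function gib_to_bytes(gib)
    return gib * 1024 * 1024 * 1024
end

local four_gb = gib_to_bytes(4)    -- the limit set in the example above
local default_gb = gib_to_bytes(2) -- the default limit expressed in bytes
```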

Supervised failover improvements

Tarantool 3.3 brings a number of supervised failover improvements:

Support for Tarantool-based stateboard as an alternative to etcd.
Instance priority configuration: a new failover.priority configuration section. This section specifies the relative order in which a coordinator appoints instances: bigger values mean higher priority.
```
failover:
  replicasets:
    replicaset-001:
      priority:
        instance-001: 5
        instance-002: -5
        instance-003: 4
```
Additionally, there is a failover.learners section that lists instances that should never be appointed as replica set leaders:
```
failover:
  replicasets:
    replicaset-001:
      learners:
        - instance-004
        - instance-005
```
Automatic failover configuration update.

Failover logging configuration with new configuration options failover.log.to and failover.log.file:

failover:
  log:
    to: file # or stderr
    file: var/log/tarantool/failover.log
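The effect of the priority and learners settings described above can be modeled with the pure-Lua sketch below. It is a deliberately simplified picture, not the coordinator's algorithm: pick_leader is a hypothetical function, treating a missing priority as 0 is an assumption, and ties are broken by name only to keep the result deterministic.

```lua
-- available: array of instance names that could be appointed right now.
-- priority: instance name -> number, bigger values mean higher priority.
-- learners: array of instance names that must never become leaders.
local function pick_leader(available, priority, learners)
    local is_learner = {}
    for _, name in ipairs(learners) do
        is_learner[name] = true
    end
    local best, best_priority
    for _, name in ipairs(available) do
        if not is_learner[name] then
            local p = priority[name] or 0 -- assumption: missing priority is 0
            if best == nil or p > best_priority
                    or (p == best_priority and name < best) then
                best, best_priority = name, p
            end
        end
    end
    return best
end

local priority = {
    ['instance-001'] = 5,
    ['instance-002'] = -5,
    ['instance-003'] = 4,
}
local learners = { 'instance-004', 'instance-005' }
```

With these values, instance-001 is preferred while it is available, instance-003 comes next, and the learners are never chosen, even when their priority would otherwise win.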

Learn more about supervised failover in Supervised failover.

Persistent replication state

Tarantool's persistence mechanism uses two types of files: snapshots and write-ahead log (WAL) files. These files are also used for replication: read-only replicas receive data changes from the replica set leader by reading these files.

The garbage collector cleans up obsolete snapshots and WAL files, but it doesn’t remove the files while they are in use for replication. To make this check possible, replica set leaders keep the replication state associated with these files. However, this information was not persisted, which could lead to issues if the leader restarted: the garbage collector could delete WAL files after the restart even if there were replicas that still read these files. The wal.cleanup_delay configuration option was used to prevent such situations.

Since version 3.3, leader instances persist the information about WAL files in use in a new system space _gc_consumers. After a restart, the replication state is restored, and WAL files needed for replication are protected from garbage collection. This eliminates the need to keep all WAL files after a restart, so the wal.cleanup_delay option is now deprecated.

Tarantool 3.2

Release date: August 26, 2024

Releases on GitHub: v. 3.2.1, v. 3.2.0

The 3.2 release of Tarantool adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- A new experimental module for validating role configurations.
- Initial support for encoding structured data using Protobuf.
- Next and Previous prefix iterators.
- Support for all UUID versions.
- Automatic loading of the most often used built-in modules into the console environment.
Enterprise Edition (EE)
- Time-to-live (TTL) for keys in a Tarantool-based configuration storage.

Developing applications

Configuration validation

Tarantool 3.2 includes a new experimental module for validating role configurations using a declarative schema. For example, you can validate the type of configuration values, provide an array of allowed values, or specify a custom validation function.

Suppose a sample ‘http-api’ custom role can accept the host and port configuration values:

roles: [ http-api ]
roles_cfg:
  http-api:
    host: '127.0.0.1'
    port: 8080

First, you need to load the experimental.config.utils.schema module:

local schema = require('experimental.config.utils.schema')

The validate_port() function can be used to check that a port value is between 1 and 65535:

local function validate_port(port, w)
    if port < 1 or port > 65535 then
        w.error("'port' should be between 1 and 65535, got %d", port)
    end
end

Then, you can create a schema used for validation:

host should be one of the specified string values.
port should be a number that is checked using the validate_port() function declared above.

local listen_address_schema = schema.new('listen_address', schema.record({
    host = schema.enum({ '127.0.0.1', '0.0.0.0' }),
    port = schema.scalar({
        type = 'integer',
        validate = validate_port,
    }),
}))

Finally, you can pass the specified schema to the validate() role’s function:

local function validate(cfg)
    if cfg.host and cfg.port then
        listen_address_schema:validate(cfg)
    else
        error("You need to set both host and port values")
    end
end
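The port check can be exercised outside Tarantool with a stand-in for the w argument. This is a sketch under two assumptions: that both 1 and 65535 are meant to be accepted, and that w.error() formats a message and raises an error. The stub w and the helper is_valid are hypothetical and are not part of the schema module.

```lua
-- Stand-in for the second argument passed to a schema validator.
-- Assumption: w.error() formats a message and raises an error.
local w = {
    error = function(fmt, ...)
        error(string.format(fmt, ...), 0)
    end,
}

-- Accept ports in the inclusive range 1..65535.
local function validate_port(port, w)
    if port < 1 or port > 65535 then
        w.error("'port' should be between 1 and 65535, got %d", port)
    end
end

-- Return true if validation passes and false if it raises an error.
local function is_valid(port)
    return (pcall(validate_port, port, w))
end
```

Checking the boundary values (0, 1, 65535, and 65536) is a quick way to confirm that the comparison operators match the range named in the error message.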

Protobuf encoder

The 3.2 release adds initial support for encoding structured data using Protocol buffers. First, you need to load the protobuf module:

local protobuf = require('protobuf')

To encode data, you need to define a protocol:

local customer_protocol = protobuf.protocol({
    -- Define a message and enum --
})

The two main components of the protocol are messages and enums:

A message specifies the structure of data, in particular, the fields and their types.
An enum defines a set of enumerated constants within the message.

To create a message and enum, use the message() and enum() functions, respectively:

local customer_protocol = protobuf.protocol({
    protobuf.message('Customer', {
        id = { 'int32', 1 },
        firstName = { 'string', 2 },
        lastName = { 'string', 3 },
        customerType = { 'CustomerType', 4 }
    }),
    protobuf.enum('CustomerType', {
        active = 0,
        inactive = 1,
    })
})

Once the protocol is specified, use the encode() method to encode data:

local sample_customer = customer_protocol:encode(
    'Customer',
    {
        id = 3,
        firstName = 'Andrew',
        lastName = 'Fuller',
        customerType = 1
    }
)
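To show what the encoded bytes can look like, the pure-Lua sketch below hand-encodes the same customer using the standard Protocol Buffers wire format (varints and length-delimited strings). It does not use the protobuf module: varint, tag, encode_int, encode_string, and to_hex are hypothetical helpers, only non-negative integers are handled, and fields are emitted in field-number order, which is an assumption about ordering and not a statement about the module's output.

```lua
-- Encode a non-negative integer as a Protocol Buffers varint:
-- 7 bits per byte, with the high bit set on all bytes except the last.
local function varint(n)
    local bytes = {}
    repeat
        local low = n % 128
        n = math.floor(n / 128)
        if n > 0 then
            low = low + 128
        end
        bytes[#bytes + 1] = string.char(low)
    until n == 0
    return table.concat(bytes)
end

-- Wire types defined by the Protocol Buffers encoding.
local WIRE_VARINT = 0
local WIRE_LEN = 2

-- A field tag is (field_number << 3) | wire_type, encoded as a varint.
local function tag(field_number, wire_type)
    return varint(field_number * 8 + wire_type)
end

local function encode_int(field_number, value)
    return tag(field_number, WIRE_VARINT) .. varint(value)
end

local function encode_string(field_number, value)
    return tag(field_number, WIRE_LEN) .. varint(#value) .. value
end

local function to_hex(bytes)
    return (bytes:gsub('.', function(c)
        return string.format('%02x', c:byte())
    end))
end

local encoded = encode_int(1, 3)     -- id = 3
    .. encode_string(2, 'Andrew')    -- firstName
    .. encode_string(3, 'Fuller')    -- lastName
    .. encode_int(4, 1)              -- customerType = inactive
```

Enum values travel as varints on the wire, which is why customerType is encoded the same way as id.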

Next and Previous prefix iterators

This release adds two new iterators for TREE indexes: np (next prefix) and pp (previous prefix). If a key is a string value, a prefix is a common starting substring shared by multiple keys.

Suppose the products space contains the following values:

application:instance001> box.space.products:select()
---
- - ['clothing_pants']
  - ['clothing_shirt']
  - ['electronics_laptop']
  - ['electronics_phone']
  - ['electronics_tv']
  - ['furniture_chair']
  - ['furniture_sofa']
  - ['furniture_table']
...

If you use the np iterator type and set the key value to electronics, the output should look as follows:

application:instance001> box.space.products:select({ 'electronics' }, { iterator = 'np' })
---
- - ['furniture_chair']
  - ['furniture_sofa']
  - ['furniture_table']
...

Similarly, you can use the pp iterator:

application:instance001> box.space.products:select({ 'electronics' }, { iterator = 'pp' })
---
- - ['clothing_shirt']
  - ['clothing_pants']
...

Note that the new iterators work only for the memtx engine.
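The behavior of the np and pp iterators shown above can be approximated with a pure-Lua model over a sorted list of string keys. This sketch is inferred from the sample outputs and is not the index implementation; select_np and select_pp are hypothetical names, and the comparison relies on plain byte-order string comparison.

```lua
local function starts_with(s, prefix)
    return s:sub(1, #prefix) == prefix
end

-- Next prefix: keys that sort after every key starting with the prefix,
-- returned in ascending order.
local function select_np(sorted_keys, prefix)
    local result = {}
    for _, key in ipairs(sorted_keys) do
        if key > prefix and not starts_with(key, prefix) then
            result[#result + 1] = key
        end
    end
    return result
end

-- Previous prefix: keys that sort before the prefix, returned in
-- descending order. A key starting with the prefix is never smaller
-- than the prefix itself, so no extra check is needed.
local function select_pp(sorted_keys, prefix)
    local result = {}
    for i = #sorted_keys, 1, -1 do
        if sorted_keys[i] < prefix then
            result[#result + 1] = sorted_keys[i]
        end
    end
    return result
end

local products = {
    'clothing_pants', 'clothing_shirt',
    'electronics_laptop', 'electronics_phone', 'electronics_tv',
    'furniture_chair', 'furniture_sofa', 'furniture_table',
}
local next_keys = select_np(products, 'electronics')
local prev_keys = select_pp(products, 'electronics')
```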

Tarantool configuration storage: TTL support for keys (EE)

The Enterprise Edition now includes a time-to-live (TTL) for keys in a Tarantool-based configuration storage. You can specify a TTL value in the config.storage.put() call as follows:

config.storage.put('/foo/bar', 'v1', { ttl = 60 })

Similarly, you can configure TTL in config.storage.txn():

config.storage.txn({
    predicates = { { 'revision', '==', revision } },
    on_success = { { 'put', '/foo/bar', 'v1', { ttl = 60 } } }
})

A new config.storage.info.features.ttl field allows you to check whether the current version of the configuration storage supports requests with TTL. In the example below, the conn:call() method is used to make a remote call to get the ttl field value:

local info = conn:call('config.storage.info')
if info.features == nil or not info.features.ttl then
    error('...')
end

Support for all UUID versions

Before the 3.2 version, Tarantool supported only UUIDs following the rules for RFC 4122 version 4. With v3.2, UUID values of all versions (including new 6, 7, and 8) can be parsed using the uuid module. This improves interoperability with third-party data sources whose data is processed by Tarantool.
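The version of a UUID is stored in the first hexadecimal digit of its third group. The pure-Lua sketch below reads that digit from a string in the canonical 8-4-4-4-12 form; uuid_version is a hypothetical helper that does not use the uuid module, and the sample values are for illustration only.

```lua
-- Return the version number of a UUID given in the canonical
-- 8-4-4-4-12 hexadecimal form, or nil if the string has another shape.
local function uuid_version(s)
    local h = '%x'
    local pattern = '^' .. h:rep(8) .. '%-' .. h:rep(4) .. '%-(%x)' .. h:rep(3)
        .. '%-' .. h:rep(4) .. '%-' .. h:rep(12) .. '$'
    local version = s:match(pattern)
    return version and tonumber(version, 16) or nil
end

local v4 = uuid_version('550e8400-e29b-41d4-a716-446655440000')
local v7 = uuid_version('017f22e2-79b0-7cc3-98c4-dc0c0c07398f')
```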

Administration and maintenance

Interactive console

With this release, both the Tarantool and tt interactive consoles automatically add the most often used built-in modules into the environment. This means that you can start using a module without loading it with the require directive.

In the interactive session below, the config module is used to get the instance’s configuration state right after connecting to this instance:

application:instance001> config:info('v2')
---
- status: ready
  meta:
    last: &0 []
    active: *0
  alerts: []
...

To enable this new behavior, you need to set the console_session_scope_vars compat option value to new:

compat:
  console_session_scope_vars: 'new'

Observability

The 3.2 release adds the following improvements related to observability:

A new box.info.config field allows you to access an instance’s configuration status.
box.info.synchro.queue now includes the age and confirm_lag fields:
- age – shows how much time the oldest entry in the queue has spent waiting for the quorum.
- confirm_lag – shows how much time the latest successfully confirmed entry has waited for the quorum to gather.
New metrics are added:
- tnt_memtx_tuples_data_total
- tnt_memtx_tuples_data_read_view
- tnt_memtx_tuples_data_garbage
- tnt_memtx_index_total
- tnt_memtx_index_read_view
- tnt_vinyl_memory_tuple
- tnt_config_alerts
- tnt_config_status

EOS versions

This section contains information about Tarantool 3.x versions that have reached their end of life and end of support dates in accordance with the Tarantool release policy and no longer receive updates, fixes, or technical support.

Series	First release date	End of life	End of support	Versions
3.1	April 16, 2024	August 26, 2024	August 26, 2024	v. 3.1.2 v. 3.1.1 v. 3.1.0
3.0	December 26, 2023	April 17, 2024	April 17, 2024	v. 3.0.2 v. 3.0.1 v. 3.0.0

Tarantool 3.1

Release date: April 16, 2024

Releases on GitHub: v. 3.1.2, v. 3.1.1, v. 3.1.0

The 3.1 release of Tarantool continues the development of a new cluster configuration approach introduced in the 3.0 version and adds the following main product features and improvements for the Community and Enterprise editions:

Community Edition (CE)
- Improved developer experience for handling errors using the box.error module.
- Introduced fixed-size numeric field types: uint8, int8, uint16, and more.
- Added RPC functionality for accessing custom roles from the configuration.
- Made the tt utility used to manage instances fully compatible with the latest Tarantool version.
Enterprise Edition (EE)
- Introduced an external coordinator for automatic and manual failover.
- Improved the stability of work with the centralized configuration stored in etcd.

Developing applications

Error handling

This release improves the developer experience for handling errors using the box.error module. Below are listed the most notable features and changes.

Error payload fields

With the 3.1 release, you can add a custom payload to an error. The payload is passed as key-value pairs where a key is a string and a value is any Lua object. In the example below, the description key is used to keep the custom payload.

custom_error = box.error.new({ type = 'CustomInternalError',
                               message = 'Internal server error',
                               description = 'Some error details'  -- payload
})

A payload field value can be accessed using the dot syntax:

tarantool> custom_error.description
---
- Some error details
...

Error stacks

The 3.1 release simplifies creating error chains. In earlier versions, you needed to set an error cause using the set_prev(error_object) method, for example:

local ok, err = pcall(my_func)
if not ok then
    local err2 = box.error.new{type = "MyAppError", message = "my_func failed"}
    err2:set_prev(err)
    err2:raise()
end

Using this approach, you need to construct a new error without raising it, then set its cause using set_prev(), and only then raise it. Starting with the 3.1 version, you can use a new prev argument when constructing an error:

local ok, err = pcall(my_func)
if not ok then
    box.error{type = "MyAppError", message = "my_func failed", prev = err}
end

Error serialization improvements

The 3.1 release allows you to increase the verbosity of error serialization. Before the 3.1 release, a serialized error representation included only an error message:

tarantool> box.error.new({ type = 'CustomInternalError', message = 'Internal server error'})
---
- Internal server error
...

Starting with the 3.1 version, a serialized error also includes other fields that might be useful for analyzing errors:

tarantool> box.error.new({ type = 'CustomInternalError', message = 'Internal server error'})
---
- code: 0
  base_type: CustomError
  type: CustomInternalError
  custom_type: CustomInternalError
  message: Internal server error
  trace:
  - file: '[C]'
    line: 4294967295
...

Logging an error using a built-in logging module prints an error message followed by a tab space (\t) and all the payload fields serialized as a JSON map, for example:

main/104/app.lua/tarantool I> Internal server error {"code":0,"base_type":"CustomError","type":"CustomInternalError", ... }

Because this change may affect the behavior of existing code, a new box_error_serialize_verbose compat option is introduced. To try out the increased verbosity of error serialization, set this option to new:

tarantool> require('compat').box_error_serialize_verbose = 'new'
---
...

Fixed-size numeric types

The 3.1 release introduces fixed-size numeric types that might be useful to store data unencoded in an array for effective scanning. The following numeric types are added:

uint8: an integer in a range [0 .. 255].
int8: an integer in a range [-128 .. 127].
uint16: an integer in a range [0 .. 65,535].
int16: an integer in a range [-32,768 .. 32,767].
uint32: an integer in a range [0 .. 4,294,967,295].
int32: an integer in a range [-2,147,483,648 .. 2,147,483,647].
uint64: an integer in a range [0 .. 18,446,744,073,709,551,615].
int64: an integer in a range [-9,223,372,036,854,775,808 .. 9,223,372,036,854,775,807].
float32: a 32-bit floating point number.
float64: a 64-bit floating point number.
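The integer ranges listed above follow directly from the bit width: an unsigned n-bit type covers 0 to 2^n - 1, and a signed one covers -2^(n-1) to 2^(n-1) - 1. The sketch below computes these bounds in pure Lua; int_range is a hypothetical helper, and it is meant for widths up to 32 bits because double-precision numbers cannot represent the 64-bit bounds exactly.

```lua
-- Return the minimum and maximum values of an integer type
-- with the given bit width (use widths of 32 bits or less).
local function int_range(bits, signed)
    if signed then
        return -(2 ^ (bits - 1)), 2 ^ (bits - 1) - 1
    end
    return 0, 2 ^ bits - 1
end

local int8_min, int8_max = int_range(8, true)
local uint16_min, uint16_max = int_range(16, false)
local int32_min, int32_max = int_range(32, true)
```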

Experimental ‘connpool’ module

A new experimental.connpool module provides a set of features for remote connections to any cluster instance or executing remote procedure calls on an instance that meets the specified criteria. To load the experimental.connpool module, use the require() directive:

sharded_cluster:router-a-001> connpool = require('experimental.connpool')
---
...

In the 3.1 version, this module provides the following API:

The connect() function accepts an instance name and returns the active connection to this instance:

sharded_cluster:router-a-001> conn = connpool.connect("storage-b-002")
---
...

Once you have a connection, you can execute requests on a remote instance, for example, select data from a space:

sharded_cluster:router-a-001> conn.space.bands:select({}, { limit = 5 })
---
- - [3, 804, 'Ace of Base', 1987]
  - [7, 693, 'The Doors', 1965]
  - [9, 644, 'Led Zeppelin', 1968]
  - [10, 569, 'Queen', 1970]
...

The filter() function returns the names of instances that match the specified conditions. In the example below, this function returns a list of instances with the storage role and specified label value:
```
sharded_cluster:router-a-001> connpool.filter({ roles = { 'storage' }, labels = { dc = 'east' }})
---
- - storage-b-002
  - storage-a-002
...
```

The call() function can be used to execute a function on a remote instance. In the example below, the following conditions are specified to choose an instance to execute the vshard.storage.buckets_count function on:

An instance has the storage role.
An instance has the dc label set to west.
An instance is writable.

sharded_cluster:router-a-001> connpool.call('vshard.storage.buckets_count', nil, { roles = { 'storage' }, labels = { dc = 'west' }, mode = 'rw' })
---
- 500
...

Learn more in the experimental.connpool module reference.

Accessing configuration of other cluster members

In Tarantool 3.0, the config module provides the ability to work with a current instance’s configuration only. Starting with the 3.1 version, you can get all the instances that constitute a cluster and obtain the configuration of any instance of this cluster.

The config:instances() function lists all instances of the cluster:

sharded_cluster:router-a-001> require('config'):instances()
---
- storage-a-001:
    group_name: storages
    instance_name: storage-a-001
    replicaset_name: storage-a
  storage-b-002:
    group_name: storages
    instance_name: storage-b-002
    replicaset_name: storage-b
  router-a-001:
    group_name: routers
    instance_name: router-a-001
    replicaset_name: router-a
  storage-a-002:
    group_name: storages
    instance_name: storage-a-002
    replicaset_name: storage-a
  storage-b-001:
    group_name: storages
    instance_name: storage-b-001
    replicaset_name: storage-b
...

To get the specified configuration value for a certain instance, pass an instance name as an argument to config:get():

sharded_cluster:router-a-001> require('config'):get('iproto', {instance = 'storage-b-001'})
---
- readahead: 16320
  net_msg_max: 768
  listen:
  - uri: 127.0.0.1:3304
  threads: 1
  advertise:
    peer:
      login: replicator
    client: null
    sharding:
      login: storage
...

Administration and maintenance

Failover coordinator (EE)

Tarantool Enterprise Edition 3.1 introduces an external failover coordinator that monitors a Tarantool cluster and performs automatic leadership change if a current replica set leader is inaccessible.

A failover coordinator requires the replication.failover configuration option to be set to supervised:

replication:
  failover: supervised

# ...

To start a failover coordinator, execute the tarantool command with the failover option and pass a path to a YAML configuration file:

$ tarantool --failover --config /path/to/config

A failover coordinator connects to all the instances, polls them for their status, and ensures that each replica set with replication.failover set to supervised has only one writable instance.

Optionally, you can configure failover timeouts and other parameters in the failover section at the global level:

failover:
  call_timeout: 1
  lease_interval: 15
  renew_interval: 5
  stateboard:
    renew_interval: 1
    keepalive_interval: 5

Sharding

The 3.1 release includes new sharding options that provide additional flexibility for configuring a sharded cluster. A new sharding.weight option specifies the relative amount of data that a replica set can store. In the example below, the storage-a replica set can store twice as much data as storage-b:

# ...
replicasets:
  storage-a:
    sharding:
      weight: 2
    # ...
  storage-b:
    sharding:
      weight: 1
    # ...
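As a rough illustration of what these weights mean, the pure-Lua sketch below splits a number of buckets in proportion to the weights. It only shows the target proportion and does not describe how the rebalancer works; ideal_bucket_counts is a hypothetical helper, and the total of 3000 buckets is an assumed value chosen for the example.

```lua
-- Split total_buckets between replica sets in proportion to their weights.
local function ideal_bucket_counts(weights, total_buckets)
    local total_weight = 0
    for _, weight in pairs(weights) do
        total_weight = total_weight + weight
    end
    local counts = {}
    for name, weight in pairs(weights) do
        counts[name] = total_buckets * weight / total_weight
    end
    return counts
end

local counts = ideal_bucket_counts(
    { ['storage-a'] = 2, ['storage-b'] = 1 },
    3000 -- assumed total number of buckets
)
```

With weights 2 and 1, storage-a is targeted to hold two thirds of the buckets and storage-b one third.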

The sharding.rebalancer_mode option configures whether a rebalancer is selected manually or automatically. This option can have one of three values:

auto (default): if there are no replica sets with the rebalancer sharding role (sharding.roles), a replica set with the rebalancer will be selected automatically among all replica sets.
manual: one of the replica sets should have the rebalancer sharding role. The rebalancer will be in this replica set.
off: rebalancing is turned off regardless of whether a replica set with the rebalancer sharding role exists or not.

Compatibility with the tt utility

With this release, the tarantoolctl utility used to administer Tarantool instances is completely removed from Tarantool packages. The latest version of the tt utility is fully compatible with Tarantool 3.1 and covers all the required functionality:

Setting up a development environment: initializing the environment and installing different Tarantool versions.
Various capabilities for developing cluster applications: creating applications from templates, managing modules, and building and packaging applications.
Managing cluster instances: starting and stopping instances, connecting to remote instances for administration, and so on.
Importing and exporting data (Enterprise Edition only).

Learn how to migrate from tarantoolctl to tt in the Migration from tarantoolctl to tt section.

Tarantool 3.0

Release date: December 26, 2023

Releases on GitHub: v. 3.0.2, v. 3.0.1, v. 3.0.0

The 3.0 release of Tarantool introduces a new declarative approach for configuring a cluster, a new visual tool – Tarantool Cluster Manager, and many other new features and fixes. This document provides an overview of the most important features for the Community and Enterprise editions.

New declarative configuration
Tarantool Cluster Manager
Administration and maintenance
Developing applications
Stability

New declarative configuration

Starting with the 3.0 version, Tarantool provides the ability to configure the full topology of a cluster using a declarative YAML configuration instead of configuring each instance using a dedicated Lua script. With the new approach, you can write a local configuration in a YAML file for each instance or store configuration data in one reliable place, for example, a Tarantool or an etcd cluster.

The example below shows how a configuration of a small sharded cluster might look. The cluster includes 5 instances: one router and 4 storages, which constitute two replica sets. For each replica set, the master instance is specified manually.

The example below demonstrates how a topology of such a cluster might look in a YAML configuration file:

groups:
  storages:
    app:
      module: storage
    sharding:
      roles: [storage]
    replication:
      failover: manual
    replicasets:
      storage-a:
        leader: storage-a-001
        instances:
          storage-a-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          storage-a-002:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'
      storage-b:
        leader: storage-b-001
        instances:
          storage-b-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3304'
          storage-b-002:
            iproto:
              listen:
              - uri: '127.0.0.1:3305'
  routers:
    app:
      module: router
    sharding:
      roles: [router]
    replicasets:
      router-a:
        instances:
          router-a-001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'

You can find the full sample in the GitHub documentation repository: sharded_cluster.

The latest version of the tt utility provides the ability to manage Tarantool instances configured using the new approach. You can start all instances in a cluster by executing one command, check the status of instances, or stop them:

$ tt start sharded_cluster
   • Starting an instance [sharded_cluster:storage-a-001]...
   • Starting an instance [sharded_cluster:storage-a-002]...
   • Starting an instance [sharded_cluster:storage-b-001]...
   • Starting an instance [sharded_cluster:storage-b-002]...
   • Starting an instance [sharded_cluster:router-a-001]...

Centralized configuration (EE)

Tarantool Enterprise Edition enables you to store configuration data in one reliable place, for example, an etcd cluster. To achieve this, you need to configure connection options in the config.etcd section of the configuration file, for example:

config:
  etcd:
    endpoints:
    - http://localhost:2379
    prefix: /myapp
    username: sampleuser
    password: '123456'

Using the configuration above, a Tarantool instance searches for a cluster configuration by the following path:

http://localhost:2379/myapp/config/*

Tarantool Cluster Manager (EE)

Tarantool 3.0 Enterprise Edition comes with a brand new visual tool – Tarantool Cluster Manager (TCM). It provides a web-based user interface for managing, configuring, and monitoring Tarantool EE clusters that use centralized configuration storage.

TCM can manage multiple clusters and covers a wide range of tasks, from writing a cluster’s configuration to executing commands interactively on specific instances.

TCM’s role-based access control system lets you manage users’ access to clusters, their configurations, and stored data.

The built-in customizable audit logging mechanism and LDAP authentication make TCM a suitable solution for different enterprise security requirements.

Administration and maintenance

Database statistics

Starting with 3.0, Tarantool provides extended statistics about memory consumption for the given space or specific tuples.

Usually, the space_object:bsize() method is used to get the size of memory occupied by the specified space:

app:instance001> box.space.books:bsize()
---
- 70348673
...

In addition to the actual data, the space requires additional memory to store supplementary information. You can see the total memory usage using box.slab.info():

app:instance001> box.slab.info().items_used
---
- 75302024
...

A new space_object:stat() method allows you to determine how the additional 5 MB of memory is used:

app:instance001> box.space.books:stat()
---
- tuple:
    memtx:
      waste_size: 1744011
      data_size: 70348673
      header_size: 2154132
      field_map_size: 0
    malloc:
      waste_size: 0
      data_size: 0
      header_size: 0
      field_map_size: 0
...

The above report gives the following information:

header_size and field_map_size: the size of service information.
data_size: the actual size of data, which equals space_object:bsize().
waste_size: the size of memory wasted due to internal fragmentation in the slab allocator.
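
As a quick sanity check of the figures above, the following Python sketch adds up the components reported by stat() and compares them with items_used. It is plain arithmetic on the numbers from the console output, not a Tarantool API, and the variable names are illustrative.

```python
# Figures copied from the console output above (all in bytes).
bsize = 70_348_673        # box.space.books:bsize()
items_used = 75_302_024   # box.slab.info().items_used
memtx = {
    "data_size": 70_348_673,
    "header_size": 2_154_132,
    "field_map_size": 0,
    "waste_size": 1_744_011,
}

# Memory used beyond the raw tuple data: the "additional 5 MB".
overhead = items_used - bsize

# Part of the overhead that stat() attributes to this space:
# service information plus slab allocator waste.
service = memtx["header_size"] + memtx["field_map_size"]
attributed = service + memtx["waste_size"]

# Part of the overhead that this report does not attribute.
unattributed = overhead - attributed

print(f"overhead:     {overhead} bytes")
print(f"attributed:   {attributed} bytes")
print(f"unattributed: {unattributed} bytes")
```

Here the overhead is 4,953,351 bytes, of which stat() attributes 3,898,143 bytes to headers and allocator waste. Since items_used is an instance-wide figure, the remainder presumably belongs to tuples outside the books space; that reading is an assumption rather than something the report states.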

To get such information about a specific tuple, use tuple_object:info():

app:instance001> box.space.books:get('1853260622'):info()
---
- data_size: 277
  waste_size: 9
  arena: memtx
  field_map_size: 0
  header_size: 10
...

Bootstrapping a replica set

The new version includes the capability to choose a bootstrap leader for a replica set manually. The bootstrap leader is a node that creates an initial snapshot and registers all the replicas in a replica set.

First, you need to set replication.bootstrap_strategy to config. Then, use the <replicaset_name>.bootstrap_leader option to specify a bootstrap leader.

groups:
  group001:
    replicasets:
      replicaset001:
        replication:
          bootstrap_strategy: config
        bootstrap_leader: instance001
        instances:
          instance001:
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            iproto:
              listen:
              - uri: '127.0.0.1:3302'
          instance003:
            iproto:
              listen:
              - uri: '127.0.0.1:3303'

Note

Note that in 3.0, the replication_connect_quorum option is removed. This option was used to specify the number of nodes to be up and running for starting a replica set.

Security (EE)

With the 3.0 version, Tarantool Enterprise Edition provides a set of new features that enhance security in your cluster:

Introduced the secure_erasing configuration option that forces Tarantool to overwrite a data file a few times before deletion to render recovery of a deleted file impossible. With the new configuration approach, you can enable this capability as follows:
```
security:
  secure_erasing: true
```
This option can also be set using the TT_SECURITY_SECURE_ERASING environment variable.
Added the auth_retries option that configures the maximum number of authentication retries before throttling is enabled. You can configure this option as follows:
```
security:
  auth_retries: 3
```
Added the capability to use the new SSL certificate with the same name by reloading the configuration. To do this, use the reload() function provided by the new config module:
```
app:instance001> require('config'):reload()
---
...
```

Audit logging (EE)

Tarantool Enterprise Edition includes the following new features for audit logging:

Added a unique identifier (UUID) to each audit log entry.
Introduced audit log severity levels. Each system audit event now has a severity level determined by its importance.
Added the audit_log.spaces option that configures the list of spaces for which data operation events should be logged.
Added the audit_log.extract_key option that forces the audit subsystem to log the primary key instead of a full tuple in DML operations. This might be useful for reducing audit log size in the case of large tuples.

A sample audit log configuration in the 3.0 version, including the new spaces and extract_key options, might look as follows:

audit_log:
  to: file
  file: audit_tarantool.log
  filter: [ddl,dml]
  spaces: [books]
  extract_key: true

With this configuration, an audit log entry for a DELETE operation may look as follows:

{
  "time": "2023-12-19T10:09:44.664+0000",
  "uuid": "65901190-f8a6-45c1-b3a4-1a11cf5c7355",
  "severity": "VERBOSE",
  "remote": "unix/:(socket)",
  "session_type": "console",
  "module": "tarantool",
  "user": "admin",
  "type": "space_delete",
  "tag": "",
  "description": "Delete key [\"0671623249\"] from space books"
}

The entry includes the new uuid and severity fields. The last field, description, contains only the key of the deleted tuple.
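
To illustrate how such an entry can be consumed by external tooling, here is a minimal Python sketch that parses the sample entry and extracts the new fields. It assumes that each entry is a standalone JSON object, as in the sample above; the summarize helper is hypothetical and not part of Tarantool.

```python
import json

# Sample entry copied from the text above. A raw string keeps the \" escapes
# intact so that the JSON parser, not Python, interprets them.
ENTRY_TEXT = r'''{
  "time": "2023-12-19T10:09:44.664+0000",
  "uuid": "65901190-f8a6-45c1-b3a4-1a11cf5c7355",
  "severity": "VERBOSE",
  "remote": "unix/:(socket)",
  "session_type": "console",
  "module": "tarantool",
  "user": "admin",
  "type": "space_delete",
  "tag": "",
  "description": "Delete key [\"0671623249\"] from space books"
}'''


def summarize(entry_text):
    """Return (uuid, severity, type, description) for one JSON audit entry."""
    entry = json.loads(entry_text)
    return entry["uuid"], entry["severity"], entry["type"], entry["description"]


entry_uuid, severity, event_type, description = summarize(ENTRY_TEXT)
print(entry_uuid, severity, event_type, description)
```

A log shipper could use the uuid field to deduplicate entries and the severity field to route them, but those uses are suggestions, not documented behavior.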

Reading flight recordings (EE)

The flight recorder available in the Enterprise Edition is an event collection tool that gathers various information about a working Tarantool instance. With the 3.0 version, you can read flight recordings using the API provided by the flightrec module.

To enable the flight recorder in a YAML file, set flightrec.enabled to true:

flightrec:
  enabled: true

Then, you can use the Lua API to open and read *.ttfr files:

app:instance001> flightrec = require('flightrec')
---
...

app:instance001> flightrec_file = flightrec.open('var/lib/instance001/20231225T085435.ttfr')
---
...

app:instance001> flightrec_file
---
- sections: &0
    requests:
      size: 10485760
    metrics:
      size: 368640
    logs:
      size: 10485760
  was_closed: false
  version: 0
  pid: 1350
...

app:instance001> for i, r in flightrec_file.sections.logs:pairs() do record = r; break end
---
...

app:instance001> record
---
- level: INFO
  fiber_name: interactive
  fiber_id: 103
  cord_name: main
  file: ./src/box/flightrec.c
  time: 2023-12-25 08:50:12.275
  message: 'Flight recorder: configuration has been done'
  line: 727
...

app:instance001> flightrec_file:close()
---
...

New DEB and RPM packages

With this release, the approach to delivering Tarantool to end users in DEB and RPM packages is slightly revised. In the previous versions, Tarantool was built for the most popular Linux distributions and their latest versions.

Starting with this release, only two sets of DEB and RPM packages are delivered. The difference is that these packages include a statically compiled Tarantool binary. This makes it possible to install the DEB and RPM packages on any Linux distribution based on CentOS or Debian.

To ensure that Tarantool works for a wide range of different distributions and their versions, RPM and DEB packages are prepared on CentOS 7 with glibc 2.17.

Developing applications

varbinary in Lua

In the previous versions, Tarantool already supported the varbinary type for storing data. But working with varbinary database fields required workarounds, such as using C to process such data.

The 3.0 version includes a new varbinary module for working with varbinary objects. The module implements the following functions:

varbinary.new() - constructs a varbinary object from a plain string.
varbinary.is() - returns true if the argument is a varbinary object.

In the example below, an object is created from a string:

local varbinary = require('varbinary')
local bin = varbinary.new('Hello world!')

The built-in decoders now decode binary data fields to a varbinary object by default:

local varbinary = require('varbinary')
local msgpack = require('msgpack')
local yaml = require('yaml')
varbinary.is(msgpack.decode('\xC4\x02\xFF\xFE'))
--[[
---
- true
...
]]
varbinary.is(yaml.decode('!!binary //4='))
--[[
---
- true
...
]]

This also implies that the data stored in the database under the varbinary field type is now returned to Lua not as a plain string but as a varbinary object.

Because this change may break backward compatibility, you can revert to the old behavior by toggling the new binary_data_decoding compat option:

compat:
  binary_data_decoding: old

Default field values

You can now assign default values to specific fields when defining a space format. In this example, the isbn, title, and year_of_publication fields have default values:

box.schema.space.create('books')
box.space.books:format({
    { name = 'id', type = 'unsigned' },
    { name = 'isbn', type = 'string', default = '9990000000000' },
    { name = 'title', type = 'string', default = 'New awesome book' },
    { name = 'year_of_publication', type = 'unsigned', default = 2023 }
})
box.space.books:create_index('primary', { parts = { 'isbn' } })

If you insert a tuple with missing fields, the default values are inserted:

app:instance001> box.space.books:insert({ 1000, nil, nil, nil })
---
- [1000, '9990000000000', 'New awesome book', 2023]
...

You can also provide a custom logic for generating a default value. To achieve this, create a function using box.schema.func.create:

box.schema.func.create('current_year', {
    language = 'Lua',
    body = "function() return require('datetime').now().year end"
})

Then, assign the function name to default_func when defining a space format:

box.space.books:format({
    -- ... --
    { name = 'year_of_publication', type = 'unsigned', default_func = 'current_year' }
})

Learn more in Default values.

Triggers

In the 3.0 version, the API for creating triggers is completely reworked. A new trigger module is introduced, allowing you to set handlers on both predefined and custom events.

To create the trigger, you need to:

Provide the name of the event to associate the trigger with.
Define the trigger name.
Provide a trigger handler function.

The code snippet below shows how to subscribe to changes in the books space:

local trigger = require('trigger')
trigger.set(
        'box.space.books.on_replace', -- event name
        'some-custom-trigger',        -- trigger name
        function(...)
            -- trigger handler
        end
)

Pagination in read views (EE)

The 2.11 release introduced the following features:

Read views are in-memory snapshots of the entire database that aren’t affected by future data modifications.
Pagination for getting data in chunks.

With the 3.0 release, a read view object supports the after and fetch_pos arguments for the select and pairs methods:

-- Select the first 3 tuples and fetch the last tuple's position --
app:instance001> result, position = read_view1.space.bands:select({}, { limit = 3, fetch_pos = true })
---
...

app:instance001> result
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
...

app:instance001> position
---
- kQM
...

-- Then, you can pass this position as the 'after' parameter --
app:instance001> read_view1.space.bands:select({}, { limit = 3, after = position })
---
- - [4, 'The Beatles', 1960]
  - [5, 'Pink Floyd', 1965]
  - [6, 'The Rolling Stones', 1962]
...
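
The same pattern extends to walking a whole space page by page: request a page, remember the returned position, and pass it as after on the next call until an empty page comes back. The Python sketch below models only this control flow. The select function is a toy stand-in for the read view's select method, and its integer position is an assumption made for illustration (the real position is an opaque value such as kQM above).

```python
def select(rows, limit, after=None):
    """Toy stand-in for select with the limit, after, and fetch_pos options.

    Returns a page of rows and the position of the last row in that page.
    """
    start = 0 if after is None else after
    page = rows[start:start + limit]
    position = start + len(page) if page else None
    return page, position


def iterate_pages(rows, limit):
    """Yield consecutive pages until the data is exhausted."""
    position = None
    while True:
        page, new_position = select(rows, limit, after=position)
        if not page:
            break
        yield page
        position = new_position


bands = [
    (1, "Roxette", 1986),
    (2, "Scorpions", 1965),
    (3, "Ace of Base", 1987),
    (4, "The Beatles", 1960),
    (5, "Pink Floyd", 1965),
    (6, "The Rolling Stones", 1962),
]

pages = list(iterate_pages(bands, 3))
for page in pages:
    print(page)
```

Because a read view is not affected by later modifications, every page in such a loop comes from the same consistent snapshot.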

IPROTO tuple format

Starting with the 3.0 version, the IPROTO protocol is extended to support sending names of tuple fields in the IPROTO_CALL and other IPROTO responses. This simplifies the development of Tarantool connectors and the handling of tuples received from remote procedure calls or from routers.

It’s possible to revert to the old behavior by toggling the box_tuple_extension compat option:

compat:
  box_tuple_extension: old

SQL: case-sensitive names

Starting with 3.0, names in SQL (for example, table, column, or constraint names) are case-sensitive. Before the 3.0 version, the query below created a MYTABLE table:

CREATE TABLE MyTable (i INT  PRIMARY KEY);

To create the MyTable table, you needed to enclose the name in double quotes:

CREATE TABLE "MyTable" (i INT  PRIMARY KEY);

Starting with 3.0, names are case-sensitive, and double quotes are no longer needed:

CREATE TABLE MyTable (i INT  PRIMARY KEY);

For backward compatibility, the new version also supports a second lookup using an uppercase name. This means that the query below tries to find the MyTable table and then MYTABLE:

SELECT * FROM MyTable;
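
The two-step lookup described above can be modeled in a few lines of Python. This is only an illustration of the lookup order, not Tarantool's implementation, and the resolve_name function is hypothetical.

```python
def resolve_name(name, existing_names):
    """Resolve an unquoted SQL name: exact match first, then uppercase.

    Returns the matching stored name, or None if neither lookup succeeds.
    """
    if name in existing_names:
        return name
    upper = name.upper()
    if upper in existing_names:
        return upper
    return None


# A database created before 3.0 stores the table as MYTABLE.
print(resolve_name("MyTable", {"MYTABLE"}))

# When both names exist, the exact match wins.
print(resolve_name("MyTable", {"MyTable", "MYTABLE"}))
```

The fallback only tries the uppercase form, so in this model a query for mytable does not find a table stored as MyTable.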

Stability

Handling LuaJIT compiler errors

The 3.0 release includes a fix for the gh-562 LuaJIT issue related to the inability to handle internal compiler on-trace errors using pcall. The examples of such errors are:

An Out of memory error might occur for select queries returning a large amount of data.
A Table overflow error is raised when exceeding the maximum number of keys in a table.

The script below tries to fill a Lua table with a large number of keys:

local ffi = require('ffi')

local function memory_payload()
    local t = {}
    for i = 1, 1e10 do
        t[ffi.new('uint64_t')] = i
    end
end
local res, err = pcall(memory_payload)
print(res, err)

In previous Tarantool versions with the 32-bit Lua GC, this script causes the following error despite using pcall:

PANIC: unprotected error in call to Lua API (not enough memory)

For Tarantool with the 64-bit Lua GC, this script causes a Table overflow error:

PANIC: unprotected error in call to Lua API (table overflow)

Starting with the 3.0 version, these errors are handled correctly with the following outputs:

false    not enough memory -- 32-bit Lua GC
false    table overflow    -- 64-bit Lua GC

As a result, Tarantool 3.0 becomes more stable in cases when user scripts include erroneous code.

Enterprise SDK changelog

Versioning policy

A Tarantool Enterprise SDK version consists of two parts:

<TARANTOOL_BASE_VERSION>-r<REVISION>

For example: 2.11.1-0-gc42d9735b-r589.

TARANTOOL_BASE_VERSION is the Community version which the Enterprise version is based on.
REVISION is the SDK revision. Besides Tarantool itself, it includes the tt utility, a set of open and closed source modules, and examples. Learn more from Package contents.
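
For scripts that need to compare SDK builds, the version string can be split into its two parts at the last -r separator. The Python sketch below assumes the revision is always a decimal number, as in the example above; the split_sdk_version helper is hypothetical.

```python
def split_sdk_version(version):
    """Split '<TARANTOOL_BASE_VERSION>-r<REVISION>' into (base, revision)."""
    # rpartition splits at the LAST '-r', so a base version that itself
    # contains '-r' (for example, a release candidate tag) stays intact.
    base, separator, revision = version.rpartition("-r")
    if not separator or not base or not revision.isdigit():
        raise ValueError(f"not an SDK version: {version!r}")
    return base, int(revision)


base, revision = split_sdk_version("2.11.1-0-gc42d9735b-r589")
print(base, revision)
```

Returning the revision as an integer lets callers compare builds numerically, for example to check whether a given fix from the changelog below is already included.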

r703

Bumped checks version to 3.4.0.
Bumped Cartridge version to 2.16.4.
Bumped vshard version to 0.1.37.

r702

Bumped tarantool-2.11 series to 2.11.8.

r696

Moved CI files from sdk-ci repository.

r695

Bumped tt-ee version to v2.11.0.

r694

Replaced Cartridge EE with Cartridge CE 2.16.3.
Added CRUD CE 1.6.1.
Added expirationd CE 1.7.0.
Added ddl CE 1.7.1.
Bumped metrics version to 1.5.0.
Added migrations CE 1.1.0.
Added vshard CE 0.1.36.
Moved to Community Edition modules.

r693

Fixed glibc package URL to archived.

r692

Added manual trigger job to run tests to generate certificate of compliance. Ready for Astra Linux.

r691

Bumped Cartridge version to 2.16.2.
Bumped metrics version to 1.4.0.
Bumped http version to 1.8.0.
Bumped crud-ee version to 1.7.4.

r690

Bumped tt-ee version to v2.10.1.

r689

Bumped Cartridge version to 2.16.0.

r688

Bumped vshard-ee version to 0.1.34.

r687

Bumped Cartridge version to 2.15.4.

r686

Bumped tt-ee version to v2.10.0.

r685

Bumped migrations-ee version to 1.3.2.

r684

Bumped tarantool-2.11 series to 2.11.7.

r683

Bumped Kafka version to 1.6.10.

r682

Bumped Cartridge version to 2.15.3.

r681

Bumped vshard-ee version to 0.1.33.

r680

Bumped tt-ee version to v2.9.1.

r679

Bumped tt-ee version to v2.9.0.
Bumped Cartridge version to 2.15.2.
Bumped membership version to 2.5.2.

r677

Bumped Cartridge version to 2.15.1.
Bumped tt-ee version to v2.8.1.
Bumped vshard-ee version to 0.1.32.

r673

Bumped Cartridge version to 2.15.0.
Bumped membership version to 2.5.1.
Bumped expirationd-ee version to 1.8.0.

r672

Bumped tarantool-2.11 series to 2.11.6.
Bumped tt-ee version to v2.8.0.
Bumped migrations-ee version to 1.3.1.

r669

Bumped Cartridge version to 2.14.0.
Bumped membership version to 2.4.6.

r662

Bumped vshard-ee version to 0.1.31.
Bumped tt-ee version to v2.7.0.

r660

Bumped tt-ee version to v2.6.0.
Bumped Cartridge version to 2.13.0.
Bumped vshard version to 0.1.30.
Bumped http version to 1.7.0.

r659

Bumped tarantool-2.11 series to 2.11.5.
Bumped tt-ee version to v2.5.2.
Updated Kafka to 1.6.9.
Bumped tt-ee version to v2.5.1.
Bumped tt-ee version to v2.5.0.

r654

Moved cartridge-auth-extension to stable directory.
Bumped crud-ee version to 1.7.1.
Bumped migrations-ee version to 1.3.0.

r653

Bumped Cartridge version to 2.12.4.
Bumped vshard version to 0.1.29.
Bumped http version to 1.6.0.

r652

Bumped tarantool-2.11 series to 2.11.4.
Bumped Cartridge version to 2.12.3.

r650

Bumped tt-ee version to v2.4.0.
Bumped queue version to 1.4.2.
Added Migration Guide to bundle.

r647

Updated Oracle to 1.5.0 for x86_64.
Updated oci to 21.14 for x86_64.

r646

Moved to Enterprise Edition modules.
Fixed Docker image due to CentOS 7 EOL.
Fixed CI/CD workflows running inside CentOS 7.

r643

Bumped tt-ee version to v2.3.1.
Enabled tt bash completion.
Updated Kafka to 1.6.8.

r640

Updated Docker image.
Updated CMake to 3.20.6.
Installed dependencies for building OpenSSL.
Fixed installation of Python 3.6.

r639

Re-enabled aarch64 jobs.
Temporarily disabled aarch64 jobs.

r637

Updated metrics to 1.1.0.
Updated queue to 1.4.1.
Updated CRUD to 1.5.2.

r636

Updated Cartridge to 2.11.0.
Updated ddl to 1.7.1.
Updated vshard to 0.1.27.

r635

Adjusted CI workflows for 1.x-2.x development branch.
Deleted tarantool-master submodule.

r633

Updated CRUD to 1.5.1.
Updated sideservice to 0.2.1.
Updated httpgo to 0.2.2.
Updated httpgo-crud to 0.1.1.

r632

Updated cartridge-cli to 2.12.12.
Used tt instead of tarantoolctl for build/test routines.
Made Tarantool and bundle versions correct.
Bumped tarantool-2.11 to 2.11.3.

r628

Updated Cartridge to 2.10.0.
Updated membership to 2.4.4.
Updated ddl to 1.7.0.
Updated graphqlapi to 0.0.11.

r627

Updated expirationd to 1.6.0.
Updated sharded-queue to 1.0.0.
Dropped building MacOS bundles.
Updated cartridge-cli to 2.12.11.

r623

Updated tt-ee to 2.2.1.
Updated CRUD to 1.5.0.
Updated membership to 2.4.3.
Updated Cartridge to 2.9.0.

r619

Updated bundle tt-ee aarch64.
Updated tt-ee to 2.2.0.
Fixed running Tarantool tests on RED OS.

r616

Updated CRUD to 1.4.3.
Updated luatest to 1.0.1.
Updated migrations to 0.7.0.
Updated tt-ee to 2.1.2.

r613

Updated Cartridge to 2.8.5.
Updated CRUD to 1.4.2.
Added frontend-core 8.2.2.
Updated membership to 2.4.2.
Updated sideservice to 0.2.0.
Updated tt-ee to 2.1.1.
Updated vshard to 0.1.26.

r609

Updated httpgo to 0.2.1.
Added httpgo-crud 0.1.0.
Updated tarantool-2.11 to 2.11.2.

r606

Updated tarantool-master to 3.0.0-beta1.

r605

Updated Cartridge to 2.8.4.
Updated CRUD to 1.4.1.
Updated ddl to 1.6.5.
Added httpgo 0.2.0.
Updated tt-ee to 2.0.0.

r598

Updated cartridge-cli to 2.12.9.
Updated tt-ee to 1.3.1.

r595

Updated tt-ee to 1.3.0.
Updated Cartridge to 2.8.3.
Updated cartridge-cli-extensions to 1.1.2.
Updated CRUD to 1.3.0.
Updated queue to 1.3.3.
Updated sharded-queue to 0.1.1.
Updated membership to 2.4.1.
Added tests for Astra Linux 1.7.

r589

Updated tarantool-2.10 to 2.10.8.
Updated tarantool-master to 3.0.0-alpha3.
Updated migrations to 0.6.0.
Updated tt-ee to 1.2.0.
Updated space-explorer to 1.1.8.
Updated cartridge-metrics-role to 0.1.1.
Updated Cartridge to 2.8.2.
Updated expirationd to 1.5.0.
Added sideservice 0.1.0.

r579

Updated cartridge-cli to 2.12.7.
Updated tarantool-2.11 to 2.11.1.

r577

Added CRUD 1.2.0.
Added ddl 1.6.3.
Added sharded-queue 0.1.0.
Added ddl 1.6.4.
Updated tt-ee to 1.1.2.
Updated cartridge-cli to 2.12.6.

r563

Updated tarantool-2.10 to 2.10.7.
Updated tarantool-2.11 to 2.11.0.
Added Kafka 1.6.6.
Added vshard 0.1.24.
Added metrics 1.0.0.
Added cartridge-metrics-role 0.1.0.
Added Cartridge 2.8.0.
Added http 1.5.0.

r557

Added checks 3.3.0.
Updated cartridge-cli to 2.12.5.

r553

Added tt-ee and tt environment configuration.
Added CRUD 1.1.1.
Added avro-schema 3.1.1.
Added expirationd 1.4.0.
Added graphql 0.3.0.
Added graphqlapi 0.0.10.
Added metrics 0.17.0.
Added migrations 0.5.0.
Added Oracle 1.4.0.
Added Cartridge 2.7.9.
Added vshard 0.1.23.
Added Kafka 1.6.5.

r549

Updated tarantool-2.10 to 2.10.6.

r545

Updated tarantool-2.11 to 2.11.0-rc2.

r543

Added the tarantool-2.11 submodule.

r542

Updated tarantool-1.10 to 1.10.15.

r541

Updated tarantool-master to 3.0.0-entrypoint.

r540

Updated tarantool-2.10 to 2.10.5.

r539

Added vshard 0.1.22.

r538

Updated tarantool-2.8 to apply 2 hotfixes.

r537

Fixed non-interactive installation of the brew package.
Changed the owner of the /usr/local/bin directory.
Installed awscli@1 instead of awscli since it takes much less time.

r536

Added the missing property 2.10 for scope CACHE in CMakeLists.txt.

r535

Added expirationd 1.3.1.

r534

Added CRUD 1.0.0.

r533

Used runners with label regular for builds and the tagged release workflow.

r532

Added http 1.4.0.
Added space-explorer 1.1.7.
Added checks 3.2.0.
Added metrics 0.16.0.
Added Cartridge 2.7.8.

r531

Added the -DENABLE_LTO=ON flag for tarantool-ee@master branch to CMakeLists.txt.

r530

Upgraded devtoolset from 8 to 9. It was required for upgrading ld from 2.30 to 2.31+ for LTO.

r529

Updated tarantool’s master branch to a recent revision.

r528

Fixed code style in the Linux and MacOS workflows.

r527

Reliably installed packages in MacOS builds.

r526

Refactored the way that GC64 builds are defined in the build workflow. There are no changes to the composition of resulting bundles.

r525

Added alerting failures in builds on stable branches and integration testing to VK Teams chats.

r524

Updated to fresh tarantool master (2.11.0-entrypoint-107-ga18449d)

r523

Added Cartridge 2.7.7.

r522

Outdated workflow runs are now canceled to save CI time.

r521

Added CRUD 0.14.1.
Added expirationd 1.3.0.
Added metrics 0.15.1.
Added queue 1.2.2.

r520

Release SDK by tags:

Run workflow in SDK docker container.
Uploaded SDK files for 1.10, 2.8, 2.10 versions to release folder.
Added consistency check for all versions.

r519

On feature branches, SDK is now rebuilt only on relevant changes.

r518

Added frontend-core 8.2.1.
Added vshard 0.1.21.
Added http 1.3.0.
Added Cartridge 2.7.6.

r517

Updated Tarantool EE to 2.10.4.

r516

Updated bundled OpenSSL to version 1.1.1q.

r515

Removed support of Tarantool 2.7.
Started using tarantool/actions/prepare-checkout to make builds more stable.

r514

Removed the local registry and set up the GitHub registry instead.
Synced rocks cache to S3 and back.
Set up the use of shared runners.
Refactored and formatted ci-linux.yml and ci-macos.yml.

r513

Removed Kafka 1.5.0 due to a build issue with Tarantool 2.10.3 and higher.
Updated Kafka to version 1.6.2.

r512

Updated tuple-keydef to version 0.0.3.

r511

Enabled parallel build of rocks for MacOS in CI.

r510

Updated Tarantool to 2.10.3.
Added a readable error for the case when the flight recorder fails to write data due to insufficient free space on the disk device. Previously, it was sending a SIGBUS error.
Fixed a crash in the flight recorder caused by non-thread-safe log recording from multiple threads.

r502

Updated Tarantool to 2.10.2.
Increased resolution of stored entries in flight recorder.
Fixed a bug in the flight recorder that resulted in skipping log entries in case box.cfg.log_level is less than flightrec_log_level.

r498

Updated Tarantool to 2.10.1.
Updated Cyrus SASL to version 2.1.28.
Updated OpenLDAP to version 2.5.13.
Updated LZ4 to version 1.9.3. Fixed CVE-2021-3520.
Fixed replication reconnect failure after disabling SSL encryption.
Fixed a crash that occurred while trying to start an instance that has a compressed memtx space.
Fixed CVE-2022-29242 in GOST SSL engine.
Fixed a bug in the flight recorder reader implementation that resulted in a hang or error while trying to open an empty section.

r467

Breaking changes

Default audit log format was changed to CSV.

Functionality added or changed

Enterprise

Implemented user-defined audit events. Now it’s possible to log custom messages to the audit log from Lua.
[Breaking change] Switched the default audit log format to CSV. The format can be switched back to JSON using the new box.cfg.audit_format configuration option.
Implemented the audit log filter. Now, it’s possible to enable logging only for a subset of all audit events using the new box.cfg.audit_filter configuration option.

Core

Implemented constraints and foreign keys. Now a user can create function constraints and foreign key relations (gh-6436).
Changed log level of some information messages from critical to info (gh-4675).
Added predefined system events: box.status, box.id, box.election and box.schema (gh-6260).
Introduced transaction isolation levels in Lua and IPROTO (gh-6930).

Vinyl

Disabled the deferred DELETE optimization in Vinyl to avoid possible performance degradation of secondary index reads. Now, to enable the optimization, one has to set the defer_deletes flag in space options (gh-4501).

Lua

Added support of console autocompletion for net.box objects stream and future (gh-6305).

Datetime

Added the parse method, which allows converting string literals in extended ISO 8601 or RFC 3339 formats (gh-6731).
The range of supported years has been extended in all parsers to fully cover -5879610-06-22..5879611-07-11 (gh-6731).

Build

Added bundling of GNU libunwind to support the backtrace feature on the AARCH64 architecture and on distributions that don’t provide a libunwind package.
Re-enabled backtrace feature for all RHEL distributions by default, except for AARCH64 architecture and ancient GCC versions, which lack compiler features required for backtrace (gh-4611).

Bugs fixed

Enterprise

Disabled audit log unless explicitly configured. Before this change, audit events were written to stderr if box.cfg.audit_log wasn’t set. Now, audit log is disabled in this case.
Disabled audit logging of replicated events. Now, replicated events (for example, user creation) are logged only on the origin, never on a replica.

Core

Banned DDL operations in space on_replace triggers, since they could lead to a crash (gh-6920).
Fixed a bug due to which all fibers created with fiber_attr_setstacksize() leaked until the thread exit. Their stacks also leaked except when fiber_set_joinable(..., true) was used.
Fixed a crash in mvcc connected with secondary index conflict (gh-6452).
Fixed a bug which resulted in wrong space count (gh-6421).
Select in an RO transaction now reads confirmed data, like a standalone (autocommit) select does (gh-6452).

Replication

Fixed potential obsolete data write in synchronous replication due to race in accessing terms while disk write operation is in progress and not yet completed.
Fixed replicas failing to bootstrap when master is just re-started (gh-6966).

Lua

Fixed the behavior of tarantool console on SIGINT. Now Ctrl+C discards the current input and prints the new prompt (gh-2717).

Triggers

Fixed an assertion or segfault when MP_EXT is received via net.box (gh-6766).
Now ROUND() properly supports INTEGER and DECIMAL as the first argument (gh-6988).

Datetime

Fixed improper normalization of intervals received after datetime arithmetic operations when the result was negative:
```
tarantool> date.now() - date.now()
---
- -1.000026000 seconds
...
```
That is, two date.now() calls made in immediate succession produce very close values, so their difference should be close to 0, not 1 second (gh-6882).

Net.box

Changed the type of the error returned by net.box on timeout from ClientError to TimedOut (gh-6144).

r457

Fixed some binary protocol encryption bugs.

r455

Added binary protocol encryption.
Added tuple field compression.

Compatibility guarantees

Backwards compatibility is guaranteed between all versions in the same release series. It is also appreciated but not guaranteed between different release series (major number changes). Pre-releases and releases of one release series are compatible in all senses defined below (any release with any release):

Pre-releases and releases of consecutive series are compatible by data layout, binary protocol, and replication protocol.
No guarantees are given regarding compatibility between pre-releases/releases of non-consecutive release series unless the release notes state otherwise.
No guarantees are given regarding compatibility between alpha/beta versions and between alpha/beta and pre-release/release even within one series.

Binary data layout

Any newer release (its runtime) is backward compatible with any older one. It means the more recent release can work on top of data (*.xlog, *.snap, *.vylog, *.run) from the older one. All functionality of the older release can work in this configuration. The same compatibility is maintained between release series as well.

An attempt to use a new feature results in one of the options:

The attempt is successful.
There is an error message about the old data layout. The error does not lead to service outage or data corruption. To avoid the message, upgrade the data layout by calling box.schema.upgrade(). The call enables all new release features (once all instances of the replicaset run on the same Tarantool version).

Binary protocol

All binary protocol requests operational in an older release keep working in a newer one. Responses have the same format, but mappings may contain fields not present in the older release.

A net.box client of an older release can work with a server running a newer release. However, net.box features introduced in the newer release won’t work. A net.box client of a newer release is fully operational with a server running an older release. However, only the features implemented in the older release will work.

Replication protocol

An instance running on a newer release can work as:

upstream (master) of an instance with an older release
downstream (replica) without database schema upgrade.

The database schema upgrade (box.schema.upgrade()) must be performed when all replicaset instances run on the same Tarantool version. An application should not lean on internal schema representation because it can be changed with the upgrade.
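
The precondition for the schema upgrade can be expressed as a simple check over the versions reported by the instances. The Python sketch below is a hypothetical helper for an operations script; how the versions are collected is left out, and the instance names and version strings are made up for illustration.

```python
def ready_for_schema_upgrade(instance_versions):
    """Return True when every instance in the replicaset runs the same version.

    instance_versions maps an instance name to its Tarantool version string.
    An empty mapping is treated as not ready.
    """
    return len(set(instance_versions.values())) == 1


# Mid-upgrade: one instance still runs the older release.
mixed = {"instance001": "3.0.0", "instance002": "3.0.0", "instance003": "2.11.8"}
print(ready_for_schema_upgrade(mixed))

# All instances upgraded: box.schema.upgrade() may be called.
uniform = {"instance001": "3.0.0", "instance002": "3.0.0", "instance003": "3.0.0"}
print(ready_for_schema_upgrade(uniform))
```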

Lua code

If code runs on an older release, it will operate with the same effect on a newer one. However, only meaningful code counts. If code that used to throw an error starts doing something useful, the change is considered compatible.

There is still room for new functionality: adding new options (fields in a table argument), new arguments to the end, more fields to a return table, and more return values (multireturn).

Adding a new built-in module or a new global value is considered as a compatible change.

Adding a new field to an existing metatable is okay if the field is not listed in the Lua 5.1 Reference Manual. Otherwise, it should be proven that it won’t break any meaningful code.

Examples of compatible changes:

Add __pairs, __ipairs to a metatable of a userdata/cdata object. The fields are not from Lua 5.1, and the userdata/cdata has no default behaviour for pairs() and ipairs() calls.
Add or extend the __lt or __le metamethod (if the attempt to use <, <= etc. leads to an error before the change).
Extend existing __eq metamethod implementation (if the attempt to use it leads to an error before the change).

Examples of incompatible changes:

Add __pairs, __ipairs to a metatable of a table (it already has a defined behavior before the change).
Add the __eq metamethod (any pair of Lua objects already has a defined behavior).

SQL code

If any request is processed on an older release, it will operate with the same effect on a newer one (except the requests that always lead to an error).

Examples of compatible changes:

Add a new keyword.
Add a new type.
Add a new built-in function.
Add a new system table that has a name starting with an underscore.
Add a new collation.
Add an implicit or explicit cast rule for a set of operations {X} and a list of types [Y] if [operation from {X}]([list of values of [Y] types]) had not been implemented before the change.
Change the order of tuples in the result set of SELECT in case ORDER BY is not specified.

Technically, those changes may break some working code in case of a name clash, but the probability of it is negligible.

Examples of incompatible changes:

Change the result of working implicit or explicit cast.
Change of a literal type.

C code

If a module or a C stored procedure runs on an older release, it will operate with the same effect on a newer one.

It is okay to add a new function or structure to the public C API. It must use one of the Tarantool prefixes (box_, fiber_, luaT_, luaM_ and so on) or some new prefix.

A symbol from a used library must not be exported directly because the library may be used in a module by itself, and the clash can lead to problems. Exception: when the whole public API of the library is exported (as for libcurl).

Do not introduce new functions or structures with the lua_ and luaL_ prefixes. Those prefixes are for the Lua runtime. Use luaT_ for Tarantool-specific functions, and luaM_ for general-purpose ones.

How to get involved in Tarantool

What is Tarantool?

Tarantool is an open source database that can store everything in RAM. Use Tarantool as a cache with the ability to save data to disk. Tarantool serves up to a million requests per second, allows for secondary index searches, and has SQL support.

In Tarantool, you can execute code alongside data. This allows for faster operations. Implement any business logic in Lua. Get rid of stale entries, sync with other data sources, implement an HTTP service.

Go to Getting Started and try Tarantool.

How to get help?

We have a special Telegram chat for contributors. We speak Russian and English in the chat.

This is the easiest way to get your questions answered. Many people are afraid to ask questions because they believe they are “wasting the experts’ time,” but we don’t really think so. Contributors are important to us.

We also have a Stack Overflow tag.

Join the chat and ask questions.

How to leave feedback, ideas, or suggestions?

You can leave your feedback or share ideas in different ways:

The simplest way is to fill out the feedback form. All you need to do is fill in one product comment field and click “Send.” You can optionally provide your email address. If you wish, we can involve you in the product development process.
A more technical way is to create a ticket on GitHub. If you have a suggestion for a new feature or information about a bug, create a new GitHub issue. The link leads to the tarantool/tarantool repository. To leave feedback for our other projects on GitHub, select “Issues” > “New issue.”

See an example of a feature request.

To talk to our team about a product, go to one of our chats:

If Telegram is inconvenient for you or simply isn’t working, you can leave your comment on tarantool.io. Fill out the form at the bottom of the site and leave your email. We read every request and usually respond within two days.

How to contribute

There are many ways to contribute to Tarantool:

Code: Contribute to the code. We have components written in C, Lua, Python, Go, and other languages.
Write: Improve documentation, write blog posts, create tutorials or solution pages.
Q&A: Share your experience on Stack Overflow with the #tarantool tag.
Spread the word: Share your accomplishments on social media using the #tarantool hashtag (or CC @tarantooldb on Twitter).

Tarantool ecosystem

Tarantool has a large ecosystem of tools. We divide the ecosystem into four large blocks:

Tarantool itself.
Modules for Tarantool. They can be written in C and Lua.
Connectors for programming languages.
Applied tools. See the curated Awesome Tarantool list, which also includes external tools.

To start contributing, check the “good first issue” tag in the issues section of any of our repositories. These are beginner to intermediate tasks that will help you get comfortable with the tool.

See the list of tasks for the tarantool/tarantool repository.

There is a review queue in each of our repositories, so your changes may not be reviewed immediately. We usually give the first answer within two days. Depending on the ticket and its complexity, the review time may take a week or more.

Please do not hesitate to tag the maintainer in your GitHub ticket.

Read on to learn about contributing to different ecosystem blocks.

Documentation: How to report and fix problems

There are several ways to improve the documentation:

The easiest one is to leave your comment on the web documentation page. To use the built-in feedback form, select the text that you want to comment on, press Ctrl+Enter, type your comment in the pop-up window, and click Submit. On mobile screens, an Error? button appears at the bottom of the screen, which opens the same pop-up window. You can point out an error, provide feedback on the current article, or suggest changes. We review each comment and work with it.
Advanced: All Tarantool documentation tasks can be found in the repository. Go to any task and suggest your changes. We write our documentation using reStructuredText markup, and we have a writing style guide. After you make the change, build the documentation locally and see how it works. This can be done automatically in Docker. To learn more, check the README of the tarantool/doc repository.

Some Tarantool projects have their documentation in the code repository. This is typical for modules, for example, metrics. This is done on purpose, so the developers themselves can update it faster. You can find instructions for building such documentation in the code repository.

If you find that the documentation provided in the README of a module or a connector is incomplete or wrong, the best way to influence this is to fix it yourself. Clone the repository, fix the bug, and suggest changes in a pull request. It will take you five minutes but it will help the whole community.

If you cannot fix it for any reason, create a ticket in the repository and report the error. It will be fixed promptly.

How to contribute to modules

Tarantool is a database with an embedded application server. This means you can write any code in C or Lua and pack it in distributable modules.

We have official and unofficial modules. Here are some of our official modules:

HTTP server: HTTP server implementation with middleware support.
queue: Tarantool implementation of the persistent message queue.
metrics: Ready-to-use solution for collecting metrics.

Official modules are provided in our organization on GitHub.

All modules are distributed through our package manager, which is pre-installed with Tarantool. That also applies to unofficial modules, which means that other users can get your module easily.

If you want to add your module to our GitHub organization, send us a message on Telegram.

Contributing to an existing module

Tasks for contributors can be found in the issues section of any repository under the “good first issue” tag. These tasks are beginner or intermediate in terms of difficulty level, so you can comfortably get used to the module of your interest.

Check the currently open tasks for the HTTP Server module.

Please see our Lua style guide.

You can find the contact of the current maintainer in the MAINTAINERS file, located in the root of the repository. If there is no such file, please let us know. We will respond within two days.

If you see that the project does not have a maintainer or is inactive, you can become its maintainer yourself. See the How to become a maintainer section.

Creating a new module

You can also create custom modules and share them with the community. Look at the module template and write your own.

How to contribute to Tarantool Core

Tarantool is written mostly in C. Some parts are in C++ and Lua. Your contributions to Tarantool Core may take longer to review because we want the code to be reliable.

To start:

Learn how to build Tarantool.
Read about Tarantool architecture and main modules on the developer site and on GitHub.

In Tarantool development, we strive to follow the standards laid out in our style and contribution guides. These documents explain how to format your code and commits as well as how to write tests without breaking anything accidentally.

The guidelines also help you create patches that are easy to check, which allows quickly pushing changes to master.

Please read about our code review procedure before making your first commit.

You can suggest a patch using the fork and pull mechanism on GitHub: Make changes to your copy of the repository and submit it to us for review. Check the GitHub documentation to learn how to do it.

How to write tests

A database is a product that is expected to be as reliable as possible. We at Tarantool created test-run, a dedicated test framework for developing scripts that test Tarantool itself.

Writing your own test is not difficult. Check out the following examples:

We also have a CI workflow that automatically checks build and test coverage for new changes on all supported operating systems. The workflow is launched after every commit to the repository.

We have many tasks for QA specialists. Our QA team provides test coverage for our products, helps develop the test framework, and introduces and maintains new tools to test the stability of our releases.

For modules, we use luatest, our fork of a framework popular in the Lua community, enhanced and optimized for our tasks. See examples of writing tests for a module.

How to contribute to language connectors

A connector is a library that provides an API to access Tarantool from a programming language. Tarantool uses its own binary protocol for access, and the connector’s task is to transfer user requests to the database and application server in the required format.

Data access connectors have already been implemented for all major languages. If you want to write your own connector, you first need to familiarize yourself with the Tarantool binary protocol. Read the protocol description to learn more.

We consider the following connectors as references:

https://github.com/tarantool-php/client
net.box—Tarantool binary protocol client

You can look at them to understand how to do it right.

Some connectors in the Tarantool ecosystem are supported by the Tarantool team. Others are developed and supported exclusively by the community. All of them have their pros and cons. See the complete list of connectors and their recommended versions.

If you are using a community connector and want to implement new features for it or fix a bug, send your PRs via GitHub to the connector repository.

If you have questions for the author of the connector, check the MAINTAINERS file for the repository maintainer’s contact. If there is no such file, send us a message on Telegram. We will help you figure it out. We usually answer within one day.

How to contribute to tools

The Tarantool ecosystem has tools that facilitate the workflow, help with application deployment, or allow working with Kubernetes.

Here are some of the tools created by the Tarantool team:

tt: a CLI utility for creating and managing Tarantool applications.
tarantool-operator: a Kubernetes operator for cluster orchestration.

These tools can be installed via standard package managers: Ansible Galaxy, yum, or apt-get.

If you have a tool that might go well in our curated Awesome Tarantool list, read the guide for contributors and submit a pull request.

How to become a maintainer

Maintainers are people who can merge PRs or commit to master. We expect maintainers to answer questions and tickets on time as well as do code reviews.

If you need to get a review but no one responds within a week, take a look at the Maintainers section of the repository’s README.md. Write to the person listed there. If you have not received an answer within 3–4 days, you can escalate the question on Telegram.

A repository may have no maintainers (empty Maintainers list in README.md), or the existing maintainers may be inactive. In this case, you can become a maintainer yourself. We think it’s better if the repository is maintained by a newbie than if the repository is dead. So don’t be shy: we love maintainers and help them figure it all out.

All you need to do is fill out this form. Tell us what repository you want to access, the reason (inactivity, the maintainer is not responding), and how to contact you. We will consider your application within one day and either give you the rights or tell you what else needs to be done.

How to write release notes

Below are some best practices to make changelogs consistent, neat, and human-oriented.

Language

Use the past tense to describe changed or fixed behavior.

Examples

Fixed false positive panic when yielding in debug hook (gh-5649).

Fedora 32 is supported now. Added per-commit testing, updated YUM repositories (gh-4966).

Use the present tense to describe new behavior.

Example

Data changes in read-only mode are now forbidden (gh-5231).

Start with a capital letter, end with a period.

Note that these guidelines differ from the best practice for commit message titles that suggests using the imperative mood.
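
The capitalization and punctuation rule for release note entries is easy to verify automatically. Tense is not checked here, because it cannot be detected reliably without language tooling. This is a minimal sketch, and the function name `check_release_note` is hypothetical.

```python
def check_release_note(entry):
    """Return a list of formatting problems found in a release note entry."""
    text = entry.strip()
    if not text:
        return ["entry is empty"]
    problems = []
    if not text[0].isupper():
        problems.append("start with a capital letter")
    if not text.endswith("."):
        problems.append("end with a period")
    return problems
```

An entry such as "Data changes in read-only mode are now forbidden (gh-5231)." yields an empty list.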

Formatting

In release notes, use the following Sphinx syntax when referring to a specific version of Tarantool:

Tarantool :tarantool-release:`2.10.0`.
This is a link to the release notes on GitHub.

The result looks like this:

Tarantool v. 2.10.0. This is a link to the release notes on GitHub.

Building to contribute

To build Tarantool from source files, you need the following tools:

Git
GCC, or Clang on Mac OS
CMake 3.3 or later
GNU Make
Autoconf, any version
Automake, any version
Libtool, any version
Readline, any version
ncurses, any version
OpenSSL, any version
ICU, any version
Zlib-devel, any version
Python3 and modules:
- pyyaml
- gevent
- six

Quick build

To install all required dependencies, build Tarantool and run tests, choose your OS and follow the instructions:

Ubuntu/Debian
Fedora
RHEL/CentOS 7
CentOS 8
Mac OS
FreeBSD

Some additional steps might be useful:

-DENABLE_DIST=ON for tarantoolctl installation
Make RPM and Debian packages
Verify your Tarantool installation

Ubuntu/Debian

$ apt-get -y install git build-essential cmake autoconf automake libtool make \
  zlib1g-dev libreadline-dev libncurses5-dev libssl-dev libunwind-dev libicu-dev \
  python3 python3-yaml python3-six python3-gevent

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ make

$ make test

Fedora

$ dnf install -y git gcc gcc-c++ cmake  autoconf automake libtool make \
  readline-devel ncurses-devel openssl-devel zlib-devel libunwind-devel libicu-devel \
  python3-pyyaml python3-six python3-gevent

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ make

$ make test

RHEL/CentOS 7

$ yum install -y python-pip
$ yum install -y epel-release

$ curl -s https://packagecloud.io/install/repositories/packpack/backports/script.rpm.sh | bash

$ yum install -y git gcc cmake3  autoconf automake libtool make gcc-c++ zlib-devel \
  readline-devel ncurses-devel openssl-devel libunwind-devel libicu-devel \
  python3-pyyaml python3-six python3-gevent

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake3 .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ make

$ make test

CentOS 8

$ dnf install -y epel-release

$ dnf install -y git gcc cmake3  autoconf automake libtool libarchive make gcc-c++ \
  zlib-devel readline-devel ncurses-devel openssl-devel libunwind-devel libicu-devel \
  python3-pyyaml python3-six python3-gevent

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ make

$ make test

Mac OS

This instruction is for those who use Homebrew. Refer to the full instruction for Mac OS if you use MacPorts.

$ xcode-select --install
$ xcode-select --switch /Applications/Xcode.app/Contents/Developer

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ brew install git openssl readline curl icu4c libiconv zlib cmake autoconf automake libtool

$ pip install --user -r test-run/requirements.txt

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ make

$ make test

FreeBSD

$ git clone https://github.com/tarantool/tarantool.git --recursive

$ cd tarantool

$ git submodule update --init --recursive

$ pkg install -y git cmake autoconf automake libtool gmake readline icu

$ pip install --user -r test-run/requirements.txt

$ make clean         # unnecessary, added for good luck
$ rm CMakeCache.txt  # unnecessary, added for good luck

$ mkdir build && cd build

$ # configure the build with the RelWithDebInfo build type
$ cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo

$ gmake

$ gmake test

Additional steps

-DENABLE_DIST=ON for tarantoolctl installation

Important

tarantoolctl is deprecated in favor of tt CLI. Find the instructions on switching from tarantoolctl to tt in Migration from tarantoolctl to tt.

The CMake option for hinting that the result will be distributed is -DENABLE_DIST=ON. With this option, make install installs tarantoolctl files in addition to tarantool files.

Make RPM and Debian packages

This step is optional. It’s only for people who want to redistribute Tarantool. We highly recommend using official packages from the tarantool.org website. However, you can build RPM and Debian packages using PackPack. Consult Build RPM or Deb package using packpack for details.

Verify your Tarantool installation

$ # if you installed tarantool locally after build
$ tarantool
$ # - OR -
$ # if you didn't install tarantool locally after build
$ ./src/tarantool

This starts Tarantool in interactive mode.

Contributing a module

This page discusses how to create a Tarantool module and then get it published on the Tarantool rocks page and included in the official Tarantool images for Docker.

To learn how to create a simple module in Lua for local usage, check the corresponding how-to guide.

To help our contributors, we have created modulekit, a set of templates for creating Tarantool modules in Lua and C.

Note

As a prerequisite for using modulekit, install the tarantool-dev package first. For example, on Ubuntu, run:

$ sudo apt-get install tarantool-dev

Contributing a module in Lua

See the README in the “luakit” branch of the tarantool/modulekit repository for detailed instructions and examples.

Contributing a module in C

In some cases, you may want to create a Tarantool module in C rather than in Lua. For example, to work with specific hardware or low-level system interfaces.

See the README in the “ckit” branch of the tarantool/modulekit repository for detailed instructions and examples.

Note

You can also create modules with C++, provided that the code does not throw exceptions.

Developer guidelines

How to work on a bug

If a defect changes user-visible server behavior, it needs a bug report, even if it is a small defect. Report the bug at GitHub.

When reporting a bug, try to come up with a test case right away. Set the current maintenance milestone for the bug fix, and specify the series. Assign the bug to yourself. Set the status to ‘In progress’. Once the patch is ready, set the status to ‘In review’ and request a review of the fix.

Once there is a positive code review, push the patch and set the status to ‘Closed’.

Patches for bugs should contain a reference to the respective GitHub issue page or at least the issue id. Each patch should have a test, unless coming up with one is difficult in the current framework, in which case QA should be alerted.

There are two things you need to do when your patch makes it into the master:

set the bug status to ‘fix committed’,
delete the remote branch.

How to write a commit message

Any commit needs a helpful message. Mind the following guidelines when committing to any of the Tarantool repositories on GitHub.

Separate subject from body with a blank line.
Try to limit the subject line to 50 characters or so.
Start the subject line with a capital letter unless it is prefixed with a subsystem name and a colon:
- memtx:
- vinyl:
- xlog:
- replication:
- recovery:
- iproto:
- net.box:
- lua:
- sql:
Do not end the subject line with a period.
Do not put “gh-xx” or “closes #xxx” in the subject line.
Use the imperative mood in the subject line. A properly formed Git commit subject line should always be able to complete the following sentence: “If applied, this commit will /your subject line here/”.
Wrap the body to 72 characters or so.
Use the body to explain what and why vs. how.
Link GitHub issues on the last lines (see how).
Use your real name and real email address. For Tarantool team members, @tarantool.org email is preferred, but not mandatory.

A template:

Summarize changes in 50 characters or less

More detailed explanatory text, if necessary.
Wrap it to 72 characters or so.
In some contexts, the first line is treated as the subject of the
commit, and the rest of the text as the body.
The blank line separating the summary from the body is critical
(unless you omit the body entirely); various tools like `log`,
`shortlog` and `rebase` can get confused if you run the two together.

Explain the problem that this commit is solving. Focus on why you
are making this change as opposed to how (the code explains that).
Are there side effects or other unintuitive consequences of this
change? Here's the place to explain them.

Further paragraphs come after blank lines.

- Bullet points are okay, too.

- Typically a hyphen or asterisk is used for the bullet, preceded
  by a single space, with blank lines in between, but conventions
  vary here.

Fixes #123
Closes #456
Needed for #859
See also #343, #789
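
Several of the guidelines above are mechanical and can be checked before a commit is pushed. The following is a minimal sketch, not an official Tarantool tool: the function name `lint_commit_message` is hypothetical, and it treats the "50 characters or so" and "72 characters or so" recommendations as hard limits, which is stricter than the guidelines require. The imperative mood and the quality of the explanation are not checked.

```python
import re

# Subsystem prefixes listed in the guidelines above.
SUBSYSTEMS = ("memtx", "vinyl", "xlog", "replication", "recovery",
              "iproto", "net.box", "lua", "sql")
# Issue references that belong on the last lines, not in the subject.
ISSUE_REF = re.compile(r"gh-\d+|#\d+", re.IGNORECASE)


def lint_commit_message(message, subject_limit=50, body_limit=72):
    """Return a list of guideline violations found in a commit message."""
    lines = message.rstrip("\n").split("\n")
    subject = lines[0]
    problems = []

    if len(lines) > 1 and lines[1].strip():
        problems.append("separate subject from body with a blank line")
    if len(subject) > subject_limit:
        problems.append("subject is longer than %d characters" % subject_limit)
    has_prefix = subject.startswith(tuple(s + ":" for s in SUBSYSTEMS))
    if not has_prefix and not subject[:1].isupper():
        problems.append("start the subject with a capital letter")
    if subject.endswith("."):
        problems.append("do not end the subject with a period")
    if ISSUE_REF.search(subject):
        problems.append("move issue references out of the subject")
    # Every line after the subject counts as body text.
    for number, line in enumerate(lines[1:], start=2):
        if len(line) > body_limit:
            problems.append("line %d is longer than %d characters"
                            % (number, body_limit))
    return problems
```

A message that follows the template, for example a subject of "vinyl: fix crash on empty run", a blank line, and a short body ending with "Closes #123", produces an empty list.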

Some real-world examples:

Based on [1] and [2].

Documentation & Localization guidelines

These guidelines aim to help the team and external contributors write, translate, publish, and collaborate on the Tarantool documentation.

The guidelines are a work in progress, and we welcome all contributions.

Language and style

General style

Concise is good

People usually read technical documentation because they want something up and running quickly. Write simpler, more concise sentences.

Split the content into smaller paragraphs to improve readability. This will also eliminate the need for using |br| and help us translate content faster. Any paragraph over 6 sentences is large.

Keep your audience in mind

Consider your audience’s level. A getting started guide should be written in simpler terms than an advanced internals description.

If you choose to use metaphors to clarify a concept, make sure they are relatable for an international audience of IT professionals.

Don’t say “we”

Only use the pronoun “we” in entry-level texts like getting started guides. In other cases, avoid using “we”, because it is unclear who that is exactly. Consider how Gentoo does it.

Stick to the facts

Use measurable facts instead of personal judgments. Different users may have different ideas of what “often”, “slow”, or “small” means.

Bad example: This parameter is rarely updated.

Good example: This parameter is updated every two hours or more rarely.

Refer to absolute time

Temporal adverbs like “today”, “currently”, “now”, “in the future”, etc. are relative – that is, they are based on the time the documentation is created.

Instead of these words, use absolute terms like version numbers or years. The meaning of those terms doesn’t change over time.

If technical documentation is tied semantically to the time it was created, it increases the risk of the documentation becoming obsolete.

Bad example: Previously, the functionality worked differently. Currently, it supports SSL.

Good example: Before version x.y.z, the functionality worked differently. Since version x.y.z, it supports SSL.
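
Relative time expressions can be found with a simple search before a text is published. This is a minimal sketch under stated assumptions: the function name `find_relative_time` is hypothetical, and the word list contains only the adverbs quoted above plus "previously" from the bad example. A match is a hint to review the sentence, not proof of an error.

```python
import re

# Adverbs quoted in the guideline, plus "previously" from the bad example.
RELATIVE_TIME = ("today", "currently", "now", "in the future", "previously")
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in RELATIVE_TIME) + r")\b",
    re.IGNORECASE,
)


def find_relative_time(text):
    """Return the relative time expressions found in text, in lowercase."""
    return [match.group(0).lower() for match in PATTERN.finditer(text)]
```

Applied to the bad example above, the function reports "previously" and "currently". The good example produces an empty list.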

Express one idea in a sentence

Say exactly one thing in a sentence. If you want to define or clarify something, do it in a separate sentence. Simple sentences are easier to read, understand and translate.

Don’t	Do
Dogs (I have three of them) are my favorite animals. Their names are Ace, Bingo and Charm; Charm is the youngest one.	Dogs are my favorite animals. I have three of them. Their names are Ace, Bingo and Charm. Charm is the youngest one.
memtx (the in-memory storage engine) is the default and was the first to arrive.	memtx is an in-memory storage engine. It is the default and was the first to arrive.
The replica set from where the bucket is being migrated is called the source; the target replica set where the bucket is being migrated to is called the destination.	The replica set from where the bucket is being migrated is called the source. The target replica set where the bucket is being migrated to is called the destination.

Put examples next to theory

It’s best if examples immediately follow the concept they illustrate. Readers shouldn’t have to look for the examples in a different part of the article.

Specify link text

When you provide a link, clearly specify where it leads. In this way, you will not mislead the reader.

Bad example:

For more details, click here.

Use this.

Good example:

For more details, refer to the documentation on making links.

Use full link names.

Formatting

Use lists and tables

Lists and tables help split heavy content into manageable chunks.

To make tables maintainable and easy to translate, use the list-table directive, as described in the Tarantool table markup reference.

Translators find it hard to work with content “drawn” with ASCII characters, because it requires adjusting the number of spaces and manually counting characters.

Bad example:

Don't "draw" tables with ASCII characters

Good example:

Format code as code

Format large code fragments using the code-block directive, indicating the language. For shorter code snippets, make sure that only code goes in the backticks. Non-code shouldn’t be formatted as code, because this confuses users (and translators, too). Check our guidelines on writing about code.

For more about formatting, check out the Tarantool markup reference.

Word choice

Instance vs server

We say “instance” rather than “server” to refer to a Tarantool server instance. This keeps the manual terminology consistent with names like /etc/tarantool/instances.enabled in the Tarantool environment.

Wrong usage: “Replication allows multiple Tarantool servers to work with copies of the same database.”

Correct usage: “Replication allows multiple Tarantool instances to work with copies of the same database.”

Don’t use i.e. and e.g.

Don’t use the following abbreviations:

“i.e.”—from the Latin “id est”. Use “that is” or “which means” instead.
“e.g.”—from the Latin “exempli gratia”. Use “for example” or “such as” instead.

Many people, especially non-native English speakers, aren’t familiar with the “i.e.” and “e.g.” abbreviations or don’t know the difference between them. For this reason, it’s best to avoid using them.
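
The replacement can be automated for a first pass over a text. The sketch below is an illustration only: the function name `expand_latin_abbreviations` is hypothetical, it always picks the first of the two suggested replacements, and it does not preserve capitalization at the start of a sentence, so the result still needs a human review.

```python
import re

# First suggested replacement for each abbreviation from the guideline above.
REPLACEMENTS = {"i.e.": "that is", "e.g.": "for example"}
ABBREVIATION = re.compile(r"\b(i\.e\.|e\.g\.)", re.IGNORECASE)


def expand_latin_abbreviations(text):
    """Replace "i.e." and "e.g." with plain English equivalents."""
    return ABBREVIATION.sub(
        lambda match: REPLACEMENTS[match.group(1).lower()], text
    )
```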

Spelling and punctuation

Tarantool capitalization

The word “Tarantool” is capitalized because it’s a product name. The only context where it can start with a lowercase “t” is code. Learn more about code formatting in Tarantool documentation.

US vs British spelling

Use the US English spelling.

Check your spelling and punctuation

Consider checking spelling, grammar, and punctuation with special tools like LanguageTool or Grammarly.

Dashes

Special symbols like dashes, quotation marks, and apostrophes look the same across all Tarantool documentation in a single language. This is because the documentation builder renders specific character sequences in the source into correct typographic characters.

In Tarantool documentation, use only the en dash (–). Type two hyphens to insert it: --. Add spaces on both sides of the dash. Don’t use a single hyphen as a dash.

Use the dash for the following purposes:

To separate extra information.
To mark a break in a sentence.
To mark ranges like 4–16 GB (don’t surround the dash with spaces in this case).

When indicating a range like code element 1–code element 2, escape the series of hyphens using character-level inline markup. Otherwise, the RST interpreter will perceive the dash as part of the RST syntax:

``box.begin()``\--``box.commit()``
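
When such ranges are generated by a script, the escaping is easy to get wrong. The helper below is a minimal sketch that builds the same markup as the example above. The function name `rst_code_range` is hypothetical.

```python
def rst_code_range(start, end):
    """Build an RST range between two inline code elements.

    The backslash escapes the two hyphens so that the RST interpreter
    does not treat them as part of the inline markup.
    """
    return "``%s``\\--``%s``" % (start, end)
```

Calling the helper with "box.begin()" and "box.commit()" returns the line shown above.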

Ending punctuation in lists and tables

The following recommendations are for the English language only. You can find similar guidelines for the Russian language in the external reference for Russian proofreaders.

Lists

There are two kinds of lists:

Where each item forms a complete sentence.
Where each item is a phrase of three or fewer words or a term.

In the former case, start each item with a capital letter and end with a period. In the latter case, start it with a lowercase letter and add no ending punctuation (no period, no comma, no semicolon).

A list should be formatted uniformly: choose the first or second rule for all items in a list.

The above rules are adapted from the Microsoft style guide.

The sentence preceding a list can end either with a semicolon or a period.

Don’t add redundant conjunctions like “and”/“or” before the last list item.

General English punctuation rules still apply for text in lists.
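
The uniformity rule for lists can be approximated with a small check. This is a rough sketch: the function name `list_style` is hypothetical, it looks only at the first character and the ending punctuation, and it does not count words. A phrase that starts with a capitalized proper noun is reported as mixed, so treat the result as a prompt for review.

```python
def list_style(items):
    """Classify a list as 'sentences', 'phrases', or 'mixed'."""

    def is_sentence(item):
        # A complete sentence: capital letter first, period last.
        return item[:1].isupper() and item.endswith(".")

    def is_phrase(item):
        # A short phrase or term: lowercase first, no ending punctuation.
        return item[:1].islower() and not item.endswith((".", ",", ";"))

    if all(is_sentence(item) for item in items):
        return "sentences"
    if all(is_phrase(item) for item in items):
        return "phrases"
    return "mixed"
```

A result of "mixed" means the list follows neither rule consistently.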

Tables

For the text in cells, use periods or other end punctuation only if the cells contain complete sentences or a mixture of fragments and sentences. (This is also a Microsoft guideline for the English language.)

Besides, make sure that your table punctuation is consistent – either all similar list/table items end with a period or none of them do. In the example below, none of the items in the second column have ending punctuation. Meanwhile, all items in the fourth column end with a period, because they are a mix of fragments and sentences:

Items in one column have similar ending punctuation

To learn more about table formatting, check the table markup reference.

Localization

This section covers the specifics of localizing Tarantool into Russian. If you are translating Tarantool docs into Russian, be sure to check out our translation guidelines.

State of localization

Repository	State	Volume, words
Tarantool Community Edition		352 000
Tarantool Ansible Role		11 000
Tarantool Enterprise Edition		6 000
Tarantool Data Grid		4 000
Tarantool Metrics		4 000
Tarantool C++ Driver		4 000
Tarantool Kubernetes Operator		2 750
Tarantool Luatest		750
Tarantool Grafana Dashboard		500

Glossaries

Term [en]	Term [ru]	Description [en]	Description [ru]
space	спейс	A space is a container for tuples.
tuple	кортеж	A tuple plays the same role as a “row” or a “record”. The number of tuples in a space is unlimited. Tuples in Tarantool are stored as MsgPack arrays.
Tarantool	Tarantool	Do not translate.
primary index	первичный индекс	The first index defined on a space is called the primary key index, and it must be unique. All other indexes are called secondary indexes, and they may be non-unique. https://www.tarantool.io/en/doc/latest/book/box/data_model/#indexes
fiber	файбер	A fiber is a set of instructions which are executed with cooperative multitasking. Fibers managed by the fiber module are associated with a user-supplied function called the fiber function. https://www.tarantool.io/en/doc/latest/reference/reference_lua/fiber/#fibers
Tarantool garbage collector	сборщик мусора в Tarantool	A garbage collector fiber runs in the background on the master storages of each replica set. It starts deleting the contents of the bucket in the GARBAGE state part by part. Once the bucket is empty, its record is deleted from the _bucket system space. https://www.tarantool.io/en/doc/latest/reference/reference_rock/vshard/vshard_admin/#garbage-collector
Lua garbage collector	сборщик мусора на Lua	Lua manages memory automatically by running a garbage collector from time to time to collect all dead objects (that is, objects that are no longer accessible from Lua). https://www.lua.org/manual/5.1/manual.html#2.10
storage engine	движок базы данных	A storage engine is a set of very-low-level routines which actually store and retrieve tuple values. https://www.tarantool.io/en/doc/latest/book/box/engines/
thread	поток	A thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system.
Lua application	Lua-приложение, приложение на языке Lua	Tarantool’s native language for writing applications is Lua.
memtx	memtx
instance	экземпляр
implicit casting	неявное приведение типов
database	база данных
Release policy	Релизная политика	A set of rules for releasing and naming new distributions of Tarantool: where we add new features and where we don’t, how we give them numbers, what versions are suitable to use in production.
field	поле	Fields are distinct data values, contained in a tuple. They play the same role as «row columns» or «record fields» in relational databases.
leader election	выборы лидера	(in a replica set, by the Raft algorithm)
replica set	набор реплик
heartbeat	контрольный сигнал
functionality	функциональность
log	журнал
node	узел
follower	реплика
small allocator	аллокатор small		https://github.com/tarantool/small
patch	патч
breaking change	критическое изменение
parser	парсер
UUID	UUID
data type	тип данных
alias	алиас		или псевдоним?
push	выполнить push
MVCC	(механизм) MVCC
dirty read	“грязное чтение”		в кавычках
snapshot	снимок (данных)
keywords	ключевые слова
identifier	имя, идентификатор
clause	предложение, блок	(SQL) A clause in SQL is a part of a query that lets you filter or customize how your data is returned to you.
expression	выражение
predicate	предикат	(SQL) Predicates specify conditions that can be evaluated to SQL three-valued logic (3VL) (true/false/unknown) or Boolean truth values. They are used to limit the effects of statements and queries or to change program flow.
query	запрос	(SQL) Queries retrieve the data based on specific criteria. A query is a statement that returns a result set (possibly empty).
result set	результат запроса	(SQL) An SQL result set is a set of rows from a database, as well as metadata about the query such as the column names, and the types and sizes of each column. A result set is effectively a table.
statement	инструкция	(SQL) A statement is any text that the database engine recognizes as a valid command. Tarantool: A statement consists of SQL-language keywords and expressions that direct Tarantool to do something with a database. https://www.tarantool.io/en/doc/latest/reference/reference_sql/sql_user_guide/#statements	(SQL) Любой текст, который распознаётся движком БД как команда. Инструкция состоит из ключевых слов и выражений языка SQL, которые предписывают Tarantool выполнять какие-либо действия с базой данных.
batch	пакет (инструкций)	(SQL) A series of SQL statements sent to the server at once is called a batch.	(SQL) Серия SQL-инструкций (statements), отправляемая на сервер вместе
production configuration	конфигурация производственной среды
deployment	развертывание	Transforming a mechanical, electrical, or computer system from a packaged to an operational state. IT infrastructure deployment typically involves defining the sequence of operations or steps, often referred to as a deployment plan, that must be carried out to deliver changes into a target system environment.
roll back	отменить		транзакцию
deploy to production		IT infrastructure deployment typically involves defining the sequence of operations or steps, often referred to as a deployment plan, that must be carried out to deliver changes into a target system environment. A production environment is a setting where software and other products are actually put into operation for their intended uses by end users.
operations	эксплуатация	(DevOps) Information technology operations, or IT operations, are the set of all processes and services that are both provisioned by an IT staff to their internal or external clients and used by themselves, to run themselves as a business.
to deploy		Transforming a mechanical, electrical, or computer system from a packaged to an operational state. IT infrastructure deployment typically involves defining the sequence of operations or steps, often referred to as a deployment plan, that must be carried out to deliver changes into a target system environment.
deployment plan		A sequence of operations or steps that must be carried out to deliver changes into a target system environment.
production environment	производственная среда	Production environment is a term used mostly by developers to describe the setting where software and other products are actually put into operation for their intended uses by end users.
failover	восстановление после сбоев	In computing and related technologies such as networking, failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network.
directory	директория
bucket	сегмент
select	выберите, выбрать	To select a checkbox

Localization guidelines

Use this guide when localizing Tarantool into Russian.

Tone of voice

General voice

We address IT specialists fairly knowledgeable in their respective fields. The goal of our translations is to help these people understand how to use Tarantool. Think of us as their colleagues and address them as such. Be professional but friendly. Don’t command or patronize. Use colloquial speech but avoid being too familiar. It’s all about the golden mean.

Gender neutrality

Use gender-neutral expressions like «сделать самостоятельно» instead of «сделать самому», etc.

Term choice

Though not all of our readers may be fluent in English, they write in English-based programming languages and are used to seeing error messages in English. Therefore, if they see an unfamiliar and/or more archaic Russian term for a familiar concept, they might have trouble correlating them.

We don’t want our audience to feel confused, so we prefer newer terms. We also provide the English equivalent for a term if it is used in the article for the first time.

If you feel like an older Russian term may sound more familiar for a part of the audience (for example, those with a math background), consider adding it in parentheses along with the English equivalent. Don’t repeat the parentheses throughout the text. A similar rule applies to introducing terms in Tarantool documentation.

Term choice examples

	First time	All following times
state machine	машина состояний (конечный автомат, state machine)	машина состояний
write-ahead log; WAL	журнал упреждающей записи (write-ahead log, WAL)	журнал упреждающей записи; WAL; журнал WAL (using a descriptor)

Best practices

Be creative

Please avoid word-for-word translations. Let the resulting text sound as though it was originally written in Russian.

Less is more

Be concise and don’t repeat yourself. Fewer words are the best option most of the time.

Don’t	Do
Профиль доступа можно назначить для любой роли пользователя, созданной администратором. А к ролям по умолчанию привязать профили доступа не получится, поскольку такие роли редактировать нельзя.	Профиль доступа можно назначить для любой роли пользователя, созданной администратором. Исключение составляют роли по умолчанию, поскольку их нельзя редактировать.

Topic and focus

Avoid English word order.

Russian speech is structured with topic and focus (тема и рема). The topic is the given in the sentence, something we already know. The focus is whatever new or important information the sentence provides about the topic. In written Russian, the focus most often stands at the end of the sentence, while in English, sentences may start with it.

It is recommended to use `systemd` for managing the application instances and accessing log entries.	Для управления экземплярами приложения и доступа к записям журнала рекомендуется использовать `systemd`.
Do not specify working directories of the instances in this configuration.	Не указывайте в этой конфигурации рабочие директории экземпляров.

No bureaucratese

Avoid overly formal, bureaucratic language whenever possible. Prefer verbs over verbal nouns, and don’t use «являться» and «осуществляться» unless it’s absolutely necessary.

To learn how to clear your Russian texts of bureaucratese, check Timur Anikin’s training “The Writing Dead”.

Don’t	Do
Сообщение исчезнет, как только вы покинете данную страницу.	Сообщение исчезнет, как только вы покинете страницу.
Проверка истечения срока действия паролей производится раз в 30 минут.	Раз в 30 минут система проверяет, не истек ли срок действия паролей.

Consistency

Use one term for one concept throughout the article. For example, only translate production as «производственная среда» and not as «эксплуатационная среда» throughout your article. It’s not about synonyms, but about terms: we don’t want people to get confused.

Avoid elliptical sentences

	Don’t	Do
Defaults to `root`.	По умолчанию — `root`.	Значение по умолчанию — `root`.

Pronoun collocations

Do all the pronouns point to the exact nouns you want them to?

Example (how not to): Прежде чем добавить запись в конфигурацию, укажите к ней путь.

In the example, it is not quite clear what «к ней» means – to the record or to the configuration. For more on this issue, check out the writers’ reference at «Ошибкариум».

Be critical towards your text

Don’t forget to proofread your translation. Check your text at least twice.

Be nice to your peers

If you review others’ translations, be gentle and kind. Everyone makes mistakes, and nobody likes to be punished for them. You can use phrasings like “I suggest” or “it’s a good idea to… .”

Defining and using terms

What are concepts and terms

To write well about a certain subject matter, one needs to know its details and use the right, carefully selected words for them. These details are called concepts, and the words for them are called terms.

concept

A concept is the idea of an object, attribute, or action. It is independent of languages, audience, and products. It just exists.

For example, a large database can be partitioned into smaller instances. Those instances are easier to operate, and their throughput often exceeds the throughput of a single large database instance. The instances can exchange data to keep it consistent between them.

term

A term is a word explicitly selected by the authors of a particular text to denote a concept in a particular language for a particular audience.

For example, in Tarantool, we use the term “[database] sharding” to denote the concept described in the previous example.

Use preferred terms

The purpose of using terms is writing concisely and unambiguously, which is good for the readers. But selecting terms is hard. Often, the community favors two or more terms for one concept, so there’s no obvious choice. Selecting and consistently using any of them is much better than not making a choice and using a random term every time. This is why it’s also helpful to restrict the usage of some terms explicitly.

restricted term

A restricted term is a word that the authors have explicitly prohibited from being used to denote a concept. Such a word is sometimes used as a term for the same concept elsewhere – in the community, in books, or in other product documentation. Sometimes this word is used to denote a similar but different concept. In this case, the right choice of terms helps us differentiate between concepts.

For example, in Tarantool, we don’t use the term “[database] segmentation” to denote what we call “database sharding.” Nevertheless, other authors might do so. We also use the term “[database] partitioning” to denote a wider concept, which includes sharding among other things.

Define terms by explaining concepts

We always want to document definitions for the most important concepts, as well as for concepts unique to Tarantool.

Define every term in the document that you find most appropriate for it. You don’t have to create a dedicated glossary page containing all the definitions.

To define a term, use the glossary directive in the following way:

..  glossary::

    term
        definition text

    term2
        definition text

There can be several glossary directives in a Sphinx documentation project and even in a single document. This page has two of them, for example.

The Sphinx documentation has an extensive glossary that can be used as a reference.

Introduce terms on first entry

When you use a term in a document for the first time, define it and provide synonyms, a translation, examples, and/or links. It will help readers learn the term and understand the concept behind it.

Define the term or give a link to the definition.

Database sharding is a type of horizontal partitioning.

To give a link to the definition, use the term role:
```
For example, this is a link to the definition of :term:`concept`.
Like any rST role, it can have :term:`custom text <concept>`.
```
The resulting output will look like this:

For example, this is a link to the definition of concept. Like any rST role, it can have custom text.

With acronyms, you can also use the abbr role:
```
Delete the corresponding :abbr:`PVC (persistent volume claim)`...
```
It produces a tooltip link: PVC.

Provide synonyms, including the restricted terms. Only do it on the first entry of a term.

Database sharding (also known as …) is a type of…

When writing in Russian, it’s good to add the corresponding English term. Readers may be more familiar with it or can search for it online.

Шардирование (сегментирование, sharding) — это…

Give examples or links to extra reading where you can.

Markup reference

Tarantool documentation is built via the Sphinx engine and is written in reStructuredText. This section will guide you through our typical documentation formatting cases.

General syntax guidelines

Basic syntax

Paragraphs contain text and may contain inline markup: emphasis, strong emphasis, interpreted text, inline literals.

Text can be organized in bullet-lists:

*   This is a bullet list.

*   Bullets can be "*", "+", or "-".

    -   Lists can be nested. And it is good to indent them with 4 spaces.

or in enumerated lists:

1.  This is an enumerated list.

2.  Tarantool build uses only arabic numbers as enumerators.

#.  You can put #. instead of point numbers and Sphinx will
    recognize it as an enumerated list.

Wrapping text

It’s good practice to wrap lines in documentation source text. It makes the source more readable and results in smaller git diffs. The recommended limit is 80 characters per line for plain text.

In new documents, try to wrap lines by sentences, or by parts of a complex sentence. Don’t wrap formatted text if it affects rST readability and/or HTML output. However, wrapping with proper indentation shouldn’t break things.
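
The 80-character recommendation can be checked mechanically. Below is a minimal Python sketch, not part of the Tarantool tooling; the function name `find_long_lines` is hypothetical. It reports every line that is longer than the limit. Treat the hits as candidates only, because formatted text that would break when wrapped is allowed to stay long.

```python
def find_long_lines(text, limit=80):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    long_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > limit:
            long_lines.append((number, len(line)))
    return long_lines


# The second line has 81 characters, one over the recommended limit.
sample = "Short line.\n" + "x" * 81 + "\nAnother short line."
print(find_long_lines(sample))  # prints [(2, 81)]
```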

Indentation

In rST, indents play exactly the same role as in Python: they denote object boundaries and nesting.

For example, a list starts with a marker, then come some spaces and text. From there, all lines relating to that list item must be at the same indentation level. We can continue the list item by creating a second paragraph in it. To do that we have to leave it at the same level.

We can put a new object inside: another list, or a block of code. Then we have to indent 4 more spaces.

It’s best if all indents are multiples of 4 spaces, even in lists. Otherwise the document is not consistent. Also, it is much easier to put indents with tabs than manually.

Note that to keep this alignment, you have to put two or three spaces after a list or directive marker instead of one. rST markup allows it:

|...|...|...|...
*   unordered list
#.  ordered list
..  directive::
|...|...|...|...

Example:

|...|...|...|...
#.  List item 1.
    Paragraph continues.

    Second paragraph.

#.  List item 2.

    *   Nested list item.

        ..  code-block:: bash

            # this code block is in a nested list item

    *   Another nested list item.
|...|...|...|...

Resulting output:

List item 1. Paragraph continues.

Second paragraph.
List item 2.
Nested list item.
# this code block is in a nested list item
Another nested list item.
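
The multiple-of-4 rule can be checked with a short script. The following Python sketch is an illustration under assumptions: the name `find_odd_indents` is hypothetical, and the check only counts leading spaces, so it knows nothing about rST structure. Continuation lines that are deliberately aligned to a longer marker will also be reported.

```python
def find_odd_indents(text, step=4):
    """Return (line_number, indent) pairs whose indent is not a multiple of `step`."""
    issues = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no indentation information
        indent = len(line) - len(line.lstrip(" "))
        if indent % step != 0:
            issues.append((number, indent))
    return issues


sample = (
    "#.  List item 1.\n"
    "    Paragraph continues.\n"
    "\n"
    "  * Badly indented nested item.\n"
)
print(find_odd_indents(sample))  # prints [(4, 2)]
```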

Making comments

Sometimes we may need to leave comments in an rST file. To make Sphinx ignore some text during processing, use the following per-line notation with .. // as the comment marker:

.. // your comment here

The starting characters .. // do not interfere with the other rST markup, and they are easy to find both visually and using grep. To find comments in source files, go ahead with something like this:

$ grep -n "\.\. //" doc/reference/**/*.rst
doc/reference/reference_lua/box.rst:47:.. // moved to "User Guide > 5. Server administration":
doc/reference/reference_lua/box.rst:48:.. // /book/box/triggers
...

If you’re working with PyCharm or another similar IDE, links in the console will be clickable and will lead right to the source file and line. Check it out!

These comments don’t work properly in nested documentation, though. For example, if you leave a comment in module -> object -> method, Sphinx ignores the comment and all nested content that follows in the method description.

Headings

Heading markup

We use the following markup for headings:

Level 1 heading
===============

Level 2 heading
---------------

Level 3 heading
~~~~~~~~~~~~~~~

Level 4 heading
^^^^^^^^^^^^^^^

The underlining should be exactly the same length as the heading text above it. Mismatching length will result in a build warning.

Sphinx allows using other characters and styles to format headings. However, using this markup consistently helps us better reuse and move content. It also helps us recognize the heading level immediately without reading the whole document and calculating levels.

If you’re going to make a 4th or 5th level heading, you probably need to split the document instead.
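
The underline-length rule is easy to verify before running a build. Here is a minimal Python sketch, assuming that headings use only the four underline characters listed above; the name `find_heading_mismatches` is hypothetical. It is a rough check rather than an rST parser, so other constructs that look like underlined text may be reported as well.

```python
UNDERLINE_CHARS = "=-~^"  # the four heading levels used in this guide


def find_heading_mismatches(text):
    """Return (line_number, title) pairs where underline and title lengths differ."""
    lines = text.splitlines()
    mismatches = []
    for index in range(1, len(lines)):
        title, underline = lines[index - 1], lines[index]
        # An underline is a run of one repeated heading character.
        is_underline = (
            len(underline) >= 3
            and underline[0] in UNDERLINE_CHARS
            and underline == underline[0] * len(underline)
        )
        if is_underline and title.strip() and len(title) != len(underline):
            # `index` is the 0-based position of the underline,
            # which equals the 1-based line number of the title.
            mismatches.append((index, title))
    return mismatches


sample = (
    "Level 1 heading\n"
    + "=" * 15 + "\n"
    "\n"
    "Level 2 heading\n"
    + "-" * 11 + "\n"
)
print(find_heading_mismatches(sample))  # prints [(4, 'Level 2 heading')]
```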

Title headings

The top-level heading of each document plays the important role of a document title. The title’s text is used in several places:

Literally as an <h1> tag in HTML or top-level heading in other formats.
Text in the breadcrumbs — the path to the document shown above the text.
The :doc: link’s default text.
Part of the page’s title tag contents, used as the browser tab name.
```
<title>
    Documentation guidelines | Tarantool
</title>
```
Potentially, the page’s OpenGraph metadata which is used for building page preview cards on social networks and messengers.
```
<meta property="og:title" content="Documentation guidelines">
```

It’s hard to navigate in a hierarchy of more than three heading levels.

Links and references

Linking to other documentation pages

To create a link to another document in our documentation, we use the :doc: role. For example, this link points to the document /reference/reference_lua/box_error.rst:

:doc:`box.error reference </reference/reference_lua/box_error>`

Our convention is to put the full path to the referred document so that we can easily replace the path if it changes. Note that we can omit the .rst part of the filename.

You can use the target document’s title as the link text. To do so, omit the text in the link definition:

:doc:`/reference/reference_lua/box_index`

And you will get this:

Submodule box.index

Linking to labels (anchors)

To generate a link to a certain place in a page, we use the :ref: role. For this purpose, we add our own labels for linking to any place in this documentation.

Our naming convention is as follows:

Character set: a through z, 0 through 9, hyphen, underscore.
Format: path hyphen filename hyphen tag

Example:

..  _c_api-box_index-iterator_type:

where:

c_api is the directory name,
box_index is the file name (without “.rst”), and
iterator_type is the tag.

Use a hyphen “-” to delimit the path and the file name. In the documentation source, we use only underscores “_” in paths and file names, reserving the hyphen “-” as the delimiter for local links.

The tag can be anything meaningful. The only guideline is for Tarantool syntax items (such as members), where the preferred tag syntax is module_or_object_name hyphen member_name. For example, box_space-drop.
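
The naming convention can be expressed as a regular expression. The Python sketch below is an illustration, not an official check. It assumes that the path part is a single directory name without hyphens, and the names `LABEL_RE` and `parse_label` are hypothetical. The tag part may contain further hyphens, as in the preferred module_or_object_name hyphen member_name form.

```python
import re

# path "-" filename "-" tag, using only a-z, 0-9, hyphen, and underscore.
LABEL_RE = re.compile(
    r"^(?P<path>[a-z0-9_]+)-(?P<filename>[a-z0-9_]+)-(?P<tag>[a-z0-9_-]+)$"
)


def parse_label(label):
    """Split a label into (path, filename, tag), or return None if it does not fit."""
    match = LABEL_RE.match(label)
    if match is None:
        return None
    return match.group("path"), match.group("filename"), match.group("tag")


print(parse_label("c_api-box_index-iterator_type"))
# prints ('c_api', 'box_index', 'iterator_type')
print(parse_label("C_API box index"))
# prints None
```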

To add a link to an anchor, use the following syntax:

Check out the :ref:`Quick start guide <vshard-quick-start>`.

The result will be like this:

Check out the Quick start guide.

Linking to external resources

To make an external link, use the following syntax:

Feel free to report an issue at `Tarantool GitHub <https://github.com/tarantool/tarantool/issues>`_.

Avoid separating the link and the target definition, like this:

Feel free to report an issue at `Tarantool GitHub`_.

..  _Tarantool GitHub: https://github.com/tarantool/tarantool/issues

because every separated link tends to cause trouble when this documentation is translated into other languages.
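
Separated link targets can be located with a simple search. The following Python sketch is one possible approach under assumptions: the names `TARGET_RE` and `find_separated_targets` are hypothetical, and only targets that point to http or https URLs are matched. Label definitions for the :ref: role have no URL after the colon, so they are not reported.

```python
import re

# Matches a standalone hyperlink target such as "..  _Name: https://example.com".
TARGET_RE = re.compile(r"^\.\.\s+_(?P<name>[^:]+):\s+(?P<url>https?://\S+)\s*$")


def find_separated_targets(text):
    """Return (line_number, name, url) for every separated link target."""
    results = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = TARGET_RE.match(line)
        if match:
            results.append((number, match.group("name"), match.group("url")))
    return results


sample = (
    "Feel free to report an issue at `Tarantool GitHub`_.\n"
    "\n"
    "..  _Tarantool GitHub: https://github.com/tarantool/tarantool/issues\n"
    "..  _c_api-box_index-iterator_type:\n"
)
print(find_separated_targets(sample))
# prints [(3, 'Tarantool GitHub', 'https://github.com/tarantool/tarantool/issues')]
```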

Tables

Tables are very useful and rST markup offers different ways to create them.

We prefer list-tables because they allow you to put as much content as you need without painting ASCII-style borders:

..  container:: table

    ..  list-table::
        :widths: 25 75
        :header-rows: 1

        *   -   Name
            -   Use

        *   -   :doc:`/reference/reference_lua/box_ctl/wait_ro`
            -   Wait until ``box.info.ro`` is true

This is what the table will look like:

Name	Use
box.ctl.wait_ro()	Wait until `box.info.ro` is true

Notice that we use * and then - in tables because it is more readable when rows and columns are marked differently.
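
When table data already exists in a structured form, the list-table source can be generated instead of typed by hand. This Python sketch is a hypothetical helper, `make_list_table`, not part of the documentation build. It assumes single-line cells and follows the layout shown above: a table container, 4-space indents, * for rows, and - for cells.

```python
def make_list_table(header, rows, widths):
    """Build the rST source of a list-table wrapped in a `table` container."""
    lines = [
        "..  container:: table",
        "",
        "    ..  list-table::",
        "        :widths: " + " ".join(str(width) for width in widths),
        "        :header-rows: 1",
        "",
    ]
    for row in [header] + rows:
        for position, cell in enumerate(row):
            # The first cell opens a new row with "*", the rest continue it.
            prefix = "        *   -   " if position == 0 else "            -   "
            lines.append(prefix + cell)
        lines.append("")  # a blank line separates rows
    return "\n".join(lines)


table = make_list_table(
    ["Name", "Use"],
    [["``box.info.ro``", "Shows whether the instance is read-only"]],
    [25, 75],
)
print(table)
```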

Writing about code

When writing articles, you need to format code specially, separating it from other text. This document will guide you through typical cases when it is recommended to use code highlighting.

Defining what code is

In general, code is any text processed by a machine. An expression is also probably code if it contains characters that ordinary words do not have, such as _, {}, [ ], or a period (.). Also, you should format the expression as code if it fits at least one of the items in the list below:

parts of a programming language: names of classes, variables, and functions, short expressions, data types and so on,
multiline fragments of application logs,
example link which the reader will not open: example.com, https://example.com:80,
parts of URL, like port number,
package names,
CLI app names.

Items we don’t format as code:

names of products, organizations and services, for example, Tarantool, memtx, vinyl
well-established terms such as stdin and stdout

Keep in mind that grammar doesn’t apply to code, even inline.

Correct: “use `shellcheck` to analyze your Bash code”.
Incorrect: “`shellcheck` your Bash code”. Please do not use code as a verb.
Even worse: “shellcheck your Bash code” (with no code formatting). There’s no such word in English and we don’t explain what to use.
Cursed: “try shellchecking your Bash code”. There’s no such word and no such application.

Code blocks and inline code

If you have to choose between inline code and code block highlighting, pay attention to the following guidelines:

Code snippets

Use code blocks when you have to highlight multiple lines of code. Also, use them if your code snippet is a standalone element that is not a part of the article’s text.

For code snippets, we use the code-block:: language directive. You can enable syntax highlighting if you specify the language for the snippet. The most commonly used highlighting languages are:

tarantoolsession – interactive Tarantool session, where command lines start with tarantool> prompt.
console – interactive console session, where command lines start with $ or #.
lua, bash or c for programming languages.
text for cases when we want the code block to have no highlighting.

Sphinx uses the Pygments library for highlighting source code. For a complete list of possible languages, see the list of Pygments lexers.

For example, a code snippet in Lua:

..  code-block:: lua

    for page in paged_iter("X", 10) do
      print("New Page. Number Of Tuples = " .. #page)
      for i=1,#page,1 do print(page[i]) end
    end

Lua syntax is highlighted in the output:

for page in paged_iter("X", 10) do
  print("New Page. Number Of Tuples = " .. #page)
  for i=1,#page,1 do print(page[i]) end
end

Note that in code blocks you can write comments and translate them:

..  //Here is the first comment.
..  //Here is the second comment.

Inline code

Use inline code when you need to wrap a short snippet of code in text, such as a variable name or a function definition. Keep in mind that inline code doesn’t have syntax highlighting.

To format some inline text as code, enclose it with double ` characters or use the :code: role:

*   Formatting code with backticks: ``echo "Hello world!"``.

*   Formatting code with a role: :code:`echo "Hello world!"`.

Both options produce the same output:

Formatting code with backticks: echo "Hello world!".

Formatting code with a role: echo "Hello world!".

Notes on using inline-code

If you have expressions such as id==4, you should format the whole expression as inline code. Alternatively, you can use the words “equals”, “doesn’t equal”, or similar without formatting the expression as code. Both variants are correct.
Inline code can be used to highlight expressions that are hard to read, for example, words containing il, Il or O0.

Highlighting variables in code

If you need to mark up a placeholder inside code inline, use the :samp: or our custom :extsamp: role, like this:

:samp:`{space_object}:insert(\\{ffi.cast('double', {value})\\})`

:extsamp:`{*{space_object}*}:insert({ffi.cast('double', {**{value}**})})`

And you will get this:

space_object:insert({ffi.cast('double', value)})

space_object:insert({ffi.cast('double', value)})

Notice two backslashes before the curly brackets in the first line. They are needed to escape curly brackets from Lua syntax.

As you can see, :extsamp: extends the abilities of :samp:. It allows you to highlight placeholders in both italics and bold and avoid escaping curly brackets. :extsamp: has the following syntax:

{*{element}*} for italic
{**{element}**} for bold

If you need to mark up a placeholder in code block, use the following syntax:

..  cssclass:: highlight
..  parsed-literal::

    :samp:`box.space.{space-name}:create_index('{index-name}')`

The output will look like this:

box.space.space-name:create_index('index-name')

Formatting file and directory names

If you need to highlight a standalone file name or a path to a file in text, use the :file: role. You can use curly braces inside this role to mark up a replaceable part:

:file:`/usr/bin/example.py`

:file:`/usr/{dirname}/example.py`

:file:`/usr/{dirname}/{filename.ext}`

And you will get this:

/usr/bin/example.py

/usr/dirname/example.py

/usr/dirname/filename.ext

Referring to GUI elements

To mention a GUI element, use the :guilabel: role:

Click the :guilabel:`OK` button.

Admonitions

Sometimes you need to highlight a piece of information. For this purpose we use admonitions.

In Tarantool, we have four variants of CSS styles for admonitions:

Note:
```
..  note::
```
Note

This is a note. We use it to highlight extra information that might be helpful for users.

For example, here we provide a user with extra information about using the net_box.new() function.
Warning:
```
..  warning::
```
Warning

This is a warning. As you might guess, we use it to warn users about something.

For example, in the description of the box.session.on_connect() trigger, we warn users about some consequences of their actions.
Important:
```
..  important::
```
Important

This block contains essential information that the user should know while doing something.
Custom admonition:
```
..  admonition:: Your title
    :class: fact
```
Your title

This is a fact. fact is our custom CSS class. Use it when neither note nor warning fits.

Note that this type requires a title.

For example, here we highlight the rules that are necessary to read, and that’s why we use fact.

The docutils documentation offers many more variants for admonitions, but for now these four are enough for us. If you think that it is time to create a new style for some of these types, feel free to contribute or contact us to create a task.

Documenting the API

This document contains general guidelines for describing the Tarantool API, as well as examples and templates.

Style

Please write as simply as possible. Describe functionality using short sentences in the present simple tense. A short sentence consists of no more than two clauses. Consider using LanguageTool or Grammarly to check your English. For more style-related specifics, consult the Language and style section.

Indicating the version

For every new module, function, or method, specify the version it first appears in.

For a new parameter, specify the version it first appears in only if the parameter is a “feature” and was introduced in a later version than the function or method and its other parameters.

To specify the version, use the following Sphinx markup:

Since :doc:`2.10.0 </release/2.10.0>`.
This is a link to the release notes on the Tarantool documentation website.

The result looks like this:

Since Tarantool 2.10.0. This is a link to the release notes on the Tarantool documentation website.

Language of the general description

Use one of the two options:

Start with a verb in the imperative mood. Example: Create a fiber.
Start with a noun. Example: The directory where memtx stores snapshot files.

Checklist

Each list item is a characteristic to be described. Some items can be optional.

Function or method

Since which Tarantool version
General description
Parameters
What this function returns (if nothing, write ‘none’)
Return type (if exists)
Possible errors (if exist)
Complexity factors (for CRUD operations and index access functions)
Usage with memtx and vinyl (if differs)
Example(s)
Extra information (if needed)

See module function example, class method example.

Data

Since which Tarantool version
General description
Return type
Example

See class data example.

Function and method parameters

Since which Tarantool version (if added later)
General description
Type
Default value (if optional), possible values

If the parameter is optional, make sure it is enclosed in square brackets in the function declaration (in the “heading”). Do not mark parameters additionally as “optional” or “required”:

..  function:: format(URI-components-table[, include-password])

    Construct a URI from components.

    :param URI-components-table: a series of ``name:value`` pairs, one for each component
    :param include-password: boolean. If this is supplied and is ``true``, then
                             the password component is rendered in clear text,
                             otherwise it is omitted.
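
The square-bracket convention makes it possible to tell required and optional parameters apart from the declaration alone. The Python sketch below illustrates this; the name `split_parameters` is hypothetical, and the sketch assumes that all optional parameters come after the required ones, as in the declaration above.

```python
import re


def split_parameters(signature):
    """Return (required, optional) parameter names from a declaration like 'f(a[, b])'."""
    inner = signature[signature.index("(") + 1 : signature.rindex(")")]
    optional_part = ""
    if "[" in inner:
        # Everything after the first "[" is optional.
        inner, optional_part = inner.split("[", 1)
    required = [name.strip() for name in inner.split(",") if name.strip()]
    optional = [
        name.strip()
        for name in re.split(r"[\[\],]", optional_part)
        if name.strip()
    ]
    return required, optional


print(split_parameters("format(URI-components-table[, include-password])"))
# prints (['URI-components-table'], ['include-password'])
```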

Configuration parameters

Configuration parameters are not to be confused with class and method parameters. Configuration parameters are passed to Tarantool via the command line or in an initialization file. You can find a list of Tarantool configuration parameters in the configuration reference.

Since which Tarantool version
General description
Type
Corresponding environment variable (if applicable)
Default value
Possible values (can be included in the general description, for example, as a list)
Dynamic (yes or no)

See configuration parameter example.

Documenting possible errors

In the “Possible errors” section of a function or class method, consider explaining what happens if any parameter hasn’t been defined or has the wrong value.

Examples and templates

Module functions

We use the Sphinx directives .. module:: and .. function:: to describe functions of Tarantool modules:

..  module:: fiber

..  function:: create(function [, function-arguments])

    Create and start a fiber. The fiber is created and begins to run immediately.

    :param function: the function to be associated with the fiber
    :param function-arguments: what will be passed to function.

    :return: created fiber object
    :rtype: userdata

    **Example:**

    ..  code-block:: tarantoolsession

        tarantool> fiber = require('fiber')
        ---
        ...
        tarantool> function function_name()
                 >   print("I'm a fiber")
                 > end
        ---
        ...
        tarantool> fiber_object = fiber.create(function_name); print("Fiber started")
        I'm a fiber
        Fiber started
        ---
        ...

The resulting output looks like this:

fiber.create(function[, function-arguments])¶

Create and start a fiber. The fiber is created and begins to run immediately.

Parameters:
function – the function to be associated with the fiber
function-arguments – what will be passed to function
Return:	created fiber object
Rtype:	userdata

Example:

tarantool> fiber = require('fiber')
---
...
tarantool> function function_name()
         >   print("I'm a fiber")
         > end
---
...
tarantool> fiber_object = fiber.create(function_name); print("Fiber started")
I'm a fiber
Fiber started
---
...

Class methods and data

Methods are described similarly to functions, but the .. class:: directive, unlike .. module::, requires nesting.

As for data, it’s enough to write the description, the return type, and an example.

Here is the example documentation describing the method and data of the index_object class:

..  class:: index_object

    ..  method:: get(key)

        Search for a tuple :ref:`via the given index <box_index-note>`.

        :param index_object index_object: :ref:`object reference
                                          <app_server-object_reference>`
        :param scalar/table      key: values to be matched against the index key

        :return: the tuple whose index-key fields are equal to the passed key values
        :rtype:  tuple

        **Possible errors:**

        * No such index
        * Wrong type
        * More than one tuple matches

        **Complexity factors:** index size, index type.
        See also :ref:`space_object:get() <box_space-get>`.

        **Example:**

        ..  code-block:: tarantoolsession

            tarantool> box.space.tester.index.primary:get(2)
            ---
            - [2, 'Music']
            ...

    ..  data:: unique

        True if the index is unique, false if the index is not unique.

        :rtype: boolean

        ..  code-block:: tarantoolsession

            tarantool> box.space.tester.index.primary.unique
            ---
            - true
            ...

And the resulting output looks like this:

object index_object¶

index_object:get(key)¶

Search for a tuple via the given index.

Parameters:
index_object (`index_object`) – object reference
key (`scalar/table`) – values to be matched against the index key
Return:	the tuple whose index-key fields are equal to the passed key values
Rtype:	tuple

Possible errors:

No such index
Wrong type
More than one tuple matches

Complexity factors: index size, index type. See also space_object:get().

Example:

tarantool> box.space.tester.index.primary:get(2)
---
- [2, 'Music']
...

index_object.unique¶

True if the index is unique, false if the index is not unique.

Rtype:	boolean

tarantool> box.space.tester.index.primary.unique
---
- true
...

Configuration parameters

Example:

.. _cfg_basic-vinyl_dir:

.. confval:: vinyl_dir

    Since version 1.7.1.

    A directory where vinyl files or subdirectories will be stored. Can be
    relative to :ref:`work_dir <cfg_basic-work_dir>`. If not specified, defaults
    to ``work_dir``.

    |
    | Type: string

Result:

vinyl_dir¶

Since version 1.7.1.

A directory where vinyl files or subdirectories will be stored. Can be relative to work_dir. If not specified, defaults to work_dir.

Type: string

Images

Images are useful in explanations of concepts and structures. When you introduce a term or describe a structure of multiple interconnected parts (such as a cluster), consider illustrating it with a diagram. If you are explaining how to use a GUI, check if a screenshot can make the doc clearer.

Note that illustrations should complement the text, not replace it. Even with an image, the text should be enough for readers to understand the topic.

Don’t overuse images: they are harder to support than text. Use them only if they bring an obvious benefit.

Diagrams

There is a basic set of diagram elements – blocks, arrows, and others – to use in Tarantool docs. It is stored in this Miro board, which also provides basic rules for creating diagrams.

Size

There are two sizes of diagram elements:

M – bigger elements to use in diagrams with a small number of elements.
S – smaller elements to use in diagrams with a big number of elements.

Avoid changing the size of diagram elements unless it’s absolutely necessary.

The diagrams should have the same width. This guarantees that their elements have the same size on pages. The examples in the Miro board have frames of the right width. Copy the frame and place your diagram in it without changing the frame width.

Exporting

To save the diagram to a file:

Make the frame transparent so that it isn’t shown in the resulting image (set its color to “no color”).
Select all elements together with the frame and click Copy as image in the context menu (under the three dots). The image will be copied to the clipboard.
Paste the image from the clipboard to any graphic editor, for example, GIMP.
Remove the Miro logo in the bottom right corner.
Export/save the image to PNG.

Screenshots

Take screenshots with any tool you like.

Ensure screenshot consistency on the page:

Screenshots must show the same environment: operating system, product version, visual theme, and so on.
The configuration and data must be consistent. For example, if you’ve shown spaces with data on a screenshot, subsequent screenshots must have the same data, too.
Size and resolution must be the same across the page unless you want to zoom in to a specific part of the screen.

Markup

Insert the images using the image directive:

..  image:: images/example_diagram.png
    :alt: Example diagram alt text

Building Tarantool Docs

How to build Tarantool documentation using Docker

See Docker

Prepare for work

First of all, pull the image for building the docs.

docker pull tarantool/doc-builder:fat-4.3

Next, initialize a Makefile for your OS:

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "cmake ."

Update submodules and generate documentation sources from code

A big part of documentation sources comes from several other projects, connected as Git submodules. To include their latest contents in the docs, run these two steps.

Update the submodules:
```
git submodule update --init
git fetch --recurse-submodules
git submodule update --remote --checkout
```
This will initialize Git submodules and update them to the top of the stable branch in each repository.

git submodule update can sometimes fail, for example, when you have changes in submodules’ files. You can reinitialize submodules to fix the problem.

Caution: all untracked changes in submodules will be lost!
```
git submodule deinit -f .
git submodule update --init
```
Note that there’s an option to update submodule repositories with a make command. However, it’s intended for use in a CI environment and not on a local machine.
```
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make pull-modules"
```
Build the submodules content:
```
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make build-modules"
```
This command will do two things:
1. Generate documentation source files from the source code
2. Copy these files to the right places under the ./doc/ directory.
If you’re editing submodules locally, repeat this step to view the updated results.

Now you’re ready to build and preview the documentation locally.

Build and run the documentation on your machine

When editing the documentation, you can set up a live-reload server. It will build your documentation and serve it on 127.0.0.1:8000. Every time you make changes in the source files, it will rebuild the docs and refresh the browser page.

docker run --rm -it -p 8000:8000 -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make autobuild"

The first build will take some time. When it’s done, open 127.0.0.1:8000 in the browser. From then on, every change you make is rebuilt in a few seconds, and the browser tab with the preview reloads automatically.

You can also build the docs manually with make html, and then serve them using python3 built-in server:

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make html"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make html-ru"
python3 -m http.server --directory output/html

or python2 built-in server:

cd output/html
python -m SimpleHTTPServer

then go to localhost:8000 in your browser.

There are other commands which can run in the tarantool/doc-builder container:

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make html"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make html-ru"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make singlehtml"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make singlehtml-ru"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make pdf"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make pdf-ru"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make json"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make json-ru"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make epub"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make epub-ru"
docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make update-po"

Linkcheck

There’s a specific build mode which checks internal and external links instead of producing a document.

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make linkcheck"

If you need to save the linkcheck’s report in a file, you can use the following trick:

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make linkcheck" 2>&1 | tee linkcheck.log

Here 2>&1 redirects the stderr output to stdout, and then tee both shows it on screen and writes it to a file.

Vale

Tarantool documentation uses the Vale linter for checking grammar, style, and word usage. Its configuration is placed in the vale.ini file located in the root project directory.

To enable RST support in Vale, you need to install Sphinx. Then, you can enable Vale integration in your IDE.

Localization

Terms:

translation unit (TU) is an atomic piece of text which can be translated: a paragraph, a list item, a heading, an image’s alt text, and so on.
translation source files are the files with translation units in English only. They’re located in locale/en.
translation files are the files which match original text to translated text. They’re located in locale/ru.

To update the translation files, run the make update-po task:

docker run --rm -it -v $(pwd):/doc tarantool/doc-builder:fat-4.3 sh -c "make update-po"

Translate the strings in the updated files and then commit the changes.

How to contribute

To contribute to documentation, use the reStructuredText (reST) format for drafting and submit your updates as a pull request via GitHub.

To comply with the writing and formatting style, use the guidelines provided in the documentation, common sense and existing documents.

Notes:

If you suggest creating a new documentation section (a whole new page), it has to be saved to the relevant section on GitHub.
If you want to contribute to localizing this documentation (for example, into Russian), add your translation strings to .po files stored in the corresponding locale directory (for example, /locale/ru/LC_MESSAGES/ for Russian). See more about localizing with Sphinx at http://www.sphinx-doc.org/en/stable/intl.html.

Sphinx-build warnings reference

This document will guide you through the warnings that can be raised by Sphinx while building the docs.

Below are the most frequent warnings and the ways to solve them.

Bullet list ends without a blank line; unexpected unindent

Similar warning: Block quote ends without a blank line; unexpected unindent

Example:

*   The last point of bullet list
This text should start after a blank line

Solution:

*   The last point of bullet list

This text should start after a blank line

Could not lex literal_block as “…”. Highlighting skipped

This warning means that there’s a code-block with an unknown lexer. Most probably, it’s a typo. Check out the full list of Pygments lexers for the right spelling.

Example:

..  code-block:: cxx

    // some code here

Solution:

..  code-block:: cpp

    // some code here

However, sometimes there’s no appropriate lexer or the code snippet can’t be lexed properly. In that case, use code-block:: text.

Duplicate explicit target name: “…”

Example:

*   `Install <https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>`_
    ``git``, the version control system.

*   `Install <https://linuxize.com/post/how-to-unzip-files-in-linux/>`_
    the ``unzip`` utility.

Solution:

Sphinx raises this warning when different targets have the same name. Sphinx developers recommend using a double underscore __ in such cases to avoid this.

*   `Install <https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>`__
    ``git``, the version control system.

*   `Install <https://linuxize.com/post/how-to-unzip-files-in-linux/>`__
    the ``unzip`` utility.

Document isn’t included in any toctree

This warning means that you forgot to put the document name in the toctree.

Solution:

If you don’t want to include the document in a toctree, place the :orphan: directive at the top of the file. If this file is already included somewhere or reused, add it to the _includes directory. Sphinx ignores everything in this directory because we list it among exclude_patterns in conf.py.

Duplicate label “…”, other instance in “…/…/…”

This happens if you include the contents of a file into another file, when the included file has tags in it. In this case, Sphinx thinks the tags are repeated.

Solution:

As in the previous case, add the file to _includes or avoid using tags in it.

Malformed hyperlink target

Similar warning: Unknown target name: “…”

Check the target spelling and the tag syntax.

Example:

..  _box_space-index_func

See the :ref:`Creating a functional index <box_space-index_func>` section.

Solution:

A colon is missing at the end of the tag definition:

..  _box_space-index_func:

Anonymous hyperlink mismatch

Warning example: Anonymous hyperlink mismatch: 1 references but 0 targets.

Check the hyperlink formatting.

Example:

Read more in `Lua Manual <https://www.lua.org/manual/5.3`__.

Solution:

A closing angle bracket (>) is missing in the hyperlink:

Read more in `Lua Manual <https://www.lua.org/manual/5.3>`__.

Toctree contains reference to nonexisting document ‘…’

Example:

This may happen when you refer to a wrong path to a document.

Solution:

Check the path.

If the path points to a submodule, check that you’ve built the submodules content before building docs.

Undefined label: … (if the link has no caption the label must precede a section header)

Example:

Read more in :ref:`<sql_data_type_conversion>`.

Solution:

We recommend using custom captions with :ref::

Read more in :ref:`Data Type Conversion <sql_data_type_conversion>`.

See also:

Links and references

Unexpected indentation

The reStructuredText syntax is based on indentation, much like in Python. All lines in a block of content must be equally indented. An increase or decrease in indentation denotes the end of the current block and the beginning of a new one.

Example:

Note: In the following examples, dots stand for indentation spaces. For example, |..| denotes a two-space indentation.

|..|* (Engines) Improve dump start/stop logging. When initiating memory dump, print
how much memory is going to be dumped, the expected dump rate, ETA, and the recent
write rate.

Solution:

*|...|(Engines) Improve dump start/stop logging. When initiating memory dump, print
|....|how much memory is going to be dumped, the expected dump rate, ETA, and the recent
|....|write rate.

See also:

General syntax guidelines

Unknown document

Example:

:doc:`reference/reference_lua/box_space/update`

Solution:

Sphinx did not recognize the file path correctly due to a missing slash at the beginning, so let’s just put it there:

:doc:`/reference/reference_lua/box_space/update`

Documentation infrastructure

This section of the documentation guidelines discusses some of the support activities that ensure the correct building of documentation.

Adding submodules

The documentation source files are mainly stored in the documentation repository. However, in some cases, they are stored in the repositories of other Tarantool-related products or modules, such as Monitoring.

If you are working with source files from a product or module repository, add that repository as a submodule to the documentation repository and configure other necessary settings. This will ensure that the entire body of Tarantool documentation, presented on the official website, is built properly.

Here is how to do that:

1. Add a submodule
2. Update build_submodules.sh
3. Update .gitignore

1. Add a submodule

First, we need to add the repository with content source files as a submodule.

Make sure you are in the root directory of the documentation repository.

In the ./modules directory, add the new submodule:

cd modules
git submodule add https://<path_to_submodule_repository>
cd ..

Check that the new submodule is in the .gitmodules file, for example:

[submodule "modules/metrics"]
   path = modules/metrics
   url = https://github.com/tarantool/metrics.git

2. Update build_submodules.sh

Now define what directories and files are to be copied from the submodule repository to the documentation repository before building documentation. These settings are defined in the build_submodules.sh file in the root directory of the documentation repository.

Here are some real submodule examples that show the logic of the settings.

metrics

The content source files for the metrics submodule are in the ./doc/monitoring directory of the submodule repository. In the final documentation view, the content should appear in the Monitoring chapter (https://www.tarantool.io/en/doc/latest/book/monitoring/).

To make this work:

Create a directory at ./doc/book/monitoring/.
Copy the entire content of the ./modules/metrics/doc/monitoring/ directory to ./doc/book/monitoring/.

Here are the corresponding lines in build_submodules.sh:

monitoring_root="${project_root}/modules/metrics/doc/monitoring"
monitoring_dest="${project_root}/doc/book"

mkdir -p "${monitoring_dest}"
yes | cp -rf "${monitoring_root}" "${monitoring_dest}/"

The ${project_root} variable is defined earlier in the file as project_root=$(pwd). This is because the documentation build has to start from the documentation repository root directory.

3. Update .gitignore

Finally, add paths to the copied directories and files to .gitignore.

Git workflow

Branching

Use one branch for a single task, unless you’re fixing typos or markup on several pages. Long commit histories are hard to manage and sometimes end up stale.

Start a new branch from the last commit on latest. Make sure to update your local version of latest with git pull. Otherwise, you may have to rebase later.

Name your branch so it’s clear what you’re doing. Examples:

short-issue-description
gh-1234-short-issue-description
your-github-handle/short-issue-description

Important

It is not recommended to submit PRs to the documentation repository from forks. Because of a GitHub failsafe mechanism, it is impossible to view changes from a fork on the development website.

Creating branches directly in the repository results in a more convenient workflow.

Linking issues and PRs

When a PR is linked to an issue:

You can go from the issue straight to the PR by clicking the link in the right column.
The issue will be automatically closed when the PR is merged.

Specify the issue(s) you want to close in the description of your PR. GitHub will connect them if you use specific keywords. Here are some of them:

Closes #1234
Resolves #1234
Fixes #1234

If your PR closes more than one issue, mention each of them:

Resolves #1300, resolves #1234, resolves tarantool/doc#100

Commit messages

Most of the time, one-line commit messages are sufficient for documentation changes.
When you squash commits at merge, the resulting commit message is a sum of all commit messages in the PR. It is advised to include the “resolves” string in the first commit. Otherwise, there’s a risk that this line won’t be included in the merge commit.
Convey the nature of the change and possibly the reason why it was made.
Don’t specify the files you’ve changed or the issue you’re working on. The file names can be looked up in the “Files” section of the PR, and the PR description has the issue number(s).
Try keeping the commit title 50 characters or shorter.
Use the imperative mood.
Start with a capital letter, don’t add ending punctuation.
(Optional) Use the telegraphic style, or “headlinese”, dropping the articles.

Good examples

git commit -m "Expand section on msgpack"
git commit -m "Add details on IPROTO_BALLOT"
git commit -m "Create new structure"
git commit -m "Improve grammar"

Bad examples

git commit -m "Fix gh-2007, second commit"
git commit -m "Changed the file box_protocol.rst"
git commit -m "added more list items"

Selecting a reviewer

Ideally, a PR should have two reviewers: a subject matter expert (SME) and a documentarian. The SME checks the facts, and the documentarian checks the language and style.

If you’re not sure who the SME for an issue is, try the following:

Check the issue description. The SME is often mentioned there explicitly.
Note who created the issue and who was involved in the discussion.

Merging

Merge when your document is ready and good enough. For external contributors, merging is blocked until a reviewer’s approval.

Always squash commits.
Make sure the commit message mentions all relevant issues with “resolves” or “fixes”.
Make sure you’ve attributed all participants with Co-authored-by.

C Style Guide

We use Git for revision control. The latest development is happening in the default branch (currently master). Our git repository is hosted on GitHub, and can be checked out with git clone git://github.com/tarantool/tarantool.git (anonymous read-only access).

If you have any questions about Tarantool internals, please post them on Stack Overflow or ask Tarantool developers directly in Telegram.

General guidelines

The project’s coding style is inspired by the Linux kernel coding style.

However, we have some additional guidelines, either unique to Tarantool or deviating from the kernel guidelines. Below we rewrite the Linux kernel coding style according to Tarantool’s style features.

Tarantool coding style

This is a short document describing the preferred coding style for the Tarantool developers and contributors. We insist on following these rules in order to make our code consistent and understandable to any developer.

Chapter 1: Indentation

Tabs are 8 characters (8-width tabs, not 8 whitespaces), and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3.

Rationale: The whole idea behind indentation is to clearly define where a block of control starts and ends. Especially when you’ve been looking at your screen for 20 straight hours, you’ll find it a lot easier to see how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on an 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.

8-char indents make things easier to read and have the added benefit of warning you when you’re nesting your functions too deep. Heed that warning.

The preferred way to ease multiple indentation levels in a switch statement is to align the switch and its subordinate case labels in the same column instead of double-indenting the case labels. E.g.:

switch (suffix) {
case 'G':
case 'g':
  mem <<= 30;
  break;
case 'M':
case 'm':
  mem <<= 20;
  break;
case 'K':
case 'k':
  mem <<= 10;
  /* fall through */
default:
  break;
}

Don’t put multiple statements on a single line unless you have something to hide:

if (condition) do_this;
  do_something_everytime;

Don’t put multiple assignments on a single line either. Avoid tricky expressions.

Outside of comments and documentation, spaces are never used for indentation, and the above example is deliberately broken.

Get a decent editor and don’t leave whitespace at the end of lines.

Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly available tools.

The limit on the length of lines is 80 columns and this is a strongly preferred limit. As for comments, the same limit of 80 columns is applied.

Statements longer than 80 columns will be broken into sensible chunks, unless exceeding 80 columns significantly increases readability and does not hide information. Descendants are always substantially shorter than the parent and are placed substantially to the right. The same applies to function headers with a long argument list.
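
The rule above can be sketched as follows. This is only an illustration: the function names are hypothetical and do not come from the Tarantool code base. The header and the first call do not fit in 80 columns (with 8-width tabs), so each one continues on the next line, and the continuation is shorter than the parent line and shifted well to the right.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the header is longer than 80 columns on one
 * line, so the argument list continues on the next line, placed
 * substantially to the right.
 */
static uint64_t
checksum_update_range(uint64_t checksum, const unsigned char *data,
		      size_t offset, size_t length)
{
	for (size_t i = 0; i < length; i++)
		checksum = checksum * 31 + data[offset + i];
	return checksum;
}

/* A long call is broken the same way as a long header. */
static uint64_t
checksum_of_two_ranges(const unsigned char *data, size_t first_offset,
		       size_t second_offset, size_t length)
{
	uint64_t checksum = checksum_update_range(0, data, first_offset,
						  length);
	/* This call fits in 80 columns, so it stays on one line. */
	return checksum_update_range(checksum, data, second_offset, length);
}
```

The last call is left unbroken on purpose: it fits within the limit, so splitting it would only cost a line.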

Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over the other, but the preferred way, as shown to us by the prophets Kernighan and Ritchie, is to put the opening brace last on the line, and put the closing brace first, thus:

if (x is true) {
  we do y
}

This applies to all non-function statement blocks (if, switch, for, while, do). E.g.:

switch (action) {
case KOBJ_ADD:
  return "add";
case KOBJ_REMOVE:
  return "remove";
case KOBJ_CHANGE:
  return "change";
default:
  return NULL;
}

However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, thus:

int
function(int x)
{
  body of function
}

Heretic people all over the world have claimed that this inconsistency is … well … inconsistent, but all right-thinking people know that (a) K&R are right and (b) K&R are right. Besides, functions are special anyway (you can’t nest them in C).

Note that the closing brace is empty on a line of its own, except in the cases where it is followed by a continuation of the same statement, i.e. a while in a do-statement or an else in an if-statement, like this:

do {
  body of do-loop
} while (condition);

and

if (x == y) {
  ..
} else if (x > y) {
  ...
} else {
  ....
}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty (or almost empty) lines, without any loss of readability. Thus, as the supply of new-lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on.

Do not unnecessarily use braces where a single statement will do.

if (condition)
  action();

and

if (condition)
  do_this();
else
  do_that();

This does not apply if only one branch of a conditional statement is a single statement; in the latter case use braces in both branches:

if (condition) {
  do_this();
  do_that();
} else {
  otherwise();
}

Chapter 3.1: Spaces

Tarantool style for use of spaces depends (mostly) on function-versus-keyword usage. Use a space after (most) keywords. The notable exceptions are sizeof, typeof, alignof, and __attribute__, which look somewhat like functions (and are usually used with parentheses, although they are not required in the language, as in: sizeof info after struct fileinfo info; is declared).

So use a space after these keywords:

if, switch, case, for, do, while

but not with sizeof, typeof, alignof, or __attribute__. E.g.,

s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions. This example is bad:

s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the preferred use of * is adjacent to the data name or function name and not adjacent to the type name. Examples:

char *linux_banner;
unsigned long long memparse(char *ptr, char **retptr);
char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators, such as any of these:

=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators:

&  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment & decrement unary operators:

++  --

no space after the prefix increment & decrement unary operators:

++  --

and no space around the . and -> structure member operators.

Do not split a cast operator from its argument with a whitespace, e.g. (ssize_t)inj->iparam.

Do not leave trailing whitespace at the ends of lines. Some editors with smart indentation will insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. However, some such editors do not remove the whitespace if you end up not putting a line of code there, such as if you leave a blank line. As a result, you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can optionally strip the trailing whitespace for you; however, if applying a series of patches, this may make later patches in the series fail by changing their context lines.
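
As a rough summary, the spacing rules of this chapter could look like this when applied together. The structure and function names are hypothetical and serve only as an illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure, used only to illustrate the spacing rules. */
struct file_info {
	size_t size;
	char *name;
};

/* The '*' is adjacent to the function name, not to the type name. */
static char *
file_info_dup_name(const struct file_info *info)
{
	/* One space around binary operators such as '=' and '+'. */
	size_t len = strlen(info->name) + 1;
	/* No space after sizeof or between a cast and its argument. */
	char *copy = (char *)malloc(len * sizeof(char));
	/* A space after the 'if' keyword. */
	if (copy == NULL)
		return NULL;
	memcpy(copy, info->name, len);
	return copy;
}

static size_t
file_info_total_size(const struct file_info *infos, size_t count)
{
	size_t total = 0;
	/* No space before the postfix '++'. */
	for (size_t i = 0; i < count; i++) {
		/* No space after unary '&' or around '->'. */
		const struct file_info *info = &infos[i];
		total += info->size;
	}
	return total;
}
```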

Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2 and Pascal programmers, C programmers do not use cute names like ThisVariableIsATemporaryCounter. A C programmer would call that variable tmp, which is much easier to write, and not the least more difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for global variables are a must. To call a global function foo is a shooting offense.

GLOBAL variables (to be used only if you really need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call that count_active_users() or similar, you should not call it cntusr().

Encoding the type of a function into the name (so-called Hungarian notation) is brain damaged - the compiler knows the types anyway and can check those, and it only confuses the programmer. No wonder MicroSoft makes buggy programs.

LOCAL variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called i. Calling it loop_counter is non-productive, if there is no chance of it being misunderstood. Similarly, tmp can be just about any type of variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another problem, which is called the function-growth-hormone-imbalance syndrome. See chapter 6 (Functions).
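
A small sketch of these naming rules, assuming a hypothetical struct user with an is_active field: the global function gets the descriptive name suggested above, while the loop counter inside it stays a plain i.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structure, assumed only for this example. */
struct user {
	bool is_active;
};

/* A global function gets a descriptive name, not cntusr(). */
size_t
count_active_users(const struct user *users, size_t user_count)
{
	size_t count = 0;
	/* A local loop counter is simply called 'i'. */
	for (size_t i = 0; i < user_count; i++) {
		if (users[i].is_active)
			count++;
	}
	return count;
}
```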

For function naming, our convention is to use:

new/delete for functions which allocate + initialize and destroy + deallocate an object,
create/destroy for functions which initialize/destroy an object but do not handle memory management,
init/free for functions which initialize/destroy libraries and subsystems.
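
The first two pairs can be sketched like this. The range_stat object is hypothetical and exists only to show the naming. The init/free pair for libraries and subsystems is not shown.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object, used only to illustrate the convention. */
struct range_stat {
	int64_t min;
	int64_t max;
	size_t count;
};

/* create: initialize an object whose memory the caller owns. */
static void
range_stat_create(struct range_stat *stat)
{
	stat->min = INT64_MAX;
	stat->max = INT64_MIN;
	stat->count = 0;
}

/* destroy: the counterpart of create, does not free the object. */
static void
range_stat_destroy(struct range_stat *stat)
{
	stat->count = 0;
}

/* new: allocate + initialize. */
static struct range_stat *
range_stat_new(void)
{
	struct range_stat *stat = malloc(sizeof(*stat));
	if (stat == NULL)
		return NULL;
	range_stat_create(stat);
	return stat;
}

/* delete: destroy + deallocate. */
static void
range_stat_delete(struct range_stat *stat)
{
	range_stat_destroy(stat);
	free(stat);
}
```

With this split, an object that lives on the stack or inside another structure uses create/destroy, and only heap-allocated objects go through new/delete.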

Chapter 5: Typedefs

Please don’t use things like vps_t. It’s a mistake to use typedef for structures and pointers. When you see a

vps_t a;

in the source, what does it mean? In contrast, if it says

struct virtual_container *a;

you can actually tell what a is.

Lots of people think that typedefs help readability. Not so. They are useful only for:

1. Totally opaque objects (where the typedef is actively used to hide what the object is).

   Example: pte_t etc. opaque objects that you can only access using the proper accessor functions.

   Note

   Opaqueness and accessor functions are not good in themselves. The reason we have them for things like pte_t etc. is that there really is absolutely zero portably accessible information there.

2. Clear integer types, where the abstraction helps avoid confusion whether it is int or long.

   u8/u16/u32 are perfectly fine typedefs, although they fit into point 4 better than here.

   Note

   Again - there needs to be a reason for this. If something is unsigned long, then there’s no reason to do typedef unsigned long myflags_t;

   but if there is a clear reason for why it under certain circumstances might be an unsigned int and under other configurations might be unsigned long, then by all means go ahead and use a typedef.

3. When you use sparse to literally create a new type for type-checking.

4. New types which are identical to standard C99 types, in certain exceptional circumstances.

   Although it would only take a short amount of time for the eyes and brain to become accustomed to the standard types like uint32_t, some people object to their use anyway.

   When editing existing code which already uses one or the other set of types, you should conform to the existing choices in that code.

Maybe there are other cases too, but the rule should basically be to NEVER EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably be directly accessed should never be a typedef.

Chapter 6: Functions

Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case-statement, where you have to do lots of small things for a lot of different cases, it’s OK to have a longer function.

However, if you have a complex function, and you suspect that a less-than-gifted first-year high-school student might not even understand what the function is all about, you should adhere to the maximum limits all the more closely. Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it’s performance-critical, and it will probably do a better job of it than you would have done).

Another measure of the function is the number of local variables. They shouldn’t exceed 5-10, or you’re doing something wrong. Re-think the function, and split it into smaller pieces. A human brain can generally easily keep track of about 7 different things, anything more and it gets confused. You know you’re brilliant, but maybe you’d like to understand what you did 2 weeks from now.

In function prototypes, include parameter names with their data types. Although this is not required by the C language, it is preferred in Tarantool because it is a simple way to add valuable information for the reader.

Note that we place the function return type on the line before the name and signature.

Chapter 7: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is used frequently by compilers in form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly.

Choose label names which say what the goto does or why the goto exists. An example of a good name could be out_free_buffer: if the goto frees buffer. Avoid using GW-BASIC names like err1: and err2:, as you would have to renumber them if you ever add or remove exit paths, and they make correctness difficult to verify anyway.

The rationale for using gotos is:

unconditional statements are easier to understand and follow
nesting is reduced
errors by not updating individual exit points when making modifications are prevented
saves the compiler work to optimize redundant code away ;)

int
fun(int a)
{
  int result = 0;
  char *buffer;

  buffer = kmalloc(SIZE, GFP_KERNEL);
  if (!buffer)
    return -ENOMEM;

  if (condition1) {
    while (loop1) {
      ...
    }
    result = 1;
    goto out_free_buffer;
  }
  ...
out_free_buffer:
  kfree(buffer);
  return result;
}

A common type of bug to be aware of is the "one err" bug, which looks like this:

err:
  kfree(foo->bar);
  kfree(foo);
  return ret;

The bug in this code is that on some exit paths foo is NULL. Normally the fix for this is to split it up into two error labels, err_free_bar: and err_free_foo:, like this:

err_free_bar:
 kfree(foo->bar);
err_free_foo:
 kfree(foo);
 return ret;

Ideally you should simulate errors to test all exit paths.

Chapter 8: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment: it’s much better to write the code so that the working is obvious, and it’s a waste of time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, you should probably go back to chapter 6 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it.

When commenting the Tarantool C API functions, please use Doxygen comment format, Javadoc flavor, i.e. @tag rather than \tag. The main tags in use are @param, @retval, @return, @see, @note and @todo.

Every function, except perhaps a very short and obvious one, should have a comment. A sample function comment may look like this:

/**
 * Write all data to a descriptor.
 *
 * This function is equivalent to 'write', except it would ensure
 * that all data is written to the file unless a non-ignorable
 * error occurs.
 *
 * @retval 0  Success
 * @retval 1 An error occurred (not EINTR)
 */
static int
write_all(int fd, void *data, size_t len);

It’s also important to comment data types, whether they are basic types or derived ones. To this end, use just one data declaration per line (no commas for multiple data declarations). This leaves you room for a small comment on each item, explaining its use.

Public structures and important structure members should be commented as well.

In C, comments outside functions and comments inside functions should start differently; anything else is wrong. Correct examples are shown below. /** starts documentation comments, and /* starts local, non-documentation comments. Since that distinction is vague in practice, the rule is simple: outside a function use /**, inside a function use /*.

/**
 * Out of function comment, option 1.
 */

/** Out of function comment, option 2. */

int
function()
{
    /* Comment inside function, option 1. */

    /*
     * Comment inside function, option 2.
     */
}

If a function's declaration and implementation are separate, the function comment belongs with the declaration, which is usually in the header file. Don’t duplicate the comment.

A comment and the function signature should be kept in sync. Double-check that the parameter names match those used in the comment and mean the same thing. In particular, when you change one of them, make sure you change the other.

Chapter 9: Macros, Enums and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c)       \
  do {                          \
    if (a == 5)                 \
      do_this(b, c);            \
  } while (0)

Things to avoid when using macros:

macros that affect control flow:
```
#define FOO(x)                  \
  do {                          \
    if (blah(x) < 0)            \
      return -EBUGGERED;        \
  } while (0)
```
is a very bad idea. It looks like a function call but exits the calling function; don’t break the internal parsers of those who will read the code.
macros that depend on having a local variable with a magic name:
```
#define FOO(val) bar(index, val)
```
might look like a good thing, but it’s confusing as hell when one reads the code and it’s prone to breakage from seemingly innocent changes.
macros with arguments that are used as l-values: FOO(x) = y; will bite you if somebody e.g. turns FOO into an inline function.
forgetting about precedence: macros defining constants using expressions must enclose the expression in parentheses. Beware of similar issues with macros using parameters.
```
#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
```
namespace collisions when defining local variables in macros resembling functions:
```
#define FOO(x)            \
({                        \
  typeof(x) ret;          \
  ret = calc_ret(x);      \
  (ret);                  \
})
```
ret is a common name for a local variable - __foo_ret is less likely to collide with an existing variable.

Chapter 10: Allocating memory

Prefer specialized allocators like region, mempool, smalloc to malloc()/free() for any performance-intensive or large memory allocations. Repetitive use of malloc()/free() can lead to memory fragmentation and should therefore be avoided.

Always free all allocated memory, even allocated at start-up. We aim at being valgrind leak-check clean, and in most cases it’s just as easy to free() the allocated memory as it is to write a valgrind suppression. Freeing all allocated memory is also dynamic-load friendly: assuming a plug-in can be dynamically loaded and unloaded multiple times, reload should not lead to a memory leak.

Chapter 11: The inline disease

There appears to be a common misperception that gcc has a magic “make me faster” speedup option called inline. While the use of inlines can be appropriate, it very often is not. Abundant use of the inline keyword leads to a much bigger kernel, which in turn slows the system as a whole down, due to a bigger icache footprint for the CPU and simply because there is less memory available for the pagecache. Just think about it; a pagecache miss causes a disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more than 3 lines of code in them. An exception to this rule are the cases where a parameter is known to be a compiletime constant, and as a result of this constantness you know the compiler will be able to optimize most of your function away at compile time.

Often people argue that adding inline to functions that are static and used only once is always a win since there is no space tradeoff. While this is technically correct, gcc is capable of inlining these automatically without help, and the maintenance issue of removing the inline when a second user appears outweighs the potential value of the hint that tells gcc to do something it would have done anyway.

Chapter 12: Function return values and names

Functions can return values of many different kinds, and one of the most common is a value indicating whether the function succeeded or failed.

In 99.99999% of all cases in Tarantool we return 0 on success, non-zero on error (-1 usually). Errors are saved into a diagnostics area which is global per fiber. We never return error codes as a result of a function.

Functions whose return value is the actual result of a computation, rather than an indication of whether the computation succeeded, are not subject to this rule. Generally they indicate failure by returning some out-of-range result. Typical examples would be functions that return pointers; they use NULL to report failure.

Chapter 13: Editor modelines and other cruft

Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:

-*- mode: c -*-

Or like this:

/*
Local Variables:
compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
End:
*/

Vim interprets markers that look like this:

/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other magic method for making indentation work correctly.

Chapter 14: Conditional Compilation

Wherever possible, don’t use preprocessor conditionals (#if, #ifdef) in .c files; doing so makes code harder to read and logic harder to follow. Instead, use such conditionals in a header file defining functions for use in those .c files, providing no-op stub versions in the #else case, and then call those functions unconditionally from .c files. The compiler will avoid generating any code for the stub calls, producing identical results, but the logic will remain easy to follow.

Prefer to compile out entire functions, rather than portions of functions or portions of expressions. Rather than putting an #ifdef in an expression, factor out part or all of the expression into a separate helper function and apply the condition to that function.

If you have a function or variable which may potentially go unused in a particular configuration, and the compiler would warn about its definition going unused, do not compile it in that configuration; use #if for this.

At the end of any non-trivial #if or #ifdef block (more than a few lines), place a comment after the #endif on the same line, noting the conditional expression used. For instance:

#ifdef CONFIG_SOMETHING
...
#endif /* CONFIG_SOMETHING */

Chapter 15: Header files

Use #pragma once in headers. By header guards we mean this construction:

#ifndef THE_HEADER_IS_INCLUDED
#define THE_HEADER_IS_INCLUDED

// ... the header code ...

#endif // THE_HEADER_IS_INCLUDED

It works fine, but the guard name THE_HEADER_IS_INCLUDED tends to become outdated when the file is moved or renamed. This is especially painful when multiple files in the project have the same name but different paths. For instance, we have 3 error.h files, which means for each of them we need to invent a new header guard name, and not forget to update them if the files are moved or renamed.

For that reason we use #pragma once in all the new code, which shortens the header file down to this:

#pragma once

// ... header code ...

Chapter 16: Other

We don’t apply the ! operator to non-boolean values. This means that to check if an integer is not 0, you use != 0, and to check if a pointer is not NULL, you use != NULL. The same goes for ==.
Select GNU C99 extensions are acceptable. It’s OK to mix declarations and statements, use true and false.
The not-so-current list of all GCC C extensions can be found at: http://gcc.gnu.org/onlinedocs/gcc-4.3.5/gcc/C-Extensions.html

Appendix I: References

The C Programming Language, Second Edition by Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
The Practice of Programming by Brian W. Kernighan and Rob Pike. Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X.
GNU manuals - where in compliance with K&R and this text - for cpp, gcc, gcc internals and indent
WG14 International standardization workgroup for the programming language C
Kernel CodingStyle, by greg@kroah.com at OLS 2002

Python Style Guide

Introduction

This document gives coding conventions for the Python code comprising the standard library in the main Python distribution. Please see the companion informational PEP describing style guidelines for the C code in the C implementation of Python [1].

This document and PEP 257 (Docstring Conventions) were adapted from Guido’s original Python Style Guide essay, with some additions from Barry’s style guide [2].

A Foolish Consistency is the Hobgoblin of Little Minds

One of Guido’s key insights is that code is read much more often than it is written. The guidelines provided here are intended to improve the readability of code and make it consistent across the wide spectrum of Python code. As PEP 20 says, “Readability counts”.

A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.

But most importantly: know when to be inconsistent – sometimes the style guide just doesn’t apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don’t hesitate to ask!

Two good reasons to break a particular rule:

When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
To be consistent with surrounding code that also breaks it (maybe for historic reasons) – although this is also an opportunity to clean up someone else’s mess (in true XP style).

Code lay-out

Indentation

Use 4 spaces per indentation level.

For really old code that you don’t want to mess up, you can continue to use 8-space tabs.

Continuation lines should align wrapped elements either vertically using Python’s implicit line joining inside parentheses, brackets and braces, or using a hanging indent. When using a hanging indent, the following considerations apply: there should be no arguments on the first line, and further indentation should be used to clearly distinguish the continuation line from the rest.

Yes:

# Aligned with opening delimiter
foo = long_function_name(var_one, var_two,
                         var_three, var_four)

# More indentation included to distinguish this from the rest.
def long_function_name(
        var_one, var_two, var_three,
        var_four):
    print(var_one)

No:

# Arguments on first line forbidden when not using vertical alignment
foo = long_function_name(var_one, var_two,
    var_three, var_four)

# Further indentation required as indentation is not distinguishable
def long_function_name(
    var_one, var_two, var_three,
    var_four):
    print(var_one)

Optional:

# Extra indentation is not necessary.
foo = long_function_name(
  var_one, var_two,
  var_three, var_four)

The closing brace/bracket/parenthesis on multi-line constructs may either line up under the first non-whitespace character of the last line of list, as in:

my_list = [
    1, 2, 3,
    4, 5, 6,
    ]
result = some_function_that_takes_arguments(
    'a', 'b', 'c',
    'd', 'e', 'f',
    )

or it may be lined up under the first character of the line that starts the multi-line construct, as in:

my_list = [
    1, 2, 3,
    4, 5, 6,
]
result = some_function_that_takes_arguments(
    'a', 'b', 'c',
    'd', 'e', 'f',
)

Tabs or Spaces?

Never mix tabs and spaces.

The most popular way of indenting Python is with spaces only. The second-most popular way is with tabs only. Code indented with a mixture of tabs and spaces should be converted to using spaces exclusively. When invoking the Python command line interpreter with the -t option, it issues warnings about code that illegally mixes tabs and spaces. When using -tt these warnings become errors. These options are highly recommended!

For new projects, spaces-only are strongly recommended over tabs. Most editors have features that make this easy to do.

Maximum Line Length

Limit all lines to a maximum of 79 characters.

There are still many devices around that are limited to 80 character lines; plus, limiting windows to 80 characters makes it possible to have several windows side-by-side. The default wrapping on such devices disrupts the visual structure of the code, making it more difficult to understand. Therefore, please limit all lines to a maximum of 79 characters. For flowing long blocks of text (docstrings or comments), limiting the length to 72 characters is recommended.

The preferred way of wrapping long lines is by using Python’s implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.
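
As a minimal sketch of implied line continuation, the function below wraps a long arithmetic expression in parentheses instead of using backslashes. The function name and its parameters are hypothetical and serve only to illustrate the layout.

```python
def total_price(unit_price, quantity, tax_rate, shipping):
    """Return the full price of an order (illustrative helper)."""
    # The parentheses let the expression span several lines without
    # backslashes; each break comes after a binary operator.
    total = (unit_price * quantity +
             unit_price * quantity * tax_rate +
             shipping)
    return total
```

Calling total_price(10, 2, 0.5, 5) evaluates the wrapped expression exactly as if it had been written on one line.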

Backslashes may still be appropriate at times. For example, long, multiple with-statements cannot use implicit continuation, so backslashes are acceptable:

with open('/path/to/some/file/you/want/to/read') as file_1, \
        open('/path/to/some/file/being/written', 'w') as file_2:
    file_2.write(file_1.read())

Another such case is with assert statements.

Make sure to indent the continued line appropriately. The preferred place to break around a binary operator is after the operator, not before it. Some examples:

class Rectangle(Blob):

    def __init__(self, width, height,
                 color='black', emphasis=None, highlight=0):
        if (width == 0 and height == 0 and
            color == 'red' and emphasis == 'strong' or
            highlight > 100):
            raise ValueError("sorry, you lose")
        if width == 0 and height == 0 and (color == 'red' or
                                           emphasis is None):
            raise ValueError("I don't think so -- values are %s, %s" %
                             (width, height))
        Blob.__init__(self, width, height,
                      color, emphasis, highlight)

Blank Lines

Separate top-level function and class definitions with two blank lines.

Method definitions inside a class are separated by a single blank line.

Extra blank lines may be used (sparingly) to separate groups of related functions. Blank lines may be omitted between a bunch of related one-liners (e.g. a set of dummy implementations).

Use blank lines in functions, sparingly, to indicate logical sections.
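
The blank-line rules above can be sketched as follows. All names here (circle_area, Shape, Circle) are hypothetical and exist only to show the spacing: two blank lines around top-level definitions, one between methods, and a sparing blank line inside a function body.

```python
import math


def circle_area(radius):
    """Return the area of a circle."""
    return math.pi * radius ** 2


class Shape(object):
    """Hypothetical base class; the names are illustrative only."""

    def area(self):
        return 0.0

    def describe(self):
        kind = type(self).__name__

        # A single blank line marks off the formatting step.
        return '%s with area %.2f' % (kind, self.area())


class Circle(Shape):
    """Circle that reuses the module-level helper."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return circle_area(self.radius)
```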

Python accepts the control-L (i.e. ^L) form feed character as whitespace; many tools treat these characters as page separators, so you may use them to separate pages of related sections of your file. Note, some editors and web-based code viewers may not recognize control-L as a form feed and will show another glyph in its place.

Encodings (PEP 263)

Code in the core Python distribution should always use the ASCII or Latin-1 encoding (a.k.a. ISO-8859-1). For Python 3.0 and beyond, UTF-8 is preferred over Latin-1, see PEP 3120.

Files using ASCII should not have a coding cookie. Latin-1 (or UTF-8) should only be used when a comment or docstring needs to mention an author name that requires Latin-1; otherwise, using \x, \u or \U escapes is the preferred way to include non-ASCII data in string literals.

For Python 3.0 and beyond, the following policy is prescribed for the standard library (see PEP 3131): All identifiers in the Python standard library MUST use ASCII-only identifiers, and SHOULD use English words wherever feasible (in many cases, abbreviations and technical terms are used which aren’t English). In addition, string literals and comments must also be in ASCII. The only exceptions are (a) test cases testing the non-ASCII features, and (b) names of authors. Authors whose names are not based on the latin alphabet MUST provide a latin transliteration of their names.

Open source projects with a global audience are encouraged to adopt a similar policy.

Imports

Imports should usually be on separate lines, e.g.:

Yes: import os
     import sys

No:  import sys, os

It’s okay to say this though:

from subprocess import Popen, PIPE

Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.

Imports should be grouped in the following order:
1. standard library imports
2. related third party imports
3. local application/library specific imports
You should put a blank line between each group of imports.

Put any relevant __all__ specification after the imports.
Relative imports for intra-package imports are highly discouraged. Always use the absolute package path for all imports. Even now that PEP 328 is fully implemented in Python 2.5, its style of explicit relative imports is actively discouraged; absolute imports are more portable and usually more readable.
When importing a class from a class-containing module, it’s usually okay to spell this:
```
from myclass import MyClass
from foo.bar.yourclass import YourClass
```
If this spelling causes local name clashes, then spell them
```
import myclass
import foo.bar.yourclass
```
and use “myclass.MyClass” and “foo.bar.yourclass.YourClass”.

Whitespace in Expressions and Statements

Pet Peeves

Avoid extraneous whitespace in the following situations:

Immediately inside parentheses, brackets or braces.

Yes: spam(ham[1], {eggs: 2})
No:  spam( ham[ 1 ], { eggs: 2 } )

Immediately before a comma, semicolon, or colon:

Yes: if x == 4: print x, y; x, y = y, x
No:  if x == 4 : print x , y ; x , y = y , x

Immediately before the open parenthesis that starts the argument list of a function call:
```
Yes: spam(1)
No:  spam (1)
```

Immediately before the open parenthesis that starts an indexing or slicing:

Yes: dict['key'] = list[index]
No:  dict ['key'] = list [index]

More than one space around an assignment (or other) operator to align it with another.

Yes:

x = 1
y = 2
long_variable = 3

No:

x             = 1
y             = 2
long_variable = 3

Other Recommendations

Always surround these binary operators with a single space on either side: assignment (=), augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), Booleans (and, or, not).
If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgement; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator.

Yes:
```
i = i + 1
submitted += 1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)
```
No:
```
i=i+1
submitted +=1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)
```

Don’t use spaces around the = sign when used to indicate a keyword argument or a default parameter value.

Yes:

def complex(real, imag=0.0):
    return magic(r=real, i=imag)

No:

def complex(real, imag = 0.0):
    return magic(r = real, i = imag)

Compound statements (multiple statements on the same line) are generally discouraged.

Yes:

if foo == 'blah':
    do_blah_thing()
do_one()
do_two()
do_three()

Rather not:

if foo == 'blah': do_blah_thing()
do_one(); do_two(); do_three()

While sometimes it’s okay to put an if/for/while with a small body on the same line, never do this for multi-clause statements. Also avoid folding such long lines!

Rather not:

if foo == 'blah': do_blah_thing()
for x in lst: total += x
while t < 10: t = delay()

Definitely not:

if foo == 'blah': do_blah_thing()
else: do_non_blah_thing()

try: something()
finally: cleanup()

do_one(); do_two(); do_three(long, argument,
                             list, like, this)

if foo == 'blah': one(); two(); three()

Comments

Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!

Comments should be complete sentences. If a comment is a phrase or sentence, its first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!).

If a comment is short, the period at the end can be omitted. Block comments generally consist of one or more paragraphs built out of complete sentences, and each sentence should end in a period.

You should use two spaces after a sentence-ending period.

When writing English, Strunk and White apply.

Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don’t speak your language.

Block Comments

Block comments generally apply to some (or all) code that follows them, and are indented to the same level as that code. Each line of a block comment starts with a # and a single space (unless it is indented text inside the comment).

Paragraphs inside a block comment are separated by a line containing a single #.
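
As a short illustration of these two rules, the hypothetical function below carries a two-paragraph block comment indented to the level of the code it describes.

```python
def clamp(value, low, high):
    """Return value limited to the range [low, high]."""
    # Block comments sit at the same indentation as the code they
    # describe, and every line starts with a # and a single space.
    #
    # A line holding only a # separates paragraphs, as above.
    if value < low:
        return low
    if value > high:
        return high
    return value
```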

Inline Comments

Use inline comments sparingly.

An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.

Inline comments are unnecessary and in fact distracting if they state the obvious. Don’t do this:

x = x + 1                 # Increment x

But sometimes, this is useful:

x = x + 1                 # Compensate for border

Documentation Strings

Conventions for writing good documentation strings (a.k.a. “docstrings”) are immortalized in PEP 257.

Write docstrings for all public modules, functions, classes, and methods. Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the def line.
PEP 257 describes good docstring conventions. Note that most importantly, the """ that ends a multiline docstring should be on a line by itself, and preferably preceded by a blank line, e.g.:
```
"""Return a foobang

Optional plotz says to frobnicate the bizbaz first.

"""
```
For one liner docstrings, it’s okay to keep the closing """ on the same line.
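
The sketch below puts these docstring rules side by side, assuming a hypothetical Frobnicator class: a one-line docstring, a multiline docstring whose closing quotes sit on their own line after a blank line, and a non-public method that carries a plain comment after the def line instead of a docstring.

```python
class Frobnicator(object):
    """Hypothetical class used only to illustrate docstring layout."""

    def foobang(self, plotz=False):
        """Return a foobang.

        Optional plotz says to frobnicate the bizbaz first.

        """
        return self._build(plotz)

    def _build(self, plotz):
        # Non-public helper: a plain comment after the def line
        # stands in for a docstring.
        return ('bizbaz', 'foobang') if plotz else ('foobang',)
```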

Version Bookkeeping

If you have to have Subversion, CVS, or RCS crud in your source file, do it as follows.

__version__ = "$Revision$"
# $Source$

These lines should be included after the module’s docstring, before any other code, separated by a blank line above and below.

Naming Conventions

The naming conventions of Python’s library are a bit of a mess, so we’ll never get this completely consistent – nevertheless, here are the currently recommended naming standards. New modules and packages (including third party frameworks) should be written to these standards, but where an existing library has a different style, internal consistency is preferred.

Descriptive: Naming Styles

There are a lot of different naming styles. It helps to be able to recognize what naming style is being used, independently from what they are used for.

The following naming styles are commonly distinguished:

b (single lowercase letter)
B (single uppercase letter)
lowercase
lower_case_with_underscores
UPPERCASE
UPPER_CASE_WITH_UNDERSCORES
CapitalizedWords (or CapWords, or CamelCase – so named because of the bumpy look of its letters [3]). This is also sometimes known as StudlyCaps.

Note: When using abbreviations in CapWords, capitalize all the letters of the abbreviation. Thus HTTPServerError is better than HttpServerError.
mixedCase (differs from CapitalizedWords by initial lowercase character!)
Capitalized_Words_With_Underscores (ugly!)

There’s also the style of using a short unique prefix to group related names together. This is not used much in Python, but it is mentioned for completeness. For example, the os.stat() function returns a tuple whose items traditionally have names like st_mode, st_size, st_mtime and so on. (This is done to emphasize the correspondence with the fields of the POSIX system call struct, which helps programmers familiar with that.)

The X11 library uses a leading X for all its public functions. In Python, this style is generally deemed unnecessary because attribute and method names are prefixed with an object, and function names are prefixed with a module name.

In addition, the following special forms using leading or trailing underscores are recognized (these can generally be combined with any case convention):

_single_leading_underscore: weak “internal use” indicator. E.g. from M import * does not import objects whose name starts with an underscore.
single_trailing_underscore_: used by convention to avoid conflicts with Python keyword, e.g.
```
Tkinter.Toplevel(master, class_='ClassName')
```
__double_leading_underscore: when naming a class attribute, invokes name mangling (inside class FooBar, __boo becomes _FooBar__boo; see below).
__double_leading_and_trailing_underscore__: “magic” objects or attributes that live in user-controlled namespaces. E.g. __init__, __import__ or __file__. Never invent such names; only use them as documented.

Prescriptive: Naming Conventions

Names to Avoid

Never use the characters ‘l’ (lowercase letter el), ‘O’ (uppercase letter oh), or ‘I’ (uppercase letter eye) as single character variable names.

In some fonts, these characters are indistinguishable from the numerals one and zero. When tempted to use ‘l’, use ‘L’ instead.

Package and Module Names

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

Since module names are mapped to file names, and some file systems are case insensitive and truncate long names, it is important that module names be chosen to be fairly short – this won’t be a problem on Unix, but it may be a problem when the code is transported to older Mac or Windows versions, or DOS.

When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).

Class Names

Almost without exception, class names use the CapWords convention. Classes for internal use have a leading underscore in addition.

Exception Names

Because exceptions should be classes, the class naming convention applies here. However, you should use the suffix “Error” on your exception names (if the exception actually is an error).
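As a minimal sketch of this rule, the hypothetical `ConfigError` class and `load_port` helper below (neither comes from any real library) show an exception that is a CapWords class with the "Error" suffix:

```python
class ConfigError(Exception):
    """Hypothetical exception: CapWords name plus the 'Error' suffix."""


def load_port(settings):
    # Hypothetical helper: turn low-level failures into one named error.
    try:
        return int(settings["port"])
    except (KeyError, ValueError) as exc:
        raise ConfigError("invalid or missing port") from exc


try:
    load_port({"port": "abc"})
    caught = None
except ConfigError as exc:
    caught = str(exc)

print(caught)
```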

Global Variable Names

(Let’s hope that these variables are meant for use inside one module only.) The conventions are about the same as those for functions.

Modules that are designed for use via from M import * should use the __all__ mechanism to prevent exporting globals, or use the older convention of prefixing such globals with an underscore (which you might want to do to indicate these globals are “module non-public”).
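The effect of `__all__` can be checked with a small experiment. The sketch below builds a throwaway in-memory module (the name `demo_mod` and everything in it are assumptions made for illustration) and star-imports it into an empty namespace:

```python
import sys
import types

# Build a throwaway module in memory; "demo_mod" and its contents are hypothetical.
demo_mod = types.ModuleType("demo_mod")
exec(
    "__all__ = ['public_api']\n"
    "def public_api():\n"
    "    return 'ok'\n"
    "helper_global = 1      # not listed in __all__, so not exported\n"
    "_internal_global = 2   # underscore prefix marks it module non-public\n",
    demo_mod.__dict__,
)
sys.modules["demo_mod"] = demo_mod

# Star-import into a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only the name listed in `__all__` should arrive in the importing namespace; the two globals stay behind.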

Function Names

Function names should be lowercase, with words separated by underscores as necessary to improve readability.

mixedCase is allowed only in contexts where that’s already the prevailing style (e.g. threading.py), to retain backwards compatibility.

Function and method arguments

Always use self for the first argument to instance methods.

Always use cls for the first argument to class methods.

If a function argument’s name clashes with a reserved keyword, it is generally better to append a single trailing underscore rather than use an abbreviation or spelling corruption. Thus class_ is better than clss. (Perhaps better is to avoid such clashes by using a synonym.)
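The three argument rules above can be seen together in one small sketch. `Widget` and `FancyWidget` are hypothetical classes invented for this illustration:

```python
class Widget:
    """Hypothetical class used only to illustrate argument naming."""

    def __init__(self, name, class_="default"):
        # "class" is a reserved keyword, so the argument is spelled class_.
        self.name = name
        self.class_ = class_

    @classmethod
    def from_pair(cls, pair):
        # cls is the class the method was called on, so subclasses work too.
        name, class_ = pair
        return cls(name, class_=class_)

    def describe(self):
        return "{0} ({1})".format(self.name, self.class_)


class FancyWidget(Widget):
    pass


widget = FancyWidget.from_pair(("button", "primary"))
print(type(widget).__name__, widget.describe())
```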

Method Names and Instance Variables

Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.

Use one leading underscore only for non-public methods and instance variables.

To avoid name clashes with subclasses, use two leading underscores to invoke Python’s name mangling rules.

Python mangles these names with the class name: if class Foo has an attribute named __a, it cannot be accessed by Foo.__a. (An insistent user could still gain access by calling Foo._Foo__a.) Generally, double leading underscores should be used only to avoid name conflicts with attributes in classes designed to be subclassed.

Note: there is some controversy about the use of __names (see below).
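A short sketch of name mangling, reusing the class name Foo and the attribute __a from the text (the subclass `Bar` and the method `read_a` are added here only for illustration):

```python
class Foo:
    __a = 1  # stored on the class as _Foo__a

    def read_a(self):
        # Inside the class body, __a is rewritten to _Foo__a automatically.
        return self.__a


class Bar(Foo):
    __a = 2  # stored as _Bar__a, so it does not clash with Foo's attribute


bar = Bar()
try:
    Foo.__a  # outside a class body the name is not mangled, so lookup fails
    plain_access_failed = False
except AttributeError:
    plain_access_failed = True

print(plain_access_failed, Foo._Foo__a, bar.read_a(), bar._Bar__a)
```

The subclass can define its own __a without disturbing the value that the base class method reads.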

Constants

Constants are usually defined on a module level and written in all capital letters with underscores separating words. Examples include MAX_OVERFLOW and TOTAL.

Designing for inheritance

Always decide whether a class’s methods and instance variables (collectively: “attributes”) should be public or non-public. If in doubt, choose non-public; it’s easier to make it public later than to make a public attribute non-public.

Public attributes are those that you expect unrelated clients of your class to use, with your commitment to avoid backward incompatible changes. Non-public attributes are those that are not intended to be used by third parties; you make no guarantees that non-public attributes won’t change or even be removed.

We don’t use the term “private” here, since no attribute is really private in Python (without a generally unnecessary amount of work).

Another category of attributes are those that are part of the “subclass API” (often called “protected” in other languages). Some classes are designed to be inherited from, either to extend or modify aspects of the class’s behavior. When designing such a class, take care to make explicit decisions about which attributes are public, which are part of the subclass API, and which are truly only to be used by your base class.

With this in mind, here are the Pythonic guidelines:

Public attributes should have no leading underscores.
If your public attribute name collides with a reserved keyword, append a single trailing underscore to your attribute name. This is preferable to an abbreviation or corrupted spelling. (However, notwithstanding this rule, ‘cls’ is the preferred spelling for any variable or argument which is known to be a class, especially the first argument to a class method.)

Note 1:
See the argument name recommendation above for class methods.
For simple public data attributes, it is best to expose just the attribute name, without complicated accessor/mutator methods. Keep in mind that Python provides an easy path to future enhancement, should you find that a simple data attribute needs to grow functional behavior. In that case, use properties to hide functional implementation behind simple data attribute access syntax.

Note 1:
Properties only work on new-style classes.

Note 2:
Try to keep the functional behavior side-effect free, although side-effects such as caching are generally fine.

Note 3:
Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.
If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores. This invokes Python’s name mangling algorithm, where the name of the class is mangled into the attribute name. This helps avoid attribute name collisions should subclasses inadvertently contain attributes with the same name.

Note 1:
Note that only the simple class name is used in the mangled name, so if a subclass chooses both the same class name and attribute name, you can still get name collisions.

Note 2:
Name mangling can make certain uses, such as debugging and __getattr__(), less convenient. However the name mangling algorithm is well documented and easy to perform manually.

Note 3:
Not everyone likes name mangling. Try to balance the need to avoid accidental name clashes with potential use by advanced callers.
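The guideline about simple public data attributes can be sketched as follows. The `Account` class is hypothetical; the assumption is that `balance` started out as a plain attribute and later needed validation, which a property adds without changing how callers access it:

```python
class Account:
    """Hypothetical class; balance started life as a plain public attribute."""

    def __init__(self, balance=0):
        self.balance = balance  # goes through the property setter below

    @property
    def balance(self):
        # Cheap, side-effect-free read, in line with Notes 2 and 3.
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value


account = Account(10)
account.balance = 25  # callers still use plain attribute syntax
try:
    account.balance = -1
    rejected = False
except ValueError:
    rejected = True

print(account.balance, rejected)
```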

References

[1]	PEP 7, Style Guide for C Code, van Rossum

[2]	Barry’s GNU Mailman style guide

[3]	CamelCase Wikipedia page

Copyright

Author:

Guido van Rossum <guido@python.org>
Barry Warsaw <barry@python.org>

Lua style guide

Inspiration:

Programming style is art. There is some arbitrariness to the rules, but there are sound rationales for them. It is useful not only to provide sound advice on style but to understand the underlying rationale behind the style recommendations:

The Zen of Python is good. Understand it and use wisely:

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Flat is better than nested.

Sparse is better than dense.

Readability counts.

Special cases aren’t special enough to break the rules.

Although practicality beats purity.

Errors should never pass silently.

Unless explicitly silenced.

In the face of ambiguity, refuse the temptation to guess.

There should be one – and preferably only one – obvious way to do it.

Although that way may not be obvious at first unless you’re Dutch.

Now is better than never.

Although never is often better than right now.

If the implementation is hard to explain, it’s a bad idea.

If the implementation is easy to explain, it may be a good idea.

Namespaces are one honking great idea – let’s do more of those!

https://www.python.org/dev/peps/pep-0020/

Indentation and formatting

4 spaces instead of tabs. PIL suggests using two spaces, but a programmer looks at code from 4 to 8 hours a day, so it’s simpler to distinguish indentation with 4 spaces. Why spaces? Similar representation everywhere.

You can use vim modelines:
```
-- vim:ts=4 ss=4 sw=4 expandtab
```
A file should end with one newline symbol, but shouldn’t end with a blank line (two newline symbols).
Every do/while/for/if/function should indent 4 spaces.

Related and/or expressions in an if condition must be enclosed in round brackets (). Example:

-- Good
if (a == true and b == false) or (a == false and b == true) then
    <...>
end

-- Bad
if a == true and b == false or a == false and b == true then
    <...>
end

-- Good but not explicit (a and b must both be booleans)
if a ~= b then
    <...>
end

Type conversion

Do not use concatenation to convert to string or addition to convert to number (use tostring/tonumber instead):

-- Bad
local a = 123
a = a .. ''

-- Good
local a = 123
a = tostring(a)

-- Bad
local a = '123'
a = a + 5 -- 128

-- Good
local a = '123'
a = tonumber(a) + 5 -- 128

Try to avoid multiple nested if’s with common body:

-- Good
if (a == true and b == false) or (a == false and b == true) then
    do_something()
end

-- Bad
if a == true then
    if b == false then
        do_something()
    end
end
if b == true then
    if a == false then
        do_something()
    end
end

Avoid multiple concatenations in one statement, use string.format instead:

-- Bad
function say_greeting(period, name)
    local a = "good " .. period .. ", " .. name
end

-- Good
function say_greeting(period, name)
    local a = string.format("good %s, %s", period, name)
end

-- Best
local say_greeting_fmt = "good %s, %s"
function say_greeting(period, name)
    local a = say_greeting_fmt:format(period, name)
end

Use and/or for default variable values

-- Good
local function example(input)
    input = input or 'default_value'
end

-- Ok but excessive
local function example(input)
    if input == nil then
        input = 'default_value'
    end
end

if’s and return statements:

-- Good
if a == true then
    return do_something()
end
do_other_thing()

-- Bad
if a == true then
    return do_something()
else
    do_other_thing()
end

Using spaces:

Don’t use spaces between function name and opening round bracket. Split arguments with one whitespace character:
```
-- Bad
function name (arg1,arg2,...)
end

-- Good
function name(arg1, arg2, ...)
end
```

Add a space after comment markers:

while true do -- Inline comment
    -- Comment
    do_something()
end
--[[
Multiline
comment
]]--

Surrounding operators:

-- Bad
local thing=1
thing = thing-1
thing = thing*1
thing = 'string'..'s'

-- Good
local thing = 1
thing = thing - 1
thing = thing * 1
thing = 'string' .. 's'

Add a space after commas in tables:

-- Bad
local thing = {1,2,3}
thing = {1 , 2 , 3}
thing = {1 ,2 ,3}

-- Good
local thing = {1, 2, 3}

Add a space in map definitions after equals signs and commas:

-- Bad
return {1,2,3,4}
return {
    key1 = val1,key2=val2
}

-- Good
return {1, 2, 3, 4}
return {
    key1 = val1, key2 = val2,
    key3 = val3
}

You can also use alignment:

return {
    long_key  = 'vaaaaalue',
    key       = 'val',
    something = 'even better'
}

Extra blank lines may be used (sparingly) to separate groups of related functions. Blank lines may be omitted between several related one-liners (for example, a set of dummy implementations).

Use blank lines in functions (sparingly) to indicate logical sections:

-- Bad
if thing ~= nil then
    -- ... stuff ...
end
function derp()
    -- ... stuff ...
end
local wat = 7

-- Good
if thing ~= nil then
    -- ... stuff ...
end

function derp()
    -- ... stuff ...
end

local wat = 7

Trailing whitespace at the end of a line is strongly forbidden. Use :s/\s\+$//gc in vim to delete it (prefix the command with % to apply it to the whole file).

Avoid global variables

Avoid using global variables. In exceptional cases, start the name of such a variable with _G, add a prefix, or add a table instead of a prefix:

-- Very bad
function bad_global_example()
end

local function good_local_example()
end
-- Good
_G.modulename_good_local_example = good_local_example

-- Better
_G.modulename = {}
_G.modulename.good_local_example = good_local_example

Always use a prefix to avoid name conflicts.

Naming

Names of variables/“objects” and “methods”/functions: snake_case.
Names of “classes”: CamelCase.
Private variables/methods (future properties) of objects start with underscores <object>._<name>. Avoid syntax like local function private_methods(self) end.
Boolean: naming is_<...>, isnt_<...>, has_, hasnt_ is good style.
For “very local” variables:
- t is for tables
- i, j are for indexing
- n is for counting
- k, v is what you get out of pairs() (both are acceptable; use _ for the one that is unused)
- i, v is what you get out of ipairs() (both are acceptable; use _ for the one that is unused)
- k/key is for table keys
- v/val/value is for values that are passed around
- x/y/z is for generic math quantities
- s/str/string is for strings
- c is for 1-char strings
- f/func/cb are for functions
- status, <rv>.. or ok, <rv>.. is what you get out of pcall/xpcall
- buf, sz is a (buffer, size) pair
- <name>_p is for pointers
- t0.. is for timestamps
- err is for errors
Abbreviations are acceptable if they’re very common or if they’re unambiguous and you’ve documented them.
Global variables are spelled in ALL_CAPS. If it’s a system variable, it starts with an underscore (_G/_VERSION/..).
Modules are named in snake_case (avoid underscores and dashes): for example, ‘luasql’, not ‘Lua-SQL’.
*_mt and *_methods define a metatable and a methods table.

Idioms and patterns

Always use round brackets when calling functions, except in a few cases (common Lua style idioms):

*.cfg{ } functions (box.cfg/memcached.cfg/..)
ffi.cdef[[ ]] function

Avoid the following constructions:

<func>'<name>'. Strongly avoid require'..'.
function object:method() end. Use function object.method(self) end instead.
Semicolons as table separators. Only use commas.
Semicolons at the end of line. Use semicolons only to split multiple statements on one line.
Unnecessary function creation (closures/..).

Avoid implicit casting to boolean in if conditions like if x then or if not x then. Such expressions will likely result in troubles with box.NULL. Instead of those conditions, use if x ~= nil then and if x == nil then.

Modules

Don’t start modules with license/authors/descriptions; put them in LICENSE/AUTHORS/README files instead. To write modules, use one of the two patterns (don’t use module()):

local M = {}

function M.foo()
    ...
end

function M.bar()
    ...
end

return M

local function foo()
    ...
end

local function bar()
    ...
end

return {
    foo = foo,
    bar = bar,
}

Commenting

Don’t forget to comment your Lua code. You shouldn’t comment Lua syntax (assume that the reader already knows the Lua language). Instead, explain what the functions, variables, and so on are for.

Start a sentence with a capital letter and end with a period.

Multiline comments: use matching (--[[ ]]--) instead of simple (--[[ ]]).

Public function comments:

--- Copy any table (shallow and deep version).
-- * deepcopy: copies all levels
-- * shallowcopy: copies only first level
-- Supports __copy metamethod for copying custom tables with metatables.
-- @function gsplit
-- @table         inp  original table
-- @shallow[opt]  sep  flag for shallow copy
-- @returns            table (copy)

Testing

Use the tap module for writing efficient tests. Example of a test file:

#!/usr/bin/env tarantool

local test = require('tap').test('table')
test:plan(31)

do
    -- Check basic table.copy (deepcopy).
    local example_table = {
        { 1, 2, 3 },
        { "help, I'm very nested", { { { } } } }
    }

    local copy_table = table.copy(example_table)

    test:is_deeply(
            example_table,
            copy_table,
            "checking, that deepcopy behaves ok"
    )
    test:isnt(
            example_table,
            copy_table,
            "checking, that tables are different"
    )
    test:isnt(
            example_table[1],
            copy_table[1],
            "checking, that tables are different"
    )
    test:isnt(
            example_table[2],
            copy_table[2],
            "checking, that tables are different"
    )
    test:isnt(
            example_table[2][2],
            copy_table[2][2],
            "checking, that tables are different"
    )
    test:isnt(
            example_table[2][2][1],
            copy_table[2][2][1],
            "checking, that tables are different"
    )
end

<...>

os.exit(test:check() and 0 or 1)

When you test your code, the output will be something like this:

TAP version 13
1..31
ok - checking, that deepcopy behaves ok
ok - checking, that tables are different
ok - checking, that tables are different
ok - checking, that tables are different
ok - checking, that tables are different
ok - checking, that tables are different
...

Error handling

Be generous in what you accept and strict in what you return.

With error handling, this means that you must provide an error object as the second multi-return value in case of error. The error object can be a string, a Lua table, cdata, or userdata. In the latter three cases, it must have a __tostring metamethod defined.

In case of error, use nil for the first return value. This makes the error hard to ignore.

When checking function return values, check the first return value first. If it’s nil, look for the error in the second return value:

local data, err = foo()
if data == nil then
    return nil, err
end
return bar(data)

Unless the performance of your code is paramount, try to avoid using more than two return values.

In rare cases, you may want to return nil as a legal return value. In this case, it’s OK to check for error first and then for return:

local data, err = foo()
if err == nil then
    return data
end
return nil, err

luacheck

To check the code style, Tarantool uses luacheck. It analyses different aspects of code, such as unused variables, and sometimes it checks more aspects than needed. So there is an agreement to ignore some warnings generated by luacheck:

"212/self",   -- Unused argument <self>.
"411",        -- Redefining a local variable.
"421",        -- Shadowing a local variable.
"431",        -- Shadowing an upvalue.
"432",        -- Shadowing an upvalue argument.

return:	`mem_free` is the allocated, but currently unused memory; `mem_used` is the memory used for storing data items (tuples and indexes); `item_count` is the number of stored items; `item_size` is the size of each data item; `slab_count` is the number of slabs allocated; `slab_size` is the size of each allocated slab.
rtype:	table

Parameters:	is_joinable (`boolean`) – the boolean value that specifies whether the fiber is joinable
Return:	nil

param string input-string:
	the string to be converted (the “from” string)
return:	the string that results from the conversion (the “to” string)

Return:	A concatenation of `observation` objects across all created collectors.
	{
	    label_pairs: table,          -- `label_pairs` key-value table
	    timestamp: ctype<uint64_t>,  -- current system time (in microseconds)
	    value: number,               -- current value
	    metric_name: string,         -- collector
	}
Rtype:	table

Parameters:	kind (`string`) – collector kind (`counter`, `gauge`, `histogram`, or `summary`). name (`string`) – collector name.
Return:	A collector object or `nil`.
Rtype:	collector_obj

Parameters:	data (const char*) – begin of MessagePack to push
	data_end (const char*) – end of MessagePack to push
Returns:	-1 on error (check box_error_last())
Returns:	0 otherwise

Parameters:	seq_id (uint32_t) – sequence identifier result (int64_t) – pointer to a variable where the current sequence value will be stored on success.
Returns:	0 on success and -1 otherwise. In case of an error user could get it via `box_error_last()`.

Parameters:	sid (uint32_t) – the IPROTO session identifier (see box_session_id())
	header (char*) – a MsgPack-encoded header
	header_end (char*) – end of a header encoded as MsgPack
	body (char*) – a MsgPack-encoded body. If the `body` and `body_end` parameters are omitted, the packet consists of the header only.
	body_end (char*) – end of a body encoded as MsgPack
Returns:	0 on success; -1 on error (check box_error_last())
Return type:	number

Parameters:	request_type (uint32_t) – IPROTO request type code (for example, `IPROTO_SELECT`). For details, check Client-server requests and responses. To override the handler of unknown request types, use the IPROTO_UNKNOWN type code. handler (iproto_handler_t) – IPROTO request handler. To reset the request handler, set the `handler` parameter to `NULL`. See the full parameter description in the Handler function section. destroy (iproto_handler_destroy_t) – IPROTO request handler destructor. The destructor is called when the corresponding handler is removed. See the full parameter description in the Handler destructor function section. ctx (void*) – a context passed to the `handler` and `destroy` callbacks
Returns:	0 on success; -1 on error (check box_error_last())
Return type:	number

Parameters:	f (struct fiber*) – fiber
	yesno (bool) – status to set
Returns:	previous state

Parameters:	fiber_attr (struct fiber_attr*) – fiber attributes container
	stack_size (size_t) – stack size for new fibers (in bytes)
Returns:	0 on success
Returns:	-1 on failure (if `stack_size` is smaller than the minimum allowable fiber stack size)

Parameters:	cond (struct fiber_cond*) – conditional variable
	timeout (double) – timeout in seconds
Returns:	0 on fiber_cond_signal() call or a spurious wake up
Returns:	-1 on timeout, and the error code is set to ‘TimedOut’

Parameters:	L (lua_State*) – Lua State
	idx (int) – stack index
	ctypeid (uint32_t*) – output argument. FFI’s CTypeID of returned cdata
Returns:	memory associated with this cdata

Parameters:	L (lua_State*) – Lua State
	ctypename (const char*) – C type name as string (e.g. “struct request” or “uint32_t”)
Returns:	CTypeID

Parameters:	tuple (box_tuple_t*) – a tuple
Returns:	-1 on error
Returns:	0 otherwise

Parameters:	tuple (box_tuple_t*) – a tuple
	field_id (uint32_t) – zero-based index in MsgPack array.
Returns:	NULL if i >= box_tuple_field_count()
Returns:	msgpack otherwise

Parameters:	fields (uint32_t*) – array with key field identifiers
	types (uint32_t*) – array with key field types
	part_count (uint32_t) – the number of key fields
Returns:	key definition on success
Returns:	NULL on error

Parameters:	keys (key_def) – array of keys defined for the format key_count (uint16_t) – count of keys
Returns:	new tuple format on success
Returns:	NULL on error

Parameters:	tuple_a (const box_tuple_t*) – the first tuple
	tuple_b (const box_tuple_t*) – the second tuple
	key_def (const box_key_def_t*) – key definition
Returns:	0 if `key_fields(tuple_a)` == `key_fields(tuple_b)`
Returns:	<0 if `key_fields(tuple_a)` < `key_fields(tuple_b)`
Returns:	>0 if `key_fields(tuple_a)` > `key_fields(tuple_b)`

Parameters:	tuple (const box_tuple_t*) – tuple
	key (const char*) – key with MessagePack array header
	key_def (const box_key_def_t*) – key definition
Returns:	0 if `key_fields(tuple)` == `parts(key)`
Returns:	<0 if `key_fields(tuple)` < `parts(key)`
Returns:	>0 if `key_fields(tuple)` > `parts(key)`

Parameters:	it (box_tuple_iterator_t*) – a tuple iterator
Returns:	position