Updated at 2023-06-07 09:39:41.195831

Changelog

Versioning policy

Tarantool Enterprise SDK version consists of two parts:

<TARANTOOL_BASE_VERSION>-r<REVISION>

For example: 2.6.1-0-gcfe0d1a55-r362.


Release SDK by tags:


r467

Breaking changes

  • Default audit log format was changed to CSV.

Functionality added or changed

Enterprise

  • Implemented user-defined audit events. Now it’s possible to log custom messages to the audit log from Lua (gh-65).
  • [Breaking change] Switched the default audit log format to CSV. The format can be switched back to JSON using the new box.cfg.audit_format configuration option (gh-66).
  • Implemented the audit log filter. Now, it’s possible to enable logging only for a subset of all audit events using the new box.cfg.audit_filter configuration option (gh-67).
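The two new configuration options above can be combined in a single box.cfg call. A minimal sketch; only the option names (audit_format, audit_filter) come from the entries above, while the file path and the filter value are illustrative assumptions:

```lua
box.cfg{
    audit_log = 'audit.log',            -- hypothetical destination file
    audit_format = 'json',              -- switch back from the new CSV default
    audit_filter = 'auth_ok,auth_fail', -- hypothetical subset of events
}
```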

Core

  • Implemented constraints and foreign keys. Now a user can create function constraints and foreign key relations (gh-6436).
  • Changed log level of some information messages from critical to info (gh-4675).
  • Added predefined system events: box.status, box.id, box.election and box.schema (gh-6260).
  • Introduced transaction isolation levels in Lua and IPROTO (gh-6930).
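The isolation level from the last entry can be selected per transaction in Lua. A minimal sketch, assuming the `txn_isolation` option of `box.begin` as in open-source Tarantool; the `accounts` space is a hypothetical name:

```lua
-- Enable the MVCC engine first; isolation levels other than the default
-- take effect only when it is on.
box.cfg{ memtx_use_mvcc_engine = true }

-- Start a transaction with an explicit isolation level.
box.begin({ txn_isolation = 'read-committed' })
box.space.accounts:update(1, {{'+', 'balance', 100}})  -- 'accounts' is hypothetical
box.commit()
```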

Vinyl

  • Disabled the deferred DELETE optimization in Vinyl to avoid possible performance degradation of secondary index reads. Now, to enable the optimization, one has to set the defer_deletes flag in space options (gh-4501).
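With this change, the optimization becomes opt-in through the space option named in the entry. A minimal sketch; `test` is a hypothetical space name:

```lua
-- Re-enable the deferred DELETE optimization for a new Vinyl space.
box.schema.space.create('test', {
    engine = 'vinyl',
    defer_deletes = true,
})
```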

Lua

  • Added support for console autocompletion of the net.box objects stream and future (gh-6305).

Datetime

  • Added a parse() method to allow converting string literals in extended ISO 8601 or RFC 3339 formats (gh-6731).
  • The range of supported years has been extended in all parsers to fully cover -5879610-06-22..5879611-07-11 (gh-6731).
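Assuming the parse method lives in the datetime module, as in open-source Tarantool, usage could look like:

```lua
local datetime = require('datetime')

-- Parse an extended ISO 8601 / RFC 3339 literal into a datetime object.
local dt = datetime.parse('2017-12-27T18:45:32.999999-05:00')
```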

Build

  • Added bundling of GNU libunwind to support the backtrace feature on the AARCH64 architecture and on distributions that don’t provide a libunwind package.
  • Re-enabled backtrace feature for all RHEL distributions by default, except for AARCH64 architecture and ancient GCC versions, which lack compiler features required for backtrace (gh-4611).

Bugs fixed

Enterprise

  • Disabled audit log unless explicitly configured (gh-39). Before this change, audit events were written to stderr if box.cfg.audit_log wasn’t set. Now, audit log is disabled in this case.
  • Disabled audit logging of replicated events (gh-59). Now, replicated events (for example, user creation) are logged only on the origin, never on a replica.

Core

  • Banned DDL operations in space on_replace triggers, since they could lead to a crash (gh-6920).
  • Fixed a bug due to which all fibers created with fiber_attr_setstacksize() leaked until thread exit. Their stacks also leaked, except when fiber_set_joinable(..., true) was used.
  • Fixed a crash in MVCC related to a secondary index conflict (gh-6452).
  • Fixed a bug that resulted in a wrong space count (gh-6421).
  • Select in an RO transaction now reads confirmed data, like a standalone (autocommit) select does (gh-6452).

Replication

  • Fixed a potential write of obsolete data in synchronous replication caused by a race in accessing terms while a disk write operation was still in progress.
  • Fixed replicas failing to bootstrap when the master has just been restarted (gh-6966).

Lua

  • Fixed the behavior of tarantool console on SIGINT. Now Ctrl+C discards the current input and prints the new prompt (gh-2717).

Triggers

  • Fixed an assertion or segfault when MP_EXT was received via net.box (gh-6766).
  • Now ROUND() properly supports INTEGER and DECIMAL as the first argument (gh-6988).

Datetime

  • Fixed a bug where intervals received from datetime arithmetic operations could be improperly normalized if the result was negative:

    tarantool> date.now() - date.now()
    ---
    - -1.000026000 seconds
    ...
    

    That is, two date.now() calls made immediately one after another produce very close values, whose difference should be close to 0, not 1 second (gh-6882).

Net.box

  • Changed the type of the error returned by net.box on timeout from ClientError to TimedOut (gh-6144).
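Client code that inspects error types can now distinguish timeouts explicitly. A hedged sketch; `slow_func` and the address are placeholders, and the `err.type` check follows the box.error convention for typed errors:

```lua
-- Detect the new TimedOut error type instead of a generic ClientError.
local net_box = require('net.box')
local conn = net_box.connect('localhost:3301')

local ok, err = pcall(function()
    return conn:call('slow_func', {}, { timeout = 0.1 })
end)
if not ok and err.type == 'TimedOut' then
    -- retry or report the timeout rather than treating it as a generic error
end
```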


Setup

This chapter explains how to download and set up Tarantool Enterprise and run a sample application provided with it.

System requirements

The recommended system requirements for running Tarantool Enterprise are as follows.

Hardware requirements

To fully ensure the fault tolerance of a distributed data storage system, at least three physical computers or virtual servers are required.

For testing/development purposes, the system can be deployed using a smaller number of servers; however, it is not recommended to use such configurations for production.

Software requirements

  1. As host operating systems, Tarantool Enterprise supports Red Hat Enterprise Linux and CentOS versions 7.5 and higher.

    Note

    Tarantool Enterprise can run on other systemd-based Linux distributions but it is not tested on them and may not work as expected.

  2. glibc 2.17-260.el7_6.6 and higher is required. Take care to check and update, if needed:

    $ rpm -q glibc
    glibc-2.17-196.el7_4.2
    $ yum update glibc
    

Network requirements

Hereinafter, “storage servers” or “Tarantool servers” are the computers used to store and process data, and “administration server” is the computer used by the system operator to install and configure the product.

The Tarantool cluster has a full mesh topology; therefore, all Tarantool servers should be able to communicate and send traffic from and to the TCP/UDP ports used by the cluster’s instances (see advertise_uri: <host>:<port> and config: advertise_uri: '<host>:<port>' in /etc/tarantool/conf.d/*.yml for each instance). For example:

# /etc/tarantool/conf.d/*.yml

myapp.s2-replica:
  advertise_uri: localhost:3305 # this is a TCP/UDP port
  http_port: 8085

all:
  ...
  hosts:
    storage-1:
      config:
        advertise_uri: 'vm1:3301' # this is a TCP/UDP port
        http_port: 8081

To configure remote monitoring or to connect via the administrative console, the administration server should be able to access the following TCP ports on Tarantool servers:

  • 22 to use the SSH protocol,
  • ports specified in instance configuration (http_port parameter) to monitor the HTTP-metrics.

Additionally, it is recommended to apply the following settings for sysctl on all Tarantool servers:

$ # TCP KeepAlive setting
$ sysctl -w net.ipv4.tcp_keepalive_time=60
$ sysctl -w net.ipv4.tcp_keepalive_intvl=5
$ sysctl -w net.ipv4.tcp_keepalive_probes=5

This optional setup of the Linux network stack helps speed up the troubleshooting of network connectivity when the server physically fails. To achieve the maximum performance, you may also need to configure other network stack parameters that are not specific to the Tarantool DBMS. For more information, please refer to the Network Performance Tuning Guide section of the RHEL7 user documentation.

Package contents

The latest release packages of Tarantool Enterprise are available in the customer zone at the Tarantool website. Please contact support@tarantool.io for access.

Each package is distributed as a tar + gzip archive and includes the following components and features:

Archive contents:

Installation

The delivered tar + gzip archive should be uploaded to a server and unpacked:

$ tar xvf tarantool-enterprise-bundle-<version>.tar.gz

No further installation is required, as the unpacked binaries are almost ready to use. Change to the directory with the binaries (tarantool-enterprise) and add them to the executable path by running the script provided by the distribution:

$ source ./env.sh

Make sure you have sufficient privileges to run the script and that the file is executable. If not, use the chmod and chown commands to adjust it.

Next, set up your development environment as described in the developer’s guide.

Developer’s guide

To develop an application, use Tarantool Cartridge framework that is installed as part of Tarantool Enterprise.

Here is a summary of the commands you need:

  1. Create a cluster-aware application from template:

    $ cartridge create --name <app_name> /path/to
    
  2. Develop your application:

    $ cd /path/to/<app_name>
    $ ...
    
  3. Package your application:

    $ cartridge pack [rpm|tgz] /path/to/<app_name>
    
  4. Deploy your application:

    • For rpm package:

      1. Upload the package to all servers dedicated to Tarantool.

      2. Install the package:

        $ yum install <app_name>-<version>.rpm
        
      3. Launch the application.

        $ systemctl start <app_name>
        
    • For tgz archive:

      1. Upload the archive to all servers dedicated to Tarantool.

      2. Unpack the archive:

        $ tar -xzvf <app_name>-<version>.tar.gz -C /home/<user>/apps
        
      3. Launch the application

        $ tarantool init.lua
        

For details and examples, please consult the open-source Tarantool documentation:

Further on, this guide focuses on Enterprise-specific developer features available on top of the open-source Tarantool version with Tarantool Cartridge framework:

Implementing LDAP authorization in the web interface

If you run an LDAP server in your organization, you can connect Tarantool Enterprise to it and let it handle authorization. In this case, follow the general recipe: in the first step, add the ldap module to the .rockspec file as a dependency, and consider implementing the check_password function in the following way:

-- auth.lua
-- Require the LDAP module at the start of the file
local ldap = require('ldap')
...
-- Add a function to check the credentials
local function check_password(username, password)

    -- Configure the necessary LDAP parameters
    local user = string.format("cn=%s,ou=superheros,dc=glauth,dc=com", username)

    -- Connect to the LDAP server
    local ld, err = ldap.open("localhost:3893", user, password)

    -- Return an authentication success or failure
    if not ld then
       return false
    end
    return true
end
 ...

Delivering environment-independent applications

Tarantool Enterprise allows you to build environment-independent applications.

An environment-independent application is an assembly (in one directory) of:

When started by the tarantool executable, the application provides a service.

The modules are Lua rocks installed into a virtual environment (under the application directory) similar to Python’s virtualenv and Ruby’s bundler.

Such an application has the same structure both in development and production-ready phases. All the application-related code resides in one place, ready to be packed and copied over to any server.

Packaging applications

Once custom cluster role(s) are defined and the application is developed, pack it and all its dependencies (module binaries) together with the tarantool executable.

This will allow you to upload, install, and run your application on any server in one go.

To pack the application, say:

$ cartridge pack [rpm|tgz] /path/to/<app_name>

where you specify the path to your development environment (the Git repository containing your application code) and one of the following build options:

  • rpm to build an RPM package (recommended), or
  • tgz to build a tar + gz archive (choose this option only if you do not have root privileges on servers dedicated for Tarantool Enterprise).

This will create a package (or compressed archive) named <app_name>-<version_tag>-<number_of_commits> (e.g., myapp-1.2.1-12.rpm) containing your environment-independent application.

Next, proceed to deploying packaged applications (or archived ones) on your servers.

Deploying packaged applications

To deploy your packaged application, do the following on every server dedicated for Tarantool Enterprise:

  1. Upload the package created in the previous step.

  2. Install:

    $ yum install <app_name>-<version>.rpm
    
  3. Start one or multiple Tarantool instances with the corresponding services as described below.

    • A single instance:

      $ systemctl start <app_name>
      

      This will start an instantiated systemd service that will listen to port 3301.

    • Multiple instances on one or multiple servers:

      $ systemctl start <app_name>@instance_1
      $ systemctl start <app_name>@instance_2
      ...
      $ systemctl start <app_name>@instance_<number>
      

      where <app_name>@instance_<number> is the instantiated service name for systemd with an incremental <number> (unique for every instance); the instance listens on port 3300 plus <number> (e.g., 3301, 3302, etc.).

  4. In case it is a cluster-aware application, proceed to deploying the cluster.

To stop all services on a server, use the systemctl stop command and specify instance names one by one. For example:

$ systemctl stop <app_name>@instance_1 <app_name>@instance_2 ... <app_name>@instance_<N>

Deploying archived applications

While the RPM package places your application to /usr/share/tarantool/<app_name> on your server by default, the tar + gz archive does not enforce any structure apart from just the <app_name>/ directory, so you are responsible for placing it appropriately.

Note

RPM packages are recommended for deployment. Deploy archives only if you do not have root privileges.

To place and deploy the application, do the following on every server dedicated for Tarantool Enterprise:

  1. Upload the archive, decompress, and extract it to the /home/<user>/apps directory:

    $ tar -xzvf <app_name>-<version>.tar.gz -C /home/<user>/apps
    
  2. Start Tarantool instances with the corresponding services.

    To manage instances and configuration, use tools like ansible, systemd, and supervisord.

  3. In case it is a cluster-aware application, proceed to deploying the cluster.

Upgrading code

All instances in the cluster are to run the same code. This includes all the components: custom roles, applications, module binaries, tarantool and tarantoolctl (if necessary) executables.

Pay attention to possible backward incompatibility that any component may introduce. This will help you choose a scenario for an upgrade in production. Keep in mind that you are responsible for code compatibility and handling conflicts should inconsistencies occur.

To upgrade any of the components, prepare a new version of the package (archive):

  1. Update the necessary files in your development environment (directory):
    • Your own source code: custom roles and/or applications.
    • Module binaries.
    • Executables. Replace them with ones from the new bundle.
  2. Increment the version as described in application versioning.
  3. Repack the updated files as described in packaging applications.
  4. Choose an upgrade scenario as described in production upgrade section.

Running sample applications

The Enterprise distribution package includes sample applications in the examples/ directory that showcase basic Tarantool functionality.

Write-through cache application for PostgreSQL

The example in pg_writethrough_cache/ shows how Tarantool can cache data written through it to a PostgreSQL database to speed up the reads.

The sample application requires a deployed PostgreSQL database and the following rock modules:

$ tarantoolctl rocks install http
$ tarantoolctl rocks install pg
$ tarantoolctl rocks install argparse

Look through the code in the files to get an understanding of what the application does.

To run the application for a local PostgreSQL database, say:

$ tarantool cachesrv.lua --binary-port 3333 --http-port 8888 --database postgresql://localhost/postgres

Write-behind cache application for Oracle

The example in ora-writebehind-cache/ shows how Tarantool can cache writes and queue them to an Oracle database to speed up both writes and reads.

Application requirements

The sample application requires:

  • deployed Oracle database;

  • Oracle tools: Instant Client and SQL Plus, both of version 12.2;

    Note

    In case the Oracle Instant Client errors out on .so files (Oracle’s dynamic libraries), put them into a directory and add that directory to the LD_LIBRARY_PATH environment variable.

    For example: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/<path_to_so_files>

  • rock modules listed in the rockspec file.

To install the modules, run the following command in the examples/ora_writebehind_cache directory:

$ tarantoolctl rocks make oracle_rb_cache-0.1.0-1.rockspec

If you do not have a deployed Oracle instance at hand, run a dummy in a Docker container:

  1. In a browser, log in to the Oracle container registry, click Database, and accept Oracle’s Enterprise Terms and Restrictions.

  2. In the ora-writebehind-cache/ directory, log in to the repository under the Oracle account, pull, and run an image using the prepared scripts:

    $ docker login container-registry.oracle.com
    Login:
    Password:
    Login Succeeded
    $ docker pull container-registry.oracle.com/database/enterprise:12.2.0.1
    $ docker run -itd \
       -p 1521:1521 \
       -p 5500:5500 \
       --name oracle \
       -v "$(pwd)"/setupdb/configDB.sh:/home/oracle/setup/configDB.sh \
       -v "$(pwd)"/setupdb/runUserScripts.sh:/home/oracle/setup/runUserScripts.sh \
       -v "$(pwd)"/startupdb:/opt/oracle/scripts/startup \
       container-registry.oracle.com/database/enterprise:12.2.0.1
    

When the setup is complete, run the example application.

Running write-behind cache

To launch the application, run the following in the examples/ora_writebehind_cache directory:

$ tarantool init.lua

The application supports the following requests:

  • Get: GET http://<host>:<http_port>/account/id;

  • Add: POST http://<host>:<http_port>/account/ with the following data:

    {"clng_clng_id":1,"asut_asut_id":2,"creation_data":"01-JAN-19","navi_user":"userName"}
    
  • Update: POST http://<host>:<http_port>/account/id with the same data as in the add request;

  • Remove: DELETE http://<host>:<http_port>/account/id where id is an account identifier.

Look for sample CURL scripts in the examples/ora_writebehind_cache/testing directory and check the README.md for more information on implementation.

Hello-world application in Docker

The example in the docker/ directory contains a hello-world application that you can pack in a Docker container and run on CentOS 7.

The hello.lua file is the entry point and it is very bare-bones, so you can add your own code here.

  1. To build the container, say:

    $ docker build -t tarantool-enterprise-docker -f Dockerfile ../..
    
  2. To run it:

    $ docker run --rm -t -i tarantool-enterprise-docker
    

Cluster administrator’s guide

This guide focuses on Enterprise-specific administration features available on top of the open-source Tarantool version with Tarantool Cartridge framework:

Otherwise, please consult the open-source Tarantool documentation for:

Exploring spaces

The web interface lets you connect (in the browser) to any instance in the cluster and see what spaces it stores (if any) and their contents.

To explore spaces:

  1. Open the Space Explorer tab in the menu on the left:

    ../_images/space_explr_tab.png
  2. Click connect next to an instance that stores data. The basic sanity check (test.py) of the example application puts sample data into one replica set (shard), so its master and replica store the data in their spaces:

    ../_images/spaces_with_data.png

    When connected to an instance, the space explorer shows a table with basic information on its spaces. For more information, see the box.space reference.

    To see hidden spaces, tick the corresponding checkbox:

    ../_images/hidden_spaces.png
  3. Click the space’s name to see its format and contents:

    ../_images/space_contents.png

    To search the data, select an index and, optionally, its iteration type from the drop-down lists, and enter the index value:

    ../_images/space_search.png

Upgrading in production

To upgrade either a single instance or a cluster, you need a new version of the packaged (archived) application.

A single instance upgrade is simple:

  1. Upload the package (archive) to the server.
  2. Stop the current instance.
  3. Deploy the new one as described in deploying packaged applications (or archived ones).

Cluster upgrade

To upgrade a cluster, choose one of the following scenarios:

  • Cluster shutdown. Recommended for backward-incompatible updates, requires downtime.
  • Instance by instance. Recommended for backward-compatible updates, does not require downtime.

To upgrade the cluster, do the following:

  1. Schedule a downtime or plan for the instance-by-instance upgrade.
  2. Upload a new application package (archive) to all servers.

Next, execute the chosen scenario:

  • Cluster shutdown:
    1. Stop all instances on all servers.
    2. Deploy the new package (archive) on every server.
  • Instance by instance. Do the following in every replica set in succession:
    1. Stop a replica on any server.
    2. Deploy the new package (archive) in place of the old replica.
    3. Promote the new replica to a master (see Switching the replica set’s master section in the Tarantool manual).
    4. Redeploy the old master and the rest of the instances in the replica set.
    5. Be prepared to resolve possible logic conflicts.

Security hardening guide

This guide explains how to enhance security in your Tarantool Enterprise cluster using built-in features and provides general recommendations on security hardening. If you need to perform a security audit of a Tarantool Enterprise cluster, refer to the security checklist.

Tarantool Enterprise does not provide a dedicated API for security control. All the necessary configurations can be done via an administrative console or initialization code.

Tarantool Enterprise has the following built-in security features:

Authentication

Tarantool Enterprise supports password-based authentication and allows for two types of connections:

For more information on authentication and connection types, see the Security section of the Tarantool manual.

In addition, Tarantool provides the following functionality:

Access control

Tarantool Enterprise provides the means for administrators to prevent unauthorized access to the database and to certain functions.

Tarantool recognizes:

The following system spaces are used to store users and privileges:

For more information, see the Access control section.

Users who create objects (spaces, indexes, users, roles, sequences, and functions) in the database become their owners and automatically acquire privileges for what they create. For more information, see the Owners and privileges section.
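For instance, ownership and privilege granting go through the standard box.schema API. A brief sketch; `app_user` and `accounts` are hypothetical names:

```lua
-- Create a user and grant it access to a space.
box.schema.user.create('app_user', { password = 'secret' })
box.schema.space.create('accounts')  -- the creating user becomes the owner
box.schema.user.grant('app_user', 'read,write', 'space', 'accounts')
```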

Authentication restrictions

Tarantool Enterprise provides the ability to apply additional restrictions for user authentication. For example, you can specify the minimum time between authentication attempts or disable access for guest users.

The following configuration options are available:

auth_delay

Specifies a period of time (in seconds) that a specific user should wait for the next attempt after failed authentication.

With the configuration below, Tarantool refuses the authentication attempt if the previous attempt was less than 5 seconds ago.

box.cfg{ auth_delay = 5 }
Since version: 2.11
Type: number
Default: 0
Environment variable: TT_AUTH_DELAY
Dynamic: yes
disable_guest

If true, disables access over remote connections for unauthenticated users and users connecting as guest. This option affects both net.box and replication connections.
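With the configuration below, unauthenticated remote connections are rejected:

```lua
box.cfg{ disable_guest = true }
```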

Since version: 2.11
Type: boolean
Default: false
Environment variable: TT_DISABLE_GUEST
Dynamic: yes

Password policy

A password policy allows you to improve database security by enforcing the use of strong passwords, setting up a maximum password age, and so on. When you create a new user with box.schema.user.create or update the password of an existing user with box.schema.user.passwd, the password is checked against the configured password policy settings.

The following configuration options are available:

password_min_length

Specifies the minimum number of characters for a password.

The following example shows how to set the minimum password length to 10.

box.cfg{ password_min_length = 10 }
Since version: 2.11
Type: integer
Default: 0
Environment variable: TT_PASSWORD_MIN_LENGTH
Dynamic: yes
password_enforce_uppercase

If true, a password should contain uppercase letters (A-Z).

Since version: 2.11
Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_UPPERCASE
Dynamic: yes
password_enforce_lowercase

If true, a password should contain lowercase letters (a-z).

Since version: 2.11
Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_LOWERCASE
Dynamic: yes
password_enforce_digits

If true, a password should contain digits (0-9).

Since version: 2.11
Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_DIGITS
Dynamic: yes
password_enforce_specialchars

If true, a password should contain at least one special character (such as &|?!@$).

Since version: 2.11
Type: boolean
Default: false
Environment variable: TT_PASSWORD_ENFORCE_SPECIALCHARS
Dynamic: yes
password_lifetime_days

Specifies the maximum period of time (in days) a user can use the same password. When this period ends, a user gets the “Password expired” error on a login attempt. To restore access for such users, use box.schema.user.passwd.

Note

The default 0 value means that a password never expires.

The example below shows how to set a maximum password age to 365 days.

box.cfg{ password_lifetime_days = 365 }
Since version: 2.11
Type: integer
Default: 0
Environment variable: TT_PASSWORD_LIFETIME_DAYS
Dynamic: yes
password_history_length

Specifies the number of unique new user passwords before an old password can be reused.

In the example below, a new password should differ from the last three passwords.

box.cfg{ password_history_length = 3 }
Since version: 2.11
Type: integer
Default: 0
Environment variable: TT_PASSWORD_HISTORY_LENGTH
Dynamic: yes
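Taken together, a full password policy can be configured in a single call combining the options described above; the values here are illustrative:

```lua
box.cfg{
    password_min_length = 12,
    password_enforce_uppercase = true,
    password_enforce_lowercase = true,
    password_enforce_digits = true,
    password_enforce_specialchars = true,
    password_lifetime_days = 365,
    password_history_length = 3,
}
```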

Note

Tarantool uses the auth_history field in the box.space._user system space to store user passwords.

Authentication protocol

By default, Tarantool uses the CHAP protocol to authenticate users and applies SHA-1 hashing to passwords. Note that CHAP stores password hashes in the _user space unsalted. If an attacker gains access to the database, they may crack a password, for example, using a rainbow table.

With Tarantool Enterprise, you can enable PAP authentication with the SHA256 hashing algorithm. For PAP, a password is salted with a user-unique salt before saving it in the database, which keeps the database protected from cracking using a rainbow table.

To enable PAP, specify the box.cfg.auth_type option as follows:

box.cfg{ auth_type = 'pap-sha256' }
Since version: 2.11
Type: string
Default value: ‘chap-sha1’
Possible values: ‘chap-sha1’, ‘pap-sha256’
Environment variable: TT_AUTH_TYPE
Dynamic: yes

For new users, the box.schema.user.create method will generate authentication data using PAP-SHA256. For existing users, you need to reset a password using box.schema.user.passwd to use the new authentication protocol.
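Under these rules, migrating existing users amounts to resetting their passwords after enabling the new protocol. A sketch with hypothetical user names:

```lua
box.cfg{ auth_type = 'pap-sha256' }

-- New users get PAP-SHA256 authentication data automatically.
box.schema.user.create('alice', { password = 'Str0ng!pass' })

-- Existing users keep CHAP-SHA1 data until their password is reset.
box.schema.user.passwd('bob', 'An0ther!pass')
```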

Warning

Given that PAP transmits a password as plain text, Tarantool requires configuring SSL/TLS for a connection.

The examples below show how to specify the authentication protocol on the client side:

  • For net.box, you can specify the authentication protocol using the auth_type URI parameter or the corresponding connection option:

    -- URI parameters
    conn = require('net.box').connect(
        'username:password@localhost:3301?auth_type=pap-sha256')
    
    -- URI parameters table
    conn = require('net.box').connect({
        uri = 'username:password@localhost:3301',
        params = {auth_type = 'pap-sha256'},
    })
    
    -- Connection options
    conn = require('net.box').connect('localhost:3301', {
        user = 'username',
        password = 'password',
        auth_type = 'pap-sha256',
    })
    
  • For replication configuration, the authentication protocol can be specified in URI parameters:

    -- URI parameters
    box.cfg{
        replication = {
            'replicator:password@localhost:3301?auth_type=pap-sha256',
        },
    }
    
    -- URI parameters table
    box.cfg{
        replication = {
            {
                uri = 'replicator:password@localhost:3301',
                params = {auth_type = 'pap-sha256'},
            },
        },
    }
    

If the authentication protocol isn’t specified explicitly on the client side, the client uses the protocol configured on the server via box.cfg.auth_type.

Audit log

Tarantool Enterprise has a built-in audit log that records events such as:

The audit log contains:

You can configure the following audit log parameters:

For more information on logging, see the following:

Access permissions to audit log files can be set up the same way as for any other Unix file system object, via chmod.
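For example, the commands below restrict a log file to its owner (read/write) and group (read only); /tmp/audit.log stands in for the real audit log path, which depends on your configuration:

```shell
# Demonstrate restricting access: rw for owner, r for group, none for others.
touch /tmp/audit.log
chmod 640 /tmp/audit.log
stat -c '%a' /tmp/audit.log   # prints 640 on Linux
```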

Traffic encryption

Since version 2.10.0, Tarantool Enterprise has built-in support for using SSL to encrypt client-server communications over binary connections, that is, between Tarantool instances in a cluster or when connecting to an instance via connectors using net.box.

Tarantool uses the OpenSSL library that is included in the delivery package. Please note that SSL connections use only TLSv1.2.

Configuration

To configure traffic encryption, you need to set special URI parameters for a particular connection. The parameters can be set for the following box.cfg options and the net.box method:

Below is the list of the parameters. In the next section, you can find details and examples on what should be configured on both the server side and the client side.

  • transport – enables SSL encryption for a connection if set to ssl. The default value is plain, which means encryption is off; if the parameter is not set, encryption is off as well. The other encryption-related parameters can be used only if transport = 'ssl' is set.

    Example:

    c = require('net.box').connect({
        uri = 'localhost:3301',
        params = {transport = 'ssl'}
    })
    
  • ssl_key_file – a path to a private SSL key file. Mandatory for a server. For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional. If the private key is encrypted, provide a password for it in the ssl_password or ssl_password_file parameter.

  • ssl_cert_file – a path to an SSL certificate file. Mandatory for a server. For a client, it’s mandatory if the ssl_ca_file parameter is set for a server; otherwise, optional.

  • ssl_ca_file – a path to a trusted certificate authorities (CA) file. Optional. If not set, the peer won’t be checked for authenticity.

    Both a server and a client can use the ssl_ca_file parameter:

    • If it’s on the server side, the server verifies the client.
    • If it’s on the client side, the client verifies the server.
    • If both sides have CA files, the server and the client verify each other.
  • ssl_ciphers – a colon-separated (:) list of SSL cipher suites the connection can use. See the Supported ciphers section for details. Optional. Note that the list is not validated: unknown cipher suites are simply ignored, and if no supported suite remains, Tarantool does not establish the connection and writes “no shared cipher found” to the log.

  • ssl_password – a password for an encrypted private SSL key. Optional. Alternatively, the password can be provided in ssl_password_file.

  • ssl_password_file – a text file with one or more passwords for encrypted private SSL keys (each on a separate line). Optional. Alternatively, the password can be provided in ssl_password.

    Tarantool applies the ssl_password and ssl_password_file parameters in the following order:

    1. If ssl_password is provided, Tarantool tries to decrypt the private key with it.
    2. If ssl_password is incorrect or isn’t provided, Tarantool tries all passwords from ssl_password_file one by one in the order they are written.
    3. If ssl_password and all passwords from ssl_password_file are incorrect, or none of them is provided, Tarantool treats the private key as unencrypted.
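For example, a server might combine an encrypted key with a password file. The sketch below illustrates this; all paths are placeholders:

```lua
-- A minimal sketch: an encrypted private key whose candidate passwords are
-- listed in a separate file, one per line. All paths are placeholders.
box.cfg{ listen = {
    uri = 'localhost:3301',
    params = {
        transport = 'ssl',
        ssl_key_file = '/path_to_encrypted_key_file',
        ssl_cert_file = '/path_to_cert_file',
        ssl_password_file = '/path_to_password_file'
    }
}}
```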

Configuration example:

box.cfg{ listen = {
    uri = 'localhost:3301',
    params = {
        transport = 'ssl',
        ssl_key_file = '/path_to_key_file',
        ssl_cert_file = '/path_to_cert_file',
        ssl_ciphers = 'HIGH:!aNULL',
        ssl_password = 'topsecret'
    }
}}

Supported ciphers

Tarantool Enterprise supports the following cipher suites:

  • ECDHE-ECDSA-AES256-GCM-SHA384
  • ECDHE-RSA-AES256-GCM-SHA384
  • DHE-RSA-AES256-GCM-SHA384
  • ECDHE-ECDSA-CHACHA20-POLY1305
  • ECDHE-RSA-CHACHA20-POLY1305
  • DHE-RSA-CHACHA20-POLY1305
  • ECDHE-ECDSA-AES128-GCM-SHA256
  • ECDHE-RSA-AES128-GCM-SHA256
  • DHE-RSA-AES128-GCM-SHA256
  • ECDHE-ECDSA-AES256-SHA384
  • ECDHE-RSA-AES256-SHA384
  • DHE-RSA-AES256-SHA256
  • ECDHE-ECDSA-AES128-SHA256
  • ECDHE-RSA-AES128-SHA256
  • DHE-RSA-AES128-SHA256
  • ECDHE-ECDSA-AES256-SHA
  • ECDHE-RSA-AES256-SHA
  • DHE-RSA-AES256-SHA
  • ECDHE-ECDSA-AES128-SHA
  • ECDHE-RSA-AES128-SHA
  • DHE-RSA-AES128-SHA
  • AES256-GCM-SHA384
  • AES128-GCM-SHA256
  • AES256-SHA256
  • AES128-SHA256
  • AES256-SHA
  • AES128-SHA
  • GOST2012-GOST8912-GOST8912
  • GOST2001-GOST89-GOST89

The Tarantool Enterprise static build has an embedded engine that supports the GOST cryptographic algorithms. If you use these algorithms for traffic encryption, specify the corresponding cipher suite in the ssl_ciphers parameter, for example:

box.cfg{ listen = {
    uri = 'localhost:3301',
    params = {
        transport = 'ssl',
        ssl_key_file = '/path_to_key_file',
        ssl_cert_file = '/path_to_cert_file',
        ssl_ciphers = 'GOST2012-GOST8912-GOST8912'
    }
}}

For detailed information on SSL ciphers and their syntax, refer to OpenSSL documentation.

Using environment variables

The URI parameters for traffic encryption can also be set via environment variables. For example:

export TT_LISTEN="localhost:3301?transport=ssl&ssl_cert_file=/path_to_cert_file&ssl_key_file=/path_to_key_file"

For details, refer to the Tarantool configuration reference.

Server-client configuration details

When configuring traffic encryption, you need to specify the necessary parameters on both the server side and the client side. Below is a summary of the options and parameters to be used, along with configuration examples.

Server side

  • Is configured via the box.cfg.listen option.
  • Mandatory URI parameters: transport, ssl_key_file and ssl_cert_file.
  • Optional URI parameters: ssl_ca_file, ssl_ciphers, ssl_password, and ssl_password_file.

Client side

  • Is configured via the box.cfg.replication option (see details) or net_box_object.connect().

Parameters:

  • If the server side has only the transport, ssl_key_file and ssl_cert_file parameters set, on the client side, you need to specify only transport = ssl as the mandatory parameter. All other URI parameters are optional.
  • If the server side also has the ssl_ca_file parameter set, on the client side, you need to specify transport, ssl_key_file and ssl_cert_file as the mandatory parameters. Other parameters – ssl_ca_file, ssl_ciphers, ssl_password, and ssl_password_file – are optional.

Configuration examples

Suppose, there is a master-replica set with two Tarantool instances:

  • 127.0.0.1:3301 – master (server)
  • 127.0.0.1:3302 – replica (client).

The examples below show the connection encryption configuration for two cases: when the trusted certificate authorities (CA) file is not set on the server side, and when it is. Only mandatory URI parameters are mentioned in these examples.

  1. Without CA
  • 127.0.0.1:3301 – master (server)

    box.cfg{
        listen = {
            uri = '127.0.0.1:3301',
            params = {
                transport = 'ssl',
                ssl_key_file = '/path_to_key_file',
                ssl_cert_file = '/path_to_cert_file'
            }
        }
    }
    
  • 127.0.0.1:3302 – replica (client)

    box.cfg{
        listen = {
            uri = '127.0.0.1:3302',
            params = {transport = 'ssl'}
        },
        replication = {
            uri = 'username:password@127.0.0.1:3301',
            params = {transport = 'ssl'}
        },
        read_only = true
    }
    
  2. With CA
  • 127.0.0.1:3301 – master (server)

    box.cfg{
        listen = {
            uri = '127.0.0.1:3301',
            params = {
                transport = 'ssl',
                ssl_key_file = '/path_to_key_file',
                ssl_cert_file = '/path_to_cert_file',
                ssl_ca_file = '/path_to_ca_file'
            }
        }
    }
    
  • 127.0.0.1:3302 – replica (client)

    box.cfg{
        listen = {
            uri = '127.0.0.1:3302',
            params = {
                transport = 'ssl',
                ssl_key_file = '/path_to_key_file',
                ssl_cert_file = '/path_to_cert_file'
            }
        },
        replication = {
            uri = 'username:password@127.0.0.1:3301',
            params = {
                transport = 'ssl',
                ssl_key_file = '/path_to_key_file',
                ssl_cert_file = '/path_to_cert_file'
            }
        },
        read_only = true
    }
    

Recommendations on security hardening

This section lists recommendations that can help you harden the cluster’s security.

Encrypting traffic

Since version 2.10.0, Tarantool Enterprise has built-in support for using SSL to encrypt the client-server communications over binary connections, that is, between Tarantool instances in a cluster. For details on enabling SSL encryption, see the Traffic encryption section of this guide.

In case the built-in encryption is not set for particular connections, consider the following security recommendations:

  • setting up connection tunneling, or
  • encrypting the actual data stored in the database.

For more information on data encryption, see the crypto module reference.

The HTTP server module provided by rocks does not support the HTTPS protocol. To set up a secure connection for a client (e.g., REST service), consider hiding the Tarantool instance (router if it is a cluster of instances) behind an Nginx server and setting up an SSL certificate for it.

To make sure that no information can be intercepted ‘from the wild’, run Nginx on the same physical server as the instance and set up their communication over a Unix socket. For more information, see the socket module reference.

Firewall configuration

To protect the cluster from any unwanted network activity ‘from the wild’, configure the firewall on each server to allow traffic on ports listed in Network requirements.

If you are using static IP addresses, whitelist them on each server, as the cluster has a full mesh network topology. Consider blacklisting all other addresses on all servers except the router (running behind the Nginx server).

Tarantool Enterprise does not provide defense against DoS or DDoS attacks. Consider using third-party software instead.

Data integrity

Tarantool Enterprise does not keep checksums or provide the means to control data integrity. However, it ensures data persistence using a write-ahead log, regularly snapshots the entire data set to disk, and checks the data format whenever it reads the data back from the disk. For more information, see the Data persistence section.

Security audit

This document will help you audit the security of a Tarantool Enterprise cluster. It explains certain security aspects, their rationale, and the ways to check them. For details on how to configure Tarantool Enterprise and its infrastructure for each aspect, refer to the security hardening guide.

Encryption of external iproto traffic

Tarantool Enterprise uses the iproto binary protocol for replicating data between instances and also in the connector libraries.

Since version 2.10.0, Tarantool Enterprise has the built-in support for using SSL to encrypt the client-server communications over binary connections. For details on enabling SSL encryption, see the Traffic encryption section of this document.

In case the built-in encryption is not enabled, we recommend using a VPN to secure data exchange between data centers.

Closed iproto ports

When a Tarantool Enterprise cluster does not use iproto for external requests, connections to the iproto ports should be allowed only between Tarantool instances.

For more details on configuring ports for iproto, see the advertise_uri section in the Cartridge documentation.

HTTPS connection termination

A Tarantool Enterprise instance can accept HTTP connections from external services or to access the administrative web UI. All such connections must go through an HTTPS-providing web server, such as NGINX, running on the same host. This requirement applies to both virtual and physical hosts. Running HTTP traffic through a few separate hosts with HTTPS termination is not sufficiently secure.

Closed HTTP ports

Tarantool Enterprise accepts HTTP connections on a specific port, configured with the http_port: <number> value (see configuring Cartridge instances). This port must be available only on the same host, so that nginx can connect to it.

Check that the configured HTTP port is closed and that the HTTPS port (443 by default) is open.

Restricted access to the administrative console

The console module provides a way to connect to a running instance and run custom Lua code. This can be useful for development and administration. The following code examples open connections on a TCP port and on a UNIX socket.

console.listen(<port number>)
console.listen('/var/lib/tarantool/socket_name.sock')

Opening an administrative console through a TCP port is always unsafe. Check that there are no calls like console.listen(<port_number>) in the code.

Connecting through a socket requires having the write permission on the /var/lib/tarantool directory. Check that write permission to this directory is limited to the tarantool user.

Limiting the guest user

Connecting to the instance with tarantoolctl connect without user credentials (under the guest user) must be disabled.

There are two ways to check this vulnerability; for more details on both, refer to the documentation on access control.
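One quick check, sketched below, is to inspect the guest user's privileges from the administrative console; the exact output depends on your schema:

```lua
-- A sketch of one possible check: list the guest user's privileges.
-- If guest has broad rights (for example, 'execute' on 'universe'),
-- unauthenticated connections can run code on the instance.
box.schema.user.info('guest')
```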

Authorization in the web UI

Using the web interface must require logging in with username and password. See more details in the documentation on configuring web interface authorization.

Running under the tarantool user

All Tarantool Enterprise instances should be running under the tarantool user.

Limiting access to the tarantool user

The tarantool user must be a non-privileged user without the sudo permission. Also, it must not have a password set to prevent logging in via SSH or su.

Keeping two or more snapshots

In order to have a reliable backup, a Tarantool Enterprise instance must keep two or more latest snapshots. This should be checked on each Tarantool Enterprise instance.

The checkpoint_count value determines the number of snapshots kept. Configuration values are primarily set in the configuration files but can be overridden with environment variables and command-line arguments. So, it’s best to check both the values in the configuration files and the actual values using the console:

tarantool> box.cfg.checkpoint_count
---
- 2

Enabled write-ahead logging (WAL)

Tarantool Enterprise records all incoming data in the write-ahead log (WAL). The WAL must be enabled to ensure that data will be recovered in case of a possible instance restart.

Secure values of wal_mode are write and fsync:

tarantool> box.cfg.wal_mode
---
- write

An exception to this requirement is an instance that processes data which can be freely rejected, for example, when Tarantool Enterprise is used for caching. In this case, the WAL can be disabled to reduce the I/O load.

For more details, see the wal_mode reference.

The logging level is INFO or higher

The logging level should be set to 5 (INFO), 6 (VERBOSE), or 7 (DEBUG). Application logs will then have enough information to research a possible security breach.

tarantool> box.cfg.log_level
---
- 5

For a full list of logging levels, see the log_level reference.

Logging with journald

Tarantool Enterprise should use journald for logging.
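One way to achieve this, sketched below, is to route Tarantool logs to syslog, which journald collects on systemd-based hosts; the identity and facility values are examples:

```lua
-- Route logging to syslog; on systemd hosts, journald picks these messages up.
box.cfg{ log = 'syslog:identity=tarantool,facility=local0' }
```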

LDAP authorization

This chapter describes how to manage the access roles for LDAP users authorizing in your Cartridge application.

Setting up this feature is twofold:

Note

For information on setting up the authorization of external users in your application, refer to Implementing LDAP authorization in the web interface.

Enabling LDAP authorization

First, enable the LDAP authorization function in your application development project:

Note

If you don’t have a development project yet, refer to Developer’s guide on how to create it.

  1. In your development project, find a .rockspec file and specify the following dependency:

    dependencies = {
        'cartridge-auth-extension'
    }
    
  2. In an initialization Lua file of your project, specify the cartridge-auth-extension cluster role in the Cartridge configuration. The role enables storing authorized users and validating the LDAP configuration.

    cartridge.cfg({
        roles = {
           'cartridge-auth-extension',
        },
        auth_backend_name = 'cartridge-auth-extension',
    })
    
  3. Deploy and start your application. For details, refer to Developer’s guide.

Configuring LDAP authorization

After starting your application, you need to configure LDAP authorization. It can be done via the GUI administrative console.

  1. In a web browser, open the GUI administrative console of your application.
  2. If you have the application instances already configured, proceed to the next step. Otherwise, refer to Deploying the cluster on how to configure the cluster.
  3. In the GUI administrative console, navigate to the Code tab. Create the following YAML configuration files and specify the necessary parameters. Below is an example configuration and a description of its parameters.

Note

If you set the authorization mode as local in the auth_extension.yml file, you don’t need to define LDAP configuration parameters in the ldap.yml file.

Configuration parameters:

Tuple compression

Tuple compression, introduced in Tarantool 2.10.0, aims to save memory space. Typically, it decreases the volume of stored data by 15%. However, the exact volume saved depends on the type of data.

Two compression algorithms are currently supported: lz4 and zstd. To learn about the performance costs of each algorithm, check the appendix.

You do not compress tuples themselves, just the fields inside these tuples. You can only compress non-indexed fields. Compression works best when JSON is stored in the field.

Tuple compression is possible for memtx spaces only. Vinyl spaces do not support compression.

How to create compressed fields

First, create a memtx space:

box.schema.space.create('TEST')

Then create an index for this space, for example:

box.space.TEST:create_index('tree', {
            type = 'TREE',
            parts = {
                {1, 'unsigned'},
                {3, 'unsigned'},
                {5, 'unsigned'}
        }})

Create a format to declare field names and types. A single indexed field would be enough; this example uses several indexed fields just to demonstrate a more complicated case:

box.space.TEST:format({
            {name = 'A', type = 'unsigned'},
            {name = 'B', type = 'string', compression = 'zstd'},
            {name = 'C', type = 'unsigned'},
            {name = 'D', type = 'unsigned', compression = 'lz4'},
            {name = 'E', type = 'unsigned'}
        })

In this example, fields number 1, 3, and 5 have indexes, so they cannot be compressed. Fields 2 and 4 can be compressed. They have the compression formats compression = 'zstd' and compression = 'lz4', respectively. You can apply different compression algorithms to different fields in a single space.

Now, the new tuples that you create and add to the space ‘TEST’ will be compressed.

When you read a compressed tuple, you do not need to decompress it back yourself.

If the size of the field is too small, the field will not be compressed. It is not an error, so you will see no error message. The field will just have the same size as it had before the compression.
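Putting the steps above together, writes and reads on the 'TEST' space stay transparent; the sample data below is illustrative:

```lua
-- Field B ('zstd') and field D ('lz4') are compressed on insert.
-- Repetitive string data like this compresses well.
box.space.TEST:insert{1, string.rep('sample text ', 100), 2, 3, 4}

-- Reads decompress transparently; no manual decompression step is needed.
box.space.TEST:select{}
```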

How to check whether a field is compressed

To determine which fields in your space are compressed, run space_object:format() on the space. If a field is compressed, the format will include the compression type. Example output:

box.space.ledger:format({
            {name = 'id', type = 'unsigned'}, -- this field is uncompressed
            {name = 'client_details', type = 'array', compression = 'zstd'},
            {name = 'notes', type = 'string', compression = 'lz4'},
        })

What tuples can be compressed

In Tarantool 2.10.0, you can enable compression for an existing field. All the tuples added after that will have this field compressed. However, this doesn’t affect the tuples already stored in the space – they remain uncompressed until the snapshot and restart.

How to enable compression for already created tuples

With the help of space:upgrade(), you can enable compression for existing data and migrate already created tuples. Just specify the fields to be compressed in the format passed to space:upgrade(). Everything works transparently: data is compressed on write and decompressed on read. You can also compress data in existing storages.

Here’s an example of how to compress an existing field:

  1. Create a space without compression (named ledger in this example) and add several tuples.

  2. Suppose that you want fields 2 and 3 to be compressed from now on. To enable compression, change the format:

    local format = box.space.ledger:format()
    format[2].compression = 'zstd'
    format[3].compression = 'zstd'
    box.space.ledger:format(format)
    
  3. To finalize the change, create a snapshot by running box.snapshot() and restart Tarantool.

  4. From now on, all the tuples that you add to the space have fields 2 and 3 compressed. After the snapshot and restart, all old tuples are also compressed in memory during recovery.

Errors

“Indexed field does not support compression”

You can only compress non-indexed fields. If you try to compress an indexed field, you will get an error message: “Indexed field does not support compression”.

“Vinyl does not support compression”

Tuple compression is possible for memtx spaces. If you create a vinyl space with compression, you will get an error message: “Vinyl does not support compression”.

“Failed to create space ‘T’: field 1 has unknown compression type”

If you set a compression format that is not zstd or lz4, you will get an error message: “Failed to create space ‘T’: field 1 has unknown compression type”. Here field 1 is the name of an example field.

WAL extensions

WAL extensions allow you to add auxiliary information to each write-ahead log record. For example, you can enable storing an old and new tuple for each CRUD operation performed. This information might be helpful for implementing a CDC (Change Data Capture) utility that transforms a data replication stream.

Configuration

To configure WAL extensions, use the wal_ext configuration property. Inside the wal_ext block, you can enable storing old and new tuples as follows:

Note that records with additional fields are replicated as follows:

Example

The table below demonstrates how write-ahead log records might look for the specific CRUD operations if storing old and new tuples is enabled for the bands space.

Operation: insert
Example: bands:insert{4, 'The Beatles', 1960}
WAL information:
  • new_tuple: [4, 'The Beatles', 1960]
  • tuple: [4, 'The Beatles', 1960]

Operation: delete
Example: bands:delete{4}
WAL information:
  • key: [4]
  • old_tuple: [4, 'The Beatles', 1960]

Operation: update
Example: bands:update({2}, {{'=', 2, 'Pink Floyd'}})
WAL information:
  • new_tuple: [2, 'Pink Floyd', 1965]
  • old_tuple: [2, 'Scorpions', 1965]
  • key: [2]
  • tuple: [['=', 2, 'Pink Floyd']]

Operation: upsert
Example: bands:upsert({2, 'Pink Floyd', 1965}, {{'=', 2, 'The Doors'}})
WAL information:
  • new_tuple: [2, 'The Doors', 1965]
  • old_tuple: [2, 'Pink Floyd', 1965]
  • operations: [['=', 2, 'The Doors']]
  • tuple: [2, 'Pink Floyd', 1965]

Operation: replace
Example: bands:replace{1, 'The Beatles', 1960}
WAL information:
  • old_tuple: [1, 'Roxette', 1986]
  • new_tuple: [1, 'The Beatles', 1960]
  • tuple: [1, 'The Beatles', 1960]

Storing both old and new tuples is especially useful for the update operation because a write-ahead log record contains only a key value.

Note

You can use the tt cat command to see the contents of a write-ahead log.
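For example, the invocation below prints the records of a single log file; the path is a placeholder:

```shell
# Print the contents of a write-ahead log file; the path is a placeholder.
tt cat /var/lib/tarantool/instance_name/00000000000000000000.xlog
```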

Read views

A read view is an in-memory snapshot of the entire database that isn’t affected by future data modifications. Read views provide access to database spaces and their indexes and enable you to retrieve data using the same select and pairs operations.

Read views can be used to make complex analytical queries. This reduces the load on the main database and improves RPS for a single Tarantool instance.

To reduce memory consumption and improve performance, Tarantool creates read views using the copy-on-write technique. In this case, duplicating the entire data set is not required: Tarantool duplicates only the blocks modified after a read view is created.

Note

Tarantool supports read views starting from v2.11.0 and enables working with them using both the Lua and C APIs.

Limitations

Read views have the following limitations:

Working with read views

Creating a read view

To create a read view, call the box.read_view.open() function. The snippet below shows how to create a read view with the read_view1 name.

tarantool> read_view1 = box.read_view.open({name = 'read_view1'})

After creating a read view, you can see the information about it by calling read_view_object:info().

tarantool> read_view1:info()
---
- timestamp: 66.606817935
  signature: 24
  is_system: false
  status: open
  vclock: {1: 24}
  name: read_view1
  id: 1
...

To list all the created read views, call the box.read_view.list() function.

Querying data

After creating a read view, you can access database spaces using the read_view_object.space field. This field provides access to a space object that exposes the select, get, and pairs methods with the same behavior as corresponding box.space methods.

The example below shows how to select 4 records from the bands space:

tarantool> read_view1.space.bands:select({}, {limit = 4})
---
- - [1, 'Roxette', 1986]
  - [2, 'Scorpions', 1965]
  - [3, 'Ace of Base', 1987]
  - [4, 'The Beatles', 1960]
...

Similarly, you can retrieve data by the specific index.

tarantool> read_view1.space.bands.index.year:select({}, {limit = 4})
---
- - [4, 'The Beatles', 1960]
  - [2, 'Scorpions', 1965]
  - [1, 'Roxette', 1986]
  - [3, 'Ace of Base', 1987]
...

Closing a read view

When a read view is no longer needed, close it using the read_view_object:close() method because a read view may consume a substantial amount of memory.

tarantool> read_view1:close()
---
...

Otherwise, a read view is closed implicitly when the read view object is collected by the Lua garbage collector.

After the read view is closed, its status is set to closed. On an attempt to use it, an error is raised.
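For example, assuming the read view from the snippets above has been closed, a subsequent query fails:

```lua
-- Querying a closed read view raises an error.
read_view1.space.bands:select()
```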

Example

A Tarantool session below demonstrates how to open a read view, get data from this view, and close it. To repeat these steps, you need to bootstrap a Tarantool instance as described in Using data operations (you can skip creating secondary indexes).

  1. Insert test data.

    tarantool> bands:insert{1, 'Roxette', 1986}
               bands:insert{2, 'Scorpions', 1965}
               bands:insert{3, 'Ace of Base', 1987}
               bands:insert{4, 'The Beatles', 1960}
    
  2. Create a read view by calling the open function. Then, make sure that the read view status is open.

    tarantool> read_view1 = box.read_view.open({name = 'read_view1'})
    
    tarantool> read_view1.status
    ---
    - open
    ...
    
  3. Change data in a database using the delete and update operations.

    tarantool> bands:delete(4)
    ---
    - [4, 'The Beatles', 1960]
    ...
    tarantool> bands:update({2}, {{'=', 2, 'Pink Floyd'}})
    ---
    - [2, 'Pink Floyd', 1965]
    ...
    
  4. Query a read view to make sure it contains a snapshot of data before a database is updated.

    tarantool> read_view1.space.bands:select()
    ---
    - - [1, 'Roxette', 1986]
      - [2, 'Scorpions', 1965]
      - [3, 'Ace of Base', 1987]
      - [4, 'The Beatles', 1960]
    ...
    
  5. Close a read view.

    tarantool> read_view1:close()
    ---
    ...
    

Lua API

This topic describes the Lua API for working with read views.

box.read_view.open({opts})

Create a new read view.

Parameters:opts (table) – (optional) configuration options for a read view. For example, the name option specifies a read view name. If name is not specified, the read view name is set to unknown.
Returns:a created read view object
Return type:read_view_object

Example:

tarantool> read_view1 = box.read_view.open({name = 'read_view1'})
class read_view_object

An object that represents a read view.

read_view_object:info()

Get information about a read view such as a name, status, or ID. All the available fields are listed below in the object options.

Returns:information about a read view
Return type:table
read_view_object:close()

Close a read view. After the read view is closed, its status is set to closed. On an attempt to use it, an error is raised.

status

A read view status. The possible values are open and closed.

Return type:string
id

A unique numeric identifier of a read view.

Return type:number
name

A read view name. You can specify a read view name in the box.read_view.open() arguments.

Return type:string
is_system

Determine whether a read view is system. For example, system read views can be created to make a checkpoint or join a new replica.

Return type:boolean
timestamp

The fiber.clock() value at the moment of opening a read view.

Return type:number
vclock

The box.info.vclock value at the moment of opening a read view.

Return type:table
signature

The box.info.signature value at the moment of opening a read view.

Return type:number
space

Get access to database spaces included in a read view. You can use this field to query space data.

Return type:space object

C API

This topic describes the C API for working with read views. The C API is MT-safe and provides the ability to use a read view from any thread, not only from the main (TX) thread.

The C API has the following specifics:

Note

You can learn how to call C code using stored procedures in the C tutorial.

Data types

The opaque data types below represent raw read views and an iterator over data in a raw read view. Note that there is no special data type for tuples retrieved from a read view. Tuples are returned as raw MessagePack data (const char *).

typedef struct box_raw_read_view box_raw_read_view_t

A raw database read view.

typedef struct box_raw_read_view_space box_raw_read_view_space_t

A space in a raw read view.

typedef struct box_raw_read_view_index box_raw_read_view_index_t

An index in a raw read view.

typedef struct box_raw_read_view_iterator box_raw_read_view_iterator_t

An iterator over data in a raw read view.

Creating and destroying read views

To create or destroy a read view, use the functions below.

box_raw_read_view_t * box_raw_read_view_new(const char *name)

Open a raw read view with the specified name and get a pointer to this read view. In the case of error, returns NULL and sets box_error_last(). This function may be called from the main (TX) thread only.

Parameters:
  • char *name (const) –

    (optional) a read view name; if name is not specified, a read view name is set to unknown

Returns:

a pointer to a read view

void box_raw_read_view_delete(box_raw_read_view_t *rv)

Close a raw read view and release all resources associated with it. This function may be called from the main (TX) thread only.

Parameters:
  • *rv (box_raw_read_view_t) –

    a pointer to a read view

Note

Read views created using box_raw_read_view_new are displayed in box.read_view.list() along with read views created in Lua.

Spaces and indexes

To fetch data from a read view, you need to specify an index to fetch the data from. The following functions are available for looking up spaces and indexes in a read view object.

box_raw_read_view_space_t * box_raw_read_view_space_by_id(const box_raw_read_view_t *rv, uint32_t space_id)

Find a space by ID in a raw read view. If not found, returns NULL and sets box_error_last().

Parameters:
  • box_raw_read_view_t *rv (const) –

    a pointer to a read view

  • space_id (uint32_t) – a space identifier
Returns:

a pointer to a space

box_raw_read_view_space_t * box_raw_read_view_space_by_name(const box_raw_read_view_t *rv, const char *space_name, uint32_t space_name_len)

Find a space by name in a raw read view. If not found, returns NULL and sets box_error_last().

Parameters:
  • box_raw_read_view_t *rv (const) –

    a pointer to a read view

  • char *space_name (const) –

    a space name

  • space_name_len (uint32_t) – a space name length
Returns:

a pointer to a space

box_raw_read_view_index_t * box_raw_read_view_index_by_id(const box_raw_read_view_space_t *space, uint32_t index_id)

Find an index by ID in a read view’s space. If not found, returns NULL and sets box_error_last().

Parameters:
  • box_raw_read_view_space_t *space (const) –

    a pointer to a read view’s space

  • index_id (uint32_t) – an index identifier
Returns:

a pointer to an index

box_raw_read_view_index_t * box_raw_read_view_index_by_name(const box_raw_read_view_space_t *space, const char *index_name, uint32_t index_name_len)

Find an index by name in a read view’s space. If not found, returns NULL and sets box_error_last().

Parameters:
  • box_raw_read_view_space_t *space (const) –

    a pointer to a space

  • char *index_name (const) –

    an index name

  • index_name_len (uint32_t) – an index name length
Returns:

a pointer to an index

Iteration and lookup

The functions below provide the ability to look up a tuple by the key or create an iterator over a read view index.

Note

Methods of the read view iterator are safe to call from any thread, but only from one thread at a time. This means that an iterator should be thread-local.

int box_raw_read_view_get(const box_raw_read_view_index_t *index, const char *key, const char *key_end, const char **data, uint32_t *size)

Look up a tuple in a read view’s index. If found, the data and size out arguments return a pointer to and the size of tuple data. If not found, *data is set to NULL and *size is set to 0.

Parameters:
  • box_raw_read_view_index_t *index (const) –

    a pointer to a read view’s index

  • char *key (const) –

    a pointer to the first byte of the MsgPack data that represents the search key

  • char *key_end (const) –

    a pointer to the byte following the last byte of the MsgPack data that represents the search key

  • char **data (const) –

    a pointer to the tuple data

  • *size (uint32_t) –

    the size of tuple data

Returns:

0 on success; in the case of error, returns -1 and sets box_error_last()

int box_raw_read_view_iterator_create(box_raw_read_view_iterator_t *it, const box_raw_read_view_index_t *index, int type, const char *key, const char *key_end)

Create an iterator over a raw read view index. The initialized iterator object returned by this function remains valid and may be safely used until it’s destroyed or the read view is closed. When the iterator object is no longer needed, it should be destroyed using box_raw_read_view_iterator_destroy().

Parameters:
  • *it (box_raw_read_view_iterator_t) –

    an iterator over a raw read view index

  • box_raw_read_view_index_t *index (const) –

    a pointer to a read view index

  • type (int) – an iteration direction represented by the iterator_type
  • char *key (const) –

    a pointer to the first byte of the MsgPack data that represents the search key

  • char *key_end (const) –

    a pointer to the byte following the last byte of the MsgPack data that represents the search key

Returns:

0 on success; in the case of error, returns -1 and sets box_error_last()

int box_raw_read_view_iterator_next(box_raw_read_view_iterator_t *it, const char **data, uint32_t *size)

Retrieve the current tuple and advance the given iterator over a raw read view index. The pointer to and the size of tuple data are returned in the data and the size out arguments. The data returned by this function remains valid and may be safely used until the read view is closed.

Parameters:
  • *it (box_raw_read_view_iterator_t) –

    an iterator over a read view index

  • char **data (const) –

    a pointer to the tuple data; at the end of iteration, *data is set to NULL

  • *size (uint32_t) –

    the size of tuple data; at the end of iteration, *size is set to 0

Returns:

0 on success; in the case of error, returns -1 and sets box_error_last()

void box_raw_read_view_iterator_destroy(box_raw_read_view_iterator_t *it)

Destroy an iterator over a raw read view index. The iterator object should not be used after calling this function, but the data returned by the iterator may be safely dereferenced until the read view is closed.

Parameters:
  • *it (box_raw_read_view_iterator_t) –

    an iterator over a raw read view index

Space format

A space object’s methods below provide the ability to get names and types of space fields.

uint32_t box_raw_read_view_space_field_count(const box_raw_read_view_space_t *space)

Get the number of fields defined in the format of a read view space.

Parameters:
  • box_raw_read_view_space_t *space (const) –

    a pointer to a read view space

Returns:

the number of fields

const char * box_raw_read_view_space_field_name(const box_raw_read_view_space_t *space, uint32_t field_no)

Get the name of a field defined in the format of a read view space. If the field number is greater than the total number of fields defined in the format, NULL is returned. The string returned by this function is guaranteed to remain valid until the read view is closed.

Parameters:
  • box_raw_read_view_space_t *space (const) –

    a pointer to a read view space

  • field_no (uint32_t) – the field number (starts with 0)
Returns:

the name of a field

const char * box_raw_read_view_space_field_type(const box_raw_read_view_space_t *space, uint32_t field_no)

Get the type of a field defined in the format of a read view space. If the field number is greater than the total number of fields defined in the format, NULL is returned. The string returned by this function is guaranteed to remain valid until the read view is closed.

Parameters:
  • box_raw_read_view_space_t *space (const) –

    a pointer to a read view space

  • field_no (uint32_t) – the field number (starts with 0)
Returns:

the type of a field

Flight recorder

The Tarantool flight recorder is an event collection tool that gathers various information about a working Tarantool instance, such as logs, metrics, and request/response data.

This information helps you investigate incidents related to a Tarantool instance crash.

Enabling the flight recorder

The flight recorder is disabled by default and can be enabled and configured for a specific Tarantool instance. To enable the flight recorder, set the flightrec_enabled configuration option to true. This option is dynamic and can be changed at runtime by calling box.cfg{}:

box.cfg{ flightrec_enabled = true }

After flightrec_enabled is set to true, the flight recorder starts collecting data in the flight recording file current.ttfr. This file is stored in the memtx_dir directory. If the instance crashes and reboots, Tarantool rotates the flight recording: current.ttfr is renamed to <timestamp>.ttfr (for example, 20230411T050721.ttfr) and the new current.ttfr file is created for collecting data. In the case of correct shutdown (for example, using os.exit()), Tarantool continues writing to the existing current.ttfr file after restart.

Note

Old flight recordings should be removed manually.

Configuration

This section describes options related to the flight recorder configuration. Note that all options are dynamic and can be changed at runtime.

Logs

This section describes the flight recorder settings related to logging. For example, you can configure a more detailed logging level in the flight recorder for deeper analysis.

flightrec_logs_size

Specifies the size (in bytes) of the log storage. You can set this option to 0 to disable the log storage.

Type: integer
Default: 10485760
Environment variable: TT_FLIGHTREC_LOGS_SIZE
flightrec_logs_max_msg_size

Specifies the maximum size (in bytes) of the log message. The log message is truncated if its size exceeds this limit.

Type: integer
Default: 4096
Maximum: 16384
Environment variable: TT_FLIGHTREC_LOGS_MAX_MSG_SIZE
flightrec_logs_log_level

Specifies how detailed the flight recorder log is. You can learn more about log levels from the log_level option description. Note that the flightrec_logs_log_level value may differ from log_level.

Type: integer
Default: 6
Environment variable: TT_FLIGHTREC_LOGS_LOG_LEVEL
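The logging options above can be combined in a single box.cfg{} call. A minimal sketch (the values shown are illustrative, not recommendations):

```lua
-- Enable the flight recorder with a dedicated log storage.
-- The flight recorder log level is independent of box.cfg.log_level,
-- so a more verbose level can be used here without flooding the main log.
box.cfg{
    flightrec_enabled = true,
    flightrec_logs_size = 10485760,     -- 10 MB of log storage
    flightrec_logs_max_msg_size = 4096, -- truncate longer messages
    flightrec_logs_log_level = 7,       -- debug
}
```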

Metrics

This section describes the flight recorder settings related to collecting metrics.

flightrec_metrics_period

Specifies the time period (in seconds) for which metrics are stored from the moment of dump. In other words, this value defines how much historical metrics data is available up to the moment of a crash. The frequency of metric dumps is defined by flightrec_metrics_interval.

Type: integer
Default: 180
Environment variable: TT_FLIGHTREC_METRICS_PERIOD
flightrec_metrics_interval

Specifies the time interval (in seconds) that defines the frequency of dumping metrics. This value shouldn’t exceed flightrec_metrics_period.

Type: number
Default: 1.0
Minimum: 0.001
Environment variable: TT_FLIGHTREC_METRICS_INTERVAL

Note

Given that the average size of a metrics entry is 2 kB, you can estimate the size of the metrics storage as follows:

(flightrec_metrics_period / flightrec_metrics_interval) * 2 kB

With the default settings, this amounts to (180 / 1.0) * 2 kB = 360 kB.

Requests

This section lists the flight recorder settings related to storing the request and response data.

flightrec_requests_size

Specifies the size (in bytes) of the storage for request and response data. You can set this option to 0 to disable storing requests and responses.

Type: integer
Default: 10485760
Environment variable: TT_FLIGHTREC_REQUESTS_SIZE
flightrec_requests_max_req_size

Specifies the maximum size (in bytes) of a request entry. A request entry is truncated if this size is exceeded.

Type: integer
Default: 16384
Environment variable: TT_FLIGHTREC_REQUESTS_MAX_REQ_SIZE
flightrec_requests_max_res_size

Specifies the maximum size (in bytes) of a response entry. A response entry is truncated if this size is exceeded.

Type: integer
Default: 16384
Environment variable: TT_FLIGHTREC_REQUESTS_MAX_RES_SIZE

Upgrading space schema

In Tarantool, migration refers to any change in a data schema, for example, creating an index, adding a field, or changing a field format. If you need to change a data schema, there are several possible cases:

To solve the task of migrating the data, you can:

Space upgrade overview

The space:upgrade() feature allows users to upgrade the format of a space and the tuples stored in it without blocking the database.

How to apply space upgrade

First, specify an upgrade function – a function that will convert the tuples in the space to a new format. The requirements for this function are listed below.

  • The upgrade function takes two arguments. The first argument is a tuple to be upgraded. The second one is optional and contains additional information stored in a plain Lua object. If omitted, the second argument is nil.
  • The function returns a new tuple or a Lua table. For example, it can add a new field to the tuple. The new tuple must conform to the new space format set by the upgrade operation.
  • The function should be registered with box.schema.func.create. It should also be stored, deterministic, and written in Lua.
  • The function should not change the primary key of the tuple.
  • The function should be idempotent: f(f(t)) = f(t). This is necessary because the function is applied to all tuples returned to the user, and some of them may have already been upgraded in the background.
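As an illustration, here is a sketch of a function that meets these requirements; the function name, field layout, and use of the second argument are hypothetical:

```lua
-- A hypothetical upgrade function that appends a third field to
-- two-field tuples. It is stored (created with a body), deterministic,
-- written in Lua, keeps the primary key intact, and is idempotent:
-- tuples that already have three fields pass through unchanged.
box.schema.func.create('add_flag', {
    language = 'lua',
    is_deterministic = true,
    body = [[function(tuple, arg)
        if #tuple == 2 then
            local default = (arg and arg.default) or false
            return tuple:update({{'!', 3, default}})
        end
        return tuple
    end]],
})
```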

Then define a new space format. This step is optional. However, it could be useful if, for example, you want to add a new column with data. For details, check the Usage Example section.

The next optional step is to choose an upgrade mode. There are three modes: upgrade, dryrun, and dryrun+upgrade. The default value is upgrade. To check an upgrade function without applying any changes, choose the dryrun mode. To run a space upgrade without testing the function, pick the upgrade mode. If you want to apply both the test and the actual upgrade, use the dryrun+upgrade option. For details, see the Upgrade Modes section.
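As a sketch, assuming a space test and a registered upgrade function convert (the names come from the usage example in this section), the mode is selected like this:

```lua
-- Check the upgrade function without applying any changes:
box.space.test:upgrade({func = 'convert', mode = 'dryrun'})

-- Run the dry run first and, if it succeeds, the actual upgrade:
box.space.test:upgrade({func = 'convert', mode = 'dryrun+upgrade'})
```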

How the upgrade works

The user defines an upgrade function. Each tuple of the chosen space is passed through the function, which converts the tuple from the old format to the new one. The function is applied in the background to all tuples stored in the space. In addition, it is applied to all tuples returned to the user via the box API (for example, select, get). As a result, the space appears to be upgraded instantly.

Keep in mind that space:upgrade differs from space_object:format() in the following ways:

  • Non-blocking
    • space:upgrade(): Yes. It returns tuples in the new format, whether or not they have already been converted.
    • space:format(): Yes.
  • Setting a format incompatible with the current one
    • space:upgrade(): Yes, for non-indexed field types only.
    • space:format(): No, the format can only be expanded in a compatible way.
  • Visibility of changes
    • space:upgrade(): Immediate. All changes are visible and replicated immediately. New data must conform to the new format immediately after the call.
    • space:format(): After data validation. Validation starts in the background and does not block the database. Inserting data incompatible with the new format is allowed before validation is completed – in this case space:format fails.
  • Cancellation (error/restart)
    • space:upgrade(): Writes its state to a system table. On restart, the operation continues. On error, the operation must be restarted manually; any other attempt to change the table fails.
    • space:format(): Leaves no traces.
  • Setting an upgrade function
    • space:upgrade(): Yes. The upgrade may take a while to traverse the space and transform tuples.
    • space:format(): No.

Note

At the moment, the feature is not supported for vinyl spaces.

User API

The space:upgrade() method is added to the space object:

space:upgrade({func[, arg, format, mode, is_async]})
Parameters:
  • func (string/integer) – upgrade function name (string) or ID (integer). For details, see the upgrade function requirements section.
  • arg – additional information passed to the upgrade function in the second argument. The option accepts any Lua value that can be encoded in MsgPack, which means that the msgpack.encode(arg) should succeed. For example, one can pass a scalar or a Lua table. The default value is nil.
  • format (map) – new space format. The requirements for this are the same as for any other space:format(). If the field is omitted, the space format will remain the same as before the upgrade.
  • mode (string) – upgrade mode. Possible values: upgrade, dryrun, dryrun+upgrade. The default value is upgrade.
  • is_async (boolean) – the flag indicates whether to wait until the upgrade operation is complete before exiting the function. The default value is false – the function is blocked until the upgrade operation is finished.
Return:

object describing the status of the operation (also known as future). The methods of the object are described below.

class future_object
info(dryrun, status, func, arg, owner, error, progress)

Shows information about the state of the upgrade operation.

Parameters:
  • dryrun (boolean) – dry run mode flag. Possible values: true for a dry run, nil for an actual upgrade.
  • status (string) – upgrade status. Possible values: inprogress, waitrw, error, replica, done.
  • func (string/integer) – name of the upgrade function. It is the same as passed to the space:upgrade method. The field is nil if the status is done.
  • arg – additional information passed to the upgrade function. It is the same as for the space:upgrade method. The field is nil if it is omitted in the space:upgrade.
  • owner (string) – UUID of the instance running the upgrade (see box.info.uuid). The field is nil if the status is done.
  • error (string) – error message if the status is error, otherwise nil.
  • progress (string) – completion percentage if the status is inprogress/waitrw, otherwise nil.
Returns:

a table with information about the state of the upgrade operation

Return type:

table

The fields can also be accessed directly, without calling the info() method. For example, future.status is the same as future:info().status.

wait([timeout])

Waits until the upgrade operation is completed or a timeout occurs. An operation is considered completed if its status is done or error.

Parameters:timeout (double) – if the timeout argument is omitted, the method waits as long as it takes.
Returns: true if the operation has been completed, false on timeout
Return type:boolean
cancel()

Cancels the upgrade operation if it is currently running. Otherwise, an exception is thrown. A canceled upgrade operation completes with an error.

Returns:none
Return type:void

Running space:upgrade() with is_async = false or the is_async field not set is equal to:

local future = space:upgrade({func = 'my_func', is_async = true})
future:wait()
return future

If called without arguments, space:upgrade() returns a future object for the active upgrade operation. If there is none, it returns nil.
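A non-blocking upgrade can be combined with the future object's methods to poll progress. A sketch, assuming a registered upgrade function convert:

```lua
local future = box.space.test:upgrade({func = 'convert', is_async = true})
-- Poll once a second until the operation reaches the done or error state.
while not future:wait(1) do
    print(('upgrade %s: %s'):format(future.status, future.progress or '-'))
end
assert(future.status == 'done', future.error)
```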

Upgrade modes

There are three upgrade modes: dryrun, dryrun+upgrade, and upgrade. Regardless of the mode selected, the upgrade does not block execution. Once in a while, the background fiber commits the upgraded tuples and yields.

Calling space:upgrade without arguments always returns the current state of the space upgrade, never the state of a dry run. If there is a dry run working in the background, space:upgrade will still return nil. Unlike an actual space upgrade, the future object returned by a dry run upgrade can’t be recovered if it is lost. So a dry run is aborted if it is garbage collected.

Warning

In dryrun+upgrade mode: if the future object is garbage collected by Lua before the end of the dry run and the start of the upgrade, then the dry run will be canceled, and no upgrade will be started.

Upgrade modes:

States

An upgrade operation has one of the following upgrade states:

[Image: upgrade state diagram (ddl-state.png)]

Interaction with alter

While a space upgrade is in progress, the space can't be altered or dropped. An attempt to do so throws an exception. Restarting an upgrade is allowed if the currently running upgrade was canceled or completed with an error. This means that a manual restart is possible if the upgrade operation is in the error state.

If a space upgrade was canceled or failed with an error, the space can’t be altered or dropped. The only option is to restart the upgrade using a different upgrade function or format.

Interaction with recovery

The space upgrade state is persisted. It is stored in the _space system table. If an instance with a space upgrade in progress (inprogress state) is shut down, it restarts the space upgrade after recovery. If a space upgrade fails (switches to the error state), it remains in the error state after recovery.

Interaction with replication

The changes made to a space by a space upgrade are replicated. Just as on the instance where the upgrade is performed, the upgrade function is applied to all tuples returned to the user on the replicas. However, the upgrade operation is not performed on the replicas in the background. The replicas wait for the upgrade operation to complete on the master. They can’t alter or drop the space. Normally, they can’t cancel or restart the upgrade operation either.

There is an emergency exception when the master is permanently dead. It is possible to restart a space upgrade that started on another instance. The restart is possible if the upgrade owner UUID (see the owner field) has been deleted from the _cluster system table.

Note

Except for the dryrun mode, the upgrade can only be performed on the master. If the instance is no longer the master, the upgrade is suspended until the instance becomes master again. Restarting the upgrade on a new master works only if the old one has been removed from the replica set (the _cluster system space).

Usage example

Suppose there are two columns in the space test: id (unsigned) and data (string). The example shows how to upgrade the schema and add another column to the space using space:upgrade(). The new column contains the id values converted to strings. Note that each step may take a while.

The test space is generated with the following script:

local log = require('log')
box.cfg{
    checkpoint_count = 1,
    memtx_memory = 5 * 1024 * 1024 * 1024,
}
box.schema.space.create('test')
box.space.test:format{
    {name = 'id', type = 'unsigned'},
    {name = 'data', type = 'string'},
}
box.space.test:create_index('pk')
local count = 20 * 1000 * 1000
local progress = 0
box.begin()
for i = 1, count do
    box.space.test:insert{i, 'data' .. i}

    if i % 1000 == 0 then
        box.commit()
        local p = math.floor(i / count * 100)
        if progress ~= p then
            progress = p
            log.info('Generating test data set... %d%% done', p)
        end
        box.begin()
    end
end
box.commit()
box.snapshot()
os.exit(0)

To upgrade the space, connect to the server and then run the commands below:

localhost:3301> box.schema.func.create('convert', {
              >     language = 'lua',
              >     is_deterministic = true,
              >     body = [[function(t)
              >         if #t == 2 then
              >             return t:update({{'!', 2, tostring(t.id)}})
              >         else
              >             return t
              >         end
              >     end]],
              > })
localhost:3301> box.space.test:upgrade({
              >     func = 'convert',
              >     format = {
              >         {name = 'id', type = 'unsigned'},
              >         {name = 'id_string', type = 'string'},
              >         {name = 'data', type = 'string'},
              >     },
              > })

While the upgrade is in progress, you can track the state of the upgrade. To check the status, connect to Tarantool from another console and run the following commands:

localhost:3311> box.space.test:upgrade()
---
- status: inprogress
  progress: 8%
  owner: 579a9e99-427e-4e99-9e2e-216bbd3098a7
  func: convert
...

Even though the upgrade is only 8% complete, selecting the data from the space returns the converted tuples:

localhost:3311> box.space.test:select({}, {iterator = 'req', limit = 5})
---
- - [20000000, '20000000', 'data20000000']
  - [19999999, '19999999', 'data19999999']
  - [19999998, '19999998', 'data19999998']
  - [19999997, '19999997', 'data19999997']
  - [19999996, '19999996', 'data19999996']
...

Note

The tuples contain the new field even though the space upgrade is still running.

Wait for the space upgrade to complete using the command below:

localhost:3311> box.space.test:upgrade():wait()

Migration from Tarantool Cartridge

If your company uses a service based on Tarantool Community Edition and Tarantool Cartridge, follow the steps below to update these components to Tarantool Enterprise Edition.

As a reference, the instructions below use a template service created with cartridge-cli, namely with cartridge create --name myapp ..

Service build pipeline

Get access to the source code and build pipeline of your service. Here is an example of what the service build pipeline might look like for CentOS/RHEL 7:

curl -L https://tarantool.io/release/2/installer.sh | bash
yum -y install tarantool tarantool-devel cartridge-cli git gcc gcc-c++ cmake
cartridge pack rpm

Update the pipeline

In the installation section of your pipeline, replace open-source tarantool packages with Tarantool Enterprise SDK:

curl -L \
  https://${TOKEN}@download.tarantool.io/enterprise/release/${OS}/${ARCH}/${VERSION}/tarantool-enterprise-sdk-${VERSION_OS_ARCH_POSTFIX}.tar.gz \
  > sdk.tar.gz

# for example, the URL for the Linux build of Tarantool 2.10.4 for the x86_64 platform will be:
# https://${TOKEN}@download.tarantool.io/enterprise/release/linux/x86_64/2.10/tarantool-enterprise-sdk-gc64-2.10.4-0-r523.linux.x86_64.tar.gz

tar -xvf sdk.tar.gz
source tarantool-enterprise/env.sh
cartridge pack rpm

Now the pipeline will produce a new service artifact, which includes Tarantool Enterprise.

Update the service

Update your service to the new version like you usually update Tarantool in your organization. You don’t have to interrupt access to the service. To learn how to do it with ansible-cartridge, check this example.

That’s it!

You can now use Tarantool Enterprise features in your installation. For example, to enable the audit log, set up the audit_log parameter in your node configuration.

Modules reference

This chapter covers open and closed source Lua modules for Tarantool Enterprise included in the distribution as an offline rocks repository.

Open source modules

Closed source modules

Installing and using modules

To use a module, install the following:

  1. All the necessary third-party software packages (if any). See the module’s prerequisites for the list.

  2. The module itself on every Tarantool instance:

    $ tarantoolctl rocks install <module_name> [<module_version>]
    

See also other useful tarantoolctl commands for managing Tarantool modules.

ldap

LDAP client library for Tarantool.

This library allows you to authenticate against an LDAP server and perform searches.

Usage example with OpenLDAP library

OpenLDAP is an open-source implementation of LDAP. It is complex and not much fun to deal with, but it implements the full LDAP standard.

Install OpenLDAP

CentOS 7

yum install -y openldap*

The slapd server will be available in $PATH and can be started right away.

CentOS 8

There is no package for CentOS 8, so you will have to build OpenLDAP from source.

The full process is described here: https://kifarunix.com/install-and-setup-openldap-on-centos-8/

MacOS

brew install openldap

The slapd server will be in /usr/local/opt/openldap/libexec/ and must be added to $PATH in order to continue.

Running tests on OpenLDAP

There are scripts ready for such a task.

  1. Run test/prepare.sh to set up a virtualenv and create SSL certificates and keys.

  2. Start slapd with test/openldap/start_slapd.sh. This will:

    • create a slapd.conf config file according to your environment;
    • start a slapd process in the background;
    • populate the LDAP database with the contents of test/openldap/database.ldif file.
  3. Run tarantool test.lua

Usage example with glauth (a simple LDAP server)

First, download glauth, a simple Go-based LDAP server using the following commands:

cd test/glauth
./glauth/download_glauth.sh

Then run glauth:

./glauth -c glauth_test.cfg

Then run the following tarantool script in a separate terminal:

#!/usr/bin/env tarantool

local ldap = require('ldap')
local yaml = require('yaml')

local user = "cn=johndoe,ou=superheros,dc=glauth,dc=com"
local password = "dogood"

local ld = assert(ldap.open("localhost:3893", user, password))

local iter = assert(ldap.search(ld,
    {base="dc=glauth,dc=com",
     scope="subtree",
     sizelimit=10,
     filter="(objectclass=*)"}))

for entry in iter do
    print(yaml.encode(entry))
end

Using ldap for authorization in the web interface

See this doc page.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Add versioning support.

[1.2.0] - 2022-03-28

Added

  • Allow passing options to the ldap.open function.

[1.1.2] - 2022-02-22

Added

  • validate method for ldap.options.

[1.1.1] - 2022-02-17

Fixed

  • Options are now a part of the ldap module; load them with require('ldap.options').

[1.1.0] - 2022-02-07

Added

  • Functions set_option and set_global_option to set OpenLDAP configuration options.

  • Functions get_option and get_global_option to retrieve OpenLDAP configuration options’ values.

  • Exported some internal ldap_* options (#14).

  • Reworked tests:

    • Tests now run against an OpenLDAP server (slapd) rather than the old glauth. This gives more flexibility on what can be done in tests.
    • Added a test for TLS connections.
    • The userPrincipalName attribute is not available for tests yet.
    • The memberOf attribute must be asked for explicitly.

Fixed

  • Incorrect return value handling for ldap_start_tls call in ldap.open() (#11).
  • Event loop deadlock if iterator is used twice without search_timeout specified (#2).

Other

  • Updated the test server glauth to version 2.0.0, which allows using the userPrincipalName attribute.

[1.0.2] - 2021-07-08

Fixed

  • Parsing of memberOf list in LDAP entries

[1.0.1] - 2021-06-02

Added

  • Module is now able to hot-reload
  • Updated documentation
  • Github CI testing

Fixed

  • Error messages on incorrect ldap_simple_bind call
  • Fallback to ldap_simple_bind call when AUTH_METHOD_NOT_SUPPORTED for SASL

[1.0.0] - 2019-04-03

Added

  • Basic functionality
  • Luarock-based packaging
  • Build without any dependencies but Tarantool Enterprise

task

Module scheduler

Task scheduler (a cartridge role).

Functions

init (opts)

Initialize the scheduler, start cron background fiber.

Parameters:

  • opts:
    • runner: (string) name of the runner module, default is task.runner.local
    • storage: (string) name of the storage module, default is task.storage.local

get_tasks ()

List registered tasks.

get_task_log (opts)

List task execution log, ordered by creation time.

Parameters:

  • opts:
    • filter: (table) must contain either an id number, or an array of names
    • limit: (number) the maximum length of a single task log fetched from storage
    • created: (string) ISO 8601 timestamp, acts as offset for pagination

start (name, args)

Start a task.

Parameters:

  • name: (string) name of the task
  • args: (table) array of arguments to be passed to task function

stop (id)

Stop a running or pending task.

Parameters:

  • id: (string) ID of the task

forget (id)

Remove task execution log record from storage.

Parameters:

  • id: (string) ID of the task

start_periodical_task (name, args)

Start a periodical task.

Parameters:

  • name: (string) name of the task
  • args: (table) array of arguments to be passed to the task function

register (tasks)

Register available tasks. This starts launching periodical and continuous tasks and allows starting single_shot tasks.

Parameters:

  • tasks: (table) names of tasks

Module roles.scheduler

Task manager (a cartridge role).

Handles setting available tasks from the cluster config, handles scheduler fibers (including failover). In a basic case, it sets up a scheduler, a task storage, and a task runner on the same node.

Module roles.runner

Local task runner module, used by default.

You must provide the same interface if you want to write your own runner. A runner is expected to take tasks from the storage, run them, and complete them.

Module roles.storage

Local task storage module, used by default.

You must provide the same interface if you want to write your own storage.

Functions

select (index, key, opts)

Select task log.

Parameters:

Task Manager for Tarantool Enterprise


Task manager module allows you to automate several types of background jobs:

You get the following features out-of-the-box:

Task manager comes with several built-in cartridge roles:

Basic usage (single-node application)

  1. Embed the following in the instance file (init.lua):
...
local task = require('task')
require('my_module')
...
local ok, err = cartridge.cfg({
   roles = {'my_role', ...}
})
assert(ok, tostring(err))

task.init_webui()
  2. Add dependencies on the task scheduler, runner, and storage roles to your role:
return {
    ...
    dependencies = {
        'task.roles.storage',
        'task.roles.scheduler',
        'task.roles.runner'
    }
}
  3. Add the tasks section to your clusterwide configuration:
tasks:
  my_task:
    kind: periodical
    func_name: my_module.my_task
    schedule: "*/* 1 * * * *"
  4. That’s it! The my_task function from the my_module module will be launched every minute.
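The my_module referenced by func_name above is assumed to be a regular Lua module that exports the task function by name; a minimal sketch:

```lua
-- my_module.lua (hypothetical): exports my_task, referenced as
-- func_name: my_module.my_task in the tasks configuration section.
local log = require('log')

local function my_task()
    log.info('my_task: background job started')
    -- the actual background work goes here
end

return { my_task = my_task }
```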

Advanced usage (multi-node installation)

  1. Embed the following in the instance file:
...
local task = require('task')
require('my_module')
local ok, err = cartridge.cfg({
   roles = {
    ...
    'task.roles.scheduler',
    'task.roles.storage',
    'task.roles.runner'
   }
})
assert(ok, tostring(err))

task.init_webui()

Important: task.init_webui() should be called on all nodes. It is a safe and necessary thing to do even if Tarantool Cartridge Web UI is disabled on some nodes.

  2. Enable the task scheduler role on a dedicated node in your cluster (after deployment). If you set up a big cluster, don’t set up more than one replica set with the scheduler.
  3. Enable the task storage role on a dedicated node in your cluster (after deployment), possibly on the same node as the task scheduler. If you set up a big cluster, don’t set up more than one replica set with the storage.
  4. Enable the task runner role on dedicated stateless nodes in your cluster (after deployment), as many as you may need.

Advanced usage (sharded storage)

  1. Embed the following in the instance file:
...
local task = require('task')
require('my_module')
local ok, err = cartridge.cfg({
   roles = {
    ...
    'task.roles.sharded.scheduler',
    'task.roles.sharded.storage',
    'task.roles.sharded.runner'
   }
})
assert(ok, tostring(err))

task.init_webui()

Important: task.init_webui() should be called on all nodes. It is a safe and necessary thing to do even if Tarantool Cartridge Web UI is disabled on some nodes.

  2. Enable the task scheduler role on a dedicated node in your cluster (after deployment). If you set up a big cluster, don’t set up more than one replica set with the scheduler.
  3. Enable the task storage role on the nodes of some vshard group (or on all storage nodes). Set up the cartridge built-in vshard-storage role on these nodes.
  4. Enable the task runner role on dedicated stateless nodes in your cluster (after deployment), as many as you may need.

Tasks configuration

Tasks are configured via the scheduler cluster role. An example of valid role configuration:

tasks:
    my_reload:
      kind: periodical
      func_name: my_module.cache_reload
      schedule: "*/* 1 * * * *"
      time_to_resolve: 180
    my_flush:
      kind: single_shot
      func_name: my_module.cache_flush
      args:

        - some_string1
        - some_string2
    push_metrics:
      kind: continuous
      func_name: my_module.push_metrics
      pause_sec: 30

task_storage:
  task_ttr: 60 # default time_to_resolve, if no other value is specified in task
  task_log_max_size: 100 # number of task history records (excluding pending or running ones) to be kept on a single storage node (per task name)

task_runner:
    capacity: 128 # number of tasks that can be executed simultaneously on a single runner node

For example:

  • delay = 60, max_attempts = 5 will lead to 5 retries every 60 seconds;
  • delay = 8, delay_factor = 1.5, max_attempts = 8 will lead to retries after 8, 12, 18, 27, 40, 60, and 90 seconds,
    with the last attempt happening 260 seconds after the first one.

Note: be careful when setting up retries for periodic tasks: retries may lead to overlaps between task launches.
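
The retry delays above follow a geometric progression: retry n happens delay * delay_factor^(n-1) seconds after the previous attempt. A minimal Lua sketch of the computation (the option names follow the examples above; the exact rounding applied by the module is an assumption):

```lua
-- Sketch: compute retry delays for delay = 8, delay_factor = 1.5,
-- max_attempts = 8 (one initial attempt plus seven retries).
local delay, factor, max_attempts = 8, 1.5, 8

local total, current = 0, delay
for retry = 1, max_attempts - 1 do
    print(string.format('retry %d starts %.1f s after the previous attempt', retry, current))
    total = total + current
    current = current * factor
end
print(string.format('the last retry starts %.0f s after the first attempt', total))
```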

You may set up default task config for your application in task.init() call:

task.init({
    default_config = {
        my_task = {
            kind = 'single_shot',
            func_name = 'dummy_task.dummy',
        }
    },
    task_storage = {
        task_ttr = 3600,
        task_log_max_size = 10000
    },
    task_runner = {
        capacity = 1
    }
})

The default config is applied if the respective sections (tasks, task_runner, task_storage) are not set in the clusterwide config. task.init() should be called prior to cartridge.cfg().

Advanced usage

Monitoring

Task storage nodes expose current breakdown by statuses for your monitoring plugins of choice:

> require('task.roles.storage').statistics()

---
- statuses:
    stopped: 0
    failed: 0
    lost: 0
    total: 6
    completed: 3
    pending: 1
    did not start: 0
    running: 2
    unknown task: 0

The same applies to the sharded storage role.
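
These counters can be polled periodically and fed into a monitoring system; a hypothetical sketch using a background fiber (the polling interval and the log-based export are illustrative, replace them with your metrics plugin of choice):

```lua
local fiber = require('fiber')
local log = require('log')

fiber.create(function()
    local storage = require('task.roles.storage')
    while true do
        local stats = storage.statistics()
        for status, count in pairs(stats.statuses) do
            -- Export each status counter; a real setup would push
            -- these values to a metrics collector instead of the log.
            log.info('tasks in status %q: %d', status, count)
        end
        fiber.sleep(60)
    end
end)
```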

Running a task via API

Everything visible in the UI is also available via the API. You may look up the requests in the UI or in the Cartridge GraphQL schema.

curl -w "\n" -X POST http://127.0.0.1:8080/admin/api --fail -d@- <<'QUERY'
{"query": "mutation { task { start(name: \"my_reload\") } }"}
QUERY

Running massive amounts of tasks

If you need to run a large number of background tasks in parallel (say, more than several per second), there are two options:

Option A - registered tasks

require('task').start(
    'my_task', -- task_name from the config, probably of the `single_shot` kind
    {...} -- arguments
)

Option B - anonymous

Or, if you don’t want to bother with registering tasks in the clusterwide config, you may create an anonymous task:

require('task').create(
    'your_module.some_background_job', -- func_name
    {...}, -- arguments
    { time_to_resolve = 1 } -- options
)
In both cases, such a task will be put into the storage and executed on a runner as usual. Remember that:
  • If you’re using sharded task storage, you need to have vshard-router enabled on the instance where you call task.create.
  • You may want to increase task_storage.task_log_max_size accordingly if you intend to launch many tasks simultaneously.
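
For instance, a batch of anonymous tasks could be dispatched in a loop (the module, function name and arguments are illustrative):

```lua
local task = require('task')

-- Dispatch one background job per item; each task is stored
-- and picked up by an available runner.
for _, id in ipairs({101, 102, 103}) do
    task.create(
        'your_module.some_background_job', -- func_name
        {id},                              -- arguments
        {time_to_resolve = 60}             -- options
    )
end
```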

Supplying custom runner and storage

Embed the following into your instance file:

task.init({
    runner = 'my_tasks.my_runner',
    storage = 'my_tasks.my_storage'
})
...
local ok, err = cartridge.cfg({...})
assert(ok, tostring(err))
task.init_webui()

Be sure to call task.init() prior to cartridge.cfg(), so that the custom options are available by the time role initialization starts.

You may then set up only the task scheduler role and handle storages and runners yourself.

Writing your own runner and storage

The runner module must expose an api member with the following functions:
  • stop_task
The storage module must expose an api member with the following functions:
  • select
  • get
  • delete
  • put
  • complete
  • take
  • cancel
  • wait
  • (optional) set_ddl, which should be (safely) called on each node of the cluster

For more details, refer to the built-in runner and storage documentation.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[Unreleased]

Added:

Fixed:

(now it is explicitly prohibited)

[0.9.0]

Fixed:

[0.8.5]

Fixed:

[0.8.4]

Fixed:

[0.8.3]

Fixed:

[0.8.2]

Fixed:

[0.8.1]

Fixed:

[0.8.0]

Added:

Fixed:

[0.7.4]

[0.7.3]

[0.7.2]

[0.7.1]

[0.6.1]

[0.6.0]

Added:

[0.5.0]

Added:

[0.4.0]

Changed:

[0.3.0]

Added:

Changed:

[0.2.0]

Added:

ODBC connector for Tarantool

Based on unixODBC

Breaking changes

Between 0.7.3 (and lower) and 1.0.0 (and higher)

odbc.create_env

  • opts.date_as_table deprecated
    • Use custom string date value to table marshaling
  • opts.decimal_as_luanumber introduced

Examples

Use a single connection

local odbc = require('odbc')
local yaml = require('yaml')

local env, err = odbc.create_env()
local conn, err = env:connect("DSN=odbc_test")

local result, err = conn:execute("SELECT 1 as a, 2 as b")
print(yaml.encode(result))

conn:close()

Use ODBC transactions

local odbc = require('odbc')
local yaml = require('yaml')

local env, err = odbc.create_env()
local conn, err = env:connect("DSN=odbc_test")
conn:execute("CREATE TABLE t(id INT, value TEXT)")

conn:set_autocommit(false)
conn:execute("INSERT INTO t VALUES (1, 'one')")
conn:execute("INSERT INTO t VALUES (2, 'two')")
local result, err = conn:execute("SELECT * FROM t")
print(yaml.encode(result))

conn:commit()
conn:close()

Use connection pool for ad-hoc queries

Pool implements :execute(), :drivers(), :datasources() and :tables() methods.

local odbc = require('odbc')
local yaml = require('yaml')

local pool, err = odbc.create_pool({
    size = 5,
    dsn = os.getenv('DSN')
})
local _, err = pool:connect()

local res, err = pool:execute("SELECT 1 as a, 2 as b")
print(yaml.encode(res))

pool:close()

Rent pool connections

local odbc = require('odbc')
local yaml = require('yaml')

local pool, err = odbc.create_pool({
    size = 5,
    dsn = os.getenv('DSN')
})
local _, err = pool:connect()

local conn = pool:acquire()

local res, err = conn:execute("SELECT 1 as a, 2 as b")
print(yaml.encode(res))

pool:release(conn)

pool:close()

API Reference

ODBC

odbc.create_env(opts)

Creates an ODBC environment.

Parameters:decimal_as_luanumber (boolean) – configures how the odbc package handles decimal values
Returns:environment
odbc.create_pool(opts)

Creates a connection pool. Accepts all options of odbc.create_env plus:

Parameters:
  • dsn (string) – connection string
  • size (integer) – number of connections in the pool

Environment methods

class odbc.environment
environment.connect(dsn)

Connects to a database using dsn.

Parameters:dsn (string) – connection string.
Returns:connection
Returns:nil, err
environment.drivers()
Returns a list of drivers available in the system (contents of the odbcinst.ini file).

Returns:drivers

Example

tarantool> env:drivers()
---
- - name: PostgreSQL ANSI
    attributes:
        Setup: libodbcpsqlS.so
        Driver: psqlodbca.so
        UsageCount: '1'
        Debug: '0'
        CommLog: '1'
        Description: PostgreSQL ODBC driver (ANSI version)
  - name: ODBC Driver 17 for SQL Server
    attributes:
        UsageCount: '1'
        Driver: /opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.2.so.0.1
        Description: Microsoft ODBC Driver 17 for SQL Server
  - name: MySQL ODBC 8.0 Driver
    attributes:
        Setup: /opt/mysqlodbc/lib/libmyodbc8S.so
        Driver: /opt/mysqlodbc/lib/libmyodbc8a.so
        UsageCount: '1'
...
environment.datasources()

Returns a list of data sources available in the system (contents of odbc.ini file).

Example

tarantool> env:datasources()
---
- - name: tmddb
   driver: PostgreSQL ANSI
 - name: odbc_test
   driver: PostgreSQL ANSI
 - name: odbc_mssql
   driver: ODBC Driver 17 for SQL Server
 - name: odbc_mysql
   driver: MySQL ODBC 8.0 Unicode Driver
...

Connection methods

class odbc.connection
connection.execute(query, params)

Executes an arbitrary SQL query

Parameters:
  • query (string) – SQL query
  • params (table) – table with parameters binding
Returns:

resultset table with results

Returns:

nil, errors

Example

conn:execute("SELECT * FROM t WHERE id > ? and value = ?", {1, "two"})

Limitations

  • Since there is no type marshaling, bound parameters may have to be explicitly cast in the SQL expression, e.g.:
conn:execute([[ insert into <table> values (cast(? as json)) ]], { json:encode({a=1, b=2}) })
connection.set_autocommit(flag)

Sets autocommit of the connection to the specified value. Used to achieve transactional behaviour: set autocommit to false to execute multiple statements in one transaction.

Parameters:flag (boolean) – true/false
connection.set_timeout(timeout)

Sets the query execution timeout to the specified value. The timeout is applied to each executed query, including queries with a cursor, until the connection is closed. Set the timeout to 0 to disable it. By default, the timeout from the driver settings is used.

Parameters:timeout (integer) – timeout of query execution in seconds.
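
For example, bounding the execution time of all queries on a connection might look like this (the DSN and the long-running query are illustrative; pg_sleep is PostgreSQL-specific):

```lua
local odbc = require('odbc')

local conn = odbc.create_env():connect('DSN=odbc_test')

conn:set_timeout(5)  -- each subsequent query may run for at most 5 seconds
local res, err = conn:execute('SELECT pg_sleep(10)')
if err ~= nil then
    print('query timed out or failed: ' .. tostring(err))
end
conn:set_timeout(0)  -- disable the timeout again
conn:close()
```
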
connection.commit()

Commit a transaction

connection.rollback()

Rollback a transaction

connection.set_isolation(level)

Sets isolation level of a transaction. Cannot be run in an active transaction.

Parameters:level (enum) – isolation level. One of the values defined in the odbc.isolation table.

Isolation levels

  1. odbc.isolation.READ_UNCOMMITTED
  2. odbc.isolation.READ_COMMITTED
  3. odbc.isolation.REPEATABLE_READ
  4. odbc.isolation.SERIALIZABLE
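
A transaction with an explicit isolation level might be sketched as follows (the DSN and the accounts table are illustrative):

```lua
local odbc = require('odbc')

local conn = odbc.create_env():connect('DSN=odbc_test')

-- set_isolation cannot be called in an active transaction,
-- so set the level before turning autocommit off.
conn:set_isolation(odbc.isolation.SERIALIZABLE)
conn:set_autocommit(false)

conn:execute("UPDATE accounts SET balance = balance - 10 WHERE id = 1")
conn:execute("UPDATE accounts SET balance = balance + 10 WHERE id = 2")

conn:commit()
conn:close()
```
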
connection.is_connected()

Returns true if connection is active.

connection.state()

Returns an internal state of a connection.

connection.close()

Disconnect and close the connection.

connection.tables()

Returns a list of tables of a connected data source.

Example

tarantool> conn:tables()
---
- - table_type: TABLE
   catalog: odbc_test
   remarks:
   table_schema: public
   table_name: tt
 - table_type: TABLE
   catalog: odbc_test
   remarks:
   table_schema: public
   table_name: tt2
 - table_type: TABLE
   catalog: odbc_test
   remarks:
   table_schema: public
   table_name: tt3
...
connection.cursor(query, params)

Creates a cursor object for the specified query.

Parameters

Parameters:
  • query (string) – SQL query
  • params (table) – table with parameters binding
Returns:

cursor

Returns:

nil, err

connection.prepare(query)

Creates a prepared statement object for the query. If you share a prepared query between fibers, the queries will be executed sequentially (synchronized between fibers).

Parameters:query (string) – SQL query
Returns:prepare
Returns:nil, err

Cursor methods

class odbc.cursor
cursor.fetchrow()

Fetch one row from the data frame and return as a single table value.

Example

tarantool> cursor = conn:cursor("select * from tt")
tarantool> cursor:fetchrow()
---
- id: 1
...

tarantool> cursor:fetchrow()
---
- id: 2
...
cursor.fetch(n)

Fetch multiple rows.

Parameters:n (integer) – number of rows to fetch

Example

tarantool> cursor:fetch(4)
---
- - id: 3
  - id: 4
  - id: 5
  - id: 6
...
cursor.fetchall()

Fetch all available rows in the data frame.

cursor.is_open()

Returns true if cursor is open.

cursor.close()

Close cursor discarding available data.
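
Together, these methods allow processing a large result set in fixed-size batches without loading it all at once; a minimal sketch (the DSN and query are illustrative, and the exhaustion check is an assumption):

```lua
local odbc = require('odbc')

local conn = odbc.create_env():connect('DSN=odbc_test')
local cursor = conn:cursor('SELECT * FROM tt')

while cursor:is_open() do
    local rows = cursor:fetch(100)  -- up to 100 rows per batch
    if rows == nil or #rows == 0 then
        break
    end
    for _, row in ipairs(rows) do
        -- process the row here
    end
end

cursor:close()
conn:close()
```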

Prepare methods

class odbc.prepare
prepare.execute(params)

Executes the prepared SQL query.

Parameters:params (table) – table with parameters binding

Example

tarantool> p = conn:prepare('insert into tt values (?)')
---
...
tarantool> p:execute({1})
---
- 1
...
tarantool> p:execute({2})
---
- 1
...
prepare.is_open()

Returns true if the prepared statement is open.

prepare.close()

Closes the prepared statement, discarding the prepared query.

Pool methods

class odbc.pool
pool.connect()

Establishes all size connections of the pool.

pool.acquire(timeout)

Acquires a connection from the pool. The connection must be either returned to the pool with the pool:release() method or closed.
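
To guarantee that a rented connection is returned to the pool even when the work on it raises an error, the body can be wrapped in pcall:

```lua
local odbc = require('odbc')

local pool = odbc.create_pool({size = 5, dsn = os.getenv('DSN')})
pool:connect()

local conn = pool:acquire()
local ok, res = pcall(function()
    return conn:execute('SELECT 1 as a')
end)
pool:release(conn)  -- always return the connection to the pool

if not ok then
    error(res)
end
pool:close()
```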

pool.release(conn)

Release the connection.

pool.available()

Returns the number of available connections.

pool.close()

Close pool and all underlying connections.

pool.execute(query, params)

Acquires a connection, executes query on it and releases the connection.

Parameters:
  • query (string) – SQL query
  • params (table) – table with parameters binding
pool.tables()

Acquires a connection, executes tables() on it and releases.

pool.drivers()

Acquires a connection, executes drivers() on it and releases.

pool.datasources()

Acquires a connection, executes datasources() on it and releases.

Installation

Prerequisites:

  1. Driver for the database of your choice, e.g. PostgreSQL, MySQL, Sybase.
  2. A data source for the database in the odbc.ini file, or a DSN string that describes the connection.

PostgreSQL

Linux (Ubuntu)

$ sudo apt-get install odbc-postgresql

Add to file /etc/odbcinst.ini:

[PostgreSQL ANSI]
Description=PostgreSQL ODBC driver (ANSI version)
Driver=psqlodbca.so
Setup=libodbcpsqlS.so
Debug=0
CommLog=1
UsageCount=1

Add to file /etc/odbc.ini:

[<dsn_name>]
Description=PostgreSQL
Driver=PostgreSQL ANSI
Trace=No
TraceFile=/tmp/psqlodbc.log
Database=<Database>
Servername=localhost
username=<username>
password=<password>
port=
readonly=no
rowversioning=no
showsystemtables=no
showoidcolumn=no
fakeoidindex=no
connsettings=

MacOS

Use brew to install:

$ brew install psqlodbc

/usr/local/etc/odbcinst.ini contents:

[PostgreSQL ANSI]
Description=PostgreSQL ODBC driver (ANSI version)
Driver=/usr/local/lib/psqlodbca.so
Debug=0
CommLog=1
UsageCount=1

/usr/local/etc/odbc.ini contents:

[<dsn_nam>]
Description=PostgreSQL
Driver=PostgreSQL ANSI
Trace=No
TraceFile=/tmp/psqlodbc.log
Database=<database>
Servername=<host>
UserName=<username>
Password=<password>
ReadOnly=No
RowVersioning=No
ShowSystemTables=No
ShowOidColumn=No
FakeOidIndex=No

MSSQL

Linux

Please follow the official installation guide.

/etc/odbcinst.ini contents:

[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.2.so.0.1
UsageCount=1

/etc/odbc.ini contents:

[<dsn_name>]
Driver=ODBC Driver 17 for SQL Server
Database=<Database>
Server=localhost

MacOS

For El Capitan, Sierra and High Sierra use brew to install:

$ brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
$ brew install --no-sandbox msodbcsql17 mssql-tools

For El Capitan and Sierra use brew to install:

$ brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
$ brew install --no-sandbox msodbcsql@13.1.9.2 mssql-tools@14.0.6.0

The examples below apply to msodbcsql 13.

/usr/local/etc/odbcinst.ini contents:

[ODBC Driver 13 for SQL Server]
Description=Microsoft ODBC Driver 13 for SQL Server
Driver=/usr/local/lib/libmsodbcsql.13.dylib
UsageCount=1

/usr/local/etc/odbc.ini contents:

[<dsn_name>]
Description=SQL Server
Driver=ODBC Driver 13 for SQL Server
Server=<host>,<port>

Note: Uid, Pwd and other credentials are passed in the connection string.

Example

tarantool> conn, err = require('odbc').create_env():connect('DSN=odbc_mssql;Uid=SA;Pwd=<YourStrong!Passw0rd>')

MySQL

Linux

Please follow the official installation guide.

MacOS

Download and install:

  1. http://www.iodbc.org/dataspace/doc/iodbc/wiki/iodbcWiki/Downloads
  2. https://dev.mysql.com/downloads/connector/odbc/

Add to file /usr/local/etc/odbcinst.ini:

[MySQL ODBC 8.0 Driver]
Driver=/usr/local/mysql-connector-odbc-8.0.12-macos10.13-x86-64bit/lib/libmyodbc8a.so
UsageCount=1

Add to file /usr/local/etc/odbc.ini:

[<dsn>]
Driver = MySQL ODBC 8.0 Driver
Server = <host>
PORT = <port>

Note: USER, DATABASE and other parameters are passed in the connection string.

Example

tarantool> conn, err = require('odbc').create_env():connect('DSN=odbc_mysql; USER=root; DATABASE=odbc_test')
---
...

Sybase ASE

MacOS

Use brew to install:

$ brew install freetds

Add to file /usr/local/etc/freetds.conf

[sybase]
host = localhost
port = 8000
tds version = auto

Add to file /usr/local/etc/odbcinst.ini:

[Sybase Driver]
Driver=/usr/local/lib/libtdsodbc.so
UsageCount=1

Add to file /usr/local/etc/odbc.ini:

[default]
Driver=/usr/local/lib/libtdsodbc.so
Port=8000

[sybase]
Driver=Sybase Driver
Description=Sybase ASE
DataSource=<datasource>
ServerName=sybase
Database=<database>

Example

tarantool> conn, err = require('odbc').create_env():connect('DSN=sybase;Uid=sa;Pwd=myPassword')
---
...

References

  1. Tarantool - in-memory database and application server.
  2. PostgreSQL ODBC
  3. MS SQL Server ODBC
  4. MySQL ODBC

oracle

Oracle connector

The oracle package exposes some functionality of OCI. With this package, Tarantool Lua applications can send and receive data over Oracle protocol.

The advantage of integrating oracle with Tarantool, which is an application server plus a DBMS, is that anyone can handle all of the tasks associated with Oracle (control, manipulation, storage, access) with the same high-level language (Lua) and with minimal delay.

Table of contents

Prerequisites

  • An operating system with developer tools including cmake, C compiler with gnu99 support, git and Lua.
  • Tarantool 1.6.5+ with header files (tarantool and tarantool-dev packages).
  • Oracle OCI 10.0+ header files and dynamic libs.

Automatic build

Important: Builder requires Oracle Instant Client zip archives. You need to download them from Oracle into the source tree:

curl -O https://raw.githubusercontent.com/bumpx/oracle-instantclient/master/instantclient-basic-linux.x64-12.2.0.1.0.zip
curl -O https://raw.githubusercontent.com/bumpx/oracle-instantclient/master/instantclient-sdk-linux.x64-12.2.0.1.0.zip
sha256sum -c instantclient.sha256sum

To build a complete oracle package, you need to run package.sh script first (depends on docker). Packages will be available in build/ directory. Example:

wget <oracle-client.rpm>
wget <oracle-devel.rpm>
$ ./package.sh
...
done
$ ls -1 build
oracle-instantclient12.2-basic-12.2.0.1.0-1.x86_64.rpm
oracle-instantclient12.2-devel-12.2.0.1.0-1.x86_64.rpm
tarantool-oracle-1.0.0.0-1.el7.centos.src.rpm
tarantool-oracle-1.0.0.0-1.el7.centos.x86_64.rpm
tarantool-oracle-debuginfo-1.0.0.0-1.el7.centos.x86_64.rpm

After that you can install oracle package on the target machine:

rpm -Uvh tarantool-oracle-1.0.0.0-1.el7.centos.x86_64.rpm

Getting started

Start Tarantool in the interactive mode. Execute these requests:

tarantool> oracle = require('oracle')
tarantool> env, errmsg = oracle.new()
tarantool> if not env then error("Failed to create environment: "..errmsg) end
tarantool> c, errmsg = env:connect({username='system', password='oracle', db='localhost:1511/myspace'})
tarantool> if not c then error("Failed to connect: "..errmsg) end
tarantool> c:exec('CREATE TABLE test(i int, s varchar(20))')
tarantool> c:exec('INSERT INTO test(i, s) VALUES(:I, :S)', {I=1, S='Hello!'})
tarantool> rc, result_set = c:exec('SELECT * FROM test')

If all goes well, you should see:

tarantool> result_set[1][2] -- 'Hello!'

This means that you have successfully installed tarantool/oracle and successfully executed an instruction that brought data from an Oracle database.

API reference

function new([opts])

Create Oracle connection environment.

Accepts parameters:

  • [optional] table of options:
    • charset - client-side character and national character set. If not set or set improperly, NLS_LANG setting is used.

Returns:

  • env - environment object in case of success, nil otherwise,
  • err [OPTIONAL] - error string in case of error.

function env:connect(credentials [, additional options])

Connect to the Oracle database.

Accepts parameters:

  • credentials (table):
    • username (str) - user login,
    • password (str) - user password,
    • db (str) - database URL.
  • additional options (table):
    • prefetch_count (int) - number of rows to prefetch from Oracle,
    • prefetch_size (int) - memory limit for prefetching (in MB),
    • batch_size (int) - the size of each SELECT loop batch on exec() and cursor:fetchall().

Returns:

  • conn - connection object in case of success, nil otherwise,
  • err [OPTIONAL] - error string or table in case of error.
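
Putting the two parameter tables together, a connection with prefetch tuning might look like this (the credentials, URL and option values are illustrative):

```lua
local oracle = require('oracle')

local env, errmsg = oracle.new()
assert(env, errmsg)

local conn, err = env:connect(
    {username = 'system', password = 'oracle', db = 'localhost:1521/myspace'},
    {prefetch_count = 1000, prefetch_size = 16, batch_size = 500}
)
assert(conn, err)
```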

function env:version()

Get version string.

function conn:exec(sql [, args])

Execute an operation.

Accepts parameters:

  • sql - SQL statement,
  • [optional] statement arguments.

Returns:

  • rc - result code (0 - Success, 1 - Error)
  • result_set - result table, err - table with error (see below) in case of error
  • row_count - number of rows in result_set
  • err [OPTIONAL] - table with error in case of warning from Oracle

Examples:

--  Schema - create table(a int, b varchar(25), c number)
conn:exec("insert into table(a, b, c) values(:A, :B, :C)",  {A=1, B='string', C=0.1})
--  Schema - create table(a int, b varchar(25), c number)
rc, res = conn:exec("SELECT a, b, c FROM table")
res[1][1] -- a
res[1][2] -- b
res[1][3] -- c

function conn:cursor(sql [, args, opts])

Create cursor to fetch SELECT results.

Accepts parameters:

  • sql - SELECT SQL statement,

  • [optional] statement arguments,

  • [optional] table of options:

    • scrollable - enable cursor scrollable mode (false by default).

Returns:

  • cursor - cursor object in case of success, nil otherwise
  • err [OPTIONAL] - table with error, same format as exec function one
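
A scrollable cursor can then be navigated in both directions; a sketch, assuming conn is an open connection and test is an existing table:

```lua
local cursor, err = conn:cursor(
    'SELECT * FROM test',
    nil,                  -- no statement arguments
    {scrollable = true}
)
assert(cursor, err and err.msg)

local rc, rows = cursor:fetch_first(10)   -- first 10 rows
rc, rows = cursor:fetch_last()            -- jump to the last row
rc, rows = cursor:fetch_absolute(5, 20)   -- 5 rows starting at offset 20

cursor:close()
```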

function conn:close()

Close connection and all associated cursors.

function cursor:fetch_row()

Fetch one row of resulting set.

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch(fetch_size)

Fetch fetch_size rows of resulting set.

Accepts parameters:

  • fetch_size - number of rows (positive integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_all()

Fetch all remaining rows of resulting set.

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_first(fetch_size)

Scrollable only.

Fetch first fetch_size rows of resulting set.

Accepts parameters:

  • fetch_size - number of rows (positive integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_last()

Scrollable only.

Fetch last row of resulting set.

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_absolute(fetch_size, offset)

Scrollable only.

Fetch fetch_size rows of resulting set, starting from offset absolute position (including offset row).

Accepts parameters:

  • fetch_size - number of rows (positive integer),
  • offset - absolute cursor offset (positive integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_relative(fetch_size, offset)

Scrollable only.

Fetch fetch_size rows of resulting set, starting from current + offset absolute position (including the current + offset row).

Accepts parameters:

  • fetch_size - number of rows (positive integer),
  • offset - relative cursor offset (signed integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_current(fetch_size)

Scrollable only.

Fetch fetch_size rows of resulting set, starting from current position (including current row).

Accepts parameters:

  • fetch_size - number of rows (positive integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:fetch_prior(fetch_size)

Scrollable only.

Fetch fetch_size rows of resulting set, starting from previous row from the current position (including previous row).

Accepts parameters:

  • fetch_size - number of rows (positive integer).

Returns:

  • rc - result code (0 - Success, 1 - Error),
  • result_set - result table, err - table with error in case of error,
  • row_count - number of rows in result_set.

function cursor:get_position()

Scrollable only.

Get current cursor position.

Returns:

  • rc - result code (0 - Success, 1 - Error)
  • position - current cursor position, err - table with error in case of error

function cursor:close()

Closes the cursor. After this call, the cursor is no longer available for fetching results.

function cursor:is_closed()

Returns:

  • is_closed - true if closed, false otherwise.

function cursor:ipairs()

Lua 5.1 version of ipairs(cursor) operator.

Example:

-- Foo(row) is some function for row processing
for k, row in cursor:ipairs() do
    Foo(row)
end

function conn:close()

Close connection and all associated cursors.

Error handling

In case of error, functions return nil, err where err is a table with the following fields:

  • type - type of error (1 - Oracle error, 0 - connector error)
  • msg - error message text
  • code - error code (currently defined ONLY for Oracle error codes)
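
A caller can use the type field to tell Oracle errors from connector errors; a sketch, assuming c is a connection object and the queried table does not exist:

```lua
local rc, res = c:exec('SELECT * FROM missing_table')
if rc ~= 0 then
    -- On error the second return value is the error table.
    if res.type == 1 then
        print(('Oracle error %s: %s'):format(tostring(res.code), res.msg))
    else
        print('Connector error: ' .. res.msg)
    end
end
```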

Deploy Oracle in Docker

Description (For Oracle EE v12.2):

  1. You should have working Docker and Oracle accounts.
  2. instantclient-sqlplus.
  3. Go to this page and follow the instructions. You need to follow steps 1.a-1.d. In short, here they are:
docker pull store/oracle/database-enterprise:12.2.0.1
docker run -d -it --name OraDBEE -P store/oracle/database-enterprise:12.2.0.1

Note: type docker ps to get the allocated port; you need the PORTS section (or docker port CONTAINER_NAME). Also check that the status is healthy (if not, repeat step 1.d.2; container creation may fail intermittently). The port is assumed to be 32771 in the further instructions.

  4. Go inside the container, create a user, and grant all necessary permissions:
docker exec -it OraDBEE bash -c "source /home/oracle/.bashrc; sqlplus sys/Oradoc_db1@ORCLCDB as sysdba"
...

SQL> alter session set "_ORACLE_SCRIPT"=true;
SQL> CREATE USER user1 IDENTIFIED BY qwerty123;
SQL> GRANT CONNECT, RESOURCE, DBA TO user1;
  5. Check connection and access rights outside the container for the new user:
> sqlplus user1/qwerty123@localhost:32771/ORCLCDB.localdomain

Try to create a table, insert some rows, select them, then drop table.

  6. Follow the Getting started section. Use this user's credentials in the connect method:

tarantool> c, err = ora.connect({username='user1', password='qwerty123', db='localhost:32771/ORCLCDB.localdomain'})

Troubleshooting

If Docker can’t get Oracle image with

docker pull store/oracle/database-enterprise:12.2.0.1

try logging in with ‘docker login’. If that does not help, go to the Docker store and check the license (Terms of Service).

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[1.4.0] - 2023-03-24

Added

  • Package version API

[1.3.2] - 2020-12-02

We strongly recommend upgrading to this version from Oracle 1.3.0 if you use Tarantool 2.6.0-138-gd3f1dd720, 2.5.1-105-gc690b3337, 2.4.2-89-g83037df15, 1.10.7-47-g8099cb053 or newer.

Changed

  • Fix connection objects memory leak

[1.3.1] - 2020-12-01

Changed

  • Removed get_position for non-scrollable cursors
  • Delayed connection GC so it does not yield while collecting garbage (caused exit(1) since Tarantool 2.6.0-138-gd3f1dd720, 2.5.1-105-gc690b3337, 2.4.2-89-g83037df15, 1.10.7-47-g8099cb053 if a non-closed connection was processed by the garbage collector)

[1.3.0] - 2019-11-28

Added

  • Added support for connection environment

Changed

  • Direct call of ‘oracle.connect’ is deprecated and will print warnings on call
  • Added explicit error return for fiber cancelled errors
  • Fixed segmentation fault on fiber cancel in the middle of fetch, cursor and connection create
  • Fixed freeze on wrong credentials

[1.2.2] - 2019-09-30

Added

  • Added support for scrollable cursors

Changed

  • Fixed bug when cursor had no explicit Lua link to connection object and collecting connection object by gc made cursor unusable

[1.2.1] - 2019-09-16

Changed

  • Fixed bug when connection close affected other connections cursors
  • Fixed bug when cursors remained open on the Oracle side when closed manually

[1.2.0] - 2019-08-22

Added

  • Added support for non-scrollable cursors
  • Added support for connection caching parameters

[1.1.6] - 2019-07-08

Changed

  • Fixed a memory leak when a fiber with a connection has been killed
  • Fixed unresponsive event loop when fiber with long-running request is killed

[1.1.5] - 2019-05-27

Changed

  • Bugfixes and other improvements

[1.1.4] - 2019-04-04

Changed

  • Improved error handling
  • Fixed pushing double value into Lua stack
  • Fixed Lua VM freeze while connecting to Oracle DB
  • Various bugfixes

Added

  • Added OCI libs as a separate rock with dependency

[1.1.0] - 2019-02-01

Changed

  • Bugfixes
  • Several stability improvements

Added

  • Oracle OCCI switched to 11.2
  • Detect and bind params from statement
  • Removed autocommit
  • Added more data conversions for returning dataset
  • Use one method for read and write requests
  • exec_once() removed
  • Tests added
  • CentOS 7 Dockerfile added
  • Add support for blob parameters

[1.0.0] - 2017-04-12

Added

  • Basic functionality

space-explorer

Space explorer

Cartridge WebUI plugin for exploring tarantool spaces and data.

Usage

Example

example-app-scm-1.rockspec

package = 'example-app'
version = 'scm-1'
source  = {
    url = '/dev/null',
}
dependencies = {
    'tarantool',
    'lua >= 5.1',
    'luatest == 0.2.0-1',
    'ldecnumber == 1.1.3-1',
    'cartridge == scm-1',
    'space-explorer == scm-1'
}
build = {
    type = 'none';
}

init.lua

#!/usr/bin/env tarantool

require('strict').on()

local cartridge = require('cartridge')

local ok, err = cartridge.cfg({
    roles = {
        'cartridge.roles.vshard-storage',
        'cartridge.roles.vshard-router',
        'space-explorer',
        'app.roles.api',
        'app.roles.storage',
    },
})

assert(ok, tostring(err))

Appendixes

Appendix A. Tarantool audit module

This document provides an overview of the Tarantool audit module.

Overview

The Tarantool audit module writes messages that record events from the Tarantool DBMS in plain text, CSV or JSON format.

It provides you with a detailed report of all security-related activities and helps you find and fix breaches to protect your business. For example, you can see who updated user privileges and when:

{"time": "2022-04-07T13:39:36.046+0300", "remote": "", "session_type": "background", "module": "tarantool", "user": "admin", "type": "user_priv", "tag": "", "description": "Update user guest privileges for role super from none to execute"}

It is up to each company to decide exactly what activities to audit and what actions to take. System administrators, security engineers and others in charge in the company may want to audit different events for different reasons. Tarantool provides such an option for each of them.

Types of events you can monitor

Tarantool records various types of audit log events that you can monitor to decide whether you need to take action:

For more details about these audit log events, see the table below.

Event Type of event written to the audit log Example of an event display
Audit log enabled for events audit_enable
User authorized successfully auth_ok {"name": "user"}
User authorization failed auth_fail {"name": "user"}
User logged out or quit the session disconnect
Failed attempt to access secure data (personal records, details, geolocation, etc.) access_denied {"name": "obj_name", "obj_type": "space", "access_type": "read"}
User created user_create {"name": "user"}
User dropped user_drop {"name": "user"}
User disabled user_disable {"name": "user"}
User enabled user_enable {"name": "user"}
User privileges (roles, profiles, etc.) granted or changed user_priv {"name": "user", "obj_name": "obj_name", "obj_type": "space", "old_priv": "", "new_priv": "read,write"}
Password reset for a specific user password_change {"name": "user"}
Role created role_create {"name": "role"}
Role privileges granted or changed role_priv {"name": "role", "obj_name": "obj_name", "obj_type": "space", "old_priv": "", "new_priv": "read,write"}
Space created space_create {"space": "name"}
Space altered space_alter {"space": "name"}
Space dropped space_drop {"space": "name"}
Tuple inserted into space space_insert {"tuple": "name"} {"space": "name"}
Tuple replaced in space space_replace {"tuple": "name"} {"space": "name"}
Tuple deleted from space space_delete {"tuple": "name"} {"space": "name"}
Iterator key selected from space.index space_select {"iterator key": "name"} {"space.index": "space.index"}
Function called with arguments call {"function": "name"} {"arguments": "arguments"}
Expressions with arguments evaluated in a string eval {"expression": "name"} {"arguments": "arguments"}

Note

The eval event displays data from the console module and the eval function of the net.box module. For more on how they work, see Module console and Module net.box – eval. To tell the two sources apart, check whether the session_type field is set to console or binary.

Structure of audit log events

Each audit log event contains several fields to make it easy to filter and aggregate the resulting logs. They are described in the following table.

Field Description Example of a log field display
time Time of the event 2022-04-07T13:20:05.327+0300
remote Remote host that triggered the event 100.96.163.226:48722
session_type Session type console
module Audit log module. Set to tarantool for system events; can be overwritten for user-defined events tarantool
user User who triggered the event admin
type Audit event type access_denied
tag A text field that can be overwritten by the user  
description Human-readable event description Authenticate user Alice

Warning

You can set the audit log configuration parameters described below only once, at startup. Unlike many other box.cfg parameters, they cannot be changed at runtime.

Enable the Tarantool audit log

By default, the audit_enable option is set to false. Set it to true to enable the Tarantool audit module.

You can also set this option back to false to disable audit logging.
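Both steps can be combined in a single box.cfg call. A minimal sketch based on the options described in this section (the file name is just an example):

```lua
-- Enable the audit module and direct events to a file (example path)
box.cfg{
    audit_enable = true,
    audit_log = 'audit_tarantool.log',
}
```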

Choose where you want to write logs

By default, the audit_log option is set to nil, and the Tarantool audit module sends audit logs to the standard error stream (stderr). To send the audit log to a file, to a pipe, or to the system logger instead, set the audit_log option as described below.

Writing to a file

box.cfg{audit_log = 'audit_tarantool.log'}
-- or
box.cfg{audit_log = 'file:audit_tarantool.log'}

This opens the audit_tarantool.log file for output in the server’s default directory. If the audit_log string has no prefix or the prefix file:, the string is interpreted as a file path.

Sending to a pipe

box.cfg{audit_log = '| cronolog audit_tarantool.log'}
-- or
box.cfg{audit_log = 'pipe: cronolog audit_tarantool.log'}

This starts the cronolog program when the server starts and sends all audit_log messages to cronolog's standard input (stdin). If the audit_log string starts with '|' or contains the prefix pipe:, the string is interpreted as a Unix pipeline.

Sending to syslog

Warning

Below is an example of writing audit logs to a directory shared with the system logs. Tarantool allows this option, but it is not recommended to do this to avoid difficulties when working with audit logs. System and audit logs should be written separately. To do this, create separate paths and specify them.

This example setting sends the audit log to syslog:

box.cfg{audit_log = 'syslog:identity=tarantool'}
-- or
box.cfg{audit_log = 'syslog:facility=user'}
-- or
box.cfg{audit_log = 'syslog:identity=tarantool,facility=user'}
-- or
box.cfg{audit_log = 'syslog:server=unix:/dev/log'}

If the audit_log string starts with 'syslog:', it is interpreted as a message for the syslogd program, which normally runs in the background of any Unix-like platform. The setting can be 'syslog:', 'syslog:facility=…', 'syslog:identity=…', 'syslog:server=…', or a combination of these.

The syslog:identity setting is an arbitrary string that is placed at the beginning of all messages. The default value is tarantool.

The syslog:facility setting is currently ignored, but will be used in the future. The value must be one of the syslog keywords that tell syslogd where to send the message. The possible values are auth, authpriv, cron, daemon, ftp, kern, lpr, mail, news, security, syslog, user, uucp, local0, local1, local2, local3, local4, local5, local6, local7. The default value is local7.

The syslog:server setting is the locator for the syslog server. It can be a Unix socket path starting with 'unix:' or an ipv4 port number. The default socket value is /dev/log (on Linux) or /var/run/syslog (on Mac OS). The default port value is 514, which is the UDP port.

If you log to a file, Tarantool reopens the audit log on SIGHUP. If the log destination is a program (pipe), its pid is stored in the audit_log.logger_pid variable. To rotate logs, send a signal to that pid.

Configure a blocking mode

By default, the audit_nonblock option is set to true: Tarantool does not block during logging if the system is not ready to write and drops the message instead. Using this value may improve logging performance at the cost of losing some log messages. The option only has an effect if the output goes to syslog: or pipe:. Setting audit_nonblock to true is not allowed if the output goes to a file; in that case, set audit_nonblock to false.
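A sketch of both modes, using the options described above:

```lua
-- Non-blocking mode works for pipe: (and syslog:) outputs
box.cfg{audit_log = 'pipe: cronolog audit_tarantool.log', audit_nonblock = true}

-- When logging to a file, audit_nonblock must be set to false
box.cfg{audit_log = 'file:audit_tarantool.log', audit_nonblock = false}
```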

Configure the format of audit log events

You can choose the format of audit log events: plain text, CSV, or JSON.

Plain text is used by default. This human-readable format can be compressed efficiently. The JSON format is more convenient for receiving log events, analyzing them, and integrating with other systems. The CSV format lets you view audit log events in tabular form.

Use these commands to configure the format of audit log events in Tarantool.

Plain text

box.cfg{audit_log = 'audit.log', audit_format = 'plain'}

JSON format

box.cfg{audit_log = 'audit.log', audit_format = 'json'}

CSV format

box.cfg{audit_log = 'audit.log', audit_format = 'csv'}

Use filters

Tarantool’s extensive filtering options help you write only the events you need to the audit log.

To set filters, use the box.cfg.audit_filter option.

This option accepts more than ten values, which are listed in the following table.

Value Description
custom User-defined event logged with the audit.log() Lua function
auth_ok Authentication of username
auth_fail Authentication of username failed
disconnect Close connection
user_create Create username
user_drop Drop username
role_create Create role name
role_drop Drop role name
user_enable Enable username
user_disable Disable username
user_grant_rights Grant read, write rights for space my_space to user my_user
user_revoke_rights Revoke write rights for space my_space from user my_user
role_grant_rights Grant execute rights for function my_func to role my_role
role_revoke_rights Revoke write rights for space my_space from role my_role
password_change Change password for username
access_denied access_type access to object_type object_name denied

Note

You cannot specify a filter twice or specify a filter that does not exist; doing so results in a configuration error. The default value of the box.cfg.audit_filter option is compatibility, which enables logging of all events available before 2.10.0.

Customize your filters

You can customize the filters and use different combinations of filters for your purposes.

Filter based on a specific event

You can set only certain events that you need to record.

For example, you can select password_change to monitor the users who have changed their passwords.

Filter based on a specific group

You can set one of the groups of events that you need to record.

For example, you can select compatibility to monitor only events of user authorization, granted privileges, disconnection, user password change, and denied access.

Filter based on multiple groups

You can specify multiple groups depending on the purpose.

For example, you can select auth and priv to see only events related to authorization and granted privileges.

Filter based on a group and a specific event

You can specify a group and a certain event depending on the purpose.

For example, you can select priv and disconnect to see only events related to granted privileges and disconnect events.
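This last combination can be set in a single call, for example (a sketch; the file name is illustrative):

```lua
-- Log only privilege-related events and disconnect events
box.cfg{audit_log = 'audit.log', audit_filter = 'priv,disconnect'}
```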

Example

Run the following commands to set up filtering:

local audit = require('audit')

box.cfg{audit_log = 'audit.log', audit_filter = 'custom,user_create', audit_format = 'csv'}
-- The audit module writes this event because the user_create filter is set
box.schema.user.create('alice')
-- The audit module does not write this event because no filter is set for user_drop
box.schema.user.drop('alice')

Use event groups

You can simplify working with audit log events by using the built-in groups in Tarantool. For example, you can record only events related to enabling the audit log, or only events related to a space.

Select one or more of the available groups to record the events you need.

Warning

Be careful when selecting the all and data operations groups. The more events you record, the slower requests will be processed. It is recommended to select only those groups whose events your company really needs to monitor and analyze.

Use API to create user-defined events

In addition, Tarantool provides you with an API that allows you to write user-defined audit log events.

For this, use the audit.log() function, which accepts either a message string (optionally with format arguments) or a table of audit log field values, as shown in the examples below.

Using the audit.new() function, you can create a new log module that allows you to avoid passing all custom audit log fields each time audit.log() is called. It takes a table of audit log field values (the same as audit.log()).

Example

local audit = require('audit')

local my_audit = audit.new({type = 'custom_hello', module = 'my_module'})
my_audit:log('Hello, Alice!')
my_audit:log({tag = 'admin', description = 'Hello, Bob!'})

-- is equivalent to

audit.log({type = 'custom_hello', module = 'my_module',
           description = 'Hello, Alice!'})
audit.log({type = 'custom_hello', module = 'my_module',
           tag = 'admin', description = 'Hello, Bob!'})

Some user-defined audit log fields (time, remote, session_type) are set in the same way as for a system event. If a field is not overwritten, it is set to the same value as for a system event.

Some audit log fields you can overwrite with audit.new() and audit.log():

  • type

  • user

  • module

  • tag

  • description

    Note

To avoid confusion with system events, the value of the type field must either be message (the default) or begin with custom_. Otherwise, you will get an error. User-defined events are filtered out by default; to enable them, add custom to box.cfg.audit_filter.

Example

local audit = require('audit')

box.cfg{audit_log = 'audit.log', audit_filter = 'custom', audit_format = 'csv'}

audit.log('Hello, Alice!')
audit.log('Hello, %s!', 'Bob')
audit.log({type = 'custom_hello', description = 'Hello, Eve!'})
audit.log({type = 'custom_farewell', user = 'eve', module = 'custom', description = 'Farewell, Eve!'})

local my_audit = audit.new({module = 'my_module', tag = 'default'})
my_audit:log({description = 'Message 1'})
my_audit:log({description = 'Message 2', tag = 'my_tag'})
my_audit:log({description = 'Message 3', module = 'other_module'})

Use read commands

To read the audit log events in the form you need, use the appropriate read commands.

Tips

How many events can be recorded?

If you write to a file, the audit log size is limited by the available disk space. If you write to the system logger, it is limited by the system logger. If you write to a pipe, it is limited by the system buffer when audit_nonblock = false; when audit_nonblock = true, there is no limit. However, it is not recommended to use up all available memory, as this may degrade performance and even lose some logs.

How often should audit logs be reviewed?

Consider setting up a schedule in your company. It is recommended to review audit logs at least every 3 months.

How long should audit logs be stored?

It is recommended to store audit logs for at least one year.

What is the best way to process audit logs?

It is recommended to use SIEM systems for this purpose.

Appendix B. Useful Tarantool parameters

For details, see the reference on the `box` module in the main Tarantool documentation.

Tuple compression performance

Learn about tuple compression to save memory.

Below are the results of a synthetic test that illustrate how tuple compression impacts performance. The test was carried out on a simple Tarantool space containing 1,000,000 tuples, each having a field with a sample JSON document of roughly 1,000 bytes. The test compared the speed of select and replace operations on uncompressed and compressed data, as well as the overall size of the space. Performance is measured in requests per second.

Compression type select, RPS replace, RPS Space size, bytes
None 1609.04k 703.861k 997,394,386
zstd 104.654k 15.0687k 596,784,864
lz4 625.452k 260.766k 866,504,560
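The space savings follow directly from the sizes in the table; a plain Lua sketch of the arithmetic (numbers copied from the table above):

```lua
local uncompressed = 997394386
local sizes = {zstd = 596784864, lz4 = 866504560}
for name, size in pairs(sizes) do
    -- savings = 1 - compressed / uncompressed
    print(string.format('%s saves %.1f%% of space', name, (1 - size / uncompressed) * 100))
end
-- zstd saves ~40.2%, lz4 saves ~13.1%
```

In short, zstd trades much lower request rates for the best compression ratio, while lz4 keeps throughput closer to the uncompressed case at a smaller saving.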

Appendix C. Monitoring system metrics

Option Description SNMP type Units of measure Threshold
Version Tarantool version DisplayString    
IsAlive instance availability indicator Integer (listing)   0 - unavailable, 1 - available
MemoryLua storage space used by Lua Gauge32 Mbyte 900
MemoryData storage space used for storing data Gauge32 Mbyte set the value manually
MemoryNet storage space used for network I/O Gauge32 Mbyte 1024
MemoryIndex storage space used for storing indexes Gauge32 Mbyte set the value manually
MemoryCache storage space used for storing caches (for vinyl engine only) Gauge32 Mbyte  
ReplicationLag lag time since the last sync between the master and a replica (the maximum value if there are multiple fibers) Integer32 sec. 5
FiberCount number of fibers Gauge32 pc. 1000
CurrentTime current time, in seconds, since January 1, 1970 Unsigned32 Unix timestamp, in sec.  
StorageStatus status of a replica set Integer listing > 1
StorageAlerts number of alerts for storage nodes Gauge32 pc. >= 1
StorageTotalBkts total number of buckets in the storage Gauge32 pc. < 0
StorageActiveBkts number of buckets in the ACTIVE state Gauge32 pc. < 0
StorageGarbageBkts number of buckets in the GARBAGE state Gauge32 pc. < 0
StorageReceivingBkts number of buckets in the RECEIVING state Gauge32 pc. < 0
StorageSendingBkts number of buckets in the SENDING state Gauge32 pc. < 0
RouterStatus status of the router Integer listing > 1
RouterAlerts number of alerts for the router Gauge32 pc. >= 1
RouterKnownBkts number of buckets within the known destination replica sets Gauge32 pc. < 0
RouterUnknownBkts number of buckets that are unknown to the router Gauge32 pc. < 0
RequestCount total number of requests Counter64 pc.  
InsertCount total number of insert requests Counter64 pc.  
DeleteCount total number of delete requests Counter64 pc.  
ReplaceCount total number of replace requests Counter64 pc.  
UpdateCount total number of update requests Counter64 pc.  
SelectCount total number of select requests Counter64 pc.  
EvalCount number of calls made via Eval Counter64 pc.  
CallCount number of calls made via call Counter64 pc.  
ErrorCount number of errors in Tarantool Counter64 pc.  
AuthCount number of completed authentication operations Counter64 pc.  

Appendix D. Deprecated features

ZooKeeper and the orchestrator are no longer supported. However, they can still be used if necessary.

The following sections describe the corresponding functionality.

Controlling the cluster via API

To control the cluster, use the orchestrator included in the delivery package. The orchestrator uses ZooKeeper to store and distribute the configuration and provides a REST API for controlling the cluster. Configurations in ZooKeeper are changed as a result of calling the orchestrator's API functions, which in turn leads to changes in the configurations of the Tarantool nodes.

We recommend using the curl command-line utility to call the orchestrator's API functions.

The following example shows how to register a new availability zone (DC):

$ curl -X POST http://HOST:PORT/api/v1/zone \
    -d '{
  "name": "Caucasian Boulevard"
  }'

To check whether the DC registration was successful, try the following instruction. It retrieves the list of all registered nodes in the JSON format:

$ curl http://HOST:PORT/api/v1/zone | python -m json.tool

To apply the new configuration directly on the Tarantool nodes, increase the configuration version number after calling the API function. To do this, use the POST request to /api/v1/version:

$ curl -X POST http://HOST:PORT/api/v1/version

Altogether, to update the cluster configuration:

  1. Call the POST/PUT method of the orchestrator. As a result, the ZooKeeper nodes are updated, and a subsequent update of the Tarantool nodes is initiated.
  2. Update the configuration version using the POST request to /api/v1/version. As a result, the configuration is applied to the Tarantool nodes.

See Appendix E for the detailed orchestrator API.

Setting up geo redundancy

Logically, cluster nodes can belong to some availability zone. Physically, an availability zone is a separate DC, or a rack inside a DC. You can specify a matrix of weights (distances) for the availability zones.

New zones are added by calling a corresponding API method of the orchestrator.

By default, the matrix of weights (distances) for the zones is not configured.

When you define a matrix of weights (distances) by calling /api/v1/zones/weights, the automatic scale-out system of the Tarantool DBMS finds a replica which is the closest to the specified router in terms of weights, and starts using this replica for reading. If this replica is not available, then the next nearest replica is selected, taking into account the distances specified in the configuration.

Appendix E. Orchestrator API reference

Configuring the zones

POST /api/v1/zone

Create a new zone.

Request

{
    "name": "zone 1"
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "id": 2,
        "name": "zone 2"
    },
    "status": true
}

Potential errors

  • zone_exists - the specified zone already exists
GET /api/v1/zone/{zone_id: optional}

Return information on the specified zone or on all the zones.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": [
        {
            "id": 1,
            "name": "zone 11"
        },
        {
            "id": 2,
            "name": "zone 2"
        }
    ],
    "status": true
}

Potential errors

  • zone_not_found - the specified zone is not found
PUT /api/v1/zone/{zone_id}

Update information on the zone.

Body

{
    "name": "zone 22"
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • zone_not_found - the specified zone is not found
DELETE /api/v1/zone/{zone_id}

Delete a zone if it doesn’t store any nodes.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • zone_not_found - the specified zone is not found
  • zone_in_use - the specified zone stores at least one node

Configuring the zone weights

POST /api/v1/zones/weights

Set the zone weights configuration.

Body

{
    "weights": {
        "1": {
            "2": 10,
            "3": 11
        },
        "2": {
            "1": 10,
            "3": 12
        },
        "3": {
            "1": 11,
            "2": 12
        }
    }
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • zones_weights_error - configuration error
GET /api/v1/zones/weights

Return the zone weights configuration.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "1": {
            "2": 10,
            "3": 11
        },
        "2": {
            "1": 10,
            "3": 12
        },
        "3": {
            "1": 11,
            "2": 12
        }
    },
    "status": true
}

Potential errors

  • zone_not_found - the specified zone is not found

Configuring registry

GET /api/v1/registry/nodes/new

Return all the detected nodes.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": [
        {
            "uuid": "uuid-2",
            "hostname": "tnt2.public.i",
            "name": "tnt2"
        }
    ],
    "status": true
}
POST /api/v1/registry/node

Register the detected node.

Body

{
    "zone_id": 1,
    "uuid": "uuid-2",
    "uri": "tnt2.public.i:3301",
    "user": "user1:pass1",
    "repl_user": "repl_user1:repl_pass1",
    "cfg": {
        "listen": "0.0.0.0:3301"
    }
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • node_already_registered - the specified node is already registered
  • zone_not_found - the specified zone is not found
  • node_not_discovered - the specified node is not detected
PUT /api/v1/registry/node/{node_uuid}

Update the registered node parameters.

Body

Pass only those parameters that need to be updated.

{
    "zone_id": 1,
    "repl_user": "repl_user2:repl_pass2",
    "cfg": {
        "listen": "0.0.0.0:3301",
        "memtx_memory": 100000
    }
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • node_not_registered - the specified node is not registered
GET /api/v1/registry/node/{node_uuid: optional}

Return information on the nodes in a cluster. If node_uuid is passed, information on this node only is returned.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "uuid-1": {
            "user": "user1:pass1",
            "hostname": "tnt1.public.i",
            "repl_user": "repl_user2:repl_pass2",
            "uri": "tnt1.public.i:3301",
            "zone_id": 1,
            "name": "tnt1",
            "cfg": {
                "listen": "0.0.0.0:3301",
                "memtx_memory": 100000
            },
            "zone": 1
        },
        "uuid-2": {
            "user": "user1:pass1",
            "hostname": "tnt2.public.i",
            "name": "tnt2",
            "uri": "tnt2.public.i:3301",
            "repl_user": "repl_user1:repl_pass1",
            "cfg": {
                "listen": "0.0.0.0:3301"
            },
            "zone": 1
        }
    },
    "status": true
}

Potential errors

  • node_not_registered - the specified node is not registered
DELETE /api/v1/registry/node/{node_uuid}

Delete the node if it doesn’t belong to any replica set.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • node_not_registered - the specified node is not registered
  • node_in_use - the specified node is in use by a replica set

Routers API

GET /api/v1/routers

Return the list of all nodes that constitute the router.

Response

{
    "data": [
        "uuid-1"
    ],
    "status": true,
    "error": {
        "code": 0,
        "message": "ok"
    }
}
POST /api/v1/routers

Assign the router role to the node.

Body

{
    "uuid": "uuid-1"
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • node_not_registered - the specified node is not registered
DELETE /api/v1/routers/{uuid}

Release the router role from the node.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Configuring replica sets

POST /api/v1/replicaset

Create a replica set containing all the registered nodes.

Body

{
    "uuid": "optional-uuid",
    "replicaset": [
        {
            "uuid": "uuid-1",
            "master": true
        }
    ]
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "replicaset_uuid": "cc6568a2-63ca-413d-8e39-704b20adb7ae"
    },
    "status": true
}

Potential errors

  • replicaset_exists – the specified replica set already exists
  • replicaset_empty – the specified replica set doesn’t contain any nodes
  • node_not_registered – the specified node is not registered
  • node_in_use – the specified node is in use by another replica set
PUT /api/v1/replicaset/{replicaset_uuid}

Update the replica set parameters.

Body

{
    "replicaset": [
        {
            "uuid": "uuid-1",
            "master": true
        },
        {
            "uuid": "uuid-2",
            "master": false,
            "off": true
        }
    ]
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • replicaset_empty – the specified replica set doesn’t contain any nodes
  • replicaset_not_found – the specified replica set is not found
  • node_not_registered – the specified node is not registered
  • node_in_use – the specified node is in use by another replica set
GET /api/v1/replicaset/{replicaset_uuid: optional}

Return information on all the cluster components. If replicaset_uuid is passed, information on this replica set only is returned.

Body

{
    "name": "zone 22"
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "cc6568a2-63ca-413d-8e39-704b20adb7ae": {
            "uuid-1": {
                "hostname": "tnt1.public.i",
                "off": false,
                "repl_user": "repl_user2:repl_pass2",
                "uri": "tnt1.public.i:3301",
                "master": true,
                "name": "tnt1",
                "user": "user1:pass1",
                "zone_id": 1,
                "zone": 1
            },
            "uuid-2": {
                "hostname": "tnt2.public.i",
                "off": true,
                "repl_user": "repl_user1:repl_pass1",
                "uri": "tnt2.public.i:3301",
                "master": false,
                "name": "tnt2",
                "user": "user1:pass1",
                "zone": 1
            }
        }
    },
    "status": true
}

Potential errors

  • replicaset_not_found – the specified replica set is not found
DELETE /api/v1/replicaset/{replicaset_uuid}

Delete a replica set.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • replicaset_not_found - the specified replica set is not found
POST /api/v1/replicaset/{replicaset_uuid}/master

Switch the master in the replica set.

Body

{
    "instance_uuid": "uuid-1",
    "hostname_name": "hostname:instance_name"
}

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Potential errors

  • replicaset_not_found – the specified replica set is not found
  • node_not_registered – the specified node is not registered
  • node_not_in_replicaset – the specified node is not in the specified replica set
POST /api/v1/replicaset/{replicaset_uuid}/node

Add a node to the replica set.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {},
    "status": true
}

Body

{
    "instance_uuid": "uuid-1",
    "hostname_name": "hostname:instance_name",
    "master": false,
    "off": false
}

Potential errors

GET /api/v1/replicaset/status

Return statistics on the cluster.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "data": {
        "cluster": {
            "routers": [
                {
                    "zone": 1,
                    "name": "tnt1",
                    "repl_user": "repl_user1:repl_pass1",
                    "hostname": "tnt1.public.i",
                    "status": null,
                    "uri": "tnt1.public.i:3301",
                    "user": "user1:pass1",
                    "uuid": "uuid-1",
                    "total_rps": null
                }
            ],
            "storages": [
                {
                    "hostname": "tnt1.public.i",
                    "repl_user": "repl_user2:repl_pass2",
                    "uri": "tnt1.public.i:3301",
                    "name": "tnt1",
                    "total_rps": null,
"status": "online",
                    "replicas": [
                        {
                            "user": "user1:pass1",
                            "hostname": "tnt2.public.i",
                            "replication_info": null,
                            "repl_user": "repl_user1:repl_pass1",
                            "uri": "tnt2.public.i:3301",
                            "uuid": "uuid-2",
"status": "online",
                            "name": "tnt2",
                            "total_rps": null,
                            "zone": 1
                        }
                    ],
                    "user": "user1:pass1",
                    "zone_id": 1,
                    "uuid": "uuid-1",
                    "replicaset_uuid": "cc6568a2-63ca-413d-8e39-704b20adb7ae",
                    "zone": 1
                }
            ]
        }
    },
    "status": true
}

Potential errors

  • zone_not_found - the specified zone is not found
  • zone_in_use - the specified zone stores at least one node

Setting up configuration versions

POST /api/v1/version

Set the configuration version.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {
        "version": 2
    }
}

Potential errors

  • cfg_error - configuration error
GET /api/v1/version

Return the configuration version.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {
        "version": 2
    }
}

Configuring sharding

POST /api/v1/sharding/cfg

Add a new sharding configuration.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {}
}
GET /api/v1/sharding/cfg

Return the current sharding configuration.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {}
}

Resetting cluster configuration

POST /api/v1/clean/cfg

Reset the cluster configuration.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {}
}
POST /api/v1/clean/all

Reset the cluster configuration and delete information on the cluster nodes from the ZooKeeper catalogues.

Response

{
    "error": {
        "code": 0,
        "message": "ok"
    },
    "status": true,
    "data": {}
}