Configuration reference

Note

Starting with version 3.0, the recommended way to configure Tarantool is with a configuration file. Configuring Tarantool in code is considered a legacy approach.

Basic parameters

sharding
weights
shard_index
bucket_count
collect_bucket_garbage_interval
collect_lua_garbage
sync_timeout
rebalancer_disbalance_threshold
rebalancer_max_receiving
rebalancer_max_sending
discovery_mode
sched_move_quota
sched_ref_quota

sharding¶: A field defining the logical topology of the sharded Tarantool cluster.

Type: table

Default: false

Dynamic: yes

weights¶: A field defining the configuration of relative weights for each zone pair in a replica set.

Type: table

Default: false

Dynamic: yes

shard_index¶: Name or id of a TREE index over the bucket id. Spaces without this index do not participate in a sharded Tarantool cluster and can be used as regular spaces if needed. It is necessary to specify the first part of the index, other parts are optional.

Type: non-empty string or non-negative integer

Default: “bucket_id”

Dynamic: no
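
As an illustration of what shard_index refers to, the following is a minimal sketch of a space that carries a bucket id index under the default name. It assumes a running Tarantool instance where box.cfg{} has already been called; the space name customer and its fields are hypothetical and chosen only for the example.

```lua
-- Hypothetical space; only the 'bucket_id' index matters for sharding.
local customer = box.schema.space.create('customer', {if_not_exists = true})
customer:format({
    {name = 'customer_id', type = 'unsigned'},
    {name = 'bucket_id', type = 'unsigned'},
    {name = 'name', type = 'string'},
})
-- Primary key of the space.
customer:create_index('customer_id', {
    parts = {'customer_id'},
    if_not_exists = true,
})
-- Non-unique TREE index whose first part is the bucket id.
-- Its name matches the default shard_index value, "bucket_id".
customer:create_index('bucket_id', {
    type = 'TREE',
    parts = {'bucket_id'},
    unique = false,
    if_not_exists = true,
})
```

A space created without such an index stays a regular, non-sharded space.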

bucket_count¶

The total number of buckets in a cluster.

This number should be several orders of magnitude larger than the potential number of cluster nodes, considering potential scaling out in the foreseeable future.

Example:

If the estimated number of nodes is M, then the data set should be divided into 100 * M or even 1000 * M buckets, depending on the planned scaling out. Either value is well above the potential number of cluster nodes in the system being designed.

Keep in mind that too many buckets require more memory to store routing information. On the other hand, an insufficient number of buckets can lead to decreased granularity when rebalancing.

Type: number
Default: 3000
Dynamic: no
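
The sizing rule above can be sketched as a small calculation. The helper name suggest_bucket_count and its factor argument are hypothetical and are not part of vshard; the function only multiplies the estimated node count M by a factor in the 100..1000 range mentioned in the example.

```lua
-- Hypothetical helper, not a vshard function.
-- estimated_nodes: the expected number of cluster nodes (M).
-- factor: buckets per node, 100 by default, up to 1000.
local function suggest_bucket_count(estimated_nodes, factor)
    factor = factor or 100
    assert(estimated_nodes > 0, 'estimated_nodes must be positive')
    assert(factor >= 100 and factor <= 1000, 'factor must be within 100..1000')
    return estimated_nodes * factor
end

print(suggest_bucket_count(30))        -- 100 buckets per node
print(suggest_bucket_count(30, 1000))  -- 1000 buckets per node
```

Because bucket_count is not dynamic, it is worth choosing the factor with future scaling in mind before the cluster is bootstrapped.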

collect_bucket_garbage_interval¶

Deprecated since: 0.1.17.

The interval between garbage collector actions, in seconds.

Type: number
Default: 0.5
Dynamic: yes

collect_lua_garbage¶

Deprecated since: 0.1.20.

If set to true, the Lua collectgarbage() function is called periodically.

Type: boolean
Default: false
Dynamic: yes

sync_timeout¶: Timeout to wait for synchronization of the old master with replicas before demotion. Used when switching a master or when manually calling the sync() function.

Type: number

Default: 1

Dynamic: yes

rebalancer_disbalance_threshold¶

A maximum bucket disbalance threshold, in percent. The disbalance is calculated for each replica set using the following formula:

|etalon_bucket_count - real_bucket_count| / etalon_bucket_count * 100

Type: number
Default: 1
Dynamic: yes
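
The formula can be checked with a short calculation. The function names below are hypothetical; the comparison against the threshold is an assumption about how the value is used and is not taken from the vshard source.

```lua
-- Hypothetical helper implementing the formula above.
local function disbalance_percent(etalon_bucket_count, real_bucket_count)
    return math.abs(etalon_bucket_count - real_bucket_count)
        / etalon_bucket_count * 100
end

-- Assumption: a replica set is considered disbalanced when its
-- disbalance exceeds rebalancer_disbalance_threshold.
local function exceeds_threshold(etalon_bucket_count, real_bucket_count, threshold)
    return disbalance_percent(etalon_bucket_count, real_bucket_count) > threshold
end

-- A replica set holding 334 buckets when the etalon is 250.
print(string.format('%.1f', disbalance_percent(250, 334)))
print(exceeds_threshold(250, 334, 1))
```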

rebalancer_max_receiving¶

The maximum number of buckets that can be received in parallel by a single replica set. This number must be limited, because when a new replica set is added to a cluster, the rebalancer sends a very large number of buckets from the existing replica sets to the new replica set. This produces a heavy load on the new replica set.

Example:

Suppose rebalancer_max_receiving is equal to 100 and bucket_count is equal to 1000. There are 3 replica sets holding 333, 333, and 334 buckets respectively. When a new replica set is added, each replica set’s etalon_bucket_count becomes equal to 250. Rather than receiving all 250 buckets at once, the new replica set receives 100, 100, and 50 buckets sequentially.

Type: number
Default: 100
Dynamic: yes
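
The batching in the example above can be reproduced with a short simulation. It is only an arithmetic sketch: the function receive_batches is hypothetical and does not reflect how the rebalancer is implemented internally.

```lua
-- Hypothetical simulation: split the buckets a new replica set must
-- receive into sequential batches of at most max_receiving buckets.
local function receive_batches(total_to_receive, max_receiving)
    assert(max_receiving > 0, 'max_receiving must be positive')
    local batches = {}
    local remaining = total_to_receive
    while remaining > 0 do
        local batch = math.min(remaining, max_receiving)
        table.insert(batches, batch)
        remaining = remaining - batch
    end
    return batches
end

-- etalon_bucket_count = 250, rebalancer_max_receiving = 100
print(table.concat(receive_batches(250, 100), ', '))
```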

rebalancer_max_sending¶

The degree of parallelism for parallel rebalancing.

Works for storages only, ignored for routers.

The maximum value is 15.

Type: number
Default: 1
Dynamic: yes

discovery_mode¶: A mode of a bucket discovery fiber: on/off/once. See details below.

Type: string

Default: ‘on’

Dynamic: yes

sched_move_quota¶

A scheduler’s bucket move quota used by the rebalancer.

sched_move_quota defines how many bucket moves can be done in a row if there are pending storage refs. Then, bucket moves are blocked and a router continues making map-reduce requests.

Replica set parameters

uuid¶: A unique identifier of a replica set.

Type:

Default:

Dynamic:

weight¶: A weight of a replica set. See the Replica set weights section for details.

Type:

Default: 1

Dynamic:

master¶

Turns on automated master discovery in a replica set if set to auto. Applicable only to the configuration of a router; the storage configuration ignores this parameter.

The parameter should be specified per replica set. The configuration is not compatible with a manual master selection.

Examples

Correct configuration:

config = {
    sharding = {
        <replicaset uuid> = {
            master = 'auto',
            replicas = {...},
        },
        ...
    },
    ...
}

Incorrect configuration:

config = {
    sharding = {
        <replicaset uuid> = {
            master = 'auto',
            replicas = {
                <replica uuid1> = {
                    master = true,
                    ...
                },
                <replica uuid2> = {
                    master = false,
                    ...
                },
            },
        },
        ...
    },
    ...
}

If the configuration is incorrect, it is not applied, and the vshard.router.cfg() call throws an error.

If the master parameter is set to auto for some replica sets, the router connects to these replica sets, discovers the master in each of them, and periodically checks whether the master instance still has its master status. When the master in a replica set stops being a master, the router polls all the nodes of the replica set to find out which one is the new master.

Without this setting, the router cannot detect master nodes in the configured replica sets on its own. It relies only on how they are specified in the configuration. This becomes a problem when the master changes and the change is not delivered to the router’s configuration: for instance, when the router doesn’t rely on a central configuration provider, or when the provider cannot deliver a new configuration for some reason.

Type: string
Default: nil
Dynamic: yes
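
The incompatibility described above can be caught before the configuration is passed to vshard.router.cfg(). The following is a minimal sketch of such a pre-check. The function check_master_settings and the identifiers in the sample tables are hypothetical, and the check only flags replicas with master = true; the validation that vshard itself performs may be stricter.

```lua
-- Hypothetical pre-check, not part of vshard.
-- Returns true, or false plus a message, for a 'sharding' table.
local function check_master_settings(sharding)
    for rs_uuid, replicaset in pairs(sharding) do
        if replicaset.master == 'auto' then
            for replica_uuid, replica in pairs(replicaset.replicas or {}) do
                if replica.master == true then
                    return false, string.format(
                        "replica set %s uses master = 'auto', " ..
                        "but replica %s is marked as master",
                        rs_uuid, replica_uuid)
                end
            end
        end
    end
    return true
end

-- Placeholder identifiers; real configurations use UUIDs.
local correct = {
    ['replicaset-a'] = {
        master = 'auto',
        replicas = {['replica-1'] = {}, ['replica-2'] = {}},
    },
}
local incorrect = {
    ['replicaset-a'] = {
        master = 'auto',
        replicas = {
            ['replica-1'] = {master = true},
            ['replica-2'] = {master = false},
        },
    },
}

print(check_master_settings(correct))
print(check_master_settings(incorrect))
```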
