Metrics reference
This page provides a detailed description of metrics from the metrics module.
General instance information:
tnt_cfg_current_time |
Instance system time in the Unix timestamp format |
tnt_info_uptime |
Time in seconds since the instance has started |
tnt_read_only |
Indicates if the instance is in read-only mode (1 if true, 0 if false) |
The following metrics provide a picture of memory usage by the Tarantool process.
tnt_info_memory_cache |
Number of bytes in the cache used to store tuples with the vinyl storage engine. |
tnt_info_memory_data |
Number of bytes used to store user data (tuples) with the memtx engine and with level 0 of the vinyl engine, without regard for memory fragmentation. |
tnt_info_memory_index |
Number of bytes used for indexing user data. Includes memtx and vinyl memory tree extents, the vinyl page index, and the vinyl bloom filters. |
tnt_info_memory_lua |
Number of bytes used for the Lua runtime. The Lua memory is limited to 2 GB per instance. Monitoring this metric can prevent memory overflow. |
tnt_info_memory_net |
Number of bytes used for network input/output buffers. |
tnt_info_memory_tx |
Number of bytes in use by active transactions.
For the vinyl storage engine,
this is the total size of all allocated objects
(struct txv, struct vy_tx, struct vy_read_interval)
and tuples pinned for those objects. |
The following metrics provide a memory usage report for the slab allocator, which is the main allocator used to store tuples. They help monitor the total memory usage and memory fragmentation. To learn more about use cases, refer to the box.slab submodule documentation.
Available memory, bytes:
tnt_slab_quota_size |
Amount of memory available to store tuples and indexes.
Equal to memtx_memory. |
tnt_slab_arena_size |
Total memory available to store both tuples and indexes. Includes allocated but currently free slabs. |
tnt_slab_items_size |
Total amount of memory available to store only tuples and not indexes. Includes allocated but currently free slabs. |
Memory usage, bytes:
tnt_slab_quota_used |
The amount of memory that is already reserved by the slab allocator. |
tnt_slab_arena_used |
The effective memory used to store both tuples and indexes. Disregards allocated but currently free slabs. |
tnt_slab_items_used |
The effective memory used to store only tuples and not indexes. Disregards allocated but currently free slabs. |
Memory utilization, %:
tnt_slab_quota_used_ratio |
tnt_slab_quota_used / tnt_slab_quota_size |
tnt_slab_arena_used_ratio |
tnt_slab_arena_used / tnt_slab_arena_size |
tnt_slab_items_used_ratio |
tnt_slab_items_used / tnt_slab_items_size |
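As a rough illustration of the ratios above, the following Python sketch derives the utilization values from raw byte counts. The function name `slab_ratios` and the sample numbers are hypothetical, and the result is scaled to percent on the assumption that the "%" in the heading applies to the reported value.

```python
def slab_ratios(sample):
    """Return utilization percentages for the quota, arena, and items groups.

    `sample` maps metric names to byte values. A group is skipped when its
    `used` value is missing or its `size` is missing or zero, which also
    avoids division by zero.
    """
    ratios = {}
    for group in ("quota", "arena", "items"):
        used = sample.get(f"tnt_slab_{group}_used")
        size = sample.get(f"tnt_slab_{group}_size")
        if used is None or not size:
            continue
        ratios[f"tnt_slab_{group}_used_ratio"] = 100.0 * used / size
    return ratios


# Hypothetical values, in bytes (the items group is deliberately absent).
example = {
    "tnt_slab_quota_used": 64 * 1024 * 1024,
    "tnt_slab_quota_size": 256 * 1024 * 1024,
    "tnt_slab_arena_used": 30 * 1024 * 1024,
    "tnt_slab_arena_size": 60 * 1024 * 1024,
}
print(slab_ratios(example))
```

Comparing a value computed this way with the exported ratio metric is a quick sanity check on a dashboard query.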
The following metrics provide specific information about each individual space in a Tarantool instance.
tnt_space_len |
Number of records in the space.
This metric always has 2 labels: {name="test", engine="memtx"},
where name is the name of the space and
engine is the engine of the space. |
tnt_space_bsize |
Total number of bytes in all tuples.
This metric always has 2 labels: {name="test", engine="memtx"},
where name is the name of the space
and engine is the engine of the space. |
tnt_space_index_bsize |
Total number of bytes taken by the index.
This metric always has 2 labels: {name="test", index_name="pk"},
where name is the name of the space and
index_name is the name of the index. |
tnt_space_total_bsize |
Total size of tuples and all indexes in the space.
This metric always has 2 labels: {name="test", engine="memtx"},
where name is the name of the space and
engine is the engine of the space. |
tnt_vinyl_tuples |
Total tuple count for vinyl.
This metric always has 2 labels: {name="test", engine="vinyl"},
where name is the name of the space and
engine is the engine of the space. For vinyl, this metric is disabled
by default and can be enabled only by setting a global variable:
rawset(_G, 'include_vinyl_count', true). |
Network activity stats. These metrics can be used to monitor network load, usage peaks, and traffic drops.
Sent bytes:
tnt_net_sent_total |
Bytes sent from the instance over the network since the instance’s start time |
Received bytes:
tnt_net_received_total |
Bytes received by the instance since start time |
Connections:
tnt_net_connections_total |
Number of incoming network connections since the instance’s start time |
tnt_net_connections_current |
Number of active network connections |
Requests:
tnt_net_requests_total |
Number of network requests the instance has handled since its start time |
tnt_net_requests_current |
Number of pending network requests |
Requests in progress:
tnt_net_requests_in_progress_total |
Total count of requests processed by the tx thread |
tnt_net_requests_in_progress_current |
Count of requests currently being processed in the tx thread |
Requests placed in queues of streams:
tnt_net_requests_in_stream_total |
Total count of requests that have ever been placed in stream queues |
tnt_net_requests_in_stream_current |
Count of requests currently waiting in queues of streams |
Since Tarantool 2.10, each network metric has the label thread, showing per-thread network statistics.
The following metrics provide statistics for fibers. If your application creates a lot of fibers, you can use them to monitor fiber count and memory usage.
tnt_fiber_amount |
Number of fibers |
tnt_fiber_csw |
Overall number of fiber context switches |
tnt_fiber_memalloc |
Amount of memory reserved for fibers |
tnt_fiber_memused |
Amount of memory used by fibers |
You can collect iproto requests an instance has processed and aggregate them by request type. This may help you find out what operations your clients perform most often.
tnt_stats_op_total |
Total number of calls since server start |
To distinguish between request types, this metric has the operation label.
For example, it can look as follows: {operation="select"}.
For the possible request types, check the table below.
auth |
Authentication requests |
call |
Requests to execute stored procedures |
delete |
Delete calls |
error |
Requests that resulted in an error |
eval |
Calls to evaluate Lua code |
execute |
Execute SQL calls |
insert |
Insert calls |
prepare |
SQL prepare calls |
replace |
Replace calls |
select |
Select calls |
update |
Update calls |
upsert |
Upsert calls |
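To see which operations clients perform most often, the per-operation counters can be summed by label. The sketch below parses a few lines in the Prometheus text format. The scrape content is hypothetical, and the hand-rolled regular expressions handle only simple label sets, so treat it as an illustration rather than a full parser.

```python
import re
from collections import Counter

# Matches lines such as: tnt_stats_op_total{operation="select"} 42
SAMPLE_RE = re.compile(r'^tnt_stats_op_total\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def ops_by_type(text):
    """Sum tnt_stats_op_total values per `operation` label."""
    totals = Counter()
    for line in text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments, blank lines, other metrics
        labels = dict(LABEL_RE.findall(match.group(1)))
        totals[labels.get("operation", "unknown")] += float(match.group(2))
    return totals


# Hypothetical scrape output.
scrape = """
# TYPE tnt_stats_op_total counter
tnt_stats_op_total{operation="select"} 120
tnt_stats_op_total{operation="insert"} 30
tnt_stats_op_total{operation="error"} 2
"""
totals = ops_by_type(scrape)
print(totals.most_common(1))
```

Because the metric is a counter since server start, a monitoring system would normally look at its rate of change rather than the absolute totals shown here.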
The following metrics provide the current replication status. Learn more about replication in Tarantool.
tnt_info_lsn |
LSN of the instance. |
tnt_info_vclock |
LSN number in vclock.
This metric always has the label {id="id"},
where id is the instance’s number in the replica set. |
tnt_replication_lsn |
LSN of the Tarantool instance.
This metric always has labels {id="id", type="type"}, where
id is the instance’s number in the replica set,
type is master or replica. |
tnt_replication_lag |
Replication lag value in seconds.
This metric always has labels {id="id", stream="stream"},
where id is the instance’s number in the replica set,
stream is downstream or upstream. |
tnt_replication_status |
This metric equals 1 when replication status is “follow” and 0 otherwise.
This metric always has labels {id="id", stream="stream"},
where id is the instance’s number in the replica set,
stream is downstream or upstream. |
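The status and lag metrics are typically read together. The following sketch flags replication links that are not in the “follow” state or whose lag exceeds a threshold. The function name, the tuple layout of `samples`, and the default `max_lag` of one second are assumptions made for illustration, not recommended values.

```python
def unhealthy_replication(samples, max_lag=1.0):
    """Return (id, stream, reason) tuples for links that look unhealthy.

    `samples` is a list of (metric_name, labels, value) tuples.
    `max_lag` is an arbitrary threshold in seconds.
    """
    problems = []
    for name, labels, value in samples:
        key = (labels.get("id"), labels.get("stream"))
        if name == "tnt_replication_status" and value != 1:
            problems.append(key + ("not in follow state",))
        elif name == "tnt_replication_lag" and value > max_lag:
            problems.append(key + (f"lag {value}s",))
    return problems


# Hypothetical samples for a two-instance replica set.
samples = [
    ("tnt_replication_status", {"id": "1", "stream": "upstream"}, 1),
    ("tnt_replication_status", {"id": "2", "stream": "upstream"}, 0),
    ("tnt_replication_lag", {"id": "1", "stream": "upstream"}, 0.02),
    ("tnt_replication_lag", {"id": "2", "stream": "downstream"}, 3.5),
]
for problem in unhealthy_replication(samples):
    print(problem)
```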
tnt_runtime_lua |
Lua garbage collector size in bytes |
tnt_runtime_used |
Number of bytes used for the Lua runtime |
tnt_runtime_tuple |
Number of bytes used for the tuples (except tuples owned by memtx and vinyl) |
tnt_cartridge_issues |
Number of instance issues.
This metric always has the label
|
tnt_cartridge_cluster_issues |
Total number of instance issues across the cluster. |
tnt_clock_delta |
Clock drift across the cluster.
This metric always has the label
|
tnt_cartridge_failover_trigger_total |
Count of failover triggers in cluster. |
LuaJIT metrics provide an insight into the work of the Lua garbage collector. These metrics are available in Tarantool 2.6 and later.
General JIT metrics:
lj_jit_snap_restore_total |
Overall number of snap restores |
lj_jit_trace_num |
Number of JIT traces |
lj_jit_trace_abort_total |
Overall number of aborted traces |
lj_jit_mcode_size |
Total size of allocated machine code areas |
JIT strings:
lj_strhash_hit_total |
Number of strings being interned |
lj_strhash_miss_total |
Total number of string allocations |
GC steps:
lj_gc_steps_atomic_total |
Count of incremental GC steps (atomic state) |
lj_gc_steps_sweepstring_total |
Count of incremental GC steps (sweepstring state) |
lj_gc_steps_finalize_total |
Count of incremental GC steps (finalize state) |
lj_gc_steps_sweep_total |
Count of incremental GC steps (sweep state) |
lj_gc_steps_propagate_total |
Count of incremental GC steps (propagate state) |
lj_gc_steps_pause_total |
Count of incremental GC steps (pause state) |
Allocations:
lj_gc_strnum |
Number of allocated string objects |
lj_gc_tabnum |
Number of allocated table objects |
lj_gc_cdatanum |
Number of allocated cdata objects |
lj_gc_udatanum |
Number of allocated udata objects |
lj_gc_freed_total |
Total amount of freed memory |
lj_gc_memory |
Current allocated Lua memory |
lj_gc_allocated_total |
Total amount of allocated memory |
The following metrics provide CPU usage statistics. They are only available on Linux.
tnt_cpu_number |
Total number of processors configured by the operating system |
tnt_cpu_time |
Host CPU time |
tnt_cpu_thread |
Tarantool thread CPU time.
This metric always has the labels
|
There are also two cross-platform metrics, which can be obtained with a getrusage() call.
tnt_cpu_user_time |
Tarantool CPU user time |
tnt_cpu_system_time |
Tarantool CPU system time |
Vinyl metrics provide vinyl engine statistics.
The disk metrics are used to monitor overall data size on disk.
The vinyl regulator decides when to commence disk IO actions. It groups activities in batches so that they are more consistent and efficient.
tnt_vinyl_regulator_dump_bandwidth |
Estimated average dumping rate, bytes per second. The rate value is initially 10485760 (10 megabytes per second). It is recalculated depending on the actual rate. Only significant dumps that are larger than 1 MB are used for estimating. |
tnt_vinyl_regulator_write_rate |
Actual average rate of performing write operations, bytes per second. The rate is calculated as a 5-second moving average. If the metric value is gradually going down, this can indicate disk issues. |
tnt_vinyl_regulator_rate_limit |
Write rate limit, bytes per second.
The regulator imposes the limit on transactions
based on the observed dump/compaction performance.
If the metric value is down to approximately 10^5,
this indicates issues with the disk
or the scheduler. |
tnt_vinyl_regulator_dump_watermark |
Maximum amount of memory in bytes used for in-memory storing of a vinyl LSM tree. When this maximum is reached, a dump must occur. For details, see Filling an LSM tree. The value is slightly smaller than the amount of memory allocated for vinyl trees, reflected in the vinyl_memory parameter. |
tnt_vinyl_regulator_blocked_writers |
The number of fibers that are blocked waiting for Vinyl level0 memory quota. |
tnt_vinyl_tx_commit |
Counter of commits (successful transaction ends). Includes implicit commits: for example, any insert operation causes a commit unless it is within a box.begin()–box.commit() block. |
tnt_vinyl_tx_rollback |
Counter of rollbacks (unsuccessful transaction ends). This is not merely a count of explicit box.rollback() requests – it includes requests that ended with errors. |
tnt_vinyl_tx_conflict |
Counter of conflicts that caused transactions to roll back.
The ratio tnt_vinyl_tx_conflict / tnt_vinyl_tx_commit
above 5% indicates that vinyl is not healthy.
At that moment, you’ll probably see a lot of other problems with vinyl. |
tnt_vinyl_tx_read_views |
Current number of read views – that is, transactions
that entered the read-only state to avoid conflict temporarily.
Usually the value is 0.
If it stays non-zero for a long time, it is indicative of a memory leak. |
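The conflict-to-commit ratio mentioned above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the inputs are assumed to be counter values (or deltas of those counters over a time window, which usually gives a more current picture than lifetime totals).

```python
def conflict_ratio(conflicts, commits):
    """Return tnt_vinyl_tx_conflict / tnt_vinyl_tx_commit, or None without commits."""
    if commits <= 0:
        return None
    return conflicts / commits


def vinyl_tx_healthy(conflicts, commits, threshold=0.05):
    """Apply the 5% rule of thumb; no commits means there is nothing to judge."""
    ratio = conflict_ratio(conflicts, commits)
    return ratio is None or ratio <= threshold


# Hypothetical counter deltas over some observation window.
print(vinyl_tx_healthy(conflicts=3, commits=100))
print(vinyl_tx_healthy(conflicts=8, commits=100))
```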
The following metrics show the state of memory areas used by vinyl for caches and write buffers.
tnt_vinyl_memory_tuple_cache |
Amount of memory in bytes currently used to store tuples (data) |
tnt_vinyl_memory_level0 |
“Level 0” (L0) memory area, bytes.
L0 is the area that vinyl can use for in-memory storage of an LSM tree.
By monitoring this metric, you can see when L0 is getting close to its
maximum (tnt_vinyl_regulator_dump_watermark),
at which time a dump will occur.
You can expect L0 = 0 immediately after the dump operation is completed. |
tnt_vinyl_memory_page_index |
Amount of memory in bytes currently used to store indexes. If the metric value is close to vinyl_memory, this indicates that vinyl_page_size was chosen incorrectly. |
tnt_vinyl_memory_bloom_filter |
Amount of memory in bytes used by bloom filters. |
tnt_vinyl_memory_tuple |
Total size of memory in bytes occupied by Vinyl tuples. It includes cached tuples and tuples pinned by the Lua world. |
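To judge how close L0 is to triggering a dump, tnt_vinyl_memory_level0 can be expressed as a fraction of tnt_vinyl_regulator_dump_watermark. The helper below is a hypothetical sketch with made-up byte values. It only performs the division and guards against a missing or zero watermark.

```python
def level0_fill(level0_bytes, watermark_bytes):
    """Return L0 usage as a fraction of the dump watermark, or None if unknown."""
    if watermark_bytes is None or watermark_bytes <= 0:
        return None
    return level0_bytes / watermark_bytes


# Hypothetical values, in bytes.
fill = level0_fill(96 * 1024 * 1024, 128 * 1024 * 1024)
print(f"L0 is at {fill:.0%} of the dump watermark")
```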
The vinyl scheduler invokes the regulator and updates the related variables. This happens once per second.
tnt_vinyl_scheduler_tasks |
Number of scheduler dump/compaction tasks.
The metric always has label
|
tnt_vinyl_scheduler_dump_time |
Total time in seconds spent by all worker threads performing dumps. |
tnt_vinyl_scheduler_dump_total |
Counter of dumps completed. |
Event loop tx thread information:
tnt_ev_loop_time |
Event loop time (ms) |
tnt_ev_loop_prolog_time |
Event loop prolog time (ms) |
tnt_ev_loop_epilog_time |
Event loop epilog time (ms) |
The following metrics show the current state of synchronous replication.
tnt_synchro_queue_owner |
Instance ID of the current synchronous replication master. |
tnt_synchro_queue_term |
Current queue term. |
tnt_synchro_queue_len |
How many transactions are collecting confirmations now. |
tnt_synchro_queue_busy |
Whether the queue is processing any system entry (CONFIRM/ROLLBACK/PROMOTE/DEMOTE). |
The following metrics show the current state of a replica set node with regard to leader election.
tnt_election_state |
Election state (mode) of the node. When election is enabled, the node is writable only in the leader state. Possible values:
|
tnt_election_vote |
ID of a node the current node votes for. If the value is 0, it means the node hasn’t voted in the current term yet. |
tnt_election_leader |
Leader node ID in the current term. If the value is 0, it means the node doesn’t know which node is the leader in the current term. |
tnt_election_term |
Current election term. |
tnt_election_leader_idle |
Time in seconds since the last interaction with the known leader. |
Memtx MVCC memory statistics. The transaction manager consists of two parts:
- the transactions themselves (TXN section)
- MVCC
tnt_memtx_tnx_statements are the transaction statements. |
For example, the user started a transaction and made an action in it
|
tnt_memtx_tnx_user |
In Tarantool C API there is a function
|
tnt_memtx_tnx_system |
There are internals: logs, savepoints.
This metric always has the label
|
MVCC is responsible for the isolation of transactions.
It detects conflicts and makes sure that tuples that are no longer in the space, but have been read (or can still be read) by some transaction, are not deleted.
tnt_memtx_mvcc_trackers |
Trackers that keep track of transaction reads.
This metric always has the label
|
tnt_memtx_mvcc_conflicts |
Allocated in case of transaction conflicts.
This metric always has the label
|
Saved tuples are divided into 3 categories: used, read_view, tracking.
Each category has two metrics:
- retained tuples: they are no longer in the index, but MVCC does not allow them to be removed.
- stories: MVCC is based on the story mechanism, and almost every tuple has a story.
This is a separate metric because even the tuples that are in the index can have a story.
So stories and retained need to be measured separately.
tnt_memtx_mvcc_tuples_used_stories |
Tuples that are used by active read-write transactions.
This metric always has the label
|
tnt_memtx_mvcc_tuples_used_retained |
Tuples that are used by active read-write transactions.
These tuples are no longer in the index, but MVCC does not allow them to be removed.
This metric always has the label
|
tnt_memtx_mvcc_tuples_read_view_stories |
Tuples that are not used by active read-write transactions,
but are used by read-only transactions (i.e. in read view).
This metric always has the label
|
tnt_memtx_mvcc_tuples_read_view_retained |
Tuples that are not used by active read-write transactions,
but are used by read-only transactions (i.e. in read view).
These tuples are no longer in the index, but MVCC does not allow them to be removed.
This metric always has the label
|
tnt_memtx_mvcc_tuples_tracking_stories |
Tuples that are not directly used by any transactions, but are used by MVCC to track reads.
This metric always has the label
|
tnt_memtx_mvcc_tuples_tracking_retained |
Tuples that are not directly used by any transactions, but are used by MVCC to track reads.
These tuples are no longer in the index, but MVCC does not allow them to be removed.
This metric always has the label
|
tnt_memtx_tuples_data_total |
Total amount of memory (in bytes) allocated for data tuples.
This includes tnt_memtx_tuples_data_read_view and
tnt_memtx_tuples_data_garbage metric values plus tuples that
are actually stored in memtx spaces. |
tnt_memtx_tuples_data_read_view |
Memory (in bytes) held for read views. |
tnt_memtx_tuples_data_garbage |
Memory (in bytes) that is unused and scheduled to be freed (freed lazily on memory allocation). |
tnt_memtx_index_total |
Total amount of memory (in bytes) allocated for indexing data.
This includes tnt_memtx_index_read_view metric value
plus memory used for indexing tuples
that are actually stored in memtx spaces. |
tnt_memtx_index_read_view |
Memory (in bytes) held for read views. |
These metrics are available starting from Tarantool 3.0.
tnt_config_alerts |
Number of alerts raised while applying the current instance configuration.
The {level="warn"} label covers warnings and
{level="error"} covers errors. |
tnt_config_status |
The status of current instance configuration apply.
# HELP tnt_config_status Tarantool 3 configuration status
# TYPE tnt_config_status gauge
tnt_config_status{status="reload_in_progress",alias="router-001-a"} 0
tnt_config_status{status="uninitialized",alias="router-001-a"} 0
tnt_config_status{status="check_warnings",alias="router-001-a"} 0
tnt_config_status{status="ready",alias="router-001-a"} 1
tnt_config_status{status="check_errors",alias="router-001-a"} 0
tnt_config_status{status="startup_in_progress",alias="router-001-a"} 0
For example, this set of metrics means that current configuration
for |
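A consumer of tnt_config_status usually wants the one status whose sample equals 1. The Python sketch below extracts it from exposition text like the sample above. The function name is hypothetical, and the simple regular expressions are an assumption that holds for label sets of this shape rather than a complete parser for the format.

```python
import re

# Matches lines such as: tnt_config_status{status="ready",alias="a"} 1
LINE_RE = re.compile(r'^tnt_config_status\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def active_statuses(text):
    """Map each alias to the status label whose sample value is 1."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # HELP/TYPE comments and unrelated metrics
        labels = dict(LABEL_RE.findall(match.group(1)))
        if float(match.group(2)) == 1:
            result[labels.get("alias")] = labels.get("status")
    return result


# Part of the sample shown above, reused as input.
scrape = """
# TYPE tnt_config_status gauge
tnt_config_status{status="reload_in_progress",alias="router-001-a"} 0
tnt_config_status{status="ready",alias="router-001-a"} 1
tnt_config_status{status="check_errors",alias="router-001-a"} 0
"""
print(active_statuses(scrape))
```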