API reference
An application using the metrics module has four primitives, called collectors, at its disposal: counter, gauge, histogram, and summary.
A collector represents one or more observations that change over time.
counter
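metrics.counter(name[, help, metainfo])
Register a new counter.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- metainfo (table) – collector metainfo.
Return: A counter object.
Rtype: counter_obj
object counter_obj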
counter_obj:inc(num, label_pairs)
Increment the observation for label_pairs. If label_pairs doesn't exist, the method creates it.
Parameters:
- num (number) – increment value.
- label_pairs (table) – table containing label names as keys, label values as values. Note that both label names and values in label_pairs are treated as strings.
counter_obj:collect()
Return: Array of observation objects for a given counter.
{
    label_pairs: table,          -- `label_pairs` key-value table
    timestamp: ctype<uint64_t>,  -- current system time (in microseconds)
    value: number,               -- current value
    metric_name: string,         -- collector name
}
Rtype: table
counter_obj:remove(label_pairs)
Remove the observation for label_pairs.
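The counter methods above can be combined as in the following minimal sketch. It assumes the code runs inside Tarantool with the metrics module installed; the collector name example_requests_total and the label values are illustrative, not part of the module.

```lua
local metrics = require('metrics')

-- Register a counter; the name and help text are illustrative assumptions.
local requests = metrics.counter('example_requests_total', 'Example request counter')

-- The first call creates the observation for this label set,
-- the second call increments the same observation.
requests:inc(1, {method = 'GET'})
requests:inc(2, {method = 'GET'})

-- collect() returns an array of observation tables.
local observations = requests:collect()
for _, obs in ipairs(observations) do
    print(obs.metric_name, obs.label_pairs.method, obs.value)
end

-- Drop the observation for this label set.
requests:remove({method = 'GET'})
local after_remove = requests:collect()
```

Each distinct label_pairs table yields its own observation, so removing one label set leaves the others untouched.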
gauge
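metrics.gauge(name[, help, metainfo])
Register a new gauge.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- metainfo (table) – collector metainfo.
Return: A gauge object.
Rtype: gauge_obj
object gauge_obj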
gauge_obj:
dec
(num, label_pairs)¶
Works like inc()
, but decrements the observation.
gauge_obj:set(num, label_pairs)
Sets the observation for label_pairs to num.
gauge_obj:collect()
Returns an array of observation objects for a given gauge.
For the description of observation, see counter_obj:collect().
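As a rough illustration of the gauge methods, the sketch below sets a value, decrements it, and reads it back. It assumes a Tarantool instance with the metrics module installed; the name example_queue_size and the queue label are made up for the example.

```lua
local metrics = require('metrics')

-- Illustrative gauge; the name and label are assumptions.
local queue_size = metrics.gauge('example_queue_size', 'Example queue size')

-- Set an absolute value, then decrement it.
queue_size:set(10, {queue = 'emails'})
queue_size:dec(3, {queue = 'emails'})

-- The gauge now holds a single observation for {queue = 'emails'}.
local observations = queue_size:collect()
print(observations[1].value)
```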
histogram
metrics.histogram(name[, help, buckets, metainfo])
Register a new histogram.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- buckets (table) – histogram buckets (an array of sorted positive numbers). The infinity bucket (INF) is appended automatically. Default: {.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF}.
- metainfo (table) – collector metainfo.
Return: A histogram object.
Rtype: histogram_obj
Note
A histogram is basically a set of collectors:
- name .. "_sum" – a counter holding the sum of added observations.
- name .. "_count" – a counter holding the number of added observations.
- name .. "_bucket" – a counter holding all bucket sizes under the label le (less or equal). To access a specific bucket x (where x is a number), specify the value x for the label le.
object histogram_obj
histogram_obj:observe(num, label_pairs)
Record a new value in a histogram. This increments all bucket sizes under the labels le >= num and the labels that match label_pairs.
Parameters:
- num (number) – value to put in the histogram.
- label_pairs (table) – table containing label names as keys, label values as values. All internal counters that have these labels specified observe new counter values. Note that both label names and values in label_pairs are treated as strings.
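The cumulative bucket rule (every bucket with le >= num is incremented) can be sketched in plain Lua. This is a simplified model of the described behavior, not the module's implementation: the state table and the observe_into function are hypothetical names, and label handling is left out.

```lua
-- Bucket upper bounds, with the infinity bucket appended at the end.
local bounds = {2, 4, 6, math.huge}

-- Hypothetical model state: one cumulative count per bound, plus sum and count.
local state = {bucket = {}, sum = 0, count = 0}
for _, le in ipairs(bounds) do
    state.bucket[le] = 0
end

-- Record a value: every bucket whose bound is >= num is incremented.
local function observe_into(st, num)
    for _, le in ipairs(bounds) do
        if num <= le then
            st.bucket[le] = st.bucket[le] + 1
        end
    end
    st.sum = st.sum + num
    st.count = st.count + 1
end

observe_into(state, 3)
observe_into(state, 5)
observe_into(state, 10)

for _, le in ipairs(bounds) do
    print('le=' .. tostring(le), state.bucket[le])
end
```

With these three values, the bucket for le = 2 stays at 0, le = 4 holds 1, le = 6 holds 2, and the infinity bucket holds all 3, which always equals the count.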
histogram_obj:collect()
Return a concatenation of counter_obj:collect() across all internal counters of histogram_obj. For the description of observation, see counter_obj:collect().
summary
metrics.summary(name[, help, objectives, params, metainfo])
Register a new summary. Quantile computation is based on the "Effective computation of biased quantiles over data streams" algorithm.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- objectives (table) – a list of "targeted" φ-quantiles in the {quantile = error, ... } form. Example: {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01}. The targeted φ-quantile is specified in the form of a φ-quantile and the tolerated error. For example, {[0.5] = 0.1} means that the median (= 50th percentile) is to be returned with a 10-percent error. Note that percentiles and quantiles are the same concept, except that percentiles are expressed as percentages. The φ-quantile must be in the interval [0, 1]. A lower tolerated error for a φ-quantile results in higher memory and CPU usage during summary calculation.
- params (table) – table of the summary parameters used to configure the sliding time window. This window consists of several buckets to store observations. New observations are added to each bucket. After a time period, the head bucket (from which observations are collected) is reset, and the next bucket becomes the new head. This way, each bucket stores observations for max_age_time * age_buckets_count seconds before it is reset. max_age_time sets the duration of each bucket's lifetime, that is, how many seconds the observations are kept before they are discarded. age_buckets_count sets the number of buckets in the sliding time window. This variable determines the number of buckets used to exclude observations older than max_age_time from the summary. The value is a trade-off between resources (memory and CPU for maintaining the bucket) and how smoothly the time window moves. Default value: {max_age_time = math.huge, age_buckets_count = 1}.
- metainfo (table) – collector metainfo.
Return: A summary object.
Rtype: summary_obj
Note
A summary represents a set of collectors:
- name .. "_sum" – a counter holding the sum of added observations.
- name .. "_count" – a counter holding the number of added observations.
- name – holds all the observed quantiles under the label quantile. To access the quantile x (where x is a number), specify the value x for the label quantile.
object summary_obj
summary_obj:observe(num, label_pairs)
Record a new value in a summary.
Parameters:
- num (number) – value to put in the data stream.
- label_pairs (table) – a table containing label names as keys, label values as values. All internal counters that have these labels specified observe new counter values. You can't add the "quantile" label to a summary. It is added automatically. If max_age_time and age_buckets_count are set, the observed value is added to each bucket. Note that both label names and values in label_pairs are treated as strings.
summary_obj:collect()
Return a concatenation of counter_obj:collect() across all internal counters of summary_obj. For the description of observation, see counter_obj:collect(). If max_age_time and age_buckets_count are set, quantile observations are collected only from the head bucket in the sliding time window, not from every bucket. If no observations were recorded, the method will return NaN in the values.
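The sliding time window can be pictured with a small plain-Lua model. This is only a sketch of the rotation scheme as described here, not the module's code: new_window, observe, rotate, and head_values are hypothetical helpers, raw values stand in for the quantile estimators, and rotation is triggered by hand instead of by a timer.

```lua
-- Hypothetical model of a sliding window with age_buckets_count buckets.
local function new_window(age_buckets_count)
    local window = {head = 1, buckets = {}}
    for i = 1, age_buckets_count do
        window.buckets[i] = {}
    end
    return window
end

-- Every observation is added to each bucket.
local function observe(window, value)
    for _, bucket in ipairs(window.buckets) do
        table.insert(bucket, value)
    end
end

-- Called once per rotation period: reset the head, then make the next bucket the head.
local function rotate(window)
    window.buckets[window.head] = {}
    window.head = window.head % #window.buckets + 1
end

-- Collection reads the head bucket only.
local function head_values(window)
    return window.buckets[window.head]
end

local window = new_window(3)
observe(window, 1)
rotate(window)
observe(window, 2)
rotate(window)
observe(window, 3)
print(table.concat(head_values(window), ', '))  -- prints "1, 2, 3"
rotate(window)
print(table.concat(head_values(window), ', '))  -- prints "2, 3"
```

The head bucket is always the one that has gone longest without a reset, so it covers the full window. After the third rotation, the oldest value has aged out of the head.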
All collectors support providing label_pairs on data modification.
A label is a piece of metainfo that you associate with a metric in the key-value format.
See tags in Graphite and labels in Prometheus.
Labels are used to differentiate between the characteristics of a thing being
measured. For example, in a metric associated with the total number of HTTP
requests, you can represent methods and statuses as label pairs:
http_requests_total_counter:inc(1, {method = 'POST', status = '200'})
You don't have to define labels in advance.
With labels, you can extract new time series (visualize their graphs)
by specifying conditions with regard to label values.
The example above allows extracting the following time series:
- The total number of requests over time with method = "POST" (and any status).
- The total number of requests over time with status = 500 (and any method).
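One way to see how label conditions select a series is to filter collected observations by a label value. The sketch below assumes a Tarantool instance with the metrics module installed; the counter name and the total_where helper are hypothetical and exist only for this illustration. In practice, this kind of selection is usually done in the monitoring system's query language.

```lua
local metrics = require('metrics')

-- Illustrative counter with three label combinations.
local counter = metrics.counter('example_http_requests_total')
counter:inc(1, {method = 'POST', status = '200'})
counter:inc(2, {method = 'POST', status = '500'})
counter:inc(4, {method = 'GET', status = '500'})

-- Hypothetical helper: sum the values of observations whose label matches.
local function total_where(observations, label, value)
    local total = 0
    for _, obs in ipairs(observations) do
        if obs.label_pairs[label] == value then
            total = total + obs.value
        end
    end
    return total
end

local post_total = total_where(counter:collect(), 'method', 'POST')
local errors_total = total_where(counter:collect(), 'status', '500')
print(post_total, errors_total)
```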
You can also set global labels by calling metrics.set_global_labels({ label = value, ...}).
metrics.cfg([config])
Entry point to set up the module. Since 0.17.0.
Parameters:
- config (table) – module configuration options:
  - cfg.include (string/table, default 'all'): 'all' to enable all supported default metrics, 'none' to disable all default metrics, or a table with the names of the default metrics to enable a specific set of metrics.
  - cfg.exclude (table, default {}): table containing the names of the default metrics that you want to disable. Has higher priority than cfg.include.
  - cfg.labels (table, default {}): table containing label names as string keys, label values as values.
You can work with metrics.cfg as a table to read values, but you must call metrics.cfg{} as a function to update them.
Supported default metric names (for cfg.include and cfg.exclude tables):
- network
- operations
- system
- replicas
- info
- slab
- runtime
- memory
- spaces
- fibers
- cpu
- vinyl
- memtx
- luajit
- cartridge_issues
- cartridge_failover
- clock
- event_loop
See metrics reference for details.
All metric collectors from the collection have metainfo.default = true.
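A minimal configuration sketch, assuming metrics 0.17.0 or later running inside Tarantool. The alias label and its values are illustrative.

```lua
local metrics = require('metrics')

-- Enable two default metric groups, then exclude one of them.
-- exclude has higher priority than include.
metrics.cfg{
    include = {'network', 'info'},
    exclude = {'info'},
    labels = {alias = 'example-instance'},  -- illustrative global label
}

-- Read the current values by indexing metrics.cfg as a table.
print(metrics.cfg.labels.alias)

-- Options are updated by calling metrics.cfg as a function.
metrics.cfg{labels = {alias = 'renamed-instance'}}
print(metrics.cfg.labels.alias)
```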
cfg.labels are the global labels to be added to every observation.
Global labels are applied only to metric collection. They have no effect on how observations are stored.
Global labels can be changed on the fly.
label_pairs from observation objects have priority over global labels. If you pass label_pairs to an observation method with the same key as some global label, the method argument value will be used.
Note that both label names and values in label_pairs are treated as strings.
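The precedence rule can be modeled in a few lines of plain Lua. The merge_labels helper is hypothetical and only mirrors the behavior described above. Using tostring to coerce names and values is an assumption about what "treated as strings" means.

```lua
-- Hypothetical helper: combine global labels with an observation's label_pairs.
local function merge_labels(global_labels, label_pairs)
    local merged = {}
    for name, value in pairs(global_labels) do
        merged[tostring(name)] = tostring(value)
    end
    -- label_pairs are applied last, so they win on a key conflict.
    for name, value in pairs(label_pairs) do
        merged[tostring(name)] = tostring(value)
    end
    return merged
end

local merged = merge_labels(
    {alias = 'router-1', zone = 'eu'},   -- global labels
    {alias = 'custom', status = 200}     -- label_pairs passed to the method
)
print(merged.alias, merged.zone, merged.status)
```

Here alias comes from label_pairs, zone comes from the global labels, and the numeric status ends up as the string '200'.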
metrics.enable_default_metrics([include, exclude])
Same as metrics.cfg{include=include, exclude=exclude}, but include={} is treated as include='all' for backward compatibility.
metrics.set_global_labels(label_pairs)
Same as metrics.cfg{labels=label_pairs}.
metrics.collect([opts])
Collect observations from each collector.
Parameters:
- opts (table) – table of collect options:
  - invoke_callbacks – if true, invoke_callbacks() is triggered before the actual collect.
  - default_only – if true, observations contain only default metrics (metainfo.default = true).
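A short sketch of collecting with callbacks enabled, assuming a Tarantool instance with metrics 0.16.0 or later. The gauge name example_uptime_seconds and the callback body are illustrative.

```lua
local metrics = require('metrics')

-- Illustrative gauge that is refreshed lazily by a callback.
local uptime = metrics.gauge('example_uptime_seconds', 'Example uptime gauge')
local started = os.time()

metrics.register_callback(function()
    uptime:set(os.time() - started)
end)

-- invoke_callbacks = true runs the registered callbacks before collecting.
local found
for _, obs in ipairs(metrics.collect{invoke_callbacks = true}) do
    if obs.metric_name == 'example_uptime_seconds' then
        found = obs
    end
end
print(found and found.value)
```

Without invoke_callbacks = true (and without a prior call to metrics.invoke_callbacks()), the gauge would have no observation yet, because nothing has called set().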
object registry
registry:unregister(collector)
Remove a collector from the registry.
Parameters:
- collector (collector_obj) – the collector to be removed.
Example:
local collector = metrics.gauge('some-gauge')
-- after a while, we don't need it anymore
metrics.registry:unregister(collector)
registry:find(kind, name)
Find a collector in the registry.
Parameters:
- kind (string) – collector kind (counter, gauge, histogram, or summary).
- name (string) – collector name.
Return: A collector object or nil.
Rtype: collector_obj
Example:
local collector = metrics.gauge('some-gauge')
collector = metrics.registry:find('gauge', 'some-gauge')
metrics.register_callback(callback)
Register a function named callback, which will be called right before metric collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
This method is most often used for gauge metrics updates.
Example:
metrics.register_callback(function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end)
metrics.unregister_callback(callback)
Unregister a function named callback that is called right before metric collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
Example:
local cpu_callback = function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
-- after a while, we don't need that callback function anymore
metrics.unregister_callback(cpu_callback)
metrics.invoke_callbacks()
Invoke all registered callbacks. Has to be called before each collect(). (Since version 0.16.0, you may use collect{invoke_callbacks = true} instead.) If you're using one of the default exporters, invoke_callbacks() will be called by the exporter.
Below are the functions that you can call with metrics = require('cartridge.roles.metrics') specified in your init.lua.
metrics.set_export(export)
Parameters:
- export (table) – a table containing paths and formats of the exported metrics.
Configure the endpoints of the metrics role:
local metrics = require('cartridge.roles.metrics')
metrics.set_export({
    {
        path = '/path_for_json_metrics',
        format = 'json'
    },
    {
        path = '/path_for_prometheus_metrics',
        format = 'prometheus'
    },
    {
        path = '/health',
        format = 'health'
    }
})
You can add several entry points of the same format but with different paths, for example:
metrics.set_export({
    {
        path = '/path_for_json_metrics',
        format = 'json'
    },
    {
        path = '/another_path_for_json_metrics',
        format = 'json'
    },
})
metrics.set_default_labels(label_pairs)
Add default global labels. Note that both label names and values in label_pairs are treated as strings.
Parameters:
- label_pairs (table) – table containing label names as string keys, label values as values.
local metrics = require('cartridge.roles.metrics')
metrics.set_default_labels({ ['my-custom-label'] = 'label-value' })
metrics also provides middleware for monitoring HTTP (set by the http module) latency statistics.
metrics.http_middleware.configure_default_collector(type_name, name, help)
Register a collector for the middleware and set it as default.
Parameters:
- type_name (string) – collector type.
- name (string) – collector name.
- help (string) – collector description.
Possible errors:
- A collector with the same type and name already exists in the registry.
metrics.http_middleware.build_default_collector(type_name, name[, help])
Register and return a collector for the middleware.
Parameters:
- type_name (string) – collector type.
- name (string) – collector name.
- help (string) – collector description.
Return: A collector object.
Possible errors:
- A collector with the same type and name already exists in the registry.
metrics.http_middleware.set_default_collector(collector)
Set the default collector.
Parameters:
- collector – middleware collector object.
metrics.http_middleware.get_default_collector()
Return the default collector. If the default collector hasn't been set yet, register it (with default http_middleware.build_default_collector(...) parameters) and set it as default.
Return: A collector object.
metrics.http_middleware.v1(handler, collector)
Latency-measuring wrapper for a handler of the http module version 1.x.x. Returns a wrapped handler.
Parameters:
- handler (function) – handler function.
- collector – middleware collector object. If not set, the default collector is used (like in http_middleware.get_default_collector()).
Usage: httpd:route(route, http_middleware.v1(request_handler, collector))
CPU metrics work only on Linux. See the metrics reference for details.
To enable CPU metrics, first register a callback function:
local metrics = require('metrics')
local cpu_callback = function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
Collected metrics example:
# HELP tnt_cpu_time Host CPU time
# TYPE tnt_cpu_time gauge
tnt_cpu_time 15006759
# HELP tnt_cpu_thread Tarantool thread cpu time
# TYPE tnt_cpu_thread gauge
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="system"} 160
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="user"} 949
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="system"} 920
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="user"} 79
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="user"} 44
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="system"} 294
Prometheus query aggregated by thread name:
sum by (thread_name) (idelta(tnt_cpu_thread[$__interval]))
/ scalar(idelta(tnt_cpu_total[$__interval]) / tnt_cpu_count)
All psutils metric collectors have metainfo.default = true.
To clear CPU metrics when you don't need them anymore, remove the callback and clear the collectors with the clear() method:
metrics.unregister_callback(cpu_callback)
-- cpu_metrics is local to the callback above, so require the module again here
require('metrics.psutils.cpu').clear()
Below are some examples of using metric primitives.
Notice that this usage is independent of export plugins such as
Prometheus, Graphite, etc. For documentation on how to use the plugins, see
the Metrics plugins section.
Using counters:
local metrics = require('metrics')
-- create a counter
local http_requests_total_counter = metrics.counter('http_requests_total')
-- somewhere in the HTTP requests middleware:
http_requests_total_counter:inc(1, {method = 'GET'})
Using gauges:
local metrics = require('metrics')
-- create a gauge
local cpu_usage_gauge = metrics.gauge('cpu_usage', 'CPU usage')
-- register a lazy gauge value update
-- this will be called whenever export is invoked in any plugin
metrics.register_callback(function()
    local current_cpu_usage = some_cpu_collect_function()
    cpu_usage_gauge:set(current_cpu_usage, {app = 'tarantool'})
end)
Using histograms:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a histogram with buckets at 2, 4, and 6 seconds
local http_requests_latency_hist = metrics.histogram(
    'http_requests_latency', 'HTTP requests latency', {2, 4, 6})
-- somewhere in the HTTP request middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency_hist:observe(latency)
Using summaries:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a summary with a window of 5 age buckets and a bucket lifetime of 60 s
local http_requests_latency = metrics.summary(
    'http_requests_latency', 'HTTP requests latency',
    {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01},
    {max_age_time = 60, age_buckets_count = 5}
)
-- somewhere in the HTTP requests middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency:observe(latency)
gauge
-
metrics.
gauge
(name[, help, metainfo])¶
Register a new gauge.
Параметры:
Return: A gauge object.
Rtype: gauge_obj
-
object
gauge_obj
¶
-
-
gauge_obj:
dec
(num, label_pairs)¶
Works like inc()
, but decrements the observation.
-
gauge_obj:
set
(num, label_pairs)¶
Sets the observation for label_pairs
to num
.
-
gauge_obj:
collect
()¶
Returns an array of observation
objects for a given gauge.
For the description of observation
, see
counter_obj:collect().
histogram
-
metrics.
histogram
(name[, help, buckets, metainfo])¶
Register a new histogram.
Параметры:
- name (
string
) – collector name. Must be unique.
- help (
string
) – collector description.
- buckets (
table
) – histogram buckets (an array of sorted positive numbers).
The infinity bucket (INF
) is appended automatically.
Default: {.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF}
.
- metainfo (
table
) – collector metainfo.
Return: A histogram object.
Rtype: histogram_obj
Примечание
A histogram is basically a set of collectors:
name .. "_sum"
– a counter holding the sum of added observations.
name .. "_count"
– a counter holding the number of added observations.
name .. "_bucket"
– a counter holding all bucket sizes under the label
le
(less or equal). To access a specific bucket – x
(where x
is a number),
specify the value x
for the label le
.
-
object
histogram_obj
¶
-
histogram_obj:
observe
(num, label_pairs)¶
Record a new value in a histogram.
This increments all bucket sizes under the labels le
>= num
and the labels that match label_pairs
.
Параметры:
- num (
number
) – value to put in the histogram.
- label_pairs (
table
) – table containing label names as keys,
label values as values.
All internal counters that have these labels specified
observe new counter values.
Note that both label names and values in label_pairs
are treated as strings.
-
histogram_obj:
collect
()¶
Return a concatenation of counter_obj:collect()
across all internal
counters of histogram_obj
. For the description of observation
,
see counter_obj:collect().
summary
-
metrics.
summary
(name[, help, objectives, params, metainfo])¶
Register a new summary. Quantile computation is based on the
«Effective computation of biased quantiles over data streams»
algorithm.
Параметры:
- name (
string
) – сollector name. Must be unique.
- help (
string
) – collector description.
- objectives (
table
) – a list of «targeted» φ-quantiles in the {quantile = error, ... }
form.
Example: {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01}
.
The targeted φ-quantile is specified in the form of a φ-quantile and the tolerated
error. For example, {[0.5] = 0.1}
means that the median (= 50th
percentile) is to be returned with a 10-percent error. Note that
percentiles and quantiles are the same concept, except that percentiles are
expressed as percentages. The φ-quantile must be in the interval [0, 1]
.
A lower tolerated error for a φ-quantile results in higher memory and CPU
usage during summary calculation.
- params (
table
) – table of the summary parameters used to configuring the sliding
time window. This window consists of several buckets to store observations.
New observations are added to each bucket. After a time period, the head bucket
(from which observations are collected) is reset, and the next bucket becomes the
new head. This way, each bucket stores observations for
max_age_time * age_buckets_count
seconds before it is reset.
max_age_time
sets the duration of each bucket’s lifetime – that is, how
many seconds the observations are kept before they are discarded.
age_buckets_count
sets the number of buckets in the sliding time window.
This variable determines the number of buckets used to exclude observations
older than max_age_time
from the summary. The value is
a trade-off between resources (memory and CPU for maintaining the bucket)
and how smooth the time window moves.
Default value: {max_age_time = math.huge, age_buckets_count = 1}
.
- metainfo (
table
) – collector metainfo.
Return: A summary object.
Rtype: summary_obj
Примечание
A summary represents a set of collectors:
name .. "_sum"
– a counter holding the sum of added observations.
name .. "_count"
– a counter holding the number of added observations.
name
holds all the quantiles under observation that find themselves
under the label quantile
(less or equal).
To access bucket x
(where x
is a number),
specify the value x
for the label quantile
.
-
object
summary_obj
¶
-
summary_obj:
observe
(num, label_pairs)¶
Record a new value in a summary.
Параметры:
- num (
number
) – value to put in the data stream.
- label_pairs (
table
) – a table containing label names as keys,
label values as values.
All internal counters that have these labels specified
observe new counter values.
You can’t add the "quantile"
label to a summary.
It is added automatically.
If max_age_time
and age_buckets_count
are set,
the observed value is added to each bucket.
Note that both label names and values in label_pairs
are treated as strings.
-
summary_obj:
collect
()¶
Return a concatenation of counter_obj:collect()
across all internal
counters of summary_obj
. For the description of observation
,
see counter_obj:collect().
If max_age_time
and age_buckets_count
are set, quantile observations
are collected only from the head bucket in the sliding time window,
not from every bucket. If no observations were recorded,
the method will return NaN
in the values.
All collectors support providing label_pairs
on data modification.
A label is a piece of metainfo that you associate with a metric in the key-value format.
See tags in Graphite and labels in Prometheus.
Labels are used to differentiate between the characteristics of a thing being
measured. For example, in a metric associated with the total number of HTTP
requests, you can represent methods and statuses as label pairs:
http_requests_total_counter:inc(1, {method = 'POST', status = '200'})
You don’t have to predefine labels in advance.
With labels, you can extract new time series (visualize their graphs)
by specifying conditions with regard to label values.
The example above allows extracting the following time series:
- The total number of requests over time with
method = "POST"
(and any status).
- The total number of requests over time with
status = 500
(and any method).
You can also set global labels by calling
metrics.set_global_labels({ label = value, ...})
.
-
metrics.
cfg
([config])¶
Entrypoint to setup the module. Since 0.17.0.
Параметры:
- config (
table
) – module configuration options:
cfg.include
(string/table, default 'all'
): 'all
to enable all
supported default metrics, 'none'
to disable all default metrics,
table with names of the default metrics to enable a specific set of metrics.
cfg.exclude
(table, default {}
): table containing the names of
the default metrics that you want to disable. Has higher priority
than cfg.include
.
cfg.labels
(table, default {}
): table containing label names as
string keys, label values as values.
You can work with metrics.cfg
as a table to read values, but you must call
metrics.cfg{}
as a function to update them.
Supported default metric names (for cfg.include
and cfg.exclude
tables):
network
operations
system
replicas
info
slab
runtime
memory
spaces
fibers
cpu
vinyl
memtx
luajit
cartridge_issues
cartridge_failover
clock
event_loop
See metrics reference for details.
All metric collectors from the collection have metainfo.default = true
.
cfg.labels
are the global labels to be added to every observation.
Global labels are applied only to metric collection. They have no effect
on how observations are stored.
Global labels can be changed on the fly.
label_pairs
from observation objects have priority over global labels.
If you pass label_pairs
to an observation method with the same key as
some global label, the method argument value will be used.
Note that both label names and values in label_pairs
are treated as strings.
-
metrics.
enable_default_metrics
([include, exclude])¶
Same as metrics.cfg{include=include, exclude=exclude}
, but include={}
is
treated as include='all'
for backward compatibility.
-
metrics.
set_global_labels
(label_pairs)¶
Same as metrics.cfg{labels=label_pairs}
.
-
metrics.
collect
([opts])¶
Collect observations from each collector.
Параметры:
- opts (
table
) – table of collect options:
invoke_callbacks
– if true
, invoke_callbacks()
is triggerred before actual collect.
default_only
– if true
, observations contain only default metrics (metainfo.default = true
).
-
object
registry
¶
-
registry:
unregister
(collector)¶
Remove a collector from the registry.
Параметры:
- collector (
collector_obj
) – the collector to be removed.
Example:
local collector = metrics.gauge('some-gauge')
-- after a while, we don't need it anymore
metrics.registry:unregister(collector)
-
registry:
find
(kind, name)¶
Find a collector in the registry.
Параметры:
Return: A collector object or nil
.
Rtype: collector_obj
Example:
local collector = metrics.gauge('some-gauge')
collector = metrics.registry:find('gauge', 'some-gauge')
-
metrics.
register_callback
(callback)¶
Register a function named callback
, which will be called right before metric
collection on plugin export.
Параметры:
- callback (
function
) – a function that takes no parameters.
This method is most often used for gauge metrics updates.
Example:
metrics.register_callback(function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end)
-
metrics.
unregister_callback
(callback)¶
Unregister a function named callback
that is called right before metric
collection on plugin export.
Параметры:
- callback (
function
) – a function that takes no parameters.
Example:
local cpu_callback = function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
-- after a while, we don't need that callback function anymore
metrics.unregister_callback(cpu_callback)
-
metrics.
invoke_callbacks
()¶
Invoke all registered callbacks. Has to be called before each collect()
.
(Since version 0.16.0, you may use collect{invoke_callbacks = true}
instead.)
If you’re using one of the default exporters,
invoke_callbacks()
will be called by the exporter.
Below are the functions that you can call
with metrics = require('cartridge.roles.metrics')
specified in your init.lua
.
-
metrics.
set_export
(export)¶
Параметры:
- export (
table
) – a table containing paths and formats of the exported metrics.
Configure the endpoints of the metrics role:
local metrics = require('cartridge.roles.metrics')
metrics.set_export({
{
path = '/path_for_json_metrics',
format = 'json'
},
{
path = '/path_for_prometheus_metrics',
format = 'prometheus'
},
{
path = '/health',
format = 'health'
}
})
You can add several entry points of the same format but with different paths,
for example:
metrics.set_export({
{
path = '/path_for_json_metrics',
format = 'json'
},
{
path = '/another_path_for_json_metrics',
format = 'json'
},
})
-
metrics.
set_default_labels
(label_pairs)¶
Add default global labels. Note that both
label names and values in label_pairs
are treated as strings.
Параметры:
- label_pairs (
table
) – Table containing label names as string keys,
label values as values.
local metrics = require('cartridge.roles.metrics')
metrics.set_default_labels({ ['my-custom-label'] = 'label-value' })
metrics
also provides middleware for monitoring HTTP
(set by the http module)
latency statistics.
-
metrics.http_middleware.
configure_default_collector
(type_name, name, help)¶
Register a collector for the middleware and set it as default.
Параметры:
Possible errors:
- A collector with the same type and name already exists in the registry.
-
metrics.http_middleware.
build_default_collector
(type_name, name[, help])¶
Register and return a collector for the middleware.
Параметры:
Return: A collector object.
Possible errors:
- A collector with the same type and name already exists in the registry.
-
metrics.http_middleware.
set_default_collector
(collector)¶
Set the default collector.
Параметры:
- collector – middleware collector object.
-
metrics.http_middleware.
get_default_collector
()¶
Return the default collector.
If the default collector hasn’t been set yet, register it (with default
http_middleware.build_default_collector(...)
parameters) and set it
as default.
Return: A collector object.
-
metrics.http_middleware.
v1
(handler, collector)¶
Latency measuring wrap-up for the HTTP ver. 1.x.x handler. Returns a wrapped handler.
Параметры:
- handler (
function
) – handler function.
- collector – middleware collector object.
If not set, the default collector is used
(like in
http_middleware.get_default_collector()
).
Usage: httpd:route(route, http_middleware.v1(request_handler, collector))
CPU metrics work only on Linux. See the metrics reference
for details.
To enable CPU metrics, first register a callback function:
local metrics = require('metrics')
local cpu_callback = function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
Collected metrics example:
# HELP tnt_cpu_time Host CPU time
# TYPE tnt_cpu_time gauge
tnt_cpu_time 15006759
# HELP tnt_cpu_thread Tarantool thread cpu time
# TYPE tnt_cpu_thread gauge
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="system"} 160
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="user"} 949
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="system"} 920
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="user"} 79
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="user"} 44
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="system"} 294
Prometheus query aggregated by thread name:
sum by (thread_name) (idelta(tnt_cpu_thread[$__interval]))
/ scalar(idelta(tnt_cpu_total[$__interval]) / tnt_cpu_count)
All psutils metric collectors have metainfo.default = true.
To clear CPU metrics when you don’t need them anymore, remove the callback and clear the collectors:
metrics.unregister_callback(cpu_callback)
-- cpu_metrics was local to the callback above, so require the module again
require('metrics.psutils.cpu').clear()
Below are some examples of using metric primitives.
Notice that this usage is independent of export plugins such as
Prometheus, Graphite, etc. For documentation on how to use the plugins, see
the Metrics plugins section.
Using counters:
local metrics = require('metrics')
-- create a counter
local http_requests_total_counter = metrics.counter('http_requests_total')
-- somewhere in the HTTP requests middleware:
http_requests_total_counter:inc(1, {method = 'GET'})
Using gauges:
local metrics = require('metrics')
-- create a gauge
local cpu_usage_gauge = metrics.gauge('cpu_usage', 'CPU usage')
-- register a lazy gauge value update
-- this will be called whenever export is invoked in any plugin
metrics.register_callback(function()
    -- some_cpu_collect_function is a placeholder for your own measurement code
    local current_cpu_usage = some_cpu_collect_function()
    cpu_usage_gauge:set(current_cpu_usage, {app = 'tarantool'})
end)
Using histograms:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a histogram
local http_requests_latency_hist = metrics.histogram(
    'http_requests_latency', 'HTTP requests latency', {2, 4, 6})
-- somewhere in the HTTP request middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency_hist:observe(latency)
Using summaries:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a summary with a window of 5 age buckets and a bucket lifetime of 60 s
local http_requests_latency = metrics.summary(
    'http_requests_latency', 'HTTP requests latency',
    {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01},
    {max_age_time = 60, age_buckets_count = 5}
)
-- somewhere in the HTTP requests middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency:observe(latency)
histogram
metrics.histogram(name[, help, buckets, metainfo])
Register a new histogram.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- buckets (table) – histogram buckets (an array of sorted positive numbers).
  The infinity bucket (INF) is appended automatically.
  Default: {.005, .01, .025, .05, .075, .1, .25, .5, .75, 1.0, 2.5, 5.0, 7.5, 10.0, INF}.
- metainfo (table) – collector metainfo.
Return: A histogram object.
Rtype: histogram_obj
Note
A histogram is basically a set of collectors:
- name .. "_sum" – a counter holding the sum of added observations.
- name .. "_count" – a counter holding the number of added observations.
- name .. "_bucket" – a counter holding all bucket sizes under the label le
  (less than or equal). To access a specific bucket x (where x is a number),
  specify the value x for the label le.

object histogram_obj

histogram_obj:observe(num, label_pairs)
Record a new value in a histogram.
This increments all bucket sizes under the labels le >= num and the labels
that match label_pairs.
Parameters:
- num (number) – value to put in the histogram.
- label_pairs (table) – table containing label names as keys,
  label values as values. All internal counters that have these labels
  specified observe new counter values. Note that both label names and
  values in label_pairs are treated as strings.

histogram_obj:collect()
Return a concatenation of counter_obj:collect() across all internal
counters of histogram_obj. For the description of observation,
see counter_obj:collect().
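The rule that observe() increments every bucket with le >= num means the buckets are cumulative. The following plain-Lua sketch models that rule; it is not the module's code, and new_histogram and observe are hypothetical helper names.

```lua
-- Hypothetical model of cumulative histogram buckets (not the module's code).
local INF = math.huge

local function new_histogram(buckets)
    local h = {bounds = {}, bucket = {}, sum = 0, count = 0}
    for _, bound in ipairs(buckets) do
        table.insert(h.bounds, bound)
    end
    table.insert(h.bounds, INF) -- the infinity bucket is appended automatically
    for _, bound in ipairs(h.bounds) do
        h.bucket[bound] = 0
    end
    return h
end

local function observe(h, num)
    h.sum = h.sum + num
    h.count = h.count + 1
    for _, bound in ipairs(h.bounds) do
        if num <= bound then -- every bucket with le >= num is incremented
            h.bucket[bound] = h.bucket[bound] + 1
        end
    end
end

local h = new_histogram({2, 4, 6})
observe(h, 1)  -- lands in le=2, le=4, le=6, and le=INF
observe(h, 3)  -- lands in le=4, le=6, and le=INF
observe(h, 10) -- lands only in le=INF
```

Because each bucket counts everything at or below its bound, the INF bucket always equals the total number of observations.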
summary
metrics.summary(name[, help, objectives, params, metainfo])
Register a new summary. Quantile computation is based on the
"Effective computation of biased quantiles over data streams" algorithm.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- objectives (table) – a list of "targeted" φ-quantiles in the
  {quantile = error, ... } form.
  Example: {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01}.
  The targeted φ-quantile is specified in the form of a φ-quantile and the
  tolerated error. For example, {[0.5] = 0.1} means that the median (= 50th
  percentile) is to be returned with a 10-percent error. Note that
  percentiles and quantiles are the same concept, except that percentiles are
  expressed as percentages. The φ-quantile must be in the interval [0, 1].
  A lower tolerated error for a φ-quantile results in higher memory and CPU
  usage during summary calculation.
- params (table) – table of the summary parameters used to configure the
  sliding time window. This window consists of several buckets to store
  observations. New observations are added to each bucket. After a time
  period, the head bucket (from which observations are collected) is reset,
  and the next bucket becomes the new head. This way, each bucket stores
  observations for max_age_time * age_buckets_count seconds before it is reset.
  max_age_time sets the duration of each bucket’s lifetime, that is, how
  many seconds the observations are kept before they are discarded.
  age_buckets_count sets the number of buckets in the sliding time window.
  This variable determines the number of buckets used to exclude observations
  older than max_age_time from the summary. The value is a trade-off between
  resources (memory and CPU for maintaining the buckets) and how smoothly
  the time window moves.
  Default value: {max_age_time = math.huge, age_buckets_count = 1}.
- metainfo (table) – collector metainfo.
Return: A summary object.
Rtype: summary_obj
Note
A summary represents a set of collectors:
- name .. "_sum" – a counter holding the sum of added observations.
- name .. "_count" – a counter holding the number of added observations.
- name – holds all the quantiles under observation, stored under the label
  quantile. To access quantile x (where x is a number), specify the value x
  for the label quantile.

object summary_obj

summary_obj:observe(num, label_pairs)
Record a new value in a summary.
Parameters:
- num (number) – value to put in the data stream.
- label_pairs (table) – a table containing label names as keys,
  label values as values. All internal counters that have these labels
  specified observe new counter values. You can’t add the "quantile" label
  to a summary. It is added automatically.
  If max_age_time and age_buckets_count are set, the observed value is added
  to each bucket.
  Note that both label names and values in label_pairs are treated as strings.

summary_obj:collect()
Return a concatenation of counter_obj:collect() across all internal
counters of summary_obj. For the description of observation,
see counter_obj:collect().
If max_age_time and age_buckets_count are set, quantile observations
are collected only from the head bucket in the sliding time window,
not from every bucket. If no observations were recorded,
the method returns NaN in the values.
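The sliding window described for params, observe(), and collect() can be modelled in a few lines of plain Lua. This sketch is not the module's code: new_window, observe, rotate, and quantile are hypothetical helpers, rotation is triggered by hand instead of by a timer, and quantiles are computed exactly by sorting rather than with the streaming algorithm the module uses.

```lua
-- Hypothetical model of the sliding window of age buckets (not the module's code).
local function new_window(age_buckets_count)
    local w = {head = 1, buckets = {}}
    for i = 1, age_buckets_count do
        w.buckets[i] = {}
    end
    return w
end

local function observe(w, num)
    for _, bucket in ipairs(w.buckets) do
        table.insert(bucket, num) -- a new observation goes to each bucket
    end
end

-- In the description above this happens once per max_age_time seconds.
local function rotate(w)
    w.buckets[w.head] = {}           -- reset the current head
    w.head = w.head % #w.buckets + 1 -- the next bucket becomes the new head
end

-- Exact quantile over the head bucket only.
local function quantile(w, q)
    local values = {}
    for _, v in ipairs(w.buckets[w.head]) do
        table.insert(values, v)
    end
    if #values == 0 then
        return 0 / 0 -- NaN when nothing was observed
    end
    table.sort(values)
    local rank = math.max(1, math.ceil(q * #values))
    return values[rank]
end

local w = new_window(2)
observe(w, 10)
rotate(w)       -- bucket 1 is reset, bucket 2 (still holding 10) is the head
observe(w, 20)
local median_before = quantile(w, 0.5)
rotate(w)       -- bucket 2 is reset, bucket 1 (holding only 20) is the head
local median_after = quantile(w, 0.5)
local empty = quantile(new_window(1), 0.5)
```

In this model the value 10 stays visible for one more rotation after it is observed and then drops out, which is the smoothing effect that a larger age_buckets_count provides.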
All collectors support providing label_pairs on data modification.
A label is a piece of metainfo that you associate with a metric in the key-value format.
See tags in Graphite and labels in Prometheus.
Labels are used to differentiate between the characteristics of a thing being
measured. For example, in a metric associated with the total number of HTTP
requests, you can represent methods and statuses as label pairs:
http_requests_total_counter:inc(1, {method = 'POST', status = '200'})
You don’t have to define labels in advance.
With labels, you can extract new time series (visualize their graphs)
by specifying conditions with regard to label values.
The example above allows extracting the following time series:
- The total number of requests over time with method = "POST" (and any status).
- The total number of requests over time with status = 500 (and any method).
You can also set global labels by calling
metrics.set_global_labels({ label = value, ...}).

metrics.cfg([config])
Entry point for setting up the module. Since 0.17.0.
Parameters:
- config (table) – module configuration options:
  - cfg.include (string/table, default 'all'): 'all' to enable all
    supported default metrics, 'none' to disable all default metrics,
    or a table with the names of the default metrics to enable a specific
    set of metrics.
  - cfg.exclude (table, default {}): a table containing the names of
    the default metrics that you want to disable. Has higher priority
    than cfg.include.
  - cfg.labels (table, default {}): a table containing label names as
    string keys, label values as values.
You can work with metrics.cfg as a table to read values, but you must call
metrics.cfg{} as a function to update them.
Supported default metric names (for cfg.include and cfg.exclude tables):
network
operations
system
replicas
info
slab
runtime
memory
spaces
fibers
cpu
vinyl
memtx
luajit
cartridge_issues
cartridge_failover
clock
event_loop
See metrics reference for details.
All metric collectors from the collection have metainfo.default = true.
cfg.labels are the global labels to be added to every observation.
Global labels are applied only to metric collection. They have no effect
on how observations are stored.
Global labels can be changed on the fly.
label_pairs from observation objects have priority over global labels.
If you pass label_pairs to an observation method with the same key as
some global label, the method argument value will be used.
Note that both label names and values in label_pairs are treated as strings.
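A configuration call that combines the three options might look like the sketch below. It assumes a Tarantool instance with the metrics module at version 0.17.0 or later; the chosen metric groups and the alias label are illustrative values, not recommendations.

```lua
local metrics = require('metrics')

-- Enable a few default metric groups, but never the vinyl ones.
metrics.cfg{
    include = {'network', 'operations', 'memory', 'vinyl'},
    exclude = {'vinyl'},           -- exclude has higher priority than include
    labels = {alias = 'router-1'}, -- hypothetical global label
}

-- Reading a value works like a table access.
local current_labels = metrics.cfg.labels
```

Listing 'vinyl' in both tables is deliberate here: it shows that the exclude entry wins.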

metrics.enable_default_metrics([include, exclude])
Same as metrics.cfg{include=include, exclude=exclude}, but include={} is
treated as include='all' for backward compatibility.

metrics.set_global_labels(label_pairs)
Same as metrics.cfg{labels=label_pairs}.
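The precedence rule for global labels (label_pairs passed to an observation method win over a global label with the same key) can be modelled as a simple table merge. This is a plain-Lua illustration, not the module's code; merge_labels is a hypothetical helper.

```lua
-- Hypothetical model of the precedence rule (not the module's code).
local function merge_labels(global_labels, label_pairs)
    local merged = {}
    for name, value in pairs(global_labels) do
        merged[tostring(name)] = tostring(value)
    end
    for name, value in pairs(label_pairs or {}) do
        -- a label passed with the observation overrides a global label
        merged[tostring(name)] = tostring(value)
    end
    return merged
end

local merged = merge_labels(
    {alias = 'router-1', zone = 'eu'}, -- global labels
    {zone = 'us', status = 200}        -- label_pairs of one observation
)
```

The tostring() calls mirror the note that label names and values are treated as strings, so the numeric status 200 ends up as the string '200'.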

metrics.collect([opts])
Collect observations from each collector.
Parameters:
- opts (table) – table of collect options:
  - invoke_callbacks – if true, invoke_callbacks() is triggered before
    the actual collection.
  - default_only – if true, observations contain only default metrics
    (metainfo.default = true).
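Once observations are collected, they are ordinary Lua tables and can be post-processed directly. The sketch below assumes that the result is an array of observation objects shaped as described for counter_obj:collect(); the sample data is made up and total_by_metric is a hypothetical helper. With the real module, the list would come from a call such as metrics.collect{invoke_callbacks = true}.

```lua
-- Sample data shaped like the observation objects described for
-- counter_obj:collect(); the values are made up for illustration.
local observations = {
    {metric_name = 'http_requests_total', label_pairs = {method = 'GET'},
     value = 3, timestamp = 0},
    {metric_name = 'http_requests_total', label_pairs = {method = 'POST'},
     value = 2, timestamp = 0},
    {metric_name = 'cpu_usage', label_pairs = {app = 'tarantool'},
     value = 0.5, timestamp = 0},
}

-- Sum observation values per metric name, ignoring labels.
local function total_by_metric(obs_list)
    local totals = {}
    for _, obs in ipairs(obs_list) do
        totals[obs.metric_name] = (totals[obs.metric_name] or 0) + obs.value
    end
    return totals
end

local totals = total_by_metric(observations)
```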

object registry

registry:unregister(collector)
Remove a collector from the registry.
Parameters:
- collector (collector_obj) – the collector to be removed.
Example:
local collector = metrics.gauge('some-gauge')
-- after a while, we don't need it anymore
metrics.registry:unregister(collector)

registry:find(kind, name)
Find a collector in the registry.
Parameters:
Return: A collector object or nil.
Rtype: collector_obj
Example:
local collector = metrics.gauge('some-gauge')
collector = metrics.registry:find('gauge', 'some-gauge')

metrics.register_callback(callback)
Register a function named callback, which will be called right before metric
collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
This method is most often used for gauge metrics updates.
Example:
metrics.register_callback(function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end)

metrics.unregister_callback(callback)
Unregister a function named callback that is called right before metric
collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
Example:
local cpu_callback = function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
-- after a while, we don't need that callback function anymore
metrics.unregister_callback(cpu_callback)

metrics.invoke_callbacks()
Invoke all registered callbacks. Has to be called before each collect().
(Since version 0.16.0, you may use collect{invoke_callbacks = true} instead.)
If you’re using one of the default exporters,
invoke_callbacks() will be called by the exporter.
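The interplay of the three callback functions can be modelled with a small registry in plain Lua. This is an illustration under the assumption that callbacks are identified by the function value itself; it is not the module's code, and all names below are local stand-ins.

```lua
-- Hypothetical model of a callback registry (not the module's code).
local callbacks = {}

local function register_callback(callback)
    callbacks[callback] = true -- the function value itself is the key
end

local function unregister_callback(callback)
    callbacks[callback] = nil
end

local function invoke_callbacks()
    for callback in pairs(callbacks) do
        callback()
    end
end

local gauge_value = 0
local function update_gauge()
    gauge_value = gauge_value + 1
end

register_callback(update_gauge)
invoke_callbacks()                -- runs update_gauge once
unregister_callback(update_gauge)
invoke_callbacks()                -- nothing is registered now
```

Under that assumption, a callback can only be unregistered if you kept a reference to it, which is why the unregister_callback example stores the function in cpu_callback before registering it.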
Below are the functions that you can call
with metrics = require('cartridge.roles.metrics') specified in your init.lua.

metrics.set_export(export)
Parameters:
- export (table) – a table containing paths and formats of the exported metrics.
Configure the endpoints of the metrics role:
local metrics = require('cartridge.roles.metrics')
metrics.set_export({
    {
        path = '/path_for_json_metrics',
        format = 'json'
    },
    {
        path = '/path_for_prometheus_metrics',
        format = 'prometheus'
    },
    {
        path = '/health',
        format = 'health'
    }
})
You can add several entry points of the same format but with different paths,
for example:
metrics.set_export({
    {
        path = '/path_for_json_metrics',
        format = 'json'
    },
    {
        path = '/another_path_for_json_metrics',
        format = 'json'
    },
})

metrics.set_default_labels(label_pairs)
Add default global labels. Note that both
label names and values in label_pairs are treated as strings.
Parameters:
- label_pairs (table) – table containing label names as string keys,
  label values as values.
local metrics = require('cartridge.roles.metrics')
metrics.set_default_labels({ ['my-custom-label'] = 'label-value' })
metrics also provides middleware for monitoring HTTP latency statistics
(for handlers set by the http module).

metrics.http_middleware.configure_default_collector(type_name, name, help)
Register a collector for the middleware and set it as default.
Parameters:
Possible errors:
- A collector with the same type and name already exists in the registry.

metrics.http_middleware.build_default_collector(type_name, name[, help])
Register and return a collector for the middleware.
Parameters:
Return: A collector object.
Possible errors:
- A collector with the same type and name already exists in the registry.

metrics.http_middleware.set_default_collector(collector)
Set the default collector.
Parameters:
- collector – middleware collector object.
metrics.http_middleware.
get_default_collector
()¶
Return the default collector.
If the default collector hasn’t been set yet, register it (with default
http_middleware.build_default_collector(...)
parameters) and set it
as default.
Return: A collector object.
-
metrics.http_middleware.
v1
(handler, collector)¶
Latency measuring wrap-up for the HTTP ver. 1.x.x handler. Returns a wrapped handler.
Параметры:
- handler (
function
) – handler function.
- collector – middleware collector object.
If not set, the default collector is used
(like in
http_middleware.get_default_collector()
).
Usage: httpd:route(route, http_middleware.v1(request_handler, collector))
CPU metrics work only on Linux. See the metrics reference
for details.
To enable CPU metrics, first register a callback function:
local metrics = require('metrics')
local cpu_callback = function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
Collected metrics example:
# HELP tnt_cpu_time Host CPU time
# TYPE tnt_cpu_time gauge
tnt_cpu_time 15006759
# HELP tnt_cpu_thread Tarantool thread cpu time
# TYPE tnt_cpu_thread gauge
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="system"} 160
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="user"} 949
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="system"} 920
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="user"} 79
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="user"} 44
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="system"} 294
Prometheus query aggregated by thread name:
sum by (thread_name) (idelta(tnt_cpu_thread[$__interval]))
/ scalar(idelta(tnt_cpu_total[$__interval]) / tnt_cpu_count)
All psutils metric collectors have metainfo.default = true
.
To clear CPU metrics when you don’t need them anymore, remove the callback and clear the collectors with a method:
metrics.unregister_callback(cpu_callback)
cpu_metrics.clear()
Below are some examples of using metric primitives.
Notice that this usage is independent of export plugins such as
Prometheus, Graphite, etc. For documentation on how to use the plugins, see
the Metrics plugins section.
Using counters:
local metrics = require('metrics')
-- create a counter
local http_requests_total_counter = metrics.counter('http_requests_total')
-- somewhere in the HTTP requests middleware:
http_requests_total_counter:inc(1, {method = 'GET'})
Using gauges:
local metrics = require('metrics')
-- create a gauge
local cpu_usage_gauge = metrics.gauge('cpu_usage', 'CPU usage')
-- register a lazy gauge value update
-- this will be called whenever export is invoked in any plugins
metrics.register_callback(function()
local current_cpu_usage = some_cpu_collect_function()
cpu_usage_gauge:set(current_cpu_usage, {app = 'tarantool'})
end)
Using histograms:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a histogram
local http_requests_latency_hist = metrics.histogram(
'http_requests_latency', 'HTTP requests total', {2, 4, 6})
-- somewhere in the HTTP request middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency_hist:observe(latency)
Using summaries:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a summary with a window of 5 age buckets and a bucket lifetime of 60 s
local http_requests_latency = metrics.summary(
'http_requests_latency', 'HTTP requests total',
{[0.5]=0.01, [0.9]=0.01, [0.99]=0.01},
{max_age_time = 60, age_buckets_count = 5}
)
-- somewhere in the HTTP requests middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency:observe(latency)
summary
-
metrics.
summary
(name[, help, objectives, params, metainfo])¶
Register a new summary. Quantile computation is based on the
«Effective computation of biased quantiles over data streams»
algorithm.
Параметры:
- name (
string
) – сollector name. Must be unique.
- help (
string
) – collector description.
- objectives (
table
) – a list of «targeted» φ-quantiles in the {quantile = error, ... }
form.
Example: {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01}
.
The targeted φ-quantile is specified in the form of a φ-quantile and the tolerated
error. For example, {[0.5] = 0.1}
means that the median (= 50th
percentile) is to be returned with a 10-percent error. Note that
percentiles and quantiles are the same concept, except that percentiles are
expressed as percentages. The φ-quantile must be in the interval [0, 1]
.
A lower tolerated error for a φ-quantile results in higher memory and CPU
usage during summary calculation.
- params (
table
) – table of the summary parameters used to configuring the sliding
time window. This window consists of several buckets to store observations.
New observations are added to each bucket. After a time period, the head bucket
(from which observations are collected) is reset, and the next bucket becomes the
new head. This way, each bucket stores observations for
max_age_time * age_buckets_count
seconds before it is reset.
max_age_time
sets the duration of each bucket’s lifetime – that is, how
many seconds the observations are kept before they are discarded.
age_buckets_count
sets the number of buckets in the sliding time window.
This variable determines the number of buckets used to exclude observations
older than max_age_time
from the summary. The value is
a trade-off between resources (memory and CPU for maintaining the bucket)
and how smooth the time window moves.
Default value: {max_age_time = math.huge, age_buckets_count = 1}
.
- metainfo (
table
) – collector metainfo.
Return: A summary object.
Rtype: summary_obj
Примечание
A summary represents a set of collectors:
name .. "_sum"
– a counter holding the sum of added observations.
name .. "_count"
– a counter holding the number of added observations.
name
holds all the quantiles under observation that find themselves
under the label quantile
(less or equal).
To access bucket x
(where x
is a number),
specify the value x
for the label quantile
.
-
object
summary_obj
¶
-
summary_obj:
observe
(num, label_pairs)¶
Record a new value in a summary.
Параметры:
- num (
number
) – value to put in the data stream.
- label_pairs (
table
) – a table containing label names as keys,
label values as values.
All internal counters that have these labels specified
observe new counter values.
You can’t add the "quantile"
label to a summary.
It is added automatically.
If max_age_time
and age_buckets_count
are set,
the observed value is added to each bucket.
Note that both label names and values in label_pairs
are treated as strings.
-
summary_obj:
collect
()¶
Return a concatenation of counter_obj:collect()
across all internal
counters of summary_obj
. For the description of observation
,
see counter_obj:collect().
If max_age_time
and age_buckets_count
are set, quantile observations
are collected only from the head bucket in the sliding time window,
not from every bucket. If no observations were recorded,
the method will return NaN
in the values.
All collectors support providing label_pairs
on data modification.
A label is a piece of metainfo that you associate with a metric in the key-value format.
See tags in Graphite and labels in Prometheus.
Labels are used to differentiate between the characteristics of a thing being
measured. For example, in a metric associated with the total number of HTTP
requests, you can represent methods and statuses as label pairs:
http_requests_total_counter:inc(1, {method = 'POST', status = '200'})
You don’t have to predefine labels in advance.
With labels, you can extract new time series (visualize their graphs)
by specifying conditions with regard to label values.
The example above allows extracting the following time series:
- The total number of requests over time with
method = "POST"
(and any status).
- The total number of requests over time with
status = 500
(and any method).
You can also set global labels by calling
metrics.set_global_labels({ label = value, ...})
.
-
metrics.
cfg
([config])¶
Entrypoint to setup the module. Since 0.17.0.
Параметры:
- config (
table
) – module configuration options:
cfg.include
(string/table, default 'all'
): 'all
to enable all
supported default metrics, 'none'
to disable all default metrics,
table with names of the default metrics to enable a specific set of metrics.
cfg.exclude
(table, default {}
): table containing the names of
the default metrics that you want to disable. Has higher priority
than cfg.include
.
cfg.labels
(table, default {}
): table containing label names as
string keys, label values as values.
You can work with metrics.cfg
as a table to read values, but you must call
metrics.cfg{}
as a function to update them.
Supported default metric names (for cfg.include
and cfg.exclude
tables):
network
operations
system
replicas
info
slab
runtime
memory
spaces
fibers
cpu
vinyl
memtx
luajit
cartridge_issues
cartridge_failover
clock
event_loop
See metrics reference for details.
All metric collectors from the collection have metainfo.default = true
.
cfg.labels
are the global labels to be added to every observation.
Global labels are applied only to metric collection. They have no effect
on how observations are stored.
Global labels can be changed on the fly.
label_pairs
from observation objects have priority over global labels.
If you pass label_pairs
to an observation method with the same key as
some global label, the method argument value will be used.
Note that both label names and values in label_pairs
are treated as strings.
-
metrics.
enable_default_metrics
([include, exclude])¶
Same as metrics.cfg{include=include, exclude=exclude}
, but include={}
is
treated as include='all'
for backward compatibility.
-
metrics.
set_global_labels
(label_pairs)¶
Same as metrics.cfg{labels=label_pairs}
.
-
metrics.
collect
([opts])¶
Collect observations from each collector.
Параметры:
- opts (
table
) – table of collect options:
invoke_callbacks
– if true
, invoke_callbacks()
is triggerred before actual collect.
default_only
– if true
, observations contain only default metrics (metainfo.default = true
).
-
object
registry
¶
-
registry:
unregister
(collector)¶
Remove a collector from the registry.
Параметры:
- collector (
collector_obj
) – the collector to be removed.
Example:
local collector = metrics.gauge('some-gauge')
-- after a while, we don't need it anymore
metrics.registry:unregister(collector)
-
registry:
find
(kind, name)¶
Find a collector in the registry.
Параметры:
Return: A collector object or nil
.
Rtype: collector_obj
Example:
local collector = metrics.gauge('some-gauge')
collector = metrics.registry:find('gauge', 'some-gauge')
-
metrics.
register_callback
(callback)¶
Register a function named callback
, which will be called right before metric
collection on plugin export.
Параметры:
- callback (
function
) – a function that takes no parameters.
This method is most often used for gauge metrics updates.
Example:
metrics.register_callback(function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end)
-
metrics.
unregister_callback
(callback)¶
Unregister a function named callback
that is called right before metric
collection on plugin export.
Параметры:
- callback (
function
) – a function that takes no parameters.
Example:
local cpu_callback = function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
-- after a while, we don't need that callback function anymore
metrics.unregister_callback(cpu_callback)
-
metrics.
invoke_callbacks
()¶
Invoke all registered callbacks. Has to be called before each collect()
.
(Since version 0.16.0, you may use collect{invoke_callbacks = true}
instead.)
If you’re using one of the default exporters,
invoke_callbacks()
will be called by the exporter.
Below are the functions that you can call
with metrics = require('cartridge.roles.metrics')
specified in your init.lua
.
-
metrics.
set_export
(export)¶
Параметры:
- export (
table
) – a table containing paths and formats of the exported metrics.
Configure the endpoints of the metrics role:
local metrics = require('cartridge.roles.metrics')
metrics.set_export({
{
path = '/path_for_json_metrics',
format = 'json'
},
{
path = '/path_for_prometheus_metrics',
format = 'prometheus'
},
{
path = '/health',
format = 'health'
}
})
You can add several entry points of the same format but with different paths,
for example:
metrics.set_export({
{
path = '/path_for_json_metrics',
format = 'json'
},
{
path = '/another_path_for_json_metrics',
format = 'json'
},
})
-
metrics.
set_default_labels
(label_pairs)¶
Add default global labels. Note that both
label names and values in label_pairs
are treated as strings.
Параметры:
- label_pairs (
table
) – Table containing label names as string keys,
label values as values.
local metrics = require('cartridge.roles.metrics')
metrics.set_default_labels({ ['my-custom-label'] = 'label-value' })
metrics
also provides middleware for monitoring HTTP
(set by the http module)
latency statistics.
-
metrics.http_middleware.
configure_default_collector
(type_name, name, help)¶
Register a collector for the middleware and set it as default.
Параметры:
Possible errors:
- A collector with the same type and name already exists in the registry.
-
metrics.http_middleware.
build_default_collector
(type_name, name[, help])¶
Register and return a collector for the middleware.
Параметры:
Return: A collector object.
Possible errors:
- A collector with the same type and name already exists in the registry.
-
metrics.http_middleware.
set_default_collector
(collector)¶
Set the default collector.
Параметры:
- collector – middleware collector object.
-
metrics.http_middleware.
get_default_collector
()¶
Return the default collector.
If the default collector hasn’t been set yet, register it (with default
http_middleware.build_default_collector(...)
parameters) and set it
as default.
Return: A collector object.
-
metrics.http_middleware.
v1
(handler, collector)¶
Latency measuring wrap-up for the HTTP ver. 1.x.x handler. Returns a wrapped handler.
Параметры:
- handler (
function
) – handler function.
- collector – middleware collector object.
If not set, the default collector is used
(like in
http_middleware.get_default_collector()
).
Usage: httpd:route(route, http_middleware.v1(request_handler, collector))
CPU metrics work only on Linux. See the metrics reference
for details.
To enable CPU metrics, first register a callback function:
local metrics = require('metrics')
local cpu_callback = function()
local cpu_metrics = require('metrics.psutils.cpu')
cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
Collected metrics example:
# HELP tnt_cpu_time Host CPU time
# TYPE tnt_cpu_time gauge
tnt_cpu_time 15006759
# HELP tnt_cpu_thread Tarantool thread cpu time
# TYPE tnt_cpu_thread gauge
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="system"} 160
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="user"} 949
tnt_cpu_thread{thread_name="tarantool",file_name="init.lua",thread_pid="1",kind="system"} 920
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="user"} 79
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="699",kind="user"} 44
tnt_cpu_thread{thread_name="coio",file_name="init.lua",thread_pid="11",kind="system"} 294
Prometheus query aggregated by thread name:
sum by (thread_name) (idelta(tnt_cpu_thread[$__interval]))
/ scalar(idelta(tnt_cpu_total[$__interval]) / tnt_cpu_count)
All psutils metric collectors have metainfo.default = true
.
To clear CPU metrics when you don’t need them anymore, remove the callback and clear the collectors with a method:
metrics.unregister_callback(cpu_callback)
cpu_metrics.clear()
Below are some examples of using metric primitives.
Notice that this usage is independent of export plugins such as
Prometheus, Graphite, etc. For documentation on how to use the plugins, see
the Metrics plugins section.
Using counters:
local metrics = require('metrics')
-- create a counter
local http_requests_total_counter = metrics.counter('http_requests_total')
-- somewhere in the HTTP requests middleware:
http_requests_total_counter:inc(1, {method = 'GET'})
Using gauges:
local metrics = require('metrics')
-- create a gauge
local cpu_usage_gauge = metrics.gauge('cpu_usage', 'CPU usage')
-- register a lazy gauge value update
-- this will be called whenever export is invoked in any plugin
metrics.register_callback(function()
    local current_cpu_usage = some_cpu_collect_function()
    cpu_usage_gauge:set(current_cpu_usage, {app = 'tarantool'})
end)
Using histograms:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a histogram
local http_requests_latency_hist = metrics.histogram(
    'http_requests_latency', 'HTTP requests latency', {2, 4, 6})
-- somewhere in the HTTP request middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency_hist:observe(latency)
Using summaries:
local metrics = require('metrics')
local fiber = require('fiber')
-- create a summary with a window of 5 age buckets and a bucket lifetime of 60 s
local http_requests_latency = metrics.summary(
    'http_requests_latency', 'HTTP requests latency',
    {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01},
    {max_age_time = 60, age_buckets_count = 5}
)
-- somewhere in the HTTP requests middleware:
local t0 = fiber.clock()
observable_function()
local t1 = fiber.clock()
local latency = t1 - t0
http_requests_latency:observe(latency)
metrics.summary(name[, help, objectives, params, metainfo])
Register a new summary. Quantile computation is based on the "Effective computation of biased quantiles over data streams" algorithm.
Parameters:
- name (string) – collector name. Must be unique.
- help (string) – collector description.
- objectives (table) – a list of "targeted" φ-quantiles in the {quantile = error, ... } form. Example: {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01}. Each targeted φ-quantile is specified as a φ-quantile together with its tolerated error. For example, {[0.5] = 0.1} means that the median (the 50th percentile) is to be returned with a 10-percent error. Percentiles and quantiles are the same concept, except that percentiles are expressed as percentages. The φ-quantile must be in the interval [0, 1]. A lower tolerated error for a φ-quantile results in higher memory and CPU usage during summary calculation.
- params (table) – table of the summary parameters used to configure the sliding time window. This window consists of several buckets that store observations. New observations are added to each bucket. After a time period, the head bucket (from which observations are collected) is reset, and the next bucket becomes the new head. This way, each bucket stores observations for max_age_time * age_buckets_count seconds before it is reset.
  max_age_time sets the duration of each bucket’s lifetime, that is, how many seconds the observations are kept before they are discarded.
  age_buckets_count sets the number of buckets in the sliding time window. This variable determines the number of buckets used to exclude observations older than max_age_time from the summary. The value is a trade-off between resources (memory and CPU for maintaining the buckets) and how smoothly the time window moves.
  Default value: {max_age_time = math.huge, age_buckets_count = 1}.
- metainfo (table) – collector metainfo.
Return: A summary object.
Rtype: summary_obj
Note
A summary represents a set of collectors:
- name .. "_sum" – a counter holding the sum of added observations.
- name .. "_count" – a counter holding the number of added observations.
- name – holds all the quantiles under observation, each exposed under the label quantile (less or equal). To access bucket x (where x is a number), specify the value x for the label quantile.
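The sliding time window described for params can be pictured with a small pure-Lua model. This is an illustrative sketch only, not the module's implementation: new_window, observe, rotate, and head_values are hypothetical helpers, and rotation is triggered by hand here instead of by a timer.

```lua
-- Illustrative model of the age-bucket window; NOT the metrics module's code.
-- All helper names below are hypothetical.

-- Create a window with the given number of empty buckets; bucket 1 is the head.
local function new_window(age_buckets_count)
    local window = {head = 1, buckets = {}}
    for i = 1, age_buckets_count do
        window.buckets[i] = {}
    end
    return window
end

-- A new observation is added to every bucket.
local function observe(window, value)
    for _, bucket in ipairs(window.buckets) do
        table.insert(bucket, value)
    end
end

-- When a period elapses, the head bucket is reset
-- and the next bucket becomes the new head.
local function rotate(window)
    window.buckets[window.head] = {}
    window.head = window.head % #window.buckets + 1
end

-- Observations are read from the head bucket only.
local function head_values(window)
    return window.buckets[window.head]
end

local w = new_window(3)
observe(w, 10)
rotate(w)
observe(w, 20)
rotate(w)
observe(w, 30)
rotate(w) -- bucket 1 is the head again; it was reset after 10 was observed
print(table.concat(head_values(w), ', ')) -- prints: 20, 30
```

In this model the oldest observation (10) has already left the head bucket, while the two newer ones are still visible. More buckets make old observations drop out in smaller steps, at the cost of maintaining more copies.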
object summary_obj
summary_obj:observe(num, label_pairs)
Record a new value in a summary.
Parameters:
- num (number) – value to put in the data stream.
- label_pairs (table) – a table containing label names as keys, label values as values. All internal counters that have these labels specified observe new counter values. You can’t add the "quantile" label to a summary. It is added automatically. If max_age_time and age_buckets_count are set, the observed value is added to each bucket. Note that both label names and values in label_pairs are treated as strings.
summary_obj:collect()
Return a concatenation of counter_obj:collect() across all internal counters of summary_obj. For the description of observation, see counter_obj:collect(). If max_age_time and age_buckets_count are set, quantile observations are collected only from the head bucket in the sliding time window, not from every bucket. If no observations were recorded, the method returns NaN in the values.
All collectors support providing label_pairs on data modification.
A label is a piece of metainfo that you associate with a metric in the key-value format.
See tags in Graphite and labels in Prometheus.
Labels are used to differentiate between the characteristics of a thing being
measured. For example, in a metric associated with the total number of HTTP
requests, you can represent methods and statuses as label pairs:
http_requests_total_counter:inc(1, {method = 'POST', status = '200'})
You don’t have to define labels in advance.
With labels, you can extract new time series (visualize their graphs) by specifying conditions with regard to label values. The example above allows extracting the following time series:
- The total number of requests over time with method = "POST" (and any status).
- The total number of requests over time with status = 500 (and any method).
You can also set global labels by calling metrics.set_global_labels({ label = value, ...}).
metrics.cfg([config])
Entry point for setting up the module. Since 0.17.0.
Parameters:
- config (table) – module configuration options:
  - cfg.include (string/table, default 'all'): 'all' to enable all supported default metrics, 'none' to disable all default metrics, or a table with the names of the default metrics to enable a specific set of metrics.
  - cfg.exclude (table, default {}): table containing the names of the default metrics that you want to disable. Has higher priority than cfg.include.
  - cfg.labels (table, default {}): table containing label names as string keys, label values as values.
You can work with metrics.cfg as a table to read values, but you must call metrics.cfg{} as a function to update them.
Supported default metric names (for cfg.include and cfg.exclude tables):
- network
- operations
- system
- replicas
- info
- slab
- runtime
- memory
- spaces
- fibers
- cpu
- vinyl
- memtx
- luajit
- cartridge_issues
- cartridge_failover
- clock
- event_loop
See metrics reference for details. All metric collectors from the collection have metainfo.default = true.
cfg.labels are the global labels to be added to every observation.
Global labels are applied only to metric collection. They have no effect on how observations are stored.
Global labels can be changed on the fly.
label_pairs from observation objects have priority over global labels. If you pass label_pairs to an observation method with the same key as some global label, the method argument value will be used.
Note that both label names and values in label_pairs are treated as strings.
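A minimal configuration sketch for the options above, assuming metrics 0.17.0 or later. The metric names are taken from the list of supported default metrics; the alias label and its value are made up for illustration.

```lua
local metrics = require('metrics')

-- Enable a specific set of default metrics and attach a global label.
-- The 'alias' label and its value are illustrative assumptions.
metrics.cfg{
    include = {'network', 'memory', 'cpu', 'fibers'},
    exclude = {'fibers'}, -- exclude has higher priority than include
    labels = {alias = 'storage-1'},
}

-- Settings can be read back by treating metrics.cfg as a table.
local include = metrics.cfg.include
local labels = metrics.cfg.labels
```

With this configuration, fibers metrics stay disabled even though they are listed in include, and every collected observation carries the alias label unless an observation's own label_pairs uses the same key.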
metrics.enable_default_metrics([include, exclude])
Same as metrics.cfg{include=include, exclude=exclude}, but include={} is treated as include='all' for backward compatibility.
metrics.set_global_labels(label_pairs)
Same as metrics.cfg{labels=label_pairs}.
metrics.collect([opts])
Collect observations from each collector.
Parameters:
- opts (table) – table of collect options:
  - invoke_callbacks – if true, invoke_callbacks() is triggered before the actual collection.
  - default_only – if true, observations contain only default metrics (metainfo.default = true).
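A short sketch of collecting observations by hand, assuming metrics 0.16.0 or later (where collect accepts options) and that the result is a flat array of observation tables as described for counter_obj:collect(). The collector name example_requests_total and its label are hypothetical.

```lua
local metrics = require('metrics')

-- Hypothetical counter used only for this illustration.
local requests = metrics.counter('example_requests_total', 'Example counter')
requests:inc(3, {method = 'GET'})

-- Run the registered callbacks first, then gather observations
-- from every collector in the registry.
local observations = metrics.collect{invoke_callbacks = true}

for _, obs in ipairs(observations) do
    if obs.metric_name == 'example_requests_total' then
        print(obs.metric_name, obs.label_pairs.method, obs.value)
    end
end
```

Passing default_only = true instead would skip this counter, since only collectors with metainfo.default = true are included in that case.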
object registry
registry:unregister(collector)
Remove a collector from the registry.
Parameters:
- collector (collector_obj) – the collector to be removed.
Example:
local collector = metrics.gauge('some-gauge')
-- after a while, we don't need it anymore
metrics.registry:unregister(collector)
registry:find(kind, name)
Find a collector in the registry.
Return: A collector object or nil.
Rtype: collector_obj
Example:
local collector = metrics.gauge('some-gauge')
collector = metrics.registry:find('gauge', 'some-gauge')
metrics.register_callback(callback)
Register a function named callback, which will be called right before metric collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
This method is most often used for gauge metrics updates.
Example:
metrics.register_callback(function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end)
metrics.unregister_callback(callback)
Unregister a function named callback that is called right before metric collection on plugin export.
Parameters:
- callback (function) – a function that takes no parameters.
Example:
local cpu_callback = function()
    local cpu_metrics = require('metrics.psutils.cpu')
    cpu_metrics.update()
end
metrics.register_callback(cpu_callback)
-- after a while, we don't need that callback function anymore
metrics.unregister_callback(cpu_callback)
metrics.invoke_callbacks()
Invoke all registered callbacks. Has to be called before each collect(). (Since version 0.16.0, you may use collect{invoke_callbacks = true} instead.) If you’re using one of the default exporters, invoke_callbacks() will be called by the exporter.
Below are the functions that you can call with metrics = require('cartridge.roles.metrics') specified in your init.lua.
metrics.set_export(export)
Parameters:
- export (table) – a table containing paths and formats of the exported metrics.
Configure the endpoints of the metrics role:
local metrics = require('cartridge.roles.metrics')
metrics.set_export({
    { path = '/path_for_json_metrics', format = 'json' },
    { path = '/path_for_prometheus_metrics', format = 'prometheus' },
    { path = '/health', format = 'health' }
})
You can add several entry points of the same format but with different paths, for example:
metrics.set_export({
    { path = '/path_for_json_metrics', format = 'json' },
    { path = '/another_path_for_json_metrics', format = 'json' },
})
metrics.set_default_labels(label_pairs)
Add default global labels. Note that both label names and values in label_pairs are treated as strings.
Parameters:
- label_pairs (table) – table containing label names as string keys, label values as values.
Example:
local metrics = require('cartridge.roles.metrics')
metrics.set_default_labels({ ['my-custom-label'] = 'label-value' })
metrics also provides middleware for monitoring HTTP (set by the http module) latency statistics.
metrics.http_middleware.configure_default_collector(type_name, name, help)
Register a collector for the middleware and set it as default.
Possible errors:
- A collector with the same type and name already exists in the registry.
metrics.http_middleware.build_default_collector(type_name, name[, help])
Register and return a collector for the middleware.
Return: A collector object.
Possible errors:
- A collector with the same type and name already exists in the registry.
metrics.http_middleware.set_default_collector(collector)
Set the default collector.
Parameters:
- collector – middleware collector object.
metrics.http_middleware.get_default_collector()
Return the default collector. If the default collector hasn’t been set yet, register it (with default http_middleware.build_default_collector(...) parameters) and set it as default.
Return: A collector object.
metrics.http_middleware.v1(handler, collector)
Latency-measuring wrapper for an HTTP ver. 1.x.x handler. Returns a wrapped handler.
Parameters:
- handler (function) – handler function.
- collector – middleware collector object. If not set, the default collector is used (like in http_middleware.get_default_collector()).
Usage:
httpd:route(route, http_middleware.v1(request_handler, collector))