box.info.replication | Tarantool
Документация на русском языке
поддерживается сообществом

box.info.replication

box.info.replication

The replication section of box.info() is a table with statistics for all instances in the replica set that the current instance belongs to. To see the example, refer to Monitoring a replica set.

In the following, n is the index number of one table item, for example, replication[1], which has data about server instance number 1, which may or may not be the same as the current instance (the «current instance» is what is responding to box.info).

  • replication[n].id – это короткий числовой идентификатор экземпляра в наборе реплик. Данное значение хранится в системном спейсе box.space._cluster.

  • replication[n].uuid – это глобально-уникальный идентификатор экземпляра. Данное значение хранится в системном спейсе box.space._cluster.

  • replication[n].lsn is the log sequence number (LSN) for the latest entry in instance n’s write-ahead log (WAL).

  • replication[n].upstream appears (is not nil) if the current instance is following or intending to follow instance n, which ordinarily means replication[n].upstream.status = follow, replication[n].upstream.peer = url of instance n which is being followed, replication[n].lag and idle = the instance’s speed, described later. Another way to say this is: replication[n].upstream will appear when replication[n].upstream.peer is not of the current instance, and is not read-only, and was specified in box.cfg{replication={...}}, so it is shown in box.cfg.replication.

  • replication[n].upstream.status is the replication status of the connection with the instance n:

    • connect: an instance is connecting to the master.
    • auth: authentication is being performed.
    • wait_snapshot: an instance is receiving metadata from the master. If join fails with a non-critical error at this stage (for example, ER_READONLY, ER_ACCESS_DENIED, or a network-related issue), an instance tries to find a new master to join.
    • fetch_snapshot: an instance is receiving data from the master’s .snap files.
    • final_join: an instance is receiving new data added during fetch_snapshot.
    • sync: the master and replica are synchronizing to have the same data.
    • follow: the current instance’s role is replica. This means that the instance is read-only or acts as a replica for this remote peer in master-master configuration. The instance is receiving or able to receive data from the instance n’s (upstream) master.
    • stopped: replication is stopped due to a replication error (for example, duplicate key).
    • disconnected: an instance is not connected to the replica set (for example, due to network issues, not replication errors).

    Learn more from Replication stages.

  • replication[n].upstream.idle is the time (in seconds) since the last event was received. This is the primary indicator of replication health. Learn more from Monitoring a replica set.
  • replication[n].upstream.lag is the time difference between the local time of instance n, recorded when the event was received, and the local time at another master recorded when the event was written to the write-ahead log on that master. Learn more from Monitoring a replica set.

  • replication[n].upstream.message contains an error message in case of a degraded state; otherwise, it is nil.

  • replication[n].downstream appears (is not nil) with data about an instance that is following instance n or is intending to follow it, which ordinarily means replication[n].downstream.status = follow.

  • replication[n].downstream.vclock contains the vector clock, which is a table of „id, lsn“ pairs, for example, vclock: {1: 3054773, 4: 8938827, 3: 285902018}. (Notice that the table may have multiple pairs although vclock is a singular name).

    Даже если экземпляр удален, его значения все равно появятся здесь; однако, его значения будут переопределены, если позже экземпляр присоединится с тем же UUID. Пары векторных часов будут появляться только если lsn > 0.

    replication[n].downstream.vclock может быть таким же, как и vclock текущего экземпляра (box.info.vclock), потому что все значения vclock в кластере известны. Мастер будет знать, что находится в копии vclock реплики, потому что, когда мастер делает изменение данных, он посылает информацию об изменении на реплику (включая векторные часы мастера), и реплика отвечает тем, что находится в ее таблице векторных часов.

    A replica also sends its entire vector clock table in response to a master’s heartbeat message, see the heartbeat-message examples in the section Binary protocol – replication.

  • replication[n].downstream.idle – это время (в секундах) с момента последней отправки событий экземпляром n через downstream-репликацию.

  • replication[n].downstream.status – это статус для downstream-репликации:

    • stopped означает, что downstream-репликация остановлена,
    • follow означает, что downstream-репликация находится в процессе (экземпляр n готов принимать данные от мастера или уже делает это).
  • replication[n].downstream.lag is the time difference between the local time at the master node, recorded when a particular transaction was written to the write-ahead log, and the local time recorded when it receives an acknowledgment for this transaction from a replica. Since version 2.10.0. See more in Monitoring a replica set.

  • replication[n].downstream.message and replication[n].downstream.system_message will be nil unless a problem occurs with the connection. For example, if instance n goes down, then one may see status = 'stopped', message = 'unexpected EOF when reading from socket', and system_message = 'Broken pipe'. See also degraded state.

For better understanding, see the following diagram illustrating the upstream and downstream connections within the replica set of three instances:

../../../../_images/replication.svg
Нашли ответ на свой вопрос?
Обратная связь