box.info.replication
The replication section of box.info() is a table with statistics for all instances in the replica set that the current instance belongs to. To see the example, refer to Monitoring a replica set.

In the following, n is the index number of one table item, for example,
replication[1], which has data about server instance number 1, which may or may not be the same as the current instance (the “current instance” is what is responding to box.info).

replication[n].id is a short numeric identifier of instance n within the replica set. This value is stored in the box.space._cluster system space.

replication[n].uuid is a globally unique identifier of instance n. This value is stored in the box.space._cluster system space.

replication[n].lsn is the log sequence number (LSN) for the latest entry in instance n’s write-ahead log (WAL).

replication[n].name is the instance name. See also: box.info.name.

replication[n].upstream appears (is not nil) if the current instance is following or intending to follow instance n, which ordinarily means replication[n].upstream.status = follow, replication[n].upstream.peer = URI of instance n which is being followed, replication[n].upstream.lag and replication[n].upstream.idle = the instance’s speed, described later. Another way to say this is: replication[n].upstream will appear when replication[n].upstream.peer is not the URI of the current instance, and is not read-only, and was specified in box.cfg{replication={...}}, so it is shown in box.cfg.replication.

replication[n].upstream.status is the replication status of the connection with the instance n:

- connect: an instance is connecting to the master.
- auth: authentication is being performed.
- wait_snapshot: an instance is receiving metadata from the master. If join fails with a non-critical error at this stage (for example, ER_READONLY, ER_ACCESS_DENIED, or a network-related issue), an instance tries to find a new master to join.
- fetch_snapshot: an instance is receiving data from the master’s .snap files.
- final_join: an instance is receiving new data added during fetch_snapshot.
- sync: the master and replica are synchronizing to have the same data.
- follow: the current instance’s role is replica. This means that the instance is read-only or acts as a replica for this remote peer in master-master configuration. The instance is receiving or able to receive data from the instance n’s (upstream) master.
- stopped: replication is stopped due to a replication error (for example, duplicate key).
- disconnected: an instance is not connected to the replica set (for example, due to network issues, not replication errors).
Learn more from Replication stages.
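
As a rough illustration of how the upstream statuses above might be consumed by a monitoring script, the following Python sketch groups them into coarse categories. It works on a plain dict shaped like the replication table; the grouping, the function name classify_upstreams, and all sample values are assumptions made for this example and are not part of Tarantool.

```python
# Hypothetical grouping of the upstream statuses described above.
IN_PROGRESS = {"connect", "auth", "wait_snapshot", "fetch_snapshot", "final_join", "sync"}
BROKEN = {"stopped", "disconnected"}


def classify_upstreams(replication):
    """Map each table index n to a coarse label based on replication[n].upstream.status."""
    result = {}
    for n, item in replication.items():
        upstream = item.get("upstream")
        if upstream is None:
            # No upstream section: the current instance does not follow instance n.
            result[n] = "no upstream"
            continue
        status = upstream.get("status")
        if status == "follow":
            result[n] = "healthy"
        elif status in IN_PROGRESS:
            result[n] = "in progress"
        elif status in BROKEN:
            result[n] = "broken"
        else:
            result[n] = "unknown"
    return result


# Sample data shaped like box.info.replication (all values are made up).
sample = {
    1: {"id": 1, "lsn": 120},
    2: {"id": 2, "lsn": 95,
        "upstream": {"status": "follow", "peer": "127.0.0.1:3302", "idle": 0.3, "lag": 0.001}},
    3: {"id": 3, "lsn": 40,
        "upstream": {"status": "disconnected", "peer": "127.0.0.1:3303", "idle": 42.0}},
}

print(classify_upstreams(sample))
```

Treating every pre-follow stage as "in progress" is only one possible policy; a stricter check could alert when an instance stays in one of those stages for too long.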
replication[n].upstream.idle is the time (in seconds) since the last event was received. This is the primary indicator of replication health. Learn more from Monitoring a replica set.

replication[n].upstream.peer contains instance n’s URI, for example, 127.0.0.1:3302. Learn more from Monitoring a replica set.
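
Since upstream.idle is described as the primary health indicator, a simple check could flag peers whose idle time exceeds some limit. The sketch below is a minimal Python illustration on a dict shaped like the replication table; the function name stale_upstreams, the max_idle threshold, and the sample values are hypothetical, and a suitable threshold depends on the deployment.

```python
def stale_upstreams(replication, max_idle):
    """Return (peer, idle) pairs whose upstream.idle exceeds max_idle seconds."""
    stale = []
    for item in replication.values():
        upstream = item.get("upstream")
        if upstream is None:
            continue  # the current instance does not follow this item
        idle = upstream.get("idle")
        if idle is not None and idle > max_idle:
            stale.append((upstream.get("peer"), idle))
    return stale


# Sample data shaped like box.info.replication (all values are made up).
sample = {
    1: {"id": 1},
    2: {"id": 2, "upstream": {"status": "follow", "peer": "127.0.0.1:3302", "idle": 0.4}},
    3: {"id": 3, "upstream": {"status": "follow", "peer": "127.0.0.1:3303", "idle": 17.5}},
}

# max_idle=5.0 is an arbitrary value chosen for the example.
print(stale_upstreams(sample, max_idle=5.0))
```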
replication[n].upstream.lag is the time difference between the local time of instance n, recorded when the event was received, and the local time at another master recorded when the event was written to the write-ahead log on that master. Learn more from Monitoring a replica set.

replication[n].upstream.message contains an error message in case of a degraded state; otherwise, it is nil.

replication[n].downstream appears (is not nil) with data about an instance that is following instance n or is intending to follow it, which ordinarily means replication[n].downstream.status = follow.

replication[n].downstream.vclock contains the vector clock, which is a table of ‘id, lsn’ pairs, for example, vclock: {1: 3054773, 4: 8938827, 3: 285902018}. (Notice that the table may have multiple pairs although vclock is a singular name.)

Even if instance n is removed, its values will still appear here; however, its values will be overridden if an instance joins later with the same UUID. Vector clock pairs will only appear if lsn > 0.

replication[n].downstream.vclock may be the same as the current instance’s vclock (box.info.vclock) because this is for all known vclock values of the cluster. A master will know what is in a replica’s copy of vclock because, when the master makes a data change, it sends the change information to the replica (including the master’s vector clock), and the replica replies with what is in its entire vector clock table.

A replica also sends its entire vector clock table in response to a master’s heartbeat message; see the heartbeat-message examples in the section Binary protocol – replication.
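
To make the ‘id, lsn’ pairs more concrete, the following Python sketch compares two vector clocks component by component and reports how many LSNs one copy is behind the other. It assumes the clocks have already been read into plain dicts; the function name vclock_gap and the replica values are hypothetical. Missing components are treated as 0, in line with the note that pairs only appear if lsn > 0.

```python
def vclock_gap(local_vclock, downstream_vclock):
    """For each instance id, count the LSNs by which downstream_vclock trails local_vclock."""
    gap = {}
    for instance_id, lsn in local_vclock.items():
        # A component absent from the downstream clock is treated as lsn 0.
        behind = lsn - downstream_vclock.get(instance_id, 0)
        if behind > 0:
            gap[instance_id] = behind
    return gap


# The local clock reuses the example pairs from the text; the replica clock is made up.
local = {1: 3054773, 4: 8938827, 3: 285902018}
replica = {1: 3054770, 3: 285902018}

print(vclock_gap(local, replica))  # {1: 3, 4: 8938827}
```

An empty result means the downstream copy has every change known to the local clock.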
replication[n].downstream.idle is the time (in seconds) since the last time that instance n sent events through the downstream replication.

replication[n].downstream.status is the replication status for downstream replications:

- stopped means that downstream replication has stopped,
- follow means that downstream replication is in progress (instance n is ready to accept data from the master or is currently doing so).
replication[n].downstream.lag is the time difference between the local time at the master node, recorded when a particular transaction was written to the write-ahead log, and the local time recorded when it receives an acknowledgment for this transaction from a replica. Since version 2.10.0. See more in Monitoring a replica set.

replication[n].downstream.message and replication[n].downstream.system_message will be nil unless a problem occurs with the connection. For example, if instance n goes down, then one may see status = 'stopped', message = 'unexpected EOF when reading from socket', and system_message = 'Broken pipe'. See also degraded state.
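
The downstream fields can be scanned in the same way to spot a degraded connection. The Python sketch below collects every item whose downstream section reports status 'stopped' or carries a non-nil message. It operates on a dict shaped like the replication table; the function name downstream_problems is hypothetical, and the sample reuses the error strings from the example above.

```python
def downstream_problems(replication):
    """List downstream sections that are stopped or carry an error message."""
    problems = []
    for n, item in sorted(replication.items()):
        downstream = item.get("downstream")
        if downstream is None:
            continue  # no instance is known to follow through this item
        stopped = downstream.get("status") == "stopped"
        message = downstream.get("message")
        if stopped or message is not None:
            problems.append({
                "n": n,
                "status": downstream.get("status"),
                "message": message,
                "system_message": downstream.get("system_message"),
            })
    return problems


# Sample data shaped like box.info.replication; idle and lag values are made up.
sample = {
    1: {"id": 1},
    2: {"id": 2, "downstream": {"status": "follow", "idle": 0.2, "lag": 0.0005}},
    3: {"id": 3, "downstream": {
        "status": "stopped",
        "message": "unexpected EOF when reading from socket",
        "system_message": "Broken pipe",
    }},
}

for problem in downstream_problems(sample):
    print(problem)
```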