Module cartridge.issues
Monitor issues across cluster instances.
Cartridge detects the following problems:
Replication:
- “Replication from … to … isn’t running” - when box.info.replication.upstream == nil;
- “Replication from … to … is stopped/orphan/etc. (…)”;
- “Replication from … to …: high lag” - when upstream.lag > box.cfg.replication_sync_lag;
- “Replication from … to …: long idle” - when upstream.idle > 2 * box.cfg.replication_timeout;
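The three replication conditions that are spelled out above can be sketched as a small pure-Lua check. This is an illustration only, not Cartridge's actual implementation: the function name `replication_issues`, the returned labels, and the sample values are hypothetical, and the “stopped/orphan/etc.” case is left out because its exact condition is not stated here.

```lua
-- Hypothetical helper (not part of cartridge.issues).
-- `upstream` mimics the upstream entry of one replica in box.info.replication,
-- `cfg` mimics the relevant fields of box.cfg.
local function replication_issues(upstream, cfg)
    -- No upstream at all: replication isn't running.
    if upstream == nil then
        return {'not running'}
    end

    local found = {}
    -- High lag: upstream.lag > box.cfg.replication_sync_lag
    if upstream.lag ~= nil and upstream.lag > cfg.replication_sync_lag then
        table.insert(found, 'high lag')
    end
    -- Long idle: upstream.idle > 2 * box.cfg.replication_timeout
    if upstream.idle ~= nil and upstream.idle > 2 * cfg.replication_timeout then
        table.insert(found, 'long idle')
    end
    return found
end

-- Sample values chosen for illustration.
local cfg = {replication_sync_lag = 10, replication_timeout = 1}
local laggy = replication_issues({lag = 12.5, idle = 0.3}, cfg)
local idle = replication_issues({lag = 0.1, idle = 2.5}, cfg)
local healthy = replication_issues({lag = 0.1, idle = 0.3}, cfg)
```

On a real instance the inputs would come from box.info.replication and box.cfg rather than from hand-written tables.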
Failover:
- “Can’t obtain failover coordinator (…)”;
- “There is no active failover coordinator”;
- “Failover is stuck on …: Error fetching appointments (…)”;
- “Failover is stuck on …: Failover fiber is dead” - this is likely a bug;
Tables
limits
Thresholds for issuing warnings.
All settings are local, not clusterwide. They can be changed with
corresponding environment variables (TARANTOOL_*) or command-line
arguments. See cartridge.argparse module for details.
Fields:
- fragmentation_threshold_critical: (number) default: 0.9.
- fragmentation_threshold_warning: (number) default: 0.6.
- clock_delta_threshold_warning: (number) default: 5.
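As a rough sketch of how a warning and a critical threshold might be applied together, the snippet below classifies a fragmentation ratio using the default values listed above. The function name `fragmentation_level`, the use of a strict `>` comparison, and the returned labels are assumptions made for illustration; the module's real comparison logic is not described in this section.

```lua
-- Defaults copied from the field list above.
local limits = {
    fragmentation_threshold_critical = 0.9,
    fragmentation_threshold_warning = 0.6,
    clock_delta_threshold_warning = 5,
}

-- Hypothetical helper: maps a fragmentation ratio (0..1) to a severity.
-- Assumes that exceeding a threshold triggers the corresponding level.
local function fragmentation_level(ratio, lim)
    if ratio > lim.fragmentation_threshold_critical then
        return 'critical'
    elseif ratio > lim.fragmentation_threshold_warning then
        return 'warning'
    end
    return nil -- below both thresholds: no issue
end

local level_high = fragmentation_level(0.95, limits)
local level_mid = fragmentation_level(0.7, limits)
local level_low = fragmentation_level(0.2, limits)
```

Checking the critical threshold first ensures a ratio above both limits is reported once, at the higher severity.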