Persistence
To ensure data persistence, Tarantool can:
- Record each data change request into write-ahead log (WAL) files (.xlog files). When a power outage occurs or the Tarantool instance is killed accidentally, the in-memory database is lost. In that case, Tarantool restores the data from the WAL files by reading them and redoing the requests. This is called the "recovery process".
- Take snapshots that contain an on-disk copy of the entire data set at a given moment (.snap files). During the recovery process, Tarantool can load the latest snapshot file and then read the requests from the WAL files produced after this snapshot was made. After creating a new snapshot, the earlier WAL files can be removed to free up space.
This topic describes how to configure:
- the snapshot creation in the snapshot section of a YAML configuration.
- the recording to the write-ahead log in the wal section of a YAML configuration.
To learn more about the persistence mechanism in Tarantool, see the Persistence section. The formats of WAL and snapshot files are described in detail in the File formats section.
Example on GitHub: snapshot
This section describes how to define snapshot settings in the snapshot section of a YAML configuration.
Note
To force immediate creation of a snapshot file, use the box.snapshot() function.
In Tarantool, it is possible to automate the snapshot creation. Automatic creation is enabled by default and can be configured in two ways:
- A new snapshot is taken once in a given period (see snapshot.by.interval).
- A new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit (see snapshot.by.wal_size).
The snapshot.by.interval option sets up the checkpoint daemon that takes a new snapshot every snapshot.by.interval seconds.
If the snapshot.by.interval option is set to zero, the checkpoint daemon is disabled.
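As a minimal sketch of the zero-interval case, the fragment below disables the checkpoint daemon. Placing the snapshot section at the global level of the configuration is an assumption made for illustration. A snapshot can still be created on demand with box.snapshot().

```yaml
# Sketch: turn off interval-based snapshot creation.
snapshot:
  by:
    interval: 0   # zero disables the checkpoint daemon
```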
The snapshot.by.wal_size option defines the maximum size in bytes for all WAL files created since the last snapshot was taken.
Once this size is exceeded, the checkpoint daemon takes a snapshot. Then, the Tarantool garbage collector deletes the old WAL files.
The example shows how to specify the snapshot.by.interval and the snapshot.by.wal_size options:
by:
  interval: 7200
  wal_size: 1000000000000000000
In the example, a new snapshot is created in two cases:
- every 2 hours (every 7200 seconds)
- when the total size of all WAL files created since the last snapshot exceeds 1e18 (1000000000000000000) bytes.
To configure a directory where the snapshot files are stored, use the snapshot.dir configuration option.
The example below shows how to specify a snapshot directory for instance001 explicitly:
instance001:
  snapshot:
    dir: 'var/lib/{{ instance_name }}/snapshots'
By default, WAL files and snapshot files are stored in the same directory, var/lib/{{ instance_name }}.
However, you can specify different directories for them.
For example, you can place snapshots and write-ahead logs on different hard drives for better reliability:
instance001:
  snapshot:
    dir: '/media/drive1/snapshots'
  wal:
    dir: '/media/drive2/wals'
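For orientation, an instance-level fragment like the one above sits inside the groups, replicasets, and instances hierarchy of a full YAML configuration. The sketch below assumes a single instance; the names group001 and replicaset001 are placeholders chosen for illustration and do not come from this topic.

```yaml
groups:
  group001:              # hypothetical group name
    replicasets:
      replicaset001:     # hypothetical replica set name
        instances:
          instance001:
            snapshot:
              dir: '/media/drive1/snapshots'
            wal:
              dir: '/media/drive2/wals'
```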
You can set a limit on the number of snapshots stored in the snapshot.dir directory using the snapshot.count option. Once the number of snapshots reaches the given limit, Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files after the new snapshot is taken.
In the example below, a snapshot is created every two hours (every 7200 seconds) until there are three snapshots in the snapshot.dir directory.
After creating a new snapshot (the fourth one), the oldest snapshot and the corresponding WALs are deleted.
count: 3
by:
  interval: 7200
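The fragments in this section omit the enclosing snapshot key. Put together, a snapshot section that combines the options described so far might look like the sketch below. The values are taken from the earlier examples; placing the section at the global level rather than under a specific instance is an assumption.

```yaml
snapshot:
  dir: 'var/lib/{{ instance_name }}/snapshots'
  count: 3                          # keep at most three snapshot files
  by:
    interval: 7200                  # every 2 hours
    wal_size: 1000000000000000000   # or when WAL files exceed 1e18 bytes
```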
Example on GitHub: wal
This section describes how to define WAL settings in the wal section of a YAML configuration.
The recording to the write-ahead log is enabled by default. It means that if an instance restart occurs, the data will be recovered. The recording to the WAL can be configured using the wal.mode configuration option.
There are two modes that enable writing to the WAL:
- write (default) – enable WAL and write the data without waiting for the data to be flushed to the storage device.
- fsync – enable WAL and ensure that the record is written to the storage device.
The example below shows how to specify the write WAL mode:
mode: 'write'
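For comparison, here is a sketch of the fsync mode, shown with its enclosing wal key. Because each record has to reach the storage device, writes in this mode typically take longer than in write mode.

```yaml
wal:
  mode: 'fsync'   # ensure each record is written to the storage device
```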
To turn the WAL writer off, set the wal.mode option to none.
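A minimal sketch of that setting, shown with its enclosing wal key. With the WAL writer off, requests are not recorded, so changes made after the latest snapshot cannot be replayed during recovery.

```yaml
wal:
  mode: 'none'   # no write-ahead log is written
```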
To configure a directory where the WAL files are stored, use the wal.dir configuration option.
The example below shows how to specify a directory for instance001 explicitly:
instance001:
  wal:
    dir: 'var/lib/{{ instance_name }}/wals'
In case of replication or hot standby mode, Tarantool scans for changes in the WAL files every wal.dir_rescan_delay seconds. The example below shows how to specify the interval between scans:
dir_rescan_delay: 3
A new WAL file is created when the current one reaches the wal.max_size size. The configuration for this option might look as follows:
max_size: 268435456
In Tarantool, the checkpoint daemon takes new snapshots at the given interval (see snapshot.by.interval). After an instance restart, the Tarantool garbage collector deletes the old WAL files.
To delay the immediate deletion of WAL files, use the wal.cleanup_delay configuration option. The delay eliminates possible erroneous situations when the master deletes WALs needed by replicas after restart. As a consequence, replicas sync with the master faster after its restart and don’t need to download all the data again.
In the example, the delay is set to 5 hours (18000 seconds):
cleanup_delay: 18000
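The single-line fragments above omit the enclosing wal key. As a sketch, a wal section that combines the options described in this section might look as follows. The values are the ones used in the examples above; placing the section at the global level is an assumption.

```yaml
wal:
  dir: 'var/lib/{{ instance_name }}/wals'
  mode: 'write'
  dir_rescan_delay: 3      # seconds between scans of the WAL directory
  max_size: 268435456      # 256 MiB per WAL file
  cleanup_delay: 18000     # 5 hours
```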
In Tarantool Enterprise, you can store an old and new tuple for each CRUD operation performed. A detailed description and examples of the WAL extensions are provided in the WAL extensions section.
See also: wal.ext.* configuration options.
The checkpoint daemon (snapshot daemon) is a constantly running fiber.
The checkpoint daemon creates a schedule for the periodic snapshot creation based on the configuration options and the speed of file size growth.
If enabled, the daemon makes new snapshot (.snap) files according to this schedule.
The work of the checkpoint daemon is based on the following configuration options:
- snapshot.by.interval – a new snapshot is taken once in a given period.
- snapshot.by.wal_size – a new snapshot is taken once the size of all WAL files created since the last snapshot exceeds a given limit.
If necessary, the checkpoint daemon also activates the Tarantool garbage collector that deletes old snapshots and WAL files.
Note
The memtx engine takes only regular snapshots with the interval set in the checkpoint daemon configuration.
The vinyl engine runs checkpointing in the background at all times.
Tarantool garbage collector can be activated by the checkpoint daemon. The garbage collector tracks the snapshots that are to be relayed to a replica or needed by other consumers. When the files are no longer needed, Tarantool garbage collector deletes them.
Note
The garbage collector called by the checkpoint daemon is distinct from the Lua garbage collector which is for Lua objects, and distinct from the Tarantool garbage collector that specializes in handling shard buckets.
This garbage collector is called as follows:
- When the number of snapshots reaches the snapshot.count limit. After a new snapshot is taken, the Tarantool garbage collector deletes the oldest snapshot file and any associated WAL files.
- When the size of all WAL files created since the last snapshot reaches the limit of snapshot.by.wal_size. Once this size is exceeded, the checkpoint daemon takes a snapshot, then the garbage collector deletes the old WAL files.
If an old snapshot file is deleted, the Tarantool garbage collector also deletes any write-ahead log (.xlog) files that meet the following conditions:
- The WAL files are older than the snapshot file.
- The WAL files contain information present in the snapshot file.
The Tarantool garbage collector also deletes obsolete vinyl .run files.
Tarantool garbage collector doesn’t delete a file in the following cases:
- A backup is running, and the file has not been backed up (see Hot backup).
- Replication is running, and the file has not been relayed to a replica (see Replication architecture).
- A replica is connecting.
- A replica has fallen behind. The progress of each replica is tracked; if a replica’s position is far from being up to date, then the server stops to give it a chance to catch up. If an administrator concludes that a replica is permanently down, then the correct procedure is to restart the server, or (preferably) remove the replica from the cluster.