Master-master

Example on GitHub: master_master

This tutorial shows how to configure and work with a master-master replica set.

Before starting this tutorial:

  1. Install the tt utility.

  2. Create a tt environment in the current directory by executing the tt init command.

  3. Inside the instances.enabled directory of the created tt environment, create the master_master directory.

  4. Inside instances.enabled/master_master, create the instances.yml and config.yaml files:

    • instances.yml specifies instances to run in the current environment and should look like this:

      instance001:
      instance002:
      
    • The config.yaml file is intended to store a replica set configuration.
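
Taken together, these preparation steps amount to roughly the following shell session (a sketch; installing tt itself depends on your OS and is omitted here):

$ tt init
$ mkdir instances.enabled/master_master
$ cd instances.enabled/master_master
$ touch instances.yml config.yaml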

This section describes how to configure a replica set in config.yaml.

First, set the replication.failover option to off:

replication:
  failover: off

Define a replica set topology inside the groups section:

  • The database.mode option should be set to rw to make instances work in read-write mode.
  • The iproto.listen option specifies an address used to listen for incoming requests and allows replicas to communicate with each other.

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

In the credentials section, create the replicator user with the replication role:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

Set iproto.advertise.peer to advertise the current instance to other replica set members:

iproto:
  advertise:
    peer:
      login: replicator

The resulting replica set configuration should look as follows:

credentials:
  users:
    replicator:
      password: 'topsecret'
      roles: [replication]

iproto:
  advertise:
    peer:
      login: replicator

replication:
  failover: off

groups:
  group001:
    replicasets:
      replicaset001:
        instances:
          instance001:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3301'
          instance002:
            database:
              mode: rw
            iproto:
              listen:
              - uri: '127.0.0.1:3302'

  1. After configuring a replica set, execute the tt start command from the tt environment directory:

    $ tt start master_master
       • Starting an instance [master_master:instance001]...
       • Starting an instance [master_master:instance002]...
    
  2. Check that instances are in the RUNNING status using the tt status command:

    $ tt status master_master
    INSTANCE                      STATUS      PID
    master_master:instance001     RUNNING     30818
    master_master:instance002     RUNNING     30819
    

  1. Connect to both instances using tt connect. Below is the example for instance001:

    $ tt connect master_master:instance001
       • Connecting to the instance...
       • Connected to master_master:instance001
    
    master_master:instance001>
    
  2. Check that both instances are writable using box.info.ro:

    • instance001:

      master_master:instance001> box.info.ro
      ---
      - false
      ...
      
    • instance002:

      master_master:instance002> box.info.ro
      ---
      - false
      ...
      
  3. Execute box.info.replication to check the replica set status. For instance002, upstream.status and downstream.status should be follow.

    master_master:instance001> box.info.replication
    ---
    - 1:
        id: 1
        uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
        lsn: 7
        name: instance001
      2:
        id: 2
        uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
        lsn: 0
        upstream:
          status: follow
          idle: 0.93246499999987
          peer: replicator@127.0.0.1:3302
          lag: 0.00016188621520996
        name: instance002
        downstream:
          status: follow
          idle: 0.8988360000003
          vclock: {1: 7}
          lag: 0
    ...
    

    To see the diagrams that illustrate how the upstream and downstream connections look, refer to Monitoring a replica set.
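
    To get a quick summary instead of reading the full output, you can also iterate over box.info.replication in the console (a minimal sketch that prints only the fields used in this tutorial):

    for id, r in pairs(box.info.replication) do
        local up = r.upstream and r.upstream.status or 'none'
        local down = r.downstream and r.downstream.status or 'none'
        print(('id=%d name=%s upstream=%s downstream=%s'):format(id, r.name or '?', up, down))
    end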

Note

A vclock value might include a 0 component, which accounts for local space operations and can therefore differ between instances in a replica set.
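
If you are curious where a 0 component comes from, you can reproduce it by writing to a local space (a minimal sketch; the space name is arbitrary):

-- Writes to local spaces are not replicated, so they are accounted
-- in the 0 component of this instance's vclock only.
box.schema.space.create('local_notes', { is_local = true })
box.space.local_notes:create_index('primary')
box.space.local_notes:insert { 1, 'not replicated' }
box.info.vclock  -- now contains a 0 key, for example {0: 1, 1: 12, 2: 2}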

To check that both instances get updates from each other, follow the steps below:

  1. On instance001, create a space, format it, and create a primary index:

    box.schema.space.create('bands')
    box.space.bands:format({
        { name = 'id', type = 'unsigned' },
        { name = 'band_name', type = 'string' },
        { name = 'year', type = 'unsigned' }
    })
    box.space.bands:create_index('primary', { parts = { 'id' } })
    

    Then, add sample data to this space:

    box.space.bands:insert { 1, 'Roxette', 1986 }
    box.space.bands:insert { 2, 'Scorpions', 1965 }
    
  2. On instance002, use the select operation to make sure data is replicated:

    master_master:instance002> box.space.bands:select()
    ---
    - - [1, 'Roxette', 1986]
      - [2, 'Scorpions', 1965]
    ...
    
  3. Add more data to the created space on instance002:

    box.space.bands:insert { 3, 'Ace of Base', 1987 }
    box.space.bands:insert { 4, 'The Beatles', 1960 }
    
  4. Get back to instance001 and use select to make sure new records are replicated:

    master_master:instance001> box.space.bands:select()
    ---
    - - [1, 'Roxette', 1986]
      - [2, 'Scorpions', 1965]
      - [3, 'Ace of Base', 1987]
      - [4, 'The Beatles', 1960]
    ...
    
  5. Check that box.info.vclock values are the same on both instances:

    • instance001:

      master_master:instance001> box.info.vclock
      ---
      - {2: 2, 1: 12}
      ...
      
    • instance002:

      master_master:instance002> box.info.vclock
      ---
      - {2: 2, 1: 12}
      ...
      

Note

To learn how to fix and prevent replication conflicts using trigger functions, see Resolving replication conflicts.
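
As a preview of that approach, such a trigger might look roughly like the sketch below. The winner-selection rule (prefer the tuple with the larger year) is purely hypothetical; see the linked page for the complete pattern:

box.space.bands:before_replace(function(old, new)
    -- Only interfere with INSERT/REPLACE rows that arrive via replication.
    if old ~= nil and new ~= nil and box.session.type() == 'applier' then
        -- Hypothetical rule: prefer the tuple with the newer year.
        if new[3] > old[3] then
            return new
        end
        return old
    end
    -- For everything else, proceed with the original request.
    return new
end)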

To insert conflicting records into instance001 and instance002, follow the steps below:

  1. Stop instance001 using the tt stop command:

    $ tt stop master_master:instance001
    
  2. On instance002, insert a new record:

    box.space.bands:insert { 5, 'incorrect data', 0 }
    
  3. Stop instance002 using tt stop:

    $ tt stop master_master:instance002
    
  4. Start instance001 back up:

    $ tt start master_master:instance001
    
  5. Connect to instance001 and insert a record that should conflict with a record already inserted on instance002:

    box.space.bands:insert { 5, 'Pink Floyd', 1965 }
    
  6. Start instance002 back up:

    $ tt start master_master:instance002
    

    Then, check box.info.replication on instance001. upstream.status should be stopped because of the Duplicate key exists error:

    master_master:instance001> box.info.replication
    ---
    - 1:
        id: 1
        uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
        lsn: 13
        name: instance001
      2:
        id: 2
        uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
        lsn: 2
        upstream:
          peer: replicator@127.0.0.1:3302
          lag: 115.99977827072
          status: stopped
          idle: 2.0342070000006
          message: Duplicate key exists in unique index "primary" in space "bands" with
            old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data",
            0]
        name: instance002
        downstream:
          status: stopped
          message: 'unexpected EOF when reading from socket, called on fd 24, aka 127.0.0.1:3301,
            peer of 127.0.0.1:58478: Broken pipe'
          system_message: Broken pipe
    ...
    

    The diagram below illustrates how the upstream and downstream connections look:

    [Diagram: replication status on a new master]

To resolve the replication conflict, instance002 first needs to get the correct data from instance001. To achieve this, rebootstrap instance002:

  1. Select all the tuples in the box.space._cluster system space to get the UUID of instance002:

    master_master:instance001> box.space._cluster:select()
    ---
    - - [1, 'c3bfd89f-5a1c-4556-aa9f-461377713a2a', 'instance001']
      - [2, 'dccf7485-8bff-47f6-bfc4-b311701e36ef', 'instance002']
    ...
    
  2. In the config.yaml file, change the following instance002 settings:

    • Set database.mode to ro.
    • Set database.instance_uuid to the UUID value obtained in the previous step.

    instance002:
      database:
        mode: ro
        instance_uuid: 'dccf7485-8bff-47f6-bfc4-b311701e36ef'
    
  3. Reload configurations on both instances using the config:reload() function:

    • instance001:

      master_master:instance001> require('config'):reload()
      ---
      ...
      
    • instance002:

      master_master:instance002> require('config'):reload()
      ---
      ...
      
  4. Delete write-ahead logs and snapshots stored in the var/lib/instance002 directory.

    Note

    var/lib is the default directory used by tt to store write-ahead logs and snapshots. Learn more from Configuration.
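
    For example, from the tt environment directory (a sketch only; double-check that the paths match your layout before deleting anything):

    $ rm -f var/lib/instance002/*.xlog var/lib/instance002/*.snap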

  5. Restart instance002 using the tt restart command:

    $ tt restart master_master:instance002
    
  6. Connect to instance002 and make sure it received the correct data from instance001:

    master_master:instance002> box.space.bands:select()
    ---
    - - [1, 'Roxette', 1986]
      - [2, 'Scorpions', 1965]
      - [3, 'Ace of Base', 1987]
      - [4, 'The Beatles', 1960]
      - [5, 'Pink Floyd', 1965]
    ...
    

After reseeding the replica, you need to resolve the replication conflict that keeps replication stopped:

  1. Execute box.info.replication on instance001. upstream.status is still stopped:

    master_master:instance001> box.info.replication
    ---
    - 1:
        id: 1
        uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
        lsn: 13
        name: instance001
      2:
        id: 2
        uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
        lsn: 2
        upstream:
          peer: replicator@127.0.0.1:3302
          lag: 115.99977827072
          status: stopped
          idle: 1013.688243
          message: Duplicate key exists in unique index "primary" in space "bands" with
            old tuple - [5, "Pink Floyd", 1965] and new tuple - [5, "incorrect data",
            0]
        name: instance002
        downstream:
          status: follow
          idle: 0.69694700000036
          vclock: {2: 2, 1: 13}
          lag: 0
    ...
    

    The diagram below illustrates how the upstream and downstream connections look:

    [Diagram: replication status after reseeding a replica]

  2. In the config.yaml file, clear the iproto option for instance001 by setting its value to {} to disconnect this instance from instance002. Set database.mode to ro:

    instance001:
      database:
        mode: ro
      iproto: {}
    
  3. Reload the configuration on instance001 only:

    master_master:instance001> require('config'):reload()
    ---
    ...
    
  4. Change database.mode values back to rw for both instances and restore iproto.listen for instance001. The database.instance_uuid option can be removed for instance002:

    instance001:
      database:
        mode: rw
      iproto:
        listen:
        - uri: '127.0.0.1:3301'
    instance002:
      database:
        mode: rw
      iproto:
        listen:
        - uri: '127.0.0.1:3302'
    
  5. Reload configurations on both instances one more time:

    • instance001:

      master_master:instance001> require('config'):reload()
      ---
      ...
      
    • instance002:

      master_master:instance002> require('config'):reload()
      ---
      ...
      
  6. Check box.info.replication. upstream.status should now be follow:

    master_master:instance001> box.info.replication
    ---
    - 1:
        id: 1
        uuid: c3bfd89f-5a1c-4556-aa9f-461377713a2a
        lsn: 13
        name: instance001
      2:
        id: 2
        uuid: dccf7485-8bff-47f6-bfc4-b311701e36ef
        lsn: 2
        upstream:
          status: follow
          idle: 0.86873800000012
          peer: replicator@127.0.0.1:3302
          lag: 0.0001060962677002
        name: instance002
        downstream:
          status: follow
          idle: 0.058662999999797
          vclock: {2: 2, 1: 13}
          lag: 0
    ...
    

The process of adding instances to a replica set and removing them is similar for all failover modes. Learn how to do this from the Master-replica: manual failover tutorial.

Before removing an instance from a replica set with replication.failover set to off, make sure this instance is in read-only mode.
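
For example, to prepare instance002 for removal, you would first switch it to read-only in config.yaml and reload the configuration, as shown earlier in this tutorial:

instance002:
  database:
    mode: ro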
