Question

What is Redis Cluster Synchronization Between Index Master and Slave?

Answer and Explanation

In a Redis Cluster, synchronization between a master and its slave (called a replica since Redis 5) is crucial for maintaining data consistency and ensuring high availability. The goal is for the slave to hold a near-current copy of the master's data so it can take over if the master fails. Replication is asynchronous, so a slave may lag slightly behind its master. Here's a breakdown of how Redis achieves this:

1. Full Synchronization (BGSAVE and RDB):

- When a slave connects to a master for the first time, or when the replication process needs to be restarted, a full synchronization occurs. This process involves the following steps:

- The master initiates a background save (BGSAVE) to create an RDB (Redis DataBase) file on disk. This operation allows the master to continue serving requests without significant interruption.

- Once the RDB file is created, the master sends this file to the slave.

- The slave loads the RDB file into its memory, effectively replacing its existing data with the data from the master.

- After the slave loads the RDB file, the master sends the write commands it buffered while the snapshot was being created and transferred. The slave applies them, which brings it up to date.
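The full synchronization steps can be sketched with a small in-memory model. This is only an illustration, not how Redis is implemented: the Master and Replica classes and their methods are hypothetical, the "snapshot" is a dictionary copy rather than an RDB file, and the offset counts commands where Redis counts bytes.

```python
import copy


class Master:
    """Toy master: a dict plus a buffer for writes that arrive mid-snapshot."""

    def __init__(self):
        self.data = {}
        self.offset = 0       # simplified: counts write commands, not bytes
        self.pending = None   # writes accepted while a snapshot is in flight

    def set(self, key, value):
        self.data[key] = value
        self.offset += 1
        if self.pending is not None:
            self.pending.append((key, value))

    def start_full_sync(self):
        # Stand-in for BGSAVE: take a point-in-time copy and start buffering.
        self.pending = []
        return copy.deepcopy(self.data), self.offset

    def finish_full_sync(self):
        # Hand over the writes that were buffered during the transfer.
        buffered, self.pending = self.pending, None
        return buffered


class Replica:
    def __init__(self):
        self.data = {"stale": "x"}   # old contents, discarded on full sync
        self.offset = 0

    def load_snapshot(self, snapshot, offset):
        self.data = dict(snapshot)   # replaces the existing dataset
        self.offset = offset

    def apply(self, commands):
        for key, value in commands:
            self.data[key] = value
            self.offset += 1


master = Master()
master.set("a", 1)
snapshot, snapshot_offset = master.start_full_sync()
master.set("b", 2)                   # write arriving during the transfer

replica = Replica()
replica.load_snapshot(snapshot, snapshot_offset)
replica.apply(master.finish_full_sync())
```

After the last line the replica holds the same keys and the same offset as the master, and its stale key is gone.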

2. Partial Synchronization (Replication Stream):

- After the initial full synchronization, the master streams every subsequent write command to the slave as it is executed. Partial synchronization applies when that stream is interrupted: it transfers only the changes the slave missed, instead of the entire dataset.

- The master keeps the most recent part of the replication stream in a replication backlog. This is a fixed-size circular buffer in the master's memory, so the oldest data is overwritten as new commands arrive.

- The master maintains a replication offset, which counts the bytes of replication stream it has produced. Both the master and the slave keep track of their respective replication offsets.

- When a slave disconnects and reconnects, it sends a PSYNC command carrying the replication ID of its previous master and its last known offset. The master checks that the ID matches and that the requested data is still available in its backlog.

- If the requested data is available, the master sends only the missing commands (the difference between the master's current offset and the slave's last known offset) to the slave. This is much faster than a full synchronization.

- If the requested data is no longer available (e.g., because the slave was disconnected for too long and the replication stream has wrapped around), a full synchronization is required.
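The decision between a partial and a full synchronization can be sketched as follows. This is a simplified model under stated assumptions: the ReplicationBacklog class and its method names are hypothetical, the buffer is trimmed rather than truly circular, and the replication ID check that real Redis also performs is left out.

```python
class ReplicationBacklog:
    """Toy backlog that keeps only the newest `size` bytes of the stream."""

    def __init__(self, size):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0   # total bytes ever written to the stream

    def feed(self, payload):
        self.buf += payload
        self.master_offset += len(payload)
        overflow = len(self.buf) - self.size
        if overflow > 0:
            del self.buf[:overflow]   # oldest bytes fall out of the backlog

    @property
    def first_offset(self):
        # Offset of the oldest byte still held in the backlog.
        return self.master_offset - len(self.buf)

    def psync(self, replica_offset):
        """Return ("CONTINUE", missing_bytes) or ("FULLRESYNC", None)."""
        if self.first_offset <= replica_offset <= self.master_offset:
            start = replica_offset - self.first_offset
            return "CONTINUE", bytes(self.buf[start:])
        return "FULLRESYNC", None


backlog = ReplicationBacklog(size=16)
backlog.feed(b"SET a 1;")          # master offset is now 8
backlog.feed(b"SET b 2;")          # master offset is now 16
short_gap = backlog.psync(8)       # replica missed only the second command
backlog.feed(b"SET c 3;")          # first 8 bytes are pushed out
long_gap = backlog.psync(0)        # requested data is no longer available
```

Here short_gap is a partial synchronization carrying just the missing bytes, while long_gap falls back to a full synchronization because offset 0 has already left the backlog.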

3. Heartbeat and Keep-Alive:

- The master sends PING commands to its slaves at a fixed interval (repl-ping-replica-period, 10 seconds by default), and each slave reports its current replication offset back to the master with REPLCONF ACK about once per second. Together these let both sides detect a dead connection.

- If the master receives no acknowledgment from a slave within the configured timeout period (repl-timeout), it considers the slave disconnected. A slave applies the same timeout when it stops hearing from the master.
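The timeout check on the master side can be sketched as a comparison between the current time and the time each slave was last heard from. The function name, the slave names, and the timestamps are hypothetical. The 60-second value mirrors the default of repl-timeout.

```python
REPL_TIMEOUT = 60  # seconds; same value as the repl-timeout default


def timed_out_replicas(last_heard, now, timeout=REPL_TIMEOUT):
    """Return the names of replicas silent for longer than `timeout` seconds."""
    return sorted(
        name for name, seen_at in last_heard.items() if now - seen_at > timeout
    )


# Hypothetical timestamps (in seconds) of the last message from each replica.
last_heard = {"replica-1": 995.0, "replica-2": 930.0}
stale = timed_out_replicas(last_heard, now=1000.0)
```

With these numbers replica-1 was heard 5 seconds ago and stays connected, while replica-2 has been silent for 70 seconds and would be considered disconnected.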

4. Configuration:

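- The replication behavior can be configured through the redis.conf file, and most replication parameters can also be changed at runtime with the CONFIG SET command. Key parameters include:

- replicaof <masterip> <masterport> (named slaveof before Redis 5): Configures a standalone instance as a slave of the specified master. In cluster mode this directive is not allowed; a node is attached to a master with the CLUSTER REPLICATE <node-id> command instead.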

- masterauth <master-password>: If the master requires authentication, this option specifies the password.

- repl-diskless-sync: Enables diskless replication, where the master streams the RDB snapshot directly to the slave over the network, without first writing it to disk.

- repl-backlog-size: Specifies the size of the replication backlog buffer on the master.
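As a rough illustration, a redis.conf fragment for a cluster node using these parameters might look like the following. All values are placeholders chosen for the example, not recommendations, and the master assignment is left out on the assumption that it is done with CLUSTER REPLICATE.

```conf
# Illustrative values only; tune them for the actual workload.
cluster-enabled yes

# Password the node presents when replicating from a protected master.
masterauth change-me

# Stream the snapshot over the socket instead of writing it to disk first.
repl-diskless-sync yes

# A larger backlog tolerates longer disconnections before a full resync.
repl-backlog-size 64mb

# Seconds of silence after which the replication link counts as broken.
repl-timeout 60
```

The backlog size is the setting that most directly decides whether a reconnecting slave can resynchronize partially, since it bounds how much of the replication stream the master retains.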

5. Automatic Failover:

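- Redis Cluster handles failover itself; Sentinel is a separate mechanism for non-clustered deployments and is not used here. If a master fails, one of its slaves is promoted to become the new master. This process involves:

- Cluster nodes detecting the failure through their heartbeat (gossip) messages. A master is marked as failed once a majority of masters report it unreachable for longer than cluster-node-timeout.

- The failed master's slaves competing for election, with the slaves holding the most complete data (highest replication offset) starting their election first.

- The slave that collects votes from a majority of masters promoting itself and taking over the failed master's hash slots.

- The remaining slaves of the failed master reconfiguring themselves to replicate from the new master.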
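The preference for the slave with the most complete data can be sketched as a ranking by replication offset. This is a simplified sketch based on the election delay formula given in the Redis Cluster specification (500 ms, plus a random 0 to 500 ms, plus 1000 ms per rank). The function names and the offsets are hypothetical, and the jitter is passed in explicitly so the example stays deterministic.

```python
def replica_ranks(offsets):
    """Rank 0 is the most up-to-date replica; equal offsets share a rank."""
    return {
        name: sum(1 for other in offsets.values() if other > offset)
        for name, offset in offsets.items()
    }


def election_delay_ms(rank, jitter_ms):
    """Delay before a replica asks for votes; lower rank means earlier start."""
    return 500 + jitter_ms + rank * 1000


# Hypothetical replication offsets of three replicas of a failed master.
offsets = {"r1": 1500, "r2": 1800, "r3": 1500}
ranks = replica_ranks(offsets)
best = min(ranks, key=ranks.get)
```

In this example r2 has the highest offset, receives rank 0, and therefore starts its election about one second before r1 and r3.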

In summary, Redis Cluster synchronization between master and slave combines full synchronization through RDB snapshots with partial synchronization through the replication backlog, which keeps slaves consistent with their master and makes automatic failover possible. Because replication is asynchronous, writes acknowledged just before a failure can still be lost, so this design provides high availability rather than strict durability. Replication behavior is configured in the redis.conf file.
