2021-01-31: In what situations does a Redis Cluster deployment become unavailable?

Fogo's answer 2021-01-31:
The answer comes from this link:
The officially recommended minimum setup for cluster mode is 6 nodes: 3 Masters and 3 Slaves.

Key slotting and forwarding mechanism
Redis divides the key space into 16384 slots and determines the slot of each key with the following algorithm:
CRC16(key) mod 16384
Because 16384 = 2^14, and taking a remainder modulo 2^n is equivalent to a bitwise AND with 2^n - 1, the computation can be optimized to:
CRC16(key) & 16383
When a key contains a hash tag (for example key{sub}1), only the string inside the braces (here, sub) is used to compute the slot, so key{sub}1 and key{sub}2 land in the same slot.
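The slot computation described above can be sketched in a few lines of Python. This is a minimal illustration, not Redis's own code: the names hash_tag and key_slot are made up for this example, and it assumes the CRC16 in question is the XMODEM variant, which Python's binascii.crc_hqx computes when given an initial value of 0.

```python
from binascii import crc_hqx

SLOT_COUNT = 16384  # 2 ** 14


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is actually hashed (hypothetical helper)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" or a missing "}" falls back to the whole key.
        if end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    """Map a key to a slot in the range 0..16383."""
    # Bitwise AND with 16383 gives the same result as "mod 16384".
    return crc_hqx(hash_tag(key), 0) & (SLOT_COUNT - 1)


if __name__ == "__main__":
    for name in (b"key{sub}1", b"key{sub}2", b"sub"):
        print(name, key_slot(name))
```

All three keys in the usage loop hash only the bytes sub, so they print the same slot number.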
A client can send a command for any slot to any cluster instance. If the slot belongs to the instance that receives the request, that instance handles it; otherwise it tells the client which node owns the slot. For example, if the following command is sent to the second Master:
GET key1
the reply is: MOVED slot ip:port (the address of the first Master)
By default, all read and write commands can only be sent to a Master. To have a Slave serve read requests, the client must first run the READONLY command on that connection.
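On the client side, handling a MOVED reply amounts to parsing the slot and address and remembering the new owner. The sketch below shows only that bookkeeping, with no network code; parse_moved, handle_reply and slot_owner are hypothetical names, and the address in the usage example is invented.

```python
from typing import Dict, Optional, Tuple

# Client-side cache: slot number -> (host, port) of the node believed to own it.
slot_owner: Dict[int, Tuple[str, int]] = {}


def parse_moved(reply: str) -> Optional[Tuple[int, str, int]]:
    """Parse 'MOVED <slot> <ip>:<port>'; return None for any other reply."""
    parts = reply.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


def handle_reply(reply: str) -> Optional[Tuple[str, int]]:
    """Update the slot cache on MOVED and return the node to retry against."""
    moved = parse_moved(reply)
    if moved is None:
        return None  # Not a redirect; the caller uses the reply as-is.
    slot, host, port = moved
    slot_owner[slot] = (host, port)
    return host, port


if __name__ == "__main__":
    print(handle_reply("MOVED 3999 127.0.0.1:6381"))
```

Caching the owner means later commands for the same slot can go straight to the right Master instead of being redirected again.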

Automatic Master-Slave failover mechanism
When a Master fails and it has a Slave, the Slave is promoted to Master.
How is a Master determined to have failed? The Redis Cluster configuration has a setting, cluster-node-timeout, which is the cluster heartbeat timeout. Once the nodes in the cluster are connected, the periodic task clusterCron (see the source code: https://github.com/redis/redis/blob/6.0/src/cluster.c) randomly selects a node every second and sends it a heartbeat. If no heartbeat reply is received within the timeout (cluster-node-timeout), the node is marked pfail. If more than half of the cluster's Masters mark a node as pfail, that node's state becomes fail.
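The two-stage rule (a local pfail after a timeout, then fail once more than half of the Masters agree) can be written as two small predicates. This is a simplified model of the description above, not a transcription of clusterCron; the function names, parameters and node names are assumptions made for the example.

```python
from typing import Set


def is_pfail(last_pong: float, now: float, node_timeout: float) -> bool:
    """One node's local view: no heartbeat reply within cluster-node-timeout."""
    return now - last_pong > node_timeout


def is_fail(pfail_reporters: Set[str], masters: Set[str]) -> bool:
    """Cluster view: more than half of all Masters report the node as pfail."""
    votes = len(pfail_reporters & masters)  # reports from Slaves do not count
    return 2 * votes > len(masters)


if __name__ == "__main__":
    masters = {"m1", "m2", "m3"}
    # Two of three Masters agree, so the node is promoted from pfail to fail.
    print(is_fail({"m1", "m2"}, masters))
```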
Once a node becomes fail, automatic Master-Slave failover is triggered. The failover process also involves a similar election:
1. After a Master is marked fail, its Slaves run the periodic task clusterCron, and the Slave with the largest replication offset, that is, the one furthest along in Master-Slave synchronization and holding the most recent data, tries to become the new Master.
2. That Slave sets its own currentEpoch += 1 (normally every node has the same currentEpoch; each election increments it by 1, and each Master can cast only one vote per currentEpoch, which prevents multiple Slaves that start an election at the same time from splitting the vote so that none reaches a majority). It then sends a failover request to all Masters, and if it receives votes from a majority of Masters, it performs the Master-Slave switch.
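The one-vote-per-epoch rule in the steps above can be modelled roughly as follows. This is a toy sketch under stated assumptions: the Master class and run_election are illustrative names, and the majority is assumed to be counted over all Masters, including the failed one that cannot vote.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Master:
    name: str
    last_vote_epoch: int = 0

    def request_vote(self, epoch: int) -> bool:
        """Grant at most one vote per epoch."""
        if epoch <= self.last_vote_epoch:
            return False
        self.last_vote_epoch = epoch
        return True


def run_election(epoch: int, reachable: List[Master], total_masters: int) -> bool:
    """A Slave wins if a majority of all Masters vote for it in this epoch."""
    votes = sum(m.request_vote(epoch) for m in reachable)
    return 2 * votes > total_masters


if __name__ == "__main__":
    # Three Masters in total; one has failed, so two can still vote.
    reachable = [Master("m2"), Master("m3")]
    print(run_election(1, reachable, total_masters=3))  # first Slave, epoch 1
    print(run_election(1, reachable, total_masters=3))  # second Slave, same epoch
```

In this model the second Slave gets no votes in epoch 1 because both reachable Masters have already voted; it would have to retry with a higher epoch.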

Cluster unavailability
Based on the description above, the cluster becomes unavailable in the following situations:
1. When both the Master and the Slave that serve the slot being accessed are down, the data in that slot cannot be retrieved.
2. When the cluster has fewer than 3 Master nodes, or when the number of available nodes in the cluster is even, the automatic Master-Slave failover, which relies on this fail-based election mechanism, may not work properly. Both steps can go wrong: marking the node as fail, and electing the new Master.
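The arithmetic behind situation 2 can be checked with a small calculation. Assuming that both the fail decision and the election need more than half of all Masters, the hypothetical helper below computes how many Master failures a cluster of a given size can survive.

```python
def has_majority(total_masters: int, reachable_masters: int) -> bool:
    """True if the reachable Masters are more than half of all Masters."""
    return 2 * reachable_masters > total_masters


def max_tolerated_failures(total_masters: int) -> int:
    """Largest number of failed Masters that still leaves a majority."""
    return (total_masters - 1) // 2


if __name__ == "__main__":
    for total in (2, 3, 4, 5):
        print(total, max_tolerated_failures(total))
```

With 2 Masters no failure can be tolerated, because the single survivor is not a majority. 4 Masters tolerate 1 failure, the same as 3, and a 2/2 network split leaves neither side with a majority.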


Copyright notice
This article was written by [A daily question for the architect of Fuda]. When reposting, please include a link to the original. Thank you.
https://cdmana.com/2021/01/20210131220732096c.html