
Redis: Master-Slave Replication, Sentinel, and Sharded Clusters

Master-slave mode

The high reliability of Redis mainly covers two aspects:

  • Lose as little data as possible: the RDB & AOF persistence mechanisms

  • Keep service interruption as short as possible: add replica redundancy

The master-slave mode

Redis provides a master-slave mode that adds redundant copies of the data to improve the reliability of a Redis deployment. Reads and writes are separated between the master and its replicas: write requests can only be handled by the master, while read requests can be served by both the master and the replicas.

  • Read operations: master and replicas

  • Write operations: master only --> after the master executes a write, it synchronizes the write to the replicas

http://img1.sycdn.imooc.com/5facc5cd00018eec14660610.jpg
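
As a minimal illustration of this split with the redis-py client (a sketch only, assuming a master on localhost:6379 and a replica on localhost:6380; hosts and ports are illustrative): writes go only to the master, while reads may be served by the replica.

    import redis

    master  = redis.Redis(host="localhost", port=6379)
    replica = redis.Redis(host="localhost", port=6380)

    master.set("key1", "v1")      # write requests go to the master only
    print(replica.get("key1"))    # read requests can be served by the replica
                                  # (replication is asynchronous, so a read may briefly lag)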

Why can write requests only go to the master? What would happen if both the master and the replicas accepted reads and writes?

  • 1. Suppose three write requests all operate on key1, but each is sent to a different instance and sets the value to v1, v2, and v3 respectively. The value of key1 then differs across the instances, and a later read might return a stale value.

  • 2. To keep this data consistent across the three instances, the instances would have to negotiate locking and confirm whether each modification has completed, which costs a lot; the harm outweighs the benefit.

How the master and the replicas synchronize

Full synchronization: this happens when a replica connects for the first time and starts to synchronize, as shown in the figure below:

http://img4.sycdn.imooc.com/5facc5cf000126c721600938.jpg

  • The replica establishes a connection with the master and tells it that synchronization is about to start; once the master confirms, synchronization begins. First, the replica sends the psync command to the master to indicate that it wants to synchronize data, and the master starts replication based on the command's parameters. psync carries two parameters: the runID and the replication offset (a minimal sketch of this exchange follows the list).
    (1) runID: each Redis instance automatically generates a random ID at startup that uniquely identifies it. On the first replication the replica does not yet know the master's runID, so it sets runID to "?".
    (2) offset: set to -1 for the first replication.

  • The FULLRESYNC response indicates that the first replication is a full copy, that is, the master will copy all of its current data to the replica. In the second phase, the master synchronizes all data to the replica: it generates an RDB file from a memory snapshot and sends it, and the replica loads the file locally after receiving it. To keep the master and replica consistent, the master also uses a dedicated replication buffer to record every write command received after the RDB file was generated.

  • In the third phase, the master sends the write commands newly received while the second phase was running. Concretely, after the master has finished sending the RDB file, it sends the operations accumulated in the replication buffer to the replica, and the replica replays them. At this point the master and the replica are in sync.
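
Below is a minimal Python sketch of the decision this psync exchange boils down to (not the Redis source; the runID and offsets are illustrative):

    def psync_reply(master_run_id, master_offset, replica_run_id, replica_offset, in_backlog=False):
        # An unknown runID ("?"), a first-time offset of -1, or an offset the backlog
        # can no longer serve forces a full resync; otherwise the master continues
        # with a partial (incremental) resync.
        if replica_run_id != master_run_id or replica_offset < 0 or not in_backlog:
            return f"+FULLRESYNC {master_run_id} {master_offset}"   # then the RDB file is sent
        return "+CONTINUE"

    # First-time sync: the replica sends PSYNC ? -1
    print(psync_reply("53b9b28d...", 0, "?", -1))   # -> +FULLRESYNC 53b9b28d... 0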

Incremental synchronization:

Redis incremental replication is the process of synchronizing the writes that happen on the master to the replicas once the replicas are initialized and working normally. The process is mainly that for every write command the master executes, it sends the same write command to the replicas, and each replica receives and executes it.

Redis creates and maintains a ring-buffer replication backlog, repl_backlog_buffer. All the data used for partial (incremental) resynchronization of replicas comes from repl_backlog_buffer. A master has only one repl_backlog_buffer, shared by all of its replicas.
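
As a rough illustration of how such a ring buffer behaves, here is a minimal Python sketch; the 16-byte size and the method names are made up for the example, and the real repl_backlog_buffer lives inside the Redis server.

    class ReplBacklog:
        def __init__(self, size):
            self.size = size
            self.buf = bytearray(size)
            self.master_repl_offset = 0            # total bytes ever written

        def feed(self, data: bytes):
            for b in data:                          # wrap around when the buffer is full
                self.buf[self.master_repl_offset % self.size] = b
                self.master_repl_offset += 1

        def can_partial_resync(self, slave_repl_offset: int) -> bool:
            # A replica can catch up only if the bytes after its offset
            # have not yet been overwritten.
            oldest = max(0, self.master_repl_offset - self.size)
            return oldest <= slave_repl_offset <= self.master_repl_offset

    backlog = ReplBacklog(16)
    backlog.feed(b"SET k1 v1\n")
    print(backlog.can_partial_resync(4))    # True: offset 4 is still in the backlog
    backlog.feed(b"X" * 32)                 # the buffer wraps, old bytes are overwritten
    print(backlog.can_partial_resync(4))    # False: a full resync would be required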

Master - replica - replica: reducing the full-sync pressure on the master

In the basic master-slave mode, every replica connects to the master and every full replication is performed against the master. We can instead use a "master - replica - replica" topology to offload the pressure of generating and transmitting the RDB file from the master, distributing it to the replicas in a cascading way, as shown below:

http://img3.sycdn.imooc.com/5facc5d20001157b17300946.jpg
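
A minimal sketch of wiring up such a cascade with the redis-py client (hosts and ports are illustrative); the SLAVEOF/REPLICAOF command points each node at the instance it should replicate from.

    import redis

    replica     = redis.Redis(host="localhost", port=6380)
    sub_replica = redis.Redis(host="localhost", port=6381)

    replica.slaveof("localhost", 6379)       # full-syncs from the master
    sub_replica.slaveof("localhost", 6380)   # full-syncs from the replica instead of the master,
                                             # so the master only has to serve one RDB transfer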

What to do when the master-replica network is interrupted

As mentioned above, the master maintains a ring-buffer replication backlog, repl_backlog_buffer; partial (incremental) resynchronization of replicas is served entirely from this buffer, and all replicas share the master's single repl_backlog_buffer.

When the master and a replica are disconnected, the master writes every write command it receives into the replication buffer, and also writes these commands into the repl_backlog_buffer. The master records the position it has written up to, and each replica records the position it has read up to: the master's offset is master_repl_offset and the replica's is slave_repl_offset. Under normal circumstances the two offsets are roughly equal.

After the connection between the master and the replica is restored, the replica first sends the psync command to the master together with its current slave_repl_offset, and the master computes the gap between its own master_repl_offset and that slave_repl_offset. During the disconnection the master may have received new write commands, so master_repl_offset is generally larger than slave_repl_offset. The master then only needs to synchronize the command operations between slave_repl_offset and master_repl_offset to the replica.
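
Both offsets can be read from the INFO replication output. A minimal sketch with the redis-py client, assuming a master on localhost:6379 and a replica on localhost:6380 (both illustrative):

    import redis

    master  = redis.Redis(host="localhost", port=6379)
    replica = redis.Redis(host="localhost", port=6380)

    master_offset  = master.info("replication")["master_repl_offset"]
    replica_offset = replica.info("replication")["slave_repl_offset"]

    # After a reconnect, the commands in this gap are what the master replays
    # to the replica from repl_backlog_buffer.
    print("replication gap in bytes:", master_offset - replica_offset)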

Sentinel

A sentinel is actually a Redis process running in a special mode; while the master and replica instances run, it runs alongside them. A sentinel is mainly responsible for three tasks: monitoring, election (selecting a new master), and notification. A sentinel node is itself a special kind of Redis instance.

  • Monitoring: determine whether the master and replicas are offline

  • Election: select a new master

  • Notification: point the replicas at the new master, and tell clients to connect to the new master (a client-side sketch follows this list)
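
For the notification side, here is a minimal sketch using redis-py's Sentinel helper, assuming three sentinels on ports 26379-26381 monitoring a master group named "mymaster" (names and ports are illustrative). The client asks the sentinels who the current master is, so after a failover it is pointed at the new master automatically.

    from redis.sentinel import Sentinel

    sentinel = Sentinel([("localhost", 26379), ("localhost", 26380), ("localhost", 26381)],
                        socket_timeout=0.5)

    master  = sentinel.master_for("mymaster", socket_timeout=0.5)   # routed to the current master
    replica = sentinel.slave_for("mymaster", socket_timeout=0.5)    # routed to a known replica

    master.set("key1", "v1")
    print(replica.get("key1"))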

http://img3.sycdn.imooc.com/5facc5d3000112d313860746.jpg

The sentinels have connections to both the master and the replicas, and the sentinel nodes themselves also form a cluster.

http://img4.sycdn.imooc.com/5facc5d40001477417741032.jpg

Monitoring

A sentinel keeps connections to all of the Redis nodes and uses the PING command to check the network connection between itself and the master and replica instances, in order to judge each instance's state. An instance's state can be either "subjectively down" or "objectively down".

If a replica is judged to be down, the sentinel simply marks it as "subjectively down", because a replica going offline has limited impact and the cluster can keep serving requests.

If it is the master, the sentinel cannot simply mark it as "subjectively down" and start a failover, because the sentinel may have misjudged and the master may in fact be fine. Once a failover is started, the subsequent election and notification steps bring extra computation and communication overhead.

So when judging whether the master is down, a single sentinel does not have the final say. Only when most sentinel instances have judged the master to be "subjectively down" is the master marked "objectively down"; the name reflects that the master being offline has become an objective fact. The principle is that the minority yields to the majority. This is also what triggers the sentinels to start the master-replica failover process.
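
A minimal Python sketch of this quorum decision; the per-sentinel verdicts and the quorum value are made up for illustration.

    def is_objectively_down(sdown_votes, quorum):
        # The master is marked objectively down only when at least `quorum`
        # sentinels report it as subjectively down.
        return sum(sdown_votes) >= quorum

    votes = [True, True, False]    # 2 of 3 sentinels got no PING reply in time
    print(is_objectively_down(votes, quorum=2))    # True -> start the failover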

Election

When the master goes down, the sentinels select a new master from among the replicas. Three rules are applied when electing the new master.

  • First, the replica with the highest priority scores highest.

  • Second, the replica whose replication progress is closest to the old master scores highest.

  • Finally, the replica with the smaller run ID scores highest.

Users can set different priorities for different replicas with the slave-priority (replica-priority) configuration item; note that a smaller value means a higher priority, and a value of 0 means the replica will never be promoted. For example, if you have two replicas with different memory sizes, you can manually give the instance with more memory a higher priority. During the election, the sentinels give high-priority replicas high scores; if one replica has the highest priority, it becomes the new master. If several replicas share the same priority, the sentinels move on to the second rule.

This rule is based on the idea that the replica whose data is closest to the old master's will, as the new master, hold the most recent data. During master-replica synchronization there is a command propagation process: the master uses master_repl_offset to record the position of the latest write in repl_backlog_buffer, and each replica uses slave_repl_offset to record its current replication progress.

Each instance has a run ID, which here works like a replica number. Currently, Redis applies a default rule when selecting the master: with equal priority and equal replication progress, the replica with the smallest ID scores highest and is selected as the new master.
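
Putting the three rules together, here is a minimal Python sketch of the ranking; the replica records and field names are made up for illustration, and for the replica-priority setting a smaller value means a higher priority while 0 means the replica is never promoted.

    replicas = [
        {"run_id": "c3d9...", "priority": 100, "repl_offset": 17000},
        {"run_id": "a1f2...", "priority": 100, "repl_offset": 17500},
        {"run_id": "b7e0...", "priority": 10,  "repl_offset": 16800},
    ]

    def promotion_key(r):
        # Rule 1: lower priority value wins; rule 2: larger replication offset wins;
        # rule 3: lexicographically smaller run ID wins.
        return (r["priority"], -r["repl_offset"], r["run_id"])

    candidates = [r for r in replicas if r["priority"] != 0]   # priority 0 is never promoted
    new_master = min(candidates, key=promotion_key)
    print(new_master["run_id"])    # -> "b7e0...": the highest-priority replica wins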

Sharded cluster

As the data volume grows larger and larger, for example when the amount of data in a Redis deployment grows from 5G to 25G, we have to consider scaling the cluster. There are generally two ways to scale:

  • Vertical scaling: upgrade the resource configuration of a single Redis instance, including adding memory capacity, adding disk capacity, or using a more powerful CPU, e.g. expanding the disk to 50G.
    This is simple and direct, but it is constrained by hardware and cost; and when RDB is used for persistence, a larger data set needs more memory, and the fork of the child process performed by the main thread may block it for longer.

  • Horizontal scaling: increase the number of Redis instances. As in the figure below, instead of one instance with *GB of memory and 10GB of disk, we now use three instances with the same configuration.

http://img1.sycdn.imooc.com/5facc5d50001373519080960.jpg

How data shards map to instances

In a sharded cluster, data needs to be distributed across different instances. So how do data and instances correspond? That is what the Redis Cluster scheme addresses. Before that, we need to be clear about the relationship and the difference between sharded clusters in general and Redis Cluster.

The Redis Cluster scheme uses hash slots (Hash Slot, hereafter just "slot") to handle the mapping between data and instances. In the Redis Cluster scheme, a sharded cluster has 16384 hash slots. These hash slots act like data partitions: each key-value pair is mapped to a hash slot according to its key.

  • First, compute a 16-bit value from the key of the key-value pair with the CRC16 algorithm;

  • Then, take this 16-bit value modulo 16384 to get a number in the range 0~16383; each number identifies a correspondingly numbered hash slot (a sketch of this calculation follows these two steps).
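
A minimal Python sketch of this two-step calculation. The CRC16 variant below is the XMODEM polynomial used by Redis Cluster; the "{hash tag}" rule that real clients also apply is omitted here.

    def crc16(data: bytes) -> int:
        crc = 0
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
                crc &= 0xFFFF
        return crc

    def key_slot(key: str) -> int:
        return crc16(key.encode()) % 16384

    assert crc16(b"123456789") == 0x31C3   # standard check value for this CRC variant
    print(key_slot("key1"))                # a slot number in the range 0~16383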

Taking just five slots as an example, the mapping among data, hash slots, and instances is shown in the figure below:

http://img2.sycdn.imooc.com/5facc5d6000151e717460714.jpg

How the client locates the data

After a client establishes a connection with a cluster instance, the instance sends the hash-slot allocation information to the client. However, when the cluster has just been created, each instance only knows which hash slots it has been assigned and does not know the hash slots owned by the other instances. So why can the client obtain all of the hash-slot information from any instance? Because each Redis instance sends its own hash-slot information to the other instances it is connected to, spreading the slot allocation information around; once the instances are interconnected, every instance holds the mapping of all hash slots. After the client receives the hash-slot information, it caches it locally. When the client requests a key-value pair, it first computes the hash slot of the key and can then send the request to the corresponding instance.
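
A minimal sketch of that client-side cache: a locally held slot-to-node map used to pick the target instance for each key. The map contents and addresses are illustrative; real clients build the map from the CLUSTER SLOTS / CLUSTER SHARDS commands and hash keys with the CRC16 routine shown earlier.

    SLOTS = 16384
    slot_to_node = {s: ("127.0.0.1", 7000 + s * 3 // SLOTS) for s in range(SLOTS)}

    def route(key, key_slot):
        # key_slot is a slot function such as the CRC16-based one sketched earlier.
        slot = key_slot(key)
        return slot, slot_to_node[slot]

    # Usage (with a stand-in slot function; swap in the CRC16-based key_slot):
    print(route("key1", key_slot=lambda k: sum(k.encode()) % SLOTS))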

The mapping between instances and hash slots is not fixed; the two most common changes are:

  • In the cluster, instances are added or removed, and Redis needs to reassign the hash slots

  • For load balancing, Redis needs to redistribute the hash slots across all instances

The Redis Cluster scheme provides a redirection mechanism. "Redirection" means that when the client sends a read or write operation to an instance and that instance does not hold the corresponding data, the client sends the operation command again to a new instance.

http://img1.sycdn.imooc.com/5facc5d7000147d527501162.jpg

Note that in the diagram above, by the time the client sends the command to instance 2, all of the data in Slot 2 has already been migrated to instance 3. In practice, if Slot 2 holds a lot of data, it can happen that the client sends a request to instance 2 while only part of the data in Slot 2 has been migrated to instance 3 and some has not. In this partially migrated case, the client receives an ASK error message.

This ASK reply means that hash slot 13320, which holds the key-value pair the client requested, is on instance 3, but the slot is still being migrated. The client then needs to send an ASKING command to instance 3; this command tells that instance to allow the execution of the next command sent by the client. The client then sends the GET command to the instance to read the data.

http://img1.sycdn.imooc.com/5facc5d90001bab427501162.jpg

In the illustration below, Slot 2 is being migrated from instance 2 to instance 3: key1 and key2 have already been moved over, while key3 and key4 are still on instance 2. After the client requests key2 from instance 2, it receives an ASK reply from instance 2. The ASK reply carries two pieces of information: first, the data of the slot is still being migrated; second, it returns to the client the latest instance address holding the requested data. The client then needs to send ASKING to instance 3, followed by the actual operation command. Unlike a MOVED reply, the ASK reply does not update the client's cached hash-slot allocation information.
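
A minimal Python sketch of how a client could react to the two kinds of redirection, assuming error strings in the textual form "MOVED <slot> <ip>:<port>" and "ASK <slot> <ip>:<port>"; the send() helper, the key, and the addresses are illustrative.

    def handle_redirect(error, command, slot_to_node, send):
        kind, slot, addr = error.split()
        host, port = addr.split(":")
        node = (host, int(port))
        if kind == "MOVED":
            slot_to_node[int(slot)] = node   # slot fully moved: update the local cache
        elif kind == "ASK":
            send(node, "ASKING")             # slot still migrating: do NOT update the cache,
                                             # just ask the target node to accept one command
        return send(node, *command)          # retry the command on the new node

    # Usage with a stub sender that only records what would be sent:
    log = []
    stub_send = lambda node, *cmd: log.append((node, cmd))
    handle_redirect("ASK 13320 172.16.19.5:6380", ("GET", "key2"), {}, stub_send)
    print(log)   # [(node, ("ASKING",)), (node, ("GET", "key2"))]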


Author: Buy an orange

 
