
Redis High Availability: You Call This a Sentinel Cluster?

Overview

We know that 「master-slave replication is the cornerstone of high availability」. When a slave goes down, requests can still be sent to the master or to the other slaves. When the master goes down, however, the system can only serve reads; write requests can no longer be executed.

So the master-slave replication architecture faces a serious problem: once the master is down, 「write operations」 cannot be executed, and no slave is automatically promoted to master. In other words, there is no automatic failover.

Imagine it is late at night and you are with your girlfriend ... (10,000 words omitted here) when the master suddenly goes down. You can hardly jump out of bed, pull up your pants, switch master and slave by hand, and then tell the other programmers to point their services at the new master.

After a night like that, I would be switched from boyfriend to ex-boyfriend. That will not do. We need a highly available solution, and Redis officially provides one: Sentinel.

How the Redis Sentinel cluster works

Opening remarks

“Technology iterates very fast, but the thinking distilled from technology benefits you for life. So don't worry about a midlife crisis; the people who worry about it are usually the ones who find it hard to keep growing. As long as we keep growing and our understanding keeps breaking through, there is no need to worry about a midlife crisis. The world always needs that kind of talent.”

What is Sentinel

“65 Brother: Brother Mage, I don't have a girlfriend yet, but I want to be prepared for a rainy day and master sentinel mode, so that nothing interrupts us in the middle of the night. Tell me how Sentinel works.”

The setup here uses three sentinels forming a cluster and three data nodes (one master and two slaves), as shown in the figure below:

Redis Sentinel cluster

Building a sentinel cluster is not demonstrated again here; readers who need it can follow the 「Read the original」 link.

65 Brother, have you heard of Zhang Sanfeng, founder of the 「Wudang sect」? A Redis master-slave architecture is like Wudang, and the master is its leader. If the leader dies, a capable successor must be chosen from the Seven Heroes of Wudang. That takes a department that watches over the life and death of the leader and the health of the other disciples, holds a vote to pick a capable disciple as the new leader, and then calls a press conference to announce the new leader to the world. That 「department」 is Sentinel.

Sentinel faces the following problems when electing a new leader:

  1. How to judge whether the leader is really dead; he may only be feigning death.
  2. Which Wudang disciple to choose as the new leader.
  3. How to announce the new leader, via the press conference, to all Wudang disciples (the slaves and the old master) and to the whole Wulin, the martial arts world (the clients).

The main tasks of the sentinel department are: monitor all of Wudang, choose a new leader, and notify all of Wudang and the whole Wulin.

The main tasks of the sentinel mechanism

Sentinel is a running mode of Redis. It focuses on monitoring the running state of Redis instances (master and slave nodes). When the master fails, it selects a new master and performs the master-slave switch through a series of mechanisms, achieving failover and keeping the whole Redis system available. According to the official Redis documentation (https://redis.io/topics/sentinel), Sentinel has the following capabilities:

  • Monitoring: continuously check whether the master and slaves are working as expected.
  • Automatic master switch: when the master fails, Sentinel starts the automatic recovery process and chooses one of the slaves as the new master.
  • Notification: make the slaves execute replicaof to sync with the new master, and tell clients to connect to the new master.

A sentinel is itself a Redis process; it just does not serve reads and writes to the outside. The number of sentinels is usually odd. Why? Let 「Code byte」 explain step by step.

“65 Brother: So how does this mysterious 「sentinel」 department achieve these three capabilities?”

Let's look at Sentinel as a whole first to get a rough idea of the overall flow, and then analyze each task in detail. We start with monitoring.

Monitoring

Sentinel is just a special department staffed by Wudang disciples. By default, each sentinel sends a PING command by carrier pigeon once per second to the leader, to all Wudang disciples, and to the other sentinels (that is, the master, the slaves, and the other sentinels). If a slave does not answer the 「sentinel」's PING within the specified time, the 「sentinel」 assumes this fellow may have kicked the bucket and records him as 「offline」.

If the master does not answer the 「sentinel」's PING within the specified time, the sentinel judges that the leader is offline and begins the 「automatic master switch」 process.

A reply to PING falls into one of two categories:

  1. Valid reply: any one of +PONG, -LOADING, or -MASTERDOWN.
  2. Invalid reply: any reply other than the valid ones, or no reply at all within the specified time.
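The two categories above can be sketched as a small classifier. This is only an illustration: the function name is hypothetical, and `None` is an assumption used here to model "no reply arrived in time".

```python
from typing import Optional

# Replies that count as proof that the instance is alive (from the list above).
VALID_PING_REPLIES = ("+PONG", "-LOADING", "-MASTERDOWN")


def is_valid_ping_reply(reply: Optional[str]) -> bool:
    """Return True if `reply` counts as a valid answer to PING.

    `None` models the case where no reply arrived within the allowed time.
    """
    if reply is None:
        return False
    # Error replies carry extra text after the code, so match on the prefix.
    return reply.startswith(VALID_PING_REPLIES)


print(is_valid_ping_reply("+PONG"))
print(is_valid_ping_reply("-LOADING Redis is loading the dataset in memory"))
print(is_valid_ping_reply("-ERR unknown command"))
print(is_valid_ping_reply(None))
```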

“65 Brother: How does a sentinel judge that the 「master」 has kicked the bucket? And what if the leader comes back from the dead?”

To guard against the leader 「feigning death」, 「Sentinel」 defines two signals: 「subjective offline」 and 「objective offline」.

Subjective offline

A sentinel uses the PING command to check whether the leader and the slaves are alive. If it gets an invalid reply, the sentinel marks that node as 「subjectively offline」. If the node is an ordinary Wudang disciple, that is, a slave, it is simply marked 「subjectively offline」 and that is the end of it.

Because the master is still there, a dead slave has little impact on Wudang. Meetings still go on, and so do the sword practice and the eating and drinking.

If the node is the master, however, the sentinel cannot simply mark it 「subjectively offline」 and start electing a new leader.

The judgment may be wrong: the leader may not be dead at all. Once a leader switch starts, the election, the press conference, and the slaves' data synchronization with the new master all consume a lot of resources.

So 「Sentinel」 has to reduce the probability of misjudgment. Misjudgment usually happens when the cluster network is under heavy load or congested, or when the master itself is under heavy load.

Since a single person easily misjudges, let several people vote. Sentinel works the same way: it is deployed as a cluster of multiple instances, the sentinel cluster. With several sentinel instances judging together, a single sentinel with a poor network connection can no longer wrongly conclude that the master is offline.

Moreover, the probability that the networks of several sentinels are all unstable at the same time is small, so deciding together lowers the misjudgment rate.

Objective offline

Whether the master is down cannot be decided by a single 「sentinel」. Only when enough sentinels (typically more than half) have judged the master 「subjectively offline」 can the master be marked 「objectively offline」. At that point it is an objective fact: the leader is truly dead, and not even Hua Tuo reborn could save him.

Only when the master is judged 「objectively offline」 does Sentinel go on to start the master-slave switch.

Objective offline

The difference between subjective offline and objective offline

Simply put, subjective offline means a single sentinel believes the node is down. Objective offline means that not only this sentinel believes the node is down, but after it has communicated with the other sentinels, a certain number of them agree that the node is down.

That 「certain number」 is the quorum, which is set in the sentinel monitor configuration. The configuration looks like this:

# sentinel monitor <master-name> <master-host> <master-port> <quorum>
#  Examples are as follows :
sentinel monitor mymaster 127.0.0.1 6379 2

This configuration item tells the sentinel which master node to monitor:

  • sentinel monitor: the directive that declares a master to monitor.
  • mymaster: the name of the master node; you can choose it freely.
  • 127.0.0.1: the IP of the monitored master node; 6379 is its port.
  • 2: the quorum. Only when two or more sentinels consider the master unavailable is the master set to the objective offline state, after which the failover proceeds.
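To make the four fields concrete, here is a minimal sketch that splits a sentinel monitor line into named parts. The `MonitorDirective` type and `parse_monitor_directive` function are hypothetical helpers for illustration, not part of Redis.

```python
from typing import NamedTuple


class MonitorDirective(NamedTuple):
    master_name: str
    host: str
    port: int
    quorum: int


def parse_monitor_directive(line: str) -> MonitorDirective:
    """Parse 'sentinel monitor <master-name> <host> <port> <quorum>'."""
    parts = line.split()
    if len(parts) != 6 or [p.lower() for p in parts[:2]] != ["sentinel", "monitor"]:
        raise ValueError(f"not a 'sentinel monitor' directive: {line!r}")
    _, _, name, host, port, quorum = parts
    return MonitorDirective(name, host, int(port), int(quorum))


directive = parse_monitor_directive("sentinel monitor mymaster 127.0.0.1 6379 2")
print(directive.master_name, directive.host, directive.port, directive.quorum)
```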

A common standard for 「objective offline」 is a majority: with N sentinel instances, N/2 + 1 of them must judge the master 「subjectively offline」 before the master is finally declared 「objectively offline」. This is achieved by setting quorum to N/2 + 1; the threshold that actually applies is whatever quorum is configured.
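The objective offline decision boils down to counting "down" opinions and comparing them with the quorum. The sketch below assumes a sentinel already holds its peers' answers in a dictionary; the function names and the dictionary shape are illustrative only.

```python
from typing import Dict


def count_down_votes(own_view_down: bool, peer_replies: Dict[str, bool]) -> int:
    """Count the sentinels (including this one) that see the master as down."""
    return int(own_view_down) + sum(1 for is_down in peer_replies.values() if is_down)


def is_objectively_down(down_votes: int, quorum: int) -> bool:
    """The master is objectively offline once the votes reach the quorum."""
    return down_votes >= quorum


# Three sentinels, quorum 2: this sentinel and sentinel-2 see the master as down.
replies = {"sentinel-2": True, "sentinel-3": False}
votes = count_down_votes(True, replies)
print(votes, is_objectively_down(votes, quorum=2))
```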

Automatic master switch

“65 Brother: Now that the master has been judged offline, it's time to choose a new leader.”

The 「sentinel」's second task is to select a new master. A new leader has to be chosen from the Wudang disciples according to certain rules, and once chosen, the new master leads all the disciples in eating and drinking together.

The new master is chosen with a 「filter」 + 「score」 strategy that picks the 「strongest king」 as leader: an audition first filters out the 「incompetent」, then all candidates who passed the audition are scored and ranked, and the highest-ranked one becomes the new master.

As shown in the figure :

New master selection

A candidate whose network connection keeps dropping is a bad choice, however good he looks. Even if he became master, his network would soon fail and a new master would have to be chosen all over again. That is no joke, so such candidates must be ruled out!

Filter

“65 Brother: What are the filtering criteria?”

  • The slave's current online state: slaves that are offline are discarded directly.
  • The slave's past network connection state, judged against down-after-milliseconds * 10: if a slave keeps losing its connection to the master and the number of disconnections exceeds the threshold (10 times), we have reason to believe its network is not good, and it is filtered out.
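The two filter rules can be sketched as follows. This follows the article's description literally (a disconnection counter with a threshold of 10); the `Replica` fields and the `MAX_DISCONNECTS` constant are hypothetical names, and Sentinel's internal bookkeeping may differ.

```python
from dataclasses import dataclass
from typing import List

MAX_DISCONNECTS = 10  # threshold from the rule above (assumed constant name)


@dataclass
class Replica:
    name: str
    online: bool
    disconnect_count: int  # hypothetical counter of lost links to the master


def filter_candidates(replicas: List[Replica]) -> List[Replica]:
    """Drop replicas that are offline or whose link to the master is flaky."""
    return [
        r for r in replicas
        if r.online and r.disconnect_count <= MAX_DISCONNECTS
    ]


replicas = [
    Replica("slave-a", online=True, disconnect_count=0),
    Replica("slave-b", online=False, disconnect_count=0),
    Replica("slave-c", online=True, disconnect_count=11),
]
print([r.name for r in filter_candidates(replicas)])
```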

Scoring

After the unsuitable slaves have been filtered out, scoring begins. There are three rounds, each with its own rule:

  1. Slave priority. The slave-priority configuration item assigns different priorities to different slaves (some people have backers behind the scenes, and nothing can be done about it). The slave with the highest priority is promoted directly to new master. In Redis a lower slave-priority number means a higher priority, and a slave with priority 0 is never promoted.
  2. Replication progress. The slave whose slave_repl_offset is closest to the old master's master_repl_offset wins (the closer a disciple's martial arts are to the previous leader's, the stronger he is). This round simply compares how far each slave's replication lags behind the old master. If the slaves are tied, move on to the next rule.
  3. Slave runID. With equal priority and equal replication progress, the slave with the smallest runID scores highest and is selected as the new master (ranking by seniority, with runID standing in for who came first).
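The three rounds amount to a single ordered comparison. Below is a minimal sketch, assuming the candidates have already passed the filter stage; the `Candidate` type and `pick_new_master` function are illustrative names, and a lower priority number is treated as better, with 0 meaning "never promote".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    run_id: str
    priority: int     # slave-priority; lower number wins, 0 means never promote
    repl_offset: int  # how much of the old master's data has been replicated


def pick_new_master(candidates: List[Candidate]) -> Optional[Candidate]:
    """Apply the three scoring rounds as one tuple comparison."""
    eligible = [c for c in candidates if c.priority != 0]
    if not eligible:
        return None
    # Round 1: lowest priority number; round 2: largest offset; round 3: smallest runID.
    return min(eligible, key=lambda c: (c.priority, -c.repl_offset, c.run_id))


candidates = [
    Candidate("c3f1", priority=100, repl_offset=900),
    Candidate("a7b2", priority=100, repl_offset=1000),
    Candidate("9d4e", priority=100, repl_offset=1000),
    Candidate("0001", priority=0, repl_offset=1000),
]
winner = pick_new_master(candidates)
print(winner.run_id if winner else None)
```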

Notification

“65 Brother: Why hold a press conference?”

Electing a new master, a new head of the sect, is a big deal, so of course the world must be told. More importantly, the slaves need to know who the new leader is so they can follow him and eat and drink well together.

In this last task, the 「sentinel」 sends the connection information of the new 「master」 to the other slaves and has them execute the replicaof command to connect to the new 「master」 and replicate its data, learning all the martial arts of the new leader.

In addition, the 「sentinel」 must announce the new leader's connection information to the whole Wulin (the clients), so that everyone who wants to visit or seek advice can find the new leader and matters can be handed over to him for decision (read and write requests are sent to the new master).

How Sentinel accomplishes its main tasks

Sentinel tasks and goals

How the sentinel cluster works

The 「sentinel」 department is not a single person. Many sentinels work together as a 「sentinel cluster」, so that even if some 「sentinels」 are taken out by Lao Wang, the others can still cooperate to monitor, elect a new leader, and notify the slaves, the master, and everyone in the Wulin (the clients).

When deploying a sentinel cluster, each sentinel's configuration only sets the IP and port of the master to monitor. No connection information for the other sentinels is configured.

sentinel monitor <master-name> <ip> <redis-port> <quorum>

So how do the sentinels find each other? How do they learn about the slaves and monitor them? And which 「sentinel」 performs the master-slave switch?

With these questions in mind, follow 「Code byte」 back to the source and deep into the heart of the sentinel cluster.

Communication between sentinels via pub/sub, and discovering the slaves

“65 Brother: How do sentinels get to know each other?”

Sentinels can communicate, date, and get things done together mainly thanks to Redis's pub/sub (publish/subscribe) mechanism.

Each sentinel connects to the master and uses the publish/subscribe mechanism provided by the master to publish its own information: height and weight, whether it is single, IP, port, and so on.

The master has a dedicated channel named __sentinel__:hello, used for publishing and subscribing to messages between sentinels. Think of __sentinel__:hello as a WeChat group: the sentinels post their own news in the group hosted by the master and follow the news posted by the other sentinels.

Redis pub/sub mechanism

Once several sentinel instances have published and subscribed on the master, they know each other's IP addresses and ports, and so can discover and connect to each other.
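As a rough sketch of how a sentinel could turn hello messages into a list of peers: the eight comma-separated fields below (sentinel ip, port, run ID, current epoch, master name, master ip, master port, master config epoch) are an assumption about the payload layout, and the `Hello` type and both functions are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Hello(NamedTuple):
    sentinel_ip: str
    sentinel_port: int
    sentinel_run_id: str
    current_epoch: int
    master_name: str
    master_ip: str
    master_port: int
    master_config_epoch: int


def parse_hello(payload: str) -> Hello:
    """Parse one message from the __sentinel__:hello channel (assumed layout)."""
    fields = payload.split(",")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {payload!r}")
    ip, port, run_id, epoch, name, m_ip, m_port, m_epoch = fields
    return Hello(ip, int(port), run_id, int(epoch), name, m_ip, int(m_port), int(m_epoch))


def discover_peers(own_run_id: str, payloads: Iterable[str]) -> Dict[str, Tuple[str, int]]:
    """Map each other sentinel's run ID to its (ip, port), skipping our own messages."""
    peers: Dict[str, Tuple[str, int]] = {}
    for payload in payloads:
        hello = parse_hello(payload)
        if hello.sentinel_run_id != own_run_id:
            peers[hello.sentinel_run_id] = (hello.sentinel_ip, hello.sentinel_port)
    return peers


messages = [
    "127.0.0.1,26379,run-a,0,mymaster,127.0.0.1,6379,0",
    "127.0.0.1,26380,run-b,0,mymaster,127.0.0.1,6379,0",
    "127.0.0.1,26381,run-c,0,mymaster,127.0.0.1,6379,0",
]
print(discover_peers("run-a", messages))
```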

Redis uses channels to keep different kinds of messages apart; the channels here are in effect different WeChat groups. For example, the “Code byte reader technology group” is a group for sharing technology.

“65 Brother: The sentinels are now connected to each other, but they also need connections to the slaves, otherwise they can't monitor them. How do they learn about the slaves and monitor them?”

Exactly. Connecting the sentinels into a cluster is not enough; they also need connections to the slaves, or they cannot monitor them and cannot run heartbeat checks on both master and slaves.

Besides, after a master-slave switch the slaves have to be told to connect to the new master and synchronize data. For how data synchronization works in the master-slave architecture, see 《Redis High Availability: You Call This Master-Slave Data Consistency Synchronization?》.

The key is again the master. The sentinel sends the INFO command to the master, and the leader naturally knows which slave followers he has. So after receiving the command, the master returns its slave list to the sentinel.

Using the slave list returned by the master, the sentinel connects to every slave and continuously monitors the slaves over these connections.

As shown in the figure, sentinel 2 sends the INFO command to the master, the master returns the slave list to sentinel 2, and sentinel 2 uses the connection information in the list to connect to each slave and monitor it continuously over that connection.

The other sentinels monitor the slaves in the same way.

Getting slave information with the INFO command
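To show what "the master returns its slave list" looks like in practice, here is a small sketch that extracts slave addresses from the replication section of an INFO reply. The sample text is made up for illustration, and `parse_slaves` is a hypothetical helper.

```python
from typing import List, Tuple

SAMPLE_INFO = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=6380,state=online,offset=1000,lag=0
slave1:ip=127.0.0.1,port=6381,state=online,offset=1000,lag=1
master_repl_offset:1000
"""


def parse_slaves(info_text: str) -> List[Tuple[str, int]]:
    """Return (ip, port) for every 'slaveN:' line in an INFO replication reply."""
    slaves = []
    for line in info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Keep only keys of the form slave0, slave1, ...
        if not sep or not key.startswith("slave") or not key[5:].isdigit():
            continue
        fields = dict(item.split("=", 1) for item in value.split(","))
        slaves.append((fields["ip"], int(fields["port"])))
    return slaves


print(parse_slaves(SAMPLE_INFO))
```

A sentinel would then open a connection to each returned address and start sending PING to it.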

Selecting the sentinel that performs the master-slave switch

“65 Brother: After the master dies, there are so many sentinels. Which one carries out the switch to the new master?”

Just like judging the master “objectively offline”, this is decided by a vote.

Any sentinel that judges the master “subjectively offline” sends the is-master-down-by-addr command to its fellow sentinels. Each of them answers Y or N according to the state of its own connection to the master: Y is a vote in favor, N a vote against.

Once a sentinel has collected enough votes in favor, it can mark the master “objectively offline”. The number of votes required is set by the quorum item in the sentinel configuration file.

sentinel monitor <master-name> <ip> <redis-port> <quorum>

For example, with 3 sentinels in total, quorum can be set to 2. When a sentinel has 2 votes in favor, it can mark the master “objectively offline”. This count includes the sentinel's own vote.

A sentinel that has marked the master objectively offline can then send a request to the other sentinels stating that it wants to perform the master-slave switch, and ask them to vote. This voting process is called “Leader election”.

Becoming “Leader” is not that simple; you need real skills. Both of the following conditions must be met:

  1. It receives votes in favor from more than half of the sentinels.
  2. The number of votes in favor is greater than or equal to the quorum value in the configuration file.

Suppose the sentinel cluster has only 2 instances. A sentinel that wants to become Leader must then obtain 2 votes, not 1. So if one sentinel goes down, the cluster can no longer perform a master-slave switch. That is why we usually deploy at least 3 sentinel instances.

This is also why sentinel clusters are deployed with an odd number of instances: an even number adds cost without tolerating any more failures (4 sentinels survive the loss of only 1, the same as 3).
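The two Leader conditions and the odd-number argument can be checked with a few lines of arithmetic. The function names below are illustrative, not Redis APIs.

```python
def majority(total_sentinels: int) -> int:
    """Smallest number of votes that is more than half of all sentinels."""
    return total_sentinels // 2 + 1


def can_become_leader(votes: int, total_sentinels: int, quorum: int) -> bool:
    """A sentinel leads the failover only with a majority AND at least quorum votes."""
    return votes >= majority(total_sentinels) and votes >= quorum


def tolerated_failures(total_sentinels: int) -> int:
    """How many sentinels may be down while a majority can still be reached."""
    return total_sentinels - majority(total_sentinels)


# With 2 sentinels a single vote is never enough, so losing one blocks failover.
print(can_become_leader(votes=1, total_sentinels=2, quorum=1))
# With 3 sentinels and quorum 2, two votes suffice.
print(can_become_leader(votes=2, total_sentinels=3, quorum=2))
# Going from 3 to 4 sentinels does not buy any extra fault tolerance.
print([(n, tolerated_failures(n)) for n in (2, 3, 4, 5)])
```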

The election process is shown in the figure below :

Redis Sentinel performing the master-slave switch

Client event notification via pub/sub

“65 Brother: The new master has been chosen. How do we announce it to the world?”

With a press conference, of course. The news media are invited to report and spread the word, and people who are interested subscribe to the events they care about and act on them.

Redis works similarly: Sentinel publishes different events through the pub/sub mechanism, and clients subscribe to them. A sentinel offers many subscription channels, and different channels carry different key events of the master-slave switch.

In other words, different events are posted in different “WeChat groups”, and whoever is interested in an event joins that group.

Master offline events

  • +sdown: an instance enters the “subjective offline” state;
  • -sdown: an instance leaves the “subjective offline” state;
  • +odown: an instance enters the “objective offline” state;
  • -odown: an instance leaves the “objective offline” state;

Slave reconfiguration events

  • +slave-reconf-sent: the sentinel has sent the replicaof command to reconfigure a slave;
  • +slave-reconf-inprog: the slave is now configured to replicate the new master, but synchronization has not finished;
  • +slave-reconf-done: the slave is configured with the new master and has completed data synchronization with it;

New master switch

  • +switch-master: the master's address has changed.

Knowing these channels, a client can subscribe to messages from the sentinels. After reading the sentinel configuration file, the client has the sentinel's address and port and can connect to the sentinel.

Then we can run subscription commands on the client to receive the different event messages.

For example, the following command subscribes to “all events in which an instance enters the objective offline state”:

SUBSCRIBE +odown
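For a client, the most useful event is usually +switch-master, whose payload has the form `<master-name> <old-ip> <old-port> <new-ip> <new-port>`. The sketch below only parses such a payload; the function name and the returned dictionary shape are illustrative, and how the message is received (for example through a Redis client library's pub/sub support) is left out.

```python
from typing import Dict, Tuple, Union


def parse_switch_master(payload: str) -> Dict[str, Union[str, Tuple[str, int]]]:
    """Split a +switch-master payload into the master name and both addresses."""
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_ip, old_port, new_ip, new_port = parts
    return {
        "master_name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


event = parse_switch_master("mymaster 127.0.0.1 6379 127.0.0.1 6380")
print(event["master_name"], "moved from", event["old"], "to", event["new"])
```

On receiving this event a client would drop its connection to the old address and reconnect to the new one.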

Notes and configuration instructions

As you may have noticed, Redis's pub/sub mechanism is especially important. Sentinels discover and connect to each other through it, clients receive events from sentinels through it, and all the events above are published through it. (The connections between sentinels and slaves are built from the slave list returned by the master's INFO command.)

down-after-milliseconds

The down-after-milliseconds option in the Sentinel configuration file specifies how long it takes for Sentinel to judge that an instance has entered the subjective offline state: if an instance keeps returning invalid replies to Sentinel for down-after-milliseconds milliseconds, Sentinel updates its record of that instance to mark it as subjectively offline.
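In other words, the subjective offline check is a timer on the last valid reply. A minimal sketch, assuming timestamps in milliseconds and the Redis default of 30000 for down-after-milliseconds; the function and parameter names are hypothetical.

```python
DEFAULT_DOWN_AFTER_MS = 30_000  # Redis default for down-after-milliseconds


def is_subjectively_down(last_valid_reply_ms: int, now_ms: int,
                         down_after_ms: int = DEFAULT_DOWN_AFTER_MS) -> bool:
    """True once no valid PING reply has been seen for longer than the limit."""
    return now_ms - last_valid_reply_ms > down_after_ms


# Last valid reply at t=0: still fine after 10 s, subjectively down after 31 s.
print(is_subjectively_down(last_valid_reply_ms=0, now_ms=10_000))
print(is_subjectively_down(last_valid_reply_ms=0, now_ms=31_000))
```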

Make sure all sentinel instances are configured consistently, especially the subjective offline threshold down-after-milliseconds. If this value differs between sentinel instances, the sentinel cluster may fail to agree on whether a master has failed, the master will not be switched in time, and the cluster service becomes unstable.

down-after-milliseconds * 10

down-after-milliseconds is the maximum connection timeout after which we consider the link between master and slave broken. If master and slave have had no network connection for down-after-milliseconds milliseconds, we consider them disconnected. If this happens more than 10 times, the slave's network is evidently poor and it is not suitable as a new master. (The Redis documentation states this rule as a duration: a replica that has been disconnected from the master for more than ten times down-after-milliseconds, plus the time the master has been unavailable from the point of view of the sentinel doing the failover, is skipped.)

Summary

The main tasks of Sentinel

The Redis sentinel mechanism is one of the means by which Redis achieves high availability with uninterrupted service. Data synchronization in the master-slave cluster is the basic guarantee of data reliability; automatically switching master and slave when the master goes down is the key to uninterrupted service.

With Redis Sentinel switching master and slave automatically, I no longer have to worry about the master going down while I'm with my girlfriend:

  • monitor the running state of the master and slaves, and judge whether the master is objectively offline;
  • after the master is objectively offline, select a slave and promote it to master;
  • notify the slaves and the clients of the new master's information.

How the sentinel cluster works

To avoid a single sentinel's failure making the master-slave switch impossible, and to reduce misjudgment, the sentinel cluster was introduced. The sentinel cluster relies on several mechanisms to work properly:

  • communication between the sentinels is based on the pub/sub mechanism;
  • the slave list is obtained with the INFO command, which lets the sentinels connect to the slaves;
  • event notification between clients and sentinels goes through the sentinel's pub/sub.

The master-slave switch is not executed by a randomly chosen sentinel. A Leader is elected by vote, and this Leader is responsible for the switch.

This article is from the WeChat official account Code byte (MageByte).


Original publication time : 2021-04-01


Copyright notice
This article was written by [Codebyte]. Please include a link to the original when reposting. Thank you.
https://cdmana.com/2021/04/20210408111751413i.html
