
Zookeeper (4) -- ZK cluster deployment and election

I. Cluster deployment

1. Prepare three machines and install ZooKeeper on each. An odd number of machines is strongly recommended, because ZooKeeper decides whether the whole service is available by checking that a majority of nodes is alive. With 3 nodes, losing 2 takes the whole cluster down. With 4 nodes, losing 2 also leaves no majority, so the cluster goes down as well. A 4-node cluster therefore tolerates no more failures than a 3-node one, and the extra machine is wasted.
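To make the majority rule concrete, here is a minimal Python sketch. The function names are my own and are not part of ZooKeeper; the only rule assumed is "more than half of the voting nodes must be alive".

```python
def quorum_size(voting_nodes: int) -> int:
    """Smallest number of live voting nodes that still forms a majority."""
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voting nodes can fail while a majority stays alive."""
    return voting_nodes - quorum_size(voting_nodes)


if __name__ == "__main__":
    for n in range(3, 7):
        print(f"{n} nodes: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Both 3 and 4 nodes tolerate a single failure, and both 5 and 6 tolerate two, which is why the even-numbered sizes buy nothing.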

2. Modify the configuration file

   Fixed syntax: server.{node ID}={IP}:{data synchronization port}:{election port}

   Node ID: the server ID, a manually assigned number between 1 and 255, which must also be written to the {dataDir}/myid file on the corresponding node.

   IP address: the IP address of that node.

   Data synchronization port: the port used to replicate data between the leader and the followers.

   Election port: the port the nodes use to communicate when a new leader has to be elected after the current one goes down.

   For example:

    server.1=192.168.0.67:28888:38888

    server.2=192.168.0.68:28888:38888

    server.3=192.168.0.69:28888:38888

   The content is the same on all three machines.
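Since the same server list has to be copied to every machine, it can help to sanity-check it before starting anything. The following is a minimal sketch, assuming only the three-field form shown above; `ServerEntry` and `parse_servers` are hypothetical names of mine, and other variants of the server line are not handled.

```python
from typing import Dict, NamedTuple


class ServerEntry(NamedTuple):
    host: str
    sync_port: int
    election_port: int


def parse_servers(config_text: str) -> Dict[int, ServerEntry]:
    """Collect server.N lines into {node ID: entry}, rejecting duplicate IDs."""
    servers: Dict[int, ServerEntry] = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line.startswith("server."):
            continue  # skip dataDir, clientPort, comments, blank lines
        key, _, value = line.partition("=")
        node_id = int(key[len("server."):])
        host, sync_port, election_port = value.strip().split(":")[:3]
        if node_id in servers:
            raise ValueError(f"duplicate node ID {node_id}")
        servers[node_id] = ServerEntry(host, int(sync_port), int(election_port))
    return servers


if __name__ == "__main__":
    sample = """
    server.1=192.168.0.67:28888:38888
    server.2=192.168.0.68:28888:38888
    server.3=192.168.0.69:28888:38888
    """
    for node_id, entry in sorted(parse_servers(sample).items()):
        print(node_id, entry)
```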

 

 3. In the dataDir directory specified in the configuration file, create a myid file and write into it the node ID that the configuration assigns to that machine

   My dataDir directory is /tmp/zookeeper

 

Write the node ID that corresponds to each IP into that machine's myid file (following the server.N lines above):

192.168.0.67: 1

192.168.0.68: 2

192.168.0.69: 3
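If you prefer to script this step, a minimal sketch could look like the one below. `write_myid` is a hypothetical helper of mine; it only assumes that myid is a plain text file holding the node ID on a single line. The demo writes into a temporary directory rather than the real dataDir.

```python
import tempfile
from pathlib import Path


def write_myid(data_dir: str, node_id: int) -> Path:
    """Create the myid file inside data_dir, containing only the node ID."""
    directory = Path(data_dir)
    directory.mkdir(parents=True, exist_ok=True)
    myid_path = directory / "myid"
    myid_path.write_text(f"{node_id}\n", encoding="ascii")
    return myid_path


if __name__ == "__main__":
    # Demo in a throwaway directory; on a real node the call would be
    # write_myid("/tmp/zookeeper", 1) on 192.168.0.67, and so on.
    with tempfile.TemporaryDirectory() as scratch:
        path = write_myid(scratch, 1)
        print(path, "contains", path.read_text().strip())
```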

 

 4. Start the service on each machine with the configuration file

./zkServer.sh start ../conf/zoo.cfg

 

5. Check the status of each node

./zkServer.sh status

 

 

 

 

  We can see that two of them are followers and one is the leader.
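To check the roles from a script rather than by eye, the sketch below pulls the role out of the text printed by ./zkServer.sh status. It assumes the output contains a line of the form "Mode: role"; `parse_mode` is a hypothetical helper, and the sample strings are illustrative rather than captured from my machines.

```python
import re
from typing import Optional

# Matches a line such as "Mode: follower" anywhere in the status output.
_MODE_RE = re.compile(r"^Mode:\s*(\w+)\s*$", re.MULTILINE)


def parse_mode(status_output: str) -> Optional[str]:
    """Return the role reported in the status output, or None if absent."""
    match = _MODE_RE.search(status_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = {
        "192.168.0.67": "Using config: ../conf/zoo.cfg\nMode: follower\n",
        "192.168.0.68": "Using config: ../conf/zoo.cfg\nMode: leader\n",
        "192.168.0.69": "Using config: ../conf/zoo.cfg\nMode: follower\n",
    }
    roles = {ip: parse_mode(text) for ip, text in samples.items()}
    leaders = [ip for ip, role in roles.items() if role == "leader"]
    print(roles, "leader count:", len(leaders))
```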

 

6. Connect to the cluster

You can connect to the cluster through any single node, or list every node, separated by commas.

Passing the -server parameter to zkCli.sh specifies the IP and port to connect to.

./zkCli.sh -server 192.168.0.67:2181

Connect to node 67 and write some data, then connect to node 68: the data written is visible there too, which shows that the data has been synchronized.

 

 

  If we stop one machine, the cluster is still available. If the machine we stop is the leader, the cluster elects a new leader, and the whole cluster is unavailable while the election is in progress. If two machines are shut down, the cluster becomes unavailable.

 

II. Cluster roles

Earlier, the ./zkServer.sh status command showed us the leader and follower roles in the cluster. There is one more role: observer.

 

leader

The master node. It handles write requests and is chosen by election; if it goes down, a new master node is elected.

follower

A child node. It serves read requests. It is also a candidate for master node and has the right to vote.

observer

A secondary child node. It serves read requests. Unlike a follower, it has no right to vote and cannot be elected master node. Observers are also not counted when the availability of the cluster is calculated.

 

Observer configuration:

Add the observer suffix to that node's entry in the cluster configuration (the official observer guide additionally sets peerType=observer in the observer's own configuration file). For example:

server.3=192.168.0.69:2889:3889:observer
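Because observers are left out of the availability calculation, the majority is computed over the remaining entries only. A minimal sketch of that counting, assuming the observer suffix is always the last colon-separated field (`voting_members` and `voting_quorum` are hypothetical names):

```python
from typing import List


def voting_members(config_text: str) -> List[int]:
    """Return the IDs of server entries that are not marked as observers."""
    ids: List[int] = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line.startswith("server."):
            continue
        key, _, value = line.partition("=")
        if value.strip().split(":")[-1] == "observer":
            continue  # observers do not vote
        ids.append(int(key[len("server."):]))
    return ids


def voting_quorum(config_text: str) -> int:
    """Majority size computed over voting members only."""
    return len(voting_members(config_text)) // 2 + 1


if __name__ == "__main__":
    sample = """
    server.1=192.168.0.67:2889:3889
    server.2=192.168.0.68:2889:3889
    server.3=192.168.0.69:2889:3889:observer
    """
    print("voters:", voting_members(sample), "quorum:", voting_quorum(sample))
```

With node 3 as an observer, only two voters remain and both must be alive, so turning one of three nodes into an observer lowers fault tolerance rather than raising it.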

 

III. Cluster election

Earlier, ./zkServer.sh status showed that machine 68 is the leader, while 67 and 69 are followers.

Why is 68 the leader? In an election, every node votes for itself in the first round. In each later round it switches its vote to the neighboring node whose myid is larger than its own. As soon as one node holds more than half of the votes, the election is over. (This description assumes all nodes hold the same data; when they do not, the node with the newest zxid wins, as section four explains.)

 

 

With four nodes there is a third round of voting, in which the first node also switches its vote to the third node. So in a 4-node cluster the leader will be the third node.
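The startup case can be played through with a small simulation. This is a simplified sketch of the rule described here, not ZooKeeper's actual election code: it assumes nodes start one after another, that every running node ends up backing the best candidate it knows of (newest zxid first, then largest myid), and that the election ends once that candidate has a majority. `elect_on_startup` is a hypothetical name.

```python
from typing import Dict, List, Optional, Tuple


def elect_on_startup(start_order: List[int], cluster_size: int,
                     zxids: Optional[Dict[int, int]] = None
                     ) -> Tuple[Optional[int], int]:
    """Return (leader myid or None, number of nodes started at that point)."""
    zxids = zxids or {}
    quorum = cluster_size // 2 + 1
    running: List[int] = []
    for myid in start_order:
        running.append(myid)
        # All running nodes converge on the same candidate:
        # newest zxid wins, largest myid breaks ties.
        best = max(running, key=lambda n: (zxids.get(n, 0), n))
        if len(running) >= quorum:
            return best, len(running)
    return None, len(running)


if __name__ == "__main__":
    print(elect_on_startup([1, 2, 3], cluster_size=3))      # (2, 2)
    print(elect_on_startup([1, 2, 3, 4], cluster_size=4))   # (3, 3)
```

In the 3-node case the second node wins as soon as it starts, which matches 68 being the leader; in the 4-node case nothing is decided until the third node is up.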

Two situations trigger an election among the cluster nodes: node startup, and more than half of the nodes being unable to establish a connection with the leader.

When a node first starts, it looks for a leader in the cluster. If it finds one, it connects to the leader and its own state becomes follower or observer. If no leader is found, the node's state becomes LOOKING and it enters the election process.

While the cluster is running, follower or observer failures do not affect the normal operation of the service as long as fewer than half of the nodes are down. But if the leader goes down, external service is suspended, all followers enter the LOOKING state, and the election process begins.

 

IV. Data synchronization

ZooKeeper's data synchronization keeps the data on every node consistent. It covers two cases: normal client writes, and synchronization after a node recovers from an outage. In the earlier steps we also saw that data written on one machine shows up on the others.

A write request may be sent to a follower, in which case it is forwarded to the leader:

1. The client sends a write request to a server in the ZooKeeper cluster. If that server is not the leader, it forwards the request to the leader, and the leader distributes the transaction to the followers in the form of a proposal;

2. When a follower receives proposals from the leader, it processes them in the order in which they were received and acknowledges each one;

3. When the leader has received acks for a proposal from more than half of the nodes, it initiates the transaction commit and sends out a commit message;

4. When a follower receives the commit, it records the transaction commit and applies the change to its in-memory database. Once the write has succeeded, the result is returned to the client.
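The four steps above can be sketched as a toy in-memory model. This is an illustration under my own assumptions, not ZooKeeper's implementation: `Node` and `write` are hypothetical, the leader counts its own ack toward the majority, there is no networking, and zxids are passed in by hand.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    alive: bool = True
    pending: Dict[int, Tuple[str, str]] = field(default_factory=dict)
    data: Dict[str, str] = field(default_factory=dict)  # committed state

    def on_proposal(self, zxid: int, path: str, value: str) -> bool:
        """Log the proposal and acknowledge it (a dead node never acks)."""
        if not self.alive:
            return False
        self.pending[zxid] = (path, value)
        return True

    def on_commit(self, zxid: int) -> None:
        """Apply a previously logged proposal to the in-memory data."""
        if self.alive and zxid in self.pending:
            path, value = self.pending.pop(zxid)
            self.data[path] = value


def write(leader: Node, followers: List[Node],
          zxid: int, path: str, value: str) -> bool:
    """Propose to all voters, commit only if a majority acknowledged."""
    voters = [leader] + followers
    quorum = len(voters) // 2 + 1
    acks = sum(node.on_proposal(zxid, path, value) for node in voters)
    if quorum > acks:
        return False  # no majority: nothing is committed
    for node in voters:
        node.on_commit(zxid)
    return True


if __name__ == "__main__":
    leader, f1, f2 = Node("68"), Node("67"), Node("69", alive=False)
    print(write(leader, [f1, f2], zxid=1, path="/demo", value="hello"))
    print(f1.data, f2.data)
```

With one follower down the write still commits on the two live nodes, and the dead follower is left behind.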

  Suppose one follower goes down. Because fewer than half of the nodes are down, the cluster keeps working normally. When the leader then receives new client requests, it cannot synchronize them to the node that is down, so the data on that node falls behind. To solve this, the first thing a node does when it starts is find the current leader and compare its data with the leader's. If the data differs, the node synchronizes first and only starts serving clients once synchronization is complete.

So how does ZooKeeper determine the data version? It introduces the zxid for comparison. The node that can win the leader election is also the one with the latest zxid (the node with the newest zxid has the most complete data).

 

The zxid is a 64-bit number. The low 32 bits are a simple incrementing counter: every data change adds 1 to them. The high 32 bits hold the leader epoch number: whenever a new leader is elected, it takes the latest zxid from its local log, extracts the epoch from the high 32 bits, adds 1 to it, and sets the low 32 bits to 0. This guarantees that zxids remain unique and increasing after every new leader election.

In short, an election adds 1 to the high 32 bits of the zxid and every data change adds 1 to the low 32 bits, so the node with the largest zxid always has the most complete data.
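The bit layout is easy to check with a few lines of Python. This is a minimal sketch of the arithmetic described here; the helper names are my own, and raising an error when the counter is full is a choice of this sketch, not a statement about what ZooKeeper does in that case.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    """Pack the epoch into the high 32 bits and the counter into the low 32."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def epoch_of(zxid: int) -> int:
    return zxid >> COUNTER_BITS


def counter_of(zxid: int) -> int:
    return zxid & COUNTER_MASK


def zxid_after_change(zxid: int) -> int:
    """A data change adds 1 to the low 32 bits."""
    if counter_of(zxid) == COUNTER_MASK:
        raise OverflowError("counter is full (handling is outside this sketch)")
    return zxid + 1


def zxid_after_election(last_logged_zxid: int) -> int:
    """A new leader bumps the epoch and resets the counter to 0."""
    return make_zxid(epoch_of(last_logged_zxid) + 1, 0)


if __name__ == "__main__":
    z = make_zxid(2, 5)
    print(hex(z), hex(zxid_after_change(z)), hex(zxid_after_election(z)))
```

Because the epoch sits in the high bits, any zxid from a newer epoch compares greater than every zxid from an older one, whatever the counter values are.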

 

V. Cluster operations commands

ZooKeeper provides a set of operations commands that can be sent to it with telnet or nc. Each command is 4 letters long, so they are also called four-letter commands.

By default most of these commands are disabled. Use 4lw.commands.whitelist in the configuration file to enable them.

Enable some: 4lw.commands.whitelist=stat, conf, isro, envi

Enable all: 4lw.commands.whitelist=*

 

You may need to install the Netcat tool:

yum install -y nc

Check the server and client connection status :

echo stat | nc localhost 2181

 

1. conf (new in 3.3.0): Print details about the serving configuration.

2. crst (new in 3.3.0): Reset connection/session statistics for all connections.

3. dump: List the outstanding sessions and ephemeral nodes. This only works on the leader.

4. envi: Print details about the serving environment.

6. ruok: Test whether the server is running in a non-error state. If it is, the server responds with imok; otherwise it does not respond at all. A response of "imok" does not necessarily mean that the server has joined the quorum, only that the server process is active and bound to the specified client port. Use "stat" for details on the server's state with respect to the quorum and for client connection information.

7. srst: Reset server statistics.

8. srvr (new in 3.3.0): List full details for the server.

9. stat: List brief details for the server and its connected clients.

10. wchs (new in 3.3.0): List brief information on the server's watches.

11. wchc (new in 3.3.0): List detailed information on the server's watches, by session. This outputs a list of sessions (connections) with their associated watches (paths). Depending on the number of watches, this operation can be expensive (that is, it affects server performance), so use it carefully.

12. dirs (new in 3.5.1): Show the total size of the snapshot and log files in bytes.

13. wchp (new in 3.3.0): List detailed information on the server's watches, by path. This outputs a list of paths (znodes) with their associated sessions. Depending on the number of watches, this operation can be expensive (that is, it affects server performance), so use it carefully.

14. mntr (new in 3.4.0): Output a list of variables that can be used to monitor the health of the cluster.
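The same commands can be sent from a script instead of nc. Below is a minimal sketch using only the Python standard library. It assumes the server closes the connection after replying and that mntr prints one tab-separated key and value per line; `send_four_letter_word` and `parse_mntr` are hypothetical helpers, and the sample keys in the usage part are illustrative.

```python
import socket
from typing import Dict


def send_four_letter_word(host: str, port: int, command: str,
                          timeout: float = 5.0) -> str:
    """Send one four-letter command and return everything the server replies."""
    if len(command) != 4:
        raise ValueError("four-letter commands must be exactly 4 characters")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection: reply is complete
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(output: str) -> Dict[str, str]:
    """Turn tab-separated key/value lines into a dict of strings."""
    result: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition("\t")
        if sep:
            result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    # Parsing an illustrative reply; against a live node it would be
    # parse_mntr(send_four_letter_word("192.168.0.67", 2181, "mntr")).
    sample = "zk_server_state\tleader\nzk_znode_count\t5\n"
    print(parse_mntr(sample))
```

The command still has to be on the whitelist, otherwise the server will not execute it.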

 

Copyright notice
This article was written by [White Dew is not frost]. Please include a link to the original when reposting. Thank you.
