About the author:
Senior Software Engineer, eBay China R&D Center.
Joined eBay in 2010 and has been responsible for design and development in the platform engineering department. Initially responsible for developing and optimizing the database application layer across eBay; later worked on user behavior data collection, building data pipelines, and part of the data analysis, and is one of the major contributors to the open source project PulsarIO.
Currently focused on eBay's real-time data transmission and computing platform, built for high availability and high scalability on top of open source software such as Kafka and Storm, with rich experience in automated operations of distributed systems.
The theme of this article is what we have done over the past year: building an enterprise data transmission platform on top of Kafka, how we implemented the platform, and the experience and lessons learned in bringing it online and operating it.
This article is divided into four parts:
Overview of eBay's data transmission platform
Why we built the platform, what exactly it is, and what its main technical architecture looks like.
Core platform services
A somewhat more detailed description of the platform's core services.
System monitoring and automation
How we monitor the platform after going live, and the effort we put into automating operations to reduce manual work for the operations staff.
Kafka performance optimization
The system adjustments we made so that Kafka runs with better performance.
1. Overview of eBay's data transmission platform
1.1 Why build a transmission platform
First, an overview of the platform: what is a data transmission platform, and why did we build it?
Internet companies in fact run many systems, which can generally be divided into two categories:
One category is online systems: site systems that deal directly with users. For an e-commerce site, for example, there is product viewing, product search, sellers listing products, and a fulfillment system.
The other category is offline systems, whose main job is BI analysis and site reports, including financial statements, as well as user behavior analysis. Offline systems usually include big data products; for example, we use Hadoop for data mining.
The data in offline systems actually comes from the online systems, so the problem we need to solve is how to move data out of a huge set of online systems and into the offline systems.
For eBay, there must be tens of thousands of online systems. For example, we need to collect user behavior, and the systems hold several PB of data in relational databases; we need to capture database changes and transfer them to the offline systems in the back end. How is that done?
In recent years the demand for real-time computing has kept growing. Systems like fraud detection and personalized recommendation all require real-time data: they cannot wait a whole night for the data to arrive; they need the data immediately, so they can compute, get results, and feed them back to other online systems.
Under these circumstances we need a real-time platform that transfers data from the online systems to the offline systems. That is why we built this transmission platform.
1.2 Why use Kafka
Why do we use Kafka? I will not introduce Kafka itself at length here; instead, let me talk about what we liked about Kafka and why we finally chose it for the data transmission platform.
Our data transmission platform is in fact a typical piece of message middleware, and there are many message middleware products on the market, so why Kafka?
Kafka's advantages lie mainly in these areas. First, its biggest advantage is high throughput: in throughput, Kafka basically outclasses the other message middleware.
It also has many other excellent features. Along with high throughput, its transmission latency is also good: end-to-end delivery can reach the millisecond level.
Beyond that, it supports multiple subscribers, which it does particularly well.
It also guarantees message persistence: all data that enters Kafka can be replayed from any point in time, as long as the retention period has not yet removed it.
In addition it has good scalability: as you add Kafka nodes, processing capacity grows roughly linearly. It also ensures high availability: if some Kafka nodes go down, data integrity is not affected.
1.3 How much Kafka handles at eBay
Since we chose Kafka, how much data does eBay put on it, and what kind of data do we put into Kafka?
We currently run about thirty Kafka clusters, mainly on eBay's own private cloud, which we built on OpenStack, so our Kafka brokers are all virtual machines.
We have more than 800 virtual machines with more than 1,200 applications running on top of them, and the daily message volume exceeds 100 billion. So you can see this is a typical case of real-time transmission of big data.
So why do we split Kafka into so many clusters?
This is very similar to database sharding: we divide Kafka clusters vertically by business. For example, all user behavior, such as clicks, searches, and product views, goes into one set of Kafka clusters; all database changes, for example a seller changing a price, go into another set of Kafka clusters.
Some sites also want to emit their own business events themselves, for example when an item is sold; we put these business events on separate business-event Kafka clusters.
We know LinkedIn originally created Kafka; they are said to run more than 60 Kafka clusters.
Given an open source product like Kafka, can we just use it directly, without doing anything else?
Of course not. What Kafka provides is really a unit of functionality. To build an enterprise real-time transmission platform, you need to build a lot of additional services on top of Kafka. Every enterprise has its own requirements: security, for example, is implemented differently by every enterprise, and the layout of data centers also differs, so extra services are needed to support each enterprise's own needs.
So what services has eBay built? A very simple example: when we want a user to create their own Kafka topic, we cannot just point them at a node; that is obviously not safe enough, and it is not convenient either.
So we must have a service that provides management functions. We want a unified entry point and a unified topic namespace, which means we need to introduce a metadata service.
Another example: we have data centers in Shanghai and Beijing. How do we move data from Shanghai to Beijing? For that you need a data mirroring service.
Another example, as just mentioned: all our Kafka clusters sit on the OpenStack cloud. When we need to build a new cluster, repair a cluster, add new nodes to a cluster, or decommission nodes, how do we call OpenStack to get it done? On top of that we also have many monitoring services and system log services.
We expose all of these services through APIs, and a user portal on top lets users do these things directly themselves instead of having to find a system administrator.
2. Core platform services
Those are the services our system provides; now let me describe them in a bit more detail.
2.1 Purpose of the metadata service
The first is the metadata service. Why did we introduce it? Because we want to logically provide a unified topic namespace. For example, suppose I want to access user behavior data: without this service, users would have to know which Kafka cluster holds the user behavior data and which node to connect to.
And what is the topic name? With the service, a user can query it directly: you ask for the topic, the service finds which Kafka cluster really hosts it and returns that to the client, which then connects.
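As a sketch of how such a lookup behaves, the toy registry below maps a logical topic name in the unified namespace to the physical cluster that hosts it. All cluster names, topic names, and addresses here are hypothetical, and the real metadata service is a networked server rather than an in-process dictionary.

```python
# Hypothetical registry: logical topic name -> physical cluster info.
TOPIC_REGISTRY = {
    "behavior.user.click": {"cluster": "kafka-behavior-sh",
                            "bootstrap": "b1.sh.example:9092,b2.sh.example:9092"},
    "dbchange.item.price": {"cluster": "kafka-dbchange-sh",
                            "bootstrap": "d1.sh.example:9092"},
}

def resolve_topic(logical_name: str) -> dict:
    """Return the physical cluster info for a logical topic name."""
    try:
        return TOPIC_REGISTRY[logical_name]
    except KeyError:
        raise LookupError(f"unknown topic: {logical_name}")

# A client asks the metadata service for the cluster, then connects with a
# normal Kafka client using the returned bootstrap servers.
info = resolve_topic("behavior.user.click")
print(info["cluster"])
```

The point of the indirection is that clients never hard-code cluster addresses; the service can move a topic to another cluster without breaking them.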
Besides providing a unified namespace, we also introduced a concept called the topic group. Why does this exist?
Because of the self-service we just discussed: we want users to create topics themselves, but letting them create without limit would certainly damage system resources, since a user does not know how much capacity remains, and creating too many topics can bring a Kafka cluster down.
So quota management is needed. The unit we introduced here is the topic group: when a topic group is created, a system administrator approves it.
Once approved, the topic group is assigned quotas, such as how many topics may be created under it and how much network bandwidth it may use. This also makes later operations management easier.
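A minimal sketch of that quota check, assuming a hypothetical topic group with an administrator-approved topic-count and bandwidth quota; the real service persists this state and enforces it at the API layer.

```python
class TopicGroup:
    """Hypothetical quota unit: an approved group with fixed limits."""

    def __init__(self, name: str, max_topics: int, max_bytes_per_sec: int):
        self.name = name
        self.max_topics = max_topics
        self.max_bytes_per_sec = max_bytes_per_sec
        self.topics = set()

    def create_topic(self, topic: str) -> None:
        # Self-service creation is refused once the group's quota is used up.
        if len(self.topics) >= self.max_topics:
            raise PermissionError(f"topic quota exceeded for group {self.name}")
        self.topics.add(topic)

group = TopicGroup("user-behavior", max_topics=2, max_bytes_per_sec=50_000_000)
group.create_topic("behavior.click")
group.create_topic("behavior.search")
try:
    group.create_topic("behavior.view")   # third topic exceeds the quota
except PermissionError as e:
    print(e)
```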
2.2 How the metadata service works
All of the above is provided by the metadata service, so how does the metadata service work?
You cannot manage more than thirty clusters with the topic metadata that the Kafka clusters themselves carry, so we need our own metadata store.
2.3 Kafka proxy
We introduced a logical layer that makes our thirty-odd clusters look like a single cluster. In this situation there is a problem when users use the native Kafka API, because the API does not know which cluster it should talk to.
Our Kafka proxy fully implements the Kafka wire protocol, which defines many operations and sits on the TCP layer.
With such a proxy, it can completely emulate Kafka's own protocols, including the group protocol. The client can keep using the Kafka API it has always used; through that API, connecting to the proxy looks exactly like connecting to a single Kafka cluster, even though behind it there is not one real Kafka cluster but, say, three.
2.4 The Tier-Aggregation pattern and the data mirroring service
Now let's talk about our mirroring service. We have multiple data centers, and Kafka's source data originates in each data center, so how do we lay out the Kafka clusters?
Here we use a pattern called Tier-Aggregation. For example, both Shanghai and Beijing produce user behavior. When we do data analysis, we want to analyze Shanghai and Beijing data at the same time, so the data from the two regions has to be brought together.
With just two data centers, for instance, we create four Kafka clusters: each data center gets a local cluster holding its own location's data and an aggregate cluster holding the data from both locations.
This also gives us data redundancy across data centers: if the Beijing data center burned down, our Shanghai data center could still produce all the data.
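The layout above can be sketched as a small topology calculation: with two data centers we get four clusters, and every local cluster is mirrored into the aggregate cluster of both data centers, so each aggregate holds the full cross-DC data set. Cluster names are hypothetical.

```python
DATACENTERS = ["shanghai", "beijing"]

def build_topology(dcs):
    """Return (clusters, mirror_flows) for a Tier-Aggregation layout."""
    clusters = []
    mirrors = []   # (source_cluster, target_cluster) mirroring flows
    for dc in dcs:
        clusters += [f"{dc}-local", f"{dc}-agg"]
    # Each local cluster feeds the aggregate cluster in every data center,
    # which is what creates the cross-DC redundancy described above.
    for src in dcs:
        for dst in dcs:
            mirrors.append((f"{src}-local", f"{dst}-agg"))
    return clusters, mirrors

clusters, mirrors = build_topology(DATACENTERS)
print(len(clusters), len(mirrors))   # 4 clusters, 4 mirror flows
```

With n data centers this gives 2n clusters and n² mirror flows, which is the redundancy cost the pattern accepts.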
This approach was in fact proposed and recommended by LinkedIn. Although it introduces a lot of data redundancy, it keeps the system running well.
Because Kafka replicates data itself, each piece of data causes several copies of network traffic (three in our case); this traffic is what makes a Kafka cluster highly available.
If a Kafka cluster spanned data centers, that replication traffic would cross data centers. So how do we transfer data from one data center to another? That is what the mirroring service is for.
Around mirroring there is in fact a lot of management work: how many nodes to run, how many threads, how to start them, how to shut them down. All of this is done by a dedicated mirroring service, which in turn exposes an API so that upper-layer applications can manage data mirroring.
2.5 Schema registration service
We also have a schema registration service. As a common platform, we want all data passing through the platform to be manageable, and we want everyone to know the data format, so we defined a unified data schema for the platform.
The Kafka ecosystem provides a schema registry component that uses Kafka for storage and for high availability. We took it and used it, but not one hundred percent as-is.
Because it has some limitations: it does not support permission control, so anyone can change a schema and anyone can add new versions.
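The extra control layer can be sketched as below: a toy registry in which the first registrant of a subject becomes its owner, and only the owner may add versions. This is a hypothetical illustration of the permission check, not the actual registry's API.

```python
class SchemaRegistry:
    """Toy registry with an ownership check in front of registration."""

    def __init__(self):
        self._subjects = {}   # subject -> {"owner": str, "versions": [str]}

    def register(self, subject: str, schema: str, user: str) -> int:
        entry = self._subjects.setdefault(subject, {"owner": user, "versions": []})
        if entry["owner"] != user:
            # Without this check, anyone could change anyone's schema.
            raise PermissionError(f"{user} is not the owner of {subject}")
        entry["versions"].append(schema)
        return len(entry["versions"])   # the new version number

reg = SchemaRegistry()
v1 = reg.register("behavior.click-value", '{"type": "string"}', "team-a")
v2 = reg.register("behavior.click-value", '{"type": "bytes"}', "team-a")
print(v1, v2)   # 1 2
try:
    reg.register("behavior.click-value", '{"type": "int"}', "team-b")
except PermissionError as e:
    print("rejected:", e)
```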
2.6 User self-service
All the services just mentioned need an interface, whether for users or for administrators, because it is impossible for everyone to SSH into the servers.
So we have a user self-service portal, covering consumer registration, producer registration, topic-group registration, and schema registration.
I just talked about creating a cluster, replacing some nodes in a cluster, adding new nodes, and so on. These all require calling OpenStack functions, and for this we had to build a very small PaaS system.
We in fact built a mini PaaS on top of OpenStack; besides providing provisioning workflows, it also provides management of running workflows.
OpenStack provides a set of interfaces for this kind of thing, but those interfaces are based on the AMQP protocol. The same goes for configuration: Kafka has its default configuration, and we also made some configuration optimizations, changing settings to optimize all nodes. Managing these configurations is likewise done in the Prism server.
3. System monitoring and automation
So much for the services. Once the platform goes live we have to run and operate it, and the most important parts of operations are monitoring the system well and repairing it promptly when it goes wrong. So system monitoring is a very important topic.
3.1 Cluster node monitoring
As you can see, this system involves many nodes, and we package them all together so that they deliver a single piece of business semantics.
For monitoring, we must have a unified view of the running state of this whole series of clusters. For cluster nodes, one node being down does not mean the service is broken: Kafka has data redundancy, so losing one or two nodes is not a problem.
So here we list which nodes are down and which are healthy, and our operations staff repair the down nodes. We still repair them manually, because we need to analyze the causes of these failures.
For now, with the system live for only about a year, we use this manual approach. In the future we will consider automatically replacing any node that has problems; automatic replacement has to introduce rules about when automation may proceed and when it must not.
3.2 Kafka state monitoring
For Kafka, besides watching each node we also monitor Kafka's own state. For the system operators, the system resources of each node must be monitored as well; system state is very important to administrators.
There is also monitoring that matters not only to administrators but also to users, such as Kafka state monitoring. A user may want to know, for example, how much data entered the Kafka cluster yesterday, so this kind of monitoring is available to the system's users as well as the operations staff.
The same applies to consumers: in Kafka monitoring it is important for me to know my consumer's lag.
If the lag keeps increasing, something must be wrong with the consumer application and it has to be dealt with, so this helps users keep these problems strictly under control.
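Lag is simply the distance between a partition's log-end offset and the consumer's committed offset, and "keeps increasing" can be detected by comparing consecutive samples. The sketch below uses illustrative numbers and a made-up alert rule; real deployments would pull the offsets from Kafka and tune the rule.

```python
def partition_lag(log_end_offset: int, committed_offset: int) -> int:
    """Messages the consumer still has to process on one partition."""
    return max(0, log_end_offset - committed_offset)

def is_stuck(lag_samples, min_growth: int = 3) -> bool:
    """Alert when lag grew across at least `min_growth` consecutive samples."""
    growing = sum(1 for a, b in zip(lag_samples, lag_samples[1:]) if b > a)
    return growing >= min_growth

samples = [100, 250, 700, 1500]          # lag measured once a minute
print(partition_lag(5000, 3500))         # 1500
print(is_stuck(samples))                 # True
```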
When something goes wrong, it is not only the administrators who need to know about the consumer; its users need to know too, so the alerting system also notifies users. In addition, some of our applications have end-to-end requirements: we have to know how long data takes to travel from the application into Kafka.
3.3 Monitoring and handling failed nodes
Within Kafka, we found a very important topic during operations: how to deal with slow nodes. What is a slow node?
Kafka actually handles node failure very well: it does not matter if one or two nodes go down; Kafka quickly removes the bad node and quickly elects a replacement from the other nodes.
But a slow node has not died; it just works slowly. Say its throughput drops to only one tenth of normal: Kafka cannot remove it. We found that most systems have this problem.
Why does a performance problem on one node affect the whole Kafka cluster? Because Kafka replicates data across the cluster: once one node's performance degrades, the network connections from all the other nodes to it run into trouble as well.
It effectively drags down many other nodes. It would be better if the cluster killed the node at that point; if it is not killed, the drag remains.
That is why we detect slow nodes. Slow nodes are more troublesome than dead nodes precisely because they are still there. For example, we can watch how long a node's CPU stays above some level; we need to set up rules like this.
We also monitor the disks for serious IO performance problems, and we analyze exceptions in the system logs. Finally, the most direct method: we create some footprint topics and probe them regularly, first to see whether they work at all, then to see whether they respond slowly, so that we can tell whether a node has a problem.
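The footprint-topic probe can be sketched as a timed round trip: produce a message to the canary topic on a broker, consume it back, and classify the broker by whether it answered and how long it took. The threshold and the stubbed round trip here are hypothetical; a real probe would wrap an actual Kafka producer and consumer.

```python
import time

def probe(send_and_receive, slow_threshold_ms: float = 500.0) -> str:
    """Classify a broker by one produce/consume round trip on a canary topic.

    send_and_receive: callable performing the round trip; returns True on success.
    """
    start = time.monotonic()
    ok = send_and_receive()
    elapsed_ms = (time.monotonic() - start) * 1000
    if not ok:
        return "down"            # the probe failed outright
    return "slow" if elapsed_ms > slow_threshold_ms else "healthy"

# In production send_and_receive would talk to a real broker; here a stub
# stands in for a broker that answers immediately.
print(probe(lambda: True))
```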
What do we do once a slow node is detected?
The simplest treatment is to stop it; stopping it is safe. If you think the downtime affects system throughput, you can restart it, but many slow nodes are still just as slow after a restart.
If one disk on a node is heavily used, resource allocation is very uneven. Here we bring in partition reassignment, which is not easy to handle and still needs to be done manually.
3.4 Offset index and automatic failover
As mentioned, our Kafka clusters keep independent copies in each data center. Take the Beijing data center as an example: if its Kafka goes down, can a consumer go to the Shanghai data center and continue consuming? How is that done automatically, cutting over without human intervention?
This is where the Kafka proxy comes in: if the proxy knows a data center has a problem, it can return the other data center's Kafka information to the client, and the failover is handled very cleanly.
For consumers there is one more thing to do, because the same message has different offsets in different data centers.
For example, if the Shanghai data center started its Kafka a month earlier than Beijing, there must be an offset difference, so we need a tool to find out how large the difference is. Fortunately, later Kafka versions added a time-based index mechanism inside the Kafka server, but when we implemented this feature, that version had not been released yet.
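The core of such a tool is a time-to-offset lookup: since offsets differ between clusters but timestamps do not, the failed-over consumer restarts at the first offset whose timestamp is at or after where it left off. Later Kafka versions expose a similar lookup as offsetsForTimes; this sketch rebuilds it over a hypothetical in-memory index.

```python
import bisect

def offset_for_time(index, target_ts):
    """index: list of (timestamp, offset) pairs sorted by timestamp.
    Return the first offset whose timestamp is >= target_ts,
    or None if the target lies beyond the end of the log."""
    timestamps = [ts for ts, _ in index]
    i = bisect.bisect_left(timestamps, target_ts)
    if i == len(index):
        return None
    return index[i][1]

# Hypothetical index for the Beijing copy of a partition.
beijing_index = [(1000, 0), (2000, 40), (3000, 90)]
print(offset_for_time(beijing_index, 2000))   # 40
print(offset_for_time(beijing_index, 2500))   # 90
```

A consumer failing over from Shanghai would take the timestamp of its last processed message and resume in Beijing at the returned offset, at the cost of possibly reprocessing a few messages.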
4. Kafka performance optimization
Finally, let's talk about why Kafka achieves such high throughput and performance.
4.1 Sequential disk reads and writes
First, Kafka keeps disk reads and writes sequential, and we know sequential disk I/O performs well, as long as you avoid random file access. Knowing this property, how do we keep disk access sequential in normal operation? We do not want the disk head to keep seeking; under what circumstances does the head seek frequently?
When a single Kafka node holds too many partitions, switching between many different files turns sequential reads and writes into random ones. Inside an application, similar access patterns also hurt the performance of large reads and writes. If you want to guarantee sequential access, avoid the situations just mentioned.
4.2 A design centered on the page cache
Kafka also maintains high performance through the page cache: on the consumer side, as long as a consumer keeps up, it always reads data from memory and never from disk.
Because the page cache handles the data, we found during operations that the page cache must never be swapped out; once swapping occurs, the node's performance drops quickly.
If you run on a cloud, especially a public cloud, this is very troublesome: you need to configure it at the hypervisor level. We also found a NUMA imbalance problem, which gives some virtual machines poor CPU-to-memory locality; a later version of OpenStack lets you force a PIN operation when allocating CPUs.
Linux also has some settings that optimize the page cache. Someone may ask: disk writes are asynchronous, so if data comes in and my node goes down before the data actually reaches the disk, what then?
Kafka keeps the message not only on this node but on other nodes as well. It is preserved first, and the asynchronous disk-write process later actually flushes the message to disk.
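In other words, durability comes from replication rather than synchronous flushes: a message acknowledged by all in-sync replicas survives a single node crash even if no replica has flushed it yet. A producer opts into that guarantee through its configuration; the settings below are a sketch with illustrative values, not our production configuration.

```python
# Producer settings (librdkafka/confluent-kafka style names) that trade a
# little latency for replication-backed durability. Broker address is
# hypothetical.
producer_config = {
    "bootstrap.servers": "broker1.example:9092",
    "acks": "all",                # wait for all in-sync replicas to acknowledge
    "retries": 5,                 # retry transient failures
    "enable.idempotence": True,   # avoid duplicates introduced by retries
}
print(producer_config["acks"])
```

With acks=1 or acks=0, a crash of the leader before the asynchronous flush (and before followers fetch the message) can still lose data, which is why the replication-level acknowledgement matters here.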
4.3 Zero Copy
And finally , Kafka be used linux After the operating system ,Zero Copy It's simple , When my disk came , You don't have to go directly into user memory , Instead, it's thrown directly into the network port , This is not optimized , We must be careful Zero Copy It will fail .
We do Kafka When you upgrade , It's possible that the data format will change , For instance from 0.9 Change to 1.0, If you cut it by force , Will destroy Kafka .
So how to solve this problem , Actually in 1.0 There is a very complicated and detailed description in the upgrade instructions , If anyone wants to upgrade , Be sure to read the instructions in detail .
At the same time, another leads to Zero Copy Failure is the possibility of introducing SSL/TLS, How to deal with this ？ So you can only weigh security against performance .
4.4 Other parameter tuning
There are some other parameters that can be tuned to optimize Kafka. One is that the file descriptor limit must be raised.
On high-throughput nodes, be sure to increase the maximum socket buffer size. Also weigh unclean.leader.election.enable, increase the number of fetch threads with num.replica.fetchers, and handle leader elections with auto.leader.rebalance.enable.
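The knobs just listed can be collected into a broker-config fragment; the snippet below renders them in server.properties form. The values are illustrative starting points, not eBay's production settings.

```python
# Broker overrides mentioned above, with illustrative values.
broker_overrides = {
    "num.replica.fetchers": 4,                 # more replication fetch threads
    "unclean.leader.election.enable": False,   # consistency over availability
    "auto.leader.rebalance.enable": True,      # respread leaders after recovery
    "socket.receive.buffer.bytes": 1048576,    # bigger buffers on busy nodes
    "socket.send.buffer.bytes": 1048576,
}

# Render as server.properties lines (booleans lower-cased, Java style).
props = "\n".join(
    f"{k}={str(v).lower() if isinstance(v, bool) else v}"
    for k, v in sorted(broker_overrides.items())
)
print(props)
```

The file descriptor limit is raised outside Kafka, at the OS level (for example via ulimit or systemd's LimitNOFILE), since each partition segment and client connection consumes a descriptor.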