Explaining Kafka's performance is the start of our journey into the secrets behind Kafka's speed. Along the way you will not only learn Kafka's many performance-optimization techniques, but also distill from them general methodologies that can be applied to your own projects, helping you write high-performance code.
Guan Gong vs. Qin Qiong

"65 Brother: Redis and Kafka are completely different kinds of middleware. Is there even a basis for comparison?"

Exactly — which is why this article is neither "How to Choose a Distributed Cache" nor "A Comparison of Distributed Middleware". Like the proverbial duel between Guan Gong and Qin Qiong, two generals from different eras, the point is not to pick a winner. We focus instead on how these two projects optimize performance in their different domains: the optimization techniques that excellent projects have in common, and the techniques tailored to the characteristics of each project's own scenario.
Many people study a great deal and learn many frameworks, yet when faced with a real problem they feel their knowledge falls short. This is a failure to systematize what was learned: no working methodology was ever abstracted from the concrete implementations.

When studying open-source projects, it is important to work inductively — summarize the methodology behind each project's best implementations — and then apply it deductively in your own practice.
Opening Remarks

"Margo: Being rational, objective, and prudent is characteristic of programmers, and a strength. But often we also need a little emotion, a little impulse — at times that helps us make decisions faster. 'The pessimist is right; the optimist succeeds.' I hope every one of you is an optimistic problem solver."
Kafka Performance Panorama
Viewed at a high level of abstraction, performance problems cannot escape three areas:
- Network
- Disk
- Complexity
For a networked distributed queue like Kafka, network and disk are the top optimization priorities. And for the abstract problems above, the solutions are just as abstract and simple:
- Concurrency
- Compression
- Batching
- Caching
- Algorithms
Knowing the problems and the ideas, let's see which roles exist in Kafka — these roles are exactly the points that can be optimized:
- Producer
- Broker
- Consumer
Yes — all the problems, ideas, and optimization points have now been listed, and each can be refined along all three directions. With that, every implementation becomes clear at a glance; even without reading Kafka's source, we can already guess a thing or two that could be optimized.
That is the way of thinking: raise the question > list the problem points > list the optimization techniques > list the concrete entry points > make trade-offs and refine the implementation.
Now, try to come up with your own optimizations. They don't have to be comprehensive, and it doesn't matter whether Kafka actually implements them — just think it over a little.

"65 Brother: No way, I'm too dumb — and lazy. Better just tell me directly; I'm good at freeloading."
Sequential Writes
"65 Brother: Redis is a pure in-memory system, while your Kafka reads and writes to disk. How can they even compete?"

Why is writing to disk slow?

We should not just know the conclusion without knowing the reason why. To answer this question, we have to go back to the operating-systems course we took in school. 65 Brother, do you still have the textbook? Come on, turn to the chapter on disks and let's review how a disk works.

"65 Brother: Like hell I do — I'd sold the book before the term was half over. If I hadn't had sharp eyes during exams, I probably still wouldn't have graduated."
Look at this classic diagram: completing one disk I/O requires three steps — seek, rotation, and data transfer.
The factors affecting disk I/O performance also occur in these three steps, so the main time costs are:
- Seek time: Tseek, the time needed to move the read/write head to the correct track. The shorter the seek time, the faster the I/O operation; the average seek time of current disks is generally 3–15 ms.
- Rotational delay: Trotation, the time needed for the platter to rotate until the sector containing the requested data lies under the read/write head. Rotational delay depends on spindle speed and is usually expressed as half the time of one full revolution. For example, a 7200 rpm disk has an average rotational delay of about 60 × 1000 / 7200 / 2 = 4.17 ms, while a 15000 rpm disk averages 2 ms.
- Data transfer time: Ttransfer, the time needed to transfer the requested data. It depends on the data rate: its value equals the data size divided by the transfer rate. Current IDE/ATA interfaces reach 133 MB/s and SATA II reaches 300 MB/s; transfer time is usually far smaller than the first two components and can be ignored in rough calculations.
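The rotational-delay numbers above are easy to verify yourself. A minimal sketch — the class and method names are mine, not from any library:

```java
public class DiskLatency {
    // Average rotational delay = time for half a revolution.
    // 60,000 ms per minute divided by rpm gives ms per revolution.
    static double avgRotationalDelayMs(int rpm) {
        return 60_000.0 / rpm / 2;
    }

    public static void main(String[] args) {
        System.out.printf("7200 rpm:  %.2f ms%n", avgRotationalDelayMs(7200));  // 4.17 ms
        System.out.printf("15000 rpm: %.2f ms%n", avgRotationalDelayMs(15000)); // 2.00 ms
    }
}
```

Compare these few milliseconds per random access with the microsecond-scale cost of a sequential transfer, and the motivation for avoiding seeks becomes obvious.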
Therefore, if seek and rotation can be eliminated when writing to disk, disk read/write performance improves dramatically.

Kafka uses sequential file writes to improve disk write performance. Writing files sequentially essentially eliminates the seeks and rotations: the head no longer dances across the tracks, it just travels steadily in one direction.
Each Kafka partition is an ordered, immutable sequence of messages, with new messages continually appended at the end. In Kafka, a Partition is only a logical concept; Kafka divides each Partition into multiple Segments, each Segment corresponding to one physical file, and appends to the segment file — that is how sequential file writing is achieved.
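The append-only write described above can be sketched in a few lines of Java NIO. This is only an illustration of append semantics, not Kafka's actual log code:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SegmentAppend {
    public static void main(String[] args) throws IOException {
        Path segment = Files.createTempFile("00000000000000000000", ".log");
        // APPEND: every write lands at the current end of the file, so the
        // disk head never seeks backwards -- the essence of sequential write.
        try (FileChannel ch = FileChannel.open(segment,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (int i = 0; i < 3; i++) {
                ch.write(ByteBuffer.wrap(("msg-" + i + "\n").getBytes(StandardCharsets.UTF_8)));
            }
        }
        System.out.print(Files.readString(segment)); // msg-0, msg-1, msg-2 on separate lines
        Files.delete(segment);
    }
}
```

Note the segment-style filename: Kafka names segment files by base offset, as described later in the file-structure section.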
"65 Brother: Why can Kafka get away with append-only writes?"

This follows from Kafka's nature. Compare Kafka with Redis: bluntly put, Kafka is a Queue while Redis is a HashMap. What's the difference between a Queue and a Map?

A Queue is FIFO and its data is ordered; a HashMap's data is unordered, accessed by random reads and writes. Kafka's immutability and ordering are exactly what allow it to write files append-only.
In fact, many data systems that share these characteristics use append-only writes to optimize disk performance — typical examples are Redis's AOF file and the WAL (write-ahead log) mechanism of various databases.

"Know your workload's characteristics clearly, and you can optimize for them precisely."
Zero Copy
"65 Brother: Ha, I was asked about this in an interview. Too bad my answer wasn't great. Sigh."

What is zero copy?

Let's look at it from Kafka's own scenario. A Kafka Consumer consumes data stored on the Broker's disk, so the data has to be read from the Broker's disk and sent over the network to the Consumer. What system interactions does this process involve? When a Kafka Consumer fetches data and the Broker reads the Log, Kafka uses sendfile. Under the traditional I/O model, the pseudocode would instead be:

readFile(buffer)
send(buffer)
As the figure shows, under the traditional I/O flow — first a disk read I/O, then a network write I/O — the data actually has to be copied four times:

- First: the disk file is read into the operating system's kernel buffer;
- Second: the data in the kernel buffer is copied into the application's buffer;
- Third: the data in the application buffer is copied into the socket send buffer (in the kernel);
- Fourth: the data in the socket buffer is copied to the network card, which transmits it onto the network.
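The four copies above all hide behind a very ordinary-looking read/write loop. A minimal sketch (names are mine; streams stand in for the disk file and the socket):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class TraditionalCopy {
    // The classic read()/write() loop behind the four copies listed above:
    // disk -> kernel buffer -> this user-space array -> socket buffer -> NIC.
    static void send(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];           // user-space buffer: copies 2 and 3 cross it
        int n;
        while ((n = in.read(buffer)) != -1) {     // copy: kernel buffer -> user buffer
            out.write(buffer, 0, n);              // copy: user buffer -> socket send buffer
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream("hello".getBytes());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        send(in, out);
        System.out.println(out); // hello
    }
}
```

The application never looks at the bytes — it only shovels them through — yet every byte still makes a round trip through user space.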
"65 Brother: Huh, is the operating system really that dumb? Copy after copy after copy."

It's not that the operating system is dumb. The OS is designed so that each application has its own user-space memory, isolated from kernel memory, for the safety of both the programs and the system — otherwise every application could read and write memory anywhere, and chaos would follow.
However, there is a technique called zero copy (Zero-Copy). Zero copy minimizes the number of data copies described above, which reduces CPU overhead and the number of user-mode/kernel-mode context switches, and thereby optimizes data-transfer performance.

There are three common approaches to zero copy:

- Direct I/O: data bypasses the kernel, moving directly between user address space and the I/O device; the kernel performs only necessary auxiliary work such as virtual memory setup;
- Avoiding copies between kernel and user space: when the application does not need to access the data, the copy from kernel space to user space can be skipped entirely;
- Copy-on-write: data is not copied up front, but only partially copied when it needs to be modified.
Kafka uses mmap and sendfile to achieve zero copy; in Java they correspond to MappedByteBuffer and FileChannel.transferTo respectively.

Using Java NIO, zero copy is achieved as follows:

FileChannel.transferTo()

Under this model, the number of context switches drops to two. Specifically, transferTo() has the block device read the data into the kernel read buffer via the DMA engine. The kernel then copies the buffer into the kernel socket buffer. Finally, the socket buffer is copied to the NIC buffer via DMA.
We have reduced the copies from four to three, and only one of those copies involves the CPU. We have also reduced the context switches from four to two. That is a big improvement, but it is not yet true zero copy. On Linux kernel 2.4 and later, with a network interface card that supports gather operations, the remaining CPU copy can be removed as a further optimization, as shown below.

As in the previous example, calling transferTo() has the device read data into the kernel read buffer via the DMA engine. With the gather operation, however, there is no copy between the read buffer and the socket buffer. Instead, the NIC is given a descriptor pointing at the read buffer, plus the offset and length, and DMA pulls the data directly from the read buffer; the CPU never participates in copying buffers.

For more details on zero copy, read the article "Analysis and Application of Zero-Copy".
PageCache
When a producer sends messages to a Broker, the Broker writes the data at an offset using the pwrite() system call (corresponding to Java NIO's FileChannel.write() API); at that point the data is first written into the page cache. When a consumer fetches messages, the Broker uses the sendfile() system call (corresponding to FileChannel.transferTo()), zero-copying the data from the page cache into the Broker's socket buffer and then out over the network.

Synchronization between leader and follower works the same way as the consumer fetch path above.
Data in the page cache is written back to disk according to the scheduling of the kernel's flusher threads and calls to sync()/fsync(), so even if the process crashes, there is no need to worry about losing that data. In addition, if the message a consumer wants is not in the page cache, it is read from disk — and some adjacent blocks are read in along with it and placed in the page cache, speeding up the next read.
So as long as the Kafka producer's production rate and the consumer's consumption rate do not diverge much, the entire produce–consume cycle can be completed almost entirely through reads and writes of the broker's page cache, with very little disk access.
Network Model

"65 Brother: Networking? As a Java programmer — naturally, Netty."

Yes, Netty is an excellent network framework on the JVM that provides high-performance network services. When network frameworks come up, most Java programmers think of Netty first, and excellent frameworks such as Dubbo and Avro-RPC use Netty as their underlying network communication layer.

Kafka implements its own network model for RPC. It is built on Java NIO and uses the same Reactor threading model as Netty.
The Reactor model has three main roles:

- Reactor: dispatches I/O events to the corresponding Handler
- Acceptor: handles client connection events
- Handler: performs the non-blocking business processing
In the traditional blocking I/O model, each connection needs its own thread. Under high concurrency that means many threads and heavy resource usage; moreover, with blocking I/O, once a connection is established, if the current thread has no data to read it blocks on the read operation, wasting resources.

The Reactor model addresses both problems of the traditional blocking I/O model. It is based on pooling: instead of creating a thread per connection, business processing after a connection is established is handed off to a thread pool. And it is based on I/O multiplexing: many connections share the same blocking object, and the application does not wait on every connection individually — only when a connection has new data to process does the operating system notify the program, at which point a thread leaves the blocked state and runs the business logic.
Kafka implements multiplexing plus a processing thread pool on top of the Reactor model. Its design is as follows:

It contains one Acceptor thread that handles new connections, N Processor threads that select and read socket requests, and N Handler threads that process requests and send back responses — that is, that run the business logic.

I/O multiplexing folds multiple I/O blocking points into a single select blocking point, so a single thread can serve multiple client requests at the same time. Its greatest advantages are low system overhead and no need to create new processes or threads, which reduces the system's resource cost.
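The Acceptor/Processor/Handler split above can be compressed into a single-threaded sketch using one Selector. This is a deliberately stripped-down illustration — Kafka's actual SocketServer spreads these roles across separate threads — and it simply echoes back whatever it receives:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MiniReactor implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    public MiniReactor(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port)); // port 0 = pick any free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT); // Acceptor role
    }

    public int port() {
        return server.socket().getLocalPort();
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select(200); // one select() multiplexes every connection
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {          // Acceptor: register new client
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {     // Processor + Handler: read, then echo
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(1024);
                        if (client.read(buf) < 0) { client.close(); continue; }
                        buf.flip();
                        client.write(buf);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One thread, many connections: the select() call is the only place the thread ever blocks, which is exactly the property the paragraph above describes.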
Summary: the KafkaServer in the Kafka Broker is an excellent piece of network architecture. Anyone who wants to learn Java network programming, or who needs this technique, would do well to read its source code. Margo's later Kafka articles will also include a walkthrough of this source.
Batching and Compression
A Kafka Producer does not send messages to the Broker one at a time. Anyone who has used Kafka knows the Producer has two important parameters: batch.size and linger.ms. These two parameters govern the Producer's batched sending.
The Kafka Producer's execution flow is shown in the figure below:

Messages pass through the following processors in turn:

- Serialize: keys and values are serialized with the configured serializer. Good serialization improves network-transfer efficiency.
- Partition: decides which partition of the topic the message is written to; by default the murmur2 algorithm is used. A custom partitioner can also be passed to the producer to control which partition each message goes to.
- Compress: by default, compression is not enabled in the Kafka producer. Compression not only speeds up transfer from producer to broker, it also speeds up replication. Compression helps raise throughput, lower latency, and improve disk utilization.
- Accumulate: as the name suggests, this is the message accumulator. Internally it maintains a Deque for each Partition; each queue holds batches of data waiting to be sent. The accumulator collects data until a certain amount has gathered, or a certain expiry time has passed, and then sends the data out in batches. Records accumulate in a buffer for each partition of the topic and are grouped according to the producer's batch.size property, with a separate accumulator buffer per topic partition.
- Group Send: batches in the record accumulator are grouped by the broker they are destined for. Records in a batch are sent to the broker based on the batch.size and linger.ms properties — that is, when the defined batch size is reached or the delay time expires, whichever comes first.
Kafka supports several compression algorithms: lz4, snappy, and gzip. Kafka 2.1.0 officially added ZStandard, an open-source compression algorithm from Facebook designed to provide a very high compression ratio; see zstd for details.

Producer, Broker, and Consumer all use the same compression algorithm. When the producer writes data to the Broker, and the Consumer reads it from the Broker, the data need not even be decompressed in between; it is finally decompressed only when the Consumer polls the messages. This saves a great deal of network and disk overhead.
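The batching and compression settings discussed above are ordinary producer configuration. The property names below (batch.size, linger.ms, compression.type, bootstrap.servers) are real Kafka producer configs, but the values and the broker address are purely illustrative — the right numbers depend on your message sizes, latency budget, and broker capacity:

```java
import java.util.Properties;

public class ProducerTuning {
    // Illustrative batching/compression settings for a Kafka producer.
    static Properties batchingConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("batch.size", "65536");     // accumulate up to 64 KB per partition...
        props.put("linger.ms", "10");         // ...or wait at most 10 ms, whichever first
        props.put("compression.type", "lz4"); // compress whole batches, not single messages
        return props;
    }
}
```

These Properties would be passed to `new KafkaProducer<>(props)` in an application with the kafka-clients dependency on its classpath.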
Partition Concurrency

A Kafka Topic can be divided into several Partitions; each Partition is like a queue, keeping its data in order. Different Consumers in the same Group consume the Partitions concurrently. The partition is in fact Kafka's smallest unit of parallelism for tuning, so it is fair to say that each added Partition adds one unit of consumption concurrency.

Kafka has an excellent partition-assignment algorithm, StickyAssignor. It keeps partition assignment as balanced as possible while keeping each reassignment's result as consistent as possible with the previous one. That way, partitions stay balanced across the whole cluster, and no Broker or Consumer becomes too skewed.
"65 Brother: So the more partitions the better?"

Of course not.

More partitions require more open file handles

In a Kafka broker, each partition maps to a directory in the file system. In Kafka's data log directory, each log segment is allocated two files: an index file and a data file. So as partitions increase, the number of required file handles rises sharply, and if necessary the operating system's limit on open file handles must be raised.

Clients and servers need more memory

The client-side producer has a parameter, batch.size, which defaults to 16 KB. It caches messages for each partition and, once the cache is full, packs them up and sends them in a batch. This looks like a performance-friendly design — but because the parameter is per partition, more partitions mean this cache consumes more memory.

Availability is reduced

The more partitions there are, the more partitions each Broker carries. When a Broker goes down, recovery takes a very long time.
File Structure

Kafka messages are categorized by Topic; Topics are independent of one another and do not interfere with each other. Each Topic can in turn be divided into one or more partitions, and each partition has a log file that records its message data.

Physically, Kafka splits each partition's log into several Segments by size.

- Segment file composition: each segment consists of two main parts, an index file and a data file, which correspond one-to-one and appear in pairs; the suffixes ".index" and ".log" denote the segment's index file and data file respectively.
- Segment file naming rule: a partition's first segment starts from 0, and each subsequent segment file is named after the offset of the last message in the previous segment file. The value is a 64-bit long, 19 digits in character length, zero-padded where digits are missing.
The index is sparse, so each index file is limited in size. Kafka uses mmap to map the index file directly into memory, so operations on the index require no disk I/O. The Java counterpart of mmap is MappedByteBuffer.
"65 Brother's note: mmap is a method of memory-mapping files. It maps a file or other object into a process's address space, establishing a one-to-one mapping between the file's disk address and a range of the process's virtual address space. Once this mapping exists, the process can read and write that memory region through plain pointers, and the system automatically writes dirty pages back to the corresponding file on disk — file operations are completed without calling read, write, or other system calls. Conversely, kernel-space modifications to the region are directly reflected in user space, which also allows file sharing between processes."
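The note above translates directly into Java via FileChannel.map(). A minimal sketch — the file name imitates a Kafka index file, but this is not Kafka's index format:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapIndex {
    public static void main(String[] args) throws IOException {
        Path index = Files.createTempFile("00000000000000000000", ".index");
        try (FileChannel ch = FileChannel.open(index,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map 4 KB of the file into the process address space; reads and
            // writes on the buffer go through the page cache, not read()/write().
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.putInt(0, 42);                 // write an entry via plain memory access
            System.out.println(map.getInt(0)); // 42 -- read it back the same way
        }
        Files.delete(index);
    }
}
```

No explicit write-back is needed: the kernel flushes the dirty page to the file, exactly as the note describes.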
Kafka makes full use of binary search to locate the message for a given offset:

- Use binary search to find the segment (its .log and .index files) whose base offset is no greater than the target offset.
- Subtract the offset in the file name from the target offset to get the message's relative offset within this segment.
- Use binary search again in the index file to find the closest index entry.
- Finally, scan forward in the log file from that position until the message with the target offset is found.
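The index-lookup step above boils down to a floor search over sorted (relative offset, file position) pairs. A hypothetical in-memory model — the Entry type and method names are mine, not Kafka's actual OffsetIndex code:

```java
public class SparseIndexLookup {
    // One sparse-index entry: a message's offset relative to the segment's
    // base offset, and that message's byte position in the .log file.
    static final class Entry {
        final long relativeOffset, filePosition;
        Entry(long relativeOffset, long filePosition) {
            this.relativeOffset = relativeOffset;
            this.filePosition = filePosition;
        }
    }

    // Binary search for the greatest indexed offset <= target: the "find the
    // floor entry, then scan the log forward" procedure described above.
    static Entry floorEntry(Entry[] index, long targetRelativeOffset) {
        int lo = 0, hi = index.length - 1, ans = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (index[mid].relativeOffset <= targetRelativeOffset) {
                ans = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return ans < 0 ? null : index[ans];
    }

    public static void main(String[] args) {
        // Sparse: only every ~40th offset is indexed, keeping the file small.
        Entry[] index = { new Entry(0, 0), new Entry(40, 4096), new Entry(85, 8192) };
        Entry e = floorEntry(index, 60);
        System.out.println(e.relativeOffset + " @ " + e.filePosition); // 40 @ 4096
    }
}
```

From the returned position (byte 4096), a linear scan of at most one index interval finds offset 60 — which is why the index can stay sparse.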
Summary

Kafka is an excellent open-source project. Its performance optimizations are executed masterfully, and it is a project worth studying in depth. Whether for the ideas or the implementation, we should all read it carefully and think it over.

Kafka's performance optimizations:

- Zero copy for network and disk
- An excellent network model based on Java NIO
- Efficient file and data-structure design
- Partition-level parallelism and scalability
- Batched data transfer
- Data compression
- Sequential disk reads and writes
- Lock-free lightweight offsets
This article is from the WeChat official account High Performance Server Development (easyserverdev).
Original publication time: 2021-04-06
Copyright notice: this article was written by [Fan Li]; please include the original link when reprinting. Thanks.
https://cdmana.com/2021/04/20210408111750756v.html