
RocketMQ vs. Kafka

Taobao's internal trading system uses Notify, a message middleware developed in-house by Taobao, with MySQL as the message storage medium. It can be scaled out fully horizontally. To reduce costs further, we believed the storage layer could be optimized. In early 2011, LinkedIn open-sourced Kafka, an excellent message middleware. After the Taobao middleware team reviewed Kafka thoroughly, its unlimited message accumulation and fast, efficient persistence attracted us. At the same time, we found that Kafka was aimed mainly at log transmission and lacked many features needed for Taobao scenarios such as trading, orders, and recharge. So we rewrote it in Java as RocketMQ, positioned for reliable message delivery in non-log scenarios (log scenarios also work). RocketMQ is now widely used across Alibaba Group for orders, transactions, recharge, stream computing, message push, log stream processing, binlog distribution, and other scenarios.

Data reliability

  • RocketMQ supports asynchronous real-time flush, synchronous flush, synchronous replication, and asynchronous replication
  • Kafka uses asynchronous flush, with asynchronous or synchronous replication

Summary: RocketMQ offers higher single-machine reliability than Kafka: with synchronous flush, an operating system crash does not cause data loss. Kafka's synchronous replication should in theory perform worse than RocketMQ's, because Kafka organizes data by partition. A single Kafka instance may hold hundreds of data partitions, whereas a RocketMQ instance has only one, so RocketMQ can make full use of the IO group commit mechanism and transfer data in batches. With synchronous replication configured, RocketMQ's performance loss compared with asynchronous replication is about 20% to 30%. I have not tested Kafka myself, but in theory I expect it to perform worse than RocketMQ here.

Performance comparison

  • Kafka: single-machine write TPS is about 1 million per second, with a message size of 10 bytes
  • RocketMQ: single-machine write TPS is about 70,000 per second for a single instance; with 3 Brokers deployed on one machine it can reach up to 120,000 per second, with a message size of 10 bytes

Summary: Kafka reaches a million TPS on a single machine mainly because the Producer client merges multiple small messages and sends them to the Broker in batches.
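To make the batching effect concrete, here is a minimal Python sketch of client-side merging. It is not Kafka's actual producer code; the function name `batch_messages` and the `max_batch_bytes` limit are assumptions chosen for illustration.

```python
from typing import List


def batch_messages(messages: List[bytes], max_batch_bytes: int) -> List[List[bytes]]:
    """Group small messages into batches of at most max_batch_bytes (hypothetical limit)."""
    batches: List[List[bytes]] = []
    current: List[bytes] = []
    size = 0
    for msg in messages:
        # Close the current batch when adding this message would exceed the limit.
        if current and size + len(msg) > max_batch_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(msg)
        size += len(msg)
    if current:
        batches.append(current)
    return batches


msgs = [b"x" * 10 for _ in range(1000)]  # 1,000 messages of 10 bytes each
batches = batch_messages(msgs, max_batch_bytes=1000)
print(len(msgs), "messages ->", len(batches), "network sends")
```

With 10-byte messages and a 1,000-byte batch, one network round trip carries 100 messages, which is why per-message TPS numbers look so high when batching is on.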

Why doesn't RocketMQ do this?

  1. Producers are usually written in Java, and caching too many messages causes serious GC problems.
  2. If the Producer calls the send interface and success is returned to the business before the message has actually reached the Broker, a Producer crash at that moment loses the message and causes business errors.
  3. Producers are usually distributed systems, and each machine sends from multiple threads. We believe the amount of data a single Producer generates per second in an online system is limited and cannot reach tens of thousands of messages.
  4. Caching (batching) can be implemented by the upper-layer business code.

Number of queues supported per machine

  • Kafka: once a single machine exceeds 64 queues/partitions, load rises sharply. The more queues, the higher the load and the longer the response time for sending messages. Kafka cannot have too many partitions
  • RocketMQ: a single machine supports up to 50,000 queues without a significant change in load

What are the advantages of having more queues?

  1. A single machine can host more topics, because each topic consists of a group of queues
  2. The size of the consumer cluster is proportional to the number of queues: the more queues, the larger the consumer cluster can be
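The second point can be sketched as follows. This is a simplified model, assuming each queue is owned by exactly one consumer and queues are spread as evenly as possible; the `assign_queues` helper is hypothetical and is not taken from either system.

```python
def assign_queues(num_queues, consumers):
    """Spread queue ids over consumers in contiguous, near-equal ranges (assumed strategy)."""
    assignment = {}
    base, extra = divmod(num_queues, len(consumers))
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional queue.
        count = base + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + count))
        start += count
    return assignment


few_queues = assign_queues(4, ["c1", "c2", "c3", "c4", "c5", "c6"])
idle = [c for c, queues in few_queues.items() if not queues]
print("idle consumers with 4 queues:", idle)

many_queues = assign_queues(12, ["c1", "c2", "c3", "c4", "c5", "c6"])
print("queues per consumer with 12 queues:", {c: len(q) for c, q in many_queues.items()})
```

Under this model, consumers beyond the queue count receive nothing to do, so the queue count caps how far the consumer cluster can usefully grow.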

Real-time message delivery

  • Kafka uses short polling, so real-time performance depends on the polling interval; versions after 0.8 support long polling.
  • RocketMQ uses long polling, which matches Push mode in real-time performance; message delivery latency is usually a few milliseconds.

Retry on consumption failure

  • Kafka does not support retry on consumption failure.
  • RocketMQ supports scheduled retry on consumption failure, with the interval growing on each retry

Summary: Consider a recharge application that calls the carrier's gateway and the recharge fails, perhaps because the other side is under too much load. Calling again a little later will succeed. Alipay has a similar need for bank deductions.

The retry here must be reliable: a message waiting to be retried must not be lost because the Consumer goes down.
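A minimal sketch of retrying with growing intervals, simulated in Python without real sleeping. The schedule in `RETRY_DELAYS_SECONDS` and the `deliver` helper are hypothetical and only illustrate the idea; they are not RocketMQ's actual values or API.

```python
# Hypothetical retry schedule in seconds; each retry waits longer than the previous one.
RETRY_DELAYS_SECONDS = [10, 30, 60, 120, 300]


def next_retry_delay(attempt):
    """Return the delay before retry number `attempt` (1-based), or None when exhausted."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if attempt > len(RETRY_DELAYS_SECONDS):
        return None
    return RETRY_DELAYS_SECONDS[attempt - 1]


def deliver(handler, message):
    """Call handler until it succeeds or retries run out; return (succeeded, delays_waited)."""
    waited = []
    failures = 0
    while True:
        if handler(message, failures):
            return True, waited
        failures += 1
        delay = next_retry_delay(failures)
        if delay is None:
            return False, waited
        # A reliable broker would persist the message here and redeliver it after `delay`,
        # so a consumer crash in the meantime does not lose it.
        waited.append(delay)


def flaky_gateway(message, failures_so_far):
    """Stand-in for an overloaded gateway that only succeeds on the third call."""
    return failures_so_far >= 2


ok, waited = deliver(flaky_gateway, "recharge:order-42")
print(ok, waited)
```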

Strict message ordering

  • Kafka supports message ordering, but when a Broker goes down, messages can arrive out of order
  • RocketMQ supports strict message ordering. In an ordered-message scenario, sending fails after a Broker goes down, but messages never go out of order

MySQL binary log (binlog) distribution requires strict message ordering.
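One common way to keep related messages in order is to route every message with the same ordering key to the same queue. The sketch below illustrates that idea only; the article does not describe how RocketMQ routes ordered messages, so the CRC32-modulo rule and the `select_queue` name are assumptions.

```python
import zlib
from collections import defaultdict


def select_queue(ordering_key, num_queues):
    """Map an ordering key to a queue index with a stable hash (assumed routing rule)."""
    return zlib.crc32(ordering_key.encode("utf-8")) % num_queues


def route(events, num_queues):
    """Append each (key, payload) event to the queue chosen for its key."""
    queues = defaultdict(list)
    for key, payload in events:
        queues[select_queue(key, num_queues)].append((key, payload))
    return queues


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
queues = route(events, num_queues=4)

# All events for one order sit in a single queue, in the order they were sent.
order_1_queue = queues[select_queue("order-1", 4)]
print([payload for key, payload in order_1_queue if key == "order-1"])
```

Because a key maps to exactly one queue, the stricter choice when that queue's Broker is down is to fail the send rather than fall back to another queue, which matches the trade-off described above.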

Scheduled messages

  • Kafka does not support scheduled messages
  • RocketMQ supports two types of scheduled messages
    • The open-source version of RocketMQ supports only fixed delay levels, which users can customize
    • Alibaba Cloud MQ lets you specify a delay with millisecond precision
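The difference between fixed delay levels and an arbitrary delay can be sketched as follows. The level string `"1s 5s 10s 30s 1m 2m"` and both helper functions are assumptions made for illustration, not a statement of the real configuration.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_delay_levels(spec):
    """Parse a space-separated level string such as '1s 5s 1m' into {level: seconds}."""
    levels = {}
    for index, token in enumerate(spec.split(), start=1):
        levels[index] = int(token[:-1]) * UNIT_SECONDS[token[-1]]
    return levels


def pick_level(levels, wanted_seconds):
    """Return the lowest level whose delay is at least wanted_seconds, or None."""
    for level in sorted(levels):
        if levels[level] >= wanted_seconds:
            return level
    return None


levels = parse_delay_levels("1s 5s 10s 30s 1m 2m")  # hypothetical level table
level = pick_level(levels, 45)
print("a 45 s delay rounds up to level", level, "=", levels[level], "s")
```

With level-based scheduling, a requested delay can only be approximated by the nearest configured level, whereas a millisecond-precision delay needs no such rounding.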

Distributed transactional messages

  • Kafka does not support distributed transactional messages
  • Alibaba Cloud MQ supports distributed transactional messages, and there are plans for a future open-source version of RocketMQ to support them as well

Message query

  • Kafka does not support message queries
  • RocketMQ supports querying messages by Message ID, and also by message content (specify a Message Key when sending, which can be any string, such as an order number)

Summary: Message query is very helpful when diagnosing message loss. For example, if an order failed to process, was the message never received, or was it received and then processed incorrectly?
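A toy in-memory index shows the two lookup paths. The `MessageIndex` class is hypothetical and stands in for whatever index the broker actually maintains.

```python
from collections import defaultdict


class MessageIndex:
    """Toy store that supports lookup by message id and by business key."""

    def __init__(self):
        self._by_id = {}
        self._ids_by_key = defaultdict(list)

    def store(self, msg_id, key, body):
        self._by_id[msg_id] = {"id": msg_id, "key": key, "body": body}
        self._ids_by_key[key].append(msg_id)

    def find_by_id(self, msg_id):
        return self._by_id.get(msg_id)

    def find_by_key(self, key):
        # .get avoids creating empty entries for keys that were never stored.
        return [self._by_id[i] for i in self._ids_by_key.get(key, [])]


index = MessageIndex()
index.store("m-001", "order-1001", "created")
index.store("m-002", "order-1002", "created")
index.store("m-003", "order-1001", "paid")

# Did the broker ever receive anything for order-1001?
for message in index.find_by_key("order-1001"):
    print(message["id"], message["body"])
```

If the key lookup returns nothing, the message never arrived; if it returns the message, the fault lies in the consumer's processing.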

Message replay

  • Kafka can in theory replay messages by offset
  • RocketMQ supports replaying messages by time with millisecond precision, for example re-consuming from a specific hour, minute, and second of the previous day

Summary: A typical business scenario is a consumer doing order analysis. A bug in program logic or a failure in a dependent system invalidates everything consumed today, so consumption has to restart from 0:00 yesterday. Time-based message replay is therefore very helpful for the business.
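Time-based replay boils down to translating a timestamp into a position in the log. A minimal sketch, assuming messages are stored in arrival order with non-decreasing millisecond timestamps; `offset_for_time` is a hypothetical helper, not either system's API.

```python
import bisect


def offset_for_time(timestamps_ms, target_ms):
    """Return the index of the first message stored at or after target_ms."""
    return bisect.bisect_left(timestamps_ms, target_ms)


# Store timestamps (ms) of five messages, in log order.
timestamps_ms = [1000, 2000, 2000, 3500, 5000]

start = offset_for_time(timestamps_ms, 2500)
print("replay from offset", start, "->", timestamps_ms[start:])
```

An offset-only system leaves this translation to the user, who has to know which offset corresponds to "0:00 yesterday" before replaying.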

Consumption parallelism

 

  • Kafka's consumption parallelism depends on the number of partitions configured for the Topic. With 10 partitions, at most 10 machines can consume in parallel (each running a single thread), or one machine can consume with 10 threads in parallel. In other words, consumption parallelism equals the number of partitions.
  • RocketMQ has two kinds of consumption parallelism
    • In ordered consumption mode, parallelism is exactly the same as Kafka's
    • In unordered consumption mode, parallelism depends on the number of Consumer threads. For example, if a Topic is configured with 10 queues and consumed by 10 machines with 100 threads each, the parallelism is 1,000.
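The arithmetic behind these numbers, as a small Python sketch. Both functions are illustrative models: the cap of one active machine per queue in the unordered case is an assumption based on the queue-to-consumer relationship described earlier.

```python
def ordered_parallelism(queues, machines, threads_per_machine):
    """Ordered mode: at most one thread per queue/partition can consume at a time."""
    return min(queues, machines * threads_per_machine)


def unordered_parallelism(queues, machines, threads_per_machine):
    """Unordered mode: every thread on a machine that owns a queue can work in parallel."""
    active_machines = min(machines, queues)  # assumption: a machine without a queue idles
    return active_machines * threads_per_machine


print("ordered:  ", ordered_parallelism(10, 10, 100))
print("unordered:", unordered_parallelism(10, 10, 100))
```

For the example in the text (10 queues, 10 machines, 100 threads each), the ordered figure is bounded by the 10 queues, while the unordered figure is 10 × 100 = 1,000.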

Message trace

  • Kafka does not support message tracing
  • Alibaba Cloud MQ supports message tracing

Development language friendliness

  • Kafka is written in Scala
  • RocketMQ is written in Java

Broker-side message filtering

  • Kafka does not support Broker-side message filtering
  • RocketMQ supports two kinds of Broker-side message filtering
    • Filtering by Message Tag, which is equivalent to the concept of a sub-topic
    • Uploading a piece of Java code to the server, which can filter messages in any way and can even filter or split on the Message body.
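A minimal sketch of tag-based filtering performed on the broker before delivery. The message layout, the `broker_filter` helper, and the `"TagA || TagB"` expression form used here are assumptions for illustration.

```python
def parse_subscription(expression):
    """Turn 'TagA || TagB' into a set of tags; '*' means no filtering (returns None)."""
    if expression.strip() == "*":
        return None
    return {tag.strip() for tag in expression.split("||") if tag.strip()}


def broker_filter(messages, expression):
    """Keep only messages whose tag matches the subscription, before they leave the broker."""
    wanted = parse_subscription(expression)
    return [m for m in messages if wanted is None or m["tag"] in wanted]


topic_messages = [
    {"tag": "created", "body": "order-1"},
    {"tag": "paid", "body": "order-1"},
    {"tag": "cancelled", "body": "order-2"},
    {"tag": "paid", "body": "order-3"},
]

delivered = broker_filter(topic_messages, "paid || cancelled")
print([m["body"] for m in delivered])
```

Filtering on the broker means unmatched messages never cross the network, which is the practical benefit over discarding them in the consumer.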

Message accumulation capacity

In theory Kafka can accumulate more messages than RocketMQ, but a single RocketMQ machine can also support billion-level message accumulation, which we believe is fully sufficient for business needs.

Open-source community activity

  • The Kafka community is slow to update
  • RocketMQ's GitHub community has 250 individual and company users who have registered their contact details, and its QQ group has more than 1,000 members.

Commercial support

  • Kafka's original development team has founded a new company, but no related product is visible yet
  • RocketMQ has been commercialized on Alibaba Cloud. The cloud service is currently available for commercial use with a promised 99.99% reliability, and it removes the operational complexity of users building and running their own MQ

Maturity

  • Kafka is more mature in the logging domain
  • RocketMQ is used extensively within Alibaba Group and handles a massive number of messages every day. It has successfully withstood the message volumes of several Tmall Double 11 events and is a powerful tool for peak shaving and valley filling.

Copyright notice
This article was written by [mskk]. Please include a link to the original when reposting. Thank you.
