Why Kafka does not support read and write separation

Java源码网 ⋅ 1 month ago ⋅ 88 views


In Kafka, producers writing messages and consumers reading messages both interact with the leader replica, which gives a production and consumption model in which the master handles both writes and reads (master-write, master-read). Databases, Redis, and similar systems support master-write, master-read, and they also support master-write, slave-read. Master-write, slave-read is what is usually called read/write separation; the name is used here to parallel master-write, master-read. Kafka does not support master-write, slave-read. Why is this?

At the code level, Kafka could fully support this feature, at the cost of extra code complexity. The question is better analyzed in terms of what would be gained. Master-write, slave-read lets the slave nodes share the load of the master node, so that the master is not overloaded while the slaves sit idle. However, master-write, slave-read also has two very obvious drawbacks:

  • (1) Data consistency. Replication from the master node to the slave node always involves some delay, and during this window the data on the two nodes can differ. Suppose that at some moment the value of item A is X on both the master and the slave, and A is then changed to Y on the master. If an application reads A from the slave before the change has reached it, it gets X rather than the latest value Y, which is a data inconsistency.
  • (2) Latency. In a component such as Redis, getting data from a write on the master to the slave passes through several stages: network → master node memory → network → slave node memory, and the whole process takes time. In Kafka, master-slave synchronization takes longer than in Redis, because it passes through network → master node memory → master node disk → network → slave node memory → slave node disk. For latency-sensitive applications, master-write, slave-read is not suitable.
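The consistency window in point (1) can be illustrated with a toy simulation. This is a minimal sketch, not Kafka or Redis code: `ReplicatedStore`, `lag_ticks`, and the tick-based clock are hypothetical names introduced only for illustration, and replication is assumed to take a fixed number of ticks.

```python
class ReplicatedStore:
    """Toy master/slave key-value store with a fixed replication lag.

    Hypothetical model for illustration only; it is not a Kafka or Redis API.
    """

    def __init__(self, lag_ticks):
        self.master = {}
        self.slave = {}
        self.lag_ticks = lag_ticks  # assumed fixed replication delay
        self.now = 0
        self.pending = []  # (apply_at_tick, key, value) changes in flight

    def write(self, key, value):
        # Writes always go to the master; the slave sees them later.
        self.master[key] = value
        self.pending.append((self.now + self.lag_ticks, key, value))

    def tick(self):
        # Advance time by one tick and apply changes that have "arrived".
        self.now += 1
        still_pending = []
        for apply_at, key, value in self.pending:
            if apply_at <= self.now:
                self.slave[key] = value
            else:
                still_pending.append((apply_at, key, value))
        self.pending = still_pending

    def read_from_slave(self, key):
        return self.slave.get(key)


def demo_stale_read():
    store = ReplicatedStore(lag_ticks=2)
    store.write("A", "X")
    store.tick()
    store.tick()
    before = store.read_from_slave("A")  # master and slave both hold X

    store.write("A", "Y")                # master now holds Y
    stale = store.read_from_slave("A")   # slave has not received Y yet

    store.tick()
    store.tick()
    fresh = store.read_from_slave("A")   # replication has caught up
    return before, stale, fresh


print(demo_stale_read())
```

The middle read returns the old value even though the master already holds the new one; the longer the replication path, the wider this window becomes.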

In practice, many applications can tolerate some delay and can accept data being inconsistent for a period of time. Should Kafka support master-write, slave-read for those cases?

Master-write, slave-read can share some of the load, but it cannot achieve complete load balancing. For example, when write pressure is high and read pressure is low, the slave nodes take on only a small part of the load, and most of the pressure remains on the master node. Kafka, by contrast, can achieve a high degree of load balancing, and it does so on top of the master-write, master-read architecture. Let's take a look at Kafka's production and consumption model, as shown in the following figure.


The Kafka cluster in the figure has 3 partitions, each with 3 replicas, distributed evenly across 3 brokers. Gray shading marks a leader replica, no shading marks a follower replica, and a dashed line indicates a follower replica pulling messages from the leader replica. Producers write messages to the leader replica; in Figure 8-23, every broker has messages flowing in from producers. Consumers also read messages from the leader replica; in Figure 8-23, every broker has messages flowing out to consumers.
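The layout described above, and the contrast with the write-heavy master/slave case, can be sketched with a little arithmetic. This is a simplified model under stated assumptions: `assign_replicas` is a plain round-robin placement written for illustration (not Kafka's actual assignment algorithm), the traffic figures are made up, every partition is assumed to receive the same traffic, and replication traffic between nodes is ignored.

```python
def assign_replicas(num_partitions, num_brokers, replication_factor):
    """Round-robin placement; the first replica in each list acts as leader.

    Illustrative only; Kafka's real assignment algorithm is more involved.
    """
    return {
        p: [(p + i) % num_brokers for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def kafka_broker_load(assignment, writes_per_partition, reads_per_partition,
                      num_brokers):
    """Client requests per broker when producers and consumers use only leaders."""
    load = {b: 0 for b in range(num_brokers)}
    for replicas in assignment.values():
        leader = replicas[0]
        load[leader] += writes_per_partition + reads_per_partition
    return load


def master_slave_load(total_writes, total_reads, num_slaves):
    """Client requests per node when all writes hit one master and reads
    are spread evenly over the slaves."""
    return {"master": total_writes, "each_slave": total_reads / num_slaves}


# 3 partitions, 3 replicas each, 3 brokers, as in the figure.
assignment = assign_replicas(3, 3, 3)

# Assumed write-heavy workload: 900 writes and 300 reads in total.
kafka_load = kafka_broker_load(assignment, writes_per_partition=300,
                               reads_per_partition=100, num_brokers=3)
ms_load = master_slave_load(total_writes=900, total_reads=300, num_slaves=2)

print(assignment)  # each broker leads exactly one partition
print(kafka_load)  # every broker carries the same client load
print(ms_load)     # the master carries far more than each slave
```

With the same total traffic, each broker in the partitioned layout serves 400 requests, while in the master/slave layout the master serves 900 and each slave only 150.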

Clearly, the read and write load on every broker is the same, which means that Kafka, using master-write, master-read, achieves a load balance that master-write, slave-read cannot. The figure shows an ideal deployment, however. Several situations (including but not limited to the following) can cause some degree of load imbalance:

  • (1) Partitions are assigned unevenly across brokers. When a topic is created, some brokers may be assigned more partitions and others fewer, so the leader replicas are naturally distributed unevenly as well.
  • (2) Producers write messages unevenly. Producers may send a large number of writes to the leader replicas on some brokers while ignoring the leader replicas on others.
  • (3) Consumers consume messages unevenly. Consumers may pull heavily from the leader replicas on some brokers while ignoring the leader replicas on others.
  • (4) Leader replica switches are uneven. In real deployments, leadership may move from one replica to another because a broker goes down, or partition replicas may be reassigned. Either can leave the leader replicas unevenly distributed across brokers.

Some preventive measures are available. For the first case, partitions should be assigned as evenly as possible when the topic is created. Fortunately, Kafka's own assignment algorithm pursues the same goal; if the assignment is defined by the developer, this needs attention. The second and third cases cannot be solved by master-write, slave-read either. For the fourth case, Kafka provides preferred replica election to rebalance the leader replicas, and this can be combined with monitoring, alerting, and operations platforms to keep the cluster balanced.
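The idea behind preferred replica election can be sketched as follows. This is a simplified illustration, not Kafka's implementation: it assumes the preferred replica is the first entry in each partition's replica list and that a live preferred replica can always take over (real Kafka also requires it to be in sync), and `elect_preferred` and `leader_counts` are hypothetical helper names.

```python
def leader_counts(leaders, num_brokers):
    """Number of partitions each broker currently leads."""
    counts = {b: 0 for b in range(num_brokers)}
    for leader in leaders.values():
        counts[leader] += 1
    return counts


def elect_preferred(assignment, leaders, live_brokers):
    """Return leadership to each partition's preferred replica when it is alive.

    Simplification: a real election also checks that the preferred replica
    is in sync with the current leader.
    """
    new_leaders = dict(leaders)
    for partition, replicas in assignment.items():
        preferred = replicas[0]
        if preferred in live_brokers:
            new_leaders[partition] = preferred
    return new_leaders


# Replica lists for 3 partitions on 3 brokers; the first entry is preferred.
assignment = {0: [0, 1, 2], 1: [1, 2, 0], 2: [2, 0, 1]}

# Assumed history: broker 0 went down, partition 0 failed over to broker 1,
# and broker 0 has since come back as a follower.
leaders_after_failover = {0: 1, 1: 1, 2: 2}
rebalanced = elect_preferred(assignment, leaders_after_failover,
                             live_brokers={0, 1, 2})

print(leader_counts(leaders_after_failover, 3))  # broker 1 leads two partitions
print(leader_counts(rebalanced, 3))              # one leader per broker again
```

After the failover, broker 1 leads two partitions and broker 0 leads none; the election moves partition 0 back to broker 0 and restores one leader per broker.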

In practice, together with an ecosystem of monitoring, alerting, and operations platforms, Kafka can achieve a high degree of load balancing in most cases. Overall, supporting only master-write, master-read has several advantages for Kafka: it simplifies the code logic and reduces the chance of errors; it spreads load at a fine granularity, so that compared with master-write, slave-read the load balance is better and is also under the user's control; reads are not affected by replication delay; and as long as the replicas are stable, data inconsistency does not occur. Given this, why should Kafka implement master-write, slave-read, a feature that brings it little benefit? All of this follows from Kafka's excellent architectural design. In a sense, master-write, slave-read is an expedient that exists to compensate for design flaws.
