1. Background
Many teams building a microservice gateway on Spring Cloud Gateway (SCG) fight their way through one obstacle after another. Reactor did not stop us, distributed tracing did not stop us, and then, after going live, a batch of strange problems appeared that seemed to have no explanation. Anyone who has used SCG in depth will recognize this stack trace:
reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
    Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
A similar one:
Connection prematurely closed DURING response
and so on.
Searching Baidu turns up few solutions. Those who can reach Google find a handful more, adjust a few parameters as the official advice suggests, cannot tell whether it helped, and eventually let the matter drop.
The official SCG issue tracker has many more reports. (A side note: when you hit a problem, search the existing issues first and try not to file duplicates. It is simply good manners.)
There I came across a comment indicating that the root cause is not in SCG. In my experience Spencer Gibb is straightforward: he says what he knows and admits what he does not, in the best Confucian spirit. He called in @violetagg from Reactor-Netty, which told us where to look, and at https://github.com/reactor/reactor-netty/issues I found the answer. I was in the same position as everyone else: I did not know how Reactor-Netty ends up producing this error. A global exception handler can catch the exception and return a uniform "third-party request failed" result to the caller, so it is not painful. But I will not ship a program with such an unexplained problem! Reading the Reactor-Netty issues carefully reveals several key points. Besides @violetagg explaining how to switch to debug mode and how to find the channel id, there are a few conclusive statements, which I quote here. Let's take a closer look:
First quote:
I would recommend to configure maxIdleTime on the client side having in mind the keepAliveTimeout on Tomcat. Without such configuration Reactor Netty can receive the close event at any time between acquiring the connection from the pool and before actual sending of the request. Also you might want to switch to LIFO leasing strategy so that you will use always the most recently used connection.
Second quote:
The connection is closed by Spring Framework WebClient’s new change for disposing the connection when a cancellation happens
Third quote:
the connection was closed while still sending the request body
The essence of the problem is that a normal request is suddenly closed in some cases. Like everyone else, I was left with more questions: which cases exactly? Why was it closed? The problem does not occur often, so how can it be reproduced reliably?
Across the many issues you will also notice that this exception is closely tied to the HttpClient inside Reactor-Netty.
The official SCG documentation says that the connect timeout and read timeout for calls to downstream services are set through properties of the org.springframework.cloud.gateway.config.HttpClientProperties class. Digging further, HttpClientProperties exists to make configuration easier: it is a facade used when initializing reactor.netty.http.client.HttpClient. This configuration class has no direct relationship with the HttpClient you may already know. It only mirrors mechanisms that such a client is expected to have, such as a connection pool (anyone who has used HttpClient has surely tuned connection or thread pool parameters while chasing a production incident). The pool property inside HttpClientProperties sets the connection pool options.
At this point you only need to know that Reactor-Netty, which sits underneath SCG, creates a connection pool so that a later request can take an existing connection from the pool instead of creating a new one. This is in fact the root cause of the problem, as the sequence diagram below shows:
Here an embedded Tomcat in Spring Boot acts as the service provider. The user goes through SCG, and SCG proxies the request.
By default, connections created inside SCG are never reclaimed and stay in memory. The embedded Tomcat in Spring Boot behaves differently: by default it closes a connection after 20 s without any data exchange. If another request arrives just as the connection is being closed, and SCG happens to lease that very connection to call Tomcat, this exception occurs.
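The race described above can be modelled with a few lines of plain Java. This is a simplified sketch, not Reactor-Netty or Tomcat code: the class StaleConnectionDemo, the method lease, and the Outcome enum are hypothetical names made up for illustration, and the 20 s server timeout is simply the figure quoted in the text. It assumes the only things that matter are how long a pooled connection has been idle, whether the client pool evicts idle connections, and when the server closes them.

```java
import java.time.Duration;

public class StaleConnectionDemo {

    /** What happens when the gateway tries to reuse a pooled connection (hypothetical model). */
    public enum Outcome { REUSED_OK, EVICTED_BY_CLIENT, CLOSED_BY_SERVER }

    /**
     * @param idle            how long the pooled connection has been unused
     * @param clientMaxIdle   client-side idle limit, or null if the pool never evicts
     * @param serverKeepAlive idle time after which the backend closes the connection
     */
    public static Outcome lease(Duration idle, Duration clientMaxIdle, Duration serverKeepAlive) {
        if (clientMaxIdle != null && idle.compareTo(clientMaxIdle) >= 0) {
            // The pool drops the connection itself; a fresh one would be opened instead.
            return Outcome.EVICTED_BY_CLIENT;
        }
        if (idle.compareTo(serverKeepAlive) >= 0) {
            // The backend has already closed it: this is the "prematurely closed" case.
            return Outcome.CLOSED_BY_SERVER;
        }
        return Outcome.REUSED_OK;
    }

    public static void main(String[] args) {
        Duration serverKeepAlive = Duration.ofSeconds(20);
        // No client-side idle limit: a connection idle for 25 s is still handed out.
        System.out.println(lease(Duration.ofSeconds(25), null, serverKeepAlive));
        // Client-side limit of 10 s: the same connection is evicted before it can be reused.
        System.out.println(lease(Duration.ofSeconds(25), Duration.ofSeconds(10), serverKeepAlive));
        // A recently used connection is safe either way.
        System.out.println(lease(Duration.ofSeconds(5), Duration.ofSeconds(10), serverKeepAlive));
    }
}
```

The model ignores the real-world window in which the close event is still in flight, which is why the exception can remain possible, only much rarer, even with a client-side idle limit.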
So do not expect Reactor-Netty or SCG alone to solve this. The gateway and the backend services have to be configured together so that the exception is avoided as far as possible.
The first quote above gives us a solution:
Step 1. Add a JVM argument:
    -Dreactor.netty.pool.leasingStrategy=lifo
Step 2. Add to the SCG configuration:
    spring:
      cloud:
        gateway:
          httpclient:
            pool:
              maxIdleTime: 10000   # adjust as needed
Step 1 changes the connection leasing strategy from the default FIFO to LIFO. LIFO makes it most likely that the connection you get is the one used most recently. In other words, hot connections stay hot, and connections that go unused can be reclaimed, which is the same idea as LRU.
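The difference between the two leasing strategies can be illustrated with a toy pool. This is only a sketch under simplifying assumptions: requests are strictly sequential, each connection is returned to the tail of the idle queue after use, and the class LeasingStrategyDemo and the connection names are made up for illustration. It is not how Reactor-Netty implements its pool.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

public class LeasingStrategyDemo {

    /**
     * Runs `rounds` sequential acquire/release cycles against a pool of three
     * idle connections and returns the connections that were actually used.
     */
    public static Set<String> usedConnections(boolean lifo, int rounds) {
        Deque<String> idle = new ArrayDeque<>(Arrays.asList("conn-A", "conn-B", "conn-C"));
        Set<String> used = new LinkedHashSet<>();
        for (int i = 0; i < rounds; i++) {
            // LIFO takes the most recently released connection, FIFO the oldest one.
            String conn = lifo ? idle.pollLast() : idle.pollFirst();
            used.add(conn);
            idle.addLast(conn); // release: the connection goes back to the tail
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println("FIFO: " + usedConnections(false, 3)); // FIFO: [conn-A, conn-B, conn-C]
        System.out.println("LIFO: " + usedConnections(true, 3));  // LIFO: [conn-C]
    }
}
```

With FIFO every connection is touched in turn, so under light traffic each one may sit idle for a long time between uses. With LIFO the same connection keeps being reused while the others stay idle and become candidates for eviction.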
Step 2 sets how long an idle connection may sit in the pool before it is reclaimed, so that a request does not pick up an old connection that is then forcibly closed mid-request. The value only needs to be smaller than the connection timeout of your backend services (for Tomcat, the keepAliveTimeout mentioned in the first quote, which defaults to the connectionTimeout value). This ensures that SCG reclaims the connection before the backend does, which avoids the problem.
If the exception still occurs occasionally after these settings, check that the connection timeout of every backend service is larger than maxIdleTime, or try lowering maxIdleTime. Keep in mind that this is inherently a probabilistic problem. If your architecture resembles the example here, the exception should become very rare after this configuration, as it did for me. To eradicate it completely, go back to the sequence diagram and think the problem through again. If your architecture is different, you need to find out why your request was suddenly closed mid-flight. That may not be a Reactor-Netty problem at all, but a problem in your own services.
Version notes:
I was previously on SCG Greenwich.SR2, which corresponds to Spring Boot 2.1.6.RELEASE and Reactor-Netty v0.8.9.RELEASE. That version of Reactor-Netty does not provide the maxIdleTime option.
Reactor-Netty provides this setting starting with v0.9.5.RELEASE.
So use the configuration above with the following versions:
Spring Cloud: Hoxton.SR1 and above (SCG 2.2.1.RELEASE and above)
Reactor-Netty: v0.9.5.RELEASE and above
Spring Boot: 2.2.2.RELEASE and above
Note: maxIdleTime has a bug in v0.9.6.RELEASE and may not take effect. Upgrade to v0.9.7.RELEASE or later.
If you use Reactor-Netty on its own, the corresponding configuration methods can be found on reactor.netty.resources.ConnectionProvider.
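For standalone Reactor-Netty, a configuration along the following lines should give the same effect as the two steps above. Treat it as a sketch rather than a verified setup: it assumes a Reactor-Netty version that has the ConnectionProvider builder (v0.9.5.RELEASE or later, per the version notes), the pool name "gateway-pool" and the class name PooledClientConfig are made up, and the 10 s value only mirrors the SCG example.

```java
import java.time.Duration;

import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

public class PooledClientConfig {

    public static HttpClient create() {
        ConnectionProvider provider = ConnectionProvider.builder("gateway-pool") // hypothetical pool name
                // Evict connections idle for 10 s; keep this below the backend's keep-alive timeout.
                .maxIdleTime(Duration.ofSeconds(10))
                // Lease the most recently used connection first.
                .lifo()
                .build();
        return HttpClient.create(provider);
    }
}
```

Setting the strategy on the builder affects only this provider, whereas the JVM argument in Step 1 changes the default for the whole process.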
In addition, v0.9.10.RELEASE adds a retry mechanism for connections that are closed prematurely, which makes this exception very unlikely.