I have a global error handler in a Spring Boot 3 application built around Spring Cloud Gateway:
@Component
@Order(-99)
public class GlobalErrorWebExceptionHandler implements WebExceptionHandler {
    public static final String ERR_TAG_SOURCE = "source";
    public static final String ERR_TAG_CODE = "code";
    public static final String ERR_TAG_MESSAGE = "message";
    public static final String ERR_SOURCE_INTERNAL = "internal";
    public static final String ERR_SOURCE_UPSTREAM = "upstream";
    protected static final String LOG_MARKER = "PROXY";
    protected static final Logger LOGGER = LoggerFactory.getLogger(GlobalErrorWebExceptionHandler.class);

    public Mono<Void> handle(ServerWebExchange exchange, Throwable ex) {
        LOGGER.error(MarkerFactory.getMarker(LOG_MARKER), "Error ({}) encountered while filtering request {}.",
                ex.getClass().getCanonicalName(), exchange.getRequest().getId(), ex);
        HttpStatus status = ex instanceof ErrorResponseException
                ? (HttpStatus) ((ErrorResponseException) ex).getStatusCode()
                : HttpStatus.INTERNAL_SERVER_ERROR;
        String source = ex instanceof ErrorResponseException ? ERR_SOURCE_UPSTREAM : ERR_SOURCE_INTERNAL;
        String errorMessage = ex.getMessage();
        exchange.getResponse().setStatusCode(status);
        return exchange.getResponse().writeWith(
                Flux.just(
                        exchange.getResponse().bufferFactory().wrap(
                                ("{\"" + ERR_TAG_MESSAGE + "\": \"" + errorMessage
                                        + "\", \"" + ERR_TAG_CODE + "\": " + status.value()
                                        + ", \"" + ERR_TAG_SOURCE + "\": \"" + source + "\"}")
                                        .getBytes())));
    }
}
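One of the services I proxy occasionally returns a 504 Gateway Timeout. I can see that the global error handler is invoked, because I see the log message it writes. The stack trace looks like this: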
2023-06-26 21:19:03,435 ERROR --- pod-86f446c7dd-fn47g parallel-7 com.applicationGlobalErrorWebExceptionHandler: app-name, , , , GATEWAY, - Error (org.springframework.web.server.ResponseStatusException) encountered while filtering request f5eb8577-292.
org.springframework.web.server.ResponseStatusException: 504 GATEWAY_TIMEOUT "504 GATEWAY_TIMEOUT "Response took longer than timeout: PT5M""
at com.application.proxy.SomeHandler.lambda$handle$3(SomeHandler.java:66)
Suppressed: [CIRCULAR REFERENCE: org.springframework.web.server.ResponseStatusException: 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT5M"]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ org.springframework.cloud.gateway.filter.WeightCalculatorWebFilter [DefaultWebFilterChain]
*__checkpoint ⇢ org.springframework.web.filter.reactive.ServerHttpObservationFilter [DefaultWebFilterChain]
*__checkpoint ⇢ HTTP POST "/api21/registry/2.1.0/registry/getDataPools" [ExceptionHandlingWebHandler]
Original Stack Trace:
at com.application.proxy.SomeHandler.lambda$handle$3(SomeHandler.java:66)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onError(MonoIgnoreThen.java:278)
at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onError(MonoPeekTerminal.java:258)
at reactor.core.publisher.FluxPeekFuseable$PeekConditionalSubscriber.onError(FluxPeekFuseable.java:903)
at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onError(MonoIgnoreThen.java:278)
at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onError(MonoPeekTerminal.java:258)
at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onError(MonoPeekTerminal.java:258)
at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onError(MonoIgnoreThen.java:278)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:106)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4444)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:103)
at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)
at reactor.core.publisher.FluxTimeout$TimeoutOtherSubscriber.onError(FluxTimeout.java:341)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4444)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:301)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:280)
at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:419)
at reactor.core.publisher.FluxOnErrorReturn$ReturnSubscriber.onNext(FluxOnErrorReturn.java:162)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:271)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:286)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.springframework.web.server.ResponseStatusException: 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT5M"
at org.springframework.cloud.gateway.filter.NettyRoutingFilter.lambda$filter$5(NettyRoutingFilter.java:195)
at reactor.core.publisher.Flux.lambda$onErrorMap$28(Flux.java:7123)
at reactor.core.publisher.Flux.lambda$onErrorResume$29(Flux.java:7176)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
at reactor.core.publisher.SerializedSubscriber.onError(SerializedSubscriber.java:124)
at reactor.core.publisher.FluxTimeout$TimeoutOtherSubscriber.onError(FluxTimeout.java:341)
at reactor.core.publisher.Operators.error(Operators.java:198)
at reactor.core.publisher.MonoError.subscribe(MonoError.java:53)
at reactor.core.publisher.Mono.subscribe(Mono.java:4444)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.handleTimeout(FluxTimeout.java:301)
at reactor.core.publisher.FluxTimeout$TimeoutMainSubscriber.doTimeout(FluxTimeout.java:280)
at reactor.core.publisher.FluxTimeout$TimeoutTimeoutSubscriber.onNext(FluxTimeout.java:419)
at reactor.core.publisher.FluxOnErrorReturn$ReturnSubscriber.onNext(FluxOnErrorReturn.java:162)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:271)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:286)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT5M
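The log level for the whole application is set to DEBUG. While the request that produced the initial 504 was being handled, nothing else was logged about errors or about any other operation that transforms the response. However, the response status the client receives is 500, not the 504 I set in the error handler. Why?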
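A side note on the handler in the question: it builds its JSON body by string concatenation, and the logged exception message itself contains double quotes (`504 GATEWAY_TIMEOUT "Response took longer than timeout: PT5M"`), so the body written for this error would not be valid JSON. This is not offered as the cause of the 500. The following is only a minimal, JDK-only sketch of escaping the message before embedding it; the class `ErrorBody` and its methods are hypothetical names, not part of the original code, and a real application would more likely delegate this to a JSON library.

```java
final class ErrorBody {
    /** Escapes characters that may not appear raw inside a JSON string literal. */
    public static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the same three-field body as the handler, with escaped strings. */
    public static String toJson(String message, int code, String source) {
        return "{\"message\": \"" + escape(message)
                + "\", \"code\": " + code
                + ", \"source\": \"" + escape(source) + "\"}";
    }

    public static void main(String[] args) {
        // The message text is taken from the stack trace in the question.
        System.out.println(toJson(
                "504 GATEWAY_TIMEOUT \"Response took longer than timeout: PT5M\"",
                504, "upstream"));
    }
}
```

In the handler, the result of `toJson(...)` would take the place of the concatenated string before `getBytes()` is called.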
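I ran into the same problem in one of my applications. We have a filter that checks
exchange.getResponse().getStatusCode().is5xxServerError()
This successfully catches 500 and 503 errors (and returns them as 502 in a controlled way), but the same filter misses 504 errors, which bubble up as a ResponseStatusException instead. As you said, these show up as 500 on the client, whereas we want to return 502 to signal that we did everything we were supposed to do.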
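The distinction described here, a status that arrives on the response versus a failure that arrives as an exception, can be illustrated without the framework. The sketch below is only an analogy using the JDK's `CompletableFuture`, not Spring Cloud Gateway code: `UpstreamOutcome`, `StatusException`, and `clientStatus` are hypothetical names, and the mapping to 502 follows the behavior described above. A check that looks only at the returned status never sees the second case.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class UpstreamOutcome {
    static final int BAD_GATEWAY = 502;

    /** Hypothetical stand-in for an exception that carries an HTTP status. */
    static final class StatusException extends RuntimeException {
        final int status;

        StatusException(int status, String reason) {
            super(status + " " + reason);
            this.status = status;
        }
    }

    static boolean is5xx(int status) {
        return status >= 500 && status <= 599;
    }

    /** Maps both outcomes of an upstream call to the status sent to the client. */
    public static CompletableFuture<Integer> clientStatus(CompletableFuture<Integer> upstream) {
        return upstream.handle((status, error) -> {
            if (error != null) {
                // Error path: no status was ever returned, only an exception.
                Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                        ? error.getCause()
                        : error;
                if (cause instanceof StatusException && is5xx(((StatusException) cause).status)) {
                    return BAD_GATEWAY;
                }
                return 500;
            }
            // Status path: the only case a response-status check can see.
            return is5xx(status) ? BAD_GATEWAY : status;
        });
    }

    public static void main(String[] args) {
        // Upstream answered with 503: prints 502
        System.out.println(clientStatus(CompletableFuture.completedFuture(503)).join());
        // Upstream call failed with a 504-carrying exception: prints 502
        System.out.println(clientStatus(
                CompletableFuture.failedFuture(new StatusException(504, "GATEWAY_TIMEOUT"))).join());
    }
}
```

In a reactive filter, the error path would typically be handled with an operator such as `onErrorResume` on the `Mono` returned by the filter chain, rather than with `handle` as in this analogy.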
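One solution is to catch all exceptions globally and watch for these 504s at that level. This guide shows an example (though I have not verified it, because of problems I had building with this code):
Did you find a solution?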