
Upgrading ServiceComb from 1.3 to 2.7.5: a problem encountered during the upgrade #3454

Open
edgar1170 opened this issue Nov 4, 2022 · 12 comments


edgar1170 commented Nov 4, 2022

A client interface that calls the same server endpoint in a loop is invoked many times (not very frequently). The wait-response time keeps growing until it times out after 30 s, and then circuit breaking gradually kicks in.
We then start seeing code=408;msg=CommonExceptionData [message=Request Timeout.]
followed by InvocationException: code=500;msg=CommonExceptionData [message=No available address found.]
Looking at the call chain, the time is spent in wait response.
Could this be caused by some parameter change in the upgrade from 1.3 to 2.7.5?
The client and server configuration is as follows:

servicecomb:
  accesslog:
    enabled: true
    pattern: "%t %p %m %H %U %s %b %D"
  server:
    http2:
      useAlpnEnabled: false
  rest:
    client:
      http2:
        enabled: false
      connection:
        maxPoolSize: 50
edgar1170 (Author)

In the old 1.3 version we did not have the configuration above; everything used the default values.

liubao68 (Contributor) commented Nov 4, 2022

Your problem does not look related to the configuration. You still need to analyze why the time is spent and in which stage. I suggest enabling metrics and, together with the server-side and client-side data, analyzing the possible stages and causes.
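
For reference, a minimal sketch of what enabling periodic metrics output might look like in microservice.yaml, assuming the metrics-core module is on the classpath; the key names below are from memory of the Java Chassis 2.x documentation and should be verified:

servicecomb:
  metrics:
    window_time: 60000          # statistics window in milliseconds (assumed default)
    publisher:
      defaultLog:
        enabled: true           # print a metrics summary to the application log once per window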

edgar1170 (Author)

OK. But when I looked up the call chain in our tracing system, the server-side business processing time was not long; the bulk of the time was spent waiting.

liubao68 (Contributor) commented Nov 4, 2022

When TPS rises, wait-response grows. Ordinary business-level timing only measures the execution time of a particular statement, which usually fluctuates very little. The expensive parts are queuing, scheduling, and so on.

fanjiwang1992 (Member)

Take a look at the server-side access.log and check how long the server actually spent processing the corresponding requests, i.e. the value logged for %D.

edgar1170 (Author)

> Take a look at the server-side access.log and check how long the server actually spent processing the corresponding requests, i.e. the value logged for %D.

The server-side processing time is clearly very short; the time is all spent on get connection.

edgar1170 (Author)

@liubao68 When running the test cases with a testing tool, at the beginning we see a large number of errors like the following:

2022-11-07 11:38:37.565 ERROR 2268 --- [http-nio-10.0.3.113-8080-Poller] .c.b.h.TomcatConfig$TomcatThreadExecutor : 156 tomcat Work queue full
2022-11-07 11:38:37.566 WARN 2268 --- [http-nio-10.0.3.113-8080-Poller] org.apache.tomcat.util.net.NioEndpoint : 175 Executor rejected socket [org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper@ceeaf51:org.apache.tomcat.util.net.NioChannel@1993f1bc:java.nio.channels.SocketChannel[connected local=/10.0.3.113:8080 remote=/10.0.3.1:25788]] for processing

java.util.concurrent.RejectedExecutionException: The executor's work queue is full
at org.apache.catalina.core.StandardThreadExecutor.execute(StandardThreadExecutor.java:178)
at org.apache.tomcat.util.net.AbstractEndpoint.processSocket(AbstractEndpoint.java:1169)
at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:856)
at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:823)
at java.lang.Thread.run(Thread.java:748)

2022-11-07 11:38:37.566 ERROR 2268 --- [http-nio-10.0.3.113-8080-Poller] .c.b.h.TomcatConfig$TomcatThreadExecutor : 156 tomcat Work queue full
2022-11-07 11:38:37.567 WARN 2268 --- [http-nio-10.0.3.113-8080-Poller] org.apache.tomcat.util.net.NioEndpoint : 175 Executor rejected socket [org.apache.tomcat.util.net.NioEndpoint$NioSocketWrapper@4d13e92a:org.apache.tomcat.util.net.NioChannel@43aa2179:java.nio.channels.SocketChannel[connected local=/10.0.3.113:8080 remote=/10.0.3.1:18134]] for processing

java.util.concurrent.RejectedExecutionException: The executor's work queue is full
at org.apache.catalina.core.StandardThreadExecutor.execute(StandardThreadExecutor.java:178)
at org.apache.tomcat.util.net.AbstractEndpoint.processSocket(AbstractEndpoint.java:1169)
at org.apache.tomcat.util.net.NioEndpoint$Poller.processKey(NioEndpoint.java:856)
at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:823)
at java.lang.Thread.run(Thread.java:748)

liubao68 (Contributor) commented Nov 8, 2022

This log is from B in an A -> B call where A times out, right? You could capture a thread dump and see what B is busy with.
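
For reference, a dump can be taken with jstack <pid> against B's process, or from inside the JVM with the standard management API; a minimal, generic sketch (plain JMX code, not part of ServiceComb):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDump {
    public static void main(String[] args) {
        // Dump all live threads of the current JVM, including locked monitors and synchronizers.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            // Note: ThreadInfo.toString() truncates deep stacks; jstack shows the full frames.
            System.out.print(info);
        }
    }
}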

edgar1170 (Author) commented Nov 9, 2022

@liubao68 We found the problem on the client side: after upgrading, threads are getting BLOCKED. Why would this appear only after the upgrade?

- locked org.apache.servicecomb.provider.pojo.PojoConsumerMetaRefresher@4196491c <---- but blocks 5 other threads!
at org.apache.servicecomb.provider.pojo.PojoConsumerMetaRefresher.getLatestMeta(PojoConsumerMetaRefresher.java:58)
at org.apache.servicecomb.provider.pojo.PojoInvocationCreator.create(PojoInvocationCreator.java:26)
at org.apache.servicecomb.provider.pojo.HandlerInvocationCaller.call(HandlerInvocationCaller.java:31)
at org.apache.servicecomb.provider.pojo.Invoker.invoke(Invoker.java:71)
at com.sun.proxy.$Proxy203.queryRule(Unknown Source)

edgar1170 (Author) commented Nov 9, 2022

Every call through ServiceComb adds another blocked thread; the number of blocked threads keeps growing and eventually blows up.

edgar1170 (Author)

Perhaps we need to limit the total number of threads somewhere?

liubao68 (Contributor) commented Nov 9, 2022

Could you provide a more complete stack trace? From the information you gave I cannot tell where it is blocked. Alternatively, look at the full thread dump yourself: find the thread that is currently executing the synchronized code and check why it is not releasing the lock.
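
To spot the lock owner quickly in such a dump, one option is to filter the BLOCKED threads and print which thread holds the monitor each one is waiting on; a minimal sketch using the standard ThreadMXBean API (generic JMX code, not tied to ServiceComb):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadReport {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                // Print the monitor each BLOCKED thread waits on and the thread that currently owns it;
                // the owner's stack is where to look for why the lock is not released.
                System.out.printf("%s is BLOCKED on %s, held by %s (id=%d)%n",
                        info.getThreadName(),
                        info.getLockName(),
                        info.getLockOwnerName(),
                        info.getLockOwnerId());
            }
        }
    }
}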
