Blocked thread if WebApplicationException is reused. #4097
Comments
We ran into the same issue: our threads end up blocked in the WAITING state, with connections stuck in CLOSE_WAIT. It happened with this exception: "An I/O error has occurred while writing a response message entity to the container output stream". For now we have found and fixed the source of these exceptions, but the code should handle such scenarios gracefully instead of accumulating blocked threads indefinitely. This happens when we see this error: Thread blocked in waiting state:
How is that WebApplicationException reused by the Guava cache? It contains a response, so that exception should never be cached or reused.
@jbescos Agreed that the response shouldn't be reused. Unlike the original reporter, we were not using Guava, yet at one point we were reusing the response, which led to this scenario. Even then, shouldn't Jersey recover gracefully from this situation (or throw an exception) rather than leave threads waiting forever?
I think we can set a configurable timeout for the CompletableFuture.get in the line: So it will not hang forever.
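To make the timeout idea concrete, here is a minimal sketch in plain Java. It is not Jersey's code: the class `TimedFutureGet`, the helper `getOrFail` and the 100 ms value are all hypothetical, and only `CompletableFuture.get(long, TimeUnit)` is a real JDK call. It shows a bounded wait that fails loudly instead of parking the thread forever.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedFutureGet {

    // Hypothetical helper: wait at most timeoutMillis for the future,
    // then throw instead of leaving the calling thread parked.
    public static <T> T getOrFail(CompletableFuture<T> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException(
                    "Value not available after " + timeoutMillis + " ms", e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Future completed exceptionally", e.getCause());
        }
    }

    public static void main(String[] args) {
        // A future that nobody ever completes, as in the blocked-thread scenario.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        try {
            getOrFail(neverCompleted, 100);
        } catch (IllegalStateException e) {
            System.out.println("Gave up waiting: " + e.getMessage());
        }
    }
}
```

With a timeout the request still fails, but the worker thread is returned to the pool rather than accumulating in WAITING state.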
@jbescos Yes, this should prevent the threads from hanging forever. We tried it on our server and it stopped the number of waiting threads from growing, which alleviated the problem. That will be sufficient, although it would be better if an exception in setStreamProvider() were also handled immediately (in addition to adding the timeout).
@jbescos Which version will this change go in?
@alishaarora Sorry for not replying for quite some time. I discussed this with @jansupol and we decided not to do anything yet. There are two reasons for this:
@jbescos Thanks for the reply.
I really wish it were that simple. If you have such a simple reproducer, please add it here. We have already spent some time on a reproducer without success.
I created a simple reproducer here: https://github.com/migedt/jersey-issue-4097-reproducer
With your example the thread does block, but it does not block when I run the following test:
Anyway, I will review this. Ideally the second request should throw an exception. I am attaching both relevant threads from your example:
The completable future is completed when close is invoked, but CommittingOutputStream.close only takes effect once, so when the stream is reused the completable future is never completed.
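The mechanism can be modelled in a few lines of plain Java. This is a simplified sketch, not Jersey's implementation: `CloseOnceStream`, `setOnClose` and `serve` are hypothetical names standing in for a stream whose close() is effective only once and for a request that waits on a future completed from that close.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.CompletableFuture;

public class CloseOnceDemo {

    // Hypothetical stand-in for a stream whose close() only acts the first time.
    public static class CloseOnceStream extends OutputStream {
        private final OutputStream delegate = new ByteArrayOutputStream();
        private Runnable onClose;
        private boolean closed;

        public void setOnClose(Runnable onClose) {
            this.onClose = onClose;
        }

        @Override
        public void write(int b) throws IOException {
            delegate.write(b);
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return; // later calls are no-ops, so a newly registered callback never runs
            }
            closed = true;
            delegate.close();
            if (onClose != null) {
                onClose.run();
            }
        }
    }

    // Models one request: register a completion callback, close the stream,
    // and report whether the future this request would wait on was completed.
    public static boolean serve(CloseOnceStream stream) {
        CompletableFuture<Void> committed = new CompletableFuture<>();
        stream.setOnClose(() -> committed.complete(null));
        try {
            stream.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return committed.isDone();
    }

    public static void main(String[] args) {
        CloseOnceStream shared = new CloseOnceStream();
        System.out.println("first request completed:  " + serve(shared));
        System.out.println("second request completed: " + serve(shared));
    }
}
```

In this model the first request sees its future completed, while the second request, which reuses the same stream, does not. An untimed get() on that second future would park the thread indefinitely.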
Signed-off-by: Jorge Bescos Gascon <[email protected]>
* Blocked thread if WebApplicationException is reused. #4097 Signed-off-by: Jorge Bescos Gascon <[email protected]>
I think this exception has been raised a few times, but it never seems to come with a reproducer. The scenario is that after a while a number of stuck threads are reported, in this case by WLS 12.2.1, but I can see that this code would be a problem in the servers:
"[ACTIVE] ExecuteThread: '48' for queue: 'weblogic.kernel.Default (self-tuning)'" #157 daemon prio=5 os_prio=0 tid=0x00007f1c849e2000 nid=0x56c7 waiting on condition [0x00007f1c58318000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007b36845c8> (a jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
at org.glassfish.jersey.servlet.internal.ResponseWriter.getResponseContext(ResponseWriter.java:299)
at org.glassfish.jersey.servlet.internal.ResponseWriter.callSendError(ResponseWriter.java:215)
at org.glassfish.jersey.servlet.internal.ResponseWriter.commit(ResponseWriter.java:194)
at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:413)
at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:784)
at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:444)
at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:490)
at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:334)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:471)
at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:425)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:383)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:336)
at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:223)
at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:286)
at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:260)
at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:137)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:350)
at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:25)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at weblogic.security.internal.IDCSSessionSynchronizationFilter.doFilter(IDCSSessionSynchronizationFilter.java:176)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at weblogic.websocket.tyrus.TyrusServletFilter.doFilter(TyrusServletFilter.java:274)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.authorization.CSRFFilter.doFilter(CSRFFilter.java:124)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.service.SlashSlashWarningFilter.doFilter(SlashSlashWarningFilter.java:51)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.authorization.CORSFilter.doFilter(CORSFilter.java:125)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.metrics.DtVisitorTrackingFilter.doFilter(DtVisitorTrackingFilter.java:253)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.authorization.TenantFilter.doFilter(TenantFilter.java:224)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.authorization.TrackingFilter.doFilter(TrackingFilter.java:173)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at com.oracle.breeze.service.inject.LocaleFilter.doFilter(LocaleFilter.java:35)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:78)
at oracle.security.jps.ee.http.JpsAbsFilter$3.run(JpsAbsFilter.java:172)
https://github.com/jersey/jersey/issues/3558
https://github.com/jersey/jersey/issues/3800
https://github.com/jersey/jersey/issues/3207
https://github.com/jersey/jersey/issues/3619
#3474
None of these appear to have a reproducible test case, so I have raised this bug in case they are separate concerns.
We see this behaviour in our code when a WebApplicationException is rethrown by a Guava Cache, but the Response it carries is no longer valid because it has already been used or is still in use.
Note that the response entity was originally a String; the problem became easier to reproduce when we used an entity that cannot be reused, but that isn't required.
Of course, the Response is an instance of OutboundJaxrsResponse, which holds an instance of OutboundMessageContext, which in turn holds a CommittingOutputStream, which is not reusable.
Depending on the load, you see either two threads using the CommittingOutputStream at once or errors due to closed streams. In most cases the client is unaware that anything has gone wrong, which makes this much harder to track down. I'm not sure how it gets the right response in this case.
So the deadlock happens in this section of code in ServerRuntime.java:
For example, if response.enableBuffering(runtime.configuration); fails because someone has already started to use the buffer, you end up in the Throwable block before the code in setStreamProvider is executed. This means that when you get to response.close(), it eventually gets stuck in ResponseWriter.commit because the ResponseContext is never set.
So it would make sense for the .get on the future in this ResponseWriter to have some kind of timeout, but it would be even better if the code recovered properly from this situation. As I say, this was hard to track down because it requires both concurrency and reuse of the same Response object.
I can demonstrate this in a running system if it helps with a fix. For the moment we have worked around it using a trivial subclass:
It might also be possible to use an interceptor to clone the Response, but I didn't try it.
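Another option, independent of the subclass and interceptor ideas, is to cache only immutable data and build a new exception on every throw, so that no per-request state is ever shared. The following is a minimal plain-Java sketch of that pattern, not the workaround used here: `ErrorInfo`, `ApiException` and `failureFor` are hypothetical names, with `ApiException` standing in for WebApplicationException so the example needs no JAX-RS dependency.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FreshExceptionPerThrow {

    // Hypothetical immutable description of a failure; safe to cache and share.
    public static final class ErrorInfo {
        public final int status;
        public final String message;

        public ErrorInfo(int status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    // Hypothetical stand-in for WebApplicationException: in the real case the
    // exception carries a Response, which must not be shared between requests.
    public static final class ApiException extends RuntimeException {
        public final int status;

        public ApiException(ErrorInfo info) {
            super(info.message);
            this.status = info.status;
        }
    }

    // The cache holds only the immutable description, never the exception itself.
    private final Map<String, ErrorInfo> cache = new ConcurrentHashMap<>();

    // Returns a new exception instance on every call, built from cached data.
    public ApiException failureFor(String key) {
        ErrorInfo info = cache.computeIfAbsent(
                key, k -> new ErrorInfo(404, "No entry for " + k));
        return new ApiException(info);
    }

    public static void main(String[] args) {
        FreshExceptionPerThrow service = new FreshExceptionPerThrow();
        ApiException first = service.failureFor("user-1");
        ApiException second = service.failureFor("user-1");
        System.out.println("distinct instances: " + (first != second));
        System.out.println("same status: " + (first.status == second.status));
    }
}
```

The cost is one small allocation per failure; in exchange, each request gets an exception (and, in the real case, a Response and its output stream) that nothing else has touched.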