Random Remotely Closed exception

Hi,

I'm experiencing some strange behaviour in my scenario.
The scenario consists of a chain that logs into a web application, does something, and logs out, and these steps are repeated 10 times.
The issue is that I randomly get the following exception for different requests.
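For context, the loop has roughly this shape (just a sketch, not the attached DeployDashboard.scala; the request names, paths and credentials are placeholders, and the exact DSL names differ slightly between 1.5.2 and the 2.0 milestones):

  import io.gatling.core.Predef._
  import io.gatling.http.Predef._

  class LoginLogoutLoop extends Simulation {

    // hypothetical base URL standing in for the real application host
    val httpConf = http.baseURL("http://192.168.200.182")

    val scn = scenario("login / work / logout")
      .repeat(10) {
        exec(http("login").post("/login")
            .formParam("user", "demo")          // placeholder credentials
            .formParam("password", "demo"))
          .exec(http("do_something").get("/dashboard"))
          .exec(http("logout").get("/logout"))
      }

    // a single user, as described above
    setUp(scn.inject(atOnceUsers(1))).protocols(httpConf)
  }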

13:11:34.820 [WARN ] c.e.e.g.h.a.GatlingAsyncHandler - Request 'SOME_DIFFERENT_REQUESTS_HERE' failed
java.io.IOException: Remotely Closed [id: 0xdf69cd33, /192.168.200.130:51176 :> /192.168.200.182:80]
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.channelClosed(NettyAsyncHttpProvider.java:1388) [async-http-client-1.7.18.20130621.jar:na]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.channelClosed(SimpleChannelUpstreamHandler.java:225) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpContentDecoder.channelClosed(HttpContentDecoder.java:147) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.replay.ReplayingDecoder.cleanup(ReplayingDecoder.java:570) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.frame.FrameDecoder.channelClosed(FrameDecoder.java:371) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.channelClosed(HttpClientCodec.java:221) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpClientCodec.handleUpstream(HttpClientCodec.java:92) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:376) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:93) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[netty-3.6.6.Final.jar:na]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_25]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_25]
at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
13:11:34.820 [WARN ] c.e.e.g.h.a.GatlingAsyncHandlerActor - Request 'request_14' failed : Remotely Closed [id: 0xdf69cd33, /192.168.200.130:51176 :> /192.168.200.182:80]
13:11:34.821 [DEBUG] c.e.e.g.h.a.GatlingAsyncHandlerActor -

The application under load is served by an Apache web server.
I'm not using thousands of users, just 1, so we can assume the server is not stressed at all.
Also, with other tools such as SilkPerformer I don't see this behaviour even with 20 concurrent users.

I've attached the actual scenario (for Gatling 2) to this post.
I have tried it with both builds currently available for download (1.5.2 and 2.0.0-M3a).

Thanks in advance!

DeployDashboard.scala (7.77 KB)

The remote end is still hanging up on you, so check those logs first.

Actually, there are no errors in the Apache log.

On Thursday, 7 November 2013 at 19:47:37 UTC+2, Floris Kraak wrote:

Hi,

The problem could be anything, even a JDK NIO bug (by the way, which JDK version do you use?); one explanation is that your server is indeed closing the connection.
The only way to be sure would be to get a TCP dump, for example with Wireshark.
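In case it helps, such a capture can be taken on the injector and then opened in Wireshark; the interface name below is an assumption, and the addresses are the ones from the log above:

  # capture traffic between the injector and the server (adjust interface and addresses)
  tcpdump -i eth0 -s 0 -w remotely_closed.pcap host 192.168.200.182 and port 80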

I don't know how SilkPerformer works, but another tool not reporting this issue doesn't really mean anything: it could be silently retrying on a new connection. You can achieve the same with Gatling by setting the retry option to 1 or more in gatling.conf.
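For reference, the relevant block in gatling.conf looks roughly like this; the parameter is called maxRetry in the ahc section of recent builds, but the exact key name and nesting depend on the Gatling version, so check the gatling.conf shipped with yours:

  gatling {
    http {
      ahc {
        maxRetry = 1   # 0 = fail straight away on a dead connection, 1 or more = retry on a fresh one
      }
    }
  }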

Cheers,

Stéphane

… or simply a scripting bug. Send a wrong request, the application bombs with an error, and Gatling throws an exception. The logs show nothing, but the Apache log tends to be the wrong place to look anyway: usually there is an application server behind it that has its own logs…

Throwing exceptions when the remote end closes the connection is the wrong thing to do, by the way. It implies some bug in the Gatling stack, but Gatling really can't do much about this sort of thing. The information the user needs is the request info; the call stack is mostly irrelevant.

Hi,

I've experienced the same issue with Gatling 2 (JDK 1.7).
There are also a lot of "connection timed out" errors after 60 seconds.

Is it normal that for a load of 20 users/second Gatling creates up to 1000 active sessions?

Hi all,

First of all, thanks for your ideas!
I've tried tracing TCP dumps with Wireshark, and it looks like the requests I'm losing do reach the Apache host.
But somehow they can't get to Apache, because I don't see them in its log.
It is a simple PHP application, so there is no application server behind Apache.
Usually I'm losing one of the requests sent in a repeat loop with the same headers and parameters, so there is no difference between the requests that reached Apache and those that didn't.
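For anyone repeating this check, display filters along these lines can be used in Wireshark (the addresses are the ones from the log above; these are illustrative, not necessarily the exact filters used here):

  # all traffic between the injector and the server
  ip.addr == 192.168.200.130 && ip.addr == 192.168.200.182 && tcp.port == 80

  # only the HTTP requests, to compare against the Apache access log
  http.request

  # segments where the server closes the connection first
  ip.src == 192.168.200.182 && (tcp.flags.fin == 1 || tcp.flags.reset == 1)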

On Friday, 8 November 2013 at 12:30:09 UTC+2, Floris Kraak wrote:

Which version of JDK 7 do you use? HotSpot or OpenJDK? Which update? Have you tried upgrading to the latest HotSpot?

I've tried the following Oracle JDKs:

  • 1.6.0_45
  • 1.7.0_25
  • 1.7.0_45

On Friday, 8 November 2013 at 21:22:42 UTC+2, Stéphane Landelle wrote:

Could you provide access to your app so I can try to reproduce, please?

Yes, I can. Please contact me directly.

On Saturday, 9 November 2013 at 23:15:42 UTC+2, Stéphane Landelle wrote:

Hi there,

I've seen lots of issues like this in the past. It has always been because I was overloading the server.

How many connections are you opening? Thousands? How many does your Apache allow?

Stefan

Stefan,

I'm able to reproduce this even with one user.
If it happened with thousands of users I would assume it was a problem with the server, but not in the case of a single user.

On Monday, 11 November 2013 at 22:50:03 UTC+2, Stefan Magnus Landrø wrote:

Hi Ievgen,

I was able to reproduce this on my side too, and I know what the problem is.

The connection is being closed on the server side because you close idle keep-alive connections after 5 seconds, which is perfectly fine.
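For reference, that idle timeout corresponds to Apache's standard keep-alive settings, something like the following in httpd.conf (5 seconds is the usual Apache default; the values here are illustrative):

  KeepAlive On
  MaxKeepAliveRequests 100
  KeepAliveTimeout 5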

We've been too aggressive with an AsyncHttpClient parameter named retry: its default value is 4 and we override it to 0.
Please change the value back to 4 in gatling.conf, as shown below.
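That is, something along these lines (using HOCON's dotted-path form; double-check the exact key against the gatling.conf shipped with your build):

  gatling.http.ahc.maxRetry = 4   # AHC's default; Gatling currently overrides it to 0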

I'll also clean up and fix things in AHC; there's room for optimization there.

Cheers,

Stéphane

Hi Stéphane,

I noticed that https://github.com/gatling/gatling/issues/2199 set maxRetry back to 0 in gatling.conf in 2.0.0.
Any particular reason? Under what circumstances would it make sense to increase the value?

Stefan

It doesn’t make sense anymore, from my point of view.

AsyncHttpClient was abusing this feature to compensate for picking a connection from the pool that had already been closed by the server by the time it tried to send a request on it. I completely rewrote this logic.

Now, you really only get "Remotely closed" when the server forcefully closed the connection in the middle of a request, not when it was simply doing its keep-alive timeout job.

Hi,

I happen to have the same problem again with Gatling 2.0. I get "Remotely closed" errors when the maxRetry parameter is set to 0, but I don't have any errors when the value is 4.

Are you sure this is fixed?

Thanks,

Sébastien

Yes.

This happens because the server decided to close the connection while the client was writing to it.
AsyncHttpClient was abusing this retry mechanism, and that's what I fixed.

Don't try to hide the problem: those "Remotely closed" errors are really what is happening, probably because your server decides to abruptly close connections when the load becomes too heavy.