I am seeing the following error frequently, even with very basic smoke tests that just load the homepage and its internal resources. When I run Gatling in debug mode, I see the following error message:
15:26:12.056 [DEBUG] i.g.h.a.AsyncHandler - Request 'request_311' failed for user 6586454730362407223-25
java.util.concurrent.TimeoutException: Request timed out to alphaint/192.168.1.50:443 of 60000 ms
at com.ning.http.client.providers.netty.request.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:40) [async-http-client-1.9.0-BETA9.jar:na]
at com.ning.http.client.providers.netty.request.timeout.RequestTimeoutTimerTask.run(RequestTimeoutTimerTask.java:45) [async-http-client-1.9.0-BETA9.jar:na]
at io.gatling.http.ahc.AkkaNettyTimer$$anonfun$1.apply$mcV$sp(AkkaNettyTimer.scala:55) [gatling-http-2.0.0-SNAPSHOT.jar:2.0.0-SNAPSHOT]
at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117) [akka-actor_2.10-2.3.4.jar:na]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) [akka-actor_2.10-2.3.4.jar:na]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [akka-actor_2.10-2.3.4.jar:na]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library.jar:na]
15:26:12.058 [WARN ] i.g.h.a.AsyncHandlerActor - Request 'request_311' failed: java.util.concurrent.TimeoutException: Request timed out to alphaint/192.168.1.50:443 of 60000 ms
15:26:12.062 [DEBUG] i.g.h.a.AsyncHandlerActor -
Request:
request_311: KO java.util.concurrent.TimeoutException: Request timed out to alphaint/192.168.1.50:443 of 60000 ms
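For context, the test itself is very simple; a stripped-down sketch of that kind of smoke test in the Gatling 2 DSL looks roughly like the following (the base URL, request name and injection profile are placeholders, not my real values):

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class HomepageSmokeTest extends Simulation {

  val httpProtocol = http
    .baseURL("https://alphaint")   // placeholder base URL
    .inferHtmlResources()          // fetch embedded resources; exact method name may differ between 2.0 milestones

  val scn = scenario("Homepage smoke test")
    .exec(http("request_311").get("/"))

  // placeholder injection profile
  setUp(scn.inject(rampUsers(25) over (30 seconds))).protocols(httpProtocol)
}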
I can't see any packet loss on either the load generator or the app server when checking ifconfig eth0/eth1.
netstat -s | egrep -i 'loss|retran' shows the following output. There seems to be an increase in TCPLossProbes and TCPLossProbeRecovery.
1814 segments retransmited
1160 times recovered from packet loss by selective acknowledgements
TCPLostRetransmit: 81
1 timeouts in loss state
1295 fast retransmits
142 forward retransmits
78 retransmits in slow start
TCPLossProbes: 4754
TCPLossProbeRecovery: 3159
24 SACK retransmits failed
TCPRetransFail: 172
When I modified gatling.conf (http → ahc → requestTimeout = 120000), I see a different message, as listed below:
java.util.concurrent.TimeoutException: Read timeout to testenv of 60000ms 1 (100.0%)
Kindly advise whether any changes are needed in gatling.conf to resolve this sort of timeout issue, or advise on how to proceed with further investigation.
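For reference, the change in gatling.conf amounts to roughly this (assuming the standard Gatling 2 layout, with everything else left at its defaults; if I understand the config correctly, readTimeout is the separate setting behind the "Read timeout ... of 60000ms" message, so please correct me if that's wrong):

gatling {
  http {
    ahc {
      requestTimeout = 120000   # raised from the 60000 ms default
      # readTimeout = 60000     # untouched; presumably what produces the "Read timeout" errors
    }
  }
}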
I have downloaded the RC3.zip file, retried the test, and still see the same issue, i.e. java.util.concurrent.TimeoutException: Request timed out to testenv/192.168.1.00:443 of 60000 ms.
When I modify gatling.conf (http → ahc → requestTimeout = 120000), I instead see java.util.concurrent.TimeoutException: Read timeout to testenv/
Could you kindly advise whether any changes are needed in gatling.conf, or advise on potential improvements?
This issue is resolved by replacing ".maxConnectionsPerHostLikeChrome" with "maxConnectionsPerHost(2)". Reducing connections to 2 or 3 seems to resolve it.
When I keep maxConnectionsPerHost(4), I see the same error, i.e. a read timeout to the test environment.
How realistic is it to keep maxConnectionsPerHost at 2 or 3? Kindly advise if this is not the correct approach.
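Concretely, the change was on the HTTP protocol definition, along these lines (the base URL is a placeholder, and this assumes the usual io.gatling.http.Predef._ import):

// before (Chrome-like pooling):
//   http.baseURL("https://alphaint").maxConnectionsPerHostLikeChrome
// after (the variant that made the timeouts disappear):
val httpProtocol = http
  .baseURL("https://alphaint")   // placeholder base URL
  .maxConnectionsPerHost(2)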
Unfortunately I cannot share access to the test environment, as it is secured. I'm happy to share my test code if that would help in debugging this issue.
maxConnectionsPerHostLikeChrome implies maxConnectionsPerHost(6).
If you lower this value, you generate fewer concurrent connections and requests to your SUT.
Honestly, I can't do much without being able to reproduce.
I'll check the AsyncHttpClient code again later this week and see if by any chance I find something, but I doubt it.
Then again, the problem could also be with your SUT; I can't tell.
Hi Stéphane (best to copy and paste to get the acute accent in there…),
If the request timed out, then presumably Raja can use netstat to determine whether the connection was established or not (for 60 seconds), etc.?
It's a read timeout, so there may be errors at the server end as well.
How realistic is it to keep maxConnectionsPerHost at 2 or 3? Kindly advise if this is not the correct approach.
→ I would suggest the approach should be: determine what your client/server does in terms of kept-alive connections and reproduce that in the tests.
i.e. if your system keeps 10 pooled connections open between client and server for 5 minutes, then set up Gatling to do that. If the system fails with those settings, then the test has failed and you would need to look into why (maybe the connection limit or backlog on the server is too low), assuming that the system was sized correctly for the test.
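As a rough sketch of that approach in the Gatling 2 DSL (the 10-connection pool, the 5-minute duration and the shareConnections choice are just the hypothetical figures from above, to be replaced with whatever the real client actually does):

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class KeepAliveProfileSimulation extends Simulation {

  val httpProtocol = http
    .baseURL("https://alphaint")    // placeholder base URL
    .maxConnectionsPerHost(10)      // match the real client's per-host pool size
    .shareConnections               // one shared pool, if that is how the real client behaves

  // 10 users keeping their connections busy for 5 minutes
  val scn = scenario("Keep-alive profile")
    .during(5 minutes) {
      exec(http("homepage").get("/"))
        .pause(5 seconds)
    }

  setUp(scn.inject(atOnceUsers(10))).protocols(httpProtocol)
}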