Gatling scenario never ends for some users

Hi,
I recently ran into a weird problem with a Gatling simulation scenario for testing an eCommerce site.
Basically, it happens that sometimes (quite often recently) the simulation doesn’t finish properly: a few users (1, 2, sometimes 4) are still shown as active, but I can’t see any more requests in the simulation.log. So I took a thread dump on the Gatling injector server and noticed that there is roughly one thread for each of the active users that doesn’t complete its navigation, with the following pattern:

at scala.runtime.ScalaRunTime$.hash(ScalaRunTime.scala:206)
at scala.collection.immutable.HashMap.elemHashCode(HashMap.scala:80)
at scala.collection.immutable.HashMap.computeHash(HashMap.scala:89)
at scala.collection.immutable.HashMap.get(HashMap.scala:54)
at scala.collection.GenMapLike$$anonfun$liftedTree1$1$1.apply(GenMapLike.scala:120)
at scala.collection.GenMapLike$$anonfun$liftedTree1$1$1.apply(GenMapLike.scala:119)
at scala.collection.Iterator$class.forall(Iterator.scala:905)
at scala.collection.AbstractIterator.forall(Iterator.scala:1336)
at scala.collection.IterableLike$class.forall(IterableLike.scala:75)
at scala.collection.AbstractIterable.forall(Iterable.scala:54)
at scala.collection.GenMapLike$class.liftedTree1$1(GenMapLike.scala:119)
at scala.collection.GenMapLike$class.equals(GenMapLike.scala:118)
at scala.collection.AbstractMap.equals(Map.scala:59)
at io.gatling.http.cache.Cookies.equals(Cookies.scala:33)
at io.gatling.http.cache.PermanentRedirectCacheKey.equals(PermanentRedirectCacheSupport.scala:33)
at scala.collection.immutable.Map$Map1.get(Map.scala:107)
at io.gatling.core.util.cache.Cache.get(Cache.scala:79)
at io.gatling.core.util.cache.SessionCacheHandler.io$gatling$core$util$cache$SessionCacheHandler$$$anonfun$1(SessionCacheHandler.scala:64)
at io.gatling.core.util.cache.SessionCacheHandler$$Lambda$323/1463745981.apply(Unknown Source)
at scala.Option.flatMap(Option.scala:171)
at io.gatling.core.util.cache.SessionCacheHandler.getEntry(SessionCacheHandler.scala:64)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.permanentRedirect1$1(PermanentRedirectCacheSupport.scala:55)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.io$gatling$http$cache$PermanentRedirectCacheSupport$$permanentRedirect(PermanentRedirectCacheSupport.scala:64)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.applyPermanentRedirect(PermanentRedirectCacheSupport.scala:75)
at io.gatling.http.cache.HttpCaches.applyPermanentRedirect(HttpCaches.scala:24)
at io.gatling.http.action.sync.HttpTx$.startWithCache(HttpTx.scala:55)
at io.gatling.http.action.sync.HttpTx$.start(HttpTx.scala:94)
at io.gatling.http.ahc.ResponseProcessor.redirect$1(ResponseProcessor.scala:274)
at io.gatling.http.ahc.ResponseProcessor.processResponse(ResponseProcessor.scala:317)
at io.gatling.http.ahc.ResponseProcessor.onCompleted(ResponseProcessor.scala:66)
at io.gatling.http.ahc.AsyncHandler.io$gatling$http$ahc$AsyncHandler$$$anonfun$1(AsyncHandler.scala:136)
at io.gatling.http.ahc.AsyncHandler.io$gatling$http$ahc$AsyncHandler$$$anonfun$1$adapted(AsyncHandler.scala:134)
at io.gatling.http.ahc.AsyncHandler$$Lambda$331/434896179.apply(Unknown Source)
at io.gatling.http.ahc.AsyncHandler.withResponse(AsyncHandler.scala:127)
at io.gatling.http.ahc.AsyncHandler.onCompleted(AsyncHandler.scala:134)
at io.gatling.http.ahc.AsyncHandler.onCompleted(AsyncHandler.scala:47)
at org.asynchttpclient.netty.NettyResponseFuture.getContent(NettyResponseFuture.java:189)
at org.asynchttpclient.netty.NettyResponseFuture.done(NettyResponseFuture.java:223)
at org.asynchttpclient.netty.handler.HttpHandler.finishUpdate(HttpHandler.java:58)
at org.asynchttpclient.netty.handler.HttpHandler.handleChunk(HttpHandler.java:159)
at org.asynchttpclient.netty.handler.HttpHandler.handleRead(HttpHandler.java:187)
at org.asynchttpclient.netty.handler.AsyncHttpClientHandler.channelRead(AsyncHttpClientHandler.java:76)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:435)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:250)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1228)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1039)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)

From the stack trace I guessed it might be a problem related to infinite redirects, so I also added .maxRedirects(n) on the HTTP protocol to avoid an infinite redirect loop, but the problem is still there. And, by the way, I found no trace in the log that many redirects were issued, particularly toward the end of the tests when only the 1 or 2 active users were left.
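
In case it helps, this is roughly how I added it (a minimal sketch of my 2.x protocol configuration; the base URL and the value of n are placeholders, not my real setup):

import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Cap the number of redirects Gatling will follow for a single request,
// so a redirect loop fails the request instead of spinning forever.
val httpProtocol = http
  .baseURL("https://shop.example.com") // placeholder, not the real site
  .maxRedirects(5)                     // the "n" mentioned above; any small cap works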

Do you have any idea what the problem could be?

When the issue starts to happen I also see a huge spike on the CPU of the Gatling servers, which then stays up until the end of the test.

gatlingInjectorCPU.png

You can see that around 19:00 the issue happens (just 1 user remains active at the end of the test), and on a server with just 2 CPUs almost 50% of the CPU (1 core) was fully used. This is probably due to the thread that we still see RUNNABLE at the end of the test (as if it were consuming a whole core in an infinite loop or something similar).

Has anyone experienced a similar problem?

Oh sorry, I’m on gatling 2.5

Hi,

Oh sorry, I’m on gatling 2.5

Nope. There’s no 2.5

Could you please upgrade to 3.0.0-RC3 and check if your issue is fixed?

Stéphane Landelle

GatlingCorp CTO
slandelle@gatling.io


Apologies… it was 2.2.5.

Regarding the migration, I can give it a try, but probably not for the engagement with this customer… I won’t have enough time.

Thanks anyway

Gatling 2.2.5 was released in April 2017…
The latest stable is 2.3.1, and it will reach end of life in 2 weeks.

Hi Stéphane,
since you gave me the same advice for both problems (the never-ending simulation and the timeouts on requests that never connected), I convinced the customer to let me invest some time in the upgrade. It was less painful than expected and the tests are now running on Gatling 3.0. Regarding the problem of the scenario never ending, I’ve found this promising error in the new report:

Invalid redirect to the same request

This was probably causing the infinite loop. Now my problem is to find out which requests those are… on Gatling 2.2.5 I had 2 options for this (rough sketches after the list below):

  • I enriched the simulation log with more information using extraInfoExtractor. But this no longer seems to be an option, so now if I grep the simulation log for the error I don’t see the URL that was called.

  • there were a couple of loggers, io.gatling.http.ahc and io.gatling.http.response, that when set to TRACE or DEBUG level logged the request and response bodies of all calls / failed calls.
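
For context, this is roughly what I had on 2.2.5 (sketches from memory, so exact field names may be slightly off; the base URL is a placeholder):

import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Gatling 2.x only: enrich simulation.log with the URL and status of each request
// via extraInfoExtractor (no longer available in 3.0, hence my question).
val httpProtocolWithExtraInfo = http
  .baseURL("https://shop.example.com") // placeholder
  .extraInfoExtractor { extraInfo =>
    List(extraInfo.request.getUrl, extraInfo.response.statusCode.getOrElse(-1))
  }

and in logback.xml I enabled the 2.x loggers mentioned above:

<!-- 2.x logger names: TRACE logs all requests/responses, DEBUG only the failed ones -->
<logger name="io.gatling.http.ahc" level="TRACE" />
<logger name="io.gatling.http.response" level="TRACE" />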

Can you advise how to achieve something similar with version 3.0?

Thanks a lot

  • there were a couple of loggers, io.gatling.http.ahc and io.gatling.http.response, that when set to TRACE or DEBUG level logged the request and response bodies of all calls / failed calls.

https://github.com/gatling/gatling/blob/master/gatling-core/src/main/resources/logback.dummy#L11-L13

Stéphane Landelle

GatlingCorp CTO
slandelle@gatling.io

Thanks a lot Stéphane,
that was what I was looking for.