Hi,
I recently run into a weird problem with a Gatling simulation scenario for testing an eCommerce site.
Basically it happens that sometime (quite often recently), the simulation doesn’t finish properly and a few active users (1, 2 sometimes 4) are still logged, but I can’t see anymore requests sent on the simulation.log. So I’ve taken a Thread Dump on the gatling injector server and I’ve noticed that there are more or less a thread for each of the active users that doesn’t complete the navigation with the following pattern:
at scala.runtime.ScalaRunTime$.hash(ScalaRunTime.scala:206)
at scala.collection.immutable.HashMap.elemHashCode(HashMap.scala:80)
at scala.collection.immutable.HashMap.computeHash(HashMap.scala:89)
at scala.collection.immutable.HashMap.get(HashMap.scala:54)
at scala.collection.GenMapLike$$anonfun$liftedTree1$1$1.apply(GenMapLike.scala:120)
at scala.collection.GenMapLike$$anonfun$liftedTree1$1$1.apply(GenMapLike.scala:119)
at scala.collection.Iterator$class.forall(Iterator.scala:905)
at scala.collection.AbstractIterator.forall(Iterator.scala:1336)
at scala.collection.IterableLike$class.forall(IterableLike.scala:75)
at scala.collection.AbstractIterable.forall(Iterable.scala:54)
at scala.collection.GenMapLike$class.liftedTree1$1(GenMapLike.scala:119)
at scala.collection.GenMapLike$class.equals(GenMapLike.scala:118)
at scala.collection.AbstractMap.equals(Map.scala:59)
at io.gatling.http.cache.Cookies.equals(Cookies.scala:33)
at io.gatling.http.cache.PermanentRedirectCacheKey.equals(PermanentRedirectCacheSupport.scala:33)
at scala.collection.immutable.Map$Map1.get(Map.scala:107)
at io.gatling.core.util.cache.Cache.get(Cache.scala:79)
at io.gatling.core.util.cache.SessionCacheHandler.io$gatling$core$util$cache$SessionCacheHandler$$$anonfun$1(SessionCacheHandler.scala:64)
at io.gatling.core.util.cache.SessionCacheHandler$$Lambda$323/1463745981.apply(Unknown Source)
at scala.Option.flatMap(Option.scala:171)
at io.gatling.core.util.cache.SessionCacheHandler.getEntry(SessionCacheHandler.scala:64)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.permanentRedirect1$1(PermanentRedirectCacheSupport.scala:55)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.io$gatling$http$cache$PermanentRedirectCacheSupport$$permanentRedirect(PermanentRedirectCacheSupport.scala:64)
at io.gatling.http.cache.PermanentRedirectCacheSupport$class.applyPermanentRedirect(PermanentRedirectCacheSupport.scala:75)
at io.gatling.http.cache.HttpCaches.applyPermanentRedirect(HttpCaches.scala:24)
at io.gatling.http.action.sync.HttpTx$.startWithCache(HttpTx.scala:55)
at io.gatling.http.action.sync.HttpTx$.start(HttpTx.scala:94)
at io.gatling.http.ahc.ResponseProcessor.redirect$1(ResponseProcessor.scala:274)
at io.gatling.http.ahc.ResponseProcessor.processResponse(ResponseProcessor.scala:317)
at io.gatling.http.ahc.ResponseProcessor.onCompleted(ResponseProcessor.scala:66)
at io.gatling.http.ahc.AsyncHandler.io$gatling$http$ahc$AsyncHandler$$$anonfun$1(AsyncHandler.scala:136)
at io.gatling.http.ahc.AsyncHandler.io$gatling$http$ahc$AsyncHandler$$$anonfun$1$adapted(AsyncHandler.scala:134)
at io.gatling.http.ahc.AsyncHandler$$Lambda$331/434896179.apply(Unknown Source)
at io.gatling.http.ahc.AsyncHandler.withResponse(AsyncHandler.scala:127)
at io.gatling.http.ahc.AsyncHandler.onCompleted(AsyncHandler.scala:134)
at io.gatling.http.ahc.AsyncHandler.onCompleted(AsyncHandler.scala:47)
at org.asynchttpclient.netty.NettyResponseFuture.getContent(NettyResponseFuture.java:189)
at org.asynchttpclient.netty.NettyResponseFuture.done(NettyResponseFuture.java:223)
at org.asynchttpclient.netty.handler.HttpHandler.finishUpdate(HttpHandler.java:58)
at org.asynchttpclient.netty.handler.HttpHandler.handleChunk(HttpHandler.java:159)
at org.asynchttpclient.netty.handler.HttpHandler.handleRead(HttpHandler.java:187)
at org.asynchttpclient.netty.handler.AsyncHttpClientHandler.channelRead(AsyncHttpClientHandler.java:76)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:435)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:250)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1228)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1039)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
So I’ve guessed from the stack trace that maybe was a problem related to infinite redirect and then I’ve added also the .maxRedirects(n) on the http protocol to avoid an infinite redirect loop but the problem is still there and, by the way, I found no traces in the log that a lot of redirects were issued particularly toward the end of the tests when I had just the 1,2 active users.
Did you have any idea what can be the problem?
When the issue starts to happens I’ve also a huge spike on the CPU of the gatling servers that then stay up till the end of the test.
You can see that around 19:00 the issue happens (just 1 user remains active at the end of the test) but with a server with just 2 cpu we had almost 50% of the cpu (1 core) fully used, and this is probably due to this thread that we see still RUNNABLE at the end of the test (like if it’s consuming all CPU on an infinite loop or similar).
Does anyone experienced a similar problem?
Oh sorry, I’m on gatling 2.5