Gatling hanging AFTER reports are generated?

Hello,

We have a strange problem. Our Gatling tests have started hanging after the reports are generated.

The output looks like this:


14:56:12.626 [INFO ] c.e.e.g.c.a.EndAction - Done user #10
14:56:12.627 [INFO ] c.e.e.g.c.r.w.FileDataWriter - Received flush order
14:56:12.629 [DEBUG] c.e.e.g.c.r.Runner - All scenarios finished, stoping actors
Simulation Finished.
Generating reports…
14:56:13.102 [DEBUG] o.f.s.u.ClassFinder - loaded commands from jar:file:/var/lib/jenkins/deploy/client/clientArchive/lib/scalate-core-1.5.3.jar!/META-INF/services/org.fusesource.scalate/addon.index
14:56:13.103 [DEBUG] o.f.s.u.ClassFinder - loaded classes: List(org.fusesource.scalate.filter.ScalaMarkdownFilter)

…[SNIPPED MORE LOADING TEMPLATE STUFF]…

14:56:13.260 [DEBUG] o.f.s.TemplateEngine - Loaded uri: templates/series_scatter.ssp template: templates.$scalate$series_scatter_ssp@2f1d9c7c
Reports generated in 0s.
Please open the following file : /var/lib/jenkins/deploy/client/clientArchive/results/run20120821145610/index.html

After that it just hangs there and does not exit.

We are using Gatling 1.1.25, but we have implemented our own protocol, as our application is not HTTP-based.

It seems quite possible that we have done something silly in our protocol that causes Actors to hang around, but I would have assumed that would cause Gatling to hang before report generation?

Any ideas on what might be going on, or how to investigate it further?

cheers
Perryn

Hi Perryn,

There can be 2 possible causes for Gatling hanging:

  • messages/users not being propagated while running the simulation
  • thread pools not being shut down

Since the reports were generated, you are in the second case.

You can use a monitoring tool such as YourKit or VisualVM to see which threads are still alive.
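If attaching a profiler is awkward (on a CI box, for example), the same information can be approximated from inside the JVM. The sketch below is only an illustration: the class name `LiveThreads` and its methods are made up for this example and are not part of Gatling. It relies on the fact that only live non-daemon threads prevent a JVM from exiting, so those are the ones worth listing. Running `jstack` against the hung process gives a similar picture from the outside.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Gatling. */
class LiveThreads {

    /** Names of live non-daemon threads, i.e. the threads that keep a JVM from exiting. */
    static List<String> nonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    /** Prints each live non-daemon thread with its state and current stack. */
    static void dump() {
        Thread.getAllStackTraces().forEach((thread, stack) -> {
            if (thread.isAlive() && !thread.isDaemon()) {
                System.out.println(thread.getName() + " (" + thread.getState() + ")");
                for (StackTraceElement frame : stack) {
                    System.out.println("    at " + frame);
                }
            }
        });
    }
}
```

Calling `LiveThreads.dump()` at the very end of a run should show which pool or reader thread is still parked, and the stack frames usually point at the component that created it.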

Here’s a list of what has to be taken care of in “standard” Gatling:

  • the Actor system has to be shut down, see com.excilys.ebi.gatling.app.Gatling.useActorSystem
  • the HTTP engine has to be shut down, this is done with a callback on the Actor system shut down, see com.excilys.ebi.gatling.http.action.HttpRequestAction.HTTP_CLIENT
  • the Scalate templating engine (used for dynamic request bodies) has to be shut down the same way, because when templates are not precompiled (as the ones used in charts/highcharts), it starts an interactive compiler, see com.excilys.ebi.gatling.http.request.builder.AbstractHttpRequestWithBodyBuilder
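
The pattern in the list above (each component registers a cleanup callback that runs when the core system shuts down) can be sketched generically. This is not Gatling's or Akka's API: `Lifecycle` and `ProtocolEngine` are hypothetical names standing in for the core system and a custom protocol engine, and the example assumes the engine owns a plain `ExecutorService`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for the core system that owns the shutdown sequence. */
class Lifecycle {
    private final List<Runnable> callbacks = new ArrayList<>();

    /** Components register their cleanup work here when they are created. */
    synchronized void onShutdown(Runnable callback) {
        callbacks.add(callback);
    }

    /** Runs every registered callback; a failing callback does not block the rest. */
    synchronized void shutdown() {
        for (Runnable callback : callbacks) {
            try {
                callback.run();
            } catch (RuntimeException e) {
                System.err.println("Shutdown callback failed: " + e);
            }
        }
        callbacks.clear();
    }
}

/** Hypothetical protocol engine with its own worker pool. */
class ProtocolEngine {
    // The default thread factory creates non-daemon threads.
    final ExecutorService workers = Executors.newFixedThreadPool(2);

    ProtocolEngine(Lifecycle lifecycle) {
        // Without this registration, any started pool thread keeps the JVM alive.
        lifecycle.onShutdown(workers::shutdown);
    }
}
```

The point of the design is that the engine registers its own cleanup at construction time, so the core system does not need to know which protocols are loaded.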

Hope that helps,

Cheers,

Steph

2012/8/21 Perryn Fowler <pezlists@gmail.com>

Hi Perryn,

Were you able to identify the threads that were still alive?
I have another user who reports a similar problem, but I haven’t been able to reproduce yet.

Cheers,

Stéphane

Hi Steph,

No, not yet. I’m trying to confirm a few observations before breaking out a thread analyzer.

FYI, on Windows it seems to hang for only 60s, but on Unix it seems to hang indefinitely. I am trying to confirm this.

I am a bit confused as to where the threads may be coming from as we have just dropped in a replacement protocol, so shutting down the Actors and the template engine should still be handled by standard Gatling.

I’m not aware of anywhere within our DSL where we are starting threads. (I suppose there may be something analogous to the HTTP engine in our DSL that does stuff with threads that I don’t know about.)

Investigation continues…

Hi Perryn,

This kind of OS-dependent timeout looks to me like sockets not being closed, so it might be related to your specific protocol.
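
One way an unclosed socket can keep a JVM alive is through a reader thread blocked on it. The sketch below is a guess at the general shape of such a client, not the actual protocol code from this thread: `LongLivedClient` and its members are hypothetical. It shows that the reader thread stays alive for as long as the socket is open and exits once `close()` is called.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

/** Hypothetical long-lived TCP client; not part of Gatling. */
class LongLivedClient implements AutoCloseable {
    private final Socket socket;
    final Thread reader;

    LongLivedClient(String host, int port) throws IOException {
        socket = new Socket(host, port);
        reader = new Thread(this::readLoop, "client-reader");
        // A new thread inherits its daemon flag from its creator,
        // so it is non-daemon when started from a non-daemon thread.
        reader.start();
    }

    private void readLoop() {
        try (InputStream in = socket.getInputStream()) {
            byte[] buffer = new byte[4096];
            // read() blocks until data arrives, the peer closes,
            // or the socket is closed locally.
            while (in.read(buffer) != -1) {
                // A real client would decode and dispatch the message here.
            }
        } catch (IOException e) {
            // Expected when close() is called while read() is blocked.
        }
    }

    @Override
    public void close() throws IOException {
        socket.close(); // unblocks the reader, which then exits
    }
}
```

If nothing ever calls `close()`, the reader only ends when the peer or the OS tears the connection down, which would fit a hang whose length depends on the platform.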

Good luck,

Steph

2012/8/23 Perryn Fowler <pezlists@gmail.com>

Hi Steph,

I finally bit the bullet and took some thread dumps. After tracing through them and a lot of guesswork, I arrived at the conclusion that you had told me the answer in your first response:

“this is done with a callback on the Actor system shut down”

We were missing this callback for our TCP equivalent of HTTP_CLIENT.

The only mystery is how this ever worked at all…

Perryn

Very good news!

Will you disclose some details about your protocol? I’m very curious about it… ;)

Steph


It’s an implementation of http://en.wikipedia.org/wiki/Extensible_Provisioning_Protocol

Underneath it all, it creates a long-lived socket and then exchanges XML messages over it.

Not sure what you’d like to know?

Just curiosity. We haven’t found time to implement other protocols yet, but I hope we will someday. So I’m curious about how other people are able to hack into Gatling, whether the API is easily extensible, etc. I’m also trying to understand what the impact of changing some inner APIs would be.

Steph
