Gatling crashes sometimes with [ERROR] i.g.c.c.Controller - Actor io.gatling.core.controller.Controller@295aad8c crashed

Hi,

We have multiple scripts running every 20 minutes in a Dockerized environment. Most of the time they run smoothly and finish properly. However, once in a while a few scripts keep running indefinitely.

In the output we see this block being repeated all the time:

  • ================================================================================
  • 2020-10-15 14:22:55 25s elapsed
  • ---- Requests ------------------------------------------------------------------
  • Global (OK=0 KO=0 )

  • ---- Alert C API ---------------------------------------------------------------
  • active: 0 / done: 0
  • ================================================================================

Quite early in the run I do see a crash, with this error message: 14:22:48.437 [ERROR] i.g.c.c.Controller - Actor io.gatling.core.controller.Controller@295aad8c crashed on message Some(Start(List(Scenario(Alert C API,io.gatling.core.action.Feed@754bf264…

After this, the block shown above is repeated endlessly.

What could be the reason? Reading the error description, I get the feeling it might have something to do with the feeder. However, the script runs perfectly fine all the other times, in the same Docker container on the same server, without any changes to the script or to any setting.

Normally the script output looks like:

  • GATLING_HOME is set to /opt/gatling
  • Simulation services.alertcapi.AlertCApiSimulation started…
  • ================================================================================
  • 2020-10-16 09:20:25 1s elapsed
  • ---- Requests ------------------------------------------------------------------
  • Global (OK=12 KO=0 )
  • Alert C Loc Linear, GET CALL (OK=6 KO=0 )
  • Alert C Loc Point, GET CALL (OK=6 KO=0 )
  • ---- Alert C API ---------------------------------------------------------------
  • active: 0 / done: 6
  • ================================================================================
  • Simulation services.alertcapi.AlertCApiSimulation completed in 1 seconds
  • Parsing log file(s)…
  • Parsing log file(s) done
  • Generating reports…
  • ================================================================================
  • ---- Global Information --------------------------------------------------------
  • request count 12 (OK=12 KO=0 )
  • min response time 4 (OK=4 KO=- )
  • max response time 80 (OK=80 KO=- )
  • mean response time 31 (OK=31 KO=- )
  • std deviation 25 (OK=25 KO=- )
  • response time 50th percentile 23 (OK=23 KO=- )
  • response time 75th percentile 53 (OK=53 KO=- )
  • response time 95th percentile 68 (OK=68 KO=- )
  • response time 99th percentile 78 (OK=78 KO=- )
  • mean requests/sec 6 (OK=6 KO=- )
  • ---- Response Time Distribution ------------------------------------------------
  • t < 800 ms 12 (100%)
  • 800 ms < t < 1200 ms 0 ( 0%)
  • t > 1200 ms 0 ( 0%)
  • failed 0 ( 0%)
  • ================================================================================
  • Reports generated in 1s.
  • Please open the following file: /opt/gatling/results/alertcapisimulation-20201016092022471/index.html

Does anyone have an idea of the direction in which we should look for a solution?

Don’t you have a full stacktrace?
Which version of Gatling do you use?

As required in the terms, please provide a Short, Self Contained, Correct (Compilable) Example (see http://sscce.org/).

Sorry, I do have a full stacktrace. I will attach it to this message.
We are running Gatling 3.3.1

The message below the stack trace in the attached file was repeated continuously until I manually stopped the process.
If you have difficulties reading the file please let me know.

Kind regards and thanks for your quick response,

Hans

log-events-viewer-result (1).csv (201 KB)

First things first: please upgrade to Gatling 3.4.1.

Your DataWriters fail to boot in due time. Any chance you’ve enabled graphite?

Hi Stephane,

Yes, we have indeed enabled the Graphite data writer in Gatling.

I think that’s the issue: the host where you deploy your injector doesn’t have access to your Graphite server and that causes a timeout.

Ah, I see. Is there any way to circumvent that problem? There may be situations where the Graphite host is not available, and we would accept losing the data at that moment.

Well, don’t enable Graphite if it’s not available/reachable.

Stephane, thanks for your fast responses. We will look into a solution to:

  1. Make the Graphite server more available
  2. Detect the availability of the Graphite server and act upon it in our scripts engine.
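
For the second point, a pre-flight check could look roughly like the sketch below. It is only an illustration, not something posted in this thread: the function names are made up, and port 2003 is assumed because it is the usual Graphite plaintext port. The check opens a TCP connection with a short timeout, and the writer list only includes graphite when that connection succeeds.

```python
import socket
from typing import List


def graphite_reachable(host: str, port: int = 2003, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Port 2003 is an assumption (the usual Graphite plaintext port); adjust as needed.
    """
    try:
        # create_connection resolves the host and connects, raising OSError on
        # DNS failure, refusal, or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def select_data_writers(host: str, port: int = 2003) -> List[str]:
    """Hypothetical helper: pick the Gatling data writers to enable for this run."""
    writers = ["console", "file"]
    if graphite_reachable(host, port):
        # Only add the graphite writer when the server answers.
        writers.append("graphite")
    return writers


# Example call (the host name is a placeholder):
# writers = select_data_writers("graphite.example.internal")
```

How the resulting list ends up in the writers setting of gatling.conf is left to the scripts engine. Note that the check only shows the port was reachable just before the run; the server can still go away while the simulation is starting.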