Strange behaviour GraphiteDataWriter Gatling-2.1.2

Daniel_Moll · January 9, 2015, 1:07pm

Hi Guys,

I found some really strange GraphiteDataWriter behavior in Gatling-2.1.2. It seems to fail for certain simulation setups, specifically simulations with a long duration.

If I use this setup:

setUp(
  Scenarios.scn
    .inject(
      rampUsersPerSec(1.0) to (200.0) during (1800),
      constantUsersPerSec(200.0) during (86400))
    .exponentialPauses
).protocols(httpProtocol)

the GraphiteDataWriter works, but if I change the duration to this:

setUp(
  Scenarios.scn
    .inject(
      rampUsersPerSec(1.0) to (200.0) during (1800),
      constantUsersPerSec(200.0) during (172800))
    .exponentialPauses
).protocols(httpProtocol)

it pushes a few datapoints at the beginning of the scenario and then stops.

I can reproduce the issue, so it doesn't seem to be a coincidence. With other scripts I was able to do 48-hour runs without problems, so it seems to be a combination of simulation load and duration. Could it be something memory related?

My GraphiteDataWriter configuration looks like this:

    graphite {
      light = false
      host = "host"
      port = 2113
      protocol = "tcp"
      rootPathPrefix = "gatling2"
      writeInterval = 10
      #bucketWidth = 100
      #bufferSize = 8192
    }

cheers

Daniel

Excilys · January 9, 2015, 1:24pm

Weird, I fail to see how run duration and graphite could be related.

Your issue could be related to the supervisor strategy we introduced: if the GraphiteDataWriter fails to reconnect to Graphite more than 5 times in 5 seconds, it stops. Are you sure you don’t have connectivity issues between the Gatling host and the Graphite one?
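
To make the "more than 5 times in 5 seconds" rule concrete, here is a minimal, dependency-free sketch of such a sliding-window policy. It is not Gatling's implementation; `RestartPolicySketch`, `RestartWindow`, `recordFailure` and `decisions` are hypothetical names used only for illustration, and timestamps are plain milliseconds supplied by the caller.

```scala
import scala.collection.immutable.Queue

// Illustrative model only: NOT Gatling's code. All names here are hypothetical.
object RestartPolicySketch {

  // Tracks failure timestamps (ms) that fall inside a sliding time window.
  final case class RestartWindow(
      maxFailures: Int,
      windowMillis: Long,
      failures: Queue[Long] = Queue.empty) {

    // Records a failure at `nowMillis`. Returns the updated window and
    // true if a restart is still allowed, false if the component must stop.
    def recordFailure(nowMillis: Long): (RestartWindow, Boolean) = {
      // Keep only failures that are still inside the window, then add this one.
      val recent = failures.filter(t => nowMillis - t < windowMillis) :+ nowMillis
      (copy(failures = recent), recent.size <= maxFailures)
    }
  }

  // Replays a sequence of failure times and returns the decision for each one.
  def decisions(policy: RestartWindow, failureTimes: Seq[Long]): List[Boolean] =
    failureTimes.foldLeft((policy, List.empty[Boolean])) {
      case ((window, acc), t) =>
        val (next, mayRestart) = window.recordFailure(t)
        (next, acc :+ mayRestart)
    }._2
}
```

With `RestartWindow(5, 5000L)`, six failures within half a second are rejected on the sixth one, while failures spaced two seconds apart never trip the limit. A writer stopped by such a rule would go silent for the rest of the run, which is why connectivity was the first suspect here.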

Daniel_Moll · January 9, 2015, 1:32pm

That was my first guess as well, but it works fine if I change the setup to half the duration. I tried it dozens of times, and as soon as I increase the duration it fails after sending just a few data points.

Daniel_Moll · January 9, 2015, 1:38pm

But would there be a difference in the initial calls to Graphite in a simulation with a longer duration, somehow causing the TCP connection to fail?

Excilys · January 9, 2015, 3:35pm

Absolutely not, that’s why it’s so weird.

Except for Graphite, are you sure that the HTTP requests are sent as expected (with expected load)?

Daniel_Moll · January 9, 2015, 8:16pm

Hi Stephane,

I have made a Wireshark capture of both setups and I don't see any connection resets or errors. The client just stops sending packets after a minute or so in the 48 hour setup, really strange. When I compare the captured streams I can't see any obvious differences. Would you like me to share the capture files?

Cheers

Daniel

Excilys · January 12, 2015, 11:37am

Hi Daniel,

Found it (and fixed it): https://github.com/gatling/gatling/issues/2502

The issue was actually that virtual user scheduling was not properly lazy. As a consequence, the heap filled up quickly under such a heavy and long load, causing permanent GC that froze the JVM until it finally died with an OOM (I guess you killed the JVM before that happened).

Thanks a lot for reporting!
Cheers,

Stéphane
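
For readers wondering why laziness matters here: at 200 users per second, the constant phase alone schedules 200 × 172800 = 34,560,000 users for the 48-hour setup, versus 17,280,000 for the 24-hour one. The sketch below is not Gatling's scheduler; `InjectionSketch`, `constantRateOffsets` and `totalUsers` are hypothetical names. It only illustrates the general idea of producing start offsets on demand instead of materialising all of them up front.

```scala
// Illustrative model only: NOT Gatling's scheduler. All names are hypothetical.
object InjectionSketch {

  // Number of users started by a constant arrival rate over a duration.
  def totalUsers(usersPerSec: Int, durationSec: Long): Long =
    usersPerSec.toLong * durationSec

  // Lazily yields each user's start offset in milliseconds.
  // Elements are computed one at a time as the iterator is consumed,
  // so nothing close to `totalUsers` entries is ever held in memory.
  def constantRateOffsets(usersPerSec: Int, durationSec: Long): Iterator[Long] = {
    val total = totalUsers(usersPerSec, durationSec)
    Iterator
      .iterate(0L)(_ + 1)
      .takeWhile(_ < total)
      .map(i => i * 1000L / usersPerSec)
  }
}
```

Calling `InjectionSketch.constantRateOffsets(200, 172800L).take(3).toList` touches only the first three offsets. Forcing the same iterator into a strict collection (for example with `.toVector`) would allocate tens of millions of entries at once, which is the kind of heap pressure described in the fix.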

Daniel_Moll · January 12, 2015, 2:19pm

Nice!

I was starting to worry it was me doing something wrong. Is release 2.1.3 still due for today?

Cheers

Daniel

Pierre_DAL-PRA · January 12, 2015, 2:24pm

Hi Daniel,

We’ll release 2.1.3 no later than tomorrow.
As always, we’ll announce on the mailing list when it’s available.

Cheers,

Pierre
