We’re using Gatling v2.3.0 to run stability simulations of an internal application. We have not yet tried upgrading to v2.3.1, but as far as I can see the release notes contain no changes that touch the injectors, so I have difficulty imagining why the upgrade would change anything in this situation. (We are preparing an upgrade to v3.0, but it will be some time before we get the chance to do that.)
I’m unfortunately unable to provide exact scenario or system details due to project restrictions, but the scenario itself is a fairly long chain of ~40 exec elements, one set of which repeats 50 times. All individual requests are plain GET/POST/PUT/DELETE requests against an otherwise ordinary REST API. It’s a relatively long chain, but apart from the 50x login repeat loop, nothing extraordinary.
Some simple scenario pseudocode:
```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import io.gatling.core.structure.ScenarioBuilder

class Scenario {

  val HappyDayScenario: ScenarioBuilder = scenario("Happy Day")
    .feed(csv("userIDs").random)
    .group("create user")(
      // placeholder for the (redacted) sequence of requests that creates the user
      exec(http("create user").post("/users"))
    ).exitHereIfFailed
    .group("a bunch of logins")(
      repeat(50) {
        exitBlockOnFail(
          // placeholder for the (redacted) sequence of requests that performs a login
          exec(http("login").post("/login"))
        )
      }
    )
    .group("delete the user")(
      exitBlockOnFail(
        // placeholder for the (redacted) sequence of requests that deletes the user
        exec(http("delete user").delete("/users/${userID}"))
      )
    )
}
```
We are using the following injection profile:
```scala
import scala.concurrent.duration._

setUp(
  new Scenario().HappyDayScenario.inject(
    rampUsersPerSec(0) to (0.6) during (10 minutes),
    constantUsersPerSec(0.6) during (72 hours)
  )
)
```
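(As an aside: if it turns out the arrival rate cannot be trusted, a global throttle could at least cap the request rate that reaches the system under test. This is only a sketch we have considered, not something we run; the 25 rps figure is a placeholder, not our actual steady-state request rate.)
```scala
import scala.concurrent.duration._

// Sketch of a safety net: a global throttle caps the requests per second
// generated by the runner. The 25 rps value is a placeholder that would have
// to sit just above the normal steady-state request rate of the real scenario.
setUp(
  new Scenario().HappyDayScenario.inject(
    rampUsersPerSec(0) to (0.6) during (10 minutes),
    constantUsersPerSec(0.6) during (72 hours)
  )
).throttle(
  reachRps(25) in (10 minutes),
  holdFor(72 hours)
)
```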
The goal is an initial slow ramp-up to the target load, followed by 72 hours of static load to measure the stability of the target system over time. This has worked fine over several iterations, but recently the simulation has been behaving very strangely against one specific system: the load generated by the constantUsersPerSec(targetLoad) during (duration hours) step seemingly doubles out of nowhere, for no discernible reason. We have been able to reproduce this behavior three times now.
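For reference, a back-of-the-envelope calculation of how many users this profile should start in total (all numbers taken from the profile above):
```scala
// Expected user counts for the profile above
val rampUsers   = 0.5 * 0.6 * 10 * 60     // linear ramp 0 -> 0.6 users/s over 10 min ≈ 180 users
val steadyUsers = 0.6 * 72 * 3600         // 0.6 users/s for 72 h = 155,520 users
val totalUsers  = rampUsers + steadyUsers // ≈ 155,700 users overall
```
The "waiting" counter in the console log starts from roughly this total and should drain at a constant 0.6 users per second for the whole constant phase.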
It’s easy to jump to the conclusion that there must be some sort of error in the target application - but as far as I understand the constantUsersPerSec injector, it should always, under any and all circumstances, do exactly what it says on the tin: inject a fixed number of users every second.
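One way to test that assumption that we have considered (only a sketch so far, we have not run it) is a simulation that exercises nothing but the injector, with no HTTP traffic at all; the class and scenario names below are made up:
```scala
import io.gatling.core.Predef._
import io.gatling.core.scenario.Simulation
import scala.concurrent.duration._

// Pause-only scenario: any change of pace in the waiting/done counters here
// could only come from the injector or the load-generator host, never from
// the system under test.
class InjectorOnlySimulation extends Simulation {

  val injectorCheck = scenario("injector check").pause(1) // 1-second pause, nothing else

  setUp(
    injectorCheck.inject(
      rampUsersPerSec(0) to (0.6) during (10 minutes),
      constantUsersPerSec(0.6) during (72 hours)
    )
  )
}
```
If the counters drift there as well, the injector or the generator host is suspect; if they stay perfectly linear, the cause more likely lies in the interaction with this one specific system.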
The main reason we believe there has to be some bizarre behavior in the injector is that the waiting/done scenario counters in the Gatling console output very distinctly change pace, after having moved at a constant, steady rate for most of the run, as expected. This always happens near the end of the 72-hour period, where the waiting/done curves clearly bend, as shown in the plot in the original post linked at the bottom.
Similar behavior is of course apparent in the Gatling report and in the hardware metrics as well - everything indicates a sudden increase in the load generated by the Gatling test runner. As far as we can see, no errors are generated in the application under test; the stability tests run at roughly 50% of the system’s total capacity, so even at a doubled rate it behaves correctly.
We have of course double- and triple-checked that the simulation actually runs with the injection profile above, and that the configured targetLoad values are static and do not change during the run (though I don’t think the rate can be dynamic in the first place).
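For completeness, this is roughly how we could pin the values down even harder: resolve the rate exactly once, log it, and reference only that val in the profile. The targetLoad system property below is made up for illustration:
```scala
import scala.concurrent.duration._

// Resolve the target rate once and log it, so the console output of every run
// documents exactly which value was used.
val targetLoad: Double = sys.props.get("targetLoad").map(_.toDouble).getOrElse(0.6)

println(s"Injection profile: ramp 0 -> $targetLoad users/s over 10 minutes, then hold for 72 hours")

setUp(
  new Scenario().HappyDayScenario.inject(
    rampUsersPerSec(0) to targetLoad during (10 minutes),
    constantUsersPerSec(targetLoad) during (72 hours)
  )
)
```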
We have been unable to explain what causes this, so we are asking here for a second set of eyes before we work through all the upgrade steps or open an issue on the Gatling GitHub tracker. Maybe there is something obvious we’re missing.
Is this a possible bug in the constantUsersPerSec feature itself, or can anyone identify a likely cause of this behavior - be it a misunderstanding on our part or incorrect usage?
(Question copy-pasted from the original post on Stack Overflow: https://stackoverflow.com/questions/53043655)