# Reading constantUsersPerSec with Number of Active Sessions

```scala
scn.inject(
  rampUsers(1) over (5 seconds),
  constantUsersPerSec(15) during (120 minutes)
)
```

What looks strange to me is the report:

The “lower step” corresponds to the load when everything is OK: no errors, no timeouts.

The “higher step” corresponds to the load when errors start to appear and the system actually “collapses”.

On the lower “step” the average is 34 (c3.xlarge), 55 (c3.2xlarge), or 110 (c3.4xlarge), depending on the target machine.

On the higher step it is 1000, 1100, or 1300.

I’m trying to understand “Active Sessions” vs “constantUsersPerSec”.

My understanding is that active sessions are aggregated over 10-second windows, and Gatling together with the tested system is able to “complete” more than 15 (the target constant) users per 10 seconds. And when the system crashes, users die off even more quickly…

Does that look right?

constantUsersPerSec is an input to the test, and active users/sessions is an output.

Active users/sessions is the number of users executing the scenario at a given point in time, including while pausing.

Given the input rate is constant at 15, it is likely the SUT had an issue causing the response time to increase. This caused the number of users executing a scenario at any point in time to increase, resulting in the trend in the chart.
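The relationship Alex describes is Little’s Law: the average number of active users equals the arrival rate times the average time a user spends in the scenario. A minimal Python sketch, with hypothetical scenario durations chosen only to match the magnitudes in the chart:

```python
# Little's Law: L = lambda * W
#   L      = average number of active users (what the Gatling chart shows)
#   lambda = arrival rate, fixed here by constantUsersPerSec(15)
#   W      = average time a user spends in the scenario, in seconds

def active_users(arrival_rate, scenario_duration):
    """Average concurrency predicted by Little's Law."""
    return arrival_rate * scenario_duration

rate = 15.0  # users/second, from constantUsersPerSec(15)

# Healthy SUT: a scenario completes in ~2.3 s (hypothetical value)
print(active_users(rate, 2.3))   # ~34.5, the scale of the lower step

# Degraded SUT: responses stretch to ~70 s (hypothetical value)
print(active_users(rate, 70.0))  # 1050.0, the scale of the higher step
```

The arrival rate never changes, so the only way the active-user count can climb from ~34 to ~1000 is for the time each user spends in the scenario to grow roughly thirty-fold.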

Alex,

Thanks, most of your post makes sense to me and explains the picture.
But even working it through on paper I don’t get it: why does a longer response time cause active users to increase? I must be missing something.

Here are my points:

• On a more powerful machine, response time is shorter and active users are higher. (This contradicts the assumption that a longer response time means more active users.)
• If the response time is large enough, active users should NOT exceed “constant users per second” (CUpS). I think of it like this: users started in interval n−1 stay active in interval n and complete no earlier than interval n+1, so the active user count should equal CUpS. Active Users Graph: _/
• At a particular point in time (say n+1), responses come back and Gatling starts new users; at that moment active users might be higher, but at n+2 they should “hold” until the next long response. Active Users Graph:

So why, when the system slows down, does the active user count increase many times over, while my expectation was that Gatling would decrease the pressure on the SUT and the active user count would look like “a saw”: /_/_/_/.

Any thoughts? Where is my thinking wrong?

Hi Dmytro,

> Alex,
>
> Thanks, most of your post makes sense to me and explains the picture.
> But even working it through on paper I don’t get it: why does a longer response time cause active users to increase? I must be missing something.

Think about a real physical shop.
The users arrive through the entrance.
Once the items to purchase are selected, the users go to pay. There’s a queue at the cashier.
The cashier takes some time to process your items.

The queue plus the person being served is the active users.
If the cashier gets slower what happens to the active users?
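The shop can be sketched as a tiny discrete-time simulation (an illustration under simplifying assumptions, not Gatling internals): customers arrive at a fixed rate no matter how busy the shop is, the cashier serves a fixed number per second, and “active” is everyone who has arrived but not yet finished.

```python
def simulate(arrival_rate, service_rate, seconds):
    """Open-model queue: arrivals keep coming regardless of queue length.

    Returns the number of active customers at the end of each second."""
    active = 0
    history = []
    for _ in range(seconds):
        active += arrival_rate               # new customers walk in
        active -= min(active, service_rate)  # cashier serves what it can
        history.append(active)
    return history

# Fast cashier (20/s) keeps up with 15 arrivals/s: active stays flat
print(simulate(15, 20, 5))  # [0, 0, 0, 0, 0]

# Slow cashier (5/s): active customers grow without bound
print(simulate(15, 5, 5))   # [10, 20, 30, 40, 50]
```

With a constant arrival rate there is no feedback: a slower cashier does not slow down the entrance, so the crowd inside the shop just keeps growing.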

> Here are my points:
>
> • On a more powerful machine, response time is shorter and active users are higher. (This contradicts the assumption that a longer response time means more active users.)

That doesn’t sound right, but is the difference within the error margin?
I’ll need to take a closer look.

> • If the response time is large enough, active users should NOT exceed “constant users per second” (CUpS).

No. They are not comparable. Think back to the real shop example.

> I think of it like this: users started in interval n−1 stay active in interval n and complete no earlier than interval n+1, so the active user count should equal CUpS. Active Users Graph: _/
>
> • At a particular point in time (say n+1), responses come back and Gatling starts new users; at that moment active users might be higher, but at n+2 they should “hold” until the next long response. Active Users Graph:
>
> So why, when the system slows down, does the active user count increase many times over, while my expectation was that Gatling would decrease the pressure on the SUT and the active user count would look like “a saw”: /_/_/_/.
>
> Any thoughts? Where is my thinking wrong?

You are comparing concurrency with an arrival rate, which can’t be done directly.
Have a think about the real shop.
Ping back if you’re not making progress.

Thanks
Alex

Alex,

OK, thanks a lot. The real shop example made things clear.

Off topic: is there a way to keep Active Users constant for a period of time?

Best Regards,
Dmytro

Hi Dmytro,

No problem. What’s your use case for that?
Thanks,
Alex

Hi Dmytro,

Note that it could prove really difficult to keep the number of active users constant for a period of time: you can easily inject a known number of users per second with constantUsersPerSec, but to keep the active user count constant while testing, you’d need to know how many users your system can handle per second, so that the number of new users injected stays in balance with the number of users finishing the scenario.

Cheers,

Pierre

Pierre, Alex,

For me the use case looks like this:

• Slowly run scenarios until they succeed. A scenario corresponds to creating an entity in the SUT.
• Monitor how response time and latency change over time (more entities, slower responses).
• The test deliberately runs longer than the SUT can survive.
• Execute the same test against SUTs deployed on different configurations/VMs.
• When the SUT dies or the response time gets too long, that point is the “Capacity in Entities” the configuration can support.

=> Produce a report correlating capacity and configuration (linear, log, exp, etc.)

Does that sound reasonable?

The next use case will be to adapt the scenario above for regression testing, stopping the test when capacity is reached (timeouts and errors start to appear).
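The stop-at-capacity step could be sketched as follows, under the assumption that the test can sample the cumulative entity count and the error rate per interval (the function name and thresholds are hypothetical, not part of the Gatling API):

```python
def capacity_in_entities(samples, max_error_rate=0.01):
    """samples: list of (entities_created_so_far, error_rate) per interval.

    Returns the entity count at the last healthy interval, i.e. the
    'Capacity in Entities', or None if the SUT never degraded."""
    last_healthy = None
    for entities, error_rate in samples:
        if error_rate > max_error_rate:
            return last_healthy  # capacity reached in the previous interval
        last_healthy = entities
    return None  # SUT survived the whole test window

# Hypothetical run: errors appear once ~300 entities exist
samples = [(100, 0.0), (200, 0.0), (300, 0.005), (400, 0.08), (500, 0.5)]
print(capacity_in_entities(samples))  # 300
```

Running the same detection against each configuration/VM would yield the capacity-vs-configuration pairs for the correlation report mentioned above.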

P.S. The SUT is not an HTTP server; it has an internal API that communicates over AMQP + an Akka cluster (or a single Akka node). Thus configuration changes can be vertical (a more powerful VM) or horizontal (more Akka nodes).

Hi Dmytro,

Looks reasonable.
As far as I can tell, there is no need to keep Active Users constant, assuming the test objective is to measure the peak sustainable throughput of the different configurations.

Thanks,
Alex