2.0.0-RC2 Increasing heap size issue

Is that really the behavior you want? How many alive connections do you expect?

I just ran with connection pooling disabled and see the same behavior; I’ll upload a heap dump for that run tomorrow as well. For the next test I’ll indeed discuss and investigate what the expected number of connections will be. But is there a way in Gatling to limit the number of connections used and still emulate an “open system”, in other words put a load on the SUT that is not influenced by the performance of the SUT?

Whether the client (i.e. Gatling/Netty) keeps connections open or not does not block the progress of the next users arriving. So it’ll still be an open model.
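
To make that concrete, an open workload in the Gatling DSL only specifies an arrival rate. Here is a minimal sketch; the class name, scenario, rate and URL are placeholders, not taken from your setup:

```scala
import scala.concurrent.duration._

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class OpenModelSimulation extends Simulation {

  // Placeholder protocol and scenario, just to show the injection profile.
  val httpProtocol = http.baseURL("https://sut.example.com")

  val scn = scenario("open model").exec(http("ping").get("/"))

  // Open injection: new users arrive at a constant rate, whatever the SUT's
  // response times are. Slow responses only increase the number of concurrent
  // users; they never throttle the arrival rate.
  setUp(scn.inject(constantUsersPerSec(150) during (1 hour)))
    .protocols(httpProtocol)
}
```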

As Stéphane says, the connection pooling should match what your system’s clients do. This will be different for different people’s tests. For those who do have clients that pool/keep connections alive, this is a major feature compared with other tools.

I had a look today at some monitoring data from the production environment. It shows 75 concurrent invocations at peak. We want to be able to cope with a failover situation, so we test at a load of 150 tps. Since the response times for the calls are > 1 second, it is safe to say there will be 150 concurrent connections under that load.
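
As a sanity check (my arithmetic, not something we measured directly), that estimate is just Little’s Law:

concurrent requests ≈ arrival rate × response time ≈ 150 req/s × 1 s ≈ 150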

I have checked with developers of the client applications and they told me they don’t use connection pooling, so I will disable that in my Gatling config. I ran a test last night with that setup and still saw the Old Gen heap growing after the SUT hiccups occur, resulting in request timeouts.
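
Concretely, I plan to turn pooling off in gatling.conf; the key names below are from memory of the Gatling 2 default config, so worth double-checking against the bundled conf/gatling.conf:

```
gatling {
  http {
    ahc {
      # Assumed keys (mirroring AHC's allowPoolingConnections settings):
      allowPoolingConnections = false      # don't reuse plain HTTP connections
      allowPoolingSslConnections = false   # don't reuse HTTPS connections
    }
  }
}
```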

Did the heap dumps I shared shed any light on the issue?

Cheers

Daniel

Hello Daniel,

I’m on it. I think I have pinned down several problems and am trying to find the best way to fix them. I should have something for you to try in an hour or so.

I ran a test last night with that setup and still saw the Old Gen heap growing after the SUT hiccups occur, resulting in request timeouts.

I actually think it’s quite the opposite: SUT hiccups => request timeouts => Old Gen heap grows.

Also, do you have any other kinds of errors besides the request timeouts?

@Daniel

I just published a new snapshot (shipping AHC 1.9.0-BETA10).
Could you give it a shot, please?
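
For anyone pulling the snapshot in with sbt, something along these lines should do; the resolver URL and snapshot version below are placeholders, so use whatever the snapshot was actually published as:

```scala
// Hypothetical sbt settings for testing a Gatling snapshot build.
resolvers += "Sonatype OSS Snapshots" at
  "https://oss.sonatype.org/content/repositories/snapshots"

libraryDependencies += "io.gatling.highcharts" % "gatling-charts-highcharts" %
  "2.0.0-SNAPSHOT" % "test" // placeholder version, not the actual snapshot id
```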

Cheers,

Stéphane

Stephane,

I’ll give it a spin this afternoon!

Daniel

Hi Stephane,

Last night I let the test run for about 8 hours under a load of 75 tps. The reason I did not use the target 150 tps load is that I found that under that load, the number of concurrent users can skyrocket to thousands during the hiccups, resulting in very high CPU usage and “connection closed remotely” errors in the log. If I get the chance I will experiment with lowering the request timeout to see if that helps Gatling survive the hiccups.

The good news is that under a load of 75 tps the test ran fine! During the run there were a few hiccups, resulting in 80 request timeouts (see below):

waiting: 10921503 / running: 160 / done: 2175197

---- Requests ------------------------------------------------------------------

Global (OK=2175036 KO=161)

---- Errors --------------------------------------------------------------------

java.util.concurrent.TimeoutException: Request timed out to services.com/ip:443 of 60000 ms    80 (49.69%)

I just did a 150 tps run with the request timeout set to 30000 ms and Gatling survived all the hiccups! (8000+ timed out requests) I am ready for the next challenge :slight_smile: (coming next week)
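
For reference, I lowered the timeout in the same ahc block of gatling.conf; again the key name is from memory, so check it against the bundled default config:

```
gatling {
  http {
    ahc {
      # Assumed key, in ms; down from the 60000 ms default seen in the error above.
      requestTimeout = 30000
    }
  }
}
```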

Have a good weekend,

Daniel

Great news on a Friday!

I’ll tackle the connection pool clean-up on Monday.

Have a nice weekend!

Stéphane