Handshake timed out after 10000ms under heavy load

Hi, I have an issue when ramping up to a certain number of users, with errors such as this:
i.n.h.s.SslHandshakeTimeoutException: handshake timed out after 10000ms
This is my injection profile:

.inject(
  rampConcurrentUsers(1).to(5000).during(2500),
  constantConcurrentUsers(5000).during(SteadyLength),
  rampConcurrentUsers(5000).to(1).during(500)
)

Usually the error appears around the 900-second mark (about 1800 users reached); the REST API requests that follow then fail, mostly with 500, 503, and 504 errors.

The odd thing is that when I change the target to 2000 users with a 1000-second ramp time, everything runs smoothly without any issues. I checked the logs, and at the same 400-second timestamp the total connections made in the two profiles differed hugely (around 1500 users vs 6000 users). What could have happened in this profile to cause such a difference?
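If the logged figures were concurrent users, the two profiles should actually look almost identical at the 400-second mark, since both ramps grow at roughly 2 users per second. A quick back-of-the-envelope check (plain Java, not Gatling internals; `usersAt` is my own helper, and I'm assuming both ramps start from 1 user):

```java
// Back-of-the-envelope check: concurrent users at time t under a linear
// ramp from `from` to `to` over `duration` seconds, which is what
// rampConcurrentUsers(from).to(to).during(duration) is documented to do.
public class RampMath {
    static double usersAt(double t, double from, double to, double duration) {
        return from + (to - from) * Math.min(t, duration) / duration;
    }

    public static void main(String[] args) {
        // Both profiles ramp at ~2 users/sec, so at t = 400 s they imply
        // almost the same concurrency (~800 users each):
        System.out.println(usersAt(400, 1, 5000, 2500)); // ramp 1 -> 5000 over 2500 s
        System.out.println(usersAt(400, 1, 2000, 1000)); // ramp 1 -> 2000 over 1000 s
    }
}
```

Since both profiles imply roughly 800 live users at that timestamp, the 1500 vs 6000 figures in the log are presumably cumulative connections rather than live users, which is where retried handshakes would show up.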

TLS handshakes are nowadays very expensive CPU-wise, because of the long keys and complex ciphers.
Your server is most likely saturated, Gatling is just the messenger here.

Hi @slandelle , thanks for your perspective on this matter.
Can you also explain the last paragraph I wrote there? I just want to understand more about how Gatling creates virtual users in a thread; hopefully it won't be a hard topic to understand.

when I change the target to 2000 users with a 1000-second ramp time, everything runs smoothly without any issues. I checked the logs, and at the same 400-second timestamp the total connections made in the two profiles differed hugely (around 1500 users vs 6000 users). What could have happened in this profile to cause such a difference?

If you reduce the number of users per second, you’re obviously going to reduce the number of TLS handshakes per second.
Moreover, when you start experiencing failures, depending on how your scenario is designed, the users whose (TCP connect + TLS handshake) failed are going to try again on the next request.
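The retry effect can be illustrated with a toy model (entirely my own sketch, not Gatling internals): suppose the ramp adds 2 users per second and some fraction of handshakes fail and are retried one second later. Cumulative connection attempts then far exceed live users. The 75% failure rate below is an arbitrary illustration, yet it already produces roughly a 4x gap, the same order of magnitude as 1500 vs 6000:

```java
// Toy model of connection retries under handshake failures.
// Assumption (mine, for illustration): each second `rate` new users arrive,
// a fraction `failRate` of handshake attempts fail, and every failed
// attempt is retried the next second.
public class RetryModel {
    // Returns {liveUsers, totalConnectionAttempts} after `seconds` of ramp.
    static double[] simulate(double rate, double failRate, int seconds) {
        double pendingRetries = 0, totalAttempts = 0, liveUsers = 0;
        for (int t = 0; t < seconds; t++) {
            double attempts = rate + pendingRetries; // new users + retries
            totalAttempts += attempts;
            double failed = attempts * failRate;
            liveUsers += attempts - failed;          // only successes go live
            pendingRetries = failed;                 // failures retry next tick
        }
        return new double[]{liveUsers, totalAttempts};
    }

    public static void main(String[] args) {
        double[] r = simulate(2.0, 0.75, 400);
        // Attempts far exceed live users once handshakes start failing.
        System.out.printf("live=%.0f totalConnectionAttempts=%.0f%n", r[0], r[1]);
    }
}
```

In steady state, attempts per second settle at rate / (1 - failRate) = 8, while successful arrivals stay at 2 per second, so the "connections made" counter inflates 4x even though live concurrency matches the healthy run.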


Hi @slandelle , I think this is weird behavior, as I tested with these two scenarios:

2000 Users, ramping rate 2 Users / sec (ramping time 1000 seconds), duration 1 hour
5000 Users, ramping rate 2 Users / sec (ramping time 2500 seconds), duration 1 hour

In the 2000-user case everything is still fine, while the 5000-user run started to crash at 15 minutes.
With logging enabled, the huge difference is still there (1500 users opened vs 6000 users opened at the same timestamp).
Is the total number of users calculated by Gatling beforehand and all pushed to the server at the same time?
Please explain this to me in more detail.

Sorry, but we can’t help without a reproducer. Moreover, if this is about the third-party gRPC plugin, that’s not something we support here.

Hi @slandelle , I only cover REST over HTTP in this topic; gRPC was in the past :blush: After further discussion we figured out where the problem is. I have learnt a lot from this, thank you.

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.