Ephemeral ports exhausted for high load tests

Hi Team,

Context
I am running performance tests with Gatling from a single AWS Linux node with a 16-core CPU. The ephemeral port range on this node has been extended to about 65K ports using the command below.
sudo sysctl -w net.ipv4.ip_local_port_range="1025 65535"

My load test workload is as follows:
users : 15000
total requests/sec : 1800

Problem
When the test reaches maximum load, TCP connections can no longer be established and Gatling throws the error below.
j.n.ConnectException: connect(..) failed: Cannot assign requested address
According to the monitoring graphs, the node has about 65K connections in the tcp_established state while node CPU is at 40%. So there is CPU headroom to generate more load from this node, but no ephemeral ports are available.

Question
For high-throughput or high-concurrency tests, how can I run Gatling from a single node once the ephemeral ports are exhausted? What are the possible solutions to this problem?

Thanks
Sherin Rose

users : 15000

What do you mean exactly?

  • 15,000 users looping and 15,000 keep-alive sockets being reused?
  • or 15,000 concurrent users, each opening a socket, performing some requests (1?) and then closing it?

If the latter, it’s 100% expected that you run out of ephemeral ports.
If this use case is really what you’re trying to achieve (and you haven’t simply overlooked the number of connections when designing an injection profile that matches the behavior of your live system), you have no choice but to distribute your test over multiple machines (available in Gatling Enterprise).
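
To make the second case concrete, here is a rough back-of-the-envelope sketch in Python. It assumes one new connection per request, that each closed connection then lingers in TIME_WAIT on the injector for 60 seconds (the usual Linux value), and a hypothetical 0.5 s connection lifetime. Only the port range and the 1800 requests/sec figure come from the question; the function names are made up for illustration. Keep in mind that the ephemeral range caps connections from one source IP to one destination IP and port.

```python
def ephemeral_port_count(low: int, high: int) -> int:
    """Number of ports in an inclusive ip_local_port_range."""
    return high - low + 1


def ports_held(new_conns_per_sec: float, lifetime_sec: float,
               time_wait_sec: float = 60.0) -> float:
    """Steady-state local ports tied up (Little's law: rate x time held).

    time_wait_sec is an assumption: how long a closed connection keeps
    its local port reserved on the side that closed it.
    """
    return new_conns_per_sec * (lifetime_sec + time_wait_sec)


# Range from the sysctl command in the question.
available = ephemeral_port_count(1025, 65535)
# Assumed: every one of the 1800 requests/sec opens a fresh connection.
needed = ports_held(1800, 0.5)
print(f"available={available}, needed={needed:.0f}, exhausted={needed > available}")
```

With these assumed inputs the estimate needs roughly 108,900 ports against 64,511 available, so exhaustion would be expected. If the virtual users instead reuse keep-alive connections, the new-connection rate drops and the estimate changes accordingly.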

Hi slandelle,

My workload is designed for 15000 concurrent users, and I plan to increase the number of concurrent users in the future as well. So is distributed load testing the only option here?

Defining a workload in terms of concurrent users doesn’t remove the requirement to define realistic virtual user journeys that trigger the proper number of socket openings and closings.

If your virtual user journeys are properly defined, your OS limits are properly tuned, and you still hit ephemeral port exhaustion, then yes, distributing the test is the solution.
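
If distribution does turn out to be necessary, a rough sizing estimate might look like the Python sketch below. The function name, the 80% headroom factor, and the 108,900-port example input are illustrative assumptions, not Gatling settings or measurements from this thread.

```python
import math


def injectors_needed(ports_required: int, low: int = 1025, high: int = 65535,
                     headroom: float = 0.8) -> int:
    """Estimate how many injectors (each with its own source IP) are needed.

    headroom is the assumed fraction of the ephemeral range each injector
    should actually consume, leaving the rest as a safety margin.
    """
    usable = int((high - low + 1) * headroom)
    return math.ceil(ports_required / usable)


# Hypothetical requirement: about 108,900 ports held at steady state.
print(injectors_needed(108_900))
```

This only accounts for ephemeral ports toward a single destination IP and port; CPU, memory, and network bandwidth on each injector still need to be checked separately.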