Hitting a maximum requests-per-second rate

Hi,

I am running a simulation that consists of a single request (i.e. nothing waits for a response before triggering another request), but I am unable to achieve a request rate much above 1000 r/s.
https://github.com/bbc/gatling-load-tests/blob/master/src/test/scala/bbc/trafficmanager/TrafficManager.scala

Running JConsole locally, I note that the Java heap doesn’t go above 300 MB, but CPU hits 60% (4 CPUs, 8 GB RAM). However, the same maximum request rate occurs on a c3.2xlarge (8 CPUs, 15 GB RAM). I have set the maximum open-file limit, adjusted the ephemeral port range and even reduced TCP TIME_WAIT.
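
For reference, the settings mentioned above can be checked from the injector with something like the Python sketch below. It is only a rough illustration, not a diagnosis: the function names are hypothetical, the /proc path assumes a Linux injector, and the ceiling calculation assumes that every request opens a fresh connection to the same destination and that a local port stays unusable for the whole TIME_WAIT period. If connections are kept alive and reused, that ceiling does not apply.

```python
import resource
from pathlib import Path


def open_file_limit():
    """Return the (soft, hard) open-file limit for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def ephemeral_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) from the Linux port-range file, or None if unreadable."""
    try:
        low, high = Path(path).read_text().split()
        return int(low), int(high)
    except (OSError, ValueError):
        # File missing (non-Linux host) or not in the expected "low high" form.
        return None


def new_connection_ceiling(low, high, time_wait_seconds):
    """Rough upper bound on new connections per second to one destination.

    Assumes each request uses a fresh connection and that a local port
    cannot be reused for time_wait_seconds after its connection closes.
    """
    return (high - low + 1) / time_wait_seconds


if __name__ == "__main__":
    print("open files (soft, hard):", open_file_limit())
    ports = ephemeral_port_range()
    print("ephemeral ports:", ports)
    if ports is not None:
        # 60 is an assumed TIME_WAIT duration in seconds, for illustration only.
        print("ceiling (conn/s):", new_connection_ceiling(*ports, 60))
```

Running this as the same user (and in the same shell) that launches the load test should show whether the limits actually in effect match the ones that were configured.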

My load injector has a public IP and there are no mechanisms in place to prevent a high incoming load from one IP.

I have run tens of thousands of successful requests (against the same URL) using wrk in order to rule out a network/infrastructure issue.

Could anyone give me any guidance as to why a request-rate limit occurs when the requests are asynchronous?

Aidy