Differences in throughput between 1.5.2 and 2.0.0-RC5

John_Huffaker · September 17, 2014, 8:38pm

Hey All,

Hopefully I’ve just made a stupid error, but I tried benchmarking gatling between 1.5.2 and 2.0.0-RC5 against the same /ping => pong service and got much different peak throughputs.
The 1.5.2 version peaks at ~23k RPS and the 2.0.0-RC5 peaks out at about 7.5k RPS.

Further info:

Done on a macbook pro (java 1.7.0-67), invoked via gatling.sh
Server on same box as load driver
Both runs were done against a totally warm service.
Yes we would like to be able to drive more than 20k. We can and have done this over the network on a harder workload with 1.5.2 gatling.
I’ve tried tweaking http settings and didn’t manage to get any wins.

The main differences between the two are:

Gatling version
.users(1000) vs .inject(atOnceUsers(1000))

The scripts used on each:

--- 1.5.2 ---

val httpConf = httpConfig.baseURL("http://localhost:9114")

val scn =
  scenario("Ping Simulation")
    .during(10 seconds)(
      exec(http("ping")
        .get("/ping")
        .check(status.in(Seq(200)))
        .check(bodyString.is("pong\n")))
    )
    .users(1000)
    .protocolConfig(httpConf)

setUp(scn)

--- 2.0.0-RC5 ---

val httpConf = http.baseURL("http://localhost:9114")

val scn = scenario("Ping")
  .during(10 seconds)(
    exec(http("ping")
      .get("/ping")
      .check(status.is(200))
      .check(bodyString.is("pong\n"))))

setUp(scn.inject(atOnceUsers(1000)).protocols(httpConf))

Excilys · September 19, 2014, 7:40pm

What did your gatling.conf file look like in 1.5.2?
Are you sure you weren’t sharing the connection pool amongst virtual users?
Is there any way you can share a reproducer?

Excilys · September 19, 2014, 9:40pm

Mmmm, I see you’re not setting the Connection header to keep-alive. Is this intended?
I’m starting to wonder if keep-alive was somehow enforced in Gatling 1.

Excilys · September 20, 2014, 1:06pm

I have some great news!

There was indeed a huge performance regression since Gatling 1 for your use case (it also affected general web browsing scenarios, in a less obvious fashion).
The fix was actually very simple (a very stupid, very localized mistake): https://github.com/gatling/gatling/issues/2223

On my test, Gatling 2 went from 66% slower to 59% faster than Gatling 1.5!

All of this past year’s hard work has indeed paid off!

Thanks for reporting this.

Cheers,

Stéphane

Excilys · September 20, 2014, 1:09pm

Just one last thing: my test case is the same as yours: small service replying “pong\n”, Gatling and server on same host, OSX, latest JDK7.
Once Gatling is warm (after 15 sec), I get ~33k RPS.

John_Huffaker · September 21, 2014, 7:46pm

Awesome! Thanks for digging in. I was trying to grab a snapshot to try out your change but it looks like they are lagging the repo’s commits by a little bit. I’ll follow up early next week if anything goes awry.

Regards,
John

Excilys · September 21, 2014, 9:18pm

Mistake in our build chain.
Snapshots are back on Sonatype.

John_Huffaker · September 22, 2014, 6:36pm

Awesome, thanks for fixing that.

I ran the test this morning on my laptop and peak throughput went from 7.5k rps with RC5 to 38k rps on the SNAPSHOT! Thanks again for digging in to an issue that could have easily been ignored!
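
For a sense of scale, the relative changes can be worked out from the peak figures quoted in this thread (~23k RPS on 1.5.2, ~7.5k on RC5, 38k on the SNAPSHOT). The class and method names below are made up for illustration, and the inputs are rounded peaks rather than exact measurements, so treat this as a back-of-the-envelope sketch:

```java
// Back-of-the-envelope comparison of the peak throughputs quoted in this thread.
// ThroughputChange and percentChange are illustrative names, not part of Gatling.
class ThroughputChange {

    /** Percent change of candidateRps relative to baselineRps (negative means slower). */
    static double percentChange(double baselineRps, double candidateRps) {
        return (candidateRps - baselineRps) / baselineRps * 100.0;
    }

    public static void main(String[] args) {
        double gatling152 = 23_000; // ~23k RPS peak on 1.5.2
        double rc5 = 7_500;         // ~7.5k RPS peak on 2.0.0-RC5
        double snapshot = 38_000;   // 38k RPS peak on the SNAPSHOT

        System.out.printf("RC5 vs 1.5.2:      %.1f%%%n", percentChange(gatling152, rc5));
        System.out.printf("SNAPSHOT vs 1.5.2: %.1f%%%n", percentChange(gatling152, snapshot));
        System.out.printf("SNAPSHOT vs RC5:   %.2fx%n", snapshot / rc5);
    }
}
```

With these inputs RC5 comes out roughly 67% slower than 1.5.2, and the SNAPSHOT roughly 65% faster than 1.5.2, or about 5x faster than RC5.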

Excilys · September 22, 2014, 6:43pm

That’s very impressive!

Could you share how you tuned your OSX, please?
Is there also any way you could share/explain your ping service? Spray? How is it tuned?

John_Huffaker · September 22, 2014, 8:18pm

Note, that is 38k peak (i.e. looking at the throughput graph over time) rather than the mean. The means were in the 27k-29k range.
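
To make the peak-versus-mean distinction concrete, here is a small sketch that computes both from per-second request counts. The sample values are invented for illustration (they are not taken from any actual run), and the class name is hypothetical:

```java
import java.util.Arrays;

// Peak vs. mean throughput over a run, given requests completed in each second.
// PeakVsMean is an illustrative name; the sample data in main is made up.
class PeakVsMean {

    /** Highest per-second count, i.e. the top of the throughput graph. */
    static long peak(long[] perSecondCounts) {
        return Arrays.stream(perSecondCounts).max().orElse(0L);
    }

    /** Average per-second count over the whole run, warm-up included. */
    static double mean(long[] perSecondCounts) {
        return Arrays.stream(perSecondCounts).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Hypothetical run: slow first seconds while the JVM warms up, then a plateau.
        long[] samples = {12_000, 24_000, 29_000, 38_000, 31_000, 28_000};
        System.out.println("peak = " + peak(samples));
        System.out.println("mean = " + mean(samples));
    }
}
```

Because the warm-up seconds drag the average down, a short run can show a peak well above its mean, which is why the two numbers should not be compared against each other.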

Yeah, here’s my info: MacBook Pro, Retina, 2.7 GHz Core i7 (4 cores, 8 virtual cores), 16 GB memory, SSD, OS X 10.9.4.
You’ve seen my simulation. I use the default gatling config. I don’t think I’ve really adjusted much on OSX beyond what you guys suggest for open file descriptors and whatnot.

The service itself is actually really boring: Jetty 8 plus a servlet not built on any framework, with no SSL. Unfortunately it is tightly bound to our internal service infrastructure, so I can’t usefully put up the Jetty part of the extraction, but here’s the servlet:

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

/**
 * An HTTP servlet which outputs a {@code text/plain} {@code "pong"} response.
 */
public class PingServlet extends HttpServlet {
    private static final String CONTENT_TYPE = "text/plain";
    private static final String CONTENT = "pong";

    @Override
    protected void doGet(HttpServletRequest req,
                         HttpServletResponse resp) throws ServletException, IOException {
        resp.setStatus(HttpServletResponse.SC_OK);
        resp.setHeader("Cache-Control", "must-revalidate,no-cache,no-store");
        resp.setContentType(CONTENT_TYPE);
        final PrintWriter writer = resp.getWriter();
        try {
            writer.println(CONTENT);
        } finally {
            writer.close();
        }
    }
}

Here are the settings on the embedded Jetty 8 we use:

val acceptors = Some(1)
val acceptQueueSize = None
val minThreads = Some(3)
val maxThreads = Some(10)

with no real access logging going on. If you are using any sort of access or per-request logs, make sure to turn on async flushing / async appenders, or you’ll probably spend a lot of time waiting for disk syncs.
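
The idea behind an async appender can be sketched in plain Java: request threads only enqueue a line, and a single background thread does the actual writing and flushing. This is a minimal illustration with a hypothetical AsyncLineLogger class; it is not Jetty’s or any logging framework’s API, and a real appender would handle errors and overflow more carefully:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical async line logger: callers enqueue, one background thread writes.
class AsyncLineLogger implements AutoCloseable {
    // Unique sentinel object, compared by identity, that tells the writer to stop.
    private static final String STOP = new String("stop");

    private final BlockingQueue<String> queue;
    private final Thread writerThread;

    AsyncLineLogger(Writer out, int capacity) {
        this.queue = new LinkedBlockingQueue<String>(capacity);
        this.writerThread = new Thread(() -> drain(out), "access-log-writer");
        this.writerThread.setDaemon(true);
        this.writerThread.start();
    }

    /** Called on the request thread. Never blocks; returns false if the queue is full and the line is dropped. */
    boolean log(String line) {
        return queue.offer(line);
    }

    private void drain(Writer out) {
        try {
            while (true) {
                String line = queue.take();
                if (line == STOP) {
                    break;
                }
                out.write(line);
                out.write('\n');
                if (queue.isEmpty()) {
                    out.flush(); // flush only when caught up, so bursts are batched
                }
            }
            out.flush();
        } catch (IOException e) {
            // A real appender would report this somewhere; the sketch just stops writing.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Writes everything queued so far, then stops the background thread. */
    @Override
    public void close() {
        try {
            queue.put(STOP);
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StringWriter sink = new StringWriter(); // stands in for a log file
        try (AsyncLineLogger logger = new AsyncLineLogger(sink, 1024)) {
            logger.log("GET /ping 200");
        }
        System.out.print(sink);
    }
}
```

The design choice here is that a full queue drops the line instead of stalling the request thread, trading log completeness for throughput; whether that trade is acceptable depends on what the logs are for.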

We’ve also had some issues with back pressure for async servlets, where the server lets a lot of work into the system and then proceeds to thrash on execution contexts. You’ll notice my minThreads and maxThreads above are pretty low, so whatever you use to provide back pressure in Spray or another framework, make sure you use it. I tweaked the minThreads and maxThreads numbers up to 8/16 and the throughput stayed about the same.
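
One generic way to get this kind of back pressure in plain Java is a thread pool with a small thread count and a bounded queue that rejects work once both are full. The sketch below uses java.util.concurrent.ThreadPoolExecutor; it is an analogy, not Jetty’s own thread pool, and the class name and the queue size of 20 are assumptions picked for illustration (only the 3/10 thread counts mirror the settings above):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical bounded worker pool: excess work is rejected instead of piling up.
class BoundedWorkerPool {

    static ThreadPoolExecutor create(int minThreads, int maxThreads, int queueSize) {
        return new ThreadPoolExecutor(
                minThreads, maxThreads,
                60L, TimeUnit.SECONDS,                     // idle time before extra threads exit
                new ArrayBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());     // throw when threads and queue are full
    }

    /** Submits tasks that block until release is counted down; returns how many were rejected. */
    static int submitBurst(ThreadPoolExecutor pool, int tasks, CountDownLatch release) {
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a slow request
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // a server would answer 503 or similar here
            }
        }
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = create(3, 10, 20);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = submitBurst(pool, 100, release);
        System.out.println("rejected = " + rejected); // 100 - (10 threads + 20 queued) = 70
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With 10 threads and 20 queue slots, a burst of 100 blocking tasks admits 30 and rejects 70, so the amount of in-flight work stays capped no matter how fast the load driver pushes.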

Empty_Account1 · September 22, 2014, 9:23pm

Hi John,

You mention you managed to get around 27k-29k RPS. Do you think this load is sustainable using Gatling with the hardware specification you gave of your laptop?

Aidy

Santhosh · September 22, 2014, 9:38pm

Does this mean we should use the SNAPSHOT version rather than RC5?

Excilys · September 23, 2014, 8:17am

If you can afford to run snapshots, yes.

Excilys · September 23, 2014, 8:34am

So we have similar results.
I also peak at 38k rps with my Spray app. Mean is about 36k rps once the JVM is warm (after ~20s).

John_Huffaker · September 23, 2014, 9:57pm

There are a lot of problems with the test setup that I specified. I mostly do it as a smell test to figure out where certain limits are in ideal conditions. Things definitely shift as you add the network and the request latency of a real service. There are a ton of variables to play with which can adjust these numbers. Your best bet is to get multiple load drivers and services and scale throughput that way, rather than relying on a single laptop to get an accurate picture.

Daniel_Moll · September 29, 2014, 2:34pm

Hi Stephan,

Will this fix also result in a lower CPU usage?

Cheers

Daniel

Excilys · September 29, 2014, 2:38pm

Absolutely.

Daniel_Moll · September 29, 2014, 2:43pm

OK thanks, back to SNAPSHOT then

Excilys · September 29, 2014, 2:44pm

Don’t bother, RC6 will be out in an hour, tops.
