- HTML mode recording / playback support:
As far as I can see, the main protocol gatling supports is pure HTTP
calls. However, these calls do very little parsing on the result and
require manually recording each one and playing them back.
In loadrunner however there is something called "HTML mode", where all
page resources are fetched automatically, on the fly, without needing to
manually re-record everything for each change to the page layout.
It also probably doesn't fetch style images from the css.
Actually, I'm pretty sure it does.
However, anything that isn't fetchable from inline page resources (due to
being requested from JavaScript, for instance) ends up listed separately.
So scripts recorded in HTML mode will contain statements like:
web_url("Home", "URL=http://server/index.html", "Mode=HTML",
    EXTRARES, "Url=/images/logo.gif", ENDITEM, LAST);
What is seen by the recorder as a page resource and what is seen as a top-
level item is controlled by a default list of content-types for which
separate elements are created, and that list is configurable.
So for instance, JSON resources are not in the default content-type list
for top level recordable items, and end up in EXTRARES, unless you
configure the scenario to do something else.
(Note that loadrunner scripts are written in C, most of which is pretty
well hidden until you start looking for it.)
Is there something like this in the works for gatling?
Not yet. I think it will be there for Gatling 2, but there's no date yet.
I have to say that I'm not fond of this feature as it usually only fetches
a part of the resources that would be fetched by a real browser, and that
those resources are usually cached/hosted on a CDN... But well, people keep
on asking for it, so...
That would be correct. Many of the scripts I create for my current customer
would have the 'fetch inline resources' configuration set to 'off' by
default when run, simply because those resources are irrelevant for the
pages themselves.
It's nice to be able to turn that back on for tests where the CDN is
actually the thing you wish to test, though.
Also, automatically following HTTP 302 responses is kinda nice as well. Not
sure if gatling does that, either.
If not, how hard would it be to add such a thing myself? I've been
looking at the gatling sources and it shouldn't be entirely impossible, but
some ideas on where to start would be nice. The HTTP protocol support
itself isn't quite the right place, since this would have to wrap / extend
the HTTP protocol from the outside.
Except if you're good at Scala and willing to contribute, don't bother; it
would probably be too complex for you...
No offense, just that we have to implement async fetching first, before
automatic HTML parsing.
Good at Scala? I've read a few things, never tried my hand at it though.
I'm fairly decent in C and know my way around various other languages
(java, perl, PHP, bash ..) though. So trying my hand at it isn't entirely
out of the question.
- real-time test monitoring and control
One of the really nice things of Loadrunner is that you can watch test
results in real time as they happen, using a controller dashboard with
various graphs. You don't have the same level of granularity to the graphs
as you can have post-test, but the important stuff like "how many errors is
this test generating" and "what do response times look like" is there.
This mode of operation appears to simply not exist in gatling. Is that
correct, or have I overlooked something?
You have some info in the command line, and you can also integrate with
Graphite, but we currently don't have a built-in live dashboard.
Hmm, that is one of the main killer features gatling is missing, I reckon.
The loadrunner controller interface is pretty decent - configurable graphs,
most of the information you really want to see available, etc. And being
able to spot errors as they occur with server logs tailing on the
background means you have the ability to immediately spot problems and
report / fix them. Or simply reduce the load, for that matter.
- Impact of garbage collection on response times
Since gatling runs on a Java virtual machine, shouldn't there be a concern
that garbage collection pauses will influence the test results?
If so, how can this be mitigated / prevented?
We try to optimize things the best we can: GC is tuned, and we try to
discard objects as soon as we can. For example, we discard response chunks
unless you perform some checks on the response body.
Right. So you haven't eliminated them, then. Which leads to more questions:
- Is it possible to measure gatling's own overhead - wasted time,
effectively - as a separate per-transaction thing?
- Is it possible to run gatling inside a different JVM implementation that
does GC differently (or not at all) in order to avoid GC spikes?
I'm pretty sure the answer to the latter question would be "yes", but I
don't know how hard that would be.
Note that I don't seek to eliminate memory management overhead, really -
just *unpredictable* memory management overhead. Since our reporting
usually talks about the 99th percentile of response times for various
transactions, a sudden delay increase in the measurements can make a large
difference to whether something actually passes the test criteria.
Speaking of which - does gatling report such a thing as the 99th (or
whichever) percentile of response times?
- Decision lists
A feature I have in ylib is something called a 'profile list' - perhaps
better called a 'decision list'. Wherein the user can specify that at a
certain point in the simulation the virtual user will choose from a list of
follow-up steps, using weights (percentages, usually) that determine the
odds that a particular set of steps is chosen. Gatling does have
rudimentary support for if-else based decisions, but is there anything more
advanced out there for people who don't want to type out a series of
nested, error-prone if-else statements?
What you're looking for is randomSwitch:
Hmm, yes. That's it.
Can that construct deal with floating point numbers, though? The
documentation doesn't say, and we have quite a few cases where precision to
2-3 digits after the decimal point is used.
Quite honestly, I kind of like the ylib implementation, which accepts any
combination of numbers that doesn't sum up to more than MAX_INT
(2 billion and a bit, on 32-bit systems). That way you often don't even
have to calculate percentages yourself - just toss in raw hit counts from
your access logs and let the computer figure out what they mean in terms of
ratios.
But of course since I wrote that I'm sorta biased.
(And dangerous, I might add, if you don't know what you're doing - unless
you know exactly what the ratio is between the clicks before and after that
point. My bigger test scenarios tend to involve spreadsheets. Large ones. :p)
- Extended browser emulation:
One of the things I have built at some point for a customer is support
for emulating a range of different browsers based on production access
logs. The idea being that different browsers will use different settings
for things like HTTP connection pools and therefore behave differently.
This code would take a CSV file with various browser types, connection pool
settings, and (again) weights and then apply them proportionally each time
a new script iteration starts.
Same point as above: we have to implement async fetching.
That brings me to another feature I haven't seen yet - transactions.
The idea behind a transaction is to group a collection of requests into a
single response time measurement, so that instead of a bunch of URLs you
get measurements for "click on next" or "login".
Preferably numbered, so that it's easy to see from the controller interface
what order things get executed in.
A standard script would do something like:
lr_start_transaction("01_start_page");
web_url("URL=http://start.page/", .... LAST);
lr_end_transaction("01_start_page", LR_AUTO);
Which sounds useless - until you realize you can add as many web_url()
statements in there as you like. Or any other type of request, in fact.
(Usually my scripts use the ylib wrapper functions though, so that we can
add common prefixes and numbers to transaction names without having to be
explicit about them. Not to mention triggers. But that's something for
another day. If you're interested, there's code here:
- Full request logging
I usually rely heavily on 'debug mode' logging during script development,
where everything that goes into and out of a remote server is logged while
I'm still working on the simulation. As far as I can see no such beast
exists in Gatling. You get headers, yes, but not full body contents.
As stated above, response chunks are by default discarded if they're not
required by checks. You can disable this:
We'll be working on a debug mode too that will give you nice reports with
the full data.
Ah, yes. That's a tad clunky though, in the sense that you want to turn
this off whenever you put large amounts of load on something, and on when
you're running the script from within the scripting interface.
Usually the workaround for this is to detect what mode your script is
running in, but since gatling doesn't appear to make any such distinction,
testing for it is going to be a tad difficult.
(To be fair, the canonical way in loadrunner to do this is to check if the
assigned virtual user id (thread number, effectively) is negative. Which
will be the case whenever the scripting interface is running the script,
and not the case in real load tests.)
- User mode datapoints
I may have missed this - vaguely recall reading something about it
somewhere, in fact - but is it possible to specify your own metrics from
within gatling scripts to be recorded for later analysis / graphing?
You can dump custom data (
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump) and have
access to the original simulation.log file, which you can parse on your own.
Custom dumps aren't exactly what I was looking for, though they're close.
The basic user mode datapoint is no more than:
lr_user_data_point("datapoint_name", value);
This name-value pair will be stored as a measurement value in the
collection database - same place the response times are kept, in other
words - and can be summoned up on both the controller interface and the
post-test analysis tool as one of the lines on the user datapoint graph.
We use this for all sorts of things. We have measurement pages that expose
application server internal values, which are fetched and turned into
datapoints by a separate monitoring script. We have datapoints that extract
how much 'wasted' time loadrunner reports for each transaction (time spent
on the load generator CPU, rather than waiting on responses) and export
those as vuser datapoints as well. Then there are datapoints that are used
for think-time based rampup scenarios to expose the exact (calculated)
think time value so that we can see if the load is ramped up correctly.
Etc, etc, ad infinitum. You name it, we've done it.
- Global verifications
It's possible in loadrunner to register a so called "global
verification", that will perform checks on each page being requested,
rather than just a single page.
That's a new feature: https://github.com/excilys/gatling/issues/1289
It hasn't been shipped yet (will be in 2M4, no date yet).
Not sure I understand what this tracker item translates into. What kind of
"common HTTP checks" are we talking about here?
These verifications work together with HTML mode (ignoring statics that
are not on the first level of the request) and can be paused and resumed as
needed to create exceptions for pages that diverge from the norm.
Does something like this exist in gatling?
How it's been implemented for now is that common checks are ignored as soon
as you specify some on a given request.
So as soon as you specify a fine-grained check for a particular request,
the common checks go out of the window?
That's definitely not what I want to do here.
Let's say that you know that your application will always format common
errors in a predictable way: Let's say there is a <div class="error">This
is an error</div> present in that case, on every page.
You also know that for a particular page the presence of the text
"<H1>Your account</H1>" means that particular page rendered at least
partially with success.
Except if there is such an error tag present - which can happen on every
page, not just that one.
What I do in such a case is register a global verification for the snippet
<div class="error"> that captures the text inside the div. This check will
then fail the transaction and report an error every time this div is
encountered, and report the exact text captured inside it - regardless of
any other checks on the page.
The loadrunner code for that would look like:
web_global_verification("ID=errordiv", "TextPfx=<div class=\"error\">",
"TextSfx=</div>", "Fail=Found", LAST);
.. do requests ...
However, on one particular page this tag is always present, and NOT having
it on there is actually an error, for some odd reason. (Don't laugh. I've
seen it happen. :p)
So this check needs to be turned off:
web_reg_find("Text=<div class=\"error\">This is not an error but we really
like pretty colours!</div>", LAST);
.. do request ..
Does this make it clear what I mean?
As of today our standard template for test scripts contains something like
20-30 different global verifications for one thing or another. Including
some rather complicated ones that test for the absence or presence of
various versions of our left-hand side menu, which need to be turned on and
off with wrapper functions.
(To be fair - I am starting to suspect we're going to have to cut down on
this list some time soon. The oldest ones have been in there for years, and
I have no idea how often they fire .. probably not often.)
- Auto filters
One of the things that is very important to have is some kind of
protection against accidentally putting load on the wrong URI or server,
especially when using HTML mode. One of the ways we have to counter that is
the auto filter - a simple statement forbidding the script to touch any URI
that fails to match a defined criterion, such as a hostname or URI snippet.
When used in conjunction with a CSV datafile that defines environments and
their URIs, this can be a very powerful safeguard.
Does such a thing exist in gatling?
No, but that's a nice idea.
Seven or eight years as a performance engineer at various big banks and
airline companies gets you a lot of ideas about how to run performance
tests, I can tell you that.
.. I probably can think of a few others, but this is a nice start for now.
Thanks a lot for all those ideas.
Heh, that was just off the top of my head.
I haven't even begun talking about things like "data integrity checks",
"error ratio based load reduction", "log on error", "disk space guards",
"pacing", or "thinktime based rampup".
Or any of the subjects regarding correlation / playback that require me to
actually have some first-hand experience with more complicated gatling
scenarios.
.. ok .. since this is a wall of text already .. might as well continue for
a bit.
Going over that list - I'm sure many of these are possible with gatling,
mind. I just don't have enough knowledge of the tool to be able to tell.
- Data integrity checks:
Custom checks that are layered on top of regular checks to verify that the
response doesn't just contain the correct page, but the correct *data* as
well.
Think of it as a check that only fires if all other checks pass, but starts
virtually jumping up and down and waving a big red flag whenever the page
displays data that doesn't correspond to the request.
In our case, it will not just fail the transaction, but effectively fail
the test itself as well, and generate a custom transaction to notify the
tester the second it occurs.
( The genesis of this was a production incident where some customers would
see the wrong information - belonging to different, basically random
customers - whenever they accessed a particular page. That sort of thing
really isn't good for your reputation, I can tell you that. And it didn't
show up in functional tests, because this problem was concurrency related,
and could never be reproduced with just a single user on the system. )
- Error ratio based load reduction
Sometimes we're not so much interested in bottlenecks as we are interested
in memory leaks. For this purpose, we wanted to put load on the system as
long as possible, and didn't care about temporary disruptions in the
backend infrastructure so much.
However, those disruptions often meant an end to the test in question,
because the load generators kept pumping requests at the system during the
disruption and it never got a chance to recover.
In order to work around that we have a feature that will detect disruptions
and reduce load temporarily. Whenever more than X number of iterations in a
row fails, the code in question will enforce a 900 second sleep on the
virtual user, and keep doing that until an iteration is completed. This way
the load will stay low as long as the problem remains, and go back up
whenever things get back up on their feet.
By default this feature is off, but when needed we can turn it on with a
simple command line attribute.
- Log on error
One useful feature of loadrunner is that you can set things up in such a
way that logging is turned off by default, but whenever an error occurs the
last X kb of log data is recorded on disk.
That reduces the amount of logging to what usually amounts to the only
things you're really interested in (including, I might add, full request
and response data leading up to the error).
- Disk space guards
Long running tests have a second problem: Too much disk space usage on the
generators, usually due to excessive logging or too many errors (which
turns all of the logging on temporarily, of course).
Loadrunner however never detects that, and will gleefully put truncated
files on disk, resulting in test corruption later.
So we had to work around it with two pieces of code: one that checks how
much free space the generator has left, and one that measures how much the
test has used so far. If this code deems the usage too high or the space
left too low, it will simply disable the logging entirely.
Example here: https://github.com/randakar/y-lib/blob/master/y_logging.c#L306-
That works, but it would be really nice if a test tool could detect such a
thing by itself.
Another thing that appears to not be present in gatling scripts: Specific
settings for pacing.
If you really want your load numbers to be exact and to generate something
of a representative load, then you somehow want to steer how long each user
session lasts. If you rely only on think time and response times to drive
the load, you run into the issue that as soon as the load nears a
bottleneck, the load will actually decrease, because the slow increase in
response times makes your users slow down.
The way to deal with that is to set up a pacing - where each run through
the script is allocated a set amount of time (usually about 2 minutes,
though we always apply a randomization factor to these things to keep
users from getting into lockstep with each other), and if the user finishes
its set of requests before that, the script will simply wait until the
required time has passed. Meaning that if some disruption occurs on the
application server, the load will just keep on going as if nothing is
wrong.
Loadrunner actually has several different ways to configure this: No
pacing, pacing with fixed intervals, pacing with variable intervals, and
you can set it to "delay X to Y seconds after the script ends".
In our case pacing gets a bit of a complicated subject though, because we
also time the logout to occur (if it occurs - there's a percentage chance
associated with it) just before the end of the pacing duration, so we
actually have a calculated forced thinktime sitting right before the logout
that will delay it. And while we're at it, we also measure how long the
script took to reach that point, so that any overly long scripts or
too-short pacing settings can be acted upon by the tester.
- Thinktime based rampup
Steering the load by decreasing the think time, rather than adding more
threads or virtual users.
If you want to have the load ramp up linearly, this involves a bit of math
- the formula is:
thinktime = (virtual_users / TPS_target) - response_time;
ylib contains an implementation, and lots of comments.
I like to think that code speaks for itself, but I may be wrong.
I'm sure this can be reimplemented in a gatling scenario fairly easily,
though.