Gatling from a loadrunner perspective

Hi,

I’m a long-time loadrunner user (and developer of the y-lib support library - https://github.com/randakar/y-lib ) checking in.
This gatling tool looks pretty good. Nice to see an open source loadtest tool that doesn’t seem to require point-and-click and is actually actively developed. This may go far :slight_smile:

However, before I can convince my customers that deploying gatling is a good idea, there are a number of features that seem to be missing, so I am here to ask questions about them. :slight_smile:

Questions:

  • HTML mode recording / playback support:
    As far as I can see, the main protocol gatling supports is pure HTTP calls. However, these calls do very little parsing on the result and fetching all the static images, CSS and javascript embedded in a page requires manually recording each and playing them back.
    In loadrunner however there is something called “HTML mode”, where all static content that does not require running javascript is fetched automatically, on the fly, without needing to manually rerecord everything for each change to the page layout.
    Is there something like this in the works for gatling? If not, how hard would it be to add such a thing myself? I’ve been looking at gatling sources and it shouldn’t be entirely impossible, but some ideas on where to start would be nice. HTTP protocol itself isn’t quite it, since this would have to wrap / extend the HTTP protocol out of necessity.

  • real-time test monitoring and control
    One of the really nice things of Loadrunner is that you can watch test results in real time as they happen, using a controller dashboard with various graphs. You don’t have the same level of granularity to the graphs as you can have post-test, but the important stuff like “how many errors is this test generating” and “what do response times look like” is there.
    This mode of operation appears to simply not exist in gatling. Is that correct, or have I overlooked something?

  • Impact of garbage collection on response times
    Since gatling uses a java virtual machine, shouldn’t there be a concern that java virtual machine garbage collection pauses will influence the test results?
    If not, how can this be mitigated / prevented?

  • Decision lists
    A feature I have in ylib is something called a ‘profile list’ - perhaps better called a ‘decision list’ - wherein the user can specify that at a certain point in the simulation the virtual user will choose from a list of follow-up steps, using weights (percentages, usually) that determine the odds that a particular set of steps is chosen. Gatling does have rudimentary support for if-else based decisions, but is there anything more advanced out there for people who don’t want to type out a series of nested, error-prone if-else statements?

  • Extended browser emulation:
    One of the things I have built at some point for a customer is support for emulating a range of different browsers based on production access logs. The idea being that different browsers will use different settings for things like HTTP connection pools and therefore behave differently. This code would take a CSV file with various browser types, connection pool settings, and (again) weights and then apply them proportionally each time a new script iteration starts.

  • Full request logging
    I usually rely heavily on ‘debug mode’ logging during script development, where everything that goes into and out of a remote server is logged while I’m still working on the simulation. As far as I can see no such beast exists in Gatling. You get headers, yes, but not full body contents.

  • User mode datapoints
    I may have missed this - vaguely recall reading something about it somewhere, in fact - but is it possible to specify your own metrics from within gatling scripts to be recorded for later analysis / graphing?

  • Global verifications
    It’s possible in loadrunner to register a so called “global verification”, that will perform checks on each page being requested, rather than just a single page. These verifications work together with HTML mode (ignoring statics that are not on the first level of the request) and can be paused and resumed as needed to create exceptions for pages that diverge from the norm.
    Does something like this exist in gatling?

  • Auto filters
    One thing that is very important to have is some kind of protection against accidentally putting load on the wrong URI or server, especially when using HTML mode. One of the ways we have to counter that is the auto filter - a simple statement forbidding the script to touch any URI that fails to match a defined criterion, such as a hostname or URI snippet. When used in conjunction with a CSV datafile that defines environments and their URIs this can be a very powerful safeguard.
    Does such a thing exist in gatling?

… I probably can think of a few others, but this is a nice start for now :slight_smile:

Hi,

- HTML mode recording / playback support:

As far as I can see, the main protocol gatling supports is pure HTTP
calls. However, these calls do very little parsing on the result and
fetching all the static images, CSS and javascript embedded in a page
requires manually recording each and playing them back.
In loadrunner however there is something called "HTML mode", where all
static content that does not require running javascript is fetched
automatically, on the fly, without needing to manually rerecord everything
for each change to the page layout.

It also probably doesn't fetch style images from the css.

Actually, I'm pretty sure it does.
However, anything that isn't fetchable from inline page resources (due to
needing a javascript engine to execute them) is recorded as a so called
"EXTRARES" statement.

So scripts recorded in HTML mode will contain statements like:

web_url("URL=http://someuri.uri",
        "Referer=...",
        "Mode=HTML",
        EXTRARES,
        "Resource=http://someuri.uri/javascript_fetched_by_javascript",
        "Resource=http://someuri.uri/image_fetched_by_javascript",
        LAST );

What is seen by the recorder as page resources and what is seen as a top
level element is controlled by a default list of content-types for which
separate elements are created, and that list is configurable.
So for instance, JSON resources are not in the default content-type list
for top level recordable items, and end up in EXTRARES, unless you
configure the scenario to do something else.
(Note that loadrunner scripts are written in C. Most of which is pretty
well hidden, until you start looking for it :wink: )

Is there something like this in the works for gatling?

Not yet, I think it will be there for Gatling 2, but no date yet.
I have to say that I'm not fond of this feature as it usually only fetches
a part of the resources that would be fetched by a real browser, and that
those resources are usually cached/hosted on a CDN... But well, people keep
on asking for it, so...

That would be correct. Many of the scripts I create for my current customer
would have the 'fetch inline resources' configuration set to 'off' when run
by default, simply because those are irrelevant for the pages themselves.
It's nice to be able to turn that back on for tests where the CDN is
actually the thing you wish to test, though. :wink:
Also, automatically following HTTP 302 responses is kinda nice as well. Not
sure if gatling does that, either. :wink:

If not, how hard would it be to add such a thing myself? I've been
looking at gatling sources and it shouldn't be entirely impossible, but
some ideas on where to start would be nice. HTTP protocol itself isn't
quite it, since this would have to wrap / extend the HTTP protocol out of
necessity.

Unless you're good at Scala and willing to contribute, don't bother, it
would probably be too complex for you...
No offense, just that we have to implement async fetching first, before
automatic HTML parsing.

Good at Scala? I've read a few things, never tried my hand at it though.
I'm fairly decent in C and know my way around various other languages
(java, perl, PHP, bash ..) though. So trying my hand at it isn't entirely
out of the question. :slight_smile:

- real-time test monitoring and control
One of the really nice things of Loadrunner is that you can watch test
results in real time as they happen, using a controller dashboard with
various graphs. You don't have the same level of granularity to the graphs
as you can have post-test, but the important stuff like "how many errors is
this test generating" and "what do response times look like" is there.
This mode of operation appears to simply not exist in gatling. Is that
correct, or have I overlooked something?

You have some info in the command line, and you can also integrate with
Graphite, but we currently don't have a built-in live dashboard.

Hmm, that is one of the main killer features gatling is missing, I reckon.
The loadrunner controller interface is pretty decent - configurable graphs,
most of the information you really want to see available, etc. And being
able to spot errors as they occur with server logs tailing in the
background means you have the ability to immediately spot problems and
report / fix them. Or simply reduce the load, for that matter.

- Impact of garbage collection on response times
Since gatling uses a java virtual machine, shouldn't there be a concern
that java virtual machine garbage collection pauses will influence the test
results?
If not, how can this be mitigated / prevented?

We try to optimize things the best we can: GC is tuned, and we try to
discard objects as soon as we can. For example, we discard response chunks
unless you perform some checks on the response body.

Right. So you haven't eliminated them, then. Which leads to more questions:
- Is it possible to measure gatling's own overhead - wasted time,
effectively - as a separate per-transaction thing?
- Is it possible to run gatling inside a different JVM implementation that
does GC differently (or not at all) in order to avoid GC spikes?

I'm pretty sure the answer to the latter question would be "yes", but I
don't know how hard that would be.
Note that I don't seek to eliminate memory management overhead, really.
Just *unpredictable* memory management overhead. Our reporting usually
talks about the 99th percentile of responsetimes for various transactions,
so a sudden delay that inflates responsetime measurements can actually
decide whether something passes the test criteria.

Speaking of which - does gatling report such a thing as the 99th (or
whatever) percentile of responsetimes?

- Decision lists
A feature I have in ylib is something called a 'profile list' - perhaps
better called a 'decision list'. Wherein the user can specify that at a
certain point in the simulation the virtual user will choose from a list of
follow-up steps, using weights (percentages, usually) that determine the
odds that a particular set of steps is chosen. Gatling does have
rudimentary support for if-else based decisions, but is there anything more
advanced out there for people who don't want to type out a series of nested
error-prone if-else statements?

What you're looking for is randomSwitch:
https://github.com/excilys/gatling/wiki/Structure-Elements#wiki-randomSwitch

Hmm, yes. That's it. :slight_smile:
Can that construct deal with floating point numbers, though? The
documentation doesn't say, and we have quite a few cases where a precision
of 2-3 decimal places is used.

Quite honestly, I kind of like the ylib implementation that accepts any
combination of numbers that doesn't sum up to a number bigger than MAX_INT
(2 billion and a bit, on 32 bit systems). That way you often don't even
have to calculate percentages yourself - just toss in raw hit counts from
access logs and you let the computer figure out what that would mean in
terms of ratios :wink:
But of course since I wrote that I'm sorta biased.
(And dangerous, I might add, if you don't know what you're doing - unless
you know exactly what the ratio is between the click before that point and
afterwards.)
My bigger test scenarios tend to involve spreadsheets. Large ones. :p)
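
To make the idea concrete, here is a minimal sketch of a weighted decision list in plain C. It is not the actual y-lib code: the struct and function names are made up for illustration, and the caller is assumed to supply a random roll between 0 and the sum of all weights.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types and names, for illustration only (not the y-lib API). */
typedef struct {
    const char *name;     /* label of the follow-up step */
    unsigned int weight;  /* percentage or raw hit count from an access log */
} profile_entry;

/* Sum of all weights; the random roll must be smaller than this. */
unsigned long total_weight(const profile_entry *list, size_t count)
{
    unsigned long total = 0;
    size_t i;

    assert(list != NULL && count > 0);
    for (i = 0; i < count; i++) {
        total += list[i].weight;
    }
    return total;
}

/*
 * Walk the list, accumulating weights, and return the index of the first
 * entry whose cumulative weight exceeds the roll.
 * Returns -1 if the roll is not smaller than the total weight.
 */
int choose_profile(const profile_entry *list, size_t count, unsigned long roll)
{
    unsigned long cumulative = 0;
    size_t i;

    assert(list != NULL && count > 0);
    for (i = 0; i < count; i++) {
        cumulative += list[i].weight;
        if (roll < cumulative) {
            return (int)i;
        }
    }
    return -1;
}
```

A caller would typically do something like roll = rand() % total_weight(list, count) and then run the steps that belong to the returned index. Because only the ratios matter, the weights can be percentages or raw hit counts, as long as their sum stays within range.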

- Extended browser emulation:
One of the things I have built at some point for a customer is support
for emulating a range of different browsers based on production access
logs. The idea being that different browsers will use different settings
for things like HTTP connection pools and therefore behave differently.
This code would take a CSV file with various browser types, connection pool
settings, and (again) weights and then apply them proportionally each time
a new script iteration starts.

Same point as above: we have to implement async fetching.

That brings me to another feature I haven't seen yet - transactions.
The idea behind a transaction is to group a collection of requests into a
single responsetime measurement, so that instead of a bunch of url's you
get measurements for "click on next", or "login".
Preferably numbered, so that it's easy to see from the controller interface
what order things get executed in.

A standard script would do something like:

lr_start_transaction("Open_startpage");
web_url("URL=http://start.page/", .... LAST);
lr_end_transaction("Open_startpage", LR_AUTO);

Which sounds useless - until you realize you can add as many web_url()
statements in there as you like. Or any other type of request, in fact.
(Usually my scripts use the ylib wrapper functions though, so that we can
add common prefixes and numbers to transaction names without having to be
explicit about them. Not to mention triggers. But that's something for
another day. If you're interested, there's code here:
https://github.com/randakar/y-lib/blob/master/y_transaction.c#L543)

- Full request logging
I usually rely heavily on 'debug mode' logging during script development,
where everything that goes into and out of a remote server is logged while
I'm still working on the simulation. As far as I can see no such beast
exists in Gatling. You get headers, yes, but not full body contents.

As stated above, response chunks are by default discarded if they're not
required by checks. You can disable this:
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump
We'll be working on a debug mode too that will give you nice reports with
the full data.

Ah, yes. That's a tad clunky though, in the sense that you want to turn
this off whenever you put large amounts of load on something, and on when
you're running the script from within the scripting interface.
Usually the workaround for this is to detect what mode your script is
running in, but since gatling doesn't appear to make any such distinction,
testing for it is going to be a tad difficult.
(To be fair, the canonical way in loadrunner to do this is to check if the
assigned virtual user id (thread number, effectively) is negative. Which
will be the case whenever the scripting interface is running the script,
and not the case in real load tests.)

- User mode datapoints
I may have missed this - vaguely recall reading something about it
somewhere, in fact - but is it possible to specify your own metrics from
within gatling scripts to be recorded for later analysis / graphing?

WDYM?
You can dump custom data (
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump) and have
access to the original simulation.log file that you can parse on your own
too.

Custom dumps aren't exactly what I was looking for, though they're close.
The basic user mode datapoint is no more than:

lr_vuser_datapoint("name", value);

This name-value pair will be stored as a measurement value in the
collection database - same place the response times are kept, in other
words - and can be summoned up on both the controller interface and the
post-test analysis tool as one of the lines on the user datapoint graph.

We use this for all sorts of things. We have measurement pages that expose
application server internal values, which are fetched and turned into
datapoints by a separate monitoring script. We have datapoints that extract
how much 'wasted' time loadrunner reports for each transaction (time spent
on the load generator CPU, rather than waiting on responses) and export
those as vuser datapoints as well. Then there are datapoints that are used
for think-time based rampup scenarios to expose the exact (calculated)
think time value so that we can see if the load is ramped up correctly.
Etc, etc, ad infinitum. You name it, we've done it :wink:

- Global verifications
It's possible in loadrunner to register a so called "global
verification", that will perform checks on each page being requested,
rather than just a single page.

That's a new feature: https://github.com/excilys/gatling/issues/1289
It hasn't been shipped yet (will be in 2M4, no date yet).

Not sure I understand what this tracker item translates into. What kind of
"common HTTP checks" are we talking about here?

These verifications work together with HTML mode (ignoring statics that
are not on the first level of the request) and can be paused and resumed as
needed to create exceptions for pages that diverge from the norm.
Does something like this exist in gatling?

How it's been implemented for now is that common checks are being ignored
as soon as you specify some on a given request.

So as soon as you specify a fine-grained check for a particular request the
common checks go out of the window?
That's definitely not what I want to do here.

Let's say that you know that your application will always format common
errors in a predictable way: Let's say there is a <div class="error">This
is an error</div> present in that case, on every page.
You also know that for a particular page the presence of the text
"<H1>Your account</H1>" means that particular page rendered at least
partially with success.
Except if there is such an error tag present - which can happen on every
page, not just that one.

What I do in such a case is register a global verification for the snippet
<div class="error"> that captures the text inside the div. This check will
then fail the transaction and report an error every time this div is
encountered, and report the exact text captured inside it. Regardless of
any other checks on the page:

The loadrunner code for that would look like:

web_global_verification("ID=errordiv", "TextPfx=<div class=\"error\">",
"TextSfx=</div>", "Fail=Found", LAST);

.. do requests ...

However, on one particular page this tag is always present, and NOT having
it on there is actually an error, for some odd reason. (Don't laugh. I've
seen it happen. :p)
So this check needs to be turned off:

web_global_verification_pause("ID=errordiv", LAST);

web_reg_find("Text=<div class=\"error\">This is not an error but we really
like pretty colours!</div>", LAST);
.. do request ..

web_global_verification_resume("ID=errordiv", LAST);

Does this make it clear what I mean?
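
Stripped of everything loadrunner-specific, the boundary capture behind such a check boils down to something like the sketch below. This is plain C for illustration only, not how loadrunner implements it; capture_between() is a made-up name, only the first match is considered, and overly long captures are simply truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find the first occurrence of prefix in body and copy
 * the text between it and the next occurrence of suffix into out.
 * Returns 1 if both boundaries were found, 0 otherwise (out is then empty).
 */
int capture_between(const char *body, const char *prefix, const char *suffix,
                    char *out, size_t out_size)
{
    const char *start;
    const char *end;
    size_t len;

    assert(body != NULL && prefix != NULL && suffix != NULL);
    assert(out != NULL && out_size > 0);

    out[0] = '\0';
    start = strstr(body, prefix);
    if (start == NULL) {
        return 0;
    }
    start += strlen(prefix);
    end = strstr(start, suffix);
    if (end == NULL) {
        return 0;
    }
    len = (size_t)(end - start);
    if (len >= out_size) {
        len = out_size - 1;   /* truncate to what fits, keep room for '\0' */
    }
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

A wrapper around each request could then fail the transaction and log the captured text whenever this returns 1 for the error div, and skip the call while the check is paused.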

As of today our standard template for test scripts contains something like
20-30 different global verifications for one thing or another. Including
some rather complicated ones that test for the absence or presence of
various versions of our left-hand side menu, which need to be turned on and
off with wrapper functions..
(To be fair - I am starting to suspect we're going to have to cut down on
this list some time soon. The oldest ones have been in there for years, and
I have no idea how often they fire .. probably not often.)

- Auto filters
One thing that is very important to have is some kind of protection
against accidentally putting load on the wrong URI or server, especially
when using HTML mode. One of the ways we have to counter that is the auto
filter - a simple statement forbidding the script to touch any URI that
fails to match a defined criterion, such as a hostname or URI snippet.
When used in conjunction with a CSV datafile that defines environments and
their URIs this can be a very powerful safeguard.
Does such a thing exist in gatling?

No, but that's a nice idea.

Seven or eight years as a performance engineer with various big banks and
airline companies can get you a lot of ideas about how to run performance
tests, I can tell you that :slight_smile:
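
For what it's worth, the core of such a filter is tiny. A minimal sketch in plain C, assuming a simple substring match against a single allowed snippet (the function name and the example.org hostnames are made up for illustration):

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical auto filter check: a URL may only be requested if it
 * contains the snippet configured for the environment under test.
 * Returns 1 if the URL is allowed, 0 if it must be blocked.
 */
int url_allowed(const char *url, const char *required_snippet)
{
    assert(url != NULL && required_snippet != NULL);
    return strstr(url, required_snippet) != NULL;
}
```

Called right before every request, with the snippet taken from the environment datafile, a check like url_allowed(url, "test.example.org") would refuse anything pointing at another host.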

.. I probably can think of a few others, but this is a nice start for now
:slight_smile:

Thanks a lot for all those ideas.

Heh, that was just off the top of my head.

I haven't even begun talking about things like "data integrity checks",
"error ratio based load reduction", "log on error", "disk space guards",
"pacing", or "thinktime based rampup"

Or any of the subjects regarding correlation/playback that require me to
actually have some first hand experience with more complicated gatling
scenarios :wink:

.. ok .. since this is a wall of text already .. might as well continue for
a bit :wink:

Going over that list - I'm sure many of these are possible with gatling,
mind. I just don't have enough knowledge of the tool to be able to tell.

- Data integrity checks:
Custom checks that are layered on top of regular checks to verify that the
response doesn't just contain the correct page, but the correct *data* as
well.
Think of it as a check that only fires if all other checks pass, but starts
virtually jumping up and down and waving a big red flag whenever the page
displays data that doesn't correspond to the request.
In our case, it will not just fail the transaction, but effectively fail
the test itself as well, and generate a custom transaction to notify the
tester the second it occurs.
( The genesis of this was a production incident where some customers would
see the wrong information - belonging to different, basically random
customers - whenever they accessed a particular page. That sort of thing
really isn't good for your reputation, I can tell you that. And it didn't
show up in functional tests, because this problem was concurrency related,
and could never be reproduced with just a single user on the system. )

- Error ratio based load reduction
Sometimes we're not so much interested in bottlenecks as we are interested
in memory leaks. For this purpose, we wanted to put load on the system as
long as possible, and didn't care about temporary disruptions in the
backend infrastructure so much.
However, those disruptions often meant an end to the test in question,
because the load generators kept pumping requests at the system during the
disruption and it never got a chance to recover.
In order to work around that we have a feature that will detect disruptions
and reduce load temporarily. Whenever more than X iterations in a row
fail, the code in question will enforce a 900 second sleep on the
virtual user, and keep doing that until an iteration is completed. This way
the load will stay low as long as the problem remains, and go back up
whenever things get back up on their feet.
By default this feature is off, but when needed we can turn it on with a
simple command line attribute.
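
The bookkeeping behind this is simple enough to sketch in a few lines of plain C. This is an illustration under my own naming, not the code we actually run: load_guard and load_guard_record() are made-up names, and the caller is assumed to do the actual sleeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-virtual-user state, for illustration only. */
typedef struct {
    int consecutive_failures;  /* failed iterations in a row so far */
    int max_failures;          /* the "X" threshold */
    int backoff_seconds;       /* e.g. 900 */
} load_guard;

/*
 * Record the outcome of one iteration and return the number of seconds
 * the virtual user should sleep before starting the next one.
 * A passed iteration resets the counter; once more than max_failures
 * iterations in a row have failed, every further failure costs a backoff.
 */
int load_guard_record(load_guard *guard, int iteration_passed)
{
    assert(guard != NULL && guard->max_failures > 0);

    if (iteration_passed) {
        guard->consecutive_failures = 0;
        return 0;
    }
    guard->consecutive_failures++;
    if (guard->consecutive_failures > guard->max_failures) {
        return guard->backoff_seconds;
    }
    return 0;
}
```

Each virtual user keeps its own load_guard, so load drops gradually as more users hit the threshold and comes back as soon as iterations start passing again.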

- Log on error
One useful feature of loadrunner is that you can set things up in such a
way that logging is turned off by default, but whenever an error occurs the
last X kb of log data is recorded on disk.
That reduces the amount of logging to what usually amounts to the only
things you're really interested in (including, I might add, full
responses). :wink:

- Disk space guards
Long running tests have a second problem: Too much disk space usage on the
generators, usually due to excessive logging or too many errors (which
turns all of the logging on temporarily, of course).
Loadrunner however never detects that, and will gleefully put truncated
files on disk, resulting in test corruption later.
So we had to work around it - with two pieces of code. One that will check
how much free space the generator has, and one that measures how much the
test used so far. If this code deems the usage is too high or the space
left too low, it will simply disable the logging entirely.
Example here:
https://github.com/randakar/y-lib/blob/master/y_logging.c#L306-L345
That works, but it would be really nice if a test tool could detect such a
thing itself..

- Pacing
Another thing that appears to not be present in gatling scripts: Specific
settings for pacing.
If you really want your load numbers to be exact and to give something of a
representative load then you somehow want to steer how long each user
session lasts. If you rely only on think time and responsetimes to drive
load then you run into the issue that as soon as the load nears a
bottleneck the load will actually decrease because the slow increase in
responsetimes makes your users slow down.
The way to deal with that is to set up a pacing - where each run through
the script is allocated a set amount of time (usually about 2 minutes,
though we always apply a randomization factor to these things to keep
users from getting into lockstep with each other) and if the user finishes
its set of requests before that, the script will simply wait until the
required time has passed. Meaning that if some disruption occurs on the
application server, the load will just keep on going as if nothing is
happening.

Loadrunner actually has several different ways to configure this: No
pacing, pacing with fixed intervals, pacing with variable intervals, and
you can set it to "delay X to Y seconds after the script ends".

In our case pacing becomes a bit of a complicated subject though, because we
also time the logout to occur (if it occurs - there's a percentage chance
associated with it) just before the end of the pacing duration, so we
actually have a calculated forced thinktime sitting right before the logout
that will delay it. And while we're at it, we also measure how long the
script took to reach that point, so that any overly long scripts or
too-short pacing settings can be acted upon by the tester.
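
The basic arithmetic of pacing can be sketched as follows. This is plain C for illustration, with made-up function names; it assumes the caller measures the elapsed iteration time itself and supplies a random number in the range [0, 1).

```c
#include <assert.h>

/*
 * Hypothetical helper: seconds to wait at the end of an iteration so that
 * the whole iteration lasts pacing_target seconds. If the iteration already
 * took longer than the target, there is nothing left to wait for.
 */
double pacing_wait(double pacing_target, double elapsed)
{
    assert(pacing_target >= 0.0 && elapsed >= 0.0);
    return elapsed < pacing_target ? pacing_target - elapsed : 0.0;
}

/*
 * Hypothetical helper: spread the pacing target around a base value to keep
 * users out of lockstep. With spread = 0.25 the result lies between 75% and
 * 125% of base; unit_random must be in the range [0, 1).
 */
double randomized_pacing(double base, double spread, double unit_random)
{
    assert(base >= 0.0);
    assert(spread >= 0.0 && spread <= 1.0);
    assert(unit_random >= 0.0 && unit_random < 1.0);
    return base * (1.0 - spread + 2.0 * spread * unit_random);
}
```

With a 120 second pacing and an iteration that took 45 seconds, pacing_wait() leaves 75 seconds to sleep. An iteration that took longer than the pacing returns 0, which is exactly the case a tester wants to be told about.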

- Thinktime based rampup
Steering the load by decreasing the think time, rather than adding more
threads or virtual users.
If you want to have the load ramp up linearly, this involves a bit of math
- the formula is:

thinktime = (virtual_users / TPS_target) - response_time;

See https://github.com/randakar/y-lib/blob/master/y_loadrunner_utils.c#L942
for an implementation. And lots of comments.
I like to think that code speaks for itself, but I may be wrong. :wink:
I'm sure this can be reimplemented in a gatling scenario fairly easily,
mind..
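
As a worked example of the formula above, here is a small sketch in plain C. It is not the y-lib implementation: the function name is made up, it assumes every virtual user performs one transaction per loop, and the clamp to zero is my own addition for the case where the target cannot be reached with the available users.

```c
#include <assert.h>

/*
 * Hypothetical helper: think time (in seconds) each virtual user needs in
 * order to reach tps_target transactions per second in total, given the
 * current response time in seconds.
 */
double thinktime_for_target(int virtual_users, double tps_target,
                            double response_time)
{
    double thinktime;

    assert(virtual_users > 0 && tps_target > 0.0 && response_time >= 0.0);
    thinktime = ((double)virtual_users / tps_target) - response_time;
    /* A negative result means the users cannot keep up: do not wait at all. */
    return thinktime > 0.0 ? thinktime : 0.0;
}
```

With 100 virtual users, a target of 5 TPS and a response time of 0.5 seconds, each user has to start a transaction every 100 / 5 = 20 seconds, which leaves 19.5 seconds of think time. Ramping up the load then means raising the TPS target step by step and recalculating.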

Regards,
Floris

- HTML mode recording / playback support:

As far as I can see, the main protocol gatling supports is pure HTTP
calls. However, these calls do very little parsing on the result and
fetching all the static images, CSS and javascript embedded in a page
requires manually recording each and playing them back.
In loadrunner however there is something called "HTML mode", where all
static content that does not require running javascript is fetched
automatically, on the fly, without needing to manually rerecord everything
for each change to the page layout.

It also probably doesn't fetch style images from the css.

Actually, I'm pretty sure it does.

mmm, would be quite resource consuming, I guess. You'd have to parse the
css, find the rules that contain images, and apply all of them to the
HTML DOM to find out if they match.
Doable though.

However, anything that isn't fetchable from inline page resources (due to
needing a javascript engine to execute them) is recorded as a so called
"EXTRARES" statement.

So scripts recorded in HTML mode will contain statements like:

web_url("URL=http://someuri.uri",
        "Referer=...",
        "Mode=HTML",
        EXTRARES,
        "Resource=http://someuri.uri/javascript_fetched_by_javascript",
        "Resource=http://someuri.uri/image_fetched_by_javascript",
        LAST );

How does LoadRunner figure out which resources were fetched by javascript
while recording? Does it ship/monitor a browser?

What is seen by the recorder as page resources and what is seen as a top
level element is controlled by a default list of content-types for which
separate elements are created, and that list is configurable.
So for instance, JSON resources are not in the default content-type list
for top level recordable items, and end up in EXTRARES, unless you
configure the scenario to do something else.
(Note that loadrunner scripts are written in C. Most of which is pretty
well hidden, until you start looking for it :wink: )

Is there something like this in the works for gatling?

Not yet, I think it will be there for Gatling 2, but no date yet.
I have to say that I'm not fond of this feature as it usually only
fetches a part of the resources that would be fetched by a real browser,
and that those resources are usually cached/hosted on a CDN... But well,
people keep on asking for it, so...

That would be correct. Many of the scripts I create for my current
customer would have the 'fetch inline resources' configuration set to 'off'
when run by default, simply because those are irrelevant for the pages
themselves.
It's nice to be able to turn that back on for tests where the CDN is
actually the thing you wish to test, though. :wink:
Also, automatically following HTTP 302 responses is kinda nice as well.
Not sure if gatling does that, either. :wink:

Yes, followRedirect is enabled by default:
https://github.com/excilys/gatling/wiki/HTTP#wiki-follow-redirects

If not, how hard would it be to add such a thing myself? I've been
looking at gatling sources and it shouldn't be entirely impossible, but
some ideas on where to start would be nice. HTTP protocol itself isn't
quite it, since this would have to wrap / extend the HTTP protocol out of
necessity.

Unless you're good at Scala and willing to contribute, don't bother,
it would probably be too complex for you...
No offense, just that we have to implement async fetching first, before
automatic HTML parsing.

Good at Scala? I've read a few things, never tried my hand at it though.
I'm fairly decent in C and know my way around various other languages
(java, perl, PHP, bash ..) though. So trying my hand at it isn't entirely
out of the question. :slight_smile:

Prepare for some mind blowing adventure, then! :slight_smile:

- real-time test monitoring and control
One of the really nice things of Loadrunner is that you can watch test
results in real time as they happen, using a controller dashboard with
various graphs. You don't have the same level of granularity to the graphs
as you can have post-test, but the important stuff like "how many errors is
this test generating" and "what do response times look like" is there.
This mode of operation appears to simply not exist in gatling. Is that
correct, or have I overlooked something?

You have some info in the command line, and you can also integrate with
Graphite, but we currently don't have a built-in live dashboard.

Hmm, that is one of the main killer features gatling is missing, I reckon.
The loadrunner controller interface is pretty decent - configurable
graphs, most of the information you really want to see available, etc. And
being able to spot errors as they occur with server logs tailing in the
background means you have the ability to immediately spot problems and
report / fix them. Or simply reduce the load, for that matter.

I agree.

- Impact of garbage collection on response times
Since gatling uses a java virtual machine, shouldn't there be a concern
that java virtual machine garbage collection pauses will influence the test
results?
If not, how can this be mitigated / prevented?

We try to optimize things the best we can: GC is tuned, and we try to
discard objects as soon as we can. For example, we discard response chunks
unless you perform some checks on the response body.

Right. So you haven't eliminated them, then.

As long as you allocate memory, you have to spend time deallocating it. But
the Gatling launch scripts do tune the JVM.

Which leads to more questions:
- Is it possible to measure gatling's own overhead - wasted time,
effectively - as a separate per-transaction thing?

Nope. But we subtract this overhead from pause times.

- Is it possible to run gatling inside a different JVM implementation that
does GC differently (or not at all) in order to avoid GC spikes?

Gatling is not bound to a given JVM and you can tune it the way you want,
just edit the launch script.
I'm not sure buying Azul's Zing for running Gatling would be worth it...

I'm pretty sure the answer to the latter question would be "yes", but I
don't know how hard that would be.
Note that I don't seek to eliminate memory management overhead, really.
Just *unpredictable* memory management overhead. Our reporting usually
talks about the 99th percentile of response times for various
transactions, so a sudden delay that inflates the response time
measurements can make a large difference to whether something actually
passes the test criteria.

Then G1 is the GC you're looking for. That's not the one configured in Gatling
scripts as it's only available since JDK7 and we still want to be
compatible with JDK6.

Speaking of which - does gatling report such a thing as the 99th (or
whatever) percentile of responsetimes?

By default, Gatling computes 95th and 99th percentiles, but this is
configurable.

- Decision lists
A feature I have in ylib is something called a 'profile list' - perhaps
better called a 'decision list'. Wherein the user can specify that at a
certain point in the simulation the virtual user will choose from a list of
follow-up steps, using weights (percentages, usually) that determine the
odds that a particular set of steps is chosen. Gatling does have
rudimentary support for if-else based decisions, but is there anything more
advanced out there for people who don't want to type out a series of nested
error-prone if-else statements?

What you're looking for is randomSwitch:
https://github.com/excilys/gatling/wiki/Structure-Elements#wiki-randomSwitch

Hmm, yes. That's it. :slight_smile:
Can that construct deal with floating point numbers, though? The
documentation doesn't say, and we have quite a few cases where precision to
2-3 digits after the comma is present / used.

Quite honestly, I kind of like the ylib implementation that accepts any
combination of numbers that doesn't sum up to a number bigger than MAX_INT
(2 billion and a bit, on 32 bit systems). That way you often don't even
have to calculate percentages yourselves - just toss in raw hit counts from
access logs and you let the computer figure out what that would mean in
terms of ratios :wink:

No, it expects an Int percent value. The reason is that you don't have to
pass a 100% sum: users that don't fall into any branch just go directly to
the next step after the randomSwitch.

But of course since I wrote that I'm sorta biased.
(And dangerous, I might add, if you don't know what you're doing - unless
you know exactly what the ratio is between the click before that point and
afterwards.
My bigger test scenarios tend to involve spreadsheets. Large ones. :p)

- Extended browser emulation:
One of the things I have built at some point for a customer is support
for emulating a range of different browsers based on production access
logs. The idea being that different browsers will use different settings
for things like HTTP connection pools and therefore behave differently.
This code would take a CSV file with various browser types, connection pool
settings, and (again) weights and then apply them proportionally each time
a new script iteration starts.

Same point as above: we have to implement async fetching.

That brings me to another feature I haven't seen yet - transactions.
The idea behind a transaction is to group a collection of requests into a
single responsetime measurement, so that instead of a bunch of URLs you
get measurements for "click on next", or "login".
Preferably numbered, so that it's easy to see from the controller
interface what order things get executed in.

A standard script would do something like:

lr_start_transaction("Open_startpage");
web_url("URL=http://start.page/", .... LAST);
lr_end_transaction("Open_startpage", LR_AUTO);

Which sounds useless - until you realize you can add as many web_url()
statements in there as you like. Or any other type of request, in fact.
(Usually my scripts use the ylib wrapper functions though, so that we can
add common prefixes and numbers to transaction names without having to be
explicit about them. Not to mention triggers. But that's something for
another day. If you're interested, there's code here:
https://github.com/randakar/y-lib/blob/master/y_transaction.c#L543)

We call them groups:
https://github.com/excilys/gatling/wiki/Structure-Elements#wiki-group
(transaction will mean something else for the future JDBC protocol support).
Until now, it was computing elapsed time, pauses included. We're changing
that to also have the time without pauses.

- Full request logging
I usually rely heavily on 'debug mode' logging during script
development, where everything that goes into and out of a remote server is
logged while I'm still working on the simulation. As far as I can see no
such beast exists in Gatling. You get headers, yes, but not full body
contents.

As stated above, response chunks are by default discarded if they're not
required by checks. You can disable this:
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump
We'll be working on a debug mode too that will give you nice reports with
the full data.

Ah, yes. That's a tad clunky though, in the sense that you want to turn
this off whenever you put large amounts of load on something, and on when
you're running the script from within the scripting interface.
Usually the workaround for this is to detect what mode your script is
running in, but since gatling doesn't appear to be making any such
distinction, testing for it is going to be a tad difficult.
(To be fair, the canonical way in loadrunner to do this is to check if the
assigned virtual user id (thread number, effectively) is negative. Which
will be the case whenever the scripting interface is running the script,
and not the case in real load tests.)

- User mode datapoints
I may have missed this - vaguely recall reading something about it
somewhere, in fact - but is it possible to specify your own metrics from
within gatling scripts to be recorded for later analysis / graphing?

WDYM?
You can dump custom data (
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump) and have
access to the original simulation.log file that you can parse on your own
too.

Custom dumps aren't exactly what I was looking for, though it's close.
The basic user mode datapoint is no more than:

lr_vuser_datapoint("name", value);

This name-value pair will be stored as a measurement value in the
collection database - same place the response times are kept, in other
words - and can be summoned up on both the controller interface and the
post-test analysis tool as one of the lines on the user datapoint graph.

We use this for all sorts of things. We have measurement pages that expose
application server internal values, which are fetched and turned into
datapoints by a separate monitoring script. We have datapoints that extract
how much 'wasted' time loadrunner reports for each transaction (time spent
on the load generator CPU, rather than waiting on responses) and export
those as vuser datapoints as well. Then there are datapoints that are used
for think-time based rampup scenarios to expose the exact (calculated)
think time value so that we can see if the load is ramped up correctly.
Etc, etc, ad infinitum. You name it, we've done it :wink:

Will have to think about it.

- Global verifications
It's possible in loadrunner to register a so called "global
verification", that will perform checks on each page being requested,
rather than just a single page.

That's a new feature: https://github.com/excilys/gatling/issues/1289
It hasn't been shipped yet (will be in 2M4, no date yet).

Not sure I understand what this tracker item translates into. What kind of
"common HTTP checks" are we talking about here?

Let's call them default checks then. Clear?

These verifications work together with HTML mode (ignoring statics that
are not on the first level of the request) and can be paused and resumed as
needed to create exceptions for pages that diverge from the norm.
Does something like this exist in gatling?

How it's been implemented for now is that common checks are being ignored
as soon as you specify some on a given request.

So as soon as you specify a fine-grained check for a particular request the
common checks go out of the window?
That's definitely not what I want to do here.

Let's say that you know that your application will always format common
errors in a predictable way: Let's say there is a <div class="error">This
is an error</div> present in that case, on every page.
You also know that for a particular page the presence of the text
"<H1>Your account</H1>" means that particular page rendered at least
partially with success.
Except if there is such an error tag present - which can happen on every
page, not just that one.

What I do in such a case is register a global verification for the snippet
<div class="error"> that captures the text inside the div. This check will
then fail the transaction and report an error every time this div is
encountered, and report the exact text captured inside it. Regardless of
any other checks on the page.

The loadrunner code for that would look like:

web_global_verification("ID=errordiv", "TextPfx=<div class=\"error\">",
"TextSfx=</div>", "Fail=Found", LAST);

.. do requests ...

However, on one particular page this tag is always present, and NOT having
it on there is actually an error, for some odd reason. (Don't laugh. I've
seen it happen. :p)
So this check needs to be turned off:

web_global_verification_pause("ID=errordiv", LAST);

web_reg_find("Text=<div class=\"error\">This is not an error but we really
like pretty colours!</div>", LAST);
.. do request ..

web_global_verification_resume("ID=errordiv", LAST);

Does this make it clear what I mean?

Yeah, yeah :slight_smile:
This feature is very fresh and hasn't been released yet, so it definitely
needs polishing :slight_smile:
The problem is that we have no way of telling if a request level defined
check is to override a default one or not, as a check can be anything, even
a function (= some code).
But you're right, the default behavior should be append, and we could have
a way of telling to ignore the defaults as per request.

Note that as you have code, you can compose your check lists the way you
want.
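
To make the error-div example above a bit more concrete: the capture part of such a check (find a prefix, find a suffix, keep what sits in between) can be sketched in plain C. This is only an illustration of the idea, not how web_global_verification() or Gatling checks work internally; capture_between() is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look for `prefix` in `body`, then for `suffix` after it.
 * If both are found, copy the text in between into `out`
 * (truncated to fit, always NUL-terminated) and return 1.
 * Return 0 if the snippet is not present.
 * Hypothetical helper, for illustration only. */
int capture_between(const char *body, const char *prefix,
                    const char *suffix, char *out, size_t out_size)
{
    assert(body != NULL && prefix != NULL && suffix != NULL);
    assert(out != NULL && out_size > 0);

    const char *start = strstr(body, prefix);
    if (start == NULL)
        return 0;
    start += strlen(prefix);

    const char *end = strstr(start, suffix);
    if (end == NULL)
        return 0;

    size_t len = (size_t)(end - start);
    if (len >= out_size)
        len = out_size - 1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

A "Fail=Found" style check would then fail the transaction whenever this returns 1 and report the captured text; pausing the check for one request is just a matter of skipping the call.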

As of today our standard template for test scripts contains something like
20-30 different global verifications for one thing or another. Including
some rather complicated ones that test for the absence or presence of
various versions of our left-hand side menu, which need to be turned on and
off with wrapper functions..
(To be fair - I am starting to suspect we're going to have to cut down on
this list some time soon. The oldest ones have been in there for years, and
I have no idea how often they fire .. probably not often.)

- Auto filters
One of the things that is very important to have is some kind of
protection against accidentally putting load on the wrong URI or server,
especially when using HTML mode. One of the ways we have to counter that is
the auto filter - a simple statement forbidding the script to touch any URI
that fails to match a defined criterion, such as a hostname or URI snippet.
When used in conjunction with a CSV datafile that defines environments and
their URIs this can be a very powerful safeguard.
Does such a thing exist in gatling?

No, but that's a nice idea.
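
For reference, the core of such a filter is tiny. A minimal sketch in plain C, assuming the criterion is "the URI must start with an allowed prefix" (the real auto filter can match on other things as well); url_allowed() and the hostnames in the usage note are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `url` starts with `allowed_prefix`, 0 otherwise.
 * Hypothetical helper: a request that gets 0 here should never
 * be sent. Include the trailing slash in the prefix, otherwise
 * "http://test.example.invalid" would also match
 * "http://test.example.invalid.elsewhere/". */
int url_allowed(const char *url, const char *allowed_prefix)
{
    assert(url != NULL && allowed_prefix != NULL);
    return strncmp(url, allowed_prefix, strlen(allowed_prefix)) == 0;
}
```

With the prefix read from the environments CSV, url_allowed("http://test.example.invalid/login", "http://test.example.invalid/") passes, while anything pointing at another host is refused before it goes out.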

Seven or eight years as a performance engineer with various big banks and
airline companies can get you a lot of ideas about how to run performance
tests, I can tell you that :slight_smile:

.. I probably can think of a few others, but this is a nice start for
now :slight_smile:

Thanks a lot for all those ideas.

Heh, that was just off the top of my head.

I haven't even begun talking about things like "data integrity checks",
"error ratio based load reduction", "log on error", "disk space guards",
"pacing", or "thinktime based rampup"

Or any of the subjects regarding correlation/playback that require me to
actually have some first hand experience with more complicated gatling
scenarios :wink:

.. ok .. since this is a wall of text already .. might as well continue
for a bit :wink:

Going over that list - I'm sure many of these are possible with gatling,
mind. Just not enough knowledge of the tool to be able to tell.

- Data integrity checks:
Custom checks that are layered on top of regular checks to verify that the
response doesn't just contain the correct page, but the correct *data* as
well.
Think of it as a check that only fires if all other checks pass, but starts
virtually jumping up and down and waving a big red flag whenever the page
displays data that doesn't correspond to the request.
In our case, it will not just fail the transaction, but effectively fail
the test itself as well, and generate a custom transaction to notify the
tester the second it occurs.
( The genesis of this was a production incident where some customers would
see the wrong information - belonging to different, basically random
customers - whenever they accessed a particular page. That sort of thing
really isn't good for your reputation, I can tell you that. And it didn't
show up in functional tests, because this problem was concurrency related,
and could never be reproduced with just a single user on the system. )

- Error ratio based load reduction
Sometimes we're not so much interested in bottlenecks as we are interested
in memory leaks. For this purpose, we wanted to put load on the system as
long as possible, and didn't care about temporary disruptions in the
backend infrastructure so much.
However, those disruptions often meant an end to the test in question,
because the load generators kept pumping requests at the system during the
disruption and it never got a chance to recover.
In order to work around that we have a feature that will detect
disruptions and reduce load temporarily. Whenever more than X number of
iterations in a row fails, the code in question will enforce a 900 second
sleep on the virtual user, and keep doing that until an iteration is
completed. This way the load will stay low as long as the problem remains,
and go back up whenever things get back up on their feet.
By default this feature is off, but when needed we can turn it on with a
simple command line attribute.
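
The bookkeeping behind this is small. A rough sketch in plain C, assuming a per-virtual-user counter that is checked at the end of each iteration; the names are made up and this is not the actual y-lib code:

```c
#include <assert.h>
#include <stddef.h>

#define BACKOFF_SECONDS 900

/* Per-virtual-user state: how many iterations failed in a row. */
struct backoff_state {
    int failures_in_a_row;
};

/* Call once at the end of every iteration. Returns the number of
 * seconds the virtual user should sleep before the next one:
 * BACKOFF_SECONDS once more than `max_failures` iterations in a
 * row have failed, 0 otherwise. A successful iteration resets
 * the counter, so the load comes back by itself. */
int backoff_after_iteration(struct backoff_state *state,
                            int iteration_failed, int max_failures)
{
    assert(state != NULL && max_failures >= 0);

    if (!iteration_failed) {
        state->failures_in_a_row = 0;
        return 0;
    }
    state->failures_in_a_row++;
    return state->failures_in_a_row > max_failures ? BACKOFF_SECONDS : 0;
}
```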

- Log on error
One useful feature of loadrunner is that you can set things up in such a
way that logging is turned off by default, but whenever an error occurs the
last X kb of log data is recorded on disk.
That reduces the amount of logging to what usually amounts to the only
things you're really interested in (including, I might add, full
responses). :wink:

- Disk space guards
Long running tests have a second problem: Too much disk space usage on the
generators, usually due to excessive logging or too many errors (which
turns all of the logging on temporarily, of course).
Loadrunner however never detects that, and will gleefully put truncated
files on disk, resulting in test corruption later.
So we had to work around it - with two pieces of code. One that will check
how much free space the generator has, and one that measures how much the
test used so far. If this code deems the usage is too high or the space
left too low, it will simply disable the logging entirely.
Example here:
https://github.com/randakar/y-lib/blob/master/y_logging.c#L306 -
https://github.com/randakar/y-lib/blob/master/y_logging.c#L345
That works, but it would be really nice if a test tool could detect such a
thing itself..

- Pacing
Another thing that appears to not be present in gatling scripts: Specific
settings for pacing.
If you really want your load numbers to be exact and to give something of
a representative load then you somehow want to steer how long each user
session lasts. If you rely only on think time and responsetimes to drive
load then you run into the issue that as soon as the load nears a
bottleneck the load will actually decrease because the slow increase in
responsetimes makes your users slow down.
The way to deal with that is to set up a pacing - where each run through
the script is allocated a set amount of time (usually about 2 minutes,
though we always apply a randomization factor to these things to keep
users from getting into lockstep with each other) and if the user finishes
its set of requests before that, the script will simply wait until the
required time has passed. Meaning that if some disruption occurs on the
application server, the load will just keep on going as if nothing is
happening.

Loadrunner actually has several different ways to configure this: No
pacing, pacing with fixed intervals, pacing with variable intervals, and
you can set it to "delay X to Y seconds after the script ends".

In our case pacing becomes a somewhat complicated subject though, because we
also time the logout to occur (if it occurs - there's a percentage chance
associated with it) just before the end of the pacing duration, so we
actually have a calculated forced thinktime sitting right before the logout
that will delay it. And while we're at it, we also measure how long the
script took to reach that point, so that any overly long scripts or
too-short pacing settings can be acted upon by the tester.
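
The basic pacing calculation described above (before the logout timing is layered on top) comes down to something like this. A sketch in plain C under a few assumptions: times are in seconds, the randomization is a symmetric spread around the target, and the caller supplies the random draw so the function itself stays deterministic. pacing_wait() is a made-up name.

```c
#include <assert.h>

/* How long to wait at the end of an iteration so that the whole
 * iteration lasts roughly `pacing` seconds.
 *   pacing  - target iteration duration, e.g. 120.0
 *   spread  - randomization factor, 0.0 .. 1.0 (0.1 = +/- 10%)
 *   jitter  - random draw in [-1.0, 1.0], supplied by the caller
 *   elapsed - how long the iteration took so far
 * Returns 0 when the iteration already overran its slot. */
double pacing_wait(double pacing, double spread, double jitter,
                   double elapsed)
{
    assert(pacing >= 0.0 && spread >= 0.0 && spread <= 1.0);
    assert(jitter >= -1.0 && jitter <= 1.0);

    double target = pacing * (1.0 + spread * jitter);
    double remaining = target - elapsed;
    return remaining > 0.0 ? remaining : 0.0;
}
```

Because the wait shrinks as response times grow, the number of iterations per minute stays the same until an iteration takes longer than its slot. A return value of 0 is the signal that the script is too long or the pacing too short.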

- Thinktime based rampup
Steering the load by decreasing the think time, rather than adding more
threads or virtual users.
If you want to have the load ramp up linearly, this involves a bit of math
- the formula is:

thinktime = (virtual_users / TPS_target) - response_time;

See
https://github.com/randakar/y-lib/blob/master/y_loadrunner_utils.c#L942
for an implementation. And lots of comments.
I like to think that code speaks for itself, but I may be wrong. :wink:
I'm sure this can be reimplemented in a gatling scenario fairly easily,
mind..
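
Spelled out as a standalone C function, the formula above looks like this. It is a stripped-down sketch rather than the y-lib implementation linked above; think_time() is a made-up name, and clamping at zero is my own addition for the case where responses are slower than the per-user interval.

```c
#include <assert.h>

/* thinktime = (virtual_users / TPS_target) - response_time
 * All times in seconds. Returns 0 when the response time alone
 * already exceeds the interval each user has per transaction,
 * i.e. when the target can no longer be reached with this
 * number of users. */
double think_time(double virtual_users, double tps_target,
                  double response_time)
{
    assert(tps_target > 0.0);

    double t = virtual_users / tps_target - response_time;
    return t > 0.0 ? t : 0.0;
}
```

For example, 100 users aiming for 10 TPS with 2 second responses need an 8 second think time each. Raising tps_target step by step while keeping the users constant gives the ramp-up.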

Man, you've given me enough work for the next 3 years... :slight_smile:
Thanks a lot!

- HTML mode recording / playback support:

As far as I can see, the main protocol gatling supports is pure HTTP
calls. However, these calls do very little parsing on the result and
fetching all the static images, CSS and javascript embedded in a page
requires manually recording each and playing them back.
In loadrunner however there is something called "HTML mode", where all
static content that does not require running javascript is fetched
automatically, on the fly, without needing to manually rerecord everything
for each change to the page layout.

It also probably doesn't fetch style images from the css.

Actually, I'm pretty sure it does.

mmm, would be quite resource consuming, I guess. You'd have to parse the
css, find the ones that contain images, and apply all of them to the
HTML DOM to find out if they match.
Doable though.

It probably won't be as bad as the full-xml parsing web services protocol
loadrunner has, though .. or even worse: Citrix :stuck_out_tongue:

However, anything that isn't fetchable from inline page resources (due to
needing a javascript engine to execute them) is recorded as a so called
"EXTRARES" statement.

So scripts recorded in HTML mode will contain statements like:

web_url("URI=http://someuri.uri",
            "Referer=...",
            "Mode=HTML",
            EXTRARES,
            "Resource=http://someuri.uri/javascript_fetched_by_javascript",
            "Resource=http://someuri.uri/image_fetched_by_javascript",
            LAST );

How does LoadRunner figure out which resources were fetched by javascript
while recording? Does it ship/monitor a browser?

The recording engine listens in on the network traffic between a real
browser and the site, as usual for this type of tool.
After that .. I don't have access to the source code, but I think it simply
creates a list of all resources fetched from the initial request, then uses
its own HTML parser on the initial resource to create a second list of all
resources that can be fetched automatically.
Subtract the list with automatically fetchable resources from the original
list of recorded resources and you're done.
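
In code, that subtraction step would be something like the plain C sketch below. To be clear, this is my guess at the idea, not the actual recorder (I don't have its source); extra_resources() is a made-up name and the URLs are compared as plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy into `out` every URL from `recorded` that does not appear
 * in `parsed` (the resources the HTML parser would fetch by
 * itself). What is left over is the EXTRARES list.
 * `out` must have room for `recorded_count` entries.
 * Returns the number of entries written. */
size_t extra_resources(const char *const *recorded, size_t recorded_count,
                       const char *const *parsed, size_t parsed_count,
                       const char **out)
{
    assert(recorded != NULL && parsed != NULL && out != NULL);

    size_t kept = 0;
    for (size_t i = 0; i < recorded_count; i++) {
        int fetched_automatically = 0;
        for (size_t j = 0; j < parsed_count; j++) {
            if (strcmp(recorded[i], parsed[j]) == 0) {
                fetched_automatically = 1;
                break;
            }
        }
        if (!fetched_automatically)
            out[kept++] = recorded[i];
    }
    return kept;
}
```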

Something similar is done with cookies, I believe.

What is seen by the recorder as page resources and what is seen as a top
level is controlled by a default list of content-types for which separate
elements are created, and that list is configurable.
So for instance, JSON resources are not in the default content-type list
for top level recordable items, and end up in EXTRARES, unless you
configure the scenario to do something else.
(Note that loadrunner scripts are written in C. Most of which is pretty
well hidden, until you start looking for it :wink: )

Is there something like this in the works for gatling?

Not yet. I think it will be there for Gatling 2, but no date yet.
I have to say that I'm not fond of this feature as it usually only
fetches a part of the resources that would be fetched by a real browser,
and that those resources are usually cached/hosted on a CDN... But well,
people keep on asking for it, so...

That would be correct. Many of the scripts I create for my current
customer would have the 'fetch inline resources' configuration set to 'off'
when run by default, simply because those are irrelevant for the pages
themselves.
It's nice to be able to turn that back on for tests where the CDN is
actually the thing you wish to test, though. :wink:
Also, automatically following HTTP 302 responses is kinda nice as well.
Not sure if gatling does that, either. :wink:

Yes, followRedirect is enabled by default:
https://github.com/excilys/gatling/wiki/HTTP#wiki-follow-redirects

If not, how hard would it be to add such a thing myself? I've been
looking at gatling sources and it shouldn't be entirely impossible, but
some ideas on where to start would be nice. HTTP protocol itself isn't
quite it, since this would have to wrap / extend the HTTP protocol out of
necessity.

Unless you're good at Scala and willing to contribute, don't bother,
it would probably be too complex for you...
No offense, just that we have to implement async fetching first, before
automatic HTML parsing.

Good at Scala? I've read a few things, never tried my hand at it though.
I'm fairly decent in C and know my way around various other languages
(java, perl, PHP, bash ..) though. So trying my hand at it isn't entirely
out of the question. :slight_smile:

Prepare for some mind blowing adventure, then! :slight_smile:

I'll have to start with reading the documentation first .. as of now I
don't even know what all of the operators mean. :wink:

- real-time test monitoring and control
One of the really nice things of Loadrunner is that you can watch test
results in real time as they happen, using a controller dashboard with
various graphs. You don't have the same level of granularity to the graphs
as you can have post-test, but the important stuff like "how many errors is
this test generating" and "what do response times look like" is there.
This mode of operation appears to simply not exist in gatling. Is that
correct, or have I overlooked something?

You have some info in the command line, and you can also integrate with
Graphite, but we currently don't have a built-in live dashboard.

Hmm, that is one of the main killer features gatling is missing, I reckon.
The loadrunner controller interface is pretty decent - configurable
graphs, most of the information you really want to see available, etc. And
being able to spot errors as they occur with server logs tailing on the
background means you have the ability to immediately spot problems and
report / fix them. Or simply reduce the load, for that matter.

I agree.

- Impact of garbage collection on response times
Since gatling uses a java virtual machine, shouldn't there be a concern
that java virtual machine garbage collection pauses will influence the test
results?
If not, how can this be mitigated / prevented?

We try to optimize things the best we can: GC is tuned, and we try to
discard objects as soon as we can. For example, we discard response chunks
unless you perform some checks on the response body.

Right. So you haven't eliminated them, then.

As long as you allocate memory, you have to spend time deallocating it.
But the Gatling launch scripts do tune the JVM.

Which leads to more questions:
- Is it possible to measure gatling's own overhead - wasted time,
effectively - as a separate per-transaction thing?

Nope. But we subtract this overhead from pause times.

- Is it possible to run gatling inside a different JVM implementation
that does GC differently (or not at all) in order to avoid GC spikes?

Gatling is not bound to a given JVM and you can tune it the way you want,
just edit the launch script.
I'm not sure buying Azul's Zing for running Gatling would be worth it...

Since loadrunner installations of the size deployed at the customers I work
for typically take license fees of a few hundred thousand euros a year I'm
not sure it wouldn't (for those customers) :wink:

I'm pretty sure the answer to the latter question would be "yes", but I
don't know how hard that would be.
Note that I don't seek to eliminate memory management overhead, really.
Just *unpredictable* memory management overhead. Our reporting usually
talks about the 99th percentile of response times for various
transactions, so a sudden delay that inflates the response time
measurements can make a large difference to whether something actually
passes the test criteria.

Then G1 is the GC you're looking for. That's not the one configured in
Gatling scripts as it's only available since JDK7 and we still want to be
compatible with JDK6.

I'll look into it, thanks.

Speaking of which - does gatling report such a thing as the 99th (or
whatever) percentile of responsetimes?

By default, Gatling computes 95th and 99th percentiles, but this is
configurable.

Good, since I've actually had requests for the 99.5th percentile at some
point. That's a tad of a pain if your analysis tool only knows how to deal
with integer numbers :wink:
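
One way around the integer problem is to express the percentile in hundredths of a percent, so 99.5 becomes 9950 and everything stays integer math. A small sketch in plain C using the nearest-rank definition; this is not how Gatling or the loadrunner analysis tool computes percentiles, and percentile_bp() is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for doubles, ascending. */
static int compare_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile of `samples`.
 * `basis_points` is the percentile in hundredths of a percent:
 * 9900 = 99th, 9950 = 99.5th, 10000 = maximum.
 * Sorts `samples` in place. */
double percentile_bp(double *samples, size_t count,
                     unsigned int basis_points)
{
    assert(samples != NULL && count > 0);
    assert(basis_points > 0 && basis_points <= 10000);

    qsort(samples, count, sizeof samples[0], compare_double);

    /* rank = ceil(basis_points / 10000 * count), in integer math */
    size_t rank = ((size_t)basis_points * count + 9999) / 10000;
    return samples[rank - 1];
}
```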

- Decision lists
A feature I have in ylib is something called a 'profile list' - perhaps
better called a 'decision list'. Wherein the user can specify that at a
certain point in the simulation the virtual user will choose from a list of
follow-up steps, using weights (percentages, usually) that determine the
odds that a particular set of steps is chosen. Gatling does have
rudimentary support for if-else based decisions, but is there anything more
advanced out there for people who don't want to type out a series of nested
error-prone if-else statements?

What you're looking for is randomSwitch:
https://github.com/excilys/gatling/wiki/Structure-Elements#wiki-randomSwitch

Hmm, yes. That's it. :slight_smile:
Can that construct deal with floating point numbers, though? The
documentation doesn't say, and we have quite a few cases where precision to
2-3 digits after the comma is present / used.

Quite honestly, I kind of like the ylib implementation that accepts any
combination of numbers that doesn't sum up to a number bigger than MAX_INT
(2 billion and a bit, on 32 bit systems). That way you often don't even
have to calculate percentages yourselves - just toss in raw hit counts from
access logs and you let the computer figure out what that would mean in
terms of ratios :wink:

No, it expects an Int percent value. The reason is that you don't have to
pass a 100% sum: users that don't fall into any branch just go directly to
the next step after the randomSwitch.

Ok, that's a reason for having a fixed maximum (100), but not really a good
reason for only accepting Int values. If this is how it works I'm going to
have to resort to ugly hacks to get something resembling floating point
precision..

But of course since I wrote that I'm sorta biased.

(And dangerous, I might add, if you don't know what you're doing - unless
you know exactly what the ratio is between the click before that point and
afterwards.
My bigger test scenarios tend to involve spreadsheets. Large ones. :p)
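
For the curious, picking from a list of raw weights (hit counts straight from the access logs, no percentages) boils down to the plain C sketch below. It is a simplified illustration, not the actual y-lib profile list code; pick_weighted() and total_weight() are made-up names, and the random roll is passed in so the selection itself stays easy to test.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all weights. The caller has to make sure this fits. */
unsigned long total_weight(const unsigned int *weights, size_t count)
{
    assert(weights != NULL);

    unsigned long total = 0;
    for (size_t i = 0; i < count; i++)
        total += weights[i];
    return total;
}

/* Return the index of the entry that `roll` falls into, where
 * `roll` is a random number in [0, total_weight). Entry i is
 * chosen with odds weights[i] / total_weight.
 * Returns -1 if `roll` is out of range. */
int pick_weighted(const unsigned int *weights, size_t count,
                  unsigned long roll)
{
    assert(weights != NULL);

    unsigned long upper = 0;
    for (size_t i = 0; i < count; i++) {
        upper += weights[i];
        if (roll < upper)
            return (int)i;
    }
    return -1;
}
```

With hit counts of 7000, 2500 and 500 the three follow-up steps get picked 70%, 25% and 5% of the time, without anyone having to work out those percentages first.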

- Extended browser emulation:
One of the things I have built at some point for a customer is support
for emulating a range of different browsers based on production access
logs. The idea being that different browsers will use different settings
for things like HTTP connection pools and therefore behave differently.
This code would take a CSV file with various browser types, connection pool
settings, and (again) weights and then apply them proportionally each time
a new script iteration starts.

Same point as above: we have to implement async fetching.

That brings me to another feature I haven't seen yet - transactions.
The idea behind a transaction is to group a collection of requests into a
single responsetime measurement, so that instead of a bunch of URLs you
get measurements for "click on next", or "login".
Preferably numbered, so that it's easy to see from the controller
interface what order things get executed in.

A standard script would do something like:

lr_start_transaction("Open_startpage")
web_url("URL=http://start.page/", .... LAST);
lr_end_transaction("Open_startpage", LR_AUTO);

Which sounds useless - until you realize you can add as many web_url()
statements in there as you like. Or any other type of request, in fact.
(Usually my scripts use the ylib wrapper functions though, so that we can
add common prefixes and numbers to transaction names without having to be
explicit about them. Not to mention triggers. But that's something for
another day. If you're interested, there's code here:
https://github.com/randakar/y-lib/blob/master/y_transaction.c#L543)

We call them groups:
https://github.com/excilys/gatling/wiki/Structure-Elements#wiki-group
(transaction will mean something else for the future JDBC protocol support).
Until now, it was computing elapsed time, pauses included. We're changing
that to also have the time without pauses.

Ah, I see.
I guess I'll have to wrap that with something similar to transaction.c to
get:
- common prefixes for groups of groups .. err, blocks of groups - err.
"transaction blocks". - right, the naming requires some work, here.
- automatically numbering subsequent groups sequentially - handy if you
have, say, a lot of transactions named 'click next' or the like, and also
useful in the later reporting UI because transaction names are sorted
alphabetically.
- triggers - running code implicitly every time a transaction(or a specific
transaction) starts or ends. (hmm, maybe not, though. Scala probably
provides that.)
- subtransactions/groups - groups within groups.. Useful for things like "I
want responsetimes for this set of individual soap requests but also the
responsetime for the entire set together as that is what the end user will
experience", or things like "this request, depending on data, hits a
different set of backends, and we want to distinguish between those but
still preserve the normal responsetime as well."
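
The prefix and numbering part of that list is the easy bit. A sketch of what I mean in plain C, assuming a "prefix_NN_name" layout with a two-digit counter so that alphabetical order matches execution order; this is not the y_transaction.c code and the names are made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Running transaction number for one script iteration. */
struct name_counter {
    int next;
};

/* Build "<prefix>_<NN>_<name>" in `out` and advance the counter.
 * Returns 1 on success, 0 if the result does not fit in `out`
 * (in which case the counter is left untouched). */
int numbered_name(struct name_counter *counter, const char *prefix,
                  const char *name, char *out, size_t out_size)
{
    assert(counter != NULL && prefix != NULL && name != NULL);
    assert(out != NULL && strlen(name) > 0);

    int written = snprintf(out, out_size, "%s_%02d_%s",
                           prefix, counter->next, name);
    if (written < 0 || (size_t)written >= out_size)
        return 0;
    counter->next++;
    return 1;
}
```

Two consecutive "click_next" steps under the prefix "Webshop" then come out as Webshop_01_click_next and Webshop_02_click_next, which is what keeps them apart in the reports.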

- Full request logging
I usually rely heavily on 'debug mode' logging during script
development, where everything that goes into and out of a remote server is
logged while I'm still working on the simulation. As far as I can see no
such beast exists in Gatling. You get headers, yes, but not full body
contents.

As stated above, response chunks are by default discarded if they're not
required by checks. You can disable this:
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump
We'll be working on a debug mode too that will give you nice reports
with the full data.

Ah, yes. That's a tad clunky though, in the sense that you want to turn
this off whenever you put large amounts of load on something, and on when
you're running the script from within the scripting interface.
Usually the workaround for this is to detect what mode your script is
running in, but since gatling doesn't appear to make any such
distinction, testing for it is going to be a tad difficult.
(To be fair, the canonical way in loadrunner to do this is to check if
the assigned virtual user id (thread number, effectively) is negative.
Which will be the case whenever the scripting interface is running the
script, and not the case in real load tests.)

- User mode datapoints
I may have missed this - vaguely recall reading something about it
somewhere, in fact - but is it possible to specify your own metrics from
within gatling scripts to be recorded for later analysis / graphing?

WDYM?
You can dump custom data (
https://github.com/excilys/gatling/wiki/HTTP#wiki-custom-dump) and have
access to the original simulation.log file that you can parse on your own
too.

Custom dumps aren't exactly what I was looking for, though they're close.
The basic user mode datapoint is no more than:

lr_vuser_datapoint("name", value);

This name-value pair will be stored as a measurement value in the
collection database - same place the response times are kept, in other
words - and can be summoned up on both the controller interface and the
post-test analysis tool as one of the lines on the user datapoint graph.

We use this for all sorts of things. We have measurement pages that
expose application server internal values, which are fetched and turned
into datapoints by a separate monitoring script. We have datapoints that
extract how much 'wasted' time loadrunner reports for each transaction
(time spent on the load generator CPU, rather than waiting on responses)
and export those as vuser datapoints as well. Then there are datapoints
that are used for think-time based rampup scenarios to expose the exact
(calculated) think time value so that we can see if the load is ramped up
correctly. Etc, etc, ad infinitum. You name it, we've done it :wink:

Will have to think about it.

It's one of those swiss-army knife things that you really need to
experience a usecase for to see the value. Once you do, you're sold, I
reckon :slight_smile:

- Global verifications
It's possible in loadrunner to register a so-called "global
verification", that will perform checks on each page being requested,
rather than just a single page.

That's a new feature: https://github.com/excilys/gatling/issues/1289
It hasn't been shipped yet (will be in 2M4, no date yet).

Not sure I understand what this tracker item translates into. What kind
of "common HTTP checks" are we talking about here?

Let's call them default checks then. Clear?

Right. So these are built into gatling, and not something you can change
yourself.

These verifications work together with HTML mode (ignoring statics that
are not on the first level of the request) and can be paused and resumed as
needed to create exceptions for pages that diverge from the norm.
Does something like this exist in gatling?

How it's been implemented for now is that common checks are being
ignored as soon as you specify some on a given request.

So as soon as you specify a fine-grained check for a particular request
the common checks go out of the window?
That's definitely not what I want to do here.

Let's say that you know that your application will always format common
errors in a predictable way: Let's say there is a <div class="error">This
is an error</div> present in that case, on every page.
You also know that for a particular page the presence of the text
"<H1>Your account</H1>" means that particular page rendered at least
partially with success.
Except if there is such an error tag present - which can happen on every
page, not just that one.

What I do in such a case is register a global verification for the
snippet <div class="error"> that captures the text inside the div. This
check will then fail the transaction and report an error every time this
div is encountered, and report the exact text captured inside it.
Regardless of any other checks on the page:

The loadrunner code for that would look like:

web_global_verification("ID=errordiv", "TextPfx=<div class=\"error\">",
"TextSfx=</div>", "Fail=Found", LAST);

.. do requests ...

However, on one particular page this tag is always present, and NOT
having it on there is actually an error, for some odd reason. (Don't laugh.
I've seen it happen. :p)
So this check needs to be turned off:

web_global_verification_pause("ID=errordiv", LAST);

web_reg_find("Text=<div class=\"error\">This is not an error but we
really like pretty colours!</div>", LAST);
.. do request ..

web_global_verification_resume("ID=errordiv", LAST);

Does this make it clear what I mean?

Yeah, yeah :slight_smile:
This feature is very fresh and hasn't been released yet, so it
definitely needs polishing :slight_smile:
The problem is that we have no way of telling if a request level defined
check is to override a default one or not, as a check can be anything, even
a function (= some code).

I'm not surprised, I've had cases where that's exactly what I needed. :stuck_out_tongue:
(See: triggers, which is one way to implement that.)

But you're right, the default behavior should be append, and we could have
a way of telling it to ignore the defaults on a per-request basis.

Note that as you have code, you can compose your check lists the way you
want.

I expected nothing less :wink:

As of today our standard template for test scripts contains something
like 20-30 different global verifications for one thing or another.
Including some rather complicated ones that test for the absence or
presence of various versions of our left-hand side menu, which need to be
turned on and off with wrapper functions..
(To be fair - I am starting to suspect we're going to have to cut down on
this list some time soon. The oldest ones have been in there for years, and
I have no idea how often they fire .. probably not often.)

- Auto filters
One of the things that is very important to have is some kind of
protection against accidentally putting load on the wrong URI or server,
especially when using HTML mode. One of the ways we have to counter that is
the auto filter - a simple statement forbidding the script to touch any URI
that fails to match a defined criterion, such as a hostname or URI snippet.
When used in conjunction with a CSV datafile that defines environments and
their URIs this can be a very powerful safeguard.
Does such a thing exist in gatling?
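
To make the auto filter idea concrete, here is a minimal sketch in plain C of one possible criterion: an exact match on the hostname. This is not loadrunner's or gatling's implementation; `uri_host_allowed` is a hypothetical name, and the parsing is deliberately naive (no userinfo handling, case-sensitive comparison).

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the host part of uri is exactly allowed_host, 0 otherwise.
 * Assumption: uri looks like "scheme://host[:port][/path]" or starts
 * directly with the host. */
int uri_host_allowed(const char *uri, const char *allowed_host)
{
    assert(uri != NULL && allowed_host != NULL);

    /* Skip the scheme if there is one. */
    const char *host = strstr(uri, "://");
    host = host ? host + 3 : uri;

    /* The host ends at the first port, path, query or fragment delimiter. */
    size_t host_len = strcspn(host, ":/?#");

    return strlen(allowed_host) == host_len
        && strncmp(host, allowed_host, host_len) == 0;
}
```

Matching the whole hostname rather than a substring is a design choice here: a plain substring test would also let through something like test.example.com.evil.org when only test.example.com is meant to be allowed.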

No, but that's a nice idea.

Seven or eight years as a performance engineer with various big banks and
airline companies can get you a lot of ideas about how to run performance
tests, I can tell you that :slight_smile:

.. I probably can think of a few others, but this is a nice start for
now :slight_smile:

Thanks a lot for all those ideas.

Heh, that was just off the top of my head.

I haven't even begun talking about things like "data integrity checks",
"error ratio based load reduction", "log on error", "disk space guards",
"pacing", or "thinktime based rampup"

Or any of the subjects regarding correlation/playback that require me to
actually have some first hand experience with more complicated gatling
scenarios :wink:

.. ok .. since this is a wall of text already .. might as well continue
for a bit :wink:

Going over that list - I'm sure many of these are possible with gatling,
mind. Just not enough knowledge of the tool to be able to tell.

- Data integrity checks:
Custom checks that are layered on top of regular checks to verify that
the response doesn't just contain the correct page, but the correct *data*
as well.
Think of it as a check that only fires if all other checks pass, but starts
virtually jumping up and down and waving a big red flag whenever the page
displays data that doesn't correspond to the request.
In our case, it will not just fail the transaction, but effectively fail
the test itself as well, and generate a custom transaction to notify the
tester the second it occurs.
( The genesis of this was a production incident where some customers
would see the wrong information - belonging to different, basically random
customers - whenever they accessed a particular page. That sort of thing
really isn't good for your reputation, I can tell you that. And it didn't
show up in functional tests, because this problem was concurrency related,
and could never be reproduced with just a single user on the system. )

- Error ratio based load reduction
Sometimes we're not so much interested in bottlenecks as we are
interested in memory leaks. For this purpose, we wanted to put load on the
system as long as possible, and didn't care about temporary disruptions in
the backend infrastructure so much.
However, those disruptions often meant an end to the test in question,
because the load generators kept pumping requests at the system during the
disruption and it never got a chance to recover.
In order to work around that we have a feature that will detect
disruptions and reduce load temporarily. Whenever more than X
iterations in a row fail, the code in question will enforce a 900 second
sleep on the virtual user, and keep doing that until an iteration is
completed. This way the load will stay low as long as the problem remains,
and go back up whenever things get back up on their feet.
By default this feature is off, but when needed we can turn it on with a
simple command line attribute.

- Log on error
One useful feature of loadrunner is that you can set things up in such a
way that logging is turned off by default, but whenever an error occurs the
last X kb of log data is recorded on disk.
That reduces the amount of logging to what usually amounts to the only
things you're really interested in (including, I might add, full
responses). :wink:

- Disk space guards
Long running tests have a second problem: Too much disk space usage on
the generators, usually due to excessive logging or too many errors (which
turns all of the logging on temporarily, of course).
Loadrunner however never detects that, and will gleefully put truncated
files on disk, resulting in test corruption later.
So we had to work around it - with two pieces of code. One that will
check how much free space the generator has, and one that measures how much
the test used so far. If this code deems the usage is too high or the space
left too low, it will simply disable the logging entirely.
Example here:
https://github.com/randakar/y-lib/blob/master/y_logging.c#L306 -
https://github.com/randakar/y-lib/blob/master/y_logging.c#L345
That works, but it would be really nice if a test tool could detect such
a thing itself..

- Pacing
Another thing that appears to not be present in gatling scripts: Specific
settings for pacing.
If you really want your load numbers to be exact and to give something of
a representative load then you somehow want to steer how long each user
session lasts. If you rely only on think time and responsetimes to drive
load then you run into the issue that as soon as the load nears a
bottleneck the load will actually decrease because the slow increase in
responsetimes makes your users slow down.
The way to deal with that is to set up a pacing - where each run through
the script is allocated a set amount of time (usually about 2 minutes,
though we always apply a randomization factor to these things to keep
users from getting into lockstep with each other) and if the user finishes
its set of requests before that, the script will simply wait until the
required time has passed. Meaning that if some disruption occurs on the
application server, the load will just keep on going as if nothing is
happening.
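
The arithmetic behind pacing is simple enough to sketch in plain C. The function names are hypothetical, and the random number is passed in by the caller so the example stays deterministic; a real script would draw it from whatever random source the tool provides.

```c
#include <assert.h>

/* Seconds to wait at the end of an iteration so that the iteration as a
 * whole lasts target_seconds. Returns 0 if the iteration already overran. */
double pacing_wait(double target_seconds, double elapsed_seconds)
{
    assert(target_seconds >= 0.0 && elapsed_seconds >= 0.0);
    double remaining = target_seconds - elapsed_seconds;
    return remaining > 0.0 ? remaining : 0.0;
}

/* Randomize the pacing target to keep users out of lockstep.
 * spread is a fraction (0.25 means plus or minus 25%), unit_random is a
 * caller-supplied value in [0, 1). */
double randomized_pacing(double base_seconds, double spread, double unit_random)
{
    assert(base_seconds >= 0.0);
    assert(spread >= 0.0 && spread <= 1.0);
    assert(unit_random >= 0.0 && unit_random < 1.0);
    return base_seconds * (1.0 - spread + 2.0 * spread * unit_random);
}
```

For example, with a 120 second pacing and an iteration that took 45 seconds, the user waits another 75 seconds; if the iteration took 130 seconds, the wait is zero and the next iteration starts immediately.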

Loadrunner actually has several different ways to configure this: No
pacing, pacing with fixed intervals, pacing with variable intervals, and
you can set it to "delay X to Y seconds after the script ends".

In our case pacing becomes a bit of a complicated subject though, because we
also time the logout to occur (if it occurs - there's a percentage chance
associated with it) just before the end of the pacing duration, so we
actually have a calculated forced thinktime sitting right before the logout
that will delay it. And while we're at it, we also measure how long the
script took to reach that point, so that any overly long scripts or
too-short pacing settings can be acted upon by the tester.

- Thinktime based rampup
Steering the load by decreasing the think time, rather than adding more
threads or virtual users.
If you want to have the load ramp up linearly, this involves a bit of
math - the formula is:

thinktime = (virtual_users / TPS_target) - response_time;

See https://github.com/randakar/y-lib/blob/master/y_loadrunner_utils.c#L942
for an implementation. And lots of comments.
I like to think that code speaks for itself, but I may be wrong. :wink:
I'm sure this can be reimplemented in a gatling scenario fairly easily,
mind..
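
As a worked example of the formula above, here is a minimal sketch in plain C. It is not copied from y-lib; `thinktime_for_target` is a hypothetical name, and clamping the result at zero is my own assumption for the case where the target cannot be reached with the given number of users.

```c
#include <assert.h>

/* Think time (seconds) each virtual user needs so that virtual_users
 * together produce tps_target transactions per second, given the current
 * response_time (seconds). */
double thinktime_for_target(double virtual_users, double tps_target,
                            double response_time)
{
    assert(virtual_users >= 0.0 && response_time >= 0.0);
    assert(tps_target > 0.0); /* a zero target would divide by zero */

    double thinktime = (virtual_users / tps_target) - response_time;

    /* Assumption: a negative result means the users cannot reach the
     * target even without pausing, so fall back to no think time. */
    return thinktime > 0.0 ? thinktime : 0.0;
}
```

With 100 virtual users, a target of 10 TPS and a response time of 2 seconds, each user has a 10 second budget per transaction, so the think time comes out at 8 seconds. Ramping up linearly then means stepping tps_target up over time and recalculating.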

Man, you've given me enough work for the next 3 years... :slight_smile:
Thanks a lot!

Don't worry about it. It's high time someone toppled the proprietary tool
hegemony, I'm all too happy to do my bit.