Just started to play with Gatling, great tool! I'm trying to write a
scenario to test our home page. When the call goes out to www.homepage.com,
about 27 HTTP requests get triggered to the CDN server, image server,
app server, etc. I can't figure out how this gets handled in
Gatling.
I'm a newbie myself, but does the recorder capture all those requests
for you? If so you could record a session and take a look at the
recorded result to see what happens
I used the recorder today. It recorded 66 calls instead of the 27 or 29
that I need, and it did not record some of the calls that I see in
Firebug or that get fired automatically by other tools, like WebLOAD.
I'm hoping there is a better solution to this issue than use of
The recorder is just a proxy, so it is strange if it is simply not
recording requests. Maybe your browser was caching stuff from the CDN
and not actually sending those requests when you recorded?
Yes, the recorder is just a proxy, so it records the requests your browser sends.
Make sure you don’t have other panels sending AJAX requests (such as your emails getting refreshed, etc).
You can also add filters (Ant or regexp) to capture only the requests you’re interested in.
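To illustrate the regexp filtering idea, here is a minimal sketch in plain Scala. It is not the recorder's own code: the object name, the `whitelist` pattern and the sample URLs are all made up for illustration, and the recorder's real filter settings are configured in its own UI.

```scala
import scala.util.matching.Regex

object RecorderFilterSketch {
  // Hypothetical whitelist: keep only requests sent to the site under test.
  val whitelist: Seq[Regex] = Seq("""^https?://www\.homepage\.com/.*""".r)

  // A URL is kept when at least one whitelist pattern matches it entirely.
  def keep(url: String): Boolean =
    whitelist.exists(_.pattern.matcher(url).matches())

  // Drops everything the whitelist does not cover, e.g. webmail refreshes.
  def filterRecorded(urls: Seq[String]): Seq[String] = urls.filter(keep)
}
```

For example, `RecorderFilterSketch.filterRecorded(Seq("http://www.homepage.com/index.html", "https://mail.example.org/refresh"))` keeps only the first URL.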
2012/5/4 Perryn Fowler <firstname.lastname@example.org>
Thank You Steph,
I'm going to use regexes and try to record what I need. How
Would that solve my issue?
Hard to tell, as I don’t know exactly what requests are missing and what requests should be here.
Could you please elaborate?
Regarding followRedirect, it means that the engine will automatically reissue requests when receiving a 30X response status code, until getting a non-30X one. So yes, you get fewer requests in your scenario when using followRedirect.
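The redirect-following behavior described above can be sketched roughly as follows. This is only an illustration in plain Scala, not Gatling's implementation: the `Response` type, the canned `server` map and the `maxHops` limit are assumptions made up for the example.

```scala
import scala.annotation.tailrec

object FollowRedirectSketch {
  // Minimal stand-in for an HTTP response: status code plus optional Location header.
  final case class Response(status: Int, location: Option[String])

  // Hypothetical canned responses standing in for a real server.
  val server: Map[String, Response] = Map(
    "http://homepage.com"          -> Response(301, Some("http://www.homepage.com")),
    "http://www.homepage.com"      -> Response(302, Some("http://www.homepage.com/home")),
    "http://www.homepage.com/home" -> Response(200, None)
  )

  // Reissues the request while the status is 3xx; returns every URL requested, in order.
  @tailrec
  def follow(url: String, hops: List[String] = Nil, maxHops: Int = 10): List[String] = {
    val visited = url :: hops
    server.get(url) match {
      case Some(Response(status, Some(next)))
          if status >= 300 && status < 400 && visited.size <= maxHops =>
        follow(next, visited, maxHops)
      case _ => visited.reverse
    }
  }
}
```

Here a single scenario request to `http://homepage.com` results in three requests on the wire, which is why the scenario itself needs fewer explicit requests.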
2012/5/4 David T <email@example.com>
Basically my main concern is not that some calls were not recorded,
but if there is function/method I can use to pull all embedded
resources for the page?
Thank You for helping me with this.
The answer is no.
You have to define every request in your scenario, so the best way is to record them with the recorder.
2012/5/5 David T <firstname.lastname@example.org>
What do you think about supporting extraction and retrieval of just
the embedded resources defined in the html response with no evaluation
Do you know if this approach is much more expensive than:
1. checking the response meets expectations via regex/xpath
2. continue on and issue the expected hard-coded requests
note: I work with David, and this topic came up in a discussion about
creating and maintaining gatling simulations efficiently. Clearly,
there is a runtime cost to fetching embedded resources, but I haven't
quantified it for myself yet. From our team's perspective, the
runtime cost is less important than the maintenance cost of keeping
simulations aligned with the real world, which we generally don't have
To that end, I was thinking of implementing the equivalent of JMeter's
"Retrieve all embedded resources" feature for http requests:
JMeter's implementation retrieves embedded:
* external scripts
* frames, iframes
* background images (body, table, TD, TR)
* background sound
If there is not a philosophical or architectural problem with adding
this feature, I would like to attempt it and see if and how it works
out. I understand that gatling is designed to be different from tools
like JMeter and WebLoad, and that if the feature cannot scale and
perform, then it won't make it into the official codebase.
The dsl would look something like:
.check(status.in(200 to 210))
retrieveEmbeddedResources retrieves all embedded resources by default,
but also takes an optional whitelist of regexes for filtering the
resource requests. retrieveEmbeddedResources might also need a
facility for specifying checks.
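A rough sketch of what the extraction step with an optional whitelist could look like, in plain Scala. This is an assumption-laden illustration, not part of Gatling: the object and method names are hypothetical, and it uses regexes over the raw HTML instead of a real HTML parser, so it only covers simple quoted `src` and `href` attributes.

```scala
import scala.util.matching.Regex

object EmbeddedResourcesSketch {
  // src="..." on script, img, frame and iframe tags (regex-based, not a full HTML parser).
  private val SrcAttr: Regex =
    """(?i)<(?:script|img|i?frame)\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']""".r
  // href="..." on link tags, e.g. stylesheets.
  private val LinkHref: Regex =
    """(?i)<link\b[^>]*?\bhref\s*=\s*["']([^"']+)["']""".r

  // Returns the distinct resource URLs found; an empty whitelist means "keep everything".
  def extract(html: String, whitelist: Seq[Regex] = Nil): Seq[String] = {
    val all = (SrcAttr.findAllMatchIn(html) ++ LinkHref.findAllMatchIn(html))
      .map(_.group(1))
      .toSeq
      .distinct
    if (whitelist.isEmpty) all
    else all.filter(url => whitelist.exists(_.findFirstIn(url).isDefined))
  }
}
```

Calling `extract(html)` returns every resource found, while `extract(html, Seq("""\.js$""".r))` keeps only the scripts.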
First of all, thanks for the pull requests.
Static resources retrieving raises a lot of questions:
- Why fetch them? It’s quite common that those are served by a cache or a CDN, so performance might only be an issue from a network or browser perspective.
- Real world behavior depends on cache headers and browser cache content. How will your solution behave? Have a conservative approach where users would have an empty cache? Would it support caching and cache expiration?
- Gatling’s current workflow is a sequence. It means that we have yet to implement a workflow where resources get fetched with a scatter/gather strategy. See https://github.com/excilys/gatling/issues/431. I have a solution in mind that would spawn actors for each scatter/gather execution. Your solution should be based on this mechanism, otherwise your static resources will be fetched one after another, and that’s not real world behavior.
- What would the reports look like? Like with follow redirect support: “Request name, resource 1”, “Request name, resource 2”, etc…?
2012/5/6 Stephen Kuenzli <email@example.com>
You’re welcome for the minor fixes. Thank you for creating such a nice tool. It is a credit to yourself and the team that gatling is maintainable, powerful, efficient, and scalable.
You raise some good questions and some answers are not clear-cut:
1. Why fetch embedded assets?
a) Large assets can sneak into the source repo and content management systems and kill performance. Alternatively, the repo and all CMSes could guard against this problem on submit instead of writing tests to check for the problem.
b) Some of the embedded assets are actually dynamically-generated responses, e.g. a thumbnail rendered for a profile photo
c) Some CDNs are better than others and it is useful to characterize their performance from time to time.
How should caching be handled? I’d say an embedded request would be done as if the cache is empty with no specification of headers or caching behavior. If testing of cache-related headers is important for a particular simulation, it’s probably better to write the simulation that way.
How should the new embedded requests fit into the workflow? I was thinking of appending to the workflow sequence after parsing the response. Scatter/gather is certainly desirable with some level of specifiable concurrency; 4-8 concurrent requests should simulate most modern browsers.
What about optimizing the simulation by sampling the flow for one user and changing future actors’ behavior? That sounds great from a simulation-writer’s perspective, but it seems like it might be a bit much to ask gatling to do this.
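To make the scatter/gather idea with bounded concurrency concrete, here is a minimal sketch in plain Scala. It uses blocking calls on a fixed thread pool, which is an assumption for illustration only and not the actor-based mechanism discussed for Gatling; `fetch` is a hypothetical stand-in for a real HTTP call.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ScatterGatherSketch {
  // Stand-in for a real HTTP request: sleeps briefly and returns a fake 200 status.
  def fetch(url: String): Int = { Thread.sleep(50); 200 }

  // Scatter: one task per URL, at most `maxConcurrent` running at once.
  // Gather: wait for all of them and return (url, status) pairs in input order.
  def fetchAll(urls: Seq[String], maxConcurrent: Int): Seq[(String, Int)] = {
    val pool = Executors.newFixedThreadPool(maxConcurrent)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val scattered = urls.map(url => Future(url -> fetch(url)))
      Await.result(Future.sequence(scattered), 30.seconds)
    } finally pool.shutdown()
  }
}
```

With `maxConcurrent` set to 6, for instance, the resources of one page are fetched up to six at a time instead of one after another.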
What would reports look like? I think the redirect model looks fairly good, but might suggest:
“Request Name - Embedded 1 - <embedded resource uri 1>”
“Request Name - Embedded 2 - <embedded resource uri 2>”
“Request Name - Embedded 3 - <embedded resource uri 3>”
For this case the server-side caching should be considered, so that we simulate the real traffic between the client and the server.
At least we need to be able to filter according to caching status.
First call : get all
Second call : get only non cached content
Without this, the http behaviour will be too eager and will not reflect reality.
Still this is an interesting feature to start implementing.
@dbaeli (Salut, BTW) I agree, if we implement this feature, we have to implement a simple caching behavior. Just like in existing cookies implementation, let’s drop timed expiration.
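A simple per-user cache without timed expiration, along the lines of "first call: get all, second call: get only non-cached content", could be sketched like this in plain Scala. The class and method names are hypothetical and nothing here is existing Gatling code.

```scala
object ResourceCacheSketch {
  // One instance per simulated user; entries never expire (no timed expiration).
  final class UserCache {
    private var seen = Set.empty[String]

    // Returns the resources this user still has to fetch and remembers them as cached.
    def toFetch(resources: Seq[String]): Seq[String] = {
      val missing = resources.distinct.filterNot(seen)
      seen ++= missing
      missing
    }
  }
}
```

On the first page view `toFetch` returns every resource; on later page views it returns only the resources the user has not fetched yet.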
Here are the points that still puzzle me:
- I’m afraid having a limit on the number of concurrent requests used for fetching the page resources will not be easily feasible (if feasible at all). I’d rather not have this feature at first.
- I’d rather not have to implement HTML and CSS parsing myself… Some recommendations? neko?
Anyway, this is an interesting feature, but be aware it will take time to implement…
I don't really have a dog in this discussion since this isn't a
feature I would use but it does seem like there is a big potential
here of the team getting drowned by the complexities of implementing
this. What is the use case? Gatling is a load testing tool and in
general static content like CSS and images generate very little load
on origin servers. If the site has any traffic to speak of this
content should already be served by a CDN in which case adding it to a
load test is meaningless.
If, on the other hand, you are trying to somehow test the speed the
page actually loads then none of the features discussed in this thread
would help with that. Again assuming the page is complex at all there
is most likely AJAX content being loaded after the page is officially
loaded. So next people would be asking for Gatling to execute
So in short I think if load testing is the real goal then static
content from a web page is basically meaningless. If responsiveness
of the page is what you want to test then the team would have to
implement a complete browser within Gatling. I've never seen a tool
do that very well. Even ones that are specifically designed to. I
would rather have the Gatling team spend their time focusing on
improving the load testing features without bloating the tool with a
bunch of features that are outside the scope of load/stress testing.
But again that's mostly me being selfish because I wouldn't make use
of the features discussed in this thread.
re concurrency limits:
It’s reasonable to skip limits in a first attempt; limits are not necessary for the feature to be useful.
nekohtml is probably a good bet, it seems to be maintained and is used by Selenium 2 for their headless webdriver.
JMeter uses/composes sourceforge’s htmlparser (http://htmlparser.sourceforge.net/) for its embedded resource extraction. The sourceforge htmlparser project shows a last update in 2006, so their HTML5 support is probably going to be lacking. The relevant JMeter class looks like HtmlParserHTMLParser: http://svn.apache.org/repos/asf/jmeter/trunk/src/protocol/http/org/apache/jmeter/protocol/http/parser/HtmlParserHTMLParser.java
Sorry, I finally found the time to continue with this thread.
I agree with Chris that static resources handling seems to be important mostly for applications that don’t follow classic techniques such as cache headers, server caching, sprites or CDN.
Tools such as YSlow and SpeedTracer can easily point out such problems, without running a stress test.
If someone can contribute the parsing engine with sufficient tests, we’ll be able to build the actor stuff on top of it.
Otherwise, I’m afraid our hands are full with other features to implement, such as clustering, server monitoring and database persistence.
2012/5/10 Stephen Kuenzli <firstname.lastname@example.org>