One more thread about shared connections, keep-alive, socket reuse, etc.

Hello Gatling community,

I'm facing a problem implementing my scenario with Gatling, and I'm starting to think I misunderstand some principles of load testing with it. Here is a description of what I want to simulate:
INTRO:

  1. In production we have a cluster of ~30 servers that receives ~60k RPS in total. Each request contains JSON that we parse and answer with another JSON. All load balancing is done by Nginx.
  2. Nginx keeps ~20 TCP keep-alive connections open to each server.
  3. Each server handles ~2k RPS.

So I want to simulate production conditions as closely as possible.

What I have/use:

  1. Gatling 2.1.7 running on a c4.xlarge with Ubuntu tuned for load testing
  2. An instance identical to production that should handle 2k RPS with a max response time of 120 ms
  3. The Gatling scenario below

```scala
import io.gatling.core.Predef._
import io.gatling.core.check.CheckResult
import io.gatling.core.scenario.Simulation
import io.gatling.core.validation.Failure
import io.gatling.udp.Predef._
import io.gatling.udp.UdpMessage
import org.jboss.netty.handler.codec.string.{StringDecoder, StringEncoder}
import io.gatling.http.Predef._
import scala.concurrent.duration._
import load.utils.BearerTokenGenerator

import scala.util.Random

class ZettaLoadTest extends Simulation {

  val Users = 10

  val widthFeeder = Seq(58188555, 1234).toArray.map(r => Map("widthValue" -> r)).random

  val bidRequest1 =
    """{"device": {"dpidsha1": "dpid", "dnt": 1, "geo": {"country": "US"}}, "imp": [{"banner": {"h": 58188555, "api": [1],
      "id": "1", "w": ${widthValue}}, "id": "1"}], "app": {"publisher": {"id": "agltb3B1Yi1pbmNyAEsBS0GjZ291bnQY9Iv5FAw"},
      "id": "agltb3B1Yi1pdmNyBAsSA0FwcBjRmvkVDB", "paid": 0}, "id": "country", "user": {"geo": {}, "id": "qwerty"}}""".stripMargin

  val bidderBaseUrl = "http://biddder.base.url"

  val httpConfig = http
    .baseURL(bidderBaseUrl)
    .acceptHeader("Content-type: application/x-www-form-urlencoded")
    .connection("keep-alive")
    .shareConnections

  val scn = scenario("Zetta Load Test")
    .feed(widthFeeder)
    .repeat(2000) {
      exec(
        http("zetta http post")
          .post("/rtb23/nexage/bid")
          .body(StringBody(bidRequest1))
          .asJSON
          .check(status.in(200, 204))
      )
    }

  setUp(
    scn.inject(
      atOnceUsers(Users)
    ).protocols(httpConfig)
  )
}
```

Problem:
When I run this scenario I expect Gatling to open 10 keep-alive connections for the 10 virtual users, and each user to repeat its scenario 2000 times over its own connection. The actual result is a bit different:

```
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.76' | wc -l
199

tcp 0 0 10.3.1.121:8090 10.3.0.76:42546 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:42551 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:42477 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:42485 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:42612 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:42577 TIME_WAIT -
```

I configured sysctl with net.ipv4.tcp_tw_reuse = 1 to force the kernel to reuse these TIME_WAIT connections, but I'm still curious why I can't get Gatling to open only 10 connections and reuse them, instead of ~200 connections (200 handshakes, more CPU load, further from a prod-like environment). Any help would be highly appreciated.
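One way to keep the request rate itself prod-like (~2k RPS per server) with the same 10 long-lived users would be Gatling's throttling DSL. This is a rough sketch only: the ramp/hold durations and the trimmed request body are placeholders, and the base URL is the one from the scenario above.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Sketch: keep the same 10 long-lived users (and thus 10 keep-alive connections)
// but cap the aggregate request rate at ~2000 RPS, roughly matching one prod server.
// Ramp/hold durations and the trimmed request body are illustrative only.
class ThrottledZettaLoadTest extends Simulation {

  val httpConfig = http
    .baseURL("http://biddder.base.url") // same base URL as above

  val scn = scenario("Zetta Load Test - throttled")
    .repeat(20000) {
      exec(
        http("zetta http post")
          .post("/rtb23/nexage/bid")
          .body(StringBody("""{"id": "country"}""")) // trimmed body for the sketch
          .asJSON
          .check(status.in(200, 204))
      )
    }

  setUp(
    scn.inject(atOnceUsers(10)).protocols(httpConfig)
  ).throttle(
    reachRps(2000) in (30 seconds), // ramp up to the target rate
    holdFor(5 minutes)              // then hold it there
  )
}
```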

Hi Alexander,

I guess the problem is here:

```scala
val httpConfig = http
  .baseURL(bidderBaseUrl)
  .acceptHeader("Content-type: application/x-www-form-urlencoded")
  .connection("keep-alive")
  .shareConnections
```

As per Gatling's documentation, setting this option does exactly the opposite of what you're trying to achieve; see the details here:
http://gatling.io/docs/2.1.7/http/http_protocol.html#connection-sharing
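For reference, the two modes compare roughly like this. A minimal sketch; note that HTTP keep-alive is already Gatling's default behavior, so the explicit .connection("keep-alive") shouldn't be needed:

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._

object ConnectionModes {
  // Default: each virtual user gets its own keep-alive connection(s),
  // reused across that user's requests and closed when the user finishes.
  val perUserConnections = http
    .baseURL("http://biddder.base.url")

  // shareConnections: one connection pool shared by all virtual users,
  // closer to a server-to-server client multiplexing over a small pool.
  val sharedPool = http
    .baseURL("http://biddder.base.url")
    .shareConnections
}
```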

Regards,
Aleksei

I tried commenting it out, but that doesn't change anything, and neither does using .disableClientSharing. It's OK if the clients share a pool of 10 connections (although IMHO dedicated connections are better because there is no connection switching). My problem is that the number of active TCP connections always equals the number of Gatling users, but after completing one scenario iteration (one repeat) a virtual user seems to drop its connection and open a new one. Here is my netstat for the 10-user simulation:

```
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.76'
tcp 0 0 10.3.1.121:8090 10.3.0.76:51733 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51729 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51716 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51727 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51725 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51735 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51723 TIME_WAIT -
tcp 0 123 10.3.1.121:8090 10.3.0.76:51726 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51718 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51724 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51728 ESTABLISHED 25663/beam.smp
tcp 512 0 10.3.1.121:8090 10.3.0.76:51730 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51720 TIME_WAIT -
tcp 516 0 10.3.1.121:8090 10.3.0.76:51732 ESTABLISHED 25663/beam.smp
tcp 516 0 10.3.1.121:8090 10.3.0.76:51734 ESTABLISHED 25663/beam.smp
tcp 0 0 10.3.1.121:8090 10.3.0.76:51719 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51717 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51721 TIME_WAIT -
tcp 0 0 10.3.1.121:8090 10.3.0.76:51722 TIME_WAIT -
tcp 512 0 10.3.1.121:8090 10.3.0.76:51731 ESTABLISHED 25663/beam.smp

```

I agree this is not the expected behavior. Could you provide a reproducer, please?
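Ideally something minimal and self-contained, along these lines (host, port, path and body below are placeholders), plus your gatling.conf:

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Minimal reproducer skeleton: 10 users, each looping over a single POST
// against one endpoint, no extra plugins. Host/port/path/body are placeholders.
class ConnectionReuseRepro extends Simulation {

  val httpConfig = http
    .baseURL("http://target.host:8090") // placeholder target

  val scn = scenario("connection reuse repro")
    .repeat(2000) {
      exec(
        http("post")
          .post("/rtb23/nexage/bid")
          .body(StringBody("""{"id": "1"}"""))
          .asJSON
          .check(status.in(200, 204))
      )
    }

  setUp(scn.inject(atOnceUsers(10)).protocols(httpConfig))
}
```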

I tried to describe my environment in as much detail as possible in the first post. Here is the scenario code I'm currently working with, plus my gatling.conf attached. I've tried running the load test from my MacBook to rule out possible AWS network "optimizations" but got the same result. Please let me know if you need any additional information to reproduce it.

```scala
package zetta

import io.gatling.core.Predef._
import io.gatling.core.check.CheckResult
import io.gatling.core.scenario.Simulation
import io.gatling.core.validation.Failure
import io.gatling.udp.Predef._
import io.gatling.udp.UdpMessage
import org.jboss.netty.handler.codec.string.{StringDecoder, StringEncoder}
import io.gatling.http.Predef._
import scala.concurrent.duration._
import load.utils.BearerTokenGenerator

import scala.util.Random

class ZettaLoadTest extends Simulation {

  val Users = 10

  val widthFeeder = Seq(58188555, 1234).toArray.map(r => Map("widthValue" -> r)).random

  val bidRequest1 =
    """{"device": {"dpidsha1": "dpid", "dnt": 1, "geo": {"country": "US"}}, "imp": [{"banner": {"h": 58188555, "api": [1],
      "id": "1", "w": ${widthValue}}, "id": "1"}], "app": {"publisher": {"id": "agltb3B1Yi1pbmNyAEsBS0GjZ291bnQY9Iv5FAw"},
      "id": "agltb3B1Yi1pdmNyBAsSA0FwcBjRmvkVDB", "paid": 0}, "id": "country", "user": {"geo": {}, "id": "qwerty"}}""".stripMargin

  val bidderBaseUrl = "http://bid1.load.va.us.rtb2.strikead.com"

  val httpConfig = http
    .baseURL(bidderBaseUrl)
    .acceptHeader("Content-type: application/x-www-form-urlencoded")
    .connection("keep-alive")
    .disableClientSharing
    // .shareConnections

  val scn = scenario("Zetta Load Test")
    .feed(widthFeeder)
    .repeat(20000) {
      exec(
        http("zetta http post")
          .post("/rtb23/nexage/bid")
          .body(StringBody(bidRequest1))
          .asJSON
          .check(status.in(200, 204))
      )
    }

  setUp(
    scn.inject(
      atOnceUsers(Users)
    ).protocols(httpConfig)
  )
}
```

Oh, I didn't realize that you're using a third-party extension for UDP support (I guess this one: https://github.com/carlosraphael/gatling-udp-extension).
Your issue is definitely related to this plugin, not to the official Gatling components.

This is a third-party extension, so it's not maintained by the Gatling authors. And it shouldn't be using our groupId and package.

If the author doesn’t answer on this mailing list, you can try getting in touch with him on his bugtracker.

If your company really needs official UDP support in Gatling, please get in touch at contact@gatling.io.

Cheers,

I tried removing these libraries from my project (removed them from IDEA, removed them from build.sbt, reimported the project) and running the load test from my laptop, but got exactly the same behavior… The only difference is that now new connections are opened less evenly: more like N new connections every X seconds, where N is the number of users in my scenario and X seems to be some periodic parameter in the config that I'm missing :) Here is an example where 1 connection is my SSH session, so Gatling is creating 20-40-60-80-100 connections:

```
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.200' | wc -l
41
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.200' | wc -l
61
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.200' | wc -l
81
bid1:/home/alexandr.boiko # netstat -tulpan | grep '10.3.0.200' | wc -l
101
```
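For context, a rough build.sbt sketch of this setup with the UDP extension dropped. The Gatling artifacts are the standard 2.1.7 ones; the removed extension's coordinates below are hypothetical, shown only to illustrate what was taken out before re-running the test:

```scala
// build.sbt (sketch) -- the UDP extension coordinates are made up,
// only to illustrate what was removed before re-running the test.
name := "zetta-load-test"

scalaVersion := "2.11.7"

libraryDependencies ++= Seq(
  "io.gatling.highcharts" % "gatling-charts-highcharts" % "2.1.7" % "test",
  "io.gatling"            % "gatling-test-framework"    % "2.1.7" % "test"
  // removed: the third-party UDP extension, e.g.
  // "some.group" %% "gatling-udp-extension" % "x.y.z" % "test"
)
```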

If you can provide private access to an environment where this reproduces, I'd gladly investigate.

I'll discuss this with our DevOps/admin team to get you access and will let you know ASAP. Thanks for the quick replies and involvement!