DNS issues testing against ELB

I am having trouble getting Gatling to periodically re-resolve the DNS entry I use in my URLs. For example, I used a DNS entry that points to one of three Elastic Load Balancers (ELBs). Once my traffic was running, I changed the DNS entry to point to a different ELB, but the traffic didn’t seem to switch over. Also, keep in mind that an ELB might add more IPs as it scales with load, which could cause a similar problem.

Note that I did disable Java’s name resolution, but it seems that I still have problems turning off DNS caching completely.

Here is a nice article detailing the problem, and one solution implemented by JMeter.

To complicate matters, I don’t think .perUserNameResolution will help me. I am simulating devices that all come online near the start of the simulation and then post datapoints periodically for the rest of it. The devices (or users) never leave the simulation, so nothing gets re-resolved.

I took a look at the Gatling code, but I found it difficult to figure out a solution. I was wondering if this might help:

https://developer.lightbend.com/docs/akka-management/current/discovery/index.html

Any help is much appreciated. Thanks for a great tool!

Ron

Maybe there is some misunderstanding about when DNS resolution occurs, like in here?

Hmm. So the devices (aka users) I am simulating might only post a single datapoint in an hour, for example. The real devices will reuse a TCP connection, unless there is an idle period of more than 10 minutes. After that, the TCP connection gets reestablished. The cloud does not maintain one TCP connection for the life of each device. That would be too many connections. There is a lightweight UDP connection that is used as a back channel to tell the device to come to the cloud to pick up changes, if one has been posted (say by a mobile application). I don’t need to simulate the UDP part, but it would be nice if connections were terminated every once in a while.

Is there a way for me to close a TCP connection every once in a while?

Any ideas for how I can simulate the reality of the IoT problem I have?

Ron

Hi Ron,

For a long time, I have not been able to find a solution to this same problem in my simulation environment. With Stephane’s response to your question, I managed to gather the following findings:

First, I disabled InetAddress caching in my Gatling load generator with JDK8 by setting networkaddress.cache.ttl=0 in $JAVA_HOME/jre/lib/security/java.security. I found a tool to help verify if InetAddress caching was enabled/disabled as expected.
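
For anyone who would rather not edit java.security, here is a minimal sketch of setting and reading back the same security property from code. The class and method names (DnsCacheTtl, disableCaching, currentTtl) are made up for illustration. As far as I know, on JDK 8 the value is read once when the JVM first initializes its address cache policy, so the call has to run before any name lookup. It also only affects InetAddress lookups, not a client that uses its own resolver.

```java
import java.security.Security;

public class DnsCacheTtl {
    // Security property named in the post; "0" means successful lookups are not cached.
    static final String TTL_KEY = "networkaddress.cache.ttl";

    /** Sets the positive-lookup TTL to 0. Assumed to run before the JVM's first name lookup. */
    static void disableCaching() {
        Security.setProperty(TTL_KEY, "0");
    }

    /** Returns the currently configured value, or null if the property is unset. */
    static String currentTtl() {
        return Security.getProperty(TTL_KEY);
    }

    public static void main(String[] args) {
        disableCaching();
        // Read the property back to confirm what the JVM will see.
        System.out.println(TTL_KEY + " = " + currentTtl());
    }
}
```

Reading the property back only confirms the setting, not that lookups actually hit the DNS server, so the tcpdump check is still worth doing.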

Second, I ran “tcpdump -l -n port 53 or port 8443” to capture any request sent to the DNS server (port 53) or the target server (port 8443).

I then generated 1-hour-long Gatling load traffic with multiple virtual users, and observed the following:

  • With Gatling 2.3.1, each virtual user issued a DNS query only once (as you have observed) over the whole 1-hour duration. The HTTP connection pool was not enabled, so a new connection was opened for every request.
  • With Gatling 3.0, a DNS query was issued for every new connection. The HTTP connection pool was indeed enabled: multiple back-to-back requests (or requests a few seconds apart) were sent over a single connection. Requests that were 5-10 minutes apart were sent over different connections.

I don’t know how the HTTP connection pool works in Gatling 3.0 or whether it can be disabled. Could Stephane and team please help?

Vu

Connection pooling obeys the Connection request and response headers, plus there’s an idle timeout that can be configured in gatling.conf.
If your connections are kept alive, it means that the client side doesn’t advertise that it will close them (no Connection: close request header) and the server doesn’t close them either.
That’s plain old HTTP.
That’s for you to decide which behavior best matches how things happen on your live system.
Don’t cheat to get better load test results at the cost of unrealistic behavior.
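
To make the header rule concrete, here is a rough sketch of the HTTP/1.1 keep-alive decision: a connection stays reusable unless either side sends a close token in its Connection header. This is not Gatling's code. The class and method names (KeepAliveCheck, hasToken, canReuse) are made up for illustration, and the idle timeout is left out.

```java
import java.util.Arrays;
import java.util.Locale;

public class KeepAliveCheck {
    /** True if the comma-separated header value contains the token, ignoring case and spaces. */
    static boolean hasToken(String headerValue, String token) {
        if (headerValue == null) return false; // header absent
        return Arrays.stream(headerValue.split(","))
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .anyMatch(token::equals);
    }

    /** HTTP/1.1 default: persistent unless the request or the response says "close". */
    static boolean canReuse(String requestConnection, String responseConnection) {
        return !hasToken(requestConnection, "close")
            && !hasToken(responseConnection, "close");
    }

    public static void main(String[] args) {
        // Neither side sent a Connection header: the connection may be pooled.
        System.out.println(canReuse(null, null));
        // The client advertised Connection: close.
        System.out.println(canReuse("close", null));
        // The server answered with a close token among others.
        System.out.println(canReuse(null, "Keep-Alive, Close"));
    }
}
```

In a load test, the equivalent choice is whether the simulated client sends Connection: close, which should mirror what the real devices do.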