Hi,
I am using Gatling (3.0.3) and I am facing a NAT gateway idle connection timeout on slow requests (~6min).
Gatling keeps waiting forever for the response, even though the connection between the gateway and the target server has been dropped.
Unfortunately, I cannot change this timeout: it is an AWS NAT Gateway with a hardcoded 350s idle timeout.
A classic solution to bypass a NAT router idle timeout is to use TCP keepalive.
This requires changing the OS (Linux) configuration on the Gatling host so that some keepalive activity happens before the timeout (changing tcp_keepalive_time from 7200 to 60):
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_time = 60
But so far it does not work: the problem remains.
Looking at the Gatling connection, I cannot see any keepalive activity with either tcpdump or netstat (the Timer column displays keepalive when it is active):
$ netstat -latopen
Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name Timer
tcp 0 0 172.17.0.2:35970 X.X.X.X:80 ESTABLISHED 1001 1201186 18767/java off (0.00/0/0)
Am I missing something, or does Gatling/Netty not set the SO_KEEPALIVE option by default?
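For reference, here is a minimal sketch of what I mean by SO_KEEPALIVE being a per-socket option. It uses plain java.net sockets on the loopback interface, not Gatling or Netty, and the class and method names (KeepAliveCheck, probe) are made up for illustration. As far as I understand, the tcp_keepalive_* sysctls only apply to sockets that have this option enabled.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveCheck {

    // Opens a loopback connection and returns the SO_KEEPALIVE flag of the
    // client socket before and after explicitly enabling it.
    public static boolean[] probe() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the OS pick a free port; the kernel completes the
        // handshake through the listen backlog, so no accept() is needed here.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            boolean before = client.getKeepAlive(); // value the JVM starts with
            client.setKeepAlive(true);              // opt in to kernel keepalive probes
            boolean after = client.getKeepAlive();
            return new boolean[] { before, after };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] flags = probe();
        System.out.println("SO_KEEPALIVE by default: " + flags[0]);
        System.out.println("SO_KEEPALIVE after setKeepAlive(true): " + flags[1]);
    }
}
```

With plain Java sockets the option is off unless the application turns it on, which would match the "off" timer in the netstat output above. Netty exposes the same setting as ChannelOption.SO_KEEPALIVE; whether Gatling sets it, or lets me set it, is what I am trying to find out.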
Thanks for your help and keep up the good work.
ben