[Gatling 2.0.0-M3] Actor AsyncHandlerActor doesn't support message ReceiveTimeout

Hi!

When an HTTP request times out I get the error below. Is this the correct behavior? Shouldn’t AsyncHandlerActor handle ReceiveTimeout, since it sets a receive timeout in preStart (using configuration.http.ahc.requestTimeOutInMs)? I think this might be why my scenario sometimes never completes.

[ERROR] [07/09/2013 14:00:08.180] [GatlingSystem-akka.actor.default-dispatcher-14] [akka://GatlingSystem/user/$V/$f] Actor AsyncHandlerActor doesn't support message ReceiveTimeout
java.lang.IllegalArgumentException: Actor AsyncHandlerActor doesn't support message ReceiveTimeout
at io.gatling.core.action.BaseActor.unhandled(Actions.scala:30)
at akka.actor.ActorCell$$anonfun$receiveMessage$1.apply(ActorCell.scala:425)
at akka.actor.ActorCell$$anonfun$receiveMessage$1.apply(ActorCell.scala:425)
at io.gatling.http.ahc.AsyncHandlerActor$$anonfun$receive$1.applyOrElse(AsyncHandlerActor.scala:54)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:425)

From the source of AsyncHandlerActor (note that neither AsyncHandlerActor nor BaseActor handles timeout messages):

class AsyncHandlerActor extends BaseActor {

  override def preStart {
    context.setReceiveTimeout(AsyncHandlerActor.timeout)
  }

  def receive = {
    case OnCompleted(task, response)               => processResponse(task, response)
    case OnThrowable(task, response, errorMessage) => ko(task, task.session, response, errorMessage)
  }
}
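
Just to illustrate what I would have expected (a sketch only, not a proposed patch, and I'm only guessing at what the timeout case should actually do), something like an explicit case inside AsyncHandlerActor's receive:

import akka.actor.ReceiveTimeout

// sketch: handle the timeout registered in preStart instead of letting it
// fall through to unhandled()
def receive = {
  case OnCompleted(task, response)               => processResponse(task, response)
  case OnThrowable(task, response, errorMessage) => ko(task, task.session, response, errorMessage)
  case ReceiveTimeout                            => // e.g. fail the pending request so the user isn't lost
}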

Thank you and best regards,
Frode

Hi Frode,

There are actually 2 (major) bugs:

Do you think you could try those fixes (either you build it yourself, or I upload a package)?

I’ll talk with the other committers, but I think we’ll release an M3a or something…

Cheers,

Stéphane

Thank you Stéphane!

I built it from master and it works perfectly. Timeouts are reported correctly as:
java.util.concurrent.TimeoutException: Request reached time out of 120000 ms after 120724 ms

And the non-termination issue is gone.

Great job! :slight_smile:

Regards
Frode

Glad to hear, and thanks a lot for your feedback!
We’ll release an M3a tomorrow, as those bugs are really blocking ones.

Cheers,

Stéphane

Hi there,

Even after upgrading to M3a, some of our simulations are not ending. Could it be due to either of these issues:

javax.net.ssl.SSLException: Received fatal alert: handshake_failure
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1639) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1607) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1776) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1080) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:884) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:758) ~[na:1.7.0_04]
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[na:1.7.0_04]
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1225) [netty-3.6.6.Final.jar:na]
Wrapped by: java.net.ConnectException: Received fatal alert: handshake_failure to https://whereever
at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:103) ~[async-http-client-1.7.19.20130706.jar:na]
at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427) [netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:413) [netty-3.6.6.Final.jar:na]
at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:380) [netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.ssl.SslHandler.setHandshakeFailure(SslHandler.java:1417) [netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1293) [netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:913) [netty-3.6.6.Final.jar:na]

11:13:58.048 [WARN ] i.g.h.a.AsyncHandler - Request ‘Whatever’ failed
java.lang.NullPointerException: null
at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.isContentAlwaysEmpty(HttpClientCodec.java:179) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.readHeaders(HttpMessageDecoder.java:514) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpMessageDecoder.decode(HttpMessageDecoder.java:193) ~[netty-3.6.6.Final.jar:na]
at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.decode(HttpClientCodec.java:143) ~[netty-3.6.6.Final.jar:na]

Looks like a nasty one...

Having the simulation not ending means that some users are lost along the workflow and don't reach the end that would make the engine stop as expected. Could you double check that you use M3a (sorry if that's a stupid request)?

Regarding the handshake_failure, I would expect to see it also logged by the AsyncHandler, is that the case?
Regarding the NPE, it's logged by the AsyncHandler and should be handled gracefully.

Do you see those errors in simulation.log too?

I've deployed a custom gatling-http jar into Excilys' repo. Could you try it out, please? The version is 2.0.0-M3temp1 (force it with a dependencyManagement entry, for example).

Unfortunately I’m also still having non-terminating simulations. It appears to be caused by an exception thrown from a bodyString.transform() block. I think you will be able to reproduce it using something like:
bodyString.transform(_ => throw new Exception("hello"))

I just wrapped my code in a Try().toOption so it’s not a big deal, but maybe it should be handled in Gatling as well.
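
For reference, the check now looks roughly like this (a sketch only: the toInt stands in for my real parsing code, and the exact transform signature may differ between milestones):

import io.gatling.core.Predef._
import io.gatling.http.Predef._

import scala.util.Try

// sketch: wrap whatever may throw inside transform in Try(...).toOption,
// so a failure yields None instead of an exception that loses the user
val safeCheck = bodyString
  .transform(body => Try(body.trim.toInt).toOption) // body.trim.toInt stands in for the real parsing
  .saveAs("parsedValue")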

Regards,
Frode

Thanks Frode, I’ll try to reproduce ASAP.
What’s the exception you get exactly?

Oh, I think I misunderstood.
Do you mean that you did something custom with transform that crashed with an Exception, and that caused the message to be lost?

Yes. That is exactly what I’m doing. I’m using spray-json to parse the response into case classes and store them in the session. So if a parsing exception occurs, that user never finishes. Maybe it should be my responsibility to prevent this :slight_smile:

Frode

Looks like a nasty one…

Having the simulation not ending means that some users are lost along the workflow and don’t reach the end that would make the engine stop as expected. Could you double check that you use M3a (sorry if that’s a stupid request)?

Double checked. Using correct version.

Regarding the handshake_failure, I would expect to see it also logged by the AsyncHandler, is that the case?

Yes, seems like it:

11:59:33.524 [WARN ] i.g.h.a.AsyncHandler - Request ‘Whatever’ failed
javax.net.ssl.SSLException: Received fatal alert: handshake_failure
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1639) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1607) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1776) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1080) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:884) ~[na:1.7.0_04]
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:758) ~[na:1.7.0_04]
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624) ~[na:1.7.0_04]
at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1225) [netty-3.6.6.Final.jar:na]
Wrapped by: java.net.ConnectException: Received fatal

Regarding the NPE, it’s logged by the AsyncHandler and should be handled gracefully.

Do you see those errors in simulation.log too?

REQUEST Start dokumentinnsending og last opp dokumenter 47 Last opp dokument 1373450373158 1373450373158 1373450373158 1373450373235 KO Received fatal alert: handshake_failure to https://e34jbsl00680.devillo.no:8443/henvendelse/services/domene.Brukerdialog/OppdatereHenvendelsesBehandlingService_v1

REQUEST Start dokumentinnsending og last opp dokumenter 290 Last opp dokument 1373450538407 1373450538407 1373450538407 1373450538408 KO java.lang.NullPointerException

I’ve deployed a custom gatling-http jar into Excilys’ repo. Could you try it out, please? The version is 2.0.0-M3temp1 (force it with a dependencyManagement entry, for example).

I’m on it. The thing is, it’s an intermittent error, so it might take some time to reproduce. I’ll keep you posted.

Reproduced with the M3temp1 dependency. The Maven plugin reports 1 running user and the simulation never finishes.

The only “fancy” things we’re doing are:

  • checks with regexes
  • injecting a custom SAML assertion into the session like so:

val injectSamlAssertion = (session: Session) => {
  session("id").validate[String].map(id =>
    session.setAll(Map("id" → id, "saml" → saml.generateSignedAssertion(id)))
  )
}

exec(injectSamlAssertion)

Stefan, this is a long shot, but I have experienced the same NPE and we are probably working in similar environments :wink:
Are those multipart uploads?
Is BigIp involved?

Does it help if you set .connection("close") in the httpProtocolConfig?

I have wasted two days trying to figure out some weird issue with multipart, BigIp and Connection: keep-alive. It seems that BigIp sends additional response data on the same connection after the regular response. This causes the next request to fail. And since the buffer then contains response data without any HTTP version and status code (just HTML), I get the NPE that you mention. Either a pause or Connection: close seems to work around it.
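
In protocol-config terms, the workaround is roughly this (a sketch; the baseURL is just a placeholder, and I may be misremembering the exact M3 builder names):

import io.gatling.core.Predef._
import io.gatling.http.Predef._

// sketch: force Connection: close so every request gets a fresh connection,
// instead of reusing a keep-alive connection that BigIp has polluted with
// leftover response data (the baseURL below is a placeholder)
val httpProtocolConfig = http
  .baseURL("https://whereever")
  .connection("close")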

Regards,
Frode

I’ve pushed:

@Frode: regarding your BigIp problem, did you check this with something like Charles or Wireshark? I’d hate this to be a kind of buffer overflow in Gatling/AHC/Netty…

Yes, multipart uploads and loads of them, but currently no BigIp.
Since we’re using a crappy app server and are terminating SSL in that same crappy app server, I don’t want to close the connections but rather shareConnection() in order to reuse them (the SSL handshake is quite heavy and kills our CPU).
We saw some HTTP 505s earlier, but then we introduced some pauses to get rid of the problem. We still pause after every multipart request, actually.
Could it be that AHC is not flushing the buffer or something? Could it be that the server mixes up the requests? Request boundaries not OK or something?

@Stefan, with M3temp1, any message in the logs that would look like “AsyncHandlerActor crashed on message”?

I’ve pushed:

Thank you!

@Frode: regarding your BigIp problem, did you check this with something like Charles or Wireshark? I’d hate this to be a kind of buffer overflow in Gatling/AHC/Netty…

I’m signing off now, but I’ll describe it in detail tomorrow morning!

Have a nice day!
Frode

Thanks a lot for your help!

No, I only see stuff like

15:01:07.498 [WARN ] i.g.h.a.AsyncHandler - Request ‘Opprett dokumentbehandling’ failed for user 122
java.lang.NullPointerException: null
at org.jboss.netty.handler.codec.http.HttpClientCodec$Decoder.isContentAlwaysEmpty(HttpClientCodec.java:179) ~[netty-3.6.6.Final.jar:na]

Can’t find any references to “AsyncHandlerActor” in my logs.

Cheers

Stefan
