Gatling test breaks Tomcat with IOException

Source code: https://github.com/turingg/file-server/tree/gatling-issue

I am running some experiments with servlets and am using Gatling for load testing.

When I use Gatling to upload a file of about 500 MB with more than one concurrent user, most of the requests fail.

For example, for this load test:

import io.gatling.core.Predef._
import io.gatling.http.Predef._

class SyncUploadSimulation extends Simulation {

  val httpConf = http.baseUrl("http://localhost:8080")

  val scn = scenario("Upload files via sync endpoint")
    .exec(
      http("Upload 1")
        .post("/file-server/sync/upload")
        .bodyPart(RawFileBodyPart(
          "document",
          "/path/to/500-mb-file.tar"
        ))
        .check(status.is(200))
        .check(jsonPath("$.status").is("Success"))
    )
    .pause(1)

  setUp(scn.inject(atOnceUsers(2)))
    .protocols(httpConf)
}

Tomcat fails with:

java.io.IOException: org.apache.tomcat.util.http.fileupload.FileUploadBase$IOFileUploadException: Processing of multipart/form-data request failed. java.io.EOFException: Unexpected EOF read on the socket
at org.apache.catalina.connector.Request.parseParts(Request.java:2915)
at org.apache.catalina.connector.Request.getParts(Request.java:2770)
at org.apache.catalina.connector.Request.getPart(Request.java:2939)
at org.apache.catalina.connector.RequestFacade.getPart(RequestFacade.java:1105)
at javax.servlet.http.HttpServletRequestWrapper.getPart(HttpServletRequestWrapper.java:374)
at xyz.behrang.fileserver.sync.SyncUploadServlet.isPartValid(SyncUploadServlet.java:71)
at xyz.behrang.fileserver.sync.SyncUploadServlet.doPostImpl(SyncUploadServlet.java:46)
at xyz.behrang.fileserver.sync.SyncUploadServlet.doPost(SyncUploadServlet.java:30)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:660)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:239)
at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:215)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:526)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:678)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:367)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:860)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1591)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.tomcat.util.http.fileupload.FileUploadBase$IOFileUploadException: Processing of multipart/form-data request failed. java.io.EOFException: Unexpected EOF read on the socket
at org.apache.tomcat.util.http.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:298)
at org.apache.catalina.connector.Request.parseParts(Request.java:2868)
... 35 more
Caused by: org.apache.catalina.connector.ClientAbortException: java.io.EOFException: Unexpected EOF read on the socket
at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:340)
at org.apache.catalina.connector.InputBuffer.checkByteBufferEof(InputBuffer.java:632)
at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:362)
at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:132)
at org.apache.tomcat.util.http.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:977)
at org.apache.tomcat.util.http.fileupload.MultipartStream$ItemInputStream.read(MultipartStream.java:881)
at java.base/java.io.InputStream.read(InputStream.java:205)
at org.apache.tomcat.util.http.fileupload.util.Streams.copy(Streams.java:98)
at org.apache.tomcat.util.http.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:294)
... 36 more
Caused by: java.io.EOFException: Unexpected EOF read on the socket
at org.apache.coyote.http11.Http11InputBuffer.fill(Http11InputBuffer.java:743)
at org.apache.coyote.http11.Http11InputBuffer.access$300(Http11InputBuffer.java:41)
at org.apache.coyote.http11.Http11InputBuffer$SocketInputBuffer.doRead(Http11InputBuffer.java:1070)
at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:102)
at org.apache.coyote.http11.Http11InputBuffer.doRead(Http11InputBuffer.java:246)
at org.apache.coyote.Request.doRead(Request.java:551)
at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:336)
... 44 more

However, a similar test using curl instead of Gatling, with 2 or even 100 concurrent requests, passes without any issues:

#!/bin/bash

set -eu

START=$(date +%s)
echo "Start time: ${START}."

for i in {1..100}
do
  curl --silent --connect-timeout 3000 --max-time 3000 \
    -F document=@"/path/to/500-mb-file.tar" \
    localhost:8080/file-server/sync/upload &
done
wait

END=$(date +%s)
echo "Start time: ${START}."
echo "End time : ${END}."
echo "Total time: ${END} - ${START} = $(expr ${END} - ${START})"
echo "Done."

Which outputs:

Start time: 1575208592.
{"status":"Success"}
...
{"status":"Success"}
Start time: 1575208592.
End time: 1575208688.
Total time: 1575208688 - 1575208592 = 96
Done.

Any ideas what is causing this?

I’ve attached TRACE level logs.

gatling.log (43.8 KB)

Works for me.
Are you sure you don’t hit a timeout?

Also, I’m very suspicious wrt your curl results. Uploading 500 MB on localhost in 96ms, really?

Just FYI I’m using AdoptOpenJDK 11 HotSpot.

And that’s 96 seconds.

Ah, so your problem is just the request timeout: the default is 60s, see gatling.conf.
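
For reference, a sketch of what raising that limit in gatling.conf might look like. The key name gatling.http.ahc.requestTimeout and the millisecond unit are assumptions based on the Gatling 3.3.x defaults file (the post above only says the default is 60s), and the value shown is purely illustrative:

```hocon
gatling {
  http {
    ahc {
      # Assumed key and unit: request timeout in milliseconds (default 60000).
      # 600000 ms = 10 minutes; an illustrative value, not a recommendation.
      requestTimeout = 600000
    }
  }
}
```

Check the key against the gatling.conf shipped with your Gatling version, since the layout of this section differs between releases.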

Two simultaneous uploads take less than 4-5 seconds.

Anyway, I increased the default timeout to an extremely large value and the problem persists.

Here is a simple Java program using Java 11’s built-in HTTP client that uploads the same file concurrently to the target servlet: https://gist.github.com/behrangsa/f645a9a8951936e9223381d5bf34bac5

I tried it with up to 4 threads and it is working without any issues.

Unfortunately, it is not well written: it consumes 500+ MB per thread, so I couldn’t run it with many more threads. But like curl, it doesn’t break Tomcat.

Have you tried removing javamelody?
I’ve had countless issues with it in the past, so I removed it before even testing on my side.

This is most likely a Gatling issue. It doesn’t happen when I make concurrent requests using another tool or using a small Java program.

Also, this issue occurs only when the file to be uploaded is relatively large (~500MB). With a small file (60K), this error doesn’t occur.

I removed JavaMelody and it didn’t eliminate this issue.

Stephane,

Let me know if you want to SSH into my EC2 server. I will keep it around for another 24 hours.

Cheers.

I think I know what happens. Could you please share your file?

I’ve uploaded the file here: https://send.firefox.com/download/0337b7ebea4368eb/#SGDd-laoUqhrl_JegmyU1Q

It turns out there’s an issue with the zero-copy feature, which I was considering dropping anyway…
Until 3.4.0, please disable gatling.http.ahc.enableZeroCopy in gatling.conf.
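
For reference, a minimal sketch of that override in gatling.conf. The nested layout is an assumption derived from the dotted key name above (HTTP client settings under gatling.http.ahc, as in Gatling 3.3.x); only the enableZeroCopy value changes:

```hocon
gatling {
  http {
    ahc {
      # Workaround for pre-3.4.0 versions: turn off zero-copy file transfer,
      # which broke large concurrent multipart uploads in this thread.
      enableZeroCopy = false
    }
  }
}
```

All other settings can stay at their defaults; this only changes how file bodies are written to the socket.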

Thanks for reporting!

No worries. Disabling that fixed the problem.

Best regards,
Behrang Saeedzadeh