Feeder causes out of memory

Gatling Version used: 2.0.0-M3a

I am using a feeder to run a scenario as follows:

val csvData = csv("input-test.csv").queue

val scn = scenario("Events Test")
  .repeat(10) {
    feed(csvData)
      .exec(mine)
  }

setUp(scn.inject(
  ramp(10 users) over (5 seconds),
  constantRate(300 usersPerSec) during (1 minute)
))

If you need to see the full code, it is similar to the example here:
https://github.com/softmentor/gatling-examples/blob/master/gatling-custom-protocol-demo/src/test/scala/custom/protocol/test/TestCustomProtocolSimulation.scala

Now when input-test.csv grows (in number of lines), I get an out of memory error. I guess it is loading all of the lines of the CSV file into memory.
Is there a way to chunk the input so that for each execution I can take the next set of input values from the file?
In this particular case, each execution needs 10 lines for the 10 ramp-up users plus 300 usersPerSec * 60 s = 18,000 lines, hence 18,010 lines in input-test.csv per execution.
Since I repeat it 10 times, I need 18,010 * 10 = 180,100 lines in the input file.

Is there a way to read only 18,010 lines per run rather than all 180,100 lines at once, which causes the out of memory error?

The objective is to load data from the feeder incrementally rather than all at once, to avoid memory issues. I want to use the realistic data in the input file while creating a constant load of 300-500 usersPerSec.

Now when input-test.csv grows (in number of lines), I get an out of memory error.

How much data are you loading so that it doesn't fit in memory? What's the file size? How much heap do you have?
Are you sure the memory consumption comes from the feeder? Does the OOM occur when the simulation is loaded, and not during the run?

Note that the current snapshot doesn't use an array but a Vector, so it doesn't use contiguous memory.

I guess it is loading all of the lines of the CSV file into memory.

Absolutely. The goal is to read from memory during the run and not hit the file system.
Otherwise, the circular strategy would be a bit more tricky, and random would simply be impossible.
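
For reference, this is what the built-in strategies look like on the in-memory records (going from memory of the 2.0.0-M3a DSL, so double-check against your version):

val queued  = csv("input-test.csv").queue     // each record served once, in order
val looped  = csv("input-test.csv").circular  // starts over when the records are exhausted
val sampled = csv("input-test.csv").random    // picks a random record on every feed

Both circular and random need the whole record set at hand, which is why the file is parsed up front.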

Is there a way to chunk the input so that for each execution I can take the next set of input values from the file?

In this particular case, each execution needs 10 lines for the 10 ramp-up users plus 300 usersPerSec * 60 s = 18,000 lines, hence 18,010 lines in input-test.csv per execution. Since I repeat it 10 times, I need 18,010 * 10 = 180,100 lines in the input file.

Is there a way to read only 18,010 lines per run rather than all 180,100 lines at once, which causes the out of memory error?

The objective is to load data from the feeder incrementally rather than all at once, to avoid memory issues. I want to use the realistic data in the input file while creating a constant load of 300-500 usersPerSec.

In master, SeparatedValuesParser returns an Iterator, so you can use it as is.
In 2.0.0-M3a, it returns an Array, so you'll have to fork it.

Then, you'll have to build your own Feeder (= Iterator) on top of it that
buffers the reads.

Still, you might experience chokes when loading a new buffer.
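
For what it's worth, here's a rough sketch of the kind of buffering feeder I mean. It bypasses SeparatedValuesParser entirely and streams the file with scala.io.Source, so it should also work on 2.0.0-M3a without forking anything. The CSV handling is deliberately naive (plain comma split, no quoted fields), and the file name is just the one from your example:

import scala.io.Source

// A Feeder is just an Iterator[Map[String, String]], so we can roll our own
// that streams the file line by line instead of loading everything up front.
def streamingCsvFeeder(path: String): Iterator[Map[String, String]] = {
  val lines = Source.fromFile(path).getLines()      // lazy iterator over the lines
  val header = lines.next().split(",").map(_.trim)  // first line holds the column names

  lines.map { line =>
    val values = line.split(",").map(_.trim)
    (header zip values).toMap                       // one record per feed
  }
}

// Plugged into the scenario in place of csv("input-test.csv").queue:
// val csvData = streamingCsvFeeder("input-test.csv")
// val scn = scenario("Events Test")
//   .repeat(10) {
//     feed(csvData)
//       .exec(mine)
//   }

Since the underlying iterator only advances as users are fed, memory stays bounded, but you pay the file system access during the run, which is exactly the choking risk mentioned above.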