OOM using 3.3.1

Hello,
I use Gatling 3.3.1 to do a perf test with a TSV feeder in batch mode. When I run the test, the memory keeps going up until it reaches 100%.

Scala:

object Recall {
  val format = new SimpleDateFormat("yyyy-MM-dd")
  val rnd = new Random
  var flag = rnd.nextInt(5)

  //val feeder1 = jdbcFeeder("jdbc:mysql://172.28.148.31:3306/test", "root", "", "select url, cookie, body from test.test_data limit 1000000;").circular
  val feeder1 = tsv("/cfs/mnt/recsdet/tm/iperf_test/url/test_data_0.csv").batch(20000).circular

  val recall = feed(feeder1)
    .exec(http("Search")
      .get("${url}&forcebot=1")
      //.body(StringBody("""${body}"""))
      .header("Cookie", "${cookie}")
      .header("Content-Type", "application/x-www-form-urlencoded")
      .check(status.is(200)))
}

val httpProtocol = http
  .baseUrl("http://mjq.jd.local")
  .acceptHeader("application/json;q=0.9,*/*;q=0.8")
  .acceptEncodingHeader("gzip, deflate")
  .acceptLanguageHeader("en-US,en;q=0.5")
  .doNotTrackHeader("1")
  .disableFollowRedirect
  .userAgentHeader("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0")

val recall = scenario("recall").exec(Recall.recall)

setUp(
  recall.inject(constantUsersPerSec(3000) during (300000 seconds))
).protocols(httpProtocol)
}

The JVM configuration is: -XX:+UseG1GC -Xmx8G -Xms8G -XX:MaxDirectMemorySize=8G -XX:ParallelGCThreads=10 -XX:MaxGCPauseMillis=350 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1ReservePercent=15 -XX:ConcGCThreads=8 -XX:G1HeapRegionSize=10m -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false -XX:+OptimizeStringConcat -XX:+ParallelRefProcEnabled

The JDK version is 1.8.0_221.

Is something wrong?

You would have to provide a heap dump.
My best guess is that your system under load can't keep up, and virtual users just pile up in memory.

I tested two scenarios. The first one uses 2000 virtual users: memory usage is only 3% and very stable, CPU ~30%.
The second scenario uses 3k virtual users: the CPU is stable at only 30%+, but the memory keeps growing until 100% and the process is then killed by the OS.

Here is the information.
NMT info (memory usage up to 70%+):

[admin@cfsslave-acf322bf gatling_stt]$ /export/servers/jdk11.0.2/bin/jcmd 85542 VM.native_memory scale=MB
85542:

Native Memory Tracking:

Total: reserved=5756MB, committed=4450MB

  • Java Heap (reserved=4096MB, committed=4096MB)
    (mmap: reserved=4096MB, committed=4096MB)

  • Class (reserved=1067MB, committed=50MB)
    (classes #6712)
    ( instance classes #6308, array classes #404)
    (malloc=1MB #20372)
    (mmap: reserved=1066MB, committed=49MB)
    ( Metadata: )
    ( reserved=42MB, committed=41MB)
    ( used=39MB)
    ( free=2MB)
    ( waste=0MB =0.00%)
    ( Class space:)
    ( reserved=1024MB, committed=8MB)
    ( used=6MB)
    ( free=2MB)
    ( waste=0MB =0.00%)

  • Thread (reserved=77MB, committed=7MB)
    (thread #76)
    (stack: reserved=76MB, committed=6MB)

  • Code (reserved=243MB, committed=24MB)
    (malloc=1MB #6892)
    (mmap: reserved=242MB, committed=23MB)

  • GC (reserved=225MB, committed=225MB)
    (malloc=41MB #34097)
    (mmap: reserved=184MB, committed=184MB)

  • Compiler (reserved=1MB, committed=1MB)
    (malloc=1MB #890)

  • Internal (reserved=2MB, committed=2MB)
    (malloc=2MB #3873)

  • Other (reserved=33MB, committed=33MB)
    (malloc=33MB #82)

  • Symbol (reserved=10MB, committed=10MB)
    (malloc=8MB #72510)
    (arena=2MB #1)

  • Native Memory Tracking (reserved=3MB, committed=3MB)
    (tracking overhead=2MB)

The jmap heap info is attached.
The pmap info is attached.

jmap_heap.txt (1.55 KB)

pmap.txt (35.6 KB)

I did another test. I ran 2 Gatling instances in a Docker container with 2k virtual users. The CPU was at 50%+, memory usage was only 6%+, and it was stable.
I ran 1 Gatling instance in a Docker container with 3k virtual users: the memory keeps growing until 100%.

The jmap output you provided is useless; you have to specify the -dump option: https://docs.oracle.com/javase/7/docs/technotes/tools/share/jmap.html
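For example, something along these lines (the live flag and file name are just placeholders; replace <pid> with the Gatling process id):

jmap -dump:live,format=b,file=heap_dump.hprof <pid>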

Anyway, there's a very good chance your virtual users are piling up in memory because you're hitting a bottleneck: either the target system can't withstand such load, or you're saturating the bandwidth.
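If you want to keep the injector itself from being OOM-killed while you investigate, one option (just a sketch on top of the script above; the duration and thresholds are placeholders) is to cap the run and let assertions fail it early:

setUp(
  recall.inject(constantUsersPerSec(3000) during (300 seconds))
).protocols(httpProtocol)
  .maxDuration(10 minutes)                   // hard stop even if virtual users are still alive
  .assertions(
    global.responseTime.max.lt(5000),        // placeholder: max response time under 5 s
    global.failedRequests.percent.lt(1.0)    // placeholder: less than 1% failed requests
  )

That way the run stops instead of accumulating users until the OS kills the process.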

Here is the dump info:
https://drive.google.com/file/d/1xEP25Y0PHJlo2TDF-rE_0XxNYIjx2PsK/view?usp=sharing

I see the “active” count in the console output is increasing all the time.

Is this the reason that the memory keeps increasing?

Absolutely. Your virtual users are piling up in memory (~1,260,000) because you’re hitting a bottleneck:

  • your injector machine is saturated: 100% CPU or bandwidth
  • your system under load can’t keep up with such load

It is very strange. I ran two Gatling instances in one Docker container with 2k virtual users, and the memory was very stable.
But when I run one Gatling instance in one Docker container with 3k virtual users, the memory keeps growing until 100%.

It’s not “strange”, it’s exactly what I described.

And from what I saw in your heap dump, it’s not 2k virtual users, it’s 2k new virtual users per second.
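To make the difference concrete (a rough sketch of the two injection profiles; the numbers are placeholders):

// open model: 2000 *new* users are started every second, regardless of how many
// are still waiting for a response, so slow responses make users accumulate
recall.inject(constantUsersPerSec(2000) during (10 minutes))

// closed model: at most 2000 users are alive at any time; a new one only starts
// when a previous one finishes, so concurrency (and memory) stays capped
recall.inject(constantConcurrentUsers(2000) during (10 minutes))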

But my system under test received 4K qps.

OK, back to basics!

As per this group’s terms:

Provide a Short, Self Contained, Correct (Compilable), Example (see http://sscce.org/)

I don't get your point.
The system under test works well.
The injector machine is not saturated; the CPU is only at 35%.

I changed my Scala script from "constantUsersPerSec" to "constantConcurrentUsers". With one Gatling instance in a Docker container I can only send 3k qps.
But with two Gatling instances in a Docker container I can send 6k qps in total.
Why can one Gatling instance only send 3k, not 6k?
Is something reaching a bottleneck?

It’s counterproductive to play riddles without being able to reproduce your problem.
And anyway, there are more than 1 million virtual users stuck in the heap dump you provided, and that's something only you can figure out and fix.

Are you in Beijing? I am working in Beijing.

Another question: does the feeder in batch + circular mode have a queue size configuration?

In this mode with a big file (3G+), the stack shows a lot of threads in WAITING status.

“GatlingSystem-akka.actor.default-dispatcher-25” #39 prio=5 os_prio=0 tid=0x0000000004bb4800 nid=0x82b4b waiting on condition [0x00007f3bb7822000]
java.lang.Thread.State: WAITING (parking)

https://github.com/gatling/gatling/issues/3944
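Regarding the batch + circular question above: as far as I know, the only knob is the bufferSize argument to batch itself, which controls how many lines are kept in memory at a time (a sketch; the path and value are placeholders, and I believe the default buffer is 2000 lines):

// batch(n) reads the file in chunks of n lines instead of loading all of it;
// with .circular the file is re-read from the beginning once it is exhausted
val feeder1 = tsv("/path/to/test_data_0.csv")
  .batch(2000)    // the script above uses 20000
  .circular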