Gatling Throttle: How should it work?

I am trying to generate 100 requests per second against our microservice.

To do that, I used Gatling's throttle feature. The code is below:

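setUp(
  serviceScenario.scnGetAPI
    // inject 100 new users per second for 10 minutes
    .inject(
      constantUsersPerSec(100) during (10 minutes)
    )
    // cap throughput: ramp the limit up to 100 req/s over 2 minutes, then hold it for 10 minutes
    .throttle(
      reachRps(100) in (2 minutes),
      holdFor(10 minutes)
    )
    .protocols(serviceScenario.baseProtocol)
)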
}

When I check the Gatling reports, I see two things I don't understand:

  1. The average request rate never reaches 100 requests/sec. It usually stays between 85 and 95 requests/sec. How can I make it hit 100, or is something wrong in the code?
  2. In the ‘Number of Active Users’ graph, the number of users goes up to 5000. Why is that 50 times higher than the constant user rate I injected (100 per second)? How is this number determined?
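
One possible reading of both numbers can be checked with a back-of-the-envelope model. The sketch below is plain Scala, not Gatling code, and it rests on assumptions the reports do not confirm: each user sends exactly one request, response time is negligible, the throttle limit ramps linearly from 0 to the target, and requests over the limit wait in a queue instead of being dropped. The object and method names (`ThrottleBacklogSketch`, `simulate`) are hypothetical.

```scala
// Rough model of the setup above; NOT Gatling code.
// Assumptions: one request per user, negligible response time, a linear
// throttle ramp from 0 to the target, and over-limit requests queue up.
object ThrottleBacklogSketch {
  final case class Result(peakWaiting: Double, drainedAtSec: Int, avgRps: Double)

  def simulate(arrivalsPerSec: Double, injectSec: Int, targetRps: Double, rampSec: Int): Result = {
    var waiting = 0.0 // users whose request is held back by the throttle
    var peak = 0.0
    var sent = 0.0
    var t = 0
    // Step one second at a time until injection is over and the queue is empty.
    while (t < injectSec || waiting > 1e-6) {
      if (t < injectSec) waiting += arrivalsPerSec
      // Allowed rate during second t: midpoint of the linear ramp, then the held target.
      val limit = if (t < rampSec) targetRps * (t + 0.5) / rampSec else targetRps
      val served = math.min(waiting, limit)
      waiting -= served
      sent += served
      peak = math.max(peak, waiting)
      t += 1
    }
    Result(peak, t, sent / t)
  }

  def main(args: Array[String]): Unit = {
    // 100 users/s for 600 s, throttle ramping to 100 req/s over 120 s
    val r = simulate(arrivalsPerSec = 100, injectSec = 600, targetRps = 100, rampSec = 120)
    println(f"peak waiting users: ${r.peakWaiting}%.0f")
    println(f"run ends after:     ${r.drainedAtSec} s")
    println(f"average rate:       ${r.avgRps}%.1f req/s")
  }
}
```

Under these assumptions the 2-minute ramp admits only about half of the 12,000 users injected during it, so a backlog of roughly 6,000 users builds up and persists until injection stops, and the average over the whole run comes out near 91 req/s. That is in the same range as the observed 85 to 95 req/s and 5,000 active users, but it is only a model of the configuration, not a description of how Gatling computes its graphs.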