My task is to test the performance of an LLM. Along with the total response time, I need to get the time-to-first-token metric. The call is an HTTPS POST, for example to ‘https://api.openai.com/v1/chat/completions’, with the ‘stream’ option added to the request body (which serves the response back as chat.completion.chunk JSON pieces, one for each token).
Is there any way to achieve this? Thanks.
Thanks for the quick response. However, I need to pass a request body, and it doesn’t look like SseConnectRequestBuilder supports that (I am on version 3.7).
Indeed, I failed to realize that.
The EventSource object only allows this.
Contributions or sponsoring welcome.
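For anyone landing here who just needs the two numbers, here is a minimal sketch outside of Gatling, in plain Python with only the standard library. It assumes the stream arrives as `data: {...}` lines carrying chat.completion.chunk JSON and ends with `data: [DONE]`, and it treats the first chunk with non-empty `delta.content` as the first token. The names `measure_stream` and `timed_chat_request` are made up for this example, and the API key and request body are whatever your endpoint expects.

```python
import json
import time
import urllib.request


def measure_stream(lines, start, clock=time.perf_counter):
    """Return (time_to_first_token, total_time, content_chunks) for an iterable of SSE lines (bytes)."""
    first = None
    chunks = 0
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # blank separator lines, SSE comments, other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        choices = json.loads(payload).get("choices") or []
        if not choices:
            continue  # e.g. a chunk that carries no choices
        delta = choices[0].get("delta") or {}
        if delta.get("content"):
            chunks += 1
            if first is None:
                first = clock() - start  # first non-empty content delta
    return first, clock() - start, chunks


def timed_chat_request(url, api_key, body):
    """POST `body` with stream enabled and time the response (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        data=json.dumps({**body, "stream": True}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    start = time.perf_counter()  # clock starts just before the request is sent
    with urllib.request.urlopen(request) as response:
        # The response object can be iterated line by line as data arrives.
        return measure_stream(response, start)
```

The timing logic is kept separate from the HTTP call so it can be checked against canned lines without hitting the network. Note that the measured time-to-first-token includes connection setup and TLS handshake, since the clock starts before the request is sent.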