🚀 The feature, motivation and pitch
Follow up on #11440
Benchmark configuration: BS=512, 1000/1000 (input/output length), B200, TP=8.
Current results relative to vLLM: output TPS is ~0.5x and TTFT is ~0.2x.
Initial target: >0.8x of vLLM.
Next step: dump traces and use them to identify performance optimizations.
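To make the target check explicit, here is a minimal sketch of the ratio arithmetic. The helper names and the throughput numbers are hypothetical placeholders, not measurements from this benchmark; only the ~0.5x ratio and the 0.8 threshold come from the description above.

```python
def relative_to_baseline(measured: float, baseline: float) -> float:
    """Return measured / baseline, e.g. our output TPS divided by the vLLM output TPS."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return measured / baseline


def meets_target(ratio: float, target: float = 0.8) -> bool:
    """True when the ratio is strictly above the initial target (>0.8x)."""
    return ratio > target


# Hypothetical numbers chosen only to reproduce a ~0.5x ratio.
ours_tps = 5000.0
baseline_tps = 10000.0

ratio = relative_to_baseline(ours_tps, baseline_tps)
print(f"output TPS ratio: {ratio:.2f}x, meets target: {meets_target(ratio)}")
```

The same two helpers can be reused for TTFT once it is agreed how the TTFT ratio is defined, since TTFT is a latency and lower is better.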
Alternatives
No response
Additional context
No response