All the benchmarks use either vLLM or SGLang, but not both. I assume this is to avoid Twitter fights. I think a lot of people at labs, like me, are on a fork of one and spend some energy trying to translate results from one to the other, often poorly. I think it would be great for the community if you published both numbers (or at least encouraged it). Clearly this effort has sparked AMD to do a lot more useful shit, so I think a vLLM vs. SGLang competition would probably also be good for the community. I would also not allow disable_prefix_cache=True type things, but I feel less strongly about that. Judging from past vLLM and SGLang Twitter wars, the fights are not that bad (if anything, they bring more publicity), and they could actually be resolved more gracefully on this platform.