⚡ Bolt: Optimize RequestMetrics.to_dict serialization performance #7058
Optimized `RequestMetrics.to_dict()` by removing the reliance on `dataclasses.asdict`, which uses `deepcopy` heavily and becomes a bottleneck in the API server due to frequent serialization. Also added a lightweight `to_dict` method to `SpeculateMetrics` to prevent nested `asdict` fallback overhead. Co-authored-by: ZeyuChen <1371212+ZeyuChen@users.noreply.github.com>
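A rough way to see the overhead described above is to time `dataclasses.asdict()` against a hand-written field loop. This is a minimal sketch, not the PR's code: `DemoMetrics` and its fields are hypothetical stand-ins for `RequestMetrics`, and the timings depend on the Python version and machine.

```python
import timeit
from dataclasses import asdict, dataclass, field


@dataclass
class DemoMetrics:  # hypothetical stand-in, not the real RequestMetrics
    arrival_time: float = 0.0
    first_token_time: float = 0.0
    token_times: list = field(default_factory=lambda: [0.1] * 64)

    def to_dict(self):
        # Shallow, field-by-field serialization: lists are copied one level
        # deep, everything else is passed through as-is.
        out = {}
        for name in self.__dataclass_fields__:
            value = getattr(self, name)
            out[name] = list(value) if type(value) is list else value
        return out


m = DemoMetrics()
# asdict() recurses into every container and copies leaf values.
slow = timeit.timeit(lambda: asdict(m), number=2000)
fast = timeit.timeit(m.to_dict, number=2000)
print(f"asdict: {slow:.4f}s  to_dict: {fast:.4f}s")
```

Both calls produce equal dictionaries for this kind of flat, primitive-only data; only the amount of copying differs.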
Pull request overview
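This PR targets the serialization overhead of metrics objects in the inference and streaming-output path. The goal is to cut the recursion and deep-copy cost of `dataclasses.asdict()`, improving CPU performance and latency in high-throughput scenarios.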
Changes:
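- Changes `RequestMetrics.to_dict()` in `fastdeploy/engine/request.py` from `asdict()` to a hand-written, shallow serialization that iterates over the fields.
- Adds a lightweight `to_dict()` to `SpeculateMetrics` in `fastdeploy/worker/output.py`, so that serializing the nested object does not fall back to `asdict()`.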
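The nested case in the second change can be sketched as follows. The class names and fields here (`SpecStats`, `OuterMetrics`, `accepted_tokens`, `rejected_tokens`) are hypothetical and only illustrate the delegation pattern; they are not the actual `SpeculateMetrics` definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecStats:  # hypothetical stand-in for SpeculateMetrics
    accepted_tokens: int = 0
    rejected_tokens: int = 0

    def to_dict(self):
        # Flat object: build the dict directly, no recursion needed.
        return {
            "accepted_tokens": self.accepted_tokens,
            "rejected_tokens": self.rejected_tokens,
        }


@dataclass
class OuterMetrics:  # hypothetical stand-in for RequestMetrics
    latency: float = 0.0
    spec: Optional[SpecStats] = None

    def to_dict(self):
        out = {}
        for name in self.__dataclass_fields__:
            value = getattr(self, name)
            # Delegate to the nested object's own to_dict() when it has one,
            # so the outer loop never needs asdict() for it.
            out[name] = value.to_dict() if hasattr(value, "to_dict") else value
        return out
```

Because the nested object serializes itself, the outer method only needs an `asdict()` fallback for dataclasses that do not define `to_dict()`.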
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| fastdeploy/engine/request.py | Optimizes the serialization in `RequestMetrics.to_dict()` to reduce hot-path CPU overhead |
| fastdeploy/worker/output.py | Adds `to_dict()` to `SpeculateMetrics` to support lighter-weight nested serialization |
    import dataclasses
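`RequestMetrics.to_dict()` is a hot-path method. Placing `import dataclasses` inside the function body adds a module lookup on every call (the import is cached, but the lookup still has a cost). Consider moving `import dataclasses` (or `from dataclasses import is_dataclass, asdict`) to module level and referencing the names directly inside the function, to get the full benefit of this optimization.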
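The difference the comment describes can be measured with a small sketch like the one below. The two helper functions are hypothetical and exist only for the comparison. The gap is usually small per call, so whether it matters depends on how often the method runs.

```python
import timeit
from dataclasses import is_dataclass  # resolved once, when this module loads


def check_with_local_import(obj):
    # The import statement runs on every call: a sys.modules lookup
    # plus a local name binding, even though the module is cached.
    import dataclasses

    return dataclasses.is_dataclass(obj)


def check_with_module_import(obj):
    # Uses the name that was bound at module import time.
    return is_dataclass(obj)


local = timeit.timeit(lambda: check_with_local_import(1), number=100_000)
module = timeit.timeit(lambda: check_with_module_import(1), number=100_000)
print(f"function-level import: {local:.4f}s  module-level import: {module:.4f}s")
```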
    d = {}
    for k in self.__dataclass_fields__:
        v = getattr(self, k)
        if type(v) in (int, float, str, bool, type(None)):
            d[k] = v
        elif hasattr(v, "to_dict"):
            d[k] = v.to_dict()
        elif type(v) in (list, dict):
            # RequestMetrics contains only primitive collections
            d[k] = type(v)(v)
        else:
            d[k] = dataclasses.asdict(v) if dataclasses.is_dataclass(v) else v
    return d
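The optimized `RequestMetrics.to_dict()` only takes effect when callers use it explicitly. In the same file, `Request.to_dict()` still calls `data.update(asdict(self.metrics))`, which goes through the `dataclasses.asdict()` deep-copy path. That bypasses this optimization and keeps the serialization overhead high. Consider changing that call to `self.metrics.to_dict()` (with handling for `None`), or at least avoid using `asdict()` on `RequestMetrics`.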
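One possible shape for that suggestion is sketched below. `Metrics` and `Req` are simplified, hypothetical stand-ins for `RequestMetrics` and `Request`, and the real `Request.to_dict()` contains more fields than shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metrics:  # hypothetical stand-in for RequestMetrics
    arrival_time: float = 0.0

    def to_dict(self):
        return {"arrival_time": self.arrival_time}


class Req:  # hypothetical stand-in for Request
    def __init__(self, request_id: str, metrics: Optional[Metrics] = None):
        self.request_id = request_id
        self.metrics = metrics

    def to_dict(self):
        data = {"request_id": self.request_id}
        # Use the lightweight to_dict() instead of asdict(), and skip the
        # merge entirely when no metrics object is attached.
        if self.metrics is not None:
            data.update(self.metrics.to_dict())
        return data
```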
    import dataclasses

    d = {}
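The PR title does not yet include at least one tag (such as `[Optimization]`), as the repository template requires. Consider renaming it to something like `[Optimization] Optimize RequestMetrics.to_dict serialization performance` (the emoji and quotes can be dropped). This matches the convention in `.github/pull_request_template.md` and avoids failures in later workflow checks.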
Thanks for your contribution!
Motivation
`RequestMetrics` is heavily used during the inference loop and API streaming process. Its default `to_dict()` relied on `dataclasses.asdict()`, which incurs significant overhead from internal recursion and deep copies. For high-throughput endpoints, serializing thousands of objects quickly becomes a CPU bottleneck and increases latency.
Modifications
- Replaced `dataclasses.asdict()` in `fastdeploy/engine/request.py`'s `RequestMetrics.to_dict()` with a direct, optimized field-iteration approach.
- Added a lightweight `to_dict()` method to `SpeculateMetrics` in `fastdeploy/worker/output.py` to prevent nested fallback overhead.
Usage or Command
No usage changes. Internal optimization only.
Accuracy Tests
Ran existing request engine tests:
`PYTHONPATH=. pytest tests/engine/test_request.py`
All 30 unit tests pass. Data integrity holds correctly.
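A parity check of the kind that would back the data-integrity claim could look like the sketch below. It is not taken from `tests/engine/test_request.py`. `SampleMetrics` and its fields are hypothetical, and it assumes, as the code comment in the diff does, that the collections hold only primitives.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SampleMetrics:  # hypothetical stand-in for RequestMetrics
    preprocess_time: float = 0.0
    tags: list = field(default_factory=list)
    extra: dict = field(default_factory=dict)

    def to_dict(self):
        out = {}
        for name in self.__dataclass_fields__:
            value = getattr(self, name)
            # Shallow-copy lists and dicts; pass other values through.
            out[name] = type(value)(value) if type(value) in (list, dict) else value
        return out


def test_to_dict_matches_asdict():
    m = SampleMetrics(1.25, ["a", "b"], {"k": 1})
    result = m.to_dict()
    # Same content as the asdict() output for primitive-only collections.
    assert result == asdict(m)
    # Mutating the serialized copy must not leak back into the object.
    result["tags"].append("c")
    assert m.tags == ["a", "b"]


test_to_dict_matches_asdict()
```

If a list ever held nested dataclasses, the shallow copy would leave them unconverted while `asdict()` would recurse into them, so the parity assertion would flag that divergence.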
Checklist
- `black`, `isort`, `flake8` passed.
PR created automatically by Jules for task 12894416224947096342 started by @ZeyuChen