[DataProcessor] Remove ENABLE_V1_DATA_PROCESSOR #7052

Open
luukunn wants to merge 2 commits into PaddlePaddle:develop from luukunn:remove_v1

Conversation

Collaborator

@luukunn luukunn commented Mar 27, 2026

Motivation

Remove the ENABLE_V1_DATA_PROCESSOR code.

💡 If this PR is a cherry-pick, the PR title must follow the required format: add the [Cherry-Pick] label at the very beginning and append the original PR ID at the end, e.g. [Cherry-Pick][CI] Add check trigger and logic(#5191)


Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code; run pre-commit before committing.
  • Add unit tests. If there are no unit tests, please explain why in this PR.
  • Provide accuracy results.
  • If the current PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Copilot AI review requested due to automatic review settings March 27, 2026 09:19

paddle-bot bot commented Mar 27, 2026

Thanks for your contribution!

Contributor

Copilot AI left a comment


Pull request overview

This PR removes the ENABLE_V1_DATA_PROCESSOR environment switch and its related branch logic, and cleans up the legacy fastdeploy/input/v1 implementation along with its test cases, so that everything goes through the current default data-processing and transport-protocol path.

Changes:

  • Remove the ENABLE_V1_DATA_PROCESSOR environment-variable definition and every conditional branch gated on that switch.
  • Clean up / delete the fastdeploy/input/v1/* processor implementations and the corresponding tests/input/v1/* tests.
  • On the ZMQ side, always serialize and send the to_dict() result (the legacy v1 objects are no longer sent).
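The unified send path described in the last bullet can be sketched as follows. This is a minimal illustration using json and a hypothetical Response dataclass, not FastDeploy's actual API; the real zmq_server.py converges its pickle/msgpack branches on the same to_dict() result.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical response type; field names are illustrative only.
@dataclass
class Response:
    request_id: str
    outputs: list = field(default_factory=list)

    def to_dict(self):
        return asdict(self)

def send_response(send_bytes, resp):
    # Unified path: always serialize the to_dict() result instead of
    # branching on ENABLE_V1_DATA_PROCESSOR and sending a raw v1 object.
    send_bytes(json.dumps(resp.to_dict()).encode("utf-8"))
```

Whatever wire format is chosen (pickle, msgpack, json), the point of the PR is that the payload is always the dict form, so the receiving side needs only one deserialization path.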

Reviewed changes

Copilot reviewed 53 out of 54 changed files in this pull request and generated 2 comments.

File Description
tests/inter_communicator/test_zmq_server.py Remove test coverage of the v1 switch branches; update assertions to match the unified serialization path
tests/input/v1/test_tokenizer_client.py Delete the v1 tokenizer client tests
tests/input/v1/test_text_processor.py Delete the v1 text processor tests
tests/input/v1/test_process_video.py Delete the v1 video-processing tests
tests/input/v1/test_image_preprocessor_adaptive.py Delete the v1 adaptive image-preprocessing tests
tests/input/v1/test_ernie_processor.py Delete the v1 Ernie processor tests
tests/input/v1/test_ernie4_5_processor.py Delete the v1 Ernie4.5 processor tests
tests/input/test_preprocess.py Update the processor-creation tests; drop the patch dependency on the v1 switch
tests/entrypoints/test_serving_completion.py Update the completion serving tests; drop coverage of the v1 switch branches
tests/entrypoints/openai/test_serving_chat.py Update the chat serving tests; drop coverage of the v1 switch branches
tests/engine/test_common_engine.py Update the common_engine tests; drop the v1 switch patch
fastdeploy/inter_communicator/zmq_server.py Unify response serialization on to_dict() (the pickle/msgpack branches converge)
fastdeploy/input/v1/qwen_vl_processor/qwen_vl_processor.py Delete the v1 Qwen2.5-VL processor implementation
fastdeploy/input/v1/qwen_vl_processor/process_video.py Delete the v1 Qwen2.5-VL video-sampling implementation
fastdeploy/input/v1/qwen_vl_processor/process.py Delete the v1 Qwen2.5-VL multimodal-processing implementation
fastdeploy/input/v1/qwen_vl_processor/image_processor.py Delete the v1 Qwen2.5-VL image-processing implementation
fastdeploy/input/v1/qwen_vl_processor/__init__.py Delete the v1 Qwen2.5-VL package exports
fastdeploy/input/v1/qwen3_vl_processor/qwen3_vl_processor.py Delete the v1 Qwen3-VL processor implementation
fastdeploy/input/v1/qwen3_vl_processor/image_processor.py Delete the v1 Qwen3-VL image-processing implementation
fastdeploy/input/v1/qwen3_vl_processor/__init__.py Delete the v1 Qwen3-VL package exports
fastdeploy/input/v1/paddleocr_vl_processor/process_video.py Delete the v1 PaddleOCR-VL video-sampling implementation
fastdeploy/input/v1/paddleocr_vl_processor/process.py Delete the v1 PaddleOCR-VL multimodal-processing implementation
fastdeploy/input/v1/paddleocr_vl_processor/paddleocr_vl_processor.py Delete the v1 PaddleOCR-VL processor implementation
fastdeploy/input/v1/paddleocr_vl_processor/image_processor.py Delete the v1 PaddleOCR-VL image-processing implementation
fastdeploy/input/v1/paddleocr_vl_processor/__init__.py Delete the v1 PaddleOCR-VL package exports
fastdeploy/input/v1/ernie4_5_vl_processor/utils/video_utils.py Delete the v1 ERNIE4.5-VL video utilities
fastdeploy/input/v1/ernie4_5_vl_processor/utils/render_timestamp.py Delete the v1 timestamp-rendering utility
fastdeploy/input/v1/ernie4_5_vl_processor/utils/io_utils.py Delete the v1 IO/download utilities
fastdeploy/input/v1/ernie4_5_vl_processor/utils/__init__.py Delete the v1 utils package file
fastdeploy/input/v1/ernie4_5_vl_processor/utils/Roboto-Regular.ttf Delete the v1 utils font resource (removed along with the package)
fastdeploy/input/v1/ernie4_5_vl_processor/process_video.py Delete the v1 ERNIE4.5-VL video-processing implementation
fastdeploy/input/v1/ernie4_5_vl_processor/image_preprocessor/image_preprocessor_adaptive.py Delete the v1 adaptive image-preprocessing implementation
fastdeploy/input/v1/ernie4_5_vl_processor/image_preprocessor/get_image_preprocessor.py Delete the v1 image-preprocessor factory logic
fastdeploy/input/v1/ernie4_5_vl_processor/image_preprocessor/__init__.py Delete the v1 image-preprocessing package exports
fastdeploy/input/v1/ernie4_5_vl_processor/ernie4_5_vl_processor.py Delete the v1 ERNIE4.5-VL Processor implementation
fastdeploy/input/v1/ernie4_5_vl_processor/__init__.py Delete the v1 ERNIE4.5-VL package exports
fastdeploy/input/v1/__init__.py Delete the v1 input package init file
fastdeploy/input/preprocess.py Remove the v1 switch branch from create_processor; always use the current default processor implementations
fastdeploy/envs.py Remove the ENABLE_V1_DATA_PROCESSOR environment-variable registration
fastdeploy/entrypoints/openai/serving_reward.py Remove the v1 branch from reward serving; handle requests uniformly via the dict path
fastdeploy/entrypoints/openai/serving_embedding.py Remove the v1 branch from embedding serving; handle requests uniformly via the dict path
fastdeploy/entrypoints/openai/serving_completion.py Remove the v1 branch from completion serving (no longer constructs a Request object)
fastdeploy/entrypoints/openai/serving_chat.py Remove the v1 branch from chat serving (no longer constructs a Request object)
fastdeploy/entrypoints/engine_client.py When sending tasks, the engine client no longer uses the v1 switch to choose between send_json/send_pyobj (it branches only on enable_mm)
fastdeploy/engine/common_engine.py The engine no longer depends on the v1 switch when receiving requests or constructing Request objects
fastdeploy/engine/async_llm.py Async LLM request sending and response deserialization no longer depend on the v1 switch
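Per the engine_client.py row above, the task send path now branches only on enable_mm. A minimal sketch of that shape, where the function and parameter names are assumptions rather than the real FastDeploy signatures:

```python
import json
import pickle

def send_task(send_bytes, task, enable_mm):
    # Multimodal tasks may carry binary payloads (images, tensors), so
    # they take the pyobj (pickle) path; plain text tasks are sent as
    # JSON. The old ENABLE_V1_DATA_PROCESSOR check no longer participates.
    if enable_mm:
        send_bytes(pickle.dumps(task))
    else:
        send_bytes(json.dumps(task).encode("utf-8"))
```

With the v1 switch gone, the send format is determined solely by the model's multimodal capability, which is what the review table means by "branches only on enable_mm".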

Comment on lines 94 to 97
"EXPORTER_OTLP_HEADERS": lambda: os.getenv("EXPORTER_OTLP_HEADERS"),
# enable kv cache block scheduler v1 (no need for kv_cache_ratio)
"ENABLE_V1_KVCACHE_SCHEDULER": lambda: int(os.getenv("ENABLE_V1_KVCACHE_SCHEDULER", "1")),
# enable data processor v2
"ENABLE_V1_DATA_PROCESSOR": lambda: int(os.getenv("ENABLE_V1_DATA_PROCESSOR", "0")),
# set prealloc block num for decoder

Copilot AI Mar 27, 2026


The PR title does not yet follow the tag format required by the repository template (it needs at least one tag such as [BugFix]/[DataProcessor]), and the Motivation/Modifications sections of the PR description are empty. Please complete them per .github/pull_request_template.md: add an appropriate tag to the title, and explain in the description why the switch is being removed, whether there are any migration/compatibility impacts, and how the change was verified (e.g. which tests/commands to run).

Comment on lines 82 to 86
logger.info(f"Plugin input processor not available ({e}), using built-in processor")
if not self.model_config.enable_mm:
if not ErnieArchitectures.contains_ernie_arch(architecture):
if not envs.ENABLE_V1_DATA_PROCESSOR:
from fastdeploy.input.text_processor import DataProcessor
else:
from fastdeploy.input.v1.text_processor import DataProcessor
from fastdeploy.input.text_processor import DataProcessor


Copilot AI Mar 27, 2026


This PR deletes the entire fastdeploy/input/v1 package, but the repository still has tests that import these modules (e.g. tests/model_executor/test_thinking_budget.py still imports fastdeploy.input.v1.ernie4_5_vl_processor / fastdeploy.input.v1.text_processor), so those unit tests will fail at import time. Please update or delete these v1 references (switch them to the existing non-v1 processors/test paths), or keep a compatibility layer (e.g. re-exports/aliases at the old paths) so CI can pass.
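The compatibility-layer option suggested here can be as small as registering the old module path as an alias for the current implementation. A rough sketch, where the helper name is hypothetical; in practice FastDeploy could equally keep a stub __init__.py at the old path that just re-exports DataProcessor:

```python
import importlib
import sys

def install_alias(real_module_name, legacy_name):
    # Make the removed legacy module path resolve to the current
    # implementation, so tests still importing the old v1 path keep
    # working during the transition.
    sys.modules[legacy_name] = importlib.import_module(real_module_name)
```

One caveat with dotted legacy names (like fastdeploy.input.v1.text_processor): a plain `import a.b.c` statement also imports the parent packages, so the parents must still exist; a stub file re-exporting the symbol avoids that wrinkle entirely.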

@luukunn luukunn changed the title from remove ENABLE_V1_DATA_PROCESSOR to [DataProcessor]Remove ENABLE_V1_DATA_PROCESSOR Mar 27, 2026

codecov-commenter commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 76.47059% with 8 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@8ff8236). Learn more about missing BASE report.

Files with missing lines Patch % Lines
fastdeploy/input/preprocess.py 50.00% 3 Missing ⚠️
fastdeploy/engine/async_llm.py 50.00% 0 Missing and 1 partial ⚠️
fastdeploy/engine/common_engine.py 50.00% 0 Missing and 1 partial ⚠️
fastdeploy/entrypoints/engine_client.py 50.00% 0 Missing and 1 partial ⚠️
fastdeploy/entrypoints/openai/serving_embedding.py 85.71% 0 Missing and 1 partial ⚠️
fastdeploy/entrypoints/openai/serving_reward.py 87.50% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7052   +/-   ##
==========================================
  Coverage           ?   73.36%           
==========================================
  Files              ?      373           
  Lines              ?    52581           
  Branches           ?     8212           
==========================================
  Hits               ?    38576           
  Misses             ?    11274           
  Partials           ?     2731           
Flag Coverage Δ
GPU 73.36% <76.47%> (?)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
