[Optimization]Merge Text processor #7030
base: develop
Changes from all commits
32cdc34
a5ab32c
4f9fe3a
ca1f105
b1796e5
59a0e3f
286a025
b21a433
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -82,28 +82,25 @@ def create_processor(self): | |
| except Exception as e: | ||
| logger.info(f"Plugin input processor not available ({e}), using built-in processor") | ||
| if not self.model_config.enable_mm: | ||
| if not ErnieArchitectures.contains_ernie_arch(architecture): | ||
| if not envs.ENABLE_V1_DATA_PROCESSOR: | ||
| from fastdeploy.input.text_processor import DataProcessor | ||
| else: | ||
| from fastdeploy.input.v1.text_processor import DataProcessor | ||
| if not envs.ENABLE_V1_DATA_PROCESSOR: | ||
| from fastdeploy.input.text_processor import TextProcessor | ||
| self.processor = DataProcessor( | ||
| tokenizer_type = "ernie4_5" if ErnieArchitectures.contains_ernie_arch(architecture) else "auto" | ||
| self.processor = TextProcessor( | ||
| model_name_or_path=self.model_name_or_path, | ||
| tokenizer_type=tokenizer_type, | ||
| reasoning_parser_obj=reasoning_parser_obj, | ||
| tool_parser_obj=tool_parser_obj, | ||
| ) | ||
| else: | ||
| if not envs.ENABLE_V1_DATA_PROCESSOR: | ||
| from fastdeploy.input.ernie4_5_processor import ( | ||
| Ernie4_5Processor, | ||
| ) | ||
| if not ErnieArchitectures.contains_ernie_arch(architecture): | ||
| from fastdeploy.input.v1.text_processor import DataProcessor | ||
| else: | ||
| from fastdeploy.input.v1.ernie4_5_processor import ( | ||
| Ernie4_5Processor, | ||
| Ernie4_5Processor as DataProcessor, | ||
| ) | ||
| self.processor = Ernie4_5Processor( | ||
| self.processor = DataProcessor( | ||
| model_name_or_path=self.model_name_or_path, | ||
| reasoning_parser_obj=reasoning_parser_obj, | ||
| tool_parser_obj=tool_parser_obj, | ||
Comment on lines +85 to +103
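This adds a branch that selects `tokenizer_type` (`ernie4_5` vs `auto`) based on the architecture and routes through `TextProcessor`, but the current unit tests only cover the non-Ernie, non-multimodal path. Consider adding a test case for an Ernie architecture (for example, `architecture` set to an ERNIE 4.5-related string) that asserts `tokenizer_type` is set to `ernie4_5` and verifies the arguments `TextProcessor` is constructed with.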
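A rough sketch of what such a test could look like is below. It is not the project's actual test code: `contains_ernie_arch`, `build_text_processor`, and `constructor_kwargs` are hypothetical stand-ins that mirror only the branch shown in the diff, and the architecture strings are illustrative. A real test would presumably patch `fastdeploy.input.text_processor.TextProcessor` and call `create_processor` on the real preprocessor object instead.

```python
from unittest import mock


def contains_ernie_arch(architecture: str) -> bool:
    # Hypothetical stand-in for ErnieArchitectures.contains_ernie_arch;
    # the real matching rules live in FastDeploy and may differ.
    return "ernie" in architecture.lower()


def build_text_processor(processor_cls, architecture, model_name_or_path,
                         reasoning_parser_obj=None, tool_parser_obj=None):
    # Mirrors the new branch in the diff: pick tokenizer_type from the
    # architecture, then construct the processor with keyword arguments.
    tokenizer_type = "ernie4_5" if contains_ernie_arch(architecture) else "auto"
    return processor_cls(
        model_name_or_path=model_name_or_path,
        tokenizer_type=tokenizer_type,
        reasoning_parser_obj=reasoning_parser_obj,
        tool_parser_obj=tool_parser_obj,
    )


def constructor_kwargs(architecture: str) -> dict:
    # A MagicMock takes the place of TextProcessor so the keyword arguments
    # passed to its constructor can be inspected without loading a tokenizer.
    fake_cls = mock.MagicMock(name="TextProcessor")
    build_text_processor(fake_cls, architecture, "dummy/model/path")
    fake_cls.assert_called_once()
    return fake_cls.call_args.kwargs


def test_ernie_arch_uses_ernie4_5_tokenizer():
    kwargs = constructor_kwargs("Ernie4_5ForCausalLM")  # illustrative name
    assert kwargs["tokenizer_type"] == "ernie4_5"
    assert kwargs["model_name_or_path"] == "dummy/model/path"


def test_non_ernie_arch_uses_auto_tokenizer():
    kwargs = constructor_kwargs("Qwen2ForCausalLM")  # illustrative name
    assert kwargs["tokenizer_type"] == "auto"
```

Checking the captured keyword arguments, rather than only the returned object, is what lets the test catch a missing or mis-set `tokenizer_type`.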