fix: keep full filename for BaseFile inputs in normalize_input_files #6177
Open
Hiyaarora wants to merge 1 commit into
The BaseFile branch stripped the file extension from the dict key, unlike every other input branch (and the documented behavior in tests, which keep the full filename). This made naming inconsistent and silently dropped files when two BaseFile inputs shared a base name but had different extensions (e.g. data.json and data.txt both collapsed to data). Keep the full filename so BaseFile matches file-source inputs and same-name files with different extensions no longer collide. Add a regression test.
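The collision described above can be illustrated with a minimal, self-contained sketch. It does not use the project's code: `FakeBaseFile`, `keys_by_stem`, and `keys_by_full_name` are hypothetical names invented for this illustration, and stripping the extension with `PurePath(...).stem` is an assumption about how the old key might have been derived.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass
class FakeBaseFile:
    """Hypothetical stand-in for a BaseFile input; only the filename matters here."""

    filename: str


def keys_by_stem(files):
    """Key each file by its name without the extension (the behavior being fixed)."""
    result = {}
    for f in files:
        # Assumption: the extension is dropped via the path stem.
        # A later file with the same stem overwrites the earlier one.
        result[PurePath(f.filename).stem] = f
    return result


def keys_by_full_name(files):
    """Key each file by its full filename, extension included (the proposed behavior)."""
    result = {}
    for f in files:
        result[f.filename] = f
    return result


inputs = [FakeBaseFile("data.json"), FakeBaseFile("data.txt")]
stem_keys = list(keys_by_stem(inputs))
full_keys = list(keys_by_full_name(inputs))
print(stem_keys)  # ['data']
print(full_keys)  # ['data.json', 'data.txt']
```

In this sketch the stem-keyed dict ends up holding only the `data.txt` entry, since the second assignment replaces the first without any error, which is the kind of silent drop the description refers to.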
Summary
`normalize_input_files` builds a `{name: file}` dict. For most input types the key is the full filename (extension included), which is what the existing tests assert (e.g. `doc1.txt`, `named.txt`). But the `BaseFile` branch strips the extension before using it as the key, which is both inconsistent with every other branch and causes silent data loss: two `BaseFile` inputs that share a base name but differ in extension collapse to one entry, and the first is dropped.

Reproduction
Expected: `['data.json', 'data.txt']` (2 entries)
Actual: `['data']` (1 entry; the `data.json` file is silently dropped)

Fix
Keep the full filename for `BaseFile` inputs (don't strip the extension), matching the behavior of the file-source branch and the existing tests. Same-base-name files with different extensions no longer collide.

Tests
Added `test_normalize_base_file_keeps_full_filename` in `test_files.py`.

Verified locally:
- `uv run pytest lib/crewai/tests/utilities/test_files.py` → 54 passed
- `uv run pytest lib/crewai-files/tests/` (resolution) → passed
- `uv run ruff check` / `ruff format --check` / `uv run mypy` on the changed files → clean

Disclosure
This PR was prepared with AI assistance (Claude Code), reviewed by me. Please apply the `llm-generated` label per CONTRIBUTING.