fix: handle non-numeric sort_docs_by in MetaFieldGroupingRanker #11726
Open
mustafakhokhar wants to merge 3 commits into
MetaFieldGroupingRanker.run() raised "TypeError: '<' not supported between
instances of 'float' and 'str'" when sort_docs_by pointed at a non-numeric
metadata field (e.g. ISO date strings) and some documents were missing that
field, because float("inf") was used as the missing-value sentinel.
Resolve the sort field once and use a type-safe (value is None, value) sort
key so documents missing the field (or with a None value) are placed last,
regardless of the field's type.
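
The difference between the two sort keys can be illustrated with a minimal, self-contained sketch. It does not use the Haystack API: the documents are hypothetical plain dicts with a "meta" mapping, and the names old_key, safe_key, and the "date" field are assumptions made for illustration only.

```python
# Hypothetical stand-ins for documents: plain dicts, not Haystack Document objects.
docs = [
    {"id": "a", "meta": {"date": "2024-03-01"}},
    {"id": "b", "meta": {}},                  # sort field missing
    {"id": "c", "meta": {"date": "2023-11-15"}},
    {"id": "d", "meta": {"date": None}},      # sort field present but None
]


def old_key(doc, field="date"):
    # Sentinel approach: float("inf") only compares cleanly against numbers.
    return doc["meta"].get(field, float("inf"))


def safe_key(doc, field="date"):
    # Tuple approach: the boolean is compared first, so a None value is never
    # compared against a real value with "<".
    value = doc["meta"].get(field)
    return (value is None, value)


try:
    sorted(docs, key=old_key)
    old_key_failed = False
except TypeError:
    # Strings and float("inf") (or None) cannot be ordered against each other.
    old_key_failed = True

safe_order = [doc["id"] for doc in sorted(docs, key=safe_key)]
print(old_key_failed)  # True
print(safe_order)      # ['c', 'a', 'b', 'd']
```

Because False sorts before True, every document with a real value comes before every document without one, and the second tuple element is only compared between documents that both have a value.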
@mustafakhokhar is attempting to deploy a commit to the deepset Team on Vercel. A member of the Team first needs to authorize it.
Related Issues
Proposed Changes:
MetaFieldGroupingRanker.run() raised TypeError: '<' not supported between instances of 'float' and 'str' when sort_docs_by pointed at a non-numeric metadata field (e.g. ISO date strings) and some documents were missing that field. The sort used float("inf") as the missing-value sentinel, which cannot be compared against non-numeric values.

This resolves the sort field once and uses a type-safe (value is None, value) sort key, so documents missing the field (or whose value is None) are placed at the end of their group regardless of the field's type. Behaviour for numeric fields is unchanged.

Repro (raises on main):

How did you test it?
Added two unit tests in test/components/rankers/test_meta_field_grouping_ranker.py: ... None ...

All existing ranker unit tests pass. hatch run test:types (mypy) and hatch run fmt are clean, and the pre-commit hooks pass.

Notes for the reviewer
Documents missing the field (or with a None value) are placed last, matching the previous numeric behaviour and the class's "documents without a group are placed at the end" convention. Sort stability is preserved.

A field whose values mix types (e.g. int and str) remains inherently unsortable; that limitation is unchanged.

Checklist
... fix:).

Implemented with the help of an AI assistant; I have reviewed all changes and run the relevant tests and quality checks.
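
The three properties listed in the notes for the reviewer (missing values last, stable order, mixed types still unsortable) can be checked with a small sketch in plain Python. This is not the PR's test code: sort_key is a hypothetical helper that mirrors the (value is None, value) key described above.

```python
def sort_key(value):
    # Hypothetical helper mirroring the (value is None, value) key.
    return (value is None, value)


# Numeric values: None entries move to the end, real values sort ascending.
numeric_sorted = sorted([3, None, 1, None, 2], key=sort_key)
print(numeric_sorted)  # [1, 2, 3, None, None]

# Stability: records with equal keys keep their input order.
records = [("x", 1), ("y", None), ("z", 1), ("w", None)]
stable_sorted = sorted(records, key=lambda record: sort_key(record[1]))
print(stable_sorted)  # [('x', 1), ('z', 1), ('y', None), ('w', None)]

# Mixed types: the key does not help, because 1 and "a" still meet in a "<".
try:
    sorted([1, "a", None], key=sort_key)
    mixed_types_failed = False
except TypeError:
    mixed_types_failed = True
print(mixed_types_failed)  # True
```

Two keys of the form (True, None) compare as equal without ever evaluating None < None, which is why several documents with a missing value can coexist in one sort.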