fix: handle non-numeric sort_docs_by in MetaFieldGroupingRanker#11726

Open
mustafakhokhar wants to merge 3 commits into
deepset-ai:mainfrom
mustafakhokhar:fix/meta-field-grouping-ranker-non-numeric-sort

Conversation

@mustafakhokhar

Related Issues

Proposed Changes:

MetaFieldGroupingRanker.run() raised TypeError: '<' not supported between instances of 'float' and 'str' when sort_docs_by pointed at a non-numeric metadata field (e.g. ISO date strings) and some documents were missing that field. The sort used float("inf") as the missing-value sentinel, which cannot be compared against non-numeric values.

This resolves the sort field once and uses a type-safe (value is None, value) sort key, so documents missing the field (or whose value is None) are placed at the end of their group regardless of the field's type. Behaviour for numeric fields is unchanged.
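A minimal sketch of the `(value is None, value)` key described above, using plain dictionaries in place of `Document.meta`. The helper name `missing_last_key` is hypothetical and is not taken from the PR's diff; it only illustrates the idea.

```python
from typing import Any, Dict, Tuple


def missing_last_key(meta: Dict[str, Any], field: str) -> Tuple[bool, Any]:
    """Hypothetical helper: build a sort key that places missing/None values last."""
    value = meta.get(field)  # a missing key and an explicit None both yield None
    # False sorts before True, so entries with a real value come first.
    # The second element is only compared between tuples whose first
    # elements are equal, so a real value is never compared against None.
    return (value is None, value)


metas = [
    {"group": "42", "date": "2023-03-01"},
    {"group": "42"},                    # field missing
    {"group": "42", "date": "2023-01-01"},
    {"group": "42", "date": None},      # field present but None
]

ordered = sorted(metas, key=lambda m: missing_last_key(m, "date"))
dates = [m.get("date") for m in ordered]
print(dates)
```

Two `(True, None)` keys compare as equal without ever calling `<` on `None`, which is why several documents lacking the field can coexist in one group.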

Repro (raises on main):

from haystack.components.rankers import MetaFieldGroupingRanker
from haystack.dataclasses import Document

docs = [
    Document(content="newest", meta={"group": "42", "date": "2023-03-01"}),
    Document(content="missing date", meta={"group": "42"}),
    Document(content="oldest", meta={"group": "42", "date": "2023-01-01"}),
]
MetaFieldGroupingRanker(group_by="group", sort_docs_by="date").run(documents=docs)
# TypeError: '<' not supported between instances of 'float' and 'str'
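The same failure mode can be illustrated without Haystack. This is a sketch that assumes the old key behaved like a `dict.get` call with `float("inf")` as the default; the exact expression used on main is not shown in this PR description.

```python
# Plain dicts stand in for Document.meta; the middle one lacks "date".
metas = [{"date": "2023-03-01"}, {}, {"date": "2023-01-01"}]

try:
    # Assumed shape of the old key: float("inf") as the missing-value sentinel.
    sorted(metas, key=lambda m: m.get("date", float("inf")))
    error = None
except TypeError as exc:
    # A float sentinel cannot be ordered against str values.
    error = str(exc)

# With numeric values the same sentinel works, which is why the bug
# only surfaced for non-numeric fields.
numeric = sorted([{"n": 3}, {}, {"n": 1}], key=lambda m: m.get("n", float("inf")))
print(error)
print(numeric)
```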

How did you test it?

Added two unit tests in test/components/rankers/test_meta_field_grouping_ranker.py:

  • sorting by a non-numeric field with a missing key
  • sorting by a field that is present but set to None

All existing ranker unit tests pass. hatch run test:types (mypy) and hatch run fmt are clean, and the pre-commit hooks pass.

Notes for the reviewer

  • Documents missing the sort field (or with a None value) are placed last, matching the previous numeric behaviour and the class's "documents without a group are placed at the end" convention. Sort stability is preserved.
  • Out of scope: a single group mixing value types for the same key (e.g. int and str) remains inherently unsortable — that limitation is unchanged.
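Both notes can be checked with a small standalone sketch. The `key` function and the `rank` field are hypothetical stand-ins for the ranker's sort key and a metadata field, not code from the PR.

```python
from typing import Any, Dict, Tuple


def key(meta: Dict[str, Any]) -> Tuple[bool, Any]:
    """Hypothetical stand-in for the type-safe sort key."""
    value = meta.get("rank")
    return (value is None, value)


# Stability: entries without a value keep their original relative order ("a" before "c").
docs = [
    {"id": "a"},
    {"id": "b", "rank": 2},
    {"id": "c", "rank": None},
    {"id": "d", "rank": 1},
]
stable_order = [d["id"] for d in sorted(docs, key=key)]

# Limitation: mixing int and str values for the same field still cannot be sorted.
mixed = [{"rank": 1}, {"rank": "a"}, {}]
try:
    sorted(mixed, key=key)
    mixed_failed = False
except TypeError:
    mixed_failed = True

print(stable_order, mixed_failed)
```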

Implemented with the help of an AI assistant; I have reviewed all changes and run the relevant tests and quality checks.

MetaFieldGroupingRanker.run() raised "TypeError: '<' not supported between
instances of 'float' and 'str'" when sort_docs_by pointed at a non-numeric
metadata field (e.g. ISO date strings) and some documents were missing that
field, because float("inf") was used as the missing-value sentinel.

Resolve the sort field once and use a type-safe (value is None, value) sort
key so documents missing the field (or with a None value) are placed last,
regardless of the field's type.
@mustafakhokhar mustafakhokhar requested a review from a team as a code owner June 23, 2026 07:24
@mustafakhokhar mustafakhokhar requested review from anakin87 and removed request for a team June 23, 2026 07:24
@CLAassistant

CLAassistant commented Jun 23, 2026

CLA assistant check
All committers have signed the CLA.

Development

Successfully merging this pull request may close these issues.

Bug: MetaFieldGroupingRanker raises TypeError when sorting by a non-numeric metadata field
