Skip to content

Conversation

@LinoGiger
Copy link
Collaborator

No description provided.

@LinoGiger LinoGiger requested a review from LucStr February 11, 2026 16:34
@claude
Copy link

claude bot commented Feb 11, 2026

Pull Request Review

Summary

This PR adds support for min_response_threshold parameter to the create_ranking_flow method, properly mapping it to the backend's minResponses and maxResponses fields.

✅ Positives

  1. Correct API Mapping: The change correctly uses minResponses and maxResponses instead of the deprecated responsesRequired field (line 38 of create_flow_endpoint_input.py shows responsesRequired is deprecated)

  2. Good Documentation: The docstring clearly explains that min_response_threshold is "The minimum number of responses required for the flow to be considered complete in case of a timeout"

  3. Reasonable Default: The default value of 1 for min_response_threshold is sensible and matches the backend default mentioned in the model

  4. Type Safety: Follows the codebase pattern of using from __future__ import annotations for type hints

🔍 Issues & Recommendations

1. Parameter Naming Inconsistency (Minor)

The parameter names could be more aligned:

  • responses_per_flow_batch (current) → suggests this is the target/expected number
  • min_response_threshold (new) → uses "threshold" terminology

Suggestion: Consider renaming for consistency. Either:

  • max_responses_per_batch and min_responses_per_batch, OR
  • max_response_threshold and min_response_threshold

The current naming mixes metaphors slightly, though it's not a breaking issue.

2. Parameter Validation Missing (Medium)

There's no validation to ensure min_response_threshold <= responses_per_flow_batch. This could lead to confusing runtime errors from the backend.

Suggestion: Add validation:

if min_response_threshold > responses_per_flow_batch:
    raise ValueError(
        f"min_response_threshold ({min_response_threshold}) cannot be greater than "
        f"responses_per_flow_batch ({responses_per_flow_batch})"
    )

3. Missing Test Coverage (High Priority)

No tests were added or modified for this new functionality. The change should be tested for:

  • Default behavior (min=1, max=100)
  • Custom values (min=5, max=50)
  • Edge cases (min=max, validation)

Suggestion: Add tests in the tests directory following the project structure.

4. Backwards Compatibility (Question)

While the new parameter has a default value, existing code will now send both minResponses and maxResponses instead of responsesRequired.

Question: Has this been verified to work correctly with the backend API? The model shows responsesRequired is deprecated, so this should be fine, but worth confirming.

📝 Code Quality

  • ✅ Follows project conventions from CLAUDE.md
  • ✅ Uses proper type annotations
  • ✅ Maintains consistent formatting
  • ⚠️ No type checking was run (per CLAUDE.md: pyright src/rapidata/rapidata_client should be executed)

🎯 Recommendation

Approve with minor changes suggested

The core functionality is solid, but adding parameter validation and test coverage would significantly improve the robustness of this change. The naming inconsistency is minor and can be addressed in a follow-up if desired.


Review generated by Claude Code

@claude
Copy link

claude bot commented Feb 11, 2026

PR Review: Added min/max responses

Summary

This PR updates the create_ranking_flow method to use the new min_response_threshold and max_response_threshold parameters instead of the deprecated responses_per_flow_batch parameter. This aligns with the backend API changes where responsesRequired is deprecated in favor of minResponses and maxResponses.

✅ Positives

  1. Correct API alignment: The changes properly map to the new backend API fields (minResponses and maxResponses).
  2. Better semantics: The new parameter names are more descriptive and clearly indicate their purpose (maximum threshold vs minimum threshold for timeout scenarios).
  3. Good documentation: The docstring clearly explains what each parameter does.
  4. Sensible defaults: Default values of max_response_threshold=100 and min_response_threshold=1 are reasonable.
  5. Type safety: Proper type annotations maintained.

🔍 Considerations & Recommendations

1. Breaking Change (High Priority)

This is a breaking API change that will affect existing users. The old responses_per_flow_batch parameter has been completely removed without deprecation.

Recommendation: Consider one of these approaches:

  • Add a deprecation warning for responses_per_flow_batch parameter before removing it entirely
  • Document this breaking change in a CHANGELOG
  • Consider if this warrants a major version bump (semver)

Example deprecation approach:

def create_ranking_flow(
    self,
    name: str,
    instruction: str,
    max_response_threshold: int = 100,
    min_response_threshold: int = 1,
    responses_per_flow_batch: int | None = None,  # Deprecated
    validation_set_id: str | None = None,
    settings: Sequence[RapidataSetting] | None = None,
) -> RapidataFlow:
    if responses_per_flow_batch is not None:
        import warnings
        warnings.warn(
            "responses_per_flow_batch is deprecated, use max_response_threshold instead",
            DeprecationWarning,
            stacklevel=2
        )
        max_response_threshold = responses_per_flow_batch

2. Documentation Update Required (Medium Priority)

The documentation file docs/flows.md:20 shows an example using create_ranking_flow but doesn't mention these new parameters. Consider adding an example showing the optional parameters:

flow = client.flow.create_ranking_flow(
    name="Image Quality Ranking",
    instruction="Which image looks better?",
    max_response_threshold=100,  # Optional: maximum responses per item
    min_response_threshold=10,   # Optional: minimum responses before timeout
)

3. Parameter Validation (Medium Priority)

Consider adding validation to ensure logical constraints:

  • min_response_threshold should be <= max_response_threshold
  • Both values should be positive integers
if min_response_threshold > max_response_threshold:
    raise ValueError("min_response_threshold must be <= max_response_threshold")
if min_response_threshold < 1:
    raise ValueError("min_response_threshold must be at least 1")

4. Testing (Medium Priority)

No test files were found for flows. Consider adding tests to cover:

  • Basic flow creation with default parameters
  • Flow creation with custom min/max thresholds
  • Parameter validation edge cases

5. Type Checking (Low Priority)

Per CLAUDE.md requirements, run pyright src/rapidata/rapidata_client to ensure no type errors (pyright needs to be installed in the environment).

🐛 Potential Issues

No bugs identified in the implementation itself - the changes are straightforward and correct.

⚡ Performance

No performance concerns - this is a parameter mapping change only.

🔒 Security

No security concerns identified.

📋 Test Coverage

No tests found for the flow functionality. Consider adding comprehensive test coverage.

Verdict

The code changes are technically correct and improve the API semantics. However, this is a breaking change that should be handled carefully with proper deprecation warnings, documentation updates, and version management.

Recommendation: Approve with the suggestion to add deprecation handling for the old parameter name to give users time to migrate.


Review generated by Claude Code

@LinoGiger LinoGiger merged commit c3c6ae6 into main Feb 11, 2026
2 checks passed
@LinoGiger LinoGiger deleted the feat/RAPID-add-min-max-responses branch February 11, 2026 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant