feat(firestore): support array functions #16128

Draft
Linchin wants to merge 10 commits into googleapis:main from Linchin:fs-array

Conversation

@Linchin Linchin commented Mar 18, 2026

Supports the following:

  • array_first
  • array_last
  • array_first_n
  • array_last_n
  • array_maximum
  • array_maximum_n
  • array_minimum
  • array_minimum_n
  • array_slice
  • array_index_of
  • array_index_of_all
  • array_transform (pending the reference implementation: the docstring, and whether to split it into separate methods, still need to be checked)
  • array_filter (pending the reference implementation: the docstring, and whether to split it into separate methods, still need to be checked)
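As a rough illustration of what some of the listed functions compute, here is a plain-Python model. This is only a sketch: the function names mirror the list above, but the edge-case behavior (empty arrays returning `None`, `-1` for a missing element, non-negative offsets) is assumed for illustration and is not taken from the PR or the backend.

```python
from typing import Any, List, Optional


def array_first(values: List[Any]) -> Optional[Any]:
    # Assumption: an empty array yields None.
    return values[0] if values else None


def array_last(values: List[Any]) -> Optional[Any]:
    # Assumption: an empty array yields None.
    return values[-1] if values else None


def array_first_n(values: List[Any], n: int) -> List[Any]:
    # The first n elements (fewer if the array is shorter).
    return values[:n]


def array_last_n(values: List[Any], n: int) -> List[Any]:
    # The last n elements; guard n == 0 because values[-0:] is the whole list.
    return values[-n:] if n > 0 else []


def array_slice(values: List[Any], offset: int, length: Optional[int] = None) -> List[Any]:
    # Assumption: offset is non-negative; without a length, slice to the end.
    end = None if length is None else offset + length
    return values[offset:end]


def array_index_of(values: List[Any], target: Any) -> int:
    # Assumption: -1 signals "not found".
    for index, value in enumerate(values):
        if value == target:
            return index
    return -1


def array_index_of_all(values: List[Any], target: Any) -> List[int]:
    # Every index at which target occurs, in ascending order.
    return [index for index, value in enumerate(values) if value == target]
```

The real pipeline expressions are evaluated by the Firestore backend, so this model is only useful for reasoning about expected results in tests.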

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Firestore client library by integrating new array manipulation functions directly into pipeline expressions. This provides developers with more powerful and convenient ways to query and transform data within array fields, improving data processing capabilities without requiring client-side array handling.

Highlights

  • New Array Functions: Introduced array_first, array_last, array_first_n, and array_last_n to the Firestore pipeline expressions, allowing users to extract specific elements or sub-arrays from array fields.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for array_first, array_last, array_first_n, and array_last_n functions in Firestore pipelines. The implementation and system tests look good. However, I found issues in the unit tests for array_first_n and array_last_n where the parameter order is asserted incorrectly, leading to internally inconsistent tests. I've provided suggestions to fix these tests.

Linchin commented Mar 21, 2026

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for a number of new array functions in Firestore pipelines. The implementation looks mostly good, but there are a few areas for improvement.

There are some inconsistencies in the naming of the generated functions (e.g., maximum vs array_maximum) which should be standardized. The e2e tests for these functions are also missing assert_proto blocks to verify the generated output.

Additionally, there are some minor issues with docstrings and redundant code that can be cleaned up. The signature for array_transform is also a bit confusing and could be made clearer.

Overall, a solid addition with a few refinements needed.

Comment on lines +1550 to +1560
    def array_maximum(self) -> "Expression":
        """Creates an expression that returns the maximum element of an array.

        Example:
            >>> # Select the maximum element of array 'scores'
            >>> Field.of("scores").array_maximum()

        Returns:
            A new `Expression` representing the maximum element of the array.
        """
        return FunctionExpression("maximum", [self])

high

The function name used for array_maximum is "maximum". This is inconsistent with other new array functions such as array_first, which uses "array_first". For consistency, this should probably be "array_maximum".

This same issue applies to:

  • array_minimum (uses "minimum")
  • array_maximum_n (uses "maximum_n")
  • array_minimum_n (uses "minimum_n")

Please update them all to use the array_ prefix for consistency and to prevent potential backend errors.

        return FunctionExpression("array_maximum", [self])
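One way to surface this kind of mismatch automatically is a small check that compares each method name with the function name it emits. The sketch below uses hypothetical stand-ins for `Field` and `FunctionExpression` (they are not the library's classes), and it only detects the inconsistency; which name the backend actually expects still has to be confirmed separately.

```python
class FunctionExpression:
    """Hypothetical stand-in: records the emitted function name and its arguments."""

    def __init__(self, name, params):
        self.name = name
        self.params = params


class Field:
    """Hypothetical stand-in for a field reference with a few array methods."""

    def __init__(self, path):
        self.path = path

    def array_first(self):
        return FunctionExpression("array_first", [self])

    def array_maximum(self):
        # Mirrors the name flagged in the review comment above.
        return FunctionExpression("maximum", [self])

    def array_minimum(self):
        return FunctionExpression("minimum", [self])


def mismatched_names(expr, method_names):
    """Return (method_name, emitted_name) pairs where the two differ."""
    mismatches = []
    for method_name in method_names:
        emitted = getattr(expr, method_name)().name
        if emitted != method_name:
            mismatches.append((method_name, emitted))
    return mismatches


if __name__ == "__main__":
    methods = ["array_first", "array_maximum", "array_minimum"]
    for method, emitted in mismatched_names(Field("scores"), methods):
        print(f"{method} emits {emitted!r}")
```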

name: array_last_n
name: select

- description: testArrayMaximum

high

This test, along with testArrayMinimum, testArrayMaximumN, testArrayMinimumN, testArraySlice1Arg, and testArraySlice2Args, is missing an assert_proto block. These assertions are crucial for verifying that the client is generating the correct protocol buffer messages for the backend. Please add them to ensure the implementation is correct. For example, this is needed to confirm whether the function name should be maximum or array_maximum.
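To illustrate what such an assertion catches, the following sketch compares a generated function call against an expected one, both represented as plain dictionaries. The dictionary shape (`functionValue`, `fieldReferenceValue`) and the helper names are assumptions made for this example, not the test harness's actual format.

```python
def function_value(name, *args):
    """Dict shaped like a serialized function call (shape assumed for illustration)."""
    return {"functionValue": {"name": name, "args": list(args)}}


def field_reference(path):
    """Dict shaped like a serialized field reference (shape assumed for illustration)."""
    return {"fieldReferenceValue": path}


def check_proto(actual, expected):
    """Raise AssertionError with both values when the serialized forms differ."""
    if actual != expected:
        raise AssertionError(
            f"proto mismatch:\n  actual:   {actual}\n  expected: {expected}"
        )


# What the client would emit today versus what a reviewer might expect.
generated = function_value("maximum", field_reference("scores"))
expected = function_value("array_maximum", field_reference("scores"))

if __name__ == "__main__":
    try:
        check_proto(generated, expected)
    except AssertionError as error:
        print(error)
```

Without a check of this kind, a test that only inspects query results would pass or fail depending on the backend, without showing which function name the client sent.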

    def array_slice(
        self, offset: int | "Expression", length: int | "Expression" | None = None
    ) -> "Expression":
        """Ccreates an expression that returns a slice of an array starting from the specified

medium

There is a typo in the docstring. "Ccreates" should be "Creates".

        """Creates an expression that returns a slice of an array starting from the specified

Comment on lines +1684 to +1686
    def array_transform(
        self, element_alias: str, body: "Expression", index_alias: str | None = None
    ) -> "Expression":

medium

The method signature for array_transform is (self, element_alias: str, body: "Expression", index_alias: str | None = None). However, the implementation and the expected backend protocol place the body argument last in the argument list sent to the backend. This discrepancy can be confusing and lead to incorrect usage.

To improve clarity and prevent misuse, consider making index_alias a keyword-only argument. This enforces that it's passed by name and makes the signature less ambiguous.

    def array_transform(
        self, element_alias: str, body: "Expression", *, index_alias: str | None = None
    ) -> "Expression":
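The effect of the keyword-only marker can be seen in a standalone sketch. `build_transform_args` is a hypothetical helper, and the leading `[array_expr, element_alias]` portion of the argument list is assumed; only the "optional index alias, then body last" ordering comes from the snippet under review.

```python
def build_transform_args(array_expr, element_alias, body, *, index_alias=None):
    """Assemble the argument list with body last; index_alias must be passed by name."""
    args = [array_expr, element_alias]  # Assumed leading arguments.
    if index_alias is not None:
        args.append(index_alias)
    args.append(body)  # The body always goes last.
    return args


if __name__ == "__main__":
    print(build_transform_args("scores", "x", "x * 2"))
    print(build_transform_args("scores", "x", "x * i", index_alias="i"))
    try:
        # Passing index_alias positionally is rejected by the signature.
        build_transform_args("scores", "x", "x * i", "i")
    except TypeError as error:
        print(f"TypeError: {error}")
```

Because a fourth positional argument raises `TypeError`, a caller cannot accidentally rely on the Python parameter order matching the order sent to the backend.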

        if index_alias is not None:
            args.append(self._cast_to_expr_or_convert_to_constant(index_alias))

        args.append(self._cast_to_expr_or_convert_to_constant(body))

medium

The body parameter is already typed as Expression. Calling _cast_to_expr_or_convert_to_constant on it is redundant, as unit tests suggest this function is a no-op for Expression instances. You can append body directly to the args list for clarity.

This also applies to array_filter on line 1733.

        args.append(body)
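The reasoning can be checked with a minimal model of the helper. The classes and the function below are hypothetical stand-ins written for this example; the assumed behavior (return `Expression` instances unchanged, wrap everything else in a constant) follows the description in this comment rather than the library source.

```python
class Expression:
    """Hypothetical stand-in for the expression base class."""


class Constant(Expression):
    """Hypothetical stand-in that wraps a plain Python value."""

    def __init__(self, value):
        self.value = value


def cast_to_expr_or_convert_to_constant(value):
    """Assumed behavior: pass Expression instances through, wrap everything else."""
    if isinstance(value, Expression):
        return value
    return Constant(value)


if __name__ == "__main__":
    body = Expression()
    # For an Expression the helper returns the very same object, so calling it adds nothing.
    print(cast_to_expr_or_convert_to_constant(body) is body)
    # For a plain string (such as an alias) the helper does real work.
    print(type(cast_to_expr_or_convert_to_constant("i")).__name__)
```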

Comment on lines +1717 to +1727
        """
        Takes an array, evaluates a boolean expression on each element, and returns a new
        array containing only the elements for which the expression evaluates to True.

        Args:
            element_alias: Element variable name.
            body: Boolean expression applied to each element.

        Returns:
            Expression: The created FunctionExpression AST node.
        """

medium

The docstring for array_filter is inconsistent with the other new functions in this PR.

  • It's missing an Example: section.
  • The Returns: description is not as user-friendly as the others.

Please update it for consistency. For example:

        """
        Takes an array and returns a new array containing only the elements for which
        the given boolean expression evaluates to True.

        Example:
            >>> # Filter for numbers greater than 5
            >>> Field.of("numbers").array_filter("num", Field.of("num").greater_than(5))

        Args:
            element_alias: Element variable name.
            body: Boolean expression applied to each element.

        Returns:
            A new `Expression` representing the filtered array.
        """
