[FLINK-38825][python] Add Python DataStream API integration for AsyncBatchFunction#27361
Open
featzhang wants to merge 8 commits into apache:master
This was referenced Mar 3, 2026
Open
83e1073 to 090d2ee
added 7 commits, March 4, 2026 07:54
…syncBatchWaitOperator
…erence

This commit introduces SQL/Table API support for batch async lookup joins, enabling AI/ML inference scenarios where batching lookups improves throughput.

Key additions:
- AsyncBatchLookupFunction: Batch-oriented async lookup interface
- AsyncBatchLookupFunctionProvider: Provider with batch configuration
- AsyncBatchLookupJoinRunner: Runtime lookup join runner
- AsyncBatchLookupJoinFunctionAdapter: Adapter to streaming AsyncBatchFunction
- LookupJoinUtil: Batch async lookup detection and options extraction
- FunctionKind.ASYNC_BATCH_TABLE: New function kind enum

The implementation bridges the Table API layer to the existing AsyncBatchWaitOperator runtime, ensuring consistent behavior with size-based and time-based batching, retry, and timeout strategies.
…BatchFunction

This commit introduces Python support for batch-oriented async operations, enabling AI/ML inference scenarios to use batch processing for improved throughput.

Key additions:
- AsyncBatchFunction class for batch async operations
- AsyncDataStream.unordered_wait_batch() method
- AsyncDataStream.ordered_wait_batch() method
- AsyncBatchOperation runtime implementation
- Comprehensive unit tests

The implementation reuses the Java AsyncBatchWaitOperator for all batching and scheduling logic, following existing PyFlink async function patterns.
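The commit messages above mention size-based and time-based batching. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not the AsyncBatchWaitOperator logic: the class `BatchBuffer`, its parameters `max_batch_size` and `max_wait_ms`, and the explicit `now_ms` clock argument are all hypothetical names chosen for this example.

```python
from typing import Any, List, Optional


class BatchBuffer:
    """Hypothetical buffer that flushes on size or on elapsed time.

    Illustrative only; this is not the Flink operator's implementation.
    """

    def __init__(self, max_batch_size: int, max_wait_ms: int) -> None:
        self.max_batch_size = max_batch_size
        self.max_wait_ms = max_wait_ms
        self._items: List[Any] = []
        self._first_arrival_ms: Optional[int] = None

    def add(self, item: Any, now_ms: int) -> Optional[List[Any]]:
        """Buffer one element; return a full batch when the size limit is hit."""
        if not self._items:
            # Remember when the oldest buffered element arrived.
            self._first_arrival_ms = now_ms
        self._items.append(item)
        if len(self._items) >= self.max_batch_size:
            return self._flush()
        return None

    def on_timer(self, now_ms: int) -> Optional[List[Any]]:
        """Return a partial batch once the oldest element has waited long enough."""
        if self._items and now_ms - self._first_arrival_ms >= self.max_wait_ms:
            return self._flush()
        return None

    def _flush(self) -> List[Any]:
        batch, self._items = self._items, []
        self._first_arrival_ms = None
        return batch
```

Passing the clock in as `now_ms` keeps the sketch deterministic; a real operator would rely on its own timer service instead.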
090d2ee to 0275e66
What is the purpose of the change
This PR adds Python DataStream API integration for the existing Java AsyncBatchWaitOperator runtime capability, enabling Python-based AI/ML inference and external service calls to use batch-oriented async execution.
This is a pure integration PR: all batching, scheduling, and async execution logic is reused from the Java side.
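To make "batch-oriented async execution" concrete, the following is a minimal sketch in plain asyncio. It does not use PyFlink and does not reproduce this PR's API: the class `UppercaseBatchFunction`, the method name `async_invoke_batch`, and the driver `run_in_batches` are hypothetical names used only to show one external call serving a whole batch of inputs.

```python
import asyncio
from typing import List


class UppercaseBatchFunction:
    """Hypothetical stand-in for a batch async function (not the PyFlink class)."""

    def __init__(self) -> None:
        self.calls = 0  # counts batch invocations to show the amortization

    async def async_invoke_batch(self, inputs: List[str]) -> List[str]:
        self.calls += 1
        await asyncio.sleep(0.01)  # simulates one round trip to a model server
        return [s.upper() for s in inputs]  # one result per input, same order


async def run_in_batches(fn: UppercaseBatchFunction,
                         items: List[str],
                         max_batch_size: int) -> List[str]:
    """Split items into fixed-size batches and await one call per batch."""
    results: List[str] = []
    for start in range(0, len(items), max_batch_size):
        batch = items[start:start + max_batch_size]
        out = await fn.async_invoke_batch(batch)
        if len(out) != len(batch):
            raise ValueError("batch function must return one result per input")
        results.extend(out)
    return results
```

With five inputs and a batch size of two, the sketch issues three calls instead of five, which is the throughput argument the description makes for high-latency inference services.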
Brief change log
New Python Classes
- AsyncBatchFunction
- AsyncBatchFunctionDescriptor
- AsyncBatchOperation
- BatchResultDistributor
Modified Files
- async_data_stream.py: unordered_wait_batch() and ordered_wait_batch() methods
- functions.py: AsyncBatchFunction and AsyncBatchFunctionDescriptor classes
- __init__.py: AsyncBatchFunction
- flink-fn-execution.proto: ASYNC_BATCH function type
Test Files
- test_async_batch_function.py
API Design
AsyncBatchFunction
AsyncDataStream Methods
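As a rough illustration of how the ordered and unordered variants are expected to differ, the sketch below uses plain asyncio rather than PyFlink. It assumes the new methods follow the semantics of Flink's existing ordered and unordered async wait (results in input order versus completion order); the functions `score_batch`, `emit_ordered`, and `emit_unordered` are hypothetical and are not the PR's code.

```python
import asyncio
from typing import List, Tuple


async def score_batch(batch_id: int, delay_s: float) -> int:
    """Hypothetical batch call that finishes after delay_s seconds."""
    await asyncio.sleep(delay_s)
    return batch_id


async def emit_ordered(specs: List[Tuple[int, float]]) -> List[int]:
    """Run all batches concurrently, emit results in submission order."""
    tasks = [asyncio.create_task(score_batch(b, d)) for b, d in specs]
    return [await t for t in tasks]


async def emit_unordered(specs: List[Tuple[int, float]]) -> List[int]:
    """Run all batches concurrently, emit results as each one completes."""
    tasks = [asyncio.create_task(score_batch(b, d)) for b, d in specs]
    results = []
    for next_done in asyncio.as_completed(tasks):
        results.append(await next_done)
    return results


# Batch 0 is the slowest, batch 1 the fastest.
SPECS = [(0, 0.15), (1, 0.01), (2, 0.08)]
```

Running `asyncio.run(emit_ordered(SPECS))` keeps the input order, while `emit_unordered` lets fast batches overtake slow ones, trading ordering for lower latency.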
Example Usage
Testing
The PR includes comprehensive tests covering:
- timeout_batch is called
Design Principles
- AsyncBatchWaitOperator
- AsyncFunction integration
Verifying this change
This change added tests and can be verified as follows:
    cd flink-python
    python -m pytest pyflink/datastream/tests/test_async_batch_function.py -v
Does this pull request potentially affect one of the following parts
- @Public(Evolving): yes
Documentation
PR Series for FLINK-38825
JIRA: FLINK-38825 - Introduce an AI-friendly Async Batch Operator for high-latency inference workloads
This feature is implemented incrementally through the following PR series:
- flink-streaming-java
- flink-streaming-java
- flink-streaming-java
- flink-streaming-java
- flink-streaming-java
- flink-table
- flink-python