[SPARK-56468][UDF] Validate required worker capabilities in direct dispatcher #55906

Open
RaghunandanKumar wants to merge 1 commit into apache:master from RaghunandanKumar:SPARK-56468-worker-spec-validation


What changes were proposed in this pull request?

This PR tightens DirectWorkerDispatcher validation for the new language-agnostic UDF worker specification.

Specifically, it now rejects direct worker specs that:

  • omit worker capabilities entirely
  • provide UNSPECIFIED enum values in supported data formats or communication patterns
  • omit required protocol support for ARROW
  • omit required protocol support for BIDIRECTIONAL_STREAMING
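
The four rejection rules above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the types `DataFormat`, `CommunicationPattern`, `WorkerCapabilities`, and `WorkerSpec`, the `validate` function, and the error messages are hypothetical stand-ins for the protobuf-generated classes and the dispatcher's real validation logic, whose names and shapes may differ.

```scala
// Hypothetical stand-ins for the protobuf-generated worker spec types.
// None of these names are taken from the Spark code base.
object CapabilityValidationSketch {
  sealed trait DataFormat
  object DataFormat {
    case object Unspecified extends DataFormat
    case object Arrow extends DataFormat
  }

  sealed trait CommunicationPattern
  object CommunicationPattern {
    case object Unspecified extends CommunicationPattern
    case object BidirectionalStreaming extends CommunicationPattern
  }

  final case class WorkerCapabilities(
      dataFormats: Seq[DataFormat],
      communicationPatterns: Seq[CommunicationPattern])

  // capabilities is optional to model a spec that omits the field entirely.
  final case class WorkerSpec(capabilities: Option[WorkerCapabilities])

  // Fail closed: return Left with a reason for the first rule that is violated,
  // and Right(()) only when every rule passes.
  def validate(spec: WorkerSpec): Either[String, Unit] = spec.capabilities match {
    case None =>
      Left("worker capabilities are missing")
    case Some(caps) =>
      if (caps.dataFormats.contains(DataFormat.Unspecified)) {
        Left("supported data formats contain UNSPECIFIED")
      } else if (caps.communicationPatterns.contains(CommunicationPattern.Unspecified)) {
        Left("supported communication patterns contain UNSPECIFIED")
      } else if (!caps.dataFormats.contains(DataFormat.Arrow)) {
        Left("required data format ARROW is not supported")
      } else if (!caps.communicationPatterns.contains(
          CommunicationPattern.BidirectionalStreaming)) {
        Left("required communication pattern BIDIRECTIONAL_STREAMING is not supported")
      } else {
        Right(())
      }
  }
}
```

In this sketch a spec passes only when capabilities are present, contain no UNSPECIFIED values, and list both ARROW and BIDIRECTIONAL_STREAMING. Returning `Either` rather than throwing is a choice made for the illustration only.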

The existing DirectWorkerDispatcherSuite fixtures are updated to include explicit capabilities where they were previously constructing minimal specs, and new focused tests cover the new validation paths.

Why are the changes needed?

SPARK-56468 is an audit task for the UDF gRPC / worker protocol before the Spark 4.2 protocol boundary hardens.

The protobuf comments already describe these capability fields as required, but the direct dispatcher previously accepted incomplete specs and deferred that inconsistency into later runtime paths. Failing closed here keeps the worker spec self-contained and prevents silently invalid protocol definitions from being treated as usable.

Does this PR introduce any user-facing change?

No. The UDF worker framework is still experimental and not yet wired into user-facing execution paths.

How was this patch tested?

Locally on macOS with Homebrew OpenJDK 17:

export JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home
export PATH="$JAVA_HOME/bin:$PATH"
build/sbt "udf-worker-core/testOnly org.apache.spark.udf.worker.core.DirectWorkerDispatcherSuite"

Passed: DirectWorkerDispatcherSuite (33 tests)

Was this patch authored or co-authored using generative AI tooling?

Yes.
