[SPARK-44734][PYTHON][DOCS] Expand PySpark type conversion guide #55902

Open
RaghunandanKumar wants to merge 1 commit into apache:master from RaghunandanKumar:codex/pyspark-type-conversions-docs

@RaghunandanKumar
What changes were proposed in this pull request?

This updates the PySpark type conversion guide to make it more useful for day-to-day users.

Changes:

  • remove unfinished TODO placeholders from the page
  • fix a small formatting typo in the DecimalType row
  • add practical notes about runtime conversion behavior, including:
    • null handling
    • fixed-width numeric limits
    • schema inference pitfalls for local Python data
    • nested dict inference as MapType vs StructType
    • BinaryType behavior under spark.sql.execution.pyspark.binaryAsBytes
  • add a short section describing Arrow-enabled Python UDF conversion caveats
  • add config-driven examples for:
    • spark.sql.pyspark.inferNestedDictAsStruct.enabled
    • spark.sql.timestampType=TIMESTAMP_NTZ
    • spark.sql.execution.pyspark.binaryAsBytes
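
To illustrate the "fixed-width numeric limits" note, here is a minimal pure-Python sketch of the value ranges behind Spark SQL's integral types. The helper name `smallest_integral_type` and the `INTEGRAL_RANGES` table are hypothetical and are not part of PySpark. The sketch never calls Spark, and what Spark does with an out-of-range value depends on the code path, which is not modeled here.

```python
# Inclusive value ranges of Spark SQL's fixed-width integral types
# (1, 2, 4, and 8-byte signed integers). Ordered narrowest to widest.
INTEGRAL_RANGES = {
    "ByteType": (-(2**7), 2**7 - 1),
    "ShortType": (-(2**15), 2**15 - 1),
    "IntegerType": (-(2**31), 2**31 - 1),
    "LongType": (-(2**63), 2**63 - 1),
}


def smallest_integral_type(value):
    """Hypothetical helper: name the narrowest integral type that holds `value`.

    Returns None when the value does not fit in any fixed-width type,
    which is possible because Python ints are arbitrary precision.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected a Python int")
    for name, (low, high) in INTEGRAL_RANGES.items():
        if low <= value <= high:
            return name
    return None


if __name__ == "__main__":
    for sample in (127, 128, 2**31, 2**63):
        print(sample, smallest_integral_type(sample))
```

A check like this can be run on local Python data before choosing an explicit schema, since a Python `int` has no upper bound while every Spark integral type does.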

Why are the changes needed?

The page had visible TODO markers and was missing several practical conversion cases that commonly confuse PySpark users, especially around schema inference and configuration-dependent behavior.

Does this PR introduce any user-facing change?

Yes, documentation only.

How was this patch tested?

Reviewed the rendered diff locally.

Full Spark docs/test validation was not run in this environment because Java is not installed.

@RaghunandanKumar marked this pull request as ready for review on May 15, 2026, 16:20