docs: remove references to native_datafusion and native_iceberg_compat scans #4362

Merged
mbutrovich merged 5 commits into apache:main from andygrove:docs/remove-scan-impl-references on May 19, 2026

andygrove (Member) commented May 18, 2026

Which issue does this PR close?

Part of #4020.

Rationale for this change

Companion docs update to #4019. Once native_iceberg_compat is removed and Comet has a single Parquet scan implementation, references to the two named impls (native_datafusion, native_iceberg_compat) and the spark.comet.scan.impl selection mechanic become misleading. This PR strips those references from the user guide and contributor guide so the docs match the post-#4019 reality.

What changes are included in this PR?

  • compatibility/scans.md: flattened to a single "Parquet Scan Limitations" section; impl comparison table and spark.comet.scan.impl paragraph removed; shared + native_datafusion limitations merged; native_iceberg_compat limitation dropped.
  • compatibility/spark-versions.md: per-version notes now refer to "Comet's Parquet scan"; anchor links retargeted to #parquet-scan-limitations.
  • compatibility/index.md: dropped the "(both scan implementations, …)" parenthetical.
  • user-guide/latest/datasources.md: S3 section rephrased for a single scan; --conf spark.comet.scan.impl=native_datafusion removed from the HDFS example and the two S3 examples; the HDFS Scala snippet's COMET_NATIVE_SCAN_IMPL line removed.
  • user-guide/latest/understanding-comet-plans.md: removed the "active scan implementation is shown in brackets" note from CometScan.
  • contributor-guide/adding_a_new_spark_version.md: dropped scan_impl: "auto" guidance, removed spark_sql_test_native_iceberg_compat.yml, dropped scan-impl from the matrix list.
  • contributor-guide/bug_triage.md: removed the two impl labels from the area-label list and reworded the "core path over experimental" principle.

Changelog and versioned-snapshot files under docs/source/changelog/ and docs/comet-0.16/ are intentionally untouched.
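
The datasources.md change described above amounts to deleting one `--conf` flag from each example command. A minimal sketch of that cleanup follows, assuming the flag always appears on a single line as `--conf spark.comet.scan.impl=<value>`, optionally followed by a shell line continuation. The helper name `strip_scan_impl_conf`, the sample command, and the `comet.jar` path are hypothetical, and this is not the tooling used for this PR.

```python
import re

# Matches "--conf spark.comet.scan.impl=<value>" plus any leading spaces or tabs.
_SCAN_IMPL_FLAG = re.compile(r"[ \t]*--conf[ \t]+spark\.comet\.scan\.impl=\S+")


def strip_scan_impl_conf(text: str) -> str:
    """Remove the scan-impl flag; drop lines that held nothing else."""
    kept = []
    for line in text.splitlines():
        cleaned = _SCAN_IMPL_FLAG.sub("", line)
        # A line that contained only the flag (and perhaps a "\" continuation)
        # is dropped entirely so the command stays well-formed.
        if cleaned != line and cleaned.strip() in ("", "\\"):
            continue
        kept.append(cleaned)
    return "\n".join(kept)


# Hypothetical example command, for illustration only.
example = "\n".join([
    "spark-shell \\",
    "  --conf spark.comet.enabled=true \\",
    "  --conf spark.comet.scan.impl=native_datafusion \\",
    "  --jars comet.jar",
])
print(strip_scan_impl_conf(example))
```

One limitation of this sketch: if the flag sits on the final line of a multi-line command, the preceding line keeps its trailing backslash and needs a manual fix.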

How are these changes tested?

Docs-only change. Verified that no active docs file under docs/source/ (excluding changelog/) references native_datafusion, native_iceberg_compat, spark.comet.scan.impl, COMET_NATIVE_SCAN_IMPL, or SCAN_NATIVE_DATAFUSION.
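
A check like the one described above could be scripted roughly as follows. This is a sketch under stated assumptions, not necessarily how the verification was actually performed: the function name `find_stale_references` is hypothetical, the script is assumed to run from the repository root, and any directory named `changelog` under the scanned root is skipped. The versioned snapshot under docs/comet-0.16/ lies outside docs/source/ and is therefore never visited.

```python
from pathlib import Path

# Identifiers the PR description says should no longer appear in active docs.
STALE_TERMS = (
    "native_datafusion",
    "native_iceberg_compat",
    "spark.comet.scan.impl",
    "COMET_NATIVE_SCAN_IMPL",
    "SCAN_NATIVE_DATAFUSION",
)


def find_stale_references(root, excluded_dirs=("changelog",)):
    """Return (relative_path, line_number, term) for each stale term under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Skip files that live inside an excluded directory such as changelog/.
        if any(part in excluded_dirs for part in relative.parts[:-1]):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # assumption: files that are not UTF-8 text are not docs
        for number, line in enumerate(text.splitlines(), start=1):
            for term in STALE_TERMS:
                if term in line:
                    hits.append((relative.as_posix(), number, term))
    return hits


if __name__ == "__main__":
    for path, number, term in find_stale_references("docs/source"):
        print(f"{path}:{number}: {term}")
```

An empty result corresponds to the "no active docs file references these names" condition stated above.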

andygrove added 2 commits May 18, 2026 18:51
…-references

# Conflicts:
#	docs/source/contributor-guide/bug_triage.md
andygrove marked this pull request as ready for review May 19, 2026 01:14
Comment on lines +150 to +154
| `CometScan` | V1 Parquet scan driven by Spark's file-source path through Comet's Parquet reader. Decoding runs in native code; the resulting Arrow batches cross JNI into the native plan. |
| `CometBatchScan` | DataSource V2 scan, including Iceberg Parquet, that produces Arrow batches consumed by Comet. |
| `CometNativeScan` | Fully native Parquet scan that runs entirely in DataFusion (no JVM Parquet reader involvement). |
| `CometIcebergNativeScan` | Fully native Iceberg Parquet scan. |
| `CometCsvNativeScan` | Fully native CSV scan (experimental). |
Contributor

Do we still see these scan types? Or are these for the case users set the deprecated scan type?

Member Author

CometScan should no longer appear. I will update this.

Comment thread docs/source/user-guide/latest/compatibility/scans.md Outdated
Contributor

mbutrovich left a comment

I think this looks right. I can take another pass as we review all the docs for 1.0.0, but for now this is good. Thanks @andygrove!

mbutrovich merged commit fbc3d2f into apache:main May 19, 2026
5 checks passed

3 participants