Skip to content

improve observability over exceptions#1730

Open
arthurpassos wants to merge 3 commits into
antalya-26.3from
export_partition_refactor_exceptions_observability
Open

improve observability over exceptions#1730
arthurpassos wants to merge 3 commits into
antalya-26.3from
export_partition_refactor_exceptions_observability

Conversation

@arthurpassos
Copy link
Copy Markdown
Collaborator

@arthurpassos arthurpassos commented May 5, 2026

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Observability over exceptions on export partition operations was broken and not reliable. This PR simplifies it by making it a single JSON array instead of multiple paths in zookeeper. Another change is that now system.replicated_partition_exports returns only local information, it does not reach zookeeper. #1716

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

@arthurpassos arthurpassos added antalya port-antalya PRs to be ported to all new Antalya releases labels May 5, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

Workflow [PR], commit [4fc7ba0]

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99d816eb7f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +377 to +380
if (leaf_exists)
ops.emplace_back(zkutil::makeSetRequest(last_exception_path, entry.toJsonString(), -1));
else
{
LOG_INFO(log, "ExportPartition: Exceptions per replica path does not exist, will create it");
ops.emplace_back(zkutil::makeCreateRequest(exceptions_per_replica_path, "", zkutil::CreateMode::Persistent));
ops.emplace_back(zkutil::makeCreateRequest(count_path, "1", zkutil::CreateMode::Persistent));
ops.emplace_back(zkutil::makeCreateRequest(last_exception_path, "", zkutil::CreateMode::Persistent));
ops.emplace_back(zkutil::makeCreateRequest(last_exception_path / "part", part_name, zkutil::CreateMode::Persistent));
ops.emplace_back(zkutil::makeCreateRequest(last_exception_path / "exception", exception_message, zkutil::CreateMode::Persistent));
}
ops.emplace_back(zkutil::makeCreateRequest(last_exception_path, entry.toJsonString(), zkutil::CreateMode::Persistent));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Create missing last_exception container for legacy exports

For exports that were created before this change, ZooKeeper contains .../exceptions_per_replica but not .../last_exception. In that upgrade scenario, this branch issues Create(.../last_exception/<replica>) without creating the parent path first, so Keeper returns ZNONODE and the surrounding tryMulti fails atomically. Because appendExceptionOps is used inside failure-handling multis (part failure bookkeeping, commit failure retries, and timeout-to-KILLED), those state transitions can be skipped repeatedly for in-flight legacy tasks, leaving them stuck in PENDING during failures.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as previously, I don't think we actually care about it?

@arthurpassos arthurpassos force-pushed the export_partition_refactor_exceptions_observability branch from 99d816e to a2b1e06 Compare May 7, 2026 14:03
@arthurpassos arthurpassos changed the base branch from antalya-26.1 to antalya-26.3 May 7, 2026 14:03
mkmkme
mkmkme previously approved these changes May 16, 2026
Copy link
Copy Markdown
Collaborator

@mkmkme mkmkme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@DimensionWieldr
Copy link
Copy Markdown
Collaborator

DimensionWieldr commented May 20, 2026

New regression fails:

export_merge_tree_partition_system_table_prefer_remote_information was removed so export partition tests are failing. Not a real issue. We'll update tests after this is merged.

New integration fails:

test_database_iceberg/test.py::test_namespace_filter is failing. This PR doesn't touch iceberg, so not an issue caused here. The fail is related to S3 inconsistent URI (full path vs relative path; this same issue is biting us in every PR, why did upstream implement relative paths maaannn)

AI audit returned no major defects. (Nitpicks about documenting changes in SettingsChangesHistory.cpp or backwards compatibility are non-issues.)

LGTM

@DimensionWieldr DimensionWieldr added the verified Approved for release label May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

antalya antalya-26.3 port-antalya PRs to be ported to all new Antalya releases verified Approved for release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants