
[core] Support empty dirs cleaning without bucket sub dir in orphan files cleanup#7295

Open
XiaoHongbo-Hope wants to merge 1 commit into apache:master from XiaoHongbo-Hope:empty_dir_delete

Conversation

@XiaoHongbo-Hope
Contributor

@XiaoHongbo-Hope XiaoHongbo-Hope commented Feb 20, 2026

Purpose/Problem

Currently, OrphanFilesClean does not remove empty partition directories that contain no bucket subdirectories, so many empty directories are left behind.

Example Scenario:
Note: All directories shown below are empty.

Before:

table_root/
├── part1=0/
│   ├── part2=a/
│   │   ├── bucket-0/      ← cleaned
│   │   └── bucket-1/      ← cleaned
│   └── part2=b/           ← NOT cleaned
└── part1=1/               ← NOT cleaned

After:

table_root/
├── part1=0/
│   ├── part2=a/
│   │   ├── bucket-0/      ← cleaned
│   │   └── bucket-1/      ← cleaned
│   └── part2=b/           ← cleaned
└── part1=1/               ← cleaned
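
The cleanup described above can be illustrated with a minimal, standalone sketch. It is not the code in this PR and does not use Paimon's FileIO or OrphanFilesClean APIs: the class name EmptyDirCleaner, the method cleanEmptyDirs, and the olderThanMillis parameter are hypothetical, and it works on the local file system through java.nio.file only. It walks the tree bottom-up, deletes directories that are empty regardless of whether they ever held a bucket subdirectory, and skips any directory modified at or after a cutoff as an age safeguard. Note that this sketch also removes a parent that becomes empty once its children are gone, which may or may not match the PR's behavior.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not Paimon's implementation. */
final class EmptyDirCleaner {

    /**
     * Deletes empty directories strictly below root, bottom-up.
     * A directory is deleted only if its last-modified time is earlier
     * than olderThanMillis. The root itself is never deleted.
     */
    static List<Path> cleanEmptyDirs(Path root, long olderThanMillis) throws IOException {
        List<Path> deleted = new ArrayList<>();
        for (Path child : listChildren(root)) {
            if (Files.isDirectory(child)) {
                cleanRecursively(child, olderThanMillis, deleted);
            }
        }
        return deleted;
    }

    /** Returns true if dir was deleted. */
    private static boolean cleanRecursively(Path dir, long olderThanMillis, List<Path> deleted)
            throws IOException {
        // Read the timestamp first: deleting children updates the parent's mtime.
        long modifiedMillis = Files.getLastModifiedTime(dir).toMillis();
        boolean empty = true;
        for (Path child : listChildren(dir)) {
            if (Files.isDirectory(child)) {
                if (!cleanRecursively(child, olderThanMillis, deleted)) {
                    empty = false; // a surviving subdirectory keeps this one alive
                }
            } else {
                empty = false; // any regular file keeps this directory alive
            }
        }
        if (empty && modifiedMillis < olderThanMillis) {
            Files.delete(dir);
            deleted.add(dir);
            return true;
        }
        return false;
    }

    /** Collects children eagerly so the stream is closed before any deletion. */
    private static List<Path> listChildren(Path dir) throws IOException {
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }
        return children;
    }
}
```

Reading each directory's modification time before descending is a deliberate choice in this sketch: removing a child touches the parent's timestamp on most file systems, which would otherwise make every parent look freshly modified and defeat the safeguard.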

Tests

OrphanFilesCleanTest
RemoveOrphanFilesActionITCaseBase
LocalOrphanFilesCleanTest

API and Format

Documentation

Generative AI tooling

Unit tests were co-authored with Cursor.

@XiaoHongbo-Hope XiaoHongbo-Hope marked this pull request as ready for review February 20, 2026 06:15
clean code

clean code

refactor to list empty partition dirs when listPaimonFileDirs

revert orphan files clean change in spark

update test case after refactor

optimize test case to cover emptyNonLeafPartitionPath

optimize test case to cover emptyNonLeafPartitionPath

add test case about non-empty dir delete

add log for empty partition clean

clean empty partition dirs with age safeguard