Skip to content

s3 killer: paginate to find complete list of files to delete#19009

Open
abannon wants to merge 2 commits intoapache:masterfrom
abannon:abannon/kill-1001
Open

s3 killer: paginate to find complete list of files to delete#19009
abannon wants to merge 2 commits intoapache:masterfrom
abannon:abannon/kill-1001

Conversation

@abannon
Copy link

@abannon abannon commented Feb 11, 2026

Description

For segments stored in s3, the segment killer looks for files to remove within the segment prefix by calling listObjectsV2, which returns at most 1000 items. In the case where there are more than 1000 files per segment prefix, the segment killer did not find and delete all of the files. The fix is to add pagination by reusing existing 'deleteSegmentFilesFromS3' function.

Release note

Fixed: Kill task for s3 will remove all segment files, instead of up to 1000 files.


Key changed/added classes in this PR
  • Changed S3DataSegmentKiller

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the changes, @abannon !
I have left a small suggestion.

Copy link
Contributor

@kfaraz kfaraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments.

@abannon, just a heads up, #18891 will be merged soon and this PR might get some merge conflicts after that.

}
catch (Exception e) {
log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please invert these return values as it is more natural to return true for success scenarios and false for failures.

);
}
catch (Exception e) {
log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would log the stack trace as well.

Suggested change
log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
log.error(e, "Error occurred while deleting segment files from s3.");

);
}
catch (Exception e) {
log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.error("Error occurred while deleting segment files from s3. Error: %s", e.getMessage());
log.error(e, "Error occurred while deleting segment files from s3.");

Comment on lines +106 to +107
boolean hadException = deleteSegmentFilesFromS3(s3Bucket, path);
if (hadException) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
boolean hadException = deleteSegmentFilesFromS3(s3Bucket, path);
if (hadException) {
boolean success = deleteSegmentFilesFromS3(s3Bucket, path);
if (!success) {

for (S3ObjectSummary objectSummary : list.getObjectSummaries()) {
log.info("Removing index file[s3://%s/%s] from s3!", s3Bucket, objectSummary.getKey());
s3Client.deleteObject(s3Bucket, objectSummary.getKey());
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe just call the new method deleteSegmentFilesFromS3 here, since the logic seems to be the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants