Conversation
The pyarrow docs make it seem like there are only 2 available strategies to pick from: AwsStandardS3RetryStrategy and AwsDefaultS3RetryStrategy.
> retry_strategy : S3RetryStrategy, default AwsStandardS3RetryStrategy(max_attempts=3)
> The retry strategy to use with S3; fail after max_attempts. Available strategies are AwsStandardS3RetryStrategy, AwsDefaultS3RetryStrategy.
The __init__ function also checks for the two classes above with isinstance, and only the given strategy's max_attempts field is used.
https://github.com/apache/arrow/blob/639201bfa412db26ce45e73851432018af6c945e/python/pyarrow/_s3fs.pyx#L409-L416
Given that only the max_attempts variable is used, is it still useful to expose a custom S3 retry strategy?
@kevinjqliu I agree that that's the current state. However, there is some activity on Arrow to also add other strategies (exponential backoff, etc.): apache/arrow#46517
kevinjqliu left a comment
LGTM. This will allow us to pass in a custom S3RetryStrategy class.
Note that as of pyarrow 20.0.0, only the AwsStandardS3RetryStrategy and AwsDefaultS3RetryStrategy classes are allowed.
# Rationale for this change
# Are these changes tested?
# Are there any user-facing changes?