aws: catch auth errors on XML responses#11891
Conversation
📝 WalkthroughWalkthroughConsolidates XML auth-error checks into one strcasestr() match across more AWS XML identifiers and extends JSON error-code recognition to include InvalidSecurity, TokenRefreshRequired, and InvalidSignature in flb_aws_is_auth_error(), plus a unit test exercising these cases. ChangesExpanded auth error detection
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Infer (1.2.0)tests/internal/aws_util.ctests/internal/aws_util.c:3:10: fatal error: 'fluent-bit/flb_sds.h' file not found ... [truncated 725 characters] ... fer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Debug log output from testing the change: |
|
Valgrind output that shows no leaks or memory corruption was found: |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/aws/flb_aws_util.c`:
- Around line 349-360: Fix spacing and indentation in the AWS XML error-check
block: add a space before "!= NULL" in the "SignatureDoesNotMatch" check so it
reads like the other strcasestr comparisons, and adjust the indentation of the
"return FLB_TRUE;" line to align with the other conditional body lines (use the
same 8-space indentation as the surrounding block); locate the block where
strcasestr(payload, "SignatureDoesNotMatch") is checked and the subsequent
return to make these two small edits.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
a3a9820 to
7ebc6f8
Compare
Signed-off-by: Antônio Franco <13881523+antoniomrfranco@users.noreply.github.com>
7ebc6f8 to
53f3aed
Compare
cosmo0920
left a comment
There was a problem hiding this comment.
The implementation is really reasonable for me.
But can we add test cases for this validation?
This is because we encourage to add unit test cases for something like newly adding functions or capabilities if possible.
In this case, I strongly recommend to add unit tests to validate this patch should be confirmed in error paths.
Signed-off-by: Antônio Franco <13881523+antoniomrfranco@users.noreply.github.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/internal/aws_util.c (1)
147-175: ⚡ Quick winExpand coverage to include all newly-added auth identifiers.
This test validates the path, but it only exercises a subset of the new matcher additions. Please add assertions for the other added auth codes (notably
InvalidSecurity,TokenRefreshRequired,InvalidSignature, and XMLSignatureDoesNotMatch/InvalidToken) so regressions in the expanded list are caught.Based on learnings: “Add or update tests for behavior changes, especially protocol parsing … prefer targeted tests close to the changed module.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/internal/aws_util.c` around lines 147 - 175, The test test_flb_aws_is_auth_error only covers a subset of new auth identifiers; add additional assertions in that function to exercise the newly-added matchers by creating payloads that contain the error codes "InvalidSecurity", "TokenRefreshRequired", "InvalidSignature", and the XML errors "SignatureDoesNotMatch" and "InvalidToken" (as well as any JSON variants, e.g. "{\"__type\":\"InvalidSecurity\",...}"), calling flb_aws_is_auth_error(payload, strlen(payload)) for each and asserting the expected FLB_TRUE/FLB_FALSE result consistent with other auth-error cases; keep the same style as existing tests (use TEST_CHECK and string payloads) so regressions in the expanded matcher list are detected.Source: Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/internal/aws_util.c`:
- Around line 147-175: The test test_flb_aws_is_auth_error only covers a subset
of new auth identifiers; add additional assertions in that function to exercise
the newly-added matchers by creating payloads that contain the error codes
"InvalidSecurity", "TokenRefreshRequired", "InvalidSignature", and the XML
errors "SignatureDoesNotMatch" and "InvalidToken" (as well as any JSON variants,
e.g. "{\"__type\":\"InvalidSecurity\",...}"), calling
flb_aws_is_auth_error(payload, strlen(payload)) for each and asserting the
expected FLB_TRUE/FLB_FALSE result consistent with other auth-error cases; keep
the same style as existing tests (use TEST_CHECK and string payloads) so
regressions in the expanded matcher list are detected.
Problem
When AWS credentials are rotated (e.g. a new key pair written to
~/.aws/credentials), the S3 output plugin fails to detect the authentication error and trigger a credential refresh.The root cause is in
flb_aws_is_auth_error()(src/aws/flb_aws_util.c): S3 returns error responses in XML format, but several auth-related error codes —InvalidAccessKeyId,SignatureDoesNotMatch,InvalidToken,InvalidSecurity,TokenRefreshRequired, andInvalidSignature— were only checked via the JSON path (usingflb_aws_error(), which parses the__typefield). The XML path only coveredInvalidClientTokenId,AccessDenied, andExpired.As a result,
flb_aws_client_request()never calledprovider->refresh()on S3 auth failures, so stale credentials were used indefinitely despite an updated credentials file on disk.Fix
The missing error codes are added to the XML
strcasestrcheck block inflb_aws_is_auth_error(). Whenflb_aws_client_request()receives a 4xx response, it callsflb_aws_is_auth_error()on the response body. If the function returnsFLB_TRUE, the client callsprovider->refresh(), which re-reads the credentials file from disk and picks up the rotated keys. The next upload attempt — triggered by the S3 timer callback — then uses the fresh credentials and succeeds. With this fix, that detection now works correctly for S3's XML responses, not just JSON-based APIs like Kinesis.Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit