fix(auth): auto-recover from stale encrypted credentials after upgrade#435
fix(auth): auto-recover from stale encrypted credentials after upgrade#435
Conversation
When credentials.enc cannot be decrypted (e.g. after an upgrade that changed the encryption key), automatically remove the stale file and fall through to other credential sources (plaintext, ADC) instead of hard-erroring. This breaks the stuck loop where logout+login couldn't fix the issue. Also sync .encryption_key file backup when the keyring has a valid key but the file is missing, preventing future key loss if the keyring becomes unavailable. Fixes #389
🦋 Changeset detectedLatest commit: ad04fa0 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the robustness of credential management by introducing automatic recovery mechanisms for situations where encrypted credentials become unusable due to encryption key changes, typically after system upgrades. It ensures a smoother user experience by preventing credential-related dead ends and enhancing the resilience of encryption key storage. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #435 +/- ##
==========================================
+ Coverage 64.44% 66.02% +1.57%
==========================================
Files 38 38
Lines 15601 16407 +806
==========================================
+ Hits 10054 10832 +778
- Misses 5547 5575 +28 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Code Review
This pull request introduces a helpful auto-recovery mechanism for stale encrypted credentials and improves the durability of the encryption key by creating a file backup from the keyring. The implementation is logical and includes relevant tests.
My review identifies a recurring pattern in the new error handling logic: file system operation failures are silently ignored. In both auth.rs when cleaning up stale files and in credential_store.rs when creating the key backup, ignoring these errors could prevent the recovery and backup mechanisms from working, leaving the user in a broken state without clear feedback. I've suggested changes to log these errors, which will make the new features more robust and transparent to the user.
Address review feedback: log warnings when file removal or backup creation fails, so users get clear feedback instead of silent failures. Token cache cleanup now skips NotFound errors since they may not exist.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust recovery mechanism for when encrypted credentials become undecryptable, for example after an upgrade. Instead of erroring, the application now removes the stale files and falls back to other credential sources. The change also proactively creates a file backup of the encryption key when it's found in the OS keyring, improving resilience. The implementation is logical and includes good test coverage. I have one suggestion to improve the consistency of file operations within an async context.
Switch from std::fs::remove_file to tokio::fs::remove_file in the async load_credentials_inner function to avoid blocking the runtime, consistent with other file operations in the same function.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust recovery mechanism for stale encrypted credentials, which is a significant improvement for user experience after upgrades. The changes in src/auth.rs to handle decryption errors by cleaning up stale files and falling through to other credential sources are well-implemented. The addition in src/credential_store.rs to back up the keyring key to a file is a good defensive measure against key loss. The new tests effectively cover these changes. I have a couple of suggestions to improve code reuse and robustness.
- Reuse parse_credential_file for encrypted creds instead of manual JSON parsing. This removes duplication and adds ServiceAccount support. - Use save_key_file_exclusive (atomic) for key backup creation, with AlreadyExists race condition handling. - Update stale comment about encrypted creds being AuthorizedUser only.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request effectively addresses the issue of stale encrypted credentials after an upgrade by automatically cleaning up undecryptable files and falling back to other credential sources. The added logic to back up the keyring encryption key to a file is a great improvement for robustness. The changes are well-tested. I have one suggestion to improve maintainability by centralizing the hardcoded token cache filenames.
Extract hardcoded 'token_cache.json' and 'sa_token_cache.json' strings into TOKEN_CACHE_FILENAME and SA_TOKEN_CACHE_FILENAME constants in auth_commands.rs. Update all 5 reference sites in auth.rs and auth_commands.rs to use the constants.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust auto-recovery mechanism for stale encrypted credentials, which is a great improvement for user experience after upgrades. The changes are well-implemented and include thorough tests. I've also added a suggestion to further improve the resilience of the encryption key backup mechanism to ensure it stays synchronized with the OS keyring, preventing potential issues if the keyring becomes unavailable.
- Unconditionally sync the .encryption_key file with the keyring value on every successful read, not just when file is missing. Prevents stale file backups causing decryption failures if keyring becomes unavailable later. Uses save_key_file (overwrite + fsync) instead of save_key_file_exclusive (create-only). - Use parse_credential_file for plaintext credentials at default path, consistent with encrypted/ADC paths. Adds ServiceAccount support for credentials.json, not just AuthorizedUser. - Updated test: keyring_backend_syncs_file_when_keyring_differs verifies file content is overwritten to match keyring key.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust auto-recovery mechanism for stale encrypted credentials, which can occur after an application upgrade changes key storage. When decryption fails, the stale credentials and token caches are now automatically removed, and the application falls back to other credential sources, preventing users from getting stuck in an unrecoverable state. Additionally, the encryption key from the OS keyring is now proactively backed up to a file to prevent key loss if the keyring becomes unavailable.
The changes are well-implemented and include thorough tests. I have one high-severity concern regarding the atomicity of the key file backup operation, which could lead to data loss under certain failure conditions. The details are in the specific comment.
Move TOKEN_CACHE_FILENAME/SA_TOKEN_CACHE_FILENAME constants and parse_credential_file for the plaintext credentials path to a follow-up refactor PR — keep this PR focused on stale credential auto-recovery.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust auto-recovery mechanism for stale encrypted credentials, which is a significant improvement for user experience after upgrades. When decryption fails, the CLI now intelligently removes the corrupt file and associated token caches, then gracefully falls back to other credential sources. Additionally, the change to back up the encryption key from the OS keyring to a file enhances the resilience of the authentication system. The code is well-structured and includes relevant tests. I've identified a minor issue in the new tests related to using synchronous I/O in an asynchronous context, which should be addressed.
Replace std::fs::write with tokio::fs::write().await in async test functions for consistency with the async runtime.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a robust recovery mechanism for stale encrypted credentials, which is a great improvement for user experience after upgrades. The logic to remove undecryptable files and fall back to other credential sources is well-implemented and tested. Similarly, adding a file backup for the keyring key enhances the resilience of the credential store. My only suggestion is to avoid hardcoding token cache filenames to improve maintainability, as detailed in the specific comment.
Summary
Fixes #389
After upgrading between versions that changed how the encryption key is stored (keyring migration in #345, native backends in #373), users can end up with a
credentials.encfile encrypted with a now-unavailable key. This leaves them stuck in a loop wherelogout + logindoesn't fix the issue.Changes
src/auth.rs— Whencredentials.encfails to decrypt:token_cache.json,sa_token_cache.json)credentials.json, ADC) instead of hard-erroringgws auth loginsrc/credential_store.rs— When the OS keyring returns a valid key but no.encryption_keyfile backup exists, create one. This prevents future key loss if the keyring becomes unavailable (container restart, daemon change, OS upgrade).Test plan
cargo clippycleancargo fmtcleantest_load_credentials_corrupt_encrypted_file_is_removed— verifies stale file is cleaned up and function falls throughtest_load_credentials_corrupt_encrypted_falls_through_to_plaintext— verifies fall-through to plaintext credentialskeyring_backend_creates_file_backup_when_missing— verifies file backup is created when keyring has key but file is missing