Skip to content

fix(cli): serialize token refresh across processes#34

Open
appleboy wants to merge 1 commit into
mainfrom
worktree-keyring
Open

fix(cli): serialize token refresh across processes#34
appleboy wants to merge 1 commit into
mainfrom
worktree-keyring

Conversation

@appleboy
Copy link
Copy Markdown
Member

@appleboy appleboy commented Apr 23, 2026

Summary

Two concurrent authgate-cli invocations that both see an expired access token race to call the refresh endpoint with the same refresh token. On servers that rotate refresh tokens (the go-authgate default), the first request consumes the token and the second gets invalid_grant, forcing the second CLI back into an interactive login. This PR serialises the refresh critical section across processes so only one refresh happens.

How it works

  • Cross-process advisory lock (github.com/gofrs/flock) wraps the load → refresh → save critical section. It relies on kernel fcntl/LockFileEx and is released automatically if the holder crashes — no stale-lock heuristics required.
  • Peer-refresh shortcut: after taking the lock the CLI re-reads the store; if a peer process already refreshed (the stored token is usable and differs from the stale copy), that fresh token is returned without a network call. If the peer rotated the refresh token, the rotated value is picked up before the exchange.
  • Signature change: refreshAccessToken now takes the full stale credstore.Token (instead of just the refresh-token string) so peer-refresh detection can compare both the access and refresh token fields.
  • Lock placement: the lock lives in refreshAccessToken, which every refresh path (run, ensureFreshToken, makeAPICallWithAutoRefresh) funnels through. The lock file sits next to the token file (<tokenFile>.<clientID>.lock); TokenFile is now exposed on AppConfig and pinned through ensureFreshToken so the lock and the store stay co-located across the lazy full-config upgrade.

Test plan

  • go test ./... passes (incl. existing TestRefreshAccessToken_RotationMode)
  • New TestRefreshAccessToken_PeerAlreadyRefreshed asserts no network call is made when the store already contains a peer-refreshed token
  • make lint is clean (golangci-lint: 0 issues)
  • Manual: run two CLI instances against a rotation-mode server with an expired access token — confirm only one refresh request hits the server

Notes

  • Reviewed with a multi-angle pass; two behavior-preserving cleanups were folded in: the redundant lock-wrapper struct was dropped (*flock.Flock already satisfies io.Closer), and the peer-check reuses the shared tokenUsable predicate.
  • Known limitation for follow-up: in keyring storage mode the lock path still derives from the (cwd-relative) token-file path, so invocations from different working directories do not serialise. Anchoring the lock to the backend identity (keyring service + client ID) would close that gap.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 23, 2026 14:59
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 54.28571% with 16 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
lock.go 37.50% 5 Missing and 5 partials ⚠️
auth.go 81.25% 2 Missing and 1 partial ⚠️
config.go 0.00% 2 Missing ⚠️
main.go 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR prevents concurrent authgate-cli processes from racing during refresh-token rotation by serializing the refresh-and-save critical section using a cross-process advisory lock, and adds coverage to ensure peer-refreshed tokens short-circuit the network call.

Changes:

  • Add lockTokenStore (gofrs/flock-based) to serialize refresh across CLI processes.
  • Update refreshAccessToken to accept the full stale token, re-load the store after lock acquisition, and return peer-refreshed tokens without a refresh request.
  • Add a new test ensuring no network call occurs when a peer already refreshed, and update existing tests/call sites to the new refresh API.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
main_test.go Updates test config to include TokenFile, adjusts refresh tests to new signature, and adds peer-refresh short-circuit test.
main.go Updates refresh call site to pass the full stored token.
auth.go Implements lock + peer-check logic in refreshAccessToken; updates auto-refresh call site.
config.go Adds TokenFile to AppConfig and wires it through config/store initialization.
lock.go Introduces cross-process advisory lock implementation used by refresh flow.
go.mod Adds github.com/gofrs/flock dependency.
go.sum Adds checksums for the new dependency.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lock.go
Comment on lines +35 to +46
func lockTokenStore(ctx context.Context, tokenFile, clientID string) (io.Closer, error) {
if tokenFile == "" {
return nil, errors.New("lock: tokenFile is empty")
}
if clientID == "" {
return nil, errors.New("lock: clientID is empty")
}

dir := filepath.Dir(tokenFile)
if err := os.MkdirAll(dir, 0o700); err != nil {
return nil, fmt.Errorf("create lock directory %q: %w", dir, err)
}
Copy link

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lock placement currently depends on cfg.TokenFile, whose default is a relative path (.authgate-tokens.json). For keyring/auto (keyring available) this introduces a new requirement that the current working directory be writable (so the lock file can be created), otherwise refresh will fail even though the keyring backend itself doesn’t need filesystem writes. Consider choosing a lock location that’s reliably writable (e.g., under os.UserCacheDir()/os.UserConfigDir()) when the token file path is relative or when using keyring storage.

Copilot uses AI. Check for mistakes.
Comment thread lock.go
Comment on lines +48 to +50
lockPath := filepath.Join(dir, filepath.Base(tokenFile)+"."+clientID+".lock")
fl := flock.New(lockPath)
locked, err := fl.TryLockContext(ctx, lockRetryInterval)
Copy link

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lockPath is built by concatenating clientID directly into the filename. If clientID contains path separators or .. segments, filepath.Join will clean the path and can place the lock file outside the intended directory (and on Windows some characters can make the filename invalid). Consider deriving a filesystem-safe lock name (e.g., hash/hex of clientID, or base64-url encoding, and/or rejecting/escaping path separators) before constructing the path.

Copilot uses AI. Check for mistakes.
@appleboy appleboy force-pushed the worktree-keyring branch from 291eef5 to b18e909 Compare May 30, 2026 03:52
- Take a cross-process advisory lock (via gofrs/flock) around the refresh-and-save critical section so concurrent CLI invocations cannot spend the same refresh token twice
- Re-read the store under the lock and return a peer process's fresh token when it has already completed a refresh, avoiding an unnecessary network call and a guaranteed invalid_grant on rotation servers
- Change refreshAccessToken to accept the full stale Token so peer-refresh detection can compare access and refresh token fields
- Expose TokenFile on AppConfig so the lock path is known regardless of the active storage backend
- Add a test that pre-populates the store with a peer-refreshed token and asserts refreshAccessToken returns it without hitting the network
@appleboy appleboy force-pushed the worktree-keyring branch from b18e909 to 17bddfd Compare May 30, 2026 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants