feat(proxy): resolve push identity from token via SCM provider API#1604

Open
coopernetes wants to merge 1 commit into main from feat/token-id-mapping

Conversation

@coopernetes coopernetes commented Jun 19, 2026

Contributor

Description

parsePush uses the last commit's committer as the push user. This adds a new chain processor that extracts the token from HTTP Basic auth, calls the SCM provider's user API (GitHub GET /user for now), and maps the SCM login to a git-proxy user via the gitAccount field.

  • TokenIdentityProvider interface with hostname-based dispatch
  • GitHubTokenIdentityProvider calling api.github.com/user
  • resolveUserFromToken chain processor (non-blocking on failure)
  • findUserByGitAccount DB lookup (file + mongo)
  • GET/PUT /api/v1/user/:username/git-account endpoints
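
A minimal sketch of how the provider interface and hostname-based dispatch listed above could fit together. The names `TokenIdentityProvider` and `GitHubTokenIdentityProvider` come from this PR, but every signature here (`supports`, `resolve`, `selectProvider`, the injected `HttpGetJson` function) is an assumption made for illustration, not the code under review.

```typescript
// Hypothetical shapes: only the two provider names come from the PR.
interface ScmIdentity {
  login: string;
}

// Injected HTTP GET (assumed signature) so the sketch runs without network access.
type HttpGetJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

interface TokenIdentityProvider {
  /** True when this provider handles the given upstream hostname. */
  supports(hostname: string): boolean;
  /** Resolves a token to an SCM identity, or null when the lookup fails. */
  resolve(token: string): Promise<ScmIdentity | null>;
}

class GitHubTokenIdentityProvider implements TokenIdentityProvider {
  private readonly getJson: HttpGetJson;

  constructor(getJson: HttpGetJson) {
    this.getJson = getJson;
  }

  supports(hostname: string): boolean {
    return hostname.toLowerCase() === 'github.com';
  }

  async resolve(token: string): Promise<ScmIdentity | null> {
    try {
      const body = await this.getJson('https://api.github.com/user', {
        Authorization: `Bearer ${token}`,
        Accept: 'application/vnd.github+json',
      });
      if (typeof body !== 'object' || body === null) return null;
      const login = (body as { login?: unknown }).login;
      return typeof login === 'string' ? { login } : null;
    } catch {
      // Non-blocking: a failed lookup yields null instead of throwing.
      return null;
    }
  }
}

// Picks the first provider that claims the upstream hostname, if any.
function selectProvider(
  providers: TokenIdentityProvider[],
  hostname: string,
): TokenIdentityProvider | undefined {
  return providers.find((p) => p.supports(hostname));
}
```

Keeping the HTTP call injectable means the dispatch and error handling can be unit-tested offline; a real caller would presumably pass a thin wrapper around its HTTP client.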

A push is not blocked when the gitAccount isn't mapped, so that gitAccount can be introduced gradually via the UI. For now this acts as a "soft" check, unless the maintainer team wishes to adopt this model and make it a requirement for authorising the "pusher" identity link that is currently missing, as described in #1400.

How it works

  1. resolveUserFromToken runs in the push chain after parsePush, before checkUserPushPermission
  2. Extracts the token from the HTTP Basic auth header (the password field)
  3. Dispatches to a TokenIdentityProvider based on the upstream hostname (github.com → GitHubTokenIdentityProvider)
  4. Calls GET /user with the token to get the SCM login
  5. Looks up the git-proxy user by gitAccount field — if found, sets action.user and action.userEmail from the DB user
  6. If no gitAccount match, falls back to using the SCM login directly (non-blocking)
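
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the processor in this PR: `PushActionSketch`, `ProxyUser`, and the two injected lookup functions are hypothetical stand-ins, and the token is assumed to have been extracted from the Basic auth header already (step 2).

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface ProxyUser {
  username: string;
  email: string;
  gitAccount?: string;
}

interface PushActionSketch {
  upstreamHostname: string; // e.g. "github.com"
  user?: string; // initially whatever parsePush set
  userEmail?: string;
}

// Injected dependencies (assumed signatures) so the sketch runs offline.
type LookupScmLogin = (hostname: string, token: string) => Promise<string | null>;
type FindUserByGitAccount = (gitAccount: string) => Promise<ProxyUser | null>;

async function resolveUserFromTokenSketch(
  action: PushActionSketch,
  token: string | null,
  lookupScmLogin: LookupScmLogin,
  findUserByGitAccount: FindUserByGitAccount,
): Promise<PushActionSketch> {
  // No token: leave the identity from parsePush untouched.
  if (!token) return action;
  try {
    // Steps 3-4: dispatch on hostname and ask the SCM who owns the token.
    const login = await lookupScmLogin(action.upstreamHostname, token);
    if (!login) return action;

    // Step 5: map the SCM login to a git-proxy user via gitAccount.
    const dbUser = await findUserByGitAccount(login);
    if (dbUser) {
      action.user = dbUser.username;
      action.userEmail = dbUser.email;
    } else {
      // Step 6: no mapping, fall back to the SCM login (email left as-is).
      action.user = login;
    }
  } catch {
    // Non-blocking: any lookup failure leaves the action unchanged.
  }
  return action;
}
```

Because every failure path returns the action unchanged, the later checkUserPushPermission step still sees the identity that parsePush produced.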

Limitations

  • Does not work for a generic git repository provider that has no user API. Enforcing this behaviour within Git Proxy would limit it to providers that expose an identity lookup API whose result can be matched to a valid Git Proxy user. In practice, most open source projects are hosted on GitHub, with some on others such as GitLab or Codeberg. These aren't the only git servers hosting open source projects, but they cover the large majority.
  • For specific providers (GitLab, Forgejo/Codeberg/Gitea), an additional scope is needed. Originally documented here: https://github.com/RBC/fogwall/blob/main/docs/CONFIGURATION.md#token-scope-requirements

Token scope requirements

The SCM login check calls GET /user (or equivalent) on the upstream SCM using the pusher's token. The token must carry at least the following scope:

| Provider | API endpoint | Additional scope |
| --- | --- | --- |
| GitHub | GET https://api.github.com/user | No additional scopes required for either classic or fine-grained PATs. |
| GitLab | GET {uri}/api/v4/user | read_user or api (not recommended, prefer read_user) |
| Codeberg | GET https://codeberg.org/api/v1/user | read:user |
| Gitea | GET https://gitea.com/api/v1/user | read:user |
  • Bitbucket is just... weird. It has two separate sets of permissions, one for git and one for the Bitbucket APIs. A user email can be linked between both "realms", but you cannot use your email to push code to that platform. Supporting Bitbucket properly requires some credential rewriting, which is error-prone and brittle. See BitbucketProvider and BitbucketIdentityFilter in RBC/fogwall for details on what is needed in the HTTP flow. I also had only temporary access to a live Bitbucket server (created a single repo, pushed a single commit to it) before Atlassian's trial expired. Rigorously checking this type of logic would need someone in the community with a live Bitbucket server, or an existing FOSS project approved for Bitbucket use, to test against. It's shared here as prior art/learnings only.
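
For reference, the token scope table above could be encoded as data along these lines. This is a hedged sketch: the `Provider` union, `userEndpoint`, and the `gitlabBaseUri` parameter are hypothetical names, and only GitHub is implemented in this PR.

```typescript
// Hypothetical encoding of the scope table; only GitHub is implemented in the PR.
type Provider = 'github' | 'gitlab' | 'codeberg' | 'gitea';

interface UserEndpoint {
  url: string;
  /** Minimum extra scope the pusher's token must carry, or null if none. */
  requiredScope: string | null;
}

// gitlabBaseUri stands in for the {uri} placeholder in the table and is
// only used for the 'gitlab' provider.
function userEndpoint(provider: Provider, gitlabBaseUri: string): UserEndpoint {
  switch (provider) {
    case 'github':
      return { url: 'https://api.github.com/user', requiredScope: null };
    case 'gitlab': {
      const root = gitlabBaseUri.replace(/\/+$/, ''); // drop trailing slashes
      return { url: `${root}/api/v4/user`, requiredScope: 'read_user' };
    }
    case 'codeberg':
      return { url: 'https://codeberg.org/api/v1/user', requiredScope: 'read:user' };
    case 'gitea':
      return { url: 'https://gitea.com/api/v1/user', requiredScope: 'read:user' };
  }
}
```

Surfacing `requiredScope` next to the URL would let a provider produce a more helpful message when the user API rejects a token that lacks the scope.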

Related Issue

related to #1400

General

Documentation

  • Required user docs for adding their gitAccount (GitHub username in this current iteration)
  • Update any architectural docs with the identity resolution

Configuration

no configuration changes introduced

Tests

  • Tests have been added/updated for new functionality
  • Unit tests pass (npm test)
  • Linting and formatting pass (npm run lint and npm run format:check)
  • Type checks pass (npm run check-types)
  • API route tests for GET/PUT /api/v1/user/:username/git-account (coverage exists but UI integration testing is deferred)

@coopernetes coopernetes requested a review from a team as a code owner June 19, 2026 19:18
@netlify

netlify Bot commented Jun 19, 2026


Deploy Preview for endearing-brigadeiros-63f9d0 canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 4b14544 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/endearing-brigadeiros-63f9d0/deploys/6a35bd93ec8dcb0008c5ad40 |

@linux-foundation-easycla

linux-foundation-easycla Bot commented Jun 19, 2026


CLA Signed
The committers listed above are authorized under a signed CLA.

  • ✅ login: coopernetes / name: Thomas Cooper (ef788cf)

@github-actions

github-actions Bot commented Jun 19, 2026


Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@coopernetes coopernetes force-pushed the feat/token-id-mapping branch from 9c3d053 to ef788cf Compare June 19, 2026 19:20
@codecov

codecov Bot commented Jun 19, 2026


Codecov Report

❌ Patch coverage is 96.61017% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.76%. Comparing base (fc23d58) to head (4b14544).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...oxy/processors/push-action/resolveUserFromToken.ts | 96.29% | 3 Missing ⚠️ |
| src/db/file/users.ts | 81.81% | 2 Missing ⚠️ |
| src/db/index.ts | 50.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1604      +/-   ##
==========================================
+ Coverage   85.38%   85.76%   +0.37%     
==========================================
  Files          83       85       +2     
  Lines        7878     8055     +177     
  Branches     1312     1357      +45     
==========================================
+ Hits         6727     6908     +181     
+ Misses       1123     1120       -3     
+ Partials       28       27       -1     


const user: GitHubUserResponse = await response.json();
return {
  login: user.login,
  email: user.email ?? undefined,

@coopernetes coopernetes Jun 19, 2026

Contributor Author

As discussed in the related issue, the email field in this API response shouldn't be relied upon as it is almost always null (at least on GitHub) due to default profile privacy settings.
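
To illustrate the point about email: a small sketch in which the identity is keyed on `login` only and `email` is carried along as optional, never used for matching. `GitHubUserResponse` is named in the diff, but its fields as written here, plus `toScmIdentity` and `gitAccountKey`, are assumptions for this example.

```typescript
// Assumed subset of the GitHub GET /user response; email is often null.
interface GitHubUserResponse {
  login: string;
  email: string | null;
}

interface ScmIdentity {
  login: string;
  email?: string;
}

// Normalises the response: a null email becomes an absent field.
function toScmIdentity(user: GitHubUserResponse): ScmIdentity {
  return { login: user.login, email: user.email ?? undefined };
}

// The lookup key is derived from login alone, lowercased to match how
// gitAccount is queried; email plays no part in matching.
function gitAccountKey(identity: ScmIdentity): string {
  return identity.login.toLowerCase();
}
```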

}

const [scheme, encoded] = authHeader.split(' ');
if (!scheme || !encoded || scheme.toLowerCase() !== 'basic') {

@coopernetes coopernetes Jun 19, 2026

Contributor Author

I'm not sure how much this code path even makes sense given that git by default uses HTTP Basic authorization. A Git Proxy custom credentialHelper could be an interesting angle to explore but that requires a customization on the client side. Maybe worth tracking in the backlog?
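
For context, a self-contained sketch of the Basic header handling this thread refers to. It mirrors the scheme check in the hunk above, but `extractBasicToken` and its null-on-anything-unexpected behaviour are assumptions for illustration; it also assumes a Node runtime for `Buffer`.

```typescript
import { Buffer } from 'node:buffer';

// Returns the password part of an HTTP Basic Authorization header, which is
// where git clients place a personal access token. Any other scheme, or a
// malformed value, yields null so the caller can skip identity resolution.
function extractBasicToken(authHeader: string | undefined): string | null {
  if (!authHeader) return null;

  const [scheme, encoded] = authHeader.split(' ');
  if (!scheme || !encoded || scheme.toLowerCase() !== 'basic') return null;

  // Credentials are base64("username:password").
  const decoded = Buffer.from(encoded, 'base64').toString('utf8');
  const separator = decoded.indexOf(':');
  if (separator === -1) return null;

  // Split on the first colon only: the password itself may contain colons.
  const password = decoded.slice(separator + 1);
  return password.length > 0 ? password : null;
}
```

A non-Basic scheme such as `Bearer` falls through to null here, which is the branch the comment above questions.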

Comment thread src/db/mongo/users.ts

export const findUserByGitAccount = async function (gitAccount: string): Promise<User | null> {
  const collection = await connect(collectionName);
  const doc = await collection.findOne({ gitAccount: { $eq: gitAccount.toLowerCase() } });

@coopernetes coopernetes Jun 19, 2026

Contributor Author

Any desire to support a list of accounts here? gitAccount is somewhat of a holdover from v1. It's also singular across the whole user context: there's no shape in the data model today that associates git accounts with an upstream provider/hostname.

Ideally, we revisit this shape in support of this PR. Something like this:

```
# MongoDB doc
{
  # existing keys...
  "username": "git-proxy-user",
  "email": "user@corpo-example.com",
  "gitAccounts": {
    "github.com": ["foo", "bar"],
    "gitlab.com": ["baz"]
  }
}
```
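
A lookup against that proposed shape might look like the sketch below. It works on an in-memory array rather than MongoDB, and `ProxyUserDoc` and `findUserByHostAccount` are hypothetical names; nothing here exists in the PR.

```typescript
// Hypothetical document shape following the gitAccounts proposal above.
interface ProxyUserDoc {
  username: string;
  email: string;
  gitAccounts?: Record<string, string[]>;
}

// Finds the user who has registered `login` under the given upstream
// hostname. Comparison is case-insensitive on both hostname and login.
function findUserByHostAccount(
  users: ProxyUserDoc[],
  hostname: string,
  login: string,
): ProxyUserDoc | null {
  const host = hostname.toLowerCase();
  const wanted = login.toLowerCase();
  const match = users.find((user) => {
    const accounts = Object.entries(user.gitAccounts ?? {})
      .filter(([key]) => key.toLowerCase() === host)
      .flatMap(([, logins]) => logins);
    return accounts.some((account) => account.toLowerCase() === wanted);
  });
  return match ?? null;
}
```

Scoping the match by hostname means the same login on two different providers would no longer resolve to the same git-proxy user by accident.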

…1400)

parsePush incorrectly uses the last commit's committer as the push user.
This adds a new chain processor that extracts the token from HTTP Basic
auth, calls the SCM provider's user API (GitHub GET /user for now), and
maps the SCM login to a git-proxy user via the gitAccount field.

- TokenIdentityProvider interface with hostname-based dispatch
- GitHubTokenIdentityProvider calling api.github.com/user
- resolveUserFromToken chain processor (non-blocking on failure)
- findUserByGitAccount DB lookup (file + mongo)
- GET/PUT /api/v1/user/:username/git-account endpoints
@coopernetes coopernetes force-pushed the feat/token-id-mapping branch from 91a910d to 4b14544 Compare June 19, 2026 22:07