fix(providers/git): force HTTP auth for public repos via extraHeader #63471

Open
antonio-mello-ai wants to merge 1 commit into apache:main from
antonio-mello-ai:fix/git-bundle-force-auth

@antonio-mello-ai
Contributor

Description

Git does not send credentials for public repositories because the server never responds with a 401 challenge. This causes Airflow's DAG bundle git fetches to hit anonymous rate limits even when valid credentials are configured in the connection.

Root cause

Over HTTP/HTTPS, git always attempts anonymous access first and sends credentials only if the server responds with a 401. Public repos never issue a 401, so credentials embedded in the URL are never sent.

Fix

When an HTTP/HTTPS connection has an auth_token, set GIT_CONFIG_* environment variables to inject an http.extraHeader with a Basic auth token:

GIT_CONFIG_COUNT=1
GIT_CONFIG_KEY_0=http.extraHeader
GIT_CONFIG_VALUE_0=Authorization: Basic <base64(username:password)>

This forces git to send the Authorization header on every request, so authenticated rate limits apply even for public repositories. The GIT_CONFIG_* environment variables require git >= 2.31.
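As a minimal sketch of the encoding step described above, the three variables could be built like this. The function name build_http_auth_env and its parameters are hypothetical and are not taken from the PR's diff; only the variable names and the header format come from the description.

```python
import base64


def build_http_auth_env(username: str, token: str) -> dict[str, str]:
    """Return GIT_CONFIG_* variables that make git send Basic auth on every request.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    # Basic auth is base64("username:token"), carried as ASCII text.
    credentials = f"{username}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {
        "GIT_CONFIG_COUNT": "1",
        "GIT_CONFIG_KEY_0": "http.extraHeader",
        "GIT_CONFIG_VALUE_0": f"Authorization: Basic {encoded}",
    }
```

Note that base64 is an encoding, not encryption, so the resulting environment should be handled with the same care as the token itself.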

Also updated _fetch_bare_repo in GitDagBundle to pass all hook env vars via custom_environment(**self.hook.env) instead of only GIT_SSH_COMMAND. This ensures the auth header is used during both clone and fetch operations.
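The env-passing change can be sketched roughly as follows. This is a simplified illustration, not the PR's code: fetch_with_hook_env is a hypothetical name, and bare_repo is assumed to behave like a GitPython Repo whose git attribute provides the custom_environment() context manager.

```python
from typing import Any, Mapping


def fetch_with_hook_env(bare_repo: Any, hook_env: Mapping[str, str]) -> None:
    """Fetch from origin with every hook-provided variable applied to the git call.

    Hypothetical helper; assumes bare_repo is a GitPython-style Repo object.
    """
    # Forward the whole mapping (GIT_SSH_COMMAND and GIT_CONFIG_* alike),
    # rather than picking out only the SSH command.
    with bare_repo.git.custom_environment(**hook_env):
        bare_repo.remotes.origin.fetch()
```

Forwarding the full mapping means any variable the hook adds later reaches the fetch without another change at this call site.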

Changes

  • GitHook._set_http_auth_env(): New method that encodes credentials and sets GIT_CONFIG_* env vars.
  • GitHook._process_git_auth_url(): Calls _set_http_auth_env() when HTTPS/HTTP + auth_token.
  • GitDagBundle._fetch_bare_repo(): Passes all env vars (SSH + HTTP auth) instead of only SSH.
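The condition under which the header is injected (HTTP or HTTPS URL plus an auth_token) might look like the sketch below. The function should_force_http_auth is a hypothetical stand-in for the check inside _process_git_auth_url(), whose real body is not shown in this description.

```python
from typing import Optional
from urllib.parse import urlsplit


def should_force_http_auth(repo_url: str, auth_token: Optional[str]) -> bool:
    """Return True only for HTTP(S) URLs that come with a token.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    if not auth_token:
        return False
    # scp-style SSH URLs such as git@host:org/repo.git have no URL scheme,
    # so they fall through to False along with ssh:// URLs.
    scheme = urlsplit(repo_url).scheme.lower()
    return scheme in ("http", "https")
```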

Backward compatibility

  • URL-embedded credentials are kept for backward compatibility (they still work for private repos via 401 challenge).
  • SSH connections are not affected (no GIT_CONFIG_* vars are set for SSH URLs).
  • The GIT_CONFIG_* env vars require git >= 2.31 (March 2021).
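Because of the git >= 2.31 requirement, a deployment may want to confirm the installed version before relying on the header. The sketch below parses the text printed by `git --version`; the helper supports_git_config_env is hypothetical, and the description does not say that the PR performs such a check.

```python
import re

# Minimum git release that honours GIT_CONFIG_COUNT / GIT_CONFIG_KEY_n / GIT_CONFIG_VALUE_n.
MIN_GIT_VERSION = (2, 31)


def supports_git_config_env(version_output: str) -> bool:
    """Return True if `git --version` output reports at least git 2.31.

    Hypothetical helper for illustration; not part of the PR as described.
    """
    # Take the first "major.minor" pair; vendor suffixes after it are ignored.
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_GIT_VERSION
```

The input string could come from something like `subprocess.run(["git", "--version"], capture_output=True, text=True).stdout`.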

Tests

  • 4 new tests: HTTPS auth sets header, HTTP auth sets header, no auth skips header, SSH skips header.
  • All 17 hook tests + 80 bundle tests pass.

Closes #54829

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@antonio-mello-ai antonio-mello-ai force-pushed the fix/git-bundle-force-auth branch 2 times, most recently from 01462cc to 0a5faf1 Compare March 13, 2026 23:04
@antonio-mello-ai antonio-mello-ai force-pushed the fix/git-bundle-force-auth branch 8 times, most recently from 86c308c to 7e95167 Compare March 21, 2026 21:04
Git does not send credentials for public repositories because the
server never responds with 401. This causes authenticated rate limits
to be ignored even when valid credentials are configured.

Add http.extraHeader with Basic auth via GIT_CONFIG_* env vars
(git >= 2.31) so credentials are sent on every request. Also update
_fetch_bare_repo to pass all hook env vars (not just GIT_SSH_COMMAND).

Closes apache#54829

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@antonio-mello-ai antonio-mello-ai force-pushed the fix/git-bundle-force-auth branch from 7e95167 to 30f945e Compare March 21, 2026 23:07
Development

Successfully merging this pull request may close these issues.

DAG-Bundle : Git connections ignore credentials for public repositories, causing anonymous rate limit issues