gcs: add disable_http2 option for high-throughput parallel downloads #1437
Open
minguyen9988 wants to merge 1 commit into
With HTTP/2 enabled (the default for Go's http.Transport when TLS is used), all concurrent part-download streams are multiplexed onto a small number of TCP connections. HTTP/2's per-connection flow control then caps the aggregate throughput at the backup tail: when only a handful of large parts remain in flight, the window never fully opens, and the link is under-utilised.

The existing force_http option works around this by downgrading the scheme to cleartext http://, but that breaks the public GCS endpoint (which returns 403 "SSL Required") and is only suitable for internal reverse-proxy caches (e.g. varnish).

This commit adds a new GCS config option, disable_http2, which suppresses HTTP/2 negotiation (ForceAttemptHTTP2=false, NextProtos=["http/1.1"]) WITHOUT downgrading the request scheme. The https:// scheme is preserved, so TLS and HTTPS_PROXY (CONNECT) continue to work. Each parallel part-download gets its own dedicated TCP connection (up to MaxIdleConnsPerHost=64), letting concurrent transfers saturate the link.

The key design decision is in the round-tripper selection: force_http wraps the base transport in rewriteTransport (which rewrites https to http), while disable_http2 uses the base transport directly. This is tested in TestRewriteTransportSchemeDowngrade.

Config: gcs.disable_http2 (bool, default false)
Env: GCS_DISABLE_HTTP2
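The transport settings named in the commit message can be sketched as below. This is a minimal illustration rather than the PR's actual code: `newHTTP1OnlyTransport` is a hypothetical helper name, and using `http.ProxyFromEnvironment` is an assumption about how `HTTPS_PROXY` would be honoured. Only the field values (`ForceAttemptHTTP2=false`, `NextProtos=["http/1.1"]`, `MaxIdleConns=0`, `MaxIdleConnsPerHost=64`) come from the description.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// newHTTP1OnlyTransport is a hypothetical helper (not taken from the PR).
// It builds a transport that keeps TLS but never negotiates HTTP/2.
func newHTTP1OnlyTransport() *http.Transport {
	return &http.Transport{
		// Assumption: proxy settings come from the environment, so an
		// HTTPS_PROXY is reached with a CONNECT tunnel as usual.
		Proxy: http.ProxyFromEnvironment,
		// Do not let net/http upgrade this custom transport to HTTP/2.
		ForceAttemptHTTP2: false,
		// Advertise only HTTP/1.1 in the TLS ALPN extension.
		TLSClientConfig: &tls.Config{
			NextProtos: []string{"http/1.1"},
		},
		// Zero means no global limit on pooled idle connections.
		MaxIdleConns: 0,
		// Keep up to 64 idle connections per host available for reuse.
		MaxIdleConnsPerHost: 64,
	}
}

func main() {
	tr := newHTTP1OnlyTransport()
	fmt.Println(tr.ForceAttemptHTTP2, tr.TLSClientConfig.NextProtos, tr.MaxIdleConnsPerHost)
}
```

Note that `MaxIdleConnsPerHost` bounds how many idle connections are pooled for reuse, not how many may be open at once. Under HTTP/1.1 each in-flight request occupies its own connection either way.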
Fixes #1434
Summary
Add a `gcs.disable_http2` config option that forces the GCS client onto HTTP/1.1 transport without downgrading the URL scheme to cleartext. This gives each parallel part-download its own dedicated TCP connection, eliminating the HTTP/2 per-connection flow control bottleneck that caps aggregate throughput at the backup tail.

Problem
With HTTP/2 (the default when TLS is used), all concurrent download streams share a small number of TCP connections. HTTP/2's per-connection flow control window limits aggregate throughput across all multiplexed streams. At the backup tail — when only a few large parts remain in flight — the window never fully opens and the link is under-utilised.
The existing `force_http` option suppresses HTTP/2 but also downgrades the scheme from `https://` to `http://` via `rewriteTransport`, which breaks the public GCS endpoint (403 SSL Required) and HTTPS proxies.

Changes
- `pkg/config/config.go`: Add `DisableHttp2` field to `GCSConfig` with YAML tag `disable_http2` and env var `GCS_DISABLE_HTTP2`.
- `pkg/storage/gcs.go`: Three changes to the `Connect()` method:
  1. The condition is now `ForceHttp || DisableHttp2 || Debug` (was `ForceHttp || Debug`).
  2. When `DisableHttp2` is set, set `MaxIdleConns=0` (unlimited) and `MaxIdleConnsPerHost=64` so each parallel part gets its own TCP connection.
  3. `ForceHttp` wraps the transport in `rewriteTransport` (scheme downgrade to `http://`); `DisableHttp2` uses the base transport directly (preserves `https://`).
- `pkg/storage/gcs_test.go`: Add `TestRewriteTransportSchemeDowngrade` with two subtests documenting the behavioral contract: `ForceHttp` must downgrade the scheme to `http`, while `DisableHttp2` must preserve `https`.
- `test/testflows/.../cli.py.cli.snapshot`: Add `disable_http2: false` to the default config snapshot.

Behavioral difference from `force_http`

| `force_http` | `disable_http2` |
| --- | --- |
| `http://` (cleartext) | `https://` (TLS preserved) |

Testing
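The contract that `TestRewriteTransportSchemeDowngrade` is described as documenting could be exercised along the following lines. This is a hedged sketch, not the PR's test: `schemeRewriter` is a hypothetical stand-in for `rewriteTransport`, `recorder` and `pickRoundTripper` are illustrative names, and the object URL is a placeholder. No network traffic is sent, because the base transport is a fake that only records the scheme it receives.

```go
package main

import (
	"fmt"
	"net/http"
)

// schemeRewriter is a hypothetical stand-in for the PR's rewriteTransport:
// it downgrades https requests to http before delegating to base.
type schemeRewriter struct {
	base http.RoundTripper
}

func (t schemeRewriter) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not mutate the caller's request, so edit a clone.
	clone := req.Clone(req.Context())
	if clone.URL.Scheme == "https" {
		clone.URL.Scheme = "http"
	}
	return t.base.RoundTrip(clone)
}

// recorder is a fake base transport that records the scheme it was handed.
type recorder struct {
	scheme string
}

func (r *recorder) RoundTrip(req *http.Request) (*http.Response, error) {
	r.scheme = req.URL.Scheme
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// pickRoundTripper mirrors the selection described above: force_http wraps
// the base transport, while disable_http2 alone uses it directly.
func pickRoundTripper(base http.RoundTripper, forceHTTP bool) http.RoundTripper {
	if forceHTTP {
		return schemeRewriter{base: base}
	}
	return base
}

// observedScheme sends one https request through the selected round-tripper
// and reports which scheme reached the base transport.
func observedScheme(forceHTTP bool) (string, error) {
	rec := &recorder{}
	// Placeholder object URL; nothing is actually fetched.
	req, err := http.NewRequest(http.MethodGet, "https://storage.googleapis.com/bucket/object", nil)
	if err != nil {
		return "", err
	}
	resp, err := pickRoundTripper(rec, forceHTTP).RoundTrip(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return rec.scheme, nil
}

func main() {
	for _, forceHTTP := range []bool{true, false} {
		scheme, err := observedScheme(forceHTTP)
		fmt.Println(forceHTTP, scheme, err)
	}
}
```

Recording the scheme at the base transport keeps the check independent of any real endpoint, which matters because the public GCS endpoint rejects cleartext requests.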