
gcs: add disable_http2 option for high-throughput parallel downloads #1437

Open
minguyen9988 wants to merge 1 commit into Altinity:master from minguyen9988:feat/gcs-disable-http2


@minguyen9988

Contributor

Fixes #1434

Summary

Add gcs.disable_http2 config option that forces the GCS client onto HTTP/1.1 transport without downgrading the URL scheme to cleartext. This gives each parallel part-download its own dedicated TCP connection, eliminating the HTTP/2 per-connection flow control bottleneck that caps aggregate throughput at the backup tail.

Problem

With HTTP/2 (the default when TLS is used), all concurrent download streams share a small number of TCP connections. HTTP/2's per-connection flow control window limits aggregate throughput across all multiplexed streams. At the backup tail — when only a few large parts remain in flight — the window never fully opens and the link is under-utilised.

The existing force_http option suppresses HTTP/2 but also downgrades the scheme from https:// to http:// via rewriteTransport, which breaks the public GCS endpoint (403 SSL Required) and HTTPS proxies.

Changes

pkg/config/config.go — Add DisableHttp2 field to GCSConfig with YAML tag disable_http2 and env var GCS_DISABLE_HTTP2.

pkg/storage/gcs.go — Three changes to the Connect() method:

  1. Gate the custom-transport code path on ForceHttp || DisableHttp2 || Debug (was ForceHttp || Debug).
  2. When DisableHttp2 is set, set MaxIdleConns=0 (no limit) and MaxIdleConnsPerHost=64, so that up to 64 connections to the GCS host stay pooled for reuse and each parallel part download runs on its own TCP connection.
  3. Split the round-tripper selection: ForceHttp wraps the transport in rewriteTransport (scheme downgrade to http://); DisableHttp2 uses the base transport directly (preserves https://).
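The first two changes can be sketched roughly as follows. This is not the code from pkg/storage/gcs.go: `gcsOptions`, `needsCustomTransport`, and `newBaseTransport` are hypothetical names, and cloning `http.DefaultTransport` is an assumption about how the base transport is obtained. The HTTP/2-related values (`ForceAttemptHTTP2=false`, `NextProtos=["http/1.1"]`) are the ones named in the commit message.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// gcsOptions is a hypothetical stand-in for the relevant GCSConfig fields.
type gcsOptions struct {
	ForceHttp    bool
	DisableHttp2 bool
	Debug        bool
}

// needsCustomTransport mirrors the gate from change 1: a custom transport is
// built when any of the three options is set (previously ForceHttp || Debug).
func needsCustomTransport(o gcsOptions) bool {
	return o.ForceHttp || o.DisableHttp2 || o.Debug
}

// newBaseTransport sketches change 2. Starting from a clone of
// http.DefaultTransport is an assumption made for this example.
func newBaseTransport(o gcsOptions) *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	if o.DisableHttp2 {
		// Do not attempt HTTP/2, and offer only HTTP/1.1 during ALPN.
		t.ForceAttemptHTTP2 = false
		t.TLSClientConfig = &tls.Config{NextProtos: []string{"http/1.1"}}
		// MaxIdleConns == 0 means no overall cap on idle connections.
		t.MaxIdleConns = 0
		// Keep up to 64 idle connections per host available for reuse.
		t.MaxIdleConnsPerHost = 64
	}
	return t
}

func main() {
	o := gcsOptions{DisableHttp2: true}
	if !needsCustomTransport(o) {
		fmt.Println("using the client library's default transport")
		return
	}
	t := newBaseTransport(o)
	fmt.Println(t.ForceAttemptHTTP2, t.TLSClientConfig.NextProtos, t.MaxIdleConnsPerHost)
}
```

Note that `MaxIdleConnsPerHost` bounds how many idle connections are kept for reuse, not how many can be open at once; with HTTP/1.1 each in-flight request already occupies its own connection.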

pkg/storage/gcs_test.go — Add TestRewriteTransportSchemeDowngrade with two subtests documenting the behavioral contract: ForceHttp must downgrade the scheme to http, while DisableHttp2 must preserve https.
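The contract that the test documents can be illustrated with a stub base round-tripper that records the scheme it receives and never touches the network. Everything here is a sketch under stated assumptions: this `rewriteTransport` is a stand-in, not the project's implementation, and `recordingTransport`, `selectRoundTripper`, `observedScheme`, and the object URL are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// recordingTransport is a stub base round-tripper: it records the scheme of
// the request it is handed and returns an empty 200 response.
type recordingTransport struct{ scheme string }

func (r *recordingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	r.scheme = req.URL.Scheme
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// rewriteTransport is a stand-in for the project's type of the same name:
// it downgrades https to http before delegating to the base transport.
type rewriteTransport struct{ base http.RoundTripper }

func (t rewriteTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.URL.Scheme == "https" {
		// Round-trippers should not mutate the caller's request, so clone it.
		clone := req.Clone(req.Context())
		clone.URL.Scheme = "http"
		req = clone
	}
	return t.base.RoundTrip(req)
}

// selectRoundTripper sketches the split: force_http wraps the base transport,
// while disable_http2 (forceHTTP == false here) uses it directly.
func selectRoundTripper(base http.RoundTripper, forceHTTP bool) http.RoundTripper {
	if forceHTTP {
		return rewriteTransport{base: base}
	}
	return base
}

// observedScheme sends one https request through the selected round-tripper
// and reports the scheme that reached the base transport.
func observedScheme(forceHTTP bool) (string, error) {
	stub := &recordingTransport{}
	// Placeholder object path; the stub means nothing is actually dialed.
	req, err := http.NewRequest(http.MethodGet, "https://storage.googleapis.com/bucket/object", nil)
	if err != nil {
		return "", err
	}
	resp, err := selectRoundTripper(stub, forceHTTP).RoundTrip(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return stub.scheme, nil
}

func main() {
	for _, forceHTTP := range []bool{true, false} {
		scheme, err := observedScheme(forceHTTP)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("force_http=%v scheme=%s\n", forceHTTP, scheme)
	}
}
```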

test/testflows/.../cli.py.cli.snapshot — Add disable_http2: false to the default config snapshot.

Behavioral difference from force_http

| | force_http | disable_http2 |
|---|---|---|
| HTTP version | HTTP/1.1 | HTTP/1.1 |
| URL scheme | http:// (cleartext) | https:// (TLS preserved) |
| Public GCS endpoint | 403 SSL Required | Works |
| HTTPS_PROXY | Broken | Works |
| Use case | Internal cache (Varnish) | Direct GCS / proxy |

Testing

  • Unit tests verify the round-tripper scheme behavior
  • Tested in production against GCS via HTTPS proxy with parallel backup downloads
  • Default config snapshot updated

With HTTP/2 enabled (the default for Go's http.Transport when TLS is
used), all concurrent part-download streams are multiplexed onto a
small number of TCP connections. HTTP/2's per-connection flow control
then caps the aggregate throughput at the backup tail — when only a
handful of large parts remain in flight, the window never fully opens,
and the link is under-utilised.

The existing force_http option works around this by downgrading the
scheme to cleartext http://, but that breaks the public GCS endpoint
(which returns 403 "SSL Required") and is only suitable for internal
reverse-proxy caches (e.g. varnish).

This commit adds a new GCS config option, disable_http2, which
suppresses HTTP/2 negotiation (ForceAttemptHTTP2=false,
NextProtos=["http/1.1"]) WITHOUT downgrading the request scheme. The
https:// scheme is preserved, so TLS and HTTPS_PROXY (CONNECT) continue
to work. Each parallel part-download gets its own dedicated TCP
connection (up to MaxIdleConnsPerHost=64), letting concurrent transfers
saturate the link.

The key design decision is in the round-tripper selection: force_http
wraps the base transport in rewriteTransport (which rewrites https to
http), while disable_http2 uses the base transport directly. This is
tested in TestRewriteTransportSchemeDowngrade.

Config: gcs.disable_http2 (bool, default false)
Env:    GCS_DISABLE_HTTP2


Development

Successfully merging this pull request may close these issues.

GCS: HTTP/2 multiplexing caps parallel download throughput at backup tail
