-
Notifications
You must be signed in to change notification settings - Fork 79
fix for removing ggcr #545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix for removing ggcr #545
Conversation
…environment variables and log messages
….com/go/compute/metadata and golang.org/x/oauth2
Summary of ChangesHello @ilopezluna, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces crucial fixes and a new integration test to restore and ensure proper model pulling functionality from various OCI-compliant registries, particularly Docker Hub and Hugging Face, following the removal of the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 3 issues, and left some high level feedback:
- In
manifestFetcher.Fetch, the manifest URL now hardcodes"latest"instead of usingdesc.Digest.String(), which breaks content-addressable behavior and will fail for non-latesttags or digest-based pulls; this should continue to use the descriptor’s digest (or tag when appropriate) rather than a constant. - The new
TestIntegration_PullFromDockerHubrelies on the real Docker Hub and a specific public image, which may introduce flakiness or rate-limit issues in CI; consider adding a way to skip or gate this test in constrained environments (e.g., via a build tag or env flag).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `manifestFetcher.Fetch`, the manifest URL now hardcodes `"latest"` instead of using `desc.Digest.String()`, which breaks content-addressable behavior and will fail for non-`latest` tags or digest-based pulls; this should continue to use the descriptor’s digest (or tag when appropriate) rather than a constant.
- The new `TestIntegration_PullFromDockerHub` relies on the real Docker Hub and a specific public image, which may introduce flakiness or rate-limit issues in CI; consider adding a way to skip or gate this test in constrained environments (e.g., via a build tag or env flag).
## Individual Comments
### Comment 1
<location> `pkg/distribution/oci/remote/remote.go:383-387` </location>
<code_context>
+ // Docker Hub uses several aliases but the V2 API is served from registry-1.docker.io
+ registryHost := dockerHubAPIHost(registry.RegistryStr())
+
url := fmt.Sprintf("%s://%s/v2/%s/manifests/%s",
scheme,
- registry.RegistryStr(),
+ registryHost,
repo,
- desc.Digest.String())
+ "latest")
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, http.NoBody)
</code_context>
<issue_to_address>
**issue (bug_risk):** Using the hard-coded "latest" tag instead of the descriptor digest is likely incorrect and breaks content-addressable fetching.
This previously used `desc.Digest.String()` to align with the `/manifests/<digest>` behavior described above. Switching to the hard-coded `"latest"` ignores the requested descriptor and always fetches the latest tag, which can return the wrong manifest, cause cache inconsistencies, and make behavior non-deterministic. Unless there’s a strong requirement to use tags here, this should keep using `desc.Digest.String()` (or another identifier derived from `desc`) instead of a fixed tag.
</issue_to_address>
### Comment 2
<location> `cmd/cli/commands/integration_test.go:1188-1194` </location>
<code_context>
+ }
+}
+
+// TestIntegration_PullFromDockerHub is a smoke test that pulls a real model
+// from Docker Hub to verify that the OCI registry code works correctly
+// with the real Docker Hub registry (index.docker.io -> registry-1.docker.io).
+//
+// This test catches regressions where the code doesn't properly handle
+// Docker Hub's hostname remapping requirements.
+func TestIntegration_PullFromDockerHub(t *testing.T) {
+ env := setupDockerHubTestEnv(t)
+
</code_context>
<issue_to_address>
**suggestion (testing):** Consider gating this Docker Hub integration test behind an opt-in flag or build tag to avoid flakiness and external dependency issues in CI.
Because this test relies on external network access and Docker Hub availability (`ai/smollm2:135M-Q4_0` staying public, rate limits, outages), it can make CI flaky or unusable in restricted environments. Options:
- Skip by default unless an env var is set (e.g. `if os.Getenv("DMR_DOCKER_HUB_TEST") != "1" { t.Skip(...) }`).
- Guard it with a build tag like `//go:build integration` so it only runs in an integration pipeline.
This keeps the regression coverage while ensuring the default test suite remains stable and deterministic.
</issue_to_address>
### Comment 3
<location> `cmd/cli/commands/integration_test.go:1212-1216` </location>
<code_context>
+ err = pullModel(newPullCmd(), env.client, modelRef)
+ require.NoError(t, err, "Failed to pull model from Docker Hub: %s", modelRef)
+
+ // Verify the model was pulled
+ t.Log("Verifying model was pulled successfully")
+ models, err = listModels(false, env.client, true, false, "")
+ require.NoError(t, err)
+ require.NotEmpty(t, strings.TrimSpace(models), "Model should exist after pull from Docker Hub")
+
+ // Verify we can inspect the model
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the assertion to verify the pulled model reference matches the expected Docker Hub-normalized form, not just that some model exists.
Currently this only verifies that *some* model string is returned, so it would still pass if a different registry or tag were used. To actually validate the hostname remapping behavior, assert that the normalized reference you expect appears in the output, e.g.
```go
require.Contains(t, models, "docker.io/ai/smollm2:135M-Q4_0")
```
(adjusted to match what `normalizeRef` produces).
```suggestion
// Verify the model was pulled
t.Log("Verifying model was pulled successfully")
models, err = listModels(false, env.client, true, false, "")
require.NoError(t, err)
// Strengthen assertion: ensure the normalized Docker Hub reference is present
normalizedRef := "docker.io/" + modelRef
require.Contains(t, models, normalizedRef, "Expected pulled model %s to appear in model list output", normalizedRef)
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces important fixes for pulling models from Docker Hub and HuggingFace registries after the removal of ggcr. The addition of an integration test for Docker Hub is a valuable improvement to catch registry-specific issues. The refactoring of the test setup to accommodate different container configurations is also well-executed.
The changes in pkg/distribution/oci/remote/remote.go correctly implement hostname remapping for Docker Hub. However, I've identified a critical issue where the manifest fetch logic has been hardcoded to use the "latest" tag. This will break the functionality for pulling any model version that is not tagged as latest, as well as pulling by digest. My review includes a suggestion to correct this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - I've found 4 issues, and left some high level feedback:
- In
manifestFetcher.Fetch, the new HuggingFace logic (shouldUseManifestEndpoint) now routes any JSON-like descriptor through the/manifests/path and hardcodeslatest, which breaks digest-based fetches and non-manifest JSON layers; consider keeping the originalisManifestMediaTypegate and usingdesc.Digest.String()instead of the literal"latest". - The Docker Hub hostname remapping via
dockerHubAPIHostis only applied in the/manifests/code path; if there are other calls that still useregistry.RegistryStr()directly (e.g., blob fetches), ensure they also apply the same mapping if needed to avoid inconsistent behavior between manifest and blob requests.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `manifestFetcher.Fetch`, the new HuggingFace logic (`shouldUseManifestEndpoint`) now routes any JSON-like descriptor through the `/manifests/` path and hardcodes `latest`, which breaks digest-based fetches and non-manifest JSON layers; consider keeping the original `isManifestMediaType` gate and using `desc.Digest.String()` instead of the literal `"latest"`.
- The Docker Hub hostname remapping via `dockerHubAPIHost` is only applied in the `/manifests/` code path; if there are other calls that still use `registry.RegistryStr()` directly (e.g., blob fetches), ensure they also apply the same mapping if needed to avoid inconsistent behavior between manifest and blob requests.
## Individual Comments
### Comment 1
<location> `pkg/distribution/oci/remote/remote.go:361-364` </location>
<code_context>
+
+ // For HuggingFace, try /manifests/ first for any JSON-like content
+ // since they don't serve manifests via /blobs/ at all
+ shouldUseManifestEndpoint := isHF && (desc.MediaType == "application/json" || strings.Contains(desc.MediaType, "+json"))
+
+ // For non-manifest content on non-HF registries, use the underlying fetcher
+ if !shouldUseManifestEndpoint {
return f.underlying.Fetch(ctx, desc)
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Manifest fetches for non-HuggingFace registries no longer use the /manifests/ endpoint.
Previously, `Fetch` treated any `isManifestMediaType(desc.MediaType)` as a `/manifests/` request across all registries, which allowed non-HF registries that don’t expose manifests via `/blobs/` to work. Now only HF + JSON-like types use `/manifests/`, and all other manifest media types fall back to `/blobs/`, which appears to regress that behavior. Please keep the old `isManifestMediaType` behavior for non-HF registries and add the HF special-case on top, e.g.:
```go
shouldUseManifestEndpoint := (isHF && jsonLike) || (!isHF && isManifestMediaType(desc.MediaType))
```
</issue_to_address>
### Comment 2
<location> `pkg/distribution/oci/remote/remote.go:382-386` </location>
<code_context>
+ // Docker Hub uses several aliases but the V2 API is served from registry-1.docker.io
+ registryHost := dockerHubAPIHost(registry.RegistryStr())
+
url := fmt.Sprintf("%s://%s/v2/%s/manifests/%s",
scheme,
- registry.RegistryStr(),
+ registryHost,
repo,
- desc.Digest.String())
+ "latest")
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, http.NoBody)
</code_context>
<issue_to_address>
**issue (bug_risk):** Using "latest" instead of the descriptor digest breaks content-addressable fetching.
Previously this URL used `desc.Digest.String()`, guaranteeing we fetch the exact manifest referenced by the descriptor. Hardcoding `"latest"` means:
- We may fetch a different manifest than the descriptor identifies.
- Digest-based or non-`latest` references won’t work correctly.
- The `Fetch(ctx, desc)` contract of being content-addressable is broken.
Unless this function is intentionally tag-based and always meant to resolve `latest`, it should derive the reference from `desc`/`f.ref` instead of hardcoding a tag.
</issue_to_address>
### Comment 3
<location> `pkg/distribution/oci/remote/remote.go:347-348` </location>
<code_context>
+
+// isHuggingFaceRegistry returns true if the host is a HuggingFace registry.
+// HuggingFace doesn't serve manifests via /blobs/ endpoint, only via /manifests/.
+func isHuggingFaceRegistry(host string) bool {
+ return strings.Contains(host, "huggingface.co") || strings.Contains(host, "hf.co")
+}
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** HuggingFace host detection via substring matching may overmatch unintended hosts.
Because this uses `strings.Contains`, a host like `myhuggingface.co.evil.com` or `foo-hf.co.example.org` would be treated as HuggingFace, which could incorrectly send requests through the HuggingFace-specific path. Consider matching only exact hosts and valid subdomains instead, e.g. `host == "huggingface.co" || strings.HasSuffix(host, ".huggingface.co")` (and similarly for `hf.co`).
Suggested implementation:
```golang
// isHuggingFaceRegistry returns true if the host is a HuggingFace registry.
// HuggingFace doesn't serve manifests via /blobs/ endpoint, only via /manifests/.
func isHuggingFaceRegistry(host string) bool {
if host == "huggingface.co" || strings.HasSuffix(host, ".huggingface.co") {
return true
}
if host == "hf.co" || strings.HasSuffix(host, ".hf.co") {
return true
}
return false
}
// dockerHubAPIHost remaps Docker Hub registry hostnames to the actual API endpoint.
// Docker Hub uses several aliases (docker.io, index.docker.io) but the actual
// V2 API is served from registry-1.docker.io.
func dockerHubAPIHost(host string) string {
switch host {
case "docker.io", "index.docker.io":
return "registry-1.docker.io"
default:
return host
}
}
```
`).
Here are the code changes:
<file_operations>
<file_operation operation="edit" file_path="pkg/distribution/oci/remote/remote.go">
<<<<<<< SEARCH
// isHuggingFaceRegistry returns true if the host is a HuggingFace registry.
// HuggingFace doesn't serve manifests via /blobs/ endpoint, only via /manifests/.
// dockerHubAPIHost remaps Docker Hub registry hostnames to the actual API endpoint.
// Docker Hub uses several aliases (docker.io, index.docker.io) but the actual
// V2 API is served from registry-1.docker.io.
func dockerHubAPIHost(host string) string {
switch host {
case "docker.io", "index.docker.io":
return "registry-1.docker.io"
default:
return host
}
}
=======
// isHuggingFaceRegistry returns true if the host is a HuggingFace registry.
// HuggingFace doesn't serve manifests via /blobs/ endpoint, only via /manifests/.
func isHuggingFaceRegistry(host string) bool {
if host == "huggingface.co" || strings.HasSuffix(host, ".huggingface.co") {
return true
}
if host == "hf.co" || strings.HasSuffix(host, ".hf.co") {
return true
}
return false
}
// dockerHubAPIHost remaps Docker Hub registry hostnames to the actual API endpoint.
// Docker Hub uses several aliases (docker.io, index.docker.io) but the actual
// V2 API is served from registry-1.docker.io.
func dockerHubAPIHost(host string) string {
switch host {
case "docker.io", "index.docker.io":
return "registry-1.docker.io"
default:
return host
}
}
>>>>>>> REPLACE
</file_operation>
</file_operations>
<additional_changes>
1. Ensure this file imports the `strings` package:
```go
import (
// existing imports...
"strings"
)
```
If `strings` is already imported, no further change is needed.
</issue_to_address>
### Comment 4
<location> `pkg/distribution/oci/remote/remote.go:363-364` </location>
<code_context>
+ // since they don't serve manifests via /blobs/ at all
+ shouldUseManifestEndpoint := isHF && (desc.MediaType == "application/json" || strings.Contains(desc.MediaType, "+json"))
+
+ // For non-manifest content on non-HF registries, use the underlying fetcher
+ if !shouldUseManifestEndpoint {
return f.underlying.Fetch(ctx, desc)
}
</code_context>
<issue_to_address>
**nitpick:** The comment describing the conditional no longer matches the implemented logic.
The current wording implies this only applies to non-manifest content on non-HF registries, but `!shouldUseManifestEndpoint` also covers:
- All requests to non-HF registries (including manifest content), and
- HF registries where the media type is not JSON-like.
Please update either the comment or the condition so they describe the same behavior for when `/manifests/` vs `/blobs/` is used.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
…ic for HuggingFace registry
* Replace go-containerregistry with containerd/moby This commit removes the vendored go-containerregistry package and replaces it with containerd and moby packages for OCI registry operations. Signed-off-by: Eric Curtin <eric.curtin@docker.com> * fix for removing ggcr (#545) * Add Docker Hub integration tests and hostname remapping for API requests * Refactor Docker Model Runner setup to use a configuration struct for environment variables and log messages * Update go.mod dependencies to include latest versions of cloud.google.com/go/compute/metadata and golang.org/x/oauth2 * Add support for HuggingFace registry in manifest fetching * Refactor manifest endpoint logic to apply it only on HF repos * Remove Docker Hub hostname remapping and update manifest fetching logic for HuggingFace registry * Refactor manifest endpoint check for HuggingFace to use dedicated media type function --------- Signed-off-by: Eric Curtin <eric.curtin@docker.com> Co-authored-by: Ignasi <ignasi.lopez.luna@gmail.com>
These PR includes 3 changes:
The error was not catch because the local test registry (registry:3) doesn't need hostname remapping -
registry.local:5000works directly. However Docker Hub's actual V2 API is served fromregistry-1.docker.ioso we need this remapping (which was automatically handled by ggcr)./manifests/endpoint instead of/blob/(which is not implemented in HF):For example if we try to get the manifest of qwen3 via /blobs/ endpoint:
We get a 404:
Via
/manifests/works well:And any layer via blob also works: