🔴 Required Information
Is your feature request related to a specific problem?
`GcsArtifactService.save_artifact()` raises `NotImplementedError` when the artifact is a `types.Part` with `file_data` set (a URI pointer to an existing file, e.g. `gs://my-bucket/report.pdf`). This means any user working with GCS-hosted files (PDFs, videos, large datasets) cannot save them as artifacts without re-uploading the content. `InMemoryArtifactService` already supports this case, so GCS is inconsistent with the rest of the service layer.
Describe the Solution You'd Like
In `_save_artifact`, replace the `raise NotImplementedError` block with handling for two sub-cases:
- Internal artifact references (`artifact://` URIs): validate the URI format using `artifact_util.parse_artifact_uri()`, then write a zero-byte blob with `file_uri` stored in blob metadata.
- External URIs (`gs://` or other): write a zero-byte blob with `file_uri` stored in blob metadata and the `mime_type` as `content_type`.
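To make the branch concrete, here is a minimal sketch of the two sub-cases. Note this is an illustration only: the real `artifact_util.is_artifact_ref()` operates on the artifact object and its exact signature is not shown here, so this stand-in works on the raw URI string instead.

```python
def is_artifact_ref_uri(uri: str) -> bool:
  """Stand-in for artifact_util.is_artifact_ref(); checks the URI scheme only."""
  return uri.startswith("artifact://")


def save_sub_case(uri: str) -> str:
  """Pick the _save_artifact sub-case for a file_data URI."""
  if is_artifact_ref_uri(uri):
    return "internal-reference"  # must also pass parse_artifact_uri() validation
  return "external-uri"          # gs:// or any other scheme: store pointer as-is
```

Either branch ends in the same write: a zero-byte blob whose metadata carries the pointer.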
In `_load_artifact`, before calling `download_as_bytes()`, check `blob.metadata.get("file_uri")` — if present, return:

```python
types.Part(file_data=types.FileData(file_uri=..., mime_type=blob.content_type))
```

This approach stores only a pointer (no data copy), consistent with how `InMemoryArtifactService` stores `file_data` parts as-is.
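The round-trip can be demonstrated end to end with a fake blob. `FakeBlob`, `save_pointer`, and `load_part` below are simplified stand-ins (not the real `google.cloud.storage.Blob` or the actual service methods); they only show that the save side writes zero bytes plus metadata, and the load side reconstructs the pointer without downloading anything.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeBlob:
  """Minimal stand-in for google.cloud.storage.Blob."""
  data: bytes = b""
  content_type: Optional[str] = None
  metadata: Optional[dict] = None

  def upload_from_string(self, data: bytes, content_type: Optional[str] = None):
    self.data = data
    self.content_type = content_type


def save_pointer(blob: FakeBlob, file_uri: str, mime_type: Optional[str]) -> None:
  """Save side: zero-byte blob, URI in metadata, mime_type as content_type."""
  blob.metadata = {**(blob.metadata or {}), "file_uri": file_uri}
  blob.upload_from_string(b"", content_type=mime_type)


def load_part(blob: FakeBlob):
  """Load side: if the blob is a pointer, return the URI instead of bytes."""
  if blob.metadata and "file_uri" in blob.metadata:
    return ("file_data", blob.metadata["file_uri"], blob.content_type or None)
  return ("inline_data", blob.data, blob.content_type)


blob = FakeBlob()
save_pointer(blob, "gs://my-bucket/report.pdf", "application/pdf")
assert load_part(blob) == ("file_data", "gs://my-bucket/report.pdf", "application/pdf")
assert blob.data == b""  # no content was copied into the artifact bucket
```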
Impact on your work
Users who already have files in GCS cannot use `GcsArtifactService` to register those files as artifacts. They are forced to either download and re-upload the content (defeating the purpose of GCS URIs) or fall back to `InMemoryArtifactService`, which does not persist across sessions.
Willingness to contribute
Yes, I am implementing this and will submit a PR.
🟡 Recommended Information
Describe Alternatives You've Considered
- Copying/rewriting the file bytes from the source GCS URI into the artifact bucket — rejected because it duplicates data and requires extra GCS permissions (cross-bucket reads).
- Rejecting `file_data` with a descriptive error — already done, but it doesn't solve the problem.
- Using `InMemoryArtifactService` — does not persist across sessions or deployments; not viable for production.
Proposed API / Implementation
```python
# _save_artifact — replace lines 232–236 in gcs_artifact_service.py
elif artifact.file_data:
  if not artifact.file_data.file_uri:
    raise InputValidationError("Artifact file_data must have a file_uri.")
  if artifact_util.is_artifact_ref(artifact):
    if not artifact_util.parse_artifact_uri(artifact.file_data.file_uri):
      raise InputValidationError(
          f"Invalid artifact reference URI: {artifact.file_data.file_uri}"
      )
  blob.metadata = {**(blob.metadata or {}), "file_uri": artifact.file_data.file_uri}
  if artifact.file_data.mime_type:
    blob.upload_from_string(b"", content_type=artifact.file_data.mime_type)
  else:
    blob.upload_from_string(b"")
```

```python
# _load_artifact — add before download_as_bytes()
if blob.metadata and "file_uri" in blob.metadata:
  return types.Part(
      file_data=types.FileData(
          file_uri=blob.metadata["file_uri"],
          mime_type=blob.content_type or None,
      )
  )
```
Additional Context
- `InMemoryArtifactService._save_artifact()` already handles this case (lines 129–138).
- `artifact_util.is_artifact_ref()` and `artifact_util.parse_artifact_uri()` are the existing helpers for URI validation.
- `_get_artifact_version_sync()` already constructs `gs://` canonical URIs (line 387), confirming that the pattern of storing URI metadata is established.
- New tests needed in `tests/unittests/artifacts/test_artifact_service.py`, following the existing `MockBlob` pattern.