Ship static busybox shell in gpu-operator image #2434
Draft
rajathagasthya wants to merge 1 commit into
19fd65d to 14f5202
This was referenced May 6, 2026
14f5202 to 20e9691
20e9691 to 9e3efb2
9e3efb2 to 448be34
rajathagasthya added a commit to rajathagasthya/mig-parted that referenced this pull request May 19, 2026
The gpu-operator mounts a ConfigMap-backed `entrypoint.sh` into the nvidia-mig-manager container today: it waits for the driver-ready file, sources it as KEY=value env, derives `WITH_SHUTDOWN_HOST_GPU_CLIENTS=$IS_HOST_DRIVER`, and execs `nvidia-mig-manager`. That script requires a shell in the container image, which is currently provided by the `-dev` distroless variant via a busybox `/bin/sh` symlink. NVIDIA STIG policy is dropping `-dev` distroless as approved parent images, so the shell has to go, and that means the entrypoint logic has to live in the binary.

Move startup hooks into `nvidia-mig-manager` itself. A new `internal/startup` package provides `WaitForFile` (polls `os.Stat`) and `SourceEnvFile` (parses `KEY=value` lines with quote and comment handling, calls `os.Setenv`). `main()` runs the hooks before `cli.App.Run` parses flags, so any env vars sourced from `driver-ready` are visible to the `EnvVars:` declarations on each cli.Flag.

The hooks are opt-in via env vars:

- `WAIT_FOR_DRIVER_READY=<path>`: block on the file's existence
- `DRIVER_ENV_FILE=<path>`: source KEY=value into the process env
- `WAIT_FOR_FILE_INTERVAL=<duration>`: poll interval, default 5s

After sourcing, `IS_HOST_DRIVER` is mirrored into `WITH_SHUTDOWN_HOST_GPU_CLIENTS` for backward compatibility with the existing shell behavior. The cli flag picks up the env var as usual.

Drop the `SHELL ["/busybox/sh", "-c"]` directive and the `RUN ln -s /busybox/sh /bin/sh && rm -r /var/run && ln -s /run /var/run` step from the Dockerfile, and flip the base from `distroless/go:v4.0.5-dev` to `v4.0.5`. The `/var/run` -> `/run` symlink is provided by the non-`-dev` distroless base.

Companion to NVIDIA/gpu-operator#2434, which removes the `nvidia-mig-manager-entrypoint` ConfigMap and updates the state-mig-manager DaemonSet to invoke `nvidia-mig-manager` directly with the new env vars set on the container spec.

Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
acd7fa9 to 8d75aec
Flip the base from *-dev* to non-*-dev* distroless and source a static busybox from debian:trixie-slim. Init container wrappers, lifecycle hooks, and helper scripts continue to work via /bin/sh and busybox applet symlinks layered into the final image.

Part of NVIDIA/cloud-native-team#299.

Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
8d75aec to f6ed616
Flip the base from *-dev* to non-*-dev* distroless and source a static busybox from `debian:trixie-slim`. Init container wrappers, lifecycle hooks, and helper scripts continue to work via `/bin/sh` and busybox applet symlinks layered into the final image.

Part of NVIDIA/cloud-native-team#299.
Resolves #2435.