diff --git a/confidential-containers/attestation.rst b/confidential-containers/attestation.rst index 5538f7e0e..6c0bfa1be 100644 --- a/confidential-containers/attestation.rst +++ b/confidential-containers/attestation.rst @@ -17,12 +17,18 @@ .. headings # #, * *, =, -, ^, " -.. _attestation-overview: - *********** Attestation *********** +As a :ref:`Security Engineer `, use this page to learn about attestation and stand up a local attestation backend for testing. + +.. note:: + + Attestation is not required to configure your cluster to deploy Confidential Containers workloads. + This page gives an overview of attestation and a quickstart for standing up a local attestation backend for testing. + Refer to the upstream `Confidential Containers documentation `__ for more details on attestation and production deployment. + In Confidential Containers, a Trusted Execution Environment (TEE) isolates a workload from the host. Attestation is a process that cryptographically proves the state of the guest TEE, including both the CPU and the GPU, to a remote verifier before any secret or sensitive resource is released to the workload. Attestation is required for any feature that depends on secrets, including: @@ -32,8 +38,7 @@ Attestation is required for any feature that depends on secrets, including: * Using sealed secrets * Requesting secrets directly from workloads -When a workload requires a secret, such as a key to decrypt a container image or model, guest components collect hardware evidence from the active CPU and GPU enclaves. -The evidence is sent to the remote verifier to evaluate the evidence against known-good reference values and configured policies, and conditionally releases the requested resource. +Configure attestation when a workload requires a secret, such as a key to decrypt a container image or model. In that case, guest components collect hardware evidence from the active CPU and GPU enclaves, and the remote verifier evaluates that evidence before it releases the requested resource. 
Key Concepts ============ @@ -52,7 +57,7 @@ The following concepts appear throughout this page: * KBS resource: A secret, for example, a key, credential, or token, that Trustee releases to a guest when attestation succeeds. Most resources are addressed by a three-part path: ``//``. * Policy: The rule set that Trustee evaluates against verified evidence to decide whether to release a resource. By default, Trustee denies resource requests from clients that have not presented valid TEE evidence. -Refer to the upstream `Confidential Containers documentation `_ for more details on these concepts and attestation best practices. +Refer to the upstream `Confidential Containers documentation `__ for more details on these concepts and attestation best practices. Quickstart ========== @@ -74,8 +79,8 @@ The goal is to give you a working local attestation backend and a client to inte To run attestation against real evidence from a confidential workload, refer to the upstream `Attestation `_ and `Features `_ documentation for more information. -What You'll Build ------------------ +What You Will Build +------------------- By the end of this quickstart, you will have: @@ -83,7 +88,7 @@ By the end of this quickstart, you will have: * The ``kbs-client`` command-line tool installed and able to reach your Trustee instance. * A sample resource request that exercises the end-to-end request path. -You'll know you're done when ``kbs-client`` can send a request to KBS and receive a response from the Trustee policy engine, even if that response is a policy denial. +You will know you are done when ``kbs-client`` can send a request to KBS and receive a response from the Trustee policy engine, even if that response is a policy denial. A denial in this quickstart is the expected, successful outcome: it confirms that the client reached KBS, the Attestation Service evaluated the request, and policy was applied. 
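The three-part KBS resource path described under Key Concepts can be checked on the client side before a request is sent. The following is a minimal Python sketch, not part of Trustee or ``kbs-client``: the ``split_resource_path`` helper and any sample path you pass to it are hypothetical, and the sketch assumes only that a path consists of exactly three non-empty, slash-separated segments.

.. code-block:: python

   def split_resource_path(path: str) -> tuple[str, str, str]:
       """Split a KBS-style resource path into its three segments.

       Hypothetical helper for illustration only. Raises ValueError when
       the path does not have exactly three non-empty segments.
       """
       # Ignore leading and trailing slashes, then split on the separator.
       segments = path.strip("/").split("/")
       if len(segments) != 3 or any(not segment for segment in segments):
           raise ValueError(f"expected a three-part path, got {path!r}")
       first, second, third = segments
       return first, second, third

A check like this only catches malformed paths early. It does not replace the policy decision, which Trustee makes after it evaluates the evidence.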
@@ -103,6 +108,7 @@ Prerequisites Step 1: Install Trustee with Docker Compose ------------------------------------------- +Installing Trustee with Docker Compose is the recommended install path. Clone the upstream Trustee repository. The repository ships with a ``docker-compose.yml`` that wires KBS, the Attestation Service, and the Reference Value Provider Service together. @@ -129,7 +135,7 @@ Start the Trustee containers in the background. .. note:: On first run, ``docker compose up -d`` pulls the KBS, AS, and RVPS images before starting them. - This step can take several minutes. The command returns once the containers are starting. The services may need an additional few seconds to become ready to accept requests. + This step can take several minutes. The command returns after the containers start. The services may need an additional few seconds to become ready to accept requests. For details on optional configuration such as the admin keypair, debug logging, and per-service config files, refer to the upstream `Install Trustee in Docker `_ guide. @@ -249,10 +255,10 @@ Next Steps You now have a working local Trustee instance and a client that can talk to it. For more details, refer to the upstream Confidential Containers documentation: -* `Attestation `_ — Trustee architecture, configuration, resources, policies, the client tool, and guidance for production deployment topology, network configuration, and hardening. -* `Features `_ — the complete set of Confidential Containers features, including how to wire attestation into real workloads. +* `Attestation `_: Trustee architecture, configuration, resources, policies, the client tool, and guidance for production deployment topology, network configuration, and hardening. +* `Features `_: the complete set of Confidential Containers features, including how to wire attestation into real workloads. 
-To shut down the local Trustee instance when you're finished, run the following command from the ``trustee`` repository directory: +To shut down the local Trustee instance when you are finished, run the following command from the ``trustee`` repository directory: .. code-block:: console diff --git a/confidential-containers/confidential-containers-deploy.rst b/confidential-containers/confidential-containers-deploy.rst index 872013e1c..1c3d7496c 100644 --- a/confidential-containers/confidential-containers-deploy.rst +++ b/confidential-containers/confidential-containers-deploy.rst @@ -19,226 +19,57 @@ .. _confidential-containers-deploy: -****************************** -Deploy Confidential Containers -****************************** +###################### +Detailed Install Guide +###################### -This page describes deploying Kata Containers and the NVIDIA GPU Operator. -These are key pieces of the NVIDIA Confidential Containers Reference Architecture used to manage GPU resources on your cluster and deploy workloads into Confidential Containers. +This page lists the steps for a :ref:`Kubernetes Cluster Administrator ` to deploy Kata Containers and the NVIDIA GPU Operator to your cluster and configure it for Confidential Containers. +For persona responsibilities and documentation structure, refer to :doc:`Personas `. -Before you begin, refer to the :doc:`Confidential Containers Reference Architecture ` for details on the reference architecture and the :doc:`Supported Platforms ` page for the supported platforms. +If you want the fastest path and intend to run Confidential Containers on every node in your cluster, use the :doc:`Quickstart Install ` instead. +Use this guide when you need per-node control, such as running Confidential Containers on some nodes and traditional GPU workloads on others, or when you want additional configuration options. 
-This guide is for Kubernetes cluster administrators with host access to worker nodes (for BIOS and kernel configuration) and cluster-admin access to use ``kubectl``. -It assumes you are familiar with the NVIDIA GPU Operator, Kata Containers, Helm, and Kubernetes cluster administration, and that you know whether your target hardware uses AMD SEV-SNP or Intel TDX. -Refer to the :doc:`NVIDIA GPU Operator ` and `Kata Containers `_ documentation for more information on these software components. -Refer to the `Kubernetes documentation `_ for more information on Kubernetes cluster administration. +.. _overview: +**************** +Install Overview +**************** -Overview -======== +This guide assumes you completed :doc:`Prerequisites ` on an existing Kubernetes cluster with GPU worker nodes. -The high-level workflow for configuring Confidential Containers is as follows: +Install workflow: -#. Configure the :ref:`Prerequisites `. +#. :doc:`Prerequisites `: prepare worker hosts and cluster software. +#. :ref:`Label nodes to deploy Confidential Containers components `: select GPU workers for Confidential Containers workloads. +#. :ref:`Install Kata Containers `: install runtime classes and node-level Kata components. +#. :ref:`Install the NVIDIA GPU Operator `: deploy Confidential Containers operands on target nodes. +#. :doc:`Run a Sample Workload `: confirm the deployment end to end. -#. :ref:`Label Nodes ` that you want to use with Confidential Containers. +**Success criteria:** Helm releases report ``STATUS: deployed``, the ``kata-deploy`` pod is ``Running``, SNP and TDX runtime classes are available, GPU Operator operands are healthy on target nodes, and the sample workload logs include ``Test PASSED``. -#. Install the :ref:`latest Kata Containers Helm chart `. 
- This installs the Kata Containers runtime binaries, UVM images and kernels, and TEE-specific shims (such as ``kata-qemu-nvidia-gpu-snp`` for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems) onto the cluster's worker nodes. +When you finish this page, nodes are labeled for Confidential Container component deployment, Kata runtime classes are available, and GPU Operator operands are running on those nodes. +Continue to :doc:`Run a Sample Workload ` if you have not run it yet. -#. Install the :ref:`NVIDIA GPU Operator configured for Confidential Containers `. - This installs the NVIDIA GPU Operator components that are required to deploy GPU passthrough workloads. - The GPU Operator uses the node labels to determine what software components to deploy to a node. - -After installation, you can :ref:`run a sample GPU workload ` in a confidential container. -The sample CUDA workload at the end of this guide returns ``Test PASSED`` when your cluster is correctly configured. - -When you complete the steps in this guide, your cluster has the following: - -* One or more worker nodes labeled with ``nvidia.com/gpu.workload.config=vm-passthrough`` and ``nvidia.com/cc.ready.state=true``. -* The ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu-tdx`` runtime classes installed on the cluster. -* GPU Operator pods, including the Confidential Computing Manager, Kata Sandbox Device Plugin, and VFIO Manager, running on labeled nodes. - -After this baseline is in place, you can schedule workloads that request GPU resources and use the ``kata-qemu-nvidia-gpu-snp`` runtime class for AMD-based systems or the ``kata-qemu-nvidia-gpu-tdx`` runtime class for Intel-based systems. -To verify the environment and release secrets to those workloads, configure :doc:`Attestation ` with the Trustee framework. -The Trustee attestation service is typically deployed on a separate, trusted environment. - -.. 
_coco-prerequisites: - -Prerequisites -============= - -Hardware and BIOS ------------------ - -* Use a supported platform configured for Confidential Computing. - For more information on machine setup, refer to :doc:`Supported Platforms `. - -* Ensure hosts are configured to enable hardware virtualization and Access Control Services (ACS). With some AMD CPUs and BIOSes, ACS might be grouped under Advanced Error Reporting (AER). Enable these features in the host BIOS. - -* Configure hosts to support IOMMU. - You can check if your host is configured for IOMMU by running the following command: - - .. code-block:: console - - $ ls /sys/kernel/iommu_groups - - If the output of this command includes 0, 1, and so on, then your host is configured for IOMMU. - - If the host is not configured or if you are unsure, add the appropriate IOMMU kernel command-line argument to the ``/etc/default/grub`` file: ``amd_iommu=on`` for AMD CPUs or ``intel_iommu=on`` for Intel CPUs. - - .. tab-set:: - - .. tab-item:: AMD-based system (SNP) - :sync: amd-snp - - .. code-block:: console - - ... - GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on modprobe.blacklist=nouveau" - ... - - .. tab-item:: Intel-based system (TDX) - :sync: intel-tdx - - .. code-block:: console - - ... - GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on modprobe.blacklist=nouveau" - ... - - After making the change, configure the bootloader. - - .. code-block:: console - - $ sudo update-grub - - *Example Output:* - - .. code-block:: output - - Sourcing file `/etc/default/grub' - Generating grub configuration file ... - Found linux image: /boot/vmlinuz-5.15.0-generic - Found initrd image: /boot/initrd.img-5.15.0-generic - done - - Reboot the host after configuring the bootloader. - - .. note:: - - After configuring IOMMU, you might see QEMU warnings about PCI P2P DMA when running GPU workloads. - These are expected and can be safely ignored. - Refer to :ref:`coco-limitations` for details. 
- -* Ensure that no NVIDIA GPU drivers are installed on the host. - Confidential Containers uses VFIO to pass GPUs directly to the confidential VM, and host-level GPU drivers interfere with VFIO device binding. - - To check if NVIDIA GPU drivers are installed, run the following command: - - .. code-block:: console - - $ lsmod | grep nvidia - - If the command produces no output, no NVIDIA GPU drivers are loaded and you can continue to the next step. - - Refer to `Removing the Driver `_ in the NVIDIA Driver Installation Guide to remove the drivers. - -Kubernetes Cluster ------------------- - -* A Kubernetes cluster with cluster administrator privileges. - Refer to the :ref:`Supported Software Components ` table for supported Kubernetes versions. - -* containerd version 2.2.2 installed. - Refer to the `containerd Getting Started guide `_ for installation instructions. - - To verify the installed version, run the following command: - - .. code-block:: console - - $ containerd --version - - *Example Output:* - - .. code-block:: output - - containerd containerd.io 2.2.2 ... - -* Helm installed. - Use the command below to install Helm or refer to the `Helm documentation `_ for installation instructions. - - .. code-block:: console - - $ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 \ - && chmod 700 get_helm.sh \ - && ./get_helm.sh - - -* Enable the ``KubeletPodResourcesGet`` and ``RuntimeClassInImageCriApi`` Kubelet feature gates on your cluster. - On Kubernetes v1.34 and later, ``KubeletPodResourcesGet`` is already enabled by default and only ``RuntimeClassInImageCriApi`` requires explicit configuration. - On earlier Kubernetes versions, enable both gates. - - * ``KubeletPodResourcesGet``: Used by the Kata runtime to query the Kubelet Pod Resources API and discover allocated GPU devices during sandbox creation. - - * ``RuntimeClassInImageCriApi``: Alpha since Kubernetes v1.29 and not enabled by default. 
- Required to support pod deployments that use multiple snapshotters side-by-side. - - Add both feature gates to your Kubelet configuration (typically ``/var/lib/kubelet/config.yaml``): - - .. code-block:: yaml - - apiVersion: kubelet.config.k8s.io/v1beta1 - kind: KubeletConfiguration - featureGates: - KubeletPodResourcesGet: true - RuntimeClassInImageCriApi: true - - If your ``config.yaml`` already has a ``featureGates`` section, add the gates to the existing section rather than creating a duplicate. - - Restart the Kubelet service to apply the changes: - - .. code-block:: console - - $ sudo systemctl restart kubelet - -.. _configure-image-pull-timeouts: - -* Increase the kubelet image pull timeout to a value that comfortably covers your largest workload image. - Kubelet de-allocates a pod if the image pull exceeds the configured timeout before the container transitions to the running state. - Actual pull duration varies with image size and network throughput, so this guide uses ``20m`` as a conservative ceiling that accommodates most workload images. - - Set ``runtimeRequestTimeout`` in your `kubelet configuration `_ to ``20m`` to align with the default Kata shim ``image_pull_timeout`` of 1200 seconds. - The kubelet default is 2 minutes, which can be too short for GPU workloads. - - Add or update the ``runtimeRequestTimeout`` field in your kubelet configuration (typically ``/var/lib/kubelet/config.yaml``): - - .. code-block:: yaml - :emphasize-lines: 3 - - apiVersion: kubelet.config.k8s.io/v1beta1 - kind: KubeletConfiguration - runtimeRequestTimeout: 20m - - Restart the kubelet service to apply the change: +.. _installation-and-configuration: - .. code-block:: console +.. 
_coco-label-nodes: - $ sudo systemctl restart kubelet +************************************************** +Label Nodes for Confidential Containers Components +************************************************** - If you need a timeout of more than 1200 seconds (20 minutes), also adjust the Kata Agent's ``image_pull_timeout``. - This setting also controls the Confidential Data Hub's image pull API timeout in seconds. - To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value in a pod annotation in the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` annotation. +The GPU Operator reads labels to determine what software components to deploy to a node. +To configure a node for Confidential Container workloads, you label the node with the ``nvidia.com/gpu.workload.config=vm-passthrough`` label. +Then, when the GPU Operator is installed in a subsequent step, it will deploy the software components needed to run Confidential Containers to the node. -.. _installation-and-configuration: +A node can only run one container runtime at a time, so a node configured for Confidential Container workloads cannot run traditional GPU container workloads. +The labeling approach is useful if you want to run Confidential Containers workloads on some nodes and traditional GPU container workloads on other nodes in your cluster. -Installation -============ +For more details on how the GPU Operator deploys components to your cluster, refer to the :ref:`GPU Operator Cluster Topology Considerations ` section in the architecture overview. -.. _coco-label-nodes: +.. tip:: -Label Nodes ------------ + Skip this section if you plan to use all nodes in your cluster to run Confidential Containers and instead set ``sandboxWorkloads.defaultWorkload=vm-passthrough`` when installing the GPU Operator. #. 
Get a list of the nodes in your cluster: @@ -254,6 +85,8 @@ Label Nodes node-01 Ready 10d v1.34.0 node-02 Ready 10d v1.34.0 + Identify the GPU worker node or nodes you want to configure for Confidential Containers and use its name in the next step. + #. Set the ``NODE_NAME`` environment variable to the name of the node you want to configure: .. code-block:: console @@ -270,16 +103,18 @@ Label Nodes $ kubectl label node $NODE_NAME nvidia.com/gpu.workload.config=vm-passthrough - The GPU Operator uses this label to determine what software components to deploy to a node. - The ``nvidia.com/gpu.workload.config=vm-passthrough`` label specifies that the node should receive the software components to run Confidential Containers. + *Example Output:* + + .. code-block:: output - A node can only run one container runtime at a time, so a labeled node runs only Confidential Container workloads and cannot run traditional GPU container workloads. - The labeling approach is useful if you want to run Confidential Containers workloads on some nodes and traditional GPU container workloads on other nodes in your cluster. - For more details on how the GPU Operator deploys components to your cluster, refer to the :ref:`GPU Operator Cluster Topology Considerations ` section in the architecture overview. + node/ labeled - .. tip:: + The ``node/ labeled`` message confirms the label was applied. + + .. note:: - Skip this section if you plan to use all nodes in your cluster to run Confidential Containers and instead set ``sandboxWorkloads.defaultWorkload=vm-passthrough`` when installing the GPU Operator. + If the command prints `` not labeled``, the label may already be set. + Continue to the next step to verify the label was added. #. Verify the node label was added: @@ -293,12 +128,17 @@ Label Nodes nvidia.com/gpu.workload.config: vm-passthrough -After labeling the node, you can continue to the next steps to install Kata Containers and the NVIDIA GPU Operator. 
+**Success criteria:** All nodes you intend to use for Confidential Container workloads have the ``nvidia.com/gpu.workload.config: vm-passthrough`` label. +Labeling a node signals the GPU Operator to deploy the software components needed to run Confidential Containers to that node and configures the node to run only a Confidential Containers runtime. + +After all your desired nodes are labeled, you can continue to the next step to install Kata Containers. + .. _coco-install-kata-chart: +************************************** Install the Kata Containers Helm Chart --------------------------------------- +************************************** Install Kata Containers using the ``kata-deploy`` Helm chart. The ``kata-deploy`` chart installs all required components from the Kata Containers project including the Kata Containers runtime binary, runtime configuration, UVM kernel, and images that NVIDIA uses for Confidential Containers and native Kata containers. @@ -323,10 +163,28 @@ The minimum required version is 3.29.0. --wait --timeout 10m \ --version "${VERSION}" - *Example Output:* + *Example Output immediately after running the command:* .. code-block:: output + Pulled: ghcr.io/kata-containers/kata-deploy-charts/kata-deploy:3.29.0 + Digest: sha256:aea41018779716ce2e0bf406d701637d10fb5a0792db51a08dfd3f76701eb933 + + The ``--wait`` flag in the install command instructs Helm to wait until the release is deployed before returning. + It can take 2-3 minutes to return more output. + + .. note:: + + There is a `known Helm issue `_ on single-node clusters that can cause the Helm command to finish before all deployed pods finish initializing. + If you are deploying to a single-node cluster, you may need to wait an additional few minutes after the Helm command completes for the ``kata-deploy`` pod to reach the ``Running`` state. 
+ + + *Example Output when the release is deployed:* + + .. code-block:: output + + Pulled: ghcr.io/kata-containers/kata-deploy-charts/kata-deploy:3.29.0 + Digest: sha256:aea41018779716ce2e0bf406d701637d10fb5a0792db51a08dfd3f76701eb933 LAST DEPLOYED: Wed Apr 1 17:03:00 2026 NAMESPACE: kata-system STATUS: deployed @@ -334,13 +192,8 @@ The minimum required version is 3.29.0. DESCRIPTION: Install complete TEST SUITE: None - .. note:: - - The ``--wait`` flag in the install command instructs Helm to wait until the release is deployed before returning. - It can take a 2-3 minutes to return output. - - There is a `known Helm issue `_ on single node clusters, that may result in the Helm command finishing before all deployed pods are finished initializing. - If you are deploying to a single node cluster, you may need to wait for an additional few minutes after the Helm command completes for the ``kata-deploy`` pod to be in the Running state. + ``STATUS: deployed`` confirms the Helm release succeeded and the chart resources were applied. + This does not yet confirm the Kata components are healthy, so continue to the verification steps below before you install the GPU Operator. .. note:: @@ -349,7 +202,7 @@ The minimum required version is 3.29.0. The GPU Operator will deploy and manage NFD in the next step. -#. Optional: Verify that the ``kata-deploy`` pod is running: +#. Verify that the ``kata-deploy`` pod is running: .. code-block:: console @@ -359,10 +212,16 @@ The minimum required version is 3.29.0. .. code-block:: output - NAME READY STATUS RESTARTS AGE kata-deploy-b2lzs 1/1 Running 0 6m37s -#. Optional: Verify that the ``kata-qemu-nvidia-gpu``, ``kata-qemu-nvidia-gpu-snp``, and ``kata-qemu-nvidia-gpu-tdx`` runtime classes are available: + A ``READY`` value of ``1/1`` and a ``STATUS`` of ``Running`` mean the ``kata-deploy`` pod installed the Kata components on the node successfully. 
+ If the pod is ``Pending``, ``ContainerCreating``, or ``CrashLoopBackOff``, wait a minute and re-run the command. + If it does not reach ``Running``, refer to the log steps below. + +#. Verify that the ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu-tdx`` runtime classes are available: + + After ``helm install`` completes with ``STATUS: deployed``, the ``kata-deploy`` chart has created the Kata ``RuntimeClass`` resources on the cluster. + This check is required before you continue to :ref:`Install the NVIDIA GPU Operator `. .. code-block:: console @@ -378,22 +237,57 @@ The minimum required version is 3.29.0. kata-qemu-nvidia-gpu-tdx kata-qemu-nvidia-gpu-tdx 40s Several runtimes are installed by the ``kata-deploy`` chart. - The ``kata-qemu-nvidia-gpu`` runtime class is used with Kata Containers, in a non-Confidential Containers scenario. - The ``kata-qemu-nvidia-gpu-snp`` for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems runtime classes are used to deploy Confidential Containers workloads. + The ``kata-qemu-nvidia-gpu`` runtime class is used with Kata + Containers in a non-Confidential Containers scenario. + The ``kata-qemu-nvidia-gpu-snp`` (AMD-based systems) and + ``kata-qemu-nvidia-gpu-tdx`` (Intel-based systems) runtime + classes are used to deploy Confidential Containers workloads. -#. Optional: If you have an issue deploying the ``kata-deploy`` pod or are not seeing the expected runtime classes, get the pod name and view the logs: + If the SNP or TDX runtime classes are not listed, the install did not complete correctly. + On a single-node cluster, retry after a few minutes only if Helm returned before the ``kata-deploy`` pod reached ``Running`` (refer to the note above). + Otherwise, refer to the log steps below. 
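The runtime class check above can also be scripted. The following is a minimal Python sketch under stated assumptions: ``missing_runtime_classes`` is a hypothetical helper, and it assumes you have captured the text printed by ``kubectl get runtimeclass``, where the first column of each row is the runtime class name and the header row starts with ``NAME``.

.. code-block:: python

   # Runtime classes this guide requires for Confidential Containers workloads.
   REQUIRED = ("kata-qemu-nvidia-gpu-snp", "kata-qemu-nvidia-gpu-tdx")


   def missing_runtime_classes(listing: str, required=REQUIRED) -> list[str]:
       """Return the required runtime classes absent from a kubectl listing.

       Hypothetical helper: ``listing`` is the captured text output of
       ``kubectl get runtimeclass``.
       """
       names = set()
       for line in listing.splitlines():
           fields = line.split()
           # Skip blank lines and the header row.
           if fields and fields[0] != "NAME":
               names.add(fields[0])
       return [name for name in required if name not in names]

An empty list means both runtime classes are present. A non-empty list names the classes to investigate with the log steps.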
+ +**Success criteria:** Helm reports ``STATUS: deployed``, the ``kata-deploy`` pod is ``Running``, and both ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu-tdx`` are available on the cluster. +After all checks pass, continue to :ref:`Install the NVIDIA GPU Operator `. + +If you have an issue deploying the ``kata-deploy`` pod or are not seeing the expected runtime classes, use the following steps to view the logs: + +#. Get the ``kata-deploy`` pod name: .. code-block:: console $ kubectl get pods -n kata-system | grep kata-deploy - $ kubectl logs -n kata-system + + *Example Output:* + + .. code-block:: output + + kata-deploy- 1/1 Running 0 6m37s + +#. View the logs for the ``kata-deploy`` pod: + + .. code-block:: console + + $ kubectl logs -n kata-system kata-deploy- Replace ```` with the name of the ``kata-deploy`` pod from the first command's output. + *Example Output:* + + .. code-block:: output + + Install completed + daemonset mode: waiting for SIGTERM + + If the pod status is ``CrashLoopBackOff``, the logs show repeated errors, or runtime classes are missing after a successful Helm deploy, collect the log output and check for similar reports in the `Kata Containers GitHub repository `_. + If no existing issue matches your problem, `open a new issue `_ in that repository with your ``kata-deploy`` logs, chart version (``3.29.0``), and cluster details. + .. _coco-install-gpu-operator: +******************************* Install the NVIDIA GPU Operator --------------------------------- +******************************* Install the NVIDIA GPU Operator and configure it to deploy Confidential Container components. @@ -415,6 +309,11 @@ Install the NVIDIA GPU Operator and configure it to deploy Confidential Containe #. Install the GPU Operator with the following configuration: + .. tip:: + + Add ``--set sandboxWorkloads.defaultWorkload=vm-passthrough`` to configure every worker node for Confidential Containers workloads. 
+ Refer to the :ref:`Label Nodes for Confidential Containers Components ` section for more details on this use case. + .. code-block:: console $ helm install --wait --timeout 10m --generate-name \ @@ -425,7 +324,6 @@ Install the NVIDIA GPU Operator and configure it to deploy Confidential Containe --set nfd.enabled=true \ --set nfd.nodefeaturerules=true \ --version=v26.3.1 - *Example Output:* @@ -438,21 +336,17 @@ Install the NVIDIA GPU Operator and configure it to deploy Confidential Containe REVISION: 1 TEST SUITE: None - .. note:: + ``STATUS: deployed`` confirms the Helm release succeeded. + The ``--wait`` flag instructs Helm to wait until the release is deployed before returning. + It may take 3-5 minutes for the Helm command to complete and for all GPU Operator pods to reach the ``Running`` state. - The ``--wait`` flag instructs Helm to wait until the release is deployed before returning. - Some GPU Operator pods can take several additional minutes to reach the Running state after the Helm command completes. - Use the optional verification step that follows to confirm pod status. + For additional installation settings: - .. tip:: + * Refer to the :ref:`Common GPU Operator Configuration Settings ` section on this page for more details on the Confidential Containers-specific configuration options you can specify when installing the GPU Operator. - Add ``--set sandboxWorkloads.defaultWorkload=vm-passthrough`` if every worker node should deploy Confidential Containers by default. + * Refer to the :ref:`Common chart customization options ` in :doc:`Installing the NVIDIA GPU Operator ` for more details on the additional general configuration options you can specify when installing the GPU Operator. - Refer to the :ref:`Common GPU Operator Configuration Settings ` section on this page for more details on the configuration options you can specify when installing the GPU Operator. 
- - Refer to the :ref:`Common chart customization options ` in :doc:`Installing the NVIDIA GPU Operator ` for more details on the additional general configuration options you can specify when installing the GPU Operator. - -#. Optional: Verify that all GPU Operator pods, especially the Confidential Computing Manager, Kata Device Plugin and VFIO Manager operands, are running: +#. Verify that all GPU Operator pods, especially the Confidential Computing Manager, Kata Device Plugin and VFIO Manager operands, are running: .. code-block:: console @@ -472,17 +366,12 @@ Install the NVIDIA GPU Operator and configure it to deploy Confidential Containe nvidia-sandbox-validator-6xnzc 1/1 Running 0 30s nvidia-vfio-manager-h229x 1/1 Running 0 62s - For more details on each of the GPU Operator components, refer to the :ref:`GPU Operator Cluster Topology Considerations ` section in the architecture overview. - - .. note:: - - If you are not seeing the expected output, view the logs for the GPU Operator pods: - - .. code-block:: console - - $ kubectl logs -n gpu-operator + Each pod should report a ``READY`` value of ``1/1`` and a ``STATUS`` of ``Running`` or ``Completed``. + The ``nvidia-cc-manager``, ``nvidia-kata-sandbox-device-plugin-daemonset``, and ``nvidia-vfio-manager`` operands are specific to Confidential Containers and must be present on labeled nodes. + Pods may briefly show ``Pending`` or ``Init`` while they start, which is expected. + When all operands are ``Running`` or ``Completed``, the GPU Operator components are deployed and you can continue. - Replace ```` with the name of the GPU Operator pod from ``kubectl get pods -n gpu-operator``. + For more details on each of the GPU Operator components, refer to the :ref:`GPU Operator Cluster Topology Considerations ` section in the architecture overview. #. 
Optional: If you have host access to the worker node, you can perform the following validation step: @@ -501,152 +390,33 @@ Install the NVIDIA GPU Operator and configure it to deploy Confidential Containe Kernel driver in use: vfio-pci Kernel modules: nvidiafb, nouveau - .. tip:: - - If you have an issue deploying the GPU Operator, refer to the :doc:`NVIDIA GPU Operator troubleshooting guide ` for guidance on troubleshooting and resolving issues. - -With Kata Containers and the GPU Operator installed, you can start using your cluster to run Confidential Containers workloads. -To run a sample workload, refer to the :ref:`Run a Sample Workload ` section. - -For further configuration settings, refer to the following sections: - -* :ref:`Managing the Confidential Computing Mode ` -* :ref:`Configuring Workloads to use Multi-GPU Passthrough ` -* :ref:`Configuring GPU or NVSwitch Resource Types Name ` - -.. _coco-run-sample-workload: - -Run a Sample Workload -===================== - -A pod manifest for a confidential container GPU workload requires that you specify the ``kata-qemu-nvidia-gpu-snp`` runtime class for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems. - -1. Create a file, such as the following ``cuda-vectoradd-kata.yaml`` sample, specifying the appropriate runtime class for your system: - - .. tab-set:: - - .. tab-item:: AMD-based system (SNP) - :sync: amd-snp - - .. code-block:: yaml - :emphasize-lines: 7,14 - - apiVersion: v1 - kind: Pod - metadata: - name: cuda-vectoradd-kata - namespace: default - spec: - runtimeClassName: kata-qemu-nvidia-gpu-snp - restartPolicy: Never - containers: - - name: cuda-vectoradd - image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" - resources: - limits: - nvidia.com/pgpu: "1" # for single GPU passthrough - memory: 16Gi - - .. tab-item:: Intel-based system (TDX) - :sync: intel-tdx - - .. 
code-block:: yaml - :emphasize-lines: 7,14 - - apiVersion: v1 - kind: Pod - metadata: - name: cuda-vectoradd-kata - namespace: default - spec: - runtimeClassName: kata-qemu-nvidia-gpu-tdx - restartPolicy: Never - containers: - - name: cuda-vectoradd - image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" - resources: - limits: - nvidia.com/pgpu: "1" # for single GPU passthrough - memory: 16Gi - - The following are Confidential Containers configurations in the sample manifest: - - * Set the runtime class to ``kata-qemu-nvidia-gpu-snp`` for AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` for Intel-based systems, depending on the node type where the workloads should run. - - * In the sample above, ``nvidia.com/pgpu`` is the default resource type for GPUs. - If you are deploying on a heterogeneous cluster, you might want to update the default behavior by specifying the ``P_GPU_ALIAS`` environment variable for the Kata device plugin. - Refer to the :ref:`Configuring GPU or NVSwitch Resource Types Name ` section on this page for more details. - - * If you have machines that support multi-GPU passthrough, use a pod deployment manifest that specifies 8 PGPU. - If you are using NVIDIA Hopper GPUs with PPCIE mode, also specify 4 NVSwitch resources. - - .. code-block:: yaml - - resources: - limits: - nvidia.com/pgpu: "8" - nvidia.com/nvswitch: "4" # Only for NVIDIA Hopper GPUs with PPCIE mode - - .. note:: - If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, you must also set the Confidential Computing mode to ``ppcie`` mode. - Refer to :ref:`Managing the Confidential Computing Mode ` for details. + ``Kernel driver in use: vfio-pci`` confirms the GPU is bound for VFIO passthrough into the confidential virtual machine. + If the driver in use is ``nvidia`` or ``nouveau`` instead, the GPU is not ready for passthrough. + Confirm your node meets the :ref:`Prerequisites ` section, including removing any NVIDIA GPU drivers on the host. 
+**Success criteria:** All GPU Operator pods are ``Running`` or ``Completed``. +Your cluster is now configured to deploy workloads in Kata Containers. +Continue to :doc:`Run a Sample Workload ` to confirm everything is working as expected. -2. Create the pod: +If you are not seeing the expected output, view the logs for the GPU Operator pods or refer to :doc:`Troubleshooting `. - .. code-block:: console - - $ kubectl apply -f cuda-vectoradd-kata.yaml - - *Example Output:* - - .. code-block:: output - - pod/cuda-vectoradd-kata created - - -3. Verify the pod is running: - - .. code-block:: console - - $ kubectl get pod cuda-vectoradd-kata - - *Example Output:* - - .. code-block:: output - - NAME READY STATUS RESTARTS AGE - cuda-vectoradd-kata 1/1 Running 0 10s - -4. View the logs from the pod after the container starts: - - .. code-block:: console - - $ kubectl logs -n default cuda-vectoradd-kata - - *Example Output:* - - .. code-block:: output - - [Vector addition of 50000 elements] - Copy input data from the host memory to the CUDA device - CUDA kernel launch with 196 blocks of 256 threads - Copy output data from the CUDA device to the host memory - Test PASSED - Done - -5. Delete the pod: +.. code-block:: console - .. code-block:: console + $ kubectl logs -n gpu-operator - $ kubectl delete -f cuda-vectoradd-kata.yaml +Replace ```` with the name of the GPU Operator pod from ``kubectl get pods -n gpu-operator``. +.. tip:: + For general GPU Operator issues such as driver or toolkit failures, refer to the :doc:`NVIDIA GPU Operator troubleshooting guide `. + For Confidential Containers-specific deploy failures, refer to :doc:`Troubleshooting `. + Common symptoms include :ref:`Insufficient nvidia.com/pgpu ` and :ref:`device cold plug failed `. .. 
_coco-configuration-settings: +****************************************** Common GPU Operator Configuration Settings -=========================================== +****************************************** The following are the available GPU Operator configuration settings to enable Confidential Containers: @@ -666,12 +436,13 @@ The following are the available GPU Operator configuration settings to enable Co * - ``sandboxWorkloads.defaultWorkload`` - Specifies the default type of workload for the cluster, one of ``container``, ``vm-passthrough``, or ``vm-vgpu``. - Setting ``vm-passthrough`` or ``vm-vgpu`` can be helpful if you plan to run all or mostly virtual machines in your cluster. + Set to ``vm-passthrough`` if you plan to run all or mostly virtual machines in your cluster. - ``container`` * - ``sandboxWorkloads.mode`` - Specifies the sandbox mode to use when deploying sandbox workloads. Accepted values are ``kubevirt`` (default) and ``kata``. + Set to ``kata`` to run Confidential Containers workloads in Kata Containers. - ``kubevirt`` * - ``kataSandboxDevicePlugin.env`` @@ -684,8 +455,9 @@ The following are the available GPU Operator configuration settings to enable Co .. _coco-configuration-heterogeneous-clusters: +*********************************************** Configuring GPU or NVSwitch Resource Types Name ------------------------------------------------- +*********************************************** By default, the NVIDIA GPU Operator creates a resource type for GPUs and NVSwitches, ``nvidia.com/pgpu`` and ``nvidia.com/nvswitch``. You can reference this name in your manifests to request GPU or NVSwitch resources for your workload. @@ -697,7 +469,7 @@ To do this, specify an empty ``P_GPU_ALIAS`` environment variable in the Kata sa ``--set kataSandboxDevicePlugin.env[0].name=P_GPU_ALIAS`` and ``--set kataSandboxDevicePlugin.env[0].value=""``. 
-When this variable is set to ``""``, the Kata device plugin creates GPU model-specific resource types, for example ``nvidia.com/GH100_H100L_94GB``, instead of the default ``nvidia.com/pgpu`` type. +When this variable is set to ``""``, the Kata device plugin creates GPU model-specific resource types, for example ``nvidia.com/GH100_H200_141GB``, instead of the default ``nvidia.com/pgpu`` type. Use the exposed device resource types in pod specs by specifying respective resource limits. Similarly, you can set ``NVSWITCH_ALIAS`` to ``""`` to advertise model-specific NVSwitch resource types. @@ -707,17 +479,17 @@ The following example installs the GPU Operator with both ``P_GPU_ALIAS`` and `` .. code-block:: console $ helm install --wait --timeout 10m --generate-name \ - -n gpu-operator --create-namespace \ - nvidia/gpu-operator \ - --set sandboxWorkloads.enabled=true \ - --set sandboxWorkloads.mode=kata \ - --set nfd.enabled=true \ - --set nfd.nodefeaturerules=true \ - --set kataSandboxDevicePlugin.env[0].name=P_GPU_ALIAS \ - --set kataSandboxDevicePlugin.env[0].value="" \ - --set kataSandboxDevicePlugin.env[1].name=NVSWITCH_ALIAS \ - --set kataSandboxDevicePlugin.env[1].value="" \ - --version=v26.3.1 + -n gpu-operator --create-namespace \ + nvidia/gpu-operator \ + --set sandboxWorkloads.enabled=true \ + --set sandboxWorkloads.mode=kata \ + --set nfd.enabled=true \ + --set nfd.nodefeaturerules=true \ + --set kataSandboxDevicePlugin.env[0].name=P_GPU_ALIAS \ + --set kataSandboxDevicePlugin.env[0].value="" \ + --set kataSandboxDevicePlugin.env[1].name=NVSWITCH_ALIAS \ + --set kataSandboxDevicePlugin.env[1].value="" \ + --version=v26.3.1 After installing the GPU Operator, you can view the GPU or NVSwitch resource types available on a node by running the following command: @@ -726,6 +498,7 @@ After installing the GPU Operator, you can view the GPU or NVSwitch resource typ $ kubectl get node $NODE_NAME -o json | grep nvidia.com .. 
note:: + The ``NODE_NAME`` environment variable was set in the :ref:`Label Nodes ` section. If you want to view the resource types for a different node, you can update the ``NODE_NAME`` environment variable and run the command again. @@ -733,171 +506,15 @@ After installing the GPU Operator, you can view the GPU or NVSwitch resource typ .. code-block:: output - "nvidia.com/GH100_H100L_94GB": "1" + "nvidia.com/GH100_H200_141GB": "1" You should see the resource type information for the GPUs and NVSwitches on the node. -.. _managing-confidential-computing-mode: - -Managing the Confidential Computing Mode -========================================= - -You can set the default confidential computing mode of the NVIDIA GPUs by setting the ``ccManager.defaultMode=`` option. -The default value of ``ccManager.defaultMode`` is ``on``. -You can set this option when you install NVIDIA GPU Operator or afterward by modifying the cluster-policy instance of the ClusterPolicy object. - -When you change the mode, the manager performs the following actions: - -* Evicts the other GPU Operator operands from the node. - - However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode. - -* Changes the mode and resets the GPU. -* Reschedules the other GPU Operator operands. - -The supported modes are: - -.. list-table:: - :widths: 15 55 30 - :header-rows: 1 - - * - Mode - - Description - - Configuration Method - * - ``on`` (default) - - Enable Confidential Computing. - - cluster-wide default, node-level override - * - ``off`` - - Disable Confidential Computing. - - cluster-wide default, node-level override - * - ``ppcie`` - - Enable Confidential Computing on NVIDIA Hopper GPUs. - - On the NVIDIA Hopper architecture multi-GPU passthrough uses protected PCIe (PPCIE) - which claims exclusive use of the NVSwitches for a single Confidential Container - virtual machine. 
- If you are using NVIDIA Hopper GPUs for multi-GPU passthrough, - set the GPU mode to ``ppcie`` mode. - - The NVIDIA Blackwell architecture uses NVLink - encryption which places the switches outside of the Trusted Computing Base (TCB), - meaning the ``ppcie`` mode is not required. Use ``on`` mode in this case. - - node-level override - -You can set a cluster-wide default mode, and you can set the mode on individual nodes. -The mode that you set on a node has higher precedence than the cluster-wide default mode. - -Setting a Cluster-Wide Default Mode ------------------------------------- - -To set a cluster-wide mode, specify the ``ccManager.defaultMode`` field like the following example: - -.. code-block:: console - - $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \ - --type=merge \ - -p '{"spec": {"ccManager": {"defaultMode": "on"}}}' - -*Example Output:* - -.. code-block:: output - - clusterpolicy.nvidia.com/cluster-policy patched - -.. note:: - - The ``ppcie`` mode cannot be set as a cluster-wide default, it can only be set as a node label value. - -Setting a Node-Level Mode --------------------------- - -To set a node-level mode, apply the ``nvidia.com/cc.mode=`` label on the node. - -.. note:: - - The ``NODE_NAME`` environment variable was set in the :ref:`Label Nodes ` section. - If you want to set the mode for a different node, you can update the ``NODE_NAME`` environment variable and run the command again. - -.. code-block:: console - - $ kubectl label node $NODE_NAME nvidia.com/cc.mode=on --overwrite - -The mode that you set on a node has higher precedence than the cluster-wide default mode. - -Verifying a Mode Change ------------------------- - -To verify that a mode change was successful, view the ``nvidia.com/cc.mode``, -``nvidia.com/cc.mode.state``, and ``nvidia.com/cc.ready.state`` node labels: - -.. 
code-block:: console - - $ kubectl get node $NODE_NAME -o json | \ - jq '.metadata.labels | with_entries(select(.key | startswith("nvidia.com/cc")))' - -*Example Output (CC mode disabled):* - -.. code-block:: json - - { - "nvidia.com/cc.mode": "off", - "nvidia.com/cc.mode.state": "off", - "nvidia.com/cc.ready.state": "false" - } - -*Example Output (CC mode enabled):* - -.. code-block:: json - - { - "nvidia.com/cc.mode": "on", - "nvidia.com/cc.mode.state": "on", - "nvidia.com/cc.ready.state": "true" - } - -* The ``nvidia.com/cc.mode`` label is the desired state. - -* The ``nvidia.com/cc.mode.state`` label reflects the mode that was last successfully applied to the GPU hardware by the Confidential Computing Manager. - Its value mirrors the applied mode ``on``, ``off``, or ``ppcie``, after the transition is complete on the node. - A value of ``failed`` indicates that the last mode transition encountered an error. - -* The ``nvidia.com/cc.ready.state`` label indicates whether the node is ready to run Confidential Container workloads. - It is set to ``true`` when ``cc.mode.state`` is ``on`` or ``ppcie``, and ``false`` when ``cc.mode.state`` is ``off``. - -.. note:: - - It can take one to two minutes for GPU state transitions to complete and the labels to be updated. - A mode change is complete and successful when ``nvidia.com/cc.mode`` and - ``nvidia.com/cc.mode.state`` have the same value. - - -.. _coco-configuration-multi-gpu-passthrough: - -Configuring Workloads to use Multi-GPU Passthrough -=================================================== - -To configure multi-GPU passthrough, you can specify the following resource limits in your manifests: - -.. code-block:: yaml - - limits: - nvidia.com/pgpu: "8" - nvidia.com/nvswitch: "4" # Only for NVIDIA Hopper GPUs with PPCIE mode - - -You must assign all the GPUs and NVSwitches on the node in your manifest to the same Confidential Container virtual machine. 
- -On the NVIDIA Hopper architecture, multi-GPU passthrough uses protected PCIe (PPCIE), which claims exclusive use of the NVSwitches for a single Confidential Container. -When using NVIDIA Hopper nodes for multi-GPU passthrough, transition your node's GPU Confidential Computing mode to ``ppcie`` by applying the ``nvidia.com/cc.mode=ppcie`` label. -Refer to the :ref:`Managing the Confidential Computing Mode ` section for details. - -The NVIDIA Blackwell architecture uses NVLink encryption which places the switches outside of the Trusted Computing Base (TCB) and only requires the GPU Confidential Computing mode to be set to ``on``. - +********** Next Steps -========== +********** -* Refer to the :doc:`Attestation ` page for more information on configuring attestation. +* :doc:`Run a Sample Workload ` to verify your deployment. * To help manage the lifecycle of Kata Containers, install the `Kata Lifecycle Manager `_. This Argo Workflows-based tool manages Kata Containers upgrades and day-two operations. -* Licensing information is available on the :doc:`Licensing ` page. \ No newline at end of file diff --git a/confidential-containers/configure-cc-mode.rst b/confidential-containers/configure-cc-mode.rst new file mode 100644 index 000000000..69b5ad6fb --- /dev/null +++ b/confidential-containers/configure-cc-mode.rst @@ -0,0 +1,185 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _managing-confidential-computing-mode: + +######################################### +Managing the Confidential Computing Mode +######################################### + +As a :ref:`Kubernetes Cluster Administrator `, use this page to configure the confidential computing mode of NVIDIA GPUs in your cluster. + +After installing the NVIDIA GPU Operator, you can use the GPU Operator to configure the confidential computing mode of the NVIDIA GPUs in your cluster. +You can set a cluster-wide default Confidential Computing mode, and you can set the mode on individual nodes. + +Set the cluster-wide default mode using the ``ccManager.defaultMode=`` option. +The default value of ``ccManager.defaultMode`` is ``on``. +Set a node-level mode by applying the ``nvidia.com/cc.mode=`` label on the node. +If you set a specific mode on a node, it has higher precedence than the cluster-wide default mode. + +The supported modes are: + +.. list-table:: + :widths: 15 55 30 + :header-rows: 1 + + * - Mode + - Description + - Configuration Method + * - ``on`` (default) + - Enable Confidential Computing. + - cluster-wide default, node-level override + * - ``off`` + - Disable Confidential Computing. + - cluster-wide default, node-level override + * - ``ppcie`` + - Enable Confidential Computing on NVIDIA Hopper GPUs. + On the NVIDIA Hopper architecture, :ref:`multi-GPU passthrough ` + uses protected PCIe (PPCIE), which claims exclusive use of the NVSwitches for a single + Confidential Container virtual machine. + If you use NVIDIA Hopper GPUs for multi-GPU passthrough, set the mode to ``ppcie``. + The NVIDIA Blackwell architecture uses NVLink encryption, which places the switches outside + of the Trusted Computing Base (TCB), so ``ppcie`` mode is not required. + Use ``on`` mode for Blackwell. 
+ - node-level override + +When you change the mode, the manager performs the following actions: + +* Evicts the other GPU Operator operands from the node. + However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode. +* Changes the mode and resets the GPU. +* Reschedules the other GPU Operator operands. + +*********************************** +Setting a Cluster-Wide Default Mode +*********************************** + +.. note:: + + Before changing the mode, make sure that no user workloads are running on the node. + +To set a cluster-wide mode, specify the ``ccManager.defaultMode`` field, as in the following example: + +.. code-block:: console + + $ kubectl patch clusterpolicies.nvidia.com/cluster-policy \ + --type=merge \ + -p '{"spec": {"ccManager": {"defaultMode": "on"}}}' + +*Example Output:* + +.. code-block:: output + + clusterpolicy.nvidia.com/cluster-policy patched + +.. note:: + + The ``ppcie`` mode cannot be set as a cluster-wide default. It can only be set as a node label value. + +************************* +Setting a Node-Level Mode +************************* + +To set a node-level mode, apply the ``nvidia.com/cc.mode=`` label on the node. + +Set the ``NODE_NAME`` environment variable to the name of the node you want to configure: + +.. code-block:: console + + $ export NODE_NAME="" + +Then apply the label: + +.. code-block:: console + + $ kubectl label node $NODE_NAME nvidia.com/cc.mode=on --overwrite + +The mode that you set on a node has higher precedence than the cluster-wide default mode. + +*********************** +Verifying a Mode Change +*********************** + +To verify that a mode change was successful, view the ``nvidia.com/cc.mode``, +``nvidia.com/cc.mode.state``, and ``nvidia.com/cc.ready.state`` node labels: + +.. 
code-block:: console + + $ kubectl get node $NODE_NAME -o json | \ + jq '.metadata.labels | with_entries(select(.key | startswith("nvidia.com/cc")))' + +*Example Output (CC mode disabled):* + +.. code-block:: json + + { + "nvidia.com/cc.mode": "off", + "nvidia.com/cc.mode.state": "off", + "nvidia.com/cc.ready.state": "false" + } + +*Example Output (CC mode enabled):* + +.. code-block:: json + + { + "nvidia.com/cc.mode": "on", + "nvidia.com/cc.mode.state": "on", + "nvidia.com/cc.ready.state": "true" + } + +When you disable CC mode after enabling it, wait one to two minutes for +``nvidia.com/cc.mode.state`` and ``nvidia.com/cc.ready.state`` to match the desired ``off`` state. +A mode change is complete and successful when ``nvidia.com/cc.mode`` and +``nvidia.com/cc.mode.state`` have the same value. + +If ``nvidia.com/cc.mode.state`` does not match ``nvidia.com/cc.mode``, refer to :ref:`nvidia.com/cc.mode.state Not Matching nvidia.com/cc.mode ` in the troubleshooting guide. +If ``nvidia.com/cc.mode.state`` is ``failed``, refer to :ref:`nvidia.com/cc.mode.state is failed `. + +************************************************ +Understanding Confidential Computing Mode Labels +************************************************ + +The following labels are used to manage the Confidential Computing mode on a node. +You only need to update the ``nvidia.com/cc.mode`` label. The Confidential Computing Manager manages the other labels to represent the current state of the Confidential Computing mode on the node. + +.. list-table:: + :widths: 30 20 50 + :header-rows: 1 + + * - Label Name + - Label Values + - Details + * - ``nvidia.com/cc.mode`` + - ``on``, ``off``, ``ppcie`` + - The desired Confidential Computing mode. + You update this node label to trigger a mode change. + * - ``nvidia.com/cc.mode.state`` + - ``on``, ``off``, ``ppcie``, ``failed`` + - Reflects the mode that was last successfully applied to the GPU hardware by the Confidential Computing Manager. 
+ Its value mirrors the applied mode after the transition is complete on the node. + A value of ``failed`` indicates that the last mode transition encountered an error. + * - ``nvidia.com/cc.ready.state`` + - ``true``, ``false`` + - Indicates whether the node is ready to run Confidential Container workloads. + Set to ``true`` when ``cc.mode.state`` is ``on`` or ``ppcie``, and ``false`` when ``cc.mode.state`` is ``off``. + +.. note:: + + The ``ppcie`` mode is only supported on NVIDIA Hopper GPUs. diff --git a/confidential-containers/configure-workloads.rst b/confidential-containers/configure-workloads.rst new file mode 100644 index 000000000..fd7b7151d --- /dev/null +++ b/confidential-containers/configure-workloads.rst @@ -0,0 +1,316 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _coco-configure-workloads: + +############################################ +Configuring Confidential Container Workloads +############################################ + +As a :ref:`Container User `, use this page to configure confidential GPU workloads on a prepared cluster. +For persona responsibilities and documentation structure, refer to :doc:`Personas `. 
+ +A Confidential Container workload is a standard Kubernetes pod that runs inside a TEE-protected +virtual machine and requests one or more GPUs through the NVIDIA Kata sandbox device plugin. +Compared with a traditional GPU pod, a Confidential Container workload pod manifest differs in +three ways: + +* It selects a TEE-aware Kata runtime class instead of the default ``runc``-based runtime. +* It requests GPU and NVSwitch resources using the resource types advertised by the NVIDIA + Kata sandbox device plugin, which can be either default names or model-specific names. +* For NVSwitch-based HGX systems, it requests every GPU and NVSwitch on the node together so + that all devices reside inside the same Confidential Container virtual machine. + +This page is part of **Advanced Setup** and is the usual next step after a successful install. + +**Before this page:** Complete the :doc:`Detailed Install Guide ` and verify the cluster with :doc:`Run a Sample Workload ` (``Test PASSED`` in pod logs). +For install steps, refer to :doc:`Prerequisites ` and :doc:`Detailed Install Guide `. + +This page describes each of these decisions and provides single-GPU and multi-GPU passthrough +manifest examples that you can copy and adapt to your environment. +The install sample uses a minimal manifest; use this page for production-style configuration. + +******************************** +Select a Container Runtime Class +******************************** + +A Confidential Container workload must set ``spec.runtimeClassName`` to a TEE-aware Kata +runtime that NVIDIA provides through the ``kata-deploy`` Helm chart. +Select the runtime class based on the CPU TEE on the target worker node: + +.. 
list-table:: + :header-rows: 1 + :widths: 30 40 30 + + * - Node TEE + - Runtime class + - Typical CPU vendor + * - AMD SEV-SNP + - ``kata-qemu-nvidia-gpu-snp`` + - AMD EPYC (Genoa or newer) + * - Intel TDX + - ``kata-qemu-nvidia-gpu-tdx`` + - Intel Xeon (Sapphire Rapids or newer) + +The ``kata-deploy`` chart also installs a ``kata-qemu-nvidia-gpu`` runtime class. +That class is intended for non-confidential Kata workloads. You should not use it for Confidential +Container workloads because it does not start the GPU in CC mode. + +.. _coco-resource-types: + +***************************************** +Reference GPU and NVSwitch Resource Types +***************************************** + +The NVIDIA Kata sandbox device plugin advertises GPUs and NVSwitches to Kubernetes as extended resources. +Your pod manifest requests those resources under ``resources.limits``. +You can use either the default resource types or model-specific resource types. + +By default, every passthrough GPU is advertised as ``nvidia.com/pgpu`` and every NVSwitch is advertised as ``nvidia.com/nvswitch``. +These names are stable across GPU models, which keeps manifests portable when every node in your cluster has the same GPU type. + +A sample resource request using the default resource type is shown below: + +.. code-block:: yaml + + resources: + limits: + nvidia.com/pgpu: "1" + +In heterogeneous clusters, where worker nodes use different GPU models, you can configure the Kata sandbox device plugin to advertise resources under model-specific names by setting +``P_GPU_ALIAS=""`` (and optionally ``NVSWITCH_ALIAS=""``) on the plugin. +With this configuration, GPUs are exposed as resources such as ``nvidia.com/GH100_H200_141GB``, +which lets a workload pin itself to a specific accelerator model. + +Refer to :ref:`Configuring GPU or NVSwitch Resource Types Name ` +for the GPU Operator install flags that enable this behavior. 
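As an alternative to repeated ``--set`` flags, the same plugin environment settings can be sketched as a Helm values file.
The file name ``coco-values.yaml`` is an assumption used only for illustration; the keys mirror the ``--set`` flags shown in the Configuring GPU or NVSwitch Resource Types Name section.

.. code-block:: yaml

   # coco-values.yaml (hypothetical file name)
   sandboxWorkloads:
     enabled: true
     mode: kata
   nfd:
     enabled: true
     nodefeaturerules: true
   kataSandboxDevicePlugin:
     env:
       # An empty alias asks the plugin to advertise model-specific names
       # instead of the default nvidia.com/pgpu and nvidia.com/nvswitch.
       - name: P_GPU_ALIAS
         value: ""
       - name: NVSWITCH_ALIAS
         value: ""

You would then pass the file to ``helm install`` with ``-f coco-values.yaml`` in place of the individual ``--set`` flags.
Keeping the empty strings explicitly quoted avoids any ambiguity between an empty value and an unset one.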
+ +Use the model-specific resource name in workloads that must target a specific accelerator: + +.. code-block:: yaml + + resources: + limits: + nvidia.com/GH100_H200_141GB: "1" + +To list the GPU and NVSwitch resource types advertised on a node, run: + +.. code-block:: console + + $ kubectl get node $NODE_NAME -o json | grep nvidia.com + +*Example Output:* + +.. code-block:: output + + "nvidia.com/GH100_H200_141GB": "1" + +.. _coco-single-gpu-workload: + +********************** +Single-GPU Passthrough +********************** + +A single-GPU workload requests one GPU and runs inside its own Confidential Container virtual +machine. +This pattern is the recommended starting point for verifying a deployment and for most +independent workloads that do not require NVLink between GPUs. + +#. Create a file, such as ``cuda-vectoradd-kata.yaml``: + + .. code-block:: yaml + :emphasize-lines: 7,14 + + apiVersion: v1 + kind: Pod + metadata: + name: cuda-vectoradd-kata + namespace: default + spec: + runtimeClassName: kata-qemu-nvidia-gpu-snp # or kata-qemu-nvidia-gpu-tdx + restartPolicy: Never + containers: + - name: cuda-vectoradd + image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" + resources: + limits: + nvidia.com/pgpu: "1" + memory: 16Gi + + .. note:: + + If you configured the Kata sandbox device plugin to use model-specific resource types, + replace ``nvidia.com/pgpu`` with the appropriate model-specific name, such as + ``nvidia.com/GH100_H200_141GB``. + +#. Create the pod: + + .. code-block:: console + + $ kubectl apply -f cuda-vectoradd-kata.yaml + +#. Verify the workload completes successfully: + + .. code-block:: console + + $ kubectl logs cuda-vectoradd-kata + + *Example Output:* + + .. 
code-block:: output + + [Vector addition of 50000 elements] + Copy input data from the host memory to the CUDA device + CUDA kernel launch with 196 blocks of 256 threads + Copy output data from the CUDA device to the host memory + Test PASSED + Done + +Refer to :doc:`Run a Sample Workload ` for the end-to-end verification flow including +deletion and pending-pod guidance. + +.. _coco-multi-gpu-prereqs: +.. _coco-multi-gpu-passthrough: + +********************* +Multi-GPU Passthrough +********************* + +Multi-GPU passthrough assigns every GPU and NVSwitch on a node to a single Confidential +Container virtual machine. +This configuration is required for NVSwitch (NVLink) based HGX systems running confidential +workloads. + +.. important:: + + You must assign all the GPUs and NVSwitches on the node to the same Confidential Container + virtual machine. + Configuring only a subset of GPUs for Confidential Computing on a single node is not + supported. + +NVIDIA Hopper PPCIE Mode +======================== + +For NVIDIA Hopper GPUs, multi-GPU passthrough requires protected PCIe (PPCIE) mode, which +claims exclusive use of the NVSwitches for a single Confidential Container. +The NVIDIA Confidential Computing Manager for Kubernetes transitions GPUs into the correct +mode based on the ``cc.mode`` label that you set. + +#. Set the ``NODE_NAME`` environment variable to the node you want to configure: + + .. code-block:: console + + $ export NODE_NAME="" + +#. Apply the ``ppcie`` CC mode label to the node: + + .. code-block:: console + + $ kubectl label node $NODE_NAME nvidia.com/cc.mode=ppcie --overwrite + +Refer to :doc:`Managing the Confidential Computing Mode ` for full details +on setting the CC mode and verifying the change. + +NVIDIA Blackwell GPUs use NVLink encryption, which places the switches outside of the +Trusted Computing Base (TCB), so the default CC mode of ``on`` is sufficient and no additional +configuration is required. 
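As a rough sketch of what to expect on a Hopper node after the ``ppcie`` transition completes, the node labels should resemble the following fragment of ``kubectl get node $NODE_NAME -o yaml`` output.
The values are inferred from the label descriptions in Managing the Confidential Computing Mode and are not captured from a live cluster.

.. code-block:: yaml

   metadata:
     labels:
       # Desired mode, set by you with kubectl label.
       nvidia.com/cc.mode: ppcie
       # Applied mode, set by the Confidential Computing Manager.
       nvidia.com/cc.mode.state: ppcie
       # Label values are strings, so "true" is quoted.
       nvidia.com/cc.ready.state: "true"

If ``nvidia.com/cc.mode.state`` still differs from ``nvidia.com/cc.mode`` after a couple of minutes, treat the transition as incomplete before scheduling a multi-GPU workload.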
+ +Run a Multi-GPU Workload +======================== + +#. Create a file, such as ``multi-gpu-kata.yaml``, with a pod manifest that requests every GPU + and NVSwitch on the node: + + .. code-block:: yaml + :emphasize-lines: 7,14-16 + + apiVersion: v1 + kind: Pod + metadata: + name: multi-gpu-kata + namespace: default + spec: + runtimeClassName: kata-qemu-nvidia-gpu-snp # or kata-qemu-nvidia-gpu-tdx + restartPolicy: Never + containers: + - name: cuda-sample + image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" + resources: + limits: + nvidia.com/pgpu: "8" + nvidia.com/nvswitch: "4" # Only for NVIDIA Hopper GPUs with PPCIE mode + memory: 128Gi + + .. note:: + + If you configured ``P_GPU_ALIAS`` or ``NVSWITCH_ALIAS`` for heterogeneous clusters, + replace ``nvidia.com/pgpu`` and ``nvidia.com/nvswitch`` with the corresponding + model-specific resource types. + Refer to :ref:`Reference GPU and NVSwitch Resource Types ` + for details. + +#. Create the pod: + + .. code-block:: console + + $ kubectl apply -f multi-gpu-kata.yaml + + *Example Output:* + + .. code-block:: output + + pod/multi-gpu-kata created + +#. Verify the pod is running: + + .. code-block:: console + + $ kubectl get pod multi-gpu-kata + + *Example Output:* + + .. code-block:: output + + NAME READY STATUS RESTARTS AGE + multi-gpu-kata 1/1 Running 0 30s + +#. Verify that all GPUs are visible inside the container: + + .. code-block:: console + + $ kubectl exec multi-gpu-kata -- nvidia-smi -L + + *Example Output:* + + .. 
code-block:: output + + GPU 0: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 1: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 2: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 3: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 4: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 5: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 6: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + GPU 7: NVIDIA H100 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) + +#. Delete the pod: + + .. code-block:: console + + $ kubectl delete -f multi-gpu-kata.yaml diff --git a/confidential-containers/configure.rst b/confidential-containers/configure.rst new file mode 100644 index 000000000..61c890aaa --- /dev/null +++ b/confidential-containers/configure.rst @@ -0,0 +1,58 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _configure-confidential-containers: + +####################### +Advanced Setup Overview +####################### + +This page is the entry point for the **Advanced Setup** section. +For persona responsibilities and documentation structure, refer to :doc:`Personas `. + +Before starting any of these topics, ensure you have completed the **Install** section. + + +.. 
grid:: 2 + :gutter: 3 + + .. grid-item-card:: :octicon:`cpu;1.5em;sd-mr-1` Configuring Workloads + :link: configure-workloads + :link-type: doc + + Runtime class selection, GPU and NVSwitch resource types, and single- or multi-GPU passthrough. + + .. grid-item-card:: :octicon:`gear;1.5em;sd-mr-1` Managing the Confidential Computing Mode + :link: configure-cc-mode + :link-type: doc + + Set the confidential computing mode on NVIDIA GPUs at the cluster or node level. + + .. grid-item-card:: :octicon:`shield-check;1.5em;sd-mr-1` Attestation + :link: attestation + :link-type: doc + + Stand up a local Trustee instance and verify connectivity with the KBS client. + + .. grid-item-card:: :octicon:`stack;1.5em;sd-mr-1` Multi-GPU Passthrough + :link: coco-multi-gpu-passthrough + :link-type: ref + + Assign every GPU and NVSwitch on a node to a single Confidential Container virtual + machine for NVSwitch-based HGX systems. diff --git a/confidential-containers/index.rst b/confidential-containers/index.rst index a5024ad2d..4bb42e172 100644 --- a/confidential-containers/index.rst +++ b/confidential-containers/index.rst @@ -16,63 +16,196 @@ .. headings # #, * *, =, -, ^, " -********************************************************** -NVIDIA Confidential Containers Architecture -********************************************************** +################################# +NVIDIA Confidential Containers +################################# .. toctree:: - :caption: NVIDIA Confidential Containers Architecture + :caption: Learn :hidden: :titlesonly: - Release Notes - Overview + Overview + Reference Architecture + Personas Supported Platforms - Deploy Confidential Containers + +.. toctree:: + :caption: Install + :hidden: + :titlesonly: + + Prerequisites + Quickstart Install + Detailed Install Guide + Run a Sample Workload + +.. 
toctree:: + :caption: Advanced Setup + :hidden: + :titlesonly: + + Advanced Setup Overview + Configuring Workloads + Managing the Confidential Computing Mode Attestation + +.. toctree:: + :caption: Reference + :hidden: + :titlesonly: + + Troubleshooting + Release Notes Licensing +NVIDIA Confidential Containers is a validated reference architecture for running GPU-accelerated AI workloads on Kubernetes inside hardware-enforced Trusted Execution Environments (TEEs). +It extends NVIDIA GPU Confidential Computing to standard Kubernetes deployments using CNCF `Confidential Containers `__ and Kata Containers with the NVIDIA GPU Operator. +Use it to protect model intellectual property and sensitive data from untrusted infrastructure across public cloud, on-premises, and edge deployments. + +Benefits +======== + +Confidential Containers provides the following benefits: + +* Protect model IP and sensitive data on untrusted public cloud, on-premises, or edge infrastructure. +* Deploy proprietary generative AI models in regulated industries on third-party or private clusters. +* Isolate GPU workloads in hardware-protected enclaves with encrypted memory and integrity verification. +* Operate confidential workloads with standard Kubernetes pods, runtime classes, and scheduling. +* Verify TEE state through remote attestation before releasing secrets or decrypted model weights. + +Refer to :doc:`Reference Architecture ` for the full value proposition, trust model, and architecture diagrams. + +Use Cases +--------- + +Common scenarios include protecting proprietary model IP on third-party infrastructure, running frontier models in a sovereign environment, and processing sensitive enterprise data in data clean rooms. +Refer to :ref:`Use Cases ` in the Reference Architecture for workflows and deployment scenarios. 
+ +Core Concepts +============= + +`Confidential Containers `__ runs Kubernetes pods in hardware-isolated virtual machines instead of on the shared host kernel, protecting workloads from the host and other tenants. +On supported hardware (AMD SEV-SNP or Intel TDX), that isolation forms a trusted execution environment (TEE) with encrypted memory and integrity verification. + +Attestation, sealed secrets, and encrypted container images are core to the model. +Refer to :ref:`Background ` in the Reference Architecture. + +Core Components +=============== + +This documentation focuses on the components you install, configure, and operate to run workloads in a Confidential Containers runtime on Kubernetes. +The :doc:`Reference Architecture ` describes the full stack. +The install guides cover Kata Containers and the NVIDIA GPU Operator end to end. + +* Kata Containers: Runs pods inside TEE-protected virtual machines instead of on the shared host kernel. + Install Kata Deploy and TEE-specific runtime shims in :doc:`Quickstart Install ` and :doc:`Detailed Install Guide `. + Schedule workloads with a TEE-aware ``RuntimeClass`` in :doc:`Configuring Workloads `. + +* NVIDIA GPU Operator: Automates GPU Confidential Computing on eligible nodes, including CC mode, VFIO passthrough, and GPU allocation for Kata pods. + Configure the Operator and node labels in :doc:`Detailed Install Guide `. + Manage CC mode in :doc:`Managing the Confidential Computing Mode `. -This is documentation for NVIDIA's implementation of Confidential Containers including reference architecture information and supported platforms. + For Confidential Containers, the Operator deploys: + * NVIDIA Confidential Computing Manager (cc-manager) + * NVIDIA Kata Sandbox Device Plugin + * NVIDIA VFIO Manager + * Node Feature Discovery (NFD) -.. 
grid:: 3 +Using This Documentation +======================== + +This documentation describes the NVIDIA reference architecture for Confidential Containers and deployment recommendations for the upstream `CNCF Confidential Containers project `_ with NVIDIA GPUs. +It covers NVIDIA-specific configurations needed to run Confidential Containers workloads on Kubernetes. +This primarily includes the steps to enable and configure Kata Containers and the NVIDIA GPU Operator on your cluster. + +For advanced Confidential Containers topics and day-two operations, refer to the upstream `Confidential Containers documentation `__, as the workflows and implementations are not NVIDIA specific. +For example, an attestation implementation is not specific to NVIDIA GPUs. +A brief attestation overview and evaluation quickstart is available in :doc:`Attestation `, but full production attestation implementation guides are in the upstream `Confidential Containers attestation documentation `__. + + +Learn +===== + +.. grid:: 2 :gutter: 3 - .. grid-item-card:: :octicon:`book;1.5em;sd-mr-1` Overview + .. grid-item-card:: :octicon:`book;1.5em;sd-mr-1` Reference Architecture :link: overview :link-type: doc - Start here to review the reference architecture, use cases, and software components. + Use cases, software components, and cluster topology. + + .. grid-item-card:: :octicon:`info;1.5em;sd-mr-1` Personas + :link: personas + :link-type: doc + + Roles, responsibilities, and documentation navigation by persona. + .. grid-item-card:: :octicon:`server;1.5em;sd-mr-1` Supported Platforms :link: supported-platforms :link-type: doc - Learn about the validated hardware, OS, and component versions. + Validated hardware, OS, and component versions. + +Install +======= + +.. grid:: 2 + :gutter: 3 + + .. grid-item-card:: :octicon:`checklist;1.5em;sd-mr-1` Prerequisites + :link: prerequisites + :link-type: doc + + Hardware, BIOS, and Kubernetes cluster requirements. - .. 
grid-item-card:: :octicon:`rocket;1.5em;sd-mr-1` Deploy Confidential Containers + .. grid-item-card:: :octicon:`zap;1.5em;sd-mr-1` Quickstart Install + :link: install-quickstart + :link-type: doc + + Minimal steps to install Kata Containers and the GPU Operator. + + .. grid-item-card:: :octicon:`rocket;1.5em;sd-mr-1` Detailed Install Guide :link: confidential-containers-deploy :link-type: doc - Use this page to deploy with the NVIDIA GPU Operator on Kubernetes. + Install with per-node labeling, configuration options, and troubleshooting. - .. grid-item-card:: :octicon:`shield-check;1.5em;sd-mr-1` Attestation - :link: attestation + .. grid-item-card:: :octicon:`play;1.5em;sd-mr-1` Run a Sample Workload + :link: run-sample-workload :link-type: doc - Learn about remote attestation, Trustee, and the NVIDIA verifier for GPU workloads. + Run a sample GPU workload; success is ``Test PASSED`` in the pod logs. + +Advanced Setup +============== + +.. grid:: 2 + :gutter: 3 + + .. grid-item-card:: :octicon:`list-unordered;1.5em;sd-mr-1` Advanced Setup Overview + :link: configure + :link-type: doc + Choose attestation, CC mode, and workload configuration after install. - .. grid-item-card:: :octicon:`note;1.5em;sd-mr-1` Release Notes - :link: release-notes + .. grid-item-card:: :octicon:`cpu;1.5em;sd-mr-1` Configuring Workloads + :link: configure-workloads :link-type: doc - Review new features and known issues for each release. + Runtime classes, resource types, and multi-GPU passthrough. - .. grid-item-card:: :octicon:`law;1.5em;sd-mr-1` Licensing - :link: licensing + .. grid-item-card:: :octicon:`gear;1.5em;sd-mr-1` Managing the Confidential Computing Mode + :link: configure-cc-mode :link-type: doc - Learn about the licensing information for Confidential Containers documentation. + Set CC mode at the cluster or node level. + + .. 
grid-item-card:: :octicon:`shield-check;1.5em;sd-mr-1` Attestation + :link: attestation + :link-type: doc + Trustee quickstart and connectivity checks (not required for the install sample). diff --git a/confidential-containers/install-quickstart.rst b/confidential-containers/install-quickstart.rst new file mode 100644 index 000000000..6ed54d06f --- /dev/null +++ b/confidential-containers/install-quickstart.rst @@ -0,0 +1,206 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _coco-install-quickstart: + +################## +Quickstart Install +################## + +As a :ref:`Kubernetes Cluster Administrator `, use this page to install Kata Containers and the NVIDIA GPU Operator with minimal steps. +For additional configuration options and install details, refer to the :doc:`Detailed Install Guide `. + +Use this quickstart if you want every node in your cluster to run Confidential Containers. +This is the fastest path and is ideal for evaluation or dedicated Confidential Containers clusters. +If you need to run Confidential Containers on only some nodes while keeping traditional GPU workloads on others, or you want more control over the installation, use the :doc:`Detailed Install Guide ` instead. 
+ +This quickstart takes approximately 10 minutes to complete, assuming your cluster already meets the prerequisites. + +.. note:: + + Before starting, make sure your cluster meets the :doc:`Prerequisites `. + +What You Will Build +------------------- + +By the end of this quickstart, you will have: + +* Kata Containers running on your cluster. +* The NVIDIA GPU Operator installed and configured for Confidential Containers. +* All cluster nodes configured for Confidential Containers workloads. + +.. note:: + + This quickstart configures all cluster nodes for Confidential Containers workloads. + A cluster node can only be configured to run one container runtime at a time, so a node configured for Confidential Containers workloads cannot run traditional GPU container workloads. + + If you need to run traditional GPU container workloads on your cluster, refer to the :ref:`Label Nodes for Confidential Containers Components ` section in the :doc:`Detailed Install Guide `. + +.. _quickstart-install-kata: + +************************************** +Install the Kata Containers Helm Chart +************************************** + +#. Set the chart version and registry path: + + .. code-block:: console + + $ export VERSION="3.29.0" + $ export CHART="oci://ghcr.io/kata-containers/kata-deploy-charts/kata-deploy" + +#. Install the ``kata-deploy`` Helm chart: + + .. code-block:: console + + $ helm install kata-deploy "${CHART}" \ + --namespace kata-system --create-namespace \ + --set nfd.enabled=false \ + --wait --timeout 10m \ + --version "${VERSION}" + + *Example Output:* + + .. 
code-block:: output + + Pulled: ghcr.io/kata-containers/kata-deploy-charts/kata-deploy:3.29.0 + Digest: sha256:aea41018779716ce2e0bf406d701637d10fb5a0792db51a08dfd3f76701eb933 + LAST DEPLOYED: Wed Apr 1 17:03:00 2026 + NAMESPACE: kata-system + STATUS: deployed + REVISION: 1 + DESCRIPTION: Install complete + TEST SUITE: None + + It can take 2 to 3 minutes for the command to return and all output to be printed. + + .. note:: + + There is a `known Helm issue `_ on single-node clusters that may result in the Helm command finishing before all pods are done initializing. + If you are deploying to a single-node cluster, wait a few additional minutes after the command completes. + +#. Verify the ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu-tdx`` runtime classes are available: + + After ``helm install`` completes, the ``kata-deploy`` chart creates the Kata ``RuntimeClass`` resources on the cluster. + Confirm SNP and TDX classes are present before you continue to :ref:`Install the NVIDIA GPU Operator `. + + .. code-block:: console + + $ kubectl get runtimeclass | grep kata-qemu-nvidia-gpu + + *Example Output:* + + .. code-block:: output + + NAME HANDLER AGE + kata-qemu-nvidia-gpu kata-qemu-nvidia-gpu 40s + kata-qemu-nvidia-gpu-snp kata-qemu-nvidia-gpu-snp 40s + kata-qemu-nvidia-gpu-tdx kata-qemu-nvidia-gpu-tdx 40s + + If SNP or TDX runtime classes are not listed, the install did not complete correctly. + On a single-node cluster, retry after a few minutes only if Helm returned before the ``kata-deploy`` pod reaches ``Running`` (refer to the note above). + Otherwise, refer to :doc:`Troubleshooting `. + +**Success criteria:** Helm reports ``STATUS: deployed`` and both SNP and TDX runtime classes appear in the output above. + +.. _quickstart-install-gpu-operator: + +******************************* +Install the NVIDIA GPU Operator +******************************* + +#. Add and update the NVIDIA Helm repository: + + .. 
code-block:: console + + $ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \ + && helm repo update + + *Example Output:* + + .. code-block:: output + + "nvidia" has been added to your repositories + Hang tight while we grab the latest from your chart repositories... + ...Successfully got an update from the "nvidia" chart repository + Update Complete. ⎈Happy Helming!⎈ + +#. Install the GPU Operator configured for Confidential Containers on all nodes: + + .. code-block:: console + + $ helm install --wait --timeout 10m --generate-name \ + -n gpu-operator --create-namespace \ + nvidia/gpu-operator \ + --set sandboxWorkloads.enabled=true \ + --set sandboxWorkloads.defaultWorkload=vm-passthrough \ + --set sandboxWorkloads.mode=kata \ + --set nfd.enabled=true \ + --set nfd.nodefeaturerules=true \ + --version=v26.3.1 + + *Example Output:* + + .. code-block:: output + + NAME: gpu-operator + LAST DEPLOYED: Tue Mar 10 17:58:12 2026 + NAMESPACE: gpu-operator + STATUS: deployed + REVISION: 1 + TEST SUITE: None + + It may take 3 to 5 minutes for all GPU Operator pods to reach the Running state. + + .. note:: + The ``sandboxWorkloads.defaultWorkload=vm-passthrough`` flag sets the default cluster workload type for Confidential Containers. + +#. Verify that all GPU Operator pods are running: + + .. code-block:: console + + $ kubectl get pods -n gpu-operator + + *Example Output:* + + .. 
code-block:: output + + NAME READY STATUS RESTARTS AGE + gpu-operator-1766001809-node-feature-discovery-gc-75776475sxzkp 1/1 Running 0 86s + gpu-operator-1766001809-node-feature-discovery-master-6869lxq2g 1/1 Running 0 86s + gpu-operator-1766001809-node-feature-discovery-worker-mh4cv 1/1 Running 0 86s + gpu-operator-f48fd66b-vtfrl 1/1 Running 0 86s + nvidia-cc-manager-7z74t 1/1 Running 0 61s + nvidia-kata-sandbox-device-plugin-daemonset-d5rvg 1/1 Running 0 30s + nvidia-sandbox-validator-6xnzc 1/1 Running 0 30s + nvidia-vfio-manager-h229x 1/1 Running 0 62s + +**Success criteria:** All GPU Operator pods are ``Running`` or ``Completed``. +Your cluster is now configured to run Confidential Containers workloads on all nodes. + +********** +Next Steps +********** + +* Continue to :doc:`Run a Sample Workload ` to confirm the deployment. + +* For more installation and configuration options, refer to the :doc:`Detailed Install Guide `. + +* Continue to the :doc:`Advanced Setup Overview ` section for more post-installation configuration options. + diff --git a/confidential-containers/licensing.rst b/confidential-containers/licensing.rst index 43d76fff9..d30207776 100644 --- a/confidential-containers/licensing.rst +++ b/confidential-containers/licensing.rst @@ -16,9 +16,9 @@ .. headings # #, * *, =, -, ^, " -********* +######### Licensing -********* +######### While the Confidential Containers (CoCo) Reference Architecture includes some components that are open source, the NVIDIA Confidential Computing capability is a licensed feature for production use cases. To use these products, you must have a valid NVIDIA Confidential Computing license. diff --git a/confidential-containers/overview.rst b/confidential-containers/overview.rst index 2b9646695..638d18539 100644 --- a/confidential-containers/overview.rst +++ b/confidential-containers/overview.rst @@ -17,41 +17,45 @@ .. 
headings # #, * *, =, -, ^, " -***************************************************** +##################################################### NVIDIA Confidential Containers Reference Architecture -***************************************************** +##################################################### -NVIDIA GPUs with Confidential Computing support provide the hardware foundation for running GPU workloads inside a hardware-enforced Trusted Execution Environment (TEE). -The NVIDIA Confidential Containers Reference Architecture provides a validated deployment model for cluster administrators interested in leveraging NVIDIA GPU Confidential Computing capabilities on Kubernetes platforms. - -This documentation describes the architecture overview and the key software components, including the NVIDIA GPU Operator and Kata Containers, used to deploy and manage confidential workloads. -This architecture builds on principles of Confidential Computing and `Confidential Containers `__, the cloud-native approach to Confidential Computing. -It is recommended to be familiar with the basic concepts of Confidential Containers, including attestation, before reading this documentation. -Refer to the `Confidential Containers `__ documentation for more information. +This documentation describes NVIDIA's reference architecture for deploying CNCF Confidential Containers on compliant Confidential Computing hardware and software. +For documentation navigation by role, refer to :doc:`Personas `. .. _confidential-containers-overview: +********** Background -========== - -NVIDIA GPUs power the training and deployment of Frontier Models—world-class Large Language Models (LLMs) that define the state of the art in AI reasoning and capability. +********** -As organizations adopt these models in regulated industries such as financial services, healthcare, and the public sector, protecting model intellectual property and sensitive user data becomes essential. 
Additionally, the model deployment landscape is evolving to include public clouds, enterprise on-premises, and edge. A zero-trust posture on cloud-native platforms such as Kubernetes is essential to secure assets (model IP and enterprise private data) from untrusted infrastructure with privileged user access. +NVIDIA GPUs power the training and deployment of Large Language Models (LLMs) that define the state of the art in AI reasoning and capability. +As organizations adopt these models in regulated industries such as financial services, healthcare, and the public sector, protecting model intellectual property and sensitive user data becomes essential. +The model deployment landscape is also evolving to include public clouds, enterprise on-premises, and edge. +A zero-trust posture on cloud-native platforms such as Kubernetes is essential to secure assets (model IP and enterprise private data) from untrusted infrastructure with privileged user access. Confidential Computing (CC) addresses this gap by using hardware-based Trusted Execution Environments (TEEs), such as AMD SEV-SNP and Intel TDX, with NVIDIA Confidential Computing capabilities to provide isolation, memory encryption, and integrity verification during processing. In addition to isolation, CC provides Remote Attestation, which allows workload owners to cryptographically verify the state of a TEE before providing secrets or sensitive data. `Confidential Containers `__ (CoCo) is the cloud-native approach of CC on Kubernetes. -The Confidential Containers project leverages Kata Containers to provide the sandboxing capabilities. `Kata Containers `_ is an open-source project that provides lightweight Utility Virtual Machines (UVMs) that feel and perform like containers while providing strong workload isolation. Along with the Confidential Containers project, Kata enables the orchestration of secure, GPU-accelerated workloads in Kubernetes. 
+The Confidential Containers project leverages Kata Containers to provide the sandboxing capabilities. +`Kata Containers `_ is an open-source project that provides lightweight Utility Virtual Machines (UVMs) that feel and perform like containers while providing strong workload isolation. Along with the Confidential Containers project, Kata enables the orchestration of secure, GPU-accelerated workloads in Kubernetes. .. _coco-use-cases: +********* Use Cases -========= +********* + +The target for Confidential Containers is to enable model providers (closed and open source) and Enterprises to use the advancements of Gen AI, agnostic to the deployment model (Cloud, Enterprise, or Edge). -The target for Confidential Containers is to enable model providers (closed and open source) and Enterprises to use the advancements of Gen AI, agnostic to the deployment model (Cloud, Enterprise, or Edge). Some of the key use cases that CC and Confidential Containers enable are: +* For Model Providers: It enables the expansion of reach by allowing expensive, proprietary model weights to be deployed on-site at customer data centers without exposing the intellectual property (IP) to the customer's infrastructure administrators. +* For Adopters: It provides low-latency access to state-of-the-art frontier models within their own sovereign environment, ensuring their private prompts and data never leave their controlled premises while maintaining the security of the model provider's IP. -* **Zero-Trust AI & IP Protection:** You can deploy proprietary models (like LLMs) on third-party or private infrastructure. The model weights remain encrypted and are only decrypted inside the hardware-protected enclave, ensuring absolute IP protection from the host. -* **Data Clean Rooms:** This allows you to process sensitive enterprise data (like financial analytics or healthcare records) securely. Neither the infrastructure provider nor the model builder can see the raw data. 
+Some of the key use cases that CC and Confidential Containers enable are: + +* **Zero-Trust AI and IP Protection:** You can deploy proprietary models (such as LLMs) on third-party or private infrastructure. The model weights remain encrypted and are only decrypted inside the hardware-protected enclave, ensuring absolute IP protection from the host. +* **Data Clean Rooms:** This allows you to process sensitive enterprise data (such as financial analytics or healthcare records) securely. Neither the infrastructure provider nor the model builder can see the raw data. .. image:: graphics/CoCo-Sample-Workflow.png :alt: Sample Workflow for Securing Model IP on Untrusted Infrastructure with CoCo @@ -61,17 +65,18 @@ The target for Confidential Containers is to enable model providers (closed and .. _coco-architecture: +********************* Architecture Overview -===================== +********************* -NVIDIA's approach to the Confidential Containers architecture delivers on the key promise of Confidential Computing: confidentiality, integrity, and verifiability. -Integrating open source and NVIDIA software components with the Confidential Computing capabilities of NVIDIA GPUs, the Reference Architecture for Confidential Containers is designed to be the secure and trusted deployment model for AI workloads. +Integrating open source and NVIDIA software components with the Confidential Computing capabilities of NVIDIA GPUs, this reference architecture is designed to be the secure and trusted deployment model for AI workloads. The key values of this architecture approach are: -1. **Built on Open Source Software (OSS) standards** - The Reference Architecture for Confidential Containers is built on key OSS components such as Kata, Trustee, QEMU, OVMF, and Node Feature Discovery (NFD), along with hardened NVIDIA components like NVIDIA GPU Operator. -2. 
**Highest level of isolation** - The Confidential Containers architecture is built on Kata containers, which is the industry standard for providing hardened sandbox isolation, and augmenting it with support for GPU passthrough to Kata containers makes the base of the Trusted Execution Environment (TEE). -3. **Zero-trust execution with attestation** - Ensuring the trust of the model providers/data owners by providing a full-stack verification capability with attestation. The integration of NVIDIA GPU attestation capabilities with Trustee based architecture, to provide composite attestation provides the base for secure, attestation based key-release for encrypted workloads, deployed inside the TEE. +1. Built on Open Source Software (OSS) standards: this reference architecture is built on key OSS components such as Kata, Trustee, QEMU, OVMF, and Node Feature Discovery (NFD), along with NVIDIA components such as the NVIDIA GPU Operator. +2. Highest level of isolation: the Confidential Containers architecture is built on Kata Containers, the industry standard for hardened sandbox isolation. Adding GPU passthrough support to Kata Containers forms the base of the Trusted Execution Environment (TEE). +3. Zero-trust execution with attestation: full-stack verification through attestation establishes trust for model providers and data owners. + Integrating NVIDIA GPU attestation with the Trustee-based architecture provides composite attestation, which is the basis for secure, attestation-based key release for encrypted workloads deployed inside the TEE. .. image:: graphics/CoCo-Reference-Architecture.png :alt: High-Level Reference Architecture for Confidential Containers @@ -89,8 +94,9 @@ The components are described in more detail in the next section. .. 
_coco-supported-platforms-components: +*********************************************** Software Components for Confidential Containers -=============================================== +*********************************************** The following is a brief overview of the software components in NVIDIA's Reference Architecture for Confidential Containers. Refer to the diagram above for a visual representation of the components. @@ -119,7 +125,7 @@ The GPU Operator uses node labels to manage the deployment of components to the These components include: * NVIDIA Confidential Computing Manager (cc-manager) for Kubernetes: Sets the confidential computing (CC) mode on the NVIDIA GPUs. - By default, the Confidential Computing Manager will transition all NVIDIA GPUs to the Confidential Computing mode, if they are not already in that mode. + By default, the Confidential Computing Manager will transition all NVIDIA GPUs to Confidential Computing mode, if they are not already in that mode. * NVIDIA Kata Sandbox Device Plugin: Creates host-side Container Device Interface (CDI) specifications for GPU passthrough and discovers NVIDIA GPUs along with their capabilities, advertises these to Kubernetes, and allocates GPUs during pod deployment. Allocatable GPU resources are advertised as type ``nvidia.com/pgpu`` by default. * NVIDIA VFIO Manager: Binds discovered NVIDIA GPUs and NVSwitches to the vfio-pci driver for VFIO passthrough. @@ -128,24 +134,29 @@ Refer to the :doc:`NVIDIA GPU Operator ` documentation for more **Node Feature Discovery (NFD)** -Bootstraps the node by advertising the node features using labels to make sophisticated scheduling decisions, like installing the Kata/CoCo stack only on the nodes that support the CC prerequisites for CPU and GPU. This feature directs the Operator to install node feature rules that detect CPU security features and the NVIDIA GPU hardware. 
+Bootstraps the node by advertising the node features using labels to make sophisticated scheduling decisions, such as installing the Kata/CoCo stack only on the nodes that support the CC prerequisites for CPU and GPU. +This directs the Operator to install node feature rules that detect CPU security features and the NVIDIA GPU hardware. Refer to the `Node Feature Discovery documentation `_ for upstream usage and reference material. The project source repository is `kubernetes-sigs/node-feature-discovery `_ on GitHub. -This component is deployed and managed by default by the GPU Operator. +This component is typically deployed and managed by default by the GPU Operator. **Snapshotter (for example, Nydus)** -Handles the container image "guest pull" functionality. Used as a remote snapshotter, it bypasses image pulls on the host. Instead, it fetches and unpacks encrypted and signed container images directly inside the protected guest memory, keeping proprietary contents hidden and ensuring image integrity. +Handles the container image "guest pull" functionality. +Used as a remote snapshotter, it bypasses image pulls on the host. +Instead, the snapshotter fetches and unpacks encrypted and signed container images directly inside the protected guest memory, keeping proprietary contents hidden and ensuring image integrity. **Kata Agent and Agent Security Policy** -Runs inside the guest VM to manage the container lifecycle while enforcing a strict, immutable agent security policy based on Rego (regorus). This blocks the untrusted host from executing unauthorized commands, such as a malicious ``kubectl exec``. +Runs inside the guest VM to manage the container lifecycle while enforcing a strict, immutable agent security policy based on Rego (regorus). +This blocks the untrusted host from executing unauthorized commands, such as a malicious ``kubectl exec``. 
**Trustee and Attestation Service** -Attestation and key brokering framework (which includes the Key Broker Service and Attestation Service). It acts as the cryptographic gatekeeper, verifying hardware/software evidence and only releasing secrets if the environment is proven secure. +Attestation and key brokering framework (which includes the Key Broker Service and Attestation Service). +It acts as the cryptographic gatekeeper, verifying hardware/software evidence and only releasing secrets if the environment is proven secure. **Confidential Data Hub (CDH)** @@ -160,7 +171,7 @@ A minimal hardened init system that securely bootstraps the guest environment, l .. _coco-gpu-operator-cluster-topology: GPU Operator Cluster Topology Considerations --------------------------------------------- +============================================ The GPU Operator deploys and manages components for allocating and utilizing the GPU resources on your cluster. Depending on how you configure the Operator, different components are deployed on the worker nodes. @@ -183,21 +194,22 @@ Consider the following example where node A is configured to run traditional con * Node Feature Discovery * NVIDIA GPU Feature Discovery - * NVIDIA Confidential Computing Manager for Kubernetes - * NVIDIA Sandbox Device Plugin + * NVIDIA Kata Sandbox Device Plugin * NVIDIA VFIO Manager * Node Feature Discovery -This configuration can be controlled through node labelling, as described in the :doc:`Confidential Containers deployment guide `. +This configuration can be controlled through node labelling, as described in :doc:`Detailed Install Guide `. +******************************************* Supported Features and Deployment Scenarios -=========================================== +******************************************* The following features are supported with Confidential Containers: * Support for Confidential Container workloads as - * Single-GPU passthrough (one physical GPU per pod). 
- * Multi-GPU passthrough on NVSwitch (NVLink) based HGX systems. + * :ref:`Single-GPU passthrough ` (one physical GPU per pod). + * :ref:`Multi-GPU passthrough ` on NVSwitch (NVLink) based HGX systems. .. note:: @@ -214,12 +226,13 @@ The following features are supported with Confidential Containers: * Ephemeral container data and image layer storage. * Lifecycle management of Kata Containers through the `Kata Lifecycle Manager `_. -More information on these features can be found in the `Confidential Containers documentation `_. +More information on these features can be found in the `Confidential Containers documentation `__. .. _coco-limitations: +**************************** Limitations and Restrictions -============================ +**************************** * NVIDIA supports the GPU Operator and confidential computing with the containerd runtime only. * All GPUs on the host must be configured for Confidential Computing. @@ -241,7 +254,7 @@ Limitations and Restrictions Refer to the `QEMU IOMMUFD documentation `_ for more information. Security Considerations ------------------------ +======================= * Application security defects: Confidential Computing does not protect against threats within the confidential VM, including vulnerabilities in the application itself. Applications must still follow security best practices such as input validation. @@ -259,11 +272,13 @@ Security Considerations * Availability: Confidential Computing does not provide availability guarantees. Achieve availability through replication, which is standard practice in Kubernetes deployments. +********** Next Steps -========== -Refer to the following pages to learn more about deploying with Confidential Containers: +********** + +To deploy on your cluster, start with the **Install** section: -.. grid:: 3 +.. grid:: 2 :gutter: 3 .. 
grid-item-card:: :octicon:`server;1.5em;sd-mr-1` Supported Platforms @@ -272,15 +287,23 @@ Refer to the following pages to learn more about deploying with Confidential Con Hardware, OS, and component versions validated for general availability (GA). - .. grid-item-card:: :octicon:`rocket;1.5em;sd-mr-1` Deploy Confidential Containers + .. grid-item-card:: :octicon:`checklist;1.5em;sd-mr-1` Prerequisites + :link: prerequisites + :link-type: doc + + Prepare worker nodes and the Kubernetes cluster. + + .. grid-item-card:: :octicon:`rocket;1.5em;sd-mr-1` Detailed Install Guide :link: confidential-containers-deploy :link-type: doc - Deploy with the NVIDIA GPU Operator on Kubernetes. + Install Kata Containers and the NVIDIA GPU Operator on Kubernetes. - .. grid-item-card:: :octicon:`shield-check;1.5em;sd-mr-1` Attestation - :link: attestation + .. grid-item-card:: :octicon:`play;1.5em;sd-mr-1` Run a Sample Workload + :link: run-sample-workload :link-type: doc - Remote attestation, Trustee, and the NVIDIA verifier for GPU workloads. + Verify the deployment; success is ``Test PASSED`` in pod logs. + +After installation, refer to :doc:`Advanced Setup Overview ` for attestation, CC mode, and workload configuration. diff --git a/confidential-containers/personas.rst b/confidential-containers/personas.rst new file mode 100644 index 000000000..3abff4983 --- /dev/null +++ b/confidential-containers/personas.rst @@ -0,0 +1,160 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. 
+ You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + +.. _coco-using-this-guide: + +######## +Personas +######## + +This page provides an overview of the prior knowledge recommended before implementing the architecture, the personas who own each part of the deployment, and how to navigate this documentation. + +****************** +Before You Begin +****************** + +This documentation describes NVIDIA's reference architecture and deployment recommendations for the upstream `CNCF Confidential Containers project `_ with NVIDIA GPUs. +Understanding the upstream project's goals, architecture, and threat model will give you the context needed to understand architecture decisions described in this documentation. + +Before using this documentation, you should be familiar with: + +* Confidential Containers concepts outlined in the upstream `Confidential Containers documentation `__, including the trust model, attestation flow, and key features such as sealed secrets and encrypted container images. + Start there if you are new to Confidential Computing on Kubernetes. +* Kubernetes administration and deployment experience, including deploying workloads, using ``kubectl``, and installing components with Helm. + Refer to the `Kubernetes documentation `_ if you need a foundation. +* Confidential Computing hardware, including familiarity with AMD SEV-SNP or Intel TDX and an understanding of which technology your target hardware uses. + +The documentation on this site is specific to deploying Confidential Containers on NVIDIA GPUs with Kata Containers and the NVIDIA GPU Operator.
+It covers the steps you take to enable and configure these components on your cluster to align with the NVIDIA Reference Architecture for Confidential Containers. +For more advanced Confidential Containers topics, refer to the upstream `Confidential Containers documentation `__. + +******** +Personas +******** + +The personas used throughout this documentation describe who is responsible for each stage of enabling and managing Confidential Computing, from hardware selection through workload deployment. +Depending on your role, you may complete several sections or only a subset. + +.. list-table:: + :header-rows: 1 + :widths: 28 42 30 + + * - Persona + - Responsibilities + - Start here + * - :ref:`Hardware IT Administrator ` + - Selects Confidential Computing-capable CPU and GPU hardware and configures BIOS/UEFI settings. + - :doc:`Supported Platforms ` + * - :ref:`Host OS Administrator ` + - Prepares the host operating system after hardware and BIOS configuration are complete. + - :doc:`Supported Platforms ` + * - :ref:`Kubernetes Cluster Administrator ` + - Installs and manages the Kubernetes cluster and the Confidential Containers software stack. + - :doc:`Prerequisites ` + * - :ref:`Security Engineer ` + - Validates Confidential Computing configuration, attestation policy, and secret release for workloads. + - :doc:`Attestation Quickstart ` + * - :ref:`Container User ` + - Deploys confidential GPU workloads on a prepared cluster. + - :doc:`Configuring Workloads ` + +.. _coco-persona-hardware-it-administrator: + +*************************** +Hardware IT Administrator +*************************** + +The Hardware IT Administrator is near the beginning of the Confidential Computing workflow. +This persona selects the correct CPU and GPU part numbers and configures BIOS/UEFI settings for subsequent steps. +Typical roles include system architect and IT administrator. 
+ +Relevant pages in this documentation: + +* :doc:`Supported Platforms `: validated CPU, GPU, OS, and component version combinations that NVIDIA has tested with Confidential Containers. + +For BIOS configuration and hardware setup, refer to the `NVIDIA Confidential Computing Deployment Guide `_ *Hardware IT Administrator* section. + +.. _coco-persona-host-os-administrator: + +********************** +Host OS Administrator +********************** + +The Host OS Administrator receives a system with BIOS/UEFI configured for Confidential Computing and prepares the host operating system. +This persona is responsible for host OS selection, initial configuration, and validation before confidential workloads can run. +Typical roles include system architect, cloud administrator, or advanced on-premises user. + +Relevant pages in this documentation: + +* :doc:`Supported Platforms `: validated host OS and kernel versions. + +For host OS setup, refer to the `NVIDIA Confidential Computing Deployment Guide `_ *Host OS Administrator* section. + +.. _coco-persona-kubernetes-cluster-administrator: + +******************************** +Kubernetes Cluster Administrator +******************************** + +The Kubernetes Cluster Administrator is responsible for installing and managing the Kubernetes cluster and the Confidential Containers software stack. +This persona could be a platform engineer with cluster-admin access to the API, host access to worker nodes, and familiarity with Helm and ``kubectl``. +This persona performs the initial deployment and is responsible for day-two operations such as upgrades and Confidential Computing mode changes. + +Relevant pages: + +* :doc:`Reference Architecture `: understand the software components and how they fit together. +* :doc:`Prerequisites `: prepare worker nodes and the Kubernetes cluster. +* :doc:`Quickstart Install `: minimal steps to install Kata Containers and the GPU Operator. 
+* :doc:`Detailed Install Guide `: install with per-node labeling and additional configuration options. +* :doc:`Run a Sample Workload `: confirm the deployment was successful. +* :doc:`Managing the Confidential Computing Mode `: change the CC mode on GPUs at the cluster or node level as needed. +* :doc:`Troubleshooting `: resolve install and deploy failures (for example :ref:`Insufficient nvidia.com/pgpu `). + +.. _coco-persona-security-engineer: + +***************** +Security Engineer +***************** + +The Security Engineer might or might not be the Kubernetes Cluster Administrator. +Their work may cover attestation services, reference values, policies, and secret release for confidential workloads. +Typical roles include security engineer, platform security, or DevSecOps. + +Relevant pages: + +* :doc:`Reference Architecture `: understand the use cases, trust model, and how workloads are isolated from the infrastructure. +* :doc:`Attestation Quickstart `: stand up a local Trustee instance and verify connectivity. + Attestation is required for workloads that use secrets, encrypted container images, or authenticated registries. + +For production attestation workflows, secret management, and policy configuration, refer to the upstream `Confidential Containers attestation documentation `_. + +.. _coco-persona-container-user: + +************** +Container User +************** + +The Container User deploys confidential applications on a system that is already configured for Confidential Computing. +In this documentation, that means deploying confidential GPU workloads with Kubernetes manifests on a cluster that the Kubernetes Cluster Administrator has prepared. +This persona works primarily with Kubernetes workload manifests and does not require host access to worker nodes. + +Relevant pages: + +* :doc:`Configuring Workloads `: runtime class selection, GPU and NVSwitch resource types, and single- or multi-GPU passthrough manifests. 
+* :doc:`Run a Sample Workload `: run the reference workload to confirm the cluster is ready before deploying your own application. +* :doc:`Advanced Setup Overview `: choose attestation, CC mode, and workload configuration topics after install. diff --git a/confidential-containers/prerequisites.rst b/confidential-containers/prerequisites.rst new file mode 100644 index 000000000..f0a1c8de4 --- /dev/null +++ b/confidential-containers/prerequisites.rst @@ -0,0 +1,309 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _coco-prerequisites: + +############# +Prerequisites +############# + +As a :ref:`Kubernetes Cluster Administrator `, prepare hosts and the Kubernetes cluster before you install Kata Containers and the NVIDIA GPU Operator. +You perform most steps in this section. +If you do not have access to host firmware, coordinate with your :ref:`Hardware IT Administrator ` or :ref:`Host OS Administrator ` to confirm or implement hardware prerequisites. + +For validated hardware and software versions, refer to :doc:`Supported Platforms `. +Use the checklists below for an at-a-glance summary, then follow each linked section for verification steps. + +**Hardware prerequisites** + +.. 
list-table:: + :header-rows: 1 + :widths: 30 70 + + * - Prerequisite + - Details + * - :ref:`Use a supported platform ` + - CPU, GPU, and host OS match :doc:`Supported Platforms ` + * - :ref:`Hardware virtualization and ACS enabled ` + - Hardware virtualization and ACS enabled in host BIOS + * - :ref:`IOMMU enabled ` + - IOMMU enabled on each host through the kernel command line (``amd_iommu=on`` or ``intel_iommu=on``) + * - :ref:`No host NVIDIA GPU drivers ` + - No NVIDIA GPU drivers installed or loaded on worker hosts. + +**Cluster prerequisites** + +.. list-table:: + :header-rows: 1 + :widths: 30 70 + + * - Prerequisite + - Details + * - :ref:`A Kubernetes cluster and cluster administrator access ` + - Cluster administrator access to a Kubernetes cluster running a supported version (refer to :ref:`Supported Software Components `) + * - :ref:`containerd 2.2.2 installed ` + - containerd 2.2.2 installed on each GPU worker node + * - :ref:`Helm installed ` + - Helm installed on your cluster administration system + * - :ref:`Kubelet configured ` + - Enable ``KubeletPodResourcesGet`` (required before Kubernetes v1.34) and ``RuntimeClassInImageCriApi`` feature gates; set ``runtimeRequestTimeout: 20m`` on GPU worker nodes + +***************** +Hardware and BIOS +***************** + +.. _coco-prereq-supported-platform: + +Supported Platform +================== + +Your hosts must use a platform validated for Confidential Computing in :doc:`Supported Platforms `. +Confirm with your :ref:`Hardware IT Administrator ` and :ref:`Host OS Administrator ` that any platform-specific BIOS, firmware, or OS steps are in place before continuing. + +.. _coco-prereq-hw-virtualization: + +Hardware Virtualization and ACS Enabled +======================================= + +Confirm with your :ref:`Hardware IT Administrator ` that your hosts are configured to enable hardware virtualization and Access Control Services (ACS). 
+With some AMD CPUs and BIOSes, ACS might be grouped under Advanced Error Reporting (AER). +Enable these features in the host BIOS if they are not already enabled. + +.. _coco-prereq-iommu: + +IOMMU Enabled +============= + +IOMMU must be enabled on all hosts that will run Confidential Containers workloads. + +#. Check whether IOMMU is already enabled: + + .. code-block:: console + + $ ls /sys/kernel/iommu_groups + + If the output lists numbered groups (``0``, ``1``, and so on), IOMMU is enabled. + + If the output is empty or the directory is missing, IOMMU is not enabled. + +#. If IOMMU is not enabled, add the appropriate kernel command-line argument to ``/etc/default/grub``: + + * ``amd_iommu=on`` for AMD CPUs + * ``intel_iommu=on`` for Intel CPUs + + .. tab-set:: + + .. tab-item:: AMD-based system (SNP) + :sync: amd-snp + + .. code-block:: console + + ... + GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on modprobe.blacklist=nouveau" + ... + + .. tab-item:: Intel-based system (TDX) + :sync: intel-tdx + + .. code-block:: console + + ... + GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on modprobe.blacklist=nouveau" + ... + +#. Update the bootloader configuration: + + .. code-block:: console + + $ sudo update-grub + + *Example Output:* + + .. code-block:: output + + Sourcing file `/etc/default/grub' + Generating grub configuration file ... + Found linux image: /boot/vmlinuz-5.15.0-generic + Found initrd image: /boot/initrd.img-5.15.0-generic + done + +#. Reboot the host. + +.. note:: + + After configuring IOMMU, you might see QEMU warnings about PCI P2P DMA when running GPU workloads. + These are expected and can be safely ignored. + Refer to :ref:`coco-limitations` for details. + +.. _coco-prereq-no-host-drivers: + +Ensure No Host NVIDIA GPU Drivers Are Present +============================================= + +Confidential Containers pass GPUs to the confidential virtual machine through VFIO. 
+Host-installed NVIDIA drivers prevent VFIO from binding the devices and must not be present on those hosts. +In this architecture, the NVIDIA GPU Operator handles GPU driver installation and lifecycle management when you follow the :doc:`Detailed Install Guide `. + +#. On each host, check whether NVIDIA GPU drivers are loaded: + + .. code-block:: console + + $ lsmod | grep nvidia + + If the command produces no output, no NVIDIA GPU drivers are loaded. + +#. If drivers are installed or loaded on any host, remove them. + + Refer to `Removing the Driver `_ in the NVIDIA Driver Installation Guide. + +****************** +Kubernetes Cluster +****************** + +The following sections describe requirements for worker nodes and for the system you use for cluster administration. + +.. _coco-prereq-cluster-admin: + +Kubernetes Cluster and Cluster Administrator Access +=================================================== + +You must have cluster administrator access to a Kubernetes cluster running a supported Kubernetes version. +Refer to the :ref:`Supported Software Components ` section in :doc:`Supported Platforms ` for supported Kubernetes and component versions. + +.. _coco-prereq-containerd: + +containerd 2.2.2 +================ + +Verify the installed version on each GPU worker node: + +.. code-block:: console + + $ containerd --version + +*Example Output:* + +.. code-block:: output + + containerd containerd.io 2.2.2 ... + +If you are running a different version on any worker node, refer to the `containerd Getting Started guide `_ for installation instructions. + +.. _coco-prereq-helm: + +Helm +==== + +Helm is used to install the NVIDIA GPU Operator and Kata Containers. + +Verify that Helm is installed on the system you use for cluster administration: + +.. code-block:: console + + $ helm version + +*Example Output:* + +.. 
code-block:: output + + version.BuildInfo{Version:"v3.14.0", GitCommit:"...", GitTreeState:"clean", GoVersion:"go1.21.6"} + +Your exact version details may vary. + +If Helm is not installed or the command is not found, refer to the `Helm documentation `_ for installation instructions. + +.. _coco-prereq-kubelet: +.. _configure-image-pull-timeouts: + +Kubelet Configured +================== + +On GPU worker nodes, the kubelet configuration (typically ``/var/lib/kubelet/config.yaml``) must include the required feature gates and an extended image pull timeout. + +Confidential Containers require these kubelet feature gates: + +* ``KubeletPodResourcesGet``: Allows the Kata runtime to query the kubelet Pod Resources API and discover GPUs allocated to a sandbox. + +* ``RuntimeClassInImageCriApi``: Alpha since Kubernetes v1.29; required for pods that use multiple snapshotters side by side. + +On Kubernetes v1.34 and later, ``KubeletPodResourcesGet`` is enabled by default. +On versions before v1.34, enable it explicitly. +``RuntimeClassInImageCriApi`` must be enabled explicitly on all supported versions. + +Increase the ``runtimeRequestTimeout`` from the 2-minute default to ``20m`` to avoid timeouts when pulling large GPU workload images. +If a pull exceeds the timeout before the container is running, the kubelet de-allocates the pod. +Actual pull duration varies with image size and network throughput, so this guide uses 20 minutes as a conservative ceiling that accommodates most workload images. + +Apply these settings as follows: + +#. Open the kubelet configuration file: + + .. code-block:: console + + $ sudo nano /var/lib/kubelet/config.yaml + + This is typically located at ``/var/lib/kubelet/config.yaml``, but your configuration file may be in a different location. + +#. Add the required settings to the kubelet configuration file. + Select the tab that matches your Kubernetes version: + + .. tab-set:: + + .. 
tab-item:: Kubernetes v1.34 and later + :sync: k8s-1-34-plus + + .. code-block:: yaml + + apiVersion: kubelet.config.k8s.io/v1beta1 + kind: KubeletConfiguration + featureGates: + RuntimeClassInImageCriApi: true + runtimeRequestTimeout: 20m + + .. tab-item:: Kubernetes earlier than v1.34 + :sync: k8s-pre-1-34 + + .. code-block:: yaml + + apiVersion: kubelet.config.k8s.io/v1beta1 + kind: KubeletConfiguration + featureGates: + KubeletPodResourcesGet: true + RuntimeClassInImageCriApi: true + runtimeRequestTimeout: 20m + + If your kubelet configuration already defines ``featureGates`` or ``runtimeRequestTimeout``, merge these settings into the existing file instead of replacing it. + +#. Restart the kubelet service: + + .. code-block:: console + + $ sudo systemctl restart kubelet + +.. note:: + + If you need a timeout longer than 1200 seconds (20 minutes), also adjust the Kata Agent ``image_pull_timeout``. + This setting controls the Confidential Data Hub image pull API timeout in seconds. + Add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass a value in the pod annotation ``io.katacontainers.config.hypervisor.kernel_params``. + +********** +Next Steps +********** + +After completing the prerequisites, proceed to :doc:`Quickstart Install ` for a minimal install, or :doc:`Detailed Install Guide ` for full configuration details. diff --git a/confidential-containers/release-notes.rst b/confidential-containers/release-notes.rst index 5f7ddbe4f..94e430ee9 100644 --- a/confidential-containers/release-notes.rst +++ b/confidential-containers/release-notes.rst @@ -18,9 +18,9 @@ .. _coco-release-notes: -************* +############# Release Notes -************* +############# This document describes the new features and known issues for the NVIDIA Confidential Containers Reference Architecture. @@ -28,8 +28,9 @@ This document describes the new features and known issues for the NVIDIA Confide .. 
_coco-v1.0.0: +***** 1.0.0 -===== +***** This is the initial general availability (GA) release of the NVIDIA Confidential Containers Reference Architecture, a validated deployment model for running GPU-accelerated AI workloads inside hardware-enforced Trusted Execution Environments (TEEs). It is designed for organizations in regulated industries that require strong isolation and cryptographic verification to protect model intellectual property and sensitive data on untrusted infrastructure. @@ -37,7 +38,7 @@ It is designed for organizations in regulated industries that require strong iso The architecture combines NVIDIA GPU Confidential Computing, Kata Containers, and the NVIDIA GPU Operator to provide a secure, attestable, Kubernetes-native platform for confidential AI workloads. Key Features ------------- +============ * This release supports HGX platforms with: @@ -66,7 +67,7 @@ Key Features Limitations and Restrictions ----------------------------- +============================ * NVIDIA supports the GPU Operator and confidential computing with the containerd runtime only. diff --git a/confidential-containers/run-sample-workload.rst b/confidential-containers/run-sample-workload.rst new file mode 100644 index 000000000..93c68eddd --- /dev/null +++ b/confidential-containers/run-sample-workload.rst @@ -0,0 +1,157 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + + +.. _coco-run-sample-workload: + +##################### +Run a Sample Workload +##################### + +As a :ref:`Kubernetes Cluster Administrator `, use this page to verify your installation and run a sample workload. +:ref:`Container User ` personas can also run the sample workload to confirm the cluster is ready before deploying applications. +For persona responsibilities and documentation structure, refer to :doc:`Personas `. + +Verify your Confidential Container setup by running a basic single-GPU sample workload inside a Confidential Container. + +This page assumes that you have completed :doc:`Prerequisites ` and either :doc:`Quickstart Install ` or :doc:`Detailed Install Guide `. +Your cluster should have ``kata-qemu-nvidia-gpu-snp`` and ``kata-qemu-nvidia-gpu-tdx`` runtime classes installed, and GPU Operator operands (including the Confidential Computing Manager, Kata Sandbox Device Plugin, and VFIO Manager) running on your nodes. + +This page intentionally uses the simplest possible manifest so that you can confirm the deployment end-to-end. +It is not a production workload template. +For runtime class selection, resource type naming, multi-GPU passthrough, and additional manifest patterns, refer to :doc:`Configuring Workloads `. + +#. Create a file named ``cuda-vectoradd-kata.yaml`` with a sample manifest for your system: + + .. tab-set:: + + .. tab-item:: AMD-based system (SNP) + :sync: amd-snp + + .. 
code-block:: yaml + :emphasize-lines: 7,14 + + apiVersion: v1 + kind: Pod + metadata: + name: cuda-vectoradd-kata + namespace: default + spec: + runtimeClassName: kata-qemu-nvidia-gpu-snp + restartPolicy: Never + containers: + - name: cuda-vectoradd + image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" + resources: + limits: + nvidia.com/pgpu: "1" # for single GPU passthrough + memory: 16Gi + + .. tab-item:: Intel-based system (TDX) + :sync: intel-tdx + + .. code-block:: yaml + :emphasize-lines: 7,14 + + apiVersion: v1 + kind: Pod + metadata: + name: cuda-vectoradd-kata + namespace: default + spec: + runtimeClassName: kata-qemu-nvidia-gpu-tdx + restartPolicy: Never + containers: + - name: cuda-vectoradd + image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04" + resources: + limits: + nvidia.com/pgpu: "1" # for single GPU passthrough + memory: 16Gi + + The following is a brief list of the options available for the manifest: + + * Runtime class: Use ``kata-qemu-nvidia-gpu-snp`` on AMD-based systems or ``kata-qemu-nvidia-gpu-tdx`` on Intel-based systems. + * GPU resource type: The sample requests ``nvidia.com/pgpu``, which is the default resource name advertised by the NVIDIA Kata Sandbox Device Plugin. + If your cluster was installed with the ``P_GPU_ALIAS=""`` setting, replace it with the model-specific name advertised on your node, for example ``nvidia.com/GH100_H200_141GB``. + + Refer to :doc:`Configuring Confidential Container Workloads ` for additional guidance on each option. + +#. Create the pod: + + .. code-block:: console + + $ kubectl apply -f cuda-vectoradd-kata.yaml + + *Example Output:* + + .. code-block:: output + + pod/cuda-vectoradd-kata created + +#. Verify the pod is running: + + .. code-block:: console + + $ kubectl get pod cuda-vectoradd-kata + + *Example Output:* + + .. 
code-block:: output + + NAME READY STATUS RESTARTS AGE + cuda-vectoradd-kata 1/1 Running 0 10s + + The status might instead be ``Completed`` if the container has already finished successfully. + + If the pod stays ``Pending`` for more than a few minutes, refer to :ref:`Pod Stuck in Pending State with Insufficient nvidia.com/pgpu Error ` in :doc:`Troubleshooting ` before continuing. + +#. View the logs from the pod after the container starts: + + .. code-block:: console + + $ kubectl logs -n default cuda-vectoradd-kata + + *Example Output:* + + .. code-block:: output + + [Vector addition of 50000 elements] + Copy input data from the host memory to the CUDA device + CUDA kernel launch with 196 blocks of 256 threads + Copy output data from the CUDA device to the host memory + Test PASSED + Done + + The output should include ``Test PASSED`` if the container completed successfully. + + If you do not see any log output, make sure the pod is running and the container has started. + +#. Delete the pod: + + .. code-block:: console + + $ kubectl delete -f cuda-vectoradd-kata.yaml + + + +********** +Next Steps +********** + +* Refer to the :doc:`Advanced Setup Overview ` section for more information on managing the Confidential Computing mode and configuring workloads. diff --git a/confidential-containers/supported-platforms.rst b/confidential-containers/supported-platforms.rst index 986d170ef..3765540c1 100644 --- a/confidential-containers/supported-platforms.rst +++ b/confidential-containers/supported-platforms.rst @@ -18,17 +18,24 @@ .. _coco-supported-platforms: -******************* +################### Supported Platforms -******************* +################### Following are the platforms supported by the NVIDIA Confidential Containers Reference Architecture.
-Supported Hardware Platform -=========================== +This page is relevant to the following users: + +* The :ref:`Hardware IT Administrator ` uses the hardware tables to confirm that the selected CPU and GPU are validated for Confidential Computing before configuring the system. +* The :ref:`Host OS Administrator ` uses the hardware tables to confirm validated host OS and kernel versions. +* The :ref:`Kubernetes Cluster Administrator ` uses the software component matrix to confirm that the correct versions are in place before beginning cluster installation. + +******** +Hardware +******** NVIDIA GPUs ------------ +=========== .. list-table:: :header-rows: 1 @@ -57,8 +64,8 @@ NVIDIA GPUs .. note:: - Multi-GPU passthrough on NVIDIA Hopper HGX systems requires that you set the Confidential Computing mode to ``ppcie`` mode. - Refer to :ref:`Managing the Confidential Computing Mode ` in the deployment guide for details. + :ref:`Multi-GPU passthrough ` on NVIDIA Hopper HGX systems requires that you set the Confidential Computing mode to ``ppcie`` mode. + Refer to :doc:`Managing the Confidential Computing Mode ` for details. .. note:: @@ -66,7 +73,7 @@ NVIDIA GPUs Configuring only some GPUs on a node for Confidential Computing is not supported. CPU Platforms -------------- +============= .. flat-table:: :header-rows: 1 @@ -97,8 +104,10 @@ For additional resources on machine setup: .. _coco-supported-software-components: + +***************************** Supported Software Components ------------------------------ +***************************** .. flat-table:: :header-rows: 1 diff --git a/confidential-containers/troubleshooting.rst b/confidential-containers/troubleshooting.rst new file mode 100644 index 000000000..4967cfe0b --- /dev/null +++ b/confidential-containers/troubleshooting.rst @@ -0,0 +1,316 @@ +.. license-header + SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
+ SPDX-License-Identifier: Apache-2.0 + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +.. headings # #, * *, =, -, ^, " + +.. _coco-deploy-troubleshooting: + +############### +Troubleshooting +############### + +Use this page when Confidential Containers installation or workload deployment steps fail. + +Refer to the :doc:`NVIDIA GPU Operator troubleshooting guide ` for general operator issues such as driver daemonsets, the container toolkit, and validator pods. +The sections below cover Confidential Containers-specific deploy failures: CC node labels, Kata runtime installation, and host prerequisites. + +If these steps do not resolve your issue, refer to :ref:`Getting Help `. + +.. _coco-gpu-operator-logs: + +********************** +View GPU Operator Logs +********************** + +#. Get the list of GPU Operator pods: + + .. code-block:: console + + $ kubectl get pods -n gpu-operator + + *Example Output:* + + .. code-block:: output + + NAME READY STATUS RESTARTS AGE + gpu-operator-1766001809-node-feature-discovery-gc-75776475sxzkp 1/1 Running 0 86s + gpu-operator-1766001809-node-feature-discovery-master-6869lxq2g 1/1 Running 0 86s + gpu-operator-1766001809-node-feature-discovery-worker-mh4cv 1/1 Running 0 86s + gpu-operator-f48fd66b-vtfrl 1/1 Running 0 86s + nvidia-cc-manager-7z74t 1/1 Running 0 61s + nvidia-kata-sandbox-device-plugin-daemonset-d5rvg 1/1 Running 0 30s + nvidia-sandbox-validator-6xnzc 1/1 Running 0 30s + nvidia-vfio-manager-h229x 1/1 Running 0 62s + +#. 
Get specific logs for a pod: + + .. code-block:: console + + $ kubectl logs -n gpu-operator + + Replace ```` with the name of the GPU Operator pod from ``kubectl get pods -n gpu-operator``. + + +************************* +View Kata Containers Logs +************************* + +#. Get the list of Kata Containers pods: + + .. code-block:: console + + $ kubectl get pods -n kata-system + + *Example Output:* + + .. code-block:: output + + NAME READY STATUS RESTARTS AGE + kata-deploy- 1/1 Running 0 6m37s + +#. View the logs for the Kata Containers pod: + + .. code-block:: console + + $ kubectl logs -n kata-system + + Replace ```` with the name of the Kata Containers pod from ``kubectl get pods -n kata-system``. + + +.. _coco-cc-mode-troubleshoot: + +****************************************************************** +``nvidia.com/cc.mode.state`` Not Matching ``nvidia.com/cc.mode`` +****************************************************************** + +When changing the Confidential Computing mode (refer to :doc:`Managing the Confidential Computing Mode `), the Confidential Computing Manager updates the ``nvidia.com/cc.mode.state`` label to reflect the current state of the Confidential Computing mode. +If the ``nvidia.com/cc.mode.state`` does not match the desired CC mode (``on``, ``off``, or ``ppcie``), it means the Confidential Computing update is still ongoing. +Wait a few more minutes, then check the labels again. + +.. code-block:: console + + $ kubectl get node $NODE_NAME -o json | \ + jq '.metadata.labels | with_entries(select(.key | startswith("nvidia.com/cc")))' + +*Example Output:* + +.. code-block:: json + + { + "nvidia.com/cc.mode": "on", + "nvidia.com/cc.mode.state": "on", + "nvidia.com/cc.ready.state": "true" + } + +.. 
_coco-cc-mode-failed: + +****************************************** +``nvidia.com/cc.mode.state`` is ``failed`` +****************************************** + +When the ``nvidia.com/cc.mode.state`` is ``failed``, it means there was a problem updating the Confidential Computing mode on the GPU. + +**Checks:** + +#. Confirm no user workloads are running on the node before changing CC mode. + List pods scheduled on the node: + + .. code-block:: console + + $ export NODE_NAME="" + $ kubectl get pods -A --field-selector spec.nodeName=$NODE_NAME -o wide + + This lists pods on ``$NODE_NAME``. + ``kube-system`` DaemonSets (for example CNI or ``kube-proxy``) are expected on every worker node. + ``gpu-operator`` and ``kata-system`` pods are expected only if this node is configured for Confidential Containers (labeled ``nvidia.com/gpu.workload.config=vm-passthrough`` or cluster-wide ``sandboxWorkloads.defaultWorkload=vm-passthrough``). + Delete or reschedule any other ``Running`` pods (especially GPU workloads) before changing CC mode. + +#. View ``nvidia-cc-manager`` pod logs: + + .. code-block:: console + + $ kubectl logs -n gpu-operator nvidia-cc-manager- + + Replace ```` with the name of the ``nvidia-cc-manager`` pod from ``kubectl get pods -n gpu-operator``. + +#. Confirm hardware virtualization and ACS are enabled in the host BIOS. + One way to do this is to check for ``vmx`` (Intel) or ``svm`` (AMD) in ``/proc/cpuinfo``. + For ACS, coordinate with your :ref:`Hardware IT Administrator ` if needed. + +#. Re-apply the desired mode label to retry the transition: + + .. code-block:: console + + $ kubectl label node $NODE_NAME nvidia.com/cc.mode=on --overwrite + +For mode configuration options, refer to :doc:`Managing the Confidential Computing Mode `. + +.. 
_coco-container-creating-cold-plug: + +************************************************************************* +Pod Stuck in ``ContainerCreating`` with ``device cold plug failed`` Error +************************************************************************* + +If ``kubectl describe pod -n `` shows the following error and the pod is stuck in the ``ContainerCreating`` state, the ``KubeletPodResourcesGet`` feature gate is not enabled on the worker node. +Refer to the Kubelet Configuration section in :doc:`Prerequisites ` for more information on setting the feature gate. + +.. code-block:: output + + Events: + Type Reason Age From Message + ---- ------ ---- ---- ------- + Warning FailedCreatePodSandBox 19s (x16 over 34s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox "d0a43b5d3c6c433f011efbfacb6de3f7ac448f3d09a272cef8d43249712b12b1": failed to create containerd task: failed to create shim task: device cold plug failed: cold plug: GetPodResources failed for pod(cuda-vectoradd-kata) in namespace(default): rpc error: code = Unknown desc = PodResources API Get method disabled + +.. _coco-pending-pod: + +************************************************************************** +Pod Stuck in ``Pending`` State with ``Insufficient nvidia.com/pgpu`` Error +************************************************************************** + +If ``kubectl describe pod -n `` shows the pod stuck in the ``Pending`` state, the scheduler cannot place the pod on a node with available passthrough GPU capacity. + +.. code-block:: output + + Events: + Type Reason Age From Message + ---- ------ --- ---- ------- + Warning FailedScheduling ... default-scheduler 0/1 nodes are available: 1 Insufficient nvidia.com/pgpu. + +**Common causes:** + +* The worker node is not configured for Confidential Containers workloads. 
+* GPU Operator Confidential Containers operands are missing or not ``Running`` on the worker node. +* ``nvidia.com/pgpu`` capacity on the node is zero because GPUs are not bound to ``vfio-pci`` on the host. +* All passthrough GPUs on eligible nodes are already allocated to other pods. + +**Resolution:** + +#. Confirm GPU Operator operands are ``Running`` on the worker node: + + .. code-block:: console + + $ kubectl get pods -n gpu-operator -o wide --field-selector spec.nodeName= + + Expected Confidential Containers operands include ``nvidia-cc-manager``, ``nvidia-vfio-manager``, ``nvidia-kata-sandbox-device-plugin``, and ``nvidia-sandbox-validator``. + If an operand is not ``Running``, refer to :ref:`View GPU Operator Logs `. + +#. Confirm the node is configured for Confidential Containers workloads: + + .. code-block:: console + + $ kubectl describe node | grep nvidia.com/gpu.workload.config + + *Example Output:* + + .. code-block:: output + + nvidia.com/gpu.workload.config: vm-passthrough + + If the label is missing, add it: + + .. code-block:: console + + $ kubectl label node nvidia.com/gpu.workload.config=vm-passthrough + + If you set the cluster-wide default during installation instead of per-node labeling, confirm ``sandboxWorkloads.defaultWorkload`` is ``vm-passthrough``. + Refer to :ref:`Common GPU Operator Configuration Settings ` in :doc:`Detailed Install Guide `. + +#. Check ``nvidia.com/pgpu`` capacity on the node: + + .. code-block:: console + + $ kubectl describe node | grep nvidia.com/pgpu + + *Example Output:* + + .. code-block:: output + + nvidia.com/pgpu: 8 + nvidia.com/pgpu: 8 + + If capacity and allocatable are zero, GPUs are not available for scheduling. + On the worker host, confirm VFIO binding: + + .. code-block:: console + + $ lspci -nnk -d 10de: + + *Example Output (expected):* + + .. 
code-block:: output + + 65:00.0 3D controller [0302]: NVIDIA Corporation Device [10de:xxxx] (rev a1) + Kernel driver in use: vfio-pci + + If the output shows ``Kernel driver in use: nvidia`` or ``nouveau``, remove host drivers as described in :ref:`Ensure No Host NVIDIA GPU Drivers Are Present `. + Confirm IOMMU is enabled: + + .. code-block:: console + + $ ls /sys/kernel/iommu_groups + + If the directory is empty or missing, configure IOMMU as described in :ref:`Prerequisites `, then reboot the host. + Review ``nvidia-vfio-manager`` pod logs on the affected node in :ref:`View GPU Operator Logs `. + After fixing host prerequisites, wait for operand pods to reconcile and confirm ``nvidia.com/pgpu`` is non-zero. + +#. If the node shows non-zero ``nvidia.com/pgpu`` capacity but the pod is still ``Pending``, all GPUs may be in use. + Check allocatable capacity and running workloads on the node. + +Refer to the optional VFIO validation step in :doc:`Detailed Install Guide `. + + +.. _coco-getting-help: + +************ +Getting Help +************ + +If the steps on this page do not resolve your issue, use the resources below based on which component is failing. + +NVIDIA GPU Operator and Confidential Computing Operands +======================================================= + +For issues with GPU Operator pods or Confidential Containers operands (``nvidia-cc-manager``, ``nvidia-vfio-manager``, ``nvidia-kata-sandbox-device-plugin``, and ``nvidia-sandbox-validator``): + +#. Review the :doc:`NVIDIA GPU Operator troubleshooting guide `. +#. If the issue is not documented there, run the GPU Operator ``must-gather`` utility to collect cluster diagnostics: + + .. code-block:: console + + $ curl -o must-gather.sh -L https://raw.githubusercontent.com/NVIDIA/gpu-operator/main/hack/must-gather.sh + $ chmod +x must-gather.sh + $ ./must-gather.sh + + The utility produces an archive with manifests and logs from GPU Operator-managed components. + +#. 
Prepare a bug report and file an issue in the `NVIDIA GPU Operator GitHub repository `_. + +Kata Containers +=============== + +For issues with ``kata-deploy``, missing runtime classes, or Kata runtime failures: + +#. Search the `Kata Containers GitHub issues `_ for similar reports. +#. If no existing issue matches your problem, `open a new issue `_ in that repository. + + Include your environment details, Kata chart version, ``kata-deploy`` pod logs, and cluster configuration. + +Attestation and Upstream Confidential Containers +================================================ + +For attestation, Trustee, sealed secrets, or other upstream Confidential Containers features, refer to the `Confidential Containers documentation `__ and the `Confidential Containers GitHub repository `_. + +For NVIDIA Confidential Computing licensing requirements, refer to :doc:`Licensing `. \ No newline at end of file