5 changes: 5 additions & 0 deletions gpu-operator/cdi.rst
@@ -153,6 +153,11 @@ disable CDI and use the legacy NVIDIA Container Toolkit stack instead with the f
About the Node Resource Interface (NRI) Plugin
**********************************************

.. note::

The containerd project has not yet released a general availability (GA) version of the NRI Plugin. The implementation might change before the GA release.
Refer to the `containerd NRI repository <https://github.com/containerd/nri#api-stability>`_ for details on API stability.

Node Resource Interface (NRI) is a standardized interface for plugging in extensions, called NRI Plugins, to OCI-compatible container runtimes like containerd.
NRI Plugins serve as hooks that intercept pod and container lifecycle events and perform functions such as injecting devices into a container and applying topology-aware placement strategies. For more details on NRI, refer to the `NRI overview <https://github.com/containerd/nri/tree/main?tab=readme-ov-file#background>`_ in the containerd repository.

13 changes: 7 additions & 6 deletions gpu-operator/getting-started.rst
@@ -156,8 +156,9 @@ To view all the options, run ``helm show values nvidia/gpu-operator``.
- ``true``

* - ``cdi.nriPluginEnabled``
- When set to ``true``, the Node Resource Interface (NRI) Plugin will be used for injecting GPUs into workload containers.
In NRI Plugin mode, the NVIDIA Container Toolkit will no longer modify the runtime config.
- When set to ``true``, the Node Resource Interface (NRI) Plugin will be used for injecting GPUs into workload containers.

In NRI Plugin mode, the NVIDIA Container Toolkit will no longer modify the runtime config.
This feature requires containerd v1.7.30, v2.1.x, or v2.2.x.
Refer to the :doc:`cdi` page for more information.
- ``false``
@@ -512,8 +513,9 @@ Specifying Configuration Options for containerd

.. note::

It's recommended that you enable the NRI Plugin to configure the container runtime by setting ``cdi.nriPluginEnabled=true``.
When enabled, you do not need to specify the ``toolkit.env`` options and injecting GPUs into workload containers is handled by the NRI Plugin.
When you enable the NRI Plugin, you do not need to specify the ``toolkit.env`` options, and the NRI Plugin handles injecting GPUs into workload containers.
You can enable the NRI Plugin to configure the container runtime by setting ``cdi.nriPluginEnabled=true``.
The NRI Plugin is available for use on RKE2.
Refer to the :ref:`NRI Plugin <nri-plugin>` documentation for more information.
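
As a minimal sketch of the setting described in the note above, a Helm values file could enable the NRI Plugin as follows. The file name ``values-nri.yaml`` is hypothetical; only the ``cdi.nriPluginEnabled`` key is taken from the chart options described in this documentation.

.. code-block:: yaml

   # values-nri.yaml (hypothetical file name)
   cdi:
     # Use the NRI Plugin to inject GPUs into workload containers.
     # The chart default for this option is false.
     nriPluginEnabled: true

Passing this file to Helm with the ``-f`` option should have the same effect as specifying ``--set cdi.nriPluginEnabled=true`` on the command line.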

When you use containerd as the container runtime, the following configuration
@@ -584,8 +586,7 @@ For Rancher Kubernetes Engine 2 (RKE2), refer to
`Deploy NVIDIA Operator <https://docs.rke2.io/add-ons/gpu_operators#deploy-nvidia-operator>`__
in the RKE2 documentation.

It's recommended that you enable CDI (default) and the NRI Plugin on RKE.
With both features enabled, you do not need to set ``runtimeClassName: nvidia`` in your pod spec.
With CDI (the default) and the NRI Plugin both enabled, you do not need to set ``runtimeClassName: nvidia`` in your pod specification, and you do not need to configure the ``CONTAINERD_CONFIG``, ``CONTAINERD_SOCKET``, or ``RUNTIME_CONFIG_SOURCE`` environment variables.

Review comment (Contributor):

> With CDI (the default) and the NRI Plugin both enabled, you do not need to set `runtimeClassName: nvidia` in your pod specification

This statement is true even for GPU Operator versions starting from 25.10, where CDI is enabled by default and NRI Plugin mode is disabled.


Refer to the :ref:`v24.9.0-known-limitations`.
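
To illustrate the ``runtimeClassName`` point above, the following sketch shows a pod specification that requests one GPU and omits ``runtimeClassName``. The pod name, container name, and image are placeholders chosen for illustration. The example assumes the cluster advertises GPUs under the ``nvidia.com/gpu`` resource name.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: gpu-sample                # placeholder name
   spec:
     restartPolicy: OnFailure
     # No runtimeClassName field is set here.
     containers:
     - name: cuda-sample             # placeholder name
       image: registry.example.com/cuda-sample:latest   # placeholder image
       resources:
         limits:
           nvidia.com/gpu: 1         # request a single GPU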
