Init pass at restructuring CoCo TOC by a-mccarthy · Pull Request #385 · NVIDIA/cloud-native-docs

a-mccarthy · 2026-04-30T15:27:16Z

The deployment guide has grown quite long. This is a draft attempt at splitting up the content into a more usable form.

github-actions · 2026-04-30T15:30:32Z

Documentation preview

https://nvidia.github.io/cloud-native-docs/review/pr-385

a-mccarthy · 2026-05-12T15:41:59Z

+
+   resources:
+     limits:
+       nvidia.com/GH100_H200_141GB: "1"


Confirm this is a valid GPU name on a node.
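
One way to check the resource name is to inspect what the node actually advertises. The following is a minimal sketch, assuming the JSON comes from `kubectl get node <node-name> -o json` and that GPU resources appear under `.status.allocatable` with an `nvidia.com/` prefix. The helper name `gpu_resource_names` and the sample node data are hypothetical and only for illustration.

```python
import json


def gpu_resource_names(node: dict) -> list[str]:
    """Return the nvidia.com/* resources listed under status.allocatable."""
    allocatable = node.get("status", {}).get("allocatable", {})
    return sorted(name for name in allocatable if name.startswith("nvidia.com/"))


# Hypothetical, trimmed stand-in for `kubectl get node <node-name> -o json`.
sample_node = json.loads(
    """
    {
      "status": {
        "allocatable": {
          "cpu": "64",
          "memory": "512Gi",
          "nvidia.com/GH100_H200_141GB": "1"
        }
      }
    }
    """
)

names = gpu_resource_names(sample_node)
# True only if the node advertises the resource name used in the pod spec.
print("nvidia.com/GH100_H200_141GB" in names)
```

If the name in the pod spec does not appear in the list, the resource limit in the docs does not match that node and needs to be corrected.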

a-mccarthy · 2026-05-12T15:42:57Z

-If you need a timeout of more than 1200 seconds, you will also need to adjust Kata Agent Policy's ``image_pull_timeout`` value which controls the agent-side timeout for guest-image pull.
-To do this, add the ``agent.image_pull_timeout`` kernel parameter to your shim configuration, or pass an explicit value in a pod annotation in the ``io.katacontainers.config.hypervisor.kernel_params: "..."`` annotation.
-
+   "nvidia.com/GH100_H200_141GB": "1"


Confirm this output.

manuelh-dev · 2026-05-21T15:47:02Z


-*****************************************************
+#####################################################
 NVIDIA Confidential Containers Reference Architecture


I am wondering whether we can make the "Supported Features and Deployment Scenarios" and "Limitations and Restrictions" sections more prominent. They get a bit buried in the already lengthy overview page. Maybe we can relocate these two to a different main page, or even create a separate main page?

I'd be in favor of doing this, but not in this PR. I have to circle back with Hema, because I think we can flesh out a lot of these sections further, and it may be a good idea to create separate pages.

manuelh-dev

LGTM, left just a few comments, feel free to resolve these if these don't seem immediately helpful

a-mccarthy · 2026-06-09T19:27:07Z

@manuelh-dev I made some more updates to this PR. Do you have time this week to review?

Updates include:

- A new index home page
- A new persona page that covers users and their responsibilities
- A troubleshooting page (still a bit of a draft)
- A quick start, which cuts down the commands and details to just installing Kata and the GPU Operator. Do you think this type of install adds value to users?

mikemckiernan

Def a good idea to provide a streamlined, common install page. LMK what gibberish I can clarify.

mikemckiernan · 2026-06-09T20:06:57Z

-Deploy Confidential Containers
-******************************
+#########################################
+Install Guide for Confidential Containers


Def better than what I had and requires differentiation from the quickstart approach. I don't think the title is wrong, but I'm wondering if it can be more of a contrast to quickstart.

Detailed Installation

Common Installation Options (might be untrue)

Traditional Workload Considerations

Updated to Detailed Install Guide

mikemckiernan · 2026-06-09T20:09:32Z

-Refer to the :doc:`NVIDIA GPU Operator <gpuop:overview>` and `Kata Containers <https://katacontainers.io/docs/>`_ documentation for more information on these software components.
-Refer to the `Kubernetes documentation <https://kubernetes.io/docs/home/>`_ for more information on Kubernetes cluster administration.
+#. :doc:`Prerequisites <prerequisites>`.
+#. :ref:`Label nodes for Confidential Containers components <coco-label-nodes>`


not-sure: I wonder if "Label nodes to install Confidential Containers components" could set expectations for why we're labelling nodes. Or, "Label the nodes to configure for Confidential Containers"?

Thanks for the suggestions on this section! I updated the wording here to hopefully be less clunky.

mikemckiernan · 2026-06-09T20:26:44Z

+You can set the default confidential computing mode of the NVIDIA GPUs by setting the ``ccManager.defaultMode=<on|off>`` option.
+The default value of ``ccManager.defaultMode`` is ``on``.
+You can set this option when you install NVIDIA GPU Operator or afterward by modifying the cluster-policy instance of the ClusterPolicy object.
+
+Set a node-level mode by applying the ``nvidia.com/cc.mode=<on|off|ppcie>`` label on the node.
+If you set a specific mode on a node, it has higher precedence than the cluster-wide default mode.
+
+When you change the mode, the manager performs the following actions:
+
+* Evicts the other GPU Operator operands from the node.
+  However, the manager does not drain user workloads. You must make sure that no user workloads are running on the node before you change the mode.
+* Changes the mode and resets the GPU.
+* Reschedules the other GPU Operator operands.


I wonder if this info could follow the table or if it can be removed if it is redundant with the info in the sections that follow. You likely inherited some verbosity from my content.
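
For reference while deciding where this text belongs, the precedence rule described in the hunk above can be sketched as follows. This is an illustration only, not the manager's implementation: the function name `effective_cc_mode` is hypothetical, and rejecting unknown label values is an assumption of this sketch.

```python
from typing import Mapping

# Label key and allowed values as written in the hunk above.
CC_MODE_LABEL = "nvidia.com/cc.mode"
VALID_NODE_MODES = {"on", "off", "ppcie"}


def effective_cc_mode(node_labels: Mapping[str, str], cluster_default: str = "on") -> str:
    """Resolve the mode for one node: a node label overrides the cluster default."""
    node_mode = node_labels.get(CC_MODE_LABEL)
    if node_mode is None:
        # No node-level label, so the cluster-wide ccManager.defaultMode applies.
        return cluster_default
    if node_mode not in VALID_NODE_MODES:
        # Assumption of this sketch: unknown values are treated as an error.
        raise ValueError(f"unsupported {CC_MODE_LABEL} value: {node_mode!r}")
    return node_mode


# Unlabeled node: falls back to the cluster default.
print(effective_cc_mode({}))
# Labeled node: the label takes precedence over the default.
print(effective_cc_mode({CC_MODE_LABEL: "ppcie"}, cluster_default="on"))
```

A compact statement like this may be enough to replace the longer prose if the later sections already cover the eviction and reset steps.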

manuelh-dev · 2026-06-10T17:49:58Z

+Complete the **Install** section (through :doc:`Run a Sample Workload <run-sample-workload>` with ``Test PASSED``) before wiring attestation into production workloads.
+
+Attestation is not required for the install sample workload.
+Configure attestation when workloads need secrets, encrypted container images, or authenticated registries.


@fitzthum do we care about authenticated registries?

Should we phrase this more broadly? Every deployment should need attestation. Is there value in the solution without attestation?

manuelh-dev · 2026-06-10T17:59:44Z

 Attestation
 ***********

+As a :ref:`Security Engineer <coco-persona-security-engineer>`, use this page to configure and verify attestation for confidential workloads.


I think we should change the scope here and clearly delimit what this page covers and what it does not. We should emphasize that attestation is required but that a full setup is out of scope for this page, and instead describe that this page explains how to get to a basic setup of Trustee and kbs-client for evaluation purposes. The workload and related components still need to be configured for attestation, so our goal is not to provide an end-to-end sample.

This has been updated. I also added a "Using this documentation" section to the index page that calls out from the start that we cover only NVIDIA-specific information.

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>

a-mccarthy marked this pull request as draft April 30, 2026 15:27

a-mccarthy changed the title ~~Init pass at restructuring TOC~~ Init pass at restructuring CoCo TOC Apr 30, 2026

a-mccarthy force-pushed the coco-strucutre branch from 8a83b89 to 8fac0fb Compare May 12, 2026 15:40

a-mccarthy marked this pull request as ready for review May 12, 2026 15:41

a-mccarthy commented May 12, 2026

a-mccarthy requested a review from manuelh-dev May 18, 2026 20:23

manuelh-dev reviewed May 21, 2026

Comment thread confidential-containers/attestation.rst Outdated

manuelh-dev reviewed May 21, 2026

Comment thread confidential-containers/attestation.rst Outdated

manuelh-dev reviewed May 21, 2026

manuelh-dev approved these changes May 21, 2026

a-mccarthy force-pushed the coco-strucutre branch from 685c303 to a4aa920 Compare June 3, 2026 16:46

a-mccarthy requested a review from manuelh-dev June 9, 2026 19:24

mikemckiernan approved these changes Jun 9, 2026

manuelh-dev reviewed Jun 10, 2026

Comment thread confidential-containers/prerequisites.rst Outdated

manuelh-dev reviewed Jun 10, 2026

Comment thread confidential-containers/index.rst Outdated

manuelh-dev reviewed Jun 10, 2026

Comment thread confidential-containers/attestation.rst

manuelh-dev reviewed Jun 10, 2026

Comment thread confidential-containers/attestation.rst Outdated

Init pass at restructuring TOC

2371029

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>

a-mccarthy force-pushed the coco-strucutre branch from aadf85d to 2371029 Compare June 11, 2026 19:22

a-mccarthy added 2 commits June 12, 2026 13:25

troubleshooting improvements, grammer nits

984a9de

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>

Clarify install paths between quickstart and detailed isntall guide

431f9be

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>

3 participants
