Adds support to deploy cyborg controlplane services #1102
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3a20fb346aa44e9d80abc0ba1ff8cdf5 ✔️ openstack-meta-content-provider SUCCESS in 2h 58m 29s |
|
check-rdo |
  labels:
    app.kubernetes.io/name: nova-operator
    app.kubernetes.io/managed-by: kustomize
  name: cyborg-cyborg-admin-role
The existing roles for nova do not repeat the service name, e.g. nova_admin_role, novaconductor_admin_role. If possible, I think it would be good to use the same pattern for the cyborg roles.
// ensureTopology - when a Topology CR is referenced, remove the
// finalizer from a previous referenced Topology (if any), and retrieve the
// newly referenced topology object
func ensureTopology(
this function seems to be an exact duplicate of
. This might be a question more for the nova-operator maintainers, but is there a way to reuse code between the controllers of different services?
IIUC the approach is to not share code among the different groups in the operator (nova, placement and cyborg). Let's see what nova-operator maintainers think about it.
|
Build failed (check pipeline). Post ✔️ openstack-meta-content-provider SUCCESS in 3h 45m 37s |
|
check-rdo |
|
This change depends on a change that failed to merge. Change #1121 is needed. |
|
check-rdo |
|
Build failed (check pipeline). Post ✔️ openstack-meta-content-provider SUCCESS in 4h 23m 58s |
Using operator-sdk command:
  operator-sdk create api --group cyborg --version v1beta1 --kind Cyborg --resource --controller
  operator-sdk create api --group cyborg --version v1beta1 --kind CyborgAPI --resource --controller
  operator-sdk create api --group cyborg --version v1beta1 --kind CyborgConductor --resource --controller
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Define CRD specs for Cyborg, CyborgAPI and CyborgConductor resources:
- Add CyborgSpec with DB, RabbitMQ, Keystone and TLS configuration
- Add CyborgAPISpec and CyborgConductorSpec with configSecret, replicas,
  resources, nodeSelector and TLS fields
- Implement defaulting and validation webhooks for all three CRDs
- Register CRDs in the operator scheme
- Update CRD YAML manifests and CSV for OLM
Reconcile and configuration logic will be created in next commits.
Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Add full reconcile logic for the Cyborg CR:
- Manage RBAC resources (ServiceAccount, Role, RoleBinding)
- Validate input password secret and RabbitMQ TransportURL secret
- Create MariaDB database and run DB sync job via a batch Job
- Register Cyborg service in Keystone
- Create a sub-level secret aggregating DB credentials, transport URL
  and service password to be consumed by CyborgAPI and CyborgConductor
- Track readiness via structured conditions on CyborgStatus
- Add functional tests covering the full reconcile flow
Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
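The "sub-level secret" step in the commit above can be pictured with a minimal Go sketch. It is not the code from this PR: the `subLevelInputs` type, the `buildSubLevelSecretData` function and the key names are all hypothetical, chosen only to show three validated inputs being aggregated into one secret data map.

```go
package main

import (
	"fmt"
	"sort"
)

// subLevelInputs holds the three values the commit message says are
// aggregated. The type and its field names are hypothetical.
type subLevelInputs struct {
	DatabaseConnection string
	TransportURL       string
	ServicePassword    string
}

// buildSubLevelSecretData returns the data map for the aggregated secret,
// or an error naming the first missing input. Key names are assumptions.
func buildSubLevelSecretData(in subLevelInputs) (map[string][]byte, error) {
	fields := []struct{ key, value string }{
		{"database_connection", in.DatabaseConnection},
		{"transport_url", in.TransportURL},
		{"service_password", in.ServicePassword},
	}
	data := make(map[string][]byte, len(fields))
	for _, f := range fields {
		if f.value == "" {
			return nil, fmt.Errorf("missing input %q", f.key)
		}
		data[f.key] = []byte(f.value)
	}
	return data, nil
}

func main() {
	data, err := buildSubLevelSecretData(subLevelInputs{
		DatabaseConnection: "mysql+pymysql://user:pass@db.example/cyborg",
		TransportURL:       "rabbit://user:pass@mq.example:5672/",
		ServicePassword:    "example-password",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print only the key names, sorted, so no credential is echoed.
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(keys)
}
```

Failing on the first empty input keeps the child CRs from being handed a half-populated secret; the real controller presumably reports this through its conditions rather than a plain error.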
Add full reconcile logic for the CyborgConductor CR:
- Validate input from the config secret created by the Cyborg controller
- Generate conductor config from templates (00-default.conf)
- Create a StatefulSet to run cyborg-conductor pods
- Track readiness (ReadyCount, conditions, hash, topology)
- Expose IsReady and topology helpers on CyborgConductor type
- Update CyborgConductorStatus with structured conditions and hash
- Extend Cyborg controller to propagate conductor and check readiness
  upwards
- Add functional tests for the conductor reconcile loop
Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Add full reconcile logic for the CyborgAPI CR:
- Validate input from config secret provided by the Cyborg controller
- Render WSGI/httpd and cyborg-api configuration templates
- Create a StatefulSet for cyborg-api pods with TLS support
- Register Keystone endpoints (public and internal) for the API
- Track readiness (ReadyCount, conditions, hash, topology)
- Expose IsReady and topology helpers on CyborgAPI type
- Extend Cyborg controller to create CyborgAPI and check readiness
  upwards
- Add functional tests covering the full API reconcile flow
Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
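As a rough illustration of the "IsReady" and "check readiness upwards" items in the two commits above, the sketch below treats a child as ready when its ready count matches the desired replicas, and the parent as ready only when every child is. The `workloadStatus` type and both functions are hypothetical stand-ins; the actual helpers in the PR may use a different rule (for example, also comparing hashes or conditions).

```go
package main

import "fmt"

// workloadStatus is a hypothetical, trimmed-down stand-in for the status of
// a CyborgAPI or CyborgConductor; only ReadyCount comes from the commit text.
type workloadStatus struct {
	ReadyCount int32
}

// isReady reports whether all desired replicas are ready (assumed rule).
func isReady(desired int32, st workloadStatus) bool {
	return st.ReadyCount == desired
}

// parentReady sketches "check readiness upwards": the parent is ready only
// when every child passed to it is ready.
func parentReady(children ...bool) bool {
	for _, ok := range children {
		if !ok {
			return false
		}
	}
	return true
}

func main() {
	api := isReady(2, workloadStatus{ReadyCount: 2})
	conductor := isReady(1, workloadStatus{ReadyCount: 0})
	fmt.Println("cyborg ready:", parentReady(api, conductor))
}
```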
Deployment using httpd is no longer supported in kolla upstream images
since 2026.1 release [1].
[1] https://review.opendev.org/c/openstack/kolla/+/986488
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Add an end-to-end kuttl test suite for the Cyborg operator:
- Cleanup step to delete any pre-existing Cyborg CR before the test
- Deploy step creating a full Cyborg CR (cyborg-kuttl)
- Assert step verifying all conditions are True on Cyborg, CyborgAPI,
  CyborgConductor and MariaDBDatabase CRs
- Error step covering missing-dependency failure scenarios
- Register cyborg container images (api, conductor, agent) as default
  RELATED_IMAGE env vars in the manager deployment
- Enable ENABLE_CYBORG=true in the CI webhook deploy script
Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
The Cyborg controller now generates a `{name}-agent-config` secret
containing the rendered configuration for the cyborg-agent service
running on EDPM compute nodes. This secret is consumed by the
edpm-ansible cyborg role to configure the agent on the dataplane.
The shared 00-default.conf template is updated to guard the
[database] section with a conditional, allowing reuse for the
agent config without a separate template.
Assisted-By: claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
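To make the conditional guard described in this commit concrete, here is a small sketch using Go's `text/template`. The template body, the `confParams` type and the `render` function are assumptions for illustration, not the contents of the real 00-default.conf; only the idea of wrapping the `[database]` section in a conditional comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// defaultConf is a hypothetical, shortened stand-in for 00-default.conf.
// The [database] section is emitted only when DatabaseConnection is set.
const defaultConf = `[DEFAULT]
transport_url = {{ .TransportURL }}
{{- if .DatabaseConnection }}

[database]
connection = {{ .DatabaseConnection }}
{{- end }}
`

// confParams carries the template inputs; field names are assumptions.
type confParams struct {
	TransportURL       string
	DatabaseConnection string
}

// render executes the shared template with the given parameters.
func render(p confParams) (string, error) {
	tmpl, err := template.New("00-default.conf").Parse(defaultConf)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Agent config: no database connection, so [database] is skipped.
	agent, err := render(confParams{TransportURL: "rabbit://example"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(agent)
}
```

With this shape, the control plane services and the agent can share one template: the caller decides whether the database section appears simply by leaving the connection string empty.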
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: amoralej
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Add support for OpenStack Cyborg (accelerator lifecycle management service) in nova-operator, introducing three new CRDs and their controllers.
oc get cyborg/cyborgapi/cyborgconductor.
Assisted-By: Claude
Jira: OSPRH-27674