Set helper container resource limits for Guaranteed QoS on profiling tests #3672

Open
desponda wants to merge 1 commit into DataDog:master from desponda:desponda/fix-profiler-qos-guaranteed

@desponda

Summary

  • The shared GitLab runner pool leaves helper container limits unset by default (substitutions.bzl in dd-source sets {helpers.cpuLimit} and {helpers.memoryLimit} to empty strings)
  • As a result, profiling test pods are classified as Burstable QoS even when the build container has matching requests and limits (KUBERNETES_CPU_REQUEST == KUBERNETES_CPU_LIMIT), because Guaranteed QoS requires requests == limits on every container in the pod
  • This PR adds KUBERNETES_HELPER_CPU_LIMIT and KUBERNETES_HELPER_MEMORY_LIMIT (matching the existing requests of cpu: 1, memory: 2Gi) so all containers have requests == limits → Guaranteed QoS
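To make the classification rule in the summary concrete, here is a minimal Python sketch of how a pod's QoS class follows from its containers' resources. It is a simplification, not the kubelet's implementation: quantities are compared as plain strings (so "1" and "1000m" would not match), and the build container's memory value of 8Gi is a hypothetical placeholder, since the PR only states its CPU request of 3.

```python
from typing import Dict, List

# Each container is described as {"requests": {...}, "limits": {...}}.
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Simplified Kubernetes QoS rule (string comparison of quantities)."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An unset request defaults to the limit, so only the limit is mandatory.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Build container: requests == limits (memory value is a hypothetical placeholder).
build = {"requests": {"cpu": "3", "memory": "8Gi"},
         "limits": {"cpu": "3", "memory": "8Gi"}}
# Helper container before this PR: requests only, limits left unset.
helper_before = {"requests": {"cpu": "1", "memory": "2Gi"}}
# Helper container after this PR: limits match the existing requests.
helper_after = {"requests": {"cpu": "1", "memory": "2Gi"},
                "limits": {"cpu": "1", "memory": "2Gi"}}

print(qos_class([build, helper_before]))  # Burstable
print(qos_class([build, helper_after]))   # Guaranteed
```

A single container without limits is enough to move the whole pod out of Guaranteed, which is why setting limits on the build container alone was not sufficient.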

Impact

  • Profiling test pods will get Guaranteed QoS class, ensuring proper CPU cgroup enforcement
  • This fixes the issue where nproc reports the node's full CPU count (e.g. 66) instead of the requested 3 CPUs, which causes tools to spawn too many parallel jobs
  • The helper and init-permissions containers will now have their CPU limits enforced
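For context on the over-parallelism problem, the following Python sketch shows one way a build script could size its job count from the container's CPU limit instead of the host CPU count. This is not what nproc does and is not part of this PR; it assumes a cgroup v2 node, where the kernel exposes the quota in a `cpu.max` file as `<quota> <period>` (or `max <period>` when unlimited). The function names are hypothetical.

```python
import math
from typing import Optional


def cpus_from_cpu_max(cpu_max: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max content; return the CPU limit, or None if unlimited."""
    quota, period = cpu_max.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def job_count(cpu_max: str, host_cpus: int) -> int:
    """Pick a parallel job count bounded by the CPU limit and the host CPU count."""
    limit = cpus_from_cpu_max(cpu_max)
    if limit is None:
        return host_cpus
    return max(1, min(host_cpus, math.ceil(limit)))


# A 3-CPU limit on a 66-CPU node, using the figures quoted in this PR.
print(job_count("300000 100000", 66))  # 3
# No limit set: the job count falls back to the host CPU count.
print(job_count("max 100000", 66))     # 66
```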

Test plan

  • Run a profiling test job and verify qosClass: Guaranteed in the pod status (status.qosClass)
  • Verify nproc reports the correct CPU count (3) instead of the node's total
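The first check in the test plan could be automated along these lines. This is a sketch that assumes the pod has been dumped with `kubectl get pod <name> -o json`; the function name `check_guaranteed` and the sample pod below are hypothetical, and quantities are compared as plain strings.

```python
import json
from typing import List


def check_guaranteed(pod_json: str) -> List[str]:
    """Return a list of problems; an empty list means the pod looks Guaranteed."""
    pod = json.loads(pod_json)
    problems: List[str] = []

    # The QoS class is reported under status, not spec.
    qos = pod.get("status", {}).get("qosClass")
    if qos != "Guaranteed":
        problems.append(f"qosClass is {qos!r}, expected 'Guaranteed'")

    # Report which containers (init containers included) break requests == limits.
    spec = pod.get("spec", {})
    for container in spec.get("initContainers", []) + spec.get("containers", []):
        resources = container.get("resources", {})
        limits = resources.get("limits", {})
        requests = resources.get("requests", limits)
        for name in ("cpu", "memory"):
            if name not in limits or requests.get(name, limits[name]) != limits[name]:
                problems.append(f"{container.get('name')}: {name} request != limit")
    return problems


# Hypothetical pod where the helper container has requests but no limits.
sample = json.dumps({
    "status": {"qosClass": "Burstable"},
    "spec": {"containers": [
        {"name": "build", "resources": {
            "requests": {"cpu": "3", "memory": "8Gi"},
            "limits": {"cpu": "3", "memory": "8Gi"}}},
        {"name": "helper", "resources": {
            "requests": {"cpu": "1", "memory": "2Gi"}}},
    ]},
})
for problem in check_guaranteed(sample):
    print(problem)
```

Listing the offending containers alongside the QoS class makes it easier to tell whether the helper or the init-permissions container is the one still missing limits.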

🤖 Generated with Claude Code

…tests

The shared runner pool defaults leave helper container limits unset,
causing pods to be Burstable QoS even when the build container has
matching requests and limits. This adds explicit helper container
resource limits (requests == limits) so the pod achieves Guaranteed
QoS class, ensuring proper CPU cgroup enforcement.

This fixes the issue where nproc reports the node's full CPU count
(e.g. 66) instead of the requested 3 CPUs.
@desponda requested a review from a team as a code owner on February 24, 2026 18:29
@desponda
Author

/ddci trigger

@gh-worker-devflow-routing-ef8351

2026-02-24 18:30:40 UTC ℹ️ Start processing command /ddci trigger
If you need support, contact us on Slack #devflow!


2026-02-24 18:30:56 UTC 🚨 Devflow

404 Not Found

Details
child workflow execution error (type: changeorchestrator.Changeorchestrator_GenerateDDCIRequestFromDevflow, workflowID: 3c0e5f9a-a442-4a6b-a3e6-267b0265ed85_38, runID: 019c90ea-ec45-77ee-a1ed-bbf76d72b5e5, initiatedEventID: 38, startedEventID: 39): activity error (type: gitlab.GitlabService_GetCommit, scheduledEventID: 49, startedEventID: 50, identity: 1@gitlab-worker-66c8b7446c-l24rc@): 404 Not Found

If you need support, contact us on Slack #devflow with those details!
