Lazy load validation.security on import of datadog_checks.checks#23936

Draft
jinh-labs wants to merge 1 commit into master from jinh-labs/lazy-load-security-validation

Conversation

Collaborator

@jinh-labs jinh-labs commented Jun 5, 2026

What does this PR do?

Makes the validation package lazy-loading, using the same lazy_loader.attach_stub pattern already applied to other base subpackages (#19838). As a result, import datadog_checks.checks no longer loads validation.security, and with it dataclasses, at import time.
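For reviewers unfamiliar with the mechanism, here is a minimal stdlib-only sketch of deferred submodule loading. It does not use lazy_loader or attach_stub; it hand-writes a module-level `__getattr__` (PEP 562) to show the same kind of deferral. The package name `lazy_demo_validation` and its `security` submodule are hypothetical and exist only in a temporary directory.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical package name, used only for this illustration.
PKG = "lazy_demo_validation"

# Package __init__ that resolves `security` on first attribute access.
INIT_SOURCE = textwrap.dedent(
    '''
    import importlib

    _LAZY_SUBMODULES = {"security"}

    def __getattr__(name):
        # PEP 562: invoked only when normal module attribute lookup fails.
        if name in _LAZY_SUBMODULES:
            return importlib.import_module(f"{__name__}.{name}")
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
    '''
)


def demo():
    """Return (loaded_after_import, loaded_after_first_use) for the submodule."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text(INIT_SOURCE)
        (pkg_dir / "security.py").write_text("LOADED = True\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module(PKG)
            after_import = f"{PKG}.security" in sys.modules
            _ = pkg.security.LOADED  # first attribute access triggers the import
            after_use = f"{PKG}.security" in sys.modules
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            for name in [m for m in sys.modules if m.startswith(PKG)]:
                del sys.modules[name]
    return after_import, after_use


print(demo())  # prints (False, True)
```

Importing the package alone leaves the submodule out of sys.modules; the first attribute access loads it, and the import system then caches it on the parent so later accesses bypass `__getattr__`.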

Motivation

Follow-up to the 9 MiB / instance sub-interpreter overhead investigation (https://datadoghq.atlassian.net/wiki/spaces/ARUN/pages/6797659760/), specifically the question "how much of the import datadog_checks.checks cost can be lazy-loaded?"

Findings:

Honest scope: please read before reviewing

  • On an import-only measurement (the doc's subinterp_mem.c benchmark), this defers ~1 MiB: the import drops from ~5.7 MiB to ~4.6 MiB (sub-interpreter infra, ~2.6 MiB, is unaffected).
  • But a running idle check calls check.run() -> load_configuration_models, which builds the config-validation context and loads security regardless of whether the check has config models. So the saving is undone after the first check run, and the net saving on a running idle instance is approximately 0.
  • A follow-up could make the saving persist (skip building the validation context when a check has no config models), but that would only help checks without config models, and the only checks that meet this criterion, snmp and tokumx, are deprecated/unsupported.

So this only moves the import-phase number, not running instances.
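The toy model below illustrates why the saving disappears once a check runs, and what the possible follow-up would change. Every name here (`LazyModule`, `ToyCheck`, `skip_without_models`) is hypothetical; this is not the real AgentCheck code, only a sketch of the control flow described above.

```python
class LazyModule:
    """Stand-in for a lazily imported module; only records whether it was loaded."""

    def __init__(self):
        self.loaded = False

    def load(self):
        self.loaded = True


class ToyCheck:
    """Hypothetical check; not the real AgentCheck implementation."""

    def __init__(self, security, has_config_models, skip_without_models=False):
        self.security = security
        self.has_config_models = has_config_models
        self.skip_without_models = skip_without_models

    def load_configuration_models(self):
        # Possible follow-up: skip the validation context when there is
        # nothing to validate.
        if self.skip_without_models and not self.has_config_models:
            return
        # Behavior as described above: the context is built unconditionally,
        # which loads the security module.
        self.security.load()

    def run(self):
        self.load_configuration_models()


def loaded_after_first_run(has_config_models, skip_without_models):
    """Report whether the security stand-in was loaded by a single run()."""
    security = LazyModule()
    ToyCheck(security, has_config_models, skip_without_models).run()
    return security.loaded


print(loaded_after_first_run(False, False))  # True: deferral undone by run()
print(loaded_after_first_run(False, True))   # False: follow-up keeps it deferred
print(loaded_after_first_run(True, True))    # True: checks with models still load it
```

Under these assumptions, only the middle case keeps the module unloaded, which matches the point that the follow-up would help no-config checks only.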

Validation

# 1. Security behavior unchanged (the feature's own tests)
ddev test datadog_checks_base -- tests/models/test_security.py      # 13 passed

# 2. No regression (full base suite)
ddev test datadog_checks_base                                        

# 3. Lint / style
ddev test datadog_checks_base -s                                     # All checks passed!

# 4. security/dataclasses no longer load on import or instantiation (the deferral)
#    Note: calls c.check() directly, not c.run(), so load_configuration_models is not
#    triggered. This validates import-phase deferral only; c.run() would load it in practice.
python -c "
import sys
from datadog_checks.base import AgentCheck
class C(AgentCheck):
    def check(self, _): self.gauge('x', 1.0)
c = C('s', {}, [{}]); c.check_id = 's'; c.check(None)
print('dataclasses loaded:', 'dataclasses' in sys.modules)           # False (True on master)
"

# 5. Import-only memory, this branch vs master (Linux; reads /proc/self/smaps_rollup)
#    Run it a few times on each side -- the first run after a git checkout is a one-time
#    .pyc recompile (warmup); ignore it and read the steady value.
MEAS='import sys
def pss():
    for l in open("/proc/self/smaps_rollup"):
        if l.startswith("Pss:"): return int(l.split()[1])/1024.0
b = pss(); import datadog_checks.checks; print(round(pss()-b, 2), "MiB")'

for i in 1 2 3 4; do python -c "$MEAS"; done                 # this branch  (discard 1st, read steady)
git checkout master
for i in 1 2 3 4; do python -c "$MEAS"; done                 # master       (discard 1st, read steady)
git checkout -
# branch steady ~1 MiB lower than master (the import drops from ~5.7 to ~4.6 MiB)

Verified on the doc's subinterp_mem.c benchmark: ~8.27 → ~7.20 MiB per sub-interpreter and ~5.67 → ~4.55 MiB for the import (~1 MiB reduction).
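If the "discard the first run, read the steady value" step from validation 5 needs to be scripted, a small helper along these lines could work. The function names are hypothetical, and the inputs in the usage lines are synthetic values chosen for illustration, not measurements from this PR.

```python
import statistics


def parse_pss_mib(smaps_rollup_text):
    """Extract the Pss value (reported in kB) from smaps_rollup text, in MiB."""
    for line in smaps_rollup_text.splitlines():
        # "Pss:" with the colon, so Pss_Anon / Pss_File lines are not matched.
        if line.startswith("Pss:"):
            return int(line.split()[1]) / 1024.0
    raise ValueError("no 'Pss:' line found")


def steady_value(samples_mib):
    """Drop the first (warmup) sample and return the median of the rest."""
    if len(samples_mib) < 2:
        raise ValueError("need at least one sample after the warmup run")
    return statistics.median(samples_mib[1:])


# Synthetic inputs for illustration only; these are not measurements from this PR.
SAMPLE = "Rss:  20480 kB\nPss:   5120 kB\nPss_Anon:  4096 kB\n"
print(parse_pss_mib(SAMPLE))                # 5.0
print(steady_value([9.0, 5.5, 5.75, 5.5]))  # 5.5
```

Using the median rather than the last value is a design choice in this sketch; it makes the reading less sensitive to a single noisy run.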

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Add qa/required if this PR needs QA validation, or qa/skip-qa if it does not. Exactly one of the two is required.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

Importing datadog_checks.checks always pulls in the security module
(datadog_checks.base.utils.models.validation.security), which accounts for about
1 MiB of memory, even for checks that never use it. Now it's lazy loaded, so the
security module is only loaded on use. That saves about 1 MiB for checks that
don't need it, with no change for the ones that do.
Contributor

datadog-datadog-prod-us1 Bot commented Jun 5, 2026

⚠️ Warnings

🚦 9 Pipeline jobs failed

Check PR | run / Check PR changelog

See error Package "datadog_checks_base" has changes that require a changelog.

PR All | test / j06ca546 / SNMP

See error Max retries exceeded when trying to resolve 'ddintegrations.blob.core.windows.net'.

PR All | test / j3a8e004 (py3.13-linux-2022-tokyo) / SQL Server on Linux-py3.13-linux-2022-tokyo

See error Unable to connect to SQL Server: TCP-connection error. Connection refused on host localhost:1433. Login timeout expired while using ODBC Driver 18 for SQL Server.

View all 9 failed jobs.

🧪 20 Tests failed in 1 job

PR All | run   GitHub Actions

test_bulk_table from test_check.py
HTTPSConnectionPool(host='ddintegrations.blob.core.windows.net', port=443): Max retries exceeded with url: /snmp/cisco-3850.snmprec (Caused by NameResolutionError("HTTPSConnection(host='ddintegrations.blob.core.windows.net', port=443): Failed to resolve 'ddintegrations.blob.core.windows.net' ([Errno -2] Name or service not known)"))
test_cast_metrics from test_check.py
HTTPSConnectionPool(host='ddintegrations.blob.core.windows.net', port=443): Max retries exceeded with url: /snmp/cisco-3850.snmprec (Caused by NameResolutionError("HTTPSConnection(host='ddintegrations.blob.core.windows.net', port=443): Failed to resolve 'ddintegrations.blob.core.windows.net' ([Errno -2] Name or service not known)"))

View all 20 test failures

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 87.63% (+0.09%)

🔗 Commit SHA: 887df72

Contributor

dd-octo-sts Bot commented Jun 5, 2026

Validation Report

Validation Description Status
qa-label Validate the pull request declares whether it needs QA for the next Agent release

Run ddev validate all changed --fix to attempt to auto-fix supported validations.

Passed validations (20)
Validation Description Status
agent-reqs Verify check versions match the Agent requirements file
ci Validate CI configuration and code coverage settings
codeowners Validate every integration has a CODEOWNERS entry
config Validate default configuration files against spec.yaml
dep Verify dependency pins are consistent and Agent-compatible
http Validate integrations use the HTTP wrapper correctly
imports Validate check imports do not use deprecated modules
integration-style Validate check code style conventions
jmx-metrics Validate JMX metrics definition files and config
labeler Validate PR labeler config matches integration directories
legacy-signature Validate no integration uses the legacy Agent check signature
license-headers Validate Python files have proper license headers
licenses Validate third-party license attribution list
metadata Validate metadata.csv metric definitions
models Validate configuration data models match spec.yaml
openmetrics Validate OpenMetrics integrations disable the metric limit
package Validate Python package metadata and naming
readmes Validate README files have required sections
saved-views Validate saved view JSON file structure and fields
version Validate version consistency between package and changelog

View full run

jinh-labs changed the title from "Lazy load the security validation module" to "Lazy load validation.security on import of datadog_checks.checks" on Jun 5, 2026