Submit general resource via checks base#23905
Conversation
|
| ``include`` is required: ``{"paths": [...], "map_paths": [...], "annotation_keys": [...]}``. | ||
| ``paths`` select values; ``map_paths`` select whole flat maps (e.g. ``metadata.labels``); | ||
| ``annotation_keys`` glob ``metadata.annotations`` keys. A path that resolves to a structured | ||
| object is dropped. ``seen_at`` / ``expire_at`` are optional ``int`` unix-seconds. |
There was a problem hiding this comment.
nit
I find the wording a bit confusing.
- There is no need to specify that args/kwargs are required or not, that is ensured by the function signature.
- After reading I'm still not sure where 'paths' and the other selectors select FROM?
There was a problem hiding this comment.
Paths select from a json tree. For example:
status.operationState.operation.initiatedBy.username
Selects a field inside of status called operationState etc.:
Something like this for example.
{
"status": {
"operationState": {
"operation": {
"initiatedBy": {
"username": "string"
}
}
}
}
}
status.operationState.operation[*].username:
This is used for arrays if operation was an array. It just means grab the username for each operation inside the array.
For maps_path it's only for flat maps for example:
{
"metadata": {
"labels": {
"team": "platform",
"env": "prod"
}
}
}
metadata.labels will take both team and env labels.
| if not key: | ||
| self.log.warning("genresources: dropping resource with empty key for type=%s", type) | ||
| _emit_dropped() | ||
| return |
There was a problem hiding this comment.
request
Please add a check for type as well.
| _emit_dropped() | ||
| return | ||
|
|
||
| if any(not pattern.strip("*?") for pattern in annotation_keys): |
There was a problem hiding this comment.
question
Patterns like *a* will pass through, is this intended?
| if _targets_annotations(segments): | ||
| continue |
There was a problem hiding this comment.
question
Why are "annotations" forbidden in the paths?
request
Please add a log for the skipped path/map_path. Maybe at DEBUG level.
There was a problem hiding this comment.
I think for the most part, annotations can contain sensitive information in kubernetes. Say for example you annotate this application with a postgrescheck for example. So this is just to prevent annotations and secrets from being leaked and collected.
| datadog_agent.emit_agent_telemetry(integration, "datadog.agent.check.genresources.dropped", count, "count") | ||
|
|
||
| if not key: | ||
| self.log.warning("genresources: dropping resource with empty key for type=%s", type) |
There was a problem hiding this comment.
Is there any more information you can print in this log line?
| return | ||
|
|
||
| # stdlib json on purpose: module-level json is the orjson wrapper, which coerces datetime instead of failing. | ||
| import json as _json |
There was a problem hiding this comment.
Just curious, what you think about about using orjson but using the orjson.OPT_PASSTHROUGH_DATETIME flag to handle datetime?
| ``include`` is required: ``{"paths": [...], "map_paths": [...], "annotation_keys": [...]}``. | ||
| ``paths`` select values; ``map_paths`` select whole flat maps (e.g. ``metadata.labels``); | ||
| ``annotation_keys`` glob ``metadata.annotations`` keys. A path that resolves to a structured | ||
| object is dropped. Pass ``include=INCLUDE_ALL`` to ship a dict you built in code as-is — only |
There was a problem hiding this comment.
This might just be personal preference, but I don't see the need to define this INCLUDE_ALL constant when we could just let None signify INCLUDE_ALL
There was a problem hiding this comment.
I wanted this to be an explicit flag. The process for scrubbing data is pretty hard sadly. So I wanted the ability to ingest everything to be explicitly agreed on.
|
|
||
| from datadog_checks.base.utils.genresources.inclusion import apply_allow_list, find_invalid_include | ||
|
|
||
|
|
There was a problem hiding this comment.
What do you think about testing the inclusion logic solely in test_agent_check_submit_generic_resource.py by passing different include values to get_event_platform_events? That way we only test the public method get_event_platform_events and the tests are resilient to refactors of this logic.
| def test_submit_generic_resource_drops_and_counts_invalid_input(aggregator, check, kwargs): | ||
| check.submit_generic_resource(**kwargs) | ||
| assert aggregator.get_event_platform_events("genresources", parse_json=False) == [] | ||
| datadog_agent.assert_telemetry("argocd", "datadog.agent.check.genresources.dropped", "count", 1) |
There was a problem hiding this comment.
Can you test the log lines here too?
Validation ReportAll 21 validations passed. Show details
|
What does this PR do?
Add helper functions to submit data to general resources via EVP.
As part a 3 part process:
PR1 to add the forwarder: DataDog/datadog-agent#51365
PR2 to add the helper functions in checks: this PR
PR3 to onboard ArgoCD as to submit entity metadata: #23834