Redact Connection strings with "unstructured" extra #63160

@0x0OZ

Description

The Airflow REST API redacts sensitive fields in connection responses. The password field is always replaced with ***. The extra field is redacted by parsing it as JSON and redacting each value in the resulting dictionary.

However, if the extra field contains a non-JSON string (e.g., a raw Bearer token, a key=value pair, XML, or any other format), the json.loads() call raises JSONDecodeError, and the exception handler returns the raw value as-is without redaction.
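
A quick way to see which inputs take that fallback path, using only the standard library. The sample strings and the `is_json` helper are made up for illustration and are not part of Airflow:

```python
import json


def is_json(value: str) -> bool:
    """Return True if json.loads accepts the string, False if it raises JSONDecodeError."""
    try:
        json.loads(value)
        return True
    except json.JSONDecodeError:
        return False


# Hypothetical `extra` values: one JSON object and three unstructured strings.
samples = [
    '{"token": "abc123"}',
    "Bearer abc123",
    "api_key=abc123",
    "<token>abc123</token>",
]

# Map each sample to whether it parses; the ones that do not would be returned unredacted.
results = {sample: is_json(sample) for sample in samples}
```

Only the first sample parses; the other three would hit the `except` branch and come back from the API as-is.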

Judging by the comment in the code, this appears to be a known limitation, so I am filing it as a feature request rather than as a security report.

    def redact_extra(cls, v: str | None) -> str | None:
        if v is None:
            return None
        try:
            extra_dict = json.loads(v)
            redacted_dict = redact(extra_dict)
            return json.dumps(redacted_dict)
        except json.JSONDecodeError:
            # we can't redact fields in an unstructured `extra`
            return v

Currently the code simply returns the extra field unchanged if it fails to parse as JSON. Instead, maybe it should just return ***, e.g. return "***".
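
A minimal sketch of what that could look like, written as a standalone function so it runs outside Airflow. `redact_stub` is a hypothetical stand-in that masks every leaf value; it is not Airflow's actual `redact` helper, whose masking rules may differ. `REDACTED` is also just a name chosen here.

```python
import json
from typing import Any, Optional

REDACTED = "***"  # assumed placeholder, matching what is used for the password field


def redact_stub(value: Any) -> Any:
    """Hypothetical stand-in for Airflow's redact(): masks every leaf value."""
    if isinstance(value, dict):
        return {key: redact_stub(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_stub(item) for item in value]
    return REDACTED


def redact_extra(v: Optional[str]) -> Optional[str]:
    """Redact `extra`; fall back to a fully masked value when it is not JSON."""
    if v is None:
        return None
    try:
        extra = json.loads(v)
    except json.JSONDecodeError:
        # Unstructured `extra`: individual fields cannot be picked out,
        # so mask the whole string instead of returning it as-is.
        return REDACTED
    return json.dumps(redact_stub(extra))
```

With this change, `redact_extra("Bearer abc123")` would return `"***"` instead of the raw token, while JSON extras keep their per-field redaction. The trade-off is that a non-secret unstructured `extra` would also be hidden from API consumers.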

Ref:
https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/api_fastapi/core_api/datamodels/connections.py#L53-L65

Use case/motivation

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Metadata

Assignees

No one assigned

    Labels

    area:API (Airflow's REST/HTTP API), kind:feature (Feature Requests), needs-triage (label for new issues that we didn't triage yet)
