AirbyteTriggerSyncOperator does not enforce execution_timeout semantics in Deferrable mode #64048

@SameerMesiah97

Description

Apache Airflow Provider(s)

airbyte

Versions of Apache Airflow Providers

apache-airflow-providers-airbyte==5.3.3

Apache Airflow version

main

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Other

Deployment details

No response

What happened

When using AirbyteTriggerSyncOperator in deferrable mode with execution_timeout set, the Airflow task times out but the underlying Airbyte job continues running.

In non-deferrable mode, exceeding execution_timeout causes both the task to fail and the Airbyte job to be cancelled via on_kill(). In deferrable mode, the task fails due to timeout, but the external Airbyte job is not stopped. This can result in long-running or orphaned Airbyte jobs.

This creates inconsistent behavior between deferrable and non-deferrable execution modes.

What you think should happen instead

execution_timeout should be enforced consistently regardless of execution mode.

When a deferrable AirbyteTriggerSyncOperator exceeds execution_timeout:

  • The Airflow task should fail due to execution timeout
  • The associated Airbyte job should be cancelled

This ensures predictable timeout behavior and prevents orphaned Airbyte jobs.
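The expected behavior above can be sketched without Airflow or Airbyte at all. The snippet below is a minimal illustration, not the provider's implementation: `FakeAirbyteJob` and `run_with_timeout` are hypothetical names invented for this example, and the remote job is simulated with `asyncio.sleep`. It only shows the contract being requested: when the timeout fires, the external job is cancelled before the task is reported as failed.

```python
import asyncio


class FakeAirbyteJob:
    """Hypothetical stand-in for a remote Airbyte job (not the Airbyte API)."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self.status = "running"

    async def wait(self) -> None:
        # Simulates waiting for the sync to finish.
        await asyncio.sleep(self.duration)
        self.status = "succeeded"

    def cancel(self) -> None:
        # Only a job that is still running can be cancelled.
        if self.status == "running":
            self.status = "cancelled"


async def run_with_timeout(job: FakeAirbyteJob, timeout: float) -> str:
    """Wait for the job; on timeout, cancel it and report the task as failed."""
    try:
        await asyncio.wait_for(job.wait(), timeout=timeout)
    except asyncio.TimeoutError:
        # The step this issue reports as missing in deferrable mode.
        job.cancel()
        return "task failed: execution timeout"
    return "task succeeded"


# A job that outlives the timeout should end up cancelled.
slow = FakeAirbyteJob(duration=0.5)
slow_result = asyncio.run(run_with_timeout(slow, timeout=0.05))

# A job that finishes in time should be left alone.
fast = FakeAirbyteJob(duration=0.01)
fast_result = asyncio.run(run_with_timeout(fast, timeout=0.5))
```

In the real operator, the cancellation would presumably go through the same hook call that on_kill() uses in non-deferrable mode; where that call belongs in the deferrable path is left open here.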

How to reproduce

  1. Configure an Airbyte connection in Airflow (airbyte_default).

  2. Create an Airbyte connection (source → destination) whose sync runs longer than ~30 seconds
    (for example, a slow-running sync between two datasets).

  3. Use the following DAG (replace <CONNECTION_ID> with your Airbyte connection ID):

from airflow import DAG
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator
from datetime import datetime, timedelta

with DAG(
    dag_id="airbyte_deferrable_execution_timeout_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:

    run_airbyte = AirbyteTriggerSyncOperator(
        task_id="run_airbyte",
        airbyte_conn_id="airbyte_default",
        connection_id="<CONNECTION_ID>",
        asynchronous=False,
        deferrable=True,
        execution_timeout=timedelta(seconds=30),
    )

  4. Trigger the DAG and wait for the task to exceed execution_timeout.

Observed Behavior

  • The Airflow task fails due to execution timeout
  • The Airbyte job continues running and is not cancelled

Anything else

This inconsistency makes execution_timeout unreliable for deferrable Airbyte jobs and can lead to unintended resource usage.

While deferrable operators execute via the triggerer rather than a worker process, this does not preclude enforcing execution_timeout semantics. The expectation is that task-level timeout behavior remains consistent across execution modes, even if the underlying enforcement mechanism differs.

This issue is similar to #61467, which reports the same class of bug for DbtCloudRunJobOperator.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
