Apache Airflow Provider(s)
airbyte
Versions of Apache Airflow Providers
apache-airflow-providers-airbyte==5.3.3
Apache Airflow version
main
Operating System
Debian GNU/Linux 12 (bookworm)
Deployment
Other
Deployment details
No response
What happened
When using AirbyteTriggerSyncOperator in deferrable mode with execution_timeout set, the Airflow task times out but the underlying Airbyte job continues running.
In non-deferrable mode, exceeding execution_timeout causes both the task to fail and the Airbyte job to be cancelled via on_kill(). In deferrable mode, the task fails due to timeout, but the external Airbyte job is not stopped. This can result in long-running or orphaned Airbyte jobs.
This creates inconsistent behavior between deferrable and non-deferrable execution modes.
What you think should happen instead
execution_timeout should be enforced consistently regardless of execution mode.
When a deferrable AirbyteTriggerSyncOperator exceeds execution_timeout:
- The Airflow task should fail due to execution timeout
- The associated Airbyte job should be cancelled
This ensures predictable timeout behavior and prevents orphaned Airbyte jobs.
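The expected behavior can be illustrated with a minimal, standard-library-only sketch. It is not the provider's implementation and uses no Airflow or Airbyte APIs: `FakeAirbyteClient`, `wait_for_job`, and `run_with_timeout` are hypothetical names invented for illustration. It only shows the intended semantics, namely that when the wait for a job exceeds its timeout, the remote job is cancelled before the timeout error propagates.

```python
import asyncio


class FakeAirbyteClient:
    """Hypothetical in-memory stand-in for an Airbyte API client (assumption)."""

    def __init__(self) -> None:
        self.status: dict[int, str] = {}

    def start_job(self, job_id: int) -> None:
        self.status[job_id] = "running"

    def get_status(self, job_id: int) -> str:
        return self.status[job_id]

    def cancel_job(self, job_id: int) -> None:
        self.status[job_id] = "cancelled"


async def wait_for_job(client: FakeAirbyteClient, job_id: int, poll_interval: float) -> str:
    """Poll until the job leaves the 'running' state, then return its final status."""
    while client.get_status(job_id) == "running":
        await asyncio.sleep(poll_interval)
    return client.get_status(job_id)


async def run_with_timeout(
    client: FakeAirbyteClient, job_id: int, timeout: float, poll_interval: float = 0.01
) -> str:
    """Wait for the job; if the timeout elapses first, cancel it remotely and re-raise."""
    try:
        return await asyncio.wait_for(wait_for_job(client, job_id, poll_interval), timeout)
    except asyncio.TimeoutError:
        # Mirror what on_kill() does in non-deferrable mode: stop the external job.
        client.cancel_job(job_id)
        raise


def demo() -> str:
    """Start a job that never finishes and return its status after the timeout."""
    client = FakeAirbyteClient()
    client.start_job(1)
    try:
        asyncio.run(run_with_timeout(client, 1, timeout=0.05))
    except asyncio.TimeoutError:
        pass  # The task "fails" here; the job should already be cancelled.
    return client.get_status(1)
```

In this sketch the cancellation is tied to the timeout specifically, not to every interruption of the wait. A real fix would likely need the same distinction, since a trigger can also stop for reasons unrelated to execution_timeout (for example a triggerer restart), and cancelling the Airbyte job in those cases would be undesirable.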
How to reproduce
- Configure an Airbyte connection in Airflow (airbyte_default).
- Create an Airbyte connection (source → destination) that runs longer than ~30 seconds (for example, a slow-running sync between two datasets).
- Use the following DAG (replace <CONNECTION_ID> with your Airbyte connection ID):
from airflow import DAG
from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator
from datetime import datetime, timedelta

with DAG(
    dag_id="airbyte_deferrable_execution_timeout_repro",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    run_airbyte = AirbyteTriggerSyncOperator(
        task_id="run_airbyte",
        airbyte_conn_id="airbyte_default",
        connection_id="<CONNECTION_ID>",
        asynchronous=False,
        deferrable=True,
        execution_timeout=timedelta(seconds=30),
    )
- Trigger the DAG and wait for the task to exceed execution_timeout.
Observed Behavior
- The Airflow task fails due to execution timeout
- The Airbyte job continues running and is not cancelled
Anything else
This inconsistency makes execution_timeout unreliable for deferrable Airbyte jobs and can lead to unintended resource usage.
While deferrable operators execute via the triggerer rather than a worker process, this does not preclude enforcing execution_timeout semantics. The expectation is that task-level timeout behavior remains consistent across execution modes, even if the underlying enforcement mechanism differs.
This issue is similar to #61467, which reports the same class of bug for DbtCloudRunJobOperator.
Are you willing to submit PR?
Code of Conduct