Use native SQLA UUID/Uuid instead of sqlalchemy-utils#61532
Use native SQLA UUID/Uuid instead of sqlalchemy-utils#61532potiuk merged 1 commit intoapache:mainfrom
Conversation
airflow-core/src/airflow/migrations/versions/0103_3_2_0_fix_uuid_column_types.py
Outdated
Show resolved
Hide resolved
1828fed to
fa290a9
Compare
ea5f602 to
2156654
Compare
|
I think a lot of small changes, would request another pair of eyes for review if I missed anything, then LGTM! |
bugraoz93
left a comment
There was a problem hiding this comment.
Looks great, thanks!
This looks like breaking for the public API on the datamodel change. It won't in most of the cases due to str translation of HTTP layer but if someone depends on our datamodels to communicate, then API communication would break. It would be amazing if you can add significant newsfragment for it so it could be descriptive to any type of Airflow users :)
@bugraoz93 What do you mean in regards of public API and data model? The DB Model is for sure not a public API and we usually warn everybody who makes calls and interactions with the DB. Rest API only makes now a bit more preceise details about the format of the UUID keys. Do you see this as a breaking change? Because content will not be different. |
|
Birefly closing + re-opening to force a re-check with all versions label added. |
Thanks Jens, good question! This could maybe explain more, so I can see if I missed anything too |
Okay, then I am 95% certain... any kind of customization and type casting is non breaking in my view - just type checking. Otherwise the interpreter should do type casting under the hood automatically. Then let's leave the PR open for a (short) moment if any other expert has a stringer opinion. For me this is an internal change only and only typing outside effect. |
|
This also depends on how we categorise this. I agree that %95 won't be breaking part. For example, FastAPI changed how they return the value, which was breaking for us, even though it wasn't a breaking change and only an internal change. Of course, not saying this should wait for more versions, just trying to see if we can/need to warn some group of people using/building over Airflow :) Great idea to get more take! |
3a7ca09 to
06f9077
Compare
jason810496
left a comment
There was a problem hiding this comment.
For example, if they introduced a new endpoint in their environment, or even applications that extend and use Airflow Public API, or some process that depends on our datamodels to communicate with the Public API or extend in any way. In that case, ti.id is not str anymore for those use cases for them while using as from airflow.api_fastapi.core_api.datamodels.task_instances import TaskInstanceResponse, they may need to do some level of casting or a change in their type annotations in their code. They should do some degree of change, I assume
What do you think?
I think the change from str to UUID in datamodels will also impact airflow-client-python, since it is generated from the yaml.
Maybe we could add a field_serializer to ensure the responses remain compatible with the response layer.
@field_serializer("id", when_used="always")
def serialize_id(self, id_value: UUID) -> str:
return str(id_value)
Couple of questions:
|
jason810496
left a comment
There was a problem hiding this comment.
Couple of questions:
- Why str(id_value) (dashed) and not id_value.hex (dashless)?
- Should we have a test for this behavior if it is some API we must adhere to?
- If we use a dashless UUID as the response (which would also affect the JSON field of the API response), it will be an even bigger breaking change IMHO.
- A simple unit test to verify the data model serialization behavior would be enough, since the generated OpenAPI yaml spec will also reflect that behavior.
|
@jason810496 Please see below Claude's analysis of the point you raised. I am going to add the extra test. Please let me know if you think this is insufficient. Reviewer concern:
|
Oh, indeed, I oversight the generated yaml spec, the type is still string instead of uuid, so I think it still compatible for the generated client.
Yes, the analysis makes sense to me, then I think we don't need |
c8001cb to
4c63256
Compare
4c63256 to
b750e1d
Compare
|
#protm |
|
Cool! THanks for the efforts also from my side! |
This PR removes a dependency on a 3rd party library by switching to native SQLA types.
Migrations:
sa.Uuid(that uses backend-appropriate dtypes automatically).related (split from): #61229
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Opus 4.6 following the guidelines
{pr_number}.significant.rstor{issue_number}.significant.rst, in airflow-core/newsfragments.