fix(postgres): transpile DAY/MONTH/YEAR to EXTRACT [CLAUDE]#7798
fix(postgres): transpile DAY/MONTH/YEAR to EXTRACT [CLAUDE]#7798vjpovlitz wants to merge 2 commits into
Conversation
|
@vjpovlitz thanks for the PR, in order to transpile For example the input (tsql) can be a string value, which we can't transpile in postgres as is (we need on the parsing side of tsql to wrap the function argument with a Let's first make an analysis on all of these cases/inputs ^ in order to know the exact semantics of these functions, then we can come up with a robust implementation that covers theses cases (or a part of them). |
|
Thank you very much for the kind and detailed feedback @geooo109. This is one of my first open source contributions, so I really appreciate the guidance. Please keep it coming, it genuinely helps me grow as a programmer. I made assumptions, and thank you for pointing out that the naive form breaks as soon as the argument isn't already a date. I dug into the semantics and here's what I found. T-SQL
Proposed approach: wrap the argument on the T-SQL parse side in
The existing One scope question before I push: the integer-as-days-since-1900 case ( Would you prefer this PR to: (a) cover date, datetime, and string inputs and treat the integer case as unsupported (smaller and or (b) also emit the I'd like to do (b) if you're open to it, since handling all three input types keeps the transpilation faithful to T-SQL and feels more useful for the project, and I'll make sure it's well tested. If you'd rather keep this PR small, I'm glad to land (a) first and do (b) as a follow-up. |
|
@vjpovlitz great work!! thanks for the analysis. Let's focus on (a) for this PR, and we can do a following PR for the integer values. Keep in mind that the input can be a column, so the value of the argument isn't known (non-literal) on parse time. For example: So, we should use the sqlglot/sqlglot/generators/duckdb.py Line 3007 in 7521a08 You can assume that we always run in order the parsing-type annotation-generation, check for example: sqlglot/tests/dialects/test_duckdb.py Line 1511 in 7521a08 annotate_types and it's part of sqlglot's optimizer).
Before doing this ^, can you verify that it works for the tsql formats ? For example the docs state the following (for the This ^ will break the transpilation. |
|
@geooo109 Thanks, this was really helpful guidance. I implemented (a) and verified it against a local Postgres 15 on a MacBook and a Linux VM I have. Changes:
One note on the pipeline: On the formats, I ran each through real Postgres:
Committing and pushing to the PR branch and going through the GitHub actions checks right now, pausing for your review. Ty! |
Summary
PostgreSQL has no scalar
DAY/MONTH/YEARfunctions — date parts are read viaEXTRACT(field FROM source). When transpiling from a dialect that does have them (e.g.T-SQL), SQLGlot emitted the functions unchanged, producing invalid Postgres SQL.
This adds Postgres generator rules so
exp.Day/exp.Month/exp.Yearrender asEXTRACT(<part> FROM ...).Reproduction
MONTHandYEARbehave the same.EXTRACTis the standard Postgres construct for theseparts: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
Approach
exp.Day/exp.Month/exp.Yearnodes, so only the Postgres output was wrong. Parsing and otherdialects are untouched (
tsql -> tsqlstill yieldsDAY(d)).exp.Extractnode rather than hand-built SQL, and the threerules share a small
_extract_sql(part)factory in the style of the neighboring_date_add_sql(kind)helper.they likewise lack scalar
DAY/MONTH/YEAR.Test
tests/dialects/test_postgres.py::TestPostgres::test_extract_date_partsasserts theT-SQL -> Postgres transpilation and that the
EXTRACTform round-trips in Postgres.Fixes #6220