feat(snowflake): T1.2 data types — TypeName struct + parseDataType #39
Open
Single TypeName struct with TypeKind enum (22 values) covering every Snowflake data type from the legacy grammar's data_type rule. Numeric params via []int slice, recursive ElementType for ARRAY/VECTOR, and VectorDim for VECTOR(type, dim). Requires small F2 scope creep: 3 new keyword constants (kwTIMESTAMP_LTZ/NTZ/TZ) + 6 keyword map entries (with/without underscore aliases). Multi-word types (DOUBLE PRECISION, CHAR VARYING, NCHAR VARYING) handled via parser lookahead, not F2 changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6-task plan: F2 timestamp variant keywords (3 constants + 6 map entries), TypeKind enum (22 values) + TypeName struct in ast, AST unit tests, parseDataType dispatch (all type forms from legacy grammar), parser tests (16 categories), acceptance sweep. Handles multi-word types (DOUBLE PRECISION, CHAR/NCHAR VARYING) via parser lookahead, ARRAY(element_type) via recursive parse, and VECTOR(type, dim) via restricted element-type rule. Walker regen produces a real *TypeName case with ElementType child walk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Second Tier 1 node. Adds data type parsing covering every form from the
legacy SnowflakeParser.g4 data_type rule.
AST types (snowflake/ast):
type TypeKind int // 22 values: TypeInt through TypeVector
type TypeName struct {
    Kind        TypeKind
    Name        string    // source text for round-tripping
    Params      []int     // numeric params: [precision] or [precision, scale]
    ElementType *TypeName // ARRAY element / VECTOR element type
    VectorDim   int       // VECTOR dimensions; -1 for non-VECTOR
    Loc         Loc
}
TypeName is a Node (T_TypeName tag). The walker descends into
ElementType when non-nil (verified by TestTypeName_WalkerVisitsElementType).
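The recursive shape is easiest to see on a nested type. The following is a minimal sketch, not the generated walker: walkTypeName and visitedNames are hypothetical helpers, only three of the 22 kinds are declared, and Loc is left out so the block stands alone.

```go
package main

import "fmt"

// TypeKind and TypeName mirror the shapes quoted in the commit message.
// Only three of the 22 kinds are listed and Loc is omitted for brevity.
type TypeKind int

const (
	TypeInt TypeKind = iota
	TypeArray
	TypeVector
)

type TypeName struct {
	Kind        TypeKind
	Name        string
	Params      []int
	ElementType *TypeName
	VectorDim   int
}

// walkTypeName is a hypothetical stand-in for the generated walker:
// it visits t, then descends into ElementType when it is non-nil.
func walkTypeName(t *TypeName, visit func(*TypeName)) {
	if t == nil {
		return
	}
	visit(t)
	walkTypeName(t.ElementType, visit)
}

// visitedNames collects Name from every node reached by the walk.
func visitedNames(t *TypeName) []string {
	var names []string
	walkTypeName(t, func(n *TypeName) { names = append(names, n.Name) })
	return names
}

func main() {
	// ARRAY(ARRAY(INT)): two ARRAY nodes wrapping one INT leaf.
	leaf := &TypeName{Kind: TypeInt, Name: "INT", VectorDim: -1}
	inner := &TypeName{Kind: TypeArray, Name: "ARRAY", ElementType: leaf, VectorDim: -1}
	outer := &TypeName{Kind: TypeArray, Name: "ARRAY", ElementType: inner, VectorDim: -1}
	fmt.Println(visitedNames(outer)) // prints [ARRAY ARRAY INT]
}
```

Because ElementType is the only child pointer, a single recursive call covers both typed ARRAY and VECTOR without per-kind cases.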
Parser helpers (snowflake/parser/datatypes.go):
parseDataType()            - full dispatch: 30+ keyword cases
parseOptionalTypeParams()  - optional (n) or (n, m)
parseTimestampType()       - shared handler for TIMESTAMP variants
parseVectorElementType()   - restricted to INT/INTEGER/FLOAT/FLOAT4/FLOAT8
ParseDataType(string)      - freestanding exported helper
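To illustrate the optional (n) or (n, m) form, here is a simplified sketch of how such a helper might behave. It assumes pre-split string tokens rather than the real F2 token stream, and the signature (token slice plus position, returning params, next position, and an error) is invented for the example; the actual helper is a parser method whose shape is not shown in this PR text.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseOptionalTypeParams is a simplified sketch over string tokens.
// It consumes "(n)" or "(n, m)" starting at toks[pos] and returns the
// params plus the next position. A nil slice means that no parameter
// list was present, which is distinct from a list containing zero.
func parseOptionalTypeParams(toks []string, pos int) ([]int, int, error) {
	if pos >= len(toks) || toks[pos] != "(" {
		return nil, pos, nil // no parameter list: nothing consumed
	}
	pos++ // consume "("
	params := []int{}
	for {
		if pos >= len(toks) {
			return nil, pos, fmt.Errorf("unexpected end of input in type parameters")
		}
		n, err := strconv.Atoi(toks[pos])
		if err != nil {
			return nil, pos, fmt.Errorf("expected integer, got %q", toks[pos])
		}
		params = append(params, n)
		pos++
		// Allow at most one comma, so at most two parameters.
		if pos < len(toks) && toks[pos] == "," && len(params) < 2 {
			pos++
			continue
		}
		break
	}
	if pos >= len(toks) || toks[pos] != ")" {
		return nil, pos, fmt.Errorf("expected ) after type parameters")
	}
	return params, pos + 1, nil
}

func main() {
	// TIME(0): zero is a valid precision, so the slice is non-nil.
	p, _, _ := parseOptionalTypeParams([]string{"(", "0", ")"}, 0)
	fmt.Println(p, p == nil)

	// Bare TIME: no parameter list at all, so the slice is nil.
	p, _, _ = parseOptionalTypeParams(nil, 0)
	fmt.Println(p, p == nil)
}
```

Returning nil when no parenthesis is present keeps TIME and TIME(0) distinguishable in the resulting Params field.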
Multi-word types (DOUBLE PRECISION, CHAR VARYING, NCHAR VARYING) are
fused via parser lookahead — F2 emits them as separate tokens (kwDOUBLE
+ tokIdent("PRECISION")), the parser peeks and combines.
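The peek-and-combine step can be sketched roughly as follows. This is an illustration only: fuseMultiWord is a hypothetical function working on plain strings, whereas the real parser peeks at F2 tokens (a keyword token followed by an identifier token) inside parseDataType.

```go
package main

import (
	"fmt"
	"strings"
)

// fuseMultiWord is a hypothetical sketch of the lookahead. Given word
// tokens and a position, it returns the upper-cased type name and how
// many tokens it consumed (2 when a second word was fused, else 1).
func fuseMultiWord(toks []string, pos int) (string, int) {
	if pos >= len(toks) {
		return "", 0
	}
	head := strings.ToUpper(toks[pos])

	// Decide which second word, if any, may follow this head word.
	var want string
	switch head {
	case "DOUBLE":
		want = "PRECISION"
	case "CHAR", "NCHAR":
		want = "VARYING"
	default:
		return head, 1
	}

	// Peek one token ahead and fuse only when it matches.
	if pos+1 < len(toks) && strings.EqualFold(toks[pos+1], want) {
		return head + " " + want, 2
	}
	return head, 1
}

func main() {
	name, n := fuseMultiWord([]string{"double", "precision"}, 0)
	fmt.Println(name, n)

	name, n = fuseMultiWord([]string{"CHAR", "(", "10", ")"}, 0)
	fmt.Println(name, n)
}
```

Keeping the fusion in the parser means PRECISION and VARYING can stay ordinary identifiers everywhere else, which matches the stated goal of avoiding F2 changes for these types.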
F2 keyword additions (scope creep, 3 constants + 6 map entries):
kwTIMESTAMP_LTZ, kwTIMESTAMP_NTZ, kwTIMESTAMP_TZ
Both underscore and no-underscore forms in keywordMap
(TIMESTAMP_LTZ and TIMESTAMPLTZ both accepted)
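As a rough picture of the 3 constants plus 6 map entries, the sketch below maps both spellings of each variant to the same constant. The keyword type, the kwNone zero value, the lookupKeyword helper, and the upper-casing before lookup are all assumptions made for the example; only the three kwTIMESTAMP_* names and the keywordMap name come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// keyword and kwNone are hypothetical; the real F2 lexer has its own
// token type and many more constants.
type keyword int

const (
	kwNone keyword = iota
	kwTIMESTAMP_LTZ
	kwTIMESTAMP_NTZ
	kwTIMESTAMP_TZ
)

// keywordMap holds six entries: each of the three variants appears
// with and without the underscore, and both map to one constant.
var keywordMap = map[string]keyword{
	"TIMESTAMP_LTZ": kwTIMESTAMP_LTZ,
	"TIMESTAMPLTZ":  kwTIMESTAMP_LTZ,
	"TIMESTAMP_NTZ": kwTIMESTAMP_NTZ,
	"TIMESTAMPNTZ":  kwTIMESTAMP_NTZ,
	"TIMESTAMP_TZ":  kwTIMESTAMP_TZ,
	"TIMESTAMPTZ":   kwTIMESTAMP_TZ,
}

// lookupKeyword upper-cases the word before the map lookup
// (case-insensitive matching is assumed here, not taken from the PR).
func lookupKeyword(word string) (keyword, bool) {
	kw, ok := keywordMap[strings.ToUpper(word)]
	return kw, ok
}

func main() {
	a, _ := lookupKeyword("TIMESTAMP_LTZ")
	b, _ := lookupKeyword("timestampltz")
	fmt.Println(a == b) // both spellings resolve to kwTIMESTAMP_LTZ
}
```

Resolving the aliases at the keyword-map level lets a shared handler such as parseTimestampType() switch on three constants instead of six spellings.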
Tests: 25 AST subtests (TypeKind.String, Tag, walker traversal) +
60+ parser subtests (all integer types, NUMBER with params, float
aliases, DOUBLE PRECISION, CHAR/NCHAR VARYING, all string types,
binary, timestamp variants incl. no-underscore forms, TIME/DATETIME
with precision, BOOLEAN, semi-structured, ARRAY untyped/typed/nested,
VECTOR, error cases, freestanding helper, Loc spanning).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Second Tier 1 node of the snowflake parser migration (BYT-9000): data type parsing covering every form from the legacy data_type grammar rule. Adds TypeName Node struct with TypeKind enum (22 values) to snowflake/ast, plus parseDataType() parser helper to snowflake/parser.
Depends on T1.1 identifiers (PR #22, merged). Unblocks T1.3 (expressions, which need type parsing for CAST/TRY_CAST and the :: operator).
Design highlights
TypeName struct with Kind TypeKind enum for fast dispatch + Name string for round-tripping. Covers INT through VECTOR with 22 classified categories. No per-type-family struct bloat.
Params []int for numeric parameters (NUMBER(38,0), VARCHAR(100), TIME(9)). Nil = no params; zero IS a valid value (TIME(0)).
ElementType *TypeName for recursive ARRAY and VECTOR element types. Walker descends into it.
Test plan
snowflake/ast/ and snowflake/parser/
🤖 Generated with Claude Code