-
Notifications
You must be signed in to change notification settings - Fork 2k
Description
Describe the bug
When an INSERT INTO ... VALUES statement contains a mix of bare placeholders ($1, $3) and function-wrapped placeholders (func($2)), the inferred types for placeholders can be incorrectly assigned due to an index corruption bug in insert_to_plan.
It scans the VALUES clause and collects placeholder-to-column-type mappings into a BTreeMap<usize, FieldRef>. Only direct placeholders are matched — placeholders nested inside function calls are skipped. The BTreeMap is then converted to a Vec via into_values().collect(), which discards the keys and compresses the sparse entries into a dense Vec.
For example, given:
INSERT INTO t (id, name, age) VALUES ($1, upper($2), $3)
-- columns: id(UInt32), name(Utf8), age(Int32)- The scan produces
BTreeMap { 0: UInt32, 2: Int32 }(index 1 skipped because $2 is inside upper()) into_values().collect()producesVec [UInt32, Int32]— indices 0 and 2 are collapsed to 0 and 1- When
create_placeholder_exprresolves $2, it computesidx = 2 - 1 = 1, then looks up Vec[1] = Int32 $2is incorrectly inferred asInt32(the type of age) instead of being left as unknown (None)
This causes downstream type-checking failures. In real-world usage (e.g., GreptimeDB GreptimeTeam/greptimedb#7610), INSERT INTO t VALUES ($1, parse_json($2), $3) fails because $2 is wrongly inferred as Timestamp instead of String, making parse_json(Timestamp) fail type validation.
Fix
Replace into_values().collect() with a conversion that preserves index positions, filling gaps with None:
// Before (broken):
let prepare_param_data_types = prepare_param_data_types.into_values().collect();
// After (correct):
let prepare_param_data_types: Vec<Option<FieldRef>> =
if let Some(&max_idx) = prepare_param_data_types.keys().last() {
(0..=max_idx)
.map(|i| prepare_param_data_types.remove(&i))
.collect()
} else {
vec![]
};This requires changing the prepare_param_data_types type from Vec<FieldRef> to Vec<Option<FieldRef>> throughout the PlannerContext API to properly represent "type unknown for this placeholder position".
I'd love to create a PR for this fix if you agree.
Thanks!
To Reproduce
No response
Expected behavior
No response
Additional context
No response