Fix: Add missing serial_size() method to ShortType#688
Fix: Add missing serial_size() method to ShortType#688mykaul wants to merge 1 commit intoscylladb:masterfrom
Conversation
ShortType (smallint) is a fixed-size type (2 bytes) but was missing the serial_size() classmethod. This prevented optimizations that depend on knowing the fixed size of elements, such as the struct.unpack optimization in VectorType deserialization. All other fixed-size numeric types (Int32Type, LongType, FloatType, DoubleType, BooleanType) already have serial_size() defined. Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
|
This is not a bug in the driver. Even better: this PR introduces a bug in the driver! impl NativeType {
/// This function returns the size of the type as it is used by Cassandra
/// for the purposes of serialization and deserialization of vectors,
/// it is needed as the variable size types (`None`) are de/serialized
/// differently than the fixed size types (`Some(size)`). Note that
/// many fixed size types are treated as variable size by Cassandra.
pub(crate) fn type_size(&self) -> Option<usize> {
match self {
NativeType::Ascii => None,
NativeType::Boolean => Some(1),
NativeType::Blob => None,
NativeType::Counter => None,
NativeType::Date => None,
NativeType::Decimal => None,
NativeType::Double => Some(8),
NativeType::Duration => None,
NativeType::Float => Some(4),
NativeType::Int => Some(4),
NativeType::BigInt => Some(8),
NativeType::Text => None,
NativeType::Timestamp => Some(8),
NativeType::Inet => None,
NativeType::SmallInt => None,
NativeType::TinyInt => None,
NativeType::Time => None,
NativeType::Timeuuid => Some(16),
NativeType::Uuid => Some(16),
NativeType::Varint => None,
}
}
} |
|
Wow. Seriously? And they did not bother to fix it? Wow. Why do we have to live with it? Why not do it correctly against ScyllaDB? Did Scylla follow and got it 'wrong' too ? |
|
CC @swasik for awareness. Scary stuff - how does Scylla work in this case? |
That would require a new protocol version, or at least an extension.
We are protocol-compatible with Cassandra. If we did it differently, our drivers would not work with C*, and their drivers would not work with Scylla.
Yes, and it is not their only questionable design decision regarding vectors.
Adding Vector support was 2nd most commented Rust Driver PR ever for good reason. |
@Lorak-mmk , @swasik - WDYT? |
But I think we want to keep the compatibility with Cassandra drivers - this way Cassandra integration (such as LangChain) will work out of the box without the need to change to Scylla drivers. |
The unfortunate truth is that the protocol specs are really bad w.r.t vectors. Another example of breakage: We could fix vectors with protocol extension. Does it make sense? I don't think so. None of the issues affect |
|
Things are going from bad to worse. Wireshark CQL dissector doesn't parse custom fields in a response. So now I'm fixing that. But it doesn't compile, so now I'm bisecting that. |
I'm hoping that this is good enough. I think I'll focus on float optimization and let the rest be as it used to be. |
ShortType (smallint) is a fixed-size type (2 bytes) but was missing the serial_size() classmethod. This prevented optimizations that depend on knowing the fixed size of elements, such as the struct.unpack optimization in VectorType deserialization.
All other fixed-size numeric types (Int32Type, LongType, FloatType, DoubleType, BooleanType) already have serial_size() defined.
Pre-review checklist
./docs/source/.Fixes:annotations to PR description.