Conversation
connortsui20
left a comment
There was a problem hiding this comment.
Is it possible to follow the template a bit more closely? It's less about following the exact template and more that it's a bit hard to understand what it being proposed and the different tradeoffs (alternatives) without some of the sections in the template. Basically, when reading through this I'm trying to figure out the "why" via the "what", and it should be the other way around.
Additionally, there is quite a lot of code here that is paired with high-level description. I think it would be helpful if some of the boilerplate-y code was removed so we could focus on things that are not obvious. That would help in making this easier to read as well.
Also as an aside: I'm not an FFI expert but I feel like there is a lack of ownership semantics described here. Shouldn't that be one of the main things we have to worry about with interop between Rust and C?
I've thought about this, and likely we don't want to expose (in-memory) caching as a separate implementation. If host provides us with malloc/free-compatible functions, we can build our cache inside Vortex and host will still have observability. On persistent caching, this may possibly be useful for remote files, but in this case it should be host's responsibility to save the decoded or original file somewhere on the filesystem and provide us with a reference. This can already be done with file_open (host may get the reference to a cached file). High level API is missing a way to ready a file directly from a buffer, but I didn't add it as I'm not sure this would be used by anyone. |
Duckdb integration PoC: https://github.com/vortex-data/vortex/tree/myrrc/scan-api-duckdb