
v3 websocket transport protocol #4761

Open
joshua-spacetime wants to merge 2 commits into master from joshua/ws/v3

Conversation

@joshua-spacetime
Collaborator

Description of Changes

The v3 WebSocket API adds a thin transport layer around the existing v2 message schema so that multiple logical ClientMessages can be sent in a single WebSocket frame.

The motivation is throughput. In v2, each logical client message requires its own WebSocket frame, which adds per-frame overhead in the client runtime, server framing/compression path, and network stack. High-throughput clients naturally issue bursts of requests, and batching those requests into a single frame materially reduces that overhead while preserving the existing logical message model.

v3 keeps the v2 message schema intact and treats batching as a transport concern rather than a semantic protocol change. This lets the server support both protocols cleanly:

  • v2 remains unchanged for existing clients
  • v3 allows new clients to batch logical messages without changing reducer/procedure semantics
  • inner messages are still processed in order

On the server side, this PR adds:

  • v3.bsatn.spacetimedb protocol support
  • ClientFrame / ServerFrame transport envelopes
  • decoding of inbound batched client frames into ordered v2 logical messages
  • v3 outbound framing on the server side
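The inbound path can be illustrated with a small, self-contained sketch. This is not the actual server code: the real protocol encodes inner messages with BSATN, whereas this toy uses little-endian u32 length prefixes, and all names are hypothetical. It only demonstrates the key property that one frame decodes into an ordered sequence of logical messages:

```rust
/// Split one batched frame into its inner messages, preserving order.
/// Toy framing: each inner message is prefixed with a little-endian u32 length.
fn decode_batched_frame(frame: &[u8]) -> Result<Vec<&[u8]>, &'static str> {
    let mut messages = Vec::new();
    let mut rest = frame;
    while !rest.is_empty() {
        if rest.len() < 4 {
            return Err("truncated length prefix");
        }
        let len = u32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]) as usize;
        let tail = &rest[4..];
        if tail.len() < len {
            return Err("truncated message body");
        }
        messages.push(&tail[..len]); // inner messages stay in order
        rest = &tail[len..];
    }
    Ok(messages)
}

fn main() {
    // Pack two logical messages into a single frame, then decode them back.
    let mut frame = Vec::new();
    for msg in [b"hello".as_slice(), b"world!".as_slice()] {
        frame.extend_from_slice(&(msg.len() as u32).to_le_bytes());
        frame.extend_from_slice(msg);
    }
    let decoded = decode_batched_frame(&frame).unwrap();
    assert_eq!(decoded, vec![b"hello".as_slice(), b"world!".as_slice()]);
    println!("decoded {} messages in order", decoded.len());
}
```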

API and ABI breaking changes

None. v2 clients continue to work unchanged.

Expected complexity level and risk

2

Testing

Testing will be included in the patches that update the SDK bindings.

Contributor

@Centril Centril left a comment

My main comment here is that I think Client/ServerFrame don't actually need to exist and could just be Bytes, which would also be more efficient and would avoid some of the temporary allocations introduced here.

/// The inner bytes are BSATN-encoded v2 [`crate::websocket::v2::ClientMessage`] values.
#[derive(SpacetimeType, Debug)]
#[sats(crate = spacetimedb_lib)]
pub enum ClientFrame {
Contributor

Hmm; both Single and Batch could actually be stored as just Bytes with the logic that the host will try to deserialize as many ClientMessages as it can until the read buffer has been exhausted.
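A rough sketch of that idea, with `decode_one` standing in for the BSATN deserializer of a single v2 `ClientMessage` (everything here is illustrative, not the real host API):

```rust
/// Stand-in for running the BSATN deserializer once against the front of the
/// buffer; returns the decoded "message" and the number of bytes consumed.
/// (Toy format: each message is a single byte.)
fn decode_one(buf: &[u8]) -> Option<(u8, usize)> {
    buf.first().map(|&b| (b, 1))
}

/// The suggested scheme: no envelope at all. Keep deserializing messages
/// until the read buffer is exhausted; a frame carrying one message and a
/// frame carrying many are handled identically.
fn drain_messages(mut buf: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    while let Some((msg, used)) = decode_one(buf) {
        out.push(msg);
        buf = &buf[used..];
    }
    out
}

fn main() {
    assert_eq!(drain_messages(&[10, 20, 30]), vec![10, 20, 30]);
    assert!(drain_messages(&[]).is_empty());
    println!("drained all messages in order");
}
```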

Contributor

Alternatively, we could go for a more structured / typed representation here, with Box<[super::v2::ClientMessage]>. This would still include some of the overhead/bookkeeping bytes and allocations which Mazdak's suggestion eliminates, but it would add type safety and eliminate a lot of ser/de boilerplate code. That would look like:

enum ClientFrame {
    Single(super::v2::ClientMessage),
    Batch(Box<[super::v2::ClientMessage]>),
}

And, you know, doing the same thing to ServerMessage.
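For illustration, host-side handling of that typed representation could look like the following sketch, with `ClientMessage` stubbed out as a placeholder for `super::v2::ClientMessage`:

```rust
/// Placeholder for `super::v2::ClientMessage`; the real type is unchanged by v3.
#[derive(Debug, PartialEq)]
struct ClientMessage(u32);

/// The typed transport envelope suggested above.
enum ClientFrame {
    Single(ClientMessage),
    Batch(Box<[ClientMessage]>),
}

/// Unwrap a frame into the ordered v2 messages the existing handlers expect.
fn into_messages(frame: ClientFrame) -> Vec<ClientMessage> {
    match frame {
        ClientFrame::Single(m) => vec![m],
        ClientFrame::Batch(ms) => ms.into_vec(), // order preserved
    }
}

fn main() {
    let batch = ClientFrame::Batch(vec![ClientMessage(1), ClientMessage(2)].into_boxed_slice());
    assert_eq!(into_messages(batch), vec![ClientMessage(1), ClientMessage(2)]);
    assert_eq!(into_messages(ClientFrame::Single(ClientMessage(7))), vec![ClientMessage(7)]);
    println!("frames unwrap in order");
}
```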

Contributor

Yeah, if we go for the tagged representation, then a typed one seems better. However, given that the goal of this PR is perf, it seems to me that we should go for the representation with the least overhead. (This PR is actually a regression for the single-message case, but with my suggestion the added overhead is a single branch rather than a full extra allocation / taking from the pool + memcpy.)

Collaborator Author

I think I prefer @Centril's suggestion to make v3 purely a transport protocol for v2 and remove the message schema entirely.


pub mod common;
pub mod v1;
pub mod v2;
pub mod v3;
Contributor

Why do you prefer adding a new version with a wrapper, rather than adding new message variants to v2?

Collaborator Author

Mainly because this new version does not represent a change in the message format/schema, but rather in message transport/framing. I wanted to keep all of the v2 message handlers essentially unchanged, and this seemed like the most maintainable way to split it.

clockwork-labs-bot pushed a commit that referenced this pull request Apr 12, 2026
# Description of Changes

Adds TypeScript SDK support for the `v3.bsatn.spacetimedb` WebSocket API,
which was added in #4761.

`v3` just adds batching on top of `v2`, so adding support for it just
required the following:
1. The client negotiates the protocol with the server and falls back to
`v2` if necessary
2. When `v3` is negotiated, same-tick outbound client messages are
batched into a single ws frame
3. Inbound `v3` server frames are unwrapped into their inner v2 messages
and processed in-order

Some notes on the batching behavior:
1. Outbound `v3` frames are capped at `256 KiB` (unless a single message
exceeds the limit)
2. The SDK sends one v3 frame per flush
3. If more messages remain queued after hitting the cap, it schedules a
follow-up flush on a later task instead of draining in a tight loop, so
that outbound ws work doesn't starve inbound ws work or any other
event-loop tasks
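The actual SDK is TypeScript, but the capped-batching rule in notes 1-3 can be sketched in Rust along these lines. All names and types here are hypothetical; only the cap behavior is taken from the description above:

```rust
use std::collections::VecDeque;

/// Cap described in the notes above.
const MAX_FRAME_BYTES: usize = 256 * 1024;

/// One flush: pack queued messages into a single frame until the cap is hit,
/// leaving the remainder queued for a scheduled follow-up flush. A single
/// oversized message is still sent alone rather than being stuck forever.
fn flush_once(mut queue: VecDeque<Vec<u8>>) -> (Vec<Vec<u8>>, VecDeque<Vec<u8>>) {
    let mut frame = Vec::new();
    let mut frame_bytes = 0usize;
    while let Some(next) = queue.front() {
        if !frame.is_empty() && frame_bytes + next.len() > MAX_FRAME_BYTES {
            break; // cap reached: the rest waits for the next flush
        }
        let msg = queue.pop_front().unwrap();
        frame_bytes += msg.len();
        frame.push(msg);
    }
    (frame, queue)
}

fn main() {
    let queue: VecDeque<Vec<u8>> = vec![vec![0u8; 200 * 1024], vec![0u8; 100 * 1024]].into();
    let (frame, rest) = flush_once(queue);
    assert_eq!(frame.len(), 1); // adding the second message would exceed 256 KiB
    assert_eq!(rest.len(), 1); // left queued for the follow-up flush
    println!("sent {} message(s), {} still queued", frame.len(), rest.len());
}
```

Draining the remainder on a later task (rather than looping here) is what keeps a large outbound backlog from starving inbound processing.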

# API and ABI breaking changes

None. `v2` fallback remains supported, so existing servers that only
negotiate `v2` still work with the updated client.

# Expected complexity level and risk

3

Constraints that must hold:
1. Same-tick sends should coalesce under `v3`
2. `v2` should remain unaffected
3. Capped batching should not starve inbound processing
4. Server frames containing multiple logical messages must be processed
in-order
5. Handshake fallback to `v2` must remain correct

# Testing

Added tests for the fast path and fallback behavior

joshua-spacetime changed the title from "v3 websocket protocol" to "v3 websocket transport protocol" on Apr 14, 2026