feat(go): add go desrialization support via io streams by ayush00git · Pull Request #3374 · apache/fory

ayush00git · 2026-02-20T04:48:00Z

Why?

To enable stream-based deserialization in Fory's Go library, allowing for direct reading from io.Reader without pre-buffering the entire payload. This improves efficiency for network and file-based transport.

What does this PR do?

1. go/fory/buffer.go

Enhanced ByteBuffer to support io.Reader with an internal sliding window and automatic filling.

Added reader io.Reader and minCap int fields.
Implemented fill(n int) bool for on-demand data fetching and compaction.
Updated all Read* methods (fixed-size, varint, tagged) to fetch data from the reader if not cached.

func (b *ByteBuffer) fill(n int) bool {
    if b.reader == nil { return false }
    // Compaction and stream reading logic
    ...
}
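
To make the sliding-window idea above concrete, here is a minimal sketch of an on-demand fill with compaction. It is not the PR's code: the type `windowBuffer`, the helper `drainDemo`, and the growth policy are hypothetical stand-ins, and error handling is reduced to a boolean, as in the summary above.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// windowBuffer is a hypothetical stand-in for a stream-backed ByteBuffer.
type windowBuffer struct {
	data        []byte    // data[readerIndex:] holds the unread bytes
	readerIndex int       // next byte to hand out
	reader      io.Reader // nil means memory-only mode
	minCap      int       // minimum capacity used when (re)allocating
}

// fill tries to make at least n unread bytes available.
func (b *windowBuffer) fill(n int) bool {
	if b.reader == nil {
		return false
	}
	// Compaction: slide the unread bytes to the front of the slice.
	unread := copy(b.data, b.data[b.readerIndex:])
	b.data = b.data[:unread]
	b.readerIndex = 0
	// Grow when the backing array cannot hold n bytes.
	if cap(b.data) < n || cap(b.data) < b.minCap {
		want := n
		if b.minCap > want {
			want = b.minCap
		}
		grown := make([]byte, unread, want)
		copy(grown, b.data)
		b.data = grown
	}
	// Pull from the stream until n bytes are cached or the reader stops.
	for len(b.data) < n {
		m, err := b.reader.Read(b.data[len(b.data):cap(b.data)])
		b.data = b.data[:len(b.data)+m]
		if err != nil {
			break
		}
	}
	return len(b.data) >= n
}

// readByte returns the next byte, fetching from the reader if nothing is cached.
func (b *windowBuffer) readByte() (byte, bool) {
	if b.readerIndex >= len(b.data) && !b.fill(1) {
		return 0, false
	}
	v := b.data[b.readerIndex]
	b.readerIndex++
	return v, true
}

// drainDemo reads a short payload through a reader that yields one byte per Read.
func drainDemo() string {
	b := &windowBuffer{reader: iotest.OneByteReader(strings.NewReader("hello")), minCap: 4}
	var out []byte
	for {
		v, ok := b.readByte()
		if !ok {
			return string(out)
		}
		out = append(out, v)
	}
}

func main() {
	fmt.Println(drainDemo())
}
```

Note that this sketch collapses every reader error into `false`, which is exactly the behavior questioned later in the review.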

2. go/fory/fory.go

Added the DeserializeFromReader method as the primary public API for stream deserialization.

Integrated io package.
Implemented DeserializeFromReader to reset the buffer state and initiate deserialization from a stream.

func (f *Fory) DeserializeFromReader(r io.Reader, v any) error {
    defer f.resetReadState()
    f.readCtx.buffer.ResetWithReader(r, 0)
    // Deserialization logic
    ...
}

3. go/fory/reader.go

Ensured ReadContext correctly manages the buffer state when switching between memory-only and stream-backed modes.

Updated SetData to reset the reader field.

Related issues

Closes #3302

Does this PR introduce any user-facing change?

Does this PR introduce any public API change?
Does this PR introduce any binary protocol compatibility change?

Benchmark

N/A

ayush00git · 2026-02-21T04:39:58Z

Hey @chaokunyang
Please review and let me know if any changes are needed.

Zakir032002 · 2026-02-23T07:05:46Z

hey @ayush00git, looked through this and the main issue i see is in DeserializeFromReader —
it calls ResetWithReader at the start of every call:

func (f *Fory) DeserializeFromReader(r io.Reader, v any) error {
    defer f.resetReadState()
    f.readCtx.buffer.ResetWithReader(r, 0) // this wipes the prefetch window every time

so if fill() reads ahead past the first object boundary (which it will), those bytes
are gone on the next call. sequential decode from one stream is broken:

for {
    var msg Msg
    f.DeserializeFromReader(conn, &msg) // bytes after first object get thrown away
}

if you look at how he handles this for c++/python — the Buffer is constructed
from the stream once and passed to each deserialize call directly. the buffer holds
state across calls, it's never reset between objects. the python test
test_stream_deserialize_multiple_objects_from_single_stream shows this exactly —
same reader buffer passed to multiple fory.deserialize() calls.

the go version probably needs something similar — a stream reader type that owns the
buffer and gets reused across deserializations rather than resetting on each call.

Happy to discuss if I'm misreading the flow here
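
The read-ahead problem described above can be reproduced with the standard library alone, since `bufio.Reader` also prefetches. The sketch below is only an analogy, not Fory code: newline-delimited strings stand in for serialized objects, and `decodePerCall` and `decodeShared` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// decodePerCall builds a fresh buffered reader for every message, which
// mirrors resetting the buffer at the start of each deserialize call.
func decodePerCall(payload string) (string, string) {
	src := strings.NewReader(payload)
	// The first reader prefetches the whole payload into its own buffer.
	first, _ := bufio.NewReader(src).ReadString('\n')
	// The second reader starts empty; the prefetched bytes are gone.
	second, _ := bufio.NewReader(src).ReadString('\n')
	return first, second
}

// decodeShared keeps one buffered reader alive across messages, which
// mirrors a stream reader type that owns the buffer.
func decodeShared(payload string) (string, string) {
	br := bufio.NewReader(strings.NewReader(payload))
	first, _ := br.ReadString('\n')
	second, _ := br.ReadString('\n')
	return first, second
}

func main() {
	a, b := decodePerCall("msg1\nmsg2\n")
	fmt.Printf("per-call: %q %q\n", a, b)
	c, d := decodeShared("msg1\nmsg2\n")
	fmt.Printf("shared:   %q %q\n", c, d)
}
```

With the per-call variant the second message comes back empty, because the first reader already consumed it from the source.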

ayush00git · 2026-02-23T09:12:08Z

Hi @Zakir032002
Thanks for noticing this; it is indeed a bug in my implementation. The call clears any prefetched data from the ByteBuffer, which makes sequential reads from one stream impossible, and it also clears the type metadata. I'll look at the C++ and Python implementations to correct the deserializer.

Zakir032002 · 2026-02-23T10:04:29Z

hey @ayush00git , one more thing — ReadBinary and ReadBytes return a direct slice into
b.data:

v := b.data[b.readerIndex : b.readerIndex+length]
return v

the problem is fill() compacts the buffer in-place:

copy(b.data, b.data[b.readerIndex:])

so if someone reads a []byte field and holds onto that slice, then the next
read triggers a fill() — the compaction just overwrote the bytes they're
still holding. no error, no panic, just wrong data.

in stream mode you probably want to copy before returning instead of aliasing:

if b.reader != nil {
    result := make([]byte, length)
    copy(result, b.data[b.readerIndex:b.readerIndex+length])
    b.readerIndex += length
    return result
}

in-memory path stays as is.
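
A self-contained illustration of the aliasing hazard, using a plain byte slice instead of the real ByteBuffer. The function `aliasDemo` and the sample data are hypothetical.

```go
package main

import "fmt"

// aliasDemo returns what a caller sees after compaction when it held
// an aliased slice versus a copied one.
func aliasDemo() (aliased, copied string) {
	data := []byte("ABCDEFGH")
	readerIndex := 2

	// Aliasing read: the result shares memory with data.
	alias := data[readerIndex : readerIndex+2] // "CD"
	// Copying read: the result owns its bytes.
	safe := append([]byte(nil), alias...)
	readerIndex += 2

	// In-place compaction, as fill() does: unread bytes move to the front.
	copy(data, data[readerIndex:]) // data is now "EFGHEFGH"

	return string(alias), string(safe)
}

func main() {
	aliased, copied := aliasDemo()
	fmt.Println("aliased slice now reads:", aliased)
	fmt.Println("copied slice still reads:", copied)
}
```

The aliased slice silently changes from "CD" to "GH", while the copy keeps its original contents.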

Zakir032002 · 2026-02-23T10:07:58Z

also noticed — ReadVarUint32Small7 only does fill(1) for the first byte, but if that byte has 0x80 set it falls through to continueReadVarUint32 which isn't touched in this PR. so in stream mode, if a multi-byte varint straddles a chunk boundary, the continuation bytes may not be in the buffer yet — you either get a BufferOutOfBoundError or silently read the wrong bytes depending on what's sitting at that position in the buffer.

easiest fix is probably just routing the multi-byte case through readVarUint32Slow since that's already stream-aware after your changes. or adding fill(1) guards inside continueReadVarUint32 directly, either works.

Happy to discuss if I'm misreading the flow here
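
For reference, a byte-at-a-time varint decode that stays correct across chunk boundaries could look like the sketch below. It assumes a little-endian base-128 layout with 0x80 as the continuation bit; only the continuation bit is stated in this thread, so the byte order is an assumption. `readVarUint32` here is a hypothetical free function, not the method in buffer.go.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"testing/iotest"
)

// readVarUint32 decodes one varint, asking the reader for each byte so a
// value split across chunks is still decoded correctly.
func readVarUint32(r io.ByteReader) (uint32, error) {
	var result uint32
	for shift := uint(0); shift < 35; shift += 7 {
		b, err := r.ReadByte() // may trigger a refill from the stream
		if err != nil {
			return 0, err
		}
		result |= uint32(b&0x7F) << shift
		if b&0x80 == 0 {
			return result, nil
		}
	}
	return 0, errors.New("varuint32 is longer than 5 bytes")
}

// decodeChunked decodes 300 (bytes 0xAC 0x02) from a source that delivers
// one byte per Read, so the varint always straddles a chunk boundary.
func decodeChunked() (uint32, error) {
	src := iotest.OneByteReader(bytes.NewReader([]byte{0xAC, 0x02}))
	return readVarUint32(bufio.NewReader(src))
}

func main() {
	v, err := decodeChunked()
	fmt.Println(v, err)
}
```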

ayush00git · 2026-02-23T18:12:26Z

Hey @Zakir032002
Sorry, I'm a bit busy with my exams. I'll review the comments as soon as I'm free.

ayush00git · 2026-02-25T19:34:28Z

Hi @Zakir032002
Thanks for pointing out the flaws.

DeserializeFromReader and returning a direct slice into the buffer data were both implemented incorrectly by me; thanks for suggesting the changes to fix them as well.

But I think you misunderstood ReadVarUint32Small7. We already have a guard condition:

if len(b.data)-readIdx >= 5 {

}

If we are near a chunk boundary (fewer than 5 bytes remaining in the buffer), execution skips continueReadVarUint32 entirely and goes straight to readVarUint32Slow. I don't think this part needs any changes.

ayush00git · 2026-02-26T12:02:14Z

I've added StreamReader, which now copies the slice during deserialization to preserve the data between sequential deserialization calls. DeserializeFromReader remains for users who want to deserialize a single struct without the overhead of a stream reader.

chaokunyang · 2026-02-28T05:56:42Z

go/fory/buffer.go

+			b.data = b.data[:len(b.data)+readBytes]
+			b.writerIndex += readBytes
+		}
+		if err != nil {


fill currently folds reader errors into false, and callers then emit BufferOutOfBoundError. This masks non-EOF transport failures (for example connection reset) as bounds issues. Please preserve/propagate non-EOF read errors so stream deserialization reports the real I/O failure.

now it logs the exact error
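
One way to keep transport failures distinguishable from plain truncation is sketched below. It is an illustration only: `fillFrom`, `errOutOfBounds`, `errConnReset`, and `failingReader` are hypothetical names, and the PR's actual fill signature may differ.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// Hypothetical sentinel errors for the sketch.
var (
	errOutOfBounds = errors.New("buffer out of bound")
	errConnReset   = errors.New("connection reset")
)

// failingReader simulates a transport that fails on every read.
type failingReader struct{}

func (failingReader) Read(p []byte) (int, error) { return 0, errConnReset }

// fillFrom appends to buf until it holds n bytes. A clean EOF becomes a
// bounds error; any other read error is wrapped and returned as is.
func fillFrom(r io.Reader, buf []byte, n int) ([]byte, error) {
	chunk := make([]byte, 64)
	for len(buf) < n {
		m, err := r.Read(chunk)
		buf = append(buf, chunk[:m]...)
		if len(buf) >= n {
			break
		}
		if errors.Is(err, io.EOF) {
			return buf, errOutOfBounds
		}
		if err != nil {
			return buf, fmt.Errorf("stream read failed: %w", err)
		}
	}
	return buf, nil
}

// errorDemo reports how the two failure modes surface to the caller.
func errorDemo() (eofIsBounds, resetPreserved bool) {
	_, err := fillFrom(strings.NewReader("ab"), nil, 4)
	eofIsBounds = errors.Is(err, errOutOfBounds)
	_, err = fillFrom(failingReader{}, nil, 4)
	resetPreserved = errors.Is(err, errConnReset) && !errors.Is(err, errOutOfBounds)
	return eofIsBounds, resetPreserved
}

func main() {
	fmt.Println(errorDemo())
}
```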

chaokunyang · 2026-02-28T05:56:56Z

go/fory/buffer.go

 	return &ByteBuffer{data: data}
 }

+func NewByteBufferFromReader(r io.Reader, minCap int) *ByteBuffer {


This introduces stream mode, but ByteBuffer.Read(p) still only copies from in-memory b.data and never calls fill. Any decode path that uses Read can therefore observe partial/zero bytes with short-chunk readers. Please make stream-backed Read fetch until len(p) bytes are available (or return the underlying read error) to avoid silent metadata corruption.

Now it calls fill() up to the requested number of bytes and copies them as well, to prevent metadata corruption.
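
The short-read behavior in question is easy to observe with the standard library. The sketch below contrasts a single `Read` call with `io.ReadFull` on a reader that yields one byte at a time; `readOnce` and `readFull` are hypothetical helper names and do not refer to ByteBuffer methods.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// readOnce issues a single Read call and reports how many bytes arrived.
func readOnce(payload string, n int) int {
	p := make([]byte, n)
	m, _ := iotest.OneByteReader(strings.NewReader(payload)).Read(p)
	return m
}

// readFull keeps reading until n bytes arrive or the stream fails.
func readFull(payload string, n int) (int, error) {
	p := make([]byte, n)
	return io.ReadFull(iotest.OneByteReader(strings.NewReader(payload)), p)
}

func main() {
	fmt.Println(readOnce("abcdef", 4)) // a single Read may return fewer bytes
	fmt.Println(readFull("abcdef", 4)) // loops until all 4 bytes are read
	fmt.Println(readFull("ab", 4))     // short stream: error instead of silence
}
```

A stream-backed read that must fill len(p) bytes needs the looping behavior of the second helper, and should report an error when the stream ends early.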

ayush00git · 2026-03-02T10:14:54Z

Hi @chaokunyang
Please have a look and let me know if any changes are needed.

chaokunyang · 2026-03-03T07:00:12Z

Please take #3307 as a reference to finish the remaining work. Also create a Deserialize helper method in the tests and use it instead of fory.Deserialize. In that helper, first deserialize from bytes, then wrap the bytes in a OneByteStream and deserialize again, to ensure stream deserialization works.

Then run benchmarks/go and compare against asf/main to ensure your change doesn't introduce any performance regression.
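
The shape of such a test helper might look like the sketch below. It does not use Fory: the decoders are passed in as functions, a big-endian uint32 stands in for a serialized object, and `iotest.OneByteReader` from the standard library stands in for the OneByteStream mentioned above. All names are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"reflect"
	"testing/iotest"
)

type decodeBytes func(data []byte) (any, error)
type decodeStream func(r io.Reader) (any, error)

// checkBothPaths decodes from memory first, then from a one-byte-at-a-time
// stream, and fails if the two results differ.
func checkBothPaths(data []byte, fromBytes decodeBytes, fromStream decodeStream) error {
	want, err := fromBytes(data)
	if err != nil {
		return fmt.Errorf("bytes path: %w", err)
	}
	got, err := fromStream(iotest.OneByteReader(bytes.NewReader(data)))
	if err != nil {
		return fmt.Errorf("stream path: %w", err)
	}
	if !reflect.DeepEqual(want, got) {
		return fmt.Errorf("mismatch: bytes=%v stream=%v", want, got)
	}
	return nil
}

// Stand-in decoders: a big-endian uint32 plays the role of the object.
func fromBytes(data []byte) (any, error) { return binary.BigEndian.Uint32(data), nil }

func goodStream(r io.Reader) (any, error) {
	buf := make([]byte, 4)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return binary.BigEndian.Uint32(buf), nil
}

// buggyStream trusts a single Read call, so it sees only one byte.
func buggyStream(r io.Reader) (any, error) {
	buf := make([]byte, 4)
	_, _ = r.Read(buf)
	return binary.BigEndian.Uint32(buf), nil
}

// helperDemo reports whether the good decoder passes and the buggy one is caught.
func helperDemo() (goodPasses, buggyCaught bool) {
	data := []byte{0, 0, 1, 44} // 300
	goodPasses = checkBothPaths(data, fromBytes, goodStream) == nil
	buggyCaught = checkBothPaths(data, fromBytes, buggyStream) != nil
	return goodPasses, buggyCaught
}

func main() {
	fmt.Println(helperDemo())
}
```

Routing every existing test through a helper of this shape means each payload is exercised in both modes without duplicating test cases.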

added io.Reader to ByteBuffer for streaming deserialization

ba26d6a

ayush00git requested a review from chaokunyang as a code owner February 20, 2026 04:48

ayush00git changed the title ~~feat(go): add go desrialization support via transport streams~~ feat(go): add go desrialization support via io streams Feb 20, 2026

ayush00git added 5 commits February 20, 2026 14:01

added NewByteBufferFromReader and fill method

8ba4dba

added condition to check for read stream and ResetByteBuffer method

e4baf32

added stream deserializer and initialized buffer reader to 0

f751916

added stream test suites

88ecb2d

fix ci

a768ac1

Merge branch 'main' into feat/go-deserialization

4ffe66f

fix(docs): updated compiler guide

1742a5d

ayush00git requested review from PragmaTwice and theweipeng as code owners February 25, 2026 15:46

chaokunyang reviewed Feb 25, 2026

docs/compiler/compiler-guide.md (outdated)

ayush00git added 2 commits February 25, 2026 21:42

Merge branch 'main' into feat/go-deserialization

4887f5b

Merge branch 'main' into feat/go-deserialization

544645b

ayush00git added 6 commits February 26, 2026 16:52

Merge branch 'main' into feat/go-deserialization

1be86de

fix: create a copy of slice while desrlz to prevent overwriting

e205d5d

added StreamReader struct and NewStreamReader method which would hand…

4fdb8ec

…le stateful deserialization

added StreamReader struct and NewStreamReader method which would hand…

1a8c100

…le stateful deserialization

added tests for stream reader

92c207d

code lint checks

d90c55d

ayush00git requested a review from chaokunyang February 27, 2026 05:17

Merge branch 'main' into feat/go-deserialization

d276644

chaokunyang reviewed Feb 28, 2026

ayush00git added 2 commits February 28, 2026 13:10

fix: added correct error boundation

42c3bbb

read now fetch from the stream until p bytes eln

08c5cf1

ayush00git requested a review from chaokunyang February 28, 2026 09:14

ayush00git added 2 commits February 28, 2026 22:41

Merge branch 'main' into feat/go-deserialization

bb04c0f

Merge branch 'main' into feat/go-deserialization

2f9f0bd

ayush00git mentioned this pull request Mar 3, 2026

[BUG][GO]: //go:inline pragmas are no-ops — hot-path functions not being inlined #3446

Open

2 tasks

ayush00git added 2 commits March 3, 2026 12:01

Merge branch 'main' into feat/go-deserialization

120ddf2

trigger ci

0c6dc2b
