Skip to content

Make NettyByteBuf share parent reference count.#1891

Open
vbabanin wants to merge 5 commits intomongodb:mainfrom
vbabanin:leak-fix
Open

Make NettyByteBuf share parent reference count.#1891
vbabanin wants to merge 5 commits intomongodb:mainfrom
vbabanin:leak-fix

Conversation

@vbabanin
Copy link
Member

@vbabanin vbabanin commented Feb 18, 2026

NettyButeBuf currently uses retainedDuplicate() when duplicating Netty buffers. While retainedDuplicate() is documented as similar to duplicate().retain(), the two can differ under pooled Netty buffers, which can surface as IllegalReferenceCountException in the command logging path.

What’s happening (lifetime difference in this flow)

duplicate().retain() (shared refCnt):
duplicate() returns a view that shares the parent’s refCnt; retain() increments that shared counter. Behavior in the relevant block:

1. bsonOutput holds NettyByteBuf_A wrapping PARENT
PARENT refCnt = 1

2. getByteBuffers() -> duplicate().retain()
PARENT refCnt = 2 (same counter shared by both views)

3. getCommandDocument() finally block inside releases the duplicate
PARENT refCnt = 1 (original still held by bsonOutput)

4. Logging reads from the buffer
PARENT refCnt = 1 OK

5. bsonOutput.close() releases the original
PARENT refCnt = 0 freed
Detailed diagnostic
InternalStreamConnection.sendAndReceiveInternal()                                                                                                                                                            
 │                                                                                                                                                                                                            
 │  try (ByteBufferBsonOutput bsonOutput = ...) {        // bsonOutput owns PARENT buffers                                                                                                                    
 │  │                                                                                                                                                                                                         
 │  │   bsonOutput.bufferList = [PARENT]                  PARENT refCnt = 1                                                                                                                                   
 │  │
 │  │   message.encode(bsonOutput, ...);                  // writes command into PARENT
 │  │
 │  │   commandDocument = message.getCommandDocument(bsonOutput);
 │  │   │
 │  │   │  byteBuffers = bsonOutput.getByteBuffers()
 │  │   │  │
 │  │   │  │  cur.duplicate()                             // NettyByteBuf.duplicate()
 │  │   │  │  │  proxied.duplicate().retain()             PARENT refCnt = 1 -> 2
 │  │   │  │  │  returns DUP (shared counter with PARENT)
 │  │   │  │
 │  │   │  byteBuffers = [DUP]                            PARENT refCnt = 2
 │  │   │
 │  │   │  CompositeByteBuf byteBuf = new CompositeByteBuf(byteBuffers)
 │  │   │  │  DUP.asReadOnly() -> returns DUP itself (return this)
 │  │   │  │  component stores DUP                        PARENT refCnt = 2
 │  │   │
 │  │   │  byteBufBsonDocument = createOne(byteBuf)       // doc wraps byteBuf's components = DUP
 │  │   │  return byteBufBsonDocument
 │  │   │
 │  │   │  finally:
 │  │   │    byteBuf.release()                            // CompositeByteBuf own AtomicInteger only
 │  │   │    byteBuffers.forEach(ByteBuf::release)
 │  │   │    │  DUP.release()                             PARENT refCnt = 2 -> 1  - alive
 │  │   │
 │  │   commandDocument is backed by DUP                  PARENT refCnt = 1
 │  │
 │  │   new LoggingCommandEventSender(..., commandDocument, ...)
 │  │   │  commandDocument.getFirstKey()
 │  │   │  │  reads from DUP                              PARENT refCnt = 1  - works
 │  │
 │  }  // bsonOutput.close()
 │     │  for (cur : bufferList) cur.release()
 │     │    PARENT.release()                              PARENT refCnt = 1 -> 0  freed

retainedDuplicate() (may be an independent wrapper lifecycle):
With pooled buffers, retainedDuplicate() can return a derived buffer object (P) with its own lifecycle. Releasing P makes P unusable even if the underlying memory is still retained by the parent. Behavior with retainedDuplicate():

1. bsonOutput holds NettyByteBuf_A wrapping PARENT
PARENT refCnt = 1

2. getByteBuffers() -> retainedDuplicate() => returns P
P refCnt = 1 (P has its own refCnt)
PARENT refCnt = 2 (underlying memory retained)

3. getCommandDocument() finally releases P
P refCnt = 0 DEAD
PARENT refCnt = 1 (still alive)

4. Logging tries to read from P
IllegalReferenceCountException (P is already released)

5. bsonOutput.close() not reached

Why the behavior depends on pooling (method name is not the contract)

Underlying buffer duplicate().retain() retainedDuplicate()
Unpooled UnpooledDuplicatedByteBuf (shared ref count) Calls duplicate().retain() (shared ref count) - same thing
Pooled UnpooledDuplicatedByteBuf (shared ref count) PooledDuplicatedByteBuf (independent ref count)

Pooled buffers use recycled derived objects (“less garbage”), and recycled objects need their own ref counter so they can be returned to the recycler when their counter hits 0. Under PooledByteBufAllocator, retainedDuplicate() can therefore change the ref-counting model.

If the allocator is Unpooled, retainedDuplicate() is identical to duplicate().retain() and this issue does not surface.

Change in this PR

Revert this specific duplication call back to duplicate().retain() to restore the shared-refCnt lifetime behavior, while keeping the rest of the close/release changes intact.

JAVA-6107

@rozza
Copy link
Member

rozza commented Feb 19, 2026

LGTM

Copy link
Member

@rozza rozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended a minor tweak - but its optional as the new code is essentially the same.

Recommend reenabling the test in CommandHelperSpecification.groovy which was the source of the original ticket.

Other than that LGTM

vbabanin and others added 2 commits February 19, 2026 13:05
…tyStream.java

Co-authored-by: Ross Lawley <ross.lawley@gmail.com>
@vbabanin vbabanin marked this pull request as ready for review February 19, 2026 21:09
@vbabanin vbabanin requested a review from a team as a code owner February 19, 2026 21:09
@vbabanin vbabanin requested a review from rozza February 19, 2026 21:09
@stIncMale
Copy link
Member

Duplication from Slack (I should have really posted that comment here, to avoid trying to find it later in Slack if it is needed):

Let's create a ticket for this PR.

There we may say that when fixing https://jira.mongodb.org/browse/JAVA-5901, we replaced a pair of invocations io.netty.buffer.ByteBuf.retain&readSlice with a single invocation of io.netty.buffer.ByteBuf.readRetainedSlice, believing that they are semantically identical based on the API documentation https://javadoc.io/doc/io.netty/netty-buffer/4.2.9.Final/io/netty/buffer/ByteBuf.html#readRetainedSlice(int). The documentation turned out to be incorrect, and this new ticket reverts the code back to using io.netty.buffer.ByteBuf.retain&readSlice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments