JAVA-6071
This ticket involves two separate bugs in the test case `testBulkWriteHandlesWriteErrorsAcrossBatches` for the reactive driver, specifically when the `ordered` field is `false`.
### `IllegalStateException` on Mono timeout
The first part of the failed test is:
[2026/01/21 16:07:57.619] FAILURE: org.opentest4j.AssertionFailedError: Unexpected exception type thrown, expected: <com.mongodb.ClientBulkWriteException> but was: <java.lang.IllegalStateException> (org.opentest4j.AssertionFailedError)
Here is the link to the build
Here is the log that caught my attention:
[2026/01/21 16:03:37.770] 00:03:37.750 [cluster-ClusterId{value='697169596f43721a0d1b8470', description='null'}-localhost:27017] INFO org.mongodb.driver.cluster - Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, cryptd=false, state=CONNECTED, ok=true, minWireVersion=0, maxWireVersion=25, maxDocumentSize=16777216, logicalSessionTimeoutMinutes=30, roundTripTimeNanos=3744986, minRoundTripTimeNanos=0}
[2026/01/21 16:05:39.870] 6. MongoClient.bulkWrite handles individual WriteErrors across batches
As you can see, the time difference between the last two logs is exactly two minutes, and according to our settings, the sync driver blocks the Mono for two minutes.

So the timeout happened and `Mono.block` threw an `IllegalStateException`.

This PR does not solve this issue; the `IllegalStateException` is expected, unless we want to override the timeout for this test case only, as it deals with a huge number of documents per batch.
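For readers unfamiliar with Reactor's blocking semantics, here is a minimal stand-alone sketch of the failure mode: `Mono.block(Duration)` surfaces a timeout as an `IllegalStateException` rather than a checked `TimeoutException`. The sketch below mimics that behavior with a plain `CompletableFuture` (the helper name and the exact message are illustrative, not the driver's code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockTimeoutDemo {
    // Mimics Mono.block(Duration): a blocking read that converts a timeout
    // into an IllegalStateException instead of a checked TimeoutException.
    static <T> T blockWithTimeout(CompletableFuture<T> publisher, long timeoutMs) {
        try {
            return publisher.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Reactor's Mono.block(Duration) throws IllegalStateException
            // ("Timeout on blocking read ...") in this situation.
            throw new IllegalStateException("Timeout on blocking read for " + timeoutMs + " ms");
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // A publisher that never completes, standing in for the slow bulk write.
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        try {
            blockWithTimeout(neverCompletes, 100);
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

This is why the test sees `IllegalStateException` instead of the expected `ClientBulkWriteException`: the blocking adapter's timeout fires before the bulk write can fail with its own error.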
### Connection leak
The second assertion that failed on this test case is the connection leak:
[2026/01/21 16:07:57.619] The connection pool listener reports '1' open connections.
[2026/01/21 16:07:57.619] at com.mongodb.assertions.Assertions.assertTrue(Assertions.java:190)
[2026/01/21 16:07:57.619] at com.mongodb.reactivestreams.client.syncadapter.SyncMongoClient$ConnectionPoolCounter.assertConnectionsClosed(SyncMongoClient.java:373)
[2026/01/21 16:07:57.619] at com.mongodb.reactivestreams.client.syncadapter.SyncMongoClient.close(SyncMongoClient.java:305)
[2026/01/21 16:07:57.619] at com.mongodb.client.CrudProseTest.testBulkWriteHandlesWriteErrorsAcrossBatches(CrudProseTest.java:246)
[2026/01/21 16:07:57.619] ... 39 more
Before looking at the leak, I checked what the test case is actually doing and noticed that it tries to insert documents in batches of 10K invalid (duplicate ids) docs per batch.
The assertion for the connection leak happens here.
The code runs a loop that fails after 2 seconds if not all connections are closed. In my local environment I noticed that 2 seconds is not enough for such a huge number of documents to be processed, so I introduced a new method that can propagate and override this timeout. After running this test case 100 times locally, I no longer saw the connection leak.
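The shape of that loop, with the timeout lifted into a parameter, can be sketched as follows. This is a simplified stand-in (the counter, method name, and backoff interval are illustrative, not the actual `SyncMongoClient` internals):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionCloseWait {
    // Sketch of the assertion loop: poll the open-connection counter and fail
    // only after a caller-supplied timeout, instead of a hard-coded 2 seconds.
    static void assertConnectionsClosed(AtomicInteger openConnections, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (openConnections.get() > 0) {
            if (System.nanoTime() > deadline) {
                throw new AssertionError("The connection pool listener reports '"
                        + openConnections.get() + "' open connections.");
            }
            Thread.sleep(10); // back off before re-checking the pool listener
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger open = new AtomicInteger(1);
        // Simulate an asynchronous close that only finishes after 200 ms,
        // i.e. slower than an aggressive timeout would tolerate.
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) { }
            open.decrementAndGet();
        }).start();
        assertConnectionsClosed(open, 2_000); // passes with the longer timeout
        System.out.println("all connections closed");
    }
}
```

With a timeout shorter than the simulated close (say 100 ms), the same loop would throw the `AssertionError` seen in the CI failure, which is why making the timeout overridable per test helps here.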
To test it locally:

- change the Mono timeout from 2 minutes to 2 seconds so that you can simulate the `IllegalStateException`
- add `@RepeatedTest` to this test case; in my case, I hardcoded `ordered=false` instead of relying on the parameterized test so that I could use `@RepeatedTest`
### NullPointerException
The last part of this failed test is a `NullPointerException` here:
[2026/01/21 16:07:57.619] Exception in thread "Thread-29" java.lang.NullPointerException: Cannot invoke "java.nio.ByteBuffer.hasRemaining()" because "this.buf" is null
[2026/01/21 16:07:57.619] at org.bson.ByteBufNIO.hasRemaining(ByteBufNIO.java:91)
The exception makes sense; `hasRemaining()` throws an NPE because the underlying buffer becomes null once released. While I couldn't reproduce this after 100 local test runs, I added a guard for `ByteBufNIO`. By using `asNIO()` first (which acts as a direct getter for `ByteBufNIO` but avoids the buffer copying found in Netty and other implementations), we can safely check whether the underlying buffer is null before calling `hasRemaining`; for other implementations we still rely on `hasRemaining`.

I don't like this approach because of the abstraction leak, and I'm open to any suggestions.
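To make the guard concrete, here is a minimal self-contained sketch of the idea. `SimpleByteBuf` is a toy stand-in for `ByteBufNIO` (the class and the release mechanics are illustrative): the wrapped `ByteBuffer` is nulled on release, so calling `hasRemaining()` afterwards NPEs, while consulting the direct getter first does not:

```java
import java.nio.ByteBuffer;

public class SafeHasRemainingDemo {
    // Toy stand-in for ByteBufNIO: the wrapped ByteBuffer is set to null
    // when the buffer is released (names and mechanics are illustrative).
    static final class SimpleByteBuf {
        private ByteBuffer buf;

        SimpleByteBuf(ByteBuffer buf) { this.buf = buf; }

        ByteBuffer asNIO() { return buf; }            // direct getter, no copy

        boolean hasRemaining() { return buf.hasRemaining(); } // NPEs after release

        void release() { buf = null; }                // models the release path
    }

    // Guarded check: look at asNIO() first so a released buffer reads as
    // "nothing remaining" instead of throwing NullPointerException.
    static boolean safeHasRemaining(SimpleByteBuf byteBuf) {
        ByteBuffer nio = byteBuf.asNIO();
        return nio != null && nio.hasRemaining();
    }

    public static void main(String[] args) {
        SimpleByteBuf buf = new SimpleByteBuf(ByteBuffer.allocate(4));
        System.out.println(safeHasRemaining(buf)); // true: 4 bytes remaining
        buf.release();
        System.out.println(safeHasRemaining(buf)); // false: guarded, no NPE
    }
}
```

As noted above, this only masks the symptom; if the buffer is being released while still in use, the real fix is in the ownership/lifecycle of the buffer, not in the null check.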
I think this is symptomatic of the higher resource being released too early issue, so as much as this prevents the NPE it doesn't solve the root cause.
I think this should be addressed by #1873, #1874 & #1876 and a future PR for Session & connection monitor that will be built upon those changes.
hey @rozza, after looking at your listed fixes I didn't understand whether they will fix the NPE in `ByteBufNIO`, but we can give it a try. I can remove my NPE fix but still keep the longer timeout for the connection pool to close; let me know if that works for you
@strogiyotec apologies, my changes will help with the byte buffer management from `InternalStreamConnection`, which uses the `Stream`. However, you're correct that it is not obvious whether that will help with these internals of `AsynchronousChannelStream` and `pipeOneBuffer`.
I'd like to understand more about the failure scenario:
- Why does `byteBuffer.hasRemaining()` NPE?
- Why won't calling `byteBuffer.asNIO()` also NPE?
Looks like when a `ByteBufNIO` is fully released, the underlying `buf` is set to null, which then goes on to cause the NPE. What I'm not sure of is the source of the release.
Is there any way to replicate the error locally? I have two questions about the root cause:
- Is it a resource counting issue that will be fixed by the `InternalStream` and `ByteBuf` improvement PRs?
- Is it a race condition, where the `AsynchronousChannelStream` is closed while a `pipeOneBuffer` is in flight? I have a feeling we'd have seen it more often if that was the case. Should there be a check for `isClosed()` before checking the buffer?