Version
trunk (master branch)
What's Wrong?
When running BE UT with CMAKE_BUILD_TYPE=LSAN on the ARM64 architecture, the test cases IntersectOperatorTest::test_sink_large_string_data_over_4g and ExceptOperatorTest::test_sink_large_string_data_over_4g crash the entire UT process with the following error:
==15521==ERROR: LeakSanitizer: requested allocation size 0x200000000 exceeds maximum supported size of 0x100000000
#0 ... in realloc
#1 ... in doris::DefaultMemoryAllocator::realloc(void*, unsigned long) allocator.h:100
#2 ... in doris::Allocator::realloc(...) allocator.cpp:411
#3 ... in doris::PODArrayBase::realloc(...) pod_array.h:191
#4 ... in doris::PODArrayBase::reserve(...) pod_array.h:261
#5 ... in doris::PODArrayBase::resize(...) pod_array.h:267
#6 ... in doris::ColumnStr::insert_range_from_ignore_overflow(...) column_string.cpp:113
#7 ... in doris::MutableBlock::merge_impl_ignore_overflow(...) block.h:586
#8 ... in doris::MutableBlock::merge_ignore_overflow(...) block.h:564
#9 ... in doris::SetSinkOperatorX<true>::sink(...) set_sink_operator.cpp:89
#10 ... in IntersectOperatorTest_test_sink_large_string_data_over_4g_Test::TestBody() set_operator_test.cpp:480
After this error, the UT process exits immediately, preventing any subsequent tests from running.
What You Expected?
The UT process should not crash. Tests that are incompatible with LSAN's ARM64 allocation limit should be gracefully skipped so that the rest of the UT suite can continue running.
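One way the skip could look is sketched below. The helper name `lsan_caps_allocations_at_4g` is hypothetical and not part of Doris, and `LEAK_SANITIZER` is only assumed to be the macro defined by the LSAN build type; the real macro name should be checked against the build configuration.

```cpp
// Hypothetical helper, not part of Doris: true only when the build is an LSAN
// build targeting AArch64, where LeakSanitizer caps a single allocation at 4GB.
// LEAK_SANITIZER is an assumed macro name for the LSAN build type.
constexpr bool lsan_caps_allocations_at_4g() {
#if defined(LEAK_SANITIZER) && defined(__aarch64__)
    return true;
#else
    return false;
#endif
}

// Inside the affected test bodies, a GoogleTest guard might look like this:
//
//   if (lsan_caps_allocations_at_4g()) {
//       GTEST_SKIP() << "allocation would exceed LSAN's 4GB limit on ARM64";
//   }
```

With a guard of this kind the two over-4G tests would be reported as skipped on LSAN/ARM64 and would still run everywhere else.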
How to Reproduce?
- Build BE UT on ARM64 with CMAKE_BUILD_TYPE=LSAN
- Run: ./run-be-ut.sh --run --filter=IntersectOperatorTest.test_sink_large_string_data_over_4g
- Observe the crash
Root Cause Analysis
The crash is caused by the interaction of two factors:
1. LSAN's ARM64 allocation limit
In LSAN's source code (compiler-rt/lib/lsan/lsan_allocator.cpp), the maximum allowed single allocation size is defined per architecture:
#if defined(__i386__) || defined(__arm__)
static const uptr kMaxAllowedMallocSize = 1ULL << 30; // 1GB
#elif defined(__mips64) || defined(__aarch64__)
static const uptr kMaxAllowedMallocSize = 4ULL << 30; // 4GB (ARM64)
#else
static const uptr kMaxAllowedMallocSize = 1ULL << 40; // 1TB (x86_64)
#endif
On ARM64 (__aarch64__), kMaxAllowedMallocSize = 0x100000000 (4GB). Any single allocation request exceeding this limit triggers a fatal error.
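To compare the limits on any host, the constants quoted above can be mirrored in a small standalone sketch. The `Arch` enum and both function names are illustrative only, and the strict `>` comparison is an assumption based on the "exceeds maximum supported size" wording of the error message.

```cpp
#include <cstdint>

// Illustrative grouping of the three preprocessor branches quoted above.
enum class Arch { I386OrArm32, Mips64OrAArch64, Other };

// Mirrors the per-architecture kMaxAllowedMallocSize values.
constexpr std::uint64_t max_allowed_malloc_size(Arch arch) {
    switch (arch) {
        case Arch::I386OrArm32:     return 1ULL << 30;  // 1GB
        case Arch::Mips64OrAArch64: return 4ULL << 30;  // 4GB
        default:                    return 1ULL << 40;  // 1TB
    }
}

// Assumed form of the check: only strictly larger requests are rejected.
constexpr bool exceeds_limit(std::uint64_t size, Arch arch) {
    return size > max_allowed_malloc_size(arch);
}
```

Under these assumptions the 0x200000000 request from the stack trace is rejected for `Arch::Mips64OrAArch64` but accepted for `Arch::Other`, which covers x86_64.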
2. PODArray's power-of-two rounding
PODArrayBase::reserve() (pod_array.h:259-262) rounds up the requested size to the next power of two via round_up_to_power_of_two_or_zero(). When the test accumulates ~4.1GB of string data in ColumnStr::chars, the resize request of ~4.1GB gets rounded up to 8GB (0x200000000), which exceeds LSAN's 4GB limit on ARM64.
The detailed calculation for the Intersect test (4200 rows × 1MB per row, batched at 500 rows):
| Batch | chars.size() | resize target | round_up_to_2^N | Realloc? |
|---|---|---|---|---|
| 5 | 1.95GB | 2.44GB | 0x100000000 (4GB) | Yes |
| 6-8 | 1.95~3.91GB | 2.44~3.91GB | 0x100000000 (4GB) | No (capacity sufficient) |
| 9 | 3.91GB | 4.10GB | 0x200000000 (8GB) | Yes → CRASH |
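The batch arithmetic in the table can be reproduced with a simplified model. This is a sketch, not the Doris implementation: `round_up_pow2` is a stand-in for round_up_to_power_of_two_or_zero(), `first_failing_batch` is a hypothetical name, and any padding bytes that PODArray adds to the requested size are ignored.

```cpp
#include <cstdint>

// Stand-in for round_up_to_power_of_two_or_zero(): smallest power of two >= n,
// or 0 when n is 0 or the result would not fit in 64 bits.
constexpr std::uint64_t round_up_pow2(std::uint64_t n) {
    if (n == 0 || n > (1ULL << 63)) return 0;
    --n;
    n |= n >> 1;  n |= n >> 2;  n |= n >> 4;
    n |= n >> 8;  n |= n >> 16; n |= n >> 32;
    return n + 1;
}

// Returns the 1-based batch whose reallocation request exceeds max_alloc,
// or 0 if every request fits. Padding bytes are ignored (simplification).
constexpr int first_failing_batch(std::uint64_t total_rows, std::uint64_t batch_rows,
                                  std::uint64_t row_bytes, std::uint64_t max_alloc) {
    std::uint64_t size = 0;      // bytes currently stored in chars
    std::uint64_t capacity = 0;  // bytes currently allocated
    int batch = 0;
    for (std::uint64_t done = 0; done < total_rows;) {
        ++batch;
        const std::uint64_t left = total_rows - done;
        const std::uint64_t rows = left < batch_rows ? left : batch_rows;
        const std::uint64_t target = size + rows * row_bytes;
        if (target > capacity) {  // growth needed: request is rounded up to 2^N
            const std::uint64_t request = round_up_pow2(target);
            if (request > max_alloc) return batch;
            capacity = request;
        }
        size = target;
        done += rows;
    }
    return 0;
}
```

For 4200 rows of 1MB in batches of 500, this model flags batch 9 under a 4GB limit (the 8GB request) and no batch under a 1TB limit, which agrees with the table.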
This issue does not occur on x86_64 because LSAN's limit there is 1TB.
Are you willing to submit PR?