Describe the enhancement requested
The memray memory profiler works by interposing certain dynamic symbols in the profiled process, replacing them with its own functions that collect memory allocation data. To the best of my knowledge, it currently only recognizes system C calls such as malloc, mmap...
When a third-party allocator like mimalloc or jemalloc is used, as Arrow does by default, memray does not see the logical allocation calls made through these allocators' APIs (because they are not interposed), but only the raw memory reservations that they issue using system routines.
This can lead people using memray to think that a given Arrow workload (or any workload using such allocators, really) uses an inordinate amount of memory, while the reported memory mostly represents non-committed virtual memory that the allocator keeps for performance reasons. Concrete example in GH-40301: we allocate a number of 1 KiB buffers from mimalloc, but memray sees a similar number of 64 MiB calls to mmap.
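To illustrate the gap between reserved and actually used memory, here is a minimal Linux-only sketch. It is not taken from Arrow, mimalloc or memray: the helper name `ReserveAndTouch` and the struct `ReservationStats` are hypothetical, and the sizes simply mirror the 64 MiB / 1 KiB figures above. It reserves an anonymous mapping, writes to a small prefix, and asks the kernel via `mincore` how much of the mapping is resident. The exact resident figure depends on the kernel, page size and transparent huge page settings, so no specific output is claimed.

```cpp
#include <sys/mman.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper types and names, for illustration only.
struct ReservationStats {
  std::size_t reserved_bytes;  // size of the mapping requested from the kernel
  std::size_t resident_bytes;  // bytes reported resident by mincore()
};

// Reserve `reserve_bytes` of anonymous memory, write to the first
// `touch_bytes`, and report how much of the mapping is resident.
ReservationStats ReserveAndTouch(std::size_t reserve_bytes,
                                 std::size_t touch_bytes) {
  assert(touch_bytes <= reserve_bytes);
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

  void* addr = mmap(nullptr, reserve_bytes, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (addr == MAP_FAILED) return {0, 0};

  // Only the touched prefix needs to be backed by physical pages.
  std::memset(addr, 1, touch_bytes);

  // mincore() fills one byte per page; the low bit means "resident".
  const std::size_t num_pages = (reserve_bytes + page - 1) / page;
  std::vector<unsigned char> residency(num_pages);
  std::size_t resident_pages = 0;
  if (mincore(addr, reserve_bytes, residency.data()) == 0) {
    for (unsigned char flags : residency) resident_pages += (flags & 1u);
  }

  munmap(addr, reserve_bytes);
  return {reserve_bytes, resident_pages * page};
}
```

Calling `ReserveAndTouch(64u * 1024 * 1024, 1024)` would typically report far fewer resident bytes than reserved bytes, which is the kind of discrepancy a profiler that only sees the `mmap` call cannot distinguish.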
We discussed how to enhance memray so as to account for the corresponding logical allocations, and came to the conclusion that it requires Arrow to expose API calls that can be dynamically interposed. Since we typically build against a static libmimalloc.a, the mimalloc symbols cannot be exposed (at least, I cannot seem to get this to work on Ubuntu). This means we need to define our own symbols wrapping the mimalloc APIs.
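A rough sketch of what such wrapper symbols could look like follows. Everything here is an assumption made for illustration: the symbol names `example_arrow_alloc` and `example_arrow_free` are hypothetical, not an agreed-upon Arrow or memray interface, and `std::malloc` / `std::free` stand in for the statically linked mimalloc calls so that the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// ASSUMPTION: stand-ins for the statically linked allocator. In a real build
// these would forward to the mimalloc API instead of the C library.
namespace backend {
inline void* Allocate(std::size_t size) { return std::malloc(size); }
inline void Free(void* ptr) { std::free(ptr); }
}  // namespace backend

#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_EXPORT __attribute__((visibility("default")))
#else
#define EXAMPLE_EXPORT
#endif

extern "C" {

// Hypothetical exported wrappers: unmangled names with default visibility,
// so that they appear in the shared library's dynamic symbol table.
EXAMPLE_EXPORT void* example_arrow_alloc(std::size_t size) {
  return backend::Allocate(size);
}

EXAMPLE_EXPORT void example_arrow_free(void* ptr) { backend::Free(ptr); }

}  // extern "C"

// Small round trip through the wrappers: allocate, write, read back, free.
bool ExampleRoundTrip(std::size_t size) {
  assert(size > 0);
  void* ptr = example_arrow_alloc(size);
  if (ptr == nullptr) return false;
  std::memset(ptr, 0xAB, size);
  const bool ok = static_cast<unsigned char*>(ptr)[size - 1] == 0xAB;
  example_arrow_free(ptr);
  return ok;
}
```

For a profiler to intercept these, the library's internal allocation paths would presumably have to call the exported wrappers through the dynamic symbol table rather than having them inlined or bound locally at link time.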
Component(s)
C++