[Bug] Spill GC crashes due to missing pointer dereference in LocalFileSystem::list_impl #60904

@xuchenhao

Description

Search before asking

  • I had searched in the issues and found no similar issues.

Version

master

What's Wrong?

http://43.132.222.7:8111/buildConfiguration/Doris_DorisRegression_P0Regression/896382?expandBuildDeploymentsSection=false&hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandPull+Request+Details=true&expandBuildProblemsSection=true&expandBuildTestsSection=true&expandBuildChangesSection=true
During P0 regression testing (build linked above), the BE process crashed and AddressSanitizer reported a CHECK failure. The stack trace shows the crash originating in LocalFileSystem::list_impl (local_file_system.cpp:248), reached from the spill GC thread through SpillStreamManager::gc.

AddressSanitizer: CHECK failed: sanitizer_posix_libcdep.cpp:319 "((14)) == ((write_errno))" (0xe, 0x20) (tid=41007)
    #0 0x55d2766388e1 in __asan::CheckUnwind() (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297bd8e1)
    #1 0x55d276653182 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297d8182)
    #2 0x55d2766556cf in __sanitizer::IsAccessibleMemoryRange(unsigned long, unsigned long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297da6cf)
    #3 0x55d27667237a in __ubsan::checkDynamicType(void*, void*, unsigned long) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f737a)
    #4 0x55d276671712 in HandleDynamicTypeCacheMiss(__ubsan::DynamicTypeCacheMissData*, unsigned long, unsigned long, __ubsan::ReportOptions) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f6712)
    #5 0x55d2766716e3 in __ubsan_handle_dynamic_type_cache_miss (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297f66e3)
    #6 0x55d276bcc7a7 in doris::io::LocalFileSystem::list_impl(std::filesystem::__cxx11::path const&, bool, std::vector<doris::io::FileInfo, std::allocator<doris::io::FileInfo> >*, bool*) /root/doris/be/build_ASAN/../src/io/fs/local_file_system.cpp:248:32
    #7 0x55d2768c62b0 in doris::io::FileSystem::list(std::filesystem::__cxx11::path const&, bool, std::vector<doris::io::FileInfo, std::allocator<doris::io::FileInfo> >*, bool*) /root/doris/be/build_ASAN/../src/io/fs/file_system.cpp:84:5
    #8 0x55d2933cd95e in doris::vectorized::SpillStreamManager::gc(int) /root/doris/be/build_ASAN/../src/vec/spill/spill_stream_manager.cpp:233:49
    #9 0x55d2933ccce1 in doris::vectorized::SpillStreamManager::_spill_gc_thread_callback() /root/doris/be/build_ASAN/../src/vec/spill/spill_stream_manager.cpp:110:9
    #10 0x55d27c171956 in std::function<void ()>::operator()() const /usr/local/ldb-toolchain-v0.26/bin/../lib/gcc/x86_64-pc-linux-gnu/15/include/g++-v15/bits/std_function.h:593:9
    #11 0x55d27c171956 in doris::Thread::supervise_thread(void*) /root/doris/be/build_ASAN/../src/util/thread.cpp:460:5
    #12 0x55d276628d26 in asan_thread_start(void*) (/mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be+0x297add26)
    #13 0x7f460580a608 in start_thread /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:477:8
    #14 0x7f460571d132 in __clone /build/glibc-SzIz7B/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The bug is in the following code segment:

Status LocalFileSystem::list_impl(const Path& dir, bool only_file, std::vector<FileInfo>* files,
                                  bool* exists) {
    RETURN_IF_ERROR(exists_impl(dir, exists));
    if (!exists) {  // BUG: This checks if the pointer is null, not the boolean value it points to.
        return Status::OK();
    }
    // ... rest of the function
}

What You Expected?

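The condition should dereference the pointer and test the boolean it points to. Whenever the caller passes a non-null exists pointer, !exists is always false, so the early return is never taken, even when exists_impl has just reported the directory as missing.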

Status LocalFileSystem::list_impl(const Path& dir, bool only_file, std::vector<FileInfo>* files,
                                  bool* exists) {
    RETURN_IF_ERROR(exists_impl(dir, exists));
    if (!*(exists)) {  // CORRECT: Check the value pointed to by 'exists'.
        return Status::OK();
    }
    // ... rest of the function
}
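
To make the difference between the two conditions concrete, here is a minimal standalone sketch. It is not the Doris code: fake_exists_impl, list_buggy, list_fixed and the /spill/... paths are hypothetical stand-ins, and the "listing" is reduced to pushing one entry so the effect of the early return is observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for exists_impl(): writes the answer through the
// out-parameter. Only "/spill/present" is treated as existing.
inline void fake_exists_impl(const std::string& dir, bool* exists) {
    *exists = (dir == "/spill/present");
}

// Mirrors the reported bug: the condition tests the pointer, not the bool.
// Returns true if the listing step was reached.
inline bool list_buggy(const std::string& dir, std::vector<std::string>* files,
                       bool* exists) {
    fake_exists_impl(dir, exists);
    if (!exists) {  // pointer is non-null here, so this branch never fires
        return false;
    }
    files->push_back(dir + "/entry");  // stands in for iterating the directory
    return true;
}

// Mirrors the proposed fix: dereference and test the value.
inline bool list_fixed(const std::string& dir, std::vector<std::string>* files,
                       bool* exists) {
    fake_exists_impl(dir, exists);
    if (!*exists) {  // early return when the directory is missing
        return false;
    }
    files->push_back(dir + "/entry");
    return true;
}

// Small self-check showing the behavioral difference on a missing directory.
inline void run_example() {
    std::vector<std::string> files;
    bool exists = true;

    // Buggy version falls through to the listing step.
    assert(list_buggy("/spill/missing", &files, &exists));
    assert(!exists && files.size() == 1);

    files.clear();
    // Fixed version returns before touching the directory.
    assert(!list_fixed("/spill/missing", &files, &exists));
    assert(!exists && files.empty());
}
```

In this sketch both versions behave the same for a directory that exists; they differ only when the directory is missing, which would match a GC thread racing with the removal of a spill directory (an assumption, since no reproduction steps are given).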

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
