Crash on Large Java Project: Memory Corruption Causes Database Corruption #189

@liujinyonggiyt

Summary

codebase-memory-mcp crashes with SIGTRAP (pointer authentication failure) when indexing a large Java project (~1.42 million lines, 7,574 files). The crash occurs during the dump phase and leaves a corrupted database file (missing SQLite header).

Environment

  • codebase-memory-mcp version: 0.5.7
  • OS: macOS 26.3.2 (Tahoe), Build 25D2140
  • Architecture: ARM64 (Apple Silicon - Mac17,6)
  • Project Language: Java
  • Project Size: 7,574 files, ~1,420,000 lines of code
  • Resulting DB Size: ~51 MB (corrupted)

Crash Details

Crash Report Location

  ~/Library/Logs/DiagnosticReports/codebase-memory-mcp-2026-04-01-165325.ips

Crash Signature

  Exception Type:  EXC_BREAKPOINT (SIGTRAP)
  Exception Codes: 0x0000000000000001, 0x000000018fdabcc4
  ESR Description: (Breakpoint) pointer authentication trap IB
  Crashed Thread:  0

  Termination: Trace/BPT trap: 5
  Symbol: xzm_malloc_zone_try_free_default (libsystem_malloc.dylib)

Process Timeline

  Launch Time:     2026-04-01 16:53:22
  Crash Time:      2026-04-01 16:53:25
  Running Duration: ~3 seconds

Memory State at Crash

  MALLOC regions:  1.4G virtual, 320 regions
  Writable regions: 1.4G total

Full Crash Report JSON

```json
{
  "app_name": "codebase-memory-mcp",
  "timestamp": "2026-04-01 16:53:25.00 +0800",
  "pid": 60495,
  "procLaunch": "2026-04-01 16:53:22.3297 +0800",
  "procExitAbsTime": 3456298509386,
  "parentProc": "opencode",
  "parentPid": 56396,
  "exception": {
    "codes": "0x0000000000000001, 0x000000018fdabcc4",
    "type": "EXC_BREAKPOINT",
    "signal": "SIGTRAP"
  },
  "termination": {
    "flags": 0,
    "code": 5,
    "namespace": "SIGNAL",
    "indicator": "Trace/BPT trap: 5"
  },
  "faultingThread": 0,
  "threads": [{
    "triggered": true,
    "threadState": {
      "esr": {
        "value": 4060136561,
        "description": "(Breakpoint) pointer authentication trap IB"
      }
    }
  }],
  "usedImages": [{
    "source": "P",
    "arch": "arm64",
    "base": 4341891072,
    "size": 134496256,
    "path": "/Users/happyelements/.local/bin/codebase-memory-mcp"
  }, {
    "source": "P",
    "arch": "arm64e",
    "base": 6708269056,
    "size": 310376,
    "path": "/usr/lib/system/libsystem_malloc.dylib"
  }]
}
```

Database Corruption Analysis

After the crash, the database file was left in a corrupted state:

File Inspection

  $ file Users-happyelements-work-animal-animal-server-20221102-animal-service.db
  data  # Not recognized as SQLite database!

  $ hexdump -C *.db | head -10
  00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  *
  00010000  0d 00 00 00 d1 02 98 00  ff 16 fe 49 fd 33 fc 72  |...........I.3.r|
  # First 64KB is all zeros, data starts at offset 0x10000 (page 2)
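
The same check can be done programmatically. The sketch below is not part of codebase-memory-mcp; `has_sqlite_header` is a hypothetical helper name. It relies only on the documented fact that a valid SQLite file begins with the 16-byte string "SQLite format 3" followed by a NUL byte.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the project source.
 * Returns 1 if the file starts with the 16-byte SQLite magic string,
 * 0 if it does not, and -1 if the file cannot be opened. */
static int has_sqlite_header(const char *path) {
    /* 15 visible characters plus the terminating NUL fill all 16 bytes. */
    static const char magic[16] = "SQLite format 3";
    unsigned char buf[16];

    FILE *fp = fopen(path, "rb");
    if (!fp) return -1;

    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);

    /* A short read (file under 16 bytes) also counts as "no header". */
    return (n == sizeof buf && memcmp(buf, magic, sizeof magic) == 0) ? 1 : 0;
}
```

For the file shown above, where the first 64 KB are zeros, this function would return 0.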

Root Cause of Corruption

The cbm_write_db() function in sqlite_writer.c writes data in this order:

  1. Phase 1: Write node/edge data tables to pages 2+
  2. Phase 2: Write metadata tables to subsequent pages
  3. Phase 3: Build indexes (CPU intensive)
  4. Last: Write page 1 (sqlite_master + 100-byte SQLite file header)

  // sqlite_writer.c:1500+
  int cbm_write_db(const char *path, ...) {
      // Phase 1: Data tables first
      write_data_tables(&w, &nodes_root, &edges_root);

      // Phase 2: Metadata tables
      write_metadata_tables(&w, ...);

      // Phase 3: Build indexes (CPU intensive)
      // ... parallel sort, build index pages ...

      // LAST: Write page 1 with SQLite header
      // Line 1699+: "For sqlite_master, we need to write page 1 as the root"
      memcpy(page1, "SQLite format 3\000", 16);
      // ... write page 1 ...
  }

When the process crashes before the final fwrite(page1) call, the database has:

  • ✅ All node/edge data written (pages 2+)
  • ❌ No SQLite header (page 1 is zeros)
  • ❌ No sqlite_master table (schema missing)

This makes the file unrecognizable as a SQLite database.

Crash Pattern

The crash has occurred 11 times today on the same project:

┌──────────┬──────────┬────────┐
│ Time │ Duration │ Result │
├──────────┼──────────┼────────┤
│ 14:30:49 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:34:33 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:35:56 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:37:17 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:44:10 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:52:24 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:54:43 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:55:38 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:58:23 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 14:59:44 │ ~few sec │ Crash │
├──────────┼──────────┼────────┤
│ 16:53:25 │ ~3 sec │ Crash │
└──────────┴──────────┴────────┘

All crashes show identical signature: pointer authentication trap IB in libsystem_malloc.dylib.

Analysis

Primary Issue: Memory Corruption

A SIGTRAP raised by a pointer authentication trap inside xzm_malloc_zone_try_free_default on ARM64 most likely indicates:

  • Heap corruption (buffer overflow, use-after-free, or double-free)
  • The memory allocator detected the corruption while freeing a block and trapped
  • A memory safety bug in the codebase, not a timeout or resource exhaustion

Contributing Factor: Large Project Size

The crash only occurs with this large project:

  • 7,574 Java files
  • ~1.4M lines of code
  • Results in 51MB database with many nodes/edges

Smaller projects (e.g., changeset-server.db at 5.3MB) index successfully.

Secondary Issue: Non-atomic Database Write

Even if the crash is fixed, the write order in cbm_write_db() is fragile:

  • Data is written before the header
  • A crash at any point before page 1 is written leaves a corrupted file at the final path
  • It should use an atomic write pattern (write to a temp file, then rename)

Recommended Fixes

Fix 1: Investigate Memory Bug

The crash occurs in libsystem_malloc.dylib during memory operations. Likely areas:

  • pb_add_table_cell_with_flush() in page building
  • Record building (build_node_record, build_edge_record)
  • Large buffer allocations for sorting
  • Possible integer overflow in size calculations
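
To illustrate the last bullet only: if a buffer size is computed as count * element size and the product wraps, malloc returns a block that is too small and later writes overflow it. The sketch below shows one common guard. The helper names `checked_mul_size` and `alloc_array_checked` are hypothetical and do not come from the project; whether the actual bug is of this kind is not established by the crash report.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: computes count * elem_size.
 * Returns 1 and stores the product in *out on success.
 * Returns 0 and leaves *out untouched if the product would overflow size_t. */
static int checked_mul_size(size_t count, size_t elem_size, size_t *out) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size) return 0;
    *out = count * elem_size;
    return 1;
}

/* Hypothetical helper: allocates an array of count elements,
 * or returns NULL if the byte count would wrap around. */
static void *alloc_array_checked(size_t count, size_t elem_size) {
    size_t bytes;
    if (!checked_mul_size(count, elem_size, &bytes)) return NULL;
    return malloc(bytes);
}
```

Building the binary with AddressSanitizer (`-fsanitize=address` in Clang or GCC) and re-running the index on the failing project would be a more direct way to locate the faulty write or free.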

Fix 2: Atomic Database Writes

Modify cbm_write_db() to use atomic write pattern:

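  int cbm_write_db(const char *path, ...) {
      // Write to a temporary file next to the target so rename() stays on one filesystem
      char tmp_path[1024];
      snprintf(tmp_path, sizeof(tmp_path), "%s.tmp.%d", path, (int)getpid());

      FILE *fp = fopen(tmp_path, "wb");
      if (!fp) return -1;  // error code is illustrative
      // ... all writes to tmp_path ...

      // Ensure all data is flushed before the file becomes visible
      if (fflush(fp) != 0) {
          fclose(fp);
          remove(tmp_path);
          return -1;
      }
      fclose(fp);

      // Atomic rename to final path; clean up the temp file on failure
      if (rename(tmp_path, path) != 0) {
          remove(tmp_path);
          return -1;
      }
      return 0;
  }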

Or alternatively, write page 1 (header) first:

  int cbm_write_db(...) {
      // Write header first (page 1 placeholder)
      // This ensures file is always recognizable as SQLite
      write_empty_page1_with_header(fp);

      // Then write data tables
      write_data_tables(...);

      // Finally update page 1 with actual schema
      fseek(fp, 0, SEEK_SET);
      write_final_page1(fp, ...);
  }
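
For reference, the temp-file-and-rename approach from Fix 2 can be written as a small standalone helper that compiles and can be tested in isolation. This is a sketch under assumptions, not the project's code: `write_file_atomic` is a hypothetical name, the whole payload is passed as one buffer for simplicity, and POSIX `fsync` is added because `fflush` only hands data to the kernel and does not guarantee it has reached disk.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: writes len bytes to path via a temp file and rename().
 * Returns 0 on success, -1 on failure. On failure the temp file is removed
 * and any existing file at path is left untouched. */
static int write_file_atomic(const char *path, const void *data, size_t len) {
    char tmp_path[1024];
    int n = snprintf(tmp_path, sizeof tmp_path, "%s.tmp.%d", path, (int)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp_path) return -1;  /* path too long */

    FILE *fp = fopen(tmp_path, "wb");
    if (!fp) return -1;

    /* Write, flush the stdio buffer, then ask the kernel to persist the data. */
    int ok = fwrite(data, 1, len, fp) == len
          && fflush(fp) == 0
          && fsync(fileno(fp)) == 0;
    if (fclose(fp) != 0) ok = 0;

    /* Only a fully written temp file is ever renamed over the target. */
    if (!ok || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

With this pattern, a crash during the write leaves at most a stray `.tmp.<pid>` file, and readers see either the previous database or the complete new one.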

Workaround for Users

  1. Delete corrupted database files:
       rm ~/.cache/codebase-memory-mcp/*.db
  2. Try indexing smaller subdirectories instead of the entire project
  3. If using with an MCP client, try restarting the client and re-indexing

Additional Crash Reports

Multiple crash reports available at:

  ~/Library/Logs/DiagnosticReports/codebase-memory-mcp-2026-04-01-*.ips

Happy to provide full crash reports or additional debugging information upon request.


Labels: bug (Something isn't working), stability/performance (Server crashes, OOM, hangs, high CPU/memory)
