Crash on Large Java Project: Memory Corruption Causes Database Corruption

Summary

 codebase-memory-mcp crashes with SIGTRAP (pointer authentication failure) when indexing a large Java project (~1.42 million lines, 7,574 files). The crash occurs during the dump phase and leaves behind a corrupted database file (missing SQLite header).

 Environment
- codebase-memory-mcp version: 0.5.7
- OS: macOS 26.3.2 (Tahoe), Build 25D2140
- Architecture: ARM64 (Apple Silicon - Mac17,6)
- Project Language: Java
- Project Size: 7,574 files, ~1,420,000 lines of code
- Resulting DB Size: ~51 MB (corrupted)

 Crash Details

 ### Crash Report Location

 ```
   ~/Library/Logs/DiagnosticReports/codebase-memory-mcp-2026-04-01-165325.ips
 ```

 ### Crash Signature

 ```
   Exception Type:  EXC_BREAKPOINT (SIGTRAP)
   Exception Codes: 0x0000000000000001, 0x000000018fdabcc4
   ESR Description: (Breakpoint) pointer authentication trap IB
   Crashed Thread:  0

   Termination: Trace/BPT trap: 5
   Symbol: xzm_malloc_zone_try_free_default (libsystem_malloc.dylib)
 ```

 ### Process Timeline

 ```
   Launch Time:     2026-04-01 16:53:22
   Crash Time:      2026-04-01 16:53:25
   Running Duration: ~3 seconds
 ```

 ### Memory State at Crash

 ```
   MALLOC regions:  1.4G virtual, 320 regions
   Writable regions: 1.4G total
 ```

 ### Full Crash Report JSON

 <details>
 <summary>Click to expand crash report</summary>

 ```json
   {
     "app_name": "codebase-memory-mcp",
     "timestamp": "2026-04-01 16:53:25.00 +0800",
     "pid": 60495,
     "procLaunch": "2026-04-01 16:53:22.3297 +0800",
     "procExitAbsTime": 3456298509386,
     "parentProc": "opencode",
     "parentPid": 56396,
     "exception": {
       "codes": "0x0000000000000001, 0x000000018fdabcc4",
       "type": "EXC_BREAKPOINT",
       "signal": "SIGTRAP"
     },
     "termination": {
       "flags": 0,
       "code": 5,
       "namespace": "SIGNAL",
       "indicator": "Trace/BPT trap: 5"
     },
     "faultingThread": 0,
     "threads": [{
       "triggered": true,
       "threadState": {
         "esr": {
           "value": 4060136561,
           "description": "(Breakpoint) pointer authentication trap IB"
         }
       }
     }],
     "usedImages": [{
       "source": "P",
       "arch": "arm64",
       "base": 4341891072,
       "size": 134496256,
       "path": "/Users/happyelements/.local/bin/codebase-memory-mcp"
     }, {
       "source": "P",
       "arch": "arm64e",
       "base": 6708269056,
       "size": 310376,
       "path": "/usr/lib/system/libsystem_malloc.dylib"
     }]
   }
 ```

 </details>

 Database Corruption Analysis

 After the crash, the database file was left in a corrupted state:

 ### File Inspection

 ```bash
   $ file Users-happyelements-work-animal-animal-server-20221102-animal-service.db
   data  # Not recognized as SQLite database!

   $ hexdump -C *.db | head -10
   00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
   *
   00010000  0d 00 00 00 d1 02 98 00  ff 16 fe 49 fd 33 fc 72  |...........I.3.r|
   # First 64KB is all zeros, data starts at offset 0x10000 (page 2)
 ```
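
 The same check can be done programmatically before opening a cached index. The following is a minimal sketch, not code from codebase-memory-mcp: `check_db_header` and the `db_header_state` enum are hypothetical names. It only relies on the documented 16-byte SQLite magic string (`"SQLite format 3"` plus a NUL terminator) and distinguishes a valid header from the all-zero page 1 seen above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Result of inspecting the first 16 bytes of a candidate database file. */
enum db_header_state {
    DB_HEADER_OK = 0,       /* starts with the SQLite magic string */
    DB_HEADER_ZEROED = 1,   /* first 16 bytes are all zero (page 1 never written) */
    DB_HEADER_UNKNOWN = 2,  /* some other content */
    DB_HEADER_IO_ERROR = 3  /* file missing, unreadable, or shorter than 16 bytes */
};

/* Hypothetical helper (an assumption, not part of the project):
 * classifies a file by the first 16 bytes of its header. */
enum db_header_state check_db_header(const char *path) {
    /* "SQLite format 3" is 15 characters; the terminating NUL fills byte 16. */
    static const unsigned char magic[16] = "SQLite format 3";
    static const unsigned char zeros[16] = {0};
    unsigned char buf[16];

    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (!fp) return DB_HEADER_IO_ERROR;
    size_t got = fread(buf, 1, sizeof(buf), fp);
    fclose(fp);
    if (got != sizeof(buf)) return DB_HEADER_IO_ERROR;

    if (memcmp(buf, magic, sizeof(magic)) == 0) return DB_HEADER_OK;
    if (memcmp(buf, zeros, sizeof(zeros)) == 0) return DB_HEADER_ZEROED;
    return DB_HEADER_UNKNOWN;
}
```

 A caller could treat `DB_HEADER_ZEROED` as "indexing was interrupted" and re-index instead of handing the file to SQLite.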

 ### Root Cause of Corruption

 The cbm_write_db() function in sqlite_writer.c writes data in this order:

 1. Phase 1: Write node/edge data tables to pages 2+
 2. Phase 2: Write metadata tables to subsequent pages
 3. Phase 3: Build indexes (CPU intensive)
 4. Last: Write page 1 (sqlite_master + 100-byte SQLite file header)

 ```c
   // sqlite_writer.c:1500+
   int cbm_write_db(const char *path, ...) {
       // Phase 1: Data tables first
       write_data_tables(&w, &nodes_root, &edges_root);

       // Phase 2: Metadata tables
       write_metadata_tables(&w, ...);

       // Phase 3: Build indexes (CPU intensive)
       // ... parallel sort, build index pages ...

       // LAST: Write page 1 with SQLite header
       // Line 1699+: "For sqlite_master, we need to write page 1 as the root"
       memcpy(page1, "SQLite format 3\000", 16);
       // ... write page 1 ...
   }
 ```

 When the process crashes before the final fwrite(page1) call, the database has:
 - ✅ All node/edge data written (pages 2+)
 - ❌ No SQLite header (page 1 is zeros)
 - ❌ No sqlite_master table (schema missing)

 This makes the file unrecognizable as a SQLite database.

 Crash Pattern

 The crash has occurred 11 times today on the same project:

 ┌──────────┬──────────┬────────┐
 │ Time     │ Duration │ Result │
 ├──────────┼──────────┼────────┤
 │ 14:30:49 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:34:33 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:35:56 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:37:17 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:44:10 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:52:24 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:54:43 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:55:38 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:58:23 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 14:59:44 │ ~few sec │ Crash  │
 ├──────────┼──────────┼────────┤
 │ 16:53:25 │ ~3 sec   │ Crash  │
 └──────────┴──────────┴────────┘

 All crashes show an identical signature: pointer authentication trap IB in libsystem_malloc.dylib.

 Analysis

 ### Primary Issue: Memory Corruption

 The SIGTRAP with a pointer authentication failure on ARM64 most likely indicates:
 - Heap corruption (buffer overflow, use-after-free, or double-free)
 - The memory allocator detected the corruption while freeing a block (xzm_malloc_zone_try_free_default) and trapped
 - A memory safety bug in the codebase rather than a timeout or resource exhaustion

 ### Contributing Factor: Large Project Size

 The crash only occurs with this large project:
 - 7,574 Java files
 - ~1.4M lines of code
 - Results in 51MB database with many nodes/edges

 Smaller projects (e.g., changeset-server.db at 5.3MB) index successfully.

 ### Secondary Issue: Non-atomic Database Write

 Even if the crash is fixed, the write order in cbm_write_db() is fragile:
 - Data written before header
 - Crash at any point leaves corrupted file
 - Should use atomic write pattern (write to temp file, then rename)

 Recommended Fixes

 ### Fix 1: Investigate Memory Bug

 The crash occurs in libsystem_malloc.dylib during memory operations. Likely areas:
 - pb_add_table_cell_with_flush() in page building
 - Record building (build_node_record, build_edge_record)
 - Large buffer allocations for sorting
 - Possible integer overflow in size calculations
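
 To illustrate the last point: if a buffer size is computed as `count * record_size` in a type that can wrap around, `malloc` receives a value that is too small and later writes run past the allocation. Whether this actually happens in sqlite_writer.c is unverified. The sketch below shows one way to rule it out; `checked_buffer_size` and `alloc_records` are hypothetical helpers, not functions from the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: computes count * elem_size + extra into *out.
 * Returns false (leaving *out untouched) if the result does not fit in size_t. */
bool checked_buffer_size(size_t count, size_t elem_size, size_t extra, size_t *out) {
    assert(out != NULL);

    /* Multiplication would wrap if count exceeds SIZE_MAX / elem_size. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size) return false;
    size_t product = count * elem_size;

    /* Addition would wrap if extra exceeds the remaining headroom. */
    if (extra > SIZE_MAX - product) return false;

    *out = product + extra;
    return true;
}

/* Hypothetical helper: allocates room for `count` records plus a header.
 * Returns NULL on size overflow or allocation failure instead of
 * silently allocating a truncated buffer. */
void *alloc_records(size_t count, size_t elem_size, size_t header_bytes) {
    size_t total;
    if (!checked_buffer_size(count, elem_size, header_bytes, &total)) return NULL;
    return malloc(total);
}
```

 Auditing size calculations that use `int` or `uint32_t` for counts, offsets, or lengths would be a reasonable first step, since those wrap far earlier than `size_t` on a project of this size.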

 ### Fix 2: Atomic Database Writes

 Modify cbm_write_db() to use atomic write pattern:

 ```c
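   // Requires <stdio.h> and <unistd.h> (getpid, fsync)
   int cbm_write_db(const char *path, ...) {
       // Write to a temporary file next to the target so that the final
       // rename() stays on the same filesystem
       char tmp_path[1024];
       int n = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp.%d", path, (int)getpid());
       if (n < 0 || (size_t)n >= sizeof(tmp_path)) return -1;  // path too long

       FILE *fp = fopen(tmp_path, "wb");
       if (!fp) return -1;
       // ... all writes to tmp_path ...

       // Ensure all data is flushed to disk before publishing the file
       if (fflush(fp) != 0 || fsync(fileno(fp)) != 0) {
           fclose(fp);
           remove(tmp_path);
           return -1;
       }
       if (fclose(fp) != 0) { remove(tmp_path); return -1; }

       // Atomic rename to final path; a crash before this point leaves
       // any existing database at `path` untouched
       if (rename(tmp_path, path) != 0) { remove(tmp_path); return -1; }
       return 0;
   }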
 ```

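 Or alternatively, write page 1 (header) first. This only keeps the file recognizable as SQLite; a crash midway still leaves an incomplete database, so the temp-file approach is the safer of the two: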

 ```c
   int cbm_write_db(...) {
       // Write header first (page 1 placeholder)
       // This ensures file is always recognizable as SQLite
       write_empty_page1_with_header(fp);

       // Then write data tables
       write_data_tables(...);

       // Finally update page 1 with actual schema
       fseek(fp, 0, SEEK_SET);
       write_final_page1(fp, ...);
   }
 ```

 Workaround for Users

 1. Delete corrupted database files (note that this glob removes every cached database in that directory, including valid ones):
   ```bash
     rm ~/.cache/codebase-memory-mcp/*.db
   ```
 2. Try indexing smaller subdirectories instead of the entire project
 3. If using with an MCP client, try restarting the client and re-indexing

 Additional Crash Reports

 Multiple crash reports available at:

 ```
   ~/Library/Logs/DiagnosticReports/codebase-memory-mcp-2026-04-01-*.ips
 ```

 Happy to provide full crash reports or additional debugging information upon request.
