Enable MMU + D-cache: fix sustained host→device WRITE by widgetii · Pull Request #20 · OpenIPC/defib

widgetii · 2026-03-31T17:19:27Z

Summary

Enable ARMv7 MMU with D-cache to fix FIFO overflow during sustained host→device writes.

ARMv7 short-descriptor page tables with 1MB identity-mapped sections:

DDR (128MB from RAM_BASE): write-back, write-allocate
I/O regions (UART, FMC, CRG, flash window): device/uncached
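
A minimal sketch of how such a section table could be filled, not the PR's actual code. The base addresses `RAM_BASE_ASSUMED` and `IO_BASE_ASSUMED`, the I/O window size, and the function names are illustrative assumptions. The attribute encodings follow the ARMv7 short-descriptor format with TEX remap disabled (SCTLR.TRE=0) and assume domain 0 is set to "client" in DACR.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 20u    /* 1MB sections */
#define NUM_SECTIONS  4096u  /* 4GB address space / 1MB */

/* ARMv7 short-descriptor section bits */
#define DESC_SECTION  (2u << 0)              /* bits[1:0] = 0b10 */
#define DESC_B        (1u << 2)
#define DESC_C        (1u << 3)
#define DESC_XN       (1u << 4)
#define DESC_AP_RW    (3u << 10)             /* AP[1:0]=0b11, AP[2]=0: full access */
#define DESC_TEX(x)   ((uint32_t)(x) << 12)

/* Normal memory, write-back write-allocate: TEX=0b001, C=1, B=1 */
#define ATTR_DDR      (DESC_SECTION | DESC_AP_RW | DESC_TEX(1) | DESC_C | DESC_B)
/* Shareable Device memory: TEX=0b000, C=0, B=1, execute-never */
#define ATTR_DEVICE   (DESC_SECTION | DESC_AP_RW | DESC_B | DESC_XN)

/* ASSUMPTIONS: illustrative addresses, not taken from the PR. */
#define RAM_BASE_ASSUMED 0x40000000u
#define IO_BASE_ASSUMED  0x10000000u
#define IO_SIZE_MB       64u

/* Identity-map size_mb consecutive 1MB sections starting at base. */
static void map_sections(uint32_t *table, uint32_t base,
                         uint32_t size_mb, uint32_t attrs)
{
    uint32_t first = base >> SECTION_SHIFT;
    assert(first + size_mb <= NUM_SECTIONS);
    for (uint32_t i = 0; i < size_mb; i++) {
        uint32_t idx = first + i;
        table[idx] = (idx << SECTION_SHIFT) | attrs;  /* VA == PA */
    }
}

/* table must hold 4096 entries; on hardware it must be 16KB-aligned for TTBR0. */
static void build_identity_map(uint32_t *table)
{
    for (uint32_t i = 0; i < NUM_SECTIONS; i++)
        table[i] = 0;  /* bits[1:0] = 0b00: translation fault */
    map_sections(table, RAM_BASE_ASSUMED, 128u, ATTR_DDR);
    map_sections(table, IO_BASE_ASSUMED, IO_SIZE_MB, ATTR_DEVICE);
}
```

Everything left unmapped faults on access, so a stray pointer traps instead of silently touching an uncached or nonexistent region.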

With D-cache, COBS+CRC processing is ~10x faster, eliminating PL011 FIFO overflow.

Before (uncached DDR)

Size	Result
16KB	OK
64KB	FAIL (FIFO overflow)
256KB	FAIL

After (D-cache enabled)

Size	Speed	Result
16KB	49 KB/s	OK
64KB	80 KB/s	OK
256KB	79 KB/s	OK

All verified with CRC32 read-back.

Test plan

All CI checks pass locally (ruff, mypy, pytest 247, C 1604)
Self-update to real hi3516ev300 — agent boots with MMU enabled
16KB / 64KB / 256KB WRITE all verified
CI on PR

🤖 Generated with Claude Code

ARMv7 short-descriptor page tables with 1MB identity-mapped sections. DDR (128MB from RAM_BASE) is cacheable write-back/write-allocate. All I/O regions (UART, FMC, CRG, flash window) are device/uncached.

With D-cache, COBS decode + CRC32 processing is ~10x faster, eliminating PL011 FIFO overflow during sustained host→device transfers. Previously WRITE failed after ~16-420KB; now 256KB verified at 79 KB/s.

Page table (16KB) allocated in BSS with 16KB alignment for TTBR0.

Tested on hi3516ev300:
- 16KB write: OK (previously OK)
- 64KB write: OK (previously FAILED)
- 256KB write: OK (previously IMPOSSIBLE)
- All verified with CRC32 read-back

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

PL011 RX interrupt handler drains the hardware FIFO into a 4KB ring buffer automatically. GIC configured for UART0 IRQ (SPI 7 on ev200/ev300). IRQ mode stack set up. proto_recv reads from the ring buffer via uart_getc_safe, so there is no more polling of soft_rx_drain.

Combined with MMU/D-cache, this should eliminate sustained WRITE failures. Testing showed 3/4 blocks work but block 4 loses 3 packets (14848/16384 bytes received). Ring buffer overflow suspected.

Known issue: 8KB ring buffer crashes agent (BSS overlap or GIC issue).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
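
As an illustration of the IRQ-fed ring buffer described in this commit, here is a single-producer/single-consumer sketch. The names `rx_ring`, `rx_ring_push`, and `rx_ring_pop` are hypothetical and the agent's real layout may differ. The idea is that the IRQ handler would call the push function for each byte read from the PL011 data register, and the main loop would call the pop function.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE 4096u  /* power of two, so index wrap is a mask */

struct rx_ring {
    volatile uint32_t head;               /* written only by the IRQ handler */
    volatile uint32_t tail;               /* written only by the main loop */
    volatile uint8_t  data[RX_RING_SIZE];
};

/* Producer side (IRQ context). Returns false when the ring is full. */
static bool rx_ring_push(struct rx_ring *r, uint8_t byte)
{
    uint32_t next = (r->head + 1u) & (RX_RING_SIZE - 1u);
    if (next == r->tail)
        return false;        /* full: the byte is dropped */
    r->data[r->head] = byte;
    r->head = next;          /* publish only after the byte is stored */
    return true;
}

/* Consumer side (main loop). Returns the byte, or -1 when empty. */
static int rx_ring_pop(struct rx_ring *r)
{
    if (r->tail == r->head)
        return -1;
    uint8_t byte = r->data[r->tail];
    r->tail = (r->tail + 1u) & (RX_RING_SIZE - 1u);
    return byte;
}
```

One slot is kept empty to distinguish full from empty, so this layout holds at most 4095 bytes. A push that returns false is exactly the silent-loss case suspected above, which is why counting dropped bytes in the handler can help confirm an overflow.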

Per-packet COBS ACK in handle_write for flow control. Host waits for ACK before sending next DATA packet.

Added proto_drain_fifo call in uart_putc TX wait loop to prevent RX FIFO overflow during bidirectional backpressure traffic. Added proto_reset_rx to flush both software and hardware RX buffers.

WRITE_MAX_TRANSFER set to 32MB (single block) to avoid inter-block READY packet desynchronization.

Root cause found: selfupdate also loses data, producing corrupted agent binaries. Need per-packet backpressure in selfupdate too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Root cause found: selfupdate was blasting packets without flow control, losing data. Agent's CRC check should have caught this but corrupted binaries were being deployed, causing cascading failures in all subsequent operations.

Fix: both handle_selfupdate and handle_write now send proto_send_ack after each DATA packet. Host waits for ACK before sending next. Guarantees zero data loss at any baud rate.

Also: proto_drain_fifo in uart_putc TX wait loop, proto_reset_rx for flushing both hardware and software RX buffers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
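
The per-packet ACK scheme can be sketched as a small device-side state machine. This is an assumption-laden illustration: `write_session`, `write_reply`, and `write_session_on_data` are hypothetical names, and the sketch only decides what to reply. In the agent, the reply would presumably be sent with proto_send_ack, whose signature is not shown in this PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the device should answer after one DATA packet (hypothetical enum). */
enum write_reply {
    WRITE_ACK,       /* packet stored, host may send the next one */
    WRITE_ACK_DONE,  /* packet stored and the transfer is complete */
    WRITE_ERROR      /* packet rejected, nothing stored */
};

struct write_session {
    uint8_t *dst;      /* destination buffer */
    size_t   total;    /* bytes expected for the whole transfer */
    size_t   received; /* bytes stored so far */
};

/* Handle one decoded, CRC-checked DATA payload. Because the host waits for
 * the reply before sending again, at most one packet is ever in flight. */
static enum write_reply write_session_on_data(struct write_session *s,
                                              const uint8_t *payload,
                                              size_t len)
{
    if (len == 0 || len > s->total - s->received)
        return WRITE_ERROR;
    memcpy(s->dst + s->received, payload, len);
    s->received += len;
    return (s->received == s->total) ? WRITE_ACK_DONE : WRITE_ACK;
}
```

With one packet in flight, the receiver never needs to buffer more than a single encoded frame, which is why the scheme is independent of baud rate. The cost is one round trip per packet.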

ROOT CAUSE: cobs_decode() stripped trailing zero bytes from decoded output. When a COBS packet's CRC32 had 0x00 as its most significant byte (the last byte in little-endian order), the decode removed it, producing a 1-byte-shorter output. This caused a CRC mismatch for ~1/256 of all packets: deterministic and data-dependent.

Found via ASAN: "left shift of 136 by 24 places cannot be represented in type 'int'" led to investigating CRC byte extraction, which led to the COBS decode length mismatch.

Fixes:
- Remove trailing zero stripping from cobs_decode() (C)
- Cast uint8_t to uint32_t before << 24 in CRC extraction (UB fix)
- Per-packet backpressure ACK in WRITE and SELFUPDATE
- Fixed _recv_packet_sync: no partial frame stashing
- recv_response: reset deadline after READY skip
- ASAN firmware data test catches the bug offline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
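
To make the two C fixes concrete, here is a sketch of a COBS decoder that preserves trailing zero bytes, plus a little-endian 32-bit read with the cast applied before each shift. `cobs_decode_sketch` and `read_le32` are illustrative names, not the repository's functions, and the real cobs_decode() may differ in signature and error handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one COBS frame (without the 0x00 delimiter).
 * Returns the decoded length, or 0 on malformed input or lack of space.
 * Trailing zero bytes in the payload are kept. */
static size_t cobs_decode_sketch(const uint8_t *in, size_t in_len,
                                 uint8_t *out, size_t out_cap)
{
    size_t ip = 0, op = 0;
    while (ip < in_len) {
        uint8_t code = in[ip++];
        if (code == 0)
            return 0;                      /* zero never appears inside a frame */
        size_t run = (size_t)code - 1u;    /* literal bytes that follow */
        if (run > in_len - ip || run > out_cap - op)
            return 0;
        for (size_t i = 0; i < run; i++) {
            if (in[ip] == 0)
                return 0;
            out[op++] = in[ip++];
        }
        /* A code below 0xFF encodes a zero after the run, unless the frame
         * ends here. A zero that ends the payload is therefore emitted by
         * the block before the final code byte and must not be stripped. */
        if (code != 0xFF && ip < in_len) {
            if (op >= out_cap)
                return 0;
            out[op++] = 0;
        }
    }
    return op;
}

/* Little-endian 32-bit read. Without the casts, p[3] is promoted to int and
 * shifting a value >= 0x80 left by 24 overflows it (undefined behaviour). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, the frame `05 41 78 56 34 01` decodes to the five bytes `41 78 56 34 00`. A decoder that stripped the final zero would return four bytes, and a CRC read from the truncated buffer would no longer match.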

widgetii and others added 5 commits April 4, 2026 18:11

widgetii force-pushed the feature/mmu-dcache branch from ec8bd22 to 80b2897 on April 4, 2026 15:12

widgetii merged commit 50bdc79 into master Apr 4, 2026
13 checks passed

widgetii deleted the feature/mmu-dcache branch April 4, 2026 15:21

widgetii mentioned this pull request Apr 4, 2026

Flash agent roadmap: SPI flash, high-speed UART, defib integration #11 (Open, 42 tasks)
