This document explains how to debug a running Python process in this repo, how to grant access to another user (for example the agent user), and which local scripts are useful for memory investigations.
sys.remote_exec(pid, script_path) asks the target Python process to run a
Python script file.
Important behavior from Python docs:
- sys.remote_exec may return before the script executes.
- Script execution happens later when the target reaches a safe evaluation point.
So always wait and poll for output files/logs after injection.
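The wait-and-poll step can be sketched as a small helper. This is a minimal sketch, not part of the repo: the name `wait_for_file` and its default timeout and interval are assumptions chosen for illustration.

```python
import os
import time


def wait_for_file(path, timeout=180.0, interval=2.0):
    """Poll until `path` exists and is non-empty, or `timeout` seconds pass.

    Returns True if the file appeared, False on timeout.
    (Hypothetical helper; the name and defaults are illustrative.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if os.path.getsize(path) > 0:
                return True
        except OSError:
            pass  # file not created yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Checking for a non-empty file rather than mere existence reduces the chance of reading output that the injected script has only just started writing.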
Python 3.14 remote debugging docs state:
- tracer needs ptrace permission (CAP_SYS_PTRACE or equivalent),
- target should be same UID and signal-able,
- Yama (ptrace_scope) can restrict attach.
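These preconditions can be inspected from Python before attempting an attach. The sketch below is Linux-only and assumes the standard /proc layout; the function names are hypothetical, and the scope descriptions paraphrase the Linux Yama documentation.

```python
import os
from pathlib import Path

# Meanings of kernel.yama.ptrace_scope, paraphrased from the Yama documentation.
PTRACE_SCOPE_MEANING = {
    0: "classic: processes with the same UID may attach",
    1: "restricted: only ancestors of the target (or a declared tracer) may attach",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "disabled: no process may attach",
}


def describe_ptrace_scope(raw):
    """Turn the raw file content into (value, human-readable meaning)."""
    value = int(raw.strip())
    return value, PTRACE_SCOPE_MEANING.get(value, "unknown value")


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return (value, meaning), or None if the Yama file is absent."""
    try:
        return describe_ptrace_scope(Path(path).read_text())
    except FileNotFoundError:
        return None  # Yama not enabled, or not a Linux system


def same_uid_as(pid):
    """True if /proc/<pid> is owned by our effective UID (Linux only).

    Assumption: the owner of /proc/<pid> reflects the target's UID, which
    holds for ordinary dumpable processes.
    """
    return os.stat(f"/proc/{pid}").st_uid == os.geteuid()
```

A result of `None` from `read_ptrace_scope` only means the Yama file is missing; other security modules may still restrict attach.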
Assume:
- target process owner user: TARGET_USER
- debugging user (agent): DEBUG_USER
- Python binary used by debugger: /path/to/python
id
cat /proc/sys/kernel/yama/ptrace_scope
ps -o pid,user,args -p <PID>

If possible, run the debugger as the same user as the target process.
Run as root:
sudo setcap cap_sys_ptrace+ep /path/to/python
getcap /path/to/python

Temporary until reboot:
echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope

Persistent:
echo "kernel.yama.ptrace_scope = 0" | sudo tee /etc/sysctl.d/99-dw-debug.conf
sudo sysctl --system

kill -0 <PID> && echo OK

"/path/to/python" - <<'PY'
import sys
print(sys.remote_exec(<PID>, "/abs/path/to/script.py"))
PY

sleep 90
ls -l /tmp/<expected_output_file>

If the file is not present, wait longer and poll again.
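The signal check and the injection step can also be combined in one Python wrapper. This is a sketch under assumptions: `can_signal` and `inject` are hypothetical names, the code is Unix-only, and `sys.remote_exec` is only available on Python 3.14 or newer, which is why the call is guarded.

```python
import os
import sys


def can_signal(pid):
    """Equivalent of `kill -0 <pid>`: True if we may signal the process."""
    try:
        os.kill(pid, 0)  # signal 0 performs the permission check only
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return False  # process exists but we are not allowed to signal it
    return True


def inject(pid, script_path):
    """Ask process `pid` to run `script_path` after basic pre-flight checks.

    Hypothetical wrapper; returning does not mean the script has run yet.
    """
    if not os.path.isabs(script_path):
        raise ValueError(f"script path must be absolute: {script_path!r}")
    if not os.path.isfile(script_path):
        raise FileNotFoundError(script_path)
    if not hasattr(sys, "remote_exec"):
        raise RuntimeError("sys.remote_exec requires Python 3.14 or newer")
    if not can_signal(pid):
        raise PermissionError(f"cannot signal process {pid}")
    sys.remote_exec(pid, script_path)
```

Failing early on a relative path or an unreachable process gives a clearer error than waiting for an output file that will never appear.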
- scripts/muppy_heap_breakdown.py
  - per-type heap size summary, includes str >= 1MB totals.
- scripts/muppy_large_str_owners.py
  - tags large strings by content and reports owner fields.
- scripts/owner_awaited_by_probe.py
  - finds task with largest _asyncio_awaited_by set and summarizes waiters.
- scripts/clear_largest_awaited_by.py
  - one-off cleanup of done waiters for the largest owner task.
  - diagnostic/rescue tool only; not a product fix.
- scripts/remote_exec_write_probe.py
  - minimal sanity check that remote script execution is happening.
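For orientation, a sanity probe of the last kind might look like the sketch below. It is not the repo's remote_exec_write_probe.py, whose contents are not shown here; the function name, the output format, and the example path in the comment are assumptions.

```python
import os
import tempfile
import time


def write_probe(path):
    """Atomically write the current PID and a timestamp to an absolute path.

    Hypothetical probe: when run inside the target process, the PID in the
    file shows which process actually executed the script.
    """
    if not os.path.isabs(path):
        raise ValueError(f"output path must be absolute: {path!r}")
    directory = os.path.dirname(path)
    # Write to a temporary file first, then rename, so a poller never
    # observes a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".probe-")
    with os.fdopen(fd, "w") as handle:
        handle.write(f"pid={os.getpid()} time={time.time():.3f}\n")
    os.replace(tmp_path, path)
    return path


# A script meant for injection would end with a direct call, for example:
# write_probe("/tmp/remote_exec_probe.txt")  # illustrative path
```

If the PID recorded in the file matches the target process rather than the debugger, remote execution is working.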
- Run muppy_heap_breakdown.py to confirm dominant object type.
- Run muppy_large_str_owners.py to identify payload type and owner fields.
- Run owner_awaited_by_probe.py to detect task waiter retention patterns.
- Patch root cause in normal code path.
- Optionally run clear_largest_awaited_by.py once to validate hypothesis live.
- Always use absolute script paths for sys.remote_exec.
- Make scripts write outputs to predictable absolute paths (/tmp/... or repo data dir).
- For long-running probes, use larger waits (90-180s) before concluding failure.
- Avoid assuming immediate side effects after sys.remote_exec returns.
- Lowering ptrace_scope and granting CAP_SYS_PTRACE both reduce hardening.
- Only enable these in trusted environments and revert if not needed.