
definitelynot.ai  Internet Universe  UC Berkeley Mathematics  Email

SF Bay Area  •  Git Page  •  All icons from iconics


OpenAI Codex: Finding the Ghost in the Machine

> [!IMPORTANT]
> Solved a pre-main() (#[ctor::ctor]) environment-stripping bug that caused 11–300× GPU slowdowns and eluded OpenAI’s debugging team for months. This was the main blocker to Codex spawning effective subagents, and it also explains why OpenAI wasn’t able to use Codex in-house until February 2026.

Proof: Issue #8945  |  PR #8951  |  Release notes (rust-v0.80.0)

Full Investigation Details

The Ghost

In October 2025, OpenAI assembled a specialized debugging team to investigate mysterious slowdowns affecting Codex. After a week of intensive investigation: nothing.

The bug was literally a ghost — pre_main_hardening() executed before main(), stripped critical environment variables (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH), and disappeared without a trace. Standard profilers saw nothing. Users saw variables in their shell, but inside codex exec they vanished.


The Hunt

Within 3 days of their announcement, I had identified the problematic commit, PR #4521, and contacted @tibo_openai.

But identification is not proof. I spent 2 months building an undeniable case.

Timeline

| Date | Event |
| --- | --- |
| Sept 30, 2025 | PR #4521 merges, enabling pre_main_hardening() in release builds |
| Oct 1, 2025 | rust-v0.43.0 ships (first affected release) |
| Oct 6, 2025 | First “painfully slow” regression reports |
| Oct 1–29, 2025 | Spike in env/PATH inheritance issues across platforms |
| Oct 29, 2025 | Emergency PATH fix lands (does not catch the root cause) |
| Late Oct 2025 | OpenAI’s specialized team investigates, declares there is no root cause, and attributes the issue to changed user behavior |
| Jan 9, 2026 | My fix merged, credited in release notes |

Evidence Collected

| Platform | Issues | Failure Mode |
| --- | --- | --- |
| macOS | #6012, #5679, #5339, #6243, #6218 | DYLD_* stripping breaking dynamic linking |
| Linux/WSL2 | #4843, #3891, #6200, #5837, #6263 | LD_LIBRARY_PATH stripping → silent CUDA/MKL degradation |

Compiled evidence packages:

- Platform-specific failure modes: reproduction steps with quantifiable performance regressions (11–300×) and benchmarks
- Pattern analysis: cross-referenced 15+ scattered user reports over 3 months; traced process environment inheritance through fork/exec boundaries
- Comprehensive technical analysis
- Investigation methodology


Why Conventional Debugging Failed

The bug was designed to be invisible:

- Pre-main execution: used #[ctor::ctor] to run before main(), before any logging or instrumentation
- Silent stripping: no warnings, no errors, just missing environment variables
- Distributed symptoms: appeared as unrelated issues across different platforms and configurations
- User attribution: everyone assumed they had misconfigured something (the shell looked fine)
- Wrong search space: the team was debugging post-main application code

> [!NOTE]
> Standard debugging tools cannot see pre-main execution. Profilers start at main(). Log hooks are not initialized yet. The code executes, modifies the environment, and vanishes.
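The mechanism can be sketched in a few lines of Rust. This is a hypothetical reconstruction, not the actual Codex code: in the real bug the hook ran via #[ctor::ctor] before main(); here it is called explicitly so the sketch builds without the `ctor` crate.

```rust
use std::env;

// Hypothetical sketch of the failure mode. The real hook was marked
// #[ctor::ctor] and ran before main(); here we call it directly so the
// example compiles without external dependencies.
fn pre_main_hardening_sketch() {
    for var in ["LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"] {
        env::remove_var(var); // no log, no error: the variable just vanishes
    }
}

fn main() {
    // What the user's shell exported:
    env::set_var("LD_LIBRARY_PATH", "/opt/cuda/lib64");

    // In the real bug this ran *before* main(), ahead of any profiler or logger.
    pre_main_hardening_sketch();

    // Every subprocess spawned from here inherits an environment missing
    // the loader paths, so CUDA/MKL silently fall back to slow code paths.
    assert!(env::var("LD_LIBRARY_PATH").is_err());
}
```

Because the stripping happens before any instrumentation exists, the only observable symptom is the downstream slowdown.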


The Impact

OpenAI confirmed and merged the fix within 24 hours, explicitly crediting the investigation in v0.80.0 release notes:

“Codex CLI subprocesses again inherit env vars like LD_LIBRARY_PATH/DYLD_LIBRARY_PATH to avoid runtime issues. As explained in #8945, failure to pass along these environment variables to subprocesses that expect them (notably GPU-related ones), was causing 10×+ performance regressions! Special thanks to @johnzfitch for the detailed investigation and write-up in #8945.”
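The restored behavior is simply the default of Rust's std::process::Command, which hands each child a copy of the parent environment. A minimal sketch of both the fixed and the broken paths (Unix-only, since it shells out to sh; the paths used are illustrative):

```rust
use std::env;
use std::process::Command;

fn main() {
    // Default: the child inherits the parent's environment.
    env::set_var("LD_LIBRARY_PATH", "/opt/cuda/lib64");
    let inherited = Command::new("sh")
        .args(["-c", "printf %s \"$LD_LIBRARY_PATH\""])
        .output()
        .expect("failed to spawn sh");
    assert_eq!(String::from_utf8_lossy(&inherited.stdout), "/opt/cuda/lib64");

    // env_clear() reproduces the bug's effect: the child starts from an
    // empty environment and the loader variables never arrive.
    let stripped = Command::new("sh")
        .args(["-c", "printf %s \"$LD_LIBRARY_PATH\""])
        .env_clear()
        .output()
        .expect("failed to spawn sh");
    assert!(stripped.stdout.is_empty());
}
```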

Restored:

| Restored | For |
| --- | --- |
| GPU acceleration | Internal ML/AI dev teams |
| CUDA/PyTorch | ML researchers |
| MKL/NumPy | Scientific computing users |
| Conda environments | Cross-platform compatibility |
| Enterprise drivers | Database connectivity |

When the tools are blind, the system lies, and everyone else has stopped looking for it — this is the type of problem I love solving.


Selected Work

- Observatory: WebGPU deepfake detection running 4 ML models in the browser (live demo)
- specHO: LLM watermark detection via phonetic/semantic analysis (The Echo Rule)
- filearchy: COSMIC Files fork with sub-10ms trigram search across 2.15M files (Rust)
- nautilus-plus: Enhanced GNOME Files with sub-ms search (AUR)
- indepacer: PACER CLI for federal court research (PyPI: pacersdk)

Self-hosting bare metal infrastructure (NixOS) with post-quantum cryptography, authoritative DNS, and containerized services.


Featured

Observatory — WebGPU Deepfake Detection

Live Demo: look.definitelynot.ai

Browser-based AI image detection running 4 specialized ML models (ViT, Swin Transformer) through WebGPU. Zero server-side processing; all inference happens client-side with 672MB of ONNX models.

| Model | Accuracy | Architecture |
| --- | --- | --- |
| dima806_ai_real | 98.2% | Vision Transformer |
| SMOGY | 98.2% | Swin Transformer |
| Deep-Fake-Detector-v2 | 92.1% | ViT-Base |
| umm_maybe | 94.2% | Vision Transformer |

Stack: JavaScript (ES6)  •  Transformers.js  •  ONNX  •  WebGPU/WASM


iconics — Semantic Icon Library

3,372+ PNG icons with semantic CLI discovery. Find the right icon by meaning, not filename.

icon suggest security       # → lock, shield, key, firewall…
icon suggest data           # → chart, database, folder…
icon use lock shield        # Export to ./icons/

Features: Fuzzy search  •  theme variants  •  batch export  •  markdown integration
Stack: Python  •  FuzzyWuzzy  •  PIL


filearchy + triglyph — Sub-10ms File Search

COSMIC Files fork with an embedded trigram search engine. Memory-mapped indices deliver sub-10ms searches across 2.15M+ files with near-zero resident memory.

filearchy/
├── triglyph/      # Trigram library (mmap)
└── triglyphd/     # D-Bus daemon for system-wide search
Performance: 2.15M files indexed  •  <10ms query time  •  156MB index on disk
Stack: Rust  •  libcosmic  •  memmap2  •  zbus
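The core idea behind a trigram index can be shown in a short in-memory sketch (hypothetical code; the actual triglyph library stores its posting lists in memory-mapped files rather than a HashMap):

```rust
use std::collections::HashMap;

// Map every overlapping 3-character window of a filename to the ids of
// files containing it. A query is then answered by intersecting the
// posting lists of its trigrams instead of scanning every filename.
fn trigrams(s: &str) -> Vec<&str> {
    (0..s.len().saturating_sub(2))
        .filter_map(|i| s.get(i..i + 3))
        .collect()
}

fn main() {
    let files = ["report.pdf", "portrait.png"];
    let mut index: HashMap<&str, Vec<usize>> = HashMap::new();
    for (id, name) in files.iter().enumerate() {
        for t in trigrams(name) {
            let ids = index.entry(t).or_default();
            if ids.last() != Some(&id) {
                ids.push(id); // posting lists stay sorted and deduplicated
            }
        }
    }
    // Both filenames contain the trigram "por":
    assert_eq!(index.get("por"), Some(&vec![0, 1]));
}
```

Because each posting list is a sorted id array, lookups touch only the lists for the query's trigrams, which is what makes memory-mapped storage with near-zero resident memory practical.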

The Echo Rule — LLM Detection Methodology

LLMs echo their training data. That echo is detectable through pattern recognition:

| Signature | Detection Method |
| --- | --- |
| Phonetic | CMU phoneme analysis, Levenshtein distance |
| Structural | POS tag patterns, sentence construction |
| Semantic | Word2Vec cosine similarity, hedging clusters |

Implemented in specHO with 98.6% preprocessor test pass rate. Live demo at definitelynot.ai.
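The phonetic signature leans on edit distance. A minimal Levenshtein implementation, shown here over raw characters as an illustration (the specHO pipeline applies this kind of distance to CMU phoneme sequences, not text):

```rust
// Classic dynamic-programming Levenshtein edit distance: the minimum
// number of insertions, deletions, and substitutions turning a into b.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between a[..i] and b[..j] for the previous row.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

fn main() {
    assert_eq!(levenshtein("kitten", "sitting"), 3);
    assert_eq!(levenshtein("echo", "echo"), 0);
}
```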


Skills

Technical focus — skills breakdown

Core: Rust  |  Python  |  TypeScript  |  C  |  Nix  |  Shell


Project Dashboard


AI / ML

- observatory: WebGPU deepfake detection (live: look.definitelynot.ai)
- specHO: LLM watermark detection (Echo Rule)
- definitelynot.ai: Unicode security defenses
- marginium: Multimodal generation tooling
- gemini-cli: Privacy-enhanced Gemini CLI fork

Security Research

- eero (private): Mesh WiFi router security analysis
- blizzarchy (private): OAuth analysis and telemetry RE
- featherarchy: Security-hardened Monero wallet fork
- alienware-monitor (private): Firmware RE
- proxyforge (private): Transparent MITM proxy

Systems Programming

- filearchy: COSMIC Files fork with trigram search
- triglyph: Trigram index library
- triglyphd: D-Bus search daemon
- nautilus-plus: Enhanced GNOME Files
- search-cache: Sub-ms search cache/index
- cod3x: Terminal coding agent
- bitmail (private): Bitmessage client

CLI Tools

- indepacer: PACER CLI
- iconics: Semantic icon library
- gemini-sharp: Single-file Gemini CLI binaries

Desktop / Linux

- omarchy: Omarchy fork
- waybar-config: Waybar RSS ticker
- claude-desktop-arch: Claude patch for Arch
- qualcomm-x870e-linux-bug-patch: WiFi 7 firmware fix
- arch-deps: Graph theory analysis

Web / Mobile

- NetworkBatcher: Network batching for iOS
- Liberty-Links: Privacy-respecting link alternatives

Infrastructure

- NixOS Server (private): Post-quantum SSH, Rosenpass VPN, authoritative DNS
- unbound-config (private): Recursive DNS with DNSSEC and ad blocking

Infrastructure

Primary server: Dedicated bare-metal NixOS host (details available on request)

- Security: Post-quantum SSH  •  Rosenpass VPN  •  nftables firewall
- DNS: Unbound resolver with DNSSEC  •  ad/tracker blocking
- Services: FreshRSS  •  Caddy (HTTPS/HTTP/3)  •  cPanel/WHM  •  Podman containers
- Network: Local 10Gbps  •  Authoritative BIND9 with RFC 2136 ACME

Philosophy

Pinned

- burn-plugin (Shell): Comprehensive Claude Code plugin for the Burn deep learning framework
- llmx (Rust): Codebase indexer with BM25 search and semantic chunk exports for local agent consumption
- specho-v2 (Python)
- claude-warden (Shell): Token-saving hooks for Claude Code. Prevents verbose output, blocks binary reads, enforces subagent budgets, truncates large outputs.
- pyghidra-lite (Python): Lightweight MCP server for Ghidra-based reverse engineering with iOS, Linux, and game file support