Skip to content

Harden CodexBar cold-start menu readiness#1150

Open
AmrMohamad wants to merge 6 commits into
steipete:mainfrom
AmrMohamad:enhance/UI-performance-issues
Open

Harden CodexBar cold-start menu readiness#1150
AmrMohamad wants to merge 6 commits into
steipete:mainfrom
AmrMohamad:enhance/UI-performance-issues

Conversation

@AmrMohamad
Copy link
Copy Markdown

Summary

This PR hardens CodexBar’s first-launch menu readiness path after macOS boot. The issue we investigated was an unstable first-open Codex tab where expensive adjunct UI, especially charts and hosted submenu content, could remain absent until the user manually refreshed or switched tabs.

The core fix is not a visual redesign. It keeps the UI shape intact and improves the lifecycle behind it: menu readiness invalidation, hosted submenu hydration, cached token data availability, duplicate-launch protection, and cold-start validation.

Why this changed

CodexBar’s menu is an AppKit NSStatusItem / NSMenu surface with SwiftUI-hosted rows and lazily hydrated submenus. On first launch after boot, several data sources can become ready after the menu has already been opened:

  • OpenAI dashboard attachment and usage data
  • credits state
  • Codex token-cost cache
  • plan utilization history
  • hosted chart submenu data

Before this change, the menu invalidation path was too coarse for those adjunct readiness transitions. A parent menu could remain open with stale structure, and a hosted submenu could stay stuck on its placeholder even after the backing data became available.

What changed

  • Adds an explicit menu adjunct readiness signature so open menus can refresh when dashboard, credits, token cache, or plan-history readiness changes.
  • Rebuilds open hosted submenu content from current store state instead of only resizing stale hosted views.
  • Requests an OpenAI dashboard refresh when the Codex parent menu opens and the chart history is stale.
  • Hydrates cached Codex token snapshots at startup so local token charts can appear without waiting for a fresh scanner pass.
  • Tracks plan utilization history revisions so menu readiness can respond to newly recorded samples.
  • Adds a single-instance launch guard so duplicate app launches do not race menu-bar state after packaging or login.
  • Makes Login Item registration idempotent to avoid unnecessary ServiceManagement churn.
  • Reduces avoidable UI cost by caching DateFormatter instances used in menu/chart rendering.
  • Avoids a keychain prompt lock inversion by serializing the alert on the main actor instead of holding the prompt lock before the main-queue hop.
  • Adds a cold-start validation harness that can install a one-shot LaunchAgent, disable/restore the normal Login Item, launch the packaged app after reboot, capture AX/screenshot artifacts, and verify first-open menu readiness.
  • Tightens package/run scripts so validation uses the freshly packaged app and detects duplicate launched instances.

Validation

Source and focused behavior checks:

  • swift test --filter StatusMenuHostedSubmenuRefreshTests
  • swift test --filter StatusMenuOpenRefreshTests
  • swift test --filter LaunchAtLoginManagerTests
  • make check
  • git diff --check
  • Focused AppKit smell scan on touched menu/keychain files

Runtime proof:

  • Built and validated .build/package/CodexBar.app
  • Verified code signing with codesign --verify --deep --strict --verbose=2 .build/package/CodexBar.app
  • Installed a one-shot cold-start LaunchAgent
  • Disabled the normal CodexBar Login Item before reboot
  • Rebooted the Mac and let the LaunchAgent perform the first-launch validation
  • Restored the normal CodexBar Login Item and removed the one-shot LaunchAgent afterward

Cold-start artifact:

  • /Users/amrmohamad/.codex/artifacts/CodexBar/cold-start-20260526-025603

Key artifact evidence:

  • post_boot_first_launch_candidate=true
  • existing_codexbar_process_count=0
  • codexbar_login_item_present=false at runner start
  • uptime_seconds=319
  • immediate parent menu status was complete
  • immediate parent AX included Buy Credits..., Cost | submenu=true, Usage Dashboard, Status Page, and Refresh
  • settled Cost submenu AX included Cost | submenu=true and Cost submenu item count=1
  • Scripts/prepare_next_boot_cold_start_validation.sh check-result passed against the captured artifact

Notes

The LaunchAgent’s original summary recorded an older checker failure because the validator used the stdlib PNG decoder fallback instead of Pillow metadata. The artifact itself is valid and immutable; after correcting the checker to accept image_decoder=stdlib_png, the same first-boot artifact validates successfully.

This PR intentionally does not change the visible UI design. The changes are lifecycle, data-readiness, validation, and performance hardening around the existing menu hierarchy.

Copy link
Copy Markdown
Author

@steipete I could not request you through GitHub's reviewer API from this account, so tagging you here for review.

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 26, 2026

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 8:48 PM ET / 00:48 UTC.

Summary
Adds cold-start menu readiness invalidation, hosted-submenu refresh, cached Codex token hydration, duplicate-launch and login-item hardening, keychain prompt serialization, validation scripts, and focused tests.

Reproducibility: yes. The PR provides a concrete reboot-based cold-start harness and copied artifact evidence for first-open menu readiness, though I did not rerun it in this read-only Linux review environment.

Review metrics: 3 noteworthy metrics.

  • Diff size: 30 files, +3370/-60. The PR spans runtime code, tests, and scripts, so maintainers should review it as a broad hardening change rather than a tiny menu patch.
  • Validation scripts: 3 added. The new cold-start harness touches LaunchAgent and login-item workflows that normal unit tests do not exercise.
  • Startup policy: 1 new single-instance guard. The added guard changes how duplicate app launches behave for existing users and developer/package flows.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Get maintainer acceptance or adjustment of the duplicate-launch policy before merge.

Risk before merge

  • The single-instance guard terminates the later process for the same bundle identifier by oldest PID, so a user or developer launching a freshly packaged app while another copy is running may keep the older instance unless maintainers accept that policy.
  • The new cold-start validation scripts intentionally manipulate a one-shot LaunchAgent and the CodexBar login item; this is appropriate for the proof harness, but it is a maintainer-visible local automation surface before merge.

Maintainer options:

  1. Accept oldest-instance wins
    Maintainers can accept the current guard as the intended compatibility tradeoff for preventing duplicate menu-bar instances.
  2. Narrow duplicate handling
    Before merge, change the guard to preserve launch intent for a different bundle path or freshly packaged app and add focused tests for that policy.
  3. Split duplicate-launch policy
    If cold-start menu readiness should land independently, remove the single-instance guard from this PR and track that compatibility decision separately.

Next step before merge
A human maintainer should decide whether the duplicate-launch compatibility tradeoff is acceptable; no narrow automated repair remains after the provider-identity fix.

Security
Cleared: The diff adds local validation and launch/login-item tooling but no external downloads, dependency changes, secret expansion, or supply-chain execution path beyond the explicit local harness.

Review details

Best possible solution:

Land after maintainer acceptance of the single-instance semantics and CI/checks stay green; otherwise narrow duplicate handling to same-bundle/path or split that policy into a follow-up.

Do we have a high-confidence way to reproduce the issue?

Yes. The PR provides a concrete reboot-based cold-start harness and copied artifact evidence for first-open menu readiness, though I did not rerun it in this read-only Linux review environment.

Is this the best way to solve the issue?

Unclear. The menu readiness and provider-identity repairs look like the narrow maintainable direction, but the oldest-PID duplicate-launch behavior remains a maintainer compatibility decision.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c644f6874e9a.

Label changes

Label changes:

  • add P2: This is a normal-priority app hardening PR with focused tests and proof, plus a bounded compatibility decision before merge.
  • add merge-risk: 🚨 compatibility: Merging the new single-instance guard could change existing launch/package workflows by terminating the later same-bundle-id process.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes after-fix reboot validation logs and artifact metadata showing the first-open parent menu and settled Cost submenu were ready; the artifact path is contributor-local but the copied live evidence is sufficient.
  • remove status: ⏳ waiting on author: Current PR status label is status: 👀 ready for maintainer look.
  • remove rating: 🦐 gold shrimp: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

  • P2: This is a normal-priority app hardening PR with focused tests and proof, plus a bounded compatibility decision before merge.
  • merge-risk: 🚨 compatibility: Merging the new single-instance guard could change existing launch/package workflows by terminating the later same-bundle-id process.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes after-fix reboot validation logs and artifact metadata showing the first-open parent menu and settled Cost submenu were ready; the artifact path is contributor-local but the copied live evidence is sufficient.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix reboot validation logs and artifact metadata showing the first-open parent menu and settled Cost submenu were ready; the artifact path is contributor-local but the copied live evidence is sufficient.
Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully and its guidance affected the review, especially the provider data siloing note, focused XCTest preference, and bundle-level validation expectation for UI/runtime behavior. (AGENTS.md:1, c644f6874e9a)
  • Provider identity fix verified in patch: The latest PR patch preserves provider.rawValue on Cost, OpenAI API usage, and Subscription Utilization hosted chart items, matching the previous ClawSweeper finding and the contributor's follow-up comment. (Sources/CodexBar/StatusItemController+HostedSubmenus.swift:254, 351dd198bd6a)
  • Single-instance compatibility surface: The PR adds SingleInstanceLaunchGuard and wires it from applicationWillFinishLaunching, terminating the later same-bundle-id process by oldest PID; that is the remaining maintainer-facing compatibility choice. (Sources/CodexBar/SingleInstanceLaunchGuard.swift:16, fddbd9144f47)
  • Runtime proof in PR body: The PR body reports a built, signed packaged app, one-shot LaunchAgent reboot validation, first-open parent menu status complete, settled Cost submenu readiness, and check-result passing against the captured cold-start artifact. (351dd198bd6a)
  • Current menu history provenance: git blame shows the current hosted submenu implementation in main is from the v0.29.0-era commit, making it the central baseline for the PR's refresh changes. (Sources/CodexBar/StatusItemController+HostedSubmenus.swift:22, f61469c3cde6)
  • Recent adjacent history: Recent main history includes a StatusItemController menu-bar identity fix and OpenAI dashboard background refresh work, both adjacent to the PR's menu and dashboard readiness changes. (Sources/CodexBar/StatusItemController.swift:1, 22cf7d4d2b76)

Likely related people:

  • steipete: git blame attributes the current hosted submenu, launch-at-login, keychain prompt, UsageStore, and script baselines sampled here to the v0.29.0-era commit authored by Peter Steinberger. (role: feature-history owner; confidence: high; commits: f61469c3cde6; files: Sources/CodexBar/StatusItemController+HostedSubmenus.swift, Sources/CodexBar/StatusItemController.swift, Sources/CodexBar/LaunchAtLoginManager.swift)
  • Perry Story: Recent main history changed UsageStore and OpenAI web background dashboard refresh behavior, which is adjacent to the PR's parent-menu stale dashboard refresh path. (role: recent adjacent contributor; confidence: medium; commits: c2b63a01bae4; files: Sources/CodexBar/UsageStore+OpenAIWeb.swift, Sources/CodexBar/UsageStore.swift)
  • lederniermagicien: Recent main history changed StatusItemController menu-bar item identity and tests, adjacent to the PR's menu lifecycle and duplicate-launch behavior. (role: recent adjacent contributor; confidence: medium; commits: 22cf7d4d2b76; files: Sources/CodexBar/StatusItemController.swift, Sources/CodexBar/MenuBarVisibilityWatcher.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ee755640c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

try register()
} else {
try service.unregister()
guard status() == .enabled else { return }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Unregister pending login item when disabling autostart

Treating setEnabled(false) as a no-op unless status() == .enabled leaves the login item registered when Service Management reports .requiresApproval. In that state the service is already registered but awaiting user approval, so users who toggle Launch at Login off will not actually remove it and can still see the pending/background item in System Settings on next login. This regression was introduced by the new status guard and should also call unregister() for .requiresApproval (or for any state except .notRegistered).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 31a37207: setEnabled(false) now calls unregister() for any ServiceManagement state except .notRegistered, so pending .requiresApproval login items are removed instead of left registered. Added a focused regression test for the .requiresApproval disable path and verified with swift test --filter LaunchAtLoginManagerTests plus git diff --check.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
Copy link
Copy Markdown
Author

Fixed the hosted submenu provider-identity regression in 63f36891.

What changed:

  • Preserved provider.rawValue on hydrated Cost, OpenAI API usage, and Subscription Utilization hosted chart menu items in both rendering and non-rendering branches.
  • Added a focused regression test that opens an already-hydrated provider-specific Cost submenu, refreshes it while open, and verifies it remains provider-tagged and hydrated instead of falling back to No data available.

Verification run:

  • swift test --filter StatusMenuHostedSubmenuRefreshTests
  • swift test --filter AppDelegateTests
  • swift test --filter LaunchAtLoginManagerTests
  • git diff --check
  • make check

Duplicate-launch note: I left SingleInstanceLaunchGuard unchanged in this patch. Its current behavior intentionally terminates later duplicate launches by bundle identifier to avoid the user-visible double menu-bar instance; dev/package flows should kill stale instances before relaunch. That compatibility tradeoff remains ready for maintainer acceptance or adjustment.

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 26, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@AmrMohamad AmrMohamad force-pushed the enhance/UI-performance-issues branch from 63f3689 to 351dd19 Compare May 26, 2026 00:43
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant