Fix macOS silent microphone capture handling#597
Conversation
Greptile SummaryThis PR adds detection and clear error surfacing for macOS microphone permission issues, where silent all-zero audio would previously produce an empty transcript with no user-visible explanation. It also adds upfront permission gating in the Swift menu-bar layer so the system dialog fires before recording starts.
Confidence Score: 4/5Safe to merge; all changes are additive and the new error path is well-isolated from the normal recording flow. The core detection logic (all-zero PCM + empty transcript → SilentAudioCaptureError) is sound and thoroughly tested. The Swift permission gating is injected via a protocol so it does not change existing behaviour when permissions are already granted. The only items worth revisiting are a dead None-guard on send_task, a class-raise instead of an instance-raise in the helper, a two-call TOCTOU in requestAccessIfNeeded, and a missing notDetermined test case — all minor. agent_cli/services/asr.py (redundant None guard and class-raise style), macos/AgentCLI/Sources/AgentCLI/MicrophonePermission.swift (double currentStatus() call) Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant U as User
participant ACR as AgentCommandRunner (Swift)
participant MPC as MicrophonePermissionController
participant PY as transcribe._async_main (Python)
participant ASR as asr._transcribe_live_audio_wyoming
participant SA as _send_audio
participant WY as Wyoming ASR Server
U->>ACR: hold/toggle recording
ACR->>MPC: currentStatus()
MPC-->>ACR: .notDetermined / .denied / .authorized
alt not authorized
ACR->>MPC: requestAccessIfNeeded(completion)
MPC-->>U: System permission dialog (notDetermined only)
ACR-->>U: statusMessage + notification
ACR-->>U: return (recording blocked)
else authorized
ACR->>PY: run transcribe command
PY->>ASR: live_transcriber(...)
ASR->>SA: _send_audio(stream, ...)
SA->>SA: capture_stats.observe(chunk) per chunk
SA->>WY: AudioChunk events
SA-->>ASR: AudioCaptureStats(byte_count, peak_sample)
ASR->>WY: AudioStop
WY-->>ASR: transcript string
alt "peak_sample==0 AND transcript empty"
ASR-->>PY: raise SilentAudioCaptureError
PY-->>PY: "catch => TranscriptResult(error=message)"
PY-->>U: print error + exit(1)
else normal
ASR-->>PY: transcript string
PY-->>U: transcript output
end
end
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant U as User
participant ACR as AgentCommandRunner (Swift)
participant MPC as MicrophonePermissionController
participant PY as transcribe._async_main (Python)
participant ASR as asr._transcribe_live_audio_wyoming
participant SA as _send_audio
participant WY as Wyoming ASR Server
U->>ACR: hold/toggle recording
ACR->>MPC: currentStatus()
MPC-->>ACR: .notDetermined / .denied / .authorized
alt not authorized
ACR->>MPC: requestAccessIfNeeded(completion)
MPC-->>U: System permission dialog (notDetermined only)
ACR-->>U: statusMessage + notification
ACR-->>U: return (recording blocked)
else authorized
ACR->>PY: run transcribe command
PY->>ASR: live_transcriber(...)
ASR->>SA: _send_audio(stream, ...)
SA->>SA: capture_stats.observe(chunk) per chunk
SA->>WY: AudioChunk events
SA-->>ASR: AudioCaptureStats(byte_count, peak_sample)
ASR->>WY: AudioStop
WY-->>ASR: transcript string
alt "peak_sample==0 AND transcript empty"
ASR-->>PY: raise SilentAudioCaptureError
PY-->>PY: "catch => TranscriptResult(error=message)"
PY-->>U: print error + exit(1)
else normal
ASR-->>PY: transcript string
PY-->>U: transcript output
end
end
Reviews (1): Last reviewed commit: "Fix macOS silent microphone capture hand..." | Re-trigger Greptile |
| def _raise_silent_audio_capture_error() -> NoReturn: | ||
| raise SilentAudioCaptureError |
There was a problem hiding this comment.
_raise_silent_audio_capture_error raises the class object rather than an instance (raise SilentAudioCaptureError vs raise SilentAudioCaptureError()). Python 3 instantiates the class with no arguments when it sees raise ExceptionClass, so the DEFAULT_MESSAGE is used correctly and there is no runtime bug. However, raising a class directly is unconventional, surprises some static analysers, and adds an unnecessary call frame to every traceback. Since the function has no other callers and provides no testability benefit, it can be collapsed to a direct raise at the call site.
| def _raise_silent_audio_capture_error() -> NoReturn: | |
| raise SilentAudioCaptureError | |
| def _raise_silent_audio_capture_error() -> NoReturn: | |
| raise SilentAudioCaptureError() |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| return_when=asyncio.ALL_COMPLETED, | ||
| ) | ||
| result = recv_task.result() | ||
| capture_stats = send_task.result() if send_task is not None else None |
There was a problem hiding this comment.
send_task is never None here. manage_send_receive_tasks is typed as returning tuple[asyncio.Task, asyncio.Task] and always calls asyncio.create_task(send_task_coro) unconditionally, so the guard if send_task is not None is dead code. It could mislead a future reader into thinking there is a code path where send_task can be absent.
| capture_stats = send_task.result() if send_task is not None else None | |
| capture_stats = send_task.result() |
| func requestAccessIfNeeded(completion: @escaping (Bool) -> Void) { | ||
| guard currentStatus() == .notDetermined else { | ||
| completion(currentStatus() == .authorized) | ||
| return | ||
| } | ||
| AVCaptureDevice.requestAccess(for: .audio, completionHandler: completion) | ||
| } |
There was a problem hiding this comment.
currentStatus() is called twice on the same execution path: once for the guard and once inside the else branch for the completion value. Between the two calls the OS authorization status could change (e.g., user grants access in System Settings while the app is on the guard check), causing the completion to be called with a stale value. Caching the result in a local constant eliminates the TOCTOU window.
| func requestAccessIfNeeded(completion: @escaping (Bool) -> Void) { | |
| guard currentStatus() == .notDetermined else { | |
| completion(currentStatus() == .authorized) | |
| return | |
| } | |
| AVCaptureDevice.requestAccess(for: .audio, completionHandler: completion) | |
| } | |
| func requestAccessIfNeeded(completion: @escaping (Bool) -> Void) { | |
| let status = currentStatus() | |
| guard status == .notDetermined else { | |
| completion(status == .authorized) | |
| return | |
| } | |
| AVCaptureDevice.requestAccess(for: .audio, completionHandler: completion) | |
| } |
| func testAuthorizedPermissionAllowsRecording() { | ||
| let presentation = MicrophonePermissionPresentation(status: .authorized) | ||
|
|
||
| XCTAssertTrue(presentation.canRecord) | ||
| XCTAssertEqual(presentation.statusMessage, nil) | ||
| } | ||
|
|
||
| func testDeniedPermissionExplainsSettingsFix() { | ||
| let presentation = MicrophonePermissionPresentation(status: .denied) | ||
|
|
||
| XCTAssertFalse(presentation.canRecord) | ||
| XCTAssertEqual( | ||
| presentation.statusMessage, | ||
| "Allow Microphone permission for Agent CLI in System Settings." | ||
| ) | ||
| } | ||
| } |
There was a problem hiding this comment.
Missing
notDetermined presentation test
The notDetermined state has a distinct canRecord = false and a different statusMessage ("Approve Microphone permission…") compared to the denied branch. Adding a test case would guard against the two non-authorized messages being accidentally swapped or the canRecord property being set to true for notDetermined in a future refactor.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
Test Plan