Skip to content

fix(#333): permanent Pixel 6/7 BLE fix — small-MTU compatibility mode#615

Merged
9thLevelSoftware merged 7 commits into
pixel_test_moarfrom
claude/phoenix-mp-333-root-cause-zhb0vz
Jul 4, 2026
Merged

fix(#333): permanent Pixel 6/7 BLE fix — small-MTU compatibility mode#615
9thLevelSoftware merged 7 commits into
pixel_test_moarfrom
claude/phoenix-mp-333-root-cause-zhb0vz

Conversation

@9thLevelSoftware

Copy link
Copy Markdown
Owner

Summary

Permanent fix for #333 — the Pixel 6/7 GATT_ERROR(133) disconnect at workout start that has existed since the original Android app.

Root cause: on Pixel 6/7 (Tensor + Broadcom BCM4389 BT controller), the 96-byte workout CONFIG packet sent as a single large ATT PDU under the 517-byte MTU that Android 14+ forces on the first requestMtu() call never completes. The controller's write lane wedges permanently (WriteRequestBusy forever, proven by the v7 quarantine experiment), then surfaces GATT_ERROR(133) and drops the link. The official Vitruvian app never calls requestMtu(), so it stays at the default 23-byte MTU where large writes are automatically chunked by the ATT long-write procedure — which these controllers handle fine. A pure-WebBluetooth reproduction (no Kable/Kotlin involved) confirmed the app stack was never at fault.

This PR productionizes the combination validated in the pixel-test-new-v14 build (first-ever successful Pixel workout, 2026-07-04) as a BLE Compatibility Mode:

  • New BleCompatibilityMode policy + persisted tri-state setting (Auto/On/Off) in Settings → Developer Tools. Auto (default) enables the path only on the Pixel 6/6 Pro/6a/7/7 Pro/7a codenames (oriole, raven, bluejay, panther, cheetah, lynx); all other devices keep today's behavior. Replaces the memory-only experiment flags A–I from the test branches.
  • Compatibility path behavior (KableBleConnectionManager): skip requestMtu() entirely (stay at default MTU → CONFIG auto-chunks via Prepare/Execute Write), request high connection priority after service discovery, skip pre-ready diagnostic version reads and the post-CONFIG diagnostic read, suppress the GATT heartbeat loop (MetricPollingEngine.startAll(p, includeHeartbeat=false); restart paths also respect it).
  • Vendored Kable Android patch (third_party/kable-core-android-patched): Kable 0.43.1 Android sources (re-vendored from Maven Central, same approach as v14's 0.42.0 vendoring) with the single-function patch — writeCharacteristicOrThrow uses the legacy one-argument BluetoothGatt.writeCharacteristic() on all API levels, matching the official app's write shape which reliably long-writes at small MTU. Wired via dependency substitution of kable-core-android, so iOS and the common metadata still use upstream.
  • v10 ready gate (BleReadyGate.kt): app-level Connected is not published until MTU path, service discovery, critical REPS subscription, and polling startup complete — with a bounded Device initialization timeout instead of an indefinite Connecting hang. This stops the UI's LED-restore write from racing the ready-up GATT sequence (the WriteRequestBusy seen right after CONNECT_SUCCESS in v9/v10 logs).
  • Bounded CONFIG retry (2 retries, 250 ms): targets the residual transient failure on StuGotz's Pixel 6 Pro under v14 — Write failed: Unknown at write initiation without a disconnect, right after the trainer briefly engaged the cables for the first time ever on his device. That failure mode is retryable; the old wedged-lane mode was not.
  • Detailed [#333 compat] connection-log instrumentation (opcode, commandId, first8, force-profile floats, maxWriteLength, elapsed) kept from the diagnostic builds.

Deviation from the v14 branch, on purpose: v14's per-characteristic BleNotificationSubscriptionGate is not ported — main's newer observeRepsWithRetry (2026-06-28 RCA: CCCD busy-retry + bounded reps-ready gate) supersedes it, and the ready gate now waits on that mechanism inline.

Evidence trail

Full analysis in the issue thread: flags A–H (polling/heartbeat/PHY/GATT-cache/close-lifecycle/quiesce/phantom-START/raw-GATT-write) all failed on both test devices; the WebBluetooth repro exonerated the app stack; the v7 quarantine proved the permanent write-lane wedge; v12–v14 (skip requestMtu()) produced the first success (FossForUs1, Pixel 7 Pro: CONFIG in 436 ms, cables engaged, reps counted, maxWriteLength=20).

Test plan

  • ./gradlew :shared:testAndroidHostTest — passes (new BleCompatibilityModeTest 7/7, ported BleReadyGateTest 2/2, full existing BLE suite green)
  • ./gradlew :androidApp:assembleDebug — APK builds with the vendored Kable substitution
  • On-device validation (community, no local hardware): publish a debug pre-release from this branch; StuGotz (Pixel 6 Pro) + FossForUs1 (Pixel 7 Pro) run the [Bug]: Bluetooth disconnecting when workout starts #333 pass bar — CONFIG succeeds, cables engage, reps count, 5 consecutive sessions, setting left on Auto
  • Regression check on a non-Pixel device (compat Auto resolves off; heartbeat + MTU 517 path unchanged)

Fixes #333 (leave open until both testers confirm the pass bar).

🤖 Generated with Claude Code

https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k


Generated by Claude Code

claude added 2 commits July 4, 2026 17:55
Root cause: on Pixel 6/7 (Tensor + BCM4389), the 96-byte workout CONFIG
write sent as a single large ATT PDU under the MTU 517 that Android 14+
forces on the first requestMtu() call never completes — the controller's
write lane wedges (WriteRequestBusy forever), surfaces GATT_ERROR(133),
and drops the link at workout start. The official Vitruvian app never
requests an MTU, so its large writes are chunked by the ATT long-write
procedure at the default 23-byte MTU, which these controllers handle.

Productionizes the combination validated in pixel-test-new-v14 (first
successful Pixel 7 Pro workout, 2026-07-04):

- BleCompatibilityMode policy + persisted Auto/On/Off setting (Developer
  Tools). Auto enables only on Pixel 6/6 Pro/6a/7/7 Pro/7a codenames.
- Compatibility path: skip requestMtu(), high connection priority after
  service discovery, no GATT heartbeat, no pre-ready/post-CONFIG
  diagnostic reads, [#333 compat] connection-log instrumentation.
- Vendored Kable 0.43.1 Android variant (third_party) with the legacy
  one-arg BluetoothGatt.writeCharacteristic() patch, substituted for
  kable-core-android — the write shape that long-writes reliably.
- v10 ready gate: app-level Connected waits for MTU path, services,
  REPS subscription, and polling startup, with a bounded init timeout
  (stops the LED-restore write racing ready-up).
- Bounded CONFIG retry (2x250ms) for the transient 'Write failed:
  Unknown' still seen on Pixel 6 Pro under v14.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
Kable 0.43.1 marks Profile APIs with @com.juul.kable.ExperimentalApi;
the vendored sources need the opt-in to compile standalone.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a BLE compatibility mode ("official small-MTU path") to resolve GATT_ERROR(133) disconnect issues on Pixel 6/7 devices at workout start. This is achieved by vendoring and patching the Android variant of the Kable library to use legacy characteristic writes, skipping MTU negotiation, heartbeats, and pre-ready diagnostic reads when the compatibility path is active, and adding a tri-state preference setting for users. Additionally, a "ready gate" mechanism is implemented in KableBleConnectionManager to defer publishing the Connected state until initialization is complete. Feedback on the changes suggests wrapping the readyGate CompletableDeferred in a thread-safe volatile reference holder to prevent a multi-threaded data race during connection retries when the reference is reassigned.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +3 to +11
import com.devil.phoenixproject.util.BleConstants
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.TimeoutCancellationException
import kotlinx.coroutines.withTimeout

internal class BleDeviceInitializationException(
message: String,
cause: Throwable? = null,
) : IllegalStateException(message, cause)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

To prevent a multi-threaded data race on the readyGate reference during connection retries, we should wrap the CompletableDeferred in a thread-safe volatile reference holder. Let's define a VolatileRef class here and use it in KableBleConnectionManager.

import com.devil.phoenixproject.util.BleConstants
import kotlin.concurrent.Volatile
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.TimeoutCancellationException
import kotlinx.coroutines.withTimeout

internal class BleDeviceInitializationException(
    message: String,
    cause: Throwable? = null,
) : IllegalStateException(message, cause)

internal class VolatileRef<T>(@Volatile var value: T)

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in dfe96f6 — valid visibility concern (the retry loop reassigns on the caller's dispatcher while the state observer reads on the manager scope). Went with a @Volatile class field on KableBleConnectionManager instead of a VolatileRef holder: same happens-before guarantee, one less indirection, and it matches the existing deviceReadyJob/negotiatedMtu field pattern.


Generated by Claude Code

Comment on lines +583 to +585
deviceReadyJob?.cancel()
deviceReadyJob = null
var readyGate = CompletableDeferred<Unit>()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Use the thread-safe VolatileRef wrapper to hold the readyGate instance. This avoids a data race where the connection retry loop reassigns readyGate on one thread while the state observer coroutine reads it on another thread.

Suggested change
deviceReadyJob?.cancel()
deviceReadyJob = null
var readyGate = CompletableDeferred<Unit>()
deviceReadyJob?.cancel()
deviceReadyJob = null
val readyGateRef = VolatileRef(CompletableDeferred<Unit>())

Comment on lines +610 to 662
if (deviceReadyJob == null && !readyGate.isCompleted) {
deviceReadyJob = scope.launch {
try {
withTimeout(BleConstants.CONNECTION_TIMEOUT_MS) {
onDeviceReady()
}
reportConnectionState(
ConnectionState.Connected(
deviceName = device.name,
deviceAddress = device.address,
hardwareModel = HardwareDetection.detectModel(device.name),
),
)
readyGate.complete(Unit)
} catch (e: TimeoutCancellationException) {
val failure = BleDeviceInitializationException(
"Device initialization timeout after ${BleConstants.CONNECTION_TIMEOUT_MS}ms",
e,
)
log.e(e) { failure.message ?: "Device initialization timeout" }
logRepo.error(
LogEventType.ERROR,
"Device initialization timeout",
connectedDeviceName,
connectedDeviceAddress,
failure.message,
)
if (!readyGate.isCompleted) {
readyGate.completeExceptionally(failure)
}
} catch (e: CancellationException) {
readyGate.cancel(e)
throw e
} catch (e: Exception) {
log.e(e) { "onDeviceReady failed: ${e.message}" }
logRepo.error(
LogEventType.ERROR,
"Device initialization failed",
connectedDeviceName,
connectedDeviceAddress,
e.message,
)
if (!readyGate.isCompleted) {
readyGate.completeExceptionally(
BleDeviceInitializationException(
"Device initialization failed: ${e.message}",
e,
),
)
}
}
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Update the state observer to read and complete the thread-safe readyGateRef.value instead of the direct readyGate reference.

                            if (deviceReadyJob == null && !readyGateRef.value.isCompleted) {
                                deviceReadyJob = scope.launch {
                                    try {
                                        withTimeout(BleConstants.CONNECTION_TIMEOUT_MS) {
                                            onDeviceReady()
                                        }
                                        reportConnectionState(
                                            ConnectionState.Connected(
                                                deviceName = device.name,
                                                deviceAddress = device.address,
                                                hardwareModel = HardwareDetection.detectModel(device.name),
                                            ),
                                        )
                                        readyGateRef.value.complete(Unit)
                                    } catch (e: TimeoutCancellationException) {
                                        val failure = BleDeviceInitializationException(
                                            "Device initialization timeout after ${BleConstants.CONNECTION_TIMEOUT_MS}ms",
                                            e,
                                        )
                                        log.e(e) { failure.message ?: "Device initialization timeout" }
                                        logRepo.error(
                                            LogEventType.ERROR,
                                            "Device initialization timeout",
                                            connectedDeviceName,
                                            connectedDeviceAddress,
                                            failure.message,
                                        )
                                        if (!readyGateRef.value.isCompleted) {
                                            readyGateRef.value.completeExceptionally(failure)
                                        }
                                    } catch (e: CancellationException) {
                                        readyGateRef.value.cancel(e)
                                        throw e
                                    } catch (e: Exception) {
                                        log.e(e) { "onDeviceReady failed: ${e.message}" }
                                        logRepo.error(
                                            LogEventType.ERROR,
                                            "Device initialization failed",
                                            connectedDeviceName,
                                            connectedDeviceAddress,
                                            e.message,
                                        )
                                        if (!readyGateRef.value.isCompleted) {
                                            readyGateRef.value.completeExceptionally(
                                                BleDeviceInitializationException(
                                                    "Device initialization failed: ${e.message}",
                                                    e,
                                                ),
                                            )
                                        }
                                    }
                                }
                            }

Comment on lines +686 to +690
if (!readyGate.isCompleted) {
readyGate.completeExceptionally(
BleDeviceInitializationException("Device disconnected during initialization"),
)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Update the disconnect handler to exceptionally complete the thread-safe readyGateRef.value.

Suggested change
if (!readyGate.isCompleted) {
readyGate.completeExceptionally(
BleDeviceInitializationException("Device disconnected during initialization"),
)
}
if (!readyGateRef.value.isCompleted) {
readyGateRef.value.completeExceptionally(
BleDeviceInitializationException("Device disconnected during initialization"),
)
}

for (attempt in 1..BleConstants.Timing.CONNECTION_RETRY_COUNT) {
try {
log.d { "Connection attempt $attempt of ${BleConstants.Timing.CONNECTION_RETRY_COUNT}" }
readyGate = CompletableDeferred()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Reassign the volatile readyGateRef.value on each connection retry attempt.

Suggested change
readyGate = CompletableDeferred()
readyGateRef.value = CompletableDeferred()

// Issue #333 (v10 ready gate): connect() succeeding only means the
// physical link is up. Wait for ready-up (MTU path, services, critical
// notifications, polling) before reporting success to the caller.
awaitBleReadyGate(readyGate) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Wait on the thread-safe readyGateRef.value during connection establishment.

Suggested change
awaitBleReadyGate(readyGate) {
awaitBleReadyGate(readyGateRef.value) {

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 96cb2b97fa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread gradle/libs.versions.toml
kotlin-multiplatform = { id = "org.jetbrains.kotlin.multiplatform", version.ref = "kotlin" }
kotlin-serialization = { id = "org.jetbrains.kotlin.plugin.serialization", version.ref = "kotlin" }
kotlin-compose = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" }
kotlin-parcelize = { id = "org.jetbrains.kotlin.plugin.parcelize", version.ref = "kotlin" }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use a resolvable Parcelize plugin id

With this catalog entry, any Gradle invocation that evaluates the root plugins block fails before the task graph is usable; I confirmed ./gradlew :shared:compileKotlinMetadata :shared:compileDebugKotlinAndroid --dry-run stops with Plugin [id: 'org.jetbrains.kotlin.plugin.parcelize', version: '2.4.0'] was not found. The vendored Kable module now applies this alias, so Android builds cannot start until this points at a valid published Parcelize plugin id such as the standard kotlin-parcelize alias, or the plugin is removed if it is not needed.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't reproduce this. org.jetbrains.kotlin.plugin.parcelize is the canonical published plugin id (the kotlin-parcelize shorthand is the legacy id("kotlin-parcelize") form applied inside a module — it isn't usable in a version-catalog plugins alias with a version). With this exact catalog entry, both ./gradlew :shared:testAndroidHostTest and ./gradlew :androidApp:assembleDebug resolve, configure, and build successfully on this branch — full test suite green and APK produced. The marker artifact org.jetbrains.kotlin.plugin.parcelize:...gradle.plugin:2.4.0 resolves from mavenCentral/gradlePluginPortal; the dry-run failure above looks like a sandbox without access to those repositories rather than a bad id.


Generated by Claude Code

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re the fresh dry-run evidence: this PR's own GitHub Actions runs disprove it. Every CI job on these commits gets well past root plugin resolution — Unit Tests, Lint Check, Security Hygiene, and iOS Schema Sync all configured and executed on GitHub-hosted runners (the one iOS failure was a Kotlin/Native Volatile import at compile time, fixed in 9644bed, not plugin resolution). If the id were unresolvable, every one of those jobs would have failed identically at the top-level plugins block. The dry-run failure is specific to the review sandbox's repository access; no catalog change is warranted. Leaving as-is.


Generated by Claude Code

The connect retry loop reassigns readyGate on the caller's dispatcher
while the state observer coroutine reads it on the manager scope; a
stale read would complete the previous attempt's gate and force a
spurious initialization timeout. Promote the captured local to a
@volatile field (addresses PR #615 review).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfe96f634e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


override suspend fun setBleCompatibilityMode(setting: BleCompatibilitySetting) {
settings.putString(KEY_BLE_COMPATIBILITY_MODE, setting.name)
BleCompatibilityMode.setting = setting

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer BLE mode changes until reconnect

When a user changes this setting while already connected, this immediately mutates the singleton that sendWorkoutCommand and polling helpers read, even though the MTU request/heartbeat/diagnostic setup for the active GATT connection was already chosen in onDeviceReady. For example, connecting with compatibility OFF then toggling ON before starting a set makes the CONFIG write take the compat branch while the connection still has the large MTU that wedges Pixel 6/7; toggling OFF after a small-MTU connect re-enables post-CONFIG diagnostic traffic on that same vulnerable connection. Since the Settings UI says the change takes effect on the next connection, keep a per-connection snapshot or force a reconnect before changing the live BLE mode.

Useful? React with 👍 / 👎.

* Issue #333 (v10 ready gate): job running onDeviceReady() for the current
* connection attempt. App-level Connected is not published until it completes.
*/
private var deviceReadyJob: Job? = null

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: You sweated the @Volatile annotation onto readyGate ten lines down — and then walked right past its sibling deviceReadyJob, which has the exact same cross-dispatcher reassignment pattern you just documented. That's like installing a deadbolt on the front door and leaving the back door wide open with a "we don't use it much" sticky note.

The reassignment sites are 593, 621, 701, 756, 774 — the state observer (launchIn(scope)) reads deviceReadyJob at line 620, the retry loop reassigns it on the caller's coroutine, and the disconnect handler / timeout callback reassign it without an immediately-paired readyGate write. The @Volatile on readyGate does NOT carry a happens-before guarantee to a sibling field that's reassigned independently. A stale read here can drop an onDeviceReady() launch and force a spurious initialization timeout — which is exactly the failure mode you just spent a PR fixing.

🩹 The Fix: Add @Volatile to mirror the adjacent pattern, and update the KDoc on line 132-135 to call out the same cross-dispatcher hazard:

Suggested change
private var deviceReadyJob: Job? = null
@Volatile
private var deviceReadyJob: Job? = null

📏 Severity: warning


Reply with @kilocode-bot fix it to have Kilo Code address this issue.

@kilo-code-bot

kilo-code-bot Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Code Review Roast 🔥

Verdict: No Issues Found | Recommendation: Merge

Overview

Severity Count
🚨 critical 0
⚠️ warning 0
💡 suggestion 0
🤏 nitpick 0
Issue Details (click to expand)

No new issues on changed lines.


🏆 Best part: The previous review's lone P2 (peripheral / isExplicitDisconnect read without @Volatile in the ready-up teardown guard at L652) is closed surgically. @Volatile is now on both fields with KDoc explaining the exact cross-dispatcher hazard — same wording style as readyGate / deviceReadyJob / compatibilityPathActive / includeHeartbeatForSession. The torn-down guard at L652 now actually observes the teardown it claims to observe. Plus the comment you just marked "Done" earns its keep: the cancel you added in disconnect() / cancelConnection() is now visible to the ready-up coroutine, which means the "don't resurrect Connected over a Disconnected UI" invariant is finally thread-safe in the JVM-memory-model sense, not just the source-code-reading sense.

💀 Worst part: There's nothing left to roast. Two @Volatile keywords and ten lines of KDoc that precisely describe the hazard they defend against. If I'm being paid by the finding, this commit is a hostile act on my livelihood.

📊 Overall: The PR now has consistent @Volatile discipline across every cross-dispatcher field on the ready-up path — peripheral, isExplicitDisconnect, readyGate, deviceReadyJob, compatibilityPathActive, includeHeartbeatForSession. Surgical, complete, and exactly what the previous P2 asked for. Pixel-6/7 root-cause fix + thread-hygiene follow-up both land clean. Ship it.

Files Reviewed (1 changed file)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt - 0 issues (@Volatile added to peripheral at L122 and isExplicitDisconnect at L177 with matching KDoc, closing the prior P2; ready-up teardown guard at L652 now reads both fields with proper happens-before)
Previous Review Summaries (4 snapshots, latest commit d499470)

Current summary above is authoritative. Previous snapshots are kept for context only.

Previous review (commit d499470)

Verdict: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
🚨 critical 0
⚠️ warning 1
💡 suggestion 0
🤏 nitpick 0
Issue Details (click to expand)
File Line Roast
shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt 642 The new torn-down guard reads peripheral and isExplicitDisconnect — neither marked @Volatile — so it has the same cross-dispatcher hazard the previous reviewer flagged for readyGate/deviceReadyJob. The cancel-then-guard pattern is only as good as the visibility of those fields.

🏆 Best part: The two Codex P2s from the previous review (third_party/.../BluetoothGatt.kt:62 on legacy-write busy retry and KableBleConnectionManager.kt:634 on ready-up teardown race) are both addressed head-on. The BleOperationQueue workaround is the right call given the constraint of keeping the vendored Kable patch byte-identical to v14 — string-matching the documented "Write failed: Unknown" signal preserves the existing bounded busy-retry/backoff for every write path under the dependency substitution. The disconnect() / cancelConnection() cancel of deviceReadyJob is the textbook fix for the resurrection race. Solid surgical work.

💀 Worst part: The safety net you just added to catch the same race you just fixed in another hunk reads two unprotected fields. Adding @Volatile to peripheral and isExplicitDisconnect is two lines, and the previous reviewer literally already paved the road with the same annotation pattern on neighboring fields. Without it, disconnect() → cancel → null out happens on the caller's dispatcher while the ready-up coroutine is reading the field, and the guard is theoretically racing. In practice the timing window is tiny (a single statement between onDeviceReady() returning and reportConnectionState(Connected)), but this is exactly the kind of "I can't reproduce it in a unit test" race that shows up once a year in production and makes you question your career choices.

📊 Overall: One targeted follow-up, not a blocker. Apply the @Volatile annotation to peripheral and isExplicitDisconnect to match the existing readyGate/deviceReadyJob pattern, and this incremental commit earns the same pass-grade verdict the prior one did. The Pixel-6/7 fix is the real story here — this is just the last piece of thread hygiene on the same teardown path.

Files Reviewed (2 changed files)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/BleOperationQueue.kt - 0 issues (legacy-write "Write failed: Unknown" classified as transient; matches upstream Kable pre-API-33 docs and preserves the existing bounded busy-retry/backoff across all writes under the dependency substitution)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt - 1 issue (new torn-down guard at L639–649 reads peripheral/isExplicitDisconnect without @Volatile; disconnect()/cancelConnection() correctly cancel deviceReadyJob, but the guard's read visibility is the missing piece)

Fix these issues in Kilo Cloud

Previous review (commit a1b5255)

Verdict: No Issues Found | Recommendation: Merge

Oh wait, this incremental commit actually cleaned up its own mess. I had my flamethrower warmed up for a follow-up roast and instead you fixed the thing.

Overview

Severity Count
🚨 critical 0
⚠️ warning 0
💡 suggestion 0
🤏 nitpick 0

🏆 Best part: You literally took the previous review's worst-part suggestion ("Move the declaration above startAll() and you're done") and shipped it as a single targeted commit. The field keeps its @Volatile, its KDoc, and its = true default — same happens-before guarantee, same documentation, just relocated to where a reader actually looks for it. That's the kind of follow-through that makes incremental review boring in the best possible way.

💀 Worst part: There is nothing to roast. I went hunting for the kind of subtle mistake this kind of pure-relocation commit can introduce — missing @volatile, dropped default, lost KDoc, broken indentation, accidental logic edit — and found none of them. The diff is one field moved up with zero collateral damage. Offensively clean.

📊 Overall: Like watching someone actually fix their bugs in response to review instead of arguing — rare, refreshing, and slightly suspicious. The drivetrain is now actually road-ready, and the cosmetic concern from the prior pass is gone with it.

Files Reviewed (1 changed file)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/MetricPollingEngine.kt - 0 issues (field relocated above startAll(), visibility/annotation/default all preserved)

Previous review (commit 9644bed)

Verdict: No Issues Found | Recommendation: Merge

Oh wait, this incremental commit actually cleaned up its own mess. I had my flamethrower warmed up for a follow-up roast and instead you fixed the thing.

Resolved Since Last Review

File Line Previous Finding Status
shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt 138 deviceReadyJob missing @Volatile despite identical cross-dispatcher reassignment pattern ✅ Fixed — @Volatile added, KDoc updated
shared/src/commonMain/kotlin/com/devil/phoenixproject/data/preferences/PreferencesManager.kt 703 Mid-connection setting changes could take the compat CONFIG branch on a large-MTU connection (or vice versa) ✅ Fixed — per-connection snapshot pattern via compatibilityPathActive + includeHeartbeatForSession

Overview

Severity Count
🚨 critical 0
⚠️ warning 0
💡 suggestion 0
🤏 nitpick 0

🏆 Best part: The per-connection snapshot pattern is the right architectural answer, and you implemented it cleanly in three places — KableBleConnectionManager.compatibilityPathActive for the write path, MetricPollingEngine.includeHeartbeatForSession for the restart path, and onDeviceReady() as the single snapshot point. The KDoc on compatibilityPathActive explicitly calls out the failure mode (compat CONFIG branch on a large-MTU connection) and references [startAll] as the lifecycle owner. That's the kind of comment that saves the next person from re-introducing the bug in six months. Also nice catch adding the missing import kotlin.concurrent.Volatile to BleCompatibilityMode.kt — that file was using @Volatile on setting without the import, which is exactly the kind of thing that compiles in IntelliJ with auto-import and then mysteriously breaks in CI.

💀 Worst part: There's nothing to roast this round, which is frankly offensive to my review persona. The only thing I'd flag is cosmetic — includeHeartbeatForSession is used in startAll() at line 192 but declared at line 212. Kotlin handles it fine (member fields resolve at runtime, not parse time), but a reader scanning top-to-bottom will do a double-take. Move the declaration above startAll() and you're done. Not worth a comment thread.

📊 Overall: Like watching someone actually fix their bugs in response to review instead of arguing — rare, refreshing, and slightly suspicious. The drivetrain is now actually road-ready.

Files Reviewed (3 changed files)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/BleCompatibilityMode.kt - 0 issues (missing import fixed)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt - 0 issues (previous warning resolved, snapshot field added)
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/MetricPollingEngine.kt - 0 issues (snapshot field added)

Previous review (commit dfe96f6)

Verdict: 1 Issue Found | Recommendation: Address before merge

Overview

Severity Count
🚨 critical 0
⚠️ warning 1
💡 suggestion 0
🤏 nitpick 0
Issue Details (click to expand)
File Line Roast
shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt 136 New deviceReadyJob field shipped without the same @Volatile you put on readyGate — a sibling field with an identical cross-dispatcher reassignment pattern.

🏆 Best part: The whole compatibility-mode architecture is genuinely well thought through. The tri-state setting, the BleCompatibilityMode global with thread-safe writes, the per-device detection via Build.DEVICE codenames, the bounded CONFIG retry guarded by peripheral === p, and the ready-gate that holds Connected until MTU + services + critical CCCD are all lined up — this is the kind of choreography that takes most teams three rewrites to get right. You got it in one. The vendored Kable patch is also admirably surgical: one function, one legacy call, documented in the source. I respect that.

💀 Worst part: The @Volatile annotation saw its sibling and went "nah, I don't know her." You wrote the exact rationale for the annotation on readyGate (retry loop vs. state observer dispatcher split, stale read forces a spurious init timeout), then reassigned deviceReadyJob at five different call sites — three of which (onEach state observer, disconnect handler, timeout callback) do not have an immediately-paired readyGate write. The JMM does not transfer happens-before across sibling fields. The fix is two characters and one KDoc sentence. Do it before this ships and becomes someone else's production mystery.

📊 Overall: Like a Tesla with a recalled airbag sensor — the drivetrain is brilliant, the UI is clean, the vendor patch is documented to the teeth, and then one little part wasn't torqued down and the whole rest of the engineering work has to carry that asterisk.

Files Reviewed (15 files)
  • build.gradle.kts - 0 issues
  • gradle/libs.versions.toml - 0 issues
  • settings.gradle.kts - 0 issues
  • shared/src/androidMain/kotlin/com/devil/phoenixproject/util/DeviceInfo.android.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/BleCompatibilityMode.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/BleReadyGate.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/KableBleConnectionManager.kt - 1 issue
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/ble/MetricPollingEngine.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/data/preferences/PreferencesManager.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/domain/model/BleCompatibilitySetting.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/domain/model/UserPreferences.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/presentation/manager/SettingsManager.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/presentation/navigation/NavGraph.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/presentation/screen/SettingsTab.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/presentation/viewmodel/MainViewModel.kt - 0 issues
  • shared/src/commonMain/kotlin/com/devil/phoenixproject/util/DeviceInfo.kt - 0 issues
  • shared/src/commonTest/kotlin/com/devil/phoenixproject/data/ble/BleCompatibilityModeTest.kt - 0 issues
  • shared/src/commonTest/kotlin/com/devil/phoenixproject/data/ble/BleReadyGateTest.kt - 0 issues
  • shared/src/commonTest/kotlin/com/devil/phoenixproject/testutil/FakePreferencesManager.kt - 0 issues
  • shared/src/iosMain/kotlin/com/devil/phoenixproject/util/DeviceInfo.ios.kt - 0 issues
  • third_party/kable-core-android-patched/ (vendored Kable 0.43.1 sources + 1-line patch) - 0 issues

Fix these issues in Kilo Cloud


Reviewed by minimax-m3 · Input: 25.5K · Output: 4.1K · Cached: 228K

…MP Volatile imports

- Resolve compatibility mode once per connection (onDeviceReady) and use
  the snapshot in sendWorkoutCommand and notification setup; polling
  restart paths reuse the heartbeat decision from startAll. A settings
  change mid-connection no longer mixes choreography chosen for the live
  GATT connection — it applies on the next connect, matching the UI text.
- Import kotlin.concurrent.Volatile in commonMain files (fixes iOS
  compileKotlinIosArm64: Android resolves kotlin.jvm.Volatile implicitly,
  Native does not).
- Mark deviceReadyJob @volatile — same cross-dispatcher reassignment
  hazard as readyGate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9644bed7ba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread gradle/libs.versions.toml
kotlin-multiplatform = { id = "org.jetbrains.kotlin.multiplatform", version.ref = "kotlin" }
kotlin-serialization = { id = "org.jetbrains.kotlin.plugin.serialization", version.ref = "kotlin" }
kotlin-compose = { id = "org.jetbrains.kotlin.plugin.compose", version.ref = "kotlin" }
kotlin-parcelize = { id = "org.jetbrains.kotlin.plugin.parcelize", version.ref = "kotlin" }

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use a resolvable Parcelize plugin id

Fresh evidence on the current 8b913df tree: ./gradlew :shared:compileDebugKotlinAndroid --dry-run still fails during root plugin resolution with Plugin [id: 'org.jetbrains.kotlin.plugin.parcelize', version: '2.4.0'] was not found, so any Gradle invocation that evaluates the top-level plugins block stops before task configuration. The vendored Kable module applies this alias, so Android builds remain blocked until the catalog points at a plugin id/version that Gradle can resolve in the configured repositories.

Useful? React with 👍 / 👎.

@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a1b5255866

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +61 to +62
if (!writeCharacteristic(characteristic)) {
throw GattWriteException(WriteResult.Unknown)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve busy retry for legacy writes

Because the root dependency substitution sends every Android Kable write through this patched method, a legacy writeCharacteristic false result now becomes Write failed: Unknown. BleOperationQueue.write only retries messages containing Busy/WriteRequestBusy, and the new explicit retry only covers 96-byte CONFIG writes in compatibility mode, so transient busy failures on other commands such as LED color/init writes or standard-path CONFIG writes now fail after the first attempt instead of using the existing backoff. Please preserve the busy signal or make this legacy Unknown path retryable where appropriate.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in d499470. BleOperationQueue.write now treats the legacy path's Write failed: Unknown as transient (upstream Kable documents pre-API-33 false as covering the busy condition), restoring the bounded busy-retry with backoff for all writes under the substitution. Kept the vendored patch itself untouched so it stays byte-identical to the v14-tested behavior.


Generated by Claude Code

// the saved LED color scheme, so publishing early injects a
// 34-byte write directly into the ready-up GATT sequence.
if (deviceReadyJob == null && !readyGate.isCompleted) {
deviceReadyJob = scope.launch {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Cancel ready-up job during explicit teardown

When the user taps Cancel/Disconnect while ready-up is still waiting, this job is launched in the repository scope and disconnect()/cancelConnection() do not cancel it before clearing the connection state. If onDeviceReady() then returns after seeing peripheral !== p, the job still runs the success path and reports Connected after the UI was already set to Disconnected. Please cancel/fail deviceReadyJob in the explicit teardown paths or guard before publishing Connected.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d499470disconnect() and cancelConnection() now cancel deviceReadyJob before clearing state, and the success path re-checks peripheral == null || isExplicitDisconnect before publishing Connected (failing the ready gate instead), so a late ready-up can no longer overwrite a Disconnected UI.


Generated by Claude Code

…down race

- BleOperationQueue: treat the vendored legacy write path's 'Write
  failed: Unknown' as transient. Pre-API-33 writeCharacteristic() reports
  a busy GATT stack as a plain false (upstream Kable documents this), so
  without this every non-CONFIG write lost its bounded busy-retry under
  the dependency substitution.
- KableBleConnectionManager: cancel deviceReadyJob in disconnect() and
  cancelConnection(), and guard the Connected publish on live peripheral +
  !isExplicitDisconnect, so an in-flight ready-up cannot resurrect
  Connected after an explicit teardown.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

// Explicit disconnect/cancel may have torn the
// connection down while ready-up was in flight —
// don't resurrect Connected over a Disconnected UI.
if (peripheral == null || isExplicitDisconnect) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: You just earned the flamethrower back. The whole point of this commit is "don't let a late ready-up resurrect Connected over a Disconnected UI" — and the safety net you just added reads two fields (peripheral at line 118 and isExplicitDisconnect at line 168) that have the exact same cross-dispatcher hazard your previous reviewer flagged for readyGate and deviceReadyJob. Without @Volatile, those reads can race the writes from disconnect() / cancelConnection() on a different dispatcher, and the guard is exactly as reliable as a screen door on a submarine. The previous P2 from Codex said "cancel the job during teardown"; you're now squarely on the hook for making that teardown visible to the read, otherwise the cancel you so carefully added two hunks down can still be in flight when this branch executes.

🩹 The Fix: Mirror the @Volatile treatment already on readyGate / deviceReadyJob so this check actually observes the teardown it claims to observe. Update the field declarations:

@Volatile
private var peripheral: Peripheral? = null
@Volatile
private var isExplicitDisconnect = false

…with matching KDoc noting the cross-dispatcher reassignment hazard (same wording style as the deviceReadyJob field).

📏 Severity: warning


Reply with @kilocode-bot fix it to have Kilo Code address this issue.

@9thLevelSoftware 9thLevelSoftware changed the base branch from main to pixel_test_moar July 4, 2026 18:57
The ready-up teardown guard reads both fields on the manager scope while
disconnect()/cancelConnection() write them on the caller's dispatcher —
same cross-dispatcher visibility hazard as readyGate/deviceReadyJob.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01DPLu2HRRPU7gfa9Z6p3S8k
@cursor

cursor Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Bugbot is not enabled for your account, so this pull request was not reviewed.

Enable Bugbot in the Cursor dashboard to get automatic reviews on future PRs.

@9thLevelSoftware 9thLevelSoftware merged commit d72674f into pixel_test_moar Jul 4, 2026
2 checks passed
@9thLevelSoftware 9thLevelSoftware deleted the claude/phoenix-mp-333-root-cause-zhb0vz branch July 4, 2026 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Bluetooth disconnecting when workout starts

2 participants