feat: Add quick voice notes feature (Issue #5961) #5964

Closed
sungdark wants to merge 6 commits into BasedHardware:main from sungdark:feature/5961-voice-notes

@sungdark sungdark commented Mar 24, 2026

See existing PR description

OpenClaw Agent and others added 3 commits March 24, 2026 00:53
…large payloads (BasedHardware#5941)

- Add paths_timeout support to TimeoutMiddleware (checked before method-level timeouts)
- Configure /v1/sync-local-files with 300s timeout (up from default 120s)
- Fixes 504 Gateway Timeout when syncing large local files with many segments

Fix notification tap not navigating when another screen is open:

Desktop (macOS):
- Added navigateFromNotificationTap() to handle notification tap navigation
- When a notification is tapped (non-reset), post navigateToChat notification
- This ensures notification tap navigates to chat regardless of current screen

Web:
- Fixed firebase-messaging-sw.js notificationclick handler
- Added proper Promise handling for client.navigate()
- Added .catch() to handle navigation failures and fall back to opening new window
- Previously, navigation could fail silently if client.navigate() threw

Mobile (Flutter):
- Changed from pushReplacement to pushAndRemoveUntil in _handleAppLinkOrDeepLink
- Added addPostFrameCallback to ensure navigator is ready before navigation
- Added retry logic with 500ms delay if navigator is not yet initialized
- This fixes the issue where notification tap doesn't navigate when
  another screen (modal/overlay) is open

- Add Notes data model and API (NoteType, NoteVisibility, Note)
- Create NotesProvider for state management
- Implement Notes page with list view, search, and filter
- Add NoteItem and NoteEditSheet widgets
- Add Notes tab to bottom navigation bar
- Implement triple-tap (buttonState == 4) voice note recording
- Add voice note transcription and duration tracking
- Add Mixpanel tracking for notes events
- Add l10n strings for notes feature
@greptile-apps
Contributor

greptile-apps bot commented Mar 24, 2026

Greptile Summary

This PR adds a quick voice notes feature to the Omi app, including a new Note data model, a full CRUD API client, a NotesProvider for state management, a Notes page with search/filter/undo-delete, and triple-tap (buttonState == 4) integration in CaptureProvider to start/stop voice recording. It also bundles unrelated improvements: better notification tap navigation on Flutter and macOS, path-specific timeout overrides for the FastAPI middleware, and a client.navigate error-handling fix in the Firebase service worker.

Key issues found:

  • Timer leak in CaptureProvider.dispose(): _voiceNoteTimeoutTimer is the only timer not cancelled in the existing dispose() method. If the provider is disposed mid-recording, the timer fires on a torn-down object and triggers async network calls.
  • Incorrect voice note duration: Duration is computed as allBytes.length / (16000 * 2), assuming raw 16-bit PCM. BLE audio payloads are typically Opus-encoded, meaning the computed duration will be significantly underestimated. The already-tracked _voiceNoteStartTime should be used instead.
  • "Undo delete" restores as a new note: restoreLastDeletedNote calls createNoteServer, which creates a brand-new note with a new id and created_at. The original note is permanently deleted — a true undo should call an un-delete endpoint.
  • New notes from FAB always tagged NoteType.voice: The _showEditSheet(null) path hardcodes type: NoteType.voice, so manually typed text notes are mislabelled and shown with the wrong icon.
  • Hardcoded user-facing strings: notes_page.dart and note_edit_sheet.dart both bypass the l10n system despite app_en.arb already containing the matching translation keys.
  • tripleTapAction preference is dead code: The preference is stored in SharedPreferencesUtil but never read by the triple-tap handler.
  • Conflicting singleton in NotesProvider: A static _instance singleton exists alongside the ChangeNotifierProvider registration in main.dart, creating two separate objects that can diverge in state.

Confidence Score: 2/5

  • Not safe to merge — multiple P1 bugs (timer leak, incorrect duration, broken undo) need to be addressed before this lands.
  • Three distinct P1 logic issues affect the primary user path: the timer not being cancelled in dispose() can cause post-disposal async side-effects; the byte-count duration formula will produce wrong values for Opus-encoded audio; and the undo-delete silently creates a new note with a new ID rather than restoring the original. On top of those, new text notes are always stored as voice notes (wrong type), user-facing strings bypass l10n, and the static singleton in NotesProvider can diverge from the DI instance. The feature foundations (data model, API client, UI shell) are solid, but the recording and state-management layers each have correctness bugs that should be fixed before merging.
  • Pay close attention to app/lib/providers/capture_provider.dart (timer leak + duration calculation) and app/lib/providers/notes_provider.dart (undo logic + singleton pattern).

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/lib/providers/capture_provider.dart | Adds triple-tap voice note recording; has three P1/P2 bugs: the timer is not cancelled in dispose(), the duration calculation incorrectly assumes raw PCM, and the already-tracked _voiceNoteStartTime is not used for the duration in place of the byte-count math. |
| app/lib/providers/notes_provider.dart | New state management for notes; restoreLastDeletedNote creates a new note (new ID/timestamps) instead of un-deleting the original, and a conflicting static singleton pattern exists alongside ChangeNotifierProvider registration. |
| app/lib/pages/notes/notes_page.dart | New Notes UI page; multiple user-facing strings are hardcoded instead of using l10n keys (which are already defined in app_en.arb), and new notes created via the FAB are incorrectly typed as NoteType.voice. |
| app/lib/backend/http/api/notes.dart | New Notes CRUD API client; clean implementation following existing patterns, with a minor nit that NoteType serialization uses toString().split('.').last instead of the .name getter already available in Dart. |
| app/lib/backend/schema/note.dart | New Note data model with fromJson/toJson/copyWith; well-structured with appropriate null safety and helper getters. |
| app/lib/backend/preferences.dart | Adds tripleTapAction preference, but the preference is never read by the triple-tap handler in capture_provider.dart, making it dead code. |
| app/lib/services/notifications.dart | Improves notification tap navigation by using pushAndRemoveUntil inside addPostFrameCallback with a retry on null navigator; a reasonable fix unrelated to the notes feature. |
| backend/utils/other/timeout.py | Adds path-specific timeout override support to TimeoutMiddleware; clean, backward-compatible change. |
| app/lib/pages/notes/widgets/note_edit_sheet.dart | New bottom sheet for creating/editing notes; functional but uses hardcoded strings instead of l10n keys. |
| app/lib/pages/notes/widgets/note_item.dart | New note list item widget; clean implementation with proper display of voice/text note metadata. |
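
The serialization nit on notes.dart can be illustrated with a minimal sketch. The enum below is a stand-in for the PR's NoteType, and the helper names (serializeNoteType, serializeNoteTypeLegacy, parseNoteType) are hypothetical, not taken from the PR. It assumes Dart 2.15 or later, where enums expose a name getter.

```dart
// Hypothetical stand-in for the PR's NoteType enum.
enum NoteType { voice, text }

/// Serializes with the built-in `name` getter (Dart 2.15+).
String serializeNoteType(NoteType type) => type.name;

/// The older idiom flagged in the review; it yields the same string.
String serializeNoteTypeLegacy(NoteType type) =>
    type.toString().split('.').last;

/// Parses a wire value back into the enum.
/// Falling back to `text` for unknown values is an assumption of this sketch.
NoteType parseNoteType(String raw) => NoteType.values.firstWhere(
      (t) => t.name == raw,
      orElse: () => NoteType.text,
    );
```

Both serializers return the same value for every enum member, so switching to the getter would not change the wire format.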

Sequence Diagram

sequenceDiagram
    participant Device as Omi Device (BLE)
    participant CP as CaptureProvider
    participant NP as NotesProvider
    participant API as Notes API (v3/notes)
    participant Trans as Transcription API

    Device->>CP: buttonState == 4 (triple tap)
    CP->>CP: _startVoiceNoteRecording()<br/>sets _voiceNoteSession, _voiceNoteStartTime<br/>starts 60s timeout timer

    loop BLE audio packets
        Device->>CP: audio frame bytes
        CP->>CP: _voiceNoteBytes.add(snapshot.sublist(3))
    end

    alt User triple-taps again to stop
        Device->>CP: buttonState == 4
        CP->>CP: _endVoiceNoteRecording()
        CP->>CP: _saveVoiceNote(bytes)
    else 60s timeout fires
        CP->>CP: _voiceNoteTimeoutTimer callback
        CP->>CP: _endVoiceNoteRecording()
        CP->>CP: _saveVoiceNote(bytes)
    end

    CP->>CP: concatenate bytes, calculate duration
    CP->>Trans: transcribeVoiceMessage(tempFile.wav)
    Trans-->>CP: transcription text
    CP->>API: POST /v3/notes (content, type=voice, duration, transcription)
    API-->>CP: Note object

    Note over CP,NP: NotesProvider is NOT notified by CaptureProvider<br/>User must refresh Notes tab to see new note

Comments Outside Diff (2)

  1. app/lib/providers/capture_provider.dart, line 1030-1045 (link)

    P1 _voiceNoteTimeoutTimer not cancelled in dispose()

    The 60-second auto-stop timer added for voice notes is never cancelled in the dispose() method of CaptureProvider. Every other timer in this method (_keepAliveTimer, _recordingTimer, _metricsTimer) is properly cancelled, but _voiceNoteTimeoutTimer is omitted.

    If the provider is disposed while a voice note is recording, the timer will fire after disposal and call _endVoiceNoteRecording → _saveVoiceNote → createNoteServer, triggering async network requests on a disposed object.
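
    One way the fix could look is sketched below. VoiceNoteRecorder is a simplified, hypothetical stand-in for CaptureProvider that models only the timer lifecycle; the _disposed guard and savedCount counter are illustrative additions, not part of the PR.

```dart
import 'dart:async';

// Simplified stand-in for CaptureProvider; only the timer lifecycle is shown.
class VoiceNoteRecorder {
  Timer? _voiceNoteTimeoutTimer;
  bool _disposed = false;

  /// Counts completed saves; a placeholder for the real save-and-upload path.
  int savedCount = 0;

  bool get hasPendingTimeout => _voiceNoteTimeoutTimer?.isActive ?? false;

  void startRecording({Duration timeout = const Duration(seconds: 60)}) {
    _voiceNoteTimeoutTimer?.cancel();
    _voiceNoteTimeoutTimer = Timer(timeout, endRecording);
  }

  void endRecording() {
    _voiceNoteTimeoutTimer?.cancel();
    _voiceNoteTimeoutTimer = null;
    if (_disposed) return; // defensive guard: never save after disposal
    savedCount++;
  }

  void dispose() {
    _disposed = true;
    // Cancel alongside the other timers so nothing fires after teardown.
    _voiceNoteTimeoutTimer?.cancel();
    _voiceNoteTimeoutTimer = null;
  }
}
```

    Cancelling in dispose() is the essential change; the guard in endRecording() only covers callbacks that were already in flight.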

  2. app/lib/backend/preferences.dart, line 141-143 (link)

    P2 tripleTapAction preference defined but never read

    tripleTapAction is added to SharedPreferencesUtil, but the triple-tap handler in capture_provider.dart never reads this value. Regardless of what this preference is set to, the handler always records a voice note.

    Either:

    • Read the preference in the buttonState == 4 handler and gate the voice note recording on SharedPreferencesUtil().tripleTapAction == 0, or
    • Remove the preference until it's actually needed.
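
    The first option might be sketched as follows. The constant tripleTapVoiceNote, the ButtonOutcome enum, and handleButtonState are hypothetical names; the only detail taken from the review is that a stored value of 0 is meant to select voice note recording.

```dart
// Hypothetical action code; the review only implies that 0 means "voice note".
const int tripleTapVoiceNote = 0;

enum ButtonOutcome { ignored, toggledVoiceNote }

/// Decides what a button event should do, given the stored preference.
ButtonOutcome handleButtonState(int buttonState, int tripleTapAction) {
  if (buttonState != 4) return ButtonOutcome.ignored; // not a triple tap
  if (tripleTapAction != tripleTapVoiceNote) {
    return ButtonOutcome.ignored; // user mapped triple tap to something else
  }
  return ButtonOutcome.toggledVoiceNote;
}
```

    In the real handler the second argument would presumably come from SharedPreferencesUtil().tripleTapAction.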

Reviews (1): Last reviewed commit: "feat: Add quick voice notes feature"

Comment on lines +193 to +214
Future<bool> restoreLastDeletedNote() async {
  if (_lastDeletedNote == null) return false;

  try {
    final note = await createNoteServer(
      content: _lastDeletedNote!.content,
      title: _lastDeletedNote!.title,
      type: _lastDeletedNote!.type,
      visibility: _lastDeletedNote!.visibility,
      duration: _lastDeletedNote!.duration,
      transcription: _lastDeletedNote!.transcription,
    );
    if (note != null) {
      _notes.insert(0, note);
      notifyListeners();
      _lastDeletedNote = null;
      return true;
    }
  } catch (e) {
    Logger.debug('Error restoring note: $e');
  }
  return false;

P1 Undo restores as a new note, not the original

restoreLastDeletedNote calls createNoteServer to recreate the note with the same content, but this creates a brand-new note with a new id and new timestamps. The original note is permanently deleted on the backend.

This breaks two things:

  1. The id changes, so any external references or links to the note are silently broken.
  2. The original created_at timestamp is lost — the restored note appears as if it was just created.

A proper undo should call a dedicated un-delete/restore endpoint (e.g., PATCH /v3/notes/{id} with {"deleted": false}) rather than recreating the note. If such an endpoint doesn't exist yet, this should be noted as a known limitation and the undo button should be disabled or the restore should at least preserve the original createdAt.
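
A minimal sketch of an id-preserving undo follows, assuming a restore call exists. RestoreNote is a hypothetical callback standing in for whatever server request un-deletes a note by id; Note and NotesStore are simplified stand-ins for the PR's model and provider, and no such endpoint is confirmed by the PR.

```dart
class Note {
  final String id;
  final DateTime createdAt;
  final String content;
  const Note(this.id, this.createdAt, this.content);
}

// Hypothetical signature for a server call that un-deletes a note by id.
typedef RestoreNote = Future<bool> Function(String id);

class NotesStore {
  NotesStore(this._restoreOnServer);

  final RestoreNote _restoreOnServer;
  final List<Note> notes = [];
  Note? _lastDeleted;
  int _lastDeletedIndex = 0;

  /// Removes the note from the list and remembers it for a possible undo.
  void deleteLocally(Note note) {
    final index = notes.indexOf(note);
    if (index < 0) return;
    notes.removeAt(index);
    _lastDeleted = note;
    _lastDeletedIndex = index;
  }

  /// Restores the same object, so id and createdAt are unchanged.
  Future<bool> restoreLastDeleted() async {
    final note = _lastDeleted;
    if (note == null) return false;
    if (!await _restoreOnServer(note.id)) return false;
    final index =
        _lastDeletedIndex > notes.length ? notes.length : _lastDeletedIndex;
    notes.insert(index, note);
    _lastDeleted = null;
    return true;
  }
}
```

Because the original object is reinserted rather than recreated, the note also returns to its previous position in the list.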

Comment on lines +578 to +582
}

// Calculate duration based on sample rate (16kHz) and bytes (16-bit = 2 bytes per sample)
final duration = allBytes.length / (16000 * 2);


P1 Duration calculation assumes raw PCM; BLE bytes are likely codec-encoded

The duration is calculated as:

final duration = allBytes.length / (16000 * 2);

This formula assumes raw 16-bit PCM audio at 16kHz (2 bytes per sample). However, _voiceNoteBytes is populated directly from BLE packet payloads (snapshot.sublist(3)), which are typically Opus-encoded audio frames on Omi devices.

Opus at 16kHz achieves roughly 32 kbps (about 4 KB/s), whereas raw 16-bit PCM at 16kHz is ~32 KB/s. This means the computed duration would be underestimated by roughly 8× when the codec is Opus.

The duration should either be measured via wall-clock time (already tracked by _voiceNoteStartTime) or be derived from codec-aware packet counting. Using the start-time approach is simpler and already available:

final duration = _voiceNoteStartTime != null
    ? DateTime.now().difference(_voiceNoteStartTime!).inMilliseconds / 1000.0
    : 0.0;
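
To make the size of the error concrete, here is a small worked sketch. It takes the rough 4 KB/s Opus figure above as an assumption (real bitrates vary), and the helper names are illustrative rather than from the PR.

```dart
const int sampleRateHz = 16000;
const int bytesPerSample = 2; // 16-bit samples

/// The PR's formula: treats every byte as raw 16-bit PCM.
double pcmDurationSeconds(int byteCount) =>
    byteCount / (sampleRateHz * bytesPerSample);

/// Wall-clock duration between the recording's start and end.
double elapsedSeconds(DateTime start, DateTime end) =>
    end.difference(start).inMilliseconds / 1000.0;

/// Bytes received for a recording at an assumed constant byte rate.
int bytesAtRate(int seconds, int bytesPerSecond) => seconds * bytesPerSecond;
```

For a 60-second recording at the assumed 4,000 bytes per second, bytesAtRate(60, 4000) is 240,000 bytes, and pcmDurationSeconds(240000) evaluates to 7.5 seconds, one eighth of the true length, while elapsedSeconds reports the full 60.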

Comment on lines +70 to +86
    'Note deleted',
    style: const TextStyle(color: Colors.white, fontSize: 14),
  ),
),
TextButton(
  onPressed: () async {
    final success = await provider.restoreLastDeletedNote();
    if (success) {
      _removeDeleteNotification();
    }
  },
  style: TextButton.styleFrom(
    padding: const EdgeInsets.symmetric(horizontal: 8),
    minimumSize: const Size(0, 36),
  ),
  child: Text(
    'Undo',

P2 Hardcoded user-facing strings — l10n required

Per the flutter-localization guide, all user-facing strings must use the l10n system. Several strings in this file are hardcoded in the UI even though app_en.arb already defines their translations (noteDeleted, searchNotes, voiceNotes, textNotes, noNotes, tripleTapToRecord).

Affected locations in this file: line 70 ('Note deleted'), line 86 ('Undo'), line 227 ('Search notes...'), line 275 ('All'), line 281 ('Voice'), line 288 ('Text'), line 324 ('No notes yet'), line 332 ('Triple tap your device button to record').

The same issue applies in app/lib/pages/notes/widgets/note_edit_sheet.dart: 'Title (optional)', 'Transcription...', 'Note content...', 'Edit Note', 'New Note', 'Cancel', 'Save'.

All of these should use the corresponding context.l10n.* keys.

Context Used: Flutter localization - all user-facing strings mus... (source)

Comment on lines +170 to +180
    duration: duration,
    transcription: transcription,
  );
} else {
  await provider.createNote(
    content: content,
    title: title,
    type: NoteType.voice,
    duration: duration,
    transcription: transcription,
  );

P2 New text notes are always created as NoteType.voice

When creating a new note from the edit sheet (the FAB path where note == null), the type is hardcoded to NoteType.voice:

await provider.createNote(
  content: content,
  title: title,
  type: NoteType.voice,   // ← always voice
  ...
);

This means manually authored text notes created via the FAB will be tagged as voice notes, causing incorrect icons and display in the list. The type should be NoteType.text when the note is being created manually through the edit sheet.
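
A minimal sketch of the suggested rule: keep the existing type when editing, and default to text when the sheet creates a note. NoteType and Note here are pared-down stand-ins, and typeForEditSheet is a hypothetical helper name.

```dart
enum NoteType { voice, text }

// Pared-down stand-in for the PR's Note model.
class Note {
  final NoteType type;
  const Note(this.type);
}

/// Type to store when the edit sheet saves.
/// Editing keeps the note's type; a new note (existing == null) is text.
NoteType typeForEditSheet(Note? existing) => existing?.type ?? NoteType.text;
```

Voice notes created by the triple-tap path would still pass NoteType.voice explicitly, so only the manual FAB path changes.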

Comment on lines +12 to +21
static NotesProvider? _instance;

static NotesProvider get instance {
  _instance ??= NotesProvider();
  return _instance!;
}

static void setInstance(NotesProvider provider) {
  _instance = provider;
}

P2 Static singleton conflicts with ChangeNotifierProvider DI

NotesProvider implements both a static singleton (_instance) and is registered via ChangeNotifierProvider in main.dart. These are two separate objects. Any code that calls NotesProvider.instance will get a different object from the one in the widget tree, leading to state inconsistencies (e.g., notes updated via the singleton won't trigger UI rebuilds on the DI instance and vice-versa).

Since NotesProvider is already registered in the Provider tree, the static singleton pattern and setInstance/instance accessors should be removed. Consumers should obtain the provider via Provider.of<NotesProvider>(context) or context.read<NotesProvider>().

OpenClaw Agent added 2 commits March 24, 2026 01:27
Fixes issue BasedHardware#5909: CRITICAL - Offline recording UI appears but audio is
lost when connection drops

Root cause: When WebSocket disconnects, TranscriptionService.sendAudio()
was dropping audio frames instead of buffering them locally.

Changes:
- Add offline audio buffer that writes to a temp file when disconnected
- When reconnected, flush the buffered audio before sending new audio
- This ensures audio is captured during connection drops and synced when
  connection is restored

The fix preserves the 'always capture everything' promise by buffering
audio locally during network interruptions instead of silently dropping it.

…isconnects

Problem:
When WebSocket connection dropped mid-recording, audio frames were silently
dropped instead of being buffered for later sync. This broke the core
'capture everything' promise.

Root causes:
1. In streamAudioToWs: onByteStream() was only called when _isWalSupported
   was true, which required Omi/OpenGlass + Opus codec. For other devices,
   audio was dropped when socket disconnected.
2. In streamRecording (phone mic): No offline buffering existed at all.
3. In _flushSystemAudioBuffer: System audio also had no offline buffering.

Fix:
1. streamAudioToWs: Always buffer to WAL when socket is disconnected,
   regardless of device type or codec. Only mark frames as synced for
   WAL-reliability devices (Omi/OpenGlass with Opus).
2. streamRecording: Initialize WAL for phone recording and buffer to WAL
   when socket is disconnected.
3. _flushSystemAudioBuffer: Buffer accumulated audio to WAL when socket
   is disconnected.

This ensures audio is never lost during connection drops - it will be
buffered locally and synced when connection is restored.

Fixes BasedHardware#5913
Fixes BasedHardware#5909
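
The buffer-then-flush behaviour described in this commit message can be sketched in a simplified form. This is not the PR's WAL implementation: it keeps frames in memory rather than on disk, and BufferedAudioSender and SendFrame are hypothetical names used only for illustration.

```dart
import 'dart:typed_data';

// Hypothetical callback that writes one audio frame to the socket.
typedef SendFrame = void Function(Uint8List frame);

class BufferedAudioSender {
  BufferedAudioSender(this._send);

  final SendFrame _send;
  final List<Uint8List> _pending = [];
  bool _connected = false;

  int get pendingCount => _pending.length;

  /// Flushes buffered frames in arrival order before any new audio is sent.
  void onConnected() {
    _connected = true;
    for (final frame in _pending) {
      _send(frame);
    }
    _pending.clear();
  }

  void onDisconnected() {
    _connected = false;
  }

  /// Sends immediately when connected; otherwise buffers instead of dropping.
  void sendAudio(Uint8List frame) {
    if (_connected) {
      _send(frame);
    } else {
      _pending.add(frame);
    }
  }
}
```

An in-memory list loses its contents if the app is killed, which is presumably why the commit writes to a WAL instead.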
@sungdark
Author

Additional Fix: create_feedback_post Chat Tool (Issue #5955)

This PR also implements the create_feedback_post chat tool for issue #5955.

What it does:

  • Allows the AI to create feedback posts on feedback.omi.me when users give feedback during chat
  • Tool parameters: (required), (required), (optional)
  • Posts to the community feedback board for the Omi team to review
  • Returns the URL of the created post

Files changed:

    • Tool definition and handler
    • Tool definition and handler
    • executeCreateFeedbackPost() implementation

Note:

Requires environment variable to be configured with a Featurebase API key.

Fixes #5955

P0: TestFlight builds split conversations across prod/staging backends
on WS reconnect. When Env.apiBaseUrl was re-evaluated on each WS
connection, it could return a different URL if _instance.apiBaseUrl was
null/empty and _apiBaseUrlOverride was set via overrideApiBaseUrl().

Fix: cache the effective API base URL at init time and when
overrideApiBaseUrl() is called, so that WS reconnects always use the
same backend as the initial connection.

Fixes BasedHardware#5949
@beastoin
Collaborator

AI PRs with low efforts are not welcome here. Thank you. — by CTO

@beastoin beastoin closed this Mar 28, 2026
@github-actions
Contributor

Hey @sungdark 👋

Thank you so much for taking the time to contribute to Omi! We truly appreciate you putting in the effort to submit this pull request.

After careful review, we've decided not to merge this particular PR. Please don't take this personally — we genuinely try to merge as many contributions as possible, but sometimes we have to make tough calls based on:

  • Project standards — Ensuring consistency across the codebase
  • User needs — Making sure changes align with what our users need
  • Code best practices — Maintaining code quality and maintainability
  • Project direction — Keeping aligned with our roadmap and vision

Your contribution is still valuable to us, and we'd love to see you contribute again in the future! If you'd like feedback on how to improve this PR or want to discuss alternative approaches, please don't hesitate to reach out.

Thank you for being part of the Omi community! 💜
