Skip to content

Article-mode oEmbed extraction for video pages + V1 payload parity#658

Open
wikirby wants to merge 18 commits into
masterfrom
user/wikirby/article-oembed
Open

Article-mode oEmbed extraction for video pages + V1 payload parity#658
wikirby wants to merge 18 commits into
masterfrom
user/wikirby/article-oembed

Conversation

@wikirby
Copy link
Copy Markdown
Contributor

@wikirby wikirby commented May 14, 2026

Summary

V1's article-mode flow on YouTube/Vimeo produced a saved OneNote page with an embedded video iframe, citation, title/author caption, and AutoPageTags* meta tags that OneNote's renderer uses to recognize the result as an article-style clip with playable embeds. V2 shipped without that machinery — clicking Article on a YouTube page just ran Readability, which strips iframes and produces a text-only result with no player and no description. Users reported the regression.

This PR restores V1's user-visible behavior using the oEmbed standard instead of V1's per-provider URL parsing. Both YouTube and Vimeo publish CORS-enabled oEmbed endpoints that the chrome-extension origin can fetch directly under existing <all_urls> host_permissions.

Provider scope

Narrowed to V1 parity: YouTube + Vimeo only.

V1 also handled Khan Academy via regex scrape of lesson-page HTML for embedded YouTube IDs — markup likely no longer matches modern Khan Academy pages; skipped per maintainer direction. Additional providers may be added as a focused capability-expansion PR in the future.

Changes

Area Change
src/scripts/contentCapture/oembedExtractor.ts (new) 3-entry provider table + hostname-pattern matching + fetch/parse + sanitizeProviderHtml helper. Returns raw OEmbedData (renderer composes preview & save separately).
renderer.ts extractArticle Try oEmbed first; on null fall through to existing Readability path (no behavior change for non-matching domains).
renderer.ts composeOEmbedForPreview Thumbnail at fixed 600x338 (16:9 — matches saved iframe size), title <h3>, "author · provider" attribution, og:description, CSS-only overlay on video-type thumbnails. No nested iframe in preview because preview-frame is sandboxed (allow-same-origin) and the player can't execute JS inside it.
renderer.ts composeOEmbedForSave Sanitized provider iframe with data-original-src=<pageUrl> injected and dimensions normalized to 600x338 — the marker OneNote's renderer uses to recognize the embed (matches V1's YoutubeVideoExtractor.createEmbeddedVideoFromId).
renderer.ts save dispatch Threads new pageMetadata map through the save port message. Article paths (oEmbed + Readability) populate AutoPageTagsCodes=Article, AutoPageTags=Article, plus title/author/siteName + description.
webExtensionWorker.ts buildPage HTML output realigned to V1 OneNoteApi.OneNotePage.getEntireOnml: no <!DOCTYPE>, <html xmlns="http://www.w3.org/1999/xhtml" lang=<locale>> (no quotes around lang — matches V1 literally), iterates pageMetadata to emit one <meta> per entry. Same shape applied to the parallel distHtml builder for distributed-PDF saves.
renderer.ts imageToDataUrl Drive-by bookmark fix: initial PNG encode + iterative JPEG-quality step-down when encoded data URL exceeds OneNote's per-MIME-part limit. Matches V1's deleted DomUtils.adjustImageQualityIfNecessary. Surfaced because bookmark mode on YouTube was hitting 400 Maximum request size exceeded (1280x720 og:image PNG-encoded ~2.5MB).

Follow-up fixes (separate commits)

  • Article-mode highlighter (9bc0549): Pre-existing regression from PR Client-side migration: unified renderer window with self-contained sign-in #651's sandbox hardening. Library now loads as a parent-window script (same pattern as pdf.js) and operates on the iframe DOM via allow-same-origin rather than executing inside it. Small textHighlighter.js patch routes event bindings through el.ownerDocument.defaultView. Sandbox stays tight.
  • Sidebar group spacing (ca6ac23 + a292926): Per designer, more breathing room around the Fluent dividers — .sidebar-group padding 8px → 12px. Window-height cap stays at 900 per owning team (PDF-mode success-banner button may sit under the fold on caps-bound displays; UX trade-off accepted).
  • Segoe UI @font-face restored for Mac/Linux (fe40b3d): Re-adds the V1 @font-face declaration so non-Windows users get Segoe UI from OneNote's CDN instead of falling through to -apple-system (San Francisco). Local() cascade keeps Windows on the installed font, no network request. No CSP change — font loading was already unrestricted.
  • Selection mode + ContextImage + cross-mode highlight preservation (be5b15c): Restores V1's right-click context-menu features that were either no-ops or broken in V2.
    • Listener bug fix: contextMenus.onClicked.addListener was registered inside the menu-creation for loop, so a single right-click fired the handler N times — N parallel captureVisibleTab requests blew the per-second quota and the popup blinked closed. Hoisted out of the loop.
    • "Clip Image to OneNote" (ContextImage) — previously dispatched but ignored by V2's renderer. Worker now forwards invokeDataForMode (image srcUrl) to the renderer, which fetches via the existing imageToDataUrl and pre-seeds Region mode with the single image as a thumbnail (matches V1's regionResult.data = [srcUrl] + default-mode-Region pattern).
    • "Clip Selection to OneNote" — new 6th conditional mode button revealed only when invoked via right-click. Selection HTML capture in contentCaptureInject.ts reuses the existing full-DOM cleanup pipeline by substituting body with the cloned range, materializing img.src/a.href from properties so absolute URLs survive into body.innerHTML (loses the <base> tag during render-wrap). Result piped through cleanArticleHtml for ONML-equivalent strip — iframe added to the strip list (safe because the oEmbed path bypasses cleanArticleHtml by construction, so video iframes for known providers still flow through). Save reuses article's branch with AutoPageTags=Article (V1 parity). Button uses Fluent UI System Icons Copy Select 20-Regular; label/tooltip wired via existing V1 i18n keys (no new strings).
    • Cross-mode highlight preservation — pre-existing latent bug: articleWorkingHtml saved on switch-out of article, but switchToArticle / switchToSelection themselves didn't capture the outgoing mode's state, so even article ↔ selection ↔ article would have lost article's highlights. Selection had no equivalent at all. Generalized saveArticleWorkingStatesaveWorkingState dispatched by currentMode, added the save call at every transition entry point, added selectionWorkingHtml slot. Also added articleWorkingHtml/selectionWorkingHtml to the sign-in/sign-out fresh-capture reset blocks.
    • Highlighter line-jump in <pre> blocks.highlight-anchor was display: inline-block, making the wrapped span an atomic layout unit and defeating word-break: break-word. Long highlighted URLs in code blocks jumped to their own line. Changed to display: inline (position:relative on an inline still works as a containing block for the absolute delete button per CSS spec).

Test plan

  • npm run build:prod clean (zero TS errors, tslint passes)
  • YouTube watch page → article mode → preview shows thumbnail + ▶ overlay + title/author/description; save produces OneNote page with playable embedded video
  • Vimeo video page → same flow
  • Non-matching domain (regular article) → falls through to Readability with no regression
  • Bookmark mode on YouTube → saves successfully (was 400 before the imageToDataUrl JPEG fallback)
  • Article-mode highlighter: select text → yellow span with red-circle delete button; click delete removes
  • Segoe UI loads from network on machines without it locally (verified via local()-strip dry run)
  • Right-click "Clip Image to OneNote" → opens in Region mode with that image pre-seeded as a thumbnail
  • Right-click "Clip Selection to OneNote" → opens with Selection mode button revealed and auto-engaged; selected text preserved with absolute image/link URLs; iframes stripped
  • Highlights preserved across article ↔ selection ↔ article (and via other modes)
  • Long URL highlighted in a <pre> block wraps mid-token instead of jumping to a new line
  • PDF mode, full-page mode, region mode otherwise unaffected

Out of scope (deferred)

  • Image-size fallback on PDF mode / full-page mode / region mode — those paths already use JPEG and aren't reported as failing; add if needed
  • Broader provider support — separate, focused PR if appetite arises
  • Pixel-perfect Mac/Windows parity on Semibold rendering (browser synthesizes from Regular today; add @font-face for Segoe UI Semibold if designer flags it)
  • V1's video-domain iframe special-handling in selection mode (V1 kept YouTube/Vimeo iframes in selections via VideoExtractorFactory; V2 strips all iframes from selection — users who want a video clip use article mode)

Demo

image image

🤖 Generated with Claude Code

V1's article-mode flow on video pages (YouTube, Vimeo) produced a save
payload with an embedded video iframe, citation, title/author caption,
and `<meta name="AutoPageTagsCodes" content="Article" />` /
`<meta name="AutoPageTags" content="Article" />` tags that OneNote's
page renderer uses to recognize the result as an article-style clip
with playable embeds. V2 shipped without any of that machinery -- the
article mode just ran Readability over the YouTube DOM, which strips
iframes and produces a text-only result with no player and no
description. Users reported the regression on YouTube specifically;
the same gap applied to Vimeo as well.

oEmbed standard provides exactly the shape we need (iframe `html`,
title, author_name, thumbnail_url, dimensions) without any
provider-specific scraping. Both YouTube and Vimeo publish CORS-enabled
oEmbed endpoints that the chrome-extension origin can fetch directly
under our existing `<all_urls>` host_permissions.

Changes:

  - New `src/scripts/contentCapture/oembedExtractor.ts` -- thin module
    with a provider table (YouTube + Vimeo only, matching V1's
    SupportedVideoDomains), hostname-pattern matching, fetch + JSON
    parse, and a small `sanitizeProviderHtml` helper that strips
    script-execution surfaces from provider-supplied HTML.
  - `extractArticle` in renderer now tries oEmbed first; on no-match
    or fetch failure it falls through to the existing Readability
    path with zero behavior change.
  - Preview vs save are decoupled:
      - Preview shows the `thumbnail_url` at the same 600x338 (16:9)
        box the saved iframe uses, with title / "author . provider"
        attribution, page description (og:description fallback chain
        same as bookmark mode), and a CSS-only play-glyph overlay
        when `type === "video"`. No iframe in preview because the
        renderer's `preview-frame` is sandboxed (allow-same-origin)
        and the YouTube/Vimeo player can't run JS inside it -- which
        is why earlier attempts produced a broken "Unable to execute
        JavaScript" placeholder.
      - Save uses the provider's iframe HTML (sanitized), with
        `data-original-src=<pageUrl>` injected and dimensions
        normalized to 600x338 -- the marker OneNote's renderer uses
        to recognize and render the embedded player on the saved
        page, matching V1's YoutubeVideoExtractor behavior exactly.
  - PageMetadata plumbing: renderer threads a `pageMetadata` map
    through the save port message; worker's `buildPage` iterates
    and emits `<meta name="K" content="V" />` for each entry.
    Mirrors V1's `OneNoteApi.OneNotePage.getPageMetadataAsHtml`
    behavior. Article mode (both oEmbed and Readability paths)
    populates `AutoPageTagsCodes=Article`, `AutoPageTags=Article`,
    plus title/author/siteName (oEmbed) or
    title/description/author/siteName/publishedTime (Readability,
    matching V1 augmentationHelper).
  - `buildPage` HTML output realigned to V1
    `OneNoteApi.OneNotePage.getEntireOnml` shape: no `<!DOCTYPE>`,
    `<html xmlns="http://www.w3.org/1999/xhtml" lang=<locale>>`
    (no quotes around lang -- matches V1 output literally), locale
    via `chrome.i18n.getUILanguage()`. Same change applied to the
    parallel `distHtml` builder for distributed-PDF saves so all
    save paths emit the same shape.
  - Bookmark thumbnail size fallback restored: `imageToDataUrl`
    initial-encode is PNG (good for icons/logos), with iterative
    JPEG-quality step-down when the encoded data URL exceeds the
    OneNote API per-MIME-part limit (~2MB minus padding). Matches
    V1's deleted `DomUtils.adjustImageQualityIfNecessary` behavior
    including the 0.1 step size. Surfaced because the user was
    hitting "400 Maximum request size exceeded" on bookmark-mode
    saves of YouTube pages whose 1280x720 og:image PNG-encoded to
    ~2.5MB.

Provider scope is intentionally narrow (YouTube + Vimeo only) to
match V1's effective surface and avoid accidentally enabling capture
on sites V1 never supported. V1 also handled Khan Academy via regex
scrape for embedded YouTube IDs in lesson-page HTML; that markup
likely no longer matches modern Khan Academy pages and is skipped
here per maintainer direction.

Verified manually: YouTube watch page and Vimeo video page produce
saved OneNote pages with the embedded player, title/author caption,
and og:description text below; non-matching domains fall through to
Readability with no regression; bookmark mode on YouTube saves
successfully without the 400 limit error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wikirby wikirby force-pushed the user/wikirby/article-oembed branch from a066275 to 60aea08 Compare May 14, 2026 06:48
wikirby and others added 16 commits May 14, 2026 03:54
The article-mode highlighter has been non-functional since commit
16f62c0 (Security hardening) added `sandbox="allow-same-origin"` to
the preview-frame iframe. The sandbox blocks script execution inside
the iframe document, so `setupHighlighter`'s runtime injection of
`<script src="textHighlighter.js">` into the iframe head silently
no-opped -- the script tag existed but the library code never ran,
so `pWin.TextHighlighter` stayed undefined and selection never
produced highlight spans.

Fix mirrors how pdf.js is structured in this codebase: load the
trusted script in the parent renderer.html window (chrome-extension://
origin, hardened CSP), then operate on the iframe's DOM from the
parent. `allow-same-origin` permits cross-frame DOM access and
selection reads from the parent; only inside-iframe script execution
is blocked, which we don't need.

Changes:

  - renderer.html: add `<script src="textHighlighter.js">` before
    renderer.js so `window.TextHighlighter` is available in parent
    context, same loading pattern as pdf.combined.js.
  - renderer.ts initHighlighter / createHighlighterInstance: drop
    the runtime script injection path entirely. Construct directly
    from `(window as any).TextHighlighter` against
    `previewFrame.contentDocument.body` -- same constructor signature
    the library was designed for, just invoked from a different
    script context.
  - textHighlighter.js bindEvents / unbindEvents: route the
    keyup / mouseup / touchend listeners through
    `el.ownerDocument.defaultView` (the iframe's window) instead of
    bare `window` (the parent's window). When the highlighter is
    constructed from the parent and operates on a same-origin iframe,
    bare `window` resolves to the parent, missing events that fire
    inside the iframe. All other DOM access in the library already
    routes through `dom(el).get{Window,Document,Selection}()` helpers
    -- this is the only outlier, six lines patched + a clarifying
    comment.

No sandbox change. Security hardening from PR #651 remains intact:
script execution inside preview-frame is still blocked, so untrusted
content (page-author HTML in bookmark mode, provider-supplied iframe
HTML in oEmbed mode) cannot run JS. The highlighter is trusted code
that lives outside the sandbox and reaches in.

Verified manually: article mode on a regular page -- click highlight
button, drag-select text, span appears with #fefe56 background and
red-circle delete button; click delete button removes the highlight;
mode-switch and save preserve highlights via the existing
articleWorkingHtml path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per designer: more breathing room around the 1px Fluent dividers
between sidebar groups (mode buttons, meta fields, section picker).
Bumped `.sidebar-group` padding from 8px/8px to 12px/12px so each
gap goes from 8+8=16px around the divider to 12+12=24px. Designer
asked for 16/16 originally; 12/12 was the landing point after
checking PDF mode -- 16/16 pushed the "View in OneNote" success
banner against the window bottom and required scrolling. 12/12 is
still a clear improvement over 8/8 while leaving the post-clip
button in view.

Bumped the window-height cap from 900px to 980px to absorb the
+24px sidebar growth and keep the PDF-mode success banner comfortably
visible. Floor of 600px unchanged; smaller browsers still shrink to
fit. 980 stays well clear of typical 1080p displays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Owning team prefers to keep the renderer-window height cap at the
original 900px and let the UX consequence of the roomier divider
spacing surface (PDF-mode "View in OneNote" success-banner button
sits below the fold and requires scrolling on caps-bound displays).
Sidebar group padding stays at 12px/12px from the prior commit so
the design ask is still in place; only the height-cap accommodation
is reverted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V1 had an @font-face declaration that loaded Segoe UI from OneNote's
own CDN, explicitly so Apple/Android users without local Segoe UI
would still render the intended Fluent typography. V2 dropped that
declaration and never replaced it -- Mac users have been falling
through the font-family stack to -apple-system (San Francisco)
since V2 launch, which renders noticeably differently from the
intended look. Designer flagged it.

Re-added the declaration verbatim from V1 (same
www.onenote.com/styles/segoeui.{eot,woff} URLs V1 used; verified
currently serving 200). Local() cascade hits first on Windows (no
network), remote fallback only on machines that don't have Segoe UI
installed.

V1 also declared @font-face for `Segoe UI Light`; V2's CSS never
references the Light variant -- skipped here as dead code. V1
referenced `Segoe UI Semibold` as a font-family value without ever
declaring an @font-face for it; V2 instead uses `font-weight: 600`
on the Regular face throughout, which the browser synthesizes
(matches V1's effective behavior on Mac since Mac doesn't have
Segoe UI Semibold locally either).

No manifest CSP change: the prior CSP omitted font-src entirely,
so font loading was already unrestricted -- the @font-face just
works under existing policy. Article-preview pages with their own
@font-face declarations (e.g. FontAwesome on a Gentoo page)
continue to render their icon fonts in the sandboxed preview iframe
as before.

Verified on Windows: with local() in place, no network request
(uses installed Segoe UI). Stripping local() temporarily forced
the remote fetch path; renderer DevTools' "Rendered Font:
Network resource" confirmed the CDN-served woff was used.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduces V1's context-menu features that were either no-ops or
silently broken in V2's unified-window architecture, plus a latent
listener bug that caused right-click invocations to crash.

1. Context-menu listener registered once, not N times
   chrome.contextMenus.onClicked.addListener was being called from
   inside the for-loop that creates the three menu items, so one
   right-click fired the handler 3x -> 3 parallel invokeClipperInTab
   -> 3 captureVisibleTab racing the per-second quota -> disconnected
   port + "image readback failed" + the popup blinking and closing.
   Hoisted the addListener call out of the loop; types narrowed to
   match chrome.contextMenus.OnClickData signature (tab?: Tab).

2. "Clip Image to OneNote" (ContextImage) -- V1 parity
   The menu item dispatched InvokeMode.ContextImage with the image
   srcUrl, but the V2 renderer ignored it and just opened in Full Page
   mode. Worker now captures options.invokeDataForMode into a new
   pendingInvokeData field, forwarded as `invokeData` on the
   loadContent port message. Renderer fetches the URL via
   imageToDataUrl (same path bookmark mode uses for og:image), pushes
   into regionImages, auto-clicks Region. switchToRegion already does
   the right thing on a non-empty regionImages -- skips overlay launch,
   renders thumbnails, enables Save. Tainted/CORS-failed fetch falls
   back to Full Page focus so the user is never stranded.

3. "Clip Selection to OneNote" -- V1 parity
   A 6th conditional mode button revealed only when invoked via the
   right-click "Clip Selection to OneNote" item AND when a non-empty
   selection was captured. Auto-engages after the screenshot finishes,
   leaving other modes (Region/Bookmark/Article/PDF) reachable. Button
   uses Fluent UI System Icons "Copy Select" 20-Regular SVG (matches
   the four existing Fluent-style mode icons in renderer.html).

   Selection HTML capture (contentCaptureInject.ts): builds a
   "selection-substituted doc" -- cloneDocument(document), replace body
   with the cloned selection fragment, then run the doc-only steps of
   the existing pipeline (addBaseTagIfNecessary, addImageSize,
   removeUnwantedItems). Parallel-walk steps (inlineHiddenElements,
   neutralizePositioning, flattenShadowDomSlots,
   convertCanvasElementsToImages) are screenshot-stitching concerns
   and don't apply to static save HTML -- correctly skipped. To
   preserve URL resolution after the <base href> is dropped (we return
   body.innerHTML which loses head), img.src + a.href are materialized
   from the resolved properties back into attributes. resolveLazyImages
   applied on the resulting string, same as full-DOM capture.

   Renderer (renderer.ts): selection HTML piped through the existing
   cleanArticleHtml for ONML-equivalent strip. Added `iframe` to the
   strip list -- safe because the oEmbed article path
   (composeOEmbedForPreview/composeOEmbedForSave) intentionally
   bypasses cleanArticleHtml, so the video iframe for known providers
   still flows through; and Readability already drops iframes upstream.
   Net effect: selection iframes get stripped, matching V1's
   tagsNotSupportedInOnml. cleanArticleHtml also gained blank-image
   removal (V1's removeBlankImages parity).

   Save dispatch: article + selection share one branch; selection
   uses cachedSelectionHtml as source and writes
   AutoPageTagsCodes/AutoPageTags=Article in PageMetadata (V1 parity).

   Localization: button label + tooltip wired via existing V1 i18n
   keys WebClipper.ClipType.Selection.Button and .Tooltip -- no new
   strings.

4. Cross-mode highlight preservation
   Pre-existing latent bug: switching away from article mode and back
   preserved highlights via articleWorkingHtml, but article <-> other
   <-> article via the article->selection or selection->article path
   would have lost article's highlights (those switch functions never
   called saveArticleWorkingState). Same gap for selection mode --
   selectionWorkingHtml didn't exist; switch-away-then-back rendered
   the pristine cachedSelectionHtml every time.

   Generalized saveArticleWorkingState -> saveWorkingState dispatched
   by currentMode (article/selection slots). Added the save call at
   the top of switchToArticle and switchToSelection so the outgoing
   mode's preview-frame body is captured on every transition.
   switchToSelection now renders selectionWorkingHtml ||
   cachedSelectionHtml, mirroring article. Also added
   articleWorkingHtml + selectionWorkingHtml to the sign-in/sign-out
   fresh-capture reset blocks (article's reset was also missing).

   V1 wrote on every highlight add/delete via handleBodyChange (eager);
   V2 writes on switch-out (lazy). Functionally equivalent because the
   save path already clones the live preview-frame DOM when highlights
   are present.

5. Highlighter line-jump in code blocks
   .highlight-anchor was display:inline-block, making the wrapped text
   an atomic layout unit and defeating word-break:break-word inside
   <pre> blocks -- highlighting a long URL forced the entire span to
   its own line. Changed to display:inline; position:relative on an
   inline still works as the containing block for the absolutely-
   positioned delete button (CSS spec: any positioned ancestor, inline
   or block, serves as a containing block).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V1's augmentationHelper ran Readability on every article-mode page
including video pages (augmentationHelper.ts:62-71) and used
article.excerpt for metadata.description + article.publishedTime
for the timestamp. It then ADDED video iframes on top via
addEmbeddedVideosWhereSupported -- Readability-extracted description
and video player coexisted in the saved page.

V2's oEmbed path skipped Readability entirely, so the only description
source was the iframe document's og:description / description /
twitter:description meta tags. On YouTube specifically, og:description
is often truncated (~155 chars) and sometimes missing entirely on SPA
navigations, which is the unreliability that surfaced in testing.

Fix: run Readability alongside oEmbed in extractArticle. Use
article.excerpt as the primary description and fall back to the
meta-tag chain only on Readability failure / empty excerpt. Also pick
up article.publishedTime (V2 was missing this metadata field for
oEmbed pages -- V1 had it). oEmbed still owns title / author /
thumbnail / iframe -- those are richer from the provider's structured
response than from Readability's HTML inference, and V1's
VideoExtractorFactory is functionally replaced by oEmbed for that part.

Readability is already lazy-loaded for the non-oEmbed Readability-
fallback path, so this adds no bundle weight -- just ~500ms first-call
latency on a video page (cached after first call). Preview-frame stays
blank during that window (pre-existing UX, not changed here).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WIP -- holding local pending designer signoff. Reason: with Selection
mode added, the mode-button stack started pushing the sidebar content
past the 900px window cap on smaller displays (already a borderline
condition with PDF options or signed-in user-info footer present).
Collapsing into a single dropdown control recovers ~150-200px of
vertical headroom.

HTML (renderer.html):
- Wrap #mode-panel in #mode-selector and add #mode-trigger button
  above it. Trigger has aria-haspopup="true" aria-expanded + the
  shared aria-labelledby="mode-label mode-trigger-label" association.
- #mode-panel switches from role="toolbar" to role="menu", hidden
  until .open is added.

LESS (renderer.less):
- #mode-trigger styled as a canonical Fluent 2 Dropdown matched to
  #section-selected (same padding/height/weight/focus treatment),
  not as an oversized "MenuButton" -- per Fluent 2 spec, hierarchy
  between two sibling pickers on the same panel is expressed via
  the field-label and group spacing, not by giving one control
  outsized visual weight.
- Leading icon slot (16x16, mirrors the selected mode's icon) at
  the start; absolutely-positioned chevron at the end that rotates
  180deg when expanded for affordance.
- .mode-btn restyled to match .section-item (8px 12px padding, 16x16
  icon, gap 6px, weight 600 + brand color on selected) -- two
  dropdowns with consistent option-list styling.
- #mode-panel becomes a position:absolute popover (top: 100% + 4px)
  with white bg, 1px border, 4px radius, standard menu shadow.

renderer.ts:
- syncTriggerToSelected() clones the .mode-btn.selected element's
  icon (.mode-icon or .mode-icon-img) into #mode-trigger-icon-slot
  and mirrors its label text to #mode-trigger-label. Called once at
  boot (reflects the default fullpage selection) and from inside the
  existing mode-btn click handler so every mode change -- user click
  or programmatic .click() from the Selection/ContextImage auto-engage
  paths -- updates the trigger.
- openModeMenu / closeModeMenu toggle .open + aria-expanded; opening
  focuses the first non-disabled, visible mode option for arrow-key
  nav (existing keydown handler on .mode-btn unchanged).
- Trigger keyboard: ArrowDown / Enter / Space opens; Escape closes.
- Escape inside the menu closes and returns focus to the trigger.
- Click outside (not on trigger or menu) closes the menu.
- All 24 existing .mode-btn[data-mode="X"] selectors continue to work
  unchanged -- the buttons still exist, just inside a popover now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the user invokes via the toolbar button, they haven't expressed
a mode intent yet -- so we capture the full-page screenshot eagerly
because Full Page is the default and the user might pick any mode.
When the user invokes via the right-click context menu ("Clip
Selection to OneNote" / "Clip Image to OneNote"), they HAVE expressed
intent -- and the screenshot loop is 2-5s of wasted work scrolling
the original tab to produce an asset the user has no use for in those
flows. Defer the capture loop to when the user explicitly enters Full
Page mode (existing infrastructure handles "switch to FP with capture
not yet done" just by re-using captureDimensions on demand).

Net behavior change:
- Toolbar invocation: identical to before (eager capture).
- Right-click "Clip Selection to OneNote": no screenshot loop;
  Selection mode auto-engages immediately on loadContent.
- Right-click "Clip Image to OneNote": no screenshot loop; Region
  mode auto-engages with the right-clicked image pre-seeded as a
  thumbnail.
- Switching to Full Page later in any of the above: capture loop
  kicks off then; user sees the existing "Capturing..." progress UI.

Implementation:

State (renderer.ts:138-143)
  Two new vars next to fullPageComplete:
    captureInProgress  -- true between `dimensions` send and `finalize`
                          complete; distinct from fullPageComplete which
                          means "we have a screenshot ready"
    captureDimensions  -- caches viewport/page/content heights from
                          onReady so a deferred kickoff can reuse them
                          without re-measuring after layout shifts

Helpers (renderer.ts ~2590)
  kickoffFullPageCapture()      sends `dimensions` to the worker;
                                 idempotent via captureInProgress guard
  enterReadyStateWithoutCapture() hides the capture panel, enables
                                 save + mode buttons + signout, fires
                                 the auto-engage. Replaces the
                                 capture-complete equivalent for the
                                 skip-capture path.
  triggerContextMenuAutoEngage() reveals the Selection / Region mode
                                 button (per invokeMode) and clicks it
                                 synchronously. Late image fetch
                                 callbacks guard on captureInProgress
                                 / fullPageComplete / currentMode so a
                                 slow image load can't bounce the user
                                 out of a mode they explicitly chose.

Behavior fixes that fell out of the rewiring:

Click-handler guard (renderer.ts:2465)
  The pre-existing guard `if (!fullPageComplete && mode !== "fullpage")`
  was the wrong gate: it conflates "capture is running" with "capture
  hasn't completed." fullPageComplete stays false in the skip-capture
  path by design (no capture has run), so mode switching was blocked
  for the entire skip flow. Changed to captureInProgress.

unlockSidebar (renderer.ts:960)
  Same wrong gate -- after save in a Selection-mode skip-capture, this
  was disabling Selection / Article / Bookmark / Region back. Removed
  the gate; switchToFullPage handles deferred-capture kickoff on its
  own when user enters Full Page.

iframe visibility (renderer.ts:2693 + 1033)
  iframeDoc.write() paints the captured page DOM synchronously; the
  auto-engage doesn't fire until onReady (which waits on image loads),
  so the user briefly saw the "wrong" Full Page DOM during the gap.
  loadContent now sets iframe.style.visibility = "hidden" the moment
  it detects a context-menu invocation -- iframe stays in layout (so
  onReady's measurements are accurate) but doesn't paint. The
  subsequent switchToSelection / switchToRegion sets display:none.
  switchToFullPage's deferred-kickoff branch resets visibility back to
  "" so captureVisibleTab grabs real pixels.

Finalize handler (renderer.ts:3110)
  Two cleanups:
  (1) previewContainer.innerHTML = "" before appending the screenshot
      -- previewContainer is shared with Region mode (thumbnails +
      remove buttons), and the old code appended the screenshot below
      the existing region content in deferred-capture flows. Visible
      symptom: after Full Page capture in image-context mode, user
      saw the region thumbnail stacked above the screenshot.
  (2) Removed the dead `else` branch that was appendChild-ing a
      screenshot <img> when the user wasn't in Full Page. That image
      was never referenced -- switchToFullPage rebuilds a fresh <img>
      from fullPageDataUrl on entry. Removing eliminates the
      mode-bleed-through risk.
  (3) Removed the inline context-menu auto-engage logic (moved to the
      shared triggerContextMenuAutoEngage helper, called only from
      enterReadyStateWithoutCapture). Without this removal, a
      deferred Full Page capture in the image-context flow would
      bounce the user back to Region at finalize time.

Reset blocks (renderer.ts:3394 + 3443)
  captureInProgress + captureDimensions added to both sign-in and
  sign-out fresh-capture clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MV3 service workers unload aggressively when idle. When a context-menu
click wakes the worker back up, Chrome dispatches the event during
the worker's initialization run. If the onClicked listener isn't
registered yet at that moment, the event is silently dropped.

Pre-existing behavior in V1 (which also gated the listener behind
an async chain), but in MV3 the worker unloads far more often than
V1's persistent background page did, so users hit this regularly:
"first right-click does nothing, second one works." The second
attempt succeeds because by then the worker has fully woken up and
the listener is in place.

The fix is standard MV3 hygiene: register the listener
synchronously in the constructor (which runs at top-level during
service-worker startup, before any await/then). The handler itself
already defers actual clipper invocation on clipperIdProcessed via
invokeClipperInTab, so we don't need state to be ready at listener
registration time -- we just need the listener wired up.

Implementation:

Split registerContextMenuItems into two methods:
  registerContextMenuClickListener  Synchronous; just registers the
                                     onClicked listener. Called
                                     directly from the constructor.
  registerContextMenuItems          Async; only creates the menu
                                     items themselves. Still gated
                                     on clipperIdProcessed +
                                     fetchAndStoreLocStrings because
                                     menu titles need localization.

The listener body is unchanged from before -- same dispatch on
info.menuItemId, same PDF-plugin parent-tab fallback for selection
mode, same default-mode fallback for image mode without srcUrl.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New behavior, not V1 parity: V1's getDefaultClipMode always returned
FullPage for non-context-menu / non-PDF invocations. This is a pilot
for the broader pattern of site-suggested initial mode -- when we can
infer the user's likely intent from URL/page signals, skip the eager
full-page capture and open directly in the relevant mode.

The deferred-capture infrastructure (commit 4db0d29) makes this nearly
free: the same skip-capture path used by right-click "Clip Selection"
and "Clip Image" now also fires for toolbar invocations whose pageUrl
matches a known oEmbed provider (YouTube, Vimeo). The user lands in
Article mode where extractArticle's existing oEmbed branch will render
the embedded video player.

Trade-off: a user who actually wants a screenshot of the YouTube/Vimeo
page chrome (suggestions sidebar, comments, etc.) now has to click
Full Page to trigger the deferred capture. Same cost as the existing
context-menu skip flows. The 5+ seconds of wasted scroll-and-stitch
on what is mostly chrome and dynamic content is recovered for the
common case ("clip this video to OneNote").

Implementation:

oembedExtractor.ts (+17)
  New sync helper isOEmbedProviderUrl(url): boolean -- delegates to
  the existing matchProvider() hostname check. No fetch; just URL
  pattern matching against the same 2-entry provider table (YouTube,
  Vimeo) that tryOEmbed uses for the actual extraction. False
  positives (provider hostname matches but the specific URL has no
  oEmbed) gracefully fall back to Readability via the existing
  extractArticle path -- user lands in Article mode with Readability
  output instead of Full Page screenshot, still a reasonable result.

renderer.ts (+60 net)
  New module-level flag siteSuggestsArticleMode, set on loadContent
  when pageUrl matches an oEmbed provider AND the invocation isn't
  a context-menu one (which carries an explicit user mode intent
  that takes precedence). The flag is folded into the existing
  skip-capture decisions:
    - iframe.style.visibility = "hidden" (prevent the Full Page DOM
      blink during the iframe-load -> auto-engage gap)
    - onReady skip-vs-kickoff branch (skip eager capture loop)
  triggerContextMenuAutoEngage renamed to triggerInitialModeAutoEngage
  (now covers three sources of "skip + auto-engage": context-menu
  Selection, context-menu Image -> Region, site-suggested Article).
  Added an Article branch that clicks the Article mode button
  synchronously, same pattern as the existing two branches.

Reset blocks (sign-in / sign-out) clear siteSuggestsArticleMode to
avoid stale state across sessions.

Future generalization: this pattern could grow into a single
suggestedMode field on the loadContent message, with sources
ranging from URL-based (oEmbed, PDF) through page-content
detection (article-rich heuristic) to user preference (settings).
Today's implementation keeps it as a simple boolean alongside
the existing context-menu flags for minimal-risk incremental
change, but the abstraction is intentionally extensible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Chrome Web Store automated review rejected the update with:

  Violation: Requesting but not using the following permission(s):
  storage

The audit is correct. We declared the "storage" permission in the
manifests for both Chrome and Edge, but the code never calls
chrome.storage.* anywhere -- ripgrep across all of target/ finds zero
hits.

The permission was a leftover. An earlier V2 build used
chrome.storage.session as part of the worker -> renderer content
handoff, then we eliminated that round-trip in favor of an in-memory
pendingContent buffer (commit history references "Eliminate
chrome.storage.session for content payload"). The manifest entry was
never removed in that refactor.

What we actually use is the Web Storage API (window.localStorage),
which is available to any page context without an extension
permission. The service worker accesses it indirectly via the
offscreen document, which is a normal page context and reads/writes
window.localStorage directly. The "offscreen" permission (kept) is
what gates that arrangement.

No behavior change -- just removes a permission we never exercised.
Unblocks the Chrome Web Store update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single commit covering three small bugs that appeared once the
dropdown UI and the deferred-capture path were both in place. Kept
together to minimize commit churn; one of the fixes (PDF dropdown
trigger sync) is dependent on the dropdown commit and would need a
one-line cleanup if that commit ever gets reverted -- captured in
the dropdown_revert_orphan memory note.

1. PDF mode dropdown trigger shows wrong icon/label
   enterPdfMode() directly manipulates .selected/aria-pressed on
   mode buttons instead of routing through the click handler (which
   the dropdown commit hooked syncTriggerToSelected() into). Two
   problems:
   (a) The else branch of the mode-btn forEach only set
       display:none on hidden buttons -- it never cleared their
       .selected state. The Full Page button kept the .selected
       class from boot, and PDF added its own, so two buttons
       claimed .selected at once. The dropdown's
       querySelector(".mode-btn.selected") picked the DOM-first
       match (Full Page) and rendered its icon/label in the
       trigger while the actual preview correctly showed PDF.
   (b) enterPdfMode never called syncTriggerToSelected(), so even
       after fixing (a) the trigger wouldn't refresh.
   Fix: clear .selected on hidden buttons too, and push the
   trigger sync manually at the end of enterPdfMode.

2. Deferred-capture sidebar lock parity
   switchToFullPage's else-branch (which kicks off deferred
   capture for skip-eager flows: Selection / ContextImage /
   oEmbed-routing) was missing pieces of the lock the boot-time
   eager-capture path uses (renderer.ts ~line 920):
   - .disabled class on mode buttons (visual disabled state)
   - disableSignout() -- comment at the boot site says "prevents
     race conditions; clicking sign-out mid-capture corrupts
     state." Same hazard applies to deferred capture.
   - announceToScreenReader("Capturing...") for a11y parity
   - Progress-bar track reset (stale from prior runs in rare
     paths)
   Title field, note field, section picker stay editable -- user
   can prep their save while the loop runs (matches eager-flow
   behavior).

3. Region-cancel forgets previous mode
   Pre-existing bug (the regionCancelled handler had a hardcoded
   "fullpage" / "pdf" fallback), made far worse by:
   - Dropdown commit: handler manipulates .selected directly
     without calling syncTriggerToSelected, so the trigger goes
     stale.
   - Deferred-capture commit: snapping to Full Page when
     !fullPageComplete now KICKS OFF the screenshot loop --
     definitely not what someone who just cancelled wanted. If
     the user was in Selection mode, entered Region, cancelled,
     they'd suddenly find themselves in Full Page with a
     capture loop starting.
   Fix: new modeBeforeRegion state var, stashed at the top of
   switchToRegion (only when not already in region). On
   regionCancelled with empty regionImages, click the
   corresponding mode button -- routing through the click
   handler gets selected-state, syncTriggerToSelected,
   closeModeMenu, telemetry context, and the mode-specific
   switchToXxx all for free. Falls back to pdf/fullpage if
   the stash is empty (defensive -- switchToRegion always
   stashes, so this shouldn't trigger in practice).
   Reset modeBeforeRegion in both sign-in and sign-out
   fresh-capture blocks for state-machine cleanliness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ropdown"

This reverts commit 6af8a0d -- designer
preferred all mode buttons visible up front.

Also rolled in:
- Selection mode button icon swapped to Fluent "Reading List Add"
  (20-Regular) per designer.
- Window height cap bumped 900 -> 980 to fit the restored stacked
  buttons + PDF options without scroll in most states (PDF mode is
  still borderline; tiny scrollbar acceptable).
- Removed the one-line orphan call to syncTriggerToSelected() inside
  enterPdfMode that came from the dropdown-sync fix in ea8f1d9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer noted V1's PDF mode showed a PDF icon + filename in the
preview area as a visual cue that the PDF file itself would be
attached to the OneNote page (in addition to the rendered page
images). V2 had the checkbox + save-time behavior wired up but no
visual indicator, so users couldn't tell from the preview whether
attachment was active.

Restored via a small node mounted at the top-leading edge of
#preview-container -- mirrors V1's PdfPreviewAttachment component
(84x96 box, 48px icon, filename without extension below). V1's
pdf_attachment_icon.png is already shipped (used to ship for the
legacy clipper) so no new asset needed.

renderer.ts
  renderPdfAttachmentIndicator()  Idempotent helper: removes any
                                   existing #pdf-attachment-indicator,
                                   no-op when pdfAttach is false, else
                                   builds + inserts as first child of
                                   #preview-container. role="img" +
                                   aria-label for screen readers.
  Called from four lifecycle sites so the indicator persists across
  the previewContainer.innerHTML="" rebuilds:
    - PDF load success in enterPdfMode (after renderPdfPagesInPreview)
    - switchToPdf re-entry from another mode
    - pdfAttachCheckbox change handler (real-time toggle)
    - updateAttachCheckbox auto-disable for >24.9MB PDFs

renderer.less
  .pdf-attachment-indicator -- matches V1's .attachment-overlay
  dimensions exactly (84x96, 48px icon, 13px filename, ellipsis on
  overflow). Top-leading positioning via margin-inline-start so it
  mirrors correctly under RTL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V2 doesn't call chrome.cookies.* anywhere -- ripgrep finds zero hits
across src/ and target/. The permission was scaffolding from the
initial commit (65e218d, 2015) and never removed in the 7+ years
since. Chrome's manifest never had it -- if the shared JS were
actually using chrome.cookies, Chrome would have thrown at runtime
years ago.

Confirmed server-side that all cookies (WebClipper validation +
WebClipperTokenInformation + refresh token storage) are managed
entirely by the server with HttpOnly + Secure flags, and read by
the server via Request.Cookies. The browser stores them and
auto-sends on cross-origin POSTs to /webclipper/userinfo without
any client API involvement.

Same shape as the Chrome storage permission removed in d343f64.
Tightens the manifest for Edge store review and removes dead
attack surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V2's renderer.ts at line 924 read userInformation straight from
localStorage and fired the OneNote API notebooks fetch with whatever
access token was there -- including expired ones. After a 1-hour
expiry, every renderer-open's notebooks fetch returned 401, the user
saw stale cached notebook lists, and telemetry was logging spurious
GetNotebooks HTTP 401 events. The save path's existing token refresh
masked this for users who actually saved (worker calls
ensureFreshUserBeforeClip), but readers who just opened the renderer
to browse notebooks hit the stale-token failure every time.

V1 didn't have this gap. clipper.tsx's initializeExtensionCommunicator
called getInitialUser on every UI init, which routed through
extensionWorkerBase to auth.updateUserInfoData(InitialRetrieval).
That call is cache-aware via clipperData.getFreshValue with TTL =
(accessTokenExpiration*1000) - 180000 -- if the cached token is still
within its expiry-minus-3-minutes window, returns cached without a
network call. Otherwise hits /webclipper/userinfo using the
refresh-token cookie and writes a fresh access token to localStorage.
SectionPicker then fetched notebooks reactively when the access token
in clipperState changed.

Restored equivalent in V2:

webExtensionWorker.ts -- new port handler for action "refreshUser":
  calls this.auth.updateUserInfoData(clipperId, InitialRetrieval) and
  posts back { action: "userRefreshed" } on both success and failure.
  No success/failure payload -- the renderer inspects localStorage
  state to make its own decision, mirroring V1's data-driven pattern
  (V1 returned the UserInfo object and the UI checked .user presence).

renderer.ts -- two changes:
  (1) The eager fetchFreshNotebooks() call at module-load is removed.
      Notebooks fetch is now deferred to the userRefreshed handler.
  (2) safeSend({ action: "refreshUser" }) added right next to the
      existing safeSend({ action: "ready" }) at the bottom of the
      file, AFTER port.onMessage.addListener has been registered.
      Sending refreshUser before listener registration could race --
      worker responds before renderer is listening.

If refreshUser response somehow never arrives (worker hung), the
renderer's populateSectionDropdown() has already rendered the cached
notebook list synchronously at module-load, so the user isn't
stranded. No timeout fallback needed -- the cached-display layer is
the fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aanchalbhansali aanchalbhansali self-requested a review May 20, 2026 10:18
@aanchalbhansali
Copy link
Copy Markdown
Contributor

@wikirby There are too many long-liner comments in all the changed files. Please clean them up / shorten.

Comment thread src/scripts/renderer.ts Outdated
Two changes from the PR review:

1. Telemetry parity for Selection mode click event.
   Added "selection" entry to modeClickIds in the mode-button click
   handler so logClickEvent fires "selectionButton" (matching V1's
   Constants.Ids.selectionButton) instead of falling back to the raw
   "selection" data-mode value.

2. Shortened verbose comment blocks throughout the PR's changed files.
   Per review feedback, condensed multi-paragraph "why and history"
   blocks to 1-3 line "why" comments. Removed:
   - V1 archeology paragraphs (those belong in commit messages and
     memory notes, not in source).
   - Mechanism walk-throughs that the code already shows.
   - Dangling preamble for code that was removed in an earlier commit.
   Kept:
   - One-sentence rationale where the WHY isn't obvious from code.
   - V1 parity tags (e.g. "matches legacy getInitialUser") for cross-ref
     to commit history.

Net: ~150+ comment lines removed across renderer.ts, webExtensionWorker.ts,
webExtension.ts, oembedExtractor.ts. No behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants