Expand the resource content viewer to render PDF, CSV, HTML, XML, and syntax-highlighted code #1329

@cliffhall

Background

The Resources screen renders a fetched resource through ContentViewer
(clients/web/src/components/elements/ContentViewer/ContentViewer.tsx), which
currently handles a narrow set of cases:

  • Text with a text/markdown branch via react-markdown + remark-gfm
  • Text with a {/[ heuristic for JSON pretty-print (no highlighting)
  • Image as a data: URI in a Mantine Image
  • Audio as a data: URI in <audio>
  • Embedded resource / resource_link (passthrough)

Anything blob-typed that isn't image or audio currently short-circuits to a
synthetic text block in ResourcePreviewPanel.toContentBlock:

return {
  type: "text",
  text: `[Binary content (${mimeType}) — preview not supported]`,
};

For an Inspector that mostly exists to peek at MCP server output, that's a
meaningful gap — any server that surfaces a PDF, a CSV, an HTML report, or a
formatted XML/JSON/CSS document shows up as opaque blob text.

Scope

Centralize a per-MIME dispatch for resource content so the Resources screen can
render the full type matrix below. Where a renderer needs a heavier dependency,
lazy-load it so the main bundle isn't penalized for resources the user never
opens.

Implementation shape

Extend ContentViewer rather than splitting off a parallel resource-only
element. Broaden its input to accept the raw
TextResourceContents | BlobResourceContents (with its base64 blob) and an
effective mimeType, so the per-MIME branches run inside the existing
component. ResourcePreviewPanel.toContentBlock collapses to a trivial
passthrough.

This keeps a single dispatch surface that tool results and prompt messages can
adopt later (#1328) without duplicating the type matrix.
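
To make the "single dispatch surface" concrete, here is a minimal sketch of the MIME-to-renderer decision as a pure function. The names RendererKind, inferMimeFromUri, and pickRenderer are hypothetical, not existing code. It also assumes the URI suffix is consulted only when the server omits mimeType, which may be narrower than what the type matrix intends.

```typescript
// Hypothetical names throughout; none of these exist in the codebase today.
type RendererKind =
  | "image" | "audio" | "pdf" | "csv" | "json" | "xml"
  | "html" | "css" | "markdown" | "text" | "unsupported";

// URI suffix -> MIME. Assumption: used only when mimeType is missing.
const SUFFIX_TO_MIME: Record<string, string> = {
  ".md": "text/markdown",
  ".markdown": "text/markdown",
  ".csv": "text/csv",
  ".json": "application/json",
  ".xml": "application/xml",
  ".html": "text/html",
  ".htm": "text/html",
  ".css": "text/css",
  ".pdf": "application/pdf",
};

function inferMimeFromUri(uri: string): string | undefined {
  // Drop any query string or fragment, then look at the last ".suffix".
  const path = uri.replace(/[?#].*$/, "").toLowerCase();
  const dot = path.lastIndexOf(".");
  return dot === -1 ? undefined : SUFFIX_TO_MIME[path.slice(dot)];
}

function pickRenderer(mimeType: string | undefined, uri: string): RendererKind {
  // Strip parameters such as "; charset=utf-8" before matching.
  const declared = mimeType?.replace(/;.*$/, "").trim().toLowerCase();
  const mime = declared || inferMimeFromUri(uri) || "";
  if (mime.startsWith("image/")) return "image";
  if (mime.startsWith("audio/")) return "audio";
  switch (mime) {
    case "application/pdf": return "pdf";
    case "text/csv": return "csv";
    case "application/json": return "json";
    case "text/xml":
    case "application/xml": return "xml";
    case "text/html": return "html";
    case "text/css": return "css";
    case "text/markdown":
    case "text/x-markdown": return "markdown";
  }
  // Catch-all for text-like MIMEs without a dedicated renderer.
  if (mime.startsWith("text/") || mime === "application/javascript") return "text";
  return "unsupported";
}
```

For example, pickRenderer(undefined, "file:///notes/README.md") selects the markdown branch, while pickRenderer("application/zip", "file:///a.zip") falls through to the binary-unsupported case.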

Type matrix

| Type | MIME | Renderer | Notes |
| --- | --- | --- | --- |
| Image | image/* | Mantine Image with data: URI | Already implemented |
| Audio | audio/* | <audio> with data: URI | Already implemented |
| PDF | application/pdf | <iframe> over a Blob URL with #view=FitH | New. Decode base64 to Uint8Array, new Blob([…], { type: 'application/pdf' }), URL.createObjectURL, revoke on unmount |
| CSV | text/csv (or .csv URI suffix) | Mantine Table (striped, hoverable), first ~100 rows | New. Parse with papaparse (header: true); fall back to plain <pre> when header detection fails |
| JSON | application/json (or .json URI suffix) | Pretty-printed code with syntax highlighting | Upgrade. Keep JSON.stringify(JSON.parse(text), null, 2), render through the new lazy highlighter |
| XML | text/xml, application/xml (or .xml URI suffix) | Indented + syntax-highlighted code | New. Small hand-rolled formatter (>\s*< split + indent), then highlighter with language="xml" |
| HTML | text/html (or .html / .htm URI suffix) | Sandboxed <iframe> over a Blob URL | New. See HTML sandbox notes below |
| CSS | text/css (or .css URI suffix) | Syntax-highlighted code | New |
| Markdown | text/markdown, text/x-markdown (or .md / .markdown URI suffix) | react-markdown + remark-gfm | Already implemented; tighten the anchor href allowlist (see below) |
| Other text | text/* and application/{javascript,xml,json} etc. not handled above | Plain wrapping <pre> (Code variant="wrapping") | Already implemented; remains the catch-all for unrecognized text MIMEs |
| Binary / unsupported | Anything else | "Binary content ({mimeType}) — preview not supported" | Already implemented; reachable only when the dispatch above doesn't match |
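
The XML row's "hand-rolled formatter" could look roughly like the sketch below: split wherever one tag ends and the next begins, then indent by nesting depth. formatXml is a hypothetical name. This is deliberately naive and does not try to handle CDATA, comments containing "><", or mixed content; it is meant only as a display aid, not a parser.

```typescript
// Hypothetical helper: a naive pretty-printer for display purposes only.
function formatXml(xml: string, indentUnit: string = "  "): string {
  // Split where one tag ends and the next begins, then restore the brackets
  // that the split consumed.
  const parts = xml.trim().split(/>\s*</);
  const tokens = parts.map(
    (part, i) => (i > 0 ? "<" : "") + part + (i < parts.length - 1 ? ">" : ""),
  );

  let depth = 0;
  const lines: string[] = [];
  for (const token of tokens) {
    const isClosing = /^<\//.test(token);
    // Opening tag, as opposed to </close>, <?declaration?>, or <!comment>.
    const isOpening = /^<[^/?!]/.test(token);
    const isSelfClosing = /\/>$/.test(token);
    // e.g. <item>text</item> stays on one line and does not change depth.
    const closesItself = /^<[^/?!][^>]*>[\s\S]*<\//.test(token);

    if (isClosing) depth = Math.max(0, depth - 1);
    lines.push(indentUnit.repeat(depth) + token);
    if (isOpening && !isSelfClosing && !closesItself) depth += 1;
  }
  return lines.join("\n");
}
```

The formatted string would then be handed to the highlighter with language="xml". If the input is not well-formed, the output is still a string (possibly oddly indented), so the renderer never throws on bad XML.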

Cross-cutting work

  • Lazy syntax highlighter. Add a small <CodeHighlight language="…">
    element that, on first use in a session, dynamic-imports
    react-syntax-highlighter's prism-light runtime + one theme chunk
    (tomorrow is a reasonable default), then dynamic-imports the requested
    language grammar and registers it with the runtime. Track three module-level
    caches:
    • registeredLanguages: Set<string> — successfully loaded grammars
    • failedLoads: Set<string> — grammars whose import rejected; prevents
      retry loops
    • loadingPromises: Map<string, Promise<void>> — in-flight imports so
      concurrent mounts of the same language share one async call
    While a grammar is loading (or the language tag is unknown), render a plain
    <pre> so users don't see a flash of unstyled tokens. Initial language
    coverage: json, xml (alias of markup), css, plus aliases (yml→yaml,
    md→markdown, etc.); add more as the type matrix expands later.
  • HTML sandbox. A wrapHtmlWithCsp(html) helper that injects a CSP
    <meta http-equiv> into the document's <head> (or wraps a fragment in
    <html><head>…</head><body>…</body></html>) before handing the result to an
    HtmlIframe that:
    • sets sandbox="" — explicitly empty, no allow-scripts,
      allow-forms, or allow-same-origin
    • serves the content via URL.createObjectURL(new Blob([…])) and revokes
      on unmount
    CSP should be defense-in-depth (correct even if sandbox is later loosened):
      default-src 'none'; style-src 'unsafe-inline' https://fonts.googleapis.com; img-src data: blob:; font-src data: https://fonts.gstatic.com; base-uri 'none'; object-src 'none'; form-action 'none';
    Do not add script-src: letting it fall through to default-src 'none' is what
    makes the policy load-bearing under a future allow-scripts change.
  • Base64 → text decode. A decodeBase64ToUtf8(base64) helper
    (atob → Uint8Array → TextDecoder('utf-8')) so blob content delivered as
    BlobResourceContents for an inherently text MIME (CSV, XML, HTML, JSON,
    CSS, Markdown delivered with non-UTF-8 encoding markers) renders correctly.
  • MIME inference fallback. ResourcePreviewPanel.effectiveMime already
    infers text/markdown from .md / .markdown URIs. Extend the inference
    table to cover the suffixes that map to new renderers: .csv → text/csv,
    .json → application/json, .xml → application/xml, .html / .htm →
    text/html, .css → text/css, .pdf → application/pdf. Servers that
    omit mimeType are common; without this the new renderers never engage.
  • Markdown anchor safety. Add a SAFE_HREF allowlist for the markdown
    a component — pattern roughly ^(https?:|mailto:|#|\/(?!\/)) (bare
    /path is allowed; protocol-relative //evil.com is not). Non-matching
    anchors render as <span> so user-supplied markdown can't smuggle
    javascript: or similar.
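
Three of the helpers above are small enough to sketch as pure functions. This is an illustration under assumptions, not the final implementation. The helper names decodeBase64ToUtf8 and wrapHtmlWithCsp come from this issue, while isSafeHref is a hypothetical name. The head detection in wrapHtmlWithCsp is regex-based and naive. The href pattern makes two additions to the rough regex above that should be reviewed: a case-insensitive flag, and a backslash in the lookahead, since browsers treat /\host like //host.

```typescript
// atob yields one char per byte; TextDecoder turns those bytes into UTF-8 text.
function decodeBase64ToUtf8(base64: string): string {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return new TextDecoder("utf-8").decode(bytes);
}

// Policy text copied from this issue; note there is no script-src on purpose.
const CSP =
  "default-src 'none'; style-src 'unsafe-inline' https://fonts.googleapis.com; " +
  "img-src data: blob:; font-src data: https://fonts.gstatic.com; " +
  "base-uri 'none'; object-src 'none'; form-action 'none';";

function wrapHtmlWithCsp(html: string): string {
  const meta = `<meta http-equiv="Content-Security-Policy" content="${CSP}">`;
  const headOpen = /<head(\s[^>]*)?>/i; // matches <head> but not <header>
  if (headOpen.test(html)) {
    // Full document: inject the policy right after the opening <head> tag.
    return html.replace(headOpen, (open) => open + meta);
  }
  // Fragment (or head-less document): wrap it in a minimal shell.
  return `<!doctype html><html><head>${meta}</head><body>${html}</body></html>`;
}

// Assumption: "i" flag and the backslash in the lookahead are additions.
const SAFE_HREF = /^(https?:|mailto:|#|\/(?![\/\\]))/i;

function isSafeHref(href: string | undefined): boolean {
  return href !== undefined && SAFE_HREF.test(href.trim());
}
```

In the markdown a component, a false result from isSafeHref would switch the rendered element from <a> to <span>. The string returned by wrapHtmlWithCsp is what would go into the Blob handed to the sandboxed iframe.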

Explicitly out of scope (for this issue)

  • DOCX (application/vnd.openxmlformats-officedocument.wordprocessingml.document)
    and RTF (application/rtf, text/rtf). Faithful rendering needs
    server-side extraction (mammoth / rtf-to-html style) and a new dev-backend
    endpoint plus dependencies. Defer to a follow-up; both hit the
    binary-unsupported fallback for now.
  • New attachment-style controls (download button, copy-as-blob, expand/collapse
    wrapper around the viewer). Keep the existing Copy button behavior on text /
    markdown cases as the only built-in action.

Acceptance criteria

  • Reading a PDF resource shows it in an in-page viewer (not the
    binary-fallback text); the blob URL is revoked when the panel unmounts or the
    user navigates to a different resource.
  • Reading a CSV resource shows a table for the first ~100 rows; malformed
    CSV falls back to plain <pre> rather than throwing.
  • Reading an HTML resource shows the rendered HTML in a sandboxed iframe
    (sandbox="", no script execution, CSP injected). A script tag in the
    resource cannot execute and cannot reach the network.
  • Reading a JSON, XML, or CSS resource shows syntax-highlighted
    code; the highlighter runtime + theme chunk loads only on the first such
    resource per session, and subsequent resources of the same language don't
    refire the import.
  • Reading a Markdown resource (including a .md URI without an explicit
    mimeType) continues to render as markdown; anchors whose href does not
    match the safe-scheme allowlist render as <span> instead of <a>.
  • Existing image / audio / plain text rendering is unchanged.
  • npm run validate, npm run test:integration, and npm run test:storybook
    all pass; new unit tests in ContentViewer (or the new component) cover at
    minimum: PDF blob URL creation + cleanup, CSV table render + fallback, HTML
    sandbox attributes + CSP injection, JSON / XML / CSS highlighted output, and
    the URI-suffix MIME-inference paths.
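
The "don't refire the import" criterion is easiest to test if the three module-level caches sit behind a function that takes the importer as a parameter. The sketch below assumes that shape. ensureLanguage, GrammarImporter, and ALIASES are hypothetical names, and in the real component the importer would wrap the dynamic import plus registration with the highlighter runtime.

```typescript
// Hypothetical: stands in for "dynamic-import the grammar and register it".
type GrammarImporter = (language: string) => Promise<unknown>;

const registeredLanguages = new Set<string>();
const failedLoads = new Set<string>();
const loadingPromises = new Map<string, Promise<void>>();

// Aliases named in this issue; extend as language coverage grows.
const ALIASES: Record<string, string> = { xml: "markup", yml: "yaml", md: "markdown" };

// Resolves to true once the grammar is usable, false if its import failed.
function ensureLanguage(language: string, importGrammar: GrammarImporter): Promise<boolean> {
  const name = ALIASES[language] ?? language;
  if (registeredLanguages.has(name)) return Promise.resolve(true);
  if (failedLoads.has(name)) return Promise.resolve(false); // no retry loop

  let pending = loadingPromises.get(name);
  if (!pending) {
    // Promise.resolve().then(...) also captures a synchronous throw.
    pending = Promise.resolve()
      .then(() => importGrammar(name))
      .then(
        () => { registeredLanguages.add(name); },
        () => { failedLoads.add(name); },
      )
      .then(() => { loadingPromises.delete(name); });
    loadingPromises.set(name, pending); // concurrent callers share this promise
  }
  return pending.then(() => registeredLanguages.has(name));
}
```

A unit test can then pass a counting importer, mount two highlighters for the same language at once, and assert the importer ran a single time. While the returned promise is pending, or when it resolves to false, the component would render the plain <pre> fallback.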
