Conversation
There was a problem hiding this comment.
Pull request overview
This PR replaces the custom Markdown→HTML conversion pipeline in internal/richtext with a goldmark-based converter configured for Basecamp/Trix-compatible output, and updates HTML→Markdown handling to improve edit-loop round-trip fidelity (notably for multiline blockquotes, list continuations, and code fences).
Changes:
- Swap the hand-rolled
MarkdownToHTMLparser for a goldmark converter with custom AST transformer/renderer and a\@mention-suppression inline parser. - Improve
HTMLToMarkdownnormalization for blockquotes, lists, and code fences to preserve multiline structure across round-trips. - Add/adjust richtext tests and promote goldmark to a direct dependency in
go.mod.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/richtext/richtext.go | Replaces Markdown conversion with goldmark + custom Trix components; adjusts HTML→Markdown normalization helpers. |
| internal/richtext/richtext_test.go | Updates expected HTML and adds targeted round-trip and edge-case tests for the new behavior. |
| go.mod | Adds github.com/yuin/goldmark as a direct requirement (was previously indirect). |
Tip
If you aren't ready for review, convert to a draft PR.
Click "Convert to draft" or run gh pr ready --undo.
Click "Ready for review" or run gh pr ready to reengage.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
1 issue found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="internal/richtext/richtext.go">
<violation number="1" location="internal/richtext/richtext.go:613">
P1: `reStripP` regex matches `<pre>`, `<progress>`, and any other tag starting with `p`, not just `<p>`. The pattern `</?p[^>]*>` needs a word boundary after `p` — e.g., require the next character to be `\s` or `>` so that `<pre>` is excluded.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b053614b71
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6574c73bcf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d73e8155c2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5d8fca04b3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 271de640f1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Replace the ~350-line regex pipeline (convertInline, placeholder extraction/restoration, line-by-line parser) with goldmark configured for Trix editor compatibility via three custom components: - trixTransformer: forces tight lists, converts soft breaks to hard inside list items and blockquotes, inserts <br> between blank-line-separated blocks for Trix paragraph spacing - trixRenderer: compact blockquote output, HTML escaping for raw HTML blocks (vs <!-- raw HTML omitted -->), TrixBreak and EscapedAt node rendering - escapedAtParser: intercepts \@ before goldmark's standard escape handling to preserve mention-suppression syntax HTMLToMarkdown gains <br>-aware round-trip support: blockquote and list handlers now normalize <br> tags before splitting, so multiline blockquotes and list continuations survive the edit loop (MarkdownToHTML → HTMLToMarkdown) faithfully. Also fixes multiline blockquote/paragraph regex matching ((?i) → (?is)) and trims trailing newlines from code fence content. Promotes goldmark from indirect to direct dependency (v1.7.13, already present via glamour).
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for j < len(html) && depth > 0 { | ||
| // Decrement for any list close tag (handles mixed <ul>/<ol> nesting) | ||
| if hasPrefixFold(html[j:], "</ul>") || hasPrefixFold(html[j:], "</ol>") { | ||
| closeLen := 5 // len("</ul>") == len("</ol>") | ||
| depth-- | ||
| if depth == 0 { | ||
| inner := html[contentStart:j] | ||
| var md string | ||
| if tag == "ul" { | ||
| md = convertULInner(inner) | ||
| } else { | ||
| md = convertOLInner(inner) | ||
| } | ||
| s := result.String() | ||
| if len(s) > 0 && s[len(s)-1] != '\n' { | ||
| result.WriteByte('\n') | ||
| } | ||
| result.WriteString(md + "\n\n") | ||
| j += closeLen | ||
| break | ||
| } | ||
| j += closeLen | ||
| } else if loc := reListOpen.FindStringSubmatchIndex(html[j:]); loc != nil && loc[0] == 0 { | ||
| depth++ | ||
| j += loc[1] | ||
| } else { | ||
| j++ | ||
| } |
There was a problem hiding this comment.
In replaceBalancedListBlocks(), the inner scanning loop increments j one byte at a time and calls reListOpen.FindStringSubmatchIndex(html[j:]) on most iterations. This makes the list conversion potentially O(n²) on larger HTML bodies (especially when there are few tags, so the regex runs for many bytes). Consider scanning by jumping to the next '<' (e.g., via strings.IndexByte) and only then checking for open/close tags, or doing a single pass tokenization to avoid repeated regex calls per byte.
| s := result.String() | ||
| if len(s) > 0 && s[len(s)-1] != '\n' { | ||
| result.WriteByte('\n') | ||
| } | ||
| result.WriteString(md + "\n\n") |
There was a problem hiding this comment.
replaceBalancedListBlocks() calls result.String() inside the main conversion loop to check whether the last written byte is a newline. strings.Builder.String() materializes the full accumulated string, which is avoidable extra work and can become costly for large inputs. Consider tracking whether the builder currently ends with a newline via a small state variable (or by maintaining a separate lastByte as you write) instead of converting the whole builder to a string each time.
| for j < len(html) && depth > 0 { | ||
| if hasPrefixFold(html[j:], "</li>") { | ||
| depth-- | ||
| if depth == 0 { | ||
| content := html[contentStart:j] | ||
| content = replaceBalancedListBlocks(content) | ||
| items = append(items, content) | ||
| j += 5 | ||
| break | ||
| } | ||
| j += 5 | ||
| } else if loc := reLIOpen.FindStringIndex(html[j:]); loc != nil && loc[0] == 0 { | ||
| depth++ | ||
| j += loc[1] | ||
| } else { | ||
| j++ | ||
| } |
There was a problem hiding this comment.
extractListItems() uses a byte-by-byte scan (j++) and runs reLIOpen.FindStringIndex(html[j:]) repeatedly to detect nested
j by searching for the next '<' and then checking for <li / </li> prefixes (case-insensitively), reserving the regex for the initial “find next
Summary
convertInline, placeholder extraction/restoration,line-by-line parser) with goldmark configured for Trix editor compatibility via three custom
components:
trixTransformer(tight lists, soft→hard breaks, paragraph spacing),trixRenderer(compact blockquotes, HTML escaping, TrixBreak), andescapedAtParser(
\@mention suppression)HTMLToMarkdownround-trip fidelity: blockquote and list handlers now normalize<br>before splitting, so multiline blockquotes and list continuations survive the edit loop.
Also fix multiline regex matching (
(?i)→(?is)) and trim trailing newlines from codefence content
Net: -433/+452 lines across 3 files. The hand-rolled parser shrinks by ~300 lines; goldmark
components add ~200; tests add ~200 with exact round-trip assertions.
Behavioral changes
- Two\nAfterkeepsAfterinside the listitem instead of starting a new paragraph. Rare in CLI input; CommonMark-correct.
Text\n---is now an h2 (CommonMark), not paragraph + hr.%in URLs:\%produces%not%25(CommonMark-correct).\ninside<code>(goldmark standard fenced code output).\n) instead of space join — HTML renders identically.Test plan
bin/cigreen (all checks pass; tidy-check requires committed state)TestMarkdownToHTML/TestMarkdownToHTMLBackslashEscapesexpectations updatedTestMarkdownToHTMLBackslashAtCounts(4 backslash counts),TestMarkdownToHTMLMultiParagraphBlockquote(single/multi/multi-para),TestMarkdownToHTMLRawHTMLBlock(single/multiline script),TestHTMLToMarkdownMultilineBlockquote,TestHTMLToMarkdownMultilineParagraph,TestHTMLToMarkdownCodeFenceNewlineTestEditLoopRoundTripwith exact assertions: blockquotes (3 variants), lists (plain +continuation), code fences, headings, mixed content
basecamp todos create/basecamp messages createagainst a realBasecamp account — created todo with heading, bold/italic, list with continuation,
multiline blockquote, and code fence; server HTML → markdown → HTML → markdown
round-tripped stably with all structure preserved