feat(tiptap): convert Story Block content to Markdown (#35727) #35728
wezell wants to merge 2 commits
…35727)

Adds TiptapMarkdown (com.dotcms.tiptap) which converts Story Block / ProseMirror documents to markdown and back. Wires it into the existing renderable surface so Velocity can call:

    $contentlet.storyBlock.toMd
    $markdownTool.blockToMarkdown(json)

Supports paragraph, heading 1-6, blockquote, bullet/ordered lists, codeBlock with language, horizontalRule, hardBreak, image, GFM tables, plus the dotCMS-specific dotImage and youtube extensions. Marks: bold, italic, strike, code, link.

Marks with no markdown equivalent (underline, highlight, sub/superscript, textStyle, color) are dropped silently; truly unknown nodes/marks log once at INFO and are skipped so the converter never throws on user-extended Tiptap schemas.

Markdown -> Tiptap uses commonmark-java with GFM tables and strikethrough extensions (zero transitive runtime deps).

Tests: 56 passing -- 49 synthetic + 7 against a trimmed real-blog fixture (blog-test.json) covering every node and mark plus fixed-point round-trip stability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
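The "log once at INFO and skip" handling of unknown nodes and marks described in this commit message could be sketched roughly as below. This is an illustration only, not the PR's code: the class name `UnknownTypeLog` and the method `reportOnce` are hypothetical, and `java.util.logging` stands in for the dotCMS `Logger` so the snippet stays self-contained.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

// Hypothetical helper: remembers which unsupported Tiptap types were already
// reported, so each one is logged a single time and then silently skipped.
final class UnknownTypeLog {

    // Assumption: java.util.logging replaces the dotCMS Logger for this sketch.
    private static final Logger LOG = Logger.getLogger(UnknownTypeLog.class.getName());

    // Thread-safe set of "kind:type" keys that have been reported so far.
    private static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    private UnknownTypeLog() {
    }

    /**
     * Reports an unsupported node or mark type.
     *
     * @param kind "node" or "mark"
     * @param type the Tiptap type name, e.g. a user-defined extension
     * @return true only the first time this kind/type pair is reported
     */
    static boolean reportOnce(String kind, String type) {
        // Set.add returns false when the key is already present.
        boolean first = SEEN.add(kind + ":" + type);
        if (first) {
            LOG.info("Skipping unsupported Tiptap " + kind + ": " + type);
        }
        return first;
    }
}
```

A serializer built this way would call `reportOnce` from the default branch of its node and mark dispatch and then continue with the next sibling, so a schema extension never turns into an exception.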
Mirrors the naming of the existing toHtml() Renderable surface so Velocity callers use `$contentlet.storyBlock.toMarkdown` alongside `$contentlet.storyBlock.toHtml`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #35727
Summary
- `com.dotcms.tiptap.TiptapMarkdown`: bidirectional converter between Tiptap JSON (Story Block / ProseMirror) and Markdown.
- `$contentlet.storyBlock.toMarkdown()` (StoryBlockMap)
- `$markdownTool.blockToMarkdown(json)` (MarkdownTool)
- `org.commonmark:commonmark` + `-ext-gfm-tables` + `-ext-gfm-strikethrough` (0.22.0). Zero transitive runtime deps (~250KB total).

What it handles
Nodes: paragraph, heading 1-6, blockquote, bulletList, orderedList, listItem, codeBlock (with language), horizontalRule, hardBreak, image, table/tableRow/tableHeader/tableCell, plus dotCMS-specific `dotImage` and `youtube`.

Marks: bold, italic, strike, code, link.

Graceful degradation: marks with no markdown equivalent (`underline`, `highlight`, `subscript`, `superscript`, `textStyle`, `color`) are dropped silently. Any other unknown node/mark logs once at INFO via `Logger.info` and is skipped. Tiptap is extensible, so the converter never throws on user-extended schemas.

Notable correctness details
- `*x *` is invalid). The serializer extracts trailing whitespace out of mark spans before emitting closers, and leading whitespace before openers, so output is always well-formed and parses back to the same structure.
- Text inside `code` marks or `codeBlock` nodes is emitted literally; special chars are NOT backslash-escaped.
- A `codeBlock` whose body contains triple backticks gets a longer fence (4+ ticks) so the fence can't collide.

Test plan
- `TiptapMarkdownTest`: 49 synthetic unit tests covering every supported node, every mark, escaping, fence-width, JSON-string overload, round-trip stability per node type.
- `TiptapMarkdownBlogContentTest`: 7 tests against `blog-test.json` (trimmed to 2 real Story Block bodies, 122KB), verifying:
- Compile (`./mvnw compile -pl :dotcms-core`) clean.
- Call `$contentlet.storyBlock.toMarkdown()` in a Velocity template to sanity-check end-to-end wiring.

Out of scope (documented)
- `youtube` renders as a plain link to the video src (markdown has no native embed). Reviewer call: switch to an `<iframe>` HTML block if richer rendering is wanted.
- `underline` etc. are intentionally lossy on the JSON→MD direction since markdown lacks the syntax.

🤖 Generated with Claude Code
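Two of the points under "Notable correctness details" lend themselves to a short illustration: hoisting edge whitespace out of mark spans, and picking a code fence that cannot collide with the body. The sketch below is not taken from the PR. `MarkdownEmitSketch`, `wrapMark`, `fenceFor`, and `codeBlock` are hypothetical names, and the handling of whitespace-only marked text (emit it without delimiters) is an assumption.

```java
// Hypothetical helpers illustrating two serializer rules; not the PR's code.
final class MarkdownEmitSketch {

    private MarkdownEmitSketch() {
    }

    /**
     * Wraps text in a mark delimiter such as "**" or "*", keeping leading and
     * trailing whitespace outside the delimiters so the emphasis span is valid.
     */
    static String wrapMark(String text, String delimiter) {
        int start = 0;
        int end = text.length();
        // Skip leading whitespace: it is emitted before the opener.
        while (start < end && Character.isWhitespace(text.charAt(start))) {
            start++;
        }
        // Skip trailing whitespace: it is emitted after the closer.
        while (end > start && Character.isWhitespace(text.charAt(end - 1))) {
            end--;
        }
        if (start == end) {
            // Assumption: whitespace-only text gets no delimiters at all.
            return text;
        }
        return text.substring(0, start)
                + delimiter + text.substring(start, end) + delimiter
                + text.substring(end);
    }

    /**
     * Returns a backtick fence one longer than the longest backtick run in the
     * body, with a minimum of three. Counting runs anywhere in the body is
     * stricter than needed, but keeps the rule simple.
     */
    static String fenceFor(String body) {
        int longest = 0;
        int run = 0;
        for (int i = 0; i < body.length(); i++) {
            if (body.charAt(i) == '`') {
                run++;
                longest = Math.max(longest, run);
            } else {
                run = 0;
            }
        }
        return "`".repeat(Math.max(3, longest + 1));
    }

    /** Emits a fenced code block; the body is written literally, unescaped. */
    static String codeBlock(String language, String body) {
        String fence = fenceFor(body);
        return fence + language + "\n" + body + "\n" + fence + "\n";
    }
}
```

With these helpers, a bold text node containing `"x "` would be emitted as `**x** ` instead of `**x **`, and a code block whose body contains a three-backtick line would be wrapped in a four-backtick fence.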