
Fix parsing of tag descriptions with continuation lines starting with * #444

Open
tmdk wants to merge 1 commit into phpDocumentor:6.x from tmdk:fix/continuation-lines-with-star

tmdk commented Feb 25, 2026

When a tag description contains continuation lines that begin with *, the * appears twice in the parsed output. This affects block-style @param descriptions (common in WordPress docblocks) where list items or aligned @type continuation lines begin with a star:

/**
 * @param array $foo {
 *     Description of foo.
 *
 *     @type string $bar Description of bar with
 *                       * a list
 *                       * spanning *multiple* lines
 * }
 */

The parsed description for $foo would incorrectly contain ** a list instead of * a list.

Root cause

AbstractPHPStanFactory::tokenizeLine() worked around a PHPStan lexer limitation by splitting TOKEN_PHPDOC_EOL tokens that contained trailing whitespace into two tokens: a bare EOL and a separate TOKEN_HORIZONTAL_WS. This placed * in both token values. When PHPStan's parseText() stopped early at a tag-like token (e.g. @type) and rolled back, joinUntil() concatenated both raw token values, doubling the star.

Fix

Prefix every continuation line with "* " (star + space) before tokenizing. The PHPStan lexer's TOKEN_PHPDOC_EOL pattern matches \n, then optional whitespace, then * and one space, so it always consumes the inserted "* " as part of the EOL token and leaves the original indentation as a separate subsequent token. Applying trim(..., "* \t") to every EOL token value then strips the inserted "* " back out, reducing each EOL to a plain "\n". The manual split into a separate TOKEN_HORIZONTAL_WS is no longer needed.
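
The prefix-and-trim idea can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: it assumes a simplified EOL pattern (newline, optional tabs or spaces, an optional star not followed by "/", and at most one space) standing in for the lexer's TOKEN_PHPDOC_EOL, and the names EOL_PATTERN, firstEolToken() and tokenizeSketch() are hypothetical.

```php
<?php
declare(strict_types=1);

// ASSUMPTION: a simplified stand-in for the lexer's EOL pattern, not copied
// from PHPStan: newline, optional tabs/spaces, then an optional "*" (not
// followed by "/") with at most one trailing space.
const EOL_PATTERN = '~(\n[\t ]*(?:\*(?!/) ?)?)~';

/** Return the first EOL token the sketch pattern finds in $text ('' if none). */
function firstEolToken(string $text): string
{
    return preg_match(EOL_PATTERN, $text, $m) === 1 ? $m[0] : '';
}

/**
 * Hypothetical tokenizer: prefix continuation lines with "* ", split on EOL
 * tokens, then trim every EOL token back down to a plain "\n".
 *
 * @return list<string> alternating text and EOL token values
 */
function tokenizeSketch(string $text): array
{
    $prefixed = str_replace("\n", "\n* ", $text);
    $parts = preg_split(EOL_PATTERN, $prefixed, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) { // odd indexes hold the captured EOL tokens
            $parts[$i] = trim($part, "* \t");
        }
    }

    return $parts;
}

$text = "Description of bar with\n    * a list\n    * spanning *multiple* lines";

// Without the prefix, the EOL token swallows the indentation and the list marker.
$swallowed = firstEolToken($text); // "\n    * "

// With the prefix, each EOL token is exactly "\n* ", which trims to "\n".
$tokens = tokenizeSketch($text);
$roundTrip = implode('', $tokens); // identical to $text
```

In this sketch, joining the token values reproduces the input exactly, so a rollback that concatenates raw token values has no extra star to duplicate. The list marker and its indentation stay in the text token that follows each EOL.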

Changes

  • src/DocBlock/Tags/Factory/AbstractPHPStanFactory.php — revised tokenizeLine()
  • tests/integration/InterpretingDocBlocksTest.php — regression test covering the
    block-description case with embedded @type lines and star-prefixed continuation lines

…iptions

When a tag description contains continuation lines beginning with *, the *
was duplicated in the parsed output. This affected block-style @param
descriptions (e.g. WordPress-style docblocks with @type lines and star-
prefixed list items).

Root cause: tokenizeLine() split TOKEN_PHPDOC_EOL tokens with trailing
whitespace into a bare EOL + TOKEN_HORIZONTAL_WS pair, placing * in both
token values. When PHPStan rolled back past such a token on encountering a
tag-like token (@type), joinUntil() concatenated both raw values, doubling
the star.

Fix: prefix every continuation line with "* " before tokenizing. The lexer
always consumes the inserted "* " as part of TOKEN_PHPDOC_EOL, leaving
original indentation as a separate subsequent token. An unconditional
trim(..., "* \t") on every EOL token value strips the inserted "* " back
out. The manual split into TOKEN_HORIZONTAL_WS is no longer needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>