Summary
ExpressionColumnConfig currently treats any per-row Jinja render failure as a full-column failure. This makes expression columns much more brittle than LLM-backed columns: one bad row can abort the whole dataset build even when the expression is valid for the rest of the batch.
Feature request: keep expression columns as full-column processing, but handle row-level render failures by dropping only the affected rows and reporting structured warning counts. If every row is dropped, fail the column as a user template/config error.
Current behavior
In the current sync engine path, ExpressionColumnGenerator.generate(...) renders the expression row by row inside a full-column generator. If one row renders to empty text, UserTemplateSandboxEnvironment.validate_rendered_text(...) raises UserTemplateError("User template renders to empty text."). That exception escapes the generator and ColumnWiseBuilder._run_batch(...) reports the entire expression column as failed.
This differs from LLM-backed columns, which use cell-by-cell generation. Their worker error callback marks only the failing record for omission and continues processing the remaining rows.
Concrete example
A workflow generated math problems, reviewed each generated row with an llm-structured column, and then projected the reviewed answer into a required output field:
ExpressionColumnConfig(
    name="output",
    expr="{{ review.canonical_answer }}",
    dtype="str",
)
The upstream review column produced valid structured objects, but a small number of generated examples were intentionally judged invalid or unsolvable by the review model. For those rows, the structured review contained an empty canonical answer:
{
  "review": {
    "is_valid": false,
    "canonical_answer": "",
    "issue": "Under-determined problem; multiple solutions exist."
  }
}
Observed outcome in a 1024-row generation:
- 1010 rows had a non-empty review.canonical_answer
- 14 rows had review.canonical_answer == ""
- the expression column failed on the first empty render
- the job exited nonzero before writing the final output column
- downstream consumers saw a partial parquet shard without the required output column and had to treat the whole generation as failed
The relevant log looked like:
[INFO] Generating column `output` from expression
UserTemplateError: User template renders to empty text.
DatasetGenerationError: Failed to process column 'output':
User provided prompt generation template is invalid.
From the user's perspective this is surprising because a tiny number of bad model-generated rows can invalidate the entire expression column, even though LLM generation failures elsewhere are handled as per-record drops.
Requested behavior
For expression columns, keep full-column processing, but introduce row-level error handling during expression rendering:
- Render the expression for each row as today.
- If the rendered value is None, empty, or whitespace-only, drop that row instead of failing the whole column.
- If rendering raises a row-specific error, drop that row instead of failing the whole column.
- Track dropped rows by error category.
- Log a warning that includes the column name, total dropped count, total input rows, and a breakdown by error type.
- If all rows are dropped, raise a UserTemplateError so clearly broken expressions still fail loudly.
Example warning for partial drops:
[WARNING] Expression column 'output' dropped 14/1024 rows after render: EmptyRenderedExpression=14. Continuing with 1010 rows.
Example all-dropped failure:
[ERROR] Expression column 'output' dropped 1024/1024 rows after render: EmptyRenderedExpression=1024.
UserTemplateError: Expression column 'output' produced no valid rows.
Suggested error categories could include names like:
- EmptyRenderedExpression for None, empty, or whitespace-only render results
- TemplateRenderError for row-specific Jinja rendering exceptions
- TypeCastError for failures converting the rendered value to the configured dtype
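To make the request concrete, here is a minimal sketch of the row-level handling described above. It is not the library's implementation: `render_with_row_drops`, its parameters, and the local `UserTemplateError` class are hypothetical stand-ins, and `render_row` represents whatever per-row Jinja rendering the generator already does.

```python
import logging
from collections import Counter
from typing import Any, Callable, Iterable

logger = logging.getLogger(__name__)


class UserTemplateError(Exception):
    """Local stand-in for the library's UserTemplateError (assumption)."""


def render_with_row_drops(
    column_name: str,
    rows: Iterable[dict[str, Any]],
    render_row: Callable[[dict[str, Any]], Any],  # hypothetical per-row renderer
    cast: Callable[[Any], Any] = str,             # hypothetical dtype conversion
) -> tuple[list[dict[str, Any]], Counter]:
    """Render every row; drop failing rows instead of aborting the column."""
    kept: list[dict[str, Any]] = []
    dropped: Counter = Counter()
    total = 0
    for row in rows:
        total += 1
        try:
            rendered = render_row(row)
        except Exception:
            # A real implementation would likely catch only Jinja render errors.
            dropped["TemplateRenderError"] += 1
            continue
        if rendered is None or not str(rendered).strip():
            dropped["EmptyRenderedExpression"] += 1
            continue
        try:
            value = cast(rendered)
        except (TypeError, ValueError):
            dropped["TypeCastError"] += 1
            continue
        kept.append({**row, column_name: value})

    n_dropped = sum(dropped.values())
    if n_dropped:
        breakdown = ", ".join(f"{k}={v}" for k, v in sorted(dropped.items()))
        summary = (
            f"Expression column '{column_name}' dropped "
            f"{n_dropped}/{total} rows after render: {breakdown}."
        )
        if not kept:
            # Every row failed: treat it as a broken expression and fail loudly.
            logger.error(summary)
            raise UserTemplateError(
                f"Expression column '{column_name}' produced no valid rows."
            )
        logger.warning("%s Continuing with %d rows.", summary, len(kept))
    return kept, dropped
```

The function still walks the whole column in one call, so full-column processing is preserved; only the failure granularity changes. Returning the counter alongside the kept rows would let the builder surface the same counts in yield metrics.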
Static configuration errors should still fail immediately. For example, missing required columns, invalid Jinja syntax, unsupported template operations, or an invalid expression dtype should remain full-column/user-template failures before row-level processing begins.
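The ordering matters here: static checks run once, before any row is rendered, so they can never be downgraded to row drops. A rough sketch of that pre-check, assuming a hypothetical `check_static_config` helper, an illustrative `SUPPORTED_DTYPES` set, and a local `UserTemplateError` stand-in (none of these are confirmed library names or values):

```python
from typing import Iterable

# Illustrative only; the real set of supported dtypes is defined by the library.
SUPPORTED_DTYPES = {"str", "int", "float", "bool"}


class UserTemplateError(Exception):
    """Local stand-in for the library's UserTemplateError (assumption)."""


def check_static_config(
    required_columns: Iterable[str],
    available_columns: Iterable[str],
    dtype: str,
) -> None:
    """Fail the whole column up front for errors that affect every row."""
    missing = sorted(set(required_columns) - set(available_columns))
    if missing:
        raise UserTemplateError(
            f"Expression references missing columns: {missing}"
        )
    if dtype not in SUPPORTED_DTYPES:
        raise UserTemplateError(f"Unsupported expression dtype: {dtype!r}")
```

Jinja syntax validation and unsupported-operation checks would sit in the same pre-check phase; they are omitted here to keep the sketch free of third-party imports.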
Impact on existing users
Positive impact:
- Makes expression columns consistent with model-backed columns when failures are caused by individual records.
- Prevents large generation jobs from being invalidated by a few stochastic upstream outputs.
- Improves robustness for common patterns where expression columns project or normalize fields produced by LLM columns, validators, or judges.
- Preserves visibility into quality problems through warning counts and downstream row counts/yield metrics.
Behavior change to be aware of:
- A workflow that currently fails on the first empty expression render would instead complete with fewer rows.
- This could mask some user mistakes if users ignore warnings. The all-dropped case should still fail, and static template/config errors should still fail before row processing.
- If maintainers want a transition path, this could be exposed through a run/config flag, but the default row-drop behavior would better match existing LLM cell failure semantics.
Why this matters
Expression columns are often deterministic only in syntax. In real pipelines, they frequently depend on stochastic upstream LLM-generated fields. Treating every render failure as a global configuration error is too strict for that usage pattern. Dropping only the affected rows gives users the same resilience they already get from LLM columns while retaining fail-fast behavior for truly broken expressions.