
Commit 63492a8

teunbrand, claude, and thomasp85 authored
Text layer (#155)
* Add fontsize aesthetic for linear text sizing Introduces a separate 'fontsize' aesthetic as an alternative to 'size' for text/label geoms. Unlike 'size' (which uses area-based scaling with radius² conversion for point marks), 'fontsize' uses linear scaling for font sizes. Changes: - Grammar: Add 'fontsize' to aesthetic names - Geoms: Add 'fontsize' to Text and Label supported aesthetics - Aesthetics: Register 'fontsize' in NON_POSITIONAL list - Writer: Map 'fontsize' → 'size' channel in Vega-Lite output - Scale: Add default range [8.0, 20.0] for fontsize aesthetic - Tests: Add test_fontsize_linear_scaling integration test Usage: DRAW text MAPPING x AS x, y AS y, value AS fontsize SCALE fontsize TO [10, 20] -- Linear: 10pt to 20pt (not area-converted) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Implement TextRenderer with data-splitting for font properties Add TextRenderer implementation that handles font aesthetics (family, fontface, hjust, vjust) by splitting data into multiple Vega-Lite layers when font properties vary across rows. Key features: - Single-layer optimization: When all fonts are constant, generates one layer with mark properties set directly - Multi-layer splitting: When fonts vary, creates one layer per unique font combination while preserving ORDER BY - Proper SOURCE_COLUMN filtering: Uses empty string for single-layer and suffix keys for multi-layer to match BoxplotRenderer pattern - Font property mapping: - family → mark.font - fontface → mark.fontWeight/fontStyle - hjust → mark.align - vjust → mark.baseline Tests included for both constant and varying font cases. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Simplify FontStrategy by unifying single and multi-layer cases Remove the FontStrategy enum variants and use a single struct with a groups vector. The single-layer case now has 1 group containing all rows, while the multi-layer case has N groups. 
Benefits: - Eliminates redundant code paths (no more match statements) - Simpler prepare_data() - just iterate over groups - Simpler finalize() - unified layer generation logic - Fewer lines of code overall Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Remove TextMetadata wrapper, use FontStrategy directly TextMetadata was simply wrapping FontStrategy with no additional value. Store FontStrategy directly in PreparedData metadata instead. This eliminates 4 lines and one level of indirection. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Remove unused signature field from FontGroup The signature field was only used during group construction as a HashMap key to track row assignments. After groups are built, the field was never accessed (marked with #[allow(dead_code)]). Removed the field and its assignments, keeping the local signature variable for grouping logic. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Simplify TextRenderer by using HashMap for grouping Eliminated FontGroup struct and common_properties field by: - Using HashMap<String, (properties, indices)> for grouping during construction, then converting to sorted Vec - Storing all properties (constant + varying) in each group's HashMap - Using plain tuples (HashMap<String, Value>, Vec<usize>) instead of a dedicated struct This reduces code by 24 net lines while maintaining the same functionality. Properties are now the HashMap keys (via signature) and row indices are values, making the data structure more direct. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Remove FontStrategy wrapper struct FontStrategy was just wrapping a single Vec. Eliminated it by: - Returning Vec<(HashMap<String, Value>, Vec<usize>)> directly from analyze_font_columns() - Storing the Vec directly as metadata in PreparedData::Composite - Downcasting to Vec type directly in finalize() This removes 7 net lines while maintaining identical functionality. 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Use HashMap<FontKey, Vec<usize>> with direct property conversion Refactored TextRenderer to use FontKey tuple containing converted Vega-Lite Values instead of intermediate structures: - FontKey = (family, fontWeight, fontStyle, align, baseline) as Values - convert_fontface returns (fontWeight, fontStyle) tuple - Properties converted once during grouping (in analyze_font_columns) - finalize_layers directly inserts Values into mark object - Eliminated font_key_to_properties, apply_mark_property, and map_aesthetic_to_mark_property helpers Benefits: - No string signatures or intermediate HashMaps - Properties converted once per unique combination (not per row) - Simpler finalize_layers with direct value insertion - No special-case spreading logic for fontface This removes 70 net lines while maintaining identical functionality. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Sort font groups once in analyze_font_columns Changed analyze_font_columns to return Vec<(FontKey, Vec<usize>)> instead of HashMap, with sorting done once at the end of grouping. Before: HashMap was sorted twice - once in prepare_data() and again in finalize_layers() to maintain consistent ordering. After: Groups are sorted once after HashMap construction in analyze_font_columns(), then both prepare_data() and finalize_layers() iterate the pre-sorted Vec directly. This preserves HashMap's O(1) insertion benefit during construction while eliminating redundant sort operations. 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Use Option<Value> for family and apply clippy suggestions Changes: - convert_family() returns Option<Value> instead of Value - Returns None for empty family strings - Simplifies finalize_layers to use if let Some(family_val) - Apply clippy suggestion: use or_default() instead of or_insert_with(Vec::new) This eliminates the is_none_or check and makes the intent clearer: family is optional and should be omitted from the mark object when not specified. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Split non-contiguous indices to preserve z-order When font groups have non-contiguous row indices (e.g., [0, 2, 5, 6]), split them into separate contiguous ranges ([0], [2], [5, 6]) to preserve rendering order. Example: - Row 0: Arial "A" - Row 1: Courier "B" - Row 2: Arial "C" Before: Arial layer renders A and C together, then B on top After: Three layers render in order: A, then B, then C This ensures that the DRAW clause ORDER BY is respected for z-order stacking, even when rows with the same font properties are interleaved with rows having different properties. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Suppress legend and scale for text encoding The label aesthetic (mapped to Vega-Lite 'text' encoding) should not generate a legend or scale, as text values are literal display strings rather than data values that need scaling or legend representation. 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Refactor TextRenderer to use nested layers with shared encoding Changes: - Use nested layer structure for multi-group text rendering - Single group: returns one layer with full encoding - Multiple groups: returns parent layer with shared encoding, child layers only have mark + transform - Extract helper functions for code reuse: - apply_font_properties: applies font properties to mark object - build_transform_with_filter: creates transform with source filter - Both finalize_single_layer and finalize_nested_layers now use helpers to avoid duplication This approach eliminates duplicate encoding specifications in multi-layer output while preserving z-order through contiguous range splitting. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Add test for text renderer nested layers structure - Verifies nested layer structure is correct for multiple font groups - Tests that parent spec has shared encoding - Tests that child layers only have mark + transform - Tests that font properties are applied to mark objects Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Unify single and nested layer logic in TextRenderer Changes: - Remove finalize_single_layer function - Always use nested layer structure (works for 1 or N groups) - Simplify prepare_data to always use _font_N suffix - Update test expectations This eliminates code duplication and special-case handling for single-group scenarios, reducing implementation by ~24 lines. 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Add angle aesthetic to text geom Changes: - Add 'angle' to supported aesthetics in Text geom - Update FontKey tuple to include angle (6th element) - Extract angle column in analyze_font_columns - Add convert_angle function (parses numeric angle in degrees) - Apply angle property in apply_font_properties - Remove angle from encoding in modify_encoding The angle aesthetic is now handled the same way as other font properties (family, fontface, hjust, vjust) via data-splitting, since Vega-Lite requires it as a mark property. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Complete angle aesthetic implementation with integration test This commit completes the angle aesthetic implementation: Grammar changes: - Add 'angle' to aesthetic keywords in tree-sitter grammar Label geom consistency: - Add 'angle' to supported aesthetics in Label geom - Brings label geom in line with text geom support TextRenderer improvements: - Fix convert_angle to handle both numeric and string columns - Add angle normalization to [0, 360) range - Handle integer, float, and string angle values Integration test: - Add test_text_angle_integration for full SQL → Vega-Lite pipeline - Verifies nested layer structure with angle mark properties - Tests angle normalization and data splitting - Validates non-contiguous index handling The angle aesthetic now works end-to-end: SQL query with angle column → TextRenderer splits data by unique angles → Vega-Lite generates nested layers with angle mark properties. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Refactor TextRenderer to use pure run-length encoding Replace the group-sort-split approach with elegant run-length encoding for handling font property variations in text layers. 
Changes: Algorithm improvement: - Replace HashMap grouping + sorting + contiguous splitting with single-pass RLE scan - Complexity: O(n log n) → O(n) - Memory: 8n bytes per run → 16 bytes per run Type simplification: - Before: Vec<(FontKey, Vec<usize>)> - explicit row indices - After: Vec<(FontKey, usize)> - run lengths with implicit positions - Start positions derived from cumulative run lengths DataFrame operations: - Replace boolean masking (filter_by_indices) with direct slicing - Use df.slice(position, length) - O(1) pointer arithmetic - Remove filter_by_indices helper function entirely Function rename: - analyze_font_columns() → build_font_rle() - Clearer name indicating RLE technique and output type Benefits: - 28 net lines removed (52 insertions, 80 deletions) - Simpler single-pass algorithm - More efficient memory usage - Faster DataFrame operations - All tests pass unchanged The refactoring maintains identical behavior while using the canonical run-length encoding pattern for grouping consecutive rows. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Add nudge_x and nudge_y parameters to text/label geoms Add nudge parameters that map to Vega-Lite's xOffset/yOffset mark properties, allowing fine-grained positioning adjustments for text labels. Changes: Text and Label geoms: - Add nudge_x and nudge_y to default_params - Default to Null (not applied unless explicitly set) TextRenderer: - Build base mark prototype with nudge offsets (if specified) - Clone and extend with font properties for each run - Pass layer to finalize_nested_layers for parameter access Integration test: - Verify nudge_x → xOffset and nudge_y → yOffset mapping - Confirm parameters apply to all nested text layers Usage: DRAW text SETTING nudge_x => 5, nudge_y => -10 This enables fine-tuning text label positions without modifying the underlying x/y data, useful for avoiding overlaps or improving label placement in dense visualizations. 
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * Add format parameter for text label formatting Add template-based label formatting to text/label geoms, reusing the existing format.rs infrastructure from SCALE RENAMING. Changes: format.rs improvements: - Add format_dataframe_column() - clean API for DataFrame column formatting - Refactor to convert columns to strings first, then apply formatting - Add format_value() helper shared by both APIs - Improved error message showing actual datatype for unsupported types - Two-step process: column→string, then template application Text/Label geoms: - Add 'format' parameter (defaults to Null) - Works with both geoms for consistency TextRenderer: - Add apply_label_formatting() helper - Apply formatting in prepare_data() before font analysis - Pass layer parameter through prepare_data() trait method - Update all GeomRenderer implementations Integration tests: - test_text_label_formatting - Title case transformation - test_text_label_formatting_numeric - Printf-style number formatting Supported placeholder syntax: - {} - Plain insertion - {:UPPER} - Uppercase - {:lower} - Lowercase - {:Title} - Title Case - {:time %fmt} - DateTime strftime format - {:num %fmt} - Number printf format Usage: DRAW text SETTING format => 'Region: {:Title}' DRAW text SETTING format => '${:num %.2f}' DRAW text SETTING format => '{:time %b %Y}' The format parameter transforms label values before rendering, enabling clean label presentation without modifying source data. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * soothe compiler * Handle font properties from parameters * Refactor text geom font property handling - Separate value selection from conversion in all convert functions - Use early returns with ? 
operator for cleaner control flow - Inline convert function calls to eliminate intermediate variables - Change property insertion to use if let Some with .insert() - Fix column lookup to use naming::aesthetic_column() - Optimize angle extraction to handle numeric columns without cast->parse - Remove unused FontKey type alias - Fix test_fontsize_linear_scaling to include required label aesthetic All text rendering tests passing (11/11). Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> * specify fontsize in pt * delenda est * docs * fix mismerged test * fix another test expectation * finally do something about this darn test that keeps mucking up test results on my machine * soothe clippy * update docs * exclude label as group aesthetic * move label formatting to post_process method * eradicate label layer * divorce 'fontface' into 'fontweight' and 'italic' * rename `nudge_x/y` to `offset_x/y` * cargo fmt * Update doc/syntax/layer/type/text.qmd Co-authored-by: Thomas Lin Pedersen <thomasp85@gmail.com> * Combine offset_x and offset_y into single (array) offset * fancy approach to fontweight * rename `family` to `typeface` * rename angle to rotation * cargo fmt * soothe clippy --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com> Co-authored-by: Thomas Lin Pedersen <thomasp85@gmail.com>
1 parent 09ab79a commit 63492a8
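
The commit message describes replacing the HashMap grouping with a single-pass run-length encoding, where each run is stored as `(FontKey, usize)` and start positions are derived from cumulative run lengths. The following is a minimal sketch of that idea only, not the code from this commit: `run_length_encode` and `run_ranges` are hypothetical names, and a plain `&str` stands in for the font key.

```rust
/// Collapse consecutive equal keys into (key, run_length) pairs in one pass.
/// Hypothetical helper; the commit's own function is called `build_font_rle`.
fn run_length_encode<K: PartialEq + Clone>(keys: &[K]) -> Vec<(K, usize)> {
    let mut runs: Vec<(K, usize)> = Vec::new();
    for key in keys {
        // Extend the current run if the key matches the previous row.
        if let Some((last, len)) = runs.last_mut() {
            if *last == *key {
                *len += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        runs.push((key.clone(), 1));
    }
    runs
}

/// Derive (start, length) pairs from cumulative run lengths.
/// Each pair could be handed to a slicing call such as `df.slice(start, length)`.
fn run_ranges<K>(runs: &[(K, usize)]) -> Vec<(usize, usize)> {
    let mut start = 0;
    runs.iter()
        .map(|(_, len)| {
            let range = (start, *len);
            start += *len;
            range
        })
        .collect()
}
```

Because runs are never merged across a different key, interleaved rows such as Arial, Courier, Arial stay in three separate runs, which is what keeps the z-order described in the message intact.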

21 files changed

Lines changed: 1864 additions & 107 deletions


CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -334,7 +334,7 @@ pub enum Geom {
     // Statistical geoms
     Histogram, Density, Smooth, Boxplot, Violin,
     // Annotation geoms
-    Text, Label, Segment, Arrow, Rule, Linear, ErrorBar,
+    Text, Segment, Arrow, Rule, Linear, ErrorBar,
 }
 
 pub enum AestheticValue {
@@ -1211,7 +1211,7 @@ Maps data values (columns or literals) to visual aesthetics. Syntax: `value AS a
 - **Position**: `x`, `y`, `xmin`, `xmax`, `ymin`, `ymax`
 - **Color**: `color`, `fill`, `stroke`, `opacity`
 - **Size/Shape**: `size`, `shape`, `linetype`, `linewidth`
-- **Text**: `label`, `family`, `fontface`
+- **Text**: `label`, `typeface`, `fontweight`, `italic`
 
 **Literal vs Column**:
 

doc/ggsql.xml

Lines changed: 3 additions & 2 deletions
@@ -168,8 +168,9 @@
 <item>linewidth</item>
 <item>width</item>
 <item>height</item>
-<item>family</item>
-<item>fontface</item>
+<item>typeface</item>
+<item>fontweight</item>
+<item>italic</item>
 <item>hjust</item>
 <item>vjust</item>
 </list>

doc/syntax/index.qmd

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ There are many different layers to choose from when visualising your data. Some
 - [`area`](layer/type/area.qmd) is used to display series as an area chart.
 - [`ribbon`](layer/type/ribbon.qmd) is used to display series extrema.
 - [`polygon`](layer/type/polygon.qmd) is used to display arbitrary shapes as polygons.
+- [`text`](layer/type/text.qmd) is used to render datapoints as text.
 - [`bar`](layer/type/bar.qmd) creates a bar chart, optionally calculating y from the number of records in each bar.
 - [`density`](layer/type/density.qmd) creates univariate kernel density estimates, showing the distribution of a variable.
 - [`violin`](layer/type/violin.qmd) displays a rotated kernel density estimate.

doc/syntax/layer/type/text.qmd

Lines changed: 139 additions & 0 deletions
@@ -0,0 +1,139 @@
+---
+title: "Text"
+---
+
+> Layers are declared with the [`DRAW` clause](../clause/draw.qmd). Read the documentation for this clause for a thorough description of how to use it.
+
+The text layer displays rows in the data as text. It can be used as a visualisation itself, or used to annotate a different layer.
+
+## Aesthetics
+The following aesthetics are recognised by the text layer.
+
+### Required
+* Primary axis (e.g. `x`): Position along the primary axis.
+* Secondary axis (e.g. `y`): Position along the secondary axis.
+* `label` The text to display.
+
+### Optional
+* `stroke` The colour at the contour lines of glyphs. Typically kept blank.
+* `fill` The colour of the glyphs.
+* `colour` Shorthand for setting `stroke` and `fill` simultaneously.
+* `opacity` The opacity of the fill colour.
+* `typeface` The typeface to style the lettering.
+* `fontsize` The size of the text in points.
+* `fontweight` Font weight. Interpretation is writer dependent. Vega-Lite converts everything to 'normal' or 'bold'. Can be one of the following:
+  * CSS keywords: `'thin'`, `'hairline'`, `'extra-light'`, `'ultra-light'`, `'light'`, `'normal'` (default), `'regular'`, `'lighter'`, `'medium'`, `'semi-bold'`, `'demi-bold'`, `'bold'`, `'bolder'`, `'extra-bold'`, `'ultra-bold'`, `'black'`, `'heavy'`
+  * Numeric values between 0-1000.
+* `italic` Whether text should be italicised. Boolean value (`true` or `false`).
+* `hjust` Horizontal justification. Can be a numeric value between 0-1 or one of `"left"`, `"right"` or `"centre"` (default). Interpretation of numeric values is writer-dependent.
+* `vjust` Vertical justification. Can be a numeric value between 0-1 or one of `"top"`, `"bottom"` or `"middle"` (default). Interpretation of numeric values is writer-dependent.
+* `rotation` Rotation of the text in degrees.
+
+## Settings
+* `offset` Position offset expressed in absolute points. Can be one of the following:
+  * a single number that applies both horizontally and vertically
+  * a numeric array `[h, v]` where the first number is the horizontal offset and the second number is the vertical offset.
+* `format` Formatting specifier, see explanation below.
+* `position`: Determines the position adjustment to use for the layer (default is `'identity'`)
+
+### Format
+The `format` setting can take a string that will be used in formatting the `label` aesthetic.
+The basic syntax for this is that the `label` value will be inserted into any place where `{}` appears.
+This means that e.g. `SETTING format => '{} species'` will result in the label "adelie species" for a row where the `label` value is "adelie".
+Besides simply inserting the value as-is, it is also possible to apply a formatter to `label` before insertion by naming a formatter inside the curly braces prefixed with `:`.
+Known formatters are:
+
+* `{:Title}` will title-case the value (make the first letter in each word upper case) before insertion, e.g. `SETTING format => '{:Title} species'` will become "Adelie species" for the "adelie" label.
+* `{:UPPER}` will make the value upper-case, e.g. `SETTING format => '{:UPPER} species'` will become "ADELIE species" for the "adelie" label.
+* `{:lower}` works much like `{:UPPER}` but changes the value to lower-case instead.
+* `{:time ...}` will format a date/datetime/time value according to the format defined afterwards. The formatting follows strftime format using the Rust chrono library. You can see an overview of the supported syntax at the [chrono docs](https://docs.rs/chrono/latest/chrono/format/strftime/index.html). The basic usage is `SETTING format => '{:time %B %Y}'` which would format a value at 2025-07-04 as "July 2025".
+* `{:num ...}` will format a numeric value according to the format defined afterwards. The format follows the printf format using the Rust sprintf library. The syntax is `%[flags][width][.precision]type` with the following meaning:
+  - `flags`: One or more modifiers:
+    * `-`: Left-justify
+    * `+`: Force sign for positive numbers
+    * ` `: (space) Space before positive numbers
+    * `0`: Zero-pad
+    * `#`: Alternate form (`0x` prefix for hex, etc)
+  - `width`: The minimum width of characters to render. Depending on the `flags` the string will be padded to be at least this width
+  - `precision`: The maximum precision of the number. For `%g`/`%G` it is the total number of digits whereas for the rest it is the number of digits to the right of the decimal point
+  - `type`: How to present the number. One of:
+    * `d`/`i`: Signed decimal integers
+    * `u`: Unsigned decimal integers
+    * `f`/`F`: Decimal floating point
+    * `e`/`E`: Scientific notation
+    * `g`/`G`: Shortest form of `e` and `f`
+    * `o`: Unsigned octal
+    * `x`/`X`: Unsigned hexadecimal
+
+## Data transformation
+The text layer does not transform its data but passes it through unchanged.
+
+## Orientation
+The text layer has no orientation. The axes are treated symmetrically.
+
+## Examples
+
+Drawing data points as labels.
+
+```{ggsql}
+VISUALISE bill_len AS x, bill_dep AS y FROM ggsql:penguins
+DRAW text MAPPING island AS label
+```
+
+You can use the `format` setting to tweak the display of the label.
+
+```{ggsql}
+VISUALISE bill_len AS x, bill_dep AS y FROM ggsql:penguins
+DRAW text
+  MAPPING island AS label
+  SETTING format => '{:UPPER}'
+```
+
+Setting font properties. Colours are typically mapped to the fill.
+
+```{ggsql}
+VISUALISE bill_len AS x, bill_dep AS y FROM ggsql:penguins
+DRAW text
+  MAPPING
+    island AS label,
+    species AS fill,
+    flipper_len AS fontsize
+  SETTING
+    opacity => 0.8,
+    fontweight => 'bold',
+    typeface => 'Times New Roman'
+SCALE fontsize TO [6, 20]
+```
+
+The 'stroke' aesthetic is applied to the outline of the text.
+
+```{ggsql}
+SELECT 1 as x, 1 as y
+VISUALISE x, y, 'My Label' AS label
+DRAW text
+  SETTING fontsize => 30, stroke => 'red'
+```
+
+Labelling precomputed bars with the data value.
+
+```{ggsql}
+SELECT island, COUNT(*) AS n FROM ggsql:penguins GROUP BY island
+VISUALISE island AS x, n AS y
+DRAW bar
+DRAW text
+  MAPPING n AS label
+  SETTING vjust => 'top', offset => [0, -11], fill => 'white'
+```
+
+If you label bars at the extreme end, you may need to expand the scale to accommodate the labels.
+
+```{ggsql}
+SELECT island, COUNT(*) AS n FROM ggsql:penguins GROUP BY island
+VISUALISE island AS x, n AS y
+DRAW bar
+DRAW text
+  MAPPING n AS label
+  SETTING vjust => 'bottom', offset => [0, 11]
+SCALE y FROM [0, 200]
+```
+
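
The new documentation describes `rotation` in degrees and an `offset` that is either a single number or an `[h, v]` array, while the commit message mentions normalising angles to [0, 360) and mapping the offsets to Vega-Lite's `xOffset`/`yOffset` mark properties. The sketch below shows one way these two settings could be resolved into mark values. `Offset`, `MarkPlacement`, `normalise_rotation` and `resolve_placement` are hypothetical names for illustration, and any sign convention the real writer applies to offsets is not shown in this diff.

```rust
/// Hypothetical model of the `offset` setting: one number for both axes,
/// or an `[h, v]` pair.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Offset {
    Uniform(f64),
    Pair(f64, f64),
}

/// Hypothetical container for the resolved mark properties.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MarkPlacement {
    angle: f64,
    x_offset: f64,
    y_offset: f64,
}

/// Wrap an angle in degrees into the half-open range [0, 360).
fn normalise_rotation(degrees: f64) -> f64 {
    degrees.rem_euclid(360.0)
}

/// Expand the offset into horizontal and vertical components and
/// normalise the rotation.
fn resolve_placement(rotation: f64, offset: Offset) -> MarkPlacement {
    let (x_offset, y_offset) = match offset {
        Offset::Uniform(d) => (d, d),
        Offset::Pair(h, v) => (h, v),
    };
    MarkPlacement {
        angle: normalise_rotation(rotation),
        x_offset,
        y_offset,
    }
}
```

With this sketch, `offset => [0, -11]` from the bar-label example would resolve to a horizontal component of 0 and a vertical component of -11, and a rotation of -90 would be stored as 270.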

ggsql-vscode/syntaxes/ggsql.tmLanguage.json

Lines changed: 1 addition & 1 deletion
@@ -249,7 +249,7 @@
 "patterns": [
   {
     "name": "support.type.aesthetic.ggsql",
-    "match": "\\b(x|y|xmin|xmax|ymin|ymax|xend|yend|weight|color|colour|fill|stroke|opacity|size|shape|linetype|linewidth|width|height|label|family|fontface|hjust|vjust|panel|row|column)\\b"
+    "match": "\\b(x|y|xmin|xmax|ymin|ymax|xend|yend|weight|color|colour|fill|stroke|opacity|size|shape|linetype|linewidth|width|height|label|typeface|fontweight|italic|hjust|vjust|panel|row|column)\\b"
   }
 ]
 },

src/execute/mod.rs

Lines changed: 7 additions & 3 deletions
@@ -687,8 +687,12 @@ fn add_discrete_columns_to_partition_by(
         .map(|c| c.name.as_str())
         .collect();
 
-    // Get aesthetics consumed by stat transforms (if any)
+    // Build set of excluded aesthetics that should not trigger auto-grouping:
+    // - Stat-consumed aesthetics (transformed, not grouped)
+    // - 'label' aesthetic (text content to display, not grouping categories)
     let consumed_aesthetics = layer.geom.stat_consumed_aesthetics();
+    let mut excluded_aesthetics: HashSet<&str> = consumed_aesthetics.iter().copied().collect();
+    excluded_aesthetics.insert("label");
 
     for (aesthetic, value) in &layer.mappings.aesthetics {
         // Skip positional aesthetics - these should not trigger auto-grouping.
@@ -698,8 +702,8 @@ fn add_discrete_columns_to_partition_by(
             continue;
         }
 
-        // Skip stat-consumed aesthetics (they're transformed, not grouped)
-        if consumed_aesthetics.contains(&aesthetic.as_str()) {
+        // Skip excluded aesthetics (stat-consumed or label)
+        if excluded_aesthetics.contains(aesthetic.as_str()) {
             continue;
         }
 
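
The hunk above widens the skip list for auto-grouping from stat-consumed aesthetics to stat-consumed aesthetics plus `label`. As a rough standalone illustration of that filtering rule, the sketch below works on plain string slices instead of the project's layer types. `grouping_aesthetics` and its parameters are hypothetical and are not part of this commit.

```rust
use std::collections::HashSet;

/// Return the mapped aesthetics that would still be candidates for
/// auto-grouping: everything that is neither positional, nor consumed by a
/// stat transform, nor the `label` aesthetic.
fn grouping_aesthetics<'a>(
    mapped: &[&'a str],
    stat_consumed: &[&'a str],
    positional: &[&'a str],
) -> Vec<&'a str> {
    // Same construction as in the diff: start from the stat-consumed
    // aesthetics and add `label` to the excluded set.
    let mut excluded: HashSet<&str> = stat_consumed.iter().copied().collect();
    excluded.insert("label");

    mapped
        .iter()
        .copied()
        .filter(|a| !positional.iter().any(|p| *p == *a))
        .filter(|a| !excluded.contains(*a))
        .collect()
}
```

The point of excluding `label` is that label text is usually unique per row, so partitioning by it would put nearly every row in its own group.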

src/format.rs

Lines changed: 92 additions & 16 deletions
@@ -179,29 +179,105 @@ pub fn apply_label_template(
         }
         let key = elem.to_key_string();
 
-        let break_val = key.clone();
         // Only apply template if no explicit mapping exists
-        result.entry(key).or_insert_with(|| {
-            let label = if placeholders.is_empty() {
-                // No placeholders - use template as literal string
-                template.to_string()
-            } else {
-                // Replace each placeholder with its transformed value
-                // Process in reverse order to preserve string indices
-                let mut label = template.to_string();
-                for parsed in placeholders.iter().rev() {
-                    let transformed = apply_transformation(&break_val, &parsed.placeholder);
-                    label = label.replace(&parsed.match_text, &transformed);
-                }
-                label
-            };
-            Some(label)
+        result.entry(key.clone()).or_insert_with(|| {
+            // Use shared format_value helper
+            Some(format_value(&key, template, &placeholders))
         });
     }
 
     result
 }
 
+/// Apply label formatting template to a DataFrame column.
+///
+/// Returns a new DataFrame with the specified column formatted according to the template.
+///
+/// # Arguments
+/// * `df` - DataFrame containing the column to format
+/// * `column_name` - Name of the column to format
+/// * `template` - Template string with placeholders (e.g., "{:Title}", "{:num %.2f}")
+///
+/// # Returns
+/// New DataFrame with formatted column
+///
+/// # Example
+/// ```ignore
+/// let formatted_df = format_dataframe_column(&df, "_aesthetic_label", "Region: {:Title}")?;
+/// ```
+pub fn format_dataframe_column(
+    df: &polars::prelude::DataFrame,
+    column_name: &str,
+    template: &str,
+) -> Result<polars::prelude::DataFrame, String> {
+    use polars::prelude::*;
+
+    // Get the column
+    let column = df
+        .column(column_name)
+        .map_err(|e| format!("Column '{}' not found: {}", column_name, e))?;
+
+    // Step 1: Convert entire column to strings
+    let string_values: Vec<Option<String>> = if let Ok(str_col) = column.str() {
+        // String column (includes temporal data auto-converted to ISO format)
+        str_col
+            .into_iter()
+            .map(|opt| opt.map(|s| s.to_string()))
+            .collect()
+    } else if let Ok(num_col) = column.cast(&DataType::Float64) {
+        // Numeric column - use shared format_number helper for clean integer formatting
+        use crate::plot::format_number;
+
+        let f64_col = num_col
+            .f64()
+            .map_err(|e| format!("Failed to cast column to f64: {}", e))?;
+
+        f64_col
+            .into_iter()
+            .map(|opt| opt.map(format_number))
+            .collect()
+    } else {
+        return Err(format!(
+            "Formatting doesn't support type {:?} in column '{}'. Try string or numeric types instead.",
+            column.dtype(),
+            column_name
+        ));
+    };
+
+    // Step 2: Apply formatting template to all string values
+    let placeholders = parse_placeholders(template);
+    let formatted_values: Vec<Option<String>> = string_values
+        .into_iter()
+        .map(|opt| opt.map(|s| format_value(&s, template, &placeholders)))
+        .collect();
+
+    let formatted_col = Series::new(column_name.into(), formatted_values);
+
+    // Replace column in DataFrame
+    let mut new_df = df.clone();
+    new_df
+        .replace(column_name, formatted_col)
+        .map_err(|e| format!("Failed to replace column: {}", e))?;
+
+    Ok(new_df)
+}
+
+/// Format a single value using template and parsed placeholders
+fn format_value(value: &str, template: &str, placeholders: &[ParsedPlaceholder]) -> String {
+    if placeholders.is_empty() {
+        // No placeholders - use template as literal string
+        template.to_string()
+    } else {
+        // Replace each placeholder with its transformed value
+        let mut result = template.to_string();
+        for parsed in placeholders.iter().rev() {
+            let transformed = apply_transformation(value, &parsed.placeholder);
+            result = result.replace(&parsed.match_text, &transformed);
+        }
+        result
+    }
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
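
`format_value` relies on `parse_placeholders` and `apply_transformation`, neither of which is shown in this hunk. To make the template behaviour concrete, here is a small self-contained sketch that handles only the `{}`, `{:UPPER}`, `{:lower}` and `{:Title}` placeholders from the documentation. It uses a simple forward scan rather than the reverse `replace` loop above; `format_label` and `title_case` are hypothetical names, and the `{:time ...}` and `{:num ...}` formatters are deliberately left untouched.

```rust
/// Upper-case the first letter of each whitespace-separated word.
fn title_case(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut at_word_start = true;
    for ch in s.chars() {
        if ch.is_whitespace() {
            at_word_start = true;
            out.push(ch);
        } else if at_word_start {
            out.extend(ch.to_uppercase());
            at_word_start = false;
        } else {
            out.push(ch);
        }
    }
    out
}

/// Substitute `value` into every supported placeholder of `template`.
/// Unknown placeholders are copied through verbatim.
fn format_label(template: &str, value: &str) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Stop at an unmatched brace; the remainder is appended below.
        let Some(close_rel) = rest[open..].find('}') else { break };
        let close = open + close_rel;
        out.push_str(&rest[..open]);
        match &rest[open + 1..close] {
            "" => out.push_str(value),
            ":UPPER" => out.push_str(&value.to_uppercase()),
            ":lower" => out.push_str(&value.to_lowercase()),
            ":Title" => out.push_str(&title_case(value)),
            _ => out.push_str(&rest[open..=close]),
        }
        rest = &rest[close + 1..];
    }
    out.push_str(rest);
    out
}
```

One difference from `format_value` worth noting: with no placeholders at all, both return the template as a literal string, so a template such as `'n/a'` replaces every label.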

src/parser/builder.rs

Lines changed: 0 additions & 1 deletion
@@ -611,7 +611,6 @@ fn parse_geom_type(text: &str) -> Result<Geom> {
         "boxplot" => Ok(Geom::boxplot()),
         "violin" => Ok(Geom::violin()),
         "text" => Ok(Geom::text()),
-        "label" => Ok(Geom::label()),
         "segment" => Ok(Geom::segment()),
         "arrow" => Ok(Geom::arrow()),
         "rule" => Ok(Geom::rule()),

src/plot/aesthetic.rs

Lines changed: 5 additions & 3 deletions
@@ -58,7 +58,7 @@ pub const USER_FACET_AESTHETICS: &[&str] = &["panel", "row", "column"];
 /// - Color aesthetics: color, colour, fill, stroke, opacity
 /// - Size/shape aesthetics: size, shape, linetype, linewidth
 /// - Dimension aesthetics: width, height
-/// - Text aesthetics: label, family, fontface, hjust, vjust
+/// - Text aesthetics: label, typeface, fontweight, italic, hjust, vjust
 pub const NON_POSITIONAL: &[&str] = &[
     "color",
     "colour",
@@ -72,8 +72,10 @@ pub const NON_POSITIONAL: &[&str] = &[
     "width",
     "height",
     "label",
-    "family",
-    "fontface",
+    "typeface",
+    "fontweight",
+    "italic",
+    "fontsize",
     "hjust",
     "vjust",
 ];
