fix: include all mapped columns in stat transform GROUP BY#240
Closed
fix: include all mapped columns in stat transform GROUP BY#240
Conversation
Integer columns mapped to non-positional aesthetics like fill/color were dropped by stat transforms because only String/Boolean/Categorical columns were added to partition_by. Now all non-positional, non-stat-consumed mapped columns are included in GROUP BY, matching ggplot2's grouping behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator
Author
|
@thomasp85 Take this with a heavy grain of salt. I haven't investigated this change at all yet -- more so just proposing it to promote awareness and get a conversation going. |
Collaborator
|
I'm very much against this. If you wish to use an integer column then make it discrete (which will make it part of the grouping). You are already saying that you meant for the column to be discrete but it happened to be numeric - the fix is to explicitly make it discrete |
Collaborator
Author
|
Ahh, right, I'm happy to work around this via prompting. This might a good case to keep in mind for #85? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #239.
Integer columns mapped to non-positional aesthetics like
fillorcolorwere being dropped by stat transforms (e.g., bar's COUNT). This happened because onlyString/Boolean/Categoricalcolumns were considered "discrete" and added topartition_by— integer columns were skipped, causing them to be excluded fromGROUP BY.Before:
DRAW bar MAPPING sex AS x, survived AS fill(wheresurvivedis integer 0/1) → validation error becausefillcolumn was dropped by stat transform.After: All non-positional, non-stat-consumed mapped columns are added to
partition_byregardless of data type, matching ggplot2's grouping behavior.Changes
src/execute/mod.rs: Renamedadd_discrete_columns_to_partition_by→add_mapped_columns_to_partition_byand removed the discreteness filter. Simplified the function signature (removed unusedscalesandaesthetic_ctxparameters).src/execute/layer.rs,src/execute/position.rs: Updated comments referencing the renamed function.test_bar_with_integer_fill_columnverifies integer fill columns survive bar stat transforms.Test plan
test_bar_with_integer_fill_columnpasses🤖 Generated with Claude Code