Bahex commented Dec 10, 2024

Description

Add aggregate, a command that operates on the output of group-by --to-table and computes per-group aggregates (count, min, avg, max, and sum by default) for quick inspections.

Related

Examples

open ~/Downloads/movies.csv
  | group-by Lead_Studio Genre --to-table
  | aggregate Worldwide_Gross
  # | first 4
  # | to md
| Lead_Studio | Genre | count | Worldwide_Gross_min | Worldwide_Gross_avg | Worldwide_Gross_max | Worldwide_Gross_sum |
| --- | --- | --- | --- | --- | --- | --- |
| The Weinstein Company | Comedy | 1 | 19.62 | 19.62 | 19.62 | 19.62 |
| The Weinstein Company | Drama | 1 | 8.26 | 8.26 | 8.26 | 8.26 |
| Independent | Comedy | 7 | 14.31 | 57.01 | 205.3 | 399.07 |
| Independent | Romance | 7 | 0.03 | 149.82142857142858 | 702.17 | 1048.75 |

open ~/Downloads/movies.csv
  | group-by Lead_Studio Genre --to-table
  | aggregate Worldwide_Gross --ops {avg: {math avg}, std: {math stddev}}
  # | first 4
  # | to md
| Lead_Studio | Genre | count | Worldwide_Gross_avg | Worldwide_Gross_std |
| --- | --- | --- | --- | --- |
| The Weinstein Company | Comedy | 1 | 19.62 | 0 |
| The Weinstein Company | Drama | 1 | 8.26 | 0 |
| Independent | Comedy | 7 | 57.01 | 66.1709932134704 |
| Independent | Romance | 7 | 149.82142857142858 | 229.79475832816996 |

open ~/Downloads/movies.csv
  | group-by Lead_Studio Genre --to-table
  | aggregate Worldwide_Gross Audience_score_% --ops {avg: {math avg}}
  # | first 4
  # | to md
| Lead_Studio | Genre | count | Worldwide_Gross_avg | Audience_score_%_avg |
| --- | --- | --- | --- | --- |
| The Weinstein Company | Comedy | 1 | 19.62 | 52 |
| The Weinstein Company | Drama | 1 | 8.26 | 84 |
| Independent | Comedy | 7 | 57.01 | 60.142857142857146 |
| Independent | Romance | 7 | 149.82142857142858 | 59.857142857142854 |
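
For readers who want to see roughly what the command computes, the count, min, avg, and max columns can be approximated by hand with built-in commands. This is only a sketch, not the PR's implementation. It assumes a Nushell version in which group-by --to-table names its output columns after the groupers and adds an items column, and it uses a small hand-typed table (values borrowed from the min and max columns above) in place of movies.csv.

```nu
# Hand-typed stand-in for movies.csv; NOT the real data set.
let movies = [
  [Lead_Studio Genre Worldwide_Gross];
  ["The Weinstein Company" Comedy 19.62]
  ["The Weinstein Company" Drama 8.26]
  [Independent Comedy 14.31]
  [Independent Comedy 205.3]
  [Independent Romance 0.03]
  [Independent Romance 702.17]
]

# Assumption: each row produced by `group-by ... --to-table` carries the
# grouper columns plus an `items` column holding that group's rows.
let summary = (
  $movies
  | group-by Lead_Studio Genre --to-table
  | each {|group|
      # Pull the values of the column being aggregated for this group.
      let values = $group.items | get Worldwide_Gross
      {
        Lead_Studio: $group.Lead_Studio
        Genre: $group.Genre
        count: ($values | length)
        Worldwide_Gross_min: ($values | math min)
        Worldwide_Gross_avg: ($values | math avg)
        Worldwide_Gross_max: ($values | math max)
      }
    }
)

$summary
```

Each entry passed to --ops in the examples above plays the same role as the math min, math avg, and math max calls here: a closure that receives one group's column values and returns a single value.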

fdncred commented Dec 10, 2024

My initial thought is to just land this PR because I want to use it. But we need to follow the readme so we need tests. The documentation is probably ok but I'd like to see more examples with different agg ops.

I love this script, and I already have a version of it from discord in my own custom commands that I source at every startup. Good work and thanks for adding this here.

Bahex marked this pull request as ready for review December 31, 2024 18:45
fdncred merged commit 8db6af6 into nushell:main Dec 31, 2024

fdncred commented Dec 31, 2024

Thanks! Let's go!

NotTheDr01ds commented Jan 27, 2025

@Bahex I'd like to propose that we promote this into std, but two things are holding me back:

  1. There's now (on Nushell main) a failing test which I believe is due to the changes in nushell#14741 (Add run-time type checking for command pipeline input). I'm not sure how you want to handle that. Previously, you special-cased a helpful error message suggesting group-by --to-table, but I don't think that's possible after the nushell#14741 changes.
  2. Perhaps less important, but at the moment this should probably be in std/tables/aggregate.nu. This will match up with the other PR I still need to move over from the "old" std-rfc. However, the current std implementation doesn't support more than one "nesting". I'd really like to fix that before we promote this, but it may take me a bit.

Bahex commented Jan 27, 2025

@NotTheDr01ds

  1. While not ideal, I can add a second input/output type of record -> error to keep this helpful suggestion? I checked the current error message, and depending on the input record, the error can contain an extremely long input type.
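
To make the record -> error idea concrete, here is a minimal sketch of how such a signature could look. It is not the code from this PR: aggregate-sketch is a hypothetical stand-in that passes tables through unchanged, and the error text is illustrative. The point is that declaring record as an accepted input type lets a record get past the run-time input check and reach a hand-written error with a hint.

```nu
# Hypothetical stand-in for aggregate: only the signature and the error
# path are sketched; the real aggregation logic is omitted.
def aggregate-sketch []: [
    table -> table,
    record -> error,
] {
    let input = $in

    # A record here most likely means `group-by` was used without `--to-table`.
    if ($input | describe | str starts-with "record") {
        error make {
            msg: "input must be a table"
            help: "Are you using `group-by`? Make sure to use its `--to-table` flag."
        }
    }

    # Tables pass through untouched in this sketch.
    $input
}
```

With this in scope, a record piped into aggregate-sketch should fail with the custom message and hint rather than a generic type-mismatch error, while a table passes through unchanged.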