
csvstats: calculated number of bins is too high for the data/random-data.csv dataset #31

@Notgnoshi

Description

```console
# This generates a NaN as the first delta
$ csvdelta -i -c timestamp data/random-data.csv
$ cargo run --bin csvstats -- -c timestamp-deltas data/random-data.csv -H
Stats for column "timestamp-deltas":
    count: 1173
    filtered: 1 (total: 1174)
    Q1: 0.0010528564453125
    median: 0.0010530948638916016
    Q3: 0.0010530948638916016
    min: 0.0010089874267578125 at index: 0
    max: 0.0010700225830078125 at index: 1172
    mean: 0.0010526164006902329
    stddev: 0.000029847184932617655

2025-03-10T00:00:38.348237Z  INFO csvizmo::plot: Using 1350 bins with width 0.0000
```

1350 is more bins than there are samples. Either I have a bug in my Freedman-Diaconis rule calculation, or the rule doesn't give the kind of results I want for this data.

A dataset like this isn't close to normal, so I suspect the KDE estimate is off, and a histogram isn't the most useful way to visualize this data.
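On the KDE side: I don't know which bandwidth selector csvizmo uses, but the common rule-of-thumb one (Silverman's) bakes in a normality assumption, which is one concrete way the estimate can go wrong on clustered data like this. A sketch with the values from the csvstats output above (`silverman_bandwidth` is a hypothetical helper):

```rust
/// Silverman's rule-of-thumb bandwidth for a Gaussian KDE. It is derived
/// under a normality assumption, so it can pick a poor bandwidth for
/// tightly clustered samples with outliers.
fn silverman_bandwidth(stddev: f64, iqr: f64, n: usize) -> f64 {
    0.9 * stddev.min(iqr / 1.34) * (n as f64).powf(-0.2)
}

fn main() {
    // Q3 - Q1 and stddev from the csvstats output above.
    let iqr = 0.0010530948638916016 - 0.0010528564453125;
    let h = silverman_bandwidth(0.000029847184932617655, iqr, 1173);
    // The tiny IQR dominates the min(), yielding a very narrow bandwidth
    // and hence a spiky density estimate.
    println!("bandwidth: {h:e}");
}
```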

[Image: the generated plot for this dataset]

Metadata


Labels: bug (Something isn't working), csv (Tools to work with CSVs)
