Double becoming Int when reading Csv #177

@maxigit

This is a bit similar to issue #157, where a column is Maybe or not depending on luck.

Here the problem is between Double and Int. Yesterday I tried to rerun something I did last week, fed the system a new file, and suddenly a column which was a Double is now an Int. In fairness, the column has always contained natural numbers, but since it is mixed with other columns it is easier to treat it as a Double.

Changing 0 to 0.0 in the csv file fixed the problem, but it wasn't easy to find initially. The error was a runtime error, and I don't remember whether the column name was actually spelled out in the error.
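To illustrate why a single 0 versus 0.0 can flip the column type, here is a toy inferrer. It is only a sketch of one plausible inference rule (pick the narrowest type that parses every sampled cell); `Inferred` and `inferColumn` are made-up names, and this is not the library's actual code.

```haskell
import Data.Maybe (isJust)
import Text.Read (readMaybe)

-- Hypothetical column types the toy inferrer can choose between.
data Inferred = IntCol | DoubleCol | TextCol
  deriving (Eq, Show)

-- True when the cell parses as an Int (e.g. "0", "12").
isInt :: String -> Bool
isInt s = isJust (readMaybe s :: Maybe Int)

-- True when the cell parses as a Double (e.g. "0", "0.0", "1.5").
isDouble :: String -> Bool
isDouble s = isJust (readMaybe s :: Maybe Double)

-- Pick the narrowest type that parses every sampled cell.
inferColumn :: [String] -> Inferred
inferColumn cells
  | all isInt cells    = IntCol
  | all isDouble cells = DoubleCol
  | otherwise          = TextCol
```

With this rule, `inferColumn ["0", "3", "12"]` gives `IntCol`, while `inferColumn ["0.0", "3", "12"]` gives `DoubleCol`, so the result depends entirely on which values happen to be in the file.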

I can see two ways of solving the problem:
1 - specifying Double when reading the csv
2 - casting the column to a Double somehow.

For 1) I understand that there is a readCsvWithOptions which allows passing a schema. However, I just want to specify one column, not all of them: AFAIU, readCsv can either infer from a sample or use a schema, but not both (i.e. use the schema for some columns and infer the rest).
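What I have in mind for 1) is roughly the following: infer everything, then let a partial schema override only the columns it names. This is a sketch of the idea only; `ColType` and `resolveSchema` are hypothetical names, not part of the library.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical column types, standing in for whatever the reader uses.
data ColType = IntCol | DoubleCol | TextCol
  deriving (Eq, Show)

-- Combine a partial, user-supplied schema with the inferred one.
-- Columns named in the overrides take the user's type;
-- every other column keeps its inferred type.
resolveSchema
  :: [(String, ColType)]  -- ^ user overrides (may name only some columns)
  -> [(String, ColType)]  -- ^ types inferred from the sample
  -> [(String, ColType)]
resolveSchema overrides inferred =
  [ (name, fromMaybe ty (lookup name overrides)) | (name, ty) <- inferred ]
```

For example, `resolveSchema [("price", DoubleCol)] [("id", IntCol), ("price", IntCol)]` keeps `id` as inferred and forces `price` to `DoubleCol`.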

For 2) I could use F.lift round or apply round, but I think both will crash if the column is already a Double (this is a similar problem to the impute issue in #157).

Maybe we need a way to apply or lift a function only if the column matches the given type, and do nothing instead of crashing otherwise. This could be done by either

  • introducing new functions,
  • new functions returning Either as in applyEither :: Expr a -> ( a -> b) -> Either b b
  • maybe a new wrapper as in apply (F.col @(DontCrashIfIAMNoAnA Int)) round
  • or a new type of column apply (F.colWhichIsPossiblyAn Int) round.
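To make the "apply only if the type matches" idea concrete, here is a minimal sketch using `Data.Typeable` on a toy column type. `Column`, `applyIfMatches`, `columnAs` and `toDouble` are all hypothetical names for illustration; the library's real column representation is presumably different.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Typeable (Typeable, cast)

-- Toy stand-in for a dynamically typed column; not the library's type.
data Column = forall a. Typeable a => Column [a]

-- Apply f only when the column's element type matches f's input type;
-- otherwise return the column untouched instead of crashing.
applyIfMatches :: (Typeable a, Typeable b) => (a -> b) -> Column -> Column
applyIfMatches f col@(Column xs) =
  case cast xs of
    Just ys -> Column (map f ys)
    Nothing -> col

-- Read the column back at a requested type, if it really has that type.
columnAs :: Typeable a => Column -> Maybe [a]
columnAs (Column xs) = cast xs

-- Turn an Int column into a Double column; leave any other column as-is.
toDouble :: Column -> Column
toDouble = applyIfMatches (fromIntegral :: Int -> Double)
```

With this, `toDouble` can be run unconditionally after reading the csv: an inferred Int column becomes Double, and a column that was already Double passes through unchanged.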
