
Croissant 1.1 (summary statistics) #12214

Open
pdurbin wants to merge 7 commits into develop from 12014-croissant-1.1



@pdurbin pdurbin commented Mar 12, 2026

What this PR does / why we need it:

The Croissant metadata export format has been updated from version 1.0 to 1.1.

Summary statistics (mean, min, max, etc.) are now included for tabular files that were successfully ingested.

Which issue(s) this PR closes:

Special notes for your reviewer:

Suggestions on how to test this:

  • Try uploading various files and checking croissant and croissantSlim exports.
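
A minimal sketch of how that check could be scripted, assuming a local installation at `http://localhost:8080` and a published dataset. The helper names (`export_url`, `fetch_export`, `conforms_to_1_1`) are hypothetical, and only the `conformsTo` value is checked, since that is the one field quoted in this PR.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Value taken from the example export quoted in this PR.
EXPECTED_CONFORMS_TO = "http://mlcommons.org/croissant/1.1"


def export_url(base_url, persistent_id, exporter):
    """Build the dataset export URL for a given exporter name."""
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/export?{query}"


def fetch_export(base_url, persistent_id, exporter):
    """Download an export and parse it as JSON (requires a running server)."""
    with urlopen(export_url(base_url, persistent_id, exporter)) as resp:
        return json.load(resp)


def conforms_to_1_1(doc):
    """Return True if the export declares Croissant 1.1."""
    return doc.get("conformsTo") == EXPECTED_CONFORMS_TO


if __name__ == "__main__":
    # Assumed base URL and DOI; replace with your own installation and dataset.
    for exporter in ("croissant", "croissantSlim"):
        doc = fetch_export("http://localhost:8080", "doi:10.5072/FK2/CY7BWA", exporter)
        print(exporter, "declares 1.1:", conforms_to_1_1(doc))
```

Running the script against a dataset with an ingested tabular file gives a quick pass/fail on the version; the summary statistics themselves still need to be inspected by eye in the returned JSON.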

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

No.

Is there a release notes update needed for this change?:

Yes, included.

Additional documentation:

Preview doc changes at https://dataverse-guide--12214.org.readthedocs.build/en/12214/api/changelog.html

@pdurbin pdurbin added this to the 6.11 milestone Mar 12, 2026
@pdurbin pdurbin moved this to Ready for Review ⏩ in IQSS Dataverse Project Mar 12, 2026
@github-actions github-actions bot added the following labels Mar 12, 2026: Croissant (Croissant and Kaggle related work), FY26 Sprint 11 (2025-11-20 - 2025-12-03), FY26 Sprint 12 (2025-12-03 - 2025-12-17), FY26 Sprint 13 (2025-12-17 - 2025-12-31), FY26 Sprint 14 (2025-12-31 - 2026-01-14), FY26 Sprint 15 (2026-01-14 - 2026-01-28), Size: 20 (A percentage of a sprint. 14 hours.)
@pdurbin pdurbin mentioned this pull request Mar 12, 2026

@pdurbin pdurbin mentioned this pull request Mar 16, 2026
12 tasks
@pdurbin pdurbin force-pushed the 12014-croissant-1.1 branch 2 times, most recently from c94c6ee to 97ff858 Compare March 16, 2026 19:03
@coveralls

Coverage Status

coverage: 24.9% (+0.06%) from 24.842%
when pulling 97ff858 on 12014-croissant-1.1
into d55acc6 on develop.


@landreev landreev moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Mar 23, 2026
@landreev landreev self-assigned this Mar 23, 2026
@landreev landreev self-requested a review March 23, 2026 15:08
@pdurbin pdurbin force-pushed the 12014-croissant-1.1 branch from 97ff858 to b98a3be Compare March 24, 2026 19:48
@github-actions

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:12014-croissant-1.1
ghcr.io/gdcc/configbaker:12014-croissant-1.1

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@cmbz cmbz added the FY26 Sprint 20 (2026-03-26 - 2026-04-08) label Mar 27, 2026
Contributor

@landreev landreev left a comment


I'm assuming the Jenkins tests failed for unrelated reasons the last time the branch was touched.
I'm going to assume that QA will involve confirming that all the tests are passing.
Looks good overall!

@github-project-automation github-project-automation bot moved this from In Review 🔎 to Ready for QA ⏩ in IQSS Dataverse Project Mar 30, 2026
pdurbin added 4 commits March 30, 2026 15:37
"For namespace URLs, we use http:// (that's an issue discussed
at length in the schema.org community, and you can probably find
an older issue about it.)"

-- mlcommons/croissant#929 (review)
The wikidata namespace is unused and doesn't appear in the
1.0 or 1.1 spec. (It was in a titanic example for 1.0). Remove.
@pdurbin pdurbin force-pushed the 12014-croissant-1.1 branch from b98a3be to 12e9553 Compare March 30, 2026 19:38
@landreev landreev removed their assignment Mar 30, 2026

"conformsTo": "http://mlcommons.org/croissant/1.1",
"name": "Cars",
"url": "https://doi.org/10.5072/FK2/CY7BWA",
"creator": [
Member Author


I'm just leaving a note here at the bottom about the state of Jenkins and API tests that we can resolve later.

They are currently failing at https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-12214/6/consoleFull with this:

TASK [dataverse : STORAGE | Run docker-compose up MinIO] *********************
fatal: [localhost]: FAILED! => {"changed": false, "errors": [], "module_stderr": "", "module_stdout": "latest: Pulling from minio/minio\n", "msg": "Error starting project unknown: failed to copy: httpReadSeeker: failed open: unexpected status from GET request to https://quay.io/v2/minio/minio/blobs/sha256:c1bc68842c41cb716734641f75dc37d629f05df5f812a8cc2e7e0370d4e833ec: 502 Bad Gateway"}

Member Author


https://status.redhat.com says quay.io is under maintenance:

[Screenshot taken 2026-03-30 at 4:15:35 PM]

Member Author


Good news! API tests were able to run. They are passing on this PR: https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-PR/job/PR-12214/7/testReport/


Development

Successfully merging this pull request may close these issues.

Croissant 1.1

4 participants