Expert Workflows for Digital Curation & Preservation
This repository contains a professional collection of R-based workflows designed to support Research Data Management (RDM) tasks. It provides data curators with standardized, documented functions to transform raw data submissions into FAIR (Findable, Accessible, Interoperable, Reusable) research objects.
The full documentation and interactive guide are available at: 👉 https://alliance-rdm-gdr.github.io/CUR_Res_CurationTools/
- Automated Triage: Rapid inspection of file extensions and basic fixity.
- Deep Validation: Format-specific modules for Tabular (CSV, Excel, SPSS, SAS, Stata), Scientific (HDF5, NetCDF), and Geospatial (GeoPackage, TIFF) data.
- OCR Intelligence: Document extraction using Google Cloud AI.
- Archival Reporting: Generation of standardized curation logs and metadata summaries.
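
As a rough illustration of what the Automated Triage step might involve, the sketch below inventories a folder by file extension and records a checksum for each file. It is not taken from this repository: the function name `triage_files`, the column names, and the choice of MD5 are all assumptions made for the example, and it relies only on base R and the bundled `tools` package.

```r
# Hypothetical helper for illustration only; not part of this repository's API.
# Builds a one-row-per-file inventory with extension, size, and MD5 checksum.
triage_files <- function(dir) {
  paths <- list.files(dir, recursive = TRUE, full.names = TRUE)
  data.frame(
    path       = paths,
    extension  = tolower(tools::file_ext(paths)),  # normalise "CSV" and "csv"
    size_bytes = file.size(paths),
    md5        = unname(tools::md5sum(paths)),     # assumed fixity algorithm
    stringsAsFactors = FALSE
  )
}

# Demo on a throwaway directory so the example is self-contained.
demo_dir <- tempfile("triage_demo")
dir.create(demo_dir)
writeLines("id,value\n1,2", file.path(demo_dir, "sample.CSV"))

inventory <- triage_files(demo_dir)

# Count files per extension.
extension_counts <- table(inventory$extension)

# Basic fixity check: recompute checksums and compare with the recorded ones.
fixity_ok <- identical(inventory$md5, unname(tools::md5sum(inventory$path)))
```

In practice the recorded checksums would be saved at deposit time and compared against a later recomputation; a stronger hash could be substituted if the curation policy requires it.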
To preview the book locally:
quarto preview
To build the final version:
quarto render

- index.qmd: Landing page with branding identity.
- Inspect_*.qmd: Format-specific curation notebooks.
- Scripts/: Standalone R scripts for batch processing.
- data/: Sample data for testing inspection routines.
- styles/styles.css: Alliance-branded visual theme.
- references.bib: Consolidated bibliography.
Please see CONTRIBUTING.md for documentation standards and branding guidelines.
This project is maintained by the Curation Services Team of the Digital Research Alliance of Canada.