Design URI structure for data.isamples.org #81

@rdhyee

Context

data.isamples.org currently serves parquet files flat at the root (e.g., data.isamples.org/isamples_202601_wide.parquet). Ben Norton suggests adding path segments that convey resource type, following OGC-style patterns:

data.isamples.org/parquet/isamples_202601_wide.parquet   # data files
data.isamples.org/record/<uuid>                           # individual sample records
data.isamples.org/term/<term-slug>                        # vocabulary terms

"This allows you to better manage resources, provides additional context and informs the user what type of resource a pid is expected to return. This pattern is also part of several specifications (i.e. OGC)."
— Ben Norton
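
To make the proposal concrete, here is a minimal TypeScript sketch of how the first path segment could tell a handler which resource type to expect. The `Resource` type and the `classifyPath` function are hypothetical names introduced for illustration only. The sketch assumes each pattern has exactly one segment after the type prefix and does not validate UUIDs or term slugs.

```typescript
// Hypothetical resource kinds, one per proposed path segment.
type Resource =
  | { kind: "parquet"; file: string }
  | { kind: "record"; id: string }
  | { kind: "term"; slug: string };

// Map a URL pathname to the resource type implied by its first segment.
// Returns null for anything outside the three proposed patterns
// (including the current flat paths such as /isamples_202601_wide.parquet).
function classifyPath(pathname: string): Resource | null {
  const match = /^\/(parquet|record|term)\/([^/]+)$/.exec(pathname);
  if (match === null) return null;
  const [, segment, rest] = match;
  switch (segment) {
    case "parquet":
      return { kind: "parquet", file: rest };
    case "record":
      return { kind: "record", id: rest };
    case "term":
      return { kind: "term", slug: rest };
    default:
      return null;
  }
}
```

With this shape, a caller can branch on `kind` before deciding whether to serve a static file, query a record service, or redirect to a vocabulary page.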

Relevant specifications

Questions to discuss

  1. Scope vs. timeline: The grant ends July 2026. Which path segments are realistic to implement?

    • /parquet/ for data files: trivial (Worker routing change plus redirects from the old flat paths)
    • /term/ for vocabulary: moderate (could redirect to existing vocab pages on the site)
    • /record/<uuid> for individual samples: heavy (requires a query service, not just static files)

  2. Backwards compatibility: PR #79 ("Use data.isamples.org for all parquet file URLs") just migrated all references to the flat URLs. If we restructure, we'd want redirects from the old paths.

  3. Content negotiation: Should /record/<uuid> return JSON-LD or HTML based on the Accept header? That's the full linked-data pattern, but it adds complexity.

  4. Versioning: Current files are date-stamped (202601). Should the URI structure make versioning explicit (e.g., /parquet/v202601/wide.parquet)?
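
As a rough illustration of questions 1 through 3, the sketch below shows two pieces of logic a router might need: mapping an old flat parquet path to its /parquet/ equivalent, and choosing a response format for /record/<uuid> from the Accept header. Both function names are hypothetical. The negotiation is deliberately simplified: it ignores q-values and wildcards, and defaulting to HTML is an assumption rather than a decision made in this issue.

```typescript
// Return the new /parquet/ location for an old flat path, or null if the
// path is not a flat parquet file (e.g. it is already under /parquet/).
function legacyRedirectTarget(pathname: string): string | null {
  const match = /^\/([^/]+\.parquet)$/.exec(pathname);
  return match === null ? null : `/parquet/${match[1]}`;
}

type RecordFormat = "json-ld" | "html";

// Choose a representation for /record/<uuid> from the Accept header.
// Simplified: JSON-LD only when application/ld+json is listed explicitly;
// everything else, including a missing header, falls back to HTML.
function chooseRecordFormat(accept: string | null): RecordFormat {
  if (accept === null) return "html";
  const mediaTypes = accept
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());
  return mediaTypes.indexOf("application/ld+json") !== -1 ? "json-ld" : "html";
}
```

A handler could answer a non-null `legacyRedirectTarget` result with a permanent redirect, which would keep the flat URLs introduced by PR #79 working after a restructure.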

Current state
