Dataset Series serialization #334

amercader

Refs #298, fixes #332

This adds preliminary support for exposing Dataset Series and their members (managed by ckanext-dataset-series).

Datasets of type dataset_series (TODO: support custom series types) are serialized as dcat:DatasetSeries, and member Datasets include the dcat:inSeries property. If the series is ordered, navigation properties are included for both entities (dcat:first / dcat:last on the series, and dcat:previous / dcat:next on each member):

Example Dataset Series (http://localhost:5000/dataset_series/test-dataset-series.ttl)

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://localhost:5017/dataset/20f41df2-0b50-4b6b-9a75-44eb39411dca> a dcat:DatasetSeries ;
    dct:description "Testing" ;
    dct:identifier "20f41df2-0b50-4b6b-9a75-44eb39411dca" ;
    dct:issued "2025-01-22T13:43:38.208410"^^xsd:dateTime ;
    dct:modified "2025-01-28T13:53:03.900418"^^xsd:dateTime ;
    dct:publisher <http://localhost:5017/organization/a27490ed-4abf-46bd-a80a-d6e19d7fff18> ;
    dct:title "Test Dataset series" ;
    dcat:distribution <http://localhost:5017/dataset/20f41df2-0b50-4b6b-9a75-44eb39411dca/resource/0a526400-7a45-4c2c-a1db-7058acb270b0> ;
    dcat:first <http://localhost:5017/dataset/826bd499-40e5-4d92-bfa1-f777775f0d76> ;
    dcat:last <http://localhost:5017/dataset/ce8fb09a-f285-4ba8-952e-46dbde08c509> .

<http://localhost:5017/dataset/20f41df2-0b50-4b6b-9a75-44eb39411dca/resource/0a526400-7a45-4c2c-a1db-7058acb270b0> a dcat:Distribution ;
    dct:issued "2025-01-22T13:43:49.560508"^^xsd:dateTime ;
    dct:modified "2025-01-22T13:43:49.555378"^^xsd:dateTime ;
    dct:title "need to drop this" .

<http://localhost:5017/organization/a27490ed-4abf-46bd-a80a-d6e19d7fff18> a foaf:Agent ;
    foaf:name "Test org 1" .

Example member Dataset (http://localhost:5000/dataset/test-series-member-2.ttl)

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://localhost:5017/dataset/de9cb401-5fc7-47cd-83ac-f7fd154b2cee> a dcat:Dataset ;
    dct:description "sdas" ;
    dct:identifier "de9cb401-5fc7-47cd-83ac-f7fd154b2cee" ;
    dct:issued "2025-01-22T13:57:13.319491"^^xsd:dateTime ;
    dct:modified "2025-01-24T10:42:00.788016"^^xsd:dateTime ;
    dct:publisher <http://localhost:5017/organization/a27490ed-4abf-46bd-a80a-d6e19d7fff18> ;
    dct:title "Test series member 2" ;
    dcat:distribution <http://localhost:5017/dataset/de9cb401-5fc7-47cd-83ac-f7fd154b2cee/resource/aab3cabd-69b9-40e9-b922-1b0548de6cfc> ;
    dcat:inSeries <http://localhost:5017/dataset/20f41df2-0b50-4b6b-9a75-44eb39411dca> ;
    dcat:next <http://localhost:5017/dataset/ce8fb09a-f285-4ba8-952e-46dbde08c509> ;
    dcat:previous <http://localhost:5017/dataset/826bd499-40e5-4d92-bfa1-f777775f0d76> .

<http://localhost:5017/dataset/de9cb401-5fc7-47cd-83ac-f7fd154b2cee/resource/aab3cabd-69b9-40e9-b922-1b0548de6cfc> a dcat:Distribution ;
    dct:issued "2025-01-22T13:57:18.992071"^^xsd:dateTime ;
    dct:modified "2025-01-22T13:57:18.990029"^^xsd:dateTime ;
    dcat:accessURL <https://data.gov.ie> .

<http://localhost:5017/organization/a27490ed-4abf-46bd-a80a-d6e19d7fff18> a foaf:Agent ;
    foaf:name "Test org 1" .
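
The navigation links in the two examples above follow directly from the order of the series members. A minimal sketch of that derivation in plain Python, assuming the members are already available as an ordered list of URIs. The function names and the tuple-based triple representation are illustrative only and are not the code in this PR; the property names are the ones shown in the examples:

```python
DCAT = "http://www.w3.org/ns/dcat#"


def series_navigation(series_uri, member_uris):
    """Return the dcat:first / dcat:last triples for an ordered series."""
    if not member_uris:
        return []
    return [
        (series_uri, DCAT + "first", member_uris[0]),
        (series_uri, DCAT + "last", member_uris[-1]),
    ]


def member_navigation(series_uri, member_uris, member_uri):
    """Return dcat:inSeries plus dcat:previous / dcat:next for one member."""
    triples = [(member_uri, DCAT + "inSeries", series_uri)]
    index = member_uris.index(member_uri)  # raises ValueError if not a member
    if index > 0:
        # The first member has no predecessor
        triples.append((member_uri, DCAT + "previous", member_uris[index - 1]))
    if index < len(member_uris) - 1:
        # The last member has no successor
        triples.append((member_uri, DCAT + "next", member_uris[index + 1]))
    return triples
```

With three ordered members, the middle one gets both dcat:previous and dcat:next, as in the "Test series member 2" example, while the first and last members get only one of the two.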

When requesting the catalog endpoint (e.g. http://localhost:5000/catalog.ttl), Dataset Series are typed as dcat:DatasetSeries and member datasets contain the dcat:inSeries property, but the navigation properties are omitted for performance reasons. I think this is a good compromise for now, as the full set of properties can be accessed on each dataset's own serialization.
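
To make the difference between the two endpoints concrete, here is a rough sketch of the behaviour described above. The dict keys (`uri`, `type`, `in_series`, `first`, `last`, `previous`, `next`) and the `include_navigation` flag are hypothetical names chosen for illustration; the actual profile code may be organised quite differently:

```python
DCAT = "http://www.w3.org/ns/dcat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical dict keys that map one-to-one to the navigation properties
NAVIGATION_KEYS = ("first", "last", "previous", "next")


def serialize_series_info(dataset, include_navigation=True):
    """Return series-related triples for a dataset dict.

    Pass include_navigation=False to mimic the catalog endpoint, where
    only the type and dcat:inSeries are emitted.
    """
    uri = dataset["uri"]
    is_series = dataset.get("type") == "dataset_series"
    rdf_class = "DatasetSeries" if is_series else "Dataset"
    triples = [(uri, RDF_TYPE, DCAT + rdf_class)]

    if dataset.get("in_series"):
        triples.append((uri, DCAT + "inSeries", dataset["in_series"]))

    if include_navigation:
        for key in NAVIGATION_KEYS:
            if dataset.get(key):
                triples.append((uri, DCAT + key, dataset[key]))
    return triples
```

Skipping the navigation block in catalog mode avoids having to resolve neighbouring members for every dataset in the listing, which is presumably where the performance cost would come from.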

A note on URIs

At first I thought about constructing the Dataset URIs using /dataset_series/ for consistency:

<http://localhost:5017/dataset_series/20f41df2-0b50-4b6b-9a75-44eb39411dca> a dcat:DatasetSeries ;

But that brings extra considerations. If we want to support custom series dataset types (e.g. stuff like /projects/ or /collections/), those should also share the same URI pattern, probably using /dataset_series/ and not the custom type. This would involve making dataset_uri() aware of the preferred dataset type, probably via a param.
We definitely don't want to change the URIs for any arbitrary dataset type (as this might break existing URIs on existing sites with custom dataset types), but for those types that describe Dataset Series perhaps it's worth the added complexity. Other entities could also get different URI patterns in the future if they are implemented with dataset types, like Data Services.
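
As a starting point for discussion, the type-aware URI construction could look roughly like the sketch below. This is a standalone, hypothetical helper: `series_aware_dataset_uri`, `SERIES_TYPES` and the `series_types` param are made-up names, and the sketch does not reproduce the existing dataset_uri() logic or any stored URI handling.

```python
# Hypothetical set of dataset types that describe Dataset Series.
# Custom types such as "projects" or "collections" would be added here.
SERIES_TYPES = frozenset({"dataset_series"})


def series_aware_dataset_uri(site_url, dataset_dict, series_types=SERIES_TYPES):
    """Build a URI that uses /dataset_series/ for any series type.

    All other dataset types keep the existing /dataset/ pattern, so URIs
    on sites with arbitrary custom dataset types do not change.
    """
    if dataset_dict.get("type") in series_types:
        path = "dataset_series"
    else:
        path = "dataset"
    return "{}/{}/{}".format(site_url.rstrip("/"), path, dataset_dict["id"])
```

For example, a site that registers "projects" as a series type would call it with `series_types={"dataset_series", "projects"}` and get /dataset_series/ URIs for both, while a dataset of an unrelated custom type would still resolve to /dataset/.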

Any thoughts @seitenbau-govdata @hcvdwerf ?

TODO:

  • Remove distributions from Dataset Series
  • Support arbitrary series dataset types
  • Decide on Dataset Series URIs
  • Tests
  • Documentation

amercader marked this pull request as draft on January 29, 2025 14:18