Open Data Systems

Users will be able to add metadata and publish datasets with persistent identifiers, either directly on Science Mesh sites or to external data repositories.
 


This data service responds to the need to integrate open data repositories into the Science Mesh, focusing on standards adopted by OpenAIRE such as OAI-PMH.
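To illustrate the kind of repository protocol referred to here, the following minimal Python sketch builds an OAI-PMH ListRecords request and extracts record identifiers from a response. The base URL, the function names and the sample response are assumptions made for illustration and are not taken from any Science Mesh component. Only the argument names (verb, metadataPrefix, set) and the XML namespace come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:repo.example.org:1</identifier></header></record>
    <record><header><identifier>oai:repo.example.org:2</identifier></header></record>
  </ListRecords>
</OAI-PMH>
""".strip()


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return the URL of a ListRecords request against an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Return the identifier of every record header in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


if __name__ == "__main__":
    # "https://repo.example.org/oai" is a placeholder, not a real endpoint.
    print(build_list_records_url("https://repo.example.org/oai"))
    print(record_identifiers(SAMPLE_RESPONSE))
```

A real harvester would also need to follow resumption tokens for large result sets, which this sketch leaves out.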

The RDM (Research Data Management) functionality will be integrated with IOP, OCM and the CS3APIs, with a reference use case provided by AARNet constituencies and PSNC. It will then be redeployed at other partners' sites and across different EFSS (Enterprise File Sync and Share) platforms.

Objective: Users of EFSS services will be able to organise work data via tagging and metadata assignment, turn a set of work data into a published, referable dataset, and finally export a valuable dataset to an open data repository, archive or library for curation and long-term archiving. An EFSS user's home service account will be associated with a persistent identifier (such as ORCID), and this identity will follow the data when publishing to open data repositories. Users will also be able to tag datasets as public, making them publicly visible and searchable on the home service.


 

Technology Readiness Level by M1

6

Technology Readiness Level by M36

9

Technologies used

  • Describo
  • InvenioRDM
  • ScieboRDS

Unique Selling Points (USPs)

Repository-quality FAIRness on a live data system.

Target users

  • Researchers
  • Data librarians
  • Policy makers

Use cases

Business / Market sectors

  • Academia
  • Museums
  • Galleries
  • Archives

Specific needs

Ability to perform collections-level (transcending the bare file level) manipulation on data assets held in an EFSS.

Specific benefits

This feature lets users retain metadata (produced by instruments, for example) as they ingest their files into the Mesh, and later connect their metadata-enriched data collections to third-party systems such as archives.

Examples of real users

  • PARADISEC
  • GALAXY

Exploitation drivers

PARADISEC hosts files that are not meaningfully searchable by filename; it relies on metadata to search and serve files. Therefore, if Science Mesh users want access to PARADISEC data through a share/OCM-type approach, Mesh nodes need to understand collections and RO-Crates and let users manipulate them (search, inspect).
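The point about searching by metadata rather than filename can be sketched as follows. This is a minimal illustration in Python, assuming an RO-Crate whose entities carry name, description or keywords properties. The sample crate, its file names and the search_crate helper are hypothetical and do not reflect PARADISEC data or any Science Mesh API.

```python
# Hypothetical crate: the file names say nothing about the content,
# while the metadata does.
SAMPLE_CRATE = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {"@id": "./", "@type": "Dataset", "name": "Field recordings (hypothetical)"},
        {"@id": "rec_0001.wav", "@type": "File",
         "name": "Interview about canoe building"},
        {"@id": "rec_0002.wav", "@type": "File",
         "name": "Song performance", "keywords": ["harvest", "ceremony"]},
    ],
}


def search_crate(crate, term):
    """Return the @id of every entity whose descriptive metadata mentions term."""
    term = term.lower()
    hits = []
    for entity in crate.get("@graph", []):
        for key in ("name", "description", "keywords"):
            value = entity.get(key, "")
            # A property may hold a single string or a list of strings.
            text = " ".join(value) if isinstance(value, list) else str(value)
            if term in text.lower():
                hits.append(entity["@id"])
                break  # one hit per entity is enough
    return hits


if __name__ == "__main__":
    print(search_crate(SAMPLE_CRATE, "canoe"))
    print(search_crate(SAMPLE_CRATE, "harvest"))
```

Neither query would match anything if only the filenames were searched, which is the limitation described above.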

Current stage

Merging Describo-online with EFSS systems through the CS3APIs; letting ScieboRDS invoke Describo to generate RO-Crates, which are then handed back to ScieboRDS.
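For readers unfamiliar with the format, the following Python sketch shows roughly what a minimal RO-Crate metadata document looks like, following the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset in a JSON-LD @graph). It is an illustration only: the build_crate helper, the file names and the dataset details are hypothetical, and this is not how Describo or ScieboRDS produce crates. The ORCID used below is the fictional example identifier published by ORCID. A complete crate would also carry further properties, such as a license, that are omitted here.

```python
import json


def build_crate(name, description, date_published, files, author_orcid=None):
    """Build a minimal RO-Crate 1.1 metadata document as a Python dict.

    files maps a relative path to a human-readable name.
    author_orcid is an optional ORCID URL identifying the dataset author.
    """
    # Descriptor entity that identifies this document as RO-Crate metadata.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: the collection as a whole.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "hasPart": [{"@id": path} for path in files],
    }
    graph = [descriptor, root]
    graph += [{"@id": path, "@type": "File", "name": label}
              for path, label in files.items()]
    if author_orcid:
        # The persistent identifier travels with the dataset as its author.
        root["author"] = {"@id": author_orcid}
        graph.append({"@id": author_orcid, "@type": "Person"})
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


if __name__ == "__main__":
    crate = build_crate(
        name="Example dataset (hypothetical)",
        description="Two instrument readings grouped as one collection.",
        date_published="2024-01-01",
        files={"run1.csv": "First run", "run2.csv": "Second run"},
        author_orcid="https://orcid.org/0000-0002-1825-0097",
    )
    print(json.dumps(crate, indent=2))
```

Writing the resulting JSON to a file named ro-crate-metadata.json at the top of the dataset folder is what makes that folder a crate.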

 


 

Find out more about Open Data Systems in our interview with Guido Aben
