Archiving and preservation for research environments

A Research, Management and Preservation Platform

Pilot Solution

The Proposed Solution

The solution developed by the consortium led by LIBNOVA provides a Research, Management and Preservation Platform that combines existing technologies with new components to overcome the obstacles to research dataset management (including preservation) identified in the ARCHIVER project. It builds on pre-existing digital preservation platforms already in use by many leading organizations across the world. The solution is designed for the whole organization and the whole data life cycle, is aligned with OAIS, ISO 16363, and the FAIR and TRUST principles, and offers innovative capabilities in all functionality layers.

The R&D potential

Scalability (sustained high throughput in the 100s of PBs range)
Digital Preservation Best Practices (OAIS, ISO 16363, PSC-Preservation Storage Criteria, Best Practices recommendations and implementations – OAIS Information Model including Representation Information and Preservation Description Information components, problem detection such as duplicates, hidden encryption, format migration/evolution, exit strategy).
Metadata management (import/creation and preservation), following OAIS
Data integrity management (integrity chain, integrity at rest)
FAIR principles (F: containers, customized metadata, structured hierarchy, A: multiprotocol access, public sharing, discovery solutions, I: Data policies, research data Representation Information, R: Integrated active integrity control, Representation Information and Provenance Information)
Cost efficiency (flexibility on deployment, several computation/storage options).
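
The "integrity chain" and "integrity at rest" items above can be illustrated with a small sketch. This is not the platform's implementation: the record layout and the function names (`append_event`, `verify_chain`, `find_damaged`) are assumptions made for illustration. Each fixity record stores a SHA-256 checksum of the file content and the hash of the previous record, so a change to any earlier record is detectable; an at-rest check then recomputes checksums of the stored content and compares them with the latest record for each file.

```python
from __future__ import annotations

import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record (assumption)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _record_hash(body: dict) -> str:
    # Hash a canonical JSON form so the result does not depend on key order.
    return sha256_hex(json.dumps(body, sort_keys=True).encode("utf-8"))


def append_event(chain: list[dict], file_id: str, content: bytes) -> dict:
    """Append a fixity record that commits to the previous record."""
    body = {
        "file_id": file_id,
        "content_sha256": sha256_hex(content),
        "prev_hash": chain[-1]["record_hash"] if chain else GENESIS,
    }
    record = dict(body, record_hash=_record_hash(body))
    chain.append(record)
    return record


def verify_chain(chain: list[dict]) -> bool:
    """Check that no record was altered and that the links are unbroken."""
    prev = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        if record["prev_hash"] != prev or _record_hash(body) != record["record_hash"]:
            return False
        prev = record["record_hash"]
    return True


def find_damaged(chain: list[dict], stored: dict[str, bytes]) -> list[str]:
    """Integrity at rest: list files that are missing or no longer match
    the most recent checksum recorded for them."""
    latest: dict[str, str] = {}
    for record in chain:
        latest[record["file_id"]] = record["content_sha256"]
    return [
        file_id
        for file_id, checksum in latest.items()
        if file_id not in stored or sha256_hex(stored[file_id]) != checksum
    ]
```

In this sketch, `verify_chain` protects the audit trail itself, while `find_damaged` is the periodic check that would flag file corruption or loss.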

Architecture Overview

The overall architecture is built on two main components:

Core software components (Group A) running inside Kubernetes containers. The number of containers of each class running in parallel can be adjusted manually, or automatically by the platform based on service demand, so that capacity scales with load.
Auxiliary services (Group B) based on the Core services above. When running in “on-prem” mode, organizations will need to provide them for the platform to work. When running “as a service”, the service provider (LIBNOVA) will provide them.
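
How the platform decides on the number of containers per class is not described here. As a hedged illustration of demand-based scaling, the sketch below uses the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count scaled by the ratio of the observed per-replica load to the target per-replica load, rounded up. The function name, the parameter names and the replica limits are assumptions.

```python
import math


def desired_replicas(
    current_replicas: int,
    observed_per_replica: float,
    target_per_replica: float,
    min_replicas: int = 1,
    max_replicas: int = 50,
) -> int:
    """Proportional scaling rule: ceil(current * observed / target),
    clamped to the [min_replicas, max_replicas] range."""
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    raw = math.ceil(current_replicas * observed_per_replica / target_per_replica)
    return max(min_replicas, min(max_replicas, raw))
```

For example, four replicas observing a load of 90 units each against a target of 60 would scale out to six; the same four replicas at 30 units each would scale in to two.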

Five assets are deployed on the components above:

Containers – keep content accessible with several protocols, organized and protected. These containers keep metadata, data and code together to ensure usability (OAIS-aligned).
Dynamic Insights – help users when dealing with personal information, digital preservation and emissions reduction, with the following components: Data Policies Assistant, GDPR Assistant, Emissions Optimizer, Digital Preservation
Budget assistant – helps users to plan and follow expenditures
Content gateway – connects the platform with repositories for discovery solutions, such as Invenio or Dataverse
Digital Preservation, OAIS and FAIR conformance – provides support for the OAIS Information Model and for the Mandatory Responsibilities, so that the results fully support repositories in OAIS conformance. The focus on usability is also critical for the “Interoperability” and “Reusability” required by the FAIR principles.

Comparison between the levels of R&D before and after the introduction of ARCHIVER solutions (January 2021)

Layer	Baseline before ARCHIVER	ARCHIVER R&D
Storage/basic archiving/secure backup (Layer 1)	Deployments over private, public, hybrid, community and special purpose clouds in the single PB range infrastructure.	Infrastructure agnostic for multiple PB; multitenancy; sustained data ingest rates of 1-10 Gb/s for multiple use cases with different access patterns.
Preservation (Layer 2)	Preservation services of files at basic level of redundancy, limited API support.	Richer API set (essentially all capabilities available via the GUI for seamless integration); active monitoring of data integrity in order to detect unwanted changes such as file corruption or loss on top of infrastructure services; support for handling unstructured or missing metadata, test models to map responsibilities for local support, responsibilities for long-term data management planning.
Baseline user services (Layer 3)	Volumes of hundreds of TBs with support of indexing, elastic search, deduplication.	Software development for search, look up or filter potential datasets rapidly, to access dataset metadata and decide on its relevance.
Advanced Services (Layer 4)	Basic support of retention and integrity of certain types of data.	Container orchestration engine support based on Kubernetes for the compute capabilities to allow scientific analyses to be carried out off-prem. Interfaces from the infrastructure layer integrated into the overall design (allow access to data, no matter where stored).
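
To put the Layer 1 ingest rates in context, the short calculation below estimates how long a sustained transfer of a given volume would take. It assumes that "Gb/s" means gigabits per second and that 1 PB is 10^15 bytes; the function name is illustrative, and protocol overhead is ignored.

```python
def ingest_days(volume_pb: float, rate_gbps: float) -> float:
    """Days needed to ingest volume_pb petabytes at rate_gbps gigabits/second."""
    if rate_gbps <= 0:
        raise ValueError("rate_gbps must be positive")
    bits = volume_pb * 1e15 * 8           # decimal petabytes to bits
    seconds = bits / (rate_gbps * 1e9)    # gigabits/second to bits/second
    return seconds / 86_400               # seconds per day


for rate in (1, 10):
    print(f"1 PB at {rate} Gb/s: {ingest_days(1, rate):.1f} days")
```

Under these assumptions, 1 PB takes roughly 93 days at 1 Gb/s and roughly 9 days at 10 Gb/s, which is why the sustained rate matters at multi-PB scale.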


Watch the interview with Antonio Guillermo Martinez (LIBNOVA)
