Scalable and sustainable long term digital preservation of scientific datasets

On the occasion of the 17th edition of iPRES, the premier and longest-running conference series on digital preservation, which will take place from 19 to 22 October 2021 in Beijing, China, Matthew Addis submitted a paper titled "Scalable and sustainable long term digital preservation of scientific datasets". Addis is coordinator of the Arkivum consortium, which is building a solution for long-term data management and online access as part of the ARCHIVER project. The paper addresses the conference topics of Building the Capacity & Capability, and Scanning the New Development.

The paper highlights that the European Commission-supported ARCHIVER project (Archiving and Preservation for Research Environments) aims to “introduce significant improvements in the area of archiving and digital preservation services, supporting the IT requirements of European scientists and providing end-to-end archival and preservation services, cost-effective for data generated in the petabyte range with high, sustained ingest rates, in the context of scientific research projects”. It presents a software solution developed by Arkivum to meet the needs of long-term digital preservation of scientific datasets in ARCHIVER, and discusses how this solution is scalable (able to process and store very large volumes of research data) and sustainable (both economically and environmentally). This is achieved through a combination of serverless computing, deployment on hyperscale infrastructure, and implementation of configurable ‘Minimum Effort Ingest’ workflows. In particular, Arkivum shows how high-performance and scalable Long Term Digital Preservation (LTDP) of very large datasets can be done in a way that is entirely compatible with high levels of cost-efficiency and minimized environmental impact.

The full paper can be read and downloaded here.