How it works

The SafeArchive system was designed to provide a simple tool for distributed replication and policy compliance. It coordinates six primary activities:

  1. Documents the agreed-upon replication policies, or rules, of the institutions participating in a network.
  2. Makes each participating institution's digital holdings available through the web or network system.
  3. Harvests and replicates the collections from their original source using the OAI-PMH protocol.
  4. Monitors and maintains the integrity of the network, using caches.
  5. Audits the network to ensure compliance with agreed-upon policies and produces an audit report for all participating institutions.
  6. Identifies and corrects inconsistencies in the collection of each participating institution in the network.
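
As an illustration of the harvesting step, the sketch below builds an OAI-PMH ListIdentifiers request and reads the identifiers and resumption token out of a response. It is not SafeArchive code: the base URL, the set name, and the sample response are made up for the example, and only the verb, argument names, and XML namespace come from the OAI-PMH specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response body, used so the example runs without a network.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:study-1</identifier>
      <datestamp>2011-01-01</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:study-2</identifier>
      <datestamp>2011-02-01</datestamp>
    </header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""


def build_list_identifiers_url(base_url, metadata_prefix="oai_dc",
                               set_spec=None, resumption_token=None):
    """Return the request URL for one page of a ListIdentifiers harvest."""
    if resumption_token:
        # In OAI-PMH a resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListIdentifiers",
                  "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListIdentifiers",
                  "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_identifiers(xml_text):
    """Return (identifiers, resumption_token) from a response body."""
    root = ET.fromstring(xml_text)
    identifiers = [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    # An empty or missing token means the list is complete.
    return identifiers, (token or None)
```

A harvester would request the first URL, parse the response, and keep requesting with the returned token until none comes back.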

The SafeArchive system is built around a core LOCKSS network. Participating institutions expose digital content through the OAI-PMH protocol and through the Dataverse Network (DVN) digital library system. Institutions in the network choose which of their own and which of the other partners' content to replicate by creating policies (or rules), which are formalized in a machine-readable schema.
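
To give a sense of what a machine-readable replication rule might look like, here is a minimal sketch. It does not reproduce the actual SafeArchive schema: the JSON encoding and every field name (collection_id, owner, min_replicas, audit_interval_days) are assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationRule:
    """One hypothetical rule; all field names are illustrative."""
    collection_id: str        # collection the rule applies to
    owner: str                # institution that holds the original
    min_replicas: int         # copies the partners agree to keep
    audit_interval_days: int  # how often the rule is checked


# Made-up policy document with two rules.
SAMPLE_POLICY = """[
  {"collection_id": "coll-a", "owner": "inst-1",
   "min_replicas": 3, "audit_interval_days": 30},
  {"collection_id": "coll-b", "owner": "inst-2",
   "min_replicas": 2, "audit_interval_days": 90}
]"""


def load_rules(text):
    """Parse a policy document and reject rules that cannot be met."""
    # Unknown or missing keys make the dataclass constructor raise TypeError.
    rules = [ReplicationRule(**item) for item in json.loads(text)]
    for rule in rules:
        if rule.min_replicas < 1:
            raise ValueError(
                f"{rule.collection_id}: min_replicas must be at least 1")
        if rule.audit_interval_days < 1:
            raise ValueError(
                f"{rule.collection_id}: audit_interval_days must be at least 1")
    return rules
```

Validating rules when they are loaded means an audit never has to guess what an ill-formed policy was meant to require.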

The complete holdings of each partner, including metadata, data, documentation and legal agreements, are replicated by the network. Replicated copies are geographically and institutionally distributed, which guards against technical and organizational preservation failures.

When new collections are added to the preservation network, the system automatically identifies collaborating peers with the required resources and initiates regular harvesting by those peers. Previous versions of the replicated content are also maintained.

Content in the network is audited regularly to demonstrate conformance with preservation requirements. While each partner is trusted to hold others' public content and not to disseminate content improperly, no partner is granted "super-user" rights. Trust is verified through automated audits of trusted repository requirements, which provide the reliability of a top-down replication system with the resilience of a peer-to-peer model.
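
The kind of check an automated audit performs can be sketched as follows. This is a simplified illustration, not the SafeArchive auditor or the LOCKSS polling protocol: the Replica record, the min_replicas and min_regions thresholds, and the majority-digest comparison are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One observed copy of a collection (hypothetical record)."""
    host: str    # institution holding the copy
    region: str  # where that institution is located
    digest: str  # checksum reported for the copy


def audit_collection(replicas, min_replicas, min_regions):
    """Return a list of findings; an empty list means the policy is met."""
    if not replicas:
        return ["no replicas found"]

    # Treat the most common digest as the reference copy.
    majority_digest, _ = Counter(r.digest for r in replicas).most_common(1)[0]
    matching = [r for r in replicas if r.digest == majority_digest]

    findings = []
    if len(matching) < min_replicas:
        findings.append(
            f"{len(matching)} matching replicas found, "
            f"policy requires {min_replicas}")

    regions = {r.region for r in matching}
    if len(regions) < min_regions:
        findings.append(
            f"matching replicas span {len(regions)} regions, "
            f"policy requires {min_regions}")

    # Copies that disagree with the majority are candidates for repair.
    for r in replicas:
        if r.digest != majority_digest:
            findings.append(
                f"replica at {r.host} differs from the majority digest")
    return findings
```

Because the check only compares what each peer reports against the agreed thresholds, it does not need any peer to hold special privileges over the others.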