Issue 14378

Options for managing redundant storage between data repositories

14378
Reporter: ahahn
Assignee: trobertson
Type: Epic
Summary: Options for managing redundant storage between data repositories
Priority: Major
Status: ToDo
Created: 2013-11-18 15:52:24.874
Updated: 2014-09-03 11:24:56.235
        
Description: In a system of data repositories, it is desirable to store redundant, identical copies of datasets. Synchronisation between repositories needs to be managed so that the master copy of each dataset is clearly identified and the copies are updated with as little lag as possible. A data harvester needs to be able to recognise multiple copies as duplicates of the same dataset and to identify the master copy.

*Rationale*
Redundant storage of datasets with appropriate metadata, distributed across persistent repositories, provides for disaster recovery of data.

*Required components*
- registry: identify master dataset and duplicates
- IPT: the repository manager needs to be able to distinguish between its own and mirrored datasets
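
*Illustrative sketch*
The harvester behaviour described above (recognising copies as duplicates of one dataset and picking out the master) could look roughly like the following. This is only a sketch under assumptions, not a proposed design: the `DatasetCopy` record, its `dataset_key`, `repository` and `is_master` fields, and the `group_copies` function are all hypothetical names, and it assumes the registry gives every copy of a dataset the same key and flags exactly one copy as master.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetCopy:
    dataset_key: str  # hypothetical: identifier shared by all copies of a dataset
    repository: str   # hypothetical: name of the repository hosting this copy
    is_master: bool   # hypothetical: flag set by the registry on the master copy


def group_copies(copies):
    """Group harvested copies by dataset key.

    Returns a dict mapping each dataset key to (master, mirrors).
    Raises ValueError if a dataset does not have exactly one master copy.
    """
    by_key = defaultdict(list)
    for copy in copies:
        by_key[copy.dataset_key].append(copy)

    grouped = {}
    for key, group in by_key.items():
        masters = [c for c in group if c.is_master]
        if len(masters) != 1:
            raise ValueError(
                f"dataset {key!r} has {len(masters)} master copies, expected 1"
            )
        mirrors = [c for c in group if not c.is_master]
        grouped[key] = (masters[0], mirrors)
    return grouped
```

Raising an error when a dataset has no master or more than one is a deliberate choice in this sketch: an ambiguous master is exactly the situation the synchronisation is meant to rule out, so the harvester surfaces it rather than guessing.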