SafeArchive Audit System: "Google Summer of Code Ideas"

The SafeArchive Audit System (SAAS) Potential Projects

We encourage students to submit additional ideas, questions, and comments to the developer mailing list.
You may also want to examine our major software product, SafeArchive, and look at the user-submitted "wishlist" on SourceForge.

Adaptive User Interface Design

While implementing the SAAS, we noticed a need for user interface customization. As we configured the tool to audit larger preservation networks, we found that some networks contain more collections than the current user interface can handle effectively. The user interfaces for both the archival unit definitions and the preservation schema definitions need to adapt to the quantity of materials discovered in the network. A user interface that scales with the quantity of information being defined would be a great advantage for users. The user interface is developed with PrimeFaces.

Skills: Java Programming, Willingness to learn PrimeFaces

Contacts:  developer mailing list

Preservation Information Data Collection

The current implementation of the SAAS focuses on retrieving replication and preservation information from LOCKSS preservation networks. The development team at Data-PASS would like to expand the use of this tool to other replication technologies. We would like to see a collaborative programming project aimed at expanding the discovery tools of the SAAS to analyze multiple storage networks. An example would be the design of an API to gather preservation information for digital objects stored inside iRODS rule-based grids or monitored by ACE (Audit Control Environment). Both technologies lend themselves well to an integration project of this nature. This project would extend the SAAS to a larger audience, in particular the large scientific community currently using iRODS or ACE to manage research data.
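One possible shape for such an API is sketched below: a small adapter interface that each storage technology would implement, plus an audit routine that only depends on that interface. Every name here (ReplicaStatus, PreservationSource, InMemorySource, ReplicaAudit) is hypothetical and chosen for illustration; none of it is taken from the SAAS code base, LOCKSS, the Jargon API, or ACE, and a real connector would replace the in-memory stand-in.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// All names below are hypothetical; none come from SAAS, LOCKSS, Jargon, or ACE.

/** One observed copy of a digital object on one storage host. */
final class ReplicaStatus {
    final String objectId;
    final String host;
    final String checksum;

    ReplicaStatus(String objectId, String host, String checksum) {
        this.objectId = objectId;
        this.host = host;
        this.checksum = checksum;
    }
}

/** Adapter contract that each storage technology would implement. */
interface PreservationSource {
    String name();
    List<ReplicaStatus> listReplicas(String collectionId);
}

/** In-memory stand-in for a real connector, serving a single collection. */
final class InMemorySource implements PreservationSource {
    private final String name;
    private final String collectionId;
    private final List<ReplicaStatus> replicas;

    InMemorySource(String name, String collectionId, List<ReplicaStatus> replicas) {
        this.name = name;
        this.collectionId = collectionId;
        this.replicas = new ArrayList<>(replicas);
    }

    public String name() {
        return name;
    }

    public List<ReplicaStatus> listReplicas(String requestedCollection) {
        // Unknown collections yield an empty list rather than an error.
        if (!collectionId.equals(requestedCollection)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(replicas);
    }
}

final class ReplicaAudit {
    /** Counts the distinct hosts holding each object, across all sources. */
    static Map<String, Integer> copiesPerObject(List<PreservationSource> sources,
                                                String collectionId) {
        Map<String, Set<String>> hostsByObject = new TreeMap<>();
        for (PreservationSource source : sources) {
            for (ReplicaStatus r : source.listReplicas(collectionId)) {
                hostsByObject.computeIfAbsent(r.objectId, k -> new HashSet<>()).add(r.host);
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : hostsByObject.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }
}
```

Because the audit logic sees only PreservationSource, adding support for another storage network would mean writing one more adapter rather than changing the audit code.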

Skills: Java Programming, Willingness to learn iRODS Jargon API, Willingness to experiment with ACE

Contacts:  developer mailing list

Prototype Preservation Replication Using Off-the-shelf Filesystems and Storage Networks

A number of widely available, open-source, off-the-shelf systems can maintain distributed replicas of files. The Hadoop File System, Tahoe-LAFS, and Freenet are particularly well known for this. None of these systems, however, provides the write-once-read-many (WORM) semantics that a preservation file system requires. A successful project would prototype extensions or encapsulations of these systems to support WORM semantics.
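The encapsulation approach can be illustrated with a minimal sketch: a wrapper that accepts the first write to a path, rejects any later write to the same path, and exposes no delete operation. BlobStore, InMemoryBlobStore, and WormStore are hypothetical names invented for this example and do not correspond to the APIs of Hadoop, Tahoe-LAFS, or Freenet. A real prototype would also need an atomic check-and-write in the backend and a way to stop clients from bypassing the wrapper.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal mutable store standing in for a real backend (hypothetical interface). */
interface BlobStore {
    boolean exists(String path);
    void put(String path, byte[] data);
    byte[] get(String path);
}

/** Simple in-memory backend used only to exercise the wrapper. */
final class InMemoryBlobStore implements BlobStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    public boolean exists(String path) {
        return blobs.containsKey(path);
    }

    public void put(String path, byte[] data) {
        blobs.put(path, data);
    }

    public byte[] get(String path) {
        return blobs.get(path);
    }
}

/** Wrapper exposing write-once-read-many behaviour on top of any BlobStore. */
final class WormStore {
    private final BlobStore backend;

    WormStore(BlobStore backend) {
        this.backend = backend;
    }

    /** Stores a new object; refuses to replace an existing one. */
    synchronized void writeOnce(String path, byte[] data) {
        if (backend.exists(path)) {
            throw new IllegalStateException("already written: " + path);
        }
        // Copy so later changes to the caller's array cannot alter stored content.
        backend.put(path, data.clone());
    }

    /** Returns a copy of the stored bytes, or null if the path was never written. */
    synchronized byte[] read(String path) {
        byte[] data = backend.get(path);
        return data == null ? null : data.clone();
    }

    // Deliberately no delete or overwrite method.
}
```

The synchronized methods only protect callers within one JVM; in a distributed store the same guarantee would have to come from the storage layer itself.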

Skills: Java, willingness to learn about filesystem architecture

Contacts:  developer mailing list

Where's my data? Interactive visualization.

Reports are boring. Sometimes you just want to *see* where your data is, who has it, and how healthy it is. We encourage experimentation with Flare, Protovis, Google Charts, or other open-source (or free with an open API) visualization toolkits. The more interactive, the better!

Skills: basic experience in interactive information visualization

Contacts:  developer mailing list