Digital Repository Interoperability
Microsoft, the Mellon Foundation, the Coalition for Networked Information, the Digital Library Federation, and the Joint Information Systems Committee in the
The final report for this meeting is available at the Mellon Foundation. It is not casual reading: it is intended to capture a discussion rather than to characterize the state of the art, though readers may find the background materials and recommendations generally useful.
Those conversant with digital library research may be familiar with the early work of Robert Kahn and Robert Wilensky on repository architectures. It is interesting to note that, a dozen years later, there is still no commonly agreed-upon terminology, let alone a universally accepted model for this critical piece of digital library infrastructure. It is a reminder of how new our digital workspace is, and how much effort remains to achieve even a rudimentary infrastructure for supporting reliable, persistent access to electronic assets.
There is inherent tension between standardization and localized design, especially in an unstable technological environment. Builders want to build, not reconcile. The conferees at this meeting (largely implementors of early repository systems) did not even entirely agree on which aspects of functionality should be considered core for interoperability purposes (though progress was made towards this goal).
It is perhaps not astounding that the understanding of core repository functionality is not so different now from what it was in 1995. The Kahn-Wilensky list included (I paraphrase) access, deposit, and tell-me-more. The 2006 meeting agreed on obtain and harvest, and talked at length about whether put should be there. The notions map roughly across the intervening years, though current understanding of the underlying details far exceeds the 1995 model.
If this sounds like scant progress for a decade, keep in mind that a great deal of experience has been garnered through the deployment of DSpace, arXiv, Fedora, ePrints aD
The problem is broad in scope: a data model and architecture to support documents, data archives, formats, policies, and recombinant practices that are, to a significant degree, not yet developed. The architecture must accommodate the functional requirements of disparate domains with entirely different business models, legal requirements, and data demands. Now that the first crop of serious repository applications has matured, the field is ready for the harder task of bringing these efforts into an interoperable framework. This meeting will have helped to focus attention and effort on this important goal and set the stage for additional progress.
-----
Image: Downtown Seattle from Elliott Bay, July 2006


