I'm in Paris, struggling in a haze of jet lag to be productive (ok... maybe just stay awake at 3:00 AM Seattle time) at a meeting of the Sun Preservation and Access Special Interest Group, an effort to create a community of practice around persistent and reliable storage technologies for scholarly archives. Last week I gave a presentation to the Digital Futures Alliance advisory committee, which included representatives from Microsoft, Vulcan Technologies, Real Networks, Amgen, and others. A few weeks earlier, I participated in a Microsoft-sponsored symposium, the Global Research Library 2020 activity. Seeing a common thread? Various dot coms working at "doing well by doing good" (as Michael Keller said of Sun Microsystems a few minutes ago). And of course, none of us in the library world take steps unshadowed by Google books/scholar/etc....
Am I making your information-wants-to-be-free-spirit Flickr (oops... flicker) in the rapacious winds of commercial self-interest? Like it or not, libraries are embedded in a supply chain that involves the largest, most dynamic and innovative companies in the information marketplace.
OCLC may be the biggest player in the library pond, but it is tiny in comparison to most of the organizations we liaise with in this rapidly changing marketplace. So, when I see posts such as "There's no such thing as a free PURL", I would like to interject a bit of historical perspective and offer a teaching moment, so that the author of this post might understand the motivation for PURLs more clearly (and in the bargain, perhaps flog the latest developments in the PURL domain).
As a member of the team that implemented PURLs, I can speak with some authority about the motivations and history of the effort. PURLs emerged from the efforts of Eric Miller and myself in the IETF back in the earliest years of the Web, as people recognized that naming and identification of electronic resources in a persistent manner is in fact problematic. Based on our experiences in what I have ever since described as the groundhog day meetings (cf. the movie of the same name), OCLC invested modest resources of the library collaborative in an 80% solution that achieved even more modest success as one piece of the ad hoc naming architecture of the Web. PURLs are among the longest-running pieces of Web technology, operating pretty much continuously since their inception in the mid 1990s.
I'm guessing OCLC spent a million dollars on PURLs in that decade if you roll in all the costs of the research, meetings, technology, and miscellaneous overhead for the various people who thought about the problem, wrote code, and maintained machines. We made the code open-source, and have encouraged others to use it freely. Recently, recognizing that the code base and architecture of the original implementation were a bit long in the tooth, OCLC funded a re-engineering of PURLs, with the idea of bringing the code up to date with current HTTP headers and making the code more embeddable and hence more useful in a web architecture that has evolved somewhat since the mid 1990s. Rollout of the new code base will take place soon, and should help sustain the viability of the identifiers minted in the PURL domain as well as encourage wider deployment of PURLs in general. It will, like the original code base, be made available as open-source software.
Evidence of success of the strategy may be found in the adoption of PURLs for the identification of some billion URI-based assertions about proteins in the UniProt database, an international database of proteins intended to support open research in the biological sciences. In the latest release of UniProt (11.3), all URIs of the form:
have been replaced with URLs of the form:
Not a dime of revenue has come OCLC's way as the result of PURLs to offset the cost of developing and maintaining this system. But a decade and more of thinking about the intricacies of identifiers will, I fervently hope, redound to the long-term fiscal benefit of OCLC, as we "insinuate" ourselves into "more and more aspects of information organization and retrieval". I want OCLC to be the most authoritative, reliable source of persistent identifier management on planet-library.
OCLC represents assets of this global community in an information marketplace dominated by innovative, fast-moving organizations that don't always share the values of the library community. The fact of the matter is that organizations don't HAVE values. Their leadership - people - do. After more than two decades of working at OCLC, I've seen good leadership and bad, good decisions and bad, and have played a role in both varieties. At no point in that time have I seen better leadership in more difficult circumstances than now, and I'm gratified to have played a small supporting part. By and large, I think my colleagues have as strong a claim to the library mission and philosophy as any institutional cohort. Nor is it a small point that OCLC is a collaborative with a governance structure. Spell that "a c c o u n t a b l e".
Still, if you're squirming a bit at the incursions of the dot com world into the purity of our own information enclave, join the club. I share this discomfiture, even as I am pleased to see the Microsofts and Googles and Sun Microsystems paying a lot of attention to the marketplace that libraries represent, not so much directly, as in recognition of the central place of library assets in the information ecosystem. It is appropriate to examine and challenge motivations of the players (including OCLC)... too little of that is done, as we enjoy the jumbo shrimp and the bottomless glasses of wine at sponsored receptions. But it is not a bad idea to ask the question with some knowledge rather than simply suspicions. Now, please excuse me... I think Sun is about to serve lunch.
My favorite piece at the Musée Rodin. I didn't get the metadata... chalk it up to jetlag and slack-jawed love at first sight.