URLs versus URIs
In Sean McGrath’s piece on URLs and social commitment,
he alludes to the common confusion between the acronyms URL and URI, correctly
pointing out that only Web-head protocol-wonks are liable to be caught using the URI
terminology (and most of them usually don't either). Everyone else on the planet uses the largely-interchangeable and better-understood moniker of URL.
Is this a distinction without a difference, as usage would lead us to believe? Unhappily, in today’s
Web, it rarely matters a whit. Sean tells
us why:
The great thing about URLs is that you can click on them.
That is a great thing, and it informs our expectation of
URLs to the exclusion of all other possibilities. The social contract implied in http:// is,
then, that URLs are actionable: you can click on them and bring the referent of the link into your machine
and read it or listen to it or watch it. The link serves as a pointer to a location, and clicking on
it invokes a behavior specified by the http protocol: voila!
So, what's to be unhappy about? Overloaded onto this simple actionable relationship is
the additional important function of identity. The URL serves as both a key for a retrieval transaction and an
identifier. It is no accident that CNRI,
in a stroke of marketing genius, chose the term Handle for their identifier
protocol, for that is exactly the right term for such identifiers. We want a handle for otherwise-slippery
electronic content so we can hold on to it, pass it back and forth, refer to
it, and hang it over our desk to grab like a frying pan from a pot-rack.
Mostly this overloading of identity and location/retrieval
is fine. And to the extent that it is, the conflation of URL and URI is not a problem. So what is missing?
Three things:
1. Persistent reference pointers – There are many classes of
electronic resources that we know we will want to refer to in a location-independent
way for as long as we can imagine: books, journal articles, or any component of a persistent resource of
cultural, social, or economic importance. Yes, we want these to be actionable (clickable) as far as possible, but sustainable
access requires that we distinguish between identity and resolution in the life
cycle of any information resource of more than passing importance. Conflating location and identity makes this harder (though not impossible).
2. Appropriate copy resolution – In a world without access barriers,
any copy is the appropriate copy. In a
world of tradable intellectual property, individuals and organizations have differential
access to resources. The Web should be
neutral about business models, but it cannot be indifferent to them. Owners of IP must have the means to manage
and meter access, and this generally implies a decoupling of identity and
resolution.
3. Conceptual resources – Our expectation that clicking 'gets' us something is not fully met if the resource is a conceptual asset. The development of Semantic
Web technology demands the application of an identity architecture to concepts
as well as documents, multimedia, and pizza-ordering forms. Proponents of the just-let-HTTP-do-it approach rightly
point out that HTTP URLs are entirely capable of being used for identifying
conceptual resources (to use the RDF parlance). This is undeniably true as far as the technology goes. The more interesting question is: what happens
when you click on such a link? What SHOULD happen? The answer is context-dependent. Some people, myself included,
are uncomfortable with the use of standard HTTP URLs for this purpose, because it
breaks the widespread-if-informal social contract of URLs. With concepts, you may wish to define them, or locate them within a larger conceptual structure, or access various of their attributes, but in general you're not trying to retrieve them.
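The decoupling that runs through all three items can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Handles, DOIs, or OpenURL resolvers actually work: the Resolver and Copy classes, the "example-id:" identifiers, and the example.org addresses are all hypothetical. The identifier stays fixed; the mapping layer decides which location, if any, a given requester should be sent to.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Copy:
    """One retrievable copy of a resource (hypothetical structure)."""
    url: str
    licensed_to: Optional[str] = None  # None means anyone may retrieve it


@dataclass
class Resolver:
    """Maps location-independent identifiers to their current copies."""
    registry: Dict[str, List[Copy]] = field(default_factory=dict)

    def declare(self, identifier: str) -> None:
        # An identifier can exist with no retrievable copy at all,
        # as with a purely conceptual resource.
        self.registry.setdefault(identifier, [])

    def register(self, identifier: str, copy: Copy) -> None:
        self.registry.setdefault(identifier, []).append(copy)

    def resolve(self, identifier: str,
                affiliation: Optional[str] = None) -> Optional[str]:
        copies = self.registry.get(identifier, [])
        # Appropriate copy: prefer one licensed to the requester's institution.
        for copy in copies:
            if affiliation is not None and copy.licensed_to == affiliation:
                return copy.url
        # Otherwise fall back to the first openly accessible copy.
        for copy in copies:
            if copy.licensed_to is None:
                return copy.url
        return None  # identified, but nothing retrievable for this requester


resolver = Resolver()
resolver.register("example-id:article-42",
                  Copy("https://preprints.example.org/42"))
resolver.register("example-id:article-42",
                  Copy("https://library.univ-a.example.org/42",
                       licensed_to="univ-a"))
resolver.declare("example-id:concept-blue")

licensed = resolver.resolve("example-id:article-42", affiliation="univ-a")
public = resolver.resolve("example-id:article-42")
concept = resolver.resolve("example-id:concept-blue")
print(licensed, public, concept)
```

Here the same identifier resolves to the library copy for a requester affiliated with "univ-a" and to the open preprint for everyone else, while the concept identifier is known to the resolver yet resolves to nothing retrievable. Moving a copy would mean updating the registry, not reissuing the identifier.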
Each of these examples begs for decoupling of identity and
resolution in some contexts, but each requires an additional layer of mapping of location or function. Getting back to Sean’s consumer contract,
these layers all come
at very significant extra cost in terms of complexity. On this issue, the world
has voted very loudly with its mouse-clicking fingers. The world values
hyperlinking simplicity over complexity by many orders of magnitude.
The resolution of these conundrums will require daunting co-evolution of technology, business
processes, and cross-community practices. Technologies such as Handles, DOIs, OpenURLs, PURLs, and "info" URIs all represent approaches to addressing aspects of these problems. At this time they are niche technologies. Their impact on the constellation of problems we know as identity-versus-resolution will depend far more on business processes than on TECHNOLOGY (or the ideologies of their proponents or detractors). Meanwhile, the URL rules.