    February 19, 2008

    Uncoupling identification and resolution

    Conflating identity and resolution of Web resources is often useful... it is usually the right thing to do. But I've written in the past about the need, on occasion, to uncouple these fundamental functions.

    There is a long-standing and often vitriolic debate among Web technologists about whether Web architecture should include a component that does this: identifiers that simply identify, and carry no implication of resolution. The Just-Use-HTTP camp insists that there is no place in the naming architecture of the Web for identifiers not grounded in the HTTP protocol, even when resolution is not intended.

    Others of us have argued that persistent identifiers without a direct resolution mechanism are useful and desirable. DOIs are a purpose-built example of this, designed to support the management of commercial publishing assets. INFO URIs are intended to meet the need in other niches. URNs were the earliest effort in this direction, though they have not been wildly successful.

    Thus, it was interesting and ironic to see a post on a W3C Team blog about excessive traffic (100 million hits a day!) resulting from the static HTTP identifiers associated with DTDs (document type definitions) hosted by the W3C. It is important to declare, maintain, and serve such resources, but they are not intended for routine retrieval by applications or users. Instead, such structural declarations are meant to be parsed by applications that intend to process data according to the declarations in the DTD or, more often, simply to confirm... 'yep... this is a document of a type known to me.'
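
    To make the distinction concrete, here is a minimal sketch (in Python, using the standard xml.sax module) of an application that treats a DTD's identifier as a name to look up locally rather than an address to fetch. The catalog contents, the CatalogResolver class, and the parse_offline helper are my own illustrative assumptions, and the one-line DTD body is a stand-in, not the real XHTML DTD.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Hypothetical local catalog: public identifier -> locally stored DTD bytes.
# The body is a tiny stand-in, not the actual W3C DTD.
LOCAL_CATALOG = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": b"<!ELEMENT html ANY>",
}


class CatalogResolver(EntityResolver):
    """Answer DTD lookups from a local catalog; never touch the network."""

    def __init__(self, catalog):
        self.catalog = catalog
        self.misses = []  # system identifiers we had no local copy for

    def resolveEntity(self, publicId, systemId):
        # Keep the system identifier as a label only; it is not dereferenced
        # because we always hand back a ready-made byte stream.
        source = InputSource(systemId)
        body = self.catalog.get(publicId)
        if body is None:
            self.misses.append(systemId)
            body = b""  # unknown type: supply an empty subset instead of fetching
        source.setByteStream(io.BytesIO(body))
        return source


def parse_offline(xml_bytes, catalog=LOCAL_CATALOG):
    """Parse a document, routing every external entity through the catalog."""
    parser = xml.sax.make_parser()
    # Ask the parser to consult our resolver for external entities.
    parser.setFeature(feature_external_ges, True)
    resolver = CatalogResolver(catalog)
    parser.setEntityResolver(resolver)
    parser.setContentHandler(ContentHandler())
    parser.parse(io.BytesIO(xml_bytes))
    return resolver
```

    The point of the sketch is only that the identifier in the DOCTYPE does its job (telling the application what type of document this is) without any request ever reaching the server that happens to be named in it. A real application would presumably ship full copies of the DTDs it cares about.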

    The use of HTTP identifiers (URLs) is an implied promise... a label that says 'there's something here for you to retrieve'. Yes, I know about HTTP headers. I understand that an application written to the latest protocols will understand the return codes and should take intelligent action based on those codes. But it is laughable to expect all Web applications to be well behaved in this way. The blog post speaks of applications "creating a Distributed Denial of Service (DDoS) attack against W3C" and "abusive request patterns". In fact, the root cause of the very real dilemma faced by the W3C and others with this problem is the ideological opposition to an alternative solution: identifiers uncoupled from resolution.
    -----
    Photo: The Yarra riverfront in downtown Melbourne