
    November 15, 2007

    The Dark Side of Facebook

    I've been on the road for a few days and hadn't logged into my Facebook account in that time, or in fact for some time before that.  So imagine my surprise when I logged in this morning and found the following:

    Please Read This!
    Warning! Your account could be disabled.

    Your use of our Notes feature indicates that you may be in violation of Facebook's Terms of Use. Continued misuse of Facebook's features could result in your account being disabled. If you have any questions or concerns, you can visit our FAQ page here.

    There is no further indication of what I did or in what way it is objectionable, though clicking through to the FAQ page reveals that I was apparently abusive in the rate at which I used some feature.  Since I hadn't used Facebook at all, and it is the NOTES feature I am accused of abusing, I can only surmise that some interaction from my blog feed into my Facebook notes must account for it, and since I don't post enough, perhaps it is my twittering to my blog that has triggered this?  Any insight that my allegedly abused 'friends' on Facebook might care to share would be welcome, though, I gotta say... there just isn't enough there there to care.  So, send me mail, or call, or post a note to CraigsList, cuz I don't think I'm welcome back in FaceVille.  Au revoir, mon ami!

    I do think it's interesting that in a so-called Social Software environment, my 'privileges' can be withdrawn without explaining why, without explicating limits, without an indication of the duration ... without anything at all other than that I have, in a way unknown to me, violated a limit that cannot be made known to me, and that I am timed out for a duration that will not be made known to me. Sounds pretty antisocial to me.

    I have to say, I feel a swelling in my breast at the notion that I have become, with so little effort, a pariah, an outcast, a dangerous desperado... lo... a SPAMMER! on the world's fastest growing social networking system. (An entrepreneurial opportunity, perchance?  FaceSlap! When you're too marginal for polite company on the hottest advertising platform of the year.)  How will I get along without ads for the Seattle Personals???

    I feel special!

    Transcript of my 'citation' follows.

              Warning - Blocked from Using Feature

    Facebook has determined that you were using a feature at a rate that is likely to be abusive. Before you were blocked, you were given a warning to slow down your use of this feature. For example, if you were blocked from sending friend requests, then we determined that you were adding new friends too quickly. Please be aware that further abuse of such features can result in your account being permanently disabled and removed from the site entirely.
    Facebook has limits in place to prevent abuse of our features and to protect users from potential spam.
    Unfortunately, Facebook cannot provide any specifics on the rate limits that we enforce. Please know, however, that the speed at which you are acting and the sheer number of actions you have made are both taken into account.
    The duration of the block varies depending on the nature of the offense, ranging from a few hours to a few days. When the block is over, please proceed with your site activity at a slower rate so as to avoid hitting another block or having your account disabled.

    Facebook will not lift this block for you until the entire penalty time has elapsed.
    -----
    Don't you just KNOW this guy sees right to who I really am???

     

     

    Quantity has a (loss of) quality all its own

    Some years ago my colleague Thom Hickey encountered the rubric "Quantity has a quality all its own" (nominally attributed to Josef Stalin), and it has been a time-worn inside joke for some of us in OCLC Research ever since, largely, I suppose, for the variety of circumstances that plead for its invocation. (One does wonder if it is as snappy in Russian as in English?)  Anyway, the most interesting tidbit about preservation I've encountered at this meeting fits nicely.

    David Rosenthal, of the LOCKSS and CLOCKSS efforts, advances an interesting argument about confidence in digital preservation systems.  It's a thought experiment about measuring the effectiveness of large data store preservation systems.  If you are inclined to take for granted the stability of digital media and the systematic efforts appropriate to its reliable management, reading this post should disturb your sleep.

    David calls it the Petabyte For a Century argument, and it goes something like this:
    An organization wants to measure the effectiveness of a preservation system purported to be able to sustain a Petabyte store for 100 years with a 50% chance of integrity loss.  How might one model such a question?

    One way to think about it (David's way) is in terms of the half life of bits.  A petabyte is 8 x 10^15 bits, so the requirement translates to 0.8 exabit-years of preservation with a 50% chance of success, or a bit half life of 0.8 exa-years, which, as it turns out, is on the order of 100,000,000 times the age of the universe.  David goes on to make the point that the cost of measuring such a standard of performance turns out to be intractable (by at least 6 orders of magnitude).  So... the problem is really hard, and assessing the effectiveness of possible solutions is pretty hard as well.
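
    For anyone who wants to check the arithmetic, here is a minimal sketch in Python.  It is my own back-of-the-envelope model, not David's code: it assumes every bit fails independently with the same exponential half life, takes a petabyte as 10^15 bytes, and takes the age of the universe as roughly 13.8 billion years.  Under those assumptions the ratio comes out near 6 x 10^7, the same order of magnitude as the figure quoted above.  The function name is hypothetical.

```python
import math

BITS_PER_PETABYTE = 8 * 10**15    # assumption: 1 PB = 10^15 bytes, 8 bits each
YEARS = 100
AGE_OF_UNIVERSE_YEARS = 1.38e10   # assumption: roughly 13.8 billion years


def required_half_life_years(bits, years, survival_probability=0.5):
    """Half life each bit needs so that ALL bits survive `years`
    with probability `survival_probability`.

    Model (an assumption): bits decay independently, so
    P(all survive) = 0.5 ** (bits * years / half_life).
    Solving for half_life gives the expression below.
    """
    return bits * years * math.log(0.5) / math.log(survival_probability)


bit_years = BITS_PER_PETABYTE * YEARS
half_life = required_half_life_years(BITS_PER_PETABYTE, YEARS)

print(f"bit-years to preserve:   {bit_years:.1e}")
print(f"required bit half life:  {half_life:.1e} years")
print(f"ages of the universe:    {half_life / AGE_OF_UNIVERSE_YEARS:.1e}")
```

    Asking for better odds than a coin flip only makes it worse: pass a survival_probability closer to 1 and the required half life grows accordingly.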

     What kinda mileage does this car get?
         Oh, I can't tell you! 
    Why?
         Well, it would make the cost of the car 20 billion dollars!

    OK... thanks anyway... How about the brakes?
         Oh, I can't tell you!

    David's argument is well worth reading, and while reasonable people may quibble about the measure of success, it illustrates the seriousness of the problem, and has sparked a substantial debate in the digital preservation community.  I find it a convincing argument that bit-for-bit reproducibility is not going to be the standard by which we will measure large scale digital preservation efforts.  Got a good alternative?

    One participant in this working group (Richard, who I believe is from the BBC) raised the goal of simply having bit losses that are sub-catastrophic.  That is, systems should have among their design criteria the ability to recover from errors without the loss of large chunks of otherwise uncorrupted data.   A scratch on a vinyl disc will annoy, but you can still listen to those scratchy old LPs.  A reading error on a CD or DVD, however, can make an entire album unusable.  Returning to more graceful failure modes of earlier media can and should be part of the design of storage systems.  And while we're talking about petabytes... the BBC is generating 4 of them each and every week.  No secret why THEY are here.

    A representative of the British Library in the same session provided another Quality of Quantity argument, and the resultant need for multi-site, self-checking, self-healing systems.  Some stats:

    • 150 million items
    • 50 million cataloging records
    • 750 million newsprint pages
    • 5 billion pages
    • 650 km of shelving
    • 1.5 million disks and tapes

    All lovely, impressive numbers, music to our more-is-more ears.  As they (we all) go digital, however, some simple observations will make evident the need for substantial improvements in system behaviors.  File-error monitoring identified 'bit rot' errors at a rate of ~1 per thousand files in a 3 year period.  Not bad, eh?  Well, it translates to corruption in 4,000 files per month in a 150 million file collection.  Not acceptable.
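
    The conversion is simple enough to sketch.  This assumes the observed rate (about 1 corrupted file per thousand over 3 years) holds steady and applies evenly across a 150 million file collection; the function name is mine, not the British Library's.

```python
def corrupted_files_per_month(collection_size, errors_per_file, period_years):
    """Expected corrupted files per month, assuming a constant error rate.

    errors_per_file is the fraction of files found corrupted over
    period_years (here about 1 in 1,000 over 3 years).
    """
    months = period_years * 12
    return collection_size * errors_per_file / months


rate = corrupted_files_per_month(150_000_000, 1 / 1000, 3)
print(f"about {rate:,.0f} corrupted files per month")
```

    That lands a little over 4,000 files a month, in line with the figure above.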

    I've been guilty of a certain glibness about digital preservation... Oh, the bucket-o-bits part is the easy part... things get really hard as you go up the ladder and have to keep the bits, and the applications, and the operating environments and the hardware all synchronized....  Well, that may still be true, but I won't gloss over the bit bit again.
    -----
    Moon over a church spire on Avenue George-V on a perfect Paris evening.  My mother tried hard and largely without success to cure me of whining.  Walking the streets of Paris with a camera may be the best possible remedy for jet-lag-induced self pity. (Hi Mom!)

    PURLy Gates and Gift Horses

    I'm in Paris, struggling in a haze of jet lag to be productive (ok... maybe just stay awake at 3:00 AM Seattle time) at a meeting of the Sun Preservation and Access Special Interest Group, an effort to create a community of practice around persistent and reliable storage technologies for scholarly archives.  Last week I gave a presentation to the Digital Futures Alliance advisory committee, which included representatives from Microsoft, Vulcan Technologies, Real Networks, Amgen, and others.  A few weeks earlier, I participated in a Microsoft-sponsored symposium, the Global Research Library 2020 activity.  Seeing a common thread?  Various dot coms working at "doing well by doing good" (as Michael Keller said of Sun Microsystems a few minutes ago).  And of course, none of us in the library world take steps unshadowed by Google books/scholar/etc....

    Am I making your information-wants-to-be-free-spirit Flickr (oops... flicker) in the rapacious winds of commercial self interest?  Like it or not, libraries are embedded in a supply chain that involves the largest, most dynamic and innovative companies in the information marketplace.

    OCLC may be the biggest player in the library pond, but it is tiny in comparison to most of the organizations we liaise with in this rapidly changing marketplace.  So, when I see posts such as "There's no such thing as a free PURL", I would like to interject a bit of historical perspective and offer a teaching moment, so that the author of this post might understand the motivation for PURLs more clearly (and in the bargain, perhaps flog the latest developments in the PURL domain).

    As a member of the team that implemented PURLs, I can speak with some authority about the motivations and history of the effort.  PURLs emerged from the efforts of Eric Miller and myself in the IETF back in the earliest years of the Web, as people recognized that naming and identifying electronic resources in a persistent manner is in fact problematic.  Based on our experiences in what I have ever since described as the groundhog day meetings (cf. the movie of the same name), OCLC invested modest resources of the library collaborative in an 80% solution that achieved even more modest success as one piece of the ad hoc naming architecture of the Web. The PURL service is among the longest continuously operating pieces of Web technology, running with little interruption since its inception in the mid 1990s.

    I'm guessing OCLC spent a million dollars on PURLs in that decade if you roll in all the costs of the research, meetings, technology, and miscellaneous overhead for the various people who thought about the problem, wrote code, and maintained machines.   We made the code open-source, and have encouraged others to use it freely.  Recently, recognizing that the code base and architecture of the original implementation were a bit long in the tooth, OCLC funded a re-engineering of PURLs, with the idea of bringing the code up to date with current HTTP headers and making the code more embeddable and hence more useful in a web architecture that has evolved somewhat since the mid 1990s.  Rollout of the new code base will take place soon, and should help sustain the viability of the identifiers minted in the PURL domain as well as encourage wider deployment of PURLs in general.  It will, like the original code base, be made available as open-source software.

    Evidence of success of the strategy may be found in the adoption of PURLs for the identification of some billion URI-based assertions about proteins in the UniProt database, an international database of proteins intended to support open research in the biological sciences. In the latest release of UniProt (11.3), all URIs of the form:

    urn:lsid:uniprot.org:{db}:{id}

    have been replaced with URLs of the form:

    http://purl.uniprot.org/{db}/{id}

    Some 'live' examples:

    http://purl.uniprot.org/uniprot/P12345,

    http://purl.uniprot.org/taxonomy/9606,

    http://purl.uniprot.org/pdb/1BRC
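
    The rewrite from one form to the other is mechanical.  As a rough illustration only (this is not UniProt's or OCLC's code, and the function name is hypothetical), the mapping between the two patterns shown above could be sketched in Python like so.  It does pure string manipulation and makes no attempt to resolve anything over the network.

```python
LSID_PREFIX = "urn:lsid:uniprot.org:"      # old form: urn:lsid:uniprot.org:{db}:{id}
PURL_BASE = "http://purl.uniprot.org/"     # new form: http://purl.uniprot.org/{db}/{id}


def lsid_to_purl(lsid):
    """Rewrite a UniProt LSID-style URN into the corresponding PURL.

    Simplification (an assumption): the identifier is everything after
    the first colon following {db}; no special handling of any further
    colon-separated parts is attempted.
    """
    if not lsid.startswith(LSID_PREFIX):
        raise ValueError(f"not a uniprot.org LSID: {lsid!r}")
    db, sep, identifier = lsid[len(LSID_PREFIX):].partition(":")
    if not db or not sep or not identifier:
        raise ValueError(f"expected {LSID_PREFIX}{{db}}:{{id}}, got {lsid!r}")
    return f"{PURL_BASE}{db}/{identifier}"


for urn in ("urn:lsid:uniprot.org:uniprot:P12345",
            "urn:lsid:uniprot.org:taxonomy:9606",
            "urn:lsid:uniprot.org:pdb:1BRC"):
    print(lsid_to_purl(urn))
```

    Run against the three identifiers above, it prints the three example URLs.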

    Not a dime of revenue has come OCLC's way as the result of PURLs to offset the cost of developing and maintaining this system.  But a decade and more of thinking about the intricacies of identifiers will, I fervently hope, redound to the long term fiscal benefit of OCLC, as we "insinuate" ourselves into "more and more aspects of information organization and retrieval".  I want OCLC to be the most authoritative, reliable source of persistent identifier management on planet-library.

    OCLC represents assets of this global community in an information marketplace dominated by innovative, fast-moving organizations that don't always share the values of the library community.  The fact of the matter is that organizations don't HAVE values.  Their leadership - people - do. After more than two decades of working at OCLC, I've seen good leadership and bad, good decisions and bad, and have played a role in both varieties.   At no point in that time have I seen better leadership in more difficult circumstances than now, and I'm gratified to have played a small supporting part. By and large, I think my colleagues have as strong a claim to the library mission and philosophy as any institutional cohort.  Nor is it a small point that OCLC is a collaborative with a governance structure.  Spell that "a c c o u n t a b l e".

    Still, if you're squirming a bit at the incursions of the dot com world into the purity of our own information enclave, join the club.  I share this discomfiture, even as I am pleased to see the Microsofts and Googles and Sun Microsystems paying a lot of attention to the marketplace that libraries represent, not so much directly, as in recognition of the central place of library assets in the information ecosystem.  It is appropriate to examine and challenge motivations of the players (including OCLC)... too little of that is done, as we enjoy the jumbo shrimp and the bottomless glasses of wine at sponsored receptions.  But it is not a bad idea to ask the question with some knowledge rather than simply suspicions.  Now, please excuse me... I think Sun is about to serve lunch.
    -----
    My favorite piece at the Musée Rodin.  I didn't get the metadata... chalk it up to jetlag and slack-jawed love at first sight. 

    November 13, 2007

    Review: Joule (the restaurant)

    Among the downsides of living alone, eating alone most of the time is surely a prominent disaffection.  This past weekend I dragged myself away from the microwave and tried the newest star in the culinary firmament of Wallingford (a residential and business district just west of the University of Washington).  Aptly named Joule, it is a place of charm and energy, where I found eating without company more engaging than in most restaurants.

    I'm not a foodie by temperament or training, so I acknowledge the impetus for trying out Joule came from Betsy Wilson, dean of UW Libraries, and, I gather, something of a saintly patron to the energetic proprietors. On Saturday night, in need of comfort and sustenance, I dropped in.  Parking is everywhere a challenge in the Emerald City, so it was a pleasant surprise to find that it was easy to park in the adjacent bank parking lot.  Auspicious!

    Joule has a diners' bar that sushi habitués would find familiar, though in this case, you're ringside to food preparation involving flaring industrial burners and the orchestrated, chaotic excitement of watching meals created on the fly.  No precious antics here... serious professionals working on the adrenalin of that most exhausting startup experience - the restaurant.

    Watching the action unfold is fun and interesting, and a natural cover for the singular melancholy of dining alone (you folks at the tables... all I can say is you miss out!).  And the chef, aware that he's actually fixing your dinner while you watch, is solicitous and offertory in both a culinary and spiritual sense.  These folks obviously love to please the palates of others.

    I failed to resist the temptation to play the "Betsy sent me" card... Whoa!  The response, pressed palms and all, can best be captured in the word namaste. Heck, I'm a member of the family at this point.

    Oh... yeah... the food.  I liked it lots. Again, I'm no foodie, so I'll leave the culinary criticism to others (though I have to admit the boar's bacon sucked me right in).  I had three courses, all delightful, all bringing out the worst/best of my plate-licking inclinations.  Gauche, sure, but I'm guessing restaurateurs everywhere love it.  I have to say, a big part of the satisfaction for me was that I didn't eat alone. 

    Five smiles.
    ------
    The visage decorating this post stands watch in a grungy turn of the century (or so) warehouse that serves, in part, as an erstwhile home of the Saturday House Geek Fest, a loose aggregation of techies who get together on Saturdays to incubate ideas and code. I know nothing further of her pedigree. 

    November 01, 2007

    Timely Alerts

    When I was a graduate student back in the 1970s, the Health Science Library at the Ohio State University offered a service they called SDI, or Selective Dissemination of Information.  Registering the words or phrases you were looking for earned you a stack of cards on a regular basis, each identifying a journal article corresponding to one of your search terms. Nice for scanning, nice for filing.  It was pretty hot stuff at the time, a huge time saver compared with laborious thumbing of the latest Current Contents, which was likely to have already been through a lot of hands before it got to you. The subscription (I almost wrote prescription!) was pretty pricey and it was the rare graduate student so profligate as to have his or her own.
    One perused these cards with an admixture of gratitude, fear and guilt... So much easier, but is everything there? Shouldn't I spend Friday nights chained to Index Medicus to be sure?  Well, all that is in the past now... or is it? Of course not.  I signed up for Google Alerts some months ago to try to catch stuff I miss on various topics, and to further assure that no part of my life is left untouched by Google.  Seems to work reasonably well for things like news, but once again, the opacity of the operational model leaves one feeling uneasy.  This unease is amplified when one gets, as I did today, retrievals for one's vanity search (yes yes, I confess) of email posted to a listserv from two years ago.
    It's not that I've been waiting for this one to show up, but why now, months after initiating the search?  The Web is perhaps bigger than we fully appreciate.
    -----
    Yangon Taxi Stand, September 2007