Metadata Profile: URNs

URNs

As the web evolves, some experts are calling for Uniform Resource Names (URNs) to complement - or replace - Uniform Resource Locators (URLs) in identifying and thereby retrieving online documents.

URNs would, it's claimed, provide a persistent and unique identifier for digital resources - a more powerful version of the ISBNs used by librarians and publishers to identify books.

URLs

Finding information on the web is based on URLs - an address for each document with a format that's similar to the URL for this page: connectingguide10.htm

URLs identify documents according to their location. This document, for example, is located in the identification folder of the briefings component of the Caslon domain within the dot com domain space.

URLs are familiar to most users of the web, who take them for granted as a mechanism for identifying online documents and describing their location for future retrieval. However, they have been criticised by some as unsatisfactory.

Critics note that each URL simply points to the current location of a document, rather than uniquely identifying it independent of its location in cyberspace. If a resource is moved to a new location (renamed, placed in a new folder on the same site, moved to a new site), the URL is no longer useful because it points to a location that no longer exists. It's not unique and it's not persistent.

Librarians, publishers and proponents of global electronic copyright management systems (ECMS) have thus argued for a persistent and unique identifier that is specific to a particular digital resource rather than to its address. Their vision is that identification independent of location would facilitate access to the document wherever it sits, as long as it still exists on the Internet, and would underpin rights tracking systems.

concept

Proposals for a Uniform Resource Name (URN) scheme have two parts.

Each document would be marked with a standard, persistent and unique identifier as part of its metadata.

So that users could link from the URN to the specific URL, a 'resolver service' - essentially a global automated directory - would be required.
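The two parts can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not a description of any deployed resolver: the `ResolverService` class, its method names, the `urn:example:` identifier and the `example.org` addresses are all hypothetical.

```python
from typing import Dict, Optional


class ResolverService:
    """Toy in-memory stand-in for a URN resolver (hypothetical, not a real API)."""

    def __init__(self) -> None:
        # The 'directory': persistent identifier -> current URL of the document.
        self._directory: Dict[str, str] = {}

    def register(self, urn: str, url: str) -> None:
        """Record (or update) where the resource named by `urn` currently lives."""
        self._directory[urn] = url

    def resolve(self, urn: str) -> Optional[str]:
        """Return the current URL for `urn`, or None if the name is unknown."""
        return self._directory.get(urn)


resolver = ResolverService()
# Hypothetical identifier and addresses, for illustration only.
resolver.register("urn:example:report-42", "http://old.example.org/docs/report.htm")
# The document moves; only the directory entry changes, not the identifier.
resolver.register("urn:example:report-42", "http://new.example.org/archive/report.htm")
print(resolver.resolve("urn:example:report-42"))
```

The point of the design is that the identifier stored in the metadata never changes; only the directory entry behind it does.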

The expectation is that URNs would include a Namespace Identifier (NID) code and a Namespace Specific String (NSS). The NID code would flag the identification system being used and facilitate interpretation of the NSS, a unique code identifying the individual document.
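As a rough sketch of how software might separate the two components, the following Python assumes the `urn:<NID>:<NSS>` layout used in the IETF's URN syntax work. The `Urn` type and `parse_urn` function are hypothetical names, the ISBN shown is purely illustrative, and the check is deliberately simplified (no validation of permitted characters).

```python
from typing import NamedTuple


class Urn(NamedTuple):
    nid: str  # Namespace Identifier: which identification system is in use
    nss: str  # Namespace Specific String: the code for the individual document


def parse_urn(text: str) -> Urn:
    """Split 'urn:<NID>:<NSS>' into its parts (simplified; no character checks)."""
    parts = text.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError(f"not of the form urn:<NID>:<NSS>: {text!r}")
    # The NID is treated as case-insensitive; the NSS is left exactly as given.
    return Urn(nid=parts[1].lower(), nss=parts[2])


urn = parse_urn("urn:ISBN:0-395-36341-1")  # illustrative value only
print(urn.nid)  # isbn
print(urn.nss)  # 0-395-36341-1
```

A resolver would typically look at the NID first to decide which agency's rules apply, then interpret the NSS under those rules.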

Where would the NID and NSS come from? The vision is that the international ISBN and ISSN agencies - described in our ECMS profile - would use 'ISBN' and 'ISSN' as NIDs, with the existing International Standard Book Number or International Standard Serial Number serving as the NSS. Various national libraries, including Australia's NLA, are considering URNs based on National Bibliography Numbers (NBN), with 'NBN' as the Namespace Identifier and the existing NBN used as the NSS.

What would it look like? There's a detailed explanation of an NBN system in the Nordic Metadata Project's URN User Guide (UUGuide).

coming soon to a desktop near you?

Advocates for the URN - drawn primarily from the library sector and associated information technology researchers - have claimed that "the Uniform Resource Name (URN) may eventually be the internet standard for identifying and finding electronic resources".

At this stage that claim appears overambitious. It assumes achievement of a network architecture - in particular the resolver service - that is still taking shape. Work by the Internet Engineering Task Force's (IETF) Uniform Resource Names Working Group (URNWG) continues.

Just as importantly, while it's easy for particular sectors to mandate standards, getting the commitment of the people who create web pages is another matter. The experience of the library sector in promulgating the Dublin Core (DC) metadata standard is a good example.

DC has not broken outside the walls of the curatorial ghetto and thus is found on much less than 1% of the web. It is unclear whether businesses, individuals, non-profit groups and even many academic institutions can be persuaded to adopt URNs, particularly since reports noted in our Metrics guide suggest that the half-life of pages on the web is less than two years.

The publisher-oriented Digital Object Identifier (DOI) scheme, perhaps the most advanced ECMS project, is less ambitious, using the competing Handle system to allocate a unique digital identifier to commercial digital publications.

As an "interim measure" some figures are promoting the Persistent Uniform Resource Locator (PURL), in which the identifier points to a resolution service instead of the actual location of the digital resource. The resolution service then redirects the user to the appropriate URL, serving as another link to the current location of the particular document. When that document's location changes, it would only be necessary to update the PURL resolver service for users to find it with the same PURL.
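The redirect step can be sketched in a few lines of Python. This is a simulation only, with no network code: the `PurlResolver` class, the `/caslon/urn-profile` path and the `example.org` addresses are hypothetical, and the use of an HTTP 302 response with a `Location` header is an assumption about how such a service would typically answer.

```python
from typing import Dict, Tuple


class PurlResolver:
    """Toy PURL resolution service: answers a lookup with a redirect."""

    def __init__(self) -> None:
        # PURL path -> current URL of the document it stands for.
        self._targets: Dict[str, str] = {}

    def update(self, purl_path: str, current_url: str) -> None:
        """Point an existing or new PURL at the document's current location."""
        self._targets[purl_path] = current_url

    def handle(self, purl_path: str) -> Tuple[int, Dict[str, str]]:
        """Return (status code, headers) as a simulated HTTP response."""
        target = self._targets.get(purl_path)
        if target is None:
            return 404, {}
        return 302, {"Location": target}


purls = PurlResolver()
purls.update("/caslon/urn-profile", "http://site-a.example.org/urn.htm")
# The document moves: the PURL stays the same, only the resolver is updated.
purls.update("/caslon/urn-profile", "http://site-b.example.org/metadata/urn.htm")
status, headers = purls.handle("/caslon/urn-profile")
print(status, headers["Location"])
```

Unlike a URN, the PURL is itself an ordinary URL, so it works in existing browsers; the cost is that every lookup depends on the resolver's own address staying stable.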