Digitisation and archiving
Large-scale projects to 'digitise the past' and thereby
ensure future generations have networked access to print
publications, photographs, sound recordings, cinefilms
and other material have proved contentious.
Digitisation means users view a 'digital surrogate' (preserving
often fragile originals), access is not tied to physical
proximity (ie greater convenience and savings in staff
costs) and physical storage requirements are reduced.
However, cost savings are not as great as anticipated
and there has been considerable criticism of institutions
- such as the British Library - that digitised and then
destroyed major parts of their collection.
The Preserving Digital Information report
of the CPA & RLG suggests that digitisation by individual
institutions is often not cost-effective; however, resource
sharing (ie collaborative digitisation and access to shared
material through an intranet or a global digital library)
is attractive.
Andrew Odlyzko echoed Michael Lesk, author of Practical Digital
Libraries: Books, Bytes & Bucks (San Francisco,
Morgan Kaufmann, 1997), in noting that
the costs of just the buildings of the new British Library
in London and the new French National Library in Paris
are two or three times higher than the costs of converting
their book collections to a digital format. In a more
rational world, the money going into bricks and mortar
would have gone into scanning the books, which would
have provided much more rapid and convenient access
to the data for scholars. The physical volumes themselves
could be housed in cheap warehouses, for the rare occasions
when they might have to be consulted. However, user
resistance to new media, copyright constraints, and
the politicians' and the public's liking for visible
edifices and for solid books make it hard to take that
step.
....
the entire mathematical literature collected over the
centuries is perhaps 30 million pages, so digitizing
it at a cost of $0.60 per page would cost $18 million,
less than ten percent of the annual journal bill
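The arithmetic behind that estimate is easy to check. A minimal sketch in Python, using only the two figures quoted above (the function and variable names are ours and purely illustrative; the "implied" journal bill is simply what the "less than ten percent" claim entails, not a figure from the source):

```python
# Figures taken from the Odlyzko quotation; nothing here is new data.
PAGES = 30_000_000        # estimated pages of mathematical literature
CENTS_PER_PAGE = 60       # quoted $0.60 per page, held in cents to avoid float rounding


def digitisation_cost_dollars(pages: int, cents_per_page: int) -> float:
    """Total conversion cost in dollars for a given page count and per-page cost."""
    return pages * cents_per_page / 100


total = digitisation_cost_dollars(PAGES, CENTS_PER_PAGE)

# If the total is "less than ten percent" of the annual journal bill,
# that bill must exceed ten times the total.
implied_min_annual_bill = total * 10

print(f"Digitisation cost: ${total:,.0f}")
print(f"Implied annual journal bill: more than ${implied_min_annual_bill:,.0f}")
```

Changing the per-page cost shows how sensitive such estimates are: the total scales linearly with whatever scanning rate an institution can actually achieve.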
benchmarks
In the US the American Memory (AM)
project, aimed at providing digital access to millions
of items held by the Library of Congress and other institutions,
has, for political as well as technological reasons, concentrated
on the digitisation of images - including maps, paintings,
photographs - and some manuscripts of literary or historic
significance.
Locally the National Archives of Australia (NAA)
has digitised key federation documents and commenced the
daunting task of providing digital colour facsimiles of
the millions of documents in its custody, while the National
Library's PictureAustralia (PA)
is a gateway for images from the State Library of Victoria,
University of Queensland Library, Australian War Memorial
and other institutions.
The University of California's Alexandria Digital
Library project (Pharos)
aims to create a digital library encompassing maps and
pictorial material for use by institutions across the
US.
Yale University's Project Open Book (POB)
is exploring the conversion of microfilm, hitherto the
medium of choice among the archival mafia, to digital
imagery.
The Mellon Foundation, noted earlier in this guide, has
funded the large-scale Journal Storage (JSTOR)
Project, with universities coming together to provide
ongoing electronic access in a secure environment to over
147 law, science and humanities journals. Imaging of that
print material is now close to the target of 750,000 journal
pages, with access by over 1,000 institutions. In April
the Foundation announced establishment of artSTOR, a large-scale
digital image library.
As part of the Making of America Project a consortium
of US universities, including Cornell and the University of Michigan,
is placing the text of several thousand magazines
and books online.
private projects
Most media attention has focussed on two private initiatives
- Bartleby and Gutenberg - although they're dwarfed by
major academic digitisation projects.
Project Bartleby (Bartleby)
began with online publication of Whitman's Leaves
of Grass and now features a full-text searchable database
containing over 200,000 web pages, including over 22,000
quotations and 4,765 poems. Most of the content is out
of copyright: Bartleby's essentially capturing old publications.
Project Gutenberg (Gutenberg)
also draws on public domain works. Presentation is in
ASCII rather than HTML or PDF and material is added to
the database by volunteers so the coverage is eclectic
rather than comprehensive. Gutenberg has around 3,000
titles. It's unrelated to the academic Gutenberg-E
project described in Analysphere of 1 July. There's
a characteristically incisive analysis by Bradford DeLong,
commenting that founder Michael Hart's dream "has failed
to achieve any form of critical mass" in contrast to Linux
and continues to move ahead at a snail's pace.
The more ambitious Universal Library Project (UL)
aims to "start a worldwide movement to make available
ALL the Authored Works of Mankind on the Internet so that
anyone can access these works from any place at any time".
Searching and viewing would be free; individuals and existing
libraries would be able to purchase digital copies.
archiving the web
There is increasing interest in archiving the web,
with projects providing thematic/sectoral collections,
offering snapshots or more grandiosely attempting to capture
the entire web.
An example of the latter is the US-based Internet
Archive, under the leadership of Brewster Kahle. His
2001 Public Access to Digital Material article
(with Rick Prelinger & Mary Jackson) claimed that
universal digital access is attainable and is the "epic
opportunity of our digital age", since
the technology has reached the point where scanning
all books, digitizing all audio recordings, downloading
all websites, and recording the output of all TV and
radio stations is not only feasible but less costly
than buying and storing the physical versions.
That's an intriguing but very problematical vision, with major
questions regarding intellectual property and resource
identification. We've explored some of the issues in a
more detailed profile.