it's all there?
This
page explores myths about online access to what some writers
have characterised as the "information cornucopia"
or global digital library: claims that everything you
want to know is online, that you can easily find it, that
you'll be able to do so in future and that people are
suffering from 'information overload'.
introduction
Brewster Kahle's 2001 Public Access to Digital Material
article
identified universal online access to content as the
"epic opportunity of our digital age", claiming
that
technology has reached the point where scanning all
books, digitizing all audio recordings, downloading
all websites, and recording the output of all TV and
radio stations is not only feasible but less costly
than buying and storing the physical versions.
A year later Business Week ran
with the spin, claiming that Kahle's Internet Archive
is -
a
collection of 10 billion pages, including Internet sites,
movies, and Usenet postings five times larger than the
amount of information at the Library of Congress
and
that
Today,
a single copy of everything that's on the Net -- equal
to 15,000 copies of Encyclopedia Britannica -- is added
to the archive every two months.
Would
that it were true: the Archive in fact seems to be
far more selective.
At a less rarefied level there are four basic myths about
information in cyberspace
- everything is online
- all online content can be found
- all online content can be accessed (and will be accessible in future)
- access is resulting in information overload
what's online?
The explosion of web sites,
high numbers of results displayed by search engines, ready
access to some contemporary music through filesharing
and casual references to global digital libraries have
encouraged a belief that "everything" is online
... or could be available via the internet through the
efforts of volunteers or through the removal of impediments
such as intellectual property.
That belief is at best naive. Although there are several
million sites on the web (with an unknown number of pages),
much of the content is corporate or personal and ephemeral.
As of mid-2003 a majority of pages are probably in English:
some languages
are barely represented in terms of readership or authorship
(eg there's little Lao, Inuit, Bantu or Amharic content).
What text is online is patchy in the extreme. Standard
works from the Latin and Greek classics are online (albeit
often in superseded editions and translations) but there's
little Provencal, Persian, Chinese, Khmer or Aramaic.
More recent literature is sparse: umpteen copies of pronouncements
by Gilmore and Barlow but no Patrick White, Christina
Stead, David Malouf, Heimito von Doderer or Robert Musil.
There's little Proust, less Mann (Heinrich, Klaus or Thomas).
Publishers such as Gale are undertaking large-scale digitisation
programs (eg Gale's 18th Century literature project, covering
20 million pages across 150,000 titles).
That activity is, of course, on a commercial basis and
- as in past microfilm or CD-ROM projects - access to
the text is generally restricted to academic ghettoes.
Digitisation of archival content from newspapers
and journals is underway but again, much of that content
won't be freely available. As we've noted in discussing
electronic publishing, many current serials are online
... but sit behind firewalls, with access on a subscription,
sessional or per-item payment basis. Few online newspapers contain
all the content that is featured in their print versions.
Biography and critical literature prior to the 1980s is
equally sparse and, given the priorities of initiatives
such as the Gutenberg and Bartleby projects, is likely
to remain so. Don't expect to find a standard edition
of Lukacs, Adorno, Kojeve or Bakhtin. Only scraps of historians
such as Namier, Bloch, Kehr, Febvre, Michelet or Matthiessen
are on the web. In summary, most of what is held by the Library
of Congress, the National Library of Australia or even a mid-range
university or community library isn't on the web.
What of music? A rarely-remarked feature is the almost
total absence of scores: the net provides access to recordings
rather than notation. With some exceptions the classical
repertoire is largely absent: little Machaut, Ives, Zemlinsky,
Pergolesi, Gesualdo.
As uptake of broadband increases, access to video content
is growing. At the moment most video on the web (and downloaded
through filesharing) emanates from the adult
content sector. Neither Hollywood nor national film
industries plan to release their libraries (including
early b&w silent films) onto the web. The BBC's proposal
to place much of its audiovisual library online is an
exception, one that has drawn little enthusiasm from commercial
and public sector peers.
is it readily identified?
Much content is freely available on the net. However,
ready identification of that content often poses particular
challenges. In discussing internet metrics we've noted
reports that suggest most traffic goes to a small proportion
of sites (the 'winner takes all' model): many potential
users simply don't find the content that is available.
For practical purposes that content does not exist.
Developments in enhanced search engines, metadata and
other identification mechanisms are arguably not keeping
pace with the growth of the net, the volatility of much
online content and the resistance of many users to unstructured
or 'naive' retrieval. It is clear that many users are
content to settle for second or third best and that
many are overwhelmed by the task of sifting through exhaustive
search results.
Even the major engines don't cover all of the public web;
few cover much of the 'deep' web, ie content that's behind
firewalls or is generated dynamically from databases rather
than static web pages that are readily spidered.
is it accessible?
A pernicious myth is the notion that most content
can be readily accessed. In considering digital divides,
usability and other
questions we have suggested that access to the net is
quite uneven. Much of the web is 'dark', either because
content is held behind firewalls (no password or no credit
card number = no access) or because site operators have
disregarded usability principles.
Within advanced economies substantial parts of the population
do not have ready access. That is because they face physical
challenges (eg poor sight and motor difficulties), because
the infrastructure is not available or because they simply
can't afford the ongoing investment in a recent computer
and broadband charges.
Such impediments, evident in Australia and New Zealand, are more
critical in other parts of the world where, as we've noted,
over a billion people don't have ready access to electricity
(and several million depend on dried cow dung and straw
for warmth and cooking). Hype from Microsoft, Cisco and
MIT about breaking down digital divides through wireless
networks seems somewhat misplaced when the cost of a personal
computer is several times the annual income of the average
family in central Africa or Bangladesh.
We have argued that notions of the digital divide encompass
deficits in skills, expectations and the broader economic
environment. Charles Kenny of the World Bank for example
comments that
Lack
of education is a major barrier to productive Internet
use .... In Ethiopia, 98 percent of Internet users in
1998 had a university degree, yet 64.5 percent of the
overall population is illiterate. Worldwide, most people
living on $1 a day are illiterate. Further, they usually
speak a minority language in their own country - few
speak a major global language. For example, about 17
million people in Nigeria speak Igbo. My search for
Web pages in Igbo turned up only five sites: a translation
of the Universal Declaration of Human Rights, a translation
of a document called 'The Four Spiritual Laws' (theological
provenance undetermined), a translation of the food
pyramid, a two-page Igbo phrase book, and a prayer manual.
There isn't an Igbo translation service on the Web,
so an Igbo speaker would be limited to these five. None
involved sound or video, so the illiterate Igbo speaker
would gain nothing. Bridging the gaps in language and
technical skills as well as basic literacy will be difficult,
considering the small per-student spending available
in the poorest countries' primary schools, where the
discretionary budget per student is as little as $5
a year.
Kenny
rightly dismisses hype about pervasive benefits from e-commerce
by noting that
even
if poor people are lucky enough to be literate and conversant
in a major world language, their use of the Web for
activities such as e-commerce is likely to be limited
by their lack of credit cards, not to mention the challenge
of persuading FedEx and UPS to start delivery services
in their neighborhoods. Limitations in relevant content
and ability to use that content perhaps best explain
why only 2.2 percent of India's Internet users have
ever engaged in buying or selling over the Web.
That
lack of the fantastic green plastic also precludes use
of paid-access sites outside libraries.
is it accurate?
Notions of the web as a well-ordered and comprehensive
free library (whose librarians provide quality control
in the acquisition of content and the systematic weeding
out of superseded content) are misplaced. Online publication
is not a guarantee of accuracy.
A more apt metaphor is the net as the 'marketplace
of ideas', in which everyone is free to offer content
and in which truth eventually triumphs over ignorance
or deception. Regrettably, in that marketplace lies are
often more seductive or simply easier to find. Much of
the factual information on the net is false or has become
so through the passage of time.
The self-referential nature of much online content creation
- authors appropriating online content without referring
to offline sources and the echo-chamber effect of much
blogging, exacerbated
by the 'winner takes all' phenomenon - means that inaccuracies
can gain wide circulation. That's of particular concern
for medical
sites. It's also of concern regarding sites with a historical
or technical reference function (one reason why this site
features a range of online sources and references to offline
writing). It's a basis for skepticism about arguments
that defamation online
is not a major problem, as the defamed can supposedly
'out-publish' the falsehoods in a triumph of free
speech.
One response is the development of a digital information
literacy, with readers having appropriate expectations
about what's found online, skills in assessing accuracy
and a capability for (and commitment to) checking information
found on the net.
information overload
A fashionable metaphor during the 1990s was that using
the web was like trying to drink from a firehose. More
broadly, critics have claimed - with more enthusiasm than
substance - that people in advanced economies are suffering
from 'information overload', infoglut
and web
addiction ... and that the overload is increasing.
Richard Saul Wurman's polemical Information Anxiety
(Indianapolis: QUE 2001) claims that
A
weekday edition of The New York Times contains
more information than the average person was likely
to come across in a lifetime in 17th century England.
with
the conclusion apparently being that the 'average person'
is now compelled to read the NY Times rather
than skimming text, channel surfing broadcast media and
becoming adept with the 'delete' button in managing incoming
email.
Such claims are essentially ahistorical and often rather
patronising, since they are predicated on a view of people
as passive receptacles without choice or ability to discriminate.
A recurrent lament since the beginning of recorded history
is that people - particularly literati and executives
- are faced with too much information, have too little
time and are too stressed. Perhaps that is part of the human
condition.
Upper-class consumers in Victorian London complained that
seven mail deliveries a day made life a misery. Contemporary
psychologists in Boston and Vienna worried that constant
telegrams were resulting in unprecedented stress. Critics
in Paris and New York claimed that the proliferation of
books and newspapers through new technologies (cheaper
paper, easier printing) and readier access by publishers
to capital was leading to nervous debility and other woes,
best addressed through time in the South Seas or the Wild
West. By the 1920s pundits were decrying a bombardment
from the phonograph and radio, with calls to 'switch off'.
Percy Shelley, in the 1821 Defense of Poetry,
had more provocatively commented
We
have more moral, political and historical wisdom than
we know how to reduce into practice: we have more scientific
and economical knowledge than can be accommodated to
the just distribution of the produce which it multiplies.
The poetry, in these systems of thought, is concealed
by the accumulation of facts and calculating processes.
There is no want of knowledge respecting what is wisest
and best in morals, government and political economy,
or at least what is wiser and better than what men now
practise and endure. But we let 'I dare not' wait upon
'I would', like the poor cat in the adage. We want the
creative faculty to imagine that which we know; we want
the generous impulse to act that which we imagine; we
want the poetry of life: our calculations have outrun
conception; we have eaten more than we can digest. The
cultivation of those sciences which have enlarged the
limits of the empire of man over the external world
has, for want of the poetical faculty, proportionally
circumscribed those of the internal world, and man,
having enslaved the elements, remains himself a slave.
Herbert
Simon's 1978 Rationality as Process and as Product
of Thought commented that
In
a world where information is relatively scarce, and
where problems for decision are few and simple, information
is almost always a positive good. In a world where attention
is a major scarce resource, information may be an expensive
luxury, for it may turn our attention from what is important
to what is unimportant. We cannot afford to attend to
information simply because it is there.
Perspectives
are offered by Mark Brosnan's Technophobia: The Psychological
Impact of Information Technology (New York: Routledge
1998), Langdon
Winner's Autonomous Technology: Technics-Out-of-Control
as a Theme in Political Thought (Cambridge: MIT Press
1977) and David Shenk's Data Smog (New York:
Harper 1997). In discussing info overload and cyberaddiction
in our Digital Environment guide
we've commented on other works such as The Age of Access:
The New Culture Of Hypercapitalism Where All Of Life Is
A Paid-For Experience (New York: Tarcher 2000) from
dyspeptic-by-numbers Jeremy Rifkin and Theodore Roszak's
The Cult Of Information: A Neo-Luddite Treatise On
High Tech, Artificial Intelligence & The True Art
Of Thinking (Berkeley: Uni of California Press 1996).