overview
This note discusses search engine optimisation (SEO) and the
SEO industry.
It covers an introduction, SEO optimisation, the SEO industry,
ranking factors and relevant studies.
It supplements discussion elsewhere on this site regarding
search engines (and popular search terms), metadata, online
resource identification and networks.
The following page considers SEO principles, myths and responses.
introduction
Search engine optimisation aims to ensure that online content
is found by users of web search engines, in particular through
appearance of the 'optimised' site/page at the top of the list
of search results.
It reflects recognition that many users rely on search
engines rather than domain
names, links or other pointers (eg a URL featured on
a business card or the side of a bus) for identification of
content. It also reflects awareness that some users rely on
search engines as their primary measure of the quality of
online content, for example assuming
that the first item on a list of search results will be the
most authoritative, accurate or merely up-to-date.
Vendors of search optimisation services offer various methodologies
that are claimed to "guarantee" online content a
favourable position in search results.
Some of those methodologies are so obvious that it is surprising
consumers would pay for the advice or emphasise optimisation
at the expense of more effective mechanisms to attract the
attention of desired demographics.
Other claims by participants in the search optimisation industry
are deceptive or merely nonsensical. That has led critics
to compare SEO businesses with those that purported to offer
sure-fire solutions for choosing
the perfect domain name and
thereby being 'found' during the dot-com bubble.
SEO optimisation
Search engine optimisation is predicated on recognition that
search engines use algorithms for sorting web sites, pages
and other content for presentation in a list of search results
in what is sometimes referred to as a search engine results
page (SERP). Those algorithms are proprietary and often highly
sophisticated. They are not static: they change over time,
reflecting factors such as changing user needs, efforts to
foil abuse by site operators and work to produce better results.
Algorithms and thus SERP results vary from engine to engine.
Variation is also attributable to different coverage of the
net: some engines have simply encountered more sites/pages
than others, some visit different parts of the net more frequently
than other engines, some process submissions about new sites
more quickly than their competitors.
Most base their retrieval on identification of content within
the body of a HTML page, PDF file or other online resource.
They are not based on metadata in the header of files. Major
engines indeed appear to penalise crude attempts by SEO vendors
and site operators to gain a higher rank by using a particular
word twenty or one hundred times in the header.
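Crude repetition of that kind is easy to detect mechanically. The following is a minimal Python sketch, assuming a simple test that flags any single word recurring more than a set number of times in a metadata string. The function name `looks_stuffed`, the threshold of five and the sample strings are hypothetical; no engine has published its actual test.

```python
import re
from collections import Counter


def looks_stuffed(header_text: str, max_repeats: int = 5) -> bool:
    """Return True if any single word recurs more than max_repeats times.

    Hypothetical heuristic: the threshold is an assumption made for
    illustration, not a documented rule of any search engine.
    """
    words = re.findall(r"[a-z0-9]+", header_text.lower())
    if not words:
        return False
    # most_common(1) yields the single most frequent word and its count.
    _, top_count = Counter(words).most_common(1)[0]
    return top_count > max_repeats


# Illustrative inputs only.
natural = "cheap flights, hotel deals, travel insurance"
stuffed = " ".join(["flights"] * 20)
```

Here `looks_stuffed(stuffed)` is true and `looks_stuffed(natural)` is false. A real engine would presumably weigh many more signals than raw repetition.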
They instead rank sites/pages through weightings that appear
to include -
- whether
the particular resource features the search term (or an
equivalent term) - often referred to as a keyword, a usage
that confuses many people from a library, print indexing
or legal background
- the
significance of that term (eg does it occur once or several
times; is it featured as a page title, subdirectory name
or navigation point within a page; does it occur as part
of what the engine perceives as natural syntax or as an
attempt to artificially weight the page)
- factors
such as the site's age, liveness, 'reputation' (in particular
whether other respected sites point to it) and avoidance
of negatives such as participation in link exchange schemes.
Some of those factors are highlighted below.
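As a rough sketch of how weightings of that kind might combine, the Python below scores a page for a single-word search term using presence of the term, its placement in the title, a capped count of occurrences and a capped count of inbound links. The `Page` fields, the function `toy_score` and every weight are assumptions invented for illustration; the real algorithms are proprietary.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    body: str
    inbound_links: int  # links from respected sites (assumed to be known)


def _words(text: str) -> list:
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_score(page: Page, term: str) -> float:
    """Combine term presence, placement and reputation into one number.

    All weights are hypothetical and chosen only to show the idea.
    """
    term = term.lower()
    in_title = term in _words(page.title)
    occurrences = _words(page.body).count(term)
    if occurrences == 0 and not in_title:
        return 0.0  # the resource does not feature the term at all

    score = 1.0  # base score for featuring the term
    if in_title:
        score += 2.0  # term appears in the page title
    # Reward repetition only up to a cap, so further repeats add nothing.
    score += 0.5 * min(occurrences, 3)
    # Reputation: each inbound link adds a little, again up to a cap.
    score += 0.1 * min(page.inbound_links, 50)
    return score
```

Under these assumed weights a page that uses the term a few times, carries it in the title and has some inbound links outscores a page that merely repeats the term a hundred times, because repetition beyond the cap earns nothing.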
Many
SEO services centre their marketing on a supposed understanding
of the individual algorithms and on claimed expertise in retitling
web pages or rewording text to increase the number of keywords
per page. Such promotion is problematical; in particular mechanistic
approaches may result in -
- penalisation
by search engine operators, which are known to have automatically
given lower rankings to, or even excluded, some sites after
egregious abuses
- pages
that read strangely and thereby send the wrong message to
users
the SEO industry
Online
resource identification is a billion dollar sector of the
Australian and global economy, with search engine and directory
operators gaining (or aspiring to gain) substantial revenue
from paid placement and other services and site operators
seeking ways of ensuring that their sites appear prominently
in SERP results.
The SEO industry appeared in the mid 1990s and first gained
public attention after 1997 as the dot-com bubble expanded.
It is an echo of the domain naming and domain name valuation
industries, which flourished with the bubble and attracted
criticism over poor (even deceptive) performance.
There has been no comprehensive independent study of the industry.
At a global and Australian level it appears to be volatile,
with low barriers for entry by individuals and businesses,
many of whom seem undeterred by unfamiliarity with search
technologies and - alas - uninhibited about offering guarantees
that demonstrably cannot be met.
SEO services for example cannot guarantee that any new site
will gain a top ranking immediately, particularly as some
engines 'sandbox' new sites (ie do not include them in SERP
or give them a low SERP ranking) for several weeks or months
in order to minimise evanescent sites created as pointers
to adult content sites.
There is no licensing by government or accreditation by the
search engines, which are fiercely competitive. Margins for
successful SEO services appear to be high but many minor operators
appear to leave the industry within two years, whether because
they discovered revenue was hard to achieve, they received
bad word of mouth (or even court action) from disappointed
clients or they simply got bored.
Some have expanded into search engine submission services,
charging what are often exorbitant fees to alert engines and
directories to the existence of a client's site. Such services
often claim that there are 2,500 or even 5,000 search engines,
with success supposedly requiring submission to each and every
engine. Elsewhere on this site we have noted that those figures
are gross exaggerations (eg they conflate directories with engines)
and that most users rely on a handful of engines.
Given the comments above it is unsurprising that there are
no generally accepted standards, particularly standards developed
by entities outside the industry and articulated by independent
organisations such as Standards Australia or ANSI.
Some industry participants proclaim that their staff are "certified",
for example have certification in "search engine optimisation"
or are graduates of an "advanced search engine marketing
skills" program. Sceptics respond - fairly or otherwise
- that it is difficult to embrace self-certification or diploma
mill-style certification. Wariness is appropriate where
-
- there
is no independent testing or validation of the certification
- the
expertise of the graduate is not reflected in endorsement
by computer scientists or academics engaged in the study
of information retrieval.
Other
SEO vendors emphasise a proprietary methodology, often with
a black box approach that claims to identify desired keyword
weightings and enable the service to effectively rewrite sites/pages
for guaranteed higher rankings.
Some, with a hazy understanding of engines such as Google,
claim to dramatically boost a site's ranking through inclusion
of the most popular search
terms or through rapid generation of other sites that exist
merely to point to the client's site.
Promises from such services typically feature guarantees that
the chosen site will go to a top page - or even the number
one rank on that SERP - within weeks or even days. In discussing
such myths later in this note we suggest that clients might
better spend their money elsewhere, particularly if their site
isn't unique and is located in a crowded part of cyberspace.
factors
Information about how different search engines rank search
results is uncertain. Understanding has been impeded by myths
about SERP and the effectiveness of SEO services.
Factors that may be significant for major engines
such as Google appear to include weightings or exclusions
regarding -
- age
(including how long the site has been online, the age
of individual pages and measures of how frequently content
is updated). Overall there appears to be a bias in favour
of sites that have existed for a while, are stable and are
updated on an ongoing basis - thereby being different from
sites that are 'dead' and from those that are manufactured
merely to redirect traffic or increase another site's link
count
- quantity
(including number of pages/files, number of words in aggregate
and number of words per page)
- availability
(is the content available 24/7, especially in regions such
as Australia where continuous availability is expected)
- positive
reputation (number of citations by other sites, particularly
citations by sites that have a high score in terms of age,
quantity and so forth)
- technical
negatives (non-compliant code, broken outgoing links, conflicts
between page titles and page content, indications that metatags
have been heavily 'optimised' through recurrent use of words
such as sex or adult, use of 'invisible' text)
- reputation
negatives (outgoing links that point to sites with a low
reputation, files that feature illegal content or malware,
participation in commercial link exchange schemes, sudden
spurts in inclusion of links to sites with a poor reputation)
- quality
indicators (uniqueness of content, inclusion of bibliographic
material and of automatically verifiable contact details)
- measures
of user satisfaction (eg click through from initial entry
to other pages on the site, time spent on the site, correlation
between free and paid-placement search results)
- user
demographics (matching site content with information about
users)
- auspices
(weighting for recognised publishers, government agencies,
professional organisations)
- IP
addresses (weighting against ISPs/ICHs that are perceived
as being permissive to spammers
and against address blocks that have an unusually high number
of low reputation sites)
- keywords
(in particular keywords that the engine perceives as presented
in an appropriate syntax rather than at an unnatural frequency
and at random to subvert the algorithm)
- cloaking
(whether the site serves different content to different
categories of users)
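The distinction between weightings and exclusions in the list above can be sketched in Python as a weighted sum with a short-circuit for disqualifying flags. The factor names, the weights and the choice of cloaking and malware as grounds for outright exclusion are all assumptions made for illustration; how any engine actually treats these signals is not public.

```python
from typing import Dict, Optional, Set

# Hypothetical weights: positive values reward a signal, negative ones penalise it.
WEIGHTS: Dict[str, float] = {
    "age": 1.0,
    "quantity": 0.5,
    "availability": 1.0,
    "positive_reputation": 2.0,
    "technical_negatives": -1.5,
    "reputation_negatives": -2.0,
    "quality": 1.0,
}

# Flags assumed here to cause exclusion rather than merely a lower weighting.
EXCLUSIONS: Set[str] = {"cloaking", "malware"}


def rank_value(signals: Dict[str, float], flags: Set[str]) -> Optional[float]:
    """Return None if the site is excluded, else a weighted sum of signals.

    Each signal is expected to be normalised to the range 0..1.
    Signals without an entry in WEIGHTS are ignored.
    """
    if flags & EXCLUSIONS:
        return None
    return sum(
        WEIGHTS[name] * value
        for name, value in signals.items()
        if name in WEIGHTS
    )
```

For example, `rank_value({"age": 1.0, "positive_reputation": 0.5}, set())` gives a positive value, the same signals with the flag `"cloaking"` give `None`, and a site with only `"reputation_negatives"` set ends up below zero.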
Some
observers have suggested that there is human validation of
particular sites or of leading results on some of the most
frequent search terms.
studies
In discussing online information
seeking and navigation we noted ongoing growth of specialist
literature, often with a rigorous empirical base, regarding
search algorithms and human interaction with search engines.
Unfortunately few of the insights offered by that research
or other areas of cognitive science are apparent in industry
or popular writing about SEO. There are no outstanding works
for a general audience by an SEO practitioner and contact
with particular SEO vendors has led us to question whether
they have more than a casual awareness of methodologies.
For a grounding see Web Search: Public Searching of the
Web (London: Springer 2004) by Amanda Spink & Bernard
Jansen, Annabel Pollock & Andrew Hockley's 1997 What's
Wrong with Internet Searching paper
and Modern Information Retrieval (London: Longman
1999) by Ricardo Baeza-Yates & Berthier Ribeiro-Neto.
There is a broader perspective in Elaine Svenonius' The
Intellectual Foundation of Information Organisation (Cambridge:
MIT Press 2000), Christine Borgman's From Gutenberg to
the Global Information Infrastructure: Access To Information
in the Networked World (Cambridge: MIT Press 2000) and
Preferred Placement: Knowledge Politics on the Web
(Maastricht: Jan van Eyck Akademie Editions 2000) edited by
Richard Rogers.
As points of entry regarding the specialist literature see
Donald Case's Looking for Information: A Survey of Research
on Information Seeking, Needs and Behavior (New York:
Academic Press 2002), Bernard Jansen's 2000 paper
A Review of Web Searching Studies, Richard Belew's
Finding Out About: Search Engine Technology From A Cognitive
Perspective (Cambridge: Cambridge Uni Press 2001) and
Web Work: Information Seeking & Knowledge Work on
the World Wide Web (New York: Kluwer 2000) by Chun Wei
Choo, Brian Detlor & Don Turnbull. Papers of particular
value include What Do Web Users Do? An Empirical Analysis
of Web Use (PDF)
by Andy Cockburn & Bruce McKenzie and the Analysis
of a very large web search engine query log study
by Craig Silverstein, Hannes Marais & Michael Moricz.
Search Engine Optimization for Dummies (New York:
Wiley 2005) by Peter Kent is one of several DIY guides.