Search Engine Optimisation: Basis, Industry, Studies

overview

This note discusses search engine optimisation (SEO) and the SEO industry.

It covers -

introduction
optimisation - what is search engine optimisation
the SEO industry - the shape of the SEO industry
factors - what might be significant in online rankings
studies - selected independent studies about search

It supplements discussion elsewhere on this site regarding search engines (and popular search terms), metadata, online resource identification and networks.

The following page considers SEO principles, myths and responses.

introduction

Search engine optimisation aims to ensure that online content is found by users of web search engines, in particular by appearance of the 'optimised' site/page at the top of the list of search results.

It reflects recognition that many users rely on search engines rather than domain names, links or other pointers (eg a URL featured on a business card or the side of a bus) for identification of content. It also reflects awareness that some users rely on search engines as their primary measure of the quality of online content, for example assuming that the first item on a list of search results will be the most authoritative, accurate or merely up-to-date.

Vendors of search optimisation services offer various methodologies that are claimed to "guarantee" online content a favourable position in search results.

Some of those methodologies are so obvious that it is surprising consumers would pay for the advice or emphasise optimisation at the expense of more effective mechanisms to attract the attention of desired demographics.

Other claims by participants in the search optimisation industry are deceptive or merely nonsensical. That has led critics to compare SEO businesses with those that purported to offer sure-fire solutions for choosing the perfect domain name and thereby being 'found' during the dot-com bubble.

optimisation

Search engine optimisation is predicated on recognition that search engines use algorithms for sorting web sites, pages and other content for presentation in a list of search results in what is sometimes referred to as a search engine results page (SERP). Those algorithms are proprietary and often highly sophisticated. They are not static: they change over time, reflecting factors such as changing user needs, efforts to foil abuse by site operators and work to produce better results.

Algorithms and thus SERP results vary from engine to engine. Variation is also attributable to different coverage of the net: some engines have simply encountered more sites/pages than others, some visit different parts of the net more frequently than other engines, some process submissions about new sites more quickly than their competitors.

Most base their retrieval on identification of content within the body of an HTML page, PDF file or other online resource. They are not based on metadata in the header of files. Major engines indeed appear to penalise crude attempts by SEO vendors and site operators to gain a higher rank by using a particular word twenty or one hundred times in the header.

They instead rank sites/pages through weightings that appear to include -

whether the particular resource features the search term (or an equivalent term) - often referred to as a keyword, a usage that confuses many people from a library, print indexing or legal background
the significance of that term (eg does it occur once or several times; is it featured as a page title, subdirectory name or navigation point within a page; does it occur as part of what the engine perceives as natural syntax or as an attempt to artificially weight the page)
factors such as the site's age, liveness, 'reputation' (in particular whether other respected sites point to it) and avoidance of negatives such as participation in link exchange schemes. Some of those factors are highlighted below.
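The first two weightings above can be illustrated with a toy scorer. This is a minimal sketch only: the function names, the title bonus, the cap on repeats and the density threshold are all hypothetical values chosen for illustration, since the engines' real algorithms are proprietary. It rewards presence of the term, gives extra weight to a term in the page title, and sharply marks down a page on which the term occurs at an unnatural frequency.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_term_score(term, title, body, title_bonus=2.0, max_density=0.25):
    """Return an illustrative score for one page and one single-word term.

    Every weight here is a hypothetical assumption, not a value used by
    any real search engine.
    """
    term = term.lower()
    words = tokenize(body)
    count = words.count(term)
    in_title = term in tokenize(title)
    if count == 0 and not in_title:
        return 0.0  # the page does not feature the term at all

    score = 0.0
    if count:
        # Presence matters most; repeats beyond a handful add nothing.
        score += 1.0 + 0.2 * min(count - 1, 4)
    if in_title:
        score += title_bonus

    density = count / len(words) if words else 0.0
    if density > max_density:
        # Crude stand-in for a penalty on artificially weighted pages.
        score *= 0.1
    return score
```

Under these assumptions a page titled "Growing roses in Canberra" with a few natural mentions of roses in its text outscores a page whose title and body consist almost entirely of the word "roses", which is the behaviour the weightings above describe.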

Many SEO services centre their marketing on a supposed understanding of the individual algorithms and on claimed expertise in retitling web pages or rewording text to increase the number of keywords per page. Such promotion is problematical; in particular mechanistic approaches may result in -

penalisation by search engine operators, which are known to have automatically given lower rankings to, or even excluded, some sites after egregious abuses
pages that read strangely and thereby send the wrong message to users

the SEO industry

Online resource identification is a billion dollar sector of the Australian and global economy, with search engine and directory operators gaining (or aspiring to gain) substantial revenue from paid placement and other services and site operators seeking ways of ensuring that their sites appear prominently in SERP results.

The SEO industry appeared in the mid 1990s and first gained public attention after 1997 as the dot com bubble expanded. It is an echo of the domain naming and domain name valuation industries, which flourished with the bubble and attracted criticism over poor (even deceptive) performance.

There has been no comprehensive independent study of the industry. At a global and Australian level it appears to be volatile, with low barriers for entry by individuals and businesses, many of whom seem undeterred by unfamiliarity with search technologies and - alas - uninhibited about offering guarantees that demonstrably cannot be met.

SEO services for example cannot guarantee that any new site will gain a top ranking immediately, particularly as some engines 'sandbox' new sites (ie do not include them in SERP or give them a low SERP ranking) for several weeks or months in order to minimise evanescent sites created as pointers to adult content sites. There is no licensing by government or accreditation by the search engines, which are fiercely competitive. Margins for successful SEO services appear to be high but many minor operators appear to leave the industry within two years, whether because they discovered revenue was hard to achieve, they received bad word of mouth from (or even faced court action by) disappointed clients or they simply got bored.

Some have expanded into search engine submission services, charging what are often exorbitant fees to alert engines and directories to the existence of a client's site. Such services often claim that there are 2,500 or even 5,000 search engines, with success supposedly requiring submission to each and every engine. Elsewhere on this site we have noted that those figures are gross exaggerations (eg they conflate directories with engines) and that most users rely on a handful of engines.

Given the comments above it is unsurprising that there are no generally accepted standards, particularly standards developed by entities outside the industry and articulated by independent organisations such as Standards Australia or ANSI.

Some industry participants proclaim that their staff are "certified", for example have certification in "search engine optimisation" or are graduates of an "advanced search engine marketing skills" program. Sceptics respond - fairly or otherwise - that it is difficult to embrace self-certification or diploma mill-style certification. Wariness is appropriate where -

there is no independent testing or validation of the certification
the expertise of the graduate is not reflected in endorsement by computer scientists or academics engaged in the study of information retrieval.

Other SEO vendors emphasise a proprietary methodology, often with a black box approach that claims to identify desired keyword weightings and enable the service to effectively rewrite sites/pages for guaranteed higher rankings.

Some, with a hazy understanding of engines such as Google, claim to dramatically boost a site's ranking through inclusion of the most popular search terms or through rapid generation of other sites that exist merely to point to the client's site.

Promises from such services typically feature guarantees that the chosen site will go to a top page - or even the number one rank on that SERP - within weeks or even days. In discussing such myths later in this note we suggest that clients might better spend their money elsewhere, particularly if their site isn't unique and is located in a crowded part of cyberspace.

factors

Information about how different search engines rank search results is uncertain. Understanding has been impeded by myths about SERP and the effectiveness of SEO services.

Factors that may be significant for major engines such as Google appear to include weightings or exclusions regarding -

age (including how long the site has been online, the age of individual pages and measures of how frequently content is updated). Overall there appears to be a bias in favour of sites that have existed for a while, are stable and are updated on an ongoing basis - thereby being different from sites that are 'dead' and from those that are manufactured merely to redirect traffic or increase another site's link count
quantity (including number of pages/files, number of words in aggregate and number of words per page)
availability (is the content available 24/7, especially in regions such as Australia where continuous availability is expected)
positive reputation (number of citations by other sites, particularly citations by sites that have a high score in terms of age, quantity and so forth)
technical negatives (non-compliant code, broken outgoing links, conflicts between page titles and page content, indications that metatags have been heavily 'optimised' through recurrent use of words such as sex or adult, use of 'invisible' text)
reputation negatives (outgoing links that point to sites with a low reputation, files that feature illegal content or malware, participation in commercial link exchange schemes, sudden spurts in inclusion of links to sites with a poor reputation)
quality indicators (uniqueness of content, inclusion of bibliographic material and of automatically verifiable contact details)
measures of user satisfaction (eg click through from initial entry to other pages on the site, time spent on the site, correlation between free and paid-placement search results)
user demographics (matching site content with information about users)
auspices (weighting for recognised publishers, government agencies, professional organisations)
IP addresses (weighting against ISPs/ICHs that are perceived as being permissive to spammers and against address blocks that have an unusually high number of low reputation sites)
keywords (in particular keywords that the engine perceives as presented in an appropriate syntax rather than at an unnatural frequency and at random to subvert the algorithm)
cloaking (whether the site serves different content to different categories of users)
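The positive reputation factor listed above, where a citation counts for more when the citing site itself scores well, can be sketched as a simple iterative calculation in the style of the publicly described PageRank idea. This is an illustration under stated assumptions, not any engine's actual method: the function name, the damping value and the iteration count are hypothetical, and real engines combine link data with many other signals.

```python
def toy_reputation(links, damping=0.85, iterations=50):
    """Estimate a reputation score for each site from who points to whom.

    links maps each site name to the list of sites it points to.
    The damping value and iteration count are illustrative assumptions.
    """
    sites = set(links)
    for targets in links.values():
        sites.update(targets)
    sites = sorted(sites)
    n = len(sites)

    # Start with every site equally reputable.
    score = {s: 1.0 / n for s in sites}

    for _ in range(iterations):
        # Each site keeps a small baseline regardless of inbound links.
        new = {s: (1.0 - damping) / n for s in sites}
        for s in sites:
            targets = links.get(s, [])
            if targets:
                # A site passes its reputation on to the sites it cites.
                share = damping * score[s] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A site with no outgoing links spreads its score evenly.
                share = damping * score[s] / n
                for t in sites:
                    new[t] += share
        score = new
    return score
```

For example, with `{"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}` site c ends up with the highest score because three sites cite it, and site a outranks b and d because its single citation comes from the well-regarded c.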

Some observers have suggested that there is human validation of particular sites or of leading results on some of the most frequent search terms.

studies

In discussing online information seeking and navigation we noted ongoing growth of specialist literature, often with a rigorous empirical base, regarding search algorithms and human interaction with search engines.

Unfortunately few of the insights offered by that research or other areas of cognitive science are apparent in industry or popular writing about SEO. There are no outstanding works for a general audience by an SEO practitioner and contact with particular SEO vendors has led us to question whether they have more than a casual awareness of methodologies.

For a grounding see Web Search: Public Searching of the Web (London: Springer 2004) by Amanda Spink & Bernard Jansen, Annabel Pollock & Andrew Hockley's 1997 What's Wrong with Internet Searching paper and Modern Information Retrieval (London: Longman 1999) by Ricardo Baeza-Yates & Berthier Ribeiro-Neto.

There is a broader perspective in Elaine Svenonius' The Intellectual Foundation of Information Organisation (Cambridge: MIT Press 2000), Christine Borgman's From Gutenberg to the Global Information Infrastructure: Access To Information in the Networked World (Cambridge: MIT Press 2000) and Preferred Placement: Knowledge Politics on the Web (Maastricht: Jan van Eyck Akademie Editions 2000) edited by Richard Rogers.

As points of entry regarding the specialist literature see Donald Case's Looking for Information: A Survey of Research on Information Seeking, Needs and Behavior (New York: Academic Press 2002), Bernard Jansen's 2000 paper A Review of Web Searching Studies, Richard Belew's Finding Out About: Search Engine Technology From A Cognitive Perspective (Cambridge: Cambridge Uni Press 2001) and Web Work: Information Seeking & Knowledge Work on the World Wide Web (New York: Kluwer 2000) by Chun Wei Choo, Brian Detlor & Don Turnbull. Papers of particular value include What Do Web Users Do? An Empirical Analysis of Web Use (PDF) by Andy Cockburn & Bruce McKenzie and the Analysis of a very large web search engine query log study by Craig Silverstein, Hannes Marais & Michael Moricz.

Search Engine Optimization for Dummies (New York: Wiley 2005) by Peter Kent is one of several DIY guides.

see also the Australian SEO industry