A view on Google’s Patent: Information Retrieval
Based on Historical Data
by: Peter Faber
Google doesn’t stop innovating
its search engine, and where others try to follow,
Google is not just 1 step ahead, but 10 steps ahead.
Its latest innovation, which may already have been
in place for a year or longer, can be found in the
patent “Information Retrieval Based on Historical
Data.”
The abstract of the patent is: “A
system identifies a document and obtains one or more
types of history data associated with the document.
The system may generate a score for the document based,
at least in part, on the one or more types of history
data.”
This article aims to give
a simplified overview of this patent and recommends
the SEO techniques that seem best suited to obtaining
high rankings, with a specific focus on links.
This article is the opinion of the writer; following
the recommendations in it is at your own risk.
Google’s search results have
become increasingly difficult to explain, and many theories
have been developed about what is going on. The most popular
is the “sand box” theory, which says that
a new site is put in a virtual sand box and has to wait
until it has aged before obtaining high rankings. This
patent has some excellent information that can explain
this phenomenon.
Information Retrieval
According to the patent, the information
that Google’s invention retrieves from the historical
data is:
1. Age/Time
2. Change
3. Trends
A score is calculated from these
3 factors and can then, at least partially,
be used to rank the selected pages.
Historical Data
The patent describes a huge amount
of historical data. The following is an overview of
most items for which historical data can be measured:
* Pages/sites
* Links
* Anchor Texts
* Content
* Query
* Traffic
* Ranking
* User
* Domain
Ranking Based On Information Retrieved
From Historical Data
The patent describes in quite a lot
of detail how selected pages are ranked based on the
information retrieved from historical data. This chapter
will describe the basic logic applied.
Age/Time
For every kind of historical data, a date of
inception is used to determine 4 important values:
* Age
* Average Age
* Date
* Average Date
These factors can be determined for
pages, links, anchor text, content, topics, queries,
etc. Comparing the age or date of a page to the average
of the site, for example, tells the search engine whether
this information is relatively new or old.
Comparing the average age or date
of a page to the average age or date of all pages selected
for a query (keyword phrase) tells the search engine
if the page is relatively new or old. This information
can be used to rank the selected pages.
Comparing to an average has the advantage
that no preset set of rules is needed to determine
the ranking of a page. For one query 6 months may be
considered new (product descriptions for example) while
for another query 6 days may be considered old (news
items for example). It all depends on the average age.
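The comparison to an average can be sketched in a few
lines of Python. This is only an illustration: the patent
does not publish a formula, and the function name
relative_age, the ratio it returns, and the sample ages
are all assumptions made for this example.

```python
from statistics import mean

def relative_age(page_age_days, result_ages_days):
    """Hypothetical measure: the page's age divided by the average
    age of all pages selected for the query.
    Below 1.0 means newer than average, above 1.0 means older."""
    return page_age_days / mean(result_ages_days)

# Product descriptions: the selected pages are years old on average,
# so a 6-month-old page counts as relatively new.
product_ages = [180, 900, 1200, 1500]
print(relative_age(180, product_ages))  # about 0.19

# News items: the selected pages are a few days old on average,
# so a 6-day-old page counts as relatively old.
news_ages = [6, 1, 1, 2]
print(relative_age(6, news_ages))       # 2.4
```

The same page age can therefore come out as "new" for one
query and "old" for another, without any fixed threshold.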
This same logic applies to links.
In order to determine how popular a page or site is,
the average age of all back links tells the search engine
whether the popularity of the page is recent or not. It
makes sense: if most back links were obtained 4 years
ago and hardly anybody has been interested in linking
to this page/site since then, the page is not as
popular as the existing back links would suggest.
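As a rough sketch of this idea, the average age of a set
of back links can be computed from the dates on which the
links were first seen. The function name, the reference
date and the link dates below are invented for the
example; the patent does not say how Google stores or
weighs these dates.

```python
from datetime import date

def average_link_age_days(first_seen_dates, today):
    """Average age, in days, of a page's back links."""
    ages = [(today - d).days for d in first_seen_dates]
    return sum(ages) / len(ages)

today = date(2005, 6, 1)  # arbitrary reference date for the example

# Page A: nearly all back links were found about 4 years ago.
page_a = [date(2001, 6, 1), date(2001, 7, 1),
          date(2001, 8, 1), date(2005, 5, 1)]
# Page B: the same number of back links, gained steadily until now.
page_b = [date(2004, 6, 1), date(2004, 10, 1),
          date(2005, 2, 1), date(2005, 5, 1)]

print(average_link_age_days(page_a, today))
print(average_link_age_days(page_b, today))
```

Both pages have 4 back links, but page A's average link
age is several times higher, which suggests that its
popularity is not recent.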
The patent even goes as far as determining
age factors for the anchor texts of links.
Change
Information changes over time. Opinions
change, knowledge changes, popularity changes, etc.
As mentioned before, a page that was popular 4 years
ago may be totally forgotten now but still have most
of the backlinks it obtained when it actually
was popular. However, if this page all of a sudden becomes
popular again and new back links start showing up,
the average age of its backlinks will remain high. Judged
by the average alone, this would prevent the page from
ranking high.
Detecting changes is crucial to give
old information the chance to rank high again. Conversely,
the lack of change can be a reason to lower the rank
of a page.
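A small sketch shows why the average alone is not enough.
It assumes a simple change signal: the number of backlinks
first seen in the latest 90 days compared with the 90 days
before that. The window length, the function name and the
dates are assumptions for illustration, not values from
the patent.

```python
from datetime import date, timedelta

def recent_link_growth(first_seen_dates, today, window_days=90):
    """Count backlinks first seen in the latest window and in the
    window before it (hypothetical change signal)."""
    recent_start = today - timedelta(days=window_days)
    previous_start = today - timedelta(days=2 * window_days)
    recent = sum(1 for d in first_seen_dates
                 if recent_start < d <= today)
    previous = sum(1 for d in first_seen_dates
                   if previous_start < d <= recent_start)
    return recent, previous

today = date(2005, 6, 1)
# 20 old backlinks from 2001, then 5 new ones in the last two months.
old_links = [date(2001, 6, 1)] * 20
new_links = [date(2005, 4, 10), date(2005, 4, 25), date(2005, 5, 5),
             date(2005, 5, 15), date(2005, 5, 28)]
links = old_links + new_links

ages = [(today - d).days for d in links]
print(sum(ages) / len(ages))             # still more than 3 years, in days
print(recent_link_growth(links, today))  # (5, 0)
```

The average age barely moves, while the jump from 0 to 5
new backlinks makes the renewed interest visible.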
Trends
Even though comparing to averages
is a great way to get information about freshness, it
fails to recognize smaller events like a sudden increase
in the popularity of a page. Though detecting changes
does help to recognize smaller events, more information
can be obtained by detecting trends.
Sudden increases in popularity can
be caused by seasonal events like Christmas or the Super
Bowl. For this reason the search engine will try to
determine trends within pages, links, anchor text, content,
topics, queries, etc. Detecting trends makes it possible
to rank pages higher that would not rank high under
the standard ranking methods or when compared to average
ages or dates. Google has recognized here a very important
fact about information: the relevance and importance of
information are (con)temporary.
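One simple way to picture trend detection is to fit a
straight line through a series of counts per period and
look at its slope. The sketch below uses an ordinary
least-squares slope; this technique, the function name
and the weekly counts are assumptions chosen for the
example, since the patent does not name a specific method.

```python
def trend_slope(counts):
    """Least-squares slope of a series of per-period counts
    (needs at least 2 values).
    Positive means rising, negative means falling."""
    n = len(counts)
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in enumerate(counts))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

# Hypothetical weekly query counts for two pages.
steady = [100, 102, 98, 101, 99, 100]
pre_season = [100, 120, 150, 200, 270, 360]

print(trend_slope(steady))      # close to zero
print(trend_slope(pre_season))  # clearly positive
```

A page with a clearly positive slope in the weeks before
Christmas could be recognized as gaining relevance even
if its average ages still look unremarkable.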
Detecting Spam Using Historical Data
Having all kinds of historical data
available also makes it possible to detect search engine
spam. Unexpected events that happen to a site can be an
indication of spam. Obviously, a strong improvement in
1 single factor would not be a direct indication of spam;
generally, multiple factors show strange behavior when a
site is using spam to increase rankings. It would not
be in Google’s interest to penalize a site for
advertising. However, excessive advertising on sites/pages
that are totally unrelated will not do your site any
good.
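The "multiple factors" reasoning can be illustrated with
a toy rule: flag a site only when at least two factors
jump far above their own history at the same time. The
factor names, the 3x ratio and the minimum of two factors
are made up for this sketch and are not taken from the
patent.

```python
def unusual_factors(history, latest, ratio=3.0):
    """Return the names of factors whose latest value is at least
    `ratio` times their historical average (hypothetical rule)."""
    flagged = []
    for name, past_values in history.items():
        baseline = sum(past_values) / len(past_values)
        if baseline > 0 and latest[name] >= ratio * baseline:
            flagged.append(name)
    return flagged

def looks_suspicious(history, latest, min_factors=2):
    """A single jumping factor alone is not treated as a signal."""
    return len(unusual_factors(history, latest)) >= min_factors

# Hypothetical monthly counts for one site.
history = {
    "new_backlinks": [10, 12, 9, 11],
    "new_anchor_texts": [4, 5, 3, 4],
    "new_pages": [2, 3, 2, 3],
}
# A successful advertising campaign: only backlinks jump.
campaign = {"new_backlinks": 40, "new_anchor_texts": 5, "new_pages": 3}
# Several factors jump at the same time.
burst = {"new_backlinks": 400, "new_anchor_texts": 1, "new_pages": 300}

print(looks_suspicious(history, campaign))  # False
print(looks_suspicious(history, burst))     # True
```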
Recommendations
Nothing has changed with regard to links.
This patent pretty much confirms what we at www.textlinkbrokers.com
already knew and have been explaining to our customers.
The following recommendations can be helpful:
Keep links related
Related links matter, unrelated links
can be considered spam.
Build links on a continuous moderate
basis
As the patent describes, the average
age of your backlinks should not be too high. It is
therefore wise to continue adding backlinks to secure
a reasonable average age of all your backlinks. How
many you need to add over time depends on your market.
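As a back-of-the-envelope illustration: if n backlinks
have an average age A and k brand-new links are added,
the new average is n*A / (n + k). Solving for a target
average T gives k >= n * (A - T) / T. The target itself
is an assumption you would have to pick for your own
market; neither the patent nor this article states one,
and the function name is hypothetical.

```python
def new_links_needed(link_count, average_age_days, target_age_days):
    """Smallest number of brand-new links (age 0) that brings the
    average age down to the target. Inputs are whole numbers of
    links and days; further aging of the links is ignored."""
    if average_age_days <= target_age_days:
        return 0
    excess = link_count * (average_age_days - target_age_days)
    # Integer ceiling division avoids floating-point rounding.
    return -(-excess // target_age_days)

# 100 backlinks, 900 days old on average, hypothetical target of 600.
print(new_links_needed(100, 900, 600))  # 50
```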
Be better than the average
It is very important to be better than
the average, but don’t overdo it. That would be
expensive and unnecessary.
Focus on seasonal events
A good way to increase the success
of your website is to set up text link campaigns for
seasonal events. Start your advertising campaign 2 to
3 months before the actual event to give Google the
time to find the links and update your site’s
information with them. After the event you can let these
links go again.
Spread links over multiple sites
(unique backlinks)
A very important factor is the number
of unique websites in your backlinks. Google seems to
put a strong emphasis on this factor.
About The Author
Peter Faber is an Internet marketing
consultant working for http://www.textlinkbrokers.com,
an SEO company specializing in link building. He has
his own personal blog at http://www.seo-works.com.