Tips, tricks, and pokes, just WebTrends Analytics

Canonical URLs – Why You Should Care

Until Google published their Canonical URL link-tag standard in February 2009, we Outsiders hadn’t seen the word “canonical” in actual written form since grad school.

Anyway, one meaning of the word canonical is “the simplest form.”  In other words, a “canonical” mathematical model is the model with the fewest possible rules and variables, out of all possible mathematical models for a thing.

We love the irony of an obscure multisyllabic word that means “simple.”

They might have called it the “Standard for Preventing URL variations from being indexed.”  The SFPUVFBI.

You should care about Canonical URLs when you have a page on your site that can potentially be reached using multiple URLs, due to tracking parameters.

Example:

Your page http://yoursite.com/promo.asp can be reached through the plain URL.

But:

  • Because you want to track clicks from a special promo graphic on your pages, you’ve hard-coded the promo graphic’s link to go to /promo.asp?WT.ac=fromhomepage.  Or maybe /promo.asp?prevpage=homepage.
  • There are affiliate sites with links to your promo.asp page, and they have helped your tracking by hard-coding their links to go to /promo.asp?source=affiliatesitename
  • A search engine has followed a banner or affiliate or even paid search link that contains campaign parameters, and displays an organic search listing going to /promo.asp?WT.mc_id=oprahbanner or /promo.asp?WT.mc_id=paidsearchmsn
  • Somebody followed a link from one of your campaign emails and copied what they saw in their address bar into their blog, resulting in links going to /promo.asp?source=Feb2011email.
  • A search engine picks up that same link from that blog and puts it in the index.

See, that promo page can be reached through five or six different URLs, a plain one and several others with campaign parameters in them.
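
To make that concrete, here is a rough Python sketch (ours, not anything WebTrends or the search engines supply) that strips the tracking parameters from the example URLs above and shows that they all collapse to the same page. The parameter list is taken from the examples in this post; the function name `strip_tracking` is just something we made up for illustration, and your own site will probably use other parameters.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Tracking parameters from the examples above (compared case-insensitively).
# This list is an assumption for illustration; adjust it for your own site.
TRACKING_PARAMS = {"wt.ac", "wt.mc_id", "prevpage", "source"}


def strip_tracking(url, tracking_params=TRACKING_PARAMS):
    """Return url without tracking parameters and without any #fragment."""
    parts = urlsplit(url)
    # Keep only the query parameters that are not in the tracking list.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in tracking_params
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variants = [
    "http://yoursite.com/promo.asp",
    "http://yoursite.com/promo.asp?WT.ac=fromhomepage",
    "http://yoursite.com/promo.asp?prevpage=homepage",
    "http://yoursite.com/promo.asp?source=affiliatesitename",
    "http://yoursite.com/promo.asp?WT.mc_id=oprahbanner",
    "http://yoursite.com/promo.asp?WT.mc_id=paidsearchmsn",
    "http://yoursite.com/promo.asp?source=Feb2011email",
]

# Seven different URLs, but only one underlying page.
unique_pages = {strip_tracking(u) for u in variants}
```

A search engine spider has no way of knowing which parameters are "just tracking," so unless you tell it otherwise, it may treat every one of those variants as a separate page.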

It should be obvious that the only one you want to be in the search engine index is the plain one.  If the wonky URLs are indexed and clicked on, your campaign reports will report on visits that appeared to come from a campaign … but actually came from an organic search listing.

This is bad.

There are other problems, too.  At best, multiple versions of a page’s URL will split the page’s ranking signals (including PageRank) across the variants and water down each one.  At worst, the search engine will assume it’s seeing spamming, i.e. duplicate content on multiple pages.

The Canonical URL tag will fix all of the above

All of this can be avoided by adding one line of code to your page’s <head> section.  This bit of code was announced by Google back in February 2009 and has since been adopted, at least in intention, by other search engines such as MSN-LiveSearch, Yahoo and Ask.

(February 2011 update – Bing/Yahoo still ignores the Canonical tags!  We’re not sure about Ask.  We’ve added another post that uses Webmaster Tools to do pretty much the same thing as Canonical tags, and it WILL affect Bing/Yahoo.)

The code snippet is used only by the search engine spiders.  It states the one and only URL under which you want the page to appear in their indexes.

<link rel="canonical" href="http://www.yoursite.com/subdirectory/promo.asp" />

Remember, it goes inside the <head> section of the page.

By the way, this tag doesn’t affect what WebTrends sees or SDC records at all.  It’s only used by search engine spiders.

Think about how many of your pages might be affected by this problem, considering your banners, pay-per-click, affiliates, and on-site advertising.  If you have a lot of them, like we do, you might want to program your content management system to automatically put the canonical link-tag into the header of every page.
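
What that automation might look like depends entirely on your content management system, so treat the following as a minimal Python sketch rather than a recipe. It assumes the simplest case, where the canonical form of a page is its URL with the whole query string and any fragment removed. That assumption breaks for pages whose content depends on a parameter (a product ID, say), which would need a list of parameters to keep. The function names `canonical_link_tag` and `add_to_head` are hypothetical, ours and not part of any CMS or of WebTrends.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url):
    """Build the canonical link tag for a page.

    Assumption: the canonical URL is the requested URL with the query
    string and fragment dropped. Pages that need some parameters kept
    would require extra logic.
    """
    parts = urlsplit(requested_url)
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Escape the URL so it is safe inside an HTML attribute value.
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)


def add_to_head(page_html, tag):
    """Insert the tag just before the first closing </head> in the page."""
    return page_html.replace("</head>", tag + "\n</head>", 1)


tag = canonical_link_tag(
    "http://www.yoursite.com/subdirectory/promo.asp?WT.mc_id=oprahbanner"
)
page = add_to_head("<html><head><title>Promo</title></head><body></body></html>", tag)
```

In a real CMS you would more likely put the tag into the page template than patch the finished HTML, but the idea is the same: every variant of the URL serves a page that names the plain URL as canonical.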

    9 comments

    1 Bryan Cristina { 04.21.09 at 9:47 am }

    Loved the concept when I first heard of it, however, people around here are still paranoid about it affecting SEO efforts. For shame.

    Someday they’ll buy into it, and on that day I can track things a bit better without people being so scared. :)

    2 rocky { 04.21.09 at 10:10 am }

    Google is careful to say it is not a “directive” but instead is “a hint that we honor strongly.” Which leaves room for doubt for those who want to doubt.

    Here’s a link to a Google blog with some more information:

    http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

    3 JD { 05.01.09 at 1:26 pm }

    You, sir, are no geek at all if you don’t have an easy familiarity with the word canonical.

    4 rocky { 05.02.09 at 6:09 am }

    Truer words were never spoken. I am flattered whenever anybody accidentally calls me a geek, but I am a mere social systems researcher by training, and “canonical” until now has only been, to me, a type of correlation analysis.

    5 Mike { 06.17.09 at 11:24 am }

    How would the parameter affect the path analysis? Basically, if you had two identical links on the same page, would they come up separately in the path analysis?

    6 Nestor { 06.18.09 at 9:52 am }

    I found your site yesterday and it has been very informative. Mad props!

    In terms of this post, a possible alternative that avoids hurting your SEO rankings is to pass your WT parameters via the hash tag instead of adding them to the regular query parameter list. For example, “index.htm#WT.mc_id=cid” instead of “index.htm?WT.mc_id=cid”. This can be done by adding a small snippet of code to your WT base tag that parses the hash tag parameters and passes them to the page tag image.

    7 rocky { 06.23.09 at 6:53 am }

    That is a really interesting idea because to the best of my knowledge the hash tag stuff is ignored when a page is indexed.

    At the same time, that trick cannot be used in URLs used in paid search, at least not for Google AdWords. The redirect in Google AdWords will strip out the hash tag portion during the redirect and the visit will not be identified as a paid search visit.

    8 rocky { 06.23.09 at 7:00 am }

    Mike – any parameter will affect path analysis only if you’ve allowed the parameter to be displayed in reports. If you’ve specified that the parameter should be suppressed in reports, path analysis will ignore it (the differentiating parameter). The suppress/display of specific parameters is controlled by WebTrends’ “URL Rebuilding” function which is found in the part of the config menu called “Administration.”

    9 Use Webmaster Tools to Clean Up Organic Search URLs | WebTrends Outsider { 02.20.11 at 3:33 pm }

    [...] Bing, and Yahoo frequently do have, in their indexes, URLs with campaign parameters.  See our Canonical URLs post for a description of several ways those superfluous parameters sneak into the [...]
